{"id":16532,"date":"2019-12-28T20:04:08","date_gmt":"2019-12-29T01:04:08","guid":{"rendered":"https:\/\/bangla.salearningschool.com\/recent-posts\/part-1-some-math-stat-background-that-true-data-scientists-will-know-use-from-the-internet\/"},"modified":"2020-01-30T21:04:26","modified_gmt":"2020-01-31T02:04:26","slug":"part-1-some-math-stat-background-that-true-data-scientists-will-know-use-from-the-internet","status":"publish","type":"post","link":"http:\/\/bangla.sitestree.com\/?p=16532","title":{"rendered":"Part 1: Some Math\/Stat Background that (true) Data Scientists will know\/use: from the internet"},"content":{"rendered":"<p>Chebyshev&#8217;s inequality<\/p>\n<p>&#8220;In <a title=\"Probability theory\" href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_theory\">probability theory<\/a>, <strong>Chebyshev&#8217;s inequality<\/strong> (also called the <strong>Bienaym\u00e9\u2013Chebyshev inequality<\/strong>) guarantees that, for a wide class of <a title=\"Probability distributions\" href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_distributions\">probability distributions<\/a>, no more than a certain fraction of values can be more than a certain distance from the <a title=\"Expected value\" href=\"https:\/\/en.wikipedia.org\/wiki\/Expected_value\">mean<\/a>.<\/p>\n<p>Specifically, no more than 1\/<em>k<\/em><sup>2<\/sup> of the distribution&#8217;s values can be more than <em>k<\/em> <a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/Standard_deviations\">standard deviations<\/a> away from the mean;<\/p>\n<p>equivalently, at least 1 \u2212 1\/<em>k<\/em><sup>2<\/sup> of the distribution&#8217;s values are within <em>k<\/em> standard deviations of the mean.<\/p>\n<p>In statistics, 
the inequality has great utility because it can be applied to any probability distribution in which the mean and variance are defined.&#8221;<\/p>\n<p>Ref: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Chebyshev%27s_inequality\">https:\/\/en.wikipedia.org\/wiki\/Chebyshev%27s_inequality<\/a><\/p>\n<h3>Probabilistic statement<\/h3>\n<p>Let <em>X<\/em> (integrable) be a <a title=\"Random variable\" href=\"https:\/\/en.wikipedia.org\/wiki\/Random_variable\">random variable<\/a> with finite <a title=\"Expected value\" href=\"https:\/\/en.wikipedia.org\/wiki\/Expected_value\">expected value<\/a> <em>\u03bc<\/em> and finite non-zero <a title=\"Variance\" href=\"https:\/\/en.wikipedia.org\/wiki\/Variance\">variance<\/a> <em>\u03c3<\/em><sup>2<\/sup>. Then for any <a title=\"Real number\" href=\"https:\/\/en.wikipedia.org\/wiki\/Real_number\">real number<\/a> <em>k<\/em> &gt; 0,<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/13787911b032508f2a54da8eb84750f331a70401\" alt=\"\\Pr(|X-\\mu |\\geq k\\sigma )\\leq {\\frac {1}{k^{2}}}.\" \/><\/p>\n<p>Only the case <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/5cda43bd4034dc2d04cd562005d0af81d3d2dbc6\" alt=\"k &gt; 1\" \/> is useful. 
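The Chebyshev bound quoted above can be checked empirically. Below is a minimal sketch (not part of the quoted sources): it draws samples from an exponential distribution, deliberately far from normal, and verifies that the fraction of samples more than k standard deviations from the mean never exceeds 1/k². The distribution choice, seed, and sample size are illustrative assumptions.

```python
import math
import random

# Empirical check of Chebyshev's inequality on an exponential distribution
# (true mean = true standard deviation = 1), which is far from normal.
random.seed(0)
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]

mu = sum(samples) / n
sigma = math.sqrt(sum((x - mu) ** 2 for x in samples) / n)

for k in (1.5, 2.0, 3.0):
    # Fraction of samples at least k standard deviations from the mean.
    frac_outside = sum(abs(x - mu) >= k * sigma for x in samples) / n
    bound = 1 / k**2
    assert frac_outside <= bound  # Chebyshev's guarantee
    print(f"k={k}: fraction outside = {frac_outside:.4f} <= 1/k^2 = {bound:.4f}")
```

Note how loose the bound is here: for k = 2 the inequality only promises at most 25% outside, while the exponential distribution actually puts far less mass out there. That looseness is the price of applying to any distribution with finite mean and variance.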
When <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/469d19178c15078531ed85c412c641ff664f028b\" alt=\"{\\displaystyle k\\leq 1}\" \/> the right-hand side <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/ded7f223f46f4516f81f0590190636a894378729\" alt=\"{\\displaystyle {\\frac {1}{k^{2}}}\\geq 1}\" \/> and the inequality is trivial as all probabilities are \u2264 1.<\/p>\n<p>As an example, using <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/59b8a83b1bbde74730e5387d2099f2a18fea8a7a\" alt=\"{\\displaystyle k={\\sqrt {2}}}\" \/> shows that the probability that values lie outside the interval <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/095b4536892df57bdf2208e47a327e66fe7b3833\" alt=\"{\\displaystyle (\\mu -{\\sqrt {2}}\\sigma ,\\mu +{\\sqrt {2}}\\sigma )}\" \/> does not exceed <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/a11cfb2fdb143693b1daf78fcb5c11a023cb1c55\" alt=\"{\\frac {1}{2}}\" \/>.<\/p>\n<h3>Ref: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Chebyshev%27s_inequality\">https:\/\/en.wikipedia.org\/wiki\/Chebyshev%27s_inequality<\/a><\/h3>\n<h3>&#8220;Markov&#8217;s inequality<\/h3>\n<p>&#8220;Markov&#8217;s inequality (and other similar inequalities) relate probabilities to <a title=\"Expected value\" href=\"https:\/\/en.wikipedia.org\/wiki\/Expected_value\">expectations<\/a>, and provide (frequently loose but still useful) bounds for the <a title=\"Cumulative distribution function\" href=\"https:\/\/en.wikipedia.org\/wiki\/Cumulative_distribution_function\">cumulative distribution function<\/a> of a random variable.&#8221;<\/p>\n<h2>Statement<\/h2>\n<p>&#8220;If X is a nonnegative random variable and <em>a<\/em> &gt; 0, then the probability that X is at least a is at most the expectation of X divided by a:<a 
href=\"https:\/\/en.wikipedia.org\/wiki\/Markov%27s_inequality#cite_note-ProbabilityCourse-1\">[1]<\/a><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/bd6bedf71baa9941ef8cc368072afab09e5ec9fb\" alt=\"{\\displaystyle \\operatorname {P} (X\\geq a)\\leq {\\frac {\\operatorname {E} (X)}{a}}.}\" \/><\/p>\n<p>Let <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/1300660e9b14ba98586aa86bf96552dcf0da68e1\" alt=\"{\\displaystyle a={\\tilde {a}}\\cdot \\operatorname {E} (X)}\" \/> (for some <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/a9f75812b542a4ab9205a6b5983c887ea43029ca\" alt=\"{\\displaystyle {\\tilde {a}}&gt;0}\" \/>); then we can rewrite the previous inequality as P(X \u2265 \u00e3\u00b7E(X)) \u2264 1\/\u00e3.&#8221;<\/p>\n<p>Ref: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Markov%27s_inequality\">https:\/\/en.wikipedia.org\/wiki\/Markov%27s_inequality<\/a><\/p>\n<p>Check Null Hypothesis concept as well as Chi Square Test here: <a href=\"http:\/\/bangla.salearningschool.com\/recent-posts\/important-basic-concepts-statistics-for-big-data\/\">http:\/\/bangla.salearningschool.com\/recent-posts\/important-basic-concepts-statistics-for-big-data\/<\/a><\/p>\n<p><strong>Chi-Square Statistic:<\/strong><\/p>\n<p>&#8220;A <strong>chi square<\/strong> (\u03c7<sup>2<\/sup>) <strong>statistic<\/strong> is a test that measures how expectations compare to actual observed data (or model results).&#8221;<\/p>\n<p><a href=\"https:\/\/www.investopedia.com\/terms\/c\/chi-square-statistic.asp\">https:\/\/www.investopedia.com\/terms\/c\/chi-square-statistic.asp<\/a><\/p>\n<p>&#8220;What does chi square test tell you?<\/p>\n<p>The <strong>Chi<\/strong>&#8211;<strong>square test<\/strong> is intended to <strong>test<\/strong> how likely it is that an observed distribution is due to chance. 
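As a concrete illustration of the statistic just defined, the Pearson chi-square value can be computed by hand. This is a hedged sketch, not from the quoted sources; the die-roll counts in `observed` are made up for the example.

```python
# Pearson chi-square statistic for a die, compared against a uniform
# expectation. The observed counts are hypothetical illustration data.
observed = [18, 22, 16, 25, 19, 20]   # counts for faces 1..6
total = sum(observed)
expected = [total / 6] * 6            # uniform null hypothesis

# chi^2 = sum over categories of (observed - expected)^2 / expected
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1
print(f"chi-square statistic = {chi2:.3f} with {dof} degrees of freedom")
```

Here chi2 works out to 2.5, well below the standard 5% critical value for 5 degrees of freedom (about 11.07), so these hypothetical counts give no reason to reject the uniform hypothesis.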
It is also called a &#8220;goodness of fit&#8221; statistic, because it measures how well the observed distribution of data fits with the distribution that is expected if the variables are independent.&#8221;<\/p>\n<p><a href=\"https:\/\/www.ling.upenn.edu\/~clight\/chisquared.htm\">https:\/\/www.ling.upenn.edu\/~clight\/chisquared.htm<\/a><\/p>\n<p>&#8220;In <a title=\"Probability theory\" href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_theory\">probability theory<\/a> and <a title=\"Statistics\" href=\"https:\/\/en.wikipedia.org\/wiki\/Statistics\">statistics<\/a>, the <strong>chi-square distribution<\/strong> (also <strong>chi-squared<\/strong> or <strong><em>\u03c7<\/em>2-distribution<\/strong>) with k <a title=\"Degrees of freedom (statistics)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Degrees_of_freedom_(statistics)\">degrees of freedom<\/a> is the distribution of a sum of the squares of k <a title=\"Independence (probability theory)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Independence_(probability_theory)\">independent<\/a> <a title=\"Standard normal\" href=\"https:\/\/en.wikipedia.org\/wiki\/Standard_normal\">standard normal<\/a> random variables. 
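The definition just quoted, a sum of squares of k independent standard normal variables, can be simulated directly. A small illustrative sketch (seed, k, and sample size are arbitrary choices, not from the source); a chi-square distribution with k degrees of freedom has mean k and variance 2k.

```python
import random

# Simulate a chi-square variable with k degrees of freedom as the sum of
# squares of k independent standard normal draws, then compare the
# empirical mean and variance with the theoretical values k and 2k.
random.seed(1)
k, n = 4, 50_000
sums = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(n)]

mean = sum(sums) / n
var = sum((s - mean) ** 2 for s in sums) / n
print(f"empirical mean ~ {mean:.2f} (theory: {k}); "
      f"empirical variance ~ {var:.2f} (theory: {2 * k})")
```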
The chi-square distribution is a special case of the <a title=\"Gamma distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Gamma_distribution\">gamma distribution<\/a> and is one of the most widely used <a title=\"Probability distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_distribution\">probability distributions<\/a> in <a title=\"Inferential statistics\" href=\"https:\/\/en.wikipedia.org\/wiki\/Inferential_statistics\">inferential statistics<\/a>, notably in <a title=\"Hypothesis testing\" href=\"https:\/\/en.wikipedia.org\/wiki\/Hypothesis_testing\">hypothesis testing<\/a> and in construction of <a title=\"Confidence interval\" href=\"https:\/\/en.wikipedia.org\/wiki\/Confidence_interval\">confidence intervals<\/a>.<a href=\"https:\/\/en.wikipedia.org\/wiki\/Chi-squared_distribution#cite_note-abramowitz-2\">[2]<\/a><a href=\"https:\/\/en.wikipedia.org\/wiki\/Chi-squared_distribution#cite_note-3\">[3]<\/a><a href=\"https:\/\/en.wikipedia.org\/wiki\/Chi-squared_distribution#cite_note-Johnson_et_al-4\">[4]<\/a><a href=\"https:\/\/en.wikipedia.org\/wiki\/Chi-squared_distribution#cite_note-5\">[5]<\/a> When it is being distinguished from the more general <a title=\"Noncentral chi-square distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Noncentral_chi-square_distribution\">noncentral chi-square distribution<\/a>, this distribution is sometimes called the <strong>central chi-square distribution<\/strong>.&#8221;: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Chi-squared_distribution\">https:\/\/en.wikipedia.org\/wiki\/Chi-squared_distribution<\/a><\/p>\n<p>&#8220;A <strong>chi-squared test<\/strong>, also written as <strong><em>\u03c7<\/em>2 test<\/strong>, is any <a title=\"Statistical hypothesis testing\" href=\"https:\/\/en.wikipedia.org\/wiki\/Statistical_hypothesis_testing\">statistical hypothesis test<\/a> where the <a title=\"Sampling distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Sampling_distribution\">sampling 
distribution<\/a> of the test statistic is a <a title=\"Chi-squared distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Chi-squared_distribution\">chi-squared distribution<\/a> when the <a title=\"Null hypothesis\" href=\"https:\/\/en.wikipedia.org\/wiki\/Null_hypothesis\">null hypothesis<\/a> is true. Without other qualification, &#8216;chi-squared test&#8217; often is used as short for <a title=\"Pearson's chi-squared test\" href=\"https:\/\/en.wikipedia.org\/wiki\/Pearson%27s_chi-squared_test\"><em>Pearson&#8217;s<\/em> chi-squared test<\/a>. The chi-squared test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.&#8221;: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Chi-squared_test\">https:\/\/en.wikipedia.org\/wiki\/Chi-squared_test<\/a><\/p>\n<h1>Statistical Significance Tests for Comparing Machine Learning Algorithms<\/h1>\n<ul>\n<li>Statistical hypothesis tests can aid in comparing machine learning models and choosing a final model.<\/li>\n<li>The naive application of statistical hypothesis tests can lead to misleading results.<\/li>\n<li>Correct use of statistical tests is challenging, and there is some consensus for using McNemar\u2019s test or 5\u00d72 cross-validation with a modified paired Student t-test.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/machinelearningmastery.com\/statistical-significance-tests-for-comparing-machine-learning-algorithms\/\">https:\/\/machinelearningmastery.com\/statistical-significance-tests-for-comparing-machine-learning-algorithms\/<\/a><\/p>\n<p><strong>Probability Axioms (I am not convinced that the following is the best way to say it)<\/strong><\/p>\n<p><strong>&#8220;<\/strong><\/p>\n<ul>\n<li>Axiom 1: The probability of an event is a real number greater than or equal to 0.<\/li>\n<li>Axiom 2: The probability that at least one of all the possible outcomes of a process 
(such as rolling a die) will occur is 1.<\/li>\n<li>Axiom 3: If two events <em>A<\/em> and <em>B<\/em> are mutually exclusive, then the probability of either <em>A<\/em> or <em>B<\/em> occurring is the probability of <em>A<\/em> occurring plus the probability of <em>B<\/em> occurring.<\/li>\n<\/ul>\n<p><strong>&#8221;<\/strong><\/p>\n<p><a href=\"https:\/\/plus.maths.org\/content\/maths-minute-axioms-probability\">https:\/\/plus.maths.org\/content\/maths-minute-axioms-probability<\/a><\/p>\n<p>1. Probability is non-negative: P(A) \u2265 0 for any event A.<\/p>\n<p>2. P(S) = 1, where S is the sample space.<\/p>\n<p>3. Probability is additive:<\/p>\n<p>If A and B are two mutually exclusive (disjoint) events, then<\/p>\n<p>P(A \u222a B) = P(A) + P(B),<\/p>\n<p>since P(A \u2229 B) = P(\u2205) = 0 [the events have no outcomes in common].<\/p>\n<p>It also follows that P(A) = 1 \u2212 P(A&#8242;), where A&#8242; is the complement of A,<\/p>\n<p>and that P(\u2205) = 0.<\/p>\n<p><strong>What does probability density function mean?<\/strong><\/p>\n<p><strong>&#8220;Probability density function<\/strong> (PDF) is a statistical expression that defines a <strong>probability distribution<\/strong> for a continuous random variable as opposed to a discrete random variable. When the PDF is graphically portrayed, the area under the curve will indicate the interval in which the variable will fall.&#8221; <a href=\"https:\/\/www.investopedia.com\/terms\/p\/pdf.asp\">https:\/\/www.investopedia.com\/terms\/p\/pdf.asp<\/a><\/p>\n<p>&#8220;A probability density function is most commonly associated with <a title=\"Continuous probability distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Continuous_probability_distribution\">absolutely continuous<\/a> <a title=\"Univariate distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Univariate_distribution\">univariate distributions<\/a>. 
A <a title=\"Random variable\" href=\"https:\/\/en.wikipedia.org\/wiki\/Random_variable\">random variable<\/a> <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/68baa052181f707c662844a465bfeeb135e82bab\" alt=\"X\" \/> has density <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/17fd6605a04f97c6bedb0a9632f9f023cb18dd40\" alt=\"f_X\" \/>, where <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/17fd6605a04f97c6bedb0a9632f9f023cb18dd40\" alt=\"f_X\" \/> is a non-negative <a title=\"Lebesgue integration\" href=\"https:\/\/en.wikipedia.org\/wiki\/Lebesgue_integration\">Lebesgue-integrable<\/a> function, if:<br \/><img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/45fd7691b5fbd323f64834d8e5b8d4f54c73a6f8\" alt=\"\\Pr[a\\leq X\\leq b]=\\int _{a}^{b}f_{X}(x)\\,dx.\" \/><\/p>\n<p>Hence, if <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/062f285db773e329f6c270cb6b65fa076996c941\" alt=\"F_{X}\" \/> is the <a title=\"Cumulative distribution function\" href=\"https:\/\/en.wikipedia.org\/wiki\/Cumulative_distribution_function\">cumulative distribution function<\/a> of <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/68baa052181f707c662844a465bfeeb135e82bab\" alt=\"X\" \/>, then:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/237edf4296a8ef4a946134c613b04b250d2de5be\" alt=\"F_{X}(x)=\\int _{-\\infty }^{x}f_{X}(u)\\,du,\" \/><\/p>\n<p>and <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/17fd6605a04f97c6bedb0a9632f9f023cb18dd40\" alt=\"f_X\" \/> is continuous at <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/87f9e315fd7e2ba406057a97300593c4802b53e4\" alt=\"x\" 
\/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/f9aa7045569fa51444daf07a7161379c02f5d9a6\" alt=\"f_{X}(x)={\\frac {d}{dx}}F_{X}(x).\" \/><\/p>\n<p>Intuitively, one can think of <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/e876fc9aa900411e8eb8d3e8a8101cc1cdb36e7c\" alt=\"{\\displaystyle f_{X}(x)\\,dx}\" \/> as being the probability of <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/68baa052181f707c662844a465bfeeb135e82bab\" alt=\"X\" \/> falling within the infinitesimal <a title=\"Interval (mathematics)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Interval_(mathematics)\">interval<\/a> <img decoding=\"async\" src=\"https:\/\/wikimedia.org\/api\/rest_v1\/media\/math\/render\/svg\/f07271dbe3f8967834a2eaf143decd7e41c61d7a\" alt=\"[x,x+dx]\" \/>.&#8221;<br \/><a href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_density_function\">https:\/\/en.wikipedia.org\/wiki\/Probability_density_function<\/a><\/p>\n<h1>Probability mass function<\/h1>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/File:Discrete_probability_distrib.svg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/8\/85\/Discrete_probability_distrib.svg\/220px-Discrete_probability_distrib.svg.png\" alt=\"\" width=\"220\" height=\"104\" \/><\/a><br \/>The graph of a probability mass function. 
All the values of this function must be non-negative and sum up to 1.<\/p>\n<p>&#8220;In <a title=\"Probability theory\" href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_theory\">probability<\/a> and <a title=\"Statistics\" href=\"https:\/\/en.wikipedia.org\/wiki\/Statistics\">statistics<\/a>, a <strong>probability mass function<\/strong> (<strong>PMF<\/strong>) is a function that gives the probability that a <a title=\"Discrete random variable\" href=\"https:\/\/en.wikipedia.org\/wiki\/Discrete_random_variable\">discrete random variable<\/a> is exactly equal to some value.<a href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_mass_function#cite_note-1\">[1]<\/a> Sometimes it is also known as the discrete density function. The probability mass function is often the primary means of defining a <a title=\"Discrete probability distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Discrete_probability_distribution\">discrete probability distribution<\/a>, and such functions exist for either <a title=\"Scalar variable\" href=\"https:\/\/en.wikipedia.org\/wiki\/Scalar_variable\">scalar<\/a> or <a title=\"Multivariate random variable\" href=\"https:\/\/en.wikipedia.org\/wiki\/Multivariate_random_variable\">multivariate random variables<\/a> whose <a title=\"Domain of a function\" href=\"https:\/\/en.wikipedia.org\/wiki\/Domain_of_a_function\">domain<\/a> is discrete.<\/p>\n<p>A probability mass function differs from a <a title=\"Probability density function\" href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_density_function\">probability density function<\/a> (PDF) in that the latter is associated with continuous rather than discrete random variables. 
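The PMF properties described above can be made concrete with a tiny sketch (the probabilities below are invented for illustration, a hypothetical loaded die): the masses are non-negative, sum to 1, and the mode is the value carrying the largest mass.

```python
# A hypothetical PMF for a loaded six-sided die (illustration only).
pmf = {1: 0.1, 2: 0.1, 3: 0.2, 4: 0.3, 5: 0.2, 6: 0.1}

# All values of a PMF must be non-negative and sum to 1.
assert all(p >= 0 for p in pmf.values())
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# The mode is the value with the largest probability mass.
mode = max(pmf, key=pmf.get)
print(f"P(X = 4) = {pmf[4]}, mode = {mode}")
```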
A PDF must be <a title=\"Integration (mathematics)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Integration_(mathematics)\">integrated<\/a> over an interval to yield a probability.<a href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_mass_function#cite_note-:0-2\">[2]<\/a><\/p>\n<p>The value of the random variable having the largest probability mass is called the <a title=\"Mode (statistics)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Mode_(statistics)\">mode<\/a>.&#8221; <a href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_mass_function\">https:\/\/en.wikipedia.org\/wiki\/Probability_mass_function<\/a><\/p>\n<h2>4.3.1 Mixed Random Variables<\/h2>\n<p>Here, we will discuss <em>mixed<\/em> random variables. These are random variables that are neither discrete nor continuous, but are a mixture of both. In particular, a mixed random variable has a continuous part and a discrete part.<\/p>\n<p><a href=\"https:\/\/www.probabilitycourse.com\/chapter4\/4_3_1_mixed.php\">https:\/\/www.probabilitycourse.com\/chapter4\/4_3_1_mixed.php<\/a>. Also check the examples there.<\/p>\n<p><strong>Expected values of a random variable<\/strong><br \/>The expected value of a discrete <strong>random variable<\/strong> is the probability-weighted average of all its possible values. In other words, each possible value the <strong>random variable<\/strong> can assume is multiplied by its probability of occurring, and the resulting products are summed to produce the expected value.<br \/><a href=\"https:\/\/en.wikipedia.org\/wiki\/Expected_value\">https:\/\/en.wikipedia.org\/wiki\/Expected_value<\/a><\/p>\n<p>The \u201c<strong>moments<\/strong>\u201d of a <strong>random variable<\/strong><\/p>\n<p>The \u201c<strong>moments<\/strong>\u201d of a <strong>random variable<\/strong> (or of its distribution) are expected values of powers or related functions of the <strong>random variable<\/strong>. The <em>r<\/em>th <strong>moment<\/strong> of X is E(X<sup>r<\/sup>). 
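The rth moment can be computed directly from its probability-weighted definition, E(X^r) = Σ x^r·p(x). A minimal sketch with a made-up three-point distribution (the values and probabilities are illustrative, not from the quoted sources):

```python
# Moments of a discrete random variable from the definition
# E(X^r) = sum of x^r * p(x) over the support.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}  # hypothetical distribution

def moment(pmf, r):
    """rth raw moment of a discrete distribution given as {value: probability}."""
    return sum(x**r * p for x, p in pmf.items())

mean = moment(pmf, 1)            # first moment: the mean
second = moment(pmf, 2)          # second raw moment
variance = second - mean**2      # E(X^2) - (E X)^2
print(mean, second, variance)    # → 1.0 1.5 0.5
```

The variance computed this way previews the second *central* moment, which measures spread around the mean.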
In particular, the first <strong>moment<\/strong> is the mean, \u00b5<sub>X<\/sub> = E(X). The mean is a measure of the \u201ccenter\u201d or \u201clocation\u201d of a distribution.<\/p>\n<p><a href=\"http:\/\/homepages.gac.edu\/~holte\/courses\/mcs341\/fall10\/documents\/sect3-3a.pdf\">http:\/\/homepages.gac.edu\/~holte\/courses\/mcs341\/fall10\/documents\/sect3-3a.pdf<\/a><\/p>\n<p>Joint distributions<\/p>\n<p>&#8220;Joint distributions Notes: Below X and Y are assumed to be continuous random variables. This case is, by far, the most important case. Analogous formulas, with sums replacing integrals and p.m.f.\u2019s instead of p.d.f.\u2019s, hold for the case when X and Y are discrete r.v.\u2019s. Appropriate analogs also hold for mixed cases (e.g., X discrete, Y continuous), and for the more general case of n random variables X<sub>1<\/sub>, \u2026, X<sub>n<\/sub>.<\/p>\n<p>\u2022 Joint cumulative distribution function (joint c.d.f.): F(x, y) = P(X \u2264 x, Y \u2264 y)&#8221;<\/p>\n<p><a href=\"https:\/\/faculty.math.illinois.edu\/~hildebr\/461\/jointdistributions.pdf\">https:\/\/faculty.math.illinois.edu\/~hildebr\/461\/jointdistributions.pdf<\/a><\/p>\n<p>The above was mostly taken from the Internet as-is.<\/p>\n\n","protected":false},"excerpt":{"rendered":"<p>Chebyshev&#8217;s inequality &#8220;In probability theory, Chebyshev&#8217;s inequality (also called the Bienaym\u00e9\u2013Chebyshev inequality) guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean. 
Specifically, no more than 1\/k2 of the distribution&#8217;s values can be more than k standard deviations &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"http:\/\/bangla.sitestree.com\/?p=16532\">Continue reading<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1908,182],"tags":[],"class_list":["post-16532","post","type-post","status-publish","format-standard","hentry","category-math-and-statistics-for-data-science-and-engineering","category---blog","item-wrap"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":16531,"url":"http:\/\/bangla.sitestree.com\/?p=16531","url_meta":{"origin":16532,"position":0},"title":"Test: Estimation, Tracking, Probability, Data Science","author":"Sayed","date":"December 28, 2019","format":false,"excerpt":"Chebyshev's inequality \"In probability theory, Chebyshev's inequality (also called the Bienaym\u00e9\u2013Chebyshev inequality) guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean. 
Specifically, no more than 1\/k2 of the distribution's values can be\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":16536,"url":"http:\/\/bangla.sitestree.com\/?p=16536","url_meta":{"origin":16532,"position":1},"title":"Part 2: Some basic Math\/Statistics concepts that Data Scientists (the true ones) will usually know\/use","author":"Sayed","date":"December 29, 2019","format":false,"excerpt":"Part 2: Some basic Math\/Statistics concepts that Data Scientists (the true ones) will usually know\/use (came across, studied, learned, used) Covariance and Correlation \"Covariance is a measure of how two variables change together, but its magnitude is unbounded, so it is difficult to interpret. By dividing covariance by the product\u2026","rel":"","context":"In &quot;Math and Statistics for Data Science, and Engineering&quot;","block_context":{"text":"Math and Statistics for Data Science, and Engineering","link":"http:\/\/bangla.sitestree.com\/?cat=1908"},"img":{"alt_text":"[eq5]","src":"https:\/\/i0.wp.com\/www.statlect.com\/images\/covariance-formula__12.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":17441,"url":"http:\/\/bangla.sitestree.com\/?p=17441","url_meta":{"origin":16532,"position":2},"title":"MISC STATistic PROBability LINEAR ALGebra MATRIX","author":"Sayed","date":"September 14, 2020","format":false,"excerpt":"MISC STAT PROB LINEAR ALG MATRIX PDF AND Stock and Bell Curve: https:\/\/www.investopedia.com\/terms\/p\/pdf.asp PDF in Khan Academy: https:\/\/www.khanacademy.org\/math\/statistics-probability\/random-variables-stats-library\/random-variables-continuous\/v\/probability-density-functions Mixed Random Variable https:\/\/www.youtube.com\/watch?v=ZXJjuRAXMhE \"The variance and the standard 
deviation are measures of the spread of the data around the mean. They summarise how close each observed data value is to the\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":16200,"url":"http:\/\/bangla.sitestree.com\/?p=16200","url_meta":{"origin":16532,"position":3},"title":"Important Basic Concepts: Statistics for Big Data","author":"Sayed","date":"September 15, 2019","format":false,"excerpt":"Important Basic Concepts: Statistics for Big Data Graphical : Exploratory Data Analysis (EDA) methods? First of all, EDA is about exploring the data and understanding if the data will be good for the experiment and study. Graphs and plots can easily show the data patterns. The raw data can be\u2026","rel":"","context":"In &quot;Statistics for Big Data&quot;","block_context":{"text":"Statistics for Big Data","link":"http:\/\/bangla.sitestree.com\/?cat=1904"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":26205,"url":"http:\/\/bangla.sitestree.com\/?p=26205","url_meta":{"origin":16532,"position":4},"title":"Important Basic Concepts: Statistics for Big Data #Root","author":"Author-Check- Article-or-Video","date":"April 19, 2021","format":false,"excerpt":"Important Basic Concepts: Statistics for Big Data Graphical : Exploratory Data Analysis (EDA) methods? First of all, EDA is about exploring the data and understanding if the data will be good for the experiment and study. Graphs and plots can easily show the data patterns. 
The raw data can be\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":16550,"url":"http:\/\/bangla.sitestree.com\/?p=16550","url_meta":{"origin":16532,"position":5},"title":"Part 3: Some Basic Math\/Stat Concepts for the wanna be Data Scientists","author":"Sayed","date":"December 30, 2019","format":false,"excerpt":"Conditional Probability and PDF \"The conditional probability of an event B is the probability that the event will occur given the knowledge that an event A has already occurred. This probability is written P(B|A), notation for the probability of B given A. \" \"In the case where events A and\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2019\/12\/image-8.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2019\/12\/image-8.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2019\/12\/image-8.png?resize=525%2C300 
1.5x"},"classes":[]}],"_links":{"self":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/16532","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=16532"}],"version-history":[{"count":1,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/16532\/revisions"}],"predecessor-version":[{"id":16535,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/16532\/revisions\/16535"}],"wp:attachment":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=16532"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=16532"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=16532"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}