In this lecture we’re going to continue our discussion of hypothesis testing on Gaussian-distributed data. To elaborate on how hypothesis tests work, we’re going to talk about the hypotheses a little bit more. There are two hypotheses when we’re doing a statistical test. There’s the null hypothesis, which we call H-naught (H0), and it usually represents the control data, or random data that results purely from chance. The second is the alternative hypothesis. This is generally the thing we’re trying to find evidence for. For example, if we’re doing an experiment where we’re testing a drug, the drug actually working would be the alternative hypothesis. So the alternative hypothesis is the hypothesis that the sample observations are influenced by some nonrandom cause.

Now suppose we have some random data, so let’s say R1 equals randn. Let’s say it has a hundred points, a mean of zero, and a variance of one. Now let’s say we have another data set, R2, which has maybe 20 points, but this one’s going to have a different mean. Let’s say its mean is one, and let’s increase its spread a little bit. Okay, so we have two data sets that are Gaussian distributed. The first one has a mean of zero and a variance of one, and the second one has a mean of one and a standard deviation of two.

So how do we compare these two distributions? There is a test called the two-sample t-test, ttest2, which does what we want: it tests the null hypothesis that the two data sets have the same mean, and it again returns a test decision and a p-value. So we can try this: ttest2(R1,R2). Okay, so we reject the null hypothesis in this case with a very, very small p-value. Remember, it only has to be less than five percent for us to reject the null hypothesis. So let’s try some things. Let’s have fewer data points for R1, and let’s do our t-test again. Notice how we still reject the null hypothesis, but our p-value has increased, so it’s less significant than before.
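The steps so far can be sketched in MATLAB roughly as follows. This is a minimal sketch, not the exact code typed in the lecture: it assumes the Statistics and Machine Learning Toolbox is installed (for ttest2), and the fixed seed and the reduced sample size of 30 are arbitrary choices made here for illustration.

```matlab
% Sketch only: assumes the Statistics and Machine Learning Toolbox (ttest2).
rng(0);                       % fixed seed (an assumption) so runs are repeatable

R1 = randn(100, 1);           % 100 points, mean 0, variance 1
R2 = 1 + 2*randn(20, 1);      % 20 points, mean 1, standard deviation 2

% h = 1 means the null hypothesis of equal means is rejected at the 5% level;
% p is the p-value of the test.
[h, p] = ttest2(R1, R2);
fprintf('n1 = 100: h = %d, p = %.4g\n', h, p);

% Repeat with fewer points in R1 (30 is an arbitrary illustrative size).
R1small = randn(30, 1);
[hSmall, pSmall] = ttest2(R1small, R2);
fprintf('n1 = 30:  h = %d, p = %.4g\n', hSmall, pSmall);
```

By default ttest2 assumes the two samples have equal variances. Since R2 is deliberately more spread out than R1 here, calling ttest2(R1, R2, 'Vartype', 'unequal') may be the more appropriate variant.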
Alright, so now let’s do the same thing for R2. Let’s say this now only has 10 points, and let’s do the t-test again. Alright, so this is still significant. Let’s increase the variance. Alright, so I had to increase the variance a lot to get an insignificant p-value. So that’s one thing to keep in mind when you’re comparing two Gaussian distributions: you can’t really say one has a bigger mean than the other if the data are spread out a lot. So let’s put the variance back for R2, but let’s say its mean is now closer to R1’s mean. Let’s do our t-test again, and this also gives us an insignificant p-value. So that’s another fact about the t-test: you also can’t tell that two distributions are different if their means are very close together. If they’re very far apart, so we move R2’s mean well away from R1’s and do the t-test again, we now get a very small p-value.
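To see both effects side by side, one option is to sweep the spread and the mean of R2 and record the p-value each time. The sketch below assumes the Statistics and Machine Learning Toolbox; the seed and the specific values in sigmas and mus are arbitrary illustrations, not the numbers used in the lecture.

```matlab
% Sketch only: assumes the Statistics and Machine Learning Toolbox (ttest2).
rng(1);                       % fixed seed (an assumption) so runs are repeatable

R1 = randn(100, 1);           % reference data: mean 0, variance 1
z  = randn(10, 1);            % 10 standard normal draws, reused to build every R2

% Effect of spread: keep R2's mean at 1 and increase its standard deviation.
sigmas = [2 5 10];            % illustrative values
pSigma = zeros(size(sigmas));
for k = 1:numel(sigmas)
    [~, pSigma(k)] = ttest2(R1, 1 + sigmas(k)*z);
    fprintf('mean = 1, std = %g: p = %.4g\n', sigmas(k), pSigma(k));
end

% Effect of separation: keep R2's standard deviation at 2 and move its mean.
mus = [0.1 1 5];              % illustrative values
pMu = zeros(size(mus));
for k = 1:numel(mus)
    [~, pMu(k)] = ttest2(R1, mus(k) + 2*z);
    fprintf('mean = %g, std = 2: p = %.4g\n', mus(k), pMu(k));
end
```

Reusing the same draws z for every R2 means only the chosen mean and spread change between runs, which makes the trend in the p-values easier to read off.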