STAT1008 Lecture Notes - Lecture 23: Standardized Test, Dependent And Independent Variables, Central Limit Theorem
STAT1008 Week 8 Lecture B
● Goals:
○ If a bootstrap distribution can be approximated by a normal distribution, we can
use the normal to compute a confidence interval
○ If a randomisation distribution can be approximated by a normal distribution, we
can use the normal to compute a p-value
○ If we can find an easy way to estimate the standard error (SE), we can even do
this without generating the distribution!
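As an illustration of the last goal: for a sample proportion, one standard shortcut is SE = sqrt(p̂(1 − p̂)/n). The notes do not state this formula, so the sketch below is an assumption rather than the lecture's method. The function name `proportion_se` is hypothetical, and the inputs are the hearing-loss figures quoted later in these notes.

```python
import math

def proportion_se(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion via the usual formula (an assumption here)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Hearing-loss example from the notes: 19.5% of a sample of 1771.
se = proportion_se(0.195, 1771)
print(round(se, 4))
```

The result is roughly 0.0094, close to the SE of 0.0095 used in the hearing-loss example, and no bootstrap distribution had to be generated to get it.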
● Central limit theorem:
○ For random samples with a sufficiently large sample size, the distribution of the
sample statistic for a mean or a proportion is approximately normal
○ What counts as “sufficiently large” varies from case to case
○ If the sample size is too small, the normal approximation may not be reliable
○ The central limit theorem holds for ANY original distribution, although “sufficiently
large sample size” varies
○ The more skewed the original distribution is, the larger n has to be for the CLT to
work
■ For quantitative variables that are not very skewed, n ≥ 30 is
usually sufficient
■ For categorical variables, counts of at least 10 within each category are
usually sufficient
○ E.g. Hearing loss
■ In a random sample of 1771 Americans aged 12-19, 19.5% had some
hearing loss
■ What proportion of Americans aged 12 to 19 have some hearing loss?
Give a 95% CI
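■ Normally we would have generated a bootstrap distribution to answer this
■ Instead, approximate the distribution of the statistic by N(0.195, 0.0095), so
statistic ± z* × SE = 0.195 ± 1.96 × 0.0095 = (0.176, 0.214)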
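The hearing-loss interval can be reproduced in a few lines. This is a minimal sketch that takes the statistic (0.195), the SE (0.0095) and the sample size (1771) directly from the notes; the variable names are illustrative, and z* is looked up with Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

stat = 0.195   # sample proportion from the notes
se = 0.0095    # standard error quoted in the notes
n = 1771       # sample size from the notes

# Rule of thumb from the notes: at least 10 observations in each category.
with_loss = stat * n            # approximate count with hearing loss
without_loss = (1 - stat) * n   # approximate count without hearing loss
large_enough = min(with_loss, without_loss) >= 10

# z* leaves 95% of N(0, 1) between -z* and +z*, so 97.5% lies below +z*.
z_star = NormalDist().inv_cdf(0.975)

lower = stat - z_star * se
upper = stat + z_star * se
print(f"counts OK: {large_enough}")
print(f"z* = {z_star:.2f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With these inputs the interval rounds to (0.176, 0.214), matching the value in the notes.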
● Confidence Intervals:
○ If the bootstrap distribution is approximately normal, a P% confidence interval
is simply the middle P% of the distribution N(statistic, SE)
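Finding the middle P% of N(statistic, SE) amounts to reading off two quantiles. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `middle_interval` is hypothetical, and the inputs are the hearing-loss values from these notes.

```python
from statistics import NormalDist

def middle_interval(stat: float, se: float, level: float) -> tuple[float, float]:
    """Return the endpoints enclosing the middle `level` of N(stat, se)."""
    dist = NormalDist(mu=stat, sigma=se)
    tail = (1 - level) / 2          # area left over in each tail
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

low, high = middle_interval(0.195, 0.0095, 0.95)

# Sanity check: the area between the endpoints should equal the level.
dist = NormalDist(mu=0.195, sigma=0.0095)
area = dist.cdf(high) - dist.cdf(low)
print(f"middle 95%: ({low:.3f}, {high:.3f}), area = {area:.3f}")
```

Because the normal distribution is symmetric, these quantiles are the same as statistic ± z* × SE.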
● Confidence Interval using N(0,1):
○ If a statistic is normally distributed, we find a confidence interval for the
parameter using statistic ± z* × SE, where the area between -z* and
+z* in the standard normal distribution is the desired level of confidence
○ The statistic comes from the original data, SE comes from the bootstrap
distribution, and z* comes from N(0,1)
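The three ingredients can be sketched end to end. The sample below is synthetic (60 ones and 240 zeros, chosen purely for illustration and not taken from any survey), and the seed and the number of resamples (2000) are likewise assumptions.

```python
import random
from statistics import NormalDist, stdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic 0/1 sample (NOT real data): 1 = has the trait, 0 = does not.
sample = [1] * 60 + [0] * 240
n = len(sample)

# 1. The statistic comes from the original data.
stat = sum(sample) / n

# 2. SE comes from the bootstrap distribution: resample n values with
#    replacement, recompute the proportion, and take the standard deviation.
boot_stats = [sum(rng.choices(sample, k=n)) / n for _ in range(2000)]
se = stdev(boot_stats)

# 3. z* comes from N(0, 1); 0.975 leaves 95% between -z* and +z*.
z_star = NormalDist().inv_cdf(0.975)

print(f"stat = {stat:.3f}, SE = {se:.4f}")
print(f"95% CI = ({stat - z_star * se:.3f}, {stat + z_star * se:.3f})")
```

Only step 2 needs the resampling; a formula-based SE, where one is available, could replace it.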
● P% confidence interval:
○ P% is the area between -z* and +z* under N(0,1)
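To make the link between P% and z* concrete, here is a short sketch that computes z* for a few common confidence levels and checks the enclosed area. It assumes Python's standard-library `statistics.NormalDist`; the function name `z_star` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def z_star(level: float) -> float:
    """z* such that the area between -z* and +z* under N(0, 1) equals `level`."""
    return STD_NORMAL.inv_cdf((1 + level) / 2)

for level in (0.90, 0.95, 0.99):
    z = z_star(level)
    # Check: the area between -z and +z should recover the level.
    area = STD_NORMAL.cdf(z) - STD_NORMAL.cdf(-z)
    print(f"{level:.0%}: z* = {z:.3f}, area = {area:.4f}")
```

This gives z* of about 1.645, 1.960 and 2.576 for 90%, 95% and 99% confidence.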
find more resources at oneclass.com