STAT1008 Lecture Notes - Lecture 32: Gestation, Independent And Identically Distributed Random Variables, Heteroscedasticity
STAT1008 Week 11 Lecture B
● Linear output:
○ The linear output normally provides you with the coefficients
○ e-5 = 10^-5 (scientific notation in the output)
● Test for correlation:
○ How else can we measure the strength of association between two quantitative
variables? Correlation
■ r = correlation for a sample
■ rho = correlation for a population
■ H0: rho = 0
■ Ha: rho ≠ 0 (two-tailed), or a one-tailed alternative (rho > 0 or rho < 0)
■ t = (r - 0) / sqrt((1 - r^2)/(n - 2))
■ SE = sqrt((1 - r^2)/(n - 2))
■ Find the p-value using a t-distribution with n - 2 df
○ E.g. The correlation for the n = 7 cricket chirp data points is r = 0.99062. Compute
the t-statistic for testing
■ H0: rho = 0 and Ha: rho ≠ 0
■ Using the equivalent form t = r × sqrt(n - 2) / sqrt(1 - r^2):
t = 0.99062 × sqrt(7 - 2) / sqrt(1 - 0.99062^2) = 16.21
■ The t-test for slope and the t-test for correlation are identical!
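The cricket chirp calculation above can be reproduced with a short Python sketch. The function name `correlation_t_statistic` is only an illustrative choice, and the code covers the standard error, t-statistic, and degrees of freedom; the p-value would still need a t-distribution table or statistical software.

```python
import math

def correlation_t_statistic(r, n):
    """Return (t, se, df) for testing H0: rho = 0 given sample correlation r and sample size n."""
    df = n - 2                           # degrees of freedom for the t-distribution
    se = math.sqrt((1 - r ** 2) / df)    # standard error of r under H0
    t = (r - 0) / se                     # (statistic - null value) / SE
    return t, se, df

# Values from the cricket chirp example in the notes
t, se, df = correlation_t_statistic(0.99062, 7)
print(f"SE = {se:.4f}, t = {t:.2f}, df = {df}")
# prints: SE = 0.0611, t = 16.21, df = 5
```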
● Coefficient of determination, R^2
○ Recall that for correlation: -1 ≤ r ≤ 1
○ If we square the correlation, r^2, we get a number between 0 and 1 that can be
interpreted as a percentage
○ R^2 = proportion of variability in the response variable Y that is “explained” by the
model based on the predictor X.
○ By convention we use a capital R^2, although the value is just r^2 for a single
predictor
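As a rough check of the claim that R^2 equals r^2 for a single predictor, the sketch below fits a least-squares line and compares r^2 with 1 - SSE/SST, the usual "proportion of variability explained" formula. The x and y values are made up for illustration and are not data from the lecture.

```python
import math

# Hypothetical (x, y) data, invented for this illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # total variability in y (SST)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sample correlation
r = sxy / math.sqrt(sxx * syy)

# Least-squares slope and intercept
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Unexplained variability (SSE) and proportion explained (R^2)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / syy

print(f"r^2 = {r ** 2:.6f}, R^2 = {r_squared:.6f}")
```

The two printed values should agree up to floating-point rounding, which is why the notes treat R^2 and r^2 as interchangeable in the single-predictor case.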
● Temperature and cricket chirps
○ Find and interpret the value of R^2 for the cricket chirp data where r = 0.99062
○ r^2 = (0.99062)^2 = 0.9813
○ 98.13% of the variability in these temperatures can be explained by the cricket
chirp rates
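The arithmetic for the cricket chirp R^2 can be verified in a couple of lines of Python; this is only a check of the squaring and the percentage conversion, using the r value given in the notes.

```python
r = 0.99062              # sample correlation from the cricket chirp example
r_squared = r ** 2       # coefficient of determination for a single predictor

print(f"R^2 = {r_squared:.4f} ({r_squared:.2%} of variability explained)")
# prints: R^2 = 0.9813 (98.13% of variability explained)
```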
● Checking conditions:
○ y = beta0 + beta1x + epsilon
○ For a simple linear model, we assume the errors (epsilon) are randomly
distributed above and below the line
○ epsilon ~ N(0, sigma^2)
○ Quick check (more details in section 10.2)
■ Look at a scatter plot with regression line on it
○ Watch out for: