STATS 10 Lecture Notes - Lecture 3: Standard Deviation, Missy Franklin, Box Plot

blushsquirrel508

29 Apr 2018

Chapter 3: Numerical Summaries of Center & Variation

April 9, 11, 16

INTRODUCTION

● in Chapter 2, we learned the features to always describe when considering a distribution

○ shape - how many peaks, symmetric or skewed, any outliers

○ center - the “typical” value

○ variability (spread) - how spread out the data is

● In Chapter 3, we will learn about what values to use to measure center and spread

CENTER (the “typical value”)

● mean: the arithmetic average

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{\text{sum of all data values}}{\text{number of data values}}$

● median: the midpoint of the ranked values

○ put the values in increasing order → the median is the value in the middle of the data set

○ if there is an even number of values, take the average of the 2 middle numbers to find the median

● mode: the most frequently observed value
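
The three measures of center can be checked with a short Python sketch. The data values are made up for illustration (they are not from the lecture), and the standard-library `statistics` module is one convenient way to compute the median and mode.

```python
import statistics

# Hypothetical data set with an even number of values
data = [2, 3, 3, 5, 7, 10]

mean = sum(data) / len(data)      # sum of all data values divided by n
median = statistics.median(data)  # sorts the values, then averages the 2 middle ones
mode = statistics.mode(data)      # the most frequently observed value

print(mean, median, mode)
```

Because there are six values, the median is the average of the third and fourth ranked values (3 and 5).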

SPREAD (how spread out the data is)

● standard deviation: the square root of the variance (represents the typical distance of a value from the mean)

$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

○ (x - x̄) is called a deviation: take each data point and subtract the mean; each deviation is then squared

○ ∑ (sigma) means to add up all the squared deviations

○ n = the total number of data values

● Steps to Find the Standard Deviation:

1. Find the mean of your data

2. Subtract the mean from each data point, and then square those differences

3. Add (sum) all the squared values from Step 2

4. Divide the value from Step 3 by the number of data points minus one → this gives you the variance

5. Take the square root of Step 4 (the variance) to get the standard deviation
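
The five steps can be followed line by line in Python. This is a minimal sketch on a made-up data set; the variable names are illustrative, and `statistics.stdev` (which also divides by n - 1) is printed only as a cross-check.

```python
import math
import statistics

# Hypothetical data set (not from the lecture)
data = [4, 8, 6, 5, 7]
n = len(data)

mean = sum(data) / n                                  # Step 1: find the mean
squared_devs = [(x - mean) ** 2 for x in data]        # Step 2: subtract the mean, then square
total = sum(squared_devs)                             # Step 3: add the squared values
variance = total / (n - 1)                            # Step 4: divide by n - 1 -> variance
std_dev = math.sqrt(variance)                         # Step 5: square root -> standard deviation

print(variance, std_dev)
print(statistics.stdev(data))  # library cross-check of the same quantity
```

Here the mean is 6, the squared deviations are 4, 4, 0, 1, 1, and the variance is 10 / 4 = 2.5.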

●Comparing Standard Deviations:

● interquartile range (IQR): the third quartile minus the first quartile

IQR = Q3 - Q1

○ Quartiles:

■ Q1 = first quartile (25% of the data are below this point)

● the median of the numbers less than Q2

■ Q2 = the median (50% of the data are below this point)

■ Q3 = third quartile (75% of the data are below this point)

● the median of the numbers greater than Q2
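
The quartile definitions above can be sketched in Python as "median of each half". The helper name `quartiles` and the data are hypothetical, and the halves are split by position in the sorted list, which is an assumption about how ties with the median are handled. Software packages often use other quartile conventions, so their results may differ slightly.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # values below the median position
    upper = ordered[half + n % 2:]  # values above it (middle value skipped when n is odd)
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# Hypothetical data set
data = [1, 3, 4, 6, 7, 9, 12, 15]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # IQR = Q3 - Q1

print(q1, q2, q3, iqr)
```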

● range: the maximum value minus the minimum value

range = max - min

○ poor measure of spread, because:

■ it is not resistant to outliers

■ generally doesn’t tell us where most of the data is located
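
A small sketch of why the range is not resistant to outliers: adding a single extreme value changes it drastically. The data and the helper name `value_range` are made up for illustration.

```python
def value_range(values):
    """Range = max - min."""
    return max(values) - min(values)

# Hypothetical data set, then the same data with one extreme value added
data = [10, 12, 13, 15, 16]
with_outlier = data + [90]

print(value_range(data))          # range of the original data
print(value_range(with_outlier))  # range after adding one outlier
```

The range jumps from 6 to 80 even though five of the six values are unchanged.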

● measures of spread help us talk about what we don’t know

○ when the data values are tightly clustered around the center of the distribution → the IQR and standard deviation are small

○ when the data values are scattered far from the center → the IQR and standard deviation are large
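
As a quick illustration, two made-up data sets with the same mean can have very different spreads. This sketch compares only the standard deviation, using `statistics.stdev` from the Python standard library.

```python
import statistics

# Two hypothetical data sets, both with mean 10
tight = [9, 10, 10, 11]      # clustered around the center
scattered = [1, 5, 15, 19]   # far from the center

sd_tight = statistics.stdev(tight)
sd_scattered = statistics.stdev(scattered)

print(round(sd_tight, 2), round(sd_scattered, 2))
```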

WHICH CENTER & SPREAD ARE BEST?

● when the distribution is symmetric and unimodal → use mean and standard deviation

● when the distribution is left- or right-skewed → use median and IQR

● when distribution is not unimodal → may be better to split the data:

○ in this case, neither the mean nor the median represents a typical value or the center

○ investigate further into possible separate sub-populations

○ present graphs & statistics of sub-populations separately

● Review:

○ shape - how many peaks, symmetric or skewed, any outliers

○ center - the “typical” value

■ use mean for symmetric distribution
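
The advice to use the median for skewed distributions can be illustrated with a made-up right-skewed data set: one large value pulls the mean upward while the median stays near the bulk of the data.

```python
import statistics

# Hypothetical right-skewed data: most values are in the 20s, one is very large
skewed = [20, 22, 25, 27, 30, 250]

mean = statistics.mean(skewed)
median = statistics.median(skewed)

print(round(mean, 2), median)
```

Here the median (26) describes a typical value better than the mean, which is above every value except the largest.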
