RSM318H1 Chapter Notes - Chapters 1, 2, 10: Unsupervised Learning, Dependent and Independent Variables, Hierarchical Clustering
Document Summary
Output variable: typically denoted Y. The relationship between Y and the predictors X is written Y = f(X) + ε, where f represents the systematic information that X provides about Y, and the error term ε is independent of X and has mean zero.

Prediction: X is readily available but Y cannot be easily obtained, so Y is predicted from an estimate of f. Reducible error can be reduced by using a more appropriate statistical learning technique to estimate f. Some error is irreducible because Y is also a function of ε, which cannot be predicted using X. ε may contain unmeasured variables that would be useful in predicting Y.

Inference: understanding how Y is affected as X changes. What is the relationship between the response and each predictor? Can the relationship between Y and each predictor be adequately summarized using a linear equation? The end goal (prediction or inference) will influence which model is most appropriate.

Estimating f: develop a procedure that uses training data to train the model and estimate the model parameters. This reduces the problem of estimating f to estimating a set of parameters. The disadvantage is that the model we choose may not match the true form of f.
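
The split between reducible and irreducible error can be illustrated with a small simulation. The sketch below is not from the notes: the true function `true_f`, the noise level `sigma`, the sample sizes, and the helper names are all assumptions chosen for illustration. It generates a response as a nonlinear function of one predictor plus mean-zero noise, fits a straight line by least squares (estimating f by estimating two parameters), and compares test error against the error of the true f itself.

```python
import math
import random


def true_f(x):
    # Hypothetical "true" f, assumed nonlinear purely for illustration.
    return math.sin(2.0 * x)


def simulate(n, sigma, rng):
    # Draw predictors uniformly on [0, 3]; response = f(x) + mean-zero noise.
    xs = [rng.uniform(0.0, 3.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_linear(xs, ys):
    # Least-squares estimates of intercept b0 and slope b1.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def mse(predict, xs, ys):
    # Mean squared prediction error of `predict` on the given data.
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)   # fixed seed so the run is repeatable
sigma = 0.5              # assumed noise standard deviation

x_train, y_train = simulate(500, sigma, rng)
x_test, y_test = simulate(5000, sigma, rng)

b0, b1 = fit_linear(x_train, y_train)

# Error of the linear model: irreducible part plus a reducible part,
# because a straight line cannot match the assumed nonlinear f.
mse_linear = mse(lambda x: b0 + b1 * x, x_test, y_test)

# Error when predicting with the true f: only the noise remains.
mse_true_f = mse(true_f, x_test, y_test)

print("test MSE, linear fit:", round(mse_linear, 3))
print("test MSE, true f:    ", round(mse_true_f, 3))
print("noise variance:      ", sigma ** 2)
```

Under these assumptions, the test error of the true f should sit close to the noise variance `sigma ** 2`, which no choice of model can remove, while the linear fit should show a visibly larger error. That gap is the reducible part, and it reflects the disadvantage noted above: the chosen model form does not match the true form of f.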