ITM 760 Lecture Notes - Lecture 7: Sildenafil, Dachshund, Feature Vector
Document Summary
Itm760: lecture 7 large scale machine learning i. Machine learning enthusiasts often speak of clustering with the neologism unsupervised learning. The term unsupervised refers to the fact that the input data does not tell the clustering algorithm what the clusters should be. In supervised machine learning the available data includes information about the correct way to classify at least some of the data. The data that are classified already is called the training set. Estimate a function f(x) so that y = f(x) (predicting for y ). Real number regression (y can be 0 to 1) Data is labeled: have many pairs {(x, y}) x is a vector of binary, categorical, real valued features. y is a class ({+1, -1}), a real number, or a category) We have the height and weight of some dogs from three breeds: beagles, chihuahuas, dachshunds.