ENV330H5 Lecture Notes - Lecture 5: Angina Pectoris, Support Vector Machine, Overfitting

38 views, 3 pages
5 Nov 2020

Document Summary

The best way to infer causation relies on three design principles: randomization, replication, and blocking. Replication means our results represent what is typical of the population of interest. Randomization means the results are generalizable and free of bias. Blocking controls for confounding and lurking variables.

Not every scientific question is about cause and effect. Sometimes we want to understand patterns (to classify observations) or to make predictions. For example: what variables are important in classifying a watershed as degraded or not degraded? These questions typically involve data with high n (number of observations) and high p (number of variables), collected without a specific hypothesis test in mind.

Machine learning methods have become very popular in environmental science and ecology. There are many methods (e.g. random trees, support vector machines, neural networks, k-nearest-neighbour clustering). The simplest (but still powerful) are classification and regression trees (decision trees). In a classification tree, the leaves are levels of a categorical variable (the simplest case is binary). In a regression tree, the leaves represent values of a continuous variable.
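To make the difference between the two kinds of leaves concrete, here is a minimal sketch of a single tree split in plain Python. It rests on assumptions the notes do not state: the split is chosen by minimising Gini impurity (one common criterion), the helper names (gini, best_split, classification_leaf, regression_leaf) are invented for illustration, and the watershed data are hypothetical.

```python
from collections import Counter
from statistics import mean


def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return the threshold on one predictor that minimises weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


def classification_leaf(labels):
    """A classification leaf predicts the most common class among its observations."""
    return Counter(labels).most_common(1)[0][0]


def regression_leaf(responses):
    """A regression leaf predicts the mean of the continuous response."""
    return mean(responses)


# Hypothetical data: percent impervious cover for eight watersheds (assumed values).
impervious = [5, 8, 12, 15, 30, 35, 40, 55]
status = ["not degraded"] * 4 + ["degraded"] * 4

threshold, impurity = best_split(impervious, status)
left_status = [s for x, s in zip(impervious, status) if x <= threshold]
right_status = [s for x, s in zip(impervious, status) if x > threshold]

print(threshold, impurity)
print(classification_leaf(left_status), classification_leaf(right_status))
```

A full decision tree repeats this split search recursively on each side and across all p predictors; a regression tree does the same but scores splits on a continuous response (for example by variance) and uses regression_leaf for its predictions.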
