See more in the series: visit the main course outline page.
Cross validation: any train/test/validation split should be representative of the real-world dataset, otherwise the model is invalid. We can compare summary statistics, representation, and distributions across the splits. IID: independent and identically distributed, meaning every sample is drawn independently from the same distribution. This is a fundamental assumption of many algorithms.
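A quick sketch of the "compare stats" idea, using NumPy on made-up data. The dataset, the 80/20 ratio, and the `summary` helper are all assumptions for illustration, not from the course. It contrasts a shuffled split with a split taken from sorted data, which is clearly not representative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up dataset: 10,000 samples from one normal distribution (assumption).
data = rng.normal(loc=5.0, scale=2.0, size=10_000)
split = int(0.8 * len(data))

# Shuffled split: train and test are drawn from the same distribution.
shuffled = rng.permutation(data)
train, test = shuffled[:split], shuffled[split:]

# Sorted split: test only contains the largest values, so it is NOT representative.
sorted_data = np.sort(data)
bad_train, bad_test = sorted_data[:split], sorted_data[split:]


def summary(x):
    """Hypothetical helper: basic stats to compare between splits."""
    return {"mean": float(np.mean(x)), "std": float(np.std(x))}


print("shuffled train:", summary(train))
print("shuffled test: ", summary(test))
print("sorted train:  ", summary(bad_train))
print("sorted test:   ", summary(bad_test))
```

With the shuffled split the means and standard deviations should be close. With the sorted split they differ a lot, which is the warning sign to look for.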
Lesson 3 More Regression
This section is by Georgia Tech. The professor also taught the ML for finance course. He's really good, much better than many of the Georgia Tech instructors who contributed content to Udacity.
.2 Parametric regression
Basic line is y=mx+b
Parameters are m,b
There are regions of the data the line cannot track because it's just a line. We can then use a higher-degree polynomial.
y = m_2 * x**2 + m_1 * x + b
m_2,m_1, b are the parameters.
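A minimal sketch of fitting both models with `np.polyfit`. The data here is made up (a noiseless quadratic), and the variable names `b_q`, `line_err`, and `quad_err` are mine, not from the lesson.

```python
import numpy as np

# Made-up data that follows a quadratic exactly (assumption for illustration).
x = np.linspace(-3, 3, 50)
y = 2.0 * x**2 - 1.0 * x + 0.5

# Degree 1: parameters are m, b.
m, b = np.polyfit(x, y, deg=1)

# Degree 2: parameters are m_2, m_1, b (called b_q here to avoid a name clash).
m_2, m_1, b_q = np.polyfit(x, y, deg=2)

# Mean squared error of each fit on the same data.
line_err = np.mean((np.polyval([m, b], x) - y) ** 2)
quad_err = np.mean((np.polyval([m_2, m_1, b_q], x) - y) ** 2)

print("line parameters:     ", m, b)
print("quadratic parameters:", m_2, m_1, b_q)
print("line MSE:", line_err, "quadratic MSE:", quad_err)
```

The line leaves a large error because it cannot bend, while the quadratic recovers the three parameters used to generate the data.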
Lesson 3 Supervised Learning
.3 K Nearest Neighbors
Data-centric, instance-based approach. Example: predicting the weather from historic weather data.
Identify the related (nearest) data points, then what to do? Use the mean of their y values (the prediction values) for regression, or let them vote for classification.
Kernel regression: weight each data point based on its distance from the query. In KNN, each of the k nearest data points is weighted equally.
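To make the difference concrete, here is a small sketch of both ideas in NumPy on 1-D toy data. The function names, the toy data, and the choice of a Gaussian kernel with a `bandwidth` parameter are my assumptions. The lesson only says points are weighted by distance, not which kernel to use.

```python
import numpy as np


def knn_predict(x_train, y_train, x_query, k=3):
    """KNN regression: plain mean of the k nearest y values (equal weights)."""
    dist = np.abs(x_train - x_query)
    nearest = np.argsort(dist)[:k]
    return float(np.mean(y_train[nearest]))


def kernel_predict(x_train, y_train, x_query, bandwidth=1.0):
    """Kernel regression: every point contributes, weighted by distance.

    A Gaussian kernel is assumed here; closer points get larger weights.
    """
    dist = np.abs(x_train - x_query)
    weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
    return float(np.sum(weights * y_train) / np.sum(weights))


# Toy historic data (made up).
x_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_train = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

knn_estimate = knn_predict(x_train, y_train, x_query=2.9, k=3)
kernel_estimate = kernel_predict(x_train, y_train, x_query=2.9, bandwidth=0.5)

print("KNN estimate:   ", knn_estimate)
print("kernel estimate:", kernel_estimate)
```

For the query 2.9, KNN averages the three neighbors at x = 2, 3, 4 equally, while the kernel version leans towards the point at x = 3 because it is by far the closest.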
Parametric vs Non Parametric
The distance a cannonball travels is best estimated with a parametric model, as it follows a well-defined trajectory.
On the other hand, the behavior of honey bees can be hard to model mathematically. Therefore, a non-parametric approach would be more suitable.
Biased means there's a formula for it: the model is biased towards that mathematical form. There's no formula for the other method, so it's unbiased in this sense.
Parametric: we don't have to store the original data, so it's space efficient. However, if there's new data, we cannot update the model without retraining. Training is slow, querying is fast.
Non-parametric: querying is slow and we need to store all the data points, but it's easy to add new data points (kind of like Neo4j) and training is fast. We avoid assuming a type of model such as linear or quadratic. If the problem is complex, we don't need to assume its form.
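The trade-offs above can be sketched side by side. This is a toy illustration with made-up data. The helper names (`parametric_query`, `add_point`, `nonparametric_query`) are hypothetical, and a simple nearest-neighbor mean stands in for the non-parametric model.

```python
import numpy as np

# Made-up training data following y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Parametric: fit once, then keep only the parameters (m, b).
params = np.polyfit(x, y, deg=1)


def parametric_query(x_new):
    """Fast query: just evaluate the formula, no training data needed."""
    return float(np.polyval(params, x_new))


# Non-parametric: "training" is just storing the data.
stored_x, stored_y = list(x), list(y)


def add_point(x_new, y_new):
    """Adding data is cheap: append, no refit required."""
    stored_x.append(x_new)
    stored_y.append(y_new)


def nonparametric_query(x_new, k=2):
    """Slower query: scan all stored points to find the k nearest."""
    xs, ys = np.array(stored_x), np.array(stored_y)
    nearest = np.argsort(np.abs(xs - x_new))[:k]
    return float(np.mean(ys[nearest]))


add_point(4.0, 9.0)

print("stored parameters:", len(params), "stored points:", len(stored_x))
print("parametric query at 10.0:    ", parametric_query(10.0))
print("non-parametric query at 3.9: ", nonparametric_query(3.9))
```

The parametric model carries two numbers no matter how much data it saw, but it would need another `np.polyfit` call to learn from the new point. The non-parametric one picked up the new point immediately, at the cost of keeping everything.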
.7 Quiz: for which problems is regression useful?
.8 Quiz: are polynomials linear? Is polynomial regression still linear?
Yes: the space of polynomials is linear in its coefficients.
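One way to see this: treat x**2 and x as two separate input columns, and the quadratic fit becomes an ordinary linear least-squares problem. A small sketch with NumPy on made-up data (the design matrix name `X` is my choice):

```python
import numpy as np

# Made-up data from y = 3x^2 + 2x + 1.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 3.0 * x**2 + 2.0 * x + 1.0

# Design matrix with columns [x^2, x, 1]. The model y = X @ [m_2, m_1, b]
# is linear in the coefficients even though it is not linear in x.
X = np.column_stack([x**2, x, np.ones_like(x)])

# Ordinary linear least squares solves for the coefficients.
coeffs, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
m_2, m_1, b = coeffs

print("m_2, m_1, b =", m_2, m_1, b)
```

The same linear solver used for y = mx + b handles the polynomial, which is why polynomial regression still counts as linear regression.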
Lesson 4 Regressions in sklearn
.1 Quiz: Continuous Output