The `weight` parameter can be modified: `uniform` treats all neighbors equally, while `distance` weights each neighbor by its distance under a specified distance function. A custom weighting function can also be supplied.
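A minimal pure-Python sketch of the two weighting modes (the function name and signature are illustrative, loosely mirroring scikit-learn's `weights` parameter; they are not from the original notes):

```python
import math
from collections import defaultdict

def knn_predict(points, labels, query, k=3, weight="uniform"):
    """Classify `query` by a vote among its k nearest neighbors.

    weight="uniform"  -> every neighbor casts one equal vote
    weight="distance" -> each vote is weighted by 1 / distance,
                         so closer neighbors count for more
    """
    # Sort all points by Euclidean distance to the query point.
    dists = sorted((math.dist(p, query), lab) for p, lab in zip(points, labels))
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 if weight == "uniform" else 1.0 / (d + 1e-9)
    return max(votes, key=votes.get)

pts = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
lbl = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(pts, lbl, (1, 1), k=3))                      # "a"
print(knn_predict(pts, lbl, (5, 5.5), k=3, weight="distance"))  # "b"
```

With well-separated clusters both modes agree; `distance` weighting mainly matters near cluster boundaries, where a single very close neighbor can outvote several farther ones.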
| Term | Meaning | Notes |
|---|---|---|
| KNN | the k nearest neighbors to a given point | a query-intensive algorithm, not a learning-intensive one |
| k | the number of neighbors | |
| n | the number of data points | the data is sorted to speed up the algorithm |
| distance | Euclidean distance, the shortest straight line connecting two points | or a custom function that defines the distance |
| visualize | scatter plot; each query point is a circle whose radius grows until it includes the required number of neighbors | |
| performance | Big O notation | log(n) is binary search, 1 is constant, n is linear |
| intuition | KNN stores all the data, then performs a binary search over it at query time; linear regression stores only the model y = mx + b | key contrast: LEARN vs QUERY |
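The "Euclidean OR a custom function" row can be made concrete with a small sketch (the helper names here are illustrative, not from the notes): the nearest neighbor of a query can change depending on which distance function is plugged in.

```python
import math

def euclidean(p, q):
    # Straight-line distance: the shortest line connecting the points.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # An example of a custom distance function (sum of axis distances).
    return sum(abs(a - b) for a, b in zip(p, q))

def nearest(points, query, distance=euclidean):
    # The distance metric is a pluggable parameter.
    return min(points, key=lambda p: distance(p, query))

pts = [(2, 2), (0, 3)]
print(nearest(pts, (0, 0)))                      # (2, 2): 2.83 < 3.0
print(nearest(pts, (0, 0), distance=manhattan))  # (0, 3): 3 < 4
```

Note the answer flips: under Euclidean distance `(2, 2)` is closer to the origin, but under Manhattan distance `(0, 3)` is.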
| Algorithm | Phase | Running Time | Space | Notes |
|---|---|---|---|---|
| 1-NN (1-nearest neighbor on a sorted 1-dimensional list, e.g. [1, 2, 4, 7, 8, 9]) | learning | 1 | n | KNN writes all the data to storage without learning anything, so learning time is constant (1 in Big O notation) and storage space is n, the number of data points |
| | query | log(n) | 1 | binary search to find the one nearest point |
| k-NN | learning | 1 | n | same as 1-NN: store everything, learn nothing |
| | query | log(n) + k | 1 | binary search (log(n)) to find one point, then collect the k items next to it in the sorted list |
| linear regression | learning | n | 1 | one pass over the data to fit y = mx + b; only m and b are stored |
| | query | 1 | 1 | evaluating y = mx + b is constant time |
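The 1-NN and k-NN rows above can be sketched with Python's `bisect` module (a minimal illustration assuming sorted 1-dimensional data, as in the example list; the function names are my own):

```python
import bisect

# "Learning": just store the sorted data -- O(n) space, no model is fitted.
data = sorted([1, 2, 4, 7, 8, 9])

def query_1nn(x):
    # O(log n): binary search for the insertion point,
    # then compare the items on either side of it.
    i = bisect.bisect_left(data, x)
    candidates = data[max(i - 1, 0):i + 1] or [data[-1]]
    return min(candidates, key=lambda v: abs(v - x))

def query_knn(x, k):
    # O(log n + k): binary search, then widen a window around
    # position i one step at a time until it holds k items.
    i = bisect.bisect_left(data, x)
    lo, hi = i, i
    while hi - lo < k:
        if lo == 0:
            hi += 1
        elif hi == len(data):
            lo -= 1
        elif x - data[lo - 1] <= data[hi] - x:
            lo -= 1
        else:
            hi += 1
    return data[lo:hi]

print(query_1nn(6))      # 7
print(query_knn(6, 3))   # [4, 7, 8]
```

By contrast, linear regression would spend O(n) once at learning time fitting `m` and `b`, then answer every query in O(1) by evaluating `m * x + b`, which is the LEARN vs QUERY trade-off the table summarizes.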