Can modify the weight parameter: uniform treat all neighbors equally, distance as measured in a specified distance function. Can customize func.
KNN | k nearest neigbhors to a given point | |||
k | the number of neighbors | |||
n | the number of data points, data is sorted to speed up the algorithm | |||
distance | Euclidean distance shortest line connecting the points | OR a custom function that defines the distance | ||
visualize | scatter plot, each point is a circle with a radius that include certain number of neighbors | |||
KNN | a query intensive algorithm, not learning intensive | |||
performance | big o notation, log(n) binary search, 1 is constant, n is linear | |||
intuition | KNN stores all the data, then performs a binary search on the data when querying. Linear regression only stores the model y = mx+b. Key concepts: LEARN vs QUERY | |||
Running Time | Space | |||
1 NN 1-nearest neigbhor, 1 dimensional list e.g. [1 2 4 7 8 9] | learning | 1 | n | KNN all data to storage without learning, so running time is 1 which means constant in Big O notation, and storage space is n for the number of data points |
query | log(n) binary search to find one point | 1 | ||
K NN | learning | 1 | n | |
query | log(n) + k binary search log(n) to find one point and the k items next to it in a sorted list | 1 | ||
linear regression | learning | n | 1 | |
query | 1 | 1 |
No comments:
Post a Comment