Definition:

  • Simple approach:
    • Store all training examples
    • Classify a new example based on most similar stored examples
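
A minimal version of this approach might look like the Python sketch below (helper names are illustrative; Euclidean distance and majority voting are one common set of choices, not the only ones):

```python
import math
from collections import Counter

def euclidean(a, b):
    # Straight-line (L2) distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    # "Training" is just storing the examples; prediction compares
    # the query against every stored example.
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda ex: euclidean(ex[0], query))
    top_k = [label for _, label in neighbors[:k]]
    # Majority vote among the k nearest stored examples
    return Counter(top_k).most_common(1)[0][0]

# Usage: two tiny 2-D classes
X = [(0, 0), (1, 1), (5, 5), (6, 5)]
y = ["a", "a", "b", "b"]
print(knn_predict(X, y, (0.5, 0.5), k=3))  # -> "a"
```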

Components of a KNN classifier:

  • Distance function hyperparameter
    • How do we measure distance between instances?
    • Determines the layout of the example space (see the distance sketch after this list)
  • The k hyperparameter
    • Determines the complexity of the hypothesis space
      • If k = 1, every training example has its own neighborhood
      • If k = N, the entire feature space is one neighborhood
    • Higher k averages over more neighbors, which yields smoother decision boundaries
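
Different distance functions can disagree about which stored example is nearest, which is what "layout of the example space" means in practice. A small sketch, assuming the euclidean helper from the first snippet plus a Manhattan (L1) variant:

```python
def manhattan(a, b):
    # L1 distance: sum of absolute coordinate differences
    return sum(abs(x - y) for x, y in zip(a, b))

query = (0, 0)
p1, p2 = (3, 3), (0, 5)
print(euclidean(p1, query), euclidean(p2, query))  # ~4.24 vs 5.0 -> p1 is nearer
print(manhattan(p1, query), manhattan(p2, query))  # 6 vs 5      -> p2 is nearer
```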

Decision boundary of a classifier:

  • After plotting the training points, we draw a line, plane, or higher-dimensional hyperplane that decides which region of the space belongs to which class
    • For a linear classifier, this hyperplane passes through the origin unless we add a bias scalar
    • To classify a point, we project it onto the hyperplane's normal (weight) vector and check whether it lands on the positive or negative half of the space (see the sketch below)
  • Once the boundary is known, we no longer need to loop through all training examples to classify a new point
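
A sketch of that projection step for a linear boundary (the weight vector w and bias b below are illustrative values, not learned parameters):

```python
def linear_classify(w, x, b=0.0):
    # Project x onto the weight vector w (the hyperplane's normal);
    # the sign of w.x + b says which half-space x falls in.
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "positive" if score >= 0 else "negative"

w = (1.0, -2.0)                               # normal vector of the hyperplane
print(linear_classify(w, (3.0, 1.0)))         # 3 - 2 = 1.0        -> positive half
print(linear_classify(w, (1.0, 2.0), b=0.5))  # 1 - 4 + 0.5 = -2.5 -> negative half
```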

Inductive bias of KNN:

  • Prediction of an instance should be most similar to its nearby instances
  • All features are equally important
  • Complexity is tuned by the k parameter

Variations on KNN:

Weighted voting:
  • Weight each neighbor's vote by (inverse) distance, so closer neighbors count more
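
A sketch of distance-weighted voting, reusing the hypothetical euclidean helper from the first snippet; weighting each vote by 1 / (distance + eps) is one common choice among several:

```python
def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-9):
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda ex: euclidean(ex[0], query))
    votes = {}
    for x, label in neighbors[:k]:
        # Closer neighbors cast larger votes; eps avoids division by zero
        votes[label] = votes.get(label, 0.0) + 1.0 / (euclidean(x, query) + eps)
    return max(votes, key=votes.get)
```
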
Epsilon Ball Nearest Neighbors:
  • Change the method for selecting which training examples vote
  • Instead of using the k nearest neighbors, use all training examples x such that dist(x, query) ≤ ε
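
A sketch under the same assumptions as the earlier snippets (euclidean helper, majority vote among the in-ball examples); falling back to the single nearest neighbor when the ball is empty is just one possible design choice:

```python
from collections import Counter

def epsilon_ball_predict(train_X, train_y, query, epsilon):
    # Every stored example within distance epsilon of the query gets a vote
    in_ball = [label for x, label in zip(train_X, train_y)
               if euclidean(x, query) <= epsilon]
    if not in_ball:
        # Design choice (assumption): fall back to the single nearest
        # neighbor when no example lies inside the ball
        return min(zip(train_X, train_y),
                   key=lambda ex: euclidean(ex[0], query))[1]
    return Counter(in_ball).most_common(1)[0][0]
```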