url: https://en.wikipedia.org/wiki...
In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression.[1] In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression:
In k-NN classification, the output is a class membership. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.
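The regression case can be sketched in a few lines. This is a minimal illustration using only the standard library, not code from the original post: the function name knn_regress and the toy 1-D data are made up for the example, and math.dist requires Python 3.8 or later.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def knn_regress(train_x, train_y, query, k=3):
    """Predict a numeric value as the mean of the k nearest training targets."""
    # Order the training indices by distance to the query point
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    nearest = order[:k]
    # Average the target values of the k nearest neighbors
    return sum(train_y[i] for i in nearest) / len(nearest)


# Hypothetical 1-D data whose targets follow y = 2x
train_x = [[1.0], [2.0], [3.0], [4.0], [5.0]]
train_y = [2.0, 4.0, 6.0, 8.0, 10.0]

prediction = knn_regress(train_x, train_y, [2.9], k=3)
print(prediction)  # mean of the targets at x = 2, 3, 4
```

The three nearest neighbors of 2.9 are the points at 3, 2 and 4, so the prediction is the mean of 6, 4 and 8.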
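Put simply: compute the distance from the new point to each training point, pick the K points with the smallest distances, and count how often each target (class label) appears among them. The new sample is then assigned the target that occurs most frequently among those K points.
Language: Python 3
Euclidean distance: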
# -*- coding: utf-8 -*-
"""
Created on Sat Mar 17 11:17:18 2018

@author: yangzinan
"""
import numpy as np
import matplotlib.pyplot as plt
from math import sqrt
from collections import Counter

# Training samples
x = [
    [3.393533211, 2.331273381],
    [3.110073483, 1.781539638],
    [1.343808831, 3.368360954],
    [3.582294042, 4.679179110],
    [2.280362439, 2.866990263],
    [7.423436942, 4.696522875],
    [5.745051997, 3.533989803],
    [9.172168622, 2.511101045],
    [7.792783481, 3.424088941],
    [7.939820817, 0.791637231],
]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

x_train = np.array(x)
y_train = np.array(y)

# Plot the two classes and the new point to classify
plt.scatter(x_train[y_train == 0, 0], x_train[y_train == 0, 1], color="red")
plt.scatter(x_train[y_train == 1, 0], x_train[y_train == 1, 1], color="green")
x_point = np.array([8.093607318, 3.365731514])
plt.scatter(x_point[0], x_point[1], color="blue")
plt.show()

# Compute the Euclidean distance from the new point to every training sample
distances = []
for d in x_train:
    # Distance between this training sample and x_point
    d_sum = sqrt(np.sum((d - x_point) ** 2))
    distances.append(d_sum)
print(distances)

# Find the nearest points:
# argsort returns the indices ordered from smallest to largest distance
nearest = np.argsort(distances)

# Number of neighbors to use
k = 3
topK_y = []
# Collect the targets of the k nearest samples
for i in nearest[:k]:
    topK_y.append(y_train[i])

# The most frequent target among them is the prediction for the new point
predict_y = Counter(topK_y).most_common(1)[0][0]
print(predict_y)
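The same steps can also be wrapped in a reusable function. The sketch below is one possible way to do that with only the standard library. The function name knn_classify is made up for this example, and math.dist (Python 3.8+) stands in for the explicit square-root computation. The training data and the query point are the ones from the script above.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)


def knn_classify(train_x, train_y, query, k=3):
    """Return the most common target among the k training points nearest to query."""
    # Order the training indices by distance to the query point
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    # Count the targets of the k nearest neighbors and take the majority
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Training samples and targets from the script above
x = [
    [3.393533211, 2.331273381],
    [3.110073483, 1.781539638],
    [1.343808831, 3.368360954],
    [3.582294042, 4.679179110],
    [2.280362439, 2.866990263],
    [7.423436942, 4.696522875],
    [5.745051997, 3.533989803],
    [9.172168622, 2.511101045],
    [7.792783481, 3.424088941],
    [7.939820817, 0.791637231],
]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

label = knn_classify(x, y, [8.093607318, 3.365731514], k=3)
print(label)
```

With two classes, an odd k avoids tied votes. If a tie does occur, Counter.most_common returns the target that was encountered first, which here is the target of the closer neighbor.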