import cv2
import numpy as np
from matplotlib import pyplot as plt

Position of each data point: a 25 x 2 array of integer coordinates in the range 0 to 99

trainData = np.random.randint(0, 100, (25,2)).astype(np.float32)

Label of each data point: 0 or 1

response = np.random.randint(0,2,(25,1)).astype(np.float32)

red = trainData[response.ravel()==0]
plt.scatter(red[:,0],red[:,1],80,'r','^')

blue = trainData[response.ravel() == 1]
plt.scatter(blue[:,0],blue[:,1],80,'b','s')

newcomer = np.random.randint(0,100,(1,2)).astype(np.float32)
plt.scatter(newcomer[:,0],newcomer[:,1],80,'g','o')

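# Create a k-NN model and train it; each row of trainData is one sample
knn = cv2.ml.KNearest_create()
knn.train(trainData, cv2.ml.ROW_SAMPLE, response)
# Classify the newcomer using its 3 nearest neighbours
ret, results, neighbours, dist = knn.findNearest(newcomer, 3)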

print("result : ", results)
print("neighbours : ", neighbours)
print("distance : ",dist)

result :  [[1.]]
neighbours :  [[0. 1. 1.]]
distance :  [[ 13.  17. 130.]]

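Treat the red triangles as class 0 and the blue squares as class 1.

result : 1. The new data point was predicted to be class 1, because two of its three nearest neighbours are class 1.
neighbours : [0. 1. 1.] The class labels of the three nearest neighbours, nearest first.
distance : [13. 17. 130.] The distance from the new point to each of those neighbours, in the same order.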
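To make the vote behind that prediction concrete, here is a minimal pure-Python sketch of k-nearest-neighbour classification. It is not OpenCV's implementation. The function name `knn_predict` and the toy coordinates are hypothetical, chosen only so that the printed values match the run above (the actual random data from that run is not known). It also assumes the reported distances are squared Euclidean distances, which the whole-number values in the output suggest but which is not confirmed here.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    scored = []
    for point, label in zip(train_points, train_labels):
        # Squared Euclidean distance (assumption: same measure as the output above).
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, query))
        scored.append((sq_dist, label))

    # Keep the k closest points, nearest first.
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    neighbours = [label for _, label in nearest]
    distances = [d for d, _ in nearest]

    # Majority vote over the neighbour labels.
    result = Counter(neighbours).most_common(1)[0][0]
    return result, neighbours, distances


# Hypothetical toy data, hand-picked to mirror the output above.
train_points = [(52, 43), (51, 36), (61, 37), (10, 90), (90, 10)]
train_labels = [0, 1, 1, 0, 0]
query = (50, 40)

result, neighbours, distances = knn_predict(train_points, train_labels, query, k=3)
print("result :", result)
print("neighbours :", neighbours)
print("distance :", distances)
```

The closest single point is class 0, yet the prediction is class 1 because the vote is taken over all three neighbours. With two classes, an odd k such as 3 avoids tied votes.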

plt.show()
