Resource overview
Machine Learning in Action (source code and sample data)
Code snippet and file information
'''
Created on Sep 16, 2010
kNN: k Nearest Neighbors

Input:      inX: vector to compare to existing dataset (1xN)
            dataSet: size m data set of known vectors (NxM)
            labels: data set labels (1xM vector)
            k: number of neighbors to use for comparison (should be an odd number)

Output:     the most popular class label

@author: pbharrin
'''
from numpy import *
import operator
from os import listdir

def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Euclidean distance from inX to every row of dataSet
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat**2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances**0.5
    sortedDistIndicies = distances.argsort()
    # Tally the labels of the k nearest neighbors
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # items() replaces the Python 2-only iteritems()
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    return sortedClassCount[0][0]

def createDataSet():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels

def file2matrix(filename):
    love_dictionary = {'largeDoses': 3, 'smallDoses': 2, 'didntLike': 1}
    with open(filename) as fr:                  #close the file once it has been read
        arrayOLines = fr.readlines()
    numberOfLines = len(arrayOLines)            #get the number of lines in the file
    returnMat = zeros((numberOfLines, 3))       #prepare matrix to return
    classLabelVector = []                       #prepare labels return
    index = 0
    for line in arrayOLines:
        line = line.strip()
        listFromLine = line.split('\t')
        returnMat[index, :] = listFromLine[0:3]
        if listFromLine[-1].isdigit():
            classLabelVector.append(int(listFromLine[-1]))
        else:
            classLabelVector.append(love_dictionary.get(listFromLine[-1]))
        index += 1
    return returnMat, classLabelVector

def autoNorm(dataSet):
    minVals = dataSet.min(0)
    maxVals = dataSet.max(0)
    ranges = maxVals - minVals
    normDataSet = zeros(shape(dataSet))
    m = dataSet.shape[0]
    normDataSet = dataSet - tile(minVals, (m, 1))
    normDataSet = normDataSet/tile(ranges, (m, 1))   #element wise divide
    return normDataSet, ranges, minVals

def datingClassTest():
    hoRatio = 0.50      #hold out 50% of the rows as test vectors
    datingDataMat, datingLabels = file2matrix('datingTestSet2.txt')       #load data set from file
    normMat, ranges, minVals = autoNorm(datingDataMat)
    m = normMat.shape[0]
    numTestVecs = int(m*hoRatio)
    errorCount = 0.0
    for i in range(numTestVecs):
        classifierResult = classify0(normMat[i, :], normMat[numTestVecs:m, :], datingLabels[numTestVecs:m], 3)
        print("the classifier came back with: %d, the real answer is: %d" % (classifierResult, datingLabels[i]))
        if classifierResult != datingLabels[i]:
            errorCount += 1.0
    print("the total error rate is: %f" % (errorCount/float(numTestVecs)))
    print(errorCount)
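
The snippet above relies on tile() to line up the query vector with the data set before subtracting. As a rough sketch only, and not part of kNN.py, the same distance, voting and min-max scaling steps can be written for Python 3 with NumPy broadcasting. The names knn_classify and min_max_normalize are hypothetical and chosen for illustration; the toy data is the one returned by createDataSet().

```python
from collections import Counter

import numpy as np


def knn_classify(query, data, labels, k):
    """Return the majority label among the k rows of `data` nearest to `query`."""
    # Broadcasting subtracts `query` from every row, so tile() is not needed.
    distances = np.sqrt(((data - query) ** 2).sum(axis=1))
    nearest = distances.argsort()[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def min_max_normalize(data):
    """Scale each column to [0, 1]. Assumes no column is constant (range != 0)."""
    min_vals = data.min(axis=0)
    ranges = data.max(axis=0) - min_vals
    return (data - min_vals) / ranges, ranges, min_vals


# Toy data set taken from createDataSet() in the snippet above.
group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
labels = ["A", "A", "B", "B"]

print(knn_classify(np.array([0.0, 0.0]), group, labels, 3))  # B
print(knn_classify(np.array([1.0, 1.2]), group, labels, 3))  # A
```

With k = 3 the query [0, 0] has two "B" neighbors and one "A" neighbor, so the vote returns "B". The query [1.0, 1.2] has two "A" neighbors and returns "A".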

Attr      Size  Date        Time   Name
----  --------  ----------  -----  ----
Dir          0  2016-02-18  14:50  machinelearninginaction-master\
Dir          0  2016-02-18  14:50  machinelearninginaction-master\Ch02\
Dir          0  2016-02-18  14:50  machinelearninginaction-master\Ch02\EXTRAS\
File       514  2016-02-18  14:50  machinelearninginaction-master\Ch02\EXTRAS\README.txt
File      1988  2016-02-18  14:50  machinelearninginaction-master\Ch02\EXTRAS\createDist.py
File      2094  2016-02-18  14:50  machinelearninginaction-master\Ch02\EXTRAS\createDist2.py
File       543  2016-02-18  14:50  machinelearninginaction-master\Ch02\EXTRAS\createFirstPlot.py
File         0  2016-02-18  14:50  machinelearninginaction-master\Ch02\EXTRAS\testSet.txt
File       239  2016-02-18  14:50  machinelearninginaction-master\Ch02\README.txt
File     34725  2016-02-18  14:50  machinelearninginaction-master\Ch02\datingTestSet.txt
File     26067  2016-02-18  14:50  machinelearninginaction-master\Ch02\datingTestSet2.txt
File    739988  2016-02-18  14:50  machinelearninginaction-master\Ch02\digits.zip
File      5222  2016-02-18  14:50  machinelearninginaction-master\Ch02\kNN.py
Dir          0  2016-02-18  14:50  machinelearninginaction-master\Ch03\
File        84  2016-02-18  14:50  machinelearninginaction-master\Ch03\classifierStorage.txt
File       771  2016-02-18  14:50  machinelearninginaction-master\Ch03\lenses.txt
File      3824  2016-02-18  14:50  machinelearninginaction-master\Ch03\treePlotter.py
File      4065  2016-02-18  14:50  machinelearninginaction-master\Ch03\trees.py
Dir          0  2016-02-18  14:50  machinelearninginaction-master\Ch04\
Dir          0  2016-02-18  14:50  machinelearninginaction-master\Ch04\EXTRAS\
File       514  2016-02-18  14:50  machinelearninginaction-master\Ch04\EXTRAS\README.txt
File       922  2016-02-18  14:50  machinelearninginaction-master\Ch04\EXTRAS\create2Normal.py
File       433  2016-02-18  14:50  machinelearninginaction-master\Ch04\EXTRAS\monoDemo.py
File      7076  2016-02-18  14:50  machinelearninginaction-master\Ch04\bayes.py
File     15141  2016-02-18  14:50  machinelearninginaction-master\Ch04\email.zip
Dir          0  2016-02-18  14:50  machinelearninginaction-master\Ch05\
Dir          0  2016-02-18  14:50  machinelearninginaction-master\Ch05\EXTRAS\
File       514  2016-02-18  14:50  machinelearninginaction-master\Ch05\EXTRAS\README.txt
File      1233  2016-02-18  14:50  machinelearninginaction-master\Ch05\EXTRAS\plot2D.py
File      1710  2016-02-18  14:50  machinelearninginaction-master\Ch05\EXTRAS\plotGD.py
File      1846  2016-02-18  14:50  machinelearninginaction-master\Ch05\EXTRAS\plotSDerror.py
............ (99 more file entries omitted here)