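Resource overview
In this project, you will train and test a model on housing data collected from the suburbs of Boston, Massachusetts, and evaluate its performance and predictive power. A model trained well on this data can then be used to make specific predictions about a home, in particular its monetary value. Such a model is valuable in the day-to-day work of someone like a real estate agent.
Code snippet and file information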
###########################################
# Suppress matplotlib user warnings
# Necessary for newer version of matplotlib
import warnings
warnings.filterwarnings("ignore", category = UserWarning, module = "matplotlib")
###########################################

import matplotlib.pyplot as pl
import numpy as np
# Note: sklearn.learning_curve and sklearn.cross_validation are legacy modules
# that were removed in scikit-learn 0.20; sklearn.model_selection replaces them.
import sklearn.learning_curve as curves
from sklearn.tree import DecisionTreeRegressor
from sklearn.cross_validation import ShuffleSplit, train_test_split

def ModelLearning(X, y):
    """ Calculates the performance of several models with varying sizes of training data.
        The learning and testing scores for each model are then plotted. """

    # Create 10 cross-validation sets for training and testing
    cv = ShuffleSplit(X.shape[0], n_iter = 10, test_size = 0.2, random_state = 0)

    # Generate the training set sizes increasing by 50
    train_sizes = np.rint(np.linspace(1, X.shape[0]*0.8 - 1, 9)).astype(int)

    # Create the figure window
    fig = pl.figure(figsize=(10,7))

    # Create four different models based on max_depth
    for k, depth in enumerate([1,3,6,10]):

        # Create a Decision tree regressor at max_depth = depth
        regressor = DecisionTreeRegressor(max_depth = depth)

        # Calculate the training and testing scores
        sizes, train_scores, test_scores = curves.learning_curve(regressor, X, y, \
            cv = cv, train_sizes = train_sizes, scoring = 'r2')

        # Find the mean and standard deviation for smoothing
        train_std = np.std(train_scores, axis = 1)
        train_mean = np.mean(train_scores, axis = 1)
        test_std = np.std(test_scores, axis = 1)
        test_mean = np.mean(test_scores, axis = 1)

        # Subplot the learning curve
        ax = fig.add_subplot(2, 2, k+1)
        ax.plot(sizes, train_mean, 'o-', color = 'r', label = 'Training Score')
        ax.plot(sizes, test_mean, 'o-', color = 'g', label = 'Testing Score')
        ax.fill_between(sizes, train_mean - train_std, \
            train_mean + train_std, alpha = 0.15, color = 'r')
        ax.fill_between(sizes, test_mean - test_std, \
            test_mean + test_std, alpha = 0.15, color = 'g')

        # Labels
        ax.set_title('max_depth = %s'%(depth))
        ax.set_xlabel('Number of Training Points')
        ax.set_ylabel('Score')
        ax.set_xlim([0, X.shape[0]*0.8])
        ax.set_ylim([-0.05, 1.05])

    # Visual aesthetics
    ax.legend(bbox_to_anchor=(1.05, 2.05), loc='lower left', borderaxespad = 0.)
    fig.suptitle('Decision Tree Regressor Learning Performances', fontsize = 16, y = 1.03)
    fig.tight_layout()
    fig.show()

def ModelComplexity(X, y):
    """ Calculates the performance of the model as model complexity increases.
        The learning and testing error rates are then plotted. """

    # Create 10 cross-validation sets for training and testing
    cv = ShuffleSplit(X.shape[0], n_iter = 10, test_size = 0.2, random_state = 0)

    # V
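
The training-size grid and the mean ± standard deviation band used in the snippet above can be reproduced without NumPy or scikit-learn, which may help when reading the learning-curve plots. The following is a minimal sketch, not part of `visuals.py`: the helper names `training_set_sizes` and `score_band`, the row count of 500, and the sample scores are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def training_set_sizes(n_samples, train_fraction=0.8, num_sizes=9):
    """Evenly spaced training-set sizes from 1 to n_samples*train_fraction - 1.

    Mirrors np.rint(np.linspace(1, n*0.8 - 1, 9)).astype(int) from the snippet.
    """
    start = 1.0
    stop = n_samples * train_fraction - 1
    step = (stop - start) / (num_sizes - 1)
    # Python's round() sends exact halves to the nearest even integer,
    # which is the same rule np.rint applies.
    return [int(round(start + i * step)) for i in range(num_sizes)]


def score_band(scores_per_size):
    """Return (mean, mean - std, mean + std) for each training size.

    Each inner list holds the scores from the cross-validation splits at one
    training size. pstdev is the population standard deviation, matching the
    default behaviour of np.std.
    """
    bands = []
    for scores in scores_per_size:
        m = mean(scores)
        s = pstdev(scores)
        bands.append((m, m - s, m + s))
    return bands


# Hypothetical data set with 500 rows (an assumption, not the real row count).
sizes = training_set_sizes(500)
print(sizes)

# Hypothetical scores: two training sizes, two cross-validation splits each.
bands = score_band([[0.25, 0.75], [0.5, 0.5]])
print(bands)
```

With these assumed inputs the sizes come out roughly 50 apart, which is what the comment "increasing by 50" in the snippet refers to. The shaded region drawn by `fill_between` corresponds to the lower and upper values of each band.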
Attribute  Size    Date        Time   Name
---------  ------  ----------  -----  ----
Directory  0       2018-01-08  15:11  boston_housing\
Directory  0       2018-01-07  21:16  boston_housing\.ipynb_checkpoints\
File       146801  2018-01-08  13:51  boston_housing\.ipynb_checkpoints\boston_housing-checkpoint.ipynb
File       143652  2018-01-08  15:11  boston_housing\boston_housing.ipynb
File       12435   2016-08-12  03:37  boston_housing\housing.csv
File       1768    2016-08-12  03:37  boston_housing\README.md
File       4882    2018-01-07  21:24  boston_housing\visuals.py
Directory  0       2018-01-07  21:24  boston_housing\__pycache__\
File       3672    2018-01-07  21:24  boston_housing\__pycache__\visuals.cpython-36.pyc