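Resource overview
In this project, you will train and test a model on housing data from the suburbs of Boston, Massachusetts, and evaluate its performance and predictive power. A well-fitted model trained on this data can then be used to make specific predictions about a home, in particular its monetary value. Such a predictive model can be very valuable in the day-to-day work of people such as real estate agents.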

Code snippet and file information
###########################################
# Suppress matplotlib user warnings
# Necessary for newer version of matplotlib
import warnings
warnings.filterwarnings("ignore", category = UserWarning, module = "matplotlib")
###########################################

import matplotlib.pyplot as pl
import numpy as np
# Note: sklearn.learning_curve and sklearn.cross_validation were removed in
# scikit-learn 0.20; newer releases provide these names in sklearn.model_selection.
import sklearn.learning_curve as curves
from sklearn.tree import DecisionTreeRegressor
from sklearn.cross_validation import ShuffleSplit, train_test_split

def ModelLearning(X, y):
    """ Calculates the performance of several models with varying sizes of training data.
        The learning and testing scores for each model are then plotted. """

    # Create 10 cross-validation sets for training and testing
    cv = ShuffleSplit(X.shape[0], n_iter = 10, test_size = 0.2, random_state = 0)

    # Generate 9 training set sizes, evenly spaced from 1 to 80% of the data
    train_sizes = np.rint(np.linspace(1, X.shape[0]*0.8 - 1, 9)).astype(int)

    # Create the figure window
    fig = pl.figure(figsize=(10,7))

    # Create four different models based on max_depth
    for k, depth in enumerate([1,3,6,10]):

        # Create a Decision tree regressor at max_depth = depth
        regressor = DecisionTreeRegressor(max_depth = depth)

        # Calculate the training and testing scores
        sizes, train_scores, test_scores = curves.learning_curve(regressor, X, y,
            cv = cv, train_sizes = train_sizes, scoring = 'r2')

        # Find the mean and standard deviation for smoothing
        train_std = np.std(train_scores, axis = 1)
        train_mean = np.mean(train_scores, axis = 1)
        test_std = np.std(test_scores, axis = 1)
        test_mean = np.mean(test_scores, axis = 1)

        # Subplot the learning curve
        ax = fig.add_subplot(2, 2, k+1)
        ax.plot(sizes, train_mean, 'o-', color = 'r', label = 'Training Score')
        ax.plot(sizes, test_mean, 'o-', color = 'g', label = 'Testing Score')
        ax.fill_between(sizes, train_mean - train_std,
            train_mean + train_std, alpha = 0.15, color = 'r')
        ax.fill_between(sizes, test_mean - test_std,
            test_mean + test_std, alpha = 0.15, color = 'g')

        # Labels
        ax.set_title('max_depth = %s'%(depth))
        ax.set_xlabel('Number of Training Points')
        ax.set_ylabel('Score')
        ax.set_xlim([0, X.shape[0]*0.8])
        ax.set_ylim([-0.05, 1.05])

    # Visual aesthetics
    ax.legend(bbox_to_anchor=(1.05, 2.05), loc='lower left', borderaxespad = 0.)
    fig.suptitle('Decision Tree Regressor Learning Performances', fontsize = 16, y = 1.03)
    fig.tight_layout()
    fig.show()

def ModelComplexity(X, y):
    """ Calculates the performance of the model as model complexity increases.
        The learning and testing error rates are then plotted. """

    # Create 10 cross-validation sets for training and testing
    cv = ShuffleSplit(X.shape[0], n_iter = 10, test_size = 0.2, random_state = 0)

    # V
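The snippet above relies on three ideas: repeated random train/test splits (ShuffleSplit), a schedule of growing training set sizes, and the R² score requested with scoring = 'r2'. The sketch below illustrates each using only the Python standard library. It is not the code shipped in visuals.py or scikit-learn's implementation; the helper names shuffle_split_indices, training_set_sizes and r2 are hypothetical, and rounding the test set size is a simplifying assumption.

```python
import random


def shuffle_split_indices(n_samples, n_splits=10, test_size=0.2, seed=0):
    """Yield (train, test) index lists, one independent shuffle per split.

    Assumption: the test set size is round(n_samples * test_size).
    """
    rng = random.Random(seed)
    n_test = int(round(n_samples * test_size))
    for _ in range(n_splits):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # First n_test shuffled indices form the test set, the rest train
        yield indices[n_test:], indices[:n_test]


def training_set_sizes(n_samples, train_fraction=0.8, n_points=9):
    """Evenly spaced sizes from 1 up to (training pool size - 1).

    Approximates np.rint(np.linspace(1, n * 0.8 - 1, 9)).astype(int).
    """
    start = 1.0
    stop = n_samples * train_fraction - 1
    step = (stop - start) / (n_points - 1)
    return [int(round(start + i * step)) for i in range(n_points)]


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    splits = list(shuffle_split_indices(50))
    print(len(splits), len(splits[0][0]), len(splits[0][1]))
    print(training_set_sizes(100))
    print(r2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

A perfect prediction gives an R² of 1, while always predicting the mean of the targets gives 0, which is why the plots above fix the y-axis to roughly the 0 to 1 range.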
Attribute   Size      Date       Time  Name
----------- --------- ---------- ----- ----
Directory           0 2018-01-08 15:11 boston_housing\
Directory           0 2018-01-07 21:16 boston_housing\.ipynb_checkpoints\
File           146801 2018-01-08 13:51 boston_housing\.ipynb_checkpoints\boston_housing-checkpoint.ipynb
File           143652 2018-01-08 15:11 boston_housing\boston_housing.ipynb
File            12435 2016-08-12 03:37 boston_housing\housing.csv
File             1768 2016-08-12 03:37 boston_housing\README.md
File             4882 2018-01-07 21:24 boston_housing\visuals.py
Directory           0 2018-01-07 21:24 boston_housing\__pycache__\
File             3672 2018-01-07 21:24 boston_housing\__pycache__\visuals.cpython-36.pyc