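I thought this would be the simplest case of all: just fit the equation of a straight line.
This fitting code also looks reusable for the supervised learning that comes later (to be continued). For now, here is the code.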
def studentReg(ages_train, net_worths_train):
    ### import the sklearn regression module, create, and train your regression
    ### name your regression reg

    ### your code goes here!
    from sklearn.linear_model import LinearRegression

    reg = LinearRegression()
    reg.fit(ages_train, net_worths_train)

    return reg
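For a single feature such as age, the ordinary least squares fit that `LinearRegression` performs comes down to fitting one straight line. Here is a minimal sketch of that idea in plain Python. The helper name `fit_line` and the data points are made up for illustration; they are not the course's dataset or sklearn's implementation.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # the fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points that lie exactly on y = 6x + 2
ages = [20.0, 30.0, 40.0, 50.0]
net_worths = [122.0, 182.0, 242.0, 302.0]

slope, intercept = fit_line(ages, net_worths)
prediction_at_27 = slope * 27 + intercept
```

With real, noisy data the points will not sit on the line, which is why the slope and intercept alone do not say how good the fit is.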
print("Katie's net worth prediction:", reg.predict([[27]]))  # predict expects a 2D input
print("slope:", reg.coef_)
print("intercept:", reg.intercept_)
print("\n ##########stats on test dataset #########\n")
print("r-squared score:", reg.score(ages_test, net_worths_test))
print("\n ##########stats on training dataset ##########\n")
print("r-squared score:", reg.score(ages_train, net_worths_train))
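The printed prediction is not very accurate, so next I need to study the kinds of error that show up in regression, as well as the R-squared metric. If the model overfits, the R-squared score on the test set will be low even though the score on the training set looks good.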
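To make the R-squared metric less of a black box, here is a small sketch of how it can be computed by hand as 1 minus the residual sum of squares over the total sum of squares, which is the quantity `reg.score` reports for a regression. The function name `r_squared` and the numbers are my own illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / float(len(y_true))
    # squared error of the predictions
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # squared error of always predicting the mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and three sets of predictions
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0])   # exact predictions
baseline = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
worse = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])     # worse than the mean
```

A score of 1 means a perfect fit, 0 means no better than predicting the mean, and the score can go negative when the predictions are worse than that. Comparing the score on the training set with the score on the test set is one way to spot overfitting.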
#!/usr/bin/python
import numpy
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
from studentRegression import studentReg
from class_vis import prettyPicture, output_image
from ages_net_worths import ageNetWorthData
ages_train, ages_test, net_worths_train, net_worths_test = ageNetWorthData()
reg = studentReg(ages_train, net_worths_train)
plt.clf()
plt.scatter(ages_train, net_worths_train, color="b", label="train data")
plt.scatter(ages_test, net_worths_test, color="r", label="test data")
plt.plot(ages_test, reg.predict(ages_test), color="black")
plt.legend(loc=2)
plt.xlabel("ages")
plt.ylabel("net worths")
plt.savefig("test.png")
output_image("test.png", "png", open("test.png", "rb").read())