The figure below shows the dataset we will use. It records how the unemployment rate changed from 1948 to 2016.
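1. Import the plotting module with import matplotlib.pyplot as plt. The function plt.plot(x, y) defines the points of a line chart: x is a list of the points' x-coordinates and y is a list of their y-coordinates. plt.show() displays the figure, plt.xlabel() names the x-axis, plt.xticks() can rotate the x-axis tick labels by a given angle, and plt.title() gives the chart a title.
The code below puts these functions to use.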
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Plot the first twelve months in the file (the year 1948) as a line chart
unrated = pd.read_csv("C:/學習/python/hello/UNRATE.csv")
first_twelve = unrated.head(12)

x_series = first_twelve["DATE"]
y_series = first_twelve["VALUE"]

plt.xticks(rotation=90)
plt.xlabel("Month")
plt.ylabel("Unemployment rate")

plt.title("Unemployment rate trends in 1948")

plt.plot(x_series, y_series)
plt.show()
The output is shown below.
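For readers who do not have UNRATE.csv at hand, here is a minimal sketch of the same kind of chart built through an Axes object, the style used in the bar chart section. The month names and rates are made-up illustrative values, not data from the file.

```python
import matplotlib.pyplot as plt

# Made-up monthly values standing in for a few rows of UNRATE.csv.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
rates = [3.4, 3.8, 4.0, 3.9, 3.5, 3.6]

fig, ax = plt.subplots(figsize=(6, 4))
(line,) = ax.plot(months, rates)            # one Line2D object per plotted line
ax.set_xlabel("Month")                      # same role as plt.xlabel()
ax.set_ylabel("Unemployment rate")
ax.set_title("Illustrative line chart")     # same role as plt.title()
ax.tick_params(axis="x", labelrotation=90)  # same role as plt.xticks(rotation=90)
```

Calling plt.show() afterwards displays the figure in an interactive session.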
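2. plt.subplot(n, m, x) adds multiple subplots to one figure. n and m describe the subplot layout as the number of rows and columns, and x selects the x-th subplot counting from left to right, top to bottom.
The code below gives a usage example.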
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 16))
ax1 = plt.subplot(2, 3, 1)
ax2 = plt.subplot(2, 3, 2)
ax3 = plt.subplot(2, 3, 3)
ax5 = plt.subplot(2, 3, 5)
plt.show()
The output is shown below.
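The subplots above are empty. As a small sketch of how data ends up in a particular cell of the grid, the following draws into positions 1 and 5 of a 2 x 3 layout. The plotted numbers are arbitrary and serve only to make the two cells visible.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(9, 6))
# 2 rows x 3 columns; positions count left to right, then top to bottom.
ax1 = plt.subplot(2, 3, 1)   # row 1, column 1
ax5 = plt.subplot(2, 3, 5)   # row 2, column 2

# Each call returns an Axes object; plot on it directly.
ax1.plot([1, 2, 3], [1, 4, 9])
ax1.set_title("position 1")
ax5.plot([1, 2, 3], [9, 4, 1])
ax5.set_title("position 5")
```

Positions that are never requested (here 2, 3, 4 and 6) simply stay blank in the figure.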
3. The code below draws several line charts on the same axes.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

unrated = pd.read_csv("C:/學習/python/hello/UNRATE.csv")
unrated["DATE"] = pd.to_datetime(unrated["DATE"])
color = ["red", "yellow", "blue", "green", "purple"]
plt.figure(figsize=(20, 16))
for i in range(5):
    # Rows i*12 .. (i+1)*12-1 hold the twelve months of one year
    sub_unrated = unrated.loc[i*12:(i+1)*12-1]
    sub_unrated_x = sub_unrated['DATE'].dt.month
    sub_unrated_y = sub_unrated["VALUE"]
    label = 1948+i
    plt.plot(sub_unrated_x, sub_unrated_y, color=color[i], label=label)
plt.legend(loc="best")
plt.show()
The output is shown below.
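The slice unrated.loc[i*12:(i+1)*12-1] relies on .loc including both endpoints of a label slice, which is why the upper bound carries a -1. A minimal sketch with a made-up two-year frame illustrates this. The column names DATE and VALUE follow the code above, while the values and the helper year_slice are hypothetical.

```python
import pandas as pd

# Two years of made-up monthly values with the default 0..23 integer index.
dates = pd.date_range("1948-01-01", periods=24, freq="MS")
unrate = pd.DataFrame({"DATE": dates, "VALUE": [float(v) for v in range(24)]})


def year_slice(frame, i):
    """Return the rows of the i-th block of twelve months (0-based)."""
    # .loc slices by label and includes BOTH endpoints, hence the -1.
    return frame.loc[i * 12:(i + 1) * 12 - 1]


second_year = year_slice(unrate, 1)
months = second_year["DATE"].dt.month.tolist()  # month numbers used as x values
```

Because every year maps to the same month numbers 1 to 12, the lines for different years share one x-axis and can be compared directly.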
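4. The order in which figures and subplots are created determines which figure a subplot is drawn in. When the code creates several figures, a subplot is drawn in the figure created most recently before it (the current figure).
In the code below, two figures are created: ax1 is drawn in the first figure and ax in the second.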
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 16))
ax1 = plt.subplot(2, 2, 1)
plt.figure(figsize=(14, 12))
ax = plt.subplot(1, 1, 1)

plt.show()
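One way to check which figure a subplot landed in is to keep the Figure objects and compare them with the figure attribute of each Axes. This is a small sketch of that check. The names fig1, fig2 and owners are introduced here for illustration.

```python
import matplotlib.pyplot as plt

fig1 = plt.figure(figsize=(6, 4))
ax1 = plt.subplot(2, 2, 1)   # fig1 is the current figure, so ax1 goes there

fig2 = plt.figure(figsize=(5, 4))
ax = plt.subplot(1, 1, 1)    # fig2 is now current, so ax goes into fig2

# Each Axes keeps a reference to the figure that owns it.
owners = (ax1.figure is fig1, ax.figure is fig2)
```

Here owners evaluates to (True, True), matching the rule described above.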
5. Drawing a bar chart with matplotlib
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
reviews = pd.read_csv("C:/學習/python/hello/fandango_score_comparison.csv")
cols = ["FILM", "RT_user_norm", "Metacritic_user_nom", "IMDB_norm", "Fandango_Ratingvalue", "Fandango_Stars"]
norm_reviews = reviews[cols]
num_cols = ["RT_user_norm", "Metacritic_user_nom", "IMDB_norm", "Fandango_Ratingvalue", "Fandango_Stars"]
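bar_height = norm_reviews.loc[0, num_cols].values  # ratings of the first film; loc[row, columns] selects one row, with the column list as the second dimension
bar_position = 1 + np.arange(5)  # np.arange returns an ndarray, so the addition is element-wise; the built-in range does not support this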
plt.figure(figsize=(10, 10))
ax = plt.subplot(1, 1, 1)
ax.bar(bar_position, bar_height, 0.3)  # bar_position holds the x-coordinates of the bar centers, bar_height the heights, 0.3 the bar width
# Set the x-axis tick positions and labels
tick_positions = range(1, 6)
ax.set_xticks(tick_positions)
ax.set_xticklabels(num_cols)
# Set the x-axis and y-axis labels
ax.set_xlabel("Rating Source")
ax.set_ylabel("Average Rating")
ax.set_title("Whatever")
plt.show()
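The output is shown below.
Changing a few lines of the code above turns it into a horizontal bar chart: ax.bar becomes ax.barh, set_xticks and set_xticklabels become set_yticks and set_yticklabels, and the two axis labels swap. The modified code is as follows.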
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

reviews = pd.read_csv("C:/學習/python/hello/fandango_score_comparison.csv")
cols = ["FILM", "RT_user_norm", "Metacritic_user_nom", "IMDB_norm", "Fandango_Ratingvalue", "Fandango_Stars"]
norm_reviews = reviews[cols]

num_cols = ["RT_user_norm", "Metacritic_user_nom", "IMDB_norm", "Fandango_Ratingvalue", "Fandango_Stars"]
bar_height = norm_reviews.loc[0, num_cols].values  # ratings of the first film; loc[row, columns] selects one row

bar_position = 1 + np.arange(5)  # np.arange returns an ndarray, so the addition is element-wise
plt.figure(figsize=(10, 10))
ax = plt.subplot(1, 1, 1)
ax.barh(bar_position, bar_height, 0.3)  # bar_position holds the y-coordinates of the bar centers, bar_height the bar lengths, 0.3 the bar thickness
# Set the y-axis tick positions and labels
tick_positions = range(1, 6)
ax.set_yticks(tick_positions)
ax.set_yticklabels(num_cols)

# Set the x-axis and y-axis labels
ax.set_ylabel("Rating Source")
ax.set_xlabel("Average Rating")
ax.set_title("Whatever")
plt.show()
The output is shown below.
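To see the two variants side by side without the CSV file, here is a minimal sketch that draws the same made-up scores with bar and barh. The source names and score values are hypothetical. Only the matplotlib calls mirror the code above.

```python
import numpy as np
import matplotlib.pyplot as plt

sources = ["A", "B", "C"]             # hypothetical rating sources
scores = np.array([4.3, 3.6, 4.1])    # made-up scores
positions = np.arange(1, len(sources) + 1)

fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(8, 4))

# Vertical bars: positions run along x, scores become bar heights.
vertical = ax_v.bar(positions, scores, 0.3)
ax_v.set_xticks(positions)
ax_v.set_xticklabels(sources)

# Horizontal bars: positions run along y, scores become bar lengths.
horizontal = ax_h.barh(positions, scores, 0.3)
ax_h.set_yticks(positions)
ax_h.set_yticklabels(sources)
```

In the vertical chart the third argument is the bar width, while in the horizontal chart it is the bar thickness along the y-axis.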
6. Drawing a scatter plot
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

reviews = pd.read_csv("C:/學習/python/hello/fandango_score_comparison.csv")
plt.figure(figsize=(10,10))
ax = plt.subplot(1, 1, 1)
ax.scatter(reviews["RT_norm"], reviews["Metacritic_user_nom"])
plt.show()
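The output is shown below.
7. Let a be a Series. b = a.value_counts() returns a frequency count of a: b is a Series whose index holds the distinct values of a and whose values are the number of times each value occurs.
The code below shows this.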
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

reviews = pd.read_csv("C:/學習/python/hello/fandango_score_comparison.csv")
cols = ["FILM", "RT_user_norm", "Metacritic_user_nom", "Fandango_Ratingvalue"]
norm_reviews = reviews[cols]
# fandango_distribution is a Series: the index holds the column's distinct values, the values hold how many times each occurs
fandango_distribution = norm_reviews["Fandango_Ratingvalue"].value_counts()
print(fandango_distribution.head(5))
print(type(fandango_distribution))
print(fandango_distribution.index)
The output is shown below.
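A tiny self-contained sketch makes the shape of the result easier to see. The ratings here are made up, and the sort_index() step is an optional extra for when the counts should be ordered by rating value instead of by frequency.

```python
import pandas as pd

# Made-up ratings; 4.5 occurs three times, 4.1 twice, 3.9 once.
ratings = pd.Series([4.5, 4.1, 4.5, 3.9, 4.1, 4.5])

counts = ratings.value_counts()   # ordered by count, most frequent first
by_value = counts.sort_index()    # reordered by the rating value itself
```

counts has 4.5 as its first index entry with a count of 3, and by_value lists the ratings 3.9, 4.1, 4.5 with counts 1, 2, 3.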
8. Now let's draw a histogram
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

reviews = pd.read_csv("C:/學習/python/hello/fandango_score_comparison.csv")
cols = ["FILM", "RT_user_norm", "Metacritic_user_nom","IMDB_norm", "Fandango_Ratingvalue"]
norm_reviews = reviews[cols]
plt.figure(figsize=(10, 10))
ax = plt.subplot(1, 1, 1)
ax.hist(norm_reviews["RT_user_norm"], bins=20)  # bins sets how many intervals the x-axis range is divided into

plt.show()
The output is shown below.
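ax.hist also returns the numbers behind the picture, which helps when checking what bins actually does. The sketch below uses randomly generated values in place of the RT_user_norm column and assumes a 0 to 5 scale. The range argument, which fixes the outer bin edges, is an addition not used in the code above.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
values = rng.uniform(0, 5, size=200)   # made-up scores on an assumed 0-5 scale

fig, ax = plt.subplots()
# counts: bar heights, edges: the bin boundaries, patches: the drawn bars
counts, edges, patches = ax.hist(values, bins=20, range=(0, 5))
```

With bins=20 there are 20 counts and 21 edges, and the counts add up to the number of values that fall inside the range.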