Python Crawler 28 | Don't let your scraped data go to waste: data visualization with Python

Date: 2019-11-16


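Over the past while,

小帥b has walked you from packet capture

to data crawling

to data parsing

and on to data storage.

I trust you can now scrape data from most of the websites you want to crawl.

Congrats, congrats!

But

once the data has been scraped,

it deserves a proper analysis.

The best way to do that is to visualize the data,

because that is how you get a direct feel for its charm.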

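One thing, though:

there are countless Python visualization libraries on the market these days,

each with its own strengths.

So next, 小帥b will share some of the visualization libraries he uses most often.

Sound good?

Well then,

what comes next is

the right way to learn Python.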

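Let's start with a classic visualization library:

matplotlib

It is a data visualization tool built on NumPy, and it ships with a great many chart types for us to use.

So let's play with it.

First, you need to install the library:

python -m pip install -U pip setuptools
python -m pip install matplotlib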
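Some matplotlib keyword arguments differ between releases, so it can help to confirm which version the install step actually gave you. A minimal sketch, assuming only that matplotlib and NumPy are importable in the current environment; the variable names are just for illustration:

```python
import matplotlib
import numpy as np

# Both packages expose their version as a plain string.
mpl_version = matplotlib.__version__
np_version = np.__version__

print("matplotlib", mpl_version)
print("numpy", np_version)
```

If either import fails, the install step did not reach the interpreter you are running.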

Once it's installed,

you can start playing with code.

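Plot the sin and cos curves

import numpy as np
import matplotlib.pyplot as plt

# 256 evenly spaced points from -pi to pi
x = np.linspace(-np.pi, np.pi, 256)
cos = np.cos(x)
sin = np.sin(x)

# dashed, thicker line for cos; default solid line for sin
plt.plot(x, cos, '--', linewidth=2)
plt.plot(x, sin)

plt.show()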

Draw a pie chart

import matplotlib.pyplot as plt

# Pie chart, where the slices will be ordered and plotted counter-clockwise:
labels = 'Frogs', 'Hogs', 'Dogs', 'Logs'
sizes = [15, 30, 45, 10]
explode = (0, 0.1, 0, 0)  # only "explode" the 2nd slice (i.e. 'Hogs')

fig1, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%', shadow=True, startangle=90)
ax1.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.

plt.show()

Draw a histogram

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)

mu = 200
sigma = 25
x = np.random.normal(mu, sigma, size=100)

fig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(8, 4))

# density=True replaces the old normed=1 argument, which newer matplotlib releases no longer accept
ax0.hist(x, 20, density=True, histtype='stepfilled', facecolor='g', alpha=0.75)
ax0.set_title('stepfilled')

# Create a histogram by providing the bin edges (unequally spaced).
bins = [100, 150, 180, 195, 205, 220, 250, 300]
ax1.hist(x, bins, density=True, histtype='bar', rwidth=0.8)
ax1.set_title('unequal bins')

fig.tight_layout()
plt.show()
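The histogram example above asks matplotlib to normalize the bars (the argument is spelled `normed` in old releases and `density` in current ones). With unequal bins it is easy to misread what that means: the bar heights are densities, so it is the total bar area that comes out to 1, not the sum of the heights. Here is a small sketch of that idea using `np.histogram` on the same data and bin edges; the names `heights`, `widths`, and `area` are only for illustration:

```python
import numpy as np

np.random.seed(0)
x = np.random.normal(200, 25, size=100)
bins = [100, 150, 180, 195, 205, 220, 250, 300]

# density=True divides each bin count by (number of counted samples * bin width)
heights, edges = np.histogram(x, bins=bins, density=True)
widths = np.diff(edges)

# Height times width, summed over all bins, gives the total area under the bars.
area = float(np.sum(heights * widths))
print(area)
```

This is why a narrow bin can show a taller bar than a wide bin that holds more samples.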

More matplotlib documentation is available at the link below

https://matplotlib.org/2.0.2/contents.html

seaborn

seaborn is a library built on top of matplotlib, so it gives us a higher-level interface and is somewhat easier to use.
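Because seaborn sits on top of matplotlib, what it draws can still be adjusted with ordinary matplotlib calls. The sketch below is one way to see that, under a few assumptions: the tiny DataFrame is made-up data standing in for whatever you scraped, the `Agg` backend is chosen only so the code runs without a display, and `scatter.png` is an arbitrary output file name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up sample data (hypothetical columns, not from the article).
df = pd.DataFrame({
    "price": [12.5, 30.0, 18.2, 45.9, 22.0],
    "rating": [3.8, 4.5, 4.0, 4.9, 4.2],
})

# Axes-level seaborn functions hand back a plain matplotlib Axes object ...
ax = sns.scatterplot(x="price", y="rating", data=df)

# ... so the usual matplotlib methods keep working on it.
ax.set_title("rating vs. price (made-up data)")
ax.set_xlabel("price")
ax.figure.savefig("scatter.png")
plt.close(ax.figure)
```

Figure-level functions such as `relplot` and `catplot` return a grid object rather than a single Axes, which is why this sketch uses `scatterplot`.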

Draw a scatter plot

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="darkgrid")

# load_dataset downloads the example dataset, so it needs a network connection
tips = sns.load_dataset("tips")
sns.relplot(x="total_bill", y="tip", data=tips)
plt.show()

Draw a line plot

fmri = sns.load_dataset("fmri")
# one line per event type, drawn over the time points
sns.relplot(x="timepoint", y="signal", hue="event", kind="line", data=fmri)
plt.show()

Draw a bar chart

titanic = sns.load_dataset("titanic")
# survival rate by sex, with one bar per passenger class
sns.catplot(x="sex", y="survived", hue="class", kind="bar", data=titanic)
plt.show()

For more on seaborn, see the link below

https://seaborn.pydata.org/index.html

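pyecharts

This library is built on ECharts, the data visualization library open-sourced by Baidu.

When ECharts meets Python,

it's like chocolate meeting music:

silky smooth~

Especially when pyecharts is paired with Notebook,

it simply couldn't get any smoother.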

Let's draw a bar chart

from pyecharts.charts import Bar
from pyecharts import options as opts

bar = (
    Bar()
    .add_xaxis(["shirt", "sweater", "tie", "pants", "trench coat", "high heels", "socks"])
    .add_yaxis("Merchant A", [114, 55, 27, 101, 125, 27, 105])
    .add_yaxis("Merchant B", [57, 134, 137, 129, 145, 60, 49])
    .set_global_opts(title_opts=opts.TitleOpts(title="Sales at a shopping mall"))
)
# with no argument, render() writes render.html in the current directory
bar.render()

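Draw a pie chart

from pyecharts import options as opts
from pyecharts.charts import Pie
from pyecharts.faker import Faker
from pyecharts.render import make_snapshot
from snapshot_selenium import snapshot as driver

def pie_base() -> Pie:
    c = (
        Pie()
        # Faker supplies sample category names and values
        .add("", [list(z) for z in zip(Faker.choose(), Faker.values())])
        .set_global_opts(title_opts=opts.TitleOpts(title="Pie - basic example"))
        .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    )
    return c

# requires snapshot_selenium to be installed
make_snapshot(driver, pie_base().render(), "pie.png")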

And now a word cloud

from pyecharts import options as opts
from pyecharts.charts import WordCloud
from pyecharts.render import make_snapshot
from snapshot_selenium import snapshot as driver

# (word, weight) pairs; a larger weight gives a larger word
words = [
    ("Sam S Club", 10000),
    ("Macys", 6181),
    ("Amy Schumer", 4386),
    ("Jurassic World", 4055),
    ("Charter Communications", 2467),
    ("Chick Fil A", 2244),
    ("Planet Fitness", 1868),
    ("Pitch Perfect", 1484),
    ("Express", 1112),
    ("Home", 865),
    ("Johnny Depp", 847),
    ("Lena Dunham", 582),
    ("Lewis Hamilton", 555),
    ("KXAN", 550),
    ("Mary Ellen Mark", 462),
    ("Farrah Abraham", 366),
    ("Rita Ora", 360),
    ("Serena Williams", 282),
    ("NCAA baseball tournament", 273),
    ("Point Break", 265),
]

def wordcloud_base() -> WordCloud:
    c = (
        WordCloud()
        .add("", words, word_size_range=[20, 100])
        .set_global_opts(title_opts=opts.TitleOpts(title="WordCloud - basic example"))
    )
    return c

# requires snapshot_selenium to be installed
make_snapshot(driver, wordcloud_base().render(), "WordCloud.png")