1. Scraping the data and loading it into the database

After running into some problems scraping the data with jsoup, I changed approach and used Python to scrape the epidemic data quickly.

After spending some time learning the relevant parts of Python, I used a few methods from requests, two rounds of JSON conversion, and database insert operations.
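The JSON conversion is the least obvious step, because the payload is decoded twice: once for the outer response and once for the string stored under its "data" key. The sketch below illustrates this on a hand-written sample instead of a live request. The sample values are made up, and `decode_payload` is a hypothetical helper name, not part of the original code.

```python
import json

# Hand-written sample that mimics the nesting of the real response.
# All figures are invented for illustration only.
sample_response_text = json.dumps({
    "data": json.dumps({
        "lastUpdateTime": "2020-03-11 19:00:00",
        "areaTree": [{
            "name": "中国",
            "children": [{
                "name": "湖北",
                "children": [{
                    "name": "武汉",
                    "today": {"confirm": 1},
                    "total": {"confirm": 100, "heal": 60, "dead": 5},
                }],
            }],
        }],
    }),
})


def decode_payload(text):
    """Decode the outer envelope, then the JSON string stored under "data"."""
    envelope = json.loads(text)          # str -> dict; envelope["data"] is still a str
    return json.loads(envelope["data"])  # str -> dict holding the actual figures


payload = decode_payload(sample_response_text)
first_province = payload["areaTree"][0]["children"][0]
print(payload["lastUpdateTime"], first_province["name"])
```

Skipping the second `json.loads` would leave `data` as a plain string, so indexing it with `"areaTree"` would raise a `TypeError`.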

The scraping code is as follows:

# Scrape Tencent's daily epidemic data

import json

import pymysql
import requests


def get_tencent_data():
    """
    Scrape the target data from the target site.
    :return: list of rows [update_time, province, city, confirm, confirm_add, heal, dead]
    """
    # URL of the data to scrape
    url = "https://view.inews.qq.com/g2/getOnsInfo?name=disease_h5"
    headers = {
        # User agent, sent to get past a basic anti-scraping check
        "user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Mobile Safari/537.36"
    }
    # headers must be passed by keyword: the second positional
    # argument of requests.get is params, not headers
    r = requests.get(url, headers=headers)
    res = json.loads(r.text)  # first-level conversion: JSON string to dict
    data_all = json.loads(res["data"])  # second level: "data" is itself a JSON string
    details = []

    # Structure of the returned data:
    #   lastUpdateTime    time of the last update
    #   chinaTotal        totals
    #   chinaDayList      history
    #   chinaDayAddList   history of daily increases
    #   areaTree:
    #     - name          areaTree[0] is the data for China
    #     - today
    #     - total
    #     - children:     province-level data, a list of JSON objects
    #         - name
    #         - today
    #         - total
    #         - children: city-level data, a list
    #             - name
    #             - today
    #             - total
    # The URL above no longer includes the historical data; it can be queried at
    # https://view.inews.qq.com/g2/getOnsInfo?name=disease_other

    update_time = data_all["lastUpdateTime"]
    data_country = data_all["areaTree"]  # list of 47 countries
    data_province = data_country[0]["children"]  # provinces of China

    for pro_infos in data_province:
        province = pro_infos["name"]  # province name
        # print(province)
        for city_infos in pro_infos["children"]:
            city = city_infos["name"]
            confirm = city_infos["total"]["confirm"]
            confirm_add = city_infos["today"]["confirm"]
            heal = city_infos["total"]["heal"]
            dead = city_infos["total"]["dead"]
            details.append([update_time, province, city, confirm, confirm_add, heal, dead])
    return details


def get_conn():
    """
    Open the database connection.
    :return: (conn, cursor)
    """
    conn = pymysql.connect(
        host='127.0.0.1',   # local IP address
        user='root',        # database user name
        password='123456',  # password
        db='web01',         # name of the database to work on
    )
    # The cursor object executes SQL statements and fetches their results
    cursor = conn.cursor()
    return conn, cursor


def close_conn(conn, cursor):
    """
    Close the connection.
    :param conn: connection object
    :param cursor: cursor object
    :return:
    """
    if cursor:
        cursor.close()
    if conn:
        conn.close()


def update_yiqingdata():
    """
    Update the daily data.
    :return:
    """
    # Get the connection
    conn, cursor = get_conn()
    # Get the data
    data = get_tencent_data()
    # SQL statement that writes to the database
    sql = "insert into infos(updatetime,province,city,confirm,confirmadd,heal,dead) values(%s,%s,%s,%s,%s,%s,%s)"
    try:
        # Execute the SQL statement once per row
        cursor.executemany(sql, data)
        conn.commit()
    except Exception:
        # Undo the partial insert if anything fails
        conn.rollback()
    finally:
        close_conn(conn, cursor)


# Call the function
update_yiqingdata()
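The insert statement writes to an `infos` table whose definition is not shown in this post. As a rough sketch, the table could look like the one below. The column names are taken from the insert statement, but the column types and the `id` column are assumptions. To keep the example runnable without a MySQL server, it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for pymysql, which means `?` placeholders instead of `%s`. `save_rows` is a hypothetical helper that mirrors the commit-or-rollback pattern of `update_yiqingdata()`.

```python
import sqlite3

# Assumed schema: column names come from the insert statement,
# the types and the id column are guesses.
CREATE_INFOS = """
CREATE TABLE IF NOT EXISTS infos (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    updatetime TEXT NOT NULL,
    province TEXT NOT NULL,
    city TEXT NOT NULL,
    confirm INTEGER,
    confirmadd INTEGER,
    heal INTEGER,
    dead INTEGER
)
"""

# sqlite3 uses "?" placeholders where pymysql uses "%s".
INSERT_INFOS = (
    "INSERT INTO infos (updatetime, province, city, confirm, confirmadd, heal, dead) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)"
)


def save_rows(conn, rows):
    """Insert all rows in one transaction; roll back and re-raise on failure."""
    try:
        conn.executemany(INSERT_INFOS, rows)
        conn.commit()
    except sqlite3.Error:
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute(CREATE_INFOS)

# Made-up rows in the same column order that get_tencent_data() produces.
rows = [
    ["2020-03-11 19:00:00", "湖北", "武汉", 100, 1, 60, 5],
    ["2020-03-11 19:00:00", "湖北", "孝感", 50, 0, 30, 2],
]
save_rows(conn, rows)
stored = conn.execute("SELECT COUNT(*) FROM infos").fetchone()[0]
print(stored)
```

Re-raising after the rollback is a deliberate difference from the script above: the script silently discards a failed batch, which makes an empty table hard to diagnose.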


2. Visualization

The result is shown in the figure below:

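Once the data has been scraped, all that remains is to adjust the SQL query from the previous version slightly and make a few changes to the ECharts data format.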
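The exact query and chart options are not shown here, so the following is only a sketch of what that adjustment might look like. It assumes the map series takes a list of `{name, value}` objects, that only the latest snapshot in `infos` should be displayed, and that city-level counts are summed per province. `LATEST_SQL` and `to_echarts_map_data` are hypothetical names.

```python
import json
from collections import defaultdict

# Assumed query against the infos table: rows from the latest snapshot only.
LATEST_SQL = (
    "SELECT province, confirm FROM infos "
    "WHERE updatetime = (SELECT MAX(updatetime) FROM infos)"
)


def to_echarts_map_data(rows):
    """Sum city-level confirmed counts per province into [{name, value}, ...]."""
    totals = defaultdict(int)
    for province, confirm in rows:
        totals[province] += confirm
    return [{"name": name, "value": value} for name, value in totals.items()]


# Made-up (province, confirm) rows, shaped like the result of LATEST_SQL.
rows = [("湖北", 100), ("湖北", 50), ("广东", 20)]
map_data = to_echarts_map_data(rows)

# ensure_ascii=False keeps the province names readable in the JSON output.
print(json.dumps(map_data, ensure_ascii=False))
```

The province names have to match the region names used by the map definition on the front end, otherwise those regions stay uncoloured.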

3. PSP table for the learning and implementation process

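Date	Start time	End time	Interruption	Net time	Activity	Notes
3.10	15:35	17:35	10min	1h50min	Learning how to use jsoup	Watched videos to learn, and got a general understanding of jsoup
3.11	9:50	10:50	5min	55min	Hands-on practice with jsoup	Successfully scraped web page images by following a video example
3.11	13:30	15:30	0	2h	Scraping data with jsoup	Content generated dynamically by JavaScript in the page could not be scraped. Found a workaround that uses the PhantomJS plugin, read up on it, and tried to use it
3.11	16:00	17:00	0	1h	Using the PhantomJS plugin	Did not manage to scrape the data; changed approach and switched to Python for scraping
3.11	19:00	22:00	30min	2h30min	Learning basic Python syntax and scraping techniques	Scraped the data with Python, adapted the given example to store the data in the database, and displayed it with ECharts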

Scraping nationwide epidemic data with Python + a visualization map
