Consolidation Review (Hany驛站 original)_Python's Gift

Date: 2021-04-15

Category: Python

Python編程語言簡介 https://www.cnblogs.com/hany-postq473111315/p/12256134.html Python環境搭建及中文編碼 https://www.cnblogs.com/hany-postq473111315/p/12256337.html Python 基礎語法 https://www.cnblogs.com/hany-postq473111315/p/12257287.html Python 變量類型及變量賦值 https://www.cnblogs.com/hany-postq473111315/p/12258952.html Python 標準數據類型 https://www.cnblogs.com/hany-postq473111315/p/12259374.html Python 數字數據類型 https://www.cnblogs.com/hany-postq473111315/p/12259684.html Python字符串 https://www.cnblogs.com/hany-postq473111315/p/12259859.html Python列表 https://www.cnblogs.com/hany-postq473111315/p/12259925.html Python元組 https://www.cnblogs.com/hany-postq473111315/p/12260559.html Python字典 https://www.cnblogs.com/hany-postq473111315/p/12260671.html Python數據類型轉換 https://www.cnblogs.com/hany-postq473111315/p/12260913.html Python算術運算符 https://www.cnblogs.com/hany-postq473111315/p/12261418.html Python比較運算符 https://www.cnblogs.com/hany-postq473111315/p/12262652.html Python賦值運算符 https://www.cnblogs.com/hany-postq473111315/p/12262791.html Python位運算符 https://www.cnblogs.com/hany-postq473111315/p/12262990.html Python邏輯運算符 https://www.cnblogs.com/hany-postq473111315/p/12263166.html Python成員運算符 https://www.cnblogs.com/hany-postq473111315/p/12263573.html Python身份運算符 https://www.cnblogs.com/hany-postq473111315/p/12263746.html Python運算符優先級 https://www.cnblogs.com/hany-postq473111315/p/12263888.html Python條件語句 https://www.cnblogs.com/hany-postq473111315/p/12264149.html Python簡單的語句組 https://www.cnblogs.com/hany-postq473111315/p/12264240.html Python循環語句 https://www.cnblogs.com/hany-postq473111315/p/12264329.html Python循環控制語句 https://www.cnblogs.com/hany-postq473111315/p/12264653.html Python while循環語句 https://www.cnblogs.com/hany-postq473111315/p/12264768.html Python無限循環 https://www.cnblogs.com/hany-postq473111315/p/12267695.html Python while 循環中使用 else 語句 https://www.cnblogs.com/hany-postq473111315/p/12267719.html Python while 中簡單的語句組 https://www.cnblogs.com/hany-postq473111315/p/12267753.html Python for循環語句 
https://www.cnblogs.com/hany-postq473111315/p/12268067.html Python for循環使用 else 語句 https://www.cnblogs.com/hany-postq473111315/p/12268076.html Python for循環經過序列索引迭代 https://www.cnblogs.com/hany-postq473111315/p/12268174.html Python 循環嵌套 https://www.cnblogs.com/hany-postq473111315/p/12268209.html Python break語句 https://www.cnblogs.com/hany-postq473111315/p/12268271.html Python continue語句 https://www.cnblogs.com/hany-postq473111315/p/12268306.html Python pass語句 https://www.cnblogs.com/hany-postq473111315/p/12268565.html Python 數字類型轉換 https://www.cnblogs.com/hany-postq473111315/p/12268709.html Python數學函數 https://www.cnblogs.com/hany-postq473111315/p/12268737.html Python隨機數函數 https://www.cnblogs.com/hany-postq473111315/p/12269533.html Python三角函數 https://www.cnblogs.com/hany-postq473111315/p/12269772.html Python數學常量 https://www.cnblogs.com/hany-postq473111315/p/12269905.html Python建立字符串 https://www.cnblogs.com/hany-postq473111315/p/12275504.html Python訪問字符串中的值 https://www.cnblogs.com/hany-postq473111315/p/12275513.html Python字符串更新 https://www.cnblogs.com/hany-postq473111315/p/12275591.html Python轉義字符 https://www.cnblogs.com/hany-postq473111315/p/12275690.html Python字符串運算符 https://www.cnblogs.com/hany-postq473111315/p/12275891.html Python字符串格式化 https://www.cnblogs.com/hany-postq473111315/p/12276031.html Python三引號 https://www.cnblogs.com/hany-postq473111315/p/12283515.html Python Unicode字符串 https://www.cnblogs.com/hany-postq473111315/p/12283589.html Python字符串內建函數_上 https://www.cnblogs.com/hany-postq473111315/p/12284044.html Python字符串內建函數_下 https://www.cnblogs.com/hany-postq473111315/p/12284555.html Python列表 https://www.cnblogs.com/hany-postq473111315/p/12286383.html Python訪問列表中的值 https://www.cnblogs.com/hany-postq473111315/p/12286568.html Python更新列表 https://www.cnblogs.com/hany-postq473111315/p/12286660.html Python刪除列表元素 https://www.cnblogs.com/hany-postq473111315/p/12286710.html Python列表腳本操做符 https://www.cnblogs.com/hany-postq473111315/p/12286755.html Python列表截取 
https://www.cnblogs.com/hany-postq473111315/p/12286809.html Python列表函數和方法 https://www.cnblogs.com/hany-postq473111315/p/12286964.html Python元組 https://www.cnblogs.com/hany-postq473111315/p/12287445.html Python訪問元組 https://www.cnblogs.com/hany-postq473111315/p/12287609.html Python修改元組 https://www.cnblogs.com/hany-postq473111315/p/12287673.html Python刪除元組 https://www.cnblogs.com/hany-postq473111315/p/12287811.html Python元組運算符 https://www.cnblogs.com/hany-postq473111315/p/12287938.html Python元組索引、截取 https://www.cnblogs.com/hany-postq473111315/p/12287975.html Python元組內置函數 https://www.cnblogs.com/hany-postq473111315/p/12288016.html Python字典 https://www.cnblogs.com/hany-postq473111315/p/12288198.html Python訪問、修改、刪除字典中的值 https://www.cnblogs.com/hany-postq473111315/p/12288231.html Python字典內置函數和方法 https://www.cnblogs.com/hany-postq473111315/p/12288354.html Python日期和時間_什麼是Tick_什麼是時間元組_獲取當前時間 https://www.cnblogs.com/hany-postq473111315/p/12289960.html Python獲取當前時間_獲取格式化時間_格式化日期 https://www.cnblogs.com/hany-postq473111315/p/12290540.html Python Time模塊 https://www.cnblogs.com/hany-postq473111315/p/12290622.html Python日曆模塊 https://www.cnblogs.com/hany-postq473111315/p/12290929.html Python定義一個函數 https://www.cnblogs.com/hany-postq473111315/p/12293310.html Python函數調用 https://www.cnblogs.com/hany-postq473111315/p/12294753.html Python按值傳遞參數和按引用傳遞參數 https://www.cnblogs.com/hany-postq473111315/p/12294915.html Python函數參數 https://www.cnblogs.com/hany-postq473111315/p/12295062.html Python匿名函數_return語句 https://www.cnblogs.com/hany-postq473111315/p/12295925.html Python變量做用域 https://www.cnblogs.com/hany-postq473111315/p/12296130.html Python全局變量和局部變量 https://www.cnblogs.com/hany-postq473111315/p/12298147.html Python模塊_import語句_from...import 函數名_from ... 
import * https://www.cnblogs.com/hany-postq473111315/p/12298371.html Python定位模塊_PYTHONPATH變量 https://www.cnblogs.com/hany-postq473111315/p/12299402.html Python命名空間和做用域 https://www.cnblogs.com/hany-postq473111315/p/12299536.html Python dir( )函數 https://www.cnblogs.com/hany-postq473111315/p/12299640.html Python globals和locals函數_reload函數 https://www.cnblogs.com/hany-postq473111315/p/12299945.html Python包 https://www.cnblogs.com/hany-postq473111315/p/12302885.html Python打印到屏幕_讀取鍵盤輸入 https://www.cnblogs.com/hany-postq473111315/p/12303156.html Python打開和關閉文件 https://www.cnblogs.com/hany-postq473111315/p/12303245.html Python read和write方法 https://www.cnblogs.com/hany-postq473111315/p/12303304.html PythonFile對象的屬性 https://www.cnblogs.com/hany-postq473111315/p/12303309.html Python裏的目錄方法 https://www.cnblogs.com/hany-postq473111315/p/12303318.html Python重命名和刪除文件 https://www.cnblogs.com/hany-postq473111315/p/12303314.html 猜數字遊戲 https://www.cnblogs.com/hany-postq473111315/p/12651677.html 實現功能菜單欄 https://www.cnblogs.com/hany-postq473111315/p/12652887.html pass 出錯問題 https://www.cnblogs.com/hany-postq473111315/p/12652896.html Python 實現分層聚類算法 https://www.cnblogs.com/hany-postq473111315/p/12671890.html KNN算法基本原理與sklearn實現 https://www.cnblogs.com/hany-postq473111315/p/12672247.html 矩陣經常使用操做 https://www.cnblogs.com/hany-postq473111315/p/12672549.html 不一樣複製操做對比（三種） https://www.cnblogs.com/hany-postq473111315/p/12672623.html 排序和索引的問題 https://www.cnblogs.com/hany-postq473111315/p/12672695.html 正則表達式補充 https://www.cnblogs.com/hany-postq473111315/p/12675139.html Pandas 複習 https://www.cnblogs.com/hany-postq473111315/p/12675657.html Pandas 複習2 https://www.cnblogs.com/hany-postq473111315/p/12677896.html Series結構(經常使用) https://www.cnblogs.com/hany-postq473111315/p/12678080.html 部分畫圖 https://www.cnblogs.com/hany-postq473111315/p/12679161.html 提取txt文本有效內容 https://www.cnblogs.com/hany-postq473111315/p/12680021.html 獲取所有 txt 文本中出現次數最多的前N個詞彙 https://www.cnblogs.com/hany-postq473111315/p/12680560.html 
使用樸素貝葉斯模型對郵件進行分類 https://www.cnblogs.com/hany-postq473111315/p/12680882.html 函數進階1 https://www.cnblogs.com/hany-postq473111315/p/12681153.html 函數進階2 https://www.cnblogs.com/hany-postq473111315/p/12681487.html Python異常及異常處理 https://www.cnblogs.com/hany-postq473111315/p/12303328.html Python 中 False 和 True 關鍵字 https://www.cnblogs.com/hany-postq473111315/p/12257604.html 正則表達式補充2 https://www.cnblogs.com/hany-postq473111315/p/12681687.html 正則表達式基礎1 https://www.cnblogs.com/hany-postq473111315/p/12682011.html 兩數相加(B站看視頻總結) https://www.cnblogs.com/hany-postq473111315/p/12682862.html 函數進階3 https://www.cnblogs.com/hany-postq473111315/p/12683682.html 函數進階4 https://www.cnblogs.com/hany-postq473111315/p/12683729.html 列表推導式 https://www.cnblogs.com/hany-postq473111315/p/12683951.html 關於類和異常的筆記 https://www.cnblogs.com/hany-postq473111315/p/12683969.html 類 鞏固小結 https://www.cnblogs.com/hany-postq473111315/p/12684207.html 模塊小結 https://www.cnblogs.com/hany-postq473111315/p/12684404.html 異常 鞏固1 https://www.cnblogs.com/hany-postq473111315/p/12684508.html 異常 鞏固2 https://www.cnblogs.com/hany-postq473111315/p/12684638.html logging日誌基礎示例 https://www.cnblogs.com/hany-postq473111315/p/12684676.html 異常 鞏固3 https://www.cnblogs.com/hany-postq473111315/p/12684747.html 多線程複習1 https://www.cnblogs.com/hany-postq473111315/p/12685980.html 多線程複習2 https://www.cnblogs.com/hany-postq473111315/p/12686171.html yield 複習 https://www.cnblogs.com/hany-postq473111315/p/12686330.html 協程解決素數 https://www.cnblogs.com/hany-postq473111315/p/12686467.html 爬蟲流程複習 https://www.cnblogs.com/hany-postq473111315/p/12686857.html 列表經常使用方法複習 https://www.cnblogs.com/hany-postq473111315/p/12691570.html 字典經常使用操做複習 https://www.cnblogs.com/hany-postq473111315/p/12691575.html 單鏈表複習 https://www.cnblogs.com/hany-postq473111315/p/12695182.html 雙鏈表複習 https://www.cnblogs.com/hany-postq473111315/p/12697287.html 單向循環鏈表 https://www.cnblogs.com/hany-postq473111315/p/12700855.html 將"089,0760,009"變爲 89,760,9 
https://www.cnblogs.com/hany-postq473111315/p/12701254.html Linux最經常使用的基本操做複習 https://www.cnblogs.com/hany-postq473111315/p/12702144.html python鏈接數據庫 https://www.cnblogs.com/hany-postq473111315/p/12704539.html 複習實現棧的基本操做 https://www.cnblogs.com/hany-postq473111315/p/12705117.html 顯示列表重複值 https://www.cnblogs.com/hany-postq473111315/p/12705591.html 複習 隊列的實現 https://www.cnblogs.com/hany-postq473111315/p/12706952.html 擴展 雙端隊列 https://www.cnblogs.com/hany-postq473111315/p/12707455.html 複習 二叉樹的建立 https://www.cnblogs.com/hany-postq473111315/p/12723432.html 複習 廣度遍歷 https://www.cnblogs.com/hany-postq473111315/p/12725087.html 複習 深度遍歷(先序中序後序) https://www.cnblogs.com/hany-postq473111315/p/12725386.html 應用場景 https://www.cnblogs.com/hany-postq473111315/p/12725748.html 類 擴展 https://www.cnblogs.com/hany-postq473111315/p/12727814.html 複習 裝飾器 https://www.cnblogs.com/hany-postq473111315/p/12731157.html 擴展 求奇數和 https://www.cnblogs.com/hany-postq473111315/p/12732794.html 字符串重複出現 https://www.cnblogs.com/hany-postq473111315/p/12732912.html 最基本的Tkinter界面操做 https://www.cnblogs.com/hany-postq473111315/p/12736499.html Tkinter經常使用簡單操做 https://www.cnblogs.com/hany-postq473111315/p/12736695.html Tkinter經典寫法 https://www.cnblogs.com/hany-postq473111315/p/12736898.html Label 組件基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12739187.html 類實例化的對象調用的方法或屬性來自於類的哪一個方法中 https://www.cnblogs.com/hany-postq473111315/p/12739493.html Button基本用語 https://www.cnblogs.com/hany-postq473111315/p/12739711.html Entry基本用法 https://www.cnblogs.com/hany-postq473111315/p/12739996.html Text多行文本框基本用法 https://www.cnblogs.com/hany-postq473111315/p/12741482.html Radiobutton基礎語法 https://www.cnblogs.com/hany-postq473111315/p/12742418.html Checkbutton基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12742886.html 擴展寫法 https://www.cnblogs.com/hany-postq473111315/p/12748369.html 整理上課內容 https://www.cnblogs.com/hany-postq473111315/p/12758012.html Seaborn基礎1 https://www.cnblogs.com/hany-postq473111315/p/12765835.html Seaborn基礎2 
https://www.cnblogs.com/hany-postq473111315/p/12765847.html Seaborn基礎3 https://www.cnblogs.com/hany-postq473111315/p/12765857.html Seaborn實現單變量分析 https://www.cnblogs.com/hany-postq473111315/p/12766032.html Seaborn實現迴歸分析 https://www.cnblogs.com/hany-postq473111315/p/12766396.html Seaborn實現多變量分析 https://www.cnblogs.com/hany-postq473111315/p/12766576.html 將形如 5D, 30s 的字符串轉爲秒 https://www.cnblogs.com/hany-postq473111315/p/12770867.html 得到昨天和明天的日期 https://www.cnblogs.com/hany-postq473111315/p/12770871.html 計算兩個日期相隔的秒數 https://www.cnblogs.com/hany-postq473111315/p/12770888.html 遍歷多個 txt 文件進行獲取值 https://www.cnblogs.com/hany-postq473111315/p/12771248.html Django學習路 https://www.cnblogs.com/hany-postq473111315/p/12774607.html Django學習路2 https://www.cnblogs.com/hany-postq473111315/p/12775279.html Django學習路3 https://www.cnblogs.com/hany-postq473111315/p/12778795.html 安裝第三方庫 https://www.cnblogs.com/hany-postq473111315/p/12785024.html 安裝第三方庫進階 https://www.cnblogs.com/hany-postq473111315/p/12785212.html Python第一次實驗 https://www.cnblogs.com/hany-postq473111315/p/12787851.html 打包程序問題 https://www.cnblogs.com/hany-postq473111315/p/12788073.html pip 國內源 https://www.cnblogs.com/hany-postq473111315/p/12802294.html format 進階 https://www.cnblogs.com/hany-postq473111315/p/12806654.html 進階刪除重複元素 https://www.cnblogs.com/hany-postq473111315/p/12819466.html 實現優先級隊列 https://www.cnblogs.com/hany-postq473111315/p/12819475.html 爬蟲流程複習2 https://www.cnblogs.com/hany-postq473111315/p/12819589.html numpy第三方庫 https://www.cnblogs.com/hany-postq473111315/p/12821506.html pandas第三方庫 https://www.cnblogs.com/hany-postq473111315/p/12821512.html 函數式迴文串 https://www.cnblogs.com/hany-postq473111315/p/12821617.html 判斷是否包含重複值 https://www.cnblogs.com/hany-postq473111315/p/12821629.html 函數實現 多個數據求平均值 https://www.cnblogs.com/hany-postq473111315/p/12821637.html 對傳入的數據進行分類 https://www.cnblogs.com/hany-postq473111315/p/12821665.html 進階 對傳入的數據進行分類 https://www.cnblogs.com/hany-postq473111315/p/12821676.html 二進制字符長度 
https://www.cnblogs.com/hany-postq473111315/p/12821692.html 將包含_或-的字符串最開始的字母小寫,其他的第一個字母大寫 https://www.cnblogs.com/hany-postq473111315/p/12821750.html 將字符串的首字母大寫其他字符根據須要,判斷是否大寫 https://www.cnblogs.com/hany-postq473111315/p/12821780.html 將每個分隔開的字符的首字母大寫 https://www.cnblogs.com/hany-postq473111315/p/12821790.html 不管傳入什麼數據都轉換爲列表 https://www.cnblogs.com/hany-postq473111315/p/12821818.html 斐波那契數列進一步討論性能 https://www.cnblogs.com/hany-postq473111315/p/12827717.html 迭代器和可迭代對象區別 https://www.cnblogs.com/hany-postq473111315/p/12827742.html map 函數基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12827787.html filter 函數基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12827853.html functools 中的 reduce 函數基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12827943.html 三元運算符 https://www.cnblogs.com/hany-postq473111315/p/12827981.html 學裝飾器以前必需要了解的四點 https://www.cnblogs.com/hany-postq473111315/p/12828151.html 列表推導式,最基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12828176.html 字典推導式,最基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12828267.html 集合推導式,最基本寫法 https://www.cnblogs.com/hany-postq473111315/p/12828283.html 狀態碼 https://www.cnblogs.com/hany-postq473111315/p/12835104.html 爬蟲流程複習3 https://www.cnblogs.com/hany-postq473111315/p/12836761.html Django坑_01 https://www.cnblogs.com/hany-postq473111315/p/12837316.html list 和 [ ] 的功能不相同 https://www.cnblogs.com/hany-postq473111315/p/12838134.html Django建立簡單數據庫 https://www.cnblogs.com/hany-postq473111315/p/12840986.html 數據庫設計基礎知識 https://www.cnblogs.com/hany-postq473111315/p/12841422.html Django學習路4_數據庫添加元素,讀取及顯示到網頁上 https://www.cnblogs.com/hany-postq473111315/p/12841899.html [] 和 () 的區別 https://www.cnblogs.com/hany-postq473111315/p/12842366.html 使用 you-get 下載免費電影或電視劇 https://www.cnblogs.com/hany-postq473111315/p/12843854.html Django學習路5_更新和刪除數據庫表中元素 https://www.cnblogs.com/hany-postq473111315/p/12844479.html Django學習路6_修改數據庫爲 mysql ,建立mysql及進行遷徙 https://www.cnblogs.com/hany-postq473111315/p/12844651.html python 鏈接 mysql 的三種驅動 
https://www.cnblogs.com/hany-postq473111315/p/12844667.html pandas_DateFrame的建立 https://www.cnblogs.com/hany-postq473111315/p/12844777.html pandas_一維數組與經常使用操做 https://www.cnblogs.com/hany-postq473111315/p/12844790.html pandas_使用屬性接口實現高級功能 https://www.cnblogs.com/hany-postq473111315/p/12844800.html pandas_使用透視表與交叉表查看業績彙總數據 https://www.cnblogs.com/hany-postq473111315/p/12844818.html pandas_分類與聚合 https://www.cnblogs.com/hany-postq473111315/p/12844831.html pandas_處理異常值缺失值重複值數據差分 https://www.cnblogs.com/hany-postq473111315/p/12844851.html pandas_數據拆分與合併 https://www.cnblogs.com/hany-postq473111315/p/12844857.html pandas_數據排序 https://www.cnblogs.com/hany-postq473111315/p/12844866.html pandas_時間序列和經常使用操做 https://www.cnblogs.com/hany-postq473111315/p/12844876.html pandas_學習的時候總會忘了的知識點 https://www.cnblogs.com/hany-postq473111315/p/12844887.html pandas_查看數據特徵和統計信息 https://www.cnblogs.com/hany-postq473111315/p/12844895.html pandas_讀取Excel並篩選特定數據 https://www.cnblogs.com/hany-postq473111315/p/12844906.html pandas_重採樣多索引標準差協方差 https://www.cnblogs.com/hany-postq473111315/p/12844921.html Numpy random函數 https://www.cnblogs.com/hany-postq473111315/p/12844924.html Numpy修改數組中的元素值 https://www.cnblogs.com/hany-postq473111315/p/12844928.html Numpy建立數組 https://www.cnblogs.com/hany-postq473111315/p/12844937.html Numpy改變數組的形狀 https://www.cnblogs.com/hany-postq473111315/p/12844944.html Numpy數組排序 https://www.cnblogs.com/hany-postq473111315/p/12844954.html Numpy數組的函數 https://www.cnblogs.com/hany-postq473111315/p/12844962.html Numpy數組的運算 https://www.cnblogs.com/hany-postq473111315/p/12844978.html Numpy訪問數組元素 https://www.cnblogs.com/hany-postq473111315/p/12845013.html 關於這幾天發佈的文章 https://www.cnblogs.com/hany-postq473111315/p/12845031.html TCP 客戶端 https://www.cnblogs.com/hany-postq473111315/p/12845038.html TCP 服務器端 https://www.cnblogs.com/hany-postq473111315/p/12845044.html UDP 綁定信息 https://www.cnblogs.com/hany-postq473111315/p/12845055.html UDP 網絡程序-發送_接收數據 
https://www.cnblogs.com/hany-postq473111315/p/12845060.html WSGI應用程序示例 https://www.cnblogs.com/hany-postq473111315/p/12845068.html 定義 WSGI 接口 https://www.cnblogs.com/hany-postq473111315/p/12845080.html encode 和 decode 的使用 https://www.cnblogs.com/hany-postq473111315/p/12845094.html abs,all,any函數的使用 https://www.cnblogs.com/hany-postq473111315/p/12845118.html 一行代碼合併兩個字典 https://www.cnblogs.com/hany-postq473111315/p/12845131.html 一行代碼求多個列表中的最大值 https://www.cnblogs.com/hany-postq473111315/p/12845153.html 文件的某些操做(之前發過相似的) https://www.cnblogs.com/hany-postq473111315/p/12845193.html 冒泡排序 https://www.cnblogs.com/hany-postq473111315/p/12845219.html 希爾排序 https://www.cnblogs.com/hany-postq473111315/p/12845247.html 歸併排序 https://www.cnblogs.com/hany-postq473111315/p/12845260.html 快速排序 https://www.cnblogs.com/hany-postq473111315/p/12845271.html 插入排序 https://www.cnblogs.com/hany-postq473111315/p/12845283.html 選擇排序 https://www.cnblogs.com/hany-postq473111315/p/12845306.html 數據庫基礎應用 https://www.cnblogs.com/hany-postq473111315/p/12845310.html 數據庫進行參數化,查詢一行或多行語句 https://www.cnblogs.com/hany-postq473111315/p/12845325.html 數據結構_鏈表(單鏈表,單向循環鏈表,雙鏈表) https://www.cnblogs.com/hany-postq473111315/p/12845339.html 數據結構_隊列(普通隊列和雙端隊列) https://www.cnblogs.com/hany-postq473111315/p/12845346.html 數據結構_棧 https://www.cnblogs.com/hany-postq473111315/p/12845356.html 數據結構_二叉樹 https://www.cnblogs.com/hany-postq473111315/p/12845365.html 二分法查找 https://www.cnblogs.com/hany-postq473111315/p/12845370.html 正則表達式_合集上 https://www.cnblogs.com/hany-postq473111315/p/12845456.html 正則表達式_合集下(後續還會有補充) https://www.cnblogs.com/hany-postq473111315/p/12845485.html 線程_apply堵塞式 https://www.cnblogs.com/hany-postq473111315/p/12845503.html 線程_FIFO隊列實現生產者消費者 https://www.cnblogs.com/hany-postq473111315/p/12845511.html 線程_GIL最簡單的例子 https://www.cnblogs.com/hany-postq473111315/p/12845522.html 線程_multiprocessing實現文件夾copy器 https://www.cnblogs.com/hany-postq473111315/p/12845531.html 線程_multiprocessing異步 
https://www.cnblogs.com/hany-postq473111315/p/12845536.html 線程_Process實例 https://www.cnblogs.com/hany-postq473111315/p/12845545.html 線程_Process基礎語法 https://www.cnblogs.com/hany-postq473111315/p/12845549.html 線程_ThreadLocal https://www.cnblogs.com/hany-postq473111315/p/12845560.html 線程_互斥鎖_Lock及fork建立子進程 https://www.cnblogs.com/hany-postq473111315/p/12845573.html 線程_gevent實現多個視頻下載及併發下載 https://www.cnblogs.com/hany-postq473111315/p/12845587.html 線程_gevent自動切換CPU協程 https://www.cnblogs.com/hany-postq473111315/p/12845603.html 線程_使用multiprocessing啓動一個子進程及建立Process 的子類 https://www.cnblogs.com/hany-postq473111315/p/12845615.html 線程_共享全局變量(全局變量在主線程和子線程中不一樣) https://www.cnblogs.com/hany-postq473111315/p/12845624.html 線程_多線程_列表當作實參傳遞到線程中 https://www.cnblogs.com/hany-postq473111315/p/12845630.html 線程_threading合集 https://www.cnblogs.com/hany-postq473111315/p/12845649.html 線程_進程間通訊Queue合集 https://www.cnblogs.com/hany-postq473111315/p/12845662.html 線程_進程池 https://www.cnblogs.com/hany-postq473111315/p/12845673.html 線程_可能發生的問題 https://www.cnblogs.com/hany-postq473111315/p/12845683.html == 和 is 的區別 https://www.cnblogs.com/hany-postq473111315/p/12845720.html __getattribute__小例子 https://www.cnblogs.com/hany-postq473111315/p/12846865.html __new__方法理解 https://www.cnblogs.com/hany-postq473111315/p/12846871.html ctime使用及datetime簡單使用 https://www.cnblogs.com/hany-postq473111315/p/12846876.html functools函數中的partial函數及wraps函數 https://www.cnblogs.com/hany-postq473111315/p/12846888.html gc 模塊經常使用函數 https://www.cnblogs.com/hany-postq473111315/p/12846892.html hashlib加密算法 https://www.cnblogs.com/hany-postq473111315/p/12846896.html __slots__屬性 https://www.cnblogs.com/hany-postq473111315/p/12846904.html isinstance方法判斷可迭代和迭代器 https://www.cnblogs.com/hany-postq473111315/p/12846908.html metaclass 攔截類的建立,並返回 https://www.cnblogs.com/hany-postq473111315/p/12846911.html timeit_list操做測試 https://www.cnblogs.com/hany-postq473111315/p/12846913.html nonlocal 訪問變量 
https://www.cnblogs.com/hany-postq473111315/p/12846915.html pdb 進行調試 https://www.cnblogs.com/hany-postq473111315/p/12846921.html 使用property取代getter和setter方法 https://www.cnblogs.com/hany-postq473111315/p/12846925.html 使用types庫修改函數 https://www.cnblogs.com/hany-postq473111315/p/12846932.html type 建立類,賦予類\靜態方法等 https://www.cnblogs.com/hany-postq473111315/p/12846940.html 迭代器實現斐波那契數列 https://www.cnblogs.com/hany-postq473111315/p/12846943.html 建立生成器 https://www.cnblogs.com/hany-postq473111315/p/12846945.html 動態給類的實例對象 或 類 添加屬性 https://www.cnblogs.com/hany-postq473111315/p/12846946.html 線程_同步應用 https://www.cnblogs.com/hany-postq473111315/p/12846947.html 垃圾回收機制_合集 https://www.cnblogs.com/hany-postq473111315/p/12846950.html 協程的簡單實現 https://www.cnblogs.com/hany-postq473111315/p/12846953.html 實現了__iter__和__next__的對象是迭代器 https://www.cnblogs.com/hany-postq473111315/p/12846956.html 對類中私有化的理解 https://www.cnblogs.com/hany-postq473111315/p/12846958.html 拷貝的一些生成式 https://www.cnblogs.com/hany-postq473111315/p/12846961.html 查看 __class__屬性 https://www.cnblogs.com/hany-postq473111315/p/12846966.html 運行過程當中給類添加方法 types.MethodType https://www.cnblogs.com/hany-postq473111315/p/12846970.html 查看對象的引用計數及計數加一 https://www.cnblogs.com/hany-postq473111315/p/12846975.html 淺拷貝和深拷貝 https://www.cnblogs.com/hany-postq473111315/p/12846978.html 點format方式輸出星號字典的值是鍵 https://www.cnblogs.com/hany-postq473111315/p/12846982.html 類能夠打印，賦值，做爲實參和實例化 https://www.cnblogs.com/hany-postq473111315/p/12846988.html 類能夠在函數中建立，做爲返回值(返回類) https://www.cnblogs.com/hany-postq473111315/p/12846992.html 查看某一個字符出現的次數 https://www.cnblogs.com/hany-postq473111315/p/12847012.html 閉包函數 https://www.cnblogs.com/hany-postq473111315/p/12847006.html 自定義建立元類 https://www.cnblogs.com/hany-postq473111315/p/12846996.html 迪傑斯特拉算法(網上找的) https://www.cnblogs.com/hany-postq473111315/p/12847011.html 裝飾器_上 https://www.cnblogs.com/hany-postq473111315/p/12846997.html 裝飾器_下 https://www.cnblogs.com/hany-postq473111315/p/12847002.html 
Django學習路7_註冊app到可以在頁面上顯示app網頁內容 https://www.cnblogs.com/hany-postq473111315/p/12849466.html Django學習路8_學生表和班級表級聯並相互查詢信息 https://www.cnblogs.com/hany-postq473111315/p/12852064.html Django學習路9_流程複習 https://www.cnblogs.com/hany-postq473111315/p/12856419.html 設計模式_理解單例設計模式 https://www.cnblogs.com/hany-postq473111315/p/12856511.html 設計模式_單例模式的懶漢式實例化 https://www.cnblogs.com/hany-postq473111315/p/12856586.html Django學習路10_建立一個新的數據庫,指定列名並修改表名 https://www.cnblogs.com/hany-postq473111315/p/12857363.html Django學習路11_向數據庫中添加 和 獲取指定條件數據 https://www.cnblogs.com/hany-postq473111315/p/12859172.html Django學習路12_objects 方法(all,filter,exclude,order by,values) https://www.cnblogs.com/hany-postq473111315/p/12859187.html 數據庫實驗內容,不包括對錶的增刪改查 https://www.cnblogs.com/hany-postq473111315/p/12862517.html Django學習路13_建立用戶登陸,判斷數據庫中帳號名密碼是否正確 https://www.cnblogs.com/hany-postq473111315/p/12864746.html Django學習路14_獲取數據庫中用戶名字並展現,獲取指定條數 https://www.cnblogs.com/hany-postq473111315/p/12867317.html 在類外建立函數,而後使用類的實例化對象進行調用 https://www.cnblogs.com/hany-postq473111315/p/12867420.html Django坑_02 https://www.cnblogs.com/hany-postq473111315/p/12868042.html Django學習路15_建立一個訂單信息,並查詢2020年\9月的信息都有哪些 https://www.cnblogs.com/hany-postq473111315/p/12868272.html Django學習路16_獲取學生所在的班級名 https://www.cnblogs.com/hany-postq473111315/p/12868775.html Django學習路17_聚合函數(Avg平均值,Count數量,Max最大,Min最小,Sum求和)基本使用 https://www.cnblogs.com/hany-postq473111315/p/12868786.html Django學習路18_F對象和Q對象 https://www.cnblogs.com/hany-postq473111315/p/12870354.html PageRank算法 https://www.cnblogs.com/hany-postq473111315/p/12871080.html Python實現數據結構 圖 https://www.cnblogs.com/hany-postq473111315/p/12871103.html Django學習路19_is_delete屬性,重寫類方法,顯性隱性屬性 https://www.cnblogs.com/hany-postq473111315/p/12881137.html Django學習路20_流程複習 https://www.cnblogs.com/hany-postq473111315/p/12881490.html Django學習路21_views函數中定義字典及html中使用類實例對象的屬性及方法 https://www.cnblogs.com/hany-postq473111315/p/12882201.html Django學習路22_empty爲空,forloop.counter 從1計數,.counter0 從0計數 
.revcounter最後末尾數字是1,.revcounter0 倒序,末尾爲 0 https://www.cnblogs.com/hany-postq473111315/p/12887676.html Django學習路23_if else 語句,if elif else 語句 forloop.first第一個元素 .last最後一個元素,註釋 https://www.cnblogs.com/hany-postq473111315/p/12887789.html Django學習路24_乘法和除法 https://www.cnblogs.com/hany-postq473111315/p/12890956.html Django學習路25_ifequal 和 ifnotequal 判斷數值是否相等及加減法 {{數值|add 數值}} https://www.cnblogs.com/hany-postq473111315/p/12891104.html Django學習路26_轉換字符串大小寫 upper,lower https://www.cnblogs.com/hany-postq473111315/p/12891253.html Django學習路27_HTML轉義 https://www.cnblogs.com/hany-postq473111315/p/12898638.html Django學習路28_ .html 文件繼承及<block 標籤>,include 'xxx.html' https://www.cnblogs.com/hany-postq473111315/p/12898808.html Django學習路29_css樣式渲染 h3 標籤 https://www.cnblogs.com/hany-postq473111315/p/12899854.html Django學習路30_view中存在重複名時,取第一個知足條件的 https://www.cnblogs.com/hany-postq473111315/p/12899918.html Django學習路31_使用 locals 簡化 context 寫法,點擊班級顯示該班學生信息 https://www.cnblogs.com/hany-postq473111315/p/12909241.html 分解質因數 https://www.cnblogs.com/hany-postq473111315/p/12910220.html 計算皮球下落速度 https://www.cnblogs.com/hany-postq473111315/p/12910268.html 給定年月日,判斷是這一年的第幾天 https://www.cnblogs.com/hany-postq473111315/p/12910344.html Django學習路32_建立管理員及內容補充+前面內容複習 https://www.cnblogs.com/hany-postq473111315/p/12911605.html Django學習路33_url 地址及刪除元素 delete() 和重定向 return redirect('路徑') https://www.cnblogs.com/hany-postq473111315/p/12917747.html Django學習路34_models 文件建立數據表 https://www.cnblogs.com/hany-postq473111315/p/12920060.html Django學習路35_視圖使用方法(複製的代碼) + 簡單總結 https://www.cnblogs.com/hany-postq473111315/p/12922362.html 實驗1-5 https://www.cnblogs.com/hany-postq473111315/p/12932750.html 學生成績表數據包括:學號,姓名,高數,英語和計算機三門課成績,計算每一個學生總分,每課程平均分,最高分和最低分 https://www.cnblogs.com/hany-postq473111315/p/12939916.html 四位玫瑰數 https://www.cnblogs.com/hany-postq473111315/p/12940081.html 四平方和 https://www.cnblogs.com/hany-postq473111315/p/12940256.html 學生管理系統-明日學院的 https://www.cnblogs.com/hany-postq473111315/p/12941008.html 
定義函數，給定一個列表做爲函數參數，將列表中的非數字字符去除 https://www.cnblogs.com/hany-postq473111315/p/12943097.html 給定幾位數，查看數根(使用函數實現) https://www.cnblogs.com/hany-postq473111315/p/12943740.html 水果系統(面向過程,面向對象) https://www.cnblogs.com/hany-postq473111315/p/12950313.html matplotlib基礎彙總_01 https://www.cnblogs.com/hany-postq473111315/p/12950862.html matplotlib基礎彙總_02 https://www.cnblogs.com/hany-postq473111315/p/12950897.html matplotlib基礎彙總_03 https://www.cnblogs.com/hany-postq473111315/p/12950922.html matplotlib基礎彙總_04 https://www.cnblogs.com/hany-postq473111315/p/12951001.html 根據列表的值來顯示每個元素出現的次數 https://www.cnblogs.com/hany-postq473111315/p/12951610.html 鑽石和玻璃球遊戲(鑽石位置固定) https://www.cnblogs.com/hany-postq473111315/p/12952401.html 小人推心圖(網上代碼) https://www.cnblogs.com/hany-postq473111315/p/12952806.html 0525習題 https://www.cnblogs.com/hany-postq473111315/p/12957586.html jieba嚐鮮 https://www.cnblogs.com/hany-postq473111315/p/12957868.html inf https://www.cnblogs.com/hany-postq473111315/p/12968590.html 讀取文件進行繪圖 https://www.cnblogs.com/hany-postq473111315/p/12968611.html 雙約束重力模型 https://www.cnblogs.com/hany-postq473111315/p/12968621.html 未解決問題01 https://www.cnblogs.com/hany-postq473111315/p/12971205.html Sqlite3 實現學生信息增刪改查 https://www.cnblogs.com/hany-postq473111315/p/12953587.html 未解決問題02 https://www.cnblogs.com/hany-postq473111315/p/12971235.html 簡單繪圖 https://www.cnblogs.com/hany-postq473111315/p/12971256.html 鑽石和玻璃球遊戲(鑽石位置不固定) https://www.cnblogs.com/hany-postq473111315/p/12971307.html 使用正則匹配數字 https://www.cnblogs.com/hany-postq473111315/p/12971958.html 打包文件一閃而過及導入文件措施 https://www.cnblogs.com/hany-postq473111315/p/12976957.html 文件簡單操做 https://www.cnblogs.com/hany-postq473111315/p/12978701.html 0528習題 1-5 https://www.cnblogs.com/hany-postq473111315/p/12978717.html Django學習路36_函數參數 反向解析 修改404 頁面 https://www.cnblogs.com/hany-postq473111315/p/12980002.html 關於某一爬蟲實例的總結 https://www.cnblogs.com/hany-postq473111315/p/12980219.html 小技巧_01 https://www.cnblogs.com/hany-postq473111315/p/12983898.html 安裝 
kreas 2.2.4 版本問題 https://www.cnblogs.com/hany-postq473111315/p/12984543.html 是否感染病毒 https://www.cnblogs.com/hany-postq473111315/p/12985989.html python文件操做 https://www.cnblogs.com/hany-postq473111315/p/12986729.html pandas 幾個重要知識點 https://www.cnblogs.com/hany-postq473111315/p/12989028.html 一千美圓的故事(錢放入信封中) https://www.cnblogs.com/hany-postq473111315/p/12989353.html 給定兩個列表,轉換爲 DataFrame 類型 https://www.cnblogs.com/hany-postq473111315/p/12989923.html 經過文檔算學生的平均分 https://www.cnblogs.com/hany-postq473111315/p/12990789.html 0528習題 11-15 https://www.cnblogs.com/hany-postq473111315/p/12978739.html 0528習題 6-10 https://www.cnblogs.com/hany-postq473111315/p/12978731.html 0528習題 16-20 https://www.cnblogs.com/hany-postq473111315/p/12978750.html 0528習題 21-25 https://www.cnblogs.com/hany-postq473111315/p/12978764.html 0528習題 26-31 https://www.cnblogs.com/hany-postq473111315/p/12978781.html python 安裝 0x000007b錯誤解決及VC++ 安裝第三方庫報紅 https://www.cnblogs.com/hany-postq473111315/p/12993454.html 讀取 csv , xlsx 表格並添加總分列 https://www.cnblogs.com/hany-postq473111315/p/13020328.html matplotlib 顯示中文問題 https://www.cnblogs.com/hany-postq473111315/p/13021265.html gbk codec can't encode character  https://www.cnblogs.com/hany-postq473111315/p/13024069.html cmd 安裝第三方庫問題 https://www.cnblogs.com/hany-postq473111315/p/13030558.html 十進制轉換 https://www.cnblogs.com/hany-postq473111315/p/13030835.html 正則表達式鞏固_從別的資料上弄下來的 https://www.cnblogs.com/hany-postq473111315/p/13030596.html pandas鞏固 https://www.cnblogs.com/hany-postq473111315/p/13034160.html numpy鞏固 https://www.cnblogs.com/hany-postq473111315/p/13035346.html Django學習路37_request屬性 https://www.cnblogs.com/hany-postq473111315/p/12981943.html 菜鳥教程的 mysql-connector 基礎 https://www.cnblogs.com/hany-postq473111315/p/13037118.html 爬蟲流程(前面發過的文章的合集)鞏固 https://www.cnblogs.com/hany-postq473111315/p/13040839.html matplotlib示例 https://www.cnblogs.com/hany-postq473111315/p/13047992.html matplotlib顏色線條及繪製直線 https://www.cnblogs.com/hany-postq473111315/p/13049034.html 
matplotlib繪製子圖 https://www.cnblogs.com/hany-postq473111315/p/13054122.html 下載數據到csv中(亂碼),使用numpy , pandas讀取失敗 解決方案 https://www.cnblogs.com/hany-postq473111315/p/13054802.html 查看一個數全部的因子及因子的和 https://www.cnblogs.com/hany-postq473111315/p/13059879.html 輸入 1,2,4,5,78 返回 (1, 78, 2, 4, 5, 90) 返回形式:最小值 最大值 其他值 及 總和 https://www.cnblogs.com/hany-postq473111315/p/13059889.html 1000之內能被3或5整除但不能被10整除的數的個數爲 https://www.cnblogs.com/hany-postq473111315/p/13059900.html 輸入數字判斷是不是偶數,輸出兩個質數的和爲該偶數的值 https://www.cnblogs.com/hany-postq473111315/p/13059922.html 十進制轉換爲其餘進制(不使用format) https://www.cnblogs.com/hany-postq473111315/p/13059936.html 字典元組列表經常使用方法 https://www.cnblogs.com/hany-postq473111315/p/13061946.html 設置x 軸斜體(每次我都百度,此次單獨爲它發一個) https://www.cnblogs.com/hany-postq473111315/p/13062468.html 對字典進行排序 https://www.cnblogs.com/hany-postq473111315/p/13080091.html 終於,我仍是對本身的博客下手了 https://www.cnblogs.com/hany-postq473111315/p/13087385.html 字符串經常使用函數總結 https://www.cnblogs.com/hany-postq473111315/p/13112074.html 解決SyntaxError: Non-UTF-8 code starting with '\xbb'問題 https://www.cnblogs.com/hany-postq473111315/p/13113564.html 獲取列表中出現的值,並按降序進行排列 https://www.cnblogs.com/hany-postq473111315/p/13120968.html CSV文件指定頁腳 https://www.cnblogs.com/hany-postq473111315/p/13130528.html 重置spyder 解決 gbk 編碼不能讀取問題 https://www.cnblogs.com/hany-postq473111315/p/13159943.html 使用 eval(input()) 的便利 https://www.cnblogs.com/hany-postq473111315/p/13159954.html flask的第一次嘗試 https://www.cnblogs.com/hany-postq473111315/p/13161637.html 條件表達式 https://www.cnblogs.com/hany-postq473111315/p/13167125.html 安裝fiddler 谷歌插件 https://www.cnblogs.com/hany-postq473111315/p/13168359.html 繪圖小結 https://www.cnblogs.com/hany-postq473111315/p/13169418.html 今日成果:爬取百度貼吧 https://www.cnblogs.com/hany-postq473111315/p/13170170.html map,reduce,filter基礎實現 https://www.cnblogs.com/hany-postq473111315/p/13171509.html 數據分析小題 https://www.cnblogs.com/hany-postq473111315/p/13173809.html 求最大公約數最小公倍數及整除求餘數等 
https://www.cnblogs.com/hany-postq473111315/p/13189664.html matplotlib 去掉座標軸 https://www.cnblogs.com/hany-postq473111315/p/13193886.html 字符串的三個函數 https://www.cnblogs.com/hany-postq473111315/p/13213108.html 關於這學期的總結 https://www.cnblogs.com/hany-postq473111315/p/13228741.html 轉義字符 https://www.cnblogs.com/hany-postq473111315/p/13233072.html format格式 https://www.cnblogs.com/hany-postq473111315/p/13233095.html IPython magic命令 https://www.cnblogs.com/hany-postq473111315/p/13246648.html 關鍵字 https://www.cnblogs.com/hany-postq473111315/p/13246716.html 運算符 https://www.cnblogs.com/hany-postq473111315/p/13246732.html 經常使用類型轉換函數 https://www.cnblogs.com/hany-postq473111315/p/13246740.html 數學方法內置函數 https://www.cnblogs.com/hany-postq473111315/p/13246749.html 字符串經常使用函數 https://www.cnblogs.com/hany-postq473111315/p/13246759.html 經常使用內置函數 https://www.cnblogs.com/hany-postq473111315/p/13246765.html math庫經常使用函數 https://www.cnblogs.com/hany-postq473111315/p/13246775.html random隨機數函數 https://www.cnblogs.com/hany-postq473111315/p/13246781.html time模塊 https://www.cnblogs.com/hany-postq473111315/p/13246790.html datetime模塊 https://www.cnblogs.com/hany-postq473111315/p/13246801.html 列表字典集合經常使用函數 https://www.cnblogs.com/hany-postq473111315/p/13246827.html 選擇結構和循環結構 https://www.cnblogs.com/hany-postq473111315/p/13246839.html 函數調用 https://www.cnblogs.com/hany-postq473111315/p/13246864.html jieba.lcut方法 https://www.cnblogs.com/hany-postq473111315/p/13246933.html turtle庫經常使用函數 https://www.cnblogs.com/hany-postq473111315/p/13246941.html int轉換sys,argv參數問題 https://www.cnblogs.com/hany-postq473111315/p/13246955.html 文件基本用法 https://www.cnblogs.com/hany-postq473111315/p/13253872.html 讀/寫docx文件 https://www.cnblogs.com/hany-postq473111315/p/13253912.html 讀/寫xlsx文件 https://www.cnblogs.com/hany-postq473111315/p/13253937.html os模塊經常使用方法 https://www.cnblogs.com/hany-postq473111315/p/13253955.html pandas屬性和方法 https://www.cnblogs.com/hany-postq473111315/p/13254000.html numpy的random方法和經常使用數據類型 
https://www.cnblogs.com/hany-postq473111315/p/13254012.html matplotlib經常使用基礎知識 https://www.cnblogs.com/hany-postq473111315/p/13254039.html 學習python的幾個資料網站 https://www.cnblogs.com/hany-postq473111315/p/13265849.html 機器學習網址 https://www.cnblogs.com/hany-postq473111315/p/13278370.html 機器學習算法速查表 https://www.cnblogs.com/hany-postq473111315/p/13278528.html matpltlib 示例 https://www.cnblogs.com/hany-postq473111315/p/13278593.html statsmodels 示例 https://www.cnblogs.com/hany-postq473111315/p/13278607.html Biopython 第三方庫示例 https://www.cnblogs.com/hany-postq473111315/p/13278677.html 爬取三寸人間 https://www.cnblogs.com/hany-postq473111315/p/13306001.html 爬取圖蟲網 示例網址 https://wangxu.tuchong.com/23892889/ https://www.cnblogs.com/hany-postq473111315/p/13306056.html 爬蟲基礎鞏固 https://www.cnblogs.com/hany-postq473111315/p/13306114.html 老男孩Django筆記(非原創) https://www.cnblogs.com/hany-postq473111315/p/13339511.html

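import requests
from fake_useragent import UserAgent
from lxml import etree

# The HTTP header name is "User-Agent"; a key spelled "UserAgent" is sent as an unrelated header.
headers = {'User-Agent': UserAgent().random}
title_xpath = "//a[@class='postTitle2 vertical-middle']/span/text()"
title_url_xpath = "//a[@class='postTitle2 vertical-middle']/@href"
# Walk the archive from the oldest page (49) to the newest (1).
url_list = ["https://www.cnblogs.com/hany-postq473111315/default.html?page={}".format(num) for num in range(49, 0, -1)]

# Write as UTF-8 so that Chinese titles do not depend on the platform default codec (for example gbk on Windows).
with open('隨筆.txt', 'w', encoding='utf-8') as file:
    for num, url in enumerate(url_list, start=1):
        response = requests.get(url, headers=headers)
        e = etree.HTML(response.text)
        titles = e.xpath(title_xpath)
        links = e.xpath(title_url_xpath)
        # Each page lists the newest post first, so reverse to keep chronological order.
        for t, u in zip(titles[::-1], links[::-1]):
            file.write(t + '\n')
            file.write(u + '\n')
        print("Page {} of posts fetched.".format(num))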

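Packaging a Python program
Online icon generator: http://www.ico51.cn/
Build a single-file executable with an icon: pyinstaller -F -i icon_name.ico file_name.py
Without an icon: pyinstaller -F file_name.py

To suppress the black console window, add -w: pyinstaller -F -w -i favicon.ico file_name.py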

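Note: this post summarizes my earlier posts. Material that is too basic is left out; only the key points of the earlier posts are collected here.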

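Python is distributed under the Python Software Foundation License (PSF License), not the GPL (GNU General Public License).
The PSF License is an open-source license that is compatible with the GPL, but unlike the GPL it is not copyleft: modified versions may be distributed without releasing their source code.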

查看關鍵字。 help("keywords") 關鍵字以下:(注:複製代碼只複製上一行代碼便可，下面爲輸出結果) False class from or None continue global pass True def if raise and del import return as elif in try assert else is while async except lambda with await finally nonlocal yield break for not

Python具備可擴展性，可以導入使用 C/C++ 語言編寫的程序，從而提升程序運行速度。 Python具備可嵌入性，能夠把Python程序嵌入到 C/C++ 程序中。

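Python supports GUI programming through several libraries, such as Tkinter and wxPython. (Jython, which often appears in such lists, is an implementation of Python on the JVM rather than a GUI library, although it can use Java toolkits such as Swing.) After all this time I have used tkinter far more than wxPython or Jython.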

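To set the environment variable on Windows, open a command prompt and enter: path %path%;<Python install directory> (this only affects the current command prompt session).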

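Important Python environment variables:
PYTHONPATH: extra directories that the import statement searches for modules, in addition to the default locations.
PYTHONSTARTUP: path of a file whose code is executed when the interactive interpreter starts.
PYTHONCASEOK: if set, import ignores case when matching module names (Windows and macOS only).
PYTHONHOME: alternative location of the standard library. It changes the default module search path, which makes it possible to switch between library sets.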

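Running a .py program from the command prompt: python file_name.py arg1 arg2. Note: the arguments are available in sys.argv (not sys.args). sys.argv[0] is the script name, so the arguments themselves start at the second element, sys.argv[1].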
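As a small illustration of how command-line arguments arrive in sys.argv, here is a minimal sketch. The helper name describe_args is made up for this example and is not part of any library.

```python
import sys


def describe_args(argv):
    """Split an argv-style list into the script name and its arguments.

    describe_args is a hypothetical helper written for this note.
    """
    script_name = argv[0]   # index 0 is always the script path
    arguments = argv[1:]    # user-supplied arguments start at index 1
    return script_name, arguments


if __name__ == "__main__":
    name, args = describe_args(sys.argv)
    print("script:", name)
    print("arguments:", args)  # every argument is a str, convert with int() if needed
```

Saved as demo.py (an assumed file name) and run as python demo.py 10 abc, the arguments list would be ['10', 'abc'].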

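In Python 2, put #-*-coding:UTF-8-*- at the top of the file to declare the source encoding so that Chinese text can be used. Python 3 source files are UTF-8 by default.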

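A variable or method whose name starts with a single underscore, such as _temp, is private by convention: it should not be accessed directly, only through the interface (functions) the class provides. Python does not enforce this.
With from xx import *, _temp is not imported (unless the module lists it in __all__). Users should not access _temp variables or methods.

A variable whose name starts with two underscores, such as __temp, is meant to be reached only through the interface (functions) the class provides.
Variables or methods written as __xxx inside a class trigger the name mangling mechanism.

__temp is rewritten to _classname__temp, which guards against accidental access by users.
Because the class name becomes part of the attribute name, a class that uses __temp avoids naming conflicts with attributes defined in its subclasses.
Names that both start and end with two underscores, such as __init__, are not mangled. A subclass that needs the parent's constructor calls it directly with super().__init__() or ParentClass.__init__(self).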
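To make the underscore conventions concrete, here is a minimal sketch. The class Account and its attributes are invented for illustration only.

```python
class Account:
    """Hypothetical class used only to show the underscore conventions."""

    def __init__(self, owner, balance):
        self.owner = owner          # public attribute
        self._note = "internal"     # single underscore: private by convention only
        self.__balance = balance    # double underscore: stored as _Account__balance

    def balance(self):
        # Inside the class body the short name still works.
        return self.__balance


acct = Account("Tom", 100)
mangled_names = [name for name in vars(acct) if "balance" in name]
print(mangled_names)                 # ['_Account__balance']
print(acct.balance())                # 100
print(acct._Account__balance)        # 100, reachable but clearly discouraged
print(hasattr(acct, "__balance"))    # False, the unmangled name does not exist
```

Name mangling is a guard against accidental clashes, not an access-control feature: the mangled name can still be reached from outside the class.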

輸出關鍵字的另外一種方式 import keyword print(keyword.kwlist) ['False', 'None', 'True', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']

類方法和靜態方法在使用裝飾器方面(很久沒有使用了,有點生疏了) @classmethod def eat(cls): pass @staticmethod def eat(): pass

關於 finally 語句,最後必定會執行的問題,有一些遺忘 # 使用 try 語句 try : pass except Exception as e: pass finally: pass

使用 ctime 獲取當前時間 from time import ctime print("{}".format(ctime()))

%c 格式化字符及其ASCII碼 %s 格式化字符串 %d 格式化整數 %u 格式化無符號整型 %f 格式化浮點數字，可指定小數點後的精度 %e 用科學計數法格式化浮點數 %E 做用同%e，用科學計數法格式化浮點數

print("pi = %.*f" % (3,pi)) #用*從後面的元組中讀取字段寬度或精度 # pi = 3.142 print('%010.3f' % pi) #用0填充空白 # 000003.142 print('%-10.3f' % pi) #使用 - 號左對齊 # 3.142

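In Python a variable is not a storage location of its own. It is a name bound to an object, that is, a reference to some place in memory, and the value stored there is reached through the variable.
The variable can also be rebound to a different object. Variables need no declaration, but a variable must be assigned before it is used.
When a new object is created, Python requests memory from the operating system when necessary; Python implements its own memory allocation system on top of that. The type of a variable is the type of the object it refers to.
Assignment uses the = sign: the variable being created is on the left, the value it needs is on the right.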

變量包含的內容主要包含四個方面： 　　　　1.變量的名稱：在對變量賦值時也就是建立變量時所使用的名字。注：根據標識符規則。 　　　　2.變量保存的數據：一般爲賦值時 = 等號 右面的對象。 　　　　　對象主要包括： 　　　　　　①.數字：int 、float 、complex 、bool、表達式、函數調用返回值等。 　　　　　　　　　　數字: int 表示整數，包含正數，負數，0 　　　　　　　　　　　　float 表示浮點數，帶有小數點的數 　　　　　　　　　　　　complex 表示複數，實部 + 虛部 J 或 j 都可 　bool 布爾類型，True 爲真，False 爲假 ②.字符串：字符串變量、帶有" "的字符串、表達式、函數調用的返回值等。 　　　　　　　　　　注：Python3 以 Unicode 編碼方式編碼。 　　　　　　　　　　使用雙引號 " " 或單引號 ' ' 建立字符串或者進行強制轉換 str 。 ③.列表：列表變量、帶有 [ ] 的對象、表達式、函數調用的返回值等。 　　　　　　　　　　使用了 [ ] 的，[ ] 內能夠是數字，字符串，字典，元組，列表，集合，表達式等。 ④.元組：元組變量、帶有逗號的或被( )包圍的多個變量或值、表達式、函數調用的返回值等。 　　　　　　　　　　空元組 ( ) 　　　　　　　　　　建立一個只包含數字 1 的元素的元組 (1,) 注：必定要帶有 ， 號 　　　　　　　　　　建立包含多個元素的元組，能夠直接用 (元素1，元素2，...，元素n) 賦值 　　　　　　　　　　　　或者元素1，元素2，...，元素n ，使用，逗號進行賦值 ⑤.集合：空集合 set( )、使用了{ }的內部爲單個變量或值、表達式、函數調用的返回值等。 　　　　　　　　　　空集合 set( ) 　　　　　　　　　　建立多個元素的集合，{元素1，元素2，...，元素n} 　　　　　　　　　　注：集合中元素不重複，可利用此特性判斷別的序列對象是否存在重複元素。 ⑥.字典：字典變量、帶有 {鍵:值} 的變量或值、表達式、函數調用的返回值等。 　　　　　　　　　　空字典 { } 　　　　　　　　　　建立多個元素的字典，變量名 = {鍵1:值1,鍵2:值2,...,鍵n:值n} ⑦.類：一般爲類建立實例時，函數調用的返回值等。 　　　　　　　　　　class關鍵字聲明。 　　　　　　⑧.函數：函數名、函數調用等。 　　　　　　　　　　def 關鍵字聲明，在函數前可能會有裝飾器。另外，函數能夠嵌套函數，當內部函數使用了外部函數的某些對象時稱爲閉包函數。 　　　　　　注:表達式是指關於對象之間的運算。

變量的地址，也就是所指向的內存中的地址。使用 id(變量名) 函數獲取。 # 查看 a 的內存地址 a = 123 print(id(a)) # 140734656708688

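Dictionary: a collection of {key: value} pairs. Values are looked up by key, not by position. Since Python 3.7 a dictionary preserves insertion order (earlier versions gave no ordering guarantee). Keys must be immutable (hashable) objects, and each key is unique.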

# 建立一個元素的集合，能夠不使用 ， set_2 = {1} print(set_2) # {1} print(type(set_2)) # <class 'set'> set_3 = {1,} print(set_3) # {1} print(type(set_3)) # <class 'set'>

A set cannot contain lists or dictionaries (or other sets), because set elements must be hashable.
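The restriction on set elements comes from hashing. The following sketch checks a few objects with the built-in hash(); the helper name can_be_set_member is an assumption made for this example.

```python
def can_be_set_member(obj):
    """Return True if obj is hashable and can therefore be stored in a set.

    can_be_set_member is a hypothetical helper, not a standard function.
    """
    try:
        hash(obj)
        return True
    except TypeError:
        return False


print(can_be_set_member((1, 2)))             # True: a tuple of hashable items
print(can_be_set_member([1, 2]))             # False: lists are mutable
print(can_be_set_member({'a': 1}))           # False: dictionaries are mutable
print(can_be_set_member((1, [2])))           # False: the tuple contains a list
print(can_be_set_member(frozenset({1, 2})))  # True: frozenset is the hashable set type
```

When a collection of collections is needed, converting the inner lists to tuples (or the inner sets to frozenset) is the usual workaround.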

# 負數 num_int_3 = -226 print(type(num_int_3)) # <class 'int'>

# 擴大100倍 num_float_3 = 2.5e2 print(num_float_3) # 250.0 print(type(num_float_3)) # <class 'float'>

關於複數 0 進行隱藏問題 num_complex_3 = 3.j print(num_complex_3) # 3j print(type(num_complex_3)) # <class 'complex'> num_complex_4 = .6j print(num_complex_4) # 0.6j print(type(num_complex_4)) # <class 'complex'>

有時判斷條件是否成立使用 not True 好比說某一個元素不在列表中使用 not in

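Unicode encodings: the three main ones are UTF-8, UTF-16 and UTF-32. UTF-8 uses one to four bytes per character, UTF-16 uses two or four bytes, and UTF-32 always uses four bytes.
Characters of a Python string are accessed with square brackets [index]. Strings can be concatenated, which joins two strings and produces a new string. Strings can also be sliced with [ : ] (note: the start is included, the end is excluded).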

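A list supports adding elements, deleting elements, testing whether an element is present, changing the element at a given position, getting the length, finding the largest and smallest elements, and sorting. A tuple can also be "modified" indirectly by converting it to a list, changing the list, and converting it back.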

Python 的元組與列表相似。元組使用小括號 ( ) 包含數據。元組能夠經過索引下標進行訪問元組中的值。元組中的值不是容許修改的，可是能夠對元組進行拼接。

# Create a tuple that contains a single element
tuple_2 = (1,)
print(type(tuple_2)) # <class 'tuple'>
# Create a tuple that contains several elements
tuple_4 = 7,8,9
print(tuple_4) # (7, 8, 9)
print(type(tuple_4)) # <class 'tuple'>

字典建立以後，能夠使用 字典名['鍵名'] 進行訪問。 增長字典元素，能夠直接使用 字典名['新的鍵'] = 新的值 使用 del 能夠將字典元素進行刪除。 能夠對字典求長度，強制轉換，拷貝字典等操做。 注：當後來又添加了新的鍵，而原來有同名的鍵時，之後來的爲準。

在字典中建立鍵值對,沒有寫明: 時,建立的是值 dic = {'a':123,888:'n',(4,5):[7,8]} dic.keys() # dict_keys(['a', 888, (4, 5)]) dic.values() # dict_values([123, 'n', [7, 8]])

# 使用 dict 轉化爲字典 dic = dict(zip(['a','b','c'],[4,5,6])) print(dic) # {'a': 4, 'b': 5, 'c': 6}

　　Python中的數據類型能夠進行相互轉換： 　　　　1.將 float 浮點型轉化成 int 長整型。int( ) 　　　　2. 將 2,3 轉化爲複數。complex(實部,虛部) 　　　　3.將數字、列表、元組、字典轉化爲字符串類型。str( ) , json.dumps(字典) 　　　　4.將字符串轉化爲數字類型。eval( ) 　　　　5.將列表轉化成元組。tuple( ) 　　　　6.將元組轉化成列表。list( ) 　　　　7.將列表轉化成集合，用來消除多餘重複元素。set( ) 　　　　8.將字符串轉化爲集合元素。set( ) 9.將整數轉化爲字符。 chr( ) 　　　　10.將字符轉化爲整數。ord( ) 　　　　11.將十進制整數轉化爲十六進制數。hex( ) 　　　　12.將十進制整數轉化爲八進制數。 oct( )

# 將整數轉化爲字符。 print(chr(65)) # A print(chr(90)) # Z print(chr(97)) # a print(chr(122)) # z # 將字符轉化爲整數。 print(ord('A')) # 65 # 將十進制整數轉化爲十六進制數。 print(hex(17)) # 0x11 # 將十進制整數轉化爲八進制數。 print(oct(9)) # 0o11

Python算術運算符。 　　算術運算符： 　　　　+ ：兩個對象相加。 　　　 －：獲得負數 或 前一個數減去後一個數。 　　　　* ： 兩個數進行相乘 或 重複字符串元素、列表元素、元組元素。 　　　　/ ： 前一個數對後一個數進行除法操做，返回浮點數類型。 　　　 %： 取模，返回前一個數除去後一個數的餘數。 　　　 ** ： 返回前一個數的後面數次冪。 　　　 // ： 前一個數對後一個數整除，返回整數部分。
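The // and % operators are easy to misread with negative operands, so here is a short sketch of their behavior. It only uses built-in operators.

```python
# // is floor division: the result is rounded toward negative infinity,
# not simply truncated.
print(7 // 2)         # 3
print(-7 // 2)        # -4
print(7.0 // 2)       # 3.0 (a float operand gives a float result)

# % takes the sign of the right-hand operand, so that
# a == (a // b) * b + (a % b) always holds.
print(7 % 2)          # 1
print(-7 % 2)         # 1
print(7 % -2)         # -1

# divmod returns both results at once.
print(divmod(-7, 2))  # (-4, 1)

# / always returns a float, even when the division is exact.
print(8 / 2)          # 4.0
```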

Python 比較運算符，多用於條件判斷語句 if 中，返回值爲 True (真)或 False (假)： 　　== ： 等於，比較兩個對象的值是否相等。 　 ！= ： 不等於，比較兩個對象的值是否不相等。 　　> ： 大於，前面一個數是否大於後面的數。 　　< ： 小於，前面一個數是否小於後面的數。 　 >= ： 大於等於，前面的是是否大於等於後面的數。 　 <=： 小於等於，前面的數是否小於等於後面的數。

Python賦值運算符： 　　= ： 賦值運算符 　 += ： 加法賦值運算符 　 -= ： 減法賦值運算符 　 *= ： 乘法賦值運算符 /= ： 除法賦值運算符 　%= ： 取模賦值運算符 ，當前面的數小於後面的數時，返回前一個數自己(數大於 0)。 **= ： 冪賦值運算符 //= ： 取整賦值運算符 　注：a 符號等於 b 等價於 a 等於 a 符號 (b)

# *= 乘法賦值運算符 a = 4 b = 5 a *= b #等價於 a = a * (b) print("a = {0} , b = {1} ".format(a,b)) # a = 20 , b = 5 # /= 除法賦值運算符 a = 4 b = 5 a /= b #等價於 a = a / (b) print("a = {0} , b = {1} ".format(a,b)) # a = 0.8 , b = 5 # %= 取模賦值運算符 a = 4 b = 5 a %= b #等價於 a = a % (b) print("a = {0} , b = {1} ".format(a,b)) # a = 4 , b = 5 a = 6 b = 4 a %= b #等價於 a = a % (b) print("a = {0} , b = {1} ".format(a,b)) # a = 2 , b = 4 # **= 冪賦值運算符 a = 4 b = 2 a **= b #等價於 a = a ** (b) print("a = {0} , b = {1} ".format(a,b)) # a = 16 , b = 2 # //= 取整賦值運算符,返回整數 a = 4 b = 3 a //= b #等價於 a = a // (b) print("a = {0} , b = {1} ".format(a,b)) # a = 1 , b = 3

Python位運算符：將 int 長整型數據看作二進制進行計算，主要是將前面的數和後面的數的對應位置上的數字 0，1 進行判斷。 　　　　 & 按位與：若是對應位置上的兩個數都爲 1，那麼獲得的該結果的該位置上也爲 1 。其餘狀況都爲 0。 　　　　 | 按位或：若是對應位置上的兩個數有一個爲 1 或都爲 1，則獲得的該結果的該位置上也爲 1 。其餘狀況都爲 0。 　　　 ^ 按位異或：若是對應位置上的兩個數爲 0 和 1 或 1 和 0，則獲得的該結果的該位置上也爲 1 。其餘狀況都爲 0。 　　　 ~ 按位取反：若是~後面爲正數或 0，則結果爲-(數+1)， 　　　　　　　　　 若是後面的數爲負數，則結果爲-(負數(帶符號)+1)。 　 << 左移運算符：將前面的數乘以 2 的(後面的數) 次冪。 　 >> 右移運算符：將前面的數除以 2 的(後面的數) 次冪。

# ~ 按位取反：若是後面的爲正數，則結果爲-(正數+1) print(~2) # -3 # 若是後面的數爲負數，則結果爲-(負數(帶符號)+1)。 print(~(-5)) # 4

注意返回的是對象,不是True 和 False Python邏輯運算符： 　　and 布爾‘與’： 當左面的對象爲真時，返回右面的對象。 　　　　 　　　　當左面的對象不爲真時，返回左面的對象。 　 　or 布爾‘或’： 當左面的對象爲真時，返回左面的對象。 　　　　　　　　 當左面的對象不爲真時，返回右面的對象。 　 not 布爾'非'： 若是後面的對象爲True，則返回False。不然返回True。
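Since and and or return one of their operands rather than a strict True or False, a few quick checks may help. The function name display_name is invented for this sketch.

```python
# "or" returns the first truthy operand, or the last operand if none is truthy.
print(0 or "default")   # default
print("" or [] or 0)    # 0

# "and" returns the first falsy operand, or the last operand if all are truthy.
print([] and "x")       # []
print("a" and "b")      # b

# "not" always returns a real bool.
print(not "a")          # False
print(not [])           # True


def display_name(name):
    """Hypothetical helper: fall back to a default when name is empty."""
    return name or "anonymous"


print(display_name(""))     # anonymous
print(display_name("Tom"))  # Tom
```

The name or default idiom treats every falsy value (0, '', [], None) as missing, which is not always what is wanted.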

Python成員運算符： 　　in：若是左面的對象在右面的對象中，則返回 True，不在則返回 False。 not in：若是左面的對象不在右面的對象中，則返回 True，在則返回 False。

a = 'a' d = 'd' dic = {'a':123,'b':456,'c':789} # 判斷 a 是否在 dic 中 # 字典主要是看,是否存在該鍵 print(a in dic) # True # 判斷 d 是否在 s 中 print(d in dic) # False

a = 'a' d = 'd' strs = 'abc' # 判斷 a 是否不在 strs 中 print(a not in strs) # False # 判斷 d 是否不在 strs 中 print(d not in strs) # True

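Python identity operators:
is: tests whether the two operands are the same object (the same memory address).
is not: tests whether the two operands are different objects.
Note: is compares identity, not value; use == to compare values.
For immutable data (numbers, strings, tuples) CPython may reuse one object for equal values, for example small integers from -5 to 256 and some string literals, so is can return True. This is an implementation detail and should not be relied on.
For mutable data (lists, dictionaries, sets) every literal or constructor call creates a new object, so two equal values built separately are different objects and is not returns True.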

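# For small integers CPython reuses one cached object, so is returns True here (an implementation detail)
# Numbers
num = 123
num_two = 123
# Print the addresses of num and num_two
print("address of num: {0}, address of num_two: {1}".format(id(num),id(num_two)))
# address of num: 140729798131792, address of num_two: 140729798131792
print(num is num_two) # True, num and num_two refer to the same object
print(num is not num_two) # False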

# 對於可變類型，即便引用自相同數據，內存地址也不相同。is not 爲 True # 列表 lst = [1,2,3] lst_two = [1,2,3] # 輸出 lst 和 lst_two 的地址 print("lst地址爲:{0},lst_two地址爲:{1}".format(id(lst),id(lst_two))) # lst地址爲:2781811921480,lst_two地址爲:2781811921992 print(lst is lst_two) # False print(lst is not lst_two) # True
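The difference between identity and equality can be sketched with three names. The variable names below are arbitrary.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal value, but a separately created object
c = a           # a second name for the same object

print(a == b)   # True: the contents are equal
print(a is b)   # False: two different list objects
print(a is c)   # True: both names refer to one object

# A change made through one name is visible through the other.
c.append(4)
print(a)        # [1, 2, 3, 4]
print(b)        # [1, 2, 3]

# "is" is the idiomatic test for singletons such as None.
value = None
print(value is None)  # True
```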

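Python operator precedence (from highest to lowest; parentheses are always evaluated first):
** : exponentiation
~ bitwise NOT, unary + and -
* multiplication, / division, % modulo, // floor division
+ addition, - subtraction
>> right shift, << left shift
& bitwise AND
^ bitwise XOR
| bitwise OR
comparison, identity and membership operators, all at the same level: <=, <, >, >=, ==, !=, is, is not, in, not in
not : boolean NOT
and : boolean AND
or : boolean OR
Note: = and the augmented forms (+=, -=, *=, /=, //=, %=, **=) are assignment statements rather than operators inside an expression; the right-hand side is fully evaluated before the assignment takes place.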

條件語句的幾種狀況 第一種： ''' if 條件1: 條件1知足時，須要運行的內容 ''' 第二種： ''' if 條件1: 條件1知足時，須要運行的內容 else: 條件1不知足時，須要運行的內容 ''' 第三種： ''' if 條件1: 條件1知足時，須要運行的內容 elif 條件2: 條件1不知足時，條件2知足，須要運行的內容 ''' 第四種： ''' if 條件1： 當條件1知足時，須要運行的內容 elif 條件2： 當條件1不知足，知足條件2時，須要運行的內容 ... elif 條件n: 前面的 n-1 條條件都不知足，第n條條件知足，須要運行的內容 else: 前面的全部條件都不知足時，須要運行的內容 '''

在 if 中經常使用的操做運算符： 　　< 小於、<= 小於等於、> 大於、>= 大於等於、== 等於、!= 不等於 　　注：能夠配合 and、or、not 進行混合搭配。

Python 中的循環包括 for 循環和 while 循環。 while 循環，當給定的判斷條件爲 True 時，會執行循環體，不然退出循環。(可能不知道具體執行多少次) for 循環，重複執行某一塊語句，執行 n 次。 在 Python 中能夠進行嵌套使用循環。while 中包含 for ，或 for 包含 while。

Python循環控制語句：主要有三種，break、continue 和 pass 語句。 　　break 語句　：在語句塊執行過程當中，終止循環、並跳出整個循環。 　　continue 語句 ：在語句執行過程當中，跳出本次循環，進行下一次循環。 　　pass 語句 ：空語句，用來保持結構的完整性。

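Python while loop (the loop body must eventually make the condition false, otherwise the loop never ends):
Form 1:
'''
while condition:
    one statement or a block of statements
'''
Form 2, where the else block runs only when the loop ends because the condition became false, not when the loop is left through break:
'''
while condition:
    one statement or a block of statements
else:
    one statement or a block of statements
'''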

Python 無限循環：在 while 循環語句中，能夠經過讓判斷條件一直達不到 False ，實現無限循環。

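Using else with a Python while loop:
else: runs after the while loop ends normally, that is, when the condition becomes false. It is skipped if the loop is left through break.
Example:
while condition:
    one statement or a block of statements
else:
    one statement or a block of statements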

Python for 循環語句：遍歷任何序列的項目，能夠是字符串、列表、元組、字典、集合對象

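In a for loop the else block likewise runs only when the loop finishes without hitting break.
Form 1:
'''
for item in sequence:
    code block (one or more statements)
'''
Form 2:
'''
for item in sequence:
    code block (one or more statements)
else:
    code block (one or more statements)
'''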
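The interaction between else and break is easiest to see in a search loop. This is a minimal sketch; the function name find_first_even is made up for the example.

```python
def find_first_even(numbers):
    """Hypothetical helper showing for/else: else runs only without break."""
    for n in numbers:
        if n % 2 == 0:
            result = "found {}".format(n)
            break                      # leaving through break skips the else block
    else:
        result = "no even number"      # reached only if the loop never hit break
    return result


print(find_first_even([1, 3, 4, 5]))   # found 4
print(find_first_even([1, 3, 5]))      # no even number
print(find_first_even([]))             # no even number (the loop body never ran)
```

The same rule applies to while ... else: the else block is tied to "no break happened", not to "the loop ran at least once".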

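Iterating with a Python for loop over sequence indexes:
Note: elements of sets and dictionaries cannot be fetched by position. Sets are unordered, and dictionaries are accessed by key rather than by index.
len(obj) returns the length of the object being traversed.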

使用 range 方法（左閉右開）： range 函數參數以下，起始位置、終止位置(不包含)、步長。 　　注：起始位置默認爲 0 。 　　　　步長能夠爲負，默認爲 1。

lst = [i for i in range(5)] print(lst) # 起始位置默認爲 0 # [0, 1, 2, 3, 4] lst = [i for i in range(1,5)] print(lst) # 不包含終止位置 # [1, 2, 3, 4] lst = [i for i in range(1,5,2)] print(lst) #步長能夠根據本身須要進行更改 # [1, 3] lst = [i for i in range(-5,-1,1)] print(lst) # 起始位置和終止位置能夠爲負 # [-5, -4, -3, -2]

經過序列索引進行迭代操做程序： 字符串： strs = "Hello World." for i in range(len(strs)): print(strs[i],end = " ") # H e l l o W o r l d .

Nested loops in Python: for loops and while loops can be nested inside each other.
Example:
A while loop containing a for loop:
while True:
    for i in range(3):
        print("nesting while and for")
    break
# nesting while and for
# nesting while and for
# nesting while and for

for 循環嵌套 while 循環、不推薦進行使用： a = 1 for i in range(3): while a < 3: print("while 和 for 進行嵌套") a += 1 # while 和 for 進行嵌套 # while 和 for 進行嵌套

關於 break 的理解 Python break語句：當運行到 break 語句時，終止包含 break 的循環語句。 注：不管判斷條件是否達到 False 或 序列是否遍歷完都會中止執行循環語句和該 break 下的全部語句。 　　當使用循環嵌套時，break 語句將會終止最內層的 while 或 for 語句、而後執行外一層的 while 或 for 循環。

lst = [7,8,9,4,5,6] for i in range(len(lst)): if lst[i] == 4: print("循環終止") break #終止循環語句 print(lst[i],end = " ") # 7 8 9 循環終止

關於 continue 的理解 當執行到 continue 語句時，將再也不執行本次循環中 continue 語句接下來的部分，而是繼續下一次循環

lst = [7,8,9,4,5,6]
for i in range(len(lst)):
    if lst[i] == 9:
        continue # skip the rest of this iteration and move on to the next iteration
    print(lst[i],end = " ")
# 7 8 4 5 6

Python pass語句：空語句，主要用於保持程序結構的完整性 或者 函數想要添加某種功能，可是尚未想好具體應該怎麼寫。

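Python numeric type conversion:
int(x): converts x to an integer
float(x): converts x to a floating-point number
complex(x,y): builds a complex number with real part x and imaginary part y
eval(x): evaluates the string x as a Python expression and returns the result, for example eval('12') returns 12 and eval('1.5') returns 1.5. It executes arbitrary code, so int(x) or float(x) is safer for untrusted input
chr(x): x is an integer; returns the character with that code point. 65 -> A, 90 -> Z
ord(x): x is a single character; returns its integer code point. a -> 97, z -> 122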

# 將 2,3 轉化爲複數 num_complex = complex(2,3) print(num_complex) # (2+3j) print(type(num_complex)) # <class 'complex'>

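Python math functions (the math.* functions require import math)
abs(x): absolute value, e.g. abs(-10) returns 10
math.ceil(x): the smallest integer not less than x, e.g. math.ceil(4.1) returns 5
math.exp(x): e to the power x, e.g. math.exp(1) returns 2.718281828459045
math.fabs(x): absolute value as a float, e.g. math.fabs(-10) returns 10.0
math.floor(x): the largest integer not greater than x, e.g. math.floor(4.9) returns 4
math.log(x[, base]): natural logarithm by default, e.g. math.log(math.e) returns 1.0 and math.log(100,10) returns 2.0
math.log10(x): base-10 logarithm of x, e.g. math.log10(100) returns 2.0
max(x1, x2,...): the largest of the arguments; the argument may also be a sequence.
min(x1, x2,...): the smallest of the arguments; the argument may also be a sequence.
math.modf(x): returns the fractional part and the integer part of x, in that order. Both carry the sign of x and both are floats.
pow(x, y): the value of x**y.
round(x [,n]): rounds x to n decimal places (to an integer if n is omitted). Python 3 rounds exact halves to the nearest even value, so round(2.5) returns 2, and because most decimal fractions cannot be stored exactly as floats, a result such as round(2.675, 2) returning 2.67 can look surprising.
math.sqrt(x): the square root of x.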

# math.floor(x) returns the largest integer less than or equal to x
print(math.floor(-5.9),math.floor(8.6)) # -6 8

# math.modf(x) returns the fractional part and the integer part of x, in that order.
# Both parts carry the sign of x, and both are floats.
print(math.modf(-5.9),math.modf(8.6)) # (-0.9000000000000004, -5.0) (0.5999999999999996, 8.0)

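# round(x[,n]) is a built-in function, not part of the math module
# n is the number of decimal places to keep; x is rounded to the nearest value at that precision
print(round(-5.984,2),round(8.646,2)) # -5.98 8.65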
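The built-in round does not follow the schoolbook "round half up" rule, which the checks below illustrate. The helper round_half_up is a hypothetical function written for this note; it relies on the standard decimal module.

```python
from decimal import Decimal, ROUND_HALF_UP

# Exact halves go to the nearest even integer ("round half to even").
print(round(0.5), round(1.5), round(2.5))   # 0 2 2

# 2.675 is stored as a float slightly below 2.675, so it rounds down.
print(round(2.675, 2))                      # 2.67


def round_half_up(value, places=2):
    """Hypothetical helper: schoolbook rounding through the decimal module."""
    quantum = Decimal(10) ** -places        # e.g. Decimal('0.01') for places=2
    # str(value) keeps the digits as written, avoiding the binary float error.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_half_up(2.675))     # 2.68
print(round_half_up(2.5, 0))    # 3
```

The helper returns a Decimal; wrap it in float() if a float is required downstream.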

# math.log(x) is the natural logarithm (base e): it returns y such that e ** y == x
print(math.log(math.e),math.log(math.e ** 2)) # 1.0 2.0

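Python random number functions (in the random module; import random first):
random.choice(seq): picks one element at random from a non-empty sequence
random.randrange([start,] stop [,step]): picks a random integer from the range built with the given step (default step 1). Note: stop is not included
random.random(): returns the next random float in the range [0,1)
random.shuffle(lst): shuffles the list in place; the return value is None
random.uniform(x, y): returns a random float between x and y; the end point y may or may not be included, depending on floating-point rounding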
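Here is a brief sketch of these random functions in use. The seed 42 and the sample list are arbitrary choices made so that runs are repeatable; no specific output is shown because the exact values depend on the generator.

```python
import random

rng = random.Random(42)              # a separate generator with an arbitrary seed
items = ['a', 'b', 'c', 'd']

pick = rng.choice(items)             # one element of items
step = rng.randrange(0, 10, 2)       # one of 0, 2, 4, 6, 8 (10 is excluded)
frac = rng.random()                  # 0.0 <= frac < 1.0
value = rng.uniform(1, 5)            # a float between 1 and 5

shuffled = items[:]                  # copy first: shuffle works in place
result = rng.shuffle(shuffled)       # returns None, shuffled itself is reordered

print(pick, step, frac, value)
print(shuffled, result)
```

The module-level functions (random.choice and so on) behave the same way; they just share one hidden generator.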

python 三角函數 math.sin(x) 返回的x弧度的正弦值。 math.asin(x) 返回x的反正弦弧度值。 math.cos(x) 返回x的弧度的餘弦值。 math.acos(x) 返回x的反餘弦弧度值。 math.tan(x) 返回x弧度的正切值。 math.atan(x) 返回x的反正切弧度值。 math.degrees(x) 將弧度轉換爲角度,如degrees(math.pi/2) ， 返回90.0 math.radians(x) 將角度轉換爲弧度 math.hypot(x, y) 返回 sqrt(x*x + y*y) 的值。

import math # π/2 的正弦值 print(math.sin(math.pi/2)) # 1.0 # 1 的反正弦值 print(math.asin(1)) # 1.5707963267948966 π/2 # π 的餘弦值 print(math.cos(math.pi)) # -1.0 # -1 的反餘弦值 print(math.acos(-1)) # 3.141592653589793 # 四分之三 π 的正切值 print(math.tan(math.pi*3/4)) # -1.0000000000000002 # 使用 math.degrees(x) 函數查看 四分之一 π 的角度 print(math.degrees(math.pi/4)) # 45.0 # 使用 math.radians(x) 函數查看 135° 對應的弧度制 print(math.radians(135)) # 2.356194490192345 print(math.pi*3/4) # 2.356194490192345 # math.hypot(x, y) 查看 sqrt(x*x + y*y) 的值 print(math.hypot(3,4)) # 5.0 print(math.hypot(6,8)) # 10.0 print(math.sqrt(6*6 + 8*8)) # 10.0

Python數學常量： 　　math.pi：π 　　math.e：天然常數 e

# Natural logarithm (ln) examples
a = math.e
b = math.e ** 5
print("ln(a) =",math.log(a)) # ln(a) = 1.0
print("ln(b) =",math.log(b)) # ln(b) = 5.0

Python建立字符串： 　　通常狀況下能夠使用 ' 或 " 建立字符串 或 使用引用字符串變量 或 字符串表達式。

# 使用字符串表達式進行賦值 a = 'ABCD' b = 'EFG' c = a + b print(c)

Python訪問字符串中的值： 1.能夠使用索引下標進行訪問，索引下標從 0 開始： # 使用索引下標進行訪問，索引下標從 0 開始 strs = "ABCDEFG" print(strs[0]) # A 2.使用切片操做獲取字符串： 示例：[start:stop:step] 　　start ：須要獲取的字符串的開始位置，默認爲 0 。(一般能夠不寫) 　　stop ：須要獲取的字符串的結束位置 的後一個位置。 　　step ：步長，默認爲 1 、當 start 大於 stop 時，step 爲負數。 # 使用[start:stop:step]切片操做獲取字符串 strs = "ABCDEFG" print(strs[:4]) # ABCD print(strs[:4:2]) # AC print(strs[2:6]) # CDEF print(strs[2:6:2]) # CE # 不包含結束位置 print(strs[6:2:-1]) # GFED print(strs[6:2:-2]) # GE 3.經過 for 循環進行獲取字符串： strs = "ABCDEFG" for i in strs: # 其中 i 爲字符串中的單個字母 # 注:此時的 i 不要用作索引下標 print(i,end =" ") # A B C D E F G

Python字符串更新：截取字符串的某一部分 和 其餘字符串進行拼接。 　　注：能夠修改字符串的值，但修改的不是內存中的值，而是建立新的字符串。 1.使用字符串常量進行更新： # 使用字符串常量 strs = "hello,hey" print(strs[:6] + "world.") # hello,world. 2.使用切片操做(不包含結尾 stop)進行更新： strs = "hello,hey" py = "Tom,Jerry" s_2 = strs[:5] + py[3:] print(strs[:5]) # hello print(py[3:]) # ,Jerry print("更新後的字符串:",s_2) # 更新後的字符串: hello,Jerry 修改字符串： # 修改字符串,將 world 修改成 python strs = "hello，world" strs = strs[:6] + "python" print("更新後的字符串:{0}".format(strs)) # 更新後的字符串:hello，python

Python轉義字符：當須要在字符串中使用特殊字符時，使用 \ 轉義字符。 注：轉義字符在字符串中，註釋也是字符串類型。 \(在行尾時):續行符 \\ :反斜槓符號 \' :單引號 \" :雙引號 \a :響鈴 \b :退格(Backspace) \000:空 \n :換行 \v :縱向製表符 \t :橫向製表符 \r :回車

Python字符串運算符： 　　+ ：鏈接左右兩端的字符串。 　　* ：重複輸出字符串。 　　[ ] ：經過索引獲取字符串中的值。 　　[start:stop:step]：開始，結束位置的後一個位置，步長。 　　in ：判斷左端的字符是否在右面的序列中。 　　not in：判斷左端的字符是否不在右面的序列中。 　　r/R ：在字符串開頭使用，使轉義字符失效。

Python字符串格式化： 字符串中符號： 　　%c ：單個字符 　　%s ：字符串 　　%d ：整數 　　%u ：無符號整數 　　%o ：無符號八進制數 　　%x ：無符號十六進制數 　　%X ：無符號十六進制數（大寫） 　　%f ：浮點數，可指定小數點後的精度 　　%e ：對浮點數使用科學計數法，可指定小數點後的精度。%E 與 %e 做用相同 　　%g ：%f 和 %e 的簡寫，%G 與 %g 做用相同 注：%o 爲八進制（oct）、%x 爲十六進制（hex）。

# %o octal number
num = 11
print("%o"%(num)) # 13 1*8**1 + 3*8**0 = 11
print(oct(11)) # 0o13
# %x hexadecimal number
num = 18
print("%x"%(num)) # 12 1*16**1 + 2*16**0 = 18
print(hex(num)) # 0x12

# %e 科學計數法 num = 120000 print("%e"%(num)) # 1.200000e+05 print("%.2e"%(num)) # 1.20e+05 print("%E"%(num)) # 1.200000E+05 print("%.2E"%(num)) # 1.20E+05 # %g : %f 和 %e 的簡寫 num = 31415926 print("%g"%(num)) # 3.14159e+07 print("%G"%(num)) # 3.14159E+07

格式化操做符的輔助指令： 　　* ：定義寬度 或 小數點精度 　　- ： 左對齊 　　+ ： 使正數顯示符號 　　<sp>：在正數前顯示空格 　　 # ：在八進制前顯示 0 ，在十六進制前顯示 0x 或 0X 　　 0 ：顯示的數字前面填充 '0' 　　% ：%%輸出單個% 　(var) ：字典參數映射變量 　m.n. ：m是顯示的寬度，n 是小數點後的位數

Python三引號：多用做註釋、數據庫語句、編寫 HTML 文本

UTF-8 編碼將英文字母編碼成一個字節，漢字一般是三個字節。適用於存在大量英文字符時，節省空間

Python字符串內建函數： 注：漢字屬於字符（既是大寫又是小寫）、數字能夠是： Unicode 數字，全角數字（雙字節），羅馬數字，漢字數字。 1.capitalize( )： 將字符串第一個字母大寫 # 使用 字符串.capitalize() 方法將字符串首字母大寫 strs = 'abc' print(strs.capitalize()) # Abc 2.center(width[,fillchar]) ： 讓字符串在 width 長度居中，兩邊填充 fillchar 字符(默認是空格) # center(width,fillchar) # 使用 字符串.center() 方法，將字符串在 width 長度居中，兩邊補充 fillchar strs = 'abcdefgh' print(strs.center(20,'-')) #------abcdefgh------ 3.count(str,start=0,end=len(string))： 返回 str 在字符串從 start 到 end 範圍內出現的次數(不包含end)。 # 使用 字符串.count(str) 方法，返回 str 在 字符串中出現的次數 strs = 'abcdefghabcd' print(strs.count('c')) #2 # 使用 字符串.count(str) 方法，返回 str 在 字符串中出現的次數 strs = 'abcdefghabcd' # a 的索引位置爲 0,8 print(len(strs)) # 12 print(strs.count('a',2,8)) # 0 print(strs.count('a',2,9)) # 1 4.bytes.decode(encoding="UTF-8")： 將字節碼轉換爲字符串 strs_bytes = b'\xe6\xac\xa2\xe8\xbf\x8e' print(strs_bytes.decode(encoding='UTF-8')) # 歡迎 5.encode(encoding='UTF-8')： 將字符串轉換爲字節碼 strs = '歡迎' print(strs.encode(encoding='UTF-8')) # b'\xe6\xac\xa2\xe8\xbf\x8e' 6.endswith(str[,start[,end]])： 判斷字符串在 start 到 end 是否以 str結尾 # 字符串.endswith(str[,start[,end]]) strs = 'ABCDEFG' print(strs.endswith('G')) # True print(strs.endswith('F',0,6)) # True 7.expandtabs(tabsize = 4)： 將字符串中的 tab 符號轉換爲空格，tabsize 爲替換的空格數 # 字符串.expandtabs(tabsize = 4) # 將字符串中的 tab 符號轉換爲空格，tabsize 爲替換的空格數 strs = 'ABCD EFG' print(strs.expandtabs(tabsize = 4)) # ABCD EFG 8.find(str,start = 0,end = len(string))： 在 start 到 end 範圍內尋找 str 元素，若是找到則返回 str 元素的索引位置，不然返回 -1。 # find(str,start = 0,end = len(string))： # 在 start 到 end 範圍內尋找 str 元素，若是找到則返回 str 元素的索引位置，不然返回 -1 strs = 'ABCDEFG' #索引位置，從 0 開始 print(strs.find('E')) # 4 print(strs.find('K')) # -1 9.index(str,start = 0,end = len(string))： 在 start 到 end 範圍內尋找 str 元素，若是找到則返回 str 元素的索引位置，找不到則會報錯。 # index(str,start = 0,end = len(string))： # 在 start 到 end 範圍內尋找 str 元素，若是找到則返回 str 元素的索引位置，找不到則返回-1。 strs = 'ABCDEFG' print(strs.index('F')) # 5 10.isalnum( )： 若是字符串全部字符都是 字母 或者 數字 則返回 True，不然返回 False。 # isalnum( )： # 若是字符串全部字符都是 字母 或者 數字 則返回 True，不然返回 False。 strs = 
'abcd123' print(strs.isalnum()) # True strs = '好的' print(strs.isalnum()) # True strs = 'abc_' print(strs.isalnum()) # False 11.isalpha( )： 若是字符串中全部字符都是字母則返回 True，不然返回 False。 # isalpha( )： # 若是字符串中全部字符都是字母則返回 True，不然返回 False。 strs = 'ABCD漢字' print(strs.isalpha()) # True strs_two = 'ABCD123' print(strs_two.isalpha()) # False 12.isdigit( )： 若是字符串中全部字符都是數字則返回True，不然返回 False。 # isdigit( )： # 若是字符串中全部字符都是數字則返回True，不然返回 False。 # 注: ① 也是數字 strs = '①②12' print(strs.isdigit()) # True strs_two = 'ABCD123' print(strs_two.isdigit()) # False 13.islower( )： 若是字符串中全部可以區分大小寫的字符都是小寫的，則返回True。不然返回 False。 # islower( )： # 若是字符串中全部字符都是小寫的，則返回True。不然返回 False。 strs = 'abcd' print(strs.islower()) # True strs_two = 'abc123' print(strs.islower()) # True strs_three = 'Abcd' print(strs_three.islower()) # False 14.isnumeric( )： 若是字符串只包含數字字符，則返回 True。不然返回 False。 # isnumeric( )： # 若是字符串只包含數字字符，則返回 True。不然返回 False。 strs = '123456' print(strs.isnumeric()) #True strs_two = '½⅓123①②ⅡⅣ❶❷' print(strs_two.isnumeric()) # True strs_three = 'abc123A' print(strs_three.isnumeric()) # False 15.isspace( )： 若是字符串只包含空格，則返回True。不然返回False。 # isspace( )： # 若是字符串只包含空格，則返回True。不然返回False。 strs = ' ' print(strs.isspace()) # True strs = ' 1' print(strs.isspace()) # False 16.istitle( )： 若是全部被空格分割成的子字符串的首字母都大寫，則返回 True。不然返回 False。 # istitle( ) # 若是全部被空格分割成的子字符串的首字母都大寫，則返回 True。不然返回 False。 strs = 'Hello World' print(strs.istitle()) # True strs_two = 'Welcome to Harbin' print(strs_two.istitle()) # False strs_three = 'World T12' print(strs_three.istitle()) # True 17.isupper( ) ： 若是字符串中全部可以區分大小寫的字符都是大寫的，則返回True。不然返回 False。 # isupper( ) ： # 若是字符串中全部可以區分大小寫的字符都是大寫的，則返回True。不然返回 False。 strs = 'ABCD123漢字' print(strs.isupper()) # True strs_two = 'ABCabc漢字' print(strs_two.isupper()) # False
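The three numeric predicates isdecimal, isdigit and isnumeric accept progressively wider sets of characters. The sample strings below are chosen only for illustration.

```python
# Each row prints: string, isdecimal(), isdigit(), isnumeric()
samples = ["123", "①", "½", "四"]
for s in samples:
    print(repr(s), s.isdecimal(), s.isdigit(), s.isnumeric())
# '123' True True True
# '①' False True True
# '½' False False True
# '四' False False True

# Only strings that pass isdecimal() are safe to hand to int().
print(int("123"))   # 123
```

A practical consequence: isdigit() returning True does not guarantee that int() will accept the string, since int("①") raises ValueError.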

Python字符串內建函數： 1.join(str) ： 使用調用的字符串對 str 進行分割，返回值爲字符串類型 # join(str) ： # 使用調用的字符串對 str 進行分割。 strs = "Hello" strs_two = ' '.join(strs) print(strs_two) # H e l l o print(','.join(strs)) # H,e,l,l,o 2.len(string)： 返回字符串的長度 # len(string)： # 返回字符串的長度 strs = 'happy' print(len(strs)) # 5 3.ljust(width[,fillchar])： 以前的是 center 函數，也能夠進行填充。 字符串左對齊，使用 fillchar 填充 width 的剩餘部分。 # ljust(width[,fillchar])： # 字符串左對齊，使用 fillchar 填充 width 的剩餘部分。 strs = 'Hello' print(strs.ljust(20,'-')) # Hello--------------- # fillchar 默認爲空 print(strs.ljust(20)) # Hello 4.lower( )：注：使用了 lower 函數後，原來的字符串不變。 將字符串全部可以區分大小寫的字符都轉換爲小寫字符。 # lower( )： # 將字符串全部可以區分大小寫的字符都轉換爲小寫字符。 strs = 'Hello 123' print(strs.lower()) # hello 123 print(type(strs.lower())) # <class 'str'> # 原來的字符串沒有發生改變 print(strs) # Hello 123 # 使用字符串接收 lower 函數的返回值 strs = strs.lower() print(strs) # hello 123 5.lstrip(str)： 將字符串最左面的 str 部分去除，輸出剩餘的部分(str 默認爲空格)。 #lstrip( )： # 將字符串左面的空格部分去除，輸出剩餘的部分。 strs = ' hello' print(strs.lstrip()) # hello print(strs) # hello # 使用 lstrip('過濾的參數') 函數，將最左面的 a 過濾掉 strs = 'abcd' print(strs.lstrip('a')) # bcd 6.maketrans(參數1，參數2)：調用後，使用字符串.translate函數對字符串進行替換。 建立字符映射的轉換表。 　　參數 1 是須要轉換的字符 　　參數 2 是轉換的目標 # maketrans(參數1，參數2)： # 建立字符映射的轉換表。 # 　　參數 1 是須要轉換的字符 # 　　參數 2 是轉換的目標 # 將 abcd 使用 1234 替換 keys = 'abcd' values = '1234' tran = str.maketrans(keys,values) print(type(tran)) #<class 'dict'> # 使用 字符串.translate(接收了 maketrans 函數的對象) strs = 'abcdef' print(strs.translate(tran)) # 1234ef 7.max(str)： 返回 str 中最大的字母，小寫字母的 Unicode 編碼比大寫字母的 Unicode 編碼大。 # max(str)： # 返回 str 中最大的字母 strs = 'abcABC' print(max(strs)) # c strs = 'ABCDE' print(max(strs)) # E 8.min(str)： 返回 str 中最小的字母，大寫字母的 Unicode 編碼比小寫字母的 Unicode 編碼小。 # min(str)： # 返回 str 中最小的字母 strs = 'abcABC' print(min(strs)) # A strs = 'ABCDE' print(min(strs)) # A 9.replace(old,new[,num])： 將舊字符串替換爲新的字符串，num 爲最多替換的次數。(默認爲所有替換) # replace(old,new[,num])： # 將舊字符串替換爲新的字符串，num 爲替換的次數。 strs = 'abc abc abc abc' print(strs.replace('abc','ABC')) # ABC ABC ABC ABC # 替換 3 次 
print(strs.replace('abc','ABC',3)) # ABC ABC ABC abc 10.rfind(str,start = 0,end = len(string))： 從字符串的最右面查找 str # rfind(str,start = 0,end = len(string))： # 從字符串的最右面查找 str，不包含end strs = 'happy happy' # h 的索引位置分別爲 0,6 print(strs.rfind('h')) # 6 # y 的索引位置分別爲 4,10 # 在 索引位置 2 到 11 之間進行查找 print(strs.rfind('y',2,11)) # 10 11.rindex(str,start = 0,end = len(string))： 從字符串右面開始尋找 str ，返回索引值、找不到則報錯。 # rindex(str,start = 0,end = len(string))： # 從字符串右面開始尋找 str ，返回索引值 strs = 'happy happy' # a 的索引位置爲 1,7 print(strs.rindex('a')) # 7 12.rjust(width[,fillchar])： 返回一個以字符串右對齊，使用 fillchar 填充左面空餘的部分的字符串 # rjust(width[,fillchar])： # 返回一個以字符串右對齊，使用 fillchar 填充左面空餘的部分的字符串 strs = 'hello' print(strs.rjust(20)) # hello print(strs.rjust(20,'*')) # ***************hello 13.rstrip(str)： 刪除字符串最右面的 str 字符，str默認爲空格 注：遇到不是 str 字符才中止刪除 # rstrip(str)： # 刪除字符串最右面的 str 字符，str默認爲空格 strs = 'hello ' print(strs.rstrip()) # hello strs = 'hello aaaaa' print(strs.rstrip('a')) # hello 14.split(str,num)： 對字符串使用 str 進行分割，若是 num有指定值，則分割 num次(默認爲所有分割) # split(str=" ",num=string.count(str))： # 對字符串使用 str 進行分割，若是 num有指定值，則分割 num次(默認爲所有分割) strs = 'hahahah' print(strs.split('a')) # ['h', 'h', 'h', 'h'] # 對字符串進行切割兩次 print(strs.split('a',2)) # ['h', 'h', 'hah'] 15.splitlines(is_keep)： 按照 回車\r 、換行\n 對字符串進行分割。 　　is_keep ：當 is_keep 爲 True 時，返回值保留換行符。 　　　　　　 當 is_keep 爲 False 時，返回值不包留換行符。 # splitlines(is_keep)： # # 按照 回車\r 、換行\n 對字符串進行分割。 # 　　is_keep ：當 is_keep 爲 True 時，返回值保留換行符。 # 　　　　　　 當 is_keep 爲 False 時，返回值不包留換行符。 strs = "a\r\nb\nc" # True則保留換行符和回車，False則不保存 print(strs.splitlines(True)) # ['a\r\n', 'b\n', 'c'] print(strs.splitlines()) # ['a', 'b', 'c'] 16.startswith(str,start = 0,end = len(string))： 查看在字符串的 start 到 end-1 的區間中，是否以 str 開頭。 # startswith(str,start = 0,end = len(string))： # 查看在字符串的 start 到 end-1 的區間中，是否以 str 開頭。 strs = 'hello , hey , world' print(strs.startswith('hello')) # True print(strs.startswith('hey',8,13)) # True print(strs[8:13]) # hey , 17.strip(str)： 返回在最左端和最右端都刪除 str 的字符串。 注：遇到其餘字符則中止。 # strip(str)： 
# 返回在最左端和最右端都刪除 str 的字符串。 # 注：遇到其餘字符則中止,只要是 str 進行刪除、不限次數。 strs = 'ABCDEABCD' print(strs.strip('A')) # BCDEABCD # 右端字符由於遇到了D,因此中止了。 strs = 'ABCDEABCDAAA' print(strs.strip('A')) # BCDEABCD strs = 'ABCDEABCD' # 若是最左和最右兩端都沒有 str ，則不進行刪除 print(strs.strip('E')) # ABCDEABCD 18.swapcase( )： 將可以區分大小寫的字符，大小寫互換。 # swapcase( )： # 將可以區分大小寫的字符，大小寫互換。 strs = 'ABCDabcdefg' print(strs.swapcase()) # abcdABCDEFG 19.title( )： 將字符串變爲每個單詞都是大寫字母開頭，其他字母爲小寫或數字。 # title( )： # 將字符串變爲每個單詞都是大寫字母開頭，其他字母爲小寫或數字。 strs = 'hello world abc123' print(strs.title()) # Hello World Abc123 20.translate(字典 或 接收了字符串.maketrans(被替換元素,替換元素)的對象)： 將字符串按照參數進行轉換 # translate(字典 或 接收了字符串.maketrans(被替換元素,替換元素)的對象)： # 將字符串按照參數進行轉換 keys = 'abcd' values = '1234' tran = str.maketrans(keys,values) print(type(tran)) #<class 'dict'> # 使用 字符串.translate(接收了 maketrans 函數的對象) strs = 'abcdef' print(strs.translate(tran)) # 1234ef 21.upper( )： 將全部可以區分大小寫的字符都轉換爲大寫 # upper()： # 將全部可以區分大小寫的字符都轉換爲大寫 strs = 'hello World' print(strs.upper()) # HELLO WORLD 22.zfill(width)： 返回長度爲 width 的字符串，在左端填充 0 # zfill(width)： # 返回長度爲 width 的字符串，在左端填充 0 strs = 'hello' print(strs.zfill(10)) # 00000hello 23.isdecimal( )： 字符串是否只包含十進制數，其他進制都返回False。 # isdecimal( )： # 字符串是否只包含十進制數 # 二進制 strs_bin = '0b11' print(strs_bin.isdecimal()) # False # 八進制 strs_oct = '0o56' print(strs_oct.isdecimal()) # 十六進制 strs_hex = '0xa4' print(strs_hex.isdecimal()) # False strs_int = '123' print(strs_int.isdecimal()) # True
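Two points about these string methods are easy to miss: join concatenates rather than splits, and the argument to strip, lstrip and rstrip is treated as a set of characters, not as a substring. A minimal sketch with made-up sample strings:

```python
# join glues the items of an iterable together, using the string as separator.
words = ["hello", "world"]
joined = "-".join(words)
print(joined)                        # hello-world

# split is the inverse operation.
print(joined.split("-"))             # ['hello', 'world']

# strip removes any of the given characters from both ends,
# in any order, until a different character is met.
print("xxhixyx".strip("xy"))         # hi

# lstrip("abc") removes leading a, b and c characters, not the prefix "abc" once.
print("abcabc-data".lstrip("abc"))   # -data
```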

Python列表：在 [ ] 中括號 中添加元素 或者 經過 list 轉換其餘類型。 列表(我的總結)： 　　1.列表是可變類型，便可以使用列表內置方法對列表進行增刪查改排序操做 　　　　經常使用的增刪查改排序方法： 　　　　　　　　增 ：append、extend、insert、+ 鏈接、 　　　　　　　　刪 ：pop、remove、clear、del 　　　　　　　　查 ： in、not in、for循環迭代等 　　　　　　　　改 ： 列表變量[索引下標] = 元素、切片修改 　　　　　　　　排序： sort、sorted 　　2.列表是序列對象，即列表的索引下標從 0 開始，依次遞增，最後一個元素爲-1，從右向左依次遞減 　　3.列表能夠包含全部數據類型：數字、字符串、列表、元組、集合、字典 　　4.列表是可迭代對象，便可以進行 for 循環(推薦使用列表推導式) 　　5.列表能夠進行切片操做 [start:end:step] (不包含end) 　　6.列表能夠查看元素出現的次數 count 和 元素的位置(索引下標) index 　　7.獲取列表的長度 len 注：列表還有不少其餘的 用法 和 功能，以上只是常見的

訪問列表元素： 經過索引下標： # 經過索引下標獲取列表元素 lst = [1,4,7,2,5,8] print("lst 的第一個元素是",lst[0]) # lst 的第一個元素是 1 print("lst 的第四個元素是",lst[3]) # lst 的第四個元素是 2 經過切片進行獲取： # 切片 [start(默認爲 0 ),end(一直到 end-1 位置),step(默認爲 1 )] # 默認的均可以省略不寫 # 列表翻轉 lst = [1,4,7,2,5,8] # 8 的索引位置是 5 print(lst[::-1]) # [8, 5, 2, 7, 4, 1] print(lst[2:5]) #不包含 5 # [7, 2, 5] print(lst[2:5:2]) #不包含 5 # [7, 5]

Updating a Python list:
Update by index:
# Change the 6th element (index 5) of the list to d
lst = ['a','b','c',1,2,3]
lst[5] = 'd'
print(lst) # ['a', 'b', 'c', 1, 2, 'd']
Update by slice:
# Replace everything from the 3rd element (index 2) to the end with the characters of 'hey'
lst = ['a','b','c',1,2]
lst[2:] = 'hey'
print(lst) # ['a', 'b', 'h', 'e', 'y']
# Replace the 4th through 8th elements (indexes 3 to 7) with the characters of 'hello'
lst = [1,2,3,'a','b','c','d','e',4,5,6]
lst[3:8] = 'hello'
print(lst) # [1, 2, 3, 'h', 'e', 'l', 'l', 'o', 4, 5, 6]
Add an element with append:
# Update the list with the append method
lst = ['a','b','c',1,2,3]
lst.append('d')
print(lst) # ['a', 'b', 'c', 1, 2, 3, 'd']

Python刪除列表元素： pop( )： 刪除最後一個元素，返回該元素的值 # 使用 pop 方法刪除列表中的元素 lst = ['a','b','c',1,2,3] print(lst.pop()) # 3 ,pop方法刪除最後一個元素並返回它的值 print(lst) # ['a', 'b', 'c', 1, 2] remove(str)： 在列表中刪除 str 元素，無返回值 注：當列表中不存在 str 元素時，則會報錯 lst = ['A','B','C','D'] print(lst.remove('C')) # None print(lst) # ['A', 'B', 'D'] remove(str)： # 將 str 從列表中刪除一次 # 當列表中存在多個 str 元素時，只刪除一次 lst = ['a','b','c','d','a'] lst.remove('a') print(lst) # ['b', 'c', 'd', 'a'] del 元素： 刪除列表中的元素 或 整個列表 # 使用 del 方法刪除列表的第 3 個元素 lst = ['A','B','C','D'] del lst[2] print(lst) # ['A', 'B', 'D'] # 刪除整個 lst 列表 del lst

Python列表腳本操做符： len(列表名)： 查看列表長度 列表對象 1 + 列表對象 2 ： 將兩個列表進行組合，有時可用於賦值 lst = [1,2,3,4] lst_two = [7,8,9] print(lst + lst_two) # [1, 2, 3, 4, 7, 8, 9] 成員運算符 in 、not in： 判斷左端元素是否在右端列表中 lst = ['a','b','c'] print('a' in lst) # True 將列表用做可迭代對象 lst = [1,2,3,'a','b','c'] for i in lst: print(i,end = " ") # 1 2 3 a b c # 注：此時的 i 不是數字，而是列表中的元素，不要用於索引下標

List functions and methods:

Functions:

len(list_name): returns the length of the list
    lst = [1,2,3,'a','b','c']
    print("The length of lst is %d"%(len(lst)))
    # The length of lst is 6

max(list_name): returns the largest element of the list
Note: the elements must be comparable with each other (for example all strings or all numbers); otherwise a TypeError is raised
    lst = [8,4,5,6,9]
    print(max(lst))
    # 9
    lst = ['a','b','c','A','B','C']
    print(max(lst))
    # c

min(list_name): returns the smallest element of the list
Note: the same restriction on element types applies
    lst = [8,4,5,6,9]
    print(min(lst))
    # 4
    lst = ['a','b','c','A','B','C']
    print(min(lst))
    # A

list(tuple): converts a tuple to a list (commonly used when the values of a tuple need to be changed)
    tup = (1,2,3,'a','b')
    tuple_lst = list(tup)
    print(tuple_lst)
    # [1, 2, 3, 'a', 'b']

Methods:

append(object): adds the object to the end of the list
    lst = ['A','B','C']
    lst.append('D')
    print(lst)
    # ['A', 'B', 'C', 'D']
    # If the object is a list, the whole list is added as a single element
    lst = ['A','B','C']
    lst_two = ['a','b','c']
    lst.append(lst_two)
    print(lst)
    # ['A', 'B', 'C', ['a', 'b', 'c']]

count(str): returns the number of times str occurs in the list
    lst = [1,2,1,2,1,3]
    print(lst.count(1))
    # 3
    # If the element does not exist, 0 is returned
    print(lst.count(4))
    # 0

extend(iterable): adds every element of the iterable to the end of the list and returns None; mostly used to extend an existing list
    lst = [1,2,3]
    lst_two = [7,8,9]
    lst.extend(lst_two)
    print(lst)
    # [1, 2, 3, 7, 8, 9]

index(str): returns the index of the first occurrence of str; raises a ValueError if str is not in the list
    lst = ['a','b','c','d','e','c']
    print(lst.index('c'))
    # 2

insert(index,object): inserts the object at position index; the element previously at index and all elements after it shift one place to the right
    lst = ['a','b','c','d']
    lst.insert(2,'q')
    print(lst)
    # ['a', 'b', 'q', 'c', 'd']

pop(index = -1): removes the last element by default and returns it; with index 2 it removes the 3rd element
    lst = ['a','b','c','d']
    lst.pop()
    print(lst)
    # ['a', 'b', 'c']
    lst = ['a','b','c','d']
    # Remove the 3rd element (index 2)
    lst.pop(2)
    print(lst)
    # ['a', 'b', 'd']

remove(str): removes str from the list. Note: if str occurs several times, only the first occurrence is removed
    lst = ['a','b','c','d','a']
    lst.remove('a')
    print(lst)
    # ['b', 'c', 'd', 'a']
    lst = ['a','b','c','d']
    lst.remove('b')
    print(lst)
    # ['a', 'c', 'd']

reverse(): reverses the elements of the list in place
    lst = [1,7,3,2,5,6]
    lst.reverse()
    print(lst)
    # [6, 5, 2, 3, 7, 1]

sort(key = None,reverse = False): sorts the list in place and returns None
    # key takes a function that computes the value each element is compared by
    lst = [1,4,5,2,7,6]
    lst.sort()
    print(lst)
    # [1, 2, 4, 5, 6, 7]
    lst = [1,4,5,2,7,6]
    # Sort in descending order
    lst.sort(reverse = True)
    print(lst)
    # [7, 6, 5, 4, 2, 1]
    lst = [5,4,1,2,7,6]
    # Elements whose key is False come first, elements whose key is True come last;
    # the sort is stable, so each group keeps its original order
    lst.sort(key = lambda x : x % 2 == 0)
    print(lst)
    # [5, 1, 7, 4, 2, 6]

clear(): empties the list; returns None
    lst = ['a','c','e']
    lst.clear()
    print(lst)
    # []

copy(): returns a shallow copy of the list
    lst = ['a','b','c']
    print(lst.copy())
    # ['a', 'b', 'c']
    print(lst)
    # ['a', 'b', 'c']
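Two details of the methods above are easy to miss: sort works in place and returns None, while the built-in sorted returns a new list, and copy is shallow, so nested lists are still shared. A small sketch under those assumptions (the variable names are made up for illustration):

```python
lst = [5, 4, 1, 2, 7, 6]

# sorted() returns a new list and leaves the original untouched.
by_parity = sorted(lst, key=lambda x: x % 2 == 0)   # odd numbers first, stable
unchanged = list(lst)

# list.sort() sorts in place and returns None.
sort_result = lst.sort()

# copy() is shallow: the outer list is new, the inner lists are shared.
nested = [[1, 2], [3]]
shallow = nested.copy()
shallow.append([4])        # only affects the copy
shallow[0].append(99)      # visible through both lists

print(by_parity, sort_result, nested, shallow)
```

If fully independent nested data is needed, copy.deepcopy from the standard library is the usual tool.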

Python tuples: a tuple is similar to a list; the difference is that the values of a tuple cannot be modified.
A tuple is created with ( ) or by converting another type with tuple().

Creating an empty tuple:
    tuple_1 = ()
    print(type(tuple_1))
    # <class 'tuple'>
    print(tuple_1)
    # ()

Creating a tuple with a single element:
Note: the trailing comma is required; without it the parentheses just wrap an ordinary value
    tuple_2 = (1,)
    print(type(tuple_2))
    # <class 'tuple'>
    print(tuple_2)
    # (1,)

Creating a tuple with several elements:
    tuple_3 = (4,5)
    print(type(tuple_3))
    # <class 'tuple'>
    print(tuple_3)
    # (4, 5)
    # Convert a list with tuple
    lst = [1,2,3,4,5]
    tuple_4 = tuple(lst)
    print(type(tuple_4))
    # <class 'tuple'>
    print(tuple_4)
    # (1, 2, 3, 4, 5)

Creating a tuple with several elements without parentheses:
    tuple_4 = 7,8,9
    print(tuple_4)
    # (7, 8, 9)
    print(type(tuple_4))
    # <class 'tuple'>

Accessing a tuple:

By index:
    tuple_1 = ('a','b','c','d','e','f','g')
    # Print the first value of the tuple
    print(tuple_1[0])
    # a
    # Print the sixth value of the tuple
    print(tuple_1[5])
    # f
    # Print the last element of the tuple
    print(tuple_1[-1])
    # g

By slicing:
    # Slice the tuple with [start:end:step]. Note: position end is not included
    tuple_1 = ('a','b','c',4,5,6,7)
    # Print all elements of the tuple
    print(tuple_1[::])
    # ('a', 'b', 'c', 4, 5, 6, 7)
    # Print the third through fifth elements
    print(tuple_1[2:5]) # end is not included
    # ('c', 4, 5)
    # The step can be changed
    print(tuple_1[2:5:2])
    # ('c', 5)
    # Print the tuple in reverse order
    print(tuple_1[::-1])
    # (7, 6, 5, 4, 'c', 'b', 'a')

Note: tuples have no reverse method.

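Modifying an element of a tuple:
    # Change 'c' in the tuple to 'd'
    tuple_1 = ('a','b','c',4,5,6,7)
    # The index of 'c' is 2
    # To change a value, convert the tuple to a list first, then convert it back to a tuple
    lst = list(tuple_1)
    lst[2] = 'd' # make the change
    tuple_1 = tuple(lst) # convert back to a tuple
    print(tuple_1)
    # ('a', 'b', 'd', 4, 5, 6, 7)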

Deleting one element of a tuple: either convert to a list and use del, or join two slices with +.
    # Method 1
    # Delete 'b' from the tuple
    tuple_1 = ('a','b','c',4,5,6,7)
    # The index of 'b' is 1
    lst = list(tuple_1)
    del lst[1]
    tuple_1 = tuple(lst) # convert back to a tuple
    print(tuple_1)
    # ('a', 'c', 4, 5, 6, 7)
    # Method 2
    # Delete 'b' from the tuple
    tuple_1 = ('a','b','c',4,5,6,7)
    # The index of 'b' is 1
    tuple_1 = tuple_1[:1] + tuple_1[2:]
    print(tuple_1)
    # ('a', 'c', 4, 5, 6, 7)
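The workarounds above are needed because a tuple rejects item assignment. The sketch below shows the error being raised, and also that immutability only covers the tuple's own slots: a mutable object stored inside a tuple can still change. The variable names are illustrative.

```python
tuple_1 = ('a', 'b', [1, 2])

# Item assignment on a tuple raises TypeError.
try:
    tuple_1[0] = 'z'
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# The tuple still refers to the same list object, but that list is mutable.
tuple_1[2].append(3)

print(error_name, tuple_1)
```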

Deleting a whole tuple with the del statement:
    tuple_1 = ('a','b','c')
    del tuple_1

Tuple operators:

len(tuple_name): returns the length of the tuple
    tuple_1 = (1,4,5,2,3,6)
    print(len(tuple_1))
    # 6

+ concatenation:
    tuple_1 = (1,4,5)
    tuple_2 = (3,5,4)
    print(tuple_1 + tuple_2)
    # (1, 4, 5, 3, 5, 4)

* repetition:
    tuple_1 = (1,4,5)
    num = 3
    print(num * tuple_1)
    # (1, 4, 5, 1, 4, 5, 1, 4, 5)

Membership operators in, not in:
    tuple_1 = (1,4,5,2,3,6)
    print(4 in tuple_1)
    # True
    print(8 in tuple_1)
    # False
    print(4 not in tuple_1)
    # False
    print(8 not in tuple_1)
    # True

Using a tuple as an iterable:
    tuple_1 = ('a','b','c')
    for i in tuple_1:
        print(i , end = " ")
    # a b c

Tuple indexing and slicing:

Index:
    tuple_1 = ('a','b','c','d','e','f','g','h')
    print(tuple_1[0])
    # a
    print(tuple_1[3])
    # d
    print(tuple_1[7])
    # h
    # With a negative index, -1 is the rightmost element and indexes decrease from right to left
    print(tuple_1[-1])
    # h
    print(tuple_1[-4])
    # e

Slicing:
    # Take part of the tuple with a slice
    tuple_1 = (1,2,3,4,5,6,7,8,9,10)
    print(tuple_1[::])
    # (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    print(tuple_1[2:8])
    # (3, 4, 5, 6, 7, 8)
    print(tuple_1[2:8:3])
    # (3, 6) , end is not included
    print(tuple_1[2::-1])
    # (3, 2, 1)
    print(tuple_1[8:1:-1])
    # (9, 8, 7, 6, 5, 4, 3)
    print(tuple_1[8:1:-2])
    # (9, 7, 5, 3)
    print(tuple_1[-1:-5:-1])
    # (10, 9, 8, 7)

Built-in functions for tuples:

len(tuple_name): returns the length of the tuple
    tuple_1 = (1,2,3,'a','b','c')
    print("The length of tuple_1 is %d"%(len(tuple_1)))
    # The length of tuple_1 is 6

max(tuple_name): returns the largest element of the tuple
Note: the elements must be comparable with each other (for example all strings or all numbers)
    tuple_1 = (8,4,5,6,9)
    print(max(tuple_1))
    # 9
    tuple_1 = ('a','b','c','A','B','C')
    print(max(tuple_1))
    # c

min(tuple_name): returns the smallest element of the tuple
Note: the same restriction on element types applies
    tuple_1 = (8,4,5,6,9)
    print(min(tuple_1))
    # 4
    tuple_1 = ('a','b','c','A','B','C')
    print(min(tuple_1))
    # A

tuple(list): converts a list to a tuple
    lst = [1,2,3,'a','b']
    tuple_1 = tuple(lst)
    print(tuple_1)
    # (1, 2, 3, 'a', 'b')

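Python dictionaries: {key: value}, with multiple key-value pairs separated by commas.

Creating an empty dictionary:
    dic = {}
    print(type(dic))
    # <class 'dict'>
    print(dic)
    # {}

Creating a dictionary with one element:
    dic = {'a':123}
    print(dic)
    # {'a': 123}

Creating a dictionary with several elements:
    dic = {'a':123,888:'n',(4,5):[7,8]}
    print(dic)
    # {'a': 123, 888: 'n', (4, 5): [7, 8]}
    # Keys must be of an immutable (hashable) type

Building a dictionary with dict:
    dic = dict(zip(['a','b','c'],[4,5,6]))
    print(dic)
    # {'a': 4, 'b': 5, 'c': 6}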

Accessing values in a dictionary:
    # Use dic['key'] to get an element of the dictionary
    dic = {'a':123,'b':456,'c':789}
    print(dic['a'])
    # 123

Modifying an element:
    dic = {'a': 123, 'b': 456, 'c': 789}
    dic['b'] = 14
    print(dic)
    # {'a': 123, 'b': 14, 'c': 789}

Adding an element:
    dic = {'a':123,'b':456,'c':789}
    dic['d'] = 555
    print(dic)
    # {'a': 123, 'b': 456, 'c': 789, 'd': 555}

Deleting:
    dic = {'a': 123, 'b': 14, 'c': 789}
    # Delete one element
    del dic['b']
    print(dic)
    # {'a': 123, 'c': 789}
    # Empty the dictionary
    dic = {'a': 123, 'b': 14, 'c': 789}
    dic.clear()
    print(dic)
    # {}
    # Delete the dictionary itself
    dic = {'a': 123, 'b': 14, 'c': 789}
    del dic

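Built-in functions and methods for dictionaries:
Note: items, values, and keys return iterable view objects, which can be converted to lists with list.

len(dict_name): returns the number of keys, that is, the length of the dictionary
    dic = {'a':123,'b':456,'c':789,'d':567}
    print(len(dic))
    # 4

str(dict_name): converts the dictionary to a string
    dic = {'a':123,'b':456,'c':789,'d':567}
    print(str(dic))
    # {'a': 123, 'b': 456, 'c': 789, 'd': 567}

type(dict_name): shows the type of the dictionary
    dic = {'a':123,'b':456,'c':789,'d':567}
    print(type(dic))
    # <class 'dict'>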

Dictionary methods:

clear(): removes all elements from the dictionary
    dic = {'a':123,'b':456,'c':789,'d':567}
    dic.clear()
    print(dic)
    # {}

copy(): returns a shallow copy of the dictionary
    dic = {'a':123,'b':456,'c':789,'d':567}
    dic_two = dic.copy()
    print(dic)
    # {'a': 123, 'b': 456, 'c': 789, 'd': 567}
    print(dic_two)
    # {'a': 123, 'b': 456, 'c': 789, 'd': 567}

fromkeys(seq[,value]): creates a new dictionary with the items of seq as keys and value as the initial value of every key (default None)
    dic = dict.fromkeys('abcd')
    # The default value is None
    print(dic)
    # {'a': None, 'b': None, 'c': None, 'd': None}
    dic = dict.fromkeys('abc',1)
    print(dic)
    # {'a': 1, 'b': 1, 'c': 1}

get(key,default = None): returns the value for the given key; if the key does not exist, returns default
    dic = {'a':1,'b':2,'c':3,'d':4}
    print(dic.get('b'))
    # 2
    print(dic.get('e',5))
    # 5

Membership operators in, not in: test whether a key is in the dictionary
    dic = {'a':1,'b':2,'c':3,'d':4}
    print('a' in dic)
    # True
    print('a' not in dic)
    # False

items(): returns an iterable view of the key-value pairs; list converts it to the form [(key, value)]
    dic = {'a':1,'b':2,'c':3,'d':4}
    print(dic.items())
    # dict_items([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
    print(list(dic.items()))
    # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

keys(): returns an iterable view of the keys; list() converts it to a list
    dic = {'a':1,'b':2,'c':3,'d':4}
    print(dic.keys())
    # dict_keys(['a', 'b', 'c', 'd'])
    print(list(dic.keys()))
    # ['a', 'b', 'c', 'd']

setdefault(key,default = None): if the key is already in the dictionary, its value is left unchanged; if the key is not in the dictionary, it is set to default
    dic = {'a':1,'b':2,'c':3,'d':4}
    dic.setdefault('a',8)
    print(dic)
    # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
    # If the key is not in the dictionary, it is set to default
    dic = {'a':1,'b':2,'c':3,'d':4}
    dic.setdefault('e',5)
    print(dic)
    # {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

update(dict_object): merges dict_object into the dictionary
    dic = {'a':1,'b':2,'c':3,'d':4}
    dic_two = {'f':6}
    dic.update(dic_two)
    print(dic)
    # {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'f': 6}

values(): returns an iterable view of the values; list converts it to a list of the values
    dic = {'a':1,'b':2,'c':3,'d':4}
    print(list(dic.values()))
    # [1, 2, 3, 4]

pop(key[,default]): removes key from the dictionary and returns its value. If key is not in the dictionary, default is returned; if no default is given either, a KeyError is raised
    dic = {'a':1,'b':2,'c':3,'d':4}
    print(dic.pop('a',6))
    # 1 , the removed value is returned
    print(dic)
    # {'b': 2, 'c': 3, 'd': 4}
    print(dic.pop('e','no such key in the dictionary'))
    # no such key in the dictionary , default is returned because the key does not exist
    print(dic)
    # {'b': 2, 'c': 3, 'd': 4}

popitem(): removes and returns one key-value pair. Since Python 3.7 this is always the most recently inserted pair; earlier versions removed an arbitrary pair
    dic = {'a':1,'b':2,'c':3,'d':4}
    print(dic.popitem())
    # ('d', 4)
    print(dic)
    # {'a': 1, 'b': 2, 'c': 3}
    print(dic.popitem())
    # ('c', 3)
    print(dic)
    # {'a': 1, 'b': 2}
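Two of the methods above deserve a closer look. fromkeys stores the same value object under every key, which matters when that value is mutable, and setdefault is a compact way to group items. The following is a minimal sketch with made-up data, not taken from the original post.

```python
# fromkeys shares ONE list between all keys.
shared = dict.fromkeys('ab', [])
shared['a'].append(1)            # also visible under 'b'

# A dict comprehension creates a separate list per key.
separate = {key: [] for key in 'ab'}
separate['a'].append(1)

# setdefault returns the existing value, or inserts and returns the default.
groups = {}
for word in ['apple', 'avocado', 'banana']:
    groups.setdefault(word[0], []).append(word)

# popitem removes the most recently inserted pair (Python 3.7+).
dic = {'a': 1, 'b': 2}
last_pair = dic.popitem()

print(shared, separate, groups, last_pair)
```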

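The time module

    time.time()
    # 1595600800.6158447

Python represents a broken-down time as struct_time, a tuple-like object with 9 numeric fields:
    tm_year   year, 4 digits, e.g. 2020
    tm_mon    month: 1~12, e.g. 6 for June
    tm_mday   day of the month: 1~31, e.g. 25
    tm_hour   hour: 0~23, e.g. 16
    tm_min    minute: 0~59, e.g. 33
    tm_sec    second: 0~61 (60 is valid for a leap second; 61 is supported for historical reasons)
    tm_wday   day of the week: 0~6, 0 is Monday
    tm_yday   day of the year: 1~366 (366 occurs only in a leap year)
    tm_isdst  daylight saving time: 1 in effect, 0 not in effect, -1 unknown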

Getting the current time:
    # Use time.localtime(time.time())
    import time
    times = time.time()
    print(times) # seconds elapsed since January 1, 1970
    # 1595601014.0598545
    localtime = time.localtime(times)
    print(localtime)
    # time.struct_time(tm_year=2020, tm_mon=7, tm_mday=24, tm_hour=22, tm_min=30,
    # tm_sec=14, tm_wday=4, tm_yday=206, tm_isdst=0)

Getting a formatted time:
time.time() returns the number of seconds since January 1, 1970 -> time.localtime() converts it to local time -> time.asctime() formats the time
    import time
    times = time.time()
    print(times)
    # 1595601087.3453288
    local_times = time.localtime(times)
    print(local_times)
    # time.struct_time(tm_year=2020, tm_mon=7, tm_mday=24, tm_hour=22,
    # tm_min=31, tm_sec=27, tm_wday=4, tm_yday=206, tm_isdst=0)
    # Use asctime to turn local_times into a formatted time string
    local_time_asctimes = time.asctime(local_times)
    print(local_time_asctimes)
    # Fri Jul 24 22:31:27 2020

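Date format codes:
    %y : two-digit year (00~99)
    %Y : four-digit year (0000~9999)
    %m : month (01~12)
    %d : day of the month (01~31)
    %H : hour, 24-hour clock (00~23)
    %I : hour, 12-hour clock (01~12)
    %M : minute (00~59)
    %S : second (00~61; values above 59 occur only for leap seconds)
    %a : abbreviated weekday name
    %A : full weekday name
    %b : abbreviated month name
    %B : full month name
    %c : the locale's date and time representation
    %j : day of the year (001~366)
    %p : the locale's equivalent of A.M. or P.M.
    %U : week number of the year (00~53). Note: Sunday is the first day of the week
    %w : weekday as a number (0~6). Note: 0 is Sunday
    %W : week number of the year (00~53). Note: Monday is the first day of the week
    %x : the locale's date representation
    %X : the locale's time representation
    %Z : name of the current time zone
    %% : a literal % character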

time.strftime(format[,t]): format is the date format string; t is an optional time tuple (defaults to the current local time)
    # Formatting a date
    import time
    times = time.time()
    local_time = time.localtime(times)
    # Y year - m month - d day  H hour : M minute : S second
    print(time.strftime("%Y-%m-%d %H:%M:%S",local_time))
    # 2020-07-24 22:33:43
    # Y year, b abbreviated month name, d day, H hour : M minute : S second, a abbreviated weekday name
    print(time.strftime("%Y %b %d %H:%M:%S %a",local_time))
    # 2020 Jul 24 22:33:43 Fri
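strftime and its counterpart strptime use the same format codes, so a string can be parsed into a time tuple and formatted back. The sketch below uses the timestamp string from the example above as fixed input so the result does not depend on the current time; only numeric codes are used because names such as %b and %a depend on the locale.

```python
import time

text = "2020-07-24 22:33:43"
fmt = "%Y-%m-%d %H:%M:%S"

# Parse the string into a struct_time ...
parsed = time.strptime(text, fmt)

# ... and format the struct_time back into a string.
round_trip = time.strftime(fmt, parsed)

# strptime also fills in the weekday and the day of the year.
print(parsed.tm_wday, parsed.tm_yday, round_trip)
```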

Getting the calendar for a month:
calendar.month(year,month): returns the calendar for the given month of the given year
    import calendar
    cal = calendar.month(2020,2)
    print("Calendar for February 2020:")
    print(cal)
    #    February 2020
    # Mo Tu We Th Fr Sa Su
    #                 1  2
    #  3  4  5  6  7  8  9
    # 10 11 12 13 14 15 16
    # 17 18 19 20 21 22 23
    # 24 25 26 27 28 29

The Python time module:

altzone: the offset of the local daylight-saving time zone, in seconds west of Greenwich (UTC). The value is negative if that zone is east of Greenwich (as in Western Europe).
Note: we are east of Greenwich, so the value is negative. It is only meaningful for regions that use daylight saving time
    import time
    print(time.altzone)
    # -32400

time(): returns the timestamp of the current time
    import time
    times = time.time()
    print(times)
    # 1595601470.093444

asctime(time_tuple): takes a time tuple and returns a readable string. A time tuple is an object returned by gmtime or localtime, e.g. time.localtime(time.time())
    import time
    times = time.time()
    print(times)
    # 1595601496.8390365
    local_time = time.localtime(times)
    print(local_time)
    # time.struct_time(tm_year=2020, tm_mon=7, tm_mday=24,
    # tm_hour=22, tm_min=38, tm_sec=16, tm_wday=4, tm_yday=206, tm_isdst=0)
    # Use asctime to convert it to a readable form
    print(time.asctime(local_time))
    # Fri Jul 24 22:38:16 2020

perf_counter(): returns the value of a high-resolution performance counter in seconds. The starting point is undefined, so only the difference between two calls is meaningful
    import time
    print(time.perf_counter())
    # 2.0289763

process_time(): returns the CPU time (system plus user) used by the current process
    import time
    print(time.process_time())
    # 0.0625

gmtime(secs): secs is a timestamp, the number of seconds since January 1, 1970. Returns the time tuple in Greenwich (UTC) time
    import time
    times = time.time()
    # The current time tuple in Greenwich time
    print(time.gmtime(times))
    # time.struct_time(tm_year=2020, tm_mon=2, tm_mday=10, tm_hour=5,
    # tm_min=18, tm_sec=7, tm_wday=0, tm_yday=41, tm_isdst=0)

localtime(secs): secs is a timestamp, the number of seconds since January 1, 1970. Returns the local time tuple for that timestamp
    import time
    times = time.time()
    print(times)
    # 1581312227.3952267 , the timestamp
    local_time = time.localtime(times)
    print(local_time)
    # time.struct_time(tm_year=2020, tm_mon=2, tm_mday=10, tm_hour=13,
    # tm_min=23, tm_sec=47, tm_wday=0, tm_yday=41, tm_isdst=0)

mktime(time_tuple): takes a time tuple expressed in local time (such as the result of localtime) and returns the timestamp
    import time
    times = time.time()
    print(times)
    # 1581312492.6350465
    local_time = time.localtime(times)
    print(time.mktime(local_time))
    # 1581312492.0

ctime(): returns the current time in readable form
    import time
    print(time.ctime())
    # Mon Feb 10 13:32:22 2020

sleep(seconds): suspends the program for seconds seconds
    import time
    print(time.ctime())
    # Mon Feb 10 13:34:38 2020
    time.sleep(5)
    print(time.ctime())
    # Mon Feb 10 13:34:43 2020

strftime(format,time_tuple): format is the display format, e.g. %Y-%m-%d %H-%M-%S. Converts the time tuple to a string in that format
    import time
    local_time = time.localtime()
    time_strftime = time.strftime("%Y-%m-%d %H-%M-%S",local_time)
    print(time_strftime)
    # 2020-02-10 13-42-09

strptime(str,format): parses str into a time tuple according to format; raises a ValueError if str does not match format
    import time
    time_str = '2020-02-10 13-42-09'
    str_time = time.strptime(time_str,"%Y-%m-%d %H-%M-%S")
    print(str_time)
    # time.struct_time(tm_year=2020, tm_mon=2, tm_mday=10, tm_hour=13,
    # tm_min=42, tm_sec=9, tm_wday=0, tm_yday=41, tm_isdst=-1)

Attributes:

timezone: the offset of the local (non-daylight-saving) time zone, in seconds west of Greenwich; negative east of Greenwich
    import time
    print(time.timezone)
    # -28800

tzname: a pair of strings: the name of the local time zone without daylight saving time, followed by the name with daylight saving time
    import time
    print(time.tzname)
    # ('中國標準時間', '中國夏令時')
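Two of the functions above are typically used in pairs. perf_counter is meant for measuring elapsed time as the difference between two readings, and mktime is the inverse of localtime. The sketch below illustrates both; the 0.05 second sleep and the fixed timestamp (taken from the mktime example above) are arbitrary choices for the demonstration.

```python
import time

# Measure elapsed wall-clock time as a difference of two readings.
start = time.perf_counter()
time.sleep(0.05)
elapsed = time.perf_counter() - start

# localtime() and mktime() are inverses of each other
# (for whole seconds, in the machine's local time zone).
stamp = 1581312492
local_tuple = time.localtime(stamp)
restored = time.mktime(local_tuple)

print(round(elapsed, 2), restored)
```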

The Python calendar module:
Weekday numbering: 0 is Monday (the first day), 6 is Sunday (the last day)
Note: the parameters w, l, and c can be omitted; the defaults work fine

calendar(year,w=2,l=1,c=6): returns the calendar for year as a multi-line string, three months per row, with months separated by c spaces. Each day column is w characters wide and each week takes l lines
    import calendar
    # calendar(year, width of each day column, lines per week, spacing between months)
    print(calendar.calendar(2020,2,1,6))
    # (prints the full calendar for 2020, January to December, three months per row;
    #  the output is omitted here)

firstweekday(): returns the current setting for the first day of the week; the default is 0, which is Monday
    import calendar
    print(calendar.firstweekday())
    # 0

isleap(year): returns True if year is a leap year, otherwise False
    import calendar
    print(calendar.isleap(2020))
    # True

leapdays(year1,year2): returns the number of leap years from year1 up to, but not including, year2
    import calendar
    print(calendar.leapdays(2001,2100))
    # 24

month(year,month,w = 2,l = 1): returns the calendar for the given month, with two header lines and one line per week.
Note: each day column is w characters wide, and l is the number of lines used for each week
    import calendar
    print(calendar.month(2020,3,2,1))
    #      March 2020
    # Mo Tu We Th Fr Sa Su
    #                    1
    #  2  3  4  5  6  7  8
    #  9 10 11 12 13 14 15
    # 16 17 18 19 20 21 22
    # 23 24 25 26 27 28 29
    # 30 31

monthcalendar(year,month): returns a list with one nested list per week; days outside the month are 0
    import calendar
    print(calendar.monthcalendar(2020,4))
    # [[0, 0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11, 12],
    # [13, 14, 15, 16, 17, 18, 19], [20, 21, 22, 23, 24, 25, 26],
    # [27, 28, 29, 30, 0, 0, 0]]

monthrange(year,month): returns (weekday of the first day of the month, number of days in the month)
    import calendar
    print(calendar.monthrange(2020,2))
    # (5, 29)

prcal(year,w = 2,l = 1,c = 6): prints the calendar for year
    import calendar
    calendar.prcal(2020)
    # print(calendar.prcal(2020,2,1,6))
    # (prints the same full-year calendar for 2020 as calendar.calendar above;
    #  the output is omitted here)

prmonth(year,month,w = 2,l = 1): prints the calendar for the given month
    import calendar
    calendar.prmonth(2020,12)
    #    December 2020
    # Mo Tu We Th Fr Sa Su
    #     1  2  3  4  5  6
    #  7  8  9 10 11 12 13
    # 14 15 16 17 18 19 20
    # 21 22 23 24 25 26 27
    # 28 29 30 31

setfirstweekday(weekday): sets the first day of the week; 0 is Monday, 6 is Sunday
    import calendar
    calendar.setfirstweekday(2)

timegm(time_tuple): takes a time tuple expressed in UTC (such as the result of time.gmtime) and returns the timestamp
    import time
    import calendar
    print(calendar.timegm(time.gmtime(time.time())))
    # prints the current timestamp as an integer

weekday(year,month,day): returns the day of the week for the given date
    import calendar
    print(calendar.weekday(2020,2,10))
    # 0 , Monday
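Because calendar.timegm interprets its argument as UTC, it is the inverse of time.gmtime rather than of time.localtime. The sketch below checks that round trip with a fixed timestamp taken from the examples above, and cross-checks leapdays against isleap. Results are compared with == rather than printed, since recent Python versions may display weekday values by name.

```python
import calendar
import time

# timegm() is the inverse of gmtime(): both work in UTC.
stamp = 1581312492
utc_tuple = time.gmtime(stamp)
restored = calendar.timegm(utc_tuple)

# monthrange() gives the weekday of the 1st and the number of days.
first_weekday, days_in_month = calendar.monthrange(2020, 2)

# leapdays(y1, y2) counts the leap years in range(y1, y2).
leap_years = [y for y in range(2001, 2100) if calendar.isleap(y)]

print(restored, days_in_month, len(leap_years))
```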

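Python functions: a block of code that implements some piece of functionality
Rules for defining a function:
  1. Use the def keyword followed by the function name and ( ); the parentheses may contain parameters.
     Anonymous functions are defined with the lambda keyword.
  2. All parameters must be placed inside the parentheses.
  3. The first statement of the function may be a string that documents the function (a docstring).
  4. The function body starts after a colon, and the code block inside the function is indented.
  5. return [expression] makes the function return a value; a function without return returns None.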

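    def function_name([parameters, as needed; omit them if the function needs none]):
        docstring (optional, but worth writing so the purpose is not forgotten)
        function statements
        ...
        return value_to_return (some functions only perform an action and have no return; they return None)

Define a function say_hello that prints hello:
    def say_hello():
        print("hello")

Define a function area that computes the area of a rectangle, takes height and width as parameters, and returns the result:
    def area(height,width):
        return height * width

Define a function that prints a welcome message followed by the received parameter name:
    def huanying(name):
        print("Welcome",name)

Define a function one_to_ten that returns the sum of 1 to 10:
    def one_to_ten():
        total = 0
        for i in range(1,11):
            total += i
        return total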

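A function name can be assigned to a variable and called through that variable (the variable then acts as another name for the function):
    def add(num_1,num_2):
        print(num_1 + num_2)
    a = add
    print(type(a))
    # <class 'function'>
    a(3,5)
    # 8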

A class can be defined inside a function:
    def run():
        class student(object):
            pass

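Calling functions:
After a function is defined, call it with function_name(arguments); if it returns a value, store the value in a variable.

Examples:

No parameters, no return value:
    # Define a function say_hello that prints hello
    def say_hello():
        print("hello")
    say_hello()
    # hello

With parameters, no return value:
    # Define a function that prints a welcome message followed by the received parameter name
    def huanying(name):
        print("Welcome",name)
    huanying("XiaoMing")
    # Welcome XiaoMing

No parameters, with a return value:
    # Define a function one_to_ten that returns the sum of 1 to 10
    def one_to_ten():
        total = 0
        for i in range(1,11):
            total += i
        return total
    sum_1 = one_to_ten()
    print(sum_1)
    # 55

With parameters, with a return value:
    # Define a function that computes the area of a rectangle from height and width and returns it
    def area(height,width):
        return height * width
    mianji = area(5,4)
    print(mianji)
    # 20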

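Passing arguments "by value" and "by reference":

Passing by value: a value (a number or a string) is placed in the argument position.
Note: strictly speaking, Python always passes a reference to the object. Numbers and strings are immutable, so nothing the function does can change the caller's variable, and the effect is the same as passing a copy.

Passing constants:
    # Define a function that prints a welcome message followed by the received parameter name
    def huanying(name):
        print("Welcome",name)
    huanying("XiaoMing")
    # Welcome XiaoMing
    # Define a function that computes the area of a rectangle from height and width and returns it
    def area(height,width):
        return height * width
    mianji = area(5,4)
    print(mianji)
    # 20

Passing variables:
    def huanying(name):
        print("Welcome",name)
    strs_name = "XiaoMing"
    huanying(strs_name)
    # Welcome XiaoMing
    def area(height,width):
        return height * width
    height = 5
    width = 4
    mianji = area(height,width)
    print(mianji)
    # 20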

Passing by reference:

Print the sum of all elements of a list, a tuple, and a set:
    lst = [1,2,3,4]
    tuple_1 = (4,5,7)
    se = {9,6,5,8}
    def add(args):
        '''Print the sum of the elements of args'''
        print(sum(args))
    add(lst)
    # 10
    add(tuple_1)
    # 16
    add(se)
    # 28
    # Print the function's docstring
    print(add.__doc__)
    # Print the sum of the elements of args

Use a function to print the odd numbers in the argument it receives:
    lst = [1,2,3,4]
    tuple_1 = (4,5,7)
    def jishu(args):
        '''Print the odd numbers in args'''
        for i in range(len(args)):
            if args[i] % 2 == 1:
                print(args[i], end = " ")
        print()
    jishu(lst)
    # 1 3
    jishu(tuple_1)
    # 5 7
    # Print the function's docstring
    print(jishu.__doc__)
    # Print the odd numbers in args

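If the parameter is modified inside the function:
  1. If the original argument is a mutable type (list, set, dictionary), in-place changes also affect the original.
  2. If the original argument is an immutable type (number, string, tuple), the original does not change.

Examples:

Calling a list method inside the function changes the list itself:
    def add_elem(args,string):
        '''Append string to args'''
        args.append(string)
        # args += string
        # args.extend(string)
        # args.insert(len(args),string)
        return args
    lst = [1,2,3,4]
    string = "ABC"
    print(add_elem(lst,string))
    # [1, 2, 3, 4, 'ABC']
    print(lst)
    # [1, 2, 3, 4, 'ABC']

When a list is passed as the argument and the parameter is modified by index, the values of the list change:
    def add_elem(args,string):
        '''Replace the last element of args with string'''
        args[len(args)-1] = string
        return args
    lst = [1,2,3,4]
    string = "ABC"
    print(add_elem(lst,string))
    # [1, 2, 3, 'ABC']
    print(lst)
    # [1, 2, 3, 'ABC']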
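What decides whether the caller sees a change is not only the type of the argument but also the operation: mutating the object in place is visible to the caller, while rebinding the parameter name to a new object is not. A minimal sketch (the function names mutate and rebind are invented for this illustration):

```python
def mutate(items):
    # In-place change: the caller's list is modified.
    items.append('ABC')

def rebind(items):
    # items + [...] builds a NEW list; the caller's list is untouched.
    items = items + ['ABC']
    return items

original = [1, 2, 3, 4]
mutate(original)

untouched = [1, 2, 3, 4]
new_list = rebind(untouched)

# += on a list extends it in place with each item of the right-hand side,
# so a string is added character by character.
chars = [1, 2]
chars += 'ABC'

print(original, untouched, new_list, chars)
```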

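Function parameters:
Note: a variable has no type of its own; the type belongs to the value in memory that the variable refers to.
Mutable types: list, set, dictionary
Immutable types: number, string, tuple
When a mutable object is changed inside a function, the original variable sees the change too.
For a list, index or slice assignment, +=, append, extend, and insert all change the list in place.

Positional parameters: arguments must be passed in the correct order, and their number must match the declaration
    # Required parameter
    def hello(name):
        '''Print a greeting'''
        print("hello {0}".format(name))
    name = "XiaoMing"
    hello(name)
    # hello XiaoMing
    # hello() raises an error because no argument is passed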

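Keyword arguments: when calling a function, keywords determine which parameter each value goes to (independently of position)
Note: the keyword is the parameter name used in the function definition

Passing by keyword name:
    def add(num_1,num_2):
        '''Add two numbers'''
        print("num_1:",num_1)
        print("num_2:",num_2)
        print("num_1 + num_2",num_1 + num_2)
    add(num_2 = 6,num_1 = 8)
    # num_1: 8
    # num_2: 6
    # num_1 + num_2 14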

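Default parameters: if no argument is passed for a parameter when the function is called, its default value is used; if an argument is passed, the default is ignored.
Note: programs often define sensible defaults so that a call only needs to pass the parameters that have no default
    # Default parameter
    def add(num_1,num_2 = 10):
        '''Add two numbers'''
        print("num_1:",num_1)
        print("num_2:",num_2)
        print("num_1 + num_2",num_1 + num_2)
    # add(15)
    # # num_1: 15
    # # num_2: 10
    # # num_1 + num_2 25
    # num_2 is not passed, so its default value is used
    add(num_1 = 15)
    # num_1: 15
    # num_2: 10
    # num_1 + num_2 25
    # num_2 is passed, so its default value is not used
    add(num_2 = 6,num_1 = 8)
    # num_1: 8
    # num_2: 6
    # num_1 + num_2 14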

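Variable-length parameters: when the number of arguments is not known in advance but they are still needed for the computation inside the function, consider variable-length parameters.
  1. A parameter written as *name collects the extra positional arguments (which may include dictionaries, lists, tuples, and so on) into a tuple.
     If the call uses *list or *tuple, the function receives all the elements of that list or tuple as separate arguments.
  2. A parameter written as **name collects keyword arguments, usually passed as **dict_variable or as key = value pairs.

Example:
    # Variable-length parameter
    def print_info(*vartuple):
        print(vartuple)
    # Call print_info
    print_info(70, 60, 50)
    # (70, 60, 50)

When positional parameters come before the variable-length parameter, the arguments are assigned to the positional parameters first and the rest go to the variable-length parameter.
Note: if no arguments are left over, the variable-length parameter is an empty tuple.

Without positional parameters: *name accepts values of any data type, but not key = value pairs such as a = 2 or **dict_variable.
Note: if the call uses *list or *tuple, the function receives all the elements of the list or tuple
    # Variable-length parameter, no positional parameters
    def print_info(*vartuple):
        # print(type(vartuple))
        print(vartuple)
        # for i in vartuple:
        #     print(i,end =" ")
        # print(type(vartuple[5]))
        # print(vartuple[5])
    # Call print_info
    print_info(70,12.3,5+9j,True,"hello",[1,2,3],(7,8,9),{'a':123})
    # (70, 12.3, (5+9j), True, 'hello', [1, 2, 3], (7, 8, 9), {'a': 123})
    print_info([1,2,3])
    # ([1, 2, 3],)
    print_info(*[1,2,3],'a')
    # (1, 2, 3, 'a')
    print_info((1,2,3))
    # ((1, 2, 3),)
    print_info(*(1,2,3),'a')
    # (1, 2, 3, 'a')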

With a positional parameter:
    def print_info(num,*vartuple):
        print(num)
        print(vartuple)
    # Call print_info
    print_info(70, 60, 50)
    # 70
    # (60, 50)

**name:
A parameter written as **name accepts arguments of the form a = 2 or **dict_object
    def print_info(**attrs):
        print(attrs)
        print(type(attrs))
        # <class 'dict'>
    dic = {'a':123}
    print_info(**dic,b = 4,c = 6)
    # {'a': 123, 'b': 4, 'c': 6}

Receiving both * and ** parameters:
    def print_info(num,*vartuple,**attrs):
        print(num)
        print(vartuple)
        print(attrs)
    # Call print_info
    print_info(70, 60, 50,{'a':123},b = 456,c = 789)
    # 70
    # (60, 50, {'a': 123})
    # {'b': 456, 'c': 789}
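The * and ** syntax works in both directions: in a definition it collects arguments, and in a call it unpacks a sequence or dictionary into separate arguments. The sketch below returns the collected values instead of printing them so the result is easy to inspect; the function name describe is made up for the example.

```python
def describe(first, *rest, **options):
    # first   -> the first positional argument
    # rest    -> tuple of the remaining positional arguments
    # options -> dict of the keyword arguments
    return first, rest, options

# Collecting: extra positionals go to rest, keywords go to options.
collected = describe(70, 60, 50, {'a': 123}, b=456, c=789)

# Unpacking: * spreads a list, ** spreads a dictionary.
unpacked = describe(*[1, 2, 3], **{'x': 9})

# With nothing left over, rest is an empty tuple and options an empty dict.
minimal = describe('only')

print(collected, unpacked, minimal)
```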

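Anonymous functions:
Use the lambda keyword to create an anonymous function.
The body of a lambda is a single expression, not a block of statements.
A lambda has its own local namespace for its parameters. Like a function defined with def, it can still read variables from enclosing and global scopes, but it cannot contain assignment statements.
    # Use lambda to compute x to the power of y
    # Store the function in a variable
    cifang = lambda x,y:x**y
    # Calling an anonymous function: call it through the variable that holds it
    print(cifang(3,2))
    # 9
    # Extension: a variable can also hold a named function, which is then called through the variable
    def print_info():
        print("hello")
    pr = print_info
    pr()
    # hello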
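A lambda follows the same name lookup rules as a function defined with def, so it can read global variables and variables of an enclosing function. The following sketch shows both; the names factor, triple, and make_power are invented for illustration.

```python
factor = 3

# The lambda reads the global name factor when it is called.
triple = lambda x: x * factor

def make_power(exponent):
    # The returned lambda remembers exponent from the enclosing scope (a closure).
    return lambda base: base ** exponent

square = make_power(2)
cube = make_power(3)

print(triple(4), square(3), cube(2))
```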

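The return statement:
When a return statement is reached inside a function, the function exits and returns the object after return, which may be an expression or a value.
A function without a return statement returns None.
    def add_num(num_1, num_2):
        # Return the sum of num_1 and num_2
        total = num_1 + num_2
        print(total)
        return total
    # Call add_num
    # Store the return value in a variable
    total = add_num(10, 20)
    print(total)
    # 30
    # 30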

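Variable scope:
When a program runs, Python looks for a variable in each scope in turn until it is found; if it is not found, an error (name not defined) is raised.
A variable cannot be accessed from just anywhere; where it is visible depends on its scope.
Local scope: for example, inside a function
Global scope: anything in a .py file that is not inside a function is a global variable
Built-in scope:
    import builtins
    print(dir(builtins))
In Python only modules, classes, and functions (def, lambda) introduce a new scope.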

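Global and local variables:
A variable defined inside a function is local; accessing it from outside raises a "not defined" error.
A variable defined outside any function is global and can be accessed anywhere in the file, including inside functions.
When a function has a local variable with the same name as a global variable, the local variable takes precedence inside the function.
The scope of a local variable is limited to the function that defines it.
    # Define the global variable total
    total = 0
    def add(num1,num2):
        # Define the local variable total
        total = num1 + num2
        # Print the local variable
        print(total)
    add(4,6)
    # 10
    print(total)
    # 0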

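To modify an outer (global) variable from inside a function, use the global keyword:
    global variable_name
This lets the function use the global variable and modify it.
Note: merely assigning to a name that matches a global variable inside a function only creates a local variable
    # Define the global variable total
    total = 0
    def add(num1,num2):
        # Declare total as global with the global keyword
        global total
        total = num1 + num2
        # Print the global variable
        print(total)
    add(4,6)
    # 10
    # Print the global variable
    print(total)
    # 10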

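nonlocal: variables of an enclosing scope (nested functions, between the outer and the inner function)
nonlocal lets the inner function modify a variable of the outer function:
    def func_out():
        num = 5
        def func_inner():
            # Use the num of the enclosing function
            nonlocal num
            num = 10
            print("num inside the inner function:",num)
        func_inner()
        print("num inside the outer function:",num)
    func_out()
    # num inside the inner function: 10
    # num inside the outer function: 10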
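A common use of nonlocal is a closure that keeps state between calls. The sketch below builds a simple counter; make_counter and increment are hypothetical names chosen for this example.

```python
def make_counter():
    count = 0                 # lives in the enclosing function's scope

    def increment():
        nonlocal count        # rebind the outer count instead of creating a local
        count += 1
        return count

    return increment

counter = make_counter()
first = counter()
second = counter()

# Each call to make_counter() creates an independent count.
other = make_counter()
other_first = other()

print(first, second, other_first)
```

Without the nonlocal line, count += 1 would raise UnboundLocalError, because the assignment would make count a local variable of increment.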

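Python modules: a file with the .py extension that contains function and variable definitions.
Put methods in a file, and import the file when a script or an interactive session needs them.
The imported file is called a module. After importing it, its functions and other features can be used
    import math
    # Import the math library
    print(math.exp(1) == math.e)
    # Uses exp() and e from math
    # True

The import statement:
import module_or_package: call a method with module_name.method
When the interpreter meets an import statement, the module is imported if it is on the search path.
Note: the search path is a list of all the directories the interpreter searches.

Create a function in one .py file and import it in another .py file:

func.py
    # Define a function print_info in the func module
    def print_info():
        print("I am inside the func module")

test.py
    # Import the func module
    import func
    # Call the function of the custom module
    func.print_info()
    # I am inside the func module

A module is imported only once: no matter how many import statements are executed, its code runs only the first time.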

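The from module_name import statement:
    from module_name import submodule, function, class, or variable
This does not bind the module name itself; only the objects named after import become available.
Note: an imported function is called directly by its name, not as module_name.function_name

func.py
    # Define the functions print_info and get_info in the func module
    def print_info():
        print("I am inside the func module")
    def get_info():
        print("Got the information of the func module")

test.py
    # Import get_info from the func module
    from func import get_info
    # Call the function of the custom module
    get_info()
    # Got the information of the func module

Note: print_info was not imported, so using it raises an error.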

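from module_name import *: functions and variables are then used directly by name.
This imports all public names of the module into the current module, but it does not also import the contents of the module's submodules.
The import follows this rule: if the package's __init__.py defines a list variable named __all__, then from package import * imports exactly the names in that list.
    from func import *
    # Call the functions of the custom module
    print_info()
    # I am inside the func module
    get_info()
    # Got the information of the func module
Note: names that start with a single underscore _ are not imported.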
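The effect of __all__ on a star import can be tried without creating files by hand. The sketch below writes a small throwaway module to a temporary directory and adds that directory to sys.path; the module name demo_func and its contents are hypothetical and exist only for this demonstration.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module into a temporary directory.
module_dir = tempfile.mkdtemp()
source = (
    "__all__ = ['get_info']\n"
    "def print_info():\n"
    "    return 'inside demo_func'\n"
    "def get_info():\n"
    "    return 'info from demo_func'\n"
)
with open(os.path.join(module_dir, "demo_func.py"), "w", encoding="utf-8") as f:
    f.write(source)

# Make the directory importable.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

# Run the star import in a separate namespace to see what it brings in.
namespace = {}
exec("from demo_func import *", namespace)
star_names = sorted(name for name in namespace if not name.startswith("__"))

# A plain import still gives access to everything through the module name.
import demo_func
explicit = demo_func.print_info()

print(star_names, explicit)
```

Only get_info arrives through the star import because it is the only name listed in __all__, while print_info stays reachable as demo_func.print_info.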

This continues the summary from the previous post. Note: most of what follows can also be found in my earlier posts.

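Locating modules:
When a module is imported, Python looks for it along the search path:
  1. The directory of the running program
  2. The list of paths given in the environment variable PYTHONPATH
  3. The Python installation path
The search path is a list, so all list methods work on it.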

The path attribute of the sys library shows the search path:
    import sys
    # Print the search path as a list; the list can be modified
    print(sys.path)

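Adding a new directory to the search path (at the end with append, or at the front with insert):
    sys.path.append("path of the new directory")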

    sys.path.insert(0,"path of the new directory")

Setting the environment variable (Windows command prompt):
    set PYTHONPATH=installation_path\lib;
Every time Python starts, it loads the paths in PYTHONPATH into sys.path.

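Namespaces and scope:
A variable is a name bound to an object; a namespace maps variable names (keys) to the objects they refer to (values).
A Python expression can access both the local namespace and the global namespace.
Note: when a local variable and a global variable have the same name, the local variable is used.
Every function and class has its own namespace, called a local namespace.
To assign to a global variable inside a function, declare it with the global keyword; after the declaration, Python treats that name as the global variable
    # global variable_name:
    # lets the function use the global variable and modify it.
    # Note: merely assigning to a name that matches a global variable only creates a local variable
    # Define the global variable total
    total = 0
    def add(num1,num2):
        # Declare total as global with the global keyword
        global total
        total = num1 + num2
        # Print the global variable
        print(total)
    add(4,6)
    # 10
    # Print the global variable
    print(total)
    # 10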

dir(module_name or module_name.attribute): lists all the names (functions, classes, variables) available on the module or object
    # Import the math library
    import math
    # List the names available in math
    print(dir(math))
    '''
    ['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'pi', 'pow', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']
    '''
    import urllib.request
    print(dir(urllib.request))
    '''
    ['AbstractBasicAuthHandler', 'AbstractDigestAuthHandler', 'AbstractHTTPHandler', 'BaseHandler', 'CacheFTPHandler', 'ContentTooShortError', 'DataHandler', 'FTPHandler', 'FancyURLopener', 'FileHandler', 'HTTPBasicAuthHandler', 'HTTPCookieProcessor', 'HTTPDefaultErrorHandler', 'HTTPDigestAuthHandler', 'HTTPError', 'HTTPErrorProcessor', 'HTTPHandler', 'HTTPPasswordMgr', 'HTTPPasswordMgrWithDefaultRealm', 'HTTPPasswordMgrWithPriorAuth', 'HTTPRedirectHandler', 'HTTPSHandler', 'MAXFTPCACHE', 'OpenerDirector', 'ProxyBasicAuthHandler', 'ProxyDigestAuthHandler', 'ProxyHandler', 'Request', 'URLError', 'URLopener', 'UnknownHandler', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__version__', '_cut_port_re', '_ftperrors', '_have_ssl', '_localhost', '_noheaders', '_opener', '_parse_proxy', '_proxy_bypass_macosx_sysconf', '_randombytes', '_safe_gethostbyname', '_splitattr', '_splithost', '_splitpasswd', '_splitport', '_splitquery', '_splittag', '_splittype', '_splituser', '_splitvalue', '_thishost', '_to_bytes', '_url_tempfiles', 'addclosehook', 'addinfourl', 'base64', 'bisect', 'build_opener', 'contextlib', 'email', 'ftpcache', 'ftperrors', 'ftpwrapper', 'getproxies', 'getproxies_environment', 'getproxies_registry', 'hashlib', 'http', 'install_opener', 'io', 'localhost', 'noheaders', 'os', 'parse_http_list', 'parse_keqv_list', 'pathname2url', 'posixpath', 'proxy_bypass', 'proxy_bypass_environment', 'proxy_bypass_registry', 'quote', 're', 'request_host', 'socket', 'ssl', 'string', 'sys', 'tempfile', 'thishost', 'time', 'unquote', 'unquote_to_bytes', 'unwrap', 'url2pathname', 'urlcleanup', 'urljoin', 'urlopen', 'urlparse', 'urlretrieve', 'urlsplit', 'urlunparse', 'warnings']
    '''

The globals, locals, and reload functions:

globals(): returns a dictionary of all accessible global names
    num = 5
    sum = 0
    def add(num):
        func_sum = 0
        func_sum += num
        return func_sum
    print(globals())
    '''
    {'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <_frozen_importlib_external.SourceFileLoader object at 0x000001F5F98CC2E0>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, '__file__': 'G:/code/time/1.py', '__cached__': None, 'num': 5, 'sum': 0, 'add': <function add at 0x000001F5F97E51F0>}
    '''
Note: my code was still in yesterday's time module folder, which is why the path shows G:/code/time/1.py

locals(): called inside a function, returns the parameters and local variables of that function.
num = 5
sum = 0
def add(num):
    func_sum = 0
    func_sum += num
    print(locals())
    return func_sum
add(num)
# {'num': 5, 'func_sum': 5}

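reload(module): reload lives in the importlib module and re-imports a module that has already been imported.
Note: when a module is imported into another script, the module's top-level code is executed only once.
# re-import a module
import func  # import a user-defined module
from importlib import reload  # reload lives in the importlib module
reload(func)  # re-import the func module
from func import get_info
get_info()  # prints the information from the func module
reload only works on a module that has already been loaded with import or from; otherwise it fails. Its argument must be the module object itself, so the module name has to be bound with import first.
For a module loaded with import and used as module.name, reload re-executes the module file and the names in the existing module object are replaced by the new definitions.
from ... import is essentially an assignment: in the file that runs the from statement it performs attr = module.attr.
Note: reload has no effect on names bound by a from statement that ran before the reload; run the from statement again to pick up the new definitions.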

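Python packages:
A package is a way of structuring Python's module namespace using "dotted module names".
Example: A.B denotes the submodule B of package A.
When different modules define the same variable name, one can be referred to as module.name and the other as the plain name.
When a package is imported, Python searches for it in the directories listed in the path variable of the sys module.
A directory is treated as a (regular) package only if it contains an __init__.py file.
Common ways to import:
　　import module or package: call things as module.name
　　from module import name (a submodule, function, class or variable): call the name directly
　　from module import *: call the names directly
　　Note: if __init__.py defines an __all__ variable, import * imports exactly the names listed in __all__; remember to update __all__ whenever the package changes.
　　__all__ = ["echo", "surround", "reverse"]
　　With import *, the names are taken from __all__.
Packages also provide __path__, a list of directories. It is initialised, before the code in __init__.py runs, to contain the directory that holds the package's __init__.py.
__path__ is mainly used to extend the set of modules found in a package.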

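float_num = 2.635
print("The output is %f" % (float_num))
# The output is 2.635000
# keep two decimal places
print("The output is %.2f" % (float_num))
# The output is 2.63
# (2.635 is stored in binary as a value slightly below 2.635, so it rounds down to 2.63)
float_num = 5.99
print("The number is {}".format(float_num))
# The number is 5.99
# named fields
float_num = 5.99
strs = 'happy'
print("The number is {num}, the string is {str}".format(num = float_num, str = strs))
# The number is 5.99, the string is happy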
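The same formatting can also be written with f-strings (Python 3.6+). This is a small sketch rather than part of the original notes; the variable names `two_places` and `line` are made up for the illustration.

```python
float_num = 2.635
name = "happy"

# The format spec after the colon works like the one used with % and format()
two_places = f"{float_num:.2f}"

# Expressions inside the braces are evaluated in place
line = f"number={float_num}, text={name}"

print(two_places)
print(line)
```

An f-string reads its values from the surrounding scope, so it needs no argument list after the string.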

input(prompt): the string argument is normally a prompt shown to the user.
Input statement.
Note: the result is normally stored in a variable. input always returns a string; if the user types a number, convert it with int (or float).
strs = input("Enter a string: ")
print(type(strs))
print("The string entered is:", strs)
# Enter a string: HELLO
# <class 'str'>
# The string entered is: HELLO
num = int(input("Enter a number: "))
print(type(num))
print("The number entered is:", num)
# Enter a number: 56
# <class 'int'>
# The number entered is: 56

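Note: if the result of input is converted with eval, typing [1,2,3] produces the list [1, 2, 3]. eval executes whatever expression the user types, so it should only be used on trusted input.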
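A safer way to turn typed text into a number or a list is `ast.literal_eval`, which accepts only Python literals. The sketch below is an illustration; the helper name `parse_literal` is an assumption, not something from the original notes.

```python
import ast

def parse_literal(text):
    """Parse text as a Python literal; return None when it is not a literal."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None

print(parse_literal("56"))         # an int
print(parse_literal("[1,2,3]"))    # a list
# A function call is not a literal, so it is rejected instead of executed
print(parse_literal("__import__('os').getcwd()"))
```

In a real program the argument would be the string returned by `input(...)`.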

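Opening and closing files in Python:
open(file name, mode[, buffering]):
　　file name: a string
　　　　Note: the file name includes its extension
# open the existing file test.txt
f = open("test.txt", 'r')
# print all lines of the file
print(f.readlines())
# ['hello,world.\n']
# close the file
f.close()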

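File modes:
r    Open for reading only. The file pointer is placed at the start of the file. This is the default mode.
rb   Open in binary format for reading only. The file pointer is placed at the start of the file.
r+   Open for reading and writing. The file pointer is placed at the start of the file.
rb+  Open in binary format for reading and writing. The file pointer is placed at the start of the file.
w    Open for writing only. An existing file is truncated (its old content is discarded); a missing file is created.
wb   Open in binary format for writing only. An existing file is truncated; a missing file is created.
w+   Open for reading and writing. An existing file is truncated; a missing file is created.
wb+  Open in binary format for reading and writing. An existing file is truncated; a missing file is created.
a    Open for appending. If the file exists, the file pointer is placed at the end, so new content is written after the existing content. A missing file is created.
ab   Open in binary format for appending. If the file exists, the file pointer is placed at the end, so new content is written after the existing content. A missing file is created.
a+   Open for reading and appending. If the file exists, the file pointer is placed at the end and writes are appended. A missing file is created.
ab+  Open in binary format for reading and appending. If the file exists, the file pointer is placed at the end. A missing file is created.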

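Buffering:
　　　　a negative integer (the default, -1): use the system default buffering policy
　　　　0: no buffering (only allowed in binary mode)
　　　　1: line buffering (only meaningful in text mode)
　　　　an integer greater than 1: the size of the buffer in bytes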
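The difference between the w, a and r modes can be checked with a throwaway file. This is a minimal sketch that assumes a temporary directory is acceptable; the names `after_append` and `after_rewrite` are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    # 'w' creates the file (or truncates an existing one)
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello,world.\n")

    # 'a' keeps the old content and writes at the end
    with open(path, "a", encoding="utf-8") as f:
        f.write("appended\n")

    with open(path, "r", encoding="utf-8") as f:
        after_append = f.read()

    # opening with 'w' again discards everything written so far
    with open(path, "w", encoding="utf-8") as f:
        f.write("hey boy")

    with open(path, "r", encoding="utf-8") as f:
        after_rewrite = f.read()

print(repr(after_append))
print(repr(after_rewrite))
```

The `with` statement closes each file automatically, so no explicit `close()` call is needed.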

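Python read and write methods:
read(): reads data from a file.
Note: in Python 3 a file opened in text mode returns str, and a file opened in binary mode (for example 'rb') returns bytes.
Syntax:
file_object.read([count])
count: the number of characters (text mode) or bytes (binary mode) to read
Note: without count, read reads as much as possible, normally up to the end of the file.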

# using count
# open the existing file test.txt
f = open("test.txt", 'r')
# print the first 11 characters of the file
print(f.read(11))
# close the file
f.close()
# hello,world

File position:
tell(): returns the current position within the file.
f = open("test.txt", 'r')
# read the first 11 characters of the file
f.read(11)
print(f.tell())
# 11

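seek(offset[, from]): changes the current position within the file.
　　offset: the number of bytes to move
　　from: the reference position the offset is measured from
　　　　0: start of the file
　　　　1: current position
　　　　2: end of the file
Note: for files opened in text mode, Python 3 only allows non-zero offsets relative to the start of the file; seeking relative to the current position or the end with a non-zero offset requires binary mode.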

# open the existing file test.txt
f = open("test.txt", 'r')
# print the first 11 characters of the file
print(f.read(11))
# hello,world
# return the current position within the file
print(f.tell())
# 11
print(f.seek(0, 0))
# 0
print(f.tell())
# 0
print(f.read(11))
# hello,world
# close the file
f.close()
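Seeking relative to the current position or the end works on binary streams. The sketch below uses `io.BytesIO` as a stand-in for a file opened with mode 'rb' (an assumption made so the example needs no file on disk); the variable names are illustrative.

```python
import io

# BytesIO behaves like a binary file containing these 13 bytes
buf = io.BytesIO(b"hello,world.\n")

# from = 2: 7 bytes back from the end of the data
pos_from_end = buf.seek(-7, 2)
word = buf.read(5)

# from = 1: 11 bytes back from the current position
pos_from_current = buf.seek(-11, 1)
first = buf.read(5)

print(pos_from_end, word)
print(pos_from_current, first)
```

`seek` returns the new absolute position, which matches what `tell()` would report immediately afterwards.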

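write(): writes a string to a file.
Note: in Python 3 a text-mode file accepts str and a binary-mode file accepts bytes. The newline character ('\n') is not added automatically.
Syntax:
file_object.write(string)
Program:
# the write method
# open test.txt for writing; mode 'w' discards the existing content
f = open("test.txt", 'w')
# write the new content from the start of the (now empty) file
f.write('hey boy')
# close the file
f.close()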

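Attributes of Python file objects:
After a file is opened, the object that receives the result of open is a file object.
file.closed  True if the file has been closed, otherwise False
file.mode    the access mode the file was opened with
file.name    the name of the file
file = open("test.txt", 'r')
# file.name returns the name of the file
print(file.name)
# test.txt
# file.closed is False while the file is open
print(file.closed)
# False
# file.mode returns the access mode
print(file.mode)
# r
# file.closed is True once the file is closed
file.close()
print(file.closed)
# True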

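Functions in os:
mkdir(directory name): creates a new directory under the current directory.
Program:
import os
# create a new directory
os.mkdir('新目錄-test')
getcwd(): returns the current working directory.
Program:
import os
print(os.getcwd())
# G:\code\time
chdir(directory name): changes the current working directory to the given directory (it does not rename anything).
Program:
import os
print(os.getcwd())
# D:\看法\Python\Python代碼\vacation\備課\新目錄-test
os.chdir('新目錄-test2')
print(os.getcwd())
# D:\看法\Python\Python代碼\vacation\備課\新目錄-test\新目錄-test2
rmdir(directory name): removes a directory.
Note: the directory must be empty.
Program:
import os
os.rmdir('新目錄-test2')
After the call the directory no longer exists.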

rename(current file name, new file name): renames the file.
Program:
# os.rename('old name', 'new name')
import os
os.rename('舊名字.txt', '新名字.txt')
remove(file name): deletes a file.
Program:
import os
os.remove('名稱.txt')
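The os functions above can be tried without touching real files by working inside a temporary directory. A minimal sketch; the file and directory names are placeholders chosen for the illustration.

```python
import os
import tempfile

start_dir = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)                      # change the working directory
    try:
        os.mkdir("new-dir-test")           # create a directory
        with open("old_name.txt", "w", encoding="utf-8") as f:
            f.write("demo")
        os.rename("old_name.txt", "new_name.txt")
        after_rename = sorted(os.listdir("."))

        os.remove("new_name.txt")          # delete the file
        os.rmdir("new-dir-test")           # works because the directory is empty
        after_cleanup = os.listdir(".")
    finally:
        os.chdir(start_dir)                # always go back before cleanup

print(after_rename)
print(after_cleanup)
```

Returning to the original directory in `finally` matters because some platforms refuse to delete a directory that is still the current working directory.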

'''
Write a simple student management system with the following requirements:
(1) use user-defined functions to modularise the program
(2) each student record holds at least: name, sex and phone number
(3) features: add, delete, modify, show, exit
Design:
(1) prompt the user to choose a feature number
(2) read the feature number the user chose
(3) call the matching function to carry out the feature
'''
stu_lst = []  # container of student records: [name, sex, phone, serial number]

def show_gn():
    '''Show the features of the student management system'''
    print("==========================")
    print("Student management system v1.0")
    print("1. Add a student (enter 1 first)")
    print("2. Delete a student")
    print("3. Modify a student")
    print("4. Show all students")
    print("0. Exit")
    print("==========================")

def tj_gn(num):
    '''Add feature'''
    name = input("Enter the new student's name: ")
    sex = input("Enter the new student's sex: ")
    phone = input("Enter the new student's phone number: ")
    # the fourth field is the student's default serial number (starting from 0)
    stu_lst.append([name, sex, phone, num])

def sc_gn():
    '''Delete feature'''
    stu_xlh = int(input("Enter the serial number of the student to delete: "))
    xs_gn_returni = xs_gn(stu_xlh)
    if xs_gn_returni is None:
        print("No student has that serial number")
        return
    # pd_str decides whether the record is deleted
    pd_str = input("Really delete? Type yes or no in lower case: ")
    if pd_str == 'yes':
        del stu_lst[xs_gn_returni]
        print("Deleted")
    else:
        print("Nothing deleted")

def xg_gn():
    '''Modify feature'''
    stu_xlh = int(input("Enter the serial number of the student to modify: "))
    # xs_gn_returni is the list index of the student, or None if there is no such serial number
    xs_gn_returni = xs_gn(stu_xlh)
    if xs_gn_returni is None:
        print("No student has that serial number")
        return
    xg_str = input("Which field should be changed? Type name, sex or sjh (phone number): ")
    if xg_str == 'name':
        stu_lst[xs_gn_returni][0] = input("Enter the new name: ")
    elif xg_str == 'sex':
        stu_lst[xs_gn_returni][1] = input("Enter the new sex: ")
    elif xg_str == 'sjh':
        stu_lst[xs_gn_returni][2] = input("Enter the new phone number: ")
    else:
        print("Invalid input")

def xs_gn(stu_xlh = -1):
    '''Show feature; given a serial number, return the index of that student or None'''
    print("Records are listed as name, sex, phone number, serial number")
    if stu_xlh == -1:
        for i in stu_lst:
            print(i)
        return None
    for i in range(len(stu_lst)):
        if stu_lst[i][3] == stu_xlh:
            print("Student record:")
            print(stu_lst[i])
            return i
    return None

show_gn()
num = 0
while True:
    # gn_num is the number of the chosen feature
    gn_num = int(input("Enter the number of the feature: "))
    if gn_num == 1:
        tj_gn(num)
        num += 1
    elif gn_num == 2:
        sc_gn()
    elif gn_num == 3:
        xg_gn()
    elif gn_num == 4:
        xs_gn()
    elif gn_num == 0:
        print("Exiting")
        break
    else:
        print("Please run the program again; the number is not between 0 and 4")
        break

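a = 10
b = 8
print("a>b") if a>b else pass
Why the pass raises an error:
A conditional expression has the form X if condition else Y, and both X and Y must be expressions.
Case 1 (a>b is true): the value is print("a>b").
Case 2 (a>b is false): the value would be pass.
pass is a statement, not an expression, so it cannot produce a value and the whole line is rejected with a SyntaxError.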
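Two ways of writing this without the error, as a short sketch (the variable name `message` is made up for the illustration): use an ordinary if statement when there is nothing to do in the else case, or give the conditional expression a real value on both sides.

```python
a = 10
b = 8

# Option 1: a plain if statement needs no else branch at all
if a > b:
    print("a>b")

# Option 2: a conditional expression with an expression on both sides
message = "a>b" if a > b else "a<=b"
print(message)
```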

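Hierarchical (agglomerative) clustering in Python
'''
1. Treat every sample as a cluster of its own
2. Define the formula for the distance between clusters
3. Merge the pair of clusters with the smallest distance into a new cluster
4. Recompute the distances between clusters and repeat the steps above
5. Stop when the original elements have been grouped into the requested number of clusters
Key points of the program:
1. Generate test data
   sklearn.datasets.make_blobs
2. Agglomerative clustering algorithm
   sklearn.cluster.AgglomerativeClustering
3. This condition must hold or an error is raised (parameter of the user-defined function)
   assert 1 <= n_clusters <= 4
4. Colours: red, green, blue, yellow
   r g b y
5. o * v + are marker shapes for the scatter plot
6. A condition may be used inside [] to select the array rows that satisfy it
   data[predictResult == i]
7. Access the x and y coordinates
   subData[:,0] subData[:,1]
8. plt.scatter(x, y, c, marker, s=40)
   colors = "rgby"
   markers = "o*v+"
   c is the colour: c=colors[i]
   marker is the shape: marker=markers[i]
9. Generate random data and return the sample points and their labels
   data, labels = make_blobs(n_samples=200, centers=4)
   make_blobs is sklearn.datasets.make_blobs
   n_samples is the number of samples wanted
   centers is the number of labels
'''
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering

def AgglomerativeTest(n_clusters):
    assert 1 <= n_clusters <= 4
    # ward linkage always uses Euclidean distance; older scikit-learn versions
    # also accepted affinity='euclidean' here, a parameter later renamed to metric
    predictResult = AgglomerativeClustering(
        n_clusters=n_clusters,
        linkage='ward'
    ).fit_predict(data)
    # colours and marker symbols used for the scatter plot
    colors = "rgby"
    markers = "o*v+"
    # draw each cluster with its own colour and symbol
    for i in range(n_clusters):
        subData = data[predictResult == i]
        plt.scatter(
            subData[:,0],
            subData[:,1],
            c = colors[i],
            marker = markers[i],
            s = 40
        )
    plt.show()

# generate random data: 200 points, 4 labels; returns samples and labels
data, labels = make_blobs(n_samples=200, centers=4)
print(data)
AgglomerativeTest(2)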

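KNN: basic principle and sklearn implementation
'''
KNN (k-nearest neighbours) is a supervised learning algorithm
used for classification and regression
Idea:
1. Find the k most similar (nearest) samples in the sample space
2. Classify the unknown sample according to these k samples
Steps:
1. Preprocess the data
   extract feature vectors, giving a new representation of the raw data
2. Choose the distance formula
   compute the distance between the unknown sample and every known sample
3. Sort all distances in ascending order
4. Take the k samples closest to the unknown sample
5. Count how often each class occurs among these k samples
6. Use the most frequent class as the prediction; the unknown sample is assigned to it
Key points of the program:
1. Class needed to build the model
   sklearn.neighbors.KNeighborsClassifier
2. Create the model with k = 3
   knn = KNeighborsClassifier(n_neighbors = 3)
   a different n_neighbors gives a different model
3. Train (fit) the model
   knn.fit(x, y)
   x is a two-dimensional list of data
   x = [[1,5],[2,4],[2.2,5],
        [4.1,5],[5,1],[5,2],[5,3],[6,2],
        [7.5,4.5],[8.5,4],[7.9,5.1],[8.2,5]]
   y is a one-dimensional list of classes, splitting the data into classes 0, 1 and 2
   y = [0,0,0,
        1,1,1,1,1,
        2,2,2,2]
4. Predict the class of unknown data
   knn.predict([[4.8,5.1]])
5. Probability of belonging to each class
   knn.predict_proba([[4.8,5.1]])
'''
from sklearn.neighbors import KNeighborsClassifier
# training data
x = [[1,5],[2,4],[2.2,5],
     [4.1,5],[5,1],[5,2],[5,3],[6,2],
     [7.5,4.5],[8.5,4],[7.9,5.1],[8.2,5]]
# classes of x: the first three samples are class 0, the next five class 1, the last four class 2
y = [0,0,0,
     1,1,1,1,1,
     2,2,2,2]
knn = KNeighborsClassifier(n_neighbors=3)
# create the model with k = 3
knn.fit(x, y)
# train the model
'''
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=None, n_neighbors=3, p=2,
weights='uniform')
'''
knn.predict([[4.8,5.1]])
# array([1])  the class predicted for the point (4.8, 5.1)
knn = KNeighborsClassifier(n_neighbors=9)
# set k = 9
knn.fit(x, y)
'''
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=None, n_neighbors=9, p=2,
weights='uniform')
'''
knn.predict([[4.8,5.1]])
# array([1])
knn.predict_proba([[4.8,5.1]])
# probability of belonging to each class
# array([[0.22222222, 0.44444444, 0.33333333]])
'''
Summary:
knn = KNeighborsClassifier(n_neighbors=3)
creates the model with KNeighborsClassifier; n_neighbors is k
knn.fit() trains the model
the first argument is a two-dimensional list, the second a one-dimensional list
predict_proba([[num1,num2]])
gives the probability that the point (num1, num2) belongs to each class
'''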
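The six steps listed above can be followed by hand without sklearn. The sketch below is a plain-Python illustration of the idea, not how KNeighborsClassifier is implemented; the function names `knn_votes` and `knn_predict` are made up for the example, and Euclidean distance is assumed.

```python
import math
from collections import Counter

def knn_votes(samples, labels, query, k):
    """Count the classes of the k samples closest to query."""
    distances = []
    for sample, label in zip(samples, labels):
        # step 2: Euclidean distance between a known sample and the query
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(sample, query)))
        distances.append((d, label))
    distances.sort()                       # step 3: ascending order
    nearest = distances[:k]                # step 4: keep the k closest
    return Counter(label for _, label in nearest)   # step 5: count classes

def knn_predict(samples, labels, query, k):
    """Step 6: the most frequent class among the k neighbours."""
    return knn_votes(samples, labels, query, k).most_common(1)[0][0]

x = [[1, 5], [2, 4], [2.2, 5],
     [4.1, 5], [5, 1], [5, 2], [5, 3], [6, 2],
     [7.5, 4.5], [8.5, 4], [7.9, 5.1], [8.2, 5]]
y = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2]

label_k3 = knn_predict(x, y, [4.8, 5.1], 3)
votes_k9 = knn_votes(x, y, [4.8, 5.1], 9)
print(label_k3)
print(votes_k9)
```

Dividing each count in `votes_k9` by 9 gives the same proportions as the predict_proba output shown above (2/9, 4/9, 3/9).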

'''
Key points of the program: import numpy as np
1. e raised to a power
   np.exp(argument)
2. Square root of the argument
   np.sqrt(argument)
3. Random values in [0, 1) with 3 rows and 4 columns
   np.random.random((3,4))
4. Round down
   a = np.floor(argument)
5. Flatten the matrix
   a.ravel()
6. Change the shape of the matrix
   a.shape = (6,2)
7. Transpose the matrix
   a.T
8. Join matrices horizontally (side by side)
   a = np.floor(argument)
   b = np.floor(argument)
   np.hstack((a,b))
9. Join matrices vertically (one above the other)
   np.vstack((a,b))
10. Split the array by columns into 3 equal parts
   np.hsplit(a,3)
   Note: the second argument may also be a tuple of split positions, e.g. (3,4)
11. Split the array by rows into equal parts
   np.vsplit(a,3)
'''
import numpy as np
a = np.floor(10 * np.random.random((3,4)))
'''
array([[6., 8., 0., 9.],
       [9., 3., 7., 4.],
       [2., 4., 9., 1.]])
'''
np.exp(a)
'''
array([[4.03428793e+02, 2.98095799e+03, 1.00000000e+00, 8.10308393e+03],
       [8.10308393e+03, 2.00855369e+01, 1.09663316e+03, 5.45981500e+01],
       [7.38905610e+00, 5.45981500e+01, 8.10308393e+03, 2.71828183e+00]])
'''
np.sqrt(a)
'''
array([[2.44948974, 2.82842712, 0.        , 3.        ],
       [3.        , 1.73205081, 2.64575131, 2.        ],
       [1.41421356, 2.        , 3.        , 1.        ]])
'''
a.ravel()
'''array([6., 8., 0., 9., 9., 3., 7., 4., 2., 4., 9., 1.])'''
a = np.floor(10 * np.random.random((3,4)))
b = np.floor(10 * np.random.random((3,4)))
np.hstack((a,b))
'''
array([[8., 2., 3., 8., 0., 1., 8., 6.],
       [7., 1., 0., 3., 1., 9., 6., 0.],
       [6., 0., 6., 6., 4., 3., 6., 3.]])
'''
np.vstack((a,b))
'''
array([[8., 2., 3., 8.],
       [7., 1., 0., 3.],
       [6., 0., 6., 6.],
       [0., 1., 8., 6.],
       [1., 9., 6., 0.],
       [4., 3., 6., 3.]])
'''
a = np.floor(10 * np.random.random((2,12)))
np.hsplit(a,3)
'''
[array([[6., 4., 2., 2.],
        [6., 2., 2., 8.]]),
 array([[1., 8., 8., 8.],
        [1., 9., 3., 6.]]),
 array([[2., 6., 0., 8.],
        [7., 1., 4., 3.]])]
'''
a = np.floor(10 * np.random.random((12,2)))
np.vsplit(a,3)
'''
[array([[2., 6.],
        [7., 1.],
        [4., 7.],
        [3., 1.]]),
 array([[2., 5.],
        [4., 6.],
        [2., 0.],
        [4., 4.]]),
 array([[9., 5.],
        [7., 1.],
        [2., 1.],
        [5., 1.]])]
'''

Note: the grouping of topics across the three parts of these notes (first, middle and last) is fairly loose.

Comparison of three ways of copying
'''
1. b = a
   a change to b is also a change to a (both names refer to the same array)
2. Shallow copy
   c = a.view()
   changing c.shape does not change a.shape
   c and a share the element values, but their ids differ
   c[1,0] = 1234 also changes the value in a
3. Deep copy
   d = a.copy()
   d[0,0] = 999
   d changes, a does not
'''
import numpy as np
a = np.arange(1,8)
# array([1, 2, 3, 4, 5, 6, 7])
b = a
b[2] = 999
b
# array([  1,   2, 999,   4,   5,   6,   7])
a
# array([  1,   2, 999,   4,   5,   6,   7])
a = np.arange(1,9)
c = a.view()
c.shape = 4,2
'''
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
'''
a
# array([1, 2, 3, 4, 5, 6, 7, 8])
d = a.copy()
d[3] = 888
d
# array([  1,   2,   3, 888,   5,   6,   7,   8])
a
# array([1, 2, 3, 4, 5, 6, 7, 8])

'''
1. Index of the largest value in each column
   data.argmax(axis = 0)
2. Elements at those index positions
   data[index, range(data.shape[1])]
   range supplies one column index per element
3. Tile (repeat) a numpy array
   a = np.array([4,5,6,2])
   np.tile(a,(2,3))
4. Sort each row of an array in ascending order
   a = np.array([[4,3,5],[1,7,6]])
   np.sort(a,axis = 1)
5. Sort the elements and return the index order
   a = np.array([4,3,1,2])
   j = np.argsort(a)
   a[j]
'''
import numpy as np
data = np.array([
    [4,5,6,8],
    [7,4,2,8],
    [9,5,4,2]
])
data.argmax(axis = 0)
# array([2, 0, 0, 0], dtype=int64)
data.argmax(axis = 1)
# array([3, 3, 0], dtype=int64)
index = data.argmax(axis = 0)
data[index, range(data.shape[1])]
# array([9, 5, 6, 8])
a = np.array([4,5,6,2])
np.tile(a,(2,3))
'''
array([[4, 5, 6, 2, 4, 5, 6, 2, 4, 5, 6, 2],
       [4, 5, 6, 2, 4, 5, 6, 2, 4, 5, 6, 2]])
'''
a = np.array([[4,3,5],[1,7,6]])
'''
array([[4, 3, 5],
       [1, 7, 6]])
'''
np.sort(a,axis = 1)
'''
array([[3, 4, 5],
       [1, 6, 7]])
'''
np.sort(a,axis = 0)
'''
array([[1, 3, 5],
       [4, 7, 6]])
'''
a = np.array([4,3,1,2])
j = np.argsort(a)
# array([2, 3, 1, 0])
a[j]
# array([1, 2, 3, 4])

Regular expressions

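1. \b and \B
   # \bthe     matches any string that starts with the
   # \bthe\b   matches only the word the
   # \Bthe     matches any string that contains the but does not start with it
2. [cr] means c or r
   [cr][23][dp][o2]
   A string of four characters: the first is "c" or "r", then "2" or "3", then "d" or "p", and finally "o" or "2".
   Examples: c2do, r3p2, r2d2, c3po
3. ["-a] matches any character whose ASCII code is between 34 and 97
4. Kleene closure
   * matches 0 or more times
   + matches 1 or more times
   ? matches 0 or 1 time
5. Match all valid or invalid HTML tags
   </?[^>]+>
6. Legal chess moves
   [KQRBNP][a-h][1-8]-[a-h][1-8]
   the square the piece starts from, a hyphen, the square it moves to
7. Credit card numbers
   [0-9]{15,16}
8. US telephone numbers
   \d{3}-\d{3}-\d{4}
   800-555-1212
9. Simple e-mail addresses
   \w+@\w+\.com
   XXX@YYY.com
10. In a regular expression a pair of parentheses can be used to
   group part of the expression
   match a subgroup
   Grouping allows a repetition operator to be applied to a whole sub-expression.
   Side effect: the substring matched by the group is saved, so the matched part can be extracted.
   Strings representing simple floating-point numbers:
   \d+(\.\d*)?
   0.004  2  75.
   First name and surname
   (Mr?s?\.)?[A-Z][a-z]*[A-Za-z-]+
11. Extension notation: groups that start with (? implement lookahead and lookbehind matching and conditional checks
   (?P<name>)        a named group
   (?:\w+\.)*        matches strings that end in a dot, e.g. google. twitter. facebook. (the group is not saved)
   (?#comment)       a comment
   (?=.com)          matches if the string is followed by .com
   (?<=800-)         matches if the string is preceded by 800-; consumes no input
   (?<!192\.168\.)   matches if not preceded by 192.168. (filters out a group of class C IP addresses)
   (?(1)y|x)         if group 1 (\1) matched, match against y, otherwise against x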
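A few of the patterns above, run through the re module. This is a small sketch with made-up sample strings; the group names `area`, `prefix` and `line` are chosen only for the illustration.

```python
import re

text = "the other theme, bathe"

starts_with_the = re.findall(r"\bthe\w*", text)   # words starting with "the"
only_the = re.findall(r"\bthe\b", text)           # the word "the" by itself
inside_word = re.findall(r"\Bthe\w*", text)       # "the" not at a word start

# Lookahead: a name followed by ".com" (the ".com" is not part of the match)
com_names = re.findall(r"\w+(?=\.com)", "www.google.com www.python.org")

# Lookbehind: digits preceded by "800-"
after_800 = re.findall(r"(?<=800-)\d+", "800-555-1212")

# Named groups for a US-style phone number
phone = re.match(r"(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})",
                 "800-555-1212")

print(starts_with_the, only_the, inside_word)
print(com_names, after_800, phone.group("area"))
```

Raw strings (the `r` prefix) keep backslashes such as `\b` from being interpreted by Python before the pattern reaches the regex engine.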

Pandas review
1. Import
   import pandas as pd
2. Read data; the file is in the same folder as the code
   food_info = pd.read_csv('food_info.csv')
3. Show the column types
   food_info.dtypes
4. Show the first five rows
   food_info.head()
   Show the first three rows
   food_info.head(3)
5. Show the last four rows
   food_info.tail(4)
6. Show the column names
   food_info.columns
7. Show the dimensions of the table
   food_info.shape
8. Take the row labelled 0
   food_info.loc[0]
   Take rows with a slice (with loc both ends are included)
   food_info.loc[3:6]
   Take a column by name
   food_info['Name']
   Take several columns
   columns = ["Price","Name"]
   food_info[columns]
9. Put the column names into a list
   col_names = food_info.columns.tolist()
10. Find the columns whose name ends in d
   for c in col_names:
       c.endswith("d")
11. Reduce the prices to one tenth
   food_info["Price"]/10
12. Maximum, minimum, mean
   food_info["Price"].max()
   food_info["Price"].min()
   food_info["Price"].mean()
13. Sort by a column
   ascending:
   food_info.sort_values("Price", inplace=True)
   descending:
   food_info.sort_values("Price", inplace=True, ascending=False)
14. Check whether values are NaN
   price = food_info["Price"]
   price_is_null = pd.isnull(price)
   food_info[price_is_null]

Pandas review 2
import pandas as pd
import numpy as np
food_info = pd.read_csv('food_info.csv')
1. Handle missing values (they can be filled with the mean or the mode)
   Select the non-missing data:
   price_is_null = pd.isnull(food_info["Price"])
   price = food_info["Price"][price_is_null==False]
   Fill with fillna:
   food_info['Price'] = food_info['Price'].fillna(food_info['Price'].mean())
2. Mean
   food_info["Price"].mean()
3. Mean of values for each level of index
   food_info.pivot_table(index = "",values = "",aggfunc = np.mean)
4. Totals (for example total head count)
   food_info.pivot_table(index = "",values = ["",""],aggfunc = np.sum)
5. Drop missing values
   dropna_columns = food_info.dropna(axis = 1)
   Drop the rows where Price or Time is NaN
   new_food_info = food_info.dropna(axis = 0,subset = ["Price","Time"])
6. Locate one specific value, row 83
   row_index_83_price = food_info.loc[83,"Price"]
7. Sort (sort_values is ascending by default)
   new_food_info.sort_values("Price")
8. Renumber the index with reset_index
   new_food_info.reset_index(drop = True)
9. Use the apply function
   new_food_info.apply(function name)
10. Count the missing values (despite its name, not_null_count counts the nulls)
   def not_null_count(column):
       column_null = pd.isnull(column)
       # column_null is a boolean mask marking the missing entries
       null = column[column_null]
       # null holds only the missing entries of the column
       return len(null)
   column_null_count = food_info.apply(not_null_count)
11. Divide into levels: age, marks
   def which_class(row):
       pclass = row["Pclass"]
       if pd.isnull(pclass):
           return "unknown class"
       elif pclass == 1:
           return "first class"
       elif pclass == 2:
           return "second class"
       elif pclass == 3:
           return "third class"
   new_food_info.apply(which_class,axis = 1)
12. Show a pivot table with pivot_table
   new_food_info.pivot_table(index = " ",values=" ")

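The Series structure (commonly used)
1. Create a Series object
   fandango = pd.read_csv("xxx.csv")
   series_rt = fandango["RottenTomatoes"]
   rt_scores = series_rt.values
   series_film = fandango["FILM"]
   # .values returns the underlying data
   film_names = series_film.values
   series_custom = pd.Series(rt_scores,index = film_names)
2. Take data with a slice
   series_custom[5:10]
3. Convert to a list
   original_index = series_custom.index.tolist()
4. Sort; original_index is a plain list at this point
   sorted_index = sorted(original_index)
5. Sort by index and by value
   series_custom.sort_index()
   series_custom.sort_values()
6. Show the values greater than 50
   series_custom[series_custom > 50]
7. Set the index
   fandango.set_index("FILM",drop = False)
8. Show the index
   fandango.index
9. Show the data type
   series_film.dtypes
10. Use an anonymous function with apply
   series_rt.apply(lambda x : x / 10)
   The axis argument belongs to DataFrame.apply, not Series.apply; with a DataFrame of numeric columns, axis = 1 applies the function to each row:
   fandango[numeric_columns].apply(lambda x : np.std(x),axis = 1)   # numeric_columns: a list of numeric column names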

Some plotting
import pandas as pd
import numpy as np
unrate = pd.read_csv("unrate.csv")
1. Convert to datetime
   unrate["date"] = pd.to_datetime(unrate["DATE"])
   import matplotlib.pyplot as plt
2. Plot: plt.plot() takes the x and y data and draws a line chart
   plt.plot(unrate["date"],unrate["values"])
3. Show the figure
   plt.show()
4. Rotate the x-axis labels by 45 degrees
   plt.xticks(rotation = 45)
5. Set the x-axis and y-axis labels
   plt.xlabel("xxx")
   plt.ylabel("xxx")
6. Set the title
   plt.title("xxx")
   fig = plt.figure()
7. Draw subplots
   fig.add_subplot(rows, columns, x)   x is the position of the subplot
   fig.add_subplot(4,3,1)
8. Set the size when creating the figure
   fig = plt.figure(figsize = (3,6))   width 3, height 6 (in inches)
9. Set the line colour
   plt.plot(x,y,c="colour")
10. Show the data of every year
   fig = plt.figure(figsize = (10,6))
   colors = ['red','blue','green','orange','black']
   # the colours
   for i in range(5):
       start_index = i*12
       end_index = (i+1)*12
       # start and end positions
       subset = unrate[start_index:end_index]
       # take a slice
       label = str(1948 + i)
       # build the label dynamically
       plt.plot(subset['month'],subset['value'],c = colors[i],label = label)
       # draw
   plt.legend(loc = 'best')
   # show the legend; loc = 'upper left' puts it in the top-left corner
   plt.show()
11. Bar chart:
   What has to be specified: the position of each bar and the height of each bar
   Heights:
   cols = ['FILM','XXX','AAA','FFF','TTT','QQQ']
   norm_reviews = reviews[cols]
   num_cols = ['A','B','C','D','E','F']
   bar_heights = norm_reviews.loc[0,num_cols].values
   # store the height of every column (.ix has been removed from pandas; use .loc)
   Positions (distance from 0):
   bar_positions = np.arange(len(num_cols)) + 0.75
   fig,ax = plt.subplots()
   ax.bar(bar_positions,bar_heights,0.3)
   positions first, then heights; 0.3 is the width of the bars
   ax.barh() draws horizontal bars
   plt.show()
12. Scatter plot
   fig,ax = plt.subplots()
   # draw with ax; ax is the axes being drawn on, fig is the whole figure
   ax.scatter(norm_reviews['A'],norm_reviews['B'])
   ax.set_xlabel('x')
   ax.set_ylabel('y')
   plt.show()
   # draw subplots with add_subplot
   fig = plt.figure(figsize = (5,10))
   ax1 = fig.add_subplot(2,2,1)
   ax2 = fig.add_subplot(2,2,2)
   ax1.scatter(x,y)
   ax1.set_xlabel('x')
   ax1.set_ylabel('y')
   ax2.scatter(x2,y2)
   ax2.set_xlabel('x')
   ax2.set_ylabel('y')
   plt.show()
13. With many values, a histogram can be drawn over chosen intervals
   # bins defaults to 10
   fig,ax = plt.subplots()
   ax.hist(unrate['Price'],bins = 20)
   ax.hist(unrate['Price'],range = (4,5),bins = 20)
14. Set the x and y ranges
   ax.set_xlim(0,50)
   ax.set_ylim(0,50)
15. Box plot:
   num_cols = ['AA','BB','CC','DD']
   fig,ax = plt.subplots()
   ax.boxplot(norm_reviews[num_cols].values)
   # draw the box plot
   ax.set_xticklabels(num_cols,rotation = 45)
   # set the x-axis labels and rotate them by 45 degrees
   ax.set_ylim(0,5)
   # set the y range
   plt.show()
16. Remove the tick marks around the plot
   ax.tick_params(bottom = False,top = False,left = False,right = False)
17. Put the legend in the top-right corner
   ax.legend(loc = "upper right")

Extracting the useful content from a txt file

from re import sub
from jieba import cut

def getWordsFromFile(txtFile):
    # collect all the words of one e-mail
    words = []
    # every text file holding an e-mail is stored in UTF-8
    with open(txtFile,encoding = "utf8") as fp:
        for line in fp:
            # go through each line and strip whitespace from both ends
            line = line.strip()
            # filter out noise characters
            line = sub(r'[.【】 0-9、-。,!~\*]','',line)
            # split the line into words
            line = cut(line)
            # drop words of length 1
            line = filter(lambda word:len(word) > 1 ,line)
            # add the preprocessed words to the words list
            words.extend(line)
    return words

chain and Counter
from collections import Counter
from itertools import chain
1. Use chain to unpack the two-dimensional list allwords
   Unpacking: chain(*allwords)
   yields the items of every sub-list of allwords in turn
2. Count the valid words
   freq = Counter(chain(*allwords))
3. Counter holds the number of times each item of an iterable occurs
   The most_common method returns the most frequent items, e.g. the top three:
   .most_common(3)
   Counter("dadasfafasfa")
   Counter({'a': 5, 'f': 3, 'd': 2, 's': 2})
   Counter("dadasfafasfa").most_common(2)
   [('a', 5), ('f', 3)]
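A short run of the two tools on a made-up word list (the contents of `allwords` here are invented for the illustration). `chain.from_iterable` does the same job as `chain(*allwords)` without unpacking the outer list into arguments.

```python
from collections import Counter
from itertools import chain

# one sub-list of words per e-mail (sample data)
allwords = [["free", "prize", "free"], ["meeting", "free"], ["prize"]]

# flatten the sub-lists and count every word
freq = Counter(chain.from_iterable(allwords))

# the two most frequent words with their counts
top_two = freq.most_common(2)

print(freq)
print(top_two)
```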

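1. Feature vector
   the number of times each valid word occurs in an e-mail (using the list method)
   number of occurrences of word: one_dimensional_list.count(word)
2. Convert a list to an array
   array(argument)
   build the training set of spam and normal e-mails
   array(list object or expression)
3. Use the naive Bayes algorithm
   model = MultinomialNB()
4. Train the model with model.fit
   model.fit(array, array)
5. Apply a function to the topWords data
   map(lambda x:words.count(x),topWords)
6. Predict with model.predict; the return value is 0 or 1
   result = model.predict(array.reshape(1,-1))[0]
7. Probability of each class
   model.predict_proba(array.reshape(1,-1))
8. A conditional expression makes the prediction easier to read
   1 is spam, 0 is a normal e-mail
   return "spam" if result == 1 else "normal e-mail"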
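Steps 1 and 5 above, building the feature vector, need nothing beyond the standard library. The sketch below shows only that part; the training and prediction steps with MultinomialNB are left out, and the function name `to_feature_vector` and the sample words are assumptions made for the illustration.

```python
def to_feature_vector(words, top_words):
    """Return how often each word of top_words occurs in words."""
    return list(map(lambda w: words.count(w), top_words))

# the most frequent words over all training e-mails (sample data)
top_words = ["free", "prize", "meeting"]

spam_mail = ["free", "prize", "free", "click"]
normal_mail = ["meeting", "tomorrow", "agenda"]

spam_vector = to_feature_vector(spam_mail, top_words)
normal_vector = to_feature_vector(normal_mail, top_words)

print(spam_vector)
print(normal_vector)
```

Each e-mail becomes one row of counts; stacking such rows gives the two-dimensional array that step 4 passes to model.fit.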

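Functions, advanced 1
'''
1. Why print("a>b") if a>b else pass fails:
   pass is a statement and has no value, so it cannot be used in the expression
2. Defining a function:
   def function_name():
       return   (optional)
3. Printing a call runs the function
   print(func_name())
   Note: if func_name already prints something, it is best not to print its result as well,
   because two results will appear
4. Give functions a return value where possible; it makes maintenance easier
   a function without a return value returns None
5. How to design a function:
   abstract the requirement and keep maintainability in mind
   when creating a method, keep maintainability and robustness in mind
6. A parameter marked with * is a tuple inside the function
7. Optional parameters have default values, required parameters do not
8. Robustness:
   know what the function will return (exception handling, condition checks)
   make sure the result is what you need
9. Use assert when testing
Program:'''
def func_name():
    return 1
print(func_name())
# 1
def func_name2():
    print("hello")
print(func_name2())
# hello
# None
def add(num1,num2):
    return num1 + num2
print(add(5,6))
# 11
def add(*num):
    d = 0
    for i in num:
        d += i
    return d
print(add(1,2,3,4))
# 10
def add(num1,num2 = 4):
    return num1 + num2
print(add(5))
# 9
print(add(5,8))
# 13
def add(num1,num2):
    # robustness
    if isinstance(num1,int) and isinstance(num2,int):
        return num1 + num2
    else:
        return "Error"
print(add('a',(1,2,3)))
# Error
print(add(3,4))
# 7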

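'''
1. Do not call sorting functions inside a loop
2. Start from a correct plan when solving a problem
   write pseudocode:
   what to do first
   what to do second
   ...
   then implement it step by step
3. Use the filter function
   compare only the arguments whose type is int
   def func(*num):
       num = filter(lambda x:isinstance(x,int),num)
4. When the parameter is a module, print the parameter
   print("doc %s"%module)
5. Do not make code complicated; a reader should see at a glance what it does
6. os.path.exists(file) as a condition
   checks whether the file exists
7. Checking with assert:
   type assertions, data assertions
8. Keep the solution as simple as possible and test it thoroughly
9. Name functions with underscores or camel case
   get_doc getDoc
10. Pseudocode: write the plan down
11. Advantages of default values:
   less work, easy configuration; write plenty of comments
   the types of the arguments passed in
   the type of the data returned
12. Test
Program:'''
def function(*num):
    # return the largest and the smallest value
    num = filter(lambda x : isinstance(x,int),num)
    # drop everything that is not an int
    a = sorted(num)
    return "max:",a[-1],"min:",a[0]
print(function(5,6,"adaf",1.2,99.5,[4,5]))
# ('max:', 6, 'min:', 5)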
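One detail worth knowing when filtering with isinstance(x, int): bool is a subclass of int, so True and False pass the check. The variant below is a sketch that excludes them and reports an empty input clearly; the function name `min_max` is made up for the illustration.

```python
def min_max(*values):
    """Return (largest, smallest) of the int arguments, ignoring everything else."""
    ints = [v for v in values
            if isinstance(v, int) and not isinstance(v, bool)]
    if not ints:
        raise ValueError("no int arguments were given")
    return max(ints), min(ints)

result = min_max(5, 6, "adaf", 1.2, 99.5, [4, 5], True)
print(result)
```

Using max and min directly also avoids sorting the whole list just to read its two ends.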

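Python exceptions and exception handling:
An error that occurs while the program is running is called an exception.
Examples:
　　0 used as a divisor: ZeroDivisionError
　　an undefined variable: NameError
　　adding values of different types: TypeError
Exception handling:
'''
try:
    code to run
except:
    code to run when an exception occurs
Running a try statement:
    if an exception occurs, control jumps to the except clause
    if no exception occurs, the try block runs to the end and execution continues after the except clause
'''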
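A small sketch of the flow described above, catching specific exception types instead of a bare except. The function name `safe_divide` and the returned messages are made up for the illustration.

```python
def safe_divide(a, b):
    """Divide a by b and describe the problem when the division fails."""
    try:
        result = a / b
    except ZeroDivisionError:
        # runs only when b is 0
        return "cannot divide by zero"
    except TypeError:
        # runs when the operands do not support division
        return "operands must be numbers"
    else:
        # runs only when the try block raised nothing
        return result

print(safe_divide(6, 3))
print(safe_divide(1, 0))
print(safe_divide("a", 1))
```

Naming the exception types keeps unrelated errors, such as a misspelled variable name, from being silently swallowed.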

False: the Boolean value false. It is the result when a condition does not hold.
# == tests whether two objects have equal values
print('' == False)
# False
print(None == False)
# False
print([] == False)
# False
print(() == False)
# False
print({} == False)
# False
# is tests whether two names refer to the same object
print('' is False)
# False
print(None is False)
# False
print([] is False)
# False
print(() is False)
# False
print({} is False)
# False
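Although none of these empty values is equal to False, each of them counts as false in a condition. A short sketch using bool() to show the difference; the variable names are made up for the illustration.

```python
empty_values = ['', None, [], (), {}, 0]

# bool() gives the truth value that an if statement would use
truthiness = [bool(v) for v in empty_values]

zero = 0
zero_equals_false = (zero == False)   # True: bool is a kind of int, and False == 0
zero_is_false = (zero is False)       # False: they are different objects

print(truthiness)
print(zero_equals_false, zero_is_false)
```

This is why `if not items:` is the usual way to test for an empty container rather than comparing it with False.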

Regular expressions, supplement 2

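Regular expression basics 1
'''
1. Calling eval() with a precompiled code object instead of a string improves performance; compiling a regular expression follows the same idea
2. Before a pattern can be matched it must be compiled into a regular expression object
   when the same pattern is matched many times, compiling it once in advance is faster
   re.compile(pattern,flags = 0)
3. Clear the cache
   re.purge()
4. With re.S the dot . also matches the newline \n
5. The object returned by match() and search() is called a match object
   its most used methods are group() and groups()
   group() returns the whole match or a specific subgroup
   groups() returns a tuple holding the one or all subgroups
6. The match() method
   matches the pattern at the start of the string
   success: returns a match object
   failure: returns None
7. The search() method
   matches the pattern anywhere in the string
   success: returns a match object
   failure: returns None
8. The pos and endpos arguments restrict the part of the string that is searched
9. The dot . matches any single character
   .end cannot match end by itself
   it does not match the newline \n
   use \. to match a literal dot
10. Character sets with [ ]
   [cr][23] matches
   a first character c or r
   and a second character 2 or 3
Program:'''
import re
# match(pattern,string,flags = 0) example
m = re.match('foo','foo')
if m is not None:
    # print the match if matching succeeded
    print(m.group())
    # print(m.groups()) would return an empty tuple because there are no subgroups
# foo
# a match object has group() and groups() methods
print(m)
# <re.Match object; span=(0, 3), match='foo'>
# search(pattern,string,flags = 0) example
m = re.search('foo','sea food')
if m is not None:
    print(m.group())
# foo
print(m)
# <re.Match object; span=(4, 7), match='foo'>
# the match starts at index 4
# | (alternation) example
bt = 'bat|bet'
m = re.match(bt,'bat')
# match only looks at the start
m2 = re.search(bt,'abat')
# search scans from start to end
print(m.group())
# bat
print(m2.group())
# bat
# . (any character) example
anyend = '.end'
m = re.match(anyend,'aend')
print(m.group())
# aend
m2 = re.search(anyend,'abcdend')
print(m2.group())
# dend
pi_pattern = '3.14'
# the unescaped dot matches any character, so this also matches e.g. 3014
m = re.match(pi_pattern,'3.14')
print(m.group())
# 3.14
pi_pattern = r'3\.14'
# escape the dot so that only a literal . matches
m = re.match(pi_pattern,'3.14')
print(m.group())
# 3.14
# [ ] character sets
m = re.match('[cr][23][dp][po]','c3po')
print(m.group())
# c3po
m = re.match('[cr][23][dp][po]','c2do')
print(m.group())
# c2do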
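Points 2, 4, 5 and 8 from the list above, shown together. This is a small sketch with made-up sample strings; the pattern and variable names are chosen for the illustration.

```python
import re

# point 2: compile once, reuse many times
pattern = re.compile(r"(\w+)-(\d+)")

# point 5: group() and groups() on a match object
first = pattern.search("id: abc-123")
whole = first.group()      # the whole match
name = first.group(1)      # the first subgroup
parts = first.groups()     # all subgroups as a tuple

# point 8: the second argument of a compiled pattern's search() is pos
later = pattern.search("abc-123 def-456", 7)

# point 4: the dot matches a newline only with re.S
without_flag = re.search(r"a.b", "a\nb")
with_flag = re.search(r"a.b", "a\nb", re.S)

print(whole, name, parts)
print(later.group())
print(without_flag, with_flag is not None)
```

The pos and endpos arguments are available on the methods of a compiled pattern; the module-level functions such as re.search take flags in that position instead.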

Adding two numbers (summary of a video watched on Bilibili)
'''
Add two numbers:
    two non-empty linked lists represent two non-negative integers
    the digits are stored in reverse order
    each node holds a single digit
Example:
    input:  (2->4->3) + (5->6->4)
    output: 7->0->8
    reason: 342 + 465 = 807
'''
class ListNode:
    def __init__(self,x):
        # called when an instance of the class is created
        self.val = x
        self.next = None
        # self refers to the instance; methods need it to reach the instance's data and other methods

a = ListNode(10086)
# print(a,a.val)
# <__main__.ListNode object at 0x000001F76D46A148> 10086

# # make the tail point back to the head
# move = a
# for i in range(4):
#     temp = ListNode(i)
#     # temp is a ListNode
#     move.next = temp
#     # the node after move is temp
#     move = move.next
#     # step to the next node
# move.next = a
# # point back to the head node a

class Solution1:
    def addTwoNumbers(self,l1:ListNode,l2:ListNode) -> ListNode:
        res = ListNode(10086)
        move = res
        carry = 0
        # carry digit
        while l1 is not None or l2 is not None:
            if l1 is None:
                l1,l2 = l2,l1
                # swap so that l1 is always the list being written to
            if l2 is None:
                carry,l1.val = divmod((l1.val + carry),10)
                # update l1 in place
                move.next = l1
                # attach the node
                l1,move = l1.next,move.next
                # step forward (l2 is exhausted, so only l1 advances)
            else:
                carry,l1.val = divmod((l1.val + l2.val + carry),10)
                # neither is None: add the digits, keep the remainder, carry the rest
                move.next = l1
                # attach the node
                l1,l2,move = l1.next,l2.next,move.next
                # step forward in both lists
        if carry == 1:
            move.next = ListNode(carry)
        return res.next

class ListNode:
    def __init__(self,x):
        # called when an instance of the class is created
        self.val = x
        self.next = None
        # self refers to the instance; methods need it to reach the instance's data and other methods

# a = ListNode(10086)
# recursive version
class Solution1:
    def addTwoNumbers(self,l1:ListNode,l2:ListNode) -> ListNode:
        def recursive(n1,n2,carry = 0):
            if n1 is None and n2 is None:
                return ListNode(1) if carry == 1 else None
                # a remaining carry becomes a final node holding 1
            if n1 is None:
                n1,n2 = n2,n1
                # when n1 is empty, swap the two lists
                return recursive(n1,None,carry)
                # and continue the recursion on n1 alone
            if n2 is None:
                carry,n1.val = divmod((n1.val + carry),10)
                # divmod returns the carry and the digit; the digit replaces n1's value
                n1.next = recursive(n1.next,None,carry)
                # process the rest of n1 and update the n1 list
                return n1
            carry,n1.val = divmod((n1.val + n2.val + carry),10)
            # neither is empty: add the digits and update n1
            n1.next = recursive(n1.next,n2.next,carry)
            # the rest of n1 is the result of adding the rest of both lists
            return n1
        return recursive(l1,l2)
        # start the inner function
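To try the idea out, it helps to convert between Python lists and linked lists. The sketch below is an alternative written for testing, not the version from the video: it builds new nodes instead of modifying l1 in place, and the helper names `from_digits`, `to_digits` and `add_two_numbers` are made up for the illustration.

```python
class ListNode:
    def __init__(self, x):
        self.val = x
        self.next = None

def from_digits(digits):
    """Build a linked list from a list of digits (lowest digit first)."""
    dummy = ListNode(0)
    tail = dummy
    for d in digits:
        tail.next = ListNode(d)
        tail = tail.next
    return dummy.next

def to_digits(node):
    """Collect the digits of a linked list into a Python list."""
    out = []
    while node is not None:
        out.append(node.val)
        node = node.next
    return out

def add_two_numbers(l1, l2):
    """Add two reversed-digit lists and return a new list; inputs are untouched."""
    dummy = ListNode(0)
    tail = dummy
    carry = 0
    while l1 is not None or l2 is not None or carry:
        total = carry
        if l1 is not None:
            total += l1.val
            l1 = l1.next
        if l2 is not None:
            total += l2.val
            l2 = l2.next
        carry, digit = divmod(total, 10)
        tail.next = ListNode(digit)
        tail = tail.next
    return dummy.next

print(to_digits(add_two_numbers(from_digits([2, 4, 3]), from_digits([5, 6, 4]))))
print(to_digits(add_two_numbers(from_digits([9, 9]), from_digits([1]))))
```

Keeping `carry` in the loop condition handles a final carry (99 + 1 = 100) without a separate check after the loop.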

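Number guessing game
'''
Analysis:
    parameters: the integer range and the maximum number of guesses
    pick a random integer in the range and let the user guess it
    give a hint: too high, too low, correct
    keep giving hints until the guess is correct or the guesses run out
'''
import random

def fail(os_num):
    '''The guesses are used up'''
    print("You did not guess the number")
    print("The number chosen by the system was:", os_num)
    print("Game over, see you next time")
    return

def cxsr(os_num,count):
    '''Ask for another number'''
    count -= 1
    print("Hint: you have %d guesses left" % (count))
    if count == 0:
        fail(os_num)
    else:
        user_cs = int(input("Enter another integer between 0 and 8:\n"))
        csz(os_num,count,user_cs)

def csz(os_num,count,user_cs,num_range = 8):
    '''The guessing function'''
    # num_range is the integer range, count the number of guesses left, user_cs the user's guess
    if user_cs > num_range or user_cs < 0:
        print("Invalid input, please run the program again")
        return
    if count == 0:
        fail(os_num)
    else:
        if os_num > user_cs:
            print("Your guess is lower than the system's number")
            cxsr(os_num,count)
        elif os_num < user_cs:
            print("Your guess is higher than the system's number")
            cxsr(os_num,count)
        else:
            print("Congratulations, you guessed it")
            print("See you next time!")

os_num = random.randint(0,8)
# os_num is the random number chosen by the system
print("Game start")
user_cs = int(input("This is a number guessing game (you have three guesses). Enter an integer between 0 and 8\n"))
# user_cs is the user's guess
csz(os_num,3,user_cs)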

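The exercises in my earlier posts, together with the ones I still have on hand, number roughly 1000 or more, so they are not collected here. These notes only accumulate the key knowledge points.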

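Functions, advanced 3
'''
1. Think about maintainability
   keep each line of code as simple as possible
   list comprehensions
   lambda anonymous functions
2. Assertions are for your own testing; do not use them for flow control
   do not put assert inside a for loop
3. Exception handling in a program
   argument handling
   try / except; know what type each argument is
4. Functions: avoid writing them so that they only work in one particular environment
5. An assertion is an exception: when it fails, an exception is raised
6. Local versus global variables:
   when a local variable has the same name as a global one, a separate variable is created in the local scope
   declaring the name with global makes the function use (and modify) the global variable
7. When an argument is a mutable object, assigning through an index modifies the caller's data
Program:'''
def func1(num1,num2):
    return num1 + num2
# print the variable names
print(func1.__code__.co_varnames)
# ('num1', 'num2')
print(func1.__code__.co_filename)
# the file name
# point 6:
arg = 6
def add(num = 3):
    arg = 4
    return arg + num
print(add())
# 7
arg = 6
def add(num = 3):
    # declare with global
    global arg
    return arg + num
print(add())
# 9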
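Point 7 has no program in the notes above, so here is a small sketch of it. The function names `set_first` and `rebind` are made up for the illustration.

```python
def set_first(items):
    # index assignment changes the list object the caller passed in
    items[0] = 999

def rebind(items):
    # assigning to the parameter name only rebinds the local name
    items = [0, 0, 0]
    return items

data = [1, 2, 3]

set_first(data)
after_index_assignment = list(data)

rebind(data)
after_rebinding = list(data)

print(after_index_assignment)
print(after_rebinding)
```

Passing a copy (for example `set_first(data.copy())`) protects the caller's list when in-place changes are not wanted.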

Functions, advanced 4
'''
1. Anonymous functions:
   a single expression, no return
   no name
   used for very small jobs
2. Check whether an argument exists
   decide what happens if it does not, and provide a way to handle it
3. filter can be combined with lambda
   without converting to list, only a filter object is returned
4. Parameters:
   positional matching:
       func(name)
   keyword matching:
       func(key = value)
       parameters with default values must come after those without
   collecting:
       into a tuple
           func(name,arg1,arg2)
           func(*args)
       into a dictionary
           func(name,key1 = value1,key2 = value2)
           func(**kwargs)
   parameter order
5. Recursion:
   a recursive function calls itself
Program:'''
# point 1:
d = lambda x:x+1
print(d(2))
# 3
d = lambda x:x + 1 if x > 0 else "not greater than 0"
print(d(3))
# 4
print(d(-3))
# not greater than 0
# list comprehension
g = lambda x:[(x,i) for i in range(5)]
print(g(5))
# only one argument is passed
# [(5, 0), (5, 1), (5, 2), (5, 3), (5, 4)]
# point 3:
t = [1,4,7,8,5,3,9]
g = list(filter(lambda x: x > 5,t))
# list converts the result to a list
print(g)
# [7, 8, 9]
# point 4:
# arguments passed by position
def func(arg1,arg2,arg3):
    return arg1,arg2,arg3
print(func(1,2,3))
# (1, 2, 3)
# keyword matching, independent of position
def func(k1 = '',k2 = '',k3 = ''):
    return k1,k2,k3
print(func(k2 = 4 , k3 = 5 , k1 = 3))
# (3, 4, 5)
# collecting
# tuple
def func(*args):
    return args
print(func(5,6,7,8,[1,2,3]))
# (5, 6, 7, 8, [1, 2, 3])
def func(a,*args):
    # a is matched first, then *args
    return args
print(func(5,6,7,8,[1,2,3]))
# (6, 7, 8, [1, 2, 3])
# dictionary
def func(**kwargs):
    return kwargs
print(func(a = 5,b = 8))
# {'a': 5, 'b': 8}
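Two points above have no program: the order of the parameter kinds and recursion. A short sketch of both; the function names `describe` and `factorial` are made up for the illustration.

```python
def describe(name, greeting="hello", *args, **kwargs):
    """Parameter order: positional, default, *args, **kwargs."""
    return name, greeting, args, kwargs

def factorial(n):
    """Recursion: the function calls itself on a smaller input."""
    if n < 0:
        raise ValueError("n must not be negative")
    if n <= 1:              # base case stops the recursion
        return 1
    return n * factorial(n - 1)

only_name = describe("a")
everything = describe("a", "hi", 1, 2, key=3)
five_factorial = factorial(5)

print(only_name)
print(everything)
print(five_factorial)
```

Every recursive function needs a base case like `n <= 1`; without it the calls continue until Python raises RecursionError.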

List comprehensions
lst = [1,2,3]
lst2 = [1,2,3,5,1]
lst3 = [1,2]
lst4 = [1,2,3,65,8]
lst5 = [1,2,3,59,5,1,2,3]
def length(*args):
    # return the lengths
    lens = [len(i) for i in args]
    return lens
print(length(lst,lst2,lst3,lst4,lst5))
# [3, 5, 2, 5, 8]
dic = dict(zip(['a','b','c','d','e'],[4,5,6,7,8]))
lst = ["%s : %s" %(key,value) for key,value in dic.items()]
print(lst)
# ['a : 4', 'b : 5', 'c : 6', 'd : 7', 'e : 8']

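Notes on classes and exceptions
Original source: https://www.runoob.com/python/python-object.html
A short introduction to object-oriented terms
Class: describes a collection of objects that share the same attributes and methods. It defines the attributes and methods common to every object in the collection. An object is an instance of a class.
Class variable: shared by all instances of the class. Class variables are defined inside the class but outside any method. They are not normally used as instance variables.
Data member: a class variable or an instance variable, holding data that belongs to the class or to its instances.
Method overriding: when a method inherited from the parent class does not meet the needs of the subclass, it can be rewritten. This is called overriding the method.
Local variable: a variable defined inside a method; it is visible only inside that method.
Instance variable: a variable that belongs to one instance. In Python it is normally created by assigning to self.name inside a method such as __init__.
Inheritance: a derived class inherits the fields and methods of a base class. Inheritance also allows an object of the derived class to be treated as an object of the base class. For example, a Dog type derived from an Animal class models an "is-a" relationship (a Dog is an Animal).
Instantiation: creating an instance of a class, a concrete object of the class.
Method: a function defined in a class.
Object: an instance of the data structure defined by a class. An object includes two kinds of data members (class variables and instance variables) and methods.
Creating a class
The class statement creates a new class. The class name follows the keyword class and ends with a colon:
class ClassName:
    'help text of the class'   # class docstring
    class_suite                # class body
The help text of the class can be read through ClassName.__doc__.
class_suite consists of class members, methods and data attributes.
Example
The following is a simple Python class:
class Employee:
    'Base class for all employees'
    empCount = 0

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.empCount += 1

    def displayCount(self):
        print("Total Employee %d" % Employee.empCount)

    def displayEmployee(self):
        print("Name : ", self.name, ", Salary: ", self.salary)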
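empCount is a class variable; its value is shared by all instances of the class. It can be accessed as Employee.empCount inside or outside the class.
The first method, __init__(), is a special method known as the constructor or initialisation method of the class. It is called whenever an instance of the class is created.
self stands for the instance. It is required when defining a method, although no matching argument is passed when the method is called.
self stands for the instance, not the class
Methods differ from ordinary functions in one respect only: they must have an extra first parameter, which by convention is named self.
class Test:
    def prt(self):
        print(self)
        print(self.__class__)
t = Test()
t.prt()
The output of this example (in Python 3) is:
<__main__.Test object at 0x10d066878>
<class '__main__.Test'>
The output shows clearly that self refers to the instance, displayed with the address of the current object, while self.__class__ refers to the class.
self is not a Python keyword.
"Create the first object of the Employee class"
emp1 = Employee("Hany", 9000)
Accessing attributes
Attributes of an object are accessed with the dot operator. Class variables are accessed through the class name:
emp1.displayEmployee()
emp2.displayEmployee()   # emp2 would be a second Employee created in the same way
print("Total Employee %d" % Employee.empCount)
Attributes can be added, modified and deleted, for example:
emp1.age = 7   # add an 'age' attribute
emp1.age = 8   # modify the 'age' attribute
del emp1.age   # delete the 'age' attribute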

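Attributes can also be accessed with the following functions:
getattr(obj, name[, default]): read an attribute of the object.
hasattr(obj, name): check whether an attribute exists.
setattr(obj, name, value): set an attribute; if it does not exist, it is created.
delattr(obj, name): delete an attribute.
hasattr(emp1, 'age')      # True if the 'age' attribute exists
getattr(emp1, 'age')      # the value of the 'age' attribute
setattr(emp1, 'age', 8)   # add the attribute 'age' with value 8
delattr(emp1, 'age')      # delete the attribute 'age'
Built-in class attributes
__dict__: the attributes of the class (a dictionary made up of the class's data attributes)
__doc__: the docstring of the class
__name__: the class name
__module__: the module in which the class is defined (the full name of the class is '__main__.className'; if the class lives in an imported module mymod, className.__module__ equals mymod)
__bases__: all parent classes of the class (a tuple of the parent classes)
These can be read directly from the class, for example Employee.__doc__ or Employee.__name__.
Object destruction (garbage collection)
Python uses the simple technique of reference counting to track and reclaim garbage.
Internally Python records how many references each object in use has.
This internal tracking variable is called a reference counter.
When an object is created, a reference count is created for it. When the object is no longer needed, that is, when its reference count drops to 0, it is garbage collected. The collection is not necessarily immediate: the interpreter reclaims the memory of garbage objects at a suitable time.
a = 40       # create the object <40>
b = a        # increase the reference count of <40>
c = [b]      # increase the reference count of <40>
del a        # decrease the reference count of <40>
b = 100      # decrease the reference count of <40>
c[0] = -1    # decrease the reference count of <40>
Garbage collection does not only handle objects whose reference count is 0; it can also handle reference cycles. A reference cycle is two objects that refer to each other while no other variable refers to them.
In this case reference counting alone is not enough. Python's garbage collector is in fact a reference counter plus a cyclic garbage collector.
As a supplement to reference counting, the garbage collector also watches for objects whose total allocation is large (that is, those not destroyed through reference counting). In that case the interpreter pauses and tries to clean up all unreferenced cycles.
The destructor __del__: __del__ is called when an object is destroyed, that is, when the object is no longer in use.
Class inheritance
One of the main benefits of object-oriented programming is code reuse, and one way to achieve it is inheritance.
A new class created through inheritance is called a subclass or derived class; the class inherited from is called the base class, parent class or superclass.
Inheritance syntax
class DerivedClassName(BaseClassName)
    ...
Some features of inheritance in Python:
1. If the subclass needs the parent's constructor, it must either call the parent's constructor explicitly or not override the constructor at all. For details see: notes on subclasses inheriting the parent constructor.
2. When a base-class method is called through the base class name, the name is used as a prefix and self must be passed explicitly. Inside a class, ordinary functions are called without passing self. (In Python 3, super() can be used instead of the explicit base class name.)
3. Python always looks for the method in the type itself first; only if it cannot find the method in the derived class does it search the base classes one by one. (The method is first looked up in the class itself and only then in the base classes.)
If more than one class is listed in the inheritance tuple, this is called "multiple inheritance".
Syntax:
A derived class is declared like its parents, with the list of base classes after the class name:
class SubClassName (ParentClass1[, ParentClass2, ...]):
    ...
Example
class Parent:        # define the parent class
    parentAttr = 100
    def __init__(self):
        print("calling the parent constructor")
    def parentMethod(self):
        print('calling the parent method')
    def setAttr(self, attr):
        Parent.parentAttr = attr
    def getAttr(self):
        print("parent attribute :", Parent.parentAttr)

class Child(Parent): # define the subclass
    def __init__(self):
        print("calling the child constructor")
    def childMethod(self):
        print('calling the child method')

c = Child()          # instantiate the subclass
c.childMethod()      # call the subclass method
c.parentMethod()     # call the parent method
c.setAttr(200)       # call a parent method again: set the attribute
c.getAttr()          # call a parent method again: read the attribute
The output of the code above is:
calling the child constructor
calling the child method
calling the parent method
parent attribute : 200
A class can inherit from several classes:
class A:        # define class A
    .....
class B:        # define class B
    .....
class C(A, B):  # inherit from A and B
    .....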
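 Use issubclass() or isinstance() to check these relationships. issubclass(sub, sup) returns True if sub is a subclass (direct or indirect) of sup. isinstance(obj, Class) returns True if obj is an instance of Class or of one of its subclasses. Method overriding: if a parent-class method does not do what you need, you can override it in the subclass. The table below lists some common methods that can be overridden:
 1 __init__(self [, args...]) Constructor. Typical call: obj = className(args)
 2 __del__(self) Destructor, runs when the object is destroyed. Typical call: del obj (del only removes one reference; __del__ runs once no references remain)
 3 __repr__(self) Returns a representation meant for the interpreter. Typical call: repr(obj)
 4 __str__(self) Returns a human-readable string. Typical call: str(obj)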
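The special methods in the table can be sketched in Python 3 as follows. The Point class and its attributes are hypothetical names chosen for illustration; they do not come from the original notes.

```python
class Point:
    """Hypothetical example class used only to show the special methods."""

    def __init__(self, x, y):
        # Runs when Point(x, y) is called
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, used by repr() and the interactive prompt
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Human-readable form, used by str() and print()
        return f"({self.x}, {self.y})"


p = Point(1, 2)
print(repr(p))  # calls Point.__repr__
print(str(p))   # calls Point.__str__
```

If a class defines only __repr__, str() falls back to it, so __repr__ is usually the first of the two worth writing.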
 類屬性與方法 類的私有屬性 __private_attrs：兩個下劃線開頭，聲明該屬性爲私有，不能在類的外部被使用或直接訪問。在類內部的方法中使用時 self.__private_attrs。 類的方法 在類的內部，使用 def 關鍵字能夠爲類定義一個方法，與通常函數定義不一樣，類方法必須包含參數 self,且爲第一個參數 類的私有方法 __private_method：兩個下劃線開頭，聲明該方法爲私有方法，不能在類的外部調用。在類的內部調用 self.__private_methods Python 經過改變名稱來包含類名: Traceback (most recent call last): File "test.py", line 17, in <module> print counter.__secretCount # 報錯，實例不能訪問私有變量 AttributeError: JustCounter instance has no attribute '__secretCount' Python不容許實例化的類訪問私有數據，但能夠使用 object._className__attrName（ 對象名._類名__私有屬性名 ）訪問屬性： 單下劃線、雙下劃線、頭尾雙下劃線說明： __foo__: 定義的是特殊方法，通常是系統定義名字 ，相似 __init__() 之類的。 _foo: 以單下劃線開頭的表示的是 protected 類型的變量，即保護類型只能容許其自己與子類進行訪問，不能用於 from module import * __foo: 雙下劃線的表示的是私有類型(private)的變量, 只能是容許這個類自己進行訪問了。
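The name-mangling rule described above can be tried out with a small sketch. It reuses the JustCounter and __secretCount names that appear in the traceback; the count method and publicCount attribute are assumptions added for illustration.

```python
class JustCounter:
    def __init__(self):
        self.__secretCount = 0  # stored on the instance as _JustCounter__secretCount
        self.publicCount = 0

    def count(self):
        # Inside the class the short name works
        self.__secretCount += 1
        self.publicCount += 1
        return self.__secretCount


counter = JustCounter()
counter.count()
counter.count()

try:
    counter.__secretCount  # no mangling outside the class body, so lookup fails
except AttributeError:
    direct_access_failed = True
else:
    direct_access_failed = False

# The mangled name is still reachable, so this is a convention, not real privacy
mangled_value = counter._JustCounter__secretCount
print(direct_access_failed, mangled_value)
```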

原文連接:https://www.runoob.com/python3/python3-class.html Python中的類提供了面向對象編程的全部基本功能：類的繼承機制容許多個基類，派生類能夠覆蓋基類中的任何方法，方法中能夠調用基類中的同名方法。 對象能夠包含任意數量和類型的數據。 類定義 語法格式以下： class ClassName: <statement-1> . . . <statement-N> 類實例化後，能夠使用其屬性，實際上，建立一個類以後，能夠經過類名訪問其屬性。 類對象 類對象支持兩種操做：屬性引用和實例化。 屬性引用使用和 Python 中全部的屬性引用同樣的標準語法：obj.name。 類對象建立後，類命名空間中全部的命名都是有效屬性名: 類有一個名爲 __init__() 的特殊方法（構造方法），該方法在類實例化時會自動調用

 self表明類的實例，而非類 類的方法與普通的函數只有一個特別的區別——它們必須有一個額外的第一個參數名稱, 按照慣例它的名稱是 self。 繼承 Python 一樣支持類的繼承，若是一種語言不支持繼承，類就沒有什麼意義。派生類的定義以下所示: class DerivedClassName(BaseClassName1): <statement-1> . . . <statement-N> BaseClassName（示例中的基類名）必須與派生類定義在一個做用域內。除了類，還能夠用表達式，基類定義在另外一個模塊中時這一點很是有用: class DerivedClassName(modname.BaseClassName): 多繼承 Python一樣有限的支持多繼承形式。多繼承的類定義形以下例: class DerivedClassName(Base1, Base2, Base3): <statement-1> . . . <statement-N> 須要注意圓括號中父類的順序，如果父類中有相同的方法名，而在子類使用時未指定，python從左至右搜索 即方法在子類中未找到時，從左到右查找父類中是否包含方法。 方法重寫 若是你的父類方法的功能不能知足你的需求，你能夠在子類重寫你父類的方法： 類的私有屬性 __private_attrs：兩個下劃線開頭，聲明該屬性爲私有，不能在類的外部被使用或直接訪問。在類內部的方法中使用時 self.__private_attrs。 類的方法 在類的內部，使用 def 關鍵字來定義一個方法，與通常函數定義不一樣，類方法必須包含參數 self，且爲第一個參數，self 表明的是類的實例。 類的私有方法 __private_method：兩個下劃線開頭，聲明該方法爲私有方法，只能在類的內部調用 ，不能在類的外部調用。self.__private_methods。 類的實例化對象不能訪問類中的私有屬性 類的專有方法： __init__ : 構造函數，在生成對象時調用 __del__ : 析構函數，釋放對象時使用 __repr__ : 打印，轉換 __setitem__ : 按照索引賦值 __getitem__: 按照索引獲取值 __len__: 得到長度 __cmp__: 比較運算 __call__: 函數調用 __add__: 加運算 __sub__: 減運算 __mul__: 乘運算 __truediv__: 除運算 __mod__: 求餘運算 __pow__: 乘方

Python 子類繼承父類構造函數說明 原文連接:https://www.runoob.com/w3cnote/python-extends-init.html 若是在子類中須要父類的構造方法就須要顯式地調用父類的構造方法，或者不重寫父類的構造方法。 子類不重寫 __init__，實例化子類時，會自動調用父類定義的 __init__。 父類名稱.__init__(self,參數1，參數2，...) 實例 class Father(object): def __init__(self, name): self.name=name print ( "name: %s" %( self.name)) def getName(self): return 'Father ' + self.name class Son(Father): def __init__(self, name): super(Son, self).__init__(name) print ("hi") self.name = name def getName(self): return 'Son '+self.name if __name__=='__main__': son=Son('runoob') print ( son.getName() ) 輸出結果爲： name: runoob hi Son runoob

Python 異常(菜鳥教程) 原文:https://www.runoob.com/python/python-exceptions.html BaseException 全部異常的基類 SystemExit 解釋器請求退出 KeyboardInterrupt 用戶中斷執行(一般是輸入^C) Exception 常規錯誤的基類 StopIteration 迭代器沒有更多的值 GeneratorExit 生成器(generator)發生異常來通知退出 StandardError 全部的內建標準異常的基類 ArithmeticError 全部數值計算錯誤的基類 FloatingPointError 浮點計算錯誤 OverflowError 數值運算超出最大限制 ZeroDivisionError 除(或取模)零 (全部數據類型) AssertionError 斷言語句失敗 AttributeError 對象沒有這個屬性 EOFError 沒有內建輸入,到達EOF 標記 EnvironmentError 操做系統錯誤的基類 IOError 輸入/輸出操做失敗 OSError 操做系統錯誤 WindowsError 系統調用失敗 ImportError 導入模塊/對象失敗 LookupError 無效數據查詢的基類 IndexError 序列中沒有此索引(index) KeyError 映射中沒有這個鍵 MemoryError 內存溢出錯誤(對於Python 解釋器不是致命的) NameError 未聲明/初始化對象 (沒有屬性) UnboundLocalError 訪問未初始化的本地變量 ReferenceError 弱引用(Weak reference)試圖訪問已經垃圾回收了的對象 RuntimeError 通常的運行時錯誤 NotImplementedError 還沒有實現的方法 SyntaxError Python 語法錯誤 IndentationError 縮進錯誤 TabError Tab 和空格混用 SystemError 通常的解釋器系統錯誤 TypeError 對類型無效的操做 ValueError 傳入無效的參數 UnicodeError Unicode 相關的錯誤 UnicodeDecodeError Unicode 解碼時的錯誤 UnicodeEncodeError Unicode 編碼時錯誤 UnicodeTranslateError Unicode 轉換時錯誤 Warning 警告的基類 DeprecationWarning 關於被棄用的特徵的警告 FutureWarning 關於構造未來語義會有改變的警告 OverflowWarning 舊的關於自動提高爲長整型(long)的警告 PendingDeprecationWarning 關於特性將會被廢棄的警告 RuntimeWarning 可疑的運行時行爲(runtime behavior)的警告 SyntaxWarning 可疑的語法的警告 UserWarning 用戶代碼生成的警告 什麼是異常？ 異常便是一個事件，該事件會在程序執行過程當中發生，影響了程序的正常執行。 通常狀況下，在Python沒法正常處理程序時就會發生一個異常。 異常是Python對象，表示一個錯誤。 當Python腳本發生異常時咱們須要捕獲處理它，不然程序會終止執行。 異常處理 捕捉異常能夠使用try/except語句。 try/except語句用來檢測try語句塊中的錯誤，從而讓except語句捕獲異常信息並處理。 若是你不想在異常發生時結束你的程序，只需在try裏捕獲它。 語法： 如下爲簡單的try....except...else的語法： try: <語句> #運行別的代碼 except <名字>： <語句> #若是在try部份引起了'name'異常 except <名字>，<數據>: <語句> #若是引起了'name'異常，得到附加的數據 else: <語句> #若是沒有異常發生 try的工做原理是，當開始一個try語句後，python就在當前程序的上下文中做標記，這樣當異常出現時就能夠回到這裏，try子句先執行，接下來會發生什麼依賴於執行時是否出現異常。 若是當try後的語句執行時發生異常，python就跳回到try並執行第一個匹配該異常的except子句，異常處理完畢，控制流就經過整個try語句（除非在處理異常時又引起新的異常）。 若是在try後的語句裏發生了異常，卻沒有匹配的except子句，異常將被遞交到上層的try，或者到程序的最上層（這樣將結束程序，並打印默認的出錯信息）。 
若是在try子句執行時沒有發生異常，python將執行else語句後的語句（若是有else的話），而後控制流經過整個try語句。 使用except而不帶任何異常類型 你能夠不帶任何異常類型使用except，以下實例： try: 正常的操做 ...................... except: 發生異常，執行這塊代碼 ...................... else: 若是沒有異常執行這塊代碼 以上方式try-except語句捕獲全部發生的異常。但這不是一個很好的方式，咱們不能經過該程序識別出具體的異常信息。由於它捕獲全部的異常。 使用except而帶多種異常類型 你也能夠使用相同的except語句來處理多個異常信息，以下所示： try: 正常的操做 ...................... except(Exception1[, Exception2[,...ExceptionN]]]): 發生以上多個異常中的一個，執行這塊代碼 ...................... else: 若是沒有異常執行這塊代碼 try-finally 語句 try-finally 語句不管是否發生異常都將執行最後的代碼。 try: <語句> finally: <語句> #退出try時總會執行 try: fh = open("testfile", "w") fh.write("這是一個測試文件，用於測試異常!!") finally: print "Error: 沒有找到文件或讀取文件失敗" 若是打開的文件沒有可寫權限，輸出以下所示： try: fh = open("testfile", "w") try: fh.write("這是一個測試文件，用於測試異常!!") finally: print "關閉文件" fh.close() except IOError: print "Error: 沒有找到文件或讀取文件失敗" 當在try塊中拋出一個異常，當即執行finally塊代碼。 finally塊中的全部語句執行後，異常被再次觸發，並執行except塊代碼。 變量接收的異常值一般包含在異常的語句中。在元組的表單中變量能夠接收一個或者多個值。 元組一般包含錯誤字符串，錯誤數字，錯誤位置。 觸發異常 咱們能夠使用raise語句本身觸發異常 raise語法格式以下： raise [Exception [, args [, traceback]]] 語句中 Exception 是異常的類型（例如，NameError）參數標準異常中任一種，args 是自已提供的異常參數。 最後一個參數是可選的（在實踐中不多使用），若是存在，是跟蹤異常對象。 實例 一個異常能夠是一個字符串，類或對象。 Python的內核提供的異常，大多數都是實例化的類，這是一個類的實例的參數。 定義一個異常很是簡單，以下所示： def functionName( level ): if level < 1: raise Exception("Invalid level!", level) # 觸發異常後，後面的代碼就不會再執行 注意：爲了可以捕獲異常，"except"語句必須有用相同的異常來拋出類對象或者字符串。 例如咱們捕獲以上異常，"except"語句以下所示： try: 正常邏輯 except Exception,err: 觸發自定義異常 else: 其他代碼 實例 #!/usr/bin/python # -*- coding: UTF-8 -*- # 定義函數 def mye( level ): if level < 1: raise Exception,"Invalid level!" # 觸發異常後，後面的代碼就不會再執行 try: mye(0) # 觸發異常 except Exception,err: print 1,err else: print 2 執行以上代碼，輸出結果爲： $ python test.py 1 Invalid level! 用戶自定義異常 經過建立一個新的異常類，程序能夠命名它們本身的異常。異常應該是典型的繼承自Exception類，經過直接或間接的方式。

Python3 errors and exceptions. Original link: https://www.runoob.com/python3/python3-errors-execptions.html

異常 即使 Python 程序的語法是正確的，在運行它的時候，也有可能發生錯誤。運行期檢測到的錯誤被稱爲異常。 大多數的異常都不會被程序處理，都以錯誤信息的形式展示在這裏: 實例 >>> 10 * (1/0) # 0 不能做爲除數，觸發異常 Traceback (most recent call last): File "<stdin>", line 1, in ? ZeroDivisionError: division by zero >>> 4 + spam*3 # spam 未定義，觸發異常 Traceback (most recent call last): File "<stdin>", line 1, in ? NameError: name 'spam' is not defined >>> '2' + 2 # int 不能與 str 相加，觸發異常 Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: can only concatenate str (not "int") to str 異常以不一樣的類型出現，這些類型都做爲信息的一部分打印出來: 例子中的類型有 ZeroDivisionError，NameError 和 TypeError。 錯誤信息的前面部分顯示了異常發生的上下文，並以調用棧的形式顯示具體信息。 異常處理 try/except 異常捕捉能夠使用 try/except 語句。

如下例子中，讓用戶輸入一個合法的整數，可是容許用戶中斷這個程序（使用 Control-C 或者操做系統提供的方法）。用戶中斷的信息會引起一個 KeyboardInterrupt 異常。 while True: try: x = int(input("請輸入一個數字: ")) break except ValueError: print("您輸入的不是數字，請再次嘗試輸入！") try 語句按照以下方式工做； 首先，執行 try 子句（在關鍵字 try 和關鍵字 except 之間的語句）。 若是沒有異常發生，忽略 except 子句，try 子句執行後結束。 若是在執行 try 子句的過程當中發生了異常，那麼 try 子句餘下的部分將被忽略。若是異常的類型和 except 以後的名稱相符，那麼對應的 except 子句將被執行。 若是一個異常沒有與任何的 except 匹配，那麼這個異常將會傳遞給上層的 try 中。 一個 try 語句可能包含多個except子句，分別來處理不一樣的特定的異常。最多隻有一個分支會被執行。 處理程序將只針對對應的 try 子句中的異常進行處理，而不是其餘的 try 的處理程序中的異常。 一個except子句能夠同時處理多個異常，這些異常將被放在一個括號裏成爲一個元組，例如: except (RuntimeError, TypeError, NameError): pass try/except...else try/except 語句還有一個可選的 else 子句，若是使用這個子句，那麼必須放在全部的 except 子句以後。 else 子句將在 try 子句沒有發生任何異常的時候執行。

如下實例在 try 語句中判斷文件是否能夠打開，若是打開文件時正常的沒有發生異常則執行 else 部分的語句，讀取文件內容： for arg in sys.argv[1:]: try: f = open(arg, 'r') except IOError: print('cannot open', arg) else: print(arg, 'has', len(f.readlines()), 'lines') f.close() 使用 else 子句比把全部的語句都放在 try 子句裏面要好，這樣能夠避免一些意想不到，而 except 又沒法捕獲的異常。 異常處理並不只僅處理那些直接發生在 try 子句中的異常，並且還能處理子句中調用的函數（甚至間接調用的函數）裏拋出的異常。例如: >>> def this_fails(): x = 1/0 >>> try: this_fails() except ZeroDivisionError as err: print('Handling run-time error:', err) Handling run-time error: int division or modulo by zero try-finally 語句 try-finally 語句不管是否發生異常都將執行最後的代碼。

如下實例中 finally 語句不管異常是否發生都會執行： 實例 try: runoob() except AssertionError as error: print(error) else: try: with open('file.log') as file: read_data = file.read() except FileNotFoundError as fnf_error: print(fnf_error) finally: print('這句話，不管異常是否發生都會執行。') 拋出異常 Python 使用 raise 語句拋出一個指定的異常。 raise語法格式以下： raise [Exception [, args [, traceback]]]

如下實例若是 x 大於 5 就觸發異常: x = 10 if x > 5: raise Exception('x 不能大於 5。x 的值爲: {}'.format(x)) 執行以上代碼會觸發異常： Traceback (most recent call last): File "test.py", line 3, in <module> raise Exception('x 不能大於 5。x 的值爲: {}'.format(x)) Exception: x 不能大於 5。x 的值爲: 10 raise 惟一的一個參數指定了要被拋出的異常。它必須是一個異常的實例或者是異常的類（也就是 Exception 的子類）。 若是你只想知道這是否拋出了一個異常，並不想去處理它，那麼一個簡單的 raise 語句就能夠再次把它拋出。 >>> try: raise NameError('HiThere') except NameError: print('An exception flew by!') raise An exception flew by! Traceback (most recent call last): File "<stdin>", line 2, in ? NameError: HiThere 用戶自定義異常 你能夠經過建立一個新的異常類來擁有本身的異常。異常類繼承自 Exception 類，能夠直接繼承，或者間接繼承，例如: >>> class MyError(Exception): def __init__(self, value): self.value = value def __str__(self): return repr(self.value) >>> try: raise MyError(2*2) except MyError as e: print('My exception occurred, value:', e.value) My exception occurred, value: 4 >>> raise MyError('oops!') Traceback (most recent call last): File "<stdin>", line 1, in ? __main__.MyError: 'oops!' 在這個例子中，類 Exception 默認的 __init__() 被覆蓋。 當建立一個模塊有可能拋出多種不一樣的異常時，一種一般的作法是爲這個包創建一個基礎異常類，而後基於這個基礎類爲不一樣的錯誤狀況建立不一樣的子類: class Error(Exception): """Base class for exceptions in this module.""" pass class InputError(Error): """Exception raised for errors in the input. Attributes: expression -- input expression in which the error occurred message -- explanation of the error """ def __init__(self, expression, message): self.expression = expression self.message = message class TransitionError(Error): """Raised when an operation attempts a state transition that's not allowed. Attributes: previous -- state at beginning of transition next -- attempted new state message -- explanation of why the specific transition is not allowed """ def __init__(self, previous, next, message): self.previous = previous self.next = next self.message = message 大多數的異常的名字都以"Error"結尾，就跟標準的異常命名同樣。 定義清理行爲 try 語句還有另一個可選的子句，它定義了不管在任何狀況下都會執行的清理行爲。 例如: >>> try: ... raise KeyboardInterrupt ... finally: ... print('Goodbye, world!') ... 
Goodbye, world! Traceback (most recent call last): File "<stdin>", line 2, in <module> KeyboardInterrupt 以上例子無論 try 子句裏面有沒有發生異常，finally 子句都會執行。 若是一個異常在 try 子句裏（或者在 except 和 else 子句裏）被拋出，而又沒有任何的 except 把它截住，那麼這個異常會在 finally 子句執行後被拋出。 預約義的清理行爲 一些對象定義了標準的清理行爲，不管系統是否成功的使用了它，一旦不須要它了，那麼這個標準的清理行爲就會執行。 這面這個例子展現了嘗試打開一個文件，而後把內容打印到屏幕上: for line in open("myfile.txt"): print(line, end="") 以上這段代碼的問題是，當執行完畢後，文件會保持打開狀態，並無被關閉。 關鍵詞 with 語句就能夠保證諸如文件之類的對象在使用完以後必定會正確的執行他的清理方法: with open("myfile.txt") as f: for line in f: print(line, end="") 以上這段代碼執行完畢後，就算在處理過程當中出問題了，文件 f 老是會關閉。

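Python assert (assertions). assert evaluates an expression and raises an exception when the expression is false. An assertion lets the program fail immediately when a required condition does not hold, instead of running on and crashing later.

The syntax is: assert expression, which is equivalent to: if not expression: raise AssertionError. assert can also take an argument: assert expression [, arguments], which is equivalent to: if not expression: raise AssertionError(arguments)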
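A minimal sketch of both assert forms in use. The average helper is a hypothetical function, not part of the original notes.

```python
def average(values):
    # Hypothetical helper: the assert documents a condition that must hold
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)


print(average([1, 2, 3]))

try:
    average([])
except AssertionError as err:
    # The second operand of assert becomes the exception argument
    message = str(err)
    print("assertion failed:", message)
```

Assertions are removed when Python runs with the -O option, so they suit internal sanity checks rather than validation of user input.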

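Many thanks to the Runoob tutorial site (菜鳥教程): https://www.runoob.com/ I often consulted it while studying, to pick up programming knowledge.
Links to the original articles are included above. Reading them at the source is recommended, since the formatting here is much worse.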

類 鞏固小結 ''' 1.當方法具備結構時,使用 class 比函數要好 2.類中定義的 方法 或 屬性 若是沒有聲明權限 在類外使用實例化對象能夠直接訪問 3.類中的方法,第一個參數必定要寫上 self ,self 是約定好的 4.析構方法一般調用不到,垃圾回收機制中引用計數後會自動銷燬 5.寫程序以前： 僞代碼 小的程序->直接寫流程 大項目-> 先分析結構 6.父類一般具備一些公共的方法,使用子類進行擴展 子類能夠使用父類中定義的初始化方法 __init__ , 可是參數可能不全 7.找到最小的節點,將節點與節點之間發生的關係使用不一樣的類進行標識 8._xxx, __xxx 外界都不該該直接訪問,放到類中的方法中使用 9.方法必定要先實現再進行優化 10.使用裝飾器 @property 後 下面的方法會變爲屬性 @property def function(self,args): pass 類實例化對象.function 調用 function 方法 11.@staticmethod 靜態方法,下面的方法聲明爲"內置方法" 不須要再使用 self 進行調用 @staticmethod def function(object):pass 12.最小驚訝原則 讓不少人一看就知道在寫什麼 13.注意代碼的複雜度 ''' # 示例： class stu(object): # 繼承 object 基類 a = 'a' # 定義一個屬性 def __init__(self,name,age): # 初始化賦值 self.name = name self.age = age def showName(self): print(self.name) def __del__(self): # 析構方法,一般自動銷燬 del self.name del self.age student = stu('張三',20) student.showName() print(student.a) # 輸出類中的屬性 # a class stu_extends(stu): """ 繼承 stu 類""" pass class lst_extends(list): def append(self): pass # 繼承列表類

模塊小結 1.import 模塊名(文件) 導入模塊 2.查看模塊內的方法 dir(模塊名) 3.if __name__ == "__main__": 須要測試的語句 4.from 模塊名 import 方法/屬性/* 再使用時 使用 方法便可 再也不使用模塊名.方法 5.使用 __all__ = ["方法1","方法2","屬性名1","屬性名2"] 在導入 * 時,只導入該 all 中的方法或屬性 在 __init__.py 文件第一行 6.import 模塊名(文件) as 模塊別名 7.搜索模塊： import sys sys.path.append(r'絕對路徑') 絕對路徑指 導入的模塊的 文件夾的位置 a.py 程序： import linecache print(linecache.__file__) def show( ): print("我是 0411 的模塊") 程序： import linecache # 導入模塊 print(dir(linecache)) # 查看都具備哪些方法 ''' ['__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'cache', 'checkcache', 'clearcache', 'functools', 'getline', 'getlines', 'lazycache', 'os', 'sys', 'tokenize', 'updatecache'] ''' linecache.__file__ # 查看模塊地址 # F:\Python IDLE\lib\linecache.py from math import pi import sys sys.path.append(r'D:\看法\Python\Python代碼\學習\0411') if __name__ == "__main__": # 執行文件時,進行測試 print(linecache.__file__) # F:\Python IDLE\lib\linecache.py print(pi) # 使用 math 中的 pi 屬性 # 3.141592653589793 import a # 導入 0411 的 a.py 文件 a.show() # 我是 0411 的模塊

Part of the material below consolidates earlier content, so some of it is repeated.

異常 鞏固1 ''' 1.索引異常 IndexError: list index out of range 2.語法異常 SyntaxError 3.縮進異常 IndentationError: unexpected indent 4.try 語句完整形態：try except else finally 5.try 內的語句 出錯以後不會運行出現異常以後的 try 內語句 6.開發某些功能時 任何地方均可能會出錯 一般參數傳遞過來時 讀取某些未知文件時 打開某個網頁時 7.except 捕獲正確的異常,對異常進行處理 程序：''' # lst = [1,2,3,4,5] # print(lst[5]) # 索引異常,不存在下標爲 5 的元素 # IndexError: list index out of range # print 444 # 語法異常 # SyntaxError # print(444) # 縮進異常 # IndentationError: unexpected indent lst = [1,2,3,4,5] try : print(lst[5]) print("出錯以後不會運行出現異常以後的語句") except IndexError as e : '''try 出現異常時執行''' print("出現索引異常") else: '''try 正常運行時執行''' print("程序運行 OK, 沒有問題") finally: print("不管是否出錯必定會運行到 finally") # 出現索引異常 # 不管是否出錯必定會運行到 finally

異常 鞏固2 ''' 1.找到可能會拋出異常的地方,僅對這幾行代碼進行異常處理 2.明確會出現的異常類型 縮進,類型,語法,索引等等 3.捕獲出現的異常 import sys exc = sys.exc_info() exc[1] 爲問題出現的緣由 4.日誌 logging 模塊 import logging logger = logging.getLogger() # 獲取日誌對象 logfile = 'test.log' hdlr = logging.FileHandler('senging.txt') # 存儲文件日誌 formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s') # 以什麼格式進行存儲,時間,等級,日誌信息 hdlr.setFormatter(formatter) # 導入日誌格式 logger.addHandler(hdlr) # 將日誌綁定 logger.setLevel(logging.NOTSET) # 設置日誌級別 5.斷言 assert assert 表達式,出錯之後拋出的提示信息 表達式 ： 1 > 4 3 > 2 1 == 2 斷言絕對不能發生的錯誤,而後再處理異常 程序：''' import logging logger = logging.getLogger() # 獲取日誌對象 logfile = 'test.log' hdlr = logging.FileHandler('senging.txt') # 存儲文件日誌 formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s') # 以什麼格式進行存儲,時間,等級,日誌信息 hdlr.setFormatter(formatter) # 導入日誌格式 logger.addHandler(hdlr) # 將日誌綁定 logger.setLevel(logging.NOTSET) # 設置日誌級別 import sys try: print(a) except: exc = sys.exc_info() print(exc[1]) # 查看異常的問題 # name 'a' is not defined print(exc[0]) # <class 'NameError'> print(exc) # (<class 'NameError'>, NameError("name 'a' is not defined"), # <traceback object at 0x000002A8BD9DA188>) logging.debug(exc[1])

Note: logging is quite handy. The line below is what ends up in the log file; errors are saved there so they can be reviewed later. 2020-07-25 09:51:01,344 DEBUG name 'a' is not defined

logging 日誌基礎 import logging logger = logging.getLogger() # 獲取日誌對象 logfile = 'test.log' hdlr = logging.FileHandler('senging.txt') # 存儲文件日誌 formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s') # 以什麼格式進行存儲,時間,等級,日誌信息 hdlr.setFormatter(formatter) # 導入日誌格式 logger.addHandler(hdlr) # 將日誌綁定 logger.setLevel(logging.NOTSET) # 設置日誌級別 注:此處只是顯示了擁有哪些方法,具體實例還須要進行查閱相關資料

異常 鞏固3 ''' 1.with open("文件路徑","模式") as fp: 操做 進入時 調用 __enter__ 方法 def __enter__(self): print("開始執行 with 方法") 退出時 調用 __exit__ 方法 def __exit__(self,type,value,traceback): print("退出 with 方法") 2.文件操做方法: 打開、讀取、關閉 d = open('a','r') d.read() d.close() 3.能夠本身定義異常,繼承 Exception 類 程序：''' # 查看 with 執行的方法 class sth(object): def __enter__(self): print("開始執行 with 方法") def __exit__(self,type,value,traceback): print("退出 with 方法") with sth( ) as fp: # with 自動關閉文件 pass # 自定義異常 class myException(Exception): # 繼承 Exception def __init__(self,error,msg): self.args = (error,msg) self.error = error self.msg = msg try: raise myException(1,'my exception') except Exception as e : print(str(e)) # (1, 'my exception')


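Multithreading review 1
'''
1. When a process starts it creates a main thread; every process has exactly one main thread
2. A child process is identified by a unique pid
3. At any moment only one thread is executing Python code, and the execution order is not deterministic (a consequence of the global interpreter lock)
4. threading.Thread(target = test, args = [i])
   target = function name, args = [arguments]
5. You can also subclass threading.Thread and override the run method
   class myThread(threading.Thread):
       def run(self):
           pass
Program:
'''
import threading  # import the threading library

def test():
    print(1)

a = threading.Thread(target = test)  # create thread a
b = threading.Thread(target = test)
a.start()  # start the threads
b.start()
a.join()
b.join()  # join makes the main thread wait for each thread before it finishes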

import threading # 導入線程庫 import time def test(i): time.sleep(0.1) print(i) tests_thread = [] for i in range(0,10): threads = threading.Thread(target = test,args = [i]) # 建立 a 線程 tests_thread.append(threads) for i in tests_thread: i.start() # 啓動線程 for i in tests_thread: i.join() # 結束以前使用 join 等待其餘線程 print("線程結束") ''' 運行結果 2 0 1 8 7 9 6 534 線程結束 '''

多線程複習2 import time def func_a(): print("a 函數開始") time.sleep(2) print("a 函數結束") def func_b(): print("b 函數開始") time.sleep(2) print("b 函數結束") b_time = time.time() func_a() func_b() print(time.time() - b_time) # 查看運行多少秒 ''' 運行結果: a 函數開始 a 函數結束 b 函數開始 b 函數結束 4.00050163269043 '''

import threading import time def func_a(): print("a 函數開始") time.sleep(2) print("a 函數結束") def func_b(): print("b 函數開始") time.sleep(2) print("b 函數結束") b_time = time.time() _a = threading.Thread(target = func_a) _b = threading.Thread(target = func_b) _a.start() _b.start() # 開始 _a.join() _b.join() # 等待 print(time.time() - b_time) # 查看時間 ''' 運行結果: a 函數開始 b 函數開始 b 函數結束a 函數結束 2.001542091369629 '''

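Comparing the two runs, the threaded version finishes in about 2 seconds instead of about 4. The gain comes from the two time.sleep(2) calls overlapping: a sleeping thread releases the interpreter lock, so waiting (I/O-bound) work benefits from threads, while CPU-bound Python code generally does not.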

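Acquiring and releasing a lock in threads
# acquire and release
import threading

mlock = threading.Lock()  # create a lock, mlock
# if one thread may need to acquire the lock again while already holding it,
# use the reentrant lock threading.RLock() so the thread does not deadlock on itself
num = 0

def a():
    global num
    mlock.acquire()  # acquire the lock
    num += 1
    mlock.release()  # release the lock
    print(num)

for i in range(10):
    d = threading.Thread(target = a)
    d.start()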
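The acquire/release pair can also be written with a with block, which releases the lock even if the protected code raises. The sketch below is an illustration under assumed names (add_many, counter, and the thread and iteration counts are not from the original notes); it joins every thread so the final total can be checked.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(times):
    """Increment the shared counter `times` times under the lock."""
    global counter
    for _ in range(times):
        with lock:  # acquire on entry, release on exit, even on error
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every worker before reading the result

print(counter)
```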

''' 1.協程,微型進程: yield 生成器 yield 會保存聲明的變量,能夠進行迭代 使用 接收函數返回的對象.__next__() next(接收函數返回的對象) .send() 方法 傳遞給函數中 yield 聲明的對象 x = yield i 會發送給 x 變量 若是一直沒有使用 send() ,x 值一直爲 None 賦值以後若是沒有修改則 x 一直爲 send 後的值 2.此時 x 的值爲 None ,並無將 i 賦值給 x x = yield i 程序：''' # 建立一個包含 yield 聲明的函數 def test_yield(): i = 0 a = 4 while i < a: x = yield i # x 經過 gener 進行賦值 i += 1 # 使用 .__next__() 查看迭代對象 gener = test_yield() print(gener.__next__()) # 0 print(gener.__next__()) # 1 print(next(gener)) # 2 gener.send("x 經過 gener 進行賦值") for i in test_yield(): # i 在 test_yield 中 yield 聲明的迭代對象中 print(i,end = " ") # 0 1 2 3
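To make the behaviour of x = yield i and send() more concrete, here is a small sketch of a generator that receives values through send() and yields a running average. The function name running_average and the sample values are assumptions for illustration.

```python
def running_average():
    """Generator-based coroutine: receives numbers via send(), yields their average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # send(v) resumes here with value = v
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)  # prime the generator: run up to the first yield
results = [avg.send(v) for v in (10, 20, 30)]
print(results)
```

The first next() call is required because send() with a non-None value is only allowed once the generator is paused at a yield.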

解決素數(質數) def is_sushu(int_num): # 判斷輸入的數是否爲質數 if int_num == 1: return False if int_num == 2: return True else: for i in range(2,int_num): if int_num % i == 0: return False return True def _get_sushu(max_num): return [i for i in range(1,max_num) if is_sushu(i)] # 使用列表推導式 if __name__ == "__main__": a = _get_sushu(101) # 返回判斷素數的列表 print(a)
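The check above tries every divisor below the number. A common variation, sketched here as a suggestion rather than the author's method, stops at the integer square root, because any factor larger than the square root pairs with one smaller than it. math.isqrt requires Python 3.8 or later; the names is_prime and primes are assumptions.

```python
import math


def is_prime(n):
    """Return True if n is prime, testing divisors only up to sqrt(n)."""
    if n < 2:
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True


primes = [n for n in range(1, 101) if is_prime(n)]
print(primes)
```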

爬蟲流程複習 設置爬蟲終端： URL 管理器 -> 網頁下載器 -> 網頁解析器 -> 產生價值數據 URL 管理器判斷爬取網頁連接 流程： 調度器詢問 URL 管理器,是否存在要爬取的 URL URL 管理器返回 是或否 調度器 從 URL 管理器中 取出一個 URL URL 管理器 將 URL 傳遞給調度器 調度器將 URL 發送到下載器 下載器將 URL 下載的內容傳遞給調度器 調度器將 URL 下載的內容傳遞給解析器 解析器解析後傳遞給調度器 此時能夠收集價值數據 調度器再將須要爬取的 URL 傳遞給 URL管理器 一直到沒有須要爬取的 URL URL 管理器： 管理待爬取的 URL 集合和已經爬取的 URL 集合 使用管理器是爲了防止重複抓取和防止重複抓取一個 URL URL 功能： 添加新的 URL 到待爬取的集合中 肯定待添加的 URL 是否在 URL 中 獲取待爬取的 URL 將 URL 從待爬取的移動到已爬取的集合中 判斷是否還有待爬取的數據 URL 管理器實現方式： 將 待爬取的 和 已爬取的 URL 存儲在集合中 set() 將 URL 存儲在 關係數據庫中,區分 URL 是待爬取仍是已經爬取 MySQL urls(url,is_crawled) 緩存數據庫 redis 網頁下載器： 將 URL 對應的網頁轉換爲 HTML 數據 存儲到本地文件或者內存字符串中 requests 、 urllib 庫實現下載 特殊情景處理器： 須要使用 Cookie 訪問時：HTTPCookieProcessor 須要使用 代理 訪問時：ProxyHandler 須要使用 加密 訪問時：HTTPHandler 網頁存在跳轉關係訪問時：HTTPRedirectHandler 網頁解析器： 從網頁提取有價值的數據 HTML 網頁文檔字符串 提取出價值數據 提取出新的 URL 列表 正則表達式 -> 模糊匹配 文檔做爲字符串,直接匹配 html.parser BeautifulSoup -> 能夠使用 html.parser 和 lxml 從 HTML 和 XHTML 中提取數據 語法： 建立 BeautifulSoup 對象 搜索節點 findall find 訪問節點(名稱,屬性,文字) lxml ->結構化解析 DOM 樹 進行上下級的遍歷 html head title 文本 body a href 文本 div 文本 爬蟲： 肯定目標 分析目標 URL 格式 數據的連接 數據的格式 網頁編碼 編寫代碼 執行爬蟲
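The URL manager described above (a to-crawl set and a crawled set, kept in memory) could look roughly like this. It is a sketch under assumptions: the class name, method names, and example.com URLs are all hypothetical, and no page is actually downloaded.

```python
class UrlManager:
    """Tracks URLs waiting to be crawled and URLs already crawled."""

    def __init__(self):
        self.new_urls = set()  # waiting to be crawled
        self.old_urls = set()  # already crawled

    def add_new_url(self, url):
        # Skip empty values and anything seen before, to avoid repeat crawling
        if url and url not in self.new_urls and url not in self.old_urls:
            self.new_urls.add(url)

    def has_new_url(self):
        return bool(self.new_urls)

    def get_new_url(self):
        # Move one URL from the waiting set to the crawled set
        url = self.new_urls.pop()
        self.old_urls.add(url)
        return url


manager = UrlManager()
manager.add_new_url("https://example.com/a")
manager.add_new_url("https://example.com/a")  # duplicate, ignored
manager.add_new_url("https://example.com/b")

crawled = []
while manager.has_new_url():
    crawled.append(manager.get_new_url())

manager.add_new_url("https://example.com/a")  # already crawled, ignored
print(sorted(crawled))
```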

列表經常使用方法複習: 列表的經常使用操做： 1.使用 索引下標 或 切片 查找對應元素的值 修改特定位置上的值 2.刪除列表元素 del 對象 對象.pop(index=-1) 對象.remove(元素) 對象.clear() 3.查看列表長度 len(對象) 4.重複 n 次 對象 * n 5.拼接兩個列表對象 對象 + 對象 6.查看某一個元素是否在對象中 元素 in 對象 7.列表做爲可迭代對象使用 for i in 對象 for i in range(len(對象)) 8.列表能夠嵌套 [[],[],[]] 9.列表內元素類型能夠是任意類型 [任意類型,任意類型] 10.查看列表中最大值 最小值 max(對象) min(對象) 11.將其餘類型數據轉換爲列表對象 list(對象) list(可迭代對象) 12.在尾部增長元素 總體添加 對象.append(元素) 解包添加 對象.extend(元素) 13.在任意位置添加 對象.insert(index,元素) 14.排序 對象.sort() 倒序 對象.reverse() 15.查看元素索引位置 對象.index(元素)

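Dictionary operations review
Common dictionary operations:
1. Create a dictionary
   dict.fromkeys(seq[, value])
   dic = {key1: value1, key2: value2}
   dic = dict(zip([keys], [values]))
2. Access an element by key: obj['key']
3. Modify an element: obj['key'] = value
4. Delete an element or the whole dictionary
   del obj['key']
   del obj
   obj.clear()
   obj.pop(key)
   obj.popitem()
5. Get the number of entries: len(obj)
6. Copy or merge
   obj.copy()
   obj.update(obj2)
7. Get the value for a key, with a fallback: obj.get(key[, default])
8. Check whether a key is in the dictionary: key in obj
9. Get the keys, the values, or the key-value pairs
   obj.keys()
   obj.values()
   obj.items()
10. Set a default for a key; this has no effect if the key already has a value
   obj.setdefault(key, default = None)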

Turn "089,0760,009" into 89,760,9
remove_zeros = lambda s: ','.join(map(lambda sub: str(int(sub)), s.split(',')))
remove_zeros("089,0760,009")  # '89,760,9'

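Review of the most common Linux operations
1. ctrl + shift + =   enlarge the terminal font
2. ctrl + -           shrink the terminal font
3. ls                 list the contents of the current directory
4. pwd                show the current directory
5. cd dirname         change directory
6. touch filename     create the file if it does not exist
7. mkdir dirname      create a directory
8. rm filename        delete the named file
9. clear              clear the screen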

Connecting Python to a database: MySQLdb version
import MySQLdb  # MySQL client library

class MysqlMethod(object):

    def __init__(self):
        # precondition: the database must be reachable
        self.get_connect()

    def get_connect(self):
        # open the connection
        try:
            self.conn = MySQLdb.connect(
                host = '127.0.0.1',   # host
                user = 'root',        # user name
                passwd = 'root',      # password
                db = 'python_prac',   # database
                port = 3306,          # port
                charset = 'utf8'      # avoids character-encoding problems
            )
        except MySQLdb.Error as e:
            print("Error while connecting to the database")
            print("Details:\n %s" % e)
        else:
            print("Connected to MySQL")

    def close_connect(self):
        # close the database connection
        try:
            self.conn.close()
        except MySQLdb.Error as e:
            print("Error while closing the database")
            print("Details:\n %s" % e)
        else:
            print("Closed, see you next time")

    def get_onedata(self):
        # fetch one row
        cursor = self.conn.cursor()  # get a cursor
        sql = 'select * from students where age between %s and %s'  # query
        cursor.execute(sql, (15, 25))  # execute(statement, (parameters))
        result = dict(zip([k[0] for k in cursor.description], cursor.fetchone()))
        '''
        zip(column names, fetched values)
        dictionary keys: the column names from cursor.description
        dictionary values: the fetched row
        Example:
            lst_keys = ['a','b','c']
            lst_values = [1,2,3]
            dict(zip(lst_keys,lst_values))
        Result: {'a': 1, 'b': 2, 'c': 3}
        '''
        # the row tuple becomes a dictionary, so values can be looked up by column name
        print("Fetched one row:")
        return result

    def get_moredata(self, page, page_size):
        # fetch several rows (one page)
        offset = (page - 1) * page_size  # starting position
        cursor = self.conn.cursor()
        sql = 'select * from students where age between %s and %s limit %s,%s;'
        cursor.execute(sql, (15, 45, offset, page_size))
        result = list(dict(zip([k[0] for k in cursor.description], row)) for row in cursor.fetchall())
        '''
        zip pairs each column name with the matching value in the row,
        but the result has type <class 'zip'> and must be converted to be readable
            zip([k[0] for k in cursor.description], row)          ('id', 1) ...
        dict turns the zip object into a dictionary
            dict(zip([k[0] for k in cursor.description], row))    {'id': 1, ...}
        the generator expression does this for every row in the result set
            pattern: [operation on element for element in sequence]
        '''
        print("Fetched several rows:")
        # result has the form [{}, {}]
        return result

    def insert_onedata(self):
        # insert one row
        try:
            sql = "insert into stu_0415(name,school) values (%s,%s);"  # insert statement
            cursor = self.conn.cursor()  # get a cursor
            need_info = ('王五', '廈大')  # the data to insert
            cursor.execute(sql, need_info)  # run the statement
            self.conn.commit()  # commit; without it the database does not change
        except MySQLdb.Error:
            print("Insert failed")
            self.conn.rollback()  # undo anything that was partly written
        else:
            print("Insert succeeded")

def main():
    sql_obj = MysqlMethod()  # create the helper object
    data = sql_obj.get_onedata()  # fetch one row
    print(data)
    moredata = sql_obj.get_moredata(1, 5)  # first page, 5 rows
    for item in moredata:
        print(item)  # print each dictionary
        print("-------------")
    sql_obj.insert_onedata()  # insert one row

if __name__ == '__main__':
    main()  # run the main program
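The dict(zip(cursor.description, row)) idiom can be tried without a MySQL server. The sketch below swaps in the standard-library sqlite3 module with an in-memory database, so it is not the MySQLdb code from the notes: sqlite3 uses ? placeholders where MySQLdb uses %s, and the table contents are made-up sample data.

```python
import sqlite3

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "create table students (id integer primary key, name text, age integer)"
)
cursor.executemany(
    "insert into students (name, age) values (?, ?)",
    [("Zhang San", 20), ("Li Si", 30), ("Wang Wu", 17)],
)
conn.commit()

# Parameterized query, same shape as the MySQL version
cursor.execute(
    "select * from students where age between ? and ? order by id", (15, 25)
)

# cursor.description holds one tuple per column; element 0 is the column name
columns = [col[0] for col in cursor.description]
rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
print(rows)

conn.close()
```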

Show the values that occur more than once in a list
lst = [1,2,3,2,1,5,5]
lst = list(filter(lambda x: lst.count(x) != 1, lst))

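This combines filter with a lambda; x is each element of lst. Every element whose count is not 1 is kept in its original position, so the result is [1, 2, 2, 1, 5, 5].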
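Calling lst.count(x) for every element rescans the list each time. An alternative sketch, offered as a suggestion rather than the author's approach, counts everything once with collections.Counter; the variable names are assumptions.

```python
from collections import Counter

lst = [1, 2, 3, 2, 1, 5, 5]
counts = Counter(lst)  # one pass over the list

# Keep every occurrence of a repeated value, in original order
duplicates = [x for x in lst if counts[x] > 1]

# Or list each repeated value only once
repeated_values = sorted(value for value, n in counts.items() if n > 1)

print(duplicates)
print(repeated_values)
```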

應用場景 列表經常使用場景： 存儲不一樣類型的數據 任意類型都可 列表存儲相同類型的數據 類 node結點 next、data 經過迭代遍歷,在循環體內部(多爲 while 內),對列表的每一項都進行遍歷 樹的深度遍歷等等 列表推導式的使用等等 元組經常使用場景： 做爲函數的參數和返回值 傳遞任意多個參數 *args 函數內爲元組形式 一次返回多個數據 return (a,b) 或 a,b return a,b,c 接收函數返回值時 value1,value2 = 函數(參數) 函數(參數)即爲 return (a,b) 格式化字符串 s1 = "%s %s" s2 = ('hello','world') s1%s2 'hello world' 數據庫execute語句 cursor.execute(sql,(15,25)) 讓列表不能夠被修改,保護數據 tuple(list對象) 字典經常使用場景： for in 遍歷字典 dic = {'a':1,'b':2,'c':3} 遍歷鍵值： for key in dic.keys(): print(key,end = " ") 遍歷值： for value in dic.values(): print(value,end = " ") 遍歷鍵值對： for key,value in dic.items(): print(key,":",value,end = " ") 使用字典(多個鍵值對)存儲一個物體的信息 {"name":"張三","age":23} 將多個字典放到列表中,循環時對每個字典進行相同操做 students_info = [{"name":"張三","age":23},{"name":"李四","age":22}] 訪問張三數據: students_info[0]['name'] 操做 for i in range(len(students_info)): students_info[i] 便可進行操做數據 for stu in students_info: print(stu) 輸出的爲單個字典元素

類 擴展 關於類和對象的理解： 類 -> 設計圖紙,設計應該具備哪些屬性和行爲 對象 -> 使用圖紙製造出來的模型 類中定義普通方法,第一個參數爲 self self能夠修改成別的,但最好仍是不要改變,約定好的 self.屬性 self.方法 調用 self 指向的對象的屬性和行爲 在類外能夠爲實例化對象直接建立屬性,可是該屬性只適用於該對象 不推薦使用,若是必定要使用,必須先建立屬性,後使用方法 在 __init__(self,..) 初始化方法內,定義屬性初始值有利於表達屬性,定義方法 打印類的實例化對象時,實際調用的是 類中的 __str__方法 __str__必須返回字符串,能夠本身定義 若是實例化對象 先使用 del 方法刪除了,那麼不會再執行類中的 __del__方法 保護私有公有對象,保護私有共有方法 在方法內均可以調用 先開發被使用的類,被包含操做的類

複習 裝飾器 ''' 裝飾器的做用 引入日誌 函數執行時間的統計 執行函數前預備處理 執行函數後清理功能 權限校驗等場景 緩存 ''' # 定義一個函數,遵循閉包原則(函數做爲參數) def decorator(func): '''定義一個裝飾器函數''' print("func 函數開始") def wrapper(): # 建立裝飾器內容 print("進行裝飾") func() print("裝飾完畢") print("func 函數結束") return wrapper @decorator # 加載 wrapper 函數,將 wrapper 函數傳遞給使用裝飾器的函數 def house(): print("大房子") house() ''' 運行結果: func 函數開始 func 函數結束 進行裝飾 大房子 裝飾完畢 '''
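One of the uses listed above is timing a function. The sketch below shows a decorator that accepts any arguments and uses functools.wraps to keep the wrapped function's name and docstring. The names timed, last_elapsed, and add are hypothetical and chosen for illustration.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the most recent call took."""

    @functools.wraps(func)  # copy __name__, __doc__, etc. from func
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Stored on the wrapper so callers can inspect it afterwards
            wrapper.last_elapsed = time.perf_counter() - start

    wrapper.last_elapsed = None
    return wrapper


@timed
def add(a, b):
    """Add two numbers."""
    return a + b


result = add(2, 3)
print(result, add.__name__, add.last_elapsed)
```

Without functools.wraps, add.__name__ would report "wrapper", which makes logs and tracebacks harder to read.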

re 正則表達式練習 字符串重複出現 ''' 有一段英文文本,其中有單詞連續重複了 2 次,編寫程序檢查重複的單詞並只保留一個 例: This is a a desk. 輸出 This is a desk. ''' # 方法一 import re x = 'This is a a desk.' # 設置字符串 pattern = re.compile(r'\b(\w+)(\s+\1){1,}\b') # \b 匹配單詞和空格間的位置 # \w 匹配包括下劃線的任何單詞字符 [A-Za-z0-9_] # \s 匹配任何空白字符 # {1,} 大於 1 個 matchResult = pattern.search(x) # 查找這樣的結果 x = pattern.sub(matchResult.group(1),x) # sub 進行替換字符 # group(1) 爲 a group(0) 爲 a a print(x) # This is a desk. # 方法二 import re x = 'This is a a desk.' # 設置字符串 pattern = re.compile(r'(?P<f>\b\w+\b)\s(?P=f)') # # \b 匹配單詞和空格間的位置 # \w 匹配包括下劃線的任何單詞字符 [A-Za-z0-9_] matchResult = pattern.search(x) # 匹配到 a a x = x.replace(matchResult.group(0),matchResult.group(1)) # 字符串對象.replace(舊字符串,新字符串) # print(matchResult.group(0)) # a a # print(matchResult.group(1)) # a print(x) # This is a desk.
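Both methods above search first and then substitute. The same result can be reached in one step with re.sub and a backreference in the replacement, which also handles several repeated words in one text. This is a sketch, and the function name drop_repeated_words is an assumption.

```python
import re


def drop_repeated_words(text):
    """Collapse a word repeated consecutively (two or more times) into one."""
    # \b(\w+)       a whole word, captured as group 1
    # (?:\s+\1\b)+  one or more repeats of that same word after whitespace
    return re.sub(r'\b(\w+)(?:\s+\1\b)+', r'\1', text)


print(drop_repeated_words('This is a a desk.'))
```

The trailing \b inside the repeated group stops "a an" from being treated as a repeat of "a".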

最基本的Tkinter界面操做 ''' 1.建立應用程序主窗口對象 root = Tk() 2.在主窗口中,添加各類可視化組件 btn1 = Button(root) btn1["text"] = "點我" 3.經過幾何佈局管理器,管理組件得大小和位置 btn1.pack() 4.事件處理 經過綁定事件處理程序,響應用戶操做所觸發的事件 def songhua(e): messagebox.showinfo("Message","送你一朵玫瑰花") print("送花花") btn1.bind("<Button-1>",songhua) 5.Tk() 的對象.mainloop() 方法會一直進行事件循環,監聽用戶操做 6.Button() 組件的參數爲 Tk() 對象 Button() 的實例化對象 ["text"] 內容爲顯示在按鈕上的內容 7.from tkinter import messagebox 顯示點擊以後提示的窗口 messagebox.showinfo("Message","送你一朵玫瑰花") 第一個參數爲 標題 第二個參數爲 顯示信息 8.btn1.bind("<Button-1>",songhua) 使用建立好的按鈕對象綁定鼠標事件和對應須要運行的函數 9.root.mainloop() 事件循環,一直監聽用戶操做 程序:''' from tkinter import * from tkinter import messagebox root = Tk() # 建立一個窗口對象 btn1 = Button(root) btn1["text"] = "Submit" btn1.pack() # 將組件對象合理的放在窗口中 def songhua(e): # e 爲事件 event messagebox.showinfo("Message","送你一朵玫瑰花") print("送花花") btn1.bind("<Button-1>",songhua) # <Button-1> 表示鼠標左鍵單擊 root.mainloop() # root.mainloop() 事件循環,一直監聽用戶操做

Tkinter經典寫法 ''' 1.繼承 tkinter.Frame 類,實現類的基本寫法 2.建立 主窗口 及 主窗口大小 位置 及 標題 3.將須要添加的組件放入到類中進行建立, 繼承的 Frame 類須要使用 master 參數做爲父類的初始化使用 4.初始化時,將屬性和方法都進行初始化,此時能夠將 GUI 程序所要實現的功能肯定好 5.在類中定義事件發生時,須要實現的功能 6.self.btn1["command"] = self.kuaJiang btn1["command"] 爲事件發生時進行相應的函數 7.self.btnQuit = Button(self,text = "退出",command = root.destroy) 退出按鈕的寫法 ''' from tkinter import * from tkinter import messagebox class Application(Frame): '''GUI程序經典寫法''' def __init__(self,master = None): super().__init__(master) # super() 表示父類的定義,父類使用 master 參數 self.master = master # 子類定義一個屬性接收傳遞過來的 master 參數 self.pack() # .pack 設置佈局管理器 self.createWidget() # 在初始化時,將按鈕也實現 # master傳遞給父類 Frame 使用後,子類中再定義一個 master 對象 def createWidget(self): '''建立組件''' self.btn1 = Button(self) # self 爲組件容器 self.btn1["text"] = "Hany love Python." # 按鈕的內容爲 btn1["text"]定義的內容 self.btn1.pack() # 最佳位置 self.btn1["command"] = self.kuaJiang # 響應函數 self.btnQuit = Button(self,text = "退出",command = root.destroy) # 設置退出操做 self.btnQuit.pack() def kuaJiang(self): messagebox.showinfo("人艱不拆","繼續努力,你是最棒的!") if __name__ == '__main__': root = Tk() # 定義主窗口對象 root.geometry("200x200+200+300") # 建立大小 root.title("GUI 經典寫法") app = Application(master = root) # 傳遞 master 參數爲 主窗口對象 root.mainloop()

Label 組件基本寫法 ''' 1.width,height 指定區域大小 文本 漢字 2 個字節 2.font 指定字體和字體大小 font(font_name,size) 3.image 顯示在 Label 上的圖像 支持 gif 格式 4.fg 前景色 5.bg 背景色 6.justify 針對多行文字的對齊 left center right 7.self.lab1 = Label(self,text = "Label實現",width = 10,height = 2, bg = 'black',fg = 'white') 8. photo_gif = PhotoImage(file = "images/小熊.gif") self.lab3 = Label(self,image = photo_gif) 將照片傳遞給 photo_gif 而後使用 Label 將圖片變量做爲參數進行傳遞 9.self.lab4 = Label(self,text = " Hany加油\n 人艱不拆！" ,borderwidth = 1,relief = "solid",justify = "right") borderwidth 設置文本線的寬度 justify 表示左對齊 右對齊 ''' from tkinter import * class Application(Frame): '''GUI程序經典寫法''' def __init__(self,master = None): super().__init__(master) # super() 表示父類的定義,父類使用 master 參數 self.master = master # 子類定義一個屬性接收傳遞過來的 master 參數 self.pack() # .pack 設置佈局管理器 self.createWidget() # 在初始化時,將按鈕也實現 # master傳遞給父類 Frame 使用後,子類中再定義一個 master 對象 def createWidget(self): '''建立組件''' self.lab1 = Label(self,text = "Label實現",width = 10,height = 2, bg = 'black',fg = 'white') self.lab1.pack() self.lab2 = Label(self,text = "Labe2實現",width = 10,height = 2, bg = 'black',fg = 'white',font = ("宋體",14)) self.lab2.pack() # 顯示圖像 global photo_gif # 將 photo_gif 設置爲全局變量,防止方法調用後銷燬 photo_gif = PhotoImage(file = "路徑/圖片名.gif") self.lab3 = Label(self,image = photo_gif) self.lab3.pack() # 顯示多行文本 self.lab4 = Label(self,text = " Hany加油\n 人艱不拆！" ,borderwidth = 1,relief = "solid",justify = "right") self.lab4.pack() if __name__ == '__main__': root = Tk() # 定義主窗口對象 root.geometry("300x300+400+300") # 建立大小 root.title("Label 測試") # 設置標題 app = Application(master = root) # 傳遞 master 參數爲 主窗口對象 root.mainloop()

Note: change the image path and file name to your own.

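Which special method each operation on an instance calls

__init__               constructor              object creation: p = Person()
__del__                finaliser                object reclamation
__repr__, __str__      printing, conversion     print(a)
__call__               function call            a()
__getattr__            attribute access         a.xxx (called only when normal lookup fails)
__setattr__            attribute assignment     a.xxx = value
__getitem__            index read               a[key]
__setitem__            index assignment         a[key] = value
__len__                length                   len(a)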
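As a minimal sketch of the table above, the class below implements several of these special methods so each piece of syntax can be traced to the method it triggers. `Playlist` and its attributes are hypothetical names made up for this illustration.

```python
class Playlist:
    """Hypothetical container used only to show which special method each syntax calls."""

    def __init__(self, name):            # Playlist("road trip")
        self.name = name
        self._songs = {}

    def __repr__(self):                  # repr(p); also print(p) because __str__ is not defined
        return f"Playlist({self.name!r}, {len(self)} songs)"

    def __call__(self, key):             # p("a")
        return self._songs.get(key, "<missing>")

    def __getitem__(self, key):          # p["a"]
        return self._songs[key]

    def __setitem__(self, key, value):   # p["a"] = "Song A"
        self._songs[key] = value

    def __len__(self):                   # len(p)
        return len(self._songs)

    def __getattr__(self, attr):         # p.rating, reached only when normal lookup fails
        return f"no attribute {attr!r}"


p = Playlist("road trip")
p["a"] = "Song A"
print(p)                                 # Playlist('road trip', 1 songs)
print(p["a"], p("b"), len(p), p.rating)
```

Because `__getattr__` is a fallback, `p.name` still returns the real attribute; only unknown names such as `p.rating` reach it.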

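Every operator corresponds to a special method

+                  __add__                                          addition
-                  __sub__                                          subtraction
<, <=, ==          __lt__, __le__, __eq__                           comparison operators
>, >=, !=          __gt__, __ge__, __ne__                           comparison operators
|, ^, &            __or__, __xor__, __and__                         or, xor, and
<<, >>             __lshift__, __rshift__                           left shift, right shift
*, /, %, //        __mul__, __truediv__, __mod__, __floordiv__      multiplication, true division, modulo (remainder), floor division
**                 __pow__                                          exponentiation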
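A short sketch of the operator table: a hypothetical two-dimensional vector class `Vec` (not from the original notes) that defines a few of these methods, so that `+`, `-`, `*`, `==`, and `<` work on its instances.

```python
class Vec:
    """Hypothetical 2-D vector; each method below backs one operator."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):        # v + w
        return Vec(self.x + other.x, self.y + other.y)

    def __sub__(self, other):        # v - w
        return Vec(self.x - other.x, self.y - other.y)

    def __mul__(self, k):            # v * 3 (scalar on the right-hand side only)
        return Vec(self.x * k, self.y * k)

    def __eq__(self, other):         # v == w
        return (self.x, self.y) == (other.x, other.y)

    def __lt__(self, other):         # v < w, compared by squared length
        return self.x ** 2 + self.y ** 2 < other.x ** 2 + other.y ** 2

    def __repr__(self):
        return f"Vec({self.x}, {self.y})"


print(Vec(1, 2) + Vec(3, 4))         # Vec(4, 6)
print(Vec(1, 2) * 3)                 # Vec(3, 6)
print(Vec(1, 1) < Vec(2, 2))         # True
```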

tkinter Button基本用語 ''' 1.self.btn2 = Button(root,image = photo,command = self.login) 使用 image 圖片做爲按鈕,command 做爲響應 2.self.btn2.config(state = "disabled") 對按鈕進行禁用 3.Button 中 anchor 控制按鈕上的圖片位置 N NE E SE SW W NW CENTER 默認居中 ''' from tkinter import * from tkinter import messagebox class Application(Frame): '''GUI程序經典寫法''' def __init__(self,master = None): super().__init__(master) # super() 表示父類的定義,父類使用 master 參數 self.master = master # 子類定義一個屬性接收傳遞過來的 master 參數 self.pack() # .pack 設置佈局管理器 self.createWidget() # 在初始化時,將按鈕也實現 # master傳遞給父類 Frame 使用後,子類中再定義一個 master 對象 def createWidget(self): '''建立組件''' self.btn1 = Button(root,text = '登陸',command = self.login, width = 5,height = 2,anchor = E) # command 進行操做的函數 self.btn1.pack() global photo photo = PhotoImage(file = "圖片路徑/圖片名.gif") self.btn2 = Button(root,image = photo,command = self.login) self.btn2.pack() # self.btn2.config(state = "disabled") # # 設置按鈕爲禁用按鈕 def login(self): messagebox.showinfo("博客園","歡迎使用~") if __name__ == '__main__': root = Tk() # 定義主窗口對象 root.geometry("300x300+400+300") # 建立大小 root.title("Button 測試") # 設置標題 app = Application(master = root) # 傳遞 master 參數爲 主窗口對象 root.mainloop()

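tkinter Entry: basic usage

'''
1. BooleanVar()   boolean value
2. IntVar()       integer value
3. DoubleVar()    floating-point value
4. StringVar()    string value
5. self.entry1 = Entry(self, textvariable = v1)
   textvariable creates a two-way binding between the widget and the variable
6. v1.set("admin")   # set the value of the single-line text box
7. v1.get() and self.entry1.get() both return the text in the single-line text box
8. self.entry_passwd = Entry(self, textvariable = v2, show = "*")
   textvariable binds v2, where v2 = StringVar()
   whatever the user types is displayed as *
9. Button(self, text = "登陸", command = self.login).pack()
   the login action
10. The function that runs after clicking the login button can query a database to validate the user
    self.<widget instance>.get() reads the value
'''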

from tkinter import * from tkinter import messagebox class Application(Frame): '''GUI程序經典寫法''' def __init__(self,master = None): super().__init__(master) # super() 表示父類的定義,父類使用 master 參數 self.master = master # 子類定義一個屬性接收傳遞過來的 master 參數 self.pack() # .pack 設置佈局管理器 self.createWidget() # 在初始化時,將按鈕也實現 # master傳遞給父類 Frame 使用後,子類中再定義一個 master 對象 def createWidget(self): '''建立組件''' self.lab1 = Label(self,text = "用戶名") self.lab1.pack() # StringVar() 綁定到指定的組件,StringVar 和 v1 一塊兒變化 v1 = StringVar() self.entry_user = Entry(self,textvariable = v1) self.entry_user.pack() v1.set("admin") # 設置單行文本的值 # v1.get() self.entry_user.get() 獲取的是單行文本框中的值 # 建立密碼框 self.lab2 = Label(self,text = "密碼") self.lab2.pack() v2 = StringVar() self.entry_passwd = Entry(self,textvariable = v2,show = "*") self.entry_passwd.pack() Button(self,text = "登陸",command = self.login).pack() def login(self): username = self.entry_user.get() passwd = self.entry_passwd.get() # 數據庫進行操做,查看是否存在該用戶 print("用戶名:" + username) print("密碼:" + passwd) if username == "Hany" and passwd == "123456": messagebox.showinfo("博客園","歡迎使用~") else: messagebox.showinfo("Error","請從新輸入~") if __name__ == '__main__': root = Tk() # 定義主窗口對象 root.geometry("300x300+400+300") # 建立大小 root.title("Button 測試") # 設置標題 app = Application(master = root) # 傳遞 master 參數爲 主窗口對象 root.mainloop()

In this example the username is Hany and the password is 123456.

Text多行文本框基本用法 #coding=gbk ''' 1.Text(root,width,height,bg) 主窗口,寬度,高度,背景色 2.使用 .insert() 方法添加內容 Text 對象.insert(幾行.幾列,"內容") w1.insert(2.3,"···") END 爲最後位置 self.w1.insert(END,'[end]') 3.Button(窗口對象,text = "內容",command = "self.函數名").pack([side = "left"]) Button(self,text = "返回文本",command = self.returnText).pack(side = "left") text 顯示的內容 command 運行的函數 pack 位置,使用 side 後,按鈕按照 pack 來 4.在類中定義的屬性,不會由於運行函數方法後,就銷燬 self.photo 不用再使用 global 進行聲明 5.使用 PhotoImage 將圖片存起來後,將圖片顯示在多行文本 Text 中 self.photo = PhotoImage(file = '圖片路徑/圖片名.gif') self.photo = PhotoImage(file = 'images/logo.gif') 使用 .image_create(位置,image = self.photo) 進行添加 self.w1.image_create(END,image = self.photo) 6.添加按鈕組件到文本中 btn1 = Button(文本內容,text = "內容") 7.self.w1.tag_config (內容,background 背景顏色,foreground 文字顏色) 8.self.w1.tag_add("內容",起始位置,終止位置) tag_add 加入內容 9.self.w1.tag_bind("內容","事件",self.函數名) self.w1.tag_bind("baidu","<Button-1>",self.webshow) 10.webbrowser.open("網址") 打開一個網址 ''' from tkinter import * from tkinter import messagebox # 顯示消息 import webbrowser # 導入 webbrowser 到時候點擊字體跳轉使用 class Application(Frame): '''GUI程序經典寫法''' def __init__(self,master = None): super().__init__(master) # super() 表示父類的定義,父類使用 master 參數 self.master = master # 子類定義一個屬性接收傳遞過來的 master 參數 self.pack() # .pack 設置佈局管理器 self.createWidget() # 在初始化時,將按鈕也實現 # master傳遞給父類 Frame 使用後,子類中再定義一個 master 對象 def createWidget(self): '''建立組件''' # 建立文字 Text(root 主窗口對象,width 寬度,height 高度,bg 背景色) # 只對於文本有效 self.w1 = Text(root,width = 100,height = 40,bg = "gray") # 設置背景色 bg = "gray" self.w1.pack() self.w1.insert(1.0,"0123456789\nabcdefg") # 1.0 在 第一行 第零列 插入數據 self.w1.insert(2.3,"活在當下\n結髮爲夫妻，恩愛兩不疑\n言行在於美，不在於多") # 2.3 在 第二行 第三列 Button(self,text = "重複插入文本",command = self.insertText).pack(side = "left") # 水平排列 side = "left" Button(self,text = "返回文本",command = self.returnText).pack(side = "left") Button(self,text = "添加圖片",command = self.addImage).pack(side = "left") Button(self,text = "添加組件",command = self.addWidget).pack(side = "left") Button(self,text = "經過 tag 
控制文本",command = self.testTag).pack(side = "left") def insertText(self): '''INSERT 索引表示在光標處插入''' self.w1.insert(INSERT,'Hany') # END 索引號表示在最後插入 self.w1.insert(END,'[end]') # 在文本區域最後 self.w1.insert(1.2,"(.-_-.)") def returnText(self): '''返回文本內容''' # Indexes(索引) 是用來指向 Text 組件中文本的位置 # Text 的組件索引 也是對應實際字符之間的位置 # 核心：行號從 1 開始,列號從 0 開始 print(self.w1.get(1.2,1.6)) print("文本內容:\n" + self.w1.get(1.0,END)) def addImage(self): '''增長圖片''' self.photo = PhotoImage(file = 'images/logo.gif') self.w1.image_create(END,image = self.photo) def addWidget(self): '''添加組件''' btn1 = Button(self.w1,text = "Submit") self.w1.window_create(INSERT,window = btn1) # 添加組件 def testTag(self): '''將某一塊做爲特殊標記,並使用函數''' self.w1.delete(1.0,END) self.w1.insert(INSERT,"Come on, you're the best.\n博客園\nHany 加油!!!") # self.w1.tag_add("good",1.0,1.9) # 選中標記區域 # self.w1.tag_config("good",background = "yellow",foreground = "red") # 單獨標記某一句,背景色 字體色 self.w1.tag_add("baidu",3.0,3.4) # self.w1.tag_config("baidu",underline = True,background = "yellow",foreground = "red") self.w1.tag_bind("baidu","<Button-1>",self.webshow) def webshow(self,event): webbrowser.open("http://www.baidu.com") if __name__ == '__main__': root = Tk() # 定義主窗口對象 root.geometry("500x300+400+300") # 建立大小 root.title("Button 測試") # 設置標題 app = Application(master = root) # 傳遞 master 參數爲 主窗口對象 root.mainloop()

Radiobutton基礎語法 ''' 1.Radiobutton(root 主窗口,text 文本內容,value 值(能夠經過set 和 get 獲取到的值),variable 變量修改原來的StringVar) self.radio_man = Radiobutton(root,text = "男性",value = "man",variable = self.v) 2.Button(root,text = "提交",command = self.confirm).pack(side = "left") 設置按鈕進行提交,而後響應的函數 ''' from tkinter import * from tkinter import messagebox class Application(Frame): '''GUI程序經典寫法''' def __init__(self,master = None): super().__init__(master) # super() 表示父類的定義,父類使用 master 參數 self.master = master # 子類定義一個屬性接收傳遞過來的 master 參數 self.pack() # .pack 設置佈局管理器 self.createWidget() # 在初始化時,將按鈕也實現 # master傳遞給父類 Frame 使用後,子類中再定義一個 master 對象 def createWidget(self): '''建立組件''' self.v = StringVar() #String類型 self.v.set("man") # 默認爲 man 選中 self.radio_man = Radiobutton(self,text = "男性",value = "man",variable = self.v) # Radiobutton(root/self 主窗口,text 文本內容,value 值(能夠經過set 和 get 獲取到的值),variable 變量修改原來的StringVar()變量也修改) self.radio_woman = Radiobutton(self,text = "女性",value = "woman",variable = self.v) self.radio_man.pack(side = "left") self.radio_woman.pack(side = "left") # 放到最佳位置 Button(self,text = "提交",command = self.confirm).pack(side = "left") # 設置按鈕進行提交,而後響應的函數 def confirm(self): messagebox.showinfo("選擇結果","選擇的性別是 : "+self.v.get()) # 兩個參數,一個是標題另外一個是內容 # 顯示內容 if __name__ == '__main__': root = Tk() # 定義主窗口對象 root.geometry("300x100+400+300") # 建立大小 root.title("Button 測試") # 設置標題 app = Application(master = root) # 傳遞 master 參數爲 主窗口對象 root.mainloop()

Checkbutton基本寫法 ''' 1.Checkbutton(self 窗口對象,text 按鈕顯示內容,variable 綁定變量->一塊兒變化, onvalue 用戶點擊時獲得的值,offvalue 沒有點擊獲得的值) self.choose1 = Checkbutton(self,text = "玩遊戲",variable = self.playHobby, onvalue = 1,offvalue = 0) 2.self.playHobby.get() == 1 : .get() 獲取到值 判斷是否時 onvalue 的值 ''' from tkinter import * from tkinter import messagebox class Application(Frame): '''GUI程序經典寫法''' def __init__(self,master = None): super().__init__(master) # super() 表示父類的定義,父類使用 master 參數 self.master = master # 子類定義一個屬性接收傳遞過來的 master 參數 self.pack() # .pack 設置佈局管理器 self.createWidget() # 在初始化時,將按鈕也實現 # master傳遞給父類 Frame 使用後,子類中再定義一個 master 對象 def createWidget(self): '''建立組件''' self.playHobby = IntVar() # 默認爲 0 # .get() 獲取值 .set() 設置值 self.travelHobby = IntVar() self.watchTvHobby = IntVar() # print(self.playHobby.get()) 0 self.choose1 = Checkbutton(self,text = "玩遊戲",variable = self.playHobby, onvalue = 1,offvalue = 0) # Checkbutton(self 窗口對象,text 按鈕顯示內容,variable 綁定變量->一塊兒變化, # onvalue 用戶點擊時獲得的值,offvalue 沒有點擊獲得的值) self.choose2 = Checkbutton(self,text = "去旅遊",variable = self.travelHobby, onvalue = 1,offvalue = 0) self.choose3 = Checkbutton(self,text = "看電影",variable = self.watchTvHobby, onvalue = 1,offvalue = 0) self.choose1.pack(side = "left") self.choose2.pack(side = "left") self.choose3.pack(side = "left") Button(self,text = "肯定",command = self.confirm).pack(side = "left") def confirm(self): if self.playHobby.get() == 1 : # 獲取到的數據是 1 的話,進行接下來的操做 messagebox.showinfo("假期項目","玩遊戲----") if self.travelHobby.get() == 1 : messagebox.showinfo("假期項目","去旅遊----") if self.watchTvHobby.get() == 1 : messagebox.showinfo("假期項目","看電影----") if __name__ == '__main__': root = Tk() # 定義主窗口對象 root.geometry("300x200+400+300") # 建立大小 root.title("Button 測試") # 設置標題 app = Application(master = root) # 傳遞 master 參數爲 主窗口對象 root.mainloop()

# Merge dictionaries in one line
# {**{'key': 'value', 'key': 'value'}, **{'key': 'value'}}
dic = {**{'a':1,'b':2},**{'c':3},**{'d':4}}
print(dic)
# {'a': 1, 'b': 2, 'c': 3, 'd': 4}

# Find the list that contains the largest value, in one line
print(max([[1,2,3],[4,5,7,8],[6]],key = lambda v:max(v)))
# [4, 5, 7, 8]
print(max(max([[1,2,3],[4,5,7,8],[6]],key = lambda v:max(v))))
# 8
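Two details of the one-liners above are worth a small sketch: when the unpacked dictionaries share a key, the rightmost value wins, and the overall maximum can also be found by flattening the lists first. The names `defaults` and `overrides` are placeholders chosen for this illustration.

```python
from itertools import chain

defaults = {"host": "localhost", "port": 8000}
overrides = {"port": 9000}

# Later mappings win when a key appears more than once.
merged = {**defaults, **overrides}
print(merged)    # {'host': 'localhost', 'port': 9000}

# Flatten the nested lists lazily, then take a single max().
lists = [[1, 2, 3], [4, 5, 7, 8], [6]]
largest = max(chain.from_iterable(lists))
print(largest)   # 8
```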

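Class notes: scikit-learn

Loading data sets
sklearn.datasets bundles several classic data sets for data analysis.
    load_boston                 regression
    load_breast_cancer          classification, clustering
    fetch_california_housing    regression
    load_iris                   classification, clustering
    load_digits                 classification
    load_wine                   classification

    from sklearn.datasets import load_breast_cancer
    cancer = load_breast_cancer()
    print('Length of the breast_cancer data set:', len(cancer))
    print('Type of the breast_cancer data set:', type(cancer))

A data set can be treated like a dictionary. The keys data, target, feature_names, and DESCR return the data, the labels, the feature names, and the description.
    cancer['data']
    cancer['target']
    cancer['feature_names']
    cancer['DESCR']

Splitting the samples into three parts
    training set      used to fit the model
    validation set    used to choose the network structure or the parameters that control model complexity
    test set          used to check the performance of the best model
    typical proportions: 50% / 25% / 25%
A model is built from part of the data; the model can then be used to group new data.

K-fold cross-validation
A common approach is to hold out a small part as the test set and apply K-fold cross-validation to the remaining N samples:
    shuffle the samples and split them evenly into K parts;
    in turn, train on K-1 parts and validate on the remaining part;
    compute the sum of squared prediction errors;
    use the mean of the K sums as the basis for choosing the best model structure.

Splitting a data set: train_test_split in sklearn.model_selection
    *arrays         one or more data sets to split (classification: data and labels; clustering: data)
    test_size       float, int, or None; the size of the test set
                    float between 0 and 1: the fraction of all samples; int: the absolute number of samples
                    defaults to 25%
    train_size      float, int, or None; the size of the training set; normally only one of train_size and test_size is set
    random_state    int; the random seed; the same seed gives the same split, different seeds give different splits
    shuffle         bool; whether to shuffle the data before splitting
    stratify        array of labels or None; use the labels for stratified sampling
train_test_split splits every array passed in into a training part and a test part. One array in gives one training set and one test set; two arrays in give two of each.

Splitting the breast_cancer data into training and test sets
    from sklearn.model_selection import train_test_split
    cancer_data = cancer['data']
    cancer_target = cancer['target']
    cancer_data_train,cancer_data_test,cancer_target_train,cancer_target_test = train_test_split(cancer_data,cancer_target,test_size=0.2,random_state=42)
.shape shows the shape; numpy.max() shows the maximum.

Using sklearn transformers
    fit              analyses the features and targets and extracts useful information such as statistics or weights
    transform        converts the features
                     uninformed transforms: exponential and logarithmic transforms, etc.
                     informed transforms:
                         unsupervised: use only statistics of the features, e.g. standardisation and PCA
                         supervised: use both feature and target information, e.g. model-based feature selection, LDA
    fit_transform    calls fit, then transform
Transformers can standardise, normalise, binarise, or PCA-reduce a NumPy array.
Note: every feature-processing step must treat the training set and the test set separately. The rules and weights learned on the training set are applied to the test set.

Transformers provided by sklearn
    MinMaxScaler           min-max scaling of features
    StandardScaler         standard-deviation (z-score) standardisation of features
    Normalizer             normalisation of features
    Binarizer              binarisation of quantitative features
    OneHotEncoder          one-hot encoding of qualitative features
    FunctionTransformer    transformation of features with a custom function

    from sklearn.decomposition import PCA

Common PCA parameters
    n_components    None, int, float, or string; default None
                    None: all features are kept
                    int: reduce to n dimensions
                    float (with svd_solver = 'full'): keep as many components as needed to reach the requested fraction of variance
                    string, e.g. n_components = 'mle': choose the number of components automatically
    copy            bool; True: the original data is left unchanged; False: the original data may be modified by PCA; default True
    whiten          bool; whitening, i.e. normalise every feature of the reduced data; default False
    svd_solver      'auto', 'full', 'arpack', or 'randomized'; default 'auto'
                    auto: PCA weighs the three solvers below and picks a suitable one
                    full: the traditional SVD implemented in SciPy
                    arpack: similar use cases to randomized; randomized uses sklearn's own SVD implementation, arpack uses SciPy's sparse SVD
                    randomized: suited to large data with many dimensions and a small proportion of principal components; uses randomised algorithms to speed up the SVD

Cluster analysis
A way of grouping samples by similarity when no classes are given in advance.
A clustering model gathers unlabelled data into several clusters, each treated as one class. It is an unsupervised learning algorithm.
The input is a set of unlabelled samples, which are split into groups according to their distance or similarity.
Principle: minimise the distance within a group, maximise the distance between groups.

Common clustering algorithms
    Partitioning methods
        K-Means (K-means), K-MEDOIDS (K-medoids), CLARANS (selection-based)
    Hierarchical methods
        BIRCH (balanced iterative reducing and clustering), CURE (representative points), CHAMELEON (dynamic model)
    Density-based methods
        DBSCAN (high-density connected regions), DENCLUE (density distribution functions), OPTICS (ordering points to identify structure)
    Grid-based methods
        STING (statistical information grid), CLIQUE (clustering in high-dimensional space), WAVE-CLUSTER (wavelet transform)

Clustering algorithms in sklearn.cluster
    Name                            Parameters                                               Suitable for                            Distance metric
    K-Means                         number of clusters                                       many samples, medium cluster count      distance between points
    Spectral clustering             number of clusters                                       medium samples, few clusters            graph distance
    Ward hierarchical clustering    number of clusters                                       many samples, many clusters             distance between points
    Agglomerative clustering        number of clusters, linkage type, distance               many samples, many clusters             any pairwise distance
    DBSCAN                          neighbourhood radius, minimum members                    many samples, medium cluster count      distance between nearest points
    Birch                           branching factor, threshold, optional global clusterer   many samples, many clusters             Euclidean distance between points

Clustering uses the sklearn estimator methods fit and predict
    fit        trains the algorithm; accepts the training set and labels; works for supervised and unsupervised learning
    predict    predicts the test-set labels in supervised learning; can also assign a class to the data passed in
The rules are learned with fit and then applied with predict. If more data is available, it can be used to check how well the rules were learned.

    from sklearn.preprocessing import MinMaxScaler   # min-max scaling
    from sklearn.datasets import load_iris
    from sklearn.cluster import KMeans
    iris = load_iris()
    iris_data = iris['data']              # features
    iris_target = iris['target']          # labels
    iris_names = iris['feature_names']    # feature names
    scale = MinMaxScaler().fit(iris_data)           # learn the rule
    iris_dataScale = scale.transform(iris_data)     # apply the rule
    # build and train the model; n_clusters = 3 gives three clusters
    kmeans = KMeans(n_clusters = 3,random_state = 123).fit(iris_dataScale)
    result = kmeans.predict([[1.5,1.5,1.5,1.5]])    # predict
    result[0]                                       # the predicted cluster

Visualising multi-dimensional data with TSNE from sklearn.manifold
Principle: use TSNE to reduce the dimensionality of the data.
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE
    # n_components = 2 reduces the data to two dimensions
    tsne = TSNE(n_components = 2,init = 'random',random_state = 177).fit(iris_data)
    df = pd.DataFrame(tsne.embedding_)    # turn the embedding into a two-dimensional table
    df['labels'] = kmeans.labels_         # store the clustering result in df
    # pick out the rows for each label
    df1 = df[df['labels'] == 0]
    df2 = df[df['labels'] == 1]
    df3 = df[df['labels'] == 2]
    fig = plt.figure(figsize = (9,6))
    # a different colour for each group
    plt.plot(df1[0],df1[1],'bo',df2[0],df2[1],'r*',df3[0],df3[1],'gD')
    plt.savefig('../tmp/name.png')        # save as a .png image
    plt.show()

Evaluating clustering models
Criteria: objects within a group are similar (related) to each other; objects in different groups are different (unrelated).
Metrics provided by sklearn.metrics
    ARI (adjusted Rand index)            adjusted_rand_score
    AMI (adjusted mutual information)    adjusted_mutual_info_score
    V-measure                            completeness_score (sklearn also provides v_measure_score)
    FMI                                  fowlkes_mallows_score
    Silhouette coefficient               silhouette_score
    Calinski-Harabasz index              calinski_harabasz_score (spelled calinski_harabaz_score in old sklearn releases)
The first four are more convincing; a higher score is better. With these, evaluating a clustering method is equivalent to evaluating a classification algorithm.

FMI: fowlkes_mallows_score
    from sklearn.metrics import fowlkes_mallows_score
    for i in range(2,7):
        kmeans = KMeans(n_clusters = i,random_state = 123).fit(iris_data)
        score = fowlkes_mallows_score(iris_target,kmeans.labels_)
        print('iris data, %d clusters, FMI score: %f'%(i,score))

Silhouette coefficient: silhouette_score
    from sklearn.metrics import silhouette_score
    import matplotlib.pyplot as plt
    silhouettteScore = []
    for i in range(2,15):
        kmeans = KMeans(n_clusters = i,random_state = 123).fit(iris_data)
        score = silhouette_score(iris_data,kmeans.labels_)
        silhouettteScore.append(score)
    plt.figure(figsize=(10,6))
    plt.plot(range(2,15),silhouettteScore,linewidth=1.5,linestyle="-")
    plt.show()

Calinski-Harabasz index for the K-Means model; a higher score means better clustering
    from sklearn.metrics import calinski_harabasz_score
    for i in range(2,7):
        kmeans = KMeans(n_clusters = i,random_state = 123).fit(iris_data)
        score = calinski_harabasz_score(iris_data,kmeans.labels_)
        print('iris data, %d clusters, Calinski-Harabasz index: %f'%(i,score))

Building and evaluating classification models (supervised learning)
The input is the feature values of a sample and the output is its class; every sample is mapped to one of the predefined classes.
A classification model is built on a data set that already carries class labels.
Uses: image detection, item classification.

Classification algorithms
    Module          Function                      Algorithm
    linear_model    LogisticRegression            logistic regression
    svm             SVC                           support vector machine
    neighbors       KNeighborsClassifier          K-nearest-neighbours classification
    naive_bayes     GaussianNB                    Gaussian naive Bayes
    tree            DecisionTreeClassifier        classification decision tree
    ensemble        RandomForestClassifier        random forest classification
    ensemble        GradientBoostingClassifier    gradient boosting classification trees

Building a support vector machine (SVM) with the sklearn estimator, using the breast_cancer data
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    cancer = load_breast_cancer()
    cancer_data = cancer['data']
    cancer_target = cancer['target']
    cancer_names = cancer['feature_names']
    # split and standardise; these steps are not shown in the notes, so the parameters
    # of the earlier split are reused here as an assumption
    cancer_data_train,cancer_data_test,cancer_target_train,cancer_target_test = train_test_split(cancer_data,cancer_target,test_size=0.2,random_state=42)
    stdScaler = StandardScaler().fit(cancer_data_train)
    cancer_trainStd = stdScaler.transform(cancer_data_train)
    cancer_testStd = stdScaler.transform(cancer_data_test)
    # build the SVM model
    svm = SVC().fit(cancer_trainStd,cancer_target_train)
    # predict the test set
    cancer_target_pred = svm.predict(cancer_testStd)
    # compare predictions with the true labels
    true = np.sum(cancer_target_pred == cancer_target_test)
    true                                      # number of correct predictions
    cancer_target_test.shape[0] - true        # number of wrong predictions
    true/cancer_target_test.shape[0]          # accuracy

Evaluating classification models
The accuracy on the test set alone does not reflect model performance well. Combine it with the true values and compute precision, recall, the F1 score, Cohen's Kappa, and similar metrics.
    Metric           Best value              sklearn function
    Precision        1.0                     metrics.precision_score
    Recall           1.0                     metrics.recall_score
    F1 score         1.0                     metrics.f1_score
    Cohen's Kappa    1.0                     metrics.cohen_kappa_score
    ROC curve        closest to the y axis   metrics.roc_curve

    from sklearn.metrics import accuracy_score,precision_score,recall_score,f1_score,cohen_kappa_score
    accuracy_score(cancer_target_test,cancer_target_pred)      # accuracy of the SVM on breast_cancer
    precision_score(cancer_target_test,cancer_target_pred)     # precision of the SVM on breast_cancer

Plotting the ROC curve
    from sklearn.metrics import roc_curve
    import matplotlib.pyplot as plt
    # x and y values of the ROC curve
    fpr,tpr,thresholds = roc_curve(cancer_target_test,cancer_target_pred)
    plt.figure(figsize=(10,6))
    plt.xlim(0,1)        # range of the x axis
    plt.ylim(0.0,1.1)    # range of the y axis
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.plot(fpr,tpr,linewidth=2,linestyle="-",color='red')
    plt.show()
The larger the area between the ROC curve and the x axis, the better the model.

Building and evaluating regression models
Classification versus regression: the labels of a classification algorithm are discrete, the labels of a regression algorithm are continuous.
Uses: transport, logistics, social networks, finance, and so on.
In a regression model the independent variables are known and the dependent variable is unknown and has to be predicted.
A regression algorithm has two steps, learning and prediction:
    learning      fit a regression equation to the training samples
    prediction    put the test data into the fitted equation to obtain the predicted values

Regression algorithms
    Module          Function                     Algorithm
    linear_model    LinearRegression             linear regression
    svm             SVR                          support vector regression
    neighbors       KNeighborsRegressor          nearest-neighbours regression
    tree            DecisionTreeRegressor        regression decision tree
    ensemble        RandomForestRegressor        random forest regression
    ensemble        GradientBoostingRegressor    gradient boosting regression trees

Building a linear regression model with the sklearn estimator, using the boston data set
    # load_boston was removed in scikit-learn 1.2, so this example needs an older release
    from sklearn.linear_model import LinearRegression
    from sklearn.datasets import load_boston
    from sklearn.model_selection import train_test_split
    import matplotlib.pyplot as plt
    boston = load_boston()
    X = boston['data']
    y = boston['target']
    names = boston['feature_names']
    # split into training and test sets
    X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=125)
    # build the linear regression model
    clf = LinearRegression().fit(X_train,y_train)
    # predict the test set
    y_pred = clf.predict(X_test)
    y_pred[:20]          # the first twenty results
    # use different colours for different data
    plt.plot(range(y_test.shape[0]),y_test,color='blue',linewidth=1.5,linestyle='-')

Evaluating regression models
    Metric                      Best value    sklearn function
    Mean absolute error         0.0           metrics.mean_absolute_error
    Mean squared error          0.0           metrics.mean_squared_error
    Median absolute error       0.0           metrics.median_absolute_error
    Explained variance score    1.0           metrics.explained_variance_score
    R-squared                   1.0           metrics.r2_score
The closer the mean absolute error, mean squared error, and median absolute error are to 0, the better the model. The closer the explained variance score and R-squared are to 1, the better the model.

    from sklearn.metrics import explained_variance_score,mean_absolute_error,mean_squared_error,median_absolute_error,r2_score
    mean_absolute_error(y_test,y_pred)          # mean absolute error of the linear model on the Boston data
    mean_squared_error(y_test,y_pred)           # mean squared error
    median_absolute_error(y_test,y_pred)        # median absolute error
    explained_variance_score(y_test,y_pred)     # explained variance score
    r2_score(y_test,y_pred)                     # R-squared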
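The K-fold procedure described in these notes (shuffle, split into K parts, rotate which part is held out for validation) can be sketched without any library. This is a rough illustration only; `k_fold_indices` is a hypothetical helper and not part of scikit-learn, which provides its own K-fold utilities in `sklearn.model_selection`.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training, validation) index lists for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # shuffle the samples once
    folds = [indices[i::k] for i in range(k)]     # split them into K near-equal parts
    for i in range(k):
        validation = folds[i]                     # one part is held out for validation
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


for training, validation in k_fold_indices(n_samples=6, k=3):
    print("train:", sorted(training), "validate:", sorted(validation))
```

Every sample is used for validation exactly once, and the K validation scores would then be averaged to compare model settings.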

Note: wherever these notes read a file, replace the file name with one of your own.

Seaborn basics 1

import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

# line plot
def sinplot(flip = 1):
    x = np.linspace(0,14,100)
    for i in range(1,7):
        plt.plot(x,np.sin(x+i*0.5) * (7-i) * flip)

sns.set()   # default theme
sinplot()
plt.show()

# without the grey grid
sns.set_style("white")
sinplot()
plt.show()

# add tick marks to the axes
sns.set_style("ticks")
sinplot()
plt.show()

# remove the top and right spines
sinplot()
sns.despine()
plt.show()

# box plot
sns.set_style("whitegrid")
data = np.random.normal(size=(20,6)) + np.arange(6)/2
sns.boxplot(data = data)
plt.show()

Seaborn basics 2

import matplotlib.pyplot as plt import seaborn as sns import numpy as np def sinplot(flip = 1): x = np.linspace(0,14,100) for i in range(1,7): plt.plot(x,np.sin(x+i*0.5) * (7-i) * flip) data = np.random.normal(size=(20,6)) + np.arange(6)/2 # 使用 despine 進行操做 sns.violinplot(data) sns.despine(offset = 10) # offset 設置距離軸的距離 plt.show() # 底部變爲白色 sns.set_style("whitegrid") # 讓左面的豎線消失 sns.boxplot(data = data,palette = "deep") sns.despine(left = True) plt.show() # 五種主題風格 darkgrid whitegrid dark white ticks # 繪製子圖 with sns.axes_style("darkgrid"): # 第一種風格背景爲黑色 plt.subplot(211) # 分兩個一列上面 sinplot() plt.subplot(212) sinplot(-1) plt.show() # 設置佈局,畫圖的大小和風格 sns.set_context("paper") # sns.set_context("talk") # sns.set_context("poster") # sns.set_context("notebook") # 線條粗細依次變大 plt.figure(figsize=(8,6)) sinplot() plt.show() # 設置座標字體大小 參數 font_scale sns.set_context("paper",font_scale=3) plt.figure(figsize=(8,6)) sinplot() plt.show() # 設置線的粗度 rc = {"lines.linewidth":4.5} sns.set_context("paper",font_scale=1.5,rc={"lines.linewidth":3}) plt.figure(figsize=(8,6)) sinplot() plt.show()

Seaborn basics 3

import seaborn as sns import numpy as np import matplotlib.pyplot as plt sns.set(rc = {"figure.figsize":(6,6)}) # 調色板 # color_palette() 默認顏色 , 能夠傳入全部支持顏色 # set_palette() 設置全部圖的顏色 # 分類色板,顯示十種顏色 current_palette = sns.color_palette() sns.palplot(current_palette) plt.show() current_palette = sns.color_palette("hls",8) # 設置八種顏色 sns.palplot(current_palette) plt.show() # 將八種顏色應用在盒圖中 current_palette = sns.color_palette("hls",8) data = np.random.normal(size = (20,8)) + np.arange(8)/2 sns.boxplot(data = data,palette = current_palette) plt.show() # 指定亮度和飽和度 # hls_palette() # l 亮度 s 飽和度 # 使用飽和度方法 sns.palplot(sns.hls_palette(8,l = 1,s = 5)) # 將兩個相鄰的顏色相近 使用 Paired 參數 sns.palplot(sns.color_palette("Paired",10)) plt.show() # 連續型漸變色畫板 color_palette("顏色名") sns.palplot(sns.color_palette("Blues")) # 從淺到深 plt.show() # 從深到淺 加上 _r 後綴名 sns.palplot(sns.color_palette("BuGn_r")) plt.show() # cubehelix_palette() 調色板 # 八種顏色分別漸變 sns.palplot(sns.color_palette("cubehelix",8)) plt.show() # 指定 start 值,在區間中顏色的顯示也不一樣 sns.palplot(sns.cubehelix_palette(8,start=5,rot=-0.75)) plt.show() # 顏色從淺到深 light_palette sns.palplot(sns.light_palette("green")) plt.show() # 顏色從深到淺 dark_palette sns.palplot(sns.dark_palette("green")) plt.show() # 實現反轉顏色 在 light_palette 中添加參數 reverse sns.palplot(sns.light_palette("green",reverse = True)) plt.show()

Univariate analysis with Seaborn

import numpy as np import pandas as pd from scipy import stats,integrate import matplotlib.pyplot as plt import seaborn as sns # 繪製直方圖 sns.set(color_codes=True) np.random.seed(sum(map(ord,"distributions"))) # 生成高斯數據 x = np.random.normal(size = 100) # # sns.distplot(x,kde = False) # x 數據 kde 是否作密度估計 # 將數據劃分爲 15 份 bins = 15 sns.distplot(x,kde = False,bins = 15) plt.show() # 查看數據分佈情況,根據某一個指標畫一條線 x = np.random.gamma(6,size = 200) sns.distplot(x,kde = False,fit = stats.gamma) plt.show() mean,cov = [0,1],[(1,5),(0.5,1)] data = np.random.multivariate_normal(mean,cov,200) df = pd.DataFrame(data,columns=["x","y"]) # 單變量使用直方圖,關係使用散點圖 # 關係 joinplot (x,y,data) sns.jointplot(x = "x",y = "y",data = df) # 繪製散點圖和直方圖 plt.show() # hex 圖,數據越多 色越深 mean,cov = [0,1],[(1,8),(0.5,1)] x,y = np.random.multivariate_normal(mean,cov,500).T # 注意 .T 進行倒置 with sns.axes_style("white"): sns.jointplot(x = x,y = y,kind = "hex",color = "k") plt.show()

Regression analysis with Seaborn

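import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

iris = pd.read_csv("iris.csv")
# the diagonal shows the distribution of each variable; the other panels show pairwise relationships
sns.pairplot(iris)
plt.show()

tips = pd.read_csv("tips.csv")
print(tips.head())

# two ways to draw a regression plot: regplot() and lmplot()
sns.regplot(x = "total_bill",y = "tip",data = tips)
# x and y are column names of the data
plt.show()

# lmplot supports more advanced features but is stricter about its input
sns.lmplot(x = "total_bill",y = "tip",data = tips)
plt.show()

sns.lmplot(x = "size",y = "tip",data = tips)
plt.show()

# add jitter so that overlapping points become visible
sns.regplot(x = "size",y = "tip",data = tips,x_jitter=0.08)
# x_jitter=0.08 adds a small random offset to the x values of the plotted points;
# it changes only the scatter, not the fitted regression line
plt.show()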

Multivariate analysis with Seaborn

import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

sns.set(style = "whitegrid",color_codes = True)
np.random.seed(sum(map(ord,"categorical")))
titanic = pd.read_csv("titanic.csv")
tips = pd.read_csv("tips.csv")
iris = pd.read_csv("iris.csv")

# strip plot: one point per observation
sns.stripplot(x = "day",y = "total_bill",data = tips)
plt.show()

# swarm plot: the points are spread out so they do not overlap
# hue = "sex" draws the points in two colours, one per sex
sns.swarmplot(x = "day",y = "total_bill",data = tips,hue = "sex")
plt.show()

# interquartile range (IQR): the distance between the first and the third quartile
# N = 1.5 * IQR
# outliers: > Q3 + N or < Q1 - N
sns.boxplot(x = "day",y = "total_bill",data = tips)
# a column name can be added as hue, e.g. hue = "time"
plt.show()

# violin plot
sns.violinplot(x = "total_bill",y = "day",hue = "time",data = tips)
plt.show()

# split = True draws the two hue levels as the two halves of each violin
sns.violinplot(x = "day",y = "total_bill",hue = "sex",data = tips,split = True)
plt.show()
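The outlier rule in the comments above (N = 1.5 * IQR, outliers above Q3 + N or below Q1 - N) can be worked through with the standard library. This is a sketch under one assumption: it uses the `"inclusive"` quartile method of `statistics.quantiles`, and plotting libraries may compute quartiles with a slightly different convention. `iqr_outliers` is a hypothetical helper name.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return the values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    margin = 1.5 * (q3 - q1)                 # N = 1.5 * IQR
    low, high = q1 - margin, q3 + margin
    return [v for v in values if v < low or v > high]


bills = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(iqr_outliers(bills))                   # [100]
```

For this sample Q1 = 3 and Q3 = 7, so IQR = 4, N = 6, and only values below -3 or above 13 count as outliers.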

There are too many figures to include here; copy the code and run it to see them. Change the file names to those in your own folder.

Converting strings such as 5D or 30s to seconds

import sys

def convert_to_seconds(time_str):
    # the last character is the unit, everything before it is the number
    value = float(time_str[:-1])
    if time_str.endswith('s'):
        return value
    elif time_str.endswith('m'):
        return value * 60
    elif time_str.endswith('h'):
        return value * 3600
    elif time_str.endswith(('d','D')):
        return value * 3600 * 24
    # any other unit falls through and returns None

while True:
    line = sys.stdin.readline()
    line = line.strip()
    if line == '':
        break
    print(convert_to_seconds(line))
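The same conversion can be driven by a lookup table with input validation, so that a malformed string raises an error instead of silently returning `None`. This is a sketch, not the original author's version; `UNIT_SECONDS` and `to_seconds` are names invented here, and the accepted units mirror the ones handled above (`s`, `m`, `h`, `d`, `D`).

```python
import re

# Seconds per unit; both "d" and "D" mean days, as in the function above.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "D": 86400}
_DURATION = re.compile(r"(\d+(?:\.\d+)?)([smhdD])")


def to_seconds(text):
    """Convert strings such as '30s' or '5D' to a number of seconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SECONDS[unit]


print(to_seconds("30s"))    # 30.0
print(to_seconds("5D"))     # 432000.0
```

Adding a new unit then only needs a new dictionary entry and one more character in the pattern.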

Getting the previous day and the next day of a date

import datetime
import sys

def next_day(date_str):
    date = datetime.datetime.strptime(date_str, '%Y-%m-%d')
    return (date + datetime.timedelta(days=1)).date()

def prev_day(date_str):
    date = datetime.datetime.strptime(date_str,'%Y-%m-%d')
    return (date - datetime.timedelta(days = 1)).date()

while True:
    line = sys.stdin.readline()
    line = line.strip()
    if line == '':
        break
    print('Previous day:', prev_day(line))
    print('Next day:', next_day(line))
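When the starting point is already a `date` object (for example today's date), no string parsing is needed. A minimal sketch, with `neighbours` as a hypothetical helper name:

```python
from datetime import date, timedelta


def neighbours(day=None):
    """Return (previous day, next day); defaults to today's date."""
    day = day or date.today()
    one_day = timedelta(days=1)
    return day - one_day, day + one_day


yesterday, tomorrow = neighbours()
print(yesterday, tomorrow)

# Month ends and leap years are handled by the date arithmetic itself.
print(neighbours(date(2024, 3, 1)))   # (datetime.date(2024, 2, 29), datetime.date(2024, 3, 2))
```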

Computing the number of seconds between two dates

import datetime

def date_delta(start, end):
    # parse both strings into datetime objects
    start = datetime.datetime.strptime(start,"%Y-%m-%d %H:%M:%S")
    end = datetime.datetime.strptime(end,"%Y-%m-%d %H:%M:%S")
    # convert to timestamps
    timeStamp_start = start.timestamp()
    timeStamp_end = end.timestamp()
    return timeStamp_end - timeStamp_start

start = input()   # or sys.stdin.readline()
end = input()     # or sys.stdin.readline()
print(date_delta(start, end))
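An alternative sketch subtracts the two `datetime` objects directly. A naive datetime's `timestamp()` is interpreted in the machine's local time zone, so the timestamp-based result can shift across a daylight-saving change; plain subtraction returns the wall-clock difference. `seconds_between` and `FMT` are names introduced for this illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def seconds_between(start, end):
    """Wall-clock seconds from start to end, both given as 'YYYY-MM-DD HH:MM:SS'."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


print(seconds_between("2021-04-15 00:00:00", "2021-04-16 00:00:30"))   # 86430.0
```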

Picking values from several txt files

import random

def load_config(path):
    with open(path,'r') as tou:
        return [line for line in tou.readlines()]

def pick(path):
    # one random line from the file, without the trailing newline
    return random.choice(load_config(path)).strip("\n")

headers = {
    'User-Agent': pick('useragents.txt'),
    'Referer': pick('referers.txt'),
    'Accept': pick('acceptall.txt'),
}
print(headers)
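To try the idea without the real `useragents.txt`, `referers.txt`, and `acceptall.txt` files, the sketch below writes a small temporary file and picks a header value from it. The file contents (`agent-one`, `agent-two`) and the helper names `load_lines` and `random_headers` are made up for the demonstration; blank lines are skipped, which the version above does not do.

```python
import random
import tempfile
from pathlib import Path


def load_lines(path):
    """Read a text file and return its non-empty lines without newlines."""
    with open(path, "r", encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def random_headers(files):
    """Map each header name to one random line from its file."""
    return {name: random.choice(load_lines(path)) for name, path in files.items()}


with tempfile.TemporaryDirectory() as tmp:
    agents = Path(tmp, "useragents.txt")
    agents.write_text("agent-one\n\nagent-two\n", encoding="utf-8")
    headers = random_headers({"User-Agent": agents})

print(headers)
```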

Installing third-party libraries

pip install <package-name> -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com

Installing third-party libraries, advanced

# build the pip install command for a package
from tkinter import *

def getBao():
    pip = 'pip install %s -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com'%entry_bao.get()
    print(pip)

root = Tk()
root.title("pip package")
root.geometry("250x150+400+300")
url = StringVar()
url_lab1 = Label(text = "Package name:")
url_lab1.pack()
entry_bao = Entry(root,textvariable = url)
entry_bao.pack()
btn1 = Button(root,text = "Submit",command = getBao,width = 8,height = 2)
btn1.pack()
root.mainloop()
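The GUI above only prints the command. If the command were to be executed, one cautious approach is to build it as an argument list rather than a single formatted string, so the package name cannot inject extra shell syntax. The sketch below only builds the list and does not run it; `build_pip_command` is a hypothetical helper, and the mirror URL and host are the ones quoted above.

```python
import sys


def build_pip_command(package,
                      index_url="http://pypi.douban.com/simple/",
                      trusted_host="pypi.douban.com"):
    """Return the pip install command as an argument list (nothing is executed)."""
    package = package.strip()
    if not package or any(ch.isspace() for ch in package):
        raise ValueError("package name must be a single non-empty token")
    # sys.executable -m pip targets the interpreter that is currently running
    return [sys.executable, "-m", "pip", "install", package,
            "-i", index_url, "--trusted-host", trusted_host]


command = build_pip_command("requests")
print(" ".join(command))
```

Such a list could be handed to `subprocess.run(command, check=True)` when the installation should really happen.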

First Python lab

''' 計算 1.輸入半徑,輸出面積和周長 2.輸入面積,輸出半徑及周長 3.輸入周長,輸出半徑及面積 ''' # 1.輸入半徑,輸出面積和周長 from math import pi # 定義半徑 r = int(input("請輸入半徑的值(整數)")) if r < 0 : exit("請從新輸入半徑") ''' S 面積: pi * r * r ''' S = pi * pow(r,2) print(" 半徑爲 %d 的圓,面積爲 %.2f"%(r,S)) '''C 周長: C = 2 * pi * r ''' C = 2 * pi * r print(" 半徑爲 %d 的圓,周長爲 %.2f"%(r,C)) # 2.輸入面積,輸出半徑及周長 from math import pi,sqrt S = float(input("請輸入圓的面積(支持小數格式)")) if S < 0 : exit("請從新輸入面積") '''r 半徑: r = sqrt(S/pi)''' r = sqrt(S/pi) print("面積爲 %.2f 的圓,半徑爲 %.2f"%(S,r)) '''C 周長: C = 2 * pi * r ''' C = 2 * pi * r print("面積爲 %.2f 的圓,周長爲 %.2f"%(S,C)) # 3.輸入周長,輸出半徑及面積 from math import pi C = float(input("請輸入圓的周長(支持小數格式)")) if C < 0 : exit("請從新輸入周長") '''r 半徑: r = C/(2*pi)''' r = C/(2*pi) print("周長爲 %.2f 的圓,半徑爲 %.2f"%(C,r)) ''' S 面積: pi * r * r ''' S = pi * pow(r,2) print("周長爲 %.2f 的圓,面積爲 %.2f"%(C,S))

'''
Data structures: list exercise
1. Create the list [110,'dog','cat',120,'apple']
2. Insert an empty list between the strings 'dog' and 'cat'
3. Remove the string 'apple'
4. Find the values 110 and 120 and multiply each of them in place by 10
'''
# 1. Create the list [110,'dog','cat',120,'apple'] as an object named lst
lst = [110,'dog','cat',120,'apple']
print(lst)

# 2. Insert an empty list between the strings 'dog' and 'cat'
lst = [110,'dog','cat',120,'apple']
lst.insert(2,[])
print(lst)

# 3. Remove the string 'apple' (it is the last element)
lst = [110,'dog','cat',120,'apple']
lst.pop()
print(lst)

# 4. Find the values 110 and 120 and multiply each of them in place by 10
lst = [110,'dog','cat',120,'apple']
try:
    # index() raises an exception if the value is not found
    lst[lst.index(110)] *= 10
    lst[lst.index(120)] *= 10
except Exception as e:
    print(e)
print(lst)

'''
Dictionary exercise
1. Create the dictionary {'Math':96,'English':86,'Chinese':95.5,'Biology':86,'Physics':None}
2. Add the key-value pair {'History':88}
3. Delete the key-value pair {'Physics':None}
4. Round the value of the key 'Chinese' to the nearest integer
5. Look up the value of the key 'Math'
'''
# 1. Create the dictionary
stu_score = {'Math':96,'English':86,'Chinese':95.5,'Biology':86,'Physics':None}

# 2. Add the key-value pair {'History':88}
stu_score['History'] = 88

# 3. Delete the key-value pair {'Physics':None}, if the key exists
if 'Physics' in stu_score.keys():
    del stu_score['Physics']

# 4. Round the value of the key 'Chinese' with round()
if 'Chinese' in stu_score.keys():
    stu_score['Chinese'] = round(stu_score['Chinese'])

# 5. Look up the value of the key 'Math'
print(stu_score.get('Math',"No value found for Math"))
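One caveat about step 4: Python's built-in `round()` rounds ties to the nearest even integer, so `round(95.5)` gives 96 but `round(96.5)` also gives 96. If the exercise expects schoolbook rounding (halves always go up), the `decimal` module can do that. A minimal sketch; `round_half_up` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value):
    """Round to the nearest integer, with ties going away from zero."""
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


print(round(95.5), round(96.5))                    # 96 96
print(round_half_up(95.5), round_half_up(96.5))    # 96 97
```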

''' 元組練習 1.建立列表 ['pen','paper',10,False,2.5] 賦給變量並查看變量的類型 2.將變量轉換爲 tuple 類型,查看變量的類型 3.查詢元組中的元素 False 的位置 4.根據得到的位置提取元素 ''' # 1.建立列表 ['pen','paper',10,False,2.5] 賦給變量並查看變量的類型 lst = ['pen','paper',10,False,2.5] '''查看變量類型''' print("變量的類型",type(lst)) # 2.將變量轉換爲 tuple 類型,查看變量的類型 lst = tuple(lst) print("變量的類型",type(lst)) # 3.查詢元組中的元素 False 的位置 if False in lst: print("False 的位置爲(從0開始): ",lst.index(False)) # 4.根據得到的位置提取元素 print("根據得到的位置提取的元素爲: ",lst[lst.index(False)]) else: print("不在元組中")
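A caveat for step 3: `index()` compares with `==`, and in Python `False == 0` is true. The tuple in the exercise contains 10 rather than 0, so the lookup works there, but with a 0 earlier in the tuple the search would stop at the 0. The sketch below uses a made-up tuple to show the difference and an identity-based search that avoids it.

```python
items = ("pen", 0, False)

# index() uses ==, and 0 == False, so it stops at the integer 0.
by_equality = items.index(False)

# Searching with "is" finds the actual False object.
by_identity = next(i for i, item in enumerate(items) if item is False)

print(by_equality, by_identity)   # 1 2
```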

''' 集合練習 1.建立列表 ['apple','pear','watermelon','peach'] 並賦給變量 2.用 list() 建立列表 ['pear','banana','orange','peach','grape'],並賦給變量 3.將建立的兩個列表對象轉換爲集合類型 4.求兩個集合的並集,交集和差集 ''' # 1.建立列表 ['apple','pear','watermelon','peach'] 並賦給變量 lst = ['apple','pear','watermelon','peach'] # 2.用 list() 建立列表 ['pear','banana','orange','peach','grape'],並賦給變量 lst_2 = list({'pear','banana','orange','peach','grape'}) print(lst_2) # 3.將建立的兩個列表對象轉換爲集合類型 lst_set = set(lst) lst2_set = set(lst_2) # 4.求兩個集合的並集,交集和差集 ''' 並集 | 交集 & 差集 - ''' print("兩個集合的 並集爲 :",lst_set | lst2_set) print("兩個集合的 交集爲 :",lst_set & lst2_set) print("lst_set 與 lst2_set 的差集爲 :",lst_set - lst2_set) print("lst2_set 與 lst_set 的差集爲 :",lst2_set - lst_set)

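pip mirrors in China

Commonly used mirrors
    Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple
    Alibaba Cloud: https://mirrors.aliyun.com/pypi/simple/
    University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/
    Huazhong University of Science and Technology: http://pypi.hustunique.com/
    Shandong University of Technology: http://pypi.sdutlinux.org/
    Douban: http://pypi.douban.com/simple/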

Advanced format()

'''format(number, str(expression) + "d" or "f")   d formats as int, f formats as float'''
format(5,str(2*4)+"d")
'       5'
format(5,str(2*4)+"f")
'5.000000'
'''use .2f to control the number of decimal places'''
format(5,str(2*4)+".2f")
'    5.00'
format(5,str(2*15)+"f")
'                      5.000000'
'''format(string, str(expression) + "s")'''
format('s',str(2*3)+"s")
's     '
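The same computed-width formatting can be written with f-strings, where the width is a nested replacement field inside the format spec. A short sketch; the variable names are placeholders.

```python
width = 2 * 4

as_int = f"{5:{width}d}"          # right-aligned integer in a field of 8
as_float = f"{5:{width}.2f}"      # two decimal places in a field of 8
as_text = f"{'s':{2 * 3}s}"       # strings are left-aligned by default

print(repr(as_int))               # '       5'
print(repr(as_float))             # '    5.00'
print(repr(as_text))              # 's     '
```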

Removing duplicates, advanced

def dedupe(items,key=None):
    seen = set()
    for item in items:
        # if a key function is given (e.g. for dicts, which are unhashable),
        # use it to turn the item into a hashable value
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)   # remember the value

if __name__=="__main__":
    a = [{'x':2,'y':4},{'x':3,'y':5},{'x':5,'y':8},{'x':2,'y':4},{'x':3,'y':5}]
    b = [1,2,3,4,1,3,5]
    print(b)
    print(list(dedupe(b)))
    print(a)
    # deduplicate by the pair (a['x'], a['y'])
    print(list(dedupe(a,key=lambda a:(a['x'],a['y']))))
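When every item is hashable and no key function is needed, an order-preserving shortcut is `dict.fromkeys`, since dictionaries keep insertion order (Python 3.7 and later). A minimal sketch; `dedupe_hashable` is a name chosen here.

```python
def dedupe_hashable(items):
    """Remove duplicates from hashable items while keeping first-seen order."""
    return list(dict.fromkeys(items))


print(dedupe_hashable([1, 2, 3, 4, 1, 3, 5]))   # [1, 2, 3, 4, 5]
```

The generator version above remains the more general tool, because it also handles unhashable items such as dictionaries through its `key` argument.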

Crawler workflow review 2

1. Open a web page
urllib.request.urlopen('URL')
Example: response = urllib.request.urlopen('http://www.baidu.com/')
Return value: <http.client.HTTPResponse object at 0x00000224EC2C9490>

2. Get all response headers
response.getheaders()
Return value: [('Bdpagetype', '1'), ('Bdqid', '0x8fa65bba0000ba44'),···,('Transfer-Encoding', 'chunked')], a list of ('header', 'value') tuples

3. Get one response header by name
response.getheader('header name')
Example: response.getheader('Content-Type')
Return value: 'text/html;charset=utf-8'

4. Check the status code
response.status
Return value: 200 means success

5. Read the body as bytes, then decode it as UTF-8
Bytes: html = response.read()
Text: text = html.decode('utf-8')
read() consumes the body, so call it once and decode the stored bytes when printing:
print(html.decode('utf-8'))

6. Save the HTML data
fp = open('filename.html', 'wb')
Example: fp = open('baidu.html', 'wb')
fp.write(html)

7. Close the file
fp.close()

8. Fetch https pages using ssl
Example:
import ssl
context = ssl._create_unverified_context()
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}
request = urllib.request.Request('http://www.baidu.com/', headers = headers)
response = urllib.request.urlopen(request, context = context)
This response behaves the same as the one above.

9. Get the status code
response.getcode()
Return value: 200

10. Get the URL of the fetched page
response.geturl()
Return value: https://www.baidu.com/

11. Get the response header information
response.info()
Example:
import ssl
request = urllib.request.Request('http://www.baidu.com/', headers = headers)
context = ssl._create_unverified_context()
response = urllib.request.urlopen(request, context = context)
response.info()
This returns the header information. Without a Request object:
response = urllib.request.urlopen('http://www.baidu.com/')
response.info()
Return value: <http.client.HTTPMessage object at 0x00000268D453DA60>

12. Save a web page
urllib.request.urlretrieve(url, 'filename.html')
Example: urllib.request.urlretrieve(url, 'baidu.html')

13. Save an image
urllib.request.urlretrieve(url, 'imagename.jpg')
Example: urllib.request.urlretrieve(url, 'Dog.jpg')

Characters that are not URL-safe (such as Chinese characters) must be encoded first.

14. urllib.parse.quote() encodes everything except - . _ / 0-9 A-Z a-z
Example:
Param = "全文检索:*"
urllib.parse.quote(Param)
Return value: '%E5%85%A8%E6%96%87%E6%A3%80%E7%B4%A2%3A%2A'
Reference: https://blog.csdn.net/ZTCooper/article/details/80165038

15. urllib.parse.quote_plus() also encodes the slash / (it becomes %2F)
urllib.parse.quote_plus(Param)

16. Join a dict into a query string; Chinese characters are URL-encoded
dic_object = {
    'user_name':'张三',
    'user_passwd':'123456'
}
urllib.parse.urlencode(dic_object)
Return value: 'user_name=%E5%BC%A0%E4%B8%89&user_passwd=123456'

17. Read one line of the response
url = 'http://www.baidu.com'
response = urllib.request.urlopen(url)
response.readline()

18. Pick a random request header (from a list of User-Agent strings)
import random
user_agent = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows NT 6.1; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11",
    "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
]
ua = random.choice(user_agent)
headers = {'User-Agent':ua}

19. URL-encode Chinese text entered by the user
urllib.parse.urlencode(dict object)
Example:
chinese = input('Enter the Chinese word to search for: ')
wd = {'wd':chinese}
wd = urllib.parse.urlencode(wd)
Return value (for the input 你好): 'wd=%E4%BD%A0%E5%A5%BD'

20. Typical pagination
for page in range(start_page, end_page + 1):
    pn = (page - 1) * 50

21. The URL is usually built by string concatenation
Example: fullurl = url + '&pn=' + str(pn)

22. Build the name of the file to save
Example: filename = 'tieba/' + name + '貼吧_第' + str(page) + '頁.html'

23. Save the file
with open(filename,'wb') as f:
    f.write(response.read())

24. Request headers that can be removed
cookie, accept-encoding, accept-language, content-length, connection, origin, host

25. Request headers that should not be removed
Accept, X-Requested-With, User-Agent, Content-Type, Referer

26. Data submitted to the page: formdata
formdata = {
    'from':'en',
    'to':'zh',
    'query':word,
    'transtype':'enter',
    'simple_means_flag':'3'
}

27. URL-encode formdata and convert it to bytes
formdata = urllib.parse.urlencode(formdata).encode('utf-8')

28. Pass formdata to urlopen()
response = urllib.request.urlopen(request, data=formdata)

29. Convert the response into usable data (import json)
read -> decode -> loads -> json.dumps
read() returns bytes:
data = response.read()
Decode the bytes into a UTF-8 string:
data = data.decode('utf-8')
Parse the JSON string into a Python object:
obj = json.loads(data)
Serialize the object back to a JSON string with ASCII escaping disabled:
html = json.dumps(obj, ensure_ascii=False)
Write the string to a file with the utf-8 encoding, in the same way as before:
with open('json.txt', 'w', encoding='utf-8') as f:
    f.write(html)

30. Header sent automatically by ajax requests
'X-Requested-With':'XMLHttpRequest'

31. Douban must be fetched over https by default, so the ssl module is needed to skip certificate verification
Example:
url = 'http://movie.douban.com/j/chart/top_list?type=24&interval_id=100%3A90&action='
page = int(input('Enter the page number to fetch: '))
start = (page - 1) * 20
limit = 20
key = {
    'start':start,
    'limit':limit
}
key = urllib.parse.urlencode(key)
url = url + '&' + key
headers = {
    'X-Requested-With':'XMLHttpRequest',
    'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'
}
request = urllib.request.Request(url, headers=headers)
# context = ssl._create_unverified_context()
response = urllib.request.urlopen(request)
jsonret = response.read()
with open('douban.txt', 'w', encoding='utf-8') as f:
    f.write(jsonret.decode('utf-8'))
print('over')

32. Create a handler object for http requests
http_handler = urllib.request.HTTPHandler()

33. Handler for https requests
https_handler = urllib.request.HTTPSHandler()

34. Create an opener object that supports http requests
opener = urllib.request.build_opener(http_handler)

35. Create the response object with opener.open(Request object)
Example:
request = urllib.request.Request('http://www.baidu.com/')
response = opener.open(request)
Save it:
with open('filename.html', 'w', encoding='utf-8') as f:
    f.write(response.read().decode('utf-8'))

36. Proxy server
http_proxy_handler = urllib.request.ProxyHandler({'https':'ip:port'})
Example: http_proxy_handler = urllib.request.ProxyHandler({'https':'121.43.178.58:3128'})

37. Private proxy server with credentials (the line below is only an example and may not be exactly right)
authproxy_handler = urllib.request.ProxyHandler({"http" : "user:password@ip:port"})

38. Use no proxy at all
http_proxy_handler = urllib.request.ProxyHandler({})

39. Build the opener once a proxy handler is in use
opener = urllib.request.build_opener(http_proxy_handler)

40. Get the response
response = opener.open(request)

41. Requesting an address that does not exist raises urllib.error.URLError

42. HTTPError (a subclass of URLError)
Example:
try:
    urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    print(e.code)
    print(e.reason)
except urllib.error.URLError as e:
    print(e)

43. Create a cookie object with CookieJar to hold cookie values
import http.cookiejar
cookie = http.cookiejar.CookieJar()

44. Build a handler object with HTTPCookieProcessor to process cookies
cookie_handler = urllib.request.HTTPCookieProcessor(cookie)
Build the opener:
opener = urllib.request.build_opener(cookie_handler)

45. Write patterns as raw strings r'...'
In a normal string a backslash starts an escape sequence; r'\d' keeps the backslash, so the regex engine receives \d unchanged.

46. Compile a regex pattern
pattern = re.compile(r'pattern', re.xxx)
pattern = re.compile(r'i\s(.*?),')
Example: pattern = re.compile(r'LOVE', re.I)

47. match only matches at the starting position
pattern.match('string'[, start, end])
Example (with pattern = re.compile(r'LOVE', re.I)): m = pattern.match('i love you', 2, 6)
Return value: <re.Match object; span=(2, 6), match='love'>

48. search scans from start to end and returns the first match
pattern.search('string')
Example: m = pattern.search('i love you, do you love me, yes, i love')
Return value: <re.Match object; span=(2, 6), match='love'>

49. findall puts every match into a list
pattern.findall('string')
Example: m = pattern.findall('i love you, do you love me, yes, i love')
Return value: ['love', 'love', 'love']

50. split cuts the original data at each match
pattern.split('string', maximum number of splits)
Example: m = pattern.split('i love you, do you love me, yes, i love me', 1)
Return value: ['i ', ' you, do you love me, yes, i love me']
Example: m = pattern.split('i love you, do you love me, yes, i love me', 2)
Return value: ['i ', ' you, do you ', ' me, yes, i love me']
Example: m = pattern.split('i love you, do you love me, yes, i love me', 3)
Return value: ['i ', ' you, do you ', ' me, yes, i ', ' me']

51. sub replaces the matched text with a new string; all matches are replaced by default
pattern.sub('new string', 'string to search'[, count])
Note: the return value is a string
Example:
string = 'i love you, do you love me, yes, i love me'
m = pattern.sub('hate', string, 1)
m is 'i hate you, do you love me, yes, i love me'

52. group returns matched groups
m.group() returns the whole matched text
m.group(1) returns the text matched by the first capture group
Example:
string = 'i love you, do you love me, yes, i love me'
pattern = re.compile(r'i\s(.*?),')
m = pattern.match(string)
m.group()
Return value: 'i love you,'
m.group(1)
Return value: 'love you'

53. Match tags
pattern = re.compile(r'<div class="thumb">(.*?)<img 
src=(.*?) alt=(.*?)>(.*?)</div>', re.S) 54.分離出文件名和擴展名,返回二元組 os.path.splitext(參數) 例： 獲取路徑 image_path = './qiushi' 獲取後綴名 extension = os.path.splitext(image_url)[-1] 55.合併多個字符串 os.path.join() 圖片路徑 image_path = os.path.join(image_path, image_name + extension) 保存文件 urllib.request.urlretrieve(image_url, image_path) 56.獲取 a 標籤下的 href 的內容 pattern = re.compile(r'<a href="(.*?)" class="main_14" target="_blank">(.*?)</a>', re.M) 例： import urllib.parse import urllib.request import re class SmileSpider(object): """ 爬取笑話網站笑話的排行榜 """ def __init__(self, url, page=1): super(SmileSpider, self).__init__() self.url = url self.page = page def handle_url(self): ''' 處理url而且生成request請求對象 ''' self.url = self.url + '?mepage=' + str(self.page) headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', } request = urllib.request.Request(self.url, headers=headers) return request def xiazai(self, request): ''' 負責下載數據，而且將數據返回 ''' response = urllib.request.urlopen(request) html = response.read().decode('gbk') return html def handle_data(self, data): ''' 開始處理數據，將段子抓取出來而且寫入文件 ''' # 這個必須使用多行模式進行抓取，由於是抓取多個a連接 pattern = re.compile(r'<a href="(.*?)" class="main_14" target="_blank">(.*?)</a>', re.M) # 找到全部的笑話連接 alist = pattern.findall(data) # print(alist) # exit() print('開始下載') for smile in alist: # 獲取標題 # title = alist[14][1] title = smile[1] # 獲取url # smile_url = alist[14][0] smile_url = smile[0] # 獲取內容 content = self.handle_content(smile_url) # 將抓取的這一頁的笑話寫到文件中 with open('xiaohua.html', 'a', encoding='gbk') as f: f.write('<h1>' + title + '</h1>' + content) print('下載完畢') def handle_content(self, smile_url): # 由於有的href中有中文，因此必須先轉碼再拼接，若是先拼接再轉碼，就會將：也給轉碼了，不符合要求 smile_url = urllib.parse.quote(smile_url) smile_url = 'http://www.jokeji.cn' + smile_url # print(smile_url) # exit() content = self.xiazai(smile_url) # 因爲抓取的文本中，有的中間有空格，因此使用單行模式進行抓取 pattern = re.compile(r'<span id="text110">(.*?)</span>', re.S) ret = 
pattern.search(content) return ret.group(1) def start(self): request = self.handle_url() html = self.xiazai(request) self.handle_data(html) if __name__ == '__main__': url = 'http://www.jokeji.cn/hot.asp' spider = SmileSpider(url) spider.start() 57.href 中有中文的須要先進行轉碼,而後再拼接 smile_url = urllib.parse.quote(smile_url) smile_url = 'http://www.jokeji.cn' + smile_url 58.導入 etree from lxml import etree 59.實例化一個 html 對象,DOM模型 etree.HTML(經過requests庫的get方法或post方法獲取的信息 其實就是 HTML 代碼) 例：html_tree = etree.HTML(text) 返回值爲 <Element html at 0x26ee35b2400> 例：type(html_tree) <class 'lxml.etree._Element'> 60.查找全部的 li 標籤 html_tree.xpath('//li') 61.獲取全部li下面a中屬性href爲link1.html的a result = html_tree.xpath('//標籤/標籤[@屬性="值"]') 例：result = html_tree.xpath('//li/a[@href="link.html"]') 62.獲取最後一個 li 標籤下 a 標籤下面的 href 值 result = html_tree.xpath('//li[last()]/a/@href') 63.獲取 class 爲 temp 的結點 result = html_tree.xpath('//*[@class = "temp"]') 64.獲取全部 li 標籤下的 class 屬性 result = html_tree.xpath('//li/@class') 65.取出內容 [0].text 例：result = html_tree.xpath('//li[@class="popo"]/a')[0].text 例：result = html_tree.xpath('//li[@class="popo"]/a/text()') 66.將 tree 對象轉化爲字符串 etree.tostring(etree.HTML對象).decode('utf-8') 例： text = ''' <div> <ul> <li class="item-0"><a href="link1.html">first item</a></li> <li class="item-1"><a href="link2.html">second item</a></li> <li class="item-inactive"><a href="link3.html">third item</a></li> <li class="item-1"><a href="link4.html">fourth item</a></li> <li class="item-0"><a href="link5.html">fifth item</a> </ul> </div> ''' html = etree.HTML(text) tostring 轉換的是bytes 類型數據 result = etree.tostring(html) 將 bytes 類型數據轉換爲 str 類型數據 print(result.decode('utf-8')) 67.動態保存圖片,使用url後幾位做爲文件名 request = urllib.request.Request(url, headers=headers) response = urllib.request.urlopen(request) html_tree = etree.HTML(html) img_list = html_tree.xpath('//div[@class="box picblock col3"]/div/a/img/@src2') for img_url in img_list: # 定製圖片名字爲url後10位 file_name = 'image/' + img_url[-10:] load_image(img_url, 
file_name) load_image內容： def load_image(url, file_name): headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', } request = urllib.request.Request(url, headers=headers) response = urllib.request.urlopen(request) image_bytes = response.read() with open(file_name, 'wb') as f: f.write(image_bytes) print(file_name + '圖片已經成功下載完畢') 例： def load_page(url): headers = { #'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', } print(url) # exit() request = urllib.request.Request(url, headers=headers) response = urllib.request.urlopen(request) html = response.read() # 這是專業的圖片網站，使用了懶加載，可是能夠經過源碼來進行查看，而且從新寫xpath路徑 with open('7image.html', 'w', encoding='utf-8') as f: f.write(html.decode('utf-8')) exit() # 將html文檔解析問DOM模型 html_tree = etree.HTML(html) # 經過xpath，找到須要的全部的圖片的src屬性，這裏獲取到的 img_list = html_tree.xpath('//div[@class="box picblock col3"]/div/a/img/@src2') for img_url in img_list: # 定製圖片名字爲url後10位 file_name = 'image/' + img_url[-10:] load_image(img_url, file_name) def load_image(url, file_name): headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', } request = urllib.request.Request(url, headers=headers) response = urllib.request.urlopen(request) image_bytes = response.read() with open(file_name, 'wb') as f: f.write(image_bytes) print(file_name + '圖片已經成功下載完畢') def main(): start = int(input('請輸入開始頁面:')) end = int(input('請輸入結束頁面:')) url = 'http://sc.chinaz.com/tag_tupian/' for page in range(start, end + 1): if page == 1: real_url = url + 'KaTong.html' else: real_url = url + 'KaTong_' + str(page) + '.html' load_page(real_url) print('第' + str(page) + '頁下載完畢') if __name__ == '__main__': main() 68.懶圖片加載案例 例： import urllib.request from lxml import etree import json def handle_tree(html_tree): node_list = 
html_tree.xpath('//div[@class="detail-wrapper"]') duan_list = [] for node in node_list: # 獲取全部的用戶名，由於該xpath獲取的是一個span列表，而後獲取第一個，而且經過text屬性獲得其內容 user_name = node.xpath('./div[contains(@class, "header")]/a/div/span[@class="name"]')[0].text # 只要涉及到圖片，頗有可能都是懶加載，因此要右鍵查看網頁源代碼，才能獲得真實的連接 # 因爲這個獲取的結果就是屬性字符串，因此只須要加上下標0便可 face = node.xpath('./div[contains(@class, "header")]//img/@data-src')[0] # .表明當前，一個/表示一級子目錄，兩個//表明當前節點裏面任意的位置查找 content = node.xpath('./div[@class="content-wrapper"]//p')[0].text zan = node.xpath('./div[@class="options"]//li[@class="digg-wrapper "]/span')[0].text item = { 'username':user_name, 'face':face, 'content':content, 'zan':zan, } # 將其存放到列表中 duan_list.append(item) # 將列表寫入到文件中 with open('8duanzi.txt', 'a', encoding='utf-8') as f: f.write(json.dumps(duan_list, ensure_ascii=False) + '\n') print('over') def main(): # 爬取百度貼吧，不能加上headers，加上headers爬取不下來 url = 'http://neihanshequ.com/' headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', } request = urllib.request.Request(url, headers=headers) response = urllib.request.urlopen(request) html_bytes = response.read() # fp = open('8tieba.html', 'w', encoding='utf-8') # fp.write(html_bytes.decode('utf-8')) # fp.close() # exit() # 將html字節串轉化爲html文檔樹 # 文檔樹有xpath方法，文檔節點也有xpath方法 # 【注】不能使用字節串轉化爲文檔樹，這樣會有亂碼 html_tree = etree.HTML(html_bytes.decode('utf-8')) handle_tree(html_tree) if __name__ == '__main__': main() 69. . 
/ and // in XPath
. is the current node
/ selects a direct child
// selects descendants at any depth below the current node

70. Examples of extracting content
When the XPath result is already a string (for example an attribute value), .text is not needed; index with [0] only
face = node.xpath('./div[contains(@class, "header")]//img/@data-src')[0]
Content of the p tags under the div whose class is "content-wrapper"
content = node.xpath('./div[@class="content-wrapper"]//p')[0].text
Content of the span tags under the li with class "digg-wrapper", inside the div whose class is "options"
zan = node.xpath('./div[@class="options"]//li[@class="digg-wrapper"]/span')[0].text

71. Serialize a Python object to a JSON string
f.write(json.dumps(duan_list, ensure_ascii=False) + '\n')

72. Use a regex to extract the content of a div
1. the data between div and img
2. the src value of img
3. the alt value of img
4. the data up to the closing div
pattern = re.compile(r'<div class="thumb">(.*?)<img src=(.*?) alt=(.*?)>(.*?)</div>', re.S)
Then call the pattern methods described in the regex items above

73. GET request with parameters
import requests
params = {
    'wd':'中國'
}
r = requests.get('http://www.baidu.com/s?', headers=headers, params=params)
requests.get also accepts a cookies parameter

74. Set the encoding
r.encoding = 'utf-8'

75. View the headers that were sent with the request
r.request.headers

76. Parameters of requests.get
url      the address
params   the query data
headers  the request headers
proxies  the proxies

77. Send requests through a Session object
s = requests.Session()

78. Send a POST request
s.post(url, data=data, headers=headers)

79. Send a GET request with the same session
s.get(url, proxies=proxies)

80. When the response is JSON
Example:
city = input('Enter the city to look up: ')
params = {
    'city':city
}
r = requests.get(url, params=params)
r.json() returns the response body parsed as JSON

81. Create a BeautifulSoup object (url here is the path of a local HTML file)
from bs4 import BeautifulSoup
soup = BeautifulSoup(open(url, encoding='utf-8'), 'lxml')

82. Find the first <title> tag
soup.title
Return value: <title>三國猛將</title>

83. Find the first a tag
soup.a
Return value: <a class="aa" href="http://www.baidu.com" title="baidu">百度</a>

84. Find the first ul tag
soup.ul

85. Get the tag name
a_tag = soup.a
a_tag.name
Return value: a

86. Get the tag attributes
a_tag.attrs
Return value: {'href': 'http://www.baidu.com', 'title': 'baidu', 'class': ['aa']}

87. Get the href of the first a tag
soup.a.get('href')
Return value: http://www.baidu.com

88. Get the title attribute of the first a tag
soup.a.get('title')
Return value: baidu

89. Get the text inside a tag
soup.tag.string (the tag can also be head, title, and so on)
soup.a.string
Return value: 百度

90. Get the text inside the p tag
soup.p.string

91. View the contents of the div, including '\n'
soup.div.contents
Return value:
['\n', <div class="div"> <a class="la" href="www.nihao.com">你好</a> </div>, '\n', <div> <a href="www.hello.com">世界</a> </div>, '\n']

92. View the character set in use
soup.div.contents[1]
Return value: <meta charset="utf-8"/>

93. View the child nodes of body
soup.tag.children
Example: soup.body.children
The return value is an iterator and must be looped over to print
Return value: <list_iterator object at 0x0000021863886C10>
for child in soup.body.children:
    print(child)
This prints everything inside body

94. View all descendant nodes
soup.tag.descendants
Example: soup.div.descendants
Return value:
<div class="div"> <a class="la" href="www.nihao.com">你好</a> </div>
<a class="la" href="www.nihao.com">你好</a>
你好

95. Find all a tags
soup.find_all('a')
Return value: a list containing all the a tags

96. Get the text of the second a tag
soup.find_all('a')[1].string

97. Get the href value of the second a tag
soup.find_all('a')[1]['href']

98. Use a re pattern to find all tags whose name starts with b
soup.find_all(re.compile('^b'))
Return value: the <body> and <b> tags

99. Find all a tags and b tags
soup.find_all(['a','b'])
Return value: the <a> and <b> tags

100. Select all a tags by tag name
soup.select('a')
Return value: all the <a> tags

101. Select tags by class name (put . before the class value)
soup.select('.aa')
Return value: the tags with class='aa'

102. Select tags by id (put # before the id value)
soup.select('#wangyi')
Return value: the tag with id='wangyi'

103. Select tags with class='aa' anywhere under div
soup.select('tag .class-value')
soup.select('div .aa')

104. Select direct children only
soup.select('parent selector > child selector')
soup.select('.div > .la') selects the elements with class "la" that are direct children of elements with class "div"

105. Select by attribute: input tags whose class is haha
soup.select('input[class="haha"]')

Example:
import requests
from bs4 import BeautifulSoup
import json
import lxml

def load_url(jl, kw):
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36',
    }
    url = 'http://sou.zhaopin.com/jobs/searchresult.ashx?'
    params = {
        'jl':jl,
        'kw':kw,
    }
    # requests URL-encodes the parameters automatically
    r = requests.get(url, params=params, headers=headers)
    handle_data(r.text)

def handle_data(html):
    # Create the soup object
    soup = BeautifulSoup(html, 'lxml')
    # Find the job listings
    job_list = soup.select('#newlist_list_content_table table')
    # print(job_list)
    jobs = []
    i = 1
    for job in job_list:
        # The first table is only the header row, so skip it
        if i == 1:
            i = 0
            continue
        item = {}
        # Job title
        job_name = job.select('.zwmc div a')[0].get_text()
        # Company name
        company_name = job.select('.gsmc a')[0].get_text()
        # Work location
        area = job.select('.gzdd')[0].get_text()
        # Publication date
        time = job.select('.gxsj span')[0].get_text()
        # Add all the fields to the dict
        item['job_name'] = job_name
        item['company_name'] = company_name
        item['area'] = area
        item['time'] = time
        jobs.append(item)
    # Convert the list to a JSON string and write it to a file
    content = json.dumps(jobs, ensure_ascii=False)
    with open('python.json', 'w', encoding='utf-8') as f:
        f.write(content)
    print('over')

def main():
    # jl = input('Enter the work location: ')
    # kw = input('Enter the job title: ')
    load_url(jl='北京', kw='python')

if __name__ == '__main__':
    main()

106. Convert a dict to a JSON string
import json
str_dict = {"name":"張三", "age":55, "height":180}
print(json.dumps(str_dict, ensure_ascii=False))
With ensure_ascii=False, non-ASCII characters are written as they are instead of as \u escapes

107. Read the converted object back (note the difference between loads and load)
json.loads(JSON string)
string = json.dumps(str_dict, ensure_ascii=False)
json.loads(string)
Return value: {'name': '張三', 'age': 55, 'height': 180}

108. Serialize an object and write it to a file
json.dump(dict object, open('filename.json', 'w', encoding='utf-8'), ensure_ascii=False)
json.dump(str_dict, open('jsontest.json', 'w', encoding='utf-8'), ensure_ascii=False)

109. Load a local JSON file into a Python object
json.load(open('filename.json', encoding='utf-8'))

110. jsonpath
Example:
The file book.json:
{
  "store": {
    "book": [
      { "category": "reference", "author": "Nigel Rees", "title": "Sayings of the Century", "price": 8.95 },
      { "category": "fiction", "author": "Evelyn Waugh", "title": "Sword of Honour", "price": 12.99 },
      { "category": "fiction", "author": "Herman Melville", "title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99 },
      { "category": "fiction", "author": "J. R. R. Tolkien", "title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99 }
    ],
    "bicycle": { "color": "red", "price": 19.95 }
  }
}

import json
import jsonpath
obj = json.load(open('book.json', encoding='utf-8'))
All books
book = jsonpath.jsonpath(obj, '$..book')
print(book)
All authors of all books
authors = jsonpath.jsonpath(obj, '$..book..author')
print(authors)
The first two books: '$..book[:2]'
The last two books: '$..book[-2:]'
book = jsonpath.jsonpath(obj, '$..book[0,1]')
print(book)
All books that have an isbn attribute
book = jsonpath.jsonpath(obj, '$..book[?(@.isbn)]')
print(book)
All books with a price below 10
book = jsonpath.jsonpath(obj, '$.store.book[?(@.price<10)]')
print(book)
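Several steps from the crawler review can be tried without any network access. The sketch below uses only the standard library. The sample path, query values, and JSON payload are invented for illustration. It shows the difference between `quote` and `quote_plus`, a capture group with `re`, and the bytes -> str -> object -> str round trip from item 29.

```python
import json
import re
from urllib.parse import quote, quote_plus, urlencode

# quote() leaves "/" alone by default; quote_plus() encodes it and turns spaces into "+"
path = quote("/s/你好 world")
query_value = quote_plus("/s/你好 world")
query = urlencode({"wd": "你好", "pn": 50})

# group() is the whole match, group(1) is the first capture group
pattern = re.compile(r"i\s(.*?),")
m = pattern.match("i love you, do you love me, yes, i love me")
whole, first_group = m.group(), m.group(1)

# Simulated response body: bytes containing JSON with \u escapes
raw = '{"name": "\\u5f20\\u4e09", "age": 55}'.encode("utf-8")
obj = json.loads(raw.decode("utf-8"))        # bytes -> str -> Python object
text = json.dumps(obj, ensure_ascii=False)   # back to a JSON string, escapes removed

print(path, query_value, query)
print(whole, first_group)
print(text)
```

In a real crawler, `raw` would come from `response.read()`. Everything after that line stays the same.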

The numpy third-party library

# 導入numpy 並賦予別名 np import numpy as np # 建立數組的經常使用的幾種方式(列表，元組，range,arange,linspace(建立的是等差數組),zeros(全爲 0 的數組),ones(全爲 1 的數組),logspace(建立的是對數數組)) # 列表方式 np.array([1,2,3,4]) # array([1, 2, 3, 4]) # 元組方式 np.array((1,2,3,4)) # array([1, 2, 3, 4]) # range 方式 np.array(range(4)) # 不包含終止數字 # array([0, 1, 2, 3]) # 使用 arange(初始位置=0,末尾,步長=1) # 不包含末尾元素 np.arange(1,8,2) # array([1, 3, 5, 7]) np.arange(8) # array([0, 1, 2, 3, 4, 5, 6, 7]) # 使用 linspace(起始數字,終止數字，包含數字的個數[,endpoint = False]) 生成等差數組 # 生成等差數組,endpoint 爲 True 則包含末尾數字 np.linspace(1,3,4,endpoint=False) # array([1. , 1.5, 2. , 2.5]) np.linspace(1,3,4,endpoint=True) # array([1. , 1.66666667, 2.33333333, 3. ]) # 建立全爲零的一維數組 np.zeros(3) # 建立全爲一的一維數組 np.ones(4) # array([1., 1., 1., 1.]) np.linspace(1,3,4) # array([1. , 1.66666667, 2.33333333, 3. ]) # np.logspace(起始數字，終止數字，數字個數，base = 10) 對數數組 np.logspace(1,3,4) # 至關於 10 的 linspace(1,3,4) 次方 # array([ 10. , 46.41588834, 215.443469 , 1000. ]) np.logspace(1,3,4,base = 2) # array([2. , 3.1748021, 5.0396842, 8. 
]) # 建立二維數組(列表嵌套列表) np.array([[1,2,3],[4,5,6]]) ''' array([[1, 2, 3], [4, 5, 6]]) ''' # 建立全爲零的二維數組 # 兩行兩列 np.zeros((2,2)) ''' array([[0., 0.], [0., 0.]]) ''' # 三行三列 np.zeros((3,2)) ''' array([[0., 0.], [0., 0.], [0., 0.]]) ''' # 建立一個單位數組 np.identity(3) ''' array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]) ''' # 建立一個對角矩陣，(參數爲對角線上的數字) np.diag((1,2,3)) ''' array([[1, 0, 0], [0, 2, 0], [0, 0, 3]]) ''' import numpy as np x = np.arange(8) # [0 1 2 3 4 5 6 7] # 在數組尾部追加一個元素 np.append(x,10) # array([ 0, 1, 2, 3, 4, 5, 6, 7, 10]) # 在數組尾部追加多個元素 np.append(x,[15,16,17]) # array([ 0, 1, 2, 3, 4, 5, 6, 7, 15, 16, 17]) # 使用 數組下標修改元素的值 x[0] = 99 # array([99, 1, 2, 3, 4, 5, 6, 7]) # 在指定位置插入數據 np.insert(x,0,54) # array([54, 99, 1, 2, 3, 4, 5, 6, 7]) # 建立一個多維數組 x = np.array([[1,2,3],[11,22,33],[111,222,333]]) ''' array([[ 1, 2, 3], [ 11, 22, 33], [111, 222, 333]]) ''' # 修改第 0 行第 2 列的元素值 x[0,2] = 9 ''' array([[ 1, 2, 9], [ 11, 22, 33], [111, 222, 333]]) ''' # 行數大於等於 1 的，列數大於等於 1 的置爲 1 x[1:,1:] = 1 ''' array([[ 1, 2, 9], [ 11, 1, 1], [111, 1, 1]]) ''' # 同時修改多個元素值 x[1:,1:] = [7,8] ''' array([[ 1, 2, 9], [ 11, 7, 8], [111, 7, 8]]) ''' x[1:,1:] = [[7,8],[9,10]] ''' array([[ 1, 2, 9], [ 11, 7, 8], [111, 9, 10]]) ''' import numpy as np n = np.arange(10) # array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) # 查看數組的大小 n.size # 10 # 將數組分爲兩行五列 n.shape = 2,5 ''' array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) ''' # 顯示數組的維度 n.shape # (2, 5) # 設置數組的維度，-1 表示自動計算 n.shape = 5,-1 ''' array([[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]) ''' # 將新數組設置爲調用數組的兩行五列並返回 x = n.reshape(2,5) ''' array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]) ''' x = np.arange(5) # 將數組設置爲兩行，沒有數的設置爲 0 x.resize((2,10)) ''' array([[0, 1, 2, 3, 4, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]) ''' # 將 x 數組的兩行五列形式顯示，不改變 x 的值 np.resize(x,(2,5)) ''' array([[0, 1, 2, 3, 4], [0, 0, 0, 0, 0]]) ''' import numpy as np n = np.array(([1,2,3],[4,5,6],[7,8,9])) ''' array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) ''' # 第一行元素 n[0] # array([1, 2, 3]) # 第一行第三列元素 n[0,2] # 3 # 第一行和第二行的元素 
n[[0,1]] ''' array([[1, 2, 3], [4, 5, 6]]) ''' # 第一行第三列，第三行第二列，第二行第一列 n[[0,2,1],[2,1,0]] # array([3, 8, 4]) a = np.arange(8) # array([0, 1, 2, 3, 4, 5, 6, 7]) # 將數組倒序 a[::-1] # array([7, 6, 5, 4, 3, 2, 1, 0]) # 步長爲 2 a[::2] # array([0, 2, 4, 6]) # 從 0 到 4 的元素 a[:5] # array([0, 1, 2, 3, 4]) c = np.arange(16) # array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) c.shape = 4,4 ''' array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) ''' # 第一行，第三個元素到第五個元素(若是沒有則輸出到末尾截止) c[0,2:5] # array([2, 3]) # 第二行元素 c[1] # array([4, 5, 6, 7]) # 第三行到第六行，第三列到第六列 c[2:5,2:5] ''' array([[10, 11], [14, 15]]) ''' # 第二行第三列元素和第三行第四列元素 c[[1,2],[2,3]] # array([ 6, 11]) # 第一行和第三行的第二列到第三列的元素 c[[0,2],1:3] ''' array([[ 1, 2], [ 9, 10]]) ''' # 第一列和第三列的全部橫行元素 c[:,[0,2]] ''' array([[ 0, 2], [ 4, 6], [ 8, 10], [12, 14]]) ''' # 第三列全部元素 c[:,2] # array([ 2, 6, 10, 14]) # 第二行和第四行的全部元素 c[[1,3]] ''' array([[ 4, 5, 6, 7], [12, 13, 14, 15]]) ''' # 第一行的第二列，第四列元素，第四行的第二列，第四列元素 c[[0,3]][:,[1,3]] ''' array([[ 1, 3], [13, 15]]) ''' import numpy as np x = np.array((1,2,3,4,5)) # 使用 * 進行相乘 x*2 # array([ 2, 4, 6, 8, 10]) # 使用 / 進行相除 x / 2 # array([0.5, 1. , 1.5, 2. , 2.5]) 2 / x # array([2. , 1. 
, 0.66666667, 0.5 , 0.4 ]) # 使用 // 進行整除 x//2 # array([0, 1, 1, 2, 2], dtype=int32) 10//x # array([10, 5, 3, 2, 2], dtype=int32) # 使用 ** 進行冪運算 x**3 # array([ 1, 8, 27, 64, 125], dtype=int32) 2 ** x # array([ 2, 4, 8, 16, 32], dtype=int32) # 使用 + 進行相加 x + 2 # array([3, 4, 5, 6, 7]) # 使用 % 進行取模 x % 3 # array([1, 2, 0, 1, 2], dtype=int32) # 數組與數組之間的運算 # 使用 + 進行相加 np.array([1,2,3,4]) + np.array([11,22,33,44]) # array([12, 24, 36, 48]) np.array([1,2,3,4]) + np.array([3]) # array([4, 5, 6, 7]) n = np.array((1,2,3)) # + n + n # array([2, 4, 6]) n + np.array([4]) # array([5, 6, 7]) # * n * n # array([1, 4, 9]) n * np.array(([1,2,3],[4,5,6],[7,8,9])) ''' array([[ 1, 4, 9], [ 4, 10, 18], [ 7, 16, 27]]) ''' # - n - n # array([0, 0, 0]) # / n/n # array([1., 1., 1.]) # ** n**n # array([ 1, 4, 27], dtype=int32) x = np.array((1,2,3)) y = np.array((4,5,6)) # 數組的內積運算(對應位置上元素相乘) np.dot(x,y) # 32 sum(x*y) # 32 # 布爾運算 n = np.random.rand(4) # array([0.53583849, 0.09401473, 0.07829069, 0.09363152]) # 判斷數組中的元素是否大於 0.5 n > 0.5 # array([ True, False, False, False]) # 將數組中大於 0.5 的元素顯示 n[n>0.5] # array([0.53583849]) # 找到數組中 0.05 ~ 0.4 的元素總數 sum((n > 0.05)&(n < 0.4)) # 3 # 是否都大於 0.2 np.all(n > 0.2) # False # 是否有元素小於 0.1 np.any(n < 0.1) # True # 數組與數組之間的布爾運算 a = np.array([1,4,7]) # array([1, 4, 7]) b = np.array([4,3,7]) # array([4, 3, 7]) # 在 a 中是否有大於 b 的元素 a > b # array([False, True, False]) # 在 a 中是否有等於 b 的元素 a == b # array([False, False, True]) # 顯示 a 中 a 的元素等於 b 的元素 a[a == b] # array([7]) # 顯示 a 中的偶數且小於 5 的元素 a[(a%2 == 0) & (a < 5)] # array([4]) import numpy as np # 將 0~100 10等分 x = np.arange(0,100,10) # array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90]) # 每一個數組元素對應的正弦值 np.sin(x) ''' array([ 0. , -0.54402111, 0.91294525, -0.98803162, 0.74511316, -0.26237485, -0.30481062, 0.77389068, -0.99388865, 0.89399666]) ''' # 每一個數組元素對應的餘弦值 np.cos(x) ''' array([ 1. 
, -0.83907153, 0.40808206, 0.15425145, -0.66693806, 0.96496603, -0.95241298, 0.6333192 , -0.11038724, -0.44807362]) ''' # 對參數進行四捨五入 np.round(np.cos(x)) # array([ 1., -1., 0., 0., -1., 1., -1., 1., -0., -0.]) # 對參數進行上入整數 3.3->4 np.ceil(x/3) # array([ 0., 4., 7., 10., 14., 17., 20., 24., 27., 30.]) # 分段函數 x = np.random.randint(0,10,size=(1,10)) # array([[0, 3, 6, 7, 9, 4, 9, 8, 1, 8]]) # 大於 4 的置爲 0 np.where(x > 4,0,1) # array([[1, 1, 0, 0, 0, 1, 0, 0, 1, 0]]) # 小於 4 的乘 2 ，大於 7 的乘3 np.piecewise(x,[x<4,x>7],[lambda x:x*2,lambda x:x*3]) # array([[ 0, 6, 0, 0, 27, 0, 27, 24, 2, 24]]) import numpy as np x = np.array([1,4,5,2]) # array([1, 4, 5, 2]) # 返回排序後元素的原下標 np.argsort(x) # array([0, 3, 1, 2], dtype=int64) # 輸出最大值的下標 x.argmax( ) # 2 # 輸出最小值的下標 x.argmin( ) # 0 # 對數組進行排序 x.sort( ) import numpy as np # 生成一個隨機數組 np.random.randint(0,6,3) # array([1, 1, 3]) # 生成一個隨機數組(二維數組) np.random.randint(0,6,(3,3)) ''' array([[4, 4, 1], [2, 1, 0], [5, 0, 0]]) ''' # 生成十個隨機數在[0,1)之間 np.random.rand(10) ''' array([0.9283789 , 0.43515554, 0.27117021, 0.94829333, 0.31733981, 0.42314939, 0.81838647, 0.39091899, 0.33571004, 0.90240897]) ''' # 從標準正態分佈中隨機抽選出3個數 np.random.standard_normal(3) # array([0.34660435, 0.63543859, 0.1307822 ]) # 返回三頁四行兩列的標準正態分佈數 np.random.standard_normal((3,4,2)) ''' array([[[-0.24880261, -1.17453957], [ 0.0295264 , 1.04038047], [-1.45201783, 0.57672288], [ 1.10282747, -2.08699482]], [[-0.3813943 , 0.47845782], [ 0.97708005, 1.1760147 ], [ 1.3414987 , -0.629902 ], [-0.29780567, 0.60288726]], [[ 1.43991349, -1.6757028 ], [-1.97956809, -1.18713495], [-1.39662811, 0.34174275], [ 0.56457553, -0.83224426]]]) ''' # 建立矩陣 import numpy as np x = np.matrix([[1,2,3],[4,5,6]]) ''' matrix([[1, 2, 3], [4, 5, 6]]) ''' y = np.matrix([1,2,3,4,5,6]) # matrix([[1, 2, 3, 4, 5, 6]]) # x 的第二行第二列元素 x[1,1] # 5 # 矩陣的函數 import numpy as np # 矩陣的轉置 x = np.matrix([[1,2,3],[4,5,6]]) ''' matrix([[1, 2, 3], [4, 5, 6]]) ''' y = np.matrix([1,2,3,4,5,6]) # matrix([[1, 2, 3, 4, 5, 6]]) # 實現矩陣的轉置 x.T ''' 
matrix([[1, 4], [2, 5], [3, 6]]) ''' y.T ''' matrix([[1], [2], [3], [4], [5], [6]]) ''' # 元素平均值 x.mean() # 3.5 # 縱向平均值 x.mean(axis = 0) # matrix([[2.5, 3.5, 4.5]]) # 橫向平均值 x.mean(axis = 1) ''' matrix([[2.], [5.]]) ''' # 全部元素之和 x.sum() # 21 # 橫向最大值 x.max(axis = 1) ''' matrix([[3], [6]]) ''' # 橫向最大值的索引下標 x.argmax(axis = 1) ''' matrix([[2], [2]], dtype=int64) ''' # 對角線元素 x.diagonal() # matrix([[1, 5]]) # 非零元素下標 x.nonzero() # (array([0, 0, 0, 1, 1, 1], dtype=int64), # array([0, 1, 2, 0, 1, 2], dtype=int64)) # 矩陣的運算 import numpy as np x = np.matrix([[1,2,3],[4,5,6]]) ''' matrix([[1, 2, 3], [4, 5, 6]]) ''' y = np.matrix([[1,2],[4,5],[7,8]]) ''' matrix([[1, 2], [4, 5], [7, 8]]) ''' # 矩陣的乘法 x*y ''' matrix([[30, 36], [66, 81]]) ''' # 相關係數矩陣,可以使用在列表元素數組矩陣 # 負相關 np.corrcoef([1,2,3],[8,5,4]) ''' array([[ 1. , -0.96076892], [-0.96076892, 1. ]]) ''' # 正相關 np.corrcoef([1,2,3],[4,5,7]) ''' array([[1. , 0.98198051], [0.98198051, 1. ]]) ''' # 矩陣的方差 np.cov([1,1,1,1,1]) # array(0.) # 矩陣的標準差 np.std([1,1,1,1,1]) # 0.0 x = [-2.1,-1,4.3] y = [3,1.1,0.12] # 垂直堆疊矩陣 z = np.vstack((x,y)) ''' array([[-2.1 , -1. , 4.3 ], [ 3. 
, 1.1 , 0.12]]) ''' # 矩陣的協方差 np.cov(z) ''' array([[11.71 , -4.286 ], [-4.286 , 2.14413333]]) ''' np.cov(x,y) ''' array([[11.71 , -4.286 ], [-4.286 , 2.14413333]]) ''' # 標準差 np.std(z) # 2.2071223094538484 # 列向標準差 np.std(z,axis = 1) # array([2.79404128, 1.19558447]) # 方差 np.cov(x) # array(11.71) # 特徵值和特徵向量 A = np.array([[1,-3,3],[3,-5,3],[6,-6,4]]) ''' array([[ 1, -3, 3], [ 3, -5, 3], [ 6, -6, 4]]) ''' e,v = np.linalg.eig(A) # e 爲特徵值, v 爲特徵向量 ''' e array([ 4.+0.00000000e+00j, -2.+1.10465796e-15j, -2.-1.10465796e-15j]) v array([[-0.40824829+0.j , 0.24400118-0.40702229j, 0.24400118+0.40702229j], [-0.40824829+0.j , -0.41621909-0.40702229j, -0.41621909+0.40702229j], [-0.81649658+0.j , -0.66022027+0.j , -0.66022027-0.j ]]) ''' # 矩陣與特徵向量的乘積 np.dot(A,v) ''' array([[-1.63299316+0.00000000e+00j, -0.48800237+8.14044580e-01j, -0.48800237-8.14044580e-01j], [-1.63299316+0.00000000e+00j, 0.83243817+8.14044580e-01j, 0.83243817-8.14044580e-01j], [-3.26598632+0.00000000e+00j, 1.32044054-5.55111512e-16j, 1.32044054+5.55111512e-16j]]) ''' # 特徵值與特徵向量的乘積 e * v ''' array([[-1.63299316+0.00000000e+00j, -0.48800237+8.14044580e-01j, -0.48800237-8.14044580e-01j], [-1.63299316+0.00000000e+00j, 0.83243817+8.14044580e-01j, 0.83243817-8.14044580e-01j], [-3.26598632+0.00000000e+00j, 1.32044054-7.29317578e-16j, 1.32044054+7.29317578e-16j]]) ''' # 驗證兩個乘積是否相等 np.isclose(np.dot(A,v),(e * v)) ''' array([[ True, True, True], [ True, True, True], [ True, True, True]]) ''' # 行列式 |A - λE| 的值應爲 0 np.linalg.det(A-np.eye(3,3)*e) # 5.965152994198125e-14j x = np.matrix([[1,2,3],[4,5,6],[7,8,0]]) ''' matrix([[1, 2, 3], [4, 5, 6], [7, 8, 0]]) ''' # 逆矩陣 y = np.linalg.inv(x) ''' matrix([[-1.77777778, 0.88888889, -0.11111111], [ 1.55555556, -0.77777778, 0.22222222], [-0.11111111, 0.22222222, -0.11111111]]) 注：numpy.linalg.LinAlgError: Singular matrix 矩陣不存在逆矩陣 ''' # 矩陣的乘法 x * y ''' matrix([[ 1.00000000e+00, 5.55111512e-17, 1.38777878e-17], [ 5.55111512e-17, 1.00000000e+00, 2.77555756e-17], [ 1.77635684e-15, 
-8.88178420e-16, 1.00000000e+00]])
'''
y * x
'''
matrix([[ 1.00000000e+00, -1.11022302e-16, 0.00000000e+00],
        [ 8.32667268e-17, 1.00000000e+00, 2.22044605e-16],
        [ 6.93889390e-17, 0.00000000e+00, 1.00000000e+00]])
'''

# Solve a system of linear equations
a = np.array([[3,1],[1,2]])
'''
array([[3, 1],
       [1, 2]])
'''
b = np.array([9,8])
# array([9, 8])
# Solve a @ x = b
x = np.linalg.solve(a,b)
# array([2., 3.])
# Verify
np.dot(a,x)
# array([9., 8.])
# Least-squares solution: returns the solution, the residuals, the rank of a, and the singular values of a
# (rcond = None selects the current default cutoff and avoids the FutureWarning raised by older NumPy releases)
np.linalg.lstsq(a,b,rcond = None)
# (array([2., 3.]), array([], dtype=float64), 2, array([3.61803399, 1.38196601]))

# Norms of vectors and matrices
x = np.matrix([[1,2],[3,-4]])
'''
matrix([[ 1, 2],
        [ 3, -4]])
'''
# Default: Frobenius norm
np.linalg.norm(x)
# 5.477225575051661
# ord = -2: smallest singular value
np.linalg.norm(x,-2)
# 1.9543950758485487
# ord = -1: smallest absolute column sum
np.linalg.norm(x,-1)
# 4.0
# ord = 1: largest absolute column sum
np.linalg.norm(x,1)
# 6.0
# For a vector, ord = 0 counts the non-zero elements
np.linalg.norm([1,2,0,3,4,0],0)
# 4.0
np.linalg.norm([1,2,0,3,4,0],2)
# 5.477225575051661

# Singular value decomposition
a = np.matrix([[1,2,3],[4,5,6],[7,8,9]])
'''
matrix([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
'''
u,s,v = np.linalg.svd(a)
u
'''
matrix([[-0.21483724, 0.88723069, 0.40824829],
        [-0.52058739, 0.24964395, -0.81649658],
        [-0.82633754, -0.38794278, 0.40824829]])
'''
s
'''
array([1.68481034e+01, 1.06836951e+00, 4.41842475e-16])
'''
v
'''
matrix([[-0.47967118, -0.57236779, -0.66506441],
        [-0.77669099, -0.07568647, 0.62531805],
        [-0.40824829, 0.81649658, -0.40824829]])
'''
# Verify: the product of the three factors reproduces a
u * np.diag(s) * v
'''
matrix([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])
'''
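The verification steps in the NumPy section above rely on np.matrix, where * means matrix multiplication. A minimal sketch of the same two checks on plain ndarrays could look like the following. The names reconstructed and dets are illustrative and do not come from the original notes. Unlike the determinant line earlier in this section, which subtracts all three eigenvalues along the diagonal at once, this sketch evaluates |A - λE| separately for each eigenvalue.

```python
import numpy as np

# SVD check on a plain ndarray: use @ for matrix multiplication
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
u, s, vt = np.linalg.svd(a)
reconstructed = u @ np.diag(s) @ vt
print(np.allclose(reconstructed, a))

# Eigenvalue check: det(A - lambda * E) should vanish for every eigenvalue
A = np.array([[1, -3, 3], [3, -5, 3], [6, -6, 4]], dtype=float)
eigenvalues, eigenvectors = np.linalg.eig(A)
dets = [np.linalg.det(A - lam * np.eye(3)) for lam in eigenvalues]
print(np.allclose(dets, 0, atol=1e-6))

# Each column of eigenvectors pairs with the eigenvalue at the same position
print(np.allclose(A @ eigenvectors, eigenvectors * eigenvalues))
```

np.allclose is used instead of == because every result carries floating-point rounding error.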

The pandas third-party library

# One-dimensional arrays (Series) and common operations
import pandas as pd

# Align columns in the printed output
pd.set_option('display.unicode.ambiguous_as_wide',True)
pd.set_option('display.unicode.east_asian_width',True)

# Create a Series with the default non-negative integer index starting at 0
s1 = pd.Series(range(1,20,5))
'''
0     1
1     6
2    11
3    16
dtype: int64
'''
# Create a Series from a dict; the dict keys become the index
# (語文 = Chinese, 數學 = math, 物理 = physics, 化學 = chemistry)
s2 = pd.Series({'語文':95,'數學':98,'Python':100,'物理':97,'化學':99})
'''
語文       95
數學       98
Python    100
物理       97
化學       99
dtype: int64
'''
# Modify values of a Series
s1[3] = -17
'''
0     1
1     6
2    11
3   -17
dtype: int64
'''
s2['語文'] = 94
'''
語文       94
數學       98
Python    100
物理       97
化學       99
dtype: int64
'''
# Absolute values of s1
abs(s1)
'''
0     1
1     6
2    11
3    17
dtype: int64
'''
# Add 5 to every value of s1
s1 + 5
'''
0     6
1    11
2    16
3   -12
dtype: int64
'''
# Prepend the given value to every index label of s1
s1.add_prefix(2)
'''
20     1
21     6
22    11
23   -17
dtype: int64
'''
# Histogram of the data in s2
s2.hist()
# Append 'hany' to every index label
s2.add_suffix('hany')
'''
語文hany       94
數學hany       98
Pythonhany    100
物理hany       97
化學hany       99
dtype: int64
'''
# Index label of the largest value in s2
# (idxmax returns the label; since pandas 1.0, argmax returns the integer position instead)
s2.idxmax()
# 'Python'
# Check whether each value of s2 lies in the given closed interval
# (pandas 1.3 and later expect inclusive = 'both' rather than True)
s2.between(90,100,inclusive = 'both')
'''
語文      True
數學      True
Python    True
物理      True
化學      True
dtype: bool
'''
# Scores in s2 above 97
s2[s2 > 97]
'''
數學       98
Python    100
化學       99
dtype: int64
'''
# Scores in s2 above the median
s2[s2 > s2.median()]
'''
Python    100
化學       99
dtype: int64
'''
# Arithmetic between s2 and numbers: square root times 10, rounded to one decimal place
round((s2**0.5)*10,1)
'''
語文       97.0
數學       99.0
Python    100.0
物理       98.5
化學       99.5
dtype: float64
'''
# Median of s2
s2.median()
# 98.0
# The two smallest values in s2
s2.nsmallest(2)
'''
語文    94
物理    97
dtype: int64
'''
# The two largest values in s2
s2.nlargest(2)
'''
Python    100
化學       99
dtype: int64
'''
# Arithmetic between Series aligns on index labels; a label present in only one operand yields NaN
pd.Series(range(5)) + pd.Series(range(5,10))
'''
0     5
1     7
2     9
3    11
4    13
dtype: int64
'''
# pipe passes the whole Series to a function; extra arguments are forwarded
pd.Series(range(5)).pipe(lambda x,y,z :(x**y)%z,2,5)
'''
0    0
1    1
2    4
3    4
4    1
dtype: int64
'''
pd.Series(range(5)).pipe(lambda x:x+3)
'''
0    3
1    4
2    5
3    6
4    7
dtype: int64
'''
pd.Series(range(5)).pipe(lambda x:x+3).pipe(lambda x:x*3)
'''
0     9
1    12
2    15
3    18
4    21
dtype: int64
'''
# apply calls the function once per element
pd.Series(range(5)).apply(lambda x:x+3)
'''
0    3
1    4
2    5
3    6
4    7
dtype: int64
'''
# Standard deviation
pd.Series(range(0,5)).std()
# 1.5811388300841898
# Unbiased variance
pd.Series(range(0,5)).var()
# 2.5
# Standard error of the mean (the sample standard deviation divided by sqrt(n)), not a standard deviation
pd.Series(range(0,5)).sem()
# 0.7071067811865476
# Check whether any value is truthy
any(pd.Series([3,0,True]))
# True
# Check whether all values are truthy
all(pd.Series([3,0,True]))
# False

# Time series and common operations
import pandas as pd
# Note: pandas 2.2 and later deprecate the aliases 'H', 'T', 'M', 'A' and 'AS' used below
# in favor of 'h', 'min', 'ME', 'YE' and 'YS'; the outputs shown come from an older release.

# Every five days: 5D
pd.date_range(start = '20200101',end = '20200131',freq = '5D')
'''
DatetimeIndex(['2020-01-01', '2020-01-06', '2020-01-11', '2020-01-16',
               '2020-01-21', '2020-01-26', '2020-01-31'],
              dtype='datetime64[ns]', freq='5D')
'''
# Every week: W (anchored on Sunday by default)
pd.date_range(start = '20200301',end = '20200331',freq = 'W')
'''
DatetimeIndex(['2020-03-01', '2020-03-08', '2020-03-15', '2020-03-22',
               '2020-03-29'],
              dtype='datetime64[ns]', freq='W-SUN')
'''
# Two-day interval, five values
pd.date_range(start = '20200301',periods = 5,freq = '2D')
'''
DatetimeIndex(['2020-03-01', '2020-03-03', '2020-03-05', '2020-03-07',
               '2020-03-09'],
              dtype='datetime64[ns]', freq='2D')
'''
# Three-hour interval, eight values
pd.date_range(start = '20200301',periods = 8,freq = '3H')
'''
DatetimeIndex(['2020-03-01 00:00:00', '2020-03-01 03:00:00',
               '2020-03-01 06:00:00', '2020-03-01 09:00:00',
               '2020-03-01 12:00:00', '2020-03-01 15:00:00',
               '2020-03-01 18:00:00', '2020-03-01 21:00:00'],
              dtype='datetime64[ns]', freq='3H')
'''
# Starting at 03:00, twelve values, one-minute interval
pd.date_range(start = '202003010300',periods = 12,freq = 'T')
'''
DatetimeIndex(['2020-03-01 03:00:00', '2020-03-01 03:01:00',
               '2020-03-01 03:02:00', '2020-03-01 03:03:00',
               '2020-03-01 03:04:00', '2020-03-01 03:05:00',
               '2020-03-01 03:06:00', '2020-03-01 03:07:00',
               '2020-03-01 03:08:00', '2020-03-01 03:09:00',
               '2020-03-01 03:10:00', '2020-03-01 03:11:00'],
              dtype='datetime64[ns]', freq='T')
'''
# Last day of every month
pd.date_range(start = '20190101',end = '20191231',freq = 'M')
'''
DatetimeIndex(['2019-01-31', '2019-02-28', '2019-03-31', '2019-04-30',
               '2019-05-31', '2019-06-30', '2019-07-31', '2019-08-31',
               '2019-09-30', '2019-10-31', '2019-11-30', '2019-12-31'],
              dtype='datetime64[ns]', freq='M')
'''
# One-year interval, six values, last day of each year
pd.date_range(start =
'20190101',periods = 6,freq = 'A') ''' DatetimeIndex(['2019-12-31', '2020-12-31', '2021-12-31', '2022-12-31', '2023-12-31', '2024-12-31'], dtype='datetime64[ns]', freq='A-DEC') ''' # 間隔一年，六個數據，年初最後一天 pd.date_range(start = '20200101',periods = 6,freq = 'AS') ''' DatetimeIndex(['2020-01-01', '2021-01-01', '2022-01-01', '2023-01-01', '2024-01-01', '2025-01-01'], dtype='datetime64[ns]', freq='AS-JAN') ''' # 使用 Series 對象包含時間序列對象,使用特定索引 data = pd.Series(index = pd.date_range(start = '20200321',periods = 24,freq = 'H'),data = range(24)) ''' 2020-03-21 00:00:00 0 2020-03-21 01:00:00 1 2020-03-21 02:00:00 2 2020-03-21 03:00:00 3 2020-03-21 04:00:00 4 2020-03-21 05:00:00 5 2020-03-21 06:00:00 6 2020-03-21 07:00:00 7 2020-03-21 08:00:00 8 2020-03-21 09:00:00 9 2020-03-21 10:00:00 10 2020-03-21 11:00:00 11 2020-03-21 12:00:00 12 2020-03-21 13:00:00 13 2020-03-21 14:00:00 14 2020-03-21 15:00:00 15 2020-03-21 16:00:00 16 2020-03-21 17:00:00 17 2020-03-21 18:00:00 18 2020-03-21 19:00:00 19 2020-03-21 20:00:00 20 2020-03-21 21:00:00 21 2020-03-21 22:00:00 22 2020-03-21 23:00:00 23 Freq: H, dtype: int64 ''' # 查看前五個數據 data[:5] ''' 2020-03-21 00:00:00 0 2020-03-21 01:00:00 1 2020-03-21 02:00:00 2 2020-03-21 03:00:00 3 2020-03-21 04:00:00 4 Freq: H, dtype: int64 ''' # 三分鐘重採樣，計算均值 data.resample('3H').mean() ''' 2020-03-21 00:00:00 1 2020-03-21 03:00:00 4 2020-03-21 06:00:00 7 2020-03-21 09:00:00 10 2020-03-21 12:00:00 13 2020-03-21 15:00:00 16 2020-03-21 18:00:00 19 2020-03-21 21:00:00 22 Freq: 3H, dtype: int64 ''' # 五分鐘重採樣，求和 data.resample('5H').sum() ''' 2020-03-21 00:00:00 10 2020-03-21 05:00:00 35 2020-03-21 10:00:00 60 2020-03-21 15:00:00 85 2020-03-21 20:00:00 86 Freq: 5H, dtype: int64 ''' # 計算OHLC open,high,low,close data.resample('5H').ohlc() ''' open high low close 2020-03-21 00:00:00 0 4 0 4 2020-03-21 05:00:00 5 9 5 9 2020-03-21 10:00:00 10 14 10 14 2020-03-21 15:00:00 15 19 15 19 2020-03-21 20:00:00 20 23 20 23 ''' # 將日期替換爲次日 data.index = data.index + pd.Timedelta('1D') # 
查看前五條數據 data[:5] ''' 2020-03-22 00:00:00 0 2020-03-22 01:00:00 1 2020-03-22 02:00:00 2 2020-03-22 03:00:00 3 2020-03-22 04:00:00 4 Freq: H, dtype: int64 ''' # 查看指定日期是星期幾 # pd.Timestamp('20200321').weekday_name # 'Saturday' # 查看指定日期的年份是不是閏年 pd.Timestamp('20200301').is_leap_year # True # 查看指定日期所在的季度和月份 day = pd.Timestamp('20200321') # Timestamp('2020-03-21 00:00:00') # 查看日期的季度 day.quarter # 1 # 查看日期所在的月份 day.month # 3 # 轉換爲 python 的日期時間對象 day.to_pydatetime() # datetime.datetime(2020, 3, 21, 0, 0) # DateFrame 的建立,包含部分:index , column , values import numpy as np import pandas as pd # 建立一個 DataFrame 對象 dataframe = pd.DataFrame(np.random.randint(1,20,(5,3)), index = range(5), columns = ['A','B','C']) ''' A B C 0 17 9 19 1 14 5 8 2 7 18 13 3 13 16 2 4 18 6 5 ''' # 索引爲時間序列 dataframe2 = pd.DataFrame(np.random.randint(5,15,(9,3)), index = pd.date_range(start = '202003211126', end = '202003212000', freq = 'H'), columns = ['Pandas','爬蟲','比賽']) ''' Pandas 爬蟲 比賽 2020-03-21 11:26:00 8 10 8 2020-03-21 12:26:00 9 14 9 2020-03-21 13:26:00 9 5 13 2020-03-21 14:26:00 9 7 7 2020-03-21 15:26:00 11 10 14 2020-03-21 16:26:00 12 7 10 2020-03-21 17:26:00 11 11 13 2020-03-21 18:26:00 8 13 8 2020-03-21 19:26:00 7 7 13 ''' # 使用字典進行建立 dataframe3 = pd.DataFrame({'語文':[87,79,67,92], '數學':[93,89,80,77], '英語':[88,95,76,77]}, index = ['張三','李四','王五','趙六']) ''' 語文 數學 英語 張三 87 93 88 李四 79 89 95 王五 67 80 76 趙六 92 77 77 ''' # 建立時自動擴充 dataframe4 = pd.DataFrame({'A':range(5,10),'B':3}) ''' A B 0 5 3 1 6 3 2 7 3 3 8 3 4 9 3 ''' # C:\Users\lenovo\Desktop\總結\Python # 讀取 Excel 文件並進行篩選 import pandas as pd # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['工號','姓名','時段','交易額']) # 打印前十行數據 dataframe[:10] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 1 1002 李四 14:00-21:00 1800 2 1003 王五 9:00-14:00 800 3 1004 趙六 14:00-21:00 1100 4 1005 周七 
9:00-14:00 600 5 1006 錢八 14:00-21:00 700 6 1006 錢八 9:00-14:00 850 7 1001 張三 14:00-21:00 600 8 1001 張三 9:00-14:00 1300 9 1002 李四 14:00-21:00 1500 ''' # 跳過 1 2 4 行，以第一列姓名爲索引 dataframe2 = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', skiprows = [1,2,4], index_col = 1) '''注：張三李四趙六的第一條數據跳過 工號 日期 時段 交易額 櫃檯 姓名 王五 1003 20190301 9:00-14:00 800 食品 周七 1005 20190301 9:00-14:00 600 日用品 錢八 1006 20190301 14:00-21:00 700 日用品 錢八 1006 20190301 9:00-14:00 850 蔬菜水果 張三 1001 20190302 14:00-21:00 600 蔬菜水果 ''' # 篩選符合特定條件的數據 # 讀取超市營業額數據 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 查看 5 到 10 的數據 dataframe[5:11] ''' 工號 姓名 日期 時段 交易額 櫃檯 5 1006 錢八 20190301 14:00-21:00 700 日用品 6 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 7 1001 張三 20190302 14:00-21:00 600 蔬菜水果 8 1001 張三 20190302 9:00-14:00 1300 化妝品 9 1002 李四 20190302 14:00-21:00 1500 化妝品 10 1003 王五 20190302 9:00-14:00 1000 食品 ''' # 查看第六行的數據,左閉右開 dataframe.iloc[5] ''' 工號 1006 姓名 錢八 時段 14:00-21:00 交易額 700 Name: 5, dtype: object ''' dataframe[:5] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 1 1002 李四 14:00-21:00 1800 2 1003 王五 9:00-14:00 800 3 1004 趙六 14:00-21:00 1100 4 1005 周七 9:00-14:00 600 ''' # 查看第 1 3 4 行的數據 dataframe.iloc[[0,2,3],:] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 2 1003 王五 9:00-14:00 800 3 1004 趙六 14:00-21:00 1100 ''' # 查看第 1 3 4 行的第 1 2 列 dataframe.iloc[[0,2,3],[0,1]] ''' 工號 姓名 0 1001 張三 2 1003 王五 3 1004 趙六 ''' # 查看前五行指定，姓名、時段和交易額的數據 dataframe[['姓名','時段','交易額']][:5] ''' 姓名 時段 交易額 0 張三 9:00-14:00 2000 1 李四 14:00-21:00 1800 2 王五 9:00-14:00 800 3 趙六 14:00-21:00 1100 4 周七 9:00-14:00 600 ''' dataframe[:5][['姓名','時段','交易額']] ''' 姓名 時段 交易額 0 張三 9:00-14:00 2000 1 李四 14:00-21:00 1800 2 王五 9:00-14:00 800 3 趙六 14:00-21:00 1100 4 周七 9:00-14:00 600 ''' # 查看第 2 4 5 行 姓名，交易額 數據 loc 函數，包含結尾 dataframe.loc[[1,3,4],['姓名','交易額']] ''' 姓名 交易額 1 李四 1800 3 趙六 1100 4 周七 600 ''' # 查看第四行的姓名數據 dataframe.at[3,'姓名'] # '趙六' # 查看交易額大於 1700 的數據 dataframe[dataframe['交易額'] > 1700] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 
1 1002 李四 14:00-21:00 1800 ''' # 查看交易額總和 dataframe.sum() ''' 工號 17055 姓名 張三李四王五趙六週七錢八錢八張三張三李四王五趙六週七錢八李四王五張三... 時段 9:00-14:0014:00-21:009:00-14:0014:00-21:009:00... 交易額 17410 dtype: object ''' # 某一時段的交易總和 dataframe[dataframe['時段'] == '14:00-21:00']['交易額'].sum() # 8300 # 查看張三在下午14:00以後的交易狀況 dataframe[(dataframe.姓名 == '張三') & (dataframe.時段 == '14:00-21:00')][:10] ''' 工號 姓名 時段 交易額 7 1001 張三 14:00-21:00 600 ''' # 查看日用品的銷售總額 # dataframe[dataframe['櫃檯'] == '日用品']['交易額'].sum() # 查看張三總共的交易額 dataframe[dataframe['姓名'].isin(['張三'])]['交易額'].sum() # 5200 # 查看交易額在 1500~3000 之間的記錄 dataframe[dataframe['交易額'].between(1500,3000)] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 1 1002 李四 14:00-21:00 1800 9 1002 李四 14:00-21:00 1500 ''' # 查看數據特徵和統計信息 import pandas as pd # 讀取文件 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 查看全部的交易額信息 dataframe['交易額'].describe() ''' count 17.000000 mean 1024.117647 std 428.019550 min 580.000000 25% 700.000000 50% 850.000000 75% 1300.000000 max 2000.000000 Name: 交易額, dtype: float64 ''' # 查看四分位數 dataframe['交易額'].quantile([0,0.25,0.5,0.75,1.0]) ''' 0.00 580.0 0.25 700.0 0.50 850.0 0.75 1300.0 1.00 2000.0 Name: 交易額, dtype: float64 ''' # 交易額中值 dataframe['交易額'].median() # 850.0 # 交易額最小的三個數據 dataframe['交易額'].nsmallest(3) ''' 12 580 4 600 7 600 Name: 交易額, dtype: int64 ''' dataframe.nsmallest(3,'交易額') ''' 工號 姓名 日期 時段 交易額 櫃檯 12 1005 周七 20190302 9:00-14:00 580 日用品 4 1005 周七 20190301 9:00-14:00 600 日用品 7 1001 張三 20190302 14:00-21:00 600 蔬菜水果 ''' # 交易額最大的兩個數據 dataframe['交易額'].nlargest(2) ''' 0 2000 1 1800 Name: 交易額, dtype: int64 ''' dataframe.nlargest(2,'交易額') ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000 化妝品 1 1002 李四 20190301 14:00-21:00 1800 化妝品 ''' # 查看最後一個日期 dataframe['日期'].max() # 20190303 # 查看最小的工號 dataframe['工號'].min() # 1001 # 第一個最小交易額的行下標 index = dataframe['交易額'].idxmin() # 0 # 第一個最小交易額 dataframe.loc[index,'交易額'] # 580 # 最大交易額的行下標 index = dataframe['交易額'].idxmax() dataframe.loc[index,'交易額'] # 2000 import pandas as pd # 
設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['工號','姓名','時段','交易額','櫃檯']) dataframe[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 2 1003 王五 9:00-14:00 800 食品 3 1004 趙六 14:00-21:00 1100 食品 4 1005 周七 9:00-14:00 600 日用品 ''' # 按照交易額和工號降序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'],ascending = False)[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 9 1002 李四 14:00-21:00 1500 化妝品 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 ''' # 按照交易額和工號升序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'])[:5] ''' 工號 姓名 時段 交易額 櫃檯 12 1005 周七 9:00-14:00 580 日用品 7 1001 張三 14:00-21:00 600 蔬菜水果 4 1005 周七 9:00-14:00 600 日用品 14 1002 李四 9:00-14:00 680 蔬菜水果 5 1006 錢八 14:00-21:00 700 日用品 ''' # 按照交易額降序和工號升序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'],ascending = [False,True])[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 9 1002 李四 14:00-21:00 1500 化妝品 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 ''' # 按工號升序排序 dataframe.sort_values(by = ['工號'])[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 7 1001 張三 14:00-21:00 600 蔬菜水果 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 ''' dataframe.sort_values(by = ['工號'],na_position = 'last')[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 7 1001 張三 14:00-21:00 600 蔬菜水果 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 ''' # 按列名升序排序 dataframe.sort_index(axis = 1)[:5] ''' 交易額 姓名 工號 時段 櫃檯 0 2000 張三 1001 9:00-14:00 化妝品 1 1800 李四 1002 14:00-21:00 化妝品 2 800 王五 1003 9:00-14:00 食品 3 1100 趙六 1004 14:00-21:00 食品 4 600 周七 1005 9:00-14:00 日用品 ''' dataframe.sort_index(axis = 1,ascending = True)[:5] ''' 交易額 姓名 工號 時段 櫃檯 0 2000 張三 1001 9:00-14:00 化妝品 1 
1800 李四 1002 14:00-21:00 化妝品 2 800 王五 1003 9:00-14:00 食品 3 1100 趙六 1004 14:00-21:00 食品 4 600 周七 1005 9:00-14:00 日用品 ''' # 分組與聚合 import pandas as pd import numpy as np # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['工號','姓名','時段','交易額','櫃檯']) # 對 5 的餘數進行分組 dataframe.groupby(by = lambda num:num % 5)['交易額'].sum() ''' 0 4530 1 5000 2 1980 3 3120 4 2780 Name: 交易額, dtype: int64 ''' # 查看索引爲 7 15 的交易額 dataframe.groupby(by = {7:'索引爲7的行',15:'索引爲15的行'})['交易額'].sum() ''' 索引爲15的行 830 索引爲7的行 600 Name: 交易額, dtype: int64 ''' # 查看不一樣時段的交易總額 dataframe.groupby(by = '時段')['交易額'].sum() ''' 時段 14:00-21:00 8300 9:00-14:00 9110 Name: 交易額, dtype: int64 ''' # 各櫃檯的銷售總額 dataframe.groupby(by = '櫃檯')['交易額'].sum() ''' 櫃檯 化妝品 7900 日用品 2600 蔬菜水果 2960 食品 3950 Name: 交易額, dtype: int64 ''' # 查看每一個人在每一個時段購買的次數 count = dataframe.groupby(by = '姓名')['時段'].count() ''' 姓名 周七 2 張三 4 李四 3 王五 3 趙六 2 錢八 3 Name: 時段, dtype: int64 ''' # count.name = '交易人和次數' ''' ''' # 每一個人的交易額平均值並排序 dataframe.groupby(by = '姓名')['交易額'].mean().round(2).sort_values() ''' 姓名 周七 590.00 錢八 756.67 王五 876.67 趙六 1075.00 張三 1300.00 李四 1326.67 Name: 交易額, dtype: float64 ''' # 每一個人的交易額 dataframe.groupby(by = '姓名').sum()['交易額'].apply(int) ''' 姓名 周七 1180 張三 5200 李四 3980 王五 2630 趙六 2150 錢八 2270 Name: 交易額, dtype: int64 ''' # 每個員工交易額的中值 data = dataframe.groupby(by = '姓名').median() ''' 工號 交易額 姓名 周七 1005 590 張三 1001 1300 李四 1002 1500 王五 1003 830 趙六 1004 1075 錢八 1006 720 ''' data['交易額'] ''' 姓名 周七 590 張三 1300 李四 1500 王五 830 趙六 1075 錢八 720 Name: 交易額, dtype: int64 ''' # 查看交易額對應的排名 data['排名'] = data['交易額'].rank(ascending = False) data[['交易額','排名']] ''' 交易額 排名 姓名 周七 590 6.0 張三 1300 2.0 李四 1500 1.0 王五 830 4.0 趙六 1075 3.0 錢八 720 5.0 ''' # 每一個人不一樣時段的交易額 dataframe.groupby(by = ['姓名','時段'])['交易額'].sum() ''' 姓名 時段 周七 9:00-14:00 1180 張三 14:00-21:00 600 9:00-14:00 4600 李四 14:00-21:00 3300 
9:00-14:00 680 王五 14:00-21:00 830 9:00-14:00 1800 趙六 14:00-21:00 2150 錢八 14:00-21:00 1420 9:00-14:00 850 Name: 交易額, dtype: int64 ''' # 設置各時段累計 dataframe.groupby(by = ['姓名'])['時段','交易額'].aggregate({'交易額':np.sum,'時段':lambda x:'各時段累計'}) ''' 交易額 時段 姓名 周七 1180 各時段累計 張三 5200 各時段累計 李四 3980 各時段累計 王五 2630 各時段累計 趙六 2150 各時段累計 錢八 2270 各時段累計 ''' # 對指定列進行聚合,查看最大,最小,和,平均值,中值 dataframe.groupby(by = '姓名').agg(['max','min','sum','mean','median']) ''' 工號 交易額 max min sum mean median max min sum mean median 姓名 周七 1005 1005 2010 1005 1005 600 580 1180 590.000000 590 張三 1001 1001 4004 1001 1001 2000 600 5200 1300.000000 1300 李四 1002 1002 3006 1002 1002 1800 680 3980 1326.666667 1500 王五 1003 1003 3009 1003 1003 1000 800 2630 876.666667 830 趙六 1004 1004 2008 1004 1004 1100 1050 2150 1075.000000 1075 錢八 1006 1006 3018 1006 1006 850 700 2270 756.666667 720 ''' # 查看部分聚合後的結果 dataframe.groupby(by = '姓名').agg(['max','min','sum','mean','median'])['交易額'] ''' max min sum mean median 姓名 周七 600 580 1180 590.000000 590 張三 2000 600 5200 1300.000000 1300 李四 1800 680 3980 1326.666667 1500 王五 1000 800 2630 876.666667 830 趙六 1100 1050 2150 1075.000000 1075 錢八 850 700 2270 756.666667 720 ''' # 處理異常值缺失值重複值數據差分 import pandas as pd import numpy as np import copy # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 異常值 # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 查看交易額低於 2000 的三條數據 # dataframe[dataframe.交易額 < 2000] dataframe[dataframe.交易額 < 2000][:3] ''' 工號 姓名 日期 時段 交易額 櫃檯 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 3 1004 趙六 20190301 14:00-21:00 1100 食品 ''' # 查看上浮了 50% 以後依舊低於 1500 的交易額,查看 4 條數據 dataframe.loc[dataframe.交易額 < 1500,'交易額'] = dataframe[dataframe.交易額 < 1500]['交易額'].map(lambda num:num*1.5) dataframe[dataframe.交易額 < 1500][:4] ''' 工號 姓名 日期 時段 交易額 櫃檯 2 1003 王五 20190301 9:00-14:00 1200.0 食品 4 1005 周七 20190301 9:00-14:00 900.0 日用品 5 1006 錢八 20190301 
14:00-21:00 1050.0 日用品 6 1006 錢八 20190301 9:00-14:00 1275.0 蔬菜水果 ''' # 查看交易額大於 2500 的數據 dataframe[dataframe.交易額 > 2500] ''' Empty DataFrame Columns: [工號, 姓名, 日期, 時段, 交易額, 櫃檯] Index: [] ''' # 查看交易額低於 900 或 高於 1800 的數據 dataframe[(dataframe.交易額 < 900)|(dataframe.交易額 > 1800)] ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000.0 化妝品 8 1001 張三 20190302 9:00-14:00 1950.0 化妝品 12 1005 周七 20190302 9:00-14:00 870.0 日用品 16 1001 張三 20190303 9:00-14:00 1950.0 化妝品 ''' # 將全部低於 200 的交易額都替換成 200 處理異常值 dataframe.loc[dataframe.交易額 < 200,'交易額'] = 200 # 查看低於 1500 的交易額個數 dataframe.loc[dataframe.交易額 < 1500,'交易額'].count() # 9 # 將大於 3000 元的都替換爲 3000 元 dataframe.loc[dataframe.交易額 > 3000,'交易額'] = 3000 # 缺失值 # 查看有多少行數據 len(dataframe) # 17 # 丟棄缺失值以後的行數 len(dataframe.dropna()) # 17 # 包含缺失值的行 dataframe[dataframe['交易額'].isnull()] ''' Empty DataFrame Columns: [工號, 姓名, 日期, 時段, 交易額, 櫃檯] Index: [] ''' # 使用固定值替換缺失值 # dff = copy.deepcopy(dataframe) # dff.loc[dff.交易額.isnull(),'交易額'] = 999 # 將缺失值設定爲 999，包含結尾 # dff.iloc[[1,4,17],:] # 使用交易額的均值替換缺失值 # dff = copy.deepcopy(dataframe) # for i in dff[dff.交易額.isnull()].index: # dff.loc[i,'交易額'] = round(dff.loc[dff.姓名 == dff.loc[i,'姓名'],'交易額'].mean()) # dff.iloc[[1,4,17],:] # 使用總體均值的 80% 填充缺失值 # dataframe.fillna({'交易額':round(dataframe['交易額'].mean() * 0.8)},inplace = True) # dataframe.iloc[[1,4,16],:] # 重複值 dataframe[dataframe.duplicated()] ''' Empty DataFrame Columns: [工號, 姓名, 日期, 時段, 交易額, 櫃檯] Index: [] ''' # dff = dataframe[['工號','姓名','日期','交易額']] # dff = dff[dff.duplicated()] # for row in dff.values: # df[(df.工號 == row[0]) & (df.日期 == row[2]) &(df.交易額 == row[3])] # 丟棄重複行 dataframe = dataframe.drop_duplicates() # 查看是否有錄入錯誤的工號和姓名 dff = dataframe[['工號','姓名']] dff.drop_duplicates() ''' 工號 姓名 0 1001 張三 1 1002 李四 2 1003 王五 3 1004 趙六 4 1005 周七 5 1006 錢八 ''' # 數據差分 # 查看員工業績波動狀況(每一天和昨天的數據做比較) dff = dataframe.groupby(by = '日期').sum()['交易額'].diff() ''' 日期 20190301 NaN 20190302 1765.0 20190303 -9690.0 Name: 交易額, dtype: float64 ''' # [:5] dataframe.head() 
dff.map(lambda num:'%.2f'%(num))[:5] ''' 日期 20190301 nan 20190302 1765.00 20190303 -9690.00 Name: 交易額, dtype: object ''' # 查看張三的波動狀況 dataframe[dataframe.姓名 == '張三'].groupby(by = '日期').sum()['交易額'].diff()[:5] ''' 日期 20190301 NaN 20190302 850.0 20190303 -900.0 Name: 交易額, dtype: float64 ''' # 使用透視表與交叉表查看業績彙總數據 import pandas as pd import numpy as np import copy # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 對姓名和日期進行分組,並進行求和 dff = dataframe.groupby(by = ['姓名','日期'],as_index = False).sum() ''' 姓名 日期 工號 交易額 0 周七 20190301 1005 600 1 周七 20190302 1005 580 2 張三 20190301 1001 2000 3 張三 20190302 2002 1900 4 張三 20190303 1001 1300 5 李四 20190301 1002 1800 6 李四 20190302 2004 2180 7 王五 20190301 1003 800 8 王五 20190302 2006 1830 9 趙六 20190301 1004 1100 10 趙六 20190302 1004 1050 11 錢八 20190301 2012 1550 12 錢八 20190302 1006 720 ''' # 將 dff 的索引，列 設置成透視表形式 dff = dff.pivot(index = '姓名',columns = '日期',values = '交易額') ''' 日期 20190301 20190302 20190303 姓名 周七 600.0 580.0 NaN 張三 2000.0 1900.0 1300.0 李四 1800.0 2180.0 NaN 王五 800.0 1830.0 NaN 趙六 1100.0 1050.0 NaN 錢八 1550.0 720.0 NaN ''' # 查看前一天的數據 dff.iloc[:,:1] ''' 日期 20190301 姓名 周七 600.0 張三 2000.0 李四 1800.0 王五 800.0 趙六 1100.0 錢八 1550.0 ''' # 交易總額小於 4000 的人的前三天業績 dff[dff.sum(axis = 1) < 4000].iloc[:,:3] ''' 日期 20190301 20190302 20190303 姓名 周七 600.0 580.0 NaN 李四 1800.0 2180.0 NaN 王五 800.0 1830.0 NaN 趙六 1100.0 1050.0 NaN 錢八 1550.0 720.0 NaN ''' # 工資總額大於 2900 元的員工的姓名 dff[dff.sum(axis = 1) > 2900].index.values # array(['張三', '李四'], dtype=object) # 顯示前兩天每一天的交易總額以及每一個人的交易金額 dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'sum',margins = True).iloc[:,:2] ''' 日期 20190301 20190302 姓名 周七 600.0 580.0 張三 2000.0 1900.0 李四 1800.0 2180.0 王五 800.0 1830.0 趙六 1100.0 1050.0 錢八 1550.0 720.0 All 7850.0 8260.0 ''' # 顯示每一個人在每一個櫃檯的交易總額 dff = dataframe.groupby(by = ['姓名','櫃檯'],as_index = 
False).sum() dff.pivot(index = '姓名',columns = '櫃檯',values = '交易額') ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 1180.0 NaN NaN 張三 4600.0 NaN 600.0 NaN 李四 3300.0 NaN 680.0 NaN 王五 NaN NaN 830.0 1800.0 趙六 NaN NaN NaN 2150.0 錢八 NaN 1420.0 850.0 NaN ''' # 查看每人天天的上班次數 dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'count',margins = True).iloc[:,:1] ''' 日期 20190301 姓名 周七 1.0 張三 1.0 李四 1.0 王五 1.0 趙六 1.0 錢八 2.0 All 7.0 ''' # 查看每一個人天天購買的次數 dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'count',margins = True) ''' 日期 20190301 20190302 20190303 All 姓名 周七 1.0 1.0 NaN 2 張三 1.0 2.0 1.0 4 李四 1.0 2.0 NaN 3 王五 1.0 2.0 NaN 3 趙六 1.0 1.0 NaN 2 錢八 2.0 1.0 NaN 3 All 7.0 9.0 1.0 17 ''' # 交叉表 # 每一個人天天上過幾回班 pd.crosstab(dataframe.姓名,dataframe.日期,margins = True).iloc[:,:2] ''' 日期 20190301 20190302 姓名 周七 1 1 張三 1 2 李四 1 2 王五 1 2 趙六 1 1 錢八 2 1 All 7 9 ''' # 每一個人天天去過幾回櫃檯 pd.crosstab(dataframe.姓名,dataframe.櫃檯) ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 0 2 0 0 張三 3 0 1 0 李四 2 0 1 0 王五 0 0 1 2 趙六 0 0 0 2 錢八 0 2 1 0 ''' # 將每個人在每個櫃檯的交易總額顯示出來 pd.crosstab(dataframe.姓名,dataframe.櫃檯,dataframe.交易額,aggfunc='sum') ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 1180.0 NaN NaN 張三 4600.0 NaN 600.0 NaN 李四 3300.0 NaN 680.0 NaN 王五 NaN NaN 830.0 1800.0 趙六 NaN NaN NaN 2150.0 錢八 NaN 1420.0 850.0 NaN ''' # 每一個人在每一個櫃檯交易額的平均值,金額/天數 pd.crosstab(dataframe.姓名,dataframe.櫃檯,dataframe.交易額,aggfunc = 'mean').apply(lambda num:round(num,2) ) ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 590.0 NaN NaN 張三 1533.33 NaN 600.0 NaN 李四 1650.00 NaN 680.0 NaN 王五 NaN NaN 830.0 900.0 趙六 NaN NaN NaN 1075.0 錢八 NaN 710.0 850.0 NaN ''' # 重採樣 多索引 標準差 協方差 import pandas as pd import numpy as np import copy # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 將日期設置爲 python 中的日期類型 data.日期 = pd.to_datetime(data.日期) ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 1970-01-01 00:00:00.020190301 9:00-14:00 2000 化妝品 1 1002 李四 
1970-01-01 00:00:00.020190301 14:00-21:00 1800 化妝品 2 1003 王五 1970-01-01 00:00:00.020190301 9:00-14:00 800 食品 ''' # 每七天營業的總額 data.resample('7D',on = '日期').sum()['交易額'] ''' 日期 1970-01-01 17410 Freq: 7D, Name: 交易額, dtype: int64 ''' # 每七天營業總額 data.resample('7D',on = '日期',label = 'right').sum()['交易額'] ''' 日期 1970-01-08 17410 Freq: 7D, Name: 交易額, dtype: int64 ''' # 每七天營業額的平均值 func = lambda item:round(np.sum(item)/len(item),2) data.resample('7D',on = '日期',label = 'right').apply(func)['交易額'] ''' 日期 1970-01-08 1024.12 Freq: 7D, Name: 交易額, dtype: float64 ''' # 每七天營業額的平均值 func = lambda num:round(num,2) data.resample('7D',on = '日期',label = 'right').mean().apply(func)['交易額'] # 1024.12 # 刪除工號這一列 data.drop('工號',axis = 1,inplace = True) data[:2] ''' 姓名 日期 時段 交易額 櫃檯 0 張三 1970-01-01 00:00:00.020190301 9:00-14:00 2000 化妝品 1 李四 1970-01-01 00:00:00.020190301 14:00-21:00 1800 化妝品 ''' # 按照姓名和櫃檯進行分組彙總 data = data.groupby(by = ['姓名','櫃檯']).sum()[:3] ''' 交易額 姓名 櫃檯 周七 日用品 1180 張三 化妝品 4600 蔬菜水果 600 ''' # 查看張三的彙總數據 data.loc['張三',:] ''' 交易額 櫃檯 化妝品 4600 蔬菜水果 600 ''' # 查看張三在蔬菜水果的交易數據 data.loc['張三','蔬菜水果'] ''' 交易額 600 Name: (張三, 蔬菜水果), dtype: int64 ''' # 多索引 # 從新讀取，使用第二列和第六列做爲索引，排在前面 data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx',index_col = [1,5]) data[:5] ''' 工號 日期 時段 交易額 姓名 櫃檯 張三 化妝品 1001 20190301 9:00-14:00 2000 李四 化妝品 1002 20190301 14:00-21:00 1800 王五 食品 1003 20190301 9:00-14:00 800 趙六 食品 1004 20190301 14:00-21:00 1100 周七 日用品 1005 20190301 9:00-14:00 600 ''' # 丟棄工號列 data.drop('工號',axis = 1,inplace = True) data[:5] ''' 日期 時段 交易額 姓名 櫃檯 張三 化妝品 20190301 9:00-14:00 2000 李四 化妝品 20190301 14:00-21:00 1800 王五 食品 20190301 9:00-14:00 800 趙六 食品 20190301 14:00-21:00 1100 周七 日用品 20190301 9:00-14:00 600 ''' # 按照櫃檯進行排序 dff = data.sort_index(level = '櫃檯',axis = 0) dff[:5] ''' 工號 日期 時段 交易額 姓名 櫃檯 張三 化妝品 1001 20190301 9:00-14:00 2000 化妝品 1001 20190302 9:00-14:00 1300 化妝品 1001 20190303 9:00-14:00 1300 李四 化妝品 1002 20190301 14:00-21:00 1800 化妝品 1002 20190302 14:00-21:00 1500 ''' # 按照姓名進行排序 dff 
= data.sort_index(level = '姓名',axis = 0) dff[:5] ''' 工號 日期 時段 交易額 姓名 櫃檯 周七 日用品 1005 20190301 9:00-14:00 600 日用品 1005 20190302 9:00-14:00 580 張三 化妝品 1001 20190301 9:00-14:00 2000 化妝品 1001 20190302 9:00-14:00 1300 化妝品 1001 20190303 9:00-14:00 1300 ''' # 按照櫃檯進行分組求和 dff = data.groupby(level = '櫃檯').sum()['交易額'] ''' 櫃檯 化妝品 7900 日用品 2600 蔬菜水果 2960 食品 3950 Name: 交易額, dtype: int64 ''' #標準差 data = pd.DataFrame({'A':[3,3,3,3,3],'B':[1,2,3,4,5], 'C':[-5,-4,1,4,5],'D':[-45,15,63,40,50] }) ''' A B C D 0 3 1 -5 -45 1 3 2 -4 15 2 3 3 1 63 3 3 4 4 40 4 3 5 5 50 ''' # 平均值 data.mean() ''' A 3.0 B 3.0 C 0.2 D 24.6 dtype: float64 ''' # 標準差 data.std() ''' A 0.000000 B 1.581139 C 4.549725 D 42.700117 dtype: float64 ''' # 標準差的平方 data.std()**2 ''' A 0.0 B 2.5 C 20.7 D 1823.3 dtype: float64 ''' # 協方差 data.cov() ''' A B C D A 0.0 0.00 0.00 0.00 B 0.0 2.50 7.00 53.75 C 0.0 7.00 20.70 153.35 D 0.0 53.75 153.35 1823.30 ''' # 指定索引爲 姓名，日期，時段，櫃檯，交易額 data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['姓名','日期','時段','櫃檯','交易額']) # 刪除缺失值和重複值,inplace = True 直接丟棄 data.dropna(inplace = True) data.drop_duplicates(inplace = True) # 處理異常值 data.loc[data.交易額 < 200,'交易額'] = 200 data.loc[data.交易額 > 3000,'交易額'] = 3000 # 使用交叉表獲得不一樣員工在不一樣櫃檯的交易額平均值 dff = pd.crosstab(data.姓名,data.櫃檯,data.交易額,aggfunc = 'mean') dff[:5] ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 590.0 NaN NaN 張三 1533.333333 NaN 600.0 NaN 李四 1650.000000 NaN 680.0 NaN 王五 NaN NaN 830.0 900.0 趙六 NaN NaN NaN 1075.0 ''' # 查看數據的標準差 dff.std() ''' 櫃檯 化妝品 82.495791 日用品 84.852814 蔬菜水果 120.277457 食品 123.743687 dtype: float64 ''' dff.cov() ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 櫃檯 化妝品 6805.555556 NaN 4666.666667 NaN 日用品 NaN 7200.0 NaN NaN 蔬菜水果 4666.666667 NaN 14466.666667 NaN 食品 NaN NaN NaN 15312.5 ''' import pandas as pd import copy # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx',usecols = ['日期','交易額']) dff = 
copy.deepcopy(data) # 查看周幾 dff['日期'] = pd.to_datetime(data['日期']).dt.weekday_name ''' 日期 交易額 0 Thursday 2000 1 Thursday 1800 2 Thursday 800 ''' # 按照周幾進行分組，查看交易的平均值 dff = dff.groupby('日期').mean().apply(round) dff.index.name = '周幾' dff[:3] ''' 交易額 周幾 Thursday 1024.0 ''' # dff = copy.deepcopy(data) # 使用正則規則查看月份日期 # dff['日期'] = dff.日期.str.extract(r'(\d{4}-\d{2})') # dff[:5] # 按照日 進行分組查看交易的平均值 -1 表示倒數第一個 # data.groupby(data.日期.str.__getitem__(-1)).mean().apply(round) # 查看日期尾數爲 1 的數據 # data[data.日期.str.endswith('1')][:12] # 查看日期尾數爲 12 的交易數據,slice 爲切片 (-2) 表示倒數兩個 # data[data.日期.str.slice(-2) == '12'] # 查看日期中月份或天數包含 2 的交易數據 # data[data.日期.str.slice(-5).str.contains('2')][1:9] import pandas as pd import numpy as np # 讀取所有數據，使用默認索引 data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 修改異常值 data.loc[data.交易額 > 3000,'交易額'] = 3000 data.loc[data.交易額 < 200,'交易額'] = 200 # 刪除重複值 data.drop_duplicates(inplace = True) # 填充缺失值 data['交易額'].fillna(data['交易額'].mean(),inplace = True) # 使用交叉表獲得每人在各櫃檯交易額的平均值 data_group = pd.crosstab(data.姓名,data.櫃檯,data.交易額,aggfunc = 'mean').apply(round) # 繪製柱狀圖 data_group.plot(kind = 'bar') # <matplotlib.axes._subplots.AxesSubplot object at 0x000001D681607888> # 數據的合併 data1 = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') data2 = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx',sheet_name = 'Sheet2') df1 = data1[:3] ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000 化妝品 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 ''' df2 = data2[:4] ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 1 1001 張三 20190302 14:00-21:00 600 蔬菜水果 2 1001 張三 20190302 9:00-14:00 1300 化妝品 3 1002 李四 20190302 14:00-21:00 1500 化妝品 ''' # 使用 concat 鏈接兩個相同結構的 DataFrame 對象 df3 = pd.concat([df1,df2]) ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000 化妝品 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 0 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 1 1001 張三 20190302 
14:00-21:00 600 蔬菜水果 2 1001 張三 20190302 9:00-14:00 1300 化妝品 3 1002 李四 20190302 14:00-21:00 1500 化妝品 ''' # 合併，忽略原來的索引 ignore_index df4 = df3.append([df1,df2],ignore_index = True) ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000 化妝品 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 3 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 4 1001 張三 20190302 14:00-21:00 600 蔬菜水果 5 1001 張三 20190302 9:00-14:00 1300 化妝品 6 1002 李四 20190302 14:00-21:00 1500 化妝品 7 1001 張三 20190301 9:00-14:00 2000 化妝品 8 1002 李四 20190301 14:00-21:00 1800 化妝品 9 1003 王五 20190301 9:00-14:00 800 食品 10 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 11 1001 張三 20190302 14:00-21:00 600 蔬菜水果 12 1001 張三 20190302 9:00-14:00 1300 化妝品 13 1002 李四 20190302 14:00-21:00 1500 化妝品 ''' # 按照列進行拆分 df5 = df4.loc[:,['姓名','櫃檯','交易額']] # 查看前五條數據 df5[:5] ''' 姓名 櫃檯 交易額 0 張三 化妝品 2000 1 李四 化妝品 1800 2 王五 食品 800 3 錢八 蔬菜水果 850 4 張三 蔬菜水果 600 ''' # 合併 merge 、 join # 按照工號進行合併，隨機查看 3 條數據 rows = np.random.randint(0,len(df5),3) pd.merge(df4,df5).iloc[rows,:] ''' 工號 姓名 日期 時段 交易額 櫃檯 7 1002 李四 20190301 14:00-21:00 1800 化妝品 4 1002 李四 20190301 14:00-21:00 1800 化妝品 10 1003 王五 20190301 9:00-14:00 800 食品 ''' # 按照工號進行合併，指定其餘同名列的後綴 pd.merge(df1,df2,on = '工號',suffixes = ['_x','_y']).iloc[:,:] ''' 工號 姓名_x 日期_x 時段_x ... 日期_y 時段_y 交易額_y 櫃檯_y 0 1001 張三 20190301 9:00-14:00 ... 20190302 14:00-21:00 600 蔬菜水果 1 1001 張三 20190301 9:00-14:00 ... 20190302 9:00-14:00 1300 化妝品 2 1002 李四 20190301 14:00-21:00 ... 20190302 14:00-21:00 1500 化妝品 ''' # 兩個表都設置工號爲索引 set_index df2.set_index('工號').join(df3.set_index('工號'),lsuffix = '_x',rsuffix = '_y').iloc[:] ''' 姓名_x 日期_x 時段_x 交易額_x ... 日期_y 時段_y 交易額_y 櫃檯_y 工號 ... 1001 張三 20190302 14:00-21:00 600 ... 20190301 9:00-14:00 2000 化妝品 1001 張三 20190302 14:00-21:00 600 ... 20190302 14:00-21:00 600 蔬菜水果 1001 張三 20190302 14:00-21:00 600 ... 20190302 9:00-14:00 1300 化妝品 1001 張三 20190302 9:00-14:00 1300 ... 20190301 9:00-14:00 2000 化妝品 1001 張三 20190302 9:00-14:00 1300 ... 
20190302 14:00-21:00 600 蔬菜水果 1001 張三 20190302 9:00-14:00 1300 ... 20190302 9:00-14:00 1300 化妝品 1002 李四 20190302 14:00-21:00 1500 ... 20190301 14:00-21:00 1800 化妝品 1002 李四 20190302 14:00-21:00 1500 ... 20190302 14:00-21:00 1500 化妝品 1006 錢八 20190301 9:00-14:00 850 ... 20190301 9:00-14:00 850 蔬菜水果 '''

Function: computing the average of several values

def average(*args):
    print(args)                        # (1, 2, 3) for both calls below
    print(len(args))                   # 3 for both calls below
    print(sum(args, 0.0) / len(args))

average(*[1, 2, 3])  # 2.0
average(1, 2, 3)     # 2.0

Use * to unpack the list that is passed in.

Splitting the data into two groups

def bifurcate(lst, filter):
    print(lst)     # ['beep', 'boop', 'foo', 'bar']
    print(filter)  # [True, True, False, True]
    # filter is the name of a list parameter here, not the built-in filter function
    print(enumerate(lst))        # <enumerate object at 0x0000017EB10B9D00>
    print(list(enumerate(lst)))  # [(0, 'beep'), (1, 'boop'), (2, 'foo'), (3, 'bar')]
    print([
        [x for i, x in enumerate(lst) if filter[i]],
        [x for i, x in enumerate(lst) if not filter[i]]
    ])
    '''
    filter[i] uses the index produced by enumerate to look up the flag
    at the same position in the filter list that was passed in
    '''

bifurcate(['beep', 'boop', 'foo', 'bar'], [True, True, False, True])

Advanced: splitting the data into two groups with a function

def bifurcate_by(lst, fn):
    print(lst)         # ['beep', 'boop', 'foo', 'bar']
    print(fn('baby'))  # True
    print(fn('abc'))   # False
    print([
        [x for x in lst if fn(x)],
        [x for x in lst if not fn(x)]
    ])

bifurcate_by(
    ['beep', 'boop', 'foo', 'bar'],
    lambda x: x[0] == 'b'
)
# [['beep', 'boop', 'bar'], ['foo']]

Length of a string in bytes

def byte_size(s):
    print(s)                       # 😀            / Hello World
    print(s.encode('utf-8'))       # b'\xf0\x9f\x98\x80' / b'Hello World'
    print(len(s.encode('utf-8')))  # 4             / 11

byte_size('😀')           # 4
byte_size('Hello World')  # 11

Converting a string that contains _ or - to camelCase: the very first letter becomes lowercase and the first letter of every other word becomes uppercase

from re import sub

def camel(s):
    print(s)
    # some_database_field_name
    # Some label that needs to be camelized
    # some-javascript-property
    # some-mixed_string with spaces_underscores-and-hyphens
    print(sub(r"(_|-)+", " ", s))
    # some database field name
    # Some label that needs to be camelized
    # some javascript property
    # some mixed string with spaces underscores and hyphens
    print((sub(r"(_|-)+", " ", s)).title())
    # Some Database Field Name
    # Some Label That Needs To Be Camelized
    # Some Javascript Property
    # Some Mixed String With Spaces Underscores And Hyphens
    print((sub(r"(_|-)+", " ", s)).title().replace(" ", ""))
    # SomeDatabaseFieldName
    # SomeLabelThatNeedsToBeCamelized
    # SomeJavascriptProperty
    # SomeMixedStringWithSpacesUnderscoresAndHyphens
    s = sub(r"(_|-)+", " ", s).title().replace(" ", "")
    print(s)
    # same four lines as the previous print
    print(s[0].lower())
    # s (for all four inputs)
    print(s[0].lower() + s[1:])
    # someDatabaseFieldName
    # someLabelThatNeedsToBeCamelized
    # someJavascriptProperty
    # someMixedStringWithSpacesUnderscoresAndHyphens

    # Compact version of the whole function body:
    # s = sub(r"(_|-)+", " ", s).title().replace(" ", "")
    # print(s[0].lower() + s[1:])

camel('some_database_field_name')
# someDatabaseFieldName
camel('Some label that needs to be camelized')
# someLabelThatNeedsToBeCamelized
camel('some-javascript-property')
# someJavascriptProperty
camel('some-mixed_string with spaces_underscores-and-hyphens')
# someMixedStringWithSpacesUnderscoresAndHyphens

Converting whatever is passed in to a list

def cast_list(val):
    print(val)        # foo / [1] / ('foo', 'bar')
    print(type(val))  # <class 'str'> / <class 'list'> / <class 'tuple'>
    print(isinstance(val, (tuple, list, set, dict)))  # False / True / True
    print(list(val) if isinstance(val, (tuple, list, set, dict)) else [val])
    '''
    If val is a tuple, list, set, or dict, list(val) converts it to a list.
    Otherwise val is wrapped as the only element of a new list: [val]
    '''

cast_list('foo')           # ['foo']
cast_list([1])             # [1]
cast_list(('foo', 'bar'))  # ['foo', 'bar']

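Fibonacci sequence: a further look at performance

'''
Computing the Fibonacci sequence with a generator:
no need to worry about using a large amount of resources
'''
def fibon(n):
    a = b = 1
    for i in range(n):
        yield a  # a is the value produced at each step
        a, b = b, a + b

for x in fibon(1000000):
    print(x)

'''
Computing the Fibonacci sequence with a list keeps every value
in memory at once and can exhaust the available memory
'''
def fibon_list(n):
    a = b = 1
    result = []
    for i in range(n):
        result.append(a)  # a is the value produced at each step
        a, b = b, a + b
    return result

for x in fibon_list(1000000):
    print(x)

When to use a generator: when you do not want all of the computed results to be held in memory at the same time.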
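The laziness of the generator version can be seen by asking for only a few values. The sketch below restates fibon and uses itertools.islice to take the first ten values; the name first_ten is only for illustration.

```python
from itertools import islice


def fibon(n):
    """Yield the first n Fibonacci numbers one at a time."""
    a = b = 1
    for _ in range(n):
        yield a
        a, b = b, a + b


# Only ten values are ever computed, even though the generator
# was asked for up to one million of them.
first_ten = list(islice(fibon(1_000_000), 10))
print(first_ten)
```

With the list version the whole list of one million numbers would have to be built before the first value could be used.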

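Difference between iterators and iterables

Iterator: an object that defines a __next__ method (together with an __iter__ method that returns the object itself).
A generator is also an iterator, but it can be iterated only once, because it holds only the current value and does not store the earlier ones (for example, yield a).
Use next(generator object) to step through it.

Iterable: any object that defines an __iter__ method.
Lists, strings, tuples, dicts, and sets are all iterables.
Use iter(iterable) to obtain an iterator from an iterable.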
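A short sketch of the distinction, using only built-ins. The variable names (nums, it, gen) are chosen for this example.

```python
nums = [1, 2, 3]                 # a list is an iterable, not an iterator
list_has_next = hasattr(nums, '__next__')

it = iter(nums)                  # iter() turns the iterable into an iterator
first = next(it)                 # 1
second = next(it)                # 2
rest = list(it)                  # consumes whatever is left
after_exhaustion = list(it)      # an exhausted iterator yields nothing more


def gen():
    yield 'a'
    yield 'b'


g = gen()
first_pass = list(g)             # the generator runs to completion here
second_pass = list(g)            # a second pass finds it already exhausted

# The list itself can be iterated again as often as needed.
again = list(nums)
```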

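Basic form of the map function

map(function to apply, object to operate on)
The function can be user-defined, a built-in function, or an anonymous lambda function.
The object operated on is usually an iterable; it can even be a list of functions.
In Python 3, map returns a lazy iterator, so wrap it in list() to see the results.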
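A few small cases, one for each kind of function mentioned above. The helper square and the sample data are assumptions made for this example.

```python
def square(x):
    return x * x


squares = list(map(square, [1, 2, 3]))            # user-defined function
lengths = list(map(len, ['a', 'bb', 'ccc']))      # built-in function
doubled = list(map(lambda x: x * 2, range(3)))    # lambda

# The iterable can itself be a list of functions:
# each function in turn is called with the same argument.
funcs = [abs, float]
applied = list(map(lambda f: f(-3), funcs))
```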

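Basic form of the filter function

filter keeps the elements that satisfy the condition. In Python 3 it returns an iterator over those elements, not a new list; wrap it in list() to get a list.
filter(function, iterable)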
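A minimal sketch of filter in Python 3; the sample list nums is an assumption for this example.

```python
nums = [1, 2, 3, 4, 5, 6]

evens_iter = filter(lambda x: x % 2 == 0, nums)
returns_list = isinstance(evens_iter, list)   # filter gives an iterator in Python 3
evens = list(evens_iter)

# Passing None as the function keeps only the truthy elements.
truthy = list(filter(None, [0, 1, '', 'a', None]))
```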

Combining map and filter

Take the even numbers out of lst_num and apply "add 2" and "multiply by 2" to them.
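One way this could look. The contents of lst_num are an assumption (the original values are not shown), and the two operations are applied separately here.

```python
lst_num = [1, 2, 3, 4, 5, 6]   # assumed sample data

# filter returns a one-shot iterator, so materialise it once
# before feeding it to two different map calls.
evens = list(filter(lambda x: x % 2 == 0, lst_num))

plus_two = list(map(lambda x: x + 2, evens))
times_two = list(map(lambda x: x * 2, evens))

print(plus_two, times_two)
```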

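Basic form of the reduce function in functools

reduce usually returns a single result accumulated over the whole iterable.
reduce(function, iterable)
Note: the function takes two arguments, for example lambda x, y.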
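A small sketch of reduce folding a list into one value. The optional third argument is the initial value, which is also what is returned for an empty iterable.

```python
from functools import reduce

nums = [1, 2, 3, 4, 5]

total = reduce(lambda x, y: x + y, nums)      # ((((1+2)+3)+4)+5)
product = reduce(lambda x, y: x * y, nums)    # ((((1*2)*3)*4)*5)
empty_total = reduce(lambda x, y: x + y, [], 0)
```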

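Ternary operator

value_if_true if condition else value_if_false
Note: both branches must be expressions. Putting a bare statement keyword (such as pass, break, or return) in a branch is a syntax error.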
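A brief sketch of the conditional expression, including a chained form. The function names parity and sign are made up for this example.

```python
def parity(n):
    # The whole right-hand side is one expression that yields a value.
    return 'even' if n % 2 == 0 else 'odd'


def sign(n):
    # Conditional expressions can be chained; they are read left to right.
    return 'negative' if n < 0 else 'zero' if n == 0 else 'positive'


print(parity(4), parity(7), sign(-2), sign(0), sign(9))
```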

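Four things you must understand before learning decorators

Decorator, informally: a function that modifies the behavior of another function.
Before studying decorators, make sure you have mastered all of the following points:
1. Everything is an object. After a function has been assigned to another name, deleting the original name does not affect the new one.
2. A function can be defined inside another function.
   Note: it is only called a closure when the outer function returns the inner function and the inner function uses the outer function's parameters.
3. A function can return a function: the outer function returns the name of the function defined inside it.
4. A function can be passed as an argument to another function.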
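The four points can be sketched in a few lines, followed by a small decorator that combines them. All names here (greet, make_adder, apply_twice, log_calls) are invented for illustration.

```python
import functools


# 1. Functions are objects: the alias survives deleting the original name.
def greet(name):
    return 'hi ' + name


alias = greet
del greet


# 2 and 3. A function defined inside a function and returned from it;
# add() uses the outer parameter n, so it is a closure.
def make_adder(n):
    def add(x):
        return x + n
    return add


# 4. A function passed as an argument to another function.
def apply_twice(fn, value):
    return fn(fn(value))


# A decorator puts the pieces together: it takes a function,
# defines a wrapper around it, and returns the wrapper.
def log_calls(fn):
    calls = []

    @functools.wraps(fn)          # keep the wrapped function's name
    def wrapper(*args, **kwargs):
        calls.append(args)
        return fn(*args, **kwargs)

    wrapper.calls = calls
    return wrapper


@log_calls
def double(x):
    return x * 2


result = double(5)
```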

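List comprehensions, the most basic form

General form:
[item for item in iterable]
[item for item in iterable if condition]
Note: the leading item can be any expression built from it.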
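Both forms in a short sketch, with an expression in the leading position:

```python
squares = [x * x for x in range(5)]
even_squares = [x * x for x in range(10) if x % 2 == 0]
print(squares, even_squares)
```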

Dict comprehensions, the most basic form

General form:
{ key expression : value expression for item in the dict's keys(), values(), or items() }
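A small sketch; the scores dict is sample data assumed for this example.

```python
scores = {'math': 93, 'english': 88}   # assumed sample data

# Operate on both key and value while iterating over items().
bumped = {name.upper(): score + 1 for name, score in scores.items()}

# Swap keys and values.
inverted = {score: name for name, score in scores.items()}

# Iterate over keys() only.
name_lengths = {name: len(name) for name in scores.keys()}
```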

Set comprehensions, the most basic form

General form:
{ expression on item for item in iterable }
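A short sketch showing that duplicates collapse because the result is a set. The word list is assumed sample data.

```python
words = ['a', 'bb', 'cc', 'ddd']       # assumed sample data
lengths = {len(w) for w in words}      # 'bb' and 'cc' both contribute 2
print(lengths)
```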

Status codes

100: ('continue',),
101: ('switching_protocols',),
102: ('processing',),
103: ('checkpoint',),
122: ('uri_too_long', 'request_uri_too_long'),
200: ('ok', 'okay', 'all_ok', 'all_okay', 'all_good', '\\o/', '✓'),
201: ('created',),
202: ('accepted',),
203: ('non_authoritative_info', 'non_authoritative_information'),
204: ('no_content',),
205: ('reset_content', 'reset'),
206: ('partial_content', 'partial'),
207: ('multi_status', 'multiple_status', 'multi_stati', 'multiple_stati'),
208: ('already_reported',),
226: ('im_used',),

# Redirection.
300: ('multiple_choices',),
301: ('moved_permanently', 'moved', '\\o-'),
302: ('found',),
303: ('see_other', 'other'),
304: ('not_modified',),
305: ('use_proxy',),
306: ('switch_proxy',),
307: ('temporary_redirect', 'temporary_moved', 'temporary'),
308: ('permanent_redirect', 'resume_incomplete', 'resume',),  # These 2 to be removed in 3.0

# Client errors.
400: ('bad_request', 'bad'),
401: ('unauthorized',),
402: ('payment_required', 'payment'),
403: ('forbidden',),
404: ('not_found', '-o-'),
405: ('method_not_allowed', 'not_allowed'),
406: ('not_acceptable',),
407: ('proxy_authentication_required', 'proxy_auth', 'proxy_authentication'),
408: ('request_timeout', 'timeout'),
409: ('conflict',),
410: ('gone',),
411: ('length_required',),
412: ('precondition_failed', 'precondition'),
413: ('request_entity_too_large',),
414: ('request_uri_too_large',),
415: ('unsupported_media_type', 'unsupported_media', 'media_type'),
416: ('requested_range_not_satisfiable', 'requested_range', 'range_not_satisfiable'),
417: ('expectation_failed',),
418: ('im_a_teapot', 'teapot', 'i_am_a_teapot'),
421: ('misdirected_request',),
422: ('unprocessable_entity', 'unprocessable'),
423: ('locked',),
424: ('failed_dependency', 'dependency'),
425: ('unordered_collection', 'unordered'),
426: ('upgrade_required', 'upgrade'),
428: ('precondition_required', 'precondition'),
429: ('too_many_requests', 'too_many'),
431: ('header_fields_too_large', 'fields_too_large'),
444: ('no_response', 'none'),
449: ('retry_with', 'retry'),
450: ('blocked_by_windows_parental_controls', 'parental_controls'),
451: ('unavailable_for_legal_reasons', 'legal_reasons'),
499: ('client_closed_request',),

# Server errors.
500: ('internal_server_error', 'server_error', '/o\\', '✗'),
501: ('not_implemented',),
502: ('bad_gateway',),
503: ('service_unavailable', 'unavailable'),
504: ('gateway_timeout',),
505: ('http_version_not_supported', 'http_version'),
506: ('variant_also_negotiates',),
507: ('insufficient_storage',),
509: ('bandwidth_limit_exceeded', 'bandwidth'),
510: ('not_extended',),
511: ('network_authentication_required', 'network_auth', 'network_authentication'),
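For looking codes up without a third-party package, the standard library's http.HTTPStatus enum covers the standard ones. The sketch below uses it in place of the table above (it does not include the non-standard entries such as 444 or 499), and the helper category is a hypothetical name for this example.

```python
from http import HTTPStatus


def category(code):
    """Return the class of an HTTP status code based on its first digit."""
    if 100 <= code < 200:
        return 'informational'
    if 200 <= code < 300:
        return 'success'
    if 300 <= code < 400:
        return 'redirection'
    if 400 <= code < 500:
        return 'client error'
    if 500 <= code < 600:
        return 'server error'
    return 'unknown'


ok_phrase = HTTPStatus(200).phrase           # reason phrase for 200
not_found = HTTPStatus.NOT_FOUND.value       # numeric value of NOT_FOUND
print(ok_phrase, not_found, category(301), category(503))
```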

Crawler workflow review 3

111. Flow of the requests.get method
    r = requests.get('https://www.baidu.com/').content.decode('utf-8')
    From the response object (status code) to bytes to UTF-8 decoded text
112. Pretty-printing the soup object
    html = soup.prettify()
    <title> 百度一下，你就知道 </title>
113. Turning the content into a string
    html.xpath('string(//*[@id="cnblogs_post_body"])')
114. Getting an attribute
    soup.p['name']
115. Nested selection
    soup.head.title.string
116. Getting the parent node and the ancestor nodes
    soup.a.parent
    list(enumerate(soup.a.parents))
117. Getting sibling nodes
    soup.a.next_siblings
    list(enumerate(soup.a.next_siblings))
    soup.a.previous_siblings
    list(enumerate(soup.a.previous_siblings))
118. Finding tags by a specific value
    Find the tags whose id is list-1
    soup.find_all(attrs={'id': 'list-1'})
    soup.find_all(id='list-1')
119. Returning parent nodes
    find_parents() returns all ancestor nodes
    find_parent() returns the direct parent node
120. Returning following siblings
    find_next_siblings() returns all following sibling nodes
    find_next_sibling() returns the first following sibling node
121. Returning preceding siblings
    find_previous_siblings() returns all preceding sibling nodes
    find_previous_sibling() returns the first preceding sibling node
122. Returning matching nodes after the current node
    find_all_next() returns all matching nodes after the node
    find_next() returns the first matching node
123. Returning matching nodes before the current node
    find_all_previous() returns all matching nodes before the node
    find_previous() returns the first matching node
124. Request methods in requests
    requests.post(url)
    requests.put(url)
    requests.delete(url)
    requests.head(url)
    requests.options(url)
125. GET request
    response = requests.get(url)
    print(response.text)
126. Parsing JSON
    response.json()
    json.loads(response.text)
127. Sending a POST request
    response = requests.post(url, data=data, headers=headers)
    response.json()
128. File upload
    Pass a files dict argument to the post method
    import requests
    files = {'file': open('favicon.ico', 'rb')}
    response = requests.post("http://httpbin.org/post", files=files)
    print(response.text)
129. Getting cookies
    response.cookies returns a dict-like object
    for key, value in response.cookies.items():
        print(key + '=' + value)
130. Simulated login (without a Session, the cookie set by the first request is not sent with the second)
    requests.get('http://httpbin.org/cookies/set/number/123456789')
    response = requests.get('http://httpbin.org/cookies')
131. Login with a Session
    s = requests.Session()
    s.get('http://httpbin.org/cookies/set/number/123456789')
    response = s.get('http://httpbin.org/cookies')
132. Certificate verification
    urllib3.disable_warnings()
    response = requests.get('https://www.12306.cn', verify=False)
    response = requests.get('https://www.12306.cn', cert=('/path/server.crt', '/path/key'))
133. Timeout settings
    from requests.exceptions import ReadTimeout
    response = requests.get("http://httpbin.org/get", timeout = 0.5)
    response = urllib.request.urlopen(url, timeout=1)
134. Authentication settings
    from requests.auth import HTTPBasicAuth
    r = requests.get('http://120.27.34.24:9001', auth=HTTPBasicAuth('user', '123'))
    r = requests.get('http://120.27.34.24:9001', auth=('user', '123'))
135. Exception handling
    Timeout: ReadTimeout
    Connection error: ConnectionError
    General error: RequestException
136. URL parsing
    from urllib.parse import urlparse
    result = urlparse('http://www.baidu.com/index.html;user?id=5#comment')
    result = urlparse('www.baidu.com/index.html;user?id=5#comment', scheme='https')
    result = urlparse('http://www.baidu.com/index.html;user?id=5#comment', allow_fragments=False)
137. urllib.parse.urlunparse
    data = ['http', 'www.baidu.com', 'index.html', 'user', 'a=6', 'comment']
    print(urlunparse(data))
    http://www.baidu.com/index.html;user?a=6#comment
138. Joining URLs with urllib.parse.urljoin
    urljoin('http://www.baidu.com', 'FAQ.html')
    http://www.baidu.com/FAQ.html
    urljoin('www.baidu.com#comment', '?category=2')
    www.baidu.com?category=2
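The URL helpers at the end of this list are all in the standard library and can be tried without any network access. The sketch below reuses the URLs from the notes; the variable names are chosen for this example.

```python
from urllib.parse import urljoin, urlparse, urlunparse

url = 'http://www.baidu.com/index.html;user?id=5#comment'

# urlparse splits a URL into six parts.
result = urlparse(url)
parts = (result.scheme, result.netloc, result.path,
         result.params, result.query, result.fragment)

# With allow_fragments=False the '#comment' part stays in the query.
no_fragment = urlparse(url, allow_fragments=False)

# urlunparse is the inverse: six parts in, one URL out.
rebuilt = urlunparse(['http', 'www.baidu.com', 'index.html',
                      'user', 'a=6', 'comment'])

# urljoin resolves the second argument relative to the first.
joined = urljoin('http://www.baidu.com', 'FAQ.html')
joined_query = urljoin('www.baidu.com#comment', '?category=2')
```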

因爲思惟導圖過大,不能所有截圖進來,因此複製了文字,進行導入 註釋導入xxx 是對上一行的解釋: 爬蟲基礎 導包 import requests from urllib.parse import urlencode # 導入解析模塊 from urllib.request import Request # Request 請求 from urllib.parse import quote # 使用 quote 解析中文 from urllib.request import urlopen # urlopen 打開 from fake_useragent import UserAgent # 導入 ua import ssl # 使用 ssl 忽略證書 from urllib.request import HTTPHandler from urllib.request import build_opener # 導入 build_opener from urllib.request import ProxyHandler # 導入 私人代理 from http.cookiejar import MozillaCookieJar # 導入 cookie , 從 http.cookiejar 中 from urllib.error import URLError # 捕捉 URL 異常 from lxml import etree # 導入 etree,使用 xpath 進行解析 import http.cookiejar # 導入 cookiejar import json # 導入 json import jsonpath # 導入 jsonpath from selenium import webdriver # 導入外部驅動 from selenium.webdriver.common.keys import Keys # 要想調用鍵盤按鍵操做須要引入keys包 headers headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36' } headers = { 'User-Agent':UserAgent().random } from fake_useragent import UserAgent headers = { 'User-Agent':UserAgent().chrome } 使用 ua 列表 user_agent = [ "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1", "Mozilla/5.0 (Windows NT 6.1; rv2.0.1) Gecko/20100101 Firefox/4.0.1", "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11", "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11" ] ua = random.choice(user_agent) headers = {'User-Agent':ua} url url = 'https://www.baidu.com/' # 要進行訪問的 URL url = 'https://www.baidu.com/s?wd={}'.format(quote('瀚陽的小驛站')) args = { 'wd':"Hany驛站", "ie":"utf-8" } url = 'https://www.baidu.com/s?wd={}'.format(urlencode(args)) 獲取 response get 請求 params = { 'wd':'Python' } response = requests.get(url,params = params,headers = headers) params = { 'wd':'ip' } proxies = { 
'http':'代理' # "http":"http://用戶名:密碼@120.27.224.41:16818" } response = requests.get(url, params=params, headers=headers, proxies=proxies) response = requests.get(url,headers = headers) response = requests.get(url,verify = False,headers = headers) Request 請求 form_data = { 'user':'帳號', 'password':'密碼' } f_data = urlencode(form_data) request = Request(url = url,headers = headers,data = f_data) handler = HTTPCookieProcessor() opener = build_opener(handler) response = opener.open(request) request = Request(url = url,headers = headers) response = urlopen(request) request = Request(url,headers=headers) handler = HTTPHandler() # 構建 handler opener = build_opener(handler) # 將 handler 添加到 build_opener中 response = opener.open(request) request = urllib.request.Request(url) request.add_header('User-Agent', ua) context = ssl._create_unverified_context() reponse = urllib.request.urlopen(request, context = context) response = urllib.request.urlopen(request, data=formdata) # 構建請求體 formdata = { 'from':'en', 'to':'zh', 'query':word, 'transtype':'enter', 'simple_means_flag':'3' } # 將formdata進行urlencode編碼,而且轉化爲bytes類型 formdata = urllib.parse.urlencode(formdata).encode('utf-8') request = urllib.request.Request(url, headers=headers) # 建立一個HTTPHandler對象，用來處理http請求 http_handler = urllib.request.HTTPHandler() # 構建一個HTTPHandler 處理器對象，支持處理HTTPS請求 # 經過build_opener，建立支持http請求的opener對象 opener = urllib.request.build_opener(http_handler) # 建立請求對象 # 抓取https，若是開啓fiddler，則會報證書錯誤 # 不開啓fiddler，抓取https，得不到百度網頁， request = urllib.request.Request('http://www.baidu.com/') # 調用opener對象的open方法，發送http請求 reponse = opener.open(request) 使用 proxies 代理進行請求 proxies = { 'http':'代理' # "http":"http://用戶名:密碼@120.27.224.41:16818" } response = requests.get(url,headers = headers,proxies = proxies) request = Request(url,headers = headers) handler = ProxyHandler({"http":"110.243.3.207"}) # 代理網址 opener = build_opener(handler) response = opener.open(request) post 請求 data = { 'user':'用戶名', 'password':'密碼' } response = 
requests.post(url,headers = headers,data = data) # 使用 data 傳遞參數 使用 session session = requests.Session() get 請求 session.get(info_url,headers = headers) post 請求 params = { 'user':'用戶名', 'password':'密碼' } session.post(url,headers = headers,data = params) 使用 ssl 忽略證書 context = ssl._create_unverified_context() response = urlopen(request,context = context) 使用 cookie form_data = { 'user':'用戶名', 'password':'密碼' } f_data = urlencode(form_data).encode() request = Request(url = login_url,headers = headers,data = f_data) cookie_jar = MozillaCookieJar() handler = HTTPCookieProcessor(cookie_jar) opener = build_opener(handler) response = opener.open(request) cookie_jar.save('cookie.txt',ignore_discard=True,ignore_expires=True) # 失效或者過時依舊進行保存 request = Request(url = info_url,headers = headers) cookie_jar = MozillaCookieJar() cookie_jar.load('cookie.txt',ignore_expires=True,ignore_discard=True) handler = HTTPCookieProcessor(cookie_jar) opener = build_opener(handler) response = opener.open(request) 設置時間戳 response = requests.get(url,timeout = 0.001) # 設置時間戳 cookie = http.cookiejar.CookieJar() # 經過CookieJar建立一個cookie對象，用來保存cookie值 cookie_handler = urllib.request.HTTPCookieProcessor(cookie) # 經過HTTPCookieProcessor構建一個處理器對象，用來處理cookie opener = urllib.request.build_opener(cookie_handler) headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', 'Referer':'https://passport.weibo.cn/signin/login?entry=mweibo&r=http%3A%2F%2Fweibo.cn%2F&backTitle=%CE%A2%B2%A9&vt=', 'Content-Type':'application/x-www-form-urlencoded', # 'Host': 'passport.weibo.cn', # 'Connection': 'keep-alive', # 'Content-Length': '173', # 'Origin':'https://passport.weibo.cn', # 'Accept': '*/*', } url = 'https://passport.weibo.cn/sso/login' formdata = { 'username':'17701256561', 'password':'2630030lzb', 'savestate':'1', 'r':'http://weibo.cn/', 'ec':'0', 'pagerefer':'', 'entry':'mweibo', 'wentry':'', 'loginfrom':'', 'client_id':'', 'code':'', 
'qq':'', 'mainpageflag':'1', 'hff':'', 'hfp':'' } formdata = urllib.parse.urlencode(formdata).encode() # post表單裏面的數據要轉化爲bytes類型，才能發送過去 request = urllib.request.Request(url, headers=headers) response = opener.open(request, data=formdata) headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', 'Referer':'https://coding.net/login', 'Content-Type':'application/x-www-form-urlencoded', } post_url = 'https://coding.net/api/v2/account/login' data = { 'account': 'wolfcode', 'password': '7c4a8d09ca3762af61e59520943dc26494f8941b', 'remember_me': 'false' } data = urllib.parse.urlencode(data).encode() # 向指定的post地址發送登陸請求 request = urllib.request.Request(post_url, headers=headers) response = opener.open(request, data=data) # 經過opener登陸 # 登陸成功以後，經過opener打開其餘地址便可 response 屬性和方法 response.getcode() # 獲取 HTTP 響應碼 200 response.geturl() # 獲取訪問的網址信息 response.info() # 獲取服務器響應的HTTP請求頭 info = response.read() # 讀取內容 info.decode() # 打印內容 response.read().decode() print(request.get_header("User-agent")) # 獲取請求頭信息 response.text # 獲取內容 response.encoding = 'utf-8' response.json() # 獲取響應信息(json 格式字符串) response.request.headers # 請求頭內容 response.cookie # 獲取 cookie response.readline() # 獲取一行信息 response.status # 查看狀態碼 正則表達式 $通配符,匹配字符串結尾 ret = re.match("[\w]{4,20}@163\.com$", email) # \w 匹配字母或數字 # {4,20}匹配前一個字符4到20次 re.match匹配字符（僅匹配開頭） ret = re.findall(r"\d+","Hany.age = 22, python.version = 3.7.5") # 輸出所有找到的結果 \d + 一次或屢次 ret = re.search(r"\d+",'閱讀次數爲:9999') # 只要找到規則便可,從頭至尾 re中匹配 [ ] 中列舉的字符 ret = re.match("[hH]","hello Python") # 大小寫h均可以的狀況 ret = re.match("[0-3,5-9]Hello Python","7Hello Python") # 匹配0到3 5到9的數字 re中匹配不是以4,7結尾的手機號碼 ret = re.match("1\d{9}[0-3,5-6,8-9]", tel) re中匹配中獎號碼 import re # 匹配中獎號碼 str2 = '17711602423' pattern = re.compile('^(1[3578]\d)(\d{4})(\d{4})$') print(pattern.sub(r'\1****\3',str2)) # r 字符串編碼轉化 '''177****2423''' re中匹配中文字符 pattern = re.compile('[\u4e00-\u9fa5]') strs = '你好 Hello hany' 
print(pattern.findall(strs)) # ['你', '好'] pattern = re.compile('[\u4e00-\u9fa5]+') print(pattern.findall(strs)) # ['你好'] re中將括號中字符做爲一個分組 ret = re.match("\w{4,20}@163\.com", "test@163.com") print(ret.group()) # test@163.com re中對分組起別名 ret = re.match(r"<(?P<name1>\w*)><(?P<name2>\w*)>.*</(?P=name2)></(?P=name1)>", "<html><h1>www.itcast.cn</h1></html>") print(ret.group()) <html><h1>www.itcast.cn</h1></html> re中匹配數字 # 使用\d進行匹配 ret = re.match("嫦娥\d號","嫦娥1號發射成功") print(ret.group()) re中匹配左右任意一個表達式 ret = re.match("[1-9]?\d$|100","78") print(ret.group()) # 78 re中匹配多個字符 問號 ret = re.match("[1-9]?\d[1-9]","33") print(ret.group()) # 33 ret = re.match("[1-9]?\d","33") print(ret.group()) # 33 re中匹配多個字符 星號 ret = re.match("[A-Z][a-z]*","MnnM") print(ret.group()) # Mnn ret = re.match("[A-Z][a-z]*","Aabcdef") print(ret.group()) # Aabcdef re中匹配多個字符 加號 import re #匹配前一個字符出現1次或無限次 names = ["name1", "_name", "2_name", "__name__"] for name in names: ret = re.match("[a-zA-Z_]+[\w]*",name) if ret: print("變量名 %s 符合要求" % ret.group()) else: print("變量名 %s 非法" % name) 變量名 name1 符合要求 變量名 _name 符合要求 變量名 2_name 非法 變量名 __name__ 符合要求 re中引用分組匹配字符串 # 經過引用分組中匹配到的數據便可，可是要注意是元字符串，即相似 r""這種格式 ret = re.match(r"<([a-zA-Z]*)>\w*</\1>", "<html>hh</html>") # </\1>匹配第一個規則 print(ret.group()) # <html>hh</html> ret = re.match(r"<(\w*)><(\w*)>.*</\2></\1>", label) re中的貪婪和非貪婪 ret = re.match(r"aa(\d+)","aa2343ddd") # 儘可能多的匹配字符 print(ret.group()) # aa2343 # 使用? 
將re貪婪轉換爲非貪婪 ret = re.match(r"aa(\d+?)","aa2343ddd") # 只輸出一個數字 print(ret.group()) # aa2 re使用split切割字符串 str1 = 'one,two,three,four' pattern = re.compile(',') # 按照，將string分割後返回 print(pattern.split(str1)) # ['one', 'two', 'three', 'four'] str2 = 'one1two2three3four' print(re.split('\d+',str2)) # ['one', 'two', 'three', 'four'] re匹配中subn，進行替換並返回替換次數 pattern = re.compile('\d+') strs = 'one1two2three3four' print(pattern.subn('-',strs)) # ('one-two-three-four', 3) 3爲替換的次數 re匹配中sub將匹配到的數據進行替換 pattern = re.compile('\d') str1 = 'one1two2three3four' print(pattern.sub('-',str1)) # one-two-three-four print(re.sub('\d','-',str1)) # one-two-three-four 獲取圖片 src="https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg" ret = re.search(r"https://.*?\.jpg", src) print(ret.group()) https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg re匹配前一個字符出現m次 res = re.compile('[a-zA-Z]{1}') strs = '123abc456' print(re.search(res,strs).group( )) # a res = re.compile('[a-zA-Z]{1}') strs = '123abc456' print(re.findall(res,strs)) #findall返回列表元素對象不具備group函數 # ['a', 'b', 'c'] 分組 group strs = 'hello 123,world 456' pattern = re.compile('(\w+) (\d+)') for i in pattern.finditer(strs): print(i.group(0)) print(i.group(1)) print(i.group(2))#當存在第二個分組時 hello 123 hello 123 world 456 world 456 print(pattern.sub(r'\2 \1',strs)) # 先輸出第二組，後輸出第一組 # 123 hello,456 world print(pattern.sub(r'\1 \2',strs)) # 先輸出第一組，後輸出第二組 # hello 123,world 456 忽略警告 requests.packages.urllib3.disable_warnings() quote 編碼 urllib.parse.quote() 除了-._/09AZaz 都會編碼 urllib.parse.quote_plus() 還會編碼 / url = 'kw=中國' urllib.parse.quote(url) urllib.parse.quote_plus(url) 保存網址內容爲某個文件格式 urllib.request.urlretrieve(url, '名稱.後綴名') json # 將字節碼解碼爲utf8的字符串 data = data.decode('utf-8') # 將json格式的字符串轉化爲json對象 obj = json.loads(data) # 禁用ascii以後，寫入數據，就是正確的 html = json.dumps(obj, ensure_ascii=False) # 將json對象經過str函數強制轉化爲字符串而後按照utf-8格式寫入，這樣就能夠寫成中文漢字了 # 寫文件的時候要指定encoding，不然會按照系統的編碼集寫文件 loads 引號中爲列表 string = '[1, 2, 3, 4, 
"haha"]' json.loads(string) 引號中爲字典 str_dict = '{"name":"goudan", "age":100, "height":180}' json.loads(str_dict) obj = json.load(open('jsontest.json', encoding='utf-8')) # load 讀取文件中json形式的字符串 轉化成python對象 dumps json.dumps() 序列化時默認使用的ascii編碼 # 添加參數 ensure_ascii=False 禁用ascii編碼，按utf-8編碼 json.dump(str_dict, open('jsontest.json', 'w', encoding='utf-8'), ensure_ascii=False) # dump將對象序列化以後寫入文件 load obj = json.load(open('book.json', encoding='utf-8')) book = jsonpath.jsonpath(obj, '$..book') 保存文件 # 獲得html爲bytes類型 html = response.read() # 將bytes類型轉化爲字符串類型 html = html.decode('utf-8') # 輸出文件時，須要將bytes類型使用wb寫入文件，不然出錯 fp = open('baidu.html', 'w') fp.write(html) fp.close() html = reponse.read() with open(filename, 'wb') as f: f.write(html) # 經過read讀取過來爲字節碼 data = response.read() # 將字節碼解碼爲utf8的字符串 data = data.decode('utf-8') # 將json格式的字符串轉化爲json對象 obj = json.loads(data) # 禁用ascii以後，寫入數據，就是正確的 html = json.dumps(obj, ensure_ascii=False) # 將json對象經過str函數強制轉化爲字符串而後按照utf-8格式寫入，這樣就能夠寫成中文漢字了 # 寫文件的時候要指定encoding，不然會按照系統的編碼集寫文件 with open('json.txt', 'w', encoding='utf-8') as f: f.write(html) etree html_tree = etree.parse('文件名.html') # 經過讀取文件獲得tree對象 xpath 用法 result = html_tree.xpath('//li') # 獲取全部的li標籤 result = html_tree.xpath('//li/@class') # 獲取全部li標籤的class屬性 result = html_tree.xpath('//li/a[@href="link1.html"]') # 獲取全部li下面a中屬性href爲link1.html的a result = html_tree.xpath('//li[last()]/a/@href') # 獲取最後一個li的a裏面的href,結果爲一個字符串 result = html_tree.xpath('//*[@class="mimi"]') # 獲取class爲mimi的節點 result = html_tree.xpath('//li[@class="popo"]/a') # 符合條件的全部li裏面的全部a節點 result = html_tree.xpath('//li[@class="popo"]/a/text()') # 符合條件的全部li裏面的全部a節點的內容 result = html_tree.xpath('//li[@class="popo"]/a')[0].text # 符合條件的全部li裏面的 a節點的內容 xpath使用後,加上 .extract() 只有一個元素能夠使用 .extract_first() tostring etree.tostring(result[0]).decode('utf-8') # 將tree對象轉化爲字符串 html = etree.tostring(html_tree) print(html.decode('utf-8')) etree.HTML html_tree = etree.HTML('文件名.html') # 將html字符串解析爲文檔類型 html_bytes = response.read() html_tree 
= etree.HTML(html_bytes.decode('utf-8')) response = requests.get(url,headers = headers) e = etree.HTML(response.text) img_path = '//article//img/@src' img_urls = e.xpath(img_path) string(.) 方法 xpath獲取到的對象列表中的某一個元素 ret = score.xpath('string(.)').extract()[0] BeautifulSoup 獲取 soup soup = BeautifulSoup(open('文件名.html', encoding='utf-8'), 'lxml') soup = BeautifulSoup(driver.page_source, 'lxml') # 在全部內容中第一個符合要求的標籤 soup.title soup.a soup.ul a_tag = soup.a a_tag.name # 得到標籤名字 a_tag.attrs # 獲得標籤的全部屬性,字典類型 a_tag.get('href') # 獲取 href a_tag['title'] # 查看 a 標籤的 title 值 a_tag.string # 獲取 a 標籤的內容 獲取標籤下的子節點 contents soup.div.contents # 獲取 div 標籤下全部子節點 soup.head.contents[1] # 獲取 div 下第二個子節點 children # .children屬性獲得的是一個生成器，能夠遍歷生成器 # 遍歷生成器打印對象 for child in soup.body.children: print(child) # 只遍歷直接子節點 for child in soup.div.children: print(child) # descendants會遞歸遍歷子孫節點 for child in soup.div.descendants: print(child) find_all 方法,查找全部的內容 soup.find_all(re.compile('^b')) # 傳入正則表達式 找到全部以b開頭的標籤 soup.find_all(['a', 'b']) # 傳入列表 找到全部的a標籤和b標籤 select 方法 soup.select('a') # 經過類名 soup.select('.aa') # 經過id名 soup.select('#wangyi') # 組合查找 soup.select('div .la') # 直接層級 soup.select('.div > .la') # 根據屬性查找 soup.select('input[class="haha"]') # 查找 input 標籤下 class 爲 haha 的 標籤 soup.select('.la')[0].get_text() # 找到節點以後獲取內容 經過get_text()方法，而且記得添加下標 jsonpath jsonpath 方法 obj = json.load(open('book.json', encoding='utf-8')) book = jsonpath.jsonpath(obj, '$..book') # 全部book authors = jsonpath.jsonpath(obj, '$..book..author') # 全部book中的全部做者 # book中的前兩本書 '$..book[:2]' # book中的最後兩本書 '$..book[-2:]' book = jsonpath.jsonpath(obj, '$..book[0,1]') # 獲取前面的兩本書 book = jsonpath.jsonpath(obj, '$..book[?(@.isbn)]') # 全部book中，有屬性isbn的書籍 book = jsonpath.jsonpath(obj, '$.store.book[?(@.price<10)]') # 全部book中，價格小於10的書籍 xpath和jsonpath 補充資料 day01 http 狀態碼 協議簡介 fiddler 簡介 環境安裝 類型 問題 day02 day03 day04 經常使用函數 webdriver 方法 設置 driver driver = webdriver.PhantomJS() driver = webdriver.PhantomJS(executable_path="./phantomjs") # 
若是沒有在環境變量指定PhantomJS位置 driver 方法 text # 獲取標籤內容 get_attribute('href') # 獲取標籤屬性 獲取id標籤值 element = driver.find_element_by_id("passwd-id") driver.find_element_by_id('kw').send_keys('中國') driver.find_element_by_id('su').click() # 點擊百度一下 yanzheng = input('請輸入驗證碼:') driver.find_element_by_id('captcha_field').send_keys(yanzheng) for x in range(1, 3): driver.find_element_by_id('loadMore').click() time.sleep(3) driver.save_screenshot(str(x) + '.png') 獲取name標籤值 element = driver.find_element_by_name("user-name") 獲取標籤名值 element = driver.find_element_by_tag_name("input") 能夠經過XPath來匹配 element = driver.find_element_by_xpath("//input[@id='passwd-id']") 經過css來匹配 element = driver.find_element_by_css_selector("#food span.dairy.aged") 獲取當前url driver.current_url 關閉瀏覽器 driver.quit() driver.save_screenshot('圖片名.png') # 保存當前網址爲一張圖片 driver.execute_script(js) # 調用js方法，同時執行javascript腳本 實例 小說 三寸人間

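list and [ ] do not do the same thing

For an object obj:
list(obj) performs a conversion: it iterates over obj and builds a list from its elements.
[obj] performs no conversion; it only wraps obj in an outer [ ].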

The same applies inside list comprehensions.
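A short sketch of the difference, first directly and then inside a list comprehension. The sample values are assumptions for this example.

```python
converted = list('abc')        # iterates over the string
wrapped = ['abc']              # one-element list holding the string

from_tuple = list((1, 2))      # elements of the tuple
around_tuple = [(1, 2)]        # the tuple itself as the single element

words = ['ab', 'cd']
converted_each = [list(w) for w in words]
wrapped_each = [[w] for w in words]
```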

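Database design basics

Process:
1. User requirements analysis
2. Conceptual design
3. Logical design (normalization)
4. Physical design of the database

Steps from the E-R model to the relational data model:
① Create one table for each entity
② Choose a primary key for each table (adding a field with no business meaning as the primary key is recommended)
③ Use foreign keys to represent the relationships between entities
④ Define constraints
⑤ Evaluate the quality of the relations and make any necessary improvements (for normal forms and related topics, see other database books)
⑥ Choose suitable data types, attributes, and indexes for each field

Relationships:
One-to-one: put the primary key of one side into the other side
One-to-many: put the primary key of the "one" side into the "many" side
Many-to-many: take the primary keys of both sides and put them into a new table

Satisfy at least third normal form
Define attributes, types, and indexes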
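The one-to-many and many-to-many rules can be sketched with the standard library's sqlite3 module. SQLite is used here only so the example runs without a server; it is not the MySQL setup used elsewhere in these notes, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('PRAGMA foreign_keys = ON')   # SQLite enforces foreign keys only when asked

conn.executescript('''
    -- one table per entity, each with a surrogate primary key
    CREATE TABLE department (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    -- one-to-many: the key of the "one" side lives in the "many" side
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES department(id)
    );
    CREATE TABLE project (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- many-to-many: both primary keys go into a new table
    CREATE TABLE employee_project (
        employee_id INTEGER NOT NULL REFERENCES employee(id),
        project_id  INTEGER NOT NULL REFERENCES project(id),
        PRIMARY KEY (employee_id, project_id)
    );
''')

conn.execute("INSERT INTO department (name) VALUES (?)", ('Cosmetics',))
conn.execute("INSERT INTO employee (name, department_id) VALUES (?, ?)", ('Zhang San', 1))
conn.execute("INSERT INTO project (title) VALUES (?)", ('Inventory',))
conn.execute("INSERT INTO employee_project VALUES (?, ?)", (1, 1))

rows = conn.execute('''
    SELECT e.name, d.name, p.title
    FROM employee e
    JOIN department d        ON d.id = e.department_id
    JOIN employee_project ep ON ep.employee_id = e.id
    JOIN project p           ON p.id = ep.project_id
''').fetchall()

# A row that points at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employee (name, department_id) VALUES (?, ?)", ('Li Si', 99))
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```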

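Notes (this is the behavior when strict SQL mode is off; in strict mode MySQL rejects the value with an error instead):
1. When a value is out of range, the maximum value of the type is stored.
2. When a negative number is given for an unsigned column, 0 is stored.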

String types

Can store binary data such as images or sound
Can store data compressed with gzip

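When to use which type (recommendations)

char: when the field values have a fixed length or nearly the same length (such as gender), or when the database performs many string operations (such as comparison and sorting)
varchar: when the field length varies a lot (such as article titles)
BLOB: stores binary data (photos, movies, archives)
ENUM('','') : enumeration type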

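Field attributes

UNSIGNED: does not allow negative values in the field, which raises the maximum allowed value
ZEROFILL: pads with zeros; missing digits in front of the number are filled with 0 automatically; only applies to numeric types
AUTO_INCREMENT: auto-increment attribute; by default it starts at the integer 1 and increases in steps of 1
    The starting value can be specified with AUTO_INCREMENT=n
    If NULL is inserted into an AUTO_INCREMENT column, MySQL generates the next sequence number automatically
DEFAULT: specifies a default value

Indexes
Ensure the uniqueness of data
Optimize queries
Optimize text searches on indexed fields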

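Downloading free movies or TV series with you-get

Install you-get and ffmpeg. ffmpeg is mainly used to merge the audio and video after the download.
pip install you-get -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
pip install ffmpeg -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
you-get expects the FFmpeg executable to be available on the system PATH; if the pip package above does not provide it, install FFmpeg separately.

you-get download command:
you-get <video URL>

Here episode 2 of 完美關係 is used as the example:
https://v.qq.com/x/cover/mzc0020095tf0wm/s00338f1hq8.html?ptag=qqbrowser

Note: with this command the file is downloaded to the C:\Users\lenovo directory.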

To choose the download location, use -o followed by the folder:
you-get -o <folder to save the file in> <video URL>

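Three drivers for connecting Python to MySQL

mysql-client (mysqlclient)
    Works directly with both Python 2 and Python 3
    Has requirements on the MySQL installation: the configuration file must exist at the expected location
python-mysql
    Not supported on Python 3
pymysql
    Supported on both Python 2 and Python 3
    Can also masquerade as the libraries above (via pymysql.install_as_MySQLdb())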

I personally use import mysql.connector

Install mysql-connector with pip:
python -m pip install mysql-connector

Creating a database connection
The following code connects to the database:

import mysql.connector

mydb = mysql.connector.connect(
    host="localhost",       # database host address
    user="yourusername",    # database user name
    passwd="yourpassword"   # database password
)
print(mydb)

Creating a database
A database is created with the "CREATE DATABASE" statement. The following creates a database named runoob_db:

import mysql.connector

mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd="123456"
)
mycursor = mydb.cursor()
mycursor.execute("CREATE DATABASE runoob_db")

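Creating a table
A table is created with the "CREATE TABLE" statement. Before creating a table, make sure the database already exists. The following creates a table named sites in the runoob_db database created above:

import mysql.connector

mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd="123456",
    database="runoob_db"
)
mycursor = mydb.cursor()
mycursor.execute("CREATE TABLE sites (name VARCHAR(255), url VARCHAR(255))")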

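Setting a primary key

If the table already exists, add the key column with ALTER TABLE:
mycursor.execute("ALTER TABLE sites ADD COLUMN id INT AUTO_INCREMENT PRIMARY KEY")

If the table does not exist yet, define the key when creating it:
mycursor.execute("CREATE TABLE sites (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(255), url VARCHAR(255))")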

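Inserting data

General pattern:
sql = "INSERT INTO sites (column_name, column_name) VALUES (%s, %s)"
val = ("column value", "column value")
mycursor.execute(sql, val)
mydb.commit()  # the table content has changed, so this statement is required

Example:
mycursor = mydb.cursor()
sql = "INSERT INTO sites (name, url) VALUES (%s, %s)"
val = ("hany", "值1")
mycursor.execute(sql, val)
mydb.commit()  # the table content has changed, so this statement is required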

Inserting data in bulk

sql = "INSERT INTO sites (name, url) VALUES (%s, %s)"
val = [
    ('Google', 'https://www.google.com'),
    ('Github', 'https://www.github.com'),
    ('Taobao', 'https://www.taobao.com'),
    ('stackoverflow', 'https://www.stackoverflow.com/')
]
mycursor.executemany(sql, val)
mydb.commit()  # the table content has changed, so this statement is required

Querying all data

mycursor = mydb.cursor()
mycursor.execute("SELECT * FROM sites")
myresult = mycursor.fetchall()  # fetchall() retrieves all records
for x in myresult:
    print(x)

Querying a single row

myresult = mycursor.fetchone()

SQL statements for querying

sql = "SELECT * FROM sites WHERE name ='RUNOOB'"
sql = "SELECT * FROM sites WHERE url LIKE '%oo%'"
sql = "SELECT * FROM sites WHERE name = %s"
sql = "SELECT * FROM sites ORDER BY name"
sql = "SELECT * FROM sites ORDER BY name DESC"
mycursor.execute("SELECT * FROM sites LIMIT 3")
mycursor.execute("SELECT * FROM sites LIMIT 3 OFFSET 1")
Note: OFFSET 0 is the first row, 1 is the second row, and so on.

Deleting records

sql = "DELETE FROM sites WHERE name = 'stackoverflow'"
mycursor.execute(sql)

sql = "DELETE FROM sites WHERE name = %s"
na = ("stackoverflow", )
mycursor.execute(sql, na)
mydb.commit()  # the table content has changed, so the deletion must be committed

Dropping a table

sql = "DROP TABLE IF EXISTS sites"  # drop the table sites
mycursor.execute(sql)

Updating records

sql = "UPDATE sites SET name = 'ZH' WHERE name = 'Zhihu'"
mycursor.execute(sql)

sql = "UPDATE sites SET name = %s WHERE name = %s"
val = ("Zhihu", "ZH")
mycursor.execute(sql, val)
mydb.commit()

pandas: creating a DataFrame

# Creating a DataFrame; its parts are: index, columns, values
import numpy as np
import pandas as pd

# Create a DataFrame object
dataframe = pd.DataFrame(np.random.randint(1,20,(5,3)),
                         index = range(5),
                         columns = ['A','B','C'])
'''
    A   B   C
0  17   9  19
1  14   5   8
2   7  18  13
3  13  16   2
4  18   6   5
'''

# Use a time series as the index
dataframe2 = pd.DataFrame(np.random.randint(5,15,(9,3)),
                          index = pd.date_range(start = '202003211126',
                                                end = '202003212000',
                                                freq = 'H'),
                          columns = ['Pandas','爬蟲','比賽'])
'''
                     Pandas  爬蟲  比賽
2020-03-21 11:26:00       8    10     8
2020-03-21 12:26:00       9    14     9
2020-03-21 13:26:00       9     5    13
2020-03-21 14:26:00       9     7     7
2020-03-21 15:26:00      11    10    14
2020-03-21 16:26:00      12     7    10
2020-03-21 17:26:00      11    11    13
2020-03-21 18:26:00       8    13     8
2020-03-21 19:26:00       7     7    13
'''

# Create from a dict
dataframe3 = pd.DataFrame({'語文':[87,79,67,92],
                           '數學':[93,89,80,77],
                           '英語':[88,95,76,77]},
                          index = ['張三','李四','王五','趙六'])
'''
      語文  數學  英語
張三    87    93    88
李四    79    89    95
王五    67    80    76
趙六    92    77    77
'''

# A scalar value is expanded automatically on creation
dataframe4 = pd.DataFrame({'A':range(5,10),'B':3})
'''
   A  B
0  5  3
1  6  3
2  7  3
3  8  3
4  9  3
'''

pandas_One-dimensional arrays (Series) and common operations

# 一維數組與經常使用操做 import pandas as pd # 設置輸出結果列對齊 pd.set_option('display.unicode.ambiguous_as_wide',True) pd.set_option('display.unicode.east_asian_width',True) # 建立 從 0 開始的非負整數索引 s1 = pd.Series(range(1,20,5)) ''' 0 1 1 6 2 11 3 16 dtype: int64 ''' # 使用字典建立 Series 字典的鍵做爲索引 s2 = pd.Series({'語文':95,'數學':98,'Python':100,'物理':97,'化學':99}) ''' 語文 95 數學 98 Python 100 物理 97 化學 99 dtype: int64 ''' # 使用索引下標進行修改 # 修改 Series 對象的值 s1[3] = -17 ''' 0 1 1 6 2 11 3 -17 dtype: int64 ''' s2['語文'] = 94 ''' 語文 94 數學 98 Python 100 物理 97 化學 99 dtype: int64 ''' # 查看 s1 的絕對值 abs(s1) ''' 0 1 1 6 2 11 3 17 dtype: int64 ''' # 將 s1 全部的值都加 五、使用加法時，對全部元素都進行 s1 + 5 ''' 0 6 1 11 2 16 3 -12 dtype: int64 ''' # 在 s1 的索引下標前加入參數值 s1.add_prefix(2) ''' 20 1 21 6 22 11 23 -17 dtype: int64 ''' # s2 數據的直方圖 s2.hist() # 每行索引後面加上 hany s2.add_suffix('hany') ''' 語文hany 94 數學hany 98 Pythonhany 100 物理hany 97 化學hany 99 dtype: int64 ''' # 查看 s2 中最大值的索引 s2.argmax() # 'Python' # 查看 s2 的值是否在指定區間內 s2.between(90,100,inclusive = True) ''' 語文 True 數學 True Python True 物理 True 化學 True dtype: bool ''' # 查看 s2 中 97 分以上的數據 s2[s2 > 97] ''' 數學 98 Python 100 化學 99 dtype: int64 ''' # 查看 s2 中大於中值的數據 s2[s2 > s2.median()] ''' Python 100 化學 99 dtype: int64 ''' # s2 與數字之間的運算,開平方 * 10 保留一位小數 round((s2**0.5)*10,1) ''' 語文 97.0 數學 99.0 Python 100.0 物理 98.5 化學 99.5 dtype: float64 ''' # s2 的中值 s2.median() # 98.0 # s2 中最小的兩個數 s2.nsmallest(2) ''' 語文 94 物理 97 dtype: int64 ''' # s2 中最大的兩個數 s2.nlargest(2) ''' Python 100 化學 99 dtype: int64 ''' # Series 對象之間的運算,對相同索引進行計算,不是相同索引的使用 NaN pd.Series(range(5)) + pd.Series(range(5,10)) ''' 0 5 1 7 2 9 3 11 4 13 dtype: int64 ''' # 對 Series 對象使用匿名函數 pd.Series(range(5)).pipe(lambda x,y,z :(x**y)%z,2,5) ''' 0 0 1 1 2 4 3 4 4 1 dtype: int64 ''' pd.Series(range(5)).pipe(lambda x:x+3) ''' 0 3 1 4 2 5 3 6 4 7 dtype: int64 ''' pd.Series(range(5)).pipe(lambda x:x+3).pipe(lambda x:x*3) ''' 0 9 1 12 2 15 3 18 4 21 dtype: int64 ''' # 對 Series 對象使用匿名函數 pd.Series(range(5)).apply(lambda x:x+3) ''' 0 3 1 4 2 5 3 6 4 7 dtype: 
int64 ''' # 查看標準差 pd.Series(range(0,5)).std() # 1.5811388300841898 # 查看無偏方差 pd.Series(range(0,5)).var() # 2.5 # 查看無偏標準差 pd.Series(range(0,5)).sem() # 0.7071067811865476 # 查看是否存在等價於 True 的值 any(pd.Series([3,0,True])) # True # 查看是否全部的值都等價於 True all(pd.Series([3,0,True])) # False
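Two details in the notes above are easy to mix up, so here is a small check. `idxmax()` returns the index label of the largest value, while a positional argmax returns its integer position. Also, `sem()` is the standard error of the mean, that is the sample standard deviation divided by the square root of the number of values; it is not another kind of standard deviation. The data is the same `s2` as above.

```python
import math
import pandas as pd

s2 = pd.Series({'語文': 94, '數學': 98, 'Python': 100, '物理': 97, '化學': 99})

best_label = s2.idxmax()                      # index label of the maximum
best_position = int(s2.to_numpy().argmax())   # integer position of the maximum

# between() includes both endpoints by default
in_range = s2.between(95, 100)

s = pd.Series(range(5))
std = s.std()   # sample standard deviation (divides by n - 1)
sem = s.sem()   # standard error of the mean = std / sqrt(n)
```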

pandas_Using accessor attributes (dt, str) for advanced functionality

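C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx
You can create this workbook yourself; the following posts are for reference only.
import pandas as pd
import copy
# Align columns in the output
pd.set_option("display.unicode.ambiguous_as_wide",True)
pd.set_option("display.unicode.east_asian_width",True)
data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx',usecols = ['日期','交易額'])
dff = copy.deepcopy(data)
# Show the day of the week.
# 日期 holds integers such as 20190301. Without an explicit format, pd.to_datetime
# reads them as nanoseconds since 1970-01-01, so every row would come out as Thursday.
# dt.weekday_name has been removed from pandas; dt.day_name() replaces it.
dff['日期'] = pd.to_datetime(data['日期'].astype(str),format = '%Y%m%d').dt.day_name()
dff[:3]
'''
     日期  交易額
0  Friday    2000
1  Friday    1800
2  Friday     800
'''
# Group by day of the week and show the mean transaction amount
dff = dff.groupby('日期').mean().apply(round)
dff.index.name = '周幾'
dff[:3]
'''
          交易額
周幾
Friday    1121.0
Saturday   918.0
Sunday    1300.0
'''
# The commented-out lines below appear to assume that 日期 is stored as strings such as '2019-03-01'
# dff = copy.deepcopy(data)
# Extract year and month with a regular expression
# dff['日期'] = dff.日期.str.extract(r'(\d{4}-\d{2})')
# dff[:5]
# Group by the last character of the date and show the mean; -1 means the last character
# data.groupby(data.日期.str.__getitem__(-1)).mean().apply(round)
# Rows whose date ends in 1
# data[data.日期.str.endswith('1')][:12]
# Rows whose date ends in 12; slice(-2) takes the last two characters
# data[data.日期.str.slice(-2) == '12']
# Rows whose month or day contains a 2
# data[data.日期.str.slice(-5).str.contains('2')][1:9]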
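The commented-out `.str` calls above only work on string dates. A minimal sketch with made-up dates in `'YYYY-MM-DD'` form (that string format is an assumption; the workbook itself is not shown) makes the behaviour of each accessor method concrete.

```python
import pandas as pd

# Hypothetical sample with dates stored as strings
data = pd.DataFrame({'日期': ['2019-03-01', '2019-03-02', '2019-03-11', '2019-03-12'],
                     '交易額': [2000, 1900, 1300, 800]})

# str.extract returns a DataFrame with one column per capture group
months = data['日期'].str.extract(r'(\d{4}-\d{2})')[0].tolist()

# Rows whose date string ends in '1'
ends_with_1 = data[data['日期'].str.endswith('1')]['交易額'].tolist()

# slice(-2) keeps the last two characters, here the day of the month
day_12 = data[data['日期'].str.slice(-2) == '12']['交易額'].tolist()
```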

pandas_Summarizing sales performance with pivot tables and crosstabs

# 使用透視表與交叉表查看業績彙總數據 import pandas as pd import numpy as np import copy # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 對姓名和日期進行分組,並進行求和 dff = dataframe.groupby(by = ['姓名','日期'],as_index = False).sum() ''' 姓名 日期 工號 交易額 0 周七 20190301 1005 600 1 周七 20190302 1005 580 2 張三 20190301 1001 2000 3 張三 20190302 2002 1900 4 張三 20190303 1001 1300 5 李四 20190301 1002 1800 6 李四 20190302 2004 2180 7 王五 20190301 1003 800 8 王五 20190302 2006 1830 9 趙六 20190301 1004 1100 10 趙六 20190302 1004 1050 11 錢八 20190301 2012 1550 12 錢八 20190302 1006 720 ''' # 將 dff 的索引，列 設置成透視表形式 dff = dff.pivot(index = '姓名',columns = '日期',values = '交易額') ''' 日期 20190301 20190302 20190303 姓名 周七 600.0 580.0 NaN 張三 2000.0 1900.0 1300.0 李四 1800.0 2180.0 NaN 王五 800.0 1830.0 NaN 趙六 1100.0 1050.0 NaN 錢八 1550.0 720.0 NaN ''' # 查看前一天的數據 dff.iloc[:,:1] ''' 日期 20190301 姓名 周七 600.0 張三 2000.0 李四 1800.0 王五 800.0 趙六 1100.0 錢八 1550.0 ''' # 交易總額小於 4000 的人的前三天業績 dff[dff.sum(axis = 1) < 4000].iloc[:,:3] ''' 日期 20190301 20190302 20190303 姓名 周七 600.0 580.0 NaN 李四 1800.0 2180.0 NaN 王五 800.0 1830.0 NaN 趙六 1100.0 1050.0 NaN 錢八 1550.0 720.0 NaN ''' # 工資總額大於 2900 元的員工的姓名 dff[dff.sum(axis = 1) > 2900].index.values # array(['張三', '李四'], dtype=object) # 顯示前兩天每一天的交易總額以及每一個人的交易金額 dataframe.pivot_table(values = '交易額',index = '姓名', columns = '日期',aggfunc = 'sum',margins = True).iloc[:,:2] ''' 日期 20190301 20190302 姓名 周七 600.0 580.0 張三 2000.0 1900.0 李四 1800.0 2180.0 王五 800.0 1830.0 趙六 1100.0 1050.0 錢八 1550.0 720.0 All 7850.0 8260.0 ''' # 顯示每一個人在每一個櫃檯的交易總額 dff = dataframe.groupby(by = ['姓名','櫃檯'],as_index = False).sum() dff.pivot(index = '姓名',columns = '櫃檯',values = '交易額') ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 1180.0 NaN NaN 張三 4600.0 NaN 600.0 NaN 李四 3300.0 NaN 680.0 NaN 王五 NaN NaN 830.0 1800.0 趙六 NaN NaN NaN 2150.0 錢八 NaN 1420.0 850.0 NaN ''' # 查看每人天天的上班次數 dataframe.pivot_table(values = '交易額',index = 
'姓名',columns = '日期',aggfunc = 'count',margins = True).iloc[:,:1] ''' 日期 20190301 姓名 周七 1.0 張三 1.0 李四 1.0 王五 1.0 趙六 1.0 錢八 2.0 All 7.0 ''' # 查看每一個人天天購買的次數 dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'count',margins = True) ''' 日期 20190301 20190302 20190303 All 姓名 周七 1.0 1.0 NaN 2 張三 1.0 2.0 1.0 4 李四 1.0 2.0 NaN 3 王五 1.0 2.0 NaN 3 趙六 1.0 1.0 NaN 2 錢八 2.0 1.0 NaN 3 All 7.0 9.0 1.0 17 ''' # 交叉表 # 每一個人天天上過幾回班 pd.crosstab(dataframe.姓名,dataframe.日期,margins = True).iloc[:,:2] ''' 日期 20190301 20190302 姓名 周七 1 1 張三 1 2 李四 1 2 王五 1 2 趙六 1 1 錢八 2 1 All 7 9 ''' # 每一個人天天去過幾回櫃檯 pd.crosstab(dataframe.姓名,dataframe.櫃檯) ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 0 2 0 0 張三 3 0 1 0 李四 2 0 1 0 王五 0 0 1 2 趙六 0 0 0 2 錢八 0 2 1 0 ''' # 將每個人在每個櫃檯的交易總額顯示出來 pd.crosstab(dataframe.姓名,dataframe.櫃檯,dataframe.交易額,aggfunc='sum') ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 1180.0 NaN NaN 張三 4600.0 NaN 600.0 NaN 李四 3300.0 NaN 680.0 NaN 王五 NaN NaN 830.0 1800.0 趙六 NaN NaN NaN 2150.0 錢八 NaN 1420.0 850.0 NaN ''' # 每一個人在每一個櫃檯交易額的平均值,金額/天數 pd.crosstab(dataframe.姓名,dataframe.櫃檯,dataframe.交易額,aggfunc = 'mean').apply(lambda num:round(num,2) ) ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 590.0 NaN NaN 張三 1533.33 NaN 600.0 NaN 李四 1650.00 NaN 680.0 NaN 王五 NaN NaN 830.0 900.0 趙六 NaN NaN NaN 1075.0 錢八 NaN 710.0 850.0 NaN '''
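The section above gets the same per-person, per-counter totals twice, once with `pivot_table` and once with `crosstab`. A compact sketch on a few made-up rows (not the real workbook) shows how the two relate: `crosstab` counts occurrences by default and only aggregates a value column when both `values` and `aggfunc` are given.

```python
import pandas as pd

# Hypothetical rows in the same layout as the workbook
df = pd.DataFrame({
    '姓名': ['張三', '張三', '李四', '李四', '張三'],
    '櫃檯': ['化妝品', '蔬菜水果', '化妝品', '化妝品', '化妝品'],
    '交易額': [2000, 600, 1800, 1500, 1300],
})

# Sum of 交易額 per person and counter; missing combinations become NaN
totals = df.pivot_table(values='交易額', index='姓名', columns='櫃檯', aggfunc='sum')

# Plain crosstab: how many records each person has at each counter
visits = pd.crosstab(df['姓名'], df['櫃檯'])

# Crosstab with values and aggfunc reproduces the pivot-table totals
totals_ct = pd.crosstab(df['姓名'], df['櫃檯'], values=df['交易額'], aggfunc='sum')
```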

pandas_Grouping and aggregation

# 分組與聚合 import pandas as pd import numpy as np # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['工號','姓名','時段','交易額','櫃檯']) # 對 5 的餘數進行分組 dataframe.groupby(by = lambda num:num % 5)['交易額'].sum() ''' 0 4530 1 5000 2 1980 3 3120 4 2780 Name: 交易額, dtype: int64 ''' # 查看索引爲 7 15 的交易額 dataframe.groupby(by = {7:'索引爲7的行',15:'索引爲15的行'})['交易額'].sum() ''' 索引爲15的行 830 索引爲7的行 600 Name: 交易額, dtype: int64 ''' # 查看不一樣時段的交易總額 dataframe.groupby(by = '時段')['交易額'].sum() ''' 時段 14:00-21:00 8300 9:00-14:00 9110 Name: 交易額, dtype: int64 ''' # 各櫃檯的銷售總額 dataframe.groupby(by = '櫃檯')['交易額'].sum() ''' 櫃檯 化妝品 7900 日用品 2600 蔬菜水果 2960 食品 3950 Name: 交易額, dtype: int64 ''' # 查看每一個人在每一個時段購買的次數 count = dataframe.groupby(by = '姓名')['時段'].count() ''' 姓名 周七 2 張三 4 李四 3 王五 3 趙六 2 錢八 3 Name: 時段, dtype: int64 ''' # count.name = '交易人和次數' ''' ''' # 每一個人的交易額平均值並排序 dataframe.groupby(by = '姓名')['交易額'].mean().round(2).sort_values() ''' 姓名 周七 590.00 錢八 756.67 王五 876.67 趙六 1075.00 張三 1300.00 李四 1326.67 Name: 交易額, dtype: float64 ''' # 每一個人的交易額，apply(int) 轉換爲整數 dataframe.groupby(by = '姓名').sum()['交易額'].apply(int) ''' 姓名 周七 1180 張三 5200 李四 3980 王五 2630 趙六 2150 錢八 2270 Name: 交易額, dtype: int64 ''' # 每個員工交易額的中值 data = dataframe.groupby(by = '姓名').median() ''' 工號 交易額 姓名 周七 1005 590 張三 1001 1300 李四 1002 1500 王五 1003 830 趙六 1004 1075 錢八 1006 720 ''' data['交易額'] ''' 姓名 周七 590 張三 1300 李四 1500 王五 830 趙六 1075 錢八 720 Name: 交易額, dtype: int64 ''' # 查看交易額對應的排名 data['排名'] = data['交易額'].rank(ascending = False) data[['交易額','排名']] ''' 交易額 排名 姓名 周七 590 6.0 張三 1300 2.0 李四 1500 1.0 王五 830 4.0 趙六 1075 3.0 錢八 720 5.0 ''' # 每一個人不一樣時段的交易額 dataframe.groupby(by = ['姓名','時段'])['交易額'].sum() ''' 姓名 時段 周七 9:00-14:00 1180 張三 14:00-21:00 600 9:00-14:00 4600 李四 14:00-21:00 3300 9:00-14:00 680 王五 14:00-21:00 830 9:00-14:00 1800 趙六 14:00-21:00 2150 錢八 14:00-21:00 1420 9:00-14:00 
850 Name: 交易額, dtype: int64 ''' # 設置各時段累計 dataframe.groupby(by = ['姓名'])['時段','交易額'].aggregate({'交易額':np.sum,'時段':lambda x:'各時段累計'}) ''' 交易額 時段 姓名 周七 1180 各時段累計 張三 5200 各時段累計 李四 3980 各時段累計 王五 2630 各時段累計 趙六 2150 各時段累計 錢八 2270 各時段累計 ''' # 對指定列進行聚合,查看最大,最小,和,平均值,中值 dataframe.groupby(by = '姓名').agg(['max','min','sum','mean','median']) ''' 工號 交易額 max min sum mean median max min sum mean median 姓名 周七 1005 1005 2010 1005 1005 600 580 1180 590.000000 590 張三 1001 1001 4004 1001 1001 2000 600 5200 1300.000000 1300 李四 1002 1002 3006 1002 1002 1800 680 3980 1326.666667 1500 王五 1003 1003 3009 1003 1003 1000 800 2630 876.666667 830 趙六 1004 1004 2008 1004 1004 1100 1050 2150 1075.000000 1075 錢八 1006 1006 3018 1006 1006 850 700 2270 756.666667 720 ''' # 查看部分聚合後的結果 dataframe.groupby(by = '姓名').agg(['max','min','sum','mean','median'])['交易額'] ''' max min sum mean median 姓名 周七 600 580 1180 590.000000 590 張三 2000 600 5200 1300.000000 1300 李四 1800 680 3980 1326.666667 1500 王五 1000 800 2630 876.666667 830 趙六 1100 1050 2150 1075.000000 1075 錢八 850 700 2270 756.666667 720 '''
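One call above selects two columns from a groupby with `['時段','交易額']`, which is tuple-style indexing. As far as I know, recent pandas versions reject that form and expect a list (double brackets). A sketch on made-up rows, using string aggregation names instead of `np.sum`:

```python
import pandas as pd

# Hypothetical rows in the same layout as the workbook
df = pd.DataFrame({
    '姓名': ['張三', '張三', '李四', '李四'],
    '時段': ['9:00-14:00', '14:00-21:00', '14:00-21:00', '9:00-14:00'],
    '交易額': [2000, 600, 1800, 680],
})

# Select several columns with a list, then give one aggregation per column
summary = df.groupby('姓名')[['時段', '交易額']].agg(
    {'交易額': 'sum', '時段': lambda x: '各時段累計'})

# Several aggregations of a single column at once
stats = df.groupby('姓名')['交易額'].agg(['max', 'min', 'sum', 'mean'])
```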

pandas_Handling outliers, missing values, duplicates, and differencing

# 處理異常值缺失值重複值數據差分 import pandas as pd import numpy as np import copy # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 異常值 # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 查看交易額低於 2000 的三條數據 # dataframe[dataframe.交易額 < 2000] dataframe[dataframe.交易額 < 2000][:3] ''' 工號 姓名 日期 時段 交易額 櫃檯 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 3 1004 趙六 20190301 14:00-21:00 1100 食品 ''' # 查看上浮了 50% 以後依舊低於 1500 的交易額,查看 4 條數據 dataframe.loc[dataframe.交易額 < 1500,'交易額'] = dataframe[dataframe.交易額 < 1500]['交易額'].map(lambda num:num*1.5) dataframe[dataframe.交易額 < 1500][:4] ''' 工號 姓名 日期 時段 交易額 櫃檯 2 1003 王五 20190301 9:00-14:00 1200.0 食品 4 1005 周七 20190301 9:00-14:00 900.0 日用品 5 1006 錢八 20190301 14:00-21:00 1050.0 日用品 6 1006 錢八 20190301 9:00-14:00 1275.0 蔬菜水果 ''' # 查看交易額大於 2500 的數據 dataframe[dataframe.交易額 > 2500] ''' Empty DataFrame Columns: [工號, 姓名, 日期, 時段, 交易額, 櫃檯] Index: [] ''' # 查看交易額低於 900 或 高於 1800 的數據 dataframe[(dataframe.交易額 < 900)|(dataframe.交易額 > 1800)] ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000.0 化妝品 8 1001 張三 20190302 9:00-14:00 1950.0 化妝品 12 1005 周七 20190302 9:00-14:00 870.0 日用品 16 1001 張三 20190303 9:00-14:00 1950.0 化妝品 ''' # 將全部低於 200 的交易額都替換成 200 dataframe.loc[dataframe.交易額 < 200,'交易額'] = 200 # 查看低於 1500 的交易額個數 dataframe.loc[dataframe.交易額 < 1500,'交易額'].count() # 9 # 將大於 3000 元的都替換爲 3000 元 dataframe.loc[dataframe.交易額 > 3000,'交易額'] = 3000 # 缺失值 # 查看有多少行數據 len(dataframe) # 17 # 丟棄缺失值以後的行數 len(dataframe.dropna()) # 17 # 包含缺失值的行 dataframe[dataframe['交易額'].isnull()] ''' Empty DataFrame Columns: [工號, 姓名, 日期, 時段, 交易額, 櫃檯] Index: [] ''' # 使用固定值替換缺失值 # dff = copy.deepcopy(dataframe) # dff.loc[dff.交易額.isnull(),'交易額'] = 999 # 將缺失值設定爲 999 # dff.iloc[[1,4,17],:] # 使用交易額的均值替換缺失值 # dff = copy.deepcopy(dataframe) # for i in dff[dff.交易額.isnull()].index: # dff.loc[i,'交易額'] = round(dff.loc[dff.姓名 == dff.loc[i,'姓名'],'交易額'].mean()) # 
dff.iloc[[1,4,17],:] # 使用總體均值的 80% 填充缺失值 # dataframe.fillna({'交易額':round(dataframe['交易額'].mean() * 0.8)},inplace = True) # dataframe.iloc[[1,4,16],:] # 重複值 dataframe[dataframe.duplicated()] ''' Empty DataFrame Columns: [工號, 姓名, 日期, 時段, 交易額, 櫃檯] Index: [] ''' # dff = dataframe[['工號','姓名','日期','交易額']] # dff = dff[dff.duplicated()] # for row in dff.values: # df[(df.工號 == row[0]) & (df.日期 == row[2]) &(df.交易額 == row[3])] # 丟棄重複行 dataframe = dataframe.drop_duplicates() # 查看是否有錄入錯誤的工號和姓名 dff = dataframe[['工號','姓名']] dff.drop_duplicates() ''' 工號 姓名 0 1001 張三 1 1002 李四 2 1003 王五 3 1004 趙六 4 1005 周七 5 1006 錢八 ''' # 數據差分 # 查看員工業績波動狀況(每一天和昨天的數據做比較) dff = dataframe.groupby(by = '日期').sum()['交易額'].diff() ''' 日期 20190301 NaN 20190302 1765.0 20190303 -9690.0 Name: 交易額, dtype: float64 ''' dff.map(lambda num:'%.2f'%(num))[:5] ''' 日期 20190301 nan 20190302 1765.00 20190303 -9690.00 Name: 交易額, dtype: object ''' # 數據差分 # 查看張三的波動狀況 dataframe[dataframe.姓名 == '張三'].groupby(by = '日期').sum()['交易額'].diff()[:5] ''' 日期 20190301 NaN 20190302 850.0 20190303 -900.0 Name: 交易額, dtype: float64 '''
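The workbook used above happens to contain no outliers or missing values, so several of those steps print empty frames. The sketch below uses made-up rows that do have both. It swaps in `clip` for the two `.loc` assignments and `groupby(...).transform('mean')` for the explicit loop; both are alternatives of mine, not what the notes use.

```python
import numpy as np
import pandas as pd

# Hypothetical rows with one outlier on each side and one missing value
df = pd.DataFrame({
    '姓名': ['張三', '張三', '李四', '李四'],
    '日期': [20190301, 20190302, 20190301, 20190302],
    '交易額': [5000.0, np.nan, 100.0, 1500.0],
})

# Cap values to the range 200..3000 in one step (NaN stays NaN)
df['交易額'] = df['交易額'].clip(lower=200, upper=3000)

# Fill each missing value with the mean of the same person's other records
group_mean = df.groupby('姓名')['交易額'].transform('mean')
df['交易額'] = df['交易額'].fillna(group_mean)

# Day-over-day change of the daily total; the first day has nothing to compare with
daily_change = df.groupby('日期')['交易額'].sum().diff()
```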

pandas_Splitting and merging data

import pandas as pd import numpy as np # 讀取所有數據，使用默認索引 data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 修改異常值 data.loc[data.交易額 > 3000,'交易額'] = 3000 data.loc[data.交易額 < 200,'交易額'] = 200 # 刪除重複值 data.drop_duplicates(inplace = True) # inplace 表示對源數據也進行修改 # 填充缺失值 data['交易額'].fillna(data['交易額'].mean(),inplace = True) # 使用交叉表獲得每人在各櫃檯交易額的平均值 data_group = pd.crosstab(data.姓名,data.櫃檯,data.交易額,aggfunc = 'mean').apply(round) # 繪製柱狀圖 data_group.plot(kind = 'bar') # <matplotlib.axes._subplots.AxesSubplot object at 0x000001D681607888> # 數據的合併 data1 = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') data2 = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx',sheet_name = 'Sheet2') df1 = data1[:3] ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000 化妝品 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 ''' df2 = data2[:4] ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 1 1001 張三 20190302 14:00-21:00 600 蔬菜水果 2 1001 張三 20190302 9:00-14:00 1300 化妝品 3 1002 李四 20190302 14:00-21:00 1500 化妝品 ''' # 使用 concat 鏈接兩個相同結構的 DataFrame 對象 df3 = pd.concat([df1,df2]) ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000 化妝品 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 0 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 1 1001 張三 20190302 14:00-21:00 600 蔬菜水果 2 1001 張三 20190302 9:00-14:00 1300 化妝品 3 1002 李四 20190302 14:00-21:00 1500 化妝品 ''' # 合併，忽略原來的索引 ignore_index df4 = df3.append([df1,df2],ignore_index = True) ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 20190301 9:00-14:00 2000 化妝品 1 1002 李四 20190301 14:00-21:00 1800 化妝品 2 1003 王五 20190301 9:00-14:00 800 食品 3 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 4 1001 張三 20190302 14:00-21:00 600 蔬菜水果 5 1001 張三 20190302 9:00-14:00 1300 化妝品 6 1002 李四 20190302 14:00-21:00 1500 化妝品 7 1001 張三 20190301 9:00-14:00 2000 化妝品 8 1002 李四 20190301 14:00-21:00 1800 化妝品 9 1003 王五 20190301 9:00-14:00 800 食品 10 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 11 1001 張三 20190302 
14:00-21:00 600 蔬菜水果 12 1001 張三 20190302 9:00-14:00 1300 化妝品 13 1002 李四 20190302 14:00-21:00 1500 化妝品 ''' # 按照列進行拆分 df5 = df4.loc[:,['姓名','櫃檯','交易額']] # 查看前五條數據 df5[:5] ''' 姓名 櫃檯 交易額 0 張三 化妝品 2000 1 李四 化妝品 1800 2 王五 食品 800 3 錢八 蔬菜水果 850 4 張三 蔬菜水果 600 ''' # 合併 merge 、 join # 按照工號進行合併，隨機查看 3 條數據 rows = np.random.randint(0,len(df5),3) pd.merge(df4,df5).iloc[rows,:] ''' 工號 姓名 日期 時段 交易額 櫃檯 7 1002 李四 20190301 14:00-21:00 1800 化妝品 4 1002 李四 20190301 14:00-21:00 1800 化妝品 10 1003 王五 20190301 9:00-14:00 800 食品 ''' # 按照工號進行合併，指定其餘同名列的後綴 pd.merge(df1,df2,on = '工號',suffixes = ['_x','_y']).iloc[:,:] ''' 工號 姓名_x 日期_x 時段_x ... 日期_y 時段_y 交易額_y 櫃檯_y 0 1001 張三 20190301 9:00-14:00 ... 20190302 14:00-21:00 600 蔬菜水果 1 1001 張三 20190301 9:00-14:00 ... 20190302 9:00-14:00 1300 化妝品 2 1002 李四 20190301 14:00-21:00 ... 20190302 14:00-21:00 1500 化妝品 ''' # 兩個表都設置工號爲索引 set_index df2.set_index('工號').join(df3.set_index('工號'),lsuffix = '_x',rsuffix = '_y').iloc[:] ''' 姓名_x 日期_x 時段_x 交易額_x ... 日期_y 時段_y 交易額_y 櫃檯_y 工號 ... 1001 張三 20190302 14:00-21:00 600 ... 20190301 9:00-14:00 2000 化妝品 1001 張三 20190302 14:00-21:00 600 ... 20190302 14:00-21:00 600 蔬菜水果 1001 張三 20190302 14:00-21:00 600 ... 20190302 9:00-14:00 1300 化妝品 1001 張三 20190302 9:00-14:00 1300 ... 20190301 9:00-14:00 2000 化妝品 1001 張三 20190302 9:00-14:00 1300 ... 20190302 14:00-21:00 600 蔬菜水果 1001 張三 20190302 9:00-14:00 1300 ... 20190302 9:00-14:00 1300 化妝品 1002 李四 20190302 14:00-21:00 1500 ... 20190301 14:00-21:00 1800 化妝品 1002 李四 20190302 14:00-21:00 1500 ... 20190302 14:00-21:00 1500 化妝品 1006 錢八 20190301 9:00-14:00 850 ... 20190301 9:00-14:00 850 蔬菜水果 '''
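`DataFrame.append`, used above to build `df4`, was removed in pandas 2.0; `pd.concat` with `ignore_index = True` does the same job. The sketch below uses two tiny made-up frames to show that, plus the difference between `merge` (inner join by default) and `join` (left join on the index by default).

```python
import pandas as pd

# Hypothetical frames sharing the 工號 key
df1 = pd.DataFrame({'工號': [1001, 1002], '交易額': [2000, 1800]})
df2 = pd.DataFrame({'工號': [1001, 1006], '交易額': [600, 850]})

# Stack the rows and renumber the index from 0
stacked = pd.concat([df1, df2], ignore_index=True)

# Inner join on 工號; the other shared column gets the two suffixes
merged = pd.merge(df1, df2, on='工號', suffixes=('_x', '_y'))

# Left join on the index; keys missing on the right give NaN
joined = df1.set_index('工號').join(df2.set_index('工號'), lsuffix='_x', rsuffix='_y')
```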

pandas_Sorting data

import pandas as pd # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['工號','姓名','時段','交易額','櫃檯']) dataframe[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 2 1003 王五 9:00-14:00 800 食品 3 1004 趙六 14:00-21:00 1100 食品 4 1005 周七 9:00-14:00 600 日用品 ''' # 按照交易額和工號降序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'],ascending = False)[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 9 1002 李四 14:00-21:00 1500 化妝品 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 ''' # 按照交易額和工號升序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'])[:5] ''' 工號 姓名 時段 交易額 櫃檯 12 1005 周七 9:00-14:00 580 日用品 7 1001 張三 14:00-21:00 600 蔬菜水果 4 1005 周七 9:00-14:00 600 日用品 14 1002 李四 9:00-14:00 680 蔬菜水果 5 1006 錢八 14:00-21:00 700 日用品 ''' # 按照交易額降序和工號升序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'],ascending = [False,True])[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 9 1002 李四 14:00-21:00 1500 化妝品 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 ''' # 按工號升序排序 dataframe.sort_values(by = ['工號'])[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 7 1001 張三 14:00-21:00 600 蔬菜水果 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 ''' dataframe.sort_values(by = ['工號'],na_position = 'last')[:5] ''' 工號 姓名 時段 交易額 櫃檯 0 1001 張三 9:00-14:00 2000 化妝品 7 1001 張三 14:00-21:00 600 蔬菜水果 8 1001 張三 9:00-14:00 1300 化妝品 16 1001 張三 9:00-14:00 1300 化妝品 1 1002 李四 14:00-21:00 1800 化妝品 ''' # 按列名升序排序 dataframe.sort_index(axis = 1)[:5] ''' 交易額 姓名 工號 時段 櫃檯 0 2000 張三 1001 9:00-14:00 化妝品 1 1800 李四 1002 14:00-21:00 化妝品 2 800 王五 1003 9:00-14:00 食品 3 1100 趙六 1004 14:00-21:00 食品 4 600 周七 1005 9:00-14:00 日用品 ''' dataframe.sort_index(axis = 1,ascending = True)[:5] ''' 交易額 姓名 工號 時段 櫃檯 0 2000 張三 
1001 9:00-14:00 化妝品 1 1800 李四 1002 14:00-21:00 化妝品 2 800 王五 1003 9:00-14:00 食品 3 1100 趙六 1004 14:00-21:00 食品 4 600 周七 1005 9:00-14:00 日用品 '''
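In the real data the rows that tie on 交易額 also share the same 工號, so the printed results for `ascending = False` and `ascending = [False,True]` look identical. A sketch on made-up rows where the tie involves two different 工號 shows the difference.

```python
import pandas as pd

# Hypothetical rows: two records tie on 交易額 but have different 工號
df = pd.DataFrame({'工號': [1002, 1001, 1001, 1005],
                   '交易額': [1300, 1300, 2000, 600]})

# 交易額 descending, ties broken by 工號 ascending
mixed = df.sort_values(by=['交易額', '工號'], ascending=[False, True])

# Both keys descending
both_desc = df.sort_values(by=['交易額', '工號'], ascending=False)
```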

pandas_Time series and common operations

# 時間序列和經常使用操做 import pandas as pd # 每隔五天--5D pd.date_range(start = '20200101',end = '20200131',freq = '5D') ''' DatetimeIndex(['2020-01-01', '2020-01-06', '2020-01-11', '2020-01-16', '2020-01-21', '2020-01-26', '2020-01-31'], dtype='datetime64[ns]', freq='5D') ''' # 每隔一週--W pd.date_range(start = '20200301',end = '20200331',freq = 'W') ''' DatetimeIndex(['2020-03-01', '2020-03-08', '2020-03-15', '2020-03-22', '2020-03-29'], dtype='datetime64[ns]', freq='W-SUN') ''' # 間隔兩天,五個數據 pd.date_range(start = '20200301',periods = 5,freq = '2D') # periods 幾個數據 ,freq 間隔時期，兩天 ''' DatetimeIndex(['2020-03-01', '2020-03-03', '2020-03-05', '2020-03-07', '2020-03-09'], dtype='datetime64[ns]', freq='2D') ''' # 間隔三小時，八個數據 pd.date_range(start = '20200301',periods = 8,freq = '3H') ''' DatetimeIndex(['2020-03-01 00:00:00', '2020-03-01 03:00:00', '2020-03-01 06:00:00', '2020-03-01 09:00:00', '2020-03-01 12:00:00', '2020-03-01 15:00:00', '2020-03-01 18:00:00', '2020-03-01 21:00:00'], dtype='datetime64[ns]', freq='3H') ''' # 三點開始，十二個數據，間隔一分鐘 pd.date_range(start = '202003010300',periods = 12,freq = 'T') ''' DatetimeIndex(['2020-03-01 03:00:00', '2020-03-01 03:01:00', '2020-03-01 03:02:00', '2020-03-01 03:03:00', '2020-03-01 03:04:00', '2020-03-01 03:05:00', '2020-03-01 03:06:00', '2020-03-01 03:07:00', '2020-03-01 03:08:00', '2020-03-01 03:09:00', '2020-03-01 03:10:00', '2020-03-01 03:11:00'], dtype='datetime64[ns]', freq='T') ''' # 每月的最後一天 pd.date_range(start = '20190101',end = '20191231',freq = 'M') ''' DatetimeIndex(['2019-01-31', '2019-02-28', '2019-03-31', '2019-04-30', '2019-05-31', '2019-06-30', '2019-07-31', '2019-08-31', '2019-09-30', '2019-10-31', '2019-11-30', '2019-12-31'], dtype='datetime64[ns]', freq='M') ''' # 間隔一年，六個數據，年底最後一天 pd.date_range(start = '20190101',periods = 6,freq = 'A') ''' DatetimeIndex(['2019-12-31', '2020-12-31', '2021-12-31', '2022-12-31', '2023-12-31', '2024-12-31'], dtype='datetime64[ns]', freq='A-DEC') ''' # 間隔一年，六個數據，年初最後一天 pd.date_range(start = 
'20200101',periods = 6,freq = 'AS') ''' DatetimeIndex(['2020-01-01', '2021-01-01', '2022-01-01', '2023-01-01', '2024-01-01', '2025-01-01'], dtype='datetime64[ns]', freq='AS-JAN') ''' # 使用 Series 對象包含時間序列對象,使用特定索引 data = pd.Series(index = pd.date_range(start = '20200321',periods = 24,freq = 'H'),data = range(24)) ''' 2020-03-21 00:00:00 0 2020-03-21 01:00:00 1 2020-03-21 02:00:00 2 2020-03-21 03:00:00 3 2020-03-21 04:00:00 4 2020-03-21 05:00:00 5 2020-03-21 06:00:00 6 2020-03-21 07:00:00 7 2020-03-21 08:00:00 8 2020-03-21 09:00:00 9 2020-03-21 10:00:00 10 2020-03-21 11:00:00 11 2020-03-21 12:00:00 12 2020-03-21 13:00:00 13 2020-03-21 14:00:00 14 2020-03-21 15:00:00 15 2020-03-21 16:00:00 16 2020-03-21 17:00:00 17 2020-03-21 18:00:00 18 2020-03-21 19:00:00 19 2020-03-21 20:00:00 20 2020-03-21 21:00:00 21 2020-03-21 22:00:00 22 2020-03-21 23:00:00 23 Freq: H, dtype: int64 ''' # 查看前五個數據 data[:5] ''' 2020-03-21 00:00:00 0 2020-03-21 01:00:00 1 2020-03-21 02:00:00 2 2020-03-21 03:00:00 3 2020-03-21 04:00:00 4 Freq: H, dtype: int64 ''' # 三分鐘重採樣，計算均值 data.resample('3H').mean() ''' 2020-03-21 00:00:00 1 2020-03-21 03:00:00 4 2020-03-21 06:00:00 7 2020-03-21 09:00:00 10 2020-03-21 12:00:00 13 2020-03-21 15:00:00 16 2020-03-21 18:00:00 19 2020-03-21 21:00:00 22 Freq: 3H, dtype: int64 ''' # 五分鐘重採樣，求和 data.resample('5H').sum() ''' 2020-03-21 00:00:00 10 2020-03-21 05:00:00 35 2020-03-21 10:00:00 60 2020-03-21 15:00:00 85 2020-03-21 20:00:00 86 Freq: 5H, dtype: int64 ''' # 計算OHLC open,high,low,close data.resample('5H').ohlc() ''' open high low close 2020-03-21 00:00:00 0 4 0 4 2020-03-21 05:00:00 5 9 5 9 2020-03-21 10:00:00 10 14 10 14 2020-03-21 15:00:00 15 19 15 19 2020-03-21 20:00:00 20 23 20 23 ''' # 將日期替換爲次日 data.index = data.index + pd.Timedelta('1D') # 查看前五條數據 data[:5] ''' 2020-03-22 00:00:00 0 2020-03-22 01:00:00 1 2020-03-22 02:00:00 2 2020-03-22 03:00:00 3 2020-03-22 04:00:00 4 Freq: H, dtype: int64 ''' # 查看指定日期是星期幾 # pd.Timestamp('20200321').weekday_name # 'Saturday' 
# 查看指定日期的年份是不是閏年 pd.Timestamp('20200301').is_leap_year # True # 查看指定日期所在的季度和月份 day = pd.Timestamp('20200321') # Timestamp('2020-03-21 00:00:00') # 查看日期的季度 day.quarter # 1 # 查看日期所在的月份 day.month # 3 # 轉換爲 python 的日期時間對象 day.to_pydatetime() # datetime.datetime(2020, 3, 21, 0, 0)
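Note that the resampling calls above use `'3H'` and `'5H'`, so the bins are three and five hours wide, not minutes. The sketch below repeats them with `pd.Timedelta` objects instead of alias strings. That is a choice of mine, made because the spelling of the hourly alias differs between pandas versions. It also uses `day_name()`, which replaces the removed `weekday_name`.

```python
import pandas as pd

hour = pd.Timedelta(hours=1)

# 24 hourly values for 2020-03-21
index = pd.date_range(start='2020-03-21', periods=24, freq=hour)
data = pd.Series(range(24), index=index)

# Three-hour bins, mean of each bin
three_hour_mean = data.resample(3 * hour).mean()

# Five-hour bins, sum of each bin (the last bin only holds four values)
five_hour_sum = data.resample(5 * hour).sum()

# Move every timestamp to the next day
shifted = data.copy()
shifted.index = shifted.index + pd.Timedelta(days=1)

# Day of the week of a single date
weekday = pd.Timestamp('2020-03-21').day_name()
```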

pandas_Points I keep forgetting while studying

Applying anonymous functions to a Series
Use pipe to pass the whole Series to an anonymous function; extra arguments follow the function
pd.Series(range(5)).pipe(lambda x,y,z :(x**y)%z,2,5)
pd.Series(range(5)).pipe(lambda x:x+3).pipe(lambda x:x*3)
Use apply to call an anonymous function on each element
pd.Series(range(5)).apply(lambda x:x+3)
# Standard error of the mean (std / sqrt(n)), computed with sem
pd.Series(range(0,5)).sem()
# Group by the last character of the date and show the mean; -1 means the last character
# data.groupby(data.日期.str.__getitem__(-1)).mean().apply(round)
# Rows whose date ends in 1
# data[data.日期.str.endswith('1')][:12]
# Rows whose date ends in 12; slice(-2) takes the last two characters
# data[data.日期.str.slice(-2) == '12']
# Rows whose month or day contains a 2
# data[data.日期.str.slice(-5).str.contains('2')][1:9]
# Group by name and date, then sum
dff = dataframe.groupby(by = ['姓名','日期'],as_index = False).sum()
# Reshape dff into a pivot table with pivot
dff = dff.pivot(index = '姓名',columns = '日期',values = '交易額')
index sets the row labels, columns sets the column labels, values picks the cell values
# Data for the first day
dff.iloc[:,:1]
# Daily totals and each person's amount for the first two days
dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'sum',margins = True).iloc[:,:2]
# Number of transactions per person per day
dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'count',margins = True)
# Number of records per person at each counter, using crosstab
pd.crosstab(dataframe.姓名,dataframe.櫃檯)
# Mean transaction amount per person at each counter (total amount / number of records)
pd.crosstab(dataframe.姓名,dataframe.櫃檯,dataframe.交易額,aggfunc = 'mean').apply(lambda num:round(num,2) )
# Group by the row index modulo 5; by can be a function, a dict, or a column name
dataframe.groupby(by = lambda num:num % 5)['交易額'].sum()
dataframe.groupby(by = {7:'索引爲7的行',15:'索引爲15的行'})['交易額'].sum()
dataframe.groupby(by = '時段')['交易額'].sum()
# sort_values() sorts
# Rank of each transaction amount
data['排名'] = data['交易額'].rank(ascending = False)
# Each person's amount in each time period
dataframe.groupby(by = ['姓名','時段'])['交易額'].sum()
# Raise amounts below 1500 by 50%, then look at those still below 1500
# Series.map applies the function to each element of the column
dataframe.loc[dataframe.交易額 < 1500,'交易額'] = dataframe[dataframe.交易額 < 1500]['交易額'].map(lambda num:num*1.5)
# Number of rows left after dropping missing values
len(dataframe.dropna())
# Rows with a missing transaction amount
dataframe[dataframe['交易額'].isnull()]
# Fill missing values with 80% of the overall mean
# dataframe.fillna({'交易額':round(dataframe['交易額'].mean() * 0.8)},inplace = True)
# dataframe.iloc[[1,4,16],:]
# Duplicate rows
dataframe[dataframe.duplicated()]
# Drop duplicate rows
dataframe = dataframe.drop_duplicates()
# Check for wrongly entered employee IDs and names
dff = dataframe[['工號','姓名']]
dff.drop_duplicates()
# Difference the data with diff
# Daily fluctuation of the total (each day compared with the previous day)
dff = dataframe.groupby(by = '日期').sum()['交易額'].diff()
# Mean amount per person at each counter, using a crosstab
data_group = pd.crosstab(data.姓名,data.櫃檯,data.交易額,aggfunc = 'mean').apply(round)
# Concatenate two DataFrame objects with the same structure using concat
df3 = pd.concat([df1,df2])
# Merging: merge and join
# Merge df4 and df5; with no on argument, merge joins on all shared columns. Show 3 random rows
rows = np.random.randint(0,len(df5),3)
pd.merge(df4,df5).iloc[rows,:]
# Merge on 工號 and give the other shared columns suffixes
# on names the key column; suffixes tells the two sides apart
pd.merge(df1,df2,on = '工號',suffixes = ['_x','_y']).iloc[:,:]
# Set 工號 as the index of both tables with set_index, then join on the index
df2.set_index('工號').join(df3.set_index('工號'),lsuffix = '_x',rsuffix = '_y').iloc[:]
# When reading the file, use usecols to load only the listed columns
# Read employee ID, name, time period, amount and counter with the default index
dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['工號','姓名','時段','交易額','櫃檯'])
# Sort by amount descending and employee ID ascending, show five rows
dataframe.sort_values(by = ['交易額','工號'],ascending = [False,True])[:5]
# Sort by employee ID ascending
dataframe.sort_values(by = ['工號'])[:5]
# Resample into three-hour bins ('3H') and take the mean
data.resample('3H').mean()
# Compute OHLC: open, high, low, close
data.resample('5H').ohlc()
# Shift the dates to the next day
data.index = data.index + pd.Timedelta('1D')
# Check whether the year of a given date is a leap year
pd.Timestamp('20200301').is_leap_year
# Summary statistics of the transaction amounts
dataframe['交易額'].describe()
# Row label of the first minimum amount
index = dataframe['交易額'].idxmin()
# Row label of the maximum amount
index = dataframe['交易額'].idxmax()
dataframe.loc[index,'交易額'] # 2000
# Skip file rows 1, 2 and 4 (row 0 is the header) and use the second column (姓名) as the index
dataframe2 = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', skiprows = [1,2,4], index_col = 1)
skiprows lists the rows to skip; index_col picks the column used as the index
# Rows 1, 3 and 4 by position
dataframe.iloc[[0,2,3],:]
# The name in the fourth row
dataframe.at[3,'姓名']

pandas_Inspecting data characteristics and summary statistics

# Inspecting data characteristics and summary statistics
import pandas as pd
# Read the file
dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx')
# Summary statistics of the transaction amounts
dataframe['交易額'].describe()
'''
count      17.000000
mean     1024.117647
std       428.019550
min       580.000000
25%       700.000000
50%       850.000000
75%      1300.000000
max      2000.000000
Name: 交易額, dtype: float64
'''
# Quantiles: minimum, quartiles, maximum
dataframe['交易額'].quantile([0,0.25,0.5,0.75,1.0])
'''
0.00     580.0
0.25     700.0
0.50     850.0
0.75    1300.0
1.00    2000.0
Name: 交易額, dtype: float64
'''
# Median transaction amount
dataframe['交易額'].median()
# 850.0
# The three smallest transaction amounts
dataframe['交易額'].nsmallest(3)
'''
12    580
4     600
7     600
Name: 交易額, dtype: int64
'''
dataframe.nsmallest(3,'交易額')
'''
    工號  姓名      日期         時段  交易額      櫃檯
12  1005  周七  20190302   9:00-14:00     580    日用品
4   1005  周七  20190301   9:00-14:00     600    日用品
7   1001  張三  20190302  14:00-21:00     600  蔬菜水果
'''
# The two largest transaction amounts
dataframe['交易額'].nlargest(2)
'''
0    2000
1    1800
Name: 交易額, dtype: int64
'''
# Full rows for the two largest transaction amounts
dataframe.nlargest(2,'交易額')
'''
   工號  姓名      日期         時段  交易額    櫃檯
0  1001  張三  20190301   9:00-14:00    2000  化妝品
1  1002  李四  20190301  14:00-21:00    1800  化妝品
'''
# Latest date
dataframe['日期'].max()
# 20190303
# Smallest employee ID
dataframe['工號'].min()
# 1001
# Row label of the first minimum transaction amount
index = dataframe['交易額'].idxmin()
# 12
# The minimum transaction amount itself
dataframe.loc[index,'交易額']
# 580
# Row label of the maximum transaction amount
index = dataframe['交易額'].idxmax()
# 0
dataframe.loc[index,'交易額']
# 2000
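`idxmin()` and `idxmax()` return index labels, not positions, which matters as soon as the index is not a plain 0..n-1 range. A sketch with a handful of amounts and deliberately shuffled labels (the subset and labels are chosen for illustration):

```python
import pandas as pd

# A few amounts with non-consecutive index labels
amounts = pd.Series([2000, 1800, 800, 580, 600, 600], index=[0, 1, 2, 12, 4, 7])

low_label = amounts.idxmin()    # label of the smallest value
high_label = amounts.idxmax()   # label of the largest value

# nsmallest keeps the original labels and the original order among ties
smallest = amounts.nsmallest(3)

# Minimum, median and maximum through quantile
quartiles = amounts.quantile([0, 0.5, 1.0])
```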

pandas_Reading Excel and filtering specific data

# C:\Users\lenovo\Desktop\總結\Python # 讀取 Excel 文件並進行篩選 import pandas as pd # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) # 讀取工號姓名時段交易額，使用默認索引 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['工號','姓名','時段','交易額']) # 打印前十行數據 dataframe[:10] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 1 1002 李四 14:00-21:00 1800 2 1003 王五 9:00-14:00 800 3 1004 趙六 14:00-21:00 1100 4 1005 周七 9:00-14:00 600 5 1006 錢八 14:00-21:00 700 6 1006 錢八 9:00-14:00 850 7 1001 張三 14:00-21:00 600 8 1001 張三 9:00-14:00 1300 9 1002 李四 14:00-21:00 1500 ''' # 跳過 1 2 4 行，以第一列姓名爲索引 dataframe2 = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', skiprows = [1,2,4], index_col = 1) '''注：張三李四趙六的第一條數據跳過 工號 日期 時段 交易額 櫃檯 姓名 王五 1003 20190301 9:00-14:00 800 食品 周七 1005 20190301 9:00-14:00 600 日用品 錢八 1006 20190301 14:00-21:00 700 日用品 錢八 1006 20190301 9:00-14:00 850 蔬菜水果 張三 1001 20190302 14:00-21:00 600 蔬菜水果 ''' # 篩選符合特定條件的數據 # 讀取超市營業額數據 dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 查看 5 到 10 的數據 dataframe[5:11] ''' 工號 姓名 日期 時段 交易額 櫃檯 5 1006 錢八 20190301 14:00-21:00 700 日用品 6 1006 錢八 20190301 9:00-14:00 850 蔬菜水果 7 1001 張三 20190302 14:00-21:00 600 蔬菜水果 8 1001 張三 20190302 9:00-14:00 1300 化妝品 9 1002 李四 20190302 14:00-21:00 1500 化妝品 10 1003 王五 20190302 9:00-14:00 1000 食品 ''' # 查看第六行的數據 dataframe.iloc[5] ''' 工號 1006 姓名 錢八 時段 14:00-21:00 交易額 700 Name: 5, dtype: object ''' dataframe[:5] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 1 1002 李四 14:00-21:00 1800 2 1003 王五 9:00-14:00 800 3 1004 趙六 14:00-21:00 1100 4 1005 周七 9:00-14:00 600 ''' # 查看第 1 3 4 行的數據 dataframe.iloc[[0,2,3],:] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 2 1003 王五 9:00-14:00 800 3 1004 趙六 14:00-21:00 1100 ''' # 查看第 1 3 4 行的第 1 2 列 dataframe.iloc[[0,2,3],[0,1]] ''' 工號 姓名 0 1001 張三 2 1003 王五 3 1004 趙六 ''' # 查看前五行指定，姓名、時段和交易額的數據 dataframe[['姓名','時段','交易額']][:5] ''' 姓名 時段 交易額 0 張三 9:00-14:00 2000 1 李四 14:00-21:00 1800 
2 王五 9:00-14:00 800 3 趙六 14:00-21:00 1100 4 周七 9:00-14:00 600 ''' dataframe[:5][['姓名','時段','交易額']] ''' 姓名 時段 交易額 0 張三 9:00-14:00 2000 1 李四 14:00-21:00 1800 2 王五 9:00-14:00 800 3 趙六 14:00-21:00 1100 4 周七 9:00-14:00 600 ''' # 查看第 2 4 5 行 姓名，交易額 數據 loc 函數 dataframe.loc[[1,3,4],['姓名','交易額']] ''' 姓名 交易額 1 李四 1800 3 趙六 1100 4 周七 600 ''' # 查看第四行的姓名數據 dataframe.at[3,'姓名'] # '趙六' # 查看交易額大於 1700 的數據 dataframe[dataframe['交易額'] > 1700] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 1 1002 李四 14:00-21:00 1800 ''' # 查看交易額總和 dataframe.sum() ''' 工號 17055 姓名 張三李四王五趙六週七錢八錢八張三張三李四王五趙六週七錢八李四王五張三... 時段 9:00-14:0014:00-21:009:00-14:0014:00-21:009:00... 交易額 17410 dtype: object ''' # 某一時段的交易總和 dataframe[dataframe['時段'] == '14:00-21:00']['交易額'].sum() # 8300 # 查看張三在下午14:00以後的交易狀況 dataframe[(dataframe.姓名 == '張三') & (dataframe.時段 == '14:00-21:00')][:10] ''' 工號 姓名 時段 交易額 7 1001 張三 14:00-21:00 600 ''' # 查看日用品的銷售總額 # dataframe[dataframe['櫃檯'] == '日用品']['交易額'].sum() # 查看張三總共的交易額 dataframe[dataframe['姓名'].isin(['張三'])]['交易額'].sum() # 5200 # 查看交易額在 1500~3000 之間的記錄 dataframe[dataframe['交易額'].between(1500,3000)] ''' 工號 姓名 時段 交易額 0 1001 張三 9:00-14:00 2000 1 1002 李四 14:00-21:00 1800 9 1002 李四 14:00-21:00 1500 '''
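The selections above mix three styles: position-based `iloc`, label-based `loc`, and boolean masks. A compact sketch on made-up rows puts them side by side. Note that each condition in a combined mask needs its own parentheses, because `&` binds more tightly than `==`.

```python
import pandas as pd

# Hypothetical rows in the same layout as the workbook
df = pd.DataFrame({
    '姓名': ['張三', '李四', '王五', '張三'],
    '時段': ['9:00-14:00', '14:00-21:00', '9:00-14:00', '14:00-21:00'],
    '交易額': [2000, 1800, 800, 600],
})

# By position: rows 0 and 2, columns 0 and 2
by_position = df.iloc[[0, 2], [0, 2]]

# By label: rows labelled 1 and 3, named columns
by_label = df.loc[[1, 3], ['姓名', '交易額']]

# Boolean mask combining two conditions
afternoon_zhang = df[(df['姓名'] == '張三') & (df['時段'] == '14:00-21:00')]

# between() includes both endpoints by default
mid_range = df[df['交易額'].between(1500, 3000)]

# Mask and column selection in one loc call, then sum
zhang_total = df.loc[df['姓名'].isin(['張三']), '交易額'].sum()
```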

pandas_Resampling, MultiIndex, standard deviation, and covariance

# 重採樣 多索引 標準差 協方差 import pandas as pd import numpy as np import copy # 設置列對齊 pd.set_option("display.unicode.ambiguous_as_wide",True) pd.set_option("display.unicode.east_asian_width",True) data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx') # 將日期設置爲 python 中的日期類型 data.日期 = pd.to_datetime(data.日期) ''' 工號 姓名 日期 時段 交易額 櫃檯 0 1001 張三 1970-01-01 00:00:00.020190301 9:00-14:00 2000 化妝品 1 1002 李四 1970-01-01 00:00:00.020190301 14:00-21:00 1800 化妝品 2 1003 王五 1970-01-01 00:00:00.020190301 9:00-14:00 800 食品 ''' # 每七天營業的總額 data.resample('7D',on = '日期').sum()['交易額'] ''' 日期 1970-01-01 17410 Freq: 7D, Name: 交易額, dtype: int64 ''' # 每七天營業總額 data.resample('7D',on = '日期',label = 'right').sum()['交易額'] ''' 日期 1970-01-08 17410 Freq: 7D, Name: 交易額, dtype: int64 ''' # 每七天營業額的平均值 func = lambda item:round(np.sum(item)/len(item),2) data.resample('7D',on = '日期',label = 'right').apply(func)['交易額'] ''' 日期 1970-01-08 1024.12 Freq: 7D, Name: 交易額, dtype: float64 ''' # 每七天營業額的平均值 func = lambda num:round(num,2) data.resample('7D',on = '日期',label = 'right').mean().apply(func)['交易額'] # 1024.12 # 刪除工號這一列 data.drop('工號',axis = 1,inplace = True) data[:2] ''' 姓名 日期 時段 交易額 櫃檯 0 張三 1970-01-01 00:00:00.020190301 9:00-14:00 2000 化妝品 1 李四 1970-01-01 00:00:00.020190301 14:00-21:00 1800 化妝品 ''' # 按照姓名和櫃檯進行分組彙總 data = data.groupby(by = ['姓名','櫃檯']).sum()[:3] ''' 交易額 姓名 櫃檯 周七 日用品 1180 張三 化妝品 4600 蔬菜水果 600 ''' # 查看張三的彙總數據 data.loc['張三',:] ''' 交易額 櫃檯 化妝品 4600 蔬菜水果 600 ''' # 查看張三在蔬菜水果的交易數據 data.loc['張三','蔬菜水果'] ''' 交易額 600 Name: (張三, 蔬菜水果), dtype: int64 ''' # 多索引 # 從新讀取，使用第二列和第六列做爲索引，排在前面 data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx',index_col = [1,5]) data[:5] ''' 工號 日期 時段 交易額 姓名 櫃檯 張三 化妝品 1001 20190301 9:00-14:00 2000 李四 化妝品 1002 20190301 14:00-21:00 1800 王五 食品 1003 20190301 9:00-14:00 800 趙六 食品 1004 20190301 14:00-21:00 1100 周七 日用品 1005 20190301 9:00-14:00 600 ''' # 丟棄工號列 data.drop('工號',axis = 1,inplace = True) data[:5] ''' 日期 時段 交易額 姓名 櫃檯 張三 化妝品 20190301 9:00-14:00 2000 李四 
化妝品 20190301 14:00-21:00 1800 王五 食品 20190301 9:00-14:00 800 趙六 食品 20190301 14:00-21:00 1100 周七 日用品 20190301 9:00-14:00 600 ''' # 按照櫃檯進行排序 dff = data.sort_index(level = '櫃檯',axis = 0) dff[:5] ''' 工號 日期 時段 交易額 姓名 櫃檯 張三 化妝品 1001 20190301 9:00-14:00 2000 化妝品 1001 20190302 9:00-14:00 1300 化妝品 1001 20190303 9:00-14:00 1300 李四 化妝品 1002 20190301 14:00-21:00 1800 化妝品 1002 20190302 14:00-21:00 1500 ''' # 按照姓名進行排序 dff = data.sort_index(level = '姓名',axis = 0) dff[:5] ''' 工號 日期 時段 交易額 姓名 櫃檯 周七 日用品 1005 20190301 9:00-14:00 600 日用品 1005 20190302 9:00-14:00 580 張三 化妝品 1001 20190301 9:00-14:00 2000 化妝品 1001 20190302 9:00-14:00 1300 化妝品 1001 20190303 9:00-14:00 1300 ''' # 按照櫃檯進行分組求和 dff = data.groupby(level = '櫃檯').sum()['交易額'] ''' 櫃檯 化妝品 7900 日用品 2600 蔬菜水果 2960 食品 3950 Name: 交易額, dtype: int64 ''' #標準差 data = pd.DataFrame({'A':[3,3,3,3,3],'B':[1,2,3,4,5], 'C':[-5,-4,1,4,5],'D':[-45,15,63,40,50] }) ''' A B C D 0 3 1 -5 -45 1 3 2 -4 15 2 3 3 1 63 3 3 4 4 40 4 3 5 5 50 ''' # 平均值 data.mean() ''' A 3.0 B 3.0 C 0.2 D 24.6 dtype: float64 ''' # 標準差 data.std() ''' A 0.000000 B 1.581139 C 4.549725 D 42.700117 dtype: float64 ''' # 標準差的平方 data.std()**2 ''' A 0.0 B 2.5 C 20.7 D 1823.3 dtype: float64 ''' # 協方差 data.cov() ''' A B C D A 0.0 0.00 0.00 0.00 B 0.0 2.50 7.00 53.75 C 0.0 7.00 20.70 153.35 D 0.0 53.75 153.35 1823.30 ''' # 指定索引爲 姓名，日期，時段，櫃檯，交易額 data = pd.read_excel(r'C:\Users\lenovo\Desktop\總結\Python\超市營業額.xlsx', usecols = ['姓名','日期','時段','櫃檯','交易額']) # 刪除缺失值和重複值,inplace = True 直接丟棄 data.dropna(inplace = True) data.drop_duplicates(inplace = True) # 處理異常值 data.loc[data.交易額 < 200,'交易額'] = 200 data.loc[data.交易額 > 3000,'交易額'] = 3000 # 使用交叉表獲得不一樣員工在不一樣櫃檯的交易額平均值 dff = pd.crosstab(data.姓名,data.櫃檯,data.交易額,aggfunc = 'mean') dff[:5] ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 姓名 周七 NaN 590.0 NaN NaN 張三 1533.333333 NaN 600.0 NaN 李四 1650.000000 NaN 680.0 NaN 王五 NaN NaN 830.0 900.0 趙六 NaN NaN NaN 1075.0 ''' # 查看數據的標準差 dff.std() ''' 櫃檯 化妝品 82.495791 日用品 84.852814 蔬菜水果 120.277457 食品 123.743687 dtype: float64 ''' # 協方差 
dff.cov() ''' 櫃檯 化妝品 日用品 蔬菜水果 食品 櫃檯 化妝品 6805.555556 NaN 4666.666667 NaN 日用品 NaN 7200.0 NaN NaN 蔬菜水果 4666.666667 NaN 14466.666667 NaN 食品 NaN NaN NaN 15312.5 '''
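The `std()**2` and `cov()` results above follow from the sample formulas with an n - 1 denominator, which is the pandas default. A minimal sketch in plain Python, reusing columns B and C from the small DataFrame above (the helper name `sample_cov` is made up for illustration):

```python
from math import sqrt

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator (the pandas default)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

B = [1, 2, 3, 4, 5]
C = [-5, -4, 1, 4, 5]

var_b = sample_cov(B, B)   # variance is the covariance of a column with itself
std_b = sqrt(var_b)        # standard deviation is the square root of the variance
cov_bc = sample_cov(B, C)  # off-diagonal entry of the covariance matrix

print(var_b, round(std_b, 6), round(cov_bc, 6))
```

The three values correspond to the B entries of `data.std()**2`, `data.std()`, and the (B, C) cell of `data.cov()`.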

NumPy: random functions

import numpy as np

# Generate a random integer array: 3 values drawn from [0, 6)
np.random.randint(0, 6, 3)
# array([1, 1, 3])

# Generate a random two-dimensional integer array
np.random.randint(0, 6, (3, 3))
'''
array([[4, 4, 1],
       [2, 1, 0],
       [5, 0, 0]])
'''

# Generate ten random numbers in [0, 1)
np.random.rand(10)
'''
array([0.9283789 , 0.43515554, 0.27117021, 0.94829333, 0.31733981,
       0.42314939, 0.81838647, 0.39091899, 0.33571004, 0.90240897])
'''

# Draw 3 samples from the standard normal distribution
np.random.standard_normal(3)
# array([0.34660435, 0.63543859, 0.1307822 ])

# Draw standard normal samples with shape (3, 4, 2):
# three blocks, each with four rows and two columns
np.random.standard_normal((3, 4, 2))
'''
array([[[-0.24880261, -1.17453957],
        [ 0.0295264 ,  1.04038047],
        [-1.45201783,  0.57672288],
        [ 1.10282747, -2.08699482]],

       [[-0.3813943 ,  0.47845782],
        [ 0.97708005,  1.1760147 ],
        [ 1.3414987 , -0.629902  ],
        [-0.29780567,  0.60288726]],

       [[ 1.43991349, -1.6757028 ],
        [-1.97956809, -1.18713495],
        [-1.39662811,  0.34174275],
        [ 0.56457553, -0.83224426]]])
'''

NumPy: modifying array elements

import numpy as np

x = np.arange(8)
# [0 1 2 3 4 5 6 7]

# Append one element at the end (np.append returns a new array; x is unchanged)
np.append(x, 10)
# array([ 0,  1,  2,  3,  4,  5,  6,  7, 10])

# Append several elements at the end
np.append(x, [15, 16, 17])
# array([ 0,  1,  2,  3,  4,  5,  6,  7, 15, 16, 17])

# Change an element in place by index
x[0] = 99
# array([99,  1,  2,  3,  4,  5,  6,  7])

# Insert a value at a given position (also returns a new array)
np.insert(x, 0, 54)
# array([54, 99,  1,  2,  3,  4,  5,  6,  7])

# Create a two-dimensional array
x = np.array([[1, 2, 3], [11, 22, 33], [111, 222, 333]])
'''
array([[  1,   2,   3],
       [ 11,  22,  33],
       [111, 222, 333]])
'''

# Change the element in row 0, column 2
x[0, 2] = 9
'''
array([[  1,   2,   9],
       [ 11,  22,  33],
       [111, 222, 333]])
'''

# Set every element with row index >= 1 and column index >= 1 to 1
x[1:, 1:] = 1
'''
array([[  1,   2,   9],
       [ 11,   1,   1],
       [111,   1,   1]])
'''

# Change several elements at once (the list [7, 8] is broadcast to each row)
x[1:, 1:] = [7, 8]
'''
array([[  1,   2,   9],
       [ 11,   7,   8],
       [111,   7,   8]])
'''

x[1:, 1:] = [[7, 8], [9, 10]]
'''
array([[  1,   2,   9],
       [ 11,   7,   8],
       [111,   9,  10]])
'''

NumPy: creating arrays

# Import numpy under the alias np
import numpy as np

# Common ways to create an array: list, tuple, range, arange,
# linspace (evenly spaced values), zeros (all 0), ones (all 1),
# logspace (values evenly spaced on a log scale)

# From a list
np.array([1, 2, 3, 4])
# array([1, 2, 3, 4])

# From a tuple
np.array((1, 2, 3, 4))
# array([1, 2, 3, 4])

# From a range (the stop value is excluded)
np.array(range(4))
# array([0, 1, 2, 3])

# arange(start=0, stop, step=1)
np.arange(1, 8, 2)
# array([1, 3, 5, 7])
np.arange(8)
# array([0, 1, 2, 3, 4, 5, 6, 7])

# linspace(start, stop, num[, endpoint=True]) creates evenly spaced values;
# the stop value is included unless endpoint=False
np.linspace(1, 3, 4, endpoint=False)
# array([1. , 1.5, 2. , 2.5])
np.linspace(1, 3, 4, endpoint=True)
# array([1.        , 1.66666667, 2.33333333, 3.        ])
np.linspace(1, 3, 4)
# array([1.        , 1.66666667, 2.33333333, 3.        ])

# One-dimensional array of zeros
np.zeros(3)
# array([0., 0., 0.])

# One-dimensional array of ones
np.ones(4)
# array([1., 1., 1., 1.])

# np.logspace(start, stop, num, base=10)
np.logspace(1, 3, 4)
# Equivalent to 10 ** np.linspace(1, 3, 4)
# array([  10.        ,   46.41588834,  215.443469  , 1000.        ])
np.logspace(1, 3, 4, base=2)
# Equivalent to 2 ** np.linspace(1, 3, 4)
# array([2.       , 3.1748021, 5.0396842, 8.       ])

# Two-dimensional array (a list of lists)
np.array([[1, 2, 3], [4, 5, 6]])
'''
array([[1, 2, 3],
       [4, 5, 6]])
'''

# Two-dimensional arrays of zeros
# Two rows, two columns
np.zeros((2, 2))
'''
array([[0., 0.],
       [0., 0.]])
'''
# Three rows, two columns
np.zeros((3, 2))
'''
array([[0., 0.],
       [0., 0.],
       [0., 0.]])
'''

# Identity matrix
np.identity(3)
'''
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
'''

# Diagonal matrix (the argument gives the values on the diagonal)
np.diag((1, 2, 3))
'''
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])
'''

NumPy: changing the shape of an array

import numpy as np

n = np.arange(10)
# array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Number of elements
n.size
# 10

# Reshape in place to two rows and five columns
n.shape = 2, 5
'''
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
'''

# Show the shape
n.shape
# (2, 5)

# Set the shape; -1 means "compute this dimension automatically"
n.shape = 5, -1
'''
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
'''

# reshape returns the data viewed as 2 x 5; n itself keeps its shape
x = n.reshape(2, 5)
'''
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
'''

x = np.arange(5)
# ndarray.resize changes x in place; missing entries are filled with 0
x.resize((2, 10))
'''
array([[0, 1, 2, 3, 4, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
'''

# np.resize returns a new 2 x 5 array and does not change x
# (here it takes the first ten values of x)
np.resize(x, (2, 5))
'''
array([[0, 1, 2, 3, 4],
       [0, 0, 0, 0, 0]])
'''

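NumPy: sorting arrays

import numpy as np

x = np.array([1, 4, 5, 2])
# array([1, 4, 5, 2])

# Indices that would sort the array (original positions of the sorted elements)
np.argsort(x)
# array([0, 3, 1, 2], dtype=int64)

# Index of the maximum value
x.argmax()
# 2

# Index of the minimum value
x.argmin()
# 0

# Sort the array in place
x.sort()
print(x)
# [1 2 4 5]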

NumPy: array functions

import numpy as np

# Values from 0 to 90 in steps of 10
x = np.arange(0, 100, 10)
# array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

# Sine of each element (in radians)
np.sin(x)
'''
array([ 0.        , -0.54402111,  0.91294525, -0.98803162,  0.74511316,
       -0.26237485, -0.30481062,  0.77389068, -0.99388865,  0.89399666])
'''

# Cosine of each element
np.cos(x)
'''
array([ 1.        , -0.83907153,  0.40808206,  0.15425145, -0.66693806,
        0.96496603, -0.95241298,  0.6333192 , -0.11038724, -0.44807362])
'''

# Round each element to the nearest integer
np.round(np.cos(x))
# array([ 1., -1.,  0.,  0., -1.,  1., -1.,  1., -0., -0.])

# Round each element up, e.g. 3.3 -> 4
np.ceil(x / 3)
# array([ 0.,  4.,  7., 10., 14., 17., 20., 24., 27., 30.])

# Piecewise functions
x = np.random.randint(0, 10, size=(1, 10))
# array([[0, 3, 6, 7, 9, 4, 9, 8, 1, 8]])

# Elements greater than 4 become 0; all others become 1
np.where(x > 4, 0, 1)
# array([[1, 1, 0, 0, 0, 1, 0, 0, 1, 0]])

# Elements below 4 are doubled, elements above 7 are tripled;
# elements matching neither condition become 0
np.piecewise(x, [x < 4, x > 7], [lambda x: x * 2, lambda x: x * 3])
# array([[ 0,  6,  0,  0, 27,  0, 27, 24,  2, 24]])

NumPy: array arithmetic

import numpy as np x = np.array((1,2,3,4,5)) # 使用 * 進行相乘 x*2 # array([ 2, 4, 6, 8, 10]) # 使用 / 進行相除 x / 2 # array([0.5, 1. , 1.5, 2. , 2.5]) 2 / x # array([2. , 1. , 0.66666667, 0.5 , 0.4 ]) # 使用 // 進行整除 x//2 # array([0, 1, 1, 2, 2], dtype=int32) 10//x # array([10, 5, 3, 2, 2], dtype=int32) # 使用 ** 進行冪運算 x**3 # array([ 1, 8, 27, 64, 125], dtype=int32) 2 ** x # array([ 2, 4, 8, 16, 32], dtype=int32) # 使用 + 進行相加 x + 2 # array([3, 4, 5, 6, 7]) # 使用 % 進行取模 x % 3 # array([1, 2, 0, 1, 2], dtype=int32) # 數組與數組之間的運算 # 使用 + 進行相加 np.array([1,2,3,4]) + np.array([11,22,33,44]) # array([12, 24, 36, 48]) np.array([1,2,3,4]) + np.array([3]) # array([4, 5, 6, 7]) n = np.array((1,2,3)) # + n + n # array([2, 4, 6]) n + np.array([4]) # array([5, 6, 7]) # * n * n # array([1, 4, 9]) n * np.array(([1,2,3],[4,5,6],[7,8,9])) ''' array([[ 1, 4, 9], [ 4, 10, 18], [ 7, 16, 27]]) ''' # - n - n # array([0, 0, 0]) # / n/n # array([1., 1., 1.]) # ** n**n # array([ 1, 4, 27], dtype=int32) x = np.array((1,2,3)) y = np.array((4,5,6)) # 數組的內積運算(對應位置上元素相乘) np.dot(x,y) # 32 sum(x*y) # 32 # 布爾運算 n = np.random.rand(4) # array([0.53583849, 0.09401473, 0.07829069, 0.09363152]) # 判斷數組中的元素是否大於 0.5 n > 0.5 # array([ True, False, False, False]) # 將數組中大於 0.5 的元素顯示 n[n>0.5] # array([0.53583849]) # 找到數組中 0.05 ~ 0.4 的元素總數 sum((n > 0.05)&(n < 0.4)) # 3 # 是否都大於 0.2 np.all(n > 0.2) # False # 是否有元素小於 0.1 np.any(n < 0.1) # True # 數組與數組之間的布爾運算 a = np.array([1,4,7]) # array([1, 4, 7]) b = np.array([4,3,7]) # array([4, 3, 7]) # 在 a 中是否有大於 b 的元素 a > b # array([False, True, False]) # 在 a 中是否有等於 b 的元素 a == b # array([False, False, True]) # 顯示 a 中 a 的元素等於 b 的元素 a[a == b] # array([7]) # 顯示 a 中的偶數且小於 5 的元素 a[(a%2 == 0) & (a < 5)] # array([4])

NumPy: accessing array elements

import numpy as np n = np.array(([1,2,3],[4,5,6],[7,8,9])) ''' array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) ''' # 第一行元素 n[0] # array([1, 2, 3]) # 第一行第三列元素 n[0,2] # 3 # 第一行和第二行的元素 n[[0,1]] ''' array([[1, 2, 3], [4, 5, 6]]) ''' # 第一行第三列，第三行第二列，第二行第一列 n[[0,2,1],[2,1,0]] # array([3, 8, 4]) a = np.arange(8) # array([0, 1, 2, 3, 4, 5, 6, 7]) # 將數組倒序 a[::-1] # array([7, 6, 5, 4, 3, 2, 1, 0]) # 步長爲 2 a[::2] # array([0, 2, 4, 6]) # 從 0 到 4 的元素 a[:5] # array([0, 1, 2, 3, 4]) c = np.arange(16) # array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) c.shape = 4,4 ''' array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) ''' # 第一行，第三個元素到第五個元素(若是沒有則輸出到末尾截止) c[0,2:5] # array([2, 3]) # 第二行元素 c[1] # array([4, 5, 6, 7]) # 第三行到第六行，第三列到第六列 c[2:5,2:5] ''' array([[10, 11], [14, 15]]) ''' # 第二行第三列元素和第三行第四列元素 c[[1,2],[2,3]] # array([ 6, 11]) # 第一行和第三行的第二列到第三列的元素 c[[0,2],1:3] ''' array([[ 1, 2], [ 9, 10]]) ''' # 第一列和第三列的全部橫行元素 c[:,[0,2]] ''' array([[ 0, 2], [ 4, 6], [ 8, 10], [12, 14]]) ''' # 第三列全部元素 c[:,2] # array([ 2, 6, 10, 14]) # 第二行和第四行的全部元素 c[[1,3]] ''' array([[ 4, 5, 6, 7], [12, 13, 14, 15]]) ''' # 第一行的第二列，第四列元素，第四行的第二列，第四列元素 c[[0,3]][:,[1,3]] ''' array([[ 1, 3], [13, 15]]) '''

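TCP client

"""
Create the client socket
Specify the server IP address and port (the port is an integer)
Connect to the server
Send the data to the server (encoded to bytes)
Receive the data the server returns
Close the client
"""
from socket import *

# Create a TCP socket
tcp_client_socket = socket(AF_INET, SOCK_STREAM)
# TCP uses SOCK_STREAM
# UDP uses SOCK_DGRAM

# Server address and port to connect to
server_ip = input("Enter the server IP address: ")
# int() is used instead of eval() so that user input is never executed as code
server_port = int(input("Enter the server port: "))

# Connect; the address is a tuple (ip, port)
tcp_client_socket.connect((server_ip, server_port))
# print(type((server_ip, server_port)))  # <class 'tuple'>

# Ask the user for the data to send
send_data = input("Enter the data to send: ")
# Encode str -> bytes before sending
tcp_client_socket.send(send_data.encode('gbk'))

# Receive the data sent back by the server
recv_data = tcp_client_socket.recv(1024)
print("Received: %s" % (recv_data.decode('gbk')))

# Close the socket
tcp_client_socket.close()

"""
TCP uses AF_INET, SOCK_STREAM
TCP must establish a connection first: connect() takes the server IP address
and port as a tuple
send() sends encoded data; use encode() for str -> bytes
recv(1024) receives at most 1024 bytes
close() closes the client
"""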

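TCP server

"""
Create the TCP server socket
Bind the local address (IP address, port)
Listen for connections
Accept a connection (returns the client socket and the client address)
Use the client socket to receive data
Print the received data and send a reply to the client
Close the sockets
"""
from socket import *

# Create a TCP socket
tcp_server_socket = socket(AF_INET, SOCK_STREAM)

# Local address: IP address + port
# The host may be left empty (all local interfaces); the port is fixed
local_address = ('', 7788)
tcp_server_socket.bind(local_address)

# A new socket is active by default (it initiates connections);
# listen() makes it passive so that it can accept clients
tcp_server_socket.listen(128)

# accept() returns a new socket dedicated to this client, plus the client address
client_socket, client_address = tcp_server_socket.accept()

# Receive data through the client socket
recv_data = client_socket.recv(1024)
# Decode bytes -> str and print
print("Received: %s" % (recv_data.decode('gbk')))

# Send data to the client
client_socket.send("Hany sends data from the TCP server".encode('gbk'))

# Close the socket for this client; other clients could still connect
client_socket.close()
# Close the listening socket to shut the server down
tcp_server_socket.close()

"""
The server first binds its address with bind((ip, port)); '' is enough for the ip
listen(128) starts listening
accept() returns (client socket, client address)
Use the client socket to receive data: recv(1024)
Decode the received bytes -> str with decode()
gbk is used here because Windows uses the gbk encoding;
both sides must use the same encoding
send() sends data to the accepted client, str -> bytes
Close the server
"""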

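The client and server steps above can be tried in a single script on the loopback interface. This is a minimal sketch, not the author's program: the function names, the upper-casing reply, and the use of port 0 (so the operating system picks a free port) are assumptions made for the example.

```python
import socket
import threading

def serve_once(server_sock, encoding="utf-8"):
    """Accept one client, read one message, and send it back upper-cased."""
    client_sock, _client_addr = server_sock.accept()
    with client_sock:
        data = client_sock.recv(1024)
        client_sock.sendall(data.decode(encoding).upper().encode(encoding))

def echo_roundtrip(message, encoding="utf-8"):
    """Start a one-shot server on a free loopback port and talk to it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server_sock, encoding))
        worker.start()
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client_sock:
            client_sock.connect(("127.0.0.1", port))
            client_sock.sendall(message.encode(encoding))
            reply = client_sock.recv(1024).decode(encoding)
        worker.join()
    return reply

print(echo_roundtrip("hello tcp"))
```

Both ends use the same encoding here, which is the point the notes make about gbk: the bytes only decode correctly if sender and receiver agree.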
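UDP: binding a local address

"""
Create the socket -> bind the local IP address and port -> receive data
-> decode and print -> close the socket
"""
from socket import *

# UDP uses SOCK_DGRAM
udp_socket = socket(AF_INET, SOCK_DGRAM)

# Bind the local address; if a network program does not bind,
# the system assigns a random port
local_addr = ('', 7788)  # the IP address may be left empty
udp_socket.bind(local_addr)

# Receive data from the other side; UDP uses recvfrom()
recv_data = udp_socket.recvfrom(1024)

# Print what was received
print(recv_data[0].decode('gbk'))
print(recv_data[1])  # (ip address, port) of the sender

udp_socket.close()

"""
UDP uses AF_INET, SOCK_DGRAM
Bind the IP address and a fixed port
recvfrom(1024) returns a tuple (data, (ip, port))
The data arrives as bytes and must be decoded, e.g. decode('gbk')
"""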

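UDP network program: sending and receiving data

"""
Create a UDP socket
Send data to the receiver, then wait for a reply
"""
from socket import *

# Create a UDP socket with SOCK_DGRAM
udp_socket = socket(AF_INET, SOCK_DGRAM)

# Address of the receiver: (host, fixed port)
# 127.0.0.1 (this machine) is a placeholder; use the receiver's IP address
dest_addr = ('127.0.0.1', 8080)

# Read the data from the keyboard
send_data = input("Enter the data to send: ")

# sendto(data, (ip, port)) sends the encoded data to the given machine
udp_socket.sendto(send_data.encode('utf-8'), dest_addr)

# Wait for data from the other side; 1024 is the maximum number of bytes to receive
recv_data = udp_socket.recvfrom(1024)

# recv_data is a tuple: the first item is the data, the second is (ip, port).
# Decode with the encoding the sender used
print(recv_data[0].decode('utf-8'))  # the message
print(recv_data[1])                  # (ip address, port)

# Close the socket
udp_socket.close()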

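A self-contained way to see `sendto` and `recvfrom` working together is to create both sockets in one process. This is a sketch under assumptions: the function name `udp_roundtrip`, the loopback address, and the two-second timeout are choices made for the example.

```python
import socket

def udp_roundtrip(message, encoding="utf-8"):
    """Send one datagram over loopback and return (text, sender_address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))     # port 0: let the OS choose a free port
        receiver.settimeout(2.0)            # do not block forever if nothing arrives
        dest_addr = receiver.getsockname()  # (ip, port) the receiver is bound to
        sender.sendto(message.encode(encoding), dest_addr)
        data, sender_addr = receiver.recvfrom(1024)
    return data.decode(encoding), sender_addr

text, addr = udp_roundtrip("hello udp")
print(text, addr)
```

The sender never calls `bind`, so the system assigns it a port, which shows up as the second item of the address returned by `recvfrom`.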
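WSGI application example

import time

# WSGI lets developers freely combine web frameworks and web servers
def app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-Type', 'text/html')]
    start_response(status, response_headers)
    # In Python 3 a WSGI application must return an iterable of bytes
    body = str(environ) + " Hello WSGI ---%s----" % (time.ctime())
    return [body.encode('utf-8')]

print(time.ctime())
# Tue Jan 14 21:55:35 2020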

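Defining the WSGI interface

# Called by the WSGI server
def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # The body must be an iterable of bytes in Python 3
    return [b'Hello World']

'''
environ: a dict containing the HTTP request information
start_response: a function that sends the HTTP response status and headers
'''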
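To see how a server drives such an application, the callable can be invoked by hand with a test environ from the standard library's `wsgiref`. A minimal sketch: `call_app` is a hypothetical helper written for this example, and the `Content-Length` header is an addition.

```python
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    body = b"Hello World"
    headers = [("Content-Type", "text/html"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]   # an iterable of bytes

def call_app(app):
    """Play the role of the server: build environ, capture status and headers."""
    environ = {}
    setup_testing_defaults(environ)   # fills in a minimal, valid WSGI environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body

status, headers, body = call_app(application)
print(status, headers, body)
```

For a real HTTP endpoint, `wsgiref.simple_server.make_server` can serve the same `application` object.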

Using encode and decode

txt = '我是字符串'
txt_encode = txt.encode()  # the default encoding is UTF-8
print(txt)
# 我是字符串
print(txt_encode)
# b'\xe6\x88\x91\xe6\x98\xaf\xe5\xad\x97\xe7\xac\xa6\xe4\xb8\xb2'
print(type(txt))
# <class 'str'>
print(type(txt_encode))
# <class 'bytes'>

txt_copy = txt_encode.decode()
print(txt_copy)
# 我是字符串
print(type(txt_copy))
# <class 'str'>

# str -> bytes: encode
# bytes -> str: decode
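The socket examples in these notes encode with gbk in some places and UTF-8 in others, so it is worth seeing what happens when the two sides disagree. A short sketch using the same sample string (the variable names are made up for the example):

```python
text = "我是字符串"

utf8_bytes = text.encode("utf-8")   # 3 bytes per character for these characters
gbk_bytes = text.encode("gbk")      # 2 bytes per character for these characters
print(len(utf8_bytes), len(gbk_bytes))

# Decoding with the encoding that was used for encoding restores the text
same = gbk_bytes.decode("gbk")

# Decoding with a different encoding gives garbage; errors="replace" avoids
# an exception by substituting U+FFFD for bytes that cannot be decoded
wrong = utf8_bytes.decode("gbk", errors="replace")
print(same == text, wrong == text)
```

The byte string and the codec name have to travel together: bytes alone do not say which encoding produced them.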

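Using abs, all, and any

'''
abs: for a real number, returns the absolute value;
     for a complex number, returns its modulus
'''
a = 6
b = -6
c = 0
# print("a = {0} , b = {1} , c = {2}".format(abs(a),abs(b),abs(c)))
# a = 6 , b = 6 , c = 0
# Negative numbers become positive; positive numbers and zero are unchanged
d = 3 + 4j
# print("modulus of d = {0}".format(abs(d)))
# modulus of d = 5.0
'''
Summary:
returns the number itself: positive numbers, 0
returns the opposite number: negative numbers
returns the modulus: complex numbers
'''

'''
all: takes an iterable; returns True if every element is truthy,
     and False as soon as one element is falsy
'''
a = 6
b = -6
c = 0
d = 1
# print(all([a,b,c]))
# False, because c is 0 and one falsy element is enough
# print(all([a,b,d]))
# True, all are truthy; a nonzero real number is truthy
s = ''
# print(all(s))
# True: an empty string has no elements, so all() returns True
e = [0+0j]
# all only accepts iterables, so complex and real numbers must be put in a list
# print(all(e))
# False
a = ['']
# print(all(a))
# False: a list containing an empty string has one falsy element
b = []
# print(all(b))
# True: an empty list returns True
c = [0]
# print(all(c))
# False: the list contains 0
d = {}
# print(all(d))
# True: an empty dict returns True
e = set()
# print(all(e))
# True: an empty set returns True
f = [set()]
# print(all(f))
# False: the list contains an empty set, which is falsy
g = [{}]
# print(all(g))
# False: the list contains an empty dict, which is falsy
'''
Summary:
True:  '' , [] , {} , set() , lists of nonzero numbers
False: [''] , [0+0j] , [0] , [set()] , [{}]
'''

'''
any: takes an iterable; returns True if at least one element is truthy,
     and False if every element is falsy
'''
lst = [0,0,1]
# print(any(lst))
# True, because 1 is truthy and one truthy element is enough
'''
Summary: any returns True as soon as one element is truthy
'''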
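The results for empty containers are easier to remember once `all` and `any` are written out as loops. The sketch below uses the hypothetical names `my_all` and `my_any`; it mirrors the documented behaviour of the built-ins rather than their actual implementation.

```python
def my_all(iterable):
    """True unless some element is falsy; an empty iterable gives True."""
    for item in iterable:
        if not item:
            return False
    return True

def my_any(iterable):
    """False unless some element is truthy; an empty iterable gives False."""
    for item in iterable:
        if item:
            return True
    return False

cases = ["", [], {}, set(), [""], [0], [0 + 0j], [set()], [{}], [6, -6, 1], [0, 0, 1]]
for case in cases:
    print(repr(case), my_all(case), my_any(case))
```

An empty container never enters the loop, so `my_all` falls through to `True` and `my_any` falls through to `False`.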

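# Some file operations (similar notes were posted before)

# Writing files
'''
w  write mode: if the file exists, its content is cleared before writing;
   otherwise the file is created
a  append mode: if the file exists, new content is added at the end;
   otherwise the file is created
with  the file is closed automatically when the block ends,
      even if an exception is raised inside it
'''
import os
print(os.path)  # shows which path module is in use (posixpath or ntpath)
from time import ctime

# Write with mode w
f = open(r"test.txt", "w", encoding="utf-8")
# write() returns the number of characters written
print(f.write("The user wrote to the file at {0}".format(ctime())))
f.close()

# Write with mode a
f = open(r"test.txt", "a", encoding="utf-8")
print(f.write('Written with mode a at {0} '.format(ctime())))
f.close()

# With a with statement the file is closed automatically
with open(r"test.txt", "w", encoding="utf-8") as f:
    f.write('{0}'.format(ctime()))

# Reading files
import os

def mkdir(path):
    '''Create a directory, checking first whether it already exists'''
    is_exist = os.path.exists(path)
    if not is_exist:
        # Create it only if the path does not exist
        os.mkdir(path)

def open_file(file_name):
    '''Open a file and return its content'''
    f = open(file_name)  # f holds the file object
    f_lst = f.read()     # read the file
    f.close()            # close the file when done
    return f_lst         # return the content that was read

# Getting the extension
import os
'''os.path.splitext('file path') splits off the extension'''
# Split the path into (everything before the extension, extension); returns a tuple
file_text = os.path.splitext('./data/py/test.py')
# print(type(file_text))  # <class 'tuple'>
front, ext = file_text
# front is the first element of the tuple
# ext is the second element of the tuple
print(front, file_text[0])
# ./data/py/test ./data/py/test
print(ext, file_text[1])
# .py .py

# Getting the file name from a path
import os
'''os.path.split('file path') splits off the file name'''
file_text = os.path.split('./data/py/test.py')
# print(type(file_text))  # <class 'tuple'>
'''The first element is the directory path, the second is the file name'''
path, file_name = file_text
print(path)
# ./data/py
print(file_name)
# test.py
'''split returns (directory, file name); splitext returns (root, extension)'''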
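The same pieces of a path can be read with `pathlib`, which some readers find easier to remember than the `split`/`splitext` pair. A small comparison on the path used above; `PurePosixPath` is chosen here so the result does not depend on the operating system.

```python
import os
from pathlib import PurePosixPath

path = "./data/py/test.py"

# os.path: two functions, each returning a 2-tuple
root, ext = os.path.splitext(path)       # (path without extension, extension)
folder, file_name = os.path.split(path)  # (directory, file name)

# pathlib: one object with named attributes
p = PurePosixPath(path)                  # the leading "./" is normalised away
print(root, ext, folder, file_name)
print(p.parent, p.name, p.stem, p.suffix)
```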

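pymysql: basic database usage

'''
Start
Create a connection
Get a cursor
Execute the query, fetch the data, process the data
Close the cursor
Close the connection
End
'''
from pymysql import *

# The database name is a string and must be quoted
conn = connect(host='localhost', port=3306, database='hany_vacation',
               user='root', password='root', charset='utf8')
# connect(host: MySQL host to connect to, port: MySQL port to connect to,
#         database: database name, user: user name, password: password,
#         charset: character encoding)

# Connection methods: commit() commits, close() closes the connection,
# cursor() returns a Cursor object used to run SQL statements and get results
# Main SQL statements executed: select, insert, update, delete
cs1 = conn.cursor()

# Cursor methods:
'''
close()     closes the cursor
execute(operation[, parameters])
            executes a statement and returns the number of affected rows;
            used for select as well as insert, update, delete, create,
            alter, and drop statements
fetchone()  after a query, fetches the next row of the result set as a tuple
fetchall()  after a query, fetches all remaining rows; each row is a tuple,
            and the rows are returned together in a tuple
'''

# Cursor attributes
'''
rowcount    read-only; number of rows affected by the most recent execute()
connection  the connection object the cursor belongs to
'''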

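Parameterized database queries, fetching one or many rows

Parameterization

from pymysql import *

def main():
    find_name = input("Enter the item name: ")
    # host, port, user, password, database name, character set
    conn = connect(host='localhost', port=3306, user='root', password='root',
                   database='jing_dong', charset='utf8')
    cs1 = conn.cursor()  # get a cursor

    # Build the parameter list
    params = [find_name]
    # Pass the parameters as the second argument of execute() so that the
    # driver escapes them; formatting them into the SQL string with % would
    # allow SQL injection
    count = cs1.execute('select * from goods where name=%s', params)
    print(count)

    result = cs1.fetchall()  # fetch all rows
    print(result)

    # Close the cursor first, then the connection
    cs1.close()
    conn.close()

if __name__ == '__main__':
    main()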
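Why the parameters should go through `execute` rather than string formatting can be shown without a MySQL server. The sketch below swaps in the standard library's `sqlite3`, so it is not pymysql code: sqlite3 uses `?` as its placeholder where pymysql uses `%s`, and the table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table goods (id integer primary key, name text)")
conn.executemany("insert into goods (name) values (?)", [("laptop",), ("phone",)])

def find_unsafe(name):
    # String formatting: the user input becomes part of the SQL text
    sql = "select * from goods where name='%s'" % name
    return conn.execute(sql).fetchall()

def find_safe(name):
    # Parameter binding: the input is always treated as a single value
    return conn.execute("select * from goods where name=?", (name,)).fetchall()

malicious = "' or '1'='1"
print(find_unsafe(malicious))  # the condition is rewritten and matches every row
print(find_safe(malicious))    # no item has this literal name
print(find_safe("laptop"))
```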

Querying one row at a time

from pymysql import *
import time

def main():
    # Create the connection
    conn = connect(host='localhost', port=3306, user='root', password='root',
                   database='jing_dong', charset='utf8')
    # Get a cursor
    cs1 = conn.cursor()
    # Execute the select statement; execute() returns the number of rows found
    count = cs1.execute('select id,name from goods where id>=4')
    # count = cs1.execute('select id,name from goods where id between 4 and 15')
    # Print the number of rows
    print("Found %d rows:" % count)

    for i in range(count):
        # fetchone() returns one row per call; fetchall() returns all rows
        result = cs1.fetchone()
        time.sleep(0.5)
        # Print the row
        print(result)

    # Close the cursor, then the connection
    cs1.close()
    conn.close()

if __name__ == '__main__':
    main()

from pymysql import *

def main():
    # Create the connection
    conn = connect(host='localhost', port=3306, user='root', password='root',
                   database='jing_dong', charset='utf8')
    # Get a cursor
    cs1 = conn.cursor()
    # Execute the select statement; execute() returns the number of rows found
    count = cs1.execute('select id,name from goods where id>=4')
    # Print the number of rows
    print("Found %d rows:" % count)

    # for i in range(count):
    #     # Fetch one row
    #     result = cs1.fetchone()
    #     # Print the row
    #     print(result)

    # fetchall() returns every row at once
    result = cs1.fetchall()
    print(result)

    # Close the cursor, then the connection
    cs1.close()
    conn.close()

if __name__ == '__main__':
    main()

Binary search

def binary_search(alist, item):
    # alist must be sorted in ascending order
    first = 0
    last = len(alist) - 1
    while first <= last:
        midpoint = (first + last) // 2
        if alist[midpoint] == item:
            return True
        elif item < alist[midpoint]:
            last = midpoint - 1    # continue in the left half
        else:
            first = midpoint + 1   # continue in the right half
    return False

testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(binary_search(testlist, 3))   # False
print(binary_search(testlist, 13))  # True

# Binary search implemented recursively
def binary_search(alist, item):
    if len(alist) == 0:
        return False
    else:
        midpoint = len(alist) // 2
        if alist[midpoint] == item:
            return True
        else:
            if item < alist[midpoint]:
                # Search from the start up to the middle
                return binary_search(alist[:midpoint], item)
            else:
                # item is greater than alist[midpoint]:
                # search from after the middle to the end
                return binary_search(alist[midpoint+1:], item)

testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(binary_search(testlist, 3))   # False
print(binary_search(testlist, 13))  # True
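The recursive version copies half of the list at every step because of the slicing. When the position of the item is also wanted, the standard library's `bisect` module does the same halving without copies. A minimal sketch; the function name `binary_search_index` and the `-1` return value for "not found" are conventions chosen for this example.

```python
from bisect import bisect_left

def binary_search_index(sorted_list, item):
    """Return the index of item in sorted_list, or -1 if it is absent."""
    pos = bisect_left(sorted_list, item)   # leftmost position where item could go
    if pos < len(sorted_list) and sorted_list[pos] == item:
        return pos
    return -1

testlist = [0, 1, 2, 8, 13, 17, 19, 32, 42]
print(binary_search_index(testlist, 3))
print(binary_search_index(testlist, 13))
```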

Regular expressions: collection, part 1

$通配符,匹配字符串結尾 import re email_list = ["xiaoWang@163.com", "xiaoWang@163.comheihei", ".com.xiaowang@qq.com"] for email in email_list: ret = re.match("[\w]{4,20}@163\.com$", email) # \w 匹配字母或數字 # {4,20}匹配前一個字符4到20次 if ret: print("%s 是符合規定的郵件地址,匹配後的結果是:%s" % (email, ret.group())) elif ret == None : print("%s 不符合要求" % email) ''' xiaoWang@163.com 是符合規定的郵件地址,匹配後的結果是:xiaoWang@163.com xiaoWang@163.comheihei 不符合要求 .com.xiaowang@qq.com 不符合要求 ''' re.match匹配 import re # re.match匹配字符（僅匹配開頭） string = 'Hany' # result = re.match(string,'123Hany hanyang') result = re.match(string,'Hany hanyang') # 使用group方法提取數據 if result: print("匹配到的字符爲:",result.group( )) else: print("沒匹配到以%s開頭的字符串"%(string)) '''匹配到的字符爲: Hany''' re中findall用法 import re ret = re.findall(r"\d+","Hany.age = 22, python.version = 3.7.5") # 輸出所有找到的結果 \d + 一次或屢次 print(ret) # ['22', '3', '7', '5'] re中search用法 import re ret = re.search(r"\d+",'閱讀次數爲:9999') # 只要找到規則便可,從頭至尾 print(ret.group()) '''9999''' re中匹配 [ ] 中列舉的字符 import re # 匹配[]中列舉的字符 # 若是hello的首字符小寫，那麼正則表達式須要小寫的h ret = re.match("h","hello Python") print(ret.group()) # 若是hello的首字符大寫，那麼正則表達式須要大寫的H ret = re.match("H","Hello Python") print(ret.group()) # 大小寫h均可以的狀況 ret = re.match("[hH]","hello Python") print(ret.group()) ret = re.match("[hH]","Hello Python") print(ret.group()) ret = re.match("[hH]ello Python","Hello Python") print(ret.group()) # 匹配0到9第一種寫法 ret = re.match("[0123456789]Hello Python","7Hello Python") print(ret.group()) # 匹配0到9第二種寫法 ret = re.match("[0-9]Hello Python","7Hello Python") print(ret.group()) # 匹配0到3 5到9的數字 ret = re.match("[0-3,5-9]Hello Python","7Hello Python") print(ret.group()) # 下面這個正則不可以匹配到數字4，所以ret爲None ret = re.match("[0-3,5-9]Hello Python","4Hello Python") # print(ret.group()) ''' h H h H Hello Python 7Hello Python 7Hello Python 7Hello Python ''' re中匹配不是以4,7結尾的手機號碼 import re tels = ["13100001234", "18912344321", "10086", "18800007777"] for tel in tels: ret = re.match("1\d{9}[0-3,5-6,8-9]", tel) if ret: print("想要的手機號是:{}".format(ret.group())) 
else: print("%s 不是想要的手機號" % tel) ''' 13100001234 不是想要的手機號 想要的手機號是:18912344321 10086 不是想要的手機號 18800007777 不是想要的手機號 ''' re中匹配中獎號碼 import re # 匹配中獎號碼 str2 = '17711602423' pattern = re.compile('^(1[3578]\d)(\d{4})(\d{4})$') print(pattern.sub(r'\1****\3',str2)) # r 字符串編碼轉化 '''177****2423''' re中匹配中文字符 import re pattern = re.compile('[\u4e00-\u9fa5]') strs = '你好 Hello hany' print(pattern.findall(strs)) pattern = re.compile('[\u4e00-\u9fa5]+') print(pattern.findall(strs)) # ['你', '好'] # ['你好'] re中匹配任意字符 import re # 匹配任意一個字符 ret = re.match(".","M") # ret = re.match(".","M123") # M # 匹配單個字符 print(ret.group()) ret = re.match("t.o","too") print(ret.group()) ret = re.match("t.o","two") print(ret.group()) ''' M too two ''' re中匹配多個字符_加號 import re #匹配前一個字符出現1次或無限次 names = ["name1", "_name", "2_name", "__name__"] for name in names: ret = re.match("[a-zA-Z_]+[\w]*",name) if ret: print("變量名 %s 符合要求" % ret.group()) else: print("變量名 %s 非法" % name) ''' 變量名 name1 符合要求 變量名 _name 符合要求 變量名 2_name 非法 變量名 __name__ 符合要求 ''' re中匹配多個字符_星號 import re # 匹配前一個字符出現0次或無限次 # *號只針對前一個字符 ret = re.match("[A-Z][a-z]*","M") # ret = re.match("[a-z]*","M") 沒有結果 print(ret.group()) ret = re.match("[A-Z][a-z]*","MnnM") print(ret.group()) ret = re.match("[A-Z][a-z]*","Aabcdef") print(ret.group()) ''' M Mnn Aabcdef ''' re中匹配多個字符_問號 import re # 匹配前一個字符要麼1次，要麼0次 ret = re.match("[1-9]?[0-9]","7") print(ret.group()) ret = re.match("[1-9]?","7") print(ret.group()) ret = re.match("[1-9]?\d","33") print(ret.group()) ret = re.match("[1-9]?\d[1-9]","33") print(ret.group()) ret = re.match("[1-9]?\d","09") print(ret.group()) ''' 7 7 33 33 0 ''' re中匹配左右任意一個表達式 import re ret = re.match("[1-9]?\d","8") # ? 
匹配1次或0次 print(ret.group()) # 8 ret = re.match("[1-9]?\d","78") print(ret.group()) # 78 # 不正確的狀況 ret = re.match("[1-9]?\d","08") print(ret.group()) # 0 # 修正以後的 ret = re.match("[1-9]?\d$","08") if ret: print(ret.group()) else: print("不在0-100之間") # 添加| ret = re.match("[1-9]?\d$|100","8") print(ret.group()) # 8 ret = re.match("[1-9]?\d$|100","78") print(ret.group()) # 78 ret = re.match("[1-9]?\d$|100","08") # print(ret.group()) # 不是0-100之間 ret = re.match("[1-9]?\d$|100","100") print(ret.group()) # 100 re中匹配數字 import re # 普通的匹配方式 ret = re.match("嫦娥1號","嫦娥1號發射成功") print(ret.group()) ret = re.match("嫦娥2號","嫦娥2號發射成功") print(ret.group()) ret = re.match("嫦娥3號","嫦娥3號發射成功") print(ret.group()) # 使用\d進行匹配 ret = re.match("嫦娥\d號","嫦娥1號發射成功") print(ret.group()) ret = re.match("嫦娥\d號","嫦娥2號發射成功") print(ret.group()) ret = re.match("嫦娥\d號","嫦娥3號發射成功") print(ret.group()) ''' 嫦娥1號 嫦娥2號 嫦娥3號 嫦娥1號 嫦娥2號 嫦娥3號 ''' re中對分組起別名 #(?P<name>) import re ret = re.match(r"<(?P<name1>\w*)><(?P<name2>\w*)>.*</(?P=name2)></(?P=name1)>", "<html><h1>www.itcast.cn</h1></html>") print(ret.group()) ret = re.match(r"<(?P<name1>\w*)><(?P<name2>\w*)>.*</(?P=name2)></(?P=name1)>", "<html><h1>www.itcast.cn</h2></html>") # (?P=name2) ----- (?P<name2>\w*) # ret.group() ''' <html><h1>www.itcast.cn</h1></html> ''' re中將括號中字符做爲一個分組 #(ab)將ab做爲一個分組 import re ret = re.match("\w{4,20}@163\.com", "test@163.com") print(ret.group()) # test@163.com ret = re.match("\w{4,20}@(163|126|qq)\.com", "test@126.com") print(ret.group()) # test@126.com ret = re.match("\w{4,20}@(163|126|qq)\.com", "test@qq.com") print(ret.group()) # test@qq.com ret = re.match("\w{4,20}@(163|126|qq)\.com", "test@gmail.com") if ret: print(ret.group()) else: print("不是16三、12六、qq郵箱") # 不是16三、12六、qq郵箱 ''' test@163.com test@126.com test@qq.com 不是16三、12六、qq郵箱 '''
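Two details in the examples above are easy to miss: `re.match` only anchors at the start of the string, and a comma inside `[...]` is a literal comma rather than a separator. A short sketch of both points, written with raw strings so that `\d` is not treated as a string escape (the variable names are made up for the example):

```python
import re

range_pattern = r"[1-9]?\d|100"

# re.match anchors only at the start, so it accepts a matching prefix
prefix_only = re.match(range_pattern, "08").group()

# re.fullmatch requires the whole string to match, with no need for "$"
whole_08 = re.fullmatch(range_pattern, "08")
whole_78 = re.fullmatch(range_pattern, "78").group()
whole_100 = re.fullmatch(range_pattern, "100").group()

# Inside a character class the comma is just another character to match
comma_hit = re.match(r"[0-3,5-9]", ",")
comma_miss = re.match(r"[0-35-9]", ",")
four_miss = re.match(r"[0-35-9]", "4")

print(prefix_only, whole_08, whole_78, whole_100)
print(bool(comma_hit), comma_miss, four_miss)
```

Using `fullmatch` also removes the subtle issue in `"[1-9]?\d$|100"`, where the `$` applies only to the first alternative.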

Regular expressions: collection, part 2

re中引用分組匹配字符串 import re # 可以完成對正確的字符串的匹配 ret = re.match("<[a-zA-Z]*>\w*</[a-zA-Z]*>", "<html>hh</html>") print(ret.group()) # 若是遇到非正常的html格式字符串，匹配出錯</htmlbalabala>會一塊兒輸出 ret = re.match("<[a-zA-Z]*>\w*</[a-zA-Z]*>", "<html>hh</htmlbalabala>") # print(ret.group()) # 正確的理解思路：若是在第一對<>中是什麼，按理說在後面的那對<>中就應該是什麼 # 經過引用分組中匹配到的數據便可，可是要注意是元字符串，即相似 r""這種格式 ret = re.match(r"<([a-zA-Z]*)>\w*</\1>", "<html>hh</html>") # </\1>匹配第一個規則 print(ret.group()) # 由於2對<>中的數據不一致，因此沒有匹配出來 test_label = "<html>hh</htmlbalabala>" ret = re.match(r"<([a-zA-Z]*)>\w*</\1>", test_label) if ret: print(ret.group()) else: print("%s 這是一對不正確的標籤" % test_label) ''' <html>hh</html> <html>hh</htmlbalabala> <html>hh</html> <html>hh</htmlbalabala> 這是一對不正確的標籤 ''' re中引用分組匹配字符串_2 import re labels = ["<html><h1>www.itcast.cn</h1></html>", "<html><h1>www.itcast.cn</h2></html>"] for label in labels: ret = re.match(r"<(\w*)><(\w*)>.*</\2></\1>", label) # <\2>和第二個匹配同樣的內容 if ret: print("%s 是符合要求的標籤" % ret.group()) else: print("%s 不符合要求" % label) ''' <html><h1>www.itcast.cn</h1></html> 是符合要求的標籤 <html><h1>www.itcast.cn</h2></html> 不符合要求 ''' re中提取區號和電話號碼 import re ret = re.match("([^-]*)-(\d+)","010-1234-567") # 除了 - 的全部字符 # 對最後一個-前面的全部字符進行分組,直到最後一個數字爲止 print(ret.group( )) print(ret.group(1))#返回-以前的數據,不必定是最後一個-以前 print(ret.group(2)) re中的貪婪 import re s= "This is a number 234-235-22-423" r=re.match(".+(\d+-\d+-\d+-\d+)",s) # .+ 儘可能多的匹配任意字符,匹配到-前一個數字以前 # . 匹配任意字符 print(type(r)) print(r.group()) print(r.group(0)) print(r.group(1)) r=re.match(".+?(\d+-\d+-\d+-\d+)",s) print(r.group()) print(r.group(1))#到數字中止貪婪 ''' <class 're.Match'> This is a number 234-235-22-423 This is a number 234-235-22-423 4-235-22-423 This is a number 234-235-22-423 234-235-22-423 ''' import re ret = re.match(r"aa(\d+)","aa2343ddd") # 儘可能多的匹配字符 print(ret.group()) # 使用? 
將re貪婪轉換爲非貪婪 ret = re.match(r"aa(\d+?)","aa2343ddd") # 只輸出一個數字 print(ret.group()) ''' aa2343 aa2 ''' re使用split切割字符串 import re ret = re.split(r":| ","info:XiaoLan 22 Hany.control") # | 或 知足一個便可 print(ret) str1 = 'one,two,three,four' pattern = re.compile(',') # 按照，將string分割後返回 print(pattern.split(str1)) # ['one', 'two', 'three', 'four'] str2 = 'one1two2three3four' print(re.split('\d+',str2)) # ['one', 'two', 'three', 'four'] re匹配中subn，進行替換並返回替換次數 import re pattern = re.compile('\d+') strs = 'one1two2three3four' print(pattern.subn('-',strs)) # ('one-two-three-four', 3) 3爲替換的次數 re匹配中sub將匹配到的數據進行替換 # import re # ret = re.sub(r"\d+", '替換的字符串998', "python = 997") # # python = 替換的字符串998 # print(ret) # # 將匹配到的數據替換掉，替換成想要替換的數據 # re.sub("規則","替換的字符串","想要替換的數據") import re def add(temp): strNum = temp.group() # 匹配到的數據.group()方式 print("原來匹配到的字符:",int(temp.group())) num = int(strNum) + 5 #字符串強制轉換 return str(num) ret = re.sub(r"\d+", add, "python = 997") # re.sub('正則規則','替換的字符串','字符串') print(ret) ret = re.sub(r"\d+", add, "python = 99") print(ret) pattern = re.compile('\d') str1 = 'one1two2three3four' print(pattern.sub('-',str1)) # one-two-three-four print(re.sub('\d','-',str1)) # one-two-three-four ''' 原來匹配到的字符: 997 python = 1002 原來匹配到的字符: 99 python = 104 one-two-three-four one-two-three-four ''' re匹配的小例子 import re src="https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg" ret = re.search(r"https://.*?\.jpg", src) print(ret.group()) res = re.compile('[a-zA-Z]{1}') strs = '123abc456' print(re.search(res,strs).group( )) print(re.findall(res,strs)) #findall返回列表元素對象不具備group函數 # print(re.finditer(res,strs)) #返回迭代器對象 ''' https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg a ['a', 'b', 'c'] ''' 匹配前一個字符出現m次 import re src="https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg" ret = re.search(r"https://.*?\.jpg", src) print(ret.group()) res = re.compile('[a-zA-Z]{1}') strs = '123abc456' print(re.search(res,strs).group( 
)) print(re.findall(res,strs)) #findall返回列表元素對象不具備group函數 # print(re.finditer(res,strs)) #返回迭代器對象 ''' https://rpic.douyucdn.cn/appCovers/2016/11/13/1213973_201611131917_small.jpg a ['a', 'b', 'c'] ''' 引用分組 import re strs = 'hello 123,world 456' pattern = re.compile('(\w+) (\d+)') # for i in pattern.finditer(strs): # print(i.group(0)) # print(i.group(1)) # print(i.group(2))#當存在第二個分組時 '''hello 123 hello 123 world 456 world 456 ''' print(pattern.sub(r'\2 \1',strs)) # 先輸出第二組，後輸出第一組 print(pattern.sub(r'\1 \2',strs)) 當findall遇到分組時，只匹配分組 import re pattern = re.compile('([a-z])[a-z]([a-z])') strs = '123abc456asd' # print(re.findall(pattern,strs)) # [('a', 'c'), ('a', 'd')]返回分組匹配到的結果 result = re.finditer(pattern,strs) for i in result: print(i.group( )) #match對象使用group函數輸出 print(i.group(0))#返回匹配到的全部結果 print(i.group(1))#返回第一個分組匹配的結果 print(i.group(2))#返回第二個分組匹配的結果 # <re.Match object; span=(3, 6), match='abc'> # <re.Match object; span=(9, 12), match='asd'> # 返回完整的匹配結果 ''' abc abc a c asd asd a d '''
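Two behaviours from this collection are compact enough to check side by side: greedy versus lazy quantifiers, and how `findall` changes its return value when the pattern contains groups. The sketch reuses the strings from the examples above; the non-capturing form `(?:...)` is an addition for comparison.

```python
import re

s = "This is a number 234-235-22-423"

# Greedy ".+" takes as much as it can and leaves only one digit for the group
greedy = re.match(r".+(\d+-\d+-\d+-\d+)", s).group(1)
# Lazy ".+?" takes as little as it can, so the group gets the full number
lazy = re.match(r".+?(\d+-\d+-\d+-\d+)", s).group(1)

strs = "123abc456asd"
# With capturing groups, findall returns tuples of the groups only
grouped = re.findall(r"([a-z])[a-z]([a-z])", strs)
# With non-capturing groups, findall returns the whole matches
whole = re.findall(r"(?:[a-z])[a-z](?:[a-z])", strs)

print(greedy, lazy)
print(grouped, whole)
```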

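Processes: Pool.apply (blocking)

'''
Create a pool of three processes, have them run a function, then shut the pool down
Pool creates the pool, apply runs a task, close and join shut it down
'''
from multiprocessing import Pool
import os, time, random

def worker(msg):
    # The function that the pool processes will run
    time_start = time.time()
    # os.getpid() returns the id of the current (child) process
    # os.getppid() returns the id of the parent process
    print("Task %s started, process id %d" % (msg, os.getpid()))
    time.sleep(random.random() * 2)
    time_end = time.time()
    # Report the running time
    print(msg, "finished, took %0.2f" % (time_end - time_start))

if __name__ == '__main__':
    po = Pool(3)  # create a pool of three processes
    print("Pool started")
    for i in range(3):
        # apply is the blocking way to call the pool:
        # the first argument is the function, the second is a tuple of its arguments.
        # The next task is submitted only after the current one has finished
        po.apply(worker, (i,))
    # Close the pool; after close() it accepts no new tasks
    po.close()
    # Wait for all child processes to finish; join() must come after close()
    po.join()
    print("Pool finished")

'''Create -> apply -> close -> join'''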

Threads: producer and consumer with a FIFO queue

import threading  # thread library
import time
from queue import Queue  # FIFO queue

class Producer(threading.Thread):
    # Subclass of Thread that overrides run()
    def run(self):
        global queue
        count = 0
        while True:
            if queue.qsize() < 1000:
                for i in range(100):
                    count = count + 1
                    msg = 'produced item ' + str(count)
                    queue.put(msg)  # add an element to the queue
                    print(msg)
            time.sleep(1)

class Consumer(threading.Thread):
    # Subclass of Thread that overrides run()
    def run(self):
        global queue
        while True:
            if queue.qsize() > 100:
                for i in range(3):
                    # queue.get() takes an element from the queue
                    msg = self.name + ' consumed ' + queue.get()
                    print(msg)
            time.sleep(1)

if __name__ == '__main__':
    queue = Queue()  # create a queue
    for i in range(500):
        # Put the initial elements into the queue with put()
        queue.put('initial item ' + str(i))
    for i in range(2):
        p = Producer()
        p.start()  # start() runs the Producer's run() method in a new thread
    for i in range(5):
        c = Consumer()
        c.start()
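The example above polls `qsize()` and sleeps. `queue.Queue` can also do the waiting itself: `put` blocks while a bounded queue is full and `get` blocks while it is empty. The sketch below is a variation on that idea, not the author's program; the sentinel value, the function names, and the single producer and consumer are assumptions that keep the run finite and checkable.

```python
import queue
import threading

SENTINEL = None   # special value telling the consumer that production is over

def producer(q, n_items):
    for i in range(n_items):
        q.put("product-%d" % i)   # blocks while the queue is full
    q.put(SENTINEL)

def consumer(q, consumed):
    while True:
        item = q.get()            # blocks while the queue is empty
        if item is SENTINEL:
            break
        consumed.append(item)

def run(n_items=20, maxsize=5):
    q = queue.Queue(maxsize=maxsize)   # bounded FIFO queue
    consumed = []
    threads = [
        threading.Thread(target=producer, args=(q, n_items)),
        threading.Thread(target=consumer, args=(q, consumed)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed

print(run())
```

With one producer and one consumer, the items come out in the order they went in, which is what FIFO means.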

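GIL: the simplest example

# Running the busy loops in separate processes avoids the GIL,
# because each process has its own interpreter
import multiprocessing

def deadLoop():
    while True:
        print("Hello")

if __name__ == '__main__':
    # Endless loop in a child process
    p1 = multiprocessing.Process(target=deadLoop)
    p1.start()
    # Endless loop in the main process
    deadLoop()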

Processes: folder copier with multiprocessing

import multiprocessing
import os
import time
import random

def copy_file(queue, file_name, source_folder_name, dest_folder_name):
    f_read = open(source_folder_name + "/" + file_name, "rb")
    # Write into the destination folder; opening the source path with "wb"
    # would truncate the file being copied
    f_write = open(dest_folder_name + "/" + file_name, "wb")
    while True:
        time.sleep(random.random())
        content = f_read.read(1024)
        if content:
            f_write.write(content)
        else:
            break
    f_read.close()
    f_write.close()
    # Report the name of the file that has finished copying
    queue.put(file_name)

def main():
    # Ask for the folder to copy
    source_folder_name = input("Enter the name of the folder to copy: ")
    # Name of the destination folder
    dest_folder_name = source_folder_name + "_copy"
    # Create the destination folder
    try:
        os.mkdir(dest_folder_name)
    except FileExistsError:
        pass
    # Names of all entries in the source folder
    file_names = os.listdir(source_folder_name)
    # Create a Queue that can be shared with pool processes
    queue = multiprocessing.Manager().Queue()
    # Create the process pool
    pool = multiprocessing.Pool(3)
    for file_name in file_names:
        # Add a task to the pool; apply_async does not block
        pool.apply_async(copy_file, args=(queue, file_name, source_folder_name, dest_folder_name))
    pool.close()

    # The main process shows the progress
    all_file_num = len(file_names)
    while True:
        file_name = queue.get()
        if file_name in file_names:
            file_names.remove(file_name)
        copy_rate = (all_file_num - len(file_names)) * 100 / all_file_num
        print("\r%.2f...(%s)" % (copy_rate, file_name) + " " * 50, end="")
        if copy_rate >= 100:
            break
    print()

if __name__ == "__main__":
    main()
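A quick way to confirm that a copier writes to the destination and leaves the source intact is to run it on a temporary folder. The sketch below deliberately swaps the process pool for `concurrent.futures.ThreadPoolExecutor` and uses `shutil.copyfile`, so it illustrates the check rather than the multiprocessing program above; `copy_folder` and `demo` are hypothetical names.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

def copy_folder(source_folder, dest_folder, workers=3):
    """Copy every regular file from source_folder into dest_folder."""
    os.makedirs(dest_folder, exist_ok=True)
    names = [n for n in os.listdir(source_folder)
             if os.path.isfile(os.path.join(source_folder, n))]

    def copy_one(name):
        shutil.copyfile(os.path.join(source_folder, name),
                        os.path.join(dest_folder, name))
        return name

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sorted(pool.map(copy_one, names))

def demo():
    """Copy two small files and return (copied names, source bytes, copy bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "src")
        dest = os.path.join(tmp, "src_copy")
        os.mkdir(source)
        for name, data in [("a.txt", b"alpha"), ("b.txt", b"beta")]:
            with open(os.path.join(source, name), "wb") as f:
                f.write(data)
        copied = copy_folder(source, dest)
        with open(os.path.join(source, "a.txt"), "rb") as f:
            source_a = f.read()
        with open(os.path.join(dest, "a.txt"), "rb") as f:
            dest_a = f.read()
        return copied, source_a, dest_a

print(demo())
```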

線程_multiprocessing異步 from multiprocessing import Pool import time import os def test(): print("---進程池中的進程---pid=%d,ppid=%d--"%(os.getpid(),os.getppid())) for i in range(3): print("----%d---"%i) time.sleep(1) return "hahah" def test2(args): print("---callback func--pid=%d"%os.getpid()) print("---callback func--args=%s"%args) if __name__ == '__main__': pool = Pool(3) pool.apply_async(func=test,callback=test2) # 異步執行 time.sleep(5) print("----主進程-pid=%d----"%os.getpid())

線程_Process實例 from multiprocessing import Process import os from time import sleep def run_proc(name,age,**kwargs): for i in range(10): print("子進程運行中,名字爲 = %s,年齡爲 = %d,子進程 = %d..."%(name,age,os.getpid())) print(kwargs) sleep(0.5) if __name__ == '__main__': print("父進程: %d"%(os.getpid())) pro = Process(target=run_proc,args=('test',18),kwargs={'kwargs':20}) print("子進程將要執行") pro.start( ) sleep(1) pro.terminate()#將進程進行終止 pro.join() print("子進程已結束")

from multiprocessing import Process
import time
import os

# Two functions, each run by its own child process
def work_1(interval):
    # interval is the time to sleep, in seconds
    print("work_1, parent process (%s), current process (%s)" % (os.getppid(), os.getpid()))
    start_time = time.time()
    time.sleep(interval)
    end_time = time.time()
    print("work_1, elapsed time: %f" % (end_time - start_time))

def work_2(interval):
    print("work_2, parent process (%s), current process (%s)" % (os.getppid(), os.getpid()))
    start_time = time.time()
    time.sleep(interval)  # use the argument instead of a hard-coded 2 seconds
    end_time = time.time()
    print("work_2, elapsed time: %.2f" % (end_time - start_time))

if __name__ == '__main__':
    print("Process id:", os.getpid())
    pro1 = Process(target=work_1, args=(2,))
    pro2 = Process(target=work_2, name="pro2", args=(3,))
    pro1.start()
    pro2.start()
    print("pro2.is_alive:%s" % (pro2.is_alive()))
    print("pro1.name:", pro1.name)
    print("pro1.pid=%s" % pro1.pid)
    print("pro2.name=%s" % pro2.name)
    print("pro2.pid=%s" % pro2.pid)
    pro1.join()
    print("pro1.is_alive:", pro1.is_alive())

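Threads_Process basic syntax

"""
Process([group[, target[, name[, args[, kwargs]]]]])
    group:  rarely needed
    target: the callable that this process instance runs, target=function_name
    name:   an alias for this process instance
    args:   tuple of positional arguments for the target, args=(argument,)
    kwargs: dict of keyword arguments for the target
"""
"""
Commonly used methods:
    is_alive():      reports whether the process instance is still running
    join([timeout]): waits until the process instance finishes, or for at most timeout seconds
    start():         starts the process instance (creates the child process)
    run():           if no target was given, calling start() on the object executes its run() method
    terminate():     stops the process immediately, whether or not its task has finished
"""
"""
Commonly used attributes:
    name: alias of this process instance, Process-N by default, with N starting at 1
    pid:  PID of this process instance
"""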

線程_ThreadLocal

import threading # 建立ThreadLocal對象 house = threading.local() def process_paper(): user = house.user print("%s是房子的主人,in %s"%(user,threading.current_thread().name)) def process_thread(user): house.user = user process_paper() t1 = threading.Thread(target=process_thread,args=('Xiaoming',),name='佳木斯') t2 = threading.Thread(target=process_thread,args=('Hany',),name='哈爾濱') t1.start() t1.join() t2.start() t2.join()

線程_互斥鎖_Lock及fork建立子進程 """ 建立鎖 mutex = threading.Lock() 鎖定 mutex.acquire([blocking]) 當blocking爲True時，當前線程會阻塞，直到獲取到這個鎖爲止 默認爲True 當blocking爲False時，當前線程不會阻塞 釋放 mutex.release() """ from threading import Thread,Lock g_num = 0 def test1(): global g_num for i in range(100000): mutexFlag = mutex.acquire(True)#經過全局變量進行調用函數 # True會發生阻塞,直到結束獲得鎖爲止 if mutexFlag: g_num += 1 mutex.release() print("test1--g_num = %d"%(g_num)) def test2(): global g_num for i in range(100000): mutexFlag = mutex.acquire(True) if mutexFlag: g_num += 1 mutex.release() print("----test2---g_num = %d "%(g_num)) mutex = Lock() p1 = Thread(target=test1,) # 開始進程 p1.start() p2 = Thread(target=test2,) p2.start() print("----g_num = %d---"%(g_num))

fork建立子進程 import os # fork()在windows下不可用 pid = os.fork()#返回兩個值 # 操做系統建立一個新的子進程，複製父進程的信息到子進程中 # 而後父進程和子進程都會獲得一個返回值，子進程爲0，父進程爲子進程的id號 if pid == 0: print("哈哈1") else: print("哈哈2")

線程_gevent實現多個視頻下載及併發下載

from gevent import monkey import gevent import urllib.request #有IO操做時,使用patch_all自動切換 monkey.patch_all() def my_downLoad(file_name, url): print('GET: %s' % url) resp = urllib.request.urlopen(url) # 使用庫打開網頁 data = resp.read() with open(file_name, "wb") as f: f.write(data) print('%d bytes received from %s.' % (len(data), url)) gevent.joinall([ gevent.spawn(my_downLoad, "1.mp4", 'http://oo52bgdsl.bkt.clouddn.com/05day-08-%E3%80%90%E7%90%86%E8%A7%A3%E3%80%91%E5%87%BD%E6%95%B0%E4%BD%BF%E7%94%A8%E6%80%BB%E7%BB%93%EF%BC%88%E4%B8%80%EF%BC%89.mp4'), gevent.spawn(my_downLoad, "2.mp4", 'http://oo52bgdsl.bkt.clouddn.com/05day-03-%E3%80%90%E6%8E%8C%E6%8F%A1%E3%80%91%E6%97%A0%E5%8F%82%E6%95%B0%E6%97%A0%E8%BF%94%E5%9B%9E%E5%80%BC%E5%87%BD%E6%95%B0%E7%9A%84%E5%AE%9A%E4%B9%89%E3%80%81%E8%B0%83%E7%94%A8%28%E4%B8%8B%29.mp4'), ])

from gevent import monkey import gevent import urllib.request # 有耗時操做時須要 monkey.patch_all() def my_downLoad(url): print('GET: %s' % url) resp = urllib.request.urlopen(url) data = resp.read() print('%d bytes received from %s.' % (len(data), url)) gevent.joinall([ gevent.spawn(my_downLoad, 'http://www.baidu.com/'), gevent.spawn(my_downLoad, 'http://www.itcast.cn/'), gevent.spawn(my_downLoad, 'http://www.itheima.com/'), ])

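Threads_gevent: automatic switching between coroutines

import gevent

def f(n):
    for i in range(n):
        # gevent.getcurrent() returns the greenlet (coroutine) that is currently running, not a process
        print(gevent.getcurrent(), i)

g1 = gevent.spawn(f, 3)  # function, then the argument passed to it
g2 = gevent.spawn(f, 4)
g3 = gevent.spawn(f, 5)
g1.join()
g2.join()
g3.join()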

import gevent def f(n): for i in range(n): print (gevent.getcurrent(), i) #用來模擬一個耗時操做，注意不是time模塊中的sleep gevent.sleep(1) g1 = gevent.spawn(f, 2) g2 = gevent.spawn(f, 3) g3 = gevent.spawn(f, 4) g1.join() g2.join() g3.join()

import gevent import random import time def coroutine_work(coroutine_name): for i in range(10): print(coroutine_name, i) time.sleep(random.random()) gevent.joinall([ # 添加能夠切換的協程 gevent.spawn(coroutine_work, "work0"), gevent.spawn(coroutine_work, "work1"), gevent.spawn(coroutine_work, "work2") ])

from gevent import monkey import gevent import random import time # 有耗時操做時須要 monkey.patch_all()#自動切換協程 # 將程序中用到的耗時操做的代碼，換爲gevent中本身實現的模塊 def coroutine_work(coroutine_name): for i in range(10): print(coroutine_name, i) time.sleep(random.random()) gevent.joinall([ gevent.spawn(coroutine_work, "work"), gevent.spawn(coroutine_work, "work1"), gevent.spawn(coroutine_work, "work2") ])

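Threads_Starting a child process with multiprocessing, and subclassing Process

from multiprocessing import Process
import os

# Function executed by the child process
def run_proc(name):
    print("Child process running, name: %s, pid: %d..." % (name, os.getpid()))

if __name__ == "__main__":
    # os.getpid() returns the ID of the current process
    print("Parent process: %d..." % (os.getpid()))
    # target=function_name, args=(argument,)
    pro = Process(target=run_proc, args=('test',))
    print("Child process is about to start")
    pro.start()  # start the child process
    pro.join()   # wait for the child process to finish
    print("Child process finished...")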

from multiprocessing import Process import time import os # 繼承Process類 class Process_Class(Process): def __init__(self,interval): Process.__init__(self) self.interval = interval # 重寫Process類的run方法 def run(self): print("我是類中的run方法") print("子進程(%s),開始執行,父進程爲(%s)"%(os.getpid(),os.getppid())) start_time = time.time() time.sleep(2) end_time = time.time() print("%s執行時間爲:%.2f秒" % (os.getpid(),end_time-start_time)) if __name__ == '__main__': start_time = time.time() print("當前進程爲:(%s)"%(os.getpid())) pro1 = Process_Class(2) # 對一個不包含target屬性的Process類執行start()方法， # 會運行這個類中的run()方法，因此這裏會執行p1.run() pro1.start() pro1.join() end_time = time.time() print("(%s)執行結束，耗時%0.2f" %(os.getpid(),end_time - start_time))

線程_共享全局變量(全局變量在主線程和子線程中不一樣)

from threading import Thread import time g_num = 100 def work1(): global g_num for i in range(3): g_num += 1 print("----在work1函數中,g_num 是 %d "%(g_num)) def work2(): global g_num print("在work2中，g_num爲 %d "%(g_num)) if __name__ == '__main__': print("---線程建立以前 g_num 是 %d"%(g_num)) t1 = Thread(target=work1) t1.start() t2 = Thread(target=work2) t2.start()

線程_多線程_列表當作實參傳遞到線程中

from threading import Thread def work1(nums): nums.append('a') print('---在work1中---',nums) def work2(nums): print("-----在work2中----,",nums) if __name__ == '__main__': g_nums = [1,2,3] t1 = Thread(target=work1,args=(g_nums,)) # target函數,args參數 t1.start() t2 = Thread(target=work2,args=(g_nums,)) t2.start()

線程_threading合集

# 主線程等待全部子線程結束才結束 import threading from time import sleep,ctime def sing(): for i in range(3): print("正在唱歌---%d"%(i)) sleep(2) def dance(): for i in range(3): print("正在跳舞---%d" % (i)) sleep(2) if __name__ == '__main__': print("----開始----%s"%(ctime())) t_sing = threading.Thread(target=sing) t_dance = threading.Thread(target=dance) t_sing.start() t_dance.start() print("----結束----%s"%(ctime())) #查看線程數量 import threading from time import sleep,ctime def sing(): for i in range(3): print("正在唱歌---%d"%i) sleep(1) def dance(): for i in range(3): print("正在跳舞---%d"%i) sleep(i) if __name__ == '__main__': t_sing = threading.Thread(target=sing) t_dance = threading.Thread(target=dance) t_sing.start() t_dance.start() while True: length = len(threading.enumerate()) print("當前運行的線程數爲:%d"%(length)) if length<= 1: break sleep(0.5) import threading import time class MyThread(threading.Thread): # 重寫 構造方法 def __init__(self, num, sleepTime): threading.Thread.__init__(self) self.num = num # 類實例不一樣，num值不一樣 self.sleepTime = sleepTime def run(self): self.num += 1 time.sleep(self.sleepTime) print('線程(%s),num=%d' % (self.name, self.num)) if __name__ == '__main__': mutex = threading.Lock() t1 = MyThread(100, 3) t1.start() t2 = MyThread(200, 1) t2.start() import threading from time import sleep g_num = 1 def test(sleepTime): num = 1 #num爲局部變量 sleep(sleepTime) num += 1 global g_num #g_num爲全局變量 g_num += 1 print('---(%s)--num=%d --g_num=%d' % (threading.current_thread(), num,g_num)) t1 = threading.Thread(target=test, args=(3,)) t2 = threading.Thread(target=test, args=(1,)) t1.start() t2.start() import threading import time class MyThread1(threading.Thread): def run(self): if mutexA.acquire(): print("A上鎖了") mutexA.release() time.sleep(2) if mutexB.acquire(): print("B上鎖了") mutexB.release() mutexA.release() class MyThread2(threading.Thread): def run(self): if mutexB.acquire(): print("B上鎖了") mutexB.release() time.sleep(2) if mutexA.acquire(): print("A上鎖了") mutexA.release() mutexB.release() # 
先看B是否上鎖，而後看A是否上鎖 mutexA = threading.Lock() mutexB = threading.Lock() if __name__ == "__main__": t1 = MyThread1() t2 = MyThread2() t1.start() t2.start() 多線程threading的執行順序(不肯定) # 只能保證都執行run函數，不能保證執行順序和開始順序 import threading import time class MyThread(threading.Thread): def run(self): for i in range(3): time.sleep(1) msg = "I'm "+self.name+' @ '+str(i) print(msg) def test(): for i in range(5): t = MyThread() t.start() if __name__ == '__main__': test() 多線程threading的注意點 import threading import time class MyThread(threading.Thread): # 重寫threading.Thread類中的run方法 def run(self): for i in range(3):#開始線程以後循環三次 time.sleep(1) msg = "I'm "+self.name+'@'+str(i) # name屬性是當前線程的名字 print(msg) if __name__ == '__main__': t = MyThread()#使用threading.Thread的繼承類 t.start()#繼承線程以後要開始運行 start方法

線程_進程間通訊Queue合集

# Queue的工做原理 from multiprocessing import Queue q = Queue(3)#初始化一個Queue對象，最多可接收3條put消息 q.put("Info1") q.put("Info2") print("q是否滿了",q.full())#查看q是否滿了 q.put("Info3") print("q是否滿了",q.full()) try: q.put_nowait("info4") except: print("消息列隊已經滿了,現有消息數量爲:%s"%(q.qsize())) # 使用q.qsize()查看數量 # 先驗證是否滿了，再寫入 if not q.full(): q.put_nowait("info4") # 讀取信息時，先判斷消息列隊是否爲空，再讀取 if not q.empty(): print("開始讀取") for i in range(q.qsize()): print(q.get_nowait())

from multiprocessing import Queue from multiprocessing import Process import os,time,random def write(q): for value in ['a','b','c']: print("Put %s to q ..."%(value)) q.put(value) time.sleep(random.random()) def read(q): while True: if not q.empty(): value = q.get(True) print("Get %s from Queue..."%(value)) time.sleep(random.random()) else: break if __name__ == '__main__': #父進程建立Queue，傳給各個子進程 q = Queue() pw = Process(target=write,args=(q,)) pr = Process(target=read,args=(q,)) pw.start() # 等待pw結束 pw.join() pr.start() pr.join() print("數據寫入讀寫完成")

from multiprocessing import Manager, Pool
import os, time, random

# reader prints its own PID and its parent's PID, then everything currently in q
def reader(q):
    print("reader started, child process: %s, parent process: %s" % (os.getpid(), os.getppid()))
    for i in range(q.qsize()):
        print("Got from queue: %s" % (q.get(True)))

def writer(q):
    print("writer started, child process: %s, parent process: %s" % (os.getpid(), os.getppid()))
    for i in "HanYang":  # the data to write into q
        q.put(i)

if __name__ == '__main__':
    print("%s started" % (os.getpid()))
    q = Manager().Queue()    # a pool needs the queue provided by multiprocessing.Manager()
    po = Pool()              # create a process pool
    po.apply(writer, (q,))   # apply() blocks until the call returns
    po.apply(reader, (q,))
    po.close()               # stop accepting new tasks
    po.join()                # wait for the workers to exit
    print("(%s) finished" % (os.getpid()))

線程_進程池

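from multiprocessing import Pool
import os, time, random

def worker(msg):
    start_time = time.time()
    print("(%s) started, pid (%s)" % (msg, os.getpid()))
    time.sleep(random.random() * 2)
    end_time = time.time()
    print(msg, "(%s) finished, elapsed time: %.2f" % (os.getpid(), end_time - start_time))

if __name__ == '__main__':
    po = Pool(3)  # a process pool with at most 3 worker processes
    for i in range(0, 6):
        # Arguments: the function, then the tuple of arguments passed to it.
        # Each task goes to an idle worker, so up to three tasks run at any moment.
        po.apply_async(worker, (i,))
    print("---start---")
    po.close()
    po.join()
    print("---end---")

"""
Commonly used functions of multiprocessing.Pool:
    apply_async(func[, args[, kwds]]):
        calls func without blocking, so tasks run in parallel;
        args is the tuple of positional arguments for func,
        kwds is the dict of keyword arguments for func
    apply(func[, args[, kwds]]):
        calls func and blocks: the next task starts only after the previous one has finished
    close():
        closes the pool so that it accepts no new tasks
    terminate():
        stops immediately, whether or not the tasks have finished
    join():
        blocks the main process until the worker processes exit
        Note: must be called after close() or terminate()
"""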

線程_可能發生的問題

from threading import Thread g_num = 0 def test1(): global g_num for i in range(1000000): g_num += 1 print("---test1---g_num=%d"%g_num) def test2(): global g_num for i in range(1000000): g_num += 1 print("---test2---g_num=%d"%g_num) p1 = Thread(target=test1) p1.start() # time.sleep(3) p2 = Thread(target=test2) p2.start() print("---g_num=%d---"%g_num)
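The example above can lose updates because `g_num += 1` is not atomic, and it prints the final value without waiting for the threads. A minimal sketch of one common fix follows, assuming a `threading.Lock` around each update and a `join()` before reading the result. The names `counter`, `counter_lock` and `add_many` are illustrative.

```python
from threading import Thread, Lock

counter = 0
counter_lock = Lock()


def add_many(n):
    global counter
    for _ in range(n):
        with counter_lock:  # acquire before the update, release right after it
            counter += 1


workers = [Thread(target=add_many, args=(100000,)) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()                # wait for both threads before reading the total

print(counter)
```

The `with` statement releases the lock even if the body raises, which is why it is usually preferred over explicit `acquire()` and `release()` calls.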

內存泄漏 import gc class ClassA(): def __init__(self): print('對象產生 id:%s'%str(hex(id(self)))) def f2(): while True: c1 = ClassA() c2 = ClassA() c1.t = c2#引用計數變爲2 c2.t = c1 del c1#引用計數變爲1 0才進行回收 del c2 #把python的gc關閉 gc.disable() f2()

== 和 is 的區別

import copy a = ['a','b','c'] b = a #b和a引用自同一塊地址空間 print("a==b :",a==b) print("a is b :",a is b) c = copy.deepcopy(a)# 對a進行深拷貝 print("a的id值爲:",id(a)) print("b的id值爲:",id(b)) print("c的id值爲:",id(c))#深拷貝，不一樣地址 print("a==c :",a==c) print("a is c :",a is c) """ is 是比較兩個引用是否指向了同一個對象（引用比較）。 == 是比較兩個對象是否相等。 """ ''' a==b : True a is b : True a的id值爲: 2242989720448 b的id值爲: 2242989720448 c的id值爲: 2242989720640 a==c : True a is c : False '''
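For user-defined classes, `==` only compares values when the class defines `__eq__`; `is` always compares identity. The sketch below illustrates this with a hypothetical `Point` class that is not part of the original post.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        # Two points are equal when their coordinates match
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


p1 = Point(1, 2)
p2 = Point(1, 2)   # a separate object with the same coordinates
p3 = p1            # a second name for the same object

print(p1 == p2, p1 is p2)
print(p1 == p3, p1 is p3)
```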

如下爲類的小例子

__getattribute__小例子 class student(object): def __init__(self,name=None,age=None): self.name = name self.age = age def __getattribute__(self, item):#getattribute方法修改類的屬性 if item == 'name':#若是爲name屬性名 print("XiaoMing被我攔截住了") return "XiaoQiang " #返回值修改了name屬性 else: return object.__getattribute__(self,item) def show(self): print("姓名是: %s" %(self.name)) stu_one = student("XiaoMing",22) print("學生姓名爲:",stu_one.name) print("學生年齡爲:",stu_one.age) ''' XiaoMing被我攔截住了 學生姓名爲: XiaoQiang 學生年齡爲: 22 '''

Understanding the __new__ method

class Foo(object):
    def __init__(self, *args, **kwargs):
        pass

    def __new__(cls, *args, **kwargs):
        # In Python 3, object.__new__ takes only the class here,
        # so the extra arguments are not forwarded to it
        return object.__new__(cls)
        # When called on Foo itself, the return above is equivalent to
        # return object.__new__(Foo)

class Child(Foo):
    def __new__(cls, *args, **kwargs):
        return object.__new__(cls)

class Round2Float(float):
    def __new__(cls, num):
        num = round(num, 2)
        # Pass cls rather than the hard-coded class name so that subclasses also work
        obj = float.__new__(cls, num)
        return obj

f = Round2Float(4.324599)
print(f)
'''Deriving from an immutable type'''

ctime使用及datetime簡單使用

from time import ctime,sleep def Clock(func): def clock(): print("如今是:",ctime()) func() sleep(3) print("如今是:",ctime()) return clock @Clock def func(): print("函數計時") func()

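import datetime

now = datetime.datetime.now()  # current local time
# Format codes: %Y four-digit year, %y two-digit year, %m month, %d day, %H hour, %M minute, %S second
# The result is stored in stamp rather than str, so the built-in str is not shadowed
stamp = now.strftime("%Y-%m-%d-%H-%M-%S")
print(stamp)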

functools函數中的partial函數及wraps函數

''' partial引用函數，並增長形參 ''' import functools def show_arg(*args,**kwargs): print("args",args) print("kwargs",kwargs) q = functools.partial(show_arg,1,2,3)#1,2,3爲默認值 # functools.partial(函數，形式參數) q()#至關於將show_arg改寫一下，而後換一個名字 q(4,5,6)#沒有鍵值對，kwargs爲空 q(a='python',b='Hany') # 增長默認參數 w = functools.partial(show_arg,a = 3,b = 'XiaoMing')#a = 3,b = 'XiaoMing'爲默認值 w()#當沒有值時，輸出默認值 w(1,2) w(a = 'python',b = 'Hany')

import functools def note(func): "note function" @functools.wraps(func) #使用wraps函數消除test函數使用@note裝飾器產生的反作用 .__doc__名稱 改變 def wrapper(): "wrapper function" print('note something') return func() return wrapper @note def test(): "test function" print('I am test') test() print(test.__doc__)
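To make the effect of `functools.wraps` visible, the sketch below decorates two functions, once without and once with `wraps`, and compares their `__name__` and `__doc__`. The decorator names `plain` and `wrapped` are illustrative.

```python
import functools


def plain(func):
    def wrapper(*args, **kwargs):
        "wrapper function"
        return func(*args, **kwargs)
    return wrapper


def wrapped(func):
    @functools.wraps(func)  # copies __name__, __doc__ and other metadata from func
    def wrapper(*args, **kwargs):
        "wrapper function"
        return func(*args, **kwargs)
    return wrapper


@plain
def a():
    "doc of a"


@wrapped
def b():
    "doc of b"


print(a.__name__, "|", a.__doc__)  # metadata of the inner wrapper
print(b.__name__, "|", b.__doc__)  # metadata of the decorated function
```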

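Commonly used functions of the gc module

1. gc.set_debug(flags): sets the debugging flags of the collector, usually gc.DEBUG_LEAK.
2. gc.collect([generation]): runs a garbage collection explicitly. With argument 0 only the first generation is examined, with 1 the first and second generations, with 2 all three generations. Without an argument a full collection runs, which is the same as passing 2. Returns the number of unreachable objects found.
3. gc.get_threshold(): returns the thresholds that control how often automatic collection runs.
4. gc.set_threshold(threshold0[, threshold1[, threshold2]]): sets the thresholds that control how often automatic collection runs.
5. gc.get_count(): returns the current collection counters as a tuple of length 3.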
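A short sketch of `gc.collect()` reclaiming a reference cycle follows. It assumes CPython, where reference counting alone cannot free the cycle. The `Node` class and the `probe` weak reference are illustrative, and automatic collection is disabled briefly so that the explicit call is the one that finds the cycle.

```python
import gc
import weakref


class Node:
    pass


gc.collect()               # start from a clean state
gc.disable()               # keep automatic collection out of the way for the demo

a, b = Node(), Node()
a.other, b.other = b, a    # build a reference cycle
probe = weakref.ref(a)     # lets us observe whether the object is still alive

del a, b                   # the cycle keeps both reference counts above zero
alive_before = probe() is not None

found = gc.collect()       # full collection; returns the number of unreachable objects
alive_after = probe() is not None

gc.enable()
print(alive_before, found, alive_after)
```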

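hashlib hashing (MD5)

# import hashlib
# digest = hashlib.md5()       # create a hash object; MD5 is a message-digest algorithm that produces a 128-bit digest
# print(digest)
# # digest.update(data) feeds bytes into the hash object
# print(digest.hexdigest())    # the digest as a hexadecimal string

import hashlib
import datetime

KEY_VALUE = 'XiaoLiu'
now = datetime.datetime.now()
m = hashlib.md5()  # create an MD5 hash object
# strftime formats the date; the variable is named text so the built-in str is not shadowed
text = "%s%s%s" % (KEY_VALUE, " ", now.strftime("%Y-%m-%d"))
m.update(text.encode('UTF-8'))
value = m.hexdigest()  # output as hexadecimal
print(text, value)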

__slots__屬性 使用__slots__時,子類不受影響 class Person(object): __slots__ = ("name","age") def __str__(self): return "姓名:%s,年齡:%d"%(self.name,self.age) p = Person() class man(Person): pass m = man() m.score = 78 print(m.score) 使用__slots__限制類添加的屬性 class Person(object): __slots__ = ("name","age") def __str__(self): return "姓名:%s,年齡:%d"%(self.name,self.age) p = Person() p.name = "Xiaoming" p.age = 15 print(p) try: p.score = 78 except AttributeError : print(AttributeError)

Using isinstance to tell iterables from iterators

# Iterable and Iterator live in collections.abc;
# importing them from collections no longer works in Python 3.10 and later
from collections.abc import Iterable

print(isinstance([], Iterable))
print(isinstance({}, Iterable))
print(isinstance((), Iterable))
print(isinstance('abc', Iterable))
print(isinstance('100', Iterable))
print(isinstance((x for x in range(10)), Iterable))
'''
True
True
True
True
True
True
'''

from collections.abc import Iterator

print(isinstance([], Iterator))
print(isinstance('abc', Iterator))
print(isinstance((), Iterator))
print(isinstance({}, Iterator))
print(isinstance(123, Iterator))
print(isinstance(5+2j, Iterator))
print(isinstance((x for x in range(10)), Iterator))  # a generator is an iterator
'''
False
False
False
False
False
False
True
'''
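The results above follow from one distinction: an iterable can produce an iterator through `iter()`, and an iterator is the object that `next()` advances. A brief sketch using a list, with illustrative variable names:

```python
from collections.abc import Iterable, Iterator

nums = [1, 2, 3]
it = iter(nums)  # ask the iterable for an iterator

print(isinstance(nums, Iterable), isinstance(nums, Iterator))  # a list is iterable but not an iterator
print(isinstance(it, Iterable), isinstance(it, Iterator))      # an iterator is both

print(next(it), next(it), next(it))  # one more next(it) would raise StopIteration
```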

metaclass 攔截類的建立,並返回

def upper_attr(future_class_name, future_class_parents, future_class_attr): #遍歷屬性字典，把不是__開頭的屬性名字變爲大寫 newAttr = {} for name,value in future_class_attr.items():#遍歷字典 if not name.startswith("__"):#若是不是以__開頭 newAttr[name.upper()] = value # 將future_class_attr字典中的鍵大寫，而後賦值 return type(future_class_name, future_class_parents, newAttr)#第三個參數爲新修改好的值(類名，父類，字典) class Foo(object, metaclass=upper_attr): # 使用upper_attr對類中值進行修改 bar = 'bip'#一開始建立Foo類時 print(hasattr(Foo, 'bar'))# hasattr查看Foo類中是否存在bar屬性 print(hasattr(Foo, 'BAR')) f = Foo()#實例化對象 print(f.BAR)#輸出

timeit_Timing list operations

''' The Timer class of the timeit library '''
from timeit import Timer

def test1():
    l = list(range(1000))

def test2():
    l = []
    for i in range(1000):
        l.append(i)

def test3():
    l = []
    for i in range(1000):
        l = l + [i]

def test4():
    l = [i for i in range(1000)]

if __name__ == '__main__':
    # Timer(statement to time, setup statement that imports the function)
    t1 = Timer("test1()", "from __main__ import test1")
    # number is how many times the statement is executed
    print(t1.timeit(number=1000))
    t2 = Timer("test2()", "from __main__ import test2")
    print(t2.timeit(number=1000))
    # The statement must call the function; "test3" without parentheses would only time a name lookup
    t3 = Timer("test3()", "from __main__ import test3")
    print(t3.timeit(number=1000))
    t4 = Timer("test4()", "from __main__ import test4")
    print(t4.timeit(number=1000))

 nonlocal 訪問變量 def counter(start = 0): def incr(): nonlocal start #分別保存每個變量的臨時值、相似yield start += 1 return start return incr c1 = counter(5) print(c1()) c2 = counter(50) print(c2()) # c1 繼續上次,輸出接下來的值 print(c1()) print(c2())

pdb 進行調試

import pdb a = 'aaa' pdb.set_trace( ) b = 'bbb' c = 'ccc' final = a+b+c print(final)

import pdb a = 'aaa' pdb.set_trace( ) b = 'bbb' c = 'ccc' pdb.set_trace() final = a+b+c print(final)

import pdb def combine(s1,s2): pdb.set_trace( ) s3 = s1 + s2 return s3 a = 'aaa' pdb.set_trace( ) b = 'bbb' c = 'ccc' pdb.set_trace( ) final = combine(a,b) print(final)

Using property instead of getter and setter methods

class Days(object):
    def __init__(self):
        self.__days = 0

    @property
    def days(self):  # getter: read through dd.days
        return self.__days

    @days.setter  # setter of the same property, so the function is also named days
    def days(self, days):
        self.__days = days

dd = Days()
print(dd.days)
dd.days = 15  # goes through the setter and updates __days
print(dd.days)
'''
0
15
'''

使用types庫修改函數

import types class ppp: pass p = ppp()#p爲ppp類實例對象 def run(self): print("run函數") r = types.MethodType(run,p) #函數名，類實例對象 r() ''' run函數 '''

type 建立類,賦予類\靜態方法等 類方法 class ObjectCreator(object): pass @classmethod def testClass(cls): cls.temp = 666 print(cls.temp) test = type("Test",(ObjectCreator,),{'testClass':testClass}) t = test() t.testClass()#字典中的鍵 靜態方法 class Test: pass @staticmethod def TestStatic(): print("我是靜態方法----------") t = type('Test_two',(Test,),{"TestStatic":TestStatic}) print(type(t)) print(t.TestStatic) print(t.TestStatic()) class Test: pass def Method(): return "定義了一個方法" test2 = type("Test2",(Test,),{'Method':Method}) # 第一個參數爲類名，第二個參數爲父類(必須是元組類型)， # 第三個參數爲類屬性，不是實例屬性 # print(type(test2)) # print(test2.Method()) print(hasattr(test2,'Method')) # hasattr查看test2是否包含有Method方法

迭代器實現斐波那契數列

class FibIterator(object): """斐波那契數列迭代器""" def __init__(self, n): """ :param n: int, 指明生成數列的前n個數 """ self.n = n # current用來保存當前生成到數列中的第幾個數了 self.current = 0 # num1用來保存前前一個數，初始值爲數列中的第一個數0 self.num1 = 0 # num2用來保存前一個數，初始值爲數列中的第二個數1 self.num2 = 1 def __next__(self): """被next()函數調用來獲取下一個數""" if self.current < self.n: num = self.num1 self.num1, self.num2 = self.num2, self.num1+self.num2 self.current += 1 return num else: raise StopIteration def __iter__(self): """迭代器的__iter__返回自身便可""" return self if __name__ == '__main__': fib = FibIterator(10) for num in fib: print(num, end=" ")
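The same sequence can be written more compactly as a generator function, which implements `__iter__` and `__next__` automatically. This is a sketch for comparison with the class above; the name `fib` is illustrative.

```python
def fib(n):
    """Yield the first n Fibonacci numbers, starting from 0."""
    a, b = 0, 1
    for _ in range(n):
        yield a           # hand back the current number and pause here
        a, b = b, a + b   # advance to the next pair


first_ten = list(fib(10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```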

在( ) 中使用推導式 建立生成器 G = (x*2 for x in range(4)) print(G) print(G.__next__()) print(next(G))#兩種方法等價 # G每一次讀取，指針都會下移 for x in G: print(x,end = " ")

動態給類的實例對象 或 類 添加屬性

class Person(object): def __init__(self,name = None,age = None): self.name = name self.age = age def __str__(self): return "%s 的年齡爲 %d 歲 %s性"%(self.name,self.age,self.sex) pass Xiaoming = Person('小明',20) Xiaoming.sex = '男'#只有Xiaoming對象擁有sex屬性 print(Xiaoming) 小明 的年齡爲 20 歲 男性 class Person(object): def __init__(self,name = None,age = None): self.name = name self.age = age def __str__(self): return "%s 的年齡爲 %d 歲 %s性"%(self.name,self.age,self.sex) Xiaoming = Person('小明',20) Xiaolan = Person('小蘭',19) Person.sex = None #類建立sex默認屬性爲None Xiaolan.sex = '女' print(Xiaoming) print(Xiaolan) 小明 的年齡爲 20 歲 None性 小蘭 的年齡爲 19 歲 女性

線程_同步應用

''' 建立mutex = threading.Lock( ) 鎖定mutex.acquire([blocking]) 釋放mutex.release( ) 建立->鎖定->釋放 ''' from threading import Thread,Lock from time import sleep class Task1(Thread): def run(self): while True: if lock1.acquire(): #對lock1鎖定 print("------Task 1 -----") sleep(0.5) lock2.release() # 釋放lock2鎖的綁定 # 鎖1上鎖，鎖2解鎖 class Task2(Thread): def run(self): while True: if lock2.acquire(): print("------Task 2 -----") sleep(0.5) lock3.release() # 鎖2上鎖，鎖3解鎖 class Task3(Thread): def run(self): while True: if lock3.acquire(): print("------Task 3 -----") sleep(0.5) lock1.release() #使用Lock建立出的鎖默認沒有「鎖上」 lock1 = Lock() #建立另外的鎖，而且上鎖 lock2 = Lock() lock2.acquire() lock3 = Lock() lock3.acquire() t1 = Task1() t2 = Task2() t3 = Task3() t1.start() t2.start() t3.start()

垃圾回收機制_合集

#大整數對象池 b = 1500 a = 1254 print(id(a)) print(id(b)) b = a print(id(b))

a1 = "Hello 垃圾機制" a2 = "Hello 垃圾機制" print(id(a1),id(a2)) del a1 del a2 a3 = "Hello 垃圾機制" print(id(a3))

s = "Hello" print(id (s)) s = "World" print(id (s)) s = 123 print(id (s)) s = 12 print(id (s))

lst1 = [1,2,3] lst2 = [4,5,6] lst1.append(lst2) lst2.append(lst1)#循環進行引用 print(lst1) print(lst2)

class Node(object): def __init__(self,value): self.value = value print(Node(1)) """ 建立一個新對象，python向操做系統請求內存， python實現了內存分配系統， 在操做系統之上提供了一個抽象層 """ print(Node(2))#再次請求,分配內存

import gc class ClassA(): def __init__(self): print('object born,id:%s'%str(hex(id(self)))) def f3(): print("-----0------") # print(gc.collect()) c1 = ClassA() c2 = ClassA() c1.t = c2 c2.t = c1 del c1 del c2 print("gc.garbage:",gc.garbage) print("gc.collect",gc.collect()) #顯式執行垃圾回收 print("gc.garbage:",gc.garbage) if __name__ == '__main__': gc.set_debug(gc.DEBUG_LEAK) #設置gc模塊的日誌 f3()

協程的簡單實現

import time # yield配合next使用 def work1(): while True: print("----work1---") yield time.sleep(0.3) def work2(): while True: print("----work2---") yield time.sleep(0.3) def main(): w1 = work1() w2 = work2() while True: next(w1) next(w2) if __name__ == "__main__": main( )

實現了__iter__和__next__的對象是迭代器

class MyList(object): """自定義的一個可迭代對象""" def __init__(self): self.items = [] def add(self, val): self.items.append(val) def __iter__(self): myiterator = MyIterator(self) return myiterator class MyIterator(object): """自定義的供上面可迭代對象使用的一個迭代器""" def __init__(self, mylist): self.mylist = mylist # current用來記錄當前訪問到的位置 self.current = 0 def __next__(self): if self.current < len(self.mylist.items): item = self.mylist.items[self.current] self.current += 1 return item else: raise StopIteration def __iter__(self): return self if __name__ == '__main__': mylist = MyList() mylist.add(1) mylist.add(2) mylist.add(3) mylist.add(4) mylist.add(5) for num in mylist: print(num)

對類中私有化的理解

class Person(object): def __init__(self,name,age,taste): self.name = name self._age = age self.__taste = taste def showPerson(self): print(self.name) print(self._age) print(self.__taste) def do_work(self): self._work() self.__away() def _work(self): print("_work方法被調用") def __away(self): print("__away方法被調用") class Student(Person): def construction(self,name,age,taste): self.name = name self._age = age self.__taste = taste def showStudent(self): print(self.name) print(self._age) print(self.__taste) @staticmethod def testbug(): _Bug.showbug() class _Bug(Student): @staticmethod def showbug(): print("showbug函數開始運行") s1 = Student('Xiaoming',22,'basketball') s1.showPerson() # s1.showStudent() # s1.construction( ) s1.construction('rose',18,'football') s1.showPerson() s1.showStudent() Student.testbug() ''' Xiaoming 22 basketball rose 18 basketball rose 18 football showbug函數開始運行 '''

Some ways of producing copies

a = "abc"
b = a[:]
print(a, b)          # same value
print(id(a), id(b))  # same address (str is immutable, so CPython returns the same object)

d = dict(name="Xiaoming", age=22)
d_copy = d.copy()
print(d, id(d))
print(d_copy, id(d_copy))  # different addresses (dict is mutable)

q = list(range(10))
q_copy = list(q)
print(q, id(q))  # same value, different addresses (list is mutable)
print(q_copy, id(q_copy))
'''
abc abc
2233026983024 2233026983024
{'name': 'Xiaoming', 'age': 22} 2233146632896
{'name': 'Xiaoming', 'age': 22} 2233027001984
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 2233146658048
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 2233164085888
'''

查看 __class__屬性 查看complex的__class__屬性 a = 5+2j print(a.__class__) print(a.__class__.__class__) ''' <class 'complex'> <class 'type'> ''' 查看int的__class__屬性 a = 123 print(a.__class__) print(a.__class__.__class__) ''' <class 'int'> <class 'type'> ''' 查看str的__class__屬性 a = 'str' print(a.__class__) print(a.__class__.__class__) ''' <class 'str'> <class 'type'> ''' class ObjectCreator(object): pass print(type(ObjectCreator))#輸出類的類型 print(type(ObjectCreator()))#<class '__main__.ObjectCreator'> print(ObjectCreator.__class__)#輸出類的類型 ''' <class 'type'> <class '__main__.ObjectCreator'> <class 'type'> '''

運行過程當中給類添加方法 types.MethodType

class Person(object): def __init__(self,name = None,age = None): self.name = name#類中擁有的屬性 self.age = age def eat (self): print("%s在吃東西"%(self.name)) p = Person("XiaoLiu",22) p.eat()#調用Person中的方法 def run(self,speed):#run方法爲須要添加到Person類中的方法 # run方法 self 給類添加方法，使用self指向該類 print("%s在以%d米每秒的速度在跑步"%(self.name,speed)) run(p,2)#p爲類對象 import types p1= types.MethodType(run,p)#p1只是用來接收的對象,MethodType內參數爲 函數+類實例對象 ，接收以後使用函數都是對類實例對象進行使用的 # 第二個參數不可以使用類名進行調用 p1(2) #p1(2)調用實際上時run(p,2) ''' XiaoLiu在吃東西 XiaoLiu在以2米每秒的速度在跑步 XiaoLiu在以2米每秒的速度在跑步 '''

import types class Person(object): num = 0 #num是一個類屬性 def __init__(self, name = None, age = None): self.name = name self.age = age def eat(self): print("eat food") #定義一個類方法 @classmethod #函數具備cls屬性 def testClass(cls): cls.num = 100 # 類方法對類屬性進行修改，使用cls進行修改 #定義一個靜態方法 @staticmethod def testStatic(): print("---static method----") P = Person("老王", 24) #調用在class中的構造方法 P.eat() #給Person類綁定類方法 Person.testClass = testClass #使用函數名進行引用 #調用類方法 print(Person.num) Person.testClass()#Person.testClass至關於testClass方法 print(Person.num)#驗證添加的類方法是否執行成功，執行成功後num變爲100，類方法中使用cls修改的值 #給Person類綁定靜態方法 Person.testStatic = testStatic#使用函數名進行引用 #調用靜態方法 Person.testStatic() ''' eat food 0 100 ---static method---- '''

查看一個對象的引用計數 a = "Hello World " import sys print("a的引用計數爲:",sys.getrefcount(a)) '''a的引用計數爲: 4'''

淺拷貝和深拷貝

a = [1,2,3,4] print(id(a)) b = a print(id(b)) # 地址相同 a.append('a') print(a) print(b)#b和a的值一致，a改變，b就跟着改變 ''' 2342960103104 2342960103104 [1, 2, 3, 4, 'a'] [1, 2, 3, 4, 'a'] ''' 淺拷貝對不可變類型和可變類型的copy不一樣 import copy a = [1,2,3] b = copy.copy(a) a.append('a') print(a," ",b) print(id(a),id(b)) a = (1,2,3) b = copy.copy(a) print(id(a),id(b)) # 淺拷貝copy.copy()對於可變類型賦予的地址不一樣，對於不可變類型賦予相同地址 ''' [1, 2, 3, 'a'] [1, 2, 3] 2053176165376 2053176165568 2053175778688 2053175778688 ''' 深拷貝 import copy a = [1,2,3,4] print(id(a)) b = copy.deepcopy(a) print(id(b))#地址不一樣 a.append('a') print(a," ",b) # 深拷貝:不跟着拷貝的對象發生變化 ''' 2944424869376 2944424869568 [1, 2, 3, 4, 'a'] [1, 2, 3, 4] '''
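The difference between a shallow and a deep copy only shows up with nested mutable objects, which the flat lists above do not contain. A small sketch with a nested list (variable names are illustrative):

```python
import copy

outer = [[1, 2], [3, 4]]
shallow = copy.copy(outer)   # new outer list, but the inner lists are shared
deep = copy.deepcopy(outer)  # new outer list and new inner lists

outer[0].append(99)          # mutate a shared inner list
outer.append([5])            # change only the original outer list

print(outer)    # [[1, 2, 99], [3, 4], [5]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```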

.format方式輸出星號字典的值是鍵 dic = {'a':123,'b':456} print("{0}:{1}".format(*dic)) # a:b
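Unpacking a dict with a single `*` iterates over its keys, which is why the line above prints the keys. To format the values, `**` unpacking or `items()` can be used instead, as in this sketch (variable names are illustrative):

```python
dic = {'a': 123, 'b': 456}

keys_only = "{0}:{1}".format(*dic)   # * unpacks the keys as positional arguments
by_name = "{a}:{b}".format(**dic)    # ** unpacks key=value pairs as keyword arguments
pairs = ", ".join("{}={}".format(k, v) for k, v in dic.items())

print(keys_only)  # a:b
print(by_name)    # 123:456
print(pairs)      # a=123, b=456
```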

類能夠打印，賦值，做爲實參和實例化 class ObjectCreator(object): pass print(ObjectCreator) # 打印 ObjectCreator.name = 'XiaoLiu' # 對ObjectCreator類增長屬性,之後使用ObjectCreator類時，都具備name屬性 g = lambda x:x # 把函數賦值給對象g g(ObjectCreator) # 將ObjectCreator做爲實參傳遞給剛剛賦值過的g函數 Obj = ObjectCreator() # 賦值給變量 ''' <class '__main__.ObjectCreator'> '''

類能夠在函數中建立，做爲返回值(返回類)

def func_class(string): if string == 'class_one': class class_one: pass return class_one else: class class_two: pass return class_two MyClass = func_class('') print("MyClass爲 " , MyClass) m = MyClass() print("m爲 ",m) ''' MyClass爲 <class '__main__.func_class.<locals>.class_two'> m爲 <__main__.func_class.<locals>.class_two object at 0x000002BC0491B190> '''

查看某一個字符出現的次數

#方法一 import random range_lst = [random.randint(0,100) for i in range(100)] # 建立一個包含有 100 個數據的隨機數 range_set = set(range_lst) # 建立集合，不包含重複元素 for num in range_set: # 對集合進行遍歷，查找元素出現的次數 # list.count(元素) 查看元素在列表中出現的次數 print(num,":",range_lst.count(num))

# 方法二 import random range_lst = [random.randint(0,5) for i in range(10)] range_dict = dict() for i in range_lst: # 默認爲 0 次，若是出現一次就 + 1 range_dict[i] = range_dict.get(i,0) +1 print(range_dict)
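Both methods above can also be expressed with `collections.Counter` from the standard library, which performs the same tallying as the dict-based method. A sketch that counts characters in a fixed string, so the result is reproducible:

```python
from collections import Counter

text = "hello world"
counts = Counter(text)         # maps each character to its number of occurrences

print(counts['l'])             # 3
print(counts['z'])             # 0, missing keys count as zero
print(counts.most_common(2))   # [('l', 3), ('o', 2)]
```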

閉包函數 def test(number): #在函數內部再定義一個函數，而且這個函數用到了外邊函數的變量，那麼將這個函數以及用到的一些變量稱之爲閉包 def test_in(number_in): print("in test_in 函數, number_in is %d"%number_in) return number+number_in#使用到了外部的變量number return test_in #將內部函數做爲返回值 #給test函數賦值，這個20就是給參數number ret = test(20)#ret接收返回值(內部函數test_in) #注意這裏的100其實給參數number_in print(ret(100)) #100+20 print(ret(200)) #200+20 def test1(): print("----test1----") test1() ret = test1#使用對象引用函數，使用函數名進行傳遞 print(id(ret)) # 引用的對象地址和原函數一致 print(id(test1)) ret() ''' ----test1---- 1511342483488 1511342483488 ----test1---- ''' def line_conf(a,b): def line(x): return "%d * %d + %d"%(a,x,b) # 內部函數必定要使用外部函數，才能稱爲閉包函數 return line line_one = line_conf(1,1) # 使用變量進行接收外部函數，而後使用變量進行調用閉包函數中的內部函數 line_two = line_conf(2,3) print(line_one(7)) print(line_two(7)) ''' 1 * 7 + 1 2 * 7 + 3 '''

自定義建立元類

#coding=utf-8 class UpperAttrMetaClass(type): # __new__ 是在__init__以前被調用的特殊方法 # __new__是用來建立對象並返回之的方法 # 而__init__只是用來將傳入的參數初始化給對象 # __new__可以控制對象的建立 # 這裏，建立的對象是類，自定義這個類，咱們這裏改寫__new__ # 若是你但願的話，你也能夠在__init__中作些事情 # 可改寫__call__特殊方法 def __new__(cls, future_class_name, future_class_parents, future_class_attr): # cls、類名、父類、須要修改的字典 #遍歷屬性字典，把不是__開頭的屬性名字變爲大寫 newAttr = {} for key,value in future_class_attr.items(): if not key.startswith("__"): newAttr[key.upper()] = value #使字典的鍵值大寫 # 方法1：經過'type'來作類對象的建立 # return type(future_class_name, future_class_parents, newAttr) # type 類名、父類名、字典(剛剛進行修改的字典) # 方法2：複用type.__new__方法 # 這就是基本的OOP編程，沒什麼魔法 # return type.__new__(cls, future_class_name, future_class_parents, newAttr) # 類名、父類名、字典(剛剛進行修改的字典) # 方法3：使用super方法 return super(UpperAttrMetaClass,cls).__new__(cls, future_class_name, future_class_parents, newAttr) # python3的用法 class Foo(object, metaclass = UpperAttrMetaClass): # metaclass運行類的時候，根據metaclass的屬性。修改類中的屬性 bar = 'bip' # hasattr 查看類中是否具備該屬性 print(hasattr(Foo, 'bar')) # 輸出: False print(hasattr(Foo, 'BAR')) # 輸出:True f = Foo() # 進行構造，產生 f 對象 print(f.BAR) # 輸出:'bip'，metaclass修改了Foo類

class UpperAttrMetaClass(type):
    # __new__ is a special method that is called before __init__
    # __new__ creates the object and returns it
    # __init__ only initializes the object with the arguments passed in
    # You rarely need __new__ unless you want to control how the object is created
    # Here the object being created is a class and we want to customize it, so we override __new__
    # You can also do some work in __init__ if you wish
    # Some advanced uses also override the special method __call__, but we do not need that here
    def __new__(cls, future_class_name, future_class_parents, future_class_attr):
        # Walk the attribute dict and upper-case every name that does not start with __
        newAttr = {}
        for name, value in future_class_attr.items():
            if not name.startswith("__"):
                newAttr[name.upper()] = value
        # Method 1: create the class object with type
        # (the resulting class has type as its metaclass, not UpperAttrMetaClass)
        return type(future_class_name, future_class_parents, newAttr)
        # Method 2: reuse type.__new__
        # return type.__new__(cls, future_class_name, future_class_parents, newAttr)
        # Method 3: use super (commented out here, because it would be unreachable after method 1)
        # return super(UpperAttrMetaClass, cls).__new__(cls, future_class_name, future_class_parents, newAttr)

# Python 3 syntax
class Foo(object, metaclass=UpperAttrMetaClass):
    bar = 'bip'

print(hasattr(Foo, 'bar'))
print(hasattr(Foo, 'BAR'))
f = Foo()
print(f.BAR)

Dijkstra's algorithm (found online)

"""
Input
    graph  the input graph
    src    the source vertex
Returns
    dis    shortest distance from the source to every other vertex
    path   the corresponding paths
"""
import json

def dijkstra(graph, src):
    if graph is None:
        return None
    # Vertex set
    nodes = [i for i in range(len(graph))]  # vertex list; the graph is stored as an adjacency matrix
    # Vertices that have been visited
    visited = []
    visited.append(src)
    # Initialize dis
    dis = {src: 0}  # the distance from the source to itself is 0
    for i in nodes:
        dis[i] = graph[src][i]
    path = {src: {src: []}}  # path from the source to each vertex
    k = pre = src
    while nodes:
        temp_k = k
        mid_distance = float('inf')  # start with an infinite candidate distance
        for v in visited:
            for d in nodes:
                if graph[src][v] != float('inf') and graph[v][d] != float('inf'):  # an edge exists
                    new_distance = graph[src][v] + graph[v][d]
                    if new_distance <= mid_distance:
                        mid_distance = new_distance
                        graph[src][d] = new_distance  # update the distance (this modifies graph in place)
                        k = d
                        pre = v
        if k != src and temp_k == k:
            break  # no new vertex was reached, so the rest are unreachable
        dis[k] = mid_distance  # shortest distance
        path[src][k] = [i for i in path[src][pre]]
        path[src][k].append(k)
        visited.append(k)
        nodes.remove(k)
        print(nodes)  # vertices still to be processed
    return dis, path

if __name__ == '__main__':
    # Directed input graph: an entry is the edge weight, float('inf') when there is
    # no edge, and 0 from a vertex to itself
    graph = [
        [0, float('inf'), 10, float('inf'), 30, 100],
        [float('inf'), 0, 5, float('inf'), float('inf'), float('inf')],
        [float('inf'), float('inf'), 0, 50, float('inf'), float('inf')],
        [float('inf'), float('inf'), float('inf'), 0, float('inf'), 10],
        [float('inf'), float('inf'), float('inf'), 20, 0, 60],
        [float('inf'), float('inf'), float('inf'), float('inf'), float('inf'), 0]]
    dis, path = dijkstra(graph, 0)  # shortest paths from source 0 to the other vertices
    print(dis)
    print(json.dumps(path, indent=4))

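The version above scans every visited/unvisited pair on each round and overwrites the input matrix. For comparison, here is a minimal sketch of the more common priority-queue formulation using heapq, run on the same six-vertex matrix. The names dijkstra_heap and build_path are hypothetical and not part of the original code; the sketch assumes non-negative edge weights.

```python
import heapq

INF = float('inf')


def dijkstra_heap(graph, src):
    """Return (dist, prev) for an adjacency matrix with INF meaning no edge."""
    n = len(graph)
    dist = [INF] * n
    prev = [None] * n  # predecessor of each vertex on its shortest path
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry; a shorter distance was already found
        for v in range(n):
            weight = graph[u][v]
            if v == u or weight == INF:
                continue
            if d + weight < dist[v]:
                dist[v] = d + weight
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev


def build_path(prev, src, dst):
    """Walk the predecessor links back from dst; empty list if unreachable."""
    path = [dst]
    while path[-1] != src:
        parent = prev[path[-1]]
        if parent is None:
            return []
        path.append(parent)
    return path[::-1]


graph = [
    [0, INF, 10, INF, 30, 100],
    [INF, 0, 5, INF, INF, INF],
    [INF, INF, 0, 50, INF, INF],
    [INF, INF, INF, 0, INF, 10],
    [INF, INF, INF, 20, 0, 60],
    [INF, INF, INF, INF, INF, 0],
]

dist, prev = dijkstra_heap(graph, 0)
print(dist)                    # [0, inf, 10, 50, 30, 60]
print(build_path(prev, 0, 5))  # [0, 4, 3, 5]
```

Unlike the matrix-scanning version, this sketch leaves graph unchanged, so the same matrix can be reused for another source vertex.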
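Decorators, part 1

def foo():
    print("foo")

print(foo)  # prints the function object foo, including its address
foo()  # call foo

def foo():
    print("foo2")

foo = lambda x: x + 1  # rebind the name foo to a lambda
print(foo(2))

def w(func):
    def inner():
        # The inner function performs the validation
        print("Validating the function~~~")
        func()  # the inner function calls func, the parameter of the outer function
    return inner  # closure: return the inner function

@w  # decorate fun1 with w
def fun1():
    print("fun1 validated, moving on to the next step")

@w
def fun2():
    print("fun2 validated, moving on to the next step")

fun1()  # the decorator's wrapper runs first, then the function itself
fun2()

def makeBold(fn):
    # fn is the function that the makeBold decorator is applied to
    def wrapped():
        return "<b>" + fn() + "</b>"
    return wrapped

def makeitalic(fn):
    def wrapped():
        return "<i>" + fn() + "</i>"
    return wrapped

@makeBold
def test1():
    return "hello world -1 "

@makeitalic
def test2():
    return "hello world -2 "

@makeBold  # applied second
@makeitalic  # applied first
def test3():
    return "hello world -3 "
# @makeitalic is applied first and then @makeBold: the decorator closest to the function goes first

print(test1())
print(test2())
print(test3())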

Decorators, part 2

# Example 1
from time import ctime, sleep  # importing the names directly means no module prefix is needed

def timefun(func):
    def wrappedfunc():
        print("%s called at %s" % (func.__name__, ctime()))
        func()  # the inner function uses the parameter of the outer function
    return wrappedfunc  # return the inner function

@timefun  # timefun is a closure-based decorator
def foo():  # foo is passed to timefun
    print("I'm foo ")

foo()
sleep(3)
foo()
'''
foo called at Fri May  8 01:00:18 2020
I'm foo
foo called at Fri May  8 01:00:21 2020
I'm foo
'''

# Example 2
from time import ctime, sleep
import functools

def timefun(func):
    @functools.wraps(func)
    def wrappedfunc(a, b):  # takes the same parameters a, b as the decorated function
        print("%s called at %s" % (func.__name__, ctime()))
        print(a, b)
        func(a, b)
    return wrappedfunc

@timefun
def foo(a, b):
    print(a + b)

foo(3, 5)
sleep(2)
foo(2, 4)
'''
foo called at Fri May  8 01:01:16 2020
3 5
8
foo called at Fri May  8 01:01:18 2020
2 4
6
'''

# Example 3
from time import ctime, sleep

def timefun(func):
    def wrappedfunc(*args, **kwargs):
        # *args collects the positional arguments into a tuple
        # **kwargs collects the keyword arguments into a dict
        print("%s called at %s" % (func.__name__, ctime()))
        func(*args, **kwargs)
    return wrappedfunc

@timefun
def foo(a, b, c):
    print(a + b + c)

foo(3, 5, 7)
sleep(2)
foo(2, 4, 9)
'''
foo called at Fri May  8 01:01:38 2020
15
foo called at Fri May  8 01:01:40 2020
15
'''

# Example 4
def timefun(func):
    def wrappedfunc():
        return func()  # the wrapper returns the result of the call (the original function has a return)
    return wrappedfunc

@timefun
def getInfo():
    return '----haha----'

info = getInfo()  # receive the return value
print(info)  # prints the result of getInfo; if the wrapper had no return, this would be None
'''
----haha----
'''

# Example 5
from time import ctime, sleep

def timefun_arg(pre="Hello"):  # pre has a default value
    def timefun(func):
        def wrappedfunc():  # a closure nested inside a closure
            print("%s called at %s" % (func.__name__, ctime()))
            print(pre)
            # func.__name__ is the function name, ctime() the current time
            # Note: func is returned here without being called, so the body of the
            # decorated function never runs; use return func() to run it
            return func
        return wrappedfunc
    return timefun

@timefun_arg("pre for foo")  # pass pre to the outermost function
def foo():
    print("I'm foo")

@timefun_arg("pre for too")
def too():
    print("I'm too")

foo()
sleep(2)
foo()
too()
sleep(2)
too()
'''
foo called at Fri May  8 01:02:34 2020
pre for foo
foo called at Fri May  8 01:02:36 2020
pre for foo
too called at Fri May  8 01:02:36 2020
pre for too
too called at Fri May  8 01:02:38 2020
pre for too
'''
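Example 2 imports functools, which is normally used for functools.wraps. The sketch below shows what that decorator changes; plain, preserving, add and mul are hypothetical names used only for this illustration.

```python
import functools


def plain(func):
    # No functools.wraps: the wrapper hides the metadata of func
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def preserving(func):
    # functools.wraps copies __name__, __doc__ and more onto the wrapper
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain
def add(a, b):
    """Return a + b."""
    return a + b


@preserving
def mul(a, b):
    """Return a * b."""
    return a * b


print(add.__name__, add.__doc__)  # wrapper None
print(mul.__name__, mul.__doc__)  # mul Return a * b.
print(mul.__wrapped__(2, 5))      # 10, the undecorated function is still reachable
```

Both decorators forward the return value, which is the same point Example 4 makes about the wrapper needing its own return.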

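Design patterns: understanding the singleton pattern

Design patterns fall into three categories: structural, behavioral, and creational. The singleton is a creational pattern.
The singleton pattern is mainly used for:
    logging -> write the log messages of several services, in order, to one specific log file
    database operations -> use a single database object so that the data stays consistent
    print spoolers and similar programs
In each case only one instance may exist while the program runs, which avoids conflicting requests for the same resource.
The intent of the singleton pattern:
    Ensure that one and only one object of the class is created.
    Give the object a single access point so that the program can reach it globally.
    Control concurrent access to a shared resource.
A simple way to implement the pattern is to make the constructor private and create a static method that performs the initialization. The object is then created on the first call, and from then on the class returns that same object.
In practice:
1. Only allow the Singleton class to produce one instance.
2. If an instance already exists, hand out the same object again.

class Singletion(object):
    def __new__(cls):
        # Override __new__ to control object creation
        if not hasattr(cls, 'instance'):
            # hasattr tells whether an object has a given attribute
            # Here it checks whether cls has the attribute instance,
            # which records whether the class has already produced an object
            cls.instance = super(Singletion, cls).__new__(cls)
        # When s1 is requested, hasattr finds that the object already exists,
        # so s1 is bound to the existing instance
        return cls.instance

s = Singletion()
# s is created through __new__, which first checks whether the object already exists
print("Object created:", s)
s1 = Singletion()
print("Object created:", s1)
'''
Output:
Object created: <__main__.Singletion object at 0x000001EE59F93340>
Object created: <__main__.Singletion object at 0x000001EE59F93340>
'''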

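Design patterns: lazy instantiation of a singleton

Lazy singleton
When a module is imported, an object may be created unintentionally even though nothing needs it yet.
Lazy instantiation makes sure the object is created only when it is actually needed.
It saves resources by creating them only on demand.

class Singleton:
    _instance = None

    def __init__(self):
        if not Singleton._instance:
            print("__init__ was called; the instance is only created in the getinstance class method")
        else:
            # getinstance has changed the value of _instance
            print("Instance already created", self.getinstance())

    @classmethod
    def getinstance(cls):
        # The creation statement lives in getinstance.
        # If it lived in __init__ and the object were not used right away,
        # resources would be wasted.
        if not cls._instance:
            cls._instance = Singleton()
            # Creating the instance calls __init__ once more;
            # the assignment then changes the state of _instance
        return cls._instance

s = Singleton()
# __init__ was called; the instance is only created in the getinstance class method
print('Created object', Singleton.getinstance())
'''
Only now is the shared object really created.
Output:
__init__ was called; the instance is only created in the getinstance class method
Created object <__main__.Singleton object at 0x00000206AA2436D0>
'''
print(id(s))
# 2227647559520
s1 = Singleton()
# Instance already created <__main__.Singleton object at 0x00000206AA2436D0>
print('Created object', Singleton.getinstance())
# Created object <__main__.Singleton object at 0x00000206AA2436D0>
print(id(s1))
# 2227647561248
# Note that id(s) and id(s1) differ: calling Singleton() directly always builds a
# new object; only getinstance() returns the shared instance

Summary:
Create a class variable _instance = None.
Do not create the object inside __init__.
Create it inside the class method getinstance.
Note: this calls __init__ once.
Creating an object with s = Singleton() calls __init__ one more time.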
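To make the lazy behaviour visible, the following sketch counts how often __init__ runs and only ever goes through the class method. LazySingleton, get_instance and init_calls are hypothetical names for this illustration, and the sketch does not address thread safety.

```python
class LazySingleton:
    _instance = None
    init_calls = 0  # illustration only: how many times __init__ has run

    def __init__(self):
        type(self).init_calls += 1

    @classmethod
    def get_instance(cls):
        # The object is built on the first request and reused afterwards
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


print(LazySingleton._instance)     # None, nothing has been created yet
first = LazySingleton.get_instance()
second = LazySingleton.get_instance()
print(first is second)             # True
print(LazySingleton.init_calls)    # 1
```

Calling LazySingleton() directly would still build a separate object, which is the same behaviour the differing id values above show.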

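View the storage engines MySQL supports

List all current databases

Create a database

Select the database to work with

Drop a database

List the tables in a database

Create a table

Show the structure of a table

Show the statement that created a table

Insert records into a table

Delete records

Update records

Drop a table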

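Define a function outside a class, then call it through an instance of the class

def f(self, x):
    y = x + 3
    return y

class Add:  # create an Add class
    def add(self, a):
        return a + 4
    f1 = f  # make f1 refer to the function f defined outside

n = Add()  # create an instance
print(n.add(4))  # call the method defined inside the class
print(n.f1(4))  # call the method defined outside the class

Output:
8
7

Points to note:
The first parameter of the externally defined function is self.
Create the instance with Add(); do not leave out the parentheses.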

PageRank algorithm

def create(q, graph, N):
    # compute the probability matrix
    L = [[(1 - q) / N] * N for i in range(N)]
    for node, edges in enumerate(graph):
        num_edge = len(edges)
        for each in edges:
            L[each][node] += q / num_edge
    return L

def transform(A):
    # transpose of A
    n, m = len(A), len(A[0])
    new_A = [[A[j][i] for j in range(n)] for i in range(m)]
    return new_A

def mul(A, B):
    # matrix product A * B
    n = len(A)
    m = len(B[0])
    B = transform(B)
    result = [[0] * m for i in range(n)]
    for i in range(n):
        row = A[i]
        for j in range(m):
            col = B[j]
            result[i][j] = sum([row[k] * col[k] for k in range(len(row))])
    return result

def power(A, N):
    # A raised to the N-th power by repeated squaring
    n = len(A)
    assert(len(A[0]) == n)
    final_ans, temp = A, A
    N -= 1
    while N > 0:
        if N & 1:
            final_ans = mul(final_ans, temp)
        temp = mul(temp, temp)
        N >>= 1
    return final_ans

def PageRank(q, graph, N):
    X = [[1] for i in range(N)]
    A = create(q, graph, N)
    X = mul(power(A, 20), X)
    return X

print(PageRank(0.85, [[1, 2], [2], [0]], 3))

----------------
Original article: https://blog.csdn.net/pp634077956/java/article/details/52604137
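The same computation can also be written as a plain power iteration, which avoids building the 20th power of the matrix. This is a sketch under a few assumptions: transition_matrix and page_rank are hypothetical names, the start vector is the uniform distribution rather than all ones, and every node is assumed to have at least one outgoing link.

```python
def transition_matrix(q, graph, n):
    """Entry [dst][src] is the probability of moving from src to dst."""
    matrix = [[(1 - q) / n] * n for _ in range(n)]
    for src, targets in enumerate(graph):
        for dst in targets:
            matrix[dst][src] += q / len(targets)
    return matrix


def page_rank(q, graph, n, iterations=50):
    matrix = transition_matrix(q, graph, n)
    rank = [1.0 / n] * n  # assumption: start from the uniform distribution
    for _ in range(iterations):
        # One step of rank <- matrix * rank
        rank = [sum(matrix[i][j] * rank[j] for j in range(n)) for i in range(n)]
    return rank


graph = [[1, 2], [2], [0]]  # same link structure as in the example above
matrix = transition_matrix(0.85, graph, 3)
ranks = page_rank(0.85, graph, 3)

# Every column sums to 1, so the total rank is preserved at each step
print([sum(matrix[i][j] for i in range(3)) for j in range(3)])
print(ranks, sum(ranks))
```

Because the columns sum to 1, starting from all ones (as the original does) gives the same ranking scaled by the number of nodes.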

A poor man's PageRank in Python

# Stores the graph
class Graph():
    def __init__(self):
        self.linked_node_map = {}  # adjacency list
        self.PR_map = {}  # PR value of each node

    # Add a node
    def add_node(self, node_id):
        if node_id not in self.linked_node_map:
            self.linked_node_map[node_id] = set()
            self.PR_map[node_id] = 0
        else:
            print("This node already exists")

    # Add an edge from node1 to node2; new nodes are created as needed
    def add_link(self, node1, node2):
        if node1 not in self.linked_node_map:
            self.add_node(node1)
        if node2 not in self.linked_node_map:
            self.add_node(node2)
        self.linked_node_map[node1].add(node2)  # add a neighbor to node1, meaning node2 references node1

    # Compute PR
    def get_PR(self, epoch_num=10, d=0.5):  # number of iterations and damping factor
        for i in range(epoch_num):
            for node in self.PR_map:  # visit every node
                self.PR_map[node] = (1 - d) + d * sum([self.PR_map[temp_node] for temp_node in self.linked_node_map[node]])  # original form of the formula
            print(self.PR_map)

edges = [[1, 2], [3, 2], [3, 5], [1, 3], [2, 3], [3, 1], [5, 1]]  # a simulated network of web links

if __name__ == '__main__':
    graph = Graph()
    for edge in edges:
        graph.add_link(edge[0], edge[1])
    graph.get_PR()

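Prime factorization

Writing a composite number as a product of prime factors, that is, finding its prime factors, is called prime factorization.
The Python exercise is as follows.
Task: factor a positive integer into primes; for example, for the input 90, print 90 = 2 * 3 * 3 * 5.
Approach: the problem breaks down into three cases.
(1) If the candidate prime k equals n, the factorization is finished; print it.
(2) If n != k but n is divisible by k, print k, take the quotient of n divided by k as the new n, and repeat step 1.
(3) If n is not divisible by k, use k + 1 as the new k and repeat step 1.

def reduceNum(n):
    if not isinstance(n, int) or n <= 0:
        print('Please enter a valid number')
        exit(0)
    print('{} = '.format(n), end="")
    if n in [1]:  # n is 1
        print('{}'.format(n))
    while n not in [1]:  # the loop takes the place of recursion
        for index in range(2, n + 1):
            if n % index == 0:
                n //= index  # n becomes n // index
                if n == 1:
                    print(index)
                else:  # index is always prime here
                    print('{} *'.format(index), end=" ")
                break

reduceNum(90)
reduceNum(100)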
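The three steps above can also be sketched as a function that returns the factors instead of printing them, which makes the result easy to check. prime_factors is a hypothetical name, and stopping at k * k <= n is a small optimization that the original loop does not use.

```python
def prime_factors(n):
    """Return the prime factors of a positive integer in ascending order."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    k = 2
    while k * k <= n:
        # Step 2: while k divides n, record k and continue with the quotient
        while n % k == 0:
            factors.append(k)
            n //= k
        # Step 3: otherwise move on to the next candidate
        k += 1
    if n > 1:
        # Step 1: what remains is itself prime
        factors.append(n)
    return factors


for number in (90, 100, 97):
    print(number, "=", " * ".join(str(f) for f in prime_factors(number)))
# 90 = 2 * 3 * 3 * 5
# 100 = 2 * 2 * 5 * 5
# 97 = 97
```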

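Bouncing ball: distance travelled
Problem: a ball is dropped from a height of 100 m. Each time it hits the ground it bounces back to half of the previous height and then falls again.
Task: how many metres has the ball travelled in total when it hits the ground for the 10th time, and how high is the 10th bounce?

Approach
The initial height is 100 m.
Each bounce reaches half of the previous height.
Each bounce contributes both the way up and the way down to the distance.
After every bounce the height is half of what it was before.

Sn = 100.0
Hn = Sn / 2
for n in range(2, 11):
    Sn += 2 * Hn
    Hn /= 2
print('Total distance: %.2f m' % Sn)
print('Height of the 10th bounce: %.2f m' % Hn)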
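The loop sums a geometric series, so the result can be checked against a closed form: with initial height h0, the distance at the n-th landing is h0 * (3 - 4 / 2**n) and the n-th bounce reaches h0 / 2**n. The sketch below compares both; bounce_loop and bounce_closed_form are hypothetical names.

```python
import math


def bounce_loop(h0, n):
    """Distance travelled at the n-th landing and height of the n-th bounce."""
    total = h0
    height = h0 / 2
    for _ in range(2, n + 1):
        total += 2 * height  # up and down again
        height /= 2
    return total, height


def bounce_closed_form(h0, n):
    # h0 + 2 * (h0/2 + h0/4 + ... + h0/2**(n-1)) = h0 * (3 - 4 / 2**n)
    return h0 * (3 - 4 / 2 ** n), h0 / 2 ** n


loop_total, loop_height = bounce_loop(100.0, 10)
formula_total, formula_height = bounce_closed_form(100.0, 10)
print(loop_total, loop_height)        # 299.609375 0.09765625
print(formula_total, formula_height)  # 299.609375 0.09765625
```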

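Given a year, month, and day, find which day of the year it is

# Read the year, month, and day
year = int(input('year:'))
month = int(input('month:'))
day = int(input('day:'))
# Number of days before each month in a common year, stored in a tuple
months = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)
if 0 < month <= 12:
    # The month is valid (1 to 12): start from the tuple entry selected by the month
    sum_days = months[month - 1]
else:
    print('Invalid data, please enter it again')
    exit(1)  # stop here; sum_days would be undefined otherwise
# Add the day of the month
sum_days += day
# Assume a common year by default
leap = 0
# Leap year test: divisible by 400, or divisible by 4 but not by 100
if (year % 400 == 0) or ((year % 4 == 0) and (year % 100 != 0)):
    leap = 1
    # 1 means the year is a leap year
if (leap == 1) and (month > 2):
    # In a leap year, after February, there is a 29 February, so add one more day
    sum_days += 1
print('This is day %d of the year %d.' % (sum_days, year))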
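The same logic can be wrapped in a function and compared with the standard library, which is a convenient way to test the leap-year rule. day_of_year, is_leap and CUMULATIVE_DAYS are hypothetical names for this sketch; like the script above, it does not validate the day of the month.

```python
from datetime import date

# Days before each month in a common year
CUMULATIVE_DAYS = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)


def is_leap(year):
    return year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)


def day_of_year(year, month, day):
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    total = CUMULATIVE_DAYS[month - 1] + day
    if is_leap(year) and month > 2:
        total += 1  # account for 29 February
    return total


samples = [(2020, 3, 1), (2021, 3, 1), (1900, 12, 31), (2000, 12, 31)]
for y, m, d in samples:
    # tm_yday is the day of the year according to the standard library
    print((y, m, d), day_of_year(y, m, d), date(y, m, d).timetuple().tm_yday)
```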

Experiments 1-5

Django暫時中止更新,先把學校實驗報告弄完 ''' 計算 1.輸入半徑,輸出面積和周長 2.輸入面積,輸出半徑及周長 3.輸入周長,輸出半徑及面積 ''' '''1.輸入半徑,輸出面積和周長''' from math import pi '''定義半徑''' r = int(input("請輸入半徑的值(整數)")) if r <= 0 : exit("請從新輸入半徑") ''' S 面積: pi * r * r ''' S = pi * pow(r,2) print(" 半徑爲 %d 的圓,面積爲 %.2f"%(r,S)) '''C 周長: C = 2 * pi * r ''' C = 2 * pi * r print(" 半徑爲 %d 的圓,周長爲 %.2f"%(r,C)) '''2.輸入面積,輸出半徑及周長''' from math import pi,sqrt S = float(input("請輸入圓的面積(支持小數格式)")) if S < 0 : exit("請從新輸入面積") '''r 半徑: r = sqrt(S/pi)''' r = sqrt(S/pi) print("面積爲 %.2f 的圓,半徑爲 %.2f"%(S,r)) '''C 周長: C = 2 * pi * r ''' C = 2 * pi * r print("面積爲 %.2f 的圓,周長爲 %.2f"%(S,C)) '''3.輸入周長,輸出半徑及面積''' from math import pi C = float(input("請輸入圓的周長(支持小數格式)")) if C < 0 : exit("請從新輸入周長") '''r 半徑: r = C/(2*pi)''' r = C/(2*pi) print("周長爲 %.2f 的圓,半徑爲 %.2f"%(C,r)) ''' S 面積: pi * r * r ''' S = pi * pow(r,2) print("周長爲 %.2f 的圓,面積爲 %.2f"%(C,S)) ''' 數據結構 列表練習 1.建立列表對象 [110,'dog','cat',120,'apple'] 2.在字符串 'dog' 和 'cat' 之間插入空列表 3.刪除 'apple' 這個字符串 4.查找出 1十、120 兩個數值,並以 10 爲乘數作自乘運算 ''' '''1.建立列表對象 [110,'dog','cat',120,'apple']''' # '''建立一個名爲 lst 的列表對象''' lst = [110,'dog','cat',120,'apple'] print(lst) '''2.在字符串 'dog' 和 'cat' 之間插入空列表''' lst = [110,'dog','cat',120,'apple'] '''添加元素到 'dog' 和 'cat' 之間''' lst.insert(2,[]) print(lst) '''3.刪除 'apple' 這個字符串''' lst = [110,'dog','cat',120,'apple'] '''刪除最後一個元素''' lst.pop() print(lst) '''4.查找出 1十、120 兩個數值,並以 10 爲乘數作自乘運算''' lst = [110,'dog','cat',120,'apple'] try: '''若是找不到數據,進行異常處理''' lst[lst.index(110)] *= 10 lst[lst.index(120)] *= 10 except Exception as e: print(e) print(lst) ''' 元組練習 1.建立列表 ['pen','paper',10,False,2.5] 賦給變量並查看變量的類型 2.將變量轉換爲 tuple 類型,查看變量的類型 3.查詢元組中的元素 False 的位置 4.根據得到的位置提取元素 ''' '''1.建立列表 ['pen','paper',10,False,2.5] 賦給變量並查看變量的類型''' lst = ['pen','paper',10,False,2.5] '''查看變量類型''' print("變量的類型",type(lst)) '''2.將變量轉換爲 tuple 類型,查看變量的類型''' lst = tuple(lst) print("變量的類型",type(lst)) '''3.查詢元組中的元素 False 的位置''' if False in lst: print("False 的位置爲(從0開始): ",lst.index(False)) '''4.根據得到的位置提取元素''' 
print("根據得到的位置提取的元素爲: ",lst[lst.index(False)]) else: print("不在元組中") ''' 1.建立字典{‘Math’:96,’English’:86,’Chinese’:95.5, ’Biology’:86,’Physics’:None} 2.在字典中添加鍵對{‘Histore’:88} 3.刪除{’Physics’:None}鍵值對 4.將鍵‘Chinese’所對應的值進行四捨五入後取整 5.查詢鍵「Math」的對應值 ''' '''1.建立字典''' dic = {'Math':96,'English':86,'Chinese':95.5,'Biology':86,'Physics':None} print(dic) '''2.添加鍵值對''' dic['Histore'] = 88 print(dic) '''3.刪除{’Physics’:None}鍵值對''' del dic['Physics'] print(dic) '''4.將鍵‘Chinese’所對應的值進行四捨五入後取整''' print(round(dic['Chinese'])) '''5.查詢鍵「Math」的對應值''' print(dic['Math']) ''' 1.建立列表[‘apple’,’pear’,’watermelon’,’peach’]並賦給變量 2.用list()建立列表[‘pear’,’banana’,’orange’,’peach’,’grape’]，並賦給變量 3.將建立的兩個列表對象轉換爲集合類型 4.求兩個集合的並集、交集和差集 ''' '''1.建立列表''' lst = ['apple','pear','watermelon','peach'] print(lst) '''2.用list()建立,並賦給變量''' lstTwo = list(('pear','banana','orange','peach','grape')) print(lstTwo) '''3.將建立的兩個列表對象轉換爲集合類型''' lst = set(lst) lstTwo = set(lstTwo) '''4.求兩個集合的並集、交集和差集''' print("並集是:",lst | lstTwo) print("交集是:",lst & lstTwo) print("差集:") print(lst - lstTwo) print(lstTwo - lst) ''' 1 輸出數字金字塔 （1）設置輸入語句，輸入數字 （2）建立變量來存放金字塔層數 （3）編寫嵌套循環，控制變量存放每一層的長度 （4）設置條件來打印每一行輸出的數字 （5）輸出打印結果 ''' num = int(input("請輸入金字塔的層數")) '''cs 層數''' cs = 1 while cs <= num: kk = 1 t = cs length = 2*t - 1 while kk <= length: if kk == 1: if kk == length: print(format(t,str(2*num-1)+"d"),'\n') break else: print(format(t,str(2*num+1 - 2*cs) + "d"),"",end = "") t -= 1 else: if kk == length: '''最右側數字 length 位置上數字爲 t''' print(t,"\n") break elif kk <= length/2: '''到中間的數字不斷減少''' print(t,"",end = "") t -= 1 else: '''中間的數字到右側的數字不斷增大''' print(t,"",end = "") t += 1 kk += 1 '''層數加 1''' cs += 1 ''' （1）使用自定義函數，完成對程序的模塊化 （2）學生信息至少包括：姓名、性別及手機號 （3）該系統具備的功能：添加、刪除、修改、顯示、退出系統 ''' '''建立一個容納全部學生的基本信息的列表''' stusInfo = [] def menu(): '''定義頁面顯示''' print('-'*20) print('學生管理系統') print('1.添加學生信息') print('2.刪除學生信息') print('3.修改學生信息') print('4.顯示全部學生信息') print('0.退出系統') print('-' * 20) def addStuInfo(): '''添加學生信息''' '''設置變量容納學生基本信息''' newStuName = 
input('請輸入新學生的姓名') newStuGender = input('請輸入新學生的性別') newStuPhone = input('請輸入新學生的手機號') '''設置字典將變量保存''' newStudent = {} newStudent['name'] = newStuName newStudent['gender'] = newStuGender newStudent['phone'] = newStuPhone '''添加到信息中''' stusInfo.append(newStudent) def delStuInfo(): '''刪除學生信息''' showStusInfo() defStuId = int(input('請輸入要刪除的序號:')) '''從列表中刪除 該學生''' del stusInfo[defStuId - 1] def changeStuInfo(): '''修改學生信息''' showStusInfo() '''查看修改的學生序號''' stuId = int(input("請輸入須要修改的學生的序號: ")) changeStuName = input('請輸入修改後的學生的姓名') changeStuGender = input('請輸入修改後的學生的性別') changeStuPhone = input('請輸入修改後的學生的手機號') '''對列表修改學生信息''' stusInfo[stuId - 1]['name'] = changeStuName stusInfo[stuId - 1]['gender'] = changeStuGender stusInfo[stuId - 1]['phone'] = changeStuPhone def showStusInfo(): print('-'*30) print("學生的信息以下:") print('-'*30) print('序號 姓名 性別 手機號碼') '''展現學生序號(位置),姓名,性別,手機號碼''' stuAddr = 1 for stuTemp in stusInfo: print("%d %s %s %s"%(stuAddr,stuTemp['name'],stuTemp['gender'],stuTemp['phone'])) stuAddr += 1 def main(): '''主函數''' while True: '''顯示菜單''' menu() keyNum = int(input("請輸入功能對應的數字")) if keyNum == 1: addStuInfo() elif keyNum == 2: delStuInfo() elif keyNum == 3: changeStuInfo() elif keyNum == 4: showStusInfo() elif keyNum == 0: print("歡迎下次使用") break if __name__ == '__main__': main() import numpy as np '''一維數組''' '''np.array 方法''' print(np.array([1,2,3,4])) # [1 2 3 4] print(np.array((1,2,3,4))) # [1 2 3 4] print(np.array(range(4))) # [0 1 2 3] '''np.arange 方法''' print(np.arange(10)) # [0 1 2 3 4 5 6 7 8 9] '''np.linspace 方法''' print(np.linspace(0,10,11)) # [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.] print(np.linspace(0,10,11,endpoint = False)) # [0. 
0.90909091 1.81818182 2.72727273 3.63636364 4.54545455 # 5.45454545 6.36363636 7.27272727 8.18181818 9.09090909] '''np.logspace 方法''' print(np.logspace(0,100,10)) # [1.00000000e+000 1.29154967e+011 1.66810054e+022 2.15443469e+033 # 2.78255940e+044 3.59381366e+055 4.64158883e+066 5.99484250e+077 # 7.74263683e+088 1.00000000e+100] print(np.logspace(1,4,4,base = 2)) # [ 2. 4. 8. 16.] '''zeros 方法''' print(np.zeros(3)) # [0. 0. 0.] '''ones 方法''' print(np.ones(3)) # [1. 1. 1.] '''二維數組''' '''np.array 方法''' print(np.array([[1,2,3],[4,5,6]])) # [[1 2 3] # [4 5 6]] '''np.identify 方法''' print(np.identity(3)) # [[1. 0. 0.] # [0. 1. 0.] # [0. 0. 1.]] '''np.empty 方法''' print(np.empty((3,3))) # [[1. 0. 0.] # [0. 1. 0.] # [0. 0. 1.]] '''np.zeros 方法''' print(np.zeros((3,3))) # [[0. 0. 0.] # [0. 0. 0.] # [0. 0. 0.]] '''np.ones 方法''' print(np.ones((3,3))) # [[1. 1. 1.] # [1. 1. 1.] # [1. 1. 1.]] import numpy as np '''一維數組''' '''np.random.randint 方法''' print(np.random.randint(0,6,3)) # [4 2 1] '''np.random.rand 方法''' print(np.random.rand(10)) # [0.12646424 0.59660184 0.52361248 0.1206141 0.28359949 0.46069696 # 0.18397493 0.73839455 0.99115088 0.98297331] '''np.random.standard_normal 方法''' print(np.random.standard_normal(3)) # [-0.12944733 -0.32607943 0.58582095] '''二維數組''' '''np.random.randint 方法''' print(np.random.randint(0,6,(3,3))) # [[0 0 0] # [4 4 0] # [5 0 3]] '''多維數組''' print(np.random.standard_normal((3,4,2))) # [[[-0.79751104 -1.40814148] # [-1.06189896 1.19993648] # [ 1.68883868 0.09190824] # [ 0.33723433 0.28246094]] # # [[ 0.3065646 1.1177714 ] # [-0.48928572 0.55461195] # [ 0.3880272 -2.27673705] # [-0.97869759 -0.07330554]] # # [[ 0.62155155 -0.17690222] # [ 1.61473949 -0.34930031] # [-1.41535777 1.32646137] # [-0.22865775 -2.00845225]]] import numpy as np n = np.array([10,20,30,40]) print(n + 5) # [15 25 35 45] print(n - 10) # [ 0 10 20 30] print(n * 2) # [20 40 60 80] print(n / 3) # [ 3.33333333 6.66666667 10. 
13.33333333] print(n // 3) # [ 3 6 10 13] print(n % 3) # [1 2 0 1] print(n ** 2) # [ 100 400 900 1600] n = np.array([1,2,3,4]) print(2 ** n) # [ 2 4 8 16] print(16//n) # [16 8 5 4] print(np.array([1,2,3,4]) + np.array([1,1,2,2])) # [2 3 5 6] print(np.array([1,2,3,4]) + np.array(4)) # [5 6 7 8] print(n) # [1 2 3 4] print(n + n) # [2 4 6 8] print(n - n) # [0 0 0 0] print(n * n) # [ 1 4 9 16] print(n / n) # [1. 1. 1. 1.] print(n ** n) # [ 1 4 27 256] x = np.array([4,7,3]) print(np.argsort(x)) # [2 0 1] print(x.argmax()) # 1 print(x.argmin()) # 2 print(np.sort(x)) # [3 4 7] print(np.where(x < 5,0,1)) x = np.array([[1,2,3],[4,5,6],[7,8,9]]) print(x) # [[1 2 3] # [4 5 6] # [7 8 9]] x.resize((2,5)) print(x) # [[1 2 3 4 5] # [6 7 8 9 0]] print(np.piecewise(x,[x<3],[lambda x:x + 1])) # [[2 3 0 0 0] # [0 0 0 0 1]] import pandas as pd import numpy as np '''對 sepal_length 這一列進行分析''' irisSepalLength = np.loadtxt('iris.csv') print(irisSepalLength[:5]) # [5.1 4.9 4.7 4.6 5. ] '''對數據進行排序''' irisSepalLength.sort() print(irisSepalLength[:15]) # [4.3 4.4 4.4 4.4 4.5 4.6 4.6 4.6 4.6 4.7 4.7 4.8 4.8 4.8 4.8] '''查看去重後的數據集''' print(np.unique(irisSepalLength)[:15]) # [4.3 4.4 4.5 4.6 4.7 4.8 4.9 5. 5.1 5.2 5.3 5.4 5.5 5.6 5.7] '''查看長度的總和''' print(np.sum(irisSepalLength)) # 876.5 '''累計和''' print(np.cumsum(irisSepalLength)[:10]) # [ 4.3 8.7 13.1 17.5 22. 
26.6 31.2 35.8 40.4 45.1] '''均值''' print(np.mean(irisSepalLength)) # 5.843333333333334 '''標準差''' print(np.std(irisSepalLength)) # 0.8253012917851409 '''方差''' print(np.var(irisSepalLength)) # 0.6811222222222223 '''最小值''' print(np.min(irisSepalLength)) # 4.3 '''最大值''' print(np.max(irisSepalLength)) # 7.9 import pandas as pd '''建立 Series 對象 s''' s = pd.Series(range(1,20,5)) print(s) # 0 1 # 1 6 # 2 11 # 3 16 # dtype: int64 s = pd.Series({'語文':90,'數學':92,'物理':88,'化學':99}) print(s) # 語文 90 # 數學 92 # 物理 88 # 化學 99 # dtype: int64 '''修改數據''' s['語文'] = 100 print(s) # 語文 100 # 數學 92 # 物理 88 # 化學 99 # dtype: int64 '''使用絕對值''' print(abs(s)) # 語文 100 # 數學 92 # 物理 88 # 化學 99 # dtype: int64 '''對數據列加後綴''' # s.add_suffix('後綴') '''查看某些數據是否知足條件''' print(s.between(90,99,inclusive = True)) # 語文 False # 數學 True # 物理 False # 化學 True # dtype: bool '''查看屬性最大的列名''' print(s.idxmax()) # 語文 '''查看屬性最小的列名''' print(s.idxmin()) # 物理 '''大於 95 的列''' print(s[s > 95]) # 語文 100 # 化學 99 # dtype: int64 '''查看中值''' print(s.median()) # 95.5 '''大於中值的列''' print(s[s > s.median()]) # 語文 100 # 化學 99 # dtype: int64 '''查看最小的 3 個值''' print(s.nsmallest(3)) # 物理 88 # 數學 92 # 化學 99 # dtype: int64 '''查看最大的 3 個值''' print(s.nlargest(3)) # 語文 100 # 化學 99 # 數學 92 # dtype: int64 '''兩個 Series 對象進行相加''' print(pd.Series(range(5)) + pd.Series(range(7,12))) # 0 7 # 1 9 # 2 11 # 3 13 # 4 15 # dtype: int64 '''對 Series 對象使用函數''' print(pd.Series(range(5)).pipe(lambda x,y:(x**y),4)) # 0 0 # 1 1 # 2 16 # 3 81 # 4 256 # dtype: int64 print(pd.Series(range(5)).pipe(lambda x:x + 3)) # 0 3 # 1 4 # 2 5 # 3 6 # 4 7 # dtype: int64 print(pd.Series(range(5)).apply(lambda x:x + 3)) # 0 3 # 1 4 # 2 5 # 3 6 # 4 7 # dtype: int64 '''查看標準差方差''' print(s) print(s.std()) # 5.737304826019502 print(s.var()) # 32.916666666666664 '''查看是否所有爲真''' print(any(pd.Series([1,0,1]))) # True print(all(pd.Series([1,0,1]))) # False import pandas as pd import numpy as np '''建立一個 DataFrame 對象''' df = pd.DataFrame(np.random.randint(1,5,(5,3)),index = range(5),columns = 
['A','B','C']) print(df) # A B C # 0 4 4 2 # 1 1 4 1 # 2 4 3 4 # 3 3 1 3 # 4 2 3 1 '''讀取數據''' market = pd.read_excel('超市營業額.xlsx') print(market.head()) # 工號 姓名 日期 時段 交易額 櫃檯 # 0 1001 張三 2019-03-01 9：00-14：00 1664.0 化妝品 # 1 1002 李四 2019-03-01 14：00-21：00 954.0 化妝品 # 2 1003 王五 2019-03-01 9：00-14：00 1407.0 食品 # 3 1004 趙六 2019-03-01 14：00-21：00 1320.0 食品 # 4 1005 周七 2019-03-01 9：00-14：00 994.0 日用品 '''查看不連續行的數據''' print(market.iloc[[1,8,19],:]) # 工號 姓名 日期 時段 交易額 櫃檯 # 1 1002 李四 2019-03-01 14：00-21：00 954.0 化妝品 # 8 1001 張三 2019-03-02 9：00-14：00 1530.0 化妝品 # 19 1004 趙六 2019-03-03 14：00-21：00 1236.0 食品 print(market.iloc[[1,8,19],[1,4]]) # 姓名 交易額 # 1 李四 954.0 # 8 張三 1530.0 # 19 趙六 1236.0 '''查看前五條記錄的 姓名 時段 和 交易額''' print(market[['姓名','時段','交易額']].head()) # 姓名 時段 交易額 # 0 張三 9：00-14：00 1664.0 # 1 李四 14：00-21：00 954.0 # 2 王五 9：00-14：00 1407.0 # 3 趙六 14：00-21：00 1320.0 # 4 周七 9：00-14：00 994.0 print(market.loc[[3,4,7],['姓名','時段','櫃檯']]) # 姓名 時段 櫃檯 # 3 趙六 14：00-21：00 食品 # 4 周七 9：00-14：00 日用品 # 7 張三 14：00-21：00 蔬菜水果 '''查看交易額大於 2000 的數據''' print(market[market.交易額 > 1500].head()) # 工號 姓名 日期 時段 交易額 櫃檯 # 0 1001 張三 2019-03-01 9：00-14：00 1664.0 化妝品 # 8 1001 張三 2019-03-02 9：00-14：00 1530.0 化妝品 # 14 1002 李四 2019-03-02 9：00-14：00 1649.0 蔬菜水果 # 18 1003 王五 2019-03-03 9：00-14：00 1713.0 食品 # 20 1005 周七 2019-03-03 9：00-14：00 1592.0 日用品 '''查看交易額的總和''' print(market['交易額'].sum()) # 327257.0 print(market[market['時段'] == '9：00-14：00']['交易額'].sum()) # 176029.0 '''查看某員工在 14：00-21：00 的交易數據''' print(market[(market.姓名 == '張三') & (market.時段 == '14：00-21：00')].head()) # 工號 姓名 日期 時段 交易額 櫃檯 # 7 1001 張三 2019-03-01 14：00-21：00 1442.0 蔬菜水果 # 39 1001 張三 2019-03-05 14：00-21：00 856.0 蔬菜水果 # 73 1001 張三 2019-03-10 14：00-21：00 1040.0 化妝品 # 91 1001 張三 2019-03-12 14：00-21：00 1435.0 食品 # 99 1001 張三 2019-03-13 14：00-21：00 1333.0 食品 '''查看交易額在 1500 到 2500 的數據''' print(market[market['交易額'].between(1500,2000)].head()) # 工號 姓名 日期 時段 交易額 櫃檯 # 0 1001 張三 2019-03-01 9：00-14：00 1664.0 化妝品 # 8 1001 張三 2019-03-02 9：00-14：00 1530.0 化妝品 
# 14 1002 李四 2019-03-02 9：00-14：00 1649.0 蔬菜水果 # 18 1003 王五 2019-03-03 9：00-14：00 1713.0 食品 # 20 1005 周七 2019-03-03 9：00-14：00 1592.0 日用品 '''查看描述''' print(market['交易額'].describe()) # count 246.000000 # mean 1330.313008 # std 904.300720 # min 53.000000 # 25% 1031.250000 # 50% 1259.000000 # 75% 1523.000000 # max 12100.000000 # Name: 交易額, dtype: float64 print(market['交易額'].quantile([0,0.25,0.5,0.75,1])) # 0.00 53.00 # 0.25 1031.25 # 0.50 1259.00 # 0.75 1523.00 # 1.00 12100.00 # Name: 交易額, dtype: float64 '''查看中值''' print(market['交易額'].median()) # 1259.0 '''查看最大值''' print(market['交易額'].max()) # 12100.0 print(market['交易額'].nlargest(5)) # 105 12100.0 # 223 9031.0 # 113 1798.0 # 188 1793.0 # 136 1791.0 '''查看最小值''' print(market['交易額'].min()) # 53.0 print(market['交易額'].nsmallest(5)) # 76 53.0 # 97 98.0 # 194 114.0 # 86 801.0 # 163 807.0 # Name: 交易額, dtype: float64 import pandas as pd market = pd.read_excel('超市營業額.xlsx') print(market.head()) # 工號 姓名 日期 時段 交易額 櫃檯 # 0 1001 張三 2019-03-01 9：00-14：00 1664.0 化妝品 # 1 1002 李四 2019-03-01 14：00-21：00 954.0 化妝品 # 2 1003 王五 2019-03-01 9：00-14：00 1407.0 食品 # 3 1004 趙六 2019-03-01 14：00-21：00 1320.0 食品 # 4 1005 周七 2019-03-01 9：00-14：00 994.0 日用品 '''對數據進行排序''' print(market.sort_values(by = ['交易額','工號'],ascending = False).head()) # 工號 姓名 日期 時段 交易額 櫃檯 # 105 1001 張三 2019-03-14 9：00-14：00 12100.0 日用品 # 223 1003 王五 2019-03-28 9：00-14：00 9031.0 食品 # 113 1002 李四 2019-03-15 9：00-14：00 1798.0 日用品 # 188 1002 李四 2019-03-24 14：00-21：00 1793.0 蔬菜水果 # 136 1001 張三 2019-03-17 14：00-21：00 1791.0 食品 print(market.sort_values(by = ['交易額','工號'],ascending = True).head()) # 工號 姓名 日期 時段 交易額 櫃檯 # 76 1005 周七 2019-03-10 9：00-14：00 53.0 日用品 # 97 1002 李四 2019-03-13 14：00-21：00 98.0 日用品 # 194 1001 張三 2019-03-25 14：00-21：00 114.0 化妝品 # 86 1003 王五 2019-03-11 9：00-14：00 801.0 蔬菜水果 # 163 1006 錢八 2019-03-21 9：00-14：00 807.0 蔬菜水果 '''groupby 對象 的使用''' print(market.groupby(by = lambda x:x%3)['交易額'].sum()) # 0 113851.0 # 1 108254.0 # 2 105152.0 # Name: 交易額, dtype: float64 '''查看 
櫃檯的交易額 ''' print(market.groupby(by = '櫃檯')['交易額'].sum()) # 櫃檯 # 化妝品 75389.0 # 日用品 88162.0 # 蔬菜水果 78532.0 # 食品 85174.0 # Name: 交易額, dtype: float64 '''查看日期個數''' print(market.groupby(by = '姓名')['日期'].count()) # 姓名 # 周七 42 # 張三 38 # 李四 47 # 王五 40 # 趙六 45 # 錢八 37 # Name: 日期, dtype: int64 '''將員工的營業額彙總出來''' print(market.groupby(by = '姓名')['交易額'].sum().apply(int)) # 姓名 # 周七 47818 # 張三 58130 # 李四 58730 # 王五 58892 # 趙六 56069 # 錢八 47618 # Name: 交易額, dtype: int64 '''查看交易額的 最大最小平均值和中值''' print(market.groupby(by = '姓名').agg(['max','min','mean','median'])['交易額']) # max min mean median # 姓名 # 周七 1778.0 53.0 1195.450000 1134.5 # 張三 12100.0 114.0 1529.736842 1290.0 # 李四 1798.0 98.0 1249.574468 1276.0 # 王五 9031.0 801.0 1472.300000 1227.0 # 趙六 1775.0 825.0 1245.977778 1224.0 # 錢八 1737.0 807.0 1322.722222 1381.0 '''處理異常值''' # 考慮使用其餘數據替代 '''處理缺失值''' print(market[market['交易額'].isnull()]) # 工號 姓名 日期 時段 交易額 櫃檯 # 110 1005 周七 2019-03-14 14：00-21：00 NaN 化妝品 # 124 1006 錢八 2019-03-16 14：00-21：00 NaN 食品 # 168 1005 周七 2019-03-21 14：00-21：00 NaN 食品 # 考慮使用 平均值 替換 '''處理重複值''' # 考慮是否刪除數據 '''duplicated() 和 drop_duplicates()''' '''使用透視表,查看前五天的數據''' print(market.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'sum').iloc[:,:5]) print(market.pivot_table(values = '交易額',index = '姓名',columns = '櫃檯',aggfunc = 'count').iloc[:,:5]) '''使用交叉表,查看員工和櫃檯的次數''' print(pd.crosstab(market['姓名'],market['櫃檯'])) # 姓名 # 周七 9 11 14 8 # 張三 19 6 6 7 # 李四 16 9 18 4 # 王五 8 9 9 14 # 趙六 10 18 2 15 # 錢八 0 9 14 14 print(pd.crosstab(market['姓名'],market['日期'])) # 日期 2019-03-01 2019-03-02 2019-03-03 ... 2019-03-29 2019-03-30 2019-03-31 # 姓名 ... # 周七 1 1 2 ... 1 1 2 # 張三 2 1 1 ... 1 2 0 # 李四 1 2 1 ... 2 2 2 # 王五 1 2 1 ... 1 1 1 # 趙六 1 1 2 ... 2 1 2 # 錢八 2 1 1 ... 
1 1 1 print(pd.crosstab(market['姓名'],market['櫃檯'],market['交易額'],aggfunc = 'mean').apply(lambda x:round(x))) # 櫃檯 化妝品 日用品 蔬菜水果 食品 # 姓名 # 周七 1190.0 1169.0 1174.0 1285.0 # 張三 1209.0 3105.0 1211.0 1323.0 # 李四 1279.0 1123.0 1292.0 1224.0 # 王五 1264.0 1262.0 1164.0 1925.0 # 趙六 1232.0 1294.0 1264.0 1196.0 # 錢八 NaN 1325.0 1326.0 1318.0

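The student score table contains the student ID, the name, and the scores for three courses (advanced mathematics, English, and computer science). Compute each student's total score and, for each course, the average, highest, and lowest score.

'''
Total score of each student; average, highest, and lowest score of each course
'''
# Create the student list
stuLst = []
# Create the student records
stu1 = {'id': '1001', 'name': 'Xiao Ming', 'math': 95, 'english': 88, 'computer': 80}
stu2 = {'id': '1002', 'name': 'Xiao Li', 'math': 84, 'english': 70, 'computer': 60}
stu3 = {'id': '1003', 'name': 'Xiao Wang', 'math': 79, 'english': 78, 'computer': 75}
# Add the student records to the student list
stuLst.append(stu1)
stuLst.append(stu2)
stuLst.append(stu3)

def sumScore(stuLst):
    '''Compute the total score of each student'''
    for stu in stuLst:
        print(stu['name'], "total for the three courses: ", stu['math'] + stu['english'] + stu['computer'])

def meanScore(stuLst):
    '''Compute the average score of each course'''
    sumProjectScore_gs = 0
    # running total for advanced mathematics
    sumProjectScore_yy = 0
    # running total for English
    sumProjectScore_jsj = 0
    # running total for computer science (the suffixes are pinyin abbreviations)
    for stu in stuLst:
        sumProjectScore_gs += stu['math']
        sumProjectScore_yy += stu['english']
        sumProjectScore_jsj += stu['computer']
    # Use true division so that the average keeps its fractional part
    print("Average for advanced mathematics: %.2f" % (sumProjectScore_gs / len(stuLst)))
    print("Average for English: %.2f" % (sumProjectScore_yy / len(stuLst)))
    print("Average for computer science: %.2f" % (sumProjectScore_jsj / len(stuLst)))

def maxScore(stuLst):
    '''Find the highest scores'''
    # advanced mathematics, English, computer science
    gs = []
    yy = []
    jsj = []
    for stu in stuLst:
        gs.append(stu['math'])
        yy.append(stu['english'])
        jsj.append(stu['computer'])
    print("Highest score in advanced mathematics: %.2f" % (max(gs)))
    print("Highest score in English: %.2f" % (max(yy)))
    print("Highest score in computer science: %.2f" % (max(jsj)))

def minScore(stuLst):
    '''Find the lowest scores'''
    # advanced mathematics, English, computer science
    gs = []
    yy = []
    jsj = []
    for stu in stuLst:
        gs.append(stu['math'])
        yy.append(stu['english'])
        jsj.append(stu['computer'])
    print("Lowest score in advanced mathematics: %.2f" % (min(gs)))
    print("Lowest score in English: %.2f" % (min(yy)))
    print("Lowest score in computer science: %.2f" % (min(jsj)))

sumScore(stuLst)
meanScore(stuLst)
maxScore(stuLst)
minScore(stuLst)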

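Four-digit rose numbers

for i in range(1000, 10000):
    t = str(i)
    # A rose number equals the sum of the fourth powers of its digits
    # (int() converts each digit; eval() is not needed for this)
    if pow(int(t[0]), 4) + pow(int(t[1]), 4) + pow(int(t[2]), 4) + pow(int(t[3]), 4) == i:
        print(i)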
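The four-digit search generalizes to any digit count: a number with d digits qualifies when the sum of the d-th powers of its digits equals the number itself. The sketch below uses a hypothetical helper named narcissistic_numbers.

```python
def narcissistic_numbers(digits):
    """All numbers with the given digit count that equal the sum of
    their digits raised to that digit count."""
    low = 10 ** (digits - 1)
    high = 10 ** digits
    return [
        n for n in range(low, high)
        if sum(int(ch) ** digits for ch in str(n)) == n
    ]


print(narcissistic_numbers(4))  # [1634, 8208, 9474]
print(narcissistic_numbers(3))  # [153, 370, 371, 407]
```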

Sum of four squares

import math

def f(n):
    if isinstance(n, int):
        limit = math.isqrt(n) + 1  # each of i, j, k can be at most isqrt(n)
        for i in range(limit):
            for j in range(limit):
                for k in range(limit):
                    rest = n - i*i - j*j - k*k  # what is left for the fourth square
                    if rest < 0:
                        break  # a larger k only makes rest smaller
                    h = math.isqrt(rest)
                    if h * h == rest:
                        print("(%d,%d,%d,%d)" % (i, j, k, h))
                        return
    else:
        print("(0,0,0,0)")

f(5)
f(12)
f("aaa")
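By Lagrange's four-square theorem every non-negative integer has such a representation, so a search like the one above always finds one. The sketch below returns the result instead of printing it and searches only ordered tuples a <= b <= c <= d, which is an added restriction not in the original; four_squares is a hypothetical name.

```python
import math


def four_squares(n):
    """Return (a, b, c, d) with a <= b <= c <= d and a*a + b*b + c*c + d*d == n."""
    if not isinstance(n, int) or n < 0:
        raise ValueError("n must be a non-negative integer")
    limit = math.isqrt(n) + 1
    for a in range(limit):
        for b in range(a, limit):
            for c in range(b, limit):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    break  # larger c values only make rest smaller
                d = math.isqrt(rest)
                if d * d == rest and d >= c:
                    return a, b, c, d
    return None  # not expected to be reached for valid input


print(four_squares(5))   # (0, 0, 1, 2)
print(four_squares(12))
print(four_squares(7))
```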

Student management system (from 明日學院, Mingri Academy)

import ast  # parse the saved records safely instead of calling eval()
import os  # operating-system helpers (file checks)
import re  # regular expressions

filename = "students.txt"  # file that stores the student records


def parse_record(line):
    # Turn one saved line (a dict literal) back into a dict
    return dict(ast.literal_eval(line))


def menu():
    # Print the menu
    print('''
    ╔═════════ Student Information Management System ═════════╗
    │
    │   =============== Menu ===============
    │
    │   1 Add student records
    │   2 Search for a student
    │   3 Delete a student
    │   4 Modify a student
    │   5 Sort
    │   6 Count the students
    │   7 Show all students
    │   0 Exit
    │   ====================================
    │   Note: choose a menu item by typing its number
    ╚═════════════════════════════════════════════════════════╝
    ''')


def main():
    ctrl = True  # whether to keep running
    while ctrl:
        menu()  # show the menu
        option = input("Please choose: ")  # read the menu choice
        option_str = re.sub(r"\D", "", option)  # keep only the digits
        if option_str in ['0', '1', '2', '3', '4', '5', '6', '7']:
            option_int = int(option_str)
            if option_int == 0:  # exit
                print('You have left the student management system!')
                ctrl = False
            elif option_int == 1:  # add student records
                insert()
            elif option_int == 2:  # search for a student
                search()
            elif option_int == 3:  # delete a student
                delete()
            elif option_int == 4:  # modify a student
                modify()
            elif option_int == 5:  # sort
                sort()
            elif option_int == 6:  # count the students
                total()
            elif option_int == 7:  # show all students
                show()


'''1 Add student records'''
def insert():
    studentList = []  # list of student records
    mark = True  # whether to keep adding
    while mark:
        id = input("Enter the ID (e.g. 1001): ")
        if not id:  # empty ID: leave the loop
            break
        name = input("Enter the name: ")
        if not name:  # empty name: leave the loop
            break
        try:
            english = int(input("Enter the English score: "))
            python = int(input("Enter the Python score: "))
            c = int(input("Enter the C score: "))
        except ValueError:
            print("Invalid input: scores must be integers. Please enter the record again.")
            continue
        # Store the record in a dict and add it to the list
        student = {"id": id, "name": name, "english": english, "python": python, "c": c}
        studentList.append(student)
        inputMark = input("Add another? (y/n): ")
        mark = inputMark == "y"
    save(studentList)  # write the records to the file
    print("Student records saved!")


# Write the student records to the file
def save(student):
    # Append mode creates the file when it does not exist yet
    with open(filename, "a") as students_txt:
        for info in student:
            students_txt.write(str(info) + "\n")  # one record per line


'''2 Search for a student'''
def search():
    mark = True
    student_query = []  # records that match the query
    while mark:
        id = ""
        name = ""
        if os.path.exists(filename):  # does the file exist?
            mode = input("Enter 1 to search by ID, 2 to search by name: ")
            if mode == "1":
                id = input("Enter the student ID: ")
            elif mode == "2":
                name = input("Enter the student name: ")
            else:
                print("Invalid input, please try again!")
                continue  # ask again (the original recursed and then carried on)
            with open(filename, 'r') as file:
                student = file.readlines()  # read every record
            for line in student:
                d = parse_record(line)
                if id != "":  # searching by ID
                    if d['id'] == id:
                        student_query.append(d)
                elif name != "":  # searching by name
                    if d['name'] == name:
                        student_query.append(d)
            show_student(student_query)  # show the matches
            student_query.clear()
            inputMark = input("Search again? (y/n): ")
            mark = inputMark == "y"
        else:
            print("No data has been saved yet...")
            return


'''3 Delete a student'''
def delete():
    mark = True  # whether to keep looping
    while mark:
        studentId = input("Enter the ID of the student to delete: ")
        if studentId != "":
            if os.path.exists(filename):
                with open(filename, 'r') as rfile:
                    student_old = rfile.readlines()
            else:
                student_old = []
            ifdel = False  # whether a record was deleted
            if student_old:
                with open(filename, 'w') as wfile:
                    for line in student_old:
                        d = parse_record(line)
                        if d['id'] != studentId:
                            wfile.write(str(d) + "\n")  # keep this record
                        else:
                            ifdel = True  # skip it, i.e. delete it
                if ifdel:
                    print("The student with ID %s has been deleted..." % studentId)
                else:
                    print("No student with ID %s was found..." % studentId)
            else:
                print("No student records...")
                break
            show()  # show all students
            inputMark = input("Delete another? (y/n): ")
            mark = inputMark == "y"


'''4 Modify a student'''
def modify():
    show()  # show all students
    if os.path.exists(filename):
        with open(filename, 'r') as rfile:
            student_old = rfile.readlines()
    else:
        return
    studentid = input("Enter the ID of the student to modify: ")
    with open(filename, "w") as wfile:
        for student in student_old:
            d = parse_record(student)
            if d["id"] == studentid:  # this is the student to modify
                print("Student found, you can now edit the record!")
                while True:  # repeat until the input is valid
                    try:
                        d["name"] = input("Enter the name: ")
                        d["english"] = int(input("Enter the English score: "))
                        d["python"] = int(input("Enter the Python score: "))
                        d["c"] = int(input("Enter the C score: "))
                    except ValueError:
                        print("Invalid input, please try again.")
                    else:
                        break
                wfile.write(str(d) + "\n")  # write the modified record
                print("Update successful!")
            else:
                wfile.write(student)  # write the unmodified record
    mark = input("Modify another student? (y/n): ")
    if mark == "y":
        modify()


'''5 Sort'''
def sort():
    show()  # show all students
    if os.path.exists(filename):
        with open(filename, 'r') as file:
            student_old = file.readlines()
        student_new = [parse_record(line) for line in student_old]
    else:
        return
    ascORdesc = input("Choose (0 ascending; 1 descending): ")
    if ascORdesc == "0":
        ascORdescBool = False  # False means ascending
    elif ascORdesc == "1":
        ascORdescBool = True  # True means descending
    else:
        print("Invalid input, please try again!")
        sort()
        return  # the retry has already shown its own result
    mode = input("Choose the sort key (1 English; 2 Python; 3 C; 0 total score): ")
    if mode == "1":
        student_new.sort(key=lambda x: x["english"], reverse=ascORdescBool)
    elif mode == "2":
        student_new.sort(key=lambda x: x["python"], reverse=ascORdescBool)
    elif mode == "3":
        student_new.sort(key=lambda x: x["c"], reverse=ascORdescBool)
    elif mode == "0":
        student_new.sort(key=lambda x: x["english"] + x["python"] + x["c"], reverse=ascORdescBool)
    else:
        print("Invalid input, please try again!")
        sort()
        return
    show_student(student_new)  # show the sorted result


'''6 Count the students'''
def total():
    if os.path.exists(filename):
        with open(filename, 'r') as rfile:
            student_old = rfile.readlines()
        if student_old:
            print("There are %d students in total!" % len(student_old))
        else:
            print("No student records have been entered yet!")
    else:
        print("No data has been saved yet...")


'''7 Show all students'''
def show():
    student_new = []
    if os.path.exists(filename):
        with open(filename, 'r') as rfile:
            student_old = rfile.readlines()
        for line in student_old:
            student_new.append(parse_record(line))
        if student_new:
            show_student(student_new)
    else:
        print("No data has been saved yet...")


# Print the student records held in a list
def show_student(studentList):
    if not studentList:
        print("No data\n")
        return
    format_title = "{:^6}{:^12}\t{:^8}\t{:^10}\t{:^10}\t{:^10}"
    print(format_title.format("ID", "Name", "English", "Python", "C", "Total"))
    format_data = "{:^6}{:^12}\t{:^12}\t{:^12}\t{:^12}\t{:^12}"
    for info in studentList:
        print(format_data.format(info.get("id"), info.get("name"),
                                 str(info.get("english")), str(info.get("python")), str(info.get("c")),
                                 str(info.get("english") + info.get("python") + info.get("c")).center(12)))


if __name__ == "__main__":
    main()

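Define a function that takes a list as its argument and removes the non-digit characters from it

'''Define a function that takes a list as its argument and removes the non-digit characters from it.'''
class DigitFilter:  # renamed from "list" so the built-in list type is not shadowed
    def __init__(self, alist):
        self.alist = alist

    def remove_str(self):
        # Join every element as text, then keep only the digit characters
        text = "".join(str(i) for i in self.alist)
        print("".join(filter(str.isdigit, text)))

a = DigitFilter([1, 2, 3, "q", 6, "sd", "[][]{"])
a.remove_str()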

Given a multi-digit number, find its digital root (using a function)

def numRoot(num):
    '''Return the digital root of num, given as a string of digits'''
    if len(num) == 1:
        return int(num)
    # Sum the digits, then repeat on the sum until a single digit remains
    total = sum(int(ch) for ch in num)
    return numRoot(str(total))

num = input("Enter a number to see its digital root: ")
print(numRoot(num))
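The repeated digit sum can also be computed without a loop, because a number and its digit sum leave the same remainder modulo 9. The following is a minimal sketch under that observation; the names `digital_root` and `digital_root_by_summing` are hypothetical and are not part of the original exercise.

```python
def digital_root(n: int) -> int:
    """Digital root of a non-negative integer via the mod-9 shortcut."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return 0 if n == 0 else 1 + (n - 1) % 9


def digital_root_by_summing(n: int) -> int:
    """Reference version: keep summing the decimal digits."""
    while n >= 10:
        n = sum(int(ch) for ch in str(n))
    return n


if __name__ == "__main__":
    for value in (0, 9, 10, 9875):
        # Both columns should agree for every value
        print(value, digital_root(value), digital_root_by_summing(value))
```

The summing version mirrors the recursive exercise; the shortcut is handy when the input is already an integer.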

Fruit system (procedural and object-oriented)

fruit = []  # each item is a dict with the keys 'name', 'price' and 'num'

def menu():
    print(
    '''
    ******************** Fruit Market ********************
          (object-oriented, procedural)
          1. List all fruit
          2. Search for a fruit by name
          3. Add a fruit (to the database)
          4. Change a fruit's quantity or price
          5. Delete a fruit
          6. Sort by price
          7. Exit
    ******************************************************
    '''
    )

def showFruit():
    '''Function 1: list all fruit'''
    print('-' * 30)
    print("Fruit information:")
    print('-' * 30)
    print('No.   Name   Price   Quantity')
    fru_id = 1
    for fru_temp in fruit:
        print("%s   %s   %s   %s " % (fru_id, fru_temp['name'], fru_temp['price'], fru_temp['num']))
        fru_id += 1

def searchFruitName():
    '''Function 2: search for a fruit by name'''
    showFruit()
    fru_name = input("Enter the name of the fruit to look up: ")
    fru_id = 1
    for fru in fruit:
        if fru_name in fru['name']:
            print("Fruit information:")
            print("%d   %s   %s   %s " % (fru_id, fru['name'], fru['price'], fru['num']))
            return
        fru_id += 1
    print("No fruit with that name was found")

def addFruit():
    '''Function 3: add a fruit (to the database)'''
    newFruitName = input('Enter the name of the new fruit: ')
    newFruitPrice = input('Enter the price of the new fruit: ')
    newFruitNum = input('Enter the quantity of the new fruit: ')
    newFruit = {}
    newFruit['name'] = newFruitName
    newFruit['price'] = newFruitPrice
    newFruit['num'] = newFruitNum
    fruit.append(newFruit)

def changeFruit():
    '''Function 4: change a fruit's quantity or price'''
    showFruit()
    fru_id = int(input("Enter the number of the fruit to change: "))
    changeFruitName = input('Enter the new name: ')
    changeFruitPrice = input('Enter the new price: ')
    changeFruitNum = input('Enter the new quantity: ')
    fruit[fru_id - 1]['name'] = changeFruitName
    fruit[fru_id - 1]['price'] = changeFruitPrice
    fruit[fru_id - 1]['num'] = changeFruitNum

def delFruit():
    '''Function 5: delete a fruit'''
    showFruit()
    delFruitId = int(input('Enter the number of the fruit to delete: '))
    del fruit[delFruitId - 1]

def sortFruit():
    '''Function 6: sort by price'''
    showFruit()
    sortStandard = input("Choose (0 ascending; 1 descending): ")
    if sortStandard == "0":
        descending = False  # the original had the two flags swapped
    elif sortStandard == "1":
        descending = True
    else:
        print("Invalid input, please try again!")
        return
    # Prices are stored as text, so compare them as numbers
    fruit.sort(key=lambda x: float(x['price']), reverse=descending)
    showFruit()

def exitSystem():
    '''Function 7: exit'''
    print("You have left the fruit market system")
    exit()

def main():
    notExit = True
    while notExit:
        menu()
        try:
            option = int(input("Choose a function: "))
        except ValueError:
            print("Please enter a number")
            continue  # option is undefined here, so ask again
        if option == 1:
            showFruit()
        elif option == 2:
            searchFruitName()
        elif option == 3:
            addFruit()
        elif option == 4:
            changeFruit()
        elif option == 5:
            delFruit()
        elif option == 6:
            sortFruit()
        elif option == 7:
            notExit = False
            exitSystem()

if __name__ == '__main__':
    main()

class FruitMarket():
    def __init__(self):
        self.fruit = []  # each item is a dict with the keys 'name', 'price' and 'num'

    def showFruit(self):
        '''Function 1: list all fruit'''
        print('-' * 30)
        print("Fruit information:")
        print('-' * 30)
        print('No.   Name   Price   Quantity')
        fru_id = 1
        for fru_temp in self.fruit:
            print("%s   %s   %s   %s " % (fru_id, fru_temp['name'], fru_temp['price'], fru_temp['num']))
            fru_id += 1

    def searchFruitName(self):
        '''Function 2: search for a fruit by name'''
        self.showFruit()
        fru_name = input("Enter the name of the fruit to look up: ")
        fru_id = 1
        for fru in self.fruit:
            if fru_name in fru['name']:
                print("Fruit information:")
                print("%d   %s   %s   %s " % (fru_id, fru['name'], fru['price'], fru['num']))
                return
            fru_id += 1
        print("No fruit with that name was found")

    def addFruit(self):
        '''Function 3: add a fruit (to the database)'''
        newFruitName = input('Enter the name of the new fruit: ')
        newFruitPrice = input('Enter the price of the new fruit: ')
        newFruitNum = input('Enter the quantity of the new fruit: ')
        newFruit = {}
        newFruit['name'] = newFruitName
        newFruit['price'] = newFruitPrice
        newFruit['num'] = newFruitNum
        self.fruit.append(newFruit)

    def changeFruit(self):
        '''Function 4: change a fruit's quantity or price'''
        self.showFruit()
        fru_id = int(input("Enter the number of the fruit to change: "))
        changeFruitName = input('Enter the new name: ')
        changeFruitPrice = input('Enter the new price: ')
        changeFruitNum = input('Enter the new quantity: ')
        self.fruit[fru_id - 1]['name'] = changeFruitName
        self.fruit[fru_id - 1]['price'] = changeFruitPrice
        self.fruit[fru_id - 1]['num'] = changeFruitNum

    def delFruit(self):
        '''Function 5: delete a fruit'''
        self.showFruit()
        delFruitId = int(input('Enter the number of the fruit to delete: '))
        del self.fruit[delFruitId - 1]

    def sortFruit(self):
        '''Function 6: sort by price'''
        self.showFruit()
        sortStandard = input("Choose (0 ascending; 1 descending): ")
        if sortStandard == "0":
            descending = False  # the original had the two flags swapped
        elif sortStandard == "1":
            descending = True
        else:
            print("Invalid input, please try again!")
            return
        # Prices are stored as text, so compare them as numbers
        self.fruit.sort(key=lambda x: float(x['price']), reverse=descending)
        self.showFruit()

    def exitSystem(self):
        '''Function 7: exit'''
        print("You have left the fruit market system")
        exit()

def menu():
    print(
    '''
    ******************** Fruit Market ********************
          (object-oriented, procedural)
          1. List all fruit
          2. Search for a fruit by name
          3. Add a fruit (to the database)
          4. Change a fruit's quantity or price
          5. Delete a fruit
          6. Sort by price
          7. Exit
    ******************************************************
    '''
    )

fruitmarket = FruitMarket()

def main():
    notExit = True
    while notExit:
        menu()
        try:
            option = int(input("Choose a function: "))
        except ValueError:
            print("Please enter a number")
            continue  # option is undefined here, so ask again
        if option == 1:
            fruitmarket.showFruit()
        elif option == 2:
            fruitmarket.searchFruitName()
        elif option == 3:
            fruitmarket.addFruit()
        elif option == 4:
            fruitmarket.changeFruit()
        elif option == 5:
            fruitmarket.delFruit()
        elif option == 6:
            fruitmarket.sortFruit()
        elif option == 7:
            notExit = False
            fruitmarket.exitSystem()

if __name__ == '__main__':
    main()

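matplotlib basics summary, part 01

Grayscale conversion turns a color image into a grayscale image. A color image has three components, R, G and B, which carry the red, green and blue intensities; grayscale conversion makes the three components of each pixel equal.
Pixels with a large gray value are bright (the maximum, 255, is white) and pixels with a small gray value are dark (the minimum, 0, is black). The text lists three main grayscale algorithms; the two lines below show the channel mean and the weighted sum (the third is not included in this copy):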

data2 = data.mean(axis = 2)

data3 = np.dot(data,[0.299,0.587,0.114])
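The two numpy lines above work on a whole image array at once. As a per-pixel illustration, here is a small sketch in plain Python. The maximum-value variant is an assumption about what the missing third algorithm might be; it does not come from the source, and all three function names are hypothetical.

```python
def gray_mean(r, g, b):
    """Mean of the three channels (matches data.mean(axis=2))."""
    return (r + g + b) / 3


def gray_weighted(r, g, b):
    """Weighted sum with the 0.299 / 0.587 / 0.114 weights used above."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def gray_max(r, g, b):
    """Assumed third variant: take the brightest channel."""
    return max(r, g, b)


if __name__ == "__main__":
    pixel = (200, 100, 50)  # an arbitrary example pixel
    print(gray_mean(*pixel), gray_weighted(*pixel), gray_max(*pixel))
```

The weighted version gives green the largest share, which is why it usually looks closer to perceived brightness than the plain mean.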

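Elements of a basic Matplotlib chart
x axis and y axis: the horizontal and vertical axis lines
x-axis and y-axis ticks: tick marks divide each axis, including the minimum and maximum tick
x-axis and y-axis tick labels: the values at specific positions on an axis
plotting area: the region where the data is actually drawn

Plot a single curve
import numpy as np
import matplotlib.pyplot as plt  # the matplotlib snippets in these notes assume these two imports

x = np.arange(0.0,6.0,0.01)
plt.plot(x, x**2)
plt.show()

Plot several curves
x = np.arange(1, 5,0.01)
plt.plot(x, x**2)
plt.plot(x, x**3.0)
plt.plot(x, x*3.0)
plt.show()

# Several x, y pairs can also be passed in one call
x = np.arange(1, 5)
plt.plot(x, x*1.5, x, x*3.0, x, x/3.0)
plt.show()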

Draw grid lines
grid() accepts the same line-style keyword arguments as plot(): lw stands for linewidth (line thickness) and alpha sets how transparent the line is.
# Compare different grid settings in subplots
fig = plt.figure(figsize=(20,3))
x = np.linspace(0, 5, 100)

# Default grid settings
ax1 = fig.add_subplot(131)
ax1.plot(x, x**2, x, x**3,lw=2)
ax1.grid(True)  # show grid lines

# Customised grid lines (dashed)
ax2 = fig.add_subplot(132)
ax2.plot(x, x**2, x, x**4, lw=2)
ax2.grid(color='r', alpha=0.5, linestyle='dashed', linewidth=0.5)  # same style arguments as plot

# Customised grid lines (dash-dot)
ax3 = fig.add_subplot(133)
ax3.plot(x, x**2, x, x**4, lw=2)
ax3.grid(color='r', alpha=0.5, linestyle='-.', linewidth=0.5)  # same style arguments as plot

Axis limits
The axis method
x = np.arange(1, 5)
plt.plot(x, x*1.5, x, x*3.0, x, x/3.0)
# plt.axis()  # with no arguments, axis() returns the current axis limits
# e.g. (1.0, 4.0, 0.0, 12.0)
# plt.axis([0, 15, -5, 13])  # with an argument, axis() sets new limits; the order is [xmin, xmax, ymin, ymax]
plt.axis(xmax=5,ymax=23)  # the xmax and ymax keywords can be used as well
plt.show()

Tight axes
x = np.arange(1, 5)
plt.plot(x, x*1.5, x, x*3.0, x, x/3.0)
plt.axis('tight')  # fit the limits tightly around the data
plt.show()

Besides axis(), plt can also set the axis ranges with xlim and ylim
x = np.arange(1, 5)
plt.plot(x, x*1.5, x, x*3.0, x, x/3.0)
plt.xlim([0, 5])  # xlim([xmin, xmax])
plt.ylim([-1, 13])  # ylim([ymin, ymax])
plt.show()

Axis labels
plt.plot([1, 3, 2, 4])
plt.xlabel('This is the X axis')
plt.ylabel('This is the Y axis')
plt.show()

Chart title
plt.plot([1, 3, 2, 4])
plt.title('Simple plot')
plt.show()

A line whose label is '_nolegend_' is left out of the legend
x = np.arange(1, 5)
plt.plot(x, x*1.5, label = '_nolegend_')  # this line does not appear in the legend
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
plt.legend()
plt.show()

Legends
The legend method can be fed in two ways:
[recommended] add a label argument to each plot call
pass a list of strings to legend()

Way one:
x = np.arange(1, 5)
plt.plot(x, x*1.5, label='Normal')  # label argument on the plot call
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
plt.legend()
plt.show()

Way two:
x = np.arange(1, 5)
plt.plot(x, x*1.5)
plt.plot(x, x*3.0)
plt.plot(x, x/3.0)
plt.legend(['Normal', 'Fast', 'Slow'])  # list of strings passed to legend()
plt.show()

The loc parameter
x = np.arange(1, 5)
plt.plot(x, x*1.5, label='Normal')
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
plt.legend(loc=10)  # location code 10 means 'center'
plt.show()

loc can also be a 2-element tuple giving the position of the legend's lower-left corner
x = np.arange(1, 5)
plt.plot(x, x*1.5, label='Normal')
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
plt.legend(loc=(0,1))  # lower-left corner of the legend, in axes coordinates
plt.show()

The ncol parameter sets the number of columns in the legend
x = np.arange(1, 5)
plt.plot(x, x*1.5, label='Normal')
plt.plot(x, x*3.0, label='Fast')
plt.plot(x, x/3.0, label='Slow')
plt.legend(loc=0, ncol=2)  # two columns; loc=0 means 'best'
plt.show()

The linestyle property
plt.plot(np.random.randn(1000).cumsum(), linestyle = ':',marker = '.', label='one')
plt.plot(np.random.randn(1000).cumsum(), 'r--', label='two')
plt.plot(np.random.randn(1000).cumsum(), 'b.', label='three')
plt.legend(loc='best')  # let matplotlib pick the least obstructive position
plt.show()

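Saving a figure
Parameters of savefig:
fname: a string containing the file path, or a Python file-like object. The image format is inferred from the file extension, e.g. .pdf gives PDF and .png gives PNG ("png", "pdf", "svg", "ps", "eps", ...)
dpi: image resolution in dots per inch; by default the figure's own dpi is used (100 unless changed)
facecolor: background colour of the image, white by default

x = np.random.randn(1000).cumsum()
fig = plt.figure(figsize = (10,3))
splt = fig.add_subplot(111)
splt.plot(x)
# The file name is the first positional argument (savefig has no "filename" keyword)
fig.savefig("filena.eps",dpi = 100,facecolor = 'g')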

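matplotlib basics summary, part 02

Setting the style of a plot

Point and line styles

Colour
Use the color parameter, or c for short. There are five ways to give a colour value:
alias: color='r'
valid HTML colour name: color = 'red'
HTML hexadecimal string: color = '#eeefff'
RGB tuple normalised to [0, 1]: color = (0.3, 0.3, 0.4)
grayscale, given as a string holding a number in [0, 1]: color = '0.1'

Transparency
y = np.arange(1, 3)
plt.plot(y, c="red", alpha=0.1);  # alpha sets the transparency
plt.plot(y+1, c="red", alpha=0.5);
plt.plot(y+2, c="red", alpha=0.9);

Background colour
Pass the facecolor argument to plt.subplot() to set the background colour of the axes
plt.subplot(facecolor='orange');
plt.plot(np.random.randn(10),np.arange(1,11))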

Line styles

Dashes of different widths
# The first dash is 2 points long, the gap after it 5 points, the second dash 5 points, the next gap 2 points, and so on
x = np.linspace(-np.pi, np.pi, 256, endpoint=True)
plt.plot(x, np.cos(x), dashes=[2, 5, 5, 2]);

Marker styles
y = np.arange(1, 3, 0.2)
plt.plot(y, '1', y+0.5, '2', y+1, '3', y+1.5,'4');
plt.plot(y+2, '3')  # a format string with only a marker: no connecting line is drawn
plt.plot(y+2.5,marker = '3')  # marker given as a keyword: the line style stays solid by default
plt.show()

Combining colour, marker and line style in one format string
x = np.linspace(0, 5, 10)
plt.plot(x,3*x,'r-.')  # red dash-dot line
plt.plot(x, x**2, 'b^:')  # blue dotted line with triangle markers
plt.plot(x, x**3, 'go-.')  # green dash-dot line with circle markers
plt.show()

More point and line settings
y = np.arange(1, 3, 0.3)
plt.plot(y, color='blue', linestyle='dashdot', linewidth=4, marker='o',
         markerfacecolor='red', markeredgecolor='black', markeredgewidth=3, markersize=12);
plt.show()

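Styling several curves in one statement
Same settings for all curves: pass keyword arguments, which apply to every line in the call
plt.plot(x1, y1, x2, y2, linewidth=2, ...)
Different settings per curve: put a format string after each x, y pair (a format string only affects the pair it follows)
plt.plot(x1, y1, fmt1, x2, y2, fmt2, ...)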


Three ways to set line properties
1. Pass keyword arguments to the plotting method
2. Call setter methods on the line instances
x = np.arange(0,10)
y = np.random.randint(10,30,size = 10)
line,= plt.plot(x, y)  # plot returns a list; unpack the single line
line2 = plt.plot(x,y*2,x,y*3)  # a list of two lines
line.set_linewidth(5)
line2[1].set_marker('o')
print(line,line2)
3. Use the setp() function
line = plt.plot(x, y)
plt.setp(line, 'linewidth', 1.5,'color','r','marker','o','linestyle','--')

X and Y axis ticks

The xticks() and yticks() functions
x = [5, 3, 7, 2, 4, 1]
plt.plot(x);
plt.xticks(range(len(x)), ['a', 'b', 'c', 'd', 'e', 'f']);  # pass positions and labels to change the ticks
plt.yticks(range(1, 8, 2));
plt.show()

Object-oriented interface
The set_xticks, set_yticks, set_xticklabels and set_yticklabels methods
fig = plt.figure(figsize=(10, 4))
ax = fig.add_subplot(111)
x = np.linspace(0, 5, 100)
ax.plot(x, x**2, x, x**3, lw=2)
ax.set_xticks([1, 2, 3, 4, 5])
ax.set_xticklabels(['a','b','c','d','e'], fontsize=18)
yticks = [0, 50, 100, 150]
ax.set_yticks(yticks)
ax.set_yticklabels([y for y in yticks], fontsize=18);  # plain numeric labels

Sine and cosine: LaTeX syntax such as $\pi$ writes Greek letters on the chart
x = np.arange(-np.pi,np.pi,0.01)
plt.figure(figsize=(12,9))
plt.plot(x,np.sin(x),x,np.cos(x))
plt.axis([x.min()-1,x.max()+1,-1.2,1.2])
# xticks: the first argument gives the tick positions, the second the text shown at each tick
# raw strings keep the backslashes intact
plt.xticks(np.arange(-np.pi,np.pi+1,np.pi/2),
           [r'$-\pi$',r'$-\pi$/2','0',r'$\pi$/2',r'$\pi$'],size = 20)
plt.yticks([-1,0,1],['min','0','max'],size = 20)
plt.show()

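matplotlib basics summary, part 03

Four chart types

Histogram
[A histogram takes a single data argument, x, unlike a bar chart, which needs both x and y.]
Parameters of hist():
bins: either an integer number of bins or a sequence of bin edges. The default is 10.
density (called normed in old Matplotlib versions): if True, the histogram is normalised to form a probability density. The default is False.
color: the colour of the histogram, either a single colour or a sequence of colours. With several data sets, the colour sequence is applied in the same order. If omitted, a default colour is used.
orientation: set to 'horizontal' for a horizontal histogram. The default is 'vertical'.
x = np.random.randint(5,size = 5)
print(x)
plt.hist(x,histtype = 'bar');  # 10 bins by default
plt.show()

Ordinary and cumulative histograms
n = np.random.randn(10000)
fig,axes = plt.subplots(1,2,figsize = (12,4))
axes[0].hist(n,bins = 50)  # ordinary histogram
axes[0].set_title('Default histogram')
axes[0].set_xlim(min(n),max(n))
axes[1].hist(n,bins = 50,cumulative = True)  # cumulative histogram
axes[1].set_title('Cumulative detailed histogram')
axes[1].set_xlim(min(n),max(n))

Normal distribution
u = 100  # mean (expected value)
s = 15  # standard deviation (np.random.normal takes the standard deviation, not the variance)
x = np.random.normal(u,s,1000)  # draw normally distributed data
ax = plt.gca()  # get the current axes
# The histogram is horizontal, so the values run along the y axis
ax.set_xlabel('Frequency')
ax.set_ylabel('Value')
ax.set_title("Histogram normal u = 100 s = 15")  # chart title
ax.hist(x,bins = 100,color = 'r',orientation='horizontal')
plt.show()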

Bar charts

bar
# The first argument gives the x position of each bar (the bar centre with the default align='center'),
# the second gives the bar heights; matplotlib sets the bar width itself, 0.8 in this example
plt.bar([1, 2, 3], [3, 2, 5]);
plt.show()

# width sets the bar width; color sets the bar colour; bottom sets the y coordinate of the bottom of the bars
plt.bar([1, 2, 3], [3, 2, 5], width=0.5, color='r', bottom=1);
plt.ylim([0, 7])
plt.show()

# Example: grouped bar chart
data1 = 10*np.random.rand(5)
data2 = 10*np.random.rand(5)
data3 = 10*np.random.rand(5)
locs = np.arange(1, len(data1)+1)
width = 0.27
plt.bar(locs, data1, width=width);
plt.bar(locs+width, data2, width=width, color='red');
plt.bar(locs+2*width, data3, width=width, color='green') ;
plt.xticks(locs + width*1, locs);  # put each tick under the middle bar of its group
plt.show()

barh
plt.barh([1, 2, 3], [3, 2, 5],height = 0.27,color = 'yellow');
plt.show()

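Pie charts

[A pie chart also takes a single data argument, x.] pie()
A pie chart suits showing each part's share of the whole; a bar chart suits comparing the sizes of the parts.

plt.figure(figsize = (4,4))  # use a square figure for a pie chart
x = [45,35,20]  # percentages
labels = ['Cats','Dogs','Fishes']  # name of each wedge
plt.pie(x,labels = labels)
plt.show()

plt.figure(figsize=(4, 4));
# In older Matplotlib versions, values that sum to less than 1 are not rescaled and the pie is left incomplete.
# Recent versions rescale by default; pass normalize=False to keep the partial pie.
x = [0.1, 0.2, 0.3]
labels = ['Cats', 'Dogs', 'Fishes']
plt.pie(x, labels=labels);  # labels sets the text for each wedge
plt.show()

# labels sets the label of each wedge; labeldistance sets how far the labels sit from the centre (as a fraction of the radius)
# autopct sets the format of the percentage text (%1.1f%%); pctdistance sets how far that text sits from the centre
# explode sets how far each wedge is pushed out from the centre (as a fraction of the radius); colors sets the wedge colours
# shadow is a boolean that switches the drop shadow on or off
plt.figure(figsize=(4, 4));
x = [4, 9, 21, 55, 30, 18]
labels = ['Swiss', 'Austria', 'Spain', 'Italy', 'France', 'Benelux']
explode = [0.2, 0.1, 0, 0, 0.1, 0]
colors = ['r', 'k', 'b', 'm', 'c', 'g']
plt.pie(x, labels=labels, labeldistance=1.2, explode=explode, colors=colors,
        autopct='%1.1f%%', pctdistance=0.5, shadow=True);
plt.show()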

Scatter plots

[A scatter plot needs two arguments, x and y; here x is not a set of axis ticks but the horizontal coordinate of each point.] scatter()

# s sets the marker size; c sets the marker colour; marker sets the marker shape
x = np.random.randn(1000)
y = np.random.randn(1000)
size = 50*abs(np.random.randn(1000))
colors = np.random.randint(16777215,size = 1000)
# Turn each integer into a '#rrggbb' string, zero-padded to six hex digits
li = ['#%06x' % color for color in colors]
plt.scatter(x, y,s = size, c=li, marker='d');
plt.show()
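The scatter example needs colours as '#rrggbb' strings. A compact way to build them is a format specification that zero-pads to six hexadecimal digits. The sketch below uses only the standard library; `to_hex_color` and `random_hex_color` are hypothetical helper names.

```python
import random


def to_hex_color(value: int) -> str:
    """Format an integer in [0, 0xFFFFFF] as a '#rrggbb' string."""
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError("value must fit in 24 bits")
    return f"#{value:06x}"  # 06x: lower-case hex, zero-padded to 6 digits


def random_hex_color(rng=random) -> str:
    """Pick a random 24-bit colour."""
    return to_hex_color(rng.randrange(0x1000000))


if __name__ == "__main__":
    print(to_hex_color(255))
    print(random_hex_color())
```

The padding matters because a small value such as 255 would otherwise give 'ff', which is not a valid six-digit colour string.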

import numpy as np
import pandas as pd
from pandas import Series,DataFrame
import matplotlib.pyplot as plt

x = np.random.randn(1000)
y1 = np.random.randn(1000)
y2 = 1.2 + np.exp(x)  # exp(x) returns e to the power x

ax1 = plt.subplot(121)
plt.scatter(x,y1,color = 'purple',alpha = 0.3,edgecolors = 'white',label = 'no correl')
plt.xlabel('no correlation')
plt.grid(True)
plt.legend()

ax2 = plt.subplot(122)
plt.scatter(x,y2,color = 'green',alpha = 0.3,edgecolors = 'gray',label = 'correl')
plt.xlabel('correlation')
plt.grid(True)
plt.legend()
plt.show()

Text, annotations and arrows inside a figure

Text
x = np.arange(0, 7, .01)
y = np.sin(x)
plt.plot(x, y);
plt.text(0.1, -0.04, 'sin(0)=0');  # the position arguments are data coordinates
plt.show()

Annotations
# xy sets the point the arrow points at; xytext sets the position of the annotation text
# arrowprops is a dict that sets the arrow style
# width sets the width of the arrow shaft, headlength the length of the arrow head,
# headwidth the width of the base of the arrow head, and shrink the gap (as a fraction)
# between the arrow ends and the annotated point and text
y = [13, 11, 13, 12, 13, 10, 30, 12, 11, 13, 12, 12, 12, 11, 12]
plt.plot(y);
plt.ylim(ymax=35);  # raise the y limit so the annotation stays inside the chart
plt.annotate('this spot must really\nmean something', xy=(6, 30), xytext=(8, 31.5),
             arrowprops=dict(width=15, headlength=20, headwidth=20, facecolor='black', shrink=0.1));
plt.show()

# Generate three normally distributed data sets
x1 = np.random.normal(30, 3, 100)
x2 = np.random.normal(20, 2, 100)
x3 = np.random.normal(10, 3, 100)
# Plot the three data sets and give each plot a string label
plt.plot(x1, label='plot')  # to hide a label from the legend, set it to _nolegend_
plt.plot(x2, label='2nd plot')
plt.plot(x3, label='last plot')
# Draw the legend
plt.legend(bbox_to_anchor=(0, 1.02, 1, 0.102),  # bounding box starts at (0, 1.02), width 1, height 0.102
           ncol=3,  # three columns; the default is 1
           mode="expand",  # None or "expand"; "expand" stretches the legend across the whole axes width
           borderaxespad=0.)  # padding between the axes and the legend border
# Draw the annotation
plt.annotate("Important value",  # annotation text
             xy=(55,20),  # where the arrow ends
             xytext=(5, 38),  # where the text starts; the arrow runs from xytext to xy
             arrowprops=dict(arrowstyle='->'));  # arrowprops defines the arrow; arrowstyle picks its style

Arrows

matplotlib basics summary, part 04

3D plots

Imports
import numpy as np
import matplotlib.pyplot as plt
# needed for 3D plots
from mpl_toolkits.mplot3d.axes3d import Axes3D
# IPython/Jupyter magic; leave it out in a plain script
%matplotlib inline

Generate the data
# Coefficients; Z is computed from X and Y
a = 0.7
b = np.pi

# Compute the Z values
def mk_Z(X, Y):
    return 2 + a - 2 * np.cos(X) * np.cos(Y) - a * np.cos(b - 2*X)

# Build X, Y and Z
x = np.linspace(0, 2*np.pi, 100)
y = np.linspace(0, 2*np.pi, 100)
X,Y = np.meshgrid(x, y)
Z = mk_Z(X, Y)

Draw the figure
fig = plt.figure(figsize=(14,6))

# Create a 3D view with the projection argument
ax = fig.add_subplot(1, 2, 1, projection='3d')
ax.plot_surface(X,Y,Z,rstride = 5,cstride = 5)

# Create a 3D view with a colour bar
ax = fig.add_subplot(1, 2, 2, projection='3d')
p = ax.plot_surface(X, Y, Z, rstride=5, cstride=5, cmap='rainbow', antialiased=True)
cb = fig.colorbar(p, shrink=0.5)

Rose charts

# Bar chart in polar coordinates
def showRose(values,title):
    max_value = values.max()
    # Split the circle into 8 sectors
    N = 8
    # Start angle of each sector
    angle = np.arange(0.,2 * np.pi, 2 * np.pi / N)
    # Size (radius) of each sector
    radius = np.array(values)
    # Set up the polar axes
    plt.axes([0, 0, 2, 2], polar=True,facecolor = 'g')
    colors = [(1 - x/max_value, 1 - x/max_value, 0.75) for x in radius]
    # Draw the bars
    plt.bar(angle, radius, width=(2*np.pi/N), bottom=0.0, color=colors)
    plt.title(title,x=0.2, fontsize=20)

Draw the figure
# Ravenna is a city in northern Italy, on the coastal plain about 10 km from the Adriatic Sea
data = np.load('Ravenna_wind.npy')
hist, angle = np.histogram(data,8,[0,360])
showRose(hist,'Ravenna')

Relationship between city climate and the sea

Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas import Series,DataFrame
# IPython/Jupyter magic; leave it out in a plain script
%matplotlib inline

Load the data
# Ferrara, a small town in Italy
ferrara1 = pd.read_csv('./ferrara_150715.csv')
ferrara2 = pd.read_csv('./ferrara_250715.csv')
ferrara3 = pd.read_csv('./ferrara_270615.csv')
ferrara = pd.concat([ferrara1,ferrara2,ferrara3],ignore_index=True)
# The other city DataFrames used below are not loaded in this excerpt

Drop the unused column
for city in [asti, bologna, cesena, ferrara, mantova, milano, piacenza, ravenna, torino]:
    city.drop(['Unnamed: 0'],axis = 1,inplace=True)

Get each city's distance from the sea, maximum and minimum temperature, and maximum and minimum humidity
cities = [ravenna, cesena, faenza, ferrara, bologna, mantova, piacenza, milano, asti, torino]
dist = [city['dist'][0] for city in cities]
temp_max = [city['temp'].max() for city in cities]
temp_min = [city['temp'].min() for city in cities]
hum_min = [city['humidity'].min() for city in cities]
hum_max = [city['humidity'].max() for city in cities]

Maximum temperature against distance from the sea
plt.axis([0,400,32,35])
plt.plot(dist,temp_max,'ro')

Split the data by distance from the sea
The plot suggests that the cities near the sea fall on one straight line and the cities far from the sea on another.
First convert the lists to numpy arrays for the later calculations.
Use 100 km and 50 km as cut-offs to split the data into a near-sea group (under 100 km) and a far-from-sea group (over 50 km); the two groups overlap between the cut-offs.
# Convert the lists to numpy arrays
x = np.array(dist)
print('x:',x)
y = np.array(temp_max)
print('y:',y)
# Near-sea group
x1 = x[x<100]
x1 = x1.reshape((x1.size,1))
print('x1:',x1)
y1 = y[x<100]
print('y1:',y1)
# Far-from-sea group
x2 = x[x>50]
x2 = x2.reshape((x2.size,1))
print('x2:',x2)
y2 = y[x>50]
print('y2:',y2)

Fit regression models with machine learning
from sklearn.svm import SVR
svr_lin1 = SVR(kernel='linear', C=1e3)
svr_lin2 = SVR(kernel='linear', C=1e3)
svr_lin1.fit(x1, y1)
svr_lin2.fit(x2, y2)
# Distances at which to evaluate each fitted line
xp1 = np.arange(10,100,10).reshape((9,1))
xp2 = np.arange(50,400,50).reshape((7,1))
yp1 = svr_lin1.predict(xp1)
yp2 = svr_lin2.predict(xp2)

Draw the regression lines
plt.plot(xp1, yp1, c='r', label='Strong sea effect')
plt.plot(xp2, yp2, c='b', label='Light sea effect')
#plt.axis('tight')
plt.scatter(x, y, c='k', label='data')
plt.legend()  # called after scatter so the data points are listed as well

Minimum temperature against distance from the sea
plt.axis((0,400,16,21))
plt.plot(dist,temp_min,'bo')

Minimum humidity against distance from the sea
plt.axis([0,400,70,120])
plt.plot(dist,hum_min,'bo')

Maximum humidity against distance from the sea
plt.axis([0,400,70,120])
plt.plot(dist,hum_max,'bo')

Mean humidity against distance from the sea
hum_mean = [ravenna['humidity'].mean(), cesena['humidity'].mean(), faenza['humidity'].mean(),
            ferrara['humidity'].mean(), bologna['humidity'].mean(), mantova['humidity'].mean(),
            piacenza['humidity'].mean(), milano['humidity'].mean(), asti['humidity'].mean(),
            torino['humidity'].mean()]
plt.plot(dist,hum_mean,'bo')

Wind speed against wind direction
plt.plot(ravenna['wind_deg'],ravenna['wind_speed'],'ro')

Compare wind direction with humidity and with wind speed in two subplots
plt.subplot(211)
plt.plot(cesena['wind_deg'],cesena['humidity'],'bo')
plt.subplot(212)
plt.plot(cesena['wind_deg'],cesena['wind_speed'],'bo')

Rose chart
def showRoseWind(values,city_name):
    '''
    Show the wind rose; the larger the radius, the more often the wind blows from that direction
    '''
    max_value = values.max()
    # Split the circle into 8 sectors
    N = 8
    # Start angle of each sector
    theta = np.arange(0.,2 * np.pi, 2 * np.pi / N)
    # Size (radius) of each sector
    radii = np.array(values)
    # Set up the polar axes
    plt.axes([0.025, 0.025, 0.95, 0.95], polar=True)
    colors = [(1 - x/max_value, 1 - x/max_value, 0.75) for x in radii]
    # Draw the bars
    plt.bar(theta, radii, width=(2*np.pi/N), bottom=0.0, color=colors)
    plt.title(city_name,x=0.2, fontsize=20)

Build a histogram with numpy: split the 360 degrees into 8 bins and sort the data into them
hist, bin_edges = np.histogram(ravenna['wind_deg'],8,[0,360])
print(hist)
hist = hist/hist.sum()  # convert the counts to fractions
print(bin_edges)
showRoseWind(hist,'Ravenna')

Mean wind speed in Milan for each direction
# Half-open ranges [low, high) so that no direction is counted twice
print(milano[milano['wind_deg']<45]['wind_speed'].mean())
print(milano[(milano['wind_deg']>=45) & (milano['wind_deg']<90)]['wind_speed'].mean())
print(milano[(milano['wind_deg']>=90) & (milano['wind_deg']<135)]['wind_speed'].mean())
print(milano[(milano['wind_deg']>=135) & (milano['wind_deg']<180)]['wind_speed'].mean())
print(milano[(milano['wind_deg']>=180) & (milano['wind_deg']<225)]['wind_speed'].mean())
print(milano[(milano['wind_deg']>=225) & (milano['wind_deg']<270)]['wind_speed'].mean())
print(milano[(milano['wind_deg']>=270) & (milano['wind_deg']<315)]['wind_speed'].mean())
print(milano[milano['wind_deg']>=315]['wind_speed'].mean())

Store the wind speed of each direction in a list
degs = np.arange(45,361,45)
tmp = []
for deg in degs:
    # sector [deg-45, deg); a reading of exactly 360 falls outside every sector
    tmp.append(milano[(milano['wind_deg']>=(deg-45)) & (milano['wind_deg']<deg)]['wind_speed'].mean())
speeds = np.array(tmp)
print(speeds)

Plot the wind speed of each direction
N = 8
theta = np.arange(0.,2 * np.pi, 2 * np.pi / N)
radii = np.array(speeds)
plt.axes([0.025, 0.025, 0.95, 0.95], polar=True)
colors = [(1-x/10.0, 1-x/10.0, 0.75) for x in radii]
bars = plt.bar(theta, radii, width=(2*np.pi/N), bottom=0.0, color=colors)
plt.title('Milano',x=0.2, fontsize=20)

Extract the steps into functions
def RoseWind_Speed(city):
    degs = np.arange(45,361,45)
    tmp = []
    for deg in degs:
        # sector [deg-45, deg)
        tmp.append(city[(city['wind_deg']>=(deg-45)) & (city['wind_deg']<deg)]['wind_speed'].mean())
    return np.array(tmp)

def showRoseWind_Speed(speeds,city_name):
    N = 8
    theta = np.arange(0.,2 * np.pi, 2 * np.pi / N)
    radii = np.array(speeds)
    plt.axes([0.025, 0.025, 0.95, 0.95], polar=True)
    colors = [(1-x/10.0, 1-x/10.0, 0.75) for x in radii]
    bars = plt.bar(theta, radii, width=(2*np.pi/N), bottom=0.0, color=colors)
    plt.title(city_name,x=0.2, fontsize=20)

Call the functions
showRoseWind_Speed(RoseWind_Speed(ravenna),'Ravenna')
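The sector averaging can also be written without pandas, which makes the binning rule explicit. The sketch below is plain Python under a few assumptions: directions are in degrees, a reading of 360 is treated as 0, and a sector with no readings yields None. The name `mean_speed_by_sector` is hypothetical and not part of the original notebook.

```python
def mean_speed_by_sector(degrees, speeds, sectors=8):
    """Average wind speed per direction sector of equal width.

    Sector i covers the half-open range [i * width, (i + 1) * width).
    """
    width = 360 / sectors
    totals = [0.0] * sectors
    counts = [0] * sectors
    for deg, speed in zip(degrees, speeds):
        index = int((deg % 360) // width)  # wrap 360 back to sector 0
        totals[index] += speed
        counts[index] += 1
    # None marks a sector that received no readings
    return [t / c if c else None for t, c in zip(totals, counts)]


if __name__ == "__main__":
    sample_degrees = [0, 44.5, 45, 359.9, 360]
    sample_speeds = [1, 3, 5, 7, 9]
    print(mean_speed_by_sector(sample_degrees, sample_speeds))
```

Computing the sector index with integer division puts every reading into exactly one sector, so boundary values such as 45 cannot be counted twice or dropped.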

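Count how many times each element occurs in a list

lst = ['moderate rain', 'thunderstorm', 'moderate to heavy rain', 'overcast', 'cloudy', 'sunny', 'moderate rain']
dic = {}
for i in lst:
    if i not in dic:
        dic[i] = lst.count(i)  # count each distinct value once
print(dic)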
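The same tally can be produced with `collections.Counter` from the standard library, which makes a single pass over the list instead of calling `count` once per distinct value. This is a small sketch; the weather strings are example data.

```python
from collections import Counter

weather = ['moderate rain', 'thunderstorm', 'moderate to heavy rain',
           'overcast', 'cloudy', 'sunny', 'moderate rain']

counts = Counter(weather)  # maps each value to its number of occurrences

if __name__ == "__main__":
    print(dict(counts))
    print(counts.most_common(1))  # the most frequent value with its count
```

A Counter behaves like a dict, so `counts['sunny']` reads one tally and a missing key returns 0 rather than raising KeyError.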

Diamond and glass ball game (diamond position fixed)

'''
At the start you may pick any drawer. Before it is opened,
the host opens one of the other drawers and shows the glass ball inside.
The host then gives you one chance to change your choice.
Analyse the question yourself: "Does keeping the choice give a higher chance of getting the diamond,
does switching give a higher chance, or does the chance stay the same?" Write down your reasoning and your result.
Then write a Python program to check your idea.
State whether the result supports your analysis or not,
attach the program, and explain your approach in comments in the code.
(Hint: the random-number functions can help with this program.)
'''
import random

print("The diamond and glass ball game begins")
# Three closed drawers are in front of you
lst_dic = [{'drawer': 'diamond'}, {'drawer': 'glass ball'}, {'drawer': 'glass ball'}]
zs = 0  # number of diamonds won
blq = 0  # number of glass balls drawn

def Game(your_choice, lst_dic):
    # The host opens the first drawer that is neither your choice nor the diamond
    revealed = None
    for num in range(len(lst_dic)):
        if num != your_choice and lst_dic[num]['drawer'] != 'diamond':
            print("The host reveals the glass ball in drawer %d" % (num + 1))
            revealed = num
            break
    # Decide at random whether to switch
    you_select = random.choice('yn')
    if you_select == 'y':
        # Switch to the drawer that is neither the revealed one nor the current choice
        # (the original compared against the loop variable, which no longer held the revealed drawer)
        for new_choice in range(len(lst_dic)):
            if new_choice != revealed and new_choice != your_choice:
                print("Your new choice is drawer", new_choice + 1)
                your_choice = new_choice
                break
    else:
        print("No change, I am keeping drawer %d" % (your_choice + 1))
    ChooseLater(your_choice)

def ChooseLater(your_choice):
    # Count the result after the final choice and announce the answer
    global zs, blq
    if lst_dic[your_choice]['drawer'] == 'diamond':
        zs += 1  # one more diamond
    else:
        blq += 1  # one more glass ball
    for answer_num, answer in enumerate(lst_dic):
        if answer['drawer'] == 'diamond':
            print("The diamond is in drawer %d" % (answer_num + 1))
            break

nums = int(input("Enter the number of rounds to play: "))
for i in range(nums):
    # You may pick any drawer
    your_choice = random.randint(0, 2)
    print("You currently pick drawer %d" % (your_choice + 1))
    Game(your_choice, lst_dic)
print("Diamonds won: %d" % zs)
print("Glass balls drawn: %d" % blq)
# Switching is chosen at random in each round, so this rate mixes both strategies
print("The diamond rate is %.2f" % (zs / nums))
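Because the program above decides at random whether to switch, its final rate blends the two strategies. To answer the question in the exercise, it helps to measure each strategy on its own. The following is a minimal sketch, not the author's solution: it assumes the diamond position is drawn at random in each round (the original keeps it fixed), and `play` and `win_rate` are hypothetical names.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one round and report whether the final choice holds the diamond."""
    drawers = [0, 1, 2]
    diamond = rng.choice(drawers)
    choice = rng.choice(drawers)
    # The host opens a drawer that is neither the player's choice nor the diamond
    revealed = rng.choice([d for d in drawers if d != choice and d != diamond])
    if switch:
        # Exactly one drawer is left that is neither chosen nor revealed
        choice = next(d for d in drawers if d != choice and d != revealed)
    return choice == diamond


def win_rate(switch: bool, trials: int = 10000, seed: int = 0) -> float:
    """Fraction of rounds won with a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("keep:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Keeping the first choice wins only when the first pick was already right, which happens one time in three; switching wins in the other two cases, so its rate should settle near two thirds.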

Stick figure pushing a heart (code found online)

from turtle import *

def go_to(x, y):
    # Move without drawing
    up()
    goto(x, y)
    down()

def head(x, y, r):
    go_to(x, y)
    speed(1)
    circle(r)
    leg(x, y)

def leg(x, y):
    right(90)
    forward(180)
    right(30)
    forward(100)
    left(120)
    go_to(x, y - 180)
    forward(100)
    right(120)
    forward(100)
    left(120)
    hand(x, y)

def hand(x, y):
    go_to(x, y - 60)
    forward(100)
    left(60)
    forward(100)
    go_to(x, y - 90)
    right(60)
    forward(100)
    right(60)
    forward(100)
    left(60)
    eye(x, y)

def eye(x, y):
    go_to(x - 50, y + 130)
    right(90)
    forward(50)
    go_to(x + 40, y + 130)
    forward(50)
    left(90)

def big_Circle(size):
    speed(20)
    for i in range(150):
        forward(size)
        right(0.3)

def line(size):
    speed(1)
    forward(51 * size)

def small_Circle(size):
    speed(10)
    for i in range(210):
        forward(size)
        right(0.786)

def heart(x, y, size):
    go_to(x, y)
    left(150)
    begin_fill()
    line(size)
    big_Circle(size)
    small_Circle(size)
    left(120)
    small_Circle(size)
    big_Circle(size)
    line(size)
    end_fill()

def main():
    pensize(2)
    color('red', 'pink')
    head(-120, 100, 100)
    heart(250, -80, 1)
    go_to(200, -300)
    write("To: Hany", move=True, align="left", font=("楷體", 20, "normal"))
    done()

main()

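Exercises 0525

# Numbers from 1000 to 2200 that are divisible by 7 but not by 5
for i in range(1000,2201):
    if i % 7 == 0 and i % 5 != 0:
        print(i,end = " ")

# Factorial by recursion
def func(num):
    if num == 0:
        return 1
    return num * func(num - 1)

print(func(5))

# Three-digit numbers equal to the sum of the cubes of their digits
for i in range(100,1000):
    a = i//100  # hundreds digit
    b = i//10 % 10  # tens digit
    c = i % 10  # units digit
    if a**3 + b**3 + c**3 == i:
        print(i)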
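The digit test above is fixed to three digits. A slightly more general check raises each digit to the power of the digit count, which reduces to the cube test for three-digit numbers. This is a sketch with a hypothetical function name, `is_narcissistic`.

```python
def is_narcissistic(n: int) -> bool:
    """True if n equals the sum of its digits, each raised to the number of digits."""
    digits = [int(ch) for ch in str(n)]
    power = len(digits)
    return n == sum(d ** power for d in digits)


if __name__ == "__main__":
    # For three-digit numbers this is the same test as a**3 + b**3 + c**3 == i
    print([n for n in range(100, 1000) if is_narcissistic(n)])
```

Working on the decimal string avoids the separate division and remainder steps for each digit position.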

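# Login with at most three attempts
name = 'seven'
passwd = '123'
num = 3
while num:
    input_name = input("Enter the user name: ")
    input_passwd = input("Enter the password: ")
    if input_name == name and input_passwd == passwd:
        print("Login successful")
        break
    else:
        print("Login failed")
        num = num - 1

# Build an x-by-y table whose entry in row i, column j is i * j
x, y = map(int, input("Enter two numbers separated by a comma: ").split(","))  # int() instead of eval() on user input
lst_num=[]
for i in range(x):
    L=[]
    for j in range(y):
        L.append(j*i)
    lst_num.append(L)
print(lst_num)

# Grade a score
def rank(score):
    if isinstance(score,int):
        if 90 <= score <= 100:
            print("Excellent")
        elif 80<= score <= 89:
            print("Good")
        elif 60<= score <= 79:
            print("Pass")
        elif 0<= score <= 59:
            print("Fail")
        else:
            print("Invalid input!")

try:
    score = int(input("Enter a student's score: "))  # int() instead of eval() on user input
    rank(score)
except ValueError:
    print("Please enter a number")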

def test(name): if name == 'exit': print('歡迎下次使用') if name[0].isalpha() or name[0] == '_': for i in name[1:]: if not (i.isalnum() or i == '_'): print('變量名不合法') break else: print('變量名合法!') else: print('變量名非法!') name = input() test(name)

def comb(n,m): if(n == m or (not m)): return 1 else: return comb(n-1,m) + comb(n-1,m-1) try: n,m = eval(input()) print(comb(n,m)) except : print("輸入有誤!")

def sort(dct): newDct={} items = list(dct.items()) items.sort(key=lambda x:x[1],reverse=True) for i in range(len(dct)): name,score = items[i] newDct[name] = score print("第%d名:%s,成績: %.2f分"%(i+1,name,newDct[name])) def avg(dct): scores = list(dct.values()) print("最高分:%.2f"%(max(scores)),end = "") print("最低分:%.2f" % (min(scores)),end="") print("平均分:%.2f" % (sum(scores) / len(scores)),end="") dct={} dct['張三'],dct['李四'],dct['王五'],dct['趙六'],dct['侯七']=eval(input()) sort(dct) avg(dct)

words = "Python" print("{:#^9}".format(words))

string = "Python" if "p" in string: print(string[:-1]) else: print(string[0:4])

# 建立文件data.txt，共100000行，每行存放一個1～100之間的整數 import random f = open('data.txt','w+') for i in range(100000): f.write(str(random.randint(1,100)) + '\n') f.seek(0,0) print(f.read()) f.close()

# 生成100個MAC地址並寫入文件中，MAC地址前6位（16進制）爲01-AF-3B # 01-AF-3B(-xx)(-xx)(-xx) # -xx # 01-AF-3B-xx # -xx # 01-AF-3B-xx-xx # -xx # 01-AF-3B-xx-xx-xx import random import string def create_mac(): mac = '01-AF-3B' # 生成16進制的數 hex_num = string.hexdigits for i in range(3): # 從16進制字符串中隨機選擇兩個數字 # 返回值是一個列表 n = random.sample(hex_num,2) # 拼接內容 將小寫字母轉換稱大寫字母 sn = '-' + ''.join(n).upper() mac += sn return mac # 主函數 隨機生成100個mac地址 def main(): with open('mac.txt','w') as f: for i in range(100): mac = create_mac() print(mac) f.write(mac + '\n') main()

# 生成一個大文件ips.txt,要求1200行,每行隨機爲172.25.254.0/24段的ip # 讀取ips.txt文件統計這個文件中ip出現頻率排前10的ip import random def create_ip(filename): ip=['172.25.254.'+str(i) for i in range(1,255)] with open(filename,'w') as f: for i in range(1200): f.write(random.sample(ip,1)[0]+'\n') create_ip('ips.txt') ips_dict={} with open('ips.txt') as f : for ip in f: ip=ip.strip() if ip in ips_dict: ips_dict[ip]+=1 else: ips_dict[ip]=1 sorted_ip=sorted(ips_dict.items(),key=lambda x:x[1],reverse=True)[:10] print(sorted_ip)
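
The counting step of the IP exercise can be written more compactly with `collections.Counter`. This is only a sketch of an alternative; `top_ips` is a hypothetical helper name, and it takes any iterable of lines (an open file works too) rather than reading ips.txt itself.

```python
from collections import Counter


def top_ips(lines, n=10):
    """Return the n most frequent non-empty lines as (ip, count) pairs."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["172.25.254.1\n", "172.25.254.2\n", "172.25.254.1\n"]
    print(top_ips(sample, 2))
```

With a real file the call would look like `with open('ips.txt') as f: print(top_ips(f))`.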

jieba嚐鮮

import jieba strings = '我工做在安徽的安徽師範大學，這個大學很美麗，在蕪湖' # print(dir(jieba)) dic_strings = {} lst_strings = jieba.lcut(strings) for ci in lst_strings: # 對獲得的分詞進行彙總 dic_strings[ci] = lst_strings.count(ci) # 更改字典中單詞出現的次數 print(dic_strings)

inf 無窮 inf = float('inf')

讀取文件進行繪圖

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Raw strings keep the regex separator \s+ from being read as a string escape
cqlq = pd.read_csv("cqlq.txt", sep=r"\s+", encoding="gbk")
dxnt = pd.read_csv("dxnt.txt", sep=r"\s+", encoding="gbk")
ggdq = pd.read_csv("ggdq.txt", sep=r"\s+", encoding="gbk")
giyy = pd.read_csv("gjyy.txt", sep=r"\s+", encoding="gbk")
cqlq.columns = ["date","oppr","hipr","lopr","clpr","TR"]
dxnt.columns = ["date","oppr","hipr","lopr","clpr"]
ggdq.columns = ["date","oppr","hipr","lopr","clpr","TR"]
giyy.columns = ["date","oppr","hipr","lopr","clpr","TR"]
a = cqlq
b = dxnt
c = ggdq
d = giyy
# Daily change of the closing price for each stock
ua = (a["clpr"]-a["clpr"].shift(1))/a["clpr"]
ub = (b["clpr"]-b["clpr"].shift(1))/b["clpr"]
uc = (c["clpr"]-c["clpr"].shift(1))/c["clpr"]
ud = (d["clpr"]-d["clpr"].shift(1))/d["clpr"]
u = pd.concat([ua,ub,uc,ud],axis=1)
u = u.dropna()  # dropna returns a new frame, so the result has to be assigned
miu = u.mean()+0.005
jz = u.cov()
yi = np.ones(4)
# Matrix objects, so that * is matrix multiplication and .I is the inverse
miu = np.asmatrix(miu.values)
jz = np.asmatrix(jz.values)
yi = np.asmatrix(yi)
nijz = jz.I
a = miu*nijz*miu.T
b = yi*nijz*miu.T
c = yi*nijz*yi.T
deta = a*c-b**2
stock_y = [i*0.0001 for i in range(100)]
stock_x = [(np.sqrt((c/deta)*(rp-b/c)**2+1/c)).max() for rp in stock_y]
plt.rcParams['font.sans-serif']=['SimHei']
plt.plot(stock_x,stock_y)
# stock_x is a square root of the variance expression, i.e. a standard deviation
plt.xlabel("Standard deviation")
plt.ylabel("Expected return")
print(miu)
print(jz)
plt.show()

Sqlite3 實現學生信息增刪改查 import sqlite3 conn = sqlite3.connect('studentsdb.db') # 鏈接數據庫 cursor = conn.cursor( ) # 建立數據表 def createDatabase(): '''建立一個數據表''' sql = 'create table student(stuId int primary key,stuName text,stuAge text,stuGender text,stuClass text)' cursor.execute(sql) conn.commit() def addInfo(sql = ''): '''添加數據''' if sql =='': # 若是是初始化,則默認會進行增長 6 條數據 stuInfo = [(1001, '小華', '20', '男', '二班'), (1002, '小明', '19', '女', '二班'), (1003, '小李', '20', '女', '一班'), (1004, '小王', '18', '男', '一班'), (1005, '小劉', '20', '女', '二班'), (1006, '小張', '19', '女', '一班')] cursor.executemany("insert into student values(?,?,?,?,?)",stuInfo) # 插入多條語句 conn.commit() def deleteInfo(): '''刪除數據''' cursor.execute("delete from student where stuId = 1005") # 將學號爲 1005 的小劉同窗刪除 conn.commit() def modifyInfo(): '''修改數據''' sql = "update student set stuAge = ? where stuId = ?" cursor.execute(sql,(20,1006)) # 將小張的年齡修改成 20 conn.commit() def selectInfo(): '''查詢學生信息''' sql = 'select * from student' # 查詢所有數據 cursor.execute(sql) print(cursor.fetchall()) def main(): # 建立一個數據表 createDatabase() # 添加數據 print("添加六條學生數據以後") addInfo() selectInfo() # 修改數據 print("將小張的年齡修改成 20") modifyInfo() selectInfo() # 刪除數據 print("將學號爲 1005 的小劉同窗刪除") deleteInfo() selectInfo() # cursor.execute('drop table student') # conn.commit() main()


使用正則匹配數字

import re pattern = '\d+?\.\d+' s = "[Decimal('90.900000')]" s2 = "[Decimal('75.900000'),Decimal('57.280000')]" [print(i,end = " ") for i in re.findall(pattern,s)] print() [print(i,end = " ") for i in re.findall(pattern,s2)]

0528習題

''' 1. 編寫程序實現：計算並輸出標準輸入的三個數中絕對值最小的數。 ''' #計算並輸出標準輸入的三個數中絕對值最小的數。 import math num1 = int(input()) num2 = int(input()) num3 = int(input()) num_list = (num1, num2, num3) index_min = 0 #絕對值最小的元素的下標 if math.fabs(num_list[index_min]) > math.fabs(num_list[1]): index_min = 1 if math.fabs(num_list[index_min]) > math.fabs(num_list[2]): index_min = 2 for n in num_list: if math.fabs(num_list[index_min]) == math.fabs(n): print(n, end=' ')

''' 2. 編寫程序，功能是輸入五分製成績， 輸出對應的百分制分數檔。 不考慮非法輸入的情形。 對應關係爲：A: 90~100, B: 80~89, C: 70~79,D: 60~69, E: 0~59 ''' score = int(input()) if 90 <= score <= 100: print("A") elif 80 <= score <= 89: print("B") elif 70 <= score <= 79: print("C") elif 60 <= score <= 69: print("D") elif 0 <= score <= 59: print("E") else: print("請輸入正確分數")

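'''
3. Write a program that reads a year and a month and prints the number of days in that month.
Gregorian leap-year rule: a year divisible by 4 but not by 100 is a leap year;
a year divisible by 400 is also a leap year.
'''
year = int(input("Enter the year: "))
month = int(input("Enter the month: "))
month_lst = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
# The two conditions are alternatives, so they are joined with "or"
if (year % 4 == 0 and year % 100 != 0) or year % 400 == 0:
    month_lst[1] = 29
print(month_lst[month - 1])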
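
The leap-year rule in the exercise can be cross-checked against the standard library. The sketch below wraps the rule in two hypothetical helpers, `is_leap` and `days_in_month`, and compares them with `calendar.monthrange`, which returns the weekday of the first day and the number of days of a month.

```python
import calendar


def is_leap(year):
    """Gregorian rule: divisible by 4 and not by 100, or divisible by 400."""
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0


def days_in_month(year, month):
    """Number of days in the given month (1-12) of the given year."""
    days = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    if month == 2 and is_leap(year):
        return 29
    return days[month - 1]


if __name__ == "__main__":
    for year in (1900, 2000, 2020, 2021):
        # calendar.monthrange(year, month)[1] is the library's day count
        print(year, days_in_month(year, 2), calendar.monthrange(year, 2)[1])
```

The years 1900 and 2000 are useful test values because they separate the "divisible by 100" case from the "divisible by 400" case.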

''' 4. 編寫程序，功能是輸入一個數，輸出該數的絕對值 ''' num = int(input()) print(abs(num))

''' 5. 編寫「猜數遊戲」程序， 功能是： 若是用戶輸入的數等於程序選定的數（該數設定爲10），則輸出「you win」， 不然若是大於選定的數，則輸出「too big」，反之輸出「too small」。 ''' num = 10 your_num = int(input()) if your_num == num: print("you win") elif your_num > num: print("too big") else: print("too small")

關於某一爬蟲實例的總結

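os.chdir(r"C:\Users\47311\Desktop\code") # change to your own directory; a raw string cannot end with a backslash

data = pd.read_excel(r"公司公告2020.xlsx")[:-1] # read the data and drop the last row (it is empty)

Relative file paths are resolved against the directory set by chdir.

When a cell holds several values, splitting the string on " can fail, so the split needs exception handling.

DataFrame.apply(function) is used often; it can assign the transformed values back to a column.
def address(text):  # extract the announcement URL
    try:
        return text.split('"')[1]
    except (AttributeError, IndexError):  # not a string, or no quoted part
        return None
data["公告地址"] = data["公告地址"].apply(address)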

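To extract a single value from a page, first fetch the page content:
html = requests.get(url).text
Parse it with etree.HTML(html).
Read the node with an XPath expression: tree.xpath("xxxx")
Return the extracted value joined onto the site prefix:
return "http://xxxx.com/" + url[0]

data.iterrows() reads the data row by row:
for index, row in data.iterrows():
row['屬性'] gets the value of one field.
Add the file suffix:
name = row['公告標題'].split(':')[0] + row["證券代碼"][:6] + "_" + row["公告日期"] + ".pdf"

While crawling, report the necessary status information.
Save the file with urlretrieve(url,filename = r' xxx ')

When the fetched data may be missing, set a minimum length with len( data ) and filter out data that is shorter.
Use a global boolean flag: set it to True when the page was reached and to False when it was not, and decide from the flag whether to continue.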

是否感染病毒

import random ganran = float(input("請輸入感染機率")) is_person_ganran = False # 人是否感染了 person_ganran = random.randint(0,100) if person_ganran /100 < ganran: is_person_ganran = True print(person_ganran) if is_person_ganran: print("被感染了") else: print("仍是正常人")

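Python file operations
file = open('abc.txt','r',encoding='utf-8')
file = open('abc.txt','w',encoding='utf-8')
'w' is write mode: it empties the file first and then writes.
To keep the existing contents, use 'a' (append) instead.
The with keyword, with open(xxx) as f, closes the file automatically, so a forgotten close() is not a problem.
readline() reads one line; the returned text keeps its trailing \n.
readlines() returns a list with one element per line.
seek
seek(n) moves the cursor to position n. Note: the unit is bytes, so inside UTF-8 Chinese text the offset has to fall on a character boundary (most Chinese characters take 3 bytes).
seek(0,0) moves to the start of the file (the second argument defaults to 0).
seek(0,1) stays at the current position.
seek(0,2) moves to the end of the file.
tell() returns the current cursor position.
Modifying a file
Read the contents, write the modified contents to a new file, delete the original file, and rename the new file to the original name.
Reading and modifying line by line avoids running out of memory.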
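
The seek/tell behaviour and the "write a new file, then rename it" procedure described above can be sketched as follows. `demo_seek` and `replace_in_file` are hypothetical helper names; the temporary file is created with `tempfile.mkstemp`, and `os.replace` stands in for the separate delete-and-rename steps.

```python
import os
import tempfile


def demo_seek(path):
    """Return (file size, first line, offset after the first line) in bytes."""
    with open(path, "rb") as f:   # binary mode: offsets are plain byte counts
        f.seek(0, 2)              # jump to the end of the file
        size = f.tell()
        f.seek(0, 0)              # back to the start
        first = f.readline()
        return size, first, f.tell()


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite path line by line via a temporary file, replacing old with new."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    # newline="" keeps the original line endings untouched
    with os.fdopen(fd, "w", encoding=encoding, newline="") as dst, \
            open(path, "r", encoding=encoding, newline="") as src:
        for line in src:          # one line at a time, so memory use stays small
            dst.write(line.replace(old, new))
    os.replace(tmp_path, path)    # the new file takes over the original name


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    name = os.path.join(folder, "abc.txt")
    with open(name, "wb") as f:
        f.write("中文\nabc\n".encode("utf-8"))
    print(demo_seek(name))
    replace_in_file(name, "abc", "xyz")
    print(demo_seek(name))
```

Creating the temporary file in the same directory as the target keeps the final rename on one filesystem.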

pandas 幾個重要知識點

將 NaN 替換成某一數值 使用 fillna dataframe.fillna(value = 'xxx',inplace=True) 刪除某一個值 使用 drop dataframe.drop(10,inplace=True) 交換兩行的值 if m != n: temp = np.copy(dataframe[m]) dataframe[m] = dataframe[n] dataframe[n] = temp else: temp = np.copy(dataframe[dataframe.shape[1]-1]) dataframe[dataframe.shape[1]-1] = dataframe[n] dataframe[n] = temp 刪除 columns 這些列 dataframe.drop(columns = list, inplace=True)

一千美圓的故事(錢放入信封中)

def dollar(n): global story_money money = [] for i in range(10): if 2**(i+1) > story_money-sum(money): money.append(story_money-2**i+1) break money.append(2 ** i) # print(money) answer = [] if n >= money[-1]: answer.append(10) n -= money[-1] n = list(bin(n))[2:] n.reverse() rank = 1 for i in n: if i == '1': answer.append(rank) rank += 1 print(answer) story_money = 1000 dollar(500)

給定兩個列表,轉換爲DataFrame類型

import pandas as pd def get_data(): q1 = [] q2 = [] p1 = input("list 1:") p2 = input("list 2:") q1=p1.split(',') q2=p2.split(',') for i,j in zip(range(len(q1)),range(len(q2))): q1[i] = int(q1[i])**1 q2[j] = float(q2[j])**2 dic = { "L":q1, "I":q2 } A = pd.DataFrame(dic) print(A) get_data() 1.將輸入的使用 split(',') 進行分割 2.使用 for i,j in zip(range(len(q1)),range(len(q2))) 對 q1 和 q2 都進行遍歷 3.使用字典,將列表做爲值,傳遞過去 使用 pd.DataFrame 進行轉換

經過文檔算學生的平均分

tom 85 90 jerry 95 80 lucy 80 90 rose 88 90 jay 76 75 summer 87 85 horry 84 80

dic = {} with open('score.txt','r') as f: lines = f.readlines() f.close() for line in lines: line = line.strip('\n').split(' ') dic[line[0]] = (int(line[1]),int(line[2])) name = input() if name not in dic.keys(): print("not found") else: print(sum(dic[name])/len(dic[name]))

0528習題 11-15

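'''
6. Quadratic equation: ax^2 + bx + c = 0 (a ≠ 0)
[Input] the values of a, b and c (rational numbers)
[Output] the two values of x, or No (when there is no rational solution)
'''
import math
a = int(input())
b = int(input())
c = int(input())
disc = b*b - 4*a*c
p = -b/(2*a)
if disc > 0:
    q = math.sqrt(disc)/(2*a)
    x1 = p + q
    x2 = p - q
    # Two placeholders need a tuple of two values
    print("x1 = %s,x2 = %s" % (x1, x2))
elif disc == 0:
    x1 = p
    print("x1 = x2 = ", x1)
else:
    # Negative discriminant: this program prints the real and imaginary parts instead of "No"
    disc = -disc
    q = math.sqrt(disc)/(2*a)
    print("x1 = ", p, "+", q)
    print("x2 = ", p, "-", q)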

''' 7. 計算1+1/2+1/3+...+1/n ''' n = int(input()) sum = 0 for i in range(1,n+1): sum += 1/i print(sum)

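'''
8. Write a number-guessing game: the user may enter numbers repeatedly
until the number chosen by the program (assumed to be 100) is guessed.
If the input is larger than the chosen number, print "larger than expected";
if it is smaller, print "less than expected";
if it is equal, print "you win" and end the program.
'''
num = 100  # the number fixed by the problem statement
while True:
    your_num = int(input())
    if your_num == num:
        print("you win")
        break
    elif your_num > num:
        print("larger than expected")
    else:
        print("less than expected")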

''' 9. 計算1-100之間的偶數和 ''' num_lst = [i for i in range(1,101) if i % 2 == 0] print(sum(num_lst))

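'''
10. A monkey picks some peaches. On the first day it eats half of them plus one;
on each following day it eats half of what was left the day before plus one.
Before eating on day n it finds that only one peach is left.
Write a program that computes how many peaches were picked, given the number of days.
[Input] one line with a non-negative integer, the number of days.
[Output] one line with a non-negative integer, the number of peaches picked.
If the input is illegal (for example a negative number or a decimal), print "illegal data".
'''
def Peach(day, now_rest):
    # Work backwards: one day earlier there were (rest + 1) * 2 peaches
    if day > 1:
        return Peach(day - 1, (now_rest + 1) * 2)
    return now_rest

try:
    day = int(input())
except ValueError:  # decimals and other non-integer input
    day = 0
if day <= 0:
    print("illegal data")
else:
    print(Peach(day, 1))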

0528習題 6-10

'''
1. Write a program that turns the uppercase letters of the input string into lowercase
and the lowercase letters into uppercase, leaving all other characters unchanged. Print the result.
'''
string = input()
s = ''
for ch in string:  # named ch so that the built-in str is not shadowed
    if 'a' <= ch <= 'z':
        s += ch.upper()
    elif 'A' <= ch <= 'Z':
        s += ch.lower()
    else:
        s += ch
print(s)

'''
2. Given 10 four-digit numbers, print all symmetric numbers and their count n;
for example, 1221 and 2332 are symmetric numbers.
[Input] 10 four-digit numbers separated by spaces
[Output] all symmetric numbers among the input, and how many there are
'''
input_nums = input().split()
nums = []
for num in input_nums:
    nums.append(int(num))
symmetric_num = []
for num in nums:
    num = str(num)
    if num[0] == num[3] and num[1] == num[2]:
        symmetric_num.append(num)
print("Symmetric numbers:")
for i in symmetric_num:
    print(i, end=" ")
print(len(symmetric_num))
# 1221 2243 2332 1435 1236 5623 4321 4356 6754 3234

'''
The school holds a singing contest for new students. A contestant's result is the average
of the judges' scores after removing one highest and one lowest score.
Write a program: the first input line gives n, then each following line gives one judge's score
(n lines in total). Compute and print the contestant's result.
'''
n = int(input())
player = []
for i in range(n):
    score = float(input())
    player.append(score)
player.remove(max(player))
player.remove(min(player))
print("%.1f" % (sum(player) / len(player)))

'''
1. Write a program that prints the number with the smallest absolute value among three numbers read from standard input.
'''
import math
num1 = int(input())
num2 = int(input())
num3 = int(input())
num_list = (num1, num2, num3)
index_min = 0  # index of the element with the smallest absolute value
if math.fabs(num_list[index_min]) > math.fabs(num_list[1]):
    index_min = 1
if math.fabs(num_list[index_min]) > math.fabs(num_list[2]):
    index_min = 2
for n in num_list:
    if math.fabs(num_list[index_min]) == math.fabs(n):
        print(n, end=' ')

'''
5. Read non-zero integers from the keyboard until 0 is entered, then print the average
and count the positive and negative numbers.
[Input] one integer per line; the last line is 0 and marks the end of the input.
[Output] three lines: the average, the number of positive values, the number of negative values.
'''
nums = []
n_z = 0
n_f = 0
while True:
    num = int(input())
    if num == 0:
        print(sum(nums) / len(nums))
        for n in nums:
            if n > 0:
                n_z += 1
            elif n < 0:
                n_f += 1
        print(n_z)
        print(n_f)
        exit()
    else:
        nums.append(num)

0528習題 16-20

''' 11. 編寫程序，判斷一個數是否是素數，是則輸出「Yes」，不是輸出「No」.(while循環) ''' num = int(input()) i = 2 flag = True while i < num: if num % i ==0: flag = False i += 1 if flag: print("Yes") else: print("No")

''' 12. 編程實現：從鍵盤輸入5個分數，計算平均分。 【輸入形式】5個分數，每一個分數佔一行。 【輸出形式】新起一行輸出平均分。 ''' nums = [] for i in range(5): num = float(input()) nums.append(num) print(sum(nums)/len(nums))

''' 13. 輸入3個整數，輸出其中最大的一個 。 ''' nums = [] for i in range(3): num = int(input()) nums.append(num) print(max(nums))

''' 14. 輸入n，計算n!（n!=1*2*3*...*n） ''' n = int(input()) sum = 1 for i in range(1,n+1): sum *= i print(sum)

''' 編寫程序，打印菱形圖案，行數n從鍵盤輸入。 下爲n=3時的圖案，其中的點號實際爲空格。圖案左對齊輸出。 ''' n = 3 for i in range(1, n + 1): print(" " * (n - i) + "* " * (2 * i - 1)) for i in range(n-1,0,-1): print(" " * (n - i) + "* " * (2 * i - 1))

0528習題 21-25

''' 16. 編寫程序計算學生的平均分。 【輸入形式】輸入的第一行表示學生人數n； 標準輸入的第2至n+1行表示學生成績。 【輸出形式】輸出的一行表示平均分（保留兩位小數）。 若輸入的數據不合法（學生人數不是大於0的整數， 或學生成績小於0或大於100），輸出「illegal input」。 ''' n = int(input()) nums = [] for i in range(n): score = float(input()) if not 0<= score <= 100: print("illegal input") nums.append(score) print("%.2f"%(sum(nums)/len(nums))) ''' 17. 請將一萬之內的徹底平方數輸出 . ''' for x in range(1,101): y = x*x if y <= 10000: print(y) else: break ''' 18. 從鍵盤輸入非0整數，以輸入0爲輸入結束標誌，求平均值，統計正數負數個數 【輸入形式】每一個整數一行。最後一行是0，表示輸入結束。 【輸出形式】輸出三行。 第一行是平均值。第二行是正數個數。第三行是負數個數。 ''' nums = [] n_z = 0 n_f = 0 while True: num = int(input()) if num == 0: print(sum(nums)/len(nums)) for n in nums: if n > 0: n_z += 1 elif n < 0: n_f += 1 print(n_z) print(n_f) exit() else: nums.append(num) ''' 【問題描述】從鍵盤輸入一個大寫字母，要求輸出其對應的小寫字母。 【輸入形式】輸入大寫字母，不考慮不合法輸入。 【輸出形式】輸出對應的小寫字母。 【樣例輸入】A 【樣例輸出】a ''' s = input() print(s.lower()) ''' 【問題描述】 從鍵盤輸入三個字符，按ASCII碼值從小到大排序輸出，字符之間間隔一個空格。 【輸入形式】 輸入三個字符，每一個字符用空格隔開。 【輸出形式】 相對應的輸出按照ASCII碼值從小到大排列的三個字符，每一個字符間用空格隔開。 【樣例輸入】a c b 【樣例輸出】a b c ''' strings = input().split(' ') strings = sorted(strings) for s in strings: print(s,end = " ")

0528習題 26-31

''' 【問題描述】定義一個函數判斷是否爲素數isPrime（）， 主程序經過調用函數輸出2-30之間全部的素數。 素數：一個大於1的天然數，除了1和它自己外，不能被其餘天然數整除。 【輸入形式】無【輸出形式】2~30之間全部的索數（逗號分隔） 【樣例輸入】【樣例輸出】2,3,5,7,11,13,17,19,23,29， 【樣例說明】【評分標準】 ''' def isPrime(n): i = 2 flag = True while i < n: if n % i == 0: flag = False i += 1 if flag: return True else: return False for i in range(2,31): if isPrime(i): print(i,end = ',')

''' 【問題描述】有182只兔子，分別裝在甲乙兩種籠子裏， 甲種籠子（x）每一個裝6只，乙種籠子（y）每一個裝4只， 兩種籠子正好用36個，問兩種籠子各用多少個？ 【輸入形式】無 【輸出形式】籠子的個數 【樣例輸入】 【樣例輸出】x=*；y=* 【輸出說明】 1)*表明輸出的值； 2)輸出的等號和分號都是英文字符 ''' for i in range(1,36): x = i y = 36 - i if 6*x + 4*y == 182: print("x=%d;y=%d"%(x,y))

''' 輸入圓柱體的底面半徑和高，求圓柱體的體積並輸出。 圓周率T取固定值3.14。 【輸入形式】圓柱體的底面半徑和高 【輸出形式】圓柱體的體積 【樣例輸入】2 【樣例輸出】50.24 ''' r = float(input()) h = float(input()) pi = 3.14 print("%.2f"%(pi*r*r*h))

''' 【問題描述】猴子吃桃問題： 猴子摘下若干個桃子，第一天吃了桃子的一半多一個， 之後天天吃了前一天剩下的一半多一個， 到第n天吃之前發現只剩下一個桃子， 編寫程序實現：據輸入的天數計算並輸出猴子共摘了幾個桃子。 【輸入形式】n。 【輸出形式】共摘了幾個桃子 【樣例輸入】3 【樣例輸出】10 【樣例輸入】1 【樣例輸出】1 ''' day = int(input()) now = 1 yesterday = 0 while day > 1: yesterday = (now + 1) * 2 now = yesterday day -= 1 print(now)
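
The backward recurrence used for the peach problem can be checked against a closed form. Working back from one peach, each step maps `rest` to `(rest + 1) * 2`, which gives 3 * 2^(n-1) - 2 peaches for n days. The sketch below uses two hypothetical helpers, `peaches` and `peaches_closed_form`, to compare both.

```python
def peaches(day):
    """Peaches originally picked if one peach is left before eating on `day`."""
    if day < 1:
        raise ValueError("day must be a positive integer")
    rest = 1
    for _ in range(day - 1):
        rest = (rest + 1) * 2     # undo one day of "eat half plus one"
    return rest


def peaches_closed_form(day):
    """Closed form of the same recurrence: 3 * 2**(day - 1) - 2."""
    return 3 * 2 ** (day - 1) - 2


if __name__ == "__main__":
    for day in (1, 2, 3, 10):
        print(day, peaches(day), peaches_closed_form(day))
```

The sample values from the exercise (3 days give 10 peaches, 1 day gives 1 peach) agree with both functions.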

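'''
Read the scores of 5 students into a list and report the highest score, the lowest score,
the average and the pass rate. The average and the pass rate keep two decimals;
the pass rate is printed as x%.
[Input] the scores of 5 students
[Output] highest score, lowest score, average, pass rate
[Sample input] 56 67 55 66 70
[Sample output] 70 55 62.80 60.00%
'''
score = []
for i in range(5):
    num = float(input())
    score.append(num)
n = 0
for i in score:
    if i >= 60:  # a score of exactly 60 is counted as a pass (assumed)
        n += 1
print(max(score))
print(min(score))
print("%.2f" % (sum(score) / len(score)))  # two decimals, as the task requires
print("%.2f%%" % (n * 100 / len(score)))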

''' 【問題描述】 文件「in.txt」中存儲了學生的姓名和成績， 每一個人的姓名成績放在一行，中間用空格隔開， 形式以下：Sunny 70 Susan 88從文件讀取數據後， 存入字典，姓名做爲字典的鍵，成績做爲字典的值。 而後輸入姓名，查詢相應的成績，查不到，顯示"not found"。 【輸入形式】姓名 【輸出形式】成績 【樣例輸入】鍵盤輸入：Susan ''' name = input() flag = False with open('in.txt','r',encoding='utf-8') as fp: for line in fp: line = line.replace('\n','') if line != "": lst = line.split() tup = tuple(lst) # print(tup) if tup[0] == name: flag = True print(tup[-1]) if not flag: print("not found") in.txt文件內容 Sunny 70 Susan 80

讀取 csv , xlsx 表格並添加總分列

import pandas as pd
# The file is a CSV, so read_csv is used; usecols selects the needed columns
# (the gbk encoding is an assumption, matching the encoding used when saving below)
data = pd.read_csv('學生成績表.csv', usecols = ['學號','姓名','高數','英語','計算機'], encoding = 'gbk')
# Add up the three score columns row by row
data['總分'] = data[['高數','英語','計算機']].sum(axis = 1)
print(data['總分'])
data.to_csv('學生成績表1.csv', encoding = 'gbk')

matplotlib 顯示中文問題 import matplotlib.pyplot as plt plt.rcParams['font.sans-serif']=['SimHei'] #用來正常顯示中文標籤 plt.rcParams['axes.unicode_minus']=False #用來正常顯示負號

十進制轉換

# Digit characters used by dec2bin and dec2hex
base = [str(x) for x in range(10)] + [chr(x) for x in range(ord('A'), ord('A') + 6)]

# bin2dec
# binary to decimal: int(str, 2)
def bin2dec(string_num):
    return str(int(string_num, 2))

# hex2dec
# hexadecimal to decimal
def hex2dec(string_num):
    return str(int(string_num.upper(), 16))

# dec2bin
# decimal to binary; the built-in bin() does the same
def dec2bin(string_num):
    num = int(string_num)
    mid = []
    while True:
        if num == 0:
            break
        num, rem = divmod(num, 2)
        mid.append(base[rem])
    return ''.join([str(x) for x in mid[::-1]])

# dec2hex
# decimal to octal: oct()
# decimal to hexadecimal: hex()
def dec2hex(string_num):
    num = int(string_num)
    mid = []
    while True:
        if num == 0:
            break
        num, rem = divmod(num, 16)
        mid.append(base[rem])
    return ''.join([str(x) for x in mid[::-1]])

base = [str(x) for x in range(10)] + [chr(x) for x in range(ord('A'), ord('A') + 6)] def dec2bin(string_num): '''十進制轉換爲 二進制''' num = int(string_num) mid = [] while True: if num == 0: break num, rem = divmod(num, 2) mid.append(base[rem]) return ''.join([str(x) for x in mid[::-1]]) def dec2oct(string_num): '''轉換爲 八進制''' num = int(string_num) mid = [] while True: if num == 0: break num, rem = divmod(num, 8) mid.append(base[rem]) return ''.join([str(x) for x in mid[::-1]]) def dec2hex(string_num): '''轉換爲 十六進制''' num = int(string_num) mid = [] while True: if num == 0: break num, rem = divmod(num, 16) mid.append(base[rem]) return ''.join([str(x) for x in mid[::-1]]) num = float(input()) print(dec2bin(num),dec2oct(num),dec2hex(num))
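
The three conversion functions above differ only in the divisor, so the repeated-division idea can be folded into one function. The sketch below is an illustration with a hypothetical helper `to_base`; unlike the functions above it returns "0" for zero and accepts negative numbers.

```python
DIGITS = "0123456789ABCDEF"


def to_base(num, base):
    """Convert an integer to a string in the given base (2 to 16)."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if num == 0:
        return "0"
    sign = "-" if num < 0 else ""
    num = abs(num)
    out = []
    while num:
        num, rem = divmod(num, base)   # rem is the next digit, lowest first
        out.append(DIGITS[rem])
    return sign + "".join(reversed(out))


if __name__ == "__main__":
    print(to_base(255, 2), to_base(255, 8), to_base(255, 16))
```

The built-ins `bin`, `oct` and `hex` give the same digits with a `0b`, `0o` or `0x` prefix, which makes them a convenient cross-check.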

正則表達式鞏固

# 導入re模塊 import re # 使用match方法進行匹配操做 result = re.match(正則表達式,要匹配的字符串) # 若是上一步匹配到數據的話，能夠使用group方法來提取數據 result.group()

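re.match checks whether the string matches the regular expression starting from its first character. If it does, match returns a match object (Match Object); otherwise it returns None.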
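
Because `re.match` returns None on failure, calling `.group()` directly on the result raises AttributeError when nothing matched. A small guard avoids that; `match_prefix` below is a hypothetical helper name used only for this sketch.

```python
import re


def match_prefix(pattern, text):
    """Return the matched prefix, or None when text does not start with pattern."""
    m = re.match(pattern, text)
    return m.group() if m else None


if __name__ == "__main__":
    print(match_prefix("itcast", "itcast.cn"))      # matches at the start
    print(match_prefix("itcast", "www.itcast.cn"))  # no match at the start
```

The second call returns None because `re.match` only looks at the beginning of the string; `re.search` would find the same text further in.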

import re result = re.match("itcast","itcast.cn") result.group()

re.match() 可以匹配出以xxx開頭的字符串

大寫字母表示 非 \w 匹配字母,數字,下劃線 \W 表示除了字母 數字 下劃線的

import re ret = re.match(".","a") ret.group() ret = re.match(".","b") ret.group() ret = re.match(".","M") ret.group()

import re # 若是hello的首字符小寫，那麼正則表達式須要小寫的h ret = re.match("h","hello Python") ret.group() # 若是hello的首字符大寫，那麼正則表達式須要大寫的H ret = re.match("H","Hello Python") ret.group() # 大小寫h均可以的狀況 ret = re.match("[hH]","hello Python") ret.group() ret = re.match("[hH]","Hello Python") ret.group() # 匹配0到9第一種寫法 ret = re.match("[0123456789]","7Hello Python") ret.group() # 匹配0到9第二種寫法 ret = re.match("[0-9]","7Hello Python") ret.group()

import re # 普通的匹配方式 ret = re.match("嫦娥1號","嫦娥1號發射成功") print(ret.group()) ret = re.match("嫦娥2號","嫦娥2號發射成功") print(ret.group()) ret = re.match("嫦娥3號","嫦娥3號發射成功") print(ret.group()) # 使用\d進行匹配 ret = re.match("嫦娥\d號","嫦娥1號發射成功") print(ret.group()) ret = re.match("嫦娥\d號","嫦娥2號發射成功") print(ret.group()) ret = re.match("嫦娥\d號","嫦娥3號發射成功") print(ret.group())

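A regular expression uses "\" as its escape character. To match a literal "\" in the text, the pattern needs "\\" (in Python, written most easily as the raw string r"\\").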
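
The escaping rule can be seen by splitting a Windows-style path on its backslashes. In the sketch below the path is an invented example; the same pattern is written once as a plain string and once as a raw string.

```python
import re

# The text itself is C:\temp\notes.txt (each \\ in the literal is one backslash)
path = "C:\\temp\\notes.txt"

# Plain string: four backslashes in source become the two-character regex \\
parts_plain = re.split("\\\\", path)

# Raw string: the regex \\ can be written as it is
parts_raw = re.split(r"\\", path)

if __name__ == "__main__":
    print(parts_plain)
    print(parts_raw)
```

Both calls produce the same list, which is why raw strings are the usual choice for patterns.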

import re ret = re.match("[A-Z][a-z]*","Mm") print(ret.group()) ret = re.match("[A-Z][a-z]*","Aabcdef") print(ret.group())

import re ret = re.match("[a-zA-Z_]+[\w_]*","name1") print(ret.group()) ret = re.match("[a-zA-Z_]+[\w_]*","_name") print(ret.group()) ret = re.match("[a-zA-Z_]+[\w_]*","2_name") print(ret.group())

import re ret = re.match("[1-9]?[0-9]","7") print(ret.group()) ret = re.match("[1-9]?[0-9]","33") print(ret.group()) ret = re.match("[1-9]?[0-9]","09") print(ret.group())

import re ret = re.match("[a-zA-Z0-9_]{6}","12a3g45678") print(ret.group()) ret = re.match("[a-zA-Z0-9_]{8,20}","1ad12f23s34455ff66") print(ret.group())

import re # 正確的地址 ret = re.match("[\w]{4,20}@163\.com", "xiaoWang@163.com") print(ret.group()) # 不正確的地址 ret = re.match("[\w]{4,20}@163\.com", "xiaoWang@163.comheihei") print(ret.group()) # 經過$來肯定末尾 ret = re.match("[\w]{4,20}@163\.com$", "xiaoWang@163.comheihei") print(ret.group())

\b matches a word boundary; \B matches a position that is not a word boundary.
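
A short illustration of the two anchors, using an invented sample sentence. The patterns are raw strings, since in a plain Python string "\b" would mean a backspace character.

```python
import re

text = "cat concat category cat."

# "cat" as a whole word: boundary on both sides
whole = re.findall(r"\bcat\b", text)

# "cat" at the start of a longer word: boundary before, none after
prefix = re.findall(r"\bcat\B", text)

# "cat" at the end of a longer word: none before, boundary after
suffix = re.findall(r"\Bcat\b", text)

if __name__ == "__main__":
    print(len(whole), len(prefix), len(suffix))
```

The whole-word pattern finds the first and last "cat", the prefix pattern finds the one in "category", and the suffix pattern finds the one in "concat".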

import re ret = re.match("[1-9]?\d","8") print(ret.group()) ret = re.match("[1-9]?\d","78") print(ret.group()) # 添加| ret = re.match("[1-9]?\d$|100","8") print(ret.group()) ret = re.match("[1-9]?\d$|100","78") print(ret.group()) ret = re.match("[1-9]?\d$|100","100") print(ret.group())

import re ret = re.match("\w{4,20}@163\.com", "test@163.com") print(ret.group()) ret = re.match("\w{4,20}@(163|126|qq)\.com", "test@126.com") print(ret.group()) ret = re.match("\w{4,20}@(163|126|qq)\.com", "test@qq.com") print(ret.group())

import re # 可以完成對正確的字符串的匹配 ret = re.match("<[a-zA-Z]*>\w*</[a-zA-Z]*>", "<html>hh</html>") print(ret.group()) # 若是遇到非正常的html格式字符串，匹配出錯 ret = re.match("<[a-zA-Z]*>\w*</[a-zA-Z]*>", "<html>hh</htmlbalabala>") print(ret.group()) # 正確的理解思路：若是在第一對<>中是什麼，按理說在後面的那對<>中就應該是什麼 # 經過引用分組中匹配到的數據便可，可是要注意是元字符串，即相似 r""這種格式 ret = re.match(r"<([a-zA-Z]*)>\w*</\1>", "<html>hh</html>") print(ret.group()) # 由於2對<>中的數據不一致，因此沒有匹配出來 ret = re.match(r"<([a-zA-Z]*)>\w*</\1>", "<html>hh</htmlbalabala>") print(ret.group())

import re ret = re.match(r"<(\w*)><(\w*)>.*</\2></\1>", "<html><h1>www.itcast.cn</h1></html>") print(ret.group()) # 由於子標籤不一樣,致使出錯 ret = re.match(r"<(\w*)><(\w*)>.*</\2></\1>", "<html><h1>www.itcast.cn</h2></html>") print(ret.group())

import re ret = re.match(r"<(?P<name1>\w*)><(?P<name2>\w*)>.*</(?P=name2)></(?P=name1)>", "<html><h1>www.itcast.cn</h1></html>") print(ret.group()) ret = re.match(r"<(?P<name1>\w*)><(?P<name2>\w*)>.*</(?P=name2)></(?P=name1)>", "<html><h1>www.itcast.cn</h2></html>") print(ret.group())

import re ret = re.search(r"\d+", "閱讀次數爲 9999") print(ret.group())

import re ret = re.findall(r"\d+", "python = 9999, c = 7890, c++ = 12345") print(ret)

import re ret = re.sub(r"\d+", '998', "python = 997") print(ret)

import re def add(temp): strNum = temp.group() num = int(strNum) + 1 return str(num) # 替換的是 原數據 + 1 ret = re.sub(r"\d+", add, "python = 997") print(ret) ret = re.sub(r"\d+", add, "python = 99") print(ret)

import re ret = re.split(r":| ","info:xiaoZhang 33 shandong") print(ret)

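Quantifiers in Python are greedy by default: they match as many characters as possible. The non-greedy form (a ? placed after the quantifier) matches as few characters as possible.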
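
The difference between the greedy and the non-greedy form shows up as soon as a pattern can end in more than one place. The strings below are invented examples for this sketch.

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy: .* runs to the last </b>, so the whole string is one match
greedy = re.findall(r"<b>.*</b>", html)

# Non-greedy: .*? stops at the first </b>, so each tag pair is its own match
lazy = re.findall(r"<b>.*?</b>", html)

# The same contrast with digits
all_digits = re.match(r"\d+", "12345").group()
one_digit = re.match(r"\d+?", "12345").group()

if __name__ == "__main__":
    print(greedy)
    print(lazy)
    print(all_digits, one_digit)
```

The non-greedy form is the usual choice when extracting repeated, delimited pieces from one string.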

pandas鞏固

導包 import pandas as pd

設置輸出結果列對齊 pd.set_option('display.unicode.ambiguous_as_wide',True) pd.set_option('display.unicode.east_asian_width',True)

建立 從 0 開始的非負整數索引 s1 = pd.Series(range(1,20,5))

使用字典建立 Series 字典的鍵做爲索引 s2 = pd.Series({'語文':95,'數學':98,'Python':100,'物理':97,'化學':99})

修改 Series 對象的值 s1[3] = -17

查看 s1 的絕對值 abs(s1)

將 s1 全部的值都加 五、使用加法時，對全部元素都進行 s1 + 5

在 s1 的索引下標前加入參數值 s1.add_prefix(2)

s2 數據的直方圖 s2.hist()

每行索引後面加上 hany s2.add_suffix('hany')

查看 s2 中最大值的索引 s2.argmax()

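Check whether the values of s2 fall within a given interval, both ends included (recent pandas versions expect a string for inclusive) s2.between(90,100,inclusive = 'both')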

查看 s2 中 97 分以上的數據 s2[s2 > 97]

查看 s2 中大於中值的數據 s2[s2 > s2.median()]

s2 與數字之間的運算,開平方根 * 10 保留一位小數 round((s2**0.5)*10,1)

s2 的中值 s2.median()

s2 中最小的兩個數 s2.nsmallest(2)

s2 中最大的兩個數 s2.nlargest(2)

Series 對象之間的運算,對相同索引進行計算,不是相同索引的使用 NaN pd.Series(range(5)) + pd.Series(range(5,10))

對 Series 對象使用匿名函數 pd.Series(range(5)).pipe(lambda x,y,z :(x**y)%z,2,5)

pd.Series(range(5)).pipe(lambda x:x+3)

pd.Series(range(5)).pipe(lambda x:x+3).pipe(lambda x:x*3)

對 Series 對象使用匿名函數 pd.Series(range(5)).apply(lambda x:x+3)

查看標準差 pd.Series(range(0,5)).std()

查看無偏方差 pd.Series(range(0,5)).var()

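Standard error of the mean (sem is the sample standard deviation divided by the square root of the number of values) pd.Series(range(0,5)).sem()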

查看是否存在等價於 True 的值 any(pd.Series([3,0,True]))

查看是否全部的值都等價於 True all(pd.Series([3,0,True]))

建立一個 DataFrame 對象 dataframe = pd.DataFrame(np.random.randint(1,20,(5,3)), index = range(5), columns = ['A','B','C'])

索引爲時間序列 dataframe2 = pd.DataFrame(np.random.randint(5,15,(9,3)), index = pd.date_range(start = '202003211126', end = '202003212000', freq = 'H'), columns = ['Pandas','爬蟲','比賽'])

使用字典進行建立 dataframe3 = pd.DataFrame({'語文':[87,79,67,92], '數學':[93,89,80,77], '英語':[88,95,76,77]}, index = ['張三','李四','王五','趙六'])

建立時自動擴充 dataframe4 = pd.DataFrame({'A':range(5,10),'B':3})

Get the day of the week (dt.day_name() replaces dt.weekday_name, which was removed in pandas 1.0) dff['日期'] = pd.to_datetime(data['日期']).dt.day_name()

按照周幾進行分組，查看交易的平均值 dff = dff.groupby('日期').mean().apply(round) dff.index.name = '周幾'

對姓名和日期進行分組,並進行求和 dff = dataframe.groupby(by = ['姓名','日期'],as_index = False).sum()

將 dff 的索引，列 設置成透視表形式 dff = dff.pivot(index = '姓名',columns = '日期',values = '交易額')

查看前一天的數據 dff.iloc[:,:1]

交易總額小於 4000 的人的前三天業績 dff[dff.sum(axis = 1) < 4000].iloc[:,:3]

工資總額大於 2900 元的員工的姓名 dff[dff.sum(axis = 1) > 2900].index.values

顯示前兩天每一天的交易總額以及每一個人的交易金額 dataframe.pivot_table(values = '交易額',index = '姓名', columns = '日期',aggfunc = 'sum',margins = True).iloc[:,:2]

顯示每一個人在每一個櫃檯的交易總額 dff = dataframe.groupby(by = ['姓名','櫃檯'],as_index = False).sum()

dff.pivot(index = '姓名',columns = '櫃檯',values = '交易額')

查看每人天天的上班次數 dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'count',margins = True).iloc[:,:1]

查看每一個人天天購買的次數 dataframe.pivot_table(values = '交易額',index = '姓名',columns = '日期',aggfunc = 'count',margins = True)

每一個人天天上過幾回班 pd.crosstab(dataframe.姓名,dataframe.日期,margins = True).iloc[:,:2]

每一個人天天去過幾回櫃檯 pd.crosstab(dataframe.姓名,dataframe.櫃檯)

將每個人在每個櫃檯的交易總額顯示出來 pd.crosstab(dataframe.姓名,dataframe.櫃檯,dataframe.交易額,aggfunc='sum')

每一個人在每一個櫃檯交易額的平均值,金額/天數 pd.crosstab(dataframe.姓名,dataframe.櫃檯,dataframe.交易額,aggfunc = 'mean').apply(lambda num:round(num,2) )

對 5 的餘數進行分組 dataframe.groupby(by = lambda num:num % 5)['交易額'].sum()

查看索引爲 7 15 的交易額 dataframe.groupby(by = {7:'索引爲7的行',15:'索引爲15的行'})['交易額'].sum()

查看不一樣時段的交易總額 dataframe.groupby(by = '時段')['交易額'].sum()

各櫃檯的銷售總額 dataframe.groupby(by = '櫃檯')['交易額'].sum()

查看每一個人在每一個時段購買的次數 count = dataframe.groupby(by = '姓名')['時段'].count()

每一個人的交易額平均值並排序 dataframe.groupby(by = '姓名')['交易額'].mean().round(2).sort_values()

每一個人的交易額，apply(int) 轉換爲整數 dataframe.groupby(by = '姓名').sum()['交易額'].apply(int)

每個員工交易額的中值 data = dataframe.groupby(by = '姓名').median()

查看交易額對應的排名 data['排名'] = data['交易額'].rank(ascending = False)

data[['交易額','排名']]

每一個人不一樣時段的交易額 dataframe.groupby(by = ['姓名','時段'])['交易額'].sum()

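Label the per-person totals over all time periods (several columns are selected with a list, hence the double brackets) dataframe.groupby(by = ['姓名'])[['時段','交易額']].aggregate({'交易額':'sum','時段':lambda x:'各時段累計'})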

對指定列進行聚合,查看最大,最小,和,平均值,中值 dataframe.groupby(by = '姓名').agg(['max','min','sum','mean','median'])

查看部分聚合後的結果 dataframe.groupby(by = '姓名').agg(['max','min','sum','mean','median'])['交易額']

查看交易額低於 2000 的三條數據 dataframe[dataframe.交易額 < 2000][:3]

Raise every transaction amount below 1500 by 50% dataframe.loc[dataframe.交易額 < 1500,'交易額'] = dataframe[dataframe.交易額 < 1500]['交易額'].map(lambda num:num*1.5)

查看交易額大於 2500 的數據 dataframe[dataframe.交易額 > 2500]

查看交易額低於 900 或 高於 1800 的數據 dataframe[(dataframe.交易額 < 900)|(dataframe.交易額 > 1800)]

將全部低於 200 的交易額都替換成 200 dataframe.loc[dataframe.交易額 < 200,'交易額'] = 200

查看低於 1500 的交易額個數 dataframe.loc[dataframe.交易額 < 1500,'交易額'].count()

將大於 3000 元的都替換爲 3000 元 dataframe.loc[dataframe.交易額 > 3000,'交易額'] = 3000

查看有多少行數據 len(dataframe)

丟棄缺失值以後的行數 len(dataframe.dropna())

包含缺失值的行 dataframe[dataframe['交易額'].isnull()]

使用固定值替換缺失值 dff = copy.deepcopy(dataframe) dff.loc[dff.交易額.isnull(),'交易額'] = 999

使用交易額的均值替換缺失值 dff = copy.deepcopy(dataframe) for i in dff[dff.交易額.isnull()].index: dff.loc[i,'交易額'] = round(dff.loc[dff.姓名 == dff.loc[i,'姓名'],'交易額'].mean())

使用總體均值的 80% 填充缺失值 dataframe.fillna({'交易額':round(dataframe['交易額'].mean() * 0.8)},inplace = True)

查看重複值 dataframe[dataframe.duplicated()]

丟棄重複行 dataframe = dataframe.drop_duplicates()

查看員工業績波動狀況(每一天和昨天的數據做比較) dff = dataframe.groupby(by = '日期').sum()['交易額'].diff()

對數據使用 map 函數 dff.map(lambda num:'%.2f'%(num))[:5]

查看張三的波動狀況 dataframe[dataframe.姓名 == '張三'].groupby(by = '日期').sum()['交易額'].diff()

修改異常值 data.loc[data.交易額 > 3000,'交易額'] = 3000 data.loc[data.交易額 < 200,'交易額'] = 200

刪除重複值 data.drop_duplicates(inplace = True)

填充缺失值 data['交易額'].fillna(data['交易額'].mean(),inplace = True)

使用交叉表獲得每人在各櫃檯交易額的平均值 data_group = pd.crosstab(data.姓名,data.櫃檯,data.交易額,aggfunc = 'mean').apply(round)

繪製柱狀圖 data_group.plot(kind = 'bar')

使用 concat 鏈接兩個相同結構的 DataFrame 對象 df3 = pd.concat([df1,df2])

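Concatenate and discard the original index with ignore_index (DataFrame.append was removed in pandas 2.0, so pd.concat is used) df4 = pd.concat([df3,df1,df2],ignore_index = True)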

按照列進行拆分 df5 = df4.loc[:,['姓名','櫃檯','交易額']]

按照工號進行合併，隨機查看 3 條數據 rows = np.random.randint(0,len(df5),3) pd.merge(df4,df5).iloc[rows,:]

按照工號進行合併，指定其餘同名列的後綴 pd.merge(df1,df2,on = '工號',suffixes = ['_x','_y']).iloc[:,:]

兩個表都設置工號爲索引 set_index df2.set_index('工號').join(df3.set_index('工號'),lsuffix = '_x',rsuffix = '_y').iloc[:]

按照交易額和工號降序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'],ascending = False)[:5]

按照交易額和工號升序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'])[:5]

按照交易額降序和工號升序排序，查看五條數據 dataframe.sort_values(by = ['交易額','工號'],ascending = [False,True])[:5]

按工號升序排序 dataframe.sort_values(by = ['工號'])[:5]

按列名升序排序 dataframe.sort_index(axis = 1)[:5]

每隔五天--5D pd.date_range(start = '20200101',end = '20200131',freq = '5D')

每隔一週--W pd.date_range(start = '20200301',end = '20200331',freq = 'W')

間隔兩天,五個數據 pd.date_range(start = '20200301',periods = 5,freq = '2D')

間隔三小時，八個數據 pd.date_range(start = '20200301',periods = 8,freq = '3H')

三點開始，十二個數據，間隔一分鐘 pd.date_range(start = '202003010300',periods = 12,freq = 'T')

每月的最後一天 pd.date_range(start = '20190101',end = '20191231',freq = 'M')

間隔一年，六個數據，年底最後一天 pd.date_range(start = '20190101',periods = 6,freq = 'A')

One-year interval, six values, first day of each year pd.date_range(start = '20200101',periods = 6,freq = 'AS')

使用 Series 對象包含時間序列對象,使用特定索引 data = pd.Series(index = pd.date_range(start = '20200321',periods = 24,freq = 'H'),data = range(24))

Resample every three hours and take the mean data.resample('3H').mean()

Resample every five hours and take the sum data.resample('5H').sum()

計算OHLC open,high,low,close data.resample('5H').ohlc()

將日期替換爲次日 data.index = data.index + pd.Timedelta('1D')

查看指定日期的年份是不是閏年 pd.Timestamp('20200301').is_leap_year

查看指定日期所在的季度和月份 day = pd.Timestamp('20200321')

查看日期的季度 day.quarter

查看日期所在的月份 day.month

轉換爲 python 的日期時間對象 day.to_pydatetime()

查看全部的交易額信息 dataframe['交易額'].describe()

查看四分位數 dataframe['交易額'].quantile([0,0.25,0.5,0.75,1.0])

查看最大的交易額數據 dataframe.nlargest(2,'交易額')

查看最後一個日期 dataframe['日期'].max()

查看最小的工號 dataframe['工號'].min()

第一個最小交易額的行下標 index = dataframe['交易額'].idxmin()

第一個最小交易額 dataframe.loc[index,'交易額']

最大交易額的行下標 index = dataframe['交易額'].idxmax()

Skip rows 1, 2 and 4 and use the column at position 1 (the second column, 姓名) as the index dataframe2 = pd.read_excel('超市營業額.xlsx', skiprows = [1,2,4], index_col = 1)

查看 5 到 10 的數據 dataframe[5:11]

查看第六行的數據 dataframe.iloc[5]

查看第 1 3 4 行的數據 dataframe.iloc[[0,2,3],:]

查看第 1 3 4 行的第 1 2 列 dataframe.iloc[[0,2,3],[0,1]]

查看前五行指定，姓名、時段和交易額的數據 dataframe[['姓名','時段','交易額']][:5]

查看第 2 4 5 行 姓名，交易額 數據 loc 函數 dataframe.loc[[1,3,4],['姓名','交易額']]

查看第四行的姓名數據 dataframe.at[3,'姓名']

某一時段的交易總和 dataframe[dataframe['時段'] == '14:00-21:00']['交易額'].sum()

查看張三總共的交易額 dataframe[dataframe['姓名'].isin(['張三'])]['交易額'].sum()

查看日用品的銷售總額 dataframe[dataframe['櫃檯'] == '日用品']['交易額'].sum()

查看交易額在 1500~3000 之間的記錄 dataframe[dataframe['交易額'].between(1500,3000)]

將日期設置爲 python 中的日期類型 data.日期 = pd.to_datetime(data.日期)

每七天營業的總額 data.resample('7D',on = '日期').sum()['交易額']

每七天營業總額 data.resample('7D',on = '日期',label = 'right').sum()['交易額']

每七天營業額的平均值 func = lambda item:round(np.sum(item)/len(item),2) data.resample('7D',on = '日期',label = 'right').apply(func)['交易額']

每七天營業額的平均值 func = lambda num:round(num,2) data.resample('7D',on = '日期',label = 'right').mean().apply(func)['交易額']

刪除工號這一列 data.drop('工號',axis = 1,inplace = True)

按照姓名和櫃檯進行分組彙總 data = data.groupby(by = ['姓名','櫃檯']).sum()

查看張三的彙總數據 data.loc['張三',:]

查看張三在蔬菜水果的交易數據 data.loc['張三','蔬菜水果']

丟棄工號列 data.drop('工號',axis = 1,inplace = True)

按照櫃檯進行排序 dff = data.sort_index(level = '櫃檯',axis = 0)

按照姓名進行排序 dff = data.sort_index(level = '姓名',axis = 0)

按照櫃檯進行分組求和 dff = data.groupby(level = '櫃檯').sum()['交易額']

平均值 data.mean()

標準差 data.std()

協方差 data.cov()

刪除缺失值和重複值,inplace = True 直接丟棄 data.dropna(inplace = True) data.drop_duplicates(inplace = True)

numpy鞏固

導包 import numpy as np

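Create a 2x3 matrix x = np.matrix([[1,2,3],[4,5,6]])

Create a row matrix (np.matrix is always two-dimensional, so this has shape (1, 6)) y = np.matrix([1,2,3,4,5,6])

Element in the second row, second column of x x[1,1]

Matrix multiplication (the inner dimensions must agree, so y is reshaped to 3x2 first) x * y.reshape(3,2)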

# 相關係數矩陣,可以使用在列表元素數組矩陣 # 負相關 np.corrcoef([1,2,3],[8,5,4]) ''' array([[ 1. , -0.96076892], [-0.96076892, 1. ]]) ''' # 正相關 np.corrcoef([1,2,3],[4,5,7]) ''' array([[1. , 0.98198051], [0.98198051, 1. ]]) '''

矩陣的方差 np.cov([1,1,1,1,1])

矩陣的標準差 np.std([1,1,1,1,1])

垂直堆疊矩陣 z = np.vstack((x,y))

矩陣的協方差 np.cov(z)

np.cov(x,y)

標準差 np.std(z)

列向標準差 np.std(z,axis = 1)

方差 np.cov(x)

特徵值和特徵向量 A = np.array([[1,-3,3],[3,-5,3],[6,-6,4]]) e,v = np.linalg.eig(A) e 爲特徵值, v 爲特徵向量

矩陣與特徵向量的乘積 np.dot(A,v)

特徵值與特徵向量的乘積 e * v

驗證兩個乘積是否相等 np.isclose(np.dot(A,v),(e * v))

The determinant |A - λE| should be 0 for each eigenvalue λ taken separately [np.linalg.det(A - lam * np.eye(3)) for lam in e]
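The eigenvalue checks above can be collected into one short script. This is a minimal sketch using the same matrix A; the variable names eig_ok, dets and det_ok are introduced here for illustration, and the tolerance is an assumption chosen to absorb floating-point error.

```python
import numpy as np

A = np.array([[1, -3, 3], [3, -5, 3], [6, -6, 4]])
e, v = np.linalg.eig(A)

# Column j of v is the eigenvector belonging to e[j], so A @ v equals v * e
# (broadcasting multiplies column j of v by e[j]).
eig_ok = np.allclose(A @ v, v * e)

# det(A - lambda * I) must vanish for each eigenvalue taken one at a time.
dets = [np.linalg.det(A - lam * np.eye(3)) for lam in e]
det_ok = np.allclose(dets, 0, atol=1e-6)

print(eig_ok, det_ok)
```

Subtracting np.eye(3) * e in a single step would instead subtract a diagonal matrix holding all three eigenvalues at once, which is a different matrix from A - λE.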

逆矩陣 y = np.linalg.inv(x)

矩陣的乘法(注意前後順序) x * y ''' matrix([[ 1.00000000e+00, 5.55111512e-17, 1.38777878e-17], [ 5.55111512e-17, 1.00000000e+00, 2.77555756e-17], [ 1.77635684e-15, -8.88178420e-16, 1.00000000e+00]]) ''' y * x ''' matrix([[ 1.00000000e+00, -1.11022302e-16, 0.00000000e+00], [ 8.32667268e-17, 1.00000000e+00, 2.22044605e-16], [ 6.93889390e-17, 0.00000000e+00, 1.00000000e+00]]) '''

求解線性方程組 a = np.array([[3,1],[1,2]]) b = np.array([9,8]) x = np.linalg.solve(a,b)

最小二乘解：返回解，餘項，a 的秩，a 的奇異值 np.linalg.lstsq(a,b) # (array([2., 3.]), array([], dtype=float64), 2, array([3.61803399, 1.38196601]))

計算向量和矩陣的範數 x = np.matrix([[1,2],[3,-4]]) np.linalg.norm(x) # 5.477225575051661 np.linalg.norm(x,-2) # 1.9543950758485487 np.linalg.norm(x,-1) # 4.0 np.linalg.norm(x,1) # 6.0 np.linalg.norm([1,2,0,3,4,0],0) # 4.0 np.linalg.norm([1,2,0,3,4,0],2) # 5.477225575051661

奇異值分解 a = np.matrix([[1,2,3],[4,5,6],[7,8,9]]) u,s,v = np.linalg.svd(a) u ''' matrix([[-0.21483724, 0.88723069, 0.40824829], [-0.52058739, 0.24964395, -0.81649658], [-0.82633754, -0.38794278, 0.40824829]]) ''' s ''' array([1.68481034e+01, 1.06836951e+00, 4.41842475e-16]) ''' v ''' matrix([[-0.47967118, -0.57236779, -0.66506441], [-0.77669099, -0.07568647, 0.62531805], [-0.40824829, 0.81649658, -0.40824829]]) ''' # 驗證 u * np.diag(s) * v ''' matrix([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]) '''

實現矩陣的轉置 x.T

元素平均值 x.mean()

縱向平均值 x.mean(axis = 0)

橫向平均值 x.mean(axis = 1)

全部元素之和 x.sum()

橫向最大值 x.max(axis = 1)

橫向最大值的索引下標 x.argmax(axis = 1)

對角線元素 x.diagonal()

非零元素下標 x.nonzero()

建立數組 np.array([1,2,3,4])

np.array((1,2,3,4))

np.array(range(4)) # 不包含終止數字 # array([0, 1, 2, 3])

# 使用 arange(初始位置=0,末尾,步長=1) np.arange(1,8,2) # array([1, 3, 5, 7])

生成等差數組,endpoint 爲 True 則包含末尾數字 np.linspace(1,3,4,endpoint=False) # array([1. , 1.5, 2. , 2.5]) np.linspace(1,3,4,endpoint=True) # array([1. , 1.66666667, 2.33333333, 3. ])

建立全爲零的一維數組 np.zeros(3)

建立全爲一的一維數組 np.ones(4)

np.linspace(1,3,4) # array([1. , 1.66666667, 2.33333333, 3. ])

np.logspace(start, stop, num, base = 10) builds a logarithmically spaced array np.logspace(1,3,4) # equivalent to 10 ** np.linspace(1,3,4) # array([ 10. , 46.41588834, 215.443469 , 1000. ]) np.logspace(1,3,4,base = 2) # equivalent to 2 ** np.linspace(1,3,4) # array([2. , 3.1748021, 5.0396842, 8. ])

建立二維數組(列表嵌套列表) np.array([[1,2,3],[4,5,6]])

# 建立全爲零的二維數組 # 兩行兩列 np.zeros((2,2))

三行兩列 np.zeros((3,2))

# 建立一個單位數組 np.identity(3) ''' array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]) '''

建立一個對角矩陣，(參數爲對角線上的數字) np.diag((1,2,3)) ''' array([[1, 0, 0], [0, 2, 0], [0, 0, 3]]) '''

第一行元素 n[0]

第一行第三列元素 n[0,2]

第一行和第二行的元素 n[[0,1]]

第一行第三列，第三行第二列，第二行第一列 n[[0,2,1],[2,1,0]]

將數組倒序 a[::-1]

步長爲 2 a[::2]

從 0 到 4 的元素 a[:5]

變換 c 的矩陣行和列 c = np.arange(16) # array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) c.shape = 4,4 ''' array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) '''

第一行，第三個元素到第五個元素(若是沒有則輸出到末尾截止) c[0,2:5]

第二行元素 c[1]

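Rows 3 to 5 and columns 3 to 5 (the slice simply stops at the edge of the array, so on a 4x4 array this returns rows 3 to 4 and columns 3 to 4) c[2:5,2:5]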

第二行第三列元素和第三行第四列元素 c[[1,2],[2,3]]

第一行和第三行的第二列到第三列的元素 c[[0,2],1:3]

第一列和第三列的全部橫行元素 c[:,[0,2]]

第三列全部元素 c[:,2]

第二行和第四行的全部元素 c[[1,3]]

第一行的第二列，第四列元素，第四行的第二列，第四列元素 c[[0,3]][:,[1,3]]

使用 * 進行相乘 x*2

使用 / 進行相除 x / 2

2 / x

使用 // 進行整除 x//2

10//x

使用 ** 進行冪運算 x**3

2 ** x

使用 + 進行相加 x + 2

使用 % 進行取模 x % 3

使用 + 進行相加 np.array([1,2,3,4]) + np.array([11,22,33,44]) np.array([1,2,3,4]) + np.array([3]) # array([4, 5, 6, 7])

數組的內積運算(對應位置上元素相乘) np.dot(x,y)

sum(x*y)

將數組中大於 0.5 的元素顯示 n[n>0.5]

找到數組中 0.05 ~ 0.4 的元素總數 sum((n > 0.05)&(n < 0.4))

是否都大於 0.2 np.all(n > 0.2)

是否有元素小於 0.1 np.any(n < 0.1)

在 a 中是否有大於 b 的元素 a > b # array([False, True, False]) # 在 a 中是否有等於 b 的元素 a == b # array([False, False, True]) # 顯示 a 中 a 的元素等於 b 的元素 a[a == b] # array([7])

顯示 a 中的偶數且小於 5 的元素 a[(a%2 == 0) & (a < 5)]

生成一個隨機數組 np.random.randint(0,6,3)

生成一個隨機數組(二維數組) np.random.randint(0,6,(3,3))

生成十個隨機數在[0,1)之間 np.random.rand(10) ''' array([0.9283789 , 0.43515554, 0.27117021, 0.94829333, 0.31733981, 0.42314939, 0.81838647, 0.39091899, 0.33571004, 0.90240897]) '''

從標準正態分佈中隨機抽選出3個數 np.random.standard_normal(3)

返回三頁四行兩列的標準正態分佈數 np.random.standard_normal((3,4,2))

x = np.arange(8) 在數組尾部追加一個元素 np.append(x,10) 在數組尾部追加多個元素 np.append(x,[15,16,17])

使用 數組下標修改元素的值 x[0] = 99

在指定位置插入數據 np.insert(x,0,54)

建立一個多維數組 x = np.array([[1,2,3],[11,22,33],[111,222,333]]) 修改第 0 行第 2 列的元素值 x[0,2] = 9

行數大於等於 1 的，列數大於等於 1 的置爲 1 x[1:,1:] = 1

# 同時修改多個元素值 x[1:,1:] = [7,8] ''' array([[ 1, 2, 9], [ 11, 7, 8], [111, 7, 8]]) ''' x[1:,1:] = [[7,8],[9,10]] ''' array([[ 1, 2, 9], [ 11, 7, 8], [111, 9, 10]]) '''

查看數組的大小 n.size

將數組分爲兩行五列 n.shape = 2,5

顯示數組的維度 n.shape

設置數組的維度，-1 表示自動計算 n.shape = 5,-1

將新數組設置爲調用數組的兩行五列並返回 x = n.reshape(2,5)

x = np.arange(5) # 將數組設置爲兩行，沒有數的設置爲 0 x.resize((2,10)) ''' array([[0, 1, 2, 3, 4, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]) ''' # 將 x 數組的兩行五列形式顯示，不改變 x 的值 np.resize(x,(2,5)) ''' array([[0, 1, 2, 3, 4], [0, 0, 0, 0, 0]]) '''

x = np.array([1,4,5,2]) # array([1, 4, 5, 2]) # 返回排序後元素的原下標 np.argsort(x) # array([0, 3, 1, 2], dtype=int64)

輸出最大值的下標 x.argmax( )

輸出最小值的下標 x.argmin( )

對數組進行排序 x.sort( )

每一個數組元素對應的正弦值 np.sin(x)

每一個數組元素對應的餘弦值 np.cos(x)

對參數進行四捨五入 np.round(np.cos(x))

對參數進行上入整數 3.3->4 np.ceil(x/3)

# 分段函數 x = np.random.randint(0,10,size=(1,10)) # array([[0, 3, 6, 7, 9, 4, 9, 8, 1, 8]]) # 大於 4 的置爲 0 np.where(x > 4,0,1) # array([[1, 1, 0, 0, 0, 1, 0, 0, 1, 0]]) # 小於 4 的乘 2 ，大於 7 的乘3 np.piecewise(x,[x<4,x>7],[lambda x:x*2,lambda x:x*3]) # array([[ 0, 6, 0, 0, 27, 0, 27, 24, 2, 24]])
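To make the piecewise example reproducible, the sketch below fixes the input to the values shown in the comment above instead of drawing them at random. The names flags and scaled are introduced here for illustration.

```python
import numpy as np

# Fixed input matching the sample values printed above.
x = np.array([0, 3, 6, 7, 9, 4, 9, 8, 1, 8])

# np.where picks 0 where the condition holds and 1 elsewhere.
flags = np.where(x > 4, 0, 1)

# np.piecewise applies one function per condition; elements that match
# no condition are left at the default value 0.
scaled = np.piecewise(x, [x < 4, x > 7], [lambda v: v * 2, lambda v: v * 3])

print(flags)
print(scaled)
```

Elements between 4 and 7 match neither condition, which is why they come back as 0 in the second result.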

數據庫 mysql-connector 基礎

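Install the driver python -m pip install mysql-connector-python (this is the package currently published by Oracle; the older mysql-connector package name on PyPI is no longer maintained)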

導包 import mysql.connector

mydb = mysql.connector.connect( host="localhost", # 數據庫主機地址 user="root", # 數據庫用戶名 passwd="root" # 數據庫密碼 )

建立遊標 mycursor = mydb.cursor()

使用 mycursor.execute("sql 語句") 進行運行 mycursor.execute("CREATE DATABASE runoob_db")

指定數據庫名爲 runoob_db mydb = mysql.connector.connect( host="localhost", user="root", passwd="123456", database="runoob_db" )

建立數據表 mycursor.execute("CREATE TABLE sites (name VARCHAR(255), url VARCHAR(255))")

查看當前數據表有哪些 mycursor.execute("SHOW TABLES")

使用 "INT AUTO_INCREMENT PRIMARY KEY" 語句 建立一個主鍵，主鍵起始值爲 1，逐步遞增 mycursor.execute("ALTER TABLE sites ADD COLUMN id INT AUTO_INCREMENT PRIMARY KEY")

建立表時,添加主鍵 mycursor.execute("CREATE TABLE sites (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(255), url VARCHAR(255))")

插入數據 sql = "INSERT INTO sites (name, url) VALUES (%s, %s)" val = ("RUNOOB", "https://www.runoob.com") mycursor.execute(sql, val) mydb.commit() # 數據表內容有更新，必須使用到該語句

Number of rows affected by the last statement mycursor.rowcount

插入多條語句 sql = "INSERT INTO sites (name, url) VALUES (%s, %s)" val = [ ('Google', 'https://www.google.com'), ('Github', 'https://www.github.com'), ('Taobao', 'https://www.taobao.com'), ('stackoverflow', 'https://www.stackoverflow.com/') ] mycursor.executemany(sql, val) mydb.commit() # 數據表內容有更新，必須使用到該語句

在數據插入後,獲取該條記錄的 ID mycursor.lastrowid

使用 fetchall() 獲取全部記錄 mycursor.execute("SELECT * FROM sites") myresult = mycursor.fetchall() for x in myresult: print(x)

選取指定數據進行查找 mycursor.execute("SELECT name, url FROM sites") myresult = mycursor.fetchall() for x in myresult: print(x)

使用 .fetchone() 獲取一條數據 mycursor.execute("SELECT * FROM sites") myresult = mycursor.fetchone() print(myresult)

使用 where 語句 sql = "SELECT * FROM sites WHERE name ='RUNOOB'" mycursor.execute(sql) myresult = mycursor.fetchall()

使用 fetchall 以後,須要使用循環進行輸出 for x in myresult: print(x)

使用 通配符 % sql = "SELECT * FROM sites WHERE url LIKE '%oo%'"

使用 %s 防止發生 SQL 注入攻擊 sql = "SELECT * FROM sites WHERE name = %s" na = ("RUNOOB", ) mycursor.execute(sql, na)
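Why placeholders block injection can be shown without a MySQL server. The sketch below swaps in the standard-library sqlite3 module as a stand-in, so it is not the mysql-connector API: sqlite3 uses ? as its placeholder where mysql-connector uses %s, and the table layout and the hostile input are assumptions made for the demonstration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server used above.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute("CREATE TABLE sites (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, url TEXT)")
cur.executemany(
    "INSERT INTO sites (name, url) VALUES (?, ?)",
    [('RUNOOB', 'https://www.runoob.com'), ('Google', 'https://www.google.com')],
)
conn.commit()

# A hostile value that would match every row if pasted into the SQL text.
hostile = "RUNOOB' OR '1'='1"

# Passed as a parameter, the value is compared as plain data, never parsed as SQL.
cur.execute("SELECT name FROM sites WHERE name = ?", (hostile,))
injected_rows = cur.fetchall()

cur.execute("SELECT name FROM sites WHERE name = ?", ("RUNOOB",))
safe_rows = cur.fetchall()

print(injected_rows, safe_rows)
conn.close()
```

The same pattern carries over to mycursor.execute(sql, params) once ? is replaced by %s.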

排序 使用 ORDER BY 語句,默認升序,關鍵字爲 ASC 若是要設置降序排序，能夠設置關鍵字 DESC

sql = "SELECT * FROM sites ORDER BY name" mycursor.execute(sql)

降序 DESC sql = "SELECT * FROM sites ORDER BY name DESC" mycursor.execute(sql)

使用 limit 設置查詢的數據量 mycursor.execute("SELECT * FROM sites LIMIT 3")

limit 指定起始位置 使用 offset mycursor.execute("SELECT * FROM sites LIMIT 3 OFFSET 1") # 0 爲 第一條，1 爲第二條，以此類推 myresult = mycursor.fetchall()

刪除記錄 delete from sql = "DELETE FROM sites WHERE name = 'stackoverflow'" mycursor.execute(sql)

sql = "DELETE FROM sites WHERE name = %s" na = ("stackoverflow", ) mycursor.execute(sql, na)

更新表中數據 update sql = "UPDATE sites SET name = 'ZH' WHERE name = 'Zhihu'" mycursor.execute(sql)

sql = "UPDATE sites SET name = %s WHERE name = %s" val = ("Zhihu", "ZH") mycursor.execute(sql, val)

刪除表 drop table 能夠先使用 if exists 判斷是否存在 sql = "DROP TABLE IF EXISTS sites" # 刪除數據表 sites mycursor.execute(sql)

爬蟲流程(前面發過的文章的合集)鞏固

1.打開網頁 urllib.request.urlopen('網址')
 例：response = urllib.request.urlopen('http://www.baidu.com/') 返回值爲 <http.client.HTTPResponse object at 0x00000224EC2C9490>

2.獲取響應頭信息 urlopen 對象.getheaders() 例：response.getheaders() 返回值爲 [('Bdpagetype', '1'), ('Bdqid', '0x8fa65bba0000ba44'),···,('Transfer-Encoding', 'chunked')] [('頭','信息')]

3.獲取響應頭信息,帶參數表示指定響應頭 urlopen 對象.getheader('頭信息') 例：response.getheader('Content-Type') 返回值爲 'text/html;charset=utf-8'

4.查看狀態碼 urlopen 對象.status 例：response.status 返回值爲 200 則表示成功

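5. Read the binary data, then decode it as utf-8. Binary data, example: html = response.read() Decoded HTML text, example: html = response.read().decode('utf-8') When printing the binary form, decode it with decode('charset') first: print(html.decode('utf-8')) (this applies only while html is still bytes; a str that has already been decoded has no decode method)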

6.存儲 HTML 數據 fp = open('文件名.html','模式 wb') 例：fp = open('baidu.html', 'wb') fp.write(response.read() 對象) 例：fp.write(html)

7.關閉文件 open對象.close() 例：fp.close()

8. Use ssl to fetch https pages. Example: import ssl context = ssl._create_unverified_context() headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'} request = urllib.request.Request('https://www.baidu.com/', headers = headers) response = urllib.request.urlopen(request, context = context) The response here behaves the same as the one above.

9.獲取碼 response.getcode() 返回值爲 200

10.獲取爬取的網頁 url response.geturl() 返回值爲 https://www.baidu.com/

11.獲取響應的報頭信息 response.info()

12.保存網頁 urllib.request.urlretrieve(url, '文件名.html') 例：urllib.request.urlretrieve(url, 'baidu.html')

13.保存圖片 urllib.request.urlretrieve(url, '圖片名.jpg') 例：urllib.request.urlretrieve(url, 'Dog.jpg')

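Characters that are not URL-safe (such as Chinese characters) must be percent-encoded. 14. urllib.parse.quote() encodes everything except ASCII letters, digits and the characters - . _ / Example: Param = "全文检索:*" urllib.parse.quote(Param) returns '%E5%85%A8%E6%96%87%E6%A3%80%E7%B4%A2%3A%2A'

15. quote_plus also encodes the slash / (as %2F) and turns spaces into + urllib.parse.quote_plus(Param)

16. Join a dict into a query string, URL-encoding any Chinese characters dic_object = { 'user_name':'张三', 'user_passwd':'123456' } urllib.parse.urlencode(dic_object) returns 'user_name=%E5%BC%A0%E4%B8%89&user_passwd=123456'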
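The difference between the three encoding helpers is clearest side by side. A minimal sketch; the sample string 'a b/c' and the pn key are assumptions chosen for illustration, while the wd value reuses the 你好 example from this list.

```python
from urllib.parse import quote, quote_plus, urlencode

text = 'a b/c'

# quote leaves '/' alone and writes a space as %20.
quoted = quote(text)

# quote_plus encodes '/' as %2F and writes a space as '+'.
quoted_plus = quote_plus(text)

# urlencode joins key=value pairs with '&' and encodes each value.
query = urlencode({'wd': '你好', 'pn': 50})

print(quoted)
print(quoted_plus)
print(query)
```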

17.獲取 response 的行 url = 'http://www.baidu.com' response = urllib.request.urlopen(url) response.readline()

18.隨機獲取請求頭(隨機包含請求頭信息的列表) user_agent = [ "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1", "Mozilla/5.0 (Windows NT 6.1; rv2.0.1) Gecko/20100101 Firefox/4.0.1", "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11", "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11" ] ua = random.choice(user_agent) headers = {'User-Agent':ua}

19.對輸入的漢字進行 urlencode 編碼 urllib.parse.urlencode(字典對象) 例： chinese = input('請輸入要查詢的中文詞語:') wd = {'wd':chinese} wd = urllib.parse.urlencode(wd) 返回值爲 'wd=%E4%BD%A0%E5%A5%BD'

20.常見分頁操做 for page in range(start_page, end_page + 1): pn = (page - 1) * 50

21.一般會進行拼接字符串造成網址 例：fullurl = url + '&pn=' + str(pn)

22.進行拼接造成要保存的文件名 例：filename = 'tieba/' + name + '貼吧_第' + str(page) + '頁.html'
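Steps 20 to 22 can be combined into one helper. This is only a sketch: build_page_urls, the example.com address and the kw parameter name are hypothetical, and the page size of 50 follows the pn formula in step 20.

```python
from urllib.parse import urlencode


def build_page_urls(base_url, keyword, start_page, end_page, page_size=50):
    """Return one (url, filename) pair per page in the inclusive range."""
    pages = []
    for page in range(start_page, end_page + 1):
        # Page 1 starts at offset 0, page 2 at 50, and so on.
        pn = (page - 1) * page_size
        fullurl = base_url + '?' + urlencode({'kw': keyword, 'pn': pn})
        filename = 'tieba/' + keyword + '_page_' + str(page) + '.html'
        pages.append((fullurl, filename))
    return pages


pages = build_page_urls('http://example.com/f', 'python', 1, 3)
for fullurl, filename in pages:
    print(fullurl, filename)
```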

23. Save the file with open(filename,'wb') as f: f.write(response.read())

24. Request headers that can be removed: cookie, accept-encoding, accept-language, content-length, connection, origin, host

25.headers 頭信息不能夠刪除的有 Accept、X-Requested-With、User-Agent、Content-Type、Referer

26.提交給網頁的數據 formdata formdata = { 'from':'en', 'to':'zh', 'query':word, 'transtype':'enter', 'simple_means_flag':'3' }

27.將formdata進行urlencode編碼,而且轉化爲bytes類型 formdata = urllib.parse.urlencode(formdata).encode('utf-8')

28.使用 formdata 在 urlopen() 中 response = urllib.request.urlopen(request, data=formdata)

29.轉換爲正確數據(導包 json) read -> decode -> loads -> json.dumps 經過read讀取過來爲字節碼 data = response.read() 將字節碼解碼爲utf8的字符串 data = data.decode('utf-8') 將json格式的字符串轉化爲json對象 obj = json.loads(data) 禁用ascii以後，將json對象轉化爲json格式字符串 html = json.dumps(obj, ensure_ascii=False) json 對象經過 str轉換後 使用 utf-8 字符集格式寫入 保存和以前的方法相同 with open('json.txt', 'w', encoding='utf-8') as f: f.write(html)
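The read -> decode -> loads -> dumps chain can be tried offline. In the sketch below a hard-coded bytes value stands in for response.read(); that value and the variable names are assumptions made for illustration.

```python
import json

# Stand-in for response.read(): the body arrives as UTF-8 encoded bytes.
raw = '{"name": "張三", "age": 55}'.encode('utf-8')

text = raw.decode('utf-8')   # bytes -> str
obj = json.loads(text)       # JSON text -> Python dict

escaped = json.dumps(obj)                       # non-ASCII becomes \uXXXX escapes
readable = json.dumps(obj, ensure_ascii=False)  # non-ASCII is kept as-is

print(escaped)
print(readable)
```

Both strings load back into the same dict; ensure_ascii=False only changes how the text looks in the saved file.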

30.ajax請求自帶的頭部 'X-Requested-With':'XMLHttpRequest'

31.豆瓣默認都得使用https來進行抓取，因此須要使用ssl模塊忽略證書 例： url = 'http://movie.douban.com/j/chart/top_list?type=24&interval_id=100%3A90&action=' page = int(input('請輸入要獲取頁碼:')) start = (page - 1) * 20 limit = 20 key = { 'start':start, 'limit':limit } key = urllib.parse.urlencode(key) url = url + '&' + key headers = { 'X-Requested-With':'XMLHttpRequest', 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36' } request = urllib.request.Request(url, headers=headers) # context = ssl._create_unverified_context() response = urllib.request.urlopen(request) jsonret = response.read() with open('douban.txt', 'w', encoding='utf-8') as f: f.write(jsonret.decode('utf-8')) print('over')

32.建立處理 http 請求的對象 http_handler = urllib.request.HTTPHandler()

33.處理 https 請求 https_handler = urllib.request.HTTPSHandler()

34.建立支持http請求的opener對象 opener = urllib.request.build_opener(http_handler)

35. Create the response object with opener.open(Request object). Example: request = urllib.request.Request('http://www.baidu.com/') response = opener.open(request) Save it: with open('filename.html', 'w', encoding='utf-8') as f: f.write(response.read().decode('utf-8'))

36.代理服務器 http_proxy_handler = urllib.request.ProxyHandler({'https':'ip地址:端口號'}) 例：http_proxy_handler = urllib.request.ProxyHandler({'https':'121.43.178.58:3128'})

37.私密代理服務器(下面的只是一個例子,不必定正確) authproxy_handler = urllib.request.ProxyHandler({"http" : "user:password@ip:port"})

38.不使用任何代理 http_proxy_handler = urllib.request.ProxyHandler({})

39.使用了代理以後的 opener 寫法 opener = urllib.request.build_opener(http_proxy_handler)

40.response 寫法 response = opener.open(request)

41.若是訪問一個不存在的網址會報錯 urllib.error.URLError

42.HTTPError（是URLError的子類） 例： try: urllib.request.urlopen(url) except urllib.error.HTTPError as e: print(e.code) print(e.reason) except urllib.error.URLError as e: print(e)
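Because HTTPError is a subclass of URLError, the order of the except clauses matters: the more specific one has to come first. The helper below is a sketch under that assumption; the name fetch and its return convention are made up for illustration.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=5):
    """Return (status, body) on success, or (code_or_None, reason) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        return e.code, e.reason
    except urllib.error.URLError as e:
        # No usable answer at all: bad scheme, DNS failure, refused connection.
        return None, e.reason


# HTTPError would also be caught by a plain "except URLError" clause.
print(issubclass(urllib.error.HTTPError, urllib.error.URLError))
```

Swapping the two except clauses would send every HTTP error into the URLError branch, where e.code is not guaranteed to exist.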

43.使用 CookieJar 建立一個 cookie 對象,保存 cookie 值 import http.cookiejar cookie = http.cookiejar.CookieJar( )

44.經過HTTPCookieProcessor構建一個處理器對象，用來處理cookie cookie_handler = urllib.request.HTTPCookieProcessor(cookie) opener 的寫法 opener = urllib.request.build_opener(cookie_handler)

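45. Use a raw string r'...' to switch off escape processing. In an ordinary string literal a backslash starts an escape sequence; r'\d' keeps the backslash literally, so the regex engine receives \d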

46.設置 正則模式 pattern = re.compile(r'規則', re.xxx ) pattern = re.compile(r'i\s(.*?),') 例：pattern = re.compile(r'LOVE', re.I)

使用 pattern 進行調用匹配 47.match 只匹配開頭字符 pattern.match('字符串'[,起始位置,結束位置]) 例：m = pattern.match('i love you', 2, 6) 返回值爲 <re.Match object; span=(2, 6), match='love'>

48. search 從開始匹配到結尾,返回第一個匹配到的 pattern.search('字符串') 例：m = pattern.search('i love you, do you love me, yes, i love') 返回值爲 <re.Match object; span=(2, 6), match='love'>

49.findall 將匹配到的都放到列表中 pattern.findall('字符串') 例：m = pattern.findall('i love you, do you love me, yes, i love') 返回值爲 ['love', 'love', 'love']

50.split 使用匹配到的字符串對原來的數據進行切割 pattern.split('字符串',次數) 例：m = pattern.split('i love you, do you love me, yes, i love me', 1) 返回值爲 ['i ', ' you, do you love me, yes, i love me'] 例：m = pattern.split('i love you, do you love me, yes, i love me', 2) 返回值爲 ['i ', ' you, do you ', ' me, yes, i love me'] 例：m = pattern.split('i love you, do you love me, yes, i love me', 3) 返回值爲 ['i ', ' you, do you ', ' me, yes, i ', ' me']

51.sub 使用新字符串替換匹配到的字符串的值,默認所有替換 pattern.sub('新字符串','要匹配字符串'[,次數]) 注：返回的是字符串 例： string = 'i love you, do you love me, yes, i love me' m = pattern.sub('hate', string, 1) m 值爲 'i hate you, do you love me, yes, i love me'

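52. group returns matched groups. m.group() returns the whole matched text; m.group(1) returns the text matched by the first capture group (the first pair of parentheses). Example: string = 'i love you, do you love me, yes, i love me' pattern = re.compile(r'i\s(.*?),') m = pattern.match(string) m.group() returns 'i love you,' m.group(1) returns 'love you'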
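The pattern methods from steps 47 to 52 can be run together on the sample sentence used above. A minimal sketch; the variable names are introduced here for illustration.

```python
import re

text = 'i love you, do you love me, yes, i love me'
pattern = re.compile(r'LOVE', re.I)  # re.I makes the match case-insensitive

at_start = pattern.match(text)         # match only tries position 0
windowed = pattern.match(text, 2, 6)   # match inside the slice [2, 6)
first = pattern.search(text)           # first hit anywhere in the string
all_hits = pattern.findall(text)
split_once = pattern.split(text, 1)
replaced_once = pattern.sub('hate', text, 1)

# A capture group: group() is the whole match, group(1) the first parentheses.
m = re.compile(r'i\s(.*?),').match(text)

print(at_start, windowed.span(), first.span())
print(all_hits, split_once, replaced_once)
print(m.group(), m.group(1))
```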

53.匹配標籤 pattern = re.compile(r'<div class="thumb">(.*?)<img src=(.*?) alt=(.*?)>(.*?)</div>', re.S)

54.分離出文件名和擴展名,返回二元組 os.path.splitext(參數) 例： 獲取路徑 image_path = './qiushi' 獲取後綴名 extension = os.path.splitext(image_url)[-1]

55.合併多個字符串 os.path.join() 圖片路徑 image_path = os.path.join(image_path, image_name + extension) 保存文件 urllib.request.urlretrieve(image_url, image_path)
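Steps 54 and 55 fit together as follows. The image URL and the image name in this sketch are hypothetical placeholders; only the ./qiushi directory comes from the text above.

```python
import os

image_url = 'http://example.com/pics/cat.jpg'  # hypothetical image address
image_dir = './qiushi'
image_name = 'cat_001'                          # hypothetical name taken from the page

# splitext returns (root, extension); the extension keeps its leading dot.
extension = os.path.splitext(image_url)[-1]

# join inserts the separator that is correct for the current platform.
image_path = os.path.join(image_dir, image_name + extension)

print(extension)
print(image_path)
```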

56.獲取 a 標籤下的 href 的內容 pattern = re.compile(r'<a href="(.*?)" class="main_14" target="_blank">(.*?)</a>', re.M)

57.href 中有中文的須要先進行轉碼,而後再拼接 smile_url = urllib.parse.quote(smile_url) smile_url = 'http://www.jokeji.cn' + smile_url

58.導入 etree from lxml import etree

59.實例化一個 html 對象,DOM模型 etree.HTML (經過requests庫的get方法或post方法獲取的信息 其實就是 HTML 代碼) 例：html_tree = etree.HTML(text) 返回值爲 <Element html at 0x26ee35b2400> 例：type(html_tree) <class 'lxml.etree._Element'>

60.查找全部的 li 標籤 html_tree.xpath('//li')

61. Get every a under li whose href attribute is link1.html result = html_tree.xpath('//tag/tag[@attribute="value"]') Example: result = html_tree.xpath('//li/a[@href="link1.html"]')

62.獲取最後一個 li 標籤下 a 標籤下面的 href 值 result = html_tree.xpath('//li[last()]/a/@href')

63.獲取 class 爲 temp 的結點 result = html_tree.xpath('//*[@class = "temp"]')

64.獲取全部 li 標籤下的 class 屬性 result = html_tree.xpath('//li/@class')

65.取出內容 [0].text 例：result = html_tree.xpath('//li[@class="popo"]/a')[0].text 例：result = html_tree.xpath('//li[@class="popo"]/a/text()')

66.將 tree 對象轉化爲字符串 etree.tostring(etree.HTML對象).decode('utf-8')

67. Save images dynamically, using the last characters of each URL as the file name
request = urllib.request.Request(url, headers=headers)
response = urllib.request.urlopen(request)
html = response.read()
html_tree = etree.HTML(html)
img_list = html_tree.xpath('//div[@class="box picblock col3"]/div/a/img/@src2')
for img_url in img_list:
    # Name the image after the last 10 characters of its URL
    file_name = 'image/' + img_url[-10:]
    load_image(img_url, file_name)

Full example (load_image is defined here):
import urllib.request
from lxml import etree

def load_page(url):
    headers = {
        #'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36',
    }
    print(url)
    # exit()
    request = urllib.request.Request(url, headers=headers)
    response = urllib.request.urlopen(request)
    html = response.read()
    # This is a dedicated image site that lazy-loads its pictures; inspect the page source and write the xpath against the real attribute
    with open('7image.html', 'w', encoding='utf-8') as f:
        f.write(html.decode('utf-8'))
    # exit()  # uncomment to stop here and inspect the saved page
    # Parse the HTML document into a DOM tree
    html_tree = etree.HTML(html)
    # Use xpath to collect the source attribute of every image that is needed
    img_list = html_tree.xpath('//div[@class="box picblock col3"]/div/a/img/@src2')
    for img_url in img_list:
        # Name the image after the last 10 characters of its URL
        file_name = 'image/' + img_url[-10:]
        load_image(img_url, file_name)

def load_image(url, file_name):
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36',
    }
    request = urllib.request.Request(url, headers=headers)
    response = urllib.request.urlopen(request)
    image_bytes = response.read()
    with open(file_name, 'wb') as f:
        f.write(image_bytes)
    print(file_name + ' downloaded successfully')

def main():
    start = int(input('Enter the first page: '))
    end = int(input('Enter the last page: '))
    url = 'http://sc.chinaz.com/tag_tupian/'
    for page in range(start, end + 1):
        if page == 1:
            real_url = url + 'KaTong.html'
        else:
            real_url = url + 'KaTong_' + str(page) + '.html'
        load_page(real_url)
        print('Page ' + str(page) + ' downloaded')

if __name__ == '__main__':
    main()

68.懶圖片加載案例 例： import urllib.request from lxml import etree import json def handle_tree(html_tree): node_list = html_tree.xpath('//div[@class="detail-wrapper"]') duan_list = [] for node in node_list: # 獲取全部的用戶名，由於該xpath獲取的是一個span列表，而後獲取第一個，而且經過text屬性獲得其內容 user_name = node.xpath('./div[contains(@class, "header")]/a/div/span[@class="name"]')[0].text # 只要涉及到圖片，頗有可能都是懶加載，因此要右鍵查看網頁源代碼，才能獲得真實的連接 # 因爲這個獲取的結果就是屬性字符串，因此只須要加上下標0便可 face = node.xpath('./div[contains(@class, "header")]//img/@data-src')[0] # .表明當前，一個/表示一級子目錄，兩個//表明當前節點裏面任意的位置查找 content = node.xpath('./div[@class="content-wrapper"]//p')[0].text zan = node.xpath('./div[@class="options"]//li[@class="digg-wrapper "]/span')[0].text item = { 'username':user_name, 'face':face, 'content':content, 'zan':zan, } # 將其存放到列表中 duan_list.append(item) # 將列表寫入到文件中 with open('8duanzi.txt', 'a', encoding='utf-8') as f: f.write(json.dumps(duan_list, ensure_ascii=False) + '\n') print('over') def main(): # 爬取百度貼吧，不能加上headers，加上headers爬取不下來 url = 'http://neihanshequ.com/' headers = { 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36', } request = urllib.request.Request(url, headers=headers) response = urllib.request.urlopen(request) html_bytes = response.read() # fp = open('8tieba.html', 'w', encoding='utf-8') # fp.write(html_bytes.decode('utf-8')) # fp.close() # exit() # 將html字節串轉化爲html文檔樹 # 文檔樹有xpath方法，文檔節點也有xpath方法 # 【注】不能使用字節串轉化爲文檔樹，這樣會有亂碼 html_tree = etree.HTML(html_bytes.decode('utf-8')) handle_tree(html_tree) if __name__ == '__main__': main()

69. . / 和 // 在 xpath 中的使用 .表明當前目錄 / 表示一級子目錄 // 表明當前節點裏面任意的位置

70.獲取內容的示範 獲取內容時,若是爲字符串,則不須要使用 text 只須要寫[0] face = node.xpath('./div[contains(@class, "header")]//img/@data-src')[0] div 下 class 爲 "content-wrapper" 的全部 p 標籤內容 content = node.xpath('./div[@class="content-wrapper"]//p')[0].text div 下 class 爲 "options" 的全部 li 標籤下 class爲 "digg-wrapper" 的全部 span 標籤內容 zan = node.xpath('./div[@class="options"]//li[@class="digg-wrapper"]/span')[0].text

71.將json對象轉化爲json格式字符串 f.write(json.dumps(duan_list, ensure_ascii=False) + '\n')

72.正則獲取 div 下的內容 1.獲取 div 到 img 之間的數據 2.img 下 src 的數據 3.img 下 alt 的數據 4.一直到 div 結束的數據 pattern = re.compile(r'<div class="thumb">(.*?)<img src=(.*?) alt=(.*?)>(.*?)</div>', re.S) pattern.方法 ,參考上面的正則

73.帶有參數的 get 方式 import requests params = { 'wd':'中國' } r = requests.get('http://www.baidu.com/s?', headers=headers, params=params) requests.get 還能夠添加 cookie 參數

74. Set the encoding r.encoding='utf-8'

75.查看全部頭信息 r.request.headers

76.在 requests.get 方法中 url,params,headers,proxies 爲參數 url 網址 params 須要的數據 headers 頭部 proxies 代理

77. Send requests through a Session object s = requests.Session()

78. Send a POST request s.post(url, data=data, headers=headers)

79. Send a GET request s.get(url[, proxies=proxies])

80. When the response is JSON. Example: city = input('Enter the city to look up: ') params = { 'city':city } r = requests.get(url, params=params) r.json() returns the parsed response body

81. Create a BeautifulSoup object (url here is the path of a local HTML file) from bs4 import BeautifulSoup soup = BeautifulSoup(open(url,encoding='utf-8'),'lxml')

82. Find the first <title> tag soup.title returns <title>三國猛將</title>

83. Find the first a tag soup.a returns <a class="aa" href="http://www.baidu.com" title="baidu">百度</a>

84. Find the first ul tag soup.ul

85. Get the tag name a_tag = soup.a a_tag.name returns a

86. Get the tag's attributes a_tag.attrs returns {'href': 'http://www.baidu.com', 'title': 'baidu', 'class': ['aa']}

87. Get the href of the first a tag soup.a.get('href') returns http://www.baidu.com

88. Get the title attribute of the first a tag soup.a.get('title') returns baidu

89. Get the text inside the a tag with soup.tag.string (the tag can also be head, title and so on) soup.a.string returns 百度

90. Get the text inside the p tag soup.p.string

91. Get the contents of div, including '\n' soup.div.contents returns ['\n', <div class="div"> <a class="la" href="www.nihao.com">你好</a> </div>, '\n', <div> <a href="www.hello.com">世界</a> </div>, '\n']

92. Check which character set is used soup.div.contents[1] returns <meta charset="utf-8"/>

93. Get the child nodes of body with soup.tag.children. Example: soup.body.children The return value is an iterator, <list_iterator object at 0x0000021863886C10>, so it has to be looped over:
for child in soup.body.children:
    print(child)
This prints everything inside body.

94. Get all descendant nodes with soup.tag.descendants. Example: soup.div.descendants yields <div class="div"> <a class="la" href="www.nihao.com">你好</a> </div> <a class="la" href="www.nihao.com">你好</a> 你好

95. Find all a tags soup.find_all('a') returns a list containing every a tag

96. Get the text of the second link soup.find_all('a')[1].string

97. Get the href of the second link soup.find_all('a')[1]['href']

98. Pass a compiled regular expression to find every tag whose name starts with b soup.find_all(re.compile('^b')) returns the <body> and <b> tags

99. Find all a tags and b tags (pass a list of names) soup.find_all(['a','b']) returns the <a> and <b> tags

100. Select all a tags by tag name soup.select('a') returns every <a> tag

101. Select by class name (prefix the class value with .) soup.select('.aa') returns the tags with class='aa'

102. Select by id (prefix the id value with #) soup.select('#wangyi') returns the tag with id='wangyi'

103. Select tags with class='aa' anywhere under div soup.select('tag .class-value') soup.select('div .aa')

104. Select tags with class la that are direct children of the element with class div soup.select('.parent-class > .child-class') soup.select('.div > .la')

105. Select by attribute: input tags whose class is haha soup.select('input[class="haha"]')

Example:
import requests
from bs4 import BeautifulSoup
import json
import lxml

def load_url(jl, kw):
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36',
    }
    url = 'http://sou.zhaopin.com/jobs/searchresult.ashx?'
    params = {
        'jl':jl,
        'kw':kw,
    }
    # requests encodes the parameters automatically
    r = requests.get(url, params=params, headers=headers)
    handle_data(r.text)

def handle_data(html):
    # Create the soup object
    soup = BeautifulSoup(html, 'lxml')
    # Find the job tables
    job_list = soup.select('#newlist_list_content_table table')
    # print(job_list)
    jobs = []
    i = 1
    for job in job_list:
        # The first table is only the header row, so skip it
        if i == 1:
            i = 0
            continue
        item = {}
        # Job title
        job_name = job.select('.zwmc div a')[0].get_text()
        # Company name
        company_name = job.select('.gsmc a')[0].get_text()
        # Work location
        area = job.select('.gzdd')[0].get_text()
        # Date posted
        time = job.select('.gxsj span')[0].get_text()
        # Add everything to the dict
        item['job_name'] = job_name
        item['company_name'] = company_name
        item['area'] = area
        item['time'] = time
        jobs.append(item)
    # Convert the list to a JSON string and write it to a file
    content = json.dumps(jobs, ensure_ascii=False)
    with open('python.json', 'w', encoding='utf-8') as f:
        f.write(content)
    print('over')

def main():
    # jl = input('Enter the work location: ')
    # kw = input('Enter the job title: ')
    load_url(jl='北京', kw='python')

if __name__ == '__main__':
    main()

106. Convert a dict to a JSON string import json str_dict = {"name":"張三", "age":55, "height":180} print(json.dumps(str_dict, ensure_ascii=False)) With ensure_ascii=False, non-ASCII characters are written as-is instead of as \uXXXX escapes

107. Read the converted string back (note the difference between loads, which takes a string, and load, which takes a file object) json.loads(JSON string) string = json.dumps(str_dict, ensure_ascii=False) json.loads(string) returns {'name': '張三', 'age': 55, 'height': 180}
108.將對象序列化以後寫入文件 json.dump(字典對象,open(文件名.json,'w',encoding='utf-8,ensure_ascii=False)) json.dump(str_dict, open('jsontest.json', 'w', encoding='utf-8'), ensure_ascii=False) 109.轉換本地的 json 文件轉換爲 python 對象 json.load(open('文件名.json',encoding='utf-8)) 110.jsonpath 示例： book.json文件 { "store": { "book": [ { "category": "reference", "author": "Nigel Rees", "title": "Sayings of the Century", "price": 8.95 }, { "category": "fiction", "author": "Evelyn Waugh", "title": "Sword of Honour", "price": 12.99 }, { "category": "fiction", "author": "Herman Melville", "title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99 }, { "category": "fiction", "author": "J. R. R. Tolkien", "title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99 } ], "bicycle": { "color": "red", "price": 19.95 } } } import json import jsonpath obj = json.load(open('book.json', encoding='utf-8')) 全部book book = jsonpath.jsonpath(obj, '$..book') print(book) 全部book中的全部做者 authors = jsonpath.jsonpath(obj, '$..book..author') print(authors) book中的前兩本書 '$..book[:2]' book中的最後兩本書 '$..book[-2:]' book = jsonpath.jsonpath(obj, '$..book[0,1]') print(book) 全部book中，有屬性isbn的書籍 book = jsonpath.jsonpath(obj, '$..book[?(@.isbn)]') print(book) 全部book中，價格小於10的書籍 book = jsonpath.jsonpath(obj, '$.store.book[?(@.price<10)]') print(book) 111.requests.get 方法的流程 r = requests.get('https://www.baidu.com/').content.decode('utf-8') 從狀態碼到 二進制碼到 utf-8 編碼 112.對 soup 對象進行美化 html = soup.prettify() <title> 百度一下，你就知道 </title> 113.將內容 string 化 html.xpath('string(//*[@id="cnblogs_post_body"])') 114.獲取屬性 soup.p['name'] 115.嵌套選擇 soup.head.title.string 116.獲取父節點和祖孫節點 soup.a.parent list(enumerate(soup.a.parents)) 117.獲取兄弟節點 soup.a.next_siblings list(enumerate(soup.a.next_siblings)) soup.a.previous_siblings list(enumerate(soup.a.previous_siblings)) 118.按照特定值查找標籤 查找 id 爲 list-1 的標籤 soup.find_all(attrs={'id': 'list-1'}) soup.find_all(id='list-1') 119.返回父節點 find_parents()返回全部祖先節點 find_parent()返回直接父節點 120.返回後面兄弟節點 
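find_next_siblings() returns all following sibling nodes; find_next_sibling() returns the first following sibling node.

121. Returning preceding sibling nodes
find_previous_siblings() returns all preceding sibling nodes; find_previous_sibling() returns the first preceding sibling node.

122. Returning matching nodes after a node
find_all_next() returns all matching nodes after the node; find_next() returns the first matching node.

123. Returning matching nodes before a node
find_all_previous() returns all matching nodes before the node; find_previous() returns the first matching node.

124. requests request methods
    requests.post(url)
    requests.put(url)
    requests.delete(url)
    requests.head(url)
    requests.options(url)

125. GET request
    response = requests.get(url)
    print(response.text)

126. Parsing JSON
response.json() parses the response body as JSON; json.loads(response.text) does the same by hand.
    response.json()
    json.loads(response.text)

127. Sending a POST request
    response = requests.post(url, data=data, headers=headers)
    response.json()

128. File upload
Pass a files dictionary to the post method.
    import requests
    files = {'file': open('favicon.ico', 'rb')}
    response = requests.post("http://httpbin.org/post", files=files)
    print(response.text)

129. Getting cookies
response.cookies returns a dict-like object (a RequestsCookieJar).
    for key, value in response.cookies.items():
        print(key + '=' + value)

130. Simulated login
Two separate requests.get calls do not share cookies, so the second response does not contain the cookie set by the first.
    requests.get('http://httpbin.org/cookies/set/number/123456789')
    response = requests.get('http://httpbin.org/cookies')

131. Login with a Session
A Session keeps cookies between requests, so the second response contains the cookie set by the first.
    s = requests.Session()
    s.get('http://httpbin.org/cookies/set/number/123456789')
    response = s.get('http://httpbin.org/cookies')

132. Certificate verification
verify=False skips certificate verification, and urllib3.disable_warnings() suppresses the resulting warning. cert= supplies a client-side certificate and key.
    import urllib3
    urllib3.disable_warnings()
    response = requests.get('https://www.12306.cn', verify=False)
    response = requests.get('https://www.12306.cn', cert=('/path/server.crt', '/path/key'))

133. Timeout settings
    from requests.exceptions import ReadTimeout
    response = requests.get("http://httpbin.org/get", timeout=0.5)
    # the urllib equivalent
    response = urllib.request.urlopen(url, timeout=1)

134. Authentication settings
    from requests.auth import HTTPBasicAuth
    r = requests.get('http://120.27.34.24:9001', auth=HTTPBasicAuth('user', '123'))
    # shorthand: a (user, password) tuple is treated as HTTP Basic auth
    r = requests.get('http://120.27.34.24:9001', auth=('user', '123'))

135. Exception handling
    Timeout: ReadTimeout
    Connection failure: ConnectionError
    Any other request error: RequestException (the base class)

136. URL parsing
    from urllib.parse import urlparse
    result = urlparse('http://www.baidu.com/index.html;user?id=5#comment')
    # no '//' before the host, so it is parsed as part of path; scheme falls back to 'https'
    result = urlparse('www.baidu.com/index.html;user?id=5#comment', scheme='https')
    # the fragment is not split off; '#comment' stays in the query
    result = urlparse('http://www.baidu.com/index.html;user?id=5#comment', allow_fragments=False)

137. urllib.parse.urlunparse
    data = ['http', 'www.baidu.com', 'index.html', 'user', 'a=6', 'comment']
    print(urlunparse(data))
    http://www.baidu.com/index.html;user?a=6#comment

138. Joining URLs with urllib.parse.urljoin
    urljoin('http://www.baidu.com', 'FAQ.html')
    http://www.baidu.com/FAQ.html
    urljoin('www.baidu.com#comment', '?category=2')
    www.baidu.com?category=2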
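The URL parsing calls above can be checked without any network access. The following is a minimal sketch using only the standard library; the variable names (full, no_scheme, no_fragment, rebuilt, joined) are chosen here for illustration and are not from the original notes.

```python
from urllib.parse import urlparse, urlunparse, urljoin

url = 'http://www.baidu.com/index.html;user?id=5#comment'

# A full URL is split into six parts:
# scheme, netloc, path, params, query, fragment.
full = urlparse(url)
print(full)

# Without '//' the host is not recognised as netloc; it stays in path,
# and the scheme argument is used as the fallback scheme.
no_scheme = urlparse('www.baidu.com/index.html;user?id=5#comment', scheme='https')
print(no_scheme.scheme, repr(no_scheme.netloc), no_scheme.path)

# With allow_fragments=False the '#comment' part is left in the query.
no_fragment = urlparse(url, allow_fragments=False)
print(no_fragment.query, repr(no_fragment.fragment))

# urlunparse is the inverse of urlparse for a six-item sequence.
rebuilt = urlunparse(full)
print(rebuilt)

# urljoin resolves a relative reference against a base URL.
joined = urljoin('http://www.baidu.com', 'FAQ.html')
print(joined)
```

The ParseResult returned by urlparse is a named tuple, so the parts can be read by attribute (full.netloc) or by index (full[1]).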

matplotlib examples

Example: plt.plot with a single list (the list supplies the y values; x defaults to the indices 0 to N-1)

    import matplotlib.pyplot as plt
    plt.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
    plt.rcParams['axes.unicode_minus'] = False    # display the minus sign correctly
    lst = [4.53, 1.94, 4.75, 0.43, 2.02, 1.22, 2.13, 2.77]
    plt.plot(lst)
    plt.title("使用一行列表進行繪製折線圖")
    plt.show()

Example: two lines drawn in one plt.plot call

    import matplotlib.pyplot as plt
    plt.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
    plt.rcParams['axes.unicode_minus'] = False    # display the minus sign correctly
    x = range(0, 8)
    y1 = [4.53, 1.74, 4.55, 0.03, 2.12, 1.22, 2.43, 2.77]
    y2 = [2.38, 4.23, 1.49, 2.75, 3.73, 4.90, 0.13, 1.29]
    plt.plot(x, y1, 'b-1', x, y2, 'm:o')
    plt.xlabel('x軸')
    plt.ylabel('y軸')
    plt.title("繪製兩個折線圖示例")
    plt.show()

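Setting the display style
Each format string combines a color, a line style, and a marker. 'b-1' is a blue solid line with tri-down markers; 'm:o' is a magenta dotted line with circle markers.

    plt.plot(x, y1, 'b-1', x, y2, 'm:o')

Setting Chinese labels
The SimHei font must be installed on the system for this to take effect.

    plt.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
    plt.rcParams['axes.unicode_minus'] = False    # display the minus sign correctly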

    import numpy as np
    import matplotlib.pyplot as plt
    with open("haidian.csv", "r", encoding='utf-8') as f:
        data = np.loadtxt(f, str, delimiter=',')
    x = data[:, 1][::10]   # every 10th value of column 1
    y = data[:, 4][::10]   # every 10th value of column 4
    plt.plot(x, y, 'g-o')
    plt.xlabel("時間", fontproperties='SimHei')
    plt.ylabel("溫度", fontproperties='SimHei')
    plt.title("海淀地區20日溫度趨勢圖", fontproperties='FangSong', fontsize=20)
    plt.xticks(rotation=90)  # rotate the x-axis tick labels
    plt.show()

When setting the x and y labels, specify the font to use with fontproperties='SimHei':

    plt.xlabel("時間", fontproperties='SimHei')
    plt.ylabel("溫度", fontproperties='SimHei')

Reading a csv file with np.loadtxt
Open the file with with open first, then read it with np.loadtxt(f, str, delimiter=','). The values read this way are of type numpy.str_; convert them with str() when you use them.

    with open("haidian.csv", "r", encoding='utf-8') as f:
        data = np.loadtxt(f, str, delimiter=',')

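Histogram: hist parameters
    x: required; the data to plot
    bins: number of bars in the histogram; optional, default 10
    density: whether to normalize the histogram; optional, default False (no normalization, the bars show counts). density=True normalizes the bars so that their total area is 1, i.e. they show a probability density. Older tutorials call this parameter normed (normed=0 / normed=1); normed was removed in matplotlib 3.1.
    facecolor: fill color of the bars
    edgecolor: color of the bar borders
    alpha: transparency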
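To make the bins and normalization parameters concrete, here is a minimal pure-Python sketch of equal-width binning. It is not matplotlib's implementation; the function name histogram and the sample data are assumptions made for illustration, and it follows the common convention that the last bin includes the maximum value.

```python
def histogram(data, bins=10, density=False):
    """Equal-width binning over [min(data), max(data)].

    Returns (heights, edges). Heights are counts, or count / (n * width)
    when density is True, so that the bar areas sum to 1.
    """
    lo, hi = min(data), max(data)
    if lo == hi:
        # Avoid a zero-width range when all values are identical.
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for value in data:
        # Clamp the index so the maximum value falls into the last bin.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    if not density:
        return counts, edges
    total = len(data)
    return [c / (total * width) for c in counts], edges


sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # hypothetical data

counts, edges = histogram(sample, bins=4)
print(counts)  # [1, 2, 3, 4]

heights, _ = histogram(sample, bins=4, density=True)
bin_width = edges[1] - edges[0]
area = sum(h * bin_width for h in heights)
print(heights, area)
```

With density=True the heights depend on the bin width, which is why they are a density rather than a simple relative frequency.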

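Example: two bar series in one figure (plt.bar draws bar charts, not histograms)

    from matplotlib import pyplot as plt
    x = [5, 8, 10]
    y = [12, 16, 6]
    x2 = [6, 9, 11]
    y2 = [6, 15, 7]
    plt.bar(x, y, align='center', label='x')
    plt.bar(x2, y2, color='g', align='center', label='x2')
    plt.title('直方圖圖示')
    plt.ylabel('Y軸')
    plt.xlabel('X軸')
    plt.legend()
    plt.show()

Drawing subplots with plt.subplots(2, 1) and setting labels through each subplot
plt.subplots(2, 1) returns the figure and an array of two axes. avg_wd and avg_sd are the temperature and humidity data prepared elsewhere.

    fig, ax = plt.subplots(2, 1)
    ax[0].hist(avg_wd, bins=15, alpha=0.7)
    ax[0].set(title="時間和溫度的關係圖", ylabel="溫度")  # set the title and y label
    ax[1].hist(avg_sd, bins=15, alpha=0.7)
    ax[1].set_title('時間和溼度的關係圖')  # replaced by the title passed to set() below
    ax[1].set(title="14-28日煙臺時間和溼度的關係圖", ylabel="溼度")  # set the title and y label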

matplotlib colors, line styles, and drawing straight lines

    plt.axhline(y=0, ls=":", c="yellow")  # add a horizontal line
    plt.axvline(x=4, ls="-", c="green")   # add a vertical line

matplotlib subplots

data_math_C1 to data_math_C4 are data series defined elsewhere.

    fig, subs = plt.subplots(2, 2)
    subs[0][0].plot(data_math_C1)
    subs[0][0].set_title('C_1 曲線')
    subs[0][1].plot(data_math_C2)
    subs[0][1].set_title('C_2 曲線')
    subs[1][0].plot(data_math_C3)
    subs[1][0].set_title('C_3 曲線')
    subs[1][1].plot(data_math_C4)
    subs[1][1].set_title('C_4 曲線')
    plt.show()

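Fixing garbled text when saving data to csv and failed reads with numpy or pandas

Writing the data to a csv file
allUniv is a list of lists, [[...], [...]]. Use the utf-8-sig encoding, which writes a byte order mark at the start of the file so that programs such as Excel detect UTF-8.

    import csv
    with open('文件名.csv', 'w', newline='', encoding='utf-8-sig') as fout:
        write = csv.writer(fout)
        columns = ['文字', '文字', '文字', '文字', '文字', '文字']
        write.writerow(columns)   # header row
        for row in allUniv:
            write.writerow(row)

Reading the csv file
Be sure to read it with pd.read_csv.

    import pandas as pd
    data = pd.read_csv('文件名.csv')
    print(data[:5])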
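The effect of utf-8-sig can be checked with the standard library alone. The sketch below is an illustration under assumptions: the file name universities.csv, the column names, and the two sample rows are made up here, and the csv module stands in for pandas when reading back.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for allUniv.
rows = [["清華大學", "北京", 1], ["浙江大學", "浙江", 3]]
columns = ["name", "province", "rank"]

path = os.path.join(tempfile.mkdtemp(), "universities.csv")

# utf-8-sig writes the UTF-8 byte order mark (EF BB BF) before the data.
with open(path, "w", newline="", encoding="utf-8-sig") as fout:
    writer = csv.writer(fout)
    writer.writerow(columns)
    writer.writerows(rows)

with open(path, "rb") as fin:
    has_bom = fin.read().startswith(b"\xef\xbb\xbf")
print(has_bom)

# Reading with utf-8-sig strips the mark again.
with open(path, "r", newline="", encoding="utf-8-sig") as fin:
    read_back = list(csv.reader(fin))
print(read_back[0])

# Reading with plain utf-8 leaves the mark glued to the first header cell,
# which is one way a column lookup by name can fail.
with open(path, "r", newline="", encoding="utf-8") as fin:
    first_cell = next(csv.reader(fin))[0]
print(repr(first_cell))
```

Note that the csv module returns every field as a string, so the rank column comes back as '1' and '3'.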

Listing all factors of a number and the sum of the factors
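A minimal sketch of what this heading describes, assuming that "factors" means all positive divisors including 1 and the number itself. The function name factors is chosen here for illustration.

```python
def factors(n):
    """Return all positive divisors of n in ascending order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    small, large = [], []
    i = 1
    # Divisors come in pairs (i, n // i), so it is enough to test i up to sqrt(n).
    while i * i <= n:
        if n % i == 0:
            small.append(i)
            if i != n // i:
                large.append(n // i)
        i += 1
    return small + large[::-1]


divisors = factors(28)
print(divisors)       # [1, 2, 4, 7, 14, 28]
print(sum(divisors))  # 56
```

To exclude the number itself from the sum (the proper divisors), use sum(divisors) - n instead.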
