Python Full-Stack Development, Day 26 (hashlib file integrity checks, configparser, logging, and collections modules)

Date: 2019-11-18


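1. hashlib: file integrity verification

Why verify file integrity?

To make sure the file you obtained is the correct version and has not been injected with viruses or trojans. For example, we often download software from the internet, and some of it has had adware or malware injected. If we do not check the file against what the original publisher released, we may suffer real losses.

How file integrity verification works
We cannot compare files the way we compare two text files, side by side, because files are often very large. The most practical approach today is to run the file through a hash algorithm to produce a digest, then compare that value with the one the publisher provides to confirm that the two files are identical.

MD5 and SHA1 are two of the most widely used algorithms for this. Strictly speaking they are hash (digest) algorithms rather than encryption, since a digest cannot be decrypted back into the file. Both are known to be vulnerable to collision attacks, so SHA-256 is the safer choice when security matters.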
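As a small illustration of what these algorithms produce, the sketch below hashes the same bytes with MD5, SHA-1, and SHA-256 through `hashlib.new`. The sample input `b'123'` and the helper name `hex_digest` are assumptions made for this example only.

```python
import hashlib

def hex_digest(data: bytes, algorithm: str) -> str:
    """Return the hexadecimal digest of data for the named algorithm."""
    h = hashlib.new(algorithm)  # e.g. 'md5', 'sha1', 'sha256'
    h.update(data)
    return h.hexdigest()

sample = b'123'  # illustrative input only
digests = {name: hex_digest(sample, name) for name in ('md5', 'sha1', 'sha256')}
for name, value in digests.items():
    # Each hex character encodes 4 bits, so length * 4 is the digest size in bits.
    print(name, len(value) * 4, value)
```

The digest length is fixed per algorithm (128, 160, and 256 bits here), no matter how large the input is.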

Example:

First create two files by hand, file1 and file2, each containing 123.

Compute the MD5 digest of file1:

import hashlib
md5obj = hashlib.md5()
with open('file1','rb')as f:
    content = f.read()
    md5obj.update(content)
print(md5obj.hexdigest())

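Running it prints the MD5 digest of file1 as a 32-character hexadecimal string.

To compute the digest of file2, should we copy the code above again? That is clumsy. What if there are many files?

Define a function: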

import hashlib
def check_md5(filename):
    md5obj = hashlib.md5()
    with open(filename,'rb')as f:
        content = f.read()
        md5obj.update(content)
    return md5obj.hexdigest()
ret1 = check_md5('file1')
ret2 = check_md5('file2')
print(ret1)
print(ret2)

Running it prints the digest of each file.

Comparing the two values tells us whether the files are identical.
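One way to turn the comparison into a yes/no answer is sketched below. It assumes the publisher's digest is available as a hex string. The name `matches_published_digest` and the temporary file are hypothetical and exist only for this illustration.

```python
import hashlib
import os
import tempfile

def check_md5(filename):
    """Return the MD5 hex digest of a file (reads the whole file at once)."""
    with open(filename, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

def matches_published_digest(filename, published_hex):
    """True if the file's digest equals the published one (case-insensitive)."""
    return check_md5(filename) == published_hex.strip().lower()

# Demonstration with a throwaway file standing in for a download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'123')
    demo_path = tmp.name

published = hashlib.md5(b'123').hexdigest()   # stands in for the publisher's value
ok = matches_published_digest(demo_path, published)
tampered = matches_published_digest(demo_path, '0' * 32)
os.remove(demo_path)
print(ok, tampered)
```

Normalizing the published string with `strip().lower()` guards against trailing whitespace or uppercase hex copied from a web page.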

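This approach has a flaw, though: when a file reaches the GB range, how will memory cope? (Hashing this way means reading the entire file into memory first.)

So what can we do? First look at a small example: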

import hashlib
md5obj = hashlib.md5()
md5obj.update(b'johnalen')  # b'...' is a bytes literal; it may contain only ASCII characters
print(md5obj.hexdigest())

Split the string into pieces:

import hashlib
md5obj = hashlib.md5()   # create an MD5 object
md5obj.update(b'john')   # first piece of the string
md5obj.update(b'alen')   # second piece of the string
print(md5obj.hexdigest())

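Running it prints the digest of the concatenated bytes b'johnalen'.

Conclusion:

Hashing a string in one call and hashing it piece by piece produce the same digest, as long as the pieces add up to the same bytes.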
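The conclusion can be checked directly. The following minimal sketch compares a one-shot digest with a piecewise one, and also shows that a different input gives a different digest. The variable names are illustrative.

```python
import hashlib

# Digest of the full byte string in a single call.
whole = hashlib.md5(b'johnalen').hexdigest()

# Digest built up from two pieces that concatenate to the same bytes.
pieces = hashlib.md5()
for part in (b'john', b'alen'):
    pieces.update(part)

same = whole == pieces.hexdigest()
different_text = hashlib.md5(b'john').hexdigest() == whole
print(same, different_text)
```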

So a large file can be hashed with MD5 piece by piece.

Download a movie, The Legend of 1900 (海上鋼琴師). The file is 1.58 GB.

The film tells the legendary life of a piano prodigy. Douban rating: 9.2.

Compute the movie's MD5 value:

import hashlib

def check(filename):
    md5obj = hashlib.md5()
    with open(filename, 'rb') as f:
        while True:
            content = f.read(1048576)  # read 1048576 bytes (1 MB) at a time
            if content:
                md5obj.update(content)
            else:
                break  # stop the loop once nothing is left to read
    return md5obj.hexdigest()

# raw string so the backslashes in the Windows path are taken literally
ret1 = check(r'E:\迅雷下載\[迅雷下載www.2tu.cc]海上鋼琴師.BD1280高清中英雙字.rmvb')
print(ret1)

It took 9 seconds and printed:

30c7f078203d761d3f13bec6f8fd3088
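The same chunked loop can be written more compactly with the two-argument form of `iter`. The sketch below is one possible variant, not the code used above: `hash_file`, its `algorithm` and `chunk_size` parameters, and the random temporary file are all assumptions for the demonstration.

```python
import hashlib
import os
import tempfile

def hash_file(path, algorithm='sha256', chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so memory use stays constant."""
    h = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read(chunk_size) until it returns b''.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# A few megabytes of random data stand in for the large download.
payload = os.urandom(3 * 1024 * 1024 + 17)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

chunked = hash_file(demo_path, 'md5', chunk_size=64 * 1024)
one_shot = hashlib.md5(payload).hexdigest()
os.remove(demo_path)
print(chunked == one_shot)
```

The chunk size affects only speed and memory use, never the resulting digest.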

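Summary:

Serialization: converting a data type into a string

  Why serialize? Because only bytes can travel over the network or be stored in files.

json

  Works across all languages, but serializes only a limited set of data types: dict, list, string, number, and tuple.

  After dumping data into a file several times, you cannot read it back with a single load.

pickle

  Works only in Python, but can serialize the vast majority of data types.

  At load time, the class of the object being loaded must be available in memory.

  dumps: serialize

  loads: deserialize

  dump: serialize directly into a file

  load: deserialize directly from a file

shelve

  f = shelve.open() opens the file

json and pickle must be mastered.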
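To make the json and pickle points in this summary concrete, here is a short round-trip sketch. The sample `record` and the one-document-per-line workaround are assumptions for illustration, not something prescribed by the text above.

```python
import json
import pickle

record = {'name': 'alex', 'scores': (90, 85), 'active': True}

# json: text-based and language-neutral, but a tuple comes back as a list.
as_text = json.dumps(record)
from_json = json.loads(as_text)

# pickle: Python-only bytes, and the tuple survives the round trip.
as_bytes = pickle.dumps(record)
from_pickle = pickle.loads(as_bytes)

print(type(as_text).__name__, from_json['scores'])
print(type(as_bytes).__name__, from_pickle['scores'])

# Writing one JSON document per line lets several records share a file,
# which sidesteps the "multiple dump, single load" problem noted above.
lines = '\n'.join(json.dumps(item) for item in ({'id': 1}, {'id': 2}))
records = [json.loads(line) for line in lines.splitlines()]
print(records)
```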

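2. The configparser module

  This module handles configuration files whose format resembles Windows INI files. A file can contain one or more sections, and each section can hold multiple parameters (key = value).

Creating a file

Many programs use this kind of document format. A section is called a node, and the key-value assignments inside a node are called items.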

How would you generate such a document with Python?

import configparser

config = configparser.ConfigParser()  # create a ConfigParser object

config["DEFAULT"] = {'ServerAliveInterval': '45',  # default parameters
                     'Compression': 'yes',
                     'CompressionLevel': '9',
                     'ForwardX11': 'yes'
                     }
config['bitbucket.org'] = {'User': 'hg'}  # add a section named bitbucket.org
config['topsecret.server.com'] = {'Host Port': '50022', 'ForwardX11': 'no'}

with open('example.ini', 'w') as configfile:  # write the configuration to example.ini
    config.write(configfile)

Run the program and look at the contents of example.ini:

[DEFAULT]
serveraliveinterval = 45
forwardx11 = yes
compression = yes
compressionlevel = 9

[bitbucket.org]
user = hg

[topsecret.server.com]
forwardx11 = no
host port = 50022

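Notice that the item names in every section have become lowercase.

This is because configparser passes each option name through its optionxform() method, which lowercases it by default. Section names and values keep their original case.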
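If the original capitalization matters, `optionxform` can be overridden. The sketch below is a minimal comparison that writes to an in-memory buffer instead of example.ini. The `dump` helper and the variable names are assumptions for the example.

```python
import configparser
import io

def dump(config):
    """Serialize a ConfigParser to a string instead of a file."""
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()

# Default behaviour: option names are lowercased.
default_cfg = configparser.ConfigParser()
default_cfg['bitbucket.org'] = {'User': 'hg'}

# Override: keep option names exactly as written.
case_cfg = configparser.ConfigParser()
case_cfg.optionxform = str
case_cfg['bitbucket.org'] = {'User': 'hg'}

print(dump(default_cfg))
print(dump(case_cfg))
```

The override has to be set before any options are added or read, because the transformation is applied at the moment each option is stored.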

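Looking things up in the file

import configparser
config = configparser.ConfigParser()
config.read('example.ini')          # the lines above are the fixed part
print(config.sections())            # list all sections; DEFAULT is not included; returns a list

Output: ['bitbucket.org', 'topsecret.server.com']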

The snippets below omit the fixed part.

print('bitbucket.org' in config)  # check whether a section exists in the file

Output: True

print(config['bitbucket.org']['user'])  # look up the value of one item in a section

Output: hg

print(config['bitbucket.org'])  # prints a Section object, which is iterable

Output: <Section: bitbucket.org>

# loop over the iterable object
for key in config['bitbucket.org']:  # note: when a DEFAULT section exists, its keys are included too
    print(key)

Output:

user
serveraliveinterval
forwardx11
compression
compressionlevel

print(config.items('bitbucket.org'))  # all key-value pairs under 'bitbucket.org'

Output:

[('serveraliveinterval', '45'), ('forwardx11', 'yes'), ('compression', 'yes'), ('compressionlevel', '9'), ('user', 'hg')]

print(config.get('bitbucket.org', 'compression'))  # get the value stored under a key in a section

Output: yes
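Every value read this way is a string. configparser also provides typed getters and a `fallback` argument, sketched below on an in-memory copy of part of the file. `INI_TEXT` and the `port` key are hypothetical and used only for illustration.

```python
import configparser

INI_TEXT = """
[DEFAULT]
serveraliveinterval = 45
compression = yes

[bitbucket.org]
user = hg
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

# Values inherited from DEFAULT are visible inside every section.
interval = config.getint('bitbucket.org', 'serveraliveinterval')
# getboolean understands yes/no, on/off, true/false and 1/0.
compressed = config.getboolean('bitbucket.org', 'compression')
# 'port' is a hypothetical key that is not in the file, so the fallback is returned.
missing = config.get('bitbucket.org', 'port', fallback='22')
print(interval, compressed, missing)
```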

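Adding, deleting, and modifying

Add a section

config.add_section('yuan')  # add a section (the call returns None, so there is nothing to print)

Note that this is not written to disk immediately! You must also run the following code:

config.write(open('example.ini', "w"))  # write to the file

open('example.ini', 'w') opens the file for writing, which empties it

config.write writes the contents

Check the file contents again: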

[DEFAULT]
serveraliveinterval = 45
forwardx11 = yes
compression = yes
compressionlevel = 9

[bitbucket.org]
user = hg

[topsecret.server.com]
forwardx11 = no
host port = 50022

[yuan]

Delete a section

config.remove_section('bitbucket.org')
config.write(open('example.ini','w'))

Modify a section

config.set('yuan', 'k2', '222')  # add the item k2 = 222 to the yuan section
config.write(open('example.ini', "w"))  # write to the file

Summary:

section: operate on the config object directly to get information about every section

option: use the section you found to view all of its items
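The add, set, and remove calls can be tried without touching example.ini. The sketch below works on a small in-memory configuration and writes to a `StringIO` buffer. The starting content and the buffer are assumptions standing in for the real file.

```python
import configparser
import io

config = configparser.ConfigParser()
config.read_string("[bitbucket.org]\nuser = hg\n")

if not config.has_section('yuan'):     # add_section raises DuplicateSectionError otherwise
    config.add_section('yuan')
config.set('yuan', 'k2', '222')        # values must be strings
config.remove_section('bitbucket.org')

buffer = io.StringIO()                 # stands in for open('example.ini', 'w')
config.write(buffer)
print(config.sections())
print(buffer.getvalue())
```

Nothing reaches disk until `write` is called, which is why every change above is followed by a write in the file-based version.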

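3. logging

To protect data,
every add, modify, and delete operation should be logged.

Examples: log files, administrator operation logs, purchase records...

Logs make internal operations much more convenient
Logs give users more information
Logs hold what you need to see while debugging a running program
Logs help programmers track down problems

The logging module does not add log content for you automatically.
You write whatever you want recorded.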

import logging
logging.debug('debug message')
logging.info('info message')
logging.warning('warning message')
logging.error('error message')
logging.critical('critical message')

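Output:

WARNING:root:warning message
ERROR:root:error message
CRITICAL:root:critical message

The default level is WARNING, so the debug and info messages are dropped. If the level is set to INFO, only messages at INFO and above are shown.

Can a single level be shown on its own? Not through the level setting!
It only prints messages at or above the chosen level.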
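The level setting is only a threshold, but the standard library also offers filters, which are a separate mechanism from the level described here. As a hedged sketch, a filter attached to a handler can pass records of exactly one level. `OnlyLevel`, the logger name `demo.single_level`, and the in-memory stream are assumptions for this example.

```python
import io
import logging

class OnlyLevel(logging.Filter):
    """Pass only records whose level equals the one given."""
    def __init__(self, level):
        super().__init__()
        self.level = level

    def filter(self, record):
        return record.levelno == self.level

stream = io.StringIO()                      # captures output instead of the console
handler = logging.StreamHandler(stream)
handler.addFilter(OnlyLevel(logging.INFO))

logger = logging.getLogger('demo.single_level')
logger.setLevel(logging.DEBUG)              # let every level reach the handler
logger.propagate = False                    # keep the demo away from the root logger
logger.addHandler(handler)

logger.debug('debug message')
logger.info('info message')
logger.warning('warning message')

print(stream.getvalue())
```

Only the info message reaches the stream, because the filter rejects both lower and higher levels.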

Add a timestamp

import logging
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] '
                           '%(levelname)s %(message)s')
logging.debug('debug message')        # debug: debugging detail, the lowest level
logging.info('info message')          # info: normal information
logging.warning('warning message')    # warning: warnings
logging.error('error message')        # error: errors
logging.critical('critical message')  # critical: severe errors

Output: each message now starts with the timestamp, file name, and line number.

Set the time format

# set the time format
import logging
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                    datefmt='%a, %d %b %y %H:%M:%S')
logging.debug('debug message')       # debug: debugging detail, the lowest level
logging.info('info message')         # info: normal information
logging.warning('warning message')   # warning: warnings
logging.error('error message')       # error: errors
logging.critical('critical message') # critical: severe errors

Output: the timestamp now follows the datefmt pattern.

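The arguments of logging.basicConfig() change the default behaviour of the logging module. The available arguments are:

filename: create a FileHandler with the given file name, so that log records are stored in that file.
filemode: the mode used to open the file when filename is given. The default is "a"; "w" can also be specified.
format: the log format used by the handler.
datefmt: the date and time format.
level: the log level of the root logger (the concept is explained later).
stream: create a StreamHandler with the given stream. It can be sys.stderr, sys.stdout, or a file (f=open('test.log','w')); the default is sys.stderr. Do not pass filename and stream together: Python 2 ignored stream in that case, while Python 3 raises ValueError.

Format fields that may be used in the format argument:
%(name)s the name of the Logger
%(levelno)s the log level as a number
%(levelname)s the log level as text
%(pathname)s the full path of the module that made the logging call; may be unavailable
%(filename)s the file name of the module that made the logging call
%(module)s the name of the module that made the logging call
%(funcName)s the name of the function that made the logging call
%(lineno)d the line number of the statement that made the logging call
%(created)f the current time as a UNIX timestamp, a floating-point number
%(relativeCreated)d the number of milliseconds between loading the logging module and creating the log record
%(asctime)s the current time as a string. The default format is "2003-07-08 16:49:45,896"; the digits after the comma are milliseconds
%(thread)d the thread ID; may be unavailable
%(threadName)s the thread name; may be unavailable
%(process)d the process ID; may be unavailable
%(message)s the message supplied by the user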
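To see a few of these fields in action, the sketch below formats one record into an in-memory stream. The logger name `demo.format`, the field separator, and the message text are assumptions chosen for the example.

```python
import io
import logging

stream = io.StringIO()                      # captures output instead of the console
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter('%(name)s|%(levelno)s|%(levelname)s|%(message)s'))

logger = logging.getLogger('demo.format')
logger.setLevel(logging.DEBUG)
logger.propagate = False                    # keep the demo away from the root logger
logger.addHandler(handler)

logger.warning('disk almost full')
line = stream.getvalue().strip()
print(line)
```

Fields such as %(asctime)s and %(lineno)d work the same way, but are left out here because their values change from run to run.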

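Write to a file

import logging
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                    datefmt='%a, %d %b %y %H:%M:%S',
                    filename='userinfo.log'
                    )
logging.debug('debug message')        # debug: debugging detail, the lowest level
logging.info('info message')          # info: normal information
logging.warning('warning message')    # warning: warnings
logging.error('error message')        # error: errors
logging.critical('critical message')  # critical: severe errors

Run the program and check the file contents.

In some cases the file shows up as garbled text.

This simple configuration has 2 limitations:

  The encoding cannot be set (basicConfig only gained an encoding argument in Python 3.9)

  With filename alone, it cannot output to the file and the screen at the same time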

The logger object approach

Because the simple configuration has these limitations, working with logger objects is more flexible.

import logging
logger = logging.getLogger()   # get a logger object (the root logger when no name is given)
# the English term is handler; in Chinese it is usually rendered as 句柄

# set the file name and encoding
fh = logging.FileHandler('test.log',encoding='utf-8') # create a file handler
sh = logging.StreamHandler()    # for output to the console

# the logger absorbs the abilities of each handler
logger.addHandler(fh)
logger.addHandler(sh)

logger.debug('debug message')
logger.info('info message')
logger.warning('warning message')

Output:

warning message

The file contains the same thing:

warning message

Now we can write to a file and print to the screen at the same time.

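Add output formatting:

import logging

logger = logging.getLogger()  # get a logger object (the root logger when no name is given)
# the English term is handler; in Chinese it is usually rendered as 句柄
# set the file name and encoding
fh = logging.FileHandler('test.log', encoding='utf-8')  # create a file handler
sh = logging.StreamHandler()  # for output to the console
fmt = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')  # the output format
fh.setFormatter(fmt)  # a format is attached to a file handler or a screen handler
sh.setFormatter(fmt)
# the logger absorbs the abilities of each handler
logger.addHandler(fh)  # gain the ability to write to a file; only handlers are attached to the logger
logger.addHandler(sh)  # gain the ability to print to the screen
logger.setLevel(logging.DEBUG)  # set the log level to DEBUG (defined as DEBUG = 10 in the logging source)
logger.debug('debug message')
logger.info('info message')
logger.warning('warning message')

Output: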

2018-04-23 20:16:58,850 - root - DEBUG - debug message
2018-04-23 20:16:58,850 - root - INFO - info message
2018-04-23 20:16:58,850 - root - WARNING - warning message

The file contents are the same.
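A possible refinement of the handler setup is to give each handler its own level, so the file records everything while the console shows only warnings and above. The sketch below also guards against adding duplicate handlers when the setup runs twice. The helper name `build_logger`, the logger name `demo.app`, and the temporary log path are assumptions for illustration.

```python
import logging
import os
import tempfile

def build_logger(name, log_path):
    """Return a logger that writes everything to a file and warnings to the console."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)          # the logger itself lets everything through
    if not logger.handlers:                 # avoid duplicate handlers on repeated calls
        fmt = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        fh = logging.FileHandler(log_path, encoding='utf-8')
        fh.setLevel(logging.DEBUG)          # file: record everything
        sh = logging.StreamHandler()
        sh.setLevel(logging.WARNING)        # console: warnings and above only
        for handler in (fh, sh):
            handler.setFormatter(fmt)
            logger.addHandler(handler)
    return logger

log_path = os.path.join(tempfile.mkdtemp(), 'test.log')
logger = build_logger('demo.app', log_path)
logger = build_logger('demo.app', log_path)  # second call adds no extra handlers
logger.debug('debug message')
logger.warning('warning message')

for handler in logger.handlers:
    handler.flush()
with open(log_path, encoding='utf-8') as f:
    file_lines = f.read().splitlines()
print(len(logger.handlers), len(file_lines))
```

Without the `if not logger.handlers` check, calling the setup twice would attach four handlers and every message would appear twice.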