Python Journey: Commonly Used Modules (Part 2)

Date: 2019-11-09


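json & pickle modules

Two modules used for serialization:

json: converts between strings and Python data types
pickle: converts between Python objects (including Python-specific types) and a Python-only byte format

The json module provides four functions: dumps, dump, loads, load

The pickle module provides four functions: dumps, dump, loads, load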
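
To make the difference concrete, here is a minimal round-trip sketch. The sample data and variable names are made up for illustration. dumps/loads work in memory, while dump/load do the same against a file object. Note that pickle should only be used to load data you trust.

```python
import json
import pickle

data = {"name": "alex", "scores": [90, 85], "active": True}

# json: dumps() returns a str, loads() turns it back into Python data.
json_text = json.dumps(data)
restored_from_json = json.loads(json_text)

# pickle: dumps() returns bytes and can handle Python-only types
# such as set and tuple, which json cannot represent directly.
payload = {"tags": {"a", "b"}, "point": (1, 2)}
pickled_bytes = pickle.dumps(payload)
restored_from_pickle = pickle.loads(pickled_bytes)

print(type(json_text), restored_from_json == data)
print(type(pickled_bytes), restored_from_pickle == payload)
```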

shelve module

shelve is a simple key-value module that persists in-memory data to a file. It can persist any Python data type that pickle supports.

import shelve

d = shelve.open('shelve_test')  # open a file

class Test(object):
    def __init__(self,n):
        self.n = n


t = Test(123)  
t2 = Test(123334)

name = ["alex","rain","test"] 
d["test"] = name  # persist a list
d["t1"] = t       # persist a class instance
d["t2"] = t2

d.close()
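
Reading the data back works the same way: reopen the shelf and look values up by key. The sketch below is only an illustration. It writes to a temporary directory and stores built-in types only (the "settings" key is hypothetical). For instances of your own classes, the class definition must be importable when the shelf is read.

```python
import os
import shelve
import tempfile

# A temporary directory keeps this sketch from touching real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "shelve_test")

    # Write a couple of values; the with-block closes the shelf for us.
    with shelve.open(path) as d:
        d["test"] = ["alex", "rain", "test"]
        d["settings"] = {"retries": 3}

    # Reopen the shelf and read the values back by key.
    with shelve.open(path) as d:
        names = d["test"]
        settings = d["settings"]
        keys = sorted(d.keys())

print(names, settings, keys)
```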

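XML processing module

XML is a format for exchanging data between different languages or programs. It serves much the same purpose as json, but json is simpler to use. Back in the dark ages before json existed, however, XML was the only choice, and to this day many systems at traditional companies, for example in the financial industry, still expose mainly XML interfaces.

The XML format looks like this; it uses <> nodes to mark up the data structure: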

 
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

XML is supported in every major language. In Python you can work with XML using the following module:

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# traverse the XML document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag,i.text)

# traverse only the year nodes
for node in root.iter('year'):
    print(node.tag,node.text)
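
If you want to try the traversal without creating xmltest.xml first, ET.fromstring parses XML straight from a string. The following is a small sketch using a shortened copy of the sample document. The variable names are my own.

```python
import xml.etree.ElementTree as ET

xml_text = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
    </country>
</data>"""

# fromstring() returns the root element directly (no tree.getroot() needed).
root = ET.fromstring(xml_text)

# Iterating over an element yields its direct children.
country_names = [country.get("name") for country in root]

# iter() walks the whole subtree and yields every matching tag.
years = [node.text for node in root.iter("year")]

# find() returns the first matching child; text is always a string.
ranks = {c.get("name"): int(c.find("rank").text) for c in root.findall("country")}

print(root.tag, country_names, years, ranks)
```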

Modifying and deleting XML document content

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()

# modify
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated","yes")

tree.write("xmltest.xml")


# delete nodes
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

tree.write('output.xml')
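
The same modify and delete steps can be checked in memory, without writing any files. This is a sketch on a cut-down document. ET.tostring is used here only to inspect the result.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'><rank>2</rank><year>2008</year></country>"
    "<country name='Panama'><rank>69</rank><year>2011</year></country>"
    "</data>"
)

# Modify: bump every year by one and mark the node as updated.
for node in root.iter("year"):
    node.text = str(int(node.text) + 1)
    node.set("updated", "yes")

# Delete: findall() returns a list, so removing children of root
# while looping over that list is safe.
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

remaining = [c.get("name") for c in root.findall("country")]
new_year = root.find("country/year").text
result_xml = ET.tostring(root, encoding="unicode")

print(remaining, new_year)
print(result_xml)
```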

Creating your own XML document

import xml.etree.ElementTree as ET


new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
age = ET.SubElement(name,"age",attrib={"checked":"no"})
sex = ET.SubElement(name,"sex")
sex.text = '33'
name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
age = ET.SubElement(name2,"age")
age.text = '19'

et = ET.ElementTree(new_xml)  # build the document object
et.write("test.xml", encoding="utf-8",xml_declaration=True)

ET.dump(new_xml)  # print the generated XML

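configparser module

Used to generate and modify common configuration files. In Python 3.x the module was renamed from ConfigParser to configparser.

Here is a configuration file format used by many programs: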

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no

How would you generate a file like this with Python?

import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
   config.write(configfile)

Once it is written, you can read it back.

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
serveraliveinterval
compression
compressionlevel
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'
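
The last lookup works because every section falls back to [DEFAULT] for keys it does not define itself. Here is a small sketch of that behavior, using read_string so no file is needed. The typed getters getint and getboolean are shown as a convenience.

```python
import configparser

ini_text = """
[DEFAULT]
ServerAliveInterval = 45
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

# DEFAULT is not listed as a regular section.
sections = config.sections()

# bitbucket.org has no ForwardX11 of its own, so the DEFAULT value is used.
inherited = config["bitbucket.org"]["ForwardX11"]

# topsecret.server.com overrides the default.
overridden = config["topsecret.server.com"]["ForwardX11"]

# Values are stored as strings; typed getters convert them.
port = config["topsecret.server.com"].getint("Port")
x11_enabled = config["bitbucket.org"].getboolean("ForwardX11")

print(sections, inherited, overridden, port, x11_enabled)
```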

configparser syntax for adding, deleting, modifying, and reading

[section1]
k1 = v1
k2:v2
 
[section2]
k1 = v1

import configparser

config = configparser.ConfigParser()
config.read('i.cfg')

# ########## Read ##########
#secs = config.sections()
#print(secs)
#options = config.options('group2')
#print(options)

#item_list = config.items('group2')
#print(item_list)

#val = config.get('group1','key')
#val = config.getint('group1','key')

# ########## Modify ##########
#sec = config.remove_section('group1')
#config.write(open('i.cfg', "w"))

#sec = config.has_section('wupeiqi')
#sec = config.add_section('wupeiqi')
#config.write(open('i.cfg', "w"))


#config.set('group2','k1','11111')  # values must be strings in Python 3
#config.write(open('i.cfg', "w"))

#config.remove_option('group2','age')
#config.write(open('i.cfg', "w"))
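
For reference, here is a runnable Python 3 sketch of the same read, add, modify, and delete operations, applied to the sample [section1]/[section2] file shown earlier. It works entirely in memory. The "section3" name is hypothetical, and io.StringIO stands in for a real file.

```python
import configparser
import io

config = configparser.ConfigParser()
config.read_string("[section1]\nk1 = v1\nk2:v2\n\n[section2]\nk1 = v1\n")

# Read: list option names and fetch one value (both '=' and ':' are delimiters).
options = config.options("section1")
value = config.get("section1", "k2")

# Add / modify: create a section if it is missing, then set a value.
if not config.has_section("section3"):
    config.add_section("section3")
config.set("section3", "k1", "11111")  # values must be strings

# Delete: remove one option and one whole section.
config.remove_option("section1", "k2")
config.remove_section("section2")

# Write the result to an in-memory buffer instead of a file.
buffer = io.StringIO()
config.write(buffer)
written = buffer.getvalue()
print(written)
```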

hashlib module

Used for hashing (message digest) operations. In 3.x it replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.

import hashlib

m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")

print(m.digest())  # hash in binary (bytes) format
print(len(m.hexdigest()))  # length of the hash in hexadecimal format
'''
def digest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of binary data. """
    pass

def hexdigest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of hexadecimal digits. """
    pass

'''
import hashlib

# In Python 3, update() only accepts bytes, so the input is written as b'admin'.

# ######## md5 ########

hash = hashlib.md5()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha1 ########

hash = hashlib.sha1()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha256 ########

hash = hashlib.sha256()
hash.update(b'admin')
print(hash.hexdigest())


# ######## sha384 ########

hash = hashlib.sha384()
hash.update(b'admin')
print(hash.hexdigest())

# ######## sha512 ########

hash = hashlib.sha512()
hash.update(b'admin')
print(hash.hexdigest())
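
When the input is a str rather than bytes, it has to be encoded first. The sketch below assumes UTF-8 and also shows that calling update() several times is equivalent to hashing the concatenated data in one go.

```python
import hashlib

text = "admin"
data = text.encode("utf-8")  # hash functions work on bytes, not str

# Passing the data to the constructor is shorthand for calling update().
md5_hex = hashlib.md5(data).hexdigest()
sha256_hex = hashlib.sha256(data).hexdigest()

# Feeding the data in pieces gives the same digest as hashing it at once.
chunked = hashlib.sha256()
chunked.update(b"ad")
chunked.update(b"min")
same = chunked.hexdigest() == sha256_hex

print(md5_hex)
print(sha256_hex, same)
```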

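Still not enough? Python also has an hmac module, which internally combines a key we supply with the content and then hashes the result.

A hash-based message authentication code (HMAC) is an authentication mechanism built on the message authentication code (MAC). With HMAC, the two communicating parties verify the authenticity of a message using a shared authentication key K that is mixed into the message digest.

It is typically used to authenticate messages in network communication. The two parties must first agree on a key, like a secret password. The sender computes a digest of the message with the key and sends it along. The receiver computes the digest again from the key plus the plaintext message and compares it with the sender's value. If the two match, the authenticity of the message and the legitimacy of the sender are verified.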

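import hmac
# Non-ASCII text cannot go in a bytes literal, so encode the strings instead.
# digestmod is required since Python 3.8; md5 was the implicit default before that.
h = hmac.new('天王蓋地虎'.encode('utf-8'), '寶塔鎮河妖'.encode('utf-8'), digestmod='md5')
print(h.hexdigest())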
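
The send-and-verify flow described above can be sketched as follows. This is an illustration only. The verify helper is a hypothetical name, SHA-256 is my choice of digest rather than anything the original specifies, and hmac.compare_digest is used for the comparison because it is designed to resist timing attacks.

```python
import hashlib
import hmac

key = "天王蓋地虎".encode("utf-8")      # shared secret agreed on in advance
message = "寶塔鎮河妖".encode("utf-8")  # the message being sent

# Sender: compute a tag from key + message and send it with the message.
sender_tag = hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()

def verify(key, message, tag):
    """Recompute the tag on the receiver side and compare it safely."""
    expected = hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

ok = verify(key, message, sender_tag)               # untouched message
tampered = verify(key, message + b"!", sender_tag)  # modified message

print(ok, tampered)
```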

For more on md5, sha1, sha256, and so on, see https://www.tbs-certificates.co.uk/FAQ/en/sha256.html

re module

Common regular expression symbols

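'.'		Matches any single character except \n by default; with the DOTALL flag it matches any character, including newlines
'^'		Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a","\nabc\neee",flags=re.MULTILINE)
'$'		Matches the end of the string; with the MULTILINE flag it also matches at the end of each line, e.g. re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()
'*'		Matches the preceding character 0 or more times; re.findall("ab*","cabb3abcbbac") returns ['abb', 'ab', 'a']
'+'		Matches the preceding character 1 or more times; re.findall("ab+","ab+cd+abb+bba") returns ['ab', 'abb']
'?'		Matches the preceding character 0 or 1 times
'{m}'	Matches the preceding character exactly m times
'{n,m}'	Matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") returns ['abb', 'ab', 'abb']
'|'		Matches the pattern to the left or to the right of |; re.search("abc|ABC","ABCBabcCD").group() returns 'ABC'
'(...)'	Grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() returns 'abcabca456c'


'\A'	Matches only at the start of the string; re.search(r"\Aabc","alexabc") finds no match
'\Z'	Matches only at the end of the string, similar to $
'\d'	Matches a digit 0-9
'\D'	Matches a non-digit
'\w'	Matches [A-Za-z0-9_]
'\W'	Matches anything not in [A-Za-z0-9_]
'\s'	Matches whitespace such as space, \t, \n, \r; re.search(r"\s+","ab\tc1\n3").group() returns '\t'

'(?P<name>...)'	Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict() returns {'province': '3714', 'city': '81', 'birthday': '1993'}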
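
The examples in the table can be run as-is. Here is a short sketch that checks a few of them and shows how to pull a single named group out of a match. The variable names are my own.

```python
import re

# Quantifiers: *, + and {n,m} applied to the preceding character.
star = re.findall(r"ab*", "cabb3abcbbac")
plus = re.findall(r"ab+", "ab+cd+abb+bba")
counted = re.findall(r"ab{1,3}", "abb abc abbcbbb")

# \A anchors at the very start, so "abc" in the middle is not found.
anchored = re.search(r"\Aabc", "alexabc")

# Named groups: each (?P<name>...) part is available by name.
id_pattern = r"(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})"
match = re.search(id_pattern, "371481199306143242")
parts = match.groupdict()
city = match.group("city")

print(star, plus, counted, anchored)
print(parts, city)
```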

Most commonly used matching functions

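re.match	matches only from the start of the string
re.search	finds the first match anywhere in the string
re.findall	returns all matches as elements of a list
re.split	splits the string, using the matches as separators
re.sub		finds matches and replaces them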
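
A quick side-by-side of the five functions on one made-up input string may help show how they differ:

```python
import re

text = "alex22jack33rain44"

# match() only succeeds if the pattern matches at position 0.
at_start = re.match(r"[a-z]+", text).group()
no_match = re.match(r"\d+", text)  # the string does not start with a digit

# search() scans forward and returns the first match anywhere.
first_digits = re.search(r"\d+", text).group()

# findall() returns every non-overlapping match as a list of strings.
all_digits = re.findall(r"\d+", text)

# split() cuts the string wherever the pattern matches.
names = re.split(r"\d+", text)

# sub() replaces every match with the given replacement.
masked = re.sub(r"\d+", "#", text)

print(at_start, no_match, first_digits)
print(all_digits, names, masked)
```

Note the trailing empty string in the split result: the input ends with a match, so there is an empty piece after it.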

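The backslash problem
As in most programming languages, regular expressions use "\" as the escape character, which can lead to backslash trouble. If you need to match a literal "\" in text, the regular expression written as an ordinary string literal needs four backslashes, "\\\\": the first two and the last two each become one backslash through the language's string escaping, and the resulting two backslashes are then interpreted by the regular expression engine as one literal backslash. Python's raw strings solve this neatly: the regular expression in this example can be written as r"\\". Likewise, "\\d", which matches a digit, can be written as r"\d". With raw strings you no longer need to worry about missing a backslash, and the expressions are easier to read.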
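
The equivalence is easy to verify. In this sketch the sample path is made up, and both spellings of the pattern turn out to be the same two-character string.

```python
import re

path = "C:\\temp"  # the actual string is C:\temp, with a single backslash

normal = "\\\\"    # four backslashes in source, two in the actual string
raw = r"\\"        # raw string: two backslashes in source, two in the string
same_pattern = normal == raw

# Either spelling matches the one literal backslash in the path.
found_normal = re.findall(normal, path)
found_raw = re.findall(raw, path)

# The same applies to \d: "\\d" and r"\d" are the same pattern.
same_digit_pattern = "\\d" == r"\d"
digits = re.findall(r"\d", "a1b2")

print(same_pattern, found_normal == found_raw, same_digit_pattern, digits)
```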

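A few matching flags worth knowing in passing

re.I (re.IGNORECASE): ignore case (the full name is in parentheses; same below)
re.M (re.MULTILINE): multiline mode; changes the behavior of '^' and '$' (see the table above)
re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.'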
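
Here is a brief sketch of what each flag changes, using a made-up three-line string. Flags can be combined with |.

```python
import re

text = "Foo\nbar\nFOO"

# re.I: letters match regardless of case.
ignore_case = re.findall(r"foo", text, flags=re.I)

# re.M: ^ matches at the start of every line, not just the whole string.
line_starts = re.findall(r"^\w+", text, flags=re.M)
without_m = re.findall(r"^\w+", text)

# re.S: '.' also matches the newline character.
dot_default = re.search(r"Foo.bar", text)
dot_all = re.search(r"Foo.bar", text, flags=re.S).group()

# Combined flags: whole lines equal to "foo" in any case.
combined = re.findall(r"^foo$", text, flags=re.I | re.M)

print(ignore_case, line_starts, without_m)
print(dot_default, repr(dot_all), combined)
```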
