05 python學習--第五週--模塊（time &datetime模塊、random、os、sys、shutil、json & pickle、shelve、xml處理、yaml處理、configp

時間 2019-11-11

標籤 python 學習第五模塊 time datetime random sys shutil json pickle shelve xml 處理 yaml configp 欄目 Python 简体版

原文原文鏈接

本節大綱：html

模塊介紹
time &datetime模塊
random
os
sys
shutil
json & pickle
shelve
xml處理
yaml處理
configparser
hashlib
subprocess
logging模塊
re正則表達式

模塊，用一砣代碼實現了某個功能的代碼集合。 node

相似於函數式編程和麪向過程編程，函數式編程則完成一個功能，其餘代碼用來調用便可，提供了代碼的重用性和代碼間的耦合。而對於一個複雜的功能來，可能須要多個函數才能完成（函數又能夠在不一樣的.py文件中），n個 .py 文件組成的代碼集合就稱爲模塊。python

如：os 是系統相關的模塊；file是文件操做相關的模塊git

模塊分爲三種：程序員

自定義模塊
內置標準模塊（又稱標準庫）
開源模塊

自定義模塊和開源模塊的使用參考 http://www.cnblogs.com/wupeiqi/articles/4963027.html 正則表達式

自定義模塊

一、定義模塊算法

情景一：shell

情景二：macos

情景三：編程

二、導入模塊

Python之因此應用愈來愈普遍，在必定程度上也依賴於其爲程序員提供了大量的模塊以供使用，若是想要使用模塊，則須要導入。導入模塊有一下幾種方法：

1 import module
2 from module.xx.xx import xx
3 from module.xx.xx import xx as rename  
4 from module.xx.xx import *

導入模塊其實就是告訴Python解釋器去解釋那個py文件

導入一個py文件，解釋器解釋該py文件
導入一個包，解釋器解釋該包下的 __init__.py 文件

那麼問題來了，導入模塊時是根據那個路徑做爲基準來進行的呢？即：sys.path

1 import sys
2 print sys.path
3   
4 結果：
5 ['/Users/wupeiqi/PycharmProjects/calculator/p1/pp1', 
'/usr/local/lib/python2.7/site-packages/setuptools-15.2-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/MySQL_python-1.2.4b4-py2.7-macosx-10.10-x86_64.egg',
 '/usr/local/lib/python2.7/site-packages/xlutils-1.7.1-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/xlwt-1.0.0-py2.7.egg', 

'/usr/local/lib/python2.7/site-packages/xlrd-0.9.3-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/tornado-4.1-py2.7-macosx-10.10-x86_64.egg',
 '/usr/local/lib/python2.7/site-packages/backports.ssl_match_hostname-3.4.0.2-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/certifi-2015.4.28-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/pyOpenSSL-0.15.1-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/six-1.9.0-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/cryptography-0.9.1-py2.7-macosx-10.10-x86_64.egg', 
'/usr/local/lib/python2.7/site-packages/cffi-1.1.1-py2.7-macosx-10.10-x86_64.egg', 
'/usr/local/lib/python2.7/site-packages/ipaddress-1.0.7-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/enum34-1.0.4-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/pyasn1-0.1.7-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/idna-2.0-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/pycparser-2.13-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/Django-1.7.8-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/paramiko-1.10.1-py2.7.egg', 
'/usr/local/lib/python2.7/site-packages/gevent-1.0.2-py2.7-macosx-10.10-x86_64.egg', 
'/usr/local/lib/python2.7/site-packages/greenlet-0.4.7-py2.7-macosx-10.10-x86_64.egg',
 '/Users/wupeiqi/PycharmProjects/calculator', '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python27.zip',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-old',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/site-packages',
 '/Library/Python/2.7/site-packages']

若是sys.path路徑列表沒有你想要的路徑，能夠經過 sys.path.append('路徑') 添加。
經過os模塊能夠獲取各類目錄，例如：

1 import sys
2 import os
3 
4 pre_path = os.path.abspath('../')
5 sys.path.append(pre_path)

開源模塊

1、下載安裝

下載安裝有兩種方式：

yum 
pip
apt-get
...

1 下載源碼
2 解壓源碼
3 進入目錄
4 編譯源碼    python setup.py build
5 安裝源碼    python setup.py install

注：在使用源碼安裝時，須要使用到gcc編譯和python開發環境，因此，須要先執行：

1 yum install gcc
2 yum install python-devel
3 或
4 apt-get python-dev

安裝成功後，模塊會自動安裝到 sys.path 中的某個目錄中，如：

1 /usr/lib/python2.7/site-packages/

2、導入模塊

同自定義模塊中導入的方式

3、模塊 paramiko

paramiko是一個用於作遠程控制的模塊，使用該模塊能夠對遠程服務器進行命令或文件操做，值得一說的是，fabric和ansible內部的遠程管理就是使用的paramiko來現實。

一、下載安裝

pip3 install paramiko

或

 1 # pycrypto，因爲 paramiko 模塊內部依賴pycrypto，因此先下載安裝pycrypto
 2  
 3 # 下載安裝 pycrypto
 4 wget http://files.cnblogs.com/files/wupeiqi/pycrypto-2.6.1.tar.gz
 5 tar -xvf pycrypto-2.6.1.tar.gz
 6 cd pycrypto-2.6.1
 7 python setup.py build
 8 python setup.py install
 9  
10 # 進入python環境，導入Crypto檢查是否安裝成功
11  
12 # 下載安裝 paramiko
13 wget http://files.cnblogs.com/files/wupeiqi/paramiko-1.10.1.tar.gz
14 tar -xvf paramiko-1.10.1.tar.gz
15 cd paramiko-1.10.1
16 python setup.py build
17 python setup.py install
18  
19 # 進入python環境，導入paramiko檢查是否安裝成功

二、使用模塊

#!/usr/bin/env python
#coding:utf-8

import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('192.168.1.108', 22, 'alex', '123')
stdin, stdout, stderr = ssh.exec_command('df')
print stdout.read()
ssh.close();

執行命令 - 經過用戶名和密碼鏈接服務器

import paramiko

private_key_path = '/home/auto/.ssh/id_rsa'
key = paramiko.RSAKey.from_private_key_file(private_key_path)

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('主機名 ', 端口, '用戶名', key)

stdin, stdout, stderr = ssh.exec_command('df')
print stdout.read()
ssh.close()

執行命令 - 過密鑰連接服務器

 1 import os,sys
 2 import paramiko
 3 
 4 t = paramiko.Transport(('182.92.219.86',22))
 5 t.connect(username='wupeiqi',password='123')
 6 sftp = paramiko.SFTPClient.from_transport(t)
 7 sftp.put('/tmp/test.py','/tmp/test.py') 
 8 t.close()
 9 
10 
11 import os,sys
12 import paramiko
13 
14 t = paramiko.Transport(('182.92.219.86',22))
15 t.connect(username='wupeiqi',password='123')
16 sftp = paramiko.SFTPClient.from_transport(t)
17 sftp.get('/tmp/test.py','/tmp/test2.py')
18 t.close()
19 
20 上傳或者下載文件 - 經過用戶名和密碼

 1 import paramiko
 2 
 3 pravie_key_path = '/home/auto/.ssh/id_rsa'
 4 key = paramiko.RSAKey.from_private_key_file(pravie_key_path)
 5 
 6 t = paramiko.Transport(('182.92.219.86',22))
 7 t.connect(username='wupeiqi',pkey=key)
 8 
 9 sftp = paramiko.SFTPClient.from_transport(t)
10 sftp.put('/tmp/test3.py','/tmp/test3.py') 
11 
12 t.close()
13 
14 import paramiko
15 
16 pravie_key_path = '/home/auto/.ssh/id_rsa'
17 key = paramiko.RSAKey.from_private_key_file(pravie_key_path)
18 
19 t = paramiko.Transport(('182.92.219.86',22))
20 t.connect(username='wupeiqi',pkey=key)
21 
22 sftp = paramiko.SFTPClient.from_transport(t)
23 sftp.get('/tmp/test3.py','/tmp/test4.py') 
24 
25 t.close()
26 
27 上傳或下載文件 - 經過密鑰

time & datetime模塊

時間相關的操做，時間有三種表示方式：

時間戳 1970年1月1日以後的秒，即：time.time()
格式化的字符串 2014-11-11 11:11，即：time.strftime('%Y-%m-%d')
結構化時間元組包含了：年、日、星期等... time.struct_time 即：time.localtime()

 1 print time.time()
 2 print time.mktime(time.localtime())
 3   
 4 print time.gmtime()    #可加時間戳參數
 5 print time.localtime() #可加時間戳參數
 6 print time.strptime('2014-11-11', '%Y-%m-%d')
 7   
 8 print time.strftime('%Y-%m-%d') #默認當前時間
 9 print time.strftime('%Y-%m-%d',time.localtime()) #默認當前時間
10 print time.asctime()
11 print time.asctime(time.localtime())
12 print time.ctime(time.time())
13   
14 import datetime
15 '''
16 datetime.date：表示日期的類。經常使用的屬性有year, month, day
17 datetime.time：表示時間的類。經常使用的屬性有hour, minute, second, microsecond
18 datetime.datetime：表示日期時間
19 datetime.timedelta：表示時間間隔，即兩個時間點之間的長度
20 timedelta([days[, seconds[, microseconds[, milliseconds[, minutes[, hours[, weeks]]]]]]])
21 strftime("%Y-%m-%d")
22 '''
23 import datetime
24 print datetime.datetime.now()
25 print datetime.datetime.now() - datetime.timedelta(days=5)

 1 #_*_coding:utf-8_*_
 2 __author__ = 'Alex Li'
 3 
 4 import time
 5 
 6 
 7 # print(time.clock()) #返回處理器時間,3.3開始已廢棄 , 改爲了time.process_time()測量處理器運算時間,不包括sleep時間,不穩定,mac上測不出來
 8 # print(time.altzone)  #返回與utc時間的時間差,以秒計算\
 9 # print(time.asctime()) #返回時間格式"Fri Aug 19 11:14:16 2016",
10 # print(time.localtime()) #返回本地時間 的struct time對象格式
11 # print(time.gmtime(time.time()-800000)) #返回utc時間的struc時間對象格式
12 
13 # print(time.asctime(time.localtime())) #返回時間格式"Fri Aug 19 11:14:16 2016",
14 #print(time.ctime()) #返回Fri Aug 19 12:38:29 2016 格式, 同上
15 
16 
17 
18 # 日期字符串 轉成  時間戳
19 # string_2_struct = time.strptime("2016/05/22","%Y/%m/%d") #將 日期字符串 轉成 struct時間對象格式
20 # print(string_2_struct)
21 # #
22 # struct_2_stamp = time.mktime(string_2_struct) #將struct時間對象轉成時間戳
23 # print(struct_2_stamp)
24 
25 
26 
27 #將時間戳轉爲字符串格式
28 # print(time.gmtime(time.time()-86640)) #將utc時間戳轉換成struct_time格式
29 # print(time.strftime("%Y-%m-%d %H:%M:%S",time.gmtime()) ) #將utc struct_time格式轉成指定的字符串格式
30 
31 
32 
33 
34 
35 #時間加減
36 import datetime
37 
38 # print(datetime.datetime.now()) #返回 2016-08-19 12:47:03.941925
39 #print(datetime.date.fromtimestamp(time.time()) )  # 時間戳直接轉成日期格式 2016-08-19
40 # print(datetime.datetime.now() )
41 # print(datetime.datetime.now() + datetime.timedelta(3)) #當前時間+3天
42 # print(datetime.datetime.now() + datetime.timedelta(-3)) #當前時間-3天
43 # print(datetime.datetime.now() + datetime.timedelta(hours=3)) #當前時間+3小時
44 # print(datetime.datetime.now() + datetime.timedelta(minutes=30)) #當前時間+30分
45 
46 
47 #
48 # c_time  = datetime.datetime.now()
49 # print(c_time.replace(minute=3,hour=2)) #時間替換

Directive	Meaning	Notes
`%a`	Locale’s abbreviated weekday name.
`%A`	Locale’s full weekday name.
`%b`	Locale’s abbreviated month name.
`%B`	Locale’s full month name.
`%c`	Locale’s appropriate date and time representation.
`%d`	Day of the month as a decimal number [01,31].
`%H`	Hour (24-hour clock) as a decimal number [00,23].
`%I`	Hour (12-hour clock) as a decimal number [01,12].
`%j`	Day of the year as a decimal number [001,366].
`%m`	Month as a decimal number [01,12].
`%M`	Minute as a decimal number [00,59].
`%p`	Locale’s equivalent of either AM or PM.	(1)
`%S`	Second as a decimal number [00,61].	(2)
`%U`	Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0.	(3)
`%w`	Weekday as a decimal number [0(Sunday),6].
`%W`	Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0.	(3)
`%x`	Locale’s appropriate date representation.
`%X`	Locale’s appropriate time representation.
`%y`	Year without century as a decimal number [00,99].
`%Y`	Year with century as a decimal number.
`%z`	Time zone offset indicating a positive or negative time difference from UTC/GMT of the form +HHMM or -HHMM, where H represents decimal hour digits and M represents decimal minute digits [-23:59, +23:59].
`%Z`	Time zone name (no characters if no time zone exists).
`%%`	A literal `'%'` character.

random模塊

隨機數

1 mport random
2 print random.random()
3 print random.randint(1,2)
4 print random.randrange(1,10)

生成隨機驗證碼

 1 import random
 2 checkcode = ''
 3 for i in range(4):
 4     current = random.randrange(0,4)
 5     if current != i:
 6         temp = chr(random.randint(65,90))
 7     else:
 8         temp = random.randint(0,9)
 9     checkcode += str(temp)
10 print checkcode

OS模塊

提供對操做系統進行調用的接口

 1 os.getcwd() 獲取當前工做目錄，即當前python腳本工做的目錄路徑
 2 os.chdir("dirname")  改變當前腳本工做目錄；至關於shell下cd
 3 os.curdir  返回當前目錄: ('.')
 4 os.pardir  獲取當前目錄的父目錄字符串名：('..')
 5 os.makedirs('dirname1/dirname2')    可生成多層遞歸目錄
 6 os.removedirs('dirname1')    若目錄爲空，則刪除，並遞歸到上一級目錄，如若也爲空，則刪除，依此類推
 7 os.mkdir('dirname')    生成單級目錄；至關於shell中mkdir dirname
 8 os.rmdir('dirname')    刪除單級空目錄，若目錄不爲空則沒法刪除，報錯；至關於shell中rmdir dirname
 9 os.listdir('dirname')    列出指定目錄下的全部文件和子目錄，包括隱藏文件，並以列表方式打印
10 os.remove()  刪除一個文件
11 os.rename("oldname","newname")  重命名文件/目錄
12 os.stat('path/filename')  獲取文件/目錄信息
13 os.sep    輸出操做系統特定的路徑分隔符，win下爲"\\",Linux下爲"/"
14 os.linesep    輸出當前平臺使用的行終止符，win下爲"\t\n",Linux下爲"\n"
15 os.pathsep    輸出用於分割文件路徑的字符串
16 os.name    輸出字符串指示當前使用平臺。win->'nt'; Linux->'posix'
17 os.system("bash command")  運行shell命令，直接顯示
18 os.environ  獲取系統環境變量
19 os.path.abspath(path)  返回path規範化的絕對路徑
20 os.path.split(path)  將path分割成目錄和文件名二元組返回
21 os.path.dirname(path)  返回path的目錄。其實就是os.path.split(path)的第一個元素
22 os.path.basename(path)  返回path最後的文件名。如何path以／或\結尾，那麼就會返回空值。即os.path.split(path)的第二個元素
23 os.path.exists(path)  若是path存在，返回True；若是path不存在，返回False
24 os.path.isabs(path)  若是path是絕對路徑，返回True
25 os.path.isfile(path)  若是path是一個存在的文件，返回True。不然返回False
26 os.path.isdir(path)  若是path是一個存在的目錄，則返回True。不然返回False
27 os.path.join(path1[, path2[, ...]])  將多個路徑組合後返回，第一個絕對路徑以前的參數將被忽略
28 os.path.getatime(path)  返回path所指向的文件或者目錄的最後存取時間
29 os.path.getmtime(path)  返回path所指向的文件或者目錄的最後修改時間

更多猛擊這裏

執行系統命令

能夠執行shell命令的相關模塊和函數有：

os.system
os.spawn*
os.popen* --廢棄
popen2.* --廢棄
commands.* --廢棄，3.x中被移除

1 import commands
2 
3 result = commands.getoutput('cmd')
4 result = commands.getstatus('cmd')
5 result = commands.getstatusoutput('cmd')

以上執行shell命令的相關的模塊和函數的功能均在 subprocess 模塊中實現，並提供了更豐富的功能。

call

執行命令，返回狀態碼

1 ret = subprocess.call(["ls", "-l"], shell=False)
2 ret = subprocess.call("ls -l", shell=True)

shell = True ，容許 shell 命令是字符串形式

check_call

執行命令，若是執行狀態碼是 0 ，則返回0，不然拋異常

1 subprocess.check_call(["ls", "-l"])
2 subprocess.check_call("exit 1", shell=True)

check_output

執行命令，若是狀態碼是 0 ，則返回執行結果，不然拋異常

1 subprocess.check_output(["echo", "Hello World!"])
2 subprocess.check_output("exit 1", shell=True)

subprocess.Popen(...)

用於執行復雜的系統命令

參數：

args：shell命令，能夠是字符串或者序列類型（如：list，元組）
bufsize：指定緩衝。0 無緩衝,1 行緩衝,其餘緩衝區大小,負值系統緩衝
stdin, stdout, stderr：分別表示程序的標準輸入、輸出、錯誤句柄
preexec_fn：只在Unix平臺下有效，用於指定一個可執行對象（callable object），它將在子進程運行以前被調用
close_sfs：在windows平臺下，若是close_fds被設置爲True，則新建立的子進程將不會繼承父進程的輸入、輸出、錯誤管道。
因此不能將close_fds設置爲True同時重定向子進程的標準輸入、輸出與錯誤(stdin, stdout, stderr)。
shell：同上
cwd：用於設置子進程的當前目錄
env：用於指定子進程的環境變量。若是env = None，子進程的環境變量將從父進程中繼承。
universal_newlines：不一樣系統的換行符不一樣，True -> 贊成使用 \n
startupinfo與createionflags只在windows下有效
將被傳遞給底層的CreateProcess()函數，用於設置子進程的一些屬性，如：主窗口的外觀，進程的優先級等等

1 import subprocess
2 ret1 = subprocess.Popen(["mkdir","t1"])
3 ret2 = subprocess.Popen("mkdir t2", shell=True)

終端輸入的命令分爲兩種：

輸入便可獲得輸出，如：ifconfig
輸入進行某環境，依賴再輸入，如：python

1 import subprocess
2 obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)

 1 import subprocess
 2 
 3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
 4 obj.stdin.write('print 1 \n ')
 5 obj.stdin.write('print 2 \n ')
 6 obj.stdin.write('print 3 \n ')
 7 obj.stdin.write('print 4 \n ')
 8 obj.stdin.close()
 9 
10 cmd_out = obj.stdout.read()
11 obj.stdout.close()
12 cmd_error = obj.stderr.read()
13 obj.stderr.close()
14 
15 print cmd_out
16 print cmd_error

 1 import subprocess
 2 
 3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
 4 obj.stdin.write('print 1 \n ')
 5 obj.stdin.write('print 2 \n ')
 6 obj.stdin.write('print 3 \n ')
 7 obj.stdin.write('print 4 \n ')
 8 
 9 out_error_list = obj.communicate()
10 print out_error_list

1 import subprocess
2 
3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
4 out_error_list = obj.communicate('print "hello"')
5 print out_error_list

sys模塊

1 sys.argv           命令行參數List，第一個元素是程序自己路徑
2 sys.exit(n)        退出程序，正常退出時exit(0)
3 sys.version        獲取Python解釋程序的版本信息
4 sys.maxint         最大的Int值
5 sys.path           返回模塊的搜索路徑，初始化時使用PYTHONPATH環境變量的值
6 sys.platform       返回操做系統平臺名稱
7 sys.stdout.write('please:')
8 val = sys.stdin.readline()[:-1]

shutil 模塊

直接參考 http://www.cnblogs.com/wupeiqi/articles/4963027.html

高級的文件、文件夾、壓縮包處理模塊

shutil.copyfileobj(fsrc, fdst[, length])
將文件內容拷貝到另外一個文件中，能夠部份內容

1 def copyfileobj(fsrc, fdst, length=16*1024):
2     """copy data from file-like object fsrc to file-like object fdst"""
3     while 1:
4         buf = fsrc.read(length)
5         if not buf:
6             break
7         fdst.write(buf)

shutil.copyfile(src, dst)
拷貝文件

 1 def copyfile(src, dst):
 2     """Copy data from src to dst"""
 3     if _samefile(src, dst):
 4         raise Error("`%s` and `%s` are the same file" % (src, dst))
 5 
 6     for fn in [src, dst]:
 7         try:
 8             st = os.stat(fn)
 9         except OSError:
10             # File most likely does not exist
11             pass
12         else:
13             # XXX What about other special files? (sockets, devices...)
14             if stat.S_ISFIFO(st.st_mode):
15                 raise SpecialFileError("`%s` is a named pipe" % fn)
16 
17     with open(src, 'rb') as fsrc:
18         with open(dst, 'wb') as fdst:
19             copyfileobj(fsrc, fdst)

shutil.copymode(src, dst)
僅拷貝權限。內容、組、用戶均不變

1 def copymode(src, dst):
2     """Copy mode bits from src to dst"""
3     if hasattr(os, 'chmod'):
4         st = os.stat(src)
5         mode = stat.S_IMODE(st.st_mode)
6         os.chmod(dst, mode)

shutil.copystat(src, dst)
拷貝狀態的信息，包括：mode bits, atime, mtime, flags

 1 def copystat(src, dst):
 2     """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
 3     st = os.stat(src)
 4     mode = stat.S_IMODE(st.st_mode)
 5     if hasattr(os, 'utime'):
 6         os.utime(dst, (st.st_atime, st.st_mtime))
 7     if hasattr(os, 'chmod'):
 8         os.chmod(dst, mode)
 9     if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
10         try:
11             os.chflags(dst, st.st_flags)
12         except OSError, why:
13             for err in 'EOPNOTSUPP', 'ENOTSUP':
14                 if hasattr(errno, err) and why.errno == getattr(errno, err):
15                     break
16             else:
17                 raise

shutil.copy(src, dst)
拷貝文件和權限

 1 def copy(src, dst):
 2     """Copy data and mode bits ("cp src dst").
 3 
 4     The destination may be a directory.
 5 
 6     """
 7     if os.path.isdir(dst):
 8         dst = os.path.join(dst, os.path.basename(src))
 9     copyfile(src, dst)
10     copymode(src, dst)

shutil.copy2(src, dst)
拷貝文件和狀態信息

 1 def copy2(src, dst):
 2     """Copy data and all stat info ("cp -p src dst").
 3 
 4     The destination may be a directory.
 5 
 6     """
 7     if os.path.isdir(dst):
 8         dst = os.path.join(dst, os.path.basename(src))
 9     copyfile(src, dst)
10     copystat(src, dst)

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
遞歸的去拷貝文件

例如：copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

 1 def ignore_patterns(*patterns):
 2     """Function that can be used as copytree() ignore parameter.
 3 
 4     Patterns is a sequence of glob-style patterns
 5     that are used to exclude files"""
 6     def _ignore_patterns(path, names):
 7         ignored_names = []
 8         for pattern in patterns:
 9             ignored_names.extend(fnmatch.filter(names, pattern))
10         return set(ignored_names)
11     return _ignore_patterns
12 
13 def copytree(src, dst, symlinks=False, ignore=None):
14     """Recursively copy a directory tree using copy2().
15 
16     The destination directory must not already exist.
17     If exception(s) occur, an Error is raised with a list of reasons.
18 
19     If the optional symlinks flag is true, symbolic links in the
20     source tree result in symbolic links in the destination tree; if
21     it is false, the contents of the files pointed to by symbolic
22     links are copied.
23 
24     The optional ignore argument is a callable. If given, it
25     is called with the `src` parameter, which is the directory
26     being visited by copytree(), and `names` which is the list of
27     `src` contents, as returned by os.listdir():
28 
29         callable(src, names) -> ignored_names
30 
31     Since copytree() is called recursively, the callable will be
32     called once for each directory that is copied. It returns a
33     list of names relative to the `src` directory that should
34     not be copied.
35 
36     XXX Consider this example code rather than the ultimate tool.
37 
38     """
39     names = os.listdir(src)
40     if ignore is not None:
41         ignored_names = ignore(src, names)
42     else:
43         ignored_names = set()
44 
45     os.makedirs(dst)
46     errors = []
47     for name in names:
48         if name in ignored_names:
49             continue
50         srcname = os.path.join(src, name)
51         dstname = os.path.join(dst, name)
52         try:
53             if symlinks and os.path.islink(srcname):
54                 linkto = os.readlink(srcname)
55                 os.symlink(linkto, dstname)
56             elif os.path.isdir(srcname):
57                 copytree(srcname, dstname, symlinks, ignore)
58             else:
59                 # Will raise a SpecialFileError for unsupported file types
60                 copy2(srcname, dstname)
61         # catch the Error from the recursive copytree so that we can
62         # continue with other files
63         except Error, err:
64             errors.extend(err.args[0])
65         except EnvironmentError, why:
66             errors.append((srcname, dstname, str(why)))
67     try:
68         copystat(src, dst)
69     except OSError, why:
70         if WindowsError is not None and isinstance(why, WindowsError):
71             # Copying file access times may fail on Windows
72             pass
73         else:
74             errors.append((src, dst, str(why)))
75     if errors:
76         raise Error, errors

shutil.rmtree(path[, ignore_errors[, onerror]])
遞歸的去刪除文件

 1 def rmtree(path, ignore_errors=False, onerror=None):
 2     """Recursively delete a directory tree.
 3 
 4     If ignore_errors is set, errors are ignored; otherwise, if onerror
 5     is set, it is called to handle the error with arguments (func,
 6     path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
 7     path is the argument to that function that caused it to fail; and
 8     exc_info is a tuple returned by sys.exc_info().  If ignore_errors
 9     is false and onerror is None, an exception is raised.
10 
11     """
12     if ignore_errors:
13         def onerror(*args):
14             pass
15     elif onerror is None:
16         def onerror(*args):
17             raise
18     try:
19         if os.path.islink(path):
20             # symlinks to directories are forbidden, see bug #1669
21             raise OSError("Cannot call rmtree on a symbolic link")
22     except OSError:
23         onerror(os.path.islink, path, sys.exc_info())
24         # can't continue even if onerror hook returns
25         return
26     names = []
27     try:
28         names = os.listdir(path)
29     except os.error, err:
30         onerror(os.listdir, path, sys.exc_info())
31     for name in names:
32         fullname = os.path.join(path, name)
33         try:
34             mode = os.lstat(fullname).st_mode
35         except os.error:
36             mode = 0
37         if stat.S_ISDIR(mode):
38             rmtree(fullname, ignore_errors, onerror)
39         else:
40             try:
41                 os.remove(fullname)
42             except os.error, err:
43                 onerror(os.remove, fullname, sys.exc_info())
44     try:
45         os.rmdir(path)
46     except os.error:
47         onerror(os.rmdir, path, sys.exc_info())

shutil.move(src, dst)
遞歸的去移動文件

 1 def move(src, dst):
 2     """Recursively move a file or directory to another location. This is
 3     similar to the Unix "mv" command.
 4 
 5     If the destination is a directory or a symlink to a directory, the source
 6     is moved inside the directory. The destination path must not already
 7     exist.
 8 
 9     If the destination already exists but is not a directory, it may be
10     overwritten depending on os.rename() semantics.
11 
12     If the destination is on our current filesystem, then rename() is used.
13     Otherwise, src is copied to the destination and then removed.
14     A lot more could be done here...  A look at a mv.c shows a lot of
15     the issues this implementation glosses over.
16 
17     """
18     real_dst = dst
19     if os.path.isdir(dst):
20         if _samefile(src, dst):
21             # We might be on a case insensitive filesystem,
22             # perform the rename anyway.
23             os.rename(src, dst)
24             return
25 
26         real_dst = os.path.join(dst, _basename(src))
27         if os.path.exists(real_dst):
28             raise Error, "Destination path '%s' already exists" % real_dst
29     try:
30         os.rename(src, real_dst)
31     except OSError:
32         if os.path.isdir(src):
33             if _destinsrc(src, dst):
34                 raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
35             copytree(src, real_dst, symlinks=True)
36             rmtree(src)
37         else:
38             copy2(src, real_dst)
39             os.unlink(src)

shutil.make_archive(base_name, format,...)

建立壓縮包並返回文件路徑，例如：zip、tar

base_name：壓縮包的文件名，也能夠是壓縮包的路徑。只是文件名時，則保存至當前目錄，不然保存至指定路徑，
如：www =>保存至當前路徑
如：/Users/wupeiqi/www =>保存至/Users/wupeiqi/
format：壓縮包種類，「zip」, 「tar」, 「bztar」，「gztar」
root_dir：要壓縮的文件夾路徑（默認當前目錄）
owner：用戶，默認當前用戶
group：組，默認當前組
logger：用於記錄日誌，一般是logging.Logger對象

1 #將 /Users/wupeiqi/Downloads/test 下的文件打包放置當前程序目錄
2  
3 import shutil
4 ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
5  
6  
7 #將 /Users/wupeiqi/Downloads/test 下的文件打包放置 /Users/wupeiqi/目錄
8 import shutil
9 ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')

 1 def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,
 2                  dry_run=0, owner=None, group=None, logger=None):
 3     """Create an archive file (eg. zip or tar).
 4 
 5     'base_name' is the name of the file to create, minus any format-specific
 6     extension; 'format' is the archive format: one of "zip", "tar", "bztar"
 7     or "gztar".
 8 
 9     'root_dir' is a directory that will be the root directory of the
10     archive; ie. we typically chdir into 'root_dir' before creating the
11     archive.  'base_dir' is the directory where we start archiving from;
12     ie. 'base_dir' will be the common prefix of all files and
13     directories in the archive.  'root_dir' and 'base_dir' both default
14     to the current directory.  Returns the name of the archive file.
15 
16     'owner' and 'group' are used when creating a tar archive. By default,
17     uses the current owner and group.
18     """
19     save_cwd = os.getcwd()
20     if root_dir is not None:
21         if logger is not None:
22             logger.debug("changing into '%s'", root_dir)
23         base_name = os.path.abspath(base_name)
24         if not dry_run:
25             os.chdir(root_dir)
26 
27     if base_dir is None:
28         base_dir = os.curdir
29 
30     kwargs = {'dry_run': dry_run, 'logger': logger}
31 
32     try:
33         format_info = _ARCHIVE_FORMATS[format]
34     except KeyError:
35         raise ValueError, "unknown archive format '%s'" % format
36 
37     func = format_info[0]
38     for arg, val in format_info[1]:
39         kwargs[arg] = val
40 
41     if format != 'zip':
42         kwargs['owner'] = owner
43         kwargs['group'] = group
44 
45     try:
46         filename = func(base_name, base_dir, **kwargs)
47     finally:
48         if root_dir is not None:
49             if logger is not None:
50                 logger.debug("changing back to '%s'", save_cwd)
51             os.chdir(save_cwd)
52 
53     return filename

shutil 對壓縮包的處理是調用 ZipFile 和 TarFile 兩個模塊來進行的，詳細：

 1 import zipfile
 2 
 3 # 壓縮
 4 z = zipfile.ZipFile('laxi.zip', 'w')
 5 z.write('a.log')
 6 z.write('data.data')
 7 z.close()
 8 
 9 # 解壓
10 z = zipfile.ZipFile('laxi.zip', 'r')
11 z.extractall()
12 z.close()

zipfile 壓縮解壓

 1 import tarfile
 2 
 3 # 壓縮
 4 tar = tarfile.open('your.tar','w')
 5 tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
 6 tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
 7 tar.close()
 8 
 9 # 解壓
10 tar = tarfile.open('your.tar','r')
11 tar.extractall()  # 可設置解壓地址
12 tar.close()

tarfile壓縮和解壓

  1 class ZipFile(object):
  2     """ Class with methods to open, read, write, close, list zip files.
  3 
  4     z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False)
  5 
  6     file: Either the path to the file, or a file-like object.
  7           If it is a path, the file will be opened and closed by ZipFile.
  8     mode: The mode can be either read "r", write "w" or append "a".
  9     compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
 10     allowZip64: if True ZipFile will create files with ZIP64 extensions when
 11                 needed, otherwise it will raise an exception when this would
 12                 be necessary.
 13 
 14     """
 15 
 16     fp = None                   # Set here since __del__ checks it
 17 
 18     def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False):
 19         """Open the ZIP file with mode read "r", write "w" or append "a"."""
 20         if mode not in ("r", "w", "a"):
 21             raise RuntimeError('ZipFile() requires mode "r", "w", or "a"')
 22 
 23         if compression == ZIP_STORED:
 24             pass
 25         elif compression == ZIP_DEFLATED:
 26             if not zlib:
 27                 raise RuntimeError,\
 28                       "Compression requires the (missing) zlib module"
 29         else:
 30             raise RuntimeError, "That compression method is not supported"
 31 
 32         self._allowZip64 = allowZip64
 33         self._didModify = False
 34         self.debug = 0  # Level of printing: 0 through 3
 35         self.NameToInfo = {}    # Find file info given name
 36         self.filelist = []      # List of ZipInfo instances for archive
 37         self.compression = compression  # Method of compression
 38         self.mode = key = mode.replace('b', '')[0]
 39         self.pwd = None
 40         self._comment = ''
 41 
 42         # Check if we were passed a file-like object
 43         if isinstance(file, basestring):
 44             self._filePassed = 0
 45             self.filename = file
 46             modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'}
 47             try:
 48                 self.fp = open(file, modeDict[mode])
 49             except IOError:
 50                 if mode == 'a':
 51                     mode = key = 'w'
 52                     self.fp = open(file, modeDict[mode])
 53                 else:
 54                     raise
 55         else:
 56             self._filePassed = 1
 57             self.fp = file
 58             self.filename = getattr(file, 'name', None)
 59 
 60         try:
 61             if key == 'r':
 62                 self._RealGetContents()
 63             elif key == 'w':
 64                 # set the modified flag so central directory gets written
 65                 # even if no files are added to the archive
 66                 self._didModify = True
 67             elif key == 'a':
 68                 try:
 69                     # See if file is a zip file
 70                     self._RealGetContents()
 71                     # seek to start of directory and overwrite
 72                     self.fp.seek(self.start_dir, 0)
 73                 except BadZipfile:
 74                     # file is not a zip file, just append
 75                     self.fp.seek(0, 2)
 76 
 77                     # set the modified flag so central directory gets written
 78                     # even if no files are added to the archive
 79                     self._didModify = True
 80             else:
 81                 raise RuntimeError('Mode must be "r", "w" or "a"')
 82         except:
 83             fp = self.fp
 84             self.fp = None
 85             if not self._filePassed:
 86                 fp.close()
 87             raise
 88 
 89     def __enter__(self):
 90         return self
 91 
 92     def __exit__(self, type, value, traceback):
 93         self.close()
 94 
 95     def _RealGetContents(self):
 96         """Read in the table of contents for the ZIP file."""
 97         fp = self.fp
 98         try:
 99             endrec = _EndRecData(fp)
100         except IOError:
101             raise BadZipfile("File is not a zip file")
102         if not endrec:
103             raise BadZipfile, "File is not a zip file"
104         if self.debug > 1:
105             print endrec
106         size_cd = endrec[_ECD_SIZE]             # bytes in central directory
107         offset_cd = endrec[_ECD_OFFSET]         # offset of central directory
108         self._comment = endrec[_ECD_COMMENT]    # archive comment
109 
110         # "concat" is zero, unless zip was concatenated to another file
111         concat = endrec[_ECD_LOCATION] - size_cd - offset_cd
112         if endrec[_ECD_SIGNATURE] == stringEndArchive64:
113             # If Zip64 extension structures are present, account for them
114             concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator)
115 
116         if self.debug > 2:
117             inferred = concat + offset_cd
118             print "given, inferred, offset", offset_cd, inferred, concat
119         # self.start_dir:  Position of start of central directory
120         self.start_dir = offset_cd + concat
121         fp.seek(self.start_dir, 0)
122         data = fp.read(size_cd)
123         fp = cStringIO.StringIO(data)
124         total = 0
125         while total < size_cd:
126             centdir = fp.read(sizeCentralDir)
127             if len(centdir) != sizeCentralDir:
128                 raise BadZipfile("Truncated central directory")
129             centdir = struct.unpack(structCentralDir, centdir)
130             if centdir[_CD_SIGNATURE] != stringCentralDir:
131                 raise BadZipfile("Bad magic number for central directory")
132             if self.debug > 2:
133                 print centdir
134             filename = fp.read(centdir[_CD_FILENAME_LENGTH])
135             # Create ZipInfo instance to store file information
136             x = ZipInfo(filename)
137             x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH])
138             x.comment = fp.read(centdir[_CD_COMMENT_LENGTH])
139             x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET]
140             (x.create_version, x.create_system, x.extract_version, x.reserved,
141                 x.flag_bits, x.compress_type, t, d,
142                 x.CRC, x.compress_size, x.file_size) = centdir[1:12]
143             x.volume, x.internal_attr, x.external_attr = centdir[15:18]
144             # Convert date/time code to (year, month, day, hour, min, sec)
145             x._raw_time = t
146             x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F,
147                                      t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )
148 
149             x._decodeExtra()
150             x.header_offset = x.header_offset + concat
151             x.filename = x._decodeFilename()
152             self.filelist.append(x)
153             self.NameToInfo[x.filename] = x
154 
155             # update total bytes read from central directory
156             total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH]
157                      + centdir[_CD_EXTRA_FIELD_LENGTH]
158                      + centdir[_CD_COMMENT_LENGTH])
159 
160             if self.debug > 2:
161                 print "total", total
162 
163 
164     def namelist(self):
165         """Return a list of file names in the archive."""
166         l = []
167         for data in self.filelist:
168             l.append(data.filename)
169         return l
170 
171     def infolist(self):
172         """Return a list of class ZipInfo instances for files in the
173         archive."""
174         return self.filelist
175 
176     def printdir(self):
177         """Print a table of contents for the zip file."""
178         print "%-46s %19s %12s" % ("File Name", "Modified    ", "Size")
179         for zinfo in self.filelist:
180             date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6]
181             print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size)
182 
183     def testzip(self):
184         """Read all the files and check the CRC."""
185         chunk_size = 2 ** 20
186         for zinfo in self.filelist:
187             try:
188                 # Read by chunks, to avoid an OverflowError or a
189                 # MemoryError with very large embedded files.
190                 with self.open(zinfo.filename, "r") as f:
191                     while f.read(chunk_size):     # Check CRC-32
192                         pass
193             except BadZipfile:
194                 return zinfo.filename
195 
196     def getinfo(self, name):
197         """Return the instance of ZipInfo given 'name'."""
198         info = self.NameToInfo.get(name)
199         if info is None:
200             raise KeyError(
201                 'There is no item named %r in the archive' % name)
202 
203         return info
204 
205     def setpassword(self, pwd):
206         """Set default password for encrypted files."""
207         self.pwd = pwd
208 
209     @property
210     def comment(self):
211         """The comment text associated with the ZIP file."""
212         return self._comment
213 
214     @comment.setter
215     def comment(self, comment):
216         # check for valid comment length
217         if len(comment) > ZIP_MAX_COMMENT:
218             import warnings
219             warnings.warn('Archive comment is too long; truncating to %d bytes'
220                           % ZIP_MAX_COMMENT, stacklevel=2)
221             comment = comment[:ZIP_MAX_COMMENT]
222         self._comment = comment
223         self._didModify = True
224 
225     def read(self, name, pwd=None):
226         """Return file bytes (as a string) for name."""
227         return self.open(name, "r", pwd).read()
228 
229     def open(self, name, mode="r", pwd=None):
230         """Return file-like object for 'name'."""
231         if mode not in ("r", "U", "rU"):
232             raise RuntimeError, 'open() requires mode "r", "U", or "rU"'
233         if not self.fp:
234             raise RuntimeError, \
235                   "Attempt to read ZIP archive that was already closed"
236 
237         # Only open a new file for instances where we were not
238         # given a file object in the constructor
239         if self._filePassed:
240             zef_file = self.fp
241             should_close = False
242         else:
243             zef_file = open(self.filename, 'rb')
244             should_close = True
245 
246         try:
247             # Make sure we have an info object
248             if isinstance(name, ZipInfo):
249                 # 'name' is already an info object
250                 zinfo = name
251             else:
252                 # Get info object for name
253                 zinfo = self.getinfo(name)
254 
255             zef_file.seek(zinfo.header_offset, 0)
256 
257             # Skip the file header:
258             fheader = zef_file.read(sizeFileHeader)
259             if len(fheader) != sizeFileHeader:
260                 raise BadZipfile("Truncated file header")
261             fheader = struct.unpack(structFileHeader, fheader)
262             if fheader[_FH_SIGNATURE] != stringFileHeader:
263                 raise BadZipfile("Bad magic number for file header")
264 
265             fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])
266             if fheader[_FH_EXTRA_FIELD_LENGTH]:
267                 zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])
268 
269             if fname != zinfo.orig_filename:
270                 raise BadZipfile, \
271                         'File name in directory "%s" and header "%s" differ.' % (
272                             zinfo.orig_filename, fname)
273 
274             # check for encrypted flag & handle password
275             is_encrypted = zinfo.flag_bits & 0x1
276             zd = None
277             if is_encrypted:
278                 if not pwd:
279                     pwd = self.pwd
280                 if not pwd:
281                     raise RuntimeError, "File %s is encrypted, " \
282                         "password required for extraction" % name
283 
284                 zd = _ZipDecrypter(pwd)
285                 # The first 12 bytes in the cypher stream is an encryption header
286                 #  used to strengthen the algorithm. The first 11 bytes are
287                 #  completely random, while the 12th contains the MSB of the CRC,
288                 #  or the MSB of the file time depending on the header type
289                 #  and is used to check the correctness of the password.
290                 bytes = zef_file.read(12)
291                 h = map(zd, bytes[0:12])
292                 if zinfo.flag_bits & 0x8:
293                     # compare against the file type from extended local headers
294                     check_byte = (zinfo._raw_time >> 8) & 0xff
295                 else:
296                     # compare against the CRC otherwise
297                     check_byte = (zinfo.CRC >> 24) & 0xff
298                 if ord(h[11]) != check_byte:
299                     raise RuntimeError("Bad password for file", name)
300 
301             return ZipExtFile(zef_file, mode, zinfo, zd,
302                     close_fileobj=should_close)
303         except:
304             if should_close:
305                 zef_file.close()
306             raise
307 
308     def extract(self, member, path=None, pwd=None):
309         """Extract a member from the archive to the current working directory,
310            using its full name. Its file information is extracted as accurately
311            as possible. `member' may be a filename or a ZipInfo object. You can
312            specify a different directory using `path'.
313         """
314         if not isinstance(member, ZipInfo):
315             member = self.getinfo(member)
316 
317         if path is None:
318             path = os.getcwd()
319 
320         return self._extract_member(member, path, pwd)
321 
322     def extractall(self, path=None, members=None, pwd=None):
323         """Extract all members from the archive to the current working
324            directory. `path' specifies a different directory to extract to.
325            `members' is optional and must be a subset of the list returned
326            by namelist().
327         """
328         if members is None:
329             members = self.namelist()
330 
331         for zipinfo in members:
332             self.extract(zipinfo, path, pwd)
333 
334     def _extract_member(self, member, targetpath, pwd):
335         """Extract the ZipInfo object 'member' to a physical
336            file on the path targetpath.
337         """
338         # build the destination pathname, replacing
339         # forward slashes to platform specific separators.
340         arcname = member.filename.replace('/', os.path.sep)
341 
342         if os.path.altsep:
343             arcname = arcname.replace(os.path.altsep, os.path.sep)
344         # interpret absolute pathname as relative, remove drive letter or
345         # UNC path, redundant separators, "." and ".." components.
346         arcname = os.path.splitdrive(arcname)[1]
347         arcname = os.path.sep.join(x for x in arcname.split(os.path.sep)
348                     if x not in ('', os.path.curdir, os.path.pardir))
349         if os.path.sep == '\\':
350             # filter illegal characters on Windows
351             illegal = ':<>|"?*'
352             if isinstance(arcname, unicode):
353                 table = {ord(c): ord('_') for c in illegal}
354             else:
355                 table = string.maketrans(illegal, '_' * len(illegal))
356             arcname = arcname.translate(table)
357             # remove trailing dots
358             arcname = (x.rstrip('.') for x in arcname.split(os.path.sep))
359             arcname = os.path.sep.join(x for x in arcname if x)
360 
361         targetpath = os.path.join(targetpath, arcname)
362         targetpath = os.path.normpath(targetpath)
363 
364         # Create all upper directories if necessary.
365         upperdirs = os.path.dirname(targetpath)
366         if upperdirs and not os.path.exists(upperdirs):
367             os.makedirs(upperdirs)
368 
369         if member.filename[-1] == '/':
370             if not os.path.isdir(targetpath):
371                 os.mkdir(targetpath)
372             return targetpath
373 
374         with self.open(member, pwd=pwd) as source, \
375              file(targetpath, "wb") as target:
376             shutil.copyfileobj(source, target)
377 
378         return targetpath
379 
380     def _writecheck(self, zinfo):
381         """Check for errors before writing a file to the archive."""
382         if zinfo.filename in self.NameToInfo:
383             import warnings
384             warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)
385         if self.mode not in ("w", "a"):
386             raise RuntimeError, 'write() requires mode "w" or "a"'
387         if not self.fp:
388             raise RuntimeError, \
389                   "Attempt to write ZIP archive that was already closed"
390         if zinfo.compress_type == ZIP_DEFLATED and not zlib:
391             raise RuntimeError, \
392                   "Compression requires the (missing) zlib module"
393         if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED):
394             raise RuntimeError, \
395                   "That compression method is not supported"
396         if not self._allowZip64:
397             requires_zip64 = None
398             if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:
399                 requires_zip64 = "Files count"
400             elif zinfo.file_size > ZIP64_LIMIT:
401                 requires_zip64 = "Filesize"
402             elif zinfo.header_offset > ZIP64_LIMIT:
403                 requires_zip64 = "Zipfile size"
404             if requires_zip64:
405                 raise LargeZipFile(requires_zip64 +
406                                    " would require ZIP64 extensions")
407 
408     def write(self, filename, arcname=None, compress_type=None):
409         """Put the bytes from filename into the archive under the name
410         arcname."""
411         if not self.fp:
412             raise RuntimeError(
413                   "Attempt to write to ZIP archive that was already closed")
414 
415         st = os.stat(filename)
416         isdir = stat.S_ISDIR(st.st_mode)
417         mtime = time.localtime(st.st_mtime)
418         date_time = mtime[0:6]
419         # Create ZipInfo instance to store file information
420         if arcname is None:
421             arcname = filename
422         arcname = os.path.normpath(os.path.splitdrive(arcname)[1])
423         while arcname[0] in (os.sep, os.altsep):
424             arcname = arcname[1:]
425         if isdir:
426             arcname += '/'
427         zinfo = ZipInfo(arcname, date_time)
428         zinfo.external_attr = (st[0] & 0xFFFF) << 16L      # Unix attributes
429         if compress_type is None:
430             zinfo.compress_type = self.compression
431         else:
432             zinfo.compress_type = compress_type
433 
434         zinfo.file_size = st.st_size
435         zinfo.flag_bits = 0x00
436         zinfo.header_offset = self.fp.tell()    # Start of header bytes
437 
438         self._writecheck(zinfo)
439         self._didModify = True
440 
441         if isdir:
442             zinfo.file_size = 0
443             zinfo.compress_size = 0
444             zinfo.CRC = 0
445             zinfo.external_attr |= 0x10  # MS-DOS directory flag
446             self.filelist.append(zinfo)
447             self.NameToInfo[zinfo.filename] = zinfo
448             self.fp.write(zinfo.FileHeader(False))
449             return
450 
451         with open(filename, "rb") as fp:
452             # Must overwrite CRC and sizes with correct data later
453             zinfo.CRC = CRC = 0
454             zinfo.compress_size = compress_size = 0
455             # Compressed size can be larger than uncompressed size
456             zip64 = self._allowZip64 and \
457                     zinfo.file_size * 1.05 > ZIP64_LIMIT
458             self.fp.write(zinfo.FileHeader(zip64))
459             if zinfo.compress_type == ZIP_DEFLATED:
460                 cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
461                      zlib.DEFLATED, -15)
462             else:
463                 cmpr = None
464             file_size = 0
465             while 1:
466                 buf = fp.read(1024 * 8)
467                 if not buf:
468                     break
469                 file_size = file_size + len(buf)
470                 CRC = crc32(buf, CRC) & 0xffffffff
471                 if cmpr:
472                     buf = cmpr.compress(buf)
473                     compress_size = compress_size + len(buf)
474                 self.fp.write(buf)
475         if cmpr:
476             buf = cmpr.flush()
477             compress_size = compress_size + len(buf)
478             self.fp.write(buf)
479             zinfo.compress_size = compress_size
480         else:
481             zinfo.compress_size = file_size
482         zinfo.CRC = CRC
483         zinfo.file_size = file_size
484         if not zip64 and self._allowZip64:
485             if file_size > ZIP64_LIMIT:
486                 raise RuntimeError('File size has increased during compressing')
487             if compress_size > ZIP64_LIMIT:
488                 raise RuntimeError('Compressed size larger than uncompressed size')
489         # Seek backwards and write file header (which will now include
490         # correct CRC and file sizes)
491         position = self.fp.tell()       # Preserve current position in file
492         self.fp.seek(zinfo.header_offset, 0)
493         self.fp.write(zinfo.FileHeader(zip64))
494         self.fp.seek(position, 0)
495         self.filelist.append(zinfo)
496         self.NameToInfo[zinfo.filename] = zinfo
497 
498     def writestr(self, zinfo_or_arcname, bytes, compress_type=None):
499         """Write a file into the archive.  The contents is the string
500         'bytes'.  'zinfo_or_arcname' is either a ZipInfo instance or
501         the name of the file in the archive."""
502         if not isinstance(zinfo_or_arcname, ZipInfo):
503             zinfo = ZipInfo(filename=zinfo_or_arcname,
504                             date_time=time.localtime(time.time())[:6])
505 
506             zinfo.compress_type = self.compression
507             if zinfo.filename[-1] == '/':
508                 zinfo.external_attr = 0o40775 << 16   # drwxrwxr-x
509                 zinfo.external_attr |= 0x10           # MS-DOS directory flag
510             else:
511                 zinfo.external_attr = 0o600 << 16     # ?rw-------
512         else:
513             zinfo = zinfo_or_arcname
514 
515         if not self.fp:
516             raise RuntimeError(
517                   "Attempt to write to ZIP archive that was already closed")
518 
519         if compress_type is not None:
520             zinfo.compress_type = compress_type
521 
522         zinfo.file_size = len(bytes)            # Uncompressed size
523         zinfo.header_offset = self.fp.tell()    # Start of header bytes
524         self._writecheck(zinfo)
525         self._didModify = True
526         zinfo.CRC = crc32(bytes) & 0xffffffff       # CRC-32 checksum
527         if zinfo.compress_type == ZIP_DEFLATED:
528             co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
529                  zlib.DEFLATED, -15)
530             bytes = co.compress(bytes) + co.flush()
531             zinfo.compress_size = len(bytes)    # Compressed size
532         else:
533             zinfo.compress_size = zinfo.file_size
534         zip64 = zinfo.file_size > ZIP64_LIMIT or \
535                 zinfo.compress_size > ZIP64_LIMIT
536         if zip64 and not self._allowZip64:
537             raise LargeZipFile("Filesize would require ZIP64 extensions")
538         self.fp.write(zinfo.FileHeader(zip64))
539         self.fp.write(bytes)
540         if zinfo.flag_bits & 0x08:
541             # Write CRC and file sizes after the file data
542             fmt = '<LQQ' if zip64 else '<LLL'
543             self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size,
544                   zinfo.file_size))
545         self.fp.flush()
546         self.filelist.append(zinfo)
547         self.NameToInfo[zinfo.filename] = zinfo
548 
549     def __del__(self):
550         """Call the "close()" method in case the user forgot."""
551         self.close()
552 
553     def close(self):
554         """Close the file, and for mode "w" and "a" write the ending
555         records."""
556         if self.fp is None:
557             return
558 
559         try:
560             if self.mode in ("w", "a") and self._didModify: # write ending records
561                 pos1 = self.fp.tell()
562                 for zinfo in self.filelist:         # write central directory
563                     dt = zinfo.date_time
564                     dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]
565                     dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)
566                     extra = []
567                     if zinfo.file_size > ZIP64_LIMIT \
568                             or zinfo.compress_size > ZIP64_LIMIT:
569                         extra.append(zinfo.file_size)
570                         extra.append(zinfo.compress_size)
571                         file_size = 0xffffffff
572                         compress_size = 0xffffffff
573                     else:
574                         file_size = zinfo.file_size
575                         compress_size = zinfo.compress_size
576 
577                     if zinfo.header_offset > ZIP64_LIMIT:
578                         extra.append(zinfo.header_offset)
579                         header_offset = 0xffffffffL
580                     else:
581                         header_offset = zinfo.header_offset
582 
583                     extra_data = zinfo.extra
584                     if extra:
585                         # Append a ZIP64 field to the extra's
586                         extra_data = struct.pack(
587                                 '<HH' + 'Q'*len(extra),
588                                 1, 8*len(extra), *extra) + extra_data
589 
590                         extract_version = max(45, zinfo.extract_version)
591                         create_version = max(45, zinfo.create_version)
592                     else:
593                         extract_version = zinfo.extract_version
594                         create_version = zinfo.create_version
595 
596                     try:
597                         filename, flag_bits = zinfo._encodeFilenameFlags()
598                         centdir = struct.pack(structCentralDir,
599                         stringCentralDir, create_version,
600                         zinfo.create_system, extract_version, zinfo.reserved,
601                         flag_bits, zinfo.compress_type, dostime, dosdate,
602                         zinfo.CRC, compress_size, file_size,
603                         len(filename), len(extra_data), len(zinfo.comment),
604                         0, zinfo.internal_attr, zinfo.external_attr,
605                         header_offset)
606                     except DeprecationWarning:
607                         print >>sys.stderr, (structCentralDir,
608                         stringCentralDir, create_version,
609                         zinfo.create_system, extract_version, zinfo.reserved,
610                         zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,
611                         zinfo.CRC, compress_size, file_size,
612                         len(zinfo.filename), len(extra_data), len(zinfo.comment),
613                         0, zinfo.internal_attr, zinfo.external_attr,
614                         header_offset)
615                         raise
616                     self.fp.write(centdir)
617                     self.fp.write(filename)
618                     self.fp.write(extra_data)
619                     self.fp.write(zinfo.comment)
620 
621                 pos2 = self.fp.tell()
622                 # Write end-of-zip-archive record
623                 centDirCount = len(self.filelist)
624                 centDirSize = pos2 - pos1
625                 centDirOffset = pos1
626                 requires_zip64 = None
627                 if centDirCount > ZIP_FILECOUNT_LIMIT:
628                     requires_zip64 = "Files count"
629                 elif centDirOffset > ZIP64_LIMIT:
630                     requires_zip64 = "Central directory offset"
631                 elif centDirSize > ZIP64_LIMIT:
632                     requires_zip64 = "Central directory size"
633                 if requires_zip64:
634                     # Need to write the ZIP64 end-of-archive records
635                     if not self._allowZip64:
636                         raise LargeZipFile(requires_zip64 +
637                                            " would require ZIP64 extensions")
638                     zip64endrec = struct.pack(
639                             structEndArchive64, stringEndArchive64,
640                             44, 45, 45, 0, 0, centDirCount, centDirCount,
641                             centDirSize, centDirOffset)
642                     self.fp.write(zip64endrec)
643 
644                     zip64locrec = struct.pack(
645                             structEndArchive64Locator,
646                             stringEndArchive64Locator, 0, pos2, 1)
647                     self.fp.write(zip64locrec)
648                     centDirCount = min(centDirCount, 0xFFFF)
649                     centDirSize = min(centDirSize, 0xFFFFFFFF)
650                     centDirOffset = min(centDirOffset, 0xFFFFFFFF)
651 
652                 endrec = struct.pack(structEndArchive, stringEndArchive,
653                                     0, 0, centDirCount, centDirCount,
654                                     centDirSize, centDirOffset, len(self._comment))
655                 self.fp.write(endrec)
656                 self.fp.write(self._comment)
657                 self.fp.flush()
658         finally:
659             fp = self.fp
660             self.fp = None
661             if not self._filePassed:
662                 fp.close()
663 
664 ZipFile

ZipFile 壓縮和解壓

  1 class TarFile(object):
  2     """The TarFile Class provides an interface to tar archives.
  3     """
  4 
  5     debug = 0                   # May be set from 0 (no msgs) to 3 (all msgs)
  6 
  7     dereference = False         # If true, add content of linked file to the
  8                                 # tar file, else the link.
  9 
 10     ignore_zeros = False        # If true, skips empty or invalid blocks and
 11                                 # continues processing.
 12 
 13     errorlevel = 1              # If 0, fatal errors only appear in debug
 14                                 # messages (if debug >= 0). If > 0, errors
 15                                 # are passed to the caller as exceptions.
 16 
 17     format = DEFAULT_FORMAT     # The format to use when creating an archive.
 18 
 19     encoding = ENCODING         # Encoding for 8-bit character strings.
 20 
 21     errors = None               # Error handler for unicode conversion.
 22 
 23     tarinfo = TarInfo           # The default TarInfo class to use.
 24 
 25     fileobject = ExFileObject   # The default ExFileObject class to use.
 26 
 27     def __init__(self, name=None, mode="r", fileobj=None, format=None,
 28             tarinfo=None, dereference=None, ignore_zeros=None, encoding=None,
 29             errors=None, pax_headers=None, debug=None, errorlevel=None):
 30         """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to
 31            read from an existing archive, 'a' to append data to an existing
 32            file or 'w' to create a new file overwriting an existing one. `mode'
 33            defaults to 'r'.
 34            If `fileobj' is given, it is used for reading or writing data. If it
 35            can be determined, `mode' is overridden by `fileobj's mode.
 36            `fileobj' is not closed, when TarFile is closed.
 37         """
 38         modes = {"r": "rb", "a": "r+b", "w": "wb"}
 39         if mode not in modes:
 40             raise ValueError("mode must be 'r', 'a' or 'w'")
 41         self.mode = mode
 42         self._mode = modes[mode]
 43 
 44         if not fileobj:
 45             if self.mode == "a" and not os.path.exists(name):
 46                 # Create nonexistent files in append mode.
 47                 self.mode = "w"
 48                 self._mode = "wb"
 49             fileobj = bltn_open(name, self._mode)
 50             self._extfileobj = False
 51         else:
 52             if name is None and hasattr(fileobj, "name"):
 53                 name = fileobj.name
 54             if hasattr(fileobj, "mode"):
 55                 self._mode = fileobj.mode
 56             self._extfileobj = True
 57         self.name = os.path.abspath(name) if name else None
 58         self.fileobj = fileobj
 59 
 60         # Init attributes.
 61         if format is not None:
 62             self.format = format
 63         if tarinfo is not None:
 64             self.tarinfo = tarinfo
 65         if dereference is not None:
 66             self.dereference = dereference
 67         if ignore_zeros is not None:
 68             self.ignore_zeros = ignore_zeros
 69         if encoding is not None:
 70             self.encoding = encoding
 71 
 72         if errors is not None:
 73             self.errors = errors
 74         elif mode == "r":
 75             self.errors = "utf-8"
 76         else:
 77             self.errors = "strict"
 78 
 79         if pax_headers is not None and self.format == PAX_FORMAT:
 80             self.pax_headers = pax_headers
 81         else:
 82             self.pax_headers = {}
 83 
 84         if debug is not None:
 85             self.debug = debug
 86         if errorlevel is not None:
 87             self.errorlevel = errorlevel
 88 
 89         # Init datastructures.
 90         self.closed = False
 91         self.members = []       # list of members as TarInfo objects
 92         self._loaded = False    # flag if all members have been read
 93         self.offset = self.fileobj.tell()
 94                                 # current position in the archive file
 95         self.inodes = {}        # dictionary caching the inodes of
 96                                 # archive members already added
 97 
 98         try:
 99             if self.mode == "r":
100                 self.firstmember = None
101                 self.firstmember = self.next()
102 
103             if self.mode == "a":
104                 # Move to the end of the archive,
105                 # before the first empty block.
106                 while True:
107                     self.fileobj.seek(self.offset)
108                     try:
109                         tarinfo = self.tarinfo.fromtarfile(self)
110                         self.members.append(tarinfo)
111                     except EOFHeaderError:
112                         self.fileobj.seek(self.offset)
113                         break
114                     except HeaderError, e:
115                         raise ReadError(str(e))
116 
117             if self.mode in "aw":
118                 self._loaded = True
119 
120                 if self.pax_headers:
121                     buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy())
122                     self.fileobj.write(buf)
123                     self.offset += len(buf)
124         except:
125             if not self._extfileobj:
126                 self.fileobj.close()
127             self.closed = True
128             raise
129 
130     def _getposix(self):
131         return self.format == USTAR_FORMAT
132     def _setposix(self, value):
133         import warnings
134         warnings.warn("use the format attribute instead", DeprecationWarning,
135                       2)
136         if value:
137             self.format = USTAR_FORMAT
138         else:
139             self.format = GNU_FORMAT
140     posix = property(_getposix, _setposix)
141 
142     #--------------------------------------------------------------------------
143     # Below are the classmethods which act as alternate constructors to the
144     # TarFile class. The open() method is the only one that is needed for
145     # public use; it is the "super"-constructor and is able to select an
146     # adequate "sub"-constructor for a particular compression using the mapping
147     # from OPEN_METH.
148     #
149     # This concept allows one to subclass TarFile without losing the comfort of
150     # the super-constructor. A sub-constructor is registered and made available
151     # by adding it to the mapping in OPEN_METH.
152 
153     @classmethod
154     def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs):
155         """Open a tar archive for reading, writing or appending. Return
156            an appropriate TarFile class.
157 
158            mode:
159            'r' or 'r:*' open for reading with transparent compression
160            'r:'         open for reading exclusively uncompressed
161            'r:gz'       open for reading with gzip compression
162            'r:bz2'      open for reading with bzip2 compression
163            'a' or 'a:'  open for appending, creating the file if necessary
164            'w' or 'w:'  open for writing without compression
165            'w:gz'       open for writing with gzip compression
166            'w:bz2'      open for writing with bzip2 compression
167 
168            'r|*'        open a stream of tar blocks with transparent compression
169            'r|'         open an uncompressed stream of tar blocks for reading
170            'r|gz'       open a gzip compressed stream of tar blocks
171            'r|bz2'      open a bzip2 compressed stream of tar blocks
172            'w|'         open an uncompressed stream for writing
173            'w|gz'       open a gzip compressed stream for writing
174            'w|bz2'      open a bzip2 compressed stream for writing
175         """
176 
177         if not name and not fileobj:
178             raise ValueError("nothing to open")
179 
180         if mode in ("r", "r:*"):
181             # Find out which *open() is appropriate for opening the file.
182             for comptype in cls.OPEN_METH:
183                 func = getattr(cls, cls.OPEN_METH[comptype])
184                 if fileobj is not None:
185                     saved_pos = fileobj.tell()
186                 try:
187                     return func(name, "r", fileobj, **kwargs)
188                 except (ReadError, CompressionError), e:
189                     if fileobj is not None:
190                         fileobj.seek(saved_pos)
191                     continue
192             raise ReadError("file could not be opened successfully")
193 
194         elif ":" in mode:
195             filemode, comptype = mode.split(":", 1)
196             filemode = filemode or "r"
197             comptype = comptype or "tar"
198 
199             # Select the *open() function according to
200             # given compression.
201             if comptype in cls.OPEN_METH:
202                 func = getattr(cls, cls.OPEN_METH[comptype])
203             else:
204                 raise CompressionError("unknown compression type %r" % comptype)
205             return func(name, filemode, fileobj, **kwargs)
206 
207         elif "|" in mode:
208             filemode, comptype = mode.split("|", 1)
209             filemode = filemode or "r"
210             comptype = comptype or "tar"
211 
212             if filemode not in ("r", "w"):
213                 raise ValueError("mode must be 'r' or 'w'")
214 
215             stream = _Stream(name, filemode, comptype, fileobj, bufsize)
216             try:
217                 t = cls(name, filemode, stream, **kwargs)
218             except:
219                 stream.close()
220                 raise
221             t._extfileobj = False
222             return t
223 
224         elif mode in ("a", "w"):
225             return cls.taropen(name, mode, fileobj, **kwargs)
226 
227         raise ValueError("undiscernible mode")
228 
229     @classmethod
230     def taropen(cls, name, mode="r", fileobj=None, **kwargs):
231         """Open uncompressed tar archive name for reading or writing.
232         """
233         if mode not in ("r", "a", "w"):
234             raise ValueError("mode must be 'r', 'a' or 'w'")
235         return cls(name, mode, fileobj, **kwargs)
236 
237     @classmethod
238     def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
239         """Open gzip compressed tar archive name for reading or writing.
240            Appending is not allowed.
241         """
242         if mode not in ("r", "w"):
243             raise ValueError("mode must be 'r' or 'w'")
244 
245         try:
246             import gzip
247             gzip.GzipFile
248         except (ImportError, AttributeError):
249             raise CompressionError("gzip module is not available")
250 
251         try:
252             fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)
253         except OSError:
254             if fileobj is not None and mode == 'r':
255                 raise ReadError("not a gzip file")
256             raise
257 
258         try:
259             t = cls.taropen(name, mode, fileobj, **kwargs)
260         except IOError:
261             fileobj.close()
262             if mode == 'r':
263                 raise ReadError("not a gzip file")
264             raise
265         except:
266             fileobj.close()
267             raise
268         t._extfileobj = False
269         return t
270 
271     @classmethod
272     def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
273         """Open bzip2 compressed tar archive name for reading or writing.
274            Appending is not allowed.
275         """
276         if mode not in ("r", "w"):
277             raise ValueError("mode must be 'r' or 'w'.")
278 
279         try:
280             import bz2
281         except ImportError:
282             raise CompressionError("bz2 module is not available")
283 
284         if fileobj is not None:
285             fileobj = _BZ2Proxy(fileobj, mode)
286         else:
287             fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel)
288 
289         try:
290             t = cls.taropen(name, mode, fileobj, **kwargs)
291         except (IOError, EOFError):
292             fileobj.close()
293             if mode == 'r':
294                 raise ReadError("not a bzip2 file")
295             raise
296         except:
297             fileobj.close()
298             raise
299         t._extfileobj = False
300         return t
301 
302     # All *open() methods are registered here.
303     OPEN_METH = {
304         "tar": "taropen",   # uncompressed tar
305         "gz":  "gzopen",    # gzip compressed tar
306         "bz2": "bz2open"    # bzip2 compressed tar
307     }
308 
309     #--------------------------------------------------------------------------
310     # The public methods which TarFile provides:
311 
312     def close(self):
313         """Close the TarFile. In write-mode, two finishing zero blocks are
314            appended to the archive.
315         """
316         if self.closed:
317             return
318 
319         if self.mode in "aw":
320             self.fileobj.write(NUL * (BLOCKSIZE * 2))
321             self.offset += (BLOCKSIZE * 2)
322             # fill up the end with zero-blocks
323             # (like option -b20 for tar does)
324             blocks, remainder = divmod(self.offset, RECORDSIZE)
325             if remainder > 0:
326                 self.fileobj.write(NUL * (RECORDSIZE - remainder))
327 
328         if not self._extfileobj:
329             self.fileobj.close()
330         self.closed = True
331 
332     def getmember(self, name):
333         """Return a TarInfo object for member `name'. If `name' can not be
334            found in the archive, KeyError is raised. If a member occurs more
335            than once in the archive, its last occurrence is assumed to be the
336            most up-to-date version.
337         """
338         tarinfo = self._getmember(name)
339         if tarinfo is None:
340             raise KeyError("filename %r not found" % name)
341         return tarinfo
342 
343     def getmembers(self):
344         """Return the members of the archive as a list of TarInfo objects. The
345            list has the same order as the members in the archive.
346         """
347         self._check()
348         if not self._loaded:    # if we want to obtain a list of
349             self._load()        # all members, we first have to
350                                 # scan the whole archive.
351         return self.members
352 
353     def getnames(self):
354         """Return the members of the archive as a list of their names. It has
355            the same order as the list returned by getmembers().
356         """
357         return [tarinfo.name for tarinfo in self.getmembers()]
358 
359     def gettarinfo(self, name=None, arcname=None, fileobj=None):
360         """Create a TarInfo object for either the file `name' or the file
361            object `fileobj' (using os.fstat on its file descriptor). You can
362            modify some of the TarInfo's attributes before you add it using
363            addfile(). If given, `arcname' specifies an alternative name for the
364            file in the archive.
365         """
366         self._check("aw")
367 
368         # When fileobj is given, replace name by
369         # fileobj's real name.
370         if fileobj is not None:
371             name = fileobj.name
372 
373         # Building the name of the member in the archive.
374         # Backward slashes are converted to forward slashes,
375         # Absolute paths are turned to relative paths.
376         if arcname is None:
377             arcname = name
378         drv, arcname = os.path.splitdrive(arcname)
379         arcname = arcname.replace(os.sep, "/")
380         arcname = arcname.lstrip("/")
381 
382         # Now, fill the TarInfo object with
383         # information specific for the file.
384         tarinfo = self.tarinfo()
385         tarinfo.tarfile = self
386 
387         # Use os.stat or os.lstat, depending on platform
388         # and if symlinks shall be resolved.
389         if fileobj is None:
390             if hasattr(os, "lstat") and not self.dereference:
391                 statres = os.lstat(name)
392             else:
393                 statres = os.stat(name)
394         else:
395             statres = os.fstat(fileobj.fileno())
396         linkname = ""
397 
398         stmd = statres.st_mode
399         if stat.S_ISREG(stmd):
400             inode = (statres.st_ino, statres.st_dev)
401             if not self.dereference and statres.st_nlink > 1 and \
402                     inode in self.inodes and arcname != self.inodes[inode]:
403                 # Is it a hardlink to an already
404                 # archived file?
405                 type = LNKTYPE
406                 linkname = self.inodes[inode]
407             else:
408                 # The inode is added only if its valid.
409                 # For win32 it is always 0.
410                 type = REGTYPE
411                 if inode[0]:
412                     self.inodes[inode] = arcname
413         elif stat.S_ISDIR(stmd):
414             type = DIRTYPE
415         elif stat.S_ISFIFO(stmd):
416             type = FIFOTYPE
417         elif stat.S_ISLNK(stmd):
418             type = SYMTYPE
419             linkname = os.readlink(name)
420         elif stat.S_ISCHR(stmd):
421             type = CHRTYPE
422         elif stat.S_ISBLK(stmd):
423             type = BLKTYPE
424         else:
425             return None
426 
427         # Fill the TarInfo object with all
428         # information we can get.
429         tarinfo.name = arcname
430         tarinfo.mode = stmd
431         tarinfo.uid = statres.st_uid
432         tarinfo.gid = statres.st_gid
433         if type == REGTYPE:
434             tarinfo.size = statres.st_size
435         else:
436             tarinfo.size = 0L
437         tarinfo.mtime = statres.st_mtime
438         tarinfo.type = type
439         tarinfo.linkname = linkname
440         if pwd:
441             try:
442                 tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0]
443             except KeyError:
444                 pass
445         if grp:
446             try:
447                 tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]
448             except KeyError:
449                 pass
450 
451         if type in (CHRTYPE, BLKTYPE):
452             if hasattr(os, "major") and hasattr(os, "minor"):
453                 tarinfo.devmajor = os.major(statres.st_rdev)
454                 tarinfo.devminor = os.minor(statres.st_rdev)
455         return tarinfo
456 
457     def list(self, verbose=True):
458         """Print a table of contents to sys.stdout. If `verbose' is False, only
459            the names of the members are printed. If it is True, an `ls -l'-like
460            output is produced.
461         """
462         self._check()
463 
464         for tarinfo in self:
465             if verbose:
466                 print filemode(tarinfo.mode),
467                 print "%s/%s" % (tarinfo.uname or tarinfo.uid,
468                                  tarinfo.gname or tarinfo.gid),
469                 if tarinfo.ischr() or tarinfo.isblk():
470                     print "%10s" % ("%d,%d" \
471                                     % (tarinfo.devmajor, tarinfo.devminor)),
472                 else:
473                     print "%10d" % tarinfo.size,
474                 print "%d-%02d-%02d %02d:%02d:%02d" \
475                       % time.localtime(tarinfo.mtime)[:6],
476 
477             print tarinfo.name + ("/" if tarinfo.isdir() else ""),
478 
479             if verbose:
480                 if tarinfo.issym():
481                     print "->", tarinfo.linkname,
482                 if tarinfo.islnk():
483                     print "link to", tarinfo.linkname,
484             print
485 
486     def add(self, name, arcname=None, recursive=True, exclude=None, filter=None):
487         """Add the file `name' to the archive. `name' may be any type of file
488            (directory, fifo, symbolic link, etc.). If given, `arcname'
489            specifies an alternative name for the file in the archive.
490            Directories are added recursively by default. This can be avoided by
491            setting `recursive' to False. `exclude' is a function that should
492            return True for each filename to be excluded. `filter' is a function
493            that expects a TarInfo object argument and returns the changed
494            TarInfo object, if it returns None the TarInfo object will be
495            excluded from the archive.
496         """
497         self._check("aw")
498 
499         if arcname is None:
500             arcname = name
501 
502         # Exclude pathnames.
503         if exclude is not None:
504             import warnings
505             warnings.warn("use the filter argument instead",
506                     DeprecationWarning, 2)
507             if exclude(name):
508                 self._dbg(2, "tarfile: Excluded %r" % name)
509                 return
510 
511         # Skip if somebody tries to archive the archive...
512         if self.name is not None and os.path.abspath(name) == self.name:
513             self._dbg(2, "tarfile: Skipped %r" % name)
514             return
515 
516         self._dbg(1, name)
517 
518         # Create a TarInfo object from the file.
519         tarinfo = self.gettarinfo(name, arcname)
520 
521         if tarinfo is None:
522             self._dbg(1, "tarfile: Unsupported type %r" % name)
523             return
524 
525         # Change or exclude the TarInfo object.
526         if filter is not None:
527             tarinfo = filter(tarinfo)
528             if tarinfo is None:
529                 self._dbg(2, "tarfile: Excluded %r" % name)
530                 return
531 
532         # Append the tar header and data to the archive.
533         if tarinfo.isreg():
534             with bltn_open(name, "rb") as f:
535                 self.addfile(tarinfo, f)
536 
537         elif tarinfo.isdir():
538             self.addfile(tarinfo)
539             if recursive:
540                 for f in os.listdir(name):
541                     self.add(os.path.join(name, f), os.path.join(arcname, f),
542                             recursive, exclude, filter)
543 
544         else:
545             self.addfile(tarinfo)
546 
547     def addfile(self, tarinfo, fileobj=None):
548         """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is
549            given, tarinfo.size bytes are read from it and added to the archive.
550            You can create TarInfo objects using gettarinfo().
551            On Windows platforms, `fileobj' should always be opened with mode
552            'rb' to avoid irritation about the file size.
553         """
554         self._check("aw")
555 
556         tarinfo = copy.copy(tarinfo)
557 
558         buf = tarinfo.tobuf(self.format, self.encoding, self.errors)
559         self.fileobj.write(buf)
560         self.offset += len(buf)
561 
562         # If there's data to follow, append it.
563         if fileobj is not None:
564             copyfileobj(fileobj, self.fileobj, tarinfo.size)
565             blocks, remainder = divmod(tarinfo.size, BLOCKSIZE)
566             if remainder > 0:
567                 self.fileobj.write(NUL * (BLOCKSIZE - remainder))
568                 blocks += 1
569             self.offset += blocks * BLOCKSIZE
570 
571         self.members.append(tarinfo)
572 
573     def extractall(self, path=".", members=None):
574         """Extract all members from the archive to the current working
575            directory and set owner, modification time and permissions on
576            directories afterwards. `path' specifies a different directory
577            to extract to. `members' is optional and must be a subset of the
578            list returned by getmembers().
579         """
580         directories = []
581 
582         if members is None:
583             members = self
584 
585         for tarinfo in members:
586             if tarinfo.isdir():
587                 # Extract directories with a safe mode.
588                 directories.append(tarinfo)
589                 tarinfo = copy.copy(tarinfo)
590                 tarinfo.mode = 0700
591             self.extract(tarinfo, path)
592 
593         # Reverse sort directories.
594         directories.sort(key=operator.attrgetter('name'))
595         directories.reverse()
596 
597         # Set correct owner, mtime and filemode on directories.
598         for tarinfo in directories:
599             dirpath = os.path.join(path, tarinfo.name)
600             try:
601                 self.chown(tarinfo, dirpath)
602                 self.utime(tarinfo, dirpath)
603                 self.chmod(tarinfo, dirpath)
604             except ExtractError, e:
605                 if self.errorlevel > 1:
606                     raise
607                 else:
608                     self._dbg(1, "tarfile: %s" % e)
609 
610     def extract(self, member, path=""):
611         """Extract a member from the archive to the current working directory,
612            using its full name. Its file information is extracted as accurately
613            as possible. `member' may be a filename or a TarInfo object. You can
614            specify a different directory using `path'.
615         """
616         self._check("r")
617 
618         if isinstance(member, basestring):
619             tarinfo = self.getmember(member)
620         else:
621             tarinfo = member
622 
623         # Prepare the link target for makelink().
624         if tarinfo.islnk():
625             tarinfo._link_target = os.path.join(path, tarinfo.linkname)
626 
627         try:
628             self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
629         except EnvironmentError, e:
630             if self.errorlevel > 0:
631                 raise
632             else:
633                 if e.filename is None:
634                     self._dbg(1, "tarfile: %s" % e.strerror)
635                 else:
636                     self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename))
637         except ExtractError, e:
638             if self.errorlevel > 1:
639                 raise
640             else:
641                 self._dbg(1, "tarfile: %s" % e)
642 
643     def extractfile(self, member):
644         """Extract a member from the archive as a file object. `member' may be
645            a filename or a TarInfo object. If `member' is a regular file, a
646            file-like object is returned. If `member' is a link, a file-like
647            object is constructed from the link's target. If `member' is none of
648            the above, None is returned.
649            The file-like object is read-only and provides the following
650            methods: read(), readline(), readlines(), seek() and tell()
651         """
652         self._check("r")
653 
654         if isinstance(member, basestring):
655             tarinfo = self.getmember(member)
656         else:
657             tarinfo = member
658 
659         if tarinfo.isreg():
660             return self.fileobject(self, tarinfo)
661 
662         elif tarinfo.type not in SUPPORTED_TYPES:
663             # If a member's type is unknown, it is treated as a
664             # regular file.
665             return self.fileobject(self, tarinfo)
666 
667         elif tarinfo.islnk() or tarinfo.issym():
668             if isinstance(self.fileobj, _Stream):
669                 # A small but ugly workaround for the case that someone tries
670                 # to extract a (sym)link as a file-object from a non-seekable
671                 # stream of tar blocks.
672                 raise StreamError("cannot extract (sym)link as file object")
673             else:
674                 # A (sym)link's file object is its target's file object.
675                 return self.extractfile(self._find_link_target(tarinfo))
676         else:
677             # If there's no data associated with the member (directory, chrdev,
678             # blkdev, etc.), return None instead of a file object.
679             return None
680 
681     def _extract_member(self, tarinfo, targetpath):
682         """Extract the TarInfo object tarinfo to a physical
683            file called targetpath.
684         """
685         # Fetch the TarInfo object for the given name
686         # and build the destination pathname, replacing
687         # forward slashes to platform specific separators.
688         targetpath = targetpath.rstrip("/")
689         targetpath = targetpath.replace("/", os.sep)
690 
691         # Create all upper directories.
692         upperdirs = os.path.dirname(targetpath)
693         if upperdirs and not os.path.exists(upperdirs):
694             # Create directories that are not part of the archive with
695             # default permissions.
696             os.makedirs(upperdirs)
697 
698         if tarinfo.islnk() or tarinfo.issym():
699             self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname))
700         else:
701             self._dbg(1, tarinfo.name)
702 
703         if tarinfo.isreg():
704             self.makefile(tarinfo, targetpath)
705         elif tarinfo.isdir():
706             self.makedir(tarinfo, targetpath)
707         elif tarinfo.isfifo():
708             self.makefifo(tarinfo, targetpath)
709         elif tarinfo.ischr() or tarinfo.isblk():
710             self.makedev(tarinfo, targetpath)
711         elif tarinfo.islnk() or tarinfo.issym():
712             self.makelink(tarinfo, targetpath)
713         elif tarinfo.type not in SUPPORTED_TYPES:
714             self.makeunknown(tarinfo, targetpath)
715         else:
716             self.makefile(tarinfo, targetpath)
717 
718         self.chown(tarinfo, targetpath)
719         if not tarinfo.issym():
720             self.chmod(tarinfo, targetpath)
721             self.utime(tarinfo, targetpath)
722 
723     #--------------------------------------------------------------------------
724     # Below are the different file methods. They are called via
725     # _extract_member() when extract() is called. They can be replaced in a
726     # subclass to implement other functionality.
727 
728     def makedir(self, tarinfo, targetpath):
729         """Make a directory called targetpath.
730         """
731         try:
732             # Use a safe mode for the directory, the real mode is set
733             # later in _extract_member().
734             os.mkdir(targetpath, 0700)
735         except EnvironmentError, e:
736             if e.errno != errno.EEXIST:
737                 raise
738 
739     def makefile(self, tarinfo, targetpath):
740         """Make a file called targetpath.
741         """
742         source = self.extractfile(tarinfo)
743         try:
744             with bltn_open(targetpath, "wb") as target:
745                 copyfileobj(source, target)
746         finally:
747             source.close()
748 
749     def makeunknown(self, tarinfo, targetpath):
750         """Make a file from a TarInfo object with an unknown type
751            at targetpath.
752         """
753         self.makefile(tarinfo, targetpath)
754         self._dbg(1, "tarfile: Unknown file type %r, " \
755                      "extracted as regular file." % tarinfo.type)
756 
757     def makefifo(self, tarinfo, targetpath):
758         """Make a fifo called targetpath.
759         """
760         if hasattr(os, "mkfifo"):
761             os.mkfifo(targetpath)
762         else:
763             raise ExtractError("fifo not supported by system")
764 
765     def makedev(self, tarinfo, targetpath):
766         """Make a character or block device called targetpath.
767         """
768         if not hasattr(os, "mknod") or not hasattr(os, "makedev"):
769             raise ExtractError("special devices not supported by system")
770 
771         mode = tarinfo.mode
772         if tarinfo.isblk():
773             mode |= stat.S_IFBLK
774         else:
775             mode |= stat.S_IFCHR
776 
777         os.mknod(targetpath, mode,
778                  os.makedev(tarinfo.devmajor, tarinfo.devminor))
779 
780     def makelink(self, tarinfo, targetpath):
781         """Make a (symbolic) link called targetpath. If it cannot be created
782           (platform limitation), we try to make a copy of the referenced file
783           instead of a link.
784         """
785         if hasattr(os, "symlink") and hasattr(os, "link"):
786             # For systems that support symbolic and hard links.
787             if tarinfo.issym():
788                 if os.path.lexists(targetpath):
789                     os.unlink(targetpath)
790                 os.symlink(tarinfo.linkname, targetpath)
791             else:
792                 # See extract().
793                 if os.path.exists(tarinfo._link_target):
794                     if os.path.lexists(targetpath):
795                         os.unlink(targetpath)
796                     os.link(tarinfo._link_target, targetpath)
797                 else:
798                     self._extract_member(self._find_link_target(tarinfo), targetpath)
799         else:
800             try:
801                 self._extract_member(self._find_link_target(tarinfo), targetpath)
802             except KeyError:
803                 raise ExtractError("unable to resolve link inside archive")
804 
805     def chown(self, tarinfo, targetpath):
806         """Set owner of targetpath according to tarinfo.
807         """
808         if pwd and hasattr(os, "geteuid") and os.geteuid() == 0:
809             # We have to be root to do so.
810             try:
811                 g = grp.getgrnam(tarinfo.gname)[2]
812             except KeyError:
813                 g = tarinfo.gid
814             try:
815                 u = pwd.getpwnam(tarinfo.uname)[2]
816             except KeyError:
817                 u = tarinfo.uid
818             try:
819                 if tarinfo.issym() and hasattr(os, "lchown"):
820                     os.lchown(targetpath, u, g)
821                 else:
822                     if sys.platform != "os2emx":
823                         os.chown(targetpath, u, g)
824             except EnvironmentError, e:
825                 raise ExtractError("could not change owner")
826 
827     def chmod(self, tarinfo, targetpath):
828         """Set file permissions of targetpath according to tarinfo.
829         """
830         if hasattr(os, 'chmod'):
831             try:
832                 os.chmod(targetpath, tarinfo.mode)
833             except EnvironmentError, e:
834                 raise ExtractError("could not change mode")
835 
836     def utime(self, tarinfo, targetpath):
837         """Set modification time of targetpath according to tarinfo.
838         """
839         if not hasattr(os, 'utime'):
840             return
841         try:
842             os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime))
843         except EnvironmentError, e:
844             raise ExtractError("could not change modification time")
845 
846     #--------------------------------------------------------------------------
847     def next(self):
848         """Return the next member of the archive as a TarInfo object, when
849            TarFile is opened for reading. Return None if there is no more
850            available.
851         """
852         self._check("ra")
853         if self.firstmember is not None:
854             m = self.firstmember
855             self.firstmember = None
856             return m
857 
858         # Read the next block.
859         self.fileobj.seek(self.offset)
860         tarinfo = None
861         while True:
862             try:
863                 tarinfo = self.tarinfo.fromtarfile(self)
864             except EOFHeaderError, e:
865                 if self.ignore_zeros:
866                     self._dbg(2, "0x%X: %s" % (self.offset, e))
867                     self.offset += BLOCKSIZE
868                     continue
869             except InvalidHeaderError, e:
870                 if self.ignore_zeros:
871                     self._dbg(2, "0x%X: %s" % (self.offset, e))
872                     self.offset += BLOCKSIZE
873                     continue
874                 elif self.offset == 0:
875                     raise ReadError(str(e))
876             except EmptyHeaderError:
877                 if self.offset == 0:
878                     raise ReadError("empty file")
879             except TruncatedHeaderError, e:
880                 if self.offset == 0:
881                     raise ReadError(str(e))
882             except SubsequentHeaderError, e:
883                 raise ReadError(str(e))
884             break
885 
886         if tarinfo is not None:
887             self.members.append(tarinfo)
888         else:
889             self._loaded = True
890 
891         return tarinfo
892 
893     #--------------------------------------------------------------------------
894     # Little helper methods:
895 
896     def _getmember(self, name, tarinfo=None, normalize=False):
897         """Find an archive member by name from bottom to top.
898            If tarinfo is given, it is used as the starting point.
899         """
900         # Ensure that all members have been loaded.
901         members = self.getmembers()
902 
903         # Limit the member search list up to tarinfo.
904         if tarinfo is not None:
905             members = members[:members.index(tarinfo)]
906 
907         if normalize:
908             name = os.path.normpath(name)
909 
910         for member in reversed(members):
911             if normalize:
912                 member_name = os.path.normpath(member.name)
913             else:
914                 member_name = member.name
915 
916             if name == member_name:
917                 return member
918 
919     def _load(self):
920         """Read through the entire archive file and look for readable
921            members.
922         """
923         while True:
924             tarinfo = self.next()
925             if tarinfo is None:
926                 break
927         self._loaded = True
928 
929     def _check(self, mode=None):
930         """Check if TarFile is still open, and if the operation's mode
931            corresponds to TarFile's mode.
932         """
933         if self.closed:
934             raise IOError("%s is closed" % self.__class__.__name__)
935         if mode is not None and self.mode not in mode:
936             raise IOError("bad operation for mode %r" % self.mode)
937 
938     def _find_link_target(self, tarinfo):
939         """Find the target member of a symlink or hardlink member in the
940            archive.
941         """
942         if tarinfo.issym():
943             # Always search the entire archive.
944             linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname)))
945             limit = None
946         else:
947             # Search the archive before the link, because a hard link is
948             # just a reference to an already archived file.
949             linkname = tarinfo.linkname
950             limit = tarinfo
951 
952         member = self._getmember(linkname, tarinfo=limit, normalize=True)
953         if member is None:
954             raise KeyError("linkname %r not found" % linkname)
955         return member
956 
957     def __iter__(self):
958         """Provide an iterator object.
959         """
960         if self._loaded:
961             return iter(self.members)
962         else:
963             return TarIter(self)
964 
965     def _dbg(self, level, msg):
966         """Write debugging output to sys.stderr.
967         """
968         if level <= self.debug:
969             print >> sys.stderr, msg
970 
971     def __enter__(self):
972         self._check()
973         return self
974 
975     def __exit__(self, type, value, traceback):
976         if type is None:
977             self.close()
978         else:
979             # An exception occurred. We must not call close() because
980             # it would try to write end-of-archive blocks and padding.
981             if not self._extfileobj:
982                 self.fileobj.close()
983             self.closed = True
984 # class TarFile
985 
986 TarFile

json & pickle 模塊

用於序列化的兩個模塊

json，用於字符串和 python數據類型間進行轉換
pickle，用於python特有的類型和 python的數據類型間進行轉換

Json模塊提供了四個功能：dumps、dump、loads、load

pickle模塊提供了四個功能：dumps、dump、loads、load

shelve 模塊

shelve模塊是一個簡單的k,v將內存數據經過文件持久化的模塊，能夠持久化任何pickle可支持的python數據格式

 1 import shelve
 2  
 3 d = shelve.open('shelve_test') #打開一個文件
 4  
 5 class Test(object):
 6     def __init__(self,n):
 7         self.n = n
 8  
 9  
10 t = Test(123) 
11 t2 = Test(123334)
12  
13 name = ["alex","rain","test"]
14 d["test"] = name #持久化列表
15 d["t1"] = t      #持久化類
16 d["t2"] = t2
17  
18 d.close()

xml處理模塊

xml是實現不一樣語言或程序之間進行數據交換的協議，跟json差很少，但json使用起來更簡單，不過，古時候，在json還沒誕生的黑暗年代，你們只能選擇用xml呀，至今不少傳統公司如金融行業的不少系統的接口還主要是xml。

xml的格式以下，就是經過<>節點來區別數據結構的:

 1 <?xml version="1.0"?>
 2 <data>
 3     <country name="Liechtenstein">
 4         <rank updated="yes">2</rank>
 5         <year>2008</year>
 6         <gdppc>141100</gdppc>
 7         <neighbor name="Austria" direction="E"/>
 8         <neighbor name="Switzerland" direction="W"/>
 9     </country>
10     <country name="Singapore">
11         <rank updated="yes">5</rank>
12         <year>2011</year>
13         <gdppc>59900</gdppc>
14         <neighbor name="Malaysia" direction="N"/>
15     </country>
16     <country name="Panama">
17         <rank updated="yes">69</rank>
18         <year>2011</year>
19         <gdppc>13600</gdppc>
20         <neighbor name="Costa Rica" direction="W"/>
21         <neighbor name="Colombia" direction="E"/>
22     </country>
23 </data>

xml協議在各個語言裏的都是支持的，在python中能夠用如下模塊操做xml

 1 import xml.etree.ElementTree as ET
 2  
 3 tree = ET.parse("xmltest.xml")
 4 root = tree.getroot()
 5 print(root.tag)
 6  
 7 #遍歷xml文檔
 8 for child in root:
 9     print(child.tag, child.attrib)
10     for i in child:
11         print(i.tag,i.text)
12  
13 #只遍歷year 節點
14 for node in root.iter('year'):
15     print(node.tag,node.text)

修改和刪除xml文檔內容

 1 import xml.etree.ElementTree as ET
 2  
 3 tree = ET.parse("xmltest.xml")
 4 root = tree.getroot()
 5  
 6 #修改
 7 for node in root.iter('year'):
 8     new_year = int(node.text) + 1
 9     node.text = str(new_year)
10     node.set("updated","yes")
11  
12 tree.write("xmltest.xml")
13  
14  
15 #刪除node
16 for country in root.findall('country'):
17    rank = int(country.find('rank').text)
18    if rank > 50:
19      root.remove(country)
20  
21 tree.write('output.xml')

本身建立xml文檔

 1 import xml.etree.ElementTree as ET
 2  
 3  
 4 new_xml = ET.Element("namelist")
 5 name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
 6 age = ET.SubElement(name,"age",attrib={"checked":"no"})
 7 sex = ET.SubElement(name,"sex")
 8 sex.text = '33'
 9 name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
10 age = ET.SubElement(name2,"age")
11 age.text = '19'
12  
13 et = ET.ElementTree(new_xml) #生成文檔對象
14 et.write("test.xml", encoding="utf-8",xml_declaration=True)
15  
16 ET.dump(new_xml) #打印生成的格式

PyYAML模塊

Python也能夠很容易的處理ymal文檔格式，只不過須要安裝一個模塊，參考文檔：http://pyyaml.org/wiki/PyYAMLDocumentation

ConfigParser模塊

用於對特定的配置進行操做，當前模塊的名稱在 python 3.x 版本中變動爲 configparser。

1 # 註釋1
2 ; 註釋2
3  
4 [section1]
5 k1 = v1
6 k2:v2
7  
8 [section2]
9 k1 = v1

 1 import ConfigParser
 2  
 3 config = ConfigParser.ConfigParser()
 4 config.read('i.cfg')
 5  
 6 # ########## 讀 ##########
 7 #secs = config.sections()
 8 #print secs
 9 #options = config.options('group2')
10 #print options
11  
12 #item_list = config.items('group2')
13 #print item_list
14  
15 #val = config.get('group1','key')
16 #val = config.getint('group1','key')
17  
18 # ########## 改寫 ##########
19 #sec = config.remove_section('group1')
20 #config.write(open('i.cfg', "w"))
21  
22 #sec = config.has_section('wupeiqi')
23 #sec = config.add_section('wupeiqi')
24 #config.write(open('i.cfg', "w"))
25  
26  
27 #config.set('group2','k1',11111)
28 #config.write(open('i.cfg', "w"))
29  
30 #config.remove_option('group2','age')
31 #config.write(open('i.cfg', "w"))

用於生成和修改常見配置文檔，當前模塊的名稱在 python 3.x 版本中變動爲 configparser。

來看一個好多軟件的常見文檔格式以下

 1 [DEFAULT]
 2 ServerAliveInterval = 45
 3 Compression = yes
 4 CompressionLevel = 9
 5 ForwardX11 = yes
 6  
 7 [bitbucket.org]
 8 User = hg
 9  
10 [topsecret.server.com]
11 Port = 50022
12 ForwardX11 = no

若是想用python生成一個這樣的文檔怎麼作呢？

 1 import configparser
 2  
 3 config = configparser.ConfigParser()
 4 config["DEFAULT"] = {'ServerAliveInterval': '45',
 5                       'Compression': 'yes',
 6                      'CompressionLevel': '9'}
 7  
 8 config['bitbucket.org'] = {}
 9 config['bitbucket.org']['User'] = 'hg'
10 config['topsecret.server.com'] = {}
11 topsecret = config['topsecret.server.com']
12 topsecret['Host Port'] = '50022'     # mutates the parser
13 topsecret['ForwardX11'] = 'no'  # same here
14 config['DEFAULT']['ForwardX11'] = 'yes'
15 with open('example.ini', 'w') as configfile:
16    config.write(configfile)

寫完了還能夠再讀出來哈。

 1 >>> import configparser
 2 >>> config = configparser.ConfigParser()
 3 >>> config.sections()
 4 []
 5 >>> config.read('example.ini')
 6 ['example.ini']
 7 >>> config.sections()
 8 ['bitbucket.org', 'topsecret.server.com']
 9 >>> 'bitbucket.org' in config
10 True
11 >>> 'bytebong.com' in config
12 False
13 >>> config['bitbucket.org']['User']
14 'hg'
15 >>> config['DEFAULT']['Compression']
16 'yes'
17 >>> topsecret = config['topsecret.server.com']
18 >>> topsecret['ForwardX11']
19 'no'
20 >>> topsecret['Port']
21 '50022'
22 >>> for key in config['bitbucket.org']: print(key)
23 ...
24 user
25 compressionlevel
26 serveraliveinterval
27 compression
28 forwardx11
29 >>> config['bitbucket.org']['ForwardX11']
30 'yes'

configparser增刪改查語法

 1 [section1]
 2 k1 = v1
 3 k2:v2
 4   
 5 [section2]
 6 k1 = v1
 7  
 8 import ConfigParser
 9   
10 config = ConfigParser.ConfigParser()
11 config.read('i.cfg')
12   
13 # ########## 讀 ##########
14 #secs = config.sections()
15 #print secs
16 #options = config.options('group2')
17 #print options
18   
19 #item_list = config.items('group2')
20 #print item_list
21   
22 #val = config.get('group1','key')
23 #val = config.getint('group1','key')
24   
25 # ########## 改寫 ##########
26 #sec = config.remove_section('group1')
27 #config.write(open('i.cfg', "w"))
28   
29 #sec = config.has_section('wupeiqi')
30 #sec = config.add_section('wupeiqi')
31 #config.write(open('i.cfg', "w"))
32   
33   
34 #config.set('group2','k1',11111)
35 #config.write(open('i.cfg', "w"))
36   
37 #config.remove_option('group2','age')
38 #config.write(open('i.cfg', "w"))

hashlib模塊　　

用於加密相關的操做，3.x裏代替了md5模塊和sha模塊，主要提供 SHA1, SHA224, SHA256, SHA384, SHA512 ，MD5 算法

1 import md5
2 hash = md5.new()
3 hash.update('admin')
4 print hash.hexdigest()

廢棄

1 import sha
2 
3 hash = sha.new()
4 hash.update('admin')
5 print hash.hexdigest()

廢棄

 1 import hashlib
 2  
 3 m = hashlib.md5()
 4 m.update(b"Hello")
 5 m.update(b"It's me")
 6 print(m.digest())
 7 m.update(b"It's been a long time since last time we ...")
 8  
 9 print(m.digest()) #2進制格式hash
10 print(len(m.hexdigest())) #16進制格式hash
11 '''
12 def digest(self, *args, **kwargs): # real signature unknown
13     """ Return the digest value as a string of binary data. """
14     pass
15  
16 def hexdigest(self, *args, **kwargs): # real signature unknown
17     """ Return the digest value as a string of hexadecimal digits. """
18     pass
19  
20 '''
21 import hashlib
22  
23 # ######## md5 ########
24  
25 hash = hashlib.md5()
26 hash.update('admin')
27 print(hash.hexdigest())
28  
29 # ######## sha1 ########
30  
31 hash = hashlib.sha1()
32 hash.update('admin')
33 print(hash.hexdigest())
34  
35 # ######## sha256 ########
36  
37 hash = hashlib.sha256()
38 hash.update('admin')
39 print(hash.hexdigest())
40  
41  
42 # ######## sha384 ########
43  
44 hash = hashlib.sha384()
45 hash.update('admin')
46 print(hash.hexdigest())
47  
48 # ######## sha512 ########
49  
50 hash = hashlib.sha512()
51 hash.update('admin')
52 print(hash.hexdigest())

以上加密算法雖然依然很是厲害，但時候存在缺陷，即：經過撞庫能夠反解。因此，有必要對加密算法中添加自定義key再來作加密。

1 import hashlib
2  
3 # ######## md5 ########
4  
5 hash = hashlib.md5('898oaFs09f')
6 hash.update('admin')
7 print hash.hexdigest()

還不夠吊？python 還有一個 hmac 模塊，它內部對咱們建立 key 和內容再進行處理而後再加密

散列消息鑑別碼，簡稱HMAC，是一種基於消息鑑別碼MAC（Message Authentication Code）的鑑別機制。使用HMAC時,消息通信的雙方，經過驗證消息中加入的鑑別密鑰K來鑑別消息的真僞；

通常用於網絡通訊中消息加密，前提是雙方先要約定好key,就像接頭暗號同樣，而後消息發送把用key把消息加密，接收方用key ＋消息明文再加密，拿加密後的值跟發送者的相對比是否相等，這樣就能驗證消息的真實性，及發送者的合法性了。

1 import hmac
2 h = hmac.new(b'天王蓋地虎', b'寶塔鎮河妖')
3 print h.hexdigest()

更多關於md5,sha1,sha256等介紹的文章看這裏https://www.tbs-certificates.co.uk/FAQ/en/sha256.html

Subprocess模塊

The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several older modules and functions:

os.system os.spawn*

The recommended approach to invoking subprocesses is to use the run() function for all use cases it can handle. For more advanced use cases, the underlying Popen interface can be used directly.

The run() function was added in Python 3.5; if you need to retain compatibility with older versions, see the Older high-level API section.

(args, *, stdin=None, input=None, stdout=None, stderr=None, shell=False, timeout=None, check=False)subprocess.run

Run the command described by args. Wait for command to complete, then return a CompletedProcess instance.

The arguments shown above are merely the most common ones, described below in Frequently Used Arguments (hence the use of keyword-only notation in the abbreviated signature). The full function signature is largely the same as that of the Popen constructor - apart from timeout, input and check, all the arguments to this function are passed through to that interface.

This does not capture stdout or stderr by default. To do so, pass PIPE for the stdout and/or stderr arguments.

The timeout argument is passed to Popen.communicate(). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated.

The input argument is passed to Popen.communicate() and thus to the subprocess’s stdin. If used it must be a byte sequence, or a string if universal_newlines=True. When used, the internal Popen object is automatically created withstdin=PIPE, and the stdin argument may not be used as well.

If check is True, and the process exits with a non-zero exit code, a CalledProcessError exception will be raised. Attributes of that exception hold the arguments, the exit code, and stdout and stderr if they were captured.

經常使用subprocess方法示例

 1 #執行命令，返回命令執行狀態 ， 0 or 非0
 2 >>> retcode = subprocess.call(["ls", "-l"])
 3 
 4 #執行命令，若是命令結果爲0，就正常返回，不然拋異常
 5 >>> subprocess.check_call(["ls", "-l"])
 6 0
 7 
 8 #接收字符串格式命令，返回元組形式，第1個元素是執行狀態，第2個是命令結果 
 9 >>> subprocess.getstatusoutput('ls /bin/ls')
10 (0, '/bin/ls')
11 
12 #接收字符串格式命令，並返回結果
13 >>> subprocess.getoutput('ls /bin/ls')
14 '/bin/ls'
15 
16 #執行命令，並返回結果，注意是返回結果，不是打印，下例結果返回給res
17 >>> res=subprocess.check_output(['ls','-l'])
18 >>> res
19 b'total 0\ndrwxr-xr-x 12 alex staff 408 Nov 2 11:05 OldBoyCRM\n'
20 
21 #上面那些方法，底層都是封裝的subprocess.Popen
22 poll()
23 Check if child process has terminated. Returns returncode
24 
25 wait()
26 Wait for child process to terminate. Returns returncode attribute.
27 
28 
29 terminate() 殺掉所啓動進程
30 communicate() 等待任務結束
31 
32 stdin 標準輸入
33 
34 stdout 標準輸出
35 
36 stderr 標準錯誤
37 
38 pid
39 The process ID of the child process.
40 
41 #例子
42 >>> p = subprocess.Popen("df -h|grep disk",stdin=subprocess.PIPE,stdout=subprocess.PIPE,shell=True)
43 >>> p.stdout.read()
44 b'/dev/disk1 465Gi 64Gi 400Gi 14% 16901472 104938142 14% /\n'

 1 >>> subprocess.run(["ls", "-l"])  # doesn't capture output
 2 CompletedProcess(args=['ls', '-l'], returncode=0)
 3  
 4 >>> subprocess.run("exit 1", shell=True, check=True)
 5 Traceback (most recent call last):
 6   ...
 7 subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
 8  
 9 >>> subprocess.run(["ls", "-l", "/dev/null"], stdout=subprocess.PIPE)
10 CompletedProcess(args=['ls', '-l', '/dev/null'], returncode=0,
11 stdout=b'crw-rw-rw- 1 root root 1, 3 Jan 23 16:23 /dev/null\n')

調用subprocess.run(...)是推薦的經常使用方法，在大多數狀況下能知足需求，但若是你可能須要進行一些複雜的與系統的交互的話，你還能夠用subprocess.Popen(),語法以下：

1 p = subprocess.Popen("find / -size +1000000 -exec ls -shl {} \;",shell=True,stdout=subprocess.PIPE)
2 print(p.stdout.read())

可用參數：

args：shell命令，能夠是字符串或者序列類型（如：list，元組）
bufsize：指定緩衝。0 無緩衝,1 行緩衝,其餘緩衝區大小,負值系統緩衝
stdin, stdout, stderr：分別表示程序的標準輸入、輸出、錯誤句柄
preexec_fn：只在Unix平臺下有效，用於指定一個可執行對象（callable object），它將在子進程運行以前被調用
close_sfs：在windows平臺下，若是close_fds被設置爲True，則新建立的子進程將不會繼承父進程的輸入、輸出、錯誤管道。
因此不能將close_fds設置爲True同時重定向子進程的標準輸入、輸出與錯誤(stdin, stdout, stderr)。
shell：同上
cwd：用於設置子進程的當前目錄
env：用於指定子進程的環境變量。若是env = None，子進程的環境變量將從父進程中繼承。
universal_newlines：不一樣系統的換行符不一樣，True -> 贊成使用 \n
startupinfo與createionflags只在windows下有效
將被傳遞給底層的CreateProcess()函數，用於設置子進程的一些屬性，如：主窗口的外觀，進程的優先級等等

終端輸入的命令分爲兩種：

輸入便可獲得輸出，如：ifconfig
輸入進行某環境，依賴再輸入，如：python

須要交互的命令示例

 1 import subprocess
 2  
 3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
 4 obj.stdin.write('print 1 \n ')
 5 obj.stdin.write('print 2 \n ')
 6 obj.stdin.write('print 3 \n ')
 7 obj.stdin.write('print 4 \n ')
 8  
 9 out_error_list = obj.communicate(timeout=10)
10 print out_error_list

subprocess實現sudo 自動輸入密碼

 1 import subprocess
 2  
 3 def mypass():
 4     mypass = '123' #or get the password from anywhere
 5     return mypass
 6  
 7 echo = subprocess.Popen(['echo',mypass()],
 8                         stdout=subprocess.PIPE,
 9                         )
10  
11 sudo = subprocess.Popen(['sudo','-S','iptables','-L'],
12                         stdin=echo.stdout,
13                         stdout=subprocess.PIPE,
14                         )
15  
16 end_of_pipe = sudo.stdout
17  
18 print "Password ok \n Iptables Chains %s" % end_of_pipe.read()

logging模塊　　

用於便捷記錄日誌且線程安全的模塊

 1 import logging
 2  
 3  
 4 logging.basicConfig(filename='log.log',
 5                     format='%(asctime)s - %(name)s - %(levelname)s -%(module)s:  %(message)s',
 6                     datefmt='%Y-%m-%d %H:%M:%S %p',
 7                     level=10)
 8  
 9 logging.debug('debug')
10 logging.info('info')
11 logging.warning('warning')
12 logging.error('error')
13 logging.critical('critical')
14 logging.log(10,'log')

對於等級：

1 CRITICAL = 50
2 FATAL = CRITICAL
3 ERROR = 40
4 WARNING = 30
5 WARN = WARNING
6 INFO = 20
7 DEBUG = 10
8 NOTSET = 0

只有大於當前日誌等級的操做纔會被記錄。

對於格式，有以下屬性但是配置：

不少程序都有記錄日誌的需求，而且日誌中包含的信息即有正常的程序訪問日誌，還可能有錯誤、警告等信息輸出，python的logging模塊提供了標準的日誌接口，你能夠經過它存儲各類格式的日誌，logging的日誌能夠分爲 debug(), info(), warning(), error() and critical() 5個級別，下面咱們看一下怎麼用。

最簡單用法

1 import logging
2  
3 logging.warning("user [alex] attempted wrong password more than 3 times")
4 logging.critical("server is down")
5  
6 #輸出
7 WARNING:root:user [alex] attempted wrong password more than 3 times
8 CRITICAL:root:server is down

看一下這幾個日誌級別分別表明什麼意思

Level	When it’s used
`DEBUG`	Detailed information, typically of interest only when diagnosing problems.
`INFO`	Confirmation that things are working as expected.
`WARNING`	An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected.
`ERROR`	Due to a more serious problem, the software has not been able to perform some function.
`CRITICAL`	A serious error, indicating that the program itself may be unable to continue running.

若是想把日誌寫到文件裏，也很簡單

1 import logging
2  
3 logging.basicConfig(filename='example.log',level=logging.INFO)
4 logging.debug('This message should go to the log file')
5 logging.info('So should this')
6 logging.warning('And this, too')

其中下面這句中的level=loggin.INFO意思是，把日誌紀錄級別設置爲INFO，也就是說，只有比日誌是INFO或比INFO級別更高的日誌纔會被紀錄到文件裏，在這個例子，第一條日誌是不會被紀錄的，若是但願紀錄debug的日誌，那把日誌級別改爲DEBUG就好了。

1 logging.basicConfig(filename='example.log',level=logging.INFO)

感受上面的日誌格式忘記加上時間啦，日誌不知道時間怎麼行呢，下面就來加上!

1 import logging
2 logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
3 logging.warning('is when this event was logged.')
4  
5 #輸出
6 12/12/2010 11:46:36 AM is when this event was logged.

日誌格式

%(name)s	Logger的名字
%(levelno)s	數字形式的日誌級別
%(levelname)s	文本形式的日誌級別
%(pathname)s	調用日誌輸出函數的模塊的完整路徑名，可能沒有
%(filename)s	調用日誌輸出函數的模塊的文件名
%(module)s	調用日誌輸出函數的模塊名
%(funcName)s	調用日誌輸出函數的函數名
%(lineno)d	調用日誌輸出函數的語句所在的代碼行
%(created)f	當前時間，用UNIX標準的表示時間的浮點數表示
%(relativeCreated)d	輸出日誌信息時的，自Logger建立以來的毫秒數
%(asctime)s	字符串形式的當前時間。默認格式是「2003-07-08 16:49:45,896」。逗號後面的是毫秒
%(thread)d	線程ID。可能沒有
%(threadName)s	線程名。可能沒有
%(process)d	進程ID。可能沒有
%(message)s	用戶輸出的消息

若是想同時把log打印在屏幕和文件日誌裏，就須要瞭解一點複雜的知識了

Python 使用logging模塊記錄日誌涉及四個主要類，使用官方文檔中的歸納最爲合適：

logger提供了應用程序能夠直接使用的接口；

handler將(logger建立的)日誌記錄發送到合適的目的輸出；

filter提供了細度設備來決定輸出哪條日誌記錄；

formatter決定日誌記錄的最終輸出格式。

logger
每一個程序在輸出信息以前都要得到一個Logger。Logger一般對應了程序的模塊名，好比聊天工具的圖形界面模塊能夠這樣得到它的Logger：
LOG=logging.getLogger(」chat.gui」)
而核心模塊能夠這樣：
LOG=logging.getLogger(」chat.kernel」)

Logger.setLevel(lel):指定最低的日誌級別，低於lel的級別將被忽略。debug是最低的內置級別，critical爲最高
Logger.addFilter(filt)、Logger.removeFilter(filt):添加或刪除指定的filter
Logger.addHandler(hdlr)、Logger.removeHandler(hdlr)：增長或刪除指定的handler
Logger.debug()、Logger.info()、Logger.warning()、Logger.error()、Logger.critical()：能夠設置的日誌級別

handler

handler對象負責發送相關的信息到指定目的地。Python的日誌系統有多種Handler可使用。有些Handler能夠把信息輸出到控制檯，有些Logger能夠把信息輸出到文件，還有些 Handler能夠把信息發送到網絡上。若是以爲不夠用，還能夠編寫本身的Handler。能夠經過addHandler()方法添加多個多handler
Handler.setLevel(lel):指定被處理的信息級別，低於lel級別的信息將被忽略
Handler.setFormatter()：給這個handler選擇一個格式
Handler.addFilter(filt)、Handler.removeFilter(filt)：新增或刪除一個filter對象

每一個Logger能夠附加多個Handler。接下來咱們就來介紹一些經常使用的Handler：
1) logging.StreamHandler
使用這個Handler能夠向相似與sys.stdout或者sys.stderr的任何文件對象(file object)輸出信息。它的構造函數是：
StreamHandler([strm])
其中strm參數是一個文件對象。默認是sys.stderr

2) logging.FileHandler
和StreamHandler相似，用於向一個文件輸出日誌信息。不過FileHandler會幫你打開這個文件。它的構造函數是：
FileHandler(filename[,mode])
filename是文件名，必須指定一個文件名。
mode是文件的打開方式。參見Python內置函數open()的用法。默認是’a'，即添加到文件末尾。

3) logging.handlers.RotatingFileHandler
這個Handler相似於上面的FileHandler，可是它能夠管理文件大小。當文件達到必定大小以後，它會自動將當前日誌文件更名，而後建立一個新的同名日誌文件繼續輸出。好比日誌文件是chat.log。當chat.log達到指定的大小以後，RotatingFileHandler自動把文件更名爲chat.log.1。不過，若是chat.log.1已經存在，會先把chat.log.1重命名爲chat.log.2。。。最後從新建立 chat.log，繼續輸出日誌信息。它的構造函數是：
RotatingFileHandler( filename[, mode[, maxBytes[, backupCount]]])
其中filename和mode兩個參數和FileHandler同樣。
maxBytes用於指定日誌文件的最大文件大小。若是maxBytes爲0，意味着日誌文件能夠無限大，這時上面描述的重命名過程就不會發生。
backupCount用於指定保留的備份文件的個數。好比，若是指定爲2，當上面描述的重命名過程發生時，原有的chat.log.2並不會被改名，而是被刪除。

4) logging.handlers.TimedRotatingFileHandler
這個Handler和RotatingFileHandler相似，不過，它沒有經過判斷文件大小來決定什麼時候從新建立日誌文件，而是間隔必定時間就自動建立新的日誌文件。重命名的過程與RotatingFileHandler相似，不過新的文件不是附加數字，而是當前時間。它的構造函數是：
TimedRotatingFileHandler( filename [,when [,interval [,backupCount]]])
其中filename參數和backupCount參數和RotatingFileHandler具備相同的意義。
interval是時間間隔。
when參數是一個字符串。表示時間間隔的單位，不區分大小寫。它有如下取值：
S 秒
M 分
H 小時
D 天
W 每星期（interval==0時表明星期一）
midnight 天天凌晨

 1 import logging
 2  
 3 #create logger
 4 logger = logging.getLogger('TEST-LOG')
 5 logger.setLevel(logging.DEBUG)
 6  
 7  
 8 # create console handler and set level to debug
 9 ch = logging.StreamHandler()
10 ch.setLevel(logging.DEBUG)
11  
12 # create file handler and set level to warning
13 fh = logging.FileHandler("access.log")
14 fh.setLevel(logging.WARNING)
15 # create formatter
16 formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
17  
18 # add formatter to ch and fh
19 ch.setFormatter(formatter)
20 fh.setFormatter(formatter)
21  
22 # add ch and fh to logger
23 logger.addHandler(ch)
24 logger.addHandler(fh)
25  
26 # 'application' code
27 logger.debug('debug message')
28 logger.info('info message')
29 logger.warn('warn message')
30 logger.error('error message')
31 logger.critical('critical message')

文件自動截斷例子

 1 import logging
 2 
 3 from logging import handlers
 4 
 5 logger = logging.getLogger(__name__)
 6 
 7 log_file = "timelog.log"
 8 #fh = handlers.RotatingFileHandler(filename=log_file,maxBytes=10,backupCount=3)
 9 fh = handlers.TimedRotatingFileHandler(filename=log_file,when="S",interval=5,backupCount=3)
10 
11 
12 formatter = logging.Formatter('%(asctime)s %(module)s:%(lineno)d %(message)s')
13 
14 fh.setFormatter(formatter)
15 
16 logger.addHandler(fh)
17 
18 
19 logger.warning("test1")
20 logger.warning("test12")
21 logger.warning("test13")
22 logger.warning("test14")

re模塊　　

經常使用正則表達式符號

 1 '.'     默認匹配除\n以外的任意一個字符，若指定flag DOTALL,則匹配任意字符，包括換行
 2 '^'     匹配字符開頭，若指定flags MULTILINE,這種也能夠匹配上(r"^a","\nabc\neee",flags=re.MULTILINE)
 3 '$'     匹配字符結尾，或e.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()也能夠
 4 '*'     匹配*號前的字符0次或屢次，re.findall("ab*","cabb3abcbbac")  結果爲['abb', 'ab', 'a']
 5 '+'     匹配前一個字符1次或屢次，re.findall("ab+","ab+cd+abb+bba") 結果['ab', 'abb']
 6 '?'     匹配前一個字符1次或0次
 7 '{m}'   匹配前一個字符m次
 8 '{n,m}' 匹配前一個字符n到m次，re.findall("ab{1,3}","abb abc abbcbbb") 結果'abb', 'ab', 'abb']
 9 '|'     匹配|左或|右的字符，re.search("abc|ABC","ABCBabcCD").group() 結果'ABC'
10 '(...)' 分組匹配，re.search("(abc){2}a(123|456)c", "abcabca456c").group() 結果 abcabca456c
11  
12  
13 '\A'    只從字符開頭匹配，re.search("\Aabc","alexabc") 是匹配不到的
14 '\Z'    匹配字符結尾，同$
15 '\d'    匹配數字0-9
16 '\D'    匹配非數字
17 '\w'    匹配[A-Za-z0-9]
18 '\W'    匹配非[A-Za-z0-9]
19 's'     匹配空白字符、\t、\n、\r , re.search("\s+","ab\tc1\n3").group() 結果 '\t'

1 '(?P<name>...)' 分組匹配 re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict("city") 結果{'province': '3714', 'city': '81', 'birthday': '1993'}

最經常使用的匹配語法

1 re.match 從頭開始匹配
2 re.search 匹配包含
3 re.findall 把全部匹配到的字符放到以列表中的元素返回
4 re.splitall 以匹配到的字符當作列表分隔符
5 re.sub      匹配字符並替換

反斜槓的困擾
與大多數編程語言相同，正則表達式裏使用"\"做爲轉義字符，這就可能形成反斜槓困擾。假如你須要匹配文本中的字符"\"，那麼使用編程語言表示的正則表達式裏將須要4個反斜槓"\\\\"：前兩個和後兩個分別用於在編程語言裏轉義成反斜槓，轉換成兩個反斜槓後再在正則表達式裏轉義成一個反斜槓。Python裏的原生字符串很好地解決了這個問題，這個例子中的正則表達式可使用r"\\"表示。一樣，匹配一個數字的"\\d"能夠寫成r"\d"。有了原生字符串，你不再用擔憂是否是漏寫了反斜槓，寫出來的表達式也更直觀。

僅需輕輕知道的幾個匹配模式

1 re.I(re.IGNORECASE): 忽略大小寫（括號內是完整寫法，下同）
2 M(MULTILINE): 多行模式，改變'^'和'$'的行爲（參見上圖）
3 S(DOTALL): 點任意匹配模式，改變'.'的行爲

本節做業

開發一個簡單的python計算器

實現加減乘除及拓號優先級解析
用戶輸入 1 - 2 * ( (60-30 +(-40/5) * (9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2) )等相似公式後，必須本身解析裏面的(),+,-,*,/符號和公式(不能調用eval等相似功能偷懶實現)，運算後得出結果，結果必須與真實的計算器所得出的結果一致

hint:

re.search(r'\([^()]+\)',s).group()

'(-40/5)'