本節大綱:html
模塊,用一砣代碼實現了某個功能的代碼集合。 node
相似於函數式編程和麪向過程編程,函數式編程則完成一個功能,其餘代碼用來調用便可,提供了代碼的重用性和代碼間的耦合。而對於一個複雜的功能來,可能須要多個函數才能完成(函數又能夠在不一樣的.py文件中),n個 .py 文件組成的代碼集合就稱爲模塊。python
如:os 是系統相關的模塊;file是文件操做相關的模塊git
模塊分爲三種:程序員
自定義模塊 和開源模塊的使用參考 http://www.cnblogs.com/wupeiqi/articles/4963027.html 正則表達式
一、定義模塊算法
情景一:shell
情景二:macos
情景三:編程
二、導入模塊
Python之因此應用愈來愈普遍,在必定程度上也依賴於其爲程序員提供了大量的模塊以供使用,若是想要使用模塊,則須要導入。導入模塊有一下幾種方法:
1 import module 2 from module.xx.xx import xx 3 from module.xx.xx import xx as rename 4 from module.xx.xx import *
導入模塊其實就是告訴Python解釋器去解釋那個py文件
那麼問題來了,導入模塊時是根據那個路徑做爲基準來進行的呢?即:sys.path
1 import sys 2 print sys.path 3 4 結果: 5 ['/Users/wupeiqi/PycharmProjects/calculator/p1/pp1',
'/usr/local/lib/python2.7/site-packages/setuptools-15.2-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/MySQL_python-1.2.4b4-py2.7-macosx-10.10-x86_64.egg',
'/usr/local/lib/python2.7/site-packages/xlutils-1.7.1-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/xlwt-1.0.0-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/xlrd-0.9.3-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/tornado-4.1-py2.7-macosx-10.10-x86_64.egg',
'/usr/local/lib/python2.7/site-packages/backports.ssl_match_hostname-3.4.0.2-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/certifi-2015.4.28-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/pyOpenSSL-0.15.1-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/six-1.9.0-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/cryptography-0.9.1-py2.7-macosx-10.10-x86_64.egg',
'/usr/local/lib/python2.7/site-packages/cffi-1.1.1-py2.7-macosx-10.10-x86_64.egg',
'/usr/local/lib/python2.7/site-packages/ipaddress-1.0.7-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/enum34-1.0.4-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/pyasn1-0.1.7-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/idna-2.0-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/pycparser-2.13-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/Django-1.7.8-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/paramiko-1.10.1-py2.7.egg',
'/usr/local/lib/python2.7/site-packages/gevent-1.0.2-py2.7-macosx-10.10-x86_64.egg',
'/usr/local/lib/python2.7/site-packages/greenlet-0.4.7-py2.7-macosx-10.10-x86_64.egg',
'/Users/wupeiqi/PycharmProjects/calculator', '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python27.zip',
'/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7',
'/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin',
'/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac',
'/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages',
'/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk',
'/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-old',
'/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/site-packages',
'/Library/Python/2.7/site-packages']
若是sys.path路徑列表沒有你想要的路徑,能夠經過 sys.path.append('路徑') 添加。
經過os模塊能夠獲取各類目錄,例如:
1 import sys 2 import os 3 4 pre_path = os.path.abspath('../') 5 sys.path.append(pre_path)
1、下載安裝
下載安裝有兩種方式:
yum pip apt-get ...
1 下載源碼 2 解壓源碼 3 進入目錄 4 編譯源碼 python setup.py build 5 安裝源碼 python setup.py install
注:在使用源碼安裝時,須要使用到gcc編譯和python開發環境,因此,須要先執行:
1 yum install gcc 2 yum install python-devel 3 或 4 apt-get python-dev
安裝成功後,模塊會自動安裝到 sys.path 中的某個目錄中,如:
1 /usr/lib/python2.7/site-packages/
2、導入模塊
同自定義模塊中導入的方式
3、模塊 paramiko
paramiko是一個用於作遠程控制的模塊,使用該模塊能夠對遠程服務器進行命令或文件操做,值得一說的是,fabric和ansible內部的遠程管理就是使用的paramiko來現實。
一、下載安裝
pip3 install paramiko
或
1 # pycrypto,因爲 paramiko 模塊內部依賴pycrypto,因此先下載安裝pycrypto 2 3 # 下載安裝 pycrypto 4 wget http://files.cnblogs.com/files/wupeiqi/pycrypto-2.6.1.tar.gz 5 tar -xvf pycrypto-2.6.1.tar.gz 6 cd pycrypto-2.6.1 7 python setup.py build 8 python setup.py install 9 10 # 進入python環境,導入Crypto檢查是否安裝成功 11 12 # 下載安裝 paramiko 13 wget http://files.cnblogs.com/files/wupeiqi/paramiko-1.10.1.tar.gz 14 tar -xvf paramiko-1.10.1.tar.gz 15 cd paramiko-1.10.1 16 python setup.py build 17 python setup.py install 18 19 # 進入python環境,導入paramiko檢查是否安裝成功
二、使用模塊
#!/usr/bin/env python #coding:utf-8 import paramiko ssh = paramiko.SSHClient() ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) ssh.connect('192.168.1.108', 22, 'alex', '123') stdin, stdout, stderr = ssh.exec_command('df') print stdout.read() ssh.close(); 執行命令 - 經過用戶名和密碼鏈接服務器
import paramiko private_key_path = '/home/auto/.ssh/id_rsa' key = paramiko.RSAKey.from_private_key_file(private_key_path) ssh = paramiko.SSHClient() ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy()) ssh.connect('主機名 ', 端口, '用戶名', key) stdin, stdout, stderr = ssh.exec_command('df') print stdout.read() ssh.close() 執行命令 - 過密鑰連接服務器
1 import os,sys 2 import paramiko 3 4 t = paramiko.Transport(('182.92.219.86',22)) 5 t.connect(username='wupeiqi',password='123') 6 sftp = paramiko.SFTPClient.from_transport(t) 7 sftp.put('/tmp/test.py','/tmp/test.py') 8 t.close() 9 10 11 import os,sys 12 import paramiko 13 14 t = paramiko.Transport(('182.92.219.86',22)) 15 t.connect(username='wupeiqi',password='123') 16 sftp = paramiko.SFTPClient.from_transport(t) 17 sftp.get('/tmp/test.py','/tmp/test2.py') 18 t.close() 19 20 上傳或者下載文件 - 經過用戶名和密碼
1 import paramiko 2 3 pravie_key_path = '/home/auto/.ssh/id_rsa' 4 key = paramiko.RSAKey.from_private_key_file(pravie_key_path) 5 6 t = paramiko.Transport(('182.92.219.86',22)) 7 t.connect(username='wupeiqi',pkey=key) 8 9 sftp = paramiko.SFTPClient.from_transport(t) 10 sftp.put('/tmp/test3.py','/tmp/test3.py') 11 12 t.close() 13 14 import paramiko 15 16 pravie_key_path = '/home/auto/.ssh/id_rsa' 17 key = paramiko.RSAKey.from_private_key_file(pravie_key_path) 18 19 t = paramiko.Transport(('182.92.219.86',22)) 20 t.connect(username='wupeiqi',pkey=key) 21 22 sftp = paramiko.SFTPClient.from_transport(t) 23 sftp.get('/tmp/test3.py','/tmp/test4.py') 24 25 t.close() 26 27 上傳或下載文件 - 經過密鑰
時間相關的操做,時間有三種表示方式:
1 print time.time() 2 print time.mktime(time.localtime()) 3 4 print time.gmtime() #可加時間戳參數 5 print time.localtime() #可加時間戳參數 6 print time.strptime('2014-11-11', '%Y-%m-%d') 7 8 print time.strftime('%Y-%m-%d') #默認當前時間 9 print time.strftime('%Y-%m-%d',time.localtime()) #默認當前時間 10 print time.asctime() 11 print time.asctime(time.localtime()) 12 print time.ctime(time.time()) 13 14 import datetime 15 ''' 16 datetime.date:表示日期的類。經常使用的屬性有year, month, day 17 datetime.time:表示時間的類。經常使用的屬性有hour, minute, second, microsecond 18 datetime.datetime:表示日期時間 19 datetime.timedelta:表示時間間隔,即兩個時間點之間的長度 20 timedelta([days[, seconds[, microseconds[, milliseconds[, minutes[, hours[, weeks]]]]]]]) 21 strftime("%Y-%m-%d") 22 ''' 23 import datetime 24 print datetime.datetime.now() 25 print datetime.datetime.now() - datetime.timedelta(days=5)
1 #_*_coding:utf-8_*_ 2 __author__ = 'Alex Li' 3 4 import time 5 6 7 # print(time.clock()) #返回處理器時間,3.3開始已廢棄 , 改爲了time.process_time()測量處理器運算時間,不包括sleep時間,不穩定,mac上測不出來 8 # print(time.altzone) #返回與utc時間的時間差,以秒計算\ 9 # print(time.asctime()) #返回時間格式"Fri Aug 19 11:14:16 2016", 10 # print(time.localtime()) #返回本地時間 的struct time對象格式 11 # print(time.gmtime(time.time()-800000)) #返回utc時間的struc時間對象格式 12 13 # print(time.asctime(time.localtime())) #返回時間格式"Fri Aug 19 11:14:16 2016", 14 #print(time.ctime()) #返回Fri Aug 19 12:38:29 2016 格式, 同上 15 16 17 18 # 日期字符串 轉成 時間戳 19 # string_2_struct = time.strptime("2016/05/22","%Y/%m/%d") #將 日期字符串 轉成 struct時間對象格式 20 # print(string_2_struct) 21 # # 22 # struct_2_stamp = time.mktime(string_2_struct) #將struct時間對象轉成時間戳 23 # print(struct_2_stamp) 24 25 26 27 #將時間戳轉爲字符串格式 28 # print(time.gmtime(time.time()-86640)) #將utc時間戳轉換成struct_time格式 29 # print(time.strftime("%Y-%m-%d %H:%M:%S",time.gmtime()) ) #將utc struct_time格式轉成指定的字符串格式 30 31 32 33 34 35 #時間加減 36 import datetime 37 38 # print(datetime.datetime.now()) #返回 2016-08-19 12:47:03.941925 39 #print(datetime.date.fromtimestamp(time.time()) ) # 時間戳直接轉成日期格式 2016-08-19 40 # print(datetime.datetime.now() ) 41 # print(datetime.datetime.now() + datetime.timedelta(3)) #當前時間+3天 42 # print(datetime.datetime.now() + datetime.timedelta(-3)) #當前時間-3天 43 # print(datetime.datetime.now() + datetime.timedelta(hours=3)) #當前時間+3小時 44 # print(datetime.datetime.now() + datetime.timedelta(minutes=30)) #當前時間+30分 45 46 47 # 48 # c_time = datetime.datetime.now() 49 # print(c_time.replace(minute=3,hour=2)) #時間替換
Directive | Meaning | Notes |
---|---|---|
%a |
Locale’s abbreviated weekday name. | |
%A |
Locale’s full weekday name. | |
%b |
Locale’s abbreviated month name. | |
%B |
Locale’s full month name. | |
%c |
Locale’s appropriate date and time representation. | |
%d |
Day of the month as a decimal number [01,31]. | |
%H |
Hour (24-hour clock) as a decimal number [00,23]. | |
%I |
Hour (12-hour clock) as a decimal number [01,12]. | |
%j |
Day of the year as a decimal number [001,366]. | |
%m |
Month as a decimal number [01,12]. | |
%M |
Minute as a decimal number [00,59]. | |
%p |
Locale’s equivalent of either AM or PM. | (1) |
%S |
Second as a decimal number [00,61]. | (2) |
%U |
Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0. | (3) |
%w |
Weekday as a decimal number [0(Sunday),6]. | |
%W |
Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0. | (3) |
%x |
Locale’s appropriate date representation. | |
%X |
Locale’s appropriate time representation. | |
%y |
Year without century as a decimal number [00,99]. | |
%Y |
Year with century as a decimal number. | |
%z |
Time zone offset indicating a positive or negative time difference from UTC/GMT of the form +HHMM or -HHMM, where H represents decimal hour digits and M represents decimal minute digits [-23:59, +23:59]. | |
%Z |
Time zone name (no characters if no time zone exists). | |
%% |
A literal '%' character. |
隨機數
1 mport random 2 print random.random() 3 print random.randint(1,2) 4 print random.randrange(1,10)
生成隨機驗證碼
1 import random 2 checkcode = '' 3 for i in range(4): 4 current = random.randrange(0,4) 5 if current != i: 6 temp = chr(random.randint(65,90)) 7 else: 8 temp = random.randint(0,9) 9 checkcode += str(temp) 10 print checkcode
提供對操做系統進行調用的接口
1 os.getcwd() 獲取當前工做目錄,即當前python腳本工做的目錄路徑 2 os.chdir("dirname") 改變當前腳本工做目錄;至關於shell下cd 3 os.curdir 返回當前目錄: ('.') 4 os.pardir 獲取當前目錄的父目錄字符串名:('..') 5 os.makedirs('dirname1/dirname2') 可生成多層遞歸目錄 6 os.removedirs('dirname1') 若目錄爲空,則刪除,並遞歸到上一級目錄,如若也爲空,則刪除,依此類推 7 os.mkdir('dirname') 生成單級目錄;至關於shell中mkdir dirname 8 os.rmdir('dirname') 刪除單級空目錄,若目錄不爲空則沒法刪除,報錯;至關於shell中rmdir dirname 9 os.listdir('dirname') 列出指定目錄下的全部文件和子目錄,包括隱藏文件,並以列表方式打印 10 os.remove() 刪除一個文件 11 os.rename("oldname","newname") 重命名文件/目錄 12 os.stat('path/filename') 獲取文件/目錄信息 13 os.sep 輸出操做系統特定的路徑分隔符,win下爲"\\",Linux下爲"/" 14 os.linesep 輸出當前平臺使用的行終止符,win下爲"\t\n",Linux下爲"\n" 15 os.pathsep 輸出用於分割文件路徑的字符串 16 os.name 輸出字符串指示當前使用平臺。win->'nt'; Linux->'posix' 17 os.system("bash command") 運行shell命令,直接顯示 18 os.environ 獲取系統環境變量 19 os.path.abspath(path) 返回path規範化的絕對路徑 20 os.path.split(path) 將path分割成目錄和文件名二元組返回 21 os.path.dirname(path) 返回path的目錄。其實就是os.path.split(path)的第一個元素 22 os.path.basename(path) 返回path最後的文件名。如何path以/或\結尾,那麼就會返回空值。即os.path.split(path)的第二個元素 23 os.path.exists(path) 若是path存在,返回True;若是path不存在,返回False 24 os.path.isabs(path) 若是path是絕對路徑,返回True 25 os.path.isfile(path) 若是path是一個存在的文件,返回True。不然返回False 26 os.path.isdir(path) 若是path是一個存在的目錄,則返回True。不然返回False 27 os.path.join(path1[, path2[, ...]]) 將多個路徑組合後返回,第一個絕對路徑以前的參數將被忽略 28 os.path.getatime(path) 返回path所指向的文件或者目錄的最後存取時間 29 os.path.getmtime(path) 返回path所指向的文件或者目錄的最後修改時間
更多猛擊這裏
執行系統命令
能夠執行shell命令的相關模塊和函數有:
1 import commands 2 3 result = commands.getoutput('cmd') 4 result = commands.getstatus('cmd') 5 result = commands.getstatusoutput('cmd')
以上執行shell命令的相關的模塊和函數的功能均在 subprocess 模塊中實現,並提供了更豐富的功能。
call
執行命令,返回狀態碼
1 ret = subprocess.call(["ls", "-l"], shell=False) 2 ret = subprocess.call("ls -l", shell=True)
shell = True ,容許 shell 命令是字符串形式
check_call
執行命令,若是執行狀態碼是 0 ,則返回0,不然拋異常
1 subprocess.check_call(["ls", "-l"]) 2 subprocess.check_call("exit 1", shell=True)
check_output
執行命令,若是狀態碼是 0 ,則返回執行結果,不然拋異常
1 subprocess.check_output(["echo", "Hello World!"]) 2 subprocess.check_output("exit 1", shell=True)
subprocess.Popen(...)
用於執行復雜的系統命令
參數:
1 import subprocess 2 ret1 = subprocess.Popen(["mkdir","t1"]) 3 ret2 = subprocess.Popen("mkdir t2", shell=True)
終端輸入的命令分爲兩種:
1 import subprocess 2 obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)
1 import subprocess 2 3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 4 obj.stdin.write('print 1 \n ') 5 obj.stdin.write('print 2 \n ') 6 obj.stdin.write('print 3 \n ') 7 obj.stdin.write('print 4 \n ') 8 obj.stdin.close() 9 10 cmd_out = obj.stdout.read() 11 obj.stdout.close() 12 cmd_error = obj.stderr.read() 13 obj.stderr.close() 14 15 print cmd_out 16 print cmd_error
1 import subprocess 2 3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 4 obj.stdin.write('print 1 \n ') 5 obj.stdin.write('print 2 \n ') 6 obj.stdin.write('print 3 \n ') 7 obj.stdin.write('print 4 \n ') 8 9 out_error_list = obj.communicate() 10 print out_error_list
1 import subprocess 2 3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 4 out_error_list = obj.communicate('print "hello"') 5 print out_error_list
1 sys.argv 命令行參數List,第一個元素是程序自己路徑 2 sys.exit(n) 退出程序,正常退出時exit(0) 3 sys.version 獲取Python解釋程序的版本信息 4 sys.maxint 最大的Int值 5 sys.path 返回模塊的搜索路徑,初始化時使用PYTHONPATH環境變量的值 6 sys.platform 返回操做系統平臺名稱 7 sys.stdout.write('please:') 8 val = sys.stdin.readline()[:-1]
直接參考 http://www.cnblogs.com/wupeiqi/articles/4963027.html
高級的 文件、文件夾、壓縮包 處理模塊
shutil.copyfileobj(fsrc, fdst[, length])
將文件內容拷貝到另外一個文件中,能夠部份內容
1 def copyfileobj(fsrc, fdst, length=16*1024): 2 """copy data from file-like object fsrc to file-like object fdst""" 3 while 1: 4 buf = fsrc.read(length) 5 if not buf: 6 break 7 fdst.write(buf)
shutil.copyfile(src, dst)
拷貝文件
1 def copyfile(src, dst): 2 """Copy data from src to dst""" 3 if _samefile(src, dst): 4 raise Error("`%s` and `%s` are the same file" % (src, dst)) 5 6 for fn in [src, dst]: 7 try: 8 st = os.stat(fn) 9 except OSError: 10 # File most likely does not exist 11 pass 12 else: 13 # XXX What about other special files? (sockets, devices...) 14 if stat.S_ISFIFO(st.st_mode): 15 raise SpecialFileError("`%s` is a named pipe" % fn) 16 17 with open(src, 'rb') as fsrc: 18 with open(dst, 'wb') as fdst: 19 copyfileobj(fsrc, fdst)
shutil.copymode(src, dst)
僅拷貝權限。內容、組、用戶均不變
1 def copymode(src, dst): 2 """Copy mode bits from src to dst""" 3 if hasattr(os, 'chmod'): 4 st = os.stat(src) 5 mode = stat.S_IMODE(st.st_mode) 6 os.chmod(dst, mode)
shutil.copystat(src, dst)
拷貝狀態的信息,包括:mode bits, atime, mtime, flags
1 def copystat(src, dst): 2 """Copy all stat info (mode bits, atime, mtime, flags) from src to dst""" 3 st = os.stat(src) 4 mode = stat.S_IMODE(st.st_mode) 5 if hasattr(os, 'utime'): 6 os.utime(dst, (st.st_atime, st.st_mtime)) 7 if hasattr(os, 'chmod'): 8 os.chmod(dst, mode) 9 if hasattr(os, 'chflags') and hasattr(st, 'st_flags'): 10 try: 11 os.chflags(dst, st.st_flags) 12 except OSError, why: 13 for err in 'EOPNOTSUPP', 'ENOTSUP': 14 if hasattr(errno, err) and why.errno == getattr(errno, err): 15 break 16 else: 17 raise
shutil.copy(src, dst)
拷貝文件和權限
1 def copy(src, dst): 2 """Copy data and mode bits ("cp src dst"). 3 4 The destination may be a directory. 5 6 """ 7 if os.path.isdir(dst): 8 dst = os.path.join(dst, os.path.basename(src)) 9 copyfile(src, dst) 10 copymode(src, dst)
shutil.copy2(src, dst)
拷貝文件和狀態信息
1 def copy2(src, dst): 2 """Copy data and all stat info ("cp -p src dst"). 3 4 The destination may be a directory. 5 6 """ 7 if os.path.isdir(dst): 8 dst = os.path.join(dst, os.path.basename(src)) 9 copyfile(src, dst) 10 copystat(src, dst)
shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
遞歸的去拷貝文件
例如:copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))
1 def ignore_patterns(*patterns): 2 """Function that can be used as copytree() ignore parameter. 3 4 Patterns is a sequence of glob-style patterns 5 that are used to exclude files""" 6 def _ignore_patterns(path, names): 7 ignored_names = [] 8 for pattern in patterns: 9 ignored_names.extend(fnmatch.filter(names, pattern)) 10 return set(ignored_names) 11 return _ignore_patterns 12 13 def copytree(src, dst, symlinks=False, ignore=None): 14 """Recursively copy a directory tree using copy2(). 15 16 The destination directory must not already exist. 17 If exception(s) occur, an Error is raised with a list of reasons. 18 19 If the optional symlinks flag is true, symbolic links in the 20 source tree result in symbolic links in the destination tree; if 21 it is false, the contents of the files pointed to by symbolic 22 links are copied. 23 24 The optional ignore argument is a callable. If given, it 25 is called with the `src` parameter, which is the directory 26 being visited by copytree(), and `names` which is the list of 27 `src` contents, as returned by os.listdir(): 28 29 callable(src, names) -> ignored_names 30 31 Since copytree() is called recursively, the callable will be 32 called once for each directory that is copied. It returns a 33 list of names relative to the `src` directory that should 34 not be copied. 35 36 XXX Consider this example code rather than the ultimate tool. 37 38 """ 39 names = os.listdir(src) 40 if ignore is not None: 41 ignored_names = ignore(src, names) 42 else: 43 ignored_names = set() 44 45 os.makedirs(dst) 46 errors = [] 47 for name in names: 48 if name in ignored_names: 49 continue 50 srcname = os.path.join(src, name) 51 dstname = os.path.join(dst, name) 52 try: 53 if symlinks and os.path.islink(srcname): 54 linkto = os.readlink(srcname) 55 os.symlink(linkto, dstname) 56 elif os.path.isdir(srcname): 57 copytree(srcname, dstname, symlinks, ignore) 58 else: 59 # Will raise a SpecialFileError for unsupported file types 60 copy2(srcname, dstname) 61 # catch the Error from the recursive copytree so that we can 62 # continue with other files 63 except Error, err: 64 errors.extend(err.args[0]) 65 except EnvironmentError, why: 66 errors.append((srcname, dstname, str(why))) 67 try: 68 copystat(src, dst) 69 except OSError, why: 70 if WindowsError is not None and isinstance(why, WindowsError): 71 # Copying file access times may fail on Windows 72 pass 73 else: 74 errors.append((src, dst, str(why))) 75 if errors: 76 raise Error, errors
shutil.rmtree(path[, ignore_errors[, onerror]])
遞歸的去刪除文件
1 def rmtree(path, ignore_errors=False, onerror=None): 2 """Recursively delete a directory tree. 3 4 If ignore_errors is set, errors are ignored; otherwise, if onerror 5 is set, it is called to handle the error with arguments (func, 6 path, exc_info) where func is os.listdir, os.remove, or os.rmdir; 7 path is the argument to that function that caused it to fail; and 8 exc_info is a tuple returned by sys.exc_info(). If ignore_errors 9 is false and onerror is None, an exception is raised. 10 11 """ 12 if ignore_errors: 13 def onerror(*args): 14 pass 15 elif onerror is None: 16 def onerror(*args): 17 raise 18 try: 19 if os.path.islink(path): 20 # symlinks to directories are forbidden, see bug #1669 21 raise OSError("Cannot call rmtree on a symbolic link") 22 except OSError: 23 onerror(os.path.islink, path, sys.exc_info()) 24 # can't continue even if onerror hook returns 25 return 26 names = [] 27 try: 28 names = os.listdir(path) 29 except os.error, err: 30 onerror(os.listdir, path, sys.exc_info()) 31 for name in names: 32 fullname = os.path.join(path, name) 33 try: 34 mode = os.lstat(fullname).st_mode 35 except os.error: 36 mode = 0 37 if stat.S_ISDIR(mode): 38 rmtree(fullname, ignore_errors, onerror) 39 else: 40 try: 41 os.remove(fullname) 42 except os.error, err: 43 onerror(os.remove, fullname, sys.exc_info()) 44 try: 45 os.rmdir(path) 46 except os.error: 47 onerror(os.rmdir, path, sys.exc_info())
shutil.move(src, dst)
遞歸的去移動文件
1 def move(src, dst): 2 """Recursively move a file or directory to another location. This is 3 similar to the Unix "mv" command. 4 5 If the destination is a directory or a symlink to a directory, the source 6 is moved inside the directory. The destination path must not already 7 exist. 8 9 If the destination already exists but is not a directory, it may be 10 overwritten depending on os.rename() semantics. 11 12 If the destination is on our current filesystem, then rename() is used. 13 Otherwise, src is copied to the destination and then removed. 14 A lot more could be done here... A look at a mv.c shows a lot of 15 the issues this implementation glosses over. 16 17 """ 18 real_dst = dst 19 if os.path.isdir(dst): 20 if _samefile(src, dst): 21 # We might be on a case insensitive filesystem, 22 # perform the rename anyway. 23 os.rename(src, dst) 24 return 25 26 real_dst = os.path.join(dst, _basename(src)) 27 if os.path.exists(real_dst): 28 raise Error, "Destination path '%s' already exists" % real_dst 29 try: 30 os.rename(src, real_dst) 31 except OSError: 32 if os.path.isdir(src): 33 if _destinsrc(src, dst): 34 raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst) 35 copytree(src, real_dst, symlinks=True) 36 rmtree(src) 37 else: 38 copy2(src, real_dst) 39 os.unlink(src)
shutil.make_archive(base_name, format,...)
建立壓縮包並返回文件路徑,例如:zip、tar
1 #將 /Users/wupeiqi/Downloads/test 下的文件打包放置當前程序目錄 2 3 import shutil 4 ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test') 5 6 7 #將 /Users/wupeiqi/Downloads/test 下的文件打包放置 /Users/wupeiqi/目錄 8 import shutil 9 ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
1 def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0, 2 dry_run=0, owner=None, group=None, logger=None): 3 """Create an archive file (eg. zip or tar). 4 5 'base_name' is the name of the file to create, minus any format-specific 6 extension; 'format' is the archive format: one of "zip", "tar", "bztar" 7 or "gztar". 8 9 'root_dir' is a directory that will be the root directory of the 10 archive; ie. we typically chdir into 'root_dir' before creating the 11 archive. 'base_dir' is the directory where we start archiving from; 12 ie. 'base_dir' will be the common prefix of all files and 13 directories in the archive. 'root_dir' and 'base_dir' both default 14 to the current directory. Returns the name of the archive file. 15 16 'owner' and 'group' are used when creating a tar archive. By default, 17 uses the current owner and group. 18 """ 19 save_cwd = os.getcwd() 20 if root_dir is not None: 21 if logger is not None: 22 logger.debug("changing into '%s'", root_dir) 23 base_name = os.path.abspath(base_name) 24 if not dry_run: 25 os.chdir(root_dir) 26 27 if base_dir is None: 28 base_dir = os.curdir 29 30 kwargs = {'dry_run': dry_run, 'logger': logger} 31 32 try: 33 format_info = _ARCHIVE_FORMATS[format] 34 except KeyError: 35 raise ValueError, "unknown archive format '%s'" % format 36 37 func = format_info[0] 38 for arg, val in format_info[1]: 39 kwargs[arg] = val 40 41 if format != 'zip': 42 kwargs['owner'] = owner 43 kwargs['group'] = group 44 45 try: 46 filename = func(base_name, base_dir, **kwargs) 47 finally: 48 if root_dir is not None: 49 if logger is not None: 50 logger.debug("changing back to '%s'", save_cwd) 51 os.chdir(save_cwd) 52 53 return filename
shutil 對壓縮包的處理是調用 ZipFile 和 TarFile 兩個模塊來進行的,詳細:
1 import zipfile 2 3 # 壓縮 4 z = zipfile.ZipFile('laxi.zip', 'w') 5 z.write('a.log') 6 z.write('data.data') 7 z.close() 8 9 # 解壓 10 z = zipfile.ZipFile('laxi.zip', 'r') 11 z.extractall() 12 z.close()
1 import tarfile 2 3 # 壓縮 4 tar = tarfile.open('your.tar','w') 5 tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip') 6 tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip') 7 tar.close() 8 9 # 解壓 10 tar = tarfile.open('your.tar','r') 11 tar.extractall() # 可設置解壓地址 12 tar.close()
1 class ZipFile(object): 2 """ Class with methods to open, read, write, close, list zip files. 3 4 z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False) 5 6 file: Either the path to the file, or a file-like object. 7 If it is a path, the file will be opened and closed by ZipFile. 8 mode: The mode can be either read "r", write "w" or append "a". 9 compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib). 10 allowZip64: if True ZipFile will create files with ZIP64 extensions when 11 needed, otherwise it will raise an exception when this would 12 be necessary. 13 14 """ 15 16 fp = None # Set here since __del__ checks it 17 18 def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False): 19 """Open the ZIP file with mode read "r", write "w" or append "a".""" 20 if mode not in ("r", "w", "a"): 21 raise RuntimeError('ZipFile() requires mode "r", "w", or "a"') 22 23 if compression == ZIP_STORED: 24 pass 25 elif compression == ZIP_DEFLATED: 26 if not zlib: 27 raise RuntimeError,\ 28 "Compression requires the (missing) zlib module" 29 else: 30 raise RuntimeError, "That compression method is not supported" 31 32 self._allowZip64 = allowZip64 33 self._didModify = False 34 self.debug = 0 # Level of printing: 0 through 3 35 self.NameToInfo = {} # Find file info given name 36 self.filelist = [] # List of ZipInfo instances for archive 37 self.compression = compression # Method of compression 38 self.mode = key = mode.replace('b', '')[0] 39 self.pwd = None 40 self._comment = '' 41 42 # Check if we were passed a file-like object 43 if isinstance(file, basestring): 44 self._filePassed = 0 45 self.filename = file 46 modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'} 47 try: 48 self.fp = open(file, modeDict[mode]) 49 except IOError: 50 if mode == 'a': 51 mode = key = 'w' 52 self.fp = open(file, modeDict[mode]) 53 else: 54 raise 55 else: 56 self._filePassed = 1 57 self.fp = file 58 self.filename = getattr(file, 'name', None) 59 60 try: 61 if key == 'r': 62 self._RealGetContents() 63 elif key == 'w': 64 # set the modified flag so central directory gets written 65 # even if no files are added to the archive 66 self._didModify = True 67 elif key == 'a': 68 try: 69 # See if file is a zip file 70 self._RealGetContents() 71 # seek to start of directory and overwrite 72 self.fp.seek(self.start_dir, 0) 73 except BadZipfile: 74 # file is not a zip file, just append 75 self.fp.seek(0, 2) 76 77 # set the modified flag so central directory gets written 78 # even if no files are added to the archive 79 self._didModify = True 80 else: 81 raise RuntimeError('Mode must be "r", "w" or "a"') 82 except: 83 fp = self.fp 84 self.fp = None 85 if not self._filePassed: 86 fp.close() 87 raise 88 89 def __enter__(self): 90 return self 91 92 def __exit__(self, type, value, traceback): 93 self.close() 94 95 def _RealGetContents(self): 96 """Read in the table of contents for the ZIP file.""" 97 fp = self.fp 98 try: 99 endrec = _EndRecData(fp) 100 except IOError: 101 raise BadZipfile("File is not a zip file") 102 if not endrec: 103 raise BadZipfile, "File is not a zip file" 104 if self.debug > 1: 105 print endrec 106 size_cd = endrec[_ECD_SIZE] # bytes in central directory 107 offset_cd = endrec[_ECD_OFFSET] # offset of central directory 108 self._comment = endrec[_ECD_COMMENT] # archive comment 109 110 # "concat" is zero, unless zip was concatenated to another file 111 concat = endrec[_ECD_LOCATION] - size_cd - offset_cd 112 if endrec[_ECD_SIGNATURE] == stringEndArchive64: 113 # If Zip64 extension structures are present, account for them 114 concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator) 115 116 if self.debug > 2: 117 inferred = concat + offset_cd 118 print "given, inferred, offset", offset_cd, inferred, concat 119 # self.start_dir: Position of start of central directory 120 self.start_dir = offset_cd + concat 121 fp.seek(self.start_dir, 0) 122 data = fp.read(size_cd) 123 fp = cStringIO.StringIO(data) 124 total = 0 125 while total < size_cd: 126 centdir = fp.read(sizeCentralDir) 127 if len(centdir) != sizeCentralDir: 128 raise BadZipfile("Truncated central directory") 129 centdir = struct.unpack(structCentralDir, centdir) 130 if centdir[_CD_SIGNATURE] != stringCentralDir: 131 raise BadZipfile("Bad magic number for central directory") 132 if self.debug > 2: 133 print centdir 134 filename = fp.read(centdir[_CD_FILENAME_LENGTH]) 135 # Create ZipInfo instance to store file information 136 x = ZipInfo(filename) 137 x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH]) 138 x.comment = fp.read(centdir[_CD_COMMENT_LENGTH]) 139 x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET] 140 (x.create_version, x.create_system, x.extract_version, x.reserved, 141 x.flag_bits, x.compress_type, t, d, 142 x.CRC, x.compress_size, x.file_size) = centdir[1:12] 143 x.volume, x.internal_attr, x.external_attr = centdir[15:18] 144 # Convert date/time code to (year, month, day, hour, min, sec) 145 x._raw_time = t 146 x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F, 147 t>>11, (t>>5)&0x3F, (t&0x1F) * 2 ) 148 149 x._decodeExtra() 150 x.header_offset = x.header_offset + concat 151 x.filename = x._decodeFilename() 152 self.filelist.append(x) 153 self.NameToInfo[x.filename] = x 154 155 # update total bytes read from central directory 156 total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH] 157 + centdir[_CD_EXTRA_FIELD_LENGTH] 158 + centdir[_CD_COMMENT_LENGTH]) 159 160 if self.debug > 2: 161 print "total", total 162 163 164 def namelist(self): 165 """Return a list of file names in the archive.""" 166 l = [] 167 for data in self.filelist: 168 l.append(data.filename) 169 return l 170 171 def infolist(self): 172 """Return a list of class ZipInfo instances for files in the 173 archive.""" 174 return self.filelist 175 176 def printdir(self): 177 """Print a table of contents for the zip file.""" 178 print "%-46s %19s %12s" % ("File Name", "Modified ", "Size") 179 for zinfo in self.filelist: 180 date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6] 181 print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size) 182 183 def testzip(self): 184 """Read all the files and check the CRC.""" 185 chunk_size = 2 ** 20 186 for zinfo in self.filelist: 187 try: 188 # Read by chunks, to avoid an OverflowError or a 189 # MemoryError with very large embedded files. 190 with self.open(zinfo.filename, "r") as f: 191 while f.read(chunk_size): # Check CRC-32 192 pass 193 except BadZipfile: 194 return zinfo.filename 195 196 def getinfo(self, name): 197 """Return the instance of ZipInfo given 'name'.""" 198 info = self.NameToInfo.get(name) 199 if info is None: 200 raise KeyError( 201 'There is no item named %r in the archive' % name) 202 203 return info 204 205 def setpassword(self, pwd): 206 """Set default password for encrypted files.""" 207 self.pwd = pwd 208 209 @property 210 def comment(self): 211 """The comment text associated with the ZIP file.""" 212 return self._comment 213 214 @comment.setter 215 def comment(self, comment): 216 # check for valid comment length 217 if len(comment) > ZIP_MAX_COMMENT: 218 import warnings 219 warnings.warn('Archive comment is too long; truncating to %d bytes' 220 % ZIP_MAX_COMMENT, stacklevel=2) 221 comment = comment[:ZIP_MAX_COMMENT] 222 self._comment = comment 223 self._didModify = True 224 225 def read(self, name, pwd=None): 226 """Return file bytes (as a string) for name.""" 227 return self.open(name, "r", pwd).read() 228 229 def open(self, name, mode="r", pwd=None): 230 """Return file-like object for 'name'.""" 231 if mode not in ("r", "U", "rU"): 232 raise RuntimeError, 'open() requires mode "r", "U", or "rU"' 233 if not self.fp: 234 raise RuntimeError, \ 235 "Attempt to read ZIP archive that was already closed" 236 237 # Only open a new file for instances where we were not 238 # given a file object in the constructor 239 if self._filePassed: 240 zef_file = self.fp 241 should_close = False 242 else: 243 zef_file = open(self.filename, 'rb') 244 should_close = True 245 246 try: 247 # Make sure we have an info object 248 if isinstance(name, ZipInfo): 249 # 'name' is already an info object 250 zinfo = name 251 else: 252 # Get info object for name 253 zinfo = self.getinfo(name) 254 255 zef_file.seek(zinfo.header_offset, 0) 256 257 # Skip the file header: 258 fheader = zef_file.read(sizeFileHeader) 259 if len(fheader) != sizeFileHeader: 260 raise BadZipfile("Truncated file header") 261 fheader = struct.unpack(structFileHeader, fheader) 262 if fheader[_FH_SIGNATURE] != stringFileHeader: 263 raise BadZipfile("Bad magic number for file header") 264 265 fname = zef_file.read(fheader[_FH_FILENAME_LENGTH]) 266 if fheader[_FH_EXTRA_FIELD_LENGTH]: 267 zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH]) 268 269 if fname != zinfo.orig_filename: 270 raise BadZipfile, \ 271 'File name in directory "%s" and header "%s" differ.' % ( 272 zinfo.orig_filename, fname) 273 274 # check for encrypted flag & handle password 275 is_encrypted = zinfo.flag_bits & 0x1 276 zd = None 277 if is_encrypted: 278 if not pwd: 279 pwd = self.pwd 280 if not pwd: 281 raise RuntimeError, "File %s is encrypted, " \ 282 "password required for extraction" % name 283 284 zd = _ZipDecrypter(pwd) 285 # The first 12 bytes in the cypher stream is an encryption header 286 # used to strengthen the algorithm. The first 11 bytes are 287 # completely random, while the 12th contains the MSB of the CRC, 288 # or the MSB of the file time depending on the header type 289 # and is used to check the correctness of the password. 290 bytes = zef_file.read(12) 291 h = map(zd, bytes[0:12]) 292 if zinfo.flag_bits & 0x8: 293 # compare against the file type from extended local headers 294 check_byte = (zinfo._raw_time >> 8) & 0xff 295 else: 296 # compare against the CRC otherwise 297 check_byte = (zinfo.CRC >> 24) & 0xff 298 if ord(h[11]) != check_byte: 299 raise RuntimeError("Bad password for file", name) 300 301 return ZipExtFile(zef_file, mode, zinfo, zd, 302 close_fileobj=should_close) 303 except: 304 if should_close: 305 zef_file.close() 306 raise 307 308 def extract(self, member, path=None, pwd=None): 309 """Extract a member from the archive to the current working directory, 310 using its full name. Its file information is extracted as accurately 311 as possible. `member' may be a filename or a ZipInfo object. You can 312 specify a different directory using `path'. 313 """ 314 if not isinstance(member, ZipInfo): 315 member = self.getinfo(member) 316 317 if path is None: 318 path = os.getcwd() 319 320 return self._extract_member(member, path, pwd) 321 322 def extractall(self, path=None, members=None, pwd=None): 323 """Extract all members from the archive to the current working 324 directory. `path' specifies a different directory to extract to. 325 `members' is optional and must be a subset of the list returned 326 by namelist(). 327 """ 328 if members is None: 329 members = self.namelist() 330 331 for zipinfo in members: 332 self.extract(zipinfo, path, pwd) 333 334 def _extract_member(self, member, targetpath, pwd): 335 """Extract the ZipInfo object 'member' to a physical 336 file on the path targetpath. 337 """ 338 # build the destination pathname, replacing 339 # forward slashes to platform specific separators. 340 arcname = member.filename.replace('/', os.path.sep) 341 342 if os.path.altsep: 343 arcname = arcname.replace(os.path.altsep, os.path.sep) 344 # interpret absolute pathname as relative, remove drive letter or 345 # UNC path, redundant separators, "." and ".." components. 346 arcname = os.path.splitdrive(arcname)[1] 347 arcname = os.path.sep.join(x for x in arcname.split(os.path.sep) 348 if x not in ('', os.path.curdir, os.path.pardir)) 349 if os.path.sep == '\\': 350 # filter illegal characters on Windows 351 illegal = ':<>|"?*' 352 if isinstance(arcname, unicode): 353 table = {ord(c): ord('_') for c in illegal} 354 else: 355 table = string.maketrans(illegal, '_' * len(illegal)) 356 arcname = arcname.translate(table) 357 # remove trailing dots 358 arcname = (x.rstrip('.') for x in arcname.split(os.path.sep)) 359 arcname = os.path.sep.join(x for x in arcname if x) 360 361 targetpath = os.path.join(targetpath, arcname) 362 targetpath = os.path.normpath(targetpath) 363 364 # Create all upper directories if necessary. 365 upperdirs = os.path.dirname(targetpath) 366 if upperdirs and not os.path.exists(upperdirs): 367 os.makedirs(upperdirs) 368 369 if member.filename[-1] == '/': 370 if not os.path.isdir(targetpath): 371 os.mkdir(targetpath) 372 return targetpath 373 374 with self.open(member, pwd=pwd) as source, \ 375 file(targetpath, "wb") as target: 376 shutil.copyfileobj(source, target) 377 378 return targetpath 379 380 def _writecheck(self, zinfo): 381 """Check for errors before writing a file to the archive.""" 382 if zinfo.filename in self.NameToInfo: 383 import warnings 384 warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3) 385 if self.mode not in ("w", "a"): 386 raise RuntimeError, 'write() requires mode "w" or "a"' 387 if not self.fp: 388 raise RuntimeError, \ 389 "Attempt to write ZIP archive that was already closed" 390 if zinfo.compress_type == ZIP_DEFLATED and not zlib: 391 raise RuntimeError, \ 392 "Compression requires the (missing) zlib module" 393 if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED): 394 raise RuntimeError, \ 395 "That compression method is not supported" 396 if not self._allowZip64: 397 requires_zip64 = None 398 if len(self.filelist) >= ZIP_FILECOUNT_LIMIT: 399 requires_zip64 = "Files count" 400 elif zinfo.file_size > ZIP64_LIMIT: 401 requires_zip64 = "Filesize" 402 elif zinfo.header_offset > ZIP64_LIMIT: 403 requires_zip64 = "Zipfile size" 404 if requires_zip64: 405 raise LargeZipFile(requires_zip64 + 406 " would require ZIP64 extensions") 407 408 def write(self, filename, arcname=None, compress_type=None): 409 """Put the bytes from filename into the archive under the name 410 arcname.""" 411 if not self.fp: 412 raise RuntimeError( 413 "Attempt to write to ZIP archive that was already closed") 414 415 st = os.stat(filename) 416 isdir = stat.S_ISDIR(st.st_mode) 417 mtime = time.localtime(st.st_mtime) 418 date_time = mtime[0:6] 419 # Create ZipInfo instance to store file information 420 if arcname is None: 421 arcname = filename 422 arcname = os.path.normpath(os.path.splitdrive(arcname)[1]) 423 while arcname[0] in (os.sep, os.altsep): 424 arcname = arcname[1:] 425 if isdir: 426 arcname += '/' 427 zinfo = ZipInfo(arcname, date_time) 428 zinfo.external_attr = (st[0] & 0xFFFF) << 16L # Unix attributes 429 if compress_type is None: 430 zinfo.compress_type = self.compression 431 else: 432 zinfo.compress_type = compress_type 433 434 zinfo.file_size = st.st_size 435 zinfo.flag_bits = 0x00 436 zinfo.header_offset = self.fp.tell() # Start of header bytes 437 438 self._writecheck(zinfo) 439 self._didModify = True 440 441 if isdir: 442 zinfo.file_size = 0 443 zinfo.compress_size = 0 444 zinfo.CRC = 0 445 zinfo.external_attr |= 0x10 # MS-DOS directory flag 446 self.filelist.append(zinfo) 447 self.NameToInfo[zinfo.filename] = zinfo 448 self.fp.write(zinfo.FileHeader(False)) 449 return 450 451 with open(filename, "rb") as fp: 452 # Must overwrite CRC and sizes with correct data later 453 zinfo.CRC = CRC = 0 454 zinfo.compress_size = compress_size = 0 455 # Compressed size can be larger than uncompressed size 456 zip64 = self._allowZip64 and \ 457 zinfo.file_size * 1.05 > ZIP64_LIMIT 458 self.fp.write(zinfo.FileHeader(zip64)) 459 if zinfo.compress_type == ZIP_DEFLATED: 460 cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, 461 zlib.DEFLATED, -15) 462 else: 463 cmpr = None 464 file_size = 0 465 while 1: 466 buf = fp.read(1024 * 8) 467 if not buf: 468 break 469 file_size = file_size + len(buf) 470 CRC = crc32(buf, CRC) & 0xffffffff 471 if cmpr: 472 buf = cmpr.compress(buf) 473 compress_size = compress_size + len(buf) 474 self.fp.write(buf) 475 if cmpr: 476 buf = cmpr.flush() 477 compress_size = compress_size + len(buf) 478 self.fp.write(buf) 479 zinfo.compress_size = compress_size 480 else: 481 zinfo.compress_size = file_size 482 zinfo.CRC = CRC 483 zinfo.file_size = file_size 484 if not zip64 and self._allowZip64: 485 if file_size > ZIP64_LIMIT: 486 raise RuntimeError('File size has increased during compressing') 487 if compress_size > ZIP64_LIMIT: 488 raise RuntimeError('Compressed size larger than uncompressed size') 489 # Seek backwards and write file header (which will now include 490 # correct CRC and file sizes) 491 position = self.fp.tell() # Preserve current position in file 492 self.fp.seek(zinfo.header_offset, 0) 493 self.fp.write(zinfo.FileHeader(zip64)) 494 self.fp.seek(position, 0) 495 self.filelist.append(zinfo) 496 self.NameToInfo[zinfo.filename] = zinfo 497 498 def writestr(self, zinfo_or_arcname, bytes, compress_type=None): 499 """Write a file into the archive. The contents is the string 500 'bytes'. 'zinfo_or_arcname' is either a ZipInfo instance or 501 the name of the file in the archive.""" 502 if not isinstance(zinfo_or_arcname, ZipInfo): 503 zinfo = ZipInfo(filename=zinfo_or_arcname, 504 date_time=time.localtime(time.time())[:6]) 505 506 zinfo.compress_type = self.compression 507 if zinfo.filename[-1] == '/': 508 zinfo.external_attr = 0o40775 << 16 # drwxrwxr-x 509 zinfo.external_attr |= 0x10 # MS-DOS directory flag 510 else: 511 zinfo.external_attr = 0o600 << 16 # ?rw------- 512 else: 513 zinfo = zinfo_or_arcname 514 515 if not self.fp: 516 raise RuntimeError( 517 "Attempt to write to ZIP archive that was already closed") 518 519 if compress_type is not None: 520 zinfo.compress_type = compress_type 521 522 zinfo.file_size = len(bytes) # Uncompressed size 523 zinfo.header_offset = self.fp.tell() # Start of header bytes 524 self._writecheck(zinfo) 525 self._didModify = True 526 zinfo.CRC = crc32(bytes) & 0xffffffff # CRC-32 checksum 527 if zinfo.compress_type == ZIP_DEFLATED: 528 co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, 529 zlib.DEFLATED, -15) 530 bytes = co.compress(bytes) + co.flush() 531 zinfo.compress_size = len(bytes) # Compressed size 532 else: 533 zinfo.compress_size = zinfo.file_size 534 zip64 = zinfo.file_size > ZIP64_LIMIT or \ 535 zinfo.compress_size > ZIP64_LIMIT 536 if zip64 and not self._allowZip64: 537 raise LargeZipFile("Filesize would require ZIP64 extensions") 538 self.fp.write(zinfo.FileHeader(zip64)) 539 self.fp.write(bytes) 540 if zinfo.flag_bits & 0x08: 541 # Write CRC and file sizes after the file data 542 fmt = '<LQQ' if zip64 else '<LLL' 543 self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size, 544 zinfo.file_size)) 545 self.fp.flush() 546 self.filelist.append(zinfo) 547 self.NameToInfo[zinfo.filename] = zinfo 548 549 def __del__(self): 550 """Call the "close()" method in case the user forgot.""" 551 self.close() 552 553 def close(self): 554 """Close the file, and for mode "w" and "a" write the ending 555 records.""" 556 if self.fp is None: 557 return 558 559 try: 560 if self.mode in ("w", "a") and self._didModify: # write ending records 561 pos1 = self.fp.tell() 562 for zinfo in self.filelist: # write central directory 563 dt = zinfo.date_time 564 dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2] 565 dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2) 566 extra = [] 567 if zinfo.file_size > ZIP64_LIMIT \ 568 or zinfo.compress_size > ZIP64_LIMIT: 569 extra.append(zinfo.file_size) 570 extra.append(zinfo.compress_size) 571 file_size = 0xffffffff 572 compress_size = 0xffffffff 573 else: 574 file_size = zinfo.file_size 575 compress_size = zinfo.compress_size 576 577 if zinfo.header_offset > ZIP64_LIMIT: 578 extra.append(zinfo.header_offset) 579 header_offset = 0xffffffffL 580 else: 581 header_offset = zinfo.header_offset 582 583 extra_data = zinfo.extra 584 if extra: 585 # Append a ZIP64 field to the extra's 586 extra_data = struct.pack( 587 '<HH' + 'Q'*len(extra), 588 1, 8*len(extra), *extra) + extra_data 589 590 extract_version = max(45, zinfo.extract_version) 591 create_version = max(45, zinfo.create_version) 592 else: 593 extract_version = zinfo.extract_version 594 create_version = zinfo.create_version 595 596 try: 597 filename, flag_bits = zinfo._encodeFilenameFlags() 598 centdir = struct.pack(structCentralDir, 599 stringCentralDir, create_version, 600 zinfo.create_system, extract_version, zinfo.reserved, 601 flag_bits, zinfo.compress_type, dostime, dosdate, 602 zinfo.CRC, compress_size, file_size, 603 len(filename), len(extra_data), len(zinfo.comment), 604 0, zinfo.internal_attr, zinfo.external_attr, 605 header_offset) 606 except DeprecationWarning: 607 print >>sys.stderr, (structCentralDir, 608 stringCentralDir, create_version, 609 zinfo.create_system, extract_version, zinfo.reserved, 610 zinfo.flag_bits, zinfo.compress_type, dostime, dosdate, 611 zinfo.CRC, compress_size, file_size, 612 len(zinfo.filename), len(extra_data), len(zinfo.comment), 613 0, zinfo.internal_attr, zinfo.external_attr, 614 header_offset) 615 raise 616 self.fp.write(centdir) 617 self.fp.write(filename) 618 self.fp.write(extra_data) 619 self.fp.write(zinfo.comment) 620 621 pos2 = self.fp.tell() 622 # Write end-of-zip-archive record 623 centDirCount = len(self.filelist) 624 centDirSize = pos2 - pos1 625 centDirOffset = pos1 626 requires_zip64 = None 627 if centDirCount > ZIP_FILECOUNT_LIMIT: 628 requires_zip64 = "Files count" 629 elif centDirOffset > ZIP64_LIMIT: 630 requires_zip64 = "Central directory offset" 631 elif centDirSize > ZIP64_LIMIT: 632 requires_zip64 = "Central directory size" 633 if requires_zip64: 634 # Need to write the ZIP64 end-of-archive records 635 if not self._allowZip64: 636 raise LargeZipFile(requires_zip64 + 637 " would require ZIP64 extensions") 638 zip64endrec = struct.pack( 639 structEndArchive64, stringEndArchive64, 640 44, 45, 45, 0, 0, centDirCount, centDirCount, 641 centDirSize, centDirOffset) 642 self.fp.write(zip64endrec) 643 644 zip64locrec = struct.pack( 645 structEndArchive64Locator, 646 stringEndArchive64Locator, 0, pos2, 1) 647 self.fp.write(zip64locrec) 648 centDirCount = min(centDirCount, 0xFFFF) 649 centDirSize = min(centDirSize, 0xFFFFFFFF) 650 centDirOffset = min(centDirOffset, 0xFFFFFFFF) 651 652 endrec = struct.pack(structEndArchive, stringEndArchive, 653 0, 0, centDirCount, centDirCount, 654 centDirSize, centDirOffset, len(self._comment)) 655 self.fp.write(endrec) 656 self.fp.write(self._comment) 657 self.fp.flush() 658 finally: 659 fp = self.fp 660 self.fp = None 661 if not self._filePassed: 662 fp.close() 663 664 ZipFile
1 class TarFile(object): 2 """The TarFile Class provides an interface to tar archives. 3 """ 4 5 debug = 0 # May be set from 0 (no msgs) to 3 (all msgs) 6 7 dereference = False # If true, add content of linked file to the 8 # tar file, else the link. 9 10 ignore_zeros = False # If true, skips empty or invalid blocks and 11 # continues processing. 12 13 errorlevel = 1 # If 0, fatal errors only appear in debug 14 # messages (if debug >= 0). If > 0, errors 15 # are passed to the caller as exceptions. 16 17 format = DEFAULT_FORMAT # The format to use when creating an archive. 18 19 encoding = ENCODING # Encoding for 8-bit character strings. 20 21 errors = None # Error handler for unicode conversion. 22 23 tarinfo = TarInfo # The default TarInfo class to use. 24 25 fileobject = ExFileObject # The default ExFileObject class to use. 26 27 def __init__(self, name=None, mode="r", fileobj=None, format=None, 28 tarinfo=None, dereference=None, ignore_zeros=None, encoding=None, 29 errors=None, pax_headers=None, debug=None, errorlevel=None): 30 """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to 31 read from an existing archive, 'a' to append data to an existing 32 file or 'w' to create a new file overwriting an existing one. `mode' 33 defaults to 'r'. 34 If `fileobj' is given, it is used for reading or writing data. If it 35 can be determined, `mode' is overridden by `fileobj's mode. 36 `fileobj' is not closed, when TarFile is closed. 37 """ 38 modes = {"r": "rb", "a": "r+b", "w": "wb"} 39 if mode not in modes: 40 raise ValueError("mode must be 'r', 'a' or 'w'") 41 self.mode = mode 42 self._mode = modes[mode] 43 44 if not fileobj: 45 if self.mode == "a" and not os.path.exists(name): 46 # Create nonexistent files in append mode. 47 self.mode = "w" 48 self._mode = "wb" 49 fileobj = bltn_open(name, self._mode) 50 self._extfileobj = False 51 else: 52 if name is None and hasattr(fileobj, "name"): 53 name = fileobj.name 54 if hasattr(fileobj, "mode"): 55 self._mode = fileobj.mode 56 self._extfileobj = True 57 self.name = os.path.abspath(name) if name else None 58 self.fileobj = fileobj 59 60 # Init attributes. 61 if format is not None: 62 self.format = format 63 if tarinfo is not None: 64 self.tarinfo = tarinfo 65 if dereference is not None: 66 self.dereference = dereference 67 if ignore_zeros is not None: 68 self.ignore_zeros = ignore_zeros 69 if encoding is not None: 70 self.encoding = encoding 71 72 if errors is not None: 73 self.errors = errors 74 elif mode == "r": 75 self.errors = "utf-8" 76 else: 77 self.errors = "strict" 78 79 if pax_headers is not None and self.format == PAX_FORMAT: 80 self.pax_headers = pax_headers 81 else: 82 self.pax_headers = {} 83 84 if debug is not None: 85 self.debug = debug 86 if errorlevel is not None: 87 self.errorlevel = errorlevel 88 89 # Init datastructures. 90 self.closed = False 91 self.members = [] # list of members as TarInfo objects 92 self._loaded = False # flag if all members have been read 93 self.offset = self.fileobj.tell() 94 # current position in the archive file 95 self.inodes = {} # dictionary caching the inodes of 96 # archive members already added 97 98 try: 99 if self.mode == "r": 100 self.firstmember = None 101 self.firstmember = self.next() 102 103 if self.mode == "a": 104 # Move to the end of the archive, 105 # before the first empty block. 106 while True: 107 self.fileobj.seek(self.offset) 108 try: 109 tarinfo = self.tarinfo.fromtarfile(self) 110 self.members.append(tarinfo) 111 except EOFHeaderError: 112 self.fileobj.seek(self.offset) 113 break 114 except HeaderError, e: 115 raise ReadError(str(e)) 116 117 if self.mode in "aw": 118 self._loaded = True 119 120 if self.pax_headers: 121 buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy()) 122 self.fileobj.write(buf) 123 self.offset += len(buf) 124 except: 125 if not self._extfileobj: 126 self.fileobj.close() 127 self.closed = True 128 raise 129 130 def _getposix(self): 131 return self.format == USTAR_FORMAT 132 def _setposix(self, value): 133 import warnings 134 warnings.warn("use the format attribute instead", DeprecationWarning, 135 2) 136 if value: 137 self.format = USTAR_FORMAT 138 else: 139 self.format = GNU_FORMAT 140 posix = property(_getposix, _setposix) 141 142 #-------------------------------------------------------------------------- 143 # Below are the classmethods which act as alternate constructors to the 144 # TarFile class. The open() method is the only one that is needed for 145 # public use; it is the "super"-constructor and is able to select an 146 # adequate "sub"-constructor for a particular compression using the mapping 147 # from OPEN_METH. 148 # 149 # This concept allows one to subclass TarFile without losing the comfort of 150 # the super-constructor. A sub-constructor is registered and made available 151 # by adding it to the mapping in OPEN_METH. 152 153 @classmethod 154 def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs): 155 """Open a tar archive for reading, writing or appending. Return 156 an appropriate TarFile class. 157 158 mode: 159 'r' or 'r:*' open for reading with transparent compression 160 'r:' open for reading exclusively uncompressed 161 'r:gz' open for reading with gzip compression 162 'r:bz2' open for reading with bzip2 compression 163 'a' or 'a:' open for appending, creating the file if necessary 164 'w' or 'w:' open for writing without compression 165 'w:gz' open for writing with gzip compression 166 'w:bz2' open for writing with bzip2 compression 167 168 'r|*' open a stream of tar blocks with transparent compression 169 'r|' open an uncompressed stream of tar blocks for reading 170 'r|gz' open a gzip compressed stream of tar blocks 171 'r|bz2' open a bzip2 compressed stream of tar blocks 172 'w|' open an uncompressed stream for writing 173 'w|gz' open a gzip compressed stream for writing 174 'w|bz2' open a bzip2 compressed stream for writing 175 """ 176 177 if not name and not fileobj: 178 raise ValueError("nothing to open") 179 180 if mode in ("r", "r:*"): 181 # Find out which *open() is appropriate for opening the file. 182 for comptype in cls.OPEN_METH: 183 func = getattr(cls, cls.OPEN_METH[comptype]) 184 if fileobj is not None: 185 saved_pos = fileobj.tell() 186 try: 187 return func(name, "r", fileobj, **kwargs) 188 except (ReadError, CompressionError), e: 189 if fileobj is not None: 190 fileobj.seek(saved_pos) 191 continue 192 raise ReadError("file could not be opened successfully") 193 194 elif ":" in mode: 195 filemode, comptype = mode.split(":", 1) 196 filemode = filemode or "r" 197 comptype = comptype or "tar" 198 199 # Select the *open() function according to 200 # given compression. 201 if comptype in cls.OPEN_METH: 202 func = getattr(cls, cls.OPEN_METH[comptype]) 203 else: 204 raise CompressionError("unknown compression type %r" % comptype) 205 return func(name, filemode, fileobj, **kwargs) 206 207 elif "|" in mode: 208 filemode, comptype = mode.split("|", 1) 209 filemode = filemode or "r" 210 comptype = comptype or "tar" 211 212 if filemode not in ("r", "w"): 213 raise ValueError("mode must be 'r' or 'w'") 214 215 stream = _Stream(name, filemode, comptype, fileobj, bufsize) 216 try: 217 t = cls(name, filemode, stream, **kwargs) 218 except: 219 stream.close() 220 raise 221 t._extfileobj = False 222 return t 223 224 elif mode in ("a", "w"): 225 return cls.taropen(name, mode, fileobj, **kwargs) 226 227 raise ValueError("undiscernible mode") 228 229 @classmethod 230 def taropen(cls, name, mode="r", fileobj=None, **kwargs): 231 """Open uncompressed tar archive name for reading or writing. 232 """ 233 if mode not in ("r", "a", "w"): 234 raise ValueError("mode must be 'r', 'a' or 'w'") 235 return cls(name, mode, fileobj, **kwargs) 236 237 @classmethod 238 def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs): 239 """Open gzip compressed tar archive name for reading or writing. 240 Appending is not allowed. 241 """ 242 if mode not in ("r", "w"): 243 raise ValueError("mode must be 'r' or 'w'") 244 245 try: 246 import gzip 247 gzip.GzipFile 248 except (ImportError, AttributeError): 249 raise CompressionError("gzip module is not available") 250 251 try: 252 fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj) 253 except OSError: 254 if fileobj is not None and mode == 'r': 255 raise ReadError("not a gzip file") 256 raise 257 258 try: 259 t = cls.taropen(name, mode, fileobj, **kwargs) 260 except IOError: 261 fileobj.close() 262 if mode == 'r': 263 raise ReadError("not a gzip file") 264 raise 265 except: 266 fileobj.close() 267 raise 268 t._extfileobj = False 269 return t 270 271 @classmethod 272 def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs): 273 """Open bzip2 compressed tar archive name for reading or writing. 274 Appending is not allowed. 275 """ 276 if mode not in ("r", "w"): 277 raise ValueError("mode must be 'r' or 'w'.") 278 279 try: 280 import bz2 281 except ImportError: 282 raise CompressionError("bz2 module is not available") 283 284 if fileobj is not None: 285 fileobj = _BZ2Proxy(fileobj, mode) 286 else: 287 fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel) 288 289 try: 290 t = cls.taropen(name, mode, fileobj, **kwargs) 291 except (IOError, EOFError): 292 fileobj.close() 293 if mode == 'r': 294 raise ReadError("not a bzip2 file") 295 raise 296 except: 297 fileobj.close() 298 raise 299 t._extfileobj = False 300 return t 301 302 # All *open() methods are registered here. 303 OPEN_METH = { 304 "tar": "taropen", # uncompressed tar 305 "gz": "gzopen", # gzip compressed tar 306 "bz2": "bz2open" # bzip2 compressed tar 307 } 308 309 #-------------------------------------------------------------------------- 310 # The public methods which TarFile provides: 311 312 def close(self): 313 """Close the TarFile. In write-mode, two finishing zero blocks are 314 appended to the archive. 315 """ 316 if self.closed: 317 return 318 319 if self.mode in "aw": 320 self.fileobj.write(NUL * (BLOCKSIZE * 2)) 321 self.offset += (BLOCKSIZE * 2) 322 # fill up the end with zero-blocks 323 # (like option -b20 for tar does) 324 blocks, remainder = divmod(self.offset, RECORDSIZE) 325 if remainder > 0: 326 self.fileobj.write(NUL * (RECORDSIZE - remainder)) 327 328 if not self._extfileobj: 329 self.fileobj.close() 330 self.closed = True 331 332 def getmember(self, name): 333 """Return a TarInfo object for member `name'. If `name' can not be 334 found in the archive, KeyError is raised. If a member occurs more 335 than once in the archive, its last occurrence is assumed to be the 336 most up-to-date version. 337 """ 338 tarinfo = self._getmember(name) 339 if tarinfo is None: 340 raise KeyError("filename %r not found" % name) 341 return tarinfo 342 343 def getmembers(self): 344 """Return the members of the archive as a list of TarInfo objects. The 345 list has the same order as the members in the archive. 346 """ 347 self._check() 348 if not self._loaded: # if we want to obtain a list of 349 self._load() # all members, we first have to 350 # scan the whole archive. 351 return self.members 352 353 def getnames(self): 354 """Return the members of the archive as a list of their names. It has 355 the same order as the list returned by getmembers(). 356 """ 357 return [tarinfo.name for tarinfo in self.getmembers()] 358 359 def gettarinfo(self, name=None, arcname=None, fileobj=None): 360 """Create a TarInfo object for either the file `name' or the file 361 object `fileobj' (using os.fstat on its file descriptor). You can 362 modify some of the TarInfo's attributes before you add it using 363 addfile(). If given, `arcname' specifies an alternative name for the 364 file in the archive. 365 """ 366 self._check("aw") 367 368 # When fileobj is given, replace name by 369 # fileobj's real name. 370 if fileobj is not None: 371 name = fileobj.name 372 373 # Building the name of the member in the archive. 374 # Backward slashes are converted to forward slashes, 375 # Absolute paths are turned to relative paths. 376 if arcname is None: 377 arcname = name 378 drv, arcname = os.path.splitdrive(arcname) 379 arcname = arcname.replace(os.sep, "/") 380 arcname = arcname.lstrip("/") 381 382 # Now, fill the TarInfo object with 383 # information specific for the file. 384 tarinfo = self.tarinfo() 385 tarinfo.tarfile = self 386 387 # Use os.stat or os.lstat, depending on platform 388 # and if symlinks shall be resolved. 389 if fileobj is None: 390 if hasattr(os, "lstat") and not self.dereference: 391 statres = os.lstat(name) 392 else: 393 statres = os.stat(name) 394 else: 395 statres = os.fstat(fileobj.fileno()) 396 linkname = "" 397 398 stmd = statres.st_mode 399 if stat.S_ISREG(stmd): 400 inode = (statres.st_ino, statres.st_dev) 401 if not self.dereference and statres.st_nlink > 1 and \ 402 inode in self.inodes and arcname != self.inodes[inode]: 403 # Is it a hardlink to an already 404 # archived file? 405 type = LNKTYPE 406 linkname = self.inodes[inode] 407 else: 408 # The inode is added only if its valid. 409 # For win32 it is always 0. 410 type = REGTYPE 411 if inode[0]: 412 self.inodes[inode] = arcname 413 elif stat.S_ISDIR(stmd): 414 type = DIRTYPE 415 elif stat.S_ISFIFO(stmd): 416 type = FIFOTYPE 417 elif stat.S_ISLNK(stmd): 418 type = SYMTYPE 419 linkname = os.readlink(name) 420 elif stat.S_ISCHR(stmd): 421 type = CHRTYPE 422 elif stat.S_ISBLK(stmd): 423 type = BLKTYPE 424 else: 425 return None 426 427 # Fill the TarInfo object with all 428 # information we can get. 429 tarinfo.name = arcname 430 tarinfo.mode = stmd 431 tarinfo.uid = statres.st_uid 432 tarinfo.gid = statres.st_gid 433 if type == REGTYPE: 434 tarinfo.size = statres.st_size 435 else: 436 tarinfo.size = 0L 437 tarinfo.mtime = statres.st_mtime 438 tarinfo.type = type 439 tarinfo.linkname = linkname 440 if pwd: 441 try: 442 tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0] 443 except KeyError: 444 pass 445 if grp: 446 try: 447 tarinfo.gname = grp.getgrgid(tarinfo.gid)[0] 448 except KeyError: 449 pass 450 451 if type in (CHRTYPE, BLKTYPE): 452 if hasattr(os, "major") and hasattr(os, "minor"): 453 tarinfo.devmajor = os.major(statres.st_rdev) 454 tarinfo.devminor = os.minor(statres.st_rdev) 455 return tarinfo 456 457 def list(self, verbose=True): 458 """Print a table of contents to sys.stdout. If `verbose' is False, only 459 the names of the members are printed. If it is True, an `ls -l'-like 460 output is produced. 461 """ 462 self._check() 463 464 for tarinfo in self: 465 if verbose: 466 print filemode(tarinfo.mode), 467 print "%s/%s" % (tarinfo.uname or tarinfo.uid, 468 tarinfo.gname or tarinfo.gid), 469 if tarinfo.ischr() or tarinfo.isblk(): 470 print "%10s" % ("%d,%d" \ 471 % (tarinfo.devmajor, tarinfo.devminor)), 472 else: 473 print "%10d" % tarinfo.size, 474 print "%d-%02d-%02d %02d:%02d:%02d" \ 475 % time.localtime(tarinfo.mtime)[:6], 476 477 print tarinfo.name + ("/" if tarinfo.isdir() else ""), 478 479 if verbose: 480 if tarinfo.issym(): 481 print "->", tarinfo.linkname, 482 if tarinfo.islnk(): 483 print "link to", tarinfo.linkname, 484 print 485 486 def add(self, name, arcname=None, recursive=True, exclude=None, filter=None): 487 """Add the file `name' to the archive. `name' may be any type of file 488 (directory, fifo, symbolic link, etc.). If given, `arcname' 489 specifies an alternative name for the file in the archive. 490 Directories are added recursively by default. This can be avoided by 491 setting `recursive' to False. `exclude' is a function that should 492 return True for each filename to be excluded. `filter' is a function 493 that expects a TarInfo object argument and returns the changed 494 TarInfo object, if it returns None the TarInfo object will be 495 excluded from the archive. 496 """ 497 self._check("aw") 498 499 if arcname is None: 500 arcname = name 501 502 # Exclude pathnames. 503 if exclude is not None: 504 import warnings 505 warnings.warn("use the filter argument instead", 506 DeprecationWarning, 2) 507 if exclude(name): 508 self._dbg(2, "tarfile: Excluded %r" % name) 509 return 510 511 # Skip if somebody tries to archive the archive... 512 if self.name is not None and os.path.abspath(name) == self.name: 513 self._dbg(2, "tarfile: Skipped %r" % name) 514 return 515 516 self._dbg(1, name) 517 518 # Create a TarInfo object from the file. 519 tarinfo = self.gettarinfo(name, arcname) 520 521 if tarinfo is None: 522 self._dbg(1, "tarfile: Unsupported type %r" % name) 523 return 524 525 # Change or exclude the TarInfo object. 526 if filter is not None: 527 tarinfo = filter(tarinfo) 528 if tarinfo is None: 529 self._dbg(2, "tarfile: Excluded %r" % name) 530 return 531 532 # Append the tar header and data to the archive. 533 if tarinfo.isreg(): 534 with bltn_open(name, "rb") as f: 535 self.addfile(tarinfo, f) 536 537 elif tarinfo.isdir(): 538 self.addfile(tarinfo) 539 if recursive: 540 for f in os.listdir(name): 541 self.add(os.path.join(name, f), os.path.join(arcname, f), 542 recursive, exclude, filter) 543 544 else: 545 self.addfile(tarinfo) 546 547 def addfile(self, tarinfo, fileobj=None): 548 """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is 549 given, tarinfo.size bytes are read from it and added to the archive. 550 You can create TarInfo objects using gettarinfo(). 551 On Windows platforms, `fileobj' should always be opened with mode 552 'rb' to avoid irritation about the file size. 553 """ 554 self._check("aw") 555 556 tarinfo = copy.copy(tarinfo) 557 558 buf = tarinfo.tobuf(self.format, self.encoding, self.errors) 559 self.fileobj.write(buf) 560 self.offset += len(buf) 561 562 # If there's data to follow, append it. 563 if fileobj is not None: 564 copyfileobj(fileobj, self.fileobj, tarinfo.size) 565 blocks, remainder = divmod(tarinfo.size, BLOCKSIZE) 566 if remainder > 0: 567 self.fileobj.write(NUL * (BLOCKSIZE - remainder)) 568 blocks += 1 569 self.offset += blocks * BLOCKSIZE 570 571 self.members.append(tarinfo) 572 573 def extractall(self, path=".", members=None): 574 """Extract all members from the archive to the current working 575 directory and set owner, modification time and permissions on 576 directories afterwards. `path' specifies a different directory 577 to extract to. `members' is optional and must be a subset of the 578 list returned by getmembers(). 579 """ 580 directories = [] 581 582 if members is None: 583 members = self 584 585 for tarinfo in members: 586 if tarinfo.isdir(): 587 # Extract directories with a safe mode. 588 directories.append(tarinfo) 589 tarinfo = copy.copy(tarinfo) 590 tarinfo.mode = 0700 591 self.extract(tarinfo, path) 592 593 # Reverse sort directories. 594 directories.sort(key=operator.attrgetter('name')) 595 directories.reverse() 596 597 # Set correct owner, mtime and filemode on directories. 598 for tarinfo in directories: 599 dirpath = os.path.join(path, tarinfo.name) 600 try: 601 self.chown(tarinfo, dirpath) 602 self.utime(tarinfo, dirpath) 603 self.chmod(tarinfo, dirpath) 604 except ExtractError, e: 605 if self.errorlevel > 1: 606 raise 607 else: 608 self._dbg(1, "tarfile: %s" % e) 609 610 def extract(self, member, path=""): 611 """Extract a member from the archive to the current working directory, 612 using its full name. Its file information is extracted as accurately 613 as possible. `member' may be a filename or a TarInfo object. You can 614 specify a different directory using `path'. 615 """ 616 self._check("r") 617 618 if isinstance(member, basestring): 619 tarinfo = self.getmember(member) 620 else: 621 tarinfo = member 622 623 # Prepare the link target for makelink(). 624 if tarinfo.islnk(): 625 tarinfo._link_target = os.path.join(path, tarinfo.linkname) 626 627 try: 628 self._extract_member(tarinfo, os.path.join(path, tarinfo.name)) 629 except EnvironmentError, e: 630 if self.errorlevel > 0: 631 raise 632 else: 633 if e.filename is None: 634 self._dbg(1, "tarfile: %s" % e.strerror) 635 else: 636 self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename)) 637 except ExtractError, e: 638 if self.errorlevel > 1: 639 raise 640 else: 641 self._dbg(1, "tarfile: %s" % e) 642 643 def extractfile(self, member): 644 """Extract a member from the archive as a file object. `member' may be 645 a filename or a TarInfo object. If `member' is a regular file, a 646 file-like object is returned. If `member' is a link, a file-like 647 object is constructed from the link's target. If `member' is none of 648 the above, None is returned. 649 The file-like object is read-only and provides the following 650 methods: read(), readline(), readlines(), seek() and tell() 651 """ 652 self._check("r") 653 654 if isinstance(member, basestring): 655 tarinfo = self.getmember(member) 656 else: 657 tarinfo = member 658 659 if tarinfo.isreg(): 660 return self.fileobject(self, tarinfo) 661 662 elif tarinfo.type not in SUPPORTED_TYPES: 663 # If a member's type is unknown, it is treated as a 664 # regular file. 665 return self.fileobject(self, tarinfo) 666 667 elif tarinfo.islnk() or tarinfo.issym(): 668 if isinstance(self.fileobj, _Stream): 669 # A small but ugly workaround for the case that someone tries 670 # to extract a (sym)link as a file-object from a non-seekable 671 # stream of tar blocks. 672 raise StreamError("cannot extract (sym)link as file object") 673 else: 674 # A (sym)link's file object is its target's file object. 675 return self.extractfile(self._find_link_target(tarinfo)) 676 else: 677 # If there's no data associated with the member (directory, chrdev, 678 # blkdev, etc.), return None instead of a file object. 679 return None 680 681 def _extract_member(self, tarinfo, targetpath): 682 """Extract the TarInfo object tarinfo to a physical 683 file called targetpath. 684 """ 685 # Fetch the TarInfo object for the given name 686 # and build the destination pathname, replacing 687 # forward slashes to platform specific separators. 688 targetpath = targetpath.rstrip("/") 689 targetpath = targetpath.replace("/", os.sep) 690 691 # Create all upper directories. 692 upperdirs = os.path.dirname(targetpath) 693 if upperdirs and not os.path.exists(upperdirs): 694 # Create directories that are not part of the archive with 695 # default permissions. 696 os.makedirs(upperdirs) 697 698 if tarinfo.islnk() or tarinfo.issym(): 699 self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname)) 700 else: 701 self._dbg(1, tarinfo.name) 702 703 if tarinfo.isreg(): 704 self.makefile(tarinfo, targetpath) 705 elif tarinfo.isdir(): 706 self.makedir(tarinfo, targetpath) 707 elif tarinfo.isfifo(): 708 self.makefifo(tarinfo, targetpath) 709 elif tarinfo.ischr() or tarinfo.isblk(): 710 self.makedev(tarinfo, targetpath) 711 elif tarinfo.islnk() or tarinfo.issym(): 712 self.makelink(tarinfo, targetpath) 713 elif tarinfo.type not in SUPPORTED_TYPES: 714 self.makeunknown(tarinfo, targetpath) 715 else: 716 self.makefile(tarinfo, targetpath) 717 718 self.chown(tarinfo, targetpath) 719 if not tarinfo.issym(): 720 self.chmod(tarinfo, targetpath) 721 self.utime(tarinfo, targetpath) 722 723 #-------------------------------------------------------------------------- 724 # Below are the different file methods. They are called via 725 # _extract_member() when extract() is called. They can be replaced in a 726 # subclass to implement other functionality. 727 728 def makedir(self, tarinfo, targetpath): 729 """Make a directory called targetpath. 730 """ 731 try: 732 # Use a safe mode for the directory, the real mode is set 733 # later in _extract_member(). 734 os.mkdir(targetpath, 0700) 735 except EnvironmentError, e: 736 if e.errno != errno.EEXIST: 737 raise 738 739 def makefile(self, tarinfo, targetpath): 740 """Make a file called targetpath. 741 """ 742 source = self.extractfile(tarinfo) 743 try: 744 with bltn_open(targetpath, "wb") as target: 745 copyfileobj(source, target) 746 finally: 747 source.close() 748 749 def makeunknown(self, tarinfo, targetpath): 750 """Make a file from a TarInfo object with an unknown type 751 at targetpath. 752 """ 753 self.makefile(tarinfo, targetpath) 754 self._dbg(1, "tarfile: Unknown file type %r, " \ 755 "extracted as regular file." % tarinfo.type) 756 757 def makefifo(self, tarinfo, targetpath): 758 """Make a fifo called targetpath. 759 """ 760 if hasattr(os, "mkfifo"): 761 os.mkfifo(targetpath) 762 else: 763 raise ExtractError("fifo not supported by system") 764 765 def makedev(self, tarinfo, targetpath): 766 """Make a character or block device called targetpath. 767 """ 768 if not hasattr(os, "mknod") or not hasattr(os, "makedev"): 769 raise ExtractError("special devices not supported by system") 770 771 mode = tarinfo.mode 772 if tarinfo.isblk(): 773 mode |= stat.S_IFBLK 774 else: 775 mode |= stat.S_IFCHR 776 777 os.mknod(targetpath, mode, 778 os.makedev(tarinfo.devmajor, tarinfo.devminor)) 779 780 def makelink(self, tarinfo, targetpath): 781 """Make a (symbolic) link called targetpath. If it cannot be created 782 (platform limitation), we try to make a copy of the referenced file 783 instead of a link. 784 """ 785 if hasattr(os, "symlink") and hasattr(os, "link"): 786 # For systems that support symbolic and hard links. 787 if tarinfo.issym(): 788 if os.path.lexists(targetpath): 789 os.unlink(targetpath) 790 os.symlink(tarinfo.linkname, targetpath) 791 else: 792 # See extract(). 793 if os.path.exists(tarinfo._link_target): 794 if os.path.lexists(targetpath): 795 os.unlink(targetpath) 796 os.link(tarinfo._link_target, targetpath) 797 else: 798 self._extract_member(self._find_link_target(tarinfo), targetpath) 799 else: 800 try: 801 self._extract_member(self._find_link_target(tarinfo), targetpath) 802 except KeyError: 803 raise ExtractError("unable to resolve link inside archive") 804 805 def chown(self, tarinfo, targetpath): 806 """Set owner of targetpath according to tarinfo. 807 """ 808 if pwd and hasattr(os, "geteuid") and os.geteuid() == 0: 809 # We have to be root to do so. 810 try: 811 g = grp.getgrnam(tarinfo.gname)[2] 812 except KeyError: 813 g = tarinfo.gid 814 try: 815 u = pwd.getpwnam(tarinfo.uname)[2] 816 except KeyError: 817 u = tarinfo.uid 818 try: 819 if tarinfo.issym() and hasattr(os, "lchown"): 820 os.lchown(targetpath, u, g) 821 else: 822 if sys.platform != "os2emx": 823 os.chown(targetpath, u, g) 824 except EnvironmentError, e: 825 raise ExtractError("could not change owner") 826 827 def chmod(self, tarinfo, targetpath): 828 """Set file permissions of targetpath according to tarinfo. 829 """ 830 if hasattr(os, 'chmod'): 831 try: 832 os.chmod(targetpath, tarinfo.mode) 833 except EnvironmentError, e: 834 raise ExtractError("could not change mode") 835 836 def utime(self, tarinfo, targetpath): 837 """Set modification time of targetpath according to tarinfo. 838 """ 839 if not hasattr(os, 'utime'): 840 return 841 try: 842 os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime)) 843 except EnvironmentError, e: 844 raise ExtractError("could not change modification time") 845 846 #-------------------------------------------------------------------------- 847 def next(self): 848 """Return the next member of the archive as a TarInfo object, when 849 TarFile is opened for reading. Return None if there is no more 850 available. 851 """ 852 self._check("ra") 853 if self.firstmember is not None: 854 m = self.firstmember 855 self.firstmember = None 856 return m 857 858 # Read the next block. 859 self.fileobj.seek(self.offset) 860 tarinfo = None 861 while True: 862 try: 863 tarinfo = self.tarinfo.fromtarfile(self) 864 except EOFHeaderError, e: 865 if self.ignore_zeros: 866 self._dbg(2, "0x%X: %s" % (self.offset, e)) 867 self.offset += BLOCKSIZE 868 continue 869 except InvalidHeaderError, e: 870 if self.ignore_zeros: 871 self._dbg(2, "0x%X: %s" % (self.offset, e)) 872 self.offset += BLOCKSIZE 873 continue 874 elif self.offset == 0: 875 raise ReadError(str(e)) 876 except EmptyHeaderError: 877 if self.offset == 0: 878 raise ReadError("empty file") 879 except TruncatedHeaderError, e: 880 if self.offset == 0: 881 raise ReadError(str(e)) 882 except SubsequentHeaderError, e: 883 raise ReadError(str(e)) 884 break 885 886 if tarinfo is not None: 887 self.members.append(tarinfo) 888 else: 889 self._loaded = True 890 891 return tarinfo 892 893 #-------------------------------------------------------------------------- 894 # Little helper methods: 895 896 def _getmember(self, name, tarinfo=None, normalize=False): 897 """Find an archive member by name from bottom to top. 898 If tarinfo is given, it is used as the starting point. 899 """ 900 # Ensure that all members have been loaded. 901 members = self.getmembers() 902 903 # Limit the member search list up to tarinfo. 904 if tarinfo is not None: 905 members = members[:members.index(tarinfo)] 906 907 if normalize: 908 name = os.path.normpath(name) 909 910 for member in reversed(members): 911 if normalize: 912 member_name = os.path.normpath(member.name) 913 else: 914 member_name = member.name 915 916 if name == member_name: 917 return member 918 919 def _load(self): 920 """Read through the entire archive file and look for readable 921 members. 922 """ 923 while True: 924 tarinfo = self.next() 925 if tarinfo is None: 926 break 927 self._loaded = True 928 929 def _check(self, mode=None): 930 """Check if TarFile is still open, and if the operation's mode 931 corresponds to TarFile's mode. 932 """ 933 if self.closed: 934 raise IOError("%s is closed" % self.__class__.__name__) 935 if mode is not None and self.mode not in mode: 936 raise IOError("bad operation for mode %r" % self.mode) 937 938 def _find_link_target(self, tarinfo): 939 """Find the target member of a symlink or hardlink member in the 940 archive. 941 """ 942 if tarinfo.issym(): 943 # Always search the entire archive. 944 linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname))) 945 limit = None 946 else: 947 # Search the archive before the link, because a hard link is 948 # just a reference to an already archived file. 949 linkname = tarinfo.linkname 950 limit = tarinfo 951 952 member = self._getmember(linkname, tarinfo=limit, normalize=True) 953 if member is None: 954 raise KeyError("linkname %r not found" % linkname) 955 return member 956 957 def __iter__(self): 958 """Provide an iterator object. 959 """ 960 if self._loaded: 961 return iter(self.members) 962 else: 963 return TarIter(self) 964 965 def _dbg(self, level, msg): 966 """Write debugging output to sys.stderr. 967 """ 968 if level <= self.debug: 969 print >> sys.stderr, msg 970 971 def __enter__(self): 972 self._check() 973 return self 974 975 def __exit__(self, type, value, traceback): 976 if type is None: 977 self.close() 978 else: 979 # An exception occurred. We must not call close() because 980 # it would try to write end-of-archive blocks and padding. 981 if not self._extfileobj: 982 self.fileobj.close() 983 self.closed = True 984 # class TarFile 985 986 TarFile
用於序列化的兩個模塊
Json模塊提供了四個功能:dumps、dump、loads、load
pickle模塊提供了四個功能:dumps、dump、loads、load
shelve模塊是一個簡單的k,v將內存數據經過文件持久化的模塊,能夠持久化任何pickle可支持的python數據格式
1 import shelve 2 3 d = shelve.open('shelve_test') #打開一個文件 4 5 class Test(object): 6 def __init__(self,n): 7 self.n = n 8 9 10 t = Test(123) 11 t2 = Test(123334) 12 13 name = ["alex","rain","test"] 14 d["test"] = name #持久化列表 15 d["t1"] = t #持久化類 16 d["t2"] = t2 17 18 d.close()
xml是實現不一樣語言或程序之間進行數據交換的協議,跟json差很少,但json使用起來更簡單,不過,古時候,在json還沒誕生的黑暗年代,你們只能選擇用xml呀,至今不少傳統公司如金融行業的不少系統的接口還主要是xml。
xml的格式以下,就是經過<>節點來區別數據結構的:
1 <?xml version="1.0"?> 2 <data> 3 <country name="Liechtenstein"> 4 <rank updated="yes">2</rank> 5 <year>2008</year> 6 <gdppc>141100</gdppc> 7 <neighbor name="Austria" direction="E"/> 8 <neighbor name="Switzerland" direction="W"/> 9 </country> 10 <country name="Singapore"> 11 <rank updated="yes">5</rank> 12 <year>2011</year> 13 <gdppc>59900</gdppc> 14 <neighbor name="Malaysia" direction="N"/> 15 </country> 16 <country name="Panama"> 17 <rank updated="yes">69</rank> 18 <year>2011</year> 19 <gdppc>13600</gdppc> 20 <neighbor name="Costa Rica" direction="W"/> 21 <neighbor name="Colombia" direction="E"/> 22 </country> 23 </data>
xml協議在各個語言裏的都 是支持的,在python中能夠用如下模塊操做xml
1 import xml.etree.ElementTree as ET 2 3 tree = ET.parse("xmltest.xml") 4 root = tree.getroot() 5 print(root.tag) 6 7 #遍歷xml文檔 8 for child in root: 9 print(child.tag, child.attrib) 10 for i in child: 11 print(i.tag,i.text) 12 13 #只遍歷year 節點 14 for node in root.iter('year'): 15 print(node.tag,node.text)
修改和刪除xml文檔內容
1 import xml.etree.ElementTree as ET 2 3 tree = ET.parse("xmltest.xml") 4 root = tree.getroot() 5 6 #修改 7 for node in root.iter('year'): 8 new_year = int(node.text) + 1 9 node.text = str(new_year) 10 node.set("updated","yes") 11 12 tree.write("xmltest.xml") 13 14 15 #刪除node 16 for country in root.findall('country'): 17 rank = int(country.find('rank').text) 18 if rank > 50: 19 root.remove(country) 20 21 tree.write('output.xml')
本身建立xml文檔
1 import xml.etree.ElementTree as ET 2 3 4 new_xml = ET.Element("namelist") 5 name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"}) 6 age = ET.SubElement(name,"age",attrib={"checked":"no"}) 7 sex = ET.SubElement(name,"sex") 8 sex.text = '33' 9 name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"}) 10 age = ET.SubElement(name2,"age") 11 age.text = '19' 12 13 et = ET.ElementTree(new_xml) #生成文檔對象 14 et.write("test.xml", encoding="utf-8",xml_declaration=True) 15 16 ET.dump(new_xml) #打印生成的格式
Python也能夠很容易的處理ymal文檔格式,只不過須要安裝一個模塊,參考文檔:http://pyyaml.org/wiki/PyYAMLDocumentation
用於對特定的配置進行操做,當前模塊的名稱在 python 3.x 版本中變動爲 configparser。
1 # 註釋1 2 ; 註釋2 3 4 [section1] 5 k1 = v1 6 k2:v2 7 8 [section2] 9 k1 = v1
1 import ConfigParser 2 3 config = ConfigParser.ConfigParser() 4 config.read('i.cfg') 5 6 # ########## 讀 ########## 7 #secs = config.sections() 8 #print secs 9 #options = config.options('group2') 10 #print options 11 12 #item_list = config.items('group2') 13 #print item_list 14 15 #val = config.get('group1','key') 16 #val = config.getint('group1','key') 17 18 # ########## 改寫 ########## 19 #sec = config.remove_section('group1') 20 #config.write(open('i.cfg', "w")) 21 22 #sec = config.has_section('wupeiqi') 23 #sec = config.add_section('wupeiqi') 24 #config.write(open('i.cfg', "w")) 25 26 27 #config.set('group2','k1',11111) 28 #config.write(open('i.cfg', "w")) 29 30 #config.remove_option('group2','age') 31 #config.write(open('i.cfg', "w"))
用於生成和修改常見配置文檔,當前模塊的名稱在 python 3.x 版本中變動爲 configparser。
來看一個好多軟件的常見文檔格式以下
1 [DEFAULT] 2 ServerAliveInterval = 45 3 Compression = yes 4 CompressionLevel = 9 5 ForwardX11 = yes 6 7 [bitbucket.org] 8 User = hg 9 10 [topsecret.server.com] 11 Port = 50022 12 ForwardX11 = no
若是想用python生成一個這樣的文檔怎麼作呢?
1 import configparser 2 3 config = configparser.ConfigParser() 4 config["DEFAULT"] = {'ServerAliveInterval': '45', 5 'Compression': 'yes', 6 'CompressionLevel': '9'} 7 8 config['bitbucket.org'] = {} 9 config['bitbucket.org']['User'] = 'hg' 10 config['topsecret.server.com'] = {} 11 topsecret = config['topsecret.server.com'] 12 topsecret['Host Port'] = '50022' # mutates the parser 13 topsecret['ForwardX11'] = 'no' # same here 14 config['DEFAULT']['ForwardX11'] = 'yes' 15 with open('example.ini', 'w') as configfile: 16 config.write(configfile)
寫完了還能夠再讀出來哈。
1 >>> import configparser 2 >>> config = configparser.ConfigParser() 3 >>> config.sections() 4 [] 5 >>> config.read('example.ini') 6 ['example.ini'] 7 >>> config.sections() 8 ['bitbucket.org', 'topsecret.server.com'] 9 >>> 'bitbucket.org' in config 10 True 11 >>> 'bytebong.com' in config 12 False 13 >>> config['bitbucket.org']['User'] 14 'hg' 15 >>> config['DEFAULT']['Compression'] 16 'yes' 17 >>> topsecret = config['topsecret.server.com'] 18 >>> topsecret['ForwardX11'] 19 'no' 20 >>> topsecret['Port'] 21 '50022' 22 >>> for key in config['bitbucket.org']: print(key) 23 ... 24 user 25 compressionlevel 26 serveraliveinterval 27 compression 28 forwardx11 29 >>> config['bitbucket.org']['ForwardX11'] 30 'yes'
configparser增刪改查語法
1 [section1] 2 k1 = v1 3 k2:v2 4 5 [section2] 6 k1 = v1 7 8 import ConfigParser 9 10 config = ConfigParser.ConfigParser() 11 config.read('i.cfg') 12 13 # ########## 讀 ########## 14 #secs = config.sections() 15 #print secs 16 #options = config.options('group2') 17 #print options 18 19 #item_list = config.items('group2') 20 #print item_list 21 22 #val = config.get('group1','key') 23 #val = config.getint('group1','key') 24 25 # ########## 改寫 ########## 26 #sec = config.remove_section('group1') 27 #config.write(open('i.cfg', "w")) 28 29 #sec = config.has_section('wupeiqi') 30 #sec = config.add_section('wupeiqi') 31 #config.write(open('i.cfg', "w")) 32 33 34 #config.set('group2','k1',11111) 35 #config.write(open('i.cfg', "w")) 36 37 #config.remove_option('group2','age') 38 #config.write(open('i.cfg', "w"))
hashlib模塊
用於加密相關的操做,3.x裏代替了md5模塊和sha模塊,主要提供 SHA1, SHA224, SHA256, SHA384, SHA512 ,MD5 算法
1 import md5 2 hash = md5.new() 3 hash.update('admin') 4 print hash.hexdigest()
1 import sha 2 3 hash = sha.new() 4 hash.update('admin') 5 print hash.hexdigest()
1 import hashlib 2 3 m = hashlib.md5() 4 m.update(b"Hello") 5 m.update(b"It's me") 6 print(m.digest()) 7 m.update(b"It's been a long time since last time we ...") 8 9 print(m.digest()) #2進制格式hash 10 print(len(m.hexdigest())) #16進制格式hash 11 ''' 12 def digest(self, *args, **kwargs): # real signature unknown 13 """ Return the digest value as a string of binary data. """ 14 pass 15 16 def hexdigest(self, *args, **kwargs): # real signature unknown 17 """ Return the digest value as a string of hexadecimal digits. """ 18 pass 19 20 ''' 21 import hashlib 22 23 # ######## md5 ######## 24 25 hash = hashlib.md5() 26 hash.update('admin') 27 print(hash.hexdigest()) 28 29 # ######## sha1 ######## 30 31 hash = hashlib.sha1() 32 hash.update('admin') 33 print(hash.hexdigest()) 34 35 # ######## sha256 ######## 36 37 hash = hashlib.sha256() 38 hash.update('admin') 39 print(hash.hexdigest()) 40 41 42 # ######## sha384 ######## 43 44 hash = hashlib.sha384() 45 hash.update('admin') 46 print(hash.hexdigest()) 47 48 # ######## sha512 ######## 49 50 hash = hashlib.sha512() 51 hash.update('admin') 52 print(hash.hexdigest())
以上加密算法雖然依然很是厲害,但時候存在缺陷,即:經過撞庫能夠反解。因此,有必要對加密算法中添加自定義key再來作加密。
1 import hashlib 2 3 # ######## md5 ######## 4 5 hash = hashlib.md5('898oaFs09f') 6 hash.update('admin') 7 print hash.hexdigest()
還不夠吊?python 還有一個 hmac 模塊,它內部對咱們建立 key 和 內容 再進行處理而後再加密
散列消息鑑別碼,簡稱HMAC,是一種基於消息鑑別碼MAC(Message Authentication Code)的鑑別機制。使用HMAC時,消息通信的雙方,經過驗證消息中加入的鑑別密鑰K來鑑別消息的真僞;
通常用於網絡通訊中消息加密,前提是雙方先要約定好key,就像接頭暗號同樣,而後消息發送把用key把消息加密,接收方用key + 消息明文再加密,拿加密後的值 跟 發送者的相對比是否相等,這樣就能驗證消息的真實性,及發送者的合法性了。
1 import hmac 2 h = hmac.new(b'天王蓋地虎', b'寶塔鎮河妖') 3 print h.hexdigest()
更多關於md5,sha1,sha256等介紹的文章看這裏https://www.tbs-certificates.co.uk/FAQ/en/sha256.html
The subprocess
module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several older modules and functions:
os.system os.spawn*
The recommended approach to invoking subprocesses is to use the run()
function for all use cases it can handle. For more advanced use cases, the underlying Popen
interface can be used directly.
The run()
function was added in Python 3.5; if you need to retain compatibility with older versions, see the Older high-level API section.
(args, *, stdin=None, input=None, stdout=None, stderr=None, shell=False, timeout=None, check=False)subprocess.run
Run the command described by args. Wait for command to complete, then return a CompletedProcess
instance.
The arguments shown above are merely the most common ones, described below in Frequently Used Arguments (hence the use of keyword-only notation in the abbreviated signature). The full function signature is largely the same as that of the Popen
constructor - apart from timeout, input and check, all the arguments to this function are passed through to that interface.
This does not capture stdout or stderr by default. To do so, pass PIPE
for the stdout and/or stderr arguments.
The timeout argument is passed to Popen.communicate()
. If the timeout expires, the child process will be killed and waited for. The TimeoutExpired
exception will be re-raised after the child process has terminated.
The input argument is passed to Popen.communicate()
and thus to the subprocess’s stdin. If used it must be a byte sequence, or a string if universal_newlines=True
. When used, the internal Popen
object is automatically created withstdin=PIPE
, and the stdin argument may not be used as well.
If check is True, and the process exits with a non-zero exit code, a CalledProcessError
exception will be raised. Attributes of that exception hold the arguments, the exit code, and stdout and stderr if they were captured.
1 #執行命令,返回命令執行狀態 , 0 or 非0 2 >>> retcode = subprocess.call(["ls", "-l"]) 3 4 #執行命令,若是命令結果爲0,就正常返回,不然拋異常 5 >>> subprocess.check_call(["ls", "-l"]) 6 0 7 8 #接收字符串格式命令,返回元組形式,第1個元素是執行狀態,第2個是命令結果 9 >>> subprocess.getstatusoutput('ls /bin/ls') 10 (0, '/bin/ls') 11 12 #接收字符串格式命令,並返回結果 13 >>> subprocess.getoutput('ls /bin/ls') 14 '/bin/ls' 15 16 #執行命令,並返回結果,注意是返回結果,不是打印,下例結果返回給res 17 >>> res=subprocess.check_output(['ls','-l']) 18 >>> res 19 b'total 0\ndrwxr-xr-x 12 alex staff 408 Nov 2 11:05 OldBoyCRM\n' 20 21 #上面那些方法,底層都是封裝的subprocess.Popen 22 poll() 23 Check if child process has terminated. Returns returncode 24 25 wait() 26 Wait for child process to terminate. Returns returncode attribute. 27 28 29 terminate() 殺掉所啓動進程 30 communicate() 等待任務結束 31 32 stdin 標準輸入 33 34 stdout 標準輸出 35 36 stderr 標準錯誤 37 38 pid 39 The process ID of the child process. 40 41 #例子 42 >>> p = subprocess.Popen("df -h|grep disk",stdin=subprocess.PIPE,stdout=subprocess.PIPE,shell=True) 43 >>> p.stdout.read() 44 b'/dev/disk1 465Gi 64Gi 400Gi 14% 16901472 104938142 14% /\n'
1 >>> subprocess.run(["ls", "-l"]) # doesn't capture output 2 CompletedProcess(args=['ls', '-l'], returncode=0) 3 4 >>> subprocess.run("exit 1", shell=True, check=True) 5 Traceback (most recent call last): 6 ... 7 subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1 8 9 >>> subprocess.run(["ls", "-l", "/dev/null"], stdout=subprocess.PIPE) 10 CompletedProcess(args=['ls', '-l', '/dev/null'], returncode=0, 11 stdout=b'crw-rw-rw- 1 root root 1, 3 Jan 23 16:23 /dev/null\n')
調用subprocess.run(...)是推薦的經常使用方法,在大多數狀況下能知足需求,但若是你可能須要進行一些複雜的與系統的交互的話,你還能夠用subprocess.Popen(),語法以下:
1 p = subprocess.Popen("find / -size +1000000 -exec ls -shl {} \;",shell=True,stdout=subprocess.PIPE) 2 print(p.stdout.read())
可用參數:
終端輸入的命令分爲兩種:
須要交互的命令示例
1 import subprocess 2 3 obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 4 obj.stdin.write('print 1 \n ') 5 obj.stdin.write('print 2 \n ') 6 obj.stdin.write('print 3 \n ') 7 obj.stdin.write('print 4 \n ') 8 9 out_error_list = obj.communicate(timeout=10) 10 print out_error_list
1 import subprocess 2 3 def mypass(): 4 mypass = '123' #or get the password from anywhere 5 return mypass 6 7 echo = subprocess.Popen(['echo',mypass()], 8 stdout=subprocess.PIPE, 9 ) 10 11 sudo = subprocess.Popen(['sudo','-S','iptables','-L'], 12 stdin=echo.stdout, 13 stdout=subprocess.PIPE, 14 ) 15 16 end_of_pipe = sudo.stdout 17 18 print "Password ok \n Iptables Chains %s" % end_of_pipe.read()
用於便捷記錄日誌且線程安全的模塊
1 import logging 2 3 4 logging.basicConfig(filename='log.log', 5 format='%(asctime)s - %(name)s - %(levelname)s -%(module)s: %(message)s', 6 datefmt='%Y-%m-%d %H:%M:%S %p', 7 level=10) 8 9 logging.debug('debug') 10 logging.info('info') 11 logging.warning('warning') 12 logging.error('error') 13 logging.critical('critical') 14 logging.log(10,'log')
對於等級:
1 CRITICAL = 50 2 FATAL = CRITICAL 3 ERROR = 40 4 WARNING = 30 5 WARN = WARNING 6 INFO = 20 7 DEBUG = 10 8 NOTSET = 0
只有大於當前日誌等級的操做纔會被記錄。
對於格式,有以下屬性但是配置:
不少程序都有記錄日誌的需求,而且日誌中包含的信息即有正常的程序訪問日誌,還可能有錯誤、警告等信息輸出,python的logging模塊提供了標準的日誌接口,你能夠經過它存儲各類格式的日誌,logging的日誌能夠分爲 debug()
, info()
, warning()
, error()
and critical() 5個級別,
下面咱們看一下怎麼用。
最簡單用法
1 import logging 2 3 logging.warning("user [alex] attempted wrong password more than 3 times") 4 logging.critical("server is down") 5 6 #輸出 7 WARNING:root:user [alex] attempted wrong password more than 3 times 8 CRITICAL:root:server is down
看一下這幾個日誌級別分別表明什麼意思
Level | When it’s used |
---|---|
DEBUG |
Detailed information, typically of interest only when diagnosing problems. |
INFO |
Confirmation that things are working as expected. |
WARNING |
An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected. |
ERROR |
Due to a more serious problem, the software has not been able to perform some function. |
CRITICAL |
A serious error, indicating that the program itself may be unable to continue running. |
若是想把日誌寫到文件裏,也很簡單
1 import logging 2 3 logging.basicConfig(filename='example.log',level=logging.INFO) 4 logging.debug('This message should go to the log file') 5 logging.info('So should this') 6 logging.warning('And this, too')
其中下面這句中的level=loggin.INFO意思是,把日誌紀錄級別設置爲INFO,也就是說,只有比日誌是INFO或比INFO級別更高的日誌纔會被紀錄到文件裏,在這個例子, 第一條日誌是不會被紀錄的,若是但願紀錄debug的日誌,那把日誌級別改爲DEBUG就好了。
1 logging.basicConfig(filename='example.log',level=logging.INFO)
感受上面的日誌格式忘記加上時間啦,日誌不知道時間怎麼行呢,下面就來加上!
1 import logging 2 logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p') 3 logging.warning('is when this event was logged.') 4 5 #輸出 6 12/12/2010 11:46:36 AM is when this event was logged.
日誌格式
%(name)s |
Logger的名字 |
%(levelno)s |
數字形式的日誌級別 |
%(levelname)s |
文本形式的日誌級別 |
%(pathname)s |
調用日誌輸出函數的模塊的完整路徑名,可能沒有 |
%(filename)s |
調用日誌輸出函數的模塊的文件名 |
%(module)s |
調用日誌輸出函數的模塊名 |
%(funcName)s |
調用日誌輸出函數的函數名 |
%(lineno)d |
調用日誌輸出函數的語句所在的代碼行 |
%(created)f |
當前時間,用UNIX標準的表示時間的浮 點數表示 |
%(relativeCreated)d |
輸出日誌信息時的,自Logger建立以 來的毫秒數 |
%(asctime)s |
字符串形式的當前時間。默認格式是 「2003-07-08 16:49:45,896」。逗號後面的是毫秒 |
%(thread)d |
線程ID。可能沒有 |
%(threadName)s |
線程名。可能沒有 |
%(process)d |
進程ID。可能沒有 |
%(message)s |
用戶輸出的消息 |
若是想同時把log打印在屏幕和文件日誌裏,就須要瞭解一點複雜的知識 了
Python 使用logging模塊記錄日誌涉及四個主要類,使用官方文檔中的歸納最爲合適:
logger提供了應用程序能夠直接使用的接口;
handler將(logger建立的)日誌記錄發送到合適的目的輸出;
filter提供了細度設備來決定輸出哪條日誌記錄;
formatter決定日誌記錄的最終輸出格式。
logger
每一個程序在輸出信息以前都要得到一個Logger。Logger一般對應了程序的模塊名,好比聊天工具的圖形界面模塊能夠這樣得到它的Logger:
LOG=logging.getLogger(」chat.gui」)
而核心模塊能夠這樣:
LOG=logging.getLogger(」chat.kernel」)
Logger.setLevel(lel):指定最低的日誌級別,低於lel的級別將被忽略。debug是最低的內置級別,critical爲最高
Logger.addFilter(filt)、Logger.removeFilter(filt):添加或刪除指定的filter
Logger.addHandler(hdlr)、Logger.removeHandler(hdlr):增長或刪除指定的handler
Logger.debug()、Logger.info()、Logger.warning()、Logger.error()、Logger.critical():能夠設置的日誌級別
handler
handler對象負責發送相關的信息到指定目的地。Python的日誌系統有多種Handler可使用。有些Handler能夠把信息輸出到控制檯,有些Logger能夠把信息輸出到文件,還有些 Handler能夠把信息發送到網絡上。若是以爲不夠用,還能夠編寫本身的Handler。能夠經過addHandler()方法添加多個多handler
Handler.setLevel(lel):指定被處理的信息級別,低於lel級別的信息將被忽略
Handler.setFormatter():給這個handler選擇一個格式
Handler.addFilter(filt)、Handler.removeFilter(filt):新增或刪除一個filter對象
每一個Logger能夠附加多個Handler。接下來咱們就來介紹一些經常使用的Handler:
1) logging.StreamHandler
使用這個Handler能夠向相似與sys.stdout或者sys.stderr的任何文件對象(file object)輸出信息。它的構造函數是:
StreamHandler([strm])
其中strm參數是一個文件對象。默認是sys.stderr
2) logging.FileHandler
和StreamHandler相似,用於向一個文件輸出日誌信息。不過FileHandler會幫你打開這個文件。它的構造函數是:
FileHandler(filename[,mode])
filename是文件名,必須指定一個文件名。
mode是文件的打開方式。參見Python內置函數open()的用法。默認是’a',即添加到文件末尾。
3) logging.handlers.RotatingFileHandler
這個Handler相似於上面的FileHandler,可是它能夠管理文件大小。當文件達到必定大小以後,它會自動將當前日誌文件更名,而後建立 一個新的同名日誌文件繼續輸出。好比日誌文件是chat.log。當chat.log達到指定的大小以後,RotatingFileHandler自動把 文件更名爲chat.log.1。不過,若是chat.log.1已經存在,會先把chat.log.1重命名爲chat.log.2。。。最後從新建立 chat.log,繼續輸出日誌信息。它的構造函數是:
RotatingFileHandler( filename[, mode[, maxBytes[, backupCount]]])
其中filename和mode兩個參數和FileHandler同樣。
maxBytes用於指定日誌文件的最大文件大小。若是maxBytes爲0,意味着日誌文件能夠無限大,這時上面描述的重命名過程就不會發生。
backupCount用於指定保留的備份文件的個數。好比,若是指定爲2,當上面描述的重命名過程發生時,原有的chat.log.2並不會被改名,而是被刪除。
4) logging.handlers.TimedRotatingFileHandler
這個Handler和RotatingFileHandler相似,不過,它沒有經過判斷文件大小來決定什麼時候從新建立日誌文件,而是間隔必定時間就 自動建立新的日誌文件。重命名的過程與RotatingFileHandler相似,不過新的文件不是附加數字,而是當前時間。它的構造函數是:
TimedRotatingFileHandler( filename [,when [,interval [,backupCount]]])
其中filename參數和backupCount參數和RotatingFileHandler具備相同的意義。
interval是時間間隔。
when參數是一個字符串。表示時間間隔的單位,不區分大小寫。它有如下取值:
S 秒
M 分
H 小時
D 天
W 每星期(interval==0時表明星期一)
midnight 天天凌晨
1 import logging 2 3 #create logger 4 logger = logging.getLogger('TEST-LOG') 5 logger.setLevel(logging.DEBUG) 6 7 8 # create console handler and set level to debug 9 ch = logging.StreamHandler() 10 ch.setLevel(logging.DEBUG) 11 12 # create file handler and set level to warning 13 fh = logging.FileHandler("access.log") 14 fh.setLevel(logging.WARNING) 15 # create formatter 16 formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') 17 18 # add formatter to ch and fh 19 ch.setFormatter(formatter) 20 fh.setFormatter(formatter) 21 22 # add ch and fh to logger 23 logger.addHandler(ch) 24 logger.addHandler(fh) 25 26 # 'application' code 27 logger.debug('debug message') 28 logger.info('info message') 29 logger.warn('warn message') 30 logger.error('error message') 31 logger.critical('critical message')
文件自動截斷例子
1 import logging 2 3 from logging import handlers 4 5 logger = logging.getLogger(__name__) 6 7 log_file = "timelog.log" 8 #fh = handlers.RotatingFileHandler(filename=log_file,maxBytes=10,backupCount=3) 9 fh = handlers.TimedRotatingFileHandler(filename=log_file,when="S",interval=5,backupCount=3) 10 11 12 formatter = logging.Formatter('%(asctime)s %(module)s:%(lineno)d %(message)s') 13 14 fh.setFormatter(formatter) 15 16 logger.addHandler(fh) 17 18 19 logger.warning("test1") 20 logger.warning("test12") 21 logger.warning("test13") 22 logger.warning("test14")
經常使用正則表達式符號
1 '.' 默認匹配除\n以外的任意一個字符,若指定flag DOTALL,則匹配任意字符,包括換行 2 '^' 匹配字符開頭,若指定flags MULTILINE,這種也能夠匹配上(r"^a","\nabc\neee",flags=re.MULTILINE) 3 '$' 匹配字符結尾,或e.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()也能夠 4 '*' 匹配*號前的字符0次或屢次,re.findall("ab*","cabb3abcbbac") 結果爲['abb', 'ab', 'a'] 5 '+' 匹配前一個字符1次或屢次,re.findall("ab+","ab+cd+abb+bba") 結果['ab', 'abb'] 6 '?' 匹配前一個字符1次或0次 7 '{m}' 匹配前一個字符m次 8 '{n,m}' 匹配前一個字符n到m次,re.findall("ab{1,3}","abb abc abbcbbb") 結果'abb', 'ab', 'abb'] 9 '|' 匹配|左或|右的字符,re.search("abc|ABC","ABCBabcCD").group() 結果'ABC' 10 '(...)' 分組匹配,re.search("(abc){2}a(123|456)c", "abcabca456c").group() 結果 abcabca456c 11 12 13 '\A' 只從字符開頭匹配,re.search("\Aabc","alexabc") 是匹配不到的 14 '\Z' 匹配字符結尾,同$ 15 '\d' 匹配數字0-9 16 '\D' 匹配非數字 17 '\w' 匹配[A-Za-z0-9] 18 '\W' 匹配非[A-Za-z0-9] 19 's' 匹配空白字符、\t、\n、\r , re.search("\s+","ab\tc1\n3").group() 結果 '\t'
1 '(?P<name>...)' 分組匹配 re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict("city") 結果{'province': '3714', 'city': '81', 'birthday': '1993'}
最經常使用的匹配語法
1 re.match 從頭開始匹配 2 re.search 匹配包含 3 re.findall 把全部匹配到的字符放到以列表中的元素返回 4 re.splitall 以匹配到的字符當作列表分隔符 5 re.sub 匹配字符並替換
反斜槓的困擾
與大多數編程語言相同,正則表達式裏使用"\"做爲轉義字符,這就可能形成反斜槓困擾。假如你須要匹配文本中的字符"\",那麼使用編程語言表示的正則表達式裏將須要4個反斜槓"\\\\":前兩個和後兩個分別用於在編程語言裏轉義成反斜槓,轉換成兩個反斜槓後再在正則表達式裏轉義成一個反斜槓。Python裏的原生字符串很好地解決了這個問題,這個例子中的正則表達式可使用r"\\"表示。一樣,匹配一個數字的"\\d"能夠寫成r"\d"。有了原生字符串,你不再用擔憂是否是漏寫了反斜槓,寫出來的表達式也更直觀。
僅需輕輕知道的幾個匹配模式
1 re.I(re.IGNORECASE): 忽略大小寫(括號內是完整寫法,下同) 2 M(MULTILINE): 多行模式,改變'^'和'$'的行爲(參見上圖) 3 S(DOTALL): 點任意匹配模式,改變'.'的行爲
開發一個簡單的python計算器
hint:
re.search(r'\([^()]+\)',s).group()
'(-40/5)'