Day6 模塊及Python經常使用模塊

模塊概述

定義:模塊,用一砣代碼實現了某類功能的代碼集合。 html

爲了編寫可維護的代碼,咱們把不少函數分組,分別放到不一樣的文件裏,提供了代碼的重用性。在Python中,一個.py文件就稱之爲一個模塊(Module)。node

注意:python

  • 模塊讓你可以有邏輯地組織你的Python代碼段。
  • 把相關的代碼分配到一個 模塊裏能讓你的代碼更好用,更易懂。
  • 簡單地說,模塊就是一個保存了Python代碼的文件。模塊能定義函數,類和變量。模塊裏也能包含可執行的代碼!

模塊分爲三種:mysql

  • 自定義模塊
  • 開源模塊
  • 內置模塊(又稱標準庫)

使用模塊有什麼好處?nginx

  第一:最大的好處是大大提升了代碼的可維護性。git

  其次:編寫代碼沒必要從零開始。當一個模塊編寫完畢,就能夠被其餘地方引用。咱們在編寫程序的時候,也常常引用其餘模塊,包括Python程序員

       內置的模塊和來自第三方的模塊。github

  第三:使用模塊還能夠避免函數名和變量名衝突。相同名字的函數和變量徹底能夠分別存在不一樣的模塊中,所以,咱們本身在編寫模塊時,正則表達式

            沒必要考慮名字會與其餘模塊衝突。redis

可是也要注意:儘可能不要與內置函數名字衝突。

自定義的模塊

注意:自定義模塊不要與系統內置的模塊同名!

2.模塊須知:

 Python之因此應用愈來愈普遍,在必定程度上也依賴於其爲程序員提供了大量的模塊以供使用,若是想要使用模塊,則須要導入。導入模塊用import:

import 語法:

import module1[, module2[,... moduleN]

Python 自己帶着一些標準的模塊庫

#!/usr/bin/python3
# 文件名: using_sys.py

import sys

print('命令行參數以下:')
for i in sys.argv:
   print(i)

print('\n\nPython 路徑爲:', sys.path, '\n')

注意:

  • 一、import sys 引入 python 標準庫中的 sys.py 模塊;這是引入某一模塊的方法。
  • 二、sys.argv 是一個包含命令行參數的列表。
  • 三、sys.path sys.path是python的搜索模塊的路徑集,是一個list,能夠在python 環境下使用sys.path.append(path)添加相關的路徑,這樣就能夠添加指定地址中的py文件了!
  • 4.千萬要區分開:python ab.py python解釋器會從當前目錄中查找ab文件,而程序中代碼執行用到的變量、方法、模塊是從sys.path路徑集中按順序查找!
  • 5.一個模塊只會被導入一次,無論你執行了多少次import。這樣能夠防止導入模塊被一遍又一遍地執行[只初始化執行依次]。
  • 6.import語句能夠出如今任意位置!
import os
#獲取文件的當前路徑
current_path = os.path.dirname(__file__)
#獲取父級目錄
pre_path = os.path.dirname(current_path)
print(pre_path)

import拓展:

import module #導入整個模塊,可是指定模塊中的變量和函數並無導入,導入以後就能夠直接用module.[點]的方式訪問裏面的變量和方法了!
from module.xx.xx import xx[,yy] #導入模塊中某些變量或函數
from module.xx.xx import xx as rename[,yy as rename] #別名 
from module.xx.xx import * #導入模塊中全部的不是如下劃線(_)開頭的名字都導入到當前位置.[_]單下劃線開頭的是私有變量或者方法,只能在本模塊使用,不能夠在別的模塊使用!

通常來講,應該避免使用from … import 而使用import語句,由於這樣可使你的程序更加易讀,也能夠避免名稱衝突

導入模塊其實就是告訴Python解釋器去解釋那個py文件

  • 導入一個py文件,解釋器解釋該py文件
  • 導入一個包,解釋器解釋該包下的 __init__.py 文件

那麼問題來了,導入模塊時是根據那個路徑做爲基準來進行的呢?即:sys.path;若是sys.path路徑列表沒有你想要的路徑,能夠經過 sys.path.append('路徑') 添加。

sys.path.insert(0,'/x/y/z') #排在前的目錄,優先被搜索
sys.path.remove() #刪除某個搜索目錄

綜上所述:當咱們導入某個模塊的時候,首先會從python內置的模塊【內置模塊如:sys】中找,找不到再去sys.path路徑中查找【此路徑包含當前目錄(空目錄)】

做用域

在一個模塊中,咱們可能會定義不少函數和變量,但有的函數和變量咱們但願給別人使用,有的函數和變量咱們但願僅僅在模塊內部使用。在Python中,是經過_前綴來實現的。

正常的函數和變量名是公開的(public),能夠被直接引用,好比:abcx123PI等;

相似__xxx__這樣的變量是特殊變量,能夠被直接引用,可是有特殊用途

好比:做者:__author__,文檔註釋__doc__就是特殊變量,hello模塊定義的也能夠用特殊變量訪問,咱們本身的變量通常不要用這種變量名;

相似_xxx__xxx這樣的函數或變量就是非公開的(private),不該該被直接引用,好比_abc__abc等;

之因此咱們說,private函數和變量「不該該」被直接引用,而不是「不能」被直接引用,是由於Python並無一種方法能夠徹底限制訪問private函數或變量,可是,從編程習慣上不該該引用private函數或變量。

private函數或變量不該該被別人引用,那它們有什麼用呢?請看例子:

def _private_1(name):
    return 'Hello, %s' % name

def _private_2(name):
    return 'Hi, %s' % name

def greeting(name):
    if len(name) > 3:
        return _private_1(name)
    else:
        return _private_2(name)
View Code

咱們在模塊裏公開greeting()函數,而把內部邏輯用private函數隱藏起來了,這樣,調用greeting()函數不用關心內部的private函數細節,這也是一種很是有用的代碼封裝和抽象的方法,即:

外部不須要引用的函數所有定義成private,只有外部須要引用的函數才定義爲public。

其它小知識點:

__name__屬性:

一個模塊被另外一個程序第一次引入時,其主程序將運行。若是咱們想在模塊被引入時,模塊中的某一程序塊不執行,咱們能夠用__name__屬性來使該程序塊僅在該模塊自身運行時執行。

if __name__ == '__main__':
   print('程序自身在運行')
else:
   print('我來自另外一模塊')

每一個模塊都有一個__name__屬性,當其值是'__main__'時,代表該模塊自身在運行,不然是被引入。

dir() 函數:

內置的函數 dir() 能夠找到模塊【可調用對象】內定義的全部屬性和方法,這些方法能夠以模塊名.[點]的方式訪問。以一個字符串列表的形式返回:

import sys
print(sys.__name__) #注意:用import導入時是能夠訪問它的全部的屬性和方法的!
print(dir(sys))
View Code

若是沒有參數,dir()列舉出當前定義的名字

dir()不會列舉出內建函數或者變量的名字,它們都被定義到了標準模塊builtin中,能夠列舉出它們,

import builtins
dir(builtins)

編譯python文件

爲了提升模塊的加載速度,Python緩存編譯的版本,每一個模塊在__pycache__目錄的以module.version.pyc的形式命名,一般包含了python的版本號,如在CPython版本3.3,關於spam.py的編譯版本將被緩存成__pycache__/spam.cpython-33.pyc,這種命名約定容許不一樣的版本,不一樣版本的Python編寫模塊共存。

Python檢查源文件的修改時間與編譯的版本進行對比,若是過時就須要從新編譯。這是徹底自動的過程。而且編譯的模塊是平臺獨立的,因此相同的庫能夠在不一樣的架構的系統之間共享,即pyc使一種跨平臺的字節碼,相似於JAVA、NET,是由python虛擬機來執行的,可是pyc的內容跟python的版本相關,不一樣的版本編譯後的pyc文件不一樣,2.5編譯的pyc文件不能到3.5上執行,而且pyc文件是能夠反編譯的,於是它的出現僅僅是用來提高模塊的加載速度的。

提示:

1.模塊名區分大小寫,foo.py與FOO.py表明的是兩個模塊

2.你可使用-O或者-OO轉換python命令來減小編譯模塊的大小

1 -O轉換會幫你去掉assert語句
2 -OO轉換會幫你去掉assert語句和__doc__文檔字符串
3 因爲一些程序可能依賴於assert語句或文檔字符串,你應該在在確認須要的狀況下使用這些選項。

3.在速度上從.pyc文件中讀指令來執行不會比從.py文件中讀指令執行更快,只有在模塊被加載時,.pyc文件纔是更快的

4.只有使用import語句是纔將文件自動編譯爲.pyc文件,在命令行或標準輸入中指定運行腳本則不會生成這類文件,於是咱們可使用compieall模塊爲一個目錄中的全部模塊建立.pyc文件

模塊能夠做爲一個腳本(使用python -m compileall)編譯Python源
 
python -m compileall /module_directory 遞歸着編譯
若是使用python -O -m compileall /module_directory -l則只一層
 
命令行裏使用compile()函數時,自動使用python -O -m compileall
 
詳見:https://docs.python.org/3/library/compileall.html#module-compileall

總結:python並不是徹底是解釋性語言,它是有編譯的,先把源碼py文件編譯成pyc,而後由python的虛擬機執行!

咱們平時所說的python解釋器實際上是Cpython,在執行的時候,python會先將.py文件編譯成中間形式的字節碼(bytecode)並存放在內存當中,而後在正真執行的時候將字節碼解釋爲機器可識別的二進制碼。
默認狀況下,被import的文件編譯出字節碼會被保存下來,即咱們看到的.pyc文件了。固然咱們能夠顯示的編譯一個.py文件並保存。

包:

你也許還想到,若是不一樣的人編寫的模塊名相同怎麼辦?爲了不模塊名衝突,Python又引入了按目錄來組織模塊的方法,稱爲包(Package)。

函數就至關於工具,模塊就是工具包,包就至關於工具包的集合!

舉個例子,一個init.py的文件就是一個名字叫init的模塊,一個Demo.py的文件就是一個名字叫Demo的模塊。

如今,假設咱們的init和Demo這兩個模塊名字與其餘模塊衝突了,因而咱們能夠經過包來組織模塊,避免衝突。方法是選擇一個頂層包名,好比oop,按照以下目錄存放:

引入了包之後,只要頂層的包名不與別人衝突,那全部模塊都不會與別人衝突。如今,init.py模塊的名字就變成了oop.init,相似的,Demo.py的模塊名變成了oop.Demo

請注意,每個包目錄下面都會有一個__init__.py的文件,這個文件是必須存在的,不然,Python就把這個目錄當成普通目錄,而不是一個包。

__init__.py能夠是空文件,也能夠有Python代碼,由於__init__.py自己就是一個模塊,而它的模塊名就是oop。

相似的,能夠有多級目錄,組成多級層次的包結構。好比以下的目錄結構:

兩個文件Demo.py的模塊名分別是oop.Demooop.conf.Demo

 

不管是import形式仍是from...import形式,凡是在導入語句中(而不是在使用時)遇到帶點的,都要第一時間提升警覺:這是關於包纔有的導入語法

包的本質就是一個包含__init__.py文件的目錄。

包A和包B下有同名模塊也不會衝突,如A.a與B.a來自倆個命名空間:

glance/                   #Top-level package

├── __init__.py      #Initialize the glance package

├── api                  #Subpackage for api

│   ├── __init__.py

│   ├── policy.py

│   └── versions.py

├── cmd                #Subpackage for cmd

│   ├── __init__.py

│   └── manage.py

└── db                  #Subpackage for db

    ├── __init__.py

    └── models.py
#文件內容

#policy.py
def get():
    print('from policy.py')

#versions.py
def create_resource(conf):
    print('from version.py: ',conf)

#manage.py
def main():
    print('from manage.py')

#models.py
def register_models(engine):
    print('from models.py: ',engine)

2.1 注意事項
1.關於包相關的導入語句也分爲import和from ... import ...兩種,可是不管哪一種,不管在什麼位置,在導入時都必須遵循一個原則:凡是在導入時帶點的,點的左邊都必須是一個包,不然非法。能夠帶有一連串的點, 如item.subitem.subsubitem,但都必須遵循這個原則。

2.對於導入後,在使用時就沒有這種限制了,點的左邊能夠是包,模塊,函數,類(它們均可以用點的方式調用本身的屬性)。

3.對比import item 和from item import name的應用場景:
若是咱們想直接使用name那必須使用後者。

 import 

咱們在與包glance同級別的文件中測試

import glance.db.models
glance.db.models.register_models('mysql') 

from ... import ...

須要注意的是from後import導入的模塊,必須是明確的一個不能帶點,不然會有語法錯誤,如:from a import b.c是錯誤語法

 

咱們在與包glance同級別的文件中測試 

from glance.db import models
models.register_models('mysql')
 
from glance.db.models import register_models
register_models('mysql')

__init__.py文件

無論是哪一種方式,只要是第一次導入包或者是包的任何其餘部分,都會依次執行包下的__init__.py文件(咱們能夠在每一個包的文件內都打印一行內容來驗證一下),這個文件能夠爲空,可是也能夠存放一些初始化包的代碼。

from glance.api import *

在講模塊時,咱們已經討論過了從一個模塊內導入全部*,此處咱們研究從一個包導入全部模塊[*]。

此處是想從包api中導入全部,實際上該語句只會執行導入包api下__init__.py文件中的代碼【,此時在導入的地方就能夠引用定義在__init__文件的變量和方法了】,而沒有導入這個包下的其它模塊,要是想導入裏面的某些或全部模塊,咱們能夠在這個文件中定義__all__,而後將模塊都放到__all__表示的列表中:

1 #在__init__.py中定義
2 x=10
3 
4 def func():
5     print('from api.__init.py')
6 
7 __all__=['x','func','policy']
View Code

此時咱們在於glance同級的文件中執行from glance.api import *就導入__all__中的內容(versions仍然不能導入),同時若是__init__文件中的變量或者函數若是不加入到__all__列表中,實際上在導入的地方也是不可以直接使用的,簡言之:__all__限定了 glance.api import *導入包的時候只是導入了init文件的__all__中的全部內容[注意:__all__只是給from module import *用的哦]

 

注意:上面的from glance.api import * 僅僅是導入一個包中的全部模塊的時候是按着上面那麼用,實際上以下:

從某個包中導入某個文件from glance.db  import models即從glance.db包中導入models模塊仍是能夠這麼用的,而後使用models.register_models()函數也是能夠的!

固然導入某個包某個文件中的全部不是以_開頭的屬性和方法也是能夠的,以下所示:

from test.lib.aa import *  #導入test包下lib包下的aa模塊,而後就能夠在下面使用aa模塊中的af()函數了!
af()

雖然有這種方式,可是不建議這麼使用!

絕對導入和相對導入

咱們的最頂級包glance是寫給別人用的,而後在glance包內部也會有彼此之間互相導入的需求,這時候就有絕對導入和相對導入兩種方式:

絕對導入:以glance做爲起始

相對導入:用.或者..的方式最爲起始(只能在一個包中使用,不能用於不一樣目錄內)

例如:咱們在glance/api/versions.py中想要導入glance/api/policy.py文件,這時候咱們又在與glance包在同一級目錄的py文件中導入了glance/api/versions.py就會出問題!以下所示:

import policy
policy.get()  #注意:此時導入policy模塊,並調用policy模塊中的get()方法是沒問題的,此時執行的時候是以當前文件的相對路徑導入的,因此沒問題!
def create_resource(conf):
    print('from version.py: ',conf)
import glance.api.versions #Demo.py與glance包在同一級目錄,此時就會有問題,由於此時是以Demo的路徑爲當前路徑執行的,當執行導入glance.api.versions的時候,在這個文件中
                           #又導入了policy文件,可是此時的policy文件查找路徑是以Demo的路徑爲基準查找的,因此找不到,報錯誤!

出現以下錯誤[緣由在上面已寫]:

ImportError: No module named 'policy'
特別須要注意的是:能夠用import導入內置或者第三方開源模塊,可是要絕對避免使用import來導入自定義包的子模塊,應該使用from... import ...的絕對或者相對導入,且包的相對導入只能用from的形式。

 這時咱們能夠導入例如:咱們在glance/api/versions.py中想要導入glance/api/policy.py能夠用相對路徑和絕對路徑的方式,以下所示:

from glance.api import policy #絕對路徑
# from . import policy 相對路徑  #from ..cmd import manage
policy.get()
def create_resource(conf):
    print('from version.py: ',conf)

尤爲是咱們在用到了包的概念的時候,咱們對項目的組織結構就是一個項目下有多個包,包中再有子包或者py文件,此時咱們包中的文件必定要用from ... import ... 的方式導入,方便本身在別的包使用,也方便給別人的時候使用!

單獨導入包

單獨導入包名稱時不會導入包中全部包含的全部子模塊,如:

#在與glance同級的test.py中
import glance
glance.cmd.manage.main()

'''
執行結果:
AttributeError: module 'glance' has no attribute 'cmd'

'''
View Code

解決方法:

#glance/__init__.py
from . import cmd

#glance/cmd/__init__.py
from . import manage

千萬別問:__all__不能解決嗎,__all__是用於控制from...import * ,fuck!

綜上所述:在包內的py文件若是想要導入本身寫的py文件就用from module import 導入【若是導入的是內置的或者第三方包能夠用import】,在包外面導入包【導入包中的文件就不用這麼作了】的時候,能夠用咱們上面這種方式,在包下面的__init__.py文件中加入 from . import module/py

開源模塊

下載安裝有兩種方式:

yum 
pip
apt-get
...
View Code
下載源碼
解壓源碼
進入目錄
編譯源碼    python setup.py build
安裝源碼    python setup.py install
View Code

注:在使用源碼安裝時,須要使用到gcc編譯和python開發環境,因此,須要先執行:

yum install gcc
yum install python-devel
或
apt-get python-dev
View Code

安裝成功後,模塊會自動安裝到 sys.path 中的某個目錄中,如:

/usr/lib/python2.7/site-packages/

2、導入模塊

同自定義模塊中導入的方式

3、模塊 paramiko

paramiko是一個用於作遠程控制的模塊,使用該模塊能夠對遠程服務器進行命令或文件操做,值得一說的是,fabric和ansible內部的遠程管理就是使用的paramiko來現實。

一、下載安裝

pip3 install paramiko

# pycrypto,因爲 paramiko 模塊內部依賴pycrypto,因此先下載安裝pycrypto
 
# 下載安裝 pycrypto
wget http://files.cnblogs.com/files/wupeiqi/pycrypto-2.6.1.tar.gz
tar -xvf pycrypto-2.6.1.tar.gz
cd pycrypto-2.6.1
python setup.py build
python setup.py install
 
# 進入python環境,導入Crypto檢查是否安裝成功
 
# 下載安裝 paramiko
wget http://files.cnblogs.com/files/wupeiqi/paramiko-1.10.1.tar.gz
tar -xvf paramiko-1.10.1.tar.gz
cd paramiko-1.10.1
python setup.py build
python setup.py install
 
# 進入python環境,導入paramiko檢查是否安裝成功

二、使用模塊

#!/usr/bin/env python
#coding:utf-8

import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('192.168.1.108', 22, 'alex', '123')
stdin, stdout, stderr = ssh.exec_command('df')
print stdout.read()
ssh.close();
執行命令 - 經過用戶名和密碼鏈接服務器
執行命令 - 過密鑰連接服務器
import os,sys
import paramiko

t = paramiko.Transport(('182.92.219.86',22))
t.connect(username='wupeiqi',password='123')
sftp = paramiko.SFTPClient.from_transport(t)
sftp.put('/tmp/test.py','/tmp/test.py') 
t.close()


import os,sys
import paramiko

t = paramiko.Transport(('182.92.219.86',22))
t.connect(username='wupeiqi',password='123')
sftp = paramiko.SFTPClient.from_transport(t)
sftp.get('/tmp/test.py','/tmp/test2.py')
t.close()
上傳或者下載文件 - 經過用戶名和密碼
上傳或下載文件 - 經過密鑰

內置模塊

python的內置模塊在python下載的解釋器中是找不到py文件的!好比sys模塊對吧,那麼在下載python解釋器的時候,還自帶了Lib庫等文件夾,這裏面也是有py文件的,咱們把這些自帶的以及找不到py文件的這些統稱爲內置模塊!

1、os

用於提供系統級別的操做

os.getcwd() 獲取當前工做目錄,即當前python腳本工做的目錄路徑
os.chdir("dirname")  改變當前腳本工做目錄;至關於shell下cd
os.curdir  返回當前目錄: ('.')
os.pardir  獲取當前目錄的父目錄字符串名:('..')
os.makedirs('dirname1/dirname2')    可生成多層遞歸目錄
os.removedirs('dirname1')    若目錄爲空,則刪除,並遞歸到上一級目錄,如若也爲空,則刪除,依此類推
os.mkdir('dirname')    生成單級目錄;至關於shell中mkdir dirname
os.rmdir('dirname')    刪除單級空目錄,若目錄不爲空則沒法刪除,報錯;至關於shell中rmdir dirname
os.listdir('dirname')    列出指定目錄下的全部文件和子目錄,包括隱藏文件,並以列表方式打印
os.remove()  刪除一個文件
os.rename("oldname","newname")  重命名文件/目錄
os.stat('path/filename')  獲取文件/目錄信息
os.sep    輸出操做系統特定的路徑分隔符,win下爲"\\",Linux下爲"/"
os.linesep    輸出當前平臺使用的行終止符,win下爲"\t\n",Linux下爲"\n"
os.pathsep    輸出用於分割文件路徑的字符串
os.name    輸出字符串指示當前使用平臺。win->'nt'; Linux->'posix'
os.system("bash command")  運行shell命令,直接顯示
os.environ  獲取系統環境變量
os.path.abspath(path)  返回path規範化的絕對路徑
os.path.split(path)  將path分割成目錄和文件名二元組返回
os.path.dirname(path)  返回path的目錄。其實就是os.path.split(path)的第一個元素
os.path.basename(path)  返回path最後的文件名。如何path以/或\結尾,那麼就會返回空值。即os.path.split(path)的第二個元素
os.path.exists(path)  若是path存在,返回True;若是path不存在,返回False
os.path.isabs(path)  若是path是絕對路徑,返回True
os.path.isfile(path)  若是path是一個存在的文件,返回True。不然返回False
os.path.isdir(path)  若是path是一個存在的目錄,則返回True。不然返回False
os.path.join(path1[, path2[, ...]])  將多個路徑組合後返回,第一個絕對路徑以前的參數將被忽略
os.path.getatime(path)  返回path所指向的文件或者目錄的最後存取時間
os.path.getmtime(path)  返回path所指向的文件或者目錄的最後修改時間

更多猛擊這裏

2、sys

用於提供對解釋器相關的操做

sys.argv           命令行參數List,第一個元素是程序自己路徑
sys.exit(n)        退出程序,正常退出時exit(0)
sys.version        獲取Python解釋程序的版本信息
sys.maxint         最大的Int值
sys.path           返回模塊的搜索路徑,初始化時使用PYTHONPATH環境變量的值
sys.platform       返回操做系統平臺名稱
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]
View Code

更多猛擊這裏

3、hashlib 

用於加密相關的操做,代替了md5模塊和sha模塊,主要提供 SHA1, SHA224, SHA256, SHA384, SHA512 ,MD5 算法

import md5
hash = md5.new()
hash.update('admin')
print hash.hexdigest()
md5-廢棄
import sha

hash = sha.new()
hash.update('admin')
print hash.hexdigest()
sha-廢棄
import hashlib
  
# ######## md5 ########
  
hash = hashlib.md5()
hash.update('admin')
print hash.hexdigest()
  
# ######## sha1 ########
  
hash = hashlib.sha1()
hash.update('admin')
print hash.hexdigest()
  
# ######## sha256 ########
  
hash = hashlib.sha256()
hash.update('admin')
print hash.hexdigest()
  
  
# ######## sha384 ########
  
hash = hashlib.sha384()
hash.update('admin')
print hash.hexdigest()
  
# ######## sha512 ########
  
hash = hashlib.sha512()
hash.update('admin')
print hash.hexdigest()

以上加密算法雖然依然很是厲害,但時候存在缺陷,即:經過撞庫能夠反解。因此,有必要對加密算法中添加自定義key再來作加密。

import hashlib
# ######## md5 ########
 
hash = hashlib.md5('898oaFs09f')
hash.update('admin')
print hash.hexdigest()

還不夠吊?python 還有一個 hmac 模塊,它內部對咱們建立 key 和 內容 再進行處理而後再加密

散列消息鑑別碼,簡稱HMAC,是一種基於消息鑑別碼MAC(Message Authentication Code)的鑑別機制。使用HMAC時,消息通信的雙方,經過驗證消息中加入的鑑別密鑰K來鑑別消息的真僞;

通常用於網絡通訊中消息加密,前提是雙方先要約定好key,就像接頭暗號同樣,而後消息發送把用key把消息加密,接收方用key + 消息明文再加密,拿加密後的值 跟 發送者的相對比是否相等,這樣就能驗證消息的真實性,及發送者的合法性了。

import hmac
h = hmac.new(b'天王蓋地虎', b'寶塔鎮河妖')
print h.hexdigest()

更多關於md5,sha1,sha256等介紹的文章看這裏https://www.tbs-certificates.co.uk/FAQ/en/sha256.html  

4、random模塊

隨機數

mport random
print random.random()
print random.randint(1,2)
print random.randrange(1,10)

生成隨機驗證碼

import random
checkcode = ''
for i in range(4):
    current = random.randrange(0,4)
    if current != i:
        temp = chr(random.randint(65,90))
    else:
        temp = random.randint(0,9)
    checkcode += str(temp)
print checkcode

json & pickle 模塊

用於序列化的兩個模塊

  • json,用於字符串 和 python數據類型間進行轉換
  • pickle,用於python特有的類型 和 python的數據類型間進行轉換

Json模塊提供了四個功能:dumps、dump、loads、load

pickle模塊提供了四個功能:dumps、dump、loads、load

⑥time & datetime模塊

#_*_coding:utf-8_*_
__author__ = 'Alex Li'

import time


# print(time.clock()) #返回處理器時間,3.3開始已廢棄 , 改爲了time.process_time()測量處理器運算時間,不包括sleep時間,不穩定,mac上測不出來
# print(time.altzone)  #返回與utc時間的時間差,以秒計算\
# print(time.asctime()) #返回時間格式"Fri Aug 19 11:14:16 2016",
# print(time.localtime()) #返回本地時間 的struct time對象格式
# print(time.gmtime(time.time()-800000)) #返回utc時間的struc時間對象格式

# print(time.asctime(time.localtime())) #返回時間格式"Fri Aug 19 11:14:16 2016",
#print(time.ctime()) #返回Fri Aug 19 12:38:29 2016 格式, 同上



# 日期字符串 轉成  時間戳
# string_2_struct = time.strptime("2016/05/22","%Y/%m/%d") #將 日期字符串 轉成 struct時間對象格式
# print(string_2_struct)
# #
# struct_2_stamp = time.mktime(string_2_struct) #將struct時間對象轉成時間戳
# print(struct_2_stamp)



#將時間戳轉爲字符串格式
# print(time.gmtime(time.time()-86640)) #將utc時間戳轉換成struct_time格式
# print(time.strftime("%Y-%m-%d %H:%M:%S",time.gmtime()) ) #將utc struct_time格式轉成指定的字符串格式





#時間加減
import datetime

# print(datetime.datetime.now()) #返回 2016-08-19 12:47:03.941925
#print(datetime.date.fromtimestamp(time.time()) )  # 時間戳直接轉成日期格式 2016-08-19
# print(datetime.datetime.now() )
# print(datetime.datetime.now() + datetime.timedelta(3)) #當前時間+3天
# print(datetime.datetime.now() + datetime.timedelta(-3)) #當前時間-3天
# print(datetime.datetime.now() + datetime.timedelta(hours=3)) #當前時間+3小時
# print(datetime.datetime.now() + datetime.timedelta(minutes=30)) #當前時間+30分


#
# c_time  = datetime.datetime.now()
# print(c_time.replace(minute=3,hour=2)) #時間替換
View Code
Directive Meaning Notes
%a Locale’s abbreviated weekday name.  
%A Locale’s full weekday name.  
%b Locale’s abbreviated month name.  
%B Locale’s full month name.  
%c Locale’s appropriate date and time representation.  
%d Day of the month as a decimal number [01,31].  
%H Hour (24-hour clock) as a decimal number [00,23].  
%I Hour (12-hour clock) as a decimal number [01,12].  
%j Day of the year as a decimal number [001,366].  
%m Month as a decimal number [01,12].  
%M Minute as a decimal number [00,59].  
%p Locale’s equivalent of either AM or PM. (1)
%S Second as a decimal number [00,61]. (2)
%U Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0. (3)
%w Weekday as a decimal number [0(Sunday),6].  
%W Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0. (3)
%x Locale’s appropriate date representation.  
%X Locale’s appropriate time representation.  
%y Year without century as a decimal number [00,99].  
%Y Year with century as a decimal number.  
%z Time zone offset indicating a positive or negative time difference from UTC/GMT of the form +HHMM or -HHMM, where H represents decimal hour digits and M represents decimal minute digits [-23:59, +23:59].  
%Z Time zone name (no characters if no time zone exists).  
%% A literal '%' character.

 

7、shutil

高級的 文件、文件夾、壓縮包 處理模塊

shutil.copyfileobj(fsrc, fdst[, length])
將文件內容拷貝到另外一個文件中,能夠部份內容

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
View Code

shutil.copyfile(src, dst)
拷貝文件

def copyfile(src, dst):
    """Copy data from src to dst"""
    if _samefile(src, dst):
        raise Error("`%s` and `%s` are the same file" % (src, dst))

    for fn in [src, dst]:
        try:
            st = os.stat(fn)
        except OSError:
            # File most likely does not exist
            pass
        else:
            # XXX What about other special files? (sockets, devices...)
            if stat.S_ISFIFO(st.st_mode):
                raise SpecialFileError("`%s` is a named pipe" % fn)

    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)
View Code

shutil.copymode(src, dst)
僅拷貝權限。內容、組、用戶均不變

def copymode(src, dst):
    """Copy mode bits from src to dst"""
    if hasattr(os, 'chmod'):
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        os.chmod(dst, mode)
View Code

shutil.copystat(src, dst)
拷貝狀態的信息,包括:mode bits, atime, mtime, flags

def copystat(src, dst):
    """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    if hasattr(os, 'utime'):
        os.utime(dst, (st.st_atime, st.st_mtime))
    if hasattr(os, 'chmod'):
        os.chmod(dst, mode)
    if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
        try:
            os.chflags(dst, st.st_flags)
        except OSError, why:
            for err in 'EOPNOTSUPP', 'ENOTSUP':
                if hasattr(errno, err) and why.errno == getattr(errno, err):
                    break
            else:
                raise
View Code

shutil.copy(src, dst)
拷貝文件和權限

def copy(src, dst):
    """Copy data and mode bits ("cp src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copymode(src, dst)
View Code

shutil.copy2(src, dst)
拷貝文件和狀態信息

def copy2(src, dst):
    """Copy data and all stat info ("cp -p src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copystat(src, dst)
View Code

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
遞歸的去拷貝文件

例如:copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

def ignore_patterns(*patterns):
    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns
    that are used to exclude files"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns

def copytree(src, dst, symlinks=False, ignore=None):
    """Recursively copy a directory tree using copy2().

    The destination directory must not already exist.
    If exception(s) occur, an Error is raised with a list of reasons.

    If the optional symlinks flag is true, symbolic links in the
    source tree result in symbolic links in the destination tree; if
    it is false, the contents of the files pointed to by symbolic
    links are copied.

    The optional ignore argument is a callable. If given, it
    is called with the `src` parameter, which is the directory
    being visited by copytree(), and `names` which is the list of
    `src` contents, as returned by os.listdir():

        callable(src, names) -> ignored_names

    Since copytree() is called recursively, the callable will be
    called once for each directory that is copied. It returns a
    list of names relative to the `src` directory that should
    not be copied.

    XXX Consider this example code rather than the ultimate tool.

    """
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()

    os.makedirs(dst)
    errors = []
    for name in names:
        if name in ignored_names:
            continue
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                copytree(srcname, dstname, symlinks, ignore)
            else:
                # Will raise a SpecialFileError for unsupported file types
                copy2(srcname, dstname)
        # catch the Error from the recursive copytree so that we can
        # continue with other files
        except Error, err:
            errors.extend(err.args[0])
        except EnvironmentError, why:
            errors.append((srcname, dstname, str(why)))
    try:
        copystat(src, dst)
    except OSError, why:
        if WindowsError is not None and isinstance(why, WindowsError):
            # Copying file access times may fail on Windows
            pass
        else:
            errors.append((src, dst, str(why)))
    if errors:
        raise Error, errors
View Code

shutil.rmtree(path[, ignore_errors[, onerror]])
遞歸的去刪除文件

def rmtree(path, ignore_errors=False, onerror=None):
    """Recursively delete a directory tree.

    If ignore_errors is set, errors are ignored; otherwise, if onerror
    is set, it is called to handle the error with arguments (func,
    path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
    path is the argument to that function that caused it to fail; and
    exc_info is a tuple returned by sys.exc_info().  If ignore_errors
    is false and onerror is None, an exception is raised.

    """
    if ignore_errors:
        def onerror(*args):
            pass
    elif onerror is None:
        def onerror(*args):
            raise
    try:
        if os.path.islink(path):
            # symlinks to directories are forbidden, see bug #1669
            raise OSError("Cannot call rmtree on a symbolic link")
    except OSError:
        onerror(os.path.islink, path, sys.exc_info())
        # can't continue even if onerror hook returns
        return
    names = []
    try:
        names = os.listdir(path)
    except os.error, err:
        onerror(os.listdir, path, sys.exc_info())
    for name in names:
        fullname = os.path.join(path, name)
        try:
            mode = os.lstat(fullname).st_mode
        except os.error:
            mode = 0
        if stat.S_ISDIR(mode):
            rmtree(fullname, ignore_errors, onerror)
        else:
            try:
                os.remove(fullname)
            except os.error, err:
                onerror(os.remove, fullname, sys.exc_info())
    try:
        os.rmdir(path)
    except os.error:
        onerror(os.rmdir, path, sys.exc_info())
View Code

shutil.move(src, dst)
遞歸的去移動文件

def move(src, dst):
    """Recursively move a file or directory to another location. This is
    similar to the Unix "mv" command.

    If the destination is a directory or a symlink to a directory, the source
    is moved inside the directory. The destination path must not already
    exist.

    If the destination already exists but is not a directory, it may be
    overwritten depending on os.rename() semantics.

    If the destination is on our current filesystem, then rename() is used.
    Otherwise, src is copied to the destination and then removed.
    A lot more could be done here...  A look at a mv.c shows a lot of
    the issues this implementation glosses over.

    """
    real_dst = dst
    if os.path.isdir(dst):
        if _samefile(src, dst):
            # We might be on a case insensitive filesystem,
            # perform the rename anyway.
            os.rename(src, dst)
            return

        real_dst = os.path.join(dst, _basename(src))
        if os.path.exists(real_dst):
            raise Error, "Destination path '%s' already exists" % real_dst
    try:
        os.rename(src, real_dst)
    except OSError:
        if os.path.isdir(src):
            if _destinsrc(src, dst):
                raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
            copytree(src, real_dst, symlinks=True)
            rmtree(src)
        else:
            copy2(src, real_dst)
            os.unlink(src)
View Code

shutil.make_archive(base_name, format,...)

建立壓縮包並返回文件路徑,例如:zip、tar

    • base_name: 壓縮包的文件名,也能夠是壓縮包的路徑。只是文件名時,則保存至當前目錄,不然保存至指定路徑,
      如:www                        =>保存至當前路徑
      如:/Users/wupeiqi/www =>保存至/Users/wupeiqi/
    • format: 壓縮包種類,「zip」, 「tar」, 「bztar」,「gztar」
    • root_dir: 要壓縮的文件夾路徑(默認當前目錄)
    • owner: 用戶,默認當前用戶
    • group: 組,默認當前組
    • logger: 用於記錄日誌,一般是logging.Logger對象

 

#將 /Users/wupeiqi/Downloads/test 下的文件打包放置當前程序目錄
 
import shutil
ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
 
 
#將 /Users/wupeiqi/Downloads/test 下的文件打包放置 /Users/wupeiqi/目錄
import shutil
ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,
                 dry_run=0, owner=None, group=None, logger=None):
    """Create an archive file (eg. zip or tar).

    'base_name' is the name of the file to create, minus any format-specific
    extension; 'format' is the archive format: one of "zip", "tar", "bztar"
    or "gztar".

    'root_dir' is a directory that will be the root directory of the
    archive; ie. we typically chdir into 'root_dir' before creating the
    archive.  'base_dir' is the directory where we start archiving from;
    ie. 'base_dir' will be the common prefix of all files and
    directories in the archive.  'root_dir' and 'base_dir' both default
    to the current directory.  Returns the name of the archive file.

    'owner' and 'group' are used when creating a tar archive. By default,
    uses the current owner and group.
    """
    save_cwd = os.getcwd()
    if root_dir is not None:
        if logger is not None:
            logger.debug("changing into '%s'", root_dir)
        base_name = os.path.abspath(base_name)
        if not dry_run:
            os.chdir(root_dir)

    if base_dir is None:
        base_dir = os.curdir

    kwargs = {'dry_run': dry_run, 'logger': logger}

    try:
        format_info = _ARCHIVE_FORMATS[format]
    except KeyError:
        raise ValueError, "unknown archive format '%s'" % format

    func = format_info[0]
    for arg, val in format_info[1]:
        kwargs[arg] = val

    if format != 'zip':
        kwargs['owner'] = owner
        kwargs['group'] = group

    try:
        filename = func(base_name, base_dir, **kwargs)
    finally:
        if root_dir is not None:
            if logger is not None:
                logger.debug("changing back to '%s'", save_cwd)
            os.chdir(save_cwd)

    return filename
View Code

shutil 對壓縮包的處理是調用 ZipFile 和 TarFile 兩個模塊來進行的,詳細:

import zipfile

# 壓縮
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# 解壓
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()
zipfile 壓縮解壓
import tarfile

# 壓縮
tar = tarfile.open('your.tar','w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# 解壓
tar = tarfile.open('your.tar','r')
tar.extractall()  # 可設置解壓地址
tar.close()
tarfile 壓縮解壓
class ZipFile(object):
    """ Class with methods to open, read, write, close, list zip files.

    z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False)

    file: Either the path to the file, or a file-like object.
          If it is a path, the file will be opened and closed by ZipFile.
    mode: The mode can be either read "r", write "w" or append "a".
    compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
    allowZip64: if True ZipFile will create files with ZIP64 extensions when
                needed, otherwise it will raise an exception when this would
                be necessary.

    """

    fp = None                   # Set here since __del__ checks it

    def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False):
        """Open the ZIP file with mode read "r", write "w" or append "a"."""
        if mode not in ("r", "w", "a"):
            raise RuntimeError('ZipFile() requires mode "r", "w", or "a"')

        if compression == ZIP_STORED:
            pass
        elif compression == ZIP_DEFLATED:
            if not zlib:
                raise RuntimeError,\
                      "Compression requires the (missing) zlib module"
        else:
            raise RuntimeError, "That compression method is not supported"

        self._allowZip64 = allowZip64
        self._didModify = False
        self.debug = 0  # Level of printing: 0 through 3
        self.NameToInfo = {}    # Find file info given name
        self.filelist = []      # List of ZipInfo instances for archive
        self.compression = compression  # Method of compression
        self.mode = key = mode.replace('b', '')[0]
        self.pwd = None
        self._comment = ''

        # Check if we were passed a file-like object
        if isinstance(file, basestring):
            self._filePassed = 0
            self.filename = file
            modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'}
            try:
                self.fp = open(file, modeDict[mode])
            except IOError:
                if mode == 'a':
                    mode = key = 'w'
                    self.fp = open(file, modeDict[mode])
                else:
                    raise
        else:
            self._filePassed = 1
            self.fp = file
            self.filename = getattr(file, 'name', None)

        try:
            if key == 'r':
                self._RealGetContents()
            elif key == 'w':
                # set the modified flag so central directory gets written
                # even if no files are added to the archive
                self._didModify = True
            elif key == 'a':
                try:
                    # See if file is a zip file
                    self._RealGetContents()
                    # seek to start of directory and overwrite
                    self.fp.seek(self.start_dir, 0)
                except BadZipfile:
                    # file is not a zip file, just append
                    self.fp.seek(0, 2)

                    # set the modified flag so central directory gets written
                    # even if no files are added to the archive
                    self._didModify = True
            else:
                raise RuntimeError('Mode must be "r", "w" or "a"')
        except:
            fp = self.fp
            self.fp = None
            if not self._filePassed:
                fp.close()
            raise

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.close()

    def _RealGetContents(self):
        """Read in the table of contents for the ZIP file."""
        fp = self.fp
        try:
            endrec = _EndRecData(fp)
        except IOError:
            raise BadZipfile("File is not a zip file")
        if not endrec:
            raise BadZipfile, "File is not a zip file"
        if self.debug > 1:
            print endrec
        size_cd = endrec[_ECD_SIZE]             # bytes in central directory
        offset_cd = endrec[_ECD_OFFSET]         # offset of central directory
        self._comment = endrec[_ECD_COMMENT]    # archive comment

        # "concat" is zero, unless zip was concatenated to another file
        concat = endrec[_ECD_LOCATION] - size_cd - offset_cd
        if endrec[_ECD_SIGNATURE] == stringEndArchive64:
            # If Zip64 extension structures are present, account for them
            concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator)

        if self.debug > 2:
            inferred = concat + offset_cd
            print "given, inferred, offset", offset_cd, inferred, concat
        # self.start_dir:  Position of start of central directory
        self.start_dir = offset_cd + concat
        fp.seek(self.start_dir, 0)
        data = fp.read(size_cd)
        fp = cStringIO.StringIO(data)
        total = 0
        while total < size_cd:
            centdir = fp.read(sizeCentralDir)
            if len(centdir) != sizeCentralDir:
                raise BadZipfile("Truncated central directory")
            centdir = struct.unpack(structCentralDir, centdir)
            if centdir[_CD_SIGNATURE] != stringCentralDir:
                raise BadZipfile("Bad magic number for central directory")
            if self.debug > 2:
                print centdir
            filename = fp.read(centdir[_CD_FILENAME_LENGTH])
            # Create ZipInfo instance to store file information
            x = ZipInfo(filename)
            x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH])
            x.comment = fp.read(centdir[_CD_COMMENT_LENGTH])
            x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET]
            (x.create_version, x.create_system, x.extract_version, x.reserved,
                x.flag_bits, x.compress_type, t, d,
                x.CRC, x.compress_size, x.file_size) = centdir[1:12]
            x.volume, x.internal_attr, x.external_attr = centdir[15:18]
            # Convert date/time code to (year, month, day, hour, min, sec)
            x._raw_time = t
            x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F,
                                     t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )

            x._decodeExtra()
            x.header_offset = x.header_offset + concat
            x.filename = x._decodeFilename()
            self.filelist.append(x)
            self.NameToInfo[x.filename] = x

            # update total bytes read from central directory
            total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH]
                     + centdir[_CD_EXTRA_FIELD_LENGTH]
                     + centdir[_CD_COMMENT_LENGTH])

            if self.debug > 2:
                print "total", total


    def namelist(self):
        """Return a list of file names in the archive."""
        l = []
        for data in self.filelist:
            l.append(data.filename)
        return l

    def infolist(self):
        """Return a list of class ZipInfo instances for files in the
        archive."""
        return self.filelist

    def printdir(self):
        """Print a table of contents for the zip file."""
        print "%-46s %19s %12s" % ("File Name", "Modified    ", "Size")
        for zinfo in self.filelist:
            date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6]
            print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size)

    def testzip(self):
        """Read all the files and check the CRC."""
        chunk_size = 2 ** 20
        for zinfo in self.filelist:
            try:
                # Read by chunks, to avoid an OverflowError or a
                # MemoryError with very large embedded files.
                with self.open(zinfo.filename, "r") as f:
                    while f.read(chunk_size):     # Check CRC-32
                        pass
            except BadZipfile:
                return zinfo.filename

    def getinfo(self, name):
        """Return the instance of ZipInfo given 'name'."""
        info = self.NameToInfo.get(name)
        if info is None:
            raise KeyError(
                'There is no item named %r in the archive' % name)

        return info

    def setpassword(self, pwd):
        """Set default password for encrypted files."""
        self.pwd = pwd

    @property
    def comment(self):
        """The comment text associated with the ZIP file."""
        return self._comment

    @comment.setter
    def comment(self, comment):
        # check for valid comment length
        if len(comment) > ZIP_MAX_COMMENT:
            import warnings
            warnings.warn('Archive comment is too long; truncating to %d bytes'
                          % ZIP_MAX_COMMENT, stacklevel=2)
            comment = comment[:ZIP_MAX_COMMENT]
        self._comment = comment
        self._didModify = True

    def read(self, name, pwd=None):
        """Return file bytes (as a string) for name."""
        return self.open(name, "r", pwd).read()

    def open(self, name, mode="r", pwd=None):
        """Return file-like object for 'name'."""
        if mode not in ("r", "U", "rU"):
            raise RuntimeError, 'open() requires mode "r", "U", or "rU"'
        if not self.fp:
            raise RuntimeError, \
                  "Attempt to read ZIP archive that was already closed"

        # Only open a new file for instances where we were not
        # given a file object in the constructor
        if self._filePassed:
            zef_file = self.fp
            should_close = False
        else:
            zef_file = open(self.filename, 'rb')
            should_close = True

        try:
            # Make sure we have an info object
            if isinstance(name, ZipInfo):
                # 'name' is already an info object
                zinfo = name
            else:
                # Get info object for name
                zinfo = self.getinfo(name)

            zef_file.seek(zinfo.header_offset, 0)

            # Skip the file header:
            fheader = zef_file.read(sizeFileHeader)
            if len(fheader) != sizeFileHeader:
                raise BadZipfile("Truncated file header")
            fheader = struct.unpack(structFileHeader, fheader)
            if fheader[_FH_SIGNATURE] != stringFileHeader:
                raise BadZipfile("Bad magic number for file header")

            fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])
            if fheader[_FH_EXTRA_FIELD_LENGTH]:
                zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])

            if fname != zinfo.orig_filename:
                raise BadZipfile, \
                        'File name in directory "%s" and header "%s" differ.' % (
                            zinfo.orig_filename, fname)

            # check for encrypted flag & handle password
            is_encrypted = zinfo.flag_bits & 0x1
            zd = None
            if is_encrypted:
                if not pwd:
                    pwd = self.pwd
                if not pwd:
                    raise RuntimeError, "File %s is encrypted, " \
                        "password required for extraction" % name

                zd = _ZipDecrypter(pwd)
                # The first 12 bytes in the cypher stream is an encryption header
                #  used to strengthen the algorithm. The first 11 bytes are
                #  completely random, while the 12th contains the MSB of the CRC,
                #  or the MSB of the file time depending on the header type
                #  and is used to check the correctness of the password.
                bytes = zef_file.read(12)
                h = map(zd, bytes[0:12])
                if zinfo.flag_bits & 0x8:
                    # compare against the file type from extended local headers
                    check_byte = (zinfo._raw_time >> 8) & 0xff
                else:
                    # compare against the CRC otherwise
                    check_byte = (zinfo.CRC >> 24) & 0xff
                if ord(h[11]) != check_byte:
                    raise RuntimeError("Bad password for file", name)

            return ZipExtFile(zef_file, mode, zinfo, zd,
                    close_fileobj=should_close)
        except:
            if should_close:
                zef_file.close()
            raise

    def extract(self, member, path=None, pwd=None):
        """Extract a member from the archive to the current working directory,
           using its full name. Its file information is extracted as accurately
           as possible. `member' may be a filename or a ZipInfo object. You can
           specify a different directory using `path'.
        """
        if not isinstance(member, ZipInfo):
            member = self.getinfo(member)

        if path is None:
            path = os.getcwd()

        return self._extract_member(member, path, pwd)

    def extractall(self, path=None, members=None, pwd=None):
        """Extract all members from the archive to the current working
           directory. `path' specifies a different directory to extract to.
           `members' is optional and must be a subset of the list returned
           by namelist().
        """
        if members is None:
            members = self.namelist()

        for zipinfo in members:
            self.extract(zipinfo, path, pwd)

    def _extract_member(self, member, targetpath, pwd):
        """Extract the ZipInfo object 'member' to a physical
           file on the path targetpath.
        """
        # build the destination pathname, replacing
        # forward slashes to platform specific separators.
        arcname = member.filename.replace('/', os.path.sep)

        if os.path.altsep:
            arcname = arcname.replace(os.path.altsep, os.path.sep)
        # interpret absolute pathname as relative, remove drive letter or
        # UNC path, redundant separators, "." and ".." components.
        arcname = os.path.splitdrive(arcname)[1]
        arcname = os.path.sep.join(x for x in arcname.split(os.path.sep)
                    if x not in ('', os.path.curdir, os.path.pardir))
        if os.path.sep == '\\':
            # filter illegal characters on Windows
            illegal = ':<>|"?*'
            if isinstance(arcname, unicode):
                table = {ord(c): ord('_') for c in illegal}
            else:
                table = string.maketrans(illegal, '_' * len(illegal))
            arcname = arcname.translate(table)
            # remove trailing dots
            arcname = (x.rstrip('.') for x in arcname.split(os.path.sep))
            arcname = os.path.sep.join(x for x in arcname if x)

        targetpath = os.path.join(targetpath, arcname)
        targetpath = os.path.normpath(targetpath)

        # Create all upper directories if necessary.
        upperdirs = os.path.dirname(targetpath)
        if upperdirs and not os.path.exists(upperdirs):
            os.makedirs(upperdirs)

        if member.filename[-1] == '/':
            if not os.path.isdir(targetpath):
                os.mkdir(targetpath)
            return targetpath

        with self.open(member, pwd=pwd) as source, \
             file(targetpath, "wb") as target:
            shutil.copyfileobj(source, target)

        return targetpath

    def _writecheck(self, zinfo):
        """Check for errors before writing a file to the archive."""
        if zinfo.filename in self.NameToInfo:
            import warnings
            warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)
        if self.mode not in ("w", "a"):
            raise RuntimeError, 'write() requires mode "w" or "a"'
        if not self.fp:
            raise RuntimeError, \
                  "Attempt to write ZIP archive that was already closed"
        if zinfo.compress_type == ZIP_DEFLATED and not zlib:
            raise RuntimeError, \
                  "Compression requires the (missing) zlib module"
        if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED):
            raise RuntimeError, \
                  "That compression method is not supported"
        if not self._allowZip64:
            requires_zip64 = None
            if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:
                requires_zip64 = "Files count"
            elif zinfo.file_size > ZIP64_LIMIT:
                requires_zip64 = "Filesize"
            elif zinfo.header_offset > ZIP64_LIMIT:
                requires_zip64 = "Zipfile size"
            if requires_zip64:
                raise LargeZipFile(requires_zip64 +
                                   " would require ZIP64 extensions")

    def write(self, filename, arcname=None, compress_type=None):
        """Put the bytes from filename into the archive under the name
        arcname."""
        if not self.fp:
            raise RuntimeError(
                  "Attempt to write to ZIP archive that was already closed")

        st = os.stat(filename)
        isdir = stat.S_ISDIR(st.st_mode)
        mtime = time.localtime(st.st_mtime)
        date_time = mtime[0:6]
        # Create ZipInfo instance to store file information
        if arcname is None:
            arcname = filename
        arcname = os.path.normpath(os.path.splitdrive(arcname)[1])
        while arcname[0] in (os.sep, os.altsep):
            arcname = arcname[1:]
        if isdir:
            arcname += '/'
        zinfo = ZipInfo(arcname, date_time)
        zinfo.external_attr = (st[0] & 0xFFFF) << 16L      # Unix attributes
        if compress_type is None:
            zinfo.compress_type = self.compression
        else:
            zinfo.compress_type = compress_type

        zinfo.file_size = st.st_size
        zinfo.flag_bits = 0x00
        zinfo.header_offset = self.fp.tell()    # Start of header bytes

        self._writecheck(zinfo)
        self._didModify = True

        if isdir:
            zinfo.file_size = 0
            zinfo.compress_size = 0
            zinfo.CRC = 0
            zinfo.external_attr |= 0x10  # MS-DOS directory flag
            self.filelist.append(zinfo)
            self.NameToInfo[zinfo.filename] = zinfo
            self.fp.write(zinfo.FileHeader(False))
            return

        with open(filename, "rb") as fp:
            # Must overwrite CRC and sizes with correct data later
            zinfo.CRC = CRC = 0
            zinfo.compress_size = compress_size = 0
            # Compressed size can be larger than uncompressed size
            zip64 = self._allowZip64 and \
                    zinfo.file_size * 1.05 > ZIP64_LIMIT
            self.fp.write(zinfo.FileHeader(zip64))
            if zinfo.compress_type == ZIP_DEFLATED:
                cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
                     zlib.DEFLATED, -15)
            else:
                cmpr = None
            file_size = 0
            while 1:
                buf = fp.read(1024 * 8)
                if not buf:
                    break
                file_size = file_size + len(buf)
                CRC = crc32(buf, CRC) & 0xffffffff
                if cmpr:
                    buf = cmpr.compress(buf)
                    compress_size = compress_size + len(buf)
                self.fp.write(buf)
        if cmpr:
            buf = cmpr.flush()
            compress_size = compress_size + len(buf)
            self.fp.write(buf)
            zinfo.compress_size = compress_size
        else:
            zinfo.compress_size = file_size
        zinfo.CRC = CRC
        zinfo.file_size = file_size
        if not zip64 and self._allowZip64:
            if file_size > ZIP64_LIMIT:
                raise RuntimeError('File size has increased during compressing')
            if compress_size > ZIP64_LIMIT:
                raise RuntimeError('Compressed size larger than uncompressed size')
        # Seek backwards and write file header (which will now include
        # correct CRC and file sizes)
        position = self.fp.tell()       # Preserve current position in file
        self.fp.seek(zinfo.header_offset, 0)
        self.fp.write(zinfo.FileHeader(zip64))
        self.fp.seek(position, 0)
        self.filelist.append(zinfo)
        self.NameToInfo[zinfo.filename] = zinfo

    def writestr(self, zinfo_or_arcname, bytes, compress_type=None):
        """Write a file into the archive.  The contents is the string
        'bytes'.  'zinfo_or_arcname' is either a ZipInfo instance or
        the name of the file in the archive."""
        if not isinstance(zinfo_or_arcname, ZipInfo):
            zinfo = ZipInfo(filename=zinfo_or_arcname,
                            date_time=time.localtime(time.time())[:6])

            zinfo.compress_type = self.compression
            if zinfo.filename[-1] == '/':
                zinfo.external_attr = 0o40775 << 16   # drwxrwxr-x
                zinfo.external_attr |= 0x10           # MS-DOS directory flag
            else:
                zinfo.external_attr = 0o600 << 16     # ?rw-------
        else:
            zinfo = zinfo_or_arcname

        if not self.fp:
            raise RuntimeError(
                  "Attempt to write to ZIP archive that was already closed")

        if compress_type is not None:
            zinfo.compress_type = compress_type

        zinfo.file_size = len(bytes)            # Uncompressed size
        zinfo.header_offset = self.fp.tell()    # Start of header bytes
        self._writecheck(zinfo)
        self._didModify = True
        zinfo.CRC = crc32(bytes) & 0xffffffff       # CRC-32 checksum
        if zinfo.compress_type == ZIP_DEFLATED:
            co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
                 zlib.DEFLATED, -15)
            bytes = co.compress(bytes) + co.flush()
            zinfo.compress_size = len(bytes)    # Compressed size
        else:
            zinfo.compress_size = zinfo.file_size
        zip64 = zinfo.file_size > ZIP64_LIMIT or \
                zinfo.compress_size > ZIP64_LIMIT
        if zip64 and not self._allowZip64:
            raise LargeZipFile("Filesize would require ZIP64 extensions")
        self.fp.write(zinfo.FileHeader(zip64))
        self.fp.write(bytes)
        if zinfo.flag_bits & 0x08:
            # Write CRC and file sizes after the file data
            fmt = '<LQQ' if zip64 else '<LLL'
            self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size,
                  zinfo.file_size))
        self.fp.flush()
        self.filelist.append(zinfo)
        self.NameToInfo[zinfo.filename] = zinfo

    def __del__(self):
        """Call the "close()" method in case the user forgot."""
        self.close()

    def close(self):
        """Close the file, and for mode "w" and "a" write the ending
        records."""
        if self.fp is None:
            return

        try:
            if self.mode in ("w", "a") and self._didModify: # write ending records
                pos1 = self.fp.tell()
                for zinfo in self.filelist:         # write central directory
                    dt = zinfo.date_time
                    dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]
                    dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)
                    extra = []
                    if zinfo.file_size > ZIP64_LIMIT \
                            or zinfo.compress_size > ZIP64_LIMIT:
                        extra.append(zinfo.file_size)
                        extra.append(zinfo.compress_size)
                        file_size = 0xffffffff
                        compress_size = 0xffffffff
                    else:
                        file_size = zinfo.file_size
                        compress_size = zinfo.compress_size

                    if zinfo.header_offset > ZIP64_LIMIT:
                        extra.append(zinfo.header_offset)
                        header_offset = 0xffffffffL
                    else:
                        header_offset = zinfo.header_offset

                    extra_data = zinfo.extra
                    if extra:
                        # Append a ZIP64 field to the extra's
                        extra_data = struct.pack(
                                '<HH' + 'Q'*len(extra),
                                1, 8*len(extra), *extra) + extra_data

                        extract_version = max(45, zinfo.extract_version)
                        create_version = max(45, zinfo.create_version)
                    else:
                        extract_version = zinfo.extract_version
                        create_version = zinfo.create_version

                    try:
                        filename, flag_bits = zinfo._encodeFilenameFlags()
                        centdir = struct.pack(structCentralDir,
                        stringCentralDir, create_version,
                        zinfo.create_system, extract_version, zinfo.reserved,
                        flag_bits, zinfo.compress_type, dostime, dosdate,
                        zinfo.CRC, compress_size, file_size,
                        len(filename), len(extra_data), len(zinfo.comment),
                        0, zinfo.internal_attr, zinfo.external_attr,
                        header_offset)
                    except DeprecationWarning:
                        print >>sys.stderr, (structCentralDir,
                        stringCentralDir, create_version,
                        zinfo.create_system, extract_version, zinfo.reserved,
                        zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,
                        zinfo.CRC, compress_size, file_size,
                        len(zinfo.filename), len(extra_data), len(zinfo.comment),
                        0, zinfo.internal_attr, zinfo.external_attr,
                        header_offset)
                        raise
                    self.fp.write(centdir)
                    self.fp.write(filename)
                    self.fp.write(extra_data)
                    self.fp.write(zinfo.comment)

                pos2 = self.fp.tell()
                # Write end-of-zip-archive record
                centDirCount = len(self.filelist)
                centDirSize = pos2 - pos1
                centDirOffset = pos1
                requires_zip64 = None
                if centDirCount > ZIP_FILECOUNT_LIMIT:
                    requires_zip64 = "Files count"
                elif centDirOffset > ZIP64_LIMIT:
                    requires_zip64 = "Central directory offset"
                elif centDirSize > ZIP64_LIMIT:
                    requires_zip64 = "Central directory size"
                if requires_zip64:
                    # Need to write the ZIP64 end-of-archive records
                    if not self._allowZip64:
                        raise LargeZipFile(requires_zip64 +
                                           " would require ZIP64 extensions")
                    zip64endrec = struct.pack(
                            structEndArchive64, stringEndArchive64,
                            44, 45, 45, 0, 0, centDirCount, centDirCount,
                            centDirSize, centDirOffset)
                    self.fp.write(zip64endrec)

                    zip64locrec = struct.pack(
                            structEndArchive64Locator,
                            stringEndArchive64Locator, 0, pos2, 1)
                    self.fp.write(zip64locrec)
                    centDirCount = min(centDirCount, 0xFFFF)
                    centDirSize = min(centDirSize, 0xFFFFFFFF)
                    centDirOffset = min(centDirOffset, 0xFFFFFFFF)

                endrec = struct.pack(structEndArchive, stringEndArchive,
                                    0, 0, centDirCount, centDirCount,
                                    centDirSize, centDirOffset, len(self._comment))
                self.fp.write(endrec)
                self.fp.write(self._comment)
                self.fp.flush()
        finally:
            fp = self.fp
            self.fp = None
            if not self._filePassed:
                fp.close()
ZipFile
class TarFile(object):
    """The TarFile Class provides an interface to tar archives.
    """

    debug = 0                   # May be set from 0 (no msgs) to 3 (all msgs)

    dereference = False         # If true, add content of linked file to the
                                # tar file, else the link.

    ignore_zeros = False        # If true, skips empty or invalid blocks and
                                # continues processing.

    errorlevel = 1              # If 0, fatal errors only appear in debug
                                # messages (if debug >= 0). If > 0, errors
                                # are passed to the caller as exceptions.

    format = DEFAULT_FORMAT     # The format to use when creating an archive.

    encoding = ENCODING         # Encoding for 8-bit character strings.

    errors = None               # Error handler for unicode conversion.

    tarinfo = TarInfo           # The default TarInfo class to use.

    fileobject = ExFileObject   # The default ExFileObject class to use.

    def __init__(self, name=None, mode="r", fileobj=None, format=None,
            tarinfo=None, dereference=None, ignore_zeros=None, encoding=None,
            errors=None, pax_headers=None, debug=None, errorlevel=None):
        """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to
           read from an existing archive, 'a' to append data to an existing
           file or 'w' to create a new file overwriting an existing one. `mode'
           defaults to 'r'.
           If `fileobj' is given, it is used for reading or writing data. If it
           can be determined, `mode' is overridden by `fileobj's mode.
           `fileobj' is not closed, when TarFile is closed.
        """
        modes = {"r": "rb", "a": "r+b", "w": "wb"}
        if mode not in modes:
            raise ValueError("mode must be 'r', 'a' or 'w'")
        self.mode = mode
        self._mode = modes[mode]

        if not fileobj:
            if self.mode == "a" and not os.path.exists(name):
                # Create nonexistent files in append mode.
                self.mode = "w"
                self._mode = "wb"
            fileobj = bltn_open(name, self._mode)
            self._extfileobj = False
        else:
            if name is None and hasattr(fileobj, "name"):
                name = fileobj.name
            if hasattr(fileobj, "mode"):
                self._mode = fileobj.mode
            self._extfileobj = True
        self.name = os.path.abspath(name) if name else None
        self.fileobj = fileobj

        # Init attributes.
        if format is not None:
            self.format = format
        if tarinfo is not None:
            self.tarinfo = tarinfo
        if dereference is not None:
            self.dereference = dereference
        if ignore_zeros is not None:
            self.ignore_zeros = ignore_zeros
        if encoding is not None:
            self.encoding = encoding

        if errors is not None:
            self.errors = errors
        elif mode == "r":
            self.errors = "utf-8"
        else:
            self.errors = "strict"

        if pax_headers is not None and self.format == PAX_FORMAT:
            self.pax_headers = pax_headers
        else:
            self.pax_headers = {}

        if debug is not None:
            self.debug = debug
        if errorlevel is not None:
            self.errorlevel = errorlevel

        # Init datastructures.
        self.closed = False
        self.members = []       # list of members as TarInfo objects
        self._loaded = False    # flag if all members have been read
        self.offset = self.fileobj.tell()
                                # current position in the archive file
        self.inodes = {}        # dictionary caching the inodes of
                                # archive members already added

        try:
            if self.mode == "r":
                self.firstmember = None
                self.firstmember = self.next()

            if self.mode == "a":
                # Move to the end of the archive,
                # before the first empty block.
                while True:
                    self.fileobj.seek(self.offset)
                    try:
                        tarinfo = self.tarinfo.fromtarfile(self)
                        self.members.append(tarinfo)
                    except EOFHeaderError:
                        self.fileobj.seek(self.offset)
                        break
                    except HeaderError, e:
                        raise ReadError(str(e))

            if self.mode in "aw":
                self._loaded = True

                if self.pax_headers:
                    buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy())
                    self.fileobj.write(buf)
                    self.offset += len(buf)
        except:
            if not self._extfileobj:
                self.fileobj.close()
            self.closed = True
            raise

    def _getposix(self):
        return self.format == USTAR_FORMAT
    def _setposix(self, value):
        import warnings
        warnings.warn("use the format attribute instead", DeprecationWarning,
                      2)
        if value:
            self.format = USTAR_FORMAT
        else:
            self.format = GNU_FORMAT
    posix = property(_getposix, _setposix)

    #--------------------------------------------------------------------------
    # Below are the classmethods which act as alternate constructors to the
    # TarFile class. The open() method is the only one that is needed for
    # public use; it is the "super"-constructor and is able to select an
    # adequate "sub"-constructor for a particular compression using the mapping
    # from OPEN_METH.
    #
    # This concept allows one to subclass TarFile without losing the comfort of
    # the super-constructor. A sub-constructor is registered and made available
    # by adding it to the mapping in OPEN_METH.

    @classmethod
    def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs):
        """Open a tar archive for reading, writing or appending. Return
           an appropriate TarFile class.

           mode:
           'r' or 'r:*' open for reading with transparent compression
           'r:'         open for reading exclusively uncompressed
           'r:gz'       open for reading with gzip compression
           'r:bz2'      open for reading with bzip2 compression
           'a' or 'a:'  open for appending, creating the file if necessary
           'w' or 'w:'  open for writing without compression
           'w:gz'       open for writing with gzip compression
           'w:bz2'      open for writing with bzip2 compression

           'r|*'        open a stream of tar blocks with transparent compression
           'r|'         open an uncompressed stream of tar blocks for reading
           'r|gz'       open a gzip compressed stream of tar blocks
           'r|bz2'      open a bzip2 compressed stream of tar blocks
           'w|'         open an uncompressed stream for writing
           'w|gz'       open a gzip compressed stream for writing
           'w|bz2'      open a bzip2 compressed stream for writing
        """

        if not name and not fileobj:
            raise ValueError("nothing to open")

        if mode in ("r", "r:*"):
            # Find out which *open() is appropriate for opening the file.
            for comptype in cls.OPEN_METH:
                func = getattr(cls, cls.OPEN_METH[comptype])
                if fileobj is not None:
                    saved_pos = fileobj.tell()
                try:
                    return func(name, "r", fileobj, **kwargs)
                except (ReadError, CompressionError), e:
                    if fileobj is not None:
                        fileobj.seek(saved_pos)
                    continue
            raise ReadError("file could not be opened successfully")

        elif ":" in mode:
            filemode, comptype = mode.split(":", 1)
            filemode = filemode or "r"
            comptype = comptype or "tar"

            # Select the *open() function according to
            # given compression.
            if comptype in cls.OPEN_METH:
                func = getattr(cls, cls.OPEN_METH[comptype])
            else:
                raise CompressionError("unknown compression type %r" % comptype)
            return func(name, filemode, fileobj, **kwargs)

        elif "|" in mode:
            filemode, comptype = mode.split("|", 1)
            filemode = filemode or "r"
            comptype = comptype or "tar"

            if filemode not in ("r", "w"):
                raise ValueError("mode must be 'r' or 'w'")

            stream = _Stream(name, filemode, comptype, fileobj, bufsize)
            try:
                t = cls(name, filemode, stream, **kwargs)
            except:
                stream.close()
                raise
            t._extfileobj = False
            return t

        elif mode in ("a", "w"):
            return cls.taropen(name, mode, fileobj, **kwargs)

        raise ValueError("undiscernible mode")

    @classmethod
    def taropen(cls, name, mode="r", fileobj=None, **kwargs):
        """Open uncompressed tar archive name for reading or writing.
        """
        if mode not in ("r", "a", "w"):
            raise ValueError("mode must be 'r', 'a' or 'w'")
        return cls(name, mode, fileobj, **kwargs)

    @classmethod
    def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
        """Open gzip compressed tar archive name for reading or writing.
           Appending is not allowed.
        """
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'")

        try:
            import gzip
            gzip.GzipFile
        except (ImportError, AttributeError):
            raise CompressionError("gzip module is not available")

        try:
            fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)
        except OSError:
            if fileobj is not None and mode == 'r':
                raise ReadError("not a gzip file")
            raise

        try:
            t = cls.taropen(name, mode, fileobj, **kwargs)
        except IOError:
            fileobj.close()
            if mode == 'r':
                raise ReadError("not a gzip file")
            raise
        except:
            fileobj.close()
            raise
        t._extfileobj = False
        return t

    @classmethod
    def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
        """Open bzip2 compressed tar archive name for reading or writing.
           Appending is not allowed.
        """
        if mode not in ("r", "w"):
            raise ValueError("mode must be 'r' or 'w'.")

        try:
            import bz2
        except ImportError:
            raise CompressionError("bz2 module is not available")

        if fileobj is not None:
            fileobj = _BZ2Proxy(fileobj, mode)
        else:
            fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel)

        try:
            t = cls.taropen(name, mode, fileobj, **kwargs)
        except (IOError, EOFError):
            fileobj.close()
            if mode == 'r':
                raise ReadError("not a bzip2 file")
            raise
        except:
            fileobj.close()
            raise
        t._extfileobj = False
        return t

    # All *open() methods are registered here.
    OPEN_METH = {
        "tar": "taropen",   # uncompressed tar
        "gz":  "gzopen",    # gzip compressed tar
        "bz2": "bz2open"    # bzip2 compressed tar
    }

    #--------------------------------------------------------------------------
    # The public methods which TarFile provides:

    def close(self):
        """Close the TarFile. In write-mode, two finishing zero blocks are
           appended to the archive.
        """
        if self.closed:
            return

        if self.mode in "aw":
            self.fileobj.write(NUL * (BLOCKSIZE * 2))
            self.offset += (BLOCKSIZE * 2)
            # fill up the end with zero-blocks
            # (like option -b20 for tar does)
            blocks, remainder = divmod(self.offset, RECORDSIZE)
            if remainder > 0:
                self.fileobj.write(NUL * (RECORDSIZE - remainder))

        if not self._extfileobj:
            self.fileobj.close()
        self.closed = True

    def getmember(self, name):
        """Return a TarInfo object for member `name'. If `name' can not be
           found in the archive, KeyError is raised. If a member occurs more
           than once in the archive, its last occurrence is assumed to be the
           most up-to-date version.
        """
        tarinfo = self._getmember(name)
        if tarinfo is None:
            raise KeyError("filename %r not found" % name)
        return tarinfo

    def getmembers(self):
        """Return the members of the archive as a list of TarInfo objects. The
           list has the same order as the members in the archive.
        """
        self._check()
        if not self._loaded:    # if we want to obtain a list of
            self._load()        # all members, we first have to
                                # scan the whole archive.
        return self.members

    def getnames(self):
        """Return the members of the archive as a list of their names. It has
           the same order as the list returned by getmembers().
        """
        return [tarinfo.name for tarinfo in self.getmembers()]

    def gettarinfo(self, name=None, arcname=None, fileobj=None):
        """Create a TarInfo object for either the file `name' or the file
           object `fileobj' (using os.fstat on its file descriptor). You can
           modify some of the TarInfo's attributes before you add it using
           addfile(). If given, `arcname' specifies an alternative name for the
           file in the archive.
        """
        self._check("aw")

        # When fileobj is given, replace name by
        # fileobj's real name.
        if fileobj is not None:
            name = fileobj.name

        # Building the name of the member in the archive.
        # Backward slashes are converted to forward slashes,
        # Absolute paths are turned to relative paths.
        if arcname is None:
            arcname = name
        drv, arcname = os.path.splitdrive(arcname)
        arcname = arcname.replace(os.sep, "/")
        arcname = arcname.lstrip("/")

        # Now, fill the TarInfo object with
        # information specific for the file.
        tarinfo = self.tarinfo()
        tarinfo.tarfile = self

        # Use os.stat or os.lstat, depending on platform
        # and if symlinks shall be resolved.
        if fileobj is None:
            if hasattr(os, "lstat") and not self.dereference:
                statres = os.lstat(name)
            else:
                statres = os.stat(name)
        else:
            statres = os.fstat(fileobj.fileno())
        linkname = ""

        stmd = statres.st_mode
        if stat.S_ISREG(stmd):
            inode = (statres.st_ino, statres.st_dev)
            if not self.dereference and statres.st_nlink > 1 and \
                    inode in self.inodes and arcname != self.inodes[inode]:
                # Is it a hardlink to an already
                # archived file?
                type = LNKTYPE
                linkname = self.inodes[inode]
            else:
                # The inode is added only if its valid.
                # For win32 it is always 0.
                type = REGTYPE
                if inode[0]:
                    self.inodes[inode] = arcname
        elif stat.S_ISDIR(stmd):
            type = DIRTYPE
        elif stat.S_ISFIFO(stmd):
            type = FIFOTYPE
        elif stat.S_ISLNK(stmd):
            type = SYMTYPE
            linkname = os.readlink(name)
        elif stat.S_ISCHR(stmd):
            type = CHRTYPE
        elif stat.S_ISBLK(stmd):
            type = BLKTYPE
        else:
            return None

        # Fill the TarInfo object with all
        # information we can get.
        tarinfo.name = arcname
        tarinfo.mode = stmd
        tarinfo.uid = statres.st_uid
        tarinfo.gid = statres.st_gid
        if type == REGTYPE:
            tarinfo.size = statres.st_size
        else:
            tarinfo.size = 0L
        tarinfo.mtime = statres.st_mtime
        tarinfo.type = type
        tarinfo.linkname = linkname
        if pwd:
            try:
                tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0]
            except KeyError:
                pass
        if grp:
            try:
                tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]
            except KeyError:
                pass

        if type in (CHRTYPE, BLKTYPE):
            if hasattr(os, "major") and hasattr(os, "minor"):
                tarinfo.devmajor = os.major(statres.st_rdev)
                tarinfo.devminor = os.minor(statres.st_rdev)
        return tarinfo

    def list(self, verbose=True):
        """Print a table of contents to sys.stdout. If `verbose' is False, only
           the names of the members are printed. If it is True, an `ls -l'-like
           output is produced.
        """
        self._check()

        for tarinfo in self:
            if verbose:
                print filemode(tarinfo.mode),
                print "%s/%s" % (tarinfo.uname or tarinfo.uid,
                                 tarinfo.gname or tarinfo.gid),
                if tarinfo.ischr() or tarinfo.isblk():
                    print "%10s" % ("%d,%d" \
                                    % (tarinfo.devmajor, tarinfo.devminor)),
                else:
                    print "%10d" % tarinfo.size,
                print "%d-%02d-%02d %02d:%02d:%02d" \
                      % time.localtime(tarinfo.mtime)[:6],

            print tarinfo.name + ("/" if tarinfo.isdir() else ""),

            if verbose:
                if tarinfo.issym():
                    print "->", tarinfo.linkname,
                if tarinfo.islnk():
                    print "link to", tarinfo.linkname,
            print

    def add(self, name, arcname=None, recursive=True, exclude=None, filter=None):
        """Add the file `name' to the archive. `name' may be any type of file
           (directory, fifo, symbolic link, etc.). If given, `arcname'
           specifies an alternative name for the file in the archive.
           Directories are added recursively by default. This can be avoided by
           setting `recursive' to False. `exclude' is a function that should
           return True for each filename to be excluded. `filter' is a function
           that expects a TarInfo object argument and returns the changed
           TarInfo object, if it returns None the TarInfo object will be
           excluded from the archive.
        """
        self._check("aw")

        if arcname is None:
            arcname = name

        # Exclude pathnames.
        if exclude is not None:
            import warnings
            warnings.warn("use the filter argument instead",
                    DeprecationWarning, 2)
            if exclude(name):
                self._dbg(2, "tarfile: Excluded %r" % name)
                return

        # Skip if somebody tries to archive the archive...
        if self.name is not None and os.path.abspath(name) == self.name:
            self._dbg(2, "tarfile: Skipped %r" % name)
            return

        self._dbg(1, name)

        # Create a TarInfo object from the file.
        tarinfo = self.gettarinfo(name, arcname)

        if tarinfo is None:
            self._dbg(1, "tarfile: Unsupported type %r" % name)
            return

        # Change or exclude the TarInfo object.
        if filter is not None:
            tarinfo = filter(tarinfo)
            if tarinfo is None:
                self._dbg(2, "tarfile: Excluded %r" % name)
                return

        # Append the tar header and data to the archive.
        if tarinfo.isreg():
            with bltn_open(name, "rb") as f:
                self.addfile(tarinfo, f)

        elif tarinfo.isdir():
            self.addfile(tarinfo)
            if recursive:
                for f in os.listdir(name):
                    self.add(os.path.join(name, f), os.path.join(arcname, f),
                            recursive, exclude, filter)

        else:
            self.addfile(tarinfo)

    def addfile(self, tarinfo, fileobj=None):
        """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is
           given, tarinfo.size bytes are read from it and added to the archive.
           You can create TarInfo objects using gettarinfo().
           On Windows platforms, `fileobj' should always be opened with mode
           'rb' to avoid irritation about the file size.
        """
        self._check("aw")

        tarinfo = copy.copy(tarinfo)

        buf = tarinfo.tobuf(self.format, self.encoding, self.errors)
        self.fileobj.write(buf)
        self.offset += len(buf)

        # If there's data to follow, append it.
        if fileobj is not None:
            copyfileobj(fileobj, self.fileobj, tarinfo.size)
            blocks, remainder = divmod(tarinfo.size, BLOCKSIZE)
            if remainder > 0:
                self.fileobj.write(NUL * (BLOCKSIZE - remainder))
                blocks += 1
            self.offset += blocks * BLOCKSIZE

        self.members.append(tarinfo)

    def extractall(self, path=".", members=None):
        """Extract all members from the archive to the current working
           directory and set owner, modification time and permissions on
           directories afterwards. `path' specifies a different directory
           to extract to. `members' is optional and must be a subset of the
           list returned by getmembers().
        """
        directories = []

        if members is None:
            members = self

        for tarinfo in members:
            if tarinfo.isdir():
                # Extract directories with a safe mode.
                directories.append(tarinfo)
                tarinfo = copy.copy(tarinfo)
                tarinfo.mode = 0700
            self.extract(tarinfo, path)

        # Reverse sort directories.
        directories.sort(key=operator.attrgetter('name'))
        directories.reverse()

        # Set correct owner, mtime and filemode on directories.
        for tarinfo in directories:
            dirpath = os.path.join(path, tarinfo.name)
            try:
                self.chown(tarinfo, dirpath)
                self.utime(tarinfo, dirpath)
                self.chmod(tarinfo, dirpath)
            except ExtractError, e:
                if self.errorlevel > 1:
                    raise
                else:
                    self._dbg(1, "tarfile: %s" % e)

    def extract(self, member, path=""):
        """Extract a member from the archive to the current working directory,
           using its full name. Its file information is extracted as accurately
           as possible. `member' may be a filename or a TarInfo object. You can
           specify a different directory using `path'.
        """
        self._check("r")

        if isinstance(member, basestring):
            tarinfo = self.getmember(member)
        else:
            tarinfo = member

        # Prepare the link target for makelink().
        if tarinfo.islnk():
            tarinfo._link_target = os.path.join(path, tarinfo.linkname)

        try:
            self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
        except EnvironmentError, e:
            if self.errorlevel > 0:
                raise
            else:
                if e.filename is None:
                    self._dbg(1, "tarfile: %s" % e.strerror)
                else:
                    self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename))
        except ExtractError, e:
            if self.errorlevel > 1:
                raise
            else:
                self._dbg(1, "tarfile: %s" % e)

    def extractfile(self, member):
        """Extract a member from the archive as a file object. `member' may be
           a filename or a TarInfo object. If `member' is a regular file, a
           file-like object is returned. If `member' is a link, a file-like
           object is constructed from the link's target. If `member' is none of
           the above, None is returned.
           The file-like object is read-only and provides the following
           methods: read(), readline(), readlines(), seek() and tell()
        """
        self._check("r")

        if isinstance(member, basestring):
            tarinfo = self.getmember(member)
        else:
            tarinfo = member

        if tarinfo.isreg():
            return self.fileobject(self, tarinfo)

        elif tarinfo.type not in SUPPORTED_TYPES:
            # If a member's type is unknown, it is treated as a
            # regular file.
            return self.fileobject(self, tarinfo)

        elif tarinfo.islnk() or tarinfo.issym():
            if isinstance(self.fileobj, _Stream):
                # A small but ugly workaround for the case that someone tries
                # to extract a (sym)link as a file-object from a non-seekable
                # stream of tar blocks.
                raise StreamError("cannot extract (sym)link as file object")
            else:
                # A (sym)link's file object is its target's file object.
                return self.extractfile(self._find_link_target(tarinfo))
        else:
            # If there's no data associated with the member (directory, chrdev,
            # blkdev, etc.), return None instead of a file object.
            return None

    def _extract_member(self, tarinfo, targetpath):
        """Extract the TarInfo object tarinfo to a physical
           file called targetpath.
        """
        # Fetch the TarInfo object for the given name
        # and build the destination pathname, replacing
        # forward slashes to platform specific separators.
        targetpath = targetpath.rstrip("/")
        targetpath = targetpath.replace("/", os.sep)

        # Create all upper directories.
        upperdirs = os.path.dirname(targetpath)
        if upperdirs and not os.path.exists(upperdirs):
            # Create directories that are not part of the archive with
            # default permissions.
            os.makedirs(upperdirs)

        if tarinfo.islnk() or tarinfo.issym():
            self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname))
        else:
            self._dbg(1, tarinfo.name)

        if tarinfo.isreg():
            self.makefile(tarinfo, targetpath)
        elif tarinfo.isdir():
            self.makedir(tarinfo, targetpath)
        elif tarinfo.isfifo():
            self.makefifo(tarinfo, targetpath)
        elif tarinfo.ischr() or tarinfo.isblk():
            self.makedev(tarinfo, targetpath)
        elif tarinfo.islnk() or tarinfo.issym():
            self.makelink(tarinfo, targetpath)
        elif tarinfo.type not in SUPPORTED_TYPES:
            self.makeunknown(tarinfo, targetpath)
        else:
            self.makefile(tarinfo, targetpath)

        self.chown(tarinfo, targetpath)
        if not tarinfo.issym():
            self.chmod(tarinfo, targetpath)
            self.utime(tarinfo, targetpath)

    #--------------------------------------------------------------------------
    # Below are the different file methods. They are called via
    # _extract_member() when extract() is called. They can be replaced in a
    # subclass to implement other functionality.

    def makedir(self, tarinfo, targetpath):
        """Make a directory called targetpath.
        """
        try:
            # Use a safe mode for the directory, the real mode is set
            # later in _extract_member().
            os.mkdir(targetpath, 0700)
        except EnvironmentError, e:
            if e.errno != errno.EEXIST:
                raise

    def makefile(self, tarinfo, targetpath):
        """Make a file called targetpath.
        """
        source = self.extractfile(tarinfo)
        try:
            with bltn_open(targetpath, "wb") as target:
                copyfileobj(source, target)
        finally:
            source.close()

    def makeunknown(self, tarinfo, targetpath):
        """Make a file from a TarInfo object with an unknown type
           at targetpath.
        """
        self.makefile(tarinfo, targetpath)
        self._dbg(1, "tarfile: Unknown file type %r, " \
                     "extracted as regular file." % tarinfo.type)

    def makefifo(self, tarinfo, targetpath):
        """Make a fifo called targetpath.
        """
        if hasattr(os, "mkfifo"):
            os.mkfifo(targetpath)
        else:
            raise ExtractError("fifo not supported by system")

    def makedev(self, tarinfo, targetpath):
        """Make a character or block device called targetpath.
        """
        if not hasattr(os, "mknod") or not hasattr(os, "makedev"):
            raise ExtractError("special devices not supported by system")

        mode = tarinfo.mode
        if tarinfo.isblk():
            mode |= stat.S_IFBLK
        else:
            mode |= stat.S_IFCHR

        os.mknod(targetpath, mode,
                 os.makedev(tarinfo.devmajor, tarinfo.devminor))

    def makelink(self, tarinfo, targetpath):
        """Make a (symbolic) link called targetpath. If it cannot be created
          (platform limitation), we try to make a copy of the referenced file
          instead of a link.
        """
        if hasattr(os, "symlink") and hasattr(os, "link"):
            # For systems that support symbolic and hard links.
            if tarinfo.issym():
                if os.path.lexists(targetpath):
                    os.unlink(targetpath)
                os.symlink(tarinfo.linkname, targetpath)
            else:
                # See extract().
                if os.path.exists(tarinfo._link_target):
                    if os.path.lexists(targetpath):
                        os.unlink(targetpath)
                    os.link(tarinfo._link_target, targetpath)
                else:
                    self._extract_member(self._find_link_target(tarinfo), targetpath)
        else:
            try:
                self._extract_member(self._find_link_target(tarinfo), targetpath)
            except KeyError:
                raise ExtractError("unable to resolve link inside archive")

    def chown(self, tarinfo, targetpath):
        """Set owner of targetpath according to tarinfo.
        """
        if pwd and hasattr(os, "geteuid") and os.geteuid() == 0:
            # We have to be root to do so.
            try:
                g = grp.getgrnam(tarinfo.gname)[2]
            except KeyError:
                g = tarinfo.gid
            try:
                u = pwd.getpwnam(tarinfo.uname)[2]
            except KeyError:
                u = tarinfo.uid
            try:
                if tarinfo.issym() and hasattr(os, "lchown"):
                    os.lchown(targetpath, u, g)
                else:
                    if sys.platform != "os2emx":
                        os.chown(targetpath, u, g)
            except EnvironmentError, e:
                raise ExtractError("could not change owner")

    def chmod(self, tarinfo, targetpath):
        """Set file permissions of targetpath according to tarinfo.
        """
        if hasattr(os, 'chmod'):
            try:
                os.chmod(targetpath, tarinfo.mode)
            except EnvironmentError, e:
                raise ExtractError("could not change mode")

    def utime(self, tarinfo, targetpath):
        """Set modification time of targetpath according to tarinfo.
        """
        if not hasattr(os, 'utime'):
            return
        try:
            os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime))
        except EnvironmentError, e:
            raise ExtractError("could not change modification time")

    #--------------------------------------------------------------------------
    def next(self):
        """Return the next member of the archive as a TarInfo object, when
           TarFile is opened for reading. Return None if there is no more
           available.
        """
        self._check("ra")
        if self.firstmember is not None:
            m = self.firstmember
            self.firstmember = None
            return m

        # Read the next block.
        self.fileobj.seek(self.offset)
        tarinfo = None
        while True:
            try:
                tarinfo = self.tarinfo.fromtarfile(self)
            except EOFHeaderError, e:
                if self.ignore_zeros:
                    self._dbg(2, "0x%X: %s" % (self.offset, e))
                    self.offset += BLOCKSIZE
                    continue
            except InvalidHeaderError, e:
                if self.ignore_zeros:
                    self._dbg(2, "0x%X: %s" % (self.offset, e))
                    self.offset += BLOCKSIZE
                    continue
                elif self.offset == 0:
                    raise ReadError(str(e))
            except EmptyHeaderError:
                if self.offset == 0:
                    raise ReadError("empty file")
            except TruncatedHeaderError, e:
                if self.offset == 0:
                    raise ReadError(str(e))
            except SubsequentHeaderError, e:
                raise ReadError(str(e))
            break

        if tarinfo is not None:
            self.members.append(tarinfo)
        else:
            self._loaded = True

        return tarinfo

    #--------------------------------------------------------------------------
    # Little helper methods:

    def _getmember(self, name, tarinfo=None, normalize=False):
        """Find an archive member by name from bottom to top.
           If tarinfo is given, it is used as the starting point.
        """
        # Ensure that all members have been loaded.
        members = self.getmembers()

        # Limit the member search list up to tarinfo.
        if tarinfo is not None:
            members = members[:members.index(tarinfo)]

        if normalize:
            name = os.path.normpath(name)

        for member in reversed(members):
            if normalize:
                member_name = os.path.normpath(member.name)
            else:
                member_name = member.name

            if name == member_name:
                return member

    def _load(self):
        """Read through the entire archive file and look for readable
           members.
        """
        while True:
            tarinfo = self.next()
            if tarinfo is None:
                break
        self._loaded = True

    def _check(self, mode=None):
        """Check if TarFile is still open, and if the operation's mode
           corresponds to TarFile's mode.
        """
        if self.closed:
            raise IOError("%s is closed" % self.__class__.__name__)
        if mode is not None and self.mode not in mode:
            raise IOError("bad operation for mode %r" % self.mode)

    def _find_link_target(self, tarinfo):
        """Find the target member of a symlink or hardlink member in the
           archive.
        """
        if tarinfo.issym():
            # Always search the entire archive.
            linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname)))
            limit = None
        else:
            # Search the archive before the link, because a hard link is
            # just a reference to an already archived file.
            linkname = tarinfo.linkname
            limit = tarinfo

        member = self._getmember(linkname, tarinfo=limit, normalize=True)
        if member is None:
            raise KeyError("linkname %r not found" % linkname)
        return member

    def __iter__(self):
        """Provide an iterator object.
        """
        if self._loaded:
            return iter(self.members)
        else:
            return TarIter(self)

    def _dbg(self, level, msg):
        """Write debugging output to sys.stderr.
        """
        if level <= self.debug:
            print >> sys.stderr, msg

    def __enter__(self):
        self._check()
        return self

    def __exit__(self, type, value, traceback):
        if type is None:
            self.close()
        else:
            # An exception occurred. We must not call close() because
            # it would try to write end-of-archive blocks and padding.
            if not self._extfileobj:
                self.fileobj.close()
            self.closed = True
# class TarFile
TarFile

xml處理模塊

xml是實現不一樣語言或程序之間進行數據交換的協議,跟json差很少,但json使用起來更簡單,不過,古時候,在json還沒誕生的黑暗年代,你們只能選擇用xml呀,至今不少傳統公司如金融行業的不少系統的接口還主要是xml。

xml的格式以下,就是經過<>節點來區別數據結構的:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

xml協議在各個語言裏的都 是支持的,在python中能夠用如下模塊操做xml 

import xml.etree.ElementTree as ET
 
tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)
 
#遍歷xml文檔
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag,i.text)
 
#只遍歷year 節點
for node in root.iter('year'):
    print(node.tag,node.text)

修改和刪除xml文檔內容

import xml.etree.ElementTree as ET
 
tree = ET.parse("xmltest.xml")
root = tree.getroot()
 
#修改
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated","yes")
 
tree.write("xmltest.xml")
 
 
#刪除node
for country in root.findall('country'):
   rank = int(country.find('rank').text)
   if rank > 50:
     root.remove(country)
 
tree.write('output.xml')

本身建立xml文檔

import xml.etree.ElementTree as ET
 
 
new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
age = ET.SubElement(name,"age",attrib={"checked":"no"})
sex = ET.SubElement(name,"sex")
sex.text = '33'
name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
age = ET.SubElement(name2,"age")
age.text = '19'
 
et = ET.ElementTree(new_xml) #生成文檔對象
et.write("test.xml", encoding="utf-8",xml_declaration=True)
 
ET.dump(new_xml) #打印生成的格式

ConfigParser模塊

用於生成和修改常見配置文檔,當前模塊的名稱在 python 3.x 版本中變動爲 configparser。

來看一個好多軟件的常見文檔格式以下

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes
 
[bitbucket.org]
User = hg
 
[topsecret.server.com]
Port = 50022
ForwardX11 = no

若是想用python生成一個這樣的文檔怎麼作呢?

import configparser
 
config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                      'Compression': 'yes',
                     'CompressionLevel': '9'}
 
config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Host Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
   config.write(configfile)

寫完了還能夠再讀出來哈。

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
compressionlevel
serveraliveinterval
compression
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'

configparser增刪改查語法

[section1]
k1 = v1
k2:v2
  
[section2]
k1 = v1
 
import ConfigParser
  
config = ConfigParser.ConfigParser()
config.read('i.cfg')
  
# ########## 讀 ##########
#secs = config.sections()
#print secs
#options = config.options('group2')
#print options
  
#item_list = config.items('group2')
#print item_list
  
#val = config.get('group1','key')
#val = config.getint('group1','key')
  
# ########## 改寫 ##########
#sec = config.remove_section('group1')
#config.write(open('i.cfg', "w"))
  
#sec = config.has_section('wupeiqi')
#sec = config.add_section('wupeiqi')
#config.write(open('i.cfg', "w"))
  
  
#config.set('group2','k1',11111)
#config.write(open('i.cfg', "w"))
  
#config.remove_option('group2','age')
#config.write(open('i.cfg', "w"))

執行系統命令 

能夠執行shell命令的相關模塊和函數有:

  • os.system
  • os.spawn*
  • os.popen*          --廢棄
  • popen2.*           --廢棄
  • commands.*      --廢棄,3.x中被移除
import commands

result = commands.getoutput('cmd')
result = commands.getstatus('cmd')
result = commands.getstatusoutput('cmd')
View Code

以上執行shell命令的相關的模塊和函數的功能均在 subprocess 模塊中實現,並提供了更豐富的功能。

call 

執行命令,返回狀態碼

ret = subprocess.call(["ls", "-l"], shell=False)
ret = subprocess.call("ls -l", shell=True)

shell = True ,容許 shell 命令是字符串形式

check_call

執行命令,若是執行狀態碼是 0 ,則返回0,不然拋異常

subprocess.check_call(["ls", "-l"])
subprocess.check_call("exit 1", shell=True)

check_output

執行命令,若是狀態碼是 0 ,則返回執行結果,不然拋異常

subprocess.check_output(["echo", "Hello World!"])
subprocess.check_output("exit 1", shell=True)

subprocess.Popen(...)

用於執行復雜的系統命令

參數:

    • args:shell命令,能夠是字符串或者序列類型(如:list,元組)
    • bufsize:指定緩衝。0 無緩衝,1 行緩衝,其餘 緩衝區大小,負值 系統緩衝
    • stdin, stdout, stderr:分別表示程序的標準輸入、輸出、錯誤句柄
    • preexec_fn:只在Unix平臺下有效,用於指定一個可執行對象(callable object),它將在子進程運行以前被調用
    • close_sfs:在windows平臺下,若是close_fds被設置爲True,則新建立的子進程將不會繼承父進程的輸入、輸出、錯誤管道。
      因此不能將close_fds設置爲True同時重定向子進程的標準輸入、輸出與錯誤(stdin, stdout, stderr)。
    • shell:同上
    • cwd:用於設置子進程的當前目錄
    • env:用於指定子進程的環境變量。若是env = None,子進程的環境變量將從父進程中繼承。
    • universal_newlines:不一樣系統的換行符不一樣,True -> 贊成使用 \n
    • startupinfo與createionflags只在windows下有效
      將被傳遞給底層的CreateProcess()函數,用於設置子進程的一些屬性,如:主窗口的外觀,進程的優先級等等
import subprocess
ret1 = subprocess.Popen(["mkdir","t1"])
ret2 = subprocess.Popen("mkdir t2", shell=True)
執行普通命令

終端輸入的命令分爲兩種:

  • 輸入便可獲得輸出,如:ifconfig
  • 輸入進行某環境,依賴再輸入,如:python
import subprocess

obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)
View Code
import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
obj.stdin.write('print 1 \n ')
obj.stdin.write('print 2 \n ')
obj.stdin.write('print 3 \n ')
obj.stdin.write('print 4 \n ')
obj.stdin.close()

cmd_out = obj.stdout.read()
obj.stdout.close()
cmd_error = obj.stderr.read()
obj.stderr.close()

print cmd_out
print cmd_error
View Code
import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
obj.stdin.write('print 1 \n ')
obj.stdin.write('print 2 \n ')
obj.stdin.write('print 3 \n ')
obj.stdin.write('print 4 \n ')

out_error_list = obj.communicate()
print out_error_list
View Code
import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out_error_list = obj.communicate('print "hello"')
print out_error_list
View Code

更多猛擊這裏

logging模塊  

不少程序都有記錄日誌的需求,而且日誌中包含的信息即有正常的程序訪問日誌,還可能有錯誤、警告等信息輸出,python的logging模塊提供了標準的日誌接口,你能夠經過它存儲各類格式的日誌,logging的日誌能夠分爲 debug()info()warning()error() and critical() 5個級別,下面咱們看一下怎麼用。

最簡單用法

import logging
 
logging.warning("user [alex] attempted wrong password more than 3 times")
logging.critical("server is down")
 
#輸出
WARNING:root:user [alex] attempted wrong password more than 3 times
CRITICAL:root:server is down

看一下這幾個日誌級別分別表明什麼意思

 

若是想把日誌寫到文件裏,也很簡單

import logging
 
logging.basicConfig(filename='example.log',level=logging.INFO)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')

其中下面這句中的level=loggin.INFO意思是,把日誌紀錄級別設置爲INFO,也就是說,只有比日誌是INFO或比INFO級別更高的日誌纔會被紀錄到文件裏,在這個例子, 第一條日誌是不會被紀錄的,若是但願紀錄debug的日誌,那把日誌級別改爲DEBUG就好了。

logging.basicConfig(filename='example.log',level=logging.INFO)

感受上面的日誌格式忘記加上時間啦,日誌不知道時間怎麼行呢,下面就來加上!

import logging
logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
logging.warning('is when this event was logged.')
 
#輸出
12/12/2010 11:46:36 AM is when this event was logged.

 

若是想同時把log打印在屏幕和文件日誌裏,就須要瞭解一點複雜的知識 了


Python 使用logging模塊記錄日誌涉及四個主要類,使用官方文檔中的歸納最爲合適:

logger提供了應用程序能夠直接使用的接口;

handler將(logger建立的)日誌記錄發送到合適的目的輸出;

filter提供了細度設備來決定輸出哪條日誌記錄;

formatter決定日誌記錄的最終輸出格式。

logger
每一個程序在輸出信息以前都要得到一個Logger。Logger一般對應了程序的模塊名,好比聊天工具的圖形界面模塊能夠這樣得到它的Logger:
LOG=logging.getLogger(」chat.gui」)
而核心模塊能夠這樣:
LOG=logging.getLogger(」chat.kernel」)

Logger.setLevel(lel):指定最低的日誌級別,低於lel的級別將被忽略。debug是最低的內置級別,critical爲最高
Logger.addFilter(filt)、Logger.removeFilter(filt):添加或刪除指定的filter
Logger.addHandler(hdlr)、Logger.removeHandler(hdlr):增長或刪除指定的handler
Logger.debug()、Logger.info()、Logger.warning()、Logger.error()、Logger.critical():能夠設置的日誌級別

 

handler

handler對象負責發送相關的信息到指定目的地。Python的日誌系統有多種Handler可使用。有些Handler能夠把信息輸出到控制檯,有些Logger能夠把信息輸出到文件,還有些 Handler能夠把信息發送到網絡上。若是以爲不夠用,還能夠編寫本身的Handler。能夠經過addHandler()方法添加多個多handler
Handler.setLevel(lel):指定被處理的信息級別,低於lel級別的信息將被忽略
Handler.setFormatter():給這個handler選擇一個格式
Handler.addFilter(filt)、Handler.removeFilter(filt):新增或刪除一個filter對象


每一個Logger能夠附加多個Handler。接下來咱們就來介紹一些經常使用的Handler:
1) logging.StreamHandler
使用這個Handler能夠向相似與sys.stdout或者sys.stderr的任何文件對象(file object)輸出信息。它的構造函數是:
StreamHandler([strm])
其中strm參數是一個文件對象。默認是sys.stderr


2) logging.FileHandler
和StreamHandler相似,用於向一個文件輸出日誌信息。不過FileHandler會幫你打開這個文件。它的構造函數是:
FileHandler(filename[,mode])
filename是文件名,必須指定一個文件名。
mode是文件的打開方式。參見Python內置函數open()的用法。默認是’a',即添加到文件末尾。

3) logging.handlers.RotatingFileHandler
這個Handler相似於上面的FileHandler,可是它能夠管理文件大小。當文件達到必定大小以後,它會自動將當前日誌文件更名,而後建立 一個新的同名日誌文件繼續輸出。好比日誌文件是chat.log。當chat.log達到指定的大小以後,RotatingFileHandler自動把 文件更名爲chat.log.1。不過,若是chat.log.1已經存在,會先把chat.log.1重命名爲chat.log.2。。。最後從新建立 chat.log,繼續輸出日誌信息。它的構造函數是:
RotatingFileHandler( filename[, mode[, maxBytes[, backupCount]]])
其中filename和mode兩個參數和FileHandler同樣。
maxBytes用於指定日誌文件的最大文件大小。若是maxBytes爲0,意味着日誌文件能夠無限大,這時上面描述的重命名過程就不會發生。
backupCount用於指定保留的備份文件的個數。好比,若是指定爲2,當上面描述的重命名過程發生時,原有的chat.log.2並不會被改名,而是被刪除。


4) logging.handlers.TimedRotatingFileHandler
這個Handler和RotatingFileHandler相似,不過,它沒有經過判斷文件大小來決定什麼時候從新建立日誌文件,而是間隔必定時間就 自動建立新的日誌文件。重命名的過程與RotatingFileHandler相似,不過新的文件不是附加數字,而是當前時間。它的構造函數是:
TimedRotatingFileHandler( filename [,when [,interval [,backupCount]]])
其中filename參數和backupCount參數和RotatingFileHandler具備相同的意義。
interval是時間間隔。
when參數是一個字符串。表示時間間隔的單位,不區分大小寫。它有如下取值:
S 秒
M 分
H 小時
D 天
W 每星期(interval==0時表明星期一)
midnight 天天凌晨

 

import logging
 
#create logger
logger = logging.getLogger('TEST-LOG')
logger.setLevel(logging.DEBUG)
 
 
# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
 
# create file handler and set level to warning
fh = logging.FileHandler("access.log")
fh.setLevel(logging.WARNING)
# create formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
 
# add formatter to ch and fh
ch.setFormatter(formatter)
fh.setFormatter(formatter)
 
# add ch and fh to logger
logger.addHandler(ch)
logger.addHandler(fh)
 
# 'application' code
logger.debug('debug message')
logger.info('info message')
logger.warn('warn message')
logger.error('error message')
logger.critical('critical message')

文件自動截斷例子

import logging

from logging import handlers

logger = logging.getLogger(__name__)

log_file = "timelog.log"
#fh = handlers.RotatingFileHandler(filename=log_file,maxBytes=10,backupCount=3)
fh = handlers.TimedRotatingFileHandler(filename=log_file,when="S",interval=5,backupCount=3)


formatter = logging.Formatter('%(asctime)s %(module)s:%(lineno)d %(message)s')

fh.setFormatter(formatter)

logger.addHandler(fh)


logger.warning("test1")
logger.warning("test12")
logger.warning("test13")
logger.warning("test14")

re模塊   

經常使用正則表達式符號

'.'     默認匹配除\n以外的任意一個字符,若指定flag DOTALL,則匹配任意字符,包括換行
'^'     匹配字符開頭,若指定flags MULTILINE,這種也能夠匹配上(r"^a","\nabc\neee",flags=re.MULTILINE)
'$'     匹配字符結尾,或e.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()也能夠
'*'     匹配*號前的字符0次或屢次,re.findall("ab*","cabb3abcbbac")  結果爲['abb', 'ab', 'a']
'+'     匹配前一個字符1次或屢次,re.findall("ab+","ab+cd+abb+bba") 結果['ab', 'abb']
'?'     匹配前一個字符1次或0次
'{m}'   匹配前一個字符m次
'{n,m}' 匹配前一個字符n到m次,re.findall("ab{1,3}","abb abc abbcbbb") 結果'abb', 'ab', 'abb']
'|'     匹配|左或|右的字符,re.search("abc|ABC","ABCBabcCD").group() 結果'ABC'
'(...)' 分組匹配,re.search("(abc){2}a(123|456)c", "abcabca456c").group() 結果 abcabca456c
 
 
'\A'    只從字符開頭匹配,re.search("\Aabc","alexabc") 是匹配不到的
'\Z'    匹配字符結尾,同$
'\d'    匹配數字0-9
'\D'    匹配非數字
'\w'    匹配[A-Za-z0-9]
'\W'    匹配非[A-Za-z0-9]
's'     匹配空白字符、\t、\n、\r , re.search("\s+","ab\tc1\n3").group() 結果 '\t'
 
'(?P<name>...)' 分組匹配 re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict("city") 結果{'province': '3714', 'city': '81', 'birthday': '1993'}

最經常使用的匹配語法

re.match 從頭開始匹配
re.search 匹配包含
re.findall 把全部匹配到的字符放到以列表中的元素返回
re.splitall 以匹配到的字符當作列表分隔符
re.sub      匹配字符並替換

反斜槓的困擾
與大多數編程語言相同,正則表達式裏使用"\"做爲轉義字符,這就可能形成反斜槓困擾。假如你須要匹配文本中的字符"\",那麼使用編程語言表示的正則表達式裏將須要4個反斜槓"\\\\":前兩個和後兩個分別用於在編程語言裏轉義成反斜槓,轉換成兩個反斜槓後再在正則表達式裏轉義成一個反斜槓。Python裏的原生字符串很好地解決了這個問題,這個例子中的正則表達式可使用r"\\"表示。一樣,匹配一個數字的"\\d"能夠寫成r"\d"。有了原生字符串,你不再用擔憂是否是漏寫了反斜槓,寫出來的表達式也更直觀。

僅需輕輕知道的幾個匹配模式

re.I(re.IGNORECASE): 忽略大小寫(括號內是完整寫法,下同)
M(MULTILINE): 多行模式,改變'^'和'$'的行爲(參見上圖)
S(DOTALL): 點任意匹配模式,改變'.'的行爲

本節做業

開發一個簡單的python計算器

  1. 實現加減乘除及拓號優先級解析
  2. 用戶輸入 1 - 2 * ( (60-30 +(-40/5) * (9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2) )等相似公式後,必須本身解析裏面的(),+,-,*,/符號和公式(不能調用eval等相似功能偷懶實現),運算後得出結果,結果必須與真實的計算器所得出的結果一致

hint:

re.search(r'\([^()]+\)',s).group()

'(-40/5)'

  

軟件目錄結構規範

 爲何要設計好目錄結構?

"設計項目目錄結構",就和"代碼編碼風格"同樣,屬於我的風格問題。對於這種風格上的規範,一直都存在兩種態度:

  1. 一類同窗認爲,這種我的風格問題"可有可無"。理由是能讓程序work就好,風格問題根本不是問題。
  2. 另外一類同窗認爲,規範化能更好的控制程序結構,讓程序具備更高的可讀性。

我是比較偏向於後者的,由於我是前一類同窗思想行爲下的直接受害者。我曾經維護過一個很是很差讀的項目,其實現的邏輯並不複雜,可是卻耗費了我很是長的時間去理解它想表達的意思。今後我我的對於提升項目可讀性、可維護性的要求就很高了。"項目目錄結構"其實也是屬於"可讀性和可維護性"的範疇,咱們設計一個層次清晰的目錄結構,就是爲了達到如下兩點:

  1. 可讀性高: 不熟悉這個項目的代碼的人,一眼就能看懂目錄結構,知道程序啓動腳本是哪一個,測試目錄在哪兒,配置文件在哪兒等等。從而很是快速的瞭解這個項目。
  2. 可維護性高: 定義好組織規則後,維護者就能很明確地知道,新增的哪一個文件和代碼應該放在什麼目錄之下。這個好處是,隨着時間的推移,代碼/配置的規模增長,項目結構不會混亂,仍然可以組織良好。

因此,我認爲,保持一個層次清晰的目錄結構是有必要的。更況且組織一個良好的工程目錄,實際上是一件很簡單的事兒。

目錄組織方式

關於如何組織一個較好的Python工程目錄結構,已經有一些獲得了共識的目錄結構。在Stackoverflow的這個問題上,能看到你們對Python目錄結構的討論。

這裏面說的已經很好了,我也不打算從新造輪子列舉各類不一樣的方式,這裏面我說一下個人理解和體會。

假設你的項目名爲foo, 我比較建議的最方便快捷目錄結構這樣就足夠了:

Foo/
|-- bin/
|   |-- foo
|
|-- foo/
|   |-- tests/
|   |   |-- __init__.py
|   |   |-- test_main.py
|   |
|   |-- __init__.py
|   |-- main.py
|
|-- docs/
|   |-- conf.py
|   |-- abc.rst
|
|-- setup.py
|-- requirements.txt
|-- README

簡要解釋一下:

  1. bin/: 存放項目的一些可執行文件,固然你能夠起名script/之類的也行。
  2. foo/: 存放項目的全部源代碼。(1) 源代碼中的全部模塊、包都應該放在此目錄。不要置於頂層目錄。(2) 其子目錄tests/存放單元測試代碼; (3) 程序的入口最好命名爲main.py
  3. docs/: 存放一些文檔。
  4. setup.py: 安裝、部署、打包的腳本。
  5. requirements.txt: 存放軟件依賴的外部Python包列表。
  6. README: 項目說明文件。

除此以外,有一些方案給出了更加多的內容。好比LICENSE.txt,ChangeLog.txt文件等,我沒有列在這裏,由於這些東西主要是項目開源的時候須要用到。若是你想寫一個開源軟件,目錄該如何組織,能夠參考這篇文章

下面,再簡單講一下我對這些目錄的理解和我的要求吧。

關於README的內容

這個我以爲是每一個項目都應該有的一個文件,目的是能簡要描述該項目的信息,讓讀者快速瞭解這個項目。

它須要說明如下幾個事項:

  1. 軟件定位,軟件的基本功能。
  2. 運行代碼的方法: 安裝環境、啓動命令等。
  3. 簡要的使用說明。
  4. 代碼目錄結構說明,更詳細點能夠說明軟件的基本原理。
  5. 常見問題說明。

我以爲有以上幾點是比較好的一個README。在軟件開發初期,因爲開發過程當中以上內容可能不明確或者發生變化,並非必定要在一開始就將全部信息都補全。可是在項目完結的時候,是須要撰寫這樣的一個文檔的。

能夠參考Redis源碼中Readme的寫法,這裏面簡潔可是清晰的描述了Redis功能和源碼結構。

關於requirements.txt和setup.py

setup.py

通常來講,用setup.py來管理代碼的打包、安裝、部署問題。業界標準的寫法是用Python流行的打包工具setuptools來管理這些事情。這種方式廣泛應用於開源項目中。不過這裏的核心思想不是用標準化的工具來解決這些問題,而是說,一個項目必定要有一個安裝部署工具,能快速便捷的在一臺新機器上將環境裝好、代碼部署好和將程序運行起來。

這個我是踩過坑的。

我剛開始接觸Python寫項目的時候,安裝環境、部署代碼、運行程序這個過程全是手動完成,遇到過如下問題:

  1. 安裝環境時常常忘了最近又添加了一個新的Python包,結果一到線上運行,程序就出錯了。
  2. Python包的版本依賴問題,有時候咱們程序中使用的是一個版本的Python包,可是官方的已是最新的包了,經過手動安裝就可能裝錯了。
  3. 若是依賴的包不少的話,一個一個安裝這些依賴是很費時的事情。
  4. 新同窗開始寫項目的時候,將程序跑起來很是麻煩,由於可能常常忘了要怎麼安裝各類依賴。

setup.py能夠將這些事情自動化起來,提升效率、減小出錯的機率。"複雜的東西自動化,能自動化的東西必定要自動化。"是一個很是好的習慣。

setuptools的文檔比較龐大,剛接觸的話,可能不太好找到切入點。學習技術的方式就是看他人是怎麼用的,能夠參考一下Python的一個Web框架,flask是如何寫的: setup.py

固然,簡單點本身寫個安裝腳本(deploy.sh)替代setup.py也何嘗不可。

requirements.txt

這個文件存在的目的是:

  1. 方便開發者維護軟件的包依賴。將開發過程當中新增的包添加進這個列表中,避免在setup.py安裝依賴時漏掉軟件包。
  2. 方便讀者明確項目使用了哪些Python包。

這個文件的格式是每一行包含一個包依賴的說明,一般是flask>=0.10這種格式,要求是這個格式能被pip識別,這樣就能夠簡單的經過 pip install -r requirements.txt來把全部Python包依賴都裝好了。具體格式說明: 點這裏

 

關於配置文件的使用方法

注意,在上面的目錄結構中,沒有將conf.py放在源碼目錄下,而是放在docs/目錄下。

不少項目對配置文件的使用作法是:

  1. 配置文件寫在一個或多個python文件中,好比此處的conf.py。
  2. 項目中哪一個模塊用到這個配置文件就直接經過import conf這種形式來在代碼中使用配置。

這種作法我不太贊同:

  1. 這讓單元測試變得困難(由於模塊內部依賴了外部配置)
  2. 另外一方面配置文件做爲用戶控制程序的接口,應當能夠由用戶自由指定該文件的路徑。
  3. 程序組件可複用性太差,由於這種貫穿全部模塊的代碼硬編碼方式,使得大部分模塊都依賴conf.py這個文件。

因此,我認爲配置的使用,更好的方式是,

  1. 模塊的配置都是能夠靈活配置的,不受外部配置文件的影響。
  2. 程序的配置也是能夠靈活控制的。

可以佐證這個思想的是,用過nginx和mysql的同窗都知道,nginx、mysql這些程序均可以自由的指定用戶配置。

因此,不該當在代碼中直接import conf來使用配置文件。上面目錄結構中的conf.py,是給出的一個配置樣例,不是在寫死在程序中直接引用的配置文件。能夠經過給main.py啓動參數指定配置路徑的方式來讓程序讀取配置內容。固然,這裏的conf.py你能夠換個相似的名字,好比settings.py。或者你也可使用其餘格式的內容來編寫配置文件,好比settings.yaml之類的。

相關文章
相關標籤/搜索