a = '我是中國人'.encode('utf-8')  # Python 3 strings are Unicode by default; this encodes the string as UTF-8
1. Importing modules:
By default, modules located in the following directories can be imported directly:
import sys
print(sys.path)

Result:
['/Users/wupeiqi/PycharmProjects/calculator/p1/pp1',
 '/usr/local/lib/python2.7/site-packages/setuptools-15.2-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/distribute-0.6.28-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/MySQL_python-1.2.4b4-py2.7-macosx-10.10-x86_64.egg',
 '/usr/local/lib/python2.7/site-packages/xlutils-1.7.1-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/xlwt-1.0.0-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/xlrd-0.9.3-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/tornado-4.1-py2.7-macosx-10.10-x86_64.egg',
 '/usr/local/lib/python2.7/site-packages/backports.ssl_match_hostname-3.4.0.2-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/certifi-2015.4.28-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/pyOpenSSL-0.15.1-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/six-1.9.0-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/cryptography-0.9.1-py2.7-macosx-10.10-x86_64.egg',
 '/usr/local/lib/python2.7/site-packages/cffi-1.1.1-py2.7-macosx-10.10-x86_64.egg',
 '/usr/local/lib/python2.7/site-packages/ipaddress-1.0.7-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/enum34-1.0.4-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/pyasn1-0.1.7-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/idna-2.0-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/pycparser-2.13-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/Django-1.7.8-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/paramiko-1.10.1-py2.7.egg',
 '/usr/local/lib/python2.7/site-packages/gevent-1.0.2-py2.7-macosx-10.10-x86_64.egg',
 '/usr/local/lib/python2.7/site-packages/greenlet-0.4.7-py2.7-macosx-10.10-x86_64.egg',
 '/Users/wupeiqi/PycharmProjects/calculator',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python27.zip',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-old',
 '/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload',
 '/usr/local/lib/python2.7/site-packages',
 '/Library/Python/2.7/site-packages']

If the path you need is not in the sys.path list, add it with sys.path.append('path').
Importing a module simply tells the Python interpreter to load and execute that .py file.
So which path does an import use as its base? The answer is sys.path.
import module
from module.xx.xx import xx
from module.xx.xx import xx as rename
from module.xx.xx import *
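The role of sys.path can be checked with a small experiment. This is only a sketch: the module name demo_mod_syspath and the temporary directory are hypothetical and exist purely for illustration.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a hypothetical module.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_mod_syspath.py"), "w", encoding="utf-8") as f:
    f.write("GREETING = 'hello from demo_mod_syspath'\n")

# The directory is not on sys.path yet, so the import fails.
try:
    importlib.import_module("demo_mod_syspath")
    found_before = True
except ImportError:
    found_before = False

# Once the directory is appended, the interpreter can locate the file.
sys.path.append(module_dir)
importlib.invalidate_caches()
demo_mod = importlib.import_module("demo_mod_syspath")

print(found_before)
print(demo_mod.GREETING)
```

Appending to sys.path only affects the running interpreter; it does not change the search path for other processes.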
2. Importing custom modules:
import sys
import os

# Directory two levels above this file, i.e. the project root
project_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(project_path)
Writing your own module
A Python tab-completion module
For macOS:
#!/usr/bin/env python
# python startup file
import sys
import readline
import rlcompleter
import atexit
import os

# tab completion
readline.parse_and_bind('tab: complete')
# history file
histfile = os.path.join(os.environ['HOME'], '.pythonhistory')
try:
    readline.read_history_file(histfile)
except IOError:
    pass
atexit.register(readline.write_history_file, histfile)
del os, histfile, readline, rlcompleter

Once the file is saved (as tab.py), it is ready to use:
localhost:~ jieli$ python
Python 2.7.10 (default, Oct 23 2015, 18:05:06)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tab
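You will notice that the tab.py module written above can only be imported from the directory it lives in. To use it from anywhere on the system, put tab.py into a directory on Python's global module search path, usually one named Python/2.7/site-packages. The location of this directory differs between operating systems; print(sys.path) shows the full search path.
For more on using custom and open-source modules, see http://www.cnblogs.com/wupeiqi/articles/4963027.html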
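3. Commonly used built-in modules:
getpass module:
To keep a password hidden while it is typed, use the getpass function from the getpass module:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import getpass

# Store what the user types in the pwd variable (nothing is echoed)
pwd = getpass.getpass("請輸入密碼:")

# Print what was entered
print(pwd)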
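3.1 sys module: operations related to the Python interpreter:
sys.argv      list of command-line arguments; the first element is the path of the program itself
sys.exit(n)   exit the program; use exit(0) for a normal exit
sys.version   version information of the Python interpreter
sys.maxint    largest int value (Python 2 only; Python 3 provides sys.maxsize instead)
sys.path      module search path, initialized in part from the PYTHONPATH environment variable
sys.platform  name of the operating system platform
sys.stdin     standard input
sys.stdout    standard output
sys.stderr    standard error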
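A short sketch of how a few of these attributes might be read together. The helper describe_interpreter and the sample argument list are hypothetical; in a real script you would pass sys.argv.

```python
import sys


def describe_interpreter(argv):
    """Summarise a few sys attributes for the given argument list."""
    return {
        "script": argv[0] if argv else "",   # first element is the program path
        "args": list(argv[1:]),              # remaining command-line arguments
        "major_version": sys.version_info[0],
        "platform": sys.platform,
        "max_size": sys.maxsize,             # Python 3 replacement for sys.maxint
    }


info = describe_interpreter(["demo.py", "--verbose", "input.txt"])
print(info)

# Diagnostics usually go to stderr so they do not mix with normal output.
sys.stderr.write("this line goes to standard error\n")
```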
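Example: a progress bar:
import sys
import time


def view_bar(num, total):
    rate = float(num) / float(total)
    rate_num = int(rate * 100)
    # '\r' returns the cursor to the start of the line so the value is overwritten
    r = '\r%d%%' % (rate_num, )
    sys.stdout.write(r)
    sys.stdout.flush()


if __name__ == '__main__':
    # Count from 1 to 100 so the bar actually reaches 100%
    for i in range(1, 101):
        time.sleep(0.1)
        view_bar(i, 100)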
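3.2 os module: system-level operations:
os.getcwd()  get the current working directory, i.e. the directory the Python script is running in
os.chdir("dirname")  change the current working directory; like cd in the shell
os.curdir  the current directory: ('.')
os.pardir  the parent directory of the current directory, as a string: ('..')
os.makedirs('dir1/dir2')  create nested directories recursively
os.removedirs('dirname1')  remove the directory if it is empty, then move up to the parent and remove it too if empty, and so on
os.mkdir('dirname')  create a single directory; like mkdir dirname in the shell
os.rmdir('dirname')  remove a single empty directory; raises an error if it is not empty; like rmdir dirname in the shell
os.listdir('dirname')  list all files and subdirectories in the given directory, including hidden files, as a list
os.remove()  delete a file
os.rename("oldname","new")  rename a file or directory
os.stat('path/filename')  get file or directory information
os.sep  the path separator of the operating system: "\\" on Windows, "/" on Linux
os.linesep  the line terminator of the current platform: "\r\n" on Windows, "\n" on Linux
os.pathsep  the string that separates entries in a search path: ";" on Windows, ":" on Linux
os.name  string naming the current platform: Windows -> 'nt'; Linux -> 'posix'
os.system("bash command")  run a shell command; its output is shown directly
os.environ  the system environment variables
os.path.abspath(path)  return the normalized absolute form of path
os.path.split(path)  split path into a (directory, filename) tuple
os.path.dirname(path)  return the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  return the final filename of path, i.e. the second element of os.path.split(path); if path ends with / or \ the result is an empty string
os.path.exists(path)  return True if path exists, False otherwise
os.path.isabs(path)  return True if path is absolute
os.path.isfile(path)  return True if path is an existing file, False otherwise
os.path.isdir(path)  return True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  join several path components; components before the last absolute path are discarded
os.path.getatime(path)  return the last access time of the file or directory path points to
os.path.getmtime(path)  return the last modification time of the file or directory path points to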
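Several of the calls above can be tried safely inside a temporary directory. The following is a minimal sketch; the directory and file names are hypothetical.

```python
import os
import tempfile

base = tempfile.mkdtemp()
nested = os.path.join(base, "dir1", "dir2")
os.makedirs(nested)                      # creates dir1 and dir2 in one call

file_path = os.path.join(nested, "note.txt")
with open(file_path, "w", encoding="utf-8") as f:
    f.write("hi")

directory, filename = os.path.split(file_path)
same_dir = os.path.dirname(file_path) == directory
trailing = os.path.basename(nested + os.sep)   # path ending in a separator
joined = os.path.join("a", os.sep + "b", "c")  # "a" is dropped: a later part is absolute

print(filename, same_dir, repr(trailing), joined)

os.remove(file_path)
os.removedirs(nested)   # removes dir2, then dir1 and base as each becomes empty
```

Note that os.removedirs keeps walking upwards until it meets a non-empty directory, so it should only be used on trees you created yourself.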
3.3 time & datetime modules
#_*_coding:utf-8_*_
__author__ = 'Alex Li'

import time

# print(time.clock())  # processor time; deprecated since 3.3 and removed in 3.8. Use time.process_time(), which measures CPU time excluding sleep; unstable, could not be measured on a Mac
# print(time.altzone)  # offset of the local DST timezone from UTC, in seconds
# print(time.asctime())  # returns a string such as "Fri Aug 19 11:14:16 2016"
# print(time.localtime())  # local time as a struct_time object
# print(time.gmtime(time.time()-800000))  # UTC time as a struct_time object
# print(time.asctime(time.localtime()))  # returns a string such as "Fri Aug 19 11:14:16 2016"
# print(time.ctime())  # returns "Fri Aug 19 12:38:29 2016", same format as above

# Date string -> timestamp
# string_2_struct = time.strptime("2016/05/22","%Y/%m/%d")  # date string -> struct_time
# print(string_2_struct)
#
# struct_2_stamp = time.mktime(string_2_struct)  # struct_time -> timestamp
# print(struct_2_stamp)

# Timestamp -> string
# print(time.gmtime(time.time()-86640))  # timestamp -> UTC struct_time
# print(time.strftime("%Y-%m-%d %H:%M:%S",time.gmtime()))  # UTC struct_time -> formatted string

# Time arithmetic
import datetime

# print(datetime.datetime.now())  # returns 2016-08-19 12:47:03.941925
# print(datetime.date.fromtimestamp(time.time()))  # timestamp -> date, e.g. 2016-08-19
# print(datetime.datetime.now())
# print(datetime.datetime.now() + datetime.timedelta(3))  # current time + 3 days
# print(datetime.datetime.now() + datetime.timedelta(-3))  # current time - 3 days
# print(datetime.datetime.now() + datetime.timedelta(hours=3))  # current time + 3 hours
# print(datetime.datetime.now() + datetime.timedelta(minutes=30))  # current time + 30 minutes
#
# c_time = datetime.datetime.now()
# print(c_time.replace(minute=3,hour=2))  # replace fields of a datetime
Directive | Meaning | Notes |
---|---|---|
%a | Locale’s abbreviated weekday name. | |
%A | Locale’s full weekday name. | |
%b | Locale’s abbreviated month name. | |
%B | Locale’s full month name. | |
%c | Locale’s appropriate date and time representation. | |
%d | Day of the month as a decimal number [01,31]. | |
%H | Hour (24-hour clock) as a decimal number [00,23]. | |
%I | Hour (12-hour clock) as a decimal number [01,12]. | |
%j | Day of the year as a decimal number [001,366]. | |
%m | Month as a decimal number [01,12]. | |
%M | Minute as a decimal number [00,59]. | |
%p | Locale’s equivalent of either AM or PM. | (1) |
%S | Second as a decimal number [00,61]. | (2) |
%U | Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0. | (3) |
%w | Weekday as a decimal number [0(Sunday),6]. | |
%W | Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0. | (3) |
%x | Locale’s appropriate date representation. | |
%X | Locale’s appropriate time representation. | |
%y | Year without century as a decimal number [00,99]. | |
%Y | Year with century as a decimal number. | |
%z | Time zone offset indicating a positive or negative time difference from UTC/GMT of the form +HHMM or -HHMM, where H represents decimal hour digits and M represents decimal minute digits [-23:59, +23:59]. | |
%Z | Time zone name (no characters if no time zone exists). | |
%% | A literal '%' character. | |
3.4 The datetime module supports arithmetic in days and seconds and is mostly used for time calculations
Common examples using both modules:
import time
import datetime

print(time.altzone/3600)
print(time.asctime())
t = time.localtime()  # local time
t = time.localtime(time.time() + 3600*3)  # add 3 hours; a struct_time cannot do day arithmetic by itself
print(t.tm_year, t.tm_yday)  # year and day of the year
print(time.time())  # timestamp: seconds since 1970
print(time.gmtime())  # UTC time
print(time.ctime())  # current time as a string
print(time.strptime('2016-11-11 23:20','%Y-%m-%d %H:%M'))  # string -> struct_time
t2 = time.strptime('2016-11-11 23:20','%Y-%m-%d %H:%M')
t2_stamp = time.mktime(t2)
print(time.mktime(t2))  # convert to a timestamp before doing arithmetic
t3 = time.localtime(t2_stamp)  # timestamp -> struct_time
t3_str = time.strftime('%Y_%m_%d_%H_%M.log', t3)
print(t3_str)

print('datetime'.center(60,'_'))
print(datetime.datetime.now())  # current time
print(datetime.datetime.fromtimestamp(time.time()))
print(datetime.datetime.now() + datetime.timedelta(days=3))  # current date + 3 days
print(datetime.datetime.now() + datetime.timedelta(hours=3))  # + 3 hours
now = datetime.datetime.now()
print(now.replace(month=1,day=1))
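The conversions above form a round trip: string to struct_time to timestamp and back, with datetime handling the arithmetic. A minimal sketch with a fixed date so the result does not depend on the current time:

```python
import datetime
import time

text = "2016-11-11 23:20"
fmt = "%Y-%m-%d %H:%M"

# time module: string -> struct_time -> timestamp -> string
struct = time.strptime(text, fmt)
stamp = time.mktime(struct)
round_trip = time.strftime(fmt, time.localtime(stamp))

# datetime module: parse once, then use timedelta for arithmetic
moment = datetime.datetime.strptime(text, fmt)
three_days_later = moment + datetime.timedelta(days=3)
gap = three_days_later - moment

print(round_trip)
print(three_days_later)
print(gap.total_seconds())
```

Subtracting two datetime objects yields a timedelta, which is why datetime is the more convenient choice for day-level calculations.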
3.5 json & pickle modules
Two modules used for serialization
The json module provides four functions: dumps, dump, loads, load
The pickle module provides four functions: dumps, dump, loads, load
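The four functions pair up the same way in both modules: dumps/loads work on in-memory values, dump/load work on file objects. A minimal sketch, using in-memory files from the io module so nothing is written to disk:

```python
import io
import json
import pickle

data = {"k1": 123, "k2": "hello"}

# dumps / loads: to and from an in-memory value
json_text = json.dumps(data)        # str, readable by other languages
pickled = pickle.dumps(data)        # bytes, Python-specific
json_back = json.loads(json_text)
pickle_back = pickle.loads(pickled)

# dump / load: to and from a file object
text_file = io.StringIO()           # json needs a text-mode file
json.dump(data, text_file)
text_file.seek(0)
from_text_file = json.load(text_file)

binary_file = io.BytesIO()          # pickle needs a binary-mode file
pickle.dump(data, binary_file)
binary_file.seek(0)
from_binary_file = pickle.load(binary_file)

print(json_text)
```

Only unpickle data you trust: pickle.load can execute arbitrary code embedded in the input.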
3.6 shelve module
The shelve module is a simple key/value store that persists in-memory data to a file; it can persist any Python data that pickle supports
import shelve

d = shelve.open('shelve_test')  # open a file


class Test(object):
    def __init__(self, n):
        self.n = n


t = Test(123)
t2 = Test(123334)

name = ["alex", "rain", "test"]
d["test"] = name  # persist a list
d["t1"] = t  # persist a class instance
d["t2"] = t2

d.close()
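Reading the data back works like reading a dictionary. A minimal sketch, assuming a hypothetical store name shelve_demo inside a temporary directory:

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "shelve_demo")

# Write: the context manager closes (and flushes) the shelf on exit.
with shelve.open(path) as db:
    db["names"] = ["alex", "rain", "test"]
    db["settings"] = {"retries": 3}

# Read: reopen the same file and look values up by key.
with shelve.open(path) as db:
    names = db["names"]
    keys = sorted(db.keys())
    missing = db.get("no_such_key", "default")

print(names, keys, missing)
```

To load instances of a custom class, the class definition must be importable when the shelf is read, because pickle stores only a reference to it.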
3.7 hashlib:
Used for hashing operations; it replaces the old md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512 and MD5 algorithms
Example:
import hashlib

# ######## md5 ########
hash = hashlib.md5()
# help(hash.update)
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())
print(hash.digest())

# ######## sha1 ########
hash = hashlib.sha1()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())

# ######## sha256 ########
hash = hashlib.sha256()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())

# ######## sha384 ########
hash = hashlib.sha384()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())

# ######## sha512 ########
hash = hashlib.sha512()
hash.update(bytes('admin', encoding='utf-8'))
print(hash.hexdigest())
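Although the algorithms above are strong, they share a weakness: a plain hash can be reversed by looking it up in a table of precomputed hashes of common inputs. It is therefore necessary to mix a custom key (a salt) into the hash.
import hashlib

# ######## md5 ########
hash = hashlib.md5(bytes('898oaFs09f', encoding="utf-8"))  # the custom key goes in first
hash.update(bytes('admin', encoding="utf-8"))
print(hash.hexdigest())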
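What the key does can be made explicit: passing it to the constructor and then calling update is the same as hashing the key followed by the message. A small sketch, reusing the key and message from the example above:

```python
import hashlib

salt = b"898oaFs09f"
password = b"admin"

# Key passed to the constructor, message added afterwards.
salted = hashlib.md5(salt)
salted.update(password)

# Hashing the concatenation in one call gives the same digest.
one_shot = hashlib.md5(salt + password).hexdigest()

# Without the key the digest is the widely published md5 of "admin".
unsalted = hashlib.md5(password).hexdigest()

print(salted.hexdigest() == one_shot)   # True
print(salted.hexdigest() == unsalted)   # False
```

This only illustrates the mechanism. For storing real passwords, a dedicated key-derivation function such as hashlib.pbkdf2_hmac is generally preferred over a single salted MD5.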
For more articles on md5, sha1, sha256 and so on, see https://www.tbs-certificates.co.uk/FAQ/en/sha256.html
The same input always produces the same md5 digest.
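Python also ships an hmac module, which further processes the key and the content we supply before hashing them
import hmac

# digestmod is required since Python 3.8; 'md5' was the implicit default in older versions
h = hmac.new(bytes('898oaFs09f', encoding="utf-8"), digestmod='md5')
h.update(bytes('admin', encoding="utf-8"))
print(h.hexdigest())

import hmac

h_obj = hmac.new(b'salt', b'hello', digestmod='md5')  # key b'salt', message b'hello'
print(h_obj.hexdigest())

/usr/bin/python3.5 /home/ld/mytest/day3/test_hashlib.py
3a2484b4f0df4f4157d069598a334b31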
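A typical use of hmac is to check that a message was not altered. The sketch below swaps in SHA-256 as the digest (an assumption; the examples above rely on MD5) and uses hmac.compare_digest for the comparison.

```python
import hashlib
import hmac

key = b"898oaFs09f"
message = b"admin"

# Sender: compute the MAC in one call.
mac = hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()

# Receiver: recompute incrementally and compare in constant time.
check = hmac.new(key, digestmod=hashlib.sha256)
check.update(message)
matches = hmac.compare_digest(mac, check.hexdigest())

# A modified message produces a different MAC.
tampered = hmac.new(key, b"admin!", digestmod=hashlib.sha256).hexdigest()
tamper_detected = not hmac.compare_digest(mac, tampered)

print(matches, tamper_detected)
```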
3.8 random module: random numbers
import random

print(random.random())
print(random.randint(1, 2))
print(random.randrange(1, 10))
print(random.sample(range(100), 5))  # pick 5 distinct numbers from the range 0-99

import random
import string  # string module

str_source = string.ascii_letters + string.digits
print(random.sample(str_source, 5))  # pick 5 distinct characters from the string above
Example: generating a 4-character verification code:
import random

checkcode = ''
for i in range(4):
    current = random.randrange(0, 4)
    if current != i:
        temp = chr(random.randint(65, 90))  # an uppercase letter A-Z
    else:
        temp = random.randint(0, 9)  # a digit 0-9
    checkcode += str(temp)
print(checkcode)
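The loop above picks a letter or a digit at each position by chance. A more direct sketch draws every character from one alphabet; the helper name make_check_code is hypothetical.

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits


def make_check_code(length=4):
    """Return a random code of uppercase letters and digits."""
    return "".join(random.choice(ALPHABET) for _ in range(length))


code = make_check_code()
long_code = make_check_code(8)
print(code, long_code)
```

The random module is not suitable for security-sensitive tokens; the standard library's secrets module exists for that purpose.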
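3.9 re
The re module provides regular expression operations in Python
Characters:
. matches any character except a newline
\w matches a letter, digit, underscore or Chinese character
\s matches any whitespace character
\d matches a digit
\b matches the start or end of a word
^ matches the start of the string
$ matches the end of the string
Repetition:
* repeat zero or more times
+ repeat one or more times
? repeat zero or one time
{n} repeat n times
{n,} repeat n or more times
{n,m} repeat n to m times
Commonly used regular expression symbols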
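'.'  by default matches any single character except \n; with the DOTALL flag it matches any character, including newline
'^'  matches the start of the string; with the MULTILINE flag this also matches: re.search(r"^a","\nabc\neee",flags=re.MULTILINE)
'$'  matches the end of the string; re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group() also matches
'*'  matches the character before * zero or more times; re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'  matches the preceding character one or more times; re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']
'?'  matches the preceding character one or zero times
'{m}'  matches the preceding character m times
'{n,m}'  matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'  matches the expression on the left or on the right of |; re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'
'(...)'  grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives abcabca456c
'\A'  matches only at the start of the string; re.search(r"\Aabc","alexabc") finds nothing
'\Z'  matches at the end of the string, similar to $
'\d'  matches a digit 0-9
'\D'  matches a non-digit
'\w'  matches [A-Za-z0-9_]
'\W'  matches anything not in [A-Za-z0-9_]
'\s'  matches whitespace such as \t, \n, \r; re.search(r"\s+","ab\tc1\n3").group() gives '\t'
'(?P<name>...)'  named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict() gives {'province': '3714', 'city': '81', 'birthday': '1993'}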
Grouping example with an ID card number:
>>> re.search(r"(\d{2})(\d{2})(\d{4})","371481199206143421 name alex").groups()
('37', '14', '8119')
Examples:
>>> import re
>>> re.findall(r"\D+","ab3c4sdfd45634sfsd26ds6")
['ab', 'c', 'sdfd', 'sfsd', 'ds']
>>> re.findall(r"\d+","ab3c4sdfd45634sfsd26ds6")
['3', '4', '45634', '26', '6']
>>> re.split(r"\d+","ab3c4sdfd45634sfsd26ds6")
['ab', 'c', 'sdfd', 'sfsd', 'ds', '']
>>> re.sub(r"\d+","|","ab3c4sdfd45634sfsd26ds6")
'ab|c|sdfd|sfsd|ds|'
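The backslash problem
Like most programming languages, regular expressions use "\" as the escape character, which can lead to a pile-up of backslashes. To match a literal "\" in the text, the regular expression written as an ordinary string needs four backslashes, "\\\\": each pair is first turned into a single backslash by the programming language, and the resulting two backslashes are then read by the regular expression engine as one escaped backslash. Python's raw strings solve this neatly: the same expression can be written as r"\\". Likewise, "\\d", which matches a digit, can be written as r"\d". With raw strings you no longer have to worry about a missing backslash, and the expressions are easier to read.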
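The equivalence described above can be checked directly. A minimal sketch; the sample text is hypothetical.

```python
import re

text = "C:\\temp"   # the actual text is C:\temp, containing one backslash

# Ordinary string: four backslashes in source are needed to match one in the text.
escaped = re.findall("\\\\", text)
# Raw string: two are enough, because Python does no escaping of its own.
raw = re.findall(r"\\", text)

# The same holds for character classes such as \d.
digits_escaped = re.findall("\\d", "a1b22")
digits_raw = re.findall(r"\d", "a1b22")

print(escaped == raw, digits_escaped == digits_raw)
print("\\d" == r"\d")
```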
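A few matching modes worth knowing
re.I (re.IGNORECASE): ignore case (the full name is given in parentheses, likewise below)
re.M (re.MULTILINE): multi-line mode; changes the behavior of '^' and '$' (see above)
re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.'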
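The effect of each flag is easiest to see on one small multi-line string. A minimal sketch; the sample text is hypothetical.

```python
import re

text = "Alex\nalex\nALEX"

# No flags: ^ and $ anchor to the whole string and matching is case-sensitive.
plain = re.findall(r"^alex$", text)
# MULTILINE: ^ and $ also match at each line boundary.
multi = re.findall(r"^alex$", text, re.M)
# Flags combine with |.
both = re.findall(r"^alex$", text, re.I | re.M)

# '.' stops at newlines unless DOTALL is given.
dot_default = re.findall(r"A.+X", text)
dot_all = re.findall(r"A.+X", text, re.S)

print(plain, multi, both)
print(dot_default, dot_all)
```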
match
# match: matches from the start of the string; returns a match object on success and None on failure
match(pattern, string, flags=0)
# pattern: the regular expression
# string : the string to match against
# flags  : matching mode
    X  VERBOSE     Ignore whitespace and comments for nicer looking RE's.
    I  IGNORECASE  Perform case-insensitive matching.
    M  MULTILINE   "^" matches the beginning of lines (after a newline)
                   as well as the string.
                   "$" matches the end of lines (before a newline) as well
                   as the end of the string.
    S  DOTALL      "." matches any character at all, including the newline.
    A  ASCII       For string patterns, make \w, \W, \b, \B, \d, \D
                   match the corresponding ASCII character categories
                   (rather than the whole Unicode categories, which is the
                   default).
                   For bytes patterns, this flag is the only available
                   behaviour and needn't be specified.
    L  LOCALE      Make \w, \W, \b, \B, dependent on the current locale.
    U  UNICODE     For compatibility only. Ignored for string patterns (it
                   is the default), and forbidden for bytes patterns.
demo:
import re

origin = "hello alex bcd abcd lge acd 19"  # sample string, the same one used in the findall demo

# Without groups
r = re.match(r"h\w+", origin)
print(r.group())      # the whole match
print(r.groups())     # the captured groups
print(r.groupdict())  # the named groups

# With groups
# Why use groups? To extract specific parts of a successful match (the whole pattern must match first, then the grouped parts are pulled out)
r = re.match(r"h(\w+).*(?P<name>\d)$", origin)
print(r.group())      # the whole match
print(r.groups())     # the captured groups
print(r.groupdict())  # all groups that were given a name (key)
search:
# search: scans the whole string and returns the first match; returns None if nothing matches
# search(pattern, string, flags=0)
# Without groups
r = re.search(r"a\w+", origin)
print(r.group())      # the whole match
print(r.groups())     # the captured groups
print(r.groupdict())  # the named groups

# With groups
r = re.search(r"a(\w+).*(?P<name>\d)$", origin)
print(r.group())      # the whole match
print(r.groups())     # the captured groups
print(r.groupdict())  # all groups that were given a name (key)

>>> re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}","inet 地址:192.168.12.55 廣播:192.168.12.255").group()
'192.168.12.55'
>>> print(re.search(r"(\d{1,3}\.){3}\d{1,3}", "inet 地址:192.168.12.55 廣播:192.168.12.255"))
<_sre.SRE_Match object; span=(8, 21), match='192.168.12.55'>
>>> print(re.search(r"(\d{1,3}\.){3}\d{1,3}", "inet 地址:192.168.12.55 廣播:192.168.12.255").group(0))
192.168.12.55
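findall:
# findall: returns a list of all non-overlapping matches; if the pattern has one group, the list holds the string captured by that group for each match; if the pattern has several groups, each item of the list is a tuple
# Empty matches are included in the result
# findall(pattern, string, flags=0)
import re

origin = "hello alex bcd abcd lge acd 19"

# Without groups
r = re.findall(r"a\w+", origin)
print(r)

# With groups
r = re.findall(r"a((\w*)c)(d)", origin)
print(r)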
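How the return value changes with the number of groups can be seen by running three patterns over the same string. A minimal sketch using the sample string from the demo above:

```python
import re

origin = "hello alex bcd abcd lge acd 19"

# No group: each item is the whole match.
no_group = re.findall(r"a\w+", origin)
# One group: each item is the text captured by that group.
one_group = re.findall(r"a(\w+)", origin)
# Several groups: each item is a tuple with one entry per group.
many_groups = re.findall(r"a((\w*)c)(d)", origin)

print(no_group)     # ['alex', 'abcd', 'acd']
print(one_group)    # ['lex', 'bcd', 'cd']
print(many_groups)  # [('bc', 'b', 'd'), ('c', '', 'd')]
```

To keep whole matches while still using parentheses, a non-capturing group (?:...) can be used instead.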
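sub:
# sub: replaces the parts of the string that match the pattern
sub(pattern, repl, string, count=0, flags=0)
# pattern: the regular expression
# repl   : the replacement string or a callable
# string : the string to match against
# count  : maximum number of replacements
# flags  : matching mode
# Groups make no difference here
origin = "hello alex bcd alex lge alex acd 19"
r = re.sub(r"a\w+", "999", origin, 2)
print(r)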
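The repl argument may also be a function that receives each match object. A minimal sketch showing count, a callable replacement, and re.subn, which additionally reports how many replacements were made:

```python
import re

origin = "hello alex bcd alex lge alex acd 19"

# count limits how many matches are replaced, from left to right.
limited = re.sub(r"a\w+", "999", origin, count=2)

# A callable computes the replacement from each match object.
upper = re.sub(r"a\w+", lambda m: m.group().upper(), origin)

# subn returns (new_string, number_of_replacements).
replaced, total = re.subn(r"a\w+", "999", origin)

print(limited)
print(upper)
print(total)
```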
split:
# split: splits the string at each match of the pattern
split(pattern, string, maxsplit=0, flags=0)
# pattern : the regular expression
# string  : the string to match against
# maxsplit: maximum number of splits
# flags   : matching mode
# Without groups
origin = "hello alex bcd alex lge alex acd 19"
r = re.split("alex", origin, 1)
print(r)

# With groups
origin = "hello alex bcd alex lge alex acd 19"
r1 = re.split("(alex)", origin, 1)
print(r1)
r2 = re.split("(al(ex))", origin, 1)
print(r2)
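With groups in the pattern, the captured text is kept in the result list rather than thrown away. A minimal sketch with the outputs of the three calls above spelled out:

```python
import re

origin = "hello alex bcd alex lge alex acd 19"

# No group: the separator is dropped.
r = re.split("alex", origin, maxsplit=1)
# One group: the separator is kept as its own item.
r1 = re.split("(alex)", origin, maxsplit=1)
# Nested groups: every group contributes an item.
r2 = re.split("(al(ex))", origin, maxsplit=1)

print(r)   # ['hello ', ' bcd alex lge alex acd 19']
print(r1)  # ['hello ', 'alex', ' bcd alex lge alex acd 19']
print(r2)  # ['hello ', 'alex', 'ex', ' bcd alex lge alex acd 19']
```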
Commonly used regular expressions:
IP: ^(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}$
Mobile phone number: ^1[3458][0-9]\d{8}$
Email: [a-zA-Z0-9_-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+
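A pattern like the IP expression is usually compiled once and wrapped in a small helper. A minimal sketch; the function name is_ipv4 is hypothetical.

```python
import re

# The IP pattern from the list above, written as a raw string.
IP_PATTERN = re.compile(
    r"^(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}$"
)


def is_ipv4(text):
    """Return True if text looks like a dotted-quad IPv4 address."""
    return IP_PATTERN.match(text) is not None


results = {
    "192.168.12.55": is_ipv4("192.168.12.55"),
    "255.255.255.255": is_ipv4("255.255.255.255"),
    "256.1.1.1": is_ipv4("256.1.1.1"),     # first octet out of range
    "1.2.3": is_ipv4("1.2.3"),             # only three octets
}
print(results)
```

For production code the standard library's ipaddress module is a sturdier way to validate addresses than a regular expression.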
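4. Serialization:
Python has two modules for serialization
The json module provides four functions: dumps, dump, loads, load
The pickle module provides four functions: dumps, dump, loads, load
demo:
import pickle

data = {'k1': 123, 'k2': 'hello'}

# pickle.dumps converts the data into a byte string that only Python understands
p_str = pickle.dumps(data)

# pickle.dump does the same conversion and writes the result to a file (binary mode is required)
with open('d:/result.pk', 'wb') as fp:
    pickle.dump(data, fp)

import json

# json.dumps converts the data into a string that every programming language understands
j_str = json.dumps(data)
print(j_str)

# json.dump does the same conversion and writes the result to a file
with open('d:/result.json', 'w') as fp:
    json.dump(data, fp)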
5. configparser: configparser handles files in a specific format; underneath it simply uses open to work with the file.
# comment 1
; comment 2

[section1]  # section
k1 = v1     # value
k2:v2       # value

[section2]  # section
k1 = v1     # value
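A file in this format can be parsed without touching the disk by using read_string. The sketch below is an assumption-laden illustration: the sample text is hypothetical, and inline_comment_prefixes is switched on because ConfigParser does not strip trailing "# ..." notes by default.

```python
import configparser

sample = """
# comment 1
; comment 2

[section1]
k1 = v1   # trailing note, stripped because inline comments are enabled
k2:v2

[section2]
k1 = 10
"""

config = configparser.ConfigParser(inline_comment_prefixes=("#", ";"))
config.read_string(sample)

sections = config.sections()
k1 = config.get("section1", "k1")
k2 = config.get("section1", "k2")        # ':' works as a delimiter just like '='
number = config.getint("section2", "k1")  # typed getter converts the string

print(sections, k1, k2, number)
```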
How do you generate a file in this format yourself?
import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Host Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'       # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)

Once written, the file can be read back:
>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
compressionlevel
serveraliveinterval
compression
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'
5.1 Get all sections:
Only entries written in [] form are returned as sections.
import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')
ret = config.sections()
print(ret)

5.2 Get all key/value pairs of a section:
import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')
ret = config.items('section1')
print(ret)

5.3 Get all keys of a section:
import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')
ret = config.options('section1')
print(ret)

5.4 Get the value of a given key in a section
import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')

v = config.get('section1', 'k1')
# v = config.getint('section1', 'k1')
# v = config.getfloat('section1', 'k1')
# v = config.getboolean('section1', 'k1')
print(v)

5.5 Check, delete and add sections
import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')

# Check
has_sec = config.has_section('section1')
print(has_sec)

# Add a section
config.add_section("SEC_1")
with open('xxxooo', 'w') as f:
    config.write(f)

# Delete a section
config.remove_section("SEC_1")
with open('xxxooo', 'w') as f:
    config.write(f)

5.6 Check, delete and set key/value pairs within a section
import configparser

config = configparser.ConfigParser()
config.read('xxxooo', encoding='utf-8')

# Check
has_opt = config.has_option('section1', 'k1')
print(has_opt)

# Delete
config.remove_option('section1', 'k1')
with open('xxxooo', 'w') as f:
    config.write(f)

# Set
config.set('section1', 'k10', "123")
with open('xxxooo', 'w') as f:
    config.write(f)
6. XML
XML is a format for exchanging data between different languages or programs. An XML file looks like this:
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2023</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2026</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2026</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>
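6.1 Parsing XML
from xml.etree import ElementTree as ET

# Open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# Parse the string into an XML object; root refers to the root node of the XML file
root = ET.XML(str_xml)

from xml.etree import ElementTree as ET

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()

6.2 Working with XML
XML is a structure of nodes nested inside nodes. Every node offers the following functionality for operating on it: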
class Element:
    """An XML element.

    This class is the reference implementation of the Element interface.

    An element's length is its number of subelements.  That means if you
    want to check if an element is truly empty, you should check BOTH
    its length AND its text attribute.

    The element tag, attribute names, and attribute values can be either
    bytes or strings.

    *tag* is the element name.  *attrib* is an optional dictionary containing
    element attributes. *extra* are additional element attributes given as
    keyword arguments.

    Example form:
        <tag attrib>text<child/>...</tag>tail

    """

    # Tag name of the current node
    tag = None
    """The element's name."""

    # Attributes of the current node
    attrib = None
    """Dictionary of the element's attributes."""

    # Text content of the current node
    text = None
    """
    Text before first subelement.  This is either a string or the value None.
    Note that if there is no text, this attribute may be either
    None or the empty string, depending on the parser.

    """

    tail = None
    """
    Text after this element's end tag, but before the next sibling element's
    start tag.  This is either a string or the value None.  Note that if there
    was no text, this attribute may be either None or an empty string,
    depending on the parser.

    """

    def __init__(self, tag, attrib={}, **extra):
        if not isinstance(attrib, dict):
            raise TypeError("attrib must be dict, not %s" % (
                attrib.__class__.__name__,))
        attrib = attrib.copy()
        attrib.update(extra)
        self.tag = tag
        self.attrib = attrib
        self._children = []

    def __repr__(self):
        return "<%s %r at %#x>" % (self.__class__.__name__, self.tag, id(self))

    def makeelement(self, tag, attrib):
        # Create a new node
        """Create a new element with the same type.

        *tag* is a string containing the element name.
        *attrib* is a dictionary containing the element attributes.

        Do not call this method, use the SubElement factory function instead.

        """
        return self.__class__(tag, attrib)

    def copy(self):
        """Return copy of current element.

        This creates a shallow copy. Subelements will be shared with the
        original tree.

        """
        elem = self.makeelement(self.tag, self.attrib)
        elem.text = self.text
        elem.tail = self.tail
        elem[:] = self
        return elem

    def __len__(self):
        return len(self._children)

    def __bool__(self):
        warnings.warn(
            "The behavior of this method will change in future versions.  "
            "Use specific 'len(elem)' or 'elem is not None' test instead.",
            FutureWarning, stacklevel=2
            )
        return len(self._children) != 0 # emulate old behaviour, for now

    def __getitem__(self, index):
        return self._children[index]

    def __setitem__(self, index, element):
        # if isinstance(index, slice):
        #     for elt in element:
        #         assert iselement(elt)
        # else:
        #     assert iselement(element)
        self._children[index] = element

    def __delitem__(self, index):
        del self._children[index]

    def append(self, subelement):
        # Append a child node to the current node
        """Add *subelement* to the end of this element.

        The new element will appear in document order after the last existing
        subelement (or directly after the text, if it's the first subelement),
        but before the end tag for this element.

        """
        self._assert_is_element(subelement)
        self._children.append(subelement)

    def extend(self, elements):
        # Extend the current node with n child nodes
        """Append subelements from a sequence.

        *elements* is a sequence with zero or more elements.

        """
        for element in elements:
            self._assert_is_element(element)
        self._children.extend(elements)

    def insert(self, index, subelement):
        # Insert a node among the children of the current node, i.e. create a child node and put it at the given position
        """Insert *subelement* at position *index*."""
        self._assert_is_element(subelement)
        self._children.insert(index, subelement)

    def _assert_is_element(self, e):
        # Need to refer to the actual Python implementation, not the
        # shadowing C implementation.
        if not isinstance(e, _Element_Py):
            raise TypeError('expected an Element, not %s' % type(e).__name__)

    def remove(self, subelement):
        # Remove a node from the children of the current node
        """Remove matching subelement.

        Unlike the find methods, this method compares elements based on
        identity, NOT ON tag value or contents.  To remove subelements by
        other means, the easiest way is to use a list comprehension to
        select what elements to keep, and then use slice assignment to update
        the parent element.

        ValueError is raised if a matching element could not be found.

        """
        # assert iselement(element)
        self._children.remove(subelement)

    def getchildren(self):
        # Get all child nodes (deprecated)
        """(Deprecated) Return all subelements.

        Elements are returned in document order.

        """
        warnings.warn(
            "This method will be removed in future versions.  "
            "Use 'list(elem)' or iteration over elem instead.",
            DeprecationWarning, stacklevel=2
            )
        return self._children

    def find(self, path, namespaces=None):
        # Get the first matching child node
        """Find first matching element by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Return the first matching element, or None if no element was found.

        """
        return ElementPath.find(self, path, namespaces)

    def findtext(self, path, default=None, namespaces=None):
        # Get the text content of the first matching child node
        """Find text for first matching element by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *default* is the value to return if the element was not found,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Return text content of first matching element, or default value if
        none was found.  Note that if an element is found having no text
        content, the empty string is returned.

        """
        return ElementPath.findtext(self, path, default, namespaces)

    def findall(self, path, namespaces=None):
        # Get all matching child nodes
        """Find all matching subelements by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Returns list containing all matching elements in document order.

        """
        return ElementPath.findall(self, path, namespaces)

    def iterfind(self, path, namespaces=None):
        # Get all matching nodes as an iterator (usable in a for loop)
        """Find all matching subelements by tag name or path.

        *path* is a string having either an element tag or an XPath,
        *namespaces* is an optional mapping from namespace prefix to full name.

        Return an iterable yielding all matching elements in document order.

        """
        return ElementPath.iterfind(self, path, namespaces)

    def clear(self):
        # Clear the node
        """Reset element.

        This function removes all subelements, clears all attributes, and sets
        the text and tail attributes to None.

        """
        self.attrib.clear()
        self._children = []
        self.text = self.tail = None

    def get(self, key, default=None):
        # Get an attribute value of the current node
        """Get element attribute.

        Equivalent to attrib.get, but some implementations may handle this a
        bit more efficiently.  *key* is what attribute to look for, and
        *default* is what to return if the attribute was not found.

        Returns a string containing the attribute value, or the default if
        attribute was not found.

        """
        return self.attrib.get(key, default)

    def set(self, key, value):
        # Set an attribute value on the current node
        """Set element attribute.

        Equivalent to attrib[key] = value, but some implementations may handle
        this a bit more efficiently.  *key* is what attribute to set, and
        *value* is the attribute value to set it to.

        """
        self.attrib[key] = value

    def keys(self):
        # Get the keys of all attributes of the current node
        """Get list of attribute names.

        Names are returned in an arbitrary order, just like an ordinary
        Python dict.  Equivalent to attrib.keys()

        """
        return self.attrib.keys()

    def items(self):
        # Get all attributes of the current node; each attribute is a key/value pair
        """Get element attributes as a sequence.

        The attributes are returned in arbitrary order.  Equivalent to
        attrib.items().

        Return a list of (name, value) tuples.

        """
        return self.attrib.items()

    def iter(self, tag=None):
        # Search the current node and its descendants by tag name and return an iterator over all matching nodes (usable in a for loop)
        """Create tree iterator.

        The iterator loops over the element and all subelements in document
        order, returning all elements with a matching tag.

        If the tree structure is modified during iteration, new or removed
        elements may or may not be included.  To get a stable set, use the
        list() function on the iterator, and loop over the resulting list.

        *tag* is what tags to look for (default is to return all elements)

        Return an iterator containing all the matching elements.

        """
        if tag == "*":
            tag = None
        if tag is None or self.tag == tag:
            yield self
        for e in self._children:
            yield from e.iter(tag)

    # compatibility
    def getiterator(self, tag=None):
        # Change for a DeprecationWarning in 1.4
        warnings.warn(
            "This method will be removed in future versions.  "
            "Use 'elem.iter()' or 'list(elem.iter())' instead.",
            PendingDeprecationWarning, stacklevel=2
        )
        return list(self.iter(tag))

    def itertext(self):
        # Iterate over the text content of the current node and its descendants (usable in a for loop)
        """Create text iterator.

        The iterator loops over the element and all subelements in document
        order, returning all inner text.

        """
        tag = self.tag
        if not isinstance(tag, str) and tag is not None:
            return
        if self.text:
            yield self.text
        for e in self:
            yield from e.itertext()
            if e.tail:
                yield e.tail
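Since every node has the methods above, and the parsing step already gave us root (the root node of the XML file), these methods can be used to work with the XML file.
a. Traversing the whole XML document
from xml.etree import ElementTree as ET

############ Parsing method 1 ############
"""
# Open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# Parse the string into an XML object; root refers to the root node of the XML file
root = ET.XML(str_xml)
"""
############ Parsing method 2 ############

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()

### Operations

# Top-level tag
print(root.tag)

# Traverse the second level of the XML document
for child in root:
    # Tag name and attributes of each second-level node
    print(child.tag, child.attrib)
    # Traverse the third level of the XML document
    for i in child:
        # Tag name and text of each third-level node
        print(i.tag, i.text)

b. Traversing specific nodes in the XML
from xml.etree import ElementTree as ET

############ Parsing method 1 ############
"""
# Open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# Parse the string into an XML object; root refers to the root node of the XML file
root = ET.XML(str_xml)
"""
############ Parsing method 2 ############

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()

### Operations

# Top-level tag
print(root.tag)

# Traverse all year nodes in the XML
for node in root.iter('year'):
    # Tag name and text of the node
    print(node.tag, node.text)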
c. Modifying node content
Modifications are made in memory and do not affect the contents of the file. To make a change permanent, the in-memory content has to be written back to a file.
from xml.etree import ElementTree as ET

############ Parsing method 1 ############

# Open the file and read the XML content
str_xml = open('xo.xml', 'r').read()

# Parse the string into an XML object; root refers to the root node of the XML file
root = ET.XML(str_xml)

############ Operations ############

# Top-level tag
print(root.tag)

# Loop over all year nodes
for node in root.iter('year'):
    # Increment the content of the year node by one
    new_year = int(node.text) + 1
    node.text = str(new_year)

    # Set attributes
    node.set('name', 'alex')
    node.set('age', '18')
    # Delete an attribute
    del node.attrib['name']

############ Save the file ############
tree = ET.ElementTree(root)
tree.write("newnew.xml", encoding='utf-8')

from xml.etree import ElementTree as ET

############ Parsing method 2 ############

# Parse the XML file directly
tree = ET.parse("xo.xml")

# Get the root node of the XML file
root = tree.getroot()

############ Operations ############

# Top-level tag
print(root.tag)

# Loop over all year nodes
for node in root.iter('year'):
    # Increment the content of the year node by one
    new_year = int(node.text) + 1
    node.text = str(new_year)

    # Set attributes
    node.set('name', 'alex')
    node.set('age', '18')
    # Delete an attribute
    del node.attrib['name']

############ Save the file ############
tree.write("newnew.xml", encoding='utf-8')
d. Delete nodes
    from xml.etree import ElementTree as ET

    ############ Parse from a string ############

    # Open the file and read the XML content
    str_xml = open('xo.xml', 'r').read()

    # Parse the string into an XML object; root is the root node of the XML file
    root = ET.XML(str_xml)

    ############ Operations ############

    # Top-level tag
    print(root.tag)

    # Walk every country node under data
    for country in root.findall('country'):
        # Read the content of the rank node under each country node
        rank = int(country.find('rank').text)

        if rank > 50:
            # Delete this country node
            root.remove(country)

    ############ Save the file ############
    tree = ET.ElementTree(root)
    tree.write("newnew.xml", encoding='utf-8')

    from xml.etree import ElementTree as ET

    ############ Parse from a file ############

    # Parse the XML file directly
    tree = ET.parse("xo.xml")

    # Get the root node of the XML file
    root = tree.getroot()

    ############ Operations ############

    # Top-level tag
    print(root.tag)

    # Walk every country node under data
    for country in root.findall('country'):
        # Read the content of the rank node under each country node
        rank = int(country.find('rank').text)

        if rank > 50:
            # Delete this country node
            root.remove(country)

    ############ Save the file ############
    tree.write("newnew.xml", encoding='utf-8')
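The traversal, modification and deletion steps above all depend on a file named xo.xml. As a minimal sketch that runs without that file, the snippet below parses an inline string instead. The sample data and the helper name bump_years_and_prune are assumptions made for illustration, not part of the original examples.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample data standing in for xo.xml
SAMPLE_XML = """
<data>
    <country name="Liechtenstein">
        <rank>2</rank>
        <year>2023</year>
    </country>
    <country name="Panama">
        <rank>69</rank>
        <year>2026</year>
    </country>
</data>
"""


def bump_years_and_prune(xml_text, max_rank=50):
    """Add one to every <year> and drop countries ranked above max_rank."""
    root = ET.XML(xml_text)

    # Modify: iter() walks every descendant with the given tag
    for node in root.iter('year'):
        node.text = str(int(node.text) + 1)

    # Delete: findall() returns a list, so removing while looping is safe
    for country in root.findall('country'):
        if int(country.find('rank').text) > max_rank:
            root.remove(country)
    return root


root = bump_years_and_prune(SAMPLE_XML)
print([c.get('name') for c in root.findall('country')])
print([y.text for y in root.iter('year')])
```

Only the in-memory tree changes here; as in the examples above, nothing reaches disk until the tree is written out.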
6.3 Creating an XML document
    from xml.etree import ElementTree as ET

    # Build the tree with ET.Element and append()

    # Create the root node
    root = ET.Element("family")

    # Create the elder son
    son1 = ET.Element('son', {'name': '兒1'})
    # Create the younger son
    son2 = ET.Element('son', {"name": '兒2'})

    # Create two grandsons under the elder son
    grandson1 = ET.Element('grandson', {'name': '兒11'})
    grandson2 = ET.Element('grandson', {'name': '兒12'})
    son1.append(grandson1)
    son1.append(grandson2)

    # Add both sons to the root node
    root.append(son1)
    root.append(son2)

    tree = ET.ElementTree(root)
    tree.write('oooo.xml', encoding='utf-8', short_empty_elements=False)

    from xml.etree import ElementTree as ET

    # Build the tree with makeelement() and append()

    # Create the root node
    root = ET.Element("family")

    # Create the elder son
    # son1 = ET.Element('son', {'name': '兒1'})
    son1 = root.makeelement('son', {'name': '兒1'})
    # Create the younger son
    # son2 = ET.Element('son', {"name": '兒2'})
    son2 = root.makeelement('son', {"name": '兒2'})

    # Create two grandsons under the elder son
    # grandson1 = ET.Element('grandson', {'name': '兒11'})
    grandson1 = son1.makeelement('grandson', {'name': '兒11'})
    # grandson2 = ET.Element('grandson', {'name': '兒12'})
    grandson2 = son1.makeelement('grandson', {'name': '兒12'})

    son1.append(grandson1)
    son1.append(grandson2)

    # Add both sons to the root node
    root.append(son1)
    root.append(son2)

    tree = ET.ElementTree(root)
    tree.write('oooo.xml', encoding='utf-8', short_empty_elements=False)

    from xml.etree import ElementTree as ET

    # Build the tree with ET.SubElement, which creates and attaches in one step

    # Create the root node
    root = ET.Element("family")

    # Create the elder son
    son1 = ET.SubElement(root, "son", attrib={'name': '兒1'})
    # Create the younger son
    son2 = ET.SubElement(root, "son", attrib={"name": "兒2"})

    # Create one grandson under the elder son
    grandson1 = ET.SubElement(son1, "age", attrib={'name': '兒11'})
    grandson1.text = '孫子'

    et = ET.ElementTree(root)  # build the document object
    et.write("test.xml", encoding="utf-8", xml_declaration=True, short_empty_elements=False)
By default the XML is saved without indentation. To get indented output, change the way the document is saved:
    from xml.etree import ElementTree as ET
    from xml.dom import minidom


    def prettify(elem):
        """Convert a node to a string and add indentation."""
        rough_string = ET.tostring(elem, 'utf-8')
        reparsed = minidom.parseString(rough_string)
        return reparsed.toprettyxml(indent="\t")

    # Create the root node
    root = ET.Element("family")

    # Create the elder son
    # son1 = ET.Element('son', {'name': '兒1'})
    son1 = root.makeelement('son', {'name': '兒1'})
    # Create the younger son
    # son2 = ET.Element('son', {"name": '兒2'})
    son2 = root.makeelement('son', {"name": '兒2'})

    # Create two grandsons under the elder son
    # grandson1 = ET.Element('grandson', {'name': '兒11'})
    grandson1 = son1.makeelement('grandson', {'name': '兒11'})
    # grandson2 = ET.Element('grandson', {'name': '兒12'})
    grandson2 = son1.makeelement('grandson', {'name': '兒12'})

    son1.append(grandson1)
    son1.append(grandson2)

    # Add both sons to the root node
    root.append(son1)
    root.append(son2)

    raw_str = prettify(root)

    with open("xxxoo.xml", 'w', encoding='utf-8') as f:
        f.write(raw_str)
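As an alternative to the minidom round trip, Python 3.9 and later ship ET.indent(), which adds the whitespace to the tree in place. The sketch below assumes Python 3.9+; the element and attribute names are made up for the example.

```python
from xml.etree import ElementTree as ET

# Build a small tree (names are illustrative only)
root = ET.Element("family")
son1 = ET.SubElement(root, "son", attrib={"name": "son1"})
ET.SubElement(son1, "grandson", attrib={"name": "grandson11"})
ET.SubElement(root, "son", attrib={"name": "son2"})

# indent() rewrites text/tail whitespace in place; requires Python 3.9+
ET.indent(root, space="\t")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Because indent() changes the tree itself, a later tree.write() call produces indented output with no extra string handling.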
6.4 Namespaces
For a detailed introduction, follow the link here.
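Since this section only points elsewhere, here is a rough sketch of how ElementTree treats namespaces. The document, the namespace URIs and the prefixes in NS are all invented for the example. The point is that a namespaced tag is stored as {uri}name, and that find() accepts a prefix-to-URI mapping.

```python
from xml.etree import ElementTree as ET

# Hypothetical document with one default and one prefixed namespace
XML = """
<root xmlns="http://example.com/default" xmlns:p="http://example.com/people">
    <p:person>alex</p:person>
    <item>book</item>
</root>
"""

# Prefixes used in our queries; they need not match those in the document
NS = {"d": "http://example.com/default", "p": "http://example.com/people"}

root = ET.XML(XML)
print(root.tag)  # the tag carries the namespace URI in braces

person = root.find("p:person", NS)
item = root.find("d:item", NS)
print(person.text, item.text)
```

A plain root.find("item") would return None here, because the element lives in the default namespace.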
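7. requests
The Python standard library provides urllib and related modules for HTTP requests, but the API is clumsy. It was built for a different era and a different web, and even the simplest task takes a great deal of work, sometimes including overriding methods.
7.1 Sending a GET request
    import urllib.request

    f = urllib.request.urlopen('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
    result = f.read().decode('utf-8')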
7.2 Sending a GET request with request headers:
    import urllib.request

    req = urllib.request.Request('http://www.example.com/')
    req.add_header('Referer', 'http://www.python.org/')
    r = urllib.request.urlopen(req)

    result = r.read().decode('utf-8')
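A request object can also be built and inspected without sending anything, which is handy for checking the URL and headers first. This is a small sketch using only the standard library; the query parameters are placeholders borrowed from the requests examples in this article.

```python
import urllib.parse
import urllib.request

# Encode query parameters (placeholder values)
params = urllib.parse.urlencode({'key1': 'value1', 'key2': 'value2'})
url = 'http://www.example.com/?' + params

# Headers can be passed at construction time instead of add_header()
req = urllib.request.Request(url, headers={'Referer': 'http://www.python.org/'})

print(req.full_url)
print(req.get_method())            # GET, because no data was supplied
print(req.get_header('Referer'))
```

Nothing goes over the network until urllib.request.urlopen(req) is called.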
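Note: for more, see the official Python documentation: https://docs.python.org/3.5/library/urllib.request.html#module-urllib.request
Requests is an HTTP library written in Python and released under the Apache2 license. It wraps Python's built-in modules at a high level, which makes network requests far more pleasant to write. With Requests you can easily do anything a browser can do.
1. Install the module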
pip3 install requests
2. Using the module
GET requests
    # 1. Example without parameters
    import requests

    ret = requests.get('https://github.com/timeline.json')

    print(ret.url)
    print(ret.text)

    # 2. Example with parameters
    import requests

    payload = {'key1': 'value1', 'key2': 'value2'}
    ret = requests.get("http://httpbin.org/get", params=payload)

    print(ret.url)
    print(ret.text)
POST requests
    # 1. Basic POST example
    import requests

    payload = {'key1': 'value1', 'key2': 'value2'}
    ret = requests.post("http://httpbin.org/post", data=payload)

    print(ret.text)

    # 2. Example sending request headers and data
    import requests
    import json

    url = 'https://api.github.com/some/endpoint'
    payload = {'some': 'data'}
    headers = {'content-type': 'application/json'}

    ret = requests.post(url, data=json.dumps(payload), headers=headers)

    print(ret.text)
    print(ret.cookies)
    requests.get(url, params=None, **kwargs)
    requests.post(url, data=None, json=None, **kwargs)
    requests.put(url, data=None, **kwargs)
    requests.head(url, **kwargs)
    requests.delete(url, **kwargs)
    requests.patch(url, data=None, **kwargs)
    requests.options(url, **kwargs)

    # All of the methods above are built on top of this one
    requests.request(method, url, **kwargs)
More documentation for the requests module: http://cn.python-requests.org/zh_CN/latest/
3. HTTP request and XML examples
Example: check whether a QQ account is online
    import urllib.request
    import requests
    from xml.etree import ElementTree as ET

    # Use the built-in urllib module to send the HTTP request and fetch the XML content
    """
    f = urllib.request.urlopen('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
    result = f.read().decode('utf-8')
    """

    # Use the third-party requests module to send the HTTP request and fetch the XML content
    r = requests.get('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=424662508')
    result = r.text

    # Parse the XML content
    node = ET.XML(result)

    # Read the result
    if node.text == "Y":
        print("online")
    else:
        print("offline")
Example: look up the stops of a train
    import urllib.request
    import requests
    from xml.etree import ElementTree as ET

    # Use the built-in urllib module to send the HTTP request and fetch the XML content
    """
    f = urllib.request.urlopen('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
    result = f.read().decode('utf-8')
    """

    # Use the third-party requests module to send the HTTP request and fetch the XML content
    r = requests.get('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')
    result = r.text

    # Parse the XML content
    root = ET.XML(result)
    for node in root.iter('TrainDetailInfo'):
        print(node.find('TrainStation').text, node.find('StartTime').text, node.tag, node.attrib)
Note: for more interfaces, follow the link here.
8. logging module
A thread-safe module for convenient logging.
8.1 Single-file logging
    import logging

    logging.basicConfig(filename='log.log',
                        format='%(asctime)s - %(name)s - %(levelname)s -%(module)s:  %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S %p',
                        level=10)

    logging.debug('debug')
    logging.info('info')
    logging.warning('warning')
    logging.error('error')
    logging.critical('critical')
    logging.log(10, 'log')
Simplest usage:
    import logging

    logging.warning("user [alex] attempted wrong password more than 3 times")
    logging.critical("server is down")

    # Output
    WARNING:root:user [alex] attempted wrong password more than 3 times
    CRITICAL:root:server is down
Log levels:
    CRITICAL = 50
    FATAL = CRITICAL
    ERROR = 40
    WARNING = 30
    WARN = WARNING
    INFO = 20
    DEBUG = 10
    NOTSET = 0
Note: a message is recorded only when its level is greater than or equal to the configured log level.
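To make the threshold rule concrete, here is a minimal sketch that logs to an in-memory stream so the result can be inspected. The logger name level-demo and the message texts are arbitrary choices for the example.

```python
import io
import logging

# Capture output in memory instead of a file
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

demo_logger = logging.getLogger("level-demo")  # hypothetical logger name
demo_logger.setLevel(logging.WARNING)          # threshold is 30
demo_logger.propagate = False                  # keep the root logger out of it
demo_logger.addHandler(handler)

demo_logger.info("dropped")       # 20 < 30, ignored
demo_logger.warning("kept")       # 30 >= 30, recorded
demo_logger.error("also kept")    # 40 >= 30, recorded

recorded = stream.getvalue().splitlines()
print(recorded)
```

Note that a WARNING message passes a WARNING threshold: the comparison is "greater than or equal to", not strictly "greater than".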
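Log record format:
Format | Description
%(name)s | Name of the logger
%(levelno)s | Log level as a number
%(levelname)s | Log level as text
%(pathname)s | Full path of the module that issued the logging call (may be unavailable)
%(filename)s | File name of the module that issued the logging call
%(module)s | Name of the module that issued the logging call
%(funcName)s | Name of the function that issued the logging call
%(lineno)d | Source line number of the logging call
%(created)f | Time the record was created, as a UNIX timestamp (floating-point seconds)
%(relativeCreated)d | Time the record was created, in milliseconds since the logging module was loaded
%(asctime)s | Time the record was created, as a string. The default format is "2003-07-08 16:49:45,896"; the digits after the comma are milliseconds
%(thread)d | Thread ID (may be unavailable)
%(threadName)s | Thread name (may be unavailable)
%(process)d | Process ID (may be unavailable)
%(message)s | The message supplied by the user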
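A quick way to see what these fields expand to is to format one record into an in-memory stream. In this sketch the logger name fmt-demo, the function checkout and the message are invented for illustration.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
# Use a handful of the fields from the table, separated by "|"
handler.setFormatter(logging.Formatter(
    "%(name)s|%(levelno)s|%(levelname)s|%(funcName)s|%(message)s"))

fmt_logger = logging.getLogger("fmt-demo")  # hypothetical logger name
fmt_logger.setLevel(logging.DEBUG)
fmt_logger.propagate = False
fmt_logger.addHandler(handler)


def checkout():
    # funcName in the record will be the name of this function
    fmt_logger.error("payment failed")


checkout()
line = stream.getvalue().strip()
print(line)
```

Time-based fields such as %(asctime)s work the same way but are left out here because their values change on every run.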
8.2 Multi-file logging
The approach above can only write to a single file. logging.basicConfig cannot set up several log files, so you need to define the file handlers and logger objects yourself.
    import logging

    # Define the files
    file_1_1 = logging.FileHandler('l1_1.log', 'a', encoding='utf-8')
    fmt = logging.Formatter(fmt="%(asctime)s - %(name)s - %(levelname)s -%(module)s:  %(message)s")
    file_1_1.setFormatter(fmt)

    file_1_2 = logging.FileHandler('l1_2.log', 'a', encoding='utf-8')
    fmt = logging.Formatter()
    file_1_2.setFormatter(fmt)

    # Define the logger
    logger1 = logging.Logger('s1', level=logging.ERROR)
    logger1.addHandler(file_1_1)
    logger1.addHandler(file_1_2)

    # Write a log entry
    logger1.critical('1111')

    # Define the file
    file_2_1 = logging.FileHandler('l2_1.log', 'a')
    fmt = logging.Formatter()
    file_2_1.setFormatter(fmt)

    # Define the logger
    logger2 = logging.Logger('s2', level=logging.INFO)
    logger2.addHandler(file_2_1)
With the two logger objects created above, messages written through logger1 go to both l1_1.log and l1_2.log, while messages written through logger2 go to l2_1.log.
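Printing log output to the screen and to a file at the same time requires a little more background.
Logging with Python's logging module involves four main classes. The summary in the official documentation describes them best:
logger exposes the interface that application code uses directly;
handler sends the log records (created by loggers) to the appropriate destination;
filter provides a finer-grained facility for deciding which log records to output;
formatter specifies the final output format of log records.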
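Of the four classes, filter is the only one that the examples in this section do not exercise. The sketch below shows one possible use: a custom logging.Filter subclass that drops records containing a keyword. The class name KeywordFilter, the logger name and the messages are hypothetical.

```python
import io
import logging


class KeywordFilter(logging.Filter):
    """Hypothetical filter: reject records whose message contains a keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword

    def filter(self, record):
        # Returning False drops the record; True lets it through
        return self.keyword not in record.getMessage()


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
handler.addFilter(KeywordFilter("password"))

filter_logger = logging.getLogger("filter-demo")
filter_logger.setLevel(logging.INFO)
filter_logger.propagate = False
filter_logger.addHandler(handler)

filter_logger.info("user alex logged in")
filter_logger.info("password=123 received")   # rejected by the filter

lines = stream.getvalue().splitlines()
print(lines)
```

The filter is attached to the handler here, so it only affects that destination; attaching it to the logger with Logger.addFilter would apply it before any handler sees the record.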
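logger
Every program obtains a Logger before it outputs anything. A Logger usually corresponds to a module of the program. For example, the GUI module of a chat tool could obtain its Logger like this:
LOG=logging.getLogger("chat.gui")
and the core module like this:
LOG=logging.getLogger("chat.kernel")
Logger.setLevel(lel): sets the lowest level to handle; messages below lel are ignored. debug is the lowest built-in level and critical is the highest
Logger.addFilter(filt), Logger.removeFilter(filt): add or remove the given filter
Logger.addHandler(hdlr), Logger.removeHandler(hdlr): add or remove the given handler
Logger.debug(), Logger.info(), Logger.warning(), Logger.error(), Logger.critical(): write a message at the corresponding level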
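handler
A handler object is responsible for sending the information to a destination. Python's logging system offers several Handlers: some write to the console, some write to a file, and some send the information over the network. If these are not enough, you can write your own Handler. Multiple handlers can be attached with the addHandler() method
Handler.setLevel(lel): sets the level to handle; messages below lel are ignored
Handler.setFormatter(): chooses a format for this handler
Handler.addFilter(filt), Handler.removeFilter(filt): add or remove a filter object
Each Logger can have several Handlers attached. The most commonly used Handlers are described next: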
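1) logging.StreamHandler
This Handler writes to any file-like object, such as sys.stdout or sys.stderr. Its constructor is:
StreamHandler([strm])
where strm is a file object. The default is sys.stderr
2) logging.FileHandler
Similar to StreamHandler, but it writes log output to a file and opens the file for you. Its constructor is:
FileHandler(filename[,mode])
filename is the file name and is required.
mode is the mode in which the file is opened; see the built-in open() function. The default is 'a', which appends to the end of the file.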
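3) logging.handlers.RotatingFileHandler
This Handler is like FileHandler above, but it also manages the file size. When the file reaches a certain size, it renames the current log file and creates a new file with the original name to continue logging. For example, if the log file is chat.log, then once chat.log reaches the size limit RotatingFileHandler renames it to chat.log.1. If chat.log.1 already exists, it is first renamed to chat.log.2, and so on. Finally chat.log is created again and logging continues. Its constructor is:
RotatingFileHandler( filename[, mode[, maxBytes[, backupCount]]])
The filename and mode parameters are the same as for FileHandler.
maxBytes sets the maximum size of the log file. If maxBytes is 0, the log file may grow without limit and the renaming described above never happens.
backupCount sets how many backup files to keep. For example, with a value of 2, when the renaming described above takes place the existing chat.log.2 is not renamed but deleted.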
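The renaming scheme is easy to observe with a very small maxBytes. The following sketch writes twenty short messages into a temporary directory and then lists the files that remain. The sizes, names and message text are arbitrary choices for the demonstration.

```python
import logging
import os
import tempfile
from logging import handlers

# Work in a throwaway directory so nothing is left in the current folder
log_dir = tempfile.mkdtemp()
log_file = os.path.join(log_dir, "chat.log")

# Tiny limit so rotation happens after a couple of lines; keep 2 backups
fh = handlers.RotatingFileHandler(log_file, maxBytes=50, backupCount=2)
fh.setFormatter(logging.Formatter("%(message)s"))

rot_logger = logging.getLogger("rotate-demo")
rot_logger.setLevel(logging.INFO)
rot_logger.propagate = False
rot_logger.addHandler(fh)

for i in range(20):
    rot_logger.info("message number %02d", i)

fh.close()

files = sorted(os.listdir(log_dir))
print(files)

with open(log_file) as f:
    newest = f.read()
```

Although many rotations occur, only chat.log and the two backups survive, and the most recent messages are always in chat.log itself.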
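4) logging.handlers.TimedRotatingFileHandler
This Handler is similar to RotatingFileHandler, but instead of checking the file size it starts a new log file at fixed time intervals. Renaming works much like RotatingFileHandler, except that the suffix added to the old file is a timestamp rather than a number. Its constructor is:
TimedRotatingFileHandler( filename [,when [,interval [,backupCount]]])
The filename and backupCount parameters have the same meaning as in RotatingFileHandler.
interval is the time interval.
when is a string giving the unit of the interval; it is not case sensitive. It accepts the following values:
S seconds
M minutes
H hours
D days
W weekly, written as W0 through W6, where W0 is Monday
midnight every day at midnight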
    import logging

    # create logger
    logger = logging.getLogger('TEST-LOG')
    logger.setLevel(logging.DEBUG)

    # create console handler and set level to debug
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)

    # create file handler and set level to warning
    fh = logging.FileHandler("access.log")
    fh.setLevel(logging.WARNING)

    # create formatter
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

    # add formatter to ch and fh
    ch.setFormatter(formatter)
    fh.setFormatter(formatter)

    # add ch and fh to logger
    logger.addHandler(ch)
    logger.addHandler(fh)

    # 'application' code
    logger.debug('debug message')
    logger.info('info message')
    logger.warning('warn message')
    logger.error('error message')
    logger.critical('critical message')
Automatic log rotation example
    import logging
    from logging import handlers

    logger = logging.getLogger(__name__)

    log_file = "timelog.log"
    # fh = handlers.RotatingFileHandler(filename=log_file, maxBytes=10, backupCount=3)
    fh = handlers.TimedRotatingFileHandler(filename=log_file, when="S", interval=5, backupCount=3)

    formatter = logging.Formatter('%(asctime)s %(module)s:%(lineno)d %(message)s')
    fh.setFormatter(formatter)

    logger.addHandler(fh)

    logger.warning("test1")
    logger.warning("test12")
    logger.warning("test13")
    logger.warning("test14")
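9. System commands: the subprocess module
Modules and functions that can run shell commands include:
    # Python 2 only; the commands module was removed in Python 3
    import commands

    result = commands.getoutput('cmd')
    result = commands.getstatus('cmd')
    result = commands.getstatusoutput('cmd')
Everything these modules and functions do is also available in the subprocess module, which offers richer functionality as well.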
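call
Runs a command and returns its exit status
    import subprocess

    ret = subprocess.call(["ls", "-l"], shell=False)
    ret = subprocess.call("ls -l", shell=True)
check_call
Runs a command; returns 0 if the exit status is 0, otherwise raises an exception
    subprocess.check_call(["ls", "-l"])
    subprocess.check_call("exit 1", shell=True)   # raises CalledProcessError
check_output
Runs a command; returns its output if the exit status is 0, otherwise raises an exception
    subprocess.check_output(["echo", "Hello World!"])
    subprocess.check_output("exit 1", shell=True)   # raises CalledProcessError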
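The exception raised on a non-zero exit status is subprocess.CalledProcessError, and its returncode attribute holds the status. Below is a small sketch of handling it. The helper name run_python is made up, and the child process is the current Python interpreter (sys.executable) so that the example does not depend on ls or echo being available.

```python
import subprocess
import sys


def run_python(code):
    """Run a snippet in a child Python interpreter and return (ok, result).

    On success result is the captured output; on failure it is the exit status.
    """
    try:
        out = subprocess.check_output([sys.executable, "-c", code],
                                      stderr=subprocess.STDOUT,
                                      universal_newlines=True)
        return True, out.strip()
    except subprocess.CalledProcessError as exc:
        # exc.returncode holds the non-zero exit status of the child
        return False, exc.returncode


print(run_python("print('Hello World!')"))
print(run_python("import sys; sys.exit(3)"))
```

Passing the command as a list with the default shell=False avoids shell quoting issues, which is generally the safer choice when any part of the command comes from user input.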
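subprocess.Popen(...)
Used to run more complex system commands
Parameters (those used in the examples below):
args: the command, as a string or a sequence of arguments
shell: if True, run the command through the shell
cwd: working directory for the child process
stdin, stdout, stderr: the child's standard input, output and error handles, for example subprocess.PIPE
universal_newlines: if True, open the pipes in text mode
    import subprocess

    ret1 = subprocess.Popen(["mkdir", "t1"])
    ret2 = subprocess.Popen("mkdir t2", shell=True)
Commands typed in a terminal fall into two kinds: those that run and finish on their own (such as mkdir), and those that enter an interactive environment and wait for further input (such as python).
    # 1. A command that runs on its own, in a given working directory
    import subprocess

    obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)

    # 2. An interactive command, driven through pipes
    import subprocess

    obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    obj.stdin.write("print(1)\n")
    obj.stdin.write("print(2)")
    obj.stdin.close()

    cmd_out = obj.stdout.read()
    obj.stdout.close()
    cmd_error = obj.stderr.read()
    obj.stderr.close()

    print(cmd_out)
    print(cmd_error)

    # 3. The same, collecting output and errors with communicate()
    import subprocess

    obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    obj.stdin.write("print(1)\n")
    obj.stdin.write("print(2)")

    out_error_list = obj.communicate()
    print(out_error_list)

    # 4. Passing the input directly to communicate()
    import subprocess

    obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    out_error_list = obj.communicate('print("hello")')
    print(out_error_list)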
10. shutil
A high-level module for handling files, directories and archives
shutil.copyfileobj(fsrc, fdst[, length])
Copies the contents of one file object into another
    import shutil

    with open('old.xml', 'r') as fsrc, open('new.xml', 'w') as fdst:
        shutil.copyfileobj(fsrc, fdst)
shutil.copyfile(src, dst)
Copies a file
    shutil.copyfile('f1.log', 'f2.log')
shutil.copymode(src, dst)
Copies the permission bits only. Content, group and owner are left unchanged
    shutil.copymode('f1.log', 'f2.log')
shutil.copystat(src, dst)
Copies only the status information: mode bits, atime, mtime, flags
    shutil.copystat('f1.log', 'f2.log')
shutil.copy(src, dst)
Copies the file and its permissions
    import shutil

    shutil.copy('f1.log', 'f2.log')
shutil.copy2(src, dst)
Copies the file and its status information
    import shutil

    shutil.copy2('f1.log', 'f2.log')
shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
Recursively copies a directory
    import shutil

    shutil.copytree('folder1', 'folder2', ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

    import shutil

    shutil.copytree('f1', 'f2', symlinks=True, ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))
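To see what ignore_patterns actually leaves out, the sketch below builds a small directory tree in a temporary location, copies it, and lists what arrived. All file and folder names are invented for the example.

```python
import os
import shutil
import tempfile

# Build a throwaway source tree
base = tempfile.mkdtemp()
src = os.path.join(base, "folder1")
dst = os.path.join(base, "folder2")
os.makedirs(os.path.join(src, "pkg"))
for name in ("main.py", "main.pyc", "tmp_cache.txt", os.path.join("pkg", "mod.py")):
    with open(os.path.join(src, name), "w") as f:
        f.write("x")

# Copy everything except compiled files and anything starting with "tmp"
shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.pyc", "tmp*"))

# Collect the copied files as paths relative to the destination
copied = sorted(
    os.path.relpath(os.path.join(root, name), dst)
    for root, _dirs, names in os.walk(dst)
    for name in names
)
print(copied)

shutil.rmtree(base)  # clean up
```

The patterns are matched against file and directory names at every level of the tree, not against full paths.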
shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively deletes a directory tree
    import shutil

    shutil.rmtree('folder1')
shutil.move(src, dst)
Recursively moves a file or directory. It is similar to the mv command; in the simple case it is just a rename.
    import shutil

    shutil.move('folder1', 'folder3')
shutil.make_archive(base_name, format,...)
Creates an archive (for example zip or tar) and returns its file path
    # Pack the files under /Users/wupeiqi/Downloads/test and place the archive in the program's current directory
    import shutil
    ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')

    # Pack the files under /Users/wupeiqi/Downloads/test and place the archive in /Users/wupeiqi/
    import shutil
    ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
shutil handles archives by calling the zipfile and tarfile modules. In detail:
    import zipfile

    # Compress
    z = zipfile.ZipFile('laxi.zip', 'w')
    z.write('a.log')
    z.write('data.data')
    z.close()

    # Extract
    z = zipfile.ZipFile('laxi.zip', 'r')
    z.extractall()
    z.close()

    import tarfile

    # Compress
    tar = tarfile.open('your.tar', 'w')
    tar.add('/Users/wupeiqi/PycharmProjects/bbs2.log', arcname='bbs2.log')
    tar.add('/Users/wupeiqi/PycharmProjects/cmdb.log', arcname='cmdb.log')
    tar.close()

    # Extract
    tar = tarfile.open('your.tar', 'r')
    tar.extractall()  # a target directory can be given
    tar.close()
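shutil can also unpack what it packs: shutil.unpack_archive is the counterpart of make_archive. A minimal round trip, assuming a temporary working directory and made-up file names, might look like this:

```python
import os
import shutil
import tempfile

# Prepare a directory with one file to archive
work = tempfile.mkdtemp()
data_dir = os.path.join(work, "test")
os.makedirs(data_dir)
with open(os.path.join(data_dir, "a.log"), "w") as f:
    f.write("hello")

# Pack the directory; the extension is added by make_archive
archive_path = shutil.make_archive(os.path.join(work, "backup"), "gztar",
                                   root_dir=data_dir)
print(archive_path)

# Unpack into a fresh directory; the format is guessed from the extension
out_dir = os.path.join(work, "restored")
shutil.unpack_archive(archive_path, out_dir)
restored = sorted(os.listdir(out_dir))
print(restored)
```

As with tarfile.extractall(), only unpack archives from sources you trust, since archive members can carry unexpected paths.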
Python can also handle the YAML document format easily, although a module has to be installed first. Reference documentation: http://pyyaml.org/wiki/PyYAMLDocumentation
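11. paramiko
paramiko is a module for remote control. With it you can run commands and transfer files on a remote server. It is worth mentioning that fabric and ansible implement their remote management on top of paramiko.
1. Download and install
paramiko depends on pycrypto internally, so download and install pycrypto first
pip3 install pycrypto
pip3 install paramiko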
    #!/usr/bin/env python
    # coding: utf-8

    # Run a command, authenticating with a user name and password
    import paramiko

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect('192.168.1.108', 22, 'alex', '123')
    stdin, stdout, stderr = ssh.exec_command('df')
    print(stdout.read())
    ssh.close()

    # Run a command, authenticating with a private key
    import paramiko

    private_key_path = '/home/auto/.ssh/id_rsa'
    key = paramiko.RSAKey.from_private_key_file(private_key_path)

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # Placeholders: replace with your own host name, port and user name
    host, port, user = 'hostname', 22, 'username'
    ssh.connect(host, port, user, pkey=key)

    stdin, stdout, stderr = ssh.exec_command('df')
    print(stdout.read())
    ssh.close()

    # Upload a file, authenticating with a user name and password
    import paramiko

    t = paramiko.Transport(('182.92.219.86', 22))
    t.connect(username='wupeiqi', password='123')
    sftp = paramiko.SFTPClient.from_transport(t)
    sftp.put('/tmp/test.py', '/tmp/test.py')
    t.close()

    # Download a file, authenticating with a user name and password
    import paramiko

    t = paramiko.Transport(('182.92.219.86', 22))
    t.connect(username='wupeiqi', password='123')
    sftp = paramiko.SFTPClient.from_transport(t)
    sftp.get('/tmp/test.py', '/tmp/test2.py')
    t.close()

    # Upload a file, authenticating with a private key
    import paramiko

    private_key_path = '/home/auto/.ssh/id_rsa'
    key = paramiko.RSAKey.from_private_key_file(private_key_path)

    t = paramiko.Transport(('182.92.219.86', 22))
    t.connect(username='wupeiqi', pkey=key)

    sftp = paramiko.SFTPClient.from_transport(t)
    sftp.put('/tmp/test3.py', '/tmp/test3.py')

    t.close()

    # Download a file, authenticating with a private key
    import paramiko

    private_key_path = '/home/auto/.ssh/id_rsa'
    key = paramiko.RSAKey.from_private_key_file(private_key_path)

    t = paramiko.Transport(('182.92.219.86', 22))
    t.connect(username='wupeiqi', pkey=key)

    sftp = paramiko.SFTPClient.from_transport(t)
    sftp.get('/tmp/test3.py', '/tmp/test4.py')

    t.close()
12. time module
Time-related operations. A time can be represented in three ways: as a timestamp (seconds since the epoch, as returned by time.time()), as a formatted string (as returned by time.strftime()), and as a struct_time (as returned by time.localtime()).
    import time

    print(time.time())
    print(time.mktime(time.localtime()))

    print(time.gmtime())      # accepts an optional timestamp argument
    print(time.localtime())   # accepts an optional timestamp argument
    print(time.strptime('2014-11-11', '%Y-%m-%d'))

    print(time.strftime('%Y-%m-%d'))                     # defaults to the current time
    print(time.strftime('%Y-%m-%d', time.localtime()))   # explicit struct_time
    print(time.asctime())
    print(time.asctime(time.localtime()))
    print(time.ctime(time.time()))

    import datetime
    '''
    datetime.date: class representing a date. Common attributes: year, month, day
    datetime.time: class representing a time of day. Common attributes: hour, minute, second, microsecond
    datetime.datetime: represents a date and time
    datetime.timedelta: represents a time interval, that is, the length between two points in time
    timedelta([days[, seconds[, microseconds[, milliseconds[, minutes[, hours[, weeks]]]]]]])
    strftime("%Y-%m-%d")
    '''
    print(datetime.datetime.now())
    print(datetime.datetime.now() - datetime.timedelta(days=5))
Format directives:
    %Y  Year with century as a decimal number.
    %m  Month as a decimal number [01,12].
    %d  Day of the month as a decimal number [01,31].
    %H  Hour (24-hour clock) as a decimal number [00,23].
    %M  Minute as a decimal number [00,59].
    %S  Second as a decimal number [00,61].
    %z  Time zone offset from UTC.
    %a  Locale's abbreviated weekday name.
    %A  Locale's full weekday name.
    %b  Locale's abbreviated month name.
    %B  Locale's full month name.
    %c  Locale's appropriate date and time representation.
    %I  Hour (12-hour clock) as a decimal number [01,12].
    %p  Locale's equivalent of either AM or PM.
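The functions above convert between a timestamp, a struct_time and a formatted string. The sketch below walks one fixed instant through all three and back. It works in UTC so the result does not depend on the local time zone, and it uses calendar.timegm (the UTC counterpart of time.mktime, not used elsewhere in this article) for the last step.

```python
import calendar
import time

stamp = 1415664000                                   # 2014-11-11 00:00:00 UTC

st = time.gmtime(stamp)                              # timestamp -> struct_time
text = time.strftime('%Y-%m-%d %H:%M:%S', st)        # struct_time -> string
parsed = time.strptime(text, '%Y-%m-%d %H:%M:%S')    # string -> struct_time
back = calendar.timegm(parsed)                       # struct_time (UTC) -> timestamp

print(st.tm_year, st.tm_mon, st.tm_mday)
print(text)
print(back == stamp)
```

With time.localtime() and time.mktime() the same round trip works in local time instead.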
1. Use an HTTP request and XML to fetch TV program listings
API: http://www.webxml.com.cn/webservices/ChinaTVprogramWebService.asmx
2. Use an HTTP request and JSON to fetch weather conditions
API: http://wthrcdn.etouch.cn/weather_mini?city=北京