Python: the re module

Date: 2019-11-08

Tags: python, module. Category: Python


The re module

1. Common matching methods

(1) findall

Return value: a list containing every match of the regular expression.

import re
ret = re.findall('a','eva egon yuan')       #['a', 'a']
print(ret)
ret = re.findall('[a-z]+','eva egon yuan')      #['eva', 'egon', 'yuan']
print(ret)


(2) search

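ret = re.search(r'\d(\w)+', 'bdsjdbc14564fvfv')

ret = re.search(r'\d(?P<name>\w)+', 'bdsjdbc14564fvfv')  # names the group "name"

search scans the whole string and returns the first match it finds; if nothing matches, it returns None.

If a match is returned, ret.group() gives the matched text.

To read the contents of a group:

ret.group(1)  # by position, counting from 1

ret.group('name')  # by group name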
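
The group access described above can be sketched as follows. This is a minimal illustration reusing the sample string from the text; the variable names are hypothetical. It also shows a detail that is easy to miss: a repeated group such as (\w)+ keeps only its last repetition, so moving the quantifier inside the group is one way to capture the whole run.

```python
import re

text = 'bdsjdbc14564fvfv'

# A repeated group such as (\w)+ keeps only its last repetition.
m = re.search(r'\d(\w)+', text)
if m:
    whole = m.group()        # entire match: '14564fvfv'
    last_char = m.group(1)   # last character captured by the group: 'v'
    print(whole, last_char)

# With the quantifier inside the group, the group captures the whole run.
named = re.search(r'\d(?P<name>\w+)', text)
if named:
    run = named.group('name')   # '4564fvfv'
    same = named.group(1)       # named groups are still numbered: '4564fvfv'
    print(run, same)
```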

import re

ret = re.search('a','eva egon yuan')
print(ret)  #<re.Match object; span=(2, 3), match='a'> (shown as _sre.SRE_Match before Python 3.7)
print(ret.group())  #a


import re

# Error case: when nothing matches, search returns None, and calling group() on None fails
# ret = re.search('m','eva egon yuan')
# print(ret)  #None
# print(ret.group())  #AttributeError: 'NoneType' object has no attribute 'group'

# Checking the result first avoids the error
# (1) A match is found
ret = re.search('a','eva egon yuan')
if ret:
    print(ret.group())  #a
# (2) No match is found
ret = re.search('m','eva egon yuan')
if ret:
    print(ret.group())  # prints nothing and raises no error


(3) match

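match tries the pattern only at the start of the string: it returns a match object if the pattern matches there, and None otherwise.

On success, ret.group() gives the matched text.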

import re

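# If the pattern matches at the start of the string, match returns a match object; call group() to read it
# If it does not match, match returns None, and calling group() on None raises an error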
ret = re.match('[a-z]+','eva egon yuan')
if ret:
    print(ret.group())  #eva
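
To make the difference between match and search concrete, here is a small sketch using the same sample string. The variable names are illustrative only.

```python
import re

text = 'eva egon yuan'

at_start = re.match(r'egon', text)     # 'egon' is not at position 0, so this is None
anywhere = re.search(r'egon', text)    # search looks further along the string

print(at_start)            # None
print(anywhere.group())    # egon
print(anywhere.span())     # (4, 8)

# For this input, search with an explicit ^ anchor behaves like match.
anchored = re.search(r'^egon', text)
print(anchored)            # None
```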


2. Other common methods

(1) Splitting: split

import re

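# [ab] matches either 'a' or 'b', so both act as separators.
# The effect is the same as splitting on 'a' to get '' and 'bcd', then splitting each piece on 'b'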
ret = re.split('[ab]','abcd')
print(ret)  #['', '', 'cd']


(2) Replacing: sub and subn

import re

# Replace digits with 'H'; count=1 limits it to a single replacement
ret = re.sub(r'\d','H','eva3egon4yuan4',count=1)
print(ret)  #evaHegon4yuan4


import re

# subn
# Replace digits with 'H' and return both the new string and the number of replacements
ret = re.subn(r'\d','H','eva3egon4yuan4')
print(ret)  #('evaHegonHyuanH', 3)
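
As a small sketch of how the two calls relate, the snippet below passes count as a keyword argument and unpacks the tuple that subn returns. The variable names are illustrative only.

```python
import re

text = 'eva3egon4yuan4'

# count limits the number of replacements; 0 (the default) replaces every match
first_two = re.sub(r'\d', 'H', text, count=2)
print(first_two)     # evaHegonHyuan4

# subn returns a (new_string, number_of_replacements) tuple
new_text, n = re.subn(r'\d', 'H', text)
print(new_text, n)   # evaHegonHyuanH 3
```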


(3) Returning an iterator: finditer

Use finditer when there are many matches and you do not want them all held in memory at once.

import re

# Returns an iterator that yields the match objects one at a time
ret = re.finditer(r'\d','ds3sy4764384a')
print(ret)  #<callable_iterator object at 0x0000013D05E83198>
# First result
print(next(ret).group())    #3
# Second result
print(next(ret).group())    #4
# All remaining results
print([i.group() for i in ret]) #['7', '6', '4', '3', '8', '4']

# All results: loop over the iterator and print each digit
ret = re.finditer(r'\d','ds3sy4764384a')
for i in ret:
    print(i.group())    # 3     4       7       6       4       3       8       4
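
Because finditer yields match objects rather than plain strings, the position of each match is available as well. A minimal sketch on the same sample string, using \d+ instead of \d so that runs of digits are grouped (the names here are illustrative only):

```python
import re

text = 'ds3sy4764384a'

# Collect the matched text together with its (start, end) span
results = [(m.group(), m.span()) for m in re.finditer(r'\d+', text)]

for value, span in results:
    print(value, span)
# 3 (2, 3)
# 4764384 (5, 12)
```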


(4) Compiling: compile

Useful when a regular expression is long and will be used many times.

import re

# Compile the regular expression into a pattern object
obj = re.compile(r'\d{3}')   # this pattern matches three consecutive digits
ret = obj.search('abc123eeee')
print(ret.group())  #123
ret = obj.search('412e')
print(ret.group())  #412
ret = obj.search('abgfnjgn78967ee')
print(ret.group())  #789
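
A compiled pattern object offers the same operations as the module-level functions, not only search. A short sketch, assuming the same three-digit pattern; the variable names and the first sample string are made up for illustration:

```python
import re

three_digits = re.compile(r'\d{3}')

found = three_digits.findall('abc123eeee456')
replaced = three_digits.sub('#', 'abgfnjgn78967ee')

print(found)                  # ['123', '456']
print(replaced)               # abgfnjgn#67ee
print(three_digits.pattern)   # \d{3}
```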


3. Group priority

(1) Group priority in findall: ?:

import re

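# When the pattern contains a capturing group, findall returns the group's text, not the whole match
# The dots are escaped so that they match a literal '.' instead of any character
ret = re.findall(r'www\.(baidu|oldboy)\.com','www.oldboy.com')
print(ret)  #['oldboy']
# (?:...) makes the group non-capturing, so findall returns the whole match again
ret = re.findall(r'www\.(?:baidu|oldboy)\.com','www.oldboy.com')
print(ret)  #['www.oldboy.com']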
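
The same rule extends to patterns with more than one capturing group, in which case findall returns a list of tuples. A small sketch; the two-site sample string and the variable names are made up for illustration:

```python
import re

text = 'www.baidu.com www.oldboy.com'

# No capturing group: whole matches
whole = re.findall(r'www\.(?:baidu|oldboy)\.com', text)
# One capturing group: the text of that group
one = re.findall(r'www\.(baidu|oldboy)\.com', text)
# Two capturing groups: one tuple per match
two = re.findall(r'(www)\.(baidu|oldboy)\.com', text)

print(whole)  # ['www.baidu.com', 'www.oldboy.com']
print(one)    # ['baidu', 'oldboy']
print(two)    # [('www', 'baidu'), ('www', 'oldboy')]
```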


(2) Group priority in split: ()

import re

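# Without a group, the separators are dropped
ret = re.split(r"\d+","eva3egon4yuan")
print(ret)  #['eva', 'egon', 'yuan']
# Wrapping the pattern in () makes split keep the matched separators in the result
ret = re.split(r"(\d+)","eva3egon4yuan")
print(ret)  #['eva', '3', 'egon', '4', 'yuan']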
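
When parentheses are needed only for grouping (for example, to repeat a multi-character separator), a non-capturing group keeps the separators out of the result. A minimal sketch with a made-up sample string:

```python
import re

text = 'xababy'

# Capturing group: the last captured 'ab' is kept in the result
kept = re.split(r'(ab)+', text)
# Non-capturing group: the separator is dropped
dropped = re.split(r'(?:ab)+', text)

print(kept)     # ['x', 'ab', 'y']
print(dropped)  # ['x', 'y']
```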