Python Full-Stack Regular Expressions (A Complete Guide to the re Module's Regex Interface)

Date: 2019-12-13


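re is a module in Python's standard library.

Overall pattern of the module's regex interface:

re.compile returns a regex object

finditer, fullmatch, match, and search return match objects (finditer yields them through an iterator)

A match object then exposes its results through match.attribute | match.method()

Using the re module: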

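regex = re.compile(pattern, flags=0)

Purpose:

Creates a regular expression object

Parameters:

pattern: the regular expression

flags: flag bits that modify and extend how the expression matches

Return value:

A regular expression object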
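As a minimal sketch of the call above (the pattern and sample text are illustrative assumptions, not from the original), compiling once gives an object whose methods can be reused:

```python
import re

# Compile the pattern once; the resulting object can be reused many times.
regex = re.compile(r"\d+")

# The compiled object offers the same operations as the module-level functions.
numbers = regex.findall("room 12, floor 3")
print(numbers)  # ['12', '3']
```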

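re.findall(pattern, string, flags=0)

Purpose:

Finds every part of the target string that matches the regular expression

Parameters:

pattern: the regular expression

string: the target string

Return value:

A list of the matched content

If the regular expression contains groups, only the content of the groups is returned (as tuples when there is more than one group)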
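The effect of groups on the return value is easy to miss, so here is a small sketch. The sample text and patterns are made up for illustration:

```python
import re

text = "alice=30, bob=25"

# No groups: each list item is the whole match.
whole = re.findall(r"\w+=\d+", text)

# One group: each list item is only that group's text.
names = re.findall(r"(\w+)=\d+", text)

# Several groups: each list item is a tuple with one entry per group.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(whole)  # ['alice=30', 'bob=25']
print(names)  # ['alice', 'bob']
print(pairs)  # [('alice', '30'), ('bob', '25')]
```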

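regex.findall(string, pos, endpos)

Purpose:

Finds every part of the target string that matches the compiled regular expression

Parameters:

string: the target string

pos, endpos: the start and end positions of the slice of the target string to search; the default is the whole string

Return value:

A list of the matched content

If the regular expression contains groups, only the content of the groups is returned (as tuples when there is more than one group)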
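A short sketch of how pos and endpos narrow the search, using an invented sample string:

```python
import re

regex = re.compile(r"ab")
s = "ab-ab-ab"

# Without pos/endpos the whole string is searched.
all_hits = regex.findall(s)

# With pos=1 and endpos=5 only the characters at indexes 1..4 ("b-ab") are searched.
windowed = regex.findall(s, 1, 5)

print(all_hits)  # ['ab', 'ab', 'ab']
print(windowed)  # ['ab']
```

Note that pos and endpos exist only on the compiled object's methods; the module-level re.findall takes flags in that position instead.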

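re.split(pattern, string, maxsplit=0, flags=0)

Purpose:

Splits the target string wherever the regular expression matches

Parameters:

pattern: the regular expression

string: the target string

maxsplit: the maximum number of splits; 0 (the default) means no limit. Pass flags by keyword, because the third positional argument is maxsplit

Return value:

A list of the pieces left after splitting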
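For instance, splitting on any run of commas, semicolons, or whitespace might look like this (the separator pattern and input are assumptions chosen for illustration):

```python
import re

text = "a, b;c  d"

# Split on every run of separators.
parts = re.split(r"[,;\s]+", text)

# Split at most once; the remainder is returned unsplit as the last item.
limited = re.split(r"[,;\s]+", text, maxsplit=1)

print(parts)    # ['a', 'b', 'c', 'd']
print(limited)  # ['a', 'b;c  d']
```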

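re.sub(pattern, repl, string, count=0, flags=0)

Purpose:

Replaces the content matched by the regular expression

Parameters:

pattern: the regular expression

repl: the replacement text

string: the target string

count: the maximum number of replacements; 0 (the default) replaces every match

Return value:

The string after replacement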

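re.subn(pattern, repl, string, count=0, flags=0)

Same purpose and parameters as sub

The return value is a tuple that also reports how many replacements were actually made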
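The difference between sub and subn, and the effect of limiting the number of replacements, can be sketched as follows (the date string is only sample input):

```python
import re

text = "2019-12-13"

# Replace every match.
replaced = re.sub(r"-", "/", text)

# Replace only the first match.
first_only = re.sub(r"-", "/", text, count=1)

# subn returns (new_string, number_of_replacements).
result, n = re.subn(r"-", "/", text)

print(replaced)    # 2019/12/13
print(first_only)  # 2019/12-13
print(result, n)   # 2019/12/13 2
```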

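re.finditer(pattern, string, flags=0)

Purpose:

Finds every match of the regular expression in the target string

Parameters:

pattern: the regular expression

string: the target string

Return value:

An iterator whose items are match objects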
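Unlike findall, finditer gives access to each match's position. A minimal sketch with invented sample text:

```python
import re

text = "cat bat rat"

# Each item produced by finditer is a match object, so positions are available.
spans = []
for m in re.finditer(r"[a-z]at", text):
    spans.append((m.group(), m.span()))

print(spans)  # [('cat', (0, 3)), ('bat', (4, 7)), ('rat', (8, 11))]
```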

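re.fullmatch(pattern, string, flags=0)

Purpose:

Matches only if the regular expression matches the entire string

Parameters:

pattern: the regular expression

string: the target string

Return value:

A match object holding the matched content, or None if there is no match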

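re.match(pattern, string, flags=0)

Purpose:

Matches the regular expression at the beginning of the string

Parameters:

pattern: the regular expression

string: the target string

Return value:

A match object holding the matched content, or None if there is no match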

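re.search(pattern, string, flags=0)

Purpose:

Finds the first place in the string where the regular expression matches

Parameters:

pattern: the regular expression

string: the target string

Return value:

A match object holding the matched content, or None if there is no match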
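To contrast match, search, and fullmatch side by side, here is a small sketch using one made-up string:

```python
import re

text = "abc123"

# match anchors at the start: the string does not begin with a digit.
m1 = re.match(r"\d+", text)

# search scans forward and finds the first match anywhere.
m2 = re.search(r"\d+", text)

# fullmatch succeeds only if the pattern covers the entire string.
m3 = re.fullmatch(r"[a-z]+", text)
m4 = re.fullmatch(r"[a-z]+\d+", text)

print(m1)          # None
print(m2.group())  # 123
print(m3)          # None
print(m4.group())  # abc123
```

Because all three return None on failure, it is worth checking the result before calling group().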

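Attributes of the regex object:

flags: the numeric value of the flags

pattern: the regular expression string

groups: the number of groups

groupindex: a mapping of named groups, where each key is a group name and each value is the group's number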
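These attributes can be inspected directly on a compiled object. The sketch below reuses the article's named-group pattern; adding re.I is an assumption made only to show a flag being recorded:

```python
import re

regex = re.compile(r"(?P<dog>ab)cd(?P<pig>ef)", re.I)

pattern_text = regex.pattern          # the original pattern string
group_count = regex.groups            # number of capturing groups
name_to_index = dict(regex.groupindex)  # group name -> group number
ignores_case = bool(regex.flags & re.I)  # flags is a bit field, so test it with &

print(pattern_text)   # (?P<dog>ab)cd(?P<pig>ef)
print(group_count)    # 2
print(name_to_index)  # {'dog': 1, 'pig': 2}
print(ignores_case)   # True
```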

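Match object attributes:

match.string: the target string

match.pos: the position in the target string where the search started

match.re: the regular expression object that produced this match

match.endpos: the position in the target string where the search ended

match.lastindex: the number of the last group that matched

match.lastgroup: the name of the last group that matched (None if that group has no name)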

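Match object methods:

match.span(): returns a tuple of the start and end positions of the matched content

match.start(): returns the start position of the matched content

match.end(): returns the end position of the matched content

match.groups(): returns a tuple with the content matched by every group

match.groupdict(): returns a dictionary of the named groups (key: group name, value: matched content)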

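group(n=0)

Purpose:

Gets the matched content from the match object

Parameters:

The default, 0, returns the whole match

Passing 1, 2, 3, ... returns the content matched by the n-th group

Return value:

The requested content as a string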

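# regex1.py
import re

pattern = r"(?P<dog>ab)cd(?P<pig>ef)"
# Create the regular expression object
regex = re.compile(pattern)

s = "abcdefghfkfdafsabcdefjsaavjhcabca"
# Get a match object
obj = regex.search(s, 0, 8)  # set the start and end positions of the search


# print(len(s))


# Match object attributes
print(obj.pos)  # start position of the search in the target string
print(obj.endpos)  # end position of the search in the target string
print(obj.re)  # the regex object: re.compile('(?P<dog>ab)cd(?P<pig>ef)')
print(obj.string)  # the target string
print(obj.lastindex)  # number of the last group that matched
print(obj.lastgroup)  # name of the last group that matched


# Match object methods
print(obj.span())  # start and end positions of the matched content
print(obj.start())  # start position of the matched content
print(obj.end())  # end position of the matched content
print(obj.groups())  # content matched by every group
print(obj.groupdict())  # named groups as a dict (key: group name, value: content)


print(obj.group())  # the whole match
print(obj.group(2))  # the second group
# group(n=0)
#     Purpose:
#         Gets the matched content from the match object
#     Parameters:
#         The default, 0, returns the whole match
#         Passing 1, 2, 3, ... returns the content matched by the n-th group
#     Return value:
#         The requested content as a string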

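The flags parameter:

It is accepted by:

re.compile

re.findall

re.search

re.match

re.finditer

re.fullmatch

re.split

re.sub

Purpose:

Flags adjust how a regular expression is matched and extend what it can match.

regex = re.compile(r"Hello", re.I) # ignore letter case

I == IGNORECASE: ignore letter case

S == DOTALL: let the metacharacter . match \n as well

M == MULTILINE: let the metacharacters ^ and $ match at the start and end of every line

X == VERBOSE: allow whitespace and comments inside the pattern

To pass several flags, join them with bitwise OR: |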

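import re

# Ignore letter case
regex = re.compile(r'hello', re.I)

# matches = regex.findall('hello Hello')
# print(matches)

s = '''hello world
nihao Beijing'''
# Let . match the newline character
matches = re.findall(r'.+', s, re.S)
print(matches)
# Let $ match at the end of each line
obj = re.search(r"world$", s, re.M)
print(obj.group())

# Comments inside the pattern, enabled by re.X
pattern = r"""(?P<dog>\w+)  # the dog group
\s+   # one or more whitespace characters
(\W+)  # one or more non-word characters
"""


# Allow comments and ignore letter case at the same time
matched = re.match(pattern, 'hello  %#@', re.X | re.I).group()
print(matched)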