Methods of the re module

Date: 2019-11-09

Tags: module, methods


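match method

match(string[, pos[, endpos]])

string: the text to match against.

pos: the index in the text at which the regular expression starts searching, i.e. the starting index within string.

endpos: the index in the text at which the regular expression stops searching.

If pos is not specified, matching starts at the beginning of the string. If the pattern does not match at that position, None is returned.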

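import re
# match() only succeeds if the pattern matches at the start position.
pattern = re.compile(r'(hello w.*)(hello l.*)')
# 'aa' precedes 'hello', so there is no match at index 0.
result = pattern.match(r'aahello world hello ling')
print(result)
# Here the text starts with 'hello', so both groups are captured.
result2 = pattern.match(r'hello world hello ling')
print(result2.groups())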

Result:

None

('hello world ', 'hello ling')

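Explanation: when pos is not specified, matching starts at the beginning of the string, and None is returned if the pattern does not match there. Note that pattern is a compiled pattern object, while the value returned by a successful match is a Match object. Match objects have their own methods for retrieving results, which are covered in detail in a later section.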
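The pos and endpos parameters described above are not exercised by the example. The following is a minimal sketch of how they change the result; the pattern and the sample text are made up for illustration.

```python
import re

pattern = re.compile(r'hello')
text = 'aahello world'  # illustrative sample text, not from the article

# Without pos, matching starts at index 0, where the text begins with 'aa'.
no_pos = pattern.match(text)

# With pos=2, matching starts at the first 'h'.
with_pos = pattern.match(text, 2)

# With endpos=5, the text is treated as if it ended at index 5 ('aahel').
with_endpos = pattern.match(text, 2, 5)

print(no_pos)            # None
print(with_pos.span())   # (2, 7)
print(with_endpos)       # None
```

The span is reported relative to the full string, not relative to pos.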

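search method

search(string[, pos[, endpos]])

This method looks for a substring of string that the pattern matches. It tries to match the pattern starting at index pos of string; if the pattern can be matched all the way to its end, a Match object is returned. If it cannot, pos is incremented by 1 and the match is attempted again. If there is still no match when pos reaches endpos, None is returned. Here is an example: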

import re
pattern = re.compile(r'(hello w.*)(hello l.*)')
result1 = pattern.search(r'aahello world hello ling')
print(result1.groups())

Result:

('hello world ', 'hello ling')
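The retry behaviour described above can be sketched with match() alone. The helper naive_search below is hypothetical and only illustrates the description; it is not how the re module implements search internally.

```python
import re

def naive_search(pattern, string, pos=0, endpos=None):
    """Hypothetical helper: emulate search() by retrying match() at each index."""
    if endpos is None:
        endpos = len(string)
    while pos <= endpos:
        m = pattern.match(string, pos, endpos)
        if m is not None:
            return m
        pos += 1  # no match at this index, retry one position further
    return None

pattern = re.compile(r'(hello w.*)(hello l.*)')
text = 'aahello world hello ling'

found = naive_search(pattern, text)
builtin = pattern.search(text)
missing = naive_search(pattern, 'no greeting here')

print(found.span(), builtin.span())  # (2, 24) (2, 24)
print(missing)                       # None
```

For this input both approaches find the same match, starting at index 2 where the first 'hello' begins.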

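split method

split(string[, maxsplit])

Splits string at each substring the pattern matches and returns the pieces as a list. maxsplit sets the maximum number of splits; if it is not specified, the string is split at every match.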

import re
p = re.compile(r'\d+')
print(p.split('one1two2three3four4'))

Result:

['one', 'two', 'three', 'four', '']

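Explanation: the regular expression p acts as the separator. The string is split at each match of p and the resulting list is returned. The trailing '' appears because the string ends with a match.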
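The example above does not use maxsplit. As a small sketch reusing the same pattern and text, the following shows a limited split and one possible way to drop the empty trailing element; the list comprehension is just an illustrative choice.

```python
import re

p = re.compile(r'\d+')
text = 'one1two2three3four4'

# At most two splits; the remainder of the string is left untouched.
limited = p.split(text, 2)

# Drop empty strings, such as the trailing '' produced when the text ends with a match.
non_empty = [part for part in p.split(text) if part]

print(limited)    # ['one', 'two', 'three3four4']
print(non_empty)  # ['one', 'two', 'three', 'four']
```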

findall method

findall(string[, pos[, endpos]])

Searches string and returns all matching substrings as a list.

import re
p = re.compile(r'\d+')
# findall is a method of the compiled pattern, so call it on p.
print(p.findall('one1two2three3four4'))

Result:

['1', '2', '3', '4']

Explanation: findall returns all of the matched strings as a list.
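Two details of findall are easy to miss: pos and endpos restrict the searched region, and when the pattern contains more than one capturing group, the list holds tuples of groups instead of whole matches. A short sketch, where the second pattern is made up for illustration:

```python
import re

text = 'one1two2three3four4'
p = re.compile(r'\d+')

# Start searching at index 4, skipping the leading 'one1'.
from_4 = p.findall(text, 4)

# Search only text[4:14], i.e. 'two2three3'.
window = p.findall(text, 4, 14)

# With two capturing groups, each result is a (word, digit) tuple.
pairs = re.compile(r'([a-z]+)(\d)').findall(text)

print(from_4)  # ['2', '3', '4']
print(window)  # ['2', '3']
print(pairs)   # [('one', '1'), ('two', '2'), ('three', '3'), ('four', '4')]
```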

finditer method

finditer(string[, pos[, endpos]])

Searches string and returns an iterator that yields each match result (a Match object) in order.

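import re
p = re.compile(r'\d+')
# finditer returns an iterator rather than a list.
print(type(p.finditer('one1two2three3four4')))
for m in p.finditer('one1two2three3four4'):
    # Each m is a Match object; group() returns the matched text.
    print(type(m))
    print(m.group())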

Result (first line of output under Python 2; Python 3 prints <class 'callable_iterator'> instead): <type 'callable-iterator'>

Explanation:

p.finditer('one1two2three3four4') is an iterator, and each m it yields is a Match object. The group method is also covered in detail in the next section.
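Because every item yielded by finditer is a Match object, the position of each match is available as well as its text. A minimal sketch with the same pattern and text, collecting both into a list:

```python
import re

p = re.compile(r'\d+')
text = 'one1two2three3four4'

# For each match, record the matched text and its (start, end) indices.
positions = [(m.group(), m.span()) for m in p.finditer(text)]

print(positions)
# [('1', (3, 4)), ('2', (7, 8)), ('3', (13, 14)), ('4', (18, 19))]
```

This is information that findall cannot provide, since it returns only strings.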

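sub method

sub(repl, string[, count])

Replaces every matching substring in string with repl and returns the resulting string.

When repl is a string, it can reference groups with \id, \g<id>, or \g<name>. The short form \0 cannot be used to reference group 0; use \g<0> for the entire match.

When repl is a function, it should accept a single argument (a Match object) and return the string to use as the replacement (group references in the returned string are not expanded).

count sets the maximum number of replacements; if it is not specified, every match is replaced.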

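import re
p = re.compile(r'(\w+) (\w+)')
s = 'i say, hello world!'
# String repl: swap the two words of each match.
print(p.sub(r'\2 \1', s))  # say i, world hello!
def func(m):
    # Function repl: receives the Match object and returns the replacement.
    return m.group(1).title() + ' ' + m.group(2).title()
print(p.sub(func, s))  # I Say, Hello World!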
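The remaining forms described for sub, namely named group references, \g<0>, and count, can be sketched as follows. The group names first and second are hypothetical and chosen only for this illustration.

```python
import re

# Same pattern as above, but with named groups (names are illustrative).
p = re.compile(r'(?P<first>\w+) (?P<second>\w+)')
s = 'i say, hello world!'

# Reference groups by name.
swapped = p.sub(r'\g<second> \g<first>', s)

# count=1 replaces only the first match.
once = p.sub(r'\2 \1', s, count=1)

# \g<0> refers to the entire match.
bracketed = p.sub(r'[\g<0>]', s)

print(swapped)    # say i, world hello!
print(once)       # say i, hello world!
print(bracketed)  # [i say], [hello world]!
```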