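Python's re module provides an interface to a regular expression engine. It can compile a regular expression string into a pattern object, and that compiled object can then be used for matching. If a regular expression is used for matching frequently, compile it once up front and reuse the object, which is faster.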
>>> import re
>>> p = re.compile("c[a-g]t")
>>> print(p)
<_sre.SRE_Pattern object at 0x11e6420>
>>> p.findall("cat cbtt")
['cat', 'cbt']
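As a minimal sketch of the compile-once idea above, the pattern below is compiled a single time and reused for several inputs. The names `WORD` and `find_all` and the sample strings are made up for illustration. Note that the re module also keeps an internal cache of recently compiled patterns, so the gain from compiling explicitly is usually modest.

```python
import re

# Compile once at import time, then reuse the pattern object for every input.
WORD = re.compile(r"c[a-g]t")  # hypothetical name, same pattern as above


def find_all(lines):
    """Return the list of matches found in each line, in order."""
    return [WORD.findall(line) for line in lines]


results = find_all(["cat cbtt", "dog", "cet cgt czt"])
print(results)
```

"czt" is not reported because "z" falls outside the character class `[a-g]`.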
re.I: case-insensitive matching ......
>>> p = re.compile("c[a-g]t", re.I)
>>> p.findall("cat cBTt")
['cat', 'cBT']
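For comparison, here is a small sketch showing that `re.IGNORECASE` is the long spelling of `re.I`, so both compiled patterns behave the same. The sample text and variable names are assumptions for illustration only.

```python
import re

# re.I and re.IGNORECASE are two names for the same flag.
p_short = re.compile("c[a-g]t", re.I)
p_long = re.compile("c[a-g]t", re.IGNORECASE)

text = "Cat cBTt CZT"  # made-up sample input
short_hits = p_short.findall(text)
long_hits = p_long.findall(text)
print(short_hits)
print(long_hits)
```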
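If the RE matches at the start of the string, match returns a 'matchObject' instance; otherwise it returns None. The usual pattern is to assign the return value of match to a variable and then check whether that variable is None. The returned matchObject also has methods of its own; they are omitted here for now and will be added later.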
>>> p = re.compile("abc")
>>> mo1 = p.match("aaaaabcdrfg")
>>> mo2 = p.match("abcdrfg")
>>> print(mo1)  # the RE does not occur at the start of the string, so the result is None
None
>>> print(mo2)
<_sre.SRE_Match object at 0x12425e0>
>>> mo3 = p.search("aaaaabcdrfg")  # search scans the whole string, not just the start
>>> mo4 = p.search("abcdrfg")
>>> print(mo3)
<_sre.SRE_Match object at 0x1309b28>
>>> print(mo4)
<_sre.SRE_Match object at 0x1309b90>
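The None check described above can be sketched roughly as follows. The helper `describe` is a hypothetical name, and `search`, which looks for the pattern anywhere in the string instead of only at the start, is included for contrast.

```python
import re

p = re.compile("abc")


def describe(text):
    """Report how the pattern relates to text (illustrative helper)."""
    mo = p.match(text)        # only succeeds at position 0
    if mo is not None:
        return "matches at the start"
    mo = p.search(text)       # scans the whole string
    if mo is not None:
        return "found later at index %d" % mo.start()
    return "no match"


print(describe("abcdrfg"))
print(describe("aaaaabcdrfg"))
print(describe("xyz"))
```

Comparing with `is None` is preferable to a bare truth test when the intent is specifically "did the match fail".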
sub(pattern, repl, string, count=0, flags=0)
Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.
>>> re.sub(r"a", "b", "haha")  # count=0 or omitted: replace every occurrence
'hbhb'
>>> re.sub(r"a", "b", "haha", 0)
'hbhb'
>>> re.sub(r"a", "b", "haha", 1)  # count=1: replace only the first occurrence
'hbha'
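The docstring mentions two forms of repl that the session above does not exercise: a string containing backslash escapes and a callable. A minimal sketch of both follows; the function `double` and the sample inputs are made up for illustration.

```python
import re


def double(mo):
    """Callable repl: receives the match object, returns the replacement."""
    return mo.group(0) * 2


doubled = re.sub(r"a", double, "haha")          # each "a" becomes "aa"
swapped = re.sub(r"(h)(a)", r"\2\1", "haha")    # \1 and \2 refer to the groups
first_only = re.sub(r"a", "b", "haha", count=1)  # count passed by keyword

print(doubled)
print(swapped)
print(first_only)
```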
subn(pattern, repl, string, count=0, flags=0)
Return a 2-tuple containing (new_string, number). new_string is the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in the source string by the replacement repl. number is the number of substitutions that were made. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.
>>> re.subn(r"a", "b", "haha")  # same parameters as sub
('hbhb', 2)
>>> re.subn(r"a", "b", "haha", 1)
('hbha', 1)
>>> re.subn(r"a", "b", "haha", 2)
('hbhb', 2)
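One way the returned count might be used is to tell whether a substitution changed anything. The following is only a sketch; the whitespace pattern and the sample string are assumptions, not part of the session above.

```python
import re

raw = "a   b \t c"  # made-up input with runs of whitespace

# Unpack the 2-tuple: the rewritten string and the number of replacements.
cleaned, n = re.subn(r"\s+", " ", raw)

if n:
    print("collapsed %d whitespace runs: %r" % (n, cleaned))
else:
    print("nothing to change")
```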
split(pattern, string, maxsplit=0, flags=0)
Split the source string by the occurrences of the pattern, returning a list containing the resulting substrings. The difference from the string method split is that the pattern here can be a regular expression, which makes it more flexible.
>>> re.split(r"[a-f]", "afternoon")
['', '', 't', 'rnoon']
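To illustrate the flexibility mentioned above, this sketch splits on several different separators at once, which the string method split cannot do in one call, and also shows maxsplit. The sample text is made up for illustration.

```python
import re

text = "one, two;three   four"  # hypothetical input with mixed separators

# A character class lets one call split on commas, semicolons and whitespace.
parts = re.split(r"[,;\s]+", text)

# maxsplit=1 stops after the first split; the rest of the string is kept whole.
head_tail = re.split(r"[,;\s]+", text, maxsplit=1)

# str.split only accepts one literal separator.
plain = text.split(",")

print(parts)
print(head_tail)
print(plain)
```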