Python Regular Expression

1. References

http://docs.python.org/library/re.html#module-re

http://docs.python.org/library/re.html#raw-string-notation

http://docs.python.org/howto/regex.html#regex-howto

"Mastering Regular Expressions"

 

2. An example in BitBake

'(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?'

This regular expression matches strings of the form 'xxx_append', 'xxx_prepend', 'xxx_append_yyy' or 'xxx_prepend_yyy'.

Let's analyze it briefly.

REGEXP = '(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?' has the shape ABC?, so it matches either AB or ABC, where:

A = '(?P<base>.*?)' matches the same set of strings as '.*?' (any run of characters, taken as short as possible), but the matched substring can be retrieved by the group name base.

B = '(?P<keyword>_append|_prepend)' matches '_append' or '_prepend'.

C = '(_(?P<add>.*))' matches a string like '_xxxxxxxxx'; the part after the leading underscore is captured as add.
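As a small illustration of how the three parts map onto named groups, here is a minimal sketch assuming Python 3. The input strings are made up for the example; only the pattern comes from the text above.

```python
import re

# The pattern from the text, written as a raw string.
PATTERN = r'(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?'

# All three parts (A, B and C) are present in this hypothetical input.
m = re.match(PATTERN, 'hello_append_world')
parts = m.groupdict()  # maps each group name to the substring it captured
print(parts)

# When part C is absent, the optional group takes no part in the match
# and 'add' is reported as None.
m2 = re.match(PATTERN, 'hello_prepend')
print(m2.group('base'), m2.group('keyword'), m2.group('add'))
```

groupdict() returns only the named groups, so the unnamed outer group of C does not show up in the result.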

The following test code exercises this regular expression; the corresponding shell output is shown after it.

#!/usr/bin/env python

# test_regexp.py regexp string
import sys
import re

#pattern = sys.argv[1]
#string = sys.argv[2]

def test(pattern, string):
    # re.match anchors at the start of the string and returns None on failure.
    result = re.match(pattern, string)
    if result is None:
        # Double parentheses print a tuple under both Python 2 and Python 3.
        print((pattern, string, None))
    else:
        print((pattern, string, result.group('keyword'), result.group('add'), result.group(0)))

pattern = '(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?'
test(pattern, 'hello_append')
test(pattern, 'hello')
test(pattern, 'hello_prepend')
test(pattern, 'hello_append_add_package')
test(pattern, 'hello_append_world_package')

chenqi@chenqi-OptiPlex-760:~/mypro/python$ ./test_regexp.py
('(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?', 'hello_append', '_append', None, 'hello_append')
('(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?', 'hello', None)
('(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?', 'hello_prepend', '_prepend', None, 'hello_prepend')
('(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?', 'hello_append_add_package', '_append', 'add_package', 'hello_append_add_package')
('(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?', 'hello_append_world_package', '_append', 'world_package', 'hello_append_world_package')
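The test strings above contain the keyword only once, so they do not show why the base group uses the lazy quantifier '.*?'. The sketch below, assuming Python 3, compares it with a greedy variant on an input where the keyword appears twice. The GREEDY pattern and the input string are made up for this comparison and are not taken from BitBake.

```python
import re

LAZY = r'(?P<base>.*?)(?P<keyword>_append|_prepend)(_(?P<add>.*))?'
# Hypothetical variant: identical except that the base group is greedy.
GREEDY = r'(?P<base>.*)(?P<keyword>_append|_prepend)(_(?P<add>.*))?'

s = 'a_append_b_append'
lazy = re.match(LAZY, s)
greedy = re.match(GREEDY, s)

# Lazy: base stops at the first keyword, the rest ends up in 'add'.
print(lazy.group('base'), lazy.group('add'))      # a b_append
# Greedy: base swallows everything up to the last keyword, 'add' is empty.
print(greedy.group('base'), greedy.group('add'))  # a_append_b None
```

With the lazy form, the split happens at the first occurrence of the keyword, which is the behavior the analysis of A describes.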

 

3. A complete list of metacharacters

. ^ $ * + ? { } [ ] \ | ( )
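Because these characters have special meaning, a string that contains them does not necessarily match itself when used as a pattern. A minimal sketch, assuming Python 3 and a made-up input string, of how re.escape() can be used to match such text literally:

```python
import re

text = '1+1=2'

# Used directly as a pattern, '+' is a quantifier ("one or more of '1'"),
# so the literal text does not match itself.
unescaped = re.match('1+1=2', text)

# re.escape() backslash-escapes metacharacters so the string is matched literally.
escaped = re.match(re.escape('1+1=2'), text)

print(unescaped)         # None
print(escaped.group(0))  # 1+1=2
```

Escaping a single metacharacter by hand with a backslash (for example '1\+1=2' in a raw string) has the same effect for this input.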

 

 

4. Predefined special sequences

\d
Matches any decimal digit; this is equivalent to the class [0-9].
\D
Matches any non-digit character; this is equivalent to the class [^0-9].
\s
Matches any whitespace character; this is equivalent to the class [ \t\n\r\f\v].
\S
Matches any non-whitespace character; this is equivalent to the class [^ \t\n\r\f\v].
\w
Matches any alphanumeric character; this is equivalent to the class [a-zA-Z0-9_].
\W
Matches any non-alphanumeric character; this is equivalent to the class [^a-zA-Z0-9_].
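The sequences above can be tried out with a short sketch like the following, assuming Python 3 and a made-up sample string. Note that in Python 3 the listed character-class equivalences hold exactly only for bytes patterns or when the re.ASCII flag is given; for ordinary str patterns, \d, \s and \w also match the corresponding Unicode characters.

```python
import re

# Hypothetical sample text: a tab, a colon, and one non-ASCII letter (é).
sample = 'id_42\tname: caf\u00e9 7'

digits = re.findall(r'\d+', sample)                 # runs of decimal digits
words = re.findall(r'\w+', sample)                  # Unicode-aware word characters
ascii_words = re.findall(r'\w+', sample, re.ASCII)  # restricted to [a-zA-Z0-9_]
fields = re.split(r'\s+', sample)                   # split on runs of whitespace

print(digits)       # ['42', '7']
print(words)        # ['id_42', 'name', 'café', '7']
print(ascii_words)  # ['id_42', 'name', 'caf', '7']
print(fields)       # ['id_42', 'name:', 'café', '7']
```

The uppercase forms (\D, \S, \W) match the complement of each class, so for instance re.findall(r'\D+', sample) returns the runs between the digits.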


 

To be continued ...
