Linux Regular Expressions

Date: 2019-11-07

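1. What is a regular expression?

Put simply, a regular expression is a set of rules and methods defined for processing large numbers of strings. As an analogy, suppose "@" stands for nishishei and "!" stands for linzhongniao; then "@!" would stand for "nishisheilinzhongniao".
With the help of these specially defined symbols, a system administrator can quickly filter, replace, or output the strings they need. Linux regular expressions are generally applied one line at a time. See man grep for an in-depth reference.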

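2. Why learn regular expressions?

In day-to-day Linux operations work we constantly face large amounts of string-heavy text: configuration files, programs, command output, log files, and so on. We often urgently need to find the specific strings that matter among them, and that is what regular expressions are for. Examples: printing only the IP address from the output of ifconfig, or pulling only the IPs out of an access.log file. Linux regular expressions work on one line at a time.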

3. Basic regular expressions, part one

3.1 Sample data

[root@linzhongniao ~]# cat linzhongniao.log 
I am linzhongniao!
I like linux.
(blank line)
I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
(blank line)
my god,i am not linzhongniao,But to the birds of the forest!!!

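3.2 The "^" caret

"^" matches text at the beginning of a line. In the vi/vim editors, "^" likewise stands for the start of a line.

Example: filter the lines that start with the letter m.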

[root@linzhongniao ~]# grep "^m" linzhongniao.log 
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
my god,i am not linzhongniao,But to the birds of the forest!!!

3.3 The "$" symbol

"$" matches text at the end of a line. In the vi/vim editors, "$" likewise stands for the end of a line.

Example: filter the lines that end with 8.

[root@linzhongniao ~]# grep "8$" linzhongniao.log 
my qq num is 1200098

3.4 The "^$" combination

"^$" matches an empty line.
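
The three anchors above can be tried on a small throwaway file. This is only a sketch: the file name demo.log and its contents are made up for the illustration and are not the linzhongniao.log used in this article.

```shell
# Build a small sample file (hypothetical contents); lines 2 and 4 are empty.
tmp=$(mktemp -d)
printf 'my qq num is 1200098\n\nI like linux.\n\nmy blog\n' > "$tmp/demo.log"

# Count the lines that start with "m".
starts_with_m=$(grep -c '^m' "$tmp/demo.log")
# Print the lines that end with "8".
ends_with_8=$(grep '8$' "$tmp/demo.log")
# Print the line numbers of the empty lines (-n prefixes each match with its number).
empty_lines=$(grep -n '^$' "$tmp/demo.log")
# Count the lines that are NOT empty (-v inverts the match).
non_empty=$(grep -vc '^$' "$tmp/demo.log")

echo "$starts_with_m"
echo "$ends_with_8"
echo "$empty_lines"
echo "$non_empty"
rm -rf "$tmp"
```

Combining "^$" with -v is a common way to drop empty lines from a file before further processing.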

4. Basic regular expressions, part two

4.1 The "." dot

The "." dot stands for exactly one character, and that character can be anything.

Example:

[root@linzhongniao ~]# grep "." linzhongniao.log
I am linzhongnieo!
I like linux.

I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098

my god,i am not linzhongniao,But to the birds of the forest!!!

Match text that starts with linzhongni and ends with o, with any number of characters in between.

[root@linzhongniao ~]# grep "linzhongni.*o" linzhongniao.log 
I am linzhongnieo!
my god,i am not linzhongniao,But to the birds of the FOREST!!!!

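4.2 The "\" backslash

The backslash is the escape character. For example, "\." stands for a literal dot: escaping strips a character of its special meaning and returns it to its plain form. Likewise, \$ stands for a literal $ sign.

To match only lines that end with a dot, the dot has to be escaped: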

[root@linzhongniao ~]# grep "\.$" linzhongniao.log 
I like linux.
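
To see what the escape changes, the sketch below compares the unescaped and escaped dot on a made-up three-line file (dot.log is a hypothetical name). It also shows grep -F, which treats the whole pattern as a fixed string, as an alternative to escaping.

```shell
tmp=$(mktemp -d)
printf 'I like linux.\nI like linux!\nno punctuation\n' > "$tmp/dot.log"

# Unescaped: ".$" means "any one character at the end", so every non-empty line matches.
unescaped=$(grep -c '.$' "$tmp/dot.log")
# Escaped: "\.$" means "a literal dot at the end".
escaped=$(grep '\.$' "$tmp/dot.log")
# -F disables regular expressions entirely, so "." is just a dot.
fixed=$(grep -cF '.' "$tmp/dot.log")

echo "$unescaped"
echo "$escaped"
echo "$fixed"
rm -rf "$tmp"
```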

4.3 The "*" asterisk

Repeats the preceding character zero or more times. For example, o* matches zero or more occurrences of the letter o.

[root@linzhongniao ~]# grep "linzhongniao*" linzhongniao.log
my god,i am not linzhongniao,But to the birds of the forest!!!
[root@linzhongniao ~]# grep "n*" linzhongniao.log
I am linzhongnieo!
I like linux.

I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098

my god,i am not linzhongniao,But to the birds of the forest!!!
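
The second command above prints every line, empty ones included, because "n*" is also satisfied by zero occurrences of n. A minimal sketch of that behaviour, using a made-up file star.log:

```shell
tmp=$(mktemp -d)
printf 'gd\ngod\ngood\n\nbar\n' > "$tmp/star.log"

# "n*" can match the empty string, so all 5 lines are selected.
all=$(grep -c 'n*' "$tmp/star.log")
# "go*d": g, then zero or more o, then d.
zero_or_more=$(grep -c 'go*d' "$tmp/star.log")
# "goo*d": one o is mandatory, the second o is repeated zero or more times.
at_least_one=$(grep 'goo*d' "$tmp/star.log")

echo "$all"
echo "$zero_or_more"
echo "$at_least_one"
rm -rf "$tmp"
```

Writing the character twice, as in "goo*d", is the usual way to ask for "one or more" in basic regular expressions.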

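4.4 The ".*" combination

".*" matches any number of arbitrary characters. By extension, "^.*" matches any run of characters from the start of a line, and ".*$" matches any run of characters up to the end of a line.

Example:

Match goo followed by any number of characters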

[root@linzhongniao ~]# grep "goo.*" linzhongniao.log 
goodi
very good 
goood
good

Match lines that end with the letter d

[root@linzhongniao ~]# grep ".*d$" linzhongniao.log  
gd
goood
glad
good

Match lines that end with the digit 2

[root@linzhongniao ~]# grep ".*2$" linzhongniao.log  
my blog id https://blog.51cto.com/10642812

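Match lines that end with an exclamation mark. Note the backslash: inside double quotes it stops an interactive bash from treating "!" as history expansion.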

[root@linzhongniao ~]#  grep ".*\!$" linzhongniao.log
I am linzhongnieo!
I like badminton ball,billiard ball and chinese chess !
my god,i am not linzhongniao,But to the birds of the FOREST!!!!

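5. Basic regular expressions, part three

5.1 The [abc] bracket expression

Matches any single character from the set inside the brackets, for example [a-zA-Z], [0-9], or [A-Z].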

[root@linzhongniao ~]# grep "[A-Z]" linzhongniao.log
I am linzhongnieo!
I like linux.
I like badminton ball,billiard ball and chinese chess !
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
LINZHONGNIAO
[root@linzhongniao ~]# grep "[a-z]" linzhongniao.log
I am linzhongnieo!
I like linux.
I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
not 1200000098
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
goodi
good
gd
goood
glad
[root@linzhongniao ~]# grep -i "[A-Z]" linzhongniao.log 
I am linzhongnieo!
I like linux.
I like badminton ball,billiard ball and chinese chess !
my blog id https://blog.51cto.com/10642812
my qq num is 1200098
not 1200000098
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
goodi
good
gd
goood
glad
LINZHONGNIAO

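5.2 The [^abc] bracket expression

Inside brackets, the "^" caret negates the set: [^abc] matches any single character that is not one of the characters listed after the ^. This is different from a ^ outside brackets, which anchors the match to the start of a line.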
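
A short sketch of the two meanings of ^, run against a made-up file set.log. Note that a line is selected as soon as it contains one character outside the set; it may still contain a, b, or c elsewhere.

```shell
tmp=$(mktemp -d)
printf 'abc\nabcd\n123\ncab\n' > "$tmp/set.log"

# Lines containing at least one character other than a, b, or c.
outside=$(grep '[^abc]' "$tmp/set.log")
# Lines made up only of a, b, and c (anchored at both ends).
only_abc=$(grep '^[abc]*$' "$tmp/set.log")
# Both carets together: lines whose first character is not "a".
not_a_first=$(grep '^[^a]' "$tmp/set.log")

echo "$outside"
echo "$only_abc"
echo "$not_a_first"
rm -rf "$tmp"
```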

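5.3 The `a\{n,m\}` interval

Repeats the preceding character n to m times (here, the letter a repeated n to m times). With egrep (grep -E) or sed -r the backslashes can be dropped, because those tools understand extended regular expressions.

5.4 The `a\{n,\}` interval

Repeats the preceding character at least n times (here, a repeated at least n times). With egrep (grep -E) or sed -r the backslashes can be dropped.

5.5 The `a\{n\}` interval

Repeats the preceding character exactly n times. With egrep (grep -E) or sed -r the backslashes can be dropped.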

[root@linzhongniao ~]# egrep "0{3}" linzhongniao.log
my qq num is 1200098
not 1200000098

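5.6 The `a\{,m\}` interval

Repeats the preceding character at most m times. With egrep (grep -E) or sed -r the backslashes can be dropped. This form with the lower bound omitted is a GNU extension.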
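
The interval forms can be compared side by side. The sketch below uses a made-up file num.log; the bracketed [^0] on each side is an addition of this example, used to pin the run of zeros to an exact length, since a bare 0{3} also matches inside a longer run.

```shell
tmp=$(mktemp -d)
printf '1200098\n1200000098\n1208\n' > "$tmp/num.log"

# Basic syntax: braces need backslashes.
bre=$(grep '0\{3\}' "$tmp/num.log")
# Extended syntax: same pattern without backslashes.
ere=$(grep -E '0{3}' "$tmp/num.log")
# Exactly three zeros, bounded by non-zero characters on both sides.
exactly_three=$(grep -E '[^0]0{3}[^0]' "$tmp/num.log")
# Between four and six zeros, bounded the same way.
four_to_six=$(grep -E '[^0]0{4,6}[^0]' "$tmp/num.log")

echo "$bre"
echo "$ere"
echo "$exactly_three"
echo "$four_to_six"
rm -rf "$tmp"
```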

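6. Extended regular expressions

grep -E and egrep

[For reference only]

(1) "+": the plus sign repeats the preceding character one or more times (whereas * means zero or more).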

[root@linzhongniao ~]# egrep "g+d" linzhongniao.log 
gd
[root@linzhongniao ~]# egrep "go+d" linzhongniao.log  
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
good

(2) "*": the asterisk means zero or more

[root@linzhongniao ~]# egrep "go*d" linzhongniao.log  
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
good
gd

(3) "?": the question mark repeats the preceding character zero or one time (whereas the "." dot means exactly one character)

[root@linzhongniao ~]# egrep "go?d" linzhongniao.log  
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
gd
[root@linzhongniao ~]# egrep "go.d" linzhongniao.log
good

(4) "|": the pipe

Matches any one of several alternative strings in a single pass.

[root@linzhongniao ~]# egrep "3306|1521" /etc/services 
mysql           3306/tcp        # MySQL
mysql           3306/udp        # MySQL
ncube-lm        1521/tcp        # nCube License Manager
ncube-lm        1521/udp        # nCube License Manager
[root@linzhongniao ~]# egrep "god|good" linzhongniao.log 
my god,i am not linzhongniao,But to the birds of the FOREST!!!!
good

(5) "( )": parentheses group alternatives for filtering and capture text for back-references.

[root@linzhongniao ~]# egrep "g(la|oo)d" linzhongniao.log   
good
glad
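
The extended operators from this section can be checked against one small file. This is a sketch with made-up data (ere.log); grep -E is used in place of egrep, which is the same command under its non-deprecated name.

```shell
tmp=$(mktemp -d)
printf 'gd\ngod\ngood\ngoood\nglad\n' > "$tmp/ere.log"

# "+": one or more o between g and d.
plus=$(grep -E 'go+d' "$tmp/ere.log")
# "?": zero or one o between g and d.
question=$(grep -E 'go?d' "$tmp/ere.log")
# "|": either of two whole alternatives.
alternation=$(grep -E 'god|glad' "$tmp/ere.log")
# "( )": the alternatives apply only to the middle part.
group=$(grep -E 'g(la|oo)d' "$tmp/ere.log")

echo "$plus"
echo "$question"
echo "$alternation"
echo "$group"
rm -rf "$tmp"
```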

7. Metacharacters

The metacharacters in this section are Perl-style additions to regular expressions. Only some text-processing tools support them, not all.

\b  word boundary

Example:

[root@linzhongniao ~]# grep "good" linzhongniao.log 
goodi
good

To keep good but filter out goodi, use \b to mark the word boundary, or use grep -w to search by whole word.

[root@linzhongniao ~]# grep "good\b" linzhongniao.log 
good
[root@linzhongniao ~]# grep -w "good" linzhongniao.log 
good
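
A minimal sketch of the difference, assuming GNU grep (where \b is available as an extension) and a made-up file words.log:

```shell
tmp=$(mktemp -d)
printf 'goodi\ngood\nvery good day\ngoodness\n' > "$tmp/words.log"

# Plain substring match: all four lines contain "good".
plain=$(grep -c 'good' "$tmp/words.log")
# \b requires a word boundary right after "good".
bounded=$(grep 'good\b' "$tmp/words.log")
# -w requires "good" to be a whole word.
whole_word=$(grep -w 'good' "$tmp/words.log")

echo "$plain"
echo "$bounded"
echo "$whole_word"
rm -rf "$tmp"
```

On this data the last two commands select the same lines. In general, 'good\b' only checks the right-hand side, so it would also accept a word such as "nogood", while -w checks both sides.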

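8. Regular expression summary

9. Hands-on practice: Linux regular expressions with the "three musketeers" (grep, sed, awk)

9.1 Extract the IP address from the ifconfig output

Solution:

sed -n 's#REGEX##gp' file     (REGEX marks the position where regular expressions are supported)

Method 1: first pick out the line, then match the content in front of the target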

[root@linzhongniao ~]# ifconfig eth0|sed -n '2'p|sed 's#^.*dr:##g'
 192.168.0.117  Bcast:192.168.0.255  Mask:255.255.255.0

Then match the content after the target

[root@linzhongniao ~]# ifconfig eth0|sed -n '2p'|sed 's#^.*dr:##g'|sed 's#  B.*$##g'  # <== there are two spaces before the B in "#  B.*$"; copying and pasting is safest
 192.168.0.117

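Tips:

To remove what comes before the target (the string to extract, such as the IP above), anchor the pattern with ^ and let ^.* run up to a literal string; for example, "^.*addr:" matches everything from the start of the line through addr:. To remove what comes after the target, start the pattern with literal characters and end it with .*$; for example, "  B.*$" matches everything from the two spaces and the capital B through the end of the line. Replacing both matches with nothing leaves only the wanted content.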
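
The two-step trimming can be rehearsed without a network interface. The sketch below feeds sed a hard-coded line in the old net-tools ifconfig layout; the address 192.0.2.10 is a documentation address chosen for this example, not output from the author's machine.

```shell
# Sample line in the old ifconfig format (made-up address).
line='          inet addr:192.0.2.10  Bcast:192.0.2.255  Mask:255.255.255.0'

# Step 1: delete everything from the start of the line through "addr:".
step1=$(printf '%s\n' "$line" | sed 's#^.*addr:##')
# Step 2: delete everything from the two spaces before "B" to the end of the line.
ip=$(printf '%s\n' "$step1" | sed 's#  B.*$##')
# Both substitutions can also run in a single sed call via -e.
ip_single_call=$(printf '%s\n' "$line" | sed -e 's#^.*addr:##' -e 's#  B.*$##')

echo "$step1"
echo "$ip"
echo "$ip_single_call"
```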

Method 2:

[root@linzhongniao ~]# ifconfig eth0|sed -n '2s#^.*dr:##gp'|sed  's#  B.*$##g'
 192.168.0.117

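Method 3:

sed back-references:

sed -nr 's#()()#\1\2#gp' file

Options:

-n  suppress the default output

-r  use extended regular expressions, so ( ) need no backslash escaping

Back-reference demo: extract linzhongniao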

[root@linzhongniao ~]# echo "I am linzhongniao linux" >f.txt
[root@linzhongniao ~]# cat f.txt
I am linzhongniao linux
[root@linzhongniao ~]# cat f.txt|sed -nr 's#^.*m (.*) l.*$#\1#gp'
linzhongniao

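When part of the pattern is wrapped in parentheses, the text matched by the first pair can be reused in the replacement as \1, the text matched by the second pair as \2, and so on.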

[root@linzhongniao ~]# ifconfig eth0|sed -nr '2s#^.*dr:(.*)  B.*$#\1#gp'
 192.168.1.106
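
A sketch with two capture groups makes the numbering visible. It uses -E, which current GNU and BSD sed accept as an equivalent spelling of -r; the sample ifconfig line and its address are made up for the example.

```shell
# Two groups: \1 is "linzhongniao", \2 is "linux"; the replacement swaps them.
swapped=$(echo 'I am linzhongniao linux' | sed -nE 's#^.*m (.*) (l.*)$#\2 \1#p')

# One group that captures only digits and dots, so no trailing text has to be trimmed.
line='          inet addr:192.0.2.10  Bcast:192.0.2.255  Mask:255.255.255.0'
ip=$(printf '%s\n' "$line" | sed -nE 's#^.*addr:([0-9.]+).*$#\1#p')

echo "$swapped"
echo "$ip"
```

Capturing with a character class such as [0-9.]+ avoids depending on the exact number of spaces before Bcast.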

Method 4:

[root@linzhongniao ~]# ifconfig eth0|awk -F "[ :]+" 'NR==2{print $4}'
 192.168.0.106

Method 5:

[root@linzhongniao ~]# ifconfig eth0|sed -nr '/inet addr/s#^.*dr:(.*)  B.*$#\1#gp' 
 192.168.0.117

Method 6:

[root@linzhongniao ~]# ifconfig bond0|awk -F "(addr:| Bcast:)" 'NR==2{print $2}'  
 192.168.1.225

Extract the IP shown by ip addr

[root@linzhongniao ~]# ip addr|awk -F "[ /]+" 'NR==8 {print $3}'
 192.168.0.106
[root@linzhongniao ~]# ip addr|sed -nr '8s#^.*inet ##gp'|sed 's#/24 b.*$##g'
 192.168.1.106
[root@linzhongniao ~]# ip addr|sed -nr '8s#^.*inet (.*)/24.*$#\1#gp'
 192.168.1.106
[root@linzhongniao ~]# ip addr|awk -F "(inet |/24 brd)" 'NR==8{print $2}'
 192.168.1.106
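
The awk field separator is itself a regular expression, which is what the -F arguments above rely on. The sketch below applies two separators to a hard-coded line in the ip addr layout (made-up address) and selects the line with the pattern /inet / instead of a fixed line number, an assumption of this example that avoids depending on how many interfaces are listed first.

```shell
line='    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0'

# Separator "[ /]+": runs of spaces or slashes. The leading spaces produce an
# empty first field, so "inet" is $2 and the address is $3.
by_class=$(printf '%s\n' "$line" | awk -F '[ /]+' '/inet /{print $3}')
# Separator "inet |/": split on the literal "inet " or on a slash.
by_alternation=$(printf '%s\n' "$line" | awk -F 'inet |/' '/inet /{print $2}')

echo "$by_class"
echo "$by_alternation"
```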

9.2 Swap the first and last columns of /etc/passwd

[root@linzhongniao ~]# tail /etc/passwd|awk -F "[:]+" '{print $6":"$2":"$3":"$4"::"$5":"$1}'
/bin/bash:x:855:855::/home/stu1:stu1
/bin/bash:x:856:856::/home/stu2:stu2
/bin/bash:x:857:857::/home/stu3:stu3
/bin/bash:x:858:858::/home/stu4:stu4
/bin/bash:x:859:859::/home/stu5:stu5
/bin/bash:x:860:860::/home/stu6:stu6
/bin/bash:x:861:861::/home/stu7:stu7
/bin/bash:x:862:862::/home/stu8:stu8
/bin/bash:x:863:863::/home/stu9:stu9
/bin/bash:x:864:864::/home/stu10:stu10
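
The command above uses the separator "[:]+", which merges the empty fifth column into its neighbours and is why the "::" has to be written back by hand. As an alternative sketch, not the author's method, a single ":" separator keeps every column, so the first and last fields can be swapped directly. The sample line copies the stu1 entry shown above.

```shell
line='stu1:x:855:855::/home/stu1:/bin/bash'

# awk: split on ":" and join with ":" again, then swap field 1 and the last field.
swapped_awk=$(printf '%s\n' "$line" |
  awk -F: 'BEGIN { OFS = ":" } { first = $1; $1 = $NF; $NF = first; print }')
# sed: capture the first column, the middle part, and the last column, then reorder.
swapped_sed=$(printf '%s\n' "$line" | sed -E 's#^([^:]+)(:.*:)([^:]+)$#\3\2\1#')

echo "$swapped_awk"
echo "$swapped_sed"
```

Both forms work unchanged when the comment column is filled in, because no column count is hard-coded.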

9.3 Extract file permissions

Extract 644

[root@linzhongniao ~]# stat /etc/hosts
  File: `/etc/hosts'
  Size: 218 Blocks: 8  IO Block: 4096   regular file
Device: 804h/2052d  Inode: 260125  Links: 2
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2018-07-18 10:09:51.759042316 +0800
Modify: 2018-07-11 16:18:38.646992646 +0800
Change: 2018-07-11 16:18:38.646992646 +0800

Solution:

Method 1:

[root@linzhongniao ~]# stat /etc/hosts|sed -nr 's#^.*0(.*)/-rw.*$#\1#gp' 
644

Method 2:

[root@linzhongniao ~]# stat /etc/hosts|awk -F "[0/]+" 'NR==4 {print $2}'  
644
[root@linzhongniao ~]# stat ett.txt|awk -F "[: (0/]+" 'NR==4{print $2}'  
644

Method 3:

[root@linzhongniao ~]# stat /etc/hosts|awk -F "(0|/)" 'NR==4{print $2}'
644

Method 4:

[root@linzhongniao ~]# stat -c %a /etc/hosts
644
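
When the text has to be parsed rather than requested with stat -c, a tighter pattern is less likely to pick up the wrong line. The sketch below works on a hard-coded copy of the permission line; anchoring on "Access: (" is a choice made for this example so that the timestamp line, which also starts with "Access:", cannot match.

```shell
line='Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)'

# sed: require "Access: (0", then capture exactly three octal digits before the slash.
perm_sed=$(printf '%s\n' "$line" | sed -nE 's#^Access: \(0([0-7]{3})/.*$#\1#p')
# awk: split on "(" or "/", take "0644", and drop its leading character.
perm_awk=$(printf '%s\n' "$line" | awk -F '[(/]' '{print substr($2, 2)}')

echo "$perm_sed"
echo "$perm_awk"
```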

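9.4 Rename files in batch

The current directory contains the files shown below. The task is to rename them with sed, removing _finished from each file name.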

[root@linzhongniao test]# ls
stu_102999_1_finished.jpg stu_102999_2_finished.jpg stu_102999_3_finished.jpg stu_102999_4_finished.jpg stu_102999_5_finished.jpg

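Solution:

In the command below, the & in mv & stands for the whole text matched by the pattern, which here is each file name listed by ls.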

[root@linzhongniao test]# ls|sort|sed -nr 's#(^.*_)(.*)(_.*ed)(.j.*$)#mv & \1\2\4#gp'
mv stu_102999_1_finished.jpg stu_102999_1.jpg
mv stu_102999_2_finished.jpg stu_102999_2.jpg
mv stu_102999_3_finished.jpg stu_102999_3.jpg
mv stu_102999_4_finished.jpg stu_102999_4.jpg
mv stu_102999_5_finished.jpg stu_102999_5.jpg

Then hand the output above to bash for execution.

[root@linzhongniao test]# ls|sort|sed -nr 's#(^.*_)(.*)(_.*ed)(.j.*$)#mv & \1\2\4#gp'|bash
[root@linzhongniao test]# ls
stu_102999_1.jpg  stu_102999_2.jpg  stu_102999_3.jpg  stu_102999_4.jpg  stu_102999_5.jpg
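
The same generate-then-execute pattern can be tried safely in a scratch directory. This sketch uses a simpler two-group pattern than the one above and assumes file names without spaces or shell metacharacters, since the generated text is executed as shell commands.

```shell
# Work in a scratch directory with two made-up file names.
tmp=$(mktemp -d)
touch "$tmp/stu_102999_1_finished.jpg" "$tmp/stu_102999_2_finished.jpg"

# Dry run: build the mv commands and inspect them before running anything.
# & is the whole matched name, \1 the part before "_finished", \2 the extension.
cmds=$(ls "$tmp" | sed -nE 's#^(.*)_finished(\.jpg)$#mv & \1\2#p')
echo "$cmds"

# Execute the commands inside the scratch directory, then list the result.
(cd "$tmp" && echo "$cmds" | sh)
result=$(ls "$tmp")
echo "$result"
rm -rf "$tmp"
```

Printing the commands first and piping them to a shell only after checking them is what makes this approach reasonably safe for bulk changes.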

9.5 Create users in batch

[root@linzhongniao ~]# echo stu{1..10}|xargs -n 1|awk '{print "useradd" ,$0}' 
useradd stu1
useradd stu2
useradd stu3
useradd stu4
useradd stu5
useradd stu6
useradd stu7
useradd stu8
useradd stu9
useradd stu10
[root@linzhongniao ~]# echo stu{1..10}|xargs -n 1|awk '{print "useradd " ,$0}' 
useradd stu1
useradd stu2
useradd stu3
useradd stu4
useradd stu5
useradd stu6
useradd stu7
useradd stu8
useradd stu9
useradd stu10
Finally, hand the output to bash for execution
[root@linzhongniao ~]# echo stu{1..10}|xargs -n 1|awk '{print "useradd " ,$0}'|bash
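
The command list can also be produced by awk alone, which avoids the bash-only {1..10} brace expansion. This is a dry-run sketch limited to three names; it only prints the commands, because actually running useradd requires root.

```shell
# Generate the commands in an awk BEGIN block; no input file is read.
cmds=$(awk 'BEGIN { for (i = 1; i <= 3; i++) print "useradd stu" i }')
echo "$cmds"

# To create the accounts for real (as root), pipe the checked output to a shell:
#   echo "$cmds" | bash
```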