The sed command

Date: 2019-11-18

Tags: sed, commands. Category: Linux


Full English name: Stream EDitor

How it works

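sed follows a simple workflow: read, execute, and display. (The original post illustrated this workflow with a diagram.)

Read: sed reads one line from the input stream (a file, a pipe, or standard input) and stores it in an internal buffer called the pattern space (pattern buffer).
Execute: by default, all sed commands are applied to the pattern space in order. Unless a line address is specified, the commands are applied to every line in turn.
Display: the modified content is sent to the output stream. After the data has been sent, the pattern space is cleared.
This process repeats until the entire file has been processed.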

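Points to note

The pattern space (pattern buffer) is the active buffer: it holds the text being examined while sed executes its commands.
By default, all sed commands operate on the pattern space, so the input file itself is not changed.
There is a second buffer called the hold space (hold buffer). While processing lines in the pattern space, you can use the hold space to store lines temporarily. At the end of each cycle sed clears the pattern space, but the contents of the hold space persist across all cycles. sed commands cannot operate on the hold space directly, so sed provides commands that move data between the hold space and the pattern space.
Initially, both the hold space and the pattern space are empty.
If no input file is given, sed reads from standard input.
If no address range is given, sed applies its commands to every line by default.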
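
The interplay of the two buffers can be sketched with a short example. This is only an illustration, assuming GNU sed: the h command (copy the pattern space to the hold space) and the G command (append the hold space to the pattern space) are standard sed commands that this article does not cover elsewhere, and the file name sample.txt is made up.

```shell
# Hypothetical sample file in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' one two three > "$tmpdir/sample.txt"

# 1!G : on every line except the first, append the hold space to the pattern space
# h   : copy the pattern space into the hold space (it survives across cycles)
# $p  : on the last line, print the accumulated pattern space
reversed=$(sed -n '1!G;h;$p' "$tmpdir/sample.txt")
printf '%s\n' "$reversed"

# The input file is untouched, because sed only edited its own buffers.
original=$(cat "$tmpdir/sample.txt")
printf '%s\n' "$original"
rm -rf "$tmpdir"
```

The first printf shows the lines in reverse order, while the second shows that sample.txt still holds them in their original order.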

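Syntax:

Command form: sed [option] 'sed command' filename

Script form: sed [option] -f 'sed script' filename

Options:

        -n: suppress the automatic printing of every line; only output requested explicitly (for example with p) is shown
        -e: give the sed commands directly on the command line; this is the default when there is a single script, and the option can be repeated to chain several commands
        -f: read the sed commands from a file; -f filename runs the commands stored in filename
        -r: enable extended regular expressions (GNU sed; -E is the equivalent spelling)
        -i: edit the file in place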
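
The options can be compared side by side on a small file. The following is a minimal sketch assuming GNU sed; the file names words.txt and edit.sed are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmpdir/words.txt"

# Without -n every line is printed automatically, so 'p' prints line 2 a second time.
with_auto=$(sed '2p' "$tmpdir/words.txt")
# With -n only the explicit 'p' produces output.
only_p=$(sed -n '2p' "$tmpdir/words.txt")

# -f reads the commands from a script file, one command per line.
printf '%s\n' 's/alpha/ALPHA/' '3d' > "$tmpdir/edit.sed"
from_script=$(sed -f "$tmpdir/edit.sed" "$tmpdir/words.txt")

# -i rewrites the file in place (GNU syntax; BSD sed expects a backup suffix argument).
sed -i 's/beta/BETA/' "$tmpdir/words.txt"
after_inplace=$(cat "$tmpdir/words.txt")

printf '%s\n' "$with_auto" "--" "$only_p" "--" "$from_script" "--" "$after_inplace"
rm -rf "$tmpdir"
```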

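sed editing commands:
        a: append; the text following a is added as a new line after the current line
        c: change; the text following c replaces the addressed lines, for example the lines between n1 and n2
        d: delete
        i: insert; the text following i is added as a new line before the current line
        p: print
        s: substitute; replaces matched text and is usually combined with regular expressions, for example 1,20s/old/new/g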

Examples:

1. Line addressing

Print line 2. Because sed prints every line of the file by default, line 2 appears twice:
[root@localhost] ~$ sed '2p' data
this is the header line
this is the first  data line
this is the first  data line
this is the second data line
this is the last line 

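Print only line 2:
[root@localhost] ~$ sed -n '2p' data

Print lines 1 to 3:
[root@localhost] ~$ sed -n '1,3p' data

Print line 2 and the 4 lines that follow it, that is, lines 2, 3, 4, 5 and 6:
sed -n '2,+4 p' books.txt

Print every second line starting from line 1. The address 1~2 matches line numbers 1, 3, 5, 7 and so on, so this prints only the odd-numbered lines of the file:
sed -n '1~2 p' books.txt

Print the lines that match second:
[root@localhost] ~$ sed -n '/second/p' data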
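
The addr,+N and first~step forms are GNU sed extensions. A quick way to check what they select is to feed sed numbered lines; in this sketch seq stands in for a ten-line file, and paste merely joins the output onto one line for display.

```shell
# Line 2 plus the 4 lines after it.
plus=$(seq 1 10 | sed -n '2,+4p' | paste -sd' ' -)
# Start at line 1, then every 2nd line (odd lines).
odd=$(seq 1 10 | sed -n '1~2p' | paste -sd' ' -)
# Start at line 2, then every 2nd line (even lines).
even=$(seq 1 10 | sed -n '2~2p' | paste -sd' ' -)

printf '%s\n' "$plus" "$odd" "$even"
```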
this is the second data line

Print from the first line that matches first through line 4:
[root@localhost] ~$ sed -n '/first/,4p' data
this is the first  data line
this is the second data line
this is the last line 

Print from the first line that matches data through the next line that matches last:
[root@localhost] ~$ sed -n '/data/,/last/p' data
this is the first  data line
this is the second data line
this is the last line

Filter out lines that start with #:
[root@localhost] ~$ sed -n '/^#/!p' /etc/rc.d/rc.local

touch /var/lock/subsys/local

Filter out lines that start with #, pass the remaining lines into {}, and then filter out empty lines as well:
[root@localhost] ~$ sed -n '/^#/!{/^$/!p}' /etc/rc.d/rc.local
touch /var/lock/subsys/local

Apply several operations to one file, each introduced by -e:
[root@localhost] ~$ sed -e '/^#/d' -e '/^$/d' /etc/rc.d/rc.local
touch /var/lock/subsys/local
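
Because /etc/rc.d/rc.local differs from system to system, the same two filters can be tried on a throwaway file. This sketch assumes GNU sed; app.conf and its contents are invented for illustration.

```shell
tmpdir=$(mktemp -d)
# Hypothetical config file with comment lines and an empty line.
printf '%s\n' '# comment' '' 'port=8080' '# another' 'host=localhost' > "$tmpdir/app.conf"

# Nested form: among the lines that are not comments, print those that are not empty.
nested=$(sed -n '/^#/!{/^$/!p}' "$tmpdir/app.conf")
# Equivalent form: delete comment lines, then delete empty lines.
deleted=$(sed -e '/^#/d' -e '/^$/d' "$tmpdir/app.conf")

printf '%s\n' "$nested" "--" "$deleted"
rm -rf "$tmpdir"
```

Both forms leave only the port and host lines.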

2. a: append (i, insert, works the same way)

Append hello world after line 4:
sed '4 a hello world' books.txt
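
To see the difference between a and i without needing books.txt, here is a small sketch on three generated lines. It assumes GNU sed, which accepts the one-line form with the text directly after the command.

```shell
# 'a' adds the text after the addressed line.
appended=$(printf '%s\n' one two three | sed '2 a hello world')
# 'i' adds the text before the addressed line.
inserted=$(printf '%s\n' one two three | sed '2 i hello world')

printf '%s\n' "$appended" "--" "$inserted"
```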

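3. Substitution

sed 's/original/replacement/'    replaces only the first match on each line

sed 's/original/replacement/2'   replaces only the second match on each line

sed 's/original/replacement/g'   replaces every match on each line

What if the text in a substitution contains a /? Simply escape it with a backslash.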
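
The effect of the three forms is easiest to see on a line with several matches. A minimal sketch (the input string is arbitrary):

```shell
first=$(echo 'a-a-a' | sed 's/a/X/')     # no flag: first match only
second=$(echo 'a-a-a' | sed 's/a/X/2')   # numeric flag: second match only
all=$(echo 'a-a-a' | sed 's/a/X/g')      # g flag: every match

printf '%s\n' "$first" "$second" "$all"
```

This prints X-a-a, a-X-a and X-X-X in turn.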

$ echo "/bin/sed" | sed 's/\/bin\/sed/\/home\/mylxsw\/src\/sed\/sed-4.2.2\/sed/'
/home/mylxsw/src/sed/sed-4.2.2/sed

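In the command above, \ escapes each /, but the expression is already hard to read. sed accepts other characters as the delimiter of the s command, such as |, @, ^ and !, so the following commands are equivalent to the one above:

echo "/bin/sed" | sed 's|/bin/sed|/home/mylxsw/src/sed/sed-4.2.2/sed|'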
echo "/bin/sed" | sed 's@/bin/sed@/home/mylxsw/src/sed/sed-4.2.2/sed@'
echo "/bin/sed" | sed 's^/bin/sed^/home/mylxsw/src/sed/sed-4.2.2/sed^'
echo "/bin/sed" | sed 's!/bin/sed!/home/mylxsw/src/sed/sed-4.2.2/sed!'

c: replace whole lines

Replace line 4 with hello world:
sed '4 c hello world' books.txt

Replace lines 4 to 6 with hello world:
sed '4, 6 c hello world' books.txt
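
One detail worth checking is what happens with a range: GNU sed replaces the whole range with a single copy of the text rather than one copy per line. The sketch below assumes GNU sed and uses seq output in place of books.txt.

```shell
# Replace a single line.
single=$(seq 1 5 | sed '2 c replaced' | paste -sd' ' -)
# Replace a range: lines 2 to 4 collapse into one replacement line.
range=$(seq 1 5 | sed '2,4 c replaced' | paste -sd' ' -)

printf '%s\n' "$single" "$range"
```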

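4. y: transform (translate); it is the only command that operates on individual characters

The transform command maps the characters in inchars one-to-one onto the characters in outchars, and applies that mapping to every matching character on the line. If inchars and outchars differ in length, sed reports an error. Note that the mapping is per character: in the examples below 15 becomes IV (1 to I, 5 to V), not the Roman numeral XV. The source string there lists 1 and 5 twice; in the output shown, the first mapping of each repeated character is the one that applies.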


$ echo "1 5 15 20" | sed 'y/151520/IVXVXX/'
I V IV XX

[root@localhost] ~$ echo "1 5 15 201" | sed 'y/151520/IVXVXX/'
I V IV XXI

[root@localhost] ~$ echo "1 5 15 201" | sed 'y/1515201/IVXVXX/'
sed: -e expression #1, char 17: strings for `y' command are different lengths

[root@localhost] ~$ echo "11 5 15 201" | sed 'y/151520/IVXVXX/'
II V IV XXI

[root@localhost] ~$ echo "11 5 15 203" | sed 'y/151520/IVXVXX/'
II V IV XX3
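
A mapping without repeated source characters is easier to reason about. The following sketch upper-cases a string and remaps digits; both mappings are arbitrary examples.

```shell
# Each lowercase letter maps to the uppercase letter in the same position.
upper=$(echo 'hello sed' | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')
# Each digit maps to a letter; characters not listed (here '-') pass through unchanged.
digits=$(echo '2019-11-18' | sed 'y/0123456789/abcdefghij/')

printf '%s\n' "$upper" "$digits"
```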

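5. w: write matching lines to a file. Separate w from the file name with a single space; the rest of the line is taken as the file name.

Write the contents of books.txt to books.bak, equivalent to cp:
sed -n 'w books.bak' books.txt

Write the even-numbered lines to junk.txt:
sed -n '2~2 w junk.txt' books.txt

Write different matches to different files:
sed -n -e '/Martin/ w Martin.txt' -e '/Paulo/ w Paulo.txt'  books.txt

In the file test, find the lines containing IPADDR, replace IPADDR with ip, and save the changed lines to ip.txt:
[root@jie1 ~]# sed -i 's/IPADDR/ip/w ip.txt' test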
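
The two uses of w (as a command and as a flag of s) can be tried safely in a temporary directory. This is a sketch assuming GNU sed; ifcfg.txt, matched.txt and changed.txt are hypothetical file names.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'DEVICE=eth0' 'IPADDR=10.0.0.5' 'NETMASK=255.255.255.0' > "$tmpdir/ifcfg.txt"

# w as a command: copy the lines matching the address into matched.txt.
sed -n "/IPADDR/ w $tmpdir/matched.txt" "$tmpdir/ifcfg.txt"
matched=$(cat "$tmpdir/matched.txt")

# w as a flag of s: edit the file in place and also save the changed lines.
sed -i "s/IPADDR/ip/w $tmpdir/changed.txt" "$tmpdir/ifcfg.txt"
changed=$(cat "$tmpdir/changed.txt")
edited=$(cat "$tmpdir/ifcfg.txt")

printf '%s\n' "$matched" "$changed" "--" "$edited"
rm -rf "$tmpdir"
```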

6. r: read a file. There must be exactly one space between r and the file name.

Insert the contents of junk.txt after the third line of books.txt:
$ echo "This is junk text." > junk.txt 
$ sed '3 r junk.txt' books.txt 
1) A Storm of Swords, George R. R. Martin, 1216 
2) The Two Towers, J. R. R. Tolkien, 352 
3) The Alchemist, Paulo Coelho, 197 
This is junk text. 
4) The Fellowship of the Ring, J. R. R. Tolkien, 432 
5) The Pilgrimage, Paulo Coelho, 288 
6) A Game of Thrones, George R. R. Martin, 864

r also accepts an address range. For example, 3, 5 r junk.txt inserts the contents of junk.txt after each of lines 3, 4 and 5.
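
The single-address and range behaviour can be compared with two small files. A minimal sketch; main.txt and note.txt are made-up names.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2' 'line 3' > "$tmpdir/main.txt"
echo '-- note --' > "$tmpdir/note.txt"

# Single address: the file is inserted once, after line 2.
once=$(sed "2 r $tmpdir/note.txt" "$tmpdir/main.txt")
# Range: the file is inserted after every line in the range.
each=$(sed "1,2 r $tmpdir/note.txt" "$tmpdir/main.txt")

printf '%s\n' "$once" "--" "$each"
rm -rf "$tmpdir"
```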

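7. l: show hidden characters. Can you tell by looking whether words are separated by spaces or by tabs? Obviously not, but sed can do it for you. The l command (lowercase L) displays hidden characters in the text, such as \t, and marks the end of each line with $.

First replace the spaces in books.txt with tabs:
$ sed 's/ /\t/g' books.txt > junk.txt

Show the hidden characters:
$ sed -n 'l' junk.txt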
1)\tStorm\tof\tSwords,\tGeorge\tR.\tR.\tMartin,\t1216\t$
2)\tThe\tTwo\tTowers,\tJ.\tR.\tR.\tTolkien,\t352\t$
3)\tThe\tAlchemist,\tPaulo\tCoelho,\t197\t$
4)\tThe\tFellowship\tof\tthe\tRing,\tJ.\tR.\tR.\tTolkien,\t432\t$
5)\tThe\tPilgrimage,\tPaulo\tCoelho,\t288\t$
6)\tA\tGame\tof\tThrones,\tGeorge\tR.\tR.\tMartin,\t864$

Adding a number wraps the output at that width, here 25 characters per line:
$ sed -n 'l 25' books.txt
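
A one-line input is enough to see how l renders a tab and a trailing space. A minimal sketch (the input text is arbitrary):

```shell
# The input contains a tab between the words and a space before the newline.
visible=$(printf 'name\tvalue \n' | sed -n 'l')
printf '%s\n' "$visible"
```

The tab shows up as \t, and the $ at the end makes the trailing space visible.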

8. q: quit

Print up to line 3 and then quit:
$ sed '3 q' books.txt
1) A Storm of Swords, George R. R. Martin, 1216 
2) The Two Towers, J. R. R. Tolkien, 352 
3) The Alchemist, Paulo Coelho, 197
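
The result is the same as printing lines 1 to 3 with -n, except that q stops reading the input once line 3 has been printed. A small sketch, with seq standing in for the file:

```shell
# Quit after line 3; the remaining input is never processed.
with_q=$(seq 1 10 | sed '3 q' | paste -sd' ' -)
# Same output, but sed still reads all ten lines.
with_p=$(seq 1 10 | sed -n '1,3p' | paste -sd' ' -)

printf '%s\n' "$with_q" "$with_p"
```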

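9. e: execute an external command

The following command runs date before the third line (the output is shown in the author's Chinese locale):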
$ sed '3 e date' books.txt
1) Storm of Swords, George R. R. Martin, 1216
2) The Two Towers, J. R. R. Tolkien, 352
2016年11月29日 星期二 22時46分14秒 CST
3) The Alchemist, Paulo Coelho, 197
4) The Fellowship of the Ring, J. R. R. Tolkien, 432
5) The Pilgrimage, Paulo Coelho, 288
6) A Game of Thrones, George R. R. Martin, 864

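If you look closely at the syntax of the e command, you will notice that its command argument is optional. When no external command is given, sed treats the contents of the pattern space as the command to execute.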

$ echo -e "date\ncal\nuname" > commands.txt
$ cat commands.txt
date
cal
uname

$ sed 'e' commands.txt
2016年11月29日 星期二 22時50分30秒 CST
十一月 2016
日 一 二 三 四 五 六
1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30
Darwin
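
Since date and cal produce different output on every run, the two forms of e are easier to verify with echo. This sketch assumes GNU sed, where e is an extension.

```shell
# With an argument: run the command before the addressed line is output.
before=$(printf '%s\n' first second | sed '2 e echo inserted')
# Without an argument: each input line is executed and replaced by its output.
ran=$(printf '%s\n' 'echo hello' 'echo world' | sed 'e')

printf '%s\n' "$before" "--" "$ran"
```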

10. Extracting a substring (important)

Given the following string:
asdfkjasldjkf"shiner"df

Requirement:
Extract the substring shiner.

The command:
[root@localhost /]$ echo "asdfkjasldjkf\"shiner\"df" | sed 's/\(.*\)"\(.*\)"\(.*\)/\2/g'
shiner

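Explanation:
s: the substitute command
\(.*\)": the content before the first quote
"\(.*\)": the content between the two quotes
"\(.*\): the content after the second quote
\2: the content captured by the second pair of parentheses

The text matched by a parenthesized expression can be referenced as \1, \2 and so on; the content of the n-th pair of parentheses is referenced as \n.

In other words, the command replaces the whole string with \2, the content of the second group (shiner), which yields the substring we want.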
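
The same extraction can be written with basic or extended regular expressions. A minimal sketch assuming GNU sed; with -r the parentheses no longer need backslashes.

```shell
input='asdfkjasldjkf"shiner"df'

# Basic regular expressions: groups are written \( ... \).
bre=$(echo "$input" | sed 's/\(.*\)"\(.*\)"\(.*\)/\2/')
# Extended regular expressions (-r): groups are written ( ... ).
ere=$(echo "$input" | sed -r 's/(.*)"(.*)"(.*)/\2/')

printf '%s\n' "$bre" "$ere"
```

Both commands print shiner.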

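Special characters = and &

=: print the line number

Print the line number for every line:
sed '=' books2.txt

Print the line number for lines 1 to 4:
sed '1, 4 =' books2.txt

Print the line number of the last line only, which gives the total number of lines in the file:
sed -n '$ =' books2.txt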
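
The heading above also names &, which the examples do not show. In the replacement part of an s command, & stands for the text that the pattern matched. The sketch below illustrates both characters on generated input rather than books2.txt.

```shell
# '$=' prints the number of the last line, that is, the line count.
count=$(printf '%s\n' a b c | sed -n '$=')
# '=' prints each line number on its own line, before the line itself.
numbered=$(printf '%s\n' a b | sed '=' | paste -sd' ' -)
# '&' in the replacement is the matched text, here wrapped in brackets.
wrapped=$(echo 'stream editor' | sed 's/editor/[&]/')

printf '%s\n' "$count" "$numbered" "$wrapped"
```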

Common sed interview questions

1) Process the following file: extract the domain names, count how often each occurs, and sort by count. Given this input:
http://www.baidu.com/index.html  
http://www.baidu.com/1.html  
http://post.baidu.com/index.html  
http://mp3.baidu.com/index.html  
http://www.baidu.com/3.html  
http://post.baidu.com/2.html  
produce the following result:  
count domain  
3 www.baidu.com  
2 post.baidu.com  
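1 mp3.baidu.com  
[root@localhost shell]# cat file | sed -e ' s/http:\/\///' -e ' s/\/.*//' | sort | uniq -c | sort -rn    # strip the leading http://, strip everything from the first remaining / onward, sort, count duplicates, then sort by count in descending order
3 www.baidu.com  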
2 post.baidu.com  
1 mp3.baidu.com  
[root@codfei4 shell]# awk -F/ '{print $3}' file |sort -r|uniq -c|awk '{print $1"\t",$2}'  
3 www.baidu.com  
2 post.baidu.com  
1 mp3.baidu.com
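
For reference, the sed solution can be reproduced end to end as follows. This is a sketch rather than the only answer: urls.txt is a hypothetical file name, | is used as the s delimiter to avoid escaping slashes, and the final awk step only normalizes the padding that uniq -c adds.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/urls.txt" <<'EOF'
http://www.baidu.com/index.html
http://www.baidu.com/1.html
http://post.baidu.com/index.html
http://mp3.baidu.com/index.html
http://www.baidu.com/3.html
http://post.baidu.com/2.html
EOF

# Strip the scheme, strip the path, then count and sort by count (descending).
counts=$(sed -e 's|^http://||' -e 's|/.*||' "$tmpdir/urls.txt" \
  | sort | uniq -c | sort -rn | awk '{print $1, $2}')

printf '%s\n' "$counts"
rm -rf "$tmpdir"
```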