A Concise SED Tutorial

Reposted from: https://coolshell.cn/articles/9104.html

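awk was born in 1977, so this year it turns 36, its zodiac year. sed is two or three years older than awk. If awk is Sister Lin, then sed is her Brother Baoyu. So now that Sister Lin has danced her Topless number, her big brother sed cannot sit still and has to come out and shake a leg too.

sed stands for stream editor: it edits text programmatically, which is a very hacker thing to do. sed is basically all about regular-expression pattern matching, so people who use sed a lot tend to be good at regular expressions.

As before, this article does not cover everything about sed; you can consult the sed manual for that. What I mainly want is to compete for the time that slips away through your phone or on the toilet, and put it toward learning something. Of course, the rest is up to your own two hands.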

Substitution with the s command

I use the following text for the demonstrations:

$ cat pets.txt
This is my cat
  my cat's name is betty
This is my dog
  my dog's name is frank
This is my fish
  my fish's name is george
This is my goat
  my goat's name is adam

To replace the string my with Hao Chen's, the following command should be easy to follow (s is the substitute command, /my/ matches my, /Hao Chen's/ is the replacement text, and the trailing g replaces every match on a line):

$ sed "s/my/Hao Chen's/g" pets.txt
This is Hao Chen's cat
  Hao Chen's cat's name is betty
This is Hao Chen's dog
  Hao Chen's dog's name is frank
This is Hao Chen's fish
  Hao Chen's fish's name is george
This is Hao Chen's goat
  Hao Chen's goat's name is adam

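Note: if you wrap the sed script in single quotes, you cannot escape a single quote inside it with \'. Use double quotes instead; inside double quotes you can escape a double quote with \".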
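The quoting rule above can be sketched as follows. The sample line is made up for illustration; the second form relies on the common shell idiom of closing the single-quoted string, adding an escaped apostrophe, and reopening the string.

```shell
line="This is my cat"   # hypothetical sample input

# Double-quoted script: the apostrophe needs no escaping.
out_dq=$(printf '%s\n' "$line" | sed "s/my/Hao Chen's/")

# Single-quoted script: close the quote, add an escaped ', reopen the quote.
out_sq=$(printf '%s\n' "$line" | sed 's/my/Hao Chen'\''s/')

printf '%s\n' "$out_dq" "$out_sq"
```

Both commands should produce the same line; the double-quoted form is easier to read, but remember that the shell expands $ and backquotes inside double quotes.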

 

Note again: the sed command above does not change the file; it only prints the processed text. To save the result, redirect the output, for example:

$ sed "s/my/Hao Chen's/g" pets.txt > hao_pets.txt

Or use the -i option to modify the file in place:

$ sed -i "s/my/Hao Chen's/g" pets.txt
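Since -i overwrites the file, it can be safer to keep a backup. The sketch below assumes a sed that accepts a backup suffix attached to -i (GNU sed does; BSD/macOS sed requires a suffix argument after -i). The scratch directory and file contents are made up for illustration.

```shell
workdir=$(mktemp -d)   # scratch directory, so no real file is touched
printf 'This is my cat\n' > "$workdir/pets.txt"

# -i.bak edits pets.txt in place and keeps the original as pets.txt.bak
sed -i.bak 's/my/your/' "$workdir/pets.txt"

edited=$(cat "$workdir/pets.txt")
backup=$(cat "$workdir/pets.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

rm -rf "$workdir"
```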

Add something at the beginning of every line:

$ sed 's/^/#/g' pets.txt
#This is my cat
#  my cat's name is betty
#This is my dog
#  my dog's name is frank
#This is my fish
#  my fish's name is george
#This is my goat
#  my goat's name is adam

Add something at the end of every line:

$ sed 's/$/ --- /g' pets.txt
This is my cat ---
  my cat's name is betty ---
This is my dog ---
  my dog's name is frank ---
This is my fish ---
  my fish's name is george ---
This is my goat ---
  my goat's name is adam ---

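While we are here, a quick introduction to the most basic parts of regular expressions:

  • ^ matches the beginning of a line. For example, /^#/ matches lines that start with #.
  • $ matches the end of a line. For example, /}$/ matches lines that end with }.
  • \< matches the beginning of a word. For example, \<abc matches words that start with abc.
  • \> matches the end of a word. For example, abc\> matches words that end with abc.
  • . matches any single character.
  • * means the preceding character occurs zero or more times.
  • [ ] is a character set. For example, [abc] matches a, b, or c, and [a-zA-Z] matches any letter, upper or lower case. A leading ^ negates the set: [^a] matches any character that is not a.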
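A few of the items above can be tried out directly. This is a minimal sketch; the input strings are made up for illustration.

```shell
# $ anchors the match at the end of the line: only a trailing ".c" changes.
out_end=$(printf 'foo.c\nbar.c.txt\n' | sed 's/\.c$/.o/')

# * allows zero repetitions, so b* matches the empty string at the start of the line.
out_star=$(printf 'aaa bbb\n' | sed 's/b*/X/')

# [^0-9] matches any one character that is not a digit.
out_set=$(printf 'tel: 555-1234\n' | sed 's/[^0-9]//g')

printf '%s\n' "$out_end" "$out_star" "$out_set"
```

The second command is the one that tends to surprise people: because zero occurrences count as a match, the substitution happens at the very first position rather than at the first b.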

Regular expressions can do some pretty cool things. For example, suppose we want to strip the tags from some HTML:

html.txt
<b>This</b> is what <span style="text-decoration: underline;">I</span> meant. Understand?

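Here are the sed commands:

# Doing it this way goes wrong: .* is greedy, so <.*> runs from the first < to the last >
$ sed 's/<.*>//g' html.txt
 meant. Understand?

# To fix that, do it as below.
# '[^>]*' means any character other than >, repeated zero or more times.
$ sed 's/<[^>]*>//g' html.txt
This is what I meant. Understand?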

Next, let's restrict where the substitution applies. This command substitutes only on line 3:

$ sed "3s/my/your/g" pets.txt
This is my cat
  my cat's name is betty
This is your dog
  my dog's name is frank
This is my fish
  my fish's name is george
This is my goat
  my goat's name is adam

The following command substitutes only on lines 3 through 6:

$ sed "3,6s/my/your/g" pets.txt
This is my cat
  my cat's name is betty
This is your dog
  your dog's name is frank
This is your fish
  your fish's name is george
This is my goat
  my goat's name is adam

 

The next examples use a second file, my.txt:

$ cat my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
This is my goat, my goat's name is adam

Replace only the first s on each line:

$ sed 's/s/S/1' my.txt
ThiS is my cat, my cat's name is betty
ThiS is my dog, my dog's name is frank
ThiS is my fish, my fish's name is george
ThiS is my goat, my goat's name is adam

Replace only the second s on each line:

$ sed 's/s/S/2' my.txt
This iS my cat, my cat's name is betty
This iS my dog, my dog's name is frank
This iS my fish, my fish's name is george
This iS my goat, my goat's name is adam

Replace the third s and every s after it on each line:

$ sed 's/s/S/3g' my.txt
This is my cat, my cat'S name iS betty
This is my dog, my dog'S name iS frank
This is my fiSh, my fiSh'S name iS george
This is my goat, my goat'S name iS adam

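Multiple matches

To apply several substitutions in one go, see the example below (the first command replaces my with your on lines 1 through 3; the second replaces This with That from line 3 to the end):

$ sed '1,3s/my/your/g; 3,$s/This/That/g' my.txt
This is your cat, your cat's name is betty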
This is your dog, your dog's name is frank
That is your fish, your fish's name is george
That is my goat, my goat's name is adam

The command above is equivalent to the following (note: this uses sed's -e command-line option):

$ sed -e '1,3s/my/your/g' -e '3,$s/This/That/g' my.txt

We can use & to stand for the matched text, and then add something on either side of it, as shown below:

$ sed 's/my/[&]/g' my.txt
This is [my] cat, [my] cat's name is betty
This is [my] dog, [my] dog's name is frank
This is [my] fish, [my] fish's name is george
This is [my] goat, [my] goat's name is adam
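Two more small uses of & as a sketch, with made-up input lines: wrapping whatever number was matched, and writing \& when a literal ampersand is wanted in the replacement.

```shell
# & expands to the matched text, whatever it turned out to be.
out_amp=$(printf 'listen on port 8080\n' | sed 's/[0-9][0-9]*/<&>/')

# \& in the replacement is a literal ampersand, not the match.
out_lit=$(printf 'cats dogs\n' | sed 's/ / \& /')

printf '%s\n' "$out_amp" "$out_lit"
```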

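Parenthesized groups

An example of parenthesized groups (the text matched by a regular expression enclosed in parentheses can be used like a variable; sed refers to the groups as \1, \2, ...):

$ sed 's/This is my \([^,]*\),.*is \(.*\)/\1:\2/g' my.txt
cat:betty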
dog:frank
fish:george
goat:adam

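The regular expression in this example is a bit complex. Unpacked (with the escaping backslashes removed) it looks like this:

Regex: This is my ([^,]*),.*is (.*)
Match: This is my (cat),..........is (betty)

So \1 is cat and \2 is betty.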
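The same mechanism can reorder fields. A minimal sketch on made-up name:animal pairs: the first group captures everything before the colon, the second everything after it, and the replacement emits them in the opposite order.

```shell
# \1 = text before the first colon, \2 = the rest of the line
out_swap=$(printf 'betty:cat\nfrank:dog\n' | sed 's/\([^:]*\):\(.*\)/\2=\1/')

printf '%s\n' "$out_swap"
```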

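sed commands

Let's go back to the very first example, pets.txt, and look at a few commands.

The N command

First, the N command: it appends the next line of input to the buffer being matched.

The example below pulls each even-numbered line of the original text into the preceding odd-numbered line. Since s matches and replaces only once, we get the following result:

$ sed 'N;s/my/your/' pets.txt
This is your cat
  my cat's name is betty
This is your dog
  my dog's name is frank
This is your fish
  my fish's name is george
This is your goat
  my goat's name is adam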

In other words, sed effectively sees the file as:

This is my cat\n  my cat's name is betty
This is my dog\n  my dog's name is frank
This is my fish\n  my fish's name is george
This is my goat\n  my goat's name is adam

With that in mind, the following example should make sense:

$ sed 'N;s/\n/,/' pets.txt
This is my cat,  my cat's name is betty
This is my dog,  my dog's name is frank
This is my fish,  my fish's name is george
This is my goat,  my goat's name is adam
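pets.txt has an even number of lines, so every N finds a partner line. With an odd number of lines, what N does when there is no next line differs between sed implementations, so a common idiom is $!N, which skips N on the last line. A minimal sketch on made-up three-line input:

```shell
# $!N: run N on every line except the last, then join the pair with a comma.
out_join=$(printf 'a\nb\nc\n' | sed '$!N;s/\n/,/')

printf '%s\n' "$out_join"
```

The unpaired last line reaches the s command on its own, contains no newline, and is printed unchanged.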

The a and i commands

The a command appends and the i command inserts; both are used to add lines. For example:

# 1 i inserts a line before line 1 (insert)
$ sed "1 i This is my monkey, my monkey's name is wukong" my.txt
This is my monkey, my monkey's name is wukong
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
This is my goat, my goat's name is adam

# $ a appends a line after the last line (append)
$ sed "$ a This is my monkey, my monkey's name is wukong" my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
This is my goat, my goat's name is adam
This is my monkey, my monkey's name is wukong

We can also use a pattern match to decide where to add text:

# Note the /fish/a: it appends a line after each line that matches /fish/
$ sed "/fish/a This is my monkey, my monkey's name is wukong" my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
This is my monkey, my monkey's name is wukong
This is my goat, my goat's name is adam

The following example appends a line after every line (every line matches /my/):

$ sed "/my/a ----" my.txt
This is my cat, my cat's name is betty
----
This is my dog, my dog's name is frank
----
This is my fish, my fish's name is george
----
This is my goat, my goat's name is adam
----
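The one-line form "a text" used above is a GNU sed extension. The traditional form puts a backslash after the command and the text on the following line. A minimal sketch on made-up input:

```shell
# a\ followed by a newline, then the text to append after each matching line
out_append=$(printf 'dog\nfish\ngoat\n' | sed '/fish/a\
----')

printf '%s\n' "$out_append"
```

The same two-line shape works for i and c.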

The c command

The c command replaces the matched line:

$ sed "2 c This is my monkey, my monkey's name is wukong" my.txt
This is my cat, my cat's name is betty
This is my monkey, my monkey's name is wukong
This is my fish, my fish's name is george
This is my goat, my goat's name is adam
 
$ sed "/fish/c This is my monkey, my monkey's name is wukong" my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my monkey, my monkey's name is wukong
This is my goat, my goat's name is adam

The d command

The d command deletes the matched lines:

$ sed '/fish/d' my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my goat, my goat's name is adam
 
$ sed '2d' my.txt
This is my cat, my cat's name is betty
This is my fish, my fish's name is george
This is my goat, my goat's name is adam
 
$ sed '2,$d' my.txt
This is my cat, my cat's name is betty
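A typical use of d is cleaning up a file by dropping comment lines and empty lines. A minimal sketch; the config text is made up for illustration.

```shell
# hypothetical config fragment with comments and a blank line
config='# settings

name=betty
# end'

# delete lines starting with #, then delete empty lines
out_clean=$(printf '%s\n' "$config" | sed '/^#/d;/^$/d')

printf '%s\n' "$out_clean"
```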

The p command

The p command prints.

You can think of it as a grep-like command:

# Match fish and print it. Notice that the fish line appears twice,
# because sed also prints every line it processes by default
$ sed '/fish/p' my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
This is my fish, my fish's name is george
This is my goat, my goat's name is adam
 
# The -n option takes care of that
$ sed -n '/fish/p' my.txt
This is my fish, my fish's name is george
 
# From one pattern to another
$ sed -n '/dog/,/fish/p' my.txt
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
 
# Print from the first line through the first line that matches fish
$ sed -n '1,/fish/p' my.txt
This is my cat, my cat's name is betty
This is my dog, my dog's name is frank
This is my fish, my fish's name is george
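Combined with line-number addresses, -n and p can print a slice of a file, or only its last line. A minimal sketch on made-up four-line input:

```shell
# print only lines 2 through 3
out_range=$(printf 'a\nb\nc\nd\n' | sed -n '2,3p')

# $ addresses the last line
out_last=$(printf 'a\nb\nc\nd\n' | sed -n '$p')

printf '%s\n' "$out_range" "$out_last"
```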

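A few key concepts

Now let's introduce four basic concepts of sed.

Pattern Space

The zeroth is about the -n option. If it did not quite make sense earlier, don't worry. Let's look at pseudocode for how sed processes text, and meet the concept of the pattern space:

foreach line in file {
    // put the line into Pattern_Space
    Pattern_Space <= line;

    // run the sed commands on the pattern space
    Pattern_Space <= EXEC(sed_cmd, Pattern_Space);

    // unless -n was given, print the processed Pattern_Space
    if (sed option hasn't "-n") {
        print Pattern_Space
    }
}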
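The loop above can be imitated in plain shell. This is only a toy sketch, not how sed is implemented: mini_sed is a hypothetical helper, and the fixed tr call stands in for EXEC(sed_cmd).

```shell
# mini_sed [-n]: hypothetical helper that mimics sed's read/execute/print cycle
mini_sed() {
    quiet=$1
    while IFS= read -r pattern_space; do
        # stand-in for EXEC(sed_cmd): upper-case the pattern space
        pattern_space=$(printf '%s' "$pattern_space" | tr 'a-z' 'A-Z')
        # without -n, the pattern space is printed at the end of each cycle
        if [ "$quiet" != "-n" ]; then
            printf '%s\n' "$pattern_space"
        fi
    done
}

loud=$(printf 'one\ntwo\n' | mini_sed "")
silent=$(printf 'one\ntwo\n' | mini_sed -n)
printf 'without -n: %s\nwith -n: [%s]\n' "$loud" "$silent"
```

With -n nothing is printed at all, which is why the p command is needed to get any output in that mode.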
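
Address

The first is about addresses. Almost all of the commands above have this form (note: the ! inverts the address, so the command runs on the lines that do not match):

[address[,address]][!]{cmd}

An address can be a number or a pattern. Two addresses separated by a comma denote a range, and cmd is executed on the lines within it. In pseudocode:

bool bexec = false
foreach line in file {
    // address1 opens the range
    if ( match(address1) ) {
        bexec = true;
    }

    // inside the range: run the command
    if ( bexec == true ) {
        EXEC(sed_cmd);
    }

    // address2 closes the range; the closing line itself is still processed
    // (simplified: when address2 is a pattern, real sed starts testing it
    //  on the line after the one where address1 matched)
    if ( match(address2) ) {
        bexec = false;
    }
}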
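The ! in the address syntax is not used in any example so far, so here is a minimal sketch on made-up input: the command runs on exactly the lines the address does not select.

```shell
# comment out every line that does NOT match fish
out_not=$(printf 'cat\nfish\ngoat\n' | sed '/fish/!s/^/# /')

# delete every line outside the range 2,3 (keeps only lines 2 and 3)
out_keep=$(printf 'a\nb\nc\nd\n' | sed '2,3!d')

printf '%s\n' "$out_not" "$out_keep"
```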

An address can also be relative, for example:

# +3 means the 3 lines that follow the matching line (the addr,+N form is a GNU sed extension)
$ sed '/dog/,+3s/^/# /g' pets.txt
This is my cat
  my cat's name is betty
# This is my dog
#   my dog's name is frank
# This is my fish
#   my fish's name is george
This is my goat
  my goat's name is adam

Grouping commands

The second is that cmd can be more than one command: separate the commands with semicolons, and use braces to group them as nested commands. Here are a few examples:

$ cat pets.txt
This is my cat
  my cat's name is betty
This is my dog
  my dog's name is frank
This is my fish
  my fish's name is george
This is my goat
  my goat's name is adam

# On lines 3 through 6, run the command /This/d
$ sed '3,6 {/This/d}' pets.txt
This is my cat
  my cat's name is betty
  my dog's name is frank
  my fish's name is george
This is my goat
  my goat's name is adam

# On lines 3 through 6, if /This/ matches, then test /fish/; if that matches too, run d
$ sed '3,6 {/This/{/fish/d}}' pets.txt
This is my cat
  my cat's name is betty
This is my dog
  my dog's name is frank
  my fish's name is george
This is my goat
  my goat's name is adam

# From the first line to the last: delete lines that match This; strip leading spaces from the rest
$ sed '1,${/This/d;s/^ *//g}' pets.txt
my cat's name is betty
my dog's name is frank
my fish's name is george
my goat's name is adam
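Grouping also combines well with -n: a block can edit a line and then print it, so only the edited lines come out. A minimal sketch on made-up input; the semicolon before the closing brace is included because some sed implementations require it.

```shell
# on lines matching fish: substitute, then print; all other lines stay silent
out_group=$(printf 'my cat\nmy fish\nmy goat\n' | sed -n '/fish/{s/my/your/;p;}')

printf '%s\n' "$out_group"
```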

Hold Space

The third is the hold space. To understand it, start with these five commands:

g: copy the hold space into the pattern space, overwriting what the pattern space held
G: append a newline and then the hold space to the pattern space
h: copy the pattern space into the hold space, overwriting what the hold space held
H: append a newline and then the pattern space to the hold space
x: swap the contents of the pattern space and the hold space

What are these commands good for? Let's look at two examples, using this sample file:

$ cat t.txt
one
two
three

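First example:

$ sed 'H;g' t.txt

one

one
two

one
two
three

If that looks puzzling, trace the two buffers (the original post illustrates this with a diagram). H appends a newline plus the current line to the hold space, and g then copies the hold space back over the pattern space. The hold space starts out empty, so after line 1 both buffers hold "\none", which prints as an empty line followed by one. After line 2 they hold "\none\ntwo", and after line 3 "\none\ntwo\nthree".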

Second example, which reverses the lines of a file:

$ sed '1!G;h;$!d' t.txt
three
two
one

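The script '1!G;h;$!d' breaks down into three commands:

  • 1!G: every line except the first runs G, which appends the hold space to the pattern space
  • h: every line runs h, which copies the pattern space into the hold space
  • $!d: every line except the last runs d, which deletes the pattern space so nothing is printed for that line

This sequence is hard to follow, so here is a trace of the two buffers (the original post shows it as a diagram):

  • line 1 (one): G is skipped; h sets the hold space to "one"; d discards the pattern space
  • line 2 (two): G makes the pattern space "two\none"; h saves it to the hold space; d discards the pattern space
  • line 3 (three): G makes the pattern space "three\ntwo\none"; h saves it; d is skipped on the last line, so the pattern space is printed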
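As one more exercise with h and G, the sketch below swaps each pair of adjacent lines. It is only an illustration on made-up input with an even number of lines, and it uses the n command (not covered above), which replaces the pattern space with the next input line.

```shell
# h: save the odd line; n: load the even line; G: append the saved line; p: print both
out_pairs=$(printf 'a\nb\nc\nd\n' | sed -n 'h;n;G;p')

printf '%s\n' "$out_pairs"
```

Each cycle consumes two input lines and prints them in the opposite order.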

That's all for now. I hope it is useful to everyone.

(End of article)



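(If you repost articles from this site, please credit the author and the source, 酷 殼 – CoolShell, and do not use them for any commercial purpose.)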
