Linux Shell Programming: Learning Examples and Parameter Analysis (Part 3)

Date: 2020-01-24

Chapter 5: Text Filtering

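1. Regular expressions
A special syntax for describing text patterns, made up of ordinary characters and special characters (metacharacters).
^    ---- matches only at the start of a line
$    ---- matches only at the end of a line
*    ---- matches zero or more occurrences of the preceding single character
[]   ---- matches any one of the characters inside []; a range can be written with -, e.g. [1-5]
\    ---- escapes a metacharacter, removing its special meaning
.    ---- matches any single character
pattern\{n\}    matches exactly n occurrences of the preceding pattern
pattern\{n,\}   matches at least n occurrences of the preceding pattern
pattern\{n,m\}  matches between n and m occurrences of the preceding pattern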

eg:
A\{3\}B    matches AAAB
A\{3,\}B   matches AAAB AAAAB ...
A\{3,5\}B  matches AAAB AAAAB AAAAAB
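
The interval patterns above can be tried with grep, which uses the same basic regular expression syntax. This is a minimal sketch assuming a POSIX grep; the sample strings and the variable name `matches` are made up for illustration. The pattern is anchored with ^ and $ because an unanchored A\{3,5\}B would also match inside a longer run of A's.

```shell
# Sample input: runs of 2, 3, 4, 5 and 6 A's followed by B (illustrative data).
matches=$(printf '%s\n' AAB AAAB AAAAB AAAAAB AAAAAAB | grep '^A\{3,5\}B$')

# Print the lines that have between 3 and 5 A's before the B.
printf '%s\n' "$matches"
```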

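2. The find command     ---- search for files and directories
find pathname -options [-print -exec -ok]
pathname -- the directory to search; . means the current directory, / means the root directory
-print  print the matching file names
-exec   run the given shell command on each matching file, in the form command {} \;    (note the space between {} and \;)
-ok     same as -exec, but prompts before running each command
options:
-name   match by file name
-perm   match by permission bits
-user   match by owner
-group  match by group
-mtime -n +n (also -atime, -ctime)  modification time (access time, inode status-change time); -n means less than n days ago, +n means more than n days ago
-size n[c]  match by size: n 512-byte blocks, or n bytes with the c suffix
-type   match files of a given type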
eg.
[test@szbirdora 1]$ find ./ -mtime +5
./helloworld.sh
./nohup.out
Lists the files under ./ (the current directory) that were last modified more than 5 days ago.
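
The options above can be combined, as in the following sketch. It assumes `mktemp -d` is available (it is common but not required by POSIX), and the directory and file names are invented for the example, so nothing outside the temporary directory is touched.

```shell
# Work in a throwaway directory (all names here are illustrative).
dir=$(mktemp -d)
touch "$dir/new.sh" "$dir/notes.txt"
# Give one file a modification time far in the past (POSIX touch -t format).
touch -t 200001010000 "$dir/old.sh"

# Regular files whose names end in .sh.
scripts=$(find "$dir" -type f -name '*.sh' | sort)

# Regular files last modified more than 5 days ago.
old=$(find "$dir" -type f -mtime +5)

# Run a command on each match; {} is replaced by the file name.
find "$dir" -type f -name '*.sh' -exec ls -l {} \;

printf '%s\n' "$scripts" "$old"
rm -r "$dir"
```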

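3. Introduction to grep
grep -c  print only a count of matching lines
grep -i  ignore case
grep -h  do not print file names when searching multiple files
grep -H  print file names
grep -l  when searching multiple files, print only the names of files that contain a match
grep -n  print matching lines with their line numbers
grep -s  suppress error messages about nonexistent or unreadable files
grep -v  print all lines that do not contain a match (filters text out)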

eg.
[test@szbirdora 1]$ grep -n 's.a' myfile
2:/dev/sda1              20G 3.3G   16G 18% /
4:/dev/sda2              79G   18G   58G 23% /u01
5:/dev/sda4              28G 3.9G   22G 15% /u02
[test@szbirdora 1]$ grep -n '2$' myfile
5:/dev/sda4              28G 3.9G   22G 15% /u02
General form: grep -options 'regular-expression' filename
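
A few of the options listed above, applied to a small input. This is only a sketch: the sample lines loosely imitate the df output used in the transcript, and the variable names are hypothetical.

```shell
# Illustrative input, loosely modelled on the df output used above.
data=$(printf '%s\n' 'Filesystem Size' '/dev/sda1 20G' '/dev/sda2 79G' 'none 2.0G')

count=$(printf '%s\n' "$data" | grep -c 'sda')          # number of matching lines
numbered=$(printf '%s\n' "$data" | grep -n 'sda2')      # matching line with its line number
others=$(printf '%s\n' "$data" | grep -v 'sda')         # lines that do not match
nocase=$(printf '%s\n' "$data" | grep -i 'FILESYSTEM')  # case-insensitive match

printf '%s\n' "$count" "$numbered" "$others" "$nocase"
```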

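4. Introduction to sed
sed does not modify the original file. It works on a copy, and all changes are written to the screen unless the output is redirected to a file.
sed is an important text-filtering tool, used as a one-line command or in a pipeline together with grep and awk.
Invoking sed:
1. Command line: sed [options] 'regular-expression sedcommand' input-files
2. Script file: sed [options] -f sedscript input-files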

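How sed selects text:
- by line number, either a single number or a range of line numbers
- by regular expression
x ---- line number x
x,y ---- the range of lines from x to y
x,y! --- every line except lines x to y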

sed options:
-n  suppress automatic printing
-e  the next argument is an editing command
-f  use when invoking a sed script file

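Basic sed commands
p   print the matching lines
=   display the line number
a\  append new text after the addressed line
i\  insert new text before the addressed line
d   delete the addressed line
c\  replace the addressed text with new text
s   substitute a replacement for the matched pattern
r   read text from another file
w   write text to a file
q   quit after the first pattern match
l   display control characters as their octal ASCII equivalents
{}  run a group of commands on the addressed lines
n   read the next line of input; the commands that follow are applied to that line
g   copy the contents of the hold buffer over the pattern space
y   transliterate characters (map one set of characters to another)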

eg.
[test@szbirdora 1]$ sed -n '2p' myfile
c
Prints line 2 of myfile.
[test@szbirdora 1]$ sed -n '2,4p' myfile
c
f
b
Prints lines 2 through 4.
[test@szbirdora 1]$ sed -n '/a/p' myfile
a
Prints the lines that match a.

[test@szbirdora 1]$ sed -n '2,/2/p' myfile
c
f
b
1
2
Prints from line 2 through the next line that matches '2'.

Substitution with the s command
[test@szbirdora 1]$ sed 's/b/a/p' myfile
a
a
a
c
d
e
Replaces b with a. Because -n is not given, the p flag prints the changed line a second time.
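
The interaction between -n, the p flag, and the g flag is easier to see side by side. A minimal sketch with made-up input; the variable names are hypothetical.

```shell
# Illustrative input: three lines, one of which contains "b" twice.
input=$(printf '%s\n' a 'b b' c)

# Without -n, every line is printed and the changed line is printed again by p.
with_p=$(printf '%s\n' "$input" | sed 's/b/a/p')

# With -n, only lines where a substitution happened are printed.
only_changed=$(printf '%s\n' "$input" | sed -n 's/b/a/p')

# The g flag replaces every occurrence on the line, not just the first.
global=$(printf '%s\n' "$input" | sed 's/b/a/g')

printf '%s\n' "$with_p" '--' "$only_changed" '--' "$global"
```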

Multiple edits: -e
eg. (myfile contains the lines a through e)
[test@szbirdora 1]$ sed -e '2d' -e 's/c/d/' myfile 11
a
d
d
e

sed r command --- reads the contents of the named file and outputs them after each matching line of the input
eg.
[test@szbirdora 1]$ cat 11
*******************Alaska***************
[test@szbirdora 1]$ sed '/a/r 11' myfile
a
*******************Alaska***************
b
c
d
e

Write command: w   writes the matching lines of the input file to the specified file
eg.
[test@szbirdora 1]$ cat 11
b
[test@szbirdora 1]$ sed -n '/a/w 11' myfile
[test@szbirdora 1]$ cat 11
a

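Append: a   appends text after the matching line. sed requires a \ after a; when appending more than one line, end every line except the last with \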
eg.
[test@szbirdora 1]$ sed '/b/a\****************hello*************\-------------china---------' myfile
a
b
****************hello*************-------------china---------
c
d
e

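Insert command: i   inserts text before the matching line. sed requires a \ after i; when inserting more than one line, end every line except the last with \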
eg.
[test@szbirdora 1]$ sed '/b/i\
> THE CHARACTER B IS BEST\
> *******************************' myfile
a
THE CHARACTER B IS BEST
*******************************
b
c
d
e

Next: n   reads the next line of input; the commands that follow are applied to that line

Quit command: q   quits after printing up to the addressed line (here, line 3)
eg.
[test@szbirdora 1]$ sed '3q' myfile
a alert
b best
c cook

sed script:
sed -f scriptfile myfile
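
A script file holds one sed command per line. The sketch below repeats the two edits from the -e example in a script file; it assumes `mktemp -d` is available, and the names `edit.sed` and `myfile` inside the temporary directory are illustrative.

```shell
# Throwaway directory and file names are illustrative.
dir=$(mktemp -d)
printf '%s\n' a b c d e > "$dir/myfile"

# One sed command per line: delete line 2, then replace c with d.
cat > "$dir/edit.sed" <<'EOF'
2d
s/c/d/
EOF

result=$(sed -f "$dir/edit.sed" "$dir/myfile")
printf '%s\n' "$result"
rm -r "$dir"
```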

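5. Introduction to awk
awk scans files or strings and extracts information according to specified rules.
awk can be invoked in three ways:
1. On the command line
awk [-F field-separator] 'pattern {action}' input-files
awk [-F field-separator] 'command' input-files
2. As an awk script
Put all the awk commands in a file, make the file executable, and name the awk interpreter on the first line of the script so that it can be run by typing the script name.
3. With the awk commands in a separate file
awk -f awk-script-file input-files

An awk script consists of patterns and actions.
Separators, fields, and records

Note that $1, $2 here are fields, not the shell positional parameters $1, $2. $0 is the entire current record.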
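
Fields, records, and the -F separator can be seen on a single line of input. This is a sketch only; the sample record follows the colon-separated donors format that appears further down, and the variable names are hypothetical.

```shell
# One colon-separated record in the style of the donors file (sample data).
line='Mike Harrington:(510) 548-1278:250:100:175'

# -F: sets the field separator; $1 is the first field, NF the number of fields.
name=$(printf '%s\n' "$line" | awk -F: '{print $1}')
nfields=$(printf '%s\n' "$line" | awk -F: '{print NF}')
# $0 is the whole record and NR is its record number.
whole=$(printf '%s\n' "$line" | awk -F: '{print NR": "$0}')

printf '%s\n' "$name" "$nfields" "$whole"
```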

eg:
awk '{print $0}' myfile
awk 'BEGIN {print "IP DATE ----"}{print $1"\t"$4}END{print "end-of -report"}'
[test@szbirdora 1]$ df |awk '$1!~"dev"'|grep -v Filesystem
none                   1992400         0   1992400   0% /dev/shm
[test@szbirdora 1]$ df |awk '{if ($1=="/dev/sda1") print $0}'
/dev/sda1             20641788   3367972 16225176 18% /

[test@szbirdora shelltest]$ cat employee
Tom Jones       4424    5/12/66 543354
Mary Adams      5346    11/4/63 28765
Sally Chang     1654    7/22/54 650000
Billy Black     1683    9/23/44 336500
[test@szbirdora shelltest]$ awk '/[Aa]dams/' employee
Mary Adams      5346    11/4/63 28765
[test@szbirdora shelltest]$ sed -n '/[Aa]dams/p' employee
Mary Adams      5346    11/4/63 28765
[test@szbirdora shelltest]$ grep '[Aa]dams' employee
Mary Adams      5346    11/4/63 28765
The same pattern-matching query written with each of the three commands (awk, sed, grep).

[test@szbirdora shelltest]$ awk '{print $1}' employee
Tom
Mary
Sally
Billy
Prints the first column of the file.

[test@szbirdora shelltest]$ awk '/Sally/{print $1"\t"$2}' employee
Sally   Chang
Prints the first and second columns of the lines that match Sally.

[test@szbirdora shelltest]$ df |awk '$4>20884623'
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda2             82567220 17488436 60884616 23% /u01
/dev/sda4             28494620   4589172 22457992 17% /u02
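Prints the lines of the df output whose fourth column is greater than the given value. The header line is printed too: "Available" is not a number, so awk compares it as a string, and "A" sorts after "2".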

Formatted output:
The print function
[test@szbirdora shelltest]$ date
Mon Mar 10 15:15:47 CST 2008
[test@szbirdora shelltest]$ date |awk '{print "Month:" $2"\nYear:" $6}'
Month:Mar
Year:2008
[test@szbirdora shelltest]$ awk '/Sally/{print "\t\tHave a nice day,"$1"\t"$2}' employee
                Have a nice day,Sally   Chang

The printf function
[test@szbirdora shelltest]$ echo "LINUX"|awk '{printf "|%-10s|\n",$1}'
|LINUX     |
[test@szbirdora shelltest]$ echo "LINUX"|awk '{printf "|%10s|\n",$1}'
|     LINUX|

The ~ match operator
[test@szbirdora shelltest]$ awk '$1~/Tom/{print $1,$2}' employee
Tom Jones

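Expressions in awk
Relational operators:
<             less than
>             greater than
==            equal to
!=            not equal to
>=            greater than or equal to
<=            less than or equal to
~             matches
!~            does not match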
eg.
[test@szbirdora shelltest]$ cat employee
Tom Jones       4424    5/12/66 543354
Mary Adams      5346    11/4/63 28765
Sally Chang     1654    7/22/54 650000
Billy Black     1683    9/23/44 336500
[test@szbirdora shelltest]$ awk '$2~/Adams/' employee
Mary Adams      5346    11/4/63 28765

Conditional expression:
condition ? expression1 : expression2    (yields expression1 if condition is true, otherwise expression2)
eg.
awk '{max=($1>$2) ? $1:$2;print max}' filename
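
The conditional expression above can be run on inline data instead of a file. A minimal sketch; the number pairs and the variable name `max_list` are made up.

```shell
# Two columns of numbers (illustrative data); print the larger value of each pair.
max_list=$(printf '%s\n' '3 7' '10 2' '5 5' |
  awk '{ max = ($1 > $2) ? $1 : $2; print max }')

printf '%s\n' "$max_list"
```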

Operators
+, -, *, /, %, ^, &&, ||, !

[test@szbirdora shelltest]$ cat /etc/passwd |awk -F: '\
NF!=7{\
printf("line %d does not have 7 fields:%s\n",NR,$0)}\
$1!~/[A-Za-z0-9]/{printf("line %d,nonalphanumberic user id:%s\n",NR,$0)}\
$2=="*"{printf("line %d,no password:%s\n",NR,$0)}'

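awk programming
Increment operators: x++, ++x
Decrement operators: x--, --x

The BEGIN block
BEGIN is followed by an action block that runs before any input is read. It is typically used to change built-in variables such as FS, RS, and OFS, to initialize variables, and to print an output header.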
[test@szbirdora shelltest]$ awk 'BEGIN{print "HELLO WORLD"}'
HELLO WORLD
[test@szbirdora shelltest]$ awk 'BEGIN{print "---------LIST---------"}{print}END{print "------END--------"}' donors
---------LIST---------
Mike Harrington:(510) 548-1278:250:100:175
Christian Dobbins:(408) 538-2358:155:90:201
Susan Dalsass:(206) 654-6279:250:60:50
Archie McNichol:(206) 548-1348:250:100:175
Jody Savage:(206) 548-1278:15:188:150
Guy Quigley:(916) 343-6410:250:100:175
Dan Savage:(406) 298-7744:450:300:275
Nancy McNeil:(206) 548-1278:250:80:75
John Goldenrod:(916) 348-4278:250:100:175
Chet Main:(510) 548-5258:50:95:135
Tom Savage:(408) 926-3456:250:168:200
Elizabeth Stachelin:(916) 440-1763:175:75:300
------END--------
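
BEGIN is also the usual place to set FS and OFS. The following sketch uses two shortened records in the donors format; the column headings and the variable name `report` are assumptions made for the example.

```shell
# Two records in the donors format shown above (shortened sample data).
report=$(printf '%s\n' 'Mike Harrington:(510) 548-1278:250' 'Dan Savage:(406) 298-7744:450' |
  awk 'BEGIN { FS = ":"; OFS = "\t"; print "NAME", "AMOUNT" }
       { print $1, $3 }
       END { print NR " records" }')

printf '%s\n' "$report"
```

Setting FS in BEGIN has the same effect here as passing -F: on the command line, and OFS controls what print puts between comma-separated values.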

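Redirection and pipes
Output redirection
To redirect awk output to a file, use the output redirection operator; the output file name must be enclosed in double quotes.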
[test@szbirdora shelltest]$ awk -F: '{print $1,$2>"note"}' donors
[test@szbirdora shelltest]$ cat note
Mike Harrington (510) 548-1278
Christian Dobbins (408) 538-2358
Susan Dalsass (206) 654-6279
Archie McNichol (206) 548-1348
Jody Savage (206) 548-1278
Guy Quigley (916) 343-6410
Dan Savage (406) 298-7744
Nancy McNeil (206) 548-1278
John Goldenrod (916) 348-4278
Chet Main (510) 548-5258
Tom Savage (408) 926-3456
Elizabeth Stachelin (916) 440-1763

Input redirection
The getline function
[test@szbirdora shelltest]$ awk 'BEGIN{"date +%Y"|getline d;print d}'
2008

[test@szbirdora shelltest]$ awk -F"[ :]" 'BEGIN{printf "What is your name?";\
getline name<"/dev/tty"}\
$1~ name{print "Found\t" name "\ton line",NR"."}\
END{print "see ya," name "."}' donors
What is your name?Jody
Found   Jody    on line 5.
see ya,Jody.

[test@szbirdora shelltest]$ awk 'BEGIN{while((getline<"/etc/passwd")>0)lc++;print lc}'
36
When reading from a file, getline returns 1 if it gets a record, 0 at end of file, and -1 on error, for example when the file name is wrong.

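Pipes:
After awk opens a pipe, that pipe must be closed before the next one is opened. The command to the right of the pipe symbol is written in double quotes, and passing the same quoted string to close() closes the pipe. Only one pipe is kept open at a time.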
[test@szbirdora shelltest]$ awk '{print $1,$2|"sort -r +1 -2 +0 -1"}' names
tony tram
john smith
dan savage
john oldenrod
barbara nguyen
elizabeth lone
susan goldberg
george goldberg
eliza goldberg
alice cheba
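The command after | is enclosed in double quotes; close() with the same string closes the pipe. In current sort syntax the obsolete keys +1 -2 +0 -1 are written -k2,2 -k1,1.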
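
Closing the pipe explicitly matters when something has to be printed after the piped command's output. A minimal sketch with invented names; it pipes to plain `sort` and passes the same command string to close().

```shell
# Print "surname firstname" through sort, then close the pipe before the final line.
sorted=$(printf '%s\n' 'john smith' 'alice cheba' 'tony tram' |
  awk '{ print $2, $1 | "sort" }
       END {
         close("sort")   # flush and close the pipe so sort emits its output
         print "done"    # printed after the sorted lines
       }')

printf '%s\n' "$sorted"
```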

The system function
system("LINUX command")
system("cat " $1)
system("clear")

Conditional statements
1. if(){}
2. if(){}
else{}
3. if(){}
else if(){}
else if(){}
else{}
[test@szbirdora shelltest]$ awk -F: '{if ($3>250){printf "%-2s%13s\n",$1,"-----------good partman"}else{print $1}}' donors

Loop statements
[test@szbirdora shelltest]$ awk -F: '{i=1;while(i<=NF){print NF,$i;i++}}' donors
Loop control statements: break, continue

Program control statements
next reads the next line from the input file and then restarts the awk script from the top
{if($1~/Peter/){next}
else{print}}

exit stops the processing of input, but it does not skip the END block: the END actions still run.

Arrays:
awk '{name[x++]=$1} END{for(i=0;i<NR;i++){print i,name[i]}}' donors
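
A sketch of filling an array while the records are read and printing it once in the END block, so that each element appears a single time. The three input lines are taken from the employee sample used earlier; the variable name `names` is hypothetical.

```shell
# Store field 1 of every record, then list the array after all input is read.
names=$(printf '%s\n' 'Tom Jones' 'Mary Adams' 'Sally Chang' |
  awk '{ name[x++] = $1 }
       END { for (i = 0; i < NR; i++) print i, name[i] }')

printf '%s\n' "$names"
```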

(P177)---2008.3.11

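awk built-in functions
sub(regular expression, replacement string[, $n]) --- the first string in field n that matches the regular expression is replaced; without $n, the whole record $0 is used.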
[test@szbirdora shelltest]$ awk '{sub(/Tom/,"Jack",$1);print}' employee
Jack Jones 4424 5/12/66 543354
Mary Adams      5346    11/4/63 28765
Sally Chang     1654    7/22/54 650000
Billy Black     1683    9/23/44 336500
Jack He 3000 8/22/44 320000

The index function: index(string, substring) returns the position of substring within string
[test@szbirdora shelltest]$ awk 'BEGIN{a=index("hello","llo");print a}'
3

The length function: length(string) returns the length of the string
[test@szbirdora shelltest]$ awk 'BEGIN{a=length("hello world");print a}'
11

The substr function: substr(string, start position[, substring length])
[test@szbirdora shelltest]$ awk 'BEGIN{a=substr("hello world",7);print a}'
world
[test@szbirdora shelltest]$ awk 'BEGIN{a=substr("hello world",7,3);print a}'
wor

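match(string, regular expression) returns the position of the first match of the regular expression in string (0 if there is none). It also sets the built-in variables RSTART, the position where the match starts, and RLENGTH, the length of the matched text.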
[test@szbirdora shelltest]$ awk '{a=match($0,/Jon/);if (a!=0){print NR,a}}' employee
1 5
[test@szbirdora shelltest]$ awk '{a=match($0,/Jon/);if (a!=0){print NR,a,RSTART,RLENGTH}}' employee
1 5 5 3

The toupper and tolower functions
[test@szbirdora shelltest]$ awk 'BEGIN{a=toupper("hello");print a}'
HELLO

The split function: split(string, array, fieldseparator)
[test@szbirdora shelltest]$ awk 'BEGIN{"date"|getline d;split(d,date);print date[2]}'
Mar

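Time functions (not part of POSIX awk; available in gawk)
systime() ---- the number of seconds since 1970-01-01 00:00:00 UTC, not counting leap seconds.
strftime(format, timestamp) ---- formats a timestamp according to the format string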
[test@szbirdora shelltest]$ awk 'BEGIN{d=strftime("%T",systime());print d}'
13:08:09
[test@szbirdora shelltest]$ awk 'BEGIN{d=strftime("%D",systime());print d}'
03/12/08
[test@szbirdora shelltest]$ awk 'BEGIN{d=strftime("%Y",systime());print d}'
2008

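6. Introduction to sort

sort:
     -c  check whether the file is already sorted
     -m  merge two sorted files
     -u  remove all duplicate lines
     -o  name of the output file for the sorted result
     -t  field separator; use a character other than space or tab to split fields
     +n  n is a field number; sorting starts at this field (note that 0 is the first column). This is obsolete syntax; current sort uses -k, which counts fields from 1
     -r  reverse the sort order
     -n  sort the field numerically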
[test@szbirdora 1]$ df -lh|grep -v 'Filesystem'|sort +1
none                  2.0G     0 2.0G   0% /dev/shm
/dev/sda1              20G 3.3G   16G 18% /
/dev/sda4              28G 3.9G   22G 15% /u02
/dev/sda2              79G   17G   59G 23% /u01
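
Current versions of sort select fields with -k rather than the +n form. The sketch below sorts made-up name:amount records on the second field, numerically and in reverse; the n and r modifiers are attached to the key itself.

```shell
# Colon-separated sample records, name:amount (illustrative data).
by_amount=$(printf '%s\n' 'mike:250' 'jody:15' 'dan:450' | sort -t: -k2,2nr)

printf '%s\n' "$by_amount"
```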

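uniq [option] files    removes or suppresses repeated lines in a text file
     -u    show only the lines that are not repeated
     -d    show only the lines that are repeated, one copy of each
     -c    prefix each line with the number of times it occurs
     -f n  n is a number; the first n fields are ignored when comparing

Note: sort the input first, because uniq only compares adjacent lines.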
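
A sketch of the usual sort-then-uniq pipeline on a made-up word list. The final awk step only normalizes the padding that uniq -c puts in front of each count; all variable names are hypothetical.

```shell
# Illustrative word list with duplicates.
words=$(printf '%s\n' pear apple pear fig apple pear)

# sort groups identical lines together; uniq -c then counts each group.
counts=$(printf '%s\n' "$words" | sort | uniq -c | awk '{print $2, $1}')

# -u keeps lines that occur once, -d keeps one copy of lines that repeat.
once=$(printf '%s\n' "$words" | sort | uniq -u)
repeated=$(printf '%s\n' "$words" | sort | uniq -d)

printf '%s\n' "$counts" "$once" "$repeated"
```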

7. split, cut, join: commands for splitting and merging files
[test@szbirdora 1]$ split -l 2 myfile split
(splits myfile every two lines into files whose names start with split)
[test@szbirdora 1]$ ls
case.sh df.out helloworld.sh iftest.sh myfile nohup.out nullfile.txt parm.sh splitaa splitab splitac splitad splitae
[test@szbirdora 1]$ cat splitaa
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 20G 3.3G 16G 18% /