Linux Study Notes: Detailed awk Usage

Date: 2020-07-23

Tags: linux, study notes, awk, detailed usage. Category: Linux


1. Basic usage

awk is a report-generation tool: it reads a file line by line, splits each line into fields, formats the fields, and prints the result.

[Linux85]#awk -h
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:      GNU long options:
    -f progfile     --file=progfile
    -F fs           --field-separator=fs    # field separator
    -v var=val      --assign=var=val
    -m[fr] val

awk [options] 'script' FILE ...
awk [options] '/pattern/{action}' FILE ...

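Four separators: the record separator and the field separator, each with an input and an output variant.

Record (line) separator: end of line ($), that is, a newline

Field separator: whitespace

Patterns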

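Address range	/pattern1/,/pattern2/
/pattern/	regex match; can be negated with !
expression	an expression using >, >=, <, <=, ==, !=, ~
BEGIN{}	runs once before the traversal of the input begins
END{}	runs once after the traversal ends, before the command exits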
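The pattern kinds in the table above can be tried on a few lines of made-up input. This is only a sketch: the marker words START and STOP and the variable names are assumptions chosen for illustration, not part of the original notes.

```shell
# Hypothetical sample input: six short lines.
sample=$(printf '%s\n' 'a' 'START' 'b' 'c' 'STOP' 'd')

# Range pattern: prints from the first matching line through the second.
range_out=$(printf '%s\n' "$sample" | awk '/START/,/STOP/{print $0}')

# Negated regex: prints only the lines that do NOT start with S.
neg_out=$(printf '%s\n' "$sample" | awk '!/^S/{print $0}')

# BEGIN runs once before the input, END once after it.
count_out=$(printf '%s\n' "$sample" | awk 'BEGIN{n=0} {n++} END{print n}')

printf '%s\n' "$range_out" "$neg_out" "$count_out"
```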

[Linux85]#awk '/^soul/{print $0}' /etc/passwd /etc/shadow /etc/group
soul:x:501:501::/home/soul:/bin/bash
soul:!!:16166:0:99999:7:::
soul:x:501:
[Linux85]#

# Users whose UID is greater than or equal to 500
[Linux85]#awk -F : '$3>=500{print $1}' /etc/passwd
nfsnobody
gentoo
soul
[Linux85]#

BEGIN: actions that run before processing starts
[Linux85]#awk -F : 'BEGIN{print "UserName\n***********"}$3>=500{print $1}' /etc/passwd
UserName
***********
nfsnobody
gentoo
soul
[Linux85]#

awk built-in variables:

NF	number of fields in the current input record
FS	field separator used when reading input
RS	record separator: the line terminator used when reading input (a newline by default)
OFS	output field separator; a single space by default
ORS	output record separator (a newline by default)

[Linux85]#awk -F : '/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash
[Linux85]#awk 'BEGIN{FS=":"}/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash
[Linux85]#awk 'BEGIN{FS=":";OFS=":"}/^soul/{print $1,$7}' /etc/passwd
soul:/bin/bash
[Linux85]#
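The transcripts above cover FS and OFS; the record separators RS and ORS behave the same way at the record level. A minimal sketch on made-up input (the sample strings and variable names are assumptions):

```shell
# OFS is inserted wherever print items are separated by commas.
ofs_out=$(printf 'soul:x:501\n' | awk 'BEGIN{FS=":"; OFS="-"} {print $1, $3}')

# ORS replaces the newline that print appends after each record.
ors_out=$(printf 'a\nb\nc\n' | awk 'BEGIN{ORS=";"} {print $0}')

# RS changes what counts as a record; here records are split on commas.
rs_out=$(printf 'x,y,z' | awk 'BEGIN{RS=","} {print NR ": " $0}')

printf '%s\n' "$ofs_out" "$ors_out" "$rs_out"
```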

[Linux85]#awk '!/^$|^#/{print $1}' /etc/sysctl.conf
net.ipv4.ip_forward
net.ipv4.conf.default.rp_filter
net.ipv4.conf.default.accept_source_route
kernel.sysrq
kernel.core_uses_pid
net.ipv4.tcp_syncookies
net.bridge.bridge-nf-call-ip6tables
net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-arptables
kernel.msgmnb
kernel.msgmax
kernel.shmmax
kernel.shmall
[Linux85]#

[Linux85]#ifconfig | awk '/inet addr/{print $2}' | awk -F : '!/127/{print $2}'
172.16.251.85
[Linux85]#

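2. Advanced awk usage

(1) print output: print item1, item2, ...

Items are separated by commas in the statement; in the output they are separated by the value of OFS (a single space by default).
An item can be a string or a number, a field of the current record (such as $1), a variable, or an awk expression; numbers are converted to strings before they are output.
The items after print can be omitted, in which case the statement is equivalent to print $0; so to output a blank line, use print "".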
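A short sketch of the three print rules above; the input text and variable names are made up for illustration.

```shell
# Comma-separated items are joined with OFS (a space by default).
comma_out=$(awk 'BEGIN{print "a", "b"}')

# Items placed side by side without a comma are concatenated.
concat_out=$(awk 'BEGIN{print "a" "b"}')

# A bare print outputs $0; print "" outputs an empty line.
bare_out=$(printf 'one two\n' | awk '{print; print ""; print $2}')

printf '%s\n' "$comma_out" "$concat_out" "$bare_out"
```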

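(2) printf output: printf format, item1, item2, ...

The main difference from print is that printf requires a format.
The format specifies how each of the following items is output.
printf does not print a newline automatically; add \n to the format explicitly.

Every format specifier starts with % followed by a single character:

%c	prints a single character (a numeric item is treated as a character code)
%d, %i	decimal integer
%e, %E	number in scientific notation
%f	floating-point number
%g, %G	number in scientific or floating-point notation, whichever is shorter
%s	string
%u	unsigned integer
%%	a literal %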

[Linux85]#awk 'BEGIN{num1=20;num2=30; printf "%d %d\n",num1,num2}'
20 30
[Linux85]#
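# The output follows the format string; each specifier is filled in by the corresponding item, so specifiers and items must match one to one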
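A few of the specifiers can be checked directly from a BEGIN block. This is a sketch with arbitrary sample values; the variable names are assumptions.

```shell
# %c: a number is treated as a character code, a string contributes its first character.
char_out=$(awk 'BEGIN{printf "%c%c\n", 65, "hello"}')

# %% prints a literal percent sign.
pct_out=$(awk 'BEGIN{printf "%d%%\n", 50}')

# %e prints scientific notation.
sci_out=$(awk 'BEGIN{printf "%e\n", 1234.5}')

# printf adds no newline on its own, so the two calls share one line.
join_out=$(awk 'BEGIN{printf "a"; printf "b\n"}')

printf '%s\n' "$char_out" "$pct_out" "$sci_out" "$join_out"
```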

Modifiers

N	field width
-	left-justify
+	always show the sign of a number, positive or negative

[Linux85]#awk -F: '{printf "%-14s %s\n",$1,$NF}' /etc/passwd
root           /bin/bash
bin            /sbin/nologin
daemon         /sbin/nologin
adm            /sbin/nologin
lp             /sbin/nologin
sync           /bin/sync
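The modifiers can be seen in isolation with bracketed output that makes the padding visible. A sketch with arbitrary values; the precision form %.2f is not in the list above and is included only as a commonly used extra.

```shell
# Width 5, right-justified by default.
width_out=$(awk 'BEGIN{printf "[%5d]\n", 42}')

# The - modifier left-justifies within the same width.
left_out=$(awk 'BEGIN{printf "[%-5d]\n", 42}')

# The + modifier shows the sign for positive numbers too.
sign_out=$(awk 'BEGIN{printf "%+d %+d\n", 7, -7}')

# Precision (not listed above): two digits after the decimal point.
prec_out=$(awk 'BEGIN{printf "%.2f\n", 3.14159}')

printf '%s\n' "$width_out" "$left_out" "$sign_out" "$prec_out"
```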

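(3) Built-in variables: data variables

NR	number of input records processed so far; with several input files, the count runs on across all of them
NF	number of fields in the current record
FNR	record number within the current file; unlike NR, it restarts at 1 for each input file
ARGV	array holding the command-line arguments; for awk '{print $0}' a.txt b.txt, ARGV[0] is awk and ARGV[1] is a.txt (the program text is not stored)
ARGC	number of command-line arguments, counting ARGV[0]
FILENAME	name of the file currently being processed
ENVIRON	associative array of the current shell environment variables and their values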

[Linux85]#awk '{print NR,$0}' 1.txt
1 one line
2 two line
3 three line
4 four line
5 five line
[Linux85]#awk '{print NR,$0}' 2.txt
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85]#awk '{print NR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
6 six line
7 seven line
8 eight line
9 nine line
10 ten line
[Linux85]#
[Linux85]#awk '{print FNR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85]#

[Linux85]#awk -F: '/root/{print $1,"is a user in",ARGV[1]}' /etc/passwd
root is a user in /etc/passwd
operator is a user in /etc/passwd
[Linux85]#

[Linux85]#awk 'BEGIN{print ARGC}' /etc/passwd /etc/group /etc/shadow
4
[Linux85]#
# ARGC is 4 because ARGV[0] (awk itself) is counted along with the three file names; the program text 'BEGIN{print ARGC}' is not

[Linux85]#awk '{print $0,"in",  FILENAME}' 1.txt 2.txt
one line in 1.txt
two line in 1.txt
three line in 1.txt
four line in 1.txt
five line  in 1.txt
six line in 2.txt
seven line in 2.txt
eight line in 2.txt
nine line in 2.txt
ten line in 2.txt
[Linux85]#
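The difference between NR, FNR and FILENAME can be reproduced without the 1.txt and 2.txt files used above. The sketch below creates two throwaway files in a temporary directory; the file names a.txt and b.txt and the variable names are assumptions.

```shell
tmpdir=$(mktemp -d)
printf 'one\ntwo\n' > "$tmpdir/a.txt"
printf 'three\n'    > "$tmpdir/b.txt"

# NR keeps counting across files, FNR restarts at 1 for each file.
nr_out=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR}' a.txt b.txt)

# FNR==1 is therefore true on the first line of every file.
first_out=$(cd "$tmpdir" && awk 'FNR==1{print FILENAME ": " $0}' a.txt b.txt)

# ARGC counts ARGV[0] plus the two file names; the program text is excluded.
argc_out=$(cd "$tmpdir" && awk 'BEGIN{print ARGC, ARGV[1]}' a.txt b.txt)

rm -rf "$tmpdir"
printf '%s\n' "$nr_out" "$first_out" "$argc_out"
```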

(4) Output redirection

print items > output-file

print items >> output-file

print items | command

Special file names:

/dev/stdin: standard input
/dev/stdout: standard output
/dev/stderr: standard error
/dev/fd/N: a specific file descriptor; for example, /dev/stdin is equivalent to /dev/fd/0
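The three redirection forms and /dev/stderr can be exercised as follows. This is a sketch under assumptions: the output file out.txt lives in a temporary directory, and its path is passed in through a variable named out, both chosen only for this example.

```shell
tmpdir=$(mktemp -d)

# > creates or truncates the file when awk first opens it; later prints in the same run append.
printf 'b\na\n' | awk -v out="$tmpdir/out.txt" '{print $0 > out}'

# >> appends to a file that already exists.
printf 'c\n' | awk -v out="$tmpdir/out.txt" '{print $0 >> out}'
file_out=$(cat "$tmpdir/out.txt")

# | sends the printed lines to a command, here sort.
pipe_out=$(printf 'b\na\n' | awk '{print $0 | "sort"}')

# Printing to /dev/stderr writes to standard error; 2>&1 lets us capture it here.
err_out=$(printf 'x\n' | awk '{print "warn: " $0 > "/dev/stderr"}' 2>&1)

rm -rf "$tmpdir"
printf '%s\n' "$file_out" "$pipe_out" "$err_out"
```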

(5) awk operators

Arithmetic operators	Assignment operators	Comparison operators
-x: negation	=	x < y True if x is less than y.
+x: convert to number	+=	x <= y True if x is less than or equal to y.
x^y: exponentiation	-=	x > y True if x is greater than y.
x**y: exponentiation	*=	x >= y True if x is greater than or equal to y.
x*y	/=	x == y True if x is equal to y.
x/y	%=	x != y True if x is not equal to y.
x+y	^=	x ~ y True if the string x matches the regexp denoted by y.
x-y	**=	x !~ y True if the string x does not match the regexp denoted by y.
x%y	++	subscript in array True if the array array has an element with the subscript subscript.
	--

In awk, any nonzero number or non-empty string is true; anything else is false.

Conditional expression:

selector ? if-true-exp : if-false-exp
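The truth rule and the conditional expression can be combined in a small sketch. The input lines and the labels admin and user are made up for illustration.

```shell
# The conditional expression picks one of two values per record.
cond_out=$(printf 'root:0\nbin:1\n' | awk -F: '{label = ($2 == 0) ? "admin" : "user"; print $1, label}')

# 0 and the empty string are false; a nonzero number and a non-empty string are true.
truth_out=$(awk 'BEGIN{
    a = 0   ? "T" : "F"
    b = ""  ? "T" : "F"
    c = 5   ? "T" : "F"
    d = "x" ? "T" : "F"
    print a, b, c, d
}')

printf '%s\n' "$cond_out" "$truth_out"
```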

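(6) Patterns and common pattern types

Form:

awk 'program' input-file1 input-file2 ...

program:

pattern { action }
pattern { action }
....

Common patterns:

Regexp	regular expression, written /regular expression/
expression	an expression; the condition holds when its value is nonzero or a non-empty string, e.g. $1 ~ /foo/ or $1 == "soul"; the operators ~ (matches) and !~ (does not match) test a value against a regex
Ranges	a range of lines, written pat1,pat2
BEGIN/END	special patterns; run once before any input is read and once after all input has been processed
Empty (no pattern)	matches every input line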
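The expression and empty patterns from the table can be tried on three made-up records. The names and numbers below are assumptions chosen so that each pattern selects something different.

```shell
data=$(printf '%s\n' 'soul 501' 'root 0' 'football 600')

# Regex match against one field.
regex_out=$(printf '%s\n' "$data" | awk '$1 ~ /^foo/ {print $1}')

# String equality.
eq_out=$(printf '%s\n' "$data" | awk '$1 == "soul" {print $2}')

# Numeric comparison.
num_out=$(printf '%s\n' "$data" | awk '$2 >= 500 {print $1}')

# Negated match: only names without "oo".
not_out=$(printf '%s\n' "$data" | awk '$1 !~ /oo/ {print $1}')

# Empty pattern: the action runs for every line.
all_out=$(printf '%s\n' "$data" | awk '{n++} END{print n}')

printf '%s\n' "$regex_out" "$eq_out" "$num_out" "$not_out" "$all_out"
```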

Common actions

Expressions
Control statements
Compound statements
Input statements
Output statements

(7) Control statements

if-else

Syntax: if (condition) {then-body} [else {else-body}]

[Linux85]#awk -F : 'BEGIN{OFS=":"}{if ($3==0) {print $1,"Administrator";} else {print $1,"Common User"}}' /etc/passwd
root:Administrator
bin:Common User
daemon:Common User
adm:Common User
lp:Common User
sync:Common User
shutdown:Common User

[Linux85]#awk -F: '{if ($1=="root") printf "%-15s: %s\n",$1,"Admin";else printf "%-15s: %s\n",$1,"Common User"}' /etc/passwd
root           : Admin
bin            : Common User
daemon         : Common User
adm            : Common User
lp             : Common User
sync           : Common User
shutdown       : Common User
halt           : Common User
mail           : Common User
uucp           : Common User
operator       : Common User
games          : Common User
gopher         : Common User
ftp            : Common User
nobody         : Common User
dbus           : Common User
usbmuxd        : Common User

[Linux85]#awk -F: -v sum=0 '{if ($3>=500) sum++}END{print sum}' /etc/passwd
3
# Counts the users whose UID is >= 500

while

Syntax: while (condition) {statement1; statement2; ...}

[Linux85]#awk -F : '{i=1;while (i<=3) {print $i;i++}}' /etc/passwd
root
x
0
bin
x
1
# Prints the first three fields of each line of /etc/passwd

[Linux85]#awk -F: '{i=1;while (i<=NF) { if (length($i)>=4) {print $i}; i++ }}' /etc/passwd
root
root
/root
/bin/bash
/bin
/sbin/nologin

do-while: the loop body runs at least once, whether or not the condition holds

Syntax: do {statement1; statement2; ...} while (condition)

[Linux85]#awk -F: '{i=1;do {print $i;i++}while(i<=3)}' /etc/passwd
root
x
0
bin
x
1
daemon
x
2

[Linux85]#awk -F: '{i=4;do {print $i;i--}while(i>4)}' /etc/passwd
0
1
2
4
7
0
0
0
12
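The "at least once" behaviour is easiest to see when the condition is false from the start. A minimal sketch with an arbitrary counter value:

```shell
# while: the condition is checked first, so the body never runs.
while_out=$(awk 'BEGIN{
    i = 5
    while (i < 3) { print "while", i; i++ }
    print "done"
}')

# do-while: the body runs once before the condition is checked.
do_out=$(awk 'BEGIN{
    i = 5
    do { print "do", i; i++ } while (i < 3)
    print "done"
}')

printf '%s\n' "$while_out" "$do_out"
```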

for

Syntax: for (variable assignment; condition; iteration process) {statement1; statement2; ...}

[Linux85]#awk -F: '{for(i=1;i<=3;i++) if (i<3){printf "%s:",$i} print $i}' /etc/passwd
root:x:0
bin:x:1
daemon:x:2
adm:x:4
lp:x:7
sync:x:0
shutdown:x:0
# Note: without braces, the for body is only the if statement, so print $i runs once after the loop with i=4; the last column shown is field 4 (the GID), not field 3

Iterating over array elements with for

Syntax: for (i in array) {statement1; statement2; ...}

[Linux85]#awk -F: '$NF!~/^$/{BASH[$NF]++}END{for(A in BASH){printf "%15s:%i\n",A,BASH[A]}}' /etc/passwd
 /sbin/shutdown:1
       /bin/csh:1
      /bin/bash:2
  /sbin/nologin:29
     /sbin/halt:1
      /bin/sync:1
[Linux85]#
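# Counts how many times each value of the last field (the login shell) occurs

case
Syntax: switch (expression) { case VALUE or /REGEXP/: statement1; statement2; ... default: statement1; ...} (switch is a gawk extension)
break and continue
next
Ends processing of the current line early and moves on to the next line.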

[Linux85]#awk -F: '{if($3%2==0) next;print $1,$3}' /etc/passwd
bin 1
adm 3
sync 5
halt 7
operator 11
gopher 13
nobody 99
dbus 81
usbmuxd 113
vcsa 69
rtkit 499
abrt 173
postfix 89
rpcuser 29
pulse 497
soul 501
[Linux85]#
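break and continue have no transcript above, so here is a small sketch of both next to next. The colon-separated input and the word stop are made up for illustration.

```shell
# continue skips the empty field; break leaves the loop at the field "stop".
loop_out=$(printf 'a::b:stop:c\n' | awk -F: '{
    for (i = 1; i <= NF; i++) {
        if ($i == "")     continue   # go on to the next field
        if ($i == "stop") break      # leave the loop entirely
        print i, $i
    }
}')

# next abandons the current record, so the second rule only sees odd numbers.
next_out=$(printf '1\n2\n3\n4\n' | awk '$1 % 2 == 0 {next} {print}')

printf '%s\n' "$loop_out" "$next_out"
```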

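(8) Arrays

array[index-expression]

index-expression can be any string. Note that if an array element does not exist, referencing it makes awk create it and initialize it to the empty string; so to test whether an element exists, use the form index in array.
To iterate over every element of an array, use this special construct:

for (var in array) { statement1; ... }

Here var takes each array subscript in turn, not the element value.

Delete an array element: delete array[index]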

[Linux85]#netstat -ant | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
LISTEN 10
[Linux85]#
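The point about references creating elements is easy to miss, so the sketch below contrasts the in test with a plain reference and then uses delete. The array name count and its keys are assumptions.

```shell
arr_out=$(awk 'BEGIN{
    count["a"] = 1

    # The in test does not create the element.
    if ("b" in count) print "b present"; else print "b absent"

    # A plain reference does create it, initialized to the empty string.
    if (count["c"] == "") print "c looks empty"
    if ("c" in count) print "c now exists"

    # delete removes the element again, leaving only "a".
    delete count["c"]
    n = 0
    for (k in count) n++
    print n
}')

printf '%s\n' "$arr_out"
```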

(9) awk built-in functions

split(string, array [, fieldsep [, seps ] ])
Splits string on the separator fieldsep and stores the pieces in the array named array; the array is indexed from 1.

[Linux85]#df -lh | awk '!/^File/{split($5,percent,"%");if(percent[1]>=10){print $1}}'
/dev/sda1
/dev/mapper/vg0-usr
[Linux85]#
# Shows the file systems whose disk usage is at least 10%

length([string]): returns the number of characters in string

[Linux85]#awk -F: '{for(i=1;i<=NF;i++) { if (length($i)>=4) {print $i}}}' /etc/passwd
root
root
/root
/bin/bash
/bin
/sbin/nologin
daemon
daemon
/sbin
/sbin/nologin

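substr(string, start [, length ])
Returns the substring of string that starts at position start and is length characters long; positions are counted from 1.
system(command): runs command through the shell and returns its exit status to awk; the command's own output goes to standard output
systime(): returns the current system time as the number of seconds since the epoch
tolower(s): converts all letters in s to lowercase
toupper(s): converts all letters in s to uppercase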
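The string functions listed above can be checked together in one BEGIN block. A sketch with an arbitrary sample string; system() is called with the shell command true, which prints nothing and exits with status 0.

```shell
str_out=$(awk 'BEGIN{
    s = "Hello, awk"
    print length(s)          # number of characters
    print substr(s, 1, 5)    # five characters starting at position 1
    print substr(s, 8)       # from position 8 to the end
    print toupper(s)
    print tolower(s)
    n = split("a:b:c", parts, ":")
    print n, parts[1], parts[3]
    rc = system("true")      # returns the exit status of the command
    print "status", rc
}')

printf '%s\n' "$str_out"
```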

(10) User-defined functions

A user-defined function is declared with the function keyword, in this form:

function F_NAME([variable])
{
    statements
}
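As a concrete instance of the form above, here is a minimal sketch. The function name kind and the two-field input are hypothetical; the labels mirror the if-else example earlier in these notes.

```shell
fn_out=$(printf 'root:0\nsoul:501\n' | awk -F: '
# kind() is a hypothetical helper that maps a UID to a label.
function kind(uid)
{
    return (uid == 0) ? "Administrator" : "Common User"
}
{ print $1, kind($2) }')

printf '%s\n' "$fn_out"
```

Defining the function once keeps the per-record rule short, and the same helper could be reused in several rules of the same program.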

Examples:

# Count, for each client IP, the connections on this system that are in the ESTABLISHED state
[Linux85]#netstat -tn | awk '/ESTABLISHED\>/{split($5,ip,":");num[ip[1]]++}END{for (i in num) printf "%s %d\n", i, num[i]}'
172.16.254.28 2
[Linux85]#

# Count the processes in each state at the time ps aux runs
[Linux85]#ps aux | awk '!/^USER/{state[$8]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
S< 2
S<sl 1
Ss 18
SN 1
S 69
Ss+ 6
Ssl 2
R+ 1
S+ 2
Sl 2
S<s 1
[Linux85]#

# Count the processes owned by each user at the time ps aux runs
[Linux85]#ps aux | awk '!/^USER/{state[$1]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
rpc 1
dbus 1
68 2
postfix 2
rpcuser 1
root 96
gentoo 2
[Linux85]#

# Show the PID and command of every process whose VSZ (virtual memory size) is greater than 10000 at the time ps aux runs
[Linux85]#ps aux | awk '!/USER/{if($5>10000) print $2,$11}'
1 /sbin/init
397 /sbin/udevd
1184 auditd
1209 /sbin/rsyslogd
1251 rpcbind
1282 dbus-daemon
1292 NetworkManager
1297 /usr/sbin/modem-manager
1311 rpc.statd
1344 cupsd
1354 /usr/sbin/wpa_supplicant
1392 hald