1. Basic usage
awk: a report-generation tool. It reads a file line by line, splits each line into fields, formats the fields, and prints the result.
[Linux85]#awk -h
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:          GNU long options:
        -f progfile             --file=progfile
        -F fs                   --field-separator=fs     # field separator
        -v var=val              --assign=var=val
        -m[fr] val
Common invocation forms:
awk [options] 'script' FILE ...
awk [options] '/pattern/{action}' FILE ...
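Four kinds of separators (one pair for input, one pair for output):
Record (line) separator: end of line (newline) by default
Field separator: whitespace by default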
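Patterns
/pattern1/,/pattern2/ | address range: from a line matching pattern1 through the next line matching pattern2 |
/pattern/ | lines matching the regular expression; prefix with ! to negate |
expression | an expression built with >, >=, <, <=, ==, !=, ~ |
BEGIN{} | runs once before the traversal of the input begins |
END{} | runs once after the traversal ends and before the command exits |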
[Linux85]#awk '/^soul/{print $0}' /etc/passwd /etc/shadow /etc/group
soul:x:501:501::/home/soul:/bin/bash
soul:!!:16166:0:99999:7:::
soul:x:501:
# Users whose UID is 500 or greater
[Linux85]#awk -F : '$3>=500{print $1}' /etc/passwd
nfsnobody
gentoo
soul
# BEGIN: an action that runs before any input is processed
[Linux85]#awk -F : 'BEGIN{print "UserName\n***********"}$3>=500{print $1}' /etc/passwd
UserName
***********
nfsnobody
gentoo
soul
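The three pattern kinds above (BEGIN, an expression, END) can be combined in one program. The following is a minimal sketch on made-up data rather than the real /etc/passwd; the names alice and bob and the variable n are assumptions introduced only for illustration.

```shell
# Hypothetical sample data in /etc/passwd layout (name:password:UID:...).
sample='root:x:0:0:root:/root:/bin/bash
alice:x:500:500::/home/alice:/bin/bash
bob:x:501:501::/home/bob:/bin/sh'

# BEGIN prints a header once, the expression pattern selects UID >= 500,
# and END prints the number of selected lines once at the end.
out=$(printf '%s\n' "$sample" |
  awk -F: 'BEGIN { print "UserName" }
           $3 >= 500 { print $1; n++ }
           END { print "matched:", n }')
printf '%s\n' "$out"
```

Because the input is embedded in the script, the result does not depend on the accounts present on the machine.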
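awk built-in variables:
NF | number of fields in the current input record |
FS | field separator: the separator used to split each input record into fields; whitespace by default |
RS | record separator: the separator that ends each input record; newline by default |
OFS | output field separator: placed between items that print separates with commas; a single space by default |
ORS | output record separator: appended after each print; newline by default |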
[Linux85]#awk -F : '/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash
[Linux85]#awk 'BEGIN{FS=":"}/^soul/{print $1,$7}' /etc/passwd
soul /bin/bash
[Linux85]#awk 'BEGIN{FS=":";OFS=":"}/^soul/{print $1,$7}' /etc/passwd
soul:/bin/bash
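The sessions above set FS and OFS only. As a rough sketch, all four separators can be set together in BEGIN; the input string and the separator choices below are made up for illustration.

```shell
# Hypothetical input: records separated by ";" and fields separated by ":".
out=$(printf 'a:b;c:d;' |
  awk 'BEGIN { FS = ":"; RS = ";"; OFS = "-"; ORS = ";\n" }
       NF == 2 { print $1, $2 }')
printf '%s\n' "$out"
```

FS and RS control how the input is split, while OFS and ORS only affect what print writes, so each record comes out as its two fields joined by "-" and terminated by ";" plus a newline.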
[Linux85]#awk '!/^$|^#/{print $1}' /etc/sysctl.conf
net.ipv4.ip_forward
net.ipv4.conf.default.rp_filter
net.ipv4.conf.default.accept_source_route
kernel.sysrq
kernel.core_uses_pid
net.ipv4.tcp_syncookies
net.bridge.bridge-nf-call-ip6tables
net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-arptables
kernel.msgmnb
kernel.msgmax
kernel.shmmax
kernel.shmall
[Linux85]#ifconfig | awk '/inet addr/{print $2}' | awk -F : '!/127/{print $2}'
172.16.251.85
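2. Advanced awk usage
2.1 print output: print item1, item2, ...
Items are separated by commas in the statement and by the output field separator (a space by default) in the output;
An item can be a string or number, a field of the current record (such as $1), a variable, or an awk expression; numbers are converted to strings before being printed;
The items after print can be omitted, in which case print behaves like print $0; so to print a blank line, use print "";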
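The print rules above can be seen side by side in a small sketch; the input line "one two" is made up for illustration.

```shell
out=$(printf 'one two\n' | awk '{
  print $1, $2      # comma: items are joined by OFS (a space by default)
  print $1 $2       # no comma: plain string concatenation
  print ""          # empty string: prints a blank line
  print             # no items: same as print $0
}')
printf '%s\n' "$out"
```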
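2.2 printf output: printf format, item1, item2, ...
The main difference from print is that printf requires a format;
The format specifies how each of the following items is printed;
printf does not print a newline automatically; add \n to the format where one is needed;
Every format specifier starts with % followed by a conversion character;
%c | prints a character: a numeric item is treated as a character code, a string item prints its first character; |
%d, %i | decimal integer; |
%e, %E | number in scientific notation; |
%f | floating-point number; |
%g, %G | number in scientific or floating-point notation, whichever is shorter; |
%s | string; |
%u | unsigned integer; |
%% | a literal %; |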
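[Linux85]#awk 'BEGIN{num1=20;num2=30; printf "%d %d\n",num1,num2}'
20 30
# printf prints the format string, not the items themselves; each specifier is filled in by the corresponding item, so specifiers and items must match one to one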
Modifiers
N | display width |
- | left-align |
+ | show the sign of a number (positive or negative) |
[Linux85]#awk -F: '{printf "%-14s %s\n",$1,$NF}' /etc/passwd
root           /bin/bash
bin            /sbin/nologin
daemon         /sbin/nologin
adm            /sbin/nologin
lp             /sbin/nologin
sync           /bin/sync
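To make the effect of the three modifiers visible, the sketch below wraps each result in brackets. The values 42, "ab", and "cd" are arbitrary.

```shell
out=$(awk 'BEGIN {
  printf "[%5d]\n", 42                 # width 5, right-aligned by default
  printf "[%-5d]\n", 42                # width 5, left-aligned
  printf "[%+d]\n", 42                 # sign is always shown
  printf "[%-8s|%8s]\n", "ab", "cd"    # left- and right-aligned strings
}')
printf '%s\n' "$out"
```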
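2.3 awk built-in variables: data variables
NR | the number of input records awk has processed so far; with multiple files, the count runs continuously across all of them; |
NF | number of fields in the current record; |
FNR | unlike NR, FNR counts records within the file currently being processed and restarts at 1 for each file; |
ARGV | array holding the command-line arguments; for awk '{print $0}' a.txt b.txt, ARGV[0] holds awk and ARGV[1] holds a.txt (the program text is not stored in ARGV); |
ARGC | number of command-line arguments, counting ARGV[0]; |
FILENAME | name of the file awk is currently processing; |
ENVIRON | associative array of the current shell environment variables and their values; |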
[Linux85]#awk '{print NR,$0}' 1.txt
1 one line
2 two line
3 three line
4 four line
5 five line
[Linux85]#awk '{print NR,$0}' 2.txt
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85]#awk '{print NR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
6 six line
7 seven line
8 eight line
9 nine line
10 ten line
[Linux85]#awk '{print FNR,$0}' 1.txt 2.txt
1 one line
2 two line
3 three line
4 four line
5 five line
1 six line
2 seven line
3 eight line
4 nine line
5 ten line
[Linux85]#awk -F: '/root/{print $1,"is a user in",ARGV[1]}' /etc/passwd
root is a user in /etc/passwd
operator is a user in /etc/passwd
[Linux85]#awk 'BEGIN{print ARGC}' /etc/passwd /etc/group /etc/shadow
4
# ARGC counts ARGV[0] (awk itself) plus the three file names; the program text 'BEGIN{print ARGC}' is not counted
[Linux85]#awk '{print $0,"in", FILENAME}' 1.txt 2.txt
one line in 1.txt
two line in 1.txt
three line in 1.txt
four line in 1.txt
five line in 1.txt
six line in 2.txt
seven line in 2.txt
eight line in 2.txt
nine line in 2.txt
ten line in 2.txt
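The sessions above can be reproduced without preexisting files. This sketch creates two small throwaway files in a temporary directory (their contents are made up) and prints FILENAME, NR, FNR, and NF together, then shows what ARGC and ARGV contain.

```shell
dir=$(mktemp -d)                                   # scratch directory
printf 'one line\ntwo line\n' > "$dir/1.txt"       # two records, two fields each
printf 'three\n' > "$dir/2.txt"                    # one record, one field

# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, NF }' 1.txt 2.txt)

# ARGC counts ARGV[0] plus the two file names; the program text is not in ARGV.
args=$(cd "$dir" && awk 'BEGIN { print ARGC, ARGV[1], ARGV[2] }' 1.txt 2.txt)

rm -rf "$dir"
printf '%s\n%s\n' "$out" "$args"
```

ARGV[0] is left out of the printed line on purpose, since its exact value depends on how the awk binary is installed and invoked.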
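2.4 Output redirection
print items > output-file
print items >> output-file
print items | command
Special file descriptors:
/dev/stdin: standard input
/dev/stdout: standard output
/dev/stderr: standard error
/dev/fd/N: a specific file descriptor; for example, /dev/stdin is equivalent to /dev/fd/0;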
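The redirection forms listed above might be exercised as follows. This is only a sketch: the file names names.txt, sorted.txt, and err.txt and the input data are hypothetical, and everything is written into a temporary directory.

```shell
dir=$(mktemp -d)   # scratch directory; the file names below are hypothetical

printf 'b 2\na 1\nc 3\n' |
  awk -v names="$dir/names.txt" '
    { print $1 > names }                    # ">" truncates the file when it is first opened
    { print $2 | "sort -n" }                # pipe the second field through sort
    END { print "done" > "/dev/stderr" }    # write a message to standard error
  ' > "$dir/sorted.txt" 2> "$dir/err.txt"

# ">>" appends to an existing file instead of truncating it.
awk -v names="$dir/names.txt" 'BEGIN { print "d" >> names }'

names=$(cat "$dir/names.txt")
sorted=$(cat "$dir/sorted.txt")
err=$(cat "$dir/err.txt")
rm -rf "$dir"
printf '%s\n---\n%s\n---\n%s\n' "$names" "$sorted" "$err"
```

Within a single awk run, ">" truncates the file only on the first write and later writes to the same name are appended, which is why all three names end up in names.txt.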
2.5 awk operators
Arithmetic operators | Assignment operators | Comparison operators |
-x: negation | = | x < y True if x is less than y. |
+x: convert to a number | += | x <= y True if x is less than or equal to y. |
x^y: exponentiation | -= | x > y True if x is greater than y. |
x**y: exponentiation | *= | x >= y True if x is greater than or equal to y. |
x*y | /= | x == y True if x is equal to y. |
x/y | %= | x != y True if x is not equal to y. |
x+y | ^= | x ~ y True if the string x matches the regexp denoted by y. |
x-y | **= | x !~ y True if the string x does not match the regexp denoted by y. |
x%y | ++ | subscript in array True if the array array has an element with the subscript subscript. |
 | -- | |
In awk, any nonzero number or nonempty string is true; zero and the empty string are false.
Conditional expression:
selector ? if-true-exp : if-false-exp
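A short sketch of the conditional expression together with a few of the operators from the table; the input lines and the variable names kind and starts_r are made up for illustration.

```shell
out=$(printf 'root 0\nalice 500\n' | awk '{
  kind = ($2 == 0) ? "admin" : "user"       # conditional expression on a comparison
  starts_r = ($1 ~ /^r/) ? "yes" : "no"     # regexp match operator
  print $1, kind, starts_r, 2 ^ 10, 7 % 3   # exponentiation and remainder
}')
printf '%s\n' "$out"
```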
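2.6 Patterns and common pattern types
Program structure:
awk 'program' input-file1 input-file2 ...
program:
pattern { action }
pattern { action }
....
Common patterns:
Regexp | regular expression, written /regular expression/ |
Expression | an expression; the pattern matches when its value is nonzero or a nonempty string, for example $1 ~ /foo/ or $1 == "soul"; the operators ~ (matches) and !~ (does not match) are available |
Ranges | a range of lines, written pat1,pat2 |
BEGIN/END | special patterns that run once before awk reads any input and once after all input has been read |
Empty | matches every input line |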
Common actions
Expressions
Control statements
Compound statements
Input statements
Output statements
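Of the pattern types listed above, the range form is the least obvious. As a small sketch on made-up input, the rule below fires from the line that matches the first pattern through the next line that matches the second, both inclusive.

```shell
out=$(printf 'a\nSTART\nb\nc\nEND\nd\n' |
  awk '/^START$/,/^END$/ { print NR ": " $0 }')
printf '%s\n' "$out"
```

Lines 1 and 6 fall outside the range and are not printed.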
2.7 Control statements
if-else
Syntax: if (condition) {then-body} [else {else-body}]
[Linux85]#awk -F : 'BEGIN{OFS=":"}{if ($3==0) {print $1,"Administrator";} else {print $1,"Common User"}}' /etc/passwd
root:Administrator
bin:Common User
daemon:Common User
adm:Common User
lp:Common User
sync:Common User
shutdown:Common User
[Linux85]#awk -F: '{if ($1=="root") printf "%-15s: %s\n",$1,"Admin";else printf "%-15s: %s\n",$1,"Common User"}' /etc/passwd
root           : Admin
bin            : Common User
daemon         : Common User
adm            : Common User
lp             : Common User
sync           : Common User
shutdown       : Common User
halt           : Common User
mail           : Common User
uucp           : Common User
operator       : Common User
games          : Common User
gopher         : Common User
ftp            : Common User
nobody         : Common User
dbus           : Common User
usbmuxd        : Common User
[Linux85]#awk -F: -v sum=0 '{if ($3>=500) sum++}END{print sum}' /etc/passwd
3
# counts the users whose UID is 500 or greater
while
Syntax: while (condition) {statement1; statement2; ...}
[Linux85]#awk -F : '{i=1;while (i<=3) {print $i;i++}}' /etc/passwd
root
x
0
bin
x
1
# prints the first three fields of each line of /etc/passwd
[Linux85]#awk -F: '{i=1;while (i<=NF) { if (length($i)>=4) {print $i}; i++ }}' /etc/passwd
root
root
/root
/bin/bash
/bin
/sbin/nologin
do-while: runs the loop body at least once, whether or not the condition holds
Syntax: do {statement1; statement2; ...} while (condition)
[Linux85]#awk -F: '{i=1;do {print $i;i++}while(i<=3)}' /etc/passwd
root
x
0
bin
x
1
daemon
x
2
[Linux85]#awk -F: '{i=4;do {print $i;i--}while(i>4)}' /etc/passwd
0
1
2
4
7
0
0
0
12
for
Syntax: for (variable assignment; condition; iteration process) {statement1; statement2; ...}
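[Linux85]#awk -F: '{for(i=1;i<=3;i++) if (i<3){printf "%s:",$i} print $i}' /etc/passwd
root:x:0
bin:x:1
daemon:x:2
adm:x:4
lp:x:7
sync:x:0
shutdown:x:0
# Note: the loop body is only the if statement, so print $i runs once after the loop, when i is 4, and prints the fourth field (the GID); that is why the adm line ends in 4. To print the first three fields instead: for(i=1;i<=3;i++) {if (i<3) printf "%s:",$i; else print $i}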
for loop over array elements
Syntax: for (i in array) {statement1; statement2; ...}
[Linux85]#awk -F: '$NF!~/^$/{BASH[$NF]++}END{for(A in BASH){printf "%15s:%i\n",A,BASH[A]}}' /etc/passwd
 /sbin/shutdown:1
       /bin/csh:1
      /bin/bash:2
  /sbin/nologin:29
     /sbin/halt:1
      /bin/sync:1
# counts how many times each value of the last field occurs
case (the switch statement, a gawk extension)
Syntax: switch (expression) { case VALUE or /REGEXP/: statement1; statement2; ... default: statement1; ...}
break and continue
next
Ends processing of the current line early and moves on to the next line;
[Linux85]#awk -F: '{if($3%2==0) next;print $1,$3}' /etc/passwd
bin 1
adm 3
sync 5
halt 7
operator 11
gopher 13
nobody 99
dbus 81
usbmuxd 113
vcsa 69
rtkit 499
abrt 173
postfix 89
rpcuser 29
pulse 497
soul 501
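break and continue are listed above without a session of their own. The sketch below, on made-up input, uses all three jump statements: next skips a whole record, continue skips one loop iteration, and break leaves the loop. The variable name line is an assumption for illustration.

```shell
out=$(printf '1 2 3 4 5\n# skip me\n6 7 8 9 10\n' | awk '
  /^#/ { next }                        # comment line: skip the remaining rules
  {
    line = ""
    for (i = 1; i <= NF; i++) {
      if ($i % 2 == 0) continue        # even field: go to the next iteration
      if ($i > 7) break                # field above 7: stop scanning this record
      line = (line == "") ? $i : line " " $i
    }
    print line
  }')
printf '%s\n' "$out"
```

The first record yields its odd fields; in the third record the loop stops at 9, so only 7 is collected.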
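2.8 Arrays
array[index-expression]
index-expression can be any string. Note that referencing an element that does not yet exist makes awk create it and initialize it to the empty string; so to test whether an element exists, use the form index in array.
To iterate over every element of an array, use this special construct:
for (var in array) { statement1; ... }
Here var takes each array subscript in turn, not the element value;
Delete an array element: delete array[index]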
[Linux85]#netstat -ant | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
LISTEN 10
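The point about automatic element creation is easy to miss, so here is a minimal sketch of it. The array name count and its subscripts are made up for illustration.

```shell
out=$(awk 'BEGIN {
  count["a"] = 1
  # The in operator tests membership without creating the element.
  if ("b" in count) print "b exists"; else print "b missing"
  # A plain reference, by contrast, creates count["c"] as an empty string.
  if (count["c"] == "") print "c looks empty"
  if ("c" in count) print "c now exists"
  delete count["c"]
  if (!("c" in count)) print "c deleted"
  n = 0
  for (k in count) n++                 # k takes each subscript, not each value
  print "elements:", n
}')
printf '%s\n' "$out"
```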
2.9 awk built-in functions
split(string, array [, fieldsep [, seps ] ])
Splits string on the separator fieldsep and stores the pieces in the array named array; the array subscripts are a sequence starting at 1;
[Linux85]#df -lh | awk '!/^File/{split($5,percent,"%");if(percent[1]>=10){print $1}}'
/dev/sda1
/dev/mapper/vg0-usr
# shows the file systems whose disk usage is 10% or more
length([string]): returns the number of characters in string;
[Linux85]#awk -F: '{for(i=1;i<=NF;i++) { if (length($i)>=4) {print $i}}}' /etc/passwd
root
root
/root
/bin/bash
/bin
/sbin/nologin
daemon
daemon
/sbin
/sbin/nologin
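substr(string, start [, length ])
Returns the substring of string that begins at start and is length characters long; start counts from 1;
system(command): runs the system command and returns its exit status to awk
systime(): returns the current system time as a number of seconds since the epoch
tolower(s): converts all letters in s to lowercase
toupper(s): converts all letters in s to uppercase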
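Several of the functions above have no session of their own, so here is a brief sketch that calls them on a made-up string. The variable names s, parts, n, and rc are assumptions for illustration; systime() is left out because its result changes from run to run.

```shell
out=$(awk 'BEGIN {
  s = "Hello,World"
  n = split(s, parts, ",")             # parts[1] is "Hello", parts[2] is "World"
  print n, parts[1], parts[2]
  print length(s)                      # number of characters in s
  print substr(s, 1, 5)                # five characters starting at position 1
  print substr(s, 7)                   # from position 7 to the end
  print toupper(parts[1]), tolower(parts[2])
  rc = system("exit 3")                # system() returns the exit status of the command
  print "status:", rc
}')
printf '%s\n' "$out"
```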
2.10 User-defined functions
User-defined functions are declared with the function keyword, in this form:
function F_NAME([variable])
{
statements
}
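The form above could be filled in as follows. This is only a sketch: the function name kind, its parameter uid, and the sample input are hypothetical, chosen to mirror the earlier Administrator/Common User session.

```shell
out=$(printf 'root:x:0\nalice:x:500\n' | awk -F: '
  # Hypothetical helper: classify an account by its UID.
  function kind(uid) {
    return (uid + 0 == 0) ? "Administrator" : "Common User"
  }
  { printf "%-8s %s\n", $1, kind($3) }')
printf '%s\n' "$out"
```

Moving the test into a function keeps the main rule short and lets the same classification be reused from other rules.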
Examples:
# Count, for each client IP on the current system, the connections in the ESTABLISHED state
[Linux85]#netstat -tn | awk '/ESTABLISHED\>/{split($5,ip,":");num[ip[1]]++}END{for (i in num) printf "%s %d\n", i, num[i]}'
172.16.254.28 2
# Count the processes in each state at the time ps aux runs
[Linux85]#ps aux | awk '!/^USER/{state[$8]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
S< 2
S<sl 1
Ss 18
SN 1
S 69
Ss+ 6
Ssl 2
R+ 1
S+ 2
Sl 2
S<s 1
# Count the processes owned by each user at the time ps aux runs
[Linux85]#ps aux | awk '!/^USER/{state[$1]++}END{for (i in state) printf "%s %d\n",i,state[i]}'
rpc 1
dbus 1
68 2
postfix 2
rpcuser 1
root 96
gentoo 2
# Show the PID and command of every process whose VSZ (virtual memory size) is greater than 10000 at the time ps aux runs
[Linux85]#ps aux | awk '!/USER/{if($5>10000) print $2,$11}'
1 /sbin/init
397 /sbin/udevd
1184 auditd
1209 /sbin/rsyslogd
1251 rpcbind
1282 dbus-daemon
1292 NetworkManager
1297 /usr/sbin/modem-manager
1311 rpc.statd
1344 cupsd
1354 /usr/sbin/wpa_supplicant
1392 hald