[Repost] Learn AWK in 30 Minutes

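Most of this article is translated from AWK Tutorial, an English article I came across when I started learning AWK. I found it very helpful as an introduction, so I translated it roughly, trimming some parts and adding to others, in the hope of giving anyone interested in AWK a quick primer on its basic usage. I am new to AWK myself, so the translation and my additions are bound to contain omissions and mistakes; corrections are welcome.

This article will be revised and updated over time. For the latest version, see the 程序猿成長計劃 project on my GitHub. Stars are welcome.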

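Overview

AWK is an interpreted programming language that is very powerful for text processing. Its name comes from the surnames of its three authors: Alfred Aho, Peter Weinberger and Brian Kernighan.

The AWK shipped with GNU/Linux distributions is developed and maintained by the Free Software Foundation (FSF) and is usually called GNU AWK.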

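Types of AWK

Here are several AWK variants:

AWK – the original AWK from AT&T Laboratories
NAWK – a newer, improved version of the AT&T AWK
GAWK – GNU AWK. Most GNU/Linux distributions ship GAWK, and it is fully compatible with AWK and NAWK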

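Typical uses of AWK

AWK can handle many tasks. Here are some of them:

Text processing
Producing formatted text reports
Performing arithmetic
Performing string operations, and so on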

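Workflow

To become an AWK expert you first need to understand how it works internally. AWK follows a very simple workflow: read, execute, repeat. This workflow (illustrated with a diagram in the original) consists of the following steps.

Read

AWK reads one line from the input stream (a file, a pipe or standard input) and stores it in memory.

Execute

All AWK commands are applied to the input in order. By default AWK runs the commands on every line; we can restrict this by providing a pattern.

Repeat

The process repeats until the end of the file is reached.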
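The read, execute, repeat cycle can be sketched with a small pipeline. This is only an illustration: the three input lines are made up, and NR (covered later under the built-in variables) is AWK's count of the records read so far.

```shell
# Feed three made-up lines to awk; the action runs once per line read.
out=$(printf 'alpha\nbeta\ngamma\n' | awk '{ print "record " NR ": " $0 }')
echo "$out"
```

Each input line triggers one run of the action, so the output should contain one numbered line per input line.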

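Program structure

Now let's look at the structure of an AWK program.

BEGIN block

Syntax of the BEGIN block:

    BEGIN {awk-commands}

The BEGIN block runs once, when the program starts, which makes it a good place to initialize variables. BEGIN is an AWK keyword and must therefore be written in upper case. Note that this block is optional.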

BODY block

Syntax of the BODY block:

    /pattern/ {awk-commands}

The commands in the BODY block run for every input line; we can restrict this by providing a pattern. Note that the BODY block has no keyword.

END block

Syntax of the END block:

    END {awk-commands}

The END block runs at the end of the program. END is an AWK keyword and must therefore be written in upper case. It is also optional.

Let's create a file, marks.txt, containing a serial number, student name, subject and score:

    1)  Amit    Physics  80
    2)  Rahul   Maths    90
    3)  Shyam   Biology  87
    4)  Kedar   English  85
    5)  Hari    History  89

The following example displays the file contents and adds a header for each column:

    $ awk 'BEGIN{printf "Sr No\tName\tSub\tMarks\n"} {print}' marks.txt

Running the command above produces the following output:

    Sr No     Name     Sub          Marks
    1)        Amit     Physics      80
    2)        Rahul    Maths        90
    3)        Shyam    Biology      87
    4)        Kedar    English      85
    5)        Hari     History      89

When the program starts, AWK prints the header from the BEGIN block. Then, in the BODY block, it reads each line of the file and runs the print command, which writes the line to standard output. This repeats until the end of the file.
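As a further sketch of how the three blocks cooperate, the following computes the average score. It is an illustration rather than part of the original tutorial: the temporary file created by mktemp and the variable name total are assumptions, and the data is the marks.txt sample with single spaces between fields.

```shell
# Recreate the sample data in a temporary file (path chosen by mktemp).
marks=$(mktemp)
cat > "$marks" <<'EOF'
1) Amit Physics 80
2) Rahul Maths 90
3) Shyam Biology 87
4) Kedar English 85
5) Hari History 89
EOF

out=$(awk '
BEGIN { total = 0 }      # runs once, before any input is read
      { total += $4 }    # runs for every input line
END   { printf "Students = %d, Average = %.1f\n", NR, total / NR }
' "$marks")
echo "$out"   # Students = 5, Average = 86.2
rm -f "$marks"
```

BEGIN initializes the accumulator, the BODY block adds the fourth field of each line, and END reports the result once all input has been consumed.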

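Basic syntax

AWK is simple to use: we can run AWK commands directly on the command line, or from a text file that contains them.

AWK command line

We can specify an AWK program on the command line, enclosed in single quotes:

    awk [options] 'program' file ...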

For example, suppose we have a text file marks.txt with the following contents:

    1) Amit     Physics    80
    2) Rahul    Maths      90
    3) Shyam    Biology    87
    4) Kedar    English    85
    5) Hari     History    89

We can display the complete contents of the file with the following command:

    $ awk '{print}' marks.txt

AWK program file

We can also supply the AWK commands in a script file:

    awk [options] -f progfile file ...

First, create a text file command.awk with the following contents:

    {print}

Now we can tell AWK to run the commands in that file. This produces the same result as the previous example:

    $ awk -f command.awk marks.txt
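A slightly longer script file might look like the sketch below. The scratch directory and the script name report.awk are hypothetical, chosen only for this illustration; the data is a two-line excerpt of marks.txt.

```shell
# Work in a scratch directory so no existing files are overwritten.
dir=$(mktemp -d)
printf '1) Amit Physics 80\n2) Rahul Maths 90\n' > "$dir/marks.txt"

# report.awk is a hypothetical script name used only for this sketch.
cat > "$dir/report.awk" <<'EOF'
BEGIN { print "Name\tMarks" }   # header, printed once
      { print $2 "\t" $4 }      # one output line per input line
EOF

out=$(awk -f "$dir/report.awk" "$dir/marks.txt")
echo "$out"
rm -rf "$dir"
```

Keeping the program in a file avoids shell quoting problems and makes multi-line programs easier to maintain.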

AWK standard options

AWK supports the following standard command-line options.

`-v` variable assignment option

This option assigns a value to a variable. The assignment happens before the program starts running, as the following example shows:

    $ awk -v name=Jerry 'BEGIN{printf "Name = %s\n", name}'
    Name = Jerry
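A common use of -v is to pass a shell value into the AWK program. The sketch below assumes a hypothetical threshold stored in the shell variable min and prints the names of students whose score reaches it; the input is an excerpt of marks.txt.

```shell
min=87   # hypothetical threshold held in a shell variable
out=$(printf '1) Amit Physics 80\n2) Rahul Maths 90\n3) Shyam Biology 87\n' |
  awk -v min="$min" '$4 >= min { print $2 }')
echo "$out"
```

Passing the value with -v keeps the AWK program inside single quotes, so the shell never has to splice text into the program itself.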

`--dump-variables[=file]` option

This option writes a sorted list of global variables and their final values to a file. The default file is awkvars.out.

    $ awk --dump-variables ''
    $ cat awkvars.out
    ARGC: 1
    ARGIND: 0
    ARGV: array, 1 elements
    BINMODE: 0
    CONVFMT: "%.6g"
    ERRNO: ""
    FIELDWIDTHS: ""
    FILENAME: ""
    FNR: 0
    FPAT: "[^[:space:]]+"
    FS: " "
    IGNORECASE: 0
    LINT: 0
    NF: 0
    NR: 0
    OFMT: "%.6g"
    OFS: " "
    ORS: "\n"
    RLENGTH: 0
    RS: "\n"
    RSTART: 0
    RT: ""
    SUBSEP: "\034"
    TEXTDOMAIN: "messages"

`--help` option

Prints the help message.

    $ awk --help
    Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
    Usage: awk [POSIX or GNU style options] [--] 'program' file ...
    POSIX options : GNU long options: (standard)
       -f progfile                --file=progfile
       -F fs                      --field-separator=fs
       -v var=val                 --assign=var=val
    Short options : GNU long options: (extensions)
       -b                         --characters-as-bytes
       -c                         --traditional
       -C                         --copyright
       -d[file]                   --dump-variables[=file]
       -e 'program-text'          --source='program-text'
       -E file                    --exec=file
       -g                         --gen-pot
       -h                         --help
       -L [fatal]                 --lint[=fatal]
       -n                         --non-decimal-data
       -N                         --use-lc-numeric
       -O                         --optimize
       -p[file]                   --profile[=file]
       -P                         --posix
       -r                         --re-interval
       -S                         --sandbox
       -t                         --lint-old
       -V                         --version

`--lint[=fatal]` option

This option enables checks for non-portable or dubious constructs in the program. When the argument fatal is given, warnings are treated as errors.

    $ awk --lint '' /bin/ls
    awk: cmd. line:1: warning: empty program text on command line
    awk: cmd. line:1: warning: source file does not end in newline
    awk: warning: no program text at all!

`--posix` option

This option turns on strict POSIX compatibility.

`--profile[=file]` option

This option writes a pretty-printed version of the program to a file. The default file is awkprof.out.

    $ awk --profile 'BEGIN{printf"---|Header|--\n"} {print}
    END{printf"---|Footer|---\n"}' marks.txt > /dev/null
    $ cat awkprof.out
        # gawk profile, created Wed Oct 26 15:05:49 2016

        # BEGIN block(s)

        BEGIN {
            printf "---|Header|--\n"
        }

        # Rule(s)

        {
            print $0
        }

        # END block(s)

        END {
            printf "---|Footer|---\n"
        }

`--traditional` option

This option disables all gawk-specific extensions.

`--version` option

Prints the version number.

    $ awk --version
    GNU Awk 3.1.7
    Copyright (C) 1989, 1991-2009 Free Software Foundation.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program. If not, see http://www.gnu.org/licenses/.

Basic usage examples

This section covers some useful AWK commands and examples of their use. All the examples are based on the following text file, marks.txt:

    1) Amit     Physics     80
    2) Rahul    Maths       90
    3) Shyam    Biology     87
    4) Kedar    English     85
    5) Hari     History     89

Printing a column or field

AWK can print only certain columns (fields) of the input.

    $ awk '{print $3 "\t" $4}' marks.txt
    Physics 80
    Maths   90
    Biology 87
    English 85
    History 89

In marks.txt the third column holds the subject and the fourth the score. The example above prints only these two columns; $3 and $4 stand for the third and fourth fields of the input record.

Printing all lines

By default, AWK prints every line that matches the pattern:

    $ awk '/a/ {print $0}' marks.txt
    2)  Rahul   Maths    90
    3)  Shyam   Biology  87
    4)  Kedar   English  85
    5)  Hari    History  89

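The command above checks whether each line contains the letter a and prints the line if it does. When the action (the part in braces) is omitted, AWK prints the matching line by default, so the command above is equivalent to this one:

    $ awk '/a/' marks.txt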

Printing columns of matching lines

When a pattern matches, AWK prints the whole line by default, but it can be told to print only specific fields. The following example prints just the third and fourth fields of the lines that match the pattern:

    $ awk '/a/ {print $3 "\t" $4}' marks.txt
    Maths   90
    Biology 87
    English 85
    History 89

Printing columns in any order

    $ awk '/a/ {print $4 "\t" $3}' marks.txt
    90  Maths
    87  Biology
    85  English
    89  History

Counting lines that match a pattern

    $ awk '/a/{++cnt} END {print "Count = ", cnt}' marks.txt
    Count =  4

Printing lines longer than 18 characters

    $ awk 'length($0) > 18' marks.txt
    3) Shyam   Biology   87
    4) Kedar   English   85

The result depends on the whitespace in marks.txt; the output above corresponds to fields separated by a single tab or space, which makes these two lines 19 characters long.

Built-in variables

AWK provides many built-in variables, which play an important role when writing AWK scripts.

Standard AWK variables

ARGC (number of command-line arguments)

The number of arguments supplied on the command line. The count includes the command name awk itself, which is why the four arguments below give 5.

    $ awk 'BEGIN {print "Arguments =", ARGC}' One Two Three Four
    Arguments = 5

ARGV (array of command-line arguments)

An array holding the command-line arguments, indexed from 0 to ARGC - 1.

    $ awk 'BEGIN {
       for (i = 0; i < ARGC; ++i) {
          printf "ARGV[%d] = %s\n", i, ARGV[i]
       }
    }' one two three four
    ARGV[0] = awk
    ARGV[1] = one
    ARGV[2] = two
    ARGV[3] = three
    ARGV[4] = four

CONVFMT (number conversion format)

The format used when converting numbers to strings. The default is %.6g.

    $ awk 'BEGIN { print "Conversion Format =", CONVFMT }'
    Conversion Format = %.6g

ENVIRON (environment variables)

An associative array of environment variables.

    $ awk 'BEGIN { print ENVIRON["USER"] }'
    mylxsw

FILENAME (current file name)

    $ awk 'END {print FILENAME}' marks.txt
    marks.txt

FS (input field separator)

The input field separator. The default is a single space, which makes AWK split fields on runs of spaces and tabs. It can be changed on the command line with the -F option.

    $ awk 'BEGIN {print "FS = " FS}' | cat -vte
    FS =  $
    $ awk -F , 'BEGIN {print "FS = " FS}' | cat -vte
    FS = ,$
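To see the separator actually splitting fields, here is a minimal sketch using made-up comma-separated input (the data is an assumption, not part of marks.txt):

```shell
# With -F, each comma-separated value becomes its own field.
out=$(printf 'Amit,Physics,80\nRahul,Maths,90\n' | awk -F, '{ print $2 }')
echo "$out"
```

With the default separator each of these lines would be a single field, since it contains no whitespace.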

NF (number of fields)

The number of fields in the current line. For example, the following prints only the lines that have more than two fields:

    $ echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NF > 2'
    One Two Three
    One Two Three Four

NR (line number)

The number of the current line, counted across all input read so far.

    $ echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NR < 3'
    One Two
    One Two Three

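FNR (line number within the current file)

Similar to NR, but the line number is relative to the current file, which makes it more useful when processing multiple files.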
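The difference between NR and FNR shows up as soon as there is more than one input file. In this sketch the two file names and their contents are made up for illustration:

```shell
# Two small throwaway files in a scratch directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first.txt"
printf 'c\nd\n' > "$dir/second.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(cd "$dir" && awk '{ print FILENAME, "NR=" NR, "FNR=" FNR }' first.txt second.txt)
echo "$out"
rm -rf "$dir"
```

A common idiom built on this is the condition FNR == 1, which is true on the first line of every file.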

OFMT (output format for numbers)

The default is %.6g.

    $ awk 'BEGIN {print "OFMT = " OFMT}'
    OFMT = %.6g
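OFMT and CONVFMT are easy to confuse, so here is a small sketch of where each applies. The variable names x and s and the chosen formats are assumptions made for the illustration.

```shell
out=$(awk 'BEGIN {
  x = 3.14159
  OFMT = "%.2f";    print x            # print formats a number with OFMT
  CONVFMT = "%.3f"; s = x ""; print s  # concatenation converts with CONVFMT
}')
echo "$out"
```

The first line should show two decimals and the second three, because s is already a string by the time it is printed.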

OFS (output field separator)

The output field separator. The default is a space.

    $ awk 'BEGIN {print "OFS = " OFS}' | cat -vte
    OFS =  $
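OFS is inserted wherever print is given several arguments separated by commas. A minimal sketch, using one line of the marks.txt sample and a comma as an arbitrarily chosen separator:

```shell
# The commas in the print statement are replaced by OFS in the output.
out=$(printf '1) Amit Physics 80\n' | awk 'BEGIN { OFS = "," } { print $2, $3, $4 }')
echo "$out"   # Amit,Physics,80
```

Writing print $2 $3 $4 without commas would instead concatenate the fields with nothing between them.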

ORS (output record separator)

The default is a newline.

    $ awk 'BEGIN {print "ORS = " ORS}' | cat -vte
    ORS = $
    $

RLENGTH

The length of the string matched by the match function.

    $ awk 'BEGIN { if (match("One Two Three", "re")) { print RLENGTH } }'
    2

RS (input record separator)

The default is a newline.

    $ awk 'BEGIN {print "RS = " RS}' | cat -vte
    RS = $
    $
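Changing RS changes what AWK treats as a "line". The sketch below assumes made-up input in which records are separated by semicolons instead of newlines:

```shell
# With RS set to ";", each semicolon-terminated chunk is one record.
out=$(printf 'red;green;blue' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$out"
```

The input deliberately has no trailing newline; otherwise that newline would become part of the last record.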

RSTART

The position of the first character of the string matched by the match function.

    $ awk 'BEGIN { if (match("One Two Three", "Thre")) { print RSTART } }'
    9
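RSTART and RLENGTH are typically used together with substr to pull out the matched text. A minimal sketch, where the variable s and the regular expression are chosen only for illustration:

```shell
out=$(awk 'BEGIN {
  s = "One Two Three"
  if (match(s, /T[a-z]+/))    # the leftmost match is "Two"
    print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
}')
echo "$out"   # 5 3 Two
```

match sets both variables as a side effect, so they must be read before the next call to match overwrites them.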

SUBSEP (array subscript separator)

The separator for array subscripts. The default is \034.

    $ awk 'BEGIN { print "SUBSEP = " SUBSEP }' | cat -vte
    SUBSEP = ^\$
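SUBSEP matters when an array is indexed with several subscripts: AWK joins them into one key using this separator. The sketch below uses a hypothetical array named score to show how the parts can be recovered with split:

```shell
out=$(awk 'BEGIN {
  score["Amit", "Physics"] = 80     # stored under the key "Amit" SUBSEP "Physics"
  for (key in score) {
    split(key, part, SUBSEP)        # recover the two subscripts
    print part[1], part[2], score[key]
  }
}')
echo "$out"   # Amit Physics 80
```

Because the default separator is a control character, it is unlikely to clash with characters that appear in ordinary subscripts.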

$0

The entire current line.

    $ awk '{print $0}' marks.txt
    1) Amit     Physics   80
    2) Rahul    Maths     90
    3) Shyam    Biology   87
    4) Kedar    English   85
    5) Hari     History   89

$n

The nth field of the current line.

    $ awk '{print $3 "\t" $4}' marks.txt
    Physics   80
    Maths     90
    Biology   87
    English   85
    History   89

GNU AWK variables

ARGIND

The index in ARGV of the file currently being processed.

    $ awk '{
       print "ARGIND   = ", ARGIND; print "Filename = ", ARGV[ARGIND]
    }' junk1 junk2 junk3
    ARGIND   =  1
    Filename =  junk1
    ARGIND   =  2
    Filename =  junk2
    ARGIND   =  3
    Filename =  junk3

BINMODE

On non-POSIX systems, specifies that binary mode be used for all file I/O.

ERRNO

A string describing the error when a getline redirection or a call to close fails.

    $ awk 'BEGIN { ret = getline < "junk.txt"; if (ret == -1) print "Error:", ERRNO }'
    Error: No such file or directory

FIELDWIDTHS

When this variable is set to a space-separated list of field widths, GAWK parses the input into fixed-width fields instead of splitting it on FS.

IGNORECASE

When this variable is set, GAWK matches case-insensitively.

    $ awk 'BEGIN{IGNORECASE = 1} /amit/' marks.txt
    1) Amit  Physics   80

LINT

Provides dynamic control of the --lint option from within the program.

    $ awk 'BEGIN {LINT = 1; a}'
    awk: cmd. line:1: warning: reference to uninitialized variable `a'
    awk: cmd. line:1: warning: statement has no effect

PROCINFO

An associative array holding information about the process, such as the UID and the process ID.

    $ awk 'BEGIN { print PROCINFO["pid"] }'
    4316

TEXTDOMAIN

The text domain of the AWK program, used to look up localized translations of its strings.

    $ awk 'BEGIN { print TEXTDOMAIN }'
    messages

Operators

Like other programming languages, AWK provides a large set of operators.

Arithmetic operators

The arithmetic operators need little explanation; they are simply +, -, *, / and %, as the examples show.

    $ awk 'BEGIN { a = 50; b = 20; print "(a + b) = ", (a + b) }'
    (a + b) =  70

    $ awk 'BEGIN { a = 50; b = 20; print "(a - b) = ", (a - b) }'
    (a - b) =  30

    $ awk 'BEGIN { a = 50; b = 20; print "(a * b) = ", (a * b) }'
    (a * b) =  1000

    $ awk 'BEGIN { a = 50; b = 20; print "(a / b) = ", (a / b) }'
    (a / b) =  2.5

    $ awk 'BEGIN { a = 50; b = 20; print "(a % b) = ", (a % b) }'
    (a % b) =  10

Increment and decrement operators

Increment and decrement behave as they do in C.

    $ awk 'BEGIN { a = 10; b = ++a; printf "a = %d, b = %d\n", a, b }'
    a = 11, b = 11

    $ awk 'BEGIN { a = 10; b = --a; printf "a = %d, b = %d\n", a, b }'
    a = 9, b = 9

    $ awk 'BEGIN { a = 10; b = a++; printf "a = %d, b = %d\n", a, b }'
    a = 11, b = 10

    $ awk 'BEGIN { a = 10; b = a--; printf "a = %d, b = %d\n", a, b }'
    a = 9, b = 10

Assignment operators

    $ awk 'BEGIN { name = "Jerry"; print "My name is", name }'
    My name is Jerry
    $ awk 'BEGIN { cnt = 10; cnt += 10; print "Counter =", cnt }'
    Counter = 20
    $ awk 'BEGIN { cnt = 100; cnt -= 10; print "Counter =", cnt }'
    Counter = 90
    $ awk 'BEGIN { cnt = 10; cnt *= 10; print "Counter =", cnt }'
    Counter = 100
    $ awk 'BEGIN { cnt = 100; cnt /= 5; print "Counter =", cnt }'
    Counter = 20
    $ awk 'BEGIN { cnt = 100; cnt %= 8; print "Counter =", cnt }'
    Counter = 4
    $ awk 'BEGIN { cnt = 2; cnt ^= 4; print "Counter =", cnt }'
    Counter = 16
    $ awk 'BEGIN { cnt = 2; cnt **= 4; print "Counter =", cnt }'
    Counter = 16

Relational operators

    $ awk 'BEGIN { a = 10; b = 10; if (a == b) print "a == b" }'
    a == b
    $ awk 'BEGIN { a = 10; b = 20; if (a != b) print "a != b" }'
    a != b
    $ awk 'BEGIN { a = 10; b = 20; if (a < b) print "a < b" }'
    a < b
    $ awk 'BEGIN { a = 10; b = 10; if (a <= b) print "a <= b" }'
    a <= b
    $ awk 'BEGIN { a = 10; b = 20; if (b > a ) print "b > a" }'
    b > a

Logical operators

    $ awk 'BEGIN {
       num = 5; if (num >= 0 && num <= 7) printf "%d is in octal format\n", num
    }'
    5 is in octal format
    $ awk 'BEGIN {
       ch = "\n"; if (ch == " " || ch == "\t" || ch == "\n")
       print "Current character is whitespace."
    }'
    Current character is whitespace.
    $ awk 'BEGIN { name = ""; if (! length(name)) print "name is empty string." }'
    name is empty string.

Ternary operator

$ awk 'BEGIN { a = 10; b = 20; (a > b) ? max = a : max = b; print "Max =", max}'
Max = 20
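Because the conditional expression itself yields a value, the result can also be assigned directly. The following is a minimal sketch of that form; the shell variable name max is only an illustration.

```shell
# Assign the value of the conditional expression instead of assigning
# inside each branch.
max=$(awk 'BEGIN { a = 10; b = 20; max = (a > b) ? a : b; print "Max =", max }')
echo "$max"
```

This form tends to be easier to read, and it avoids relying on how a given awk parses assignments inside the branches.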

Unary operators

$ awk 'BEGIN { a = -10; a = +a; print "a =", a }'
a = -10
$ awk 'BEGIN { a = -10; a = -a; print "a =", a }'
a = 10

Exponent operator

$ awk 'BEGIN { a = 10; a = a ^ 2; print "a =", a }'
a = 100

$ awk 'BEGIN { a = 10; a ^= 2; print "a =", a }'
a = 100

String concatenation operator

$ awk 'BEGIN { str1 = "Hello, "; str2 = "World"; str3 = str1 str2; print str3 }'
Hello, World

Array membership operator

$ awk 'BEGIN {
   arr[0] = 1; arr[1] = 2; arr[2] = 3; for (i in arr) printf "arr[%d] = %d\n", i, arr[i]
}'
arr[2] = 3
arr[0] = 1
arr[1] = 2
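The order in which for (i in arr) visits the elements is not specified, which is why the output above is not sorted. The in operator can also test whether a single key exists. The sketch below shows that use; the array contents are made up for illustration.

```shell
result=$(awk 'BEGIN {
   arr["mango"] = "yellow"
   # (key in arr) tests for the key without creating the element
   if ("mango" in arr)     print "mango: present"
   if (!("banana" in arr)) print "banana: absent"
   # The failed test above did not add "banana" to the array
   n = 0; for (k in arr) n++
   print "elements:", n
}')
echo "$result"
```

Testing with in is safer than comparing arr[key] with an empty string, because merely referencing arr[key] creates the element.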

Regular expression operators

The regular expression operators ~ and !~ test whether a value matches or does not match a pattern.

$ awk '$0 ~ 9' marks.txt
2)  Rahul   Maths    90
5)  Hari    History  89

$ awk '$0 !~ 9' marks.txt
1)  Amit    Physics  80
3)  Shyam   Biology  87
4)  Kedar   English  85

# A regular expression literal is written between forward slashes, much as in JavaScript
$ tail -n 40 /var/log/nginx/access.log | awk '$0 ~ /ip\[127\.0\.0\.1\]/'

The Regular expressions section below covers patterns in more detail.
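The match operators work on any expression, not just $0, so a pattern can be limited to one field. The sketch below recreates the marks.txt sample data in a temporary file (the temporary path is an assumption made for the example) and matches only against the score in field 4.

```shell
# Recreate the sample data in a temporary file.
marks=$(mktemp)
printf '%s\n' '1)  Amit    Physics  80' '2)  Rahul   Maths    90' \
   '3)  Shyam   Biology  87' '4)  Kedar   English  85' \
   '5)  Hari    History  89' > "$marks"

# Print the names of students whose score (field 4) starts with 8.
names=$(awk '$4 ~ /^8/ { print $2 }' "$marks")
echo "$names"
rm -f "$marks"
```

Matching a single field avoids accidental hits elsewhere on the line, such as a digit in the serial number column.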

Regular expressions

AWK is very powerful at handling regular expressions; a simple regular expression can solve quite complex problems.

$ echo -e "cat\nbat\nfun\nfin\nfan" | awk '/f.n/'
fun
fin
fan

$ echo -e "This\nThat\nThere\nTheir\nthese" | awk '/^The/'
There
Their

$ echo -e "knife\nknow\nfun\nfin\nfan\nnine" | awk '/n$/'
fun
fin
fan

$ echo -e "Call\nTall\nBall" | awk '/[CT]all/'
Call
Tall

$ echo -e "Call\nTall\nBall" | awk '/[^CT]all/'
Ball

$ echo -e "Call\nTall\nBall\nSmall\nShall" | awk '/Call|Ball/'
Call
Ball

$ echo -e "Colour\nColor" | awk '/Colou?r/'
Colour
Color

$ echo -e "ca\ncat\ncatt" | awk '/cat*/'
ca
cat
catt

$ echo -e "111\n22\n123\n234\n456\n222"  | awk '/2+/'
22
123
234
222

$ echo -e "Apple Juice\nApple Pie\nApple Tart\nApple Cake" | awk '/Apple (Juice|Cake)/'
Apple Juice
Apple Cake

Arrays

AWK supports associative arrays: an index can be a string as well as a number, and numeric indexes need not be contiguous. Arrays need no declaration before use. The syntax is:

array_name[index] = value

Creating an array is simple: just assign a value to an element.

$ awk 'BEGIN {
   fruits["mango"] = "yellow";
   fruits["orange"] = "orange"
   print fruits["orange"] "\n" fruits["mango"]
}'
orange
yellow

Use the delete statement to remove an array element. The print below outputs an empty line because fruits["orange"] no longer exists.

$ awk 'BEGIN {
   fruits["mango"] = "yellow";
   fruits["orange"] = "orange";
   delete fruits["orange"];
   print fruits["orange"]
}'

AWK supports only one-dimensional arrays, but a one-dimensional array can simulate a multidimensional one. For example, take this 3×3 two-dimensional array:

100   200   300
400   500   600
700   800   900

It can be stored like this (the example fills in the first two rows):

$ awk 'BEGIN {
   array["0,0"] = 100;
   array["0,1"] = 200;
   array["0,2"] = 300;
   array["1,0"] = 400;
   array["1,1"] = 500;
   array["1,2"] = 600;

   # print array elements
   print "array[0,0] = " array["0,0"];
   print "array[0,1] = " array["0,1"];
   print "array[0,2] = " array["0,2"];
   print "array[1,0] = " array["1,0"];
   print "array[1,1] = " array["1,1"];
   print "array[1,2] = " array["1,2"];
}'
array[0,0] = 100
array[0,1] = 200
array[0,2] = 300
array[1,0] = 400
array[1,1] = 500
array[1,2] = 600
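The example above builds the keys by hand as strings such as "0,0". AWK also accepts the subscript form array[i, j], which joins the subscripts with the built-in SUBSEP variable. The following sketch, with loop bounds chosen to match the first two rows of the table, shows that form.

```shell
grid=$(awk 'BEGIN {
   # array[i, j] is stored under the single key i SUBSEP j
   for (i = 0; i < 2; i++)
      for (j = 0; j < 3; j++)
         array[i, j] = (i * 3 + j + 1) * 100

   print "array[1,2] = " array[1, 2]
   # (i, j) in array tests a composite key
   if ((1, 2) in array) print "key (1,2) exists"
   # The same element is reachable through the explicit key string
   print array[1 SUBSEP 2]
}')
echo "$grid"
```

Note that array[1, 2] and array["1,2"] are different elements, because SUBSEP is by default a non-printing character rather than a comma.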

Flow control

Flow-control statements work as in most languages. The basic forms are:

if (condition)
   action

if (condition) {
   action-1
   action-2
   .
   .
   action-n
}

if (condition)
   action-1
else if (condition2)
   action-2
else
   action-3

For example:

$ awk 'BEGIN {
   num = 11; if (num % 2 == 0) printf "%d is even number.\n", num; 
      else printf "%d is odd number.\n", num 
}'

$ awk 'BEGIN {
   a = 30;

   if (a==10)
   print "a = 10";
   else if (a == 20)
   print "a = 20";
   else if (a == 30)
   print "a = 30";
}'

Loops

Loops work as in other C-family languages: for, while, and do...while, together with break and continue. There is also an exit statement that stops the script.

for (initialisation; condition; increment/decrement)
   action

while (condition)
   action

do
   action
while (condition)

Examples:

$ awk 'BEGIN { for (i = 1; i <= 5; ++i) print i }'

$ awk 'BEGIN {i = 1; while (i < 6) { print i; ++i } }'

$ awk 'BEGIN {i = 1; do { print i; ++i } while (i < 6) }'

$ awk 'BEGIN {
   sum = 0; for (i = 0; i < 20; ++i) { 
      sum += i; if (sum > 50) break; else print "Sum =", sum 
   } 
}'

$ awk 'BEGIN {
   for (i = 1; i <= 20; ++i) {
      if (i % 2 == 0) print i ; else continue
   } 
}'

$ awk 'BEGIN {
   sum = 0; for (i = 0; i < 20; ++i) {
      sum += i; if (sum > 50) exit(10); else print "Sum =", sum 
   } 
}'

exit terminates the script; its argument is the exit status, which the shell exposes as $?.
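To make the exit status concrete, here is a small sketch that runs the same summing loop and then records the status in the shell. The variable name status is just an illustration.

```shell
status=0
# The || branch stores the exit status when awk exits non-zero.
awk 'BEGIN {
   sum = 0
   for (i = 0; i < 20; ++i) {
      sum += i
      if (sum > 50) exit 10
   }
}' || status=$?
echo "awk exited with status $status"
```

The running sum first exceeds 50 when i reaches 10, so awk stops there and the shell sees status 10.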

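Functions

Built-in functions

AWK provides many handy built-in functions. There are too many to be worth walking through one by one, and the details are easy to forget, so this section only lists the commonly used ones. A rough familiarity is enough; look them up in the manual when you need one.

Math functions

atan2(y, x)
cos(expr)
exp(expr)
int(expr)
log(expr)
rand()
sin(expr)
sqrt(expr)
srand([expr])

String functions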

asort(arr [, d [, how] ])
asorti(arr [, d [, how] ])
gsub(regex, sub, string)
index(str, sub)
length(str)
match(str, regex)
split(str, arr, regex)
sprintf(format, expr-list)
strtonum(str)
sub(regex, sub, string)
substr(str, start, l)
tolower(str)
toupper(str)
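As a quick orientation, the sketch below exercises a few of the string functions listed above. It uses only functions that are part of standard awk; the sample strings are made up.

```shell
demo=$(awk 'BEGIN {
   s = "hello, world"
   print length(s)                  # number of characters
   print index(s, "world")          # 1-based position of the substring
   print substr(s, 1, 5)            # 5 characters starting at position 1
   print toupper(s)
   n = split("a:b:c", parts, ":")   # returns the number of pieces
   print n, parts[1], parts[3]
   t = s; count = gsub(/o/, "0", t) # gsub edits t in place, returns count
   print count, t
}')
echo "$demo"
```

Note that sub and gsub modify their third argument rather than returning the new string, which is why the example works on a copy.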

Time functions

systime()
mktime(datespec)
strftime([format [, timestamp[, utc-flag]]])

Bit-manipulation functions

and
compl
lshift
rshift
or
xor

Other

close(expr): closes a file or pipe

Consider the following code

$ awk 'BEGIN {
   cmd = "tr [a-z] [A-Z]"
   print "hello, world !!!" |& cmd

   close(cmd, "to")
   cmd |& getline out
   print out;

   close(cmd);
}'
HELLO, WORLD !!!

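Hard to follow? Here is what each statement does:

The first statement, cmd = "tr [a-z] [A-Z]", is the command with which AWK sets up the two-way connection.
The second statement, print, supplies input to the tr command; |& indicates a two-way connection.
The third statement, close(cmd, "to"), closes the "to" end of the pipe once the input has been sent.
The fourth statement, cmd |& getline out, uses getline to store the output in the variable out.
Finally, the content of out is printed and cmd is closed.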

delete: removes an array element
exit: stops the script and returns its argument as the exit status
fflush

getline

This command makes awk read the next line of input, for example

$ awk '{getline; print $0}' marks.txt
2)  Rahul   Maths    90
4)  Kedar   English  85
5)  Hari    History  89
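The last line is printed above because getline fails at end of input and leaves $0 unchanged. getline returns 1 when it reads a record, 0 at end of input, and -1 on error, so checking the return value makes this case explicit. A minimal sketch with made-up input:

```shell
# Pair up consecutive lines; the getline return value tells us whether a
# second line was actually read.
pairs=$(printf '%s\n' one two three four five | awk '{
   first = $0
   if ((getline) > 0)
      print first, "+", $0
   else
      print first, "(no partner)"
}')
echo "$pairs"
```

With five input lines the last one has no partner, and the else branch reports that instead of silently reusing the old $0.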

getline var < file reads a line from file and stores it in the variable var

{
     if (NF == 2 && $1 == "@include") {
          while ((getline line < $2) > 0)
               print line
          # close() makes sure that a file named by two @include lines is read from the start both times
          close($2)
     } else
          print
}

The output of a command can also be piped into getline, using the form command | getline. The string is run as a shell command and its standard output is piped to awk as input; this form of getline reads one record at a time from the pipe. For example, the program below reads its input line by line. When it meets a line that starts with @execute, it runs the rest of that line as a command and prints the command's output in place of the line.

{
     if ($1 == "@execute") {
          tmp = substr($0, 10)        # Remove "@execute"
          while ((tmp | getline) > 0)
               # getline without a variable sets $0 to the line just read
               print
          close(tmp)
     } else
          print
}

If the input file contains

foo
bar
baz
@execute who
bletch

the output will look something like

foo
bar
baz
arnold     ttyv0   Jul 13 14:22
miriam     ttyp0   Jul 13 14:23     (murphy:0)
bill       ttyp1   Jul 13 14:23     (murphy:0)
bletch

command | getline var stores a line of the command's output in the variable var.

BEGIN {
     "date" | getline current_time
     close("date")
     print "Report printed on " current_time
}

Reading input through a pipe with getline is a one-way operation. Sometimes you want to send data to another process and then read back the processed result. This calls for a coprocess, and in gawk the |& operator opens a two-way pipe to one.

print "some query" |& "db_server"
"db_server" |& getline

Likewise, command |& getline var stores the coprocess's output in the variable var.

next
nextfile

return

Returns a value from a user-defined function.
First, create a file functions.awk containing the following awk code

function addition(num1, num2) {
   result = num1 + num2
   return result
}
BEGIN {
   res = addition(10, 20)
   print "10 + 20 = " res
}

Running it (for example with awk -f functions.awk) prints

10 + 20 = 30

system

This function runs the given command and returns its exit status; a status of 0 means the command succeeded.

$ awk 'BEGIN { ret = system("date"); print "Return value = " ret }'
Thu Oct 27 22:08:36 CST 2016
Return value = 0

User-defined functions

Functions are basic building blocks of a program, and AWK lets us define our own. A large program can be split into functions, each of which can be developed and tested independently and reused.

The basic syntax of a user-defined function is

function function_name(argument1, argument2, ...) {
   function body
}

For example, create a file named functions.awk containing the following code

# Returns minimum number
function find_min(num1, num2){
   if (num1 < num2)
      return num1
   return num2
}
# Returns maximum number
function find_max(num1, num2){
   if (num1 > num2)
      return num1
   return num2
}
# Main function
function main(num1, num2){
   # Find minimum number
   result = find_min(num1, num2)
   print "Minimum =", result

   # Find maximum number
   result = find_max(num1, num2)
   print "Maximum =", result
}
# Script execution starts here
BEGIN {
   main(10, 20)
}

Running the code produces the following output

Minimum = 10
Maximum = 20
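One point the examples above leave implicit: variables assigned inside an AWK function are global unless they appear in the parameter list. A common idiom is to declare extra parameters that callers never pass and use them as locals. The sketch below illustrates this; the function name sum_to is made up for the example.

```shell
out=$(awk '
# Extra parameters (after the gap) act as local variables: i and total
# are not visible outside the function.
function sum_to(n,    i, total) {
   for (i = 1; i <= n; i++)
      total += i
   return total
}
BEGIN {
   total = "untouched"          # a global with the same name
   print sum_to(4)
   print total                  # still "untouched"
}')
echo "$out"
```

The extra spaces before the local names are only a convention that helps readers tell real arguments from locals.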

Output redirection

Redirection operators

So far every program has written its data to standard output, but output can also be redirected to a file. The redirection operator follows a print or printf statement and works much as it does in the shell.

print DATA > output-file
print DATA >> output-file

For example, the following two commands have the same effect

$ echo "Hello, World !!!" > /tmp/message.txt
$ awk 'BEGIN { print "Hello, World !!!" > "/tmp/message.txt" }'

As in the shell, > writes the output to the named file, overwriting any existing content, while >> appends.

$ awk 'BEGIN { print "Hello, World !!!" >> "/tmp/message.txt" }'
$ cat /tmp/message.txt
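There is one difference from the shell worth seeing in action: within a single awk run, > truncates the file only when it is first opened, and later prints to the same file keep adding to it. The sketch below demonstrates this in a temporary directory; the file name is an assumption made for the example.

```shell
dir=$(mktemp -d)
# > truncates the file only when awk first opens it; later prints in the
# same run keep appending to the open file.
awk -v out="$dir/message.txt" 'BEGIN {
   print "first line"  > out
   print "second line" > out
   close(out)
}'
lines=$(wc -l < "$dir/message.txt" | tr -d ' ')
echo "$lines"
rm -rf "$dir"
```

Both lines end up in the file, so use >> only when content from an earlier run must be kept.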

Pipes

Besides files, output can be redirected to another program. As in the shell, use the pipe operator |.

$ awk 'BEGIN { print "hello, world !!!" | "tr [a-z] [A-Z]" }'
HELLO, WORLD !!!
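A pipe stays open across print statements, so every record can be sent to the same process. Here is a minimal sketch that sorts records by score; the input data is made up and the sort options are ordinary POSIX ones.

```shell
sorted=$(printf '%s\n' 'Amit 80' 'Rahul 90' 'Shyam 87' | awk '{
   # Every print goes to the same sort process, opened on first use
   print $2, $1 | "sort -rn"
}
END {
   close("sort -rn")   # wait for sort to finish before awk exits
}')
echo "$sorted"
```

The string passed to close must match the command string used with | exactly, since awk identifies the pipe by that string.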

gawk can use |& for two-way communication. What does that mean? A common scenario is sending data to another program for processing and then reading back the result, which requires a two-way pipe to another process. The second process runs in parallel with gawk and is called a coprocess. Whereas a one-way connection uses the | operator, a two-way connection uses |&.

do {
    print data |& "subprogram"
    "subprogram" |& getline results
} while (data left to process)
close("subprogram")

The first I/O operation that uses |& makes gawk create a two-way pipe to a child process running the other program. The output of print is written to subprogram's standard input, and subprogram's standard output is read in gawk with getline.

Note: currently the coprocess's standard error is mixed with gawk's standard error and cannot be captured separately. I/O buffering can also be a problem: gawk automatically flushes all output sent down the pipe to the coprocess, but if the coprocess does not flush its own standard output, gawk may hang when getline tries to read from it, which can lead to deadlock.

close can shut just one end of the two-way pipe, selected by a second argument of "to" or "from". These string values tell gawk to close the sending end once it has finished sending data to the coprocess, or the receiving end once it has finished reading. This is necessary with the system sort command, because sort must read all of its input before it can produce any output.

BEGIN {
    command = "LC_ALL=C sort"
    n = split("abcdefghijklmnopqrstuvwxyz", a, "")

    for (i = n; i > 0; i--)
        print a[i] |& command
    close(command, "to")

    while ((command |& getline line) > 0)
        print "got", line
    close(command)
}

For example, the following uses the tr command to convert lowercase to uppercase. Our command.awk file contains

BEGIN {
   cmd = "tr [a-z] [A-Z]"
   print "hello, world !!!" |& cmd
   close(cmd, "to")

   cmd |& getline out
   print out;
   close(cmd);
}

Output

HELLO, WORLD !!!

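The example looks a little complicated, so let us go through it line by line

The first line, cmd = "tr [a-z] [A-Z]", is the command AWK will open a two-way connection to
The print on the second line supplies input to tr, and |& indicates that the connection is two-way
The third line, close(cmd, "to"), closes the "to" end of the pipe once the input has been sent
The fourth line, cmd |& getline out, uses getline to store the output in the variable out
The last line closes the command with close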

Pretty printing

So far we have used print and printf to write data to standard output, but printf is far more capable than the examples so far suggest. It is borrowed from C and is very useful for formatted output.

$ awk 'BEGIN { printf "Hello\nWorld\n" }'
Hello
World

$ awk 'BEGIN { printf "ASCII value 65 = character %c\n", 65 }'
ASCII value 65 = character A

The format specifiers include %c, %d, %s and so on. They are largely the same as in C, so they are not covered in detail here.
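Width and precision are where printf earns its keep for reports. The sketch below lines up a name column and a score column; the input data and the | separators are made up to make the padding visible.

```shell
table=$(printf '%s\n' 'Amit 80' 'Rahul 90.5' | awk '{
   # %-8s: left-justify in 8 columns; %6.1f: width 6, one decimal place
   printf "%-8s|%6.1f|\n", $1, $2
}')
echo "$table"
```

Unlike print, printf adds no newline of its own, so the format string has to end with \n.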

Running shell commands

There are two ways to run shell commands from AWK

Using the system function
Using a pipe

Using the system function

The system function runs an operating-system command and returns the command's exit status to awk.

END {
     system("date | mail -s 'awk run done' root")
}

Using a pipe

If there are many commands to run, they can be printed straight into a pipe to "/bin/sh"

while (more stuff to do)
    print command | "/bin/sh"
close("/bin/sh")
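The fragment above is schematic. A runnable sketch of the same idea follows, with made-up input and a harmless echo standing in for the generated commands.

```shell
result=$(printf '%s\n' alpha beta | awk '{
   # Build one shell command per input line and feed it to a single sh
   print "echo item: " $1 | "/bin/sh"
}
END { close("/bin/sh") }')
echo "$result"
```

Because each input line becomes part of a shell command, this pattern should only be used on input you trust; untrusted data would need quoting before it reaches the shell.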
