An Introduction to Perl's grep Function

Date: 2019-11-30


The grep function

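(If you are new to Perl, you can skip the next two paragraphs and go straight to the "grep vs. loops" example. Don't worry, you will meet this material again later.)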

<pre>
grep BLOCK LIST
grep EXPR, LIST
</pre>
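The grep function evaluates BLOCK or EXPR for each element of LIST, setting the local variable $_ to the element currently being examined. BLOCK is one or more Perl statements enclosed in curly braces, and LIST is an ordered list of values. EXPR is an expression made up of one or more variables, operators, literals, functions, or subroutine calls. grep evaluates BLOCK or EXPR and adds every element for which the result is true to the list it returns. If BLOCK consists of several statements, the value of the last statement decides the outcome. LIST can be a list or an array. In scalar context, grep returns the number of elements for which BLOCK or EXPR evaluated to true.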
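The two call forms and the two contexts described above can be sketched as follows. This is a minimal illustration; the array @numbers and its values are made up for the example and do not come from the original article.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original text.
my @numbers = (3, 8, 15, 4, 23, 42);

# BLOCK form: the last statement in the block decides whether $_ is kept.
my @big = grep { $_ > 10 } @numbers;

# EXPR form: note the comma between the expression and the list.
my @even = grep $_ % 2 == 0, @numbers;

# Scalar context: grep returns the number of matching elements.
my $big_count = grep { $_ > 10 } @numbers;

print "big: @big\n";          # big: 15 23 42
print "even: @even\n";        # even: 8 4 42
print "count: $big_count\n";  # count: 3
```

The only syntactic difference between the forms is the comma: BLOCK is followed directly by the list, while EXPR must be separated from it by a comma.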

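Avoid modifying $_ inside BLOCK or EXPR, because doing so modifies the corresponding element of LIST. Likewise, avoid using the list returned by grep as an lvalue, because that also modifies the elements of LIST. (An lvalue is a variable that appears on the left-hand side of an assignment.) Some Perl hackers may exploit this so-called "feature", but I recommend that you stay away from this confusing style.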
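To make the warning concrete, here is a small sketch of what happens when a block changes $_. The arrays @prices and @source are hypothetical sample data chosen for the illustration.

```perl
use strict;
use warnings;

my @prices = (10, 20, 30);   # hypothetical sample data

# Anti-pattern: the block changes $_, which is an alias for each element
# of @prices, so the original array is modified as a side effect.
my @doubled_big = grep { $_ *= 2; $_ > 30 } @prices;

print "@prices\n";       # 20 40 60  (the source array changed)
print "@doubled_big\n";  # 40 60

# Safer: work on a copy inside the block and leave $_ untouched.
my @source = (10, 20, 30);
my @big = grep { my $doubled = $_ * 2; $doubled > 30 } @source;

print "@source\n";  # 10 20 30  (unchanged)
print "@big\n";     # 20 30
```

If the goal is to transform values rather than select them, map is usually the clearer tool; grep is best kept for pure selection.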

grep vs. loops

This example prints the lines of the file myfile that contain terrorism or nuclear (case-insensitive).

<pre>
open FILE, "<myfile" or die "Can't open myfile: $!";
print grep /terrorism|nuclear/i, <FILE>;
</pre>
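For a large file this code uses a lot of memory, because grep evaluates its second argument in list context, so the <> operator returns the entire file at once. A more efficient version reads one line at a time: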
<pre>
while ($line = <FILE>) {
    if ($line =~ /terrorism|nuclear/i) { print $line }
}
</pre>
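As this shows, a loop can do everything grep can do. So why use grep at all? The intuitive answer is that grep is more Perl-like in style, while loops are C-like. A better answer has two parts. First, grep tells the reader plainly that the operation selects the wanted values from a list. Second, grep is more concise than a loop (in software engineering terms, grep is more cohesive than a loop). Basically, if you are not very familiar with Perl, feel free to use loops. Otherwise, you should make more use of powerful tools like grep.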
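The equivalence between the two styles can be checked side by side on data that is already in memory. The sketch below uses a hypothetical @lines array in place of a real file, so it runs without myfile being present.

```perl
use strict;
use warnings;

# Hypothetical in-memory lines standing in for the contents of a file.
my @lines = (
    "Talks on nuclear policy resume\n",
    "Local bakery wins award\n",
    "Report on TERRORISM financing\n",
);

# grep version: one expression that states the intent (select matching lines).
my @with_grep = grep { /terrorism|nuclear/i } @lines;

# Loop version: the same selection written out step by step.
my @with_loop;
for my $line (@lines) {
    push @with_loop, $line if $line =~ /terrorism|nuclear/i;
}

print @with_grep;
print "same result\n" if "@with_grep" eq "@with_loop";
```

When the data already fits in memory, as here, the grep form costs nothing extra; the memory concern applies only when the list is produced by slurping a large file.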

Counting the array elements that match a given pattern

In scalar context, grep returns the number of matching elements.

<pre>
$num_apple = grep /^apple$/i, @fruits;
</pre>
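Using the ^ and $ anchors together means that only elements that both begin and end with apple match, that is, elements consisting of exactly the word apple. Here grep matches apple but not pineapple.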
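The effect of the anchors is easiest to see by counting with and without them. The contents of @fruits below are hypothetical, since the original does not show the array.

```perl
use strict;
use warnings;

# Hypothetical contents for @fruits; the original does not define it.
my @fruits = qw(apple Apple pineapple APPLE applesauce banana);

# Anchored: the whole element must be "apple" (in any letter case).
my $num_apple = grep /^apple$/i, @fruits;

# Unanchored: any element containing "apple" counts.
my $num_containing = grep /apple/i, @fruits;

print "$num_apple\n";       # 3
print "$num_containing\n";  # 5
```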

Printing the distinct elements of a list

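<pre>
@unique = grep { ++$count{$_} < 2 } qw(a b a c d d e f g f h h);
print "@unique\n";
</pre>
Output:
<pre>
a b c d e f g h
</pre>
$count{$_} is an element of a Perl hash, that is, a key-value pair. (Perl hashes are related to the hash tables of computer science, but they are not exactly the same thing.) The keys of the count hash are the values in the input list, and the value stored under each key is the number of times that key has been seen so far. The first time a value appears, BLOCK evaluates to true (because the count is less than 2). When the same value appears again, BLOCK evaluates to false (because the count is now 2 or more).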
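A commonly seen variant of the same technique uses a post-increment and a negation instead of comparing against 2. The sketch below is an alternative spelling, not the article's own code, and the hash name %seen is chosen for the illustration.

```perl
use strict;
use warnings;

my %seen;  # hypothetical name; plays the role of the count hash

# $seen{$_}++ returns the old count (undef, i.e. false, on first sight)
# and then increments it, so !$seen{$_}++ is true only the first time.
my @unique = grep { !$seen{$_}++ } qw(a b a c d d e f g f h h);

print "@unique\n";     # a b c d e f g h
print "$seen{a}\n";    # 2  (a appeared twice in the input)
```

Both versions keep the first occurrence of each value and preserve the input order.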

Extracting the values that appear exactly twice in a list

<pre>
@crops = qw(wheat corn barley rice corn soybean hay alfalfa rice hay beets corn hay);
@duplicates = grep { $count{$_} == 2 }
              grep { ++$count{$_} > 1 } @crops;
print "@duplicates\n";
</pre>
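grep evaluates its second argument in list context before the first list element is passed to BLOCK or EXPR. This means that the second (right-hand) grep finishes filling the count hash before the left-hand grep starts evaluating its BLOCK.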
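Splitting the chained call into two steps makes the evaluation order visible. This is a sketch of the same logic with an intermediate array, @repeats, which is a name introduced here for illustration.

```perl
use strict;
use warnings;

my @crops = qw(wheat corn barley rice corn soybean hay alfalfa rice hay beets corn hay);
my %count;

# Step 1 (the right-hand grep): runs over the whole list first and keeps
# every second-or-later occurrence, filling %count along the way.
my @repeats = grep { ++$count{$_} > 1 } @crops;
print "@repeats\n";      # corn rice hay corn hay

# Step 2 (the left-hand grep): %count now holds the final totals,
# so only values seen exactly twice pass.
my @duplicates = grep { $count{$_} == 2 } @repeats;
print "@duplicates\n";   # rice
```

corn and hay each occur three times, so they are dropped in the second step; rice occurs exactly twice and is the only value left.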

Listing the text files in the current directory

<pre>
@files = grep { -f and -T } glob '* .*';
print "@files\n";
</pre>
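The glob function is independent of the operating system; it expands filename patterns the way a Unix shell does. A lone * matches every file in the current directory whose name does not begin with a dot, and .* matches every file whose name does begin with a dot. The -f and -T file test operators check for a plain file and a text file respectively, and return true when the test succeeds. Testing with -f and -T is more efficient than testing with -T alone, because if a file fails the -f test, the -T test, which is more expensive than -f, is never run.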
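The following sketch tries the same filter in a throwaway directory so that the result does not depend on whatever happens to be in the current directory. The file names notes.txt, data.bin, and subdir are hypothetical, and the file tests are written with an explicit $_ for clarity.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Build a temporary directory containing one text file, one binary file,
# and one subdirectory (all names are made up for this example).
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir $dir or die "Can't chdir to $dir: $!";

open my $txt, '>', 'notes.txt' or die "Can't write notes.txt: $!";
print {$txt} "plain text line\n" x 5;
close $txt;

open my $bin, '>:raw', 'data.bin' or die "Can't write data.bin: $!";
print {$bin} "\0" x 64;          # NUL bytes make -T treat the file as binary
close $bin;

mkdir 'subdir' or die "Can't mkdir subdir: $!";

# -f rejects directories (including . and ..) before -T is ever tried.
my @files = grep { -f $_ and -T $_ } glob '* .*';
print "@files\n";

chdir $start or die "Can't chdir back to $start: $!";
```

Under these assumptions only notes.txt should pass both tests: subdir and the dot entries fail -f, and data.bin fails -T.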

Selecting non-duplicate elements from an array

<pre>
@array = qw(To be or not to be that is the question);
print "@array\n";
@found_words = grep { $_ =~ /b|o/i and ++$counts{$_} < 2; } @array;
print "@found_words\n";
</pre>
Output:
<pre>
To be or not to be that is the question
To be or not to question
</pre>

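The logical expression $_ =~ /b|o/i matches elements that contain a b or an o (case-insensitive). In this example, putting the match before the increment is more efficient than the other way round, because when the left-hand expression is false the right-hand expression is never evaluated.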
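One way to observe the short-circuit is to look at which words ever reach the counter. The sketch below reuses the article's data; the variable $tracked is introduced here for illustration.

```perl
use strict;
use warnings;

my @array = qw(To be or not to be that is the question);
my %counts;

# The increment on the right of "and" runs only when the match succeeds.
my @found_words = grep { /b|o/i and ++$counts{$_} < 2 } @array;

# Words without a b or an o (that, is, the) never get a hash entry.
my $tracked = keys %counts;

print "@found_words\n";   # To be or not to question
print "$tracked\n";       # 6
```

Note that To and to are counted separately, because hash keys are case-sensitive even though the match is not.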

Selecting the points whose y coordinate is greater than their x coordinate from an array of 2D coordinates

<pre>
# An array of references to anonymous arrays
@data_points = ( [ 5, 12 ], [ 20, -3 ], [ 2, 2 ], [ 13, 20 ] );
@y_gt_x = grep { $_->[0] < $_->[1] } @data_points;
foreach $xy (@y_gt_x) { print "$xy->[0], $xy->[1]\n" }
</pre>
Output:
<pre>
5, 12
13, 20
</pre>

Finding restaurants in a simple database

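The database implementation in this example is not one you should use in a real application, but it shows that when you use grep, there is essentially no limit on the complexity of BLOCK as long as you have enough memory.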

<pre>
# @database is an array of references to anonymous hashes
@database = (
    { name    => "Wild Ginger",
      city    => "Seattle",
      cuisine => "Asian Thai Chinese Korean Japanese",
      expense => 4,
      music   => "\0",
      meals   => "lunch dinner",
      view    => "\0",
      smoking => "\0",
      parking => "validated",
      rating  => 4,
      payment => "MC VISA AMEX",
    },
    # { ... }, etc.
);

sub findRestaurants {
    my ($database, $query) = @_;
    return grep {
        $query->{city} ?
            lc($query->{city}) eq lc($_->{city}) : 1
        and $query->{cuisine} ?
            $_->{cuisine} =~ /$query->{cuisine}/i : 1
        and $query->{min_expense} ?
            $_->{expense} >= $query->{min_expense} : 1
        and $query->{max_expense} ?
            $_->{expense} <= $query->{max_expense} : 1
        and $query->{music} ? $_->{music} : 1
        and $query->{music_type} ?
            $_->{music} =~ /$query->{music_type}/i : 1
        and $query->{meals} ?
            $_->{meals} =~ /$query->{meals}/i : 1
        and $query->{view} ? $_->{view} : 1
        and $query->{smoking} ? $_->{smoking} : 1
        and $query->{parking} ? $_->{parking} : 1
        and $query->{min_rating} ?
            $_->{rating} >= $query->{min_rating} : 1
        and $query->{max_rating} ?
            $_->{rating} <= $query->{max_rating} : 1
        and $query->{payment} ?
            $_->{payment} =~ /$query->{payment}/i : 1
    } @$database;
}

%query = ( city => 'Seattle', cuisine => 'Asian|Thai' );
@restaurants = findRestaurants(\@database, \%query);
print "$restaurants[0]->{name}\n";
</pre>
Output:
<pre>
Wild Ginger
</pre>