[Bash] LeetCode 192. Word Frequency

Date: 2019-11-11

Tags: bash, leetcode192, leetcode, word frequency. Category: Unix

★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★
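➤WeChat official account: 山青詠芝 (shanqingyongzhi)
➤cnblogs address: 山青詠芝 (https://www.cnblogs.com/strengthen/)
➤GitHub: https://github.com/strengthen/LeetCode
➤Original article: http://www.javashuo.com/article/p-bhjmwdak-md.html
➤If the link is not 山青詠芝's cnblogs address, this copy was probably scraped from the author's article.
➤The original has since been revised and updated. Reading it at the original address is strongly recommended. Support the author and original work!
★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★★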

Write a bash script to calculate the frequency of each word in a text file words.txt.

For simplicity's sake, you may assume:

words.txt contains only lowercase characters and space ' ' characters.
Each word must consist of lowercase characters only.
Words are separated by one or more whitespace characters.

Example:

Assume that words.txt has the following content:

the day is sunny the the
the sunny is is

Your script should output the following, sorted by descending frequency:

the 4
is 3
sunny 2
day 1

Note:

Don't worry about handling ties; it is guaranteed that each word's frequency count is unique.
Could you write it in one line using Unix pipes?
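
To try a candidate answer locally before submitting, the sample from the problem statement can be turned into a small check. This is only a sketch: the helper name `check_solution` and the scratch directory are assumptions made for illustration, not part of the problem or of LeetCode's judge.

```shell
# Hypothetical helper: runs a candidate command inside a scratch directory
# that contains the sample words.txt, then compares its output with the
# expected list from the problem statement.
check_solution() {
    cs_candidate=$1
    cs_dir=$(mktemp -d) || return 1
    printf 'the day is sunny the the\nthe sunny is is\n' > "$cs_dir/words.txt"
    cs_expected=$(printf 'the 4\nis 3\nsunny 2\nday 1')
    cs_actual=$(cd "$cs_dir" && sh -c "$cs_candidate")
    rm -rf "$cs_dir"
    if [ "$cs_actual" = "$cs_expected" ]; then
        echo PASS
    else
        echo FAIL
        return 1
    fi
}

# Example: check a one-line pipeline (note the escaped $ inside double quotes).
check_solution "tr -s ' ' '\n' < words.txt | sort | uniq -c | sort -rn | awk '{ print \$2, \$1 }'"
```

The candidate is passed as a string so that any of the one-liners can be checked without editing the helper.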


4ms

# Read from the file words.txt and output the word frequency list to stdout.
# tr puts one word per line, uniq -c counts them, sort -rn orders by count (numeric, descending).
cat words.txt | tr -s ' ' '\n' | sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
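
Ordering by count depends on how sort compares the count column. A small sketch of the difference between a plain reverse sort and a numeric reverse sort on bare numbers (the values 9 and 10 are made up for illustration):

```shell
# Plain reverse sort compares character by character, so "9" outranks "10".
lexical_first=$(printf '9\n10\n' | sort -r | head -n 1)

# Numeric reverse sort compares values, so 10 outranks 9.
numeric_first=$(printf '9\n10\n' | sort -rn | head -n 1)

echo "sort -r puts $lexical_first first"
echo "sort -rn puts $numeric_first first"
```

With `uniq -c` the counts are padded with leading blanks, which can hide this difference in some environments, so a numeric key (`-n`) is the safer choice for the final sort.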

8ms

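# Read from the file words.txt and output the word frequency list to stdout.
awk '{
    # Count every whitespace-separated field on the current line.
    for (i = 1; i <= NF; ++i) ++s[$i];
} END {
    # Array traversal order is unspecified, so sort by the count column afterwards.
    for (i in s) print i, s[i];
}' words.txt | sort -nr -k 2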

16ms

# Read from the file words.txt and output the word frequency list to stdout.

# try 1
sed 's/ \{1,\}/\n/g' words.txt | sed '/^$/d' | sort | uniq -c | sort -nr | awk '{print $2,$1}'
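
The `\n` in the sed replacement may not be interpreted as a newline by every sed implementation. A possible variant, sketched here under the assumption that portability matters, writes the line break as a backslash followed by a literal newline, which is the form POSIX sed defines. The wrapper name `word_freq_sed` and the temporary sample file are hypothetical and used only for the demo.

```shell
# Hypothetical wrapper around the sed-based approach; takes the input path as $1.
word_freq_sed() {
    # A backslash followed by a literal newline in the replacement inserts a
    # line break, so runs of spaces become line breaks without relying on \n.
    sed 's/ \{1,\}/\
/g' "$1" | sed '/^$/d' | sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
}

# Demo on the sample input from the problem statement.
sample=$(mktemp)
printf 'the day is sunny the the\nthe sunny is is\n' > "$sample"
word_freq_sed "$sample"
rm -f "$sample"
```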
