Basics: Scripting Exercises (2)

Date: 2020-01-16

Tags: basics, scripting


An exercise taken from Teacher Oldboy (老男孩):

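Process the file contents below: extract the domain name from each URL, count how many times each domain occurs, and sort the result (an interview question from Baidu and Sohu).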


oldboy.log:
http://www.etiantian.org/index.html
http://www.etiantian.org/1.html
http://post.etiantian.org/index.html
http://mp3.etiantian.org/index.html
http://www.etiantian.org/3.html
http://post.etiantian.org/2.html

Shell implementations:

awk -F "/"  '{print $3}'  oldboy.log | sort -r | uniq -c
cut -d "/" -f3 oldboy.log  | sort -r | uniq -c
cat oldboy.log | sed 's/^ *http:\/\///' | sed 's/\/.*$//' | sort -r | uniq -c
The three approaches above are fairly simple.
awk -F "/" '{++S[$3]} END {for(key in S) print key,S[key]}' oldboy.log | sort -k2,2n

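How the fourth approach works: awk splits each line on "/", so $3 is the domain. ++S[$3] increments that domain's counter in the associative array S. After the last line has been read, the END block loops over the keys of S and prints each domain with its count, and the final sort orders the output by the second column, the count.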

Python implementation:

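# coding: utf-8
import sys
from itertools import groupby


def demo(ListFile):
    """Return a list of (domain, count) pairs for the URLs in ListFile."""
    reList = []
    with open(ListFile, 'r') as files:
        for item in files:
            item = item.strip()
            if not item:
                continue  # skip blank lines
            # "http://host/path".split("/") gives ['http:', '', 'host', 'path']
            rList = item.split("/")
            reList.append(rList[2])

    # groupby only merges adjacent equal items, so the list is sorted first
    result = [(a, len(list(b))) for a, b in groupby(sorted(reList))]
    return result


if __name__ == "__main__":
    # Use the path given on the command line, else the default log file
    ListFile = sys.argv[1] if len(sys.argv) > 1 else "/tmp/oldboy.log"
    print(demo(ListFile))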
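
As an alternative to the groupby version, the same counting can be sketched with collections.Counter, which also returns the domains ordered by count, as the exercise asks. This is only a minimal sketch assuming Python 3; the names count_domains and sample are hypothetical and introduced here for illustration, and the sample list simply repeats the contents of oldboy.log shown earlier.

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(lines):
    """Return (domain, count) pairs, most frequent domain first.

    count_domains is a hypothetical helper name, not part of the original script.
    """
    # urlparse(...).netloc is the host part of "http://host/path"
    hosts = (urlparse(line.strip()).netloc for line in lines if line.strip())
    # most_common() orders the pairs by descending count
    return Counter(hosts).most_common()


# Same URLs as in oldboy.log
sample = [
    "http://www.etiantian.org/index.html",
    "http://www.etiantian.org/1.html",
    "http://post.etiantian.org/index.html",
    "http://mp3.etiantian.org/index.html",
    "http://www.etiantian.org/3.html",
    "http://post.etiantian.org/2.html",
]

for host, count in count_domains(sample):
    print(count, host)
```

To run it against the real file, an open file object can be passed instead of the list, for example count_domains(open("/tmp/oldboy.log")).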
