Processing CSV files with Python 3

1. Basic statements

1.1 Reading a file

Suppose the data looks like Table 1 below and we need to read the rows under the "Domain name" header. The following code reads every row:

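import csv

# Python 3 needs text mode ('r', not 'rb'); newline='' is what the csv module expects
with open('A.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    rows = [row for row in reader]  # each row is a list of strings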

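Each row is one line of the file as a list, for example ['121.241.244.92', 'known attacker', '"blocklist.de (+dataplane.org,greensnow.co,rulez.sk,rutgers.edu)"'], so row[0] is 121.241.244.92.
Reading a single column is then easy: take element 0 of every row to build the column. The reader has already been consumed by the time rows is built, so iterate over rows rather than over reader: column0 = [row[0] for row in rows]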
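To make the point about columns concrete, here is a minimal sketch. It assumes an in-memory stand-in for A.csv (the name sample and the two-column layout are made up for illustration) and shows that a csv.reader can only be iterated once, so the column has to come from the saved rows.

```python
import csv
import io

# Hypothetical in-memory stand-in for A.csv, using two columns from Table 1.
sample = io.StringIO(
    "121.241.244.92,known attacker\n"
    "121.32.150.82,known attacker\n"
    "121.33.237.102,known attacker\n"
)

reader = csv.reader(sample)
rows = [row for row in reader]         # this consumes the reader
column0 = [row[0] for row in rows]     # build the column from the saved rows
leftover = [row[0] for row in reader]  # the reader is exhausted, so this is empty

print(column0)   # the three IP addresses
print(leftover)  # []
```

If the file is too large to keep in memory, the column can instead be collected in a single pass over the reader without building rows first.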
Fetching one particular row is slightly more involved. The code is as follows:

import csv

with open('A.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for i, rows in enumerate(reader):
        if i == 2:  # index 2 is the third row
            row = rows
            break   # no need to read the rest of the file

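The code above uses enumerate(). enumerate() takes an iterable (such as a list, tuple, or string) and yields (index, item) pairs, so the data and its index are available together; it is usually used in a for loop. Here it attaches an index to every row from the reader, and that index is used to select the row we want.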
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
list(enumerate(seasons))
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
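The same enumerate() idea can be wrapped in a small helper. This is only a sketch: the function name read_row and the in-memory sample data are assumptions for illustration, not part of the original code.

```python
import csv
import io

def read_row(csvfile, index):
    """Return the row at a zero-based index, or None if the file is shorter."""
    for i, row in enumerate(csv.reader(csvfile)):
        if i == index:
            return row  # stop reading as soon as the row is found
    return None

# Hypothetical sample data with two columns taken from Table 1.
sample_text = (
    "121.241.244.92,known attacker\n"
    "121.32.150.82,known attacker\n"
    "121.33.237.102,known attacker\n"
)

third = read_row(io.StringIO(sample_text), 2)
missing = read_row(io.StringIO(sample_text), 10)
print(third)
print(missing)
```

Returning from inside the loop means the rest of a large file is never read, and an index past the end of the file gives None instead of leaving a variable undefined.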

The contents of reader as a list:
[['121.241.244.92', 'known attacker', '"blocklist.de (+dataplane.org,greensnow.co,rulez.sk,rutgers.edu)"'],
['121.32.150.82', 'known attacker', '"blocklist.de (+dataplane.org,greensnow.co)"'],
['121.33.237.102', 'known attacker', '"blocklist.de (+dataplane.org,greensnow.co,rutgers.edu)"']]

Table 1

Domain name
121.241.244.92 known attacker "blocklist.de (+dataplane.org,greensnow.co,rulez.sk,rutgers.edu)"
121.32.150.82 known attacker "blocklist.de (+dataplane.org,greensnow.co)"
121.33.237.102 known attacker "blocklist.de (+dataplane.org,greensnow.co,rutgers.edu)"

1.2 Writing a file

Writing a file is much simpler. Just remember that "a+" appends to the file and "w" overwrites it. The code is:

import csv

# rows holds a single row here: a list of field values
with open("test.csv", "a+", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(rows)
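The difference between overwriting and appending is easy to check with a scratch file. The sketch below assumes a temporary path and made-up row values; only the open modes and csv.writer calls mirror the code above.

```python
import csv
import os
import tempfile

# Hypothetical scratch file so the example does not touch a real test.csv.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

# "w" truncates the file on every open, so only the last write survives.
with open(path, "w", newline="") as csvfile:
    csv.writer(csvfile).writerow(["old", "data"])
with open(path, "w", newline="") as csvfile:
    csv.writer(csvfile).writerow(["121.241.244.92", "known attacker"])

# "a" keeps the existing content and adds the new row at the end.
with open(path, "a", newline="") as csvfile:
    csv.writer(csvfile).writerow(["121.32.150.82", "known attacker"])

with open(path, newline="") as csvfile:
    written = list(csv.reader(csvfile))

print(written)
```

Passing newline="" when opening the file follows the csv module's recommendation and avoids blank lines between rows on Windows.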

2. Use in a project

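Scenario: given a trails.csv file with more than a million rows, as shown in the figure below, I need to pick out every row whose first column contains only an IP address or IP:port, and write those rows to a new file.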

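This calls for a regular expression that matches the IP address; rows that match are written to the output file. The regular expression for an IPv4 address with an optional port is \A\d+\.[0-9.]+\Z|\A\d+\.[0-9.]+\:[0-9]+\Z. re.search performs the match, and the code is as follows:

import csv
import re

pattern = re.compile(r"\A\d+\.[0-9.]+\Z|\A\d+\.[0-9.]+\:[0-9]+\Z")

with open("trails.csv", "r", encoding="utf-8", newline="") as f, \
        open("test.csv", "a+", encoding="utf-8", newline="") as csvfile:
    reader = csv.reader(f)
    writer = csv.writer(csvfile)  # open the output once, not once per match
    for rows in reader:
        if rows and pattern.search(rows[0]):  # skip empty lines
            writer.writerow(rows)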
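The regular expression above is deliberately loose: it also accepts strings such as 1.2.3.4.5 or 999.1.1.1. If stricter matching is needed, one possible alternative is the standard library's ipaddress module. The sketch below is an assumption about how such a check could look, not part of the original code, and the helper name is_ip_or_ip_port is made up.

```python
import ipaddress

def is_ip_or_ip_port(value):
    """Return True for 'a.b.c.d' or 'a.b.c.d:port' with a valid IPv4 address and port."""
    host, sep, port = value.partition(":")
    try:
        ipaddress.IPv4Address(host)  # raises ValueError for anything that is not IPv4
    except ValueError:
        return False
    if not sep:
        return True  # plain IP address with no port
    # The port must be ASCII digits only and fit in the 0-65535 range.
    return port.isascii() and port.isdigit() and int(port) <= 65535

print(is_ip_or_ip_port("121.241.244.92"))       # True
print(is_ip_or_ip_port("121.241.244.92:8080"))  # True
print(is_ip_or_ip_port("1.2.3.4.5"))            # False
```

In the filtering loop, a call to this helper on rows[0] could replace the re.search test. For a file with millions of rows it will likely be slower than a compiled regular expression.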