The so-called CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases. CSV format was used for many years prior to attempts to describe the format in a standardized way in RFC 4180.
The lack of a well-defined standard means that subtle differences often exist in the data produced and consumed by different applications. These differences can make it annoying to process CSV files from multiple sources. Still, while the delimiters and quoting characters vary, the overall format is similar enough that it is possible to write a single module which can efficiently manipulate such data, hiding the details of reading and writing the data from the programmer.
The csv module implements classes to read and write tabular data in CSV format. It allows programmers to say, "write this data in the format preferred by Excel," or "read data from this file which was generated by Excel," without knowing the precise details of the CSV format used by Excel. Programmers can also describe the CSV formats understood by other applications or define their own special-purpose CSV formats.
The csv module's reader and writer objects read and write sequences. Programmers can also read and write data in dictionary form using the DictReader and DictWriter classes.
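csv.reader(csvfile, dialect='excel', **fmtparams)
csvfile
Must be an object that supports the iterator protocol and returns a string each time its next method is called. Ordinary file objects and list objects both qualify. In Python 3, a file object should be opened with newline='' (the "b" flag was the Python 2 convention).
dialect
The encoding style. The default is excel, that is, comma-separated. The csv module also supports the excel-tab style, that is, tab-separated. Any other style has to be defined by the programmer and can then be registered with register_dialect; list_dialects returns the names of all registered dialects.
fmtparams
Formatting parameters, used to override individual settings of the dialect given above.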
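As a minimal sketch of how these three parameters interact, the snippet below registers a custom dialect, reads from a plain list of strings instead of a file, and overrides one dialect setting through a keyword argument. The dialect name "pipes" and the sample rows are made up for this illustration.

```python
import csv

# "pipes" is a hypothetical dialect name chosen only for this example.
csv.register_dialect("pipes", delimiter="|", quoting=csv.QUOTE_MINIMAL)

# Any iterable of strings can serve as csvfile, so a list stands in for a file.
lines = ["name|score", "Ann|90", "Bob|85"]
rows = list(csv.reader(lines, dialect="pipes"))
print(rows)

# A fmtparams keyword overrides the matching dialect setting for this reader only.
semi_rows = list(csv.reader(["a;b;c"], dialect="pipes", delimiter=";"))
print(semi_rows)

# The new dialect is listed next to the built-in ones.
print(csv.list_dialects())
```

Registering a dialect is mainly useful when the same settings are shared by several readers and writers; for a one-off file, passing the keyword arguments directly is usually enough.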
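Parameter details:
delimiter: sets the field separator
quotechar: sets the quote character
quoting: the quoting policy; four options are described here
Each is defined as a constant in the csv module:
QUOTE_ALL quotes every field, regardless of its type.
QUOTE_MINIMAL quotes only fields that contain special characters (that is, characters that could confuse a parser configured with the same dialect and options). This is the default.
QUOTE_NONNUMERIC quotes every field that is not an integer or a float. When used with a reader, unquoted input fields are converted to float.
QUOTE_NONE quotes nothing in the output. When used with a reader, quote characters are kept as part of the field value (normally they are treated as delimiters and stripped).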
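The effect of the four quoting options is easiest to see side by side. The sketch below writes the same row with each option into an in-memory buffer; the helper name render and the sample row are assumptions made for this illustration. QUOTE_NONE needs an escapechar here because one field contains the delimiter.

```python
import csv
import io


def render(row, **fmtparams):
    """Write one row with the given format parameters and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer, **fmtparams).writerow(row)
    return buffer.getvalue()


row = ["Spam", "a,b", 3, 4.5]

minimal = render(row, quoting=csv.QUOTE_MINIMAL)
quoted_all = render(row, quoting=csv.QUOTE_ALL)
nonnumeric = render(row, quoting=csv.QUOTE_NONNUMERIC)
# Without quotes, the embedded comma has to be escaped instead.
unquoted = render(row, quoting=csv.QUOTE_NONE, escapechar="\\")

for text in (minimal, quoted_all, nonnumeric, unquoted):
    print(text, end="")

# On the reading side, QUOTE_NONNUMERIC turns unquoted fields into floats,
# while QUOTE_NONE leaves the quote characters inside the field value.
parsed = next(csv.reader(['"Spam",3,4.5'], quoting=csv.QUOTE_NONNUMERIC))
raw = next(csv.reader(['"Spam",3'], quoting=csv.QUOTE_NONE))
print(parsed)  # ['Spam', 3.0, 4.5]
print(raw)     # ['"Spam"', '3']
```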
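import csv

def testReader(file):
    # newline='' lets the csv module handle line endings itself;
    # encoding='utf-8' is an assumption about how the file was saved.
    with open(file, 'r', newline='', encoding='utf-8') as csvfile:
        spamreader = csv.reader(csvfile, delimiter=',')
        for row in spamreader:
            print(', '.join(row))

if __name__ == '__main__':
    csvFile = 'test.csv'
    testReader(csvFile)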
Parameters (omitted: the same as for reader, see above)
def testWriter(file):
    # newline='' prevents extra blank lines between rows on Windows.
    with open(file, 'w', newline='') as csvfile:
        spamwriter = csv.writer(csvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
        spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
        spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
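Create an object that operates like a regular reader but maps the information in each row to an OrderedDict (a plain dict as of Python 3.8) whose keys are given by the optional fieldnames parameter.
The fieldnames parameter is a sequence. If fieldnames is omitted, the values in the first row of file f are used as the field names. Regardless of how the field names are determined, the dictionary preserves their original ordering.
If a row has more fields than field names, the remaining data is put in a list and stored under the field name specified by restkey (which defaults to None). If a non-blank row has fewer fields than field names, the missing values are filled in with the value of restval (which defaults to None).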
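The handling of rows that are too long or too short can be sketched as follows. The key "overflow", the filler "n/a", and the sample rows are illustrative choices, not part of the original example data.

```python
import csv

# A list of strings stands in for an open file.
lines = [
    "name,score",
    "Ann,90,extra1,extra2",  # more fields than field names
    "Bob",                   # fewer fields than field names
]

reader = csv.DictReader(lines, restkey="overflow", restval="n/a")
rows = [dict(row) for row in reader]

print(reader.fieldnames)  # taken from the first line because fieldnames was omitted
for row in rows:
    print(row)
```

The surplus fields of the first data row end up as a list under "overflow", and the missing score of the second row is filled with "n/a". With the defaults, the surplus list would be stored under the key None and missing values would be None.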
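def testDictReader(file):
    # Header row of the file:
    # 院系,專業,年級,學生類別,班級,學號,姓名,學分紅績,更新時間,班級排名,參與班級排名總人數
    # Text mode with newline='' is required in Python 3; 'rb' only worked in Python 2.
    with open(file, 'r', newline='', encoding='utf-8') as csvfile:
        dictreader = csv.DictReader(csvfile)
        for row in dictreader:
            print(' '.join([row['院系'], row['專業'], row['學號'], row['姓名']]))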
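Create an object that operates like a regular writer but maps dictionaries onto output rows. The fieldnames parameter is a sequence of keys that identifies the order in which the values of the dictionary passed to the writerow() method are written to file f. The optional restval parameter specifies the value to write when the dictionary is missing a key listed in fieldnames. If the dictionary passed to writerow() contains a key not found in fieldnames, the optional extrasaction parameter indicates what to do. If it is set to 'raise', the default, a ValueError is raised. If it is set to 'ignore', the extra values in the dictionary are ignored. Any other optional or keyword arguments are passed to the underlying writer instance.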
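Note that, unlike the DictReader class, the fieldnames parameter of DictWriter is not optional. This note dates from Python versions in which dict objects were unordered, so there was not enough information to deduce the order in which a row should be written to file f. fieldnames is still required in current versions.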
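A short sketch of restval and extrasaction in action, writing to an in-memory buffer rather than a real file. The field names and rows are invented for this illustration.

```python
import csv
import io

fieldnames = ["name", "score"]

# extrasaction="ignore" drops unknown keys; restval fills in missing ones.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="n/a", extrasaction="ignore")
writer.writeheader()
writer.writerow({"name": "Ann", "score": 90, "note": "dropped"})
writer.writerow({"name": "Bob"})
output = buffer.getvalue()
print(output, end="")

# With the default extrasaction="raise", an unknown key is an error.
strict = csv.DictWriter(io.StringIO(), fieldnames=fieldnames)
try:
    strict.writerow({"name": "Cid", "note": "unexpected"})
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

Keeping the default 'raise' is usually the safer choice while developing, since a misspelled key then fails loudly instead of silently producing an incomplete row.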
def testDictWriter(file):
    # encoding='utf-8' is an assumption; it keeps the Chinese text portable across platforms.
    with open(file, 'w', newline='', encoding='utf-8') as csvfile:
        fieldnames = ['院系', '專業', '年級', '學生類別', '班級', '學號']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow(
            {'院系': '信息學院', '專業': '計算機科學與技術', '年級': '2011級', '學生類別': '本科(本科)4年', '班級': '計算機11', '學號': '201101245'})
        writer.writerow(
            {'院系': '信息學院', '專業': '計算機科學與技術', '年級': '2011級', '學生類別': '本科(本科)4年', '班級': '計算機11', '學號': '201101275'})
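def copycsv(source, target):
    # Both files are managed by the with statement, so they are closed even on error.
    with open(source, 'r', newline='', encoding='utf-8') as csvsource, \
            open(target, 'w', newline='', encoding='utf-8') as csvtarget:
        reader = csv.reader(csvsource, delimiter=',')
        # Create the writer once instead of once per row.
        writer = csv.writer(csvtarget, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
        for line in reader:
            writer.writerow(line)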
import numpy

# Passing the file name lets loadtxt open and close the file itself.
my_matrix = numpy.loadtxt("num.csv", delimiter=",", skiprows=0)
print(my_matrix)
import pandas as pd
obj = pd.read_csv('test.csv')
print(obj)
print(type(obj))
print(obj.dtypes)
test.csv:
院系,專業,年級,學生類別,班級,學號,姓名,學分紅績,更新時間,班級排名,參與班級排名總人數
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101244,欒,86.72,2017/9/5 9:59,1,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101237,劉,86.05,2017/9/5 9:59,2,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101233,劉,86.03,2017/9/5 9:59,3,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101250,李,85.43,2017/9/5 9:59,4,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101229,張,82.35,2017/9/5 9:59,5,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101241,韓,80.92,2017/9/5 9:59,6,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101232,丁,80.66,2017/9/5 9:59,7,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101228,張,79.61,2017/9/5 9:59,8,27
信息學院,計算機科學與技術,2011級,本科(本科)4年,計算機11,201101255,孟,79.55,2017/9/5 9:59,9,27
num.csv:
1,2,3
4,5,6
7,8,9
# coding: utf-8
import csv


def testReader(file):
    # newline='' lets the csv module handle line endings itself;
    # encoding='utf-8' is an assumption about how the file was saved.
    with open(file, 'r', newline='', encoding='utf-8') as csvfile:
        spamreader = csv.reader(csvfile, delimiter=',')
        for row in spamreader:
            print(', '.join(row))


def testWriter(file):
    with open(file, 'w', newline='') as csvfile:
        spamwriter = csv.writer(csvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
        spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
        spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])


def copycsv(source, target):
    # Both files are managed by the with statement, so they are closed even on error.
    with open(source, 'r', newline='', encoding='utf-8') as csvsource, \
            open(target, 'w', newline='', encoding='utf-8') as csvtarget:
        reader = csv.reader(csvsource, delimiter=',')
        # Create the writer once instead of once per row.
        writer = csv.writer(csvtarget, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
        for line in reader:
            writer.writerow(line)


def testDictReader(file):
    # Header row of the file:
    # 院系,專業,年級,學生類別,班級,學號,姓名,學分紅績,更新時間,班級排名,參與班級排名總人數
    # Text mode with newline='' is required in Python 3; 'rb' only worked in Python 2.
    with open(file, 'r', newline='', encoding='utf-8') as csvfile:
        dictreader = csv.DictReader(csvfile)
        for row in dictreader:
            print(' '.join([row['院系'], row['專業'], row['學號'], row['姓名']]))


def testDictWriter(file):
    with open(file, 'w', newline='', encoding='utf-8') as csvfile:
        fieldnames = ['院系', '專業', '年級', '學生類別', '班級', '學號']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow(
            {'院系': '信息學院', '專業': '計算機科學與技術', '年級': '2011級', '學生類別': '本科(本科)4年', '班級': '計算機11', '學號': '201101245'})
        writer.writerow(
            {'院系': '信息學院', '專業': '計算機科學與技術', '年級': '2011級', '學生類別': '本科(本科)4年', '班級': '計算機11', '學號': '201101275'})


def testpandas_csv():
    import pandas as pd
    obj = pd.read_csv('test.csv')
    print(obj)
    print(type(obj))
    print(obj.dtypes)


def testnumpy_csv():
    import numpy
    # Passing the file name lets loadtxt open and close the file itself.
    my_matrix = numpy.loadtxt("num.csv", delimiter=",", skiprows=0)
    print(my_matrix)


if __name__ == '__main__':
    # csvFile = 'test.csv'
    # testReader(csvFile)
    # csvFile = 'test2.csv'
    # testWriter(csvFile)
    # copycsv('test.csv', 'testcopy.csv')
    # testDictReader('test.csv')
    # testDictWriter('test2.csv')
    testnumpy_csv()
    # testpandas_csv()