Learning Python, Day 18: File Operations

Date: 2019-12-09


This chapter covers file operations: reading and writing files, buffering, and the file pointer.

File operations

(1) Opening a file: open(filepath, filemode)

filepath: the path of the file to open

filemode: the mode in which the file is opened

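mode	Description	Notes
'r'	Open read-only	The file must exist
'w'	Open write-only	Created if the file does not exist; contents are cleared if it does
'a'	Open for appending	Created if the file does not exist
'r+'	Open for reading and writing	The file must exist; contents are kept
'w+'	Open for reading and writing	Created if the file does not exist; contents are cleared if it does
'a+'	Open for appending and reading	Created if the file does not exist

PS: 'rb', 'wb', 'ab', 'rb+', 'wb+', 'ab+' open the file in binary mode.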
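The mode table can be checked with a short experiment. The sketch below is written for Python 3 (the article's own snippets are Python 2), and the scratch file name `modes_demo.txt` is a made-up example, not a file from the article.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not the article's 1.txt).
modes_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(modes_path, "w") as f:    # 'w' creates the file (or clears it)
    f.write("first")

with open(modes_path, "a") as f:    # 'a' appends to the existing content
    f.write("+second")

with open(modes_path, "r") as f:    # 'r' requires the file to exist
    after_append = f.read()

with open(modes_path, "w") as f:    # reopening with 'w' discards the old content
    f.write("new")

with open(modes_path, "r+") as f:   # 'r+' reads and writes without clearing
    f.write("N")                    # overwrites the first character in place
    f.seek(0)
    after_rplus = f.read()

print(after_append)
print(after_rplus)
```

The `with` statement closes the file automatically, which plays the same role as the try/finally pattern used in the rest of this article.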

>>> f = open('1.txt','w')
>>> type(f)
<type 'file'>
>>> f.close()

(2) Writing to a file:

write(str): writes a string to the file

writelines(sequence_of_strings): writes a sequence of strings to the file (no line separators are added)

>>> f = open('1.txt','w+')
>>> f.write("123456")
>>> f.close()

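f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','w')
    f.writelines(('12','23','45'))  # writes "122345"; no newlines are added
finally:
    if f:
        f.close()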
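Despite its name, writelines does not insert line breaks, which is why the file above ends up containing the single line 122345. A minimal sketch of the difference, assuming Python 3 and a hypothetical scratch file:

```python
import os
import tempfile

# Hypothetical scratch file (not the article's 1.txt).
wl_path = os.path.join(tempfile.mkdtemp(), "writelines_demo.txt")
parts = ["12", "23", "45"]

# writelines writes the strings back to back.
with open(wl_path, "w") as f:
    f.writelines(parts)
with open(wl_path) as f:
    joined = f.read()

# To get one string per line, add the newline yourself.
with open(wl_path, "w") as f:
    f.writelines(part + "\n" for part in parts)
with open(wl_path) as f:
    separated = f.read().splitlines()

print(joined)
print(separated)
```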

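(3) Reading from a file:

read([size]): reads from the file (up to size bytes; reads everything by default)

readline([size]): reads one line

readlines([size]): reads to the end of the file and returns a list of its lines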

f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r')
    value = f.read(3)
    print value
finally:
    if f:
        f.close()

        
122

f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r')
    value = f.readline()
    print value
finally:
    if f:
        f.close()

        
122345
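The three read methods can also be combined on one open file, since each call continues from where the previous one stopped. A small sketch, assuming Python 3 and a hypothetical scratch file whose contents mimic the article's data:

```python
import os
import tempfile

# Hypothetical scratch file with three lines of sample data.
read_path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(read_path, "w") as f:
    f.write("122345\n1223456\n12234567\n")

with open(read_path) as f:
    first_three = f.read(3)       # the first 3 characters
    rest_of_line = f.readline()   # the remainder of the first line, newline included
    remaining = f.readlines()     # every remaining line, as a list

print(repr(first_three))
print(repr(rest_of_line))
print(remaining)
```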

Iterative reading

When reading real files, keep in mind that I/O buffers are finite. Python's default I/O buffer size here is 8192 bytes:

>>> import io
>>> io.DEFAULT_BUFFER_SIZE
8192

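A file larger than 8192 bytes is still read in full by readlines(), but the entire file is loaded into memory as one list, which is wasteful for large files.

To process a file of any size without holding all of it in memory, read it iteratively.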

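f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r')
    iter_f = iter(f)   # a file object yields one line per iteration
    lines = 0
    for line in iter_f:
        lines+=1
        print line
    print lines
finally:
    if f:
        f.close()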

# 1.txt contains 7 lines of data; iter() is used to iterate over the file object f.
122345

1223456

12234567

122345678

1223456789

12234567890

122345678901
7
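To see how iteration and readlines behave on a file larger than the buffer, the sketch below writes a file well above io.DEFAULT_BUFFER_SIZE and reads it both ways. It assumes Python 3; the file name and line count are arbitrary choices, and the buffer size reported by a given interpreter may differ from the 8192 shown above.

```python
import io
import os
import tempfile

# Hypothetical scratch file, deliberately much larger than the I/O buffer.
big_path = os.path.join(tempfile.mkdtemp(), "big_demo.txt")
line_count = 20000

with open(big_path, "w") as f:
    for i in range(line_count):
        f.write("line %06d\n" % i)

file_size = os.path.getsize(big_path)

# Iterating keeps only one line in memory at a time.
counted = 0
with open(big_path) as f:
    for line in f:
        counted += 1

# readlines also returns every line, but builds the full list in memory.
with open(big_path) as f:
    all_lines = f.readlines()

print(file_size > io.DEFAULT_BUFFER_SIZE, counted, len(all_lines))
```

Both approaches see every line; the difference is memory use, which is why iteration is the safer default for large files.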

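Buffering

The previous section mentioned that the default I/O buffer size is 8192 bytes. For I/O operations, buffering means that data is held in memory first and only later written from the buffer to disk.

For write operations, every write goes into the buffer first. The data is actually written to disk only when close or flush is called, or when the buffer fills up.

So always call f.close() when you are done with a file, or call f.flush() along the way if the data must reach the disk earlier.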
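The effect of the buffer can be observed by checking the file size on disk before and after flush. This is a sketch under the assumption of Python 3 with default buffering; the scratch file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file (not the article's 1.txt).
buf_path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

f = open(buf_path, "w")
try:
    f.write("123456")
    # A small write normally stays in the in-memory buffer,
    # so the size on disk is usually still 0 at this point.
    size_before_flush = os.path.getsize(buf_path)
    f.flush()
    # After flush the 6 bytes have been handed to the operating system.
    size_after_flush = os.path.getsize(buf_path)
finally:
    f.close()

print(size_before_flush, size_after_flush)
```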

File pointer

Consider the following example:

f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r+')
    dat = f.read(3)
    print dat
    dat = f.read(3)
    print dat
finally:
    if f:
        f.close()
    
122
345

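As this example shows, after one read the next read does not start again from the first byte. This is where the concept of the file pointer comes in.

After f.read(3) the file pointer sits at byte offset 3; calling f.read(3) again moves it to offset 6. What if you want to go back and read from the start of the file?

Use f.seek. First, an overview of the seek function (seeking to a position before the start of the file raises an error; seeking past the end does not):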

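import os

f.seek(0, os.SEEK_SET)   # move the file pointer to the start of the file
f.seek(0, os.SEEK_END)   # move the file pointer to the end of the file
f.seek(5, os.SEEK_CUR)   # move the file pointer 5 bytes forward from the current position
f.seek(-5, os.SEEK_END)  # move the file pointer to 5 bytes before the end of the file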

f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r+')
    dat = f.read(3)
    location = f.tell()
    print "location :%s" % location  # current position of the file pointer
    print dat
    dat = f.read(3)
    location = f.tell()
    print "location :%s" % location  # current position of the file pointer
    print dat
finally:
    if f:
        f.close()

location :3
122
location :6
345

import os

f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r+')
    dat = f.read(3)
    location = f.tell()
    print "location :%s" % location
    print dat
    f.seek(0, os.SEEK_SET)  # seek back to the start of the file
    location = f.tell()
    print "location :%s" % location    
    dat = f.read(3)
    location = f.tell()
    print "location :%s" % location
    print dat
finally:
    if f:
        f.close()

        
location :3
122
location :0
location :3
122

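The example above shows that seek can move the file pointer back to the start of the file, and that f.tell() reports the pointer's current position. Try out the other seek arguments yourself.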
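For experimenting with the other seek arguments, here is a minimal sketch. It assumes Python 3 and opens the file in binary mode, because Python 3 text files only support limited relative seeks; the scratch file and its contents are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file holding 10 bytes of sample data.
seek_path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")
with open(seek_path, "wb") as f:
    f.write(b"1223456789")

with open(seek_path, "rb") as f:
    first = f.read(3)
    pos_after_read = f.tell()       # pointer has advanced by 3 bytes

    f.seek(0, os.SEEK_SET)          # back to the start
    pos_after_rewind = f.tell()

    f.seek(2, os.SEEK_CUR)          # 2 bytes forward from the current position
    skipped = f.read(2)

    f.seek(-5, os.SEEK_END)         # 5 bytes before the end
    tail = f.read()

    f.seek(100, os.SEEK_SET)        # past the end: allowed, nothing left to read
    past_end = f.read()

    try:
        f.seek(-1, os.SEEK_SET)     # before the start: rejected
        negative_seek_ok = True
    except (OSError, ValueError):
        negative_seek_ok = False

print(first, pos_after_read, pos_after_rewind, skipped, tail, past_end, negative_seek_ok)
```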

File encoding

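f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r+')
    f.write('qwer')
    f.write(u'博客園')  # unicode object: implicitly encoded as ASCII, which fails
finally:
    if f:
        f.close()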

Traceback (most recent call last):
  File "<pyshell#69>", line 4, in <module>
    f.write(u'博客園')
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

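Sometimes you need to write Chinese text to a file, and the code above fails outright, because Python 2 tries to encode the unicode string as ASCII before writing it. So how do you write unicode characters?

There are two approaches:

1. Encode the unicode string as UTF-8 before writing it: unicode.encode(u'博客園','utf-8')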

f = None  # so the finally block is safe even if open() fails
try:
    f = open('1.txt','r+')
    f.write('qwer')
    val = unicode.encode(u'博客園','utf-8')  # same as u'博客園'.encode('utf-8')
    f.write(val)
finally:
    if f:
        f.close()

>>> f = None
>>> try:
    f = open('1.txt','r+')
    w = f.read()
    print w
finally:
    if f:
        f.close()

qwer博客園56

2. Use the codecs module to create a file with the specified encoding directly

>>> import codecs
>>> help(codecs.open)
Help on function open in module codecs:

open(filename, mode='rb', encoding=None, errors='strict', buffering=1).....

Let's try it with an example:

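f = None  # so the finally block is safe even if codecs.open() fails
try:
    f = codecs.open('5.txt','w+','utf-8')
    f.write(u'博客園')  # encoded as UTF-8 by the codecs file object
    f.flush()
finally:
    if f:
        f.close()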
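For comparison, the built-in open in Python 3 accepts an encoding argument directly (io.open offers the same parameter in Python 2), so codecs is not needed there. The sketch below assumes Python 3 and a hypothetical scratch file; it is an alternative to the codecs approach above, not the article's method.

```python
import os
import tempfile

# Hypothetical scratch file (not the article's 5.txt).
enc_path = os.path.join(tempfile.mkdtemp(), "encoding_demo.txt")
text = u"qwer博客園"

# The file object encodes the text as UTF-8 on write.
with open(enc_path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading in binary mode shows the raw UTF-8 bytes on disk.
with open(enc_path, "rb") as f:
    raw = f.read()

# Reading with the same encoding decodes them back to text.
with open(enc_path, "r", encoding="utf-8") as f:
    decoded = f.read()

print(len(raw), decoded == text)
```

The four ASCII characters take one byte each and the three Chinese characters take three bytes each in UTF-8, which accounts for the 13 bytes on disk.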

Summary: this chapter covered file operations in Python and file encoding.
