Contents
1. File operations
1.1 Basic workflow for file operations
1.2 File encoding
1.3 File open modes
1.4 Context management
1.5 Modifying files
1.6 File methods
2. Summary
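1. File operations
1.1 Basic workflow for file operations
1. Open the file, obtain a file handle, and assign it to a variable.
2. Operate on the file through the handle.
3. Close the file.
Example 1: reading a file via a relative path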
# 1. Open the file, obtain a file handle, and assign it to a variable
#    (common names: file, f_handle, file_handle, f_obj, f1)
f1 = open('a.txt', encoding='utf-8', mode='r')
# 2. Operate on the file through the handle
content = f1.read()
# 3. Close the file
f1.close()
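# Note: open() is a Python built-in function, not a Windows command; it asks the operating system to open the file. If encoding is omitted, Python uses the platform default, which is typically GBK on Chinese-language Windows and UTF-8 on Linux.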
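To see which default applies on a given machine, the short sketch below asks the standard library for the locale's preferred encoding. The variable name default_encoding is only illustrative, and the printed value depends on the operating system and Python configuration.

```python
import locale

# Encoding that open() falls back to when no encoding argument is given.
# The value is platform dependent.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```

Passing encoding explicitly, as the examples here do, avoids depending on this value.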
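Example 2: reading a file via an absolute path
f1 = open('D:\a.txt', encoding='utf-8', mode='r')
content = f1.read()
print(content)
Output:
'''
Traceback (most recent call last):
  File "C:/Users/benjamin/python自動化21期/day3/筆記文本.py", line 1, in <module>
    f1 = open('D:\a.txt', encoding='utf-8')
OSError: [Errno 22] Invalid argument: 'D:\x07.txt'
'''
The \a inside an ordinary string literal is read as the escape sequence for the BEL character (\x07), so the path passed to open() is not the intended one.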
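Solution 1 (recommended): use a raw string
f1 = open(r'D:\a.txt', encoding='utf-8')
content = f1.read()
print(content)
Solution 2 (not recommended): escape the backslash
f1 = open('D:\\a.txt', encoding='utf-8')
content = f1.read()
print(content)
# Note: the default encoding is typically GBK on Windows and UTF-8 on Linux. Reading a file with an encoding different from the one it was saved in also raises an error or produces garbled text.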
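The difference between the three spellings of the path can be checked without touching the disk. The sketch below only compares string literals; the variable names plain, raw, and escaped are illustrative.

```python
# '\a' inside an ordinary string literal is the BEL control character (0x07).
plain = 'D:\a.txt'
raw = r'D:\a.txt'
escaped = 'D:\\a.txt'

print(repr(plain))           # shows the collapsed '\x07' character
print(raw == escaped)        # the raw and escaped forms are the same string
print(len(plain), len(raw))  # the plain literal is one character shorter
```

Only escape sequences that Python recognizes (such as \a, \n, \t) change the string silently, which is why some Windows paths appear to work and others fail.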
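1.2 File encoding
Unicode stored at a fixed width (for example, 2 bytes per character in UCS-2): simple and direct. The advantage is fast character-to-number conversion; the disadvantage is that it takes more space.
UTF-8: precise and variable-length (1 to 4 bytes per character). The advantage is that it saves space; the disadvantage is slower conversion, because each conversion has to work out how many bytes are needed to represent the character.
1. In memory, text is held as Unicode, trading space for time (every program must be loaded into memory to run, so memory access should be as fast as possible).
2. On disk and over the network, UTF-8 is used, which keeps the data small and the transfer reliable.
Every program is ultimately loaded into memory. Files saved on disk may use different encodings in different countries, but in memory one fixed representation, Unicode, is used so that text from any country can be handled (this is why a computer can run programs written anywhere). UTF-8 is also compatible with every language and would work in memory too; it is not used there because a fixed-width representation is cheaper to process (fixed-width code units need no length calculation, whereas UTF-8 does). Fixed-width Unicode wastes more space, which is exactly the space-for-time trade-off. When data is stored on disk or sent over a network, it is converted from Unicode to UTF-8, because storage and transmission favour stability and efficiency: the smaller the data, the more reliable the transfer.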
unicode ------> encode ------> utf-8
utf-8 ------> decode ------> unicode
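The two arrows correspond to str.encode() and bytes.decode() in Python 3. The sketch below is a minimal illustration using a two-character sample string chosen for this example; it also shows that the number of bytes per character depends on the encoding.

```python
text = '中国'

utf8_bytes = text.encode('utf-8')      # str (Unicode) -> bytes
gbk_bytes = text.encode('gbk')

print(utf8_bytes, len(utf8_bytes))     # 3 bytes per character for this text
print(gbk_bytes, len(gbk_bytes))       # 2 bytes per character for this text
print(len('a'.encode('utf-8')))        # an ASCII letter needs only 1 byte

# bytes -> str: decoding with the matching codec restores the text
restored = utf8_bytes.decode('utf-8')
print(restored == text)
```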
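Flushing a file from memory to disk is called saving the file.
Reading a file from disk into memory is called reading the file.
Garbled text: either the file was already garbled when it was saved, or it was saved correctly but decoded incorrectly when read.
Summary:
Whatever editor you use, to prevent garbled text (note that a file containing source code is still just an ordinary file; this refers to garbling seen when opening the file, before it is executed),
the core rule is: open a file with the same encoding it was saved in.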
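The rule can be demonstrated with a throwaway file. This is a sketch under stated assumptions: the helper read_with and the file name demo_gbk.txt are hypothetical, and the file is created in a temporary directory.

```python
import os
import shutil
import tempfile

def read_with(path, encoding):
    """Read the whole file using the given encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()

# Save a file as GBK in a temporary directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'demo_gbk.txt')
with open(path, mode='w', encoding='gbk') as f:
    f.write('中国')

same = read_with(path, 'gbk')      # same encoding: the text comes back intact
try:
    read_with(path, 'utf-8')       # different encoding
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

print(same, mismatch_failed)
shutil.rmtree(tmp_dir)
```

A mismatch does not always raise; some byte sequences decode without error into the wrong characters, which is the garbled text described above.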
1.3 File open modes
file_handle = open('file path', 'mode')
1. When opening a file, specify the file path and the mode in which to open it.
r mode:
## r mode
# read(): read everything
f1 = open('log1', encoding='utf-8')
content = f1.read()
print(content)
f1.close()

# read(n) in r mode: reads n characters
f1 = open('log1', encoding='utf-8')
content = f1.read(5)
print(content)
f1.close()

# read(n) in rb mode: reads n bytes. Each Chinese character here takes 3 bytes
# in UTF-8, so a count that splits a character (e.g. 4) makes decode() raise
# UnicodeDecodeError.
f1 = open('log1', mode='rb')
content = f1.read(3)
print(content.decode('utf-8'))
f1.close()

# readline(): reads one line per call. Each line keeps its trailing '\n',
# so print() shows a blank line after it.
f1 = open('log1', encoding='utf-8')
print(f1.readline())
print(f1.readline())
f1.close()

# readlines(): returns a list in which each line of the file is one element
f1 = open('log1', encoding='utf-8')
print(f1.readlines())
f1.close()

# for loop: iterating over a file handle keeps only one line in memory at a time
f1 = open('log1', encoding='utf-8')
for i in f1:
    print(i)
f1.close()

# More on encoding
s1 = '中国'
s2 = s1.encode('gbk')
print(s2)
# Output: b'\xd6\xd0\xb9\xfa'

s1 = b'\xd6\xd0\xb9\xfa'
s2 = s1.decode('gbk')
s3 = s2.encode('utf-8')
print(s3)
# Output: b'\xe4\xb8\xad\xe5\x9b\xbd'

# Simplified
s1 = b'\xd6\xd0\xb9\xfa'.decode('gbk').encode('utf-8')
print(s1)
# Output: b'\xe4\xb8\xad\xe5\x9b\xbd'
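The blank lines that appear when printing lines come from the newline each line still carries. A minimal sketch, assuming a small sample file created here for illustration (log_demo.txt is a hypothetical name):

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'log_demo.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('first\nsecond\n')

with open(path, encoding='utf-8') as f:
    raw_lines = [line for line in f]                 # each item still ends with '\n'
with open(path, encoding='utf-8') as f:
    clean_lines = [line.rstrip('\n') for line in f]  # trailing newline removed

print(raw_lines)
print(clean_lines)
```

Calling print(line, end='') is another way to avoid the doubled line breaks.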
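w mode:
## w mode
# Not readable. Creates the file if it does not exist; if it exists, clears the content first and then writes.
f1 = open('log2', encoding='utf-8', mode='w')
f1.write('python是一門高級語言')
f1.close()
a mode:
## a mode
# Not readable. Creates the file if it does not exist; if it exists, only appends content.
f1 = open('log2', encoding='utf-8', mode='a')
f1.write('\npython學習')
f1.close()
2. "+" means the file can be both read and written (it adds the missing capability).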
r+ mode:
# r+ mode: read the original content first, then write (the write lands at the end)
f1 = open('log1', encoding='utf-8', mode='r+')
print(f1.read())
f1.write('666')
f1.close()

# r+ mode: writing first and then reading usually gives unexpected results
f1 = open('log1', encoding='utf-8', mode='r+')
f1.write('666')
print(f1.read())
f1.close()
# Original content: 快快樂樂
# Printed output:   快樂樂
# File content:     666快樂樂
# The cursor moves in bytes: '666' (3 bytes) overwrites the first 3-byte character

# r+ mode: write then read, adjusting the cursor position
f1 = open('log1', encoding='utf-8', mode='r+')
f1.seek(0, 2)   # move to the end of the file
f1.write('666')
f1.seek(0)      # move back to the start
print(f1.read())
f1.close()
# Output: 快快樂樂666
w+ mode:
# w+ mode: write then read; the existing content is cleared first, then the new content is written
f1 = open('log2', encoding='utf-8', mode='w+')
f1.write('老男孩')
f1.seek(0)
print(f1.read())
f1.close()
a+ mode:
# a+ mode: append, then move to the start to read everything
f1 = open('log2', encoding='utf-8', mode='a+')
f1.write('ababababab')
f1.seek(0)
print(f1.read())
f1.close()
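Which of the six text modes can read and which can write can be checked directly with readable() and writable(). The sketch below uses a temporary sample file (modes_demo.txt is a hypothetical name) and collects the results in a dictionary named capabilities.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'modes_demo.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('abc')

# Map each mode to (readable, writable) as reported by the file object.
capabilities = {}
for mode in ('r', 'a', 'r+', 'a+', 'w', 'w+'):
    with open(path, mode=mode, encoding='utf-8') as f:
        capabilities[mode] = (f.readable(), f.writable())

for mode, (can_read, can_write) in capabilities.items():
    print(f'{mode:<2} readable={can_read} writable={can_write}')
```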
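3. "b" means operating in bytes
Non-text files can only be handled in b mode. "b" means operating on bytes (all files are stored as bytes anyway, so in this mode there is no need to consider a text file's character encoding, an image file's jpg format, or a video file's avi format).
Note: when a file is opened in b mode, the content read is of type bytes, data written must also be bytes, and an encoding cannot be specified.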
rb mode:
# rb mode: reads by bytes
f1 = open('log1', mode='rb')
content = f1.read(3)
print(content.decode('utf-8'))
f1.close()
wb mode:
# wb mode
f1 = open('log2', mode='wb')
f1.write('python語言'.encode('utf-8'))
f1.close()
ab mode:
# ab mode
f1 = open('log2', mode='ab')
f1.write('\npython語言'.encode('utf-8'))
f1.close()
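For non-text data, rb and wb are typically combined to copy a file in chunks. The following is a minimal sketch, not a replacement for shutil.copyfile; the function name copy_binary, the chunk size, and the file names are assumptions made for this example.

```python
import os
import tempfile

def copy_binary(src, dst, chunk_size=4096):
    """Copy src to dst in fixed-size byte chunks."""
    with open(src, mode='rb') as fin, open(dst, mode='wb') as fout:
        while True:
            chunk = fin.read(chunk_size)   # bytes object; b'' at end of file
            if not chunk:
                break
            fout.write(chunk)

tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'src.bin')
dst = os.path.join(tmp_dir, 'dst.bin')
payload = bytes(range(256)) * 40           # arbitrary non-text data
with open(src, mode='wb') as f:
    f.write(payload)

copy_binary(src, dst)
with open(dst, mode='rb') as f:
    copied = f.read()
print(copied == payload)
```

Reading in chunks keeps memory use bounded regardless of the file size.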
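4. Read-write and write-read modes that operate on bytes (r+b, w+b, a+b)
1.4 Context management
# with open() as: the file is closed automatically when the block ends, so the handle cannot be reused afterward (for example, in a later loop)
with open('log1', encoding='utf-8') as f1:
    print(f1.read())

# with open() as: managing several file handles at once
with open('log1', encoding='utf-8') as f1,\
        open('log2', encoding='utf-8', mode='w') as f2:
    print(f1.read())
    f2.write('777')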
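What the with statement guarantees can be observed through the closed attribute. A small sketch using a temporary file (with_demo.txt is a hypothetical name); the ValueError is raised deliberately to simulate a failure inside the block.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'with_demo.txt')

with open(path, mode='w', encoding='utf-8') as f:
    f.write('777')
closed_after_block = f.closed        # the file was closed on leaving the block

# The file is also closed when the block exits through an exception.
try:
    with open(path, encoding='utf-8') as g:
        raise ValueError('simulated failure')
except ValueError:
    closed_after_error = g.closed

print(closed_after_block, closed_after_error)
```

With a plain open()/close() pair, the same guarantee needs an explicit try/finally.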
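1.5 Modifying files
1. Open the original file to get a file handle.
2. Create a new file to get a second file handle.
3. Read the original file, modify the content, and write it to the new file.
4. Delete the original file.
5. Rename the new file to the original file's name.
File data is stored on disk, so strictly speaking it can only be overwritten, never modified in place. What looks like modifying a file is a simulated effect, and there are two ways to achieve it:
Method 1: load the entire content of the file from disk into memory, modify it there, then write it from memory back over the file on disk (as editors such as Word, vim, and Notepad++ do).
Method 2: read the content of the file into memory line by line, write each modified line to a new file, and finally replace the original file with the new one.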
# Method 1
import os
with open('file_test', encoding='utf-8') as f1,\
        open('file_test.bak', encoding='utf-8', mode='w') as f2:
    old_content = f1.read()
    new_content = old_content.replace('alex', 'SB')
    f2.write(new_content)
# Both files are closed here, so the original can be removed and replaced
os.remove('file_test')
os.rename('file_test.bak', 'file_test')

# Method 2
import os
with open('file_test', encoding='utf-8') as f1,\
        open('file_test.bak', encoding='utf-8', mode='w') as f2:
    for line in f1:
        new_line = line.replace('SB', 'alex')
        f2.write(new_line)
os.remove('file_test')
os.rename('file_test.bak', 'file_test')
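A variation on Method 2 swaps the remove-then-rename pair for a single os.replace() call and writes to a temporary file in the same directory. This is a sketch under stated assumptions rather than the approach used above: the function name replace_in_file and the sample content are made up for illustration.

```python
import os
import tempfile

def replace_in_file(path, old, new, encoding='utf-8'):
    """Rewrite path line by line, replacing old with new."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so that the final
    # os.replace() stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with open(fd, mode='w', encoding=encoding) as dst, \
                open(path, encoding=encoding) as src:
            for line in src:
                dst.write(line.replace(old, new))
        os.replace(tmp_path, path)   # overwrites the original in one step
    except BaseException:
        os.remove(tmp_path)          # do not leave a half-written file behind
        raise

tmp_dir = tempfile.mkdtemp()
target = os.path.join(tmp_dir, 'file_test')
with open(target, mode='w', encoding='utf-8') as f:
    f.write('alex\nhello alex\n')

replace_in_file(target, 'alex', 'ALEX')
with open(target, encoding='utf-8') as f:
    result = f.read()
print(result)
```

Because the original is only overwritten after the new content is fully written, a failure midway leaves the original file untouched.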
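1.6 File methods
1. Common methods
read(3):
1. When the file is opened in text mode, reads 3 characters.
2. When the file is opened in b mode, reads 3 bytes.
All other cursor movement within a file is measured in bytes, for example seek, tell, and truncate.
Note:
1. seek has three whence values: 0, 1, and 2. With 1 and 2, a nonzero offset only works in b mode (text mode accepts only offset 0, as in seek(0, 2)); in every mode the movement is measured in bytes.
seek moves the cursor, by default relative to the start of the file.
tell returns the current cursor position.
2. truncate cuts the file off, so the file must be opened in a writable mode, but not w or w+, because those clear the file on opening. Try truncate in r+, a, or a+ mode.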
# readable(): whether the file can be read
f1 = open('log2', encoding='utf-8', mode='w')
print(f1.readable())
f1.write('ababababab')
f1.close()
# Output: False

# writable(): whether the file can be written
f1 = open('log2', encoding='utf-8', mode='w')
print(f1.writable())
f1.write('ababababab')
f1.close()
# Output: True

# tell(): reports the cursor position
f1 = open('log2', encoding='utf-8', mode='w')
f1.write('ababababab')
print(f1.tell())
f1.close()
# Output: 10

# seek(offset): moves the cursor by bytes
# seek(0, 2): moves to the end of the file
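The byte-based behaviour of seek, tell, and truncate is easiest to observe in b mode, where all three whence values accept nonzero offsets. A minimal sketch, assuming a temporary file (seek_demo.txt is a hypothetical name) holding four 3-byte UTF-8 characters:

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'seek_demo.txt')
with open(path, mode='wb') as f:
    f.write('快快樂樂'.encode('utf-8'))    # 4 characters, 12 bytes

with open(path, mode='r+b') as f:
    f.seek(0, 2)                          # whence=2: jump to the end
    size = f.tell()                       # position measured in bytes
    f.seek(-3, 2)                         # 3 bytes back from the end
    last_char = f.read().decode('utf-8')
    f.seek(3)                             # whence=0: 3 bytes from the start
    f.seek(3, 1)                          # whence=1: 3 more bytes from here
    middle = f.tell()
    f.truncate(6)                         # keep only the first 6 bytes

with open(path, encoding='utf-8') as f:
    remaining = f.read()

print(size, last_char, middle, remaining)
```

Seeking or truncating at an offset that falls inside a multi-byte character would leave bytes that no longer decode as UTF-8, so offsets should land on character boundaries.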
2. All methods
Python 2.x:
class file(object):
    def close(self):  # real signature unknown; restored from __doc__
        # Close the file
        """
        close() -> None or (perhaps) an integer.  Close the file.

        Sets data attribute .closed to True.  A closed file cannot be used for
        further I/O operations.  close() may be called more than once without
        error.  Some kinds of file objects (for example, opened by popen())
        may return an exit status upon closing.
        """

    def fileno(self):  # real signature unknown; restored from __doc__
        # File descriptor
        """
        fileno() -> integer "file descriptor".

        This is needed for lower-level file interfaces, such os.read().
        """
        return 0

    def flush(self):  # real signature unknown; restored from __doc__
        # Flush the file's internal buffer
        """ flush() -> None.  Flush the internal I/O buffer. """
        pass

    def isatty(self):  # real signature unknown; restored from __doc__
        # Whether the file is a tty device
        """ isatty() -> true or false.  True if the file is connected to a tty device. """
        return False

    def next(self):  # real signature unknown; restored from __doc__
        # Get the next line; raises an error if there is none
        """ x.next() -> the next value, or raise StopIteration """
        pass

    def read(self, size=None):  # real signature unknown; restored from __doc__
        # Read the specified number of bytes
        """
        read([size]) -> read at most size bytes, returned as a string.

        If the size argument is negative or omitted, read until EOF is reached.
        Notice that when in non-blocking mode, less data than what was requested
        may be returned, even if no size parameter was given.
        """
        pass

    def readinto(self):  # real signature unknown; restored from __doc__
        # Read into a buffer; do not use, it will be removed
        """ readinto() -> Undocumented.  Don't use this; it may go away. """
        pass

    def readline(self, size=None):  # real signature unknown; restored from __doc__
        # Read a single line
        """
        readline([size]) -> next line from the file, as a string.

        Retain newline.  A non-negative size argument limits the maximum
        number of bytes to return (an incomplete line may be returned then).
        Return an empty string at EOF.
        """
        pass

    def readlines(self, size=None):  # real signature unknown; restored from __doc__
        # Read all data and return it as a list split on line breaks
        """
        readlines([size]) -> list of strings, each a line from the file.

        Call readline() repeatedly and return a list of the lines so read.
        The optional size argument, if given, is an approximate bound on the
        total number of bytes in the lines returned.
        """
        return []

    def seek(self, offset, whence=None):  # real signature unknown; restored from __doc__
        # Set the cursor position within the file
        """
        seek(offset[, whence]) -> None.  Move to new file position.

        Argument offset is a byte count.  Optional argument whence defaults to
        0 (offset from start of file, offset should be >= 0); other values are 1
        (move relative to current position, positive or negative), and 2 (move
        relative to end of file, usually negative, although many platforms allow
        seeking beyond the end of a file).  If the file is opened in text mode,
        only offsets returned by tell() are legal.  Use of other offsets causes
        undefined behavior.
        Note that not all file objects are seekable.
        """
        pass

    def tell(self):  # real signature unknown; restored from __doc__
        # Get the current cursor position
        """ tell() -> current file position, an integer (may be a long integer). """
        pass

    def truncate(self, size=None):  # real signature unknown; restored from __doc__
        # Truncate the data, keeping only what comes before the given position
        """
        truncate([size]) -> None.  Truncate the file to at most size bytes.

        Size defaults to the current file position, as returned by tell().
        """
        pass

    def write(self, p_str):  # real signature unknown; restored from __doc__
        # Write content
        """
        write(str) -> None.  Write string str to file.

        Note that due to buffering, flush() or close() may be needed before
        the file on disk reflects the data written.
        """
        pass

    def writelines(self, sequence_of_strings):  # real signature unknown; restored from __doc__
        # Write a list of strings to the file
        """
        writelines(sequence_of_strings) -> None.  Write the strings to the file.

        Note that newlines are not added.  The sequence can be any iterable object
        producing strings. This is equivalent to calling write() for each string.
        """
        pass

    def xreadlines(self):  # real signature unknown; restored from __doc__
        # Can be used to read the file line by line rather than all at once
        """
        xreadlines() -> returns self.

        For backward compatibility. File objects now include the performance
        optimizations previously implemented in the xreadlines module.
        """
        pass
Python 3.x:
class TextIOWrapper(_TextIOBase):
    """
    Character and line based layer over a BufferedIOBase object, buffer.

    encoding gives the name of the encoding that the stream will be
    decoded or encoded with. It defaults to locale.getpreferredencoding(False).

    errors determines the strictness of encoding and decoding (see
    help(codecs.Codec) or the documentation for codecs.register) and
    defaults to "strict".

    newline controls how line endings are handled. It can be None, '',
    '\n', '\r', and '\r\n'.  It works as follows:

    * On input, if newline is None, universal newlines mode is
      enabled. Lines in the input can end in '\n', '\r', or '\r\n', and
      these are translated into '\n' before being returned to the
      caller. If it is '', universal newline mode is enabled, but line
      endings are returned to the caller untranslated. If it has any of
      the other legal values, input lines are only terminated by the given
      string, and the line ending is returned to the caller untranslated.

    * On output, if newline is None, any '\n' characters written are
      translated to the system default line separator, os.linesep. If
      newline is '' or '\n', no translation takes place. If newline is any
      of the other legal values, any '\n' characters written are translated
      to the given string.

    If line_buffering is True, a call to flush is implied when a call to
    write contains a newline character.
    """
    def close(self, *args, **kwargs):  # real signature unknown
        # Close the file
        pass

    def fileno(self, *args, **kwargs):  # real signature unknown
        # File descriptor
        pass

    def flush(self, *args, **kwargs):  # real signature unknown
        # Flush the file's internal buffer
        pass

    def isatty(self, *args, **kwargs):  # real signature unknown
        # Whether the file is a tty device
        pass

    def read(self, *args, **kwargs):  # real signature unknown
        # Read the specified number of characters
        pass

    def readable(self, *args, **kwargs):  # real signature unknown
        # Whether the file is readable
        pass

    def readline(self, *args, **kwargs):  # real signature unknown
        # Read a single line
        pass

    def seek(self, *args, **kwargs):  # real signature unknown
        # Set the cursor position within the file
        pass

    def seekable(self, *args, **kwargs):  # real signature unknown
        # Whether the cursor can be moved
        pass

    def tell(self, *args, **kwargs):  # real signature unknown
        # Get the cursor position
        pass

    def truncate(self, *args, **kwargs):  # real signature unknown
        # Truncate the data, keeping only what comes before the given position
        pass

    def writable(self, *args, **kwargs):  # real signature unknown
        # Whether the file is writable
        pass

    def write(self, *args, **kwargs):  # real signature unknown
        # Write content
        pass

    def __getstate__(self, *args, **kwargs):  # real signature unknown
        pass

    def __init__(self, *args, **kwargs):  # real signature unknown
        pass

    @staticmethod  # known case of __new__
    def __new__(*args, **kwargs):  # real signature unknown
        """ Create and return a new object.  See help(type) for accurate signature. """
        pass

    def __next__(self, *args, **kwargs):  # real signature unknown
        """ Implement next(self). """
        pass

    def __repr__(self, *args, **kwargs):  # real signature unknown
        """ Return repr(self). """
        pass

    buffer = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    closed = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    encoding = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    errors = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    line_buffering = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    name = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    newlines = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    _CHUNK_SIZE = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default

    _finalizing = property(lambda self: object(), lambda self, v: None, lambda self: None)  # default
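2. Summary
# Opening a file
#   f = open('file path')   the default mode is r; the default encoding is the operating system's default
#   r w a (r+ w+ a+); each of these six can also take b. With b, no encoding is specified.
#   Avoid r+ w+ a+ in everyday work; mainly use r, w, and a.
#   Common encodings: UTF-8, gbk
# Operating on a file
#   Reading
#     read: with no argument, reads everything
#           with an argument in r mode, the argument is the number of characters
#           with an argument in rb mode, the argument is the number of bytes
#     readline: reads one line per call; it does not stop on its own (it returns an empty string at the end of the file)
#     for loop: reads line by line from the first line and stops when nothing is left
#     readlines: rarely used
#   Writing
#     write: writes content (it does not add line breaks; add \n manually)
# Closing a file
#   f.close()
#   with open() as f:
# Modifying a file
#   import os
#   os.remove
#   os.rename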