1. Opening and closing files
1.1 Opening and closing a file
    In [1]: help(open)
    Help on built-in function open in module io:

    open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
        Open file and return a stream.  Raise IOError upon failure.

        ========= ===============================================================
        Character Meaning
        --------- ---------------------------------------------------------------
        'r'       open for reading (default)
        'w'       open for writing, truncating the file first
        'x'       create a new file and open it for writing
        'a'       open for writing, appending to the end of the file if it exists
        'b'       binary mode
        't'       text mode (default)
        '+'       open a disk file for updating (reading and writing)
        'U'       universal newline mode (deprecated)
        ========= ===============================================================

    In [6]: f = open("/tmp/shell/test.txt")    # open a file and get a file object
    In [7]: type(f)
    Out[7]: _io.TextIOWrapper
    In [8]: f
    Out[8]: <_io.TextIOWrapper name='/tmp/shell/test.txt' mode='r' encoding='UTF-8'>
    In [9]: f.mode    # the mode the file object was opened with
    Out[9]: 'r'
    In [11]: f.name    # file name
    Out[11]: '/tmp/shell/test.txt'
    In [13]: f.read()    # read the file contents
    Out[13]: 'Hello World!\nI love python\n'
    In [15]: f.readable()    # is it readable?
    Out[15]: True
    In [16]: f.writable()    # is it writable?
    Out[16]: False
    In [17]: f.closed    # is the file object closed?
    Out[17]: False
    In [20]: f.close()    # close the file object
    In [21]: f.name
    Out[21]: '/tmp/shell/test.txt'
    In [22]: f.read()    # a closed file can no longer be read
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-22-bacd0e0f09a3> in <module>()
    ----> 1 f.read()

    ValueError: I/O operation on closed file.

    In [25]: f.closed
    Out[25]: True
Which operations a file object supports depends on the mode it was opened with.
1.2 The mode parameter of open() in detail
1) Modes that control reading and writing
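'r': that is, mode='r', the default. Read-only, not writable; raises FileNotFoundError if the file does not exist
'w': write-only, not readable; truncates the existing file (even if nothing is written after opening) and creates the file if it does not exist
'x': only creates a new file; write-only, not readable; raises FileExistsError if the file already exists
'a': appends to the end of the file; write-only, not readable; creates the file if it does not exist
In terms of reading and writing: only r is readable and not writable; all the others are writable and not readable
When the file does not exist: only r raises an exception; all the others create a new file
When the file exists: only x raises an exception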
In terms of the effect on the original content: only w truncates the file
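The rules above can be checked with a short script. This is only a sketch: `try_modes` and the scratch file `demo.txt` are names made up for the illustration, and everything runs inside a temporary directory.

```python
import os
import tempfile


def try_modes():
    """Exercise 'r', 'w', 'x' and 'a' on a scratch file and record what happens."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical scratch file

        # 'r' on a missing file raises FileNotFoundError.
        try:
            open(path, "r")
        except FileNotFoundError:
            results["r_missing"] = "FileNotFoundError"

        # 'x' creates the file; a second 'x' on the same path raises FileExistsError.
        with open(path, "x") as f:
            f.write("first\n")
        try:
            open(path, "x")
        except FileExistsError:
            results["x_existing"] = "FileExistsError"

        # 'a' keeps the existing content and writes after it.
        with open(path, "a") as f:
            f.write("second\n")
        with open(path) as f:
            results["after_append"] = f.read()

        # 'w' truncates on open, even if nothing is written afterwards.
        open(path, "w").close()
        with open(path) as f:
            results["after_w"] = f.read()
    return results


observed = try_modes()
print(observed)
```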
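2) Modes that control how the file is opened
't': text mode, the default. Reads return str and writes take str; operations count characters
'b': binary mode. Reads return bytes and writes take bytes; operations count bytes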
    In [85]: f = open('/root/1.txt', mode='w')
    In [86]: f.write('马哥Python')
    Out[86]: 8    # text mode writes characters: 8 characters written
    In [87]: f.close()
    In [88]: cat /root/1.txt
    马哥Python
    In [92]: f = open('/root/1.txt', mode='wb')
    In [93]: f.write('马哥 Python')
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-93-582947144dc2> in <module>()
    ----> 1 f.write('马哥 Python')

    TypeError: a bytes-like object is required, not 'str'

    In [94]: f.write('马哥 Python'.encode())
    Out[94]: 13    # binary mode writes bytes: 13 bytes written
    In [95]: f.close()
    In [96]: f = open('/root/1.txt')
    In [97]: f.read()
    Out[97]: '马哥 Python'
    In [98]: f.close()
    In [99]: f = open('/root/1.txt', mode='rb')
    In [100]: f.read()
    Out[100]: b'\xe9\xa9\xac\xe5\x93\xa5 Python'
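Note:
r and w cannot be combined as mode='rw'. A mode string must contain exactly one of r, w, a, x; if mode is omitted, the default is 'rt'
'+': readable and writable. + cannot be used on its own
It adds the missing capability: a mode that was read-only also becomes writable, and a mode that was write-only also becomes readable. Nothing else about the mode changes
r+: readable and writable; writes overwrite existing content starting at the current pointer
w+: readable and writable; the file is truncated first
a+: readable and writable; writes are always appended at the end of the file, even if the pointer was moved before writing
'U': deprecated (and removed in Python 3.11)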
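A small sketch of how the three `+` modes differ. The function name `plus_modes` and the file `plus.txt` are illustrative only; the append behaviour shown is what POSIX systems and CPython on Windows do, and other platforms are assumed to behave the same way.

```python
import os
import tempfile


def plus_modes():
    """Show what r+, a+ and w+ do to a file that initially contains 'abcdef'."""
    out = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "plus.txt")  # hypothetical scratch file
        with open(path, "w") as f:
            f.write("abcdef")

        # r+: the pointer starts at 0, so the write overwrites in place.
        with open(path, "r+") as f:
            f.write("XY")
            f.seek(0)
            out["r+"] = f.read()

        # a+: the write lands at the end even though we seek to 0 first.
        with open(path, "a+") as f:
            f.seek(0)
            f.write("Z")
            f.seek(0)
            out["a+"] = f.read()

        # w+: the file is truncated on open, then readable after writing.
        with open(path, "w+") as f:
            out["w+ on open"] = f.read()
            f.write("new")
            f.seek(0)
            out["w+"] = f.read()
    return out


plus_result = plus_modes()
print(plus_result)
```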
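1.3 The file position pointer
When open() opens a file, the interpreter keeps a pointer to a position in that file
Reads and writes always start at the pointer and move it forward
With mode='r' (or 'w', which truncates the file first), the pointer starts at 0, the beginning of the file
With mode='a', the pointer starts at EOF (End Of File)
Getting the current pointer position: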
    In [7]: help(f.tell)
    Help on built-in function tell:

    tell() method of _io.TextIOWrapper instance
        Return current stream position.

    In [1]: f = open('/root/passwd')
    In [2]: f.tell()
    Out[2]: 0
    In [3]: f.readline()
    Out[3]: 'root:x:0:0:root:/root:/bin/bash\n'
    In [4]: f.tell()
    Out[4]: 32
    In [5]: f.readline()
    Out[5]: 'bin:x:1:1:bin:/bin:/sbin/nologin\n'
    In [6]: f.tell()
    Out[6]: 65
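Moving the pointer:
seek() takes 2 parameters:
cookie: the position to move to (the offset)
whence: the position the offset is measured from; it has 3 possible values
0: the start of the file (default)
1: the current pointer position
2: the end of the file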
    In [23]: help(f.seek)
    Help on built-in function seek:

    seek(cookie, whence=0, /) method of _io.TextIOWrapper instance
        Change stream position.

        Change the stream position to the given byte offset. The offset is
        interpreted relative to the position indicated by whence.  Values
        for whence are:

        * 0 -- start of stream (the default); offset should be zero or positive
        * 1 -- current stream position; offset may be negative
        * 2 -- end of stream; offset is usually negative

        Return the new absolute position.

    In [35]: f.tell()
    Out[35]: 65
    In [36]: f.seek(0)
    Out[36]: 0
    In [37]: f.tell()
    Out[37]: 0
    In [38]: f.readline()
    Out[38]: 'root:x:0:0:root:/root:/bin/bash\n'
    In [39]: f.tell()
    Out[39]: 32
    In [40]: f.seek(5)
    Out[40]: 5
    In [41]: f.tell()
    Out[41]: 5
    In [42]: f.readline()
    Out[42]: 'x:0:0:root:/root:/bin/bash\n'
    # in text mode, when whence is not 0 the cookie must be 0
    In [95]: f.seek(10,1)
    ---------------------------------------------------------------------------
    UnsupportedOperation                      Traceback (most recent call last)
    <ipython-input-95-aa8451ab67ab> in <module>()
    ----> 1 f.seek(10,1)

    UnsupportedOperation: can't do nonzero cur-relative seeks

    In [96]: f.seek(0,2)
    Out[96]: 1184
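With mode='t':
When whence is 0 (the default), cookie (offset) can be any non-negative integer, ideally a value returned by tell() so that it does not land in the middle of a multi-byte character
When whence is 1 or 2, cookie must be 0
With mode='b':
When whence is 0 (the default), cookie (offset) can be any non-negative integer
When whence is 1 or 2, cookie can be any integer, positive or negative
Summary:
seek() may move past the end of the file but not before its start. Seeking past EOF raises no exception and tell() then reports a position beyond EOF, whereas seeking to a negative position raises an exception. What a later write does depends on the mode: in append mode ('a', 'a+') data is always written at the end of the file, that is, at min(EOF, tell()); in the other writable modes data is written at the position reported by tell(), and the gap between the old EOF and that position is filled with null bytes
seek() and tell() always count in bytes, not characters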
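The seek() rules can be tried out on a ten-byte scratch file. The sketch below assumes a regular file on an ordinary filesystem; `seek_demo` and `seek.bin` are names invented for the example.

```python
import io
import os
import tempfile


def seek_demo():
    """Try relative seeks in binary and text mode, and a write past EOF."""
    out = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "seek.bin")  # hypothetical scratch file
        with open(path, "wb") as f:
            f.write(b"0123456789")

        # Append mode starts with the pointer at EOF.
        with open(path, "ab") as f:
            out["append_start"] = f.tell()

        # Binary mode: any offset is allowed with whence 1 and 2.
        with open(path, "rb") as f:
            f.seek(2)        # absolute position 2
            f.seek(3, 1)     # 3 bytes forward from the current position
            out["after_relative"] = f.tell()
            f.seek(-2, 2)    # 2 bytes before EOF
            out["tail"] = f.read()

        # Text mode: a nonzero offset with whence=1 is rejected.
        with open(path, "r") as f:
            try:
                f.seek(3, 1)
            except io.UnsupportedOperation:
                out["text_relative"] = "UnsupportedOperation"
            out["text_end"] = f.seek(0, 2)

        # Seeking past EOF in r+b and writing fills the gap with null bytes.
        with open(path, "r+b") as f:
            f.seek(12)
            f.write(b"!")
        with open(path, "rb") as f:
            out["with_gap"] = f.read()
    return out


seek_result = seek_demo()
print(seek_result)
```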
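1.4 buffering
Buffer settings
f.flush(), f.close() and f.seek() flush the buffer
buffering=-1: the default. The buffer holds io.DEFAULT_BUFFER_SIZE bytes (8192 in this environment); once it is full, its contents are flushed automatically before more data is buffered
Binary mode: the buffer size is io.DEFAULT_BUFFER_SIZE
Text mode: the buffer size is io.DEFAULT_BUFFER_SIZE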
    In [48]: import io
    In [49]: io.DEFAULT_BUFFER_SIZE
    Out[49]: 8192
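buffering=0: buffering is switched off
Binary mode: unbuffered; every write goes straight to the file
Text mode: not allowed (raises ValueError)
buffering=1
Binary mode: line buffering is not supported here; the default buffer size is used instead (Python 3.8 and later also emit a RuntimeWarning)
Text mode: line buffering; the buffer is flushed whenever a newline is written
buffering > 1
Binary mode: the buffer size is the value of buffering
Before each write, the remaining buffer space is checked. If it cannot hold the new data, the buffer is flushed first and the new data is then placed in it; if the new data is itself larger than the buffer, it is written straight through
Text mode: the buffer size is io.DEFAULT_BUFFER_SIZE
If the new data plus the data already in the buffer exceeds the buffer size, the buffer and the new data are flushed together
Note:
Buffering rarely needs attention when reading and writing ordinary files, but it does matter when working with sockets
Special file objects have their own flushing behaviour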
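One way to observe buffering is to compare the size of the file on disk before and after a flush. The following is a rough sketch under the assumption that the file is a regular file on a local filesystem; `buffering_demo` and the file names are invented for the example.

```python
import os
import tempfile


def buffering_demo():
    """Compare on-disk sizes for default, unbuffered and line-buffered writes."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Default buffering: 3 bytes stay in the buffer until flush().
        buffered = os.path.join(tmp, "buffered.bin")
        f = open(buffered, "wb")
        f.write(b"abc")
        sizes["before_flush"] = os.path.getsize(buffered)
        f.flush()
        sizes["after_flush"] = os.path.getsize(buffered)
        f.close()

        # buffering=0 (binary only): the write reaches the file immediately.
        unbuffered = os.path.join(tmp, "unbuffered.bin")
        f = open(unbuffered, "wb", buffering=0)
        f.write(b"abc")
        sizes["unbuffered"] = os.path.getsize(unbuffered)
        f.close()

        # buffering=1 in text mode: a newline triggers a flush.
        # newline="\n" keeps the byte count identical on every platform.
        line_path = os.path.join(tmp, "lines.txt")
        f = open(line_path, "w", buffering=1, newline="\n")
        f.write("line\n")
        sizes["line_buffered"] = os.path.getsize(line_path)
        f.close()

        # buffering=0 is rejected in text mode.
        try:
            open(os.path.join(tmp, "text.txt"), "w", buffering=0)
        except ValueError:
            sizes["text_unbuffered"] = "ValueError"
    return sizes


buffer_sizes = buffering_demo()
print(buffer_sizes)
```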
2. File objects
2.1 File objects are iterable
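    In [85]: cat /tmp/1.log
    1234
    5
    5555
    In [86]: f.tell()    # f was opened earlier and has already been read to the end
    Out[86]: 12
    In [87]: for i in f:    # a file object is iterable; each iteration yields one line
        ...:     print(i)    # prints nothing here: the pointer is already at EOF
        ...:
    In [88]: f.seek(0)
    Out[88]: 0
    In [89]: for i in f:
        ...:     print(i)
        ...:
    1234

    5

    5555
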
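2.2 File objects as context managers
Context management: the file is closed automatically when the block is left, but the with statement does not open a new scope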
    In [141]: with open('/tmp/1.log') as f:
         ...:     print(f.read())
         ...:     f.wirte('sb')
         ...:
    1234 5 5555aaa bbb aaa bbb [aaa bbb ]
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-141-d87fc5f222e1> in <module>()
          1 with open('/tmp/1.log') as f:
          2     print(f.read())
    ----> 3     f.wirte('sb')
          4

    AttributeError: '_io.TextIOWrapper' object has no attribute 'wirte'

    In [142]: f.closed    # the file was closed even though the block raised an exception
    Out[142]: True
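Both points, closing on exit and no new scope, can be shown in a few lines. This is a minimal sketch; `read_then_fail` and the RuntimeError are stand-ins for any code that fails inside the block.

```python
import os
import tempfile


def read_then_fail(path):
    """Open path in a with block, fail inside it, then inspect the file object."""
    try:
        with open(path) as f:
            f.read()
            raise RuntimeError("simulated failure inside the with block")
    except RuntimeError:
        pass
    # f is still bound here because `with` does not create a new scope,
    # and it has been closed although the block exited with an exception.
    return f.closed


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")  # hypothetical scratch file
    with open(sample, "w") as handle:
        handle.write("hello\n")
    closed_after_error = read_then_fail(sample)

print(closed_after_error)
```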
2.3 File-like objects
StringIO
    In [189]: from io import StringIO
    In [190]: sio = StringIO()
    In [191]: sio.readable()
    Out[191]: True
    In [193]: sio.writable()
    Out[193]: True
    In [194]: sio.write('python')
    Out[194]: 6
    In [195]: sio.seek(0)
    Out[195]: 0
    In [196]: sio.read()
    Out[196]: 'python'
    In [197]: sio.tell()
    Out[197]: 6
    In [198]: sio.getvalue()    # returns the whole content regardless of the pointer position; can be called repeatedly
    Out[198]: 'python'
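Other file-like objects include BytesIO and socket
StringIO and BytesIO correspond to file objects opened in t mode and b mode respectively
Purpose:
They simulate a file object in memory, which is faster than going through the disk
A file-like object does not have to be closed, but its buffer occupies memory until it is closed or garbage-collected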
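Because StringIO and BytesIO expose the same methods as real files, code written against a file object can be fed an in-memory buffer instead. A minimal sketch; `write_report` is a hypothetical helper, not part of any library.

```python
import io


def write_report(stream, rows):
    """Write name=value lines to any text stream that has a write() method."""
    for name, value in rows:
        stream.write(f"{name}={value}\n")


# StringIO stands in for a file opened in text mode.
text_buf = io.StringIO()
write_report(text_buf, [("a", 1), ("b", 2)])
report_text = text_buf.getvalue()

# BytesIO stands in for a file opened in binary mode.
bytes_buf = io.BytesIO()
bytes_buf.write("马哥 Python".encode("utf-8"))
report_bytes = bytes_buf.getvalue()

# Closing releases the in-memory buffers.
text_buf.close()
bytes_buf.close()

print(repr(report_text), len(report_bytes))
```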
3. pathlib
Before Python 3.4, os.path was the only way to work with paths
os.path treats paths as strings
Python 3.4 introduced the pathlib library, which handles paths in an object-oriented way
    In [196]: import pathlib
    In [199]: pwd = pathlib.Path('.')    # . stands for the current directory
    In [200]: pwd
    Out[200]: PosixPath('.')
    In [206]: pwd.absolute()    # absolute path
    Out[206]: PosixPath('/root/magedu/python3')
3.1 Directory operations
    In [209]: pwd
    Out[209]: PosixPath('.')
    In [210]: pwd.is_absolute()
    Out[210]: False
    In [211]: pwd.is_dir()
    Out[211]: True
Iterating over a directory:
    In [212]: pwd.iterdir()    # returns a generator over the directory entries
    Out[212]: <generator object Path.iterdir at 0x7fee8a140e08>
    In [213]: for i in pwd.iterdir():
         ...:     i
         ...:
    In [214]: for i in pwd.iterdir():    # lists only the direct children; it does not recurse
         ...:     print(i)
         ...:
    .ipynb_checkpoints
    第一天.ipynb
    nohup.out
    .python-version
    元組及其操做.ipynb
    test
    test.py
    In [216]: for i in pwd.iterdir():
         ...:     print(type(i))
         ...:     print(i)
         ...:
    <class 'pathlib.PosixPath'>    # each entry is itself a Path, so iterating it again gives a recursive walk
    .ipynb_checkpoints
    <class 'pathlib.PosixPath'>
    第一天.ipynb
    <class 'pathlib.PosixPath'>
    nohup.out
    <class 'pathlib.PosixPath'>
    .python-version
    <class 'pathlib.PosixPath'>
    元組及其操做.ipynb
    <class 'pathlib.PosixPath'>
    test
    <class 'pathlib.PosixPath'>
    test.py
    In [217]: type(pwd)
    Out[217]: pathlib.PosixPath
    In [218]: print(type(pwd))
    <class 'pathlib.PosixPath'>
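Since every entry returned by iterdir() is a Path, a recursive walk takes only a few lines. The generator below is a sketch of that idea; `walk` is a name chosen for the example, and symbolic links to directories are deliberately not followed to avoid loops.

```python
import pathlib
import tempfile


def walk(directory):
    """Yield every path below directory, depth first, in sorted order."""
    for entry in sorted(directory.iterdir()):
        yield entry
        # Recurse into real directories only (skip symlinked ones).
        if entry.is_dir() and not entry.is_symlink():
            yield from walk(entry)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    (root / "a" / "b" / "c.txt").write_text("c")
    (root / "d.txt").write_text("d")
    found = [p.relative_to(root).as_posix() for p in walk(root)]

print(found)
```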
Creating a directory:
    In [219]: d = pathlib.Path('/tmp/1.txt')
    In [220]: d.exists()
    Out[220]: True
    In [221]: d.mkdir(755)    # note: 755 here is decimal, not octal; octal is written 0o755
    ---------------------------------------------------------------------------
    FileExistsError                           Traceback (most recent call last)
    <ipython-input-221-ea2331bf5a68> in <module>()
    ----> 1 d.mkdir(755)

    /root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in mkdir(self, mode, parents, exist_ok)
       1225         if not parents:
       1226             try:
    -> 1227                 self._accessor.mkdir(self, mode)
       1228             except FileExistsError:
       1229                 if not exist_ok or not self.is_dir():

    /root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in wrapped(pathobj, *args)
        388         @functools.wraps(strfunc)
        389         def wrapped(pathobj, *args):
    --> 390             return strfunc(str(pathobj), *args)
        391         return staticmethod(wrapped)
        392

    FileExistsError: [Errno 17] File exists: '/tmp/1.txt'

    In [223]: d1 = pathlib.Path('/tmp/11.txt')    # create the Path object first
    In [224]: d1.exists()    # the path does not exist yet
    Out[224]: False
    In [225]: help(d1.mkdir)
    Help on method mkdir in module pathlib:

    mkdir(mode=511, parents=False, exist_ok=False) method of pathlib.PosixPath instance
    # mode: the permission bits
    # parents: if True, missing parent directories are created, like the shell command mkdir -p; if False, a missing parent raises FileNotFoundError
    # exist_ok: if True, no error is raised when the directory already exists
    In [226]: d1.mkdir(0o755)    # create the directory with the given permissions; the default 511 is 0o777; the 0o prefix means octal
    In [227]: d1.exists()
    Out[227]: True
    In [237]: ls -ld /tmp/11.txt
    drwxr-xr-x 2 root root 4096 Nov 2 14:24 /tmp/11.txt/
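The interaction of `parents` and `exist_ok` is easiest to see by calling mkdir() several times on the same nested path. A sketch, with `mkdir_demo` and the directory names a/b/c made up for the purpose:

```python
import pathlib
import tempfile


def mkdir_demo():
    """Call mkdir() with different flags on a nested path and record the outcome."""
    outcome = {}
    with tempfile.TemporaryDirectory() as tmp:
        target = pathlib.Path(tmp, "a", "b", "c")

        # Without parents=True a missing parent directory is an error.
        try:
            target.mkdir()
        except FileNotFoundError:
            outcome["no_parents"] = "FileNotFoundError"

        # parents=True creates a and a/b on the way, like `mkdir -p`.
        target.mkdir(mode=0o755, parents=True)
        outcome["created"] = target.is_dir()

        # A second plain mkdir() fails because the directory now exists...
        try:
            target.mkdir()
        except FileExistsError:
            outcome["again"] = "FileExistsError"

        # ...unless exist_ok=True is passed.
        target.mkdir(exist_ok=True)
        outcome["exist_ok"] = target.is_dir()
    return outcome


mkdir_outcome = mkdir_demo()
print(mkdir_outcome)
```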
Removing a directory:
    In [239]: d1.rmdir()    # only an empty directory can be removed
    In [240]: d1.exists()
    Out[240]: False
    In [246]: d2 = pathlib.Path('/tmp')
    In [247]: d2.rmdir()
    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    <ipython-input-247-ffe14b7c0399> in <module>()
    ----> 1 d2.rmdir()

    /root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in rmdir(self)
       1273         if self._closed:
       1274             self._raise_closed()
    -> 1275         self._accessor.rmdir(self)
       1276
       1277     def lstat(self):

    /root/.pyenv/versions/3.6.1/lib/python3.6/pathlib.py in wrapped(pathobj, *args)
        388         @functools.wraps(strfunc)
        389         def wrapped(pathobj, *args):
    --> 390             return strfunc(str(pathobj), *args)
        391         return staticmethod(wrapped)
        392

    OSError: [Errno 39] Directory not empty: '/tmp'
    # how to remove a non-empty directory is covered later
3.2 Operations common to files and directories
    In [250]: f = pathlib.Path('/tmp/xj/a.txt')
    In [251]: f.exists()
    Out[251]: False
    In [252]: f.is_file()    # when the path does not exist, every is_* method returns False
    Out[252]: False
    In [253]: f.is_dir()
    Out[253]: False
    In [257]: f = pathlib.Path('/tmp/src')
    In [258]: f.exists()
    Out[258]: True
    In [259]: f.cwd()    # current working directory
    Out[259]: PosixPath('/root/magedu/python3')
    In [260]: f.absolute()    # absolute path
    Out[260]: PosixPath('/tmp/src')
    In [261]: f.as_uri()    # convert an absolute path to a URI
    Out[261]: 'file:///tmp/src'
    In [265]: pathlib.Path('~')
    Out[265]: PosixPath('~')
    In [266]: pathlib.Path('~').expanduser()    # expand ~ to the user's home directory
    Out[266]: PosixPath('/root')
    In [275]: f.home()    # home directory
    Out[275]: PosixPath('/root')
    # if a path is a symbolic link, use lchmod to change the permissions of the link itself
    In [267]: f.name    # base name, equivalent to basename
    Out[267]: 'src'
    In [278]: f.parent    # equivalent to dirname
    Out[278]: PosixPath('/tmp')
    In [279]: f.parents    # sequence of all ancestor directories
    Out[279]: <PosixPath.parents>
    In [273]: f.owner()    # owner of the file
    Out[273]: 'root'
    In [274]: f.group()
    Out[274]: 'root'
    In [285]: f = pathlib.Path('/tmp/shell/not_exist.txt')
    In [286]: f.exists()
    Out[286]: True
    In [287]: f.suffix    # file name suffix (the last dot and what follows it)
    Out[287]: '.txt'
    In [288]: f.suffixs
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-288-c25eaddee638> in <module>()
    ----> 1 f.suffixs

    AttributeError: 'PosixPath' object has no attribute 'suffixs'

    In [289]: f.suffixes
    Out[289]: ['.txt']
    In [290]: f = pathlib.Path('/tmp/shell/not_exist.txt.log')
    In [291]: f.exists()
    Out[291]: False
    In [292]: f.suffixes    # works even when the file does not exist
    Out[292]: ['.txt', '.log']
    In [293]: f.suffix
    Out[293]: '.log'
f.stat: file status
    In [297]: f = pathlib.Path('/tmp/shell/not_exist.txt')
    In [298]: f.stat()
    Out[298]: os.stat_result(st_mode=33188, st_ino=530232, st_dev=2050, st_nlink=1, st_uid=0, st_gid=0, st_size=7, st_atime=1499241423, st_mtime=1499241417, st_ctime=1499241417)
    In [299]: f.lstat()    # like stat(), but for a symbolic link it describes the link itself
    Out[299]: os.stat_result(st_mode=33188, st_ino=530232, st_dev=2050, st_nlink=1, st_uid=0, st_gid=0, st_size=7, st_atime=1499241423, st_mtime=1499241417, st_ctime=1499241417)
f.glob: wildcard matching
f.rglob: recursive wildcard matching
    In [307]: d = pathlib.Path('/tmp')
    In [314]: for i in d.glob('*/*.txt'):    # each * matches one path component: .txt files exactly one directory below /tmp
         ...:     print(i)
         ...:
    /tmp/src/helloworld.txt
    /tmp/shell/python.txt
    /tmp/shell/test.txt
    /tmp/shell/not_exist.txt
    In [315]: for i in d.glob('**/*.txt'):    # ** matches any number of directory levels, including none
         ...:     print(i)
         ...:
    /tmp/1.txt
    /tmp/src/helloworld.txt
    /tmp/src/cdn/t2/t2_201706150000_charge.txt
    /tmp/src/cdn/t2/t2_201706150000_role.txt
    /tmp/src/cdn/t2/t2_201706150000_activity_record.txt
    /tmp/src/cdn/t2/t2_201706150000_role_face.txt
    /tmp/src/cdn/t2/t2_201706150000_enter_role_successful.txt
    /tmp/src/cdn/t2/t2_201706150000_task_record.txt
    /tmp/src/cdn/t2/t2_201706150000_user_login_success.txt
    /tmp/src/cdn/t2/t2_201706150000_consumption_record.txt
    /tmp/src/cdn/t2/t2_201706150000_user_login.txt
    /tmp/src/cdn/t2/t2_201706150000_enter_scene.txt
    /tmp/src/cdn/t2/t2_201706150000_mall_transactions.txt
    /tmp/src/cdn/t2/t2_201706150000_enter_role.txt
    /tmp/src/cdn/t2/t2_201706150000_props_record.txt
    /tmp/shell/python.txt
    /tmp/shell/test.txt
    /tmp/shell/not_exist.txt
    In [320]: for i in d.rglob('*/*.txt'):    # rglob searches recursively, as if '**/' were prepended to the pattern
         ...:     print(i)
         ...:
    /tmp/src/helloworld.txt
    /tmp/shell/python.txt
    /tmp/shell/test.txt
    /tmp/shell/not_exist.txt
    /tmp/src/cdn/t2/t2_201706150000_charge.txt
    /tmp/src/cdn/t2/t2_201706150000_role.txt
    /tmp/src/cdn/t2/t2_201706150000_activity_record.txt
    /tmp/src/cdn/t2/t2_201706150000_role_face.txt
    /tmp/src/cdn/t2/t2_201706150000_enter_role_successful.txt
    /tmp/src/cdn/t2/t2_201706150000_task_record.txt
    /tmp/src/cdn/t2/t2_201706150000_user_login_success.txt
    /tmp/src/cdn/t2/t2_201706150000_consumption_record.txt
    /tmp/src/cdn/t2/t2_201706150000_user_login.txt
    /tmp/src/cdn/t2/t2_201706150000_enter_scene.txt
    /tmp/src/cdn/t2/t2_201706150000_mall_transactions.txt
    /tmp/src/cdn/t2/t2_201706150000_enter_role.txt
    /tmp/src/cdn/t2/t2_201706150000_props_record.txt
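The difference between the patterns is easier to see on a tiny tree whose contents are known in advance. The sketch below builds one in a temporary directory; `match` and the file names are illustrative only.

```python
import pathlib
import tempfile


def match(root, pattern, recursive=False):
    """Return sorted, root-relative matches for glob() or rglob()."""
    hits = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p.relative_to(root).as_posix() for p in hits)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "sub" / "deep").mkdir(parents=True)
    (root / "top.txt").write_text("")
    (root / "sub" / "one.txt").write_text("")
    (root / "sub" / "deep" / "two.txt").write_text("")

    same_level = match(root, "*.txt")          # files directly in root
    one_down = match(root, "*/*.txt")          # files exactly one level down
    any_depth = match(root, "**/*.txt")        # files at every depth
    via_rglob = match(root, "*.txt", recursive=True)

print(same_level, one_down, any_depth, via_rglob)
```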
Joining paths:
    In [321]: '/' + 'xj' + '/sb'
    Out[321]: '/xj/sb'
    In [322]: pathlib.Path('/', 'home', 'xj', 'workspace')
    Out[322]: PosixPath('/home/xj/workspace')
    In [323]: pathlib.Path('home', 'xj', 'workspace')
    Out[323]: PosixPath('home/xj/workspace')
    In [324]: print(pathlib.Path('home', 'xj', 'workspace'))
    home/xj/workspace
    In [325]: print(pathlib.Path('/', 'home', 'xj', 'workspace'))
    /home/xj/workspace
    In [331]: print(pathlib.Path('/', '/home', 'xj', 'workspace'))
    /home/xj/workspace
    # separators are added and removed automatically, which makes joining paths more convenient
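Besides passing several components to the constructor, pathlib overloads the `/` operator for joining. The sketch uses PurePosixPath so that the results look the same on every platform; the directory names are the ones from the session above.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home")

# The / operator joins components and inserts separators as needed.
workspace = base / "xj" / "workspace"

# joinpath() does the same with positional arguments.
same = base.joinpath("xj", "workspace")

# An absolute component discards everything before it, which is why
# Path('/', '/home', 'xj', 'workspace') prints /home/xj/workspace.
reset = PurePosixPath("/tmp") / "/home" / "xj"

print(workspace, workspace.parts, workspace.name, workspace.parent, reset)
```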
4. Copying, moving and removing files
4.1 copy
The shutil standard library module
shutil.copyfileobj(fsrc, fdst[, length])    # copy the contents of one file object to another
shutil.copyfile(src, dst, *, follow_symlinks=True)    # copy the contents only, no metadata
shutil.copymode(src, dst, *, follow_symlinks=True)    # copy the permission bits only
shutil.copystat(src, dst, *, follow_symlinks=True)    # copy the metadata only (permissions, timestamps, flags)
shutil.copy(src, dst, *, follow_symlinks=True)    # copy contents and permissions; equivalent to copyfile + copymode
shutil.copy2(src, dst, *, follow_symlinks=True)    # copy contents and metadata; equivalent to copyfile + copystat
shutil.ignore_patterns(*patterns)    # builds the callable passed as the ignore argument of copytree
The copy functions above each work on a single file; none of them recurses into directories
copyfileobj operates on file objects; all the others operate on paths
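To see what each function carries over, the sketch below copies one file three ways and compares permission bits and modification time. It assumes a POSIX system, where chmod has its full effect; `compare_copies` is a made-up name, and the mode 0o640 and timestamp 1000000000 are arbitrary values picked so that differences are visible.

```python
import os
import shutil
import stat
import tempfile


def compare_copies():
    """Copy one file with copyfile, copy and copy2, then inspect each result."""
    info = {}
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.txt")
        with open(src, "w") as f:
            f.write("payload")
        os.chmod(src, 0o640)                            # distinctive permissions
        os.utime(src, (1_000_000_000, 1_000_000_000))   # distinctive atime/mtime

        for func in (shutil.copyfile, shutil.copy, shutil.copy2):
            dst = os.path.join(tmp, func.__name__ + ".txt")
            func(src, dst)
            st = os.stat(dst)
            with open(dst) as f:
                content = f.read()
            info[func.__name__] = {
                "content": content,
                "mode": stat.S_IMODE(st.st_mode),
                "mtime": int(st.st_mtime),
            }
    return info


copies = compare_copies()
for name, details in copies.items():
    print(name, details["content"], oct(details["mode"]), details["mtime"])
```

All three copies carry the same content; `copy` additionally preserves the mode, and only `copy2` preserves the timestamp.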
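shutil.copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2, ignore_dangling_symlinks=False)
Recursively copies a directory tree. The copy_function parameter selects how each file is copied; it can be any of the path-based copy functions above, that is, anything except copyfileobj
shutil.rmtree(path, ignore_errors=False, onerror=None)
Recursively deletes a directory tree.
ignore_errors controls whether errors are ignored, and onerror is a handler that decides what to do with each error. onerror only takes effect when ignore_errors=False; when ignore_errors is True, errors are silently ignored. With ignore_errors=False and no onerror handler, an error is raised as an exception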
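A short sketch that ties copytree, ignore_patterns and rmtree together, and also shows how a non-empty directory is removed. The tree layout and the name `copy_and_remove` are invented for the example.

```python
import pathlib
import shutil
import tempfile


def copy_and_remove():
    """Copy a small tree while skipping *.log files, then delete the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp, "src")
        (src / "sub").mkdir(parents=True)
        (src / "keep.txt").write_text("keep")
        (src / "skip.log").write_text("skip")
        (src / "sub" / "inner.txt").write_text("inner")

        dst = pathlib.Path(tmp, "dst")
        # ignore_patterns() builds the callable that copytree expects for `ignore`.
        shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.log"))
        copied = sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*"))

        # Path.rmdir() refuses to remove a non-empty directory...
        try:
            dst.rmdir()
            rmdir_failed = False
        except OSError:
            rmdir_failed = True

        # ...while rmtree() removes the directory and everything below it.
        shutil.rmtree(dst)
        return copied, rmdir_failed, dst.exists()


copied_entries, rmdir_failed, still_there = copy_and_remove()
print(copied_entries, rmdir_failed, still_there)
```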
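4.2 move
shutil.move(src, dst, copy_function=copy2)
The implementation depends on the operating system and on where dst lives. If the destination is on the same filesystem, move() simply uses the rename system call (os.rename). Otherwise src is first copied to dst (with copytree for a directory, with copy_function for a file) and then removed (with rmtree for a directory)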
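A minimal usage sketch: when dst is an existing directory, move() places src inside it and returns the new path. `move_into_directory`, `report.txt` and `archive` are names chosen for the example.

```python
import os
import shutil
import tempfile


def move_into_directory():
    """Move a file into an existing directory and report where it ended up."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "report.txt")
        with open(src, "w") as f:
            f.write("data")
        archive = os.path.join(tmp, "archive")
        os.mkdir(archive)

        # The return value is the final destination path.
        new_path = shutil.move(src, archive)

        with open(new_path) as f:
            content = f.read()
        relative = os.path.relpath(new_path, tmp).replace(os.sep, "/")
        return relative, os.path.exists(src), content


moved_to, source_still_exists, moved_content = move_into_directory()
print(moved_to, source_still_exists, moved_content)
```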
shutil.disk_usage(path)
Return disk usage statistics about the given path as a named tuple with the attributes total, used and free, which are the amount of total, used and free space, in bytes.
New in version 3.3.
Availability: Unix, Windows.
shutil.chown(path, user=None, group=None)
Change owner user and/or group of the given path.
user can be a system user name or a uid; the same applies to group. At least one argument is required.
See also os.chown(), the underlying function.
Availability: Unix.
New in version 3.3.
shutil.which(cmd, mode=os.F_OK | os.X_OK, path=None)
    In [68]: shutil.which('python')
    Out[68]: '/root/.pyenv/versions/commy/bin/python'
    In [69]: shutil.which('ls')
    Out[69]: '/bin/ls'
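For completeness, a small sketch of the two query helpers. which() returns None when nothing on PATH matches, and disk_usage() returns a named tuple in bytes. The command name below is deliberately nonsensical and is assumed not to exist on the machine running the example.

```python
import os
import shutil

# which() returns the full path of an executable, or None if it is not on PATH.
missing = shutil.which("no-such-command-for-this-demo")

# disk_usage() reports total, used and free space (in bytes) for the
# filesystem that contains the given path.
usage = shutil.disk_usage(os.getcwd())

print(missing)
print(usage.total, usage.used, usage.free)
```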