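An Introduction to Python's os.walk()

Directory traversal with os.walk

Every month there are a few days when I feel like slacking off, and today is one of them, so this post simply shares the directory-traversal function I have just been using at work.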

os.walk

The signature of os.walk is:

os.walk(top, topdown=True, onerror=None, followlinks=False)

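where:

  • top is the directory to traverse.

  • topdown controls whether the tree is traversed from the top down or from the bottom up.

  • onerror can be set to a function that is called when an error occurs during traversal (it receives an OSError instance as its only argument). If it is None, errors are ignored.

  • followlinks controls whether symbolic links to directories are followed. Note that os.walk does not keep track of the directories it has already visited, so following links can recurse forever if a link points back to one of its own parent directories.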
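To make the onerror parameter more concrete, here is a minimal sketch that records errors instead of silently ignoring them. The helper name walk_collecting_errors and the missing directory are made up for illustration; only os.walk and its onerror argument come from the text above.

```python
import os
import tempfile


def walk_collecting_errors(top):
    """Walk `top` and return (visited roots, errors reported by os.walk).

    Hypothetical helper for illustration only.
    """
    errors = []

    def remember(err):
        # os.walk calls this with an OSError instance when listing fails
        errors.append(err)

    visited = [root for root, dirs, files in os.walk(top, onerror=remember)]
    return visited, errors


# Assumption: a path that does not exist, so listing it fails immediately
missing = os.path.join(tempfile.mkdtemp(), "does_not_exist")
visited, errors = walk_collecting_errors(missing)

print("visited:", visited)
print("errors:", [type(e).__name__ for e in errors])
```

With onerror left as None the same call would simply yield nothing, which is easy to mistake for an empty directory.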

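os.walk returns a generator that yields one 3-tuple (root, dirs, files) for each directory it visits: the path of the directory, the list of subdirectory names in it, and the list of file names in it. Note that the two lists contain bare names, not paths. To get a path that starts from root, use os.path.join(root, name).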
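As a small sketch of the os.path.join point, the following collects the full path of every file under a directory. The function name list_file_paths and the tiny temporary tree are assumptions made up for this example.

```python
import os
import tempfile


def list_file_paths(top):
    """Return the sorted full paths of all files below `top` (illustrative helper)."""
    paths = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # `name` is a bare file name; join it with `root` to get a usable path
            paths.append(os.path.join(root, name))
    return sorted(paths)


# Build a tiny throwaway tree so the example is self-contained
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "dir1", "dir4"))
for rel in ("a.py", os.path.join("dir1", "e.py"), os.path.join("dir1", "dir4", "g.py")):
    open(os.path.join(top, rel), "w").close()

relative = [os.path.relpath(p, top) for p in list_file_paths(top)]
print(relative)
```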

Example

Suppose the following files and directories exist:

➜  test_os_walk git:(master) ✗ tree
.
├── a.py
├── b.py
├── c.py
├── dir1
│   ├── dir4
│   │   ├── g.py
│   │   └── h.py
│   ├── dirx
│   │   ├── diry
│   │   │   └── k.py
│   │   └── z.py
│   ├── e.py
│   ├── f.py
│   └── g.py
├── dir2
│   ├── dira
│   │   └── dirb
│   │       └── dirc
│   │           └── aha.py
│   ├── k.py
│   ├── l.py
│   └── m.py
└── dir3
    ├── dir5
    │   └── z.py
    ├── x.py
    └── y.py

10 directories, 17 files

Testing topdown

When I traverse this directory with os.walk, the program and its output look like this:

import os

path = '/Users/nisen/Projects/python_advanced_class/test/test_os_walk'

# topdown=True (the default): each directory is reported before its subdirectories
for root, dirs, files in os.walk(path, topdown=True):
    print('root: %s' % root)
    print('dirs: %s' % dirs)
    print('files: %s' % files)
    print('')

The result is as follows; the root paths show that the traversal is top-down:

➜  test git:(master) ✗ python test11.py
root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
dirs: ['dir1', 'dir2', 'dir3']
files: ['a.py', 'b.py', 'c.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
dirs: ['dir4', 'dirx']
files: ['e.py', 'f.py', 'g.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
dirs: []
files: ['g.py', 'h.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
dirs: ['diry']
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
dirs: []
files: ['k.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
dirs: ['dira']
files: ['k.py', 'l.py', 'm.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
dirs: ['dirb']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
dirs: ['dirc']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
dirs: []
files: ['aha.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
dirs: ['dir5']
files: ['x.py', 'y.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
dirs: []
files: ['z.py']

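With topdown set to False the result is as follows; this time the traversal is bottom-up, so each directory appears after all of its subdirectories: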

➜  test git:(master) ✗ python test11.py
root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dir4
dirs: []
files: ['g.py', 'h.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx/diry
dirs: []
files: ['k.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1/dirx
dirs: ['diry']
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir1
dirs: ['dir4', 'dirx']
files: ['e.py', 'f.py', 'g.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb/dirc
dirs: []
files: ['aha.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira/dirb
dirs: ['dirc']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2/dira
dirs: ['dirb']
files: []

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir2
dirs: ['dira']
files: ['k.py', 'l.py', 'm.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3/dir5
dirs: []
files: ['z.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk/dir3
dirs: ['dir5']
files: ['x.py', 'y.py']

root: /Users/nisen/Projects/python_advanced_class/test/test_os_walk
dirs: ['dir1', 'dir2', 'dir3']
files: ['a.py', 'b.py', 'c.py']
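Bottom-up order (topdown=False) is handy when a directory can only be processed after its contents, for example when deleting empty directories. The sketch below is one possible way to do that; remove_empty_dirs and the temporary tree are hypothetical names used only for this illustration.

```python
import os
import tempfile


def remove_empty_dirs(top):
    """Delete empty directories below `top`, deepest first (illustrative helper)."""
    removed = []
    for root, dirs, files in os.walk(top, topdown=False):
        if root == top:
            continue  # keep the starting directory itself
        # `dirs` was collected before the children were visited, so it may be
        # stale by now; ask the file system again instead of trusting it
        if not os.listdir(root):
            os.rmdir(root)
            removed.append(os.path.relpath(root, top))
    return removed


# Throwaway tree: dira/dirb/dirc is an empty chain, dir3 contains a file
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, "dira", "dirb", "dirc"))
os.makedirs(os.path.join(top, "dir3"))
open(os.path.join(top, "dir3", "x.py"), "w").close()

removed = remove_empty_dirs(top)
print(removed)
```

Because children are visited first, the whole empty chain disappears in a single pass; with a top-down walk only the innermost directory would be empty when it is reached.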

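Modifying the traversal at run time

When topdown is True, you can modify the dirs list in place while handling a directory, and os.walk will descend only into the subdirectories that remain in it. The following example skips every "CVS" directory during the traversal: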

import os
from os.path import join, getsize

for root, dirs, files in os.walk('python/Lib/email'):
    # report the total size of the files directly inside root
    print(root, "consumes", end=" ")
    print(sum(getsize(join(root, name)) for name in files), end=" ")
    print("bytes in", len(files), "non-directory files")
    if 'CVS' in dirs:
        dirs.remove('CVS')  # don't visit CVS directories
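A variation on the example above, for skipping several directory names at once: the key detail is the slice assignment dirs[:] = ..., which changes the list object that os.walk itself holds, whereas dirs = ... would only rebind a local name and prune nothing. The names SKIP and walk_pruned are assumptions invented for this sketch, and the pruning only works with topdown=True.

```python
import os
import tempfile

# Hypothetical set of directory names to skip
SKIP = {"CVS", ".git"}


def walk_pruned(top, skip=SKIP):
    """Like os.walk(top), but never descends into directories named in `skip`."""
    for root, dirs, files in os.walk(top, topdown=True):
        # In-place update: os.walk reads this same list to decide where to go next
        dirs[:] = [d for d in dirs if d not in skip]
        yield root, dirs, files


# Throwaway tree with one normal directory and two that should be skipped
top = tempfile.mkdtemp()
for sub in ("keep", "CVS", ".git"):
    os.makedirs(os.path.join(top, sub))
    open(os.path.join(top, sub, "a.py"), "w").close()

visited = sorted(os.path.relpath(root, top) for root, _, _ in walk_pruned(top))
print(visited)
```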
