



2.1 String operations


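1. startswith: checks whether a string begins with a given prefix.

2. endswith: checks whether a string ends with a given suffix.

3. contain: Python has no contain function. Use 'test' in somestring to test for a substring. index also works, but it raises ValueError when the substring is absent; find returns -1 instead.

4. strip: removes whitespace, or a given set of characters, from both ends of a string.

5. len: returns the length of a string, len(str).

6. upper, lower: convert a string to upper or lower case.

7. split: splits a string on a separator.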
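The seven operations above can be tried together in a few lines. This is a minimal sketch; the sample strings `url` and `raw` are made up for illustration.

```python
# Sample inputs (illustrative values only, not from the original post)
url = "https://example.com/data/report.CSV"
raw = "  --hello, world--  "

starts = url.startswith("https://")     # prefix check
ends = url.lower().endswith(".csv")     # suffix check after normalising case
has_data = "data" in url                # membership test replaces a contain()
pos = url.find("data")                  # position of the match, -1 if absent

cleaned = raw.strip().strip("-")        # drop whitespace first, then the dashes
length = len(cleaned)
shout = cleaned.upper()
parts = cleaned.split(", ")             # split on comma followed by a space

try:
    url.index("missing")                # index raises instead of returning -1
    index_raised = False
except ValueError:
    index_raised = True
```

Note that strip only touches the two ends of the string; characters in the middle are left alone.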

2.2 File operations


1. walk: os.walk recursively traverses a folder and yields every file in it.

2. os.path: operations on file and folder paths.


import os
import shutil

def isFile(name):
    return os.path.isfile(name)

def isDir(name):
    return os.path.isdir(name)

def getDirPath(filename):
    return os.path.dirname(filename)

def getFilename(path):
    return os.path.basename(path)

def getExt(filename):
    # Extension including the leading dot, e.g. '.gz'
    return os.path.splitext(filename)[1]

def getDirAndFileNameWithoutExt(filename):
    return os.path.splitext(filename)[0]

def getFilenameWithoutExt(filename):
    return getFilename(getDirAndFileNameWithoutExt(filename))

def changeExt(filename, ext):
    # Returns the base name only; the directory part is dropped
    if not ext.startswith('.'):
        ext = '.' + ext
    return getFilenameWithoutExt(filename) + ext

def deleteFileOrFolder(path):
    # Best-effort delete: filesystem errors are deliberately ignored
    try:
        if isFile(path):
            os.remove(path)
        elif isDir(path):
            shutil.rmtree(path)  # os.rmdir(path) only removes empty folders
    except OSError:
        pass
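The recursive traversal mentioned for walk could look roughly like this. It is a sketch only: `list_files` is a hypothetical helper name, and the small directory tree is created in a temporary folder purely for demonstration.

```python
import os
import tempfile

def list_files(root_dir):
    """Return the full paths of all files under root_dir, recursively."""
    found = []
    # os.walk yields (current folder, subfolder names, file names) per folder
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)

# Build a throwaway tree: a.txt at the top level and sub/b.log below it
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.log")):
        with open(os.path.join(tmp, rel), "w") as fh:
            fh.write("x")
    walked = [os.path.relpath(p, tmp) for p in list_files(tmp)]
```

Joining `root` with each file name is what turns the bare names from os.walk into usable paths.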

2.3 Compression and decompression



       .tar.gz files can be compressed and extracted directly with the tarfile package. Import it first: import tarfile. Extraction is as follows:

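# tar_path is a placeholder for the archive file; the original argument was lost from the source text
tar = tarfile.open(tar_path, 'r:gz')
for file_name in tar.getnames():
    tar.extract(file_name, path)  # path is the destination folder
tar.close()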


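Compression is as follows:

# tar_path is a placeholder for the archive to create; the original argument was lost from the source text
tar = tarfile.open(tar_path, 'w:gz')
if isFile(srcpath):
    tar.add(srcpath, arcname=srcpath)
elif isDir(srcpath):
    for root, dirs, files in os.walk(srcpath):
        for file in files:
            fullpath = os.path.join(root, file)
            # Paths relative to srcpath keep same-named files in different subfolders apart
            tar.add(fullpath, arcname=os.path.relpath(fullpath, srcpath))
tar.close()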


mode             action
'r' or 'r:*'     Open for reading with transparent compression (recommended).
'r:'             Open for reading exclusively without compression.
'r:gz'           Open for reading with gzip compression.
'r:bz2'          Open for reading with bzip2 compression.
'a' or 'a:'      Open for appending with no compression. The file is created if it does not exist.
'w' or 'w:'      Open for uncompressed writing.
'w:gz'           Open for gzip compressed writing.
'w:bz2'          Open for bzip2 compressed writing.
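To see two of these modes side by side, the sketch below writes a one-file archive with 'w:gz' and reads it back with 'r:*', which detects the compression on its own. The file name note.txt and its content are made up for the demonstration.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a small source file to archive
    src = os.path.join(tmp, "note.txt")
    with open(src, "w") as fh:
        fh.write("hello")

    archive = os.path.join(tmp, "note.tar.gz")

    # 'w:gz' writes a gzip-compressed archive
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="note.txt")

    # 'r:*' reads it back without naming the compression type
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()
        content = tar.extractfile("note.txt").read()
```

Opening the archive in a with block closes it automatically, so no explicit tar.close() is needed.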


       .gz files can be compressed and decompressed directly with the gzip package. Import it first: import gzip. Decompression is as follows:

# Output file name: the input path with its .gz suffix removed
fname = path.replace('.gz', '').replace('.GZ', '')
gfile = gzip.GzipFile(path)
with open(fname, 'wb') as out:
    out.write(gfile.read())
gfile.close()


Compression is as follows:

gfile = gzip.GzipFile(srcpath + '.gz', mode='w')
with open(srcpath, 'rb') as src:
    gfile.write(src.read())
gfile.close()
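Reading the whole file into memory, as above, is fine for small inputs. For larger files, one alternative is to stream the data in chunks with shutil.copyfileobj. The sketch below is one way to do that; `gzip_file` and `gunzip_file` are hypothetical helper names, and the payload is sample data.

```python
import gzip
import os
import shutil
import tempfile

def gzip_file(src_path, dest_path):
    """Compress src_path into dest_path, copying in chunks."""
    with open(src_path, "rb") as src, gzip.open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

def gunzip_file(src_path, dest_path):
    """Decompress src_path into dest_path, copying in chunks."""
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

# Round trip on sample data in a temporary folder
payload = b"line of text\n" * 1000
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "data.txt")
    packed = plain + ".gz"
    restored_path = os.path.join(tmp, "restored.txt")
    with open(plain, "wb") as fh:
        fh.write(payload)
    gzip_file(plain, packed)
    gunzip_file(packed, restored_path)
    with open(restored_path, "rb") as fh:
        restored = fh.read()
    packed_size = os.path.getsize(packed)
```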



       .zip files can be compressed and extracted directly with the zipfile package. Import it first: import zipfile. Extraction is as follows:

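zip_file = zipfile.ZipFile(path, mode='r')
# namelist() belongs to the opened archive object, not to the zipfile module
for name in zip_file.namelist():
    # Extract into a folder named after the archive, without its extension
    zip_file.extract(name, getFilenameWithoutExt(path))
zip_file.close()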


Compression is as follows:

zip_file = zipfile.ZipFile(zippath, mode='w')
if isFile(srcpath):
    zip_file.write(srcpath, arcname=srcpath)
elif isDir(srcpath):
    for root, dirs, files in os.walk(srcpath):
        for file in files:
            fullpath = os.path.join(root, file)
            # Paths relative to srcpath keep same-named files in different subfolders apart
            zip_file.write(fullpath, arcname=os.path.relpath(fullpath, srcpath))
zip_file.close()
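A round trip is a quick way to check that a folder survives zipping and unzipping. The sketch below assumes a hypothetical helper `zip_dir` and a made-up tree with two files that share the name a.txt; entries are stored under paths relative to the source folder, and extractall restores them in one call.

```python
import os
import tempfile
import zipfile

def zip_dir(src_dir, zip_path):
    """Zip every file under src_dir, keeping paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                zf.write(full, arcname=os.path.relpath(full, src_dir))

with tempfile.TemporaryDirectory() as tmp:
    # Source tree: src/a.txt and src/sub/a.txt (same file name twice)
    src = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src, "sub"))
    with open(os.path.join(src, "a.txt"), "w") as fh:
        fh.write("top")
    with open(os.path.join(src, "sub", "a.txt"), "w") as fh:
        fh.write("nested")

    # The archive lives outside src so it is not zipped into itself
    archive = os.path.join(tmp, "out.zip")
    zip_dir(src, archive)

    out = os.path.join(tmp, "out")
    with zipfile.ZipFile(archive) as zf:
        entry_names = sorted(zf.namelist())
        zf.extractall(out)

    with open(os.path.join(out, "sub", "a.txt")) as fh:
        nested_text = fh.read()
```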



3.1 Importing hdfs3


from hdfs3 import HDFileSystem

hdfs = HDFileSystem(host='namenode', port=8020)

3.2 Creating a folder


3.3 Uploading files

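       To upload a file, you only need to give the local file path and the destination path in HDFS. The HDFS path must include the file name as well. The call is hdfs.put(localfile, remotefile).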
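Because the remote path has to include the file name, it can help to build it from the local file name and a target folder. The sketch below is an illustration only: `remote_path_for` and `upload` are hypothetical helpers, the paths are made up, and `fs` is assumed to be any object with a put(localfile, remotefile) method, such as the hdfs object created in section 3.1. A small stand-in class is used here so the example runs without a cluster.

```python
import os
import posixpath

def remote_path_for(localfile, remote_dir):
    """Build the full HDFS target path, file name included."""
    # HDFS paths always use '/', so join with posixpath rather than os.path
    return posixpath.join(remote_dir, os.path.basename(localfile))

def upload(fs, localfile, remote_dir):
    """Upload localfile into remote_dir through fs.put and return the target."""
    target = remote_path_for(localfile, remote_dir)
    fs.put(localfile, target)
    return target

class RecordingFS:
    """Stand-in for an HDFS client: records put calls instead of uploading."""
    def __init__(self):
        self.calls = []

    def put(self, localfile, remotefile):
        self.calls.append((localfile, remotefile))

demo_fs = RecordingFS()
uploaded_to = upload(demo_fs, "/tmp/data/part-0001.csv", "/user/demo/in")
```

With a real connection, the same call would be upload(hdfs, localfile, remote_dir).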

3.4 Wrapping the HDFS operations


def exists(remotepath):
    return hdfs.exists(remotepath)

def mkdir(remotepath):
    # Create the folder only if it is not already there
    if not exists(remotepath):
        hdfs.mkdir(remotepath)

def get(remotepath, localpath):
    if exists(remotepath):
        hdfs.get(remotepath, localpath)

def put(localfile, remotefile):
    # Make sure the parent folder exists before uploading
    remote_dir = getDirPath(remotefile)  # getDirPath is defined in section 2.2
    mkdir(remote_dir)
    hdfs.put(localfile, remotefile)

def delete(remotepath):
    if exists(remotepath):
        hdfs.rm(remotepath, recursive=True)


