Python File Operations

Date: 2019-11-24

Tags: python, files


1. Reading and Writing Files

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# @Time    : 2018/1/25 20:49
# @Author  : zhouyuyao
# @File    : demonWrite.py
# PyCharm 2017.3.2 (Community Edition)
# Build #PC-173.4127.16, built on December 19, 2017
# JRE: 1.8.0_152-release-1024-b8 amd64
# JVM: OpenJDK 64-Bit Server VM by JetBrains s.r.o
# Windows 10 10.0
# Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) 
# [MSC v.1900 64 bit (AMD64)] on win32

if __name__ == "__main__":
    filename = input("Please input the name of file:")
    # Open the file for writing; the with block closes it on exit.
    with open(filename, "w") as f:
        while True:      # in Python 3, `while True` is as efficient as `while 1`
            context = input("Please input context('EOF' will close file): ")
            if context == "EOF":
                break
            f.write(context)
            f.write("\n")

    # Reopen the file for reading and print its contents.
    with open(filename) as f_read:
        read_context = f_read.read()
    print("------------start-------------")
    print(read_context)
    print("-------------end--------------")

Output:

Please input the name of file:z.log
Please input context('EOF' will close file): hello
Please input context('EOF' will close file): the weather is cool
Please input context('EOF' will close file): you have wear more clothes
Please input context('EOF' will close file): EOF
------------start-------------
hello
the weather is cool
you have wear more clothes

-------------end--------------

2. Methods for Reading Files

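import codecs

ENCODING = "utf-8"       # character encoding
f = open("z.log", encoding=ENCODING)
print(f.name)            # file name
print(f.readline())      # read the next line as a string
print(f.readlines())     # read the remaining lines into a list
f.close()

with codecs.open("z.log", "r", encoding=ENCODING) as f:
    print(f.read())      # read the whole file as a single string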
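
The read methods above differ mainly in what they return. The following is a minimal sketch rather than part of the original script: it assumes a throwaway temporary file, and the name `demo_path` and the sample contents are made up for illustration.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.log")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("hello\nthe weather is cool\n")

with open(demo_path, encoding="utf-8") as f:
    first_line = f.readline()    # one line as a str, newline included
    rest = f.readlines()         # the remaining lines as a list of str

with open(demo_path, encoding="utf-8") as f:
    whole = f.read()             # the entire file as a single str

with open(demo_path, encoding="utf-8") as f:
    # Iterating over the file object yields one line at a time.
    stripped = [line.rstrip("\n") for line in f]

print(repr(first_line))
print(rest)
print(repr(whole))
print(stripped)
```

Iterating over the file object is usually the better choice for large files, since it does not load every line into memory at once.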

3. Encoding Issues

Encodings:
Encodings that support Chinese: utf-8, gbk, gb2312

decode: convert bytes to text
encode: convert text to bytes

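In Python 2, a source file that contains Chinese characters but has no encoding declaration at the top raises an error.
Python 2 treats source files as ASCII by default, and ASCII cannot represent Chinese characters, so an exception is thrown. (Python 3 defaults to UTF-8.)
The fix is to tell Python which encoding the file uses. For Chinese, common choices are utf-8, gbk, and gb2312; just add the following line at the very top of the source file: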

# -*- coding:utf-8 -*-

How Python converts between encodings:
source encoding -> Unicode -> target encoding

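In Python 2, when a byte string containing Chinese is encoded as gbk, Python first decodes it to Unicode implicitly and then encodes the result as gbk. Because this decoding happens automatically, if no decoding is specified Python uses the encoding reported by sys.getdefaultencoding(), which is ASCII by default in Python 2 and cannot decode Chinese. Decoding explicitly avoids the problem.
Method 1:
s.decode("utf-8").encode("gbk")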
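
The same source -> Unicode -> target chain can be tried in Python 3, where it starts from a bytes object because str is already Unicode. This is a minimal sketch; the sample text and the variable names are illustrative assumptions, not taken from the original post.

```python
# Sample text chosen for illustration: the two characters of "Chinese".
text = "\u4e2d\u6587"

utf8_bytes = text.encode("utf-8")            # data in the source encoding
unicode_text = utf8_bytes.decode("utf-8")    # step 1: decode to Unicode (str)
gbk_bytes = unicode_text.encode("gbk")       # step 2: encode to the target

print(len(utf8_bytes))                  # UTF-8 uses 3 bytes per character here
print(len(gbk_bytes))                   # GBK uses 2 bytes per character here
print(gbk_bytes.decode("gbk") == text)  # the round trip preserves the text
```

Naming the source encoding explicitly in the decode step is what keeps Python from falling back to a default that cannot handle Chinese.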

4. Sorting a File

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# @Time    : 2018/1/25 23:06
# @Author  : zhouyuyao
# @File    : sortUIDPasswd.py
# PyCharm 2017.3.2 (Community Edition)
# Build #PC-173.4127.16, built on December 19, 2017
# JRE: 1.8.0_152-release-1024-b8 amd64
# JVM: OpenJDK 64-Bit Server VM by JetBrains s.r.o
# Windows 10 10.0
# Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) 
# [MSC v.1900 64 bit (AMD64)] on win32
import codecs

file = "passwd"
sortfile = "sortpasswd.txt"

# Read every line of the passwd file (fields are colon-separated).
with codecs.open(file, encoding="utf-8") as f:
    filecontext = f.readlines()

# Sort by the numeric UID in the third field. Sorting the lines directly
# also avoids writing a line twice when two accounts share a UID.
filecontext.sort(key=lambda line: int(line.split(":")[2]))

# The output file is opened in binary mode, so each line is encoded first.
with codecs.open(sortfile, "wb") as fsort:
    for line in filecontext:
        print(line, end="")
        fsort.write(line.encode("utf-8"))
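
The sort-by-UID step can be tried without a real passwd file. The sketch below uses made-up sample lines in the usual colon-separated layout; `sample_lines` and the helper `uid_of` are hypothetical names introduced only for this example.

```python
# Made-up sample entries in the passwd layout name:password:UID:GID:...
sample_lines = [
    "daemon:x:2:2:daemon:/sbin:/sbin/nologin\n",
    "root:x:0:0:root:/root:/bin/bash\n",
    "bin:x:1:1:bin:/bin:/sbin/nologin\n",
]


def uid_of(line):
    """Return the numeric UID, the third colon-separated field."""
    return int(line.split(":")[2])


# sorted() returns a new list ordered by UID and leaves the input untouched.
sorted_lines = sorted(sample_lines, key=uid_of)
names = [line.split(":")[0] for line in sorted_lines]
print(names)
```

Converting the field with int matters: compared as strings, a UID of "10" would sort before "2".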

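Python 3 draws a much clearer line between text and binary data.
Text is always Unicode and is represented by the str type;
binary data is represented by the bytes type.
A string can be encoded into bytes with encode, and bytes can be decoded back into a string with decode.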
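
The str/bytes split can be seen in a few lines. This is a small sketch with an arbitrary sample string; the variable names are illustrative assumptions.

```python
text = "caf\u00e9"               # str: a sequence of Unicode code points
data = text.encode("utf-8")      # bytes: a sequence of raw byte values

print(type(text).__name__, len(text))    # length counted in characters
print(type(data).__name__, len(data))    # length counted in bytes
print(data.decode("utf-8") == text)      # decode reverses encode

# Python 3 refuses to mix the two types instead of converting implicitly.
try:
    text + data
except TypeError as exc:
    mixed_error = type(exc).__name__
print(mixed_error)
```

The lengths differ because the accented character takes two bytes in UTF-8 while counting as one character in the str.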