Python Syntax Overview and Practical Cheat Sheet

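This article summarizes "Modern Python Development: Syntax Fundamentals and Engineering Practice"; for more Python material, see the Python Learning and Practice Resource Index. It draws on Python Crash Course - Cheat Sheets, pysheeet, and others. It covers only the points and syntax that the author uses regularly at work and considers most important. Readers who want to study Python further, or who are interested in machine learning and data mining, can refer to The Programmer's Handbook of Data Science and Machine Learning in Practice.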

Basic Syntax

Python is a high-level, dynamically typed, multi-paradigm programming language. When writing a Python file, we often start by declaring the file's encoding:

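#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Line 1 (the shebang) tells the shell how to run the script; line 2 declares
# UTF-8 as the source encoding. The declaration must be on the first or second line.

# Declare some other encoding
# -*- coding: <encoding-name> -*-

# Vim also recognizes the following form
# vim:fileencoding=<encoding-name>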

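Life is short, use Python. Its many powerful pieces of syntactic sugar often make Python code look a bit like pseudocode. For example, a simple quicksort in Python is far more compact than the Java equivalent: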

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]  # floor division keeps the index an integer
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3,6,8,10,1,2,1]))
# Prints "[1, 1, 2, 3, 6, 8, 10]"

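Console Interaction

The __name__ variable tells you whether a script is being run directly with the python command or imported from another module. Google's open-source fire is also a good framework for quickly wrapping a class as a command-line tool: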

import fire

class Calculator(object):
  """A simple calculator class."""

  def double(self, number):
    return 2 * number

if __name__ == '__main__':
  fire.Fire(Calculator)

# python calculator.py double 10  # 20
# python calculator.py double --number=15  # 30

In Python 2, print is a statement, whereas in Python 3 it is a function. To use print as a function in Python 2, import it from __future__:

from __future__ import print_function

We can also use pprint to pretty-print console output:

import pprint

stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']
pprint.pprint(stuff)

# Custom parameters
pp = pprint.PrettyPrinter(depth=6)
tup = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead',('parrot', ('fresh fruit',))))))))
pp.pprint(tup)

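Modules

A module in Python is simply a Python source file, which can export classes, functions, and global variables. When we import a module, the module name usually serves as the namespace. A package is a directory of modules; an __init__.py file typically marks a directory as a package:

# Directory layout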
someDir/
    main.py
    siblingModule.py

# siblingModule.py

def siblingModuleFun():
    print('Hello from siblingModuleFun')
    
def siblingModuleFunTwo():
    print('Hello from siblingModuleFunTwo')

# main.py
import siblingModule
import siblingModule as sibMod

sibMod.siblingModuleFun()

from siblingModule import siblingModuleFun
siblingModuleFun()

try:
    # Import 'someModuleA' that is only available in Windows
    import someModuleA
except ImportError:
    try:
        # Import 'someModuleB' that is only available in Linux
        import someModuleB
    except ImportError:
        # Neither module is available on this platform; re-raise the error
        raise

A package can provide a single entry point for all the files in a directory:

someDir/
    main.py
    subDir/
        __init__.py
        subA.py
        subSubDir/
            __init__.py
            subSubA.py

# subA.py

def subAFun():
    print('Hello from subAFun')
    
def subAFunTwo():
    print('Hello from subAFunTwo')

# subSubA.py

def subSubAFun():
    print('Hello from subSubAFun')
    
def subSubAFunTwo():
    print('Hello from subSubAFunTwo')

# __init__.py from subDir

# Adds 'subAFun()' and 'subAFunTwo()' to the 'subDir' namespace 
from .subA import *

# The following two import statements do the same thing: they add 'subSubAFun()' and 'subSubAFunTwo()' to the 'subDir' namespace. The first assumes '__init__.py' in 'subSubDir' is empty; the second assumes '__init__.py' in 'subSubDir' contains 'from .subSubA import *'.

# Assumes '__init__.py' is empty in 'subSubDir'
# Adds 'subSubAFun()' and 'subSubAFunTwo()' to the 'subDir' namespace
from .subSubDir.subSubA import *

# Assumes '__init__.py' in 'subSubDir' has 'from .subSubA import *'
# Adds 'subSubAFun()' and 'subSubAFunTwo()' to the 'subDir' namespace
from .subSubDir import *

# __init__.py from subSubDir

# Adds 'subSubAFun()' and 'subSubAFunTwo()' to the 'subSubDir' namespace
from .subSubA import *

# main.py

import subDir

subDir.subAFun() # Hello from subAFun
subDir.subAFunTwo() # Hello from subAFunTwo
subDir.subSubAFun() # Hello from subSubAFun
subDir.subSubAFunTwo() # Hello from subSubAFunTwo

Expressions and Control Flow

Conditionals

Python uses if, elif, and else for basic conditional branching:

if x < 0:
    x = 0
    print('Negative changed to zero')
elif x == 0:
    print('Zero')
else:
    print('More')

Python also supports a ternary conditional operator:

a if condition else b

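A tuple can be used to achieve a similar effect:

# test must evaluate to True or False
(falseValue, trueValue)[test]

# A safer approach is an explicit comparison, which guarantees a bool index
(falseValue, trueValue)[test == True]

# Or convert with the bool function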
(falseValue, trueValue)[bool(<expression>)]
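
The two forms are not fully interchangeable: a conditional expression evaluates only the selected branch, while the tuple form builds both values before indexing. The sketch below illustrates the difference; the function names are hypothetical and chosen only for this example.

```python
def describe(n):
    # Conditional expression: only the selected branch is evaluated
    return "even" if n % 2 == 0 else "odd"


def describe_with_tuple(n):
    # Tuple indexing: False selects index 0, True selects index 1
    return ("odd", "even")[n % 2 == 0]


def safe_ratio(a, b):
    # The division is skipped entirely when b is zero
    return a / b if b != 0 else 0.0


def tuple_ratio(a, b):
    # Both tuple elements are built first, so a / b runs even when b == 0
    return (0.0, a / b)[b != 0]


print(describe(4), describe_with_tuple(7))
print(safe_ratio(1, 0))

try:
    tuple_ratio(1, 0)
except ZeroDivisionError:
    print("the tuple form evaluated both branches")
```

For this reason the conditional expression is usually the safer choice whenever either branch has side effects or can fail.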

Loops

for-in can be used to iterate over lists and dictionaries:

words = ['cat', 'window', 'defenestrate']

for w in words:
    print(w, len(w))

# The slice operator quickly creates a copy of the list, so the original can be modified safely
for w in words[:]:
    if len(w) > 6:
        words.insert(0, w)

# words -> ['defenestrate', 'cat', 'window', 'defenestrate']

To iterate over a sequence of numbers, use the built-in range function:

a = ['Mary', 'had', 'a', 'little', 'lamb']

for i in range(len(a)):
    print(i, a[i])

Basic Data Types

Built-in functions perform explicit type conversion (casting):

int(str)
float(str)
str(int)
str(float)
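
As a concrete sketch of these conversions in Python 3, the snippet below converts between strings and numbers and shows one way to handle input that is not a valid number. The helper to_int is a hypothetical name introduced for illustration.

```python
def to_int(text, default=None):
    """Convert text to int, returning default when the conversion fails."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return default


print(int("42"))       # 42
print(float("3.5"))    # 3.5
print(str(42) + "!")   # 42!
print(int(3.9))        # 3 (int truncates toward zero rather than rounding)
print(to_int("12a", default=0))  # 0
```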

Number: Numeric Types

x = 3
print(type(x)) # Prints "<class 'int'>"
print(x)       # Prints "3"
print(x + 1)   # Addition; prints "4"
print(x - 1)   # Subtraction; prints "2"
print(x * 2)   # Multiplication; prints "6"
print(x ** 2)  # Exponentiation; prints "9"
x += 1
print(x)  # Prints "4"
x *= 2
print(x)  # Prints "8"
y = 2.5
print(type(y)) # Prints "<class 'float'>"
print(y, y + 1, y * 2, y ** 2) # Prints "2.5 3.5 5.0 6.25"

Booleans

Python provides the usual logical operators, but note that it uses English words rather than symbols such as && and ||.

t = True
f = False
print(type(t)) # Prints "<class 'bool'>"
print(t and f) # Logical AND; prints "False"
print(t or f)  # Logical OR; prints "True"
print(not t)   # Logical NOT; prints "False"
print(t != f)  # Logical XOR; prints "True"

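String: Character Strings

Python 2 has a byte-oriented str type and a separate unicode type, with no distinct bytes type (bytes is merely an alias for str). In Python 3, str is a Unicode string by default, and there are two byte types, bytes and bytearray: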

type("Guido") # string type is str in python2
# <type 'str'>

# Use unicode_literals from __future__ to make string literals unicode in Python 2
from __future__ import unicode_literals
type("Guido") # string type becomes unicode
# <type 'unicode'>
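
For comparison, here is a minimal Python 3 sketch of how str, bytes, and bytearray relate to one another. The sample text is arbitrary; the point is that encoding turns text into bytes and decoding turns bytes back into text.

```python
text = "h\u00e9llo"            # 'héllo', a str of Unicode code points
data = text.encode("utf-8")    # bytes: the UTF-8 encoding of the text

print(type(text).__name__, type(data).__name__)  # str bytes
print(len(text), len(data))    # 5 6 (the accented letter needs two bytes)

restored = data.decode("utf-8")  # back to str
print(restored == text)          # True

buf = bytearray(data)          # bytearray is the mutable counterpart of bytes
buf[0] = ord("H")
print(buf.decode("utf-8"))     # Héllo
```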

Python strings support common operations such as slicing and format templates:

var1 = 'Hello World!'
var2 = "Python Programming"

print("var1[0]: ", var1[0])
print("var2[1:5]: ", var2[1:5])
# var1[0]:  H
# var2[1:5]:  ytho

print("My name is %s and weight is %d kg!" % ('Zara', 21))
# My name is Zara and weight is 21 kg!

# In the snippets below, str, string, list, and mystring stand for your own variables
str[0:4]
len(str)

string.replace("-", " ")
",".join(list)
"hi {0}".format('j')
str.find(",")
str.index(",")   # same, but raises ValueError when the substring is not found
str.count(",")
str.split(",")

str.lower()
str.upper()
str.title()

str.lstrip()
str.rstrip()
str.strip()

str.islower()

# Remove all special characters (requires: import re)
re.sub('[^A-Za-z0-9]+', '', mystring)

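To check whether a string contains a substring, or to find a substring's index:

# The in operator tests for a substring
somestring = "some text"
if "blah" not in somestring:
    print("'blah' not found")

# find returns the index of the first match, or -1 if there is none
s = "This be a string"
if s.find("is") == -1:
    print("No 'is' here!")
else:
    print("Found 'is' in the string.")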

Regex: Regular Expressions

import re

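# Test whether the pattern matches at the start of the string
re.match(r'^[aeiou]', str)

# Replace matches with the string given as the second argument
re.sub(r'^[aeiou]', '?', str)
re.sub(r'(xyz)', r'\1', str)

# Compile a reusable regular expression object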
expr = re.compile(r'^...$')
expr.match(...)
expr.sub(...)
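
A runnable sketch of the same calls may help. It assumes a simple YYYY-MM-DD date pattern (the pattern and the sample strings are illustrative, not from the original) and shows the difference between match and search, capture groups, and backreferences in sub.

```python
import re

# Compiled once, reused many times
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# match only succeeds at the start of the string
m = DATE.match("2017-05-21 release")
year, month, day = m.groups()
print(year, month, day)  # 2017 05 21

# search scans the whole string; match returns None here
print(DATE.match("released on 2017-05-21"))  # None
found = DATE.search("released on 2017-05-21")
print(found.group(0))  # 2017-05-21

# Backreferences \1, \2, \3 reorder the captured groups
swapped = DATE.sub(r"\3/\2/\1", "from 2017-05-21 to 2017-06-01")
print(swapped)  # from 21/05/2017 to 01/06/2017
```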

Some common regular expression use cases:

# Detect an HTML tag
re.search('<[^/>][^>]*>', '<a href="#label">')

# Typical username and password formats
re.match('^[a-zA-Z0-9-_]{3,16}$', 'Foo') is not None
re.match(r'^[\w-]{3,16}$', 'Foo') is not None

# Email
re.match(r'^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$', 'hello.world@example.com')

# Url
exp = re.compile(r'''^(https?:\/\/)? # match http or https
                ([\da-z\.-]+)            # match domain
                \.([a-z\.]{2,6})         # match domain
                ([\/\w \.-]*)\/?$        # match api or file
                ''', re.X)
exp.match('www.google.com')

# IP address
exp = re.compile(r'''^(?:(?:25[0-5]
                     |2[0-4][0-9]
                     |[1]?[0-9][0-9]?)\.){3}
                     (?:25[0-5]
                     |2[0-4][0-9]
                     |[1]?[0-9][0-9]?)$''', re.X)
exp.match('192.168.1.1')

Collection Types

List: Lists

Operation: Creating, Adding, and Removing

list is the basic sequence type:

l = []
l = list()

# The string split method converts a string into a list
str.split(".")

# To assemble a list into a string, use join
list1 = ['1', '2', '3']
str1 = ''.join(list1)

# A list of numbers must be converted to strings first
list1 = [1, 2, 3]
str1 = ''.join(str(e) for e in list1)

Use append to add a single element to a list and extend to concatenate another list onto it:

x = [1, 2, 3]
x.append([4, 5]) # x: [1, 2, 3, [4, 5]]

x = [1, 2, 3]
x.extend([4, 5]) # x: [1, 2, 3, 4, 5]; note that extend returns None

Use pop, slices, del, or remove to delete elements from a list:

myList = [10,20,30,40,50]

# Pop the second element
myList.pop(1) # 20
# myList: [10, 30, 40, 50]

# With no argument, pop removes and returns the last element
myList.pop() # 50

# Delete an element with slices
a = [  1, 2, 3, 4, 5, 6 ]
index = 3 # Only Positive index
a = a[:index] + a[index+1 :] # a: [1, 2, 3, 5, 6]

# Delete an element by index
myList = [10,20,30,40,50]
rmovIndxNo = 3
del myList[rmovIndxNo] # myList: [10, 20, 30, 50]

# Use remove to delete by value
letters = ["a", "b", "c", "d", "e"]
letters.remove(letters[1])
print(*letters) # prints "a c d e"; the * unpacks the list into separate arguments

Iteration: Indexing and Traversal

A basic for loop iterates over the elements of a list, like this:

animals = ['cat', 'dog', 'monkey']
for animal in animals:
    print(animal)
# Prints "cat", "dog", "monkey", each on its own line.

To get each element's index while looping, use the enumerate function:

animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))
# Prints "#1: cat", "#2: dog", "#3: monkey", each on its own line

Python also supports slices:

nums = list(range(5))  # range is lazy in Python 3; list() materializes the integers
print(nums)         # Prints "[0, 1, 2, 3, 4]"
print(nums[2:4])    # Get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(nums[2:])     # Get a slice from index 2 to the end; prints "[2, 3, 4]"
print(nums[:2])     # Get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print(nums[:])      # Get a slice of the whole list; prints "[0, 1, 2, 3, 4]"
print(nums[:-1])    # Slice indices can be negative; prints "[0, 1, 2, 3]"
nums[2:4] = [8, 9]  # Assign a new sublist to a slice
print(nums)         # Prints "[0, 1, 8, 9, 4]"

Comprehensions: Transformations

Python also provides map, reduce, and filter; map transforms a list:

# Use map to square every element of the list
items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))

# map can also apply each function in a list to a value
def multiply(x):
    return (x*x)
def add(x):
    return (x+x)

funcs = [multiply, add]
for i in range(5):
    value = list(map(lambda x: x(i), funcs))
    print(value)

reduce folds a list into a single value:

# reduce combines the values of the list pairwise

from functools import reduce
product = reduce((lambda x, y: x * y), [1, 2, 3, 4])

# Output: 24

filter selects the elements of a list that satisfy a predicate:

number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)

# Output: [-5, -4, -3, -2, -1]
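
The same transformations are often written as comprehensions, which this section's title refers to. The sketch below restates the map and filter examples above in comprehension form and adds dictionary and set variants; the variable names are chosen for illustration.

```python
items = [1, 2, 3, 4, 5]

# Equivalent to list(map(lambda x: x**2, items))
squared = [x ** 2 for x in items]
print(squared)  # [1, 4, 9, 16, 25]

# Equivalent to list(filter(lambda x: x < 0, range(-5, 5)))
less_than_zero = [x for x in range(-5, 5) if x < 0]
print(less_than_zero)  # [-5, -4, -3, -2, -1]

# Transform and filter in a single expression
even_squares = [x ** 2 for x in items if x % 2 == 0]
print(even_squares)  # [4, 16]

# Dictionary and set comprehensions use the same syntax with braces
square_of = {x: x ** 2 for x in items}
lengths = {len(word) for word in ['cat', 'dog', 'monkey']}
print(square_of[3], sorted(lengths))  # 9 [3, 6]
```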

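Dictionaries

Creating, Adding, and Removing

d = {'cat': 'cute', 'dog': 'furry'}  # Create a new dictionary
print(d['cat'])       # Dictionaries do not support dot access; prints "cute"

To merge two or more dictionaries: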

# python 3.5
z = {**x, **y}

# python 2.7
def merge_dicts(*dict_args):
    """
    Given any number of dicts, shallow copy and merge into a new dict,
    precedence goes to key value pairs in latter dicts.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result
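
To make the precedence rule concrete, the following sketch merges two small sample dictionaries (x and y are illustrative values, since the original leaves them undefined) using both approaches. Later dictionaries win on duplicate keys, and the inputs are left unchanged.

```python
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

# Unpacking (Python 3.5+): values from y override those from x
z = {**x, **y}
print(z)  # {'a': 1, 'b': 3, 'c': 4}

# The update-based approach gives the same result on any version
merged = dict(x)   # shallow copy so that x is not modified
merged.update(y)
print(merged == z)  # True

# The originals are untouched
print(x, y)  # {'a': 1, 'b': 2} {'b': 3, 'c': 4}
```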

Indexing and Traversal

Elements can be accessed directly by key:

# Accessing a missing key raises KeyError, so check first or use get
print('cat' in d)     # Check if a dictionary has a given key; prints "True"

# Reading with [] requires the key to exist; otherwise an exception is raised
print(d['monkey'])  # KeyError: 'monkey' not a key of d

# get lets you supply a default value
print(d.get('monkey', 'N/A'))  # Get an element with a default; prints "N/A"
print(d.get('cat', 'N/A'))     # Get an element with a default; prints "cute"


d.keys() # keys returns all the keys

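Use for-in to iterate over a dictionary:

# Iterate over keys
for key in d: ...

# Equivalent, but slower in Python 2, where keys() builds a list first
for k in d.keys(): ...

# Iterate over values directly (Python 2: d.itervalues())
for value in d.values(): ...

# Iterate over key-value pairs in Python 2.x
for key, value in d.iteritems(): ...

# Iterate over key-value pairs in Python 3.x
for key, value in d.items(): ...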

Other Collection Types

Sets

# Same as {"a", "b","c"}
normal_set = set(["a", "b","c"])
 
# Adding an element to normal set is fine
normal_set.add("d")
 
print("Normal Set")
print(normal_set)
 
# A frozen set
frozen_set = frozenset(["e", "f", "g"])
 
print("Frozen Set")
print(frozen_set)
 
# Uncommenting below line would cause error as
# we are trying to add element to a frozen set
# frozen_set.add("h")
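
Beyond adding elements, sets are mostly used for their algebra and for removing duplicates. A short sketch with arbitrary sample values:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(a | b)  # union: {1, 2, 3, 4, 5}
print(a & b)  # intersection: {3, 4}
print(a - b)  # difference: {1, 2}
print(a ^ b)  # symmetric difference: {1, 2, 5}

# Converting to a set removes duplicates; sorted() restores a stable order
unique = sorted(set([3, 1, 3, 2, 1]))
print(unique)  # [1, 2, 3]

# A frozenset is hashable, so it can serve as a dictionary key
labels = {frozenset(["e", "f"]): "pair"}
print(labels[frozenset(["f", "e"])])  # pair
```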

Functions

Defining Functions

Functions in Python are defined with the def keyword, for example:

def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'


for x in [-1, 0, 1]:
    print(sign(x))
# Prints "negative", "zero", "positive"

Python supports creating anonymous functions at runtime, known as lambda functions:

def f(x): return x**2

# Equivalent to
g = lambda x: x**2

Parameters

Optional Arguments: Variable-Length Arguments

def example(a, b=None, *args, **kwargs):
  print(a, b)
  print(args)
  print(kwargs)

example(1, "var", 2, 3, word="hello")
# 1 var
# (2, 3)
# {'word': 'hello'}

a_tuple = (1, 2, 3, 4, 5)
a_dict = {"1":1, "2":2, "3":3}
example(1, "var", *a_tuple, **a_dict)
# 1 var
# (1, 2, 3, 4, 5)
# {'1': 1, '2': 2, '3': 3}

Generators

def simple_generator_function():
    yield 1
    yield 2
    yield 3

for value in simple_generator_function():
    print(value)

# Output
# 1
# 2
# 3

our_generator = simple_generator_function()
next(our_generator)
# 1
next(our_generator)
# 2
next(our_generator)
# 3

# A typical use of generators is iterating over an infinite sequence
# (is_prime is assumed to be defined elsewhere)
def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1
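
Since get_primes never terminates on its own, the caller decides how many values to take. The sketch below supplies a simple trial-division is_prime (an assumption, as the original does not define it) and uses itertools.islice to draw a finite number of values from the infinite generator.

```python
from itertools import islice


def is_prime(number):
    """Trial division; adequate for small numbers in an illustration."""
    if number < 2:
        return False
    for divisor in range(2, int(number ** 0.5) + 1):
        if number % divisor == 0:
            return False
    return True


def get_primes(number):
    # Infinite generator: yields every prime greater than or equal to number
    while True:
        if is_prime(number):
            yield number
        number += 1


# islice stops after five values, so the infinite loop never runs away
first_five = list(islice(get_primes(10), 5))
print(first_five)  # [11, 13, 17, 19, 23]
```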

Decorators

Decorators are a very useful design pattern:

# A simple decorator

from functools import wraps
def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('wrap function')
        return func(*args, **kwargs)
    return wrapper

@decorator
def example(*a, **kw):
    pass

example.__name__  # @wraps preserves the function's attributes
# 'example'

# A decorator that takes an argument

from functools import wraps
def decorator_with_argument(val):
  def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
      print("Val is {0}".format(val))
      return func(*args, **kwargs)
    return wrapper
  return decorator

@decorator_with_argument(10)
def example():
  print("This is example function.")

example()
# Val is 10
# This is example function.

# Equivalent to

def example():
  print("This is example function.")

example = decorator_with_argument(10)(example)
example()
# Val is 10
# This is example function.

Classes and Objects

Class Definition

Defining a class in Python is also straightforward:

class Greeter(object):
    
    # Constructor
    def __init__(self, name):
        self.name = name  # Create an instance variable
        
    # Instance method
    def greet(self, loud=False):
        if loud:
            print('HELLO, %s!' % self.name.upper())
        else:
            print('Hello, %s' % self.name)

g = Greeter('Fred')  # Construct an instance of the Greeter class
g.greet()            # Call an instance method; prints "Hello, Fred"
g.greet(loud=True)   # Call an instance method; prints "HELLO, FRED!"

# isinstance checks whether an object is an instance of a class
ex = 10
isinstance(ex, int)  # True

Managed Attributes

# property, setter, and deleter customize what happens on attribute access via the dot operator

class Example(object):
    def __init__(self, value):
       self._val = value
    @property
    def val(self):
        return self._val
    @val.setter
    def val(self, value):
        if not isinstance(value, int):
            raise TypeError("Expected int")
        self._val = value
    @val.deleter
    def val(self):
        del self._val
    @property
    def square3(self):
        return 2**3

ex = Example(123)
ex.val = "str"
# Traceback (most recent call last):
#   File "", line 1, in
#   File "test.py", line 12, in val
#     raise TypeError("Expected int")
# TypeError: Expected int

Class Methods and Static Methods

class example(object):
  @classmethod
  def clsmethod(cls):
    print("I am classmethod")
  @staticmethod
  def stmethod():
    print("I am staticmethod")
  def instmethod(self):
    print("I am instancemethod")

ex = example()
ex.clsmethod()
# I am classmethod
ex.stmethod()
# I am staticmethod
ex.instmethod()
# I am instancemethod
example.clsmethod()
# I am classmethod
example.stmethod()
# I am staticmethod
example.instmethod()
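# Traceback (most recent call last):
#   File "", line 1, in
# TypeError: instmethod() missing 1 required positional argument: 'self'
# (Python 3 wording; Python 2 reports "unbound method instmethod() ...")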

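Objects

Instantiation

Attribute Operations

Unlike dictionary keys, object attributes are read with the dot operator, and testing for them directly with in is unreliable: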

class A(object):
    @property
    def prop(self):
        return 3

a = A()
print("'prop' in a.__dict__ =", 'prop' in a.__dict__)
print("hasattr(a, 'prop') =", hasattr(a, 'prop'))
print("a.prop =", a.prop)

# 'prop' in a.__dict__ = False
# hasattr(a, 'prop') = True
# a.prop = 3

Prefer hasattr, getattr, and setattr for working with object attributes:

class Example(object):
  def __init__(self):
    self.name = "ex"
  def printex(self):
    print("This is an example")


# Check object has attributes
# hasattr(obj, 'attr')
ex = Example()
hasattr(ex,"name")
# True
hasattr(ex,"printex")
# True
hasattr(ex,"print")
# False

# Get object attribute
# getattr(obj, 'attr')
getattr(ex,'name')
# 'ex'

# Set object attribute
# setattr(obj, 'attr', value)
setattr(ex,'name','example')
ex.name
# 'example'

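Exceptions and Testing

Exception Handling

Context Manager - with

with is commonly used to acquire and release resources:

# Socket is assumed to be a user-defined context manager that wraps a listening socket
host = 'localhost'
port = 5566
with Socket(host, port) as s:
    while True:
        conn, addr = s.accept()
        msg = conn.recv(1024)
        print(msg)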
        conn.send(msg)
        conn.close()
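
The Socket class above is not defined in this article. As a self-contained illustration of the same idea, the sketch below builds a context manager with contextlib.contextmanager from the standard library. The name managed_resource and the events list are hypothetical and exist only to make the open/close ordering visible.

```python
from contextlib import contextmanager

events = []


@contextmanager
def managed_resource(name):
    # Code before yield runs on entering the with block
    events.append("open " + name)
    try:
        yield name.upper()  # the value bound by "as"
    finally:
        # Code in finally runs on exit, even if the block raised
        events.append("close " + name)


with managed_resource("db") as handle:
    events.append("use " + handle)

try:
    with managed_resource("cache"):
        raise RuntimeError("boom")
except RuntimeError:
    pass

print(events)
# ['open db', 'use DB', 'close db', 'open cache', 'close cache']
```

The second with block shows the main benefit: the resource is released even though the body raised an exception.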

Unit Testing

from __future__ import print_function

import unittest

def fib(n):
    return 1 if n<=2 else fib(n-1)+fib(n-2)

def setUpModule():
        print("setup module")
def tearDownModule():
        print("teardown module")

class TestFib(unittest.TestCase):

    def setUp(self):
        print("setUp")
        self.n = 10
    def tearDown(self):
        print("tearDown")
        del self.n
    @classmethod
    def setUpClass(cls):
        print("setUpClass")
    @classmethod
    def tearDownClass(cls):
        print("tearDownClass")
    def test_fib_assert_equal(self):
        self.assertEqual(fib(self.n), 55)
    def test_fib_assert_true(self):
        self.assertTrue(fib(self.n) == 55)

if __name__ == "__main__":
    unittest.main()

Storage

File I/O

Path Handling

The module-level __file__ variable holds the path of the current file, which may be relative; use it to build absolute paths or to locate other files:

import os

# Directory of the current file (may be relative)
dir = os.path.dirname(__file__) # src\app

## once you're at the directory level you want, with the desired directory as the final path node:
dirname1 = os.path.basename(dir) 
dirname2 = os.path.split(dir)[1] ## if you look at the documentation, this is exactly what os.path.basename does.

# Absolute directory of the current file; abspath completes a relative path using the current working directory
os.path.abspath(os.path.dirname(__file__)) # D:\WorkSpace\OWS\tool\ui-tool-svn\python\src\app

# Real directory of the current file, with symbolic links resolved
os.path.dirname(os.path.realpath(__file__)) # D:\WorkSpace\OWS\tool\ui-tool-svn\python\src\app

# Current working directory
os.getcwd()
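
In Python 3, the standard pathlib module offers an object-oriented alternative to these os.path calls. The sketch below is one possible equivalent, not the approach the article itself uses; the conf/settings.ini path is hypothetical and is never created on disk. In a script, Path(__file__).resolve().parent would play the role of the absolute-directory examples above.

```python
from pathlib import Path

# Current working directory, like os.getcwd()
here = Path.cwd()

# The / operator joins path components, like os.path.join
config = here / "conf" / "settings.ini"  # hypothetical file, not created

print(config.name)           # settings.ini
print(config.suffix)         # .ini
print(config.parent.name)    # conf  (like os.path.basename of the directory)
print(config.is_absolute())  # True
print(config.exists())       # whether the path exists on disk
```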

Use listdir, walk, and the glob module to enumerate and search for files:

# List only the files in mypath
from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

# walk traverses directories recursively; the break below stops after the top level
from os import walk

f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break

# Use glob for wildcard pattern matching
import glob
print(glob.glob("/home/adam/*.txt"))
# ['/home/adam/file1.txt', '/home/adam/file2.txt', .... ]

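Simple File Reading and Writing

import os

# Choose the write mode depending on whether the file exists (writepath is the target file path)
mode = 'a' if os.path.exists(writepath) else 'w'

# with closes the file automatically, even if an exception is raised,
# so no explicit f.close() is needed
with open(writepath, mode) as f:
    f.write(...)
    ...

# Read the file's contents
with open(writepath) as f:
    message = f.read()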

Structured File Formats

JSON

import json

# Writing JSON data
with open('data.json', 'w') as f:
     json.dump(data, f)

# Reading data back
with open('data.json', 'r') as f:
     data = json.load(f)
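
The same round trip works on strings with json.dumps and json.loads, which is convenient when no file is involved. The sample data below is illustrative; note how Python types map onto JSON types along the way.

```python
import json

data = {"name": "Zara", "tags": ["a", "b"], "weight": 21, "active": True, "note": None}

# dumps serializes to a string; indent and sort_keys make the output readable and stable
text = json.dumps(data, sort_keys=True, indent=2)
restored = json.loads(text)
print(restored == data)  # True

# separators removes the default spaces for a compact encoding
compact = json.dumps({"b": 1, "a": [1, 2]}, sort_keys=True, separators=(",", ":"))
print(compact)  # {"a":[1,2],"b":1}

# Tuples are written as JSON arrays, so they come back as lists
print(json.loads(json.dumps((1, 2))))  # [1, 2]
```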

XML

We can use lxml to parse and process XML files; this section covers its most common operations. lxml can create Element objects from a string or from a file:

from lxml import etree

# Build from a string
xml = '<a xmlns="test"><b xmlns="test"/></a>'
root = etree.fromstring(xml)
etree.tostring(root)
# b'<a xmlns="test"><b xmlns="test"/></a>'

# Or parse from a file
tree = etree.parse("doc/test.xml")

# Or specify a base URL
root = etree.fromstring(xml, base_url="http://where.it/is/from.xml")

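It provides an iterator for traversing all elements:

# Iterate over every node
for tag in tree.iter():
    if not len(tag):  # leaf elements only (no children)
        print(tag.keys()) # names of the element's attributes
        print(tag.tag, tag.text) # text is the element's text content

# Get the XPath of each element
for e in root.iter():
    print(tree.getpath(e))

lxml supports finding elements with XPath. Note that an XPath query returns a list, and when the document uses namespaces they must be passed explicitly:

root.xpath('//page/text/text()', namespaces={prefix: url})

# getparent returns the parent element; call it repeatedly to walk up the tree
el.getparent()

lxml provides insert, append, and similar methods for manipulating elements:

# append adds the element at the end by default
st = etree.Element("state", name="New Mexico")
co = etree.Element("county", name="Socorro")
st.append(co)

# insert places the element at a given position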
node.insert(0, newKid)

Excel

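Use xlrd to read Excel files and xlsxwriter to write and manipulate them.

# Read the raw value of a cell (sh is assumed to be an xlrd sheet object)
sh.cell(rx, col).value

# Create a new file
workbook = xlsxwriter.Workbook(outputFile)
worksheet = workbook.add_worksheet()

# Start writing at row 0
row = 0

# Iterate over a 2D array and write it to Excel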
for rowData in array:
    for col, data in enumerate(rowData):
        worksheet.write(row, col, data)
    row = row + 1

workbook.close()

File System

For high-level file operations, use Python's built-in shutil module:

# Recursively delete the appName directory and everything under it
shutil.rmtree(appName)
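
Because rmtree is destructive, it is worth trying in a scratch location first. The sketch below works entirely inside a temporary directory created with tempfile (the directory and file names are hypothetical) and also shows copytree and move, two other common shutil operations.

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so nothing real is touched
base = tempfile.mkdtemp()
src = os.path.join(base, "app")
os.makedirs(os.path.join(src, "static"))
with open(os.path.join(src, "static", "index.html"), "w") as f:
    f.write("<h1>hi</h1>")

# Copy a whole directory tree
backup = os.path.join(base, "app_backup")
shutil.copytree(src, backup)

# Move (rename) a file inside the copy
shutil.move(os.path.join(backup, "static", "index.html"),
            os.path.join(backup, "index.html"))

# Recursively delete the original tree
shutil.rmtree(src)

copied_ok = os.path.isfile(os.path.join(backup, "index.html"))
src_gone = not os.path.exists(src)
print(copied_ok, src_gone)  # True True

# Clean up the scratch directory
shutil.rmtree(base)
```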

Networking

Requests

Requests is an elegant, easy-to-use HTTP library for Python:

import requests

r = requests.get('https://api.github.com/events')
r = requests.get('https://api.github.com/user', auth=('user', 'pass'))

r.status_code
# 200
r.headers['content-type']
# 'application/json; charset=utf8'
r.encoding
# 'utf-8'
r.text
# u'{"type":"User"...'
r.json()
# {u'private_gists': 419, u'total_private_repos': 77, ...}

r = requests.put('http://httpbin.org/put', data = {'key':'value'})
r = requests.delete('http://httpbin.org/delete')
r = requests.head('http://httpbin.org/get')
r = requests.options('http://httpbin.org/get')

Data Storage

MySQL

import pymysql.cursors

# Connect to the database
connection = pymysql.connect(host='localhost',
                             user='user',
                             password='passwd',
                             db='db',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)

try:
    with connection.cursor() as cursor:
        # Create a new record
        sql = "INSERT INTO `users` (`email`, `password`) VALUES (%s, %s)"
        cursor.execute(sql, ('webmaster@python.org', 'very-secret'))

    # connection is not autocommit by default. So you must commit to save
    # your changes.
    connection.commit()

    with connection.cursor() as cursor:
        # Read a single record
        sql = "SELECT `id`, `password` FROM `users` WHERE `email`=%s"
        cursor.execute(sql, ('webmaster@python.org',))
        result = cursor.fetchone()
        print(result)
finally:
    connection.close()
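
The pymysql example needs a running MySQL server. To try the same pattern without one, the sketch below swaps in the standard-library sqlite3 module with an in-memory database; this is a stand-in for experimentation, not the article's MySQL setup. Note that sqlite3 uses ? as its parameter placeholder where pymysql uses %s, and that the table schema here is assumed for illustration.

```python
import sqlite3

# An in-memory database disappears when the connection is closed
connection = sqlite3.connect(":memory:")

try:
    connection.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password TEXT)"
    )

    # Parameters are passed separately from the SQL text, never formatted into it
    connection.execute(
        "INSERT INTO users (email, password) VALUES (?, ?)",
        ("webmaster@python.org", "very-secret"),
    )
    connection.commit()

    cursor = connection.execute(
        "SELECT id, password FROM users WHERE email = ?",
        ("webmaster@python.org",),
    )
    result = cursor.fetchone()
    print(result)  # (1, 'very-secret')
finally:
    connection.close()
```

Passing values as parameters, as both libraries do here, lets the driver handle quoting and helps prevent SQL injection.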