Fluent Python Reading Notes, Chapter 10: Sequence Hacking, Hashing, and Slicing

Date: 2019-11-16

Sequence Hacking, Hashing, and Slicing

Building on the Vector2d class

Requirements

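To support code such as Vector(3, 4) and Vector(3, 4, 5), we could let the __init__ method accept any number of arguments (via *args). The code below instead takes a single iterable, as the built-in sequence types do.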

If a Vector instance has more than 6 components, the string produced by repr() is abbreviated with "...". The reprlib module produces such limited-length representations.

from array import array
import reprlib
import math


class Vector:
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __iter__(self):
        return iter(self._components)
        
    # This is the key part
    def __repr__(self):
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return 'Vector({})'.format(components)
        
print(Vector([3.1, 4.2]))
print(Vector((3, 4, 5)))
print(Vector(range(10)))

❸ Use reprlib.repr() to get a limited-length representation of self._components (for example, array('d', [0.0, 1.0, 2.0, 3.0, 4.0, ...])).
❹ Before plugging the string into a Vector constructor call, remove the leading array('d', and the trailing ).
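The two steps above can be tried in isolation. This is a minimal sketch that reproduces only the string handling inside __repr__; the variable names limited and inner are chosen here for illustration.

```python
import reprlib
from array import array

components = array('d', range(10))

# Step 1: limited-length representation of the array
limited = reprlib.repr(components)
print(limited)

# Step 2: keep everything from '[' up to, but not including, the final ')'
inner = limited[limited.find('['):-1]
print('Vector({})'.format(inner))
```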

Protocols and duck typing

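In object-oriented programming, a protocol is an informal interface, defined only in documentation and not in code.
For example, the sequence protocol in Python requires just the __len__ and __getitem__ methods.
Any class (such as Spam) that implements these two methods with the standard signature and semantics can be used anywhere a sequence is expected.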
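As a rough illustration of the sequence protocol, here is a hypothetical Spam class (its contents are invented for this sketch). It inherits from nothing special, yet several standard tools accept it because it provides __len__ and __getitem__.

```python
import random


class Spam:
    """Hypothetical class: not a subclass of any sequence type."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, position):
        return self._items[position]


spam = Spam(['egg', 'bacon', 'sausage'])
print(len(spam))
print(spam[0], spam[-1])
print('bacon' in spam)        # the in operator falls back to __getitem__
print(list(reversed(spam)))   # reversed() uses __len__ and __getitem__
print(random.choice(spam))    # random.choice needs only len() and indexing
```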

The code from Chapter 1, shown again:

import collections

Card = collections.namedtuple('Card', ['rank', 'suit'])


class FrenchDeck:
    ranks = [str(n) for n in range(2, 11)] + list('JQKA')
    suits = 'spades diamonds clubs hearts'.split()

    def __init__(self):
        self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks]

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, position):
        return self._cards[position]

Vector take #2: a sliceable sequence

from array import array
import reprlib
import math


class Vector(object):
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __iter__(self):
        return iter(self._components)

    def __repr__(self):
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return 'Vector({})'.format(components)

    def __str__(self):
        return str(tuple(self))

    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(self._components))

    def __eq__(self, other):
        return tuple(self) == tuple(other)

    def __abs__(self):
        return math.sqrt(sum(x * x for x in self))

    def __bool__(self):
        return bool(abs(self))

    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(memv)

    def __len__(self):
        return len(self._components)

    def __getitem__(self, index):
        return self._components[index]


v1 = Vector([3, 4, 5])
print(len(v1))

print(v1[0], v1[-1])

v7 = Vector(range(7))
print(v7[1:4])

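Slicing is now supported too, but not perfectly. It would be better if a slice of a Vector were also a Vector instance rather than an array.

To make slices of a Vector into Vector instances, we cannot simply delegate slicing to the array. We have to analyze the argument passed to __getitem__ and handle it appropriately.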

How slicing works

class MySeq:
    def __getitem__(self, index):
        return index


s = MySeq()
print(s[1])

print(s[1:4])

print(s[1:4:2])

print(s[1:4:2, 9])

print(s[1:4:2, 7:9])

❸ The notation 1:4 becomes slice(1, 4, None).
❹ slice(1, 4, 2) means start at 1, stop at 4, step by 2.
❺ Surprise: when there are commas inside the [], __getitem__ receives a tuple.
❻ The tuple may even hold several slice objects.

Inspecting the attributes of the slice class

print(slice)
print(dir(slice))

['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'indices', 'start', 'step', 'stop']

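Inspecting slice reveals the data attributes start, stop, and step, plus an indices method.

The indices method exposes the tricky logic that the built-in sequences use to gracefully handle missing or negative indices and slices that are longer than the target sequence.
Given a length, it returns a "normalized" (start, stop, stride) tuple: missing values are filled in, negative indices are resolved against the length, and out-of-range indices are clipped to the bounds of a sequence of that length.
In short, it turns negative indices and indices beyond the length into ordinary indices.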

aa = 'ABCDE'

print(slice(None, 10, 2).indices(5))
print(slice(-3, None, None).indices(5))

print('='*40)
print(slice(None, 10, 2).indices(len(aa)))
print(slice(-3, None, None).indices(len(aa)))


print(aa[-3:])
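To make the normalization concrete, the sketch below shows the tuples that indices returns for a length-5 sequence and uses them to rebuild a slice by hand. The helper manual_slice is a hypothetical name introduced only for this illustration.

```python
text = 'ABCDE'


def manual_slice(seq, sl):
    """Hypothetical helper: rebuild seq[sl] from the normalized indices."""
    start, stop, step = sl.indices(len(seq))
    return [seq[i] for i in range(start, stop, step)]


print(slice(None, 10, 2).indices(len(text)))     # (0, 5, 2)
print(slice(-3, None, None).indices(len(text)))  # (2, 5, 1)
print(manual_slice(text, slice(-3, None)))       # ['C', 'D', 'E']

# With a negative stride the normalized stop can be -1
print(slice(None, None, -1).indices(len(text)))  # (4, -1, -1)
```

So 'ABCDE'[:10:2] is the same as 'ABCDE'[0:5:2], and 'ABCDE'[-3:] is the same as 'ABCDE'[2:5:1].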

A __getitem__ method that handles slices

from array import array
import reprlib
import math
import numbers

class Vector(object):
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __iter__(self):
        return iter(self._components)

    def __repr__(self):
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return 'Vector({})'.format(components)



    def __len__(self):
        return len(self._components)

    # A slice such as [1:4] returns a Vector object
    def __getitem__(self, index):
        cls = type(self)
        if isinstance(index, slice):
            return cls(self._components[index])
        elif isinstance(index, numbers.Integral):
            return self._components[index]
        else:
            msg = '{cls.__name__} indices must be integers'
            raise TypeError(msg.format(cls=cls))


v7 = Vector(range(7))
print(v7[-1])

print(v7[1:4])

print(v7[-1:])
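The else branch of __getitem__ is not exercised above. The following sketch uses a trimmed-down copy of the class (only __init__, __len__, and __getitem__, an abbreviation made for this example) to show all three branches, including the TypeError raised for a tuple index.

```python
from array import array
import numbers


class Vector:
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __len__(self):
        return len(self._components)

    def __getitem__(self, index):
        cls = type(self)
        if isinstance(index, slice):
            # A slice produces a new Vector
            return cls(self._components[index])
        elif isinstance(index, numbers.Integral):
            # An integer produces a single component
            return self._components[index]
        else:
            msg = '{cls.__name__} indices must be integers'
            raise TypeError(msg.format(cls=cls))


v7 = Vector(range(7))
print(v7[-1])
print(type(v7[1:4]).__name__, list(v7[1:4]))
try:
    v7[1, 2]
except TypeError as exc:
    print(exc)
```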

Vector take #3: dynamic attribute access

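We could write four properties in Vector, but that would be tedious.
The __getattr__ special method provides a better way.

The interpreter calls the __getattr__ method when attribute lookup fails.
Simply put, given the expression my_obj.x:

Python checks whether the my_obj instance has an attribute named x;
if not, it searches the class (my_obj.__class__);
if still not found, it continues up the inheritance graph.
If x is still not found, the __getattr__ method defined in the class of my_obj is called with self and the attribute name as a string (for example, 'x').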

from array import array
import reprlib
import math
import numbers


class Vector(object):
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __iter__(self):
        return iter(self._components)

    def __repr__(self):
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return 'Vector({})'.format(components)

    shortcut_names = 'xyzt'

    def __getattr__(self, name):
        cls = type(self)
        if len(name) == 1:
            pos = cls.shortcut_names.find(name)
            # The bounds check belongs inside the branch that defines pos
            if 0 <= pos < len(self._components):
                return self._components[pos]
        msg = '{.__name__!r} object has no attribute {!r}'
        raise AttributeError(msg.format(cls, name))

    def __setattr__(self, name, value):
        cls = type(self)
        if len(name) == 1:
            # If name is one of xyzt, set a specific error message.
            if name in cls.shortcut_names:
                error = 'readonly attribute {attr_name!r}'
            # If name is a lowercase letter, set one message for all single lowercase letters.
            elif name.islower():
                error = "can't set attributes 'a' to 'z' in {cls_name!r}"
            # Otherwise, set the error message to an empty string.
            else:
                error = ''
            # If there is an error message, raise AttributeError.
            if error:
                msg = error.format(cls_name=cls.__name__, attr_name=name)
                raise AttributeError(msg)

        # Default case: call __setattr__ on the superclass for standard behavior.
        super().__setattr__(name, value)


v = Vector(range(5))
print(v)

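# This assignment is rejected: single lowercase letters cannot be set
try:
    v.p = 10
except AttributeError as exc:
    print(exc)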
print(v.x)

print(v)

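The super() function provides a way to access methods of superclasses dynamically, a necessity in a dynamic language that supports multiple inheritance, as Python does. Programmers often use it to delegate part of a subclass method's work to the appropriate method in a superclass.

Note that we do not forbid setting all attributes, only single lowercase letters, to avoid confusion with the read-only attributes x, y, z, and t.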
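The sketch below exercises each branch of __setattr__ on a trimmed-down copy of the class (only the methods needed here are kept, which is an abbreviation made for this example). The attribute names label and X are arbitrary choices for the demonstration.

```python
from array import array


class Vector:
    typecode = 'd'
    shortcut_names = 'xyzt'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __getattr__(self, name):
        cls = type(self)
        if len(name) == 1:
            pos = cls.shortcut_names.find(name)
            if 0 <= pos < len(self._components):
                return self._components[pos]
        msg = '{.__name__!r} object has no attribute {!r}'
        raise AttributeError(msg.format(cls, name))

    def __setattr__(self, name, value):
        cls = type(self)
        if len(name) == 1:
            if name in cls.shortcut_names:
                error = 'readonly attribute {attr_name!r}'
            elif name.islower():
                error = "can't set attributes 'a' to 'z' in {cls_name!r}"
            else:
                error = ''
            if error:
                msg = error.format(cls_name=cls.__name__, attr_name=name)
                raise AttributeError(msg)
        super().__setattr__(name, value)


v = Vector(range(5))
messages = []
for name in ('x', 'p'):
    try:
        setattr(v, name, 10)
    except AttributeError as exc:
        messages.append(str(exc))
print(messages)

v.label = 'velocity'  # multi-letter names are still allowed
v.X = 99              # so are single uppercase letters
print(v.x, v.label, v.X)
```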

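Vector take #4: hashing and a faster ==

When the reduction is a plain sum, functools.reduce() can be replaced by sum().

How it works

The key idea is to reduce a series of values to a single value.
The first argument to reduce() is a two-argument function, and the second argument is an iterable. Suppose we have a two-argument function fn and a list lst.
When you call reduce(fn, lst), fn is applied to the first pair of elements, fn(lst[0], lst[1]), producing a first result, r1. Then fn is applied to r1 and the next element, fn(r1, lst[2]), producing a second result, r2.
Next, fn(r2, lst[3]) is called to produce r3, and so on until the last element, when a single result, rN, is returned.

For example: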

>>> import functools
>>> functools.reduce(lambda a,b: a*b, range(1, 6))
120
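The reduction steps described above can be written as a plain loop. my_reduce is a hypothetical re-implementation for illustration only; it assumes a non-empty iterable and takes no initializer, unlike the real functools.reduce.

```python
import functools


def my_reduce(fn, iterable):
    """Hypothetical sketch of reduce for a non-empty iterable."""
    it = iter(iterable)
    result = next(it)              # start with lst[0]
    for item in it:
        result = fn(result, item)  # r1 = fn(lst[0], lst[1]), r2 = fn(r1, lst[2]), ...
    return result


lst = [1, 2, 3, 4, 5]
print(my_reduce(lambda a, b: a * b, lst))
print(functools.reduce(lambda a, b: a * b, lst))
```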

More uses of reduce

import functools

aa = functools.reduce(lambda a, b: a ^ b, range(1,6))
print(aa)

# operator: functions that correspond to the operators
# https://blog.csdn.net/shengmingqijiquan/article/details/53005129
import operator
bb = functools.reduce(operator.xor, range(6))
print(bb)

To write the Vector.__hash__ method in the style the author prefers, we need to import the functools and operator modules. (An opinionated author.)

import functools  # ➊
import operator  # ➋


class Vector:
    typecode = 'd'

    # many lines omitted for layout reasons...
    def __eq__(self, other):  # ➌
        return tuple(self) == tuple(other)

    def __hash__(self):
        hashes = (hash(x) for x in self._components)  # ➍
        return functools.reduce(operator.xor, hashes, 0)  # ➎

    # many lines omitted for layout reasons...

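❹ Create a generator expression to lazily compute the hash of each component.
❺ Feed hashes to reduce with the xor function to compute the aggregate hash value; the third argument, 0, is the initializer, which reduce returns when the iterable is empty.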
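The role of the initializer can be checked directly. In this sketch, xor_hash is a hypothetical stand-alone function that mirrors the body of __hash__ but takes the components as an argument.

```python
import functools
import operator


def xor_hash(components):
    """Hypothetical stand-alone version of the __hash__ body."""
    hashes = (hash(x) for x in components)
    return functools.reduce(operator.xor, hashes, 0)


print(xor_hash([]))          # the initializer is returned for an empty input
print(xor_hash([3.0, 4.0]))  # hash(3.0) ^ hash(4.0)

# Without the initializer, an empty iterable raises TypeError
try:
    functools.reduce(operator.xor, [])
except TypeError as exc:
    print('TypeError:', exc)
```

One consequence of XOR worth keeping in mind: it ignores order, so [3.0, 4.0] and [4.0, 3.0] produce the same value here.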

A more efficient __eq__ method

def __eq__(self, other):
    if len(self) != len(other):  # ➊
        return False
    for a, b in zip(self, other):  # ➋
        if a != b:  # ➌
            return False
    return True  # ➍

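❷ zip lazily produces tuples made from the items of each iterable argument. If zip is new to you, see "The Awesome zip" sidebar in the book. The earlier length comparison is necessary because zip stops producing values, without warning, as soon as one of its inputs is exhausted.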
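To see why the length test matters, compare two sequences where one is a prefix of the other. Both functions below are hypothetical helpers written for this sketch; the first one is deliberately flawed.

```python
def eq_without_length_check(a, b):
    """Deliberately flawed comparison, for illustration only."""
    return all(x == y for x, y in zip(a, b))


def eq_with_length_check(a, b):
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


shorter = [1.0, 2.0]
longer = [1.0, 2.0, 3.0]

# zip stops after two pairs, so the extra 3.0 is never compared
print(eq_without_length_check(shorter, longer))
print(eq_with_length_check(shorter, longer))
```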

Implementing Vector.__eq__ with zip and all

def __eq__(self, other):
    return len(self) == len(other) and all(a == b for a, b in zip(self, other))

Examples of the zip built-in

>>> zip(range(3), 'ABC') # ➊
<zip object at 0x10063ae48>
>>> list(zip(range(3), 'ABC')) # ➋
[(0, 'A'), (1, 'B'), (2, 'C')]
>>> list(zip(range(3), 'ABC', [0.0, 1.1, 2.2, 3.3])) # ➌
[(0, 'A', 0.0), (1, 'B', 1.1), (2, 'C', 2.2)]
>>> from itertools import zip_longest # ➍
>>> list(zip_longest(range(3), 'ABC', [0.0, 1.1, 2.2, 3.3], fillvalue=-1))
[(0, 'A', 0.0), (1, 'B', 1.1), (2, 'C', 2.2), (-1, -1, 3.3)]

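❸ zip has a surprising trait: it stops without warning when one of the iterables is exhausted.
❹ itertools.zip_longest behaves differently: it uses an optional fillvalue (None by default) to fill in missing values, so it can keep producing tuples until the longest iterable is exhausted.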