An Introduction to Python 3.7 dataclasses

Date: 2019-12-14


Posted on June 28, 2018 by laixintao

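Python 3.7 adds a new module: dataclasses. It can be loosely understood as "mutable namedtuples with defaults". There is nothing magical about it: you define an ordinary class, and the @dataclass decorator generates methods such as __repr__ and __init__ for you, so you do not have to write them by hand. The decorator still returns a class, which means it brings no inconvenience: you can still use inheritance, metaclasses, docstrings, your own methods, and so on.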

Let's start with an example given in the PEP (PEP 557). Take the following code (Python 3.7):

    from dataclasses import dataclass

    @dataclass
    class InventoryItem:
        '''Class for keeping track of an item in inventory.'''
        name: str
        unit_price: float
        quantity_on_hand: int = 0

        def total_cost(self) -> float:
            return self.unit_price * self.quantity_on_hand

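@dataclass generates methods equivalent to the following. Note that the ordering methods (__lt__, __le__, __gt__, __ge__) are only generated when order=True is passed, and that __ne__ is not written out explicitly, because Python derives it from __eq__: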

    def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0) -> None:
        self.name = name
        self.unit_price = unit_price
        self.quantity_on_hand = quantity_on_hand

    def __repr__(self):
        return f'InventoryItem(name={self.name!r}, unit_price={self.unit_price!r}, quantity_on_hand={self.quantity_on_hand!r})'

    def __eq__(self, other):
        if other.__class__ is self.__class__:
            return (self.name, self.unit_price, self.quantity_on_hand) == (other.name, other.unit_price, other.quantity_on_hand)
        return NotImplemented

    def __ne__(self, other):
        if other.__class__ is self.__class__:
            return (self.name, self.unit_price, self.quantity_on_hand) != (other.name, other.unit_price, other.quantity_on_hand)
        return NotImplemented

    def __lt__(self, other):
        if other.__class__ is self.__class__:
            return (self.name, self.unit_price, self.quantity_on_hand) < (other.name, other.unit_price, other.quantity_on_hand)
        return NotImplemented

    def __le__(self, other):
        if other.__class__ is self.__class__:
            return (self.name, self.unit_price, self.quantity_on_hand) <= (other.name, other.unit_price, other.quantity_on_hand)
        return NotImplemented

    def __gt__(self, other):
        if other.__class__ is self.__class__:
            return (self.name, self.unit_price, self.quantity_on_hand) > (other.name, other.unit_price, other.quantity_on_hand)
        return NotImplemented

    def __ge__(self, other):
        if other.__class__ is self.__class__:
            return (self.name, self.unit_price, self.quantity_on_hand) >= (other.name, other.unit_price, other.quantity_on_hand)
        return NotImplemented
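To see the generated methods in action, here is a minimal sketch that instantiates the InventoryItem class from the PEP example. The item values ("widget", 2.5, 4) are made up for illustration, and the flag name ordering_supported is my own. It also checks that, with the default settings, the ordering operators are not available.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


# Illustrative values, not taken from the PEP.
a = InventoryItem("widget", 2.5, 4)
b = InventoryItem("widget", 2.5, 4)

print(a)               # uses the generated __repr__
print(a == b)          # generated __eq__ compares the fields as tuples
print(a.total_cost())  # hand-written methods keep working

# With the default order=False no __lt__ is generated, so "<" raises TypeError.
try:
    a < b
    ordering_supported = True
except TypeError:
    ordering_supported = False
print(ordering_supported)
```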

The rationale behind dataclass

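Python wanted a simple way to define a container whose contents are accessed through object attributes. There have already been many attempts in this area:

collections.namedtuple in the standard library
typing.NamedTuple in the standard library
The well-known attrs library (imported as attr)
Various snippets, questions and answers, and so on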

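So why is dataclass still needed? The main benefits are:

No base class or metaclass is used, so the inheritance hierarchy of your code is not affected. The decorated class is still an ordinary class.
Fields are declared with class-level type annotations, so type checking is supported with native syntax and without intruding on the code, unlike libraries such as attr, which are more intrusive (you have to wrap things with functions from attr).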

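dataclass is not meant to replace these libraries. As part of the standard library, it simply offers a more convenient way to define a data class. The libraries above have different features and still have their place.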
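The difference between these containers is easiest to see side by side. The sketch below compares collections.namedtuple with a dataclass. The names PointNT and PointDC are invented for this comparison and do not come from the article.

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical names, chosen only for this comparison.
PointNT = namedtuple("PointNT", ["x", "y"])


@dataclass
class PointDC:
    x: int
    y: int


nt = PointNT(1, 2)
dc = PointDC(1, 2)

# A namedtuple really is a tuple, so it compares equal to a plain tuple.
nt_equals_tuple = (nt == (1, 2))
# A dataclass instance only compares equal to instances of its own class.
dc_equals_tuple = (dc == (1, 2))

# namedtuple fields are read-only.
try:
    nt.x = 10
    nt_mutable = True
except AttributeError:
    nt_mutable = False

# dataclass fields are ordinary attributes and can be reassigned.
dc.x = 10
```

Which behaviour is preferable depends on the use case, which is one reason the existing options remain useful.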

Basic usage

The signature of the dataclass decorator in the dataclasses module is as follows:

    def dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)

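As the names suggest, these keyword parameters control which magic methods are generated. As the example at the beginning of this article shows, the decorator can also be applied without parentheses.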
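As a small sketch of how these parameters change the generated methods, the example below turns ordering on with order=True and turns equality off with eq=False. The classes Version and Handle are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Version:
    # With order=True, instances compare like the tuple (major, minor).
    major: int
    minor: int = 0


@dataclass(eq=False)
class Handle:
    # With eq=False no __eq__ is generated, so "==" falls back to identity.
    fd: int


versions = sorted([Version(3, 7), Version(2, 7), Version(3)])
same_value_handles_equal = (Handle(1) == Handle(1))

print(versions)
print(same_value_handles_equal)
```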

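With field you can customize each field further: its default value, whether it takes part in repr, whether it takes part in hash, and so on. In the following example from the documentation, mylist is not supplied, so default_factory is called to build it. See the documentation for everything else field can do.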

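    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class C:
        # default_factory is called to build a new list for each instance
        mylist: List[int] = field(default_factory=list)

    c = C()
    c.mylist += [1, 2, 3]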
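The other field options mentioned above can be sketched as follows. The User class and its values are hypothetical. The options repr, compare, default and default_factory are real parameters of dataclasses.field.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    name: str
    password: str = field(repr=False)                    # left out of __repr__
    login_count: int = field(default=0, compare=False)   # ignored by __eq__
    tags: List[str] = field(default_factory=list)        # new list per instance


u1 = User("alice", "s3cret")
u2 = User("alice", "s3cret", login_count=5)

# login_count is excluded from comparison, so the two users are equal here.
equal_despite_login_count = (u1 == u2)

# Each instance received its own list from default_factory.
u1.tags.append("admin")

print(u1)  # the password does not appear in the output
print(equal_despite_login_count, u1.tags, u2.tags)
```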

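In addition, the dataclasses module provides several useful functions that convert a dataclass instance into a tuple, a dict, and so on. I have written methods like these by hand many times myself...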

    from dataclasses import dataclass, asdict
    from typing import List

    @dataclass
    class Point:
        x: int
        y: int

    @dataclass
    class C:
        mylist: List[Point]

    p = Point(10, 20)
    assert asdict(p) == {'x': 10, 'y': 20}

    # asdict recurses into nested dataclasses and lists
    c = C([Point(0, 0), Point(10, 4)])
    assert asdict(c) == {'mylist': [{'x': 0, 'y': 0}, {'x': 10, 'y': 4}]}
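Besides asdict, the module also offers astuple, fields and replace. The following sketch reuses the Point class to show them. The variable names are my own.

```python
from dataclasses import dataclass, astuple, fields, replace


@dataclass
class Point:
    x: int
    y: int


p = Point(10, 20)

as_tuple = astuple(p)                       # convert to a plain tuple
field_names = [f.name for f in fields(p)]   # introspect the declared fields
moved = replace(p, y=0)                     # copy with some fields changed

print(as_tuple, field_names, moved, p)
```

replace returns a new instance and leaves the original untouched, which is handy together with frozen dataclasses.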

Hooking __init__

The generated __init__ can be hooked. It is simple: if the class defines __post_init__, the generated __init__ calls it after setting the fields.

    from dataclasses import dataclass, field

    @dataclass
    class C:
        a: float
        b: float
        c: float = field(init=False)  # not a parameter of __init__

        def __post_init__(self):
            self.c = self.a + self.b
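A common use of __post_init__ is validation combined with derived fields. The sketch below is one way to do that. The Rectangle class and its error message are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # computed, not passed to __init__

    def __post_init__(self):
        # Runs at the end of the generated __init__.
        if self.width < 0 or self.height < 0:
            raise ValueError("width and height must be non-negative")
        self.area = self.width * self.height


r = Rectangle(3.0, 4.0)

try:
    Rectangle(-1.0, 2.0)
    rejected = False
except ValueError:
    rejected = True

print(r, rejected)
```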

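If you want to pass a value to __post_init__ without storing it as a field on the instance, use the special type InitVar. An InitVar pseudo-field becomes a parameter of the generated __init__ and is forwarded to __post_init__, but it is not kept as a field.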

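    from dataclasses import dataclass, InitVar

    # DatabaseType and my_database are placeholders from the documentation
    # example; they are not defined in this snippet.
    @dataclass
    class C:
        i: int
        j: int = None
        database: InitVar[DatabaseType] = None

        def __post_init__(self, database):
            if self.j is None and database is not None:
                self.j = database.lookup('j')

    c = C(10, database=my_database)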
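The documentation example leaves DatabaseType and my_database undefined. To make it runnable, the sketch below substitutes a hypothetical FakeDatabase class (my own invention, backed by a plain dict) and checks that the InitVar value does not end up among the fields.

```python
from dataclasses import dataclass, fields, InitVar
from typing import Optional


class FakeDatabase:
    """Hypothetical stand-in for DatabaseType; wraps a plain dict."""

    def __init__(self, data):
        self._data = data

    def lookup(self, key):
        return self._data[key]


@dataclass
class C:
    i: int
    j: Optional[int] = None
    database: InitVar[Optional[FakeDatabase]] = None

    def __post_init__(self, database):
        # database arrives here as an argument; it is not stored as a field.
        if self.j is None and database is not None:
            self.j = database.lookup('j')


my_database = FakeDatabase({'j': 42})
c = C(10, database=my_database)

field_names = [f.name for f in fields(c)]
print(c, field_names)
```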

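Immutability

Python has nothing like const; in theory everything can be modified. If you really need an immutable object, there is a fairly well-known implementation that takes fewer than 10 lines of code.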

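With dataclass, however, you can simply use @dataclass(frozen=True). The decorator then adds __setattr__ and __delattr__ to the class, and both raise FrozenInstanceError when an attribute is assigned or deleted. The downside is a small performance cost, because __init__ has to set the fields through object.__setattr__.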
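A minimal sketch of a frozen dataclass follows. The Point class and the dictionary value are illustrative. FrozenInstanceError is the exception exported by the dataclasses module.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(1, 2)

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    p.x = 10
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True

# With eq=True (the default) and frozen=True a __hash__ is generated too,
# so instances can be used as dictionary keys.
labels = {p: "start"}

print(assignment_blocked, labels[Point(1, 2)])
```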

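Inheritance

For a dataclass with base classes, the decorator walks the base classes in reverse MRO order (starting from object) and, for each base class that is itself a dataclass, adds the fields found there to an ordered mapping. Once all base classes have been processed, the class's own fields are added, and all magic methods are generated from this mapping. The parameters of the generated methods therefore follow the order in which the fields were found, with the earliest first. Because base classes are processed first, a field in a subclass overrides a base-class field of the same name while keeping that field's original position. Take this example from the documentation: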

    from dataclasses import dataclass
    from typing import Any

    @dataclass
    class Base:
        x: Any = 15.0
        y: int = 0

    @dataclass
    class C(Base):
        z: int = 10
        x: int = 15

The generated __init__ will then be:

    def __init__(self, x: int = 15, y: int = 0, z: int = 10):

Note that x and y keep their order from Base, but x in C has type int, which overrides the Any declared in Base.
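The field order and the override can be checked at runtime. The sketch below uses dataclasses.fields and inspect.signature on the classes from the documentation example. The variable names are my own.

```python
import inspect
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class Base:
    x: Any = 15.0
    y: int = 0


@dataclass
class C(Base):
    z: int = 10
    x: int = 15


# Fields appear in the order they were first found: x and y from Base, then z.
field_names = [f.name for f in fields(C)]

# The x field carries the type and default declared in C, not those of Base.
x_field = fields(C)[0]

# The parameters of the generated __init__ follow the same order.
param_names = list(inspect.signature(C).parameters)

print(field_names, x_field.type, x_field.default, param_names)
```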

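The mutable default pitfall

The "Basic usage" section above used default_factory. Why not simply use [] as the default?

Experienced Python programmers know this pitfall: if a mutable object such as a list is used as a function's default argument, that object is created once, when the function is defined, and is shared by every call, which leads to surprising bugs. For details, see Python Common Gotchas.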
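For readers who have not run into this pitfall, here is a short sketch of it with plain functions. The function names append_shared and append_fresh are made up for this illustration.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # The usual workaround: create a new list inside the function.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)   # same list object as first

fresh_one = append_fresh(1)
fresh_two = append_fresh(2)  # independent list

print(first is second, second, fresh_one, fresh_two)
```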

Consider the following code:

    @dataclass
    class D:
        x: List = []

        def add(self, element):
            self.x.append(element)

Without any safeguard, this would generate code equivalent to:

    class D:
        x = []

        def __init__(self, x=x):
            self.x = x

        def add(self, element):
            self.x.append(element)

    assert D().x is D().x

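No matter how many instances are created, x would be shared among all of them. It is hard for dataclass to prevent this in general, so the design chosen is this: if the default value is a list, dict or set, a ValueError is raised when the class is defined. It is not perfect, but it catches a large share of the cases.

If a field needs a list as its default, use default_factory as described above.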
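The check and the fix can be sketched together. In the example below the class Broken is hypothetical and exists only to trigger the error. D is the class from above, rewritten with default_factory.

```python
from dataclasses import dataclass, field
from typing import List

# Defining a dataclass with a list default is rejected at class creation.
try:
    @dataclass
    class Broken:
        x: List[int] = []
    definition_error = None
except ValueError as exc:
    definition_error = exc


@dataclass
class D:
    # default_factory builds a separate list for every instance.
    x: List[int] = field(default_factory=list)

    def add(self, element):
        self.x.append(element)


d1, d2 = D(), D()
d1.add(1)

print(type(definition_error).__name__, d1.x, d2.x)
```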
