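I found an explanation of this topic in the Python Data Science Handbook. As the book describes, transform is an operation used together with groupby. I suspect most pandas users have already used aggregate, filter, or apply with groupby. transform, however, is a little harder to understand.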
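Aggregation returns a reduced version of the data, whereas transformation returns a transformed version of the full data that we can recombine with the original. The output of such a transformation has the same shape as the input. A common example is centering the data by subtracting the group-wise mean.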
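To make the shape difference concrete, here is a minimal sketch of the centering example. The toy DataFrame and the column names `key` and `value` are assumptions made up for illustration; they are not taken from the book or from the sales data used in this article.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration.
df = pd.DataFrame({
    "key": ["a", "b", "a", "b"],
    "value": [1.0, 2.0, 3.0, 6.0],
})

grouped = df.groupby("key")["value"]

# Aggregation: one result per group (2 rows here).
group_means = grouped.mean()

# Transformation: one result per original row (4 rows here),
# aligned with df's index so it can be used directly in arithmetic.
row_means = grouped.transform("mean")

# Center each value on the mean of its own group.
df["centered"] = df["value"] - row_means
```

Because the transformed result keeps the original index, the subtraction works row by row without any merge step.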
data_str='''account,name,order,sku,quantity,unit price,ext price
383080,Will LLC,10001,B1-20000,7,33.69,235.83
383080,Will LLC,10001,S1-27722,11,21.12,232.32
383080,Will LLC,10001,B1-86481,3,35.99,107.97
412290,Jerde-Hilpert,10005,S1-06532,48,55.82,2679.36
412290,Jerde-Hilpert,10005,S1-82801,21,13.62,286.02
412290,Jerde-Hilpert,10005,S1-06532,9,92.55,832.95
412290,Jerde-Hilpert,10005,S1-47412,44,78.91,3472.04
412290,Jerde-Hilpert,10005,S1-27722,36,25.42,915.12
218895,Kulas Inc,10006,S1-27722,32,95.66,3061.12
218895,Kulas Inc,10006,B1-33087,23,22.55,518.65
218895,Kulas Inc,10006,B1-33364,3,72.3,216.9
218895,Kulas Inc,10006,B1-20000,-1,72.18,-72.18'''

import io
import pandas as pd

# Load the sample sales data from the inline CSV string.
data = pd.read_csv(io.StringIO(data_str))

# Approach 1: aggregate to one total per order, then merge it back onto each row.
order_total = data.groupby('order')['ext price'].sum().rename('order total').reset_index()
data_merge = data.merge(order_total)  # joins on the shared 'order' column
# Each line item's share of its order total.
data_merge['Percent Order'] = data_merge['ext price'] / data_merge['order total']
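# Approach 2: transform returns the order total already aligned to every row, so no merge is needed.
order_total = data.groupby('order')['ext price'].transform('sum')
data['percent order'] = data['ext price'] / order_total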
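As a quick sanity check, the sketch below runs both approaches on a small made-up table and compares the results. The `orders` DataFrame and its values are hypothetical and chosen only so the arithmetic is easy to follow; the column names mirror the ones used above.

```python
import pandas as pd

# Hypothetical miniature order table, invented for this check.
orders = pd.DataFrame({
    "order": [1, 1, 2, 2, 2],
    "ext price": [10.0, 30.0, 5.0, 5.0, 10.0],
})

# Merge route: aggregate, rename, merge back, then divide.
totals = (
    orders.groupby("order")["ext price"]
    .sum()
    .rename("order total")
    .reset_index()
)
merged = orders.merge(totals)
pct_merge = merged["ext price"] / merged["order total"]

# Transform route: divide by the per-row order total directly.
pct_transform = orders["ext price"] / orders.groupby("order")["ext price"].transform("sum")

# Both routes should give the same share for every row.
same = pct_merge.tolist() == pct_transform.tolist()

# Within each order, the shares should add up to 1.
shares_per_order = pct_transform.groupby(orders["order"]).sum()
```

The transform route reaches the same numbers in one step and leaves the original frame's index and row order untouched.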