3-3 groupby operations

The data used in the Pandas chapters can be downloaded from the following link: https://files.cnblogs.com/files/AI-robort/Titanic_Data-master.zip

In [1]:
import pandas as pd
df=pd.DataFrame({'key':['A','B','C','A','B','C','A','B','C'],
                 'data':[0,5,10,5,10,15,10,15,20]})
df
Out[1]:
 
  key  data
0   A     0
1   B     5
2   C    10
3   A     5
4   B    10
5   C    15
6   A    10
7   B    15
8   C    20
In [3]:
for key in ['A', 'B', 'C']:
    # Sum the rows for each key value; sum() also concatenates the string column 'key'
    print(key, df[df['key'] == key].sum())
 
A key     AAA
data     15
dtype: object
B key     BBB
data     30
dtype: object
C key     CCC
data     45
dtype: object
In [4]:
df.groupby('key').sum()  # same per-key sums as the loop above, in a single call
Out[4]:
 
     data
key
A      15
B      30
C      45
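
The loop in In [3] and the groupby call in In [4] compute the same per-key totals. A minimal sketch of that equivalence, reusing the same df; the names `manual` and `grouped` are introduced here for illustration only. The manual version sums just the data column, so it avoids the concatenated AAA/BBB/CCC strings seen in the loop output.

```python
import pandas as pd

df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'data': [0, 5, 10, 5, 10, 15, 10, 15, 20]})

# Manual split-apply-combine: filter the rows of each key, then sum the numeric column.
manual = {key: int(df.loc[df['key'] == key, 'data'].sum()) for key in ['A', 'B', 'C']}

# The same computation expressed with groupby.
grouped = df.groupby('key')['data'].sum()

print(manual)
print(grouped.to_dict())
```

Selecting the column with `['data']` before calling `sum()` returns a Series indexed by key instead of a one-column DataFrame.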
In [7]:
import numpy as np
df.groupby('key').aggregate(np.mean)  # aggregate applies a reduction to each group, e.g. NumPy's sum or mean
Out[7]:
 
     data
key
A       5
B      10
C      15
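
`aggregate` (also available under the shorter alias `agg`) accepts function names as strings and a list of several reductions at once. The sketch below assumes the same df as above; the name `summary` is hypothetical. As far as I know, recent pandas releases emit a FutureWarning when a NumPy callable such as `np.mean` is passed, so the string form may be the safer choice.

```python
import pandas as pd

df = pd.DataFrame({'key': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'data': [0, 5, 10, 5, 10, 15, 10, 15, 20]})

# Apply several reductions to each group in one call; each name becomes a column.
summary = df.groupby('key')['data'].agg(['sum', 'mean', 'min', 'max'])

print(summary)
```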
In [8]:
df1=pd.read_csv('./Titanic_Data-master/Titanic_Data-master/train.csv')
In [13]:
df1.groupby('Sex')['Age'].mean()  # mean age for each sex
Out[13]:
Sex
female    27.915709
male      30.726645
Name: Age, dtype: float64
In [14]:
df1.groupby('Sex')['Survived'].mean()  # mean of the 0/1 Survived column for each sex, i.e. the survival rate
Out[14]:
Sex
female    0.742038
male      0.188908
Name: Survived, dtype: float64