Python pandas (2): DataFrame structure and everyday operations

Following on from the previous article, this one introduces the DataFrame structure in the pandas module.

1. Introduction

DataFrame unifies two or more Series into a single data structure. Each Series then represents a named column of the DataFrame, and instead of each column having its own index, the DataFrame provides a single index and the data in all columns is aligned to the master index of the DataFrame.
In other words, a DataFrame is a table-like structure made up of several Series, and each Series is one column of the DataFrame (please point out any mistakes in my understanding).
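
To make the "single master index" idea concrete, here is a minimal sketch. The column names and index labels (price, stock, apple, and so on) are made up for illustration and are not from the original article.

```python
import pandas as pd

# Two Series with different (hypothetical) index labels, in different orders
price = pd.Series([1.5, 2.0, 3.25], index=["apple", "banana", "cherry"])
stock = pd.Series([10, 20], index=["banana", "apple"])

# Each Series becomes a named column; rows are aligned on the shared index
df = pd.DataFrame({"price": price, "stock": stock})
print(df)

# The value is looked up by label, not by position
print(df.loc["apple", "stock"])
```

Note that "cherry" has no entry in stock, so that cell is filled with NaN rather than shifting the other values.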

2. Common operations

a. Create

pd.DataFrame()
Parameters:
1. a 2-D array;
2. a list of Series;
3. a dict whose values are Series;

a.1 2-D array

import pandas as pd
import numpy as np

# Each array becomes one row of the DataFrame
s1 = np.array([1, 2, 3, 4])
s2 = np.array([5, 6, 7, 8])
df = pd.DataFrame([s1, s2])
print(df)

a.2 List of Series (same result as the 2-D array)

import pandas as pd
import numpy as np

# Each Series becomes one row of the DataFrame
s1 = pd.Series(np.array([1, 2, 3, 4]))
s2 = pd.Series(np.array([5, 6, 7, 8]))
df = pd.DataFrame([s1, s2])
print(df)

a.3 Dict whose values are Series

import pandas as pd
import numpy as np

# Each dict key becomes a column name; each Series becomes that column
s1 = pd.Series(np.array([1, 2, 3, 4]))
s2 = pd.Series(np.array([5, 6, 7, 8]))
df = pd.DataFrame({"a": s1, "b": s2})
print(df)

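Note: if the Series passed in have different lengths, any value that does not exist for a given index label is filled with NaN. (This relies on index alignment; a dict of plain NumPy arrays of different lengths raises an error instead.)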
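
A minimal sketch of the NaN filling described in the note, using two Series of different lengths. The values are arbitrary.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4])
s2 = pd.Series([5, 6])  # shorter: has no values for index 2 and 3

df = pd.DataFrame({"a": s1, "b": s2})
print(df)

# Count the cells in column b that were filled with NaN
print(df["b"].isna().sum())
```

Because NaN is a floating-point value, column b ends up stored as float64 even though s2 holds integers.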

b. Attributes

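b.1 .columns: the label (key) of each column

b.2 .shape: the shape as (a, b), where a is the length of the index and b is the number of columns

b.3 .index; .values: return the row index and the values as a 2-D array, respectively

b.4 .head(); .tail(): return the first and last rows (5 by default)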
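
The attributes and methods listed above can be tried on a small DataFrame. This is only an illustrative sketch with made-up data.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]})

print(df.columns)   # column labels
print(df.shape)     # (number of rows, number of columns)
print(df.index)     # row labels
print(df.values)    # data as a 2-D NumPy array
print(df.head(2))   # first 2 rows
print(df.tail(2))   # last 2 rows
```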

c. if-then operations

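c.1 Using .loc[] (the original article used .ix[], which was deprecated in pandas 0.20 and removed in 1.0)

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [1, 1, 1, 1]})
# In rows where A > 1, set column B to -1
df.loc[df.A > 1, 'B'] = -1
print(df)

df.loc[condition, region the "then" operation applies to]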
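
The same condition/region pattern also works when the region is several columns at once. A small sketch, written with .loc so that it runs on current pandas releases:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [1, 1, 1, 1]})

# In rows where A > 2, set both B and C to 0
df.loc[df["A"] > 2, ["B", "C"]] = 0
print(df)
```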

c.2 Using numpy.where

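import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [1, 1, 1, 1]})
# New column "then": 1 where A < 3, otherwise 0
df["then"] = np.where(df.A < 3, 1, 0)
print(df)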

np.where(condition, then, else)
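
The "then" and "else" arguments do not have to be constants; they can be whole columns as well. A minimal sketch (the column name "picked" is made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [1, 1, 1, 1]})

# Where A is even take the value from B, otherwise the value from C
df["picked"] = np.where(df["A"] % 2 == 0, df["B"], df["C"])
print(df)
```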

d. Selecting part of a DataFrame by condition

d.1 Direct indexing with df[]

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [1, 1, 1, 1]})
# Keep only the rows where A >= 2
df = df[df.A >= 2]
print(df)

d.2 Using .loc[]

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [1, 1, 1, 1]})
# Keep only the rows where A > 2
df = df.loc[df.A > 2]
print(df)

(There are many other ways to do this; I won't list them all here.)
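
For reference, a few of those other ways are sketched here. The choice of examples is mine, not the original author's: combined boolean conditions, isin, and query.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [1, 1, 1, 1]})

# Combine conditions with & (and) or | (or); each condition needs parentheses
both = df[(df["A"] >= 2) & (df["B"] < 8)]

# Keep rows whose A value appears in a given list
listed = df[df["A"].isin([1, 4])]

# Express the same condition as a string
queried = df.query("A >= 2 and B < 8")

print(both)
print(listed)
print(queried)
```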

e. Grouping

e.1 Forming groups with groupby

import pandas as pd

df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 5 + [True] * 2})
# For each animal, list the size of the heaviest individual
group = df.groupby("animal").apply(lambda subf: subf['size'][subf['weight'].idxmax()])
print(group)
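
The same question can also be answered without apply, by first finding the row label of the heaviest individual in each group and then looking those rows up. This is an alternative sketch, not the Cookbook's version.

```python
import pandas as pd

df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12]})

# Row label of the heaviest individual in each group
heaviest = df.groupby("animal")["weight"].idxmax()

# Look those rows up and keep the size, indexed by animal
result = df.loc[heaviest, ["animal", "size"]].set_index("animal")["size"]
print(result)
```

When several rows tie for the maximum weight (as the two 12 kg cats do here), idxmax returns the first of them.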

e.2 Using get_group to extract a single group

import pandas as pd

df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12],
                   'adult': [False] * 5 + [True] * 2})

# Group by animal, then pull out only the rows of the "cat" group
group = df.groupby("animal")
cat = group.get_group("cat")
print(cat)
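
Besides get_group, a grouped object can be iterated over or aggregated directly. A short sketch on the same data (the mean-weight aggregation is my own example):

```python
import pandas as pd

df = pd.DataFrame({'animal': 'cat dog cat fish dog cat cat'.split(),
                   'size': list('SSMMMLL'),
                   'weight': [8, 10, 11, 1, 20, 12, 12]})
group = df.groupby("animal")

# Iterating yields (group key, sub-DataFrame) pairs
for name, sub in group:
    print(name, len(sub))

# Aggregate one column per group
mean_weight = group["weight"].mean()
print(mean_weight)
```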

For other specific operations, see the pandas Cookbook:

http://pandas.pydata.org/pandas-docs/stable/cookbook.html
