An Analysis of the pandas.DataFrame Object

Date: 2019-11-18


Analysis of the pandas.DataFrame object type

import pandas as pd
df = pd.DataFrame([[1, "2", 3, 4], [5, "6", 7, 8]], columns=["a", "b", "c", "d"])

Method analysis

1. The add() method: works like the addition operator (the elements being added must hold data of the same type)

 |  add(self, other, axis='columns', level=None, fill_value=None)
 |      Addition of dataframe and other, element-wise (binary operator `add`).
 |      
 |      Equivalent to ``dataframe + other``, but with support to substitute a fill_value for
 |      missing data in one of the inputs.
 |      
 |      Parameters
 |      ----------
 |      other : Series, DataFrame, or constant
 |      axis : {0, 1, 'index', 'columns'}
 |          For Series input, axis to match Series index on
 |      level : int or name
 |          Broadcast across a level, matching Index values on the
 |          passed MultiIndex level
 |      fill_value : None or float value, default None
 |          Fill existing missing (NaN) values, and any new element needed for
 |          successful DataFrame alignment, with this value before computation.
 |          If data in both corresponding DataFrame locations is missing
 |          the result will be missing

The pandas.DataFrame.add method

Example:

Output:
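The example and output for add() are not shown above, so here is a minimal sketch of the behaviour the docstring describes. It reuses the `df` defined at the top of this article; the frames `left` and `right` are hypothetical and exist only to illustrate `fill_value`.

```python
import pandas as pd

df = pd.DataFrame([[1, "2", 3, 4], [5, "6", 7, 8]], columns=["a", "b", "c", "d"])

# Adding the frame to itself: int columns are summed,
# while the string column "b" is concatenated ("2" + "2" -> "22").
doubled = df.add(df)
print(doubled)

# fill_value replaces a value that is missing on one side only,
# so the NaN in `left` is treated as 0 before the addition.
left = pd.DataFrame({"a": [1.0, None]})
right = pd.DataFrame({"a": [10.0, 20.0]})
filled = left.add(right, fill_value=0)
print(filled)
```

Without `fill_value`, the second row of `left.add(right)` would stay NaN, which is the only difference from writing `left + right`.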

2. The aggregate() method: can be abbreviated as agg()

aggregate(self, func, axis=0, *args, **kwargs)
 |      Aggregate using one or more operations over the specified axis.
 |      
 |      .. versionadded:: 0.20.0
 |      
 |      Parameters
 |      ----------
 |      func : function, string, dictionary, or list of string/functions
 |          Function to use for aggregating the data. If a function, must either
 |          work when passed a DataFrame or when passed to DataFrame.apply. For
 |          a DataFrame, can pass a dict, if the keys are DataFrame column names.
 |      
 |          Accepted combinations are:
 |      
 |          - string function name.
 |          - function.
 |          - list of functions.
 |          - dict of column names -> functions (or list of functions).

The pandas.DataFrame.aggregate method

Example:

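import pandas as pd

# A Series mixing int and str values, so its dtype is object
ds = pd.Series([11, "2", 13, 14])
print(ds, "\n")

# Column "b" holds strings; the other columns hold ints
df = pd.DataFrame([[1, "2", 3, 4], [5, "6", 7, 8]], columns=["a", "b", "c", "d"])
print(df, "\n")

# Apply several aggregations to every column
print(df.agg(['sum', 'min']))
# Apply several aggregations to column "a" only
print(df.agg({"a": ['sum', 'min']}))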

Output:

0    11
1     2
2    13
3    14
dtype: object 

   a  b  c  d
0  1  2  3  4
1  5  6  7  8 

     a   b   c   d
sum  6  26  10  12
min  1   2   3   4
     a
sum  6
min  1


Commonly used aggregation functions (`mean`, `median`, `prod`, `sum`, `std`, `var`):

mad(self, axis=None, skipna=None, level=None)
    Return the mean absolute deviation of the values for the requested axis
max(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    This method returns the maximum of the values in the object. If you want the *index* of the maximum, use ``idxmax``. This is the equivalent of the ``numpy.ndarray`` method ``argmax``.
mean(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    Return the mean of the values for the requested axis
median(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    Return the median of the values for the requested axis
min(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    This method returns the minimum of the values in the object.
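A minimal sketch of the reductions listed above, using a hypothetical `scores` frame invented for illustration. It leaves out `mad`, which was removed in pandas 2.0 and would fail on current versions.

```python
import pandas as pd

# Hypothetical numeric frame used only for illustration.
scores = pd.DataFrame(
    {"math": [90, 70, 80], "art": [60, 85, 75]},
    index=["ann", "bob", "cat"],
)

col_max = scores.max()           # maximum of each column (default axis=0)
row_mean = scores.mean(axis=1)   # mean of each row
col_median = scores.median()     # median of each column
top_label = scores.idxmax()      # index label where each column peaks

print(col_max, row_mean, col_median, top_label, sep="\n\n")
```

`max` returns the value itself, while `idxmax` returns the row label holding that value, which is the distinction the `max` docstring points out.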
    
memory_usage(self, index=True, deep=False)
    Return the memory usage of each column in bytes.
merge(self, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
    Merge DataFrame objects by performing a database-style join operation by columns or indexes.
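As a rough illustration of `merge` and `memory_usage`, the sketch below joins two hypothetical tables (`users` and `orders`, invented for this example) on a shared key column.

```python
import pandas as pd

# Hypothetical tables sharing the key column "user_id".
users = pd.DataFrame({"user_id": [1, 2, 3], "name": ["ann", "bob", "cat"]})
orders = pd.DataFrame({"user_id": [1, 1, 3, 4], "amount": [10, 20, 30, 40]})

# Inner join: keep only keys present in both frames.
inner = users.merge(orders, on="user_id", how="inner")

# Outer join: keep every key; indicator=True adds a "_merge" column
# recording whether each row came from the left, the right, or both.
outer = users.merge(orders, on="user_id", how="outer", indicator=True)

# Bytes used per column; deep=True also measures the string contents.
usage = users.memory_usage(index=True, deep=True)

print(inner, outer, usage, sep="\n\n")
```

The inner join drops user 2 (no orders) and order key 4 (no user); the outer join keeps both and marks them `left_only` and `right_only`.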

align(self, other, join='outer', axis=None, level=None, copy=True, fill_value=None, method=None, limit=None, fill_axis=0, broadcast_axis=None)
    Align two objects on their axes with the specified join method for each axis Index
all(self, axis=None, bool_only=None, skipna=None, level=None, **kwargs)
    Return whether all elements are True over series or dataframe axis.
any(self, axis=None, bool_only=None, skipna=None, level=None, **kwargs)
    Return whether any element is True over requested axis.
apply(self, func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)
    Apply a function along an axis of the DataFrame.
applymap(self, func)
    Apply a function to a DataFrame elementwise. This method applies a function that accepts and returns a scalar to every element of a DataFrame.
append(self, other, ignore_index=False, verify_integrity=False, sort=None)
    Append rows of `other` to the end of this frame, returning a new object. Columns not in this frame are added as new columns.
assign(self, **kwargs)
    Assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones. Existing columns that are re-assigned will be overwritten.
insert(self, loc, column, value, allow_duplicates=False)
    Insert column into DataFrame at specified location.
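The following sketch exercises several of the methods above on a small frame. One assumption to flag: `append` was removed in pandas 2.0, so the last step uses `pd.concat` as a stand-in for appending rows instead of the call listed above.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 5], "c": [3, 7], "d": [4, 8]})

# apply: run a function on each column (axis=0) or each row (axis=1).
col_range = df.apply(lambda col: col.max() - col.min())
row_total = df.apply(lambda row: row.sum(), axis=1)

# assign returns a new frame and leaves df untouched.
with_ratio = df.assign(ratio=df["a"] / df["d"])

# insert modifies df in place, placing column "b" at position 1.
df.insert(1, "b", ["2", "6"])

# all / any reduce a boolean frame along an axis.
numeric = df[["a", "c", "d"]]
all_positive = (numeric > 0).all()       # one bool per column
any_large = (numeric > 7).any(axis=1)    # one bool per row

# Appending rows on current pandas: concat instead of the removed append().
extra = pd.DataFrame({"a": [9], "b": ["10"], "c": [11], "d": [12]})
combined = pd.concat([df, extra], ignore_index=True)
print(combined)
```

The difference between `assign` (returns a copy) and `insert` (mutates the frame) is easy to miss, which is why the sketch checks the columns of both results.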

combine(self, other, func, fill_value=None, overwrite=True)
    Add two DataFrame objects and do not propagate NaN values, so if for a (column, time) one frame is missing a value, it will default to the other frame's value (which might be NaN as well)
count(self, axis=0, level=None, numeric_only=False)
    Count non-NA cells for each column or row.
cov(self, min_periods=None)
    Compute pairwise covariance of columns, excluding NA/null values.
drop(self, labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
    Drop specified labels from rows or columns.
drop_duplicates(self, subset=None, keep='first', inplace=False)
    Return DataFrame with duplicate rows removed, optionally only considering certain columns
dropna(self, axis=0, how='any', thresh=None, subset=None, inplace=False)
    Remove missing values.
duplicated(self, subset=None, keep='first')
    Return boolean Series denoting duplicate rows, optionally only considering certain columns
eq(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods eq
eval(self, expr, inplace=False, **kwargs)
    Evaluate a string describing operations on DataFrame columns.
fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
    Fill NA/NaN values using the specified method
ge(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods ge
gt(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods gt
le(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods le
lt(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods lt
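To make the missing-value, duplicate, and comparison methods above more concrete, here is a small sketch on a hypothetical frame `raw` that contains one repeated row and two NaN cells.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 repeats row 0, rows 2 and 3 each miss a value.
raw = pd.DataFrame({
    "x": [1.0, 1.0, np.nan, 4.0],
    "y": [2.0, 2.0, 5.0, np.nan],
})

non_na = raw.count()                  # non-NA cells per column
dup_mask = raw.duplicated()           # True where a row repeats an earlier one
deduped = raw.drop_duplicates()       # keeps the first occurrence
complete = raw.dropna()               # rows without any missing value
filled = raw.fillna(0)                # replace NaN with 0
without_y = raw.drop(columns=["y"])   # drop a column by label

gt_one = filled.gt(1)                 # element-wise, same as filled > 1
total = filled.eval("x + y")          # expression over column names

print(non_na, dup_mask, gt_one, total, sep="\n\n")
```

All of these return new objects here because `inplace` is left at its default of False.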

get_value(self, index, col, takeable=False)
    Quickly retrieve single value at passed column and index
info(self, verbose=None, buf=None, max_cols=None, memory_usage=None, null_counts=None)
    Print a concise summary of a DataFrame.
isin(self, values)
    Return boolean DataFrame showing whether each element in the DataFrame is contained in values.
isna(self)
    Detect missing values. Return a boolean same-sized object indicating if the values are NA.
isnull(self)
    Detect missing values. Return a boolean same-sized object indicating if the values are NA.
iteritems(self)
    Iterator over (column name, Series) pairs.
iterrows(self)
    Iterate over DataFrame rows as (index, Series) pairs.
itertuples(self, index=True, name='Pandas')
    Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple.
join(self, other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
    Join columns with other DataFrame either on index or on a key column. Efficiently join multiple DataFrame objects by index at once by passing a list.
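A short sketch of the lookup, iteration, and join methods above, using two hypothetical frames `left` and `right`. Two substitutions are assumptions on my part: `get_value` and `iteritems` no longer exist in current pandas (removed in 1.0 and 2.0 respectively), so the sketch uses the `.at` accessor and `items()` in their place.

```python
import numpy as np
import pandas as pd

# Hypothetical frames that share only the index label "r2".
left = pd.DataFrame({"a": [1, 5], "b": ["2", np.nan]}, index=["r1", "r2"])
right = pd.DataFrame({"e": [9, 10]}, index=["r2", "r3"])

# isin with a dict checks each column against its own list of values.
in_values = left.isin({"a": [1], "b": ["2"]})
missing = left.isna()

# Three ways to iterate: rows as Series, rows as namedtuples, columns.
row_labels = [idx for idx, row in left.iterrows()]
tuples = [(t.Index, t.a) for t in left.itertuples()]
column_names = [name for name, col in left.items()]

# join aligns on the index; how='left' keeps every row of `left`.
joined = left.join(right, how="left")

# Fast scalar access by row label and column label.
single = left.at["r1", "a"]

print(in_values, missing, joined, sep="\n\n")
```

`itertuples` is usually the faster of the two row iterators because it avoids building a Series for every row.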
