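When Python needs to interoperate with Java, .NET, or PHP web platforms, it is best to communicate over the web (for example, HTTP calls) rather than use Jython or IronPython. This keeps the program modular and loosely coupled.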
>>> print(r'''Hello,
... Lisa!''')
Hello,
Lisa!
>>> print('''line1
... line2
... line3''')
line1
line2
line3
print(r'\\\t\\')  # Output: \\\t\\
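There are also list (similar to an array) and dict (similar to a JavaScript object literal).
Constants: PI (by convention constants are written in all capitals; Python does not enforce immutability)
/ : always produces a float, for example 10/3 = 3.3333..., 9/3 = 3.0
// : floor division, 10//3 = 3
% : remainder, 10%3 = 1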
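As a quick check of the three operators above, the following minimal sketch evaluates each one. The variable names are illustrative only. It also shows that // floors toward negative infinity rather than truncating toward zero.

```python
# Division-related operators in Python 3.
true_div = 10 / 3      # always a float
exact_div = 9 / 3      # still a float, even when the division is exact
floor_div = 10 // 3    # floor division
neg_floor = -10 // 3   # floors toward negative infinity, so this is -4
remainder = 10 % 3     # remainder of the division
quotient, rest = divmod(10, 3)  # quotient and remainder in one call

print(true_div, exact_div, floor_div, neg_floor, remainder, quotient, rest)
```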
>>> ord('A')
65
>>> ord('中')
20013
>>> chr(66)
'B'
>>> chr(25991)
'文'
>>> 'ABC'.encode('ascii')
b'ABC'
>>> '中文'.encode('utf-8')
b'\xe4\xb8\xad\xe6\x96\x87'
Conversely, a UTF-8 byte stream read from the network or from disk must be decoded into a str (Unicode) before it can be used in code. Use the decode method for this:
>>> b'ABC'.decode('ascii')
'ABC'
>>> b'\xe4\xb8\xad\xe6\x96\x87'.decode('utf-8')
'中文'
>>> len('abc')
3
>>> len('中')
1
>>> len('中文'.encode('utf-8'))
6
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
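In Python 2, every string that will be displayed should be defined as a unicode literal, u"this is a unicode string". In Python 3 every str literal is already Unicode, so the u prefix is accepted but no longer needed.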
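A minimal sketch, assuming Python 3, of how the u prefix and the str/bytes split behave; the variable names are illustrative:

```python
# In Python 3 the u prefix is optional: both literals are the same str type.
plain = 'this is a unicode string'
prefixed = u'this is a unicode string'
print(type(plain) is type(prefixed), plain == prefixed)

# Bytes only appear once text is encoded for storage or transmission.
encoded = '中文'.encode('utf-8')
print(type(encoded), len(encoded), encoded.decode('utf-8'))
```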
>>> 'Hello, %s' % 'world'
'Hello, world'
>>> 'Hi, %s, you have $%d.' % ('Michael', 1000000)
'Hi, Michael, you have $1000000.'
>>> classmates = ['Michael', 'Bob', 'Tracy']
>>> classmates
['Michael', 'Bob', 'Tracy']
>>> len(classmates)
3
>>> classmates[0]
'Michael'
>>> classmates[1]
'Bob'
>>> classmates[2]
'Tracy'
>>> classmates[3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> classmates[-1]
'Tracy'
>>> classmates[-2]
'Bob'
>>> classmates[-3]
'Michael'
>>> classmates[-4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
A list also provides these commonly used methods: append, insert, pop
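The three list methods named above can be sketched as follows. The names 'Adam' and 'Jack' are made up for the illustration.

```python
classmates = ['Michael', 'Bob', 'Tracy']

classmates.append('Adam')      # add an element at the end
classmates.insert(1, 'Jack')   # insert an element before index 1
last = classmates.pop()        # remove and return the last element
second = classmates.pop(1)     # remove and return the element at index 1

print(classmates, last, second)
```

After these calls the list is back to its three original names, which shows that pop undoes append and insert when given the matching position.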
L = ['Hello', 'World', 18, 'Apple', None]
print([s.lower() if isinstance(s, str) else s for s in L])
# Output: ['hello', 'world', 18, 'apple', None]
>>> L = [x * x for x in range(10)]
>>> L
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> g = (x * x for x in range(10))
>>> g
<generator object <genexpr> at 0x1022ef630>
>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16
>>> g = (x * x for x in range(10))
>>> for n in g:
...     print(n)
...
0
1
4
9
gougu = {z: (x, y) for z in range(1, 27) for y in range(1, z) for x in range(1, y) if x*x + y*y == z*z}
gougu
Out[17]:
{5: (3, 4),
 10: (6, 8),
 13: (5, 12),
 15: (9, 12),
 17: (8, 15),
 20: (12, 16),
 25: (7, 24),
 26: (10, 24)}
gougu = [[x, y, z] for z in range(1, 27) for y in range(1, z) for x in range(1, y) if x*x + y*y == z*z]
gougu
Out[19]:
[[3, 4, 5],
 [6, 8, 10],
 [5, 12, 13],
 [9, 12, 15],
pyt = ((x, y, z) for z in range(1, 27) for y in range(1, z) for x in range(1, y) if x*x + y*y == z*z)
# pyt is a generator here: note the outermost parentheses! It can then be consumed with a for loop or a comprehension
print([m for m in pyt])
# Output: [(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17), (12, 16, 20), (15, 20, 25), (7, 24, 25), (10, 24, 26)]
import jieba

documents = [u'我來到北京清華大學',
             u'假如當前的單詞表有10個不一樣的單詞',
             u'我是中華人民共和國的公民,來自上海,老家是湖北襄陽']
documents_after = []
documents_after = [[w for w in jieba.cut(s)] for s in documents]
documents_after2 = [' '.join(s) for s in documents_after]
print(documents_after)
print(documents_after2)
# Output:
[['我', '來到', '北京', '清華大學'], ['假如', '當前', '的', '單詞表', '有', '10', '個', '不一樣', '的', '單詞'], ['我', '是', '中華人民共和國', '的', '公民', ',', '來自', '上海', ',', '老家', '是', '湖北', '襄陽']]
['我 來到 北京 清華大學', '假如 當前 的 單詞表 有 10 個 不一樣 的 單詞', '我 是 中華人民共和國 的 公民 , 來自 上海 , 老家 是 湖北 襄陽']
def fib(max):
    n, a, b = 0, 0, 1
    while n < max:
        yield b
        a, b = b, a + b
        n = n + 1
    return 'done'

f = fib(6)
for n in fib(6):
    print(n)
# Output:
1
1
2
3
5
8
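def countdown(n):
    print("counting down from ", n)
    while n > 0:
        yield n
        n -= 1

x = countdown(10)
print(x)
# Note that "counting down from 10" has not been printed at this point
<generator object countdown at 0x0000026385694468>
# The function body only starts to run on the first next(x), which prints:
# counting down from 10
# 10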
A generator behaves completely differently from an ordinary function. Calling a generator function creates a generator object, but note that the function body itself does not run at that point!
When the generator returns, iteration stops.
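A minimal sketch of what "returning stops iteration" means in practice, assuming Python 3.3 or later: the return value is not yielded, it is carried on the StopIteration exception. The parameter name limit and the driver loop are illustrative.

```python
def fib(limit):
    """Yield the first `limit` Fibonacci numbers, then return a marker."""
    n, a, b = 0, 0, 1
    while n < limit:
        yield b
        a, b = b, a + b
        n += 1
    return 'done'

g = fib(3)
values = []
return_value = None
while True:
    try:
        values.append(next(g))
    except StopIteration as stop:
        # The generator's return value travels on the exception.
        return_value = stop.value
        break

print(values, return_value)
```

A plain for loop swallows the StopIteration, which is why the return value never shows up when the generator is consumed with for.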
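A generator can be iterated much like a list, but there is one difference: a generator is a one-time operation. Once its values have been consumed, iterating it again produces nothing; call the generator function again to get a fresh generator.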
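The one-time behaviour can be seen in a small sketch like this one. The variable names are illustrative, and the print call from the earlier countdown function is left out to keep the output short.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
first_pass = list(gen)            # consumes the generator
second_pass = list(gen)           # already exhausted, nothing is left
fresh_pass = list(countdown(3))   # a new call gives a new generator

print(first_pass, second_pass, fresh_pass)
```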
a = [1, 2, 3, 4]
b = (2*x for x in a)
b
Out[19]: <generator object <genexpr> at 0x0000023EDA2C6CA8>
for i in b:
    print(i)
2
4
6
8
(expression for i in s if condition)
# is equivalent to
for i in s:
    if condition:
        yield expression
Note: when a generator expression is the sole argument of a function call, the extra parentheses can be omitted.
a = [1, 2, 3, 4]
sum(x*x for x in a)
Out[21]: 30
Data that can be iterated over directly in a for loop includes the collection types list, tuple, dict, set, and str, as well as generators (including those produced by generator functions that use yield). All of these are called iterable types (Iterable), and isinstance() can be used to check for them. In current Python versions these classes are imported from collections.abc:
>>> from collections.abc import Iterable
>>> isinstance([], Iterable)
True
>>> isinstance({}, Iterable)
True
>>> isinstance('abc', Iterable)
True
>>> isinstance((x for x in range(10)), Iterable)
True
>>> isinstance(100, Iterable)
False
>>> from collections.abc import Iterator
>>> isinstance((x for x in range(10)), Iterator)
True
>>> isinstance([], Iterator)
False
>>> isinstance({}, Iterator)
False
>>> isinstance('abc', Iterator)
False
>>> isinstance(iter([]), Iterator)
True
>>> isinstance(iter('abc'), Iterator)
True
for x in [1, 2, 3, 4, 5]:
    pass

# is exactly equivalent to:

# first obtain an Iterator object:
it = iter([1, 2, 3, 4, 5])
# loop:
while True:
    try:
        # get the next value:
        x = next(it)
    except StopIteration:
        # leave the loop when StopIteration is raised
        break
>>> classmates = ('Michael', 'Bob', 'Tracy')
>>> t = (1,)
>>> t
(1,)
a[start:end]       # items start through end-1
a[start:]          # items start through the rest of the array
a[:end]            # items from the beginning through end-1
a[:]               # a copy of the whole array
a[start:end:step]  # start through not past end, by step
a[-1]              # last item in the array
a[-2:]             # last two items in the array
a[:-2]             # everything except the last two items
a[::-1]            # all items in the array, reversed
a[1::-1]           # the first two items, reversed
a[:-3:-1]          # the last two items, reversed
a[-3::-1]          # everything except the last two items, reversed
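To make the slicing rules above concrete, here is a short sketch that applies several of them to a ten-element list. The list contents and variable names are chosen only for illustration.

```python
a = list(range(10))   # [0, 1, 2, ..., 9]

middle = a[2:5]            # items 2 through 4
last_two = a[-2:]          # last two items
all_but_last_two = a[:-2]  # everything except the last two items
reversed_all = a[::-1]     # all items, reversed
first_two_rev = a[1::-1]   # the first two items, reversed
last_two_rev = a[:-3:-1]   # the last two items, reversed
rest_rev = a[-3::-1]       # everything except the last two, reversed
copy = a[:]                # a shallow copy: equal contents, different object

print(middle, last_two, first_two_rev, last_two_rev)
print(copy == a, copy is a)
```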
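An ndarray can be accessed and sliced with the standard Python $x[obj]$ syntax, where $x$ is the array itself and $obj$ is the selection expression. ndarray supports three different kinds of indexing: field access, basic slicing, and advanced indexing. Which one applies depends on $obj$ itself.
$x[(exp1, exp2, ..., expN)]$ is equivalent to $x[exp1, exp2, ..., expN]$
Basic slicing on an ndarray extends Python's basic indexing and slicing, which apply only to one-dimensional sequences, to N dimensions. When obj in the $x[obj]$ form above is a slice object (the $[start:stop:step]$ notation), an integer, or a tuple of slice objects and integers, the operation is basic slicing. The standard slicing rules are applied to each dimension separately.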
Every array produced by basic slicing is actually a view of the original array; the data itself is not copied.
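The view behaviour can be checked with a small sketch, assuming numpy is installed. np.shares_memory is used here only to confirm whether two arrays overlap in memory; the array contents are illustrative.

```python
import numpy as np

x = np.arange(10)

view = x[2:6]           # basic slicing: a view onto x's data
view[0] = 100           # writing through the view changes x as well

copy = x[2:6].copy()    # an explicit copy owns its own data
copy[1] = -1            # x is unaffected by this write

print(x)
print(np.shares_memory(x, view), np.shares_memory(x, copy))
```

If an independent array is needed after slicing, calling .copy() as above is the usual way to get one.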
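The basic slice syntax is $i:j:k$, where $i$ is the start, $j$ is the stop, and $k$ is the step. If $i$ or $j$ is negative, it is interpreted as $n+i$ or $n+j$, where $n$ is the number of elements in the corresponding dimension. If $k<0$, stepping goes toward smaller indices.
>>> x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> x[1:7:2]
array([1, 3, 5])
>>> x[-2:10]
array([8, 9])
>>> x[-3:3:-1]
array([7, 6, 5, 4])
>>> x[5:]
array([5, 6, 7, 8, 9])
>>> x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
>>> x.shape
(2, 3, 1)
>>> x[1:2]
array([[[4],
        [5],
        [6]]])
>>> x[...,0]
array([[1, 2, 3],
       [4, 5, 6]])
>>> x[:,np.newaxis,:,:].shape
(2, 1, 3, 1)
Advanced indexing is triggered when the selection obj is a non-tuple sequence object, an ndarray of integer or bool type, or a tuple that contains at least one sequence object or ndarray of integer or bool type. There are two kinds: integer and boolean. Unlike basic slicing, advanced indexing always returns a copy of the data.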
>>> x = np.array([[ 0,  1,  2],
...               [ 3,  4,  5],
...               [ 6,  7,  8],
...               [ 9, 10, 11]])
>>> rows = np.array([[0, 0],
...                  [3, 3]], dtype=np.intp)
>>> columns = np.array([[0, 2],
...                     [0, 2]], dtype=np.intp)
>>> x[rows, columns]
array([[ 0,  2],
       [ 9, 11]])
>>> x = np.array([[1, 2], [3, 4], [5, 6]])
>>> x[[0, 1, 2], [0, 1, 0]]
array([1, 4, 5])
>>> x = np.array([[1., 2.], [np.nan, 3.], [np.nan, np.nan]])
>>> x[~np.isnan(x)]
array([ 1.,  2.,  3.])
>>> x = np.array([1., -1., -2., 3])
>>> x[x < 0] += 20
>>> x
array([  1.,  19.,  18.,   3.])
>>> x = np.array([[0, 1], [1, 1], [2, 2]])
>>> rowsum = x.sum(-1)
>>> x[rowsum <= 2, :]
array([[0, 1],
       [1, 1]])
>>> rowsum = x.sum(-1, keepdims=True)
>>> rowsum.shape
(3, 1)
>>> x[rowsum <= 2, :]    # fails
IndexError: too many indices
>>> x[rowsum <= 2]
array([0, 1])
>>> x = np.array([[ 0,  1,  2],
...               [ 3,  4,  5],
...               [ 6,  7,  8],
...               [ 9, 10, 11]])
>>> rows = (x.sum(-1) % 2) == 0
>>> rows
array([False,  True, False,  True])
>>> columns = [0, 2]
>>> x[np.ix_(rows, columns)]
array([[ 3,  5],
       [ 9, 11]])
>>> rows = rows.nonzero()[0]
>>> x[rows[:, np.newaxis], columns]
array([[ 3,  5],
       [ 9, 11]])
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col=0)

# Print out country column as Pandas Series
print(cars['country'])

In [4]: cars['country']
Out[4]:
US     United States
AUS        Australia
JAP            Japan
IN             India
RU            Russia
MOR          Morocco
EG             Egypt
Name: country, dtype: object
# The result above is a Pandas Series

# Print out country column as Pandas DataFrame
print(cars[['country']])

In [5]: cars[['country']]
Out[5]:
           country
US   United States
AUS      Australia
JAP          Japan
IN           India
RU          Russia
MOR        Morocco
EG           Egypt

# Print out DataFrame with country and drives_right columns
print(cars[['country', 'drives_right']])

In [6]: cars[['country','drives_right']]
Out[6]:
           country  drives_right
US   United States          True
AUS      Australia         False
JAP          Japan         False
IN           India         False
RU          Russia          True
MOR        Morocco          True
EG           Egypt          True

# Print out first 3 observations
print(cars[0:3])

# Print out fourth, fifth and sixth observation
print(cars[3:6])

In [14]: cars
Out[14]:
     cars_per_cap        country  drives_right
US            809  United States          True
AUS           731      Australia         False
JAP           588          Japan         False
IN             18          India         False
RU            200         Russia          True
MOR            70        Morocco          True
EG             45          Egypt          True

In [15]: cars.loc['RU']
Out[15]:
cars_per_cap       200
country         Russia
drives_right      True
Name: RU, dtype: object

In [16]: cars.iloc[4]
Out[16]:
cars_per_cap       200
country         Russia
drives_right      True
Name: RU, dtype: object

In [17]: cars.loc[['RU']]
Out[17]:
    cars_per_cap country  drives_right
RU           200  Russia          True

In [18]: cars.iloc[[4]]
Out[18]:
    cars_per_cap country  drives_right
RU           200  Russia          True

In [19]: cars.loc[['RU','AUS']]
Out[19]:
     cars_per_cap    country  drives_right
RU            200     Russia          True
AUS           731  Australia         False

In [20]: cars.iloc[[4,1]]
Out[20]:
     cars_per_cap    country  drives_right
RU            200     Russia          True
AUS           731  Australia         False

In [3]: cars.loc['IN','cars_per_cap']
Out[3]: 18

In [4]: cars.iloc[3,0]
Out[4]: 18

In [5]: cars.loc[['IN','RU'],'cars_per_cap']
Out[5]:
IN     18
RU    200
Name: cars_per_cap, dtype: int64

In [6]: cars.iloc[[3,4],0]
Out[6]:
IN     18
RU    200
Name: cars_per_cap, dtype: int64

In [7]: cars.loc[['IN','RU'],['cars_per_cap','country']]
Out[7]:
    cars_per_cap country
IN            18   India
RU           200  Russia

In [8]: cars.iloc[[3,4],[0,1]]
Out[8]:
    cars_per_cap country
IN            18   India
RU           200  Russia

print(cars.loc['MOR','drives_right'])
True

In [1]: cars.loc[:,'country']
Out[1]:
US     United States
AUS        Australia
JAP            Japan
IN             India
RU            Russia
MOR          Morocco
EG             Egypt
Name: country, dtype: object

In [2]: cars.iloc[:,1]
Out[2]:
US     United States
AUS        Australia
JAP            Japan
IN             India
RU            Russia
MOR          Morocco
EG             Egypt
Name: country, dtype: object

In [3]: cars.loc[:,['country','drives_right']]
Out[3]:
           country  drives_right
US   United States          True
AUS      Australia         False
JAP          Japan         False
IN           India         False
RU          Russia          True
MOR        Morocco          True
EG           Egypt          True

In [4]: cars.iloc[:,[1,2]]
Out[4]:
           country  drives_right
US   United States          True
AUS      Australia         False
JAP          Japan         False
IN           India         False
RU          Russia          True
MOR        Morocco          True
EG           Egypt          True
age = 3
if age >= 18:
    print('adult')
elif age >= 6:
    print('teenager')
else:
    print('kid')
names = ['Michael', 'Bob', 'Tracy']
for name in names:
    print(name)
>>> list(range(5))
[0, 1, 2, 3, 4]
sum = 0
for x in range(101):
    sum = sum + x
print(sum)
>>> d = {'Michael': 95, 'Bob': 75, 'Tracy': 85}
>>> d['Michael']
95
>>> s = set([1, 2, 3])
>>> s
{1, 2, 3}
>>> s1 = set([1, 2, 3])
>>> s2 = set([2, 3, 4])
>>> s1 & s2
{2, 3}
>>> s1 | s2
{1, 2, 3, 4}
import math

def move(x, y, step, angle=0):
    nx = x + step * math.cos(angle)
    ny = y - step * math.sin(angle)
    return nx, ny

>>> x, y = move(100, 100, 60, math.pi / 6)
>>> print(x, y)
151.96152422706632 70.0
>>> r = move(100, 100, 60, math.pi / 6)
>>> print(r)  # the function really returns a single tuple; above, its elements were unpacked into x and y
(151.96152422706632, 70.0)
def enroll(name, gender, age=6, city='Beijing'):
    print('name:', name)
    print('gender:', gender)
    print('age:', age)
    print('city:', city)

enroll('Bob', 'M', 7)
enroll('Adam', 'M', city='Tianjin')
def calc(*numbers):
    sum = 0
    # note that numbers is a tuple here: <class 'tuple'>
    for n in numbers:
        sum = sum + n * n
    return sum

>>> nums = [1, 2, 3]
>>> calc(*nums)  # the * unpacks the list (or tuple), passing all of its elements as variable positional arguments
14
def person(name, age, **kw):
    print('name:', name, 'age:', age, 'other:', kw)
    # note that kw is a dict; print(type(kw)) here would show <class 'dict'>

>>> person('Michael', 30)
name: Michael age: 30 other: {}
>>> person('Bob', 35, city='Beijing')
name: Bob age: 35 other: {'city': 'Beijing'}
>>> person('Adam', 45, gender='M', job='Engineer')
name: Adam age: 45 other: {'gender': 'M', 'job': 'Engineer'}
>>> extra = {'city': 'Beijing', 'job': 'Engineer'}
>>> person('Jack', 24, **extra)
name: Jack age: 24 other: {'city': 'Beijing', 'job': 'Engineer'}
def person(name, age, *, city='Beijing', job):
    # city is a keyword-only argument with a default value of 'Beijing'
    print(name, age, city, job)

>>> person('Jack', 24, city='Beijing', job='Engineer')
Jack 24 Beijing Engineer
from collections.abc import Iterator

def f(x):
    return x * x

r = map(f, [1, 2, 3, 4, 5])
print(r)
print(isinstance(r, Iterator))  # True
print(list(r))
# Output:
# <map object at 0x000000000072B9B0>   (map returns an Iterator, so list() is needed to build the list)
# True
# [1, 4, 9, 16, 25]
image module code example:
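from PIL import Image

# A raw string keeps the backslashes in the Windows path from being read as escapes
im = Image.open(r'C:\Users\Administrator\Desktop\jj.png')
print(im.format, im.size, im.mode)
im.thumbnail((100, 50))
# The second argument selects the PNG format regardless of the file extension
im.save('thumb.jpg', 'png')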
import socket
import threading
import time

def tcplink(sock, addr):
    print('Accept new connection from %s:%s...' % addr)
    sock.send(b'Welcome, client!')
    while True:
        data = sock.recv(1024)
        time.sleep(1)
        if not data or data.decode('utf-8') == 'exit':
            break
        # data is bytes, so %s renders it as b'...' in the reply;
        # format data.decode('utf-8') instead to send the plain name
        sock.send(('Hello, %s!' % data).encode('utf-8'))
    sock.close()
    print('Connection from %s:%s closed.' % addr)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('', 9999))
s.listen(5)
print('waiting for connection coming on server...')
while True:
    sock, addr = s.accept()
    t = threading.Thread(target=tcplink, args=(sock, addr))
    t.start()
waiting for connection coming on server...
Accept new connection from
Connection from closed.
Accept new connection from
Connection from closed.
Accept new connection from
Connection from closed.
Accept new connection from
Connection from closed.
Accept new connection from
Connection from closed.
import socket
import threading
import time

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('', 9999))
print(s.recv(1024).decode('utf-8'))
for data in [b'Michael', b'Tracy', b'Sarah']:
    s.send(data)
    print(s.recv(1024).decode('utf-8'))
s.send(b'exit')
s.close()
Welcome, client!
Hello, b'Michael'!
Hello, b'Tracy'!
Hello, b'Sarah'!
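The evolution from IPython Notebook to Jupyter Notebook
Broadly, the architecture has two layers: the interface level and the kernel level. The interface layer can be a notebook, the IPython console, or the Qt console. It talks to the kernel level through a message queue (MQ) over sockets; this channel carries the Python code to be executed and the data returned once execution finishes.
Jupyter extends this notebook model to multiple languages, such as R and bash, by adding a kernel component for each language at the kernel level. Each kernel is responsible for executing code in its language and returning the results.
IPython is a Python console with enhanced interactive capabilities. It provides many useful features:
Compared with the standard Python console, it offers: tab completion; object exploration, for example object_name? lists the details of an object; magic functions, for example %timeit, which is commonly used to measure how fast a piece of code runs, and %run, which executes any Python script and loads all of its data directly into the interactive namespace; and system shell commands, for example !ping, whose output can also be captured: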
files = !ls
!grep -rF $pattern ipython/*
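Jupyter Notebook is especially handy in at least the following two scenarios:
1. You want to experiment further with an existing notebook, or simply learn from it;
2. You want to write your own notebook to support teaching or to produce an academic article.
In both scenarios you will probably want to run Jupyter Notebook from a specific directory:
jupyter notebook
This opens the notebook and lists all files in that directory: http://localhost:8888/tree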
some python debug study tips:
y = [x*x for x in range(1, 11)]
print(dir(y))
# Output:
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
# list all variables of type Series
%who Series
s    temp_diffs    temps1    temps2

# list all global names
%who
DataFrame    Series    dates    np    pd    plt    s    temp_diffs    temps1    temps2

# list all global names together with their type details
%whos
Variable     Type             Data/Info
---------------------------------------
DataFrame    type             <class 'pandas.core.frame.DataFrame'>
Series       type             <class 'pandas.core.series.Series'>
dates        DatetimeIndex    DatetimeIndex(['2014-07-0<...>atetime64[ns]', freq='D')
my_func      function         <function my_func at 0x00000211211B7C80>
np           module           <module 'numpy' from 'C:\<...>ges\\numpy\\'>
pd           module           <module 'pandas' from 'C:<...>es\\pandas\\'>
plt          module           <module 'matplotlib.pyplo<...>\\matplotlib\\'>
s            Series           a 1\nb 2\nc 3\nd 4\ndtype: int64
temp_diffs   Series           2014-07-01 10\n2014-07<...>10\nFreq: D, dtype: int64
temps1       Series           2014-07-01 80\n2014-07<...>87\nFreq: D, dtype: int64
temps2       Series           2014-07-01 70\n2014-07<...>77\nFreq: D, dtype: int64
import pandas as pd
print(dir(pd))
help(pd.Series)  # help() prints its own output, so it does not need to be wrapped in print()
# Output:
['Categorical', 'CategoricalIndex', 'DataFrame', 'DateOffset', 'DatetimeIndex', 'ExcelFile', 'ExcelWriter', 'Expr', 'Float64Index', 'Grouper', 'HDFStore', 'Index', 'IndexSlice', 'Int64Index', 'MultiIndex', 'NaT', 'Panel', 'Panel4D', 'Period', 'PeriodIndex', 'RangeIndex', 'Series', 'SparseArray', 'SparseDataFrame', 'SparseList', 'SparsePanel', 'SparseSeries', 'SparseTimeSeries', 'Term', 'TimeGrouper', 'TimeSeries', 'Timedelta', 'TimedeltaIndex', 'Timestamp', 'WidePanel', '__builtins__', '__cached__', '__doc__', '__docformat__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', '_np_version_under1p10', '_np_version_under1p11', '_np_version_under1p12', '_np_version_under1p8', '_np_version_under1p9', '_period', '_sparse', '_testing', '_version', 'algos', 'bdate_range', 'compat', 'computation', 'concat', 'core', 'crosstab', 'cut', 'date_range', 'datetime', 'datetools', 'dependency', 'describe_option', 'eval', 'ewma', 'ewmcorr', 'ewmcov', 'ewmstd', 'ewmvar', 'ewmvol', 'expanding_apply', 'expanding_corr', 'expanding_count', 'expanding_cov', 'expanding_kurt', 'expanding_max', 'expanding_mean', 'expanding_median', 'expanding_min', 'expanding_quantile', 'expanding_skew', 'expanding_std', 'expanding_sum', 'expanding_var', 'factorize', 'fama_macbeth', 'formats', 'get_dummies', 'get_option', 'get_store', 'groupby', 'hard_dependencies', 'hashtable', 'index', 'indexes', 'infer_freq', 'info', 'io', 'isnull', 'json', 'lib', 'lreshape', 'match', 'melt', 'merge', 'missing_dependencies', 'msgpack', 'notnull', 'np', 'offsets', 'ols', 'option_context', 'options', 'ordered_merge', 'pandas', 'parser', 'period_range', 'pivot', 'pivot_table', 'plot_params', 'pnow', 'qcut', 'read_clipboard', 'read_csv', 'read_excel', 'read_fwf', 'read_gbq', 'read_hdf', 'read_html', 'read_json', 'read_msgpack', 'read_pickle', 'read_sas', 'read_sql', 'read_sql_query', 'read_sql_table', 'read_stata', 'read_table', 'reset_option', 'rolling_apply', 'rolling_corr', 'rolling_count', 'rolling_cov', 'rolling_kurt', 'rolling_max', 'rolling_mean', 'rolling_median', 'rolling_min', 'rolling_quantile', 'rolling_skew', 'rolling_std', 'rolling_sum', 'rolling_var', 'rolling_window', 'scatter_matrix', 'set_eng_float_format', 'set_option', 'show_versions', 'sparse', 'stats', 'test', 'timedelta_range', 'to_datetime', 'to_msgpack', 'to_numeric', 'to_pickle', 'to_timedelta', 'tools', 'tseries', 'tslib', 'types', 'unique', 'util', 'value_counts', 'wide_to_long']
Help on class Series in module pandas.core.series:

class Series(pandas.core.base.IndexOpsMixin, pandas.core.strings.StringAccessorMixin, pandas.core.generic.NDFrame)
 |  One-dimensional ndarray with axis labels (including time series).
 |
 |  Labels need not be unique but must be any hashable type. The object
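A standard Python list stores pointers to objects, so reaching an element always takes a second level of indirection. For numerical data this is inefficient and wastes space.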
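A rough way to see the space cost, sketched under the assumption of CPython: the standard-library array module stands in here for packed numeric storage (it is an analogy, not numpy itself), and sys.getsizeof is used to compare a list plus its separately allocated int objects against a packed array of the same values. Exact byte counts vary by platform, so none are shown.

```python
import sys
from array import array

n = 1000
as_list = list(range(n))
as_array = array('q', range(n))   # 'q': signed 64-bit integers stored inline

# A list holds references; each int object lives elsewhere on the heap.
list_total = sys.getsizeof(as_list) + sum(sys.getsizeof(v) for v in as_list)
# The array stores the raw values contiguously in a single buffer.
array_total = sys.getsizeof(as_array)

print(list_total, array_total)
```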
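In addition, the standard Python list and array have no native two-dimensional array type, and they do not offer the more complex functions suited to numerical computation on array data.
To improve performance and to support complex operations on multi-dimensional arrays, numpy implements its core in C and exposes it to Python as Python objects.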
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)  # y is the result of applying sin to every element of x
plt.plot(x, y, 'r-', linewidth=3, label='sin function')
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.legend()  # show the label given to plot()
plt.show()
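pandas builds on numpy to provide SQL-like data handling, with two data types: Series and DataFrame. Each Series is made up of two array-like parts, index and values. index holds the index information passed in when the Series was created, and values is the ndarray of the corresponding data. numpy's ufunc functions operate on this values array.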
dataframe.loc/iloc vs []index operator
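.loc/.iloc address rows first, whereas [] defaults to column selection (a slice inside [], such as cars[0:3] above, is the exception and selects rows). A column always has a name, so column selection is always label based.
df.loc[:, ['Name', 'cost']]  # returns the Name and cost values for all stores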
shoplist = ['apple', 'mango', 'carrot', 'banana']
mylist = shoplist  # mylist is just another name for the same list object
del shoplist[0]
print('shoplist is:', shoplist)
print('mylist is:', mylist)
# The two prints above show the same output
print('copy via slice and assignment')
mycopiedlist = shoplist[:]  # make a copy by doing a full slice
del mycopiedlist[0]
print('shoplist is: ', shoplist)
print('mycopiedlist is:', mycopiedlist)
list('ABCD')  # Output: ['A', 'B', 'C', 'D']
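Sometimes, while doing exploratory programming in the IPython shell, you have already defined and run some functions and later want to look at a function's code again before calling it. At that point you need a way to "recover" the function's source.
import inspect
import pandas

source_DF = inspect.getsource(pandas.DataFrame)
print(type(source_DF))
print(source_DF[:200])  # print the beginning of the source code
source_file_DF = inspect.getsourcefile(pandas.DataFrame)
print(source_file_DF)
# D:\Users\dengdong\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\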
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = a[:]
print(id(a))  # 54749320
print(id(b))  # 54749340