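Tushare is a free, open-source Python package that provides an interface to financial data. It covers the pipeline for stocks and other financial data from collection, through cleaning and processing, to storage. This gives financial analysts fast, tidy, and varied data that is easy to analyze, greatly reduces their data-acquisition workload, and lets them concentrate on researching and implementing strategies and models. Given the strengths of the Python pandas package in quantitative financial analysis, the vast majority of the data Tushare returns is in pandas DataFrame format.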
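Because the results arrive as DataFrames, storing and reloading them is ordinary pandas work. The following is a minimal sketch of that round trip, not Tushare's own storage code: the rows are made up, the column names (`date`, `open`, `close`, `high`, `low`) are an assumption about the shape of a daily price table, and an in-memory buffer stands in for a CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical daily bars; the prices are invented for illustration only.
bars = pd.DataFrame({
    "date": ["2018-01-02", "2018-01-03", "2018-02-01"],
    "open": [10.0, 10.5, 11.0],
    "close": [10.4, 10.2, 11.6],
    "high": [10.6, 10.7, 11.8],
    "low": [9.9, 10.1, 10.9],
})

# Write to an in-memory buffer instead of a file, then read it back.
buffer = io.StringIO()
bars.to_csv(buffer, index=False)
buffer.seek(0)

# Parsing the date column and using it as the index yields a DatetimeIndex,
# which is what makes date-based slicing and resampling possible.
df = pd.read_csv(buffer, index_col="date", parse_dates=["date"])

january = df.loc["2018-01"]                # all rows in January 2018
first_of_month = df.resample("MS").first() # first available row of each month
print(january)
print(first_of_month)
```

Reading the dates back as a `DatetimeIndex` is the design choice that matters here: without `parse_dates`, the index would hold plain strings and `resample` would raise an error.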
Usage example
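import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tushare as ts

# Use tushare to fetch the daily price data for one stock
df = ts.get_k_data('600519', start='2008-01-01')
print(type(df))
df.to_csv('600519.csv')
df = pd.read_csv('600519.csv', index_col='date', parse_dates=['date'])[['open', 'close', 'high', 'low']]
print(df)

# Print every date on which the close was more than 3% above the open
print(df[(df['close'] - df['open']) / df['open'] > 0.03].index)

# df.shift() moves the data: positive values shift down, negative values shift up
# Print every date on which the open was 2% or more below the previous close
print(df[(df['open'] - df['close'].shift(1)) / df['close'].shift(1) <= -0.02].index)

# Suppose that, starting on 1 January 2008, I buy one lot (100 shares) on the first
# trading day of every month and sell everything on the last trading day of every year.
# What is my return so far?
price_last = df['open'].iloc[-1]  # latest open price; not used below, because the loop stops at 2017
df = df.loc['2008-01':'2018-11']  # drop unused data at both ends
df_monthly = df.resample("MS").first()  # first trading day of each month
print("df_monthly 2008:")
print(df_monthly)
print("df_yearly:")
# Last trading day of each year, without the partial final year.
# Newer pandas versions spell the "A" alias as "YE".
df_yearly = df.resample("A").last()[:-1]
print(df_yearly)

cost_money = 0
hold = 0
for year in range(2008, 2018):
    # Use .loc for partial-string date selection; plain df[str(year)] no longer selects rows
    cost_money += df_monthly.loc[str(year)]['open'].sum() * 100
    hold += len(df_monthly.loc[str(year)]['open']) * 100
    cost_money -= df_yearly.loc[str(year)]['open'].iloc[0] * hold
    hold = 0
print('cost_money: %s' % (0 - cost_money))

# Compute the 5-day and 30-day moving averages
# (assumes '601318.csv' was saved earlier in the same way as '600519.csv')
df = pd.read_csv('601318.csv', index_col='date', parse_dates=['date'])[['open', 'close', 'low', 'high']]
print(df.head())
df['ma5'] = np.nan
df['ma30'] = np.nan

# Loop-based version, kept for reference:
# for i in range(4, len(df)):
#     df.loc[df.index[i], 'ma5'] = df['close'].iloc[i-4:i+1].mean()
#
# for i in range(29, len(df)):
#     df.loc[df.index[i], 'ma30'] = df['close'].iloc[i-29:i+1].mean()
#
# print(df.head(50))

df['ma5'] = df['close'].rolling(5).mean()    # rolling window of 5 rows
df['ma30'] = df['close'].rolling(30).mean()  # rolling window of 30 rows
print(df.head(50))

# Plot the moving averages
df = df[:800]
df[['close', 'ma5', 'ma30']].plot()
plt.show()

# Golden-cross and death-cross dates
golden_cross = []
death_cross = []
for i in range(1, len(df)):
    if df['ma5'].iloc[i] >= df['ma30'].iloc[i] and df['ma5'].iloc[i - 1] < df['ma30'].iloc[i - 1]:
        golden_cross.append(df.index[i].to_pydatetime())
    if df['ma5'].iloc[i] <= df['ma30'].iloc[i] and df['ma5'].iloc[i - 1] > df['ma30'].iloc[i - 1]:
        death_cross.append(df.index[i])
print(golden_cross[:5])

# Vectorized version
sr1 = df['ma5'] < df['ma30']
sr2 = df['ma5'] >= df['ma30']
# fill_value=False keeps the shifted series boolean. Rows where a moving average
# is still NaN compare as False in both series, so they are never reported as crosses.
death_cross = df[sr1 & sr2.shift(1, fill_value=False)].index
golden_cross = df[sr2 & sr1.shift(1, fill_value=False)].index
print(death_cross)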
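The cross detection can be tried without downloading anything. The sketch below is one possible way to package it, not part of Tushare: `find_crosses` is a hypothetical helper name, and the price series is a synthetic sine wave rather than market data, chosen only so that the two moving averages cross several times.

```python
import numpy as np
import pandas as pd


def find_crosses(close, fast=5, slow=30):
    """Return (golden, death) DatetimeIndex objects for two moving averages.

    `find_crosses` is a hypothetical helper written for this illustration.
    """
    ma_fast = close.rolling(fast).mean()
    ma_slow = close.rolling(slow).mean()
    # Only rows where both averages exist can take part in a cross.
    valid = ma_fast.notna() & ma_slow.notna()
    above = (ma_fast >= ma_slow) & valid
    below = (ma_fast < ma_slow) & valid
    # Golden cross: fast average is at or above the slow one today, below it yesterday.
    golden_mask = above & below.shift(1, fill_value=False)
    # Death cross: fast average is below the slow one today, at or above it yesterday.
    death_mask = below & above.shift(1, fill_value=False)
    return close.index[golden_mask.to_numpy()], close.index[death_mask.to_numpy()]


# Synthetic prices (not market data): a slow sine wave on business days.
dates = pd.bdate_range("2020-01-01", periods=400)
close = pd.Series(100 + 10 * np.sin(np.arange(400) / 20.0), index=dates)

golden, death = find_crosses(close)
print("golden crosses:", list(golden.date))
print("death crosses:", list(death.date))
```

Masking with `valid` before shifting means the warm-up rows, where the 30-day average is still NaN, can never be mistaken for a cross.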