Analysis of the Relationship Between the Dow Jones Index and Its Component Stocks

1. Fetching the data

> library(quantmod)
> codedata<-list()
> codepool<-c("MMM","AXP","AAPL","BA","CAT","CVX","CSCO","KO","DD","XOM","GE","GS","HD","IBM","INTC","JNJ","JPM","MCD","MRK","MSFT","NKE","PFE","PG","TRV","UTX","UNH","VZ","WMT","V","DIS")
> number<-length(codepool)
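> for(i in seq_len(number)){
+ n<-codepool[i]
+ # auto.assign=FALSE returns the xts object directly, so no get() lookup is needed
+ TESTdata<-getSymbols(n,src="yahoo",from=Sys.Date()-50,to=Sys.Date(),auto.assign=FALSE)
+ # Cl() extracts the closing-price column (column 4 of the Yahoo OHLC data)
+ codedata[[i]]<-as.data.frame(Cl(TESTdata))
+ }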
> finaldata<-do.call(cbind,codedata)
> names(finaldata)<-codepool
> head(finaldata,3)
              MMM   AXP   AAPL     BA   CAT    CVX  CSCO    KO    DD   XOM    GE     GS     HD    IBM  INTC    JNJ   JPM    MCD   MRK  MSFT   NKE   PFE    PG    TRV
2017-01-23 178.51 75.97 120.08 157.84 94.46 115.39 30.27 41.43 72.78 84.97 29.75 232.67 138.07 171.03 36.77 113.91 83.71 121.38 61.81 62.96 53.24 31.46 86.96 118.04
2017-01-24 175.97 77.43 119.97 160.55 96.24 116.37 30.60 41.90 76.05 85.09 30.00 233.68 138.06 175.90 37.62 111.76 84.72 121.05 61.21 63.52 53.45 31.15 87.86 116.84
2017-01-25 176.73 76.89 121.88 167.36 98.15 117.24 30.70 42.12 76.67 85.34 30.37 237.25 137.48 178.29 37.80 112.80 86.03 121.79 61.08 63.68 53.86 31.29 87.16 117.67
              UTX    UNH    VZ   WMT     V    DIS
2017-01-23 110.34 159.07 52.41 66.65 82.15 107.12
2017-01-24 111.61 160.43 50.12 67.40 83.23 107.90
2017-01-25 110.96 161.24 49.77 66.89 83.90 108.04
> tail(finaldata,3)
              MMM   AXP   AAPL     BA   CAT    CVX  CSCO    KO    DD   XOM    GE     GS     HD    IBM  INTC    JNJ   JPM    MCD   MRK  MSFT   NKE   PFE    PG    TRV
2017-03-08 189.51 79.04 139.00 181.74 93.23 109.61 34.02 41.99 79.77 81.03 29.80 250.24 146.92 179.45 35.62 124.10 91.21 128.09 65.80 64.99 56.51 33.91 90.14 121.24
2017-03-09 189.90 79.30 138.68 180.57 91.39 110.04 34.07 42.03 80.48 81.67 29.66 250.18 146.62 177.18 35.82 125.95 91.57 128.14 65.89 64.73 56.36 34.05 90.34 121.90
2017-03-10 191.21 79.38 139.14 178.70 92.31 110.61 34.26 42.29 80.86 81.61 30.28 248.38 146.85 177.83 35.91 126.21 91.28 127.98 65.60 64.93 56.43 34.11 91.07 122.83
              UTX    UNH    VZ   WMT     V    DIS
2017-03-08 111.75 167.91 49.16 69.80 88.96 110.84
2017-03-09 111.93 168.01 49.28 69.86 89.11 111.03
2017-03-10 112.14 169.98 49.35 70.10 89.73 110.92

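The Dow Jones index has 30 component stocks. For each one we take the closing price over the last 50 calendar days, which is about 34 trading days in the run shown here.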

2. Principal component analysis

> p<-princomp(finaldata[,1:30])
> summary(p)
Importance of components:
                           Comp.1     Comp.2     Comp.3     Comp.4     Comp.5      Comp.6      Comp.7      Comp.8      Comp.9     Comp.10     Comp.11     Comp.12
Standard deviation     16.1041328 4.22688825 3.50222384 2.20689405 2.01370950 1.459120153 1.298793653 0.991546988 0.858445383 0.752024098 0.679270196 0.579182056
Proportion of Variance  0.8458697 0.05827339 0.04000521 0.01588517 0.01322581 0.006944019 0.005501855 0.003206678 0.002403555 0.001844558 0.001504922 0.001094105
Cumulative Proportion   0.8458697 0.90414310 0.94414831 0.96003348 0.97325929 0.980203314 0.985705168 0.988911847 0.991315402 0.993159960 0.994664882 0.995758987

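Principal component analysis produces 30 new variables (the summary output above is truncated at Comp.12). The first three explain 94.41% of the variance in the original data set (0.8458697 + 0.05827339 + 0.04000521), so we can use these 3 new variables in place of the 30 original ones.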
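
The cumulative proportion in the summary can also be computed directly from the component standard deviations, which makes it easy to pick the number of components programmatically. The following is a minimal sketch on synthetic data, not the article's price table: the `prices` matrix, the `p_demo` name, and the 0.94 threshold are all assumptions chosen for illustration.

```r
# Synthetic stand-in for the price table: 34 rows, 5 correlated columns (made-up data)
set.seed(1)
common <- cumsum(rnorm(34))   # one shared random-walk factor
prices <- sapply(1:5, function(j) 100 + j * common + rnorm(34, sd = 0.5))
colnames(prices) <- paste0("S", 1:5)

p_demo <- princomp(prices)    # covariance-based PCA, the princomp default

# Variance of each component is its squared standard deviation
var_share <- p_demo$sdev^2 / sum(p_demo$sdev^2)
cum_share <- cumsum(var_share)

# Smallest number of components reaching the (hypothetical) 94% threshold
k <- which(cum_share >= 0.94)[1]

print(round(cum_share, 4))
print(k)
```

Because `princomp` works on the covariance matrix by default, columns with larger price swings contribute more to the leading components; passing `cor = TRUE` would standardize the columns first.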

> stockdata<-as.data.frame(as.matrix(finaldata[,1:30])%*%as.matrix(p$loadings[,1:3]))
> head(stockdata)
              Comp.1    Comp.2   Comp.3
2017-01-23 -497.0058 -3.717498 174.8456
2017-01-24 -498.6973 -5.271716 178.5616
2017-01-25 -505.3037 -4.955875 181.6700
2017-01-26 -507.5417 -4.710687 182.2582
2017-01-27 -506.5475 -3.575603 180.2852
2017-01-30 -502.8693 -3.291562 176.4937
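
The projection above multiplies the raw (uncentered) prices by the loadings, which is why the values are far from zero. `princomp` itself stores centered scores in `p$scores`; the two differ only by a constant per component, so a regression with an intercept gives the same slopes either way. A small sketch of that relationship on synthetic data (the matrix `X` and all names here are assumptions for illustration):

```r
# Made-up data: 34 observations of 4 variables
set.seed(2)
X <- matrix(rnorm(34 * 4, mean = 50, sd = 5), ncol = 4,
            dimnames = list(NULL, paste0("S", 1:4)))

p_demo <- princomp(X)
L <- unclass(p_demo$loadings)[, 1:2]   # plain loading matrix, first two components

raw_proj      <- X %*% L                          # uncentered projection, as in the article
centered_proj <- sweep(X, 2, p_demo$center) %*% L # subtract column means first

# Constant shift between the two projections, one value per component
offset <- as.vector(p_demo$center %*% L)

# Both differences should be numerically zero
diff_centered <- max(abs(centered_proj - p_demo$scores[, 1:2]))
diff_shifted  <- max(abs(sweep(raw_proj, 2, offset) - p_demo$scores[, 1:2]))
print(c(diff_centered, diff_shifted))
```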

3. Regression analysis

(1) Fetch the Dow Jones index data for the same 50 days and merge the data frames

> res<-getSymbols("DJIA",src="yahoo",from=Sys.Date()-50,to=Sys.Date(),auto.assign=FALSE)
> US30<-as.data.frame(Cl(res))
> mydata<-cbind(stockdata,US30)
> names(mydata)<-c("X1","X2","X3","US30")
> head(mydata)
                  X1        X2       X3     US30
2017-01-23 -497.0058 -3.717498 174.8456 19799.85
2017-01-24 -498.6973 -5.271716 178.5616 19912.71
2017-01-25 -505.3037 -4.955875 181.6700 20068.51
2017-01-26 -507.5417 -4.710687 182.2582 20100.91
2017-01-27 -506.5475 -3.575603 180.2852 20093.78
2017-01-30 -502.8693 -3.291562 176.4937 19971.13
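
`cbind` pairs rows by position, so it relies on the index and the component data covering exactly the same dates. If that is not guaranteed, joining on the date row names is a safer alternative. A toy sketch in base R (the two small data frames are invented for illustration, apart from the two index closes copied from the output above):

```r
# Toy inputs: `a` has one more date than `b`
a <- data.frame(X1 = c(1.0, 2.0, 3.0),
                row.names = c("2017-01-23", "2017-01-24", "2017-01-25"))
b <- data.frame(US30 = c(19912.71, 20068.51),
                row.names = c("2017-01-24", "2017-01-25"))

# Inner join on row names keeps only the dates present in both frames
joined <- merge(a, b, by = "row.names")
names(joined)[1] <- "date"
joined$date <- as.character(joined$date)

print(joined)
```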

(2) Build the multiple linear regression model

> model<-lm(US30~X1+X2+X3,mydata)
> summary(model)

Call:
lm(formula = US30 ~ X1 + X2 + X3, data = mydata)

Residuals:
    Min      1Q  Median      3Q     Max 
-50.267  -7.866  -1.937  10.018  39.555 

Coefficients:
             Estimate Std. Error  t value Pr(>|t|)    
(Intercept) 5395.8202   218.4604   24.699  < 2e-16 ***
X1           -25.7840     0.2282 -113.006  < 2e-16 ***
X2             0.4522     0.8693    0.520    0.607    
X3             9.1254     1.0492    8.698 1.06e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 21.43 on 30 degrees of freedom
Multiple R-squared:  0.9977,    Adjusted R-squared:  0.9974 
F-statistic:  4282 on 3 and 30 DF,  p-value: < 2.2e-16
> model

Call:
lm(formula = US30 ~ X1 + X2 + X3, data = mydata)

Coefficients:
(Intercept)           X1           X2           X3  
  5395.8202     -25.7840       0.4522       9.1254

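The t-test shows that X2 is not significant for US30 (p = 0.607), so X2 is dropped. Reusing the estimates from the fit above (the model was not refit without X2), the linear regression equation is:

> US30 = 5395.8202 - 25.7840*X1 + 9.1254*X3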
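
Dropping a term from the printed equation is not the same as refitting: the intercept and the remaining slopes can shift once X2 is removed. A minimal sketch of refitting with `update()` and comparing the two models, using simulated data in place of `mydata` (the `demo` data frame and the coefficients used to simulate it are assumptions, loosely modeled on the output above):

```r
# Simulated stand-in for mydata: 34 rows, X2 has no true effect on US30
set.seed(3)
n <- 34
demo <- data.frame(X1 = rnorm(n, -500, 15),
                   X2 = rnorm(n, -4, 4),
                   X3 = rnorm(n, 178, 3))
demo$US30 <- 5400 - 25.8 * demo$X1 + 9.1 * demo$X3 + rnorm(n, sd = 20)

full    <- lm(US30 ~ X1 + X2 + X3, data = demo)
reduced <- update(full, . ~ . - X2)   # same model without X2, refit from scratch

print(coef(reduced))          # coefficients to use in the final equation
print(anova(reduced, full))   # F-test for whether X2 adds anything
```

On the real data, the same two lines applied to `model` would give the coefficients for the reduced equation.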