分位數迴歸(Quantile Regression)是計量經濟學的研究前沿方向之一,它利用解釋變量的多個分位數(例如四分位、十分位、百分位等)來獲得被解釋變量的條件分佈的相應的分位數方程。html
與傳統的OLS只獲得均值方程相比,分位數迴歸能夠更詳細地描述變量的統計分佈。它是給定迴歸變量X,估計響應變量Y條件分位數的一個基本方法;它不只能夠度量回歸變量在分佈中心的影響,並且還能夠度量在分佈上尾和下尾的影響,所以較之經典的最小二乘迴歸具備獨特的優點。衆所周知,經典的最小二乘迴歸是針對因變量的均值(指望)的:模型反映了因變量的均值怎樣受自變量的影響,
web
quantreg即:Quantile Regression,擁有條件分位數模型的估計和推斷方法,包括線性、非線性和非參模型;處理單變量響應的條件分位數方法;處理刪失數據的幾種方法。此外,還包括基於預期風險損失的投資組合選擇方法。dom
library(quantreg) # 載入quantreg包 data(engel) # 加載quantreg包自帶的數據集 分位數迴歸(tau = 0.5) fit1 = rq(foodexp ~ income, tau = 0.5, data = engel) r1 = resid(fit1) # 獲得殘差序列,並賦值爲變量r1 c1 = coef(fit1) # 獲得模型的係數,並賦值給變量c1 summary(fit1) # 顯示分位數迴歸的模型和係數 ` Call: rq(formula = foodexp ~ income, tau = 0.5, data = engel) tau: [1] 0.5 Coefficients: coefficients lower bd upper bd (Intercept) 81.48225 53.25915 114.01156 income 0.56018 0.48702 0.60199 ` summary(fit1, se = "boot") # 經過設置參數se,能夠獲得係數的假設檢驗 ` Call: rq(formula = foodexp ~ income, tau = 0.5, data = engel) tau: [1] 0.5 Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) 81.48225 27.57092 2.95537 0.00344 income 0.56018 0.03507 15.97392 0.00000 ` 分位數迴歸(tau = 0.五、0.75) fit1 = rq(foodexp ~ income, tau = 0.5, data = engel) fit2 = rq(foodexp ~ income, tau = 0.75, data = engel) 模型比較 anova(fit1,fit2) #方差分析 ` Quantile Regression Analysis of Deviance Table Model: foodexp ~ income Joint Test of Equality of Slopes: tau in { 0.5 0.75 } Df Resid Df F value Pr(>F) 1 1 469 12.093 0.0005532 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 ` 畫圖比較分析 plot(engel$foodexp , engel$income,pch=20, col = "#2E8B57", main = "家庭收入與食品支出的分位數迴歸",xlab="食品支出",ylab="家庭收入") lines(fitted(fit1), engel$income,lwd=2, col = "#EEEE00") lines(fitted(fit2), engel$income,lwd=2, col = "#EE6363") legend("topright", c("tau=.5","tau=.75"), lty=c(1,1), col=c("#EEEE00","#EE6363"))
不一樣分位點的迴歸比較 fit = rq(foodexp ~ income, tau = c(0.05,0.25,0.5,0.75,0.95), data = engel) plot( summary(fit))
data(barro) fit1 <- rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 + gcony2, data = barro,tau=.25) fit2 <- rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 + gcony2, data = barro,tau=.50) fit3 <- rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 + gcony2, data = barro,tau=.75) # 替代方式 fit <- rq(y.net ~ lgdp2 + fse2 + gedy2 + Iy2 + gcony2, method = "fn", tau = 1:4/5, data = barro) anova(fit1,fit2,fit3) # 不一樣分位點模型比較-方差分析 anova(fit1,fit2,fit3,joint=FALSE) ` Quantile Regression Analysis of Deviance Table Model: y.net ~ lgdp2 + fse2 + gedy2 + Iy2 + gcony2 Tests of Equality of Distinct Slopes: tau in { 0.25 0.5 0.75 } Df Resid Df F value Pr(>F) lgdp2 2 481 1.0656 0.34535 fse2 2 481 2.6398 0.07241 . gedy2 2 481 0.7862 0.45614 Iy2 2 481 0.0447 0.95632 gcony2 2 481 0.0653 0.93675 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Warning message: In summary.rq(x, se = se, covariance = TRUE) : 1 non-positive fis ` 不一樣分位點擬合曲線的比較 plot(barro$y.net,pch=20, col = "#2E8B57", main = "不一樣分位點擬合曲線的比較") lines(fitted(fit1),lwd=2, col = "#FF00FF") lines(fitted(fit2),lwd=2, col = "#EEEE00") lines(fitted(fit3),lwd=2, col = "#EE6363") legend("topright", c("tau=.25","tau=.50","tau=.75"), lty=c(1,1), col=c( "#FF00FF","#EEEE00","#EE6363"))
library(zoo) data("AirPassengers", package = "datasets") ap <- log(AirPassengers) fm <- dynrq(ap ~ trend(ap) + season(ap), tau = 1:4/5) ` Dynamic quantile regression "matrix" data: Start = 1949(1), End = 1960(12) Call: dynrq(formula = ap ~ trend(ap) + season(ap), tau = 1:4/5) Coefficients: tau= 0.2 tau= 0.4 tau= 0.6 tau= 0.8 (Intercept) 4.680165533 4.72442529 4.756389747 4.763636251 trend(ap) 0.122068032 0.11807467 0.120418846 0.122603451 season(ap)Feb -0.074408403 -0.02589716 -0.006661952 -0.013385535 season(ap)Mar 0.082349382 0.11526821 0.114939193 0.106390507 season(ap)Apr 0.062351869 0.07079315 0.063283042 0.066870808 season(ap)May 0.064763333 0.08453454 0.069344618 0.087566554 season(ap)Jun 0.195099116 0.19998275 0.194786890 0.192013960 season(ap)Jul 0.297796876 0.31034824 0.281698714 0.326054871 season(ap)Aug 0.287624540 0.30491687 0.290142727 0.275755490 season(ap)Sep 0.140938329 0.14399906 0.134373833 0.151793646 season(ap)Oct 0.002821207 0.01175582 0.013443965 0.002691383 season(ap)Nov -0.154101220 -0.12176290 -0.124004759 -0.136538575 season(ap)Dec -0.031548941 -0.01893221 -0.023048200 -0.019458814 Degrees of freedom: 144 total; 131 residual ` sfm <- summary(fm) plot(sfm)
不一樣分位點擬合曲線的比較 fm1 <- dynrq(ap ~ trend(ap) + season(ap), tau = .25) fm2 <- dynrq(ap ~ trend(ap) + season(ap), tau = .50) fm3 <- dynrq(ap ~ trend(ap) + season(ap), tau = .75) plot(ap,cex = .5,lwd=2, col = "#EE2C2C",main = "時間序列分位數迴歸") lines(fitted(fm1),lwd=2, col = "#1874CD") lines(fitted(fm2),lwd=2, col = "#00CD00") lines(fitted(fm3),lwd=2, col = "#CD00CD") legend("topright", c("原始擬合","tau=.25","tau=.50","tau=.75"), lty=c(1,1), col=c( "#EE2C2C","#1874CD","#00CD00","#CD00CD"),cex = 0.65)
除了本文介紹的以上內容,quantreg包還包含殘差形態的檢驗、非線性分位數迴歸和半參數和非參數分位數迴歸等等,詳細參見:用R語言進行分位數迴歸-詹鵬(北京師範大學經濟管理學院)和Package ‘quantreg’。函數