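Using the UScereal data set from the MASS package as an example, we examine whether the calories, fat, and sugar content of US cereals vary with shelf position, where 1 denotes the bottom shelf, 2 the middle shelf, and 3 the top shelf. Calories, fat, and sugars are the dependent variables; shelf is the independent variable with three levels (1, 2, 3).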
> library(MASS)
> attach(UScereal)
The following object is masked _by_ .GlobalEnv:

    shelf

The following objects are masked from UScereal (pos = 3):

    calories, carbo, fat, fibre, mfr, potassium, protein, shelf, sodium, sugars, vitamins

> shelf <- factor(shelf)             # convert shelf to a factor so it can serve as the grouping variable
> y <- cbind(calories, fat, sugars)  # combine the three dependent variables (calories, fat, sugars) into one matrix
> aggregate(y, by=list(shelf), FUN=mean)
  Group.1 calories       fat    sugars
1       1 119.4774 0.6621338  6.295493
2       2 129.8162 1.3413488 12.507670
3       3 180.1466 1.9449071 10.856821
> cov(y)                             # variances (diagonal) and covariances (off-diagonal)
           calories       fat     sugars
calories 3895.24210 60.674383 180.380317
fat        60.67438  2.713399   3.995474
sugars    180.38032  3.995474  34.050018
> fit <- manova(y ~ shelf)           # fit the model with manova(): a multivariate test of group differences
> summary(fit)
          Df Pillai approx F num Df den Df    Pr(>F)    
shelf      2 0.4021   5.1167      6    122 0.0001015 ***
Residuals 62                                            
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary.aov(fit)                   # univariate one-way ANOVA for each dependent variable
 Response calories :
            Df Sum Sq Mean Sq F value    Pr(>F)    
shelf        2  50435 25217.6  7.8623 0.0009054 ***
Residuals   62 198860  3207.4                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Response fat :
            Df Sum Sq Mean Sq F value  Pr(>F)  
shelf        2  18.44  9.2199  3.6828 0.03081 *
Residuals   62 155.22  2.5035                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Response sugars :
            Df  Sum Sq Mean Sq F value   Pr(>F)   
shelf        2  381.33 190.667  6.5752 0.002572 **
Residuals   62 1797.87  28.998                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The multivariate test is significant (Pillai = 0.4021, p = 0.0001015), so the three shelves differ on the combined set of measures; the follow-up univariate tests show that each of calories, fat, and sugars also differs across shelves.
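The summary above reports Pillai's trace, which is the default statistic for summary() on a manova fit. As a minimal sketch, the same model can be summarized with the other three statistics that summary.manova() accepts, to see whether the conclusion depends on the choice. The sketch rebuilds y and shelf directly from UScereal rather than relying on attach(); the variable names stat_names and wilks are only for illustration.

```r
library(MASS)  # provides the UScereal data set

# Rebuild the response matrix and grouping factor without attach()
y     <- as.matrix(UScereal[, c("calories", "fat", "sugars")])
shelf <- factor(UScereal$shelf)
fit   <- manova(y ~ shelf)

# The four multivariate test statistics supported by summary.manova()
stat_names <- c("Pillai", "Wilks", "Hotelling-Lawley", "Roy")
for (s in stat_names) {
  print(summary(fit, test = s))
}

# The table of statistics is also available as a matrix
wilks <- summary(fit, test = "Wilks")$stats
wilks["shelf", ]
```

If all four statistics lead to the same decision, the result is not sensitive to which one is reported.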
One-way MANOVA rests on two assumptions: multivariate normality, and homogeneity of the variance-covariance matrices.
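a. The first assumption is that the vector of dependent variables jointly follows a multivariate normal distribution. It can be checked with a Q-Q plot: if the data are multivariate normal, the squared Mahalanobis distances of the observations from the centroid follow approximately a chi-square distribution with p degrees of freedom (p being the number of dependent variables), so the points should fall along the line with slope 1 and intercept 0.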
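center <- colMeans(y)                  # centroid of the dependent variables
n <- nrow(y)                           # number of observations
p <- ncol(y)                           # number of dependent variables
cov <- cov(y)                          # covariance matrix (the name shadows cov(), as in the original)
d <- mahalanobis(y, center, cov)       # squared Mahalanobis distance of each observation
coord <- qqplot(qchisq(ppoints(n), df=p), d)   # chi-square quantiles vs. observed distances
abline(a=0, b=1)                       # reference line: points on it are consistent with normality
identify(coord$x, coord$y, labels=row.names(UScereal))   # interactively label points in the plot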
Judging from the plot, two observations appear to violate multivariate normality; they can be removed and the analysis rerun.
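A rough sketch of that rerun follows. It assumes the two suspect points are the two observations with the largest squared Mahalanobis distances, which is an assumption made here for illustration; in practice, use the points actually picked out with identify(). The names drop_idx, y_trim, shelf_trim, and fit_trim are hypothetical.

```r
library(MASS)  # provides the UScereal data set

y     <- as.matrix(UScereal[, c("calories", "fat", "sugars")])
shelf <- factor(UScereal$shelf)

# Squared Mahalanobis distance of each cereal from the centroid
d <- mahalanobis(y, colMeans(y), cov(y))

# Assumption: the two most extreme distances are the two suspect points
drop_idx <- order(d, decreasing = TRUE)[1:2]
rownames(y)[drop_idx]          # which cereals are being dropped

# Refit the MANOVA without them
y_trim     <- y[-drop_idx, ]
shelf_trim <- shelf[-drop_idx]
fit_trim   <- manova(y_trim ~ shelf_trim)
summary(fit_trim)
```

Comparing summary(fit_trim) with the original summary shows whether the conclusion hinges on those two cereals.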
b. Homogeneity of the variance-covariance matrices means that the covariance matrices of all groups are equal.
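The post does not show a test for this assumption. One common choice is Box's M test, which is not part of base R; the sketch below computes it by hand with the usual chi-square approximation, so treat it as an illustration rather than a reference implementation. All names introduced here (S_i, S_p, logdet, M, c1, and so on) are hypothetical. Box's M is known to be sensitive to departures from normality, so its result should be read with caution.

```r
library(MASS)  # provides the UScereal data set

y     <- as.matrix(UScereal[, c("calories", "fat", "sugars")])
shelf <- factor(UScereal$shelf)

p   <- ncol(y)                    # number of dependent variables
g   <- nlevels(shelf)             # number of groups
n_i <- as.vector(table(shelf))    # group sizes, in the order of levels(shelf)
N   <- sum(n_i)                   # total number of observations

# Covariance matrix within each shelf
S_i <- lapply(levels(shelf), function(k) cov(y[shelf == k, , drop = FALSE]))

# Pooled within-group covariance matrix
S_p <- Reduce(`+`, Map(function(S, n) (n - 1) * S, S_i, n_i)) / (N - g)

# Log-determinant helper
logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)

# Box's M statistic and its chi-square approximation
M  <- (N - g) * logdet(S_p) - sum((n_i - 1) * sapply(S_i, logdet))
c1 <- (sum(1 / (n_i - 1)) - 1 / (N - g)) *
      (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
chisq   <- (1 - c1) * M
df      <- p * (p + 1) * (g - 1) / 2
p_value <- pchisq(chisq, df, lower.tail = FALSE)

c(M = M, chisq = chisq, df = df, p_value = p_value)
```

A small p-value suggests the group covariance matrices differ; printing S_i directly is a quick way to see where.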