D01 - R Language Basics

Date: 2019-12-06

Tags: d01, language basics



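Outline (2019-04-10):

  1. Downloading and installing R

  2. Installing and using R packages

    (1) Listing installed packages

    (2) Checking whether a package is installed

    (3) Installing packages

    (4) Updating packages

  3. Reusing results

  4. Handling large data sets in R

  5. R data structures

    (1) Vectors

    (2) Matrices

    (3) Arrays

    (4) Data frames

    (5) Lists

  6. Worked example

  7. Summary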

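1 Downloading and installing R

R is a language and environment for statistical analysis and graphics. It is free, open-source software that belongs to the GNU project, and it is an excellent tool for statistical computing and statistical plotting.

To learn it, download it first! Without further ado, here is the link:

Windows mirror:   http://mirror.fcaglp.unlp.edu.ar/CRAN/

Linux and Mac versions are of course available too.

Installation needs little explanation: just click Next, Next, Next. Just don't forget to change the installation path!!!

Want to play around with something first?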

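>demo()              # list the available demos
>demo(graphics)      # run the graphics demo
>help.start()        # open the HTML help index
>help("mean")        # help page for mean()
>?mean               # shorthand for help("mean")
>getwd()             # show the current working directory
>setwd("path")       # change the working directory ("path" is a placeholder)
>history()           # show recent commands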

After trying these, R feels a bit like Linux and Matlab. R's predecessor is said to be the S language. What is S? https://baike.baidu.com/item/S%E8%AF%AD%E8%A8%80

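2 Installing and using R packages

(1) List the installed packages.

First, once R has been installed as described in section 1, open the program. Typing library() in the R console lists the packages that are currently installed.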

>library()

Packages in library 'F:/R/R-3.5.3/library':

abind           Combine Multidimensional Arrays
assertthat      Easy Pre and Post Assertions
base            The R Base Package
BH              Boost C++ Header Files
boot            Bootstrap Functions (Originally by Angelo Canty for S)
car             Companion to Applied Regression
carData         Companion to Applied Regression Data Sets
cellranger      Translate Spreadsheet Cell Ranges to Rows and Columns
class           Functions for Classification
cli             Helpers for Developing Command Line Interfaces
clipr           Read and Write from the System Clipboard
cluster         "Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.
codetools       Code Analysis Tools for R
compiler        The R Compiler Package
crayon          Colored Terminal Output
curl            A Modern and Flexible Web Client for R
data.table      Extension of `data.frame`
datasets        The R Datasets Package
ellipsis        Tools for Working with ...
fansi           ANSI Control Sequence Aware String Functions
forcats         Tools for Working with Categorical Variables (Factors)
foreign         Read Data Stored by 'Minitab', 'S', 'SAS', 'SPSS', 'Stata', 'Systat', 'Weka', 'dBase', ...
graphics        The R Graphics Package
grDevices       The R Graphics Devices and Support for Colours and Fonts
grid            The Grid Graphics Package
haven           Import and Export 'SPSS', 'Stata' and 'SAS' Files
hms             Pretty Time of Day
KernSmooth      Functions for Kernel Smoothing Supporting Wand & Jones (1995)
lattice         Trellis Graphics for R
lme4            Linear Mixed-Effects Models using 'Eigen' and S4
magrittr        A Forward-Pipe Operator for R
maptools        Tools for Handling Spatial Objects
MASS            Support Functions and Datasets for Venables and Ripley's MASS
Matrix          Sparse and Dense Matrix Classes and Methods
MatrixModels    Modelling with Sparse And Dense Matrices
methods         Formal Methods and Classes
mgcv            Mixed GAM Computation Vehicle with Automatic Smoothness Estimation
minqa           Derivative-free optimization algorithms by quadratic approximation
nlme            Linear and Nonlinear Mixed Effects Models
nloptr          R Interface to NLopt
nnet            Feed-Forward Neural Networks and Multinomial Log-Linear Models
openxlsx        Read, Write and Edit XLSX Files
parallel        Support for Parallel computation in R
pbkrtest        Parametric Bootstrap and Kenward Roger Based Methods for Mixed Model Comparison
pillar          Coloured Formatting for Columns
pkgconfig       Private Configuration for 'R' Packages
prettyunits     Pretty, Human Readable Formatting of Quantities
progress        Terminal Progress Bars
quantreg        Quantile Regression
R6              Encapsulated Classes with Reference Semantics
Rcpp            Seamless R and C++ Integration
RcppEigen       'Rcpp' Integration for the 'Eigen' Templated Linear Algebra Library
readr           Read Rectangular Text Data
readxl          Read Excel Files
rematch         Match Regular Expressions with a Nicer 'API'
rio             A Swiss-Army Knife for Data I/O
rlang           Functions for Base Types and Core R and 'Tidyverse' Features
rpart           Recursive Partitioning and Regression Trees
sp              Classes and Methods for Spatial Data
SparseM         Sparse Linear Algebra
spatial         Functions for Kriging and Point Pattern Analysis
splines         Regression Spline Functions and Classes
stats           The R Stats Package
stats4          Statistical Functions using S4 Classes
survival        Survival Analysis
tcltk           Tcl/Tk Interface
tibble          Simple Data Frames
tools           Tools for Package Development
translations    The R Translations Package
utf8            Unicode Text Processing
utils           The R Utils Package
zip             Cross-Platform 'zip' Compression


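(2) Check whether a package is already installed

>help(package="car")        # car is the name of a specific package

If the package is installed, R opens its detailed documentation as a web page served from a local port (12569 on my machine). If it is not installed, go ahead and install it~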
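The same check can be done from code without opening the help pages. The sketch below is one possible way to do it; `is_installed` is a hypothetical helper name that does not appear in the original post, and it relies only on base R's `installed.packages()`.

```r
# Hypothetical helper: TRUE if a package with this name is installed.
is_installed <- function(pkg) {
  pkg %in% rownames(installed.packages())
}

is_installed("stats")   # stats ships with R, so this is TRUE
is_installed("car")     # TRUE only if car has been installed on this machine
```

`installed.packages()` scans every library directory, so on a machine with many packages it can take a moment.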

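(3) Install a package

During installation you will be asked to choose a mirror. Pick one in China, and the rest depends on how well your network cooperates~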

install.packages("car")
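To skip the mirror prompt and avoid reinstalling something that is already there, the call can be wrapped as sketched below. `install_if_missing` is a hypothetical helper name, and the default mirror URL is an assumption; substitute whichever CRAN mirror is closest to you.

```r
# Hypothetical helper: install a package only when it is not already present.
# The repos default is an assumed mirror; any CRAN mirror URL works here.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!pkg %in% rownames(installed.packages())) {
    install.packages(pkg, repos = repos)
  }
  # Report (invisibly) whether the package is available afterwards.
  invisible(pkg %in% rownames(installed.packages()))
}

install_if_missing("stats")   # already part of R, so nothing is downloaded
```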

(4) Update packages

update.packages()    # with no package specified, all packages are updated by default

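3 Reusing results

>head(mtcars)                                # mtcars is a built-in data set
>lm(mpg~wt, data=mtcars)                     # lm fits a linear model
>Result = lm(mpg~wt, data=mtcars)            # store the fit so it can be reused
>summary(Result)
>plot(Result)
>predict(Result, mynewdata)                  # mynewdata holds the values you want to predict for (a data frame with a wt column)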

Don't worry if a lot of this is unclear; it will be explained in detail later.~~
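The `predict()` call above only works once `mynewdata` exists. Here is a minimal sketch, assuming we want predictions for two cars weighing 3,000 and 4,000 lb; `fit`, `new_cars` and `pred` are illustrative names. The new data has to be a data frame whose column is named `wt`, matching the predictor in the formula.

```r
# Fit once, keep the result in a variable, then reuse it.
fit <- lm(mpg ~ wt, data = mtcars)

coef(fit)                      # intercept and slope of the fitted line

# In mtcars, wt is recorded in units of 1000 lb.
new_cars <- data.frame(wt = c(3, 4))
pred <- predict(fit, newdata = new_cars)
pred                           # one predicted mpg per row of new_cars
```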

4 Handling large data sets in R

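(1) R has packages dedicated to large-data analysis. For example, biglm() can fit linear models to large data sets in a memory-efficient way.

(2) R can be combined with big-data platforms, for example RHadoop, RHive and RHIPE.

An R data set is usually a rectangular array of data in which rows are records and columns are attributes (fields). The data can come from Excel, txt, SAS or MySQL.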
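As a small illustration of the "rows are records, columns are fields" layout, the sketch below writes a tiny comma-separated text file to a temporary location and reads it back with base R's `read.csv()`. The file contents and the name `records` are made up for this example.

```r
# Create a throwaway text file holding two records with three fields each.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name,score",
             "1,Ann,90.5",
             "2,Bob,82.0"), tmp)

records <- read.csv(tmp)
str(records)       # one row per record, one column per field
nrow(records)      # number of records
ncol(records)      # number of fields

unlink(tmp)        # remove the temporary file
```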

If you are interested in databases, have a look at: What was the most popular database of 2019? https://mp.weixin.qq.com/s/9fhPicVCjMpfMmjbhZUoFA

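5 R data structures

Without further ado, these are easier to understand through code.

(1) Vectors

The elements of a vector can be numeric, character or logical. But what happens when numeric and character values are mixed in one vector? Try it yourself!!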

>a <- c(1,3,5,7,2,-4)
>b <- c("one","two","three")
>c <- c(TRUE,TRUE,FALSE)
>d <- c(1,3,5,"ONE")
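For the mixed case `d` above: a vector holds a single type, so R converts every element to the most general type present, which here is character. A short sketch (the vector `e` is an extra illustration that is not in the original):

```r
d <- c(1, 3, 5, "ONE")
class(d)            # "character"
d                   # the numbers have become the strings "1" "3" "5"

# Logical values mixed with numbers become numbers (TRUE is 1, FALSE is 0).
e <- c(TRUE, FALSE, 2)
class(e)            # "numeric"
```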

In addition, slicing looks a bit like Python's, though R indices start at 1.

>d[c(1,3,4)]
>d[3]
>d[1:3]
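A few differences from Python slicing are worth seeing side by side. The sketch below reuses the numeric vector `a` defined earlier:

```r
a <- c(1, 3, 5, 7, 2, -4)

a[1]          # first element: indices start at 1, not 0
a[2:4]        # both ends of the range are included: 3 5 7
a[-1]         # a negative index drops that element rather than counting from the end
a[a > 2]      # a logical condition keeps the matching elements: 3 5 7
```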

(2) Matrices　　matrix

>?matrix
>y <- matrix(5:24, nrow=4, ncol=5)
>x <- c(2,45,68,94)
>rnames <- c("R1","R2")
>cnames <- c("C1","C2")
>newMatrix <- matrix(x, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rnames,cnames))
>newMatrix <- matrix(x, nrow=2, ncol=2, dimnames=list(rnames,cnames))        # filled by column by default

>y[3,]        # row 3 of the 4x5 matrix y
>y[2,3]       # the element at row 2, column 3
>y[,4]        # column 4
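The effect of `byrow` is easiest to see by building the same matrix both ways and comparing one cell. In the sketch below, `by_col` and `by_row` are illustrative names; the indexing lines use the 4x5 matrix `y`, since a plain vector such as `x` cannot be indexed with two subscripts.

```r
x <- c(2, 45, 68, 94)

by_col <- matrix(x, nrow = 2, ncol = 2)                 # default: fill column by column
by_row <- matrix(x, nrow = 2, ncol = 2, byrow = TRUE)   # fill row by row

by_col[1, 2]   # 68
by_row[1, 2]   # 45

y <- matrix(5:24, nrow = 4, ncol = 5)
y[3, ]         # the whole third row: 7 11 15 19 23
y[2, 3]        # row 2, column 3: 14
dim(y)         # 4 5
```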

(3) Arrays　　array

>?array
>dim1 <- c("A1","A2", "A3")
>dim2 <- c("B1", "B2")
>dim3 <- c("C1","C2", "C3", "C4")
>d <- array(1:24, c(3,2,4), dimnames=list(dim1,dim2,dim3))
>d[1,2,3]

# Output
> d
, , C1

   B1 B2
A1  1  4
A2  2  5
A3  3  6

, , C2

   B1 B2
A1  7 10
A2  8 11
A3  9 12

, , C3

   B1 B2
A1 13 16
A2 14 17
A3 15 18

, , C4

   B1 B2
A1 19 22
A2 20 23
A3 21 24

> d[1,2,3]
[1] 16
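Because the array has dimension names, elements can also be picked by name instead of position. A small sketch; note that the third dimension has four levels, so it needs four names:

```r
d <- array(1:24, dim = c(3, 2, 4),
           dimnames = list(c("A1", "A2", "A3"),
                           c("B1", "B2"),
                           c("C1", "C2", "C3", "C4")))

dim(d)                    # 3 2 4
d["A1", "B2", "C3"]       # the same element as d[1, 2, 3]: 16
d[, , "C1"]               # one 3 x 2 slice of the array
```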


(4) Data frames　　data.frame()

>patientID <- c(1,2,3,4)
>age <- c(25,34,28,52)
>diabetes <- c("Type1", "Type2", "Type3", "Type2")
>status <- c("poor", "Improved", "Excellent", "poor")
>patientData <- data.frame(patientID, age, diabetes, status)

> patientData
  patientID age diabetes    status
1         1  25    Type1      poor
2         2  34    Type2  Improved
3         3  28    Type3 Excellent
4         4  52    Type2      poor

>patientData[1:2]
>patientData[c("diabetes","status")]
>patientData$age　　
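# Typing age on its own also works, but only because a standalone vector named age was created before the data frame was built. What if there were no such vector?

# Here is an example
>head(mtcars)
>mtcars$mpg
>mpg
# Why the error? Because mpg exists only as a column of mtcars, not as an object on R's search path. attach() puts the columns of a data frame on the search path, and detach() removes them again.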

>attach(mtcars)
>mpg
>detach(mtcars)
>mpg            # fails again: mtcars is no longer attached
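An attached data frame can be masked by workspace variables of the same name, so it is worth knowing an alternative: base R's `with()` evaluates a single expression inside a data frame without touching the search path. A small sketch (`avg_mpg` is an illustrative name):

```r
# Evaluate an expression using the columns of mtcars directly.
avg_mpg <- with(mtcars, mean(mpg))
avg_mpg

# The equivalent call with an explicit $.
mean(mtcars$mpg)
```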

# Factors
> diabetes <- factor(diabetes)
> diabetes
[1] Type1 Type2 Type3 Type2
Levels: Type1 Type2 Type3
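A factor stores each value as an integer code pointing into its set of levels. The sketch below shows a few base R functions for inspecting that:

```r
diabetes <- factor(c("Type1", "Type2", "Type3", "Type2"))

levels(diabetes)       # the distinct categories, sorted alphabetically
as.integer(diabetes)   # the internal integer codes: 1 2 3 2
table(diabetes)        # number of observations per level
```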

(5) Lists　　list

> g <- "My first list"
> h <- c(12,23,34)
> j <- c("one","two","three")
> k <- matrix(1:10, nrow=2)
> mylist <- list(g,h,j,k)

> mylist
[[1]]
[1] "My first list"

[[2]]
[1] 12 23 34

[[3]]
[1] "one"   "two"   "three"

[[4]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

However, lists are indexed a little differently. Double square brackets!!!

>mylist[[2]]
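The difference between single and double brackets is that `[ ]` returns a smaller list while `[[ ]]` returns the element itself. In the sketch below, the element names `title`, `ages` and `m` are assumptions added for illustration; the original list is unnamed.

```r
mylist <- list(title = "My first list",
               ages  = c(12, 23, 34),
               m     = matrix(1:10, nrow = 2))

mylist[2]             # single brackets: still a list, of length 1
mylist[[2]]           # double brackets: the numeric vector itself
mylist$ages           # the same element, selected by name
mylist[["m"]][2, 3]   # row 2, column 3 of the stored matrix: 6
```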

6 Worked example

>age <- c(1,3,5,2,11,9,3,9,12,3)
>weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
>mean(weight)        # mean
>sd(weight)          # standard deviation (use var(weight) for the variance)
>cor(age, weight)    # correlation
>plot(age,weight)
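Note that `sd()` returns the standard deviation, and the variance comes from `var()`. The sketch below checks that relationship on the same data and, as an optional extra step that is not in the original, fits the straight line that the correlation suggests (`fit` is an illustrative name):

```r
age    <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)

mean(weight)     # 7.06
sd(weight)       # standard deviation
var(weight)      # variance, equal to sd(weight)^2

# Linear fit of weight on age, using lm() as in section 3.
fit <- lm(weight ~ age)
coef(fit)        # intercept and slope
```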