R Language Basics: D01
20190410 Outline:
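1. Downloading and installing R
2. Installing and using R packages
(1) Listing installed packages
(2) Checking whether a package is installed
(3) Installing a package
(4) Updating packages
3. Reusing results
4. Handling large datasets in R
5. R data structures
(1) Vectors
(2) Matrices
(3) Arrays
(4) Data frames
(5) Lists
6. Worked example
7. Summary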
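1 Downloading and installing R
R is a language and environment for statistical analysis and graphics. It is free, open-source software that is part of the GNU project, and an excellent tool for statistical computing and statistical graphics.
To learn it, download it first! Without further ado, here is the link:
Windows mirror: http://mirror.fcaglp.unlp.edu.ar/CRAN/
There are Linux and Mac versions too, of course.
Installation needs little explanation: click Next, Next, Next. Just don't forget to change the install path!!!
Want to play around with something first?
> demo()
> demo(graphics)
> help.start()
> help("mean")
> ?mean
> getwd()
> setwd("path")
> history()
After trying these, R feels a bit like Linux and Matlab. R's predecessor is said to be the S language. What is the S language? https://baike.baidu.com/item/S%E8%AF%AD%E8%A8%80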
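2 Installing and using R packages
(1) Listing installed packages.
First, once R is installed as described in section 1, open it. Entering library() in the R console lists the packages that are currently installed.
> library()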
Packages in library 'F:/R/R-3.5.3/library':

abind          Combine Multidimensional Arrays
assertthat     Easy Pre and Post Assertions
base           The R Base Package
BH             Boost C++ Header Files
boot           Bootstrap Functions (Originally by Angelo Canty for S)
car            Companion to Applied Regression
carData        Companion to Applied Regression Data Sets
cellranger     Translate Spreadsheet Cell Ranges to Rows and Columns
class          Functions for Classification
cli            Helpers for Developing Command Line Interfaces
clipr          Read and Write from the System Clipboard
cluster        "Finding Groups in Data": Cluster Analysis Extended Rousseeuw et al.
codetools      Code Analysis Tools for R
compiler       The R Compiler Package
crayon         Colored Terminal Output
curl           A Modern and Flexible Web Client for R
data.table     Extension of `data.frame`
datasets       The R Datasets Package
ellipsis       Tools for Working with ...
fansi          ANSI Control Sequence Aware String Functions
forcats        Tools for Working with Categorical Variables (Factors)
foreign        Read Data Stored by 'Minitab', 'S', 'SAS', 'SPSS', 'Stata', 'Systat', 'Weka', 'dBase', ...
graphics       The R Graphics Package
grDevices      The R Graphics Devices and Support for Colours and Fonts
grid           The Grid Graphics Package
haven          Import and Export 'SPSS', 'Stata' and 'SAS' Files
hms            Pretty Time of Day
KernSmooth     Functions for Kernel Smoothing Supporting Wand & Jones (1995)
lattice        Trellis Graphics for R
lme4           Linear Mixed-Effects Models using 'Eigen' and S4
magrittr       A Forward-Pipe Operator for R
maptools       Tools for Handling Spatial Objects
MASS           Support Functions and Datasets for Venables and Ripley's MASS
Matrix         Sparse and Dense Matrix Classes and Methods
MatrixModels   Modelling with Sparse And Dense Matrices
methods        Formal Methods and Classes
mgcv           Mixed GAM Computation Vehicle with Automatic Smoothness Estimation
minqa          Derivative-free optimization algorithms by quadratic approximation
nlme           Linear and Nonlinear Mixed Effects Models
nloptr         R Interface to NLopt
nnet           Feed-Forward Neural Networks and Multinomial Log-Linear Models
openxlsx       Read, Write and Edit XLSX Files
parallel       Support for Parallel computation in R
pbkrtest       Parametric Bootstrap and Kenward Roger Based Methods for Mixed Model Comparison
pillar         Coloured Formatting for Columns
pkgconfig      Private Configuration for 'R' Packages
prettyunits    Pretty, Human Readable Formatting of Quantities
progress       Terminal Progress Bars
quantreg       Quantile Regression
R6             Encapsulated Classes with Reference Semantics
Rcpp           Seamless R and C++ Integration
RcppEigen      'Rcpp' Integration for the 'Eigen' Templated Linear Algebra Library
readr          Read Rectangular Text Data
readxl         Read Excel Files
rematch        Match Regular Expressions with a Nicer 'API'
rio            A Swiss-Army Knife for Data I/O
rlang          Functions for Base Types and Core R and 'Tidyverse' Features
rpart          Recursive Partitioning and Regression Trees
sp             Classes and Methods for Spatial Data
SparseM        Sparse Linear Algebra
spatial        Functions for Kriging and Point Pattern Analysis
splines        Regression Spline Functions and Classes
stats          The R Stats Package
stats4         Statistical Functions using S4 Classes
survival       Survival Analysis
tcltk          Tcl/Tk Interface
tibble         Simple Data Frames
tools          Tools for Package Development
translations   The R Translations Package
utf8           Unicode Text Processing
utils          The R Utils Package
zip            Cross-Platform 'zip' Compression
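(2) Checking whether a package is installed
> help(package="car") # car is the name of the specific package
If it is installed, R opens the package's documentation in the browser, served from a local port (12569 in this case). If not, go ahead and install it~
(3) Installing a package
When you install a package, R asks you to choose a mirror. Pick one in China, and the rest depends on how well your network cooperates~
> install.packages("car")
(4) Updating packages
> update.packages() # with no packages specified, all packages are updated by default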
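The check-then-install steps above can be sketched as a small script. The helper names is_installed and ensure_packages are hypothetical, made up for this illustration; only installed.packages() and install.packages() are actual R functions.

```r
# Hypothetical helper: TRUE for each package name that is already installed.
is_installed <- function(pkgs) {
  pkgs %in% rownames(installed.packages())
}

# Hypothetical helper: install only the packages that are missing.
ensure_packages <- function(pkgs) {
  to_install <- pkgs[!is_installed(pkgs)]
  if (length(to_install) > 0) {
    install.packages(to_install)  # needs network access and a CRAN mirror
  }
  invisible(to_install)           # return what had to be installed
}

is_installed(c("stats", "utils"))        # both ship with every R installation
missing_now <- ensure_packages("stats")  # already present, so nothing is installed
```

Calling ensure_packages("car") at the top of a script would then install car only on machines that do not have it yet.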
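3 Reusing results
> head(mtcars)                           # mtcars is a built-in dataset
> lm(mpg ~ wt, data = mtcars)            # lm fits a linear model
> Result <- lm(mpg ~ wt, data = mtcars)  # store the fit so it can be reused
> summary(Result)
> plot(Result)
> predict(Result, mynewdata)             # mynewdata is your own data frame (with a wt column) of values to predict for
Don't worry if much of this is unclear; it will all be explained in detail later~~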
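To make the reuse idea concrete, here is a minimal sketch that stores the fit once and then queries it several times. The data frame new_cars and its weights are made-up values for illustration, not data from the original post.

```r
# Fit once and keep the model object.
fit <- lm(mpg ~ wt, data = mtcars)

# Reuse 1: pull out the estimated coefficients.
coefs <- coef(fit)

# Reuse 2: pull a single statistic out of the summary.
r2 <- summary(fit)$r.squared

# Reuse 3: predict for new observations.
# new_cars is hypothetical; it only needs a column named wt.
new_cars <- data.frame(wt = c(2.5, 3.5))
pred <- predict(fit, newdata = new_cars)

print(coefs)
print(pred)
```

Because the fitted model is an ordinary R object, none of these steps has to refit anything.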
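4 Handling large datasets in R
(1) R has packages dedicated to analysing large data. For example, biglm() can fit linear models to large datasets in a memory-efficient way.
(2) R can be combined with big-data platforms, for example RHadoop, RHive and RHIPE.
An R dataset is usually a rectangular array of data, where rows are records and columns are attributes (fields). The data can come from Excel, txt, SAS, MySQL and so on.
If you are interested in databases, have a look at: What is the most popular database of 2019? https://mp.weixin.qq.com/s/9fhPicVCjMpfMmjbhZUoFA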
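To give a feel for what "memory-efficient" fitting means, here is a base-R sketch that fits mpg ~ wt while looking at only a few rows at a time. It accumulates the normal equations chunk by chunk. This is a simplification chosen for illustration and not how biglm is implemented. The chunk size of 8 and the use of mtcars as stand-in "big" data are assumptions.

```r
chunk_size <- 8                       # assumed chunk size, for illustration only
n <- nrow(mtcars)
chunks <- split(seq_len(n), ceiling(seq_len(n) / chunk_size))

xtx <- matrix(0, 2, 2)                # running sum of X'X
xty <- matrix(0, 2, 1)                # running sum of X'y

for (idx in chunks) {
  chunk <- mtcars[idx, ]              # only this chunk is needed at this step
  X <- cbind(1, chunk$wt)             # design matrix: intercept and wt
  xtx <- xtx + crossprod(X)           # X'X for this chunk
  xty <- xty + crossprod(X, chunk$mpg)  # X'y for this chunk
}

beta_chunked <- drop(solve(xtx, xty))                   # solve the normal equations
beta_full <- unname(coef(lm(mpg ~ wt, data = mtcars)))  # reference fit on all rows
print(rbind(beta_chunked, beta_full))
```

With real large data, each chunk would be read from a file or database instead of sliced from an in-memory data frame.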
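As a small illustration of the rectangular layout, the sketch below writes a tiny table to a temporary CSV file and reads it back. The records data frame is made-up data; reading Excel, SAS or MySQL sources needs additional packages and is not shown here.

```r
# Made-up records: each row is a record, each column a field.
records <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  score = c(90.5, 82, 77.25)
)

path <- tempfile(fileext = ".csv")           # temporary file, removed below
write.csv(records, path, row.names = FALSE)  # save as plain text

loaded <- read.csv(path, stringsAsFactors = FALSE)
str(loaded)                                  # 3 observations of 3 variables
unlink(path)                                 # clean up
```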
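5 R data structures
Without further ado, this is easier to understand through code.
(1) Vectors
The elements of a vector can be numeric, character or logical. But what happens when numeric and character values are mixed in one vector? Try it yourself!!
> a <- c(1,3,5,7,2,-4)
> b <- c("one","two","three")
> c <- c(TRUE,TRUE,FALSE)
> d <- c(1,3,5,"ONE")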
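For readers who want to check the answer to the mixing question, here is a minimal sketch. The variable names are made up for this example.

```r
# Mixing numbers and strings: everything is coerced to character.
mixed <- c(1, 3, 5, "ONE")
class(mixed)        # "character"

# Mixing numbers and logicals: TRUE/FALSE become 1/0.
num_lgl <- c(1, TRUE, FALSE)
class(num_lgl)      # "numeric"

# Converting back: anything that is not a number becomes NA (with a warning).
back <- suppressWarnings(as.numeric(mixed))
print(back)
```

A vector holds a single type, so R silently converts all elements to the most general type present.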
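Slicing is also somewhat similar to Python, except that R indices start at 1 and a range such as 1:3 includes both ends.
> d[c(1,3,4)]
> d[3]
> d[1:3]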
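A few more indexing forms, sketched on a numeric vector. The vector v here is just the earlier example values under a new name.

```r
v <- c(1, 3, 5, 7, 2, -4)

first     <- v[1]      # indices start at 1, so this is the first element
head3     <- v[1:3]    # 1:3 includes both ends: elements 1, 2 and 3
drop1     <- v[-1]     # a negative index drops that position
big       <- v[v > 2]  # a logical condition keeps matching elements
print(list(first, head3, drop1, big))
```

Note that v[-1] is not "the last element" as it would be in Python; it means "everything except the first".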
(2) Matrices (matrix)
> ?matrix
> y <- matrix(5:24, nrow=4, ncol=5)
> x <- c(2,45,68,94)
> rnames <- c("R1","R2")
> cnames <- c("C1","C2")
> newMatrix <- matrix(x, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rnames,cnames)) # fill by row
> newMatrix <- matrix(x, nrow=2, ncol=2, dimnames=list(rnames,cnames)) # fills by column by default
> y[3,]   # row 3
> y[2,3]  # element in row 2, column 3
> y[,4]   # column 4
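The difference between the two fill orders, and the matrix subscripts, can be checked with a short sketch. The names by_row and by_col are made up for this example.

```r
x <- c(2, 45, 68, 94)

by_row <- matrix(x, nrow = 2, byrow = TRUE)  # first row is 2 45
by_col <- matrix(x, nrow = 2)                # first column is 2 45 (the default)

by_row[1, 2]   # 45
by_col[1, 2]   # 68

# Subscripts on the 4 x 5 matrix from the text: [row, column].
y <- matrix(5:24, nrow = 4, ncol = 5)
row3  <- y[3, ]   # whole third row
cell  <- y[2, 3]  # single element
col4  <- y[, 4]   # whole fourth column
print(y)
```

Filling by row and filling by column give matrices that are transposes of each other.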
(3) Arrays (array)
> ?array
> dim1 <- c("A1","A2","A3")
> dim2 <- c("B1","B2")
> dim3 <- c("C1","C2","C3","C4")
> d <- array(1:24, c(3,2,4), dimnames=list(dim1,dim2,dim3))
> d[1,2,3]
# Output
> d
, , C1

   B1 B2
A1  1  4
A2  2  5
A3  3  6

, , C2

   B1 B2
A1  7 10
A2  8 11
A3  9 12

, , C3

   B1 B2
A1 13 16
A2 14 17
A3 15 18

, , C4

   B1 B2
A1 19 22
A2 20 23
A3 21 24

> d[1,2,3]
[1] 16
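Why d[1,2,3] is 16 can be verified with a short sketch: arrays are filled with the first index varying fastest, so for dimensions 3 x 2 x 4 the element [i, j, k] sits at linear position i + (j-1)*3 + (k-1)*6. The dimnames are left out here to keep the example short.

```r
d <- array(1:24, dim = c(3, 2, 4))

i <- 1; j <- 2; k <- 3
pos <- i + (j - 1) * 3 + (k - 1) * 3 * 2  # linear position of [1, 2, 3]

by_subscript <- d[i, j, k]   # three subscripts
by_position  <- d[pos]       # a single linear index reaches the same element

layer1 <- d[, , 1]                 # the first 3 x 2 layer
layer_sums <- apply(d, 3, sum)     # the sum of each of the 4 layers
print(layer_sums)
```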
(4) Data frames (data.frame())
> patientID <- c(1,2,3,4)
> age <- c(25,34,28,52)
> diabetes <- c("Type1", "Type2", "Type3", "Type2")
> status <- c("poor", "Improved", "Excellent", "poor")
> patientData <- data.frame(patientID, age, diabetes, status)
> patientData
  patientID age diabetes    status
1         1  25    Type1      poor
2         2  34    Type2  Improved
3         3  28    Type3 Excellent
4         4  52    Type2      poor
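> patientData[1:2]
> patientData[c("diabetes","status")]
> patientData$age
# Typing age on its own also works here, but only because a standalone age vector was created before the data frame was built. What if there is no such variable?
# Here is an example
> head(mtcars)
> mtcars$mpg
> mpg
# Why does this raise an error? Because mpg is a column of mtcars, not an object on R's search path. The attach command puts the columns on the search path, and detach removes them again.
> attach(mtcars)
> mpg
> detach(mtcars)
> mpg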
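As an alternative to attach and detach, base R also offers with(), which evaluates an expression inside a data frame without changing the search path. This is a minimal sketch of that option and not something the original post covers.

```r
# Columns are visible only inside the with() call.
mean_mpg <- with(mtcars, mean(mpg))

# Any expression over the columns works the same way.
mpg_per_wt <- with(mtcars, mpg / wt)

print(mean_mpg)
print(head(mpg_per_wt))
```

Since nothing stays attached afterwards, there is no detach step to forget.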
# Factors
> diabetes <- factor(diabetes)
> diabetes
[1] Type1 Type2 Type3 Type2
Levels: Type1 Type2 Type3
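A short sketch of what a factor stores: a set of levels plus an integer code per element. The ordered status factor at the end is an extra illustration, and its level order is an assumption about what "poor", "Improved" and "Excellent" are meant to express.

```r
diabetes <- factor(c("Type1", "Type2", "Type3", "Type2"))

lv     <- levels(diabetes)      # the distinct categories
codes  <- as.integer(diabetes)  # each element stored as an index into the levels
counts <- table(diabetes)       # how often each level occurs

# Assumed ordering, for illustration: poor < Improved < Excellent.
status <- factor(c("poor", "Improved", "Excellent", "poor"),
                 levels = c("poor", "Improved", "Excellent"),
                 ordered = TRUE)
improved_is_better <- status[1] < status[2]

print(counts)
```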
(5) Lists (list)
> g <- "My first list"
> h <- c(12,23,34)
> j <- c("one","two","three")
> k <- matrix(1:10, nrow=2)
> mylist <- list(g,h,j,k)
> mylist
[[1]]
[1] "My first list"

[[2]]
[1] 12 23 34

[[3]]
[1] "one"   "two"   "three"

[[4]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
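However, lists are indexed a little differently: use double brackets!!! Double brackets return the element itself, whereas single brackets return a shorter list.
> mylist[[2]]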
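The difference between the two bracket forms can be seen in a short sketch. The element names title and ages are made up for this example; the list in the text is unnamed.

```r
mylist <- list(title = "My first list", ages = c(12, 23, 34))

element <- mylist[[2]]   # the numeric vector itself
sublist <- mylist[2]     # a list of length 1 that contains the vector

class(element)           # "numeric"
class(sublist)           # "list"

# Named elements can also be reached by name.
by_name   <- mylist[["ages"]]
by_dollar <- mylist$ages

mean(element)            # works on the vector; mean(sublist) would not
```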
6 Worked example
> age <- c(1,3,5,2,11,9,3,9,12,3)
> weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
> mean(weight)       # mean
> sd(weight)         # standard deviation (var(weight) gives the variance)
> cor(age, weight)   # correlation
> plot(age, weight)  # scatter plot
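Building on the example, this sketch checks how the statistics relate to each other and ties them back to lm from section 3. Fitting weight ~ age here is an extra step for illustration and not part of the original walkthrough.

```r
age    <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)

m  <- mean(weight)
s  <- sd(weight)     # standard deviation
v  <- var(weight)    # variance, equal to s^2
r  <- cor(age, weight)

# For a one-predictor linear model, R-squared equals the squared correlation.
fit <- lm(weight ~ age)
r2  <- summary(fit)$r.squared

print(c(mean = m, sd = s, var = v, cor = r, r.squared = r2))
```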
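7 Recommendations
Recommendation 1: Data Analysis in Practice from Scratch | Basics https://mp.weixin.qq.com/s/4ESKjlF4B63IveiIlfCdDA
Recommendation 2: 8 Tips for Newcomers to Data Analysis https://mp.weixin.qq.com/s/FYQ192iwstn2J2QejDvNhA
That's the end~
Data analysis has a bright future!!!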