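1. Data analysis
1.1 The usual data analysis process
Data collection, data storage, data analysis, data mining, data visualization, decision making
(1) Data storage: the data is stored in a data warehouse (entered into a computer and saved in a file format).
(2) Data mining: the process of using algorithms to search large volumes of data for the information hidden in it. It is not known in advance what can be found, so the result cannot be specified beforehand.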
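2. Introduction to R
2.1 Features of R
(1) An effective mechanism for handling and storing data
(2) A full suite of operators for working on arrays and matrices
(3) A coherent and complete collection of intermediate tools for data analysis
(4) Graphical facilities for analysing and displaying data directly, usable on many kinds of graphics devices
(5) A well-developed, simple and efficient programming language
(6) R is a thoroughly object-oriented statistical programming language
(7) R has good interfaces to other programming languages and to databases
(8) R is free software and can be used without restriction, yet it is no less capable than comparable software
(9) R has abundant online resources
2.2 Drawbacks of R
(1) R is not very standardized and is hard to get started with; learning it takes considerable effort
(2) R has a very large number of extension packages, and they are difficult to learn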
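3. Basic RStudio operations
3.1 Working directory
(1) getwd() returns the current working directory
(2) setwd() sets the working directory
Note: the path must be enclosed in quotation marks, and the backslashes (\) of a Windows path must be changed to forward slashes (/) or doubled (\\)
(3) dir() lists the files in the current working directory
Note: to change the default working directory, open Tools > Global Options in the menu bar and change it there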
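The three functions above can be tried together in a short script. This is only a sketch: it uses tempdir() as a directory that is guaranteed to exist, and the file name demo_notes.txt and the Windows paths in the comments are made up for illustration.

```r
# Remember the current working directory so it can be restored afterwards.
old_dir <- getwd()

# tempdir() is used here only because it always exists.
demo_dir <- tempdir()
setwd(demo_dir)

# Create a small file (hypothetical name) and check that dir() lists it.
file.create("demo_notes.txt")
files <- dir()
found <- "demo_notes.txt" %in% files

# On Windows, write paths with forward slashes or doubled backslashes, e.g.
#   setwd("D:/projects/demo")   or   setwd("D:\\projects\\demo")
# (both paths are hypothetical).

# Clean up and restore the original working directory.
file.remove("demo_notes.txt")
setwd(old_dir)
```

Saving the old directory first is a small habit that keeps a script from leaving the session somewhere unexpected.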
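3.2 Assignment
(1) <- is the usual assignment operator. Note: there must be no space between < and -, otherwise x < - 3 is read as the comparison "x is less than -3"
> x <- 3
> x
[1] 3
(2) <<- forces assignment to a variable outside the current function (typically a global variable) instead of creating a local one; it is used when writing functions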
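The difference between the two operators is easiest to see inside a function. The sketch below is illustrative only; the names counter, bump_local and bump_global are invented for the example.

```r
counter <- 0

# `<-` inside a function creates a local variable; the outer one is untouched.
bump_local <- function() {
  counter <- counter + 1
  counter
}

# `<<-` assigns to the variable found in an enclosing environment
# (here the outer `counter`).
bump_global <- function() {
  counter <<- counter + 1
  invisible(counter)
}

local_result <- bump_local()   # returns 1
after_local  <- counter        # still 0
bump_global()
after_global <- counter        # now 1

# A space between `<` and `-` turns the assignment into a comparison:
x <- 3
is_less <- x < - 3             # compares x with -3, giving FALSE
```

Because <<- changes state outside the function, it is usually reserved for cases such as counters or caches where that side effect is intended.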
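3.3 Common functions
(1) sum(...) returns the sum of all elements
> sum(1, 3, 5, 6, 9)
[1] 24
(2) mean(...) returns the mean of all elements
> x <- c(1, 2, 3, 4, 5, 6, 7, 8)
> mean(x)
[1] 4.5
> mean(1:8)
[1] 4.5
(3) ls() lists the names of the variables currently in the workspace
> ls()
[1] "x"
(4) ls.str() lists the variables currently in the workspace together with their details
> ls.str()
x : num [1:8] 1 2 3 4 5 6 7 8
(5) str(variable) shows the details of a variable
> str(x)
num [1:8] 1 2 3 4 5 6 7 8
(6) rm(variable) deletes a variable
> rm(x)
(7) rm(list = ls()) deletes all variables
(8) save.image() saves the workspace, by default to the .RData file in the current working directory (plots are not saved separately)
> save.image()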
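To see what save.image() actually stores, the workspace can be written to an explicit file and read back. This is a sketch that assumes it is run at the top level of a session (save.image() saves the global workspace); the file name demo_workspace.RData and the variable scores are hypothetical.

```r
scores <- c(1, 2, 3)

# Save to an explicit file instead of the default .RData in the working directory.
rdata_file <- file.path(tempdir(), "demo_workspace.RData")
save.image(file = rdata_file)

# Load the file into a separate environment so nothing in the session is overwritten.
restored <- new.env()
loaded_names <- load(rdata_file, envir = restored)

# loaded_names holds the names of the objects that were stored in the file.
saved_ok <- file.exists(rdata_file)
```

Loading into a separate environment is a design choice for the demonstration; calling load(rdata_file) directly would restore the objects into the current workspace.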
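4. Installing R packages
(1) .libPaths() shows where the current library is located
> .libPaths()
[1] "D:/R-3.6.2/library"
(2) library() with no arguments lists the packages available in the current library
(3) library(package_name) loads a package; require(package_name) can also be used
> library("grid")
(4) R's base packages are loaded automatically when R starts
(5) help(package = "package_name") shows the help documentation for a package
> help(package = "grid")
(6) library(help = "package_name") lists basic information about a package
> library(help = "grid")
(7) ls("package:package_name") lists all the functions contained in a loaded package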
> ls("package:grid")
  [1] "absolute.size" "addGrob" "applyEdit"
  [4] "applyEdits" "arcCurvature" "arrow"
  [7] "arrowsGrob" "ascentDetails" "bezierGrob"
 [10] "bezierPoints" "calcStringMetric" "childNames"
 [13] "circleGrob" "clipGrob" "convertHeight"
 [16] "convertNative" "convertUnit" "convertWidth"
 [19] "convertX" "convertY" "current.parent"
 [22] "current.rotation" "current.transform" "current.viewport"
 [25] "current.vpPath" "current.vpTree" "curveGrob"
 [28] "dataViewport" "delayGrob" "depth"
 [31] "descentDetails" "deviceDim" "deviceLoc"
 [34] "downViewport" "draw.details" "drawDetails"
 [37] "editDetails" "editGrob" "emptyCoords"
 [40] "engine.display.list" "explode" "forceGrob"
 [43] "frameGrob" "functionGrob" "gEdit"
 [46] "gEditList" "get.gpar" "getGrob"
 [49] "getNames" "gList" "gpar"
 [52] "gPath" "grid.abline" "grid.add"
 [55] "grid.arrows" "grid.bezier" "grid.cap"
 [58] "grid.circle" "grid.clip" "grid.collection"
 [61] "grid.convert" "grid.convertHeight" "grid.convertWidth"
 [64] "grid.convertX" "grid.convertY" "grid.copy"
 [67] "grid.curve" "grid.delay" "grid.display.list"
 [70] "grid.DLapply" "grid.draw" "grid.edit"
 [73] "grid.force" "grid.frame" "grid.function"
 [76] "grid.gedit" "grid.get" "grid.gget"
 [79] "grid.grab" "grid.grabExpr" "grid.gremove"
 [82] "grid.grep" "grid.grill" "grid.grob"
 [85] "grid.layout" "grid.legend" "grid.line.to"
 [88] "grid.lines" "grid.locator" "grid.ls"
 [91] "grid.move.to" "grid.multipanel" "grid.newpage"
 [94] "grid.null" "grid.pack" "grid.panel"
 [97] "grid.path" "grid.place" "grid.plot.and.legend"
[100] "grid.points" "grid.polygon" "grid.polyline"
[103] "grid.pretty" "grid.raster" "grid.record"
[106] "grid.rect" "grid.refresh" "grid.remove"
[109] "grid.reorder" "grid.revert" "grid.roundrect"
[112] "grid.segments" "grid.set" "grid.show.layout"
[115] "grid.show.viewport" "grid.strip" "grid.text"
[118] "grid.xaxis" "grid.xspline" "grid.yaxis"
[121] "grob" "grobAscent" "grobCoords"
[124] "grobDescent" "grobHeight" "grobName"
[127] "grobPathListing" "grobPoints" "grobTree"
[130] "grobWidth" "grobX" "grobY"
[133] "gTree" "heightDetails" "is.grob"
[136] "is.unit" "isEmptyCoords" "layout.heights"
[139] "layout.torture" "layout.widths" "layoutRegion"
[142] "legendGrob" "linesGrob" "lineToGrob"
[145] "makeContent" "makeContext" "moveToGrob"
[148] "nestedListing" "nullGrob" "packGrob"
[151] "pathGrob" "pathListing" "placeGrob"
[154] "plotViewport" "pointsGrob" "polygonGrob"
[157] "polylineGrob" "pop.viewport" "popViewport"
[160] "postDrawDetails" "preDrawDetails" "push.viewport"
[163] "pushViewport" "rasterGrob" "recordGrob"
[166] "rectGrob" "removeGrob" "reorderGrob"
[169] "resolveHJust" "resolveRasterSize" "resolveVJust"
[172] "roundrectGrob" "seekViewport" "segmentsGrob"
[175] "setChildren" "setGrob" "showGrob"
[178] "showViewport" "stringAscent" "stringDescent"
[181] "stringHeight" "stringWidth" "textGrob"
[184] "unit" "unit.c" "unit.length"
[187] "unit.pmax" "unit.pmin" "unit.rep"
[190] "upViewport" "valid.just" "validDetails"
[193] "viewport" "viewport.layout" "viewport.transform"
[196] "vpList" "vpPath" "vpStack"
[199] "vpTree" "widthDetails" "xaxisGrob"
[202] "xDetails" "xsplineGrob" "xsplinePoints"
[205] "yaxisGrob" "yDetails"
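(8) data(package = "package_name") lists all the data sets contained in a package
> data(package = "base")
Warning message:
In data(package = "base") :
  datasets have been moved from package 'base' to package 'datasets'
(9) detach("package:package_name") removes a loaded package from the session
(10) remove.packages("package_name") uninstalls an installed package
(11) install.packages("package_name") installs the named package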
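Several of the calls above can be combined into one load-and-unload round trip. The sketch below uses grid because it ships with R, so nothing is downloaded; requireNamespace() and search() are not mentioned in the notes above and are brought in only to check the state of the session.

```r
pkg <- "grid"  # a package shipped with R, so no installation is needed

# requireNamespace() checks that the package is installed without attaching it.
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # character.only = TRUE lets library() take the package name from a variable.
  library(pkg, character.only = TRUE)
  attached <- "package:grid" %in% search()

  # Look at the first few functions of the attached package.
  some_functions <- head(ls("package:grid"), 3)

  # detach() removes the package from the search path again.
  detach("package:grid")
  still_attached <- "package:grid" %in% search()
} else {
  # For a package hosted on CRAN, install.packages("package_name")
  # would download and install it (this needs network access).
  message("Package not installed: ", pkg)
}
```

library() stops with an error when a package is missing, whereas require() returns FALSE with a warning, which is why a check of this kind is often placed before loading.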
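5. Data structures
5.1 Definition: a data structure is the way a computer stores and organizes data
5.2 Data types in R
Numeric: values that can be used directly in calculations such as addition, subtraction, multiplication and division
Character (string): can be concatenated, converted, extracted from, and so on
Logical: TRUE or FALSE
Date, and others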
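A few one-line experiments make these types concrete. This is a minimal sketch; the variable names and values are arbitrary.

```r
num  <- 3.5                     # numeric
txt  <- "data"                  # character (string)
flag <- TRUE                    # logical
day  <- as.Date("2020-01-15")   # Date

# class() reports the type of each value.
types <- c(class(num), class(txt), class(flag), class(day))
# types is c("numeric", "character", "logical", "Date")

total  <- num + 1.5                 # arithmetic on numbers: 5
joined <- paste(txt, "analysis")    # concatenation: "data analysis"
part   <- substr(joined, 1, 4)      # extraction: "data"
as_num <- as.numeric("42")          # conversion from string to number: 42
later  <- day + 30                  # date arithmetic: 2020-02-14
```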
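5.3 Common data structures: vectors, scalars, lists, arrays, multidimensional arrays
Special data structures: hashes in Perl, dictionaries in Python, pointers in C, and so on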
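5.4 R objects: an object is anything that can be assigned to a variable, including constants, data structures, functions, and even graphs. Every object has a mode, which describes how the object is stored, and a class.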
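The distinction between mode and class can be checked with mode() and class(). The sketch below uses a few arbitrary objects; the names are made up for the example.

```r
v  <- c(1.5, 2, 3)
f  <- factor(c("a", "b", "a"))
d  <- data.frame(id = 1:2)
fn <- function(x) x + 1

# mode() describes how an object is stored; class() describes how
# generic functions such as print() treat it.
modes   <- c(mode(v), mode(f), mode(d), mode(fn))
classes <- c(class(v), class(f), class(d), class(fn))
# modes   is c("numeric", "numeric", "list", "function")
# classes is c("numeric", "factor", "data.frame", "function")
```

A factor is stored as numbers and a data frame as a list, which is why their mode and class differ.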
5.5 Data structures in R: vectors, scalars, matrices, arrays, lists, data frames, factors, time series, and so on.
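Each of the structures listed above can be created with a single base R call. The following is a small illustrative sketch; all names and values are invented.

```r
scalar <- 5                                   # a scalar is a vector of length 1
vec    <- c(2, 4, 6)                          # vector
mat    <- matrix(1:6, nrow = 2)               # 2 x 3 matrix, filled by column
arr    <- array(1:24, dim = c(2, 3, 4))       # three-dimensional array
lst    <- list(name = "a", values = vec)      # list with mixed element types
df     <- data.frame(id = 1:3, score = c(90, 85, 77))   # data frame
fac    <- factor(c("low", "high", "low"))     # factor with two levels
series <- ts(c(10, 12, 15, 11), start = c(2020, 1), frequency = 4)  # quarterly time series

# str() gives a compact description of any of these objects.
str(df)
```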