These notes are based on the Programming Guide on the official website:
https://spark.apache.org/docs/latest/graphx-programming-guide.html
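First, a brief introduction to GraphX.
GraphX is the Spark component dedicated to graphs and graph-parallel computation.
GraphX extends the Spark RDD by introducing a graph abstraction: a directed multigraph with properties attached to each vertex and edge.
To support graph computation, GraphX exposes a set of operators (for example subgraph, joinVertices, and aggregateMessages)
and a Pregel API. It also ships a collection of graph algorithms and builders that simplify graph analytics tasks.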
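To make the three operators named above concrete, here is a minimal sketch. The toy graph, the `OperatorSketch` object and its helper names are assumptions made for illustration and do not come from the guide; only `subgraph`, `aggregateMessages` and `joinVertices` are GraphX calls.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

object OperatorSketch {

  // Hypothetical toy graph: vertex attribute = name, edge attribute = relation label.
  def toyGraph(sc: SparkContext): Graph[String, String] = {
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 2L, "follows"), Edge(3L, 2L, "follows"), Edge(2L, 1L, "blocks")))
    Graph(vertices, edges)
  }

  // Returns each vertex name, annotated with its number of incoming "follows" edges.
  def annotate(graph: Graph[String, String]): Map[VertexId, String] = {
    // subgraph: keep only the edges labelled "follows" (all vertices are kept).
    val follows = graph.subgraph(epred = t => t.attr == "follows")

    // aggregateMessages: every edge sends 1 to its destination; messages are summed.
    val inDegrees: VertexRDD[Int] =
      follows.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)

    // joinVertices: rewrite only the vertices that actually received a message.
    val annotated = follows.joinVertices(inDegrees)((_, name, deg) => s"$name:$deg")
    annotated.vertices.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("OperatorSketch").setMaster("local[*]"))
    try annotate(toyGraph(sc)).toSeq.sortBy(_._1).foreach(println)
    finally sc.stop()
  }
}
```

In this toy graph only vertex 2 has incoming "follows" edges, so it is the only vertex whose attribute changes; vertices 1 and 3 keep their original names because `joinVertices` leaves unmatched vertices untouched.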
Building vertices (Vertex) and edges (Edge)
1. If the vertex needs to be defined as a class
package graphx

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

/**
  * Created by common on 18-1-22.
  */
// Base vertex type
class VertexProperty()

// User vertex
case class UserProperty(name: String) extends VertexProperty

// Product vertex
case class ProductProperty(name: String, price: Double) extends VertexProperty

object GraphxLearning {

  def main(args: Array[String]): Unit = {

    val conf = new SparkConf().setAppName("GraphX").setMaster("local")
    val sc = new SparkContext(conf)

    // The graph might then have the type:
    var graph: Graph[VertexProperty, String] = null

    sc.stop()
  }
}
Like vertices, edges can also be defined as a class. The type parameters of Graph must match the vertex and edge types you define.
// Simplified view of the Graph class:
// VD is the vertex attribute type, ED is the edge attribute type
class Graph[VD, ED] {
  val vertices: VertexRDD[VD]
  val edges: EdgeRDD[ED]
}
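As a sketch of how a class-based vertex type lines up with the `VD` parameter, the example below builds a small user/product graph and picks out the product vertices by pattern matching. The data, the `BipartiteSketch` object and its helpers are hypothetical, and the base type is declared as a `sealed trait` rather than a plain class, which is a choice made for this sketch only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Sketch-only variation: a sealed trait lets the compiler check pattern matches.
sealed trait VertexProperty
case class UserProperty(name: String) extends VertexProperty
case class ProductProperty(name: String, price: Double) extends VertexProperty

object BipartiteSketch {

  // VD = VertexProperty, ED = String, matching Graph[VD, ED].
  def build(sc: SparkContext): Graph[VertexProperty, String] = {
    val vertices: RDD[(VertexId, VertexProperty)] =
      sc.parallelize(Seq[(VertexId, VertexProperty)](
        (1L, UserProperty("alice")),
        (2L, UserProperty("bob")),
        (10L, ProductProperty("book", 12.5))))
    val edges: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(1L, 10L, "bought"),
      Edge(2L, 10L, "viewed")))
    Graph(vertices, edges)
  }

  // Keep only product vertices and return their names.
  def productNames(graph: Graph[VertexProperty, String]): Array[String] =
    graph.vertices.collect { case (_, ProductProperty(name, _)) => name }.collect()

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("BipartiteSketch").setMaster("local[*]"))
    try productNames(build(sc)).foreach(println)
    finally sc.stop()
  }
}
```

Because both vertex kinds share one base type, a single `Graph[VertexProperty, String]` can hold users and products together, and the concrete kind is recovered with a pattern match when needed.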
2. If the vertex type is simple, for example just a String or a (String, String) pair, there is no need to define a class
package graphx

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

/**
  * Created by common on 18-1-22.
  */
object GraphxLearning {

  def main(args: Array[String]): Unit = {

    val conf = new SparkConf().setAppName("GraphX").setMaster("local")
    val sc = new SparkContext(conf)

    // Create an RDD for the vertices: (vertex ID, (name, role))
    val users: RDD[(VertexId, (String, String))] =
      sc.parallelize(Array(
        (3L, ("rxin", "student")),
        (7L, ("jgonzal", "postdoc")),
        (5L, ("franklin", "prof")),
        (2L, ("istoica", "prof"))))

    // Create an RDD for the edges: Edge(source ID, destination ID, relationship)
    val relationships: RDD[Edge[String]] =
      sc.parallelize(Array(
        Edge(3L, 7L, "collab"),
        Edge(5L, 3L, "advisor"),
        Edge(2L, 5L, "colleague"),
        Edge(5L, 7L, "pi")))

    // Default attribute for any vertex that an edge refers to
    // but that is missing from the users RDD
    val defaultUser = ("John Doe", "Missing")

    // Build a Graph from the two RDDs. Its type parameters are the
    // vertex attribute type and the edge attribute type.
    val srcGraph: Graph[(String, String), String] = Graph(users, relationships, defaultUser)

    sc.stop()
  }
}
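Once a graph like the one above exists, its `vertices`, `edges` and `triplets` views can be queried as ordinary RDDs. The sketch below rebuilds the same users and relationships and runs three small queries. The `GraphQuerySketch` object and its helper names are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

object GraphQuerySketch {

  // Same users and relationships as in the example above.
  def buildGraph(sc: SparkContext): Graph[(String, String), String] = {
    val users: RDD[(VertexId, (String, String))] = sc.parallelize(Seq(
      (3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
      (5L, ("franklin", "prof")), (2L, ("istoica", "prof"))))
    val relationships: RDD[Edge[String]] = sc.parallelize(Seq(
      Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"),
      Edge(2L, 5L, "colleague"), Edge(5L, 7L, "pi")))
    Graph(users, relationships, ("John Doe", "Missing"))
  }

  // Vertex view: count the users whose role is "postdoc".
  def countPostdocs(graph: Graph[(String, String), String]): Long =
    graph.vertices.filter { case (_, (_, role)) => role == "postdoc" }.count()

  // Edge view: count the edges whose source ID is larger than the destination ID.
  def countBackEdges(graph: Graph[(String, String), String]): Long =
    graph.edges.filter(e => e.srcId > e.dstId).count()

  // Triplet view: an edge together with the attributes of both endpoints.
  def describe(graph: Graph[(String, String), String]): Array[String] =
    graph.triplets
      .map(t => s"${t.srcAttr._1} is the ${t.attr} of ${t.dstAttr._1}")
      .collect()

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("GraphQuerySketch").setMaster("local[*]"))
    try {
      val graph = buildGraph(sc)
      println(s"postdocs: ${countPostdocs(graph)}")
      println(s"edges with srcId > dstId: ${countBackEdges(graph)}")
      describe(graph).foreach(println)
    } finally sc.stop()
  }
}
```

With this data there is one postdoc (jgonzal) and one edge whose source ID exceeds its destination ID (5 to 3). The triplet view is the convenient one when a query needs the edge attribute and both vertex attributes at once.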
Some graph operators