JanusGraph: Learning a Graph Database (1)

Introduction to graph databases (source: Baidu Baike)

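1. Introduction

  A graph database is a type of NoSQL (non-relational) database that applies graph theory to store the relationships between entities. The most common example is the relationships between people in a social network. Relational databases are a poor fit for storing this kind of "relational" data: the queries become complex, slow, and hard to predict. The design of graph databases addresses exactly this weakness.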

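2. Data structure of a graph database

  A graph database has two basic data types:

  Nodes and Relationships.

  Both Nodes and Relationships carry properties in key/value form. Nodes are connected to one another through the relations that Relationships define, forming a network structure.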
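
  The node/relationship model described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how JanusGraph or any other graph database stores data. The class names (Node, Relationship, Graph) and the sample data (Alice, Bob, "knows") are made up for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """A node (vertex) carrying key/value properties."""
    node_id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, labeled link between two nodes, also with key/value properties."""
    label: str
    start: Node
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


class Graph:
    """A tiny in-memory property graph (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.relationships: List[Relationship] = []

    def add_node(self, node_id: int, **properties: Any) -> Node:
        node = Node(node_id, dict(properties))
        self.nodes[node_id] = node
        return node

    def relate(self, start: Node, label: str, end: Node, **properties: Any) -> Relationship:
        rel = Relationship(label, start, end, dict(properties))
        self.relationships.append(rel)
        return rel

    def neighbors(self, node: Node, label: str) -> List[Node]:
        """Follow outgoing relationships that carry the given label."""
        return [r.end for r in self.relationships if r.start is node and r.label == label]


# Sample data: two people connected by a "knows" relationship.
graph = Graph()
alice = graph.add_node(1, name="Alice")
bob = graph.add_node(2, name="Bob")
graph.relate(alice, "knows", bob, since=2015)

friend_names = [n.properties["name"] for n in graph.neighbors(alice, "knows")]
print(friend_names)
```

  Note that the relationship itself holds a property (since), which is what distinguishes a property graph from a plain adjacency list.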

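3. JanusGraph

Note: I studied from the official documentation and other learning materials. If you find any mistakes, please point them out.

  1. Advantages of JanusGraph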

  JanusGraph is designed to support the processing of graphs so large that they require storage and computational capacities beyond what a single machine can provide. Scaling graph data processing for real time traversals and analytical queries is JanusGraph’s foundational benefit. This section will discuss the various specific benefits of JanusGraph and its underlying, supported persistence solutions.

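  In other words: JanusGraph is designed to handle graphs so large that they need more storage and computing capacity than a single machine can provide. Scaling graph data processing for real-time traversals and analytical queries is JanusGraph's foundational benefit.

  1.1 General benefits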

  • Support for very large graphs. JanusGraph graphs scale with the number of machines in the cluster.
  • Support for very many concurrent transactions and operational graph processing. JanusGraph’s transactional capacity scales with the number of machines in the cluster and answers complex traversal queries on huge graphs in milliseconds.
  • Support for global graph analytics and batch graph processing through the Hadoop framework.
  • Support for geo, numeric range, and full text search for vertices and edges on very large graphs.
  • Native support for the popular property graph data model exposed by Apache TinkerPop.
  • Native support for the graph traversal language Gremlin.
  • Easy integration with the Gremlin Server for programming language agnostic connectivity.
  • Numerous graph-level configurations provide knobs for tuning performance.
  • Vertex-centric indices provide vertex-level querying to alleviate issues with the infamous super node problem.
  • Provides an optimized disk representation to allow for efficient use of storage and speed of access.
  • Open source under the liberal Apache 2 license.
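
  To make the vertex-centric index bullet more concrete, here is a rough sketch of the general idea: keep each vertex's edges sorted by some key, so that a query against a "super node" with a huge number of edges can jump to the relevant range instead of scanning every edge. This is a simplified illustration under my own assumptions, not JanusGraph's actual implementation. The class VertexCentricIndex and its methods are hypothetical names.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple


class VertexCentricIndex:
    """Per-vertex list of (sort_key, neighbor) pairs kept in sorted order."""

    def __init__(self) -> None:
        self._edges: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def add_edge(self, vertex: str, neighbor: str, sort_key: int) -> None:
        # insort keeps each vertex's edge list ordered by sort_key.
        bisect.insort(self._edges[vertex], (sort_key, neighbor))

    def neighbors_in_range(self, vertex: str, low: int, high: int) -> List[str]:
        """Return neighbors whose edge sort_key lies in [low, high]."""
        edges = self._edges[vertex]
        # Binary search for the first edge with sort_key >= low.
        start = bisect.bisect_left(edges, (low, ""))
        result = []
        for key, neighbor in edges[start:]:
            if key > high:
                break  # edges are sorted, so nothing later can match
            result.append(neighbor)
        return result


# A hypothetical super node with 10,000 edges, keyed by an integer timestamp.
index = VertexCentricIndex()
for i in range(10_000):
    index.add_edge("celebrity", f"fan{i}", sort_key=i)

recent = index.neighbors_in_range("celebrity", 9_997, 9_999)
print(recent)
```

  The range query touches only the three matching edges after a binary search, rather than all 10,000.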

 1.2 Benefits of integration with HBase

  • Tight integration with the Apache Hadoop ecosystem.
  • Native support for strong consistency.
  • Linear scalability with the addition of more machines.
  • Strictly consistent reads and writes.
  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables.
  • Support for exporting metrics via JMX.
  • Open source under the liberal Apache 2 license.

 1.3. JanusGraph and the CAP Theorem

 

Despite your best efforts, your system will experience enough faults that it will have to make a choice between reducing yield (i.e., stop answering requests) and reducing harvest (i.e., giving answers based on incomplete data). This decision should be based on business requirements.

 
  -- Coda Hale

When using a database, the CAP theorem should be thoroughly considered (C=Consistency, A=Availability, P=Partitionability). JanusGraph is distributed with 3 supporting backends: Apache Cassandra, Apache HBase, and Oracle Berkeley DB Java Edition. Note that BerkeleyDB JE is a non-distributed database and is typically only used with JanusGraph for testing and exploration purposes.

HBase gives preference to consistency at the expense of yield, i.e. the probability of completing a request. Cassandra gives preference to availability at the expense of harvest, i.e. the completeness of the answer to the query (data available/complete data).
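
The two ratios in the paragraph above can be written down as simple arithmetic. The sketch below is only an illustration of the definitions (yield as the fraction of requests completed, harvest as data available / complete data). The scenario and all numbers are invented, and the function names are my own.

```python
def request_yield(completed_requests: int, total_requests: int) -> float:
    """Fraction of requests that were answered at all."""
    return completed_requests / total_requests


def harvest(data_available: int, complete_data: int) -> float:
    """Fraction of the complete data set reflected in an answer."""
    return data_available / complete_data


# Hypothetical scenario: one of four equally sized partitions is unreachable,
# and 25 out of 100 requests need data from that partition.

# Consistency-first behavior: those 25 requests are rejected,
# but every answer that is given reflects all of the data it needs.
cp_yield = request_yield(completed_requests=75, total_requests=100)
cp_harvest = harvest(data_available=4, complete_data=4)

# Availability-first behavior: every request is answered,
# but answers are computed from three of the four partitions.
ap_yield = request_yield(completed_requests=100, total_requests=100)
ap_harvest = harvest(data_available=3, complete_data=4)

print(cp_yield, cp_harvest)
print(ap_yield, ap_harvest)
```

Under these assumed numbers, the consistency-first choice gives up yield (0.75) to keep harvest at 1.0, while the availability-first choice does the opposite.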

  A short introduction to the CAP theorem (C = Consistency, A = Availability, P = Partitionability): https://en.wikipedia.org/wiki/CAP_theorem

  2. Overall architecture of JanusGraph

  Data storage:

  Indices, which speed up and enable more complex queries:

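How applications interact with JanusGraph:

  • Embed JanusGraph inside the application and execute Gremlin queries directly against the graph within the same JVM. Query execution, JanusGraph's caches, and transaction handling all happen in the same JVM as the application, while data retrieval from the storage backend may be local or remote.
  • Interact with a local or remote JanusGraph instance by submitting Gremlin queries to the server. JanusGraph natively supports the Gremlin Server component of the Apache TinkerPop stack.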

 

Figure: JanusGraph architecture

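 The architecture is divided into three layers:

the client layer, the business analysis layer, and the storage layer.

Business analysis layer: online transaction processing (OLTP) and online analytical processing (OLAP).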
