http://www.cnblogs.com/bonelee/p/6211290.html
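In practice a lot of data is naturally expressed as a graph, for example the relationships between people in a social network, map data, or genetic information. An RDBMS is not well suited to this kind of data, and massive data volumes stretch it even further. The rise of NoSQL databases largely solved the problem of storing massive data sets. Graph databases are one branch of NoSQL; compared with the other branches, they are well suited to representing graph-structured data natively.
The figure below shows that, compared with other kinds of NoSQL stores, graph databases handle somewhat smaller data sizes but can express more complex data.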
[Figure: http://qinxuye.me/static/uploads/nosql.jpg]
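Generally speaking, the structure stored in a graph database is the same as a graph in data-structure terms: it consists of vertices and edges.
Neo4j is one of the leading graph databases. It is open source and implemented in Java, and after several years of development it is ready for production use. It can run in two ways: as a server that exposes a REST interface, or in embedded mode, where the data lives in local files that can be operated on directly.
Neo4j is a high-performance NoSQL graph database that stores structured data in a network (a graph) rather than in tables. Neo4j can also be seen as a high-performance graph engine with all the features of a mature database. Programmers work with an object-oriented, flexible network structure instead of strict, static tables, yet still enjoy all the benefits of a fully transactional, enterprise-grade database.
Because it is embeddable, high-performance and lightweight, Neo4j is attracting more and more attention.
A graph contains two basic data types: Nodes and Relationships. Both Nodes and Relationships carry properties in key/value form. Nodes are connected to each other through Relationships, forming a relational network structure.
From these perspectives, Neo4j is a suitable choice. Neo4j……
As a graph NoSQL database Neo4j offers a lot of functionality, but no solution is perfect. In the following use cases Neo4j is not a very suitable choice: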
[Figure: http://sunxiang0918.cn/img/2015/06/27/1.jpg]
The node records contain only a pointer to their first property and their first relationship (in what is often termed the relationship chain). From here, we can follow the (doubly) linked-list of relationships until we find the one we’re interested in, the LIKES relationship from Node 1 to Node 2 in this case. Once we’ve found the relationship record of interest, we can simply read its properties if there are any via the same singly-linked list structure as node properties, or we can examine the node records that it relates via its start node and end node IDs. These IDs, multiplied by the node record size, of course give the immediate offset of both nodes in the node store file.
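The English passage above is quoted from the book Graph Databases (author: Ian Robinson) and describes neo4j's storage model. The Properties of a Node or Relationship are kept in a doubly linked list of key-value entries. The Relationships of a Node are kept in a doubly linked list as well, and from a relationship it is easy to find its from and to Nodes. A Node record stores the IDs of its first property and its first relationship.
With this storage model, starting from a node A it is easy to traverse the graph that starts at A. The example below helps illustrate the storage model; the concrete format of the storage files is described in detail in Chapter 2.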
[Figure: http://sunxiang0918.cn/img/2015/06/27/2.png]
In this example, A~E are Node IDs, R1~R7 are Relationship IDs, and P1~P10 are Property IDs.
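The chain-following idea can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: the class names, the `NONE` marker and the record fields are hypothetical and much simpler than Neo4j's real records (self-loops and properties are ignored).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of node and relationship records; not Neo4j's real classes. */
final class RelationshipChainSketch {
    static final int NONE = -1; // hypothetical "end of chain" marker

    /** A node record only points at its first relationship. */
    static final class NodeRecord {
        final int firstRel;
        NodeRecord(int firstRel) { this.firstRel = firstRel; }
    }

    /** A relationship record knows both end nodes and the next relationship in each node's chain. */
    static final class RelRecord {
        final int startNode, endNode, startNext, endNext;
        RelRecord(int startNode, int endNode, int startNext, int endNext) {
            this.startNode = startNode;
            this.endNode = endNode;
            this.startNext = startNext;
            this.endNext = endNext;
        }
    }

    /** Collects the IDs of all relationships attached to nodeId by following its chain. */
    static List<Integer> relationshipsOf(int nodeId, NodeRecord[] nodes, RelRecord[] rels) {
        List<Integer> result = new ArrayList<>();
        int relId = nodes[nodeId].firstRel;
        while (relId != NONE) {
            result.add(relId);
            RelRecord r = rels[relId];
            // Follow the pointer that belongs to this node's side of the relationship.
            relId = (r.startNode == nodeId) ? r.startNext : r.endNext;
        }
        return result;
    }

    public static void main(String[] args) {
        // Nodes 0, 1, 2 and relationships R0: 0->1, R1: 0->2, R2: 1->2.
        NodeRecord[] nodes = { new NodeRecord(0), new NodeRecord(0), new NodeRecord(1) };
        RelRecord[] rels = {
            new RelRecord(0, 1, 1, 2),       // node 0's chain continues at R1, node 1's at R2
            new RelRecord(0, 2, NONE, 2),    // end of node 0's chain, node 2's continues at R2
            new RelRecord(1, 2, NONE, NONE)  // end of both chains
        };
        System.out.println(relationshipsOf(1, nodes, rels));
    }
}
```

Both the node array and the relationship array are indexed directly by ID, so no lookup structure is needed while walking the chain.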
A Node stores its first Property and its first Relationship.
After downloading and installing neo4j-community-2.1.0-M01 and running the EmbeddedNeo4j example from neo4j embedded-example, the following neo4j graph db storage files appear under target/neo4j-hello-db.
    -rw-r--r--    11 04-11 13:28 active_tx_log
    drwxr-xr-x  4096 04-11 13:28 index
    -rw-r--r-- 23740 04-11 13:28 messages.log
    -rw-r--r--    78 04-11 13:28 neostore
    -rw-r--r--     9 04-11 13:28 neostore.id
    -rw-r--r--    22 04-11 13:28 neostore.labeltokenstore.db
    -rw-r--r--     9 04-11 13:28 neostore.labeltokenstore.db.id
    -rw-r--r--    64 04-11 13:28 neostore.labeltokenstore.db.names
    -rw-r--r--     9 04-11 13:28 neostore.labeltokenstore.db.names.id
    -rw-r--r--    61 04-11 13:28 neostore.nodestore.db
    -rw-r--r--     9 04-11 13:28 neostore.nodestore.db.id
    -rw-r--r--    93 04-11 13:28 neostore.nodestore.db.labels
    -rw-r--r--     9 04-11 13:28 neostore.nodestore.db.labels.id
    -rw-r--r--   307 04-11 13:28 neostore.propertystore.db
    -rw-r--r--   153 04-11 13:28 neostore.propertystore.db.arrays
    -rw-r--r--     9 04-11 13:28 neostore.propertystore.db.arrays.id
    -rw-r--r--     9 04-11 13:28 neostore.propertystore.db.id
    -rw-r--r--    61 04-11 13:28 neostore.propertystore.db.index
    -rw-r--r--     9 04-11 13:28 neostore.propertystore.db.index.id
    -rw-r--r--   216 04-11 13:28 neostore.propertystore.db.index.keys
    -rw-r--r--     9 04-11 13:28 neostore.propertystore.db.index.keys.id
    -rw-r--r--   410 04-11 13:28 neostore.propertystore.db.strings
    -rw-r--r--     9 04-11 13:28 neostore.propertystore.db.strings.id
    -rw-r--r--    69 04-11 13:28 neostore.relationshipgroupstore.db
    -rw-r--r--     9 04-11 13:28 neostore.relationshipgroupstore.db.id
    -rw-r--r--    92 04-11 13:28 neostore.relationshipstore.db
    -rw-r--r--     9 04-11 13:28 neostore.relationshipstore.db.id
    -rw-r--r--    38 04-11 13:28 neostore.relationshiptypestore.db
    -rw-r--r--     9 04-11 13:28 neostore.relationshiptypestore.db.id
    -rw-r--r--   140 04-11 13:28 neostore.relationshiptypestore.db.names
    -rw-r--r--     9 04-11 13:28 neostore.relationshiptypestore.db.names.id
    -rw-r--r--    82 04-11 13:28 neostore.schemastore.db
    -rw-r--r--     9 04-11 13:28 neostore.schemastore.db.id
    -rw-r--r--     4 04-11 13:28 nioneo_logical.log.active
    -rw-r--r--  2249 04-11 13:28 nioneo_logical.log.v0
    drwxr-xr-x  4096 04-11 13:28 schema
    -rw-r--r--     0 04-11 13:28 store_lock
    -rw-r--r--   800 04-11 13:28 tm_tx_log.1
neostore.nodestore.db: stores the node array; the array index is the node ID
neostore.nodestore.db.id: stores the largest ID and the IDs that have been freed
neostore.nodestore.db.labels: stores the node label array data; the array index is the node label ID
neostore.nodestore.db.labels.id
neostore.relationshipstore.db: stores the relationship record array data
neostore.relationshipstore.db.id
neostore.relationshipgroupstore.db: stores the relationship group array data
neostore.relationshipgroupstore.db.id
neostore.relationshiptypestore.db: stores the relationship type array data
neostore.relationshiptypestore.db.id
neostore.relationshiptypestore.db.names: stores the relationship type token array data
neostore.relationshiptypestore.db.names.id
neostore.labeltokenstore.db: stores the label token array data
neostore.labeltokenstore.db.id
neostore.labeltokenstore.db.names: stores the names of the label tokens
neostore.labeltokenstore.db.names.id
neostore.propertystore.db: stores the property data
neostore.propertystore.db.id
neostore.propertystore.db.arrays: stores property values (of the key-value pair) that are arrays
neostore.propertystore.db.arrays.id
neostore.propertystore.db.strings: stores property values (of the key-value pair) that are strings
neostore.propertystore.db.strings.id
neostore.propertystore.db.index: stores the index data of the property keys
neostore.propertystore.db.index.id
neostore.propertystore.db.index.keys: stores the string values of the property keys
neostore.propertystore.db.index.keys.id
neostore
neostore.id
neostore.schemastore.db
neostore.schemastore.db.id
nioneo_logical.log.active
active_tx_log
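In neo4j, the four main kinds of store files (nodes, properties, relationships and so on) use an array as their core storage structure. Every item of each kind is assigned a unique ID, and that ID serves as the item's array index when it is stored. Looking an item up by ID is therefore a direct index computation, which is why operations such as graph traversal need no index lookup (the "free-index" property).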
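A minimal sketch of this ID-as-index lookup, assuming a store that is simply a run of fixed-size records held in memory. The class name and the record sizes used here are illustrative only and are not Neo4j's real values.

```java
import java.nio.ByteBuffer;

/** Illustrative lookup in a store of fixed-size records; not Neo4j code. */
final class FixedRecordLookup {
    /** Byte offset of the record with the given ID. */
    static long offsetOf(long id, int recordSize) {
        return id * recordSize;
    }

    /** Copies one record out of an in-memory image of the store. */
    static byte[] readRecord(ByteBuffer store, long id, int recordSize) {
        byte[] record = new byte[recordSize];
        ByteBuffer view = store.duplicate();            // leave the caller's position untouched
        view.position((int) offsetOf(id, recordSize));
        view.get(record);
        return record;
    }

    public static void main(String[] args) {
        ByteBuffer store = ByteBuffer.wrap(new byte[] {0, 1, 2, 3, 4, 5});
        byte[] record = readRecord(store, 2, 2);         // third record of a 2-byte-record store
        System.out.println(record[0] + "," + record[1]);
    }
}
```

Because the offset is a multiplication, the cost of locating a record does not depend on how many records the store holds.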
3.1.1 CommonAbstractStore.java
CommonAbstractStore is the base class of all Store classes. The snippet below shows its member variables. The ones that matter most are IdGenerator, StoreChannel and WindowPool: every Store instance has its own ID allocator (IdGenerator), StoreChannel handles reading, writing and positioning within the Store file, and WindowPool is a cache for Store records that exists to improve performance.

    public abstract class CommonAbstractStore implements IdSequence {
        public static abstract class Configuration {
            public static final Setting store_dir = InternalAbstractGraphDatabase.Configuration.store_dir;
            public static final Setting neo_store = InternalAbstractGraphDatabase.Configuration.neo_store;
            public static final Setting read_only = GraphDatabaseSettings.read_only;
            public static final Setting backup_slave = GraphDatabaseSettings.backup_slave;
            public static final Setting use_memory_mapped_buffers = GraphDatabaseSettings.use_memory_mapped_buffers;
        }

        public static final String ALL_STORES_VERSION = "v0.A.2";
        public static final String UNKNOWN_VERSION = "Uknown";

        protected Config configuration;
        private final IdGeneratorFactory idGeneratorFactory;
        private final WindowPoolFactory windowPoolFactory;
        protected FileSystemAbstraction fileSystemAbstraction;
        protected final File storageFileName;
        protected final IdType idType;
        protected StringLogger stringLogger;
        private IdGenerator idGenerator = null;
        private StoreChannel fileChannel = null;
        private WindowPool windowPool;
        private boolean storeOk = true;
        private Throwable causeOfStoreNotOk;
        private FileLock fileLock;
        private boolean readOnly = false;
        private boolean backupSlave = false;
        private long highestUpdateRecordId = -1;
File name | File storage format |
---|---|
neostore.labeltokenstore.db | LabelTokenStore (TokenStore) |
neostore.labeltokenstore.db.id | ID type |
neostore.labeltokenstore.db.names | StringPropertyStore (AbstractDynamicStore, NAME_STORE_BLOCK_SIZE = 30) |
neostore.labeltokenstore.db.names.id | ID type |
neostore.nodestore.db | NodeStore |
neostore.nodestore.db.id | ID type |
neostore.nodestore.db.labels | ArrayPropertyStore (AbstractDynamicStore, label_block_size = 60) |
neostore.nodestore.db.labels.id | ID type |
neostore.propertystore.db | PropertyStore |
neostore.propertystore.db.arrays | ArrayPropertyStore (AbstractDynamicStore, array_block_size = 120) |
neostore.propertystore.db.arrays.id | ID type |
neostore.propertystore.db.id | ID type |
neostore.propertystore.db.index | PropertyIndexStore |
neostore.propertystore.db.index.id | ID type |
neostore.propertystore.db.index.keys | StringPropertyStore (AbstractDynamicStore, NAME_STORE_BLOCK_SIZE = 30) |
neostore.propertystore.db.index.keys.id | ID type |
neostore.propertystore.db.strings | StringPropertyStore (AbstractDynamicStore, string_block_size = 120) |
neostore.propertystore.db.strings.id | ID type |
neostore.relationshipgroupstore.db | RelationshipGroupStore |
neostore.relationshipgroupstore.db.id | ID type |
neostore.relationshipstore.db | RelationshipStore |
neostore.relationshipstore.db.id | ID type |
neostore.relationshiptypestore.db | RelationshipTypeTokenStore (TokenStore) |
neostore.relationshiptypestore.db.id | ID type |
neostore.relationshiptypestore.db.names | StringPropertyStore (AbstractDynamicStore, NAME_STORE_BLOCK_SIZE = 30) |
neostore.relationshiptypestore.db.names.id | ID type |
neostore.schemastore.db | SchemaStore (AbstractDynamicStore, BLOCK_SIZE = 56) |
neostore.schemastore.db.id | ID type |
In a neo4j db, every Store has its own ID file (the file with the .id suffix), and all ID files share the same format.
    [test00]$ls -lh target/neo4j-test00.db/ |grep .id
    -rw-r--r-- 9 04-11 13:28 neostore.id
    -rw-r--r-- 9 04-11 13:28 neostore.labeltokenstore.db.id
    -rw-r--r-- 9 04-11 13:28 neostore.labeltokenstore.db.names.id
    -rw-r--r-- 9 04-11 13:28 neostore.nodestore.db.id
    -rw-r--r-- 9 04-11 13:28 neostore.nodestore.db.labels.id
    -rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.arrays.id
    -rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.id
    -rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.index.id
    -rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.index.keys.id
    -rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.strings.id
    -rw-r--r-- 9 04-11 13:28 neostore.relationshipgroupstore.db.id
    -rw-r--r-- 9 04-11 13:28 neostore.relationshipstore.db.id
    -rw-r--r-- 9 04-11 13:28 neostore.relationshiptypestore.db.id
    -rw-r--r-- 9 04-11 13:28 neostore.relationshiptypestore.db.names.id
    -rw-r--r-- 9 04-11 13:28 neostore.schemastore.db.id
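3.3.1.1. Storage format of ID-type files
Files with the ".id" suffix in neo4j have the format shown in the figure above. They consist of two parts, a file header (9 bytes) and an array of longs:
sticky (1 byte): if sticky the id generator wasn’t closed properly so it has to be rebuilt (go through the node, relationship, property, rel type etc files).
nextFreeId (long): stores the largest ID; this value corresponds to the size of the storage array of the matching type.
reuseId (long): stores IDs that have been released and can be reused. Reusing IDs reduces the holes in the resource array and improves disk utilization.
3.3.1.2. IdGeneratorImpl.java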
In neo4j the ID allocation for every resource type is implemented by IdGeneratorImpl, which is responsible for allocating, releasing and reusing IDs. For each resource type (node, relationship, property and so on) an IdGenerator instance can be created to manage the allocation and reuse of its IDs.
3.3.1.2.1. Reading the id file for initialization
Below is the code in IdGeneratorImpl.java that reads the id file during initialization. IdGeneratorImpl reads grabSize reusable IDs (reuseId) from the id file into idsReadFromFile (a LinkedList<Long>); when an id is requested, the reusable IDs in idsReadFromFile are handed out first.

    private synchronized void initGenerator() {
        try {
            fileChannel = fs.open(fileName, "rw");
            ByteBuffer buffer = ByteBuffer.allocate(HEADER_SIZE);
            readHeader(buffer);
            markAsSticky(buffer);
            fileChannel.position(HEADER_SIZE);
            maxReadPosition = fileChannel.size();
            defraggedIdCount = (int) (maxReadPosition - HEADER_SIZE) / 8;
            readIdBatch();
        } catch (IOException e) {
            throw new UnderlyingStorageException("Unable to init id generator " + fileName, e);
        }
    }

    private void readHeader(ByteBuffer buffer) throws IOException {
        readPosition = fileChannel.read(buffer);
        if (readPosition != HEADER_SIZE) {
            fileChannel.close();
            throw new InvalidIdGeneratorException("Unable to read header, bytes read: " + readPosition);
        }
        buffer.flip();
        byte storageStatus = buffer.get();
        if (storageStatus != CLEAN_GENERATOR) {
            fileChannel.close();
            throw new InvalidIdGeneratorException("Sticky generator[ " + fileName
                    + "] delete this id file and build a new one");
        }
        this.highId.set(buffer.getLong());
    }

    private void readIdBatch() {
        if (!canReadMoreIdBatches())
            return;
        try {
            int howMuchToRead = (int) Math.min(grabSize * 8, maxReadPosition - readPosition);
            ByteBuffer readBuffer = ByteBuffer.allocate(howMuchToRead);
            fileChannel.position(readPosition);
            int bytesRead = fileChannel.read(readBuffer);
            assert fileChannel.position() <= maxReadPosition;
            readPosition += bytesRead;
            readBuffer.flip();
            assert (bytesRead % 8) == 0;
            int idsRead = bytesRead / 8;
            defraggedIdCount -= idsRead;
            for (int i = 0; i < idsRead; i++) {
                long id = readBuffer.getLong();
                if (id != INTEGER_MINUS_ONE) {
                    idsReadFromFile.add(id);
                }
            }
        } catch (IOException e) {
            throw new UnderlyingStorageException("Failed reading defragged id batch", e);
        }
    }
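To make the header layout concrete, here is a standalone sketch that parses an in-memory image of an ".id" file: one status byte, one 8-byte high ID, then zero or more 8-byte reusable IDs. The class name is hypothetical, and the value 0 for the "clean" marker is an assumption (the listing above only shows the name CLEAN_GENERATOR). It relies on ByteBuffer's default big-endian byte order.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Standalone reader for the .id layout described above; not Neo4j's IdGeneratorImpl. */
final class IdFileSketch {
    static final int HEADER_SIZE = 9; // 1 status byte + 8-byte high ID
    static final byte CLEAN = 0;      // assumed value of the "clean" marker

    final boolean sticky;
    final long highId;
    final List<Long> reusableIds = new ArrayList<>();

    IdFileSketch(ByteBuffer file) {
        ByteBuffer in = file.duplicate();      // do not disturb the caller's position
        sticky = in.get() != CLEAN;
        highId = in.getLong();
        while (in.remaining() >= 8) {          // everything after the header is a long
            reusableIds.add(in.getLong());
        }
    }

    public static void main(String[] args) {
        ByteBuffer image = ByteBuffer.allocate(HEADER_SIZE + 16);
        image.put(CLEAN).putLong(42L).putLong(7L).putLong(9L);
        image.flip();
        IdFileSketch parsed = new IdFileSketch(image);
        System.out.println(parsed.sticky + " " + parsed.highId + " " + parsed.reusableIds);
    }
}
```

A file that holds only the 9-byte header, like every ".id" file in the directory listing above, parses to an empty list of reusable IDs.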
3.3.1.2.2. Releasing an id (freeId)
When a user releases an id, it is first put into releasedIdList (a LinkedList<Long>). Once releasedIdList holds grabSize released ids, they are written to the end of the id file. An IdGeneratorImpl therefore caches at most 2 * grabSize ids (releasedIdList plus idsReadFromFile).

    /**
     * Frees the <CODE>id</CODE> making it a defragged id that will be
     * returned by next id before any new id (that hasn't been used yet) is
     * returned.
     * <p/>
     * This method will throw an <CODE>IOException</CODE> if id is negative or
     * if id is greater than the highest returned id. However as stated in the
     * class documentation above the id isn't validated to see if it really is
     * free.
     */
    @Override
    public synchronized void freeId(long id) {
        if (id == INTEGER_MINUS_ONE) {
            return;
        }
        if (fileChannel == null) {
            throw new IllegalStateException("Generator closed " + fileName);
        }
        if (id < 0 || id >= highId.get()) {
            throw new IllegalArgumentException("Illegal id[" + id + "]");
        }
        releasedIdList.add(id);
        defraggedIdCount++;
        if (releasedIdList.size() >= grabSize) {
            writeIdBatch(ByteBuffer.allocate(grabSize * 8));
        }
    }
3.3.1.2.3. Requesting an id (nextId)
When a user requests an id, IdGeneratorImpl can allocate it using one of two strategies, the "normal allocation strategy" and the "aggressive allocation strategy" (aggressiveReuse), which is selected through configuration.

    /**
     * Returns the next "free" id. If a defragged id exist it will be returned
     * else the next free id that hasn't been used yet is returned. If no id
     * exist the capacity is exceeded (all values <= max are taken) and a
     * {@link UnderlyingStorageException} will be thrown.
     */
    @Override
    public synchronized long nextId() {
        assertStillOpen();
        long nextDefragId = nextIdFromDefragList();
        if (nextDefragId != -1)
            return nextDefragId;
        long id = highId.get();
        if (id == INTEGER_MINUS_ONE) {
            // Skip the integer -1 (0xFFFFFFFF) because it represents
            // special values, f.ex. the end of a relationships/property chain.
            id = highId.incrementAndGet();
        }
        assertIdWithinCapacity(id);
        highId.incrementAndGet();
        return id;
    }
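The interplay of freeId and nextId can be tried out with a toy, purely in-memory allocator. This is a sketch under simplifying assumptions, not IdGeneratorImpl: there is no id file and no grabSize batching, and released IDs become reusable immediately, which is probably closer in spirit to the aggressive strategy than to the normal one. The class name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy in-memory ID allocator with reuse; not Neo4j's IdGeneratorImpl. */
final class ToyIdGenerator {
    /** Reserved value (0xFFFFFFFF) that marks the end of a chain, so it is never handed out. */
    static final long INTEGER_MINUS_ONE = 0xFFFFFFFFL;

    private final Deque<Long> freed = new ArrayDeque<>();
    private long highId;

    ToyIdGenerator(long highId) {
        this.highId = highId;
    }

    synchronized long nextId() {
        Long reused = freed.poll();
        if (reused != null) {
            return reused;                 // recycled IDs are preferred over new ones
        }
        if (highId == INTEGER_MINUS_ONE) {
            highId++;                      // skip the reserved value
        }
        return highId++;
    }

    synchronized void freeId(long id) {
        if (id == INTEGER_MINUS_ONE) {
            return;
        }
        if (id < 0 || id >= highId) {
            throw new IllegalArgumentException("Illegal id[" + id + "]");
        }
        freed.add(id);
    }

    public static void main(String[] args) {
        ToyIdGenerator ids = new ToyIdGenerator(0);
        ids.nextId();
        long second = ids.nextId();
        ids.freeId(second);
        System.out.println(ids.nextId()); // the freed ID comes back before a new one is minted
    }
}
```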
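3.3.2.1. Storage format of AbstractDynamicStore
neo4j stores variable-length values such as strings in a group of fixed-length blocks that are linked together as a singly linked list. The class AbstractDynamicStore implements this; its Javadoc comment is reproduced below.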
/**
* An abstract representation of a dynamic store. The difference between a
* normal AbstractStore and a AbstractDynamicStore is
* that the size of a record/entry can be dynamic.
* Instead of a fixed record this class uses blocks to store a record. If a
* record size is greater than the block size the record will use one or more
* blocks to store its data.
* A dynamic store don’t have a IdGenerator because the position of a
* record can’t be calculated just by knowing the id. Instead one should use a
* AbstractStore and store the start block of the record located in the
* dynamic store. Note: This class makes use of an id generator internally for
* managing free and non free blocks.
* Note, the first block of a dynamic store is reserved and contains information
* about the store.
*/
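The storage file format of the AbstractDynamicStore class is shown in the figure above. The whole file consists of an array of fixed-size blocks, with block_size = BLOCK_HEADER_SIZE (8 bytes) + block_content_size, followed by a string: "StringPropertyStore v0.A.2", "ArrayPropertyStore v0.A.2" or "SchemaStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR plus neo4j's ALL_STORES_VERSION). Blocks are accessed with the id as array index. The first 4 bytes of the first record hold block_size. Actual block data starts at the second record; each block consists of an 8-byte block_header and a fixed-size (configurable) block_content. The block_header is structured as follows:
inUse (1 byte): the first byte, split into three parts:
start/linked flag: whether the block is the first block of the singly linked list; 0 means the first block, 1 means a subsequent block.
inUse flag: whether the block is in use.
the high 4 bits of next_block.
nr_of_bytes (3 bytes): the number of data bytes stored in this block (see the header comment in getRecord() below).
next_block (4 bytes): the low 32 bits of next_block; together with bits 1~4 of the inUse byte, next_block is 36 bits long in total. It is the pointer of the array-backed singly linked list and points at the id of the next block that stores the same piece of data.
3.3.2.2. AbstractDynamicStore.java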
The getRecord() and readAndVerifyBlockSize() member functions of AbstractDynamicStore.java, shown below, help in understanding the DynamicStore storage format.

    private DynamicRecord getRecord( long blockId, PersistenceWindow window, RecordLoad load ) {
        DynamicRecord record = new DynamicRecord( blockId );
        Buffer buffer = window.getOffsettedBuffer( blockId );

        /*
         * First 4b
         * [x   ,    ][    ,    ][    ,    ][    ,    ] 0: start record, 1: linked record
         * [   x,    ][    ,    ][    ,    ][    ,    ] inUse
         * [    ,xxxx][    ,    ][    ,    ][    ,    ] high next block bits
         * [    ,    ][xxxx,xxxx][xxxx,xxxx][xxxx,xxxx] nr of bytes in the data field in this record
         */
        long firstInteger = buffer.getUnsignedInt();
        boolean isStartRecord = (firstInteger & 0x80000000) == 0;
        long maskedInteger = firstInteger & ~0x80000000;
        int highNibbleInMaskedInteger = (int) ( ( maskedInteger ) >> 28 );
        boolean inUse = highNibbleInMaskedInteger == Record.IN_USE.intValue();
        if ( !inUse && load != RecordLoad.FORCE ) {
            throw new InvalidRecordException( "DynamicRecord Not in use, blockId[" + blockId + "]" );
        }
        int dataSize = getBlockSize() - BLOCK_HEADER_SIZE;
        int nrOfBytes = (int) ( firstInteger & 0xFFFFFF );

        /*
         * Pointer to next block 4b (low bits of the pointer)
         */
        long nextBlock = buffer.getUnsignedInt();
        long nextModifier = ( firstInteger & 0xF000000L ) << 8;
        long longNextBlock = longFromIntAndMod( nextBlock, nextModifier );
        boolean readData = load != RecordLoad.CHECK;
        if ( longNextBlock != Record.NO_NEXT_BLOCK.intValue()
                && nrOfBytes < dataSize || nrOfBytes > dataSize ) {
            readData = false;
            if ( load != RecordLoad.FORCE ) {
                throw new InvalidRecordException( "Next block set[" + nextBlock
                        + "] current block illegal size[" + nrOfBytes + "/" + dataSize + "]" );
            }
        }
        record.setInUse( inUse );
        record.setStartRecord( isStartRecord );
        record.setLength( nrOfBytes );
        record.setNextBlock( longNextBlock );

        /*
         * Data 'nrOfBytes' bytes
         */
        if ( readData ) {
            byte byteArrayElement[] = new byte[nrOfBytes];
            buffer.get( byteArrayElement );
            record.setData( byteArrayElement );
        }
        return record;
    }

    protected void readAndVerifyBlockSize() throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate( 4 );
        getFileChannel().position( 0 );
        getFileChannel().read( buffer );
        buffer.flip();
        blockSize = buffer.getInt();
        if ( blockSize <= 0 ) {
            throw new InvalidRecordException( "Illegal block size: " + blockSize
                    + " in " + getStorageFileName() );
        }
    }
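The bit manipulation in getRecord() is easier to follow in isolation. The sketch below decodes the two 32-bit header words of a block using the same masks and shifts as the listing. The class name is hypothetical, and the handling of the "no next block" case is an assumption about what longFromIntAndMod does, since that helper is not shown here.

```java
/** Decodes the 8-byte block header of a dynamic record; an illustration, not Neo4j code. */
final class DynamicBlockHeaderSketch {
    final boolean startRecord;
    final boolean inUse;
    final int nrOfBytes;
    final long nextBlock;

    private DynamicBlockHeaderSketch(boolean startRecord, boolean inUse, int nrOfBytes, long nextBlock) {
        this.startRecord = startRecord;
        this.inUse = inUse;
        this.nrOfBytes = nrOfBytes;
        this.nextBlock = nextBlock;
    }

    /** Both arguments are unsigned 32-bit values carried in a long. */
    static DynamicBlockHeaderSketch decode(long firstInt, long secondInt) {
        boolean start = (firstInt & 0x80000000L) == 0;          // top bit: 0 = start, 1 = linked
        boolean inUse = ((firstInt & 0x7FFFFFFFL) >> 28) == 1;  // next bits must equal the in-use value 1
        int nrOfBytes = (int) (firstInt & 0xFFFFFF);            // low 24 bits: data length
        long highBits = (firstInt & 0x0F000000L) << 8;          // 4 high bits of the 36-bit pointer
        // Assumption: an all-ones low word with no high bits means "no next block".
        long next = (highBits == 0 && secondInt == 0xFFFFFFFFL) ? -1 : (secondInt | highBits);
        return new DynamicBlockHeaderSketch(start, inUse, nrOfBytes, next);
    }

    public static void main(String[] args) {
        DynamicBlockHeaderSketch h = decode(0x13000005L, 0x00000002L);
        System.out.println(h.startRecord + " " + h.inUse + " " + h.nrOfBytes + " " + h.nextBlock);
    }
}
```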
3.3.2.3 The classes DynamicArrayStore and DynamicStringStore
SchemaStore, DynamicArrayStore (ArrayPropertyStore) and DynamicStringStore (StringPropertyStore) all extend AbstractDynamicStore, so the files that belong to DynamicArrayStore, DynamicStringStore and SchemaStore all follow the AbstractDynamicStore storage format. They differ only in the size of the block (block_size).
db file | Store type | block_size |
---|---|---|
neostore.labeltokenstore.db.names | StringPropertyStore | NAME_STORE_BLOCK_SIZE=30 |
neostore.propertystore.db.index.keys | StringPropertyStore | NAME_STORE_BLOCK_SIZE=30 |
neostore.relationshiptypestore.db.names | StringPropertyStore | NAME_STORE_BLOCK_SIZE=30 |
neostore.propertystore.db.strings | StringPropertyStore | string_block_size=120 |
neostore.nodestore.db.labels | ArrayPropertyStore | label_block_size=60 |
neostore.propertystore.db.arrays | ArrayPropertyStore | array_block_size=120 |
neostore.schemastore.db | SchemaStore | BLOCK_SIZE=56 |
block_size is set from the configuration file or from a default value. The snippets below show how the neostore.propertystore.db.strings file is created and how block_size is passed in.
GraphDatabaseSettings.java

    public static final Setting<Integer> string_block_size = setting("string_block_size", INTEGER, "120", min(1));
    public static final Setting<Integer> array_block_size = setting("array_block_size", INTEGER, "120", min(1));
    public static final Setting<Integer> label_block_size = setting("label_block_size", INTEGER, "60", min(1));
The Configuration class in StoreFactory.java

    public static abstract class Configuration {
        public static final Setting<Integer> string_block_size = GraphDatabaseSettings.string_block_size;
        public static final Setting<Integer> array_block_size = GraphDatabaseSettings.array_block_size;
        public static final Setting<Integer> label_block_size = GraphDatabaseSettings.label_block_size;
        public static final Setting dense_node_threshold = GraphDatabaseSettings.dense_node_threshold;
    }
The createPropertyStore function in StoreFactory.java

    public void createPropertyStore( File fileName ) {
        createEmptyStore( fileName, buildTypeDescriptorAndVersion( PropertyStore.TYPE_DESCRIPTOR ) );
        int stringStoreBlockSize = config.get( Configuration.string_block_size );
        int arrayStoreBlockSize = config.get( Configuration.array_block_size );
        createDynamicStringStore( new File( fileName.getPath() + STRINGS_PART ),
                stringStoreBlockSize, IdType.STRING_BLOCK );
        createPropertyKeyTokenStore( new File( fileName.getPath() + INDEX_PART ) );
        createDynamicArrayStore( new File( fileName.getPath() + ARRAYS_PART ), arrayStoreBlockSize );
    }
The createDynamicStringStore function in StoreFactory.java

    private void createDynamicStringStore( File fileName, int blockSize, IdType idType ) {
        createEmptyDynamicStore( fileName, blockSize, DynamicStringStore.VERSION, idType );
    }
The createEmptyDynamicStore function in StoreFactory.java

    /**
     * Creates a new empty store. A factory method returning an implementation
     * should make use of this method to initialize an empty store. Block size
     * must be greater than zero. Note that the first block will be marked as
     * reserved (contains info about the block size). There will be an overhead
     * for each block of <CODE>AbstractDynamicStore.BLOCK_HEADER_SIZE</CODE> bytes.
     */
    public void createEmptyDynamicStore( File fileName, int baseBlockSize,
            String typeAndVersionDescriptor, IdType idType ) {
        int blockSize = baseBlockSize;
        // sanity checks …
        blockSize += AbstractDynamicStore.BLOCK_HEADER_SIZE;

        // write the header
        try {
            FileChannel channel = fileSystemAbstraction.create( fileName );
            int endHeaderSize = blockSize + UTF8.encode( typeAndVersionDescriptor ).length;
            ByteBuffer buffer = ByteBuffer.allocate( endHeaderSize );
            buffer.putInt( blockSize );
            buffer.position( endHeaderSize - typeAndVersionDescriptor.length() );
            buffer.put( UTF8.encode( typeAndVersionDescriptor ) ).flip();
            channel.write( buffer );
            channel.force( false );
            channel.close();
        } catch ( IOException e ) {
            throw new UnderlyingStorageException( "Unable to create store " + fileName, e );
        }
        idGeneratorFactory.create( fileSystemAbstraction, new File( fileName.getPath() + ".id" ), 0 );
        // TODO highestIdInUse = 0 works now, but not when slave can create store files.
        IdGenerator idGenerator = idGeneratorFactory.open( fileSystemAbstraction,
                new File( fileName.getPath() + ".id" ), idType.getGrabSize(), idType, 0 );
        idGenerator.nextId(); // reserve first for blockSize
        idGenerator.close();
    }
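As a rough check of the layout produced by createEmptyDynamicStore, the following sketch builds the same bytes in memory: one reserved block whose first 4 bytes hold the block size, followed by the type-and-version descriptor. The class and method names are hypothetical, and the sketch positions the descriptor by its encoded byte length, which is the same thing for the ASCII descriptors used here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Builds the image of an empty dynamic store in memory; an illustration, not Neo4j code. */
final class DynamicStoreHeaderSketch {
    static final int BLOCK_HEADER_SIZE = 8;

    static ByteBuffer emptyStore(int baseBlockSize, String typeAndVersionDescriptor) {
        int blockSize = baseBlockSize + BLOCK_HEADER_SIZE;
        byte[] descriptor = typeAndVersionDescriptor.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(blockSize + descriptor.length);
        buffer.putInt(blockSize);     // first 4 bytes of the reserved block
        buffer.position(blockSize);   // the rest of the reserved block stays zeroed
        buffer.put(descriptor);
        buffer.flip();
        return buffer;
    }

    /** Mirrors readAndVerifyBlockSize(): the block size is the first int of the file. */
    static int readBlockSize(ByteBuffer store) {
        return store.getInt(0);
    }

    public static void main(String[] args) {
        ByteBuffer store = emptyStore(120, "ArrayPropertyStore v0.A.2");
        System.out.println(store.remaining() + " bytes, block size " + readBlockSize(store));
    }
}
```

With array_block_size = 120 the image is 128 + 25 = 153 bytes, and with label_block_size = 60 it is 68 + 25 = 93 bytes, which agrees with the sizes of neostore.propertystore.db.arrays and neostore.nodestore.db.labels in the directory listing earlier.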
Below are the files that hold Property data in a neo4j graph db:
neostore.propertystore.db
neostore.propertystore.db.arrays
neostore.propertystore.db.arrays.id
neostore.propertystore.db.id
neostore.propertystore.db.index
neostore.propertystore.db.index.id
neostore.propertystore.db.index.keys
neostore.propertystore.db.index.keys.id
neostore.propertystore.db.strings
neostore.propertystore.db.strings.id
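In neo4j, Property storage is handled by four kinds of Store working together: PropertyStore, ArrayPropertyStore, StringPropertyStore and PropertyKeyTokenStore.
The storage file of the PropertyStore class is neostore.propertystore.db. The files that store string and array property values are neostore.propertystore.db.strings (StringPropertyStore) and neostore.propertystore.db.arrays (ArrayPropertyStore) respectively. The storage model is sketched as follows:
PropertyStore is the main storage structure for Properties. When the Value of a Property's key-value pair is a string or an array and needs more space than PropertyStore can hold, it is stored in a DynamicStore, that is, StringPropertyStore or ArrayPropertyStore. If it is longer than one block it is split across several blocks, and the block_id of its first block in StringPropertyStore/ArrayPropertyStore is saved in the PropertyBlock field of the corresponding PropertyStore record.
PropertyKeyTokenStore and StringPropertyStore together store the Key part of a Property. Property keys are encoded: the id of the key is stored in PropertyKeyTokenStore (neostore.propertystore.db.index), and the string name of the key is stored in the matching StringPropertyStore file, neostore.propertystore.db.index.keys.
For the storage format of ArrayPropertyStore see <3.3.2 DynamicStore types>. The file storage formats of PropertyStore and PropertyKeyTokenStore are described next.
The storage format of the neostore.propertystore.db file is sketched below. The whole file consists of an array of fixed-size records with RECORD_SIZE = 41 bytes, followed by a string descriptor "PropertyStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR plus neo4j's ALL_STORES_VERSION). Records are accessed with prop_id as the array index.
The meaning of each field in a property record is as follows: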
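highByte (1 byte): the first byte, split into two parts: the high 4 bits of next and the high 4 bits of prev.
prev (4 bytes): the properties of a Node or Relationship are organized as a doubly linked list; prev is the id of the previous property in that list. Bytes 2~5 hold the low 32 bits of the prev property_id; with bits 5~8 of highByte as the high 4 bits they form a complete 36-bit property_id.
next (4 bytes): next is the id of the next property in the doubly linked list. Bytes 6~9 hold the low 32 bits of the next property_id; with bits 1~4 of highByte as the high 4 bits they form a complete 36-bit property_id.
payload: the payload consists of a block_header (8 bytes) plus 3 property_blocks (8 bytes each), 32 bytes in total. The block_header has three parts:
key_id (24 bits): bits 1~24, the id of the property's key.
type (4 bits): bits 25~28, the type of the property's value. Supported types are string, Integer, Boolean, Float, Long, Double, Byte, Character, Short and array.
payload (36 bits): bits 29~64, 36 bits in total. Values of type Integer, Boolean, Float, Byte, Character and Short are stored directly in the payload. A long is stored directly in the payload if it fits (encodeValue below checks for 35 bits, because one payload bit is used as a marker); otherwise it is stored in the first PropertyBlock. A double is stored in the first PropertyBlock. An array or string is stored inline if, after encoding, it fits in the block_header plus the 3 PropertyBlocks; otherwise it is stored in ArrayDynamicStore/StringDynamicStore and the payload holds its array index in that store.
The code snippets below show how neo4j stores a long String property value and how it is split across multiple DynamicBlocks.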
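The way highByte extends prev and next to 36 bits, as described in the field list above, can be sketched as follows. The class name is hypothetical, and the sketch assumes that "bits 1~4" means the low nibble of highByte and "bits 5~8" the high nibble. Special handling of an "end of chain" value is left out.

```java
/** Combines the 4 high bits in highByte with a 32-bit low word; an illustration, not Neo4j code. */
final class PropertyPointerSketch {
    /** prev = low 32 bits plus the high nibble of highByte as bits 33~36. */
    static long prev(int highByte, long lowPrev) {
        return (lowPrev & 0xFFFFFFFFL) | ((highByte & 0xF0L) << 28);
    }

    /** next = low 32 bits plus the low nibble of highByte as bits 33~36. */
    static long next(int highByte, long lowNext) {
        return (lowNext & 0xFFFFFFFFL) | ((highByte & 0x0FL) << 32);
    }

    public static void main(String[] args) {
        int highByte = 0x21; // high nibble 2 belongs to prev, low nibble 1 to next
        System.out.println(Long.toHexString(prev(highByte, 5)));
        System.out.println(Long.toHexString(next(highByte, 7)));
    }
}
```

Four extra bits raise the addressable range from 2^32 to 2^36 records without widening the two 4-byte pointer fields.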
3.5.2.1 The encodeValue function
encodeValue is a member function of PropertyStore.java; it implements the encoding of property values of the different types.

    public void encodeValue( PropertyBlock block, int keyId, Object value ) {
        if ( value instanceof String ) {
            // Try short string first, i.e. inlined in the property block
            String string = (String) value;
            if ( LongerShortString.encode( keyId, string, block, PropertyType.getPayloadSize() ) ) {
                return;
            }
            // Fall back to dynamic string store
            byte[] encodedString = encodeString( string );
            Collection<DynamicRecord> valueRecords = allocateStringRecords( encodedString );
            setSingleBlockValue( block, keyId, PropertyType.STRING, first( valueRecords ).getId() );
            for ( DynamicRecord valueRecord : valueRecords ) {
                valueRecord.setType( PropertyType.STRING.intValue() );
                block.addValueRecord( valueRecord );
            }
        } else if ( value instanceof Integer ) {
            setSingleBlockValue( block, keyId, PropertyType.INT, ((Integer) value).longValue() );
        } else if ( value instanceof Boolean ) {
            setSingleBlockValue( block, keyId, PropertyType.BOOL, ((Boolean) value ? 1L : 0L) );
        } else if ( value instanceof Float ) {
            setSingleBlockValue( block, keyId, PropertyType.FLOAT, Float.floatToRawIntBits( (Float) value ) );
        } else if ( value instanceof Long ) {
            long keyAndType = keyId | (((long) PropertyType.LONG.intValue()) << 24);
            if ( ShortArray.LONG.getRequiredBits( (Long) value ) <= 35 ) {
                // We only need one block for this value, special layout compared to, say, an integer
                block.setSingleBlock( keyAndType | (1L << 28) | ((Long) value << 29) );
            } else {
                // We need two blocks for this value
                block.setValueBlocks( new long[]{keyAndType, (Long) value} );
            }
        } else if ( value instanceof Double ) {
            block.setValueBlocks( new long[]{
                    keyId | (((long) PropertyType.DOUBLE.intValue()) << 24),
                    Double.doubleToRawLongBits( (Double) value )} );
        } else if ( value instanceof Byte ) {
            setSingleBlockValue( block, keyId, PropertyType.BYTE, ((Byte) value).longValue() );
        } else if ( value instanceof Character ) {
            setSingleBlockValue( block, keyId, PropertyType.CHAR, (Character) value );
        } else if ( value instanceof Short ) {
            setSingleBlockValue( block, keyId, PropertyType.SHORT, ((Short) value).longValue() );
        } else if ( value.getClass().isArray() ) {
            // Try short array first, i.e. inlined in the property block
            if ( ShortArray.encode( keyId, value, block, PropertyType.getPayloadSize() ) ) {
                return;
            }
            // Fall back to dynamic array store
            Collection<DynamicRecord> arrayRecords = allocateArrayRecords( value );
            setSingleBlockValue( block, keyId, PropertyType.ARRAY, first( arrayRecords ).getId() );
            for ( DynamicRecord valueRecord : arrayRecords ) {
                valueRecord.setType( PropertyType.ARRAY.intValue() );
                block.addValueRecord( valueRecord );
            }
        } else {
            throw new IllegalArgumentException( "Unknown property type on: " + value + ", " + value.getClass() );
        }
    }
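The single-block layout (24-bit key id, 4-bit type, value in the remaining bits) can be sketched as a pair of pack and unpack helpers. The body of setSingleBlockValue is not shown in this article, so treating it as `keyId | type << 24 | value << 28` is an assumption based on the field widths described earlier and on the keyAndType expression in encodeValue. The class name and the type number used in main are hypothetical.

```java
/** Packs a key id, type and small value into one 64-bit property block; an illustration only. */
final class PropertyBlockSketch {
    static long pack(int keyId, int type, long value) {
        return (keyId & 0xFFFFFFL)            // bits 1~24: key id
                | ((long) (type & 0xF) << 24) // bits 25~28: value type
                | (value << 28);              // bits 29~64: inlined value
    }

    static int keyId(long block) {
        return (int) (block & 0xFFFFFF);
    }

    static int type(long block) {
        return (int) ((block >>> 24) & 0xF);
    }

    static long payload(long block) {
        return block >>> 28;
    }

    public static void main(String[] args) {
        long block = pack(5, 3, 1000L); // type 3 is an arbitrary illustrative type number
        System.out.println(keyId(block) + " " + type(block) + " " + payload(block));
    }
}
```

The Long branch of encodeValue differs slightly from this generic layout: it sets one extra marker bit at position 28 and shifts the value by 29, which is why only 35 bits remain for an inlined long.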
3.5.2.2 The allocateStringRecords function
allocateStringRecords is a member function of PropertyStore.java.

    private Collection<DynamicRecord> allocateStringRecords( byte[] chars ) {
        return stringPropertyStore.allocateRecordsFromBytes( chars );
    }
3.5.2.3 The allocateRecordsFromBytes function
allocateRecordsFromBytes is a member function of AbstractDynamicStore.java.

    protected Collection<DynamicRecord> allocateRecordsFromBytes( byte src[] ) {
        return allocateRecordsFromBytes( src,
                Collections.<DynamicRecord>emptyList().iterator(), recordAllocator );
    }
3.5.2.4 allocateRecordsFromBytes 函數allocateRecordsFromBytes
函數是 AbstractDynamicStore .java 的成員函數.
    public static Collection<DynamicRecord> allocateRecordsFromBytes( byte src[],
            Iterator<DynamicRecord> recordsToUseFirst, DynamicRecordAllocator dynamicRecordAllocator ) {
        assert src != null : "Null src argument";
        List<DynamicRecord> recordList = new LinkedList<>();
        DynamicRecord nextRecord = dynamicRecordAllocator.nextUsedRecordOrNew( recordsToUseFirst );
        int srcOffset = 0;
        int dataSize = dynamicRecordAllocator.dataSize();
        do {
            DynamicRecord record = nextRecord;
            record.setStartRecord( srcOffset == 0 );
            if ( src.length - srcOffset > dataSize ) {
                // More data remains than fits in one record: fill it and chain to the next one
                byte data[] = new byte[dataSize];
                System.arraycopy( src, srcOffset, data, 0, dataSize );
                record.setData( data );
                nextRecord = dynamicRecordAllocator.nextUsedRecordOrNew( recordsToUseFirst );
                record.setNextBlock( nextRecord.getId() );
                srcOffset += dataSize;
            } else {
                // Last record: store the remaining bytes and terminate the chain
                byte data[] = new byte[src.length - srcOffset];
                System.arraycopy( src, srcOffset, data, 0, data.length );
                record.setData( data );
                nextRecord = null;
                record.setNextBlock( Record.NO_NEXT_BLOCK.intValue() );
            }
            recordList.add( record );
            assert !record.isLight();
            assert record.getData() != null;
        } while ( nextRecord != null );
        return recordList;
    }
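The chunking logic of allocateRecordsFromBytes can be tried out in isolation. The following is a minimal sketch under stated assumptions: DynamicChunkSketch and its Block class are hypothetical stand-ins for DynamicRecord, block ids are simply handed out sequentially from firstId, and -1 stands in for Record.NO_NEXT_BLOCK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DynamicChunkSketch {
    static final long NO_NEXT_BLOCK = -1; // stand-in for Record.NO_NEXT_BLOCK

    // Hypothetical, simplified replacement for DynamicRecord.
    static final class Block {
        final long id;
        final boolean start;
        final byte[] data;
        long next = NO_NEXT_BLOCK;

        Block(long id, boolean start, byte[] data) {
            this.id = id;
            this.start = start;
            this.data = data;
        }
    }

    // Splits src into blocks of at most dataSize bytes and links them into a chain.
    static List<Block> split(byte[] src, int dataSize, long firstId) {
        List<Block> blocks = new ArrayList<>();
        int offset = 0;
        long id = firstId;
        do {
            int len = Math.min(dataSize, src.length - offset);
            Block block = new Block(id, offset == 0, Arrays.copyOfRange(src, offset, offset + len));
            offset += len;
            if (offset < src.length) {
                block.next = ++id; // more data follows: point at the next block
            }
            blocks.add(block);
        } while (offset < src.length);
        return blocks;
    }
}
```

As in the original, an empty input still produces exactly one (empty) start block, and only the id of the first block needs to be stored by the caller.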
ShortArray.encode( keyId, value, block, PropertyType.getPayloadSize() ) is implemented in kernel/impl/nioneo/store/ShortArray.java; the relevant code fragment is shown below.
    public static boolean encode( int keyId, Object array, PropertyBlock target, int payloadSizeInBytes ) {
        /*
         * If the array is huge, we don't have to check anything else.
         * So do the length check first.
         */
        int arrayLength = Array.getLength( array );
        if ( arrayLength > 63 ) /*because we only use 6 bits for length*/ {
            return false;
        }
        ShortArray type = typeOf( array );
        if ( type == null ) {
            return false;
        }
        int requiredBits = type.calculateRequiredBitsForArray( array, arrayLength );
        if ( !willFit( requiredBits, arrayLength, payloadSizeInBytes ) ) {
            // Too big array
            return false;
        }
        final int numberOfBytes = calculateNumberOfBlocksUsed( arrayLength, requiredBits ) * 8;
        if ( Bits.requiredLongs( numberOfBytes ) > PropertyType.getPayloadSizeLongs() ) {
            return false;
        }
        Bits result = Bits.bits( numberOfBytes );
        // [][][    ,bbbb][bbll,llll][yyyy,tttt][kkkk,kkkk][kkkk,kkkk][kkkk,kkkk]
        writeHeader( keyId, type, arrayLength, requiredBits, result );
        type.writeAll( array, arrayLength, requiredBits, result );
        target.setValueBlocks( result.getLongs() );
        return true;
    }

    private static void writeHeader( int keyId, ShortArray type, int arrayLength, int requiredBits,
            Bits result ) {
        result.put( keyId, 24 );
        result.put( PropertyType.SHORT_ARRAY.intValue(), 4 );
        result.put( type.type.intValue(), 4 );
        result.put( arrayLength, 6 );
        result.put( requiredBits, 6 );
    }
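The header written by writeHeader takes 24 + 4 + 4 + 6 + 6 = 44 bits. The sketch below packs the same five fields into one long by hand. It assumes, based on the bit diagram in the code comment, that fields are laid out from the least significant bit upwards; ShortArrayHeaderSketch and its method names are hypothetical and do not use the real Bits class.

```java
final class ShortArrayHeaderSketch {
    // Packs the header fields in the order writeHeader puts them, lowest bits first (assumed).
    static long pack(int keyId, int propertyType, int elementType, int arrayLength, int requiredBits) {
        long header = keyId & 0xFFFFFFL;                 // kkkk...: 24 bits of key id
        header |= (long) (propertyType & 0xF) << 24;     // tttt: property type (SHORT_ARRAY)
        header |= (long) (elementType & 0xF) << 28;      // yyyy: element type of the array
        header |= (long) (arrayLength & 0x3F) << 32;     // llllll: array length, at most 63
        header |= (long) (requiredBits & 0x3F) << 38;    // bbbbbb: bits per element
        return header;
    }

    static int keyId(long header)        { return (int) (header & 0xFFFFFFL); }
    static int propertyType(long header) { return (int) ((header >>> 24) & 0xF); }
    static int elementType(long header)  { return (int) ((header >>> 28) & 0xF); }
    static int arrayLength(long header)  { return (int) ((header >>> 32) & 0x3F); }
    static int requiredBits(long header) { return (int) ((header >>> 38) & 0x3F); }
}
```

The 6-bit length field is the reason encode rejects arrays longer than 63 elements before doing any other work.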
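The class PropertyKeyTokenStore corresponds to the store file neostore.propertystore.db.index. As shown in the figure above, the file consists of an array of records, each RECORD_SIZE=9 bytes long, followed by the string "PropertyIndexStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION). A record is accessed by using its token_id as the array index.
Each record consists of in_use (1 byte), prop_count (4 bytes) and name_id (4 bytes).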
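Because the records have a fixed size, looking one up is plain offset arithmetic. A minimal sketch, assuming big-endian integers (the default of java.nio.ByteBuffer) and that in_use is the lowest bit of the first byte; the class name TokenRecordSketch is hypothetical.

```java
import java.nio.ByteBuffer;

final class TokenRecordSketch {
    static final int RECORD_SIZE = 9; // in_use(1) + prop_count(4) + name_id(4)

    final boolean inUse;
    final int propCount;
    final int nameId;

    private TokenRecordSketch(boolean inUse, int propCount, int nameId) {
        this.inUse = inUse;
        this.propCount = propCount;
        this.nameId = nameId;
    }

    // Reads the record with the given token_id from a buffer holding the store file contents.
    static TokenRecordSketch read(ByteBuffer file, int tokenId) {
        int offset = tokenId * RECORD_SIZE;              // token_id is the array index
        boolean inUse = (file.get(offset) & 0x1) != 0;   // assumed: lowest bit is the in-use flag
        int propCount = file.getInt(offset + 1);
        int nameId = file.getInt(offset + 5);
        return new TokenRecordSketch(inUse, propCount, nameId);
    }
}
```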
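In neo4j, node storage is handled jointly by two store types: NodeStore and ArrayPropertyStore. A node's labels are stored in ArrayPropertyStore, which is a DynamicStore. If the content is longer than one block it is split across several blocks, and the block_id of its first block in ArrayPropertyStore is saved in the labels field of the corresponding record in the NodeStore file.
The following files hold node data in a neo4j graph db:
neostore.nodestore.db
neostore.nodestore.db.id
neostore.nodestore.db.labels
neostore.nodestore.db.labels.id
For the storage format of ArrayPropertyStore see <3.3.2 DynamicStore types>. The file format of NodeStore is described below.
The main file of NodeStore is neostore.nodestore.db. Its layout is sketched below: the whole file consists of a fixed-length array of RECORD_SIZE=15 byte records followed by the descriptor string "NodeStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION). A record is accessed by using its node_id as the array index.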
    // in_use(byte)+next_rel_id(int)+next_prop_id(int)+labels(5)+extra(byte)
    public static final int RECORD_SIZE = 15;
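The fields of a node record have the following meaning:
inUse (1 byte): byte 1, divided into 3 parts: bit 1 is the in-use flag, bits 2-4 are the high 3 bits of next_rel_id, and bits 5-8 are the high 4 bits of next_prop_id.
next_rel_id (4 bytes): bytes 2-5 hold the low 32 bits of the node's first relationship_id. Together with bits 2-4 of the inUse byte as the high 3 bits they form a complete 35-bit relationship_id.
next_prop_id (4 bytes): bytes 6-9 hold the low 32 bits of the node's first property_id. Together with bits 5-8 of the inUse byte as the high 4 bits they form a complete 36-bit property_id.
labels (5 bytes): bytes 10-14 are the node's label field.
extra (1 byte): byte 15. Only bit 1 is currently used; it indicates whether the node is dense. With the default configuration a node counts as dense once it has more than 50 relationships.
The class that corresponds to the neostore.nodestore.db file is NodeStore, which reads and writes NodeRecord entries in that file.
The getRecord member function of NodeStore.java, shown below, helps in understanding the node record format.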
    private NodeRecord getRecord( long id, PersistenceWindow window, RecordLoad load ) {
        Buffer buffer = window.getOffsettedBuffer( id );

        // [    ,   x] in use bit
        // [    ,xxx ] higher bits for rel id
        // [xxxx,    ] higher bits for prop id
        long inUseByte = buffer.get();
        boolean inUse = (inUseByte & 0x1) == Record.IN_USE.intValue();
        if ( !inUse ) {
            switch ( load ) {
            case NORMAL:
                throw new InvalidRecordException( "NodeRecord[" + id + "] not in use" );
            case CHECK:
                return null;
            case FORCE:
                break;
            }
        }

        long nextRel = buffer.getUnsignedInt();
        long nextProp = buffer.getUnsignedInt();
        long relModifier = (inUseByte & 0xEL) << 31;
        long propModifier = (inUseByte & 0xF0L) << 28;
        long lsbLabels = buffer.getUnsignedInt();
        // so that a negative byte won't fill the "extended" bits with ones.
        long hsbLabels = buffer.get() & 0xFF;
        long labels = lsbLabels | (hsbLabels << 32);
        byte extra = buffer.get();
        boolean dense = (extra & 0x1) > 0;

        NodeRecord nodeRecord = new NodeRecord( id, dense,
                longFromIntAndMod( nextRel, relModifier ),
                longFromIntAndMod( nextProp, propModifier ) );
        nodeRecord.setInUse( inUse );
        nodeRecord.setLabelField( labels, Collections.<DynamicRecord>emptyList() );
        return nodeRecord;
    }
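To experiment with the 15-byte layout outside neo4j, the same decoding can be written against a plain java.nio.ByteBuffer. This is a sketch under assumptions, not the real NodeStore: NodeRecordSketch is a hypothetical class, integers are assumed big-endian, and combine() reflects my understanding of longFromIntAndMod (an all-ones low word with no high bits means "no id", returned as -1).

```java
import java.nio.ByteBuffer;

final class NodeRecordSketch {
    static final int RECORD_SIZE = 15;

    final boolean inUse;
    final long nextRel;
    final long nextProp;
    final long labels;
    final boolean dense;

    private NodeRecordSketch(boolean inUse, long nextRel, long nextProp, long labels, boolean dense) {
        this.inUse = inUse;
        this.nextRel = nextRel;
        this.nextProp = nextProp;
        this.labels = labels;
        this.dense = dense;
    }

    // Assumed behaviour of longFromIntAndMod.
    static long combine(long base, long modifier) {
        return (modifier == 0 && base == 0xFFFFFFFFL) ? -1 : (base | modifier);
    }

    // Decodes the record stored at index nodeId in a buffer holding the file contents.
    static NodeRecordSketch read(ByteBuffer file, int nodeId) {
        int off = nodeId * RECORD_SIZE;
        long inUseByte = file.get(off);
        long nextRel = file.getInt(off + 1) & 0xFFFFFFFFL;   // low 32 bits, unsigned
        long nextProp = file.getInt(off + 5) & 0xFFFFFFFFL;
        long relHigh = (inUseByte & 0x0EL) << 31;            // bits 2-4 become bits 33-35
        long propHigh = (inUseByte & 0xF0L) << 28;           // bits 5-8 become bits 33-36
        long labels = (file.getInt(off + 9) & 0xFFFFFFFFL) | ((file.get(off + 13) & 0xFFL) << 32);
        boolean dense = (file.get(off + 14) & 0x1) != 0;
        return new NodeRecordSketch((inUseByte & 0x1) != 0,
                combine(nextRel, relHigh), combine(nextProp, propHigh), labels, dense);
    }
}
```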
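The following files hold relationship data in a neo4j graph db:
neostore.relationshipgroupstore.db
neostore.relationshipgroupstore.db.id
neostore.relationshipstore.db
neostore.relationshipstore.db.id
neostore.relationshiptypestore.db
neostore.relationshiptypestore.db.id
neostore.relationshiptypestore.db.names
neostore.relationshiptypestore.db.names.id
In neo4j, relationship storage is handled jointly by four store types: RelationshipStore, RelationshipGroupStore, RelationshipTypeTokenStore and StringPropertyStore. RelationshipStore is the main storage structure for relationships. Relationships are grouped only once the number of relationships of a node reaches a certain threshold, and RelationshipGroupStore holds that grouping data. RelationshipTypeTokenStore and StringPropertyStore together store the relationship types.
The string name of a relationship type is stored in StringPropertyStore, which is a DynamicStore. If it is longer than one block it is split across several blocks, and the block_id of its first block in StringPropertyStore is saved in the name_id field of the corresponding record in the RelationshipTypeTokenStore file.
For the storage format of StringPropertyStore see <3.3.2 DynamicStore types>. The file formats of RelationshipTypeTokenStore, RelationshipStore and RelationshipGroupStore are described in turn below.
The class RelationshipTypeTokenStore corresponds to the store file neostore.relationshiptypestore.db. As shown in the figure above, the file consists of an array of records, each RECORD_SIZE=5 bytes long, followed by the descriptor string "RelationshipTypeStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION). A record is accessed by using its token_id as the array index.
Each record consists of a 1-byte in_use and a 4-byte name_id.
The class RelationshipStore corresponds to the store file neostore.relationshipstore.db. Its layout is sketched below: the whole file consists of a fixed-length array of RECORD_SIZE=34 byte records followed by the descriptor string "RelationshipStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION). A record is accessed by using its relationship_id as the array index.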
    // record header size
    // directed|in_use(byte)+first_node(int)+second_node(int)+rel_type(int)+
    // first_prev_rel_id(int)+first_next_rel_id+second_prev_rel_id(int)+
    // second_next_rel_id+next_prop_id(int)+first-in-chain-markers(1)
    public static final int RECORD_SIZE = 34;
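The fields of a relationship record have the following meaning:
in_use (1 byte): byte 1, divided into 3 parts: bit 1 is the in-use flag, bits 2-4 are the high 3 bits of first_node, and bits 5-8 are the high 4 bits of next_prop_id.
first_node (4 bytes): bytes 2-5 hold the low 32 bits of the node_id of the relationship's from_node. Together with bits 2-4 of the in_use byte as the high 3 bits they form a complete 35-bit node_id.
second_node (4 bytes): bytes 6-9 hold the low 32 bits of the node_id of the relationship's to_node. Together with bits 29-31 of rel_type as the high 3 bits they form a complete 35-bit node_id.
rel_type (4 bytes): bytes 10-13, divided into 6 parts: bits 1-16 are the relationship type, bits 17-19 the high bits of second_next_rel_id, bits 20-22 the high bits of second_prev_rel_id, bits 23-25 the high bits of first_next_rel_id, bits 26-28 the high bits of first_prev_rel_id, and bits 29-31 the high bits of second_node.
first_prev_rel_id (4 bytes): bytes 14-17 hold the low 32 bits of the relationship_id of the relationship that precedes this one in the from_node's chain. Together with bits 26-28 of rel_type as the high 3 bits they form a complete 35-bit relationship_id.
first_next_rel_id (4 bytes): bytes 18-21 hold the low 32 bits of the relationship_id of the relationship that follows this one in the from_node's chain. Together with bits 23-25 of rel_type as the high 3 bits they form a complete 35-bit relationship_id.
second_prev_rel_id (4 bytes): bytes 22-25 hold the low 32 bits of the relationship_id of the relationship that precedes this one in the to_node's chain. Together with bits 20-22 of rel_type as the high 3 bits they form a complete 35-bit relationship_id.
second_next_rel_id (4 bytes): bytes 26-29 hold the low 32 bits of the relationship_id of the relationship that follows this one in the to_node's chain. Together with bits 17-19 of rel_type as the high 3 bits they form a complete 35-bit relationship_id.
next_prop_id (4 bytes): bytes 30-33 hold the low 32 bits of the property_id of this relationship's first property. Together with bits 5-8 of in_use as the high 4 bits they form a complete 36-bit property_id.
first-in-chain-markers (1 byte): only bits 1 and 2 are currently used; the author has not yet worked out what they are for.
3.7.2.1 RelationshipStore.java
The class that corresponds to the neostore.relationshipstore.db file is RelationshipStore, which reads and writes RelationshipRecord entries in that file. The getRecord member function of RelationshipStore.java, shown below, helps in understanding the relationship record format.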
    private RelationshipRecord getRecord( long id, PersistenceWindow window, RecordLoad load ) {
        Buffer buffer = window.getOffsettedBuffer( id );

        // [    ,   x] in use flag
        // [    ,xxx ] first node high order bits
        // [xxxx,    ] next prop high order bits
        long inUseByte = buffer.get();
        boolean inUse = (inUseByte & 0x1) == Record.IN_USE.intValue();
        if ( !inUse ) {
            switch ( load ) {
            case NORMAL:
                throw new InvalidRecordException( "RelationshipRecord[" + id + "] not in use" );
            case CHECK:
                return null;
            }
        }

        long firstNode = buffer.getUnsignedInt();
        long firstNodeMod = (inUseByte & 0xEL) << 31;
        long secondNode = buffer.getUnsignedInt();

        // [ xxx,    ][    ,    ][    ,    ][    ,    ] second node high order bits,     0x70000000
        // [    ,xxx ][    ,    ][    ,    ][    ,    ] first prev rel high order bits,  0xE000000
        // [    ,   x][xx  ,    ][    ,    ][    ,    ] first next rel high order bits,  0x1C00000
        // [    ,    ][  xx,x   ][    ,    ][    ,    ] second prev rel high order bits, 0x380000
        // [    ,    ][    , xxx][    ,    ][    ,    ] second next rel high order bits, 0x70000
        // [    ,    ][    ,    ][xxxx,xxxx][xxxx,xxxx] type
        long typeInt = buffer.getInt();
        long secondNodeMod = (typeInt & 0x70000000L) << 4;
        int type = (int)(typeInt & 0xFFFF);

        RelationshipRecord record = new RelationshipRecord( id,
                longFromIntAndMod( firstNode, firstNodeMod ),
                longFromIntAndMod( secondNode, secondNodeMod ), type );
        record.setInUse( inUse );

        long firstPrevRel = buffer.getUnsignedInt();
        long firstPrevRelMod = (typeInt & 0xE000000L) << 7;
        record.setFirstPrevRel( longFromIntAndMod( firstPrevRel, firstPrevRelMod ) );

        long firstNextRel = buffer.getUnsignedInt();
        long firstNextRelMod = (typeInt & 0x1C00000L) << 10;
        record.setFirstNextRel( longFromIntAndMod( firstNextRel, firstNextRelMod ) );

        long secondPrevRel = buffer.getUnsignedInt();
        long secondPrevRelMod = (typeInt & 0x380000L) << 13;
        record.setSecondPrevRel( longFromIntAndMod( secondPrevRel, secondPrevRelMod ) );

        long secondNextRel = buffer.getUnsignedInt();
        long secondNextRelMod = (typeInt & 0x70000L) << 16;
        record.setSecondNextRel( longFromIntAndMod( secondNextRel, secondNextRelMod ) );

        long nextProp = buffer.getUnsignedInt();
        long nextPropMod = (inUseByte & 0xF0L) << 28;

        byte extraByte = buffer.get();
        record.setFirstInFirstChain( (extraByte & 0x1) != 0 );
        record.setFirstInSecondChain( (extraByte & 0x2) != 0 );
        record.setNextProp( longFromIntAndMod( nextProp, nextPropMod ) );
        return record;
    }
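The 4-byte rel_type field does double duty: 16 bits of type plus five groups of 3 high-order id bits. The sketch below isolates just that field. The extraction methods use the masks and shifts from getRecord above; pack() is a hypothetical inverse written for illustration, and RelTypeFieldSketch is not a neo4j class.

```java
final class RelTypeFieldSketch {
    // Takes bits 33-35 of a 35-bit id.
    private static int high3(long id) {
        return (int) ((id >>> 32) & 0x7);
    }

    // Hypothetical inverse of the decoding in getRecord: builds the 32-bit rel_type field.
    static int pack(int type, long secondNode, long firstPrevRel, long firstNextRel,
                    long secondPrevRel, long secondNextRel) {
        return (high3(secondNode) << 28) | (high3(firstPrevRel) << 25) | (high3(firstNextRel) << 22)
                | (high3(secondPrevRel) << 19) | (high3(secondNextRel) << 16) | (type & 0xFFFF);
    }

    // Same masks and shifts as in getRecord.
    static int type(long typeInt)               { return (int) (typeInt & 0xFFFF); }
    static long secondNodeHigh(long typeInt)    { return (typeInt & 0x70000000L) << 4; }
    static long firstPrevRelHigh(long typeInt)  { return (typeInt & 0xE000000L) << 7; }
    static long firstNextRelHigh(long typeInt)  { return (typeInt & 0x1C00000L) << 10; }
    static long secondPrevRelHigh(long typeInt) { return (typeInt & 0x380000L) << 13; }
    static long secondNextRelHigh(long typeInt) { return (typeInt & 0x70000L) << 16; }
}
```

Each shift moves a 3-bit group up to bits 33-35, so OR-ing the result with the unsigned 32-bit low word yields the full 35-bit id.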
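When the number of relationships of a node exceeds a threshold, neo4j groups the relationships to improve performance. The class that implements this is RelationshipGroupStore.
Its file storage format is as follows:
The whole file consists of a fixed-length array of RECORD_SIZE=20 byte records followed by the string "RelationshipGroupStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION). A record is accessed by using its id as the array index. The first 4 bytes of the record at index 0 hold the relationship grouping threshold.
The format of a RelationshipGroupStore record is as follows: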
inUse (1 byte): byte 1, divided into 4 parts:
bit 1: whether the record is in use;
bits 2-4: the high 3 bits of next;
bits 5-7: the high 3 bits of firstOut;
bit 8: unused.
highByte (1 byte): byte 2, divided into 4 parts:
bit 1: unused;
bits 2-4: the high 3 bits of firstIn;
bits 5-7: the high 3 bits of firstLoop;
bit 8: unused.
next
firstOut
firstIn
firstLoop
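The two flag bytes described above can be decoded with the same mask-and-shift pattern used for node and relationship records. This is a sketch derived only from the bit positions listed in the text; GroupHighBitsSketch is a hypothetical name, and the shift amounts assume the high bits are placed directly above a 32-bit low word, as in the other stores.

```java
final class GroupHighBitsSketch {
    // First byte: bit 1 = in use, bits 2-4 = next high bits, bits 5-7 = firstOut high bits.
    static boolean inUse(byte inUseByte)     { return (inUseByte & 0x1) != 0; }
    static long nextHigh(byte inUseByte)     { return (inUseByte & 0x0EL) << 31; }
    static long firstOutHigh(byte inUseByte) { return (inUseByte & 0x70L) << 28; }

    // Second byte: bits 2-4 = firstIn high bits, bits 5-7 = firstLoop high bits.
    static long firstInHigh(byte highByte)   { return (highByte & 0x0EL) << 31; }
    static long firstLoopHigh(byte highByte) { return (highByte & 0x70L) << 28; }
}
```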
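Next we walk through a simple example and then look at several of the main store files, which helps in understanding the neo4j storage format described in <3–neo4j storage structure>.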
    package com.wuzhu.neo4j_exam;

    import java.util.List;
    import java.util.ArrayList;
    import java.util.Iterator;
    import org.neo4j.graphdb.Direction;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Relationship;
    import org.neo4j.graphdb.Path;
    import org.neo4j.graphdb.RelationshipType;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.graphdb.index.Index;
    import org.neo4j.graphdb.traversal.Evaluation;
    import org.neo4j.graphdb.traversal.Evaluator;
    import org.neo4j.graphdb.traversal.Evaluators;
    import org.neo4j.graphdb.traversal.Traverser;
    import org.neo4j.kernel.EmbeddedReadOnlyGraphDatabase;
    import org.neo4j.kernel.Traversal;
    import org.neo4j.kernel.Uniqueness;
    import org.neo4j.tooling.GlobalGraphOperations;
    import com.alibaba.fastjson.JSON;

    public class Neo4jTest00 {
        GraphDatabaseService gds;
        Node fromNode;
        Node toNode;
        Node companyNode;
        Relationship relationship;
        Relationship belongRelationship;

        private static enum UserRelationship implements RelationshipType {
            FELLOW, BELONG
        }

        // Creates three nodes and two relationships, then prints everything in the graph.
        public void createDb() {
            String DB_PATH = "target/neo4j-test00.db";
            GraphDatabaseFactory factory = new GraphDatabaseFactory();
            gds = factory.newEmbeddedDatabase( DB_PATH );
            GlobalGraphOperations ggo = GlobalGraphOperations.at( gds );
            try ( Transaction tx = gds.beginTx() ) {
                fromNode = gds.createNode();
                fromNode.setProperty( "prop_key_table", "prop_value_table_person" );
                fromNode.setProperty( "prop_key_name", "prop_value_name_mayu" );
                toNode = gds.createNode();
                toNode.setProperty( "prop_key_table", "prop_value_table_person" );
                toNode.setProperty( "prop_key_name", "prop_value_name_liyanhong" );
                relationship = fromNode.createRelationshipTo( toNode, UserRelationship.FELLOW );
                List<String> eventList = new ArrayList<String>();
                //eventList.add("2013福布斯中國富豪榜:李彥宏第3、馬化騰第5、馬雲第八 ");
                //eventList.add("李彥宏推輕應用馬雲入股瀏覽器 移動入口爭奪暗戰升級 ");
                eventList.add( "2013fubushi zhongguo fuhaobang:liyanhong no.3 mahuateng no.5 mayu no.8 " );
                eventList.add( "liyanhong tui qinyingyong,mayu rugu liulanqi; yidong rukou zhengduo anzhan shengji" );
                relationship.setProperty( "prop_key_event", JSON.toJSONString( eventList ) );
                companyNode = gds.createNode();
                companyNode.setProperty( "prop_key_table", "company" );
                companyNode.setProperty( "prop_key_name", "alibaba corp" );
                belongRelationship = fromNode.createRelationshipTo( companyNode, UserRelationship.BELONG );
                belongRelationship.setProperty( "event", "mayu ruhe zhuangkong alibaba? " );
                tx.success();

                // Dump every node with its properties and relationships
                Iterator<Node> iterator = ggo.getAllNodes().iterator();
                while ( iterator.hasNext() ) {
                    Node node = iterator.next();
                    Iterator<String> keysIterator = node.getPropertyKeys().iterator();
                    System.out.println( "nodeId=" + node.getId() );
                    while ( keysIterator.hasNext() ) {
                        String key = keysIterator.next();
                        System.out.println( "node property : " + key + "->" + node.getProperty( key ) );
                    }
                    Iterator<Relationship> relationshipsIterator = node.getRelationships().iterator();
                    while ( relationshipsIterator.hasNext() ) {
                        Relationship relationships = relationshipsIterator.next();
                        System.out.println( "關係:" + relationships.getType() );
                        Iterator<String> keysIterator2 = relationships.getPropertyKeys().iterator();
                        while ( keysIterator2.hasNext() ) {
                            String key = keysIterator2.next();
                            System.out.println( "relationship property : " + key + "->"
                                    + relationships.getProperty( key ) );
                        }
                    }
                }
            }
        }

        // Deletes the BELONG relationship and the company node so that freed ids show up in the store files.
        public void removeData() {
            try ( Transaction tx = gds.beginTx() ) {
                belongRelationship.delete();
                companyNode.delete();
                tx.success();
            }
        }

        public void stopDb() {
            gds.shutdown();
        }

        public static void main( String[] args ) {
            Neo4jTest00 test00 = new Neo4jTest00();
            test00.createDb();
            test00.removeData();
            test00.stopDb();
        }
    }
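After the program above has run, the neo4j db store files are generated under target/neo4j-test00.db.
Below we look at a few of the main store files to get an intuitive feel for neo4j's storage format.
To view the file contents, the author opened the neo4j_exam db store files in binary mode, printed them to PDF through a virtual printer, and colored each file according to its format.
Opening the neostore.nodestore.db.id file of neo4j_exam shows the following:
Header of the id file: the sticky value is 0 and nextFreeId is 3; the ID that has been reclaimed and can be reused is 02.
The contents of the neostore.nodestore.db file of neo4j_exam show that it holds an array of 3 node records followed by the string "NodeStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION).
The 3 node records contain the following:
Combined with the source code in 2.6.1, we can see that fromNode has node_id=0, toNode has node_id=1 and companyNode has node_id=2.
The contents of the neostore.relationshipstore.db file of neo4j_exam show that it holds an array of 2 relationship records followed by the string "RelationshipStore v0.A.2" (the file type descriptor TYPE_DESCRIPTOR combined with neo4j's ALL_STORES_VERSION).
The 2 relationship records contain the following: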
Field | Record 1 | Record 2 |
---|---|---|
in_use | 1 | 0 |
first_node | 0 | 0 |
second_node | 1 | 2 |
rel_type | 0 | 1 |
first_prev_rel_id | 1 | 2 |
first_next_rel_id | -1 | 0 |
second_prev_rel_id | 1 | 1 |
second_next_rel_id | -1 | -1 |
next_prop_id | 5 | 6 |
first-in-chain-markers | 3 | 3 |
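type=0xB denotes SHORT_STRING and type=0x9 denotes STRING.
Because the companyNode node and the belongRelationship relationship have been deleted, the block_header (key, type, value) parts of their properties property[4], property[5] and property[7] are filled with 0.
Opening the neostore.nodestore.db.id file of neo4j_exam shows the content above: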