Cassandra: Using Table Structure for Table Design and Queries | August Writing Challenge

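Preface

Column-oriented storage: its strength is also its weakness, and its weakness is still that it is column-oriented.

Environment setup

This article assumes a Cassandra instance is already installed and running.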

Creating a keyspace

Example:

CREATE KEYSPACE one WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

CREATE KEYSPACE two WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 1, 'DC2' : 3} AND durable_writes = false;

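The CREATE KEYSPACE statement accepts two options: replication and durable_writes.

The replication option takes one of two strategies: SimpleStrategy and NetworkTopologyStrategy.

Put simply, SimpleStrategy targets a single data center, while NetworkTopologyStrategy lets you set a replication factor per data center. This article uses SimpleStrategy, that is, the keyspace one.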


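Creating tables and partitioning

create table device_sensor_smoke
(
    unique_id                     text,
    time_id                       timeuuid,
    -- other columns omitted (event_id is among them)
    temperature                   double,
    primary key (unique_id, time_id, event_id)
);  -- table options (WITH ...) may go before the semicolon; ignored for now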

There are three ways to define the primary key:

1.  unique_id               text   PRIMARY KEY,
2.  primary key (unique_id, time_id, event_id)
3.  primary key ((unique_id, time_id), event_id)

  1. a PRIMARY KEY, also written primary key (a): a is the partition key and there are no clustering columns.
  2. primary key (a, b, c): a is the partition key; b and c are clustering columns.
  3. primary key ((a, b), c): a and b together form a composite partition key; c is the clustering column.
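
To make the three forms concrete, here is a minimal Python sketch that splits the body of a PRIMARY KEY clause into its partition key and clustering columns. The function name split_primary_key is made up for this illustration, and the parsing is deliberately naive (plain column names only); it is not how Cassandra itself parses CQL.

```python
def split_primary_key(definition):
    """Split the body of a PRIMARY KEY clause, e.g. "((a, b), c)",
    into (partition_key_columns, clustering_columns)."""
    body = definition.strip()
    # Drop the outer parentheses of the clause if they are present.
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1].strip()
    if body.startswith("("):
        # Composite partition key: "(a, b), c, d"
        close = body.index(")")
        partition = [col.strip() for col in body[1:close].split(",")]
        rest = body[close + 1:]
    else:
        # Single-column partition key: "a, b, c"
        first, _, rest = body.partition(",")
        partition = [first.strip()]
    clustering = [col.strip() for col in rest.split(",") if col.strip()]
    return partition, clustering


for clause in ["(a)", "(a, b, c)", "((a, b), c)"]:
    print(clause, "->", split_primary_key(clause))
```

The only thing that changes between form 2 and form 3 is the extra pair of parentheses, yet it moves b from the clustering columns into the partition key.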

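Within a table, CQL defines the notion of a partition. A partition is simply the set of rows that share the same partition key value. Note that if the partition key consists of several columns, rows belong to the same partition only when they have the same values for all of those partition key columns.

For example, given the following table definition and content: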

CREATE TABLE t (
    a int,
    b int,
    c int,
    d int,
    PRIMARY KEY ((a, b), c, d)
);

SELECT * FROM t;
   a | b | c | d
  ---+---+---+---
   0 | 0 | 0 | 0    // row 1
   0 | 0 | 1 | 1    // row 2
   0 | 1 | 2 | 2    // row 3
   0 | 1 | 3 | 3    // row 4
   1 | 1 | 4 | 4    // row 5

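Rows 1 and 2 are in the same partition, rows 3 and 4 share another one, and row 5 is in a partition of its own. This explains why our project table puts unique_id first in the primary key: unique_id has many repeated values, so all rows for one unique_id land in the same partition, where Cassandra stores and returns them in clustering order.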
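
The grouping described above can be reproduced with a few lines of Python. This is only a toy model that groups the five sample rows by the partition key (a, b); the helper name group_by_partition is invented here and nothing in it talks to Cassandra.

```python
from collections import defaultdict

# The five rows of table t, with PRIMARY KEY ((a, b), c, d).
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},  # row 1
    {"a": 0, "b": 0, "c": 1, "d": 1},  # row 2
    {"a": 0, "b": 1, "c": 2, "d": 2},  # row 3
    {"a": 0, "b": 1, "c": 3, "d": 3},  # row 4
    {"a": 1, "b": 1, "c": 4, "d": 4},  # row 5
]


def group_by_partition(rows, partition_key):
    """Group rows by the tuple of their partition key column values."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in partition_key)
        partitions[key].append(row)
    return dict(partitions)


partitions = group_by_partition(rows, ["a", "b"])
for key, members in partitions.items():
    print(key, "->", len(members), "row(s)")
```

Five rows collapse into three partitions, because a row joins an existing partition only when both a and b match.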

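Within a partition, rows are kept sorted by their clustering columns, as this single-partition table shows:

CREATE TABLE t (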
    a int,
    b int,
    c int,
    PRIMARY KEY (a, b, c)
);

SELECT * FROM t;
   a | b | c
  ---+---+---
   0 | 0 | 4     // row 1
   0 | 1 | 9     // row 2
   0 | 2 | 2     // row 3
   0 | 3 | 3     // row 4

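It is because of this property that we put unique_id, which has many repeated values, first in the primary key; it plays the role of a in the table above.

That way the rows for each unique_id are stored in order. If we instead put a unique value such as time_id first, every value would become its own partition. With the default partitioner, partitions are placed by a hash (token) of the partition key rather than by its value, so the overall result would come back in no meaningful order.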
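
A small Python sketch may help picture this. It is a toy in-memory model, assuming nothing about Cassandra's real storage engine: ToyTable and its methods are hypothetical names, and all the model does is keep the rows of each partition sorted by their clustering columns.

```python
class ToyTable:
    """Toy model: partitions keyed by partition key, rows sorted by clustering key."""

    def __init__(self, partition_key, clustering_columns):
        self.partition_key = partition_key
        self.clustering_columns = clustering_columns
        self.partitions = {}  # partition key tuple -> {clustering tuple: row}

    def insert(self, row):
        pk = tuple(row[c] for c in self.partition_key)
        ck = tuple(row[c] for c in self.clustering_columns)
        # A row with the same full primary key overwrites the old one (upsert).
        self.partitions.setdefault(pk, {})[ck] = row

    def read_partition(self, *pk):
        rows = self.partitions.get(tuple(pk), {})
        return [rows[ck] for ck in sorted(rows)]


data = [
    {"a": 0, "b": 3, "c": 3},
    {"a": 0, "b": 0, "c": 4},
    {"a": 0, "b": 2, "c": 2},
    {"a": 0, "b": 1, "c": 9},
]

# PRIMARY KEY (a, b, c): one partition, rows sorted by (b, c).
by_a = ToyTable(["a"], ["b", "c"])
# PRIMARY KEY (c, a, b): c is unique here, so every row is its own partition.
by_c = ToyTable(["c"], ["a", "b"])
for row in data:
    by_a.insert(row)
    by_c.insert(row)

print([(r["b"], r["c"]) for r in by_a.read_partition(0)])
print(len(by_a.partitions), "partition vs", len(by_c.partitions), "partitions")
```

With a as the partition key, the rows come back sorted by b no matter what order they were inserted in. With the unique column c in front, there is nothing left to sort inside each partition.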

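Query statements

The CREATE TABLE statements above already showed that CQL follows strict rules, and the same strictness carries over to queries. Create the following table:

CREATE TABLE posts (
    userid text,
    blog_title text,
    posted_at timestamp,
    entry_title text,
    content text,
    category int,
    PRIMARY KEY (userid, blog_title, posted_at)
);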

With no data inserted into the table, run the following query:

SELECT entry_title, content FROM posts
 WHERE userid = 'john doe'
   AND blog_title='John''s Blog'
   AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31';

The result is (the table is still empty):

 entry_title | content
-------------+---------

(0 rows)

Next, run this query:

SELECT entry_title, content FROM posts
 WHERE userid = 'john doe'
   AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31';

-- Result:
InvalidRequest: Error from server: code=2200 [Invalid query] message="PRIMARY KEY column "posted_at" cannot be restricted as preceding column "blog_title" is not restricted"

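If you know SQL, nothing looks wrong with the second query at first glance, especially since the first one succeeded. But the primary key is (userid, blog_title, posted_at): the first query restricts the columns in primary key order, while the second restricts only userid and posted_at.

userid and posted_at are not adjacent in the primary key, so the query is rejected. Cassandra does allow filtering, but it also has to take into account whether the requested rows are contiguous in storage.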
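
The rule behind that error can be written down as a short check. The Python sketch below is a simplified model under stated assumptions: check_restrictions is a made-up name, only "eq" and "range" restrictions are modelled (no IN, no token(), no indexes), and the messages are this sketch's own wording rather than the server's.

```python
def check_restrictions(partition_key, clustering_columns, restricted):
    """Return None if the WHERE clause fits the primary key layout,
    otherwise a short explanation.

    restricted maps a column name to "eq" or "range".
    """
    for col in partition_key:
        if restricted.get(col) != "eq":
            return f"partition key column {col} must be restricted by equality"
    gap = None         # first clustering column left unrestricted
    range_col = None   # first clustering column with a range restriction
    for col in clustering_columns:
        kind = restricted.get(col)
        if kind is None:
            if gap is None:
                gap = col
            continue
        if gap is not None:
            return f"{col} cannot be restricted: preceding column {gap} is not restricted"
        if range_col is not None:
            return f"{col} cannot be restricted: preceding column {range_col} has a range restriction"
        if kind == "range":
            range_col = col
    return None


pk, ck = ["userid"], ["blog_title", "posted_at"]
first = check_restrictions(pk, ck, {"userid": "eq", "blog_title": "eq", "posted_at": "range"})
second = check_restrictions(pk, ck, {"userid": "eq", "posted_at": "range"})
print(first)
print(second)
```

The first query passes because its restrictions form an unbroken prefix of the clustering columns; the second skips blog_title, so the rows it asks for are no longer one contiguous slice of the partition.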

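select * from posts;
select * from posts where entry_title = 'How to use CQL';

Both of these queries are accepted, provided a secondary index exists on entry_title; without one, the second query is itself rejected unless ALLOW FILTERING is added.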

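However:

select * from posts where entry_title = 'How to use CQL' and category = 1;

This query is rejected. Neither entry_title nor category is part of the primary key, so even with an index on entry_title Cassandra would still have to read every row with that title and discard those whose category does not match. It cannot guarantee that it will avoid scanning a large amount of data even when very few rows satisfy the conditions.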

You can, however, append ALLOW FILTERING to force execution, and the same applies to the query that failed earlier. To make those two queries run, write them as follows:

select * from posts where entry_title = 'How to use CQL' and category = 1 ALLOW FILTERING;

SELECT entry_title, content FROM posts
 WHERE userid = 'john doe'
   AND posted_at >= '2012-01-01' AND posted_at < '2012-01-31' ALLOW FILTERING;
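
To get a feel for why Cassandra makes you ask for filtering explicitly, here is a rough Python sketch of what a filtered scan costs. It is a toy: filtered_scan and the generated posts are invented for the illustration, and the real read path is far more involved.

```python
def filtered_scan(rows, predicate):
    """Examine every row, keep the ones matching predicate,
    and report how many rows had to be looked at."""
    examined = 0
    matches = []
    for row in rows:
        examined += 1
        if predicate(row):
            matches.append(row)
    return matches, examined


# 10,000 made-up posts; only one has the title we are after.
posts = [{"entry_title": f"post {i}", "category": i % 10} for i in range(10_000)]

matches, examined = filtered_scan(
    posts,
    lambda r: r["entry_title"] == "post 42" and r["category"] == 2,
)
print(len(matches), "match out of", examined, "rows examined")
```

A single row comes back, but every row had to be read to find it. That gap between rows returned and rows examined is the cost ALLOW FILTERING asks you to accept.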

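Sorting

Within a partition, the result of any Cassandra query is in fact ordered. By default it follows the clustering order declared when the table was created (ascending if none is declared, as in table t above).

If you need a different order, Cassandra supports custom ordering too, though with heavy restrictions.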

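Create the following two tables. teacher uses the default clustering order; tt has the same columns but declares its own:

create table teacher(
    id int,
    address text,
    name text,
    age int,
    height int,
    primary key(id,address,name)
);  -- default clustering order: address ASC, name ASC

create table tt(
    id int, address text, name text, age int, height int,
    primary key(id,address,name)
) WITH CLUSTERING ORDER BY (address DESC, name ASC);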

Insert the following data:

insert into teacher(id,address,name,age,height) values(1,'guangdong','lixiao',32,172);
insert into teacher(id,address,name,age,height) values(1,'guangxi','linzexu',68,178);
insert into teacher(id,address,name,age,height) values(1,'guangxi','lihao',25,178);
insert into teacher(id,address,name,age,height) values(2,'guangxi','lixiaolong',32,172);
insert into teacher(id,address,name,age,height) values(2,'guangdong','lixiao',32,172);
insert into teacher(id,address,name,age,height) values(2,'guangxi','linzexu',68,178);
insert into teacher(id,address,name,age,height) values(2,'guangxi','lihao',25,178);
insert into teacher(id,address,name,age,height) values(2,'guangxi','nnd',32,172);

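From the table-creation and query patterns above, we already know that every restriction comes from the same culprit: the table structure.

So let's go straight to the statements and see what trouble column-oriented storage stirs up this time.

Valid examples 1 (table teacher, default clustering order):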

SELECT * FROM teacher WHERE id=1 ORDER BY address ASC;
SELECT * FROM teacher WHERE id=1 ORDER BY address ASC, name ASC;
SELECT * FROM teacher WHERE id=1 AND address='guangxi' ORDER BY address ASC;
SELECT * FROM teacher WHERE id=1 AND address='guangxi' ORDER BY address ASC, name ASC;
SELECT * FROM teacher WHERE id=1 ORDER BY address DESC;
SELECT * FROM teacher WHERE id=1 ORDER BY address DESC, name DESC;
SELECT * FROM teacher WHERE id=1 AND address='guangxi' ORDER BY address DESC;
SELECT * FROM teacher WHERE id=1 AND address='guangxi' ORDER BY address DESC, name DESC;

Valid examples 2 (table tt, clustering order address DESC, name ASC):

SELECT * FROM tt WHERE id=1 ORDER BY address DESC;
SELECT * FROM tt WHERE id=1 ORDER BY address DESC, name ASC;
SELECT * FROM tt WHERE id=1 AND address='guangxi' ORDER BY address DESC;
SELECT * FROM tt WHERE id=1 AND address='guangxi' ORDER BY address DESC, name ASC;
SELECT * FROM tt WHERE id=1 ORDER BY address ASC;
SELECT * FROM tt WHERE id=1 ORDER BY address ASC, name DESC;
SELECT * FROM tt WHERE id=1 AND address='guangxi' ORDER BY address ASC;
SELECT * FROM tt WHERE id=1 AND address='guangxi' ORDER BY address ASC, name DESC;

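Invalid examples

SELECT * FROM teacher ORDER BY address DESC;                        // the partition key (id) is not restricted
SELECT * FROM teacher WHERE id=1 ORDER BY name DESC;                // ordering must start with the first clustering column (address)
SELECT * FROM teacher WHERE id=1 ORDER BY address DESC, name ASC;   // neither the declared order nor its exact reverse (default is address ASC, name ASC)
SELECT * FROM teacher WHERE age=1 ORDER BY address DESC;            // ORDER BY cannot be combined with a lookup on an indexed column
SELECT * FROM tt WHERE id=1 ORDER BY address DESC, name DESC;       // neither the declared order nor its exact reverse (declared as address DESC, name ASC)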
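
The pattern in the valid and invalid examples boils down to one rule: ORDER BY must name a leading run of the clustering columns, and its directions must either all match the declared clustering order or all be reversed. The Python sketch below encodes just that rule. order_by_allowed is a hypothetical helper; it assumes the partition key is already restricted by equality and ignores every other part of the WHERE clause.

```python
def order_by_allowed(clustering_order, requested):
    """clustering_order: list of (column, "ASC" | "DESC") as declared on the table.
    requested: list of (column, direction) taken from the ORDER BY clause."""
    if not requested or len(requested) > len(clustering_order):
        return False
    prefix = clustering_order[:len(requested)]
    # The columns must be the leading clustering columns, in declared order.
    if [col for col, _ in requested] != [col for col, _ in prefix]:
        return False
    pairs = list(zip(requested, prefix))
    same = all(req_dir == decl_dir for (_, req_dir), (_, decl_dir) in pairs)
    reverse = all(req_dir != decl_dir for (_, req_dir), (_, decl_dir) in pairs)
    return same or reverse


teacher = [("address", "ASC"), ("name", "ASC")]   # default clustering order
tt = [("address", "DESC"), ("name", "ASC")]       # declared with CLUSTERING ORDER BY

print(order_by_allowed(teacher, [("address", "DESC"), ("name", "DESC")]))
print(order_by_allowed(teacher, [("address", "DESC"), ("name", "ASC")]))
print(order_by_allowed(tt, [("address", "ASC"), ("name", "DESC")]))
print(order_by_allowed(tt, [("address", "DESC"), ("name", "DESC")]))
```

Reading a partition forwards or backwards is cheap because the rows are already stored in clustering order; any other ordering would require a real sort, which is why it is refused.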

Recommended reading / references

Spring Data for Apache Cassandra

spring.io/projects/sp…

Apache Cassandra official documentation

cassandra.apache.org/doc/latest/…

Web article

www.bbsmax.com/A/kmzLXBYGz…
