Greenplum 6 Database
Data distribution in Greenplum 6
1. Hash distribution
2. Random distribution
3. Replicated distribution
Basic syntax
1. Getting syntax help
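Each of the three policies corresponds to a form of the DISTRIBUTED clause of CREATE TABLE. A minimal sketch, assuming Greenplum 6; the table names dist_hash, dist_random, and dist_replicated are hypothetical and used only for illustration:

```sql
-- Hash distribution: rows are assigned to segments by hashing the key column(s).
create table dist_hash (id int, name varchar(128)) distributed by (id);

-- Random distribution: rows are spread across segments without a key.
create table dist_random (id int, name varchar(128)) distributed randomly;

-- Replicated distribution: every segment stores a full copy of the table.
create table dist_replicated (id int, name varchar(128)) distributed replicated;
```

Replicated tables tend to suit small lookup tables that are joined often, since each segment can then join locally.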
[gpadmin@mdw logs]$ psql
psql (9.4.24)
Type "help" for help.
postgres=# \h
Available help:
ABORT ALTER TEXT SEARCH TEMPLATE CREATE RESOURCE GROUP DROP FUNCTION LOAD
ALTER AGGREGATE ALTER TRIGGER CREATE RESOURCE QUEUE DROP GROUP LOCK
ALTER COLLATION ALTER TYPE CREATE ROLE DROP INDEX MOVE
ALTER CONVERSION ALTER USER CREATE RULE DROP LANGUAGE NOTIFY
ALTER DATABASE ALTER USER MAPPING CREATE SCHEMA DROP MATERIALIZED VIEW PREPARE
ALTER DEFAULT PRIVILEGES ALTER VIEW CREATE SEQUENCE DROP OPERATOR PREPARE TRANSACTION
ALTER DOMAIN ANALYZE CREATE SERVER DROP OPERATOR CLASS REASSIGN OWNED
ALTER EVENT TRIGGER BEGIN CREATE TABLE DROP OPERATOR FAMILY REFRESH MATERIALIZED VIEW
ALTER EXTENSION CHECKPOINT CREATE TABLE AS DROP OWNED REINDEX
ALTER EXTERNAL TABLE CLOSE CREATE TABLESPACE DROP PROTOCOL RELEASE SAVEPOINT
ALTER FOREIGN DATA WRAPPER CLUSTER CREATE TEXT SEARCH CONFIGURATION DROP RESOURCE GROUP RESET
ALTER FOREIGN TABLE COMMENT CREATE TEXT SEARCH DICTIONARY DROP RESOURCE QUEUE REVOKE
ALTER FUNCTION COMMIT CREATE TEXT SEARCH PARSER DROP ROLE ROLLBACK
ALTER GROUP COMMIT PREPARED CREATE TEXT SEARCH TEMPLATE DROP RULE ROLLBACK PREPARED
ALTER INDEX COPY CREATE TRIGGER DROP SCHEMA ROLLBACK TO SAVEPOINT
ALTER LANGUAGE CREATE AGGREGATE CREATE TYPE DROP SEQUENCE SAVEPOINT
ALTER LARGE OBJECT CREATE CAST CREATE USER DROP SERVER SECURITY LABEL
ALTER MATERIALIZED VIEW CREATE COLLATION CREATE USER MAPPING DROP TABLE SELECT
ALTER OPERATOR CREATE CONVERSION CREATE VIEW DROP TABLESPACE SELECT INTO
ALTER OPERATOR CLASS CREATE DATABASE DEALLOCATE DROP TEXT SEARCH CONFIGURATION SET
ALTER OPERATOR FAMILY CREATE DOMAIN DECLARE DROP TEXT SEARCH DICTIONARY SET CONSTRAINTS
ALTER PROTOCOL CREATE EVENT TRIGGER DELETE DROP TEXT SEARCH PARSER SET ROLE
ALTER RESOURCE GROUP CREATE EXTENSION DISCARD DROP TEXT SEARCH TEMPLATE SET SESSION AUTHORIZATION
ALTER RESOURCE QUEUE CREATE EXTERNAL TABLE DO DROP TRIGGER SET TRANSACTION
ALTER ROLE CREATE FOREIGN DATA WRAPPER DROP AGGREGATE DROP TYPE SHOW
ALTER RULE CREATE FOREIGN TABLE DROP CAST DROP USER START TRANSACTION
ALTER SCHEMA CREATE FUNCTION DROP COLLATION DROP USER MAPPING TABLE
ALTER SEQUENCE CREATE GROUP DROP CONVERSION DROP VIEW TRUNCATE
ALTER SERVER CREATE INDEX DROP DATABASE END UNLISTEN
ALTER SYSTEM CREATE LANGUAGE DROP DOMAIN EXECUTE UPDATE
ALTER TABLE CREATE MATERIALIZED VIEW DROP EVENT TRIGGER EXPLAIN VACUUM
ALTER TABLESPACE CREATE OPERATOR DROP EXTENSION FETCH VALUES
ALTER TEXT SEARCH CONFIGURATION CREATE OPERATOR CLASS DROP EXTERNAL TABLE GRANT WITH
ALTER TEXT SEARCH DICTIONARY CREATE OPERATOR FAMILY DROP FOREIGN DATA WRAPPER INSERT
ALTER TEXT SEARCH PARSER CREATE PROTOCOL DROP FOREIGN TABLE LISTEN
postgres=#
2. Creating a database
[gpadmin@mdw logs]$ createdb testDB -E utf-8
[gpadmin@mdw logs]$ psql -h 10.10.10.101 -p 5432 -d testDB -U gpadmin
psql (9.4.24)
Type "help" for help.
testDB=# \q
[gpadmin@mdw logs]$ export PGDATABASE=testDB
[gpadmin@mdw logs]$ psql
psql (9.4.24)
Type "help" for help.
testDB=#
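3. CREATE TABLE statements
- Specify the table's distribution key when creating a table in Greenplum; if the DISTRIBUTED BY clause is omitted, Greenplum picks a default key and prints a NOTICE.
- If a table needs to be partitioned on some column, use PARTITION BY to create it as a partitioned table.
- LIKE creates a table with the same structure as the referenced table, similar to create table t1 as select * from t2 limit 0.
- INHERITS implements table inheritance; see the PostgreSQL documentation for details.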
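The PARTITION BY option mentioned above can be sketched as follows. This assumes the classic Greenplum 6 partition syntax; the table sales_demo and its columns are hypothetical and do not appear elsewhere in this walkthrough:

```sql
-- Range-partitioned table: one child partition per month of 2020.
create table sales_demo (
    id        int,
    sale_date date,
    amount    numeric(10,2)
)
distributed by (id)                 -- distribution across segments
partition by range (sale_date)      -- partitioning within each segment
(
    start (date '2020-01-01') inclusive
    end   (date '2021-01-01') exclusive
    every (interval '1 month')
);
```

Distribution and partitioning are independent: the first decides which segment stores a row, the second decides which child table within that segment holds it.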
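-- Show the syntax
\h create table
-- Create tables
create table test001(id int,name varchar(128)); -- no DISTRIBUTED BY clause: id becomes the distribution key
create table test002(id int,name varchar(128)) distributed by (id); -- same result, stated explicitly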
testDB=# create table test001(id int,name varchar(128));
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
testDB=# create table test002(id int,name varchar(128)) distributed by (id);
CREATE TABLE
testDB=#
create table test003(id int,name varchar(128)) distributed by (id,name); -- multi-column distribution key
testDB=# create table test003(id int,name varchar(128)) distributed by (id,name);
CREATE TABLE
create table test004(id int,name varchar(128)) distributed randomly; -- random distribution
testDB=# create table test004(id int,name varchar(128)) distributed randomly;
CREATE TABLE
create table test005(id int primary key ,name varchar(128));
create table test006(id int unique ,name varchar(128));
testDB=# create table test005(id int primary key ,name varchar(128));
CREATE TABLE
testDB=# create table test006(id int unique ,name varchar(128));
CREATE TABLE
testDB=#
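A table that has a primary key or unique constraint uses that key as its distribution key by default. Each segment is a standalone database that can enforce uniqueness only within itself; uniqueness across segments cannot be checked globally. Distributing by the unique key places all rows with the same key value on the same segment, which is what makes the constraint enforceable.
-- It is often stated that if the specified distribution key differs from the primary key, the distribution key is changed to the primary key. Greenplum 6 does not appear to behave this way; it rejects the statement as incompatible, as shown below: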
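To confirm which distribution key a table ended up with, the catalog can be inspected. A small sketch, assuming the tables test005 and test006 created above and assuming the pg_get_table_distributedby function is available, as it is in Greenplum 6:

```sql
-- psql meta-command: the table description includes a "Distributed by:" line.
\d test005

-- The same information as a query result.
select pg_get_table_distributedby('test005'::regclass);
select pg_get_table_distributedby('test006'::regclass);
```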
create table test007(id int unique,name varchar(128)) distributed by (id,name);
testDB=# create table test007(id int unique,name varchar(128)) distributed by (id,name);
ERROR: UNIQUE constraint and DISTRIBUTED BY definitions are incompatible
HINT: When there is both a UNIQUE constraint and a DISTRIBUTED BY clause, the DISTRIBUTED BY clause must be a subset of the UNIQUE constraint.
testDB=#
Following the hint, change the DISTRIBUTED BY clause to id:
testDB=# create table test007(id int unique,name varchar(128)) distributed by (id);
CREATE TABLE
testDB=#
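-- Create a table with the same structure
create table test_like (like test001);
-- A table created with LIKE copies only the column structure of the source table. Special table properties such as compression or append-only (appendonly) storage are not copied. If no distribution key is specified, it defaults to that of the source table.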
testDB=# create table test_like (like test001);
NOTICE: table doesn't have 'DISTRIBUTED BY' clause, defaulting to distribution columns from LIKE table
CREATE TABLE
testDB=#
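Because LIKE does not carry over storage properties, they have to be stated again when the copy should have them. A hedged sketch, assuming Greenplum 6 storage options; the table name test_like_ao and the chosen compression settings are hypothetical:

```sql
-- Copy the columns of test001, stating storage options and distribution key explicitly.
create table test_like_ao (like test001)
with (appendonly=true, compresstype=zlib, compresslevel=5)
distributed by (id);
```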
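4. INSERT statements
When running INSERT, make sure the distribution key is not left empty. Otherwise the key defaults to NULL, all such rows are stored on a single segment, and the data becomes unevenly distributed.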
insert into test001 values (100,'tom'),(101,'lily'),(102,'jack'),(103,'linda');
insert into test002 values (200,'tom'),(101,'lily'),(202,'jack'),(103,'linda');
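One way to see how rows are spread over the segments is to group by the gp_segment_id system column. The sketch below assumes the test001 table from above; the NULL-key inserts are run inside a transaction and rolled back so the walkthrough data is left unchanged:

```sql
-- Row count per segment for the rows inserted above.
select gp_segment_id, count(*)
from test001
group by gp_segment_id
order by gp_segment_id;

begin;
-- No value is given for id, so the distribution key is NULL for every row.
insert into test001 (name) values ('a'), ('b'), ('c');
-- All NULL-key rows hash to the same segment.
select gp_segment_id, count(*)
from test001
where id is null
group by gp_segment_id;
rollback;
```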
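5. UPDATE statements
Avoid bulk updates on the distribution key: changing a distribution key value requires the affected rows to be redistributed across segments. A single-row update such as the one below succeeds on Greenplum 6.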
testDB=# update test002 set id=203 where id=202;
UPDATE 1
testDB=#
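The effect of a key update on row placement can be observed through gp_segment_id. A sketch assuming the test002 data above; the new id value 204 is arbitrary, and the transaction is rolled back so the data stays as shown:

```sql
begin;
-- Segment that currently stores the row.
select gp_segment_id, id, name from test002 where name = 'jack';
-- Change the distribution key value.
update test002 set id = 204 where name = 'jack';
-- The row may now be on a different segment, depending on how the new value hashes.
select gp_segment_id, id, name from test002 where name = 'jack';
rollback;
```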
6. DELETE and TRUNCATE statements
In Greenplum 3.x, a DELETE whose subquery result required data redistribution failed with an error. Greenplum 4.x and later support this operation.
testDB=# delete from test001 where name in (select name from test002);
DELETE 4
testDB=#
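Deleting every row of a table with DELETE is slow; use TRUNCATE instead.
TRUNCATE removes the table's physical files directly and then creates new data files. If another SQL statement is operating on the table, TRUNCATE blocks until all locks on the table have been released.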
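The blocking behavior can be reproduced with three psql sessions. This is an illustrative sketch, assuming the test001 table from above; note that the TRUNCATE really empties the table once it proceeds:

```sql
-- Session 1: hold a lock on the table inside an open transaction.
begin;
select count(*) from test001;

-- Session 2: waits until session 1 commits or rolls back.
truncate table test001;

-- Session 3: inspect locks on the table; granted = false marks the waiting TRUNCATE.
select locktype, mode, granted, pid
from pg_locks
where relation = 'test001'::regclass;
```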
7. SELECT statements
testDB=# select * from test001 x,test002 y where x.id=y.id;
id | name | id | name
-----+-------+-----+-------
103 | linda | 103 | linda
101 | lily | 101 | lily
(2 rows)
8. Execution plans
testDB=# explain select * from test001 x,test002 y where x.id=y.id;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Gather Motion 6:1  (slice1; segments: 6)  (cost=0.00..862.00 rows=5 width=18)
   ->  Hash Join  (cost=0.00..862.00 rows=1 width=18)
         Hash Cond: (test001.id = test002.id)
         ->  Seq Scan on test001  (cost=0.00..431.00 rows=1 width=9)
         ->  Hash  (cost=431.00..431.00 rows=1 width=9)
               ->  Seq Scan on test002  (cost=0.00..431.00 rows=1 width=9)
 Optimizer: Pivotal Optimizer (GPORCA)
(7 rows)
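In the plan above both tables are distributed by id, so the join runs locally on each segment and the only Motion is the final Gather. For comparison, a sketch that joins against the randomly distributed test004 table created earlier; the exact plan depends on the optimizer and on table statistics, so no output is shown:

```sql
-- test004 has no distribution key, so joining on id is expected to add a
-- Redistribute Motion or Broadcast Motion below the join.
explain select * from test001 x, test004 y where x.id = y.id;

-- EXPLAIN ANALYZE executes the query and reports actual row counts and timings.
explain analyze select * from test001 x, test002 y where x.id = y.id;
```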