Greenplum Big Data Platform: Basic Usage (Part 1)

I, Basic syntax

  01, Creating a database

[gpadmin@greenplum01 ~]$ createdb testDB -E utf-8
-- create the database testDB with UTF-8 encoding
[gpadmin@greenplum01 ~]$ export PGDATABASE=testDB
-- set the default database name for client tools
[gpadmin@greenplum01 ~]$  psql
-- connect to the local database
psql (8.3.23)
Type "help" for help.

testDB=# SELECT version();
 version
-----------------------------------------------------------------------
 PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15
(1 row)

  02, Getting help

postgres=# \h create view;
Command:     CREATE VIEW
Description: define a new view
Syntax:
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] VIEW name [ ( column_name [, ...] ) ]
    AS query

postgres=# \h create
Command:     CREATE AGGREGATE
Description: define a new aggregate function
Syntax:
CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
.....
--- \h prints the syntax help for a SQL command

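  03, CREATE TABLE statements

-- look up the syntax
\h create table
-- create tables
create table test001(id int,name varchar(128)); -- no DISTRIBUTED clause: id (the first column) becomes the distribution key
create table test002(id int,name varchar(128)) distributed by (id); -- same as above, stated explicitly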



create table test003(id int,name varchar(128)) distributed by (id,name); -- multiple distribution keys

create table test004(id int,name varchar(128)) distributed randomly; -- random distribution, no distribution key



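create table test005(id int primary key,name varchar(128)); -- the primary key doubles as the distribution key
create table test006(id int unique,name varchar(128));

-- likely to be rejected: a UNIQUE constraint has to cover every distribution key column
create table test007(id int unique,name varchar(128)) distributed by (id,name);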



--- create a table with the same structure as an existing one
create table test_like (like test001);

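  04, INSERT statements

  When running INSERT, make sure the distribution key column is not left empty. Otherwise the key defaults to NULL, all such rows are stored on the same segment, and the data distribution becomes skewed.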

insert into test001 values (100,'tom'),(101,'lily'),(102,'jack'),(103,'linda');

insert into test002 values (200,'tom'),(101,'lily'),(202,'jack'),(103,'linda');
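
One way to check for the skew described above is to count rows per segment after loading. This is a minimal sketch that assumes the test001 table from the previous section and relies on Greenplum's gp_segment_id system column; the actual counts depend on the cluster's segment count.

```sql
-- Count how many rows of test001 landed on each segment.
-- A heavily uneven result suggests a poor distribution key.
SELECT gp_segment_id, count(*) AS row_count
FROM test001
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```

With only four rows the result says little, but the same query is useful on real tables.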

  05, UPDATE statements

  Avoid bulk UPDATEs on the distribution key: changing a key value forces the affected rows to be redistributed across segments.

update test002 set id=203 where id=202;

  06, Deleting data: DELETE and TRUNCATE

DELETE is slow when it has to remove every row of a table, so TRUNCATE is recommended for emptying a whole table.

truncate test001;
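
The difference between the two statements can be sketched as follows, assuming the test002 table created earlier. DELETE can take a WHERE clause and leaves dead rows behind until a VACUUM, while TRUNCATE always removes every row.

```sql
-- Remove selected rows only; space is reclaimed later by VACUUM.
DELETE FROM test002 WHERE id = 103;
VACUUM test002;

-- Remove all rows at once; no WHERE clause is possible.
TRUNCATE test002;
```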

  07, SELECT statements

postgres=# select * from test2;
 id  | name
-----+------
 102 | zxc
 203 | rty
 105 | bnm
 101 | qwe
 201 | asd
 204 | dfg
(6 rows)

  08, Execution plans

postgres=# select * from test1 x,test2 y where x.id=y.id;
 id  | name | id  | name
-----+------+-----+------
 101 | lily | 101 | qwe
 102 | jack | 102 | zxc
(2 rows)
postgres=# explain select * from test1 x,test2 y where x.id=y.id;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Gather Motion 8:1  (slice2; segments: 8)  (cost=0.00..862.00 rows=4 width=17)
   ->  Hash Join  (cost=0.00..862.00 rows=1 width=17)
         Hash Cond: test1.id = test2.id
         ->  Table Scan on test1  (cost=0.00..431.00 rows=1 width=9)
         ->  Hash  (cost=431.00..431.00 rows=1 width=8)
               ->  Redistribute Motion 8:8  (slice1; segments: 8)  (cost=0.00..431.00 rows=1 width=8)
                     Hash Key: test2.id
                     ->  Table Scan on test2  (cost=0.00..431.00 rows=1 width=8)
 Optimizer status: PQO version 3.21.0
(9 rows)
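
In the plan above, test2 goes through a Redistribute Motion on id before the hash join, which suggests it is not distributed on the join column. As a hedged sketch, the hypothetical tables t_a and t_b below are both distributed by id; a join on id between such tables would normally be expected to run without a Redistribute Motion, although the plan actually chosen depends on the optimizer and on table statistics.

```sql
-- t_a and t_b are illustrative names, not tables from this article.
CREATE TABLE t_a (id int, name varchar(128)) DISTRIBUTED BY (id);
CREATE TABLE t_b (id int, name varchar(128)) DISTRIBUTED BY (id);

-- Matching rows share a segment, so the join can be done locally.
EXPLAIN SELECT * FROM t_a x, t_b y WHERE x.id = y.id;
```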

II, Common data types

  01, Numeric types

  02, Character types

  03, Date/time types

III, Common functions

  1, String functions

-- VALUES builds a constant set of rows
postgres=# VALUES ('hello|world!'),('greenplum|database');
      column1
--------------------
 hello|world!
 greenplum|database
(2 rows)

-- substr(string, start, length): extract a substring
postgres=# SELECT substr('hello world!',2,3);
 substr
--------
 ell
(1 row)

-- position(substring in string): 1-based offset of the substring
postgres=# SELECT position('world' in 'hello world!');
 position
----------
        7
(1 row)
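
As a small addition to the functions above, split_part extracts one field from a delimited string such as the 'hello|world!' value used earlier. This is a sketch using only the standard PostgreSQL string functions split_part, length and upper.

```sql
-- Take the second '|'-separated field, then measure and upper-case it.
SELECT split_part('hello|world!', '|', 2)         AS second_field,  -- world!
       length(split_part('hello|world!', '|', 2)) AS field_length,  -- 6
       upper(split_part('hello|world!', '|', 1))  AS first_upper;   -- HELLO
```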

  2, Date/time functions

postgres=# SELECT now(),current_date,current_time,current_timestamp;
              now              |    date    |       timetz       |              now
-------------------------------+------------+--------------------+-------------------------------
 2019-03-17 22:26:58.330843-04 | 2019-03-17 | 22:26:58.330843-04 | 2019-03-17 22:26:58.330843-04
(1 row)
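
Beyond reading the current time, the usual follow-up steps are date arithmetic, truncation and formatting. The sketch below uses standard PostgreSQL functions; the column aliases are illustrative and the results depend on when the query runs.

```sql
SELECT current_date + 7                          AS one_week_later,   -- date plus integer days
       now() - interval '1 day'                  AS same_time_yesterday,
       date_trunc('month', now())                AS start_of_month,
       extract(year FROM now())                  AS current_year,
       to_char(now(), 'YYYY-MM-DD HH24:MI:SS')   AS formatted_now;
```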

  3, Numeric calculations

IV, Other functions

  1, Sequence generator: generate_series

postgres=# SELECT * from generate_series(6,10);
 generate_series
-----------------
               6
               7
               8
               9
              10
(5 rows)
Syntax: generate_series(x,y,t)

Generates one row per value from x to y in steps of t; the step defaults to 1.
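
A short sketch of the three-argument form, where the third argument is the step:

```sql
-- Count from 1 to 10 in steps of 3: returns 1, 4, 7, 10.
SELECT * FROM generate_series(1, 10, 3);

-- A negative step counts down: returns 5, 4, 3, 2, 1.
SELECT * FROM generate_series(5, 1, -1);
```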

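  2, Collapsing rows into one string: string_agg

string_agg(str,symbol [order by str])
Concatenates the values of column str across rows into a single string separated by symbol, optionally sorted by a given column.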
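
A minimal sketch of string_agg on an inline row set, assuming the two-argument form with an ORDER BY inside the call is available in your Greenplum version. The alias t(name) is illustrative.

```sql
-- Collapse three rows into one comma-separated string, sorted by name.
SELECT string_agg(name, ',' ORDER BY name) AS names   -- jack,lily,tom
FROM (VALUES ('tom'), ('lily'), ('jack')) AS t(name);
```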

  3, Splitting a string into rows: regexp_split_to_table

Turns a delimited string back into a set of rows, one row per element.

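
A sketch of regexp_split_to_table on the 'hello|world!' value used earlier. The delimiter is a regular expression, so the pipe character is escaped; the E'' string form is used on the assumption that standard_conforming_strings is off, as is the default on this PostgreSQL 8.3 base.

```sql
-- Split on a literal '|': returns two rows, 'hello' and 'world!'.
SELECT regexp_split_to_table('hello|world!', E'\\|') AS part;
```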

  4, Hash functions: md5, hashbpchar

  md5 produces a 128-bit hash and returns it as a string of 32 hexadecimal characters.
  hashbpchar produces a 32-bit hash and returns it as an integer.

postgres=# SELECT md5('admin')
postgres-# ;
               md5
----------------------------------
 21232f297a57a5a743894a0e4a801fc3
(1 row)

postgres=# SELECT hashbpchar('admin');
 hashbpchar
-------------
 -2087781708
(1 row)
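
One common use of md5, offered here as a sketch and not as something the original article prescribes, is a per-row fingerprint for comparing data between tables. It assumes the test002 table from the earlier sections; the '|' separator is an arbitrary choice that keeps adjacent column values from running together.

```sql
-- Build a fingerprint from all columns of each row.
SELECT id,
       name,
       md5(id::text || '|' || name) AS row_fingerprint
FROM test002;
```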