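JIT (just-in-time) compilation
JIT can noticeably speed up queries over large data sets. It does not help in every case, though; see https://www.postgresql.org/docs/11/jit-decision.html
Below, I build PG11 with JIT enabled to demonstrate the performance gain:
Note: JIT support has to be enabled at build time. The PostgreSQL documentation states that the minimum LLVM version is 3.9.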
wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum localinstall epel-release-latest-7.noarch.rpm
yum install llvm5.0 llvm5.0-devel clang
cd /root/pg_sources/postgresql-11 # change to the PG11 source directory and build from there
./configure --prefix=/usr/local/pgsql-11 \
--with-python --with-perl --with-tcl --with-pam \
--with-openssl --with-libxml --with-libxslt \
--with-llvm LLVM_CONFIG='/usr/lib64/llvm5.0/bin/llvm-config'
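# If configure reports missing dependencies, look up the missing packages, install them, and run configure again.
make && make install # the usual build and install step once configure succeeds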
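Before running configure, it can help to confirm that the installed LLVM meets the 3.9 minimum mentioned above. The following is a minimal sketch, assuming the version string looks like what `llvm-config --version` prints (a dotted number such as `5.0.1`). The helper name `llvm_meets_minimum` is hypothetical and not part of PostgreSQL or LLVM.

```python
import re

MIN_LLVM = (3, 9)  # minimum LLVM version, as cited from the PostgreSQL documentation


def llvm_meets_minimum(version_text, minimum=MIN_LLVM):
    """Return True if a version string such as '5.0.1' is at least `minimum`."""
    match = re.match(r"\s*(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognized LLVM version string: {version_text!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuples compare element by element, so (5, 0) >= (3, 9) is True.
    return (major, minor) >= minimum


# Example strings only; in practice, pass the text printed by
# /usr/lib64/llvm5.0/bin/llvm-config --version
print(llvm_meets_minimum("5.0.1"))
print(llvm_meets_minimum("3.8.1"))
```

Comparing (major, minor) tuples rather than raw strings avoids the common mistake of treating "10.0" as smaller than "3.9".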
Edit the configuration file to enable the JIT parameters. After the change, restart PG; the resulting settings are:
postgres=# select name,setting from pg_settings where name like 'jit%';
name | setting
-------------------------+---------
jit | on
jit_above_cost | 100000
jit_debugging_support | off
jit_dump_bitcode | off
jit_expressions | on
jit_inline_above_cost | 500000
jit_optimize_above_cost | 500000
jit_profiling_support | off
jit_provider | llvmjit
jit_tuple_deforming | on
(10 rows)
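Test cases published by digoal (德哥): https://github.com/digoal/blog/blob/master/201910/20191017_01.md
Below are my own test results (CentOS 7 + PG11 + an ordinary SATA disk; the only PG tuning was shared_buffers=8GB, with no other parameter optimization):
Generate some test data: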
create table a(id int, info text, crt_Time timestamp, c1 int);
insert into a select generate_series(1,100000000),'test',now(),random()*100; -- no indexes either; PG has to brute-force it
analyze a;
\dt+ a
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+----------+---------+-------------
public | a | table | postgres | 5746 MB |
(1 row)
Results on PG11 with JIT enabled:
set jit=on;
set max_parallel_workers_per_gather =32;
alter table a set (parallel_workers =32);
set min_parallel_table_scan_size =0;
set min_parallel_index_scan_size =0;
set parallel_setup_cost =0;
set parallel_tuple_cost =0;
postgres=# select t1.c1,count(*) from a t1 join a t2 using (id) group by t1.c1;
Time: 31402.562 ms (00:31.403)
postgres=# explain select t1.c1,count(*) from a t1 join a t2 using (id) group by t1.c1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate (cost=1657122.68..1657229.70 rows=101 width=12)
Group Key: t1.c1
-> Gather Merge (cost=1657122.68..1657212.53 rows=3232 width=12)
Workers Planned: 32
-> Sort (cost=1657121.85..1657122.10 rows=101 width=12)
Sort Key: t1.c1
-> Partial HashAggregate (cost=1657117.48..1657118.49 rows=101 width=12)
Group Key: t1.c1
-> Parallel Hash Join (cost=817815.59..1641492.46 rows=3125004 width=4)
Hash Cond: (t1.id = t2.id)
-> Parallel Seq Scan on a t1 (cost=0.00..766545.04 rows=3125004 width=8)
-> Parallel Hash (cost=766545.04..766545.04 rows=3125004 width=4)
-> Parallel Seq Scan on a t2 (cost=0.00..766545.04 rows=3125004 width=4)
JIT:
Functions: 23
Options: Inlining true, Optimization true, Expressions true, Deforming true
(16 rows)
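The JIT section at the end of this plan follows from the cost thresholds in the settings listed earlier. As described in the PostgreSQL documentation linked at the top, the planner compares the plan's total estimated cost with jit_above_cost, jit_inline_above_cost, and jit_optimize_above_cost. The sketch below approximates that decision. The names `JitSettings` and `jit_options` are hypothetical, and this is a simplified illustration rather than the planner's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JitSettings:
    # Values copied from the pg_settings output shown earlier.
    jit: bool = True
    jit_above_cost: float = 100000
    jit_inline_above_cost: float = 500000
    jit_optimize_above_cost: float = 500000
    jit_expressions: bool = True
    jit_tuple_deforming: bool = True


def jit_options(total_cost, s=JitSettings()):
    """Approximate the JIT decision for a plan's total estimated cost."""
    def above(threshold):
        # A negative threshold (-1) disables the corresponding feature.
        return threshold >= 0 and total_cost > threshold

    if not (s.jit and above(s.jit_above_cost)):
        return None  # the query runs without JIT
    return {
        "Inlining": above(s.jit_inline_above_cost),
        "Optimization": above(s.jit_optimize_above_cost),
        "Expressions": s.jit_expressions,
        "Deforming": s.jit_tuple_deforming,
    }


# Total cost of the Finalize GroupAggregate node in the plan above.
print(jit_options(1657229.70))
# A cheap query stays below jit_above_cost and is not JIT-compiled.
print(jit_options(50000))
```

With a total cost of 1657229.70, all three thresholds are exceeded, which matches the "Inlining true, Optimization true" line in the plan. This is also why JIT does not help every query: plans cheaper than jit_above_cost never trigger it.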
postgres=# select t1.c1,count(*) from a t1 join a t2 on (t1.id=t2.id and t1.c1=2 and t2.c1=2) group by t1.c1;
c1 | count
----+---------
2 | 1000506
(1 row)
Time: 4780.824 ms (00:04.781)
postgres=# select * from a order by c1,id desc limit 10;
id | info | crt_time | c1
----------+------+----------------------------+----
99999958 | test | 2019-10-18 09:22:32.391061 | 0
99999926 | test | 2019-10-18 09:22:32.391061 | 0
99999901 | test | 2019-10-18 09:22:32.391061 | 0
99999802 | test | 2019-10-18 09:22:32.391061 | 0
99999165 | test | 2019-10-18 09:22:32.391061 | 0
99999100 | test | 2019-10-18 09:22:32.391061 | 0
99998968 | test | 2019-10-18 09:22:32.391061 | 0
99998779 | test | 2019-10-18 09:22:32.391061 | 0
99998652 | test | 2019-10-18 09:22:32.391061 | 0
99998441 | test | 2019-10-18 09:22:32.391061 | 0
(10 rows)
Time: 3317.480 ms (00:03.317)
postgres=# select c1,count(*) from a group by c1;
Time: 5031.796 ms (00:05.032)
Results on PG11 built without JIT:
postgres=# select t1.c1,count(*) from a t1 join a t2 using (id) group by t1.c1;
Time: 71410.034 ms (01:11.410)
postgres=# explain select t1.c1,count(*) from a t1 join a t2 using (id) group by t1.c1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate (cost=6150282.43..6150308.02 rows=101 width=12)
Group Key: t1.c1
-> Gather Merge (cost=6150282.43..6150306.00 rows=202 width=12)
Workers Planned: 2
-> Sort (cost=6149282.41..6149282.66 rows=101 width=12)
Sort Key: t1.c1
-> Partial HashAggregate (cost=6149278.03..6149279.04 rows=101 width=12)
Group Key: t1.c1
-> Parallel Hash Join (cost=1835524.52..5940950.58 rows=41665490 width=4)
Hash Cond: (t1.id = t2.id)
-> Parallel Seq Scan on a t1 (cost=0.00..1151949.90 rows=41665490 width=8)
-> Parallel Hash (cost=1151949.90..1151949.90 rows=41665490 width=4)
-> Parallel Seq Scan on a t2 (cost=0.00..1151949.90 rows=41665490 width=4)
(13 rows)
Time: 0.636 ms
postgres=# select t1.c1,count(*) from a t1 join a t2 on (t1.id=t2.id and t1.c1=2 and t2.c1=2) group by t1.c1;
c1 | count
----+---------
2 | 1001209
(1 row)
Time: 9329.623 ms (00:09.330)
postgres=# select * from a order by c1,id desc limit 10;
id | info | crt_time | c1
----------+------+----------------------------+----
99999518 | test | 2019-10-18 09:18:36.532469 | 0
99999088 | test | 2019-10-18 09:18:36.532469 | 0
99999016 | test | 2019-10-18 09:18:36.532469 | 0
99998987 | test | 2019-10-18 09:18:36.532469 | 0
99998899 | test | 2019-10-18 09:18:36.532469 | 0
99998507 | test | 2019-10-18 09:18:36.532469 | 0
99998142 | test | 2019-10-18 09:18:36.532469 | 0
99998107 | test | 2019-10-18 09:18:36.532469 | 0
99998050 | test | 2019-10-18 09:18:36.532469 | 0
99997437 | test | 2019-10-18 09:18:36.532469 | 0
(10 rows)
Time: 6113.971 ms (00:06.114)
postgres=# select c1,count(*) from a group by c1;
Time: 9868.117 ms (00:09.868)
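Based on the results above, for complex queries such as JOINs over large data sets, the JIT-enabled build cut query time roughly in half (about 1.8x to 2.3x faster in these runs). Note that the two EXPLAIN plans also differ in planned parallel workers (32 versus 2), so part of the gain may come from parallelism rather than from JIT alone.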
Everyday OLTP + OLAP workloads can all be handled by a single PG11 deployment.
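To make the comparison concrete, the speedup for each query can be computed directly from the timings reported above. This is a small sketch; the query labels and the `speedup` helper are my own shorthand, while the millisecond values are copied from the two test runs.

```python
# (query label, time on the JIT-enabled build in ms, time on the build without JIT in ms)
timings = [
    ("join + group by", 31402.562, 71410.034),
    ("filtered join (c1=2)", 4780.824, 9329.623),
    ("order by ... limit 10", 3317.480, 6113.971),
    ("group by c1", 5031.796, 9868.117),
]


def speedup(with_jit_ms, without_jit_ms):
    """How many times faster the JIT-enabled run was."""
    return without_jit_ms / with_jit_ms


for label, jit_ms, plain_ms in timings:
    print(f"{label:<24}{speedup(jit_ms, plain_ms):.2f}x")
```

The ratios fall between roughly 1.8x and 2.3x, with the large self-join benefiting the most.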