Understanding the Behavior of PostgreSQL's count Function

Date: 2019-12-10


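How count should be used has long been debated, especially in MySQL. Does PostgreSQL, which keeps growing in popularity, have similar issues? Let's run some experiments to understand how the count function behaves in PostgreSQL.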

Building the test database

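Create a test database and a test table. The table has three columns: an auto-increment ID, a creation time, and content. The auto-increment ID column is the primary key.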

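create database performance_test;

-- connect to the new database (psql meta-command) so the table is created in it
\c performance_test

create table test_tbl (id serial primary key, created_at timestamp, content varchar(512));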

Generating test data

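The generate_series function produces the IDs and now() fills the created_at column. For the content column, repeat(md5(random()::text), 10) repeats one 32-character md5 string 10 times, giving a 320-character value. The following statement inserts 10 million rows for testing.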

performance_test=# insert into test_tbl select generate_series(1,10000000),now(),repeat(md5(random()::text),10);
INSERT 0 10000000
Time: 212184.223 ms (03:32.184)
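To see what the content expression produces, it can be evaluated on its own. This is a small sketch and not part of the original test; the column aliases are made up for illustration.

```sql
-- md5() returns 32 hexadecimal characters; repeat(..., 10) concatenates
-- 10 copies of that one string, which still fits in varchar(512).
select length(md5(random()::text))             as md5_length,      -- 32
       length(repeat(md5(random()::text), 10)) as content_length;  -- 320

-- Peek at a few generated rows without printing the full content column.
select id, created_at, left(content, 16) as content_prefix
from test_tbl
limit 3;
```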

Questions raised by the count statement

By default psql does not display SQL execution time, so turn it on manually to make the later comparisons easier.

\timing on

The performance difference between count(*) and count(1) is a frequently discussed question. Run one query with each.

performance_test=# select count(*) from test_tbl;
  count
----------
 10000000
(1 row)

Time: 115090.380 ms (01:55.090)

performance_test=# select count(1) from test_tbl;
  count
----------
 10000000
(1 row)

Time: 738.502 ms

The two queries differ enormously in speed. Does count(1) really bring such a large improvement? Run the queries again.

performance_test=# select count(*) from test_tbl;
  count
----------
 10000000
(1 row)

Time: 657.831 ms

performance_test=# select count(1) from test_tbl;
  count
----------
 10000000
(1 row)

Time: 682.157 ms

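The first query is very slow, while the next three are fast and take about the same time. This raises two questions:

Why is the first query so slow?
Is there actually any performance difference between count(*) and count(1)?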

Caching

Run the query again with explain:

explain (analyze,buffers,verbose) select count(*) from test_tbl;

The output looks like this:

Finalize Aggregate  (cost=529273.69..529273.70 rows=1 width=8) (actual time=882.569..882.570 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=96 read=476095
   ->  Gather  (cost=529273.48..529273.69 rows=2 width=8) (actual time=882.492..884.170 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=96 read=476095
         ->  Partial Aggregate  (cost=528273.48..528273.49 rows=1 width=8) (actual time=881.014..881.014 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared hit=96 read=476095
               Worker 0: actual time=880.319..880.319 rows=1 loops=1
                 Buffers: shared hit=34 read=158206
               Worker 1: actual time=880.369..880.369 rows=1 loops=1
                 Buffers: shared hit=29 read=156424
               ->  Parallel Seq Scan on public.test_tbl  (cost=0.00..517856.98 rows=4166598 width=0) (actual time=0.029..662.165 rows=3333333 loops=3)
                     Buffers: shared hit=96 read=476095
                     Worker 0: actual time=0.026..661.807 rows=3323029 loops=1
                       Buffers: shared hit=34 read=158206
                     Worker 1: actual time=0.030..660.197 rows=3285513 loops=1
                       Buffers: shared hit=29 read=156424
 Planning time: 0.043 ms
 Execution time: 884.207 ms

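Note shared hit in the output: it counts pages found in PostgreSQL's shared buffers, that is, data already cached in memory. Pages counted under read are requested from the operating system, which can still serve them from its own page cache. Cached data explains why the later queries are much faster than the first. Next, drop the operating system's page cache and restart PostgreSQL.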

service postgresql stop
echo 1 > /proc/sys/vm/drop_caches
service postgresql start

Run the SQL statement again. It is much slower.

Finalize Aggregate  (cost=529273.69..529273.70 rows=1 width=8) (actual time=50604.564..50604.564 rows=1 loops=1)
   Output: count(*)
   Buffers: shared read=476191
   ->  Gather  (cost=529273.48..529273.69 rows=2 width=8) (actual time=50604.508..50606.141 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared read=476191
         ->  Partial Aggregate  (cost=528273.48..528273.49 rows=1 width=8) (actual time=50591.550..50591.551 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared read=476191
               Worker 0: actual time=50585.182..50585.182 rows=1 loops=1
                 Buffers: shared read=158122
               Worker 1: actual time=50585.181..50585.181 rows=1 loops=1
                 Buffers: shared read=161123
               ->  Parallel Seq Scan on public.test_tbl  (cost=0.00..517856.98 rows=4166598 width=0) (actual time=92.491..50369.691 rows=3333333 loops=3)
                     Buffers: shared read=476191
                     Worker 0: actual time=122.170..50362.271 rows=3320562 loops=1
                       Buffers: shared read=158122
                     Worker 1: actual time=14.020..50359.733 rows=3383583 loops=1
                       Buffers: shared read=161123
 Planning time: 11.537 ms
 Execution time: 50606.215 ms

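shared read means the page was not found in the cache. From this we can infer that, of the four queries in the previous section, the first one missed the cache and the remaining three hit it.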
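The buffer counts can be turned into sizes and compared with the configured cache. The sketch below assumes the default 8 kB block size and, for the last query, that the pg_buffercache contrib extension can be installed. Neither is stated in the original.

```sql
-- Block size used by this server (8192 bytes by default).
show block_size;

-- Size of PostgreSQL's own cache.
show shared_buffers;

-- The cold run read 476191 pages; at 8 kB each that is about 3.6 GiB.
select pg_size_pretty(476191 * 8192::bigint) as data_read;

-- Optional: count how many pages of test_tbl currently sit in shared buffers.
create extension if not exists pg_buffercache;

select count(*) as cached_pages
from pg_buffercache b
where b.relfilenode = pg_relation_filenode('test_tbl')
  and b.relforknumber = 0  -- main data fork only
  and b.reldatabase = (select oid from pg_database
                       where datname = current_database());
```

If shared_buffers is much smaller than the table, most pages may still show up under read on warm runs, with the operating system's page cache doing most of the caching.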

The difference between count(1) and count(*)

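Next, what is the difference between count(1) and count(*)? Think back to the first four queries: the first used count(*) and the second used count(1), yet the second still benefited from the cache. Doesn't that suggest count(1) and count(*) do the same work?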

In fact, the official PostgreSQL reply to the question "is there a difference performance-wise between select count(1) and select count(*)?" confirms this:

Nope. In fact, the latter is converted to the former during parsing.[2]

Since count(1) performs no better than count(*), count(*) is the better choice.
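The forms can also be compared on a tiny inline data set. The sketch below uses a made-up three-row VALUES list, so it does not need the test table. It also includes count(v), which is a different aggregate that counts only non-null values.

```sql
-- Three rows, one of which is NULL.
select count(*) as count_star,  -- counts rows: 3
       count(1) as count_one,   -- the constant 1 is never null: 3
       count(v) as count_v      -- counts rows where v is not null: 2
from (values (1), (null), (3)) as t(v);

-- The plans for the two forms on the test table can be compared directly.
explain select count(*) from test_tbl;
explain select count(1) from test_tbl;
```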

Sequential scan and index scan

Next, test how fast count(*) is at different data sizes. Put the query in a file named count.sql and run it with pgbench.

pgbench -c 5 -t 20 performance_test -r -f count.sql
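The original does not show the contents of count.sql. Judging from the Index Cond lines in the plans further down, the query probably filtered on id. The file below is a guess along those lines, and the variable name max_id is made up for illustration.

```sql
-- count.sql (hypothetical contents; the original file is not shown)
-- :max_id is a pgbench variable supplied on the command line, for example:
--   pgbench -c 5 -t 20 -r -D max_id=7000000 -f count.sql performance_test
select count(*) from test_tbl where id < :max_id;
```

Passing the bound with -D keeps one script file for every data size instead of editing the query between runs.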

Measure the time of the count statement for 2 million to 10 million rows:

Rows	count time (ms)
2 million	738.758
3 million	1035.846
4 million	1426.183
5 million	1799.866
6 million	2117.247
7 million	2514.691
8 million	2526.441
9 million	2568.240
10 million	2650.434

[Figure: count time (ms) plotted against the number of rows]

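The curve bends between 7 and 8 million rows (2514.691 ms at 7 million versus 2526.441 ms at 8 million): the time grows roughly linearly from 2 million to 7 million rows and stays almost flat after that. Use explain to look at how the count statement executes at 6 million and 7 million rows.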

7 million rows:

Finalize Aggregate  (cost=502185.93..502185.94 rows=1 width=8) (actual time=894.361..894.361 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=16344 read=352463
   ->  Gather  (cost=502185.72..502185.93 rows=2 width=8) (actual time=894.232..899.763 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=16344 read=352463
         ->  Partial Aggregate  (cost=501185.72..501185.73 rows=1 width=8) (actual time=889.371..889.371 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared hit=16344 read=352463
               Worker 0: actual time=887.112..887.112 rows=1 loops=1
                 Buffers: shared hit=5459 read=118070
               Worker 1: actual time=887.120..887.120 rows=1 loops=1
                 Buffers: shared hit=5601 read=117051
               ->  Parallel Index Only Scan using test_tbl_pkey on public.test_tbl  (cost=0.43..493863.32 rows=2928960 width=0) (actual time=0.112..736.376 rows=2333333 loops=3)
                     Index Cond: (test_tbl.id < 7000000)
                     Heap Fetches: 2328492
                     Buffers: shared hit=16344 read=352463
                     Worker 0: actual time=0.107..737.180 rows=2344479 loops=1
                       Buffers: shared hit=5459 read=118070
                     Worker 1: actual time=0.133..737.960 rows=2327028 loops=1
                       Buffers: shared hit=5601 read=117051
 Planning time: 0.165 ms
 Execution time: 899.857 ms

6 million rows:

Finalize Aggregate  (cost=429990.94..429990.95 rows=1 width=8) (actual time=765.575..765.575 rows=1 loops=1)
   Output: count(*)
   Buffers: shared hit=13999 read=302112
   ->  Gather  (cost=429990.72..429990.93 rows=2 width=8) (actual time=765.557..770.889 rows=3 loops=1)
         Output: (PARTIAL count(*))
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=13999 read=302112
         ->  Partial Aggregate  (cost=428990.72..428990.73 rows=1 width=8) (actual time=763.821..763.821 rows=1 loops=3)
               Output: PARTIAL count(*)
               Buffers: shared hit=13999 read=302112
               Worker 0: actual time=762.742..762.742 rows=1 loops=1
                 Buffers: shared hit=4638 read=98875
               Worker 1: actual time=763.308..763.308 rows=1 loops=1
                 Buffers: shared hit=4696 read=101570
               ->  Parallel Index Only Scan using test_tbl_pkey on public.test_tbl  (cost=0.43..422723.16 rows=2507026 width=0) (actual time=0.053..632.199 rows=2000000 loops=3)
                     Index Cond: (test_tbl.id < 6000000)
                     Heap Fetches: 2018490
                     Buffers: shared hit=13999 read=302112
                     Worker 0: actual time=0.059..633.156 rows=1964483 loops=1
                       Buffers: shared hit=4638 read=98875
                     Worker 1: actual time=0.038..634.271 rows=2017026 loops=1
                       Buffers: shared hit=4696 read=101570
 Planning time: 0.055 ms
 Execution time: 770.921 ms
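Both plans report a Heap Fetches value of the same order as the rows each process scanned, which suggests the index-only scan still had to visit the table for most rows. One possible explanation, not examined in the original, is that the visibility map had not been set after the bulk insert. A sketch for checking this:

```sql
-- relallvisible is the number of pages marked all-visible in the visibility
-- map; compare it with relpages (both are estimates kept in pg_class).
select relpages, relallvisible
from pg_class
where relname = 'test_tbl';

-- VACUUM marks pages all-visible when every row on them is visible to all
-- transactions; ANALYZE refreshes the planner statistics.
vacuum (analyze) test_tbl;

-- Re-run the plan and compare the Heap Fetches line with the earlier output.
explain (analyze, buffers)
select count(*) from test_tbl where id < 7000000;
```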

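From the above, PostgreSQL appears to use an index scan only when the number of rows being counted is below a certain fraction of the table's rows. The official wiki describes related behavior: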

It is important to realise that the planner is concerned with minimising the total cost of the query. With databases, the cost of I/O typically dominates. For that reason, "count(*) without any predicate" queries will only use an index-only scan if the index is significantly smaller than its table. This typically only happens when the table's row width is much wider than some indexes'.[3]
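The quoted condition can be checked by comparing the size of the table with the size of its primary-key index. This is a minimal sketch using the built-in size functions; the column aliases are illustrative.

```sql
select pg_size_pretty(pg_relation_size('test_tbl'))      as table_size,
       pg_size_pretty(pg_relation_size('test_tbl_pkey')) as pkey_index_size,
       round(pg_relation_size('test_tbl_pkey')::numeric
             / pg_relation_size('test_tbl'), 3)           as index_to_table_ratio;
```

With 320-character content values, the rows are much wider than the integer key, so the index would be expected to come out far smaller than the table.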

According to an answer on Stack Overflow, once a count query covers more than 3/4 of the table, a full table scan is used instead of an index scan[4].
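Where the planner switches is a cost-based decision, so the threshold may differ between versions and settings. A sketch for observing it on the test table, assuming the queries filter on id as in the plans above:

```sql
-- Plans only (no execution): look at the scan node chosen for each bound.
explain select count(*) from test_tbl where id < 7000000;
explain select count(*) from test_tbl where id < 8000000;

-- For comparison, discourage sequential scans in this session and look at
-- the estimated cost of the index-based plan for the larger bound.
set enable_seqscan = off;
explain select count(*) from test_tbl where id < 8000000;
reset enable_seqscan;
```

Comparing the estimated costs of the two plans for the same bound shows why the planner prefers one over the other.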