Cleaning up PostgreSQL large objects

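Our system uses an open-source CAS single sign-on server that stores its large objects as lo, which generally performs a little better than bytea. The developers told us that user data would be purged periodically, but in practice the system never deletes the large objects associated with the purged user data. So a script is needed to clean them up on a schedule.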

1. Background
DB: PostgreSQL 9.3.0
cas=# select oid,rolname from pg_authid where oid in (10,327299);
  oid   | rolname  
--------+----------
     10 | postgres
 327299 | usr_cas
(2 rows)

cas=# select lomowner,count(1) from pg_largeobject_metadata group by 1;
 lomowner | count 
----------+--------
       10 |  292408
   327299 |  382123
(2 rows)
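2. Cleanup
Two sets of large objects need cleaning: those owned by postgres and those owned by usr_cas. The former were created over connections made as postgres and must all be deleted. The latter include large objects whose user data has already been deleted while the objects themselves remain, and those need deleting too.
 1) Deleting with lo_unlink
 Deletion normally uses the built-in lo_unlink() function, so I ran the following command, but it failed with "out of shared memory":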
cas=# select lo_unlink(oid) from pg_largeobject_metadata where lomowner = 10;
WARNING:  out of shared memory
ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

cas=# show max_locks_per_transaction ;
 max_locks_per_transaction 
---------------------------
 64
(1 row)
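The hint is fairly clear. A single SQL statement unlinks all the large objects in one transaction, and each lo_unlink() call takes a lock that is held until the transaction ends, so the shared lock table runs out of room and the statement fails. One fix is to raise max_locks_per_transaction, which defaults to 64. The other is to avoid deleting everything in one transaction and work in batches instead. Since the amount of data to delete is not that large, I chose the latter.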
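To put rough numbers on this, the PostgreSQL documentation describes the shared lock table as sized for about max_locks_per_transaction * (max_connections + max_prepared_transactions) objects. The sketch below only does that arithmetic and the matching batch count. The function names are made up for illustration, and max_connections = 100 is an assumed default, not a value taken from this server.

```shell
#!/bin/bash
# Rough number of object locks the shared lock table is sized for:
# max_locks_per_transaction * (max_connections + max_prepared_transactions).
lock_table_capacity() {
    local max_locks_per_tx=$1 max_connections=$2 max_prepared=${3:-0}
    echo $(( max_locks_per_tx * (max_connections + max_prepared) ))
}

# Number of batches needed to remove `total` objects, `batch` at a time
# (integer ceiling division).
batches_needed() {
    local total=$1 batch=$2
    echo $(( (total + batch - 1) / batch ))
}

# 64 is the default shown above; 100 connections is an ASSUMED value.
lock_table_capacity 64 100 0     # prints 6400
# 292408 postgres-owned objects, removed 20000 per transaction.
batches_needed 292408 20000      # prints 15
```

The real headroom depends on the server's actual settings and on what other sessions are locking, so treat the result as an estimate. If a batch still fails with the same error, a smaller batch size is the simple fix.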
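-- Run the following command repeatedly; each run deletes 20,000 objects, so about 15 runs cover the 292,408 objects. It can also be put in a script and run in one go.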
cas=# select lo_unlink(oid) from pg_largeobject_metadata where lomowner = 10 limit 20000;
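The "put it in a script" option could look roughly like the sketch below. It is one possible shape, not the script actually used here: the function names are hypothetical, and the query is rewritten slightly so that each batch returns the number of objects it unlinked, which gives the loop a stopping condition.

```shell
#!/bin/bash
# Build the SQL for one batch: pick up to `batch` OIDs owned by `owner`,
# then unlink each of them and return how many were processed.
build_unlink_sql() {
    local owner=$1 batch=$2
    printf 'SELECT count(lo_unlink(oid)) FROM (SELECT oid FROM pg_largeobject_metadata WHERE lomowner = %d LIMIT %d) AS s;' "$owner" "$batch"
}

# Repeat batches until one of them removes nothing. Every psql -c call runs
# in its own transaction, so the locks are released between batches.
unlink_in_batches() {
    local db=$1 owner=$2 batch=$3 removed
    while :; do
        removed=$(psql -X -At -d "$db" -c "$(build_unlink_sql "$owner" "$batch")") || return 1
        echo "removed ${removed} large objects"
        [ "$removed" -gt 0 ] || break
    done
}

# Example call (needs a live database, so it is left commented out):
# unlink_in_batches cas 10 20000
```

Stopping when a batch returns 0 avoids hard-coding the number of runs, so the same loop works if the object count changes.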
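 2) Deleting with vacuumlo
With the postgres-owned objects gone, the next step is the large objects owned by usr_cas. Writing a script to compare them one by one would be tedious and probably not very efficient. The bundled vacuumlo utility handles this. It compares the OIDs of all large objects against the values stored in the oid columns of user tables and removes the large objects that nothing references. This is why, when designing a table that references large objects, the oid type is recommended even though an int column can also hold the OID value: vacuumlo only looks at columns of type oid or lo, so a reference kept in an int column is invisible to it and the object would be treated as orphaned. If the tool is not installed, build it under contrib/vacuumlo with make && make install.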
Its help output:
[postgres@kenyon-primary ~]$ vacuumlo --help
vacuumlo removes unreferenced large objects from databases.

Usage:
  vacuumlo [OPTION]... DBNAME...

Options:
  -l LIMIT       commit after removing each LIMIT large objects
  -n             don't remove large objects, just show what would be done
  -v             write a lot of progress messages
  -V, --version  output version information, then exit
  -?, --help     show this help, then exit

Connection options:
  -h HOSTNAME    database server host or socket directory
  -p PORT        database server port
  -U USERNAME    user name to connect as
  -w             never prompt for password
  -W             force password prompt

Report bugs to .
Usage:
-- Dry run: only show what would be cleaned up, without removing anything
[postgres@kenyon-primary ~]$ vacuumlo -n cas -v
Connected to database "cas"
Test run: no large objects will be removed!
Checking expiration_policy in public.serviceticket
Checking service in public.serviceticket
Checking expiration_policy in public.ticketgrantingticket
Checking authentication in public.ticketgrantingticket
Checking services_granted_access_to in public.ticketgrantingticket
Would remove 382143 large objects from database "cas".
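For a scheduled job, the dry-run count can be pulled out of this output and used to decide whether a real run is worthwhile. The sketch below is one way to do that. The function names and the threshold are hypothetical, and the parsing relies on the "Would remove N large objects" wording shown above.

```shell
#!/bin/bash
# Extract N from a line of the form
#   Would remove N large objects from database "...".
# read on standard input.
parse_would_remove() {
    sed -n 's/.*Would remove \([0-9][0-9]*\) large objects.*/\1/p'
}

# Succeed when the orphan count reaches the given threshold.
needs_cleanup() {
    local count=$1 threshold=$2
    [ "$count" -ge "$threshold" ]
}

# Example (needs a live database, so it is left commented out):
# orphans=$(vacuumlo -n -v cas | parse_would_remove)
# needs_cleanup "$orphans" 10000 && echo "cleanup worthwhile: $orphans orphans"
```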

-- Actual cleanup; the -l option makes vacuumlo commit after every LIMIT removals
[postgres@kenyon-primary ~]$ vacuumlo cas -v -l 1000
Connected to database "cas"
Checking expiration_policy in public.serviceticket
Checking service in public.serviceticket
Checking expiration_policy in public.ticketgrantingticket
Checking authentication in public.ticketgrantingticket
Checking services_granted_access_to in public.ticketgrantingticket
Successfully removed 382143 large objects from database "cas".
After the cleanup, check the size again:
cas=# select pg_size_pretty(pg_database_size('cas'));
 pg_size_pretty 
----------------
 1.3 GB
(1 row)

-- The space has not been released yet, so run vacuum full analyze
cas=# vacuum full analyze verbose pg_largeobject;
INFO:  vacuuming "pg_catalog.pg_largeobject"
INFO:  scanned index "pg_largeobject_loid_pn_index" to remove 88928 row versions
DETAIL:  CPU 0.01s/0.24u sec elapsed 0.26 sec.
INFO:  "pg_largeobject": removed 88928 row versions in 6833 pages
DETAIL:  CPU 0.00s/0.02u sec elapsed 0.02 sec.
INFO:  index "pg_largeobject_loid_pn_index" now contains 948117 row versions in 4120 pages
DETAIL:  88928 index row versions were removed.
1516 index pages have been deleted, 1269 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_largeobject": found 88928 removable, 52 nonremovable row versions in 6891 out of 109226 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 2329 unused item pointers.
0 pages are entirely empty.
CPU 0.03s/0.32u sec elapsed 0.35 sec.
INFO:  analyzing "pg_catalog.pg_largeobject"
INFO:  "pg_largeobject": scanned 30000 of 109226 pages, containing 260529 live rows and 0 dead rows; 30000 rows in sample, 947568 estimated total rows
VACUUM

cas=# select pg_size_pretty(pg_relation_size('pg_largeobject'));
pg_size_pretty
----------------
8192 KB
(1 row)
All quiet at last. Put it in a script and run it periodically:
[postgres@kenyon-primary ~]$ more cas_rm_lo.sh
#!/bin/bash

######################################################
##
##  purpose:Rm the cas's large object and free space
##  
##  author :Kenyon
##   
##  created:2014-01-22
##  
#####################################################


source /home/postgres/.bash_profile

vacuumlo cas -l 1000 -v

psql -d cas -c "vacuum full analyze verbose pg_largeobject;"
psql -d cas -c "vacuum full analyze verbose pg_largeobject_metadata;"
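A slightly more defensive variant of the script is sketched below, assuming the same database name and batch size. It stops at the first failing step and prints the size of pg_largeobject before and after, so the cron log shows whether space was actually reclaimed. The function names and the DB and BATCH variables are illustrative, not part of the original script.

```shell
#!/bin/bash
# Defaults mirror the script above; both can be overridden from the environment.
DB=${DB:-cas}
BATCH=${BATCH:-1000}

# SQL reporting the table-only size and the table-plus-index size.
size_sql() {
    echo "SELECT pg_size_pretty(pg_relation_size('pg_largeobject')), pg_size_pretty(pg_total_relation_size('pg_largeobject'));"
}

# Run the whole cleanup; the && chain stops at the first step that fails.
cleanup() {
    psql -X -At -d "$DB" -c "$(size_sql)" &&
    vacuumlo -v -l "$BATCH" "$DB" &&
    psql -X -d "$DB" -c "VACUUM FULL ANALYZE pg_largeobject;" &&
    psql -X -d "$DB" -c "VACUUM FULL ANALYZE pg_largeobject_metadata;" &&
    psql -X -At -d "$DB" -c "$(size_sql)"
}

# Left commented out because it needs a live server:
# cleanup
```

VACUUM FULL holds an ACCESS EXCLUSIVE lock on the table while it rewrites it, so large object reads and writes block for the duration. Scheduling the job in a quiet window is the safer choice.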
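3. Summary
When using open-source tools that rely on large objects, check whether the application also deletes the associated large objects when it purges user data.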