調大work_mem減少I/O
更新時間 2025-02-05 09:36:58
最近更新時間: 2025-02-05 09:36:58
分享文章
本頁介紹天翼云TeleDB數據庫通過調大work_mem來減少I/O的優化案例。
work_mem參數為SQL中排序、hash散列操作時可用的內存,當排序、hash散列操作需要的內存超過work_mem時,會將超出部分寫入臨時文件,當觸發臨時文件寫時,會產生I/O,大量的臨時文件寫會嚴重影響SQL性能。
work_mem默認為4MB,可以針對需要更多work_mem的SQL進行會話級/語句級設置,不建議全局設置過大,避免因內存消耗過多導致的OOM。
下面是用到work_mem參數調整的SQL優化案例:
-
測試數據準備:
teledb=# CREATE TABLE t1(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1); NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command. CREATE TABLE Time: 70.545 ms teledb=# CREATE TABLE t2(f1 serial not null unique,f2 text,f3 text,f4 text,f5 text,f6 text,f7 text,f8 text,f9 text,f10 text,f11 text,f12 text) distribute by shard(f1); NOTICE: Replica identity is needed for shard table, please add to this table through "alter table" command. CREATE TABLE Time: 61.913 ms teledb=# insert into t1 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,1000) as t; INSERT 0 1000 Time: 48.866 ms teledb=# insert into t2 select t,md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text),md5(t::text) from generate_series(1,50000) as t; INSERT 0 50000 Time: 792.858 ms teledb=# analyze t1; ANALYZE Time: 175.946 ms teledb=# analyze t2; ANALYZE Time: 318.802 ms teledb=# -
優化前的SQL執行計劃和耗時:
默認work_mem為4MB:
teledb=# explain select * from t1 where f2 not in (select f2 from t2); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..2076712.50 rows=500 width=367) -> Seq Scan on t1 (cost=0.00..2076712.50 rows=500 width=367) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Materialize (cost=0.00..4028.00 rows=50000 width=33) -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..3240.00 rows=50000 width=33) -> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33) (7 rows) Time: 0.916 ms teledb=# select * from t1 where f2 not in (select f2 from t2); f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 ----+----+----+----+----+----+----+----+----+-----+-----+----- (0 rows) Time: 4226.825 ms (00:04.227) -
優化后的SQL執行計劃和耗時:
設置work_mem為8MB:
teledb=# set work_mem to '8MB'; SET Time: 0.289 ms teledb=# explain select * from t1 where f2 not in (select f2 from t2); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=3365.00..3577.50 rows=500 width=367) -> Seq Scan on t1 (cost=3365.00..3577.50 rows=500 width=367) Filter: (NOT (hashed SubPlan 1)) SubPlan 1 -> Remote Subquery Scan on all (dn01,dn02,dn03,dn04,dn05,dn06,dn07,dn08,dn09,dn10) (cost=0.00..3240.00 rows=50000 width=33) -> Seq Scan on t2 (cost=0.00..3240.00 rows=50000 width=33) (6 rows) Time: 0.890 ms teledb=# select * from t1 where f2 not in (select f2 from t2); f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 ----+----+----+----+----+----+----+----+----+-----+-----+----- (0 rows) Time: 105.249 ms可以看到,調整大work_mem前發生了Materialize物化動作,即將結果集存儲在了內存或磁盤上;而調大work_mem后,work_mem足夠放下filter的數據,不需要再做 Materialize物化,filter 由原來的subplan變成了hash subplan,直接在內存 hash表中filter;
優化后SQL性能提高了40倍,效果明顯。