Improving VACUUM Suction

Greg Smith - © 2ndQuadrant US 2011
AKA the talk everyone suggests renaming 'VACUUM Sucks!'
VACUUM
• Cleans up after UPDATE and DELETE
• The hidden cost of MVCC
• Must happen eventually
– Frozen ID cleanup
Autovacuum
• Cleans up after dead rows
• Also updates database stats
• Large tables: 20% change required
• autovacuum_vacuum_scale_factor=0.2
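As a rough illustration of the 20% rule, here is a minimal Python sketch of the trigger point. It assumes the stock PostgreSQL defaults, where the scale factor is a fraction (0.2) added to a fixed autovacuum_vacuum_threshold of 50 rows. The function name is made up for this example.

```python
def autovacuum_vacuum_trigger(reltuples, scale_factor=0.2, base_threshold=50):
    """Dead-row count a table must exceed before autovacuum vacuums it.

    Assumes the default settings: autovacuum_vacuum_threshold = 50 and
    autovacuum_vacuum_scale_factor = 0.2 (a fraction of the table's rows).
    """
    return base_threshold + scale_factor * reltuples


# The bigger the table, the more dead rows pile up before cleanup starts.
for rows in (10_000, 1_000_000, 100_000_000):
    print(f"{rows:>11} rows: vacuum after about "
          f"{round(autovacuum_vacuum_trigger(rows))} dead rows")
```

On a 100 million row table that is about 20 million dead rows before autovacuum starts, which is why large tables often need a smaller per-table setting.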
VACUUM Overhead
• Intensive when it happens
• Focus on smoothing and scheduling
• Dead rows add invisible overhead
– Putting it off makes it worse
– Table “bloat” can be very large
– Thresholds can be per-table
Index Bloating

Indexes can become less efficient after deletes

VACUUM FULL before 9.0 makes this worse

REINDEX helps, but it locks the table

CREATE INDEX can run CONCURRENTLY
– Rename: simulate REINDEX CONCURRENTLY
– All transactions must end to finish

CLUSTER does a full table rebuild
– Same “fresh” performance as after dump/reload
– Full table lock to do it
VACUUM Gone Wrong
• Aim at a target peak performance
• VACUUM isn't accounted for
• Just survive peak load?
– You won't survive VACUUM
VACUUM monitoring
• Watch pg_stat_user_tables timestamps
• Beware long-running transactions
• log_autovacuum_min_duration
• Sizes of tables/indexes critical too
Improving efficiency
• maintenance_work_mem: up to 2GB
• shared_buffers: 512MB to 8GB
• checkpoint_segments: 16 to 256
• Hardware write caches
• Tune read-ahead
VACUUM Cost Limits
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 200
autovacuum_vacuum_cost_delay = 20ms
autovacuum Cost Basics

Every 20 ms = 50 runs/second

Each run accumulates 200 cost units

200 * 50 = 10000 cost / second
Costs and Disk I/O
20ms = 10000 cost/second

All misses @ 10 cost?
– 10000 / 10 = 1000 reads/second
– 1000*8192/(1024*1024) = 7.8 MB/s read

All dirty @ 20 cost?
– 10000 / 20 = 500 writes/second
– 500*8192/(1024*1024) = 3.9 MB/s write

Halve the delay to 10ms?
– Doubles the rate: 15.6 MB/s read / 7.8 MB/s write

Double the delay to 40ms?
– Halves the rate: 3.9 MB/s read / 1.95 MB/s write
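The arithmetic on this slide can be replayed with a short Python sketch. It uses the cost settings from the earlier slides and assumes the default 8192-byte page. The function name is invented for illustration. The result is only an upper bound, since it ignores the time vacuum spends doing the work between sleeps.

```python
BLOCK_SIZE = 8192  # bytes per page, assuming the default PostgreSQL block size


def vacuum_rate_mb_per_s(page_cost, cost_limit=200, cost_delay_ms=20):
    """Upper bound on vacuum throughput if every page costs `page_cost` units.

    page_cost is 10 for an all-miss workload (vacuum_cost_page_miss) or
    20 for an all-dirty workload (vacuum_cost_page_dirty).
    """
    rounds_per_second = 1000 / cost_delay_ms          # 20 ms -> 50 rounds
    cost_per_second = cost_limit * rounds_per_second  # 200 * 50 = 10000
    pages_per_second = cost_per_second / page_cost
    return pages_per_second * BLOCK_SIZE / (1024 * 1024)


for delay in (10, 20, 40):
    read = vacuum_rate_mb_per_s(page_cost=10, cost_delay_ms=delay)
    write = vacuum_rate_mb_per_s(page_cost=20, cost_delay_ms=delay)
    print(f"{delay} ms delay: {read:.2f} MB/s read, {write:.2f} MB/s write")
```

Changing cost_limit instead of the delay scales the rates the same way, so either knob can be used to hit a target I/O budget.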
Submission for 9.2

“Displaying accumulated autovacuum cost”

Patch submitted to pgsql-hackers in August

Easily applies to older versions
– Not very risky to production
– Just adds some logging

Useful for learning how to tune costs

Pushing to commit next month
Sample logging output
LOG: automatic vacuum of table "pgbench.public.pgbench_accounts": index scans: 1
pages: 0 removed, 163935 remain
tuples: 2000000 removed, 2928356 remain
buffer usage: 117393 hits, 123351 misses, 102684 dirtied, 2.168 MiB/s write rate
system usage: CPU 2.54s/6.27u sec elapsed 369.99 sec
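To see how the logged numbers relate to the cost settings, the following Python sketch recomputes the rates from the sample above. It assumes 8192-byte pages, treats each miss as one page read, and applies the hit/miss/dirty costs of 1/10/20 from the cost limits slide. The regular expressions are written against this sample only, not against any guaranteed log format.

```python
import re

BLOCK_SIZE = 8192  # bytes per page, assuming the default block size

# The two relevant lines copied from the sample log entry.
log_entry = """\
buffer usage: 117393 hits, 123351 misses, 102684 dirtied, 2.168 MiB/s write rate
system usage: CPU 2.54s/6.27u sec elapsed 369.99 sec"""

hits = int(re.search(r"(\d+) hits", log_entry).group(1))
misses = int(re.search(r"(\d+) misses", log_entry).group(1))
dirtied = int(re.search(r"(\d+) dirtied", log_entry).group(1))
elapsed = float(re.search(r"elapsed ([\d.]+) sec", log_entry).group(1))


def mib_per_s(pages, seconds):
    """Convert a page count over an interval into MiB/s."""
    return pages * BLOCK_SIZE / (1024 * 1024) / seconds


write_rate = mib_per_s(dirtied, elapsed)
read_rate = mib_per_s(misses, elapsed)

# Accumulated cost, using page_hit = 1, page_miss = 10, page_dirty = 20.
total_cost = hits * 1 + misses * 10 + dirtied * 20
cost_rate = total_cost / elapsed

print(f"write rate: {write_rate:.3f} MiB/s")
print(f"read rate:  {read_rate:.3f} MiB/s")
print(f"cost rate:  {cost_rate:.0f} per second")
```

The recomputed write rate matches the 2.168 MiB/s in the log, and the cost rate lands a little under the 10000 per second budget implied by a limit of 200 every 20 ms.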
Next steps for patch

Progress report

Prototype via update_process_title

Show advance of cost accumulation

pg_stat_progress?
– Track other maintenance work too
– Index rebuilds, CLUSTER, etc.
– Queries are much harder
Common tricks

Manual VACUUM during slower periods
– Make sure to set vacuum_cost_delay
– Start with daily, break down by table size

Alternate fast/slow configurations
– Two postgresql.conf files, or edit script
– Swap/change using cron or pgAgent

Aggressive freezing
Write to disk, slow way

Data page change to pg_xlog WAL

Checkpoint pushes page to disk

Hint bits update page for faster visibility

Autovacuum marks free space

Freeze old transaction IDs
Manually maintained path

Data page change to pg_xlog WAL

Checkpoint pushes page to disk

Manually freeze old transaction IDs
– Tweak vacuum_freeze_min_age and/or vacuum_freeze_table_age
Future ideas

Set cost parameters in MB/s
– Tricky due to OS caching

Measure operating system rate directly
– Prototype for Linux
– Hard to make cross-platform

Advise about workloads
– INSERT-only tables
Questions
PostgreSQL Books
http://www.2ndQuadrant.com/books/
2ndQuadrant Training
Available now in UK, Italy, Germany

On-site only in the US (so far)

Includes hands-on workshops with instructor interaction

Database administration

Performance

Replication and Recovery

Real-world experience from production DBAs
