EXPLAIN to me, why are you so slow?

EXPLAIN to me,
why are you so
slow?

Brought to you by Alexey Vasiliev, Railsware

Oleksii Vasyliev

Web and Mobile Developer (Ruby, Java, JavaScript, Objective-C, C/C++, Golang, Elixir/Erlang), DevOps
Open-Source libs: PGTune, SQL Joins Visualizer, RWbox, Go-Kinesis, ElixirV8, WebP-ffi, Zopfli-ffi, MongodbLogger, SMTRails, SHTRails, ...
Open-Source books: Setting up and scaling of PostgreSQL (Russian), Cooking Infrastructure by Chef
Leading RWpod podcast about Ruby and JavaScript

Your SQL query is slow.
What to check?

Health state of a host system
PostgreSQL configuration
Database structure is not well defined/normalized
Stored procedures for complex computations on database
Query is not well formed
No index
Number of rows in the table too large
Planner running queries inefficiently
Table bloating
many other...

EXPLAIN

PostgreSQL devises a query plan for each query it receives. Choosing the right plan to match the query structure and the properties of the data is absolutely critical for good performance, so the system includes a complex planner that tries to choose good plans. You can use the EXPLAIN command to see what query plan the planner creates for any query

EXPLAIN ANALYZE

The ANALYZE option causes the sql statement to be executed first and then actual run-time statistics in the returned information including total elapsed time expended within each plan node and the number of rows it actually returned

EXPLAIN ANALYZE

The ANALYZE statement actually executes the SQL statement and discards the output information, therefore, if you want to analyze any update statement such as INSERT, UPDATE, or DELETE without affecting the data, you should wrap the EXPLAIN ANALYZE in a transaction as follow:

BEGIN;
EXPLAIN ANALYZE sql_statement;
ROLLBACK;

EXPLAIN VERBOSE

The VERBOSE parameter allows you to show additional information regarding the plan

EXPLAIN COSTS

The COSTS option includes the estimated startup and total costs of each plan node, as well as the estimated number of rows and the estimated width of each row in the query plan

EXPLAIN BUFFERS

This parameter adds information to the buffer usage. BUFFERS only can be used when ANALYZE is enabled

EXPLAIN TIMING

This parameter includes the actual startup time and time spent in each node in the output. The TIMING defaults to TRUE and it may only be used when ANALYZE is enabled

EXPLAIN SUMMARY

The SUMMARY parameter adds summary information such as total timing after the query plan. Note that when ANALYZE option is used, the summary information is included by default

EXPLAIN FORMAT

Specify the output format of the query plan such as TEXT, XML, JSON, and YAML. This parameter is set to TEXT by default

EXPLAIN results

EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2=t2.unique2 ORDER BY t1.fivethous;
                                                              QUERY PLAN
  --------------------------------------------------------------------------------------------------------------------------------------------
  Sort (cost=717.34..717.59 rows=101 width=488) (actual time=7.761..7.774 rows=100 loops=1)
    Sort Key: t1.fivethous
    Sort Method: quicksort Memory: 77kB
    -> Hash Join (cost=230.47..713.98 rows=101 width=488) (actual time=0.711..7.427 rows=100 loops=1)
        Hash Cond: (t2.unique2 = t1.unique2)
        -> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244) (actual time=0.007..2.583 rows=10000 loops=1)
          -> Hash (cost=229.20..229.20 rows=101 width=244) (actual time=0.659..0.659 rows=100 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 28kB
              -> Bitmap Heap Scan on tenk1 t1 (cost=5.07..229.20 rows=101 width=244) (actual time=0.080..0.526 rows=100 loops=1)
                Recheck Cond: (unique1 < 100)
                  -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) (actual time=0.049..0.049 rows=100 loops=1)
                    Index Cond: (unique1 < 100)
Planning time: 0.194 ms
Execution time: 8.008 ms

First reaction

EXPLAIN tree structure

Sort
└── Hash Join
    ├── Seq Scan
    └── Hash
        └── Bitmap Heap Scan
            └── Bitmap Index Scan

Index and Table Access

Seq Scan - operation scans the entire relation (table) as stored on disk (like TABLE ACCESS FULL). Computational complexity: O(N)
Index Scan - performs a B-tree traversal, walks through the leaf nodes to find all matching entries, and fetches the corresponding table data. Computational complexity: O(log(N))

Index and Table Access

Index Only Scan (9.2+) - performs a B-tree traversal and walks through the leaf nodes to find all matching entries. There is no table access needed because the index has all columns to satisfy the query (exception: MVCC visibility information). Computational complexity: O(log(N))

Index and Table Access

Bitmap Index Scan / Bitmap Heap Scan / Recheck Cond
"A plain Index Scan fetches one tuple-pointer at a time from the index, and immediately visits that tuple in the table. A bitmap scan fetches all the tuple-pointers from the index in one go, sorts them using an in-memory "bitmap" data structure, and then visits the table tuples in physical tuple-location order." - Tom Lane. Computational complexity: Index Scan + Seq Scan (M), where M - number of rows found after index scan

Join Operations

Nested Loops - joins two tables by fetching the result from one table and querying the other table for each row from the first. Computational complexity: O(N^2)
Hash Join - loads the candidate records from one side of the join into a hash table (marked with Hash in the plan) which is then probed for each record from the other side of the join. Computational complexity: O(N) but in the case of a very inefficient hash function or a large number of identical fields for a connection, there may be O(N^2)

Join Operations

Merge Join - (sort) merge join combines two sorted lists like a zipper. Both sides of the join must be presorted. Computational complexity: O(N*log(N))

Sorting and Grouping

Sort / Sort Key - sorts the set on the columns mentioned in Sort Key. The Sort operation needs large amounts of memory to materialize the intermediate result (not pipelined). Computational complexity: O(N*log(N))
GroupAggregate - aggregates a presorted set according to the group by clause. This operation does not buffer large amounts of data (pipelined). Computational complexity: O(N*log(N))

Sorting and Grouping

HashAggregate - uses a temporary hash table to group records. This operation does not require a presorted data set, instead it uses large amounts of memory to materialize the intermediate result (not pipelined). Computational complexity: O(N)

Top-N Queries

Limit - aborts the underlying operations when the desired number of rows has been fetched. The efficiency of the top-N query depends on the execution mode of the underlying operations. It is very inefficient when aborting non-pipelined operations such as Sort
WindowAgg - indicates the use of window functions

But how are the decisions made?

How does PostgreSQL optimize queries?

Dynamic programming
Genetic algorithm

Dynamic programming

System R
Time complexity: 3^n
Memory consumption: 2^n
Always finds the cheapest plan

Genetic algorithm

Common and flexible method
Can be stopped on every iteration
No guarantees

Planner Cost Constants

seq_page_cost = 1.0 # measured ON an arbitrary scale
random_page_cost = 4.0 # same scale AS above
cpu_tuple_cost = 0.01 # same scale AS above
cpu_index_tuple_cost = 0.005 # same scale AS above
cpu_operator_cost = 0.0025 # same scale AS above

Planner Cost Constants

EXPLAIN ANALYZE SELECT * FROM users WHERE id = 15;
                          QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using users_pkey on users  (cost=0.42..8.44 rows=1 width=1111)
 (actual time=1.024..1.025 rows=1 loops=1)
  Index Cond: (id = 15)
 Total runtime: 1.071 ms
(3 rows)

Planner Cost Constants

SET enable_indexscan = OFF;
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 15;
                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on users  (cost=4.43..8.44 rows=1 width=1111)
 (actual time=0.674..0.675 rows=1 loops=1)
   Recheck Cond: (id = 15)
   -> Bitmap Index Scan on users_pkey  (cost=0.00..4.43 rows=1 width=0)
   (actual time=0.664..0.664 rows=1 loops=1)
         Index Cond: (id = 15)
 Total runtime: 0.714 ms
(5 rows)

Planner Cost Constants

SET enable_bitmapscan = OFF;
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 15;
                    QUERY PLAN
-----------------------------------------------------------------
Seq Scan on users  (cost=0.00..16474.22 rows=1 width=1111)
 (actual time=0.015..867.794 rows=1 loops=1)
   Filter: (id = 15)
   Rows Removed by Filter: 310789
 Total runtime: 867.817 ms
(4 rows)

Planner Cost Constants

Index Scan using users_pkey on users (cost=0.42..8.44 rows=1 width=1111)
Bitmap Heap Scan on users (cost=4.43..8.44 rows=1 width=1111)
Seq Scan on users (cost=0.00..16474.22 rows=1 width=1111)

The range (number..number) shows cost for starting the operation row and cost for getting all rows returned by this operation

Starting cost and rows

EXPLAIN ANALYZE SELECT * FROM users ORDER BY created_at DESC LIMIT 100;
                                 QUERY PLAN
-------------------------------------------------------------------------
Limit  (cost=27570.71..27570.96 rows=100 width=1111)
 (actual time=439.334..439.455 rows=100 loops=1)
   -> Sort (cost=27570.71..28347.35 rows=310658 width=1111)
   (actual time=439.332..439.380 rows=100 loops=1)
         Sort Key: created_at
         Sort Method: top-N heapsort  Memory: 74kB
         -> Seq Scan on users  (cost=0.00..15697.58 rows=310658 width=1111)
         (actual time=0.006..205.236 rows=310789 loops=1)
 Total runtime: 439.521 ms

Width

# EXPLAIN ANALYZE SELECT * FROM users ORDER BY id DESC LIMIT 100;
                              QUERY PLAN
----------------------------------------------------------------------
 Limit  (cost=0.42..14.79 rows=100 width=1111)
 (actual time=0.971..20.660 rows=100 loops=1)
   ->  Index Scan Backward using users_pkey on users
   (cost=0.42..44641.72 rows=310658 width=1111)
   (actual time=0.968..20.578 rows=100 loops=1)
 Total runtime: 20.755 ms
(3 rows)

Width

# EXPLAIN ANALYZE SELECT id, email FROM users ORDER BY id DESC LIMIT 100;
                           QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=0.42..14.79 rows=100 width=27)
 (actual time=0.034..0.354 rows=100 loops=1)
   ->  Index Scan Backward using users_pkey on users
   (cost=0.42..44641.72 rows=310658 width=27)
   (actual time=0.032..0.263 rows=100 loops=1)
 Total runtime: 0.426 ms
(3 rows)

# EXPLAIN analyze SELECT * FROM t LIMIT 100;
                          QUERY PLAN
---------------------------------------------------------------------------
 LIMIT  (cost=0.00..9.33 ROWS=100 width=608)
 (actual TIME=0.008..0.152 ROWS=100 loops=1)
   ->  Seq Scan ON t  (cost=0.00..93333.86 ROWS=999986 width=608)
  (actual TIME=0.007..0.133 ROWS=100 loops=1)
 Total runtime: 0.181 ms
(3 ROWS)

 Nested Loop  (cost=0.00..3.17 rows=1 width=8)
(actual time=0.316..0.797 rows=2 loops=1)
   ->  Seq Scan on s  (cost=0.00..1.07 rows=1 width=4)
   (actual time=0.072..0.141 rows=2 loops=1)
         Filter: (id = 7)
         Rows Removed by Filter: 4
   ->  Seq Scan on s2  (cost=0.00..2.09 rows=1 width=4)
  (actual time=0.128..0.160 rows=1 loops=2)
         Filter: (id = 7)
         Rows Removed by Filter: 86
 Total runtime: 1.115 ms

 Nested Loop  (cost=0.00..10715.90 rows=26284 width=4449)
(actual time=0.054..291.131 rows=26284 loops=1)
  ->  Index Scan using books_index_title on books
  (cost=0.00..3306.28 rows=26284 width=3357)
  (actual time=0.033..50.773 rows=26284 loops=1)
  ->  Index Scan using categories_pkey on categories
  (cost=0.00..0.27 rows=1 width=1092)
  (actual time=0.002..0.003 rows=1 loops=26284)
        Index Cond: (categories.id = books.category_id)
Total runtime: 312.212 ms

Examples

Improve work_mem

Bitmap Heap Scan on posts  (cost=2.07..16.39 rows=9 width=704)
  Recheck Cond: (id < 10)
  -> Bitmap Index Scan on posts_pkey
  (cost=0.00..2.07 rows=9 width=0)
        Index Cond: (id < 10)

Improve work_mem

Hash Join  (cost=2579.00..53605.00 rows=50000 width=72) (actual time=66.820..959.794 rows=21 loops=1)
  Hash Cond: ((authors.name)::text = (publications.author)::text)
  ->  Seq Scan on authors  (cost=0.00..20310.00 rows=1000000 width=50) (actual time=0.059..267.217 rows=1000000 loops=1)
  ->  Hash  (cost=1270.00..1270.00 rows=50000 width=88) (actual time=38.054..38.054 rows=50000 loops=1)
        Buckets: 4096  Batches: 2  Memory Usage: 2948kB
        ->  Seq Scan on publications  (cost=0.00..1270.00 rows=50000 width=88) (actual time=0.010..14.211 rows=50000 loops=1)
Planning time: 0.489 ms
Execution time: 960.482 ms

Improve work_mem

Hash Join  (cost=1895.00..32705.00 rows=50000 width=72) (actual time=59.224..624.716 rows=21 loops=1)
  Hash Cond: ((authors.name)::text = (publications.author)::text)
  ->  Seq Scan on authors  (cost=0.00..20310.00 rows=1000000 width=50) (actual time=0.031..146.327 rows=1000000 loops=1)
  ->  Hash  (cost=1270.00..1270.00 rows=50000 width=88) (actual time=34.436..34.436 rows=50000 loops=1)
        Buckets: 8192  Batches: 1  Memory Usage: 5860kB
        ->  Seq Scan on publications  (cost=0.00..1270.00 rows=50000 width=88) (actual time=0.008..13.382 rows=50000 loops=1)
Planning time: 0.481 ms
Execution time: 625.796 ms

CREATE STATISTICS

# EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t
	WHERE a = 1 AND b = 1;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8)
 (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900

CREATE STATISTICS

# CREATE STATISTICS stts (dependencies) ON a, b FROM t;
# EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t
	WHERE a = 1 AND b = 1;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..195.00 rows=100 width=8)
 (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900

Ratio between live and dead rows

# SELECT
  relname AS ObjectName,
  pg_stat_get_live_tuples(c.oid) AS LiveTuples,
  pg_stat_get_dead_tuples(c.oid) AS DeadTuples
FROM pg_class c;

Tools

PostgreSQL EXPLAIN Visualizers

PostgreSQL tools

<Thank You!> Questions?

Contact information

web: leopard.in.ua
github: le0pard
twitter: @leopard_me