#StackBounty: #postgresql #optimization #postgresql-10 Postgres fails to propagate constraint into subquery

Bounty: 50

I have a complex query, "csq", which I'm using as a subquery. (It's actually a view, but there appears to be no difference between using the view and inlining the subquery, so forget about the view.)

I need to pick a subset of rows from this subquery efficiently. I want to write the following, but this is slow (200ms). The DB apparently materializes the complete subquery (>20k rows) before picking 4 rows from it using a hash join.

Note how the inner and outer subject tables are forced to have equal ids by the join condition, so any constraint on the outer subject could just as well be applied to the inner subject.

select *
from subject as outer_subject
    join (
      SELECT inner_subject.id                   AS subject_id,
             <more columns>
      FROM   subject as inner_subject
      JOIN <two pages query>            
    ) csq 
    on csq.subject_id = outer_subject.id
where outer_subject.no in ('A31793357992','A2125336999S');

The above query yields this plan (the relevant part is quoted further down): [analyze explain of slow version]
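
(The linked plans were produced with EXPLAIN ANALYZE; to reproduce the output of the slow variant I run something like the following, with the same subquery body as above. BUFFERS is optional.)

EXPLAIN (ANALYZE, BUFFERS)
select *
from subject as outer_subject
    join ( /* same subquery as above */ ) csq
    on csq.subject_id = outer_subject.id
where outer_subject.no in ('A31793357992','A2125336999S');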

If I put the constraint from the outer where clause directly into the subquery, like below, then it is fast (4ms). I get an extremely good plan with nested loops, where each step has at most ~300 rows:

select *
from subject as outer_subject
    join (
      SELECT inner_subject.id                   AS subject_id,
             <more columns>
      FROM   subject as inner_subject
      JOIN <two pages query>    
      -- only the next line is added        
      WHERE  inner_subject.no in ('A31793357992','A2125336999S')
    ) csq 
    on csq.subject_id = outer_subject.id
where outer_subject.no in ('A31793357992','A2125336999S');

With the above version, I get this plan: [analyze explain of efficient query]

I thought the query optimizer would be able to propagate the constraint from the outer query into the subquery, but apparently this does not happen.

My question is whether there is something fundamental that prevents the optimizer from executing the first query as fast as the second one.

The other possibility is that the optimizer is simply confused and happens not to find the good plan for the first query.

Any insights?
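
One rewrite I have considered (only a sketch with the same placeholders as above, not benchmarked) is to correlate the subquery with the outer row via LATERAL, so the restriction is applied inside it by construction instead of relying on the planner to push it down:

select outer_subject.*, csq.*
from subject as outer_subject
    cross join lateral (
      SELECT inner_subject.id                   AS subject_id,
             <more columns>
      FROM   subject as inner_subject
      JOIN <two pages query>
      -- correlation replaces the duplicated literal list
      WHERE  inner_subject.id = outer_subject.id
    ) csq
where outer_subject.no in ('A31793357992','A2125336999S');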


Apparently the DB performs a hash join in the first case: it computes the full subquery result (~21k rows) and then joins it against just the 2 matching subject rows.

Here is the relevant part of the execution plan (the table and column names might differ slightly from the examples above):

Hash Join  (cost=5251.03..5560.54 rows=4 width=54) (actual time=168.038..199.218 rows=8 loops=1)
  Hash Cond: (f_t.sav_subject_id = s.s_id)
  ->  Hash Left Join  (cost=4930.24..5183.02 rows=21600 width=57) (actual time=162.068..194.481 rows=21646 loops=1)
  <snip>
  ->  Hash  (cost=320.76..320.76 rows=2 width=18) (actual time=2.096..2.096 rows=2 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Seq Scan on j_subject s  (cost=0.00..320.76 rows=2 width=18) (actual time=0.038..2.092 rows=2 loops=1)
              Filter: ((s_no)::text = ANY ('{A31793357992,A2125336999S}'::text[]))
              Rows Removed by Filter: 10292
Planning time: 21.553 ms
Execution time: 199.635 ms

Postgres version:

PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
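
Since <two pages query> joins a fair number of relations, I also wonder whether the planner's collapse limits play a role here. Checking them, and raising them for one session as an experiment, would look like this (the settings are standard; 16 is just an arbitrary test value):

-- defaults in PostgreSQL 10 are 8, 8 and 12
SHOW from_collapse_limit;
SHOW join_collapse_limit;
SHOW geqo_threshold;

-- raise for the current session only, then re-run EXPLAIN ANALYZE
SET from_collapse_limit = 16;
SET join_collapse_limit = 16;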

