#StackBounty: #postgresql #performance #join #database-design #database-performance What is the best way to join the same table twice i…

Bounty: 50

The performance with the second join on the same table is degraded nearly half

SELECT * FROM party_party_relationship AS ppr 
    LEFT JOIN party_role AS r1 ON r1.party_role_uid = ppr.party_role_uid
    LEFT JOIN party_role AS r2 ON r2.party_role_uid = ppr.party_role_uid_related

Performance with first Join

"Hash Left Join  (cost=288.18..547.72 rows=10972 width=144) (actual time=5.281..17.781 rows=11192 loops=1)"
"  Hash Cond: (ppr.party_role_uid = r1.party_role_uid)"
"  ->  Seq Scan on party_party_relationship ppr  (cost=0.00..230.72 rows=10972 width=98) (actual time=0.020..2.438 rows=11192 loops=1)"
"  ->  Hash  (cost=181.97..181.97 rows=8497 width=46) (actual time=5.186..5.187 rows=9946 loops=1)"
"        Buckets: 16384  Batches: 1  Memory Usage: 823kB"
"        ->  Seq Scan on party_role r1  (cost=0.00..181.97 rows=8497 width=46) (actual time=0.010..2.073 rows=9946 loops=1)"
"Planning Time: 0.472 ms"
"Execution Time: 18.765 ms"

With two join

Performance with the second join on the same table almost doubled

"Hash Left Join  (cost=576.37..864.71 rows=10972 width=190) (actual time=9.871..31.986 rows=11192 loops=1)"
"  Hash Cond: (ppr.party_role_uid_related = r2.party_role_uid)"
"  ->  Hash Left Join  (cost=288.18..547.72 rows=10972 width=144) (actual time=5.163..18.437 rows=11192 loops=1)"
"        Hash Cond: (ppr.party_role_uid = r1.party_role_uid)"
"        ->  Seq Scan on party_party_relationship ppr  (cost=0.00..230.72 rows=10972 width=98) (actual time=0.015..2.735 rows=11192 loops=1)"
"        ->  Hash  (cost=181.97..181.97 rows=8497 width=46) (actual time=5.091..5.092 rows=9946 loops=1)"
"              Buckets: 16384  Batches: 1  Memory Usage: 823kB"
"              ->  Seq Scan on party_role r1  (cost=0.00..181.97 rows=8497 width=46) (actual time=0.008..2.030 rows=9946 loops=1)"
"  ->  Hash  (cost=181.97..181.97 rows=8497 width=46) (actual time=4.644..4.644 rows=9946 loops=1)"
"        Buckets: 16384  Batches: 1  Memory Usage: 823kB"
"        ->  Seq Scan on party_role r2  (cost=0.00..181.97 rows=8497 width=46) (actual time=0.014..1.810 rows=9946 loops=1)"
"Planning Time: 0.925 ms"
"Execution Time: 32.920 ms"

With one join

The above query is just a part of the whole query.

SELECT * FROM party_party_relationship AS ppr 
    INNER JOIN party_role AS r1 ON r1.party_role_uid = ppr.party_role_uid
        INNER JOIN party AS p1 ON p1.party_uid = r1.party_uid
                LEFT JOIN party_name AS n1 ON n1.party_uid = p1.party_uid AND n1.end_date IS NULL
                LEFT JOIN business_number AS b1 ON b1.party_uid = p1.party_uid AND b1.business_number_cd = p1.business_number_cd AND b1.end_date IS NULL

    INNER JOIN party_role AS r2 ON r2.party_role_uid = ppr.party_role_uid_related
        INNER JOIN party AS p2 ON p2.party_uid = r2.party_uid
                LEFT JOIN party_name AS n2 ON n2.party_uid = p2.party_uid AND n2.end_date IS NULL
                LEFT JOIN business_number AS b2 ON b2.party_uid = p2.party_uid AND b2.business_number_cd = p2.business_number_cd AND b2.end_date IS NULL
                
                WHERE ppr.case_uid = 9

Execution Plan

"Nested Loop Left Join  (cost=1113.46..3576.37 rows=915 width=772) (actual time=19.687..76.911 rows=919 loops=1)"
"  ->  Nested Loop Left Join  (cost=1113.31..3270.33 rows=915 width=694) (actual time=19.616..56.253 rows=919 loops=1)"
"        Join Filter: (n1.end_date IS NULL)"
"        ->  Hash Left Join  (cost=1113.03..2415.51 rows=915 width=547) (actual time=19.588..51.236 rows=915 loops=1)"
"              Hash Cond: (r1.party_uid = p2.party_uid)"
"              ->  Hash Left Join  (cost=856.60..2156.68 rows=915 width=481) (actual time=15.192..45.391 rows=915 loops=1)"
"                    Hash Cond: (ppr.party_role_uid_related = r2.party_role_uid)"
"                    ->  Nested Loop Left Join  (cost=568.42..1866.09 rows=915 width=435) (actual time=9.743..38.415 rows=915 loops=1)"
"                          ->  Nested Loop Left Join  (cost=568.27..1560.05 rows=915 width=357) (actual time=9.665..17.956 rows=915 loops=1)"
"                                ->  Hash Left Join  (cost=567.99..705.23 rows=915 width=210) (actual time=9.639..12.460 rows=915 loops=1)"
"                                      Hash Cond: (r1.party_uid = p1.party_uid)"
"                                      ->  Hash Left Join  (cost=311.56..446.40 rows=915 width=144) (actual time=5.314..7.056 rows=915 loops=1)"
"                                            Hash Cond: (ppr.party_role_uid = r1.party_role_uid)"
"                                            ->  Bitmap Heap Scan on party_party_relationship ppr  (cost=23.38..155.81 rows=915 width=98) (actual time=0.111..0.536 rows=915 loops=1)"
"                                                  Recheck Cond: (insolvency_case_uid = 9)"
"                                                  Heap Blocks: exact=18"
"                                                  ->  Bitmap Index Scan on ixfk_party_party_relationship_insolvency_case  (cost=0.00..23.15 rows=915 width=0) (actual time=0.097..0.097 rows=926 loops=1)"
"                                                        Index Cond: (insolvency_case_uid = 9)"
"                                            ->  Hash  (cost=181.97..181.97 rows=8497 width=46) (actual time=5.149..5.149 rows=9960 loops=1)"
"                                                  Buckets: 16384  Batches: 1  Memory Usage: 824kB"
"                                                  ->  Seq Scan on party_role r1  (cost=0.00..181.97 rows=8497 width=46) (actual time=0.009..1.979 rows=9960 loops=1)"
"                                      ->  Hash  (cost=161.19..161.19 rows=7619 width=66) (actual time=4.290..4.290 rows=7449 loops=1)"
"                                            Buckets: 8192  Batches: 1  Memory Usage: 701kB"
"                                            ->  Seq Scan on party p1  (cost=0.00..161.19 rows=7619 width=66) (actual time=0.013..1.680 rows=7449 loops=1)"
"                                ->  Index Scan using ixfk_party_name_party on party_name n1  (cost=0.28..0.92 rows=1 width=147) (actual time=0.004..0.005 rows=1 loops=915)"
"                                      Index Cond: (party_uid = p1.party_uid)"
"                                      Filter: (end_date IS NULL)"
"                                      Rows Removed by Filter: 0"
"                          ->  Index Scan using ex_business_number_end_date on business_number b1  (cost=0.15..0.32 rows=1 width=78) (actual time=0.020..0.021 rows=1 loops=915)"
"                                Index Cond: ((party_uid = p1.party_uid) AND (business_number_cd = p1.business_number_cd))"
"                    ->  Hash  (cost=181.97..181.97 rows=8497 width=46) (actual time=5.293..5.293 rows=9960 loops=1)"
"                          Buckets: 16384  Batches: 1  Memory Usage: 824kB"
"                          ->  Seq Scan on party_role r2  (cost=0.00..181.97 rows=8497 width=46) (actual time=0.010..1.799 rows=9960 loops=1)"
"              ->  Hash  (cost=161.19..161.19 rows=7619 width=66) (actual time=4.313..4.314 rows=7449 loops=1)"
"                    Buckets: 8192  Batches: 1  Memory Usage: 701kB"
"                    ->  Seq Scan on party p2  (cost=0.00..161.19 rows=7619 width=66) (actual time=0.011..1.587 rows=7449 loops=1)"
"        ->  Index Scan using ixfk_party_name_party on party_name n2  (cost=0.28..0.92 rows=1 width=147) (actual time=0.003..0.003 rows=1 loops=915)"
"              Index Cond: (party_uid = p2.party_uid)"
"  ->  Index Scan using ex_business_number_end_date on business_number b2  (cost=0.15..0.32 rows=1 width=78) (actual time=0.020..0.020 rows=1 loops=919)"
"        Index Cond: ((party_uid = p2.party_uid) AND (business_number_cd = p2.business_number_cd))"
"Planning Time: 4.499 ms"
"Execution Time: 77.433 ms"

Plan in Graph

Part of Execution plan - graph

Is there any better way to do it? The table is expected to grow very fast.


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.