#StackBounty: postgres 9.6 index only scan on functional index logically possible but not executed

Bounty: 50

I have read about functional indices and index-only scans in the docs / wiki published by Postgres.

I now have a query like:

SELECT(xpath('/document/uuid/text()', xmldata))[1]::text,  
      (xpath('/document/title/text()', xmldata))[1]::text
FROM xmltable
WHERE(xpath('/document/uuid/text()', xmldata))[1]::text = 'some-uuid-xxxx-xxxx'

and an index:

CREATE INDEX idx_covering_index on xmltable using btree (
    ((xpath('/document/uuid/text()', xmldata))[1]::text),     
    ((xpath('/document/title/text()', xmldata))[1]::text)
)

This index is, looking at it logically, a covering index and should enable an index-only-scan, as all queried values are contained in the index (uuid and title)

I now happen to know, that Postgres only recognizes covering indices on functional indices if the columns used in the function-calls are also contained

eg.:

SELECT to_upper(column1) from table where id >10

1) cannot be covered by this index:

CREATE INDEX idx_covering_index on xmltable using btree (id, to_upper(column1));

2) but can be covered by this one:

CREATE INDEX idx_covering_index on xmltable using btree (column1, id, to_upper(column1));

thus leading to an index-only-scan.

If I now try this with my xml setup:

CREATE INDEX idx_covering_index on xmltable using btree (xmldata,
    ((xpath('/document/uuid/text()', xmldata))[1]::text),     
    ((xpath('/document/title/text()', xmldata))[1]::text)
)

I get an error:

data type xml has no default operator class for access method “btree”

fair enough, unfortunately the normally used "text_ops" or "text_pattern_ops" do not accept “xml” as input – thus rendering my index – although it would cover all values – unable to support index-only-scans.

Can this be handled in a way providing the possibility for index-only-scans?

@EDIT1:

I know that postgres cannot use the index seen at 1) as covering index, but can use an index like 2)

I also tried with very simple tables to verify this behavior, and i also kind of remember to have read this – but i cannot for the life of me remember where.

create table test (
    id serial primary key,
    quote text
)



insert into test (number, quote) values ('I do not know any clever quotes');
insert into test (number, quote) values ('I am sorry');



CREATE INDEX idx_test_functional on test using btree ((regexp_replace(quote, '^I ', 'BillDoor ')));
set enable_seqscan = off;

analyze test;

explain select quote from test where regexp_replace(quote, '^I ', 'BillDoor ') = 'BillDoor do not know any clever quotes'

--> "Index Scan using idx_test_functional on test  (cost=0.13..8.15 rows=1 width=27)"

drop index idx_test_functional;
CREATE INDEX idx_test_functional on test using btree (quote, (regexp_replace(quote, '^I ', 'BillDoor ')));

analyze test;

explain select quote from test where regexp_replace(quote, '^I ', 'BillDoor ') = 'BillDoor do not know any clever quotes'

--> "Index Only Scan using idx_test_functional on test  (cost=0.13..12.17 rows=1 width=27)"

@EDIT2:

Full table definition of xmltable:

id serial primary key (clustered),
xmldata xml (only data used to filter queries)
history xml (never queried or read, just kept in case of legal inquiry)
fileinfo text (seldom quieried, sometimes retrieved)
"timestamp" timestamp (mainly for legal inquiries too)

The Table contains approx.: 500.000 Records, the xmldata is between 350 and 800 bytes in size, history is much larger but seldom retrieved and never used in filters

For the record, to be sure got real results, i always ran analyze xmltable after i created or dropped an index

a full execution plan for the query:

explain analyze select (xpath('/document/uuid/text()', d.xmldata))[1]::text as uuid
from xmltable as d
where
(xpath('/document/uuid/text()', d.xmldata))[1]::text = 'some-uuid-xxxx-xxxx' and (xpath('/document/genre/text()', d.xmldata))[1]::text = 'bio'

covered by these indizies:

create index idx_genre on xmltable using btree (((xpath('/document/genre/text()', xmldata))[1]::text));

create index idx_uuid on xmltable using btree (((xpath('/document/uuid/text()', xmldata))[1]::text)); 

create index idx_uuid_genre on xmltable using btree (((xpath('/document/uuid/text()', xmldata))[1]::text), ((xpath('/document/genre/text()', xmldata))[1]::text));

first leads to:

"Index Scan using idx_genre on xmldata d  (cost=0.42..6303.05 rows=18154 width=32)"
"  Index Cond: (((xpath('/document/genre/text()'::text, xmldata, '{}'::text[]))[1])::text = 'bio'::text)"
"  Filter: (((xpath('/document/uuid/text()'::text, xmldata, '{}'::text[]))[1])::text = 'some-uuid-xxxx-xxxx'::text)"

fair enough i thought, just for testing i’ll force it to use the – in my mind – covering index:

drop index idx_uuid;
drop index idx_genre;

and now i get:

"Bitmap Heap Scan on xmltable d  (cost=551.13..16025.51 rows=18216 width=32)"
"  Recheck Cond: ((((xpath('/document/genre/text()'::text, xmldata, '{}'::text[]))[1])::text = 'bio'::text) AND (((xpath('/document/uuid/text()'::text, xmldata, '{}'::text[]))[1])::text = 'some-uuid-xxxx-xxxx'::text))"
"  ->  Bitmap Index Scan on idx_uuid_genre  (cost=0.00..546.58 rows=18216 width=0)"
"        Index Cond: ((((xpath('/document/genre/text()'::text, xmldata, '{}'::text[]))[1])::text = 'bio'::text) AND (((xpath('/document/uuid/text()'::text, xmldata, '{}'::text[]))[1])::text = 'some-uuid-xxxx-xxxx'::text))"

i also tried switching positions of uuid and genre in the index, same execution plan.


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.