#StackBounty: #sql-server #optimization Demystification of SQL Server optimization process

Bounty: 100

We would like to see all variants of query plan considered during a query optimization by a SQL Server optimizer. SQL Server offers quite detailed insight using querytraceon options. For example QUERYTRACEON 3604, QUERYTRACEON 8615 allows us to print out MEMO structure and QUERYTRACEON 3604, QUERYTRACEON 8619 print out a list of transformation rules applied during the optimization process. That is great, however, we have several problems with trace outputs:

  1. It seems that the MEMO structure contains only final variants of the query plan or variants that were later rewritten into the final one. Is there a way to find "unsuccessful/unpromising" query plans?
  2. The operators in MEMO do not contain a reference to SQL parts. For example, LogOp_Get operator does not contain a reference to a specific Table.
  3. The transformation rules do not contain a precise reference to MEMO operators, therefore, we can not be sure which operators were transformed by the transformation rule.

Let me show it on a more elaborated example. Let me have two artificial tables A and B:

WITH x AS (
        SELECT n FROM
        (
            VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
        ) v(n)
    ),
    t1 AS
    (
        SELECT ones.n + 10 * tens.n + 100 * hundreds.n + 1000 * thousands.n + 10000 * tenthousands.n + 100000 * hundredthousands.n as id  
        FROM x ones, x tens, x hundreds, x thousands, x tenthousands, x hundredthousands
    )
SELECT
    CAST(id AS INT) id, 
    CAST(id % 9173 AS int) fkb, 
    CAST(id % 911 AS int) search, 
    LEFT('Value ' + CAST(id AS VARCHAR) + ' ' + REPLICATE('*', 1000), 1000) AS padding
INTO A
FROM t1;


WITH x AS (
        SELECT n FROM
        (
            VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
        ) v(n)
    ),
    t1 AS
    (
        SELECT ones.n + 10 * tens.n + 100 * hundreds.n + 1000 * thousands.n AS id  
        FROM x ones, x tens, x hundreds, x thousands       
    )
SELECT
    CAST(id AS INT) id,
    CAST(id % 901 AS INT) search, 
    LEFT('Value ' + CAST(id AS VARCHAR) + ' ' + REPLICATE('*', 1000), 1000) AS padding
INTO B
FROM t1;

Right now, I run one simple query

SELECT a1.id, a1.fkb, a1.search, a1.padding
FROM A a1 JOIN A a2 ON a1.fkb = a2.id
WHERE a1.search = 497 AND a2.search = 1
OPTION(RECOMPILE, 
    MAXDOP 1,
    QUERYTRACEON 3604,
    QUERYTRACEON 8615)
    

I get quite complex output that describes MEMO structure (you may try by yourself) having 15 groups. Here is the picture, that visualizes MEMO structure using a tree.
enter image description here
From the tree one may observe that there were certain rules applied before the optimizer found the final query plan. For example join commute (JoinCommute), join to hash join (JNtoHS), or Enforce sort (EnforceSort). As mentioned it is possible to print out the whole set of rewriting rules applied by the optimizer using QUERYTRACEON 3604, QUERYTRACEON 8619 options.
The problems:

  1. We may find JNtoSM (Join to sort merge) rewriting rule in the 8619 list, however, the sort-merge operator is not in MEMO structure. I understand that the sort-merge was probably more costly, but why it is not in MEMO?
  2. How to know whether LogOp_Get operator in MEMO references to table A or table B?
  3. If I see rule GetToIdxScan - Get -> IdxScan in the 8619 list, how to map it to the MEMO operators?

There is a limited number of resources about this. I have read many of the Paul White blog posts about transformation rules and MEMO, however, the above questions remain unanswered. Thanks for any help.


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.