org.axiondb.engine.commands
Class AxionQueryPlanner

java.lang.Object
  extended by org.axiondb.engine.commands.AxionQueryPlanner

public class AxionQueryPlanner
extends java.lang.Object

Query Planner used in select and sub-select Query. I use Rule-based analysis and decision-making ? rules expressing proven, efficient statement execution methods determine how operations and their attributes are built and combined.

One of the important components is the Query Optimizer. Its goal, for every SQL statement, is to answer this question: What is the quickest way to execute the statement, to produce exactly the data requested by the statement in the shortest possible time?

The query processor makes extensive use of a relational algebra tree representation to model and manipulate SQL queries. These trees can be thought of as a series of pipes, valves, and other components through which data flows, entering through the bottom from one or more tables, and leaving through the top as a result set. At various points within the tree, the data are operated on as needed to produce the desired result. Each operation is represented as a node in the tree. Nodes may have one or more expressions associated with them to specify columns, conditions, and calculations associated with the operation. In Axion in most cases these trees are represented as RowIterators.

Some of the operators that may be present in the tree are:

  • Restrict ? reduces the number of output rows by eliminating those that fail to satisfy some condition applied to the input. Restrict operators appear in the tree from WHERE clauses and JOINs. example: FilteringRowIterator
  • Join ? combines two input tables into a single output table that contains some combination of rows from the inputs. Joins appear in the tree from the use of FROM clauses and from JOIN clauses. example: NestedLoopJoinRowIterator.
  • Sort ? changes the ordering of rows in an input table to produce an output table in the desired order. example: SortedRowIterator
  • Table ? represents a Table scan or an Index scan, reading data from a given table by either its default index (table scan) or a specific index (index scan). Leaf nodes of the tree are always references to database tables.

    Current Axion SQL parser does not build a formal query tree, instead it builds a AxionQueryContext object, which hold the query tree, that includes selectables, where node, from node, oder by and group by node. FromNode is nested and represet a from tree, the subqueries are nested in the context object. Then I apply rule bases analysis to optimize the query.

    The Axion query optimizer divides the optimization process into several phases. Some of the phases deal with very significant rule based cost factors and are straightforward to understand. Each phase, listed below, addresses a specific type of optimization:

  • Pushing restrictions as far down the tree as possible
  • Derive restrictions
  • Using indexes for restrictions
  • Join optimization
  • Sort optimization
  • Using index for NULL Check
  • Count Rows Optimization

    Pushing Restrict Operations Close to the Data Origin: This optimization stage consists of moving restrict operators as far down the query tree as possible. This reduces the number of tuples moving up the tree for further processing and minimizes the amount of data handled. When restrict operations on a join node cannot be moved below the join node, they are set as join conditions. When multiple predicates are moved down the tree to the same relative position, they are reassembled into a single Restrict operation. Consider the SQL clause:

    SELECT EMP.NAME FROM EMP, DEPT WHERE EMP.SAL > 4000 AND EMP.SAL <= 6000 AND EMP.DEPTNO = DEPT.DEPTNO

    The optimizer apply restrictions EMP.SAL > 4000 and EMP.SAL <= 6000 are moved down the tree, below the join node, since they apply to a single table. The restriction EMP.DEPTNO = DEPT.DEPTNO, applying to two tables, stays in the join node.

    Derive restrictions: This optimization stage consists of deriving restrict operators based on the current current restriction and join condition. Consider the following SQL clause:

    SELECT EMP.NAME FROM EMP, DEPT WHERE EMP.DEPTNO = DEPT.DEPTNO AND EMP.DEPTNO = 10

    In this stage, the Optimizer can derive DEPT.DEPTNO = 10 and push that down the tree, below the join node, since they apply to a single table. The restriction EMP.DEPTNO = DEPT.DEPTNO, applying to two tables, stays in the join node.

    Using Indexes for Restrictions: This optimization phase consists of recognizing those cases where an existing index can be used to evaluate a restriction and converting a table scan into an index bracket or set of contiguous index entries. An index bracket is extremely effective in limiting the number of rows that must be processed. Consider the following SQL clause:

    SELECT EMP.NAME FROM EMP WHERE EMP.SAL > 6000 AND EMP.DEPTNO = 10

    Instead of a separate restrict operation, the output tree uses the index EmpIdx on the DEPTNO column table EMP to evaluate the restriction DEPTNO = 10.

    Choosing the Join Algorithm: Currently Axion support Augmented nested loop (ANL) or Index Nested Loop join and Nested loop join.

    The Index Nested Loop Join or Augmented Nested Loop Join (ANL) is by far the most common join method and is the classic Axion join method. An augmented nested loop join is performed by doing a scan over the left subtree and for each row in it, performing an index bracket scan on a portion of the right subtree. The right subtree is read as many times as there are rows in the left subtree. To be a candidate for an ANL join, the subtree pair for a join node must meet the following criteria:

  • There must be an index(es) defined on the join column(s) for the table in the right subtree.
  • No other scan on that index has already been set.

    When there is an index defined on the left subtree?s table instead of on the right, the optimizer swaps the subtrees to make an ANL join possible. When neither subtree?s table has an index defined on the join column, the optimizer creats a dynamic index on one of the subtree.

    A Nested Loop Join is performed by doing a scan over the left subtree and for each row in it performing a full scan of the right subtree. This is the default join algorithm, which can be used for any join. However, it is usually less efficient than the other methods. Usually, either an existing index, or a dynamic index, used in an ANL join, will cost much less. Occasionally, when subtree cardinalities are very low (possibly because of index bracketing), nested loop will be the method with the least cost.

    Count Rows Optimization: If we are coutinig the leaf nodes, axion will use table row count instead of table scan. If index was used to scan the table, count will use the index count. e.g select count(*) from emp will get the row count from table emp, instead of making a full table scan, but select count(*) from emp where emp.id > 100 require a full table scan unless optimizer can take advantage of index, if index is scanned then, index count will be used.

    Sort Optimization: The optimizer/planner performs two separate optimizations designed to avoid sort operations whenever possible: eliminating redundant sorts and converting table scans followed by sorts into index bracket scans.

  • Eliminating Redundant Sorts: The optimizer/planner checks whether the query tree contains unnecessary sort nodes. For example, when an SQL statement contains both a GROUP BY and an ORDER BY clause that refers to the same column, at most one sort is needed. A sort node is also redundant when the immediate descendant node of the sort node is an index bracket scan on the sort column. That is, the sort is redundant when the data input to the sort was read using an index with the needed sort order.

  • Converting Table Scans to Index Bracket Scans: When a leaf node of a subtree is a table scan, the optimizer/planner checks whether any indexes that exist on the table match the sort column(s). If so, the optimizer converts the table scan to the index bracket scan and removes the sort node.

    Version:
    $Revision: 1.61 $ $Date: 2006/01/10 21:02:37 $

    Constructor Summary
    AxionQueryPlanner(AxionQueryContext context)
               
     
    Method Summary
     java.util.Map getColumnIdToFieldMap()
               
     RowIterator getPlanNodeRowIterator()
               
     RowIterator makeRowIterator(Database db, boolean readOnly)
              Makes appropriate RowIteratorfor the current query/subquery.
     
    Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Constructor Detail

    AxionQueryPlanner

    public AxionQueryPlanner(AxionQueryContext context)
    Method Detail

    getColumnIdToFieldMap

    public java.util.Map getColumnIdToFieldMap()

    getPlanNodeRowIterator

    public RowIterator getPlanNodeRowIterator()

    makeRowIterator

    public RowIterator makeRowIterator(Database db,
                                       boolean readOnly)
                                throws AxionException
    Makes appropriate RowIteratorfor the current query/subquery.

    Parameters:
    db - the current database.
    readOnly - when true, the caller does not expect to be able to modify (i.e., call RowIterator.set(org.axiondb.Row)or RowIterator.remove()on) the returned RowIterator, the returned iterator may be unmodifiable.
    Returns:
    retunrs the top most RowIterator that may wrap other underlaying RowIterator based on the specific query.
    Throws:
    AxionException