Chapter 6 - Explain

“Fall seven times, stand up eight.”

- Japanese Proverb

Three Ways to Run an EXPLAIN

image

When you run an EXPLAIN, you see the plan that the optimizer on the Leader Node will pass to the slices. There are three ways to see an EXPLAIN: you can type the word EXPLAIN in front of any SQL, you can hit F6 (Function Key 6), or you can click on the magnifying glass in Nexus. The EXPLAIN shows the plan but does NOT run the actual query, so once you see the costs in the EXPLAIN, you can decide whether or not to run the query.
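For example, prefixing any statement with the keyword EXPLAIN returns the plan without touching the data (the table and columns below are drawn from examples later in this chapter):

```sql
-- Show the plan only; the query itself is never executed.
EXPLAIN
SELECT dept_no,
       AVG(salary)
FROM   employee_table
GROUP  BY dept_no;
```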

EXPLAIN – Steps, Segments and Streams

Step – Each individual step is an individual operation in the explain plan. Steps can be combined to allow compute nodes to perform a query, join, subquery, or other type of database operation.

Segment – A segment is a number of steps that can be done by a single process; it is a single compilation unit executable by the compute nodes. Each segment begins with a scan (a reading of table data) and ends with either a materialization step or some other network activity.

Stream – A collection of segments. A stream always begins with a scan or reading of some data set and ends with a materialization or blocking step. Materialization or blocking steps include HASH, AGG, SORT, and SAVE.

Last segment – The term last segment refers to the segment that returns the data. If the return set is aggregated or sorted, the compute nodes each send their piece of the intermediate result to the leader node, which then merges the data so the final result can be sent back to the requesting client.

EXPLAIN Terms For Scans and Joins

Sequential Scan – Also termed as a scan. This means that Amazon Redshift will scan the entire table sequentially from beginning to end; think of this as a full table scan. If a WHERE clause is specified, Redshift also evaluates the query constraints for every row (Filter). Scans are also used when running INSERT, UPDATE, and DELETE statements.

Merge Join – Also termed as an mjoin. This is commonly used for inner joins and outer joins when the joined tables are both distributed and sorted on the joining columns, and it is generally the fastest Amazon Redshift join algorithm.

Hash Join – Also termed as an hjoin. This is also used for inner joins and left and right outer joins, and it is typically faster than a nested loop join. Hash Join reads the outer table, hashes the joining column, and finds matches in the inner hash table.

Nested Loop Join – Also termed as an nloop. This is the least optimal join, and it is mainly used for cross-joins (Cartesian product joins). It is also used for joins without a join condition and for inequality joins.
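The join algorithm chosen depends heavily on how the tables are distributed and sorted. As a sketch (the DDL below is purely illustrative), tables that share the joining column as both DISTKEY and SORTKEY are candidates for the fast Merge Join:

```sql
-- Illustrative DDL: both tables distributed and sorted on student_id.
CREATE TABLE student_table (
    student_id INTEGER,
    last_name  VARCHAR(20)
)
DISTKEY (student_id)
SORTKEY (student_id);

CREATE TABLE student_course_table (
    student_id INTEGER,
    course_no  INTEGER
)
DISTKEY (student_id)
SORTKEY (student_id);

EXPLAIN
SELECT s.last_name,
       c.course_no
FROM   student_table s
JOIN   student_course_table c
  ON   s.student_id = c.student_id;
-- With both tables distributed and sorted on student_id,
-- the plan can show XN Merge Join DS_DIST_NONE.
```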

EXPLAIN Terms For Aggregation and Sorts

Aggregate – Also termed as an aggr. This is an operator/step for scalar aggregate functions.

HashAggregate – Also termed as an aggr. This is an operator/step for grouped aggregate functions. It can operate from disk by virtue of the hash table spilling to disk.

GroupAggregate – Also termed as an aggr. This is an operator that is sometimes chosen for grouped aggregate queries, but only if the Amazon Redshift force_hash_grouping configuration setting is off.

Sort – Also termed as a sort. This evaluates the ORDER BY clause as well as other sort operations, such as the sorts required by UNION queries and joins. It can operate from disk.

Merge – Also termed as a merge. This produces the final sorted results of a query based on intermediate sorted results derived from operations performed in parallel.
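A grouped aggregate with an ORDER BY can exercise several of these operators at once. A sketch, using names from examples in this chapter:

```sql
EXPLAIN
SELECT class_code,
       COUNT(*)
FROM   student_table
GROUP  BY class_code
ORDER  BY class_code;
-- An illustrative plan, read from the bottom up:
--   XN Merge           (final sorted result on the leader node)
--     XN Sort          (each slice sorts its groups)
--       XN HashAggregate
--         XN Seq Scan on student_table
```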

EXPLAIN Terms For Set Operators and Miscellaneous Terms

SetOp Except – Also termed as an hjoin. This is used only for EXCEPT queries.

Hash Intersect – Also termed as an hjoin. This is used for INTERSECT queries.

Append – Also termed as save. This is the append used with a Subquery Scan to implement UNION and UNION ALL queries.

Limit – Also termed as limit. This term evaluates the LIMIT clause.

Materialize – Also termed as save. This term means to materialize rows for input to nested loop joins and some merge joins. This can operate from disk.

Unique – Also termed as unique. This term means to eliminate duplicate rows, as required by SELECT DISTINCT queries and UNION queries. This can operate from disk.

Window – Also termed as window. This term means to compute aggregate and ranking window functions. This can operate from disk.

EXPLAIN Terms For Network Operations and DML

Network (Broadcast) – Also termed as a bcast. Broadcast is also considered an attribute of the Join Explain operators and steps.

Network (Distribute) – Also termed as a dist. This is used to distribute rows to compute nodes for parallel processing by the data warehouse cluster.

Network (Send to Leader) – Also termed as return. This sends results back to the leader for further processing.

INSERT (Using Results) – Also termed as insert. This inserts data.

Delete (Scan and Filter) – Also termed as delete. This term means to delete data. This operation can operate from disk.

Update (Scan and Filter) – Also termed as delete, insert. This term means an UPDATE is implemented as a delete followed by an insert.

EXPLAIN Example and the Cost

image

Costs are cumulative as you read up the plan, so the HashAggregate cost number in this example (0.09) (red arrow) consists mostly of the Seq Scan cost below it (0.06) (blue arrow). So, the first scan (blue arrow) was 0.06 and the final cost is 0.09, which means the final step was a 0.03 cost. Add up 0.06 (blue arrow) and 0.03 (red arrow) and you get 0.09!
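Sketched as plan text (the cost numbers mirror the example above; the query and row counts are illustrative):

```sql
EXPLAIN
SELECT dept_no,
       SUM(salary)
FROM   employee_table
GROUP  BY dept_no;
-- Illustrative plan; costs are cumulative reading upward:
--   XN HashAggregate  (cost=0.09..0.09 rows=6 width=12)
--     ->  XN Seq Scan on employee_table  (cost=0.00..0.06 rows=9 width=12)
--
-- Final cost 0.09 = scan cost 0.06 + aggregate step cost 0.03.
```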

EXPLAIN Example and the Rows

image

The keyword rows in the EXPLAIN means the expected number of rows to return. In this example, the scan is expected to return 6 rows. The HashAggregate operator is expected to return 6 rows (red arrow).

EXPLAIN Example and the Width

image

The keyword width in the EXPLAIN means the estimated width of the average row, in bytes. It is important to ANALYZE the table so statistics are collected; with current statistics, the accuracy of the estimated width and number of rows can improve dramatically.
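Statistics come from the ANALYZE command. For example (the column names are illustrative):

```sql
-- Refresh statistics for the whole table ...
ANALYZE student_table;

-- ... or for specific columns only.
ANALYZE student_table (class_code, grade_pt);
```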

Simple EXPLAIN Example and the Costs

image

Above, you can see a simple query with a simple explain plan. This is designed to show how the cost works.

EXPLAIN Join Example Using DS_BCAST_INNER

image

A Broadcast (BCAST) means to duplicate the table in its entirety across all nodes. We are duplicating the Department_Table on all nodes.
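A sketch of a join that can trigger a broadcast (the column names are assumptions for illustration):

```sql
EXPLAIN
SELECT e.last_name,
       d.department_name
FROM   employee_table e
JOIN   department_table d
  ON   e.dept_no = d.dept_no;
-- If neither table is distributed on dept_no, expect
--   XN Hash Join DS_BCAST_INNER
-- meaning the inner table (department_table) is copied
-- in full to every compute node.
```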

EXPLAIN Join Example Using DS_DIST_NONE

image

Take a look at the above query and the explain plan. We are joining three tables. The Student_Table and the Student_Course_Table are joined first. They both have a Distribution Key of Student_ID. Since they join on this column, there is no need to redistribute the data, thus the DS_DIST_NONE keyword. Matching rows are already on the same node.

EXPLAIN Showing DS_DIST_NONE Visually

image

The Student_Table and the Student_Course_Table are joined first on Student_ID. Both tables have a Distribution Key of Student_ID. Since they join on Student_ID, the matching rows are already on the same slice, so there is no need to redistribute the data, thus the DS_DIST_NONE keyword. Then, the Course_Table is broadcast to all slices (DS_BCAST_INNER), where it can be joined to the results of the first join. Data movement happens in joins because two rows can only be joined when they reside on the same slice, in the same memory.
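The three-table join described might be written like this (the column names are assumptions for illustration):

```sql
EXPLAIN
SELECT s.last_name,
       sc.course_no,
       c.course_name
FROM   student_table s
JOIN   student_course_table sc          -- DS_DIST_NONE: both tables are
  ON   s.student_id = sc.student_id     -- distributed on student_id
JOIN   course_table c                   -- DS_BCAST_INNER: course_table is
  ON   sc.course_no = c.course_no;      -- broadcast to all slices
```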

EXPLAIN With a Warning

image

The example above might become a problem. The costs look high, and the EXPLAIN plan is explicitly warning you to review the join predicates to avoid Cartesian products.
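The classic trigger for this warning is a join with a missing predicate, as in this sketch:

```sql
-- No join condition between the two tables: every row of one
-- is paired with every row of the other (a Cartesian product).
EXPLAIN
SELECT *
FROM   student_table s,
       course_table c;
-- Expect XN Nested Loop in the plan, along with a warning to
-- review the join predicates.
```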

EXPLAIN For Ordered Analytics Such as CSUM

image

Let's examine this EXPLAIN plan from the bottom up. The first thing done (bottom) is a sequential scan of the sales_table. The cost for reading the first row is 0. The cost for reading all rows is 0.21. Then, the data is passed to slice 0. The keyword Network means to send intermediate results to the leader node for further processing. On the leader node, the data is sorted by Product_Id and Sale_Date. Then, the actual Cumulative Sum, Moving Sum, and Moving Avg are calculated. Notice the cost of the first row and the last row are both large. They expect 21 rows to be returned with an average width per row of 21 bytes. This is a good approach for analyzing an EXPLAIN plan.
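A query along these lines could produce such a plan (the column names are assumptions for illustration):

```sql
EXPLAIN
SELECT product_id,
       sale_date,
       SUM(daily_sales) OVER (ORDER BY product_id, sale_date
                              ROWS UNBOUNDED PRECEDING) AS cum_sum,
       SUM(daily_sales) OVER (ORDER BY product_id, sale_date
                              ROWS 2 PRECEDING)         AS mov_sum,
       AVG(daily_sales) OVER (ORDER BY product_id, sale_date
                              ROWS 2 PRECEDING)         AS mov_avg
FROM   sales_table;
-- Read bottom-up: Seq Scan -> Network (send intermediate results) ->
-- Sort on product_id, sale_date -> Window functions computed.
```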

EXPLAIN For Scalar Aggregate Functions

image

XN Aggregate (cost=0.11..0.11 rows=1 width=12)

  ->  XN Seq Scan on employee_table (cost=0.00..0.09 rows=9 width=12)

The keyword Aggregate in the EXPLAIN is used for aggregation scalar functions. A scalar function means that only one row and one column are returned in the answer set for an aggregation function. Notice that the above query produces a scalar result. The AVG(Salary) in the Employee_Table is $46782.15. That result is only one column and one row! It is scalar!
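The plan shown could come from a query such as this (the column name is assumed):

```sql
-- One aggregate, no GROUP BY: the result is scalar --
-- a single row and a single column.
EXPLAIN
SELECT AVG(salary)
FROM   employee_table;
```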

EXPLAIN For HashAggregate Functions

image

XN HashAggregate (cost=0.18..0.22 rows=5 width=14)

  ->  XN Seq Scan on employee_table (cost=0.00..0.09 rows=9 width=14)

The keyword HashAggregate in the EXPLAIN is used for unsorted grouped aggregate functions. Notice there is no sort!
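A plan like this typically comes from a GROUP BY with no ORDER BY. A sketch, with an assumed column name:

```sql
EXPLAIN
SELECT dept_no,
       COUNT(*)
FROM   employee_table
GROUP  BY dept_no;   -- no ORDER BY, so no Sort step appears
```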

EXPLAIN Using Limit, Merge and Sort

image

The keyword Limit is used to evaluate the LIMIT clause. The keyword Sort is used to evaluate the ORDER BY clause. The Keyword Merge is used when producing the final sorted results, which is derived from intermediate sorted results that each slice parallel processed. Remember, each slice must perform their work. Then, the data is sorted on each slice and passed to the leader node where a Merge operation is performed.
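Sketched as a query (the sort column is assumed):

```sql
EXPLAIN
SELECT *
FROM   student_table
ORDER  BY grade_pt DESC
LIMIT  5;
-- Illustrative plan, bottom-up: each slice scans and sorts its
-- rows; the leader node then performs the Merge of the sorted
-- pieces and applies the final Limit.
```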

EXPLAIN Using a WHERE Clause Filter

image

XN Seq Scan on student_table (cost=0.00..0.12 rows=4 width=53)

  Filter: (class_code = 'FR'::bpchar)

The keyword Filter is used to evaluate the WHERE clause. In the above example, we are filtering the returning rows by only looking for Freshman who have a class_code of 'FR'. Our EXPLAIN (in yellow) shows the keyword filter looking for 'FR'.
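The plan shown corresponds to a query like:

```sql
EXPLAIN
SELECT *
FROM   student_table
WHERE  class_code = 'FR';   -- shows as Filter: (class_code = 'FR'::bpchar)
```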

EXPLAIN Using the Keyword Distinct

image

The keyword Unique in the EXPLAIN is used to evaluate the Distinct clause. In the above example, we do a sequential scan of the entire student_table. Then, you see the keyword UNIQUE in the EXPLAIN plan. This ensures that no duplicate values will be returned. The data is then sorted on each slice and sent to the leader node for a final merge among all slices.
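For example, a sketch:

```sql
EXPLAIN
SELECT DISTINCT class_code
FROM   student_table;
-- Bottom-up: Seq Scan -> Unique (drop duplicate values) ->
-- Sort on each slice -> final Merge on the leader node.
```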

EXPLAIN for Subqueries

image

A subquery involves at least two queries: a top (outer) query and a bottom (inner) query. In the above example, the bottom query is run first on the Department_Table. The result set consists of the column Dept_No. The Employee_Table (top query) is scanned next. Then, the results of both the Department_Table and the Employee_Table scans are hashed on Dept_No. This places all matches on the same slice, and the rows can then be joined using a Hash Join in memory.
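The subquery described could look like this:

```sql
EXPLAIN
SELECT last_name
FROM   employee_table                       -- top (outer) query
WHERE  dept_no IN (SELECT dept_no
                   FROM   department_table); -- bottom (inner) query
-- Both scans are hashed on dept_no so that matches land on the
-- same slice, then joined with an in-memory Hash Join.
```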
