Chapter 5 - Secondary Indexes – An Alternate Path to the Data

“Cowards die many times before their deaths; the valiant never taste of death but once.”

– William Shakespeare

Creating a Unique Secondary Index (USI)

images

A Subtable is created on each AMP anytime a secondary index is created

We created a Unique Secondary Index (USI) on the column Emp_No named idxname.

What is in a Unique Secondary Index (USI) Subtable?

images

The USI Subtable contains two columns:

1.     Emp_No (The USI column)

2.     Row-ID of the real Primary Index of the base table

Inside the secondary index subtable is the column Emp_No (USI Column). Then, there is the corresponding Row-ID of the real Primary Index of the table. The USI subtable's Primary Index is Emp_No, and it is hashed and has a Row-ID like a normal table.

A Unique Secondary Index (USI) Subtable is Hashed

images

The USI Subtable spreads the rows evenly among the AMPs. The Primary Index of the Subtable is Emp_No (The USI column).

Subtable rows are hashed by their Primary Index (Emp_No) and distributed evenly. They contain the Row-ID of the base row thus providing a pointer to the AMP and row.

How the Parsing Engine uses the USI Subtable

images

SELECT * FROM Emp_Intl

WHERE Emp_No = 1004 ;

Parsing Engines Plan – A Two-AMP Operation

Emp_No is a Unique Secondary Index!

The PE will Hash 1004 and see which AMP holds the row in its subtable. (AMP 3)

The PE will have the BYNET contact AMP 3 with a Single AMP retrieve of row 1004.

The AMP will pass the real Row-ID of the base table row (4,1) back up to the PE.

The PE will use the Row-ID to find the base table row with another single-AMP retrieve.

A USI is a Two-AMP Operation

images

The first AMP is assigned to read the subtable and the second the base table. Two binary searches are performed in total, and one row is returned.

Creating a Non-Unique Secondary Index (NUSI)

images

A Subtable is created on each AMP

We just created a Non-Unique Secondary Index on the column First_Name.

What is in a Unique Secondary Index (USI) Subtable?

images

The NUSI Subtable only contains two columns:

1.     First_Name (The NUSI column)

2.     Row-ID of the real Primary Index of the base table

Inside the secondary index subtable is First_Name column (NUSI Column). Then, there is the corresponding Row-ID of the real Primary Index of the table. The NUSI rows get their own Row-ID, but they are not hashed to different AMPs and stay AMP local.

Non-Unique Secondary Index (NUSI) Subtable is AMP Local

images

The NUSI Subtable is AMP local because all values in the subtable are of those in the base table (on the same AMP).

Subtable rows match those of the base rows on the same AMP, which is why they call it AMP Local. A NUSI query always searches all AMPs, but the intent is not to do a Full Table Scan. If there are 50 AMPs, then a minimum of 50 binary searches are done.

How the Parsing Engine uses the NUSI Subtable

images

SELECT * FROM Emp_Intl WHERE First_Name = ‘Kyle’ ;

Parsing Engines Plan – An ALL-AMP Operation

First_Name is a Non-Unique Secondary Index!

The Parsing Engine will order each AMP to check if they have a ‘Kyle’ in their NUSI Subtable.

Each AMP will simultaneously perform a binary search on their NUSI subtable.

If an AMP has a ‘Kyle’, the PE will order them to retrieve the base row (AMP Local).

If there are 50 AMPs, then all 50 AMPs perform a binary search simultaneously. If they find a “Kyle”, they perform another binary search on their local base table.

Creating a Value-Ordered NUSI

images

The Secondary Index Subtable will be sorted by the Values and not in Hash Order.

The Hash Map Determines which AMP will own the Row

A Unique Secondary Index Subtable is hashed exactly like a Base Table.

images

Teradata uses the “Hash Formula” on the USI value for each base table row. The hashing of the row results in an answer called the “Row Hash”, and this alone, in conjunction with the system's hash map, determines which AMP holds the USI subtable row. The Parsing Engine can rerun the Hash Formula again to quickly find the row.

A Unique Primary Index Spreads the Data Evenly

images

Study the above because you are about to be asked several questions on the next page.

Quiz – Answer the Tough USI Questions

images

1) How are both tables sorted? ________

2) What was the Row-ID when Minal was hashed? _______

3) Looking in the subtable what is the Row-ID of the base for employee 1006? ________

4) When 1006 was placed in the subtable which bucket in the hash map was chosen? ______________

5) How many times is the Hash Map consulted on a query using a USI in the WHERE Clause? _______

Answer to Quiz – Answer the Tough USI Questions

images

1) How are both tables sorted? Row-ID

2) What was the Row-ID when Minal was hashed? 1,1

3) Looking in the subtable what is the Row-ID of the base for employee 1006? 6,1

4) When 1006 was placed in the subtable which bucket in the hash map was chosen? Bucket 17

5) How many times is the Hash Map consulted on a query using a USI in the WHERE Clause? 2

A Picture with a Base Table, USI, and NUSI Subtable

images

The Base Table has a Primary Index of Last_Name. The USI was created on EmpNo and the NUSI on First_Name. The USI rows are hashed to different AMPs, but the NUSI rows are AMP local. Both subtables contain the same Base Table Row-IDs.

Quiz – Tough Questions on the USI and NUSI Subtables

images

1) Why did the USI subtable row for Stover end up on AMP 3? ____________________

2) Why did NUSI subtable row for Stover end up on AMP 1? ______________________

3) What is the Row Hash of Kyle? __________

4) If a NUSI Subtable is AMP Local, then what is the purpose of hashing the first name?____________

5) How many AMPs above would be in a USI query, and how many AMPS above in a NUSI query? ___ ___

Answer – Tough Questions on the USI and NUSI Subtables

images

1) Why did the USI subtable row for Stover end up on AMP 3? EmpNo 1004 was hashed and sent there

2) Why did NUSI subtable row for Stover end up on AMP 1? NUSI rows are always AMP local

3) What is the Row Hash of Kyle? 40

4) If a NUSI Subtable is AMP Local, then what is the purpose of hashing the first name? Binary Search Sort

5) How many AMPs above would be in a USI query, and how many AMPS above in a NUSI query? 2 (3 all)

A Query Using an USI Only Moves Two Blocks

images

images

Parsing Engine

EmpNo is a Unique Secondary Index. Hash EmpNo 1001. (Row Hash Equals 22)Use the Hash Map to find the AMP that owns the USI Subtable Row for 22. (AMP2)Bring the USI block into FSG Cache AMP 2.Retrieve Row Hash 22 with a binary search. Send me the Row-ID of the Base. (1, 1)

Use the Hash Map find the AMP who owns Row-ID (1, 1).

Bring in the Base Table Block and do a Binary search. Send me the row.

Only one USI block and only one base table block is used on a USI search.

A Query Using A NUSI Always Uses All AMPs

images

images

Parsing Engine

First Name is a Non-Unique Secondary Index. Hash Sushma (Row Hash Equals 50)Every AMP must bring in the NUSI block that contains Sushma (row hash 50).Each AMP must do a binary search for 50.Report back to me if you find a 50.

If you found Sushma (50) in your NUSI subtable then get the real Row-ID of the base and bring the base table block into FSG Cache. Find the row and send it to me.

All AMPs move their NUSI block into FSG cache and do a binary search. For any AMP who finds they have a Sushma, they then move their base block into FSG Cache and they retrieve the row. A NUSI query is much slower than an USI query.

Two Non-Unique Secondary Indexes (NUSI) on a Table

images

Notice that the Base Table Emp_Intl and both NUSI Subtables for First_Name and Dept_No are sorted by Row-ID

Notice that both NUSI subtables have the same Base Table Row-IDs (all 9 Row-IDs).

A NUSI BITMAP Query (1 of 3)

images

SELECT * FROM Emp_Intl

WHERE First_Name = ‘Mo’

AND Dept_No = 100 ;

The SQL above has ANDed two NUSI columns together, so a bitmap could happen.

A NUSI BITMAP Theory (2 of 3)

images

SELECT * FROM Emp_Intl

WHERE First_Name = ‘Mo’

AND Dept_No = 100 ;

All AMPs will place both of their NUSI Subtable blocks in their FSG Cache. Then each AMP will perform a BMSMS, which is a Bit Map Set Manipulation Step.

A BMSMS means each AMP will simultaneously perform a binary search of both NUSI subtables in parallel steps.

Each AMP will binary search their First_Name subtable for ‘Mo’ (Row Hash 41). If a 41 is found the AMP will place the Row-ID of the base in spool.

At the same time each AMP is binary searching their First_Name subtable they will also binary search their Dept_No subtable. That is why it is called Set Manipulation. If Dept_No 100 (Row Hash 13) is found the AMP will place the Row-ID of the base in spool.

Each AMP then checks to see if the Base Row-IDs match.

A BMSMS is the plan when multiple NUSIs are ANDed together.

A NUSI Bitmap in Action (3 of 3)

images

images

Parsing Engine

Calling all AMPs! Transfer your First_Name and Dept_No NUSI subtables into FSG Cache.

Perform parallel steps on both NUSI subtables and do a binary search. (look for row hash 41 for Mo and row hash 13 for Dept_No 100).

If you find either, then place the Row-ID of the base in your spool When you are done, see if you have any Row-ID twice. If you do, then retrieve the row from the base and send it to me.

The PE directs the AMPs to do binary searches on their NUSI subtables and record the Row-ID of the base when a condition is met. If all conditions are met, then the AMP is directed to retrieve the base row. Think of this technique as binary searches in parallel.

A Brilliant Technique for a Unique Secondary Index

images

 CREATE TABLE Order_Table

 (Order_Number               INTEGER

 ,Customer_Number         INTEGER

 ,Order_Date                     DATE

 ,Order_Total                    Decimal (10,2)

 )PRIMARY INDEX(Order_Number)

 PARTITION BY RANGE_N

 (Order_Date BETWEEN date ‘2013-01-01’ AND date ‘2013-12-31’ EACH INTERVAL ‘1' Month) ;

A Partitioned Table (PPI) is sorted on each AMP by the Row-Key.

If the Primary Index is not part of the Partition, it can't be Unique

Create UNIQUE INDEX (Order_Number)on Order_Table

You can actually create a Unique Secondary Index on a Non-Unique Primary Index for a Partitioned table. This enforces uniqueness and points to the real Row-Key.

The USI for Partitioned Tables Points to the Row Key

images

The USI subtable for a partitioned table contains the USI column value (Order_No) and the Row Key of the base table row. Now, if a user queries the Primary Index of Order_No, the USI subtable will point to the exact Partition.

A Brilliant Technique for a Non-Unique Secondary Index

images

 CREATE TABLE Order_Table

 (Order_Number               INTEGER

 ,Customer_Number         INTEGER

 ,Order_Date                     DATE

 ,Order_Total                     Decimal (10,2)

 )PRIMARY INDEX(Order_Number)

 PARTITION BY RANGE_N

 (Order_Date BETWEEN date ‘2013-01-01’ AND date ‘2013-12-31’ EACH INTERVAL ‘1' Month) ;

A Partitioned Table (PPI) is sorted on each AMP by the Row-Key.

If the Primary Index is not part of the Partition it can't be Unique

Create INDEX (Order_Number)on Order_Table

You can actually create a Non-Unique Secondary Index on a Non-Unique Primary Index for a Partitioned table. This is a pointer to the Row Key so the Partition is known.

The NUSI for Partitioned Tables Points to the Row Key

images

The NUSI subtable for a partitioned table contains the NUSI column value (Order_No) and the Row Key of the base table row. Now, if a user queries the Primary Index of Order_No, the NUSI subtable will point to the exact Partition.

The Bigger Quiz

images

The Bigger Quiz Answers

images

Multiple Choice DBA

A large table called Big_Table exists. Two completely different departments query Big_Table regularly. The Sales Department always queries on a Unique column. The Marketing Department always queries on a Non-Unique column.

You are the DBA, and it is up to you to choose the best indexing for Big_Table. Choose the multiple choice answer that will work best so both departments are happy.

A) UPI and NUPI

B) USI and NUSI

C) UPI and NUSI

D) NUPI and USI

You have to pick either A, B, C or D. What do you think will work as the best indexing for Big_Table? Remember that one department always queries on a Unique column, and the other department always queries on a Non-Unique column. Good Luck!

Multiple Choice DBA

A large table called Big_Table exists. Two completely different departments query Big_Table regularly. The Sales Department always queries on a Unique column. The Marketing Department always queries on a Non-Unique column.

You are the DBA, and it is up to you to choose the best indexing for Big_Table. Choose the multiple choice answer that will work best so both departments are happy.

A) UPI and NUPI (Can't have two primary indexes)

B) USI and NUSI ( I need at least one Primary Index)

C) UPI and NUSI (1-AMP and All-AMP)

D) NUPI and USI (1-AMP and 2 AMP)

D is the best answer because the NUPI will provide the department who always queries on a Non-Unique column with a NUPI Single-AMP retrieve. The department who always queries on a Unique column can use the USI, which is a 2-AMP retrieve.

What are the Big Four Tactical Queries?

There are four Tactical Queries. A Tactical Query can be depended on to run in sub second time. Can you name the four Tactical Queries from fastest to slowest?

 

   1)

   2)

   3)

   4)

Tactical queries will run very fast, and they can be depended upon to run in sub second time. Can you list the four fastest queries in order from fastest to slowest, but all must run in under sub second time.

What are the Big Four Tactical Queries?

There are four Tactical Queries. A Tactical Query can be depended on to run in sub second time. Can you name the four Tactical Queries from fastest to slowest?

1) Unique Primary Index (UPI) in the WHERE Clause

2) Non-Unique Primary Index (NUPI) in the WHERE Clause

3) Unique Secondary Index (USI) in the WHERE Clause

4) All Amp Retrieve by way of a Single Partition

Tactical queries will run very fast and they can be depended upon to run in sub second time. The fastest is, of course, example 1 with an UPI in the Where clause. Then, it is the NUPI in the Where clause. Then, it is the USI in the WHERE clause. Finally, it is an All AMP retrieve by way of a Single Partition.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset