Table of Contents for Teradata Aster Data
by John Nolan, Tom Coffing
Cover
The Tera-Tom Genius Series
Tera-Tom – Author of over 50 Books
The Best Query Tool Works on all Systems
Trademarks and Copyrights
About Tom Coffing
About John Nolan
Contents
Chapter 1 – The Aster Data Architecture
What is Parallel Processing?
Aster Data is a Parallel Processing System
Each vworker holds a Portion of Every Table
The Rows of a Table are Spread Across All vworkers
The Aster Data Architecture
The Queen Node
The Worker Node
The Loader Node
The Backup Node
The Aster Architecture Interconnect
Backup and Loader Nodes Do Not use the Interconnect
The Aster Architecture has Spare Nodes
The Aster Architecture Allows Flexibility based on Need
Aster Data Provides Four Fundamental Hardware Strengths
Replication Failover
Data is Compressed on Data Transfers
Aster Utilizes Dual Optimizers
Aster Allows a Hybrid of SQL and MapReduce
MapReduce History
What is MapReduce?
What is SQL-MR?
Sessionize – An Example of SQL-MR
Support for Mixed Workload Management and Prioritization
Chapter 2 – Fact and Dimension Tables
Aster Tables are defined as Fact or Dimension when Created
Fact Table
A More Detailed Look at the Fact Table Distribution
Dimension Tables are Replicated
A Dimension Table is often Replicated across vworkers
Aster Data has Fact and Dimension Tables
Aster Tables are defined as Fact or Dimension when Created
Fact and Dimension Tables can be Hashed by the same Key
Distribution Key Rules
Aster Data Uses a Hash Formula
The Hash Map Determines which vworker will own the Row
The Hash Formula, Hash Map and vworker
Placing rows on the vworker
Placing rows on the vworker Continued
A Review of the Hashing Process
Like Data Hashes to the Same vworker
Distribution Key Data Types
Run ANALYZE to COLLECT STATISTICS on a Table
Some Examples of ANALYZE
What Columns to Analyze
Chapter 3 – How Aster Processes Data
When a Table is Created, a Table Header is Created
Every vworker has the Exact Same Tables
All Aster Tables are spread across All vworkers
The Table Header and the Data Rows are Stored Separately
A vworker Stores the Rows of a Table inside a Data Block
To Read Rows, a vworker Moves the Data Block into Memory
A Full Table Scan Means All vworkers must Read All Rows
The “Achilles Heel”, or Slowest Process, is Block Transfer
Each Table has a Distribution Key
A Query Using the Distribution Key uses a Single vworker
As Rows are Added, a Data Block will Eventually Split
A Full Table Scan Means All vworkers Read All Blocks
Distribution Key Query uses One vworker
Each vworker Can Have Many Blocks for a Single Table
A Full Table Scan Means All vworkers Read All Blocks
Quiz – How Many Blocks Move into vworker Memory?
Answer – How Many Blocks Move into vworker Memory?
Quiz – How Many Blocks Move Using the Distribution Key?
Answer – How Many Blocks Move Using the Distribution Key?
Chapter 4 – Four Options for Aster Data Table Design
There are Four Options to Aster Table Design
Straight up Distribute by Hash
Straight up Distribute by Hash - Problems
Straight up Distribute by Replication
Partition the Table with Logical Partitioning
This Partitioned Table Sorts Rows by Month of Order_Date
An All vworkers Retrieve By Way of a Single Partition
You can Partition a Table by Range or by List
A Partitioned By List Example with Three Tactical Queries
Aster Data Multi-Level Partitioning
Aster Allows for Multi-Level Partitioning
SQL Commands for Logical Partitioning as One Table
What Partitions are on my Table?
What does a Columnar Table look like?
A Comparison of Data for Normal Vs. Columnar
A Columnar Table is best for Queries with Few Columns
Quiz – How Many Blocks Move to vworker Memory?
Answer – How Many Containers Move to vworker Memory?
When to use a Columnar Table
Chapter 5 – How Joins Work Inside the Aster Engine
Aster Join Quiz
Aster Join Quiz Answer
The Joining of Two Tables
Aster Moves Joining Rows to the Same vworker
Because of the Join Rule – Dimension Tables are Replicated
The Two Different Philosophies for Table Join Design
What Could You Do If Two Tables Joined 1000 Times a Day?
Fact and Dimension Tables can be Hashed by the same Key
Joining Two Tables with the same PK/FK Distribution Key
A Join With Co-Location
A Performance Tuning Technique for Large Joins
The Joining of Two Tables with an Additional WHERE Clause
Aster Performs Joins Using Three Different Methods
The Hash Join
The Merge Join
Nested Loop Joins
Chapter 6 – Temporary and Analytic Tables
Aster has Three Types of Data
Create a Permanent Table Using Create Table AS (CTAS)
Create a Logically Partitioned Table and Populate It
Create a Temporary Table Using Create Table AS (CTAS)
A Temporary Table in Action
A Temporary Table That Uses an Insert/Select
Create an Analytic Table Using an Insert/Select
Create an Analytic Table Using CREATE TABLE AS (CTAS)
Operations that Invalidate an Analytic Table
If an Analytic Table is Invalid
Tera-Tom History
Chapter 7 – Aster Modeling Rules
Modeling Rules for Aster Data
Three Principles that Govern the Modeling Rules
Modeling Rule 1 – Dimensionalize your Model
A Dimensional Model is called a "Star Schema"
To Read a Data Block, a vworker Moves the Block to Memory
A Dimensional Model Moves Less Mass into Memory
Which Move From Disk to Memory Would You Choose?
vworkers Transfer their Fact Table into Memory in Parallel
Modeling Rule 2 – Use Columnar
Which Move From Disk to Memory Would You Choose?
Let's Discuss Modeling and Joins at the Simplest Level
Let's Discuss Modeling and Joins at the Simplest Level
Let's Discuss Joins at the Simplest Level
Modeling Rule 3 – Distribute your Tables Based on Joins
The Two Different Philosophies for Table Join Design
Facts are Hashed and most often the Dimension is Replicated
Fact and Dimension Tables can be Hashed by the same Key
Joining Two Tables with the same PK/FK Primary Index
A Join With No Redistribution or Duplication
Aster Hates Joining Tables with a Different Distribution Key
Aster Hates to Redistribute by Hash to Join Tables
Modeling Rule 4 – Replicate Dimension Tables
Modeling Rule 5 – Partition Your Tables
Modeling Rule 6 – Make Fact Tables Skinny
Modeling Rule 6 – Make Fact Tables Skinny Example
Modeling Rule 7 – Index Your Tables
The B-Tree Index
Which Columns Might You Create an Index?
Answer - Which Columns Might You Create an Index?
Modeling Rule 8 – Denormalize based on Your Environment
Modeling Rule 8 – Denormalize based on Your Environment
Chapter 8 – Tera-Tom's Top Tips
Tera-Tom's Top Tips
Tera-Tom's Top Tips #2
Tera-Tom's Top Tips #3
Tera-Tom's Top Tips #3 Rewritten
Tera-Tom's Top Tips #4
When the GROUP BY Column is NOT the Distribution Key
Example of GROUP BY Column is NOT the Distribution Key
Tera-Tom's Top Tips #5
Tera-Tom's Top Tips #6 – Use EXPLAIN
Query Plan and Estimates
Explain Plan Showing a Hash Join
Explain Plan Showing a Merge Join
Explain Plan Showing a Nested Loop Join
Chapter 9 – Indexes
There are Only Three Types of Scans
Guidelines for Indexes
An Index Syntax Example
The B-Tree Index
Which Columns Might You Create an Index?
Answer - Which Columns Might You Create an Index?
A Visual of an Index (Conceptually)
A Query Using an Index Uses All vworkers
Multicolumn Indexes
A NUSI BITMAP Theory
A NUSI Bitmap in Action
Indexes on Expressions
Indexes on Extracts of Dates
GiST Indexes
Five Operational Tips for Efficient Indexing
REINDEX
createCompressedIndexOnCompressedTableByDefault Flag
Chapter 10 – Aster Window Functions
Cumulative Sum
Cumulative Sum - Major and Minor Sort Key(s)
The ANSI CSUM – Getting a Sequential Number
The ANSI OLAP – Reset with a PARTITION BY Statement
PARTITION BY only Resets a Single OLAP not ALL of them
ANSI Moving Sum is Current Row and Preceding n Rows
How ANSI Moving SUM Handles the Sort
Quiz – How is that Total Calculated?
Answer to Quiz – How is that Total Calculated?
Moving SUM every 3-rows vs. a Continuous Sum
Moving Average
Quiz – How is that Total Calculated?
Answer to Quiz – How is that Total Calculated?
Quiz – How is that 4th Row Calculated?
Answer to Quiz – How is that 4th Row Calculated?
Partition By Resets an ANSI OLAP
Moving Average Using BETWEEN
Moving Difference using ANSI Syntax
Moving Difference using ANSI Syntax with Partition By
RANK Defaults to Ascending Order
Getting RANK to Sort in DESC Order
You can use Window Functions in Expressions
RANK() OVER and PARTITION BY
DENSE_RANK() OVER
PERCENT_RANK() OVER
PERCENT_RANK() OVER with 14 rows in Calculation
PERCENT_RANK() OVER with 21 rows in Calculation
RANK With ORDER BY SUM()
COUNT OVER for a Sequential Number
Quiz – What caused the COUNT OVER to Reset?
Answer to Quiz – What caused the COUNT OVER to Reset?
The MAX OVER Command
MAX OVER with PARTITION BY Reset
The MIN OVER Command
Quiz – Fill in the Blank
Answer to Quiz – Fill in the Blank
The Row_Number Command
Quiz – How did the Row_Number Reset?
Answer to Quiz – How did the Row_Number Reset?
NTILE
NTILE Using a Value of 10
NTILE With a Partition
CUME_DIST
CUME_DIST With a Partition
LEAD
LEAD With Partitioning
LAG
LAG with Partitioning
FIRST_VALUE
FIRST_VALUE After Sorting by the Highest Value
FIRST_VALUE with Partitioning
LAST_VALUE
NTH_VALUE
NTH_VALUE With Partition
SUM(SUM(n))
Chapter 11 – SQL-MapReduce
MapReduce History
What is MapReduce?
What is SQL-MapReduce?
SQL-MapReduce Input
SQL-MapReduce Output
Subtle SQL-MapReduce Processing
Aster Data Provides an Analytic Foundation
Path Analysis
Text Analysis
Statistical Analysis
Segmentation (Data Mining)
Graph Analysis
Transformation of Data
Sessionize
Tokenize
SQL-MapReduce Function . . . nPath
nPath SELECT Clause
nPath ON Clause
nPath PARTITION BY Expression
nPath DIMENSION Expression
nPath ORDER BY Expression
nPath MODE Clause has Overlapping or NonOverlapping
nPath PATTERN Clause
Pattern Operators
Pattern Operators Order of Precedence
Matching Patterns Which Repeat
nPath SYMBOLS Clause
nPath RESULTS Clause
Adding an Aggregate to nPath Results
Adding an Aggregate to nPath Results (Continued)
SQL-MapReduce Examples - Use Regular SQL
SQL-MapReduce Examples - Create Objects
SQL-MapReduce Examples - Subquery
SQL-MapReduce Examples - Query as Input
SQL-MapReduce Examples - Nesting Functions
SQL-MapReduce Examples - Functions in Derived Tables
SQL-MapReduce Examples - SMAVG
SQL-MapReduce Examples - Pack Function
SQL-MapReduce Examples - Pack Function (Continued)
SQL-MapReduce Examples - Pivot Columns