Chapter 2 - Teradata Architecture Fundamentals the DBA Must Know

“Choice, not chance, determines destiny.”

– Anonymous

Parallel Architecture

[Figure: Rows of a table spread evenly across the AMPs]

The rows of a Teradata table are spread across the AMPs so that each AMP can process an equal share of the rows when a user queries the table.
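
Teradata decides which AMP owns a row by hashing the table's Primary Index value. If you want to see how evenly a table's rows have landed, a query like the sketch below works on any Teradata system; Order_Table and Order_Number are only example names, so substitute your own table and its Primary Index column.

  -- Count how many rows of the table each AMP owns.
  -- HASHROW computes the row hash of the Primary Index value,
  -- HASHBUCKET maps that hash to a hash bucket, and
  -- HASHAMP maps the bucket to the AMP that owns it.
  SELECT HASHAMP(HASHBUCKET(HASHROW(Order_Number))) AS AMP_No
        ,COUNT(*)                                   AS Row_Count
  FROM   Order_Table
  GROUP BY 1
  ORDER BY 1;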

The Teradata Architecture

[Figure: The Parsing Engine, the BYNET, and the AMPs working in parallel]

The Parsing Engine (PE) takes the User's SQL and builds a Plan for each AMP to follow to retrieve the data. Parallel Processing is all about each AMP doing an equal amount of the work. If the AMPs start at the same time and end at the same time, they are performing true Parallel Processing. All communication is done over the BYNET.
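
You can see the Plan the Parsing Engine builds by putting the EXPLAIN keyword in front of any SQL statement. The statement is not executed; Teradata only returns the step-by-step plan the AMPs would follow. The sketch below reuses the Order_Table example from later in this chapter, and Order_Number is an assumed column name.

  -- Ask the Parsing Engine for its plan without running the query.
  EXPLAIN
  SELECT *
  FROM   Order_Table
  WHERE  Order_Number = 100;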

All Teradata Tables are spread across ALL AMPs

[Figure: Three tables, each with 9 rows spread 3 rows per AMP]

Each table dreams of spreading its rows equally across the AMPs. Above are three tables, each holding 9 rows (3 rows per AMP).
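
The data dictionary keeps one row of space figures per AMP (Vproc) for every table, so you can check how evenly a table is really spread. A minimal sketch, assuming a database called Sales_DB that holds Order_Table:

  -- One row per AMP: how much permanent space of the table each AMP holds.
  SELECT Vproc
        ,CurrentPerm
  FROM   DBC.TableSizeV
  WHERE  DatabaseName = 'Sales_DB'
  AND    TableName    = 'Order_Table'
  ORDER BY Vproc;

If one AMP's CurrentPerm is much larger than the rest, the table is skewed and that AMP becomes the bottleneck for every query against the table.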

Teradata Systems can Add AMPs for Linear Scalability

4-AMP Test System.

 

Order_Table has 16 rows.

[Figure: Order_Table's 16 rows spread across the AMPs]

8-AMP Production System – with the same 16 rows

If you double the size of your system (double the AMPs), the system is twice as fast! System one has only 4 AMPs, but system two has 8 AMPs, so each AMP holds half as many rows and the same query finishes in about half the time. When a customer buys more hardware, they are adding AMPs to the system. Once the hardware is configured, the rows are redistributed so the new AMPs receive their equal share of the data.
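
After an expansion you can confirm how many AMPs the system now has straight from SQL. HASHAMP() with no argument returns the highest AMP number, so adding one gives the AMP count:

  -- HASHAMP() with no argument returns one less than the number of AMPs.
  SELECT HASHAMP() + 1 AS Number_of_AMPs;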

AMPs and Parsing Engines (PEs) live inside SMP Nodes

SMP Node

[Figure: AMPs and PEs inside an SMP node]

AMPs and PEs are called Virtual Processors (vprocs) because each is a software process that lives inside a node's memory. Think of a node as a very powerful personal computer. SMP stands for Symmetric Multi-Processing, which means every CPU performs equally, all CPUs share one pool of memory, and they all run under one operating system. In other words, each node is a self-contained server.

Each Node is Attached via a Network to a Disk Farm

[Figure: A node attached over a network to its disk farm]

Each AMP is assigned its own virtual disk

A Teradata AMP is assigned a virtual disk to store the tables and rows assigned to it, and only that AMP can read or write to its virtual disk. A node holds many AMPs. In the early days, each node held around 8-10 AMPs. With the extra power that CPU advances, 64-bit architecture, and far more memory bring to a node, many nodes today hold 40-50 AMPs. Each AMP is still attached to its own virtual disk. Think of a single node attached by a cable to a single disk farm, where each AMP in the node knows exactly where its virtual disk resides.
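
Because every AMP owns its own virtual disk, the data dictionary also reports space AMP by AMP. The sketch below totals permanent space per Vproc using the standard DBC.DiskSpaceV view:

  -- Total permanent space defined and used on each AMP's virtual disk.
  SELECT Vproc
        ,SUM(MaxPerm)     AS Max_Perm_Bytes
        ,SUM(CurrentPerm) AS Current_Perm_Bytes
  FROM   DBC.DiskSpaceV
  GROUP BY Vproc
  ORDER BY Vproc;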

Two SMP Nodes Connected Become One MPP System

[Figure: Two SMP nodes connected by the BYNETs to form one MPP system]

 

When nodes are connected to the BYNETs, they become part of one large Teradata system. In the picture above there are two nodes. Each node is connected to the BYNETs, so our system now has 8 Parsing Engines and 80 AMPs, even though they are physically separate hardware nodes. When a customer wants to grow their system, they add additional nodes, which in turn add additional Parsing Engines, AMPs, and each AMP's assigned virtual disk. Two SMP nodes connected via the BYNETs are now one Massively Parallel Processing (MPP) system.

There are Many Nodes in a Teradata Cabinet

[Figure: Nodes housed inside a Teradata cabinet]

Teradata has many different configurations, but I want you to understand that nodes are kept in cabinets. Sometimes the disks are within the cabinet, but sometimes they are not. The same goes for the BYNET boards.

This is the Visual You Need to Understand Teradata

[Figure: The full picture: PEs and AMPs inside nodes, nodes connected by the BYNETs, and each AMP attached to its own virtual disk]

 

 


Responsibilities of the DBA

User Management – creation of databases, users, accounts, roles, and profiles (see the SQL sketch after this list)

Space Allocation and Usage – perm, spool, and temporary space

Access of Objects (e.g., tables, views) – access rights, roles, use of views, etc.

Access Control and Security – logon access, logging access, etc.

System Maintenance – specification of system defaults, restarts, data integrity, etc.

System Performance – use of Priority scheduler, job scheduling, etc.

Resource Monitoring – use of ResUsage tables/views, query capture (DBQL), etc.

Data Archives, Restores, and Recovery – ARC facility, Permanent Journals, etc.
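
Several of these responsibilities map directly to SQL the DBA runs every day. The statements below are only a hedged sketch: Sales_DB, Sales_User, Sales_Profile, the password, and the space numbers are made-up examples, so adjust them to your own site standards.

  -- Create a profile so spool and temporary space limits are set in one place.
  CREATE PROFILE Sales_Profile AS
    SPOOL = 2e9
   ,TEMPORARY = 1e9;

  -- Create a user owned by (and taking perm space from) Sales_DB.
  CREATE USER Sales_User FROM Sales_DB AS
    PERMANENT = 1e9
   ,PASSWORD  = TempPass123
   ,PROFILE   = Sales_Profile;

  -- Grant the user read access to the database.
  GRANT SELECT ON Sales_DB TO Sales_User;

  -- Capture the user's queries and SQL text in DBQL for resource monitoring.
  BEGIN QUERY LOGGING WITH SQL ON Sales_User;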

 

