Subject Index

A
Add analysis services connection manager, 573, 574
Add sequence number, 415, 416
Airline identification number (AirlineID), 100
Airline industry software system, functional characteristics, 54
external inputs (EI), 54
external inquiries (EQ), 54
external interface files (EIF), 54
external outputs (EO), 54
AirportCode column, 436
AirportHashKey satellite, 365, 472
Airport hub table, 452
ghost records in, 452
null references, link connection with, 453
Analysis server database, 569, 570
Application programming interface (API), 202, 343, 412, 419
Association rule algorithm, 569
apriori, 569
FP growth, 569
Audit transformation editor, 337, 338
B
Bad data, 26, 231, 521
Big Data, 7
definition of, 7
environments, 348
performance issues, 7
Bill of material (BOM) hierarchy, 129
Bureau of Transportation Statistics (BTS), 343
Business intelligence system, 19, 195, 345, 524
correct and complete data, 525
data quality tagging, 525
data, standardization of, 525
derived data, transforming of, 524
match and consolidate data, 525
Business keys, 95, 307, 450
composite, 95, 311
hashing of, 350
identification process, 96
loading of, 451
NULL, 436
scope of, 97
vs. surrogate keys, 97
Business logic, 45, 137, 199, 279, 335, 452
Business metadata, 284
business column names, 285
data elements, technical numbering of, 285
definitions, 285
ontologies and taxonomies, 285
physical table and column names, 285
Business Vault, 28, 124, 151, 567
computed aggregate links, 124
FlightCount, 137
HubAirport, 137
HubCarrier, 137
HubFlight, 137
LinkFlight, 137
LinkService, 137
SatService, 137
computed satellites, 124
exploration links, 124
C
Capability maturity model integration (CMMI), 12, 33, 39, 231
capability levels, 40
Data Vault 2.0 methodology, integrating CMMI in, 41
maturity level 5, advancing to, 41
maturity levels, 40
CData
components, 403
GoogleSheets source, 412, 415
Change data capture (CDC) systems, 100, 143, 151, 501
for employees, 144
COALESCE
function, 364
statements, 582
CodePlex, 288
Comma-separated values (CSV), 61, 324
Composite keys, 96
bar codes, 96
credit card numbers, 96
email addresses, 96
IMEI number, 96
ISBN codes, 96
ISSN codes, 96
MAC numbers, 96
phone numbers, 96
Computed satellites, 149
data storage, 149
logical design, 150
Compute system values subprocess, 346
hash keys, 347
load date, 346
record source, 346
sequence number, 347
ConnectionAssociation mining model, 573
ConnectionAssociation_Structure, 573
Content management systems (CMS), 374
CONVERT function, 364
Create, read, update, or delete (CRUD) text, 202
CREATE TABLE DDL statement, 597
Cube wizard, 639, 640
accessing of, 646
existing dimensions, selecting of, 642
FactFlight table, 642
logical model of, 644
processing of, 645
select measures for, 641
Customer relationship management (CRM), 1, 123
Customer relationship system (CRS), 294
D
Data
access mode, 453
compression, 156
correct and complete, 530
automated correction, 530
data, rejecting of, 531
DQS example, 532, 533
cleansing transformation, 542, 545
client application, 532
domain management, 533, 534
knowledge discovery, 533
lookup transformation, 540, 543
matching policy, 533
person name suffix domain, 537
person sex domain, 536
server component, 532
set up in knowledge database, 535
SSIS transformation, 532
manual correction, 530
multiple data fields, 530
OLE DB source editor, 539
probability, 530
single data fields, 530
SSIS example, 537
T-SQL example, 531
correction activity, 524
extraction of, 343
match and consolidation of, 548
data correction software, 550
data flow, 559
incremental load, 559
truncate target before full load, 559
data matching techniques, 549
data mining software, 550
same entity, representing of, 548
same household, representing of, 549
SSIS example, 550
false-negative match, 554
false-positive match, 554
true-negative match, 555
true-positive match, 555
wrong entity, representing of, 548
sources of
denormalized, 422
types of, 373
accounting software, 374
cloud databases, 374
content management systems (CMS), 374
CRM systems, 373
ERP systems, 374
file systems, 374
JSON documents, 373
mainframe copybooks, 374
network access, 374
relational tables, 373
remote terminals, 374
semi-structured documents, 374
social networks, 374
spreadsheets, 373
text files, 373
unstructured documents, 374
web sources, 374
XML documents, 373
wizard, 624, 627
Data aging, 503
Mark business keys deleted, 504
T-SQL example, 504
Database options, 210
data compression, 212
filegroups, 212
partitioning, 211
column, 212
function, 211
scheme, 212
TEMPDB, 210
database features, row versions from, 210
internal objects, 210
read-committed transactions, row versions from, 210
temporary user objects, 210
Database workloads, 195
characteristics, 196
consistency, 196
data latency, 196
data types, 196
predictability, 197
response time, 196
updateability, 196
DataCode values, 327
Data definition language (DDL), 65
Data flow
for loading exploration link, 580
name, 313
Data mining query task, 573, 574
OLE DB destination connection, setting up of, 578
transformation editor, 575, 576
Data modelers, 284
Data package identifier, 300
Data quality (DQ), 33, 301
in architecture, 523
business expectations towards, 519
business expectations, 519
data quality expectations, 519
in Data Vault 2.0 architecture, 523
implementation using pit tables, 616
low, 520
business strategy, 521
customer and partner satisfaction, 520
decision-making, 521
financial costs, 520
organizational mistrust, 521
re-engineering, 521
regulatory impacts, 520
tagging, 525
Data quality services (DQS), 197, 213, 532
Data standardization, 528
data mapping, 528
data rearranging, 528
data reordering, 528
domain value redundancy, 528
extraneous punctuation, stripping of, 528
format inconsistencies, 528
T-SQL example, 528, 529
Data stewards, 250
Data Vault, 217
2.0, 283, 309, 312
architecture, 11, 12, 21, 284
business rules application, 23
business rules definition, 22
business vault, 28
data warehouse layer, 26
information mart layer, 27
metrics vault, 27
operational vault, 29
self-service BI, 30
staging area layer, 25
business intelligence, system of, 11
hard rule, 285
definition of, 300
implementation, 12
links, data flow to, 459
Metrics Mart in, 333
model, 294, 348
modeling, 12, 89, 151, 171
application for, 171
bridge tables, 158
business entity, 90
point-in-time (PIT) tables, 151
reference tables, 160
satellites, 139
applications for, 139
backing up, 218
connection manager, 599
database, 219, 455
hardware considerations for, 218
hubs, 98, 302, 304
applications of
business key, consolidation of, 124
business vault entity, 126
data vault model, 126
entity-relationship (ER) diagram, 125
raw Data Vault, 125
business key, 98
example of, 101
hash key, 98
last seen date, 100
load date, 99
passenger, 124
record source, 99
links, 127
applications for, 127
computed aggregate links, 137
exploration links, 139
hierarchical links, 129
link-on-link, 127
nondescriptive links, 136
nonhistorized links, 132
same-as links, 129
methodology, 12, 33
communication channels, 37
alpha release reach, 38
beta release reach, 38
gamma release reach, 38
components of, 34
project planning, 33
review and improvement, 33
modeling, 292, 328
bridge tables
hash keys, 159
information, 159
logical model of, 158
passenger data, query performance, 158
physical design of, 159
structure, 159
vs. pit tables, 160
business entity
hub, 90
link, 90
satellites, 90
point-in-time (PIT) tables
managed PIT window, 156
structure, 153
reference tables
calendar data in, 162
code and descriptions, 164, 166, 168
descriptions, 166
history-based, 163
no-history, 161
reference data, satellite with, 163
satellites, 302, 371, 469
computed, 149
conditional split transformation editor, 480
effectivity, 145
input output selection dialog, 481
merge join transformation editor, 479
multi-active, 141
OLE DB destination editor, 482
overloaded, 139
query parameters dialog, 476
record tracking, 146
source editor for target data, 478
SSIS example, 473
source staging table, 473
target satellite table, 473
standard loading template for, 469, 470
status tracking, 143
T-SQL example, 470
OriginCityMarketID, 470
OriginCityName, 470
OriginState, 470
OriginStateFips, 470
OriginStateName, 470
OriginWac, 470
Data warehouse, 1, 151, 195, 283, 429, 430
architecture, 12
three-layer, 13
two-layer, 12
arrival issues, 430
business-centric model for, 90, 284, 286
costs of, 10
bad planning, 10
changing business requirements, 10
low quality, 10
storage, 10
database system, physical architecture, 195
Data Vault 2.0, 11
divide and conquer problem, 430
enterprise data warehouse environment, 5
access, 5
auditing, 9
big data, 7
complexity, 8
compliance, 9
costs, 10
facts, single version of, 6
mission criticality, 6
multiple subject areas, 5
other business requirements, 11
performance issues, 7
scalability, 6
truth, single version of, 5
environments for development, 197
blue-green deployment, 198, 199
errors in, 520, 524
business rule, 520
data, 520
perception, 520
exception-handling rules, 285
functional characteristics of, 59
dimension load, 59
fact load, 59
hub load, 59
link load, 59
report build, 59
satellite load, 59
stage load, 59
hashing in, 347
complete restore limitations, 348
data distribution, 348
loading process, dependencies in, 347
multiple environment, synchronization of, 348
NoSQL engines, difference in, 348
parent lookups, 348
scalability issues, 348
serial algorithms, dependencies in, 348
high-performance processes for, 431
history of, 2
data warehouse systems (DWH), 4
decision support systems, 3
information hierarchy, 2
latency issues, 430
loading process of, 430
network issues, 431
physical architecture, 203
business processes, 203
data size, 203
hardware components, 203
memory options, 207
network options, 209
access data on SAN, 209
ad-hoc reporting, 209
application or user queries, 209
application queries, 209
for data transfer, 209
processor options, 206
service level agreements, 203
storage options, 207
users, 203
volatility, 203
process execution, 284
record source, 285
security issues, 431
set logic, applying of, 430
setting up, 213
database setup, 219
data vault, 217
data vault back up, 218
error marts, 226
hardware considerations for, 218
information marts, 222
stage area, hardware considerations for, 214
stage database setup, 214
source system business definitions, 286
stage area, 213
database setup, 214
hardware considerations for, 214
systems, 171
hierarchy, 184
table specifications, 285
technical, 284
Data warehouse quality (DWQ), 84
Data warehouse system (DWH) model, 4, 633
Date dimension
completing wizard for, 638
hierarchy defining, 638
selecting attributes for, 635, 637
Delimiters, importance of, 354, 362, 392, 469
Dependency analyzer tool, 291
Dependency executor, 291
Derived data, transforming of, 525
business rules, noncompliance with, 525
demographic information, 526
dummy values, 525
geographic information, 526
multiple sources, conflicting data with, 526
multipurpose fields, 525
multipurpose tables, 525
psychographic information, 526
redundant data, 526
reused keys, 525
smart columns, 526
T-SQL example, 526
DimDate table, 179, 635
Dimension hierarchies, 183
date dimension, logical model, 185
geographic, 184
city, 184
country, 184
postal code, 184
state-province, 184
Dimensions
creating of, 631, 632
attributes in wizard, 634
date dimensions, 633
existing table, use of, 631
nontime table generation in data source, 632
select creation method for, 636
time table generation in data source, 631
time table generation on server, 632
data mart, 154
information mart, 27, 108, 176, 248, 292, 445
Dimensions modeling, 171
dimension design, 180
hierarchies, 183
snowflake, 189
multiple stars, 179
conformed dimensions, 179
of relational tables, 172
star schemas, 172
for airport visits, 173
definition of, 172
dimension tables, 176
fact tables, 174
Dimension tables, 176
fully additive measures, 177
nonadditive measures, 177
passengerkey, 176
passportnumber, 176
semi-additive measures, 177
Direct attached storage disk (DASD), 215
Disciplined agile delivery (DAD), 74
E
End-dating satellites, 486
changed records loading template, 494
data flow using hash diffs, 492
Enterprise data warehouse (EDW), 5, 18, 27, 50, 432, 439
function points for, 61
business vault loads, 61
information mart loads, 61
staging and data vault loads, 61
staging tables, 61
Enterprise service buses (ESBs), 21, 85, 151, 373, 620
Entity-relationship (ER) diagram, 125
Error Mart, implementing of, 335
erroneous data in SQL server integration services, 336
F
FactConnection fact table, 590
FactFlight table, 642
Fact tables, 174
foreign key references, 174
CancelledKey, 174
DivertedKey, 174
TailnumberKey, 174
limiting scope of, 178
measure values, 174
DepartureDelay, 174
SecurityDelay, 174
WeatherDelay, 174
required dimensions, selection of, 177
required facts, selection of, 177
summarization of, 178
File checksum integrity verifier (FCIV), 351
Flat files
connection manager editor, 380
sourcing of, 375
configure columns of, 383
connection manager, 380
control flow, 375, 380
data flow, 383
setting up of, 382
source editor, 381
Foreach ADO enumerator, 410, 411
Foreach loop container
collection configuration of, 375, 376
map variable of, 375, 377
Fuzzy grouping transformation, 552, 554, 556
G
Gap analysis, 521, 522
layers of, 522
Google Drive account, 404, 407
connection manager for, 408
connection result, 409
property values for, 407
GoogleSheets connection manager, 411, 413
property expression editor for, 421
required property values to, 414
variable mappings to, 412
Google Spreadsheets, 406, 416
H
Hadoop platforms, 201, 345
Hard business rule, types, 291, 299, 314
Hash collisions, 356
collision freedom, 356
distinct parallel load operations, 359
hub with hash key, 361
probabilities of, 357
HashDiff attribute, 365, 366
modified example input to, 369
Hash differences
calculation, improving of, 369
column, 364, 365
Hash functions, 364
for change detection, 364
hash diffs, maintaining of, 367
to data, applying of, 351
case sensitivity, 354
character set, 353
concatenated fields, 354
data type, length, precision and scale, 353
date format, 353
decimal separator, 353
delimiters for concatenated fields, lack of, 354
embedded or unseen control characters, 353
endianness, 354
leading and trailing spaces, 353
NULL values, 354
revisited, 350
avalanche effect, 350
collision-free, 350
deterministic, 350
irreversible, 350
MD5 message-digest algorithm, 350
secure hash algorithm, 350
risks of, 355
collisions, 356
maintenance, 359
performance, 360
storage requirements, 358
tool and platform compatibility, 358
standards document, 354, 355
Hash keys, 299, 361, 450
Carrier, 361
CarrierHashKey, 451
ConnectionHashKey, 361
FlightHashKey, 361
FlightNum, 361
FlightNumHashKey, 451
LinkConnection, 361
link with, 362
Historical data, sourcing of, 399
SSIS example for, 401
HubAirport, 92, 127, 307
HubPassenger table, 434
HubPerson
defining of, 551
OLE DB source editor for, 552
Hubs, 123
applications of, 123
business keys, storage of, 123
business key column number, 311
entity, 91
definition of, 93
examples, 100
HubAirline, 100
HubCarrier, 100
HubFlight, 100
HubFlightCode, 100
references, addition of, 607
HubCarrier, 608
HubFlightNumber, 608
HubTailNumber, 608
soft-deleting data in, 499
T-SQL example, 500
structure, 98
I
Information marts, 222, 283, 567
accessing of, 624
connection, creating of, 625
datasource, creating of, 624, 626
impersonation information configuration, 626
business vault as intermediate to, 567
computed satellite, 567
exploration link, building of, 569
database setup, 223
hardware considerations for, 223
hash keys in, 620
additional sequence numbers, introduction of, 621
advantages of, 620
dimensions in cube, reduction of, 620
fixed binary data type, use of, 620
size, reduction in, 621
layer of, 27, 153
error mart, 27
meta mart, 27
materializing of, 579
fact tables, loading of, 585, 590
type 1 dimensions, 580
type 2 dimensions, 582
setting up, 222
Information technology (IT), 35
International function point user group (IFPUG), 54
Internet information services (IIS), 196
IsSorted property, 475
J
Java virtual machine, 354, 355
K
Key performance indicators (KPIs), 30, 33, 160
Kimball data lifecycle, 13
L
Least-significant byte (LSB), 354
LibreOffice download, 351
LinkConnection, 307
Link entity, 91
definition of, 101
examples, 111
flexibility of, 105
granularity of, 106
many-to-many relationships, 103
one-to-many relationships, 104
structure, 110
dependent child key, 111
hash key, 111
unit-of-work, 109
Link-FixedBaseOp references, 307
HubAirport, 307
HubCarrier, 307
Link-FlightNumCarrier, 455
Link overloading process, 450, 458
Link tables, template for loading of, 449
Load date, 370
definition of, 370
missing dates on sources, 371
mixed arrival latency, 371
mixed time zones for source data, 371
timestamp, 454
trustworthiness of dates on sources, 371
variable, 439
Logical data models, 284
Lookup transformation editor, 441, 444, 445, 453, 455, 456
M
Management instrumentation (MI), 333
Management object format (MOF), 333
Massively parallel processing (MPP), 6, 18, 205
Master data management (MDM), 29, 85, 123, 229, 230, 285
definition of, 230
drivers for, 232
complex data handling, 232
consistent, 232
correct format, 232
deduplicated, 233
privacy and data protection, 233
regulatory compliance, 233
safety and security, 233
goals, 231
data, facilitate exchange of, 231
data quality improvement, 231
information, processing and checking of, 231
information requirements reduction, 231
hierarchies, 248
derived, 248
explicit, 249
integration management of, 254
managing user permissions, 256
Microsoft excel add-in for, 252
model creation, 256
business rules, 261
entities, 258
model, importing of, 263
operational systems and Data Vault, integration of, 265
stage tables, 267
subscription views, 278
validation status, 282
operational vs. analytical, 235
business rule parameters, 235
codes and descriptions, 237
date and calendar information, 237
groups and bins definition, 236
hierarchy definitions, 237
technical parameters, 237
for self-service BI, 238
compliance, 238
reusability, 239
security, 238
transparency, 238
subscription view page, 278
derived hierarchy, 278
entity, 278
format, 278
level, 278
model, 278
name, 278
version, 278
subscription views, metadata columns, 281
ChangeTrackingMask, 281
EnterDateTime, 281
EnterUserName, 281
EnterVersionNumber, 281
LastChgDateTime, 281
LastChgUserName, 281
LastChgVersionNumber, 281
MUID, 281
ValidationStatus, 281
VersionFlag, 281
VersionName, 281
VersionNumber, 281
for total quality management, 239
explorer, 250
integration management, 253
master data manager, 249
MDS object model, 241
system administration, 254
user and group permissions, 255
version management, 252
Master data services (MDS), 29, 163, 229, 336, 425, 506
MD5 message-digest algorithm (MD5), 350
MDX query, 646
Message queue (MQ), 151
Metadata, 283
back room, 283
definition of, 283
front room, 283
management of, 283
content, 283
general documentation, 283
indexes, 283
record layout, 283
referential integrity, 283
scheduling, 283
usage, 283
Meta Mart, 226, 285
database model, 288, 289
audit, 288
LookupConnectionID, 288
object attributes, 288, 289
RunScan, 288
version, 288
database setup, 227
hard rules, capturing of, 298
hardware considerations for, 227
metadata capturing for staging area, 300
naming conventions, 292, 293
setting up, 226
soft rules, capturing of, 311
source system definitions, capturing of, 296
source tables, capturing requirements to, 301, 302
source tables to data vault tables, capturing of, 302
hub entities, loading of, 303
link entities loading, metadata for, 304
satellite entities on hubs, 307
satellite entities on links, 310
SQL server BI metadata toolkit, 288
table mappings, 315, 317
capturing requirements to, 317
toolkit, 288
dependency analyzer, 288
dependency executor, 288, 290
dependency viewer, 288
Metrics Mart, implementing of, 333
Metrics Vault, 320
dependency chain, 321
error metrics, 321
frequency metrics, 321
implementing of, 320
error inspection, 320
performance metrics, 320
root cause analysis, 320
performance metrics, 320
timing metrics, 320
volume metrics, 321
Microsoft analytics platform system, 352
Microsoft Azure, 200
cloud computing platform, 200
cloud services, 200
SQL database, 200
Microsoft data services (MDS), 633
Microsoft SQL server, 349
2014, 151, 200
analysis services, 195
Management Studio, 570, 646
reporting services, 623
Missing data, dealing with, 501
ETL loading error, 501
full table scan for detecting deleted scenes, 502
last seen date, introducing of, 502
source filter, 501
source system not in scope, 501
technical error, 501
Multiple active result sets (MARS), 210
N
Network infrastructure metrics, 321
No-history
links, 457, 460
input output selection for, 466
lookup columns for, 466
lookup transformation editor for, 464
OLE DB destination editor of, 467
OLE DB source editor for, 462
query parameter, setting up of, 464
SSIS data flow for, 468
SSIS example, 462
T-SQL example, 460
FlightHashKey, 460
HubAirport, 460
HubFlightNum, 460
HubTailNum, 460
satellites, 496
Normalizing source system files, 423, 424
NoSQL platforms, 345
NULL LoadEndDate, 491
O
Object-relational mapping (ORM), 343
OLE DB
connection manager, 447
destination component, 395, 397, 455, 558
destination editor, 395, 420
source component, 439, 441
set query parameters dialog, 443
SQL statements for, 442
source editor, 440, 441, 454, 456
for link table destination, 458
mapping columns in, 459
Online analytic processing (OLAP), 195, 623
Online transaction processing (OLTP), 195
OnTimeOnTimePerformance table, 395
Open management infrastructure (OMI), 333
Operational data store (ODS), 13
Operational Vault, 29
master data management (MDM), 29
Microsoft master data services (MDS), 29
OriginAirportHashDiff value, 398
OriginHashKey, 398
P
Param direction, 453
Passenger hub data, 152
hashkey, 152
loaddate, 152
number, 152
record source, 152
Process execution metadata, 287
control flow, 287
data flow, 287
package, 287
process, 287
Project execution, 62
Data Vault 2.0 methodology, software development life-cycle, 67
parallel teams, 69
technical numbering, 71
traditional software development life-cycle, 63
design, 65
implementation and unit testing, 65
integration and system testing, 66
operation and maintenance, 66
requirements engineering, 64
Waterfall model, 63
Project management process (PMP), 33, 69
Project planning, 33
business sponsor, 34
capability maturity model integration (CMMI), 39
change manager, 35
data architect, 35
definition, 50
agile requirements gathering, 52
estimation of, 54
data warehousing, boundaries in, 56
data warehousing, function point analysis for, 56, 58
enterprise data warehouse, function points for, 61
ETL complexity factors, assessment of, 57
function point analysis, 54
function points, measuring with, 54
size, 56
ETL developer, 35
information architect, 35
IT manager, 35
management, 42
Data Vault 2.0 methodology, integrating scrum with, 47
iterative approach, 46
product and sprint backlog, 46
scrum, 45
metadata manager, 35
report developer, 35
technical business analyst, 34
Project review
total quality management (TQM), 81
R
Raw Data Vault, 158, 285, 429
entities loading, 432
hubs, 434
SSIS example, 439
T-SQL example, 436
links, 446
loading template for, 446
SSIS example, 453
T-SQL example, 450
load process of, 433
hubs, 287
performance affecting factors, 429
complexity, 429
data size, 429
latency, 429
satellites, 618
ReadOnlyVariables, 401, 402
Record source
attribute, 568
column, 512
purpose of, 372
Record tracking satellites, 146
denormalized data, 147
logical design, 148
normalized data, 149
RefDelayGroup, 619
ArrivalDelayGroups, 619
DepartureDelayGroups, 619
Reference data, dealing with, 618
Reference tables, loading of, 505
code and descriptions
T-SQL example, 513
code and descriptions with history, 514
T-SQL example, 507, 514
history-based, 509
T-SQL example, 509
no-history, 506
T-SQL example, 507
Relational database management system (RDBMS), 2, 19, 196, 284, 429
Relational tables, 283
REPLACENULL function, 387
Representational state transfer (REST), 343
Restricted operation codes (RstOpCode), 166
S
SALPerson target link, 559
Same-as-link (SAL), 123, 549
creating dimensions from, 560
for customer numbers, 130
de-duplicated dimension formation, 560
for passenger, 125
passenger business keys, de-duplicate, 125
ER diagram for, 126
Sample airline data, sourcing of, 403
data flow, 414
desktop application, client ID for, 405
Google Drive, authenticating with, 404
GoogleSheets connection manager, 411
native application, client ID for, 406
setup consent screen, 404, 405
SatAirport satellite, 365
SatDestAirport satellite, 472
Satellites, 154, 465
default loading template for, 465
entity, 91
definition of, 112
driving key, 119
examples, 118
splitting, 114
by rate of change, 115
by source system, 114
structure, 116
extract date, 117
hash difference, 117
load date, 116
load end date, 117
parent hash key, 116
loading process, 465
with multiple hub references, 362, 363
passenger address, 154
for passenger data, 155
preferred dish, 154
structure of
addition of new records, 367
after adding a new column, 367
initial, 367
SatOriginAirport satellite, 472, 612
SatPassengerAddress satellite, 400, 401
SatPassengerCRM satellite, 294
SatPassengerFCRM, 295
SatPassengerMCRM, 295
SatPassengerSCRM, 295
SatPassengerPreferredDishCRM satellite, 294
SatPassenger satellite, 308
COUNTRY, 308
PASSPORT_NO, 308
PAX, 308
SatPreferredDish satellite, 491
Scalable data warehouse architectures, dimensions of, 17
analytical complexity, 19
availability, 20
data complexity, 18
varieties, 18
velocity, 19
veracity, 19
volume, 19
maintenance cycle, 18
query complexity, 19
security, 20
workload, 18
Script Component command, 385
destination, 385
source, 385
transformation, 385
Script task editor, 376, 379, 401, 402
Script transformation editor, 386
Secure hash algorithm (SHA), 350
Self-service business intelligence, 30, 163
direct access to source systems, 30
key performance indicators (KPIs), 30
low data quality, 30
nonstandardized business rules, 30
unconsolidated raw data, 30
unintegrated raw data, 30
Semantic meaning, changing of, 370
Service level agreement (SLA), 20
Service-oriented architecture (SOA), 21, 85
Setup connection manager, 395, 396
Six sigma, 74, 78
breakthrough results, 77
data warehousing, 80
disciplined agile delivery (DAD), 74
DMAIC improvement, 79
framework, 78
functions, 78
design, 78
manufacturing, 78
transactional, 78
process performance triangle, 75
software, 76
Soft business rule, 314, 317
Software development life cycle (SDLC), 33
Software requirements specification (SRS), 64
SortKeyPosition property, 475
Source table
physical name, 300
schema name, 300
sp_ssis_addlogentry stored procedure, 330
SQL command, 453
SQL server analysis services (SSAS), 171, 569, 623
aggregation management, 623
calculations, 623
query performance, 623
security provisions, 623
user-defined metadata, 623
SQL server business intelligence metadata toolkit, 288, 291
SQL server integration services (SSIS), 287
logs, 323, 325, 326
Metrics Vault for, 329
performance data, capturing of, 323
for SQL server, 323
for SQL server profiler, 323
for text files, 324
for Windows event log, 324
for XML files, 323
SQL task editor, 600
dSnapshotDate variable, 601
parameter mapping of, 600
SSIS multiple hash, 364
StageArea database, 439, 453
Stage BTS On Time On Time Performance data flow, 380
Stage load hash computation, 349
Stage table, template used for, 345, 346
Staging area
purpose of, 343
add system-generated attributes, 346
true duplicates, removal of, 346
truncating of, 517
delete specific partitions, 518
delete specific records, 518
truncate table, 518
Standard operations code (StdOpCode), 166
Status tracking satellite, 144
data, 145
logical design, 144
SatEmployeeStatus, 144
Storage area networks (SANs), 209
central management, 209
disaster, recovery from, 209
higher availability, 209
space utilization, 209
Symmetric multiprocessor (SMP), 18
sysssislog table, 327, 328, 335
System-driven load dates, 372
System.Text.UnicodeEncoding class, 398
T
Target tables, 317
DimAirline, 317
SatAirline, 317
schema name, 301
Technical metadata, 286
business rules, 286
data models, 286
data quality, 286
source systems, 286
taxonomies, 286
volumetrics, 286
tempdb database, 553
Temporal dimensions, implementing of, 614
Three-layer architecture, 13
atomic data warehouse, 13
Inmon data warehouse, 14
operational data store (ODS), 13
TLinkEvent, 328
HubComputer, 328
HubEvent-type, 328
HubOperator, 328
HubSource, 328
TLinkFlights nonhistorized links, 588
Total cost of ownership (TCO), 596
Total data quality management (TDQM), 84
cyclic phases, 84
analysis, 84
definition, 84
improvement, 84
measurement, 84
Total quality management (TQM), 12, 33, 81
computer-integrated design, 81
continuous improvement, 81
cost of quality, 81
data quality dimensions, 83
data vault 2.0 methodology, integrating TQM with, 85
data warehouse quality, 84
experiments, design of, 81
information systems, 81
participative management, 81
quality assurance, 81
quality circles, 81
quality function deployment, 81
robust design, 81
statistical process control, 81
Taguchi methods, 81
total data quality management, 84
total productive maintenance, 81
value engineering, 81
Transactional links, 457
TSatDiagnosticExEvent satellite, 328
TSatEvent satellite, 328
TSatFlight satellite, 618
TSatOnPipelinePreComponentCallEvent satellite, 328
T-SQL data types, 298
Two-layered architecture, 12
advantage of, 12
Kimball data lifecycle, 13
U
Unicode strings, 381
UNION operation, 434
United Nations Organization (UNO), 160
UPPER function, 364
V
VARCHAR column, 298
Variable dLoadDate settings, 376, 377
Variable mappings, 375
Vehicle descriptor section (VDS), 95
Vehicle identification number (VIN), 93
Vehicle identifier section (VIS), 95
Virtualization
computed satellites, 568
disadvantages, 568
quick deployment, 568
quick development, 568
simple implementation, 568
leveraging pit and bridge tables for, 592
advantages of, 595
agile development process, 595
ease of change, 595
improved developer productivity, 596
lower total cost of ownership, 596
simplified solution, 595
bridge tables, loading of, 604
additional customization, applying of, 604
hub references, removal of, 607
joins between links, 604
required aggregations, 604
dimensions, creating of, 601
facts, creating of, 608
performance affecting factors, 594
pit tables, loading of, 596
business logic, implementing of, 596
data joining from multiple satellites, 596
W
Windows management instrumentation (WMI), 331, 332
data reader task, 321, 333
World manufacturer identifier (WMI), 95