Subject Index
A
Add analysis services connection manager,
573,
574
Airline identification number (AirlineID),
100
Airline industry software system, functional characteristics,
54
external inquiries (EQ),
54
external interface files (EIF),
54
external outputs (EO),
54
AirportHashKey satellite,
365,
472
null references, link connection with,
453
Analysis server database,
569,
570
Association rule algorithm,
569
Audit transformation editor,
337,
338
B
Big Data,
definition of,
performance issues,
Bill of material (BOM) hierarchy,
129
Bureau of Transportation Statistics (BTS),
343
correct and complete data,
525
data quality tagging,
525
data, standardization of,
525
derived data, transforming of,
524
match and consolidate data,
525
identification process,
96
business column names,
285
data elements, technical numbering of,
285
ontologies and taxonomies,
285
physical table and column names,
285
computed aggregate links,
124
C
Capability maturity model integration (CMMI),
12,
33,
39,
231
Data Vault 2.0 methodology, integrating CMMI in,
41
maturity level 5, advancing to,
41
CData
COALESCE
Comma-separated values (CSV),
61,
324
Compute system values subprocess,
346
ConnectionAssociation mining model,
573
ConnectionAssociation_Structure,
573
Content management systems (CMS),
374
Create, read, update, or delete (CRUD) text,
202
CREATE TABLE DDL statement,
597
existing dimensions, selecting of,
642
Customer relationship management (CRM),
123
Customer relationship system (CRS),
294
D
Data
correct and complete,
530
automated correction,
530
cleansing transformation,
542,
545
lookup transformation,
540,
543
person name suffix domain,
537
set up in knowledge database,
535
multiple data fields,
530
OLE DB source editor,
539
match and consolidation of,
548
data correction software,
550
truncate target before full load,
559
data matching techniques,
549
data mining software,
550
same entity, representing of,
548
same household, representing of,
549
false-negative match,
554
false-positive match,
554
wrong entity, representing of,
548
sources of
content management systems (CMS),
374
semi-structured documents,
374
unstructured documents,
374
Mark business keys deleted,
504
database features, row versions from,
210
read-committed transactions, row versions from,
210
temporary user objects,
210
Data definition language (DDL),
65
Data flow
for loading exploration link,
580
Data mining query task,
573,
574
OLE DB destination connection, setting up of,
578
transformation editor,
575,
576
Data package identifier,
300
Data quality (DQ),
33,
301
business expectations towards,
519
business expectations,
519
data quality expectations,
519
in Data Vault 2.0 architecture,
523
implementation using PIT tables,
616
customer and partner satisfaction,
520
organizational mistrust,
521
Data standardization,
528
domain value redundancy,
528
extraneous punctuation, stripping of,
528
format inconsistencies,
528
business rules application,
23
business rules definition,
22
information mart layer,
27
business intelligence, system of,
11
point-in-time (PIT) tables,
151
hardware considerations for,
218
applications of
business key, consolidation of,
124
business vault entity,
126
entity-relationship (ER) diagram,
125
computed aggregate links,
137
nondescriptive links,
136
communication channels,
37
review and improvement,
33
bridge tables
passenger data, query performance,
158
business entity
point-in-time (PIT) tables
reference tables
reference data, satellite with,
163
conditional split transformation editor,
480
input output selection dialog,
481
merge join transformation editor,
479
OLE DB destination editor,
482
query parameters dialog,
476
source editor for target data,
478
source staging table,
473
target satellite table,
473
standard loading template for,
469,
470
business-centric model for,
90,
284,
286
changing business requirements,
10
database system, physical architecture,
195
divide and conquer problem,
430
enterprise data warehouse environment,
access,
auditing,
big data,
complexity,
compliance,
facts, single version of,
mission criticality,
multiple subject areas,
other business requirements,
11
performance issues,
scalability,
truth, single version of,
environments for development,
197
blue-green deployment,
198,
199
exception-handling rules,
285
functional characteristics of,
59
complete restore limitations,
348
loading process, dependencies in,
347
multiple environment, synchronization of,
348
NoSQL engines, difference in,
348
serial algorithms, dependencies in,
348
high-performance processes for,
431,
history of,
data warehouse systems (DWH),
decision support systems,
information hierarchy,
physical architecture,
203
application or user queries,
209
service level agreements,
203
set logic, applying of,
430
hardware considerations for,
218
stage area, hardware considerations for,
214
stage database setup,
214
source system business definitions,
286
hardware considerations for,
214
table specifications,
285
Data warehouse quality (DWQ),
84
Data warehouse system (DWH) model,
633
Date dimension
completing wizard for,
638
selecting attributes for,
635,
637
Dependency analyzer tool,
291
Derived data, transforming of,
525
business rules, noncompliance with,
525
demographic information,
526
geographic information,
526
multiple sources, conflicting data with,
526
psychographic information,
526
Dimension hierarchies,
183
date dimension, logical model,
185
Dimensions
attributes in wizard,
634
existing table, use of,
631
nontime table generation in data source,
632
select creation method for,
636
time table generation in data source,
631
time table generation on server,
632
conformed dimensions,
179
of relational tables,
172
fully additive measures,
177
nonadditive measures,
177
semi-additive measures,
177
Direct attached storage disk (DASD),
215
Disciplined agile delivery (DAD),
74
E
End-dating satellites,
486
changed records loading template,
494
data flow using hash diffs,
492
information mart loads,
61
staging and data vault loads,
61
Entity-relationship (ER) diagram,
125
Error Mart, implementing of,
335
erroneous data in SQL server integration services,
336
F
FactConnection fact table,
590
foreign key references,
174
required dimensions, selection of,
177
required facts, selection of,
177
File checksum integrity verifier (FCIV),
351
Flat files
connection manager editor,
380
configure columns of,
383
Foreach ADO enumerator,
410,
411
Foreach loop container
collection configuration of,
375,
376
G
Google Drive account,
404,
407
connection manager for,
408
GoogleSheets connection manager,
411,
413
property expression editor for,
421
required property values to,
414
variable mappings to,
412
H
distinct parallel load operations,
359
modified example input to,
369
Hash differences
calculation, improving of,
369
for change detection,
364
hash diffs, maintaining of,
367
to data, applying of,
351
data type, length, precision and scale,
353
delimiters for concatenated fields, lack of,
354
embedded or unseen control characters,
353
leading and trailing spaces,
353
MD5 message-digest algorithm,
350
secure hash algorithm,
350
storage requirements,
358
tool and platform compatibility,
358
Historical data, sourcing of,
399
HubPerson
OLE DB source editor for,
552
business keys, storage of,
123
business key column number,
311
references, addition of,
607
soft-deleting data in,
499
I
connection, creating of,
625
datasource, creating of,
624,
626
impersonation information configuration,
626
business vault as intermediate to,
567
exploration link, building of,
569
hardware considerations for,
223
additional sequence numbers, introduction of,
621
dimensions in cube, reduction of,
620
fixed binary data type, use of,
620
fact tables, loading of,
585,
590
Information technology (IT),
35
International function point user group (IFPUG),
54
Internet information services (IIS),
196
J
Java virtual machine,
354,
355
K
Key performance indicators (KPIs),
30,
33,
160
Kimball data lifecycle,
13
L
Least-significant byte (LSB),
354
LibreOffice download,
351
many-to-many relationships,
103
one-to-many relationships,
104
Link-FixedBaseOp references,
307
Link-FlightNumCarrier,
455
Link overloading process,
450,
458
Link tables, template for loading of,
449
missing dates on sources,
371
mixed arrival latency,
371
mixed time zones for source data,
371
trustworthiness of dates on sources,
371
M
Management instrumentation (MI),
333
Management object format (MOF),
333
Massively parallel processing (MPP),
18,
205
complex data handling,
232
privacy and data protection,
233
regulatory compliance,
233
data, facilitate exchange of,
231
data quality improvement,
231
information, processing and checking of,
231
information requirements reduction,
231
integration management of,
254
managing user permissions,
256
Microsoft Excel add-in for,
252
operational systems and Data Vault, integration of,
265
operational
vs. analytical,
235
business rule parameters,
235
codes and descriptions,
237
date and calendar information,
237
groups and bins definition,
236
hierarchy definitions,
237
technical parameters,
237
subscription view page,
278
subscription views, metadata columns,
281
LastChgVersionNumber,
281
for total quality management,
239
integration management,
253
system administration,
254
user and group permissions,
255
MD5 message-digest algorithm (MD5),
350
general documentation,
283
referential integrity,
283
hard rules, capturing of,
298
hardware considerations for,
227
metadata capturing for staging area,
300
soft rules, capturing of,
311
source system definitions, capturing of,
296
source tables, capturing requirements to,
301,
302
source tables to data vault tables, capturing of,
302
hub entities, loading of,
303
link entities loading, metadata for,
304
satellite entities on hubs,
307
satellite entities on links,
310
SQL server BI metadata toolkit,
288
capturing requirements to,
317
Metrics Mart, implementing of,
333
Microsoft analytics platform system,
352
cloud computing platform,
200
Microsoft data services (MDS),
633
Microsoft SQL server,
349
Missing data, dealing with,
501
full table scan for detecting deleted scenes,
502
last seen date, introducing of,
502
source system not in scope,
501
Multiple active result sets (MARS),
210
N
Network infrastructure metrics,
321
No-history
input output selection for,
466
lookup transformation editor for,
464
OLE DB destination editor of,
467
OLE DB source editor for,
462
query parameter, setting up of,
464
Normalizing source system files,
423,
424
O
Object-relational mapping (ORM),
343
OLE DB
set query parameters dialog,
443
for link table destination,
458
Online analytic processing (OLAP),
195,
623
Online transaction processing (OLTP),
195
OnTimeOnTimePerformance table,
395
Open management infrastructure (OMI),
333
Operational data store (ODS),
13
master data management (MDM),
29
Microsoft master data services (MDS),
29
OriginAirportHashDiff value,
398
P
Process execution metadata,
287
Data Vault 2.0 methodology, software development life-cycle,
67
traditional software development life-cycle,
63
implementation and unit testing,
65
integration and system testing,
66
operation and maintenance,
66
requirements engineering,
64
Project management process (PMP),
33,
69
capability maturity model integration (CMMI),
39
agile requirements gathering,
52
data warehousing, boundaries in,
56
data warehousing, function point analysis for,
56,
58
enterprise data warehouse, function points for,
61
ETL complexity factors, assessment of,
57
function point analysis,
54
function points, measuring with,
54
information architect,
35
Data Vault 2.0 methodology, integrating scrum with,
47
product and sprint backlog,
46
technical business analyst,
34
Project review
total quality management (TQM),
81
R
loading template for,
446
performance affecting factors,
429
Record source
Record tracking satellites,
146
DepartureDelayGroups,
619
Reference data, dealing with,
618
Reference tables, loading of,
505
code and descriptions,
code and descriptions with history,
514
Relational database management system (RDBMS),
19,
196,
284,
429
REPLACENULL function,
387
Representational state transfer (REST),
343
Restricted operation codes (RstOpCode),
166
S
SALPerson target link,
559
creating dimensions from,
560
for customer numbers,
130
de-duplicated dimension formation,
560
passenger business keys, de-duplicate,
125
Sample airline data, sourcing of,
403
desktop application, client ID for,
405
Google Drive, authenticating with,
404
GoogleSheets connection manager,
411
native application, client ID for,
406
setup consent screen,
404,
405
SatAirport satellite,
365
SatDestAirport satellite,
472
default loading template for,
465
with multiple hub references,
362,
363
structure of
addition of new records,
367
after adding a new column,
367
SatOriginAirport satellite,
472,
612
SatPassengerAddress satellite,
400,
401
SatPassengerCRM satellite,
294
SatPassengerPreferredDish CRM satellite,
294
SatPassenger satellite,
308
SatPreferredDish satellite,
491
Scalable data warehouse architectures, dimensions of,
17
analytical complexity,
19
Script Component command,
385
Script transformation editor,
386
Secure hash algorithm (SHA),
350
Self-service business intelligence,
30,
163
direct access to source systems,
30
key performance indicators (KPIs),
30
nonstandardized business rules,
30
unconsolidated raw data,
30
unintegrated raw data,
30
Semantic meaning, changing of,
370
Service level agreement (SLA),
20
Service-oriented architecture (SOA),
21,
85
Setup connection manager,
395,
396
disciplined agile delivery (DAD),
74
process performance triangle,
75
Software development life cycle (SDLC),
33
Software requirements specification (SRS),
64
SortKeyPosition property,
475
Source table
sp_ssis_addlogentry stored procedure,
330
SQL server analysis services (SSAS),
171,
569,
623
aggregation management,
623
user-defined metadata,
623
SQL server business intelligence metadata toolkit,
288,
291
SQL server integration services (SSIS),
287
performance data, capturing of,
323
for SQL server profiler,
323
for Windows event log,
324
dSnapshotDate variable,
601
parameter mapping of,
600
Stage BTS On Time On Time Performance data flow,
380
Stage load hash computation,
349
Stage table, template used for,
345,
346
Staging area
add system-generated attributes,
346
true duplicates, removal of,
346
delete specific partitions,
518
delete specific records,
518
Standard operations code (StdOpCode),
166
Status tracking satellite,
144
Storage area networks (SANs),
209
disaster, recovery from,
209
Symmetric multiprocessor (SMP),
18
System-driven load dates,
372
System.Text.UnicodeEncoding class,
398
T
Temporal dimensions, implementing of,
614
Three-layer architecture,
13
atomic data warehouse,
13
operational data store (ODS),
13
TLinkFlights nonhistorized links,
588
Total cost of ownership (TCO),
596
Total data quality management (TDQM),
84
Total quality management (TQM),
12,
33,
81
computer-integrated design,
81
continuous improvement,
81
data quality dimensions,
83
Data Vault 2.0 methodology, integrating TQM with,
85
data warehouse quality,
84
experiments, design of,
81
participative management,
81
quality function deployment,
81
statistical process control,
81
total data quality management,
84
total productive maintenance,
81
TSatDiagnosticExEvent satellite,
328
TSatFlight satellite,
618
TSatOnPipelinePreComponentCallEvent satellite,
328
Two-layered architecture,
12
Kimball data lifecycle,
13
U
United Nations Organization (UNO),
160
V
Variable dLoadDate settings,
376,
377
Vehicle descriptor section (VDS),
95
Vehicle identification number (VIN),
93
Vehicle identifier section (VIS),
95
Virtualization
simple implementation,
568
leveraging PIT and bridge tables for,
592
agile development process,
595
improved developer productivity,
596
lower total cost of ownership,
596
bridge tables, loading of,
604
additional customization, applying of,
604
hub references, removal of,
607
required aggregations,
604
dimensions, creating of,
601
performance affecting factors,
594
PIT tables, loading of,
596
business logic, implementing of,
596
data joining from multiple satellites,
596
W
Windows management instrumentation (WMI),
331,
332
World manufacturer identifier (WMI),
95