Table of Contents

Preface

Part 1: Preparing Data

1

Introduction to CompTIA Data+

Understanding Data+

CompTIA Data+: DA0-001

Data science

Introducing the exam domains

Data Concepts and Environments

Exam format

Who should take the exam?

Summary

2

Data Structures, Types, and Formats

Understanding structured and unstructured data

Structured databases

Unstructured databases

Relational and non-relational databases

Going through a data schema and its types

Star schema

Snowflake schema

Understanding the concept of warehouses and lakes

Data warehouses

Data marts

Data lakes

Updating stored data

Updating a record with an up-to-date value

Changing the number of variables being recorded

Going through data types and file types

Data types

Variable types

File types

Summary

Practice questions and their answers

Questions

Answers

3

Collecting Data

Utilizing public sources of data

Public databases

Open sources

Application programming interfaces and web services

Collecting your own data

Web scraping

Surveying

Observing

Differentiating ETL and ELT

ETL

ELT

Delta load

Understanding OLTP and OLAP

OLTP

OLAP

Optimizing query structure

Filtering and subsets

Indexing and sorting

Parameterization

Temporary tables and subqueries

Execution plan

Summary

Practice questions and their answers

Questions

Answers

4

Cleaning and Processing Data

Managing duplicate and redundant data

Duplicate data

Redundant data

Dealing with missing data

Types of missing data

Deletion

Imputation

Interpolation

Dealing with MNAR

Understanding invalid data, specification mismatch, and data type validation

Invalid data

Specification mismatch

Data type validation

Understanding non-parametric data

Finding outliers

Summary

Practice questions

Questions

Answers

5

Data Wrangling and Manipulation

Merging data

Key variables

Joining

Blending

Concatenation and appending

Calculating derived and reduced variables

Derived variables

Reduction variables

Parsing your data

Recoding variables

Recoding numbers into categories

Recoding categories into numbers

Shaping data with common functions

Working with dates

Conditional operators

Transposing data

System functions

Summary

Practice questions

Questions

Answers

Part 2: Analyzing Data

6

Types of Analytics

Technical requirements

Exploring your data

Common types of EDA

EDA example

Checking on performance

KPIs

Project management

Process analytics

Discovering trends

Finding links

Choosing the correct analysis

Why is choosing an analysis difficult?

Assumptions

Making a list

Finally choosing the analysis type

Summary

Practice questions

Questions

Answers

7

Measures of Central Tendency and Dispersion

Discovering distributions

Normal distribution

Uniform distribution

Poisson distribution

Exponential distribution

Bernoulli distribution

Binomial distribution

Skew and kurtosis

Understanding measures of central tendency

Mean

Median

Mode

When to use which

Calculating ranges and quartiles

Ranges

Quartiles

Interquartile range

Finding variance and standard deviation

Variance

Standard deviation

Summary

Practice questions

Questions

Answers

8

Common Techniques in Descriptive Statistics

Understanding frequencies and percentages

Frequencies

Percentages

Calculating percent change and percent difference

Percent change

Percent difference

Discovering confidence intervals

Understanding z-scores

Summary

Practice questions

Questions

Answers

9

Hypothesis Testing

Understanding hypothesis testing

Why use hypothesis testing

Hypothesis testing process

Differentiating null hypothesis and alternative hypothesis

Null hypothesis (H0)

Alternative hypothesis (Ha)

Null hypothesis versus alternative hypothesis

Learning about p-value and alpha

p-value

Alpha

Alpha and tails

Understanding type I and type II errors

Type I error

Type II error

How type I and type II errors interact with alpha

Writing the right questions

The parts of a good question

Qualities of a good question

What to do about bad questions

Summary

Practice questions

Questions

Answers

10

Introduction to Inferential Statistics

Technical requirements

Understanding t-tests

What you need to know about t-tests

T-test practice

Knowing chi-square

What you need to know about chi-square

Chi-square practice

Calculating correlations

Correlation

Correlation practice

Understanding simple linear regression

What you need to know about simple linear regression

Simple linear regression practice

Summary

Practice questions

Questions

Answers

Part 3: Reporting Data

11

Types of Reports

Distinguishing between static and dynamic reports

Point-in-time reports

Real-time reports

Static versus dynamic reports

Understanding ad hoc and research reports

Ad hoc reports

Research reports

Knowing about self-service reports

Understanding recurring reports

Compliance reports

Risk and regulatory reports

Operational reports (KPI reports)

Knowing important analytical tools

Query tools

Spreadsheet tools

Programming language tools

Visualization tools

Business services

All-purpose tools

Which tools you should learn to use

Summary

Practice questions

Questions

Answers

12

Reporting Process

Understanding the report development process

Creating a plan

Getting the plan approved

Creating the report

Delivering the report

Knowing what to consider when making a report

Business requirements

Dashboard-specific requirements

Understanding report elements

Understanding report delivery

Designing reports

Branding

Fonts, layouts, and chart elements

Color theory

Summary

Practice questions

Questions

Answers

13

Common Visualizations

Understanding infographics and word clouds

Infographics

Word clouds

Comprehending bar charts

Bar charts

Stacked charts

Histograms

Waterfall charts

Charting lines, circles, and dots

Line charts

Pareto charts

Pie charts

Scatter plots

Bubble charts

Understanding heat maps, tree maps, and geographic maps

Heat maps

Tree maps

Geographic maps

Summary

Practice questions

Questions

Answers

14

Data Governance

Understanding data security

Access requirements

Security requirements

Knowing use requirements

Acceptable use policy

Data processing

Data deletion

Data retention

Understanding data classifications

Personally identifiable information

Personal health information

Payment Card Industry

Handling entity relationship requirements

Summary

Practice questions

Questions

Answers

15

Data Quality and Management

Understanding quality control

When to check for quality

Data quality dimensions

Data quality rules and metrics

Validating quality

Cross-validation

Sample/spot check

Reasonable expectations

Data profiling

Data audits

Automated checks

Understanding master data management

When to use MDM

Processes of MDM

Summary

Practice questions

Questions

Answers

Part 4: Mock Exams

16

Practice Exam One

Practice exam one

Congratulations!

Practice exam one answers

17

Practice Exam Two

Practice exam two

Congratulations!

Practice exam two answers

Index

Other Books You May Enjoy
