Applying Data Quality Control
This chapter covers Objective 5.2 (Given a scenario, apply data quality control concepts) and includes the following topics:
Circumstances to check for quality
Data quality dimensions
Data quality rules and metrics
Methods to validate quality
For more information on the official CompTIA Data+ exam topics, see the Introduction.
This chapter starts by examining data quality dimensions such as accuracy, coverage, consistency, timeliness, and completeness, along with the circumstances in which to check for quality: data acquisition, data transformation, conversion, data manipulation, and the final product. Next, it turns to automated validation, which depends on data type validation and the number of data points. It then discusses the rules and metrics used to measure data quality. Finally, it looks at methods for validating data quality, including cross-validation, data auditing, data profiling, spot checking, and checking against reasonable expectations.
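As a rough preview of the two automated checks named above (data type validation and number of data points), the following is a minimal Python sketch. The function name `validate_records`, the field names, and the sample records are hypothetical and chosen only for illustration; they do not come from the exam objectives.

```python
def validate_records(records, expected_types, expected_count):
    """Check the number of data points and the type of each expected field."""
    type_errors = []
    for index, record in enumerate(records):
        for field, expected in expected_types.items():
            value = record.get(field)
            # Record the row, the field, and the type actually found.
            if not isinstance(value, expected):
                type_errors.append((index, field, type(value).__name__))
    return {
        "count_ok": len(records) == expected_count,
        "type_errors": type_errors,
    }


# Hypothetical sample data: one ID stored as text and one missing amount.
records = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": "2", "amount": 5.00},
    {"order_id": 3, "amount": None},
]

report = validate_records(
    records,
    expected_types={"order_id": int, "amount": float},
    expected_count=3,
)
print(report)
```

In this sketch the count check passes because three records were expected and three arrived, while the type check flags the text-valued `order_id` and the missing `amount`. A real pipeline would usually run such checks with a data validation tool or database constraints rather than hand-written loops.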
DATA QUALITY DIMENSIONS AND CIRCUMSTANCES TO CHECK FOR QUALITY
A business decision needs to be based on facts. If the facts are not right, the decision is not likely to be right either. This is why the quality of the information behind a decision matters, and data quality is what this chapter is about.
For organizational stakeholders to make the right decisions at the right times, they need good-quality data, and multiple factors determine whether that quality is acceptable. The focus on quality needs to begin when data is collected at the source, whether manually or automatically, and it needs to continue through the transformation, manipulation, and display of the data (as reports or dashboards, for example).