This covers data that exists in the source but was missed during extraction. It arises in engineering tasks such as the following:
- Scraping from a website
- Querying from a database
- Extracting from flat files
Values can go missing during extraction for many reasons, some of which are as follows:
- Regular expressions that return wrong or non-unique results
- An incorrect query
- Data stored under a different data type than expected
- Incomplete download
- Incomplete processing
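
As a minimal sketch of the first cause in the list above, the following Python snippet shows how an overly narrow regular expression can turn values that are present in the source into missing values. The sample rows, the `extract_price` helper, and both patterns are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical scraped rows; a price is present in every one of them.
rows = ["price: 12.50", "price: 7", "Price: 3.20", "price: $4.00"]


def extract_price(text, pattern):
    """Return the price as a float, or None when the pattern does not match."""
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None


# Narrow pattern: lower-case label, no currency sign, decimals required.
narrow = r"price: (\d+\.\d+)"
# More tolerant pattern: case-insensitive label, optional "$", optional decimals.
tolerant = r"(?i)price:\s*\$?(\d+(?:\.\d+)?)"

narrow_result = [extract_price(row, narrow) for row in rows]
tolerant_result = [extract_price(row, tolerant) for row in rows]

# Counting the None entries gives a quick check for extraction-induced gaps.
missing_narrow = sum(value is None for value in narrow_result)
missing_tolerant = sum(value is None for value in tolerant_result)

print(narrow_result, missing_narrow)
print(tolerant_result, missing_tolerant)
```

With the narrow pattern, three of the four prices come back as `None` even though none of them is missing in the source. Comparing the number of extracted values against the number of input rows is one simple way to catch this kind of gap before it is mistaken for genuinely missing data.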