Data extraction 

This covers data that is available in the source but was lost during extraction. It arises in engineering tasks such as the following:

  • Scraping from a website
  • Querying from a database
  • Extracting from flat files
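
As a minimal sketch of the flat-file case, the following Python snippet reads a small CSV and makes empty fields explicit as None so that gaps introduced or exposed during extraction can be counted. The file content, the column names, and the read_records helper are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical flat-file content; the second record has an empty "age" field.
raw = "name,age\nAda,36\nBob,\nCy,41\n"

def read_records(handle):
    """Read CSV rows, converting empty strings to None so gaps are explicit."""
    reader = csv.DictReader(handle)
    return [{k: (v if v != "" else None) for k, v in row.items()} for row in reader]

records = read_records(io.StringIO(raw))

# Names of the records whose "age" value is missing after extraction.
missing_age = [r["name"] for r in records if r["age"] is None]
print(missing_age)
```

Converting empty strings to None at read time is one possible design choice; it keeps a missing value distinct from a legitimate empty string later in the pipeline.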

There can be many sources of missing values, some of which are as follows:

  • Regular expressions that return wrong or non-unique results
  • An incorrect query
  • Data stored as a different type than expected
  • Incomplete download
  • Incomplete processing
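
The regular-expression cause in the list above can be illustrated with a short Python sketch. The sample rows, the PRICE pattern, and the extract_price helper are assumptions made up for this example. The idea is that a pattern which matches nothing, or matches more than once, leaves the extracted value missing rather than guessing.

```python
import re

# Hypothetical scraped snippets: the second contains two prices, the third none.
rows = [
    "Widget A - price: $12.50",
    "Widget B - price: $8.00 (was $10.00)",
    "Widget C - price on request",
]

# Assumed pattern: a dollar sign followed by digits and two decimals.
PRICE = re.compile(r"\$(\d+\.\d{2})")

def extract_price(text):
    """Return the single price in text, or None if it is absent or ambiguous."""
    matches = PRICE.findall(text)
    if len(matches) != 1:
        return None  # no match, or a non-unique match
    return float(matches[0])

prices = [extract_price(r) for r in rows]

# Indexes of rows where extraction produced a missing value.
missing = [i for i, p in enumerate(prices) if p is None]
print(prices, missing)
```

Treating an ambiguous match as missing is a deliberate choice in this sketch; silently taking the first match would hide the problem instead of surfacing it.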
