Regular Expressions

In this chapter, we will learn what regular expressions are. A regular expression represents a pattern that can be identified within text data. In the context of data analysis, regular expressions have two important uses:

  • To validate fields to make sure that all values within a particular column adhere to a particular format
  • To search fields based on a particular pattern
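Both uses can be sketched in a few lines of Haskell. The example below assumes the regex-posix package and its `=~` operator from `Text.Regex.Posix`; that library choice is an assumption made for illustration. The `dates` list, the date format, and the helper names `allValid` and `matching` are likewise hypothetical.

```haskell
import Text.Regex.Posix ((=~))

-- Hypothetical column of values, as might be read from a data file.
dates :: [String]
dates = ["2014-03-01", "2014-3-1", "2014-11-30", "unknown"]

-- Assumed format: four digits, a dash, two digits, a dash, two digits.
datePattern :: String
datePattern = "^[0-9]{4}-[0-9]{2}-[0-9]{2}$"

-- Validation: does every value in the column adhere to the format?
allValid :: [String] -> Bool
allValid = all (\v -> v =~ datePattern)

-- Search: keep only the values that match a given pattern.
matching :: String -> [String] -> [String]
matching pat = filter (\v -> v =~ pat)

main :: IO ()
main = do
  print (allValid dates)              -- is the whole column well formed?
  print (matching datePattern dates)  -- the well-formed values
  print (matching "-11-" dates)       -- values containing "-11-"
```

Here the validation check fails for the sample column, because "2014-3-1" and "unknown" do not fit the assumed format, while the search keeps only the values that do match.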

Word processors and text editors have a Find and Replace feature. You supply the text to look for and the desired replacement, and the application replaces every occurrence it finds. Many of these applications now support regular expressions: rather than supplying an exact sequence of characters to find, you supply a pattern that defines which text counts as a match.

Regular expressions are therefore a mini-language, and they aren't limited to Haskell. Once you understand them, you should be able to carry that knowledge into other programming languages. Each language implements the mini-language with slight variations, though, so test your expressions carefully when moving from one language to another.
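To make the difference between an exact search and a pattern search concrete, here is a minimal sketch. It again assumes the regex-posix package (`Text.Regex.Posix`), which is an illustrative choice; the `sample` text and the names `hasLiteral`, `hasNumber`, and `firstNumber` are hypothetical.

```haskell
import Data.List (isInfixOf)
import Text.Regex.Posix ((=~))

-- Hypothetical text to search.
sample :: String
sample = "Order 1042 shipped; order 977 is pending."

-- Exact search: finds only this precise sequence of characters.
hasLiteral :: Bool
hasLiteral = "1042" `isInfixOf` sample

-- Pattern search: "[0-9]+" matches any run of one or more digits.
hasNumber :: Bool
hasNumber = sample =~ "[0-9]+"

-- Asking for a String result returns the text of the first match
-- (or the empty string when nothing matches).
firstNumber :: String
firstNumber = sample =~ "[0-9]+"

main :: IO ()
main = do
  print hasLiteral
  print hasNumber
  putStrLn firstNumber
```

The literal search has to know the order number in advance, whereas the pattern finds a number without knowing which one it is. The same pattern may need small adjustments in another language's regular expression dialect.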

In this chapter, we cover the mini-language of regular expressions and how to apply it, including the following:

  • Dots and pipes
  • Atoms and atom modifiers
  • Character classes
  • Using regular expressions with a CSV file
  • Using regular expressions with an SQLite3 database