Manipulating Text Data - An Introduction to Regular Expressions

Previous chapters dealt with data manipulation at a macroscopic level, without much emphasis on the values in each data entry. In other words, the content up to this point has focused on processing datasets as a whole.

In the next two chapters, I will discuss data wrangling at a more microscopic level, placing emphasis on the individual values in the dataset. This chapter is about working with text data. I will introduce regular expressions and discuss their use in recognizing patterns in strings. After a brief introduction to regular expressions, I will demonstrate a specific application in a project that extracts street names from a dataset containing addresses.
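
To give a rough sense of what recognizing a pattern in a string can look like, here is a minimal sketch in Python using the standard `re` module. The sample addresses, the assumed address layout (house number, street name, comma), and the names `STREET_PATTERN` and `extract_street` are illustrative assumptions, not the dataset or code used in the chapter's project.

```python
import re

# Hypothetical address strings; the chapter's actual dataset is not shown here.
addresses = [
    "123 Main St, Springfield",
    "4567 Elm Avenue, Shelbyville",
    "no street here",
]

# Assumed layout: a house number, whitespace, then the street name up to the first comma.
STREET_PATTERN = re.compile(r"^\d+\s+([^,]+),")


def extract_street(address):
    """Return the street name if the address fits the assumed layout, else None."""
    match = STREET_PATTERN.search(address)
    return match.group(1) if match else None


# Apply the pattern to every address; entries that do not match yield None.
streets = [extract_street(a) for a in addresses]
print(streets)
```

Real address data is rarely this regular, so a pattern like this would likely need to be adjusted to the formats that actually appear in a given dataset.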

This chapter will include the following sections:

  • Logistical overview
  • Understanding the need for pattern recognition
  • Introducing regular expressions
  • Looking for patterns
  • Quantifying the existence of patterns
  • Extracting patterns