Converting types

If you are at all familiar with programming, you know that data can be stored as different variable types, ranging from integers to floating-point decimals to string (character) types. These types differ in the operations that can be performed on them. For example, if the numbers 3 and 5 are stored as integer types, we can easily calculate 3 + 5 = 8 in code. However, if they are stored as string types, adding "3" to "5" may raise an error or may yield "35", depending on the language, and either outcome would cause all sorts of problems with our data, as you can imagine. Part of cleaning and inspecting the data is making sure every variable is stored as its proper type. Numerical data should be stored as numerical types, and most other data as string or categorical types.
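
To make the difference concrete, here is a minimal Python sketch of the 3 and 5 example. The variable names, the sample column, and the `to_number` helper are hypothetical and exist only for illustration; a real dataset would likely need more careful handling of missing or malformed values.

```python
# Values as they might arrive from a text file: every field is a string.
a_text = "3"
b_text = "5"

# With string types, + concatenates the characters instead of adding.
concatenated = a_text + b_text

# Converting to integers first gives the arithmetic sum.
total = int(a_text) + int(b_text)


def to_number(value):
    """Hypothetical helper: return value as a float, or None if it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


# A hypothetical raw column that mixes clean and unparseable entries.
raw_column = ["3", "5", "n/a", " 7.5 "]
cleaned = [to_number(v) for v in raw_column]

print(concatenated)  # the string "35"
print(total)         # the integer 8
print(cleaned)       # unparseable entries become None
```

Returning None for values that cannot be parsed is only one possible choice; depending on the analysis, you might prefer to raise an error or to flag the row for manual review.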

In addition to the variable type, many modeling languages require a decision about which data container to use: lists, vectors, and dataframes in R, or lists, dictionaries, tuples, and dataframes in Python. Different importing and modeling functions may expect different data structures, so converting between data structures is usually necessary to achieve the desired result, and this is a crucial part of data cleansing. We will cover Python-related data structures in Chapter 5, Computing Foundations – Introduction to Python.
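
As a rough illustration of converting between containers, the sketch below moves the same small dataset between Python's built-in lists, tuples, and dictionaries. The records and variable names are made up for the example, and dataframes are left out so that the code needs no third-party library.

```python
# Hypothetical records stored as a list of tuples (one tuple per row).
rows = [("Ann", 34), ("Bob", 29), ("Cy", 41)]

# List of (key, value) tuples -> dictionary keyed by name.
age_by_name = dict(rows)

# Dictionary -> two parallel lists (a column-oriented layout).
names = list(age_by_name.keys())
ages = list(age_by_name.values())

# Parallel lists -> list of dictionaries (a row-oriented layout).
records = [{"name": n, "age": a} for n, a in zip(names, ages)]

# List -> tuple, for when a function expects an immutable sequence.
ages_fixed = tuple(ages)

print(age_by_name)
print(records)
print(ages_fixed)
```

Which layout is most convenient depends on the function that consumes the data: some expect one record per row, others expect one sequence per column.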
