Data overview

The preceding DataFrame holds the customer data of an automobile servicing firm that provides services to its clients on a periodic basis. Each row in the DataFrame corresponds to a unique customer, so this is customer-level data. Here is an observation from the data:

The shape of the DataFrame

We can observe that the data contains 27,002 records (rows) and 26 characteristics (columns).
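As a minimal sketch of how the record and column counts can be read off, the following uses the shape attribute on a small stand-in DataFrame. The column names and values here are hypothetical and are not taken from the actual customer data.

```python
import pandas as pd

# Hypothetical stand-in for the customer data; names and values are invented.
df = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "num_services": [4, 2, 7],
    "total_spend": [320.5, 150.0, 610.25],
})

# shape is an attribute (no parentheses) holding a (rows, columns) tuple.
n_rows, n_cols = df.shape
print(f"{n_rows} records and {n_cols} characteristics")
```

On the real data, the same attribute would yield the 27,002 and 26 reported above.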

Before we start exploratory data analysis on any data, it is advisable to learn as much about the data as possible: the column names and their corresponding data types, whether any columns contain null values (and if so, how many), and so on. The following screenshot shows some of the basic information about the DataFrame, obtained using the info() function in pandas:

Basic information about the DataFrame

Using the info() function, we can see that the data only has float and integer values. Also, none of the columns has null/missing values.
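The checks described here can be sketched as follows. This is an illustration on a small hypothetical DataFrame (the column names are assumptions, not the real ones); it calls info() and then confirms the two observations, numeric-only dtypes and no missing values, programmatically.

```python
import pandas as pd

# Hypothetical stand-in for the customer data; names and values are invented.
df = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "num_services": [4, 2, 7],
    "total_spend": [320.5, 150.0, 610.25],
})

# info() prints column names, non-null counts, and dtypes to stdout.
df.info()

# Count missing values per column; all zeros means no nulls anywhere.
null_counts = df.isnull().sum()
print(null_counts)

# Check that every column holds a numeric (integer or float) dtype.
all_numeric = all(pd.api.types.is_numeric_dtype(t) for t in df.dtypes)
print("All columns numeric:", all_numeric)
```

Checking isnull().sum() alongside info() is useful on wide DataFrames, where the printed summary can be long to scan by eye.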

The describe() function in pandas returns summary statistics for all the numeric columns: the count, mean, standard deviation, minimum and maximum values, and the quartiles (the 25th, 50th, and 75th percentiles by default). The following table shows the description of the data obtained using the describe() function:

Describing the Data
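A minimal sketch of the describe() call, again on a small hypothetical DataFrame whose column names and values are invented for illustration. The result is itself a DataFrame, so individual statistics can be looked up by row label and column name.

```python
import pandas as pd

# Hypothetical stand-in for the customer data; names and values are invented.
df = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "num_services": [4, 2, 7],
    "total_spend": [320.5, 150.0, 610.25],
})

# One column per numeric column in df, one row per summary statistic.
summary = df.describe()
print(summary)

# Look up a single statistic by its row label and column name.
mean_spend = summary.loc["mean", "total_spend"]
print("Mean total spend:", mean_spend)
```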