In-built styling options

pandas ships with predefined formatting rules, exposed as methods of df.style, that can be used right away.

The highlight_null method highlights all NaNs or Null values in the data with a specified color. In the DataFrame under discussion, the Age and Cabin columns have NaNs. Hence, in the following screenshot, the NaNs are flagged in blue in these columns.

The following snippet highlights the NaN values in these columns:

df.style.highlight_null(null_color = "blue")

This results in the following output:

Figure 9.2: Highlighting nulls and NaNs with blue

The highlight_max and highlight_min methods apply highlighting (with a chosen color) to the maximum or minimum value across either axis. In the following example, the maximum values in each column have been highlighted:

df.iloc[0:10, :].style.highlight_max(axis = 0)

Please note that only columns with the numeric datatype are subject to highlighting.

The following screenshot highlights the maximum values for each column:

Figure 9.3: Highlighting the maximums across rows (among numerical columns) with yellow

In the preceding code, highlight_max has been used to highlight the maximum values in each column.

Next, we use the same method to find the maximum for each row, changing the value of the axis parameter while doing so:

df.style.highlight_max(axis = 1)

The following screenshot shows the highlighted maximum values across columns:

Highlighting the maximums across columns (among numerical columns) with yellow

Now, let's use the highlight_min method to highlight the minimum values with a custom-defined color. Both highlight_min and highlight_max have the same syntax and accept the same set of parameters.

Highlighting the minimums with green

A background color gradient can be applied to columns to convey high, medium, and low values through color: each cell's background is shaded according to where its value falls within the range of values.

The background gradient of the table can be controlled through the background_gradient() styling function. Any existing colormaps or user-defined colormaps can be used as a gradient. Parameters such as low and high help us use part of the colormap's color range. Further, the axis and subset parameters can be set to vary the gradient along a certain axis and subset of columns:

df.style.background_gradient(cmap='plasma', low=0.25, high=0.5)

This results in the following output:

Creating a background color gradient separately for each numerical column based on its high and low values

Styling can also be applied independently of the values. Let's use set_properties to change the font color, background color, and border color:

df.style.set_properties(**{'background-color': 'teal',
                           'color': 'white',
                           'border-color': 'black'})

This results in the following output:

Changing the background color, font color, and border color for an output DataFrame

Styling options also help us control the numerical precision. Consider the following DataFrame:

DataFrame numbers without precision rounding off

Take a look at the following code, which sets the display precision to 2 decimal places, that is, rounds each displayed number off to 2 decimal places:

rand_df.style.set_precision(2)

This results in the following output:

DataFrame numbers with precision rounding off to 2 decimal places

Now, let's set a caption for the preceding DataFrame:

rand_df.style.set_precision(2).set_caption("Styling Dataframe : Precision Control")

This results in the following output:

DataFrame numbers with precision rounding off to 2 decimal places and a table caption

The set_table_styles method can also be used to modify the table independently of the data. It accepts a list of table styles, each of which is a dictionary with a selector key (a CSS selector) and a props key (the CSS properties to apply). Table styles can be used to define interaction-based styles. For example, the following style gives the row under the mouse pointer a lawngreen background color:

df.style.set_table_styles([{'selector': 'tr:hover', 'props': [('background-color', 'lawngreen')]}])

This results in the following output:

set_table_styles output showing a lawngreen background color for the hovered row

The hide_index and hide_columns styling options allow us to hide either the index or specified columns when they're displayed. In the following code, we have hidden the default index column:

df.style.hide_index()

The following screenshot shows the output DataFrame, without its index:

Hiding the Index column from an output DataFrame

Now, let's use the hide_columns option to hide the "Name", "Sex", "Ticket", and "Cabin" columns:

df.style.hide_columns(["Name", "Sex", "Ticket", "Cabin"])

The following screenshot displays the columns that are shown after hiding a few columns from a DataFrame:

Hiding a number of columns from an output DataFrame