pivot table

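The pandas pivot_table function is more flexible than the pivot function: it aggregates rows that share the same index and column keys, so it can handle duplicate entries, and it offers options for multiple aggregations and for missing values. Let's discuss some of the most useful parameters of the pivot_table function: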

  • data: The DataFrame object that is to be reshaped
  • values: A column or a list of columns that are to be aggregated
  • index: The column or list of columns whose values are grouped to form the rows of the pivot table
  • columns: The column or list of columns whose values are grouped to form the columns of the pivot table
  • aggfunc: The function to use for aggregation, such as np.mean
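To make the parameters concrete, here is a minimal sketch on a small made-up DataFrame. The name toy_sales and all of its values are assumptions for illustration only and are not the sample sales data used in this chapter. It also shows one practical difference from pivot, which cannot reshape data that contains duplicate index/column pairs:

```python
import pandas as pd

# Hypothetical toy data (not the chapter's sales_data); values are invented.
toy_sales = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Technology"],
    "ShipMode": ["First Class", "First Class", "Standard Class"],
    "Sales": [100.0, 200.0, 80.0],
})

# pivot cannot reshape when an index/column pair occurs more than once:
# here, Furniture / First Class appears in two rows.
try:
    toy_sales.pivot(index="Category", columns="ShipMode", values="Sales")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicate rows instead of failing.
table = pd.pivot_table(
    data=toy_sales,      # the DataFrame to reshape
    values="Sales",      # the column to aggregate
    index="Category",    # becomes the row labels
    columns="ShipMode",  # becomes the column labels
    aggfunc="mean",      # aggregation applied to each group
)

print(pivot_failed)
print(table)
```

Combinations with no rows, such as Technology shipped First Class in this toy data, show up as NaN in the result.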

Let's pivot the sample sales data to slice and dice Sales across Category and ShipMode. Note that when aggfunc is not specified, the mean is calculated by default:

pd.pivot_table(sales_data, values = "Sales", index = "Category", columns = "ShipMode")

The following will be the output:

  Pivot table from pandas

The values, index, columns, and aggfunc parameters can each take multiple entries, passed as a list. The aggfunc parameter also accepts a dictionary that maps each value column to its own aggregation function. Let's calculate the mean for Sales and the sum for Quantity:

pd.pivot_table(sales_data, values = ["Sales", "Quantity"], index = "Category", columns = "ShipMode", aggfunc = {"Sales": np.mean, "Quantity": np.sum})

The following will be the output:

 Pivot table with multiple aggregations
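Because the output figure is not reproduced here, the following sketch runs the same kind of call on a small made-up DataFrame. The name toy_sales and its values are assumptions for illustration. The string names "mean" and "sum" are used in place of np.mean and np.sum; both forms are accepted, and the strings avoid the NumPy import:

```python
import pandas as pd

# Hypothetical toy data (not the chapter's sales_data); values are invented.
toy_sales = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Technology", "Technology"],
    "ShipMode": ["First Class", "First Class", "First Class", "Standard Class"],
    "Sales": [100.0, 200.0, 300.0, 80.0],
    "Quantity": [1, 3, 4, 5],
})

# A dictionary for aggfunc applies a different function to each value column.
multi = pd.pivot_table(
    toy_sales,
    values=["Sales", "Quantity"],
    index="Category",
    columns="ShipMode",
    aggfunc={"Sales": "mean", "Quantity": "sum"},
)

print(multi)
# The columns now have two levels: the value column, then the ShipMode.
print(multi.columns.nlevels)
```

A single cell can be selected with a tuple for the two column levels, for example multi.loc["Furniture", ("Sales", "First Class")].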

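Through pivot_table, DataFrames with hierarchical indices can be created, for example by passing a list of columns to index or columns. The fill_value and dropna parameters of the pivot_table function help in handling missing values: fill_value replaces missing values in the resulting table, and dropna controls whether columns whose entries are all NaN are dropped.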
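As a rough illustration of both points, the sketch below builds a two-level row index and fills the empty cells with 0. The toy_sales DataFrame and its SubCategory column are hypothetical and not taken from the sample sales data; dropna is left at its default:

```python
import pandas as pd

# Hypothetical toy data; the SubCategory column and all values are invented.
toy_sales = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Technology"],
    "SubCategory": ["Chairs", "Tables", "Phones"],
    "ShipMode": ["First Class", "Standard Class", "First Class"],
    "Sales": [100.0, 50.0, 300.0],
})

# A list for index yields a hierarchical (two-level) row index.
# fill_value=0 replaces the cells that would otherwise be NaN.
filled = pd.pivot_table(
    toy_sales,
    values="Sales",
    index=["Category", "SubCategory"],
    columns="ShipMode",
    aggfunc="sum",
    fill_value=0,
)

print(filled)
print(filled.index.nlevels)
```

Whether 0 is a sensible fill depends on the aggregation: it reads naturally for a sum, but for a mean it may be clearer to leave the gaps as NaN.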
