Filtering rows using Boolean indexing

So far, we've discussed using labels, integers, and slicing to select values in DataFrames. Sometimes, it is convenient to select only the rows that meet a condition on the values in one of their columns. For example, we might want to restrict an analysis to people whose age is greater than or equal to 50 years.

pandas DataFrames support Boolean indexing, that is, indexing with a vector of Boolean values that indicates which rows we wish to include, provided that the length of the Boolean vector is equal to the number of rows in the DataFrame. Because a conditional statement involving a DataFrame column yields exactly such a vector, we can index DataFrames using conditional statements. In the following example, the df3 DataFrame is filtered to include only rows in which the value of the new_col3 column is greater than 10:

# Keep only the rows of df3 where new_col3 is greater than 10
df3_filt = df3[df3['new_col3'] > 10]
print(df3_filt)

The output is as follows:

  col3 new_col1  new_col2  new_col3  new_col4  new_col5  new_col6
3    a                  0        11         1       1.0        13
4    b                  0        13         1       1.0        14
5    c                  0        15        21       9.5        15
6    d                  0        17        23      10.5        16
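
To see why this works, it can help to look at the Boolean vector itself. The following is a minimal sketch on a small, made-up DataFrame; the people DataFrame and its name and age columns are assumptions for illustration only and are not part of the earlier examples:

```python
import pandas as pd

# Hypothetical data, used only for illustration
people = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cho', 'Dev'],
    'age': [34, 50, 67, 49],
})

# The comparison yields a Boolean Series with one value per row
mask = people['age'] >= 50
print(mask)

# Indexing with the mask keeps only the rows where it is True
people_50_plus = people[mask]
print(people_50_plus)
```

Storing the mask in a variable first is optional, but it makes the intermediate step visible and lets the same condition be reused.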

Conditional statements can be chained together using the element-wise logical operators | (or) and & (and). Each condition must be enclosed in its own parentheses.
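
As a rough sketch of such chaining, again on made-up data (the people DataFrame and its name, age, and city columns are hypothetical), two conditions might be combined as follows. Each condition is wrapped in parentheses because & and | bind more tightly than comparison operators in Python:

```python
import pandas as pd

# Hypothetical data, used only for illustration
people = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cho', 'Dev'],
    'age': [34, 50, 67, 49],
    'city': ['Oslo', 'Rome', 'Oslo', 'Rome'],
})

# & keeps the rows where both conditions are True
older_in_oslo = people[(people['age'] >= 50) & (people['city'] == 'Oslo')]
print(older_in_oslo)

# | keeps the rows where at least one condition is True
older_or_oslo = people[(people['age'] >= 50) | (people['city'] == 'Oslo')]
print(older_or_oslo)
```

Note that Python's and and or keywords are not a substitute here: applied to a Series they raise a ValueError, because the truth value of a whole Series is ambiguous.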
