So far, we've discussed using labels, integers, and slicing to select values in DataFrames. Sometimes it is more convenient to select only the rows that meet a condition on the values in one of their columns. For example, we might want to restrict an analysis to people whose age is greater than or equal to 50 years.
pandas DataFrames support Boolean indexing, that is, indexing with a vector of Boolean values that indicates which rows to include, provided that the length of the Boolean vector equals the number of rows in the DataFrame. Because a conditional statement involving a DataFrame column yields exactly such a vector, we can index DataFrames with conditional statements. In the following example, the df3 DataFrame is filtered to include only rows in which the value of the new_col3 column is greater than 10:
df3_filt = df3[df3['new_col3'] > 10]
print(df3_filt)
The output is as follows:
col3  new_col1  new_col2  new_col3  new_col4  new_col5  new_col6
   3         a         0        11         1       1.0        13
   4         b         0        13         1       1.0        14
   5         c         0        15        21       9.5        15
   6         d         0        17        23      10.5        16
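To make the mechanics more concrete, the following minimal sketch separates the two steps: building the Boolean vector and then indexing with it. It uses the age example mentioned earlier; the people DataFrame, its column names, and its values are hypothetical and made up purely for illustration.

```python
import pandas as pd

# Hypothetical data, made up for illustration only
people = pd.DataFrame({
    "name": ["Ana", "Ben", "Cleo", "Dev"],
    "age": [34, 50, 67, 49],
})

# The comparison is evaluated element-wise and returns a Boolean Series
# with one True/False value per row
mask = people["age"] >= 50
print(mask.tolist())  # [False, True, True, False]

# Indexing with the mask keeps only the rows where it is True
older = people[mask]
print(older["name"].tolist())  # ['Ben', 'Cleo']

# .loc accepts the same mask and can select columns at the same time
older_names = people.loc[mask, "name"]
```

Note that the filtered rows keep their original index labels (here, 1 and 2), which is why the index of a filtered DataFrame may no longer start at 0.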
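Conditional statements can be chained together using the element-wise logical operators & (and) and | (or). Each condition must be wrapped in parentheses, because in Python & and | bind more tightly than comparison operators such as > or ==.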
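As a short sketch of chained conditions, the example below combines two conditions with &, with |, and with the ~ operator, which negates a Boolean vector. The people DataFrame and its name, age, and smoker columns are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical data, made up for illustration only
people = pd.DataFrame({
    "name": ["Ana", "Ben", "Cleo", "Dev"],
    "age": [34, 50, 67, 49],
    "smoker": [True, False, True, False],
})

# & keeps rows where both conditions hold; note the parentheses
both = people[(people["age"] >= 50) & (people["smoker"])]
print(both["name"].tolist())  # ['Cleo']

# | keeps rows where at least one condition holds
either = people[(people["age"] >= 50) | (people["smoker"])]
print(either["name"].tolist())  # ['Ana', 'Ben', 'Cleo']

# ~ negates a Boolean vector: rows where neither condition holds
neither = people[~((people["age"] >= 50) | (people["smoker"]))]
print(neither["name"].tolist())  # ['Dev']
```

The Python keywords and, or, and not cannot be used here: they try to reduce a whole Series to a single True or False, which pandas refuses with a ValueError.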