Using the where() method

The where() method ensures that the result of Boolean filtering has the same shape as the original data: entries that fail the condition are replaced with NaN rather than dropped. First, we set the random number generator seed to 100 so that you can reproduce the same values:

    In [379]: np.random.seed(100)
              normvals = pd.Series([np.random.normal() for i in np.arange(10)])
              normvals
    Out[379]: 0   -1.749765
              1    0.342680
              2    1.153036
              3   -0.252436
              4    0.981321
              5    0.514219
              6    0.221180
              7   -1.070043
              8   -0.189496
              9    0.255001
              dtype: float64
    
    In [381]: normvals[normvals > 0]
    Out[381]: 1    0.342680
              2    1.153036
              4    0.981321
              5    0.514219
              6    0.221180
              9    0.255001
              dtype: float64
    
    In [382]: normvals.where(normvals > 0)
    Out[382]: 0         NaN
              1    0.342680
              2    1.153036
              3         NaN
              4    0.981321
              5    0.514219
              6    0.221180
              7         NaN
              8         NaN
              9    0.255001
              dtype: float64
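The difference between the two results can also be checked programmatically. The following is a minimal, self-contained sketch that rebuilds the same Series; the names `filtered` and `shaped` are illustrative and do not come from the original session:

```python
import numpy as np
import pandas as pd

# Rebuild the Series from the session above with the same seed.
np.random.seed(100)
normvals = pd.Series([np.random.normal() for _ in range(10)])

# Boolean indexing drops the rows that fail the condition.
filtered = normvals[normvals > 0]

# where() keeps every row and puts NaN where the condition fails.
shaped = normvals.where(normvals > 0)

print(len(filtered), len(shaped))
# Dropping the NaN entries from the where() result recovers the filtered Series.
print(shaped.dropna().equals(filtered))
```

Because the where() result keeps the original index, it can be aligned or combined with the original Series without any reindexing.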

In this basic form, where() is mainly useful for a Series. For a DataFrame, indexing with a Boolean DataFrame already preserves the shape, so we get the same behavior for free:

    In [393]: np.random.seed(100)
              normDF = pd.DataFrame([[round(np.random.normal(), 3) for i in np.arange(5)] for j in range(3)],
                                    columns=['0', '30', '60', '90', '120'])
              normDF
    Out[393]:        0     30     60     90    120
              0 -1.750  0.343  1.153 -0.252  0.981
              1  0.514  0.221 -1.070 -0.189  0.255
              2 -0.458  0.435 -0.584  0.817  0.673
              3 rows × 5 columns

    In [394]: normDF[normDF > 0]
    Out[394]:        0     30     60     90    120
              0    NaN  0.343  1.153    NaN  0.981
              1  0.514  0.221    NaN    NaN  0.255
              2    NaN  0.435    NaN  0.817  0.673
              3 rows × 5 columns

    In [395]: normDF.where(normDF > 0)
    Out[395]:        0     30     60     90    120
              0    NaN  0.343  1.153    NaN  0.981
              1  0.514  0.221    NaN    NaN  0.255
              2    NaN  0.435    NaN  0.817  0.673
              3 rows × 5 columns
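The equivalence of the two DataFrame forms can be confirmed directly. The sketch below also shows one case where where() still adds something for a DataFrame: its second parameter, `other`, supplies a replacement value in place of NaN. This goes slightly beyond the session above, and the variable names are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the session above with the same seed.
np.random.seed(100)
normDF = pd.DataFrame(
    [[round(np.random.normal(), 3) for _ in range(5)] for _ in range(3)],
    columns=['0', '30', '60', '90', '120'],
)

# Boolean-frame indexing and where() give the same shape-preserving result.
by_indexing = normDF[normDF > 0]
by_where = normDF.where(normDF > 0)
print(by_indexing.equals(by_where))

# With `other`, failing entries are replaced by a chosen value instead of NaN.
clipped = normDF.where(normDF > 0, other=0.0)
print(clipped)
```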

The inverse of where() is mask(), which replaces the entries that satisfy the condition with NaN and keeps the rest:

    In [396]: normDF.mask(normDF > 0)
    Out[396]:        0     30     60     90    120
              0 -1.750    NaN    NaN -0.252    NaN
              1    NaN    NaN -1.070 -0.189    NaN
              2 -0.458    NaN -0.584    NaN    NaN
              3 rows × 5 columns
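One way to read "inverse" here is that masking on a condition gives the same result as calling where() on the negated condition. A minimal sketch under that reading, with illustrative variable names:

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the session above with the same seed.
np.random.seed(100)
normDF = pd.DataFrame(
    [[round(np.random.normal(), 3) for _ in range(5)] for _ in range(3)],
    columns=['0', '30', '60', '90', '120'],
)

# mask() hides the entries where the condition is True ...
masked = normDF.mask(normDF > 0)

# ... which matches where() applied to the negated condition.
inverted = normDF.where(~(normDF > 0))

print(masked.equals(inverted))
```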
