To get a sense of how bad the missing value problem is, you may want to find out the following:
- How many cells in a column have a missing value
- Which cells in a column have a missing value
- How many columns have missing values
These tasks can be performed as follows:
- Finding cells that have missing values:
pd.isnull(data['body']) #returns True for each cell that has a missing value, False otherwise
pd.notnull(data['body']) #returns True for each cell that doesn't have a missing value
- Finding the number of missing values in a column:
pd.isnull(data['body']).values.ravel().sum() #returns the total number of missing values
pd.notnull(data['body']).values.ravel().sum() #returns the total number of non-missing values
The third task, finding how many columns have missing values, has been left as an exercise for you.
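As a minimal sketch of the first two tasks, the snippet below builds a small, made-up DataFrame (the column names and values are assumptions for illustration only, standing in for the `data` object used above) and applies `pd.isnull` and `pd.notnull` to its `body` column:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the 'body' column name comes from the text
data = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "body": ["B1", np.nan, None, "B4"],
})

# Boolean Series: True where the 'body' cell is missing (NaN or None)
missing_mask = pd.isnull(data["body"])

# Which cells are missing: the row labels where the mask is True
missing_rows = data[missing_mask].index.tolist()

# How many cells are missing and how many are present
n_missing = int(missing_mask.sum())
n_present = int(pd.notnull(data["body"]).sum())

print(missing_rows)          # [1, 2]
print(n_missing, n_present)  # 2 2
```

Summing a boolean Series counts its True values, so calling `.sum()` directly on the result of `pd.isnull` gives the same count as the longer `.values.ravel().sum()` form for a single column.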