Day of the week

As a sanity check that the data was imported correctly, let's also explore the VDAYR variable, which indicates the day of the week on which the patient visit occurred:

X_train.groupby('VDAYR').size()

The output is as follows:

VDAYR
1    2559
2    2972
3    2791
4    2632
5    2553
6    2569
7    2506
dtype: int64
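To put these counts on a common scale, it can help to convert them to proportions. The following is a minimal sketch that rebuilds the counts shown above as a standalone Series (the names counts and shares are illustrative, not part of the original code) and compares each day's share with the 1/7 we would see under a perfectly even split:

```python
import pandas as pd

# Counts copied from the groupby output above, indexed by VDAYR code.
counts = pd.Series(
    [2559, 2972, 2791, 2632, 2553, 2569, 2506],
    index=pd.Index(range(1, 8), name='VDAYR'),
)

# Share of visits per day; a perfectly even split would be 1/7 (about 0.143).
shares = counts / counts.sum()
print(shares.round(3))
```

Each day accounts for roughly 13.5% to 16.0% of visits. On the real data, the same proportions should be obtainable with X_train['VDAYR'].value_counts(normalize=True).sort_index().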

As we would expect, there are seven possible values, and the observations are spread fairly evenly across them. We could engineer a WEEKEND feature, but each additional feature costs development time and memory, often for minimal gain. We'll leave that as an exercise for the reader.
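For readers who want a starting point for that exercise, here is one possible sketch. It runs on a small stand-in DataFrame (X_demo, a hypothetical name) rather than on X_train, and it assumes that VDAYR is coded with 1 for Sunday and 7 for Saturday. That coding is an assumption and should be confirmed against the dataset documentation before the feature is used:

```python
import pandas as pd

# Hypothetical stand-in for X_train; only the VDAYR column matters here.
X_demo = pd.DataFrame({'VDAYR': [1, 2, 3, 4, 5, 6, 7]})

# ASSUMPTION: 1 = Sunday and 7 = Saturday. Check the data dictionary first.
WEEKEND_CODES = [1, 7]

# astype(int) guards against codes stored as strings; isin() yields a
# boolean mask, which the final astype(int) turns into a 0/1 indicator.
X_demo['WEEKEND'] = (
    X_demo['VDAYR'].astype(int).isin(WEEKEND_CODES).astype(int)
)
print(X_demo)
```

Applying the same expression to X_train (and to the test set) would add the indicator as a new column, at the cost of one extra integer column per dataset.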
