In the first cell, let's import the metadata and print a small preview of it:
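import pandas as pd

# Turn off the pandas chained-assignment warning
pd.set_option('mode.chained_assignment', None)

# Backslashes are doubled so Python does not treat them as escape sequences;
# the trailing separator is required because the file name is appended with +
HOME_PATH = 'C:\\Users\\Vikas\\Desktop\\Bk\\health-it\\ed_predict\\data\\'

df_helper = pd.read_csv(
    HOME_PATH + 'ED_metadata.csv',
    header=0,
    dtype={'width': int, 'column_name': str, 'variable_type': str}
)
print(df_helper.head(n=5))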
You should see the following output:
   width column_name  variable_type
0      2      VMONTH    CATEGORICAL
1      1       VDAYR    CATEGORICAL
2      4     ARRTIME  NONPREDICTIVE
3      4    WAITTIME     CONTINUOUS
4      4         LOV  NONPREDICTIVE
So the ED_metadata.csv file is simply a comma-separated values (CSV) file containing the width, column name, and variable type of each variable, as specified in the documentation. This file can be downloaded from the code repository for this book.
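Appending the file name to the directory with + only works when the directory string ends with a path separator. As a minimal sketch of a more portable alternative, the following cell builds the path with pathlib instead. The temporary directory and the five-row file written here are stand-ins (assumptions for illustration only) for your own data folder and the real ED_metadata.csv; the names home_path, metadata_file, and df_preview are likewise hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for HOME_PATH: a temporary directory (assumption for illustration).
home_path = Path(tempfile.mkdtemp())

# Write a tiny metadata file holding only the five rows shown in the preview.
metadata_file = home_path / 'ED_metadata.csv'
metadata_file.write_text(
    'width,column_name,variable_type\n'
    '2,VMONTH,CATEGORICAL\n'
    '1,VDAYR,CATEGORICAL\n'
    '4,ARRTIME,NONPREDICTIVE\n'
    '4,WAITTIME,CONTINUOUS\n'
    '4,LOV,NONPREDICTIVE\n'
)

# The / operator inserts the correct separator for the operating system,
# so no trailing backslash or manual string concatenation is needed.
df_preview = pd.read_csv(
    metadata_file,
    header=0,
    dtype={'width': int, 'column_name': str, 'variable_type': str}
)
print(df_preview.head(n=5))
```

To use this with the real file, replace the temporary directory with the folder where you saved the book's data and skip the write_text step.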
In the next cell, we convert the columns of the pandas DataFrame we imported into separate lists:
width = df_helper['width'].tolist()
col_names = df_helper['column_name'].tolist()
var_types = df_helper['variable_type'].tolist()
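To give a feel for why plain lists are convenient, here is a minimal sketch of one way they could be put to work. It assumes the widths describe consecutive fields in a fixed-width text file and uses pandas' read_fwf to split each line; this is an illustration, not necessarily the approach taken in the rest of the chapter. The lists are retyped from the five preview rows so the cell runs on its own, and the two data lines, along with the names raw_lines, df_sample, and continuous_cols, are made up purely for demonstration.

```python
import io

import pandas as pd

# The three lists, retyped from the five preview rows shown earlier.
width = [2, 1, 4, 4, 4]
col_names = ['VMONTH', 'VDAYR', 'ARRTIME', 'WAITTIME', 'LOV']
var_types = ['CATEGORICAL', 'CATEGORICAL', 'NONPREDICTIVE',
             'CONTINUOUS', 'NONPREDICTIVE']

# Two synthetic fixed-width records (NOT real data); each is 15 characters,
# matching the sum of the widths above.
raw_lines = io.StringIO(
    '013084500350120\n'
    '127231000050240\n'
)

# Slice every line into fields using the widths, and name the columns.
# dtype=str keeps leading zeros such as '0845' intact.
df_sample = pd.read_fwf(
    raw_lines,
    widths=width,
    names=col_names,
    header=None,
    dtype=str
)
print(df_sample)

# Pair each column name with its variable type to pick out a subset.
continuous_cols = [
    name for name, var_type in zip(col_names, var_types)
    if var_type == 'CONTINUOUS'
]
print(continuous_cols)
```

Because the three lists share the same order, zip keeps each column name aligned with its width and variable type.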