The clinical database

Now that we've gotten to know the five patients whose information is contained in our database, we can describe the structure and fields of its six mock tables: PATIENT, VISIT, MEDICATIONS, LABS, VITALS, and MORT. Although every clinical database is different, I've tried to use a structure that is commonly seen in healthcare. Typically, tables are organized by clinical domain (for an example of a research study that received tables in such a distributed format, see Basole et al., 2015). For example, there is often one table that contains demographic and personal information, one table for lab results, one for medications, and so on, so that is how we constructed the database in this example. The tables tend to be tied together by a common identifier, which in our case is the Pid field.

As we describe the tables, we must keep the end goal of the data engineering phase in mind: to combine the relevant information from the six tables into a single table whose columns include the target variable (mortality in this case) along with predictor variables that should be useful for predicting it. This will enable us to build a machine learning model with popular packages such as Python's scikit-learn. With this in mind, we will highlight selected fields that will be useful for our assignment.
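
To make that end goal concrete, here is a minimal sketch of flattening several Pid-keyed tables into one row per patient, using Python's built-in sqlite3 module. Only the table names and the Pid field come from the text; every other column name (Age, Lab_name, Lab_value, Mortality_date, Max_glucose, Died) and every row of data is a made-up placeholder, and only three of the six tables are shown to keep the example short.

```python
import sqlite3

# In-memory database. Apart from Pid, all column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PATIENT (Pid INTEGER PRIMARY KEY, Age INTEGER);
CREATE TABLE LABS    (Pid INTEGER, Lab_name TEXT, Lab_value REAL);
CREATE TABLE MORT    (Pid INTEGER, Mortality_date TEXT);
""")

# Invented rows for illustration only; these are not the patients in the text.
conn.executemany("INSERT INTO PATIENT VALUES (?, ?)", [(1, 64), (2, 71)])
conn.executemany(
    "INSERT INTO LABS VALUES (?, ?, ?)",
    [(1, "glucose", 101.0), (1, "glucose", 140.0), (2, "glucose", 95.0)],
)
conn.executemany("INSERT INTO MORT VALUES (?, ?)", [(2, "2016-05-10")])

# Build one row per patient:
#  - aggregate the many-rows-per-patient LABS table down to one value per Pid
#  - LEFT JOIN so that patients with no row in MORT are kept
#  - derive a 0/1 target column from the presence of a MORT row
query = """
SELECT p.Pid,
       p.Age,
       l.Max_glucose,
       CASE WHEN m.Pid IS NULL THEN 0 ELSE 1 END AS Died
FROM PATIENT AS p
LEFT JOIN (
    SELECT Pid, MAX(Lab_value) AS Max_glucose
    FROM LABS
    WHERE Lab_name = 'glucose'
    GROUP BY Pid
) AS l ON l.Pid = p.Pid
LEFT JOIN MORT AS m ON m.Pid = p.Pid
ORDER BY p.Pid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The sketch assumes MORT holds at most one row per patient. Tables with several rows per patient, such as LABS here, are aggregated before joining so that the result keeps exactly one row per Pid, which is the shape that packages like scikit-learn expect.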
