Chapter 8: Clinical Trial Case Study
8.1 Scenario, Learning Objectives, and Introductory Activities
8.2 Reading and Summarizing Visit and Lab Data
8.3 Improving Reading of Data; Creating Charts
8.4 Working with Data Stacked Across Visits (and Sites)
8.5 Assembling and Summarizing Data—Sites 1, 2, and 3
8.6 Data Restructuring and Report Writing
8.7 Advanced Data Reading and Report Writing—Connecting to Spreadsheets and Databases
8.1 Scenario, Learning Objectives, and Introductory Activities
This chapter is a series of activities built around data simulated to create a mock clinical trial, with all data in the Clinical Trial Case Study folder available in the download accompanying the text book (along with some additional information). The trial is a study of safety and efficacy of the test drug in a four-arm crossover design. The test compound is compared to a standard, reference treatment of ibuprofen, both at a 30mg daily dose. Two alternate doses are available: 15mg, twice per day; and 10mg, three times per day (coded as 1, 2, and 3, respectively).
At a recruitment visit to any one of the five sites enrolling patients for the study, demographic information is collected along with informed consent. Those patients continuing on the study are referred for a first clinical visit. The first clinical visit establishes baseline vital signs, lab results, and pain measurements, and provides a final check of screening criteria. Exclusion criteria include: age, under 21 or over 70; pregnant females; high blood pressure at outset or during treatment phase (systolic > 140 or diastolic > 90); unwillingness to sign informed consent; and baseline pain level below threshold (0 on the pain scale).
Patients continuing are randomized to one of four study arms: TRTR, RTRT, RTTR, and TRRT (T-Test, R-Reference). Subsequent visits at 3, 6, 9 and 12 months include vital signs, lab results, and pain measurements. Except for the final visit, each visit also serves to check inclusion criteria are still satisfied and serves as the transition point among treatment elements in the assigned arms (for subjects continuing on the study).
The activities for each section of this chapter are programming tasks that not only work to analyze the clinical trial data, but directly relate to the corresponding book chapter. Therefore, the activities in Section 8.1 are tied to concepts in Chapter 1, activities in Section 8.2 relate to Chapter 2, and so forth. The learning objectives for each section are thus the learning objectives from the corresponding chapter, with the activities given here designed to reinforce those objectives.
Data from different sites comes in different forms, with sites 1, 2, and 3 being used throughout the chapter. Data from sites 4 and 5 is not used until Section 8.7, as site 4 has its data stored in a Microsoft Excel spreadsheet and site 5 has its data stored in a Microsoft Access database—working directly with these files is discussed at the end of Chapter 7.
To start, Program 8.1.1 accesses the metadata for the demographic data for site 1. In order to run this code successfully, the Clinical library must be assigned to the folder where the data for this case study is stored.
Program 8.1.1: Generating Metadata on Demographics Data for Site 1
libname Clinical “—place correct path to data folder here—”;
ods trace on;
proc contents data=clinical.demographics_site1;
run;
In order for the code to execute, the Clinical library reference must be correctly assigned. Place a correct path reference between the quotations and submit the code. Remember, the path reference can be relative, built from the working directory, or absolute (build from a drive name or letter).
When the code executes successfully, the ODS TRACE ON statement will put the names of all output tables generated into the SAS log. Having this result is essential for successful execution of Program 8.1.2.
Output 8.1.1: Metadata on Demographics Data for Site 1
CLINICAL.DEMOGRAPHICS_SITE1 |
Observations |
136 |
|
Member Type |
DATA |
Variables |
11 |
Engine |
V9 |
Indexes |
0 |
Created |
Local Information Differs |
Observation Length |
112 |
Last Modified |
Local Information Differs |
Deleted Observations |
0 |
Protection |
Compressed |
NO |
|
Data Set Type |
Sorted |
NO |
|
Label |
|||
Data Representation |
WINDOWS_64 |
||
Encoding |
wlatin1 Western (Windows) |
Engine/Host Dependent Information |
|
Data Set Page Size |
65536 |
Number of Data Set Pages |
1 |
First Data Page |
1 |
Max Obs per Page |
584 |
Obs in First Data Page |
136 |
Number of Data Set Repairs |
0 |
ExtendObsCounter |
YES |
Filename |
Local Information Differs |
Release Created |
9.0401M4 |
Host Created |
X64_7PRO |
Owner Name |
Local Information Differs |
File Size |
128KB |
File Size (bytes) |
131072 |
Alphabetic List of Variables and Attributes |
||||
# |
Variable |
Type |
Len |
Format |
6 |
dob |
Num |
8 |
MMDDYY10. |
5 |
dov |
Num |
8 |
MMDDYY10. |
9 |
ethnicity |
Num |
8 |
|
10 |
ic |
Char |
1 |
|
4 |
notif_date |
Num |
8 |
MMDDYY10. |
8 |
race |
Num |
8 |
|
3 |
screen |
Num |
8 |
|
7 |
sex |
Char |
1 |
|
2 |
sf_reas |
Char |
50 |
|
11 |
site_loc |
Char |
3 |
|
1 |
subject |
Num |
8 |
Based on the information in the SAS log generated by ODS TRACE ON, fill in the ODS SELECT statement in Program 8.1.2 to limit the output to the variable information table shown in Output 8.1.2.
Program 8.1.2: Limiting the Metadata to the Variable List
proc contents data=clinical.demographics_site1;
ods select /**--place table name here--**/;
run;
Output 8.1.2: Metadata Limited to the Variable List
Alphabetic List of Variables and Attributes |
||||
# |
Variable |
Type |
Len |
Format |
6 |
dob |
Num |
8 |
MMDDYY10. |
5 |
dov |
Num |
8 |
MMDDYY10. |
9 |
ethnicity |
Num |
8 |
|
10 |
ic |
Char |
1 |
|
4 |
notif_date |
Num |
8 |
MMDDYY10. |
8 |
race |
Num |
8 |
|
3 |
screen |
Num |
8 |
|
7 |
sex |
Char |
1 |
|
2 |
sf_reas |
Char |
50 |
|
11 |
site_loc |
Char |
3 |
|
1 |
subject |
Num |
8 |
Program 8.1.3 includes the VARNUM option in the PROC CONTENTS statement, which alters the variable information table, both in the output generated and its name. Determine the new name of the variable information table and insert it into the ODS SELECT statement in Program 8.1.3.
Program 8.1.3: Limiting the Metadata to a Variable List in Column Order
proc contents data=clinical.demographics_site1 varnum;
ods select /**--place table name here--**/;
run;
Output 8.1.3: Metadata Limited to a Variable List in Column Order
Variables in Creation Order |
||||
# |
Variable |
Type |
Len |
Format |
1 |
subject |
Num |
8 |
|
2 |
sf_reas |
Char |
50 |
|
3 |
screen |
Num |
8 |
|
4 |
notif_date |
Num |
8 |
MMDDYY10. |
5 |
dov |
Num |
8 |
DDMMYY10. |
6 |
dob |
Num |
8 |
DDMMYY10. |
7 |
sex |
Char |
1 |
|
8 |
race |
Num |
8 |
|
9 |
ethnicity |
Num |
8 |
|
10 |
ic |
Char |
1 |
|
11 |
site_loc |
Char |
3 |
As an exercise, rewrite either Program 8.1.2 or Program 8.1.3 to verify that the set of variables in the demographic data for sites 2 and 3 is the same as the set shown for site 1.
The PRINT procedure is useful for viewing the data records contained in the file; Program 8.1.4 uses it to display the data on a list of variables selected with the VAR statement.
Program 8.1.4: Displaying Demographics Data from Site 1
proc print data=clinical.demographics_site1(obs=6);
var subject dob dov sex;
run;
The OBS= is a data set option that limits the processing to a maximum number of observations or rows, and is commonly used in this book to shorten output tables. (Not all shortened output tables show this option in the code.) Removing the option (including the parentheses) results in the full data set being printed.
The names listed in the VAR statement are a subset of those shown in Output 8.1.2 and Output 8.1.3, and these variables are displayed in the order provided.
Output 8.1.4: Displaying Demographics Data from Site 1 (Partial Listing)
Obs |
Subject |
Date of Birth |
Date of Visit |
F: Female, M: Male |
1 |
1 |
12/02/1954 |
01/15/2018 |
M |
2 |
2 |
01/13/1984 |
01/16/2018 |
F |
3 |
3 |
11/18/1980 |
01/16/2018 |
M |
4 |
4 |
09/09/1959 |
01/16/2018 |
M |
5 |
5 |
01/07/1965 |
01/16/2018 |
F |
6 |
6 |
12/21/1989 |
01/16/2018 |
M |
Change the set (and/or ordering) of the variables in the VAR statement, see how it affects the output, and generate similar output for each of sites 2 and 3. Also, open a view of each data set and compare the PROC PRINT output to the data in the viewer.
Program 8.1.5 creates a bar chart, which is set up to be delivered to a TIF file in this case; however, it may not be clear where that file will be stored. Before submitting Program 8.1.5, include an appropriate path in the X statement, or remove the X statement and change the working directory to the path where the file is to be placed. (See Program 1.5.1.)
Program 8.1.5: Bar Chart for Sex and Ethnicity Distribution Among Site 1 Enrollees
proc format;
value eth
1=’Non-Hispanic’
2=’Hispanic’
;
run;
x ‘cd --place path here--’;
ods _all_ close;
ods listing image_dpi=300;
ods graphics/reset width=4in imagename=’Sex and Ethnicity Chart-Site 1’ imagefmt=tif;
proc sgplot data=clinical.demographics_site1;
hbar Sex / group=Ethnicity groupdisplay=cluster;
xaxis label=’Number of Subjects’;
yaxis Display=(nolabel);
keylegend / title=’’;
format Ethnicity Eth.;
run;
The X statement accesses the command line for the operating system, and CD is the change directory command. The path placed here becomes the SAS working directory when this command is submitted. This command can also be removed (or commented out) and the working directory can be directly changed in the SAS session.
Graphs from PROC SGPLOT are potentially delivered to a variety of output destinations. ODS _ALL_ CLOSE stops delivery of output to all destinations.
For tables, the ODS LISTING destination is the Output window in the windowing environment. For graphs generated by PROC SGPLOT, the destination is the graphics file itself. IMAGE_DPI is an option that sets the resolution of the image in dots per inch.
The ODS GRAPHICS statement is available to set options for graphs produced, including sizes, names, file types, and various other options.
Output 8.1.5: Bar Chart for Sex and Ethnicity Distribution Among Site 1 Enrollees
Change the code given in Program 8.1.5 to produce a similar bar chart for sites 2 and 3.
8.2 Reading and Summarizing Visit and Lab Data
Using the techniques covered in Section 2.7, read the Site 1, Baseline Visit.txt and Site 1, Baseline Lab Results.txt files into SAS data sets, and use them to make the tables shown below. Output 8.2.1 is a listing of subjects recruited at site 1 who failed screening at the baseline visit.
Output 8.2.1: Screening Failures at Baseline Visit in Site 1 (Partial Listing)
Subject |
Screen Failure Reason |
Failure |
3 |
LOW BASELINE PAIN |
23JAN2018 |
5 |
HIGH BLOOD PRESSURE |
19JAN2018 |
7 |
HIGH BLOOD PRESSURE |
25JAN2018 |
12 |
LOW BASELINE PAIN |
29JAN2018 |
14 |
HIGH BLOOD PRESSURE |
24JAN2018 |
15 |
HIGH BLOOD PRESSURE |
25JAN2018 |
Output 8.2.2 considers subjects who failed screening due to an excessive blood pressure reading at the baseline visit for subjects recruited at site 1, and includes the blood pressure measurements in the listing.
Output 8.2.2: Data for Screen Failures on Blood Pressure at Baseline Visit in Site 1 (Partial Listing)
Subject |
Screen Failure Reason |
Failure |
Systolic Blood Pressure |
Diastolic Blood Pressure |
5 |
HIGH BLOOD PRESSURE |
19JAN2018 |
144 |
115 |
7 |
HIGH BLOOD PRESSURE |
25JAN2018 |
132 |
92 |
14 |
HIGH BLOOD PRESSURE |
24JAN2018 |
119 |
98 |
15 |
HIGH BLOOD PRESSURE |
25JAN2018 |
124 |
100 |
21 |
HIGH BLOOD PRESSURE |
03FEB2018 |
139 |
99 |
22 |
HIGH BLOOD PRESSURE |
04FEB2018 |
152 |
120 |
Output 8.2.3 is an attempt to investigate patterns in pain level across males and females, separated by those who failed initial screening and those who did not.
Output 8.2.3: Sex Versus Pain Level Split on Screen Failure Versus Pass, Site 1 Baseline
Table 1 of Sex by pain |
||||||
Controlling for screen=Fail |
||||||
Sex |
pain |
|||||
Frequency |
0 |
1 |
2 |
3 |
4 |
Total |
Female |
3 |
0 |
0 |
0 |
0 |
3 |
Male |
11 |
0 |
0 |
0 |
0 |
11 |
Total |
14 |
0 |
0 |
0 |
0 |
14 |
Frequency Missing = 37 |
Table 2 of Sex by pain |
||||||
Controlling for screen=Pass |
||||||
Sex |
pain |
|||||
Frequency |
0 |
1 |
2 |
3 |
4 |
Total |
Female |
0 |
5 |
10 |
10 |
10 |
35 |
Male |
0 |
7 |
7 |
12 |
16 |
42 |
Total |
0 |
12 |
17 |
22 |
26 |
77 |
Output 8.2.4 summarizes the distribution of screening failure reasons at the baseline visit across males and females at the baseline visit in site 1.
Output 8.2.4: Sex Versus Screen Failure at Baseline Visit in Site 1
Table of Sex by sf_reason |
|||
Sex |
sf_reason(Screen Failure Reason) |
||
Frequency |
HIGH BLOOD PRESSURE |
LOW BASELINE PAIN |
Total |
Female |
23 |
3 |
26 |
Male |
14 |
11 |
25 |
Total |
37 |
14 |
51 |
Considering the results of Output 8.2.3 against the summary in 8.2.4, determine why the first table in Output 8.2.3 has 37 missing values and why the second table has a first column with a zero frequency. Reduce the three tables produced across Output 8.2.3 and 8.2.4 to two tables containing the relevant information.
Output 8.2.5 provides a statistical summary from the baseline visit at site 1 for diastolic blood pressure and pulse rates across combinations of sex and systolic blood pressure (split into two classes).
Output 8.2.5: Diastolic Blood Pressure and Pulse Summary Statistics at Baseline Visit in Site 1
Sex |
Systolic |
N Obs |
Variable |
N |
Mean |
Std Dev |
Minimum |
Maximum |
Female |
Acceptable (120 or below) |
37 |
dbp |
37 |
77.4 |
8.7 |
61.0 |
91.0 |
High |
24 |
dbp |
24 |
100.6 |
9.3 |
87.0 |
120.0 |
|
Male |
Acceptable (120 or below) |
55 |
dbp |
55 |
77.1 |
10.9 |
49.0 |
100.0 |
High |
12 |
dbp |
12 |
97.6 |
10.1 |
80.0 |
118.0 |
Output 8.2.6 shows glucose and hemoglobin statistical summaries from the baseline lab results at site 1.
Output 8.2.6: Glucose and Hemoglobin Summary Statistics from Baseline Lab Results, Site 1
Sex |
N Obs |
Variable |
Minimum |
Lower Quartile |
Median |
Upper Quartile |
Maximum |
Female |
35 |
c_gluc |
96.0 |
101.0 |
105.0 |
109.0 |
116.0 |
Male |
42 |
c_gluc |
98.0 |
101.0 |
106.0 |
109.0 |
118.0 |
Produce versions of Output 8.2.1, 8.2.2, 8.2.5, and 8.2.6 for the baseline visits at sites 2 and 3, and also for all other visits at sites 1, 2, and 3.
8.3 Improving Reading of Data; Creating Charts
Using the expanded techniques for reading raw data covered in Chapter 3, re-read the Site 1, Baseline Visit.txt and Site 1, Baseline Lab Results.txt files into SAS data sets. Read all nonstandard values, such as dates, in a manner that maximizes their utility, and use them to create the results shown in this section. Output 8.3.1 is a chart of monthly recruits at site 1.
Output 8.3.1: Recruiting by Month in Site 1
Output 8.3.2 reconstructs Output 8.3.1, removing all recruits who failed initial screening at the baseline visit.
Output 8.3.2: Recruits that Pass Initial Screening, by Month in Site 1
Mean albumin levels are summarized in Output 8.3.3, with error bars representing 95% confidence intervals for the mean.
Output 8.3.3: Average Albumin Results—Baseline Lab, Site 1
Also construct versions of Output 8.3.1, 8.3.2, and 8.3.3 for each of sites 2 and 3.
Output 8.3.4 represents an investigation of the Pos variable, which records the position of the subject during reading of vital signs. Modify the DATA step that reads in the raw file so that it corrects the issue shown.
Output 8.3.4: Investigation of the Positions Recorded—Baseline Visit, Site 1
Position for VS Reading |
||||
pos |
Frequency |
Percent |
Cumulative |
Cumulative |
RECLINED |
29 |
22.66 |
29 |
22.66 |
RECUMBANT |
39 |
30.47 |
68 |
53.13 |
RECUMBENT |
60 |
46.88 |
128 |
100.00 |
Using the techniques like those shown in Section 3.9, inspect other variables for possible errors and correct them, if possible. Complete this investigation for data (labs, vitals, and demographics) from all visits at sites 1, 2, and 3.
8.4 Working with Data Stacked Across Visits (and Sites)
Chapter 4 discusses methods for stacking data sets which, for this case study, allows for lab information across all visits to sites 1, 2, and 3 to be put into a single data set—and the same can be done for the visit data. Output shown in this section relies on assembling these data sets at various levels, with some additional specifications for information to be added for tracking the origin of each record. Specifications for those variables and values are given in Table 8.4.1.
Table 8.4.1: Variables and Values for Visit Tracking
Variable Name |
Type (Format or Length) |
Value Definition |
VisitC |
Character, length 20 |
Term Visit Term is one of: Baseline, 3 Month, 6 Month, 9 Month, or 12 Month |
VisitNum |
Numeric, format: best12. |
0 for baseline visit, 1 for 3 month visit, 2 for 6 month, and so forth. |
VisitMonth |
Numeric, format: best12. |
0 for baseline visit, 3 for 3 month visit, 6 for 6 month, and so forth. |
Output 8.4.1: Subjects Ordered on Visit, Site 1 (Partial Listing)
subject |
visitc |
visitMonth |
visitNum |
Date of Visit |
Systolic Blood Pressure |
Diastolic Blood Pressure |
pulse |
1 |
Baseline Visit |
0 |
0 |
01/22/2018 |
110 |
80 |
75 |
1 |
3 Month Visit |
3 |
1 |
04/24/2018 |
112 |
81 |
76 |
1 |
6 Month Visit |
6 |
2 |
07/25/2018 |
114 |
82 |
77 |
1 |
9 Month Visit |
9 |
3 |
10/25/2018 |
116 |
83 |
78 |
1 |
12 Month Visit |
12 |
4 |
01/25/2019 |
118 |
84 |
79 |
2 |
Baseline Visit |
0 |
0 |
01/23/2018 |
99 |
84 |
81 |
2 |
3 Month Visit |
3 |
1 |
04/26/2018 |
100 |
84 |
80 |
2 |
6 Month Visit |
6 |
2 |
07/28/2018 |
101 |
84 |
79 |
2 |
9 Month Visit |
9 |
3 |
10/29/2018 |
102 |
84 |
78 |
2 |
12 Month Visit |
12 |
4 |
01/30/2019 |
103 |
84 |
77 |
Reproduce the following graphs, which are built from Lab and Visit information assembled across all visits at site 1.
Output 8.4.2: Albumin Distributions, Post-Baseline Visits, Site 1
Output 8.4.3: Glucose Distributions, Baseline and 3 Month Visits, Site 1
Output 8.4.4: Systolic Blood Pressure Quartiles, Site 1
Take the lab data assembled across lab visits for each of site 1, 2, and 3 and put all of those sites together—including a variable that indicates which site a record came from—so all lab records from all visits for sites 1, 2, and 3 are in a single data set with information to identify the site and visit on each record. Do the same for the visit data. Generate variations of Outputs 8.4.2, 8.4.3, and 8.4.4 that summarize similar results across various combinations of sites and visits.
8.5 Assembling and Summarizing Data—Sites 1, 2, and 3
Starting with the data assembled in Section 8.4 for site 1, produce correlations among mean (across visits for each subject) systolic BP, diastolic BP, and pulse. Partial results are given in Output 8.5.1.
Output 8.5.1: Correlations Among BP and Pulse, Site 1, Separated by Sex (Only Females Shown)
sex=F
Simple Statistics |
||||||
Variable |
N |
Mean |
Std Dev |
Sum |
Minimum |
Maximum |
sbp_Mean |
61 |
115.73907 |
17.82507 |
7060 |
79.00000 |
155.00000 |
dbp_Mean |
61 |
86.90164 |
14.23967 |
5301 |
61.00000 |
120.00000 |
pulse_Mean |
61 |
83.77596 |
14.80678 |
5110 |
56.33333 |
118.00000 |
Pearson Correlation Coefficients, N = 61 |
|||
sbp_Mean |
dbp_Mean |
pulse_Mean |
|
sbp_Mean |
1.00000 |
0.89318 |
0.86459 |
dbp_Mean |
0.89318 |
1.00000 |
0.97953 |
pulse_Mean |
0.86459 |
0.97953 |
1.00000 |
Also create the scatterplot and spline-smoothing plot (shown in Outputs 8.5.2 and 8.5.3, respectively) from the assembled data (the smoothing parameter in the PBSPLINE statement is set to 10,000).
Output 8.5.2: Scatter Plot of BP Values, Site 1, All Visits
Output 8.5.3: Spline Smoothing on Weight Versus Systolic BP, Site 1, All Visits
From the visit data sets, create a table of all visit dates for subjects who completed the study (in other words, have a 12 Month visit), computing the number of days from the first visit to the last.
Output 8.5.4: Visits for Subjects Completing the Study for Site 1, with Total Days on Study (Partial Listing)
Obs |
subject |
dovB |
dov3 |
dov6 |
dov9 |
dov12 |
Days |
1 |
1 |
01/22/2018 |
04/24/2018 |
07/25/2018 |
10/25/2018 |
01/25/2019 |
368 |
2 |
2 |
01/23/2018 |
04/26/2018 |
07/28/2018 |
10/29/2018 |
01/30/2019 |
372 |
3 |
6 |
01/24/2018 |
04/26/2018 |
07/25/2018 |
10/22/2018 |
01/19/2019 |
360 |
4 |
8 |
01/21/2018 |
04/25/2018 |
07/25/2018 |
10/25/2018 |
01/25/2019 |
369 |
5 |
9 |
01/27/2018 |
04/30/2018 |
08/02/2018 |
11/04/2018 |
02/06/2019 |
375 |
6 |
10 |
01/24/2018 |
04/22/2018 |
07/22/2018 |
10/23/2018 |
01/24/2019 |
365 |
7 |
11 |
01/27/2018 |
04/27/2018 |
07/25/2018 |
10/21/2018 |
01/17/2019 |
355 |
8 |
13 |
01/29/2018 |
04/27/2018 |
07/25/2018 |
10/23/2018 |
01/21/2019 |
357 |
Modify Output 8.5.4 to show all subjects with missing visits for non-completing patients, as shown in Output 8.5.5.
Output 8.5.5: Visits for All Subjects, Site 1 (Partial Listing)
subject |
vis1 |
vis2 |
vis3 |
vis4 |
vis5 |
1 |
01/22/2018 |
04/24/2018 |
07/25/2018 |
10/25/2018 |
01/25/2019 |
2 |
01/23/2018 |
04/26/2018 |
07/28/2018 |
10/29/2018 |
01/30/2019 |
4 |
01/20/2018 |
04/19/2018 |
. |
. |
. |
6 |
01/24/2018 |
04/26/2018 |
07/25/2018 |
10/22/2018 |
01/19/2019 |
8 |
01/21/2018 |
04/25/2018 |
07/25/2018 |
10/25/2018 |
01/25/2019 |
9 |
01/27/2018 |
04/30/2018 |
08/02/2018 |
11/04/2018 |
02/06/2019 |
10 |
01/24/2018 |
04/22/2018 |
07/22/2018 |
10/23/2018 |
01/24/2019 |
11 |
01/27/2018 |
04/27/2018 |
07/25/2018 |
10/21/2018 |
01/17/2019 |
Add a column for days, from first visit to last visit, as shown in Output 8.5.6.
Output 8.5.6: Visits for All Subjects with Days on Study, Site 1 (Partial Listing)
subject |
vis1 |
vis2 |
vis3 |
vis4 |
vis5 |
Days |
1 |
01/22/2018 |
04/24/2018 |
07/25/2018 |
10/25/2018 |
01/25/2019 |
368 |
2 |
01/23/2018 |
04/26/2018 |
07/28/2018 |
10/29/2018 |
01/30/2019 |
372 |
4 |
01/20/2018 |
04/19/2018 |
. |
. |
. |
89 |
6 |
01/24/2018 |
04/26/2018 |
07/25/2018 |
10/22/2018 |
01/19/2019 |
360 |
8 |
01/21/2018 |
04/25/2018 |
07/25/2018 |
10/25/2018 |
01/25/2019 |
369 |
9 |
01/27/2018 |
04/30/2018 |
08/02/2018 |
11/04/2018 |
02/06/2019 |
375 |
10 |
01/24/2018 |
04/22/2018 |
07/22/2018 |
10/23/2018 |
01/24/2019 |
365 |
11 |
01/27/2018 |
04/27/2018 |
07/25/2018 |
10/21/2018 |
01/17/2019 |
355 |
From the lab data, create a list of the variable names and labels. Output 8.5.7A shows a possible result; however, these names and labels are chosen during the process of reading the raw data and may differ. Output 8.5.7B shows information from the Lab_Info data set.
Output 8.5.7A: Lab Test Information for Lab Data Sets
Obs |
_NAME_ |
_LABEL_ |
1 |
alb |
Chem-Albumin, g/dL |
2 |
alk_phos |
Chem-Alk. Phos., IU/L |
3 |
alt |
Chem-Alt, IU/L |
4 |
ast |
Chem-AST, IU/L |
5 |
c_gluc |
Chem-Glucose, mg/dL |
6 |
d_bili |
Chem-Dir. Bilirubin, mg/dL |
7 |
ggtp |
Chem-GGTP, IU/L |
8 |
hematocr |
EVF/PCV, % |
9 |
hemoglob |
Hemoglobin, g/dL |
10 |
preg |
Pregnancy Flag, 1=Pregnant, 0=Not |
11 |
prot |
Chem-Tot. Prot., g/dL |
12 |
t_bili |
Chem-Tot. Bilirubin, mg/dL |
13 |
u_gluc |
Uri.-Glucose, 1=high |
Output 8.5.7B: Lab Test Information from Lab_Info Data Set
Obs |
lbtestcd |
labtest |
colunits |
1 |
ALB |
ALBUMIN |
g/dL |
2 |
ALP |
ALK. PHOS. |
IU/L |
3 |
ALT |
ALT (SGPT) |
IU/L |
4 |
AST |
AST (SGOT) |
IU/L |
5 |
BILDIR |
DIRECT BILI |
mg/dL |
6 |
GGT |
GGTP |
IU/L |
7 |
GLUC |
GLUCOSE |
mg/dL |
8 |
GLUC |
GLUCOSE |
|
9 |
HCT |
HEMATOCRIT |
% |
10 |
HGB |
HEMOGLOBIN |
g/dL |
11 |
BILI |
TOTAL BILI |
mg/dL |
12 |
PROT |
TOTAL PROT |
g/dL |
13 |
PREG |
PREG |
Determine a mapping/matching from the set of test results given in the lab data sets and the test information in the Lab_Info data set, creating variables in each data set to achieve a match merge. If successful, it is then possible to produce a data set with information like that shown in Output 8.5.8 (for one subject at the baseline visit) and Output 8.5.9 (for one subject and selected labs across multiple visits). Note the RangeFlag variable, which is 1 any time the value is outside the range limits and is 0 otherwise.
Output 8.5.8: Baseline Lab Results for Subject 13, Site 1—Including Flag for Values Outside Normal Range
Subject |
Laboratory Test |
Collected Units |
Value |
Normal Range Lower Limit |
Normal Range Upper Limit |
RangeFlag |
13 |
ALBUMIN |
g/dL |
3.60 |
3.4 |
5.4 |
0 |
13 |
ALK. PHOS. |
IU/L |
132.00 |
20.0 |
140.0 |
0 |
13 |
ALT (SGPT) |
IU/L |
24.00 |
5.0 |
35.0 |
0 |
13 |
AST (SGOT) |
IU/L |
14.00 |
10.0 |
34.0 |
0 |
13 |
DIRECT BILI |
mg/dL |
0.31 |
0.0 |
0.3 |
1 |
13 |
TOTAL BILI |
mg/dL |
2.19 |
0.3 |
1.9 |
1 |
13 |
GLUCOSE |
mg/dL |
104.00 |
100.0 |
110.0 |
0 |
13 |
GGTP |
IU/L |
39.00 |
0.0 |
51.0 |
0 |
13 |
HEMATOCRIT |
% |
53.00 |
35.0 |
49.0 |
1 |
13 |
HEMOGLOBIN |
g/dL |
14.40 |
11.7 |
15.9 |
0 |
13 |
PREG |
0.00 |
. |
. |
0 |
|
13 |
TOTAL PROT |
g/dL |
7.20 |
6.0 |
8.3 |
0 |
13 |
GLUCOSE |
0.00 |
0.0 |
0.0 |
0 |
Output 8.5.9: Selected Lab Results for Subject 2, Site 1—Including Flag for Values Outside Normal Range
Subject |
Date of Visit |
visitNum |
Laboratory Test |
Value |
Normal Range Lower Limit |
Normal Range Upper Limit |
RangeFlag |
2 |
01/23/2018 |
0 |
DIRECT BILI |
0.10 |
0.0 |
0.3 |
0 |
2 |
04/26/2018 |
1 |
DIRECT BILI |
0.20 |
0.0 |
0.3 |
0 |
2 |
07/28/2018 |
2 |
DIRECT BILI |
0.10 |
0.0 |
0.3 |
0 |
2 |
10/29/2018 |
3 |
DIRECT BILI |
0.20 |
0.0 |
0.3 |
0 |
2 |
01/30/2019 |
4 |
DIRECT BILI |
0.30 |
0.0 |
0.3 |
0 |
2 |
01/23/2018 |
0 |
TOTAL BILI |
0.82 |
0.3 |
1.9 |
0 |
2 |
04/26/2018 |
1 |
TOTAL BILI |
1.49 |
0.3 |
1.9 |
0 |
2 |
07/28/2018 |
2 |
TOTAL BILI |
0.93 |
0.3 |
1.9 |
0 |
2 |
10/29/2018 |
3 |
TOTAL BILI |
1.49 |
0.3 |
1.9 |
0 |
2 |
01/30/2019 |
4 |
TOTAL BILI |
2.06 |
0.3 |
1.9 |
1 |
2 |
01/23/2018 |
0 |
HEMATOCRIT |
48.00 |
35.0 |
49.0 |
0 |
2 |
04/26/2018 |
1 |
HEMATOCRIT |
35.00 |
35.0 |
49.0 |
0 |
2 |
07/28/2018 |
2 |
HEMATOCRIT |
46.00 |
35.0 |
49.0 |
0 |
2 |
10/29/2018 |
3 |
HEMATOCRIT |
32.00 |
35.0 |
49.0 |
1 |
2 |
01/30/2019 |
4 |
HEMATOCRIT |
28.00 |
35.0 |
49.0 |
1 |
2 |
01/23/2018 |
0 |
HEMOGLOBIN |
13.40 |
11.7 |
15.9 |
0 |
2 |
04/26/2018 |
1 |
HEMOGLOBIN |
12.00 |
11.7 |
15.9 |
0 |
2 |
07/28/2018 |
2 |
HEMOGLOBIN |
12.60 |
11.7 |
15.9 |
0 |
2 |
10/29/2018 |
3 |
HEMOGLOBIN |
11.20 |
11.7 |
15.9 |
1 |
2 |
01/30/2019 |
4 |
HEMOGLOBIN |
9.80 |
11.7 |
15.9 |
1 |
From the full set of visits at site 1, create a graph to display pain score trends in the various arms, as shown in Output 8.5.10.
Output 8.5.10: Pain Score Trends in Various Arms
Repeat all of the activities in this section for sites 2 and 3.
8.6 Data Restructuring and Report Writing
Data from the baseline visit for site can be rotated as shown in Output 8.6.1. Once that rotation is achieved successfully, the same code can be extended to rotate the data across all visits.
Output 8.6.1: Rotated Data from Baseline Visit, Site 1 (Partial Listing)
Obs |
subject |
Date of Visit |
name |
value |
units |
1 |
1 |
01/22/2018 |
Systolic BP |
110.0 |
mm/hg |
2 |
1 |
01/22/2018 |
Diastolic BP |
80.0 |
mm/hg |
3 |
1 |
01/22/2018 |
Pulse |
75.0 |
beats/mi |
4 |
1 |
01/22/2018 |
Temperature |
98.4 |
F |
5 |
1 |
01/22/2018 |
Weight |
192.0 |
lb |
6 |
2 |
01/23/2018 |
Systolic BP |
99.0 |
mm/hg |
7 |
2 |
01/23/2018 |
Diastolic BP |
84.0 |
mm/hg |
8 |
2 |
01/23/2018 |
Pulse |
81.0 |
beats/mi |
9 |
2 |
01/23/2018 |
Temperature |
98.5 |
F |
10 |
2 |
01/23/2018 |
Weight |
134.0 |
lb |
From the rotated version of the full set of visits at site 1, create the following tables of summary statistics shown in Output 8.6.2 and 8.6.3.
Output 8.6.2: Summary Report on Selected Vital Signs, All Visits, Site 1
Visit |
Measurement |
Units |
Mean |
Median |
Std. |
Minimum |
Maximum |
Baseline Visit |
Diastolic BP |
mm/hg |
83.5 |
82.5 |
14.09 |
49.0 |
120.0 |
Pulse |
beats/mi |
79.9 |
79.0 |
14.70 |
42.0 |
118.0 |
|
Systolic BP |
mm/hg |
111.0 |
110.0 |
17.03 |
67.0 |
155.0 |
|
Temperature |
F |
98.5 |
98.5 |
0.58 |
97.1 |
99.8 |
|
3 Month Visit |
Diastolic BP |
mm/hg |
77.1 |
79.0 |
9.67 |
49.0 |
90.0 |
Pulse |
beats/mi |
73.5 |
75.0 |
10.46 |
44.0 |
89.0 |
|
Systolic BP |
mm/hg |
103.7 |
104.0 |
12.23 |
70.0 |
133.0 |
|
Temperature |
F |
98.4 |
98.4 |
0.75 |
96.9 |
100.2 |
|
6 Month Visit |
Diastolic BP |
mm/hg |
77.7 |
80.0 |
9.65 |
50.0 |
91.0 |
Pulse |
beats/mi |
74.3 |
76.0 |
10.71 |
43.0 |
90.0 |
|
Systolic BP |
mm/hg |
104.9 |
104.0 |
12.26 |
71.0 |
135.0 |
|
Temperature |
F |
98.4 |
98.4 |
1.04 |
96.2 |
100.3 |
|
9 Month Visit |
Diastolic BP |
mm/hg |
78.1 |
80.0 |
9.53 |
51.0 |
91.0 |
Pulse |
beats/mi |
74.8 |
75.5 |
10.18 |
45.0 |
93.0 |
|
Systolic BP |
mm/hg |
105.9 |
105.0 |
11.99 |
74.0 |
138.0 |
|
Temperature |
F |
98.5 |
98.6 |
1.25 |
95.7 |
101.2 |
|
12 Month Visit |
Diastolic BP |
mm/hg |
77.6 |
80.0 |
9.07 |
52.0 |
91.0 |
Pulse |
beats/mi |
74.4 |
75.0 |
9.78 |
45.0 |
89.0 |
|
Systolic BP |
mm/hg |
105.4 |
105.0 |
11.20 |
75.0 |
128.0 |
|
Temperature |
F |
98.3 |
98.2 |
1.43 |
95.0 |
102.1 |
Output 8.6.3: BP Summaries, All Visits, Site 1
Measurement |
||||||
Diastolic BP |
Systolic BP |
|||||
Visit |
Mean |
Median |
Std. Dev. |
Mean |
Median |
Std. Dev. |
Baseline Visit |
83.5 |
82.5 |
14.09 |
111.0 |
110.0 |
17.03 |
3 Month Visit |
77.1 |
79.0 |
9.67 |
103.7 |
104.0 |
12.23 |
6 Month Visit |
77.7 |
80.0 |
9.65 |
104.9 |
104.0 |
12.26 |
9 Month Visit |
78.1 |
80.0 |
9.53 |
105.9 |
105.0 |
11.99 |
12 Month Visit |
77.6 |
80.0 |
9.07 |
105.4 |
105.0 |
11.20 |
Repeat each of these for sites 2 and 3, then assemble all data for all sites into a single data set and repeat these activities for that collection of data.
8.7 Advanced Data Reading and Report Writing—Connecting to Spreadsheets and Databases
Read the adverse event data from site 1, create the listing in Output 8.7.1, and repeat the same process for sites 2 and 3.
Output 8.7.1: Adverse Events, Site 1 (Partial Listing)
Obs |
subject |
aetext |
stdt |
endt |
1 |
25 |
TIA |
11Jan2018 |
12Jan2018 |
2 |
25 |
CARPAL TUNNEL IN LEFT HAND |
18Jan2018 |
19Jan2018 |
3 |
38 |
SUICIDAL |
24Jan2018 |
24Jan2018 |
4 |
41 |
MOUTH PAIN |
14Jan2018 |
15Jan2018 |
5 |
41 |
IRRITABLE BOWEL SYNDROME |
28Jan2018 |
29Jan2018 |
6 |
41 |
LOOSE STOOL |
30Jan2018 |
01Jan2018 |
7 |
46 |
RACING HEART BEAT |
30Jan2018 |
01Jan2018 |
8 |
46 |
DELIRIOUS |
26Jan2018 |
27Jan2018 |
9 |
50 |
HEART FLUTTERING |
17Jan2018 |
17Jan2018 |
10 |
50 |
RINGING IN EAR |
30Jan2018 |
01Jan2018 |
Extend Output 8.6.3 into that shown in Output 8.7.2, repeating this for sites 2 and 3, and for the full data set across all sites.
Output 8.7.2: BP Summaries—Extended, Site 1
Measurement |
||||||||
Diastolic BP |
Systolic BP |
Ratios (SBP:DBP) |
||||||
Visit |
Mean |
Median |
Std. Dev. |
Mean |
Median |
Std. Dev. |
Mean |
Median |
Baseline Visit |
83.5 |
82.5 |
14.09 |
111.0 |
110.0 |
17.03 |
132.9% |
133.3% |
3 Month Visit |
77.1 |
79.0 |
9.67 |
103.7 |
104.0 |
12.23 |
134.4% |
131.6% |
6 Month Visit |
77.7 |
80.0 |
9.65 |
104.9 |
104.0 |
12.26 |
135.0% |
130.0% |
9 Month Visit |
78.1 |
80.0 |
9.53 |
105.9 |
105.0 |
11.99 |
135.6% |
131.3% |
12 Month Visit |
77.6 |
80.0 |
9.07 |
105.4 |
105.0 |
11.20 |
135.8% |
131.3% |
Repeat the process leading to Output 8.6.2 to produce the enhanced version shown in Output 8.7.3. Also repeat this for each of sites 2 and 3, and for the full data set across all three sites.
Output 8.7.3: Summary Report on Selected Vital Signs, All Visits, Site 1—Enhanced
Visit |
Test |
Units |
Mean |
Median |
Std. Dev. |
Min. |
Max. |
Baseline Visit |
Diastolic BP |
mm/hg |
83.5 |
82.5 |
14.09 |
49.0 |
120.0 |
Pulse |
beats/mi |
79.9 |
79.0 |
14.70 |
42.0 |
118.0 |
|
Systolic BP |
mm/hg |
111.0 |
110.0 |
17.03 |
67.0 |
155.0 |
|
Temperature |
F |
98.5 |
98.5 |
0.58 |
97.1 |
99.8 |
|
93.2 |
97.8 |
18.15 |
42.0 |
155.0 |
|||
3 Month Visit |
Diastolic BP |
mm/hg |
77.1 |
79.0 |
9.67 |
49.0 |
90.0 |
Pulse |
beats/mi |
73.5 |
75.0 |
10.46 |
44.0 |
89.0 |
|
Systolic BP |
mm/hg |
103.7 |
104.0 |
12.23 |
70.0 |
133.0 |
|
Temperature |
F |
98.4 |
98.4 |
0.75 |
96.9 |
100.2 |
|
88.2 |
89.0 |
16.10 |
44.0 |
133.0 |
|||
6 Month Visit |
Diastolic BP |
mm/hg |
77.7 |
80.0 |
9.65 |
50.0 |
91.0 |
Pulse |
beats/mi |
74.3 |
76.0 |
10.71 |
43.0 |
90.0 |
|
Systolic BP |
mm/hg |
104.9 |
104.0 |
12.26 |
71.0 |
135.0 |
|
Temperature |
F |
98.4 |
98.4 |
1.04 |
96.2 |
100.3 |
|
88.8 |
90.0 |
16.14 |
43.0 |
135.0 |
|||
9 Month Visit |
Diastolic BP |
mm/hg |
78.1 |
80.0 |
9.53 |
51.0 |
91.0 |
Pulse |
beats/mi |
74.8 |
75.5 |
10.18 |
45.0 |
93.0 |
|
Systolic BP |
mm/hg |
105.9 |
105.0 |
11.99 |
74.0 |
138.0 |
|
Temperature |
F |
98.5 |
98.6 |
1.25 |
95.7 |
101.2 |
|
89.3 |
90.5 |
16.07 |
45.0 |
138.0 |
|||
12 Month Visit |
Diastolic BP |
mm/hg |
77.6 |
80.0 |
9.07 |
52.0 |
91.0 |
Pulse |
beats/mi |
74.4 |
75.0 |
9.78 |
45.0 |
89.0 |
|
Systolic BP |
mm/hg |
105.4 |
105.0 |
11.20 |
75.0 |
128.0 |
|
Temperature |
F |
98.3 |
98.2 |
1.43 |
95.0 |
102.1 |
|
88.9 |
89.0 |
15.84 |
45.0 |
128.0 |
Use the lab results to produce the table shown in Output 8.7.4 (for potentially any subject at any site, subject 2 at site 1 shown).
Output 8.7.4: Selected Lab Results for Subject 2, Site 1—Enhanced
Subject |
Lab Test |
Visit |
Units |
Measured Value |
Normal Low |
Normal High |
2 |
DIRECT BILI |
Baseline Visit |
mg/dL |
0.1 |
0 |
0.3 |
3 Month Visit |
mg/dL |
0.2 |
0 |
0.3 |
||
6 Month Visit |
mg/dL |
0.1 |
0 |
0.3 |
||
9 Month Visit |
mg/dL |
0.2 |
0 |
0.3 |
||
12 Month Visit |
mg/dL |
0.3 |
0 |
0.3 |
||
HEMATOCRIT |
Baseline Visit |
% |
48 |
35 |
49 |
|
3 Month Visit |
% |
35 |
35 |
49 |
||
6 Month Visit |
% |
46 |
35 |
49 |
||
9 Month Visit |
% |
32 |
35 |
49 |
||
12 Month Visit |
% |
28 |
35 |
49 |
||
HEMOGLOBIN |
Baseline Visit |
g/dL |
13.4 |
11.7 |
15.9 |
|
3 Month Visit |
g/dL |
12 |
11.7 |
15.9 |
||
6 Month Visit |
g/dL |
12.6 |
11.7 |
15.9 |
||
9 Month Visit |
g/dL |
11.2 |
11.7 |
15.9 |
||
12 Month Visit |
g/dL |
9.8 |
11.7 |
15.9 |
||
TOTAL BILI |
Baseline Visit |
mg/dL |
0.82 |
0.3 |
1.9 |
|
3 Month Visit |
mg/dL |
1.49 |
0.3 |
1.9 |
||
6 Month Visit |
mg/dL |
0.93 |
0.3 |
1.9 |
||
9 Month Visit |
mg/dL |
1.49 |
0.3 |
1.9 |
||
12 Month Visit |
mg/dL |
2.06 |
0.3 |
1.9 |
||
Using the techniques described in Section 7.6, connect to data for each of sites 4 and 5 and repeat each of these results for those sites as well. Combine the site 4 and 5 data with the data from the other three sites and repeat the activities from Sections 8.6 and 8.7 for the full data set.
From the data across all sites and all visits, create a table of lab information (LB) and vital signs information (VS) as given in the specifications found in the file SDTM Specs.xls in the Clinical Trial Case Study folder.