Chapter 27: Statistical and Mathematical Analysis in a Healthcare Setting (1/3)



Roque Perez-Velez

Contents

Introduction
Commonly Used Descriptive Statistics
Data Visualization
Mathematical Analysis
Exploratory Data Analysis
Conclusion

Introduction

This chapter discusses statistical and mathematical analysis in a healthcare setting. The author shares his experience in handling, analyzing, and studying data from various healthcare systems, and in explaining trends to a nontechnical audience. Readers interested in the basics of statistics, or in learning more about the topic, are referred to Kurtz,* Walpole and Myers,† or Montgomery and Runger.‡

* M. Kurtz, “Engineering Economics,” in Standard Handbook of Engineering Calculations, 2nd ed., ed. T. G. Hicks (New York: McGraw-Hill Book Co., 1985).

† R. E. Walpole and R. H. Myers, Probability and Statistics for Engineers and Scientists, 4th ed. (New York: Macmillan Publishing Company, Inc., 1989).

‡ D. C. Montgomery and G. C. Runger, Applied Statistics and Probability for Engineers, 2nd ed. (New York: John Wiley & Sons, Inc., 1999).

So, what is statistical and mathematical analysis? First, we need to define several terms. We, as engineers or managers, are concerned with two types of problems: summarizing, describing, and exploring data, or using data to infer its nature. Mendenhall and Sincich* define descriptive statistics “as the branch of statistics devoted to the organization, summarization, and description of data sets.” Furthermore, in our profession we need to understand the type of data we are working with. Vining† classifies statistical analysis as “either enumerative or analytic studies. Enumerative studies tend to assume that the data come from a static process. Analytic studies tend to assume that the data come from a dynamic process that changes over time.”

Also, Boslaugh‡ offers that “the practice of statistics usually involves analyzing data, and the validity of the statistical results depends in large part on the validity of the data analyzed.” She asserts that “this means that at some point between data collection and data analysis, someone has to get her hands dirty working directly with the data file, cleaning, organizing, and otherwise getting it ready for analysis.” Finally, Peck and Devore§ assert, “statistics involves collecting, summarizing, and analyzing data. All three tasks are critical. Without summarization and analysis, raw data are of little value, and even sophisticated analyses can’t produce meaningful information from data that were not collected in a sensible way.”

With this in mind, we can define statistical analysis as the collection, management, organization, summarization, analysis, and description of data sets by means of a statistical software program or other similar methods.

Perhaps the reader has heard the term structured data analysis. Is this a different analysis or is it associated with the statistical analysis defined above? First, structured data analysis is defined as the statistical analysis of structured data sets such as results from surveys, multiple-choice questionnaires, or other arranged data sets. By definition, structured data analysis is a subset of statistical analysis. Some examples of this methodology are regression, Bayesian, cluster, and algebraic analysis.

The author defines mathematical analysis as the study of stochastic, continuous probability, and Markov chain analyses, a subdivision of the work performed during statistical analysis. The parameters calculated with statistical analysis are used as a foundation, in stochastic or Markov chain analyses, to further study any healthcare system, such as an emergency department’s patient flow.

Commonly Used Descriptive Statistics

In this section, the author defines and provides examples of the most commonly used descriptive statistics: mean, standard deviation, median, mode, minimum, and maximum. First, we will define the statistics that are used as measures of central tendency, followed by the measures of dispersion.

The mean, commonly called the arithmetic mean, is the average of a set of values. The mean is used as a measure of central tendency. Suppose we have a family medicine practice clinic with the weekly patient load shown in Table 27.1.

We can calculate the mean as:

(52 + 57 + 57 + 61 + 44)/5 = 54.2

* W. Mendenhall and T. Sincich, Statistics for Engineers and the Sciences, 3rd ed. (San Francisco, CA: Dellen Publishing Co., 1992).

† G. Geoffrey Vining, Statistical Methods for Engineers (Pacific Grove, CA: Brooks/Cole Publishing Co., 1998).

‡ S. Boslaugh, Statistics in a Nutshell (Sebastopol, CA: O’Reilly Media, Inc., 2012).

§ R. Peck and J. L. Devore, Statistics: The Exploration and Analysis of Data (Boston, MA: Brooks/Cole Publishing Co., 2012).


The arithmetic mean formula, expressed in summation notation, is shown in Equation (27.1):

$$\mu = \frac{\sum_{i=1}^{n} x_i}{n} \tag{27.1}$$

where $x_i$ is the i-th value and $n$ is the number of values.

When the values are ranked in ascending or descending order, the median is the middle value, the mode is the most frequently occurring value, and the minimum and maximum are the lowest and highest values. The median is a better measure of central tendency than the mean for data that are asymmetrical or contain outliers, while the mode is most often useful in describing ordinal or categorical data. Continuing with the clinic example above, the patient load, ranked in ascending order, is: 44, 52, 57, 57, and 61. The minimum is 44, the median is 57, the mode is 57, and the maximum is 61. Formally, the median is the value at ranked position (n + 1)/2 when the number of values, n, is odd, or the average of the two middle values when n is even.

Please bear in mind that in a perfectly symmetrical, single-peaked distribution such as the normal distribution, the mean, median, and mode are identical, while in asymmetrical or skewed distributions these three measures will differ.

A common measure of dispersion for continuous data is the standard deviation. It describes how much the individual values in a data set vary from the mean. The formula for the sample standard deviation is shown in Equation (27.2):

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}} \tag{27.2}$$

where $\mu$ is the mean from Equation (27.1).

So, what will the standard deviation be for our family practice clinic example? Let’s see:

s = √{[(44 – 54.2)² + (52 – 54.2)² + (57 – 54.2)² + (57 – 54.2)² + (61 – 54.2)²]/(5 – 1)} = √(170.8/4) = 6.53
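The clinic figures can be reproduced with a few lines of Python. This is a minimal sketch that assumes only the standard library; the variable names are illustrative and are not taken from the chapter.

```python
import math
import statistics

# Weekly patient load from Table 27.1 (Monday through Friday).
patient_load = [52, 57, 57, 61, 44]

# Measures of central tendency and the extremes.
mean = statistics.mean(patient_load)       # arithmetic mean
median = statistics.median(patient_load)   # middle value of the ranked data
mode = statistics.mode(patient_load)       # most frequently occurring value
minimum, maximum = min(patient_load), max(patient_load)

# Sample standard deviation written out term by term:
# sum of squared deviations from the mean, divided by n - 1, then square root.
squared_deviations = [(x - mean) ** 2 for x in patient_load]
s_manual = math.sqrt(sum(squared_deviations) / (len(patient_load) - 1))

# The library routine applies the same n - 1 divisor.
s_library = statistics.stdev(patient_load)

print(f"mean={mean}, median={median}, mode={mode}, min={minimum}, max={maximum}")
print(f"sample standard deviation: {s_manual:.2f}")
```

Writing the deviation sum out by hand next to `statistics.stdev` makes it easy to confirm that all five observations, including both values of 57, enter the calculation.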

Table 27.1 Weekly Patient Load

Weekday      Patient Load
Monday       52
Tuesday      57
Wednesday    57
Thursday     61
Friday       44

Another measure of dispersion is the percentile, of which quartiles are a subset. When an ordered set of data is divided into four equal parts, the division points are called quartiles. The first or lower quartile, q1, is a value that has approximately 25% of the observations below it and approximately 75% of the observations above. The second quartile, q2, has approximately 50% of the observations below its value. The second quartile is exactly equal to the median. The third quartile, q3, has approximately 75% of the observations below its value. The first and third quartiles are found at ranked positions (n + 1)/4 and 3(n + 1)/4, respectively, where n is the number of observations. The interquartile range (iqr) is calculated as (q3 – q1). Also, the lower and upper limits for the whiskers of a box plot are calculated as q1 – 1.5 (q3 – q1) and q3 + 1.5 (q3 – q1), respectively; values outside these limits are treated as outliers. These metrics are extensively used in the creation of box plots, also known as box-and-whisker plots. Tufféry* indicates that the box plot can also be used to compare two populations, or to detect the individual outliers that must be excluded from the analysis to avoid falsifying the results.

Suppose that we have a pediatric unit where the management engineer is conducting a staffing analysis. The engineer wants to know the estimated daily census for any given day. One way for the engineer to understand the census dispersion for a particular day is to calculate the sample’s percentiles and subdivide it into quartiles. Table 27.2 shows the census for 24 days.

For this example, the median, after sorting in ascending order, is 21.5, or about 22 patients. The minimum and maximum values, respectively, are 13 and 31. The first and third quartiles are approximately 19 and 25, respectively; the exact value of q1 depends on the interpolation convention (the (n + 1)/4 position rule presented above gives 18.25). These values give the engineer a good perspective on the spread or dispersion of the daily census.
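The quartiles and box-plot limits for the pediatric census can be checked in Python. The sketch below assumes Python 3.8 or later, where `statistics.quantiles` is available; its `method="exclusive"` option corresponds to the (n + 1)/4 position rule, while `method="inclusive"` is a second common convention shown only for comparison. The variable names are illustrative.

```python
import statistics

# Pediatric unit daily census for 24 days (Table 27.2).
census = [
    31, 20, 18, 30, 20, 27, 22, 15,
    13, 19, 17, 15, 24, 24, 27, 18,
    23, 21, 25, 22, 21, 19, 30, 25,
]

# "exclusive" places quartile k at ranked position k(n + 1)/4
# and interpolates linearly between the neighboring observations.
q1, q2, q3 = statistics.quantiles(census, n=4, method="exclusive")

# "inclusive" uses a different position rule and gives a slightly different q1.
q1_inclusive, _, q3_inclusive = statistics.quantiles(census, n=4, method="inclusive")

# Interquartile range and the 1.5 x IQR limits used for box-plot whiskers.
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Observations outside the limits would be flagged as potential outliers.
outliers = [x for x in census if x < lower_limit or x > upper_limit]

print(f"q1={q1}, median={q2}, q3={q3}, iqr={iqr}")
print(f"q1 (inclusive method)={q1_inclusive}")
print(f"whisker limits: {lower_limit} to {upper_limit}, outliers: {outliers}")
```

Under either convention the third quartile is 25 and the median is 21.5; only the first quartile differs (18.25 versus 18.75), and both round to whole-patient values near those quoted in the text. No day in this sample falls outside the whisker limits.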

These metrics are widely used to analyze any process or system within the healthcare environment, whether the data are nominal, ordinal, interval, continuous, or discrete, provided the metric suits the data type (the mode, for example, for categorical data). The use of metrics such as the mean and standard deviation is the foundation of statistical process control (SPC), Total Quality Management (TQM), and Six Sigma methodologies, which are discussed in another chapter of this book.

The author has noticed that when statistical analysis is presented to an audience with a diverse background (nontechnical to technical), the audience is likely to mistakenly equate the mean and standard deviation with the quartiles. Figure 27.1 shows how these two sets of metrics compare.
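For normally distributed data, the relationship between the two sets of metrics can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); it assumes a standard normal distribution, so every result is expressed in units of σ.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Quartiles are the 25th and 75th percentiles.
q1 = standard_normal.inv_cdf(0.25)
q3 = standard_normal.inv_cdf(0.75)
iqr = q3 - q1

# Box-plot whisker limits at 1.5 x IQR beyond the quartiles.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Share of observations within one standard deviation of the mean,
# compared with the share between the quartiles.
within_one_sigma = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)
within_iqr = standard_normal.cdf(q3) - standard_normal.cdf(q1)

print(f"q1 = {q1:.4f} sigma, q3 = {q3:.4f} sigma")
print(f"whisker limits = {lower_limit:.3f} sigma to {upper_limit:.3f} sigma")
print(f"within 1 sigma: {within_one_sigma:.2%}, within IQR: {within_iqr:.2%}")
```

The quartiles sit at about ±0.6745σ, not at ±1σ: the interval between the quartiles holds 50% of the observations, whereas the interval of one standard deviation on either side of the mean holds about 68.27%.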

Table 27.2 Pediatric Unit Daily Census

31 20 18 30 20 27 22 15
13 19 17 15 24 24 27 18
23 21 25 22 21 19 30 25

Data Visualization

Statistical analysis results must be presented in meaningful ways, especially if the audience is diverse. They should be presented simply, clearly, and concisely. Care must be taken when visualizing data to present an unbiased picture. Ryan† stresses that “much care must be exercised in the use of graphical procedures, otherwise, the impressions that are conveyed could be very misleading.” There are methods that are appropriate for displaying essential information in large data sets and there are methods for displaying small data sets. Methods for displaying small data sets include, but are not limited to, tabular displays, stem-and-leaf displays, control charts, scatter plots, frequency tables, bar charts, pie charts, and dot plots. Common methods for displaying large data sets include, but are not limited to, histograms, box plots, Pareto charts, line and regression charts, and bivariate and multivariate charts. For extremely large data sets, a more recent approach to analysis is data mining.

* S. Tufféry, Data Mining and Statistics for Decision Making (West Sussex, UK: John Wiley & Sons Ltd., 2011).

† T. P. Ryan, Modern Engineering Statistics (Hoboken, NJ: John Wiley & Sons, Inc., 2007).
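As an illustration of one of the small-data-set displays listed above, the pediatric census from Table 27.2 can be arranged as a stem-and-leaf display. This is a hedged sketch in plain Python; the helper `stem_and_leaf` is a hypothetical name introduced here, and it assumes non-negative whole-number values with the tens digit as the stem.

```python
from collections import defaultdict

# Pediatric unit daily census for 24 days (Table 27.2).
census = [
    31, 20, 18, 30, 20, 27, 22, 15,
    13, 19, 17, 15, 24, 24, 27, 18,
    23, 21, 25, 22, 21, 19, 30, 25,
]


def stem_and_leaf(values):
    """Return display lines with tens digits as stems and units digits as leaves."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 27 -> stem 2, leaf 7
        leaves_by_stem[stem].append(leaf)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(leaves_by_stem.items())
    ]


lines = stem_and_leaf(census)
print("\n".join(lines))
```

The display keeps every individual value visible while showing the shape of the distribution: most days fall in the twenties, with fewer days in the teens and only three at 30 or above.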

The following is a discussion of several examples of displaying data sets of various sizes.

Pie charts are broadly used to display small data sets with a small number of workable groups. They are the simplest and most commonly used displays for nominal data, such as responses to limited-option questionnaires. Ott and Longnecker* provide simple guidelines for constructing pie charts. They recommend choosing “a small number (five or six) of categories for the variable, and to, whenever possible, construct the pie chart so that percentages are in either ascending or descending order.” Figure 27.2 depicts a local hospital’s percentage of births by day of week in a pie chart.

Bar charts are widely used to display small to medium-size data sets. The chart has two axes, horizontal and vertical, and uses bars arranged over a small number of workable groups to represent magnitude visually. A simple example would be a clinical laboratory manager responding to a question about the length of time per transaction for a pneumatic tube transport system. Table 27.3 summarizes the number of transactions per time frame.

Figure 27.3 shows the same data plotted using a bar chart.

By using data similar to those presented in Table 27.2, the management engineer can plot the daily census, by day of the week, for a pediatric unit. This will enable the engineer to better visualize any patterns in daily census. This large data set is from the results of a dynamic simulation

* R. L. Ott and M. Longnecker, An Introduction to Statistical Methods and Data Analysis (Belmont, CA: Brooks/Cole, Cengage Learning, 2010).

[Figure 27.1: a box plot (Q1, median, Q3, IQR, and whisker limits at Q1 – 1.5 × IQR and Q3 + 1.5 × IQR) aligned with normal curves on a –4σ to 4σ axis. The quartiles fall at ±0.6745σ and the whisker limits at ±2.698σ; 50% of the area lies within the IQR and 68.27% within ±1σ.]

Figure 27.1 Graphs illustrating mean and standard deviation.
