2
Health Risk Assessment

If the Earth were sick, nobody would be healthy.

2.1 Environmental Health

Human activities significantly degrade the quality of air, water, and soil, pose a great threat to human health, and reduce life expectancy. Taking air pollution as an example, the primary and secondary ambient air standards for particulate matter with a diameter of 2.5 µm or less (PM2.5) in the United States are 12 and 15 µg/m³, respectively (US EPA, 2016). The corresponding standards for PM2.5 in China are set at 15 and 35 µg/m³, respectively (China MEP, 2016). However, the average PM2.5 concentration in 10 provinces in China was greater than 55 µg/m³ in 2015. According to data from the Administration of Energy of China, 64% of primary energy in China comes from coal. China, with a population of 1.4 billion, accounts for about 47% of the world’s total coal consumption. As shown in Figure 2.1, the annual average PM2.5 for the 10 most polluted provinces in China is at least twice as much as the secondary standard of 35 µg/m³.

China’s 10 most polluted provinces in 2015. The graph has 10 descending columns with 3 horizontal lines. Columns are along Henan, Beijing, Hebei, Tianjin, Shandong, Hubei, Jiangsu, Shanxi, Anhui, and Chongqing.

Figure 2.1 China’s 10 most polluted provinces in 2015.

(Source: From the US EPA and China MEP (2016).)

In China, mortality rates due to environmental damage are significantly greater than natural mortality rates. Lung, liver, and stomach cancers are the three leading causes of cancer death in China. Epidemiological studies indicate that lung cancer is associated with chemical smog from coal burning, liver cancer with drinking water and strong liquor, and stomach cancer with food contaminated through soil pollution. According to the National Bureau of Statistics of China (NBSC), about 90% of the 161 cities whose air quality was monitored in 2014 failed to meet official Chinese standards. More than half of China's surface water is so polluted that it cannot be used as a drinking water source. Air pollution is estimated to cause about 4000 deaths every day in China, with coal burning as the likely principal cause. Because of its severity, air pollution is the leading cause of lung cancer and other respiratory diseases. Figure 2.2 shows the leading causes of death in the world. Worldwide, the most common causes of cancer death are (i) lung cancer with 1.59 million deaths, (ii) liver cancer with 745 000 deaths, (iii) stomach cancer with 723 000 deaths, (iv) colorectal cancer with 694 000 deaths, (v) breast cancer with 521 000 deaths, and (vi) esophageal cancer with 400 000 deaths.

Graph for the 7 leading causes of death in the world in 2012. The graph has 7 columns, along ischaemic heart disease, stroke, COPD, lower respiratory infections, etc. Ischaemic heart disease has the highest peak.

Figure 2.2 Seven leading causes of death in the world in 2012.

(Source: Data from WHO (2014).)

Worldwide, environmental pollution is also the leading cause of cancer and contributes to major premature mortality. WHO (2014) estimated that 3.1, 1.6, and 1.5 million people died from lower respiratory infections; trachea, bronchus, and lung cancers; and diarrheal diseases, respectively, as shown in Figures 2.2 and 2.3. There were approximately 14 million new cancer cases in 2012, and the number of new cases is expected to rise by about 70% over the next two decades (WHO, 2014). In 2012, ischemic heart disease, stroke, and chronic obstructive pulmonary disease (COPD) caused 7.4, 6.7, and 3.1 million deaths, respectively (NBSC, 2012).

Pie graph of 7 leading causes of death: 29.7% ischaemic heart disease; 26.9% stroke; 12.5% COPD and lower respiratory infections; 6.4% tracheal bronchus, lung cancers; and 6% HIV/AIDS and diarrheal diseases.

Figure 2.3 Seven leading causes of death (percent) in the world.

(Source: Data from WHO (2014).)

Figure 2.4 shows that malignant tumors are the leading cause of death in China, with mortality rates of 168 and 159 deaths per 100 000 population in urban and rural areas, respectively. These rates correspond to about 28 and 25% of deaths from the leading causes in urban and rural China, respectively.

Clustered bars for mortality rate of malignant tumor; heart, cerebrovascular, and respiratory system diseases; external causes of injury and poison; and other causes of death in urban and rural China in 2009.

Figure 2.4 Leading causes of death in urban and rural China in 2009.

(Source: Data from National Bureau of Statistics of China (2012).)

Further evidence of the environmental health issue can be seen in Figures 2.5 and 2.6. Zhao et al. (2010) showed that lung, liver, stomach, esophageal, and colorectal cancers accounted for 32, 19, 18, 9, and 8% of cancers in urban China, respectively, while the corresponding shares in rural China were 24, 23, 23, 15, and 5%, as shown in Figure 2.6.

2 Pie graphs of leading causes of death (percent), such as malignant tumor and heart diseases in urban (left) and rural (right) China in 2009. Both graphs indicate highest rate for malignant tumor.

Figure 2.5 Leading causes of death (percent) in urban and rural China in 2009.

(Source: Data from National Bureau of Statistics of China (2012).)

Pie graphs of top 10 cancers, such as lung cancer, liver cancer, stomach cancer, and esophageal cancer in urban (left) and rural (right) China in 2004–2005. Both graphs indicate highest rate for lung cancer.

Figure 2.6 Top 10 cancers in urban and rural China in 2004–2005.

(Source: Zhao et al. (2010). Reproduced with permission of Oxford University Press.)

To monitor progress in protecting human health, the United Nations uses access to improved sanitation facilities as a metric. An improved sanitation facility is defined as piped sewerage, a septic tank, or a pit latrine with a slab or composting toilet that is not shared with other households. From 1990 to 2015, the global coverage of improved sanitation rose from 54% to around 68%, as shown in Figure 2.7. This coverage, however, missed the Millennium Development Goal (MDG) target by 9 percentage points. In 2015, about 946 million people were still practicing open defecation worldwide (WHO, 2016).

Vertical bars indicating approximately 4.9, 0.9, 0.7, and 0.6 billion sanitation facilities with improved sanitation, open defecation, unimproved sanitation, and shared sanitation, respectively, in 2015.

Figure 2.7 Global sanitation facilities in 2015.

(Source: Data from WHO (2016).)

To achieve the Sustainable Development Goals (SDG) target, cities all over the world need to reduce open defecation, promote handwashing, and improve the management and treatment of fecal wastes from both sewered and on‐site facilities. For drinking water, the UN uses piped water on premises, public standpipes, boreholes, protected wells, springs, and rainwater as indicators of improved drinking water. In 2015, 6.6 billion people used improved drinking water sources, while 0.7 billion still used an unimproved drinking water source or surface water (WHO, 2016), as shown in Figure 2.8.

Vertical bars indicating population (highest to lowest) using piped water on premises; public standpipes, boreholes, protected wells and springs, and rainwater; unimproved water sources; and surface water.

Figure 2.8 Global drinking water sources.

(Source: Data from WHO (2016).)

About one‐quarter of improved sources are contaminated with feces, and approximately 1.8 billion people drink water containing such contamination (WHO, 2016). Therefore, only 68% of people in urban areas and 20% in rural areas have truly safe drinking water. Unsafe drinking water may contribute to liver cancer through disinfection by‐products (DBPs) and other contaminants such as antibiotics.

2.2 Environmental Standards

The US EPA establishes environmental standards using health risk assessment (HRA), according to whether a chemical is carcinogenic or noncarcinogenic. For chemicals that are known or anticipated to cause adverse health effects, the EPA establishes an enforceable maximum contaminant level (MCL) or a nonenforceable maximum contaminant level goal (MCLG). The Safe Drinking Water Act (SDWA) provides for standards and health advisories (HAs) for DBPs. HRA quantifies factors such as absorption during ingestion, pharmacokinetics, mutagenicity, reproductive and developmental effects, and carcinogenicity (Pontius, 1990). The MCLs are federally enforceable limits for contaminants in drinking water, established as the national primary drinking water regulations (NPDWRs). The secondary MCLs are established under the SDWA to protect public welfare with respect to properties such as odor, taste, and appearance. The HAs for drinking water contaminants are levels considered to be without appreciable health risk for specific durations of exposure and are not legally enforceable. Similarly, the MCLGs are nonenforceable health goals, set at levels at which no known or anticipated adverse effects on human health occur, with an adequate margin of safety. Table 2.1 describes the SDWA standards and HA categories.

Table 2.1 SDWA standards and health advisories.

Category | Term | Description
Standards | MCL (maximum contaminant level) | The enforceable maximum concentration in water that is delivered to public water system users
Standards | MCLG (maximum contaminant level goal) | The nonenforceable concentration that protects humans from adverse effects
Health advisories | RfD (reference dose) | An estimate of daily human exposure without appreciable risk of adverse effects over a lifetime
Health advisories | DWEL (drinking water equivalent level) | An estimate of a lifetime exposure that protects humans from adverse noncancerous health effects, assuming the sole exposure source is drinking water
Health advisories | One‐day exposure | The drinking water concentration that is not expected to cause adverse noncarcinogenic effects if exposure continues for five consecutive days
Health advisories | 10‐day exposure | The drinking water concentration that is not expected to cause adverse noncarcinogenic effects if exposure continues for 14 consecutive days
Health advisories | Long‐term exposure | The drinking water concentration that is not expected to cause adverse noncarcinogenic effects if exposure continues for 10% of the person’s lifetime
Health advisories | Lifetime HA | The drinking water concentration that is not expected to cause adverse noncarcinogenic effects if exposure continues for a lifetime

To assess an individual’s risk, bioassay results are converted to estimates of human risk based on human exposure: the dose–response curves from animal tests are used to determine the equivalent human dose–response curve. The MCLG is set using a three‐category approach that depends on the evidence of carcinogenicity. If the evidence of carcinogenicity is strong, the MCLG is set to zero. If the evidence is limited, one of two methods is used to calculate the MCLG, depending on the toxicity data available. For every chemical compound that has an MCLG, an MCL or a treatment technique must be determined. The MCL is set as close to the MCLG as feasible, based on the best available technology (BAT) and taking the cost of remediation into account. If the BAT cannot achieve zero, the MCL cannot be set to zero. For carcinogens whose MCLG is zero, the MCL typically corresponds to a lifetime cancer risk within the range of 10⁻⁴ to 10⁻⁶.

In the past, the reference dose (RfD), expressed in milligrams per kilogram of body weight per day (mg/kg/day), was used. The RfD is based on the lifetime exposure level at which there is no significant risk to humans. It is calculated by dividing the no observed adverse effect level (NOAEL) or the lowest observed adverse effect level (LOAEL) by an uncertainty factor. The uncertainty factor accounts for differences between humans and animals and for differences within the human population, and it varies from 10 to 1000 depending upon the toxicity data available. The RfD is determined using the following equation:

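\[
\mathrm{RfD} = \frac{\mathrm{NOAEL}\ \text{or}\ \mathrm{LOAEL}}{\mathrm{UF}} \tag{2.1}
\]

where UF is the uncertainty factor.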
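As a rough illustration of the RfD calculation described above, the following Python sketch divides a point of departure (NOAEL or LOAEL) by an uncertainty factor. The function name and the NOAEL value of 10 mg/kg/day are hypothetical and chosen only to show how strongly the result depends on the uncertainty factor.

```python
def reference_dose(point_of_departure: float, uncertainty_factor: float) -> float:
    """Return RfD (mg/kg/day) = NOAEL or LOAEL (mg/kg/day) / uncertainty factor."""
    if point_of_departure <= 0:
        raise ValueError("NOAEL/LOAEL must be positive")
    if uncertainty_factor < 1:
        raise ValueError("uncertainty factor must be at least 1")
    return point_of_departure / uncertainty_factor


if __name__ == "__main__":
    noael = 10.0  # hypothetical NOAEL in mg/kg/day, for illustration only
    for uf in (10, 100, 1000):  # range of uncertainty factors quoted in the text
        print(f"UF = {uf:>4}: RfD = {reference_dose(noael, uf):.3f} mg/kg/day")
```

With the same NOAEL, moving the uncertainty factor across the quoted range of 10 to 1000 changes the RfD by two orders of magnitude.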

In addition to relying on an extremely high uncertainty factor, the NOAEL‐based RfD has the following major limitations: (i) the NOAEL is limited to one of the doses in the study and is dependent on study design, (ii) it does not account for variability in the estimate of the dose–response, (iii) it does not account for the slope of the dose–response curve, and (iv) the approach cannot be applied when there is no NOAEL, except through the application of an uncertainty factor (Crump, 1984; Kimmel and Gaylor, 1988).

To overcome the high uncertainty factor of the RfD, the EPA developed benchmark dose (BMD) methods, which fit mathematical models to dose–response data and select a BMD associated with a predetermined benchmark response (BMR), such as a 10% increase in the incidence of a particular lesion or a 10% decrease in body weight gain. Results from all models include a reiteration of the model formula and the model run options chosen by the user, goodness‐of‐fit information, the BMD, and the estimate of the lower‐bound confidence limit on the BMD (BMDL). The benchmark dose software (BMDS 2.6) by the US EPA (2016) provides nested models, parameter standard error reporting, and parameter initialization for continuous models. BMDS 2.6 contains 30 different models appropriate for the analysis of dichotomous (quantal) data, continuous data, nested developmental toxicology data, multiple tumor analysis, and concentration–time data. Typical models used in the software are shown in Table 2.2.

Table 2.2 Models used in the US EPA Benchmark Dose Software 2.6 (US EPA, 2015).

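Model type | Model | Abbreviation
Continuous | Exponential | exp
Continuous | Hill | hil
Continuous | Linear | lin
Continuous | Polynomial | ply
Continuous | Power | pow
Dichotomous | Gamma | gam
Dichotomous | Logistic | log
Dichotomous | LogLogistic | lnl
Dichotomous | LogProbit | lnp
Dichotomous | Multistage | mst
Dichotomous | Multistage cancer | msc
Dichotomous | Probit | pro
Dichotomous | Weibull | wei
Dichotomous | Quantal linear | qln
Dichotomous | Dichotomous hill | dhl
Dichotomous alternative | Gamma‐BgDose | gmb
Dichotomous alternative | Logistic‐BgResponse |
Dichotomous alternative | LogProbit‐BgDose | lpb
Dichotomous alternative | Multistage‐BgDose | msb
Dichotomous alternative | Multistage‐Cancer‐BgDose | mcb
Dichotomous alternative | Probit‐BgResponse | prb
Dichotomous alternative | Weibull‐BgDose | web
Nested | Nested logistic | nln
Nested | NCTR | nct
Nested | Rai and van Ryzin | rvr
Repeated response measures | ToxicoDiffusion | txd
Concentration × time | ten Berge | ten
Multitumor | MS_Combo | multi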

The SDWA provides for standards and HAs for DBPs (US EPA, 1997). However, setting the MCLs and MCLGs involves considerable uncertainty, because discrepancies may exist between lifetime and longer‐term exposure HAs due to conservative policies. For example, the uncertainty factor may vary from 5 to 5000 when the lifetime health advisory concentration is estimated (US EPA, 1996). Human HRA and environmental risk assessment (ERA) are used by the US EPA to develop both ambient and discharge standards. Risk assessment is a systematic approach to characterizing the nature and magnitude of the risks associated with environmental or health hazards.

Since chlorination is the major disinfection process in the United States, regulation of DBP concentrations in drinking water is one of the major challenges faced by the US EPA. Major human and financial resources have been devoted to identifying, monitoring, assessing, and regulating the human health effects of DBPs. As a result, HRA guidelines for DBPs were developed to protect the public from both biological and chemical risks. For example, the US EPA developed the Stage 2 DBPR, which aims to reduce peak DBP concentrations in the distribution system. When a water treatment plant (WTP) assesses its disinfection strategy, both the effectiveness of the disinfectant against the target pathogen and the DBPs formed by the disinfectant must be considered in the decision‐making process. Since toxicity tests cannot be performed on humans, animal test data are used, and the uncertainty and variation involved must be quantified. During the HRA of DBPs, the information needed for an accurate evaluation of DBP risk may not be complete, and the uncertainty factor in the assessment may be quite large (Cothern et al., 1986). For example, the drinking water standards for total trihalomethanes (TTHM) and five haloacetic acids (HAA5) were set by the EPA (2006) at 80 and 60 µg/l, respectively, as locational running annual averages (LRAA). If no specific toxicity information was available, the MCLG was set based upon a quantitative structure–activity relationship (QSAR) study. For example, the MCLGs of 1,1‐dichloroethylene and cis‐1,2‐dichloroethylene were developed using the QSAR approach.

In addition to pathogenic bacteria and viruses, Cryptosporidium is one of the newer public health concerns for the EPA. The EPA established the Long Term 2 Enhanced Surface Water Treatment Rule (LT2ESWTR) based on Cryptosporidium concentrations in source water and current treatment practices. Four bins were defined, each corresponding to additional treatment requirements for filtered WTPs, as shown in Table 2.3. WTPs with average Cryptosporidium concentrations of less than 0.075 oocysts per liter (oocysts/l) are placed in bin 1, where no additional treatment is required. For concentrations of 0.075 oocysts/l or more, treatment beyond the existing processes is required. The additional treatment required for each bin, specified in terms of log removal, depends on the type of treatment that the WTP already uses. In setting the biological standards, the term “log” means an order of magnitude reduction in concentration; e.g. 2‐log removal equals a 99% reduction, 3‐log removal equals a 99.9% reduction, and 4‐log removal equals a 99.99% reduction. Giardia and viruses require 3‐log and 4‐log removal and/or inactivation, respectively.

Table 2.3 Bin requirements for filtered PWSs (US EPA, 2006).

If the following filtration treatment is operating in full compliance with existing regulations, then the additional treatment requirements are as listed under each filtration type.

Cryptosporidium concentration (oocysts/l) | Bin classification | Conventional filtration treatment (includes softening) | Direct filtration | Slow sand or diatomaceous earth filtration | Alternative filtration technologies
<0.075 | 1 | No additional treatment | No additional treatment | No additional treatment | No additional treatment
≥0.075 and <1.0 | 2 | 1‐log treatment | 1.5‐log treatment | 1‐log treatment | As determined by the state
≥1.0 and <3.0 | 3 | 2‐log treatment | 2.5‐log treatment | 2‐log treatment | As determined by the state
≥3.0 | 4 | 2.5‐log treatment | 3‐log treatment | 2.5‐log treatment | As determined by the state
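The log‐removal arithmetic and the bin thresholds of Table 2.3 can be sketched in a few lines of Python. This is only an illustrative sketch: the function and dictionary names are hypothetical, and the "alternative filtration" column is omitted because its requirement is determined by the state.

```python
import math


def log_removal_to_percent(log_removal: float) -> float:
    """Percent reduction for a given log removal, e.g. 2-log -> 99%."""
    return (1.0 - 10.0 ** (-log_removal)) * 100.0


def percent_to_log_removal(percent: float) -> float:
    """Log removal for a given percent reduction (0 <= percent < 100)."""
    if not 0.0 <= percent < 100.0:
        raise ValueError("percent must be in [0, 100)")
    return -math.log10(1.0 - percent / 100.0)


# Upper concentration bound (oocysts/l, exclusive) for bins 1 to 3, from Table 2.3.
BIN_UPPER_BOUNDS = [(0.075, 1), (1.0, 2), (3.0, 3)]

# Additional log treatment by bin and existing filtration type, from Table 2.3.
ADDITIONAL_LOG_TREATMENT = {
    1: {"conventional": 0.0, "direct": 0.0, "slow_sand_or_de": 0.0},
    2: {"conventional": 1.0, "direct": 1.5, "slow_sand_or_de": 1.0},
    3: {"conventional": 2.0, "direct": 2.5, "slow_sand_or_de": 2.0},
    4: {"conventional": 2.5, "direct": 3.0, "slow_sand_or_de": 2.5},
}


def cryptosporidium_bin(concentration: float) -> int:
    """Return the bin (1 to 4) for an average source-water concentration."""
    if concentration < 0:
        raise ValueError("concentration cannot be negative")
    for upper, bin_number in BIN_UPPER_BOUNDS:
        if concentration < upper:
            return bin_number
    return 4


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n}-log removal = {log_removal_to_percent(n):.2f}% reduction")
    conc = 0.5  # hypothetical source-water concentration, oocysts/l
    b = cryptosporidium_bin(conc)
    print(f"{conc} oocysts/l -> bin {b}, extra treatment: {ADDITIONAL_LOG_TREATMENT[b]}")
```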

To reduce biological and chemical risks simultaneously, UV disinfection appears to be the best option for drinking water and treated wastewater effluent. Unlike chemical disinfectants, UV leaves no residual that can be monitored to determine the UV dose and inactivation credit. The UV dose depends on the UV intensity (measured by UV sensors), the flow rate, and the UV transmittance (UVT). To earn disinfection credits, a relationship between the required UV dose and these parameters must therefore be established and then monitored at a WTP to ensure sufficient disinfection of microbial pathogens. The US EPA (2006) recommended the UV dose requirements (mJ/cm²) listed in Table 2.4.

Table 2.4 UV dose requirements in millijoules per square centimeter (mJ/cm²) (US EPA, 2006).

Target pathogen | 0.5‐log | 1.0‐log | 1.5‐log | 2.0‐log | 2.5‐log | 3.0‐log | 3.5‐log | 4.0‐log
Cryptosporidium | 1.6 | 2.5 | 3.9 | 5.8 | 8.5 | 12 | 15 | 22
Giardia | 1.5 | 2.1 | 3.0 | 5.2 | 7.7 | 11 | 15 | 22
Virus | 39 | 58 | 79 | 100 | 121 | 143 | 163 | 186
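The dose table can be used as a simple lookup. The Python sketch below stores the Table 2.4 values and returns either the dose required for a target log inactivation or the log inactivation credited for a delivered dose. The function names are hypothetical, and rounding to the next tabulated level (rather than interpolating between levels) is a conservative assumption made here, not a rule stated in the text.

```python
import bisect

# Log-inactivation levels and UV doses (mJ/cm2) copied from Table 2.4.
LOG_LEVELS = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
UV_DOSE = {
    "Cryptosporidium": [1.6, 2.5, 3.9, 5.8, 8.5, 12, 15, 22],
    "Giardia": [1.5, 2.1, 3.0, 5.2, 7.7, 11, 15, 22],
    "Virus": [39, 58, 79, 100, 121, 143, 163, 186],
}


def required_uv_dose(pathogen: str, log_inactivation: float) -> float:
    """Dose for the smallest tabulated level >= the requested log inactivation."""
    if not 0 < log_inactivation <= LOG_LEVELS[-1]:
        raise ValueError("log inactivation must be in (0, 4.0]")
    index = bisect.bisect_left(LOG_LEVELS, log_inactivation)
    return UV_DOSE[pathogen][index]


def credited_log_inactivation(pathogen: str, delivered_dose: float) -> float:
    """Largest tabulated log inactivation whose dose is <= the delivered dose."""
    index = bisect.bisect_right(UV_DOSE[pathogen], delivered_dose)
    return LOG_LEVELS[index - 1] if index > 0 else 0.0


if __name__ == "__main__":
    print(required_uv_dose("Cryptosporidium", 3.0))   # tabulated value for 3-log
    print(required_uv_dose("Virus", 4.0))             # tabulated value for 4-log
    print(credited_log_inactivation("Giardia", 12.0)) # credit for a 12 mJ/cm2 dose
```

The much larger virus doses in the table mean that a dose sufficient for 4‐log Cryptosporidium or Giardia inactivation earns little or no virus credit under this lookup.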

2.3 Health Risk Assessment

Human health risk assessment (HRA) quantifies the likelihood that adverse human health effects may occur or are occurring as a result of exposure to one or more stressors (US EPA, 1992). It considers factors such as absorption during ingestion, pharmacokinetics, mutagenicity, reproductive and developmental effects, and carcinogenicity (Pontius, 1990). To reduce environmental health mortality, HRA is used to detect issues, identify hazards, characterize hazards, assess exposure, and manage and communicate risk. Using HRA, the EPA developed the Integrated Risk Information System (IRIS), which contains databases of the human health effects that may result from exposure to various chemicals in the environment. The IRIS 10⁻⁶ risk level is the contaminant concentration (in µg/l) in drinking water that would yield no greater than an additional risk of one in a million (10⁻⁶) after a lifetime of drinking that water. The acute 10‐day values apply specifically to acute toxic effects on children but are expected to be protective for adults. For noncarcinogenic chemicals, this value is typically the same as the MCLG. The chronic (lifetime) values for cancer are set at a level that should yield no greater than an additional 10⁻⁶ risk over a lifetime of exposure. According to the IRIS, the EPA classifies cancer risk in six categories: (i) H, carcinogenic to humans; (ii) L, likely to be carcinogenic to humans; (iii) L/N, likely to be carcinogenic above a specified dose; (iv) S, suggestive evidence of carcinogenic potential; (v) I, inadequate information to assess carcinogenic potential; and (vi) N, not likely to be carcinogenic to humans.

2.3.1 Hazard Identification

Hazard identification (HI) determines whether exposure to a stressor can increase the incidence of adverse health effects in humans, which reflects the capacity of an agent to cause adverse health effects in humans and other animals (US EPA, 1995). The qualitative description in HI is complemented with QSAR, genetic toxicity, and pharmacokinetic data and with the weight of evidence. For example, toxicokinetic data, which describe how the body absorbs, distributes, metabolizes, and eliminates specific chemicals, are usually used in HI. In the HI of chlorination, chemical disinfectants such as chlorine and chloramines react with naturally occurring organic material (NOM) to produce DBPs that are potential carcinogens and may cause developmental defects in laboratory animals. DBPs have been linked to potential health risks such as liver, kidney, or central nervous system problems and an increased risk of cancer. To quantify the toxicity of DBPs, QSAR analysis is one important method of estimating the carcinogenicity of DBPs from animal toxicity tests. There are six major classes of DBPs, namely, halogenated alkanes, halogenated alkenes, halogenated aromatics, halogenated aldehydes, halogenated ketones, and halogenated carboxylic acids. The DBPs within a class usually share a similar toxic mode or mechanism. Table 2.5 lists DBP classes and their definitions.

Table 2.5 Classification of DBPs.

DBP class | Definition
Trihalomethanes | Chemical compounds in which three of the four hydrogen atoms of methane (CH4) are replaced by halogen atoms
Haloacetic acids | Carboxylic acids in which a halogen atom takes the place of a hydrogen atom in acetic acid
Haloacetonitriles | Small organic compounds containing nitrogen, chlorine, and/or bromine. Few data are available on the toxicity of the haloacetonitriles; however, animal studies suggest that dichloroacetonitrile is mutagenic and therefore potentially carcinogenic
Haloketones | Compounds with a functional group consisting of a ketone group with an α‐halogen substituent. The general structure is RR′C(X)C(=O)R, where R is an alkyl or aryl residue and X is any one of the halogens
Other halogenated compounds | Chloropicrin, chloral hydrate, and cyanogen chloride
Aldehydes | Organic compounds that have an acyl group, R–C=O, with a hydrogen atom bonded to the carbonyl or acyl carbon (double‐bonded carbon)
Inorganic compounds | Chlorine, chloramines, chlorine dioxide, and bromate

In addition to the main molecular structure, such as alkane, alkene, or aromatic, substituents such as chlorine and bromine within each DBP class contribute significantly to the toxicity of the corresponding DBPs. Table 2.6 compares the relative order of potency of haloacetonitriles in terms of different toxic mechanisms (Pontius, 1990). For example, haloalkanes are procarcinogens that are activated via reactions in which CYP450 acts as a catalyst in dehalogenation. Brominated compounds are expected to have greater alkylating activity than chlorinated compounds. Compounds with chlorine or bromine substitution at the α‐carbon or terminal carbon in this group are expected to be potential alkylating agents, and those with an active halogen at both ends of the aliphatic chain are also expected to be cross‐linking agents. The stability of several chlorinated ketones in aqueous solutions follows this order: 1,3‐dichloro > pentachloro ≫ hexachloro. Among the mutagenic chloropropanols (mostly direct acting), the relative mutagenic potency follows the order 1,3‐ > 1,1,3,3‐ > penta‐ > 1,1,3‐ > 1,1,1‐ > 1,1‐, with the potency of 1,3‐ being about 100–1000 times higher than that of 1,2‐dichloroacetic acid (DCA) and trichloroacetic acid (TCA), which are known mouse liver carcinogens and are by‐products found in drinking water. In animal tests, DCA produces developmental, reproductive, neural, and hepatic effects. Aldehydes, as electrophilic compounds, may form DNA–protein cross‐links and lead to carcinogenesis.

Table 2.6 Comparison of chemical, biochemical, and biologic properties of haloacetonitriles.

Chemical/biochemical/biologic tests in assessing carcinogenic potential | Relative order of potency of haloacetonitriles tested
Alkylating activity (4‐(p‐nitrobenzyl)pyridine reaction) | Br2 ≫ BrCl > Cl ≫ Cl2 ≫ Cl3
Inhibition of glutathione S‐transferase | Cl3 > Br2 > Cl2 > Br > Cl
Escherichia coli SOS chromotest | BrCl > Br2 > Cl2 ≫ Cl (inactive)
Ames or Ames fluctuation tests | Cl2 ≈ BrCl > Cl3 ≈ Br2 > Cl ≈ Br (inactive); Cl3 > Cl > Cl2 > Br2 ≈ Br (inactive)
DNA single‐strand breaks in HeLa cell (comet assay) | Cl3 > BrCl > Br2 > Cl2 > Cl; Br2 > Cl3 ≈ Cl2 ≈ BrCl > Cl
Sister chromatid exchanges in Chinese hamster ovary cells | Br2 > BrCl > Cl3 ≥ Cl2 > Cl
Newt micronucleus assay | Br2 ≥ Cl3 > Br > Cl2 ≫ Cl
In vivo mouse micronucleus assay | Br2 ≈ BrCl ≈ Cl3 ≈ Cl2 ≈ Cl (all inactive)
Lung adenoma assay in strain A mice | Cl ≈ Cl3 ≈ BrCl > Br2 ≈ Cl2 (inactive)
Skin tumor initiation in SENCAR mice | Br2 ≈ Cl ≈ BrCl > Cl3 (inconsistent) > Cl2 (inactive)

2.3.2 Dose–Response Curves

A dose–response relationship describes how the severity of adverse health effects (the responses) is related to the amount of exposure to a chemical compound. The measured response usually increases as the dose increases. To establish drinking water standards, dose–response curves from animal toxicity tests are used to establish toxicity slopes or BMDs and to document the dose–response relationship over the range of observed doses. With animal toxicity data, the health risk is extrapolated below the lower end of the observed data to the dose level at which effects begin to be adverse to human health. The shape of the dose–response curve depends on the chemical, the kind of response, the incidence of disease, and death. However, there are major gaps in the extrapolation from animals to humans and from high doses to low doses when predicting human health risk. Great uncertainty can be introduced during these extrapolations through either nonlinear or linear dose–response models for different modes of action. For example, animal tumor data are used to develop a dose–response parameter such as the effective dose at 10% (ED10) or the lethal dose at 10% (LD10). The actual dose depends heavily upon the exposure route and the animal species. The EPA lists the LD50 of different chemical compounds, which vary greatly, in Table 2.7. For dioxin (TCDD), the LD50 is as low as 0.001 mg/kg. For inhalation of chlorine, the LC50 values for rats and mice are 293 and 137 ppm for a 1 h exposure, respectively. Table 2.7 shows that the LD50 values of different chemicals vary by several orders of magnitude.

Table 2.7 LD50 of typical chemicals based upon animal toxicity test (US EPA, 2006).

Chemical | LD50 (mg/kg)
Ethyl alcohol | 10 000
Sodium chloride | 4 000
Ferrous sulfate | 1 500
Morphine sulfate | 900
Strychnine sulfate | 150
Nicotine | 1
Black widow | 0.55
Curare | 0.50
Rattlesnake | 0.24
Dioxin (TCDD) | 0.001

Chemical | LD50 (with route and animal)
Caffeine | 620 mg/kg – oral mouse; 192 mg/kg – oral rat; 105 mg/kg – i.v. rat; 68 mg/kg – i.v. mouse
Chlorine (LC50) | 293 ppm/1 h – rat; 137 ppm/1 h – mouse
THC (from marijuana) | 175 mg/kg – i.v. mouse; 155 mg/kg – i.v. rabbit; 100 mg/kg – i.v. dog
Mercury(I) chloride | 210 mg/kg – oral rat; 8 mg/kg – i.v. mouse
Mercury(II) chloride | 37 mg/kg – oral rat; 10 mg/kg – oral mouse
Arsenic acid (V oxidation state) | 48 mg/kg – oral rat
Arsenic trioxide (III oxidation state) | 20 mg/kg – oral rat
Dimethylarsenic acid (methylated arsenic form used as a cotton defoliant) | 700 mg/kg – oral rat
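The spread of the general LD50 values in Table 2.7 can be quantified with a short Python sketch. The dictionary simply copies the first two columns of the table; the helper names are hypothetical, and using ethyl alcohol as the reference chemical is an arbitrary choice for illustration.

```python
import math

# General LD50 values (mg/kg) copied from the first two columns of Table 2.7.
LD50_MG_PER_KG = {
    "Ethyl alcohol": 10000,
    "Sodium chloride": 4000,
    "Ferrous sulfate": 1500,
    "Morphine sulfate": 900,
    "Strychnine sulfate": 150,
    "Nicotine": 1,
    "Black widow": 0.55,
    "Curare": 0.50,
    "Rattlesnake": 0.24,
    "Dioxin (TCDD)": 0.001,
}


def orders_of_magnitude_span(values) -> float:
    """log10 of the ratio between the largest and smallest value."""
    values = list(values)
    return math.log10(max(values) / min(values))


def relative_potency(chemical: str, reference: str = "Ethyl alcohol") -> float:
    """How many times more acutely toxic a chemical is than the reference.

    A lower LD50 means higher toxicity, so the ratio is reference / chemical.
    """
    return LD50_MG_PER_KG[reference] / LD50_MG_PER_KG[chemical]


if __name__ == "__main__":
    span = orders_of_magnitude_span(LD50_MG_PER_KG.values())
    print(f"LD50 values span about {span:.0f} orders of magnitude")
    for name in ("Nicotine", "Dioxin (TCDD)"):
        print(f"{name}: {relative_potency(name):,.0f} times more toxic than ethyl alcohol")
```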

2.3.2.1 Nonlinear Dose–Response Assessment

In nonlinear dose–response assessment, the threshold of toxicity is the point at which the effects (or their precursors) begin to occur. The no observed adverse effect level (NOAEL) is the highest exposure level at which no statistically or biologically significant increases are seen in the frequency or severity of adverse effects between the exposed population and its control population. Different mathematical models are used to establish the benchmark dose (BMD) or the benchmark dose lower confidence limit (BMDL) at a response level in the range from 1 to 10%, depending on the toxicity tests. The BMDL is a statistical lower confidence limit on the dose that produces the selected response. The lowest observed adverse effect level (LOAEL), NOAEL, or BMDL is used as the point of departure for extrapolation to lower doses. The EPA (2012) developed a guideline on using dose–response modeling to obtain BMDs, i.e. dose levels corresponding to specific response levels, near the low end of the observable effect range, as shown in Figure 2.9. The fraction of animals affected in each group is indicated by the points, with error bars showing 95% confidence intervals. In developing the BMD, the EPA requires the following statistical information about the process: (i) rationale, (ii) estimation procedure, (iii) estimates of model parameters, (iv) goodness of fit, such as the log‐likelihood and the Akaike Information Criterion (AIC), and (v) standardized residuals. The US EPA (2012) provided excellent examples to illustrate some important aspects of computing BMDs and BMDLs from simple datasets and endpoints using EPA’s BMDS package.

Fraction affected vs. dose displaying an ascending curve and dots with error bars, with BMD and BMDL indicated at the bottom left portion. At the top left is a legend indicating dots for data and line for multistage.

Figure 2.9 Example of a model fit to dichotomous data, with BMD and BMDL indicated.

Table 2.9 Parameter estimates with standard errors for 2nd‐degree multistage model.

Parameter | Maximum likelihood estimates (MLEs) | Standard error
Background | 0.12 | 0.132665
Beta1 | 0.00930036 | 0.141898
Beta2 | 0.00925286 | 0.0246904

Fitted 2nd‐degree multistage model and data means represented by an ascending line and dots, respectively, for tumor incidence versus human equivalent dose and with dashed lines indicating BMDL and BMD.

Figure 2.10 Fitted 2nd‐degree multistage model and data means.

Table 2.10 Parameter estimates with standard errors for 1st‐degree multistage model.

Parameter | Maximum likelihood estimates (MLEs) | Standard error
Background | 0.111488 | 0.111488
Beta1 | 0.120556 | 0.120556

Table 2.11 Goodness‐of‐fit table.

Dose | Estimated probability | Expected number responding | Observed number responding | Group size | Scaled residual
0.0000 | 0.1115 | 5.574 | 6 | 50 | 0.086
2.8300 | 0.2417 | 11.842 | 10 | 49 | −0.205
5.6700 | 0.3531 | 17.657 | 19 | 50 | 0.118

Fitted 1st‐degree multistage model and data means represented by an ascending line and dots, respectively, for tumor incidence versus human equivalent dose and with dashed lines indicating BMDL and BMD.

Figure 2.11 Fitted 1st‐degree multistage model and data means.
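To make the multistage fits in Tables 2.9–2.11 more concrete, the Python sketch below evaluates a multistage model and solves for the BMD at a chosen benchmark response. It assumes the commonly used multistage form P(d) = background + (1 − background)(1 − exp(−Σ βᵢ dⁱ)) and defines the BMR as extra risk; both are assumptions, since the model formula is not reproduced in the text. The parameter values are the maximum likelihood estimates listed in Table 2.9, the 10% BMR is an illustrative choice, and the function names are hypothetical. The BMDL is not computed, because it requires a confidence-limit calculation that is beyond this sketch.

```python
import math


def multistage_probability(dose, background, betas):
    """P(d) = background + (1 - background) * (1 - exp(-sum(beta_i * d**i)))."""
    poly = sum(b * dose ** (i + 1) for i, b in enumerate(betas))
    return background + (1.0 - background) * (1.0 - math.exp(-poly))


def extra_risk(dose, background, betas):
    """Extra risk = (P(d) - P(0)) / (1 - P(0))."""
    p0 = multistage_probability(0.0, background, betas)
    return (multistage_probability(dose, background, betas) - p0) / (1.0 - p0)


def benchmark_dose(bmr, background, betas, max_dose=1e6, tol=1e-10):
    """Solve extra_risk(BMD) = bmr by bisection (requires non-negative betas)."""
    if not 0.0 < bmr < 1.0:
        raise ValueError("bmr must be between 0 and 1")
    if any(b < 0 for b in betas) or not any(b > 0 for b in betas):
        raise ValueError("betas must be non-negative with at least one positive")
    lo, hi = 0.0, 1.0
    while extra_risk(hi, background, betas) < bmr:  # expand the bracket
        hi *= 2.0
        if hi > max_dose:
            raise ValueError("BMD not found below max_dose")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if extra_risk(mid, background, betas) < bmr:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Maximum likelihood estimates for the 2nd-degree model, from Table 2.9.
    background, betas = 0.12, [0.00930036, 0.00925286]
    bmd = benchmark_dose(0.10, background, betas)  # 10% extra risk (assumed BMR)
    print(f"BMD at 10% extra risk: {bmd:.3f}")
    print(f"Check: extra risk at BMD = {extra_risk(bmd, background, betas):.4f}")
```

For a 1st‐degree model the same result follows in closed form, BMD = −ln(1 − BMR)/β₁, which is a convenient check on the bisection.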

2.3.2.2 Linear Dose–Response Assessment

For carcinogens, if “mode of action” information is insufficient, then linear extrapolation is typically used as the default approach for dose–response assessment. A straight line is drawn from the point of departure for the observed data (typically the BMDL) to the origin (zero dose and zero response). The slope of this straight line is referred to as the slope factor (SF) or cancer SF and is used to estimate the risk at a given exposure level. When a linear dose–response is used to assess cancer risk, the excess lifetime cancer risk is calculated as follows:

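\[
\text{Excess lifetime cancer risk} = \mathrm{SF} \times D \tag{2.2}
\]

where SF is the slope factor, (mg/kg/day)⁻¹, and D is the lifetime average daily dose, mg/kg/day.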

Total cancer risk is calculated by adding the individual cancer risks for each pollutant in each pathway of concern (i.e. inhalation, ingestion, and dermal absorption) by using reasonable maximum exposure (RME) and then adding together the risk for all pathways.
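The summation over pollutants and pathways can be sketched as follows. The linear form, risk = slope factor × dose, follows the linear extrapolation described above; the function names, slope factors, and doses are all hypothetical values chosen only to illustrate the bookkeeping.

```python
def excess_cancer_risk(slope_factor: float, dose: float) -> float:
    """Linear low-dose risk: slope factor (mg/kg/day)^-1 times dose (mg/kg/day)."""
    return slope_factor * dose


def total_cancer_risk(exposures):
    """Sum risks over pollutants within each pathway, then over pathways.

    `exposures` maps a pathway name to a list of (slope_factor, dose) pairs,
    one pair per pollutant. Returns (risk per pathway, total risk).
    """
    pathway_risk = {
        pathway: sum(excess_cancer_risk(sf, dose) for sf, dose in pollutants)
        for pathway, pollutants in exposures.items()
    }
    return pathway_risk, sum(pathway_risk.values())


if __name__ == "__main__":
    # Hypothetical slope factors and reasonable-maximum-exposure doses.
    exposures = {
        "ingestion": [(0.01, 1e-4), (0.05, 2e-5)],
        "inhalation": [(0.02, 5e-5)],
        "dermal": [(0.01, 1e-5)],
    }
    per_pathway, total = total_cancer_risk(exposures)
    for pathway, risk in per_pathway.items():
        print(f"{pathway:>10}: {risk:.2e}")
    print(f"{'total':>10}: {total:.2e}  (target risk level 1e-06: "
          f"{'exceeded' if total > 1e-6 else 'met'})")
```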

2.3.3 Exposure Assessment

Exposure assessment estimates the magnitude, frequency, and duration of human exposure to an agent in the environment or estimates future exposures for an agent that has not yet been released. Exposure can be measured at the point of contact (the outer boundary of the body), estimated by separately evaluating the exposure concentration and the contact time under different scenarios, or reconstructed through internal indicators (biomarkers, body burden, and excretion levels) after the exposure. Table 2.12 lists the variables used in the US EPA Guidelines for Exposure Assessment (1992) for HRA.

Table 2.12 Specific parameters in health risk assessment.

Parameter Definition Default – child Default – adult
TRL Target risk level (unitless) 10−6 10−6
BW Body weight (kg) 15 70
AT Averaging time (year) 70 70
SFABS Absorbed cancer slope factor (mg/kg/day)−1 Chemical specific Chemical specific
ED Exposure duration (year) 6 30
EV Event frequency (events/day) 1 1
EF Exposure frequency (days/year) 350 350
FA Fraction absorbed (unitless) Chemical specific Chemical specific
tevent‐RME Event duration (h) 1 (bathing) 0.58 (showering)
SA Surface area (cm2) 6 600 18 000
Kp Permeability coefficient (cm/h) Chemical specific Chemical specific
ABSGI Absorption fraction (unitless) Chemical specific Chemical specific
τevent Lag time per event (h) Chemical specific Chemical specific
SFo Oral cancer slope factor (mg/kg/day)−1 Chemical specific Chemical specific
t* Time to reach steady state (h) Chemical specific Chemical specific
DAD Dermal absorbed dose (mg/kg/day) Site specific Site specific
ADevent Absorbed dose per event (mg/cm2/event) Site specific Site specific
B Dimensionless ratio of the permeability coefficient of a compound through the stratum corneum relative to its permeability coefficient across the viable epidermis (ve) (dimensionless) Chemical specific Chemical specific

The internal dose via the dermal route, µg/kg bw/day, can be calculated as follows:

(2.3)images

where

  • images
  • images
  • images
  • images
  • images
  • images
  • images
  • images

2.3.3.1 Cancer Screening Calculation for Dermal Contaminants in Water

For a given cancer risk level at 10−6, the following equations can be used to estimate dermal absorbed dose (DAD) in mg/kg/day:

  • For cancer risk
    (2.4)images
  • For hazard quotient
    (2.5)images
  • Evaluate ADevent:
    (2.6)images
  • Evaluate permissible water concentration, Cw: For organics
    (2.7)images

    (2.8)images

  • For inorganics
    (2.9)images

Example 2.3 illustrates the steps used to calculate the cleanup level from dermal exposure to compounds in water given an acceptable risk of 10−6. The default scenarios used in the calculations are (i) the adult 30‐year exposure and (ii) an age‐adjusted 30‐year exposure incorporating a child bathing for 1 h/event (RME value), once a day, 350 days/year for 6 years and an adult showering at 35 min/event (RME value), once a day, 350 days/year for 24 years. The general equations could be applied to any compound, and the example gives the calculation for one compound in water with a cancer risk of 10−6.
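The screening logic can be sketched for the simplest case, an inorganic compound and the adult showering scenario. The relations used here are assumptions based on the general form of the US EPA dermal guidance (DAD = TRL/SFABS, ADevent back-calculated from DAD, and Cw = ADevent/(Kp × tevent) for inorganics); they are not copied from Equations (2.4) to (2.9). The exposure defaults come from Table 2.12, while the slope factor and permeability coefficient are hypothetical.

```python
# Adult defaults from Table 2.12; the chemical-specific values are hypothetical.
TRL = 1e-6            # target risk level (unitless)
BW = 70.0             # body weight (kg)
AT_DAYS = 70 * 365    # averaging time: 70 years expressed in days
ED = 30.0             # exposure duration (years)
EV = 1.0              # event frequency (events/day)
EF = 350.0            # exposure frequency (days/year)
SA = 18_000.0         # skin surface area (cm2)
T_EVENT = 0.58        # event duration (h), showering

def screening_cw_inorganic(sf_abs, kp):
    """Back-calculate a permissible water concentration (mg/L).

    Assumed relations (not taken from the chapter's equation images):
      DAD      = TRL / SF_ABS                               [mg/kg/day]
      AD_event = DAD * BW * AT / (EV * ED * EF * SA)        [mg/cm2/event]
      Cw       = AD_event / (Kp * t_event)                  [mg/cm3]
    """
    dad = TRL / sf_abs
    ad_event = dad * BW * AT_DAYS / (EV * ED * EF * SA)
    cw_mg_per_cm3 = ad_event / (kp * T_EVENT)
    return dad, ad_event, cw_mg_per_cm3 * 1000.0  # 1 L = 1000 cm3

# Hypothetical chemical: SF_ABS = 1.5 (mg/kg/day)^-1, Kp = 1e-3 cm/h.
dad, ad_event, cw = screening_cw_inorganic(sf_abs=1.5, kp=1e-3)
print(f"DAD = {dad:.3e} mg/kg/day, AD_event = {ad_event:.3e}, Cw = {cw:.4f} mg/L")
```

Organic compounds would need the lag time, fraction absorbed, and the ratio B from Table 2.12, which this sketch deliberately leaves out.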

2.3.3.2 Noncancer Screening Calculation for Contaminants in Residential Soil

The following equations are provided by the EPA in the exposure assessment process. The scenario to be evaluated is residential soil. Equations (2.10), (2.11), and (2.12) are used for calculating the soil concentration, Csoil:

Child or adult:

Age adjusted:

The age‐adjusted, body‐part‐weighted dermal factor is as presented in equation

For toxicity assessment, cancer SF can be derived based on absorbed dose:

(2.13)images

while RfD can be expressed as follows based on absorbed dose:

2.3.4 DBP Health Advisory Concentration

Many benchmark concentrations can be correlated with ELUMO, while ELUMO correlates significantly with the number of chlorine atoms for a given class of DBPs with a specific carbon number (Tang and Wang, 2010). Example 2.5 illustrates such a correlation.

2.3.5 Risk Characterizations

Risk characterization conveys the risk assessor’s judgment as to the nature and presence or absence of risks. The description of how the risk was assessed should show where assumptions and uncertainties exist and where policy choices will need to be made. EPA recommends that the characterization fully and explicitly disclose the risk assessment methods, default assumptions, logic, rationale, extrapolations, uncertainties, and overall strength of each step in the assessment. After risk assessment, risk management and communication are key to protecting public health. Each component of the risk assessment (e.g. hazard assessment, dose–response assessment, exposure assessment) has an individual risk characterization for the key findings, assumptions, limitations, and uncertainties. Risk characterization applies to both human HRA and ecological risk assessments.

2.4 QSAR Analysis in HRA

The US EPA permits and uses quantitative structure–activity relationship (QSAR) principles to classify and prioritize DBPs because more than 600 DBPs have been identified and cataloged by the US EPA, but the toxicity of only a small fraction of them has been quantified. QSAR can be used to predict the toxicity of a specific DBP or a molecular property of a DBP such as log P, which reflects the hydrophobicity of a chemical compound. Experts use the principles of mechanism‐based structure–activity relationship (SAR) relative to known carcinogens, and mechanisms include structural analogy to known carcinogens, toxicokinetic and toxicodynamic factors, potency indicators for a structural analog, short‐term test data, and metabolic activation. Therefore, QSAR analysis has a unique role in setting drinking water standards.

QSAR analysis is based upon a simple fact that similar chemical structures are expected to exhibit similar chemical behavior. With tens of thousands of chemical structures to assess and many different empirical tests, QSAR is often used in strategic screening in product development through HI and risk assessment. Given sufficient knowledge on structurally or functionally related compounds, QSAR may be used for screening well‐defined biological, toxicological, or pharmacological endpoints of interest and associated kinetic characteristics such as absorption, distribution, metabolism, and excretion. In QSAR analysis, chemical structure is quantitatively correlated with a well‐defined process, such as biological activity, chemical reactivity, or toxicity of a chemical compound. The US EPA uses QSAR analysis to prioritize DBPs using three general types of predictive models:

  1. Mathematical models such as SARs and QSARs use descriptors and mathematical relationships to derive predictions. Examples of SAR/QSAR models are ECOSAR and select modules contained in EPISuite™ like WSKOWWIN.
  2. Fragment‐based models such as the BioWin module within EPISuite evaluate the features of molecular fragments present on the molecule to make predictions.
  3. Expert systems, like OncoLogic™, use rule‐based decision trees to mimic an expert’s judgment. Other expert systems utilize artificial neural networks and molecular models.

There are hundreds of chemicals that have been identified as DBPs, a large proportion of which fall into the general category of halogenated DBPs. Due to evidence of carcinogenicity and developmental effects, halogenated DBPs are of particular interest. Two major molecular descriptors can be used in predicting the toxicity of DBP. One is log P and the other is the number of carbon and halogen atoms. For QSAR analysis, halogenated DBPs are classified into eight classes as illustrated in Figure 2.14.

Structural formulas of halogenated alkane, halogenated alkene, halogenated aromatic, halogenated aldehyde, halogenated ketone, halogenated carboxylic acid, heterocycle, and a DBP compound.

Figure 2.14 Classification of DBPs.

According to the molecular structure of DBPs, Pierotti (1999) developed a DBP database containing more than one thousand DBP compounds for the eight classes of DBPs. The database includes more than 20 different physical and chemical characteristics for each DBP compound. Major variables are chemical name, address, subclass, CAS number, SMILES, disinfection process, molecular formula and structure, molecular weight, log P, cLog P, log P reference, melting point (°C), boiling point (°C), vapor pressure (mm of Hg), solubility (mol/l), EHOMO, ELUMO, surface area, MCLG (mg/l), MCL (mg/l), 10‐day exposure for a 10 kg child (mg/l), long‐term exposure for 10 kg child (mg/l), long‐term exposure for a 70 kg adult (mg/l), mg/l at 10−4 cancer risk, cancer group, SDWA reference, and toxicological effect. The database is used in developing QSAR models in the following section to illustrate statistical methods such as regression, outlier detection, quantification of uncertainty and sensitivity, and validation of QSAR models.

2.4.1 Multiple Linear Regression (MLR)

Multiple linear regression (MLR) is a common algorithm in deriving QSAR models. It relates the dependent variable y to a number of independent (predictor) variables, xj, by using a linear equation as follows:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_m x_m \tag{2.15}$$

where

  • ŷ = calculated dependent variable
  • xj = predictor variable (j = 1, 2, …, m)
  • bj = regression coefficient of xj
  • b0 = intercept

Goodness of fit is quantified by the coefficient of multiple determination, R2, which is calculated by Equation (2.16). R2 estimates the proportion of the variation of y that is explained by the regression equation (Massart et al., 1997). If there is a perfect fit, R2 = 1, while if there is no linear relationship between the dependent and independent variables, R2 = 0:

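$$R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{2.16}$$

where

  • RSS = residual sum of squares
  • TSS = total sum of squares
  • yi = observed dependent variable for compound i
  • ŷi = calculated dependent variable for compound i
  • ȳ = mean of the observed dependent variable
  • n = number of compounds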
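As a minimal sketch of how an MLR model of the form of Equation (2.15) is fitted and its R2 evaluated, the following Python code uses ordinary least squares from NumPy. The data are synthetic and the "true" coefficients are arbitrary; they stand in for a descriptor table and are not taken from the DBP database.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bm]."""
    A = np.column_stack([np.ones(len(y)), X])   # prepend a column of ones for b0
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r_squared(X, y, coef):
    """Coefficient of multiple determination, R2 = 1 - RSS/TSS."""
    y_hat = coef[0] + X @ coef[1:]
    rss = np.sum((y - y_hat) ** 2)
    tss = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - rss / tss

# Synthetic example: 30 "compounds", 3 descriptors, small random noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 0.5 - 0.6 * X[:, 0] + 0.1 * X[:, 1] + 1.2 * X[:, 2] + rng.normal(scale=0.05, size=30)

coef = fit_mlr(X, y)
r2 = r_squared(X, y, coef)
print(coef, r2)
```

Because the noise is small relative to the signal, the fitted coefficients land close to the values used to generate the data and R2 is close to 1.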

In predicting the toxicity of a chemical compound, two important molecular descriptors are hydrophobicity, such as log P, and electronic properties, such as the energy of the lowest unoccupied molecular orbital (ELUMO). Log P reflects the hydrophobicity of molecules, which often correlates well with the bioactivity of chemicals (Leo and Hansch, 1999). The logarithm of the partition coefficient (log P), also referred to as log Kow, describes the distribution of a compound between organic (usually n‐octanol) and water phases by the following equation:

$$\log P = \log_{10}\left(\frac{C_{\text{octanol}}}{C_{\text{water}}}\right) \tag{2.17}$$

where

  • Coctanol = equilibrium concentration of the compound in the n‐octanol phase
  • Cwater = equilibrium concentration of the compound in the water phase

If log P is greater than zero, the compound has a greater solubility in the organic phase; if log P is less than zero, the compound has a greater solubility in the aqueous phase. Chen et al. (2016) reported that log P can be predicted accurately from ELUMO, the number of carbon atoms (NC), and the number of chlorine atoms (NCl). A general MLR model to predict log P is

$$\log P = b_0 + b_1 E_{\text{LUMO}} + b_2 N_{\text{Cl}} + b_3 N_{\text{C}} \tag{2.18}$$

To develop a robust QSAR model, outliers have to be detected and removed to improve the correlation coefficient of the QSAR model. The Hotelling test and the associated leverage statistics can be used to detect outliers. The leverage of a chemical provides a measure of the distance of the chemical from the centroid of its active set. Chemicals in the active set have leverage values between 0 and 1. A warning leverage (h*), defined in Equation (2.19) (Eriksson et al., 2003), is the critical value used to cut off the outliers from the dataset:

$$h^* = \frac{3(p + 1)}{n} \tag{2.19}$$

where

  • h* = warning leverage
  • p = number of model variables
  • n = number of compounds in the active (training) set

The Williams plot is a double‐ordinate Cartesian plot of cross‐validated residuals (first ordinate), standardized residuals (second ordinate), and leverage values, h (hat diagonal; abscissa). It defines the domain of applicability of the model as a squared area within a ±2 band for residuals and a leverage threshold, h*.
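The leverage side of this check can be sketched as follows. The code computes the diagonal of the hat matrix for a descriptor matrix with an intercept and flags compounds whose leverage exceeds a warning value. The threshold is assumed here to be h* = 3(p + 1)/n, a form widely used in QSAR work, and the data are synthetic with one deliberately extreme compound.

```python
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix H = A (A^T A)^-1 A^T, where A = [1, X]."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    H = A @ np.linalg.pinv(A.T @ A) @ A.T
    return np.diag(H)

def warning_leverage(n, p):
    """Assumed threshold h* = 3(p + 1)/n for n compounds and p descriptors."""
    return 3.0 * (p + 1) / n

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))          # 40 synthetic compounds, 3 descriptors
X[0] = [8.0, 8.0, 8.0]                # compound 0 lies far from the centroid

h = leverages(X)
h_star = warning_leverage(*X.shape)   # 3 * (3 + 1) / 40 = 0.3
flagged = np.where(h > h_star)[0]
print(h_star, flagged)
```

A full Williams plot would additionally plot standardized residuals against these leverages; only the leverage axis is shown here.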

Figure 2.16 is a box plot that shows the contribution of each variable (ELUMO, NCl, and NC) to log P as standardized coefficients after the outliers were removed. It suggests that NC contributes to log P most positively, while ELUMO contributes to log P negatively and to a lesser extent. The number of chlorine atoms, NCl, contributes to log P positively but less than both NC and ELUMO.

Standardized coefficients vs. variable displaying boxes with error bars labeled E(LUMO), –0.558; NCI, 0.113; and NC, 1.236 depicting each variable contribution to log P in halogenated alkane.

Figure 2.16 Box plot of each variable contribution to log P in halogenated alkane after removal of outliers.

(Source: Reproduced with permission of Springer.)

Figure 2.17 shows that all the predicted log P values are within the boundary of 95% confidence of the observed log P values.

Observed log P vs. predicted log P displaying circle markers along an ascending solid line from the origin between two ascending dotted lines.

Figure 2.17 Measured log P versus predicted log P for halogenated alkane after removal of outliers.

(Source: Reproduced with permission of Springer.)

2.4.2 Validation of QSAR Models

QSAR models should be balanced between the two extremes of overfitting and underfitting through model validation. Model validation covers internal model performance (goodness of fit and robustness) and external model performance (predictivity). Cross‐validation, bootstrapping, the response randomization test, and training/test set splitting are recommended by the OECD (2007). In cross‐validation, a number of modified datasets are created by deleting, in each case, one or a small group of compounds from the data in such a way that each compound is removed once. From the original dataset, a reduced dataset (training set or active set) is used to develop a partial model, while the remaining data (validation set) are used to evaluate the model predictivity (Efron, 1983; Osten, 1988).

The leave‐one‐out (LOO) method is the simplest cross‐validation procedure. Each compound is removed one at a time. For n given compounds, n reduced models are developed. Each of these models is developed with the remaining n − 1 compounds and used to predict the response of the deleted compound. The predictive power of the model is calculated from the sum of squared differences between the observed and estimated responses. The LOO method quantifies each compound’s impact on the developed QSAR model accurately. However, the predictive power is often too optimistic, particularly with a large dataset, because the perturbation caused by removing one compound is often insignificant.
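A minimal LOO sketch for an MLR model is given below. It accumulates the predictive residual sum of squares (PRESS) and, as an added assumption not stated in the text, converts it to the commonly reported cross‐validated coefficient Q2 = 1 − PRESS/TSS. The data are synthetic.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out PRESS and Q2 for an MLR model with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i                      # drop compound i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2            # predict the deleted compound
    tss = np.sum((y - y.mean()) ** 2)
    return press, 1.0 - press / tss

rng = np.random.default_rng(2)
X = rng.normal(size=(25, 3))
y = 0.5 - 0.6 * X[:, 0] + 0.1 * X[:, 1] + 1.2 * X[:, 2] + rng.normal(scale=0.1, size=25)

press, q2 = loo_q2(X, y)
print(press, q2)
```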

The leave‐many‐out (LMO) cross‐validation method removes more than one compound each time. The dataset is divided into a number of blocks (referred to as cancelation groups). Each time, all the compounds belonging to one block are left out of the derivation of the model. Compared with the LOO method, the LMO method gives a more realistic estimation of predictive power because it introduces a larger perturbation in the dataset. There is no standard rule for splitting the data into blocks, so the split is normally defined by model users. The random clustering method, in which one or more compounds are randomly selected as the cancelation block and left out of the derivation of the QSAR model, can also be used.
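The LMO idea can be sketched with randomly assigned cancelation groups. The number of blocks, the random seed, and the Q2 summary statistic are illustrative choices rather than a prescribed protocol, and the data are synthetic.

```python
import numpy as np

def lmo_q2(X, y, n_blocks=5, seed=0):
    """Leave-many-out Q2 with randomly assigned cancelation groups."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    order = np.random.default_rng(seed).permutation(n)
    press = 0.0
    for block in np.array_split(order, n_blocks):
        keep = np.ones(n, dtype=bool)
        keep[block] = False                           # leave the whole block out
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += np.sum((y[block] - A[block] @ coef) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(3)
X = rng.normal(size=(25, 3))
y = 0.5 - 0.6 * X[:, 0] + 0.1 * X[:, 1] + 1.2 * X[:, 2] + rng.normal(scale=0.1, size=25)

q2_lmo = lmo_q2(X, y, n_blocks=5)
print(q2_lmo)
```

Setting `n_blocks` equal to the number of compounds reduces this procedure to the LOO method.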

2.5 Quantification of Uncertainty

All models have intrinsic uncertainty, which has to be quantified in order to communicate risks to the public. Constraints, uncertainties, and assumptions that have an impact on the risk assessment should be explicitly considered at each step in the risk assessment. Therefore, the quantitative description of uncertainty in HRA, carrying capacity, and climate change is critical for policy makers and the general public. For example, because of the uncertainties in quantifying all consequences of different emissions, the IPCC Fifth Assessment (2015) adopted the following technical terms in dealing with uncertainty:

  • Confidence in the validity of a finding, based on the type, amount, quality, and consistency of evidence (e.g. mechanistic understanding, theory, data, models, expert judgment) and the degree of agreement. Confidence is expressed qualitatively.
  • Quantified measures of uncertainty in a finding expressed probabilistically (based on statistical analysis of observations or model results or expert judgment).

For the policy makers, the IPCC adopted the following terms for quantitative description purposes to express the assessed likelihood of an outcome or a result. Seven different descriptive terms are assigned to seven different likelihood probability ranges as in Table 2.19.

Table 2.19 Technical terms used for describing the probability of an outcome (IPCC, 2013).

Term Likelihood probability of an outcome (%)
Virtually certain 99–100
Very likely 90–100
Likely 66–100
About as likely as not 33–66
Unlikely  0–33
Very unlikely  0–10
Exceptionally unlikely  0–1

To quantify different uncertainty, the US EPA identifies scenario, parameter, and model uncertainty. In each category of uncertainty, the sources of uncertainty are identified with a specific example in Table 2.20.

Table 2.20 Type of uncertainty with sources and examples (US EPA).

Type of uncertainty | Sources | Examples
Scenario uncertainty | Descriptive errors | Incorrect or insufficient information
Scenario uncertainty | Aggregation errors | Spatial or temporal approximations
Scenario uncertainty | Judgment errors | Selection of an incorrect model
Scenario uncertainty | Incomplete analysis | Overlooking an important pathway
Parameter uncertainty | Measurement errors | Imprecise or biased measurements
Parameter uncertainty | Sampling errors | Small or unrepresentative samples
Parameter uncertainty | Variability | In time, space, or activities
Parameter uncertainty | Surrogate data | Structurally related chemicals
Model uncertainty | Relationship errors | Incorrect inference on the basis for correlations
Model uncertainty | Modeling errors | Excluding relevant variables

Furthermore, the US EPA lists different quantification methods of uncertainty with examples in Table 2.21.

Table 2.21 Approaches to quantitative analysis of uncertainty.

Approach | Description | Examples
Sensitivity analysis | Changing one input variable at a time while leaving others constant, to examine effect on output | Fix each input at lower (then upper) boundary while holding others at nominal values (e.g. medians)
Analytical uncertainty propagation | Examining how uncertainty in individual parameters affects the overall uncertainty of the exposure assessment | Analytically or numerically obtain a partial derivative of the exposure equation with respect to each input parameter
Probabilistic uncertainty analysis | Varying each of the input variables over various values of their respective probability distributions | Assign probability density function to each parameter; randomly sample values from each distribution and insert them in the exposure equation (Monte Carlo)
Classical statistical methods | Estimating the population exposure distribution directly, based on measured values from a representative sample | Compute confidence interval estimates for various percentiles of the exposure distribution
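The first approach in Table 2.21, one‐at‐a‐time sensitivity analysis, can be sketched as follows. The exposure equation used here (dose = concentration × intake rate / body weight) and all numerical values are hypothetical and serve only to show the mechanics of fixing each input at its lower and upper bound while the others stay at nominal values.

```python
def one_at_a_time(model, nominal, bounds):
    """Return {input name: (output at lower bound, output at upper bound)}."""
    results = {}
    for name, (low, high) in bounds.items():
        outputs = []
        for value in (low, high):
            inputs = dict(nominal)     # all other inputs stay at nominal values
            inputs[name] = value
            outputs.append(model(**inputs))
        results[name] = tuple(outputs)
    return results

def dose(c, ir, bw):
    """Hypothetical exposure equation: mg/L * L/day / kg = mg/kg/day."""
    return c * ir / bw

nominal = {"c": 0.01, "ir": 2.0, "bw": 70.0}
bounds = {"c": (0.005, 0.02), "ir": (1.0, 3.0), "bw": (50.0, 90.0)}

oat = one_at_a_time(dose, nominal, bounds)
for name, (low_out, high_out) in oat.items():
    print(name, low_out, high_out)
```

The input whose bounds produce the widest spread in output is the one to which the exposure estimate is most sensitive under these assumed ranges.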

2.5.1 Quantification of QSAR Model’s Uncertainty

For a QSAR model in which r is a function of j variables X1, X2, …, Xj:

$$r = r(X_1, X_2, \ldots, X_j) \tag{2.22}$$

The uncertainty due to errors in variable can be expressed as follows (Coleman and Steele, 1989):

(2.23)images
(2.24)images

When the above uncertainty equation is applied to log P as function r, while ELUMO, NCl, and NC are the three variables, the uncertainty equation of log P can be expressed in Equations 2.25 and 2.26:
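For a linear model the partial derivatives are simply the regression coefficients, so a first‐order propagation can be sketched in a few lines. The sketch assumes the standard root‐sum‐square form for independent inputs and propagates only the descriptor uncertainties, treating the coefficients as fixed; the coefficients and uncertainties are hypothetical.

```python
import math

def logp_uncertainty(b, u_x):
    """First-order propagation for log P = b0 + b1*ELUMO + b2*NCl + b3*NC.

    b   : (b1, b2, b3), the partial derivatives of log P for a linear model
    u_x : uncertainties of (ELUMO, NCl, NC), assumed independent
    """
    return math.sqrt(sum((bi * ui) ** 2 for bi, ui in zip(b, u_x)))

b = (-0.6, 0.1, 1.2)      # hypothetical regression coefficients b1, b2, b3
u_x = (0.3, 0.5, 0.5)     # hypothetical uncertainties of ELUMO, NCl, NC

u_logp = logp_uncertainty(b, u_x)
print(u_logp)
```

With these placeholder values the NC term dominates the combined uncertainty because it has the largest coefficient.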

When the data reduction expression is very complex and the task of computing the partial derivatives in the above equations is extremely laborious, Monte Carlo simulation (MCS) can be used instead; it quantifies uncertainty efficiently from the distributions of the independent variables (Cox et al., 2001). MCS is used here as an example to quantify the uncertainty of the predicted log P using ELUMO, NCl, and NC as three independent variables.

2.5.2 Monte Carlo Simulation

Monte Carlo simulation (MCS) has been extensively used for quantifying the uncertainty of linear equations (Tang et al., 2009). Estimating the propagation of error distributions by MCS is based on theoretical principles and supports a fully consistent and transferable estimation of measurement uncertainty. In the general case, the output Z is obtained indirectly from the input variables X1, X2, …, Xn through a functional relationship F:

$$Z = F(X_1, X_2, \ldots, X_n) \tag{2.27}$$

The knowledge about the values that may reasonably be attributed to the quantities Xi, considered as continuous random variables, is expressed by their probability density functions (PDFs), p(Xi), within the corresponding domains. An expectation or best estimate for the value of Xi, E(Xi), and the uncertainty associated with this value, μ(Xi), taken as the standard deviation σ(Xi), are obtained from the PDF:

$$E(X_i) = \int X_i \, p(X_i) \, dX_i \tag{2.28}$$
$$\mu^2(X_i) = \sigma^2(X_i) = \int \left[ X_i - E(X_i) \right]^2 p(X_i) \, dX_i \tag{2.29}$$

If the equation is linearized by means of a Taylor expansion about the point μ = (μ1, μ2, …, μn) and the second‐ and higher‐order terms are neglected, the estimated uncertainty of the measured output, μ(Z), is calculated from the input variables as follows:

(2.30)images

Taking μ(Z) as the values F(μ) = F(μ1, μ2, …, μn) leads to the well‐known law of propagation of uncertainty:

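$$\mu^2(Z) = \sum_{i=1}^{n} \left( \frac{\partial F}{\partial X_i} \right)^2 \mu^2(X_i) + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{\partial F}{\partial X_i} \frac{\partial F}{\partial X_j} \, \mathrm{cov}(X_i, X_j) \tag{2.31}$$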

In the Guide to the Expression of Uncertainty in Measurement (GUM, 1993), the internationally accepted master document for the evaluation of uncertainty, the combined standard uncertainty μ(Z) is evaluated from the standard uncertainties of the input variables, μ(Xi), and the covariances between correlated ones, cov(Xi, Xj). If all the input variables are independent, cov(Xi, Xj) = 0. The essence of MCS is to simulate sampling from a target population with a given expectation μz and variance σ2(Z). A random sample of size M is obtained from the simulation of M independent and identically distributed random variables Z1, Z2, …, ZM. According to the central limit theorem (Martinez, 2002), the distribution of the sum is approximately normal, with expectation Mµz and variance Mσ2:

$$\sum_{i=1}^{M} Z_i \sim N\left( M\mu_z, \; M\sigma^2 \right) \tag{2.32}$$

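According to the rule of “two sigmas,” the probability

$$P\left\{ M\mu_z - 2\sigma\sqrt{M} < \sum_{i=1}^{M} Z_i < M\mu_z + 2\sigma\sqrt{M} \right\} \approx 0.95 \tag{2.33}$$

By dividing the inequality in Equation (2.33) by M on both sides, the expression becomes

$$P\left\{ \mu_z - \frac{2\sigma}{\sqrt{M}} < \frac{1}{M}\sum_{i=1}^{M} Z_i < \mu_z + \frac{2\sigma}{\sqrt{M}} \right\} \approx 0.95 \tag{2.34}$$

By substituting the sample mean $\bar{Z} = \frac{1}{M}\sum_{i=1}^{M} Z_i$ into Equation (2.34), the expression becomes

$$P\left\{ \left| \bar{Z} - \mu_z \right| < \frac{2\sigma}{\sqrt{M}} \right\} \approx 0.95 \tag{2.35}$$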

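This relationship is the foundation of the MCS because it establishes the rule to evaluate the error of setting $\mu_z \approx \bar{Z}$, where $\bar{Z}$ is the sample mean, as $2\sigma/\sqrt{M}$. The variance σ2 may be best estimated by the sample variance:

$$s^2 = \frac{1}{M - 1} \sum_{i=1}^{M} \left( Z_i - \bar{Z} \right)^2 \tag{2.36}$$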

After defining the coverage probability, P, the confidence interval for the result is evaluated as $[Z_{\text{low}}, Z_{\text{high}}]$, where the extremes correspond to the 2.5 and 97.5% percentiles of the sorted Z values. When the skewness value of the Z forecast discrete distribution is near zero, the confidence interval becomes symmetric and expanded uncertainty U(Z) is estimated by Equation (2.38):

MCS is a powerful tool in quantifying uncertainty in addition to experimental, analytical, or other numerical methods. MCS protocol can replace point estimates with random variables drawn from probability density functions. To perform MCS, a computer program is used to generate the pseudorandom numbers to simulate the values of the variables within a given PDF. Several commercial software programs such as Crystal Ball, LabVIEW, @RISK, and Analytica are the most popular for this purpose. There are three major steps if Crystal Ball by Oracle (2016) is used to carry out the MCS. First, a probability distribution is incorporated into a spreadsheet cell, and each time the spreadsheet is recalculated, a new value of the random variable is selected from the distribution and used for calculations. Second, the entire simulation is run at least 10 000 times to satisfy the required high number of trials. Each time new values of the random variables are selected, a new estimate of the final target is generated. Third, the results of simulations are summarized in a user‐friendly interface such as a table and a figure.
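The three steps described above do not depend on a particular commercial package. The following sketch reproduces them with NumPy as a stand‐in for a spreadsheet add‐in: it draws 10 000 pseudorandom values per input, evaluates the model for every trial, and summarizes the results with the sample mean, the 2σ/√M error of the mean, and the 2.5 and 97.5% percentiles. The model and the input distributions are hypothetical.

```python
import numpy as np

def monte_carlo(model, samplers, m=10_000, seed=0):
    """Run m trials of model(X1, ..., Xn) with inputs drawn from samplers."""
    rng = np.random.default_rng(seed)
    draws = [sample(rng, m) for sample in samplers]   # one array per input
    z = model(*draws)
    mean = z.mean()
    s = z.std(ddof=1)                                  # sample standard deviation
    low, high = np.percentile(z, [2.5, 97.5])
    return {
        "mean": mean,
        "std": s,
        "mean_error_2sigma": 2.0 * s / np.sqrt(m),    # two-sigma error of the mean
        "interval_95": (low, high),
    }

# Hypothetical linear model Z = 2*X1 + X2 with a normal and a uniform input.
result = monte_carlo(
    lambda x1, x2: 2.0 * x1 + x2,
    [lambda rng, m: rng.normal(1.0, 0.1, m),
     lambda rng, m: rng.uniform(0.0, 1.0, m)],
)
print(result)
```

For this model the exact expectation is 2.5, so the printed mean should fall within roughly the reported two‐sigma error of that value.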

A step‐by‐step procedure for uncertainty assessment of a regression QSAR model for halogenated DBPs is illustrated in Figure 2.21. The first step is the compilation of the DBP data. Three variables were used: ELUMO, the number of chlorine atoms (NCl), and the number of carbon atoms (NC). The probability distributions are characterized with the statistical software SAS 9.4 to obtain the data average and standard deviation. All these parameters are fed into the MCS, which gives the results as a probability distribution around a mean value that is then used to carry out a detailed sensitivity analysis. A sensitivity analysis is applied to identify which parameters have the most impact on the predicted log P. If small modifications of one parameter characterized by a probability distribution strongly influence the final result, it may be concluded that the sensitivity of the variable is very high. Sensitivity is crucial in determining which variables are the most important in predicting a dependent outcome. This can be analyzed by displaying the sensitivity as a percentage of the contribution from each parameter to the variance of the final result. Crystal Ball presents the contribution of each independent variable as a percentage, with the percentage contributions summing to 100%. A general procedure for the uncertainty quantification of the predicted log P of a DBP class through MCS is outlined as follows:

  1. Selection of significant sources of uncertainty. The molecular descriptors ELUMO, NCl, and NC, the intercept b0, and the coefficients b1, b2, and b3 are the significant sources contributing to the model’s uncertainty.
  2. Identification of the probability density function corresponding to each selected uncertainty source. All uncertainty sources (ELUMO, NCl, NC, b0, b1, b2, and b3) are analyzed by Crystal Ball to obtain their best‐fit probability distribution functions. The resulting probability functions for each variable are then used in the best‐fit MLR equation to predict log P.
  3. Selection of the number M of Monte Carlo simulation trials. For example, MCS was carried out 10 000 times in order to have a sufficiently high number of trials.
  4. Simulation of M samples {Xi1, Xi2, …, XiM} for each uncertainty source Xi (ELUMO, NCl, NC, b0, b1, b2, and b3), which is considered a random variable with a probability density function p(ELUMO), p(NCl), p(NC), p(b0), p(b1), p(b2), p(b3).
  5. Computation of the M results {Z1, Z2, …, ZM} by applying Equation (2.27) or (2.37) to the M samples of each variable Xi (ELUMO, NCl, NC, b0, b1, b2, and b3).
  6. Analysis and interpretation of Monte Carlo Simulation results.
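The six steps above can be sketched end to end with NumPy. The sketch assumes the linear form log P = b0 + b1·ELUMO + b2·NCl + b3·NC and, purely for illustration, assigns a normal distribution to every uncertainty source; the means and standard deviations are hypothetical and would in practice come from the distribution fitting in step 2.

```python
import numpy as np

M = 10_000                                  # step 3: number of trials
rng = np.random.default_rng(42)

# Steps 1-2: uncertainty sources with hypothetical (mean, standard deviation).
sources = {
    "b0": (0.50, 0.05), "b1": (-0.60, 0.05), "b2": (0.10, 0.02), "b3": (1.20, 0.05),
    "ELUMO": (0.80, 0.30), "NCl": (2.0, 0.5), "NC": (3.0, 0.5),
}

# Step 4: draw M samples for every source.
s = {name: rng.normal(mu, sd, M) for name, (mu, sd) in sources.items()}

# Step 5: evaluate the model for each of the M trials.
log_p = s["b0"] + s["b1"] * s["ELUMO"] + s["b2"] * s["NCl"] + s["b3"] * s["NC"]

# Step 6: summarize the simulated distribution of log P.
mean = log_p.mean()
sd = log_p.std(ddof=1)
low, high = np.percentile(log_p, [2.5, 97.5])
print(f"log P = {mean:.2f}, sd = {sd:.2f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

Treating NCl and NC as continuous normal variables is a simplification made for the sketch; for a real DBP class they are integer counts, and the fitted distributions may differ.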

Flowchart depicting the procedure of Monte Carlo simulation in uncertainty assessment, starting from compilation of DBP data to selection of essential parameters leading to analysis and discussion of results.

Figure 2.21 Procedure for Monte Carlo simulation in uncertainty assessment.

(Source: Adapted from the US EPA (1999).)

Example 2.9 presents the MCS results produced by Crystal Ball software. The methodology can be used for the quantification of uncertainties of any regression curve in general.

2.5.3 Comparison of Uncertainties of Different QSAR Models

The uncertainty of log P can also be estimated by the point estimate method (PEM). In most cases, uncertainty quantified by MCS should be more accurate than that calculated by PEM because MCS accounts for the distributions and the propagation of the uncertainty. The uncertainty results obtained from the PEM and the MCS for QSAR models are compared in Example 2.10.

2.5.4 Sensitivity Analysis by Monte Carlo Simulation

The influence of different variables on the outcome of model’s prediction can be visually presented through sensitivity analysis in Crystal Ball. Table 2.25 summarizes the sensitivity results for each DBP class. Figure 2.26 demonstrates that NCl is the most influential molecular descriptor in predicting log P of all DBP classes, except halogenated alkane.

Table 2.25 Sensitivity of various descriptors to log P by DBP classes.

No. Compound class ELUMO (%) NCl (%) NC (%)
1 Halogenated alkane −16.0 0.8 83.2
2 Halogenated alkene −44.0 53.9 2.1
3 Halogenated aromatic 0.6 90.6 8.7
4 Halogenated aldehyde −7.3 64.9 27.8
5 Halogenated ketone 0.0 93.0 7.0
6 Halogenated carboxylic acid 36.5 55.6 7.9
Sensitivity of various descriptors to log P by DBP classes, depicted by 6 sets of 3 shaded vertical bars for group number 1, 2, 3, 4, 5, and 6. Each set consists of bars representing E(LUMO), NCl, and NC.

Figure 2.26 Sensitivity of various descriptors to log P by DBP classes.

(Source: Adapted from Chen (1999).)

Table 2.25 shows a major difference between the sensitivities of single‐bond compounds, such as halogenated alkanes, and those of the other DBP classes. Log P of a chlorinated alkane changes significantly with the length of the carbon chain, NC. However, for all the other classes of DBPs, which contain unsaturated π bonds, log P changes mostly with the number of chlorine atoms, NCl. In the absence of oxygen in the chlorinated alkanes, log P reflects the amount of hydrogen bonding between the chemical compound and the hydrogen in the water molecule. It appears that when chlorine attaches to carbon, it influences only the electron cloud of the attached carbon, and this electronic interaction is tightly bound; therefore, the number of chlorine atoms on the chlorinated alkane carbon chain does not significantly influence the strength of hydrogen bonds. However, in DBP compounds containing unsaturated π bonds, the chlorine atom attracts the electron cloud to itself, since most compounds containing unsaturated bonds may have a resonance structure through which the influence of chlorine is transmitted. As a result, the strength of the hydrogen bond between the chemical compound and the water molecule is significantly influenced by the number of chlorine atoms, NCl. Therefore, log P of these classes is more sensitive to the number of chlorine atoms.
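Signed percentages like those in Table 2.25, whose absolute values sum to 100 for each class, can be obtained from rank correlations between each input and the simulated output. The sketch below squares the rank correlation of each descriptor with log P, normalizes the squares to 100%, and keeps the sign of the correlation. Whether this matches the exact algorithm used by Crystal Ball is an assumption, and the coefficients and distributions are hypothetical.

```python
import numpy as np

def ranks(a):
    """Ranks 0..n-1 of a 1-D array (continuous data, so ties are not handled)."""
    return np.argsort(np.argsort(a))

def contribution_to_variance(inputs, output):
    """Signed % contributions from squared rank correlations, normalized to 100."""
    rho = {name: np.corrcoef(ranks(x), ranks(output))[0, 1] for name, x in inputs.items()}
    total = sum(r ** 2 for r in rho.values())
    return {name: 100.0 * np.sign(r) * r ** 2 / total for name, r in rho.items()}

rng = np.random.default_rng(7)
M = 10_000
inputs = {
    "ELUMO": rng.normal(0.8, 0.3, M),
    "NCl": rng.normal(2.0, 0.5, M),
    "NC": rng.normal(3.0, 0.5, M),
}
# Hypothetical model in which NC dominates, as for halogenated alkanes.
log_p = 0.5 - 0.6 * inputs["ELUMO"] + 0.1 * inputs["NCl"] + 1.2 * inputs["NC"]

sens = contribution_to_variance(inputs, log_p)
print(sens)
```

In this synthetic case the ELUMO contribution is negative and the NC contribution is the largest, which is the same qualitative pattern as row 1 of Table 2.25.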

2.5.5 Computer Software for Quantitative Risk Assessment

Many software packages are available for conducting risk assessment. Table 2.26 lists software for quantitative risk assessment.

Table 2.26 Software for quantitative risk assessment.

Software Type of analysis Creator
@RISK Uncertainty and risk analysis Palisade Corporation
Analytica Uncertainty and risk analysis Lumina Decision Systems, Inc.
GENII/SUNS Uncertainty and sensitivity analysis Sandia National Laboratories; Pacific Northwest Laboratory
MOUSE Uncertainty analysis EPA, Risk Reduction Engineering Laboratory
ORMONTE Uncertainty and sensitivity analysis Oak Ridge National Laboratory
Risk Calc Uncertainty and risk analysis Applied Biomathematics
SimLab Uncertainty and sensitivity analysis Simlab
TAM3 Uncertainty and sensitivity analysis Oak Ridge National Laboratory
Uncertainty Analysis Uncertainty analysis Integrated Sciences Group
Crystal Ball® Uncertainty and sensitivity analysis Oracle

2.6 Exercise

2.6.1 Questions

  1. What is the meaning of typical molecular descriptors (MDs) such as log P, ELUMO, and EHOMO?
  2. What is quantitative structure–activity relationship (QSAR)?
  3. How can QSAR be used in HRA?
  4. What are the typical molecular descriptors?
  5. What EPA QSAR tools can be used for HRA?
  6. What are the four steps in risk assessment?
  7. What are the four different ways to identify the hazards of a chemical compound?
  8. How do you calculate RfD?
  9. How do you calculate BMD and BMDL?
  10. What are the differences between assessment procedures of carcinogenic and noncarcinogenic chemical compounds?
  11. How is the carcinogenic risk of a chemical compound calculated?

2.6.2 Calculation

  1. Wang et al. (2014) reported that halobenzoquinones (HBQs), a class of DBPs, are likely to be carcinogenic and found that the IC50 values of HBQs are always less than those of their hydroxylated counterparts, the halo‐hydroxyl‐benzoquinones (OH‐HBQs). For toxicity to the Chinese hamster ovary CCL‐61 (ATCC, Manassas, VA) cell (CHO‐K1), the IC50 values in μM for 24‐hour incubation of DCBQ, DCMBQ, TriCBQ, and DBBQ are 27.3, 11.4, 45.5, and 19.8, respectively. The IC50 values in μM of their corresponding hydroxylated compounds, OH‐HBQs, are 61.0, 20.4, 64.4, and 42.8, respectively. Answer the following questions:
    1. Which class of the HBQs or OH‐HBQs is more toxic? Why?
    2. Calculate the corresponding BMDL values using the EPA BMDS 2.6.0.1 software. Is there any correlation between the BMDL and the IC50?
    3. Explain why a correlation does or does not exist between these data for both HBQs and OH‐HBQs.
  2. Chloroform is one of the disinfection by‐products of chlorinated water. Calculate the BMD and BMDL of chloroform at the 95% confidence level using the EPA BMDS 2.6.0.1 software, which is available at https://www.epa.gov/bmds/download-benchmark-dose-software-bmds#installing
  3. The current FDA guideline for methyl mercury in fish is 1 ppm (or 1 mg of methyl mercury/kilogram of fish). Answer the following:
    1. Calculate the dose of mercury (mg/kg body weight of the person) if a person were to eat a tuna fish sandwich (with 3 oz of tuna fish meat) with methyl mercury levels of 0.4 ppm.
    2. How does the amount of methyl mercury in a 4‐oz sandwich compare with the maximum amount that a pregnant woman should eat on a daily basis, according to the official EPA guidelines? Cite your source.
    3. How many 4‐oz tuna fish sandwiches would a person need to eat in order to have a 50% probability of dying from acute methyl mercury poisoning? Assume that humans and rats (or mice) have equal sensitivity to methyl mercury, that oral exposure in rats (or mice) is comparable with oral exposure in humans, and that all mercury in the fish is in the form of methyl mercury.
  4. Study the EPA example of the human health risk assessment of isodecyl acrylate. Answer the following:
    1. What uncertainties could be introduced into the HRA?
    2. How would you outline an MCS procedure to quantify these uncertainties?
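As a hedged illustration of the unit conversions behind item 3 and of the Monte Carlo simulation (MCS) idea in item 4, the sketch below computes a single‐serving dose and then propagates uncertainty in body weight through the same formula. The 70 kg body weight, the normal distribution with a 12 kg standard deviation, the 30 kg truncation, and all function names are illustrative assumptions, not values taken from the text or from EPA guidance.

```python
import random
import statistics

OZ_TO_KG = 0.0283495  # 1 avoirdupois ounce expressed in kilograms


def dose_mg_per_kg(conc_ppm, fish_oz, body_weight_kg):
    """Dose (mg methyl mercury per kg body weight) from one serving.

    conc_ppm is mg of methyl mercury per kg of fish (1 ppm = 1 mg/kg).
    """
    intake_mg = conc_ppm * fish_oz * OZ_TO_KG
    return intake_mg / body_weight_kg


def simulate_doses(conc_ppm, fish_oz, n=10_000, seed=0):
    """Monte Carlo sketch: sample body weight and return the resulting doses.

    ASSUMPTION: body weight ~ Normal(70 kg, 12 kg), truncated below at 30 kg.
    """
    rng = random.Random(seed)
    doses = []
    while len(doses) < n:
        bw = rng.gauss(70.0, 12.0)
        if bw >= 30.0:  # discard implausibly low body weights
            doses.append(dose_mg_per_kg(conc_ppm, fish_oz, bw))
    return doses


if __name__ == "__main__":
    point = dose_mg_per_kg(0.4, 3.0, 70.0)  # 70 kg is an assumed body weight
    doses = sorted(simulate_doses(0.4, 3.0))
    print(f"point estimate:  {point:.6f} mg/kg")
    print(f"median:          {statistics.median(doses):.6f} mg/kg")
    print(f"95th percentile: {doses[int(0.95 * len(doses))]:.6f} mg/kg")
```

A fuller MCS would also sample the concentration and serving size from distributions justified by data, and would report how sensitive the output percentiles are to each input.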

2.6.3 Assignment

  1. Study the Crystal Ball user manual.
  2. Study the SPSS user manual.
  3. Study EPA guidelines on Monte Carlo simulation.
  4. Go to the website http://apps.who.int/gho/data/node.main.1?lang=en and download the following Excel databases on World Health Statistics:
    • Mortality and global health estimates
    • Cause‐specific mortality and morbidity
    • Selected infectious diseases
    • Health service coverage
    • Risk factors
    • Health systems
    • Health equity monitor
    • Demographic and socioeconomic statistics

Under the heading “Mortality and global health estimates,” you will be able to download the database on life expectancy for different countries. Use SPSS to conduct the following:

  1. Calculate the mean and standard deviation of life expectancy for developed and for developing countries. What do the means and standard deviations suggest when the two groups of countries are compared?
  2. Income and environmental quality are the two major indicators. Conduct a correlation analysis between life expectancy and income, an air quality indicator, or both. Answer the following:
    1. Is life expectancy correlated with income, with the air quality indicator, or with both?
    2. Use Crystal Ball to conduct a sensitivity analysis and state which indicator is the more important predictor of life expectancy.
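The assignment calls for SPSS, but the same descriptive statistics and correlation coefficients can be cross‐checked in a few lines of Python. The sketch below is a minimal example using the standard library only; the numbers are made‐up placeholders rather than WHO data, and the variable and function names are hypothetical.

```python
import math
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # HYPOTHETICAL placeholder values; replace with the downloaded data.
    life_expectancy = [81.0, 79.5, 68.2, 62.4, 74.1]  # years
    income = [45.0, 38.0, 6.5, 2.1, 12.0]             # thousand USD per capita
    pm25 = [10.0, 12.0, 55.0, 60.0, 35.0]             # annual mean, ug/m3

    mean, sd = summarize(life_expectancy)
    print(f"life expectancy: mean = {mean:.1f}, SD = {sd:.1f}")
    print(f"r(life expectancy, income) = {pearson_r(life_expectancy, income):.2f}")
    print(f"r(life expectancy, PM2.5)  = {pearson_r(life_expectancy, pm25):.2f}")
```

A correlation coefficient only measures linear association; comparing the two coefficients is a first screen, not a substitute for the sensitivity analysis requested above.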

2.6.4 Projects

2.6.4.1 Xiongan Project

Collect water quality and air quality data for Xiongan and conduct a health risk assessment of the following major primary pollutants. Toxicity data can be found in the US EPA Exposure Factors Handbook 2011 Edition:

  1. Air pollutants: O3, SO2, NO2, PM2.5, and PM10
  2. Water pollutants: total trihalomethanes (TTHM) and five haloacetic acids (HAA5)
  3. Soil pollutants: (a) toxic metals (Hg, Cr, and As), (b) pesticides (parathion, dichlorvos, and aldrin), and (c) herbicides (glyphosate, atrazine, and flazasulfuron)

Answer the following questions:

  1. What are the log P, ELUMO, and EHOMO values of the water pollutants?
  2. How do these molecular descriptors contribute to their toxicity?
  3. What are the health risks of these pollutants?
  4. Estimate the number of lung cancer and liver cancer cases in Xiongan attributable to air and water pollutants over the next 5, 10, 20, and 50 years.
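One possible starting point for questions 3 and 4 is the widely used low‐dose linear relationship, in which excess lifetime cancer risk is the chronic daily intake (CDI) multiplied by a cancer slope factor. The sketch below assumes drinking‐water ingestion only; the function names and every numerical input are hypothetical placeholders, and real slope factors and exposure factors should be taken from the sources cited in this chapter.

```python
def chronic_daily_intake(conc_mg_per_l, intake_l_per_day, exposure_days_per_year,
                         exposure_years, body_weight_kg, averaging_days):
    """CDI in mg/(kg*day) for ingestion of drinking water."""
    total_mg = (conc_mg_per_l * intake_l_per_day
                * exposure_days_per_year * exposure_years)
    return total_mg / (body_weight_kg * averaging_days)


def cancer_risk(cdi, slope_factor):
    """Excess lifetime cancer risk under the low-dose linear assumption.

    slope_factor is in (mg/(kg*day))^-1.
    """
    return cdi * slope_factor


def expected_excess_cases(risk, population):
    """Expected number of excess lifetime cancer cases in a population."""
    return risk * population


if __name__ == "__main__":
    # ALL VALUES BELOW ARE HYPOTHETICAL PLACEHOLDERS.
    cdi = chronic_daily_intake(
        conc_mg_per_l=0.07,          # pollutant concentration in water
        intake_l_per_day=2.0,        # daily water intake
        exposure_days_per_year=350,
        exposure_years=30,
        body_weight_kg=70.0,
        averaging_days=70 * 365,     # lifetime averaging time for carcinogens
    )
    risk = cancer_risk(cdi, slope_factor=0.01)
    print(f"CDI  = {cdi:.3e} mg/(kg*day)")
    print(f"risk = {risk:.3e}")
    print(f"excess cases per million people = "
          f"{expected_excess_cases(risk, 1_000_000):.1f}")
```

The result is a lifetime risk; spreading it over 5, 10, 20, or 50 years requires additional assumptions about population growth, exposure trends, and latency that this sketch does not make.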

2.6.4.2 Community Project

  1. Repeat the analysis above for your hometown and estimate the number of lung cancer and liver cancer cases in your city attributable to air, water, and soil pollutants over the next 5, 10, 20, and 50 years.
  2. Rank the health risks of the air, water, and soil pollutants and identify the major contributors among them.
  3. Make recommendations to the local environmental protection agencies on the priority of major EEIS to be built in order to reduce these risks to below 10−6 over the next 5, 10, 20, and 50 years.

References

  1. Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. Proceedings of the Second International Symposium on Information Theory (ed. B.N. Petrov and F. Csaki), 267–281. Budapest: Akademiai Kiado.
  2. Chen, R.Z. (1999). Development, Validation, and Uncertainty Analysis of Quantitative Structure and Activity Relationship Models for log P of Disinfection By‐products. PhD dissertation, Florida International University, Miami, FL.
  3. Chen, R.Z., Tang, W.Z., and Sillanpää, M. (2016). Prediction of log P of halogenated alkanes by their ELUMO and number of chlorine and carbon. Environmental Processes 1: 73–91.
  4. Chinese Ministry of Environmental Protection (MEP). (2016). Ambient Air Quality Standards. http://english.mep.gov.cn/Resources/standards/Air_Environment/quality_standard1/201605/t20160511_337502.shtml (accessed 29 November 2016).
  5. Coleman, H.W. and Steele, Jr., W.G. (1989). Planning an experiment: General uncertainty analysis. Experimentation and Uncertainty Analysis for Engineers. New York: Wiley Blackwell.
  6. Cothern, C.R., Coniglio, W.A., and Marcus, W.L. (1986). Estimating risk to human health. Environmental Science and Technology 20 (2): 111–116.
  7. Cox, M.G., Dainton, M.P., and Harris, P.M. (2001). Software support for metrology best practice guide. Uncertainty and statistical modeling. Technical Report. Teddington: National Physical Laboratory.
  8. Crump, K. (1984). A new method for determining allowable daily intakes. Fundamental and Applied Toxicology 4: 854–871.
  9. Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross‐validation. Journal of the American Statistical Association 78 (382): 316–331.
  10. EPA (U.S. Environmental Protection Agency) (1995). Guidelines for Neurotoxicity Risk Assessment. EPA/630/R‐95/001F. Risk Assessment Forum, U.S. EPA, Washington, DC [online]. www.epa.gov/ncea/raf/pdfs/neurotox.pdf (accessed 30 March 2005).
  11. Eriksson, L., Jaworska, J., Worth, A.P., et al. (2003). Methods for reliability and uncertainty assessment and for applicability evaluations of classification and regression‐based QSARs. Environmental Health Perspectives 111: 1361–1375.
  12. GUM (1993). Guide to the Expression of Uncertainty in Measurements. Geneva: ISO.
  13. Hansch, C., Leo, A., and Hoekman, D. (1995). Exploring QSAR—Fundamentals and Applications in Chemistry and Biology. Washington, DC: American Chemical Society.
  14. IPCC (2013). Summary for Policymakers. In: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (ed. T.F. Stocker, D. Qin, G.‐K. Plattner, et al.). Cambridge/New York: Cambridge University Press.
  15. IPCC (2015). Fifth assessment. AR5.
  16. Jiang, Z., Yang, H., Guan, Y., et al. (2000). Chlorine effect on molecular descriptors of disinfection by‐products. Environmental Science 21 (5): 51–54.
  17. Kimmel, C. and Gaylor, D. (1988). Issues in qualitative and quantitative risk analysis for developmental toxicology. Risk Analysis 8: 15–21.
  18. Leo, A. and Hansch, C. (1999). Role of hydrophobic effects in mechanistic QSAR. Perspectives in Drug Discovery 17: 1–25.
  19. Linhart, H. and Zucchini, W. (1986). Model Selection. New York: Wiley.
  20. Martinez, W.L. (2002). Computational Statistics Handbook with MATLAB. Boca Raton: Chapman & Hall/CRCnetBase.
  21. Massart, D.L., Vandeginste, B.G.M., Buydens, L.M.C., et al. (1997). Handbook of Chemometrics and Qualimetrics: Part A. Amsterdam: Elsevier Science.
  22. National Bureau of Statistics of China (2012). China Statistical Yearbook. Beijing: China Statistics Press. http://www.stats.gov.cn/tjsj/ndsj/2012/indexeh.htm (accessed January 2018).
  23. NTP (National Toxicology Program) (1988). Toxicology and carcinogenesis studies of chlorodibromomethane. CAS No. 124‐48‐1 in F344/N rats and B6C3F1 mice (gavage studies). TR‐282. https://ntp.niehs.nih.gov/results/pubs/longterm/reports/longterm/index.html (accessed 21 March 2018).
  24. OECD (2007). Guidance document on the validation of (Quantitative) Structure–Activity Relationships [(Q)SAR] models. Paris: Organization for Economic Co‐Operation and Development.
  25. Osten, D. (1988). Selection of optimal regression models via cross‐validation. Journal of Chemometrics 2: 39–48.
  26. Pierotti, A.J. (1999). Chlorine effect on molecular descriptors of disinfection by‐products. Master of Science Degree thesis. Florida: Florida International University.
  27. Pontius, F.W. (1990). Toxicology and drinking water regulations. Journal of American Water Works Association 90: 17–19.
  28. Stone, M. (1998). Akaike’s criteria. In: Encyclopedia of Biostatistics (eds. P. Armitage and T. Colton). New York: Wiley.
  29. Tang, W.Z. and Wang, F. (2010). Chlorine effect on quantum molecular descriptors of disinfection by‐products‐chlorinated alkanes. Chemosphere 78: 914–921.
  30. Tang, W.Z., Wang, F., Miralles‐Wilhelm, F., and Damisse, E. (2009). Uncertainty analysis of rating equations of submerged orifice flow at gated spillway. Conference on Reliability and Quality in Design, the International Society of Science and Applied Technologies (ISSAT) and the IEEE Reliability Society, San Francisco, CA (6–8 August 2009), 165–169.
  31. The US EPA (1997). Guiding principles for Monte Carlo analysis. EPA/630/R‐97/001.
  32. The US EPA (2016). NAAQS table. https://www.epa.gov/criteria-air-pollutants/naaqs-table (accessed 29 November 2016).
  33. U.S. Environmental Protection Agency (EPA) (1994). Drinking water maximum contaminant level goals and national primary drinking water regulations for lead and copper. Federal Register 59(125): 33860–33864.
  34. US EPA (1992). Guidelines for environment exposure assessment. Federal Register 57: 22888–22936.
  35. U.S. EPA (1996). Safe drinking water act standards and health advisories. Office of Water 4305. EPA‐822‐B‐96‐002.
  36. US EPA (1997). National primary drinking water regulations: disinfectants and disinfection byproducts notice of data availability; proposed rule. EPA‐815‐Z‐98‐005.
  37. US EPA (2005). Guidelines for carcinogen risk assessment. Federal Register 70 (66): 17765–17817. http://www.epa.gov/raf/pubalpha.htm (accessed 15 December 2017).
  38. US EPA (2006). Ultraviolet disinfection guidance manual for the final long term 2 enhanced surface water treatment rule. EPA 815‐R‐06‐007, November 2006.
  39. US EPA (2011). Exposure factors handbook 2011 edition (Final). EPA/600/R‐09/052F. Washington, DC: US Environmental Protection Agency.
  40. US EPA (2012). Benchmark dose technical guidance, risk assessment forum. 20460, EPA/100/R‐12/001. Washington, DC: US Environmental Protection Agency, June 2012.
  41. US EPA (2015). User’s Manual for Benchmark Dose Software 2.6. Washington, DC: US Environmental Protection Agency.
  42. US EPA (2016). Benchmark dose tools. https://www.epa.gov/bmds (accessed 20 November 2016).
  43. Wang, W., Qian, Y.C., Li, J.H., et al. (2014). Analytical and toxicity characterization of halo‐hydroxyl‐benzoquinones as stable halobenzoquinone disinfection byproducts in treated water. Analytical Chemistry 86: 4982–4988.
  44. World Health Organization (WHO) (2014). The world cancer report. Geneva: WHO.
  45. World Health Organization (WHO) (2016). World health statistics 2016: monitoring health for the SDGs, sustainable development goals. Geneva: WHO.
  46. Yang, H., Jiang, Z., Guan, Y., et al. (2000). Correlations between SDWA standards and health advisories concentrations of DBPs and molecular descriptors. Water and Wastewater Engineering 26 (1): 22–25.
  47. Zhao, P., Dai, M., Chen, W., and Li, N. (2010). Cancer trends in China. Japanese Journal of Clinical Oncology 40 (4): 281–285.