Volume 35, Issue 3, p. 513–525
ET&C Focus

Environmental surveillance and monitoring—The next frontiers for high-throughput toxicology

Anthony L. Schroeder (Corresponding Author)

Water Resources Center, University of Minnesota–Twin Cities, St. Paul, Minnesota, USA

National Health and Environmental Effects Research Laboratory, Mid-Continent Ecology Division, US Environmental Protection Agency, Duluth, Minnesota, USA

Address correspondence to [email protected]

Gerald T. Ankley

National Health and Environmental Effects Research Laboratory, Mid-Continent Ecology Division, US Environmental Protection Agency, Duluth, Minnesota, USA

Keith A. Houck

Office of Research and Development, National Center for Computational Toxicology, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA

Daniel L. Villeneuve

National Health and Environmental Effects Research Laboratory, Mid-Continent Ecology Division, US Environmental Protection Agency, Duluth, Minnesota, USA

First published: 28 February 2016

Abstract

High-throughput toxicity testing technologies along with the World Wide Web are revolutionizing both generation of and access to data regarding the biological activities that chemicals can elicit when they interact with specific proteins, genes, or other targets in the body of an organism. To date, however, most of the focus has been on the application of such data to assessment of individual chemicals. The authors suggest that environmental surveillance and monitoring represent the next frontiers for high-throughput toxicity testing. Resources already exist in curated databases of chemical–biological interactions, including highly standardized quantitative dose–response data generated from nascent high-throughput toxicity testing programs such as ToxCast and Tox21, to link chemicals detected through environmental analytical chemistry to known biological activities. The adverse outcome pathway framework, with its associated knowledge base for linking molecular- or pathway-level perturbations of biological systems, through a series of measurable biological changes, to the adverse outcomes traditionally considered in risk assessment and regulatory decision-making, provides a critical link between activity and hazard. Furthermore, environmental samples can be directly analyzed via high-throughput toxicity testing platforms to provide an unprecedented breadth of biological activity characterization that integrates the effects of all compounds present in a mixture, whether known or not. Novel application of these chemical–biological interaction data provides an opportunity to transform scientific characterization of potential hazards associated with exposure to complex mixtures of environmental contaminants. Environ Toxicol Chem 2016;35:513–525. © 2016 SETAC

High-Throughput Toxicology

In 2007, the National Research Council published a report entitled “Toxicity Testing in the 21st Century: A Vision and a Strategy,” which called for a paradigm shift in regulatory toxicity testing and chemical risk assessment 1. The authors advocated for movement away from traditional reliance on whole-animal toxicity testing and toward greater use of in vitro and micro-scale bioassays that could be conducted both rapidly and cost-effectively using modern robotics, computing, and miniaturization. While this could include in vivo testing with small animal models such as zebrafish embryos or Caenorhabditis elegans, it was envisioned that the new testing strategy would largely employ batteries of in vitro bioassays that could effectively screen chemicals for their ability to interact with specific molecular targets (e.g., enzymes, receptors, transporters, transcription factors) or biological pathways whose perturbation, if sufficiently severe, could be expected to lead to toxicologically relevant outcomes. In response, the US Environmental Protection Agency (USEPA) established its ToxCast research program, while the National Institute of Environmental Health Sciences, the Food and Drug Administration, the National Institutes of Health, and the USEPA pooled federal resources and expertise to establish the Tox21 program 2, 3. Both efforts were aimed at generating the data and developing the methods, models, and tools to bring about the envisioned paradigm shift in regulatory toxicity testing.

The ToxCast program has employed a large battery of high-throughput assays, primarily commercially available in vitro assays but also a few custom and in vivo assays, that collectively cover a relatively large breadth of biological space 2. To date, the ToxCast program has generated pathway-based biological effects data for approximately 2000 individual chemicals using more than 700 different bioassays. In contrast, the Tox21 program employs a more limited range of assays (approximately 50) but has tested a larger chemical space, resulting in pathway-based biological effects data for approximately 10 000 individual chemicals 3.

Although these novel sets of largely in vitro data have their limitations (e.g., many of the systems used lack the ability to metabolize test chemicals), they nonetheless represent tremendous value added in terms of information available to support chemical risk assessments. Given the promising strides that have been made in this 21st-century approach to chemical screening, it is not implausible to think that the majority of individual chemicals in commerce could be screened for a broad spectrum of toxicologically relevant biological activities within the next decade. This is a significant achievement when one considers the almost total lack of traditional toxicity data available for the majority of chemicals in commerce 4, 5.

Coming to Terms

Adverse outcome pathway—A conceptual framework that portrays existing knowledge concerning the scientifically supported linkage between molecular-level perturbation of a particular biological target or pathway in an organism with adverse outcomes on survival, growth, reproduction, health, well-being, or other end points traditionally viewed as relevant to risk assessment and regulatory decision-making. These linkages are supported by both plausibility based on knowledge of the “normal” functioning of biological systems and weight of evidence from studies in which specific targets are perturbed.

Chemical–pathway interaction network—A graphical representation of reported interactions between specific chemicals and biological targets such as genes, proteins, and functional pathways in which chemicals and targets are represented as nodes and interactions are represented as edges between a chemical and target node; useful for visualizing collective/interactive effects of multiple chemicals on a diversity of biological targets and amenable to network analysis.

Exposure–activity ratio—Ratio comparing a measured chemical concentration, usually from biomonitoring data, to the chemical concentration needed to cause a response in an in vitro bioassay.

Exposure–activity ratio = measured environmental concentration / half-maximal activity concentration (AC50)

Half-maximal activity concentration—Concentration at which a 50% increase in response is observed in an in vitro bioassay; also known as the AC50 value.

High-throughput toxicology—A toxicity testing paradigm that focuses on the use of in vitro biochemical-based and cell-based assays and small nonrodent animal models using miniaturization and automation to facilitate testing of hundreds to thousands of chemicals in a matter of days.

High-throughput toxicology biological effects prediction or bio-effects prediction—Term describing the use of various chemical–biological interaction data sources to translate analytical chemistry results into hypotheses regarding expected biological effects, most often at the molecular, biochemical, or pathway level based on existing knowledge from curated literature sources and/or compilations of high-throughput toxicology data.

High-throughput toxicology biological effects surveillance or bio-effects surveillance—Term describing the application of large suites of in vitro bioassays, generally implemented using high-throughput technologies, to directly characterize the integrated biological effects of a complex mixture. The same term may also be applied to the use of “omics” or other biological approaches that allow for rapid and/or concurrent screening of the effects of a mixture on a large number of biological pathways.

The Next Frontier for High-Throughput Toxicology

Demonstrating the feasibility of providing pathway-based data for the majority of chemicals in commerce is a key step in chemical safety assessment. However, the “elephant in the room” as it pertains to chemical risk assessment is the recognition that organisms in the environment, including humans, are exposed to complex mixtures, not individual chemicals. Evaluating the potential human health and ecological risks associated with exposures to chemical mixtures has its own unique set of challenges. To adequately assess the risks associated with complex mixtures, it is critical to identify which contaminants are present and the potential hazards they may pose to exposed organisms. In theory, screening of predesigned chemical mixtures using high-throughput assays, such as those associated with ToxCast, could provide data necessary to assess the risks of complex mixtures. However, it is neither practical nor feasible to screen the nearly infinite number of combinations of relevant chemical mixtures that could arise from the more than 80 000 chemicals in commerce, not to mention their breakdown and biotransformation products. We propose that high-throughput toxicology data generated for individual chemicals and/or via direct screening of environmental extracts using the approaches employed in programs such as ToxCast and Tox21 can help address some of the most challenging aspects of mixture toxicology. Additionally, the widespread availability of knowledge bases curating chemical–gene, chemical–protein, and/or chemical–pathway interaction data from peer-reviewed literature and other sources (e.g., the Comparative Toxicogenomics Database 6) has created important opportunities to address complex mixture uncertainties, which the field has just started to tap into. 
Below, we describe 2 generalized approaches that utilize readily accessible sources of pathway-based data from online knowledge bases, as well as high-throughput toxicity testing, to provide insights as to the potential biological effects of complex environmental mixtures (Figure 1).

Overview of workflows for utilizing pathway-based data sources for qualitative bio-effects prediction (A), semiquantitative bio-effects prediction (B), or bio-effects surveillance (C). Databases and assay platforms cited are representative examples, not comprehensive. AC10/AC50 = 10% and 50% activity concentrations, respectively; AOP = adverse outcome pathway; CTD = Comparative Toxicogenomics Database; AOP-KB = Adverse Outcome Pathway Knowledge Base; STITCH = Search Tool for Interactions of Chemicals; PFCs = perfluorinated compounds.

Employing pathway-based data for single chemicals to predict biological effects

Environmental monitoring traditionally has relied on chemical-specific instrumental analyses to measure, in some instances, hundreds of chemicals present in complex environmental samples. This has resulted in vast chemical data sets that are useful for identifying which chemicals may be present at a site or in a particular environmental matrix and at what concentrations. However, such analytical characterization provides no information about the potential hazard of these chemicals. A “bio-effects prediction” approach can leverage existing knowledge of chemical–pathway interactions to help link chemical occurrence data to potential hazards (Figure 1A). Currently, there are several publicly available sources providing in vivo chemical interaction data (Table 1). Some of these, such as the Comparative Toxicogenomics Database, have curated the peer-reviewed literature and assembled information on the chemical–gene, chemical–protein, and chemical–disease relationships reported from a diverse array of controlled laboratory experiments 6. These databases can be searched for known interactions, such as a reported chemical binding to a particular protein, for all chemicals detected at a site or in an environmental sample. Further, it is possible to use the reported chemical–gene or chemical–protein interactions to develop, for specific sites, what we term “chemical–pathway interaction networks” (7, 8; A. Schroeder, USEPA, Duluth, MN, unpublished data). These networks allow for a systematic and integrated analysis of gene or protein targets predicted to be impacted by 1 or more detected chemicals in a mixture, based on a priori knowledge. How the networks are analyzed to identify relevant biological targets will vary with problem formulation, specific research interests, or hypotheses.
We have, for example, prioritized gene/protein targets on the basis of which targets have the greatest number of detected chemicals interacting with them, the assumption being that the greater the number of chemical interactions, the greater the chance a target would be impacted (e.g., Martinović-Weigelt et al. 7, Cavallin et al. 8). The major advantage of this approach, especially in the context of using publicly available data, is that it provides the broadest in vivo chemical and gene/protein interaction information available. Furthermore, because these databases have such wide chemical and biological coverage, collectively they provide a strong weight of evidence for the likelihood of a chemical or class of chemicals to interact with a gene or protein. The major disadvantage of the approach is that the available interaction data are not consistent in terms of their study designs (e.g., species utilized, tissues examined) and often have limited dose–response information, making quantitative comparisons of relative potencies or effect concentrations very difficult. Therefore, the databases represented in Table 1 can typically provide only a qualitative indication of potential biological effects associated with chemical exposure at a site. Despite these limitations, and although the approach will not always identify the correct hazards, the methodology does provide managers with a way to begin prioritizing hazard-screening efforts at a site when only chemical monitoring data are available.
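The target-ranking step described above can be sketched in a few lines of code. This is a minimal illustration, assuming the interaction data have already been exported as (chemical, protein) pairs from a database batch query; the chemical and gene names below are purely hypothetical examples, not real query results.

```python
from collections import defaultdict

# Hypothetical chemical-protein interactions, as might be exported from a
# batch query of a curated database; names are illustrative only.
interactions = [
    ("chemical_1", "ESR1"), ("chemical_2", "ESR1"), ("chemical_3", "ESR1"),
    ("chemical_1", "CYP19A1"), ("chemical_4", "SLC6A4"),
]

# Degree of each protein node = number of distinct detected chemicals
# reported to interact with that target.
chemicals_per_target = defaultdict(set)
for chemical, target in interactions:
    chemicals_per_target[target].add(chemical)

# Rank targets by degree, highest first; higher-degree targets are
# hypothesized to be more likely to be perturbed by the mixture.
ranked = sorted(chemicals_per_target.items(),
                key=lambda item: len(item[1]), reverse=True)
for target, chems in ranked:
    print(target, len(chems), sorted(chems))
```

In practice the interaction list would come from a site-specific chemical detection list cross-referenced against a curated database, and the ranking would feed into network visualization and further prioritization.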

Table 1. Sources of (primarily) in vivo chemical interaction data and database attributes
Database Database description Data description Database URL
Comparative Toxicogenomics Database (CTD) 6 A multiple-species, manually curated database containing chemical–gene/protein interactions and chemical–disease relationships from published literature More than 10 000 chemicals; more than 37 000 genes; more than 1 million curated chemical–gene interactions; more than 1 million chemical–disease associations http://ctdbase.org
Search Tool for Interactions of Chemicals (STITCH) 28 A database containing interactions information from metabolic pathways, crystal structures, binding experiments, and drug–target relationships More than 68 000 chemicals with interactions for 1.5 million genes from 373 genomes http://stitch.embl.de
Toxic Exposome Database 29 A database containing chemicals and chemical–gene target information from different databases, government documents, books, and scientific literature More than 3000 chemicals, including pollutants, pesticides, drugs, and food toxins; more than 42 000 chemical–gene target associations www.t3db.ca
ToxNet 30 A group of databases covering chemicals and drugs, diseases and the environment, environmental health, occupational safety and health, risk assessment and regulations, and toxicology; includes information on specific chemicals and mixtures, unknown chemicals, and specific toxic effects of chemicals In vitro and in vivo information on more than 50 000 environmental chemicals from different sources http://toxnet.nlm.nih.gov
Chemical Effects in Biological Systems (CEBS) 31 A database that houses several types of study data from academic, industrial, and government laboratories 6000 chemicals and drugs with more than 800 molecular (most microarray) datasets http://tools.niehs.nih.gov/cebs3
DrugMatrix 32 A database populated with gene expression results from in vivo rat or in vitro rat hepatocytes exposures to various drugs and chemicals; in vivo studies were carried out in multiple target organs More than 600 drugs and chemicals with gene expression data (mostly microarray) from more than 4000 studies https://ntp.niehs.nih.gov/drugmatrix/index.html
TG-GATEs 33 A database that contains in vivo (rat) and in vitro (human and rat hepatocyte) exposures; includes biochemistry, hematology, and histopathology findings Mostly gene expression (mostly microarray) and histopathology or cytotoxicity data for 170 compounds http://toxico.nibio.go.jp/English/index.html

In addition to databases focused largely on in vivo effects, other publicly available sources provide large amounts of in vitro chemical interaction data. For example, databases such as PubChem BioAssay 9 and ChEMBL 10 (Table 2) are growing repositories for in vitro chemical–gene and chemical–protein interaction (i.e., bioassay) data produced by high-throughput toxicity testing programs. These databases cover relatively large chemical and biological space and have an advantage over many of the in vivo databases in that they often provide quantitative in vitro effect concentrations for each chemical evaluated. Consequently, use of these in vitro interaction data not only allows for qualitative identification of pathway-specific biological activities, in a similar manner as with information from the in vivo databases, but also offers the potential to further prioritize biological targets by flagging chemicals detected at concentrations sufficient to modulate specific targets/pathways. A major limitation in employing these in vitro bioassay databases is the lack of standardization of data and assays, which arises because anyone screening chemicals with in vitro bioassays can deposit data. For example, the preparation of the chemicals (e.g., choice of solvent) and the experimental design (e.g., concentrations tested) can vary substantially from chemical to chemical and assay to assay. This makes comparing the relative potencies of detected chemicals difficult.

Table 2. Sources of (primarily) in vitro chemical interaction data and database attributes
Database Database description Data description Database URL
PubChem 9 A database of biological tests of small molecules generated through screening experiments, medicinal chemistry studies, chemical biology research, and drug discovery programs 47 million chemical compounds screened; 700 000 bioassays, more than 13 billion data points https://pubchem.ncbi.nlm.nih.gov
ChEMBL v20 10 A manually curated database containing binding, functional, and absorption, distribution, metabolism, excretion, and toxicity information for a large number of drug-like bioactive compounds More than 1 million chemicals screened; More than 1 million assays covering more than 10 000 targets; more than 13 million bioactivity measurements https://www.ebi.ac.uk/chembl
ToxNet 30 A group of databases covering chemicals and drugs, diseases and the environment, environmental health, occupational safety and health, risk assessment and regulations, and toxicology; includes information on specific chemicals and mixtures, unknown chemicals, and specific toxic effects of chemicals In vitro and in vivo information on more than 50 000 environmental chemicals from different sources http://toxnet.nlm.nih.gov
TG-GATEs 33 A database that contains in vivo (rat) and in vitro (human and rat hepatocyte) exposures; includes biochemistry, hematology, and histopathology findings Mostly gene expression (mostly microarray) and histopathology or cytotoxicity data for 170 compounds http://toxico.nibio.go.jp/English/index.html
USEPA iCSS Dashboard 11, 12 An interactive database for rapidly exploring the high-throughput chemical screening data generated by the Toxicity Forecaster (ToxCast) program and the Toxicity Testing in the 21st Century (Tox21) collaboration ToxCast: more than 1800 chemicals screened in more than 600 assays; Tox21: almost 10 000 chemicals screened in approximately 50 assays https://actor.epa.gov/dashboard
Connectivity Map (cMap) 34 A database containing gene expression results from human cells treated with various chemicals More than 1300 chemical compounds with more than 7000 gene expression profiles www.broadinstitute.org/cmap

Other available sources of high-throughput toxicity testing data, such as ToxCast or Tox21 11, 12, provide dose–response bioassay results but in a standardized manner. Although at present ToxCast and Tox21 may not provide as broad chemical or biological coverage as some of the other in vitro databases, the fact that the chemicals are prepared, delivered, and tested in a standardized manner makes these data very useful for identifying not only chemical–pathway interactions but also approximate effect concentrations and the relative potency of chemicals in terms of interaction with a given target. Consequently, these data can be applied in a more quantitative manner that considers both the concentrations and relative potencies of the chemicals detected in a sample and the type of interactions (e.g., chemical–protein) that could be expected.

As an example, the ToxCast program provides a half-maximal activity concentration (AC50) for each chemical screened. The AC50 is the concentration at which a chemical induces a 50% increase in activity compared with a reference. To determine whether a detected environmental chemical is present at a concentration sufficient to elicit an observed biological activity, an exposure–activity ratio can be used (Figure 1B) 13. Exposure–activity ratios compare the concentration of a chemical detected in an environmental sample to the AC50 for that particular chemical. Conceptually, when an exposure–activity ratio is equal to or greater than 1, the measured chemical can be assumed to be present at a high enough concentration to elicit the observed biological activity. In contrast, an exposure–activity ratio less than 1 suggests that the chemical may not be present at a sufficient concentration to produce significant bioactivity. We recognize that this interpretation is a dramatic simplification that does not account for factors influencing the actual dose of a chemical in an organism, such as chemical bioavailability and stability in the environmental matrix, as well as in vivo processes controlling absorption, distribution, metabolism, and elimination. In addition, the AC50 is not necessarily the exposure required for in vivo activity, because biological significance could require a much greater degree of modulation (e.g., the 90% activity concentration) or could arise at a much lower one (e.g., the 10% activity concentration). Finally, there are cases where the AC50 estimates themselves may be inaccurate as a result of using an automated data analysis pipeline to fit a regression model to data that violate key assumptions.
However, it does represent a reasonable first-tier approach for considering both concentration and potency in evaluating the effect of a mixture on defined biological targets. Relative exposure–activity ratio values can be used to rank and identify the highest-priority chemicals and pathways for further investigation. Overall, exposure–activity ratios several orders of magnitude less than 1 provide some confidence that a given chemical is probably not present at a bioactive concentration, whereas exposure–activity ratios greatly exceeding 1 provide reason for concern. Significantly, when using standardized data sources such as ToxCast to calculate exposure–activity ratios, the exposure–activity ratios for all chemicals interacting with a given target can be summed to estimate the total integrated potency of the mixture relative to potential interaction with that target/pathway. This can begin to account for chemical mixture effects based on models such as concentration addition 14 because all of the detected chemicals are activating a single biological target. Consequently, the exposure–activity ratio approach provides risk managers with chemicals and potential biological activities to prioritize for further research and subsequent monitoring.
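The exposure–activity ratio calculation and the per-target summation described above can be illustrated with a short sketch. All names, concentrations, and AC50 values below are invented for demonstration only; real applications would draw measured concentrations from site chemistry data and AC50 values from a standardized source such as ToxCast.

```python
# Measured site water concentrations (hypothetical; same units as AC50, e.g., ug/L).
measured = {"chem_a": 0.5, "chem_b": 2.0, "chem_c": 0.01}

# (target, chemical) -> AC50 from a standardized in vitro assay source (hypothetical).
ac50 = {
    ("ER_agonism", "chem_a"): 1.0,
    ("ER_agonism", "chem_b"): 10.0,
    ("AR_antagonism", "chem_c"): 0.5,
}

# Exposure-activity ratio = measured environmental concentration / AC50.
ears = {(target, chem): measured[chem] / conc50
        for (target, chem), conc50 in ac50.items()}

# Sum EARs across chemicals acting on the same target: a simple
# concentration-addition style aggregation of mixture potency.
summed = {}
for (target, _), ear in ears.items():
    summed[target] = summed.get(target, 0.0) + ear

# Flag targets whose summed EAR equals or exceeds 1 for priority follow-up.
flagged = sorted(target for target, total in summed.items() if total >= 1.0)
print(summed, flagged)
```

Ranking targets by their summed exposure–activity ratios, rather than by a hard cutoff of 1, is another reasonable way to use the same numbers for prioritization.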

The predictive approaches described have both strengths and weaknesses. The approaches rely on having, ideally, relatively extensive chemistry data available for a given site and some type of in vitro or in vivo biological effects data for the chemicals, which can be of admittedly varying quality/applicability (Table 3). Overall, however, the development of chemical–pathway interaction networks and the exposure–activity ratio approaches allows for connecting chemicals detected at a site to their potential collective, biological activities. These insights support a reasonable data-driven approach for screening sites for potential hazard when only chemistry data are available. However, an important limitation of the approach is that it reflects only the universe of chemicals actually measured at a given site and cannot account for the unknown chemicals that may be present. Even nontargeted chemical analysis approaches may miss relevant contributors because of limitations in both detection limits and chemical identification.

Table 3. Summary of the attributes of data sources useful for biological effects prediction
Attribute Group 1 (CTD, STITCH) Group 2 (ChEMBL, PubChem) Group 3 (iCSS Dashboard)
Curated from literature Yes No No
Primarily in vivo/in vitro In vivo In vitro In vitro
Dose response data Limited Often available Consistently available
Standardized assay platform No Partially – generally for small sets of chemicals (10s–100s) Yes – for large sets of chemicals (1000s)
Relative diversity of chemical–pathway interaction space covered (at present) Greatest (1000s of chemicals × 1000s of targets/pathways); most chemicals environmentally relevant. Moderate (100 000s of chemicals × 1000s of targets/pathways); fewer environmentally relevant chemicals Moderate (1000s of chemicals × 100s of targets/pathways); mix of chemicals that are and are not environmentally relevant
Phenotypic/disease anchoring Yes No No
  • CTD = Comparative Toxicogenomics Database; STITCH = Search Tool for Interactions of Chemicals.

High-throughput toxicology for biological effects surveillance

To address uncertainties associated with unmeasured chemicals, high-throughput toxicity testing can be used to directly screen complex environmental extracts for relevant biological activities and potency. Quite simply, environmental samples such as water, sediment, soil, or other matrices are collected, extracted, and then screened using the same type of miniaturized, automated, and standardized high-throughput assays employed by ToxCast and/or Tox21 in testing individual chemicals (Figure 1C). As opposed to the bio-effects prediction approaches, where only the measured contaminants present in environmental samples are accounted for, direct analysis of environmental extracts using high-throughput toxicity testing assay batteries can account for the unknown contaminants present (e.g., those for which analytical methods are lacking, metabolites, degradates). This “bio-effects surveillance” approach provides an integrated measure of the biological activity of an environmental sample across a broad range of pathways. In many ways, it is directly analogous to the whole-effluent testing with invertebrates and fish used as part of the National Pollutant Discharge Elimination System permitting program. However, rather than focusing on the integrated measure of apical impacts on survival, growth, or reproduction of test organisms, the high-throughput toxicity testing bio-effects surveillance approach involves screening of an environmental extract using a large battery of pathway-specific, generally in vitro, bioassays. While it is generally not feasible to provide complete coverage of every toxicologically relevant pathway that could lead to apical effects in vivo, what the high-throughput toxicity testing approach lacks in breadth of coverage it gains in terms of diagnostic specificity. For example, knowledge of biological targets being impacted can lend substantial insights into the types of (perhaps unmeasured) chemicals present at biologically relevant concentrations in a sample.
Further, this type of in vitro data can greatly aid in the selection of appropriate in vivo assays and endpoints to more fully characterize hazard at a given site. Accordingly, this type of approach can be a major asset for initial surveillance of a site aimed at identifying the potential activities or hazards that may be present as a means to direct future research and/or monitoring.

Connecting pathway interactions with hazard

Once relevant biological activities or targets have been identified or predicted using either the bio-effects surveillance or bio-effects prediction approaches outlined, a next critical step is to link those potential biological pathway perturbations to relevant hazards that are of regulatory or societal concern (e.g., potential impacts on survival, growth, reproduction, human disease). The adverse outcome pathway (AOP) framework provides an organizing structure for making those linkages 15. By mapping affected molecular targets to established AOPs present—for example, in the AOP Knowledge Base 16—potential hazards and the appropriate biological endpoints for site-specific monitoring can be identified.
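The mapping step above amounts to a lookup from flagged molecular targets to the AOPs, and hence the hazards, with which they are associated. A minimal sketch follows; the target names and AOP descriptions are illustrative placeholders in the spirit of the AOP Knowledge Base, not real AOP-KB records.

```python
# Hypothetical target-to-AOP lookup; entries are illustrative only.
target_to_aops = {
    "ER_agonism": ["Estrogen receptor agonism leading to reproductive impairment"],
    "AR_antagonism": ["Androgen receptor antagonism leading to impaired male development"],
}

def candidate_hazards(active_targets):
    """Map targets flagged by bio-effects prediction or surveillance
    to AOP-linked hazards; targets with no known AOP are skipped."""
    hazards = []
    for target in active_targets:
        hazards.extend(target_to_aops.get(target, []))
    return hazards

# Targets with known AOPs yield hazard hypotheses; unknown targets yield none.
print(candidate_hazards(["ER_agonism", "uncharacterized_target"]))
```

In an actual assessment, the returned hazards would define the apical endpoints and in vivo assays most worth deploying at the site.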

Application of bio-effects prediction or bio-effects surveillance along with AOPs facilitates a hypothesis-driven approach to site-specific environmental assessments. This allows risk-management professionals to focus their attention and resources on the biological effects and activities most relevant to a given site, its unique exposure matrix, and the chemical classes most likely to be responsible for those activities. This is particularly powerful in surveillance scenarios where background information regarding specific effects of concern at a site is lacking. Coupled with AOP knowledge, the use of high-throughput toxicity testing provides unprecedented tools for characterizing and assessing potential impacts of chemical mixtures on human health and the environment.

Moving Toward the Next Frontier

The practical application of the bio-effects prediction and bio-effects surveillance approaches described can be illustrated with a simple case study focused on characterization of 3 sites at which our research team has worked over the last several years. Site 1 is a nearshore location in Lake Superior, USA, isolated from known point sources of contamination (46.94622°N, 91.77416°W). Site 2 is approximately 10 m from the outfall of a wastewater treatment plant (WWTP), discharging approximately 50 million gal/d into the St. Louis River (MN and WI, USA), a tributary flowing into Lake Superior (46.75774°N, 92.12014°W). Site 3 is also positioned near a WWTP discharge, in this case discharging approximately 130 million gal/d into the Maumee River, a tributary of Lake Erie (OH, USA; 41.6888°N, 83.47740°W). Composite (4 d) water samples were collected using methods described in Kahl et al. 17. Samples were split, with 1 portion submitted for analytical chemistry analysis (see Supplemental Data, Table S1 for complete list) 18 and the other portion extracted and reconstituted in dimethyl sulfoxide, a carrier compatible with the high-throughput toxicity testing platforms employed by the ToxCast and Tox21 programs. Extracts were analyzed for bioactivity using an assay platform (Attagene CIS- and TRANS-FACTORIAL) designed to detect effects of compounds on activities of over 80 transcription factors and to evaluate agonist/antagonist properties for all 48 nuclear hormone receptors found in humans (and conserved across many vertebrates) 19, 20. Each of these targets represents a potential biological pathway that can be perturbed by 1 or more chemical(s) found in a sample.

Qualitative biological effects prediction

The list of chemicals detected at each site was uploaded to the Comparative Toxicogenomics Database 6. On average 86% of the chemicals detected had interaction information in the Comparative Toxicogenomics Database. A batch query was performed, which extracted all information from the database concerning protein targets with which the individual chemicals detected had been reported to interact (defined by specifying the search criteria). The resulting data files were used to build chemical–protein interaction networks (Figure 2) in which nodes representing either chemicals (rectangles, Figure 2) or protein targets (ovals, Figure 2) are linked together by edges (lines, Figure 2) indicating that an interaction had been reported. These networks provide a great deal of useful information. First, the overall network complexity in terms of total numbers of nodes and edges suggests the diversity of biological responses/perturbations that could potentially result from exposures at the site. Second, when using a “spring-loaded” view, the overall position of a protein node or chemical node within the network suggests the number of interactions with which it is involved, with nodes having more interactions/edges drawn toward the center of the network and those having fewer interactions relegated to the periphery. Finally, examining specific protein nodes, the number of edges connected to that node (termed “degree”) suggests the total number of chemicals detected at the site that have been reported to interact with a given target. Determination and ranking of the “degree” for each protein node effectively provide a means to rank biological targets on the basis of how many chemicals detected in a given sample have been shown, in the literature curated in the database, to interact with that particular protein. 
The assumption is that the greater the number of chemicals interacting with a given protein at a particular site, the more likely that protein target and its function may be perturbed in an organism exposed to that environment.
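The degree-based ranking described above can be sketched in a few lines of code. The (chemical, protein) pairs below are hypothetical placeholders, not data from the case study; they stand in for a Comparative Toxicogenomics Database batch-query export of reported chemical–protein interactions.

```python
from collections import Counter

# Hypothetical (chemical, protein) interaction pairs standing in for a
# Comparative Toxicogenomics Database batch-query export; these are NOT
# data from the case study sites.
interactions = [
    ("bisphenol A", "ESR1"), ("4-nonylphenol", "ESR1"), ("estrone", "ESR1"),
    ("bisphenol A", "ALB"), ("DEET", "ALB"),
    ("benzo[a]pyrene", "AHR"), ("benzo[a]pyrene", "CASP9"),
]

# "Degree" of a protein node = number of detected chemicals reported to
# interact with that target (edges incident to the node).
degree = Counter(protein for _, protein in interactions)

# Rank targets by degree and apply the degree >= 3 cutoff used in the text.
ranked = sorted(degree, key=degree.get, reverse=True)
candidate_targets = [p for p in ranked if degree[p] >= 3]
print(candidate_targets)  # only ESR1 reaches the cutoff in this toy example
```

In practice the same counts drive both the network layout (high-degree nodes drawn toward the center) and the ranked list of plausibly perturbed targets.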

Figure 2. Example chemical–protein interaction networks constructed from interactions reported in the peer-reviewed literature and coded into the Comparative Toxicogenomics Database 6. Rectangles represent chemical nodes; ovals represent protein nodes; lines represent edges indicating a known interaction between a chemical and a protein. Networks shown were constructed based on chemicals detected in an extract of surface water collected at case study sites 1, 2, or 3.

Focusing on our 3 case study sites, one can readily see that the network developed for site 1 was far less complex than those developed for sites 2 and 3 (Figure 2), immediately conveying an overall lower extent of contamination and a lower likelihood of biological perturbation(s). Turning to the position of protein nodes in the network for site 2 (Figure 2), we see proteins like albumin (ALB) and estrogen receptor alpha (ESR1 or ESRα; highlighted in green, Figure 2) located in the center of the network, reflecting the fact that multiple chemicals detected at the site have been reported to bind these targets (n = 8 in the case of ESR1; n = 5 in the case of albumin). In contrast, proteins like thyroid hormone receptor alpha (THRA) and caspase 9 (CASP9; highlighted in yellow, Figure 2) sit on the periphery, indicating a known interaction with just 1 or 2 chemicals at the site. Applying a degree of 3 or greater as a cutoff criterion, we then generated a list of protein targets (not shown) whose perturbation at the study site could be plausibly hypothesized based on prior knowledge captured in the Comparative Toxicogenomics Database. Using this approach, the sites could be qualitatively ranked in terms of relative potential for biological perturbation. The Duluth WWTP site (site 2) had the greatest number of predicted targets, followed by the Toledo WWTP site (site 3), whereas the Lake Superior location (site 1) did not have any protein targets for which 3 or more potential chemical interactions were known. Based on this qualitative evaluation, we could reasonably hypothesize that sites 2 and 3 would be of greater concern for potential biological hazards and, further, that perturbation of ESR1, pregnane X receptor (PXR)-related, and aryl hydrocarbon receptor (AhR)-related pathways may be among the more prominent biological responses to consider.
Consequently, AOPs involving ESR1, PXR, and AhR perturbation would be useful for identifying putative hazards of concern at these sites and potential in vivo endpoints that could be examined with either resident or caged organisms exposed at the sites 17, 21.

Semiquantitative biological effects prediction

To illustrate the more quantitative approach to bio-effects prediction, the chemicals detected at each site were mapped to the ToxCast data set 11, 12. On average, 64% of chemicals detected had been analyzed by the ToxCast program. The concentration of each chemical detected at the case study sites was compared with the AC50 reported for all assays represented in the ToxCast database. This includes data from 13 different assay platforms and a total of more than 700 bioassays for almost 2000 chemicals 22. For simplicity, however, we focused on exposure–activity ratios associated with 2 assay platforms (Attagene and NovaScreen), which cover approximately half of the 700 bioassays.
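The exposure–activity ratio calculation amounts to dividing each chemical's measured concentration by its AC50 in a given assay and, under an assumption of additivity, summing those ratios across chemicals per assay. A minimal sketch follows; the AC50 values and concentrations are hypothetical stand-ins (not the case study measurements), expressed in matching units.

```python
# Hypothetical AC50 values (µM) by assay target, standing in for ToxCast
# data; these are NOT measurements from the case study sites.
ac50 = {
    "PPARg": {"DEET": 15.0},
    "ESR1":  {"bisphenol A": 1.2, "estrone": 0.004},
}
# Hypothetical measured water concentrations (µM) at a site.
conc = {"DEET": 3.0, "bisphenol A": 0.003, "estrone": 0.00002}

# Summed exposure-activity ratio per assay, assuming additivity of
# chemicals acting on the same target. Units must match (here, µM/µM).
total_ear = {
    assay: sum(conc[chem] / a for chem, a in chems.items() if chem in conc)
    for assay, chems in ac50.items()
}

# Flag assays exceeding the (arbitrary) cutoff of 0.01 used in the text.
flagged = {assay: ear for assay, ear in total_ear.items() if ear > 0.01}
print(flagged)  # only the PPARg assay exceeds the cutoff in this example
```

Note that a chemical can contribute to many assays and an assay can accumulate contributions from many chemicals, which is why the ranking can differ sharply from the purely count-based network approach.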

For site 1, none of the chemicals with significant activity in the Attagene or NovaScreen assays was detected above its detection limit; therefore, no viable exposure–activity ratio estimates could be derived. In contrast, for the sites near WWTP outfalls, total estimated exposure–activity ratios (summed across all chemicals detected) ranged up to 1.1, with the greatest exposure–activity ratio at site 2 or 3 being associated with interactions of DEET (N,N-diethyl-meta-toluamide) with peroxisome proliferator-activated receptor gamma (PPARγ). In total, there were 12 exposure–activity ratios exceeding 0.01 at site 2 and 7 exceeding 0.01 at site 3. In addition to interactions with PPARγ, these included interactions with monoamine oxidase B; protein tyrosine phosphatase, nonreceptor type 2; the constitutive androstane receptor; peroxisome proliferator-activated receptor α; the thyroid hormone receptor; ESR1; estrogen responsive elements; and Sox. It is also notable that although the qualitative chemical–protein interaction network–based approach suggested that perturbations of ESR1, PXR, and AhR-related pathways would be among the most prominent to consider based on total numbers of detected chemicals known to interact with these targets, the exposure–activity ratio–based approach, which accounts for both chemical concentration and potency, only identified 1 of those targets as being of high concern, at least relative to an arbitrary cutoff of exposure–activity ratio = 0.01. Interactions with PXR and AhR were still suggested by the exposure–activity ratio analysis, although the summed exposure–activity ratios for PXR and AhR ranked lower than those for at least 11 or 90 other chemical–bioassay target interactions, respectively. This highlights some of the differences between the qualitative approach and the semiquantitative exposure–activity ratio–based approach.
In the qualitative approach, chemicals of concern could be ranked based on the number of protein targets they interacted with, potentially biasing the analysis toward the most promiscuous or most studied compounds. For example, the multitude of interactions reported and curated in the Comparative Toxicogenomics Database for benzo[a]pyrene dominates the network constructed for site 3 (Figure 2). In contrast, the semiquantitative exposure–activity ratio approach focuses on a chemical's environmental concentration relative to its potency in a particular assay, thereby highlighting the most sensitive and probable of the chemical–biological interactions in the overall network.

Biological effects surveillance

While the bio-effects prediction methods provide useful insights into the types of biological activities that may be elicited by chemicals detected at a particular site, analytical characterization of the contaminants present in the samples is invariably incomplete. Recent research has suggested that even when fairly extensive analytical characterization is employed (i.e., measuring tens or hundreds of analytes), measured chemicals may contribute only a small fraction to the total bioactivity elicited by a sample (e.g., Tang et al. 23). These compounds that go unmeasured analytically but nonetheless are present and contributing to the biological effects have been termed “iceberg chemicals,” because the fraction being measured is just the tip of the iceberg while, in some cases, the bulk of active chemicals may in effect remain invisible to analytical characterization. This is where bio-effects surveillance approaches come into play. Because they provide a direct measure of the integrated biological activity of a mixture without regard to its composition, bio-effects surveillance data either provide a useful complement to bio-effects prediction when some analytical data are available or serve as an alternative starting point for site characterization in the absence of extensive analytical characterization.

The potential of bio-effects surveillance through direct analysis of water samples in high-throughput toxicity testing assays, such as the Attagene battery of ToxCast assays, to capture relevant activities that might otherwise be missed based on analytical chemistry alone was highlighted in our case study. We were able to observe at least 1 biological activity, activation of the pregnane X response element, at site 1 through direct analysis, even though such activity was not predicted qualitatively or semiquantitatively based on the chemicals detected at the site.

For the other 2 sites, near the WWTPs, 13 biological activities were identified at each, with 7 of the activities shared between the 2 sites (Table 4). These included ESR1 and the estrogen responsive element, AhR, and PXR and its response element (PXRE). Site-specific activities also were identified. For example, activities against the glucocorticoid receptor and transforming growth factor-β were identified only at the Toledo WWTP site (site 3), and nuclear factor erythroid 2–related factor activation was elicited only by the extract from the Duluth WWTP site (site 2). Notably, all of the activities identified from the direct analysis were predicted using the exposure–activity ratio approach, although with variable ranked order of priority, and many of the identified activities were predicted by the network approach (e.g., ESR1 interactions). The congruence between the network approach, the exposure–activity ratios, and the direct screening methodology suggests that these tools, collectively, are capable of identifying relevant biological activities associated with complex mixtures.

Table 4. Summary of the Attagene bioassay results for extracts prepared from surface water samples collected from 3 sitesa
Attagene Transcription Factorsb Genes Site 1c Site 2c Site 3c
Aryl hydrocarbon receptor/xenobiotic response AHR − + +
Pregnane X receptor, xenobiotic pathway PXRE + + +
Pregnane X receptor PXR + +
Estrogen receptor pathway ERE + +
Estrogen receptor-α ERα + +
Estrogen receptor-β ERβ +
Vitamin D receptor/vitamin D pathway VDRE + +
Antioxidant response pathway NRF2 +
Hypoxia-inducible factor-1α/hypoxia pathway HIF1a +
Peroxisome proliferator-activated receptor-γ PPARg + +
TGF-β pathway TGFb +
Progesterone receptor PR +
Glucocorticoid receptor GR +
a The 3 sites are a nearshore location in Lake Superior (site 1); a location near the outfall of a wastewater treatment plant in the St. Louis River, Minnesota, USA (site 2); and a location near the discharge of a wastewater treatment plant in the Maumee River, Ohio, USA (site 3).
b Only assays for which significant activity was observed are shown.
c A plus sign (+) indicates a significant assay response was observed for a sample from that particular case study site. A minus sign (−) indicates that a significant assay response was not observed.

Integrated application

Although the approaches described have independent utility, they can also be applied in a complementary fashion to provide enhanced insights to inform the focus and design of more intensive monitoring and assessment efforts at a site. For example, in our case study, certain pathways (e.g., ESR1, PXR, AhR) were identified as being relevant using all 3 approaches. When resources are limited, it makes sense to focus on those pathways for which there are multiple orthogonal lines of evidence pointing toward a potential perturbation. Activities detected through the direct bio-effects surveillance approach but not identified through the bio-effects prediction approach indicate that chemicals other than those detected in the sample and/or represented in the Comparative Toxicogenomics Database, ToxCast, or other relevant pathway-based interaction data sources may be responsible for the bioactivity observed. If that bioactivity is particularly strong and/or maps to an AOP of substantial concern (i.e., a strong linkage between target perturbation and adverse apical outcomes has been established) 13, bioassay-directed fractionation and/or toxicity identification evaluation approaches may be warranted to detect causative agents 21, 24. By the same token, when high-throughput toxicity testing–based bio-effects surveillance is performed independently of analytical characterization, sources of chemical–pathway interaction data can be used to identify the types of chemicals/chemical classes that are known to cause the bioactivities observed. This could then be used to target analytical characterization of specific compounds. Collectively, these approaches can be used in many ways to help refine and focus further site characterization activities.

Making the link to hazard

Biological activity alone does not constitute a hazard. Some targets or pathways play more critical functions in growth, development, or survival than others. Further, organisms are better able to cope with (compensate for, adapt to) perturbation of some pathways than others. Interpretation of the significance of impairing certain enzyme activities, binding particular receptors, or activating specific transcription factors depends on both context (e.g., the species, routes of exposure, target tissue[s] of concern) and the degree of scientific confidence we have in linking a particular perturbation to an apical response relevant for risk assessment. Consequently, another critical component of effectively applying high-throughput toxicity testing and pathway-based data involves making the link to hazard, or at least to other key biological events, via the use of AOPs. Once targets have been identified using the bio-effects prediction or surveillance approach, AOPs such as those present in the AOP Knowledge Base 16 can provide relevant endpoints to evaluate whether a perturbation observed at a given site has the potential to adversely impact the survival, growth and development, reproduction, or well-being of exposed organisms. For example, in our case study, the network approach, exposure–activity ratio, and direct analysis all identified the estrogen receptor as a relevant biological target/activity at both WWTP sites. Several AOPs relating estrogen receptor activation to relevant adverse outcomes have been described for fish, including chronic effects on adult reproduction 15 and shorter-term impacts on survival of males exposed to exogenous estrogens (AOP 29) 16. The key events for the latter AOP are an increase in vitellogenin (egg yolk protein) synthesis in the liver of males, a corresponding increase in plasma vitellogenin concentrations, and renal pathology resulting from vitellogenin deposition, which can lead to death.
This adverse outcome pathway suggests that we could use endpoints such as increased liver vitellogenin mRNA or plasma vitellogenin protein measured in either caged fish or wild fish to determine whether there is indeed in vivo evidence of the perturbation of signaling in the estrogen receptor pathway that could lead to impacts at the individual and population levels. Additional AOPs can be identified in a similar manner for the other biological targets identified, allowing for identification of relevant hazards and endpoints for targeted monitoring.
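Conceptually, an AOP of this kind is an ordered chain of key events, each of which can be paired with a measurable endpoint for monitoring. The sketch below illustrates that structure; the endpoint labels are illustrative placeholders chosen for this example, not terms prescribed by the AOP Knowledge Base.

```python
# Ordered key events for an estrogen receptor-mediated AOP, each paired
# with an example measurable endpoint. Endpoint labels are illustrative
# placeholders, not official AOP Knowledge Base terminology.
aop_chain = [
    ("estrogen receptor activation (molecular initiating event)",
     "in vitro ER transactivation assay"),
    ("increased hepatic vitellogenin mRNA in males",
     "liver vitellogenin mRNA (qPCR) in caged or wild fish"),
    ("increased plasma vitellogenin protein",
     "plasma vitellogenin (ELISA)"),
    ("renal pathology from vitellogenin deposition",
     "kidney histopathology"),
    ("reduced male survival (adverse outcome)",
     "survival of caged or resident males"),
]

# The early and intermediate key events supply practical monitoring
# endpoints that can be measured before the adverse outcome manifests.
monitoring_endpoints = [endpoint for _, endpoint in aop_chain[:-1]]
```

Ordering the key events this way makes explicit which upstream measurements can serve as early in vivo evidence of pathway perturbation.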

Considerations and Challenges for the Next Frontier

There appears to be great potential for approaches based on high-throughput toxicity testing and other sources of pathway-based data to help overcome challenges associated with mixture toxicology. Each of the approaches described in the present article has its own strengths and limitations, but all add value in terms of helping to identify and prioritize potential biological effects of chemicals present in the environment. It is important to remember that the targets identified were based on predictions from prior knowledge (e.g., curated databases) or on in vitro measurements, and the greatest strength of the approaches is, arguably, hypothesis generation. A site assessment generally would not stop at identifying biological targets or activities. Instead, the biological impacts of these activities should be tested in vivo using various measurement endpoints to confirm that predicted hazards are actually occurring in taxonomically relevant species.

In particular, there are 3 general areas that should be considered in applying these technologies and approaches: the nature of the chemical universe being assessed, the assays actually utilized, and the biological domain of applicability of the results. From a chemical universe perspective, our illustration of the use of high-throughput toxicity testing for environmental mixtures focused on surface water, an environmental matrix appropriate to the consideration of fish health. However, other types of environmental matrices (e.g., sediment, air, soil) can also be sampled and analyzed for biological activity, depending on the assessment endpoint of concern. Similarly, solid-phase extraction with the particular sorbent (Waters HLB) used in our studies is a common technique for removing or concentrating relatively nonpolar organic compounds from water, but this technique is less effective for isolating polar compounds. Accordingly, the sample processing or extraction techniques employed need to be tailored to the physicochemical properties of the contaminants of concern in a given matrix. As needed, specific types of compounds can be enriched from environmental matrices through fractionation, and those fractions can be screened through the assays.

The choice of assays and appropriate interpretation of results obtained when assessing complex mixtures are also vital. For example, there currently is only limited information about how chemical mixtures actually “behave” within the in vitro assays. In some instances, mixture components can cause overt toxicity in cell-based systems, thereby precluding identification of specific biological activities. Appropriate negative and positive controls need to be developed to aid in the interpretation of data in this type of scenario. Another challenge involves establishing how mixtures of toxicologically similar chemicals would be expected to behave in high-throughput toxicity testing assays. Specifically, it is not clear whether chemicals that interact with the same target in an assay will always cause an additive response. When calculating a total exposure–activity ratio, we assumed that this would be the case, but this assumption has not been formally tested, for example, with the Attagene assays. Ideally, studies would be conducted similar to those of Blake et al. 25, who evaluated the activity of well-defined mixtures of androgenic and antiandrogenic chemicals in an androgen-responsive cell line later adapted to a high-throughput format. Studies of well-defined chemical mixtures, interpreted alongside the ToxCast data for the individual components, could help establish how mixtures behave in this system.
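One formal way to frame the additivity question is concentration addition, under which the predicted potency of a mixture of similarly acting chemicals satisfies 1/AC50_mix = Σ f_i/AC50_i, where f_i is chemical i's fraction of the total mixture concentration. A minimal sketch with hypothetical values (this is the textbook concentration-addition model, not a result from the Attagene assays):

```python
# Hypothetical AC50 values (µM) for two chemicals assumed to act on the
# same target; not measured data.
ac50 = {"chem A": 2.0, "chem B": 8.0}
fraction = {"chem A": 0.5, "chem B": 0.5}  # equimolar binary mixture

# Concentration-addition prediction: 1/AC50_mix = sum(f_i / AC50_i).
ac50_mix = 1.0 / sum(fraction[c] / ac50[c] for c in ac50)
print(ac50_mix)  # 3.2 µM for this hypothetical pair
```

Comparing such predictions against measured mixture responses, as Blake et al. did for (anti)androgens, is one way to test whether the additivity assumption underlying total exposure–activity ratios holds.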

Finally, determining the biological domain of applicability, in terms of which species are susceptible to chemical exposures and how sensitivity differs among species, is a key consideration for robust application of the complex mixture assessment tools discussed herein. Identifying potentially sensitive, ecologically relevant species can be difficult because the majority of in vitro high-throughput toxicity testing assays, including most of the bioassays associated with ToxCast, are mammal focused, as are most of the data present in the interaction and in vitro databases. Therefore, species extrapolation will almost always have to be considered when determining the impact of observed biological activities in an ecological context. The USEPA recently developed a sequence alignment tool called “SeqAPASS” 26 to predict across-species susceptibility, which can assist in identifying taxa that would potentially be susceptible to specific chemical perturbations based on molecular target similarity. The use of SeqAPASS combined with AOP knowledge can help to extrapolate activities identified in a mammal-based system to other toxicologically relevant taxa (e.g., LaLone et al. 27). In addition, the development of high-throughput in vitro bioassays covering a suite of molecular targets derived from a large number of toxicologically relevant taxa would help to identify potentially susceptible species. Running such chemicals and assays in a standardized manner, similar to the ToxCast approach, would also allow species sensitivity differences to be identified.
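The core idea behind sequence-based susceptibility screening can be caricatured as ranking taxa by the similarity of their ortholog to the assay target and applying a similarity threshold. SeqAPASS itself is considerably more sophisticated (it also weighs functional domains and critical amino acid residues), so the sketch below, with invented percent-identity values and an arbitrary cutoff, is only a conceptual simplification.

```python
# Invented percent-identity values of species orthologs to a mammalian
# assay target (e.g., ESR1); a conceptual simplification of the
# SeqAPASS-style logic, NOT output from the actual tool.
percent_identity = {
    "Danio rerio": 62.0,
    "Pimephales promelas": 60.5,
    "Daphnia magna": 18.0,
    "Xenopus laevis": 71.0,
}
threshold = 50.0  # arbitrary cutoff chosen for this illustration

# Taxa whose ortholog similarity suggests potential susceptibility,
# ranked from most to least similar.
susceptible = sorted(
    (sp for sp, pid in percent_identity.items() if pid >= threshold),
    key=percent_identity.get,
    reverse=True,
)
print(susceptible)
```

In an ecological assessment, such a ranked list would guide which taxa to prioritize for in vivo confirmation of activities identified in mammal-based assays.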

In summary, the emergence of high-throughput toxicity testing is changing the field of regulatory ecotoxicology. Information is rapidly being generated for thousands of individual chemicals for which there previously were no, or only limited, toxicity data. Likewise, modern computational power and the widespread dissemination of knowledge via the Web offer ever-increasing access to repositories of well-organized chemical–pathway interaction data. We feel that these technologies are potentially transformative for toxicological characterization of complex environmental mixtures and for effectively prioritizing the limited resources available for conducting site-specific monitoring and risk assessment. Coupled with AOP knowledge, these approaches move mixture toxicology to a truly hypothesis-driven paradigm that can benefit from the decades of mechanistic toxicology research that has been conducted and published along with rapidly emerging streams of high-throughput toxicity testing data. Consequently, environmental surveillance and monitoring truly are the next great frontiers for 21st-century toxicology.

Supplemental Data

The Supplemental Data are available on the Wiley Online Library at DOI: 10.1002/etc.3309.

Acknowledgment

The authors thank members of the Great Lakes Restoration Initiative research group, in particular S. Corsi and T. Smith. The authors also thank V. Wilson for reviewing an earlier version of the manuscript.

Disclaimer

The present article has been reviewed in accordance with official US Environmental Protection Agency (USEPA) policy. Mention of products or trade names does not indicate endorsement or recommendation for use. Conclusions drawn in the present study neither constitute nor reflect the views or policies of the USEPA. The authors have no conflicts of interest to declare.

Data availability

The data for this focus article are not publicly available because they were used for illustrative purposes only. An illustrative example was considered necessary to demonstrate the utility of the approaches described in the focus article.