INTRODUCTION
A major challenge in biology is to predict how organisms will behave based on how they interact with their environments. This is hard because essential behaviors, such as how organisms reproduce or develop, depend on sensing and responding to diverse environmental factors, often involving the activation and expression of multiple genes as well as coordinated interaction among multiple gene products. To address this challenge, it may help to start small by targeting the simplest organisms, whose growth and development are encoded by the shortest genomes, involving a manageable number of essential genes and interactions.
Why focus on viruses? Many viruses can readily be cultured, facilitating their detailed study, and the relatively simple life cycles of many viruses have for many decades been amenable to molecular-level dissection and characterization. More broadly, viruses can affect the behavior of ecosystems and play significant roles in human health and disease. Studies on viruses that infect bacteria, the bacteriophages, played a key role in seminal discoveries of molecular biology. For example, early studies on phage T2 by Hershey and Chase provided compelling evidence that nucleic acids, not proteins, are the material that encodes genetic information. Later work on phage T4 by Brenner, Jacob, and Meselson led to our mechanistic understanding of protein translation, while Jacob and Monod's studies on phage lambda provided the earliest understanding of how the expression of genes could be regulated (1, 2). The one-step growth method for phages developed by Delbrück and Ellis set a foundation for the first quantitative study of the virus growth cycle (1), and still further work on animal viruses has contributed to the understanding of how eukaryotic cells regulate their growth and how loss of such regulation can lead to cancer (3).
Every genome in nature encodes multiple processes, and genomes of viruses are no exception. In the appropriate environment of a living cell, a genome released from an invading virus can take command, directing material and energy resources away from cellular processes and toward the synthesis of components that are essential for virus growth, i.e., viral mRNA, viral proteins, viral genomes, and lipids of viral membranes. Assembly of these and other parts produces progeny virus particles that, upon release by the cell, may then infect other susceptible cells. For many viruses, including viruses that infect microbes, plants, animals, and humans, essential molecular processes of intracellular development have been elucidated. However, despite the relatively short lengths of virus genomes, the networks of reactions that define virus growth and their interactions with their host cells remain complex. These networks can contain multiple positive and negative feedbacks that make it challenging to predict how perturbations to viral or host cellular functions, either by genetic engineering or by the presence of drugs that specifically target viral or cellular functions, will quantitatively influence virus production or cell survival. To begin to systematically address this challenge, one may build mathematical models to account for the essential processes (Fig. 1). Here we review efforts aimed at modeling how viruses reproduce within cells, and we describe how such quantitative models have begun to provide insights into the integrated behavior of virus growth. Further, we show how such models can serve as a basis for the development of new antiviral strategies and shed light on the evolution of viruses.
To set the stage, we use the remainder of the introduction to provide some background on virus genomes and the process of virus growth within cells. Furthermore, we suggest how the integration of knowledge on viruses at the molecular and cellular levels can be enriched by perspectives of physical scientists, mathematicians, and engineers.
Virus genome sizes span several orders of magnitude. The largest known viral genome, which belongs to Pandoravirus (2,770 kb) (
4), is larger than the smallest genome of a free-standing bacterium, which belongs to
Mycoplasma genitalium (580 kb) (
5). The smallest viral genomes are less than 10 kb long, and they include genomes for hepatitis B virus (HBV) (3 kb) and phage Qβ (4 kb). Most known virus genomes fall in the size range of 5 to 100 kb and encode up to several hundred proteins (
Fig. 2). The larger virus genomes, such as those of Epstein-Barr virus, smallpox, and Pandoravirus, are composed of DNA. In contrast, many of the smaller virus genomes, such as those of human immunodeficiency virus type 1 (HIV-1), influenza virus, and hepatitis C virus (HCV), are made of RNA. These genomes may be composed of one or more segments of either single-stranded or double-stranded DNA or RNA. Despite these biochemical differences in the genomes of DNA and RNA viruses, all viruses must make mRNA, and all viruses use the host cell translation machinery to synthesize viral proteins. Viruses that carry single-stranded RNA genomes may possess positive- or negative-sense RNA, and a positive-sense RNA genome may immediately serve as an mRNA template for the translation of viral proteins.
Figure 3 shows how the genomes of different viruses employ different viral or host polymerase activities to synthesize RNA and, ultimately, protein. Viruses that carry double-stranded DNA (dsDNA) genomes, such as Epstein-Barr virus, use the host cellular RNA polymerase (RNA Pol) for transcription of viral mRNA, while other dsDNA viruses, such as smallpox virus, use their own viral DNA-dependent RNA polymerase. Viruses that carry positive-sense RNA genomes, such as hepatitis C virus and poliovirus, may immediately use their genomes as mRNA templates to direct the synthesis of virus proteins, while viruses that carry negative-sense RNA genomes, such as influenza or Ebola virus, employ a viral RNA-dependent RNA polymerase to make positive-sense RNA, which may then serve as a template for protein synthesis. Double-stranded RNA viruses, such as rotavirus, which can cause diarrhea in children, also employ a viral RNA polymerase to make viral mRNA. Retroviruses, most notably HIV-1, carry two copies of a positive-sense RNA genome, but this RNA is not translated at the outset of infection. Instead, the retroviral genome is reverse transcribed by a viral RNA-dependent DNA polymerase to make viral DNA that is integrated into the host cellular DNA, which later serves as the template to make viral mRNA. A more extensive description of the diverse strategies that viruses use to express their genome-encoded functions and to replicate their genomes is available elsewhere (
6).
Infection cycles follow a series of basic steps that are shared by many viruses (Fig. 4). These steps generally include (i) binding of the virus particle to the cell surface, (ii) entry of the viral genome into the cytoplasm or nucleus, (iii) transcription of viral mRNA, (iv) translation of viral proteins, (v) replication of viral genomes, (vi) assembly and packaging of virus particles, and (vii) release of progeny virus particles into the extracellular environment. Some viruses, notably HIV-1 and various herpesviruses, can establish a state of latent infection in which their host cells do not actively produce virus progeny. Instead, most viral genes are turned off, while the virus retains the ability to switch to productive infection in response to certain changes in conditions. Such behavior has presumably been selected to enable viral persistence over a range of resource-rich or -poor host environments. Quantitative and mechanistic models have been developed to study various partial facets of the virus infection cycle, including virus particle adsorption to cells (7–9), virus entry (10–13), transcription (14–18), RNA splicing (19, 20), protein translation (21–23), genome replication (24–26), assembly of virus particles (27–29), packaging of virus particles (30–32), and release or budding of progeny virus particles from cells (33). We do not review this extensive literature here; instead, our focus is on models that have sought to be more comprehensive, integrating quantitative descriptions of most of the essential steps in the virus infection cycle. Many books provide a useful introduction and serve as references on viruses and their growth. Among the most accessible to the nonspecialist is Sompayrac's introduction to pathogenic viruses (34). A more-detailed reference in which strategies of specific viruses are used to illustrate broader concepts of virology is that by Flint and colleagues (35), and a more comprehensive reference, which provides detailed reviews of specific virus families, is Fields Virology (36).
For molecular biologists and biophysicists, we show how molecular mechanisms deduced from diverse experiments may be combined into models that provide an integrated perspective of their behavior. In short, we show how models can be used to explore how individual molecules and their interactions contribute to the overall development of a minimal organism. For the physical scientist, engineer, mathematician, or computer scientist, we highlight a central role for creating mathematical and computable descriptions of the relevant molecular processes as a systematic way to account for their contributions. For the experimental biologist, we emphasize a key role for the development of quantitative experiments (with frequent sampling and repeat measures) that permit estimations of parameters and their variability, an essential process for validation and refinement of the models. For biomedical scientists, we show that such models can enable exploration and identification of potentially robust antiviral strategies or means to maximize yields of virus particles for vaccine or gene therapeutic applications. Finally, for ecological and evolutionary biologists, we make the case that the ability to calculate the growth rate or fitness of an organism can provide a means to probe questions on how genes interact and their effects on the design, robustness, and adaptability of organisms. A summary of the kinetic models of virus growth established over the last 25 years is provided in Table 1.
BACTERIOPHAGE T7
Bacteriophage T7 has a double-stranded DNA genome of 39,937 bp that carries 56 genes. The phage infects E. coli in a lytic manner, typically producing 100 phage progeny in 40 min at 30°C. The genome enters the host cell in a gradual manner, coupled with transcription of the phage genes, initially by the E. coli RNA polymerase and later by the phage RNA polymerase, which encounters stronger promoters as more of the genome becomes accessible. Owing to the gradual nature of genome entry and transcription, T7 genes are expressed at different times depending on their relative positions in the genome.
An initial motivation of Endy et al. for developing an intracellular model for bacteriophage T7 was to better understand the behavior of phage mutants that were selected during evolution experiments. In phage cultures, wild-type phage T7 was continuously passaged on recombinant hosts that constitutively expressed an essential phage enzyme, the T7 RNA polymerase. Within at most 30 generations, phage mutants arose that carried a deletion in the gene for this essential enzyme, making them completely dependent on the host-supplied enzyme for their growth. Moreover, these mutants grew faster on the recombinant hosts than the initial wild-type phage did (
64,
65). Endy et al. speculated that the faster production of mutant progeny could be attributed to the faster synthesis of shorter phage genomes and the ability to dispense with the time required to synthesize the polymerase in wild-type infections. Back-of-the-envelope calculations of the rate of T7 DNA polymerase and the transcription and translation rates needed to produce the enzyme appeared to account for the 10% faster (3 to 4 min) production of mutants than that of wild-type phage (
66). Using these calculations as a basis, we assembled a more comprehensive model that accounted for the rate of phage genome entry into the host cell, synthesis of T7 mRNAs, translation of these mRNAs to produce T7 proteins, known protein-protein interactions and feedbacks on host and phage transcription, synthesis of T7 DNA genomes, and assembly of phage progeny (
67). Over the course of developing the model, other applications for models of virus intracellular growth emerged. First, simulations based on published mechanisms and data can be used to examine the consistency of the data and reveal when new results challenge the existing literature. Second, models can readily be extended to predict how different antiviral strategies will influence the simulated growth dynamics. Since the simulations can readily be modified, many potential strategies can be explored efficiently, often suggesting experiments that might not otherwise be performed or helping to prioritize planned experiments.
Several key assumptions defined the framework for the T7 model. Although Endy et al. only later became aware of Knijnenburg and Kreisher's 1983 phage model, they nevertheless employed several of the same assumptions. For example, for the T7 model it was assumed that the protein synthesis machinery of the cell (ribosomes, activated tRNAs, and proteins) was present in excess at all stages of infection, that protein-protein binding interactions were sufficiently fast to be treated as equilibrated, and that the intracellular processes were “well mixed,” neglecting any spatial heterogeneities or roles of component transport in the dynamics of growth. In contrast to Knijnenburg and Kreisher's stochastic kinetics model, however, the phage T7 model was deterministic, as it was based on a set of coupled ordinary differential and algebraic equations. Further assumptions specifically related to details of phage T7. For example, to simulate the assembly of the T7 virion particles, which had not yet been studied in detail, results for phage P22, a dsDNA phage that is morphologically similar to T7, were employed. The final model incorporated 46 parameters spanning 27 years of the phage and E. coli literature. Twenty of the parameters described rates of elongation by RNA polymerases (transcription), DNA polymerase (genome replication), and ribosomes (protein synthesis), spacing requirements for polymerases and ribosomes on their respective templates, decay rates for mRNA, proteins, and DNA, binding constants for protein-protein interactions, and other processes. Moreover, 16 parameters quantified the relative strengths and initiation efficiencies of the T7 transcriptional promoters, and 10 parameters described the stoichiometry requirements of the progeny phage particles. None of the parameters were adjusted to fit the final results. 
In addition, owing to a lack of mechanistic understanding of the lysis step that allows progeny T7 phage to be released from the cell, we simulated only the production of phage progeny within the cell.
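As a rough illustration of what a deterministic, well-mixed model of this kind looks like, the following Python sketch integrates a deliberately small toy system with a single lumped mRNA species, a single lumped structural protein, and progeny particles. It is not the published T7 model: the function name, the rate constants, and the protein stoichiometry are all hypothetical placeholders chosen only to make the example run.

```python
def simulate_toy_infection(t_end=30.0, dt=0.01, k_tx=2.0, d_m=0.3,
                           k_tl=5.0, k_asm=0.5, proteins_per_particle=100.0):
    """Euler-integrate a toy model: template -> mRNA -> protein -> particles.

    All parameters are hypothetical (time in minutes) and are NOT taken from
    the published T7 model. Host resources are assumed to be in excess and
    the cell is treated as well mixed, mirroring the assumptions in the text.
    Returns a list of (time, mRNA, free protein, particles) tuples.
    """
    m = p = v = 0.0
    trajectory = [(0.0, m, p, v)]
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        dm = k_tx - d_m * m                           # transcription minus mRNA decay
        assembly = k_asm * p / proteins_per_particle  # particles formed per minute
        dp = k_tl * m - proteins_per_particle * assembly  # translation minus protein used in assembly
        m += dm * dt
        p += dp * dt
        v += assembly * dt
        trajectory.append((step * dt, m, p, v))
    return trajectory


if __name__ == "__main__":
    t, m, p, v = simulate_toy_infection()[-1]
    print(f"t = {t:.1f} min: mRNA = {m:.2f}, free protein = {p:.1f}, particles = {v:.2f}")
```

Even in this reduced form, changing one rate constant shifts the timing and yield of particle production, which is the kind of question the full model addresses with literature-derived parameters.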
The initial T7 model was able to capture essential features of the T7 growth behavior, based on a comparison with data that were independent of the model formulation. The observed timing of simultaneous shutdown of host RNA polymerase activity and expression of phage RNA polymerase activity early in the infection cycle was well captured by the model. The simulated order of appearance of phage proteins that are characteristically expressed during early, middle, and late stages of the infection cycle captured trends observed from pulse-labeling experiments, though discrepancies were notable, particularly for later structural proteins, which tended to appear earlier in the simulations than in the experiments. Finally, experimental validation of phage intracellular production was performed by lysing cells at different time points following the initiation of infection and quantifying the corresponding intracellular level of phage particles. Simulated production of wild-type T7 phage rose at a rate similar to that for wild-type phage in our experiments, but the simulated one-step growth curves appeared about 3 to 5 min earlier than those observed experimentally for a 30-min infection cycle. Simulations of a phage deletion mutant on a recombinant host that expresses the T7 RNA polymerase produced mutant phage earlier than in the wild-type phage simulations, consistent with experimental observations.
Antiviral Strategies
We initially employed our model of phage T7 intracellular growth to simulate how different antiviral strategies might influence T7 growth (
67,
68). We chose to initially test antisense strategies because such approaches offered a convenient way to think mechanistically and quantitatively about the effects of targeting specific phage functions. For each antiviral strategy, the following three features were defined: the specific T7 mRNA target, the initial level of target-specific antisense RNA in the cell, and the equilibrium binding constant for the formation of the complex between the target mRNA and the antisense RNA. Here the binding constant for complex formation can be viewed as a parameter that describes the intrinsic potency of the antisense drug against its mRNA target. We expanded our model of T7 intracellular development to include antisense drugs that targeted mRNA encoding either the major coat (capsid) protein or the T7 RNA polymerase (RNA Pol). Key results suggest that drugs that target different essential components can produce qualitatively different effects on growth (
Fig. 5). T7 RNA Pol transcribes genes for phage proteins that inhibit the function of the host transcription machinery, and this machinery is essential for expression of the RNA Pol. Thus, RNA Pol is part of a negative-feedback loop that ultimately regulates its own expression. While it is challenging to anticipate how a drug that targets a component of an early regulatory loop will ultimately affect the production of virus progeny, modeling offers a useful way to expand our intuition. Our study suggested how drug targeting of regulatory loops could offer benefits by enabling one to block established pathways of drug escape or drug resistance. However, taking a still broader perspective, novel antiviral strategies should be viewed as simply defining modified criteria for the selection of new viral escape strategies. In other words, mutations in functions that are not directly related to the targeted function may still alter how virus growth responds to antiviral drug treatment in ways that would be challenging to anticipate. In this context, the model may serve as a foundation to test other escape strategies.
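The role of the equilibrium binding constant as a measure of antisense potency can be made concrete with a short calculation. The Python sketch below assumes simple 1:1 binding between target mRNA and antisense RNA at equilibrium and solves for the free mRNA that remains available for translation. The function name and the numerical values are hypothetical, and the sketch is not the implementation used in the T7 model.

```python
import math


def free_mrna(total_mrna, total_antisense, k_assoc):
    """Free target mRNA at equilibrium for 1:1 mRNA-antisense binding.

    k_assoc = [complex] / ([free mRNA] * [free antisense]), in units of
    1/concentration; all concentrations share the same (arbitrary) units.
    Hypothetical helper for illustration only.
    """
    if min(total_mrna, total_antisense, k_assoc) < 0:
        raise ValueError("inputs must be non-negative")
    # The complex C solves K*C^2 - b*C + K*M*A = 0 with b = K*(M + A) + 1.
    b = k_assoc * (total_mrna + total_antisense) + 1.0
    disc = b * b - 4.0 * k_assoc ** 2 * total_mrna * total_antisense
    # Smaller root, written in a form that avoids cancellation for large K.
    complex_conc = (2.0 * k_assoc * total_mrna * total_antisense
                    / (b + math.sqrt(disc)))
    return total_mrna - complex_conc


if __name__ == "__main__":
    for k in (0.0, 1.0, 10.0, 100.0):
        print(k, round(free_mrna(1.0, 2.0, k), 4))
```

In this picture a larger binding constant or a larger initial antisense level leaves less free mRNA, which is how the three features that define each antisense strategy would enter a growth simulation.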
From Transcriptome to Proteome
By accounting for the synthesis of the phage mRNAs and proteins that are essential intermediates for the production of progeny particles, we simulate as a by-product the concentration-versus-time trajectory of all phage mRNAs and proteins. This information provides an idealized data set with which one may test data mining or mechanistic inference tools. For example, one may consider the simplest way to express the relationship between the concentration of a protein and its corresponding mRNA within the cell, as follows: d[Pi]/dt = ktrans[mRNAi], where [Pi] and [mRNAi] are free concentrations of the ith protein and ith mRNA, respectively, and ktrans is the rate at which the ith mRNA is translated into protein. For a fixed translation rate, plotting d[Pi]/dt versus [mRNAi] yields a line with the slope ktrans and an intercept of zero. Alternatively, one may view the free concentration of the ith protein as a process that integrates the level of the ith mRNA over time.
When the free concentration of a protein can be influenced by other processes, such as protein degradation, posttranslational modifications, or formation of complexes with other proteins or nucleic acids, additional terms appear on the right-hand side, as follows: d[Pi]/dt = ktrans[mRNAi] − kd[Pi] + f(modifications) + g(interactions), where kd is a rate constant for first-order degradation of the ith protein and f and g are functions representing other processes that may influence the concentration of the ith protein. In general, these additional processes will cause trajectories of d[Pi]/dt versus [mRNAi] to deviate from a straight line. In the context of our phage T7 model, we simulated plots of d[Pi]/dt versus [mRNAi] and found that some phage proteins produced loop-like trajectories that reflected their regulatory roles (69). Moreover, the model was used to quantify for each phage protein how it deviated from linearity in its plot of d[Pi]/dt versus [mRNAi] over the course of infection. Such deviant proteins are interesting because they represent proteins that are modified, where f(modifications) is nonzero, or proteins that interact with other proteins, where g(interactions) is nonzero. If two or more proteins interact with each other, then they would be expected to be depleted (and to deviate) in a correlated manner. We calculated extents of such correlation for all phage protein pairs and found that, indeed, the protein pairs that were computationally modeled to form complexes also shared highly correlated deviations. It will be interesting to use highly quantitative measures of mRNA and proteins from the same cells over time to explore whether such a “correlated deviation” approach can be used to infer physical interactions and formation of multiprotein complexes during infection. There is not yet a consensus on how global transcript and protein data should be treated to gain mechanistic insight. While much of the field has focused on characterizing the extent of correlation between protein and mRNA levels (70, 71), alternative approaches that combine mechanistic models of translation with mRNA and protein measurements have begun to highlight challenges in characterizing how limited and changing translation resources influence quantitative links between message and protein (69, 72–75).
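To illustrate how a single extra term moves the plot of d[Pi]/dt versus [mRNAi] off a straight line, the Python sketch below simulates one protein with and without first-order degradation and reports the largest deviation from the line of slope ktrans. The mRNA pulse shape, the rate constants, and the function names are hypothetical; the modification and interaction terms f and g are set to zero here.

```python
import math


def protein_rate_vs_mrna(k_trans=1.0, k_d=0.0, t_end=40.0, dt=0.01):
    """Return (mRNA, dP/dt) pairs for dP/dt = k_trans*mRNA - k_d*P.

    The mRNA time course is a hypothetical pulse that rises and then
    decays; it is not output from the T7 model.
    """
    points = []
    p = 0.0
    for i in range(int(round(t_end / dt))):
        t = i * dt
        m = t * math.exp(-t / 5.0)      # hypothetical mRNA pulse
        dpdt = k_trans * m - k_d * p    # translation minus degradation
        points.append((m, dpdt))
        p += dpdt * dt                  # Euler step for the protein level
    return points


def max_deviation_from_line(points, k_trans=1.0):
    """Largest absolute distance of dP/dt from the line k_trans*mRNA."""
    return max(abs(dpdt - k_trans * m) for m, dpdt in points)


if __name__ == "__main__":
    print(max_deviation_from_line(protein_rate_vs_mrna(k_d=0.0)))
    print(max_deviation_from_line(protein_rate_vs_mrna(k_d=0.2)))
```

With degradation switched on, the same mRNA level gives a different d[Pi]/dt on the rising and falling sides of the pulse, so the trajectory opens into a loop rather than retracing a line.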
Genome Organization Affects Virus Fitness
Given that we have a simulation for phage T7 one-step growth, one can begin to explore alternative genome designs by changing the way the simulation is implemented. This may be done by changing the order of genes or the parameters of the model. In the case of phage T7, entry of the genome into the cell is mediated by the action of the host RNA polymerase and later by the phage RNA polymerase, with the processivity of the polymerase corresponding to the entry of phage genes. In short, genes at the left end of the genome are transcribed before genes at the right end. Moreover, because the strength of promoters increases as one moves from the left to the right end, later genes are also expressed at higher levels. By altering the positions of genes within the phage genome, one can alter their timing and level and thereby change the fitness of the phage in subtle or not-so-subtle ways. One of the not-so-subtle ways is to move the T7 RNA polymerase gene to positions where its transcription is put under the control of a T7 promoter, creating a positive-feedback loop of T7 RNA Pol on its own transcription. Predicted enhancements in phage protein synthesis and phage production were not supported by quantitative experiments on protein production by pulse-chase methods or by yields of progeny phage by one-step growth experiments (
76). One reason for the discrepancy is that the simulation did not account for the contribution of nonessential genes, such as phage gene 1.7. Although it is known that 1.7 is nonessential for phage growth, control experiments established that the absence of gene 1.7 does have a detrimental effect on phage progeny yields. A better accounting of the finite resources provided by the host cell was not able to explain the differences observed between experiments and simulations (
77). The significant observed differences between predicted phage with the alternative gene order and three experimentally generated and tested phages highlight the substantial work that still needs to be done to better understand the nature of the discrepancies.
Synthetic Biology Test of Model Assumptions
Current modeling of virus growth builds on simplifying assumptions that may or may not be valid. Quantitative experiments can be used to test and modify assumptions that may subsequently allow the model to account for more observations or make new predictions. An alternative approach to gaining biological insights is to alter the biology to match the simplifying assumptions of the model. For example, the genome of phage T7 has multiple overlapping genes that are not essential for phage growth. In our model, we assumed that T7 genes were not overlapping, an assumption that enabled us to model the kinetics of transcription of these genes (
67). It was unknown at the time how the presence of overlapping genes might affect phage growth. To address these effects, Endy and coworkers redesigned and synthesized 30% of the T7 genome to eliminate overlapping regions. The resulting chimeric phage was able to produce viable phage progeny, though plaque sizes indicated that the growth properties of these variants were attenuated relative to those of the wild type (
78). This exercise was useful in showing how simplifying modeling assumptions might be tested by “changing the biology to fit the model” instead of the more common approach of “changing the model to fit the biology.” In the specific case of phage T7, the chimeric phage showed that overlapping genes are not essential for T7 growth but likely have an impact on the overall growth or fitness of the phage.
How the Host Cell Environment Can Affect Virus Fitness
A key assumption that enables development of initial models of virus intracellular growth is that host resources are infinite. Using this assumption, one may then simulate virus growth based on the dynamics of the virus-encoded processes. These minimally include transcription of viral mRNA, translation of viral proteins, synthesis of viral genomes, and assembly of virus progeny particles. The assumption of infinite host resources is most likely to be reasonable at the earliest stages of infection, when the demand by the virus infection is defined by a small number of initial viral genomes (the ones that initiated infection). To better account for the effects of host resources on growth, phage T7 infections were performed on host cells cultured under different conditions that were set up to provide different host resources. In general, rapidly growing cells will have more biosynthetic resources, such as ribosomes for translation, than more slowly growing cells. We cultured
E. coli hosts in a chemostat and controlled their growth by adjusting the dilution rate of the chemostat (
79). Cells at different growth rates, spanning from 0.7 to 1.7 doublings/h, were used as hosts for synchronized one-step growth cultures of phage T7, and the resulting rise rates and eclipse times (characteristics of one-step growth) were determined. To predict how altered growth rates of host cells would influence phage growth, we employed empirical correlations established by others to indicate how cellular resources and properties, such as host RNA polymerase levels and elongation kinetics, ribosome levels and elongation rates, NTP and amino acid pools, and cell size, correlated with host growth rate. Thus, cellular growth rates set by the experimental conditions were used as inputs to the correlations to estimate cellular resource levels and kinetics, which were used as initial conditions for the T7 simulation. In experiments, faster cell growth reduced the eclipse time (time to production of initial phage progeny) and increased the rise rates of progeny production, and these behaviors were captured by the simulation. The simulation then provided an opportunity to uncouple the effects of different host resource conditions on phage growth in cases where such uncoupling would be difficult or impossible to implement in experiments. Through such studies, we identified the processing of the translation machinery, specifically the ribosome elongation rates, to be most limiting for phage production. Moreover, independent changes in simulated host resources indicated how resources of host transcription or translation would need to change in order for phage infections to become limited or “bottlenecked” by host RNA polymerase levels or ribosome levels. An assumption of the model was that host cell resources for the entire infection process could be defined adequately based on their state at the time of sampling from the chemostats. 
A more refined perspective will need to account for the consequences of infection on the host physiological state, accounting for the effects of potential changes in energy metabolism on levels of biosynthetic resources or, for example, the synthesis or decay of NTP or amino acid pools during infection.
Epistasis: Quantitative Assessment of Genetic Interactions
A fundamental challenge in quantitative genetics is to better understand how interactions between genes quantitatively affect the fitness of an organism. Simulations of virus growth are potentially useful here because they provide a quantifiable link between functional molecular characteristics of gene products, such as binding affinities or kinetic parameters, and the growth rate, infection productivity, or fitness of the corresponding virus. To help to understand the terminology, it is useful to consider a simple quantitative example of how mutations in two genes, call them A and B, may interact. If the wild type has a fitness of 1.0 and mutations in genes A and B give rise to mutants that have fitness levels of 0.80 and 0.70, respectively, then one may ask, what is the fitness of a mutant phage containing both mutations? If these mutations do not interact, then the double mutant will have a fitness of 0.56, just the product of the fitness values for the single variants. If these deleterious mutations interact synergistically, then the fitness of the double mutant will be <0.56, and if they interact antagonistically or by buffering their deleterious effects, then the fitness of the double mutant will be closer to that of the wild type (>0.56). In quantitative genetics one seeks to answer the following question: on average, to what extent do deleterious mutations interact synergistically, antagonistically, or not at all? This can be a challenging experiment because it requires that one be able to generate mutants having well-defined mutations and to accurately quantify the effects of these mutations on some measure of fitness. 
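The arithmetic of the two-gene example can be written as a short Python sketch. It assumes the multiplicative null model described above for deleterious mutations with wild-type fitness normalized to 1.0; the function names and the tolerance are hypothetical choices for illustration.

```python
def expected_double_mutant_fitness(w_a, w_b):
    """Fitness expected for the double mutant if the mutations do not interact."""
    return w_a * w_b


def classify_epistasis(w_a, w_b, w_ab, tol=1e-9):
    """Classify the interaction between two deleterious mutations.

    w_a, w_b: fitness of each single mutant relative to a wild type of 1.0.
    w_ab: observed (or simulated) fitness of the double mutant.
    tol: hypothetical tolerance for treating the values as equal.
    """
    difference = w_ab - expected_double_mutant_fitness(w_a, w_b)
    if abs(difference) <= tol:
        return "none"
    # A double mutant fitter than expected means the effects buffer each other.
    return "antagonistic" if difference > 0 else "synergistic"


if __name__ == "__main__":
    for w_ab in (0.50, 0.56, 0.65):
        print(w_ab, classify_epistasis(0.80, 0.70, w_ab))
```

In practice the tolerance would need to reflect the measurement or simulation error in the fitness estimates, which is one reason such classifications are difficult to make experimentally.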
In simulations of virus growth, it is feasible to create such mutations by altering parameters that correspond to molecular functions, such as a binding constant for complex formation or a rate constant corresponding to the elongation rate of an RNA polymerase, and then to calculate the effects of such alterations on the yield of virus progeny from a cell, one measure of fitness. The rationale here is that some mutations can alter quantitative characteristics of molecules and that the corresponding parameters in the models can be changed to simulate the effects of such mutations on molecular function. If one seeks to simulate deleterious mutations, then one just needs to check that the alteration in a parameter reduces the calculated fitness. We used this approach to generate and test interactions among diverse computationally generated deleterious mutations to examine the effects on the calculated fitness of phage T7 (
80). In this study, two metrics for fitness were employed: one for a resource-rich environment and one for a resource-poor one. For the rich environment, we assumed that host cells were available in unlimited supply, so the fitness of the virus depended on maximizing its production rate within a given cell. For the poor environment, we assumed that only the infected cell's resources were available, so fitness was based on maximizing the yield, without limitations on the rate. It was found that mildly deleterious mutations tended to act synergistically in resource-poor environments but antagonistically in resource-rich environments. Moreover, severely deleterious mutations tended to buffer themselves, acting antagonistically in both rich and poor environments. The work was relevant to population genetics in suggesting that the effects of synergistic interactions on fitness, while important for theory, may be challenging to detect in practice owing to their emergence when the quantitative effects of mutation on fitness are minimal. Future studies would benefit by exploring how quantitative rather than qualitative changes in environmental resources affect whether epistasis is synergistic or antagonistic. This may be achieved, for example, by using the metric for fitness based on production rate (e.g., the rich environment, as described above) and altering the growth rate of the host, creating richer or poorer environments based on higher or lower rates of host cell growth (
79). Still more realistic assessments of epistasis might define the average fitness of the virus over multiple cycles of infection where one allows for fluctuations in host resources, conditions that could readily be implemented in virus growth simulations.
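The distinction between synergistic and antagonistic interactions can be made concrete with a minimal sketch. It assumes a multiplicative null model, a common convention in which the expected fitness of a double mutant is the product of the single-mutant fitness values; the text above does not state which null model was used in the cited study, and the function names and fitness values here are hypothetical.

```python
def epistasis(w_a: float, w_b: float, w_ab: float) -> float:
    """Deviation of double-mutant fitness from the multiplicative expectation.

    All fitness values are relative to wild type (wild type = 1.0).
    The multiplicative null model is an assumption of this sketch.
    """
    return w_ab - w_a * w_b


def classify(w_a: float, w_b: float, w_ab: float, tol: float = 1e-9) -> str:
    """Label the interaction between two deleterious mutations."""
    eps = epistasis(w_a, w_b, w_ab)
    if eps < -tol:
        return "synergistic"   # double mutant is less fit than expected
    if eps > tol:
        return "antagonistic"  # double mutant is fitter than expected
    return "none"


# Hypothetical relative fitness values, for illustration only.
print(classify(0.95, 0.90, 0.80))  # mild mutations; expected 0.855, observed 0.80
print(classify(0.50, 0.40, 0.35))  # severe mutations; expected 0.20, observed 0.35
```

In a virus growth simulation, the three fitness values would come from runs with each parameter altered alone and with both altered together, each normalized to the wild-type result.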
Robustness versus Fragility
Biological systems tend to perform robustly with respect to conditions under which they evolved but are more sensitive to environmental changes that they have not encountered (
81,
82). These ideas were examined in the context of the bacteriophage T7 system by testing the effects of simulated natural mutations, implemented by changing parameters of the model over plausible ranges for natural mutations, and simulated unnatural mutations, using previously described genomic rearrangements (
76,
77). In general, the simulated phage growth was robust with respect to parameter changes but relatively sensitive or fragile with respect to genomic rearrangements, which are not known to occur in nature for phage T7. Such observations were consistent with theory (
83). The robust behavior of growth was also a feature of the phage Qβ growth models (
47). The effects of natural and unnatural perturbations to the simulated one-step growth of phage T7 were also evaluated in rich and poor host resource environments; notably, the fitness of wild-type phage was nearly optimal under resource-poor conditions but average under resource-rich conditions (
83). This finding suggests that limited host resources served as a constraint on processes of phage T7 evolution. The extent to which such findings are general remains to be tested.
BACTERIOPHAGE M13
Phage M13 is a single-stranded DNA virus with a 6.4-kb genome that encodes 11 proteins, 5 of which combine to encase a single genome in a well-defined filamentous structure. The structure is composed of four minor coat proteins (p3, p6, p7, and p9), each incorporated at 5 copies, and a major coat protein (p8) that is present at about 2,700 copies per phage particle. The well-resolved structures of the proteins and their arrangement in defining the surface of the particle, along with the facile tools for altering the gene for each protein, have in the last 15 years made phage M13 popular for controlled engineering at the nanometer scale, with diverse applications, including biosensors, batteries, and memory devices (
84). Prior to these technological developments, fundamental studies established many of the molecular mechanisms associated with M13 DNA replication, mRNA processing, and mRNA degradation (
85,
86). Using this and an extensive literature review, a kinetic model for M13 intracellular growth was developed that employed 81 differential equations and 64 kinetic parameters, of which 43 parameters could be estimated from experimental data (
37). The model assumed unlimited resources in pools of amino acids, nucleic acids, and energy, but it avoided infinite virus growth owing to limitations on ribosomal availability. If specific data were not available, alternative approaches were found in some cases. For example, the distribution of a limited pool of ribosomes to phage mRNA should depend on the nature of the interaction of each mRNA species with the ribosome, as reflected in part by the ribosome binding sequence (RBS), but strengths of RBS sequences for M13 mRNA have not been measured experimentally. To address this limitation, an RBS calculator was used; the calculator employs an equilibrium statistical thermodynamic model to account for the effects of the RBS sequence on translation initiation rates (
87).
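The idea of distributing a limited ribosome pool among mRNA species can be illustrated with a minimal sketch. It assumes that each species claims ribosomes in proportion to its copy number multiplied by a relative RBS strength; this proportional rule, the function name, and all numerical values are hypothetical and are not taken from the M13 model or from the RBS calculator.

```python
def allocate_ribosomes(total_ribosomes, mrna_levels, rbs_strengths):
    """Split a fixed ribosome pool across mRNA species.

    Each species receives a share proportional to
    (mRNA copies) x (relative RBS strength). This rule is an
    illustrative assumption, not the published allocation scheme.
    """
    weights = {name: mrna_levels[name] * rbs_strengths[name]
               for name in mrna_levels}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return {name: 0.0 for name in weights}
    return {name: total_ribosomes * w / total_weight
            for name, w in weights.items()}


# Hypothetical copy numbers and relative RBS strengths.
mrna = {"p5": 100, "p8": 300, "p3": 50}
rbs = {"p5": 2.0, "p8": 1.0, "p3": 1.0}
shares = allocate_ribosomes(1100, mrna, rbs)
print(shares)
```

Because the pool is fixed, raising the RBS strength of one message lowers the share available to every other message, which is one way a finite ribosome supply can bound simulated virus growth.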
The overall M13 intracellular growth model was able to reproduce diverse aspects of the phage's biology, including the timing and extent of phage DNA replication, the levels of mRNA and protein production, and the timing and levels of progeny phage production. Further, by observing how different parameter values could be combined to produce higher or lower simulated progeny levels, the work suggested that changes in rates of phage progeny assembly are tightly linked to levels of phage DNA and protein production. Because the phage does not kill its host cell after a single cycle of phage progeny production, phage production can continue over multiple cell divisions. Extension of the intracellular model to account for multiple cell cycles allowed for testing of the sensitivities of phage processing in establishing a persistently infected state or a cured state (
38), conditions that have both been observed experimentally. More specifically, simulations suggested that p5, a protein that binds single-stranded DNA, may be involved in previously undocumented feedback loops between p1, p3, and p8, working at the level of translational attenuation. Ultimately, because these models quantitatively and mechanistically account for production of each phage component (DNA, mRNA, and protein) in the broader production of phage progeny, they provide an opportunity to explore diverse perturbations, such as mechanisms of translational control, that would be experimentally challenging to modify or elucidate independently. Here such simulations suggested that dynamic control of the amount of p5 in the infected cell plays a key role in the allocation of biosynthetic resources during M13 infection.
HIV-1
HIV-1, the virus that causes AIDS, is of great interest in human health owing to the 33 million people worldwide who suffer from AIDS and the 2 million annually who die from the disease. As a retrovirus, it carries two single-stranded positive-sense RNA genomes that serve as templates for the packaged viral reverse transcriptase to synthesize a double-stranded DNA molecule, which integrates into the host genome as part of its infection cycle.
To what extent can the dynamics of HIV-1 growth within its mammalian host cell, a CD4
+ T lymphocyte, be explained or accounted for by the kinetics of its underlying processes? In the case of HIV-1, following binding to the host cell and particle entry, the viral genomic RNA is released into the cytoplasm, where it is reverse transcribed to make double-stranded proviral DNA, which is then brought into the nucleus by the viral integrase enzyme and integrated into the genome of the host cell. The proviral DNA remains in a nonproductive or latent state until the cell and viral transcription processes are activated by external factors. The model of Reddy and Yin neglected the latent phase by assuming that the cell was activated, so transcription of viral messages could immediately proceed. The model accounted for splicing of mRNA and its transport from the nucleus to the cytoplasm, translation of viral proteins in the cytoplasm, feedbacks of regulatory proteins Tat and Rev on transcription, transport of viral proteins to the cell membrane, and particle assembly, budding, and maturation (
88). Simulated levels of HIV-1 DNA and entering genomic RNA, which are synthesized and degraded during reverse transcription, matched well with experimental observations. Further, the kinetics of production of viral RNA genomes (full-length RNA) and translation of viral proteins was consistent with the sparse available data at the time. In the absence of mechanisms of virus particle assembly, it was assumed that virions assembled instantaneously, with the only constraint being that each virus particle must satisfy the established particle stoichiometry (e.g., 1 genome and 1,200 Gag, 80 Gag-Pol, and 280 Env proteins). This assumption resulted in a simulated production of virus progeny that preceded the observed production by about 6 h, providing an estimate for the time required for the assembly process. The model also enabled study of how perturbations to individual virus functions might influence the overall growth. If growth is particularly sensitive to small changes in a specific parameter or function, then there is a rationale for targeting drugs to that function. The simulation highlights a need to exercise caution in targeting regulatory proteins, such as Rev. When the effects of inhibiting Rev were tested, simulations suggested that doubly spliced transcripts would be enhanced, activating Tat, which would activate viral transcription overall and lead to an enhancement of viral growth. However, simulations exploring the effects of directly inhibiting Tat suggested that such interventions would always have a detrimental effect on virus growth.
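The instantaneous-assembly assumption described above amounts to asking how many complete particles the available pools can supply at a given time. A minimal sketch follows, using the stoichiometry quoted in the text; the function name and the pool sizes are hypothetical.

```python
# Particle stoichiometry as quoted in the text above.
STOICHIOMETRY = {"genome": 1, "Gag": 1200, "Gag-Pol": 80, "Env": 280}


def assemblable_virions(pools, stoichiometry=STOICHIOMETRY):
    """Return (number of complete particles, limiting component).

    Assembly is treated as instantaneous, so the only constraint is
    that every particle receives its full complement of components.
    """
    counts = {name: pools.get(name, 0) // need
              for name, need in stoichiometry.items()}
    limiting = min(counts, key=counts.get)
    return counts[limiting], limiting


# Hypothetical intracellular pool sizes (molecules per cell).
pools = {"genome": 500, "Gag": 240_000, "Gag-Pol": 20_000, "Env": 28_000}
n_virions, limiting = assemblable_virions(pools)
print(n_virions, limiting)
```

Tracking which component is limiting over simulated time shows where the particle supply is constrained at each stage of infection.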
More-detailed simulations enabled studies of the effects of transcript splicing on growth. Inefficient splicing of HIV-1 mRNA was generally beneficial for HIV-1 growth, but an extreme reduction in the splicing efficiency could be detrimental, suggesting the existence of a splicing efficiency that optimizes HIV-1 growth (
20). When splicing causes an increase in the fraction of either Rev or Tat mRNA relative to that of the other viral mRNA pool, the outcomes are generally beneficial for HIV-1 growth. However, simulations indicated that when mutations cause either Rev or Tat mRNA to be favored over the other, the imbalance is amplified, suggesting that a balance of Rev and Tat is needed in order for HIV-1 to optimize its growth (
19). Further, interactions between two feedback loops, the negative feedback of Rev on nuclear export of fully spliced transcripts and the positive feedback of Tat on overall transcription of viral mRNA, create a robust regulatory network that is able to compensate for mutations that might alter functions of components within the network (
89). Such interactions between positive- and negative-feedback loops may have more general relevance to the robustness and evolvability of developmental processes in higher organisms (
90).
Other approaches to modeling HIV intracellular growth have employed agent-based modeling, where different states of the cell or stages of intracellular infection are initially defined and transitions between states are expressed as rules, with probabilities of transition related to the magnitudes of experimentally determined parameters (
91). Such rule-based approaches may enable accounting for compartments within the cell, such as the nucleus or assembly sites, in a manner that explicitly incorporates information about the position, size, or movement of compartments.
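The rule-based idea can be sketched in a few lines: a cell occupies one of several predefined stages, and at each time step a rule moves it to the next stage with a fixed probability. The stage names, the probabilities, and the function name below are hypothetical placeholders chosen for illustration; they are not taken from the cited agent-based model.

```python
import random

# Hypothetical stages and per-step transition probabilities.
# Each rule maps a current stage to (next stage, probability per step).
TRANSITIONS = {
    "entered": ("reverse_transcribed", 0.30),
    "reverse_transcribed": ("integrated", 0.20),
    "integrated": ("producing", 0.10),
}


def simulate_cell(max_steps, rng):
    """Follow one infected cell through the stages; return its history."""
    state = "entered"
    history = [state]
    for _ in range(max_steps):
        if state not in TRANSITIONS:
            break  # terminal stage reached; no rule applies
        next_state, prob = TRANSITIONS[state]
        if rng.random() < prob:
            state = next_state
        history.append(state)
    return history


rng = random.Random(42)  # fixed seed so that a run can be repeated
print(simulate_cell(60, rng)[-1])
```

Repeating the simulation over many cells gives a distribution of times to reach each stage, which is the kind of population-level output such models are used to produce.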
An Anti-HIV Strategy
As noted earlier, one may implement antiviral strategies within a model of virus growth by including additional reactions and parameters to simulate the action of an antiviral “drug” on a specific viral target. Moreover, one can test drug escape strategies by simulating how changes in virus parameters, corresponding to virus mutations, may enable virus variants to grow better than the original wild-type virus in the presence of drug. For HIV-1, an RNA interference (RNAi) strategy was proposed to computationally explore factors that would influence targeting of the Tat-mediated positive-feedback loop (
92). This example is interesting because of the sequence-specific targeting by RNAi of the
trans-acting responsive (TAR) element, a highly conserved RNA structure that is essential for Tat transactivation. One can imagine that mutations that would enable the virus to escape from such RNAi might also impose fitness penalties on virus growth. Empirically based rules were applied to account for the effects of specific base changes on the contribution of this structure to the transcriptional feedback and, ultimately, on the fitness of the virus. When this strategy was tested experimentally, no base changes were detected in the TAR element. Instead, mutations that enabled indirect upregulation of transcription occurred in nontargeted regions (
93). These results, which current methods would have been unlikely to anticipate, highlight the limitations of even quite comprehensive models in predicting the multiple ways that viruses may find to evade antiviral strategies.
INFLUENZA A VIRUS
In 2004, Sidorenko and Reichl developed the first kinetic model for influenza A virus growth in animal cell culture with an aim to identify resource-limiting steps that might be addressed to increase virus yields for vaccine production (
94). The genome of influenza A virus has eight segments that encode at least 10 virus proteins, nine of which are incorporated into the virus particle. The kinetic model accounted for the following steps: viral particle attachment to the cell surface, receptor-mediated entry or endocytosis of the viral particle, release of viral ribonucleoprotein (vRNP) complexes (RNA-protein complexes) into the cytoplasm and their transport to the nucleus, transcription of viral mRNA and its export to the cytoplasm, translation of viral proteins, replication of viral genomes, formation of vRNP complexes, and budding and release of progeny virus particles from the cell surface. It was initially assumed that translation resources (ribosomes and precursors for protein synthesis) were present in excess, so levels of these components were not explicitly included in the model. Virus particles were also assumed to assemble with correct segregation of the eight RNA species and correct protein stoichiometries. Based on experimental observations of cell death 12 h following the initiation of infection, simulations were carried out by integrating the model to 12 h.
The model enabled one to identify potential bottlenecks in virus production and suggested strategies for increasing virus yields for vaccine production. The simulations suggested that M1, the viral matrix protein, becomes a limiting factor in the production of vRNP complexes, which was apparent as levels of M1 initially accumulated and then fell to zero as vRNP complexes were formed. Subsequently, vRNPs become a limiting factor in the budding and release of progeny virus from the infected cell. Moreover, according to the simulation, rates of virus production could be enhanced by increasing the activity of the viral polymerase, which could be achieved in practice by using stronger promoters. Likewise, simulations suggested that increasing the efficiency of translation of viral proteins would also increase the production of virus progeny. Such efficiencies could conceivably be enhanced in practice by using virus mutants that more efficiently inhibited the utilization of cellular translation resources by cellular mRNAs.
Subsequent measurements of two viral proteins (NP and M1) and virus production over the course of an infection cycle provided constraints for a population balance model of influenza virus infection (
95). In this model, levels of these essential viral proteins, quantified by flow cytometry, were used to define an internal coordinate for progression of infected cells to production of virus. For an excess of added virus particles to cells (multiplicity of infection [MOI] of 3), the model was able to capture the accumulation of these proteins in infected cells during earlier stages of infection, up to about 10 h, but deviated at later times, reflecting potential limitations in amino acid resources or the onset of virus-induced apoptosis. A more-detailed rendering of the viral replication process did not reduce deviations or provide further insight into underlying mechanisms for deviations (
96).
A mathematical model of influenza virus growth in cells was also published by Bazhan et al. (
97). Their work describes (in Russian) the regulation of transcription, translation, replication, and assembly of the virus. An analysis of the model indicates a high sensitivity of model behavior to parameters associated with the binding of viral polymerase to virus-specific RNAs, which are essential processes for transcription and genome replication. Such essential processes might be effective targets for the development of potent antiviral drugs.
Control of Viral RNA Synthesis
In a reformulation of the Sidorenko and Reichl model, the set of original kinetic equations was reduced and further data sets from the literature were incorporated to estimate key parameters (
98). In addition, mechanisms of RNA stabilization and nuclear export of RNA species were incorporated to better resolve the dynamics of viral RNA transcription and genome replication. Constraints on the model included experimental measures of virus entry, i.e., absolute and relative levels of viral messenger, replicative intermediates, and genomic RNA per cell, as well as average levels of viral progeny released per cell. The resulting model provided support for early regulation of genome replication by stabilization of viral RNA replicative intermediates. It also suggested how the viral matrix protein 1 (M1), which normally mediates export of viral genome copies from the nucleus, also might control viral RNA levels in the late phase of infection (
98). Finally, the model predicted an intracellular accumulation of viral proteins and RNA toward the end of infection, providing evidence that transport processes or particle budding limits the process of virus progeny release.
Kinetics of Defective Interfering Particles
It has long been known that influenza A virus infections can produce defective virus particles that can interfere with the production of infectious virus (von Magnus phenomenon). Such particles carry deletions in one or more essential genes needed for growth, making them unable to reproduce alone. However, during coinfection with infectious virus, defective genomes divert resources to their own replication and packaging, which interferes with the production of infectious particles. Such defective interfering particles (DIPs), described above in “Testing Antiviral Strategies,” have been found in clinical isolates of influenza virus (
99) and can negatively affect the manufacture of live vaccines (
100). To better understand mechanistically how DIPs reproduce, the intracellular kinetic model for influenza virus was extended to include defective interfering RNAs that replicate more rapidly than full-length RNAs owing to their reduced length (
101). The extended model was able to account for observed effects of DIPs on infectious virus production. Moreover, the model suggests that DIPs that specifically carry deletions in RNA segments encoding the viral polymerase can become enriched, in agreement with experimental observations. Further, the model and experimental observations suggest that other mechanisms, such as competition for viral proteins (polymerase and nucleoprotein), can also contribute to interference and DIP enrichment.
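The length advantage of defective interfering RNAs can be illustrated with a toy calculation. It assumes that full-length and defective templates draw on one shared nucleotide-synthesis budget per time step, split in proportion to template copy number, so that shorter templates yield more copies per nucleotide polymerized. This is a deliberately simplified sketch, not the extended intracellular model described above; the function name, lengths, copy numbers, and budget are hypothetical.

```python
def di_fraction_over_time(full_len_nt, di_len_nt, n_full, n_di,
                          steps, nt_budget_per_step):
    """Track the DI share of templates under a shared synthesis budget.

    Returns the fraction of all templates that are defective after
    each step. All parameters are illustrative assumptions.
    """
    fractions = []
    for _ in range(steps):
        total = n_full + n_di
        # Equal polymerase access per template: split budget by copy number.
        budget_full = nt_budget_per_step * n_full / total
        budget_di = nt_budget_per_step * n_di / total
        # New copies made = nucleotides available / template length.
        n_full += budget_full / full_len_nt
        n_di += budget_di / di_len_nt
        fractions.append(n_di / (n_full + n_di))
    return fractions


# Hypothetical round numbers: a 2,300-nt segment and a 400-nt deletion variant.
fractions = di_fraction_over_time(2300, 400, 100.0, 1.0, 20, 50_000.0)
print(f"DI fraction after 1 step: {fractions[0]:.3f}")
print(f"DI fraction after 20 steps: {fractions[-1]:.3f}")
```

With equal template lengths the defective fraction stays constant in this sketch, so any enrichment seen here arises solely from the length difference; competition for viral proteins, noted above as a further mechanism, is not represented.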
BACULOVIRUS
The large-scale production of recombinant proteins, particularly proteins requiring posttranslational modification, has long been implemented in insect cells infected by a recombinant baculovirus engineered to express heterologous proteins of interest (
118). Baculoviruses are double-stranded DNA viruses with genomes of 80 to 180 kb that typically encode about 150 proteins. An early study highlighted two factors, the time of infection and the multiplicity of infection (MOI), for defining key tradeoffs in the production of virus and heterologous protein (
119). Infection of cells during the early exponential growth phase, before they have reached their culture capacity (maximum cell concentration), can result in lower total yields of virus. However, infection of cells late, as they approach their culture capacity, can also limit virus production owing to lower biosynthetic capabilities of cells as their growth slows. If the MOI is well below 1, then cells that are not initially infected may continue to grow and become infected when the first generation of virus is released, contributing to overall higher productivity of the culture. In such scenarios, the initial virus production and release need to occur before cell growth can progress to stationary phase, when productivity drops. For these studies, the kinetics of virus production was not considered mechanistically. Instead, the focus was on three kinetic milestones: the time postinfection for extracellular virion synthesis, the time postinfection for extracellular protein (recombinant product) synthesis, and the time postinfection for cell lysis. The importance of culture time and MOI for the dynamics of cell, infected-cell, and virus populations was subsequently validated experimentally in a study that extended application of the baculovirus expression system for the production of virus-like particles (VLPs) (
120). Virus-like particles are often highly immunogenic, making them potentially useful as vaccines. Infected cells were immunostained using an antibody against a key protein of the VLPs. Further, it was shown that a higher MOI could increase the rate of VLP production, reducing the time to harvest.
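The role of the MOI in determining how many cells are infected in the first round can be illustrated with a short calculation. It assumes that the number of particles entering each cell follows a Poisson distribution with mean equal to the MOI, a standard idealization that the studies above may or may not have used; the function name is hypothetical.

```python
import math


def fraction_infected(moi: float) -> float:
    """Fraction of cells receiving at least one virus particle.

    Assumes Poisson-distributed particle uptake with mean equal to
    the MOI, so the uninfected fraction is exp(-MOI).
    """
    return 1.0 - math.exp(-moi)


for moi in (0.1, 1.0, 3.0):
    print(f"MOI {moi}: {fraction_infected(moi):.3f} of cells infected initially")
```

Under this assumption, about 10% of cells are infected in the first round at an MOI of 0.1, leaving most cells free to keep growing until the first generation of virus is released, whereas about 95% are infected at an MOI of 3.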
The baculovirus expression system has been harnessed for the production of VLPs of rotavirus, a common cause of severe diarrhea in children. In this application, three rotaviral proteins were coexpressed, and their measured viral DNA, mRNA, and protein levels were found to be consistent with models of baculovirus transcription and translation (
23). As a notable aside, it was found that experimental uncertainty associated with estimating the MOI propagated exponentially in the calculation of viral DNA templates. To minimize these effects, experimental measures of early processes (e.g., binding and entry) were not used in the estimation of parameters. Instead, parameters for viral DNA replication, transcription, and translation were estimated from levels of viral DNA, mRNA, and protein measured at least 5 h after the start of infection.
Recent studies on baculovirus production have returned to the classic problem of producing high virus yields when host insect cells are at high density. A useful approach has been to explore how cell physiological changes, particularly changes that are coupled with the transition from exponential-phase cell growth to stationary-phase cell behavior, might limit the availability of resources that are essential for virus production. Global metabolic changes associated with such transitions can be estimated by combining measurements of metabolite changes with known metabolic networks and their analysis. Methods for analysis of metabolic fluxes have become increasingly standardized (
121). Application of these approaches to Sf9 insect cells at high density or after infection by baculovirus suggested a depletion of intermediates within the tricarboxylic acid (TCA) cycle (
122). More specifically, carbon fluxes through glycolysis and the TCA cycle were found to decrease as cell density increased, causing sharp drops in ATP production and availability of metabolic energy. Virus infections were found to have similar effects on cell metabolism, highlighting the depletion of ATP, the central currency for metabolic energy, as a key factor linking high cell density with drops in virus production. Further analysis supporting this result indicated that the drop in productivity of viruses at high cell density could not be attributed to the depletion of essential nutrients or the accumulation of inhibitory by-products. To address limitations of metabolic energy on virus production, the cell culture medium was supplemented with key depleted intermediates. Specifically, addition of pyruvate or α-ketoglutarate at the time of infection resulted in higher virus yields (up to 7-fold) during high-cell-density culture (
123). In this case, metabolic flux analysis showed a strong correlation between the net rate of ATP formation and the generation of redox equivalents in the form of NADH.
HCV
HCV is an enveloped positive-strand RNA virus that can cause chronic liver disease, which can lead to cirrhosis (scarring and impaired function of the liver) and liver cancer. About 130 million individuals worldwide are chronically infected with HCV, and no vaccine against HCV infection currently exists. HCV has been very challenging to study experimentally, owing in part to the lack of a laboratory system for culturing the virus. A replicon system has enabled the study of HCV genome replication in a specific liver cancer cell line (
127) and served as the basis for an initial kinetic model of HCV replication (
128). With guidance by the structure of the phage Qβ model, this HCV replicon model accounted for reactions that synthesize genomic and antigenomic RNA strands, a ribosome-genome complex to synthesize the viral polyprotein, cleavage of the polyprotein to produce the viral polymerase, and intermediates in the replication process formed by the polymerase and genomic and antigenomic templates. The model also allowed for spatial compartmentalization, with translation occurring in the cytoplasm, replication occurring within a vesicular membrane structure (VMS), and the viral genomic template and polymerase able to move between the cytoplasm and the VMS. Like the models for other positive-sense RNA viruses, e.g., phage Qβ and poliovirus, an imbalance in the rates of HCV replication favors production of HCV genomes over antigenomes, at a 10-to-1 ratio. Further, the model was used to explore the role of replication compartmentalization in the VMS. Specifically, could observed steady-state levels of HCV RNA be attained without the VMS? This hypothetical question was addressed by simplifying the model to allow both protein translation and genome replication to occur in the cytoplasm. The single-compartment model showed that one could attain steady-state RNA levels that were consistent with experimental results. However, the corresponding predicted levels of the viral polymerase (NS5B) were 4 orders of magnitude lower than observed levels. To address this discrepancy, ribosome levels were increased, but this adjustment then caused the steady-state RNA levels to move significantly out of the observed range. In short, this analysis indicated that the VMS may plausibly serve to restrain viral amplification and perhaps limit associated host cell damage.
To explore antiviral strategies against HCV, Mishchenko and colleagues developed an HCV replicon model which included reactions to target specific HCV functions (
129). The model was based on much of the biology of the Dahari model (
128) but also included inhibitors of the HCV protease (NS3), the HCV polymerase (NS5B), and a host protein (hVAP-33) that is essential for assembly of the HCV replication complex. By simulating the effects of drugs of different potencies on the steady-state level of HCV genomic RNA, this model showed that direct targeting of the polymerase or the host factor would have a greater inhibitory effect on viral replication than that of targeting the HCV protease. Moreover, testing of combined treatments that target both the protease and the polymerase showed no enhancement of inhibition, at least for weak inhibitors of these functions.
Taking a similar approach to his modeling of hepatitis B virus, Nakabayashi developed an intracellular kinetic model for HCV. As with HBV, he found two major patterns of replication, one arrested and one explosive, depending on the distribution of replication resources (
130).
HSV-1
The global prevalence of adult carriers of herpes simplex virus type 1 (HSV-1) was estimated to be 67% in 2012 (
131), reflecting the stability and efficient transmission of a virus with major impacts on public health. HSV-1, which encodes at least 84 proteins in productively infected cells (
132), expresses its genes based on their timing, which can be divided into three classes: immediate early, early, and late (
133). To better understand the role of regulatory feedbacks on the temporal order of gene expression, genome replication, and protein synthesis in the production of virus progeny, Nakabayashi and Sasaki developed an intracellular kinetic model (
18). An initial version of the model that accounted for viral DNA genomes, mRNAs from each of the three classes of expression, their translation to produce their corresponding proteins, viral assembly, and degradation rates of all nucleic acid, protein, and virus species was simplified by lumping processes of transcription and translation and neglecting rates of species degradation. Analysis of the simplified model revealed two modes of growth, either explosive or arrested, whose characteristics were similarly investigated by Nakabayashi et al. in the models of hepatitis B and hepatitis C. In the case of HSV-1, the explosive or arrested growth behavior depended on the relative expression of early versus late gene products. Higher expression levels of early gene products supported genome replication and a positive-feedback loop that explosively amplified viral genomes, while greater expression of late gene products yielded more envelope and structural proteins that depleted free genomes by packaging them into virus progeny particles. Under conditions for explosive growth, where genome replication is favored over genome packaging, one could estimate a waiting time for appearance of the explosive growth, which depended inversely on the initial level (or dose) of viral genomes. In short, higher initial levels of viral infection could result in shorter waiting times for explosive growth. The work further considered scenarios in which a virus with arrested growth behavior could accumulate mutations in early or late gene promoters that shift the balance of gene expression in favor of explosive growth, effectively suggesting how diverse wait times for explosive growth may arise. 
Although the model did not explicitly address issues or mechanisms of viral latency—when the viral genome is maintained in a nonreplicative state but poised to transition into a lytic or productive growth state—it is plausible that analysis of the transition from arrested to explosive growth may offer insights into the transition from latency to lytic growth. Finally, the simplified model was extended to account for potential limited intracellular resources needed for synthesis of viral DNA, RNA, and proteins. Such models, combined with advances in our mechanistic understanding of viral genes, may help to elucidate how cellular resources are distributed to viral functions over the course of infection (
134).
ACKNOWLEDGMENTS
We are indebted to the outstanding graduate students, postdocs, and undergrads in the Yin lab, who over the last 25 years have contributed their ideas and hard work toward our appreciation and understanding of viruses. We thank Paul Ahlquist, Udo Reichl, and Ophelia Venturelli for many thoughtful comments and suggestions on the manuscript.
We gratefully acknowledge support over the years from the National Science Foundation (grants BES-0087939, BES-0331337, BES-9896067, EF0313214, EIA-0130874, and EIA-0331337), the National Institutes of Health (grants AI077296, AI071197, AI091646, AI104317, T32-GM08349, and T32-HG002760), the National Library of Medicine (grant T15LM007359), the Office of Naval Research (grant N00014-98-1-0226), the Texas-Wisconsin Modeling and Control Consortium, Merck Research Laboratories, the Wisconsin Alumni Research Foundation, the Wisconsin Institute for Discovery, and the University of Wisconsin-Madison (Graduate School, Office of the Vice Chancellor for Research and Graduate Education, and the William F. Vilas Trust Estate).