Making the Most of “Omics” for Symbiosis Research

Department of Entomology, Comstock Hall, Cornell University, Ithaca, New York 14853

Abstract

Omics, including genomics, proteomics, and metabolomics, enable us to explain symbioses in terms of the underlying molecules and their interactions. The central task is to transform molecular catalogs of genes, metabolites, etc., into a dynamic understanding of symbiosis function. We review four exemplars of omics studies that achieve this goal, through defined biological questions relating to metabolic integration and regulation of animal-microbial symbioses, the genetic autonomy of bacterial symbionts, and symbiotic protection of animal hosts from pathogens. As omic datasets become increasingly complex, computationally sophisticated downstream analyses are essential to reveal interactions not evident from visual inspection of the data. We discuss two approaches, phylogenomics and transcriptional clustering, that can divide the primary output of omics studies—long lists of factors—into manageable subsets, and we describe how they have been applied to analyze large datasets and generate testable hypotheses.

Introduction

In symbioses, the irreducible complexity of an organism is compounded by persistent symbiotic interactions with one, two, or many phylogenetically different organisms, each of which is adapted to function in the context of its partner or partners. Until recently, the molecular basis of symbiosis could be studied only on the basis of one or a few genes and their products at a time. For example, in their research on interactions between a Legionella-like bacterium and its host Amoeba proteus, Jeon and colleagues (Jeon and Jeon, 2003) were able to correlate the reduced expression of a single host gene product, S-adenosyl methionine synthetase (SAMS), with the rapid evolution from a pathogenic to a mutualistic relationship. It is very likely that this dramatic evolutionary transition involved multiple coevolved changes in the metabolic and regulatory networks of the two organisms, but a systematic analysis of these putative changes accompanying the change in host SAMS expression was technically unrealistic at that time. Today, just 9 years later, the association in A. proteus and other fascinating symbioses can be interrogated by a range of high-throughput methods that reveal the total (or near-total) complement of a particular class of biological molecules: genes, transcripts, proteins, lipids, metabolites, etc.

The “omics” revolution of the last decade has transformed our capacity to understand symbioses at the molecular level. It is now possible, for example, to construct an inventory of the genes coded by each partner, to quantify patterns of transcription under different environmental conditions, to establish the relationship between transcript and protein abundance for every protein-coding gene, and to determine the metabolite set that makes up the metabolic pool of the interacting symbiotic partners. Symbiosis researchers have adopted these techniques with great alacrity, providing many novel insights. In the first part of this review, we have selected four studies as exemplars of how to “make the most” of omics approaches for symbiosis research. The common theme of these studies is that the omics approaches with the greatest impact are driven by important biological questions.

Nevertheless, omics biology brings challenges as well as opportunities. The minimal output of omics is lists of genes, proteins, metabolites, etc., that are a partial or near-complete molecular catalog of an organism or symbiosis. To use omics methods to answer biological questions requires great care in experimental design and interpretation. In the second part of this review, we discuss informatics routes that can help the researcher to make the most of omics data for symbiosis research, especially in relation to genomic and transcriptomic approaches.

Exemplars That “Make the Most” of Omics Approaches in Symbiosis Research

In this section, we describe four sets of experiments, each conducted on a single type of animal-microbial symbiosis. In their very different ways, using each of genomic, transcriptomic, proteomic, and metabolomic data, these studies illustrate how omics approaches can be applied to answer specific questions in symbiosis research.

Inferring metabolic interactions from bacterial genome sequences

The complete genome sequence of an organism defines its biological capabilities, but our capacity to interpret genome sequence data depends on the quality of gene annotation. The large proportion of conserved hypothetical genes (i.e., genes predicted in silico but without evidence of expression in vivo) and lineage-specific genes of unknown function in most sequenced genomes demonstrate the limitations to our capacity to interpret genomic data. Arguably, the simplest genomes to interpret are the smallest; and these are found among the bacterial symbionts of insects. The insight into symbioses that can be obtained from genomics is illustrated by recent studies on the genome sequences of bacterial endosymbionts of insects (McCutcheon and Moran, 2012).

Symbioses involving bacteria with very small genomes (<1 Mb) have evolved independently in multiple insect groups with the common trait of feeding through the lifecycle on nutritionally poor or unbalanced diets. Examples include blood-feeding insects (e.g., bedbugs, lice, tsetse flies), plant-sap-feeding insects (e.g., whitefly, aphids, cicadas), and scavengers, such as the cockroaches. The inference that these symbioses have a nutritional basis (Buchner, 1965) has been confirmed amply by modern nutritional and physiological studies indicating that these insects derive specific nutrients, including essential amino acids and vitamins, from their microbial symbionts (Douglas, 2009). The bacteria are intracellular and restricted to one cell type, generically known as the bacteriocyte, which apparently functions exclusively to house and maintain the symbiosis. The bacteria are vertically transmitted, usually directly from the bacteriocyte to the cytoplasm of the eggs in the female ovary (Buchner, 1965). The small genome size of the bacterial symbionts is interpreted as the evolutionary consequence of obligate vertical transmission. Gene loss can be attributed to relaxed selection on genes not required in the symbiosis and to genomic decay, caused by the small effective population size of the vertically transmitted bacteria and resultant accumulation of mildly deleterious mutations (Moran, 1996); in addition, these genomes appear not to be susceptible to horizontally acquired genes.

Our exemplar of genomics in symbiosis research is a set of papers by McCutcheon and Moran describing the genomes of the symbiotic bacteria in three related groups of xylem-feeding insects: the Cercopidae (spittlebugs), Cicadoidae (cicadas), and Cicadellinae (sharpshooters) (McCutcheon and Moran, 2007, 2010; McCutcheon et al., 2009). All three insect groups bear two morphologically distinct bacterial symbionts: a common primary symbiont (Sulcia muelleri) and a distinct auxiliary symbiont (Fig. 1). Plant xylem sap lacks the 10 essential amino acids that animals cannot synthesize but require for protein synthesis. Genomic inspection of the primary and auxiliary symbionts revealed that Sulcia has the genetic capacity to synthesize either 7 (in spittlebugs) or 8 (in cicadas and sharpshooters) essential amino acids, and that the auxiliary symbionts encode the biosynthetic pathways for the remaining essential amino acids; thus, the various auxiliary symbionts and the cohabiting Sulcia have perfect complementarity in their genetic capacity for essential amino acid synthesis. These studies demonstrate how the genome of each bacterial symbiont is shaped by coevolutionary interactions with symbiotic partners. Furthermore, they generate very specific predictions about the three-way transfer of multiple metabolites (including essential amino acids) among the host and symbionts.

Figure 1.

Complementary genetic capacity for essential amino acid synthesis by the primary symbiont (Sulcia) and the secondary symbiont in three groups of xylem-feeding insects. Solid bars, genes for biosynthetic pathway present on genome. Data collated from McCutcheon et al. (2009); McCutcheon and Moran (2007 and 2010).
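The notion of "perfect complementarity" can be stated as a simple set check: the two symbionts together cover all 10 essential amino acids, and no pathway is encoded twice. The following is a minimal sketch under stated assumptions. The function name `complementarity` and the toy inventories are hypothetical; the 8 + 2 split is chosen for illustration only, and the cited papers should be consulted for the actual pathway assignments.

```python
# The ten essential amino acids that animals cannot synthesize.
ESSENTIAL = frozenset({
    "Arg", "His", "Ile", "Leu", "Lys", "Met", "Phe", "Thr", "Trp", "Val",
})


def complementarity(primary, auxiliary, required=ESSENTIAL):
    """Summarize how two symbiont pathway inventories cover a required set."""
    primary, auxiliary = set(primary), set(auxiliary)
    covered = primary | auxiliary
    redundant = primary & auxiliary & required
    return {
        # Pathways that neither symbiont encodes.
        "missing": sorted(required - covered),
        # Pathways encoded by both symbionts.
        "redundant": sorted(redundant),
        # Perfect complementarity: full coverage with no overlap.
        "perfect": covered >= required and not redundant,
    }


# Hypothetical inventories, for illustration only (an 8 + 2 split).
primary_toy = ESSENTIAL - {"His", "Met"}
auxiliary_toy = {"His", "Met"}

result = complementarity(primary_toy, auxiliary_toy)
print(result)
```

Applied to real annotations, a non-empty "missing" list would point to pathways that must be supplied by the host or the diet, and a non-empty "redundant" list to pathways retained by both symbionts.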

As these studies illustrate, metabolic interactions in symbioses can be inferred by visual inspection of genomic data. Nevertheless, the metabolic networks are inherently complex, even in bacteria with reduced genomes, and metabolic modeling based on the inventory of metabolism genes offers a valuable route to identify and quantify the nutritional resources utilized by the symbiotic bacteria and their metabolic adaptations for the release of specific nutrients to the host. These methods have been applied with success, for example, to the endosymbionts of aphids (Thomas et al., 2009; MacDonald et al., 2011), sharpshooters (Cottret et al., 2010), and cockroaches (Gonzalez-Domenech et al., 2012). This approach is ideally suited to bacteria with much reduced genomes, in which all metabolism-related genes are expressed. For many bacteria, however, the metabolic phenotype under any one set of conditions is underpinned by a subset of the metabolism-related genes. For these bacteria, it is essential to complement the genomic information with gene expression data.

Transcriptomics and the regulation of symbiotic bacteria

Our exemplar of a transcriptomics study is the analysis by Wier et al. (2010) of gene expression patterns in the symbiosis between the bobtail squid Euprymna scolopes and the luminescent bacterium Vibrio fischeri. This study concerns a central issue in symbiosis research: the regulation of microbial symbionts, specifically the mechanisms by which the numbers and location of symbionts in an animal host are controlled. Valuable insights have come from studying how symbioses respond to environmental perturbations, such as changes in temperature or nutrient supply. E. scolopes juvenile squid are born devoid of Vibrio bacteria but acquire them by passing water over the entrance to a specialized ventral squid organ called the light organ. Through a series of selective events (Nyholm and McFall-Ngai, 2004), V. fischeri alone gain access to the organ, where they grow to high densities (> 10⁸ CFU/ml; Boettcher and Ruby, 1990) at night and generate light, providing camouflage by counterillumination for their host (Jones and Nishiguchi, 2004). At dawn, more than 90% of the bacterial cells are expelled from the light organ, and the residual bacterial population subsequently proliferates to regenerate the dense, luminescent population by nightfall. Thus, the squid symbiosis has a particular experimental advantage in that the symbiosis is perturbed naturally on a daily basis, yielding regulatory changes in the symbiosis that are highly reproducible across individuals and over time.

Fluctuations in the light organ bacterial population are associated with dramatic reorganization of host tissues and gene expression. The greatest changes in gene expression occurred at dawn, the time of symbiont expulsion. The expression of more than 50 host genes with annotated cytoskeletal function and various symbiont genes mediating anaerobic respiration of glycerol was elevated uniquely at this time. Simultaneously, the microvilli of the apical membrane were lost, with associated membrane blebbing. The inference that the bacteria utilized host lipids derived from the membrane reorganization was supported by the close similarity in fatty acid composition of host tissues and symbiont cells. Through the subsequent day, the symbiosis re-assembled. The apical microvilli on the host epithelial cells generated anew, and the residual symbionts initiated cell growth and division, processes respectively orchestrated by changes in the expression of the host cytoskeletal genes and by a shift in expression of the symbiont metabolism genes from anaerobic respiration of glycerol to the fermentation of chitin.

Experimental design was key to the success of this analysis. Importantly, the transcriptomes of host and symbiont were analyzed in parallel, enabling the interactions between the gene expression patterns of the partners to be analyzed. Furthermore, the host samples exclusively comprised the core of the light organ, including the epithelial cells that interact directly with the symbiotic bacteria, minimizing the incidence of host transcriptional responses that vary over the diel cycle for reasons unrelated to the symbiosis. In addition, because the spatial and temporal interactions between host and symbiont were already well characterized, it was possible to link gene expression patterns to specific symbiotic phenomena, such as bacterial expulsion, diel variation in bacterial resource acquisition patterns, and ultrastructural host cell membrane reorganization.

Proteomes and symbiont autonomy

Proteomics, the global analysis of proteins, is technically more demanding and costly than transcriptomics; it is less sensitive than transcriptomics; and it requires a protein sequence database derived from the genome or extensive cDNA libraries, preferably of the same species. For these multiple reasons, transcriptomics is widely adopted as the method of choice to study the expression of protein-coding genes, on the assumption that transcript and protein abundance are correlated. In broad terms, this assumption is justified. Multiple studies have demonstrated a moderate, positive correlation between transcript and protein abundance. Even so, the slope of the relationship is significantly less than unity, suggesting that the proteins contributing to a cell, tissue, or organism vary less in abundance than transcripts (Bonaldi et al., 2008; Sun et al., 2010). An additional complication for time-course studies is that the temporal pattern of transcript and protein abundance can differ for a single gene, and that the pattern of this difference can vary widely among genes (Wang et al., 2010a). Where the focus of interest is protein-coding genes and post-transcriptional regulation is known (or anticipated) to be important, proteomics is the method of choice. Proteomics is also essential for large-scale analysis of the spatial distribution of proteins within cells and among organisms in symbioses.
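The statement that the slope of the transcript-protein relationship is less than unity can be made concrete with an ordinary least-squares fit on log-transformed abundances. The sketch below is illustrative only: the helper `ols_slope` and the five paired values are hypothetical toy data, not measurements from the cited studies.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical log2 abundances for five genes (toy data).
log2_transcript = [0.0, 2.0, 4.0, 6.0, 8.0]
log2_protein = [1.0, 2.0, 3.0, 4.0, 5.0]

slope = ols_slope(log2_transcript, log2_protein)
print(f"slope = {slope:.2f}")
```

In this toy case the transcripts span 8 log2 units while the proteins span only 4, so the fitted slope is below 1: the protein abundances are compressed relative to the transcript abundances.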

Our exemplar of a proteomics study concerns the transfer of proteins between host and symbiont and relates, specifically, to the question of whether intracellular symbionts with very small genomes are analogous to bacterial-derived organelles. For example, animal proteins of nucleocytoplasmic origin are targeted to mitochondria, subsidizing the limited functional capabilities coded by the mitochondrial genome. The focus of the study is a vertically transmitted symbiont housed in bacteriocytes of an insect (as described above in relation to sharpshooters and their allies): specifically, the symbiotic bacterium Buchnera aphidicola in the pea aphid Acyrthosiphon pisum. The Buchnera genome is just 0.64 Mb, less than 20% of the genome of the related bacterium Escherichia coli, and it lacks genes for metabolic functions that are also coded by the aphid genome (IAGC, 2010). To investigate whether host proteins, including metabolic enzymes, are transported to the Buchnera cells, Poliakov and colleagues (2011) conducted a quantitative analysis of the proteome of multiple aphid samples in which the Buchnera cells were progressively enriched: from the whole insect body, through isolated host cells and partially purified Buchnera cells, to Buchnera cells purified on a Percoll gradient. Those proteins that co-purify with the Buchnera cells through the enrichment series are predicted to be associated with the Buchnera cells (Fig. 2). Overall, more than 1900 aphid proteins and 400 Buchnera proteins were detected. Cluster analysis revealed that proteins coded by the Buchnera genome were selectively enriched in Buchnera cells, with no evidence for selective transfer of any proteins in either direction between host and symbiont. This study indicates that metabolic integration between the partners is mediated by the transfer of small metabolites, and not proteins, and generates the specific hypothesis that certain metabolic pathways are shared between the host and symbiont, with the transfer of intermediate metabolites between the partners.
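The logic of the enrichment series can be sketched as a rule applied to each protein's abundance profile across the four fractions. Note that this is a deliberately simplified stand-in: the study used cluster analysis, whereas the sketch below uses a plain monotonic-increase rule. The function name `copurifies`, the `min_fold` threshold, and all profile values are hypothetical.

```python
# Order of fractions in the enrichment series, from least to most enriched.
FRACTIONS = ["whole_body", "host_cells", "partially_purified", "percoll_purified"]


def copurifies(profile, min_fold=2.0):
    """True if relative abundance never drops along the series
    and rises at least min_fold from the first to the last fraction."""
    non_decreasing = all(b >= a for a, b in zip(profile, profile[1:]))
    return (
        non_decreasing
        and profile[0] > 0
        and profile[-1] / profile[0] >= min_fold
    )


# Hypothetical relative abundances, one value per fraction (toy data).
toy_profiles = {
    "symbiont_protein_A": [1.0, 3.0, 8.0, 12.0],  # enriched with the symbiont
    "host_protein_B": [5.0, 4.0, 1.0, 0.2],       # depleted during purification
    "host_protein_C": [2.0, 2.5, 2.2, 1.0],       # no consistent trend
}

enriched = sorted(name for name, p in toy_profiles.items() if copurifies(p))
print(enriched)
```

A host-encoded protein that passed such a rule would be a candidate for transfer to the symbiont; the study reported no evidence for such proteins.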

Figure 2.

Quantitative proteomic analysis of tissue fractions of the pea aphid-Buchnera symbiosis identifies proteins coded by the Buchnera genome enriched in Buchnera cells by hierarchical clustering (red, enriched; green, depleted). Data from Poliakov et al. (2011).

The genome sequence of the pea aphid host is congruent with these results. As for all other eukaryotes, the genome includes various genes that can be assigned to the bacterial ancestor of the mitochondrion and code for proteins that are targeted to the mitochondrion. By contrast, the only genetic material of likely Buchnera origin is two highly truncated pseudogenes (ΨDnaE and ΨAtpH) (Nikoh et al., 2010). It appears that genome reduction in Buchnera has not involved the net transfer of intact genes to the host nucleus.

The conclusion that Buchnera lacks genetic subsidy by the host, a cardinal feature of a bacterial-derived organelle, raises the question of whether other insect endosymbionts with genomes even smaller than Buchnera and comparable to mitochondria and plastids are also genetically autonomous. Quantitative proteomics, as conducted for Buchnera, can resolve this issue. We should not presume that all symbioses involving bacteria with small genomes have solved the functional problems posed by genomic erosion in the same way.

Metabolomics and symbiotic protection against pathogens

The metabolome, that is, the global set of metabolites in a biological system, differs fundamentally from the genome, transcriptome, and proteome in that it cannot be deduced directly from the genome sequence. Metabolomics poses a unique set of challenges. In particular, different techniques are required to analyze different classes of metabolites; and many of the metabolites detected by mass spectrometry, NMR spectroscopy, and related methods cannot realistically be identified. For some purposes, important information can be gleaned from analysis of the metabolite differences between samples (e.g., hosts bearing and lacking symbionts) without substantial investment in identification of the compounds. This type of metabolite analysis is often known as metabolite fingerprinting, or metabonomics. For other experimental designs, where identification is crucial, access to high-quality spectral library databases is essential (Tohge and Fernie, 2009).

Of particular interest for symbiosis research is the study of Fukuda et al. (2011), which used metabolomics to pinpoint a single metabolite that conferred resistance against a pathogenic bacterium. Their system comprised mice, symbiotic bacteria of the genus Bifidobacterium that colonize the mouse colon, and the pathogen E. coli strain O157, which produces the proteinaceous Shiga toxin. The authors demonstrated that when mice bearing either the gut bacterium Bifidobacterium longum or B. adolescentis were infected with the pathogenic E. coli O157, the O157 cells proliferated, but only the mice with B. adolescentis died. They hypothesized that metabolites released by B. longum were important in mediating protection against O157. Their ¹H-¹³C NMR metabolomic study revealed striking differences in the sugar profiles of feces produced by mice bearing the two Bifidobacterium species. This metabolomic analysis set the authors onto the scientific trail to identify the active compound. Recognizing that Bifidobacteria ferment sugars to short-chain fatty acids, they then demonstrated that the concentration of one such fatty acid, acetic acid, was significantly elevated in the feces of mice bearing B. longum; and that acetic acid enhanced the barrier function of the colon epithelial cells, such that the translocation of O157 cells and the Shiga toxin across the epithelium was inhibited. The elevated production of acetic acid by B. longum could be linked to its expression of genes for fructose transporters and high rates of fructose uptake in vitro.

The study of Fukuda et al. (2011) provides an important lesson in omics: that omics methods are a discovery tool that can be used to construct testable hypotheses. Although omics lend themselves to cataloging the molecular composition of living organisms, their power is most evident when applied to answering defined biological questions. This vital point is the basis for the following section of this review: a consideration of approaches that have been used to gain useful information from the large dataset outputs of omics studies.

Making the Most of Omics Approaches

As considered in the Introduction, the crude output of an omics study is a catalog of the genes, proteins, or small metabolites that constitute the biological sample studied. Visual inspection of the data is very important for understanding the results, but the datasets are often so large and complex that supplementary computational methods are essential for full interpretation of the data. For example, these approaches can protect against inadvertent “cherry-picking” of data that conform to preconceived expectations while ignoring potentially important genes or metabolites with no known function or functions apparently unrelated to the experimental treatment. The burgeoning field of systems biology offers a great diversity of strategies and tools to analyze omics datasets. We will discuss two approaches—phylogenomics and regulatory gene network discovery—and focus on their use in inferring gene function by identifying genes of unknown function that have similar distribution patterns or patterns of expression as genes of known function.

Phylogenomics allocates genes according to their evolutionary history, with the rationale that genes with a similar evolutionary history will cluster according to function (Srinivasan et al., 2005). This technique is particularly valuable to generate candidate functions for genes lacking functional annotation. Gene evolutionary history is predicted by creating a coinheritance matrix of all the proteins in the genome sequence of interest (rows) against a library of genomes (for example, all of the genome sequences available on NCBI; one column per genome). From this matrix, a similarity matrix is created that clusters proteins according to the similarity of their coinheritance pattern. The similarity matrix is then transformed into a 2-D plot with each point representing a protein in the input genome sequence. Clusters of points tend to represent shared functions (e.g., as measured by mapping with gene ontology categories), and the role of genes with unknown function can be predicted from their co-clustering with genes of known function. Additional information can be obtained where particular functions, as identified by gene ontology, are over-represented in a gene cluster, although functional inference can be complicated for large clusters (comprising hundreds of genes) and clusters in which multiple gene ontology categories are represented.
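The steps described above (coinheritance matrix, similarity matrix, grouping of proteins with similar profiles) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of the cited studies: it uses Jaccard similarity and a simple threshold-based single-linkage grouping in place of the 2-D projection, and the protein names, profiles, and the 0.75 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two presence/absence profiles (sequences of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def cluster_profiles(profiles, threshold=0.75):
    """Single-linkage grouping: two proteins share a cluster if they are
    linked by a chain of pairwise similarities >= threshold."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for p, q in combinations(profiles, 2):
        if jaccard(profiles[p], profiles[q]) >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy coinheritance matrix: rows are proteins of the focal genome,
# columns are reference genomes (1 = homolog present, 0 = absent).
toy_matrix = {
    "known_toxin": (1, 1, 0, 0, 1, 0),
    "unknown_1": (1, 1, 0, 0, 1, 0),
    "known_ribosomal": (1, 1, 1, 1, 1, 1),
    "unknown_2": (1, 1, 1, 1, 1, 0),
}

clusters = cluster_profiles(toy_matrix)
print(clusters)
```

In this toy example each protein of unknown function groups with a protein of known function, which is the basis for proposing a candidate function by co-clustering.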

Phylogenomics is particularly well-suited for functional inference of genes in bacteria because many sequenced bacterial genomes are available to support the analysis. Phylogenomics has been applied recently in combination with filtering approaches to identify putative symbiosis-related genes of Xenorhabdus nematophila and X. bovienii, bacterial symbionts of entomopathogenic nematodes (Chaston et al., 2011). Steinernema nematode species carry specific Xenorhabdus species at the anterior of the nematode intestine in a specialized structure called the receptacle (Poinar, 1966; Wouts, 1980; Bird and Akhurst, 1983; Flores-Lara et al., 2007). The nematodes actively seek out and penetrate soil-dwelling insect hosts. Once inside the insect, the symbiotic bacteria are released and kill the insect host via immune system suppression and production of potent effectors, and the bacteria provide nutrition to the nematode, which reproduces through several generations inside the insect (Kaya and Gaugler, 1993; Forst et al., 1997; Goodrich-Blair and Clarke, 2007). When nutrients are spent, the nematodes acquire a complement of colonizing bacteria and leave the nutrient-depleted cadaver in search of a new insect host (reviewed in Richards and Goodrich-Blair, 2009). The X. nematophila and X. bovienii genomes were studied to identify candidate genes that contribute to the maintenance of this symbiosis. X. nematophila genes that were conserved in other Enterobacteriaceae, or that were specific to only one of the two Xenorhabdus species, were discarded, and the remaining genes were divided into two groups on the basis of their conservation in other bacterial symbionts of entomopathogenic nematodes (Fig. 3). Each of the two gene groups was analyzed by phylogenomics separately, resulting in assignment of 533 genes to 24 clusters (12% of the X. nematophila genome). To focus on clusters containing genes with predicted symbiotic functions, the genes from each cluster were assessed for enrichment in proteins found in host-associated microbes (Fig. 3E). Inferred symbiosis clusters included genes that function in toxin production and secretion and in resistance to heavy metal stress. This analysis offered a rich, but manageable, catalog of 221 genes (5% of the total gene complement) with candidate symbiotic function for empirical analysis that could not have been achieved by visual inspection of the Xenorhabdus genome sequences alone.

Figure 3.

Use of phylogenomics to identify candidate symbiosis-related genes in the bacterium Xenorhabdus nematophila (Enterobacteriaceae), which associates with entomopathogenic nematodes of the genus Steinernema. (A) The pool of 4299 coding genes in the X. nematophila genome was reduced by (B) subtracting all 1275 genes absent from the congeneric symbiont X. bovienii and (C) subtracting 2491 nonsymbiotic genes (shared with the free-living Enterobacteriaceae Salmonella enterica Typhimurium LT2, and Escherichia coli K12). (D) The remaining 533 genes were divided as “nematode symbiosis-conserved” if they were shared in two con-familial Photorhabdus species that are also nematode symbionts (290 genes), or specific to the nematode host Steinernema (“Steinernema-conserved”) (243 genes) if they were absent in the Photorhabdus species. (E) Phylogenomic analyses performed on the two gene groups identified 15 and 9 clusters, respectively, of which 6 and 4 clusters were enriched in genes shared in plant and animal symbionts in the NCBI database (identified by custom metadata mining), yielding 170 and 51 genes for final analysis. (F) Schematic of the phylogenetic relationship between Xenorhabdus and other bacteria that provided reference genomes used in the study, all of which are related in the Family Enterobacteriaceae. Data from Chaston et al. (2011).
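The filtering in steps A through D of Figure 3 amounts to a sequence of set operations on gene families. The following sketch assumes, hypothetically, that each gene has already been assigned an ortholog identifier so that sharing between genomes can be tested by set membership; the function name `filter_candidates` and the toy identifiers are illustrative. The final assertions only check that the counts given in the caption are internally consistent.

```python
def filter_candidates(focal, congener, free_living, other_symbionts):
    """Apply the subtractive filters of Figure 3 (steps B to D) to sets of
    ortholog identifiers and return the two candidate gene groups."""
    # Step B: keep only genes also present in the congeneric symbiont.
    shared_with_congener = focal & congener
    # Step C: discard genes shared with free-living relatives.
    candidates = shared_with_congener - free_living
    # Step D: split by presence in other nematode symbionts.
    symbiosis_conserved = candidates & other_symbionts
    host_specific = candidates - other_symbionts
    return symbiosis_conserved, host_specific


# Toy ortholog identifiers (hypothetical).
focal = {"g1", "g2", "g3", "g4", "g5", "g6"}
congener = {"g1", "g2", "g3", "g4", "g5"}
free_living = {"g1", "g2"}
other_symbionts = {"g3", "g9"}

conserved, specific = filter_candidates(focal, congener, free_living, other_symbionts)
print(sorted(conserved), sorted(specific))

# Counts reported in the Figure 3 caption are internally consistent.
assert 4299 - 1275 - 2491 == 533 == 290 + 243
assert 170 + 51 == 221
```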

A valuable approach for interpreting transcriptome data is provided by transcriptional networks. Specifically, computational models can be applied to generate gene networks that identify functional groups of genes responding to the same regulatory factors. Gene network construction is a particularly useful tool for predicting gene function because genes of unknown and known functions allocated to the same regulatory cluster are inferred to function in similar regulatory hierarchies and respond to similar stresses and stimuli. One approach for gene network creation is to identify direct interactions between genes by using Gaussian graphical models (GGMs) (Dobra, 2004; Schäfer and Strimmer, 2005). GGMs use partial correlations of transcriptional data as a metric to identify direct interactions of two genes. Standard (i.e., Pearson) correlations are not suitable because they do not discriminate correlations resulting from other factors such as indirect gene-gene interactions or regulation by a common gene. One area of current focus is to increase the predictive power of the causality of identified interactions; that is, to discriminate causal from reactive interactions (Schadt et al., 2005), by integrating transcriptomic data with other omics approaches that identify expression quantitative trait loci (eQTLs: loci that cause changes in gene transcription across individuals) or protein-protein interactions (Zhu et al., 2008). However, networks constructed only from transcriptomic data produce connectivity profiles similar to the networks created by integrating multiple omics methods, albeit with reduced ability to infer causality (e.g., Zhu et al., 2008). Thus, gene network creation can still be applied to microarray and RNAseq datasets, even when researchers may not have access to extensive strain libraries for eQTL mapping, or to other relevant omics information such as protein-protein interaction data.
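The difference between Pearson and partial correlation can be illustrated with three genes, two of which are driven by a common regulator. The sketch below uses the standard first-order partial correlation formula, which for three variables coincides with the full-order partial correlation that a GGM estimates; real GGM analyses of many genes rely on (regularized) estimates of the inverse covariance matrix instead. The expression values are hypothetical toy data constructed so that the two target genes share no direct association.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy data: a regulator plus two mutually orthogonal noise terms.
regulator = [1, -1, 1, -1, 1, -1, 1, -1]
noise_a = [1, 1, -1, -1, 1, 1, -1, -1]
noise_b = [1, 1, 1, 1, -1, -1, -1, -1]

# Both target genes respond to the regulator but not to each other.
gene_a = [2 * r + e for r, e in zip(regulator, noise_a)]
gene_b = [2 * r + e for r, e in zip(regulator, noise_b)]

r_ab = pearson(gene_a, gene_b)
r_ab_given_reg = partial_corr(gene_a, gene_b, regulator)
print(f"Pearson: {r_ab:.2f}, partial given regulator: {r_ab_given_reg:.2f}")
```

Here the Pearson correlation between the two target genes is high even though they do not interact, whereas the partial correlation conditioned on the regulator is essentially zero, so a network built on partial correlations would not draw a direct edge between them.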

To date, the symbiosis community has apparently made little use of transcriptional network analysis, but we anticipate that this will change rapidly in the next few years. One recent study that does apply gene networks to the study of symbiosis investigated the transcriptional responses of humans to probiotic Lactobacillus (van Baarlen et al., 2011). The same human subjects were separately exposed to each of three probiotic Lactobacillus species. The microarray data were highly variable across individuals: transcriptional responses were more similar for different treatments on the same individual than for the same treatments on different individuals. Nevertheless, network analysis of the transcriptomes identified a number of networks (e.g., blood pressure regulation, wound healing) that responded to the different Lactobacillus species, and certain networks (e.g., ion homeostasis) that responded to more than one of the probiotic bacteria. Although the actual transcriptional levels varied among individuals across the study, the network responses appeared to be conserved, allowing interrogation of what otherwise seemed to be a near-uninterpretable dataset.

Although the regulatory networks described in van Baarlen et al. (2011) were not created from transcriptomic data (instead, networks were created by combing the literature for experimental data), they demonstrate the power of gene network discovery for analysis of large transcriptomic datasets in symbiotic systems. Network creation from transcriptomic data has recently been used to study mammalian gastrointestinal symbiotic systems (Shulzhenko et al., 2011; Greenblum et al., 2012), and has also been used successfully, sometimes with experimental verification, with transcriptomic data in a variety of organisms (Guan et al., 2008; Lee et al., 2008; Ayroles et al., 2009; Chang et al., 2010; Logsdon and Mezey, 2010).

Outlook

Omics approaches are taking our understanding of symbioses to a new level of molecular sophistication. Indeed, the monumental efforts to define and characterize human-associated microbial communities by initiatives such as the Human Microbiome Project are attainable only by implementation of omics methods. Some types of interactions, such as regulation of the nutritional status and cell proliferation patterns of the host (Buchon et al., 2009; Shin et al., 2011), make intuitive sense, but interpreting other results of omics experiments will require further research on the function of certain gene classes. For example, one-third of the proteins that differ in abundance between pea aphids bearing and experimentally deprived of their Buchnera bacteria are cuticular proteins (Wang et al., 2010b), a result that cannot be related to any currently known aphid-Buchnera interaction. As this result illustrates, omics experiments have great potential to spur efforts to understand molecular function in symbiosis.

From a symbiotic perspective, gene classes of particular potential interest are those that respond specifically to perturbation of the symbiosis and have no known function. Conserved genes of this class may represent the deep history that defines the predisposition of animals for symbioses with microorganisms, while recently evolved, lineage-specific genes may underpin the unique functions of individual associations. Elucidation of these patterns will contribute not only to our understanding of symbiosis, but also to the resolution of central problems posed by conserved and lineage-specific genes of no known function.

Acknowledgments

We thank Tomas Lazo for the drawing of the aphid in Figure 2. This work was supported by NIH grant 1R01GM095372-01, NSF grant IOS-0919765, and the Sarkaria Institute for Insect Physiology and Toxicology.

Literature Cited

  • Ayroles, J. F., M. A. Carbone, E. A. Stone, K. W. Jordan, R. F. Lyman, M. M. Magwire, S. M. Rollmann, L. H. Duncan, F. Lawrence, R. R. Anholt et al. 2009. Systems genetics of complex traits in Drosophila melanogaster. Nat. Genet. 41: 299–307.

  • Bird, A. F., and R. J. Akhurst. 1983. The nature of the intestinal vesicle in nematodes of the family Steinernematidae. Int. J. Parasitol. 13: 599–606.

  • Boettcher, K. J., and E. G. Ruby. 1990. Depressed light emission by symbiotic Vibrio fischeri of the sepiolid squid Euprymna scolopes. J. Bacteriol. 172: 3701–3706.

  • Bonaldi, T., T. Straub, J. Cox, C. Kumar, P. B. Becker, and M. Mann. 2008. Combined use of RNAi and quantitative proteomics to study gene function in Drosophila. Mol. Cell 31: 762–772.

  • Buchner, P. 1965. Endosymbioses of Animals with Plant Microorganisms. John Wiley, Chichester, UK.

  • Buchon, N., N. A. Broderick, S. Chakrabarti, and B. Lemaitre. 2009. Invasive and indigenous microbiota impact intestinal stem cell activity through multiple pathways in Drosophila. Genes Dev. 23: 2333–2344.

  • Chang, X., S. Liu, Y. T. Yu, Y. X. Li, and Y. Y. Li. 2010. Identifying modules of coexpressed transcript units and their organization of Saccharopolyspora erythraea from time series gene expression profiles. PLoS One 5: e12126.

  • Chaston, J. M., G. Suen, S. L. Tucker, A. W. Andersen, A. Bhasin, E. Bode, H. B. Bode, A. O. Brachmann, C. E. Cowles, K. N. Cowles et al. 2011. The entomopathogenic bacterial endosymbionts Xenorhabdus and Photorhabdus: convergent lifestyles from divergent genomes. PLoS One 6: e27909.

  • Cottret, L., P. V. Milreu, V. Acuna, A. Marchetti-Spaccamela, L. Stougie, H. Charles, and M. F. Sagot. 2010. Graph-based analysis of the metabolic exchanges between two co-resident intracellular symbionts, Baumannia cicadellinicola and Sulcia muelleri, with their insect host, Homalodisca coagulata. PLoS Comput. Biol. 6: e1000904.

  • Dobra, A. 2004. Sparse graphical models for exploring gene expression data. J. Multivar. Anal. 90: 196–212.

  • Douglas, A. E. 2009. The microbial dimension in insect nutritional ecology. Funct. Ecol. 23: 38–47.

  • Flores-Lara, Y., D. Renneckar, S. Forst, H. Goodrich-Blair, and P. Stock. 2007. Influence of nematode age and culture conditions on morphological and physiological parameters in the bacterial vesicle of Steinernema carpocapsae (Nematoda: Steinernematidae). J. Invertebr. Pathol. 95: 110–118.

  • Forst, S., B. Dowds, N. Boemare, and E. Stackebrandt. 1997. Xenorhabdus and Photorhabdus spp.: bugs that kill bugs. Annu. Rev. Microbiol. 51: 47–72.

  • Fukuda, S., H. Toh, K. Hase, K. Oshima, Y. Nakanishi, K. Yoshimura, T. Tobe, J. M. Clarke, D. L. Topping, T. Suzuki et al. 2011. Bifidobacteria can protect from enteropathogenic infection through production of acetate. Nature 469: 543–547.

  • Gonzalez-Domenech, C. M., E. Belda, R. Patino-Navarrete, A. Moya, J. Pereto, and A. Latorre. 2012. Metabolic stasis in an ancient symbiosis: genome-scale metabolic networks from two Blattabacterium cuenoti strains, primary endosymbionts of cockroaches. BMC Microbiol. 12: S5.

  • Goodrich-Blair, H., and D. J. Clarke. 2007. Mutualism and pathogenesis in Xenorhabdus and Photorhabdus: two roads to the same destination. Mol. Microbiol. 64: 260–268.

  • Greenblum, S., P. J. Turnbaugh, and E. Borenstein. 2012. Metagenomic systems biology of the human gut microbiome reveals topological shifts associated with obesity and inflammatory bowel disease. Proc. Natl. Acad. Sci. USA 109: 594–599.

  • Guan, Y., C. L. Myers, R. Lu, I. R. Lemischka, C. J. Bult, and O. G. Troyanskaya. 2008. A genomewide functional network for the laboratory mouse. PLoS Comput. Biol. 4: e1000165.

  • IAGC. International Aphid Genomics Consortium. 2010. Genome sequence of the pea aphid Acyrthosiphon pisum. PLoS Biol. 8: e1000313.

  • Jeon, T. J., and K. W. Jeon. 2003. Characterization of sams genes of Amoeba proteus and the endosymbiotic X-bacteria. J. Eukaryot. Microbiol. 50: 61–69.

  • Jones, B. W., and M. K. Nishiguchi. 2004. Counterillumination in the Hawaiian bobtail squid Euprymna scolopes Berry (Mollusca: Cephalopoda). Mar. Biol. 144: 1151–1155.

  • Kaya, H. K., and R. Gaugler. 1993. Entomopathogenic nematodes. Annu. Rev. Entomol. 38: 181–206.

  • Lee, I., B. Lehner, B. Crombie, W. Wong, A. G. Fraser, and E. M. Marcotte. 2008. A single gene network accurately predicts phenotypic effects of gene perturbation in Caenorhabditis elegans. Nat. Genet. 40: 181–188.

  • Logsdon, B. A., and J. Mezey. 2010. Gene expression network reconstruction by convex feature selection when incorporating genetic perturbations. PLoS Comput. Biol. 6: e1001014.

  • MacDonald, S. J., G. H. Thomas, and A. E. Douglas. 2011. Genetic and metabolic determinants of nutritional phenotype in an insect-bacterial symbiosis. Mol. Ecol. 20: 2073–2084.

  • McCutcheon, J. P., and N. A. Moran. 2007. Parallel genomic evolution and metabolic interdependence in an ancient symbiosis. Proc. Natl. Acad. Sci. USA 104: 19392–19397.

  • McCutcheon, J. P., and N. A. Moran. 2010. Functional convergence in reduced genomes of bacterial symbionts spanning 200 My of evolution. Genome Biol. Evol. 2: 708–718.

  • McCutcheon, J. P., and N. A. Moran. 2012. Extreme genome reduction in symbiotic bacteria. Nat. Rev. Microbiol. 10: 13–26.

  • McCutcheon, J. P., B. R. McDonald, and N. A. Moran. 2009. Convergent evolution of metabolic roles in bacterial co-symbionts of insects. Proc. Natl. Acad. Sci. USA 106: 15394–15399.

  • Moran, N. A. 1996. Accelerated evolution and Muller’s rachet in endosymbiotic bacteria. Proc. Natl. Acad. Sci. USA 93: 2873–2878.

  • Nikoh, N., J. P. McCutcheon, T. Kudo, S. Y. Miyagishima, N. A. Moran, and A. Nakabachi. 2010. Bacterial genes in the aphid genome: absence of functional gene transfer from Buchnera to its host. PLoS Genet. 6: e1000827.

  • Nyholm, S. V., and M. J. McFall-Ngai. 2004. The winnowing: establishing the squid-vibrio symbiosis. Nat. Rev. Microbiol. 2: 632–642.

  • Poinar, G. O. 1966. The presence of Achromobacter nematophilus in the infective stage of a Neoaplectana sp. (Steinernematidae: Nematoda). Nematologica 12: 105–108.

  • Poliakov, A., C. W. Russell, L. Ponnala, H. J. Hoops, Q. Sun, A. E. Douglas, and K. J. van Wijk. 2011. Large-scale label-free quantitative proteomics of the pea aphid-Buchnera symbiosis. Mol. Cell. Proteomics 10: M110.007039.

  • Richards, G. R., and H. Goodrich-Blair. 2009. Masters of conquest and pillage: Xenorhabdus nematophila global regulators control transitions from virulence to nutrient acquisition. Cell. Microbiol. 11: 1025–1033.

  • Schadt, E., J. Lamb, X. Yang, J. Zhu, S. Edwards, D. Guhathakura, S. K. Sieberts et al. 2005. An integrative genomics approach to infer causal associations between gene expression and disease. Nat. Genet. 37: 710–717.

  • Schäfer, J., and K. Strimmer. 2005. Learning large-scale graphical Gaussian models from genomic data. Pp. 263–276 in Science of Complex Networks: From Biology to the Internet and WWW, J. F. F. Mendes, S. N. Dorogovtsev, A. Povolotsky, F. V. Abreu, and J. G. Oliveira, eds. CNET 2004, 29 August – 2 September 2004, Aveiro, Portugal. AIP Conference Proceedings 776, American Institute of Physics, Melville, NY.

  • Shin, S. C., S. H. Kim, Y. You, B. Kim, A. C. Kim, K. A. Lee, J. H. Yoon, J. H. Ryu, and W. J. Lee. 2011. Drosophila microbiome modulates host developmental and metabolic homeostasis via insulin signaling. Science 334: 670–674.

  • Shulzhenko, N., A. Morgun, W. Hsiao, M. Battle, M. Yao, O. Gavrilova, M. Orandle, L. Mayer, A. J. Macpherson, K. D. McCoy, C. Fraser-Liggett, and P. Matzinger. 2011. Crosstalk between B lymphocytes, microbiota and the intestinal epithelium governs immunity versus metabolism in the gut. Nat. Med. 17: 1585–1593.

  • Srinivasan, B. S., N. B. Caberoy, G. Suen, R. G. Taylor, R. Shah, F. Tengra, B. S. Goldman, A. G. Garza, and R. D. Welch. 2005. Functional genome annotation through phylogenomic mapping. Nat. Biotechnol. 23: 691–698.

  • Sun, N., C. Pan, S. Nickell, M. Mann, W. Baumeister, and I. Nagy. 2010. Quantitative proteome and transcriptome analysis of the archaeon Thermoplasma acidophilum cultured under aerobic and anaerobic conditions. J. Proteome Res. 9: 4839–4850.

  • Thomas, G. H., J. Zucker, S. J. Macdonald, A. Sorokin, I. Goryanin, and A. E. Douglas. 2009. A fragile metabolic network adapted for cooperation in the symbiotic bacterium Buchnera aphidicola. BMC Syst. Biol. 3: 24.

  • Tohge, T., and A. R. Fernie. 2009. Web-based resources for mass-spectrometry-based metabolomics: a user’s guide. Phytochemistry 70: 450–456.

  • van Baarlen, P., F. Troost, C. van der Meer, G. Hooiveld, M. Boekschoten, R. J. Brummer, and M. Kleerebezem. 2011. Human mucosal in vivo transcriptome responses to three lactobacilli indicate how probiotics may modulate human cellular pathways. Proc. Natl. Acad. Sci. USA 108 Suppl 1: 4562–4569.

  • Wang, H., Q. Wang, U. J. Pape, B. Shen, J. Huang, B. Wu, and X. Li. 2010a. Systematic investigation of global coordination among mRNA and protein in cellular society. BMC Genomics 11: 364.

  • Wang, Y., J. C. Carolan, F. Hao, J. K. Nicholson, T. L. Wilkinson, and A. E. Douglas. 2010b. Integrated metabonomic-proteomic analysis of an insect-bacterial symbiotic system. J. Proteome Res. 9: 1257–1267.

  • Wier, A. M., S. V. Nyholm, M. J. Mandel, R. P. Massengo-Tiasse, A. L. Schaefer, I. Koroleva, S. Splinter-Bondurant, B. Brown, L. Manzella, E. Snir et al. 2010. Transcriptional patterns in both host and bacterium underlie a daily rhythm of anatomical and metabolic change in a beneficial symbiosis. Proc. Natl. Acad. Sci. USA 107: 2259–2264.

  • Wouts, W. M. 1980. Biology, life cycle and redescription of Neoaplectana bibionis Bovien, 1937 (Nematoda; Steinernematidae). J. Nematol. 12: 62–72.

  • Zhu, J., B. Zhang, E. N. Smith, B. Drees, R. B. Brem, L. Kruglyak, R. E. Bumgarner, and E. E. Schadt. 2008. Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks. Nat. Genet. 40: 854–861.