Although bacteriophages dominate marine viral communities, research on marine phage genomes is still in its infancy. Currently, only one marine phage genome has been sequenced (
24), whereas more than 100 nonmarine phage or prophage genomes have been sequenced. The study of the genome of roseophage SIO1, a lytic phage of the heterotrophic marine bacterium
Roseobacter, suggested that marine and nonmarine phages are genetically related but that their basic life histories may differ significantly (
24). The rapidly increasing genomic data on bacteriophage have led to some new findings on phage evolution. It has been proposed that double-stranded DNA phage and prophage are mosaics that arose by horizontal gene transfer of genetic material from a global phage pool (
14). A later study based on the analysis of six prophages suggested the existence of two modes of genetic evolution, depending on the phage infectivity (
8). Chopin et al. (
8) proposed that lysogenic phages follow the mode suggested by Hendrix et al. (
14), whereas lytic phages would not exchange DNA outside their group.
In order to better understand the biological properties of marine and nonmarine phages and explore potential functional linkage between cyanophage and cyanobacteria, the whole genome of cyanophage P60, which infects marine Synechococcus spp., was sequenced.
Nucleotide sequence accession number
The complete nucleotide sequence of cyanophage P60 has been deposited in GenBank under the single accession no. AF338467.
The genome size of P60 was 47,872 bp. Eighty ORFs were identified on the P60 genome, and 19 of them could be assigned putative functions (Fig.
2). All of these ORFs began with an ATG start codon and were predicted on that basis together with a plausible Shine-Dalgarno sequence upstream of the start codon. In general, the ORFs of P60 were most similar to those found in bacteriophages T3, T7, phi-YeO3-12, and SIO1 (Table
1). However, ORFs 26, 27, 53, and 67 of P60 shared strikingly high similarity (>50% amino acid sequence identity) with sequences found in marine
Synechococcus and
Prochlorococcus strains (Table
1). These ORFs corresponded to the genes that code for ribonucleoside triphosphate reductase A and B, thymidylate synthase, and an unidentified protein, respectively.
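The ORF criteria described above (an ATG start codon plus a plausible Shine-Dalgarno sequence upstream) can be illustrated with a minimal Python sketch. This is not the procedure or software used for P60; the function name `find_orfs`, the minimum length, the 20-nt upstream window, and the simplified motif pattern are all assumptions chosen for illustration, and only the forward strand is scanned.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumed, simplified Shine-Dalgarno-like core motifs; real sites vary.
SD_MOTIF = re.compile(r"AGGA|GGAG|GAGG")


def find_orfs(seq, min_codons=30, sd_window=20):
    """Scan the forward strand for ATG-initiated ORFs.

    Returns a list of (start, end, has_sd) tuples, where start/end are
    0-based, end-exclusive coordinates that include the stop codon, and
    has_sd reports whether an SD-like motif lies within sd_window
    nucleotides upstream of the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon to the first in-frame stop codon.
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no stop codon left in this frame
            n_codons = (j - i) // 3  # ATG included, stop codon excluded
            if n_codons >= min_codons:
                upstream = seq[max(0, i - sd_window):i]
                orfs.append((i, j + 3, bool(SD_MOTIF.search(upstream))))
            i = j + 3  # resume after the stop codon
    return orfs
```

A real annotation would also scan the reverse complement, consider alternative start codons, and weigh the spacing between the motif and the start codon; the sketch only shows how the two stated criteria combine.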
The genomic arrangements of four different podoviruses (P60, T7, phi-YeO3-12, and SIO1) were compared (Fig.
3). The genes responsible for DNA replication (e.g., primase-helicase and DNA polymerase [DNA
pol]) appeared to be conserved among all these phages. In terms of gene organization, cyanophage P60 was more similar to phages T7 and phi-YeO3-12 than to roseophage SIO1. However, the genes involved in ribonucleoside triphosphate reduction were conserved only between marine cyanophage P60 and marine roseophage SIO1 (Fig.
3). The T7 gene classes I, II, and III were defined by Dunn and Studier in 1983 (
11). Briefly, the class I genes are transcribed by host RNA polymerase and include functions to overcome host restriction; the class II genes are next to be expressed and are responsible for phage DNA replication and metabolism; the class III genes are the last to be expressed and mainly include the structural genes for maturation and packaging of phage DNA.
The phylogenetic relationship constructed based on the DNA
pol from 13 different strains of podoviruses demonstrated that the two marine podoviruses (cyanophage P60 and roseophage SIO1) were not necessarily more closely related to each other than to nonmarine podoviruses (Fig.
4). In fact, cyanophage P60 was more closely related to coliphages T3, T7, and phi-YeO3-12 than to roseophage SIO1 (Fig.
4). Phages that infect the same or closely related hosts appeared to be more closely related. For example, coliphages T3, T7, and phi-YeO3-12 were closely related and the
Bacillus phages B103, M2, PZA, phi29, and GA-1 were also clustered together (Fig.
4). According to the DNA
pol phylogeny, podoviruses can be divided into two major clusters, A and B. Interestingly, the genome sizes of podoviruses in cluster A were nearly twice as large as those in cluster B. The genome sizes of these podoviruses are compared in Table
2.
Although phages are a group of highly diverse viruses, morphology, genome size, and host relatedness could provide important clues to their evolution and taxonomy. Our data suggested that T7-like podoviruses with a genome size of about 40 kb share very similar genomic arrangements. Phylogenetic analysis based on conserved DNA
pol further supported the kinship of T7-like phages. The phi29-like podoviruses (phages B103, BS32, GA-1, M2, Nf, phi15, phi29, and PZA) have very similar morphologies and genome sizes (∼20 kb) and were shown to have evolved from a common ancestor on the basis of genomic comparison (
21). In general, viruses that infect the same host or closely related hosts have similar morphologies and are evolutionarily close. For example, the capsid assembly gene was conserved in cyanomyoviruses that infect marine
Synechococcus (
13) and several head and tail genes were found to be conserved among T4-type phages that infect enterobacteria (
30). The large double-stranded DNA viruses that infect eukaryotic microalgae were also shown to be closely related on the basis of the viral DNA
pol gene (
7). The evolution of phages is likely to depend more on the host than on the environment. The two marine phages SIO1 and P60 were not the most closely related strains in terms of genomic structure: in genomic arrangement, P60 is more similar to T7 and phi-YeO3-12 than to SIO1. The RNA polymerase, an essential component of the T3 and T7 life cycles, was not identified in the SIO1 genome (
24).
Genomic sequences of phages provide many new insights into the biology and ecology of phages. Currently, the genomes of 109 bacteriophages have been sequenced and deposited in GenBank. Genomic sequences from
Myoviridae (10 strains),
Podoviridae (10 strains),
Siphoviridae (42 strains),
Inoviridae (16 strains),
Leviviridae (8 strains),
Microviridae (10 strains), and several viruses that infect archaea are available to the public on the National Center for Biotechnology Information website (
http://www.ncbi.nlm.nih.gov:80/PMGifs/Genomes/phg.html). According to the phage genomic database, 2 of 10 myovirus genomes and 3 of 42 siphovirus genomes belong to lytic phages. Lysogenic phages are the dominant form among the known myovirus and siphovirus genomes. Interestingly, the gene that codes for the integrase was found in all lysogenic siphovirus and myovirus genomes, suggesting that these phages are able to integrate their genomes into the host genome and establish lysogeny. A majority of bacteriophages isolated from marine environments were myoviruses and siphoviruses (
18,
28,
33,
34). More than 40% of the bacterial isolates contained inducible prophage, and the percentage of lysogenic bacteria was higher in oligotrophic environments than in coastal or estuarine environments (
16). It will be interesting to estimate what proportion of marine phages contain the integrase gene. Eight of nine podoviral genomes (all except that of phage P22) contained the DNA
pol gene and other genes associated with DNA replication (e.g., primase and helicase). These phages are known to be lytic. Although P22 is morphologically a member of
Podoviridae, many studies have suggested that it is a member of the lambdoid family (
31). Phage lambda (
Siphoviridae) is a typical lysogenic phage. Most of the lysogenic phage genomes in
Siphoviridae and
Myoviridae did not contain the DNA replication genes (i.e., primase and DNA
pol).
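The genome counts quoted above can be turned into proportions with a short worked example. The sketch below uses only the numbers stated in this paragraph; the variable and function names are illustrative, and the remainder is simply whatever the six listed families do not account for out of the 109 genomes.

```python
from fractions import Fraction

# Genome counts per family, as quoted in the text.
family_counts = {
    "Myoviridae": 10,
    "Podoviridae": 10,
    "Siphoviridae": 42,
    "Inoviridae": 16,
    "Leviviridae": 8,
    "Microviridae": 10,
}
# Genomes from lytic phages, as quoted in the text.
lytic_counts = {"Myoviridae": 2, "Siphoviridae": 3}
TOTAL_GENOMES = 109


def lytic_fraction(family):
    """Exact fraction of a family's sequenced genomes that are lytic."""
    return Fraction(lytic_counts[family], family_counts[family])


def as_percent(fraction):
    """Convert a Fraction to a percentage rounded to one decimal place."""
    return round(100 * float(fraction), 1)


listed = sum(family_counts.values())
remainder = TOTAL_GENOMES - listed  # genomes outside the six listed families

for family in lytic_counts:
    print(family, lytic_fraction(family), as_percent(lytic_fraction(family)))
print("listed:", listed, "remainder:", remainder)
```

On these counts, lytic phages account for one-fifth of the sequenced myovirus genomes and one-fourteenth of the siphovirus genomes, which is the basis for describing lysogenic phages as the dominant form in both families.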
The evolution of lytic podoviruses could be less influenced by genetic exchange between phage and host. A conserved DNA
pol gene contained in the viral genome could be an important inherited feature for lytic bacteriophages. The replication of lysogenic phage would be more host dependent than that of lytic phage. The genetic diversity of lysogenic phage should be much higher than that of lytic phage due to higher frequencies of lateral genetic exchange between lysogenic phage and the host. Numerous cases of horizontal gene transfer were observed among lambda-like phages upon genomic comparison (
4). According to a recent study of
Lactococcus prophages, genomic similarity of lysogenic phages is much lower than that of lytic phages (
8). Chopin et al. (
8) further suggested that the frequencies of horizontal genetic exchange are lower among lytic phages than among lysogenic phages. Our study suggested that the evolution of the DNA replication machinery of lytic podoviruses is more independent of the host than is that of lysogenic phages. This is consistent with the view that acute viruses tend not to show phylogenetic congruence with their hosts (
32).
Genetic exchanges between cyanophage P60 and marine cyanobacteria occurred at the sites that code for ribonucleoside triphosphate reductase A and B, thymidylate synthase, and an unidentified protein. It is not clear why P60 maintains extensive similarity with marine cyanobacteria for genes involved in nucleic acid metabolism. Such homologues were also found between phage T4 and
Escherichia coli. Marine roseophage SIO1 also contains these proteins that exhibit higher similarity to bacteria than to other phages (
24). Ribonucleotide reductases are the key enzymes that convert ribonucleotides to deoxyribonucleotides, which are the immediate precursors of DNA (
23). With ribonucleotide reductase and thymidylate synthase, the rate of DNA synthesis of T4 could be increased 10-fold compared to the system without these enzymes (
19). Although host DNA degraded by phage-encoded nucleases can be incorporated into the DNA of the progeny phages, it is believed that the great bulk of the deoxynucleoside triphosphates for T4 DNA synthesis comes from de novo synthesis catalyzed by phage-encoded proteins like ribonucleotide reductase and thymidylate synthase (
19). Perhaps acquisition of these DNA metabolism genes gave rise to rapidly growing lytic phages like P60, SIO1, and T4. Again, these genes were not common in lysogenic phages. Another example of genetic exchange between lytic phage and host is the phage-encoded PhoH, a host-borne protein typically induced under conditions of phosphate starvation. The gene for this protein was found in the lytic phage SIO1, which infects marine
Roseobacter sp. (
24) but not in marine lytic phage P60.
Although no single universal genetic marker was found for all of the double-stranded DNA phages, the phylogenetic diversity of viruses in natural environments could be explored within defined groups on the basis of their infection mechanism, morphology, genome size, and host linkage. The DNA
pol gene has been proven to be a suitable genetic marker for examining the evolutionary relationship between algal viruses and other large double-stranded DNA viruses (
5-
7). Our study suggested that the DNA
pol gene could also be used as a marker molecule to study the phylogenetic relationship or diversity of podoviruses. Recently, a partial DNA
pol gene sequence was obtained from cyanophage φ12, another podovirus which infects marine
Synechococcus WH8017 (
33). The nucleotide sequence (GenBank accession no. AY063486) of the partial DNA
pol from cyanophage φ12 was 97.5% identical to that of cyanophage P60, suggesting that specific PCR primers could be designed at least for the podoviruses of marine
Synechococcus spp.
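A percent identity figure such as the one quoted above is obtained by comparing two aligned sequences position by position. The following Python sketch shows one common way to compute it; the function name `percent_identity` and the choice to skip gapped columns are assumptions, and the convention actually used for the P60 and φ12 comparison is not stated here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped, so the
    denominator is the number of ungapped aligned positions. Other
    conventions (e.g., dividing by the full alignment length) exist and
    give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared
```

The input must already be aligned (for example, by a pairwise alignment tool); the function does not perform the alignment itself.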
Although podoviruses are not the dominant form among the phage isolates, they could represent a unique group of viruses that are important in terms of controlling bacterial mortality in natural environments due to their superinfectivity and high host specificity. Podoviruses that infect marine
Synechococcus spp. were found to be more host specific than myoviruses, the dominant form of cyanophage isolates (
18,
28,
33). However, the DNA
pol gene did not appear to be conserved among myophages and siphophages. In our laboratory, more than six sets of degenerate primers have been designed based on the conserved regions of known DNA
pol genes (families A and B) and used to amplify the gene target from 35 cyanophage isolates. In most cases, nonspecific products were amplified and the sequence data did not match DNA
pol. In a recent study, primers based on T4 sequences also failed to amplify the DNA
pol gene from the T4-type phages (
30).
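Degenerate primers of the kind mentioned above are usually written with IUPAC ambiguity codes, and their degeneracy (the number of distinct sequences in the primer pool) grows multiplicatively with each ambiguous position. The Python sketch below illustrates this; the example primer in the usage note is hypothetical and is not one of the primer sets designed in this study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]
```

For a hypothetical primer such as `GAYACNTGG`, `degeneracy` multiplies the two choices at `Y` by the four choices at `N`, and `expand` lists each resulting sequence. High degeneracy lowers the effective concentration of any single matching primer, which is one reason highly degenerate primer sets tend to yield nonspecific products.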
In order to investigate genetic diversity of myophages or siphophages in natural environments, one should probably consider viral structural genes as probes. For example, the capsid assembly gene (g20) of myoviruses infecting marine
Synechococcus spp. was found to be conserved (
13) and the specific primers based on the g20 genes have been used to compare the genetic diversity of this group of cyanophages in natural environments (
36,
37). Moreover, several major tail genes (i.e., genes 18 and 19) and the capsid gene (gene 23) were found to be conserved within T4-type phages and suitable for phylogenetic analysis (
30). It was also found that the sequences of the head assembly proteins were conserved between several phiC31-like siphoviruses (prophages) that were isolated from evolutionarily diverse hosts (
25). These prophages were proposed to share a common head assembly mechanism (
25). Furthermore, the lysogeny-related genes were found to be conserved among lysogenic siphoviruses from an evolutionarily related branch of low-GC-content gram-positive bacteria (
9).
There is no doubt that viral communities in aquatic environments are much more complex than their morphologies alone suggest. The extent of viral diversity in marine environments is still largely unknown. More phage genomes from aquatic environments should be explored in order to better understand the evolutionary history and the biological and ecological functions of marine viruses.