
Abstract

We present the analysis of the evolution of tumors in a case of hepatocellular carcinoma. This case is particularly informative about cancer growth dynamics and the underlying driving mutations. We sampled nine different sections from three tumors and seven more sections from the adjacent nontumor tissues. Selected sections were subjected to exon as well as whole-genome sequencing. Putative somatic mutations were then individually validated across all 9 tumor and 7 nontumor sections. Among the mutations validated, 24 were amino acid changes; in addition, 22 large indels/copy number variants (>1 Mb) were detected. These somatic mutations define four evolutionary lineages among tumor cells. Separate evolution and expansion of these lineages were recent and rapid, each apparently having only one lineage-specific protein-coding mutation. Hence, by using a cell-population genetic definition, this approach identified three coding changes (CCNG1, P62, and an indel/fusion gene) as tumor driver mutations. These three mutations, affecting cell cycle control and apoptosis, are functionally distinct from mutations that accumulated earlier, many of which are involved in inflammation/immunity or cell anchoring. These distinct functions of mutations at different stages may reflect the genetic interactions underlying tumor growth.
Tumorigenesis is generally believed to be the consequence of mutation accumulation, including single nucleotide substitutions, structural variations, and epigenetic changes, in somatic cells (1). A typical cancer may have thousands of somatic mutations, of which 10–100 may be in coding regions (2–7). A central issue in cancer genomics is then the dynamics of tumor growth in relation to the accumulation of these mutations. Given any individual case of cancer, the questions are hence: (i) how many adaptive mutations drive the tumor growth; (ii) how strongly each mutation drives the growth; and (iii) what their molecular nature is vis-à-vis that of the background mutations. To answer these questions, we treat each tumor as a population of cells and apply population genetic principles to infer adaptive mutations (8).
Cancer mutations are often divided into drivers and passengers (9). Driver mutations are those that contribute directly to tumorigenesis and their identification is crucial for understanding the molecular biology of cancers. An important issue is how driver mutations should be defined operationally. Candidate driver mutation in the literature often refers to coding changes in genes that are commonly mutated, for example, in multiple cases of hepatocellular carcinoma (HCC). Adaptive mutation proposed here is an alternative definition of candidate driver mutation, inferred from the dynamics of cell proliferation in its natural setting within a single patient.
In this report, we analyze a case of HCC, the fifth most common cancer worldwide, by such an approach. We regard HCC as particularly favorable for identifying candidate driver mutations for several reasons. First, liver resections from the surgery usually contain high yields of DNA from hepatocytes. Second, liver tissues regenerate, resulting in active cell turnover and an opportunity for a more clonal genealogical pattern. Third, previous studies including our own (10) suggest that different cases of HCC may exhibit a wide range of evolutionary dynamics, as their pathologies and anatomies vary extensively. Some of these cases should have the growth rate and pattern conducive for isolating the small number of adaptive mutations.

Results

Sequencing and Mutation Detection.

The subject of this study was a female patient with chronic hepatitis B virus (HBV) infection, diagnosed with HCC at the age of 35. A pedunculate tumor (labeled “primary” in Fig. 1) was removed in the first surgery. This primary tumor was grade II to III HCC with prominent clear cell components. Fifteen months later, HCC recurrences were detected and the patient received a second surgery. Recurrent tumor 1 occurred in the regenerated liver at the site of the initial resection and a smaller recurrent tumor 2 was also identified at a second, nearby site. The case report and informed consent are presented in SI Materials and Methods A1.
Fig. 1.
The scheme of sampling from the HCC liver. The resected portion containing the primary tumor is drawn outside of the liver, as indicated by the dotted lines. From the primary tumor, one large section (T0, >50 mm³) and six small sections (T1–T6, <5 mm³, shown as dots) were taken. Six small sections (N1–N6) were also taken from the adjacent nontumor tissues. The recurrent tumors were detected and operated on 15 mo after the first surgery. The larger recurrent tumor resides in the regenerated portion of the liver and a section (R1, >50 mm³) was taken from it, as well as a section (N0) from the adjacent nontumor tissue. Another section (R2) was taken from the smaller recurrent tumor. For more information, please see Materials and Methods.
The locations and sizes of patient samples are summarized in Fig. 1. In total, nine sample sections from the three tumors (T0–T6 from the primary tumor and R1/R2 from the two recurrent tumors) and seven sample sections from the adjacent nontumor tissues (N0, N1–N6; Fig. 1) were obtained. Examination of the pathological anatomy indicated that the proportion of hepatoma cells in the tumor sections was 70–90%. This estimation is corroborated by the sequencing results presented below.
The samples of T0, R1, R2, and N0 were subjected to exon capture and Next Generation Sequencing (NGS) to a depth of 50–60×. In addition, R1 and N0 were subjected to whole-genome sequencing, which yielded a 20× coverage of uniquely mapped reads (SI Materials and Methods A4 and Results B1). For R1 and N0, coverage of exon sequences thus reached an average depth of 70×. We chose the R1 section, instead of T0, for whole-genome sequencing, because the primary tumor and recurrent tumors occurred in the same region of the liver. Given the progression of events, we expected R1 to carry all T0 mutations as well as a few additional ones. This is indeed the case from the exon-capture data (SI Materials and Methods A3). Because all T0 mutations are represented by the R1 data, R1 cells are most likely the direct descendants of the primary tumor. Finally, the choice of nontumor liver sections (N0–N6), vis-à-vis a nonliver tissue, as a reference for inferring tumor-specific mutations has somewhat different consequences (SI Materials and Methods A3).
These sequencing data were used to select sites of somatic mutations for validation. The detailed procedures including site selection, validation accuracy, and false-positive and false-negative estimation are presented in SI Materials and Methods A5.1 and Fig. S1. In brief, sites were chosen when the frequency of a candidate mutation was higher than a cutoff (often, but not always, at 30%) in the R1 or R2 section and zero in the normal section (referred to as T > N sites). The cutoff was chosen to include even marginal candidate sites so true sites would not be missed. False positives could then be screened out by validations. We allowed higher false positives than usual, obtaining an average validation rate of 50%. Validation was performed for all nine tumor and seven nontumor sections (Dataset S1). All T > N sites were subjected to Sequenom validation (MassARRAY MALDI-TOF MS system) and about one-half were subjected to further validation by PCR-NGS sequencing to an average depth of >8,000×. The validated mutation frequencies by Sequenom and PCR-NGS sequencing are in good agreement with the correlation coefficient ranging between 0.86 and 0.89 (SI Results B2 and Dataset S1). In nontumor sections, mutant frequencies at T > N sites were too low to measure accurately by Sequenom; hence, only PCR-NGS data were used (SI Results B2).
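The T > N selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout, the function name select_t_gt_n_sites, and the example frequencies are hypothetical, and the 30% default is the cutoff that was used often, but not always.

```python
from typing import Dict, List


def select_t_gt_n_sites(sites: List[Dict], cutoff: float = 0.30) -> List[Dict]:
    """Keep candidate sites whose mutant frequency exceeds the cutoff in
    R1 or R2 and is zero in the nontumor section N0 (the "T > N" rule)."""
    selected = []
    for site in sites:
        above_cutoff_in_tumor = site["R1"] > cutoff or site["R2"] > cutoff
        absent_in_normal = site["N0"] == 0.0
        if above_cutoff_in_tumor and absent_in_normal:
            selected.append(site)
    return selected


# Hypothetical candidate sites; the frequencies are invented for illustration.
candidates = [
    {"id": "site_a", "N0": 0.00, "R1": 0.35, "R2": 0.10},
    {"id": "site_b", "N0": 0.05, "R1": 0.40, "R2": 0.45},
    {"id": "site_c", "N0": 0.00, "R1": 0.10, "R2": 0.20},
    {"id": "site_d", "N0": 0.00, "R1": 0.05, "R2": 0.33},
]
chosen = [s["id"] for s in select_t_gt_n_sites(candidates)]
print(chosen)  # ['site_a', 'site_d']
```

A permissive rule of this kind deliberately admits marginal candidates; the subsequent Sequenom and PCR-NGS validation, not the filter itself, is what removes false positives.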
Table 1.
Coding genes affected by somatic mutations in tumors
Gene name | Amino acid change¹ | Mutation frequencies in (N0, R1, R2, T0)² | Mutation effect³ | Description
Mutations polymorphic among tumors (M3 and M4 are foreground mutations; M1 and M2 are background mutations subsequently deleted by Δ5q)
TMEM173 (M1) | 276 Q->* | 0.00, 0.00, 0.36, 0.02 | | STING (stimulator of IFN genes); inflammation/defense/immunity
ANKHD1 (M2) | 689 P->R | 0.00, 0.00, 0.34, 0.01 | D | A scaffolding protein affecting leukemia-cell phenotype
CCNG1 (M3) | 15 H->N | 0.00, 0.00, 0.45, 0.00 | d | Cell cycle, G2/M arrest, a target of P53
P62 (M4) | 258 D->G | 0.00, 0.30, 0.00, 0.00 | D | Sequestosome-1; autophagy, apoptosis
Mutations in high frequencies in all tumor sections (background mutations)
TP53 | 151 frame-shift | 0.00, 0.56, 0.80, 0.72 | D | Tumor suppressor gene
DUOX2 | 519 F->S | 0.00, 0.08, 0.28, 0.19 | D | Inflammation/defense/immunity
CYSLTR1 | 140 G->C | 0.01, 0.31, 0.43, 0.34 | d | Inflammation/defense/immunity
CYBB | 156 A->V | 0.00, 0.23, 0.38, 0.34 | 0 | Inflammation/defense/immunity
PON3 | 248 E->* | 0.00, 0.29, 0.43, 0.34 | | Hydrolyze lactone, inhibit oxidation, inflammation/defense
COL1A2 | 253 G->D | 0.00, 0.16, 0.22, 0.19 | | Extracellular matrix, cell anchorage
COL4A6 | 497 P->A | 0.00, 0.54, 0.60, 0.57 | | Extracellular matrix, cell anchorage
PIGF | 19 frame-shift | 0.00, 0.16, 0.20, 0.19 | | Cell anchorage (frequency estimates less reliable)
NUP205 | 1771 S->I | 0.00, 0.27, 0.34, 0.29 | d |
ATP13A3 | 539 V->G | 0.00, 0.32, 0.43, 0.37 | D | ATPase
ZNF541 | 1117 Q->H | 0.00, 0.28, 0.41, 0.36 | |
RPL12 | 68 Q->E | 0.00, 0.31, 0.44, 0.38 | 0 |
HMGCS2 | 416 L->F | 0.00, 0.92, 0.94, 0.89 | 0 | Metabolic enzyme in mitochondria
GALNTL4 | 591 C->* | 0.01, 0.76, 0.85, 0.85 | |
KIAA1644 | 43 H->R | 0.00, 0.13, 0.61, 0.29 | d |
C14orf28 | 78 R->W | 0.00, 0.29, 0.44, 0.42 | D |
CXorf64 | 167 G->V | 0.00, 0.31, 0.45, 0.35 | d |
RELN | 824 I->F | 0.04, 0.38, 0.50, 0.44 | 0 | Extracellular signal molecule
C17orf75 | 132 E->D | 0.00, 0.35, 0.47, 0.41 | 0 |
Truncated/fused genes
C5orf51 - CPEB4 (M10) | Chr5:41951996 - Chr5:173314849 | | | The two genes were truncated and fused by Δ5q. The fused transcript can be detected, designated M10, and it is a foreground mutation.
¹ * represents a stop codon. ² Frequency of lineage-specific mutation is boldfaced; frequencies are from the PCR-GAIIx data. ³ The effect of mutation determined by the program PolyPhen-2: D, high confidence damaging effect; d, possible damaging effect; 0, low confidence damaging effect.

Divergence Between Tumor (R1) and Nontumor (N0) Sections.

We first present the accumulation of mutations in one tumor section (R1) relative to a nontumor section (N0). Data from other sections will follow later. Three classes of somatic mutations were considered.
First, we identified point mutations and small indels. In Table S1, 214 point (single nucleotide) mutations were validated, of which 193 were noncoding or synonymous (silent) and 21 were nonsynonymous. Fig. S2 shows examples of silent mutation frequencies. The list of criteria used in the filtering process is summarized in Table S2. On a whole-genome basis, the estimated rate of somatic point mutations is ∼0.8 per Mb, or 2,500 mutations genome wide (SI Materials and Methods A5.3). This mutation density is close to the median value reported in the literature (2). The 21 nonsynonymous mutations we detected are close to the expected number based on the genome-wide density of 2,500 mutations. Table 1 lists these mutations individually, with their frequencies in N0, T0, R1, and R2 shown. The frequencies are usually <50%, as the sites are mostly heterozygous and the proportions of cancerous cells in the samples are 70–90% (see below). Table 1 also includes two small indels that cause a frame-shift in the coding region of the PIGF and TP53 genes (SI Materials and Methods A6.1).
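The arithmetic behind the genome-wide estimate and the expected number of nonsynonymous changes can be sketched as follows. The mutation density of ∼0.8 per Mb is from the text; the genome size, the size of the coding portion, and the nonsynonymous fraction are round-number assumptions introduced here for illustration and are not taken from SI Materials and Methods A5.3.

```python
def expected_count(density_per_mb: float, region_size_mb: float) -> float:
    """Expected number of mutations in a region, assuming a uniform density."""
    return density_per_mb * region_size_mb


DENSITY_PER_MB = 0.8           # somatic point mutations per Mb (from the text)
GENOME_SIZE_MB = 3100          # ASSUMPTION: approximate haploid human genome size
CODING_SIZE_MB = 30            # ASSUMPTION: roughly 1% of the genome is coding
NONSYNONYMOUS_FRACTION = 0.75  # ASSUMPTION: share of random coding changes
                               # that alter the amino acid sequence

genome_wide = expected_count(DENSITY_PER_MB, GENOME_SIZE_MB)
coding = expected_count(DENSITY_PER_MB, CODING_SIZE_MB)
nonsynonymous = coding * NONSYNONYMOUS_FRACTION

print(round(genome_wide))    # 2480, i.e. about 2,500 genome wide
print(round(nonsynonymous))  # 18, of the same order as the 21 observed
```

Under these assumptions the expectation is on the order of 20 nonsynonymous mutations, which is the sense in which the 21 observed are "close to the expected number."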
Second, copy number variants (CNVs) and large chromosomal indels were identified. CNVs are a source of genetic diversity in many cancers (11, 12). A major class of CNVs is large chromosomal indels, which include duplications and deletions of chromosomal segments. It should be noted that polyploids are common in tumor cells and, even in normal hepatocytes, tetraploids are often observed (13–15). Hence, regionally averaged minor allele frequencies (MAFs) at heterozygous sites provide reliable estimates of local copy number. As expected, the MAF statistic is remarkably stable across chromosomes in nontumor sections (Fig. 2). In tumor sections, MAFs and read depth are both informative about chromosomal indels and are generally concordant (Fig. 2 and Fig. S3). A region on chromosome 6 (Fig. 2, marked by a red box) is an exception. It has the baseline copy number, but the MAFs deviate strongly from the average. A possible explanation is that this region has a 3:1 allele ratio instead of the baseline 2:2 for a tetraploid.
Fig. 2.
Detection of large indels on chromosomes 5 and 6 from sequence reads. (A) For each chromosome, shown is the minor allele frequency (MAF) at heterozygous sites in the nontumor tissue, N0. Each point represents the sum of 50 consecutive polymorphic sites. Nontumor tissues do not appear to harbor large indels, as the frequencies stay relatively constant across regions. (B) The corresponding frequencies in the R1 section. The contrast is clear, as defined regions in R1 show characteristic reductions in MAFs. (C) Read depth is shown; red and green lines denote regions of unusually high or low read depth. There is substantial concordance between B and C in delineating regions of aberration. Because they are built on very different data, the concordance lends confidence to the interpretation of chromosomal indels. Two features are noteworthy, as indicated by a red bar (a deletion, Δ5q) and a red box, respectively. The region marked by the red box has the average read depth but MAFs are aberrant there. A possible interpretation is that, in tetraploids, the two homologs exist in a ratio of 3:1, instead of 2:2.
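The regional MAF statistic can be illustrated with a small sketch. It pools reads over non-overlapping windows of 50 consecutive heterozygous sites, following the figure legend, and shows why a 3:1 allele ratio lowers the MAF without changing total copy number. The function names and the synthetic read counts are hypothetical, and this is not the in-house program used in the study.

```python
from typing import List, Tuple


def expected_maf(copies_a: int, copies_b: int) -> float:
    """Minor allele frequency expected for a given ratio of the two homologs."""
    return min(copies_a, copies_b) / (copies_a + copies_b)


def windowed_maf(counts: List[Tuple[int, int]], window: int = 50) -> List[float]:
    """Pool reads over non-overlapping windows of consecutive heterozygous sites.

    counts holds (reads for allele A, reads for allele B) per site, in
    chromosomal order. Because phase is unknown, the rarer allele at each
    site is treated as the minor allele; a trailing partial window is dropped.
    """
    result = []
    for start in range(0, len(counts) - window + 1, window):
        chunk = counts[start:start + window]
        minor_reads = sum(min(a, b) for a, b in chunk)
        total_reads = sum(a + b for a, b in chunk)
        result.append(minor_reads / total_reads)
    return result


# Synthetic data: 50 balanced sites followed by 50 sites with a 3:1 ratio,
# at the same total depth of 20 reads per site.
sites = [(10, 10)] * 50 + [(15, 5)] * 50
print(windowed_maf(sites))                     # [0.5, 0.25]
print(expected_maf(2, 2), expected_maf(3, 1))  # 0.5 0.25
```

With real reads, taking the rarer allele at every site biases the pooled MAF slightly below 0.5 even in balanced regions, and admixed nontumor cells pull an aberrant region back toward 0.5, so the values above are idealized.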
Given that the beginning and end of each chromosomal indel are characterized by abrupt transitions in both read depth and MAF (SI Materials and Methods A7), we used these data to identify all copy number breakpoints in the genome (Table S3). In total, 26 such breakpoints were identified from either MAF or read depth data. We conservatively considered breakpoints not jointly called by both data types to be false positives, as mis-inferences are common for smaller chromosomal indels. Using concordant breakpoints as a guide, we identified 19 chromosomal indels and three CNV regions (two on chromosome 5 and one on chromosome 11; Fig. S3) in the whole genome.
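The conservative rule of keeping only breakpoints called by both data types can be sketched as a simple matching step. The tolerance, the function name, and the coordinates below are hypothetical; the criteria actually used are described in SI Materials and Methods A7.

```python
from typing import List, Tuple

Breakpoint = Tuple[str, int]  # (chromosome, position in bp)


def concordant_breakpoints(
    maf_calls: List[Breakpoint],
    depth_calls: List[Breakpoint],
    tolerance_bp: int = 1_000_000,  # ASSUMPTION: illustrative matching distance
) -> List[Breakpoint]:
    """Keep MAF-based breakpoints that have a read-depth breakpoint on the
    same chromosome within tolerance_bp; all others are treated as false
    positives."""
    kept = []
    for chrom, pos in maf_calls:
        supported = any(
            c == chrom and abs(p - pos) <= tolerance_bp for c, p in depth_calls
        )
        if supported:
            kept.append((chrom, pos))
    return kept


# Invented coordinates, for illustration only.
maf_calls = [("chr1", 10_000_000), ("chr2", 50_000_000), ("chr3", 80_000_000)]
depth_calls = [("chr1", 10_400_000), ("chr2", 49_700_000), ("chr4", 5_000_000)]
print(concordant_breakpoints(maf_calls, depth_calls))
# [('chr1', 10000000), ('chr2', 50000000)]
```

Requiring support from both signals trades sensitivity for specificity, which matters most for smaller chromosomal indels, where mis-inferences are common.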
With the resolution of our analysis, all chromosomal indels of >1 Mb at >20% in frequency in the cell population should have been detected (SI Materials and Methods A7). Among those detected, Δ5q (Fig. 2 and Fig. S4) is of particular interest. The breakpoints of Δ5q fall in the introns of two genes, resulting in their truncation and fusion. The fused transcript can be detected by RT-PCR and will be referred to as the M10 mutation (Fig. S5). Because the impact of Δ5q on tumor growth could result from either the lower dosage of genes in the deleted region or the transcript at the breakpoints, we shall refer to this deletion as Δ5q (M10) whenever both properties are relevant.
Finally, 18 HBV integrations were detected. No insertion site was found in the coding regions. We chose four integration sites for PCR validation, two in the introns of coding genes (TPPP and SHANK2) and two in intergenic regions. The validation shows all four of them to be present in all tumor sections (T0–T6, R1, and R2) and absent in nontumor sections (N0–N6). These insertions support a simple clonal-expansion model for these tumors. SI Materials and Methods A2 provides further information.

Genetic Diversity Within and Between Tumors.

Among the 214 point mutations shown in Table S1, 205 are observed at similar frequencies in all three tumors (Fig. S2A for examples). Only nine mutations, or 4.2%, were observed at very different frequencies among tumor sections (see Table 1 for the nonsynonymous ones). These mutations, polymorphic in the tumor tissue, are labeled M1–M9 in Fig. 3 and will be the basis on which the evolution of these tumors is analyzed in the next section. Among the silent mutations, M5–M7 deserve a special note. As shown in Fig. S2A, these mutations are absent in R2 and, interestingly, are unusually low in frequency in T3 and T6 (Fig. S2C).
Fig. 3.
Evolution of the tumors inferred from the data of T0–T6, R1, and R2. The table in the inset shows the presence/absence, indicated by +/−, of each foreground mutation in the tumor sections. (+) denotes presence but at a lower frequency. The table defines the cell lineages. Below the red arrow are mutations accumulated during tumor growth. Red shade denotes tumor cell lineages (labeled as π0–π3). The closely related noncancerous cell lineage is labeled πn. Sample sections, shown in brackets, are written beneath or inside the corresponding cell lineages. M1–M4 and M10 mutations affected amino acid sequences, as shown in Table 1. M5–M9 (□) are silent mutations in intergenic or intronic regions. The deletion Δ5q truncated and fused two genes at the breakpoints. This event is labeled M10. Δ5q also deleted two earlier mutations, M1 and M2. Time is marked by the length of the double arrows on the far right. t1 (=15 mo) is the time between the two surgeries. Among the lifetime collection of mutations, <5% occurred in the duration of t2. Above the red arrow are background mutations, 188 and 19 of which are silent and nonsynonymous, respectively.
Mutation frequencies are also used to gauge sample purity. We note in Fig. S2 that the frequency profiles are fairly consistent in the same sections, roughly in the order of R2 > T0 > R1. These differences likely reflect the different proportions of tumor cells that carry the mutation. We shall refer to this proportion as the composition index [= (the proportion of cancerous cells in the sample) × (the proportion of cancerous cells carrying the mutation)]. The composition index for R2, T0, and R1 is 0.88, 0.75, and 0.65, respectively (SI Materials and Methods A8), in accord with the pathology report of 70–90% hepatoma cells in the samples.
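The composition index can be related to observed mutant frequencies with a short calculation. The sketch assumes that a mutation sits on one of two copies (or, equivalently, two of four) at its site and that copy number at the site is the same in tumor and nontumor cells, so that the expected frequency is half the composition index. That relation is an assumption made here for illustration; the estimation procedure actually used is in SI Materials and Methods A8.

```python
def composition_index(cancer_cell_fraction: float, carrier_fraction: float) -> float:
    """(proportion of cancerous cells in the sample) x
    (proportion of cancerous cells carrying the mutation)."""
    return cancer_cell_fraction * carrier_fraction


def expected_frequency(ci: float, mutant_copies: int = 1, total_copies: int = 2) -> float:
    """Expected mutant-allele frequency under the stated copy-number assumption."""
    return ci * mutant_copies / total_copies


# Composition indices reported in the text.
reported_ci = {"R2": 0.88, "T0": 0.75, "R1": 0.65}
expected = {section: expected_frequency(ci) for section, ci in reported_ci.items()}

# Frequencies of one background mutation (RPL12) taken from Table 1.
observed_rpl12 = {"R2": 0.44, "T0": 0.38, "R1": 0.31}

for section in reported_ci:
    print(section, round(expected[section], 3), observed_rpl12[section])
```

Under this assumption the expected frequencies are 0.44, 0.375, and 0.325 for R2, T0, and R1, which is close to the values observed for many heterozygous background mutations in Table 1 and preserves the order R2 > T0 > R1.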
The 22 chromosomal indels/CNVs reported in Fig. S3 were initially observed in R1. We then surveyed the other eight tumor sections for their presence by genotyping germ-line heterozygous sites (the position of which is marked on the bottom of Fig. S3). MAFs across these sites indicated that Δ5q is the only chromosomal indel that is not present in every tumor section (SI Results B3). Indeed, Δ5q is completely missing in R2 and is in lower frequencies in T3 and T6 than in other tumor sections. Recall that the analysis of Fig. S2 has already found T3 and T6 to be somewhat differentiated from other T sections.
The polymorphism of Δ5q among the tumors raises an interesting question, as the three nonsynonymous mutations, M1–M3, all fall in the region spanned by Δ5q. These three mutations are common in R2 but absent in other tumor sections (Table 1). Hence, they could have occurred in R2, or, alternatively, in the common ancestors but were deleted by Δ5q in all other sections. From the analysis of the SI Results B3, M1 and M2 indeed occurred in the common ancestors but were deleted along with Δ5q, as shown in Fig. 3. M3, in contrast, occurred only in R2.

Evolution of the Tumors.

The nine point mutations (M1–M9) together with Δ5q (M10) define four different cell lineages among the nine tumor sections (Fig. 3). Each tumor section contains cells from one single lineage, the exceptions being T3 and T6, which consist of mixed lineages. The table in the inset of Fig. 3 summarizes the pattern, as explained below. Two lineages of cells have the M1 and M2 mutations (or, more accurately, did not lose them as a result of Δ5q). The distinction between the two lineages is that the π1 lineage has M3 (a nonsynonymous mutation in a cyclin G gene) and the π0 lineage has a silent M8 mutation. The other two lineages, π2 and π3, both have M5–M8 and Δ5q (M10) mutations. The π3 lineage, in addition, has M4 (a nonsynonymous mutation in the P62 gene) and M9. In this figure, cell lineages are drawn as triangles to denote their expansions from a single cell that acquired new mutations. In addition, the lineage from which tumor cells emerged is designated as πn in Fig. 3. Details of the phylogenetic reconstruction are given in SI Results B5.
There is hardly any doubt that these tumors and cell lineages are highly clonal. After all, more than 95% of somatic mutations, either coding or noncoding changes, are present in all tumor samples. As judged by the size of the π0 lineage, the cell mass of these tumors generally remained small even when 95% of the mutations had accrued. The growth of the primary and R2 tumors associated with the last few mutations, as shown in Fig. 3, was hence very substantial. R1 deserves special mention, as it occurred in the regenerated liver. The progenitors of R1 are themselves aggressively growing cells of the primary tumor, as discussed before. However, 15 mo after the surgical removal of the primary tumor, the cells that predominated in the recurrent R1 all carried the M4 mutation, which was not even detectable in T0.
The various growth rates of these tumors raise a question of the designation of R2 as “recurrent.” Because R2 and the big “primary” tumor (represented by T0–T6) bifurcated from a common lineage when the number of cancerous cells was still small, the late emergence of R2 was due to its slower growth. In fact, either one could have been the true primary tumor. We consider the latter a more likely candidate for the primary site not only because it was observed earlier but also because the least evolved π0 lineage can be found only in T3 and T6. The designation, however, affects neither the analysis nor the conclusion of this report.
Most tumor-associated mutations are found in all parts of the tumors. They accumulated in the normal cell lineage, shown as πn in Fig. 3, and are referred to as background mutations. The remaining few mutations that are polymorphic between or within tumors are referred to as foreground mutations. Foreground mutations that are common in some part of the tumors but absent in other parts are most interesting. As stated above, if a foreground mutation in a gene-coding region is uniquely associated with a large section of tumor and its absence is associated with slower cell proliferation, then this mutation is considered adaptive in terms of the population genetics of cells.
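The first step of this classification, separating mutations shared by all tumors from those polymorphic among them, can be sketched from the frequencies in Table 1. The thresholds and the function name are hypothetical and chosen only to reproduce the pattern described in the text.

```python
from typing import Dict


def classify_mutation(
    freqs: Dict[str, float],
    present: float = 0.05,  # ASSUMPTION: frequency at or above this counts as present
    absent: float = 0.02,   # ASSUMPTION: frequency at or below this counts as absent
) -> str:
    """Label a mutation from its frequencies across tumor sections."""
    values = list(freqs.values())
    if all(v >= present for v in values):
        return "shared"
    if any(v >= present for v in values) and any(v <= absent for v in values):
        return "polymorphic"
    return "unclassified"


# Frequencies in R1, R2, and T0 copied from Table 1.
table1 = {
    "TP53": {"R1": 0.56, "R2": 0.80, "T0": 0.72},
    "DUOX2": {"R1": 0.08, "R2": 0.28, "T0": 0.19},
    "CCNG1 (M3)": {"R1": 0.00, "R2": 0.45, "T0": 0.00},
    "P62 (M4)": {"R1": 0.30, "R2": 0.00, "T0": 0.00},
    "TMEM173 (M1)": {"R1": 0.00, "R2": 0.36, "T0": 0.02},
}
labels = {gene: classify_mutation(f) for gene, f in table1.items()}
for gene, label in labels.items():
    print(gene, label)
```

A frequency pattern alone does not settle whether a polymorphic mutation is a foreground change. TMEM173 (M1) is polymorphic by this rule, yet it is a background mutation that was later removed by Δ5q, so the genealogy of Fig. 3 is needed for the final assignment.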

Background mutations.

Among the 24 coding region mutations of Table 1, only 3 are foreground mutations, the genealogical patterns of which have been presented in the preceding paragraphs. We shall now describe the possible function of the remaining 21 background mutations and return to the functions of the 3 foreground mutations later.
Because the π0 lineage carries all of the background mutations without significant expansion, the background mutations by themselves appeared insufficient for cell proliferation. Nevertheless, some of these mutations may have “primed” cells for transformation, discussed below. One of the background mutations is in P53. Because nearly 30% of HCC have mutations in this gene, the observation is not unexpected. Four of the 21 background mutations in Table 1 affect genes related to inflammation, defense, and/or immunity. They are CYSLTR1 (cysteinyl leukotriene receptor 1), TMEM173, DUOX2 (dual oxidase 2), and CYBB (also called NOX2 for NADPH oxidase 2). Recent studies have increasingly suggested a connection between inflammation, immunity, and cancer development (16–18). Most HCC cases in Asian populations, including the one reported here, are HBV-positive and arise following chronic inflammation of the liver (19).
Three genes affected by background mutations are related to cell anchoring and migration. In this study, the migration of cancerous or precancerous cells took place before the expansion of the π1 and π2 cell lineages. Proper anchoring can transduce signals through the integrin pathway to promote cell division. A step in cancer cell transformation is often the abolishment of this anchorage-dependent cell division (1). Collagens are an important component of this process and mutations in two collagen genes, COL1A2 (G253D) and COL4A6 (P497A), were found (Table 1). Both collagens have been reported to function in cell adhesion, migration, differentiation, and growth (20), and their disruption is associated with carcinomas (21). A third gene, PIGF (phosphatidylinositol glycan F), plays a role in cell-cell anchorage (22, 23).
The deletions of two background mutations, M1 and M2, in the π2 lineages by Δ5q merit some attention. Although these two mutations may be merely neutral mutations, it is also possible that they have played a role in the earlier phase of tumorigenesis but have become dispensable later. Both genes appear to have cancer-related functions (Table 1).

Foreground adaptive mutations.

Among the coding region mutations of Table 1, only three are in the foreground and considered adaptive. These three (M3, M4, and M10), together with a few silent mutations, delineate the cell lineages of Fig. 3. The π0 lineage is represented by the least-evolved cancerous cells in our samples. π0 also appears to have the fewest cells among all of the lineages, suggesting that the π0 cells are less malignant than those in π1 through π3. The π1 lineage is defined by M3 in CCNG1 (Cyclin G1, H15N), which has a growth inhibitory activity linked to the alternative reading frame protein-tumor protein 53 (ARF-p53) and retinoblastoma protein (pRb) tumor suppressor pathways (24). Cyclin G1 is also a target of microRNA (miR)-122a, a microRNA frequently down-regulated in HCC (25). A mutation in CCNG1 has indeed been reported in renal cell carcinoma (7). The CCNG1 mutation marked the transition from the least proliferative cells of the π0 lineage to the moderately aggressive cells of the π1 lineage.
The π2 lineage leads to the primary tumor and later to R1. Δ5q (M10) is the only known coding region mutation that marks the transition from π0 to the aggressive π2 and π3 lineages. The breakpoints of Δ5q create a fused transcript, M10, which consists of the first five exons of C5orf51 and the 3′ end of CPEB4. The latter includes the last exon of CPEB4, inferred to have a frame-shift, and the 3′ UTR. C5orf51 is known to be strongly expressed in the liver and highly conserved among mammals; CPEB4 has been implicated in mitotic control (26). Furthermore, the region spanned by Δ5q contains the APC gene and the 5q31 cluster of cytokines, both having been found to be lost in adenomas and carcinomas (27). Loss of heterozygosity in 5q has been reported to be correlated with cancer risk (28) and the histopathological grade of tumors and metastases (29, 30).
The π3 lineage is defined by M4 (in the P62 gene). This rapidly proliferating lineage is a main constituent of R1. In the 15 mo after surgery that removed the primary tumor, R1 grew in the regenerated portion of the liver and reached a substantial size. In comparison, R2 is much smaller, even though it may have started earlier (as R1 could start growing only after the resection of the primary tumor). p62 is a multidomain signaling adaptor protein that affects autophagy, apoptosis, and cancer (31). Indeed, autophagy suppresses tumorigenesis with the elimination of p62 (32), which is implicated in the regulation of many targets, including MEK5, ERK, RIP, aPKC, and TRAF6 (31). Genetic ablation of p62 suppresses the appearance of ubiquitin-positive protein aggregates in hepatocytes (33). These findings link p62 activities to apoptosis and suggest that the modulation of p62 by autophagy might be relevant to tumorigenesis (32).
Finally, we should note that some normal samples can be informative about tumor evolution as well. For example, in the N3 section, the mutation frequency at many sites appears to be higher than that in other nontumor sections, but the difference is substantially larger at some sites than at others. Interestingly, M5 and M6 are unusually low in N3. If N3 contained advanced cancerous cells, the frequency profile should be even across sites. These observations suggest that N3 may contain precursor cancer cells at an earlier stage of evolution.

Discussion

In addition to the identities of somatic mutations, cancer genomic data can provide detailed information on how tumors grow in relation to the accumulation of mutations. A cell-population genetic analysis of tumors is not unlike the analysis of mutation accumulation in geographical populations of natural species like E. coli (34, 35) [and, to some extent, humans (36, 37) and Drosophila (38)]. Among the thousands of mutations accrued in each case, it is sometimes possible to identify a small number of adaptive mutations that drive cell proliferation. Furthermore, even noncoding mutations can be informative about how rapidly the tumors have grown. We should note that each individual case of cancer is informative on its own and the assumption of common mutations is not necessary.
In this case of HCC, the tumors remained small (judged by the size of the π0 lineage) late in cancer evolution, when all background mutations had already occurred (Fig. 3). If we use silent mutations to mark the divergence time between cell lineages, the ratio of foreground to background mutations is 5:188. For coding region mutations, three [CCNG1, P62, and Δ5q (M10)] are foreground changes among the 24 reported in Table 1. Thus, the evolutionary dynamics inferred from this study is a long process of accumulation of background mutations, followed by the rapid spread of a relatively small number of (adaptive) foreground mutations.
Nonsynonymous mutations in the background and foreground fall into different functional categories. In this study, background mutations, including one in P53, did not directly cause cell proliferation, but some of them might have “primed” the cells to proliferate. Indeed, seven background mutations are in genes of inflammation/immunity or cell anchoring. In comparison, foreground mutations affect genes of cell cycle control and apoptosis. One might expect that, after the background mutations have laid the groundwork, foreground mutations should directly affect cell division and cell death. Hence, the functional division between background and foreground mutations appears to agree with this simple expectation.
The distinct functions between foreground and background mutations suggest that tumorigenesis may be driven by epistatic gene interactions. With epistasis, mutations of either kind alone may have a much weaker effect on tumor growth than the joint presence of background and foreground mutations. Such a genetic architecture is not uncommon for traits that have evolved over time (39). With that consideration, the best genetic background to test the functions of the three adaptive mutations would be that of the π0 lineage, which has all of the background mutations. In a wild-type genetic background, it is possible that the three adaptive mutations may not impart cancer-causing phenotypes.
There are caveats, both specific and general, that need to be heeded. Specifically, we identified one, and only one, protein-coding mutation for each of the three proliferation events in Fig. 3. In SI Results B5, we present several lines of evidence that coding mutations should not have been missed. In addition, the depth of coverage, the cutoff used in choosing sites for validation, and the paucity of intermediate frequency mutations are also addressed. A more general caveat is that this case might be unusual and its level of genetic differentiation happens to be particularly suitable for identifying driver mutations. Indeed, the process of mutation accumulation and natural selection is likely to be highly stochastic. In some cases of tumor evolution, there might be little genetic diversity among all tumor cells if a powerful driver mutation has caused a strong “selective sweep” (40). The variation in the evolutionary dynamics may prove to be as informative about tumorigenesis as the common mutations. If that is true, this study would be a small step in elucidating that variation.

Materials and Methods

A 35-y-old woman with chronic HBV infection was diagnosed with HCC. Two tumor and one nontumor sections, R1, R2, and N0, were subjected to exon capture and SOLiD sequencing. R1 and N0, in addition, were subjected to whole-genome sequencing. Sequence reads were aligned to the reference human genome (NCBI36) using SOLiD Corona Lite and Burrows-Wheeler Aligner. Putative somatic mutations identified by sequencing were then validated by Sequenom genotyping and deep sequencing across all nine tumor and seven nontumor tissue sections. CNVs and chromosomal indels were identified using our in-house programs combining information from both read depth (coverage) and MAF at germ-line heterozygous sites. Full materials and methods used to generate this data set and results are provided in SI Materials and Methods.
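To illustrate how read depth and MAF at germ-line heterozygous sites can be combined, the following is a minimal sketch only. It is not the in-house program used in this study: the `Segment` type, the `classify_segment` function, the segment labels, and both cutoffs (`ratio_cut`, `maf_cut`) are hypothetical choices made for illustration, and the depths are assumed to be already normalized for library size.

```python
from dataclasses import dataclass
from math import log2
from statistics import median
from typing import List


@dataclass
class Segment:
    """A genomic segment summarized for one tumor/nontumor pair (hypothetical type)."""
    tumor_depth: float     # mean normalized read depth in the tumor section
    normal_depth: float    # mean normalized read depth in the nontumor section
    het_mafs: List[float]  # tumor MAF at germ-line heterozygous sites in the segment


def classify_segment(seg: Segment, ratio_cut: float = 0.3, maf_cut: float = 0.4) -> str:
    """Label a segment from its log2 depth ratio and median MAF.

    Both cutoffs are illustrative assumptions, not values from the paper.
    """
    log_ratio = log2(seg.tumor_depth / seg.normal_depth)
    # With no heterozygous sites there is no allelic signal; assume balance (0.5).
    maf = median(seg.het_mafs) if seg.het_mafs else 0.5

    if log_ratio > ratio_cut:
        return "gain"
    if log_ratio < -ratio_cut:
        return "loss"
    # Depth is unchanged but one allele is depleted: copy-neutral imbalance.
    if maf < maf_cut:
        return "copy-neutral LOH"
    return "neutral"


if __name__ == "__main__":
    example = Segment(tumor_depth=15.0, normal_depth=30.0, het_mafs=[0.05, 0.02])
    print(classify_segment(example))
```

Using both signals matters because depth alone cannot detect a copy-neutral loss of heterozygosity, while MAF alone cannot distinguish a gain of one allele from a loss of the other.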

Acknowledgments

We thank Maynard Olson for advice. We also thank Andy Clark, Michelle LeBeau, Steve O'Brien, Julie Schneider, Ralph Weichselbaum, and Y.M. Jeng for comments and input. Taiwan Liver Cancer Network provided valuable assistance. This study was supported by National S&T Major Project of China Grants 2009ZX08010-017B and 2009ZX08009-149B, National Natural Science Foundation of China Grants 30950006, 31000957, and 31071914, Chinese Academy of Sciences Grant KSCX1-YW-22, National Basic Research Program of China Grants 2011CB510101 and 2011CB510106, and a National Research Program for Genomic Medicine grant (to P.-J.C.).

Supporting Information

Supporting Information (PDF)
Supporting Information
sd01.xls

References

1
R Weinberg, The Biology of Cancer (Garland Science, New York, 2007).
2
C Greenman, et al., Patterns of somatic mutation in human cancer genomes. Nature 446, 153–158 (2007).
3
TJ Ley, et al., DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature 456, 66–72 (2008).
4
SP Shah, et al., Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Nature 461, 809–813 (2009).
5
ED Pleasance, et al., A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463, 184–190 (2010).
6
L Ding, et al., Genome remodelling in a basal-like breast cancer metastasis and xenograft. Nature 464, 999–1005 (2010).
7
GL Dalgliesh, et al., Systematic sequencing of renal carcinoma reveals inactivation of histone modifying genes. Nature 463, 360–363 (2010).
8
M Greaves, Darwinian medicine: a case for cancer. Nat Rev Cancer 7, 213–221 (2007).
9
NH Segal, et al., Epitope landscape in breast and colorectal cancer. Cancer Res 68, 889–892 (2008).
10
YJ Chen, et al., Chromosomal changes and clonality relationship between primary and recurrent hepatocellular carcinoma. Gastroenterology 119, 431–440 (2000).
11
N Navin, et al., Inferring tumor progression from genomic heterogeneity. Genome Res 20, 68–80 (2010).
12
JR Pollack, et al., Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci USA 99, 12963–12968 (2002).
13
BN Kudryavtsev, MV Kudryavtseva, GA Sakuta, GI Stein, Human hepatocyte polyploidization kinetics in the course of life cycle. Virchows Arch B Cell Pathol Incl Mol Pathol 64, 387–393 (1993).
14
JE Guidotti, et al., Liver cell polyploidization: a pivotal role for binuclear hepatocytes. J Biol Chem 278, 19095–19101 (2003).
15
AW Duncan, et al., The ploidy conveyor of mature hepatocytes as a source of genetic variation. Nature 467, 707–710 (2010).
16
SI Grivennikov, FR Greten, M Karin, Immunity, inflammation, and cancer. Cell 140, 883–899 (2010).
17
WE Naugler, et al., Gender disparity in liver cancer due to sex differences in MyD88-dependent IL-6 production. Science 317, 121–124 (2007).
18
S Rebouissou, et al., Frequent in-frame somatic deletions activate gp130 in inflammatory hepatocellular tumours. Nature 457, 200–204 (2009).
19
PJ Chen, DS Chen, Hepatitis B virus infection and hepatocellular carcinoma: molecular genetics and clinical perspectives. Semin Liver Dis 19, 253–262 (1999).
20
H Tanjore, R Kalluri, The role of type IV collagen and basement membranes in cancer progression and metastasis. Am J Pathol 168, 715–717 (2006).
21
K Ikeda, et al., Loss of expression of type IV collagen alpha5 and alpha6 chains in colorectal cancer associated with the hypermethylation of their promoter region. Am J Pathol 168, 856–865 (2006).
22
K Ohishi, et al., Structure and chromosomal localization of the GPI-anchor synthesis gene PIGF and its pseudogene psi PIGF. Genomics 29, 804–807 (1995).
23
J Takeda, et al., Deficiency of the GPI anchor caused by a somatic mutation of the PIG-A gene in paroxysmal nocturnal hemoglobinuria. Cell 73, 703–711 (1993).
24
L Zhao, et al., Cyclin G1 has growth inhibitory activity linked to the ARF-Mdm2-p53 and pRb tumor suppressor pathways. Mol Cancer Res 1, 195–206 (2003).
25
L Gramantieri, et al., Cyclin G1 is a target of miR-122a, a microRNA frequently down-regulated in human hepatocellular carcinoma. Cancer Res 67, 6092–6099 (2007).
26
I Novoa, J Gallego, PG Ferreira, R Mendez, Mitotic cell-cycle progression is regulated by CPEB1 and CPEB4-dependent translational control. Nat Cell Biol 12, 447–456 (2010).
27
B Vogelstein, et al., Genetic alterations during colorectal-tumor development. N Engl J Med 319, 525–532 (1988).
28
LG Johnson, et al., Risk of cervical cancer associated with allergies and polymorphisms in genes in the chromosome 5 cytokine cluster. Cancer Epidemiol Biomarkers Prev 20, 199–207 (2011).
29
R Morita, et al., Common regions of deletion on chromosomes 5q, 6q, and 10q in renal cell carcinoma. Cancer Res 51, 5817–5820 (1991).
30
KM Fong, PV Zimmerman, PJ Smith, Tumor progression and loss of heterozygosity at 5q and 18q in non-small cell lung cancer. Cancer Res 55, 220–223 (1995).
31
J Moscat, MT Diaz-Meco, p62 at the crossroads of autophagy, apoptosis, and cancer. Cell 137, 1001–1004 (2009).
32
R Mathew, et al., Autophagy suppresses tumorigenesis through elimination of p62. Cell 137, 1062–1075 (2009).
33
M Komatsu, et al., Homeostatic levels of p62 control cytoplasmic inclusion body formation in autophagy-deficient mice. Cell 131, 1149–1163 (2007).
34
R Lenski, M Rose, S Simpson, S Tadler, Long-term experimental evolution in Escherichia coli. I. Adaptation and divergence during 2,000 generations. Am Nat 138, 1315–1341 (1991).
35
VS Cooper, RE Lenski, The population genetics of ecological specialization in evolving Escherichia coli populations. Nature 407, 736–739 (2000).
36
SA Tishkoff, et al., Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet 39, 31–40 (2007).
37
MT Hamblin, A Di Rienzo, Detection of the signature of natural selection in humans: evidence from the Duffy blood group locus. Am J Hum Genet 66, 1669–1679 (2000).
38
AJ Greenberg, JR Moran, JA Coyne, CI Wu, Ecological adaptation during incipient speciation revealed by precise gene replacement. Science 302, 1754–1757 (2003).
39
S Sun, CT Ting, CI Wu, The normal function of a speciation gene, Odysseus, and its hybrid sterility effect. Science 305, 81–83 (2004).
40
JM Smith, J Haigh, The hitch-hiking effect of a favourable gene. Genet Res 23, 23–35 (1974).

Published in

Proceedings of the National Academy of Sciences
Vol. 108 | No. 29
July 19, 2011
PubMed: 21730188

Submission history

Published online: July 5, 2011
Published in issue: July 19, 2011

Keywords

  1. cell genealogy
  2. cellular evolution
  3. foreground mutation

Authors

Affiliations

Yong Tao1
Laboratory of Disease Genomics and Individualized Medicine, and
Jue Ruan1
Laboratory of Disease Genomics and Individualized Medicine, and
Shiou-Hwei Yeh1
Graduate Institute of Clinical Medicine and Hepatitis Research Center, National Taiwan University and Hospital, Taipei 106, Taiwan;
Xuemei Lu1
Laboratory of Disease Genomics and Individualized Medicine, and
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Yu Wang1
Laboratory of Disease Genomics and Individualized Medicine, and
Weiwei Zhai1
Laboratory of Disease Genomics and Individualized Medicine, and
Jun Cai1
Laboratory of Disease Genomics and Individualized Medicine, and
Shaoping Ling
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Qiang Gong
Laboratory of Disease Genomics and Individualized Medicine, and
Zecheng Chong
Laboratory of Disease Genomics and Individualized Medicine, and
Zhengzhong Qu
Laboratory of Disease Genomics and Individualized Medicine, and
Qianqian Li
Laboratory of Disease Genomics and Individualized Medicine, and
Jiang Liu
Laboratory of Disease Genomics and Individualized Medicine, and
Jin Yang
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Caihong Zheng
Laboratory of Disease Genomics and Individualized Medicine, and
Changqing Zeng
Laboratory of Disease Genomics and Individualized Medicine, and
Hurng-Yi Wang
Graduate Institute of Clinical Medicine and Hepatitis Research Center, National Taiwan University and Hospital, Taipei 106, Taiwan;
Jing Zhang
Laboratory of Disease Genomics and Individualized Medicine, and
Sheng-Han Wang
Graduate Institute of Clinical Medicine and Hepatitis Research Center, National Taiwan University and Hospital, Taipei 106, Taiwan;
Lingtong Hao
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Lili Dong
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Wenjie Li
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Min Sun
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Wei Zou
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Caixia Yu
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Chaohua Li
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Guojing Liu
Laboratory of Disease Genomics and Individualized Medicine, and
Lan Jiang
Laboratory of Disease Genomics and Individualized Medicine, and
Jin Xu
Laboratory of Disease Genomics and Individualized Medicine, and
Huanwei Huang
Laboratory of Disease Genomics and Individualized Medicine, and
Chunyan Li
Laboratory of Disease Genomics and Individualized Medicine, and
Shuangli Mi
Laboratory of Disease Genomics and Individualized Medicine, and
Bing Zhang
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Baoxian Chen
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Wenming Zhao
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Songnian Hu
China Academy of Sciences Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, People's Republic of China;
Shi-Mei Zhuang
State Key Laboratory of Biocontrol, School of Life Science, Sun Yat-Sen University, Guangzhou 510275, People's Republic of China; and
Yang Shen
State Key Laboratory of Biocontrol, School of Life Science, Sun Yat-Sen University, Guangzhou 510275, People's Republic of China; and
Suhua Shi
State Key Laboratory of Biocontrol, School of Life Science, Sun Yat-Sen University, Guangzhou 510275, People's Republic of China; and
Christopher Brown
Institute for Genomics and Systems Biology, and
Kevin P. White
Institute for Genomics and Systems Biology, and
Ding-Shinn Chen2 [email protected]
Graduate Institute of Clinical Medicine and Hepatitis Research Center, National Taiwan University and Hospital, Taipei 106, Taiwan;
Pei-Jer Chen
Graduate Institute of Clinical Medicine and Hepatitis Research Center, National Taiwan University and Hospital, Taipei 106, Taiwan;
Chung-I Wu2 [email protected]
Laboratory of Disease Genomics and Individualized Medicine, and
Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637

Notes

2
To whom correspondence may be addressed. E-mail: [email protected] or [email protected].
1
Y.T., J.R., S.-H.Y., X.L., Y.W., W. Z., and J.C. contributed equally to this work.
Author contributions: S.-H.Y., X.L., D.-S.C., P.-J.C., and C.-I.W. designed research; Y.T., S.-H.Y., Q.G., Z.Q., J.Y., C. Zheng, H.-Y.W., S.-H.W., W.L., M.S., Chaohua Li, G.L., H.H., Chunyan Li, S.M., B.Z., B.C., S.-M.Z., and S.S. performed research; J.R., Y.W., W. Zhai, J.C., S.L., Q.G., Z.C., Q.L., J.Z., L.H., L.D., W. Zou, C.Y., L.J., J.X., W. Zhao, S.H., and Y.S. analyzed data; S.-H.Y. and P.-J.C. provided clinical samples; and Y.T., J.R., S.-H.Y., X.L., Y.W., W. Zhai, J.C., S.L., J.L., C. Zeng, C.B., K.P.W., D.-S.C., P.-J.C., and C.-I.W. wrote the paper.

Competing Interests

The authors declare no conflict of interest.


    Rapid growth of a hepatocellular carcinoma and the driving mutations revealed by cell-population genetic analysis of whole-genome data
    Proceedings of the National Academy of Sciences
    • Vol. 108
    • No. 29
    • pp. 11727-12185
