Circulating Tumor DNA Mutation Profiling by Targeted Next Generation Sequencing Provides Guidance for Personalized Treatments in Multiple Cancer Types
Cancer is a disease of complex genetic alterations, and comprehensive genetic diagnosis is beneficial to match each patient to appropriate therapy. However, acquisition of representative tumor samples is invasive and sometimes impossible. Circulating tumor DNA (ctDNA) is a promising non-invasive biomarker for cancer mutation profiling. Here we implemented targeted next generation sequencing (NGS) with a customized gene panel of 382 cancer-relevant genes on 605 ctDNA samples in multiple cancer types. Overall, tumor-specific mutations were identified in 87% of ctDNA samples, with mutation spectra highly concordant with their matched tumor tissues. 71% of patients had at least one clinically-actionable mutation, 76% of whom carried mutations with suggested drugs that are approved or in clinical trials. In particular, our study reveals a unique mutation spectrum in Chinese lung cancer patients which could be used to guide treatment decisions and monitor drug-resistant mutations. Taken together, our study demonstrated the feasibility of clinically-useful targeted NGS-based ctDNA mutation profiling to guide treatment decisions in cancer.
Cancers arise largely due to genetic mutations, yet are notorious for genetic diversity in different carriers1, 2. Personalized cancer treatment optimizes the clinical benefits for each patient by choosing targeted interventions based on that patient’s unique genetic profile and thus avoids ineffective therapies3. However, personalized treatment requires comprehensive and precise genetic profiling of the patient’s tumor. The development of next generation sequencing (NGS) has offered unprecedented progress in uncovering cancer genome characteristics and facilitating personalized cancer therapy due to its outstanding accuracy, sensitivity and high throughput4,5,6. Resected tumor tissues are frequently used in current NGS-based genetic testing7, but the operation is generally invasive, risky and often simply not possible, especially for cancer patients with advanced disease8. Additionally, cancer cells continuously acquire new mutations due to genomic instability and/or selective pressure from the tissue microenvironment and clinical treatment9,10,11. Thus, testing of a single tumor sample may overlook intra- and inter-tumor heterogeneity12, 13.
Circulating tumor DNA (ctDNA) is tumor-derived fragmented DNA circulating in blood along with cell free DNA (cfDNA) from other sources. ctDNA has an average length of 167 bp14,15,16. Although the mechanisms of ctDNA release into circulation have not yet been fully elucidated, most reports consider apoptosis and/or necrosis of tumor cells as its main sources16, 17. ctDNA is therefore a genomic reservoir of different tumor clones and a better representation of tumor genomic diversity than a single tumor sample. Moreover, with a half-life from 16 minutes to a few hours18,19,20, ctDNA reflects the most up-to-date status of the tumor genome. Several methods have been developed to inspect tumor-specific mutations in ctDNA, such as allele-specific PCR, droplet digital PCR (ddPCR) and “BEAMing” (Beads, Emulsions, Amplification and Magnetics)21,22,23,24. Although these techniques are highly sensitive, they have two major drawbacks: (1) low throughput for mutation scanning, and (2) the requirement for prior knowledge of the molecular targets to test for, meaning these techniques cannot be used for de novo mutation identification. Several groups have incorporated NGS into ctDNA mutation profiling and successfully identified numerous genetic alterations that correlate with disease progression and prognosis25,26,27,28. With growing interest in ctDNA testing, it is important to evaluate its utility in different types of solid tumors, not only as a diagnostic tool, but also as a tool for screening, monitoring and novel biomarker identification. In this study, we used a self-designed pan-cancer gene panel covering the exons of 382 genes for targeted NGS and established a clinically-applicable pipeline for ctDNA enrichment, sequencing and data analysis. This method was applied to 605 patients with 29 different types of solid tumors. For comparison, tumor tissues from 344 patients were tested at the same time. Our data demonstrate that mutation profiling of ctDNA by targeted NGS is feasible in clinical practice to guide treatment decisions during diagnosis and disease monitoring.
Study Design and Patient Enrollment
A total of 605 cancer patients with 29 types of tumors were randomly selected from 25 hospitals across Mainland China to ensure the representativeness and generalizability of this study. Brain tumors were excluded because the blood-brain barrier inhibits the release of ctDNA into the blood24 (Supplementary Table 1 and Fig. 1a). Most patients had progressed to advanced cancer at the time of recruitment. Lung cancer (n = 373) was the largest category due to its high incidence in China29 and success in targeted therapy30,31,32. The other most common tumor types in our study were colorectal (n = 49), breast (n = 35) and stomach cancers (n = 30) (Fig. 1a). Two major criteria were applied for the exclusion of patients: 1) patients who had recently undergone surgery in which the primary tumor was resected; and 2) patients currently receiving intensive targeted and/or non-targeted chemotherapies, with clinical imaging showing restrained tumor progression.
Study design and patient enrollment. (a) The percentage of different tumor types enrolled in this study, including both Cohort I and II. Tumor types represented by fewer than 4 cases were classified as “Others”. (b) A schematic outlining our two-tiered study, including the cohorts and the specimens involved in this study.
Patients were divided into two cohorts: one cohort (cohort I) had ctDNA samples collected along with their matched archived formalin-fixed paraffin-embedded (FFPE) or frozen tumor tissue samples, and another cohort (cohort II) only had ctDNA samples collected (Fig. 1b). Cohort I was designed to compare mutation profiles between ctDNA and tumor samples, as well as to assess the concordance rate for mutation detection between sample types. Cohort II was designed to compare ctDNA mutation profiles between the two cohorts of patients. For each subject in both cohorts, genomic DNA from matched whole blood was also sequenced in order to discriminate between somatic and germline abnormalities.
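The role of the matched whole-blood control can be illustrated with a minimal sketch. It assumes that variants are represented as simple (gene, change) pairs and that any variant also seen in the blood sample is treated as germline. The `Variant` type, the function name and the example variants are hypothetical, and this is not the study's actual pipeline, which would also need to weigh allele frequencies and sequencing artifacts.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    gene: str
    change: str


def split_somatic_germline(sample_variants, blood_variants):
    """Partition variants from a tumor or plasma sample using the matched blood control.

    Variants also present in whole-blood genomic DNA are labeled germline;
    the remainder are labeled somatic. This is a simplifying assumption
    made for illustration only.
    """
    blood = set(blood_variants)
    somatic = [v for v in sample_variants if v not in blood]
    germline = [v for v in sample_variants if v in blood]
    return somatic, germline


# Hypothetical example data, not taken from the study.
plasma = [
    Variant("EGFR", "L858R"),
    Variant("TP53", "R273H"),
    Variant("BRCA2", "N372H"),
]
blood = [Variant("BRCA2", "N372H")]

somatic, germline = split_somatic_germline(plasma, blood)
print("somatic:", somatic)
print("germline:", germline)
```

In this toy case the variant shared with blood is reported as germline and the other two as somatic.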
Targeted NGS-based Pan-Cancer Gene Mutation Profiling
Tumor tissue, cfDNA (which includes the ctDNA along with cfDNA from other sources) and whole blood controls were all sequenced using the same predefined panel (Fig. 2). Briefly, cfDNA was extracted from plasma, while genomic DNA was extracted from whole blood and either fixed or fresh tissue blocks (Fig. 2a). The median concentration of cfDNA from all 605 patients was 11.48 ng/ml plasma (Supplementary Fig. 1a). Extracted cfDNA was analyzed by the Agilent 2100 Bioanalyzer in order to detect genomic DNA contamination (Supplementary Fig. 1b) and subjected to an extra size-selection step using magnetic beads if contamination was found (data not shown). cfDNA and genomic DNA from all sample types underwent whole-genome library construction (Supplementary Fig. 1c and data not shown), followed by hybridization-based capture enrichment of 5,804 exons of 382 cancer-relevant genes and 37 introns of 16 genes frequently rearranged in solid tumors (Fig. 2b and Supplementary Table 2). Libraries after target enrichment were sequenced to high uniform depth on Illumina MiSeq or HiSeq 4000 platforms, depending on the sample type (see Methods). Sequencing data was analyzed using a customized bioinformatic pipeline optimized to accurately detect different classes of genomic alterations, including base substitutions, indels, copy number variations (CNV) and gene fusions (Fig. 2c). Finally, both germline and somatic genetic alterations in each patient were subject to manual data curation and reported (Fig. 2d). The average turnaround time from receiving samples to reporting was 10 business days.
Workflow of targeted NGS-based mutation profiling. (a) Genomic DNA is extracted from multiple sample types. (b) Whole-genome libraries are prepared from fragmented genomic DNA or cfDNA, followed by hybridization capture with biotinylated DNA probes to establish target-enriched sequencing libraries for NGS. (c) Sequencing data undergoes quality control (QC), mapping and bioinformatic analysis to identify different classes of genomic aberrations. (d) Mutations identified are filtered and annotated according to related databases, and their clinical significance is interpreted in the final report.
High Concordance of Mutation Spectra between Matched ctDNA and Tumor Samples
Cohort I included 344 patients representing 28 tumor types (Fig. 3a). ctDNA and tumor tissue blocks were sequenced simultaneously to provide a direct comparison of these two sample types. Patients with at least one somatic mutation identified in their ctDNA were defined as patients with detectable mutations in ctDNA, and were grouped by tumor types (Fig. 3b). Overall, ctDNA abnormalities were detected in around 80% of patients representing a majority of tumor types, with the exception of soft tissue tumors (mostly sarcoma) which showed the lowest rate of mutation detection in ctDNA.
Mutation detection concordance between matched tumor and ctDNA samples in cohort I. (a) The composition of different tumors classified by their tissue origins. Tissue types that have fewer than 4 cases represented in the study are classified as “Others”. (b) The percentage of patients with mutations detected in ctDNA within different tumor types. Tumor types with fewer than 4 cases are not shown. (c) Shared and unique mutations identified in tumor and ctDNA samples. (d) The composition of mutation types in tumors and ctDNA. (e) Correlation of mutation numbers in ctDNA and matched tumors (Spearman’s rank test, p < 0.0001). The scatter dots were plotted according to mutation numbers identified per patient in ctDNA and matched tumors and the density represents the number of patients. (f) The correlation between mutation detection concordances in the matched tumor-ctDNA samples and sequencing coverage depth. The concordance rate was calculated by dividing the number of mutations in ctDNA by the number of mutations in the matched tumor sample for each patient. Each dot represents one individual patient with median concordance rate shown by the black bar. *p < 0.05, Dunn’s multiple comparisons test; ns, not significant.
In cohort I, a total of 1109 CNVs and mutations in ctDNA samples, and 1249 CNVs and mutations in tumors, were identified from 208 genes. 932 out of 1249 genetic abnormalities (74.6%) in tumor tissue were shared by matched ctDNA samples (Fig. 3c), although CNVs were under-represented in ctDNA due to the current limitation of statistical approaches for CNV identification in NGS, especially in low tumor content samples33 (Fig. 3d and Supplementary Fig. 2a). Not surprisingly, CNVs constituted 57% (181 out of 317) of the tumor-unique mutations. Therefore, CNVs were excluded in later analysis, and the concordance rate for all the other types of abnormalities increased to 87.0% (883 out of 1016). Meanwhile, ctDNA samples also harbored 177 unique abnormalities (including 24 CNVs) that were absent from tumor tissues, which may be ascribed to the inadequate representation of spatial and temporal heterogeneity by tumor tissue sequencing.
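The concordance arithmetic can be sketched as follows. The per-patient function assumes that the numerator counts only tumor mutations that are also found in the matched ctDNA; the function name and the example mutation labels are hypothetical. The cohort-level figures reuse the counts reported above.

```python
def concordance_rate(tumor_mutations, ctdna_mutations):
    """Fraction of tumor mutations that are also detected in matched ctDNA."""
    tumor = set(tumor_mutations)
    if not tumor:
        raise ValueError("no tumor mutations to compare against")
    shared = tumor & set(ctdna_mutations)
    return len(shared) / len(tumor)


# Hypothetical single patient, not taken from the study.
tumor = {"EGFR L858R", "TP53 R273H", "PIK3CA E545K", "MET amplification"}
ctdna = {"EGFR L858R", "TP53 R273H", "EGFR T790M"}
patient_rate = concordance_rate(tumor, ctdna)
print(f"patient concordance: {patient_rate:.0%}")

# Cohort-level counts reported in the text.
tumor_total, ctdna_total, shared_total = 1249, 1109, 932
tumor_unique = tumor_total - shared_total   # abnormalities seen only in tumor
ctdna_unique = ctdna_total - shared_total   # abnormalities seen only in ctDNA
cohort_rate = f"{shared_total / tumor_total:.1%}"
print(tumor_unique, ctdna_unique, cohort_rate)
```

The derived counts of 317 tumor-unique and 177 ctDNA-unique abnormalities match the values quoted in the text.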
There is a significant correlation (Spearman r = 0.64, p < 0.0001) in the number of mutations identified between ctDNA and matched tumor tissue within each patient (Fig. 3e). 88% of tumor and 91% of plasma samples had 1–6 somatic mutations identified (median: 3 per sample for both). As expected, ctDNA, since it is mixed with cfDNA from other non-tumor sources, displayed significantly lower mutant allele frequencies (MAFs) with 66% of them below 10% (median: 5%), while mutations in tumor tissues had much higher MAFs with a median of 23% (Supplementary Fig. 2b).
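For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranked values, with tied values sharing their average rank. The following standard-library sketch illustrates the calculation on hypothetical per-patient mutation counts; it is not the software used in the study, and the helper names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical mutation counts per patient (tumor vs. ctDNA).
tumor_counts = [1, 2, 2, 3, 5, 6]
ctdna_counts = [1, 1, 3, 2, 4, 6]
print(round(spearman(tumor_counts, ctdna_counts), 3))
```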
Increasing sequencing coverage depth of ctDNA proved to be an efficient way to improve the detection sensitivity of tumor-specific mutations (Fig. 3f and Supplementary Fig. 2c). We chose 300× mean coverage depth of ctDNA as our cutoff for data analysis in this study. We could not detect any tumor-specific mutations in the ctDNA of 24% of patients using a mean coverage depth below 300× and a MAF detection cutoff of 1% (see Methods); however, when the coverage depth was increased to 300–500×, we observed a significant improvement in detection of matching mutations between ctDNA and tumor pairs. Further increasing the coverage depth to 500–1000×, 1000–2000× or >2000× did not significantly improve the matching rate (Fig. 3f). The number of mutations identified in tumor tissue and ctDNA, as well as their overlaps, are provided in Supplementary Table 3. Similar results were observed when the fraction of tumor mutations detected in ctDNA was compared between groups of different coverage depth in cohort I. At a ctDNA coverage depth below 300×, only 67% of tumor mutations were identified in ctDNA samples (Supplementary Fig. 2c). When coverage depth was increased to 300–500×, 88% of tumor mutations were detected in ctDNA, but the concordance rate was hardly improved when the coverage depth was increased up to 2000×. As a result, samples with coverage depth below 300× in this pilot group were excluded from other analyses.
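Why depth matters at a 1% MAF cutoff can be illustrated with a simple binomial model: at depth d and true allele fraction f, the number of mutant reads is treated as Binomial(d, f), and a variant is called only if at least a minimum number of mutant reads is seen. The minimum of 3 supporting reads used below is an assumption for illustration and is not stated in the text; the model also ignores sequencing error, duplicate reads and uneven coverage, so it is a sketch rather than a description of the study's variant caller.

```python
from math import comb


def detection_probability(depth, maf, min_alt_reads):
    """P(at least min_alt_reads mutant reads) under a Binomial(depth, maf) model."""
    p_below = sum(
        comb(depth, k) * maf**k * (1 - maf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_below


# Assumed threshold of 3 supporting reads (hypothetical) at a true MAF of 1%.
for depth in (100, 300, 500, 1000, 2000):
    prob = detection_probability(depth, maf=0.01, min_alt_reads=3)
    print(f"{depth:>5}x  P(detect) = {prob:.3f}")
```

Under these assumptions the detection probability rises steeply with depth for low-frequency variants, whereas variants at higher allele fractions are already well sampled at moderate depth. The real plateau reported above will also depend on the MAF distribution and on error suppression.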
There are several reasons why mutations would be detected in tumor tissue but not in ctDNA, even at high coverage depth. One key reason is the low MAF of these mutations in tumor samples, suggesting extensive tumor heterogeneity in these patients (Supplementary Fig. 3a). Indeed, 43% of mutations that were undetected in ctDNA have a low MAF (below 10%) in matched tumor tissues (Supplementary Fig. 3b). Other reasons include aged FFPE samples that may not represent the current mutation profile of the patient’s tumor; early stage cancers with low tumor burden; treatment intervention that may lower the possibility of detecting mutations in plasma; and specific genetic regions that cannot be targeted well by NGS due to high GC-content or repetitive sequences, which would affect ctDNA more than tissue samples because of the low MAF in ctDNA. Previous studies have shown that cfDNA concentration in plasma may be correlated with tumor burden and disease status24; however, we observed that the mutation detection rate was not significantly influenced by the plasma cfDNA concentration (Supplementary Fig. 3c).
Targeted NGS-based Mutation Profiling of ctDNA Shows Great Potential for Clinical Practice to Guide Treatment Decisions
Cohort II comprised 261 ctDNA-only samples from patients with 28 types of solid tumors and was used to validate the ctDNA mutation patterns observed in cohort I (Fig. 4). Promisingly, the results from cohorts I and II showed similar trends in several aspects, including percentage of patients with detectable mutations in ctDNA, number of mutations per patient, distributions of MAFs and different mutation types (Fig. 4b–e).
Similar ctDNA results between cohorts I and II. (a) There was a comparable distribution of tumor types covered in both cohorts I and II. (b) The fraction of patients with detectable ctDNA mutations in cohort II. (c) The distribution of mutation numbers identified per patient in cohorts I and II. (d) The distribution of MAFs in cohorts I and II. No significant difference was detected between the two cohorts in c and d by Mann-Whitney U test. (e) The distribution of different mutation types in cohorts I and II.
Combining ctDNA samples from cohorts I and II, somatic mutations were detected in 87% (529 out of 605) of patients. By compiling mutations identified in ctDNA samples of these two cohorts, it was observed that TP53 (18.1% of all mutations), APC (3.3%) and DNMT3A (2.5%) were the most frequently mutated tumor suppressor genes, while EGFR (11.9%), KRAS (3.7%) and PIK3CA (3.0%) were the most frequently mutated oncogenes (Fig. 5a). 35.3% (662 out of a total of 1874 mutations in cohorts I and II) of all mutations detected in ctDNA are potentially clinically-actionable, as defined by three criteria: 1) they are related to FDA approved drugs or therapies; 2) they contribute to clinical therapy choice and outcome predictions in published clinical studies; and 3) they are targets of drugs or therapies that are currently under active clinical trials, showing promising intervention results27, 34 (Fig. 5a, green and red portions; Supplementary Table 4). Among these, 66.0% (437 out of 662) of mutations can be targeted by drugs already approved or currently in clinical trials (Fig. 5a, red portion). In summary, at least one clinically-actionable mutation was detected in 71% of patients (376 out of 529 patients with mutations detectable in their ctDNA, Fig. 5b), and 54% of patients had at least one druggable mutation (Fig. 5c). Overall, our data strongly suggest that this technique is an informative and effective approach to uncover druggable molecular targets, and has great potential to be used in guiding clinical treatment decisions.
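The proportions quoted in this paragraph can be re-derived from the reported counts with a few lines of arithmetic. The helper name and dictionary labels are illustrative. The final ratio is an approximate cross-check computed from the rounded patient-level percentages, and it appears to correspond to the 76% figure given in the abstract.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`."""
    return 100 * part / whole


# (numerator, denominator) pairs as reported in the text.
reported_counts = {
    "patients with detectable ctDNA mutations": (529, 605),
    "clinically-actionable mutations": (662, 1874),
    "druggable among actionable mutations": (437, 662),
    "patients with an actionable mutation": (376, 529),
}

for label, (part, whole) in reported_counts.items():
    print(f"{label}: {pct(part, whole):.1f}%")

# Share of patients with an actionable mutation who also have a druggable one,
# estimated from the rounded percentages (54% and 71%).
druggable_share = pct(54, 71)
print(f"druggable among actionable patients: about {druggable_share:.0f}%")
```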
Clinically-actionable mutations identified in ctDNA samples. (a) Genes that are frequently mutated in cancer were ranked by their mutation frequency in all ctDNA samples tested, with the proportion of currently druggable mutations and potentially actionable mutations highlighted in red and green, respectively. Genes with low mutation occurrences (≤4) were not shown. (b) The percent of patients that presented with varying numbers of clinically-actionable mutations. (c) The percent of patients that presented with varying numbers of druggable mutations.
Characteristics of Mutations Identified in ctDNA from Chinese Lung Cancer Patients
Our study also comprehensively analyzed the mutation spectrum in Chinese lung cancer patients. 273 lung cancer patients involved in this study were documented with clear clinical histological diagnosis and cancer stage classification. Adenocarcinoma represents 84% (n = 228) of all cases and no obvious gender difference in susceptibility to this subtype was observed (Fig. 6a). However, for squamous cell (n = 30) and small cell carcinoma (n = 15), more male patients were observed than females. By performing a co-mutation plot of mutations identified with the highest incidences in these patients, we showed that TP53 mutations (52.38%) appear most frequently in ctDNA taken from patients with all types of lung cancers, followed by EGFR mutations (~40%) that exhibit a strong preference for patients with adenocarcinoma (Fig. 6a and c). Different genes displayed variable preferences for different types of mutations. For example, EGFR predominantly carried in-frame indels and missense mutations, while ALK was prone to forming fusion genes (Fig. 6b). In EGFR, the most prevalent mutations are exon19-deletion, T790M, L858R and exon20-insertion, with other types of activating EGFR mutations detected at lower frequencies (Fig. 6d). T790M mutations were only present in patients with resistance to tyrosine kinase inhibitor (TKI) treatment35. EGFR-C797G/S mutations, which were reported36 as acquired resistance mutations in patients treated with the third-generation TKI AZD9291, were also detected.
Mutation analysis of lung cancer patients. (a) A co-mutation plot of various types of mutations in the ctDNA of lung cancer patients. Only genes with more than 10 occurrences are shown in this plot. (b) The composition of mutation types within each gene. (c) The percentage of patients with mutations in each gene. (d) The specific mutations identified in EGFR and their frequencies in the ctDNA of our cohorts.
KRAS mutations were detected in 8% of patients, mainly those with adenocarcinoma, and were largely mutually exclusive with EGFR mutations, as previously reported37, 38, although one patient harbored KRAS-G12L, EGFR amplification and exon19-deletion simultaneously (Fig. 6a). In squamous cell carcinoma, CDKN2A (27%) and PTEN (17%) mutations were more frequent than in other cancer subtypes, while in small cell carcinoma, RB1 mutations were more prevalent (40%) (Fig. 6a). Taken together, our study validates the use of targeted NGS-based genetic testing of ctDNA samples in clinical practice for guiding therapy decisions and monitoring treatment responses in lung cancer.
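Mutual exclusivity between two genes is commonly quantified with a one-sided Fisher's exact test on the 2×2 table of patients carrying both, one, or neither mutation. The text does not report such a test for KRAS and EGFR, so the following is only a standard-library sketch with hypothetical counts; the function name is illustrative.

```python
from math import comb


def mutual_exclusivity_p(both, only_a, only_b, neither):
    """One-sided Fisher's exact p-value for observing `both` or fewer co-mutated patients.

    Small values indicate that the two genes are co-mutated less often than
    expected if their mutations occurred independently.
    """
    n_a = both + only_a                      # patients with gene A mutated
    n_b = both + only_b                      # patients with gene B mutated
    total = both + only_a + only_b + neither
    lowest = max(0, n_a + n_b - total)       # smallest feasible overlap
    favorable = sum(
        comb(n_a, k) * comb(total - n_a, n_b - k)
        for k in range(lowest, both + 1)
    )
    return favorable / comb(total, n_b)


# Hypothetical cohort of 200 patients, not taken from the study:
# 20 with gene A mutated, 81 with gene B mutated, 1 with both.
p_exclusive = mutual_exclusivity_p(both=1, only_a=19, only_b=80, neither=100)
print(f"p = {p_exclusive:.2e}")
```

With these hypothetical counts, about 8 co-mutated patients would be expected under independence, so a single co-mutated patient yields a small p-value.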
In this study, we showed that targeted NGS of ctDNA is non-invasive, can be completed with a fast turnaround time, and can be standardized to generate reproducible data for routine clinical practice.