Background

Advances in treatments for cancer have generally come incrementally because novel treatments are subjected to large prospective randomized clinical trials. In these studies, several hundred patients are randomized to one treatment arm or another and the treatment associated with the best outcome is advanced. This method has worked well for relatively common cancers, including breast and colon cancers. This approach, however, falls short when one is faced with rare cancers for which prospective trials involving large numbers of patients are difficult or impossible to conduct. In these cases, oncologists may choose chemotherapy regimens because the rare tumor is thought to be similar to a more common cancer for which an accepted standard treatment exists. Such is the case with cancers of the ampulla of Vater. These cancers account for only 0.2% of gastrointestinal cancers and approximately 7% of periampullary tumors. Periampullary tumors arise from the pancreatic ductal epithelium, the distal common bile duct, the duodenal mucosa, or the ampulla of Vater. When resectable, ampullary cancers are treated like pancreatic cancers with a pancreaticoduodenectomy. When they present at an advanced metastatic stage, there is little information guiding choices for chemotherapy regimens. Although they represent a minority in such trials, patients with ampullary cancers are often included in clinical trials of patients with biliary tract cancers, so these patients are often treated with gemcitabine and cisplatin [1].

Genomic technologies have resulted in some limited but remarkable advances in cancer treatment. Prior to the discovery of the Philadelphia chromosome and the identification of the BCR/ABL fusion protein leading to the development of imatinib, chronic myelogenous leukemia, a relatively rare form of the disease, was nearly uniformly fatal. Treatment was a bone marrow transplant with its attendant high risks of both morbidity and death. Treatment with imatinib, a tyrosine kinase inhibitor, can induce remission in approximately 87% of patients with greatly reduced risks of complications [2]. Imatinib was subsequently also found to be remarkably effective against gastrointestinal stromal tumors [3]. Other targeted drugs that have recently been shown to have efficacy in the setting of an identified genomic aberration include vismodegib in advanced basal cell skin cancers harboring mutations in PTCH1, and vemurafenib in patients with advanced melanoma exhibiting a V600E mutation in the BRAF (v-raf murine sarcoma viral oncogene homolog B1) gene product [4, 5].

The rapid advancement of genomic technologies offers the possibility to tailor chemotherapy based on an in-depth analysis of a limited number of tumor samples. The advent of next generation sequencing technologies has now paved the way for near complete interrogation of tumor genomes, providing the first opportunity for efficient global genomic tumor profiling at the point mutation, copy number, and breakpoint dimensions of the cancer genome. At a time in which there is an increasing array of chemotherapy drugs targeting aberrant molecular pathways, individualized genomic analysis to aid treatment decisions is quickly becoming feasible. Such an approach seems particularly well suited to the treatment of rare cancers for which there is a paucity of other clinical data to guide therapy. To demonstrate the potential clinical utility of individualized genomic analysis in patients with rare cancers, we applied whole genome sequencing to the tumor of a 63-year-old man with a resected cancer of the ampulla of Vater and identified therapeutic targets distinct from what would have been targeted based on existing literature.

Materials and methods

Samples

Written informed consent was obtained and the patient samples were collected for research purposes at Banner Good Samaritan Medical Center, Phoenix, Arizona. The study was approved by the Western Institutional Review Board (WIRB) and was conducted in accordance with the 1996 Declaration of Helsinki. This was a study entitled 'Pancreas Cancer Biospecimens Repository' (WIRB® Protocol #20040832). Informed consent was obtained from the patient with cancer of the ampulla of Vater, including written consent for collection of the tissue and whole blood samples as well as clinical information and for genetic analysis of the specimens. The samples were then anonymized and assigned a unique identifier. Samples included fresh frozen tumor tissue collected within 20 minutes after surgical resection. Whole blood was obtained before the start of the operation at the time of induction of anesthesia. The frozen specimen was assessed histopathologically for quality and determined to contain approximately 60% tumor cellularity. DNA and RNA were extracted from frozen tissue and whole blood using the Qiagen All Prep kit (Germantown, MD, USA) following the manufacturer's recommendations.

Next generation sequencing

To facilitate whole genome next generation sequencing, we utilized the Life Technologies SOLiD™ (version 3) technology with mate-pair chemistry following the manufacturer's recommendations (Carlsbad, CA, USA). Briefly, 20 µg of genomic DNA is mechanically sheared to an average fragment size of 1.5 kb using the HydroShear. These size-selected fragments are then end repaired and circularized around a long mate-pair adaptor by nicked ligation. Nick translation is then used to displace the nick roughly 70 bp from either side of the internal adaptor. A nuclease reaction linearizes these fragments. SOLiD™-specific sequencing adaptors are then ligated to the ends of these fragments. We prepared two independent 1.5 kb mate-pair libraries from the patient's constitutional (germline) DNA, and two independent mate-pair libraries from the patient's tumor DNA. Following PCR amplification, these mate-pair libraries are then used as templates in emulsion PCR reactions using SOLiD™ proprietary sequencing beads to generate clonal single molecule templated beads. Subsequently, an average of 500,000 templated beads are enriched and deposited onto SOLiD™ flowcells for massive ligation-based sequencing to generate 50 bp × 50 bp mate-pair sequences per bead. For this germline/tumor pair, we sequenced an average of one billion beads per library, thus generating two billion mate-pair reads for germline and two billion mate-pair reads for tumor.

Next generation sequencing data processing

Raw next generation sequencing data in the form of csfasta and qual files are used to align 50 bp × 50 bp paired end reads from either the patient germline genome sequence or tumor genome sequence to the reference human genome (NCBI build 36, hg18). For alignment, we utilized the Life Technologies BioScope version 1.3 software suite, which is based upon a seed-and-extend algorithm [6]. Compressed binary sequence alignment/map (BAM) formatted output files for germline and tumor genome alignments are generated and PCR duplicates are subsequently removed using Picard Tools.

Next generation sequencing data analysis

Somatic single nucleotide variants

We employed two different algorithms. The first algorithm (SolSNP) [7] detects a SNP variant by comparing two discrete distributions. It compares the distance of the discrete sampled distribution of the base-pair pileup on each strand to the expected distributions (according to ploidy), and determines the genotype call. This is done using a Kolmogorov-Smirnov-like distance measure based on both the base (that is, reference or alternative base) and the confidence in the base called (that is, the quality score of each base in the pileup). If the genome is haploid, two expected pileups are created at each position: one consisting of only the reference base (a 'homozygous-reference' pileup) and another consisting of only the alternative base (a 'homozygous non-reference' pileup). The confidence of each pileup position is kept the same. The expected pileup that has the minimal Kolmogorov-Smirnov distance to the sampled pileup is considered to be the genotype of the locus on the strand. In diploid genomes, SolSNP also considers a pileup half of which is made up of the reference bases and the other half made of alternative bases (a heterozygous pileup). A locus on the chromosome is called a SNP if a variant genotype (either 'homozygous non-reference' or heterozygous) is detected on both strands. SolSNP can restrict its calls to loci where the genotype calls on both strands are identical. This is achieved by passing the 'Genotype Consensus' value to the parameter 'STRAND_MODE'. In this mode, the tool is able to produce genotype calls as well as variants. The second algorithm (Mutation Walker) calculates a test of proportions for the tumor/normal set to construct a test-statistic for reads in the forward direction and the reverse direction separately. The minimum of these two comparisons is used as the reported test-statistic, ensuring evidence is found in both the forward and reverse directions. Sites with evidence in the normal are filtered from the final report so as to reduce false positives arising from under-sampled polymorphic germline events. Calls common to both algorithms were considered for further examination. To reduce the false negative rate, two sets of common calls were made, one with a strict and the other with a lenient set of parameters for both algorithms. Both sets were visually examined for false positives, which were then filtered to get a final list of true single nucleotide variants.
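The strand-wise test of proportions described for the second algorithm can be illustrated with a short sketch. This is not the Mutation Walker implementation: the pooled two-proportion z statistic, the function names, the input layout, and the cutoff of 3.0 are all assumptions chosen for illustration.

```python
import math


def two_proportion_z(alt_t, depth_t, alt_n, depth_n):
    """Pooled z statistic for tumor alt fraction minus normal alt fraction."""
    if depth_t == 0 or depth_n == 0:
        return 0.0
    pooled = (alt_t + alt_n) / (depth_t + depth_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / depth_t + 1 / depth_n))
    if se == 0:
        return 0.0  # no variation at all on this strand, so no evidence
    return (alt_t / depth_t - alt_n / depth_n) / se


def strand_aware_statistic(tumor, normal):
    """tumor/normal map 'fwd' and 'rev' to (alt_count, depth) tuples.

    The minimum over strands is reported, so a call needs support
    in both read directions.
    """
    z_fwd = two_proportion_z(*tumor["fwd"], *normal["fwd"])
    z_rev = two_proportion_z(*tumor["rev"], *normal["rev"])
    return min(z_fwd, z_rev)


def is_candidate_somatic(tumor, normal, z_cutoff=3.0):
    """Hypothetical filter: drop sites with any alt evidence in the normal."""
    if normal["fwd"][0] + normal["rev"][0] > 0:
        return False
    return strand_aware_statistic(tumor, normal) >= z_cutoff


normal = {"fwd": (0, 20), "rev": (0, 19)}
balanced = {"fwd": (10, 20), "rev": (9, 18)}
strand_biased = {"fwd": (10, 20), "rev": (0, 18)}
print(is_candidate_somatic(balanced, normal))
print(is_candidate_somatic(strand_biased, normal))
```

Taking the minimum means a variant seen on only one strand scores zero, which is the behavior the text attributes to the reported test-statistic.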

Indel detection

For detecting somatic indels we employed a two-step strategy. In the first step, we removed reads from the tumor sample BAM whose insert size lay outside the interval (500, 5000) for SOLiD™. Genome Analysis Toolkit [8] was then used to generate a list of potential small indels from this BAM. A customized Perl script, which used the Bio-SamTools library from BioPerl [9], then took these indel positions and for each of the indels looked at the region in the germline sample consisting of five bases upstream of the start and five bases downstream of the end of the indel. An indel was determined to be somatic only if there was no indel detected in the region under consideration.
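The two filters in this strategy can be sketched as follows. The original work used a Perl script with Bio-SamTools; the Python functions below, their names, and the representation of germline indels as a list of positions are assumptions for illustration, and the insert-size interval is treated as open.

```python
def passes_insert_size(insert_size, low=500, high=5000):
    """Keep a tumor read pair only if its insert size lies inside (low, high)."""
    return low < insert_size < high


def is_somatic_indel(indel_start, indel_end, germline_indel_positions, flank=5):
    """Call a tumor indel somatic if the germline shows no indel within
    `flank` bases upstream of its start or downstream of its end."""
    lo = indel_start - flank
    hi = indel_end + flank
    return not any(lo <= pos <= hi for pos in germline_indel_positions)


# A tumor indel spanning positions 100-102 on some chromosome.
print(is_somatic_indel(100, 102, [94, 250]))   # germline indels outside the region
print(is_somatic_indel(100, 102, [96]))        # germline indel inside the region
```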

Structural variants

Structural variants were analyzed by comparing two sources of information: relative normal/tumor read-level coverage and anomalously mapping read pairs. Assessing structural variants by read-level coverage is termed copy-number analysis since it is parallel in concept to microarrays. In copy number analysis, gains and losses were determined by calculating the log2 difference in normalized coverage between tumor and germline. Briefly, we investigated regions in 100 bp windows where the coverage in the germline was between 0.1 and 10 times the modal coverage in order to remove regions with high degrees of repeat sequence (for example, centromeres or difficult to sequence regions). Normalized coverage was determined as the log2 of the coverage within a 100 bp bin over the overall modal coverage. We then reported the difference between the germline and tumor normalized coverage by a sliding window of size 2 kb. Deleted and amplified regions were flagged by a departure of greater than 0.75 from baseline. Moderate deletions were identified by a similar method utilizing sequence coverage rather than clonal coverage for consensus coding sequence exons only.
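The coverage-based procedure above can be sketched in a few lines. The following is an illustration under stated assumptions, not the pipeline used in this study: the function names are hypothetical, the sliding window advances one 100 bp bin at a time, the window value is the mean of its bins, and a coverage floor of 0.5 reads is assumed so that bins with zero tumor coverage do not produce an undefined logarithm.

```python
import math
from statistics import mode


def log2_coverage_differences(germline_bins, tumor_bins, floor=0.5):
    """Per-bin log2 difference (tumor minus germline) of mode-normalized coverage.

    Bins whose germline coverage falls outside 0.1x to 10x the germline
    mode are masked with None.
    """
    g_mode = mode(germline_bins)
    t_mode = mode(tumor_bins)
    diffs = []
    for g, t in zip(germline_bins, tumor_bins):
        if not (0.1 * g_mode <= g <= 10 * g_mode):
            diffs.append(None)
            continue
        t_norm = math.log2(max(t, floor) / t_mode)  # floor is an assumption
        g_norm = math.log2(g / g_mode)
        diffs.append(t_norm - g_norm)
    return diffs


def call_windows(diffs, bins_per_window=20, threshold=0.75):
    """Flag sliding windows (20 bins of 100 bp = 2 kb) departing from baseline."""
    calls = []
    for start in range(len(diffs) - bins_per_window + 1):
        window = [d for d in diffs[start:start + bins_per_window] if d is not None]
        if not window:
            continue
        mean_diff = sum(window) / len(window)
        if mean_diff > threshold:
            calls.append((start, "gain", mean_diff))
        elif mean_diff < -threshold:
            calls.append((start, "loss", mean_diff))
    return calls


# Toy data: 50 bins, with tumor coverage halved across bins 15-34.
germline = [30] * 50
tumor = [30] * 15 + [15] * 20 + [30] * 15
for start_bin, kind, value in call_windows(log2_coverage_differences(germline, tumor)):
    print(start_bin, kind, round(value, 2))
```

In this toy example a halving of tumor coverage gives a log2 difference of -1, so only windows composed mostly of the affected bins cross the 0.75 threshold.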

In anomalous read-pair analysis, we used Perl scripts to detect enrichment of anomalously mapping read-pairs. These would be read-pairs that deviate from the expected mate-pair orientation of both reads occurring in the same direction or read-pairs that are outside the expected 1.5 kb insert size. A series of customized Perl scripts was employed in the detection of translocations. These scripts used SAMtools [10] internally to access the BAM files. The analysis itself was made up of two steps. The first was the detection of a potential translocation in both tumor and germline samples. The second was comparison of a potential translocation in tumor to those detected in the germline sample to weed out potential false positives for statistical identification of outliers. The genome was traversed by a walker with a step size equivalent to the insert size, and in each window the number of anomalous reads was counted, that is, those reads whose mates align on a different chromosome. For each window we chose the highest hit to be the chromosome to which mates of most of the discordant reads mapped. We compared the ratios of discordant reads to the total aligned reads across all the windows to detect potential outliers. Outlier detection was done under the assumption that the proportion of hit discordant reads in 2 kb windows aggregated across the chromosome follows a normal distribution. We then computed the mean of the distribution and chose a cutoff of 3 standard deviations. A window with a proportion of hit discordant reads higher than this cutoff contained the region of potential translocation. The actual region of translocation is then determined by the span of the hit discordant reads in the window. For somatic translocations, the germline and the tumor sample are called separately and regions of overlap are eliminated. The output is a general feature format (gff) file of paired lines where the source tag indicates which two genomic regions show potential translocations. These regions were further inspected to reduce false positives and arrive at a more confident list. Additional details related to the methods for detection of somatic translocations and intrachromosomal rearrangements are included in Additional file 1.
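The outlier step can be sketched as follows. This is a simplified illustration rather than the Perl scripts used here: the input layout (one list of mate chromosomes plus a total read count per window), the function names, and the use of the population standard deviation are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def window_hit_proportion(mate_chroms, total_reads):
    """Return (hit chromosome, proportion of reads whose mates map to it).

    mate_chroms lists, for each discordant read in the window, the
    chromosome on which its mate aligned.
    """
    if total_reads == 0 or not mate_chroms:
        return None, 0.0
    hit_chrom, hit_count = Counter(mate_chroms).most_common(1)[0]
    return hit_chrom, hit_count / total_reads


def outlier_windows(windows, n_sd=3.0):
    """Flag windows whose hit proportion exceeds mean + n_sd standard deviations."""
    hits = [window_hit_proportion(mates, total) for mates, total in windows]
    proportions = [p for _, p in hits]
    cutoff = mean(proportions) + n_sd * pstdev(proportions)
    return [(i, chrom, p) for i, (chrom, p) in enumerate(hits) if p > cutoff]


# Toy chromosome: 30 background windows and one window enriched for chr22 mates.
background = [(["chr5"], 100)] * 30
enriched = [(["chr22"] * 40 + ["chr5"] * 2, 100)]
print(outlier_windows(background + enriched))
```

Running the same function on the germline sample and discarding overlapping regions would correspond to the somatic filtering step described above.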

Validation of next generation sequencing findings

Briefly, ten single nucleotide variants and one local deletion were selected at random for chain termination sequencing (Sanger method). Validation was conducted using tumor DNA. Specific genomic primer pairs (Additional file 2) were designed to anneal in regions flanking the single nucleotide variants and to amplify fragments of approximately 150 to 500 bp in 25-cycle PCR. Some primers carried M13 sequences on the 5' end as a backup for sequencing runs. Reaction products were column purified using a QIAquick PCR Purification kit (Qiagen) and submitted to the Arizona State University sequencing facility. Electropherograms were then manually examined for the presence of mutations/deletions in both orientations (Additional file 3).

Genomic quantitative PCR was performed to validate homozygous PTEN (phosphatase and tensin homolog) deletion (Additional file 4). In addition to the PTEN locus, genes located in adjacent regions of hemizygous deletion (RGR (retinal G protein coupled receptor) and HHEX (hematopoietically expressed homeobox)) were also measured. BICC1 (bicaudal C homolog 1 (Drosophila)) and TRUB1 (TruB pseudouridine (psi) synthase homolog 1), located in unaffected regions of chromosome 10, were used as internal controls. Quantitative PCR reactions were set up in a 384-well plate in triplicate with 3 ng of genomic DNA input per reaction. Amplifications were performed using a LightCycler 480 instrument and SYBR Green I Master Mix (Roche). Melting curves were examined for the presence of a single peak and CT values were used in calculating fold-change according to the ΔΔCT method [11]. All tumor and normal CT values were first normalized to glyceraldehyde 3-phosphate dehydrogenase (GAPDH). The quantity of genomic material present for each gene in the tumor sample was then normalized to its normal counterpart.
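The two normalization steps can be written out as a small worked calculation. The sketch assumes the comparative (2^-ΔΔCT) form of the method and perfect amplification efficiency, that is, a doubling per cycle; the function name and the CT values are hypothetical.

```python
def fold_change(ct_gene_tumor, ct_ref_tumor, ct_gene_normal, ct_ref_normal):
    """Relative quantity of a gene in tumor versus normal DNA.

    Each sample is first normalized to the reference gene (GAPDH in the
    text), then the tumor value is normalized to the normal value.
    """
    delta_tumor = ct_gene_tumor - ct_ref_tumor
    delta_normal = ct_gene_normal - ct_ref_normal
    return 2 ** -(delta_tumor - delta_normal)


# Hypothetical CT values: the target amplifies one cycle later in tumor.
print(fold_change(26.0, 20.0, 25.0, 20.0))  # 0.5, consistent with loss of one copy
print(fold_change(25.0, 20.0, 25.0, 20.0))  # 1.0, no change
```

Under these assumptions a hemizygous deletion in a pure tumor sample would give a value near 0.5 and a homozygous deletion a value approaching 0, with admixed normal cells raising both.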

Results

The patient is a 63-year-old Caucasian man diagnosed with adenocarcinoma of the ampulla of Vater. The patient had a Whipple procedure to resect the head of the pancreas, distal stomach, duodenum, distal common bile duct, and gallbladder. The maximum dimension of the tumor, which was present at the junction of the ampullary and duodenal mucosa, was 1.5 cm. The tumor invaded into the duodenal muscle wall but no lymphatic or vascular invasion was noted. There was no evidence of neoplasm at the lines of resection and there was no evidence of metastatic carcinoma in the 16 peripancreatic lymph nodes examined microscopically (pathologic TNM (Tumor, Node, Metastasis) stage T2, N0, M0). The patient's past history is significant for having smoked one to two packs per day for 15 years, stopping approximately 16 years before the diagnosis of his adenocarcinoma of the ampulla of Vater.

Massively parallel whole-genome sequencing was performed on genomic DNA from germline and tumor samples using the Life Technologies SOLiD™ version 4.0 mate-pair chemistry. Basic sequence run statistics based on our analysis pipeline are provided in Table 1. A total of 2.38 and 2.21 billion uniquely mappable reads were generated from germline and tumor DNA, which equates to 108 Gb and 100 Gb of uniquely mappable sequence for germline and tumor, respectively. Therefore, we achieved 37× and 40× genome coverage for tumor and germline, respectively. We detected a total of 2,771,201 SNPs from the germline genome, 91% of which are present in dbSNP (release 129). The transition to transversion ratio was 2.12, which is in line with what would be expected in a diploid human genome [12]. The full genome has been deposited in the database of Genotypes and Phenotypes (dbGaP) of the National Center for Biotechnology Information (submission ID SRA 053213).
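For readers less familiar with the transition to transversion ratio used as a quality check above, a minimal sketch of its calculation follows. The function names and the toy list of variants are illustrative and not part of the analysis pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(snvs):
    """snvs is an iterable of (ref, alt) base pairs with ref != alt."""
    snvs = list(snvs)
    transitions = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    transversions = len(snvs) - transitions
    return transitions / transversions if transversions else float("inf")


# Toy call set: two transitions (A>G, C>T) and one transversion (A>C).
print(titv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))  # 2.0
```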

Table 1 Basic sequencing statistics

To discover somatic mutations within ampullary cancer, we used a custom paired analysis pipeline. The overview of somatic alterations within this tumor is provided in the form of a Circos plot (Figure 1). Our paired analysis revealed 19,143 genome-wide somatic point mutations, of which 30 map within known annotated coding sequences. A list of all somatic missense (n = 28) and nonsense mutations (n = 2) is provided in Table 2. The most notable mutation is an activating KRAS (Kirsten rat sarcoma viral oncogene homolog) mutation at codon 12 (G12V), which is one of the most commonly reported mutations in ampullary carcinomas [13, 14]. Furthermore, we discovered three somatic small insertions and deletions within coding regions, which result in frameshift mutations (Table 2). All missense mutations were assessed for likely functional consequences using the SIFT prediction algorithm [15, 16], which characterized mutations as tolerated or damaging. Of the 28 missense mutations that were assessed, 19 (68%) were predicted to be damaging. Previously, we calculated the rate of SIFT damaging calls from a random set of approximately 10,000 missense variants from the 1000 Genomes data, which showed a rate of damaging mutations of 15%. Validation by Sanger sequence analysis is presented in Additional file 3.

Figure 1

Circos plot summarizing somatic events contained within pancreatic tumor of the ampulla of Vater. The outer ring shows gene symbols for those genes somatically altered in the tumor relative to their map position against the human genome chromosome karyotype. Blue tick marks denote genes containing nonsynonymous point mutations. Cyan tick marks denote genes containing coding indels. Magenta tick marks represent discordant read pairs supporting putative translocation events and those genes involved in breakpoints. The inner ring represents somatic copy number events with regions of gain shown in red and regions of loss shown in green, with brighter colors denoting higher degrees of gain or loss. Magenta lines in the center represent breakpoint regions for translocation events.

Table 2 List of somatic coding point mutations and small indels

To identify regions of somatic copy number loss, we utilized a basic algorithm that determined log2 ratios in coverage difference between tumor and germline over a sliding window of 4,000 bp. Regions of copy number gain or loss are shown in Figure 1. This tumor exhibited whole chromosome copy number gains of chromosomes 2 and 8, along with copy number loss of chromosome 19. Most significant was an approximately 20 Mb interstitial deletion at 10q23, which also contained a more focal region (2 Mb) of homozygous loss that encompassed the PTEN tumor suppressor gene (Figure 2). No other regions of focal gain or amplification were detected in this tumor (validation data are presented in Additional file 4).

Figure 2

Zoomed-in view of the 10q region containing the focal homozygous deletion encompassing the PTEN tumor suppressor gene.

To identify potential cis chromosomal rearrangements and translocation events, we searched for significant evidence of discordant mate pairs. The long insert mate pairs provide improved power for detecting structural alterations through improved clonal coverage. Clonal coverage can be defined as the genomic coverage (for example, 30×) multiplied by the length of the insert (1,500 bp), divided by the amount of sequence derived from each mate pair (100 bp). For example, at 37× genomic coverage for our tumor specimen and with 1,500 bp average mate-pair insert size, and with 2 × 50 bp mate-pairs (or 100 bp total), we achieve a clonal coverage of 432×. With such high clonal coverage we have significant power to detect evidence of discordant mate-pair reads, where the length of the insert deviates substantially from the mean insert length and/or the reads map to different chromosomes or chromosomal regions. Utilizing an algorithm that identified discordant mate-pairs specific to the tumor, we discovered two independent translocation events occurring in the tumor. Both events involve genes on each side of the translocation event. One event is evidenced by significant discordant read pairs in the tumor mapping to the LINGO2 (leucine rich repeat and Ig domain containing 2) locus at 9q21.1 (chr9: 27990017-27991975), which is translocated to the TTC28 (tetratricopeptide repeat domain 28) locus at 22q12.1 (chr22: 27401302-27401562) (Figure 1). A second event is evidenced by discordant mate-pair reads mapping to the PRIM2 (primase, DNA, polypeptide 2) locus at 6p12.1 (chr6: 57450028-57451992) and to the NPAS3 (neuronal PAS domain protein 3) locus at 14q13.1 (chr14: 33206124-33207653) (Figure 1).
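The definition of clonal coverage given above reduces to a one-line formula, sketched here with a hypothetical function name. The example uses the illustrative 30× value from the definition; it makes no attempt to reproduce the 432× figure, which may rest on effective coverage or insert sizes not stated here.

```python
def clonal_coverage(genomic_coverage, insert_size_bp, sequenced_bp_per_pair):
    """Genomic coverage scaled by insert length over sequenced bases per mate pair."""
    return genomic_coverage * insert_size_bp / sequenced_bp_per_pair


# 30x sequence coverage, 1,500 bp inserts, 2 x 50 bp reads per mate pair.
print(clonal_coverage(30, 1500, 100))  # 450.0
```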

Discussion

Adenocarcinomas of the ampulla of Vater are relatively rare, accounting for only 0.2% of gastrointestinal cancers [17]. Perhaps due to their location and propensity to present with jaundice at an early stage, these tumors are more likely to be resectable at the time of diagnosis than are pancreatic cancers [18]. Furthermore, in comparison to pancreatic cancer, resected ampullary cancers are associated with better 5-year survival rates of 34 to 61% [19-21]. Surgical series have demonstrated that the factors affecting survival include completeness of surgical resection and nodal status. Surgical treatments for ampullary cancer and cancers in the head of the pancreas are similar in that surgeons perform a pancreaticoduodenectomy. Thereafter, the treatments may diverge. There is no clear consensus on the role of or the optimal regimen for adjuvant chemotherapy in ampullary cancers. Similarly, in part due to its relative rarity, there is no clear standard chemotherapeutic regimen for recurrent or metastatic ampullary cancer.

A better understanding of molecular oncogenesis and the emergence of targeted agents will likely lead to improved treatment outcomes in this and other cancers. Our study used whole genome sequencing to analyze the genome of a resected ampullary carcinoma. We found expected as well as novel aberrations. We found an activating mutation in KRAS codon 12. KRAS mutations are common in ampullary cancer although the 25 to 37% incidence appears to be lower than the approximately 95% rate of KRAS mutation seen in pancreatic adenocarcinomas [13, 14, 22, 23]. Furthermore, similar to what is seen in colonic adenomas, KRAS mutations occur in benign ampullary adenomas, suggesting activating mutations of KRAS are relatively early events in the progression toward cancer and the mutation does not appear to affect prognosis [14]. This tumor also demonstrated a somatic nonsynonymous mutation in SMAD4 (mothers against decapentaplegic homolog 4), which has been observed previously in 50% of ampullary cancers but infrequently in bile duct cancers [24].

The most notable gene deletion we found was a focal deletion of a region in chromosome 10 including the PTEN tumor suppressor gene (phosphatase and tensin homologue deleted on chromosome 10). Cowden's syndrome is characterized by a germline mutation in the PTEN gene resulting in loss of function. This syndrome is characterized by noncancerous hamartomas of the skin and mucous membranes and affected patients have an increased risk of tumors of the breast, thyroid, uterus and gastrointestinal tract. Benign tumors of the ampulla of Vater have been reported in patients with Cowden's syndrome but are not a common feature within cancers of the ampulla. Loss of PTEN expression by immunohistochemistry has been associated with liver metastases and poor prognosis in colon cancer [25]. In a large-scale survey of the genomic aberrations of pancreatic cancers, PTEN deletions were not seen, although small deleterious coding mutations were detected [26]. We can conclude that despite their anatomic location in proximity to the pancreas, ampullary cancers are distinct entities from adenocarcinoma of the pancreas and bile duct cancers and thus should be treated as a different entity.

To that end, the loss of PTEN expression is important not only in the pathogenesis but also because it exposes a potential therapeutic target (Figure 3). The PTEN protein product is an inhibitor of phosphoinositide 3-kinase (PI3K) and downstream signaling through AKT. Phosphorylation of AKT results in phosphorylation of several target proteins involved in regulation of key cellular functions, including cell proliferation, glucose metabolism, protein translation, and cell survival [27]. Additionally, activation of the PI3K pathway has been linked to activation of mammalian target of rapamycin (mTOR), although the mechanism is not yet fully elucidated [28]. The presence of a deletion in PTEN in this ampullary cancer would be predicted to release the PI3K/mTOR pathway from inhibition. Consequently, one can infer that an agent that is a dual PI3K/mTOR inhibitor, such as NVP-BEZ235, would be an attractive therapeutic option for our patient should his disease recur [29]. NVP-BEZ235 and other agents like it have been shown in vitro to inhibit growth of cancer cells with activating mutations of PI3K and are all under clinical development [30]. In the case presented here, however, the tumor carries both a KRAS activating mutation and complete inactivation of PTEN, supporting dual activation of both the MEK/ERK and the PI3K/AKT axes (Figure 3). The inhibition of only one axis may not be sufficient for effective treatment as there is likely to be compensatory activity from the other activated axis.

Figure 3

Simplified map and interactions of the phosphoinositide 3-kinase (PI3K) and RAS pathways highlighting the genomic aberrations (-, loss of function mutation; *, gain of function mutation) identified in a cancer of the ampulla of Vater and the putative therapeutic site of vulnerability. ERK, extracellular-signal-regulated kinase; grb2, growth factor receptor bound protein 2; mTOR, mammalian target of rapamycin; RTK, receptor tyrosine kinase; SHC, SHC (Src homology 2 domain containing) transforming protein 1; SOS, son of sevenless.

Our group reported the beneficial results seen in a clinical trial on patients with refractory solid tumors whose chemotherapy was chosen based on analysis of tumor biopsies using immunohistochemistry and expression arrays [31]. New technologies such as those applied herein have made high-throughput whole-genome sequencing a more rapid and cost-effective process in a manner not possible with older technologies such as Sanger sequencing. The prospect is raised, therefore, that one may soon be able to apply whole-genome sequencing to the analysis of an individual patient's tumor to guide an informed choice of a therapeutic regimen. This type of personalized or precision medicine has only begun to be studied. Several limitations remain before this whole-genome sequencing methodology can be widely applied, including cost and the need for improved and standardized bioinformatic analysis, along with reliable and rapid methods for validation of genomic findings. Furthermore, if a target is found, one must have access to an agent and, in many cases, such agents may not be approved for clinical use. Thus, we must begin to understand the links between genomic profile and drug context in early drug development. This is amplified even more where there is evidence to support combination therapies.

Conclusions

We have analyzed the whole genome sequence of a cancer of the ampulla of Vater to uncover the compendium of somatic events occurring in this tumor. Among the mutations discovered were those that might be considered potential therapeutic vulnerabilities. As whole-genome sequencing becomes more rapid and less expensive, the potential for targeted and truly personalized treatments increases. Consequently, as we continue to refine our abilities to uncover the full landscape of somatic alterations, we must in parallel continue innovative drug development methods, including preclinical and early phase I combination trials. This will allow us to understand toxicities and appropriate dosing regimens, to obtain the safest and most appropriate combinations matched to specific genomic and molecular contexts.