Introduction

Cancer is a disease caused by genetic mutation, with common mutations that occur across cancer types yet vary among individual patients. Traditional cancer treatment involves the selection of therapy that is specific for each type of cancer. With the advent of precision medicine, a more effective treatment strategy is implemented according to the genetic profile of each patient’s cancer.

For example, the EGFR gene mutation is found in approximately half of Japanese lung adenocarcinomas [1, 2] and is an important marker for treatment selection. When the EGFR mutation is found in tumor cells of a lung cancer, gefitinib [4, 5], erlotinib [6], and afatinib [7,8,9,10], drugs that target this mutation, are used as successful treatments. However, when the EGFR mutation is T790M, the tumor shows resistance to the above drugs [11,12,13,14,15], and osimertinib, which has a different mechanism of action, is a better treatment option [16]. In addition, when gene amplification of another tyrosine kinase such as MET is observed, resistance to EGFR inhibitors is predicted by bypass signaling [17, 18]. In another example, colorectal cancer patients with constitutively active mutations in the KRAS gene are unlikely to benefit from the EGFR inhibitors cetuximab and panitumumab [19,20,21].

Thus, the most rational therapeutic agent can be selected by considering the type of genetic mutation, the change in functionality due to the mutation, and the signal transduction pathways involved.

Therefore, it is necessary to detect mutations in multiple genes simultaneously and then, from the many mutations found, to curate those that are clinically meaningful and can be targeted for treatment. In this review, we discuss the knowledge bases that allow efficient collection of the reference information needed for this curation process, which is essential to the practice of precision oncology.

Comprehensive cancer panel sequencing tests

Gene panel testing is a technique that simultaneously analyzes multiple genes using next-generation sequencing (NGS). The gene panel testing methods currently used for cancer diagnosis can be roughly grouped into two types: companion diagnostics (CDx) and other in vitro diagnostics (IVD). The former is used to assess the expected outcome of administering a specific therapeutic agent that has been authorized for a cancer with a specific genetic profile, and thus to decide whether to administer that drug as a treatment for that specific cancer. Companion diagnostics use predictive biomarker assays that guide the use of targeted cancer drugs, and are often developed in parallel with the drug [22]. In contrast, the gene profile obtained with the latter provides information on diagnosis or prognosis, and assists in determining a potential therapeutic strategy, including molecular-targeted drugs. Such treatment regimens include approved drug therapy for specific tumor types, off-label treatment for unapproved tumor types, and clinical trials with investigational drugs.

Table 1 shows the gene panel tests currently being conducted in Japan (including tests for research purposes). In April 2018, the Oncomine™ Dx Target Test CDx system was approved as the first gene panel test in Japan. The Oncomine™ Dx Target Test approved in the United States analyzes 46 target genes, and has been approved as a CDx to determine whether to administer the Food and Drug Administration (FDA)-approved drugs targeting three genes (EGFR, BRAF, and ROS1) among 23 genes related to non-small-cell lung cancer (NSCLC) [23]. In Japan, the same system was approved as a CDx to determine treatment only for the BRAF (V600E) mutation (http://www.info.pmda.go.jp/ygo/pack/840863/23000BZX00089000_A_01_01/).

Table 1 Comprehensive cancer panel sequencing tests for implementation in Japan

Meanwhile, as of November 2018, the “NCC OncoPanel” provided by the National Cancer Center, the “Todai OncoPanel” provided by the University of Tokyo, and the “Oncomine Target Test” provided by Osaka University are being implemented as IVD options for cancer genome medicine. The clinical usefulness of these tests is now being verified through the Japanese Advanced Medical Care B program of the Ministry of Health, Labour and Welfare (MHLW), with the aim of obtaining approval by the Pharmaceuticals and Medical Devices Agency (PMDA) and of being implemented under MHLW insurance reimbursement by 2019.

Scheme for cancer genome medicine

Unlike with CDx, determining the outcome of an IVD test is complicated. Figure 1 shows the analysis scheme of cancer genome medicine in Japan. Sequence information obtained by gene panel testing is converted into mutation information through bioinformatics analysis. Subsequently, a curation process is performed, which identifies clinically useful mutations, such as pathogenic mutations that lead to treatment, and a report is then prepared. Finally, an expert panel consisting of experts from several fields decides the treatment policy, taking into consideration existing drugs with an applicable indication and clinical trials in which the patient can participate. Depending on the implementing facility, members of the expert panel, who are required to have extensive knowledge and highly advanced skills, may be responsible for the curation step. Furthermore, the reference information and the curation methods differ among facilities, so the final interpretation may also differ depending on the facility. Therefore, to equalize the quality and standardize the process of curation as much as possible, it is necessary to collect information mechanically.

Fig. 1 Scheme for cancer genome panel test. Genomic DNA is extracted from tumor tissue or biopsies, and sequenced. Variants are called, annotated, and prioritized for potential functional or clinical relevance before being reported to a tumor board, where an interdisciplinary team (the “expert panel”) decides the treatment options. The knowledge base contains useful information on evidence in literature, clinical research, therapeutic drugs, etc.
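The flow described in this section (called variants, curation against a knowledge base, a report for the expert panel) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any facility's actual pipeline: the `Variant` and `Annotation` types, the `curate` and `build_report` functions, and the in-memory `TOY_KB` are all hypothetical, and the single entry shown only restates the EGFR T790M example from the Introduction.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Variant:
    gene: str     # e.g. "EGFR"
    change: str   # protein-level change, e.g. "T790M"


@dataclass(frozen=True)
class Annotation:
    significance: str  # free-text clinical meaning
    source: str        # where the evidence came from


# Hypothetical in-memory knowledge base keyed by (gene, change).
KnowledgeBase = Dict[Tuple[str, str], Annotation]

TOY_KB: KnowledgeBase = {
    ("EGFR", "T790M"): Annotation(
        significance="resistance to gefitinib, erlotinib, and afatinib",
        source="toy entry restating the Introduction",
    ),
}

Curated = List[Tuple[Variant, Optional[Annotation]]]


def curate(variants: List[Variant], kb: KnowledgeBase) -> Curated:
    """Attach a knowledge-base annotation to each called variant.

    Variants without an entry are kept with None so that the expert
    panel can see what was detected but not interpreted.
    """
    return [(v, kb.get((v.gene, v.change))) for v in variants]


def build_report(curated: Curated) -> List[str]:
    """Render one line per variant, annotated variants first."""
    ordered = sorted(curated, key=lambda pair: pair[1] is None)
    lines = []
    for variant, annotation in ordered:
        label = f"{variant.gene} {variant.change}"
        if annotation is None:
            lines.append(f"{label}: no knowledge-base entry")
        else:
            lines.append(f"{label}: {annotation.significance} [{annotation.source}]")
    return lines
```

Keeping unannotated variants in the report, rather than dropping them, reflects the point that the final decision rests with the expert panel and not with the lookup step.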

Knowledge base for curation and treatment selection

Table 2 lists the knowledge bases for mechanically collecting the information necessary for the curation process. There are two types of knowledge base. One is a database that aggregates meaningful mutations among the genetic mutations obtained from NGS data. The other is a knowledge base that automatically integrates information on treatments and clinical trials, and performs evidence-level classification based upon this information. COSMIC [24], which accumulates data specifically on somatic mutations in human cancer, and ClinVar [25], which specializes in pathogenic mutations, are databases of the former type. In the ClinVar database, the interpretation of a mutation may be inconsistent between entries because it is defined by the registrant. In contrast, ClinGen is a database that performs a curation process with uniform standards, and redefines mutations [26,27,28,29].
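Because each ClinVar interpretation is defined by its registrant, one simple mechanical check is to group submitted interpretations by variant and flag the variants on which registrants disagree. The sketch below assumes a hypothetical list of `(variant_id, submitter, interpretation)` tuples; it does not reflect ClinVar's actual file format or API, and `find_conflicts` is a name introduced here for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (variant_id, submitter, interpretation)
Submission = Tuple[str, str, str]


def find_conflicts(submissions: Iterable[Submission]) -> Dict[str, Set[str]]:
    """Return variants whose submitters gave more than one distinct interpretation."""
    by_variant: Dict[str, Set[str]] = defaultdict(set)
    for variant_id, _submitter, interpretation in submissions:
        # Normalize case and whitespace so trivial differences do not count.
        by_variant[variant_id].add(interpretation.strip().lower())
    return {vid: terms for vid, terms in by_variant.items() if len(terms) > 1}
```

Variants returned by such a check are candidates for review under a uniform standard, which is the role the text attributes to ClinGen.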

Table 2 Knowledge base for curation and treatment decision

Integrated knowledge bases include OncoKB [30] developed for MSK-IMPACT panel analysis, JAX Clinical Knowledgebase (JAX-CKB) [31] provided by The Jackson Laboratory, Precision Medicine Knowledgebase (PMKB) [32], Cancer Driver Log (CanDL) [33], My Cancer Genome [34], Personalized Cancer Therapy Database [35], Cancer Genome Interpreter (CGI) [36], DoCM [37], CIViC [38,39,40], and others.

OncoKB is a curated database for MSK-IMPACT panel analysis. It has annotated 4232 alterations in 554 genes, covering 38 tumor types and 76 drugs. OncoKB contains information on alteration types, and provides information on actionability and therapeutic implications, classified into the following six levels according to the available evidence: Level 1, FDA approved; Level 2, Standard care; Level 3, Clinical evidence; Level 4, Biological evidence; Level R1, Standard care resistance; and Level R2, Clinical evidence of resistance. OncoKB provides multiple access methods, including an application programming interface (API), data file download, and web browser-based data access; however, information on clinical trials is not included.
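The six levels listed above lend themselves to a simple lookup table. The following sketch encodes the labels as given in the text and picks the strongest level present for a variant. The ordering used here, and the separation of sensitivity levels from resistance levels, are assumptions made for illustration; the names `ONCOKB_LEVELS` and `strongest` are hypothetical and are not part of the OncoKB API.

```python
from typing import Iterable, List, Optional

# Level labels as listed in the text.
ONCOKB_LEVELS = {
    "1": "FDA approved",
    "2": "Standard care",
    "3": "Clinical evidence",
    "4": "Biological evidence",
    "R1": "Standard care resistance",
    "R2": "Clinical evidence of resistance",
}

# Assumed ranking, strongest first, kept separate for the two kinds of evidence.
SENSITIVITY_ORDER = ["1", "2", "3", "4"]
RESISTANCE_ORDER = ["R1", "R2"]


def strongest(levels: Iterable[str], order: List[str]) -> Optional[str]:
    """Return the highest-ranked level from `order` that occurs in `levels`."""
    present = set(levels)
    unknown = present - set(ONCOKB_LEVELS)
    if unknown:
        raise ValueError(f"unknown evidence level(s): {sorted(unknown)}")
    for level in order:
        if level in present:
            return level
    return None
```

For example, `strongest(["3", "1", "R2"], SENSITIVITY_ORDER)` returns `"1"`, while the same input with `RESISTANCE_ORDER` returns `"R2"`, so a report can show the best sensitivity and resistance evidence side by side.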

The JAX-CKB database contains gene and variant descriptions, drug indication status, clinical trials by indication, treatment approaches, efficacy evidence supporting response to treatment approaches by indication, and resistance evidence supporting resistance to treatments by indication. In October 2018, The Jackson Laboratory introduced a new tiered structure in which users may choose among three levels of access: CKB Core™ provides free access; CKB Boost™, the newest option, provides web-only access to 1000+ genes; and CKB Flex™ provides scalable and flexible content integration into bioinformatic workflows for these 1000+ genes.

The public-access version of JAX-CKB contains only 82 commonly known driver genes. Users can search JAX-CKB for genes, gene variants, drugs, drug classes, indications, and clinical trials. The web-based version of JAX-CKB is designed to query the knowledge base for specific data attributes.

PMKB provides information about clinical cancer variants and their interpretations in a structured way, and allows users to submit new entries and edit existing ones. The database contains 457 variant descriptions with 281 clinical-grade interpretations. Importantly, all interpretations are either written or approved by certified molecular pathologists.

CanDL is an expert-curated database of actionable driver mutations, which has recorded 373 variants in 60 genes. These mutations are categorized into four groups depending on the evidence level and published literature, as follows: mutations corresponding to FDA-approved or National Comprehensive Cancer Network (NCCN)-recommended therapy; mutations with treatment based on evidence from clinical trials, case reports, or exceptional responders; mutations that can predict response or resistance based on evidence in pre-clinical data (in vitro or in vivo); and putative oncogenic driver mutations. The CanDL database can be searched by gene or amino acid substitution, and allows users to download all or selected genes as a comma-separated value file. However, according to the official website, the data were last updated in July 2015.
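A comma-separated download of this kind can be filtered by evidence group with the standard library alone. The sketch below is an assumption-laden illustration: the column names `gene`, `variant`, and `tier`, the numbering of the four groups as 1 to 4, and the function `load_variants` are all hypothetical, and the real CanDL export may be laid out differently.

```python
import csv
import io
from typing import Dict, List

# The four evidence groups described in the text, numbered 1-4 here
# purely for illustration.
TIERS = {
    1: "FDA-approved or National Comprehensive Cancer Network recommended therapy",
    2: "clinical trials, case reports, or exceptional responders",
    3: "pre-clinical evidence (in vitro or in vivo)",
    4: "putative oncogenic driver",
}


def load_variants(csv_text: str, max_tier: int = 4) -> List[Dict[str, str]]:
    """Parse CSV text and keep rows whose tier number is max_tier or lower.

    A lower tier number means stronger evidence. Assumes hypothetical
    columns named gene, variant, and tier.
    """
    if max_tier not in TIERS:
        raise ValueError(f"max_tier must be one of {sorted(TIERS)}")
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if int(row["tier"]) <= max_tier]
```

Calling `load_variants(text, max_tier=2)` would keep only variants supported by approved or recommended therapy or by clinical evidence, which is the subset most likely to matter when a report is prepared for an expert panel.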

My Cancer Genome contains an overview of mutations, actionability, therapeutic implications, and available clinical trials. In addition to a web interface, My Cancer Genome provides access via API and a Mobile App to facilitate dissemination of information and enhance accessibility. The information in My Cancer Genome is hierarchically organized by cancer types, cancer-related genes, and specific cancer-related genetic mutations. However, alteration-specific information is organized by tumor type rather than by alteration, which may cause difficulties when extrapolating information for adaptation to other tumor types.

The Personalized Cancer Therapy Database is a semi-public knowledge base provided by MD Anderson Cancer Center. This knowledge base provides information on the function of common genomic alterations and their therapeutic implications for about 32 genes important in cancer. The contents of this database include overviews of genes and their function, genetic alterations, frequencies and outcomes, therapeutic implications, FDA-approved drugs or investigational therapeutics in clinical trials targeting a pathway, and genotype-selected and genotype-relevant clinical trials. In addition to being collected mechanically, the stored information is reviewed manually by experts from multiple disciplines; in particular, information from clinical trials is updated weekly. Information is organized by gene regardless of the tumor type, and it is also possible to search by mutation. However, the API for this database is not disclosed at this time.

CGI is a knowledge base with a reference database and a curation system. CGI identifies known oncogenic alterations in a tumor, and predicts the effect of the remaining alterations of uncertain significance. It also reports the known influence of these variants on drug response according to the level of supporting clinical evidence, and lists the interactions of existing chemical compounds with genes bearing driver alterations. CGI is freely available through an API or a web interface. Its information resources include the Catalog of Cancer Genes, Catalog of Validated Oncogenic Mutations, Cancer Biomarkers database, and Cancer Bioactivities Database.

DoCM assembles known pathogenic variants validated in cancer. DoCM contains information on 122 cancer sub-types, 132 genes, 1364 variants, variant types, variant effects, and evidence from the literature. DoCM is provided as an open source and open license database, and the API system for data download has been released.

CIViC summarizes and aggregates evidence on clinically actionable variants into clinical interpretations, and is based on crowdsourcing. As a result, CIViC is a highly transparent knowledge base in which everything, from the source of evidence to the source code, is open and accessible. Furthermore, DoCM and CIViC share complementary information.

Approved drug database

Although many useful knowledge bases have been developed in Europe and the United States, their information on therapeutic drugs and related evidence is based on FDA- and European Medicines Agency (EMA)-approved drugs and on trial information from Europe and the United States. To implement precision medicine in Japan, it is essential to check the detected genetic variation against drugs approved in Japan and against the corresponding clinical trial information. Therefore, it is necessary to collect and refer to information on PMDA-approved drugs, as shown in Table 3.

Table 3 Approved drug database

Clinical trial information

To determine whether clinical studies are available for patients for whom standard therapy is not indicated, it is necessary to collect information on clinical research. Databases that collect information on clinical research in each country are shown in Table 4. The United States has a centralized database of clinical trials, because there is an obligation to register clinical trial implementation with the FDA. In Europe, apart from trials registered in each country, there is a central application process in the EU, which is managed by the EMA. In Japan, the Japan Primary Registries Network (JPRN) was established and internationally recognized as a primary registry in 2008. It integrates information from three organizations: the University Hospital Medical Information Network (UMIN) clinical trial registration system (UMIN-CTR), the Japan Pharmaceutical Information Center (JAPIC), and the Japan Medical Association Center for Clinical Trials (JMACCT). JPRN provides clinical trial information to the International Clinical Trials Registry Platform (ICTRP) managed by the World Health Organization (WHO). The International Committee of Medical Journal Editors (ICMJE) requires that a trial be registered in a primary registry of WHO’s ICTRP as a condition for publication. However, in Japan, registration with the three organizations has been optional thus far, and not all associated information was disclosed. In April 2018, the Clinical Trials Act came into force, mandating the registration of clinical trial implementation plans via the Japan Registry of Clinical Trials (jRCT), a system for submitting clinical research implementation plans and releasing research outlines. It is expected that information on the clinical trials conducted in Japan will be collected more efficiently and comprehensively once this system is in operation.

Table 4 Clinical trial databases (administrative organ)

Assurance of curation quality, and standardization and sharing of methods

In April 2018, the US FDA issued two guidance documents on NGS-based tests [41, 42]. These documents set up a system in which an open-access knowledge base that collects evidence supporting the association between a genetic variant and disease can be approved by the FDA. To obtain certification, the knowledge base must undergo rigorous examination of its accuracy, standard operating procedures (SOPs), security, information management, etc. These public databases are expected to contain resources similar to those of ClinGen [28]. For developers of NGS-based genetic tests, this system offers the advantage that using an FDA-approved database reduces the burden of providing additional clinical data. While the draft guidance published in 2016 pertained to germline mutations, the final version also included somatic mutations.

Thus, the government of the United States is working to ensure that quality beyond a certain level is guaranteed. Furthermore, dissemination is promoted by ensuring that the databases are open source. Developers of the knowledge bases also organized the Variant Interpretation for Cancer Consortium (VICC) as a driver project of the Global Alliance for Genomics and Health (GA4GH), and developed a meta-knowledgebase (https://search.cancervariants.org/) that can perform curation across databases. The beta version of this project has been released [43].

In Japan, no consensus method for curation has been established thus far. Both the reference database used for curation and the curation method differ among facilities, and they are often not publicly available. “Clinical practice guidance for next-generation sequencing in cancer diagnosis and treatment (Edition 1.0)” [44] was published by the Japanese Society of Medical Oncology, the Japan Society of Clinical Oncology, and the Japanese Cancer Association. The accompanying “Evidence Levels of Gene Panel Testing Results (ver. 1.0) as of Aug 21, 2017” is a helpful document for selecting treatment methods that can be implemented in Japan. However, since the schedule for updating this information is undecided, other sources will also be needed in the near future to obtain drug approval information, which changes on a daily basis, as well as the latest clinical trial information.

Conclusion

Despite the many challenges discussed above, cancer genome medicine is being practiced in Japan. To select an effective treatment from the gene profile of a patient's cancer cells, curation work is indispensable for identifying the clinical significance of the mutation information obtained and for determining whether an applicable therapy exists. To conduct high-quality curation, it is necessary to comprehensively collect a wide variety of information, to keep up with information on drugs and clinical trials that changes on a daily basis, and to link the latest information correctly. Therefore, the development of an integrated knowledge base system, which takes advantage of bioinformatics and automatic curation while integrating the existing knowledge bases with information on proprietary approved drugs and clinical trials, is urgently needed and is expected to be implemented in the near future, especially in Japan.