Abstract
The increase in the number of both patients and healthcare practitioners who grew up using the Internet and computers (so-called “digital natives”) is likely to impact the practice of precision medicine, and requires novel platforms for data integration and mining, as well as contextualized information retrieval. The “Illuminating the Druggable Genome Knowledge Management Center” (IDG KMC) quantifies data availability from a wide range of chemical, biological, and clinical resources, and has developed platforms that can be used to navigate understudied proteins (the “dark genome”) and their potential contribution to specific pathologies. Use of the “Target Importance and Novelty Explorer” (TIN-X) highlights the role of LRRC10 (a dark gene) in dilated cardiomyopathy. Combining mouse and human phenotype data increases the strength of evidence, which is discussed for four additional dark genes: SLX4IP and its role in glucose metabolism; the role of HSF2BP in coronary artery disease; the involvement of ELFN1 in attention-deficit hyperactivity disorder; and the role of VPS13D in mouse neural tube development, together with its confirmed role in childhood onset movement disorders. The workflow and tools described here are aimed at guiding further experimental research, particularly within the context of precision medicine.
Acknowledgements
This work was supported by NIH Grants U54CA189205, U24CA224370 (for IDG KMC), and U24TR002278 (for IDG RDOC).
Ethics declarations
Conflict of interest
Dr. Oprea was a former full-time employee at AstraZeneca (1996–2002). He has received honoraria, or consulted for, Abbott, AstraZeneca, Chiron, Genentech, Infinity Pharmaceuticals, Merz Pharmaceuticals, Merck Darmstadt, Mitsubishi Tanabe, Novartis, Ono Pharmaceuticals, Pfizer, Roche, Sanofi, and Wyeth. His spouse was a full-time employee of AstraZeneca (2002–2014) and is a full-time employee of Genentech Inc.
Cite this article
Oprea, T.I. Exploring the dark genome: implications for precision medicine. Mamm Genome 30, 192–200 (2019). https://doi.org/10.1007/s00335-019-09809-0
DOI: https://doi.org/10.1007/s00335-019-09809-0