Integrating phenotypic features and tissue-specific information to prioritize disease genes

Abstract

Prioritization of candidate disease genes is crucial for improving medical care and is one of the fundamental challenges of the post-genomic era. In recent years, various network-based methods for gene prioritization have been proposed. Previous studies on gene prioritization show that tissue-specific protein-protein interaction (PPI) networks, built by integrating PPIs with tissue-specific gene expression profiles, can perform better than a tissue-naïve global PPI network. Based on the observations that diseases with similar phenotypes are likely to share related genes, and that genes associated with the same phenotype tend to interact with each other, we propose a method to prioritize disease genes on a heterogeneous network built by integrating phenotypic features and tissue-specific information. In this heterogeneous network, the PPI network is built by integrating phenotypic features with a tissue-specific PPI network, and the disease network consists of the diseases that are associated with the same phenotype and tissue as the query disease. To determine the impact of these two factors on gene prioritization, we test three typical network-based prioritization methods on heterogeneous networks that combine different PPI networks and disease networks, each built with or without phenotypic features and tissue-specific information. We also compare the proposed method with other tissue-specific networks. The results of case studies reveal that integrating phenotypic features with a tissue-specific PPI network improves the prioritization results. Moreover, the disease networks generated using our method not only show performance comparable to that of the widely used disease similarity dataset of 5080 human diseases, but are also effective for diseases that are not in the dataset.
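As a rough illustration of the general idea of prioritizing genes on a tissue-specific PPI network, the sketch below filters a toy PPI network to proteins expressed in one tissue and then ranks genes by random walk with restart from a known disease gene. This is not the authors' method: the abstract does not name the three prioritization methods it tests, so random walk with restart is used here only as a common stand-in, and the heterogeneous disease network is omitted. The gene names, the expression set, the restart probability of 0.3, and both function names are assumptions made for the example.

```python
from collections import defaultdict


def tissue_specific_ppi(edges, expressed):
    """Keep only interactions whose two proteins are both expressed in the tissue."""
    return [(u, v) for u, v in edges if u in expressed and v in expressed]


def random_walk_with_restart(edges, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score each node by its steady-state visit probability when walking from the seeds."""
    # Build an undirected adjacency structure from the edge list.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)

    seeds_in_net = [s for s in seeds if s in neighbors]
    if not seeds_in_net:
        raise ValueError("no seed gene is present in the network")

    # Restart distribution: uniform over the seed genes found in the network.
    p0 = {n: 0.0 for n in nodes}
    for s in seeds_in_net:
        p0[s] = 1.0 / len(seeds_in_net)

    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` jump back to a seed, otherwise move to a random neighbor.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(neighbors[u])
            for v in neighbors[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy data (hypothetical): a five-protein global PPI network and one tissue's expressed set.
ppi = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
expressed_in_tissue = {"A", "B", "C", "D"}

tissue_edges = tissue_specific_ppi(ppi, expressed_in_tissue)
scores = random_walk_with_restart(tissue_edges, seeds={"A"})

# Rank candidate genes (everything except the known disease gene) by score.
ranking = sorted((g for g in scores if g != "A"), key=scores.get, reverse=True)
print(ranking)
```

In this toy setting, gene E drops out of the candidate list because it is not expressed in the tissue, which is the kind of effect a tissue-specific network is meant to capture. A fuller version would add the disease network and the gene-disease links that make the network heterogeneous.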

Author information

Correspondence to Lin Gao.

About this article

Cite this article

Deng, Y., Gao, L., Guo, X. et al. Integrating phenotypic features and tissue-specific information to prioritize disease genes. Sci. China Inf. Sci. 59, 070101 (2016). https://doi.org/10.1007/s11432-016-5584-y

Keywords

  • gene prioritization
  • tissue-specific network
  • phenotype
  • PPI network
  • disease network