Prediction of Target Genes Based on Multiway Integration of High-Throughput Data

  • Wei-Li Guo
  • Kyungsook Han
  • De-Shuang Huang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9771)


In the past few years, ChIP-seq data have emerged as a powerful resource for discovering regulatory functional elements and the genetic causes of disease. However, owing to cost or the limited availability of biological material, it is hard to run an experiment for every transcription factor in every cell line, which restricts analyses that require complete data to the combinations for which experiments already exist. Imputing the missing ChIP-seq data can help to solve this problem, but because of the massive scale of ChIP-seq data, traditional methods are either unsuitable or too time-consuming. In this paper, we propose a tensor completion-based method for imputing ChIP-seq data by modeling the ChIP-seq dataset as a 3-way tensor. The results show that the proposed method achieves higher imputation accuracy than state-of-the-art baseline methods.
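
The general idea of completing a 3-way tensor can be illustrated with a minimal sketch. This is not the authors' algorithm: the keywords point to trace-norm regularization over latent factors, whereas the sketch below swaps in plain rank truncation of each mode unfolding for brevity. The function names, the `rank` and `n_iter` parameters, the axis interpretation (for example transcription factor × cell line × gene), and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor with the given full shape."""
    moved = (shape[mode],) + tuple(s for i, s in enumerate(shape) if i != mode)
    return np.moveaxis(matrix.reshape(moved), 0, mode)


def truncate_rank(matrix, rank):
    """Best rank-`rank` approximation of a matrix via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def complete_tensor(observed, mask, rank=1, n_iter=100):
    """Fill entries where `mask` is False (hypothetical helper).

    Each iteration averages low-rank approximations of all mode
    unfoldings, then restores the observed entries unchanged.
    """
    x = np.where(mask, observed, 0.0)  # start from a zero fill
    for _ in range(n_iter):
        estimate = sum(
            fold(truncate_rank(unfold(x, m), rank), m, x.shape)
            for m in range(x.ndim)
        ) / x.ndim
        x = np.where(mask, observed, estimate)
    return x


if __name__ == "__main__":
    # Synthetic rank-1 tensor standing in for real binding scores.
    rng = np.random.default_rng(0)
    factors = [rng.uniform(0.5, 1.5, n) for n in (6, 7, 8)]
    truth = np.einsum("i,j,k->ijk", *factors)
    mask = rng.random(truth.shape) < 0.7  # about 70% of entries observed
    filled = complete_tensor(truth, mask, rank=1)
    rel_err = np.linalg.norm((filled - truth)[~mask]) / np.linalg.norm(truth[~mask])
    print(f"relative error on held-out entries: {rel_err:.3f}")
```

Holding out a subset of known entries, as in the demo, is one simple way to compare the imputation accuracy of different completion methods.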


Keywords: ChIP-seq data; Transcription factor; Target gene; Latent factor; Trace norm



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institute of Machine Learning and Systems Biology, College of Electronics and Information Engineering, Tongji University, Shanghai, China
  2. Department of Computer Science and Engineering, Inha University, Incheon, South Korea
