Cluster Computing, Volume 22, Supplement 3, pp 5435–5446

A joint deep model of entities and documents for cumulative citation recommendation

  • Lerong Ma
  • Dandan Song (corresponding author)
  • Lejian Liao
  • Yao Ni


Knowledge bases (such as Wikipedia) are valuable resources of human knowledge that have contributed to a variety of applications. However, because they are maintained manually, their contents lag well behind the up-to-date information about entities. Cumulative citation recommendation (CCR) concentrates on identifying citation-worthy documents from a large volume of stream data for a given target entity in a knowledge base. Most previous approaches first carefully extract human-designed features from entities and documents, and then leverage machine learning methods such as SVM and Random Forests to filter citation-worthy documents for target entities. Handcrafted features for entities and documents have several problems: (1) designing them is an empirical process that requires expert knowledge, so they cannot be easily generalized; (2) the effectiveness of the human-designed features strongly affects performance; (3) the feature extraction process is resource-dependent and time-consuming to implement. In this paper, we present a Joint Deep Neural Network Model of Entities and Documents for CCR, termed DeepJoED, which identifies highly related documents for given entities with several layers of neurons by automatically learning feature extraction for the entities and documents and training the networks in an end-to-end fashion. An extensive set of experiments has been conducted on the benchmark dataset provided in the Text REtrieval Conference (TREC) Knowledge Base Acceleration (KBA) task in 2012. The results show that the model brings a significant improvement over the state-of-the-art results on this dataset in CCR.
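The abstract does not specify the DeepJoED architecture, so the following is only a minimal sketch of the general idea it describes: encode an entity and a document with separate learned encoders and score the pair jointly, with no handcrafted features. The use of a shared word-embedding table, one convolution layer with max-over-time pooling per encoder, concatenation of the two representations, and a sigmoid output are all assumptions made for illustration, as are every size and name below. The weights are random and untrained, and only the forward pass is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SIZE = 50   # hypothetical vocabulary size
EMBED_DIM = 8     # hypothetical word-embedding dimension
NUM_FILTERS = 4   # hypothetical number of convolution filters
WINDOW = 3        # hypothetical convolution window, in tokens


def init_tower():
    """Random convolution filters and biases for one text encoder."""
    return {
        "W": rng.normal(0.0, 0.1, size=(NUM_FILTERS, WINDOW * EMBED_DIM)),
        "b": np.zeros(NUM_FILTERS),
    }


def encode(token_ids, embeddings, tower):
    """Embed tokens, apply a 1-D convolution with ReLU, then max-pool over time."""
    x = embeddings[token_ids]                      # (seq_len, EMBED_DIM)
    windows = np.stack([
        x[i:i + WINDOW].ravel()                    # flatten each token window
        for i in range(len(token_ids) - WINDOW + 1)
    ])                                             # (num_windows, WINDOW * EMBED_DIM)
    feature_maps = np.maximum(0.0, windows @ tower["W"].T + tower["b"])
    return feature_maps.max(axis=0)                # (NUM_FILTERS,)


def score(entity_ids, doc_ids, params):
    """Relevance score in (0, 1) for an (entity, document) pair."""
    e = encode(entity_ids, params["embeddings"], params["entity_tower"])
    d = encode(doc_ids, params["embeddings"], params["doc_tower"])
    joint = np.concatenate([e, d])                 # assumed joint representation
    logit = joint @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))            # sigmoid


params = {
    "embeddings": rng.normal(0.0, 0.1, size=(VOCAB_SIZE, EMBED_DIM)),
    "entity_tower": init_tower(),
    "doc_tower": init_tower(),
    "w_out": rng.normal(0.0, 0.1, size=2 * NUM_FILTERS),
    "b_out": 0.0,
}

entity_tokens = np.array([3, 17, 8, 21])            # toy entity profile
document_tokens = np.array([5, 17, 8, 40, 2, 11])   # toy stream document
relevance = score(entity_tokens, document_tokens, params)
```

In an end-to-end setting, the embeddings, both encoders, and the output weights would be trained together against labelled entity-document pairs, so the features are learned rather than designed by hand. Each input must contain at least `WINDOW` tokens for the convolution to produce an output.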


Keywords: Knowledge base acceleration; Cumulative citation recommendation; Word embedding; Convolution Neural Networks; Latent semantic representations



This work has been supported by the National Key Research and Development Program of China (Grant No. 2016YFB1000902), the National Natural Science Foundation of China (Grant No. 61472040), and the Natural Science Basic Research Plan in Shaanxi Province of China (Grant No. 2016JM6082).



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  2. College of Mathematics and Computer Science, Yan’an University, Yan’an, China
