From Web Crawled Text to Project Descriptions: Automatic Summarizing of Social Innovation Projects

  • Nikola Milošević
  • Dimitar Marinov
  • Abdullah Gök
  • Goran Nenadić
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11608)


In the past decade, social innovation projects have gained the attention of policy makers, as they address important social issues in an innovative manner. A database of social innovation projects is a valuable source of information that can expand collaboration between social innovators, drive policy, and serve as a resource for research. Such a database needs its projects to be described and summarized. In this paper, we propose and compare several methods (e.g. SVM-based, recurrent neural network based, and ensemble methods) for describing projects based on the text available on project websites. We also address the automated evaluation of summaries and propose a new evaluation metric based on topic modelling.


Keywords: Summarization; Evaluation metrics; Text mining; Natural language processing; Social innovation; SVM; Neural networks



The work presented in this paper is part of the KNOWMAK project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 726992.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of Computer Science, University of Manchester, Manchester, UK
  2. Hunter Centre for Entrepreneurship, Strathclyde Business School, University of Strathclyde, Glasgow, UK
