From Web Crawled Text to Project Descriptions: Automatic Summarizing of Social Innovation Projects

Part of the Lecture Notes in Computer Science book series (LNISA, volume 11608)


In the past decade, social innovation projects have gained the attention of policy makers, as they address important social issues in an innovative manner. A database of social innovation is an important source of information that can expand collaboration between social innovators, drive policy and serve as an important resource for research. Such a database needs to have projects described and summarized. In this paper, we propose and compare several methods (e.g. SVM-based, recurrent neural network-based, and ensemble) for describing projects based on the text that is available on project websites. We also address the automated evaluation of summaries and propose a new metric for it based on topic modelling.


  • Summarization
  • Evaluation metrics
  • Text mining
  • Natural language processing
  • Social innovation
  • SVM
  • Neural networks

  • DOI: 10.1007/978-3-030-23281-8_13
  • Chapter length: 13 pages



The work presented in this paper is part of the KNOWMAK project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 726992.

Author information

Corresponding author

Correspondence to Nikola Milošević.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Milošević, N., Marinov, D., Gök, A., Nenadić, G. (2019). From Web Crawled Text to Project Descriptions: Automatic Summarizing of Social Innovation Projects. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science, vol 11608. Springer, Cham.

  • DOI: 10.1007/978-3-030-23281-8_13

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23280-1

  • Online ISBN: 978-3-030-23281-8

  • eBook Packages: Computer Science, Computer Science (R0)