Enriching Digital Libraries with Crowdsensed Data

Twitter Monitor and the SoBigData Ecosystem
  • Stefano Cresci
  • Salvatore Minutoli
  • Leonardo Nizzoli
  • Serena Tardelli
  • Maurizio Tesconi
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 988)


SoBigData is a Research Infrastructure (RI) aiming to provide an integrated ecosystem for ethic-sensitive scientific discoveries and advanced applications of social data mining. A key milestone of the project is the sharing of data, methods and results, so that scientific work can be reproduced, reviewed and re-used. For this reason, the Digital Library paradigm is implemented within the RI, providing users with virtual environments where datasets, methods and results can be collected, maintained, managed and preserved, with full documentation, access and the possibility of re-use.

In this paper, we describe the results of our effort to integrate the Twitter Monitor, a tool for gathering messages from the Twitter Online Social Network, into the SoBigData RI. The Twitter Monitor provides a simple user interface that enables researchers and stakeholders without programming skills to seamlessly (i) select relevant messages out of the huge Twitter stream by means of language, keyword, user tracking and geographical filters, (ii) store the data in the user's personal Workspace, and (iii) publish them in the SoBigData Resource Catalogue, which implements all the aforementioned Digital Library features.
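As an illustration only, the four kinds of filter named above (language, keyword, user tracking, geographical) can be sketched as a predicate over a stream of messages. This is not the Twitter Monitor's implementation: the `Message` and `FilterSpec` types, their field names, the bounding-box representation of a geographical filter, and the choice to combine the filters conjunctively are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Iterator, Optional, Tuple


# Hypothetical message record; the field names are assumptions for this
# sketch, not the Twitter Monitor's actual data model.
@dataclass(frozen=True)
class Message:
    text: str
    lang: str
    user_id: str
    coordinates: Optional[Tuple[float, float]] = None  # (longitude, latitude)


# Hypothetical filter specification; an empty set or None disables a filter.
@dataclass(frozen=True)
class FilterSpec:
    languages: FrozenSet[str] = frozenset()
    keywords: FrozenSet[str] = frozenset()
    user_ids: FrozenSet[str] = frozenset()
    # Assumed representation of a geographical filter: (west, south, east, north)
    bounding_box: Optional[Tuple[float, float, float, float]] = None


def matches(msg: Message, spec: FilterSpec) -> bool:
    """Return True if msg passes every enabled filter (conjunction is an assumption)."""
    if spec.languages and msg.lang not in spec.languages:
        return False
    if spec.keywords:
        text = msg.text.lower()
        # Case-insensitive substring match on any of the keywords.
        if not any(keyword.lower() in text for keyword in spec.keywords):
            return False
    if spec.user_ids and msg.user_id not in spec.user_ids:
        return False
    if spec.bounding_box is not None:
        if msg.coordinates is None:
            return False  # messages without coordinates cannot pass a geo filter
        lon, lat = msg.coordinates
        west, south, east, north = spec.bounding_box
        if not (west <= lon <= east and south <= lat <= north):
            return False
    return True


def select(stream: Iterable[Message], spec: FilterSpec) -> Iterator[Message]:
    """Lazily yield the messages of the stream that satisfy the filter spec."""
    return (msg for msg in stream if matches(msg, spec))
```

For example, `select(stream, FilterSpec(languages=frozenset({"it"}), keywords=frozenset({"terremoto"})))` would keep only Italian-language messages whose text contains the keyword. Treating unset filters as disabled keeps a single specification object usable for any combination of the four criteria.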

Thanks to its integration in the SoBigData RI, the Twitter Monitor allows researchers and stakeholders from different areas and backgrounds to exploit the crowdsensing paradigm for enriching the SoBigData Digital Library. In this way, crowdsensing acquires the key features of openness, accessibility, interoperability and interdisciplinarity that characterize the Digital Libraries framework.


Keywords: Digital libraries, Resource sharing, Online social networks, Crowdsensing



This research is supported in part by the EU H2020 Program under the schemes INFRAIA-1-2014-2015: Research Infrastructures grant agreement #654024 SoBigData: Social Mining & Big Data Ecosystem.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Institute of Informatics and Telematics, IIT-CNR, Pisa, Italy
  2. Department of Information Engineering, University of Pisa, Pisa, Italy
