
A Web Crawling Environment to Support Financial Strategies and Trend Correlation

– Extended Abstract –
  • Giovanni Ponti (email author)
  • Giuseppe Santomauro
  • Fiorenzo Ambrosino
  • Giovanni Bracco
  • Antonio Colavincenzo
  • Matteo De Rosa
  • Agostino Funel
  • Dante Giammattei
  • Guido Guarnieri
  • Silvio Migliori
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11054)

Abstract

We provide an overview of the development and integration in ENEAGRID of a web crawling tool that retrieves data from the Web, manages and displays it, and extracts relevant information. We collected these instruments in a collaborative environment, the Web Crawling Virtual Laboratory, which offers a GUI for remote operation. Finally, we describe ongoing work on semantic crawling and data analysis aimed at discovering trends and correlations in finance.

Keywords

Web crawling, Big data, Machine learning, Market trends

Notes

Acknowledgements

The computing resources and the related technical support used for this work have been provided by the ENEAGRID/CRESCO High Performance Computing infrastructure and its staff [2]. The ENEAGRID/CRESCO High Performance Computing infrastructure is funded by ENEA, the Italian National Agency for New Technologies, Energy and Sustainable Economic Development, and by Italian and European research programmes; see http://www.cresco.enea.it/english for information.

References

  1. Boldi, P., Marino, A., Santini, M., Vigna, S.: BUbiNG: massive crawling for the masses. CoRR abs/1601.06919 (2016)
  2. Ponti, G., et al.: The role of medium size facilities in the HPC ecosystem: the case of the new CRESCO4 cluster integrated in the ENEAGRID infrastructure, pp. 1030–1033 (2014)
  3. Santomauro, G., et al.: A collaborative environment for web crawling and web data analysis in ENEAGRID. In: DATA 2017, 24–26 July 2017, Madrid, Spain, pp. 287–295 (2017)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Giovanni Ponti (1, email author)
  • Giuseppe Santomauro (1)
  • Fiorenzo Ambrosino (1)
  • Giovanni Bracco (1)
  • Antonio Colavincenzo (2)
  • Matteo De Rosa (1)
  • Agostino Funel (1)
  • Dante Giammattei (1)
  • Guido Guarnieri (1)
  • Silvio Migliori (1)

  1. ENEA - ICT Division - Portici Research Center (NA), Portici, Italy
  2. Accenture Technology Solutions s.r.l. - Assago (MI), Milan, Italy
