Abstract
The Large Hadron Collider (LHC) is about to enter its third run at unprecedented energies. The experiments at the LHC face computational challenges posed by enormous data volumes that must be analysed by thousands of physics users. The ATLAS EventIndex project, currently running in production, builds a complete catalogue of particle collisions, or events, for the ATLAS experiment at the LHC. It exploits the distributed nature of the experiment's data model by running jobs at over one hundred Grid data centres worldwide. Millions of files holding petabytes of data are indexed, extracting a small quantity of metadata per event, which is conveyed in real time by a data collection system to a central Hadoop instance at CERN. After a successful first implementation based on a messaging system, several issues pointed to performance bottlenecks at the higher rates expected in the next runs of the experiment. In this work we characterize the weaknesses of the previous messaging system with regard to complexity, scalability, performance and resource consumption. A new approach based on an object-based storage method was designed and implemented, taking into account the lessons learned and leveraging the ATLAS experience with this kind of system. We present an experiment that we ran for three months in the real worldwide production scenario in order to evaluate the messaging and object store approaches. The results show that the new object-based storage method can efficiently support large-scale data collection in big data environments such as the next runs of the ATLAS experiment at the LHC.
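To illustrate the collection pattern the abstract describes, the sketch below batches small per-event metadata records into a single object keyed by dataset and file, so the central store receives one upload per indexed file rather than one message per event. This is a minimal, purely illustrative sketch: the record fields, key scheme, and in-memory stand-in for the object store are assumptions, not the actual EventIndex implementation.

```python
import json
import hashlib

# Hypothetical in-memory stand-in for an S3-like object store.
object_store = {}

def make_object_key(dataset: str, file_guid: str) -> str:
    # Deterministic key: dataset scope plus a short hash of the file GUID,
    # so re-indexing the same file overwrites its previous object.
    digest = hashlib.sha256(file_guid.encode()).hexdigest()[:12]
    return f"{dataset}/{digest}.json"

def collect_events(dataset: str, file_guid: str, events: list) -> str:
    """Batch small per-event metadata records into one object payload
    and 'PUT' it into the store, amortising per-message overhead."""
    payload = json.dumps([
        {"run": e["run"], "event": e["event"], "trigger": e["trigger"]}
        for e in events
    ]).encode()
    key = make_object_key(dataset, file_guid)
    object_store[key] = payload  # stand-in for an object-store PUT
    return key

# Example: two event records from one (hypothetical) input file.
key = collect_events(
    "data18_13TeV.periodA",
    "guid-0001",
    [{"run": 348885, "event": 1, "trigger": "HLT_mu26"},
     {"run": 348885, "event": 2, "trigger": "HLT_e28"}],
)
```

Grouping records per file keeps the number of store operations proportional to the number of indexed files rather than to the (much larger) number of events, which is the scalability advantage the object-store approach targets.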
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work has been partially supported by MICINN in Spain under grants FPA2016-75141-C2-1-R, PID2019-104301RB-C21 and TIN2015-66972-C5-5-R, which include FEDER funds from the European Union. This work was done as part of the database applications research and development programme of the ATLAS Collaboration, and we thank the collaboration for its support and cooperation.
Funding
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Fernández Casaní, Á., Orduña, J.M., Sánchez, J. et al. A Reliable Large Distributed Object Store Based Platform for Collecting Event Metadata. J Grid Computing 19, 39 (2021). https://doi.org/10.1007/s10723-021-09580-0