Abstract
Advances in artificial intelligence and machine learning have strongly influenced all fields of research in recent years. Applying these methods, however, is often hampered by a lack of access to sufficient amounts of high-quality data, an issue created by decades of manual and inconsistent data management. The problem is especially widespread in Li-ion battery research, a very dynamic field. Because data is still commonly managed separately by individual researchers, standard formats and tools are either missing entirely or specific to a single institute. To resolve this issue, we propose an ontology-based framework that emphasizes strong data harmonization and standardization. We combine these data structures with an intuitive web platform to simplify data management and reduce the time burden on the individual researcher. Our platform-based approach strengthens collaboration between institutions and enables battery researchers to create data sets that are suitable for data science and comparable across projects.
Acknowledgements
The authors gratefully acknowledge the financial support of the German Federal Ministry for Education and Research (BMBF) within the project DigiBatMat (03XP0367D). DigiBatMat is a joint project of the August-Wilhelm Scheer Institute, the Leibniz Institute for New Materials, the Aalen University of Technology and Economics, the KIT Institute for Applied Informatics and Formal Description Methods as well as the Institute for Particle Technology and the Institute for Machine Tools and Manufacturing Technology from TU Braunschweig.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nebel, V., Mutz, M., Heim, Y., Werth, D. (2024). Overcoming the Challenges of Data Harmonization: A Platform Approach from Li-Ion Battery Research. In: Ullah, A., Anwar, S., Calandra, D., Di Fuccio, R. (eds) Proceedings of International Conference on Information Technology and Applications. ICITA 2022. Lecture Notes in Networks and Systems, vol 839. Springer, Singapore. https://doi.org/10.1007/978-981-99-8324-7_5
DOI: https://doi.org/10.1007/978-981-99-8324-7_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8323-0
Online ISBN: 978-981-99-8324-7
eBook Packages: Intelligent Technologies and Robotics (R0)