Conceptual Model Engineering for Industrial Safety Inspection Based on Spreadsheet Data Analysis

  • Nikita O. Dorodnykh
  • Aleksandr Yu. Yurin
  • Alexey O. Shigarov
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1126)

Abstract

Conceptual models are the foundation of many modern intelligent systems, as well as a theoretical basis for more in-depth scientific research. Such models can be created from various information sources (e.g., databases, spreadsheet data, and text documents) by means of reverse engineering. In this paper, we propose an approach to support conceptual model engineering based on the analysis and transformation of tabular data from CSV files. Industrial safety inspection (ISI) reports are used as examples for spreadsheet data analysis and transformation. The automated conceptual model engineering involves five steps and employs the following software: TabbyXL for extraction of canonical (relational) tables from arbitrary spreadsheet data in the CSV format; Personal Knowledge Base Designer (PKBD) for generation of conceptual model fragments based on the analysis and transformation of canonical tables, and for aggregation of these fragments into a domain model. The approach was verified on a corpus of 216 spreadsheets extracted from six ISI reports. The obtained conceptual models can be used in the design of knowledge bases.
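
The idea of deriving model fragments from canonical tables and then aggregating them can be illustrated with a minimal sketch. It is not the TabbyXL or PKBD implementation and uses neither tool's API. It assumes that each canonical table is available as CSV text with a header row, that every column header maps to one class attribute, and that fragments sharing a class name are merged by taking the union of their attributes. The names ClassFragment, fragment_from_canonical_csv, and aggregate, as well as the sample tables, are hypothetical.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class ClassFragment:
    """Hypothetical conceptual-model fragment: one class with named attributes."""
    name: str
    attributes: set = field(default_factory=set)


def fragment_from_canonical_csv(class_name, csv_text):
    """Assumption: the header row of a canonical table lists the class attributes."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return ClassFragment(class_name, {h.strip() for h in header if h.strip()})


def aggregate(fragments):
    """Merge fragments with the same class name by uniting their attribute sets."""
    model = {}
    for frag in fragments:
        merged = model.setdefault(frag.name, ClassFragment(frag.name))
        merged.attributes |= frag.attributes
    return model


# Two made-up canonical tables describing the same kind of object.
table_a = "element,material,defect\nshell,steel,crack\n"
table_b = "element,wall thickness\nshell,12\n"

model = aggregate([
    fragment_from_canonical_csv("TechnicalObject", table_a),
    fragment_from_canonical_csv("TechnicalObject", table_b),
])
attrs = sorted(model["TechnicalObject"].attributes)
print(attrs)
```

In this sketch the two tables contribute to a single class whose attribute set is the union of both headers. A real pipeline would also need to infer class names and relationships between classes, which are left out here.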

Keywords

Spreadsheet data; Conceptual models; Class diagram; UML; Model transformation; Industrial safety inspection

Notes

Acknowledgments

This work was supported by the Russian Science Foundation, grant number 18-71-10001.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Matrosov Institute for System Dynamics and Control Theory, Siberian Branch of the Russian Academy of Sciences, Irkutsk, Russia
