Abstract

This paper proposes a multifaceted approach to the design of an algorithm for the automatic recognition of chemical compounds in Croatian written as multiword expressions. The algorithm, which we have named the Croatian Chemical Compounds Module, consists of three layers: (1) the NooJ dictionary serves as the basis for (2) a morphological grammar, and both (1) and (2) feed (3) a syntactic grammar. The module supports not only single-unit words and homoatomic entities but also variations of chemical names recognized through a variety of suffixes, multiplicative prefixes, hyphens, Roman and Latin numerals, Greek letters, and round, square and curly brackets. Terminological diversity and inconsistency in writing style are also discussed, because they pose a major obstacle to any such endeavor.
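The three-layer idea can be illustrated with a minimal Python sketch. It is not the NooJ module described in the paper: the dictionary entries, the prefix list, the category tags `A` and `N`, and the function names `analyse_word` and `is_compound_name` are all assumptions made for illustration, and the sketch covers only two-word names with an optional Roman numeral in round brackets.

```python
import re

# Layer 1: a toy dictionary (hypothetical entries, not the NooJ dictionary).
# "A" marks a possessive adjective derived from an element name, "N" a noun.
DICTIONARY = {
    "natrijev": "A",
    "ugljikov": "A",
    "željezov": "A",
    "klorid": "N",
    "oksid": "N",
}

# Layer 2: a toy morphological rule, multiplicative prefix + dictionary noun.
# The prefix list is an assumption and deliberately incomplete.
MULTIPLICATIVE_PREFIXES = ("mono", "di", "tri", "tetra", "penta")


def analyse_word(word):
    """Return "A" or "N" for a known or derivable word, otherwise None."""
    word = word.lower()
    if word in DICTIONARY:
        return DICTIONARY[word]
    for prefix in MULTIPLICATIVE_PREFIXES:
        stem = word[len(prefix):]
        if word.startswith(prefix) and DICTIONARY.get(stem) == "N":
            return "N"
    return None


# Layer 3: a toy syntactic rule, adjective (optionally followed by a Roman
# numeral in round brackets) + noun.
TOKEN = re.compile(r"^(?P<word>[^\W\d_]+)(?:\((?P<roman>[IVX]+)\))?$")


def is_compound_name(text):
    """Check whether text matches the adjective + noun pattern."""
    parts = text.split()
    if len(parts) != 2:
        return False
    first = TOKEN.match(parts[0])
    second = TOKEN.match(parts[1])
    if not first or not second or second.group("roman"):
        return False
    return (
        analyse_word(first.group("word")) == "A"
        and analyse_word(second.group("word")) == "N"
    )


if __name__ == "__main__":
    for name in ("natrijev klorid", "ugljikov dioksid", "željezov(III) oksid"):
        print(name, is_compound_name(name))
```

Here the syntactic layer never consults the dictionary directly; it only asks the word-level analysis for a category, which mirrors the layering in which the syntactic grammar builds on the dictionary and the morphological grammar.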

Notes

  1. Generally, suffixation is the most productive derivational process in Croatian. Babić [22] provides a list of 771 suffixes used for the derivation of all parts of speech (526 suffixes for nouns, 160 suffixes for adjectives, 61 suffixes for verbs and 24 suffixes for adverbs). On the other hand, there are only 77 prefixes used in the derivation of all major parts of speech.

References

  1. Silberztein, M.: Formalizing Natural Languages: The NooJ Approach. Wiley-ISTE, London (2016)

  2. Kocijan, K., Kurolt, S., Mijić, L.: Building Croatian medical dictionary from medical corpus. Rasprave Instituta za hrvatski jezik i jezikoslovlje 46(2), 765–782 (2020)

  3. Kocijan, K., Šojat, K., Kurolt, S.: Multiword expressions in the medical domain: who carries the domain-specific meaning. In: Bekavac, B., Kocijan, K., Silberztein, M., Šojat, K. (eds.) Formalising Natural Languages: Applications to Natural Language Processing and Digital Humanities. 14th International Conference, NooJ 2020, Zagreb, Croatia, June 5–7, 2020, Revised Selected Papers, pp. 49–60. Springer, Cham (2021)

  4. Kocijan, K., Šojat, K.: Formalizing the Recognition of Medical Domain Multiword Units. In: Dash, S., Parida, S., Tello, E., Acharya, B., Bojar, O. (eds.) Natural Language Processing in Healthcare: A Special Focus on Low Resource Languages, pp. 89–120. CRC Press, Boca Raton (2022)

  5. Portada, T., Stilinović, V.: Što treba znati o hrvatskoj kemijskoj nomenklaturi? Kem. Ind. 56(4), 209–215 (2007)

  6. Ball, D.W.: Elemental Etymology: What’s in a Name? J. Chem. Educ. 62, 787–788 (1985)

  7. Ringnes, V.: Origin of the names of chemical elements. J. Chem. Educ. 66, 731–738 (1989)

  8. Raos, N., Portada, T., Stilinović, V.: Anionic names of acids: an experiment in chemical nomenclature. Bull. Hist. Chem. 38, 61–66 (2013)

  9. Dijkstra, A.J., Hellwich, K.-H., Hartshorn, R.M., Reedijk, J., Szabo, E.: End-of-Line Hyphenation of Chemical Names (IUPAC Recommendations 2020). Pure Appl. Chem. 93(1), 47–68 (2021)

  10. Raos, N.: Kako definirati organsku kemiju? Kem. Ind. 71(7–8), 507–512 (2022)

  11. Gotkova, T., Chepurnykh, N.: Public perception and usage of the term carbon: linguistic analysis in an environmental social media corpus. Psychol. Lang. Commun. 26(1), 297–312 (2022)

  12. Giomini, C., Cardinali, M.E., Cardellini, L.: Simples and compounds: a proposal. Chem. Int. 27(1), 18 (2005)

  13. Portada, T., Stilinović, V.: Simples and compounds: another opinion. Chem. Int. 27(5), 20 (2005)

  14. Portada, T., Stilinović, V.: Prijedlog pridjevske funkcijsko-razredne nomenklature. Kem. Ind. 58(10), 461–464 (2009)

  15. Portada, T.: Kako na hrvatskom jeziku reći entacapone? Kem. Ind. 61(3), 177–178 (2012)

  16. Ingrosso, F., Polguère, A.: How terms meet in small-world lexical networks: the case of chemistry terminology. In: Poibeau, T., Faber, P. (eds.) Proceedings of the 11th International Conference on Terminology and Artificial Intelligence, pp. 167–171. Granada (2015)

  17. Simeon, V.: Proslov hrvatskomu izdanju. In: Međunarodna unija za čistu i primijenjenu kemiju, Hrvatska nomenklatura anorganske kemije, preporuke HKD 1995, pp. IX–XVI. Školska knjiga, Zagreb (1996)

  18. Strohal, D.: Prijedlog za izmjenu kemijskog nazivlja kiselina. Kemijski vjestnik 15(16), 126 (1941/1942)

  19. Stojanov, T., Lewis, K., Portada, T.: Rad na Struni na primjeru hrvatskoga kemijskog nazivlja. In: Ledinek, N., Žagar Karer, M., Humar, M. (eds.) Terminologija in sodobna terminografija, pp. 181–194 (2009)

  20. Grdinić, V.: Farmaceutski naslovi u Hrvatskoj farmakopeji. Farm. Glas. 63(1), 37–55 (2007)

  21. Lowe, D.M., Corbett, P.T., Murray-Rust, P., Glen, R.C.: Chemical name to structure: OPSIN, an open source solution. J. Chem. Inf. Model. 51(3), 739–753 (2011)

  22. Babić, S.: Tvorba riječi u hrvatskome književnome jeziku. HAZU i Nakladni zavod Globus, Zagreb (2002)

Corresponding author

Correspondence to Kristina Kocijan.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Kocijan, K., Šojat, K., Portada, T. (2024). Deciphering the Nomenclature of Chemical Compounds in NooJ. In: Bartulović, A., Mijić, L., Silberztein, M. (eds) Formalizing Natural Languages: Applications to Natural Language Processing and Digital Humanities. NooJ 2023. Communications in Computer and Information Science, vol 1816. Springer, Cham. https://doi.org/10.1007/978-3-031-56646-2_2

  • DOI: https://doi.org/10.1007/978-3-031-56646-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56645-5

  • Online ISBN: 978-3-031-56646-2

  • eBook Packages: Computer Science (R0)
