Toward a knowledge-to-text controlled natural language of isiZulu

Abstract

IsiZulu belongs to the Nguni group of languages, which also includes isiXhosa, isiNdebele and siSwati. Of the four Nguni languages, isiZulu is the dominant one in South Africa, spoken by 22.7% of the country’s 51.8 million people. However, isiZulu (and even more so the other Nguni languages) remains an under-resourced language for software applications. In this article we focus on controlled natural languages for structured knowledge-to-text generation, viewed in terms of their potential utility for verbalising business rules and OWL ontologies. The grammar of isiZulu (and, by extension, that of all Bantu languages) shows that a template-based approach is infeasible. This is mainly due to the noun class system, agglutination, and verb conjugation with concords for each noun class. We present verbalisation patterns for existential and universal quantification, taxonomic subsumption, axioms with simple properties, and basic cases of negation. Based on a preliminary user assessment of the patterns, selected ones are refined into verbalisation algorithms that generate correct isiZulu sentences; these algorithms have been evaluated.
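
To illustrate why a single fixed template falls short, the following is a minimal sketch and not the authors' algorithm. It assumes a small hand-written lookup table of universal quantifier forms for a few noun classes. The table contents, the function names `verbalise_all` and `verbalise_all_fixed_template`, and the example nouns are all supplied for illustration and are not taken from the article.

```python
# Hypothetical lookup table: the form of "all/every" agrees with the noun class
# of the noun it quantifies. Only a few classes are listed; the entries are
# illustrative assumptions and do not reproduce the article's own tables.
UNIVERSAL_QUANTIFIER = {
    1: "wonke",   # e.g. wonke umuntu
    2: "bonke",   # e.g. bonke abantu
    7: "sonke",   # e.g. sonke isikhathi
    9: "yonke",   # e.g. yonke into
    10: "zonke",  # e.g. zonke izinto
}


def verbalise_all(noun: str, noun_class: int) -> str:
    """Return 'all <noun>' with a quantifier chosen by noun class."""
    if noun_class not in UNIVERSAL_QUANTIFIER:
        raise ValueError(f"no quantifier form recorded for noun class {noun_class}")
    return f"{UNIVERSAL_QUANTIFIER[noun_class]} {noun}"


def verbalise_all_fixed_template(noun: str) -> str:
    """A naive template with one hard-coded quantifier, as used for English 'all'."""
    return f"zonke {noun}"


if __name__ == "__main__":
    print(verbalise_all("abantu", 2))              # class-aware choice
    print(verbalise_all_fixed_template("abantu"))  # ignores the noun class
```

The fixed template always emits the same quantifier, so it agrees with the noun only by coincidence. The class-aware version needs the noun class as an extra input, which is the kind of grammatical information a plain fill-in-the-slot template does not carry.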

Notes

  1. https://nrs.dst.gov.za/nikmas/; last accessed 20 November 2014.

  2. http://www.monnet-project.eu; last accessed June 2014; offline on 24 December 2014.

  3. http://www.molto-project.eu; last accessed 24 December 2014.

  4. http://www.organic-lingua.eu; last accessed 24 December 2014.

  5. The following abbreviations are used: A = aspect; ADV = adverb; APPL = applicative; Ext = extension; FV = final vowel; M = mood; NEG = negative tense; OC = object concord; Rad = radical; SG = singular; SC = subject concord; T = tense; VR = verb root; VS = verb stem.

  6. http://www.w3.org/community/ontolex/wiki/Final_Model_Specification; last accessed 24 December 2014.

  7. http://www.njas.helsinki.fi/salama/index.html; last accessed 8 December 2014.

  8. The prefixes in these classes are identical to canonical prefixes (classes 1a and 5) and are disambiguated semantically. The complication is that their corresponding plurals are found in canonical classes (2a and 6, respectively).

  9. http://en.wiktionary.org/wiki/Appendix:Zulu_nouns; last accessed 17 December 2014.

Acknowledgments

This work is based on the research supported in part by the National Research Foundation of South Africa (CMK: Grant Number 93397).

Author information

Corresponding author

Correspondence to C. Maria Keet.

About this article

Cite this article

Keet, C.M., Khumalo, L. Toward a knowledge-to-text controlled natural language of isiZulu. Lang Resources & Evaluation 51, 131–157 (2017). https://doi.org/10.1007/s10579-016-9340-0

Keywords

  • Bantu languages
  • isiZulu
  • Controlled natural language
  • OWL