Abstract
The Quality Department of the French National Space Agency (CNES, Centre National d’Études Spatiales) wishes to design a writing guide based on the real and regular writing of requirements. As a first step in this project, the present article proposes a linguistic analysis of requirements written in French by CNES engineers. One of our goals is to determine to what extent they conform to several rules laid down in two existing Controlled Natural Languages (CNLs), namely the Simplified Technical English developed by the AeroSpace and Defence Industries Association of Europe and the Guide for Writing Requirements proposed by the International Council on Systems Engineering. Although CNES engineers are not obliged to follow any controlled language when writing requirements, we believe that language regularities are likely to emerge from this task, mainly due to the writers’ experience. We seek to identify these regularities in order to use them as a basis for a new CNL for the writing of requirements. The issue is approached using natural language processing tools to identify sentences that do not comply with the rules or that contain specific linguistic phenomena. We then review these sentences to understand why the recommendations cannot (or should not) always be applied when specifying large-scale projects.
Notes
“Analysis would gain by contrasting two normative processes: norming, which is an intrinsic feature of spontaneous language activity, and normalization, which is a conscious and planned activity.”
“Any body of language that enables mutual understanding possesses its own systemic norms: it is this logic that we will call ‘norming’.”
According to Jakobson (1960), for example, the referential function, which is the one closest to the transmission of information, is only one of the six functions of language.
“The controller shall send the driver's itinary (sic) for the day to the driver” must be preferred to “The controller shall send the driver his itinary (sic) for the day”.
But, once again, these rules can be difficult to understand or apply; one of them, for instance, states that “when you count words for sentence length, each word in a hyphenated group counts as a separate word unless it is a prefix”, which requires a morphological analysis of the words in order to be applied properly.
Compare: “The agent does the operation” (active voice) vs. “The operation is done [by the agent]” (passive voice). In the first case, the sentence would not be grammatical without its subject; in the second case, the agent is optional.
“Si la différence (en valeur absolue) entre les dates de fin de lecture de deux fichiers, lus sur tranche de COME M—canal TMI i et sur tranche de COME N—canal TMI j, est inférieure à OPS_DELAI_INTER_FIN_LEC secondes, alors il est interdit d'enchaîner (lecture enchaînée) par la lecture de la tranche de COME N sur le canal i et de la tranche de COME M sur le canal j.” [“If the difference (in absolute value) between the reading end-dates of two files (read on COME M—channel TMI i and on COME N—channel TMI j) is lower than OPS_DELAI_INTER_FIN_LEC seconds, then it is forbidden to continue (continuous reading) with the reading of COME N on channel i and of COME M on channel j”].
In French, the complementizer “que” is mandatory.
Or, if we really want to use the active voice: “Le nom de cette durée sera DPDV_COEFF” [“The name of this duration will be DPDV_COEFF”].
In sentences with a present-tense verb, however, this is not always clear. In the following example: “L’opérateur […] peut, à tout moment, se connecter/déconnecter du système d’exploitation” [“The operator can, at any time, connect to or disconnect from the operating system”], is this already the case, or is it a condition to be met by the system, and therefore an instruction?
Example: “Le CCC met à disposition du COO:” [“The CCC provides the COO with:”].
For instance, the sentence “The system will provide black and white pictures” is not equivalent to “The system will provide black pictures” + “The system will provide white pictures”.
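The sentence-length rule quoted in the notes above (each word in a hyphenated group counts separately unless it is a prefix) can be illustrated with a minimal sketch. The prefix list and the function name `count_words` are assumptions made for illustration only; they are not taken from the specification or from the tools used in the study, and a real checker would need a proper morphological lexicon.

```python
import re

# Hypothetical, deliberately incomplete prefix list (an assumption for this
# sketch); deciding what counts as a prefix is the hard, morphological part.
PREFIXES = {"anti", "co", "non", "pre", "re", "semi", "sub"}


def count_words(sentence: str) -> int:
    """Count words, splitting hyphenated groups except after a known prefix."""
    total = 0
    # A token is a run of letters, optionally joined to further runs by hyphens.
    for token in re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", sentence):
        parts = token.split("-")
        if len(parts) > 1 and parts[0].lower() in PREFIXES:
            # The prefix does not count as a separate word.
            total += len(parts) - 1
        else:
            total += len(parts)
    return total


if __name__ == "__main__":
    print(count_words("Do a pre-flight check of the high-pressure valve"))
```

Here "pre-flight" counts as one word because "pre" is in the assumed prefix list, whereas "high-pressure" counts as two. The result depends entirely on how complete that list is, which is the difficulty the note points out.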
References
AeroSpace and Defence Industries Association of Europe. (2007). Simplified Technical English. Specification ASD-STE100. International specification for the preparation of maintenance documentation in a controlled language. Issue 4.
Anthony, L. (2005). AntConc: Design and development of a freeware corpus analysis toolkit for the technical writing classroom. In Proceedings of the International Professional Communication Conference (IPCC 2005) (pp. 729–737). doi:10.1109/IPCC.2005.1494244.
Austin, J. L. (1975). How to do things with words. Oxford: Oxford University Press.
Bakhtin, M. (2004). Speech genres and other late essays. Austin: University of Texas Press Slavic Series.
Barcellini, F., Albert, C., Grosse, C., & Saint-Dizier, P. (2012). Risk analysis and prevention: LELIE, a tool dedicated to procedure and requirement authoring. In Calzolari et al. (Eds.), Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC'12) (pp. 698–705). Istanbul: European Language Resources Association (ELRA).
Bhatia, V. K. (1993). Analysing genre: Language use in professional settings. London: Longman.
Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.
Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins Publishing.
Biber, D. (2009). A corpus-driven approach to formulaic language in English: Multi-word patterns in speech and writing. International Journal of Corpus Linguistics, 14(3), 275–311.
Bouquet, S. (2007). Contribution à une linguistique néo-saussurienne des genres de la parole (1): une grammaire du morphème on. Linx. Revue des linguistes de l’université Paris X Nanterre, 56, 143–156. doi:10.4000/linx.376.
Bruce, B., Rubin, A., & Starr, K. S. (1981). Why readability formulas fail. IEEE Transactions on Professional Communication, PC-24(1), 50–52.
Cabré, M. T. (1999). Terminology: Theory, methods, and applications. Amsterdam: John Benjamins Publishing.
Carlson, N., & Laplante, P. (2013). The NASA automated requirements measurement tool: A reconstruction. Innovations in Systems and Software Engineering, 10(2), 77–91.
Chervak, S., Drury, C. G., & Ouellette, J. P. (1996). Field evaluation of Simplified English for aircraft workcards. In Proceedings of the 10th FAA/AAM Meeting on Human Factors in Aviation Maintenance and Inspection.
Clark, P., Murray, W. R., Harrison, P., & Thompson, J. (2010). Naturalness vs. predictability: A key debate in controlled languages. In N. E. Fuchs (Ed.), Controlled natural language (pp. 65–81). Berlin: Springer.
Condamines, A. (1995). Terminology: New needs, new perspectives. Terminology, 2(2), 219–238. doi:10.1075/term.2.2.03con.
Condamines, A. (2010). Variations in terminology: Application to the management of risks related to language use in the workplace. Terminology, 16(1), 30–50. doi:10.1075/term.16.1.02con.
Condamines, A. & Bourigault, D. (1999). Alternance nom/verbe: Explorations en corpus spécialisés. In B. Victorri & J. François (Eds.), Sémantique du lexique verbal, Actes de l’atelier de Caen, 22–23 janvier 1999 (pp. 41–48). Caen: Cahiers de l’Elsap.
DuBay, W. H. (Ed.). (2004). The principles of readability. California: Impact Information.
Firth, J. R. (1957). Papers in linguistics, 1934–1951. Oxford: Oxford University Press.
Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32(3), 221–233.
Fludernik, M. (1991). Shifters and deixis: Some reflections on Jakobson, Jespersen, and reference. Semiotica, 86(3–4), 193–230. doi:10.1515/semi.1991.86.3-4.193.
Freixa, J. (2006). Causes of denominative variation in terminology: A typology proposal. Terminology, 12(1), 51–77. doi:10.1075/term.12.1.04fre.
Gaudin, F. (1993). Pour une socioterminologie: Des problèmes sémantiques aux pratiques institutionnelles. Rouen: Publications de l’Université de Rouen.
Gilliland, J. (1972). Readability. London: University of London Press Ltd.
Guespin, L. (1993). Normaliser ou standardiser? Le Langage et l’homme, 28(4), 213–222.
Hymes, D. (1967). Models of the interaction of language and social setting. Journal of Social Issues, 23(2), 8–28. doi:10.1111/j.1540-4560.1967.tb00572.x.
International Council on Systems Engineering. (2011). Guide for writing requirements. Version 1. INCOSE-TP-2010-006-01, San Diego.
Jakobson, R. (1960). Linguistics and poetics. In T. Sebeok (Ed.), Style in language (pp. 350–353). Cambridge: M.I.T. Press.
Kittredge, R., & Lehrberger, J. (1982). Sublanguage: Studies of language in restricted semantic domains. Berlin: Walter de Gruyter.
Klare, G. R. (1976). A second look at the validity of readability formulas. Journal of Reading Behavior, 8, 129–152.
Kuhn, T. (2014). A survey and classification of controlled natural languages. Computational Linguistics, 40(1), 121–170. doi:10.1162/COLI_a_00168.
Kurzon, D. (1997). “Legal language”: varieties, genres, registers, discourses. International Journal of Applied Linguistics, 7(2), 119–139. doi:10.1111/j.1473-4192.1997.tb00111.x.
Le Querler, N. (2004). Les modalités en français. Revue belge de philologie et d’histoire, 82(3), 643–656. doi:10.3406/rbph.2004.4850.
Lopez, S., Condamines, A., Josselin-Leray, A., O’Donoghue, M., & Salmon, R. (2013). Linguistic analysis of English phraseology and plain language in air-ground communication. Journal of Air Transport Studies, 4(1), 44–60.
Malrieu, D. (2007). Contribution à une linguistique néo-saussurienne des genres de la parole (2): Analyse des valeurs d’indexicalité interlocutoire de on selon les genres textuels. Linx. Revue des linguistes de l’université Paris X Nanterre, 56, 157–178. doi:10.4000/linx.377.
McEnery, T., & Wilson, A. (1996). Corpus linguistics. Edinburgh: Edinburgh University Press.
Meyer, B. (1985). On formalism in specifications. IEEE Software, 2(1), 6–26.
Nishina, Y. (2007). A corpus-driven approach to genre analysis: The reinvestigation of academic, newspaper and literary texts. Empirical Language Research, 1(2), 1–36.
O’Brien, S. (2003). Controlling controlled English: An analysis of several controlled language rule sets. Proceedings of EAMT-CLAW, 3, 105–114.
Pace, G. J., & Rosner, M. (2010). A controlled language for the specification of contracts. In N. E. Fuchs (Ed.), Controlled natural language (pp. 226–245). Berlin: Springer.
Paumier, S. (2016). Unitex 3.1: User manual. Université Paris-Est Marne-la-Vallée. http://igm.univ-mlv.fr/~unitex/UnitexManual3.1.pdf.
Pearson, J. (1998). Terms in context. Amsterdam: John Benjamins Publishing.
Poudat, C. (2003). Characterization of French linguistic research articles using morphosyntactic variables. In K. Fløttum & F. Rastier (Eds.), Academic discourse. Multidisciplinary approaches (pp. 77–95). Oslo: Novus.
Quiniou, S., Cellier, P., Charnois, T., & Legallois, D. (2012). What about sequential data mining techniques to identify linguistic patterns for stylistics? In A. Gelbukh (Ed.), Computational linguistics and intelligent text processing (pp. 166–177). Berlin: Springer.
Shubert, S. K., Spyridakis, J. H., Holmback, H. K., & Coney, M. B. (1995). The comprehensibility of simplified English in procedures. Journal of Technical Writing and Communication, 25(4), 347–369.
Somers, H. (1998). An attempt to use weighted cusums to identify sublanguages. In Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning (pp. 131–139). Stroudsburg, PA: Association for Computational Linguistics.
Stage, L. (2002). Les modalités épistémique et déontique dans les énoncés au futur (simple et composé). Revue Romane, 37(1), 44–66.
Stewart, K. M. (1998). Effect of AECMA simplified English on the comprehension of aircraft maintenance procedures by non-native English speakers. Master’s thesis, University of British Columbia.
Swales, J. (2004). Research genres: Exploration and applications. Cambridge: Cambridge University Press.
Temmerman, R. (2000). Towards new ways of terminology description: The sociocognitive approach. Amsterdam: John Benjamins Publishing.
Temnikova, I. (2012). Text complexity and text simplification in the crisis management domain (PhD thesis). Wolverhampton: University of Wolverhampton.
Temnikova, I., Baumgartner Jr, W. A., Hailu, N. D., Nikolova, I., McEnery, T., Kilgarriff, A., et al. (2014). Sublanguage corpus analysis toolkit: A tool for assessing the representativeness and sublanguage characteristics of corpora. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014).
Tognini-Bonelli, E. (2001). Corpus linguistics at work. Amsterdam: John Benjamins Publishing.
Urieli, A. (2013). Robust French syntax analysis: Reconciling statistical methods and linguistic knowledge in the Talismane toolkit (PhD thesis). Université Toulouse II - Le Mirail.
Warnier, M., & Condamines, A. (2015). A methodology for identifying terms and patterns specific to requirements as a textual genre using automated tools. In Proceedings of the International Conference “Terminology and Artificial Intelligence” (pp. 183–190). 4–6 Nov, Granada, Spain.
Wüster, E. (1968). The machine tool: An interlingual dictionary of basic concepts. London: Technical Press.
Wyner, A., Angelov, K., Barzdins, G., Damljanovic, D., Davis, B., Fuchs, N., et al. (2010). On controlled natural languages: Properties and prospects. In N. E. Fuchs (Ed.), Controlled natural language (pp. 281–289). Berlin: Springer.
Zhang, Q. (1998). Fuzziness–vagueness–generality–ambiguity. Journal of Pragmatics, 29(1), 13–31.
Acknowledgments
This study was carried out as part of a PhD thesis funded by the CNES and the Regional Council Midi-Pyrénées. We would like to thank the CNES for their active cooperation as well as for providing us with the requirements corpus. We are also very grateful to the anonymous reviewers of this special issue and to those of the Fourth Workshop on Controlled Natural Language (CNL 2014) for all their relevant comments, suggestions and references.
Cite this article
Condamines, A., Warnier, M. Towards the creation of a CNL adapted to requirements writing by combining writing recommendations and spontaneous regularities: example in a space project. Lang Resources & Evaluation 51, 221–247 (2017). https://doi.org/10.1007/s10579-016-9368-1
DOI: https://doi.org/10.1007/s10579-016-9368-1