Abstract
We explore the implications of using fuzzy techniques, mainly those commonly used in the linguistic description/summarization of data, from a natural language generation perspective. To this end, we provide an extensive discussion of some general convergence points and explore the relationship between the tasks in the standard natural language generation pipeline architecture and the most common fuzzy approaches used in linguistic summarization/description of data, such as fuzzy quantified statements, evaluation criteria, and aggregation operators. Each discussion is illustrated with a related use case. Recent work on the cross-fertilization of both research fields is also referenced.
Notes
This technique is commonly known in NLG as “corpus” analysis and is performed systematically in the early conception stages of an NLG system [28].
Note that we use natural language expressions to illustrate lexicalization, but this task actually involves creating new sets of structures that define an intermediate syntax between the original content messages and the final text [27].
References
SoftLearn Demo (2015) http://tec.citius.usc.es/SoftLearn/
Castillo-Ortega, R., Chamorro-Martínez, J., Marín, N., Sánchez, D., Soto-Hidalgo, J.M.: Describing images via linguistic features and hierarchical segmentation. In: Fuzzy systems (FUZZ), 2010 IEEE International Conference, pp. 1–8 (2010). doi:10.1109/FUZZY.2010.5584443
van Deemter, K.: Generating referring expressions that involve gradable properties. Comput. Linguist. 32(2), 195–222 (2006). doi:10.1162/coli.2006.32.2.195
van Deemter, K.: Utility and language generation: the case of vagueness. J. Philos. Log. 38(6), 607–632 (2009). doi:10.1007/s10992-009-9114-x
Delgado, M., Ruiz, M.D., Sánchez, D., Vila, M.A.: Fuzzy quantification: a state of the art. Fuzzy Sets Syst. 242, 1–30 (2014). doi:10.1016/j.fss.2013.10.012 (theme: Quantifiers and Logic)
Díaz-Hermida, F., Ramos-Soto, A., Bugarín, A.: On the role of fuzzy quantified statements in linguistic summarization. In: Proceedings of 11th International conference on intelligent systems design and applications (ISDA), pp. 166–171 (2011)
Diaz-Hermida, F., Pereira-Fariña, M., Vidal, J.C., Ramos-Soto, A.: Characterizing quantifier fuzzification mechanisms: a behavioral guide for practical applications. arXiv:1605.03506 (2016)
Eciolaza, L., Pereira-Fariña, M., Trivino, G.: Automatic linguistic reporting in driving simulation environments. Appl. Soft Comput. 13(9), 3956–3967 (2013)
Gatt, A., Belz, A.: Introducing shared tasks to NLG: the TUNA shared task evaluation challenges. In: Krahmer, E., Theune, M. (eds.) Empirical methods in natural language generation, Lecture notes in computer science, vol. 5790, pp. 264–293. Springer, Berlin, Heidelberg (2010). doi:10.1007/978-3-642-15573-4_14
Gatt, A., Portet, F.: Multilingual generation of uncertain temporal expressions from data: a study of a possibilistic formalism and its consistency with human subjective evaluations. Fuzzy Sets Syst. 285, 73–93 (2016). doi:10.1016/j.fss.2015.07.018. http://www.sciencedirect.com/science/article/pii/S0165011415003590 (special Issue on Linguistic Description of Time Series)
Gatt, A., Marín, N., Portet, F., Sánchez, D.: The role of graduality for referring expression generation in visual scenes. In: Information processing and management of uncertainty in knowledge-based systems: 16th International Conference, IPMU 2016, Eindhoven, The Netherlands, June 20–24, 2016, Proceedings, Part I, pp. 191–203. Springer International Publishing (2016). doi:10.1007/978-3-319-40596-4_17
Glöckner, I., Knoll, A.: Application of fuzzy quantifiers in image processing: a case study. In: Proceedings of the third international conference on knowledge-based intelligent information engineering systems, pp. 259–262 (1999)
Hunter, J., Freer, Y., Gatt, A., Reiter, E., Sripada, S., Sykes, C.: Automatic generation of natural language nursing shift summaries in neonatal intensive care: Bt-nurse. Artif. Intell. Med. 56(3), 157–172 (2012). doi:10.1016/j.artmed.2012.09.002
Kacprzyk, J., Zadrozny, S.: Protoforms of linguistic database summaries as a human consistent tool for using natural language in data mining. Int. J. Softw. Sci. Comput. Intell. (IJSSCI) 1(1), 100–111 (2009). doi:10.4018/jssci.2009010107
Kacprzyk, J., Zadrozny, S.: Computing with words is an implementable paradigm: fuzzy queries, linguistic data summaries, and natural-language generation. Fuzzy Syst. IEEE Trans. 18(3), 461–472 (2010). doi:10.1109/TFUZZ.2010.2040480
Marín, N., Sánchez, D.: Fuzzy sets and systems + natural language generation: a step forward in the linguistic description of time series. Fuzzy Sets Syst. 285, 1–5 (2016a). doi:10.1016/j.fss.2015.12.003. http://www.sciencedirect.com/science/article/pii/S0165011415005758 (special Issue on Linguistic Description of Time Series)
Marín, N., Sánchez, D.: On generating linguistic descriptions of time series. Fuzzy Sets Syst. 285, 6–30 (2016b). doi:10.1016/j.fss.2015.04.014 (special Issue on Linguistic Description of Time Series)
de Oliveira, R., Sripada, Y., Reiter, E.: Designing an algorithm for generating named spatial references. In: Proceedings of the 15th European workshop on natural language generation (ENLG), pp. 127–135. Association for Computational Linguistics (2015)
Portet, F., Gatt, A.: Towards a possibility-theoretic approach to uncertainty in medical data interpretation for text generation. In: Riaño, D., ten Teije, A., Miksch, S., Peleg, M. (eds.) Knowledge Representation for Health-Care. Data, processes and guidelines, Lecture notes in computer science, vol. 5943, pp. 155–168. Springer, Berlin, Heidelberg (2010). doi:10.1007/978-3-642-11808-1_13
Portet, F., Reiter, E., Gatt, A., Hunter, J., Sripada, S., Freer, Y., Sykes, C.: Automatic generation of textual summaries from neonatal intensive care data. Artif. Intell. 173(7–8), 789–816 (2009). doi:10.1016/j.artint.2008.12.002
Power, R., Williams, S.: Generating numerical approximations. Comput. Linguist. 38(1), 113–134 (2012). doi:10.1162/COLI_a_00086
Ramos-Soto, A., Bugarín, A., Barro, S., Diaz-Hermida, F.: Automatic linguistic descriptions of meteorological data: a soft computing approach for converting open data to open information. In: 8th Iberian conference on information systems and technologies, pp. 1–6 (2013)
Ramos-Soto, A., Bugarín, A., Barro, S., Taboada, J.: Linguistic descriptions for automatic generation of textual short-term weather forecasts on real prediction data. Fuzzy Syst. IEEE Trans. 23(1), 44–57 (2015a). doi:10.1109/TFUZZ.2014.2328011
Ramos-Soto, A., Pereira-Farina, M., Bugarin, A., Barro, S.: A model based on computational perceptions for the generation of linguistic descriptions of data. In: Fuzzy systems (FUZZ-IEEE), 2015 IEEE international conference on, pp. 1–8 (2015b). doi:10.1109/FUZZ-IEEE.2015.7337923
Ramos-Soto, A., Bugarín, A., Barro, S.: On the role of linguistic descriptions of data in the building of natural language generation systems. Fuzzy Sets Syst. 285, 31–51 (2016). doi:10.1016/j.fss.2015.06.019
Reiter, E.: An architecture for data-to-text systems. In: Busemann, S. (ed.) Proceedings of the 11th European workshop on natural language generation, pp. 97–104 (2007)
Reiter, E., Dale, R.: Building natural language generation systems. Cambridge University Press, Cambridge (2000)
Reiter, E., Sripada, S., Hunter, J., Davy, I.: Choosing words in computer-generated weather forecasts. Artif. Intell. 167, 137–169 (2005)
Rodriguez-Fdez, I., Mucientes, M., Bugarin, A.: Reducing the complexity in genetic learning of accurate regression tsk rule-based systems. In: Fuzzy systems (FUZZ-IEEE), 2015 IEEE international conference, pp. 1–8 (2015)
Sripada, S.G., Burnett, N., Turner, R., Mastin, J., Evans, D.: A case study: NLG meeting weather industry demand for quality and quantity of textual weather forecasts. In: INLG-2014 Proceedings (2014)
Turner, R., Sripada, S., Reiter, E., Davy, I.P.: Selecting the content of textual descriptions of geographically located events in spatio-temporal weather data. Appl. Innov. Intell. Syst. XV, 75–88 (2007)
Turner, R., Sripada, S., Reiter, E., Davy, I.P.: Using spatial reference frames to generate grounded textual summaries of georeferenced data. In: Proceedings of the fifth international natural language generation conference, Association for computational linguistics, Stroudsburg, INLG ’08, pp. 16–24 (2008)
Turner, R., Sripada, S., Reiter, E.: Generating approximate geographic descriptions. In: Krahmer, E., Theune, M. (eds.) Empirical methods in natural language generation, lecture notes in computer science, vol. 5790, pp. 121–140. Springer, Berlin, Heidelberg (2010). doi:10.1007/978-3-642-15573-4_7
Van Deemter, K., Krahmer, E., Theune, M.: Real versus template-based natural language generation: a false opposition? Comput. Linguist. 31(1), 15–24 (2005). doi:10.1162/0891201053630291
Vázquez-Barreiros, B., Lama, M., Mucientes, M., Vidal, J.C.: Softlearn: A process mining platform for the discovery of learning paths. In: 14th international conference on advanced learning technologies (ICALT 2014), pp. 373–375. IEEE Press (2014)
Wilbik, A., Kaymak, U.: Gradual linguistic summaries. In: Information processing and management of uncertainty in knowledge-based systems: 15th International Conference, IPMU 2014, Montpellier, France, July 15–19, 2014, Proceedings, Part II, pp. 405–413. Springer International Publishing (2014). doi:10.1007/978-3-319-08855-6_41
Yager, R.: On ordered weighted averaging aggregation operators in multicriteria decisionmaking. Syst. Man Cybern. IEEE Trans. 18(1), 183–190 (1988). doi:10.1109/21.87068
Yager, R.R.: A new approach to the summarization of data. Inf. Sci. 28(1), 69–86 (1982). doi:10.1016/0020-0255(82)90033-0
Yu, J., Hunter, J., Reiter, E., Sripada, S.: An approach to generating summaries of time series data in the gas turbine domain. In: Proceedings of IEEE international conference on Info-tech & Info-net (ICII2001), Beijing, pp. 44–51 (2001)
Yu, J., Reiter, E., Hunter, J., Sripada, S.: Sumtime-turbine: a knowledge-based system to communicate gas turbine time-series data. In: Chung, P., Hinde, C., Ali, M. (eds.) Developments in applied artificial intelligence, Lecture notes in computer science, vol. 2718, pp. 379–384. Springer, Berlin, Heidelberg (2003). doi:10.1007/3-540-45034-3_38
Zadeh, L.: A computational approach to fuzzy quantifiers in natural languages. Comput. Math. Appl. 9, 149–184 (1983)
Zadeh, L.A.: A prototype-centered approach to adding deduction capability to search engines-the concept of protoform. In: Intelligent systems, 2002. Proceedings. 2002 First International IEEE Symposium, vol. 1, pp. 2–3 (2002). doi:10.1109/IS.2002.1044219
Zimmermann, H.J., Zysno, P.: Latent connectives in human decision making. Fuzzy Sets Syst. 4(1), 37–51 (1980). doi:10.1016/0165-0114(80)90062-7
Additional information
This work was supported by the Spanish Ministry for Economy and Competitiveness (Grant TIN2014-56633-C3-1-R) and by the European Regional Development Fund (ERDF/FEDER) and the Galician Ministry of Education (Grants GRC2014/030 and CN2012/151). Alejandro Ramos Soto (A. Ramos-Soto) is supported by the Spanish Ministry for Economy and Competitiveness (FPI Fellowship Program) under Grant BES-2012-051878.
About this article
Cite this article
Ramos-Soto, A., Bugarín, A. & Barro, S. Fuzzy sets across the natural language generation pipeline. Prog Artif Intell 5, 261–276 (2016). https://doi.org/10.1007/s13748-016-0097-x
DOI: https://doi.org/10.1007/s13748-016-0097-x