Abstract
Healthcare organizations continuously strive to improve health outcomes, reduce costs, and enhance the patient experience of care. Data is essential both to measure these improvements in healthcare delivery and to help achieve them. Consequently, an influx of data from various clinical, financial, and operational sources is now overwhelming healthcare organizations and their patients. Using this data effectively, however, is a major challenge. Text is an important medium for making data accessible. Financial reports are produced to assess healthcare organizations on key performance indicators and so steer their healthcare delivery. Similarly, at the clinical level, data on patient status is conveyed through textual descriptions to facilitate patient review, shift handover, and care transitions. Likewise, doctors inform patients about their health status and treatments via text, in the form of reports or through e-health platforms. Unfortunately, producing such text is highly labor-intensive when it is done by healthcare professionals. The resulting text is also prone to incompleteness and subjectivity, and the process is hard to scale up to different domains, wider audiences, and varying communication purposes. Data-to-text is a recent breakthrough technology in artificial intelligence that automatically generates natural language, in the form of text or speech, from data. This chapter provides a survey of data-to-text technology, with a focus on how it can be deployed in a healthcare setting. It will (1) give an up-to-date synthesis of data-to-text approaches, (2) give a categorized overview of use cases in healthcare, (3) seek to make a strong case for evaluating and implementing data-to-text in a healthcare setting, and (4) highlight recent research challenges.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Pauws, S., Gatt, A., Krahmer, E., Reiter, E. (2019). Making Effective Use of Healthcare Data Using Data-to-Text Technology. In: Consoli, S., Reforgiato Recupero, D., Petković, M. (eds) Data Science for Healthcare. Springer, Cham. https://doi.org/10.1007/978-3-030-05249-2_4
DOI: https://doi.org/10.1007/978-3-030-05249-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05248-5
Online ISBN: 978-3-030-05249-2
eBook Packages: Computer Science (R0)