Bringing Modern Machine Learning into Clinical Practice Through the Use of Intuitive Visualization and Human–Computer Interaction
The increasing trend of systematic collection of medical data (diagnoses, hospital admission emergencies, blood test results, scans, etc.) by healthcare providers offers an unprecedented opportunity for the application of modern data mining, pattern recognition, and machine learning algorithms. The ultimate aim is invariably that of improving outcomes, be it directly or indirectly. Notwithstanding the successes of recent research efforts in this realm, a major obstacle remains largely unaddressed: making the developed models usable by medical professionals rather than only by computer scientists or statisticians. Yet, a growing body of evidence shows that the ability to understand and easily use a novel technology is a major factor governing how widely it is likely to be adopted by its target users (doctors, nurses, and patients, amongst others). In this work we address this technical gap. In particular, we describe a portable, web-based interface that allows healthcare professionals to interact with recently developed machine learning and data-driven prognostic algorithms. Our application interfaces with a statistical disease progression model and displays its predictions in an intuitive and readily understandable manner. Different types of geometric primitives and their visual properties (such as size or colour) are used to represent abstract quantities such as probability density functions, the rate of change of relative probabilities, and a series of other relevant statistics which the healthcare professional can use to explore patients’ risk factors or to provide personalized, evidence-based and data-driven incentivization to the patient.
Keywords: Health care Data Visualization Medicine Patient Interaction
Electronic medical records (EMRs), also referred to as digital medical records or electronic health records, are nowadays a routinely collected data resource in hospitals in economically developed countries. They offer an exciting opportunity for machine learning-based knowledge discovery which could significantly affect healthcare delivery, its quality, and therefore intervention outcomes [1, 2, 3, 4, 5]. Some of the most prominent problems addressed by the existing literature include the discovery of risk factors, the modelling of disease progression patterns, and the development of patient specific prognostics [6, 7, 8, 9]. However, the major challenge of interfacing these technological advancements with medical personnel and patients themselves has attracted much less research attention [10, 11, 12, 13, 14]. Yet, some of the very premises of the work on person specific prognosis include the incentivization of patients. Moreover, the ability to interact with technology in an intuitive manner is a major aspect governing its adoptability in actual healthcare practice [16, 17].
The visualization tools we introduce in this work are built around a recently proposed disease progression model which has demonstrated highly promising results on real-world data [8, 15]. This model, and indeed all models likely to be successful on the task of comorbidity modelling and prediction, is highly technical and in that sense not readily accessible to medical practitioners or patients. A large volume of previous work has shown that this can be a major obstacle in the adoption of technology in the clinical context [10, 16]. Thus, the contribution of this paper is a novel framework which makes a major step towards bridging this gap of outstanding practical significance.
Under the Hood: The Underlying Prediction Model
For completeness herein we present a summary of the key ideas of the adopted method. For in-depth technical details, and the related discussion and results, the reader is referred to the original publications [8, 18, 19, 20].
Sequential, Non-temporal Visualization
Next, observe that there are multiple histories displayed concurrently. The bottommost history, labelled ‘Initial History’, corresponds to the history from which the space of possible diagnostic trajectories is explored. In clinical practice this initial history will usually be the diagnostic record of a patient at admission. Thereafter exploration proceeds by the user selecting a specific diagnosis (by clicking the corresponding circle). This action changes the history denoted ‘Current History’ which corresponds to the current state in the exploratory process and is guided by information in the topmost row. Unlike the three other rows which display the same type of information, namely diagnostic histories, the circles in this row also vary in their size and colour which encode the probability of a specific diagnosis given the current diagnostic history, estimated using the model detailed in the previous section. Thus, the user is informed in the exploratory process and can pursue possible diagnostic futures which are more likely.
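The encoding used in the topmost row can be illustrated with a minimal sketch. It assumes that the model returns, for each candidate diagnosis, a probability conditioned on the current history. The function name `encodeCandidates`, the radius limits, the colour ramp, and the diagnosis codes are hypothetical choices made for illustration, not details of the described application.

```javascript
// Hypothetical helper: turns model probabilities into circle descriptors.
// `probs` maps a diagnosis code to P(diagnosis | current history).
function encodeCandidates(probs, minRadius = 4, maxRadius = 20) {
  return Object.entries(probs)
    .map(([code, p]) => {
      const clamped = Math.min(1, Math.max(0, p));
      // Interpolate the squared radius so that circle area grows linearly
      // with probability (an assumed design choice).
      const radius = Math.sqrt(
        minRadius ** 2 + clamped * (maxRadius ** 2 - minRadius ** 2)
      );
      // Darker fill for more probable diagnoses (lightness 90% down to 40%).
      const lightness = Math.round(90 - 50 * clamped);
      return {
        code,
        probability: clamped,
        radius,
        fill: `hsl(10, 80%, ${lightness}%)`,
      };
    })
    // Most probable diagnoses first.
    .sort((a, b) => b.probability - a.probability);
}

// Illustrative codes and probabilities only.
const circles = encodeCandidates({ E11: 0.25, I10: 1.0, J44: 0.0 });
```

Scaling area rather than radius is one common way of keeping perceived size in proportion to the encoded value; other mappings may have been used in practice.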
Inclusion of Temporal Information
The tool described in the previous section allows for the visualization of sequential information only, without any associated temporal understanding. Yet, in the present context time is critical: it is necessary for stratifying patients into low and high risk categories, and for allocating resources. However, the incorporation of temporal information in an easily understandable manner is challenging. In addition to the most obvious information, which is ‘time until event’ (or rather, the probability density function corresponding to it), the rate of change in instantaneous risk is of importance, and the different temporal characteristics of comorbidities can change their likely sequencing over time. All of these factors add to the complexity and multidimensionality of the information which needs to be displayed. By consulting with a number of relevant healthcare professionals (clinicians, doctors, and nurses) and by adopting an iterative design-test-reassess process, we found that different users found different manners of information presentation most intuitive and easiest to understand. Consequently, we developed a combination of different visualization options which the user can readily switch between.
The first, circle-based visualization approach resembles a so-called blob chart, with equidistant blobs which represent different disease diagnoses being distributed horizontally, as illustrated in Fig. 4. The corresponding diagnoses are labelled using their codes under the adopted coding system (e.g. WHO’s diagnosis-related groups, or the Australian refined diagnosis-related groups). As noted earlier, these are standard codes, used widely and understood by healthcare professionals, and allow for the diagnoses to be shown in a succinct, clutter-free manner. Additional information and a more detailed description of a diagnosis can be obtained by clicking any visualization element associated with the diagnosis (its label or the corresponding blob, in this visualization).
The size of a particular blob encodes the value of the cumulative distribution function (cdf) corresponding to the occurrence of the respective diagnosis by a specific time in the future. This time is specified by the user and allows the user to gain an understanding of the highest risks for the patient within this period. Larger and thus more prominent blobs (and hence the corresponding diagnoses) draw the user’s attention to the most probable complications while at the same time providing a simple way of judging relative risks: several large blobs immediately suggest a cluster of comorbidities, whereas a single dominant blob highlights a specific primary diagnosis of interest.
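As a rough illustration of how a blob size could be derived, the sketch below approximates the cdf at a user-chosen horizon from a discretized probability density and converts it into a radius. The sampling grid, the trapezoidal integration, the function names `cdfAt` and `blobRadius`, and the maximum radius are all assumptions made for this example; the model's actual output format is not specified here.

```javascript
// Assumption: pdf[i] is the density (per day) of the time until diagnosis,
// sampled at day i * stepDays.
function cdfAt(pdf, stepDays, horizonDays) {
  const lastIndex = Math.min(pdf.length - 1, Math.floor(horizonDays / stepDays));
  let area = 0;
  for (let i = 1; i <= lastIndex; i++) {
    // Trapezoidal rule over the interval [(i - 1) * stepDays, i * stepDays].
    area += 0.5 * (pdf[i - 1] + pdf[i]) * stepDays;
  }
  // Guard against numerical overshoot: a probability cannot exceed 1.
  return Math.min(1, area);
}

// Hypothetical size mapping: blob area proportional to the cdf value.
function blobRadius(cdfValue, maxRadius = 30) {
  return maxRadius * Math.sqrt(cdfValue);
}

// Toy example: a uniform density over 100 days (0.01 per day).
const uniformPdf = new Array(101).fill(0.01);
const risk30 = cdfAt(uniformPdf, 1, 30);   // risk within 30 days, about 0.3
const risk100 = cdfAt(uniformPdf, 1, 100); // risk within 100 days, about 1
```

Changing the horizon only changes the upper integration limit, which is why moving the date of interest can update all blobs without re-running the model.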
Bar Chart-Based Visualization
Radial Chart-Based Visualization
Help, Hints, etc.
To facilitate instantaneous help and a ready understanding of different visual elements in our visualizations, when the cursor hovers over any of the relevant geometric entities, all associated information is emphasized. For example, as illustrated in Fig. 7, after hovering over the blob representing the first disease included explicitly in the model, a line connecting the blob with the corresponding value of the probability density function is shown. Further guidance on the features accessible from the main window of the application is also available (see Fig. 8), as is a thorough step-by-step tutorial with comprehensive usage information (see Fig. 9). Furthermore, in addition to the disease code, a full description of the diagnosis is shown both above the cursor and at the bottom of the window. To avoid so-called change blindness, further animations emphasize transitions between history vectors or changes to the date of interest. This feedback uses highlighting and increased contrast against the beige-coloured background of the current visualization after 0.5 s.
Automatic Time Lapse and Long-Term Outcome Simulations
Our application also provides further interactive features, activated using buttons placed below the main visualization space. These buttons resemble the widely known and hence intuitively understandable functions of a media player, such as ‘play’, ‘pause’, ‘stop’. These buttons provide effortless navigation through time via simulations of possible temporal trajectories through the space of possible diagnoses. Temporal transitions predicted by the adopted model are accompanied by the automatic visualization of the corresponding disease progression. The forward and backward buttons allow for manual time jumps. Such time jumps change the date of interest and update the visualization accordingly. The duration of such time jumps (e.g. days, months or years) can be specified in the ‘date selection’ modal window.
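The manual time jumps described above amount to shifting the date of interest by a signed number of days, months or years. A minimal sketch follows; the function name `jumpDate` and the unit strings are hypothetical, and UTC arithmetic is used only to keep the example independent of the local time zone.

```javascript
// Hypothetical helper: moves the date of interest by `amount` units.
// Negative amounts correspond to the 'backward' button.
function jumpDate(date, amount, unit) {
  const next = new Date(date.getTime()); // copy, so the input is not mutated
  if (unit === "days") {
    next.setUTCDate(next.getUTCDate() + amount);
  } else if (unit === "months") {
    // Note: JavaScript rolls over when the target month is shorter
    // (e.g. 31 January plus one month lands in early March).
    next.setUTCMonth(next.getUTCMonth() + amount);
  } else if (unit === "years") {
    next.setUTCFullYear(next.getUTCFullYear() + amount);
  } else {
    throw new Error(`Unsupported unit: ${unit}`);
  }
  return next;
}

const start = new Date(Date.UTC(2019, 0, 15));    // 15 January 2019
const forward = jumpDate(start, 3, "months");     // 15 April 2019
const backward = jumpDate(forward, -10, "days");  // 5 April 2019
```

After each jump the visualization would be redrawn for the new date of interest, for example by re-evaluating the cdf values at the new horizon.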
Clicking the play button opens a modal window where users can also choose the length of time jumps, the real time between predicted transitions (e.g. every 2 s), and whether diagnoses should be added automatically upon exceeding a certain probability of occurrence (i.e. the corresponding cdf value). In the latter case, the play function can add future diagnoses deterministically by using maximum likelihood prediction or non-deterministically by pdf-weighted random sampling (ensuring that more likely diagnostic paths are simulated with correspondingly higher frequency). Once the play function is activated in the modal window, our application repeatedly makes forward temporal jumps (as explained earlier, their duration can be set by the user). If deterministic prediction is selected, diagnoses are added to the visualized medical history if the corresponding cdf exceeds a probability threshold, which can also be specified by the user. Random sampling adds diagnoses using cdf-based weighting, thus allowing the clinician to explore multiple future disease progression patterns with repeated activation of the function. When the play function is running, for clarity the ‘play’ button disappears and is replaced by the ‘pause’ button. Clicking the pause button puts the play function on hold to enable users to explore the currently displayed simulated healthcare record in detail. Clicking the stop button terminates the play function and resets the date of interest to its default value (the present date).
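The two modes of the play function can be contrasted in a short sketch of a single simulation step. It assumes that, at the current date of interest, each candidate diagnosis has an associated risk value that serves both as the threshold criterion and as the sampling weight. The function names, the diagnosis codes, and the choice to add at most one sampled diagnosis per step are hypothetical simplifications rather than details of the described application.

```javascript
// Deterministic mode: add every diagnosis whose risk exceeds the threshold.
function addDeterministic(history, risks, threshold) {
  const added = Object.keys(risks).filter(
    (code) => !history.includes(code) && risks[code] > threshold
  );
  return history.concat(added);
}

// Random mode: draw at most one new diagnosis, with probability proportional
// to its weight. `rand` is injectable so that runs can be reproduced.
function addSampled(history, risks, rand = Math.random) {
  const candidates = Object.keys(risks).filter((code) => !history.includes(code));
  const total = candidates.reduce((sum, code) => sum + risks[code], 0);
  if (total <= 0) return history.slice();
  let u = rand() * total;
  for (const code of candidates) {
    u -= risks[code];
    if (u < 0) return history.concat([code]);
  }
  // Fallback for floating-point rounding at the upper end of the range.
  return history.concat([candidates[candidates.length - 1]]);
}

// Illustrative values only.
const risks = { I10: 0.9, E11: 0.7, J44: 0.2 };
const deterministic = addDeterministic(["I10"], risks, 0.5);
const sampledLow = addSampled(["I10"], risks, () => 0);
const sampledHigh = addSampled(["I10"], risks, () => 0.99);
```

Repeated calls to the sampled variant with the default random source produce different histories, which mirrors how repeated activation of the play function lets the clinician explore alternative progression patterns.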
Note on Implementation
The d3.js-based circles and rectangles used to visualize blobs and bars are nested in a scalable vector graphics (SVG) element. Their radii and lengths are calculated using d3.js scale functions. Heat map colouring uses chroma.js interpolation between four plain colours, with d3.js scaling used to map the pdf rate of change values onto the computed colour palette. To switch from the default blob chart to another visualization format, JQueryUI-based modal functions append HTML code to the interface.
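To make the scaling and colouring steps concrete, the following sketch reimplements their general idea in plain JavaScript. It deliberately does not call d3.js or chroma.js, so `linearScale` and `colourRamp` are stand-ins written for this example rather than those libraries' functions. The four colours and the assumed range of the pdf rate of change are likewise illustrative.

```javascript
// Stand-in for a linear scale: maps [d0, d1] onto [r0, r1], clamping to the domain.
function linearScale([d0, d1], [r0, r1]) {
  return (x) => {
    const t = Math.min(1, Math.max(0, (x - d0) / (d1 - d0)));
    return r0 + t * (r1 - r0);
  };
}

// Piecewise-linear RGB interpolation between equally spaced colour stops.
function colourRamp(stops) {
  return (t) => {
    const pos = Math.min(1, Math.max(0, t)) * (stops.length - 1);
    const i = Math.min(stops.length - 2, Math.floor(pos));
    const f = pos - i; // fractional position between stop i and stop i + 1
    const rgb = stops[i].map((c, k) => Math.round(c + f * (stops[i + 1][k] - c)));
    return `rgb(${rgb.join(", ")})`;
  };
}

// Four plain colours chosen for illustration: blue, green, yellow, red.
const heat = colourRamp([[0, 0, 255], [0, 255, 0], [255, 255, 0], [255, 0, 0]]);
// Hypothetical range of the pdf rate of change, mapped onto [0, 1].
const toUnit = linearScale([-0.02, 0.02], [0, 1]);
const colourFor = (rate) => heat(toUnit(rate));

// The same kind of scale can size a bar or blob from a probability.
const barLength = linearScale([0, 1], [0, 300]);
```

Clamping to the domain means that extreme rates of change saturate at the end colours instead of producing invalid RGB values.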
In this paper we introduced an intuitive visual interface built around a recently proposed computational model of disease progression, aimed at making the model’s predictions accessible to health professionals in their daily work. A range of interactive features allows the user to explore patient specific risk across time. To the best of the authors’ knowledge, this is the first attempt at bridging the gap between increasingly complex machine learning-based algorithms and the realm of healthcare practice. We trust that our contribution will facilitate increased adoption of technology in healthcare delivery, empowering both the medical community and patients in understanding risk and how to address it. Moreover, we hope that our work will inspire future research in this realm.
Compliance with Ethical Standards
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
- 1. Zhou S-M, Fernandez-Gutierrez F, Kennedy J, Cooksey R, Atkinson M, Denaxas S, Siebert S, Dixon WG, O’Neill TW, Choy E, Sudlow C, Brophy S (2016) Defining disease phenotypes in primary care electronic health records by a machine learning approach: a case study in identifying rheumatoid arthritis. PLoS ONE 11(5):e0154515
- 2. Lau EC, Mowat FS, Kelsh MA, Legg JC, Engel-Nitz NM, Watson HN et al (2011) Use of electronic medical records (EMR) for oncology outcomes research: assessing the comparability of EMR information to patient registry and health claims data. Clin Epidemiol 3:259–272
- 4. Li J, Arandjelovic O (2017) Glycaemic index prediction: a pilot study of data linkage challenges and the application of machine learning. In: Proceedings IEEE international conference on biomedical and health informatics, pp 357–360
- 5. Yue X, Dimitriou N, Arandjelovic O (2019) Colorectal cancer outcome prediction from H&E whole slide images using machine learning and automatically inferred phenotype profiles. In: Proceedings international conference on bioinformatics and computational biology
- 8. Vasiljeva I, Arandjelović O (2016) Towards sophisticated learning from EHRs: increasing prediction specificity and accuracy using clinically meaningful risk criteria. In: Proceedings of international conference of the IEEE engineering in medicine and biology society, pp 2452–2455
- 12. Barracliffe L, Arandjelović O, Humphris G (2017) Can machine learning predict healthcare professionals’ responses to patient emotions? In: Proceedings of international conference on bioinformatics and computational biology, pp 101–106
- 13. Osuala R, Arandjelovic O (2017) Visualization of patient specific disease risk. In: Proceedings IEEE international conference on biomedical and health informatics, pp 241–244
- 14. Li J, Arandjelovic O (2017) Intuitive and interpretable visual communication of a complex statistical model of disease progression and risk. In: Proceedings international conference of the IEEE engineering in medicine and biology society, pp 4199–4202
- 15. Arandjelović O (2015) Prediction of health outcomes using big (health) data. In: Proceedings of international conference of the IEEE engineering in medicine and biology society, pp 2543–2546
- 18. Arandjelović O (2015) Discovering hospital admission patterns using models learnt from electronic hospital records. Bioinformatics 31(24):3970–3976
- 19. Vasiljeva I, Arandjelović O (2016) Prediction of future hospital admissions—what is the tradeoff between specificity and accuracy? In: Proceedings of international conference on bioinformatics and computational biology, pp 3–8
- 20. Vasiljeva I, Arandjelović O (2016) Automatic knowledge extraction from EHRs. In: Proceedings of international joint conference on artificial intelligence workshop on knowledge discovery in healthcare data
- 21. World Health Organization (2004) International statistical classification of diseases and related health problems, vol 1. World Health Organization, Geneva
- 22. Kobel C, Thuilliez J, Bellanger M, Pfeiffer K-P (2011) DRG systems and similar patient classification systems in Europe. Diagnosis-Related Groups in Europe: moving towards transparency, efficiency and quality in hospitals, 1st edn. Open University Press and WHO Regional Office for Europe, Buckingham, pp 37–58
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.