The chapter relates to the technical priority Data Analytics of the European Big Data Value Strategic Research and Innovation Agenda [1]. It addresses the horizontal concern Data Analytics of the BDV Technical Reference Model, together with the vertical concerns of Big Data Types and Semantics. The chapter further relates to the Reasoning and Decision Making cross-sectorial technology enablers of the AI, Data and Robotics Strategic Research, Innovation and Deployment Agenda [2].

1 Overview

The chapter begins by discussing intensive care medicine and the types of machines and data that are recorded continuously, thus producing ‘Big Data’. It proceeds to explain some of the challenges that arise when working with such data, including measurement errors and bias. The subsequent section explains some of the common methodologies used to provide predictive analytics, with examples for both acute and chronic clinical conditions, and discusses our own work on the promotion of lung protective ventilation, to highlight the accuracies that can be achieved when pairing health ‘Big Data’ with common machine learning methodologies. The chapter concludes by discussing the future of this field and how we, as a society, can provide value to our healthcare systems by utilizing the routinely collected data at our disposal.

2 Intensive Care Medicine and Physiological Data

Intensive Care Units (ICUs) offer expensive and labor-intensive treatments for the critically ill and are therefore a costly resource for health sectors around the world. They may also be referred to as Intensive Therapy Units (ITUs) or Critical Care Units (CCUs). UK estimates in 2007 put the cost of intensive care to the NHS at £719 million per year [3]. Similarly, American studies reported in 2000 showed that median costs per ICU stay range between $10,916 and $70,501 depending on the length of stay [4]. Length of stay varies with the condition of the patient, one study reporting a mean of 5.04 days [3], and a patient’s condition can change quickly and sometimes unpredictably.

Patients will normally be admitted to intensive care after a serious accident, a serious condition such as a stroke, an infection or for surgical recovery. Throughout their stay in ICU, these patients are monitored closely due to their critical condition, and on average require one nurse for every one or two patients. Many devices and tests may be used to ensure the correct level of care is provided. Patients will be placed on machines to monitor their condition, support organ function and allow for the detection of any improvements or deterioration. The functions of these machines can vary: from the monitoring of vital parameters such as heart rate via the patient monitor to the use of mechanical ventilators that provide respiratory function when a patient is not able to do so themselves.

The workload on clinical staff in the ICU is intense, so utilizing Big Data analytics will allow healthcare providers to improve their efficiency through better management of resources, detection of decompensation and adverse events, and treatment optimization, among many other benefits to both patient outcomes and hospital costs [5].

2.1 Physiological Data Acquisition

Health data recorded in ICU can be classified as ‘Big Data’ due to the volume, complexity, diversity and timeliness of the parameters [6], the aim being to turn these ‘Big Data’ records into a valuable asset in the healthcare industry.

As highlighted, patients requiring critical care are placed on numerous monitors and machines to help with and provide their care. Equipment can include, but is not limited to [7]:

  • Patient monitoring systems to monitor clinical parameters such as the electrocardiogram (ECG), peripheral oxygen saturation, blood pressure, and temperature.

  • Organ support systems such as mechanical ventilators, and extracorporeal organ support such as continuous renal replacement therapy.

  • Syringe pumps for the delivery of medicines.

These machines all monitor continuously (24×7), thus representing one example of the emerging field of Big Data. It is important to wean patients off these machines as quickly as possible to avoid dependency and to lower the risk of infection. In addition to the data from organ support machines, the electronic health record includes laboratory data, imaging reports such as X-rays and CT scans, and daily review records.

2.1.1 Time Series Data

The human physiologic state is a time-dependent picture of the functions and mechanisms that define life and is amenable to mathematical modeling and data analysis. Physiology involves processes operating across a range of time scales resulting in different granularities. Parameters such as heart rate and brain activity are monitored on the millisecond level while others such as breathing have time windows over minutes and blood glucose regulation over hours. Analysis of instantaneous values in these circumstances is rarely of value. On the other hand, application of analytical techniques for time series offers the opportunity to investigate both the trends in individual physiological variables and the temporal correlation between them, thereby enabling the possibility to make predictions.

Features can also be extracted from these time series, using packages such as the Python tsfresh software toolkit, for use as input to Machine Learning models and to gain further insight into the relationship between a parameter and time [8]. By analyzing the time series we can predict future trajectories and alert caregivers to possible issues in order to prevent complications.
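As a minimal illustration of this workflow, the sketch below extracts a feature matrix with tsfresh from a long-format table of readings; the column names and toy values are ours, not from any real dataset.

```python
import pandas as pd
from tsfresh import extract_features

# Toy long-format table of minutely readings; "patient_id", "time", and
# "heart_rate" are illustrative names, not from a real dataset.
df = pd.DataFrame({
    "patient_id": [1] * 5 + [2] * 5,
    "time": list(range(5)) * 2,
    "heart_rate": [80, 82, 81, 85, 90, 70, 72, 71, 69, 74],
})

# One row of candidate features (trends, entropy, wavelet coefficients, ...)
# per patient, suitable as input to a Machine Learning model.
features = extract_features(df, column_id="patient_id", column_sort="time")
```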

Prior research has shown that these streams of data have very useful information buried in them, yet in most medical institutions today, the vast majority of the data collected is either dropped and lost forever, or is stored in digital archives and seldom reexamined [9].

In recent years there has been rapid adoption of Electronic Health Records, or EHRs, around the world. EHRs are common practice for recording and storing real-time patient health data, enabling authorized users to track a patient from initial admission to the hospital, through deterioration or improvement and diagnoses, together with all physiological parameters monitored and drugs given, across healthcare systems.

2.1.2 Publicly Available Datasets

As part of a global effort to improve healthcare, numerous institutions have put together publicly available data sets based on their EHRs for use in research, enabling visualization, analysis, and model development.

One of the best-known and most commonly used publicly available databases is MIMIC, the Multiparameter Intelligent Monitoring in Intensive Care database. Produced from the critical care units of the Beth Israel Deaconess Medical Center, a teaching hospital of Harvard Medical School in Boston, this large, freely available database consists of de-identified health-related data associated with over 40,000 patients who were in critical care between 2001 and 2012 [10]. After being granted access, users are provided with vital signs, medications, laboratory measurements, observations, and charted notes. Waveforms are also available for use, along with patient demographics and diagnoses.

PhysioNet offers access to a large collection of health data and related open-source software, including the MIMIC databases [11]. All data recorded in these publicly available databases are de-identified, ensuring no patient can be identified from their data.

3 Artificial Intelligence in ICU

The broad field of Data Science has emerged within the discipline of Computer Science over approximately the past 10 years. Its roots arguably lie in the work of Fayyad et al. [12], which defined a pipelined process for the extraction of abstract models from data using an amalgam of techniques, including Statistics and Machine Learning. Artificial Intelligence (AI) is the ability of a computer system to perform tasks that aim to reach the level of intelligent thinking of the human brain by applying machine learning methodologies; further details are discussed in the methodology section below.

3.1 Challenges

As in the majority of real-world applications, a series of challenges arises when applying Big Data-driven analytics to such sensitive and intensive data. We must aim to build trust and quality between decision support tools and their end users. It is always important to question whether there is real clinical impact in carrying out such work and exploring AI methodologies for solving particular problems.

3.1.1 Data Integrity

Before building any Big Data-driven analytic system, it is important to ensure data has been collected and preprocessed correctly, with any errors handled accurately. It is crucial to ensure data harmonization, the process of bringing together all data formats from the different machines and tests into one database. Without this step, the algorithms will produce unreliable results, which in turn could result in a critically dangerous implementation and a lack of trust in the system. Errors in the data can be due to a variety of reasons:

  • Data input errors: In intensive care, caregivers often have to input readings into the system. Fields have a required unit in which the data should be entered, and these may sometimes be ignored or mishandled. For example, height should be entered in centimeters, but a user may enter the reading in meters, e.g., 1.7 instead of 170.

  • Sensor errors: With the complexity and multitude of monitors, leads, and machines that a patient can be attached to at any given time, sensors can sometimes fail or miss a reading. Sensors may be disconnected for a period of time to deliver medicines or to perform imaging tests, and patients’ movements in bed can cause a sensor error or an unexpected result. These errors present as outliers in the data and should be dealt with accordingly so as not to skew any predictive modeling.

  • Bias in the data: AI methodologies are only as good as the data on which they are trained. If these data contain, e.g., racial or gender biases, the algorithm will learn from them and reproduce similar results, such as women not being given the same number of tests as men. Similarly, statistical biases can be present due to small sample numbers from underrepresented groups; for instance, an algorithm trained only on White patients may not make the same diagnosis when presented with Hispanic patients [13]. Furthermore, selection bias exists when the selection of study participants is not a true representation of the target population, reflecting both cultural and social differences.

  • Bias in the study: Studies can also contain information bias. These include measurement errors for continuous variables and misclassification for categorical variables. For example, a tumor stage being misdiagnosed would lead to algorithms being trained on incorrect data [14].

    We must implement appropriate techniques such as imputation, smoothing, and oversampling to prevent such errors propagating into our models and to build trust with the user.
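To make these remedies concrete, the following minimal sketch, assuming pandas and purely illustrative thresholds, shows one way a unit error, a sensor gap, and an outlier might be handled:

```python
import numpy as np
import pandas as pd

# Illustrative height readings (cm) containing a unit error, a missing
# value from a sensor gap, and an implausible outlier.
obs = pd.Series([170.0, 1.7, np.nan, 172.0, 400.0, 171.0, 169.0],
                name="height_cm")

# Unit error: a "height" below 3 was almost certainly entered in meters.
obs = obs.where(obs >= 3, obs * 100)

# Sensor gap: impute the missing reading by carrying the last value forward.
obs = obs.ffill()

# Outlier: flag values far from a rolling median for manual review.
median = obs.rolling(3, center=True, min_periods=1).median()
outliers = (obs - median).abs() > 50
print(obs[outliers])  # -> the 400.0 reading
```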

3.1.2 Alert Fatigue

When building decision support tools, it is crucial to ensure the integrity of the alerts raised. Too many can result in alert fatigue, leading to alarms being switched off or ignored. Research has shown that there are on average 10.6 alarms per hour, too many for a single person to handle amidst an already busy work schedule [15].

In addition, 16% of health care professionals will switch off an alarm and 40.4% will ignore or silence the alarm [16].

This influx of alarms resulting in the monitoring system being turned off will lead to vital physiological problems being missed. We need to be confident that our systems are only producing true alerts and that false alerts generated are minimized.

It has further been highlighted that reducing the number of alerts repeated per patient will help reduce override rates and fatigue [17].

3.1.3 Bringing AI Systems to Clinical Trials

While we know these AI models, derived from Big Data-driven predictive analytics, can provide value to the healthcare industry, we must be able to test them in real time in order for them to be implemented and used throughout the world. This requires a clinical trial in which the system is used on patients in real time. While the SPIRIT 2013 and CONSORT 2010 checklists exist for clinical trials, covering what a trial intends to do and what it actually did, respectively, neither includes steps for AI implementation [18].

With the rise of AI, these guidelines were extended in 2020 (as SPIRIT-AI and CONSORT-AI) to include steps for reporting data quality, errors, clinical context, intended use, and any human interactions involved. These additional steps allow reviewers to better evaluate and compare the quality of research, and thus of the systems created in the future [19, 20]. Other authors are further extending these checklists to include reporting of diagnostic accuracies and prediction models [21, 22].

Researchers have analyzed the work published to date in this area [23]: 93% of papers explore model development to demonstrate the potential of such systems, and only 5% validate these models with data from other centers. A further 1% of published work reports models that have actually been implemented and tested in real-time hospital systems, and the final 1% have integrated their models into routine clinical practice and shown them to work. This summary highlights the large potential for the future of AI in healthcare systems, with huge opportunities for bringing accurate models forward to real-world clinical trials.

3.2 AI Methodology

Machine Learning methodologies are commonly used to enable machines to become ‘intelligent’ and can be classified as supervised, predicting from a labeled dataset, i.e., one with a known output or target, or unsupervised, i.e., finding patterns or groupings in unlabeled data [24]. Common methodologies are utilized across the board for predictive and analytical purposes. This chapter focuses on commonly used supervised learning techniques; however, unsupervised methods can also be used to understand data. We can categorize supervised learning techniques into regression and classification models. Regression techniques aim to find relationships between one dependent variable and a series of independent variables; a time series, as previously discussed, is a common example in physiological data. Classification techniques, on the other hand, attempt to label outcomes and draw conclusions from observed values, e.g., whether a patient has a disease or not.
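The toy sketch below, assuming scikit-learn and invented values, contrasts the two supervised families: a regressor predicts a continuous quantity, a classifier a discrete label.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[55], [60], [72], [85], [90], [110]])  # e.g., heart rate
y_reg = np.array([4.1, 4.3, 5.0, 5.8, 6.1, 7.2])     # continuous target
y_clf = np.array([0, 0, 0, 1, 1, 1])                 # binary label

print(LinearRegression().fit(X, y_reg).predict([[100]]))    # a value
print(LogisticRegression().fit(X, y_clf).predict([[100]]))  # a class
```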

The question of whether the numerical models that are generated can actually be understood by humans has become a hot research topic. In the United States, DARPA has conducted significant investigative research in this field [25]. Other research teams have begun to define the concept of explainable AI with respect to several problem domains. They identify three classifications: opaque systems that offer no insight; interpretable systems where mathematical analysis of the algorithm is viable; and comprehensible systems that emit symbols enabling user-driven explanations of how a conclusion is reached [26].

We know from previous research that utilizing Big Data within the ICU can lead to many benefits for both patient and hospital. We can not only greatly improve the care given and thus patient outcomes (see Table 1) but also reduce hospital costs [5] and the stress levels of caregivers [27]. McGregor and colleagues have demonstrated the viability of monitoring physiological parameters to detect sleep apnea in the neonatal ICU, leading to the software architecture of the IBM InfoSphere product, which has since been extended into a Cloud environment (the Artemis project), making the functionality available at remote hospital sites [28, 29].

Table 1 Highlighting applications of AI in healthcare

Patient deterioration can often be missed due to the multitude of physiological parameters and the complicated relationships between them. AI-driven multivariate analysis has the potential to reduce the workload of ICU staff. Multiple studies have shown AI to be comparable to routine clinical decision making, including in ECG analysis, delirium detection, sedation, and identification of septic patients [28, 37–39].

AI-driven predictive analytics within healthcare most commonly uses supervised learning approaches, whereby we aim to base algorithms and decisions on previous examples and train our models on accurately labeled outcomes; for time series data in particular, regression analysis is used.

Below we review the more widely adopted Machine Learning methodologies [31]; examples of their use in previous research are highlighted in Table 1.

3.2.1 Expert Systems

In a rule-based expert system, also known as a knowledge-based clinical decision support system, knowledge is represented by a series of IF-THEN rules created through knowledge acquisition, which involves observing and interviewing human experts and finalizing rules in the format ‘IF X happens THEN do Y’. These rules cannot be changed, learnt from, or adapted to different environments, meaning human experts must manually monitor and modify the knowledge base through careful management of the rules. Expert systems allow us to view and understand each of the rules applied by the system.

Such systems can take over some of the mundane decision tasks and deliberations of health care professionals, saving vital time and money. An example expert system was created for diabetes patients to provide decision support for insulin adjustment, based on simple rules depending on the regimen each patient was placed on [32].
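A minimal sketch of what such a rule base can look like is shown below; the rules and thresholds are invented for illustration only and are neither clinical guidance nor those of the cited system [32].

```python
# Purely illustrative IF-THEN rules in the spirit of a knowledge-based
# decision support system. Thresholds are invented, not clinical advice.
def insulin_advice(blood_glucose_mmol_l: float) -> str:
    # IF glucose is low THEN reduce dose; IF high THEN increase; ELSE hold.
    if blood_glucose_mmol_l < 4.0:
        return "Reduce insulin dose and review"
    if blood_glucose_mmol_l > 10.0:
        return "Increase insulin dose per regimen"
    return "No adjustment required"

print(insulin_advice(11.2))  # -> "Increase insulin dose per regimen"
```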

3.2.2 Decision Trees

A decision tree can be used to visually represent a series of decisions used to reach a certain conclusion, which is useful when exploring the medical reasoning behind decisions made. The tree starts at the root, asking a specific question that splits the data by a given condition. The tree then branches at each internal node, with nodes representing tests on the input variables, continually splitting by conditions until a final decision is reached at a leaf node, the output variable.

Decision trees are also referred to as Classification and Regression Trees (CART) as they can solve both classification and regression problems. A classification tree arrives at a discrete class at its leaf nodes, e.g., whether a patient survives or not, whereas a regression tree predicts a continuous value, e.g., the heart rate of a patient.

The tree reveals not only the conditions used to split the data at each decision but also which features are used and which are most significant in splitting the data; the most significant features appear as top-level nodes. Researchers have utilized simple decision tree models for the classification of patients with diabetes, among other disease states. Features included age, BMI, and both systolic and diastolic blood pressure of the patient, arriving at a decision on whether the patient has diabetes or is healthy. Figure 1 shows how the features are split and decisions are made in this case [33].

Fig. 1 Decision tree for the classification of diabetes [20]

When building decision tree models it is important to monitor the maximum depth of the tree to avoid overfitting and lengthy training times.
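The following minimal sketch, assuming scikit-learn and synthetic data labeled by an arbitrary rule, shows how `max_depth` caps the depth of the tree; the feature names mirror the diabetes example above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.integers(20, 80, 300),   # age
    rng.uniform(18, 40, 300),    # BMI
    rng.integers(90, 180, 300),  # systolic blood pressure
    rng.integers(60, 110, 300),  # diastolic blood pressure
])
y = (X[:, 1] > 30).astype(int)   # toy label: "diabetic" by BMI alone

# max_depth=3 limits tree depth, guarding against overfitting.
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(clf, feature_names=["age", "bmi", "sys_bp", "dia_bp"]))
```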

3.2.3 Ensemble Methods

To achieve the greatest predictive performance when working with complex problems, ensembles can be created. The decision trees are combined in different ways depending on the methodology used. The methods can be categorized as bagging or boosting (Fig. 2).

Fig. 2 Decision trees and ensemble methods

Boosting methods build the trees sequentially: for each predicted value, multiple models or decision trees are made using different features and combinations of features, and weights are then given to these models based on their error so that the final prediction is as accurate as possible. AdaBoost is an example of this method, in which very short trees are produced and higher weights are given to more difficult decisions. GradientBoosting further combines boosting with gradient descent, allowing loss functions to be optimized.

In contrast, in bagging methods each model is given equal weight; the models are built in parallel and all of their predictions are averaged to obtain the final, most accurate decision. Examples include Bagging and ExtraTrees. Bagging is extended by the RandomForest algorithm, which additionally selects the features used at random; its decision trees are built with as many layers as possible.
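To make the bagging mechanism concrete, the sketch below implements it by hand on synthetic data, assuming scikit-learn decision trees: ten deep trees are fitted to bootstrap samples and their predictions averaged with equal weight.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (200, 3))
y = X[:, 0] + np.sin(X[:, 1]) + rng.normal(0, 0.3, 200)

trees = []
for _ in range(10):                          # 10 trees, grown deep
    idx = rng.integers(0, len(X), len(X))    # bootstrap sample with replacement
    trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

x_new = np.array([[5.0, 1.0, 2.0]])
prediction = np.mean([t.predict(x_new) for t in trees])  # equal-weight average
```

RandomForest adds random feature selection at each split on top of this scheme, which is why it often generalizes better than plain bagging.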

3.2.4 Neural Networks

A neural network saves a great deal of time when working with large amounts of data by combining variables, determining which are important, and finding patterns that humans might never see. The neural network is represented as a set of interconnected nodes, or artificial neurons, joined by weighted connections.

Each node feeds the weighted sum of its input values through an activation function, which transforms the value before returning an output. These activation functions improve the way the neural network learns and allow more flexibility in modeling complex relationships between input and output (Fig. 3).

Fig. 3 Neural network structure

The neural network can be described as ‘Deep Learning’ when it has multiple hidden layers between the input and output layers. The network learns through backpropagation: it determines what it got wrong and works backwards to establish which values and connections made the prediction incorrect.

Additionally, there are different types of neural networks, and the list continues to expand as researchers propose new types. There are feedforward neural networks, where all the nodes only feed into the next layer from initial inputs to the output. Recurrent neural networks (RNN) make it possible to feed the output of a node back into the model as an input the next time you run it. Nodes in one layer can be connected to each other and even themselves. Furthermore, they work well with sequential data as they remember previous outputs. The long short-term memory form of the RNN enables the model to store information over longer periods of time, ideal for modeling time series data. Additionally, there are convolutional neural networks (CNN) which look at windows of variables rather than one at a time. Convolution applies a filter, or transformation, to the windows to create features. When working with large databases, pooling is a technique that takes a huge number of variables to create a smaller number of features. The neural network will then use the features generated by convolution and pooling to give its output.

The key parameters to distinguish neural networks are the number of layers and the shape of the input and the output layers.

Researchers have utilized many different formats of neural networks for prediction problems. A medical decision support tool for patient extubation was developed using a multilayer perceptron artificial neural network, a class of feedforward neural networks. The input layer consisted of 8 input parameters feeding 17 perceptrons, with 2 perceptrons in the output layer for the prediction output. A perceptron can be described as a unit making a classifying decision. The authors explored the change in performance based on the number of perceptrons in the hidden layer, with 19 producing highly accurate results [34]. Other studies have shown that RNNs produce accurate results for the prediction of kidney failure [35], and CNNs have shown promise in the prediction of atrial fibrillation from ECG analysis [36].
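A minimal sketch of the extubation network described above, assuming a Keras implementation, is shown below; the 8 inputs, 17 hidden perceptrons, and 2 outputs follow the study's description, while the activations, optimizer, and loss are our assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(8,)),                # 8 physiological input parameters
    layers.Dense(17, activation="relu"),    # hidden layer of 17 perceptrons
    layers.Dense(2, activation="softmax"),  # 2 output perceptrons
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```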

However, it is difficult to explain predictions from neural networks to healthcare professionals, since an individual weight cannot easily be understood as a discrete piece of knowledge, although work is being done in this area [37].

With common limitations recurring, such as poor generalization, limited dataset size, and noisy, unbalanced data, we highlight the importance of continuing research and of building larger datasets across multiple centers, together with exploring methodologies to smooth noisy data, in order to advance the production of accurate systems that can be implemented in real-world healthcare settings.

Imaging and waveforms are a further, huge division of physiological monitoring. Machines such as X-rays and CTs can produce images of internal organs to provide a greater insight into patient state. In addition, ECG waveforms and brain waves can be analyzed for diagnoses. Thus, signal processing is an integral part of understanding physiological data and getting the full picture of patient state.

Furthermore, Machine Learning methodologies can be used for natural language processing to analyze lab notes and patient charts in order to summarize and detect common words and phrases used by care givers, in turn automating the process of flipping through hundreds of pages of records and saving vital time.

4 Use Case: Prediction of Tidal Volume to Promote Lung Protective Ventilation

Around 44.5% of patients in ICU are placed on mechanical ventilation to support respiratory function at any given hour [52]. However, the delivery of high tidal volumes can often lead to lung injury. Tidal volume is the amount of air delivered to the lungs with each breath; it is common knowledge amongst critical care providers that tidal volume should be no greater than 8 ml per kg of ideal body weight. In our recent work we explored ensemble regressors and the long short-term memory (LSTM) form of neural networks to predict tidal volume values and so aid lung protective ventilation [53].

Data acquisition took place at the Regional Intensive Care Unit (RICU), Royal Victoria Hospital, Belfast, over a 3-year period and the VILIAlert system was introduced [54]. The data streams were monitored against the thresholds for lung protective ventilation and if thresholds were breached continuously, an alert was raised. We then turned our attention to predicting these alerts with the aim of preventing violations and protecting the patient’s lungs. The VILIAlert system ran for nearly 3 years, recording minutely tidal volume values for almost a thousand patients.

As discussed, noisy signals are common in ICU data. Time series often need to be filtered to remove noise and produce smooth signals. Methods such as moving averages and exponential smoothing can be applied to the data to extract the true signal, as in the work we carried out to extract true tidal volume trends [53]. Figure 4 shows how smoothing the time series (shown in blue) removes anomalies in the data, the large jumps, and extracts the true patient trend (shown in red). This work is related to the efforts of the international project known as the Virtual Physiological Human (http://www.vph-institute.org), which seeks to use individualized physiology-based computer simulations in all aspects of prevention, diagnosis, and treatment of diseases [55]. This computational approach has provided immediate insight into the COVID-19 pandemic [56].

Fig. 4 Time series of a patient’s tidal volume profile. Raw minutely values in blue, 15-min averaged bins in red
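A minimal sketch of the binning and smoothing behind Fig. 4, assuming pandas and a synthetic stand-in for the minutely signal, might look as follows:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a minutely tidal-volume signal (ml/kg).
rng = np.random.default_rng(0)
idx = pd.date_range("2021-01-01", periods=6 * 60, freq="min")
raw = pd.Series(
    7 + 0.5 * np.sin(np.arange(len(idx)) / 90) + rng.normal(0, 0.8, len(idx)),
    index=idx,
)

binned = raw.resample("15min").mean()  # 15-min averaged bins (red in Fig. 4)
smoothed = raw.ewm(span=15).mean()     # exponential smoothing alternative
```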

We compare multiple regressor ensemble methods for initially predicting 15 min ahead. For each patient, we use the tsfresh toolkit to extract features to use as input into the regressor models in order to predict one time bin ahead and report the RMSE between the true observed values and the predicted values from our models. Table 2 reports RMSE calculated for 8 of the patients using each of the ensemble methods. In all models the maximum number of trees is set to 10. We can compare the depth of the bagging method trees: RandomForest being 32 ± 5, ExtraTrees being 37 ± 4, and Bagging being 33 ± 5. In contrast, the boosting methods: AdaBoost and GradientBoosting, set trees of depth four by default. As expected, increasing the number of trees decreases the RMSE.
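As a hedged sketch of this comparison, assuming scikit-learn and placeholder feature/target arrays standing in for the tsfresh features used in our study, the loop below fits each ensemble with 10 trees and reports its RMSE:

```python
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                              ExtraTreesRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Placeholder data: rows are windows of extracted features, y is the
# tidal volume of the next 15-min bin.
X, y = np.random.rand(500, 20), np.random.rand(500)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False)

models = {
    "RandomForest": RandomForestRegressor(n_estimators=10),
    "ExtraTrees": ExtraTreesRegressor(n_estimators=10),
    "Bagging": BaggingRegressor(n_estimators=10),
    "AdaBoost": AdaBoostRegressor(n_estimators=10),
    "GradientBoosting": GradientBoostingRegressor(n_estimators=10),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    print(f"{name}: RMSE = {rmse:.3f}")
```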

Table 2 Comparison of regressor ensemble models performance for the prediction of patient’s tidal volume one time step ahead

It is important to take computational time into consideration when choosing algorithms to make predictions in real time. From our experiments we found AdaBoost to give the best trade-off between RMSE and computation time.

One might expect that predicting further ahead in time would lead to larger RMSE values; however, the change in RMSE is small. We therefore explore using AdaBoost regression for the prediction of tidal volume values up to 1 h ahead, finding very little increase in RMSE values across patients.

As described in this chapter, a benefit of using ensemble models built from decision trees is that we can visualize the features and conditions used to make decisions. Figure 5 shows one of the decision trees created when using the AdaBoost method.

Fig. 5 One of the ten trees created by the AdaBoost method for patient 1 predictions

The features were extracted by tsfresh as the most significant for this problem, and it is interesting to discuss what they can mean for our problem domain. The Ricker wavelet is used to describe properties of viscoelastic homogeneous media, and the Friedrich coefficients describe the random movement of a particle in a fluid. We can hypothesize from this finding that the amount of fluid in the lungs is an impacting factor in how a patient’s tidal volume changes over time.

A comparison is then made by applying long short-term memory neural networks to the same problem: predicting tidal volume 1 h ahead. Two models are created: ModelA has one hidden layer and ModelB has three hidden layers, with a 20% dropout layer between the second and third layers to avoid overfitting. In contrast to the regressor models, which work with features extracted from the time series, the LSTM models use the time series values directly, requiring 70% of the time series to train. Each layer in our LSTM models has 50 nodes, and both models use 20 input points to predict 4 points ahead.
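The sketch below, assuming a Keras implementation, mirrors the two architectures as described; the optimizer, loss, and single-feature input shape are our assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model_a():
    return keras.Sequential([
        keras.Input(shape=(20, 1)),   # 20 past tidal-volume points
        layers.LSTM(50),              # single hidden layer of 50 nodes
        layers.Dense(4),              # predict 4 bins (1 h) ahead
    ])

def build_model_b():
    return keras.Sequential([
        keras.Input(shape=(20, 1)),
        layers.LSTM(50, return_sequences=True),
        layers.LSTM(50, return_sequences=True),
        layers.Dropout(0.2),          # 20% dropout between 2nd and 3rd layers
        layers.LSTM(50),
        layers.Dense(4),
    ])

model = build_model_a()
model.compile(optimizer="adam", loss="mse")
```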

The RMSE values for our LSTM models are significantly greater than the AdaBoost method for predicting 1 h ahead, so we deem AdaBoost the better method of the two for this problem.

The VILIAlert system alerted when four consecutive bins were greater than the 8 ml/kg tidal volume threshold, and these alert times were stored in the database. We can thus work out the accuracy of our models in predicting these alerts, showcasing the possibility of preventing threshold breaches and the resulting injury. Table 3 highlights the predictive accuracy of the AdaBoost model for the 8 patients: Total Alerts is the total number of alerts recorded by the VILIAlert system, TP the true positives (alerts that would have been predicted by the model), and FN the false negatives (alerts that would not have been predicted). The accuracy is then calculated using:

$$ \mathrm{Accuracy}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}} $$
Table 3 Prediction accuracy of alerts using AdaBoost

Of the 84 alerts generated for patient 1, 81 (96.4%) would have been predicted using our models, and thus those threshold breaches could have been prevented. These results showcase how Machine Learning algorithms, when paired with Big Data, can provide value in preventing lung injury during mechanical ventilation of intensive care patients.

4.1 The ATTITUDE Study

The ATTITUDE study, operated by Queen’s University Belfast and funded by The Health Foundation UK, aims to develop a clinical decision support tool to improve the weaning protocols commonly used in clinical practice, and to understand the barriers to implementing evidence-based care from such tools, tested in a proof-of-concept study carried out at the Royal Victoria Hospital ICU, Belfast. Reducing the duration of weaning can improve patient care, outcomes, and mortality while shortening hospital stays and lowering costs; this study aims to establish whether the use of clinical decision support tools can improve the quality of critical care practice.

5 Future of ML and AI in ICU

The methodologies discussed in this chapter can, and must, be explored with more extensive types and volumes of data, to investigate further disease states and clinical conditions. Such data are currently being recorded worldwide in Electronic Health Records, which hold valuable insights that must be utilized to improve healthcare going forward.

From the European Big Data Value Strategic Research and Innovation Agenda [1] we understand the importance of Big Data and of utilizing it to benefit both the economy and society. By exploring the already available EHRs we can provide societal benefit by improving patient outcomes and saving lives, and economic benefit by saving millions in hospital costs through shorter lengths of stay and disease prediction, among others. Healthcare, being one of the most important and largest sectors, can greatly impact the agenda of a data-driven economy across Europe. Data-driven predictive analytics, built using the methodologies discussed in this chapter, can produce clinical decision support tools, allowing for advanced decision making or the automation of procedures. Machine Learning methodologies can further provide greater insight into patient states and inform healthcare professionals with new information or possible reasoning that would not have been caught by a human. These new insights in turn raise further research questions that can be explored.

This chapter can be aligned with the BDV Reference Model in various ways. The data recorded in ICU are extensive and arrive in many formats, from structured data and time series to imaging and text inputs. Advanced visualization tools can add value to these data by presenting them in user-friendly ways. Data analytics and Machine Learning can be applied for the prediction of, and reasoning about, disease states. Further protocols and guidelines exist for the handling of patient information, ensuring efficient data protection and management.

The new AI, data and robotics partnership highlights the value opportunities that exist when transforming the healthcare sector by applying AI to produce value-based and patient-centric care in areas such as pandemic response, disease prevention, diagnosis decision support, and treatment [2].

High-performance computing is an integral part of deploying real-time predictive analytic models in intensive care. We must ensure our machines can process the data efficiently and quickly; utilizing parallelism will speed up data processing and model predictions.

This chapter has explored the use of Big Data-driven analytics for acute and chronic clinical conditions to provide value to healthcare services. While vast research has been carried out on certain disease states, such as sepsis [30, 40–42], work is needed to provide deeper analysis of, and insight into, patients’ complex physiologic state while in such critical conditions. Recent developments have led to greater acceptance of and excitement in this field, resulting in updated guidelines for testing AI models in real-life clinical trials to promote worldwide acceptance of the use of AI in healthcare.