Introduction

Rapid development in computer technology has led to sophisticated methods of analyzing large datasets with the aim of improving human decision making [1]. Artificial Intelligence and Machine Learning (ML) approaches hold tremendous potential for solving complex real-world problems such as those faced by stakeholders attempting to prevent work disability [2]. These techniques are especially appealing in work disability contexts that collect large amounts of data, such as workers’ compensation systems, insurance companies, large corporations, and health care organizations. However, the approaches require thorough evaluation to determine whether they add value beyond traditional statistical approaches. In this special series of articles, we examine the role and value of ML in the field of work disability prevention and occupational rehabilitation.

Definitions of Key Terms

Since the ML field is relatively new but rapidly expanding, several terms have not been clearly defined and are often used interchangeably. Therefore, we begin by providing definitions for some key terms used in this area.

Artificial Intelligence

Artificial Intelligence is a discipline striving to get machines to perform tasks that would normally require human, and even super-human, cognitive abilities. Artificial Intelligence can be seen as an enhancement of human creativity: a tool that allows us to do what we otherwise could not, or to perform tasks more efficiently.

Machine Learning

A subset of Artificial Intelligence, ML is an algorithmic means of learning from data to solve a task without the computer being explicitly programmed for that task.

Deep Learning

Deep Learning builds on the tradition of artificial neural networks. It is a subclass of ML that uses multiple layers of nonlinearity for classification, prediction, translation, and other related problems. The distinctive property of deep learning is its ability to learn hierarchical feature representations that ‘disentangle’ the underlying factors that explain the data. This often leads to significantly better performance than traditional ML algorithms.

Data Mining

Data Mining is the process of discovering and extracting potentially useful, previously unknown patterns from large collections of data. It involves the analysis and intelligent interpretation of large datasets to provide actionable knowledge for human decision support or automated decision making.

Big Data

Big Data refers to the massive amounts of complex data that are difficult to manipulate and understand using traditional processing methods. Working with Big Data typically means applying tools such as ML to data beyond what is captured in standard databases: data integrated from different sources, produced in massive volumes at high speed, and managed and analyzed in unconventional ways using techniques designed to cope with its volume and velocity.

Statistical Modelling

Statistical Modelling is a process founded in statistical principles that favours inducing abstract representations of data via probability distributions and other parametric forms. The techniques used in statistical modelling typically focus on additivity of effects in the data. Examples of statistical models include ordinary linear regression with a Gaussian assumption for the residuals, logistic regression, Cox proportional hazards regression, longitudinal models, quantile regression, ridge regression, the lasso, and the elastic net.
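As a hedged illustration only, the short Python sketch below fits one of the models named above (an ordinary logistic regression) to simulated data and reports interpretable coefficient estimates; the variable names and data-generating mechanism are invented for the example and do not come from any study in this series.

```python
# Minimal sketch: an interpretable logistic regression fit to simulated data.
# "age" and "pain_score" are illustrative, made-up predictors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
age = rng.normal(45, 10, n)
pain_score = rng.normal(5, 2, n)

# Assumed additive data-generating mechanism with known coefficients
logit = -3 + 0.04 * age + 0.3 * pain_score
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([age, pain_score]))
model = sm.Logit(y, X).fit(disp=False)

# The summary reports effect estimates, standard errors, and p-values,
# i.e., the interpretable parameters that statistical modelling emphasizes.
print(model.summary(xname=["const", "age", "pain_score"]))
```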

Research and Practice

All of the above techniques and approaches are used in research as well as in clinical practice and/or business settings. There are several useful applications of ML and statistics for prediction and analytically informed decision-making; however, the field is still developing, and areas of strength and limitation are still being identified. This special series of articles aims to inform: (1) the use of ML as a research tool for developing generalizable knowledge; and (2) the use of ML for practical clinical or business applications aimed at improving real-world decision making. While most of the articles are in the context of research projects, they provide useful lessons and cautions for front-line analysts in health and insurance settings as well as other knowledge users.

Comparing ML and Traditional Statistical Modelling

Both ML and traditional Statistical Modelling aim to discover patterns in data in order to learn from the data (i.e., build predictive or explanatory models) [3]. Often the goals are similar but the approaches used are different. In statistical modelling, we care about finding relationships between variables in the data and the significance of these relationships for prediction. In contrast, ML focuses on optimizing relevant performance metrics by learning on a training set and testing on a validation set or through a cross-validation procedure. In this setup, the performance of the predictive model takes precedence over understanding the relationships between the dependent and independent variables. This allows greater flexibility in the model, but at the cost of interpretability.
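The sketch below illustrates that ML workflow with scikit-learn on a synthetic dataset: a model is fit on a training set and judged by a held-out performance metric (here the area under the ROC curve) and by cross-validation. The dataset, model, and hyperparameters are arbitrary choices for illustration, not those used in any article in this series.

```python
# Sketch of the ML workflow: fit on a training set, evaluate a chosen
# performance metric on held-out data and via cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Held-out performance takes precedence over interpreting individual predictors
print("Test AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
print("5-fold CV AUC:", cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```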

A traditional statistical model (SM) incorporates probabilities for the data-generating mechanism and estimates previously unknown parameters that are usually interpretable and of special interest (e.g., effects of predictor variables and distributional parameters of the outcome variable) [4]. The most commonly used SMs are regression models, which potentially allow for a separation of the effects of competing predictor variables. SMs include ordinary regression, Bayesian regression, semiparametric models, generalized additive models, longitudinal models, time-to-event models, penalized regression, and others. SMs allow for complexity (i.e., nonlinearity and second-order interactions) and an unlimited number of candidate features if the sample size is adequate or penalization (shrinkage; regularization) is used. It is especially easy, using regression splines, to allow every continuous predictor to have a smooth nonlinear effect.
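As a rough sketch of the regression-spline idea, the example below uses the statsmodels formula interface (with patsy's bs() spline basis) to let a continuous predictor have a smooth nonlinear effect within an otherwise standard logistic regression; the variable names and the U-shaped effect are assumptions made up for the example.

```python
# Sketch: a cubic spline basis lets the invented "age" predictor have a
# smooth nonlinear effect in a logistic regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
data = pd.DataFrame({"age": rng.uniform(20, 65, 600)})

# Assumed nonlinear (U-shaped) data-generating mechanism
risk = 1 / (1 + np.exp(-(-2 + 0.004 * (data["age"] - 45) ** 2)))
data["long_term_absence"] = rng.binomial(1, risk)

# bs() expands age into a cubic spline basis, so its estimated effect
# is not forced to be linear.
fit = smf.logit("long_term_absence ~ bs(age, df=4)", data=data).fit(disp=False)
print(fit.params)
```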

ML is taken to mean an algorithmic approach that typically does not use traditional identified statistical parameters, and for which a preconceived structure is not imposed on the relationships between predictors and outcomes. However, many ML algorithms are adopted from the statistics literature, and a number of Bayesian methods have been incorporated. ML usually does not attempt to isolate the effect of any single variable, but is concerned with building an empirical algorithm for purposes of prediction or classification. ML approaches include random forests, recursive partitioning (CART) and decision trees, bagging, boosting, support vector machines, neural networks, deep learning, repeated incremental pruning to produce error reduction (RIPPER), associative classifiers, and others. Many ML approaches do not model the underlying distribution of the data, but rather attempt to learn from the dataset at hand. As such, ML has been considered as much or more a part of computer science than of statistics. However, some ML approaches (e.g., variational autoencoders, generative adversarial networks, Boltzmann machines, Bayesian deep learning, and recurrent neural networks) attempt to non-parametrically model the underlying distribution of the data from a training set. There is therefore some overlap in the methods of ML and traditional statistical modelling, and the toolboxes contain many of the same or similar methods.
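One of the simplest methods named above, a shallow decision tree, can be fit and inspected in a few lines. The sketch below uses synthetic data and an arbitrary depth of two purely to show the kind of human-readable classification rules such trees produce; the feature names are placeholders.

```python
# Sketch: fitting a shallow decision tree and printing its classification rules.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, n_features=6, random_state=1)
tree = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X, y)

# export_text shows the splits as readable if/else rules, illustrating why
# shallow trees are often described as transparent and easy to apply.
print(export_text(tree, feature_names=[f"x{i}" for i in range(6)]))
```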

Perhaps the simplest way to distinguish ML from SMs is that SMs (at least in the regression subset of SM) favour additivity of predictor effects while ML usually does not give additivity of effects any special emphasis.

While the field of ML has built on the statistics literature, it has developed somewhat independently of the field of statistics. As a result, ML experts typically focus on classification and prediction in general and tend not to emphasize probabilistic thinking, whereas probabilistic thinking and an understanding of uncertainty and variation are hallmarks of statistics. By not thinking probabilistically, ML scientists frequently use classifiers instead of risk prediction models. Another major difference between SMs and ML is their respective sample size requirements. Because SMs favour additivity as a default assumption, when additive effects dominate, SMs require far smaller sample sizes (e.g., 20 events per candidate predictor) than ML, which typically requires around 200 events per candidate predictor [5]. Additionally, ML approaches have fewer assumptions than SMs and are better at finding interactions that were not pre-specified (SMs typically require interactions to be pre-specified). However, if important non-additive effects are rare, ML will be no better than traditional regression approaches in terms of predictive discrimination. Lastly, while several ML approaches produce ‘black box’ models that users are not able to view or evaluate, SMs always provide models that are interpretable and avoid this ‘black box’ problem.
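The distinction between a classifier and a risk prediction model is easy to see in code: the same fitted model can output forced class labels or estimated probabilities (risks). The sketch below, again on synthetic data with scikit-learn, is a hedged illustration of that contrast rather than a recommendation for any particular method.

```python
# Sketch: one fitted model used as a classifier (hard labels) versus as a
# risk prediction model (estimated probabilities).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=5, random_state=2)
model = LogisticRegression(max_iter=1000).fit(X, y)

labels = model.predict(X[:5])             # forced 0/1 classifications (0.5 threshold)
risks = model.predict_proba(X[:5])[:, 1]  # predicted risks, retaining uncertainty
print(labels)
print(risks)
```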

ML and Artificial Intelligence approaches have had their greatest successes in high signal-to-noise situations (e.g., visual and sound recognition, language translation, and playing games with concrete rules). Each of these situations allows quick feedback during training and availability of the ‘correct’ answer. In medicine, ML has had good success in the field of radiology, where the goal is pattern recognition and mimicking radiologists’ expert image interpretations. In the areas of rehabilitation and work disability prevention, the situation is very different, with a low signal-to-noise ratio and high levels of error in outcome measures. It is unknown how well ML approaches will perform within these settings. Additionally, new algorithms and methodologies may be required to permit ML to be applied in areas of rehabilitation.

In summary, there are similarities but also several key differences between ML and SMs. There may be situations that favour ML and others that favour more traditional SMs. However, research is needed in the field of occupational rehabilitation and work disability prevention to determine when ML is the best option.

Articles and Issues in the Special Series

Articles in this special series cover important current issues in this area of research. These include whether ML analyses are able to predict as well as or better than traditional statistical methods, the validity and replicability of ML algorithms, practical aspects of developing real-world databases in clinical settings for building ML algorithms, and, importantly, the ethical aspects of using patient and worker data for ML approaches. Each of these is discussed briefly below.

A study by van Hoffen and colleagues examined the development of prediction models for sickness absence due to mental health conditions in the general working population of the Netherlands [6]. They compared models developed using traditional logistic regression to models created using decision tree analysis. Findings indicate that an 11-predictor regression model and a 3-node decision tree identified workers at risk of long-term sickness absence due to mental health conditions equally well. However, the decision tree appeared to provide better insight into the mental health long-term sickness absence risk groups and may be easier to use in occupational health care practice.

Another example of ML analysis was provided by Akbarzadeh Khorshidi et al. [7]. They modeled a large (n = 20,693) insurance dataset of people injured in transport accidents in Victoria, Australia. Using a hybrid approach combining unsupervised and supervised ML methods, they identified eight patient clusters that were highly predictive of injury outcomes. The analysis improved cost predictability compared with predictors such as sex, age, and injury type. They also concluded that the transparency and interpretability of a 3-node decision tree (based on categorical variables of distress, sex, work satisfaction, and work pace) allowed convenient integration of classification rules into operational processes.

A less promising but instructive result was obtained by Gross et al. in their validation study examining the Work Assessment Triage Tool for selecting rehabilitation interventions for workers’ compensation claimants in Alberta, Canada [8]. They found that the accuracy of an algorithm developed using ML declined in the validation cohort and proved lower than that of human clinical recommendations. This has important implications for organizations using ML analysis approaches, and suggests that models must be continually updated to react to concept drift, or changes in the system that alter model classification accuracy.

The implications of applying ML to real-world decision support for preventing work disability were discussed by Six Dijkstra et al. [9]. In this ethical deliberation related to a hypothetical clinical scenario from a workers’ health assessment, the authors reflect on relevant biomedical ethical principles: respect for autonomy, beneficence, non-maleficence, and justice. They provide three recommendations for the socially responsible design of ML decision-support tools to minimize undesirable adverse effects of their development and implementation. These include: (1) developing ‘well-trained’ or externally valid decision-support tools that incorporate ML algorithms; (2) more thorough review and oversight by health research ethics committees of research involving ML algorithms for clinical decision support; and (3) adequate education of health care professionals about the strengths and limitations of decision-support tools incorporating ML algorithms.

Another paper, by Fong and colleagues, reviewed efforts to develop robotic solutions that incorporate ML algorithms to overcome limitations inherent to Functional Capacity Evaluation [10]. While the field is currently exploratory and developmental, novel robotics with integrated ML algorithms appear to have potential for improving traditional practice through more accurate quantification as well as delivery at a distance. Through telerehabilitation and internet connectivity, robotic assessment techniques that incorporate ML-based approaches can be used to reach rural and remote locations, creating tremendous opportunities for patients who otherwise could not participate in assessment and rehabilitation using the traditional methods available to rehabilitation professionals.

Lastly, the development and theoretical basis of a cloud platform containing ML algorithms for risk prediction in case management decisions were described by Cheng et al. [11]. The Smart Work Injury Management (SWIM) system is a secure, centralized cloud platform containing a set of management tools for use by front-line insurance and health care providers. SWIM incorporates systems for data storage, data analytics, and ML. When fully developed, SWIM will hopefully provide more accurate prediction models for the cost of work injuries, as well as advice on optimal medical care and return-to-work (RTW) interventions, to all RTW stakeholders. The paper provides a practical example of how health care practitioners and insurers can work collaboratively to develop useful tools and better systems for the prevention of work disability.

Conclusion

The development of modern computer technology (hardware and software) provides tremendous potential and opportunity for the development of better decision-support systems and practical tools to prevent disability. Research is currently in its exploratory stages but is progressing rapidly. This special series highlights promising results, practical examples, and recommendations for the socially responsible development and implementation of ML applications in the field of work disability prevention.