Prediction of Prednisolone Dose Correction Using Machine Learning

Sato, Hiroyasu; Kimura, Yoshinobu; Ohba, Masahiro; Ara, Yoshiaki; Wakabayashi, Susumu; Watanabe, Hiroaki

doi:10.1007/s41666-023-00128-3

Prediction of Prednisolone Dose Correction Using Machine Learning

Research Article
Open access
Published: 15 February 2023

Volume 7, pages 84–103, (2023)
Cite this article

Download PDF

You have full access to this open access article

Journal of Healthcare Informatics Research Aims and scope Submit manuscript

Prediction of Prednisolone Dose Correction Using Machine Learning

Download PDF

Hiroyasu Sato ORCID: orcid.org/0000-0002-9566-834X¹,
Yoshinobu Kimura²,
Masahiro Ohba³,
Yoshiaki Ara⁴,
Susumu Wakabayashi⁵ &
…
Hiroaki Watanabe⁶

2380 Accesses
Explore all metrics

Abstract

Wrong dose, a common prescription error, can cause serious patient harm, especially in the case of high-risk drugs like oral corticosteroids. This study aims to build a machine learning model to predict dose-related prescription modifications for oral prednisolone tablets (i.e., highly imbalanced data with very few positive cases). Prescription data were obtained from the electronic medical records at a single institute. Cluster analysis classified the clinical departments into six clusters with similar patterns of prednisolone prescription. Two patterns of training datasets were created with/without preprocessing by the SMOTE method. Five ML models (SVM, KNN, GB, RF, and BRF) and logistic regression (LR) models were constructed by Python. The model was internally validated by five-fold stratified cross-validation and was validated with a 30% holdout test dataset. Eighty-two thousand five hundred fifty-three prescribing data for prednisolone tablets containing 135 dose-corrected positive cases were obtained. In the original dataset (without SMOTE), only the BRF model showed a good performance (in test dataset, ROC-AUC:0.917, recall: 0.951). In the training dataset preprocessed by SMOTE, performance was improved on all models. The highest performance models with SMOTE were SVM (in test dataset, ROC-AUC: 0.820, recall: 0.659) and BRF (ROC-AUC: 0.814, recall: 0.634). Although the prescribing data for dose-related collection are highly imbalanced, various techniques such as the following have allowed us to build high-performance prediction models: data preprocessing by SMOTE, stratified cross-validation, and BRF classifier corresponding to imbalanced data. ML is useful in complicated dose audits such as oral prednisolone.

Predicting the need for a reduced drug dose, at first prescription

Article Open access 22 October 2018

Electronic Prescription Service for Improved Healthcare Delivery

Detecting diseases in medical prescriptions using data mining methods

Article Open access 24 November 2022

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Prescription errors are a serious problem that threaten patient safety. A large variation in the frequency of prescription errors has been reported in different studies, ranging from 1.1 to 41.7% [1,2,3,4,5]. Major categories of prescription errors include wrong drug, wrong dose, or wrong strength [6]. While pharmacists conduct prescription audits to detect inappropriate prescriptions, human checks usually have limitations. In a clinical decision support (CDS) systems or computerized physician order entry (CPOE) system, overdose alert features have been implemented in recent years. However, most dose alert features currently implemented in CDS or CPOE are simple designs that warn only when the prescribed dose exceeds the approved upper-limit dose. With this specification, it is difficult to detect incorrect doses for drugs whose appropriate dose range varies widely depending on the disease or patient.

Oral corticosteroids are potent anti-inflammatory drugs used in the treatment of many diseases. The doses of these drugs vary greatly, depending on the disease or severity or patient population. In Japan, prednisolone tablets are one of the most commonly prescribed oral corticosteroids. These tablets may be administered daily, at a maintenance dose of ≤ 5 mg/body, in chronic diseases such as collagen disease. Conversely, in the chemotherapy of malignant lymphoma, high-dose prednisolone such as 100 mg/body is prescribed. Moreover, there is a possibility of dose adjustment within the same patient, based on his/her severity of symptoms.

To accommodate the wide variation in the required dose, oral prednisolone tablets are available in some strengths (1 mg, 2.5 mg, and 5 mg tablets in Japan). When adjusting the dose from a previous prescription, wrong input of the dose or wrong selection of strengths may occur due to the existence of multiple strengths of the drug. Prednisolone tablets, which have a very wide range of approved doses, are high-risk drugs that are prone to prescription errors [7, 8]. Even with the maximum approved dose of prednisolone tablets (e.g., 100 mg/day) set as a threshold for dose alert features currently implemented in CDS or CPOE, it is difficult to detect inappropriate doses in the clinical setting.

In recent years, research using machine learning (ML) has been actively conducted in the healthcare fields [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31], and its usefulness has been reported for solving complex problems such as diagnostic support and prognosis prediction. The aim of this study is to develop a predictive model to determine the appropriateness of the dosage of oral prednisolone tablets using a machine learning algorithm. Real-world data in the field of drug safety, such as dose errors, are imbalance with very few positive cases. We hypothesized that ML would perform better than conventional models in detecting dose-inappropriate prescription, which is highly disproportionate data. We explore the possibility that a highly accurate prediction model using machine learning will be implemented as a prescription audit function for CDS and CPOE and that it will contribute to the prevention of dose errors of high-risk drugs.

2 Related Work

Clinical data are often imbalanced, i.e., some classes have much fewer instances than others. The percentage of positive cases in clinical dataset of Table 1 is imbalanced (1.6–47.2%). Without dealing with data imbalances, the minority class is often insensitive to machine learning. This problem is serious when the minority class is the target of the prediction. Several approaches have been proposed and research into solutions to overcome the problem of imbalanced datasets [32, 33].

Table 1 Previous works: binary prediction with multi-ML algorithm in clinical data (excluding image and text data) in 2017–2020

Full size table

Ichikawa et al. [34] developed screening model for hyperuricemia using the random under-sampling method on highly imbalanced data with a positive rate of 1.4%. The synthetic minority over-sampling technique (SMOTE) [35] is the best-known over-sampling technique and is often used for imbalanced medical data: prediction of all-cause mortality by Sakr et al. [36] (34,212 patients data with 11.3% positive cases), prediction of stroke by Wu et al. [37] (1131 patients data with 5.0% positive case), and prediction of the risk for severe complication after bariatric surgery by Cao et al. [38] (training dataset of 37,811 patients with 3.2% positive case).

Feature selection [39] is a method of considering an imbalanced distribution by selecting the best features. De Silva et al. [40] identified prediabetes predictors from 6346 persons data with 23.4% positive case by combining feature selection and machine learning. In this study, four feature selection algorithms were used to select 15 to 30 variables depending on severity from 156 preselected variables. Ali et al. [41] improve the diagnostic performance of Parkinson’s disease by two-dimensional data selection using voice data.

Cost-sensitive learning [42] may also be used for imbalanced learning. This concept takes into account the cost of prediction errors when training a machine learning model by assuming that the cost of misclassification of the minority sample is higher than that of the majority sample. Nakamura et al. [43] used the L2-regularized logistic regression with cost-sensitive learning and CNN models to identify readmissions within 30 days of children and reported that the F-measures for both models were similar.

3 Methods

3.1 Database and Features

The data warehouse, which stores EMR data at Obihiro Kosei General Hospital, a regional center hospital in a rural area of Hokkaido in Japan, was used.

One hundred twenty-one thousand one hundred ninety prescription orders for oral prednisolone tablets (5 mg, 2.5 mg, and 1 mg), from April 1, 2012, to March 31, 2017, were extracted. Prescription orders in which one strength was prescribed more than once were excluded, assuming that the drug is taken every other day or that the instruction requires tapering (these prescriptions are sometimes seen in Japan for the purpose of reducing side effects). Also, those in which different strengths were prescribed on different days were excluded. When different strengths were prescribed for the same number of days, their doses were combined to calculate the daily dose of prednisolone. As a result, 116,285 prednisolone tablet orders were aggregated into 89,443 observations. In addition, the prescription that the previous prescription was confirmed during the investigation period were eligible (i.e., the prescription ordered prednisolone tablets for the first time was excluded). Eighty-two thousand five hundred fifty-three observations were obtained (Online Resource 1).

For the target prescriptions, department, age, gender, daily dose, and prescription days were investigated. For previous prescriptions, prescription date, daily dose, and prescription days were investigated. These features are available through the pharmacy’s dispensing system.

3.2 Standardization of Clinical Departments

The names of the departments included in this study varied depending on the characteristics and size of the hospital. Hence, the clustering of medical departments was done to construct a general model that could be applied to hospitals of any characteristics and size. The subject data were stratified in two dimensions: daily doses (seven categories), the number of prescription days (nine categories), and the number of cases corresponding to each cell were calculated for every medical department (Online Resource 2). A Euclidean distance-based cluster analysis (Ward method) was performed, based on a distribution table normalized so that the total number of prescriptions was one for each clinical department. Nineteen clinical departments were classified into six clusters with a similar prescription trend for prednisolone tablets.

3.3 Outcome and Variates

Prescription modification for prednisolone doses was investigated, and 135 prescriptions with dose modification were defined as positive cases. For positive cases, the dose in the pre-revision order (first edition) was investigated. Positive cases included dose-related changes (i.e., increase or decrease in the daily dose, addition and deletion of prednisolone tablets, or changes in the strength of prednisolone tablets). Changes in prescribing days, comments, and other concomitant medications were not considered positive cases. Eighty-two thousand four hundred eighteen prescriptions without dose modification were considered negative case. There was significant imbalance in the outcome, with a positive rate of 0.16%

The following features have been adapted to build a predictive model in the pharmacy prescription audit setting: the cluster of clinical departments, current daily dose (pre-revision dose in positive case), current prescription days (pre-revision days in positive case), daily dose of the previous prescription, and prescription days of the previous prescription.

3.4 Data Preprocessing

The data were stratified by the percentage of positive cases and divided into a 70% training dataset and a 30% test dataset (Fig. 1).

To see if dealing with imbalance data contributes to improved performance, training datasets were prepared with/without resampling preprocessing. Original training dataset includes 57,693 negative cases and 94 positive cases (positive rate: 0.16%). The preprocessed dataset was resampled by SMOTE so that the minority class was 10% of the majority (i.e., training dataset after resampling includes 57,693 negative cases and 5769 positive cases). SMOTE was performed using the Imbalanced-Learn library in the Python 3.6 language with the following settings: sampling_strategy = 0.1, k_neighbors = 5, and n_jobs = 1. The test dataset was not resampled.

3.5 Model Development

Logistic regression (LR) and following five machine learning algorithms were used for learning by training dataset: support vector machine (SVM), k-nearest neighbor (KNN), random forest (RF), gradient boosting (GB), and balanced random forest (BRF). Then, the learning models were validated on the test dataset.

The Python 3.6 language was used for coding the algorithm, and the Scikit-Learn library was used for all ML modules, except for BRF. The Imbalanced-Learn library was used for the latter. Hyperparameter optimization was not explored in this study.

BRF is an adaptation of random forest that under-samples the majority class, making use of the fact that random forest is an ensemble method [44]. In BRF, for each tree in the random forest, a bootstrap sample is drawn from the minority class. The same number of observations is then randomly drawn from the majority class. In the usual under-sampling method, most of the information in the majority class is lost without being used. The BRF algorithm uses the majority class data in the ensemble tree. Although studies that have used BRF for imbalanced data have been reported in various fields [45,46,47,48], very few have been reported for the healthcare field.

3.6 Model Performance

To evaluate model performance, the following indicators were computed: area under the receiver operating characteristic curve (ROC-AUC), accuracy, precision, and recall.

$$\mathrm{Accuracy}=\frac{\mathrm{True}\;\mathrm{Positives}+\mathrm{True}\;\mathrm{Negatives}}{\mathrm{True}\;\mathrm{Positives}+\mathrm{False}\;\mathrm{Positives}+\mathrm{False}\;\mathrm{Negatives}+\mathrm{True}\;\mathrm{Negatives}}$$

(1)

$$\mathrm{Precision}=\frac{\mathrm{True}\;\mathrm{Positives}}{\mathrm{True}\;\mathrm{Positives}+\mathrm{False}\;\mathrm{Positives}}$$

(2)

$$\mathrm{Recall}=\frac{\mathrm{True}\;\mathrm{Positives}}{\mathrm{True}\;\mathrm{Positives}+\mathrm{False}\;\mathrm{Negatives}}$$

(3)

ROC-AUC was used as the primary performance indicator. Recall was used as the second indicator because oversight should be avoided rather than overdetermined in building a predictive model for medical safety fields.

In addition, a confusion matrix, which displays model performance via true positives, false positives, false negatives, and true negatives, was evaluated.

3.7 Internal Validation

Internal validation was performed using k-fold stratified cross-validation (CV) for each prediction model. A normal CV divides the data into k parts in sequential order, and there may be very few or no positive cases in a sub-dataset, in imbalanced data. Stratified CV equally divides cases into positive cases and negative cases into all sub-datasets. Hence, stratified CV is used to validate imbalanced data in the healthcare field [49,50,51,52,53]. In this study, the data were divided evenly into five sub-datasets. The arithmetic means of the performance score, obtained from the five sub-datasets, was defined as performance after internal validation. fivefold stratified CV was only applied to training dataset.

3.8 Statistical Analysis

All features had no missing data and no interpolation was done. For each feature in the model, positive case and negative case rates were described. Numerical information such as age, dose, days, pre-dose, and pre-days was tested by the Student’s t test. The chi-square test was used for sex, which is the count information. Differences in the distribution of the cluster of clinical department were compared by Fisher’s exact test. These statistical analyzes were performed using R (version 3.6.3) software.

4 Results

Data related to 82,553 prednisolone prescriptions were applied to the model. Table 2 shows the background information of the subject data. Age and dose were significantly different between the groups, but their median differences were small.

Table 2 Characteristics of the dataset that was used in this machine learning

Full size table

The prescription patterns of prednisolone tablets differed by the clinical department (Online Resource 3). In gastroenterology, respiratory medicine, and cardiology departments, low-dose and long-term prescriptions were common. In the pediatrics department, moderate-dose and short-term prescriptions were common. In the hematology department, peaks were observed in high-dose and short-term zones because oral prednisolone was used as an anticancer drug treatment for malignant lymphomas.

Cluster analysis classified clinical departments into six clusters with similar prescription patterns (Fig. 2). Seven departments, which had a tendency to prescribe low to moderate doses for longer duration, were grouped in one cluster, while the hematology and gynecology departments were each clustered in one clinical department. There were no positive cases in multiple clusters of clinical departments, and a significant difference was confirmed in the cluster distribution among the groups.

In the training dataset of the original data (without SMOTE), the learning results showed that the ROC-AUC of RF, GB, SVM, and BRF models was higher than that of LR (Table 3). The BRF model, which is a classifier corresponding to imbalanced data, showed the highest performance among the five ML models and had the highest recall. The LR and SVM models showed zero precision and recall and could not detect any positive cases. Due to the highly imbalanced data in this study, accuracy and ROC-AUC were high even when no positive cases were detected. Confirmation of the confusion matrix showed that no true positives existed for the LR and SVM models (Online Resource 4).

Table 3 Performance indicator in 5 machine learning and logistic regression models with original data (without SMOTE)

Full size table

As a result of validation with the test dataset using the models learned by the original training dataset, only the BRF model showed high ROC-AUC, but the precision was very low (Table 3). The LR and SVM models were unable to classify true positives even on test dataset.

In the over-sampling data by SMOTE, performance improved for both training and test datasets for most algorithms (Table 4). Amplification of positive cases increased the proportion of true positives. The highest performing model was BRF even after resampling. With over-sampling preprocessing, precision of BRF in training dataset has also greatly improved. In the test dataset with SMOTE; there was no improvement of Recall and ROC-AUC in the BRF model, but the precision increased. In the test dataset after resampling by SMOTE, the SVM model showed the highest ROC-AUC equivalent to BRF model.

Table 4 Performance indicator in 5 machine learning and logistic regression models with preprocessing data by SMOTE

Full size table

5 Discussion

Oral corticosteroids, such as prednisolone, are associated with various adverse events including peptic ulcer, osteoporosis, hyperglycemia, and susceptibility to infections. Corticosteroids are one of the most commonly used drugs for which hospitalization due to adverse drug events is required [54]. Because corticosteroids have symptoms related to overdose and withdrawal, inappropriate dosing due to prescribing errors is a serious concern [55, 56]. Thus, it is of clinical significance to accurately detect any error related to the dose correction of prednisolone tablets. Determining the appropriate dose of oral prednisolone requires complex considerations such as calculation of daily dose (the sum of multiple standards), disease, severity, weight (especially for children), and the relationship with the previously prescribed dose. There are limits to human prescription audits, and objective support from CDS or CPOE system is desired. We developed the best ML model to detect the dose-related modified prescriptions of prednisolone tablets.

LR model without resampling, as a traditional model, could not be judged for true positives at all, and its performance was not clinically applicable. The BRF model showed highest ROC-AUC and highest recall without preprocessing. Chen et al. reported the usefulness of BRF in imbalanced data with a minority class using 2.3 to 9.7% datasets and that under-sampling appears to be superior to over-sampling [44]. The minority class of the dataset in this study was 0.16%, which was extremely imbalanced compared to the previously reported datasets. However, when combined with stratified CVs, the BRF model showed very high performance at a clinically implementable level.

By adding pre-processing of over-sampling by SMOTE, the training performance of each model was improved. While SVM performance has improved significantly, BRF has not seen much performance improvement. It seems that the BRF without resampling had already obtained sufficient performance. Another reason may be that under-sampling by BRF after over-sampling by SMOTE offset the preprocessing effect. The SVM model after SMOTE resampling (SMOTE + SVM) showed the highest ROC-AUC for the test dataset, slightly above SMOTE + BRF. Because SVM classifiers are very sensitive to imbalanced data [57], SMOTE + SVM has been reported to be a good combination [58] and was considered to be the best algorithm candidate in future developmental studies. In this study, default parameters are used for machine learning. Adjusting hyperparameters can further improve performance. Only one type of SMOTE setting, which sets the amplification factor of the minority class to 10% of that of the majority class, is being considered. It has not been verified whether this amplification setting is optimal. Since the data in this study were obtained from a single facility, the applicability of this model in other facilities needs to be further investigated.

In the prediction in the medical safety area, it is important to catch all positive cases. Therefore, in this study, recall was more important than precision. Since in this study, positive cases were defined as “dose-related prescribing correction cases,” various types of positive cases were observed. High-risk prescriptions with prednisolone dose errors that must be detected include the wrong selection of strength (a case in which instead of a dose reduction from 1 tablet of 5 mg [5 mg/day] to 4 tablets of 1 mg [4 mg/day], 4 tablets of 5 mg [20 mg/day] were prescribed), wrong selection of dose unit (a case in which instead of a prescription of 5 mg, 5 tablets were prescribed), and typing error (a case in which instead of 10 mg, 1 mg or 100 mg was prescribed). On the other hand, positive cases in this study also include cases with little risk in which the dose was adjusted after the prescription order, according to the patient’s condition (for example, a correction from 5 to 4 mg after the order). In this study, since the data were collected retrospectively, it was not possible to investigate the reasons for dose correction. Because in the case of an acute exacerbation, prednisolone may suddenly be administered at high doses; it was also difficult to objectively define “high-risk error dose,” even if the dose is many times higher than the previous prescription. In this study, assuming a secondary audit by a pharmacist, we attempted to construct a first screening model to detect any dose modification. Therefore, the precision of the best ML model in this study was expected to be low to some extent.

Because the optimal ML model constructed in this study is assumed to be used in the setting of community pharmacies, the input variables were limited to the following information described in Japanese prescriptions (i.e. the information which could be recognized by the community pharmacists): patient age, gender, clinical department, and previous prescription information. Features such as laboratory data, indications, severity, and weight were not used in this study because they are difficult to obtain at many pharmacies in Japan. The addition of these variables is expected to improve predictive performance.

Furthermore, the input variables of different departments were clustered in this study. It was observed that the prednisolone prescribing pattern was different for each clinic, and clinic information was considered to be an influential factor in erroneous dose detection. However, the name of the department name can vary, depending on the number of beds and features of the hospital. For example, in the university hospitals, medical departments are highly subdivided (e.g. “endocrinology and metabolism,” “nephrology,” “diabetes”), but in regional small or medium hospitals, integrated names are often listed (e.g. “general internal medicine”). In this study, the clustering of clinical departments was performed using prescription patterns of dose and days to construct a general-purpose model. However, since there has been no report on the standardization of the clustering of clinical departments for prednisolone prescription patterns, it is necessary to verify our method as the clustering by the Ward’s method with Euclidean and thresholds for dividing doses and days to create frequency distributions.

There are few reports of ML adaptation to medication error such as prescription audits [59,60,61,62,63,64]. To the best of our knowledge, this is the first report that ML was considered for prescription audits of drugs with a very wide range of clinical doses, such as oral steroids.

The results of this study show that even in the field of clinical drug safety, where the positive cases are few, ML dealing with imbalanced data may be useful for problems that cannot be solved by mathematical models. Prednisolone tablets, a commonly prescribed oral corticosteroid, is a high-risk drug, and its overdose or underdose is a significant risk to patients. Since this study targets prescription audit settings in pharmacies, a prediction model was constructed with an emphasis on recall to prevent under-detection. As a result, although we built a high-performance machine learning model, prescriptions that detect inappropriate doses of prednisolone should be reviewed by pharmacists for the necessary of prescription question to physicians. Accurate warning regarding prescription dosage errors with ML will be beneficial to both healthcare professionals and patients. It is expected that ML will be extensively utilized in clinical drug safety management, including prescription audits, to prevent serious incidents.

Data Availability

The dataset is not open. The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

References

Dornan T, Ashcroft D, Heathfield H, Lewis P, Miles J, Taylor D, Tully M, Wass V (2009) Final report: An in depth investigation into causes of prescribing errors by foundation trainees in relation to their medical education - EQUIP study. General Medical Council. http://www.gmc-uk.org/FINAL_Report_prevalence_and_causes_of_prescribing_errors.pdf_28935150.pdf. Accessed 20 Jul 2022
Avery AJ, Ghaleb M, Barber N, Franklin BD, Armstrong SJ, Serumaga B, Dhillon S, Freyer A, Howard R, Talabi O, Mehta RL (2013) The prevalence and nature of prescribing and monitoring errors in English general practice: a retrospective case note review. Br J Gen Pract 63:e543-553. https://doi.org/10.3399/bjgp13X670679
Article Google Scholar
Claesson CB, Burman K, Nilsson J, Vinge E (1995) Prescription errors detected by Swedish pharmacists. Int J Pharm Pract 3:151–156. https://doi.org/10.1111/j.2042-7174.1995.tb00809.x
Article Google Scholar
Lustig A (2000) Medication error prevention by pharmacists–an Israeli solution. Pharm World Sci 22:21–25. https://doi.org/10.1023/A:1008774206261
Article Google Scholar
Khaja KA, Al-ansari TM, Sequeira R (2005) An evaluation of prescribing errors in primary care in Bahrain. Int J Clin Pharmacol Ther 43:294–301. https://doi.org/10.5414/cpp43294
Article Google Scholar
Academy of Managed Care Pharmacy (2019) What is managed care pharmacy? – Concepts in Managed Care Pharmacy; Medication Errors [Internet]. https://www.amcp.org/about/managed-care-pharmacy-101/concepts-managed-care-pharmacy/medication-errors. Accessed 18 Oct 2022
Chua SS, Chua HM, Omar A (2010) Drug administration errors in paediatric wards: a direct observation approach. Eur J Pediatr 169:603–611. https://doi.org/10.1007/s00431-009-1084-z
Article Google Scholar
Sangtawesin V, Kanjanapattanakul W, Srisan P, Nawasiri W, Ingchareonsunthorn P (2003) Medication errors at Queen Sirikit National Institute of Child Health. J Med Assoc Thai 86(Suppl 3):S570-575
Google Scholar
AI-Janabi S, Mahdi MA (2019) Evaluation prediction techniques to achievement an optimal biomedical analysis. Int J Grid and Utility Computing 10:512–27. https://doi.org/10.1504/IJGUC.2019.102021
Article Google Scholar
Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N (2017) Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One 12:e0174944. https://doi.org/10.1371/journal.pone.0174944
Article Google Scholar
Zhang Z, Ho KM, Hong Y (2019) Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care. Crit Care 23:112. https://doi.org/10.1186/s13054-019-2411-z
Article Google Scholar
Jhee JH, Lee S, Park Y, Lee SE, Kim YA, Kang S, Kwon J, Park JT (2019) Prediction model development of late-onset preeclampsia using machine learning-based methods. PLoS One 14:e0221202. https://doi.org/10.1371/journal.pone.0221202
Article Google Scholar
Churpek MM, Yuen TC, Winslow C, Meltzer DO, Kattan MW, Edelson DP (2016) Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med 44:368–374. https://doi.org/10.1097/CCM.0000000000001571
Article Google Scholar
Qiu J, Li P, Dong M, Xin X, Tan J (2019) Personalized prediction of live birth prior to the first in vitro fertilization treatment: a machine learning method. J Transl Med 17:317. https://doi.org/10.1186/s12967-019-2062-5
Article Google Scholar
Lenhard F, Sauer S, Andersson E, Månsson KN, Mataix-Cols D, Rück C, Serlachius E (2018) Prediction of outcome in internet-delivered cognitive behaviour therapy for paediatric obsessive-compulsive disorder: a machine learning approach. Int J Methods Psychiatr Res 27:e1576. https://doi.org/10.1002/mpr.1576
Article Google Scholar
Hsieh MH, Hsieh MJ, Chen CM, Hsieh CC, Chao CM, Lai CC (2018) Comparison of machine learning models for the prediction of mortality of patients with unplanned extubation in intensive care units. Sci Rep 8:17116. https://doi.org/10.1038/s41598-018-35582-2
Article Google Scholar
Harrington L, diFlorio-Alexander R, Trinh K, MacKenzie T, Suriawinata A, Hassanpour S (2018) Prediction of atypical ductal hyperplasia upgrades through a machine learning approach to reduce unnecessary surgical excisions. JCO Clin Cancer Inform 2:1–11. https://doi.org/10.1200/CCI.18.00083
Article Google Scholar
Huang C, Murugiah K, Mahajan S, Li S, Dhruva SS, Haimovich JS, Wang Y, Schulx WL, Testani JM, Wilson FP, Mena CI, Masoudi FA, Rumsfeld JS, Spertus JA, Mortazavi BJ, Krumholz HM (2018) Enhancing the prediction of acute kidney injury risk after percutaneous coronary intervention using machine learning techniques: a retrospective cohort study. PLoS Med. 15:e1002703. https://doi.org/10.1371/journal.pmed.1002703
Article Google Scholar
Davoudi A, Ebadi A, Rashidi P, Ozrazgat-Baslanti T, Bihorac A, Bursian AC (2017) Delirium prediction using machine learning models on preoperative electronic health records data. Proc IEEE Int Symp Bioinformatics Bioeng 2017:568–573. https://doi.org/10.1109/BIBE.2017.00014
Article Google Scholar
Corey KM, Kashyap S, Lorenzi E, Lagoo-Deenadayalan SA, Heller K, Whalen K, Balu S, Heflin MT, McDonald SR, Swaminathan M, Sendak M (2018) Development and validation of machine learning models to identify high-risk surgical patients using automatically curated electronic health record data (Pythia): a retrospective, single-site study. PLoS Med. 15:e1002701. https://doi.org/10.1371/journal.pmed.1002701
Article Google Scholar
Ge Y, Wang Q, Wang L, Wu H, Peng C, Wang J, Xu Y, Xiong G, Zhang Y, Yi Y (2019) Predicting post-stroke pneumonia using deep neural network approaches. Int J Med Inform 132:103986. https://doi.org/10.1016/j.ijmedinf.2019.103986
Article Google Scholar
Cramer EM, Seneviratne MG, Sharifi H, Ozturk A, Hernandez-Boussard T (2019) Predicting the incidence of pressure ulcers in the intensive care unit using machine learning. EGEMS (Wash DC) 7:49. https://doi.org/10.5334/egems.307
Article Google Scholar
Jeong E, Park N, Choi Y, Park RW, Yoon D (2018) Machine learning model combining features from algorithms with different analytical methodologies to detect laboratory-event-related adverse drug reaction signals. PLoS One 13:e207749. https://doi.org/10.1371/journal.pone.0207749
Article Google Scholar
Hong JC, Niedzwiecki D, Palta M, Tenenbaum JD (2018) Predicting emergency visits and hospital admissions during radiation and chemoradiation: an internally validated pretreatment machine learning algorithm. JCO Clin Cancer Inform 2:1–11. https://doi.org/10.1200/CCI.18.00037
Article Google Scholar
Du Z, Yang Y, Zheng J, Li Q, Lin D, Li Y, Fan J, Cheng W, Chen X, Cai Y (2020) Accurate prediction of coronary heart disease for patients with hypertension from electronic health records with big data and machine-learning methods: model development and performance evaluation. JMIR Med Inform. 8:e17257. https://doi.org/10.2196/17257
Article Google Scholar
Lin WC, Goldstein IH, Hribar MR, Sanders DS, Chiang MF (2020) Predicting wait times in pediatric ophthalmology outpatient clinic using machine learning. AMIA Annu Symp Proc 2019:1121–1128
Google Scholar
Yang X, Gong Y, Waheed N, March K, Bian J, Hogan WR, Wu Y (2020) Identifying cancer patients at risk for heart failure using machine learning methods. AMIA Annu Symp Proc 2019:933–941
Google Scholar
Brisimi TS, Xu T, Wang T, Dai W, Adams WG, Paschalidis IC (2018) Predicting chronic disease hospitalizations from electronic health records: an interpretable classification approach. Proc IEEE Inst Electr Electron Eng 106:690–707. https://doi.org/10.1109/JPROC.2017.2789319
Article Google Scholar
Ibrahim ZM, Wu H, Hamoud A, Stappen L, Dobson RJB, Agarossi A (2020) On classifying sepsis heterogeneity in the ICU: insight using machine learning. J Am Med Inform Assoc 27:437–443. https://doi.org/10.1093/jamia/ocz211
Article Google Scholar
Ye C, Li J, Hao S, Liu M, Jin H, Zheng L, Xia M, Jin B, Zhu C, Alfreds ST, Stearns F, Kanov L, Sylvester KG, Widen E, McElhinney D, Ling XB (2020) Identification of elders at higher risk for fall with statewide electronic health records and a machine learning algorithm. Int J Med Inform 137:104105. https://doi.org/10.1016/j.ijmedinf.2020.104105
Article Google Scholar
Jamei M, Nisnevich A, Wetchler E, Sudat S, Liu E (2017) Predicting all-cause risk of 30-day hospital readmission using artificial neural networks. PLoS One 12:e0181173. https://doi.org/10.1371/journal.pone.0181173
Article Google Scholar
Ali SH (2012) Miner for OACCR: Case of medical data analysis in knowledge discovery. 2012 6th International Conference on Sciences of Electronics, Technologies of Information and Telecommunications (SETIT), pp.962–975. Sousse. https://doi.org/10.1109/SETIT.2012.6482043
Haixiang G, Yijing L, Shang J, Mingyun G, Yuanyue H, Bing G (2017) Learning from class-imbalanced data: review of methods and applications. Expert Syst Appl 73:220–239. https://doi.org/10.1016/j.eswa.2016.12.035
Article Google Scholar
Ichikawa D, Saito T, Ujita W, Oyama H (2016) How can machine-learning methods assist in virtual screening for hyperuricemia? A healthcare machine-learning approach. J Biomed Inform 64:20–24. https://doi.org/10.1016/j.jbi.2016.09.012
Article Google Scholar
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: Synthetic minority over-sampling technique. J Artif Intell Res 16:321–357. https://doi.org/10.1613/jair.953
Article MATH Google Scholar
Sakr S, Elshawi R, Ahmed AM, Qureshi WT, Brawner CA, Keteyian SJ, Blaha MJ, Al-Mallah MH (2017) Comparison of machine learning techniques to predict all-cause mortality using fitness data: the Henry Ford Exercise Testing (FIT) project. BMC Med Inform Decis Mak 17:174. https://doi.org/10.1186/s12911-017-0566-6
Article Google Scholar
Wu Y, Fang Y (2020) Stroke prediction with machine learning methods among older Chinese. Int J Environ Res Public Health 17:1828. https://doi.org/10.3390/ijerph17061828
Article Google Scholar
Cao Y, Fang X, Ottosson J, Näslund E, Stenberg E (2019) A comparative study of machine learning algorithms in predicting severe complications after bariatric surgery. J Clin Med 8:668. https://doi.org/10.3390/jcm8050668
Article Google Scholar
Yijing L, Haixiang G, Xiao L, Yanan L, Jinling L (2016) Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data. Knowl-Based Syst 94:88–104. https://doi.org/10.1016/j.knosys.2015.11.013
Article Google Scholar
Silva KD, Jönsson D, Demmer RT (2020) A combined strategy of feature selection and machine learning to identify predictors of prediabetes. J Am Med Inform Assoc 27:396–406. https://doi.org/10.1093/jamia/ocz204
Article Google Scholar
Ali L, Zhu C, Zhou M, Liu Y (2019) Early diagnosis of Parkinson’s disease from multiple voice recordings by simultaneous sample and feature selection. Expert Syst Appl 137:22–28. https://doi.org/10.1016/j.eswa.2019.06.052
Article Google Scholar
López V, Fernández A, Moreno-Torres GJ, Herrera F (2012) Analysis of pre- processing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics. Expert Syst Appl 39:6585–6608. https://doi.org/10.1016/j.eswa.2011.12.043
Article Google Scholar
Nakamura MM, Toomey SL, Zaslavsky AM, Petty CR, Lin C, Savova GK, Rose S, Brittan MS, Lin JL, Bryant MC, Ashrafzadeh S, Schuster MA (2019) Potential impact of initial clinical data on adjustment of pediatric readmission rates. Acad Pediatr 19:589–598. https://doi.org/10.1016/j.acap.2018.09.006
Article Google Scholar
Chen C, Liaw A, Breiman L (2004) Using random forest to learn imbalanced data. Technical Report 666, Statistics Department, University of California at Berkeley. https://statistics.berkeley.edu/tech-reports/666. Accessed 20 Jul 2022
Chen J, Lalor J, Liu W, Druhl E, Granillo E, Vimalananda VG, Yu H (2019) Detecting hypoglycemia incidents reported in patients’ secure messages: using cost-sensitive learning and oversampling to reduce data imbalance. J Med Internet Res 21:e11990. https://doi.org/10.2196/11990
Article Google Scholar
Zhu B, Baesens B, VandenBroucke SK (2017) An empirical comparison of techniques for the class imbalance problem in churn prediction. Inf Sci 408:84–99. https://doi.org/10.1016/j.ins.2017.04.015
Article Google Scholar
Branion-Calles MC, Nelson TA, Henderson SB (2016) A geospatial approach to the prediction of indoor radon vulnerability in British Columbia, Canada. J Expo Sci Environ Epidemiol 26:554–565. https://doi.org/10.1038/jes.2015.20
Article Google Scholar
Carcillo F, Borgne YL, Caelen O, Kessaci Y, Oblé F, Bontempi G (2021) Combining unsupervised and supervised learning in credit card fraud detection. Inf Sci 555:317–331. https://doi.org/10.1016/j.ins.2019.05.042
Article MathSciNet Google Scholar
Kiely DG, Doyle O, Drage E, Jenner H, Salvatelli V, Daniels FA, Rigg J, Schmitt C, Samyshkin Y, Lawrie A, Bergemann R (2019) Utilising artificial intelligence to determine patients at risk of a rare disease: idiopathic pulmonary arterial hypertension. Pulm Circ 9:2045894019890549. https://doi.org/10.1177/2045894019890549
Article Google Scholar
Raj R, Luostarinen T, Pursiainen E, Posti JP, Takala RSK, Bendel S, Konttila T, Korja M (2019) Machine learning-based dynamic mortality prediction after traumatic brain injury. Sci Rep 9:17672. https://doi.org/10.1038/s41598-019-53889-6
Article Google Scholar
Fulton LV, Dolezel D, Harrop J, Yan Y, Fulton CP (2019) Classification of Alzheimer’s disease with and without imagery using gradient boosted machines and ResNet-50. Brain Sci 9:212. https://doi.org/10.3390/brainsci9090212
Article Google Scholar
Mouzan ME, Korolev KS, Mofarreh MA, Menon R, Winter HS, Sarkhy AA, Dowd SE, Barrag AM, Assiri A (2018) Fungal dysbiosis predicts the diagnosis of pediatric Crohn’s disease. World J Gastroenterol 24:4510–4516. https://doi.org/10.3748/wjg.v24.i39.4510
Article Google Scholar
Decruyenaere A, Decruyenaere P, Peeters P, Vermassen F, Dhaene T, Couckuyt I (2015) Prediction of delayed graft function after kidney transplantation: comparison between logistic regression and machine learning methods. BMC Med Inform Decis Mak 15:83. https://doi.org/10.1186/s12911-015-0206-y
Article Google Scholar
Weiss AJ, Elixhauser A, Bae J, Encinosa W (2011) Origin of adverse drug events in US hospitals: Healthcare Cost and Utilization Project (HCUP) Statistical Briefs [Internet]. https://www.ncbi.nlm.nih.gov/books/NBK169247/. Accessed 20 Jul 2022
Jiménez Muñoz AB, MartínezMondéjar B, MuiñoMiguez A, Romero Ayuso D, SaizLadera GM, CriadoÁlvarez JJ (2019) Errores de prescripción, trascripción y administración según grupo farmacológico en el ámbito hospitalario [Errors of prescription, transcription and administration according to pharmacological group at hospital] (in Spanish). Rev Esp Salud Publica 93:e201901004 (Spanish)
Google Scholar
Magal P, Spiller HA, Casavant MJ, Chounthirath T, Hodges NL, Smith GA (2017) Non-health care facility medication errors associated with hormones and hormone antagonists in the United States. J Med Toxicol 13:293–302. https://doi.org/10.1007/s13181-017-0630-8
Article Google Scholar
Akbani R, Kwek S, Japkowicz N (2004) “Applying support vector machines to imbalanced datasets,” in Proceedings of the 15th European Conference on Machine Learning, pp.39–50, Pisa, Italy. https://doi.org/10.1007/978-3-540-30115-8_7
Sain H, Purnami SW (2015) Combine sampling support vector machine for imbalanced data classification. Proc Comput Sci 72:59–66. https://doi.org/10.1016/j.procs.2015.12.105
Article Google Scholar
Schiff GD, Volk LA, Volodarskaya M, Williams DH, Walsh L, Myers SG, Bates DW, Rozenblum R (2017) Screening for medication errors using an outlier detection system. J Am Med Inform Assoc 24:281–287. https://doi.org/10.1093/jamia/ocw171
Article Google Scholar
Segal G, Segev A, Brom A, Lifshitz Y, Wasserstrum Y, Zimlichman E (2019) Reducing drug prescription errors and adverse drug events by application of a probabilistic, machine-learning based clinical decision support system in an inpatient setting. J Am Med Inform Assoc 26:1560–1565. https://doi.org/10.1093/jamia/ocz135
Article Google Scholar
Rozenblum R, Rodriguez-Monguio R, Volk LA, Forsythe KJ, Myers S, McGurrin M, Williams DH, Bates DW, Schiff G, Seoane-Vazquez E (2020) Using a machine learning system to identify and prevent medication prescribing errors: a clinical and cost analysis evaluation. Jt Comm J Qual Patient Saf 46:3–10. https://doi.org/10.1016/j.jcjq.2019.09.008
Article Google Scholar
Corny J, Rajkumar A, Martin O, Dode X, Lajonchère J, Billuart O, Bézie Y, Buronfosse A (2020) A machine learning-based clinical decision support system to identify prescriptions with a high risk of medication error. J Am Med Inform Assoc 27:1688–1694. https://doi.org/10.1093/jamia/ocaa154
Article Google Scholar
Hogue SC, Chen F, Brassard G, Lebel D, Bussières J, Durand A, Thibault M (2021) Pharmacists’ perceptions of a machine learning model for the identification of atypical medication orders. J Am Med Inform Assoc 28:1712–1718. https://doi.org/10.1093/jamia/ocab071
Article Google Scholar
Boussadi A, Caruba T, Karras A, Berdot S, Degoulet P, Durieux P, Sabatier B (2013) Validity of a clinical decision rule-based alert system for drug dose adjustment in patients with renal failure intended to improve pharmacists’ analysis of medication orders in hospitals. Int J Med Inform 82:964–972. https://doi.org/10.1016/j.ijmedinf.2013.06.006
Article Google Scholar

Download references

Funding

This work was supported by Grant for Research Project of the Japanese Society of Drug Informatics in 2018.

Author information

Authors and Affiliations

Department of Pharmacy, Obihiro Kosei General Hospital, Minami 10-chome, Nishi 14-jo, Hokkaido, 080-0024, Obihiro city, Japan
Hiroyasu Sato
Department of Pharmacy, Soka Municipal Hospital, Soka, Saitama, Japan
Yoshinobu Kimura
Department of Pharmacy, Isehara Kyodo Hospital, Isehara, Kanagawa, Japan
Masahiro Ohba
Department of Pharmacy, National Hospital Organization Shinshu Ueda Medical Center, Ueda, Nagano, Japan
Yoshiaki Ara
Department of Pharmacy, Kyorin University Hospital, Mitaka, Tokyo, Japan
Susumu Wakabayashi
Department of Pharmacy, Obihiro Daiichi Hospital, Obihiro, Hokkaido, Japan
Hiroaki Watanabe

Authors

Hiroyasu Sato
View author publications
You can also search for this author in PubMed Google Scholar
Yoshinobu Kimura
View author publications
You can also search for this author in PubMed Google Scholar
Masahiro Ohba
View author publications
You can also search for this author in PubMed Google Scholar
Yoshiaki Ara
View author publications
You can also search for this author in PubMed Google Scholar
Susumu Wakabayashi
View author publications
You can also search for this author in PubMed Google Scholar
Hiroaki Watanabe
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Hiroyasu S.: Conception and design of the study, data collection, development of the machine learning models, and drafting of the article.

Yoshinobu K.: Data cleansing and collection of variables.

Masahiro O.: Data cleansing and clustering of clinical department.

Yoshiaki A.: Development of the machine learning models and validation of the Python programing.

Susumu W.: Advice on the conception of the study and interpretation and discussion of results.

Hiroaki W.: Organization and coordination of the trial and final approval of the article.

All authors approved the manuscript to be published and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Corresponding author

Correspondence to Hiroyasu Sato.

Ethics declarations

Ethics Approval

This study was approved by the Institutional Review Board in Obihiro Kosei general Hospital (No. 2017–032).

Competing Interests

The authors declare no competing interests.

Additional information

Employment

All authors have no present or anticipated employment by an organization that may gain or lose financially through the publication of this manuscript.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 28 KB)

Supplementary file2 (PDF 18 KB)

Supplementary file3 (PDF 949 KB)

Supplementary file4 (PDF 158 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Sato, H., Kimura, Y., Ohba, M. et al. Prediction of Prednisolone Dose Correction Using Machine Learning. J Healthc Inform Res 7, 84–103 (2023). https://doi.org/10.1007/s41666-023-00128-3

Download citation

Received: 21 June 2022
Revised: 20 November 2022
Accepted: 03 February 2023
Published: 15 February 2023
Issue Date: March 2023
DOI: https://doi.org/10.1007/s41666-023-00128-3

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Prediction of Prednisolone Dose Correction Using Machine Learning

Abstract

Similar content being viewed by others

Predicting the need for a reduced drug dose, at first prescription

Electronic Prescription Service for Improved Healthcare Delivery

Detecting diseases in medical prescriptions using data mining methods

1 Introduction

2 Related Work

3 Methods

3.1 Database and Features

3.2 Standardization of Clinical Departments

3.3 Outcome and Variates

3.4 Data Preprocessing

3.5 Model Development

3.6 Model Performance

3.7 Internal Validation

3.8 Statistical Analysis

4 Results

5 Discussion

Data Availability

References

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics Approval

Competing Interests

Additional information

Employment

Publisher's Note

Supplementary Information

Supplementary file1 (PDF 28 KB)

Supplementary file2 (PDF 18 KB)

Supplementary file3 (PDF 949 KB)

Supplementary file4 (PDF 158 KB)

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation