Accept/decline decision module for the liver simulated allocation model


Abstract

Simulated allocation models (SAMs) are used to evaluate organ allocation policies. An important component of SAMs is a module that decides whether each potential recipient will accept an offered organ. The objective of this study was to develop and test accept-or-decline classifiers based on several machine-learning methods in an effort to improve the SAM for liver allocation. Feature selection and imbalance correction methods were tested, and the best approaches were identified for application to organ transplant data. Then, we used 2011 liver match-run data to compare classifiers based on logistic regression, support vector machines, boosting, classification and regression trees, and Random Forests. Finally, because the accept-or-decline module will be embedded in a simulation model, we also developed an evaluation tool for comparing the performance of predictors, which we call sample-path accuracy. The Random Forest method resulted in the smallest overall error rate, and boosting techniques had greater accuracy when both sensitivity and specificity were simultaneously considered important. Our comparisons show that no method dominates all others on all performance measures of interest. A logistic regression-based classifier is easy to implement and allows for pinpointing the contribution of each feature toward the probability of acceptance. The other methods we tested did not have a similar interpretation. The Scientific Registry of Transplant Recipients decided to use the logistic regression-based accept-decline decision module in the next generation of the liver SAM.


References

  1. Ahn JH, Hornberger JC (1996) Involving patients in the cadaveric kidney transplant allocation process: a decision-theoretic perspective. Manag Sci 42(5):629–641

  2. Alagoz O, Maillart LM, Schaefer AJ, Roberts MS (2007) Determining the acceptance of cadaveric livers using an implicit model of the waiting list. Oper Res 55(1):24–36

  3. Bertsimas D, Chang A (2012) ORC: ordered rules for classification. A discrete optimization approach to associative classification. Research working paper

  4. Bishop CM (2006) Pattern recognition and machine learning. Springer

  5. Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  6. Campbell C, Ying Y (2011) Learning with support vector machines. Morgan & Claypool Publishers

  7. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357

  8. Cover TM, Thomas JA (2012) Elements of information theory. Wiley-Interscience

  9. David I, Yechiali U (1985) A time-dependent stopping problem with application to live organ transplants. Oper Res 33(3):491–504

  10. Duda RO, Hart PE, Stork DG (2012) Pattern classification. Wiley-Interscience

  11. Gentry S, Chow E, Wickliffe C, Massie A, Leighton T, Snyder JJ, Israni AK, Kasiske BL, Segev D (2013) Relationship of cold ischemia time to estimated transportation time for liver allografts. Research working paper

  12. Harper AM, Taranto SE, Edwards EB, Daily OP (2000) An update on a successful simulation project: the UNOS liver allocation model. In: Proceedings of the Winter Simulation Conference, pp 1955–1962

  13. Howard DH (2002) Why do transplant surgeons turn down organs? A model of the accept/reject decision. J Health Econ 21(6):957–969

  14. Jolliffe I (1982) A note on the use of principal components in regression. J R Stat Soc Ser C (Appl Stat) 31(3):300–303

  15. Kira K, Rendell LA (1992) A practical approach to feature selection. In: Proceedings of the ninth international workshop on machine learning. Morgan Kaufmann Publishers Inc., pp 249–256

  16. Kreke J, Schaefer AJ, Angus DC, Bryce CL, Roberts MS (2002) Incorporating biology into discrete event simulation models of organ allocation. In: Proceedings of the Winter Simulation Conference

  17. Kubat M, Matwin S (1997) Addressing the curse of imbalanced training sets: one-sided selection. In: Proceedings of the fourteenth international conference on machine learning

  18. Leppke S, Leighton T, Zaun D, Chen SC, Skeans M, Israni AK, Snyder JJ, Kasiske BL (2013) Scientific registry of transplant recipients: collecting, analyzing, and reporting data on transplantation in the United States. Transplant Rev 27:50–56

  19. Neter J, Kutner MH, Nachtsheim CJ, Wasserman W (1996) Applied linear statistical models. McGraw-Hill/Irwin

  20. Pritsker AAB, Daily OP, Pritsker KD (1996) Using simulation to craft a national organ transplantation policy. In: Proceedings of the Winter Simulation Conference. ACM Press, New York, pp 1163–1169

  21. Ratcliffe J, Young T, Buxton M, Eldabi T, Paul R, Burroughs A, Papatheodoridis G, Rolles K (2001) A simulation modelling approach to evaluating alternative policies for the management of the waiting list for liver transplantation. Health Care Manag Sci 4(2):117–124

  22. Sandikci B, Maillart LM, Schaefer AJ, Alagoz O, Roberts MS (2008) Estimating the patient's price of privacy in liver transplantation. Oper Res 56(6):1393–1410

  23. Sandikci B, Maillart LM, Schaefer AJ, Roberts MS (2011) Alleviating the patient's price of privacy through a partially observable waiting list. SSRN Electronic Journal

  24. Schapire RE, Freund Y (2012) Boosting: foundations and algorithms. MIT Press

  25. Shechter SM (2005) A clinically based discrete-event simulation of end-stage liver disease and the organ allocation process. Med Decis Making 25(2):199–209

  26. Su X, Zenios SA (2005) Patient choice in kidney allocation: a sequential stochastic assignment model. Oper Res 53(3):443–455

  27. Su XM, Zenios SA, Chertow GM (2004) Incorporating recipient choice in kidney transplantation. J Am Soc Nephrol 15(6):1656–1663

  28. Tang Y, Zhang YQ, Chawla NV, Krasser S (2009) SVMs modeling for highly imbalanced classification. IEEE Trans Syst Man Cybern B (Cybern) 39(1):281–288

  29. Thompson D, Waisanen L, Wolfe R, Merion RM, McCullough K, Rodgers A (2004) Simulating the allocation of organs for transplantation. Health Care Manag Sci 7(4):331–338

  30. Tomek I (1976) Two modifications of CNN. IEEE Trans Syst Man Cybern 6:769–772

  31. Veropoulos K, Campbell C, Cristianini N (1999) Controlling the sensitivity of support vector machines. In: Proceedings of the international joint conference on AI, pp 55–60

  32. Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34

  33. Wold S, Sjöström M, Eriksson L (2001) PLS-regression: a basic tool of chemometrics. Chemometr Intell Lab Syst 58(2):109–130

  34. Yu L, Liu H (2003) Feature selection for high-dimensional data: a fast correlation-based filter solution. In: Proceedings of the twentieth international conference on machine learning, pp 856–863

  35. Zhao Z, Morstatter F, Sharma S, Alelyani S, Anand A, Liu H (2010) Advancing feature selection research. ASU feature selection repository


Acknowledgements

This work was conducted under the auspices of the Minneapolis Medical Research Foundation, contractor for the Scientific Registry of Transplant Recipients (SRTR), as a deliverable under contract no. HHSH250201000018C (US Department of Health and Human Services, Health Resources and Services Administration, Healthcare Systems Bureau, Division of Transplantation). As a US Government-sponsored work, there are no restrictions on its use. The views expressed herein are those of the authors and not necessarily those of the US Government. The authors thank University of Minnesota colleague Zhi Zhang for assistance with data preparation and interpretation, and SRTR colleagues Jon J. Snyder for valuable contributions to the study; David Schladt, Sally Gustafson, Eugene Shteyn, and Xinyue Wang for assistance with data preparation and interpretation; Susan Leppke for project assistance; Delaney Berrini for manuscript preparation; and Nan Booth for manuscript editing.

Author information

Correspondence to Diwakar Gupta.

Appendix

In this appendix, we explain the methods used in our paper for binary classification; these techniques are also applicable to multi-class problems. In addition, we provide cross-validation results on the training sample and performance comparisons on the test data in Tables 6, 7, 8, 9, 10 and 11. Table 12 shows sample-path SMSE comparisons for different performance criteria. Tables 13 and 15 list the top 50 features selected by the Information Gain feature selection method for 1A and not-1A candidates, respectively. These tables are provided in support of the abridged results presented in the paper. We begin by describing the data coding procedures used in this study.

All potentially relevant attributes associated with each donor-candidate pair are called features. We coded categorical variables with several categories into sets of binary variables; e.g., race, with 8 basic categories and 3 additional options for missing, multi-racial, and unknown, was coded into 11 0-1 variables. This resulted in an original data set of 554 features from three sources: match-run, candidate, and donor features. We eliminated features that were either empty or not known to the candidates at the time of receiving an offer. For example, only lab values received before the offer date were retained. We also removed certain features that could lead to over-fitting to the existing allocation policy. For example, a candidate's designation as local, regional, or national relative to the donor OPO was not considered. Similarly, transplant center (TxC) indices were dropped. We instead created several variables that could explain TxC-specific strategies. In particular, we calculated and included the number of TxCs within 100, 200, and 300 miles of the donor hospital to capture the effect of competition among TxCs. We also included travel times between the donor hospital and the TxC where each candidate was located [11].

We coded certain features differently from how they appeared in the original data. For example, surgeons indicated that the donor and candidate weight, height, and age ratios were important to them in making accept or decline decisions. Therefore, we created features that consisted of these ratios and removed redundant features.
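As an illustration of these coding steps, the MATLAB sketch below one-hot codes a categorical variable and builds a donor-to-candidate ratio feature. The variable names and values (race, donorWeight, candWeight) are made up for illustration; this is not the data-preparation code used for the study.

```matlab
% Minimal coding sketch with illustrative values, not SRTR data.
race = categorical({'White'; 'Black'; 'Asian'; 'Unknown'; 'White'});
raceDummies = dummyvar(race);             % one 0-1 column per race category

donorWeight = [82; 70; 95];               % kg, made-up values
candWeight  = [75; 88; 60];
weightRatio = donorWeight ./ candWeight;  % ratio feature replaces the raw pair
```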

After the above data manipulation steps, our data set consists of 3,140 livers transplanted from deceased donors, with n = 59,510 valid observations (donor-patient pairs with Y/N responses) from the 2011 match-run data. Each observation comprises a vector \(x=\{x_{1},x_{2},\ldots,x_{k}\}\) of donor and candidate characteristics and a label y that is either 1 (for accept) or -1 (for decline). The parameter k = 376 denotes the number of relevant characteristics. Throughout this appendix, we use \((x_{i},y_{i})\) to denote the ith observation, and \(x_{ij}\) to denote the value of the jth feature in the ith observation. The four sections of this appendix provide, respectively, the mathematical underpinnings of the feature selection methods, the imbalance correction method, information about the MATLAB implementation of the different methods, and the cross-validation and test results.

1.1 Appendix A: Feature selection

As described in the paper, we focus on filter methods [35]. In what follows, we describe each of the four methods we compared in the paper.

1.1.1 A.1 Fisher score

Fisher score measures the dissimilarity of feature values across different classes. Because it treats features one at a time, it cannot account for redundancy among features [10]. It calculates a score for each feature; features with higher scores are preferred because a higher score implies greater differences in feature values across the label classes. For a feature \(f_{j}\), its Fisher score \(FS(f_{j})\) is calculated as follows:

$$ FS(f_{j}) = {{{\sum}^{2}_{z=1} n_{z} (\mu_{j,z}-\mu_{j})^{2}} \over {{\sum}^{2}_{z=1} n_{z} \sigma^{2}_{j,z}}}, \qquad \text{where} $$
(3)

  • \(\mu_{j}\): the mean value of feature \(f_{j}\) over all classes,

  • \(n_{z}\): the number of samples in the zth class,

  • \(\mu_{j,z}\): the mean value of \(f_{j}\) within class z, and

  • \(\sigma^{2}_{j,z}\): the variance of \(f_{j}\) values within class z.
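To make the formula concrete, the MATLAB sketch below evaluates Eq. (3) for every feature at once, where X is an n-by-k feature matrix and y a vector of +1/-1 labels. It is an illustration of the formula under those assumptions, not the fsFisher.m routine referenced in Appendix C.

```matlab
function FS = fisherScore(X, y)
% Fisher score of Eq. (3) for each column of X; y contains labels -1 / +1.
    classes = [-1, 1];
    mu  = mean(X, 1);                     % overall mean of each feature
    num = zeros(1, size(X, 2));
    den = zeros(1, size(X, 2));
    for z = 1:numel(classes)
        Xz  = X(y == classes(z), :);
        nz  = size(Xz, 1);
        num = num + nz * (mean(Xz, 1) - mu).^2;
        den = den + nz * var(Xz, 1, 1);   % within-class variance (1/n form)
    end
    FS = num ./ den;                      % larger score = more discriminative
end
```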

1.1.2 A.2 Relief

Relief is a weighting method in which the weights are determined by measuring distances between the value of a feature for a particular observation and its values for the nearest observations of the same and of a different class [15]. The feature that provides the widest overall gap earns the highest weight. In particular, the score of a feature is calculated as follows:

$$ RS(f_{j}) = {1 \over 2} {\sum}^{n}_{i=1} \left[\text{diff}\left(x_{ij},f_{NM(x_{ij})}\right)-\text{diff}\left(x_{ij},f_{NH(x_{ij})}\right)\right], $$
(4)

where \(x_{ij}\) is the value of the jth feature of the ith instance \(x_{i}\), and \(f_{NH(x_{ij})}\) and \(f_{NM(x_{ij})}\) are the values of the jth feature for the nearest points to \(x_{i}\) with the same and a different class label, respectively [35]. The function diff(⋅,⋅) is the distance measure defined as follows. When u and v are nominal variables,

$$\begin{array}{@{}rcl@{}} \text{diff}(u,v) = \left\{\begin{array}{l} 0~\text{if}~u = v\\ 1~\text{otherwise.}\end{array}\right. \end{array} $$
(5)

When u and v are numerical variables,

$$ \text{diff}(u,v) = {|u-v| \over N_{u}} , $$
(6)

where \(N_{u}\) is an appropriately sized normalization constant that makes the difference lie in the interval [0,1].
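A simplified MATLAB sketch of Eq. (4) for numerical features follows; it rescales each feature to [0,1] so that diff(⋅,⋅) matches Eq. (6) and uses Manhattan distance to find the nearest hit and miss. This is an illustration only (it assumes a recent MATLAB release with implicit expansion); the fsReliefF.m code used in the study also handles nominal features and multiple neighbors.

```matlab
function RS = reliefScore(X, y)
% Simplified Relief of Eq. (4) for numerical features; y contains -1 / +1.
    [n, k] = size(X);
    rangeX = max(max(X, [], 1) - min(X, [], 1), eps);
    Xn = (X - min(X, [], 1)) ./ rangeX;      % diff() values now lie in [0,1]
    RS = zeros(1, k);
    for i = 1:n
        d = sum(abs(Xn - Xn(i, :)), 2);      % distance to every other point
        d(i) = inf;
        dHit  = d;  dHit(y ~= y(i))  = inf;  % restrict to same-class points
        dMiss = d;  dMiss(y == y(i)) = inf;  % restrict to other-class points
        [~, hit]  = min(dHit);
        [~, miss] = min(dMiss);
        RS = RS + abs(Xn(i, :) - Xn(miss, :)) - abs(Xn(i, :) - Xn(hit, :));
    end
    RS = RS / 2;                             % the 1/2 factor in Eq. (4)
end
```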

1.1.3 A.3 Information Gain (IG)

Information Gain measures the dependency between a feature and the class label [8]. It is widely used because it is simple to calculate and easy to interpret. To explain Information Gain, we first define the concept of entropy. The entropy R(U) of a random variable \(U \in \mathcal{U}\) with probability mass function p(u) is defined by

$$ R(U)=-\sum\limits_{u\in \mathcal{U}} p(u)\log_{2} p(u). $$
(7)

Entropy R(U) is a measure of the uncertainty of the random variable U. Similarly, the conditional entropy R(U|V) of U given a random variable \(V \in \mathcal{V}\) is defined as

$$\begin{array}{@{}rcl@{}} R(U|V) &=& \sum\limits_{v\in \mathcal{V}} p(v)R(U|V=v) \end{array} $$
(8)
$$\begin{array}{@{}rcl@{}} &=& - \sum\limits_{v\in \mathcal{V}}p(v) \sum\limits_{u \in \mathcal{U}} p(u|v)\log_{2} p(u|v) \end{array} $$
(9)
$$\begin{array}{@{}rcl@{}} &=&-\sum\limits_{u\in \mathcal{U}}\sum\limits_{v \in \mathcal{V}}p(u,v)\log_{2} p(u|v) \end{array} $$
(10)
$$\begin{array}{@{}rcl@{}} &=&-\mathbf{E} \log_{2} p(U|V). \end{array} $$
(11)

With these definitions in hand, the Information Gain (IG) of feature f j for class label Y is calculated as follows:

$$ IG(X_{f_{j}},Y) = R(X_{f_{j}})-R(X_{f_{j}}|Y) $$
(12)

For each label class, features are ranked in terms of Information Gain. Similar to Fisher score, IG considers each feature one at a time. Therefore, it cannot eliminate redundant features.
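The sketch below computes Eq. (12) for a single discrete (or already discretized) feature, with the entropies of Eq. (7) estimated from empirical frequencies. It is illustrative only and is not the fsInfoGain.m routine listed in Appendix C; the helper name empEntropy is an assumption of this sketch.

```matlab
function ig = infoGain(xj, y)
% Information Gain of Eq. (12): IG = R(X_fj) - R(X_fj | Y).
    ig = empEntropy(xj);
    labels = unique(y);
    for c = 1:numel(labels)
        idx = (y == labels(c));
        ig  = ig - mean(idx) * empEntropy(xj(idx));  % subtract p(y) R(X_fj | Y=y)
    end
end

function H = empEntropy(v)
% Empirical entropy, Eq. (7), of a discrete-valued vector v.
    [~, ~, g] = unique(v);
    p = accumarray(g, 1) / numel(v);      % empirical pmf over observed values
    H = -sum(p .* log2(p));
end
```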

1.1.4 A.4 Correlation Based Filter (CBF)

This method uses the correlations between features and class labels, and between pairs of features, to eliminate redundant features [34]. There are several ways to measure these correlations. We present Fast CBF, which we used in the paper. In this method, symmetrical uncertainty (SU) is employed to measure correlation. SU is defined as follows:

$$ SU(\mathbf{u},\mathbf{v}) = 2 \left[ {{IG(\mathbf{u},\mathbf{v})} \over {R(\mathbf{u})+R(\mathbf{v})}} \right] $$
(13)

where R(u) and IG(u,v) are defined in (7) and (12), respectively.

The algorithm used by the CBF method is presented in Table 5. At the beginning, CBF chooses a set of features according to the SU between each feature and the class labels. If the SU between a feature and the labels is higher than a predefined threshold (δ), the feature is selected as a predominant feature, i.e., it belongs to the set \(S_{list}\). All features that are not selected as predominant features are discarded. Then, the feature with the highest SU with the class label becomes the base feature \(f_{b}\). If the SU between a feature \(f_{q}\) and the base feature \(f_{b}\) is higher than the SU between \(f_{q}\) and the class label, then \(f_{q}\) is dropped from the list. The next-highest feature in the predominant list then becomes the base feature, and the process is repeated until no additional features can be dropped (a MATLAB sketch of this loop appears after Table 5). All other methods described in this appendix provide a relative rank of each feature, leaving to the analyst the task of selecting which features to use for tuning classifiers. In contrast, CBF selects a certain number of features by itself.

Table 5 The CBF algorithm
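As a companion to Table 5, the MATLAB sketch below implements the Fast CBF loop described above. It assumes a helper symmetricalUncertainty(u, v) returning the SU of Eq. (13) (which could be built from the entropy and Information Gain sketches above); the helper name and the overall structure are illustrative, not the fsFCBF.m code.

```matlab
function keep = fastCBF(X, y, delta)
% Fast CBF sketch: keep predominant features, then drop redundant ones.
    k   = size(X, 2);
    suy = zeros(1, k);
    for j = 1:k
        suy(j) = symmetricalUncertainty(X(:, j), y);  % SU(feature, label)
    end
    keep = find(suy >= delta);                 % predominant features (S_list)
    [~, order] = sort(suy(keep), 'descend');
    keep = keep(order);
    b = 1;
    while b <= numel(keep)
        fb = keep(b);
        drop = false(size(keep));
        for q = b + 1:numel(keep)
            fq = keep(q);
            % drop f_q if it is more correlated with f_b than with the label
            drop(q) = symmetricalUncertainty(X(:, fq), X(:, fb)) >= suy(fq);
        end
        keep(drop) = [];
        b = b + 1;
    end
end
```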

1.2 Appendix B: Treating imbalanced data: under-sampling

Because we have many more observations with the N label, we use under-sampling to balance the number of observations in each label class. It is possible to select at random which N-labeled observations to keep in the training data. However, there are better methods that select important observations; we describe one such approach [17]. In this approach, we remove two kinds of observations: redundant observations and Tomek links. Redundant observations do not hurt the classifier, but they increase the classification error cost associated with the N-class labels. In contrast, Tomek links are observations located near the class boundary; see Fig. 6 for a graphical representation. These observations do affect the classifier, and their noisiness can lower its accuracy if they are not removed.

Fig. 6 Under-sampling: observations in the two envelopes are removed

Redundant observations are identified using the One Nearest Neighbor (1-NN) method, one of the simplest classification methods: it finds the nearest labeled observation to the observation being classified and assigns it that neighbor's label. For 1-NN, distance is calculated with the Value Difference Metric (VDM), shown below [32].

$$\begin{array}{@{}rcl@{}} \text{VDM}_{f_{i}}(u,v)=\sum\limits^{2}_{c=1}\left| \frac{N_{{f_{i}},u,c}}{N_{{f_{i}},u}}-\frac{N_{{f_{i}},v,c}}{N_{{f_{i}},v}}\right|^{q}\\ = \sum\limits^{2}_{c=1}\left|P_{{f_{i}},u,c}-P_{{f_{i}},v,c} \right|^{q}, \end{array} $$
(14)

where

  • \(N_{{f_{i}},u}\): the number of instances in the training set (T) that have value u for feature f i ,

  • \(N_{{f_{i}},u,c}\): the number of instances in T that have value u for feature f i and class c,

  • c: output class. c∈{1,2}, for binary classification.

  • q: a constant, usually 1 or 2, and

  • \(P_{{f_{i}},u,c}\): conditional probability that the output class is c given that the attribute f i has value u, \(P_{{f_{i}},u,c} =P(c|u,{f_{i}})\).

Finally, \(P_{{f_{i}},u,c}\) is defined as

$$ P_{{f_{i}},u,c}=\frac{N_{{f_{i}},u,c}}{N_{{f_{i}},u}}, $$
(15)

and \(N_{{f_{i}},u}\) is the sum of \(N_{{f_{i}},u,c}\) over all classes. That is,

$$ N_{{f_{i}},u} = \sum\limits^{C}_{c=1}N_{{f_{i}},u,c}. $$
(16)
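For concreteness, the sketch below evaluates Eqs. (14) and (15) for a single nominal feature with q = 1, where xi is the column of training values for feature f_i, y holds the class labels, and u and v are the two feature values being compared. The function name vdm and its interface are assumptions made for this illustration.

```matlab
function d = vdm(xi, y, u, v)
% VDM of Eq. (14) with q = 1 for one nominal feature; y contains -1 / +1.
    d = 0;
    for c = [-1, 1]
        Pu = sum(xi == u & y == c) / max(sum(xi == u), 1);  % P(c | u, f_i)
        Pv = sum(xi == v & y == c) / max(sum(xi == v), 1);  % P(c | v, f_i)
        d  = d + abs(Pu - Pv);
    end
end
```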

Tomek links are defined as follows. Let \(\delta(x_{i},x_{j})\) represent the distance between observations \(x_{i}\) and \(x_{j}\) with different class labels. The pair \((x_{i},x_{j})\) is called a Tomek link if there is no observation \(x_{k}\) such that \(\delta(x_{i},x_{k})<\delta(x_{i},x_{j})\) and \(\delta(x_{k},x_{j})<\delta(x_{i},x_{j})\) [30].

With these definitions in hand, the under-sampling algorithm has the following steps: (i) create a subset S containing all the Y-labeled observations, (ii) select one N-labeled observation at random and classify it by 1-NN using S, (iii) if it is misclassified, add it to set S; otherwise remove it from further consideration, (iv) repeat the procedure by selecting each N-labeled observation one by one until all such observations have been considered, and finally (v) among the data points in subset S, find Tomek-link observations and remove them. The resulting set S is the under-sampled training data set.
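A MATLAB sketch of steps (i)-(v) follows. To keep it short, squared Euclidean distance stands in for the VDM of Eq. (14), and a Tomek link is detected as a pair of opposite-class points that are each other's nearest neighbors among the retained observations; these simplifications, and the function name underSample, are assumptions of the sketch rather than the module's actual implementation.

```matlab
function S = underSample(X, y)
% One-sided selection sketch; y is +1 for Y-labeled and -1 for N-labeled rows.
    S = find(y == 1);                          % (i) keep every Y observation
    for i = reshape(find(y == -1), 1, [])      % (ii)-(iv) 1-NN consistency test
        d = sum((X(S, :) - X(i, :)).^2, 2);    % squared distances to current S
        [~, nn] = min(d);
        if y(S(nn)) ~= y(i)                    % misclassified by 1-NN: keep it
            S(end + 1) = i;                    %#ok<AGROW>
        end
    end
    % (v) remove Tomek links among the retained observations.
    D = squareform(pdist(X(S, :)));
    D(1:size(D, 1) + 1:end) = inf;             % ignore self-distances
    [~, nn] = min(D, [], 2);                   % nearest neighbor of each point
    isTomek = false(numel(S), 1);
    for i = 1:numel(S)
        j = nn(i);
        if nn(j) == i && y(S(i)) ~= y(S(j))    % mutual NN, different labels
            isTomek([i, j]) = true;
        end
    end
    S(isTomek) = [];                           % indices of the final training set
end
```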

1.3 Appendix C: MATLAB Code

The MATLAB code we used for feature selection is available from the Feature Selection Group's web site at Arizona State University (featureselection.asu.edu).

  • Fisher score: fsFisher.m

  • Relief: fsReliefF.m

  • CBF: fsFCBF.m

  • Information Gain: fsInfoGain.m

We used MATLAB (R2012b) functions from the Statistics Toolbox for the classifiers (details can be found at www.mathworks.com/products/statistics/description4.html); an illustrative usage sketch follows the list below.

  • Logistic Regression: glmfit.m with the 'binomial' option

  • SVM: svmtrain.m

  • Boosting: fitensemble.m with the 'AdaBoostM1' and 'GentleBoost' options

  • RF: treebagger.m

  • CART: classregtree.m
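For reference, a usage sketch of these toolbox calls on made-up data is shown below. The option values are illustrative rather than the settings tuned in the study, and svmtrain and classregtree refer to the R2012b-era interfaces listed above (both have since been superseded).

```matlab
% Illustrative calls only; X and y are toy data, not the SRTR match-run data.
X = randn(200, 5);                          % toy feature matrix
y = double(rand(200, 1) > 0.8);             % imbalanced 0/1 labels

bLogit  = glmfit(X, y, 'binomial');                           % logistic regression
svmMdl  = svmtrain(X, y);                                     % SVM, default kernel
adaMdl  = fitensemble(X, y, 'AdaBoostM1', 100, 'Tree');       % boosting
rfMdl   = TreeBagger(100, X, y, 'Method', 'classification');  % Random Forest
cartMdl = classregtree(X, y, 'method', 'classification');     % CART
```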

1.4 Appendix D: Cross-validation and test results

This Appendix contains cross-validation and test results, as well as the top features identified by the Information Gain feature selection method. Many machine-learning models have one or more control parameters, e.g., the kernel parameter σ for SVM. Before the final evaluation of a model, these control parameters need to be tuned. We divided the data into three groups: training (60 %), cross-validation (20 %), and test (20 %) data sets. We built each model on the training data, tuned its control parameters by intermediate evaluation on the cross-validation data, and performed the final evaluation on the test data set. Cross-validation helps us not only tune parameters but also prevent over-fitting.
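The split itself can be done with a random permutation, as in the sketch below (cvpartition from the Statistics Toolbox is an alternative); the variable names and the use of a single fixed split are illustrative assumptions.

```matlab
% 60/20/20 split of n observations into training, cross-validation, and test.
n   = size(X, 1);                  % X is the full feature matrix
idx = randperm(n);
nTr = round(0.60 * n);
nCv = round(0.20 * n);
trainIdx = idx(1 : nTr);
cvIdx    = idx(nTr + 1 : nTr + nCv);
testIdx  = idx(nTr + nCv + 1 : end);
```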

Table 6 Cross-validation results for 1A
Table 7 Cross-validation results for 1A - continued
Table 8 Cross-validation results for Not-1A
Table 9 Cross-validation results for Not-1A - continued

The tuning objective was to maximize the G-metric, except for RF-RUS (RF with random under-sampling).

Table 10 Performance: test data set with information gain for 1A

In the cross-validation and test result tables, the following abbreviations and measures are used (a sketch computing the error measures appears after the list):

  • CM: Classification Method

  • FS: Feature Selection Method

  • # of Features: the number of features we selected from the top of the list

  • Error rate: percent of observations that were incorrectly classified

  • Accuracy: percent of observations that were correctly classified

  • Sensitivity: Accuracy only within positive class

  • Specificity: Accuracy only within negative class

  • G-metric: the square root of the product of sensitivity and specificity
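A small MATLAB helper computing these measures from true labels y (+1/-1) and predicted labels yhat is sketched below; the function name perfMeasures is an assumption made for illustration.

```matlab
function [err, acc, sens, spec, gmetric] = perfMeasures(y, yhat)
% Error measures used in the tables; y and yhat contain labels +1 / -1.
    tp = sum(y ==  1 & yhat ==  1);
    tn = sum(y == -1 & yhat == -1);
    fp = sum(y == -1 & yhat ==  1);
    fn = sum(y ==  1 & yhat == -1);
    acc     = (tp + tn) / (tp + tn + fp + fn);
    err     = 1 - acc;                 % error rate
    sens    = tp / (tp + fn);          % accuracy within the positive class
    spec    = tn / (tn + fp);          % accuracy within the negative class
    gmetric = sqrt(sens * spec);       % G-metric
end
```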

Because CBF determines the number of features by itself, its lists contained only 12 and 8 features for 1A and Not-1A candidates, respectively. In all other cases, we obtained a rank-ordered list of all features. Information Gain resulted in the best G-metric for all methods in the cross-validation results. Therefore, we used the feature set provided by the Information Gain method to finalize the parameters of each classifier.

Table 11 Performance: test data set with information gain for Not-1A
Table 12 Comparisons of Sample Path SMSE
Table 13 Top 50 features for 1A candidates by information gain
Table 14 Correlation Coefficients among top 10 features for 1A Candidates by Information Gain
Table 15 Top 50 features for Not-1A candidates by information gain
Table 16 Correlation Coefficients among top 10 features for Not-1A Candidates by Information Gain


Cite this article

Kim, SP., Gupta, D., Israni, A.K. et al. Accept/decline decision module for the liver simulated allocation model. Health Care Manag Sci 18, 35–57 (2015). https://doi.org/10.1007/s10729-014-9295-x
