Minimum distance estimation of the binormal ROC curve

The receiver operating characteristic (ROC) curve describes the performance of a diagnostic test, which classifies individuals into one of two categories. Many parametric, semiparametric and nonparametric estimation methods have been proposed for estimating the ROC curve and its functionals. In this paper the minimum distance estimation of the binormal ROC curve is considered. A modification of the estimator considered in the paper of Davidov and Nov (J Stat Plan Inference 142(4):872–877, 2012) and some new estimators are proposed. We compare the accuracy of the new estimators with known minimum distance estimators of the binormal ROC curve and we conclude that our estimators generally perform better than their competitors.


Introduction
The receiver operating characteristic (ROC) curve is commonly used to describe the accuracy of a medical or other diagnostic test, which classifies individuals into "non-diseased" and "diseased" categories. It is defined as a plot of the true positive rate against the false positive rate, or sensitivity versus 1 − specificity, for various threshold values. Over the years, it has been widely applied in many fields including biosciences, data mining, experimental psychology, finance, geosciences, machine learning, medicine, radiology, sociology and others. For a comprehensive review of the literature, see Pepe (2003), Krzanowski and Hand (2009) and Gonçalves et al. (2014).
Correspondence: Alicja Jokiel-Rokita, alicja.jokiel-rokita@pwr.edu.pl
More precisely, let X and Y be the test results from a non-diseased population and a diseased population, respectively. Let F be a continuous cumulative distribution function (cdf) of the random variable X, and G a continuous cdf of the random variable Y. The ROC curve is defined as a plot of 1 − G(c) versus 1 − F(c) for −∞ ≤ c ≤ ∞, or equivalently as a plot of

$$ROC(t) = 1 - G(F^{-1}(1 - t)) \tag{1}$$

against t, for t ∈ [0, 1]. A special feature of the ROC curve is that it is invariant to any increasing transformation of the data, i.e. if X′ = h(X) and Y′ = h(Y) for some increasing transformation h, then the ROC curve corresponding to the distribution functions F and G is the same as the ROC curve corresponding to the distribution functions F′ and G′ of the random variables X′ and Y′, respectively.
In this paper we consider the problem of estimation of the ROC curve in the binormal model, i.e. we assume that after some increasing transformation h, the random variables X and Y are normally distributed. Without loss of generality we can assume that X ∼ N(0, 1) and Y ∼ N(μ, σ²). In this case the ROC curve has a simple parametric form

$$ROC(t) = \Phi\left(\frac{\mu + \Phi^{-1}(t)}{\sigma}\right), \quad t \in [0, 1], \tag{2}$$

where Φ is the cumulative distribution function of the standard normal distribution. Thus, in the binormal model, the estimation of the ROC curve reduces to the estimation of the parameters μ and σ. The most common arguments in favor of using the binormal estimator are presented in Hanley (1988). Swets (1986) and Hanley (1988, 1996) also argue that the binormal estimator is robust.
Many different techniques have been proposed to solve the problem of semiparametric estimation of the ROC curve. For estimating ROC curves from discrete or grouped response data, the most commonly used procedure is that proposed by Dorfman and Alf (1969). Metz et al. (1998) developed an algorithm called LABROC, which groups continuous data into a finite number of ordered categories and then uses the maximum likelihood algorithm from Dorfman and Alf (1969). Hsieh and Turnbull (1996) proposed a generalized least squares procedure for grouped data and a minimum distance estimator (MDE), which does not require grouping the data. The MDE of the binormal ROC curve was also considered in the papers of Davidov and Nov (2009, 2012). Maximum likelihood and pseudo-likelihood approaches to estimating the binormal ROC curve were considered by Zou and Hall (2000), Cai and Moskowitz (2004) and Zhou and Lin (2008). Techniques based on regression were also proposed (see, for example, Lloyd 2002; Cai and Pepe 2002; Qin and Zhang 2003; Wan and Zhang 2007). A Bayesian approach to the semiparametric estimation of the ROC curve was considered in the papers of Branscum et al. (2008), Erkanli et al. (2006) and Gu and Ghosal (2009). Gonçalves et al. (2014) give an overview of developments in the estimation of the ROC curve, with particular emphasis on the frequentist and Bayesian methods most often employed in the medical setting.
This paper deals with minimum distance estimation of the binormal ROC curve. To the best of our knowledge, a minimum distance approach to estimating the binormal ROC curve parameters was considered only by Hsieh and Turnbull (1996) and Davidov and Nov (2009, 2012). In the paper of Davidov and Nov (2009) the central idea was to estimate the unknown function h (a transformation of X and Y to normal random variables) in two different ways; only one of the two estimates depended on the unknown parameters μ and σ of the binormal ROC curve. Then, they estimated μ and σ by the values that minimized a certain norm of the difference between the two estimates of the function h. In this paper we do not develop this idea. A different approach is presented in the papers of Hsieh and Turnbull (1996) and Davidov and Nov (2012). They took into consideration two different measures of distance between the empirical and the theoretical ordinal dominance curve (ODC), a curve closely related to the ROC curve. Davidov and Nov (2012) showed that their MDE is consistent and asymptotically normally distributed, and that it outperforms Hsieh and Turnbull's original grouped-data estimator, but it has not been compared with the MDE of Hsieh and Turnbull.
In this paper we compare the accuracy of the known MDE's given by Hsieh and Turnbull (1996) and Davidov and Nov (2012). We find that the MDE given by Hsieh and Turnbull (1996) outperforms, in some sense, the MDE given by Davidov and Nov (2012). Both estimators are obtained by minimizing a distance measure between the unknown binormal ROC curve and the empirical ROC curve. The empirical ROC curve, being a step function, often gives unsatisfactory nonparametric estimates of the ROC curve for small sample sizes. Therefore, the second purpose of this work is to introduce modifications of these known measures of distance by replacing the underlying empirical ROC curve with its continuous nonparametric counterparts. Another modification of the Davidov and Nov (2012) approach stems from widening the domain taken into account when the distance between the empirical and the binormal ROC curve is calculated. In this paper, a total of seven new estimators in the binormal model are introduced and their performances are compared in a simulation study.
The paper is organized as follows. In Sect. 2 we recall the MDE's of the binormal ROC curve parameters considered in the papers of Hsieh and Turnbull (1996) and Davidov and Nov (2012). Then we propose a modification of the Davidov and Nov estimator, and some new MDE's obtained by replacing the empirical ROC curve with the Bayesian bootstrap estimator of the ROC curve in the measures of distance considered by Hsieh and Turnbull (1996) and Davidov and Nov (2012). We prove the consistency of the proposed estimators. We also recall two smooth nonparametric estimators of the ROC curve, namely the kernel estimator considered by Lloyd (1998) and the estimator proposed by Jokiel-Rokita and Pulit (2013), which we also use to obtain MDE's of the binormal ROC curve. Results from simulation studies are provided in Sect. 3. In Sect. 4 a real data analysis is discussed. The paper ends with some concluding remarks in Sect. 5.

Minimum distance estimation of the ROC curve
In this section, we recall some known methods and provide some new methods of estimation of the parameters μ and σ in the binormal model, based on the minimum distance concept. Minimum distance estimation has been studied extensively, beginning with the work of Wolfowitz (1957). The concept of minimum distance estimation of the binormal ROC curve parameters was introduced in the framework of estimation of the binormal ordinal dominance curve (ODC), given by

$$D(t) = F(G^{-1}(t)), \quad t \in [0, 1].$$

The ODC is closely related to the ROC curve, and in the binormal model it has the following parametric form

$$D(t) = \Phi\left(\mu + \sigma \Phi^{-1}(t)\right).$$

However, in the course of this paper we find it more convenient to construct all estimators of the unknown parameters μ and σ in direct reference to the ROC curve. Therefore all results originally established for ODC curves will be rephrased in terms of ROC curves.
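To make the relation between the two curves concrete, the following minimal Python sketch evaluates the binormal ROC curve and the binormal ODC under the normalization X ∼ N(0, 1), Y ∼ N(μ, σ²) used in this paper. The function names `binormal_roc` and `binormal_odc` are our own illustrative choices, and the parametric forms are the ones implied by that normalization; the sketch only illustrates that the inverse of the ROC curve is u ↦ 1 − D(1 − u).

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def binormal_roc(t, mu, sigma):
    """Binormal ROC curve, Phi((mu + Phi^{-1}(t)) / sigma), for t in [0, 1]."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return _STD.cdf((mu + _STD.inv_cdf(t)) / sigma)


def binormal_odc(t, mu, sigma):
    """Binormal ODC, Phi(mu + sigma * Phi^{-1}(t)), for t in [0, 1]."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return _STD.cdf(mu + sigma * _STD.inv_cdf(t))


# Illustration: the ROC curve evaluated at 1 - D(1 - u) returns u.
u = 0.3
t = 1.0 - binormal_odc(1.0 - u, mu=1.2, sigma=1.5)
print(binormal_roc(t, mu=1.2, sigma=1.5))
```

Results stated for the ODC can therefore be translated to the ROC curve by this change of variables, which is the rephrasing adopted in the remainder of the section.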

Minimum distance estimator of Hsieh and Turnbull
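Assume that independent samples $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$ from distributions with cdf's F and G, respectively, are available. Denote by $F_m$ and $G_n$ the empirical distribution functions of $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$, respectively, and the empirical quantile function by $G_n^{-1}(t) = \inf\{y : G_n(y) \ge t\}$ (with $F_m^{-1}$ defined analogously). The empirical ROC curve is defined as

$$ROC_{mn}(t) = 1 - G_n(F_m^{-1}(1 - t)), \tag{3}$$

while the empirical ODC curve is given by

$$D_{mn}(t) = F_m(G_n^{-1}(t)).$$

In the paper of Hsieh and Turnbull (1996), MDE's of the ROC curve parameters are derived by finding the ODC curve that fits most closely to the empirical ODC curve using an $L_2$ norm criterion. We adopt the original idea introduced by Hsieh and Turnbull (1996). More precisely, for $\theta = (\mu, \sigma)^T$, let us denote by

$$\xi_{mn}(t; \theta) = ROC_{mn}(t) - \Phi\left(\frac{\mu + \Phi^{-1}(t)}{\sigma}\right) \tag{4}$$

the difference between the empirical and the binormal ROC curve, and by

$$\|\xi_{mn}(\theta)\|^2 = \int_0^1 \xi_{mn}^2(t; \theta)\,dt \tag{5}$$

the $L_2$-distance measure between ROC(t) and $ROC_{mn}(t)$.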
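The two ingredients just described, the empirical ROC curve and the $L_2$ distance to a binormal curve, can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `empirical_roc` and `ht_distance` are hypothetical, the empirical quantile is taken as inf{x : F_m(x) ≥ p}, and the integral over [0, 1] is approximated by a midpoint rule on a regular grid.

```python
import math
from bisect import bisect_right
from statistics import NormalDist


def empirical_roc(x, y):
    """Return t -> 1 - G_n(F_m^{-1}(1 - t)) for samples x (non-diseased), y (diseased)."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)

    def roc(t):
        p = 1.0 - t
        if p <= 0.0:
            return 1.0  # F_m^{-1}(0) = -infinity, hence G_n = 0
        # smallest k with k/m >= p (small tolerance guards against rounding)
        k = max(1, math.ceil(p * m - 1e-9))
        c = xs[k - 1]  # F_m^{-1}(p) is the k-th order statistic
        return 1.0 - bisect_right(ys, c) / n  # G_n(c) = #{Y_j <= c} / n

    return roc


def ht_distance(roc_hat, mu, sigma, grid=2000):
    """Midpoint-rule approximation of int_0^1 (roc_hat(t) - binormal ROC(t))^2 dt."""
    std = NormalDist()
    total = 0.0
    for i in range(grid):
        t = (i + 0.5) / grid
        model = std.cdf((mu + std.inv_cdf(t)) / sigma)
        total += (roc_hat(t) - model) ** 2
    return total / grid


roc = empirical_roc([0, 1, 2, 3], [1.5, 2.5, 3.5, 4.5])
print(roc(0.0), roc(0.25), roc(0.5))
print(ht_distance(roc, mu=1.5, sigma=1.0))
```

The returned step function is right-continuous in t, in agreement with the property of the empirical ROC curve used later in this section.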
The MDE $\hat\theta = (\hat\mu, \hat\sigma)^T$ of the parameter θ is defined by

$$\hat\theta = \arg\min_{\theta \in \Theta} \|\xi_{mn}(\theta)\|^2, \tag{6}$$

where $\Theta = \{\theta = (\mu, \sigma) : \mu \in \mathbb{R}, \sigma > 1\}$, as in the paper of Hsieh and Turnbull (1996). The restriction that σ > 1 is not unreasonable if one thinks of the healthy response as "noise" and the diseased response as "noise plus signal". However, we can avoid this restriction if we modify the distance criterion (5) so that the integral is over a closed interval excluding 0 and 1. In the sequel, we will denote the MDE $\hat\theta$ by $\hat\theta_{HT} = (\hat\mu_{HT}, \hat\sigma_{HT})$. Using the theory developed by Millar (1984), Hsieh and Turnbull (1996) proved the asymptotic normality of their MDE of the parameter θ, but did not provide any concrete procedure to compute it. In Sect. 3, we describe an algorithm, used in the simulation study, to obtain the estimates $\hat\theta_{HT}$. Hsieh and Turnbull (1996) also proposed (in their Remark 1), as an object for future research, to modify their measure of distance by applying the $\Phi^{-1}$ transformation to both $D_{mn}(t)$ and $D(t)$ which, in terms of the ROC curve, leads to the following counterpart of $\xi_{mn}(t; \theta)$:

$$\tilde\xi_{mn}(t; \theta) = \Phi^{-1}(ROC_{mn}(t)) - \frac{\mu + \Phi^{-1}(t)}{\sigma}. \tag{7}$$

Minimum distance estimator of Davidov and Nov
Davidov and Nov (2012) followed up on this suggestion and considered estimation of the parameter θ based on minimization of the following objective function

$$\frac{1}{b - a} \int_a^b \left[\Phi^{-1}(ROC_{mn}(t)) - \frac{\mu + \Phi^{-1}(t)}{\sigma}\right]^2 dt, \tag{8}$$

where the integration endpoints 0 < a < b < 1, specified in (10) and (11), ensure that the integral is finite. Namely, they considered the MDE

$$\hat\theta_{DN} = \arg\min_{\theta} \frac{1}{b - a} \int_a^b \left[\Phi^{-1}(ROC_{mn}(t)) - \frac{\mu + \Phi^{-1}(t)}{\sigma}\right]^2 dt. \tag{9}$$

The minimization problem given by (9) is convex and quadratic in μ/σ and 1/σ and, unlike (6), it enjoys a closed-form solution, given by formulas (12)-(15). Please note that since we employ the ROC curve instead of the ODC curve, formulas (12)-(15) differ from the corresponding formulas of Davidov and Nov (2012). The integration endpoints a, b were introduced to ensure that $\Phi^{-1}(ROC_{mn}(t)) \ne \pm\infty$ and hence that the optimization problem (9) is well defined. However, selecting the upper integration limit according to Eq. (11) means that the difference between the empirical ROC curve and the true (binormal) ROC curve on the interval [b, c], where $c := \min\{i/m : ROC_{mn}(i/m) = 1,\ i = 1, \ldots, m\}$ (i.e. on the last step of $ROC_{mn}$), is not taken into account. We think that this loss of information influences the accuracy of the estimates for small sample sizes m and n. Hence, we propose a modification of the minimum distance estimator considered by Davidov and Nov by choosing the upper limit of integration just before the last jump of the empirical ROC curve. Since $ROC_{mn}(t)$ is right-continuous, we take

$$b_m = c - \varepsilon_m, \tag{16}$$

where $\varepsilon_m < 1/m$ is a positive constant, which guarantees that $\Phi^{-1}(ROC_{mn}(t)) < \infty$ on the integration interval. Moreover, thanks to the right continuity of the empirical ROC curve, there is no need to introduce any modification for the lower integration endpoint (the lowest possible value is already provided by formula (10)). The estimates of the parameters μ and σ computed with $b_m$ instead of b in (12)-(15) will be denoted by $\hat\mu_{DNM}$ and $\hat\sigma_{DNM}$, respectively.
Clearly, the modified estimators are consistent and asymptotically normal, like the original estimators of Davidov and Nov (see Davidov and Nov 2012, Theorems 1 and 2), under the same assumptions.
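The existence of a closed-form solution can be illustrated as follows. On the probit scale the binormal ROC curve is a straight line in $\Phi^{-1}(t)$ with intercept μ/σ and slope 1/σ, so a criterion of the Davidov and Nov type reduces to an ordinary least-squares line fit. The Python sketch below is our own reparametrized illustration and is not claimed to coincide term by term with formulas (12)-(15); the name `dn_fit`, the midpoint grid, and the requirement that the caller supplies the endpoints a and b are assumptions.

```python
from statistics import NormalDist


def dn_fit(roc_hat, a, b, grid=2000):
    """Least-squares fit of Phi^{-1}(roc_hat(t)) = alpha + beta * Phi^{-1}(t) on [a, b].

    Returns (mu, sigma) with mu = alpha / beta and sigma = 1 / beta.
    """
    std = NormalDist()
    w, v = [], []
    for i in range(grid):
        t = a + (b - a) * (i + 0.5) / grid
        r = roc_hat(t)
        if not 0.0 < r < 1.0:
            # Phi^{-1} is infinite here; [a, b] is meant to exclude such points
            raise ValueError("roc_hat must lie strictly in (0, 1) on [a, b]")
        w.append(std.inv_cdf(t))
        v.append(std.inv_cdf(r))
    mean_w, mean_v = sum(w) / grid, sum(v) / grid
    s_ww = sum((wi - mean_w) ** 2 for wi in w)
    s_wv = sum((wi - mean_w) * (vi - mean_v) for wi, vi in zip(w, v))
    beta = s_wv / s_ww              # estimate of 1 / sigma
    alpha = mean_v - beta * mean_w  # estimate of mu / sigma
    return alpha / beta, 1.0 / beta


# Sanity check on an exact binormal curve with mu = 1.2 and sigma = 1.5.
std = NormalDist()
exact = lambda t: std.cdf((1.2 + std.inv_cdf(t)) / 1.5)
print(dn_fit(exact, a=0.05, b=0.95))
```

With an empirical (step) ROC curve as input, the choice of b matters because the last step of the curve contributes to the fitted line only if it lies inside [a, b], which is the motivation for the modification (16).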

Minimum distance estimators of the binormal ROC curve parameters based on BB estimator of the ROC curve
The Bayesian bootstrap (BB) has been proposed for the nonparametric estimation of the ROC curve and its functionals. In this approach stochastic empirical distribution functions, introduced by Rubin (1981), are employed. Let $U_1, \ldots, U_{m-1}$ be iid uniform U(0, 1) random variables, independent of the data, and let $U_{(1)} \le \cdots \le U_{(m-1)}$ denote their order statistics, with $U_{(0)} = 0$ and $U_{(m)} = 1$. Rubin's stochastic empirical distribution function, say $F_m^{(b)}$, based on the sample $X_1, \ldots, X_m$, is defined as follows

$$F_m^{(b)}(x) = \sum_{i=1}^{m} \Delta_i \mathbf{1}(X_i \le x), \tag{17}$$

where $\Delta_i = U_{(i)} - U_{(i-1)}$, i = 1, ..., m. Let $G_n^{(b)}$ be Rubin's stochastic empirical distribution function based on the observations $Y_1, \ldots, Y_n$ from the second sample. In order to get a ROC curve estimator, say $ROC_{mn}^{(b)}$, we proceed in the same way as in the case of the empirical ROC curve given by (3), and plug Rubin's stochastic empirical distribution function $G_n^{(b)}$ and quantile function $F_m^{(b)-1}$ into (1). Next, the BB estimate of the ROC curve is obtained by averaging over a large number B of realizations of $ROC_{mn}^{(b)}$:

$$ROC_{mn}^{BB}(t) = \frac{1}{B} \sum_{b=1}^{B} ROC_{mn}^{(b)}(t).$$

The estimator $ROC_{mn}^{BB}$ is a bandwidth-free nonparametric estimator and, because of averaging over two random variations, is "smoother" than $ROC_{mn}$. The BB estimates of the ROC curve for two different values of B, based on samples of equal sizes n = m = 15, together with the empirical and the true ROC curve, are presented in Fig. 1. As can be seen, even when we average over a small number of realizations, we obtain a "smoother" estimate than the empirical ROC curve.
Remark 1 An efficient three-step procedure for computing BB estimates, which does not require inverting the stochastic empirical distribution function (17), has also been proposed. In the first step, auxiliary variables $Z_j$, j = 1, ..., n, are defined, based on the BB resampling distribution with weights $(p_1, \ldots, p_m) \sim \text{Dirichlet}(m; 1, \ldots, 1)$, independent of the others. In the second step, a random realization of the ROC curve, $ROC^{\#}_{mn}$, is generated as the randomized distribution function of $Z_1, \ldots, Z_n$ with weights $(q_1, \ldots, q_n) \sim \text{Dirichlet}(n; 1, \ldots, 1)$, independent of the others. In the last step, the BB estimate of the ROC curve is obtained by averaging over the ensemble of random ROC curves, $ROC^{BB}_{mn}(t) = \text{mean}(ROC^{\#}_{mn}(t))$. A convenient method for generating $(p_1, \ldots, p_m) \sim \text{Dirichlet}(m; 1, \ldots, 1)$ was also proposed by Gu.
Let us assume that … Moreover, throughout this section we assume that the sample sizes m, n are such that m = m(n) and n/m → λ ∈ (0, ∞) as n → ∞, and that the following two conditions are satisfied. (C2) Let the cdf's F and G satisfy Condition 1, and additionally $\sup_{x \in (\alpha, \beta)}$ …
Using the theory of Kiefer processes, strong approximation results and asymptotic properties of the Bayesian bootstrap ROC curve estimator have been established. In particular, its rate of convergence to the true ROC curve was shown to be $n^{-1/2}$. We will consider minimum distance estimation of the binormal ROC curve parameters by replacing the empirical ROC curve with the corresponding BB estimator $ROC^{BB}_{mn}(t)$ in measure (8). Since the jumps of $ROC^{BB}_{mn}(t)$ are random, we can choose the integration limits in (12)-(15) to be closer to 0 and 1 than in the original procedure. Namely, we define the integration limits $a_m$ and $b_m$ by (19), where $\varepsilon_m < 1/m$ is a positive constant, which needs to be introduced due to the right continuity of the function $ROC^{BB}_{mn}$ (analogously to (16)). To be more specific, we consider the MDE

$$\hat\theta_{DNB} = \arg\min_{\theta} \frac{1}{b_m - a_m} \int_{a_m}^{b_m} \left[\Phi^{-1}(ROC^{BB}_{mn}(t)) - \frac{\mu + \Phi^{-1}(t)}{\sigma}\right]^2 dt.$$

Using the same approach as in Sect. 2.2, one can show that the solution to the optimization problem above is given by formula (20). The following lemma can be proved in a manner analogous to Lemma 1 in Davidov and Nov (2012).
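A proof of Theorem 1 is given in the Appendix. We will also consider an estimator of the parameter θ which combines the minimum distance concept of Hsieh and Turnbull with the BB nonparametric estimator of the ROC curve. In this method, Eq. (4) is modified by replacing the empirical curve $ROC_{mn}(t)$ with the Bayesian bootstrap estimator $ROC^{BB}_{mn}(t)$, which gives

$$\xi^{B}_{mn}(t; \theta) = ROC^{BB}_{mn}(t) - \Phi\left(\frac{\mu + \Phi^{-1}(t)}{\sigma}\right),$$

and the corresponding $L_2$-distance measure is

$$\|\xi^{B}_{mn}(\theta)\|^2 = \int_0^1 \left(\xi^{B}_{mn}(t; \theta)\right)^2 dt. \tag{21}$$

The minimum distance estimate $\hat\theta_{HTB} = (\hat\mu_{HTB}, \hat\sigma_{HTB})$ of the parameter θ is defined as the value which minimizes (21), i.e.

$$\hat\theta_{HTB} = \arg\min_{\theta \in \Theta} \|\xi^{B}_{mn}(\theta)\|^2.$$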
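The three-step procedure of Remark 1 can be sketched in Python as follows. Several details are our own assumptions rather than the published algorithm: the auxiliary variables are taken to be $Z_j = \sum_i p_i \mathbf{1}(X_i > Y_j)$ (a BB analogue of 1 − F(Y_j)), the flat Dirichlet weights are drawn by normalizing i.i.d. exponential variables (a standard method that may differ from the one proposed by Gu), and the names `dirichlet_flat` and `bb_roc` are hypothetical.

```python
import random


def dirichlet_flat(k, rng):
    """Draw (w_1, ..., w_k) ~ Dirichlet(1, ..., 1) by normalizing i.i.d. Exp(1) draws."""
    g = [rng.expovariate(1.0) for _ in range(k)]
    s = sum(g)
    return [gi / s for gi in g]


def bb_roc(x, y, grid, B=200, seed=0):
    """Average B random ROC curves, evaluated at the points of `grid`."""
    rng = random.Random(seed)
    m, n = len(x), len(y)
    acc = [0.0] * len(grid)
    for _ in range(B):
        p = dirichlet_flat(m, rng)
        q = dirichlet_flat(n, rng)
        # Step 1 (assumed form): Z_j = sum_i p_i * 1(X_i > Y_j), clamped to 1
        # so that rounding in the sum of weights cannot push it above 1.
        z = [min(1.0, sum(pi for pi, xi in zip(p, x) if xi > yj)) for yj in y]
        # Step 2: randomized distribution function of Z_1, ..., Z_n
        for i, t in enumerate(grid):
            acc[i] += sum(qj for qj, zj in zip(q, z) if zj <= t)
    # Step 3: average over the B realizations
    return [a / B for a in acc]


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(15)]
y = [rng.gauss(1.0, 1.5) for _ in range(15)]
grid = [i / 20 for i in range(21)]
print(bb_roc(x, y, grid, B=50))
```

Because the jump locations $Z_j$ change from one realization to the next, the averaged curve has many small jumps instead of a few large ones, which is the sense in which it is "smoother" than the empirical ROC curve.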

Minimum distance estimators of the binormal ROC curve parameters based on smooth nonparametric estimators of the ROC curve
The empirical ROC curve retains many properties of the empirical distribution function. It is uniformly convergent to the theoretical curve (Hsieh and Turnbull 1996), but it is also not continuous and not very accurate for small sample sizes. The idea behind semiparametric procedures of Hsieh and Turnbull, as well as Davidov and Nov, is to minimize a distance between binormal ROC curve given by (2), and the empirical one.
In this section we propose MDE's of the binormal curve by replacing the empirical ROC curve, in measures (5) and (8), by its continuous nonparametric counterparts. Consequently, each considered nonparametric estimator of the ROC curve leads to two new semiparametric minimum distance estimators.

Kernel estimator of the ROC curve
Lloyd (1998) used the kernel smoothing technique to obtain a smooth ROC curve estimator given by

$$ROC^{K}_{mn}(t) = 1 - G^{K}_{n}\left((F^{K}_{m})^{-1}(1 - t)\right), \tag{22}$$

where

$$F^{K}_{m}(x) = \frac{1}{m} \sum_{i=1}^{m} K\left(\frac{x - X_i}{h_m}\right), \quad G^{K}_{n}(y) = \frac{1}{n} \sum_{j=1}^{n} K\left(\frac{y - Y_j}{h_n}\right),$$

$K(x) = \int_{-\infty}^{x} k(z)\,dz$ for a kernel function k, and $h_n$ and $h_m$ are bandwidth parameters. Lloyd and Yong (1999) showed that estimator (22) has better mean squared error properties than the empirical ROC curve. In the problem of kernel density estimation, choosing between the many available kernel functions is of secondary importance, as all give comparable results, but more care needs to be taken over the selection of the bandwidth. Therefore, in kernel ROC curve estimation the main emphasis is put on bandwidth selection (Zhou and Harezlak 2002; Hall and Hyndman 2003). In the simulation study (Sect. 3), the Gaussian kernel is employed and the bandwidth parameter $h_m$ is chosen according to

$$h_m = 0.9 \min\{s_x, iqr_x / 1.34\}\, m^{-1/5},$$

where $s_x$ and $iqr_x$ are the standard deviation and the interquartile range for the non-diseased population, respectively. The bandwidth parameter $h_n$ for the diseased population was determined in the same way. This method of bandwidth selection was recommended by Silverman (1986) as it works 'very well for a wide range of densities', which is reasonable in our case, since we have no information about the distribution of the samples.
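A kernel ROC estimator of this type can be sketched in Python as follows. The sketch assumes a Gaussian kernel and the common rule-of-thumb bandwidth 0.9 min(s, iqr/1.34) m^(−1/5) attributed to Silverman (1986); the quartile convention of `statistics.quantiles`, the bisection used to invert the smoothed cdf, and the names `silverman_bandwidth`, `kernel_cdf` and `kernel_roc` are our own choices.

```python
from statistics import NormalDist, quantiles, stdev


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth 0.9 * min(s, iqr / 1.34) * m^(-1/5)."""
    q1, _, q3 = quantiles(sample, n=4)
    return 0.9 * min(stdev(sample), (q3 - q1) / 1.34) * len(sample) ** (-0.2)


def kernel_cdf(sample, h):
    """Gaussian-kernel cdf estimate x -> mean of Phi((x - X_i) / h)."""
    std = NormalDist()
    return lambda x: sum(std.cdf((x - s) / h) for s in sample) / len(sample)


def kernel_roc(x, y):
    """Return t -> 1 - G_K(F_K^{-1}(1 - t)); F_K is inverted by bisection."""
    hx, hy = silverman_bandwidth(x), silverman_bandwidth(y)
    F, G = kernel_cdf(x, hx), kernel_cdf(y, hy)
    lo0, hi0 = min(x) - 10.0 * hx, max(x) + 10.0 * hx  # F is ~0 and ~1 there

    def roc(t):
        if t <= 0.0:
            return 0.0
        if t >= 1.0:
            return 1.0
        lo, hi = lo0, hi0
        for _ in range(60):  # bisection for F(c) = 1 - t
            mid = 0.5 * (lo + hi)
            if F(mid) < 1.0 - t:
                lo = mid
            else:
                hi = mid
        return 1.0 - G(0.5 * (lo + hi))

    return roc


x = [0.1 * i for i in range(20)]
y = [1.0 + 0.1 * i for i in range(20)]
roc = kernel_roc(x, y)
print(roc(0.2), roc(0.5), roc(0.8))
```

The bandwidth is zero when the sample has zero interquartile range or zero standard deviation; such degenerate samples are not handled by this sketch.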
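The kernel estimator (22) of the ROC curve allows us to introduce two new minimum distance estimators of the binormal ROC curve parameters, which will be denoted by $\hat\theta_{HTK}$ and $\hat\theta_{DNK}$. The first one employs $ROC^{K}_{mn}(t)$ instead of the empirical ROC curve in Eq. (4), while the latter does so in Eq. (7), e.g.

$$\hat\theta_{DNK} = \arg\min_{\theta} \frac{1}{b - a} \int_a^b \left[\Phi^{-1}(ROC^{K}_{mn}(t)) - \frac{\mu + \Phi^{-1}(t)}{\sigma}\right]^2 dt,$$

where the integration limits a and b are the counterparts of Eqs. (18)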

Estimator of the ROC curve by smoothing the sample distribution functions
In the paper of Jokiel-Rokita and Pulit (2013), the authors proposed to estimate the ROC curve using the plug-in method with smoothed sample distribution functions. Let $X_{1:m} \le X_{2:m} \le \cdots \le X_{m:m}$ and $Y_{1:n} \le Y_{2:n} \le \cdots \le Y_{n:n}$ denote the order statistics from the samples $\mathbf{X}_m$ and $\mathbf{Y}_n$, respectively. The construction uses random variables L and U such that $L \le \min\{X_{1:m}, Y_{1:n}\}$ and $U \ge \max\{X_{m:m}, Y_{n:n}\}$ almost surely.
With this notation, the estimators of the distribution functions F and G are defined by formula (23), where r : [0, 1] → [0, 1] is a continuous, strictly increasing function, and the resulting plug-in estimator of the ROC curve, $ROC^{S}_{mn}$, is given by (24). An appropriate choice of the function r, appearing in formula (23), can guarantee differentiability of the estimator (e.g. if the function r is differentiable and $r'_{+}(0) = r'_{-}(1) = 0$). At the same time, determination of the estimator (24) remains as easy as in the case of the empirical ROC curve. Minimum distance estimators of the parameter θ, based on the nonparametric ROC curve estimator $ROC^{S}_{mn}$ applied in (4) and (7) instead of the estimator $ROC_{mn}$, will be denoted by $\hat\theta_{HTS}$ and $\hat\theta_{DNS}$, respectively.

Simulation study
A simulation experiment was conducted in order to
• Investigate the accuracy of the original minimum distance estimators considered by Davidov and Nov (2012) in comparison with their modification proposed in Sect. 2.2,
• Compare the accuracy of the minimum distance estimators of the binormal ROC curve parameters proposed by Hsieh and Turnbull (1996) with those considered by Davidov and Nov (2012) (answer the question: which measure of distance provides more accurate estimators),
• Compare the accuracy of the minimum distance estimators considered by Hsieh and Turnbull (1996) and Davidov and Nov (2012) with their counterparts obtained by replacing the empirical ROC curve with the BB estimator or with the smooth nonparametric estimators of the ROC curve (the kernel estimator and the estimator proposed by Jokiel-Rokita and Pulit 2013).
An important index connected with the ROC curve is the area under the curve, commonly denoted by

$$AUC = \int_0^1 ROC(t)\,dt.$$

It can easily be shown that in the model considered AUC = P(X < Y). We considered binormal ROC curves with AUC values of 0.75 and 0.85, and assumed that X ∼ N(0, 1) and that Y is normally distributed with standard deviation σ ∈ {1, 4/3, 2} and mean value $\mu = \sqrt{1 + \sigma^2}\,\Phi^{-1}(AUC)$. For each ROC curve, 5000 data sets with m = n ∈ {15, 20, 100} were generated. Next, for each data set, four nonparametric ROC curve estimators were computed: the empirical ROC curve $ROC_{mn}$, the smoothed estimator $ROC^{S}_{mn}$ according to Eq. (24) with linking function r(x) = x, the kernel estimator $ROC^{K}_{mn}$ given by formula (22), and the Bayesian bootstrap estimator $ROC^{BB}_{mn}$ averaged over B = 1000 realizations. All nonparametric estimators were calculated on a regular grid with step 0.0001. For the kernel estimator we additionally used a four times denser support grid, in order to compute the inverse $(F^{K}_{m})^{-1}$ of the cdf estimator with sufficient accuracy. Tests showed that a further increase of the grid density left the simulation results virtually unchanged. Then the semiparametric minimum distance estimators were calculated on the basis of the nonparametric ones. In the study, nine distinct semiparametric estimators were considered: five based on the minimum distance approach considered by Davidov and Nov (2012) (D-N estimators for short) and four based on the measure of distance considered by Hsieh and Turnbull (1996) (H-T estimators for short). For all D-N estimators, except the original DN, the integration endpoints were calculated according to equation (19) with the proper nonparametric ROC estimator plugged in. In practice, due to the finite distance between grid points, there is no need to introduce the constant $\varepsilon_m$.
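The data-generating step of such a simulation can be sketched in Python as follows. The function names `mu_from_auc` and `draw_samples` are hypothetical, and the sketch covers only the generation of one data set for a given AUC, σ and sample sizes, not the full experiment.

```python
import random
from statistics import NormalDist


def mu_from_auc(auc, sigma):
    """Mean of Y giving the requested AUC: mu = sqrt(1 + sigma^2) * Phi^{-1}(AUC)."""
    return (1.0 + sigma ** 2) ** 0.5 * NormalDist().inv_cdf(auc)


def draw_samples(auc, sigma, m, n, rng):
    """One data set: X_1..X_m ~ N(0, 1) and Y_1..Y_n ~ N(mu, sigma^2)."""
    mu = mu_from_auc(auc, sigma)
    x = [rng.gauss(0.0, 1.0) for _ in range(m)]
    y = [rng.gauss(mu, sigma) for _ in range(n)]
    return x, y


rng = random.Random(0)
for sigma in (1.0, 4.0 / 3.0, 2.0):
    x, y = draw_samples(0.75, sigma, 15, 15, rng)
    print(sigma, round(mu_from_auc(0.75, sigma), 4), len(x), len(y))
```

The relation between μ and AUC used here follows from X − Y ∼ N(−μ, 1 + σ²), so that P(X < Y) = Φ(μ/√(1 + σ²)).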
In the Hsieh and Turnbull approach one needs to numerically minimize the $L_2$-distance between the binormal ROC curve and the considered nonparametric estimator. For the binormal model this problem corresponds to the minimization of a function of two variables, μ and σ. In the simulations the Nelder-Mead method was employed to minimize the objective function, and the initial values of the unknown parameters were calculated using the corresponding DNM estimator.
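A dependency-free Python sketch of this numerical minimization is given below. To keep it self-contained, a simple compass (coordinate pattern) search stands in for the Nelder-Mead method used in the simulations; with SciPy available, `scipy.optimize.minimize(objective, start, method="Nelder-Mead")` would be the direct analogue. The names `make_ht_objective` and `compass_search`, the midpoint grid and the starting point are our own assumptions.

```python
from statistics import NormalDist


def make_ht_objective(roc_hat, grid=200):
    """Build theta -> midpoint approximation of int_0^1 (roc_hat - binormal ROC)^2 dt."""
    std = NormalDist()
    ts = [(i + 0.5) / grid for i in range(grid)]
    probit = [std.inv_cdf(t) for t in ts]  # Phi^{-1}(t), computed once
    target = [roc_hat(t) for t in ts]      # nonparametric estimate, computed once

    def objective(theta):
        mu, sigma = theta
        if sigma <= 0.0:
            return float("inf")            # keep sigma in the admissible region
        return sum((r - std.cdf((mu + w) / sigma)) ** 2
                   for r, w in zip(target, probit)) / grid

    return objective


def compass_search(f, start, step=0.5, tol=1e-6, max_iter=10_000):
    """Derivative-free search: try +/- step along each axis, halve step on failure."""
    best, f_best = list(start), f(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for k in range(len(best)):
            for d in (step, -step):
                cand = list(best)
                cand[k] += d
                f_cand = f(cand)
                if f_cand < f_best:
                    best, f_best, improved = cand, f_cand, True
        if not improved:
            step *= 0.5
    return best, f_best


# Sanity check: recover (mu, sigma) = (1.0, 1.5) from the exact binormal curve.
std = NormalDist()
exact = lambda t: std.cdf((1.0 + std.inv_cdf(t)) / 1.5)
theta_hat, value = compass_search(make_ht_objective(exact), start=(0.5, 1.0))
print(theta_hat, value)
```

In practice `roc_hat` would be one of the nonparametric estimators and `start` the corresponding closed-form Davidov and Nov type estimate, as described above.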
The performance of the estimators introduced in the previous section is studied in two ways: by comparing the estimates of the binormal parameters and by looking at the deviation of the estimated ROC curve from its true shape. In Table 1 the estimated bias and MSE of the parameters μ and σ are listed for four binormal models (with σ = 1 and σ = 2, and for two values of AUC: 0.75 and 0.85). In practice one is more interested in the estimation of the ROC curve than in the parameters of the binormal model. Hence, in order to examine the overall goodness of fit of the ROC curve estimator, the mean integrated square error (MISE)

$$MISE = E \int_0^1 \left(\widehat{ROC}(t) - ROC(t)\right)^2 dt$$

was estimated, where $\widehat{ROC}(t)$ stands for the considered ROC curve estimator. In Table 2 the estimated values of MISE (multiplied by 100, for brevity) are collected for three values of σ, AUC = 0.75, and different sample sizes. Results corresponding to AUC = 0.85 are given in Table 3. MISE's are presented for both semiparametric and nonparametric ROC curve estimates for comparison. As can be seen from Table 1, there are quite big differences in accuracy between the original (DN) and the modified (DNM) minimum distance estimators of Davidov and Nov, even though the latter requires only a marginal modification of the computational procedure. For m = n = 10 and m = n = 15 the estimated mean square errors of the DNM estimators of the parameters μ and σ are significantly smaller (sometimes even by half) than the corresponding estimated errors of the original DN estimators. The bias of $\hat\theta_{DNM}$ is also smaller than that of $\hat\theta_{DN}$, but the difference between them is less prominent. For the large sample size, m = n = 100, when formulas (11) and (16) yield virtually the same integration endpoints, the DN and DNM procedures give almost the same biases and mean square errors, as expected. The DNM estimator outperforms the original Davidov and Nov (2012) estimator (DN) also in terms of mean integrated square error.
The results given in Tables 2 and 3 indicate a reduction of MISE by approximately 10% in the case of small sample sizes and 3% for m = n = 100. We find it interesting to examine the accuracy of the estimates obtained by minimization of the two distinct measures (5) and (8). In the case of small sample sizes, m = n = 15 and m = n = 20, the HT procedure performs much better in terms of bias and mean square error than DNM, and hence also outperforms DN, regardless of AUC and the true value of the parameter σ (cf. Table 1). For m = n = 100, the bias of $\hat\mu_{HT}$ remains much lower than the corresponding bias of $\hat\mu_{DN}$ and $\hat\mu_{DNM}$, while the differences in MSE between these estimators are reduced. At the same time, the HT method also gives a smaller bias of the estimator of σ in comparison with the DN and DNM procedures, but in some cases it yields a greater MSE. These conclusions also hold to a great extent when the DNS estimator, based on the smoothed nonparametric ROC curve, is compared with the corresponding HTS estimator. On the other hand, inspection of the results collected in Tables 2 and 3 reveals that estimators based on the D-N approach, aside from the original DN, yielded a better fit to the true ROC curve in terms of MISE than those originating from the H-T procedure: in all models except one, the estimates that gave the lowest MISE were obtained using the distance measure considered by Davidov and Nov (2012).
Based on the simulations, we may also address the influence of replacing the empirical ROC curve with other nonparametric estimators on the accuracy of the estimated binormal ROC curve. In all considered models, semiparametric estimators based on the smoothed empirical ROC curve, $ROC^{S}_{mn}(t)$, performed better than their counterparts based on the empirical curve $ROC_{mn}(t)$ for both distance measures employed. The bias and MSE of $\hat\mu_{DNS}$ and $\hat\sigma_{DNS}$ are considerably smaller than those of $\hat\mu_{DNM}$ and $\hat\sigma_{DNM}$, respectively. Similar conclusions can be drawn when comparing HTS with the original HT procedure. For small sample sizes, the mean square error of the estimates of both parameters decreases, by a factor of 4.5 on average, when the underlying empirical ROC curve is replaced with its smoothed counterpart (24). Naturally, the advantage of estimates based on $ROC^{S}_{mn}(t)$ over those based on $ROC_{mn}(t)$ decreases as the sample size increases. However, no significant improvement of the parameter estimates is observed when the kernel or BB methods are employed. In the case of methods based on the Davidov and Nov approach, when one minimizes the objective function given by (9), the estimated biases and MSE's of the estimators $\hat\theta_{DNK}$ and $\hat\theta_{DNB}$ are only slightly reduced in comparison with the DNM method. Furthermore, for the HTK and HTB methods even some increase of bias and MSE is observed in comparison with the original minimum distance procedure of Hsieh and Turnbull. Replacing the underlying empirical ROC curve with its smoothed counterpart also leads to a decrease of the mean integrated square error of both semiparametric and nonparametric estimators. For the eighteen binormal models considered in Tables 2 and 3, the DNS method always outperforms DN, and in fifteen cases it yields a smaller MISE than the DNM estimator. In fact, for AUC = 0.75, the DNS estimator achieves the lowest MISE among all considered estimators in 8 out of 9 comparisons. The HTS estimator also outperforms HT in 15 out of 18 comparisons.
Some improvement of the estimates is observed when the bootstrap estimator is employed (DNB and HTB methods). Consequently, the simulation study shows that replacing the empirical ROC curve (3) with its smoothed counterpart (24) significantly improves the minimum distance estimates of the binormal ROC curve.
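The MISE criterion used in these comparisons can be estimated by averaging integrated squared errors over simulated data sets. The following Python sketch shows the two building blocks; the names `integrated_squared_error` and `estimate_mise` and the midpoint rule on a regular grid are our own assumptions, and the curves passed in the example are toy functions rather than ROC estimates from the study.

```python
def integrated_squared_error(roc_hat, roc_true, grid=1000):
    """Midpoint-rule approximation of int_0^1 (roc_hat(t) - roc_true(t))^2 dt."""
    total = 0.0
    for i in range(grid):
        t = (i + 0.5) / grid
        total += (roc_hat(t) - roc_true(t)) ** 2
    return total / grid


def estimate_mise(estimated_curves, roc_true, grid=1000):
    """Average the integrated squared error over a collection of estimated curves."""
    errors = [integrated_squared_error(c, roc_true, grid) for c in estimated_curves]
    return sum(errors) / len(errors)


# Toy check: int_0^1 (t - t^2)^2 dt = 1/30.
print(integrated_squared_error(lambda t: t, lambda t: t * t))
```

In a simulation, `estimated_curves` would hold one estimated ROC curve per generated data set and `roc_true` the binormal curve used to generate the data.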

Real data analysis
To illustrate all the semiparametric estimators considered, we apply them to the data analysed in the paper of Tupikowski et al. (2012). In this dataset the effectiveness of combined treatment with interferon alpha and metronomic cyclophosphamide in patients with metastatic kidney cancer was studied in terms of hemoglobin level (HL) and serum fibrinogen concentration (FC). The dataset contains 31 observations in total: 14 with and 17 without clinical response. A low value of HL or FC has been recognized as a negative predictor of treatment response and is associated with short survival. The estimates of the binormal ROC curve parameters for HL and FC as predictive factors are given in Table 4 for all methods considered. The estimated values of AUC are also tabulated. Interestingly, while the estimates of the parameters μ and σ vary between methods, the estimates of AUC are close to each other, and differ only by 7% for both HL and FC.

Conclusions and some prospects
In this article seven new estimators of the binormal ROC curve in the semiparametric setting have been proposed. The new estimators originate from the minimum distance concept applied to ROC curve estimation by Hsieh and Turnbull (1996) and recently revisited by Davidov and Nov (2012). In the original MDE procedures one minimizes some distance measure between the binormal ROC curve, characterized by the two parameters μ and σ, and the empirical ROC curve. In our methods we propose to replace the estimator $ROC_{mn}$, which is not continuous and not very accurate for small sample sizes, with other nonparametric estimators of the ROC curve. Procedures involving kernel, Bayesian bootstrap and smoothed ROC curve estimators were considered. Moreover, for estimators based on the Davidov and Nov (2012) approach, the role of appropriate integration limits was emphasized. The small-sample performance of the proposed estimators was investigated numerically and compared with the original procedures of Davidov and Nov (2012) and Hsieh and Turnbull (1996). The biggest improvement, both in terms of parameter accuracy and MISE, was observed for estimators based on the smoothed nonparametric ROC curve estimator $ROC^{S}_{mn}$ (see Sect. 2.4.2). For samples of small sizes, we observed that replacing $ROC_{mn}$ with $ROC^{S}_{mn}$ in minimum distance procedures can reduce the MSE of the estimators of the parameters μ and σ by up to an order of magnitude, and by a factor of 4.5 on average. The goodness of fit of the estimator of the ROC curve to the true ROC curve is also improved, as indicated by a lower mean integrated square error. Employing the BB estimator improves the performance of the MDE's to a lesser degree, while using the kernel estimator sometimes leads to even less accurate semiparametric ROC curve estimates.
In future research we are going to examine the asymptotic equivalence of the estimators considered. In particular, the asymptotic properties of the DNS and HTS estimators need further investigation, since these methods clearly outperform the others. In fact, the smoothed nonparametric estimator of the ROC curve, introduced by Jokiel-Rokita and Pulit (2013), seems to be a very promising method, and a theoretical investigation of its asymptotic properties is of interest to us. We are also going to study the robustness of the considered estimators to model misspecification.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix
Proof of Theorem 1. Let $S_i$, i = 1, ..., 4, denote the counterparts of $\hat S_i$, obtained by substituting ROC(t) for $ROC^{BB}_{mn}(t)$, and the values 0 and 1 for the lower and the upper integration limit, respectively, i.e., for example $S_1 = \int_0^1 \Phi^{-1}(ROC(t))\,dt$.
The convergence $\hat S_3 \to S_3$ and $\hat S_4 \to S_4$ in probability, as n → ∞, can be easily derived from Lemma 1. We will show that $\hat S_1 \to S_1$ in probability. In a very similar fashion one can show that $\hat S_2 \to S_2$ in probability; hence, by definition (20) and the Continuous Mapping Theorem, the theorem will be proved.
From Lemma 1, the coefficient $1/(b_m - a_m)$ converges to 1 a.s., therefore it can be omitted. The remaining difference is bounded by an inequality whose right-hand side consists of two terms. The second term converges to 0 a.s., as indicated in Lemma 1, hence it also converges to 0 in probability. Therefore it remains to show that the first term converges to 0 in probability. Following the arguments of the original paper of Davidov and Nov (2012), we will show that although $\dot\Phi^{-1}(ROC(a_m))$ converges to ∞ as m increases, it converges to 0 after being multiplied by $1/\sqrt{m}$; the corresponding proof for $\dot\Phi^{-1}(ROC(b_m))$ is very similar and hence is omitted.
Let $a^{(b)}_m = \inf\{t \in [0, 1] : ROC^{(b)}_{mn}(t) > 0\}$; the lower integration limit $a_m$, defined by (19), is then related to the $a^{(b)}_m$ through (28). The definition of $a^{(b)}_m$ may be equivalently written in the form (29), which consists of three bracketed terms and involves $F_m$, the empirical distribution function based on $X_1, \ldots, X_m$. As in the proof of Theorem 1 in Davidov and Nov (2012), we can show that the rate of convergence of the first term of (29) is $\Omega_P(1/\sqrt{m})$. The notation $\Omega_P$ is the equivalent of $O_P$ for an asymptotic lower bound, i.e., $Q_n = \Omega_P(R_n)$ if $R_n/Q_n$ is bounded in probability. By the Dvoretzky-Kiefer-Wolfowitz inequality, the term in the second bracket in (29) converges in probability to 0 exponentially. We will show that the expression in the third bracket in (29) converges in probability to 0 faster than 1/m. By (17) and the properties of the empirical distribution function, the corresponding probability can be bounded by a probability involving the uniform order statistics $U_{(k)}$. Since $U_{(k)}$ is the k-th order statistic from the uniform distribution U(0, 1), it has the beta distribution B(k, m − k) with expected value equal to k/m. A suitably tight upper bound for the last probability can be obtained using an inequality given in Mitzenmacher and Upfal (2005, p. 59). Combining this bound with (28), we conclude that $a_m = O_P(1/\sqrt{m})$.
Using the same approach as Davidov and Nov (2012) in their proof of Theorem 1, we can show that $\dot\Phi^{-1}(ROC(a_m)) = o_P(\sqrt{m})$, which completes the proof that $\hat S_1 \to S_1$ in probability, as n → ∞, and thus the theorem is proved.