Predicting binary outcomes based on the pair-copula construction

Empirical Economics

Abstract

We develop a new econometric model for the purpose of predicting binary outcomes based on an ensemble of predictors. The method uses the pair-copula construction (PCC) to optimally combine diverse information. As a building block of PCC, the conditional copula is permitted to depend on the conditioning variable in a nonparametric way. This is the major methodological departure from our previous work. We apply this methodology to predict US business cycle peaks 6 months ahead based on the three prominent leading indicators currently used by The Conference Board. In terms of the predictive accuracy as measured by the receiver operating characteristic curve, the proposed scheme is found to do well in comparison with some popular combination models. We have also evaluated the probability forecasts generated from these models using a battery of diagnostic tools, each of which reveals different aspects of skill of the generated forecasts.

Notes

  1. Recently, Ferrara et al. (2015) conducted an extensive analysis comparing the forecasting ability of linear and nonlinear models during the Great Recession.

  2. This implies that we can estimate the logit model (with or without quadratic terms) by maximum likelihood to obtain the optimal rule. Since the ROC curve is invariant to strictly increasing transformations of the score, including the logit transformation, an even simpler combination rule is given by the linear index of the estimated logit model. However, the coefficients in the linear index must be estimated by maximum likelihood, rather than by OLS as is common in a typical linear probability model. We thank Robin Sickles for suggesting this interpretation.

  3. General introductions to modeling strategies based on copulas are given by Joe (1997), Nelsen (2006), Patton (2012), and Trivedi and Zimmer (2005). Copulas have been applied successfully to predict multiple economic events in Anatolyev (2009), Patton (2006, 2013) and Scotti (2011). Using copulas, Smith (2008) specified a joint distribution for normal and half-normal random variables in which the two variables are allowed to be dependent; Amsler et al. (2014, 2016) used copulas to model dependence between asymmetric errors over time in panel data and under endogeneity. Amsler and Schmidt (2021) and Patton and Fan (2014) provided two high-quality surveys on copula methods from different econometric perspectives.

  4. The idea of PCC goes back to Joe (1996). Bedford and Cooke (2001, 2002) introduced a graphical device, called the regular vine, to categorize different decompositions of a multivariate distribution. The general formula for the density of a PCC for n variables and concrete examples of various vine structures can be found in Aas et al. (2009), which also proposed a sequential estimation method for PCC models. However, this methodology has not received much attention in econometrics to date. Exceptions include Brechmann and Schepsmeier (2013), Kielmann et al. (2022), Min and Czado (2010), and Zimmer (2015a, 2015b).

  5. It is worth noting that even PCCs relaxing the simplifying assumption are not dense in the space of multivariate copula functions. Although a multivariate distribution can be decomposed as in (12), the functional form of the conditional copula \(c_{13|2}\{F_{1|2}(x_1|x_2),F_{3|2}(x_3|x_2);x_2\}\), not just its parameter \(\theta \), may depend on \(x_2\). For example, \(c_{13|2}\) may be a Gaussian copula when \(x_2=1\) while it may become a Gumbel copula when \(x_2=-1\). A limitation of our PCC model lies in the restriction that \(c_{13|2}\) has to belong to the same parametric family regardless of the value of \(x_2\). To the best of our knowledge, there are no theoretical results allowing \(c_{13|2}\) to change from one family to another as \(x_2\) varies.

  6. The same technique is used in nonparametric local linear regression (Fan and Gijbels 1996).

  7. The maximizer of (13) would change if a different value of \(\lambda _{T}\) were used.

  8. Among the ten leading indicators proposed by TCB, these are the three indicators with the most enduring performance in predicting economic recessions. A preliminary analysis (available upon request) demonstrates that most of the relevant information contained in the ten indicators is captured by the chosen three indicators, yielding almost identical ROC curves. See also Lahiri and Yang (2015b). In the subsequent analysis, these three predictors will be abbreviated as Confidence, Spread and ISM_Diffu, respectively.

  9. Recently, Salgado et al. (2019) found that economic recessions are accompanied by negative third-moment (skewness) shocks, leading to a left tail of large negative outcomes. As is well known, the usual normal distribution cannot accommodate this important feature.

  10. To select the best copula for \(c_{13|2}\), we first generate \(({\hat{U}}_t,{\hat{V}}_t)\) using the procedure in Sect. 3.2.4. The chosen copula is the one that minimizes the AIC.

  11. Kendall’s \(\tau \) measures the rank correlation between two random variables. Following Nelsen (2006), it can be expressed in terms of the copula function as \(\tau =4\int \int C(u,v)dC(u,v)-1\).

  12. We have tried two alternative specifications with each of the other two predictors (Confidence and ISM_Diffu) as the conditioning variable in PCC. The resulting ROC curves are akin to that in Fig. 3. To save space, we omit these results here, although they are available from the authors.

  13. We have tried the probit model instead. The difference between the logit and probit models is negligible.

  14. The ROC curve of K&K nearly overlaps with that of the logit model and thus is omitted from Fig. 3.

  15. For brevity, we only report results for the PCC model and the logit model in Tables 3, 4 and 5.
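
The invariance invoked in Note 2 holds because an ROC curve depends only on how the scores rank the observations, so the linear index and the fitted probability trace out the same curve. The following is a minimal sketch on synthetic data, not the authors' code: the coefficients in `beta`, the intercept, the sample size, and the helper names `roc_points` and `auc` are all illustrative assumptions, and the true index is used in place of a maximum likelihood estimate.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (false alarm rates, hit rates) as the threshold sweeps from high to low."""
    order = np.argsort(-scores, kind="stable")   # highest score first
    y = labels[order]
    hits = np.cumsum(y) / y.sum()                # share of events flagged so far
    false_alarms = np.cumsum(1 - y) / (1 - y).sum()  # share of non-events flagged so far
    return false_alarms, hits

def auc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    fa, hit = roc_points(scores, labels)
    fa = np.concatenate(([0.0], fa))             # start the curve at the origin
    hit = np.concatenate(([0.0], hit))
    return float(np.sum(np.diff(fa) * (hit[1:] + hit[:-1]) / 2))

# Synthetic data: three predictors and a hypothetical logit data-generating process.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
beta = np.array([1.0, -0.5, 0.8])                # hypothetical coefficients
index = -1.0 + x @ beta                          # linear index
prob = 1.0 / (1.0 + np.exp(-index))              # logistic transformation of the index
z = (rng.uniform(size=500) < prob).astype(int)   # binary outcome

auc_index = auc(index, z)
auc_prob = auc(prob, z)
print(f"AUC from linear index: {auc_index:.4f}")
print(f"AUC from probability:  {auc_prob:.4f}")
```

Because the logistic function is strictly increasing, both scores sort the sample identically and the two areas coincide. The sketch says nothing about how the index is estimated, which is where the maximum likelihood requirement in Note 2 applies.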

References

  • Aas K, Czado C, Frigessi A, Bakken H (2009) Pair-copula constructions of multiple dependence. Insur Math Econ 44:182–198

  • Acar EF, Craiu RV, Yao F (2011) Dependence calibration in conditional copulas: a nonparametric approach. Biometrics 67:445–453

  • Acar EF, Genest C, Nešlehová J (2012) Beyond simplified pair-copula constructions. J Multivar Anal 110:74–90

  • Acar EF, Czado C, Lysy M (2019) Flexible dynamic vine copula models for multivariate time series data. Econom Stat 12:181–197

  • Amsler C, Schmidt P (2021) A survey of the use of copulas in stochastic frontier models. In: Parmeter CF, Sickles RC (eds) Advances in efficiency and productivity analysis. Springer, pp 125–138

  • Amsler C, Prokhorov A, Schmidt P (2014) Using copulas to model time dependence in stochastic frontier models. Econom Rev 33:497–522

  • Amsler C, Prokhorov A, Schmidt P (2016) Endogeneity in stochastic frontier models. J Econom 190:280–288

  • Anatolyev S (2009) Multi-market direction-of-change modeling using dependence ratios. Stud Nonlinear Dyn Econom 13, Article 5

  • Ang A, Bekaert G (2002) International asset allocation with regime shifts. Rev Financ Stud 15:1137–1187

  • Bates JM, Granger CWJ (1969) The combination of forecasts. Oper Res Q 20:451–468

  • Bedford T, Cooke RM (2001) Probability density decomposition for conditionally dependent random variables modeled by vines. Ann Math Artif Intell 32:245–268

  • Bedford T, Cooke RM (2002) Vines—a new graphical model for dependent random variables. Ann Stat 30:1031–1068

  • Berg D, Aas K (2009) Models for construction of higher-dimensional dependence: a comparison study. Eur J Finance 15:639–659

  • Brechmann EC, Schepsmeier U (2013) Modeling dependence with C- and D-vine copulas: the R package CDVine. J Stat Softw 52:1–27

  • Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78:1–3

  • Brown BM, Wang YG (2005) Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92:149–158

  • Chauvet M, Potter S (2005) Forecasting recessions using the yield curve. J Forecast 24:77–103

  • Chen X, Fan Y (2006) Estimation and model selection of semiparametric copula-based multivariate dynamic models under copula misspecification. J Econom 135:125–154

  • Chen X, Fan Y (2007) A model selection test for bivariate failure-time data. Econom Theor 23:414–439

  • Clayton D (1978) A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65:141–151

  • Czado C (2010) Pair-copula constructions of multivariate copulas. In: Jaworski P, Durante F, Härdle WK, Rychlik T (eds) Copula theory and its applications. Springer

  • Czado C, Nagler T (2022) Vine copula based modeling. Annu Rev Stat Appl 9:453–477

  • Dawid AP (1984) Present position and potential developments: some personal views: statistical theory: the prequential approach. J R Stat Soc Ser A 147:278–292

  • Drechsel K, Scheufele R (2012) The performance of short-term forecasts of the German economy before and during the 2008/2009 recession. Int J Forecast 28:428–445

  • Elliott G, Timmermann A (2016) Economic forecasting. Princeton University Press

  • Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman & Hall

  • Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874

  • Fermanian J (2005) Goodness-of-fit tests for copulas. J Multivar Anal 95:119–152

  • Ferrara L, Marcellino M, Mogliani M (2015) Macroeconomic forecasting during the great recession: the return of non-linearity? Int J Forecast 31:664–679

  • Genest C, Rémillard B, Beaudoin D (2009) Goodness-of-fit tests for copulas: a review and a power study. Insur Math Econ 44:199–213

  • Granger CWJ, Ramanathan R (1984) Improved methods of combining forecasts. J Forecast 3:197–204

  • Haff IH, Aas K, Frigessi A (2010) On the simplified pair-copula construction-simply useful or too simplistic? J Multivar Anal 101:1296–1310

  • Han AK (1987) Non-parametric analysis of a generalized regression model. J Econom 35:303–316

  • Hao L, Ng ECY (2011) Predicting Canadian recessions using dynamic probit modelling approaches. Can J Econ 44:1297–1330

  • Harding D, Pagan A (2011) An econometric analysis of some models for constructed binary time series. J Bus Econ Stat 29:86–95

  • Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning: data mining, inference, and prediction. Springer

  • Hogg RV, McKean JW, Craig AT (2012) Introduction to mathematical statistics. Prentice Hall

  • Hsiao C, Wan SK (2014) Is there an optimal forecast combination? J Econom 178:294–309

  • Jafarzadeh SR, Johnson WO, Gardner IA (2016) Bayesian modeling and inference for diagnostic accuracy and probability of disease based on multiple diagnostic biomarkers with and without a perfect reference standard. Stat Med 35:859–876

  • Joe H (1993) Parametric families of multivariate distributions with given marginals. J Multivar Anal 46:262–282

  • Joe H (1996) Families of m-variate distributions with given margins and \(m(m-1)/2\) bivariate dependence parameters. In: Rüschendorf L, Schweizer B, Taylor MD (eds) Distributions with fixed marginals and related topics. Institute of Mathematical Statistics

  • Joe H (1997) Multivariate models and dependence concepts. Chapman & Hall

  • Kamstra M, Kennedy P (1998) Combining qualitative forecasts using logit. Int J Forecast 14:83–93

  • Kauppi H, Saikkonen P (2008) Predicting U.S. recessions with dynamic binary response models. Rev Econ Stat 90:777–791

  • Kielmann J, Manner H, Min A (2022) Stock market returns and oil price shocks: a CoVaR analysis based on dynamic vine copula models. Empir Econ 62:1543–1574

  • Klugman SA, Parsa R (1999) Fitting bivariate loss distributions with copulas. Insur Math Econ 24:139–148

  • Krzanowski WJ, Hand DJ (2009) ROC curves for continuous data. Chapman & Hall

  • Lahiri K, Wang JG (2013) Evaluating probability forecasts for GDP declines using alternative methodologies. Int J Forecast 29:175–190

  • Lahiri K, Yang L (2013) Forecasting binary outcomes. In: Timmermann A, Elliott G (eds) Handbook of economic forecasting volume 2B. North-Holland, Amsterdam, pp 1025–1106

  • Lahiri K, Yang L (2015a) A nonlinear forecast combination procedure for binary outcomes. Stud Nonlinear Dyn Econom 29:175–190

  • Lahiri K, Yang L (2015b) Further analysis of the conference board’s new Leading Economic Index. Int J Forecast 31:446–453

  • Lahiri K, Yang L (2016) Asymptotic variance of Brier (skill) score in the presence of serial correlation. Econ Lett 141:125–129

  • Lahiri K, Yang L (2018) Confidence bands for ROC curves with serially dependent data. J Bus Econ Stat 36:115–130

  • Lahiri K, Yang L (2021) Construction of Leading Economic Index for recession prediction using vine copulas. Stud Nonlinear Dyn Econom 25:193–212

  • Levanon G, Manini J, Ozyildirim A, Schaitkin B, Tanchua J (2015) Using financial indicators to predict turning points in the business cycle: the case of the Leading Economic Index for the United States. Int J Forecast 31:426–445

  • Li D (2000) On default correlation: a copula function approach. J Fixed Income 9:43–54

  • Li F, Kang Y (2018) Improving forecasting performance using covariate-dependent copula models. Int J Forecast 34:456–476

  • Li Q, Racine JS (2008) Nonparametric econometrics: theory and practice. Princeton University Press

  • Li F, Villani M, Kohn R (2010) Flexible modeling of conditional distributions using smooth mixtures of asymmetric student-t densities. J Stat Plan Inference 140:3638–3654

  • Lin H, Zhou L, Peng H, Zhou X (2011) Selection and combination of biomarkers using ROC method for disease classification and prediction. Can J Stat 39:324–343

  • Longin F, Solnik B (2001) Extreme correlation of international equity markets. J Finance 56:649–676

  • McIntosh MW, Pepe MS (2002) Combining several screening tests: optimality of the risk score. Biometrics 58:657–664

  • Min A, Czado C (2010) Bayesian inference for multivariate copulas using pair-copula constructions. J Financ Econom 8:511–546

  • Murphy AH (1972) Scalar and vector partitions of the probability score: part I. Two-state situation. J Appl Meteorol 11:273–282

  • Nelsen RB (2006) An introduction to copulas. Springer

  • Patton AJ (2006) Modelling asymmetric exchange rate dependence. Int Econ Rev 47:527–556

  • Patton AJ (2012) A review of copula models for economic time series. J Multivar Anal 110:4–18

  • Patton AJ (2013) Copula methods for forecasting multivariate time series. In: Timmermann A, Elliott G (eds) Handbook of economic forecasting volume 2B. North-Holland, Amsterdam, pp 899–960

  • Patton AJ, Fan Y (2014) Copulas in econometrics. Annu Rev Econ 6:179–200

  • Pepe MS, Cai T, Longton G (2006) Combining predictors for classification using the area under the receiver operating characteristic curve. Biometrics 62:221–229

  • Prokhorov A, Schmidt P (2009) Likelihood-based estimation in a panel setting: robustness, redundancy and validity of copulas. J Econom 153:93–104

  • Ranjan R, Gneiting T (2010) Combining probability forecasts. J R Stat Soc B 72:71–91

  • Salgado S, Bloom N, Guvenen F (2019) Skewed business cycles. NBER Working Paper No. 26565

  • Scotti C (2011) A bivariate model of Federal Reserve and ECB main policy rates. Int J Cent Bank 7:37–78

  • Seillier-Moiseiwitsch F, Dawid AP (1993) On testing the validity of sequential probability forecasts. J Am Stat Assoc 88:355–359

  • Sklar A (1973) Random variables, joint distributions, and copulas. Kybernetika 9:449–460

  • Smith MD (2008) Stochastic frontier models with dependent error components. Econom J 11:172–192

  • Stephenson DB (2000) Use of the ‘odds ratio’ for diagnosing forecast skill. Weather Forecast 15:221–232

  • Stöber J, Joe H, Czado C (2013) Simplified pair copula constructions-limitations and extensions. J Multivar Anal 119:101–118

  • Stock JH, Watson MW (1993) A procedure for predicting recessions with leading indicators: econometric issues and recent experience. In: Stock JH, Watson MW (eds) New research on business cycles, indicators and forecasting. University of Chicago Press, pp 95–156

  • Timmermann A (2006) Forecast combinations. In: Elliott G, Granger CWJ, Timmermann A (eds) Handbook of economic forecasting. North-Holland, Amsterdam, pp 135–196

  • Trivedi PK, Zimmer DM (2005) Copula modeling: an introduction for practitioners. Found Trends Econom 1:1–111

  • Yates JF (1982) External correspondence: decompositions of the mean probability score. Organ Behav Hum Perform 30:132–156

  • Yates JF, Curley SP (1985) Conditional distribution analysis of probabilistic forecasts. J Forecast 4:61–73

  • Youden WJ (1950) Index for rating diagnostic tests. Cancer 3:32–35

  • Zimmer DM (2015a) Asymmetric dependence in house prices: evidence from USA and international data. Empir Econ 49:161–183

  • Zimmer DM (2015b) Analyzing comovements in housing prices using vine copulas. Econ Inq 53:1156–1169

Author information

Corresponding author

Correspondence to Liu Yang.

Ethics declarations

Conflict of interest

Authors Kajal Lahiri and Liu Yang declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

We thank the three anonymous referees for many helpful comments. However, any remaining errors are our responsibility.

Mathematical Appendix

Proof of Theorem 1

The proof follows directly from Theorem 8.1.1 in Hogg et al. (2012). Given w satisfying \(FA=P(\Gamma (X)>w|Z=0)\), define

$$\begin{aligned} R_{\Gamma }\equiv \{x\in R^k:\Gamma (x)>w\} \end{aligned}$$
(17)

and

$$\begin{aligned} R_{\eta }\equiv \{x\in R^k:\eta (x)=1\}. \end{aligned}$$

By assumption, we have

$$\begin{aligned} P(X\in R_{\Gamma }|Z=0)=P(X\in R_{\eta }|Z=0)=FA. \end{aligned}$$
(18)

It is straightforward to see that

$$\begin{aligned} \begin{aligned} P(X\in R_{\Gamma }|Z=0)&=P(X\in R_{\Gamma }\cap R_{\eta }|Z=0)+P(X\in R_{\Gamma }\cap R_{\eta }^c|Z=0)\\ P(X\in R_{\eta }|Z=0)&=P(X\in R_{\eta }\cap R_{\Gamma }|Z=0)+P(X\in R_{\eta }\cap R_{\Gamma }^c|Z=0). \end{aligned} \end{aligned}$$
(19)

Combining (18) and (19) yields

$$\begin{aligned} P(X\in R_{\Gamma }\cap R_{\eta }^c|Z=0)=P(X\in R_{\eta }\cap R_{\Gamma }^c|Z=0). \end{aligned}$$
(20)

Our objective is to prove \(P(\Gamma (X)>w|Z=1)\ge P(\eta (X)=1|Z=1)\), which is equivalent to

$$\begin{aligned} P(X\in R_{\Gamma }|Z=1)\ge P(X\in R_{\eta }|Z=1). \end{aligned}$$
(21)

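Decomposing both sides of (21) as in (19), with \(Z=1\) in place of \(Z=0\), and cancelling the common term \(P(X\in R_{\Gamma }\cap R_{\eta }|Z=1)\), (21) amounts to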

$$\begin{aligned} P(X\in R_{\Gamma }\cap R_{\eta }^c|Z=1)\ge P(X\in R_{\eta }\cap R_{\Gamma }^c|Z=1), \end{aligned}$$

which holds since

$$\begin{aligned} \begin{aligned}&P(X\in R_{\Gamma }\cap R_{\eta }^c|Z=1)=\int _{R_{\Gamma }\cap R_{\eta }^c}f(x|z=1)dx\ge w\int _{R_{\Gamma }\cap R_{\eta }^c}f(x|z=0)dx\\&\quad =wP(X\in R_{\Gamma }\cap R_{\eta }^c|Z=0)=wP(X\in R_{\eta }\cap R_{\Gamma }^c|Z=0)\\ {}&\quad =w\int _{R_{\eta }\cap R_{\Gamma }^c}f(x|z=0)dx\\&\quad \ge \int _{R_{\eta }\cap R_{\Gamma }^c}f(x|z=1)dx=P(X\in R_{\eta }\cap R_{\Gamma }^c|Z=1), \end{aligned} \end{aligned}$$

where the two inequalities follow from the definition of \(R_{\Gamma }\) in (17), and the equality between the two probabilities conditional on \(Z=0\) follows from (20). \(\square \)
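
The dominance established in the proof can be checked numerically in a simple case. The sketch below is illustrative only and is not taken from the paper: it assumes a hypothetical bivariate Gaussian design, with \(X\sim N(0,I)\) under \(Z=0\) and \(X\sim N(\mu ,I)\) under \(Z=1\) for an arbitrarily chosen \(\mu =(1,0.5)\), and it takes \(\Gamma \) to be the likelihood ratio \(f(x|z=1)/f(x|z=0)\), as the inequalities in the proof suggest. The competing rule \(\eta \), which thresholds only the first coordinate, and the function names are likewise assumptions made for the example.

```python
from math import hypot
from statistics import NormalDist

STD_NORMAL = NormalDist()
MU = (1.0, 0.5)  # hypothetical mean of X under Z=1; X ~ N(0, I) under Z=0

def hit_rate_lr(fa):
    """Hit rate of the rule Gamma(X) > w calibrated to false alarm rate fa.

    In this design Gamma(x) = exp(mu'x - mu'mu/2) is increasing in mu'x, so the
    rule thresholds mu'x, which is N(0, |mu|^2) under Z=0 and
    N(|mu|^2, |mu|^2) under Z=1.
    """
    norm_mu = hypot(*MU)
    cutoff = STD_NORMAL.inv_cdf(1.0 - fa)   # standardized cutoff that delivers fa
    return 1.0 - STD_NORMAL.cdf(cutoff - norm_mu)

def hit_rate_first_only(fa):
    """Hit rate of the alternative rule eta(x) = 1{x_1 > c} at the same fa."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - fa)   # x_1 is N(0, 1) under Z=0
    return 1.0 - STD_NORMAL.cdf(cutoff - MU[0])  # x_1 is N(mu_1, 1) under Z=1

# Compare the two rules at several common false alarm rates.
for fa in (0.05, 0.10, 0.25, 0.50):
    print(f"FA={fa:.2f}  LR hit={hit_rate_lr(fa):.4f}  "
          f"alt hit={hit_rate_first_only(fa):.4f}")
```

At every false alarm rate the likelihood-ratio rule attains a hit rate at least as high as the single-coordinate rule, which is inequality (21) in this special case. Closed-form normal probabilities are used instead of simulation so that the comparison is free of sampling noise.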

About this article

Cite this article

Lahiri, K., Yang, L. Predicting binary outcomes based on the pair-copula construction. Empir Econ 64, 3089–3119 (2023). https://doi.org/10.1007/s00181-023-02418-6

  • DOI: https://doi.org/10.1007/s00181-023-02418-6
