## Abstract

We develop a new econometric model for the purpose of predicting binary outcomes based on an ensemble of predictors. The method uses the pair-copula construction (PCC) to optimally combine diverse information. As a building block of PCC, the conditional copula is permitted to depend on the conditioning variable in a nonparametric way. This is the major methodological departure from our previous work. We apply this methodology to predict US business cycle peaks 6 months ahead based on the three prominent leading indicators currently used by The Conference Board. In terms of the predictive accuracy as measured by the receiver operating characteristic curve, the proposed scheme is found to do well in comparison with some popular combination models. We have also evaluated the probability forecasts generated from these models using a battery of diagnostic tools, each of which reveals different aspects of skill of the generated forecasts.

## Notes

Recently, Ferrara et al. (2015) conducted an extensive analysis to compare the forecasting ability of linear *vs* nonlinear models during the *Great Recession* episode.

This implies that we can estimate the logit model (with or without quadratic terms) by maximum likelihood to get the optimal rule. Since the ROC curve is invariant to the logit transformation, an even simpler combination rule is given by the linear index of the estimated logit model. However, the coefficients in the linear index must be estimated by maximum likelihood, rather than by OLS as we often do in a typical linear probability model. We thank Robin Sickles for suggesting this interpretation.
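The invariance claim in this note can be checked numerically. The following is a minimal sketch, not the authors' code: the index values and outcomes are made-up illustrative numbers, and `auc` and `logistic` are hypothetical helper names. Because the logistic function is strictly increasing, ranking observations by the linear index or by the implied probability gives the same ROC curve and hence the same area under it.

```python
import math

def auc(scores, labels):
    """Empirical AUC: share of (event, non-event) pairs ranked correctly (ties count 1/2)."""
    pos = [s for s, z in zip(scores, labels) if z == 1]
    neg = [s for s, z in zip(scores, labels) if z == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def logistic(index):
    """Logit link: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical values of the linear index and binary outcomes, for illustration only.
index = [-2.1, -0.4, 0.3, 1.7, -1.2, 0.9, 2.5, -0.1]
z = [0, 0, 1, 1, 0, 0, 1, 1]
prob = [logistic(v) for v in index]

auc_index = auc(index, z)  # ranking by the linear index
auc_prob = auc(prob, z)    # ranking by the logit probability
print(auc_index, auc_prob)
```

Both calls return the same value because only the ordering of the scores enters the ROC curve.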

A general introduction to the modeling strategies based on copulas was given by Joe (1997), Nelsen (2006), Patton (2012), and Trivedi and Zimmer (2005). Copulas have been applied successfully to predict multiple economic events in Anatolyev (2009), Patton (2006, 2013) and Scotti (2011). Using copulas, Smith (2008) specified a joint distribution for normal and half normal distributions where the two random variables were allowed to be dependent; Amsler et al. (2014, 2016) used copulas to model dependence between asymmetric errors over time in panel and under endogeneity. Amsler and Schmidt (2021) and Patton and Fan (2014) provided two high-quality surveys on copula methods from different econometric perspectives.

The idea of PCC goes back to Joe (1996). Bedford and Cooke (2001, 2002) introduced a graphic device, called *regular vine*, to categorize different decompositions of a multivariate distribution. The general formula of the density of PCC for *n* variables and concrete examples of various vine structures can be found in Aas et al. (2009), which also proposed a sequential estimation method for PCC models. However, this methodology has not received much attention in econometrics to date. Exceptions include Brechmann and Schepsmeier (2013), Kielmann et al. (2022), Min and Czado (2010), and Zimmer (2015a, 2015b).

It is worth noting that even PCCs relaxing the simplifying assumption are not *dense* in the space of multivariate copula functions. Although a multivariate distribution can be decomposed as in (12), the functional form of the conditional copula \(c_{13|2}\{F_{1|2}(x_1|x_2),F_{3|2}(x_3|x_2);x_2\}\), not just its parameter \(\theta \), may depend on \(x_2\). For example, \(c_{13|2}\) may be a Gaussian copula when \(x_2=1\) while it may become a Gumbel copula when \(x_2=-1\). A limitation of our PCC model lies in the restriction that \(c_{13|2}\) has to belong to the same parametric family regardless of the value of \(x_2\). To the best of our knowledge, there are no theoretical results allowing \(c_{13|2}\) to change from one family to another as \(x_2\) varies.

The same technique is utilized in the nonparametric local linear regression (Fan and Gijbels 1996).

The maximizer of (13) would change if a different value of \(\lambda _{T}\) is used.

Among the ten leading indicators proposed by TCB, these are the three indicators with the most enduring performance in predicting economic recessions. A preliminary analysis (available upon request) demonstrates that most of the relevant information contained in the ten indicators is captured by the chosen three indicators, yielding almost identical ROC curves. See also Lahiri and Yang (2015b). In the subsequent analysis, these three predictors are abbreviated as Confidence, Spread and ISM_Diffu, respectively.

Recently, Salgado et al. (2019) found that economic recessions are accompanied by negative third-moment (skewness) shocks, leading to a left tail of large negative outcomes. As is well known, the usual normal distribution cannot accommodate this important feature.

To select the best copula for \(c_{13|2}\), we first generate \(({\hat{U}}_t,{\hat{V}}_t)\) using the procedure in Sect. 3.2.4. The chosen copula is the one minimizing AIC.
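AIC-based selection between candidate families can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Sect. 3.2.4: the pseudo-observations are simulated from a Clayton copula rather than generated from estimated conditional distributions, only two candidate families are compared, the copula parameter is held constant instead of varying with the conditioning variable, and the likelihood is maximized by a crude grid search. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

def simulate_clayton(n, theta, seed):
    """Draw n pairs from a Clayton copula by conditional inversion."""
    rng = random.Random(seed)
    clamp = lambda p: min(max(p, 1e-9), 1.0 - 1e-9)
    u, v = [], []
    for _ in range(n):
        a, w = clamp(rng.random()), clamp(rng.random())
        b = ((w ** (-theta / (1.0 + theta)) - 1.0) * a ** (-theta) + 1.0) ** (-1.0 / theta)
        u.append(a)
        v.append(clamp(b))
    return u, v

def clayton_loglik(theta, u, v):
    """Log-likelihood of the Clayton copula density, theta > 0."""
    total = 0.0
    for a, b in zip(u, v):
        s = a ** (-theta) + b ** (-theta) - 1.0
        total += (math.log1p(theta) - (1.0 + theta) * (math.log(a) + math.log(b))
                  - (2.0 + 1.0 / theta) * math.log(s))
    return total

def gaussian_loglik(rho, x, y):
    """Log-likelihood of the Gaussian copula density; x, y are normal scores."""
    d = 1.0 - rho * rho
    return sum(-0.5 * math.log(d)
               - (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * d)
               for a, b in zip(x, y))

def select_copula(u, v):
    """Return the AIC of each candidate family and the name of the minimizer."""
    inv = NormalDist().inv_cdf
    x, y = [inv(a) for a in u], [inv(b) for b in v]
    ll_clayton = max(clayton_loglik(0.05 * k, u, v) for k in range(1, 201))
    ll_gauss = max(gaussian_loglik(0.01 * k, x, y) for k in range(-99, 100))
    # Each family has one parameter, so AIC = 2 * 1 - 2 * loglik.
    aic = {"Clayton": 2.0 - 2.0 * ll_clayton, "Gaussian": 2.0 - 2.0 * ll_gauss}
    return aic, min(aic, key=aic.get)

u_hat, v_hat = simulate_clayton(500, theta=2.0, seed=1)
aic, best = select_copula(u_hat, v_hat)
print(aic, best)
```

In practice the candidate set would be larger and a numerical optimizer would replace the grid; the selection step itself remains a comparison of AIC values across the fitted families.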

Kendall’s \(\tau \) describes rank correlation between two random variables. In view of Nelsen (2006), it can be expressed in terms of the copula function \(C\) as \(\tau =4\int _0^1\int _0^1 C(u,v)\,dC(u,v)-1\).
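As a quick check of the formula, the independence copula \(C(u,v)=uv\) gives \(4\cdot \tfrac{1}{4}-1=0\). The sample counterpart of \(\tau \) counts concordant and discordant pairs; the sketch below uses made-up data and a hypothetical helper `kendall_tau` (no tie correction) to show that the statistic depends only on ranks, which is why it can be written in terms of the copula alone.

```python
import math

def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / total number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return 2.0 * s / (n * (n - 1))

# Illustrative data: 8 concordant and 2 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
tau = kendall_tau(x, y)

# Strictly increasing transformations of the margins leave tau unchanged.
tau_transformed = kendall_tau([math.exp(a) for a in x], [b ** 3 for b in y])
print(tau, tau_transformed)
```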

We have tried two alternative specifications with each of the other two predictors (Confidence and ISM_Diffu) as the conditioning variable in PCC. The resulting ROC curves are akin to that in Fig. 3. To save space, we omit these results here, although they are available from the authors.

We have tried the probit model instead. The difference between the logit and probit models is negligible.

The ROC curve of K&K nearly overlaps with that of the logit model and thus is omitted from Fig. 3.

## References

Aas K, Czado C, Frigessi A, Bakken H (2009) Pair-copula constructions of multiple dependence. Insur Math Econ 44:182–198

Acar EF, Craiu RV, Yao F (2011) Dependence calibration in conditional copulas: a nonparametric approach. Biometrics 67:445–453

Acar EF, Genest C, Nešlehová J (2012) Beyond simplified pair-copula constructions. J Multivar Anal 110:74–90

Acar EF, Czado C, Lysy M (2019) Flexible dynamic vine copula models for multivariate time series data. Econom Stat 12:181–197

Amsler C, Schmidt P (2021) A survey of the use of copulas in stochastic frontier models. In: Parmeter CF, Sickles RC (eds) Advances in efficiency and productivity analysis. Springer, pp 125–138

Amsler C, Prokhorov A, Schmidt P (2014) Using copulas to model time dependence in stochastic frontier models. Econom Rev 33:497–522

Amsler C, Prokhorov A, Schmidt P (2016) Endogeneity in stochastic frontier models. J Econom 190:280–288

Anatolyev S (2009) Multi-market direction-of-change modeling using dependence ratios. Stud Nonlinear Dyn Econom 13, Article 5

Ang A, Bekaert G (2002) International asset allocation with regime shifts. Rev Financ Stud 15:1137–1187

Bates JM, Granger CWJ (1969) The combination of forecasts. Oper Res Q 20:451–468

Bedford T, Cooke RM (2001) Probability density decomposition for conditionally dependent random variables modeled by vines. Ann Math Artif Intell 32:245–268

Bedford T, Cooke RM (2002) Vines—a new graphical model for dependent random variables. Ann Stat 30:1031–1068

Berg D, Aas K (2009) Models for construction of higher-dimensional dependence: a comparison study. Eur J Finance 15:639–659

Brechmann EC, Schepsmeier U (2013) Modeling dependence with C- and D-vine copulas: the R package CDVine. J Stat Softw 52:1–27

Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78:1–3

Brown BM, Wang YG (2005) Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92:149–158

Chauvet M, Potter S (2005) Forecasting recessions using the yield curve. J Forecast 24:77–103

Chen X, Fan Y (2006) Estimation and model selection of semiparametric copula-based multivariate dynamic models under copula misspecification. J Econom 135:125–154

Chen X, Fan Y (2007) A model selection test for bivariate failure-time data. Econom Theor 23:414–439

Clayton D (1978) A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65:141–151

Czado C (2010) Pair-copula constructions of multivariate copulas. In: Jaworski P, Durante F, Härdle WK, Rychlik T (eds) Copula theory and its applications. Springer

Czado C, Nagler T (2022) Vine copula based modeling. Annu Rev Stat Appl 9:453–477

Dawid AP (1984) Present position and potential developments: some personal views: statistical theory: the prequential approach. J R Stat Soc Ser A 147:278–292

Drechsel K, Scheufele R (2012) The performance of short-term forecasts of the German economy before and during the 2008/2009 recession. Int J Forecast 28:428–445

Elliott G, Timmermann A (2016) Economic forecasting. Princeton University Press

Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman & Hall

Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874

Fermanian J (2005) Goodness-of-fit tests for copulas. J Multivar Anal 95:119–152

Ferrara L, Marcellino M, Mogliani M (2015) Macroeconomic forecasting during the great recession: the return of non-linearity? Int J Forecast 31:664–679

Genest C, Rémillard B, Beaudoin D (2009) Goodness-of-fit tests for copulas: a review and a power study. Insur Math Econ 44:199–213

Granger CWJ, Ramanathan R (1984) Improved methods of combining forecasts. J Forecast 3:197–204

Haff IH, Aas K, Frigessi A (2010) On the simplified pair-copula construction-simply useful or too simplistic? J Multivar Anal 101:1296–1310

Han AK (1987) Non-parametric analysis of a generalized regression model. J Econom 35:303–316

Hao L, Ng ECY (2011) Predicting Canadian recessions using dynamic probit modelling approaches. Can J Econ 44:1297–1330

Harding D, Pagan A (2011) An econometric analysis of some models for constructed binary time series. J Bus Econ Stat 29:86–95

Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning: data mining, inference, and prediction. Springer

Hogg RV, McKean JW, Craig AT (2012) Introduction to mathematical statistics. Prentice Hall

Hsiao C, Wan SK (2014) Is there an optimal forecast combination? J Econom 178:294–309

Jafarzadeh SR, Johnson WO, Gardner IA (2016) Bayesian modeling and inference for diagnostic accuracy and probability of disease based on multiple diagnostic biomarkers with and without a perfect reference standard. Stat Med 35:859–876

Joe H (1993) Parametric families of multivariate distributions with given marginals. J Multivar Anal 46:262–282

Joe H (1996) Families of m-variate distributions with given margins and \(m(m-1)/2\) bivariate dependence parameters. In: Rüschendorf L, Schweizer B, Taylor MD (eds) Distributions with fixed marginals and related topics. Institute of Mathematical Statistics

Joe H (1997) Multivariate models and dependence concepts. Chapman & Hall

Kamstra M, Kennedy P (1998) Combining qualitative forecasts using logit. Int J Forecast 14:83–93

Kauppi H, Saikkonen P (2008) Predicting U.S. recessions with dynamic binary response models. Rev Econ Stat 90:777–791

Kielmann J, Manner H, Min A (2022) Stock market returns and oil price shocks: a CoVaR analysis based on dynamic vine copula models. Empir Econ 62:1543–1574

Klugman SA, Parsa R (1999) Fitting bivariate loss distributions with copulas. Insur Math Econ 24:139–148

Krzanowski WJ, Hand DJ (2009) ROC curves for continuous data. Chapman & Hall

Lahiri K, Wang JG (2013) Evaluating probability forecasts for GDP declines using alternative methodologies. Int J Forecast 29:175–190

Lahiri K, Yang L (2013) Forecasting binary outcomes. In: Timmermann A, Elliott G (eds) Handbook of economic forecasting volume 2B. North-Holland, Amsterdam, pp 1025–1106

Lahiri K, Yang L (2015a) A nonlinear forecast combination procedure for binary outcomes. Stud Nonlinear Dyn Econom 29:175–190

Lahiri K, Yang L (2015b) Further analysis of the conference board’s new Leading Economic Index. Int J Forecast 31:446–453

Lahiri K, Yang L (2016) Asymptotic variance of Brier (skill) score in the presence of serial correlation. Econ Lett 141:125–129

Lahiri K, Yang L (2018) Confidence bands for ROC curves with serially dependent data. J Bus Econ Stat 36:115–130

Lahiri K, Yang L (2021) Construction of Leading Economic Index for recession prediction using vine copulas. Stud Nonlinear Dyn Econom 25:193–212

Levanon G, Manini J, Ozyildirim A, Schaitkin B, Tanchua J (2015) Using financial indicators to predict turning points in the business cycle: the case of the Leading Economic Index for the United States. Int J Forecast 31:426–445

Li D (2000) On default correlation: a copula function approach. J Fixed Income 9:43–54

Li F, Kang Y (2018) Improving forecasting performance using covariate-dependent copula models. Int J Forecast 34:456–476

Li Q, Racine JS (2008) Nonparametric econometrics: theory and practice. Princeton University Press

Li F, Villani M, Kohn R (2010) Flexible modeling of conditional distributions using smooth mixtures of asymmetric student-t densities. J Stat Plan Inference 140:3638–3654

Lin H, Zhou L, Peng H, Zhou X (2011) Selection and combination of biomarkers using ROC method for disease classification and prediction. Can J Stat 39:324–343

Longin F, Solnik B (2001) Extreme correlation of international equity markets. J Finance 56:649–676

McIntosh MW, Pepe MS (2002) Combining several screening tests: optimality of the risk score. Biometrics 58:657–664

Min A, Czado C (2010) Bayesian inference for multivariate copulas using pair-copula constructions. J Financ Econom 8:511–546

Murphy AH (1972) Scalar and vector partitions of the probability score: part I. Two-state situation. J Appl Meteorol 11:273–282

Nelsen RB (2006) An introduction to copulas. Springer

Patton AJ (2006) Modelling asymmetric exchange rate dependence. Int Econ Rev 47:527–556

Patton AJ (2012) A review of copula models for economic time series. J Multivar Anal 110:4–18

Patton AJ (2013) Copula methods for forecasting multivariate time series. In: Timmermann A, Elliott G (eds) Handbook of economic forecasting volume 2B. North-Holland, Amsterdam, pp 899–960

Patton AJ, Fan Y (2014) Copulas in econometrics. Annu Rev Econ 6:179–200

Pepe MS, Cai T, Longton G (2006) Combining predictors for classification using the area under the receiver operating characteristic curve. Biometrics 62:221–229

Prokhorov A, Schmidt P (2009) Likelihood-based estimation in a panel setting: robustness, redundancy and validity of copulas. J Econom 153:93–104

Ranjan R, Gneiting T (2010) Combining probability forecasts. J R Stat Soc B 72:71–91

Salgado S, Bloom N, Guvenen F (2019) Skewed business cycles. NBER Working Paper No. 26565

Scotti C (2011) A bivariate model of Federal Reserve and ECB main policy rates. Int J Cent Bank 7:37–78

Seillier-Moiseiwitsch F, Dawid AP (1993) On testing the validity of sequential probability forecasts. J Am Stat Assoc 88:355–359

Sklar A (1973) Random variables, joint distributions, and copulas. Kybernetika 9:449–460

Smith MD (2008) Stochastic frontier models with dependent error components. Econom J 11:172–192

Stephenson DB (2000) Use of the ‘odds ratio’ for diagnosing forecast skill. Weather Forecast 15:221–232

Stöber J, Joe H, Czado C (2013) Simplified pair copula constructions-limitations and extensions. J Multivar Anal 119:101–118

Stock JH, Watson MW (1993) A procedure for predicting recessions with leading indicators: econometric issues and recent experience. In: Stock JH, Watson MW (eds) New research on business cycles, indicators and forecasting. University of Chicago Press, pp 95–156

Timmermann A (2006) Forecast combinations. In: Elliott G, Granger CWJ, Timmermann A (eds) Handbook of economic forecasting. North-Holland, Amsterdam, pp 135–196

Trivedi PK, Zimmer DM (2005) Copula modeling: an introduction for practitioners. Found Trends Econom 1:1–111

Yates JF (1982) External correspondence: decompositions of the mean probability score. Organ Behav Hum Perform 30:132–156

Yates JF, Curley SP (1985) Conditional distribution analysis of probabilistic forecasts. J Forecast 4:61–73

Youden WJ (1950) Index for rating diagnostic tests. Cancer 3:32–35

Zimmer DM (2015a) Asymmetric dependence in house prices: evidence from USA and international data. Empir Econ 49:161–183

Zimmer DM (2015b) Analyzing comovements in housing prices using vine copulas. Econ Inq 53:1156–1169

## Ethics declarations

### Conflict of interest

Authors Kajal Lahiri and Liu Yang declare that they have no conflict of interest.

### Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

## Additional information

We thank the three anonymous referees for many helpful comments. However, any remaining errors are our responsibility.

## Mathematical Appendix

### Proof of Theorem 1

The proof follows directly from Theorem 8.1.1 in Hogg et al. (2012). Given *w* satisfying \(FA=P(\Gamma (X)>w|Z=0)\), define

and

By assumption, we have

It is straightforward to see that

Combining (18) and (19) yields

Our objective is to prove \(P(\Gamma (X)>w|Z=1)\ge P(\eta (X)=1|Z=1)\), which is equivalent to

As shown above, (21) amounts to

which holds since

where the first and the last inequalities are derived from the definition of \(R_{\Gamma }\) in (17). \(\square \)
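The displayed equations of this proof are not reproduced above. One way to fill in the steps is sketched below; it follows the standard Neyman-Pearson argument and is not taken from the original displays. The notation is assumed: \(f_z\) denotes the density of \(X\) given \(Z=z\), \(\Gamma (x)=f_1(x)/f_0(x)\) is the likelihood ratio, \(w\ge 0\), and \(\eta (x)\in \{0,1\}\) is any competing rule whose false alarm rate does not exceed \(FA\). The labels (a) to (g) are hypothetical and do not claim to match the original equation numbers.

```latex
% Assumed notation: f_z = density of X given Z = z, \Gamma = f_1 / f_0, w >= 0,
% \eta = any 0/1 rule with P(\eta(X) = 1 | Z = 0) <= FA.
\begin{align*}
&\text{(a)}\quad R_{\Gamma}=\{x:\Gamma(x)>w\},\qquad R_{\eta}=\{x:\eta(x)=1\},\\
&\text{(b)}\quad P(X\in R_{\eta}\mid Z=0)\le P(X\in R_{\Gamma}\mid Z=0)=FA,\\
&\text{(c)}\quad P(X\in R_{\Gamma}\mid Z=z)=P(X\in R_{\Gamma}\cap R_{\eta}\mid Z=z)
      +P(X\in R_{\Gamma}\cap R_{\eta}^{c}\mid Z=z),\\
&\phantom{\text{(c)}}\quad P(X\in R_{\eta}\mid Z=z)=P(X\in R_{\eta}\cap R_{\Gamma}\mid Z=z)
      +P(X\in R_{\eta}\cap R_{\Gamma}^{c}\mid Z=z),\\
&\text{(d)}\quad P(X\in R_{\Gamma}\cap R_{\eta}^{c}\mid Z=0)\ge
      P(X\in R_{\eta}\cap R_{\Gamma}^{c}\mid Z=0),\\
&\text{(e)}\quad P(X\in R_{\Gamma}\mid Z=1)\ge P(X\in R_{\eta}\mid Z=1),\\
&\text{(f)}\quad P(X\in R_{\Gamma}\cap R_{\eta}^{c}\mid Z=1)\ge
      P(X\in R_{\eta}\cap R_{\Gamma}^{c}\mid Z=1),\\
&\text{(g)}\quad P(X\in R_{\Gamma}\cap R_{\eta}^{c}\mid Z=1)
      \ge w\,P(X\in R_{\Gamma}\cap R_{\eta}^{c}\mid Z=0)
      \ge w\,P(X\in R_{\eta}\cap R_{\Gamma}^{c}\mid Z=0)
      \ge P(X\in R_{\eta}\cap R_{\Gamma}^{c}\mid Z=1).
\end{align*}
```

Under these assumptions, (d) follows from (b) and (c) with \(z=0\) after cancelling the common term, and (e) reduces to (f) by (c) with \(z=1\). In (g), the first inequality uses \(f_1>wf_0\) on \(R_{\Gamma }\), the last uses \(f_1\le wf_0\) on \(R_{\Gamma }^{c}\), and the middle one is (d) multiplied by \(w\).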

## About this article

### Cite this article

Lahiri, K., Yang, L. Predicting binary outcomes based on the pair-copula construction.
*Empir Econ* **64**, 3089–3119 (2023). https://doi.org/10.1007/s00181-023-02418-6
