
A conformal predictive system for distribution regression with random features

Published in: Soft Computing (Application of soft computing)

Abstract

Distribution regression is the regression setting in which the input objects are distributions. Many machine learning problems, such as multi-instance learning and learning from noisy data, can be analyzed in this framework. This paper builds a conformal predictive system (CPS) for distribution regression, in which the system's prediction for a test input is a cumulative distribution function (CDF) of the corresponding test label. The CDF output by a CPS provides useful information about the test label: it can estimate the probability of any event related to the label, and its quantiles yield prediction intervals and point predictions. Furthermore, a CPS has the property of validity, as the predictive CDFs and the prediction intervals are statistically compatible with the realizations. This property is desirable in many risk-sensitive applications, such as weather forecasting. To the best of our knowledge, this is the first work to extend the CPS learning framework to distribution regression problems. We first embed the input distributions into a reproducing kernel Hilbert space using kernel mean embedding approximated by random Fourier features, and then build a fast CPS on top of the embeddings. While inheriting the validity property of the CPS framework, our algorithm is simple, easy to implement, and fast. The proposed approach is tested on synthetic data sets and applied to the statistical postprocessing of ensemble forecasts, demonstrating its effectiveness for distribution regression problems.
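The pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: each input distribution is represented by a bag of samples, the kernel mean embedding is approximated with random Fourier features, and plain split-conformal calibration of ridge-regression residuals stands in for the fast CPS developed in the paper. All function names and parameters here are hypothetical.

```python
import numpy as np

def rff_mean_embedding(bags, W, b):
    """Approximate kernel mean embeddings: map each sample with random
    Fourier features phi(x) = sqrt(2/D) * cos(W x + b), then average
    over the bag representing the input distribution."""
    D = W.shape[0]
    return np.stack([np.sqrt(2.0 / D) * np.cos(bag @ W.T + b).mean(axis=0)
                     for bag in bags])

def split_conformal_cps(Phi_train, y_train, Phi_cal, y_cal, lam=1e-3):
    """Fit ridge regression on the embeddings and return a predictive-CDF
    function calibrated with split conformal prediction (a simplified
    stand-in for the paper's fast CPS)."""
    D = Phi_train.shape[1]
    w = np.linalg.solve(Phi_train.T @ Phi_train + lam * np.eye(D),
                        Phi_train.T @ y_train)
    resid = np.sort(y_cal - Phi_cal @ w)  # calibration residuals
    def predictive_cdf(phi, y):
        # empirical CDF of (point prediction + calibration residual) at y
        return np.searchsorted(resid, y - phi @ w, side="right") / (len(resid) + 1)
    return w, predictive_cdf

# Synthetic distribution-regression task: the label is the mean of each bag.
rng = np.random.default_rng(0)
d, D = 2, 200
bags = [rng.normal(loc=rng.uniform(-2, 2), size=(50, d)) for _ in range(300)]
y = np.array([bag.mean() for bag in bags])
W = rng.normal(size=(D, d))            # spectral samples of an RBF kernel
b = rng.uniform(0, 2 * np.pi, size=D)
Phi = rff_mean_embedding(bags, W, b)
w, cdf = split_conformal_cps(Phi[:150], y[:150], Phi[150:280], y[150:280])
phi_test = Phi[280]
print(cdf(phi_test, -10.0), cdf(phi_test, 10.0))  # near 0 and near 1
```

The returned `predictive_cdf` is the CDF-valued prediction: evaluating it at a threshold estimates the probability that the test label falls below that threshold, and inverting it at chosen quantiles yields prediction intervals.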

[Figures 1–9 appear in the full article.]


Data availability

The data used during the current study are available from the corresponding author upon reasonable request.


Acknowledgements

The authors would like to thank the anonymous editor and reviewers for their valuable comments and suggestions which improved this work.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 62106169, 72261147706, 72231005, and 61972282.

Author information


Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Wei Zhang and Di Wang. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Di Wang.

Ethics declarations

Conflict of interest

Wei Zhang, Zhen He and Di Wang declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, W., He, Z. & Wang, D. A conformal predictive system for distribution regression with random features. Soft Comput 27, 11789–11800 (2023). https://doi.org/10.1007/s00500-023-07859-w


  • DOI: https://doi.org/10.1007/s00500-023-07859-w
