Minimum-Width Confidence Bands via Constraint Optimization

  • Jeremias Berg
  • Emilia Oikarinen
  • Matti Järvisalo
  • Kai Puolamäki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10416)

Abstract

Constraint optimization has recently proven successful in solving various NP-hard search and optimization problems in data analysis. In this work we extend its use within data analysis to a central problem arising in the analysis of multivariate data: determining minimum-width multivariate confidence intervals, i.e., the minimum-width confidence band problem (MWCB). Pointing out drawbacks in recently proposed formalizations of variants of MWCB, we propose a new problem formalization that generalizes the earlier formulations and makes it possible to avoid their drawbacks. We present two constraint models for the new problem, one in terms of mixed integer programming and one in terms of maximum satisfiability, as well as a greedy approach. Furthermore, we empirically evaluate, on real-world datasets, the scalability of the constraint optimization approaches and their solution quality compared to the greedy approach.
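To make the flavour of the problem concrete, here is a minimal Python sketch of a greedy heuristic for one simple confidence-band variant. The variant is an assumption made for illustration only: given n vectors, exclude at most k of them so that the pointwise min/max envelope of the rest has the smallest total width. It is not necessarily the formalization or the greedy algorithm of the paper, and the names `envelope_width` and `greedy_band` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def envelope_width(vectors: List[Vector]) -> float:
    """Total band width: sum over dimensions of (max - min) across vectors."""
    if not vectors:
        return 0.0
    return sum(max(col) - min(col) for col in zip(*vectors))


def greedy_band(
    data: List[Vector], k: int
) -> Tuple[List[float], List[float], List[int]]:
    """Greedily exclude up to k vectors (assumed variant, see lead-in).

    In each round, drop the vector whose removal yields the narrowest
    envelope of the remaining vectors. Returns the lower bounds, the
    upper bounds, and the indices of the excluded vectors.
    """
    kept = list(range(len(data)))
    dropped: List[int] = []
    # Always keep at least one vector so the band stays defined.
    for _ in range(min(k, len(kept) - 1)):
        best_i, best_w = None, None
        for i in kept:
            w = envelope_width([data[j] for j in kept if j != i])
            if best_w is None or w < best_w:
                best_i, best_w = i, w
        kept.remove(best_i)
        dropped.append(best_i)
    cols = list(zip(*(data[j] for j in kept)))
    lower = [min(c) for c in cols]
    upper = [max(c) for c in cols]
    return lower, upper, dropped


if __name__ == "__main__":
    toy = [[0, 0], [1, 1], [2, 2], [10, 0], [0, 10]]
    print(greedy_band(toy, k=2))
```

A greedy pass like this is fast but carries no optimality guarantee, which is the kind of gap an exact mixed integer programming or maximum satisfiability model is meant to close.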

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jeremias Berg (1)
  • Emilia Oikarinen (2)
  • Matti Järvisalo (1)
  • Kai Puolamäki (2)

  1. HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
  2. Finnish Institute of Occupational Health, Helsinki, Finland