Multiple linear regression models for random intervals: a set arithmetic approach

  • Original paper
Computational Statistics

Abstract

Some regression models for analyzing relationships between random intervals (i.e., random variables whose outcomes are intervals) are presented. The proposed approaches extend existing models and account for cross relationships between the midpoints and spreads (or radii) of the intervals in a single equation based on interval arithmetic. The estimation problem, which can be written as a constrained minimization problem, is analyzed theoretically and tested empirically. In addition, numerically stable general expressions of the estimators are provided. The main differences between the new and the existing methods are highlighted in a real-life application, where the new model is shown to provide the most accurate results by preserving coherence with the interval nature of the data.
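As a rough illustration of the midpoint/spread representation and the interval arithmetic mentioned above, the following sketch may help. It is not the estimator or the model proposed in the paper: the class `Interval` and the helper `linear_prediction` are hypothetical names introduced here, and the example only shows how a linear combination of intervals behaves under the Minkowski sum and the product by a scalar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [mid - spr, mid + spr] with spr >= 0 (hypothetical helper class)."""
    mid: float
    spr: float

    def __post_init__(self):
        if self.spr < 0:
            raise ValueError("spread must be non-negative")

    @property
    def bounds(self):
        # Lower and upper endpoints recovered from midpoint and spread.
        return (self.mid - self.spr, self.mid + self.spr)

    def __add__(self, other):
        # Minkowski sum: midpoints add and spreads add.
        return Interval(self.mid + other.mid, self.spr + other.spr)

    def scale(self, b):
        # Product by a scalar: the midpoint scales by b, the spread by |b|.
        return Interval(b * self.mid, abs(b) * self.spr)


def linear_prediction(coefs, xs, intercept):
    """Hypothetical helper: b_1 * x_1 + ... + b_k * x_k + intercept in interval arithmetic."""
    result = intercept
    for b, x in zip(coefs, xs):
        result = result + x.scale(b)
    return result


# Illustrative values only, not data from the paper.
x1 = Interval(mid=2.0, spr=0.5)
x2 = Interval(mid=-1.0, spr=1.0)
y_hat = linear_prediction([3.0, -2.0], [x1, x2], Interval(1.0, 0.0))
print(y_hat, y_hat.bounds)
```

Note that a negative coefficient still widens the result, since the spread is multiplied by the absolute value of the coefficient. In this plain form the spread of the response depends only on the spreads of the predictors; the cross relationships between midpoints and spreads described in the abstract are not modelled by this sketch.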



Acknowledgements

The research in this paper has been partially supported by the Spanish Government through MINECO-18-MTM2017-89632-P Grant and by the COST Action 1408. Their financial support is gratefully acknowledged.

Author information

Corresponding author

Correspondence to Ana Belén Ramos-Guajardo.


About this article

Cite this article

García-Bárzana, M., Ramos-Guajardo, A.B., Colubi, A. et al. Multiple linear regression models for random intervals: a set arithmetic approach. Comput Stat 35, 755–773 (2020). https://doi.org/10.1007/s00180-019-00910-1


  • DOI: https://doi.org/10.1007/s00180-019-00910-1
