Summary
While symbolic data exist in their own right, contemporary datasets can be too large to analyse using traditional statistical methodologies. Aggregating these large datasets into sets of more manageable size perforce produces datasets whose entries are symbolic data. This paper studies the derivation of basic descriptive statistics, in particular histograms, means and variances, plus joint histograms, for interval-valued datasets when logical dependency rules are present. Algorithms for calculating these histograms are also provided.
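To make the quantities named in the summary concrete, the following is a minimal sketch of the mean, variance and histogram of interval-valued observations. It is not the algorithm from the paper's appendices: it assumes values are uniformly distributed within each interval, and it ignores logical dependency rules entirely. The function names and the sample data are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (a, b) with a < b; assumed non-degenerate


def interval_mean(data: Sequence[Interval]) -> float:
    """Mean under within-interval uniformity: average of the midpoints."""
    m = len(data)
    return sum(a + b for a, b in data) / (2 * m)


def interval_variance(data: Sequence[Interval]) -> float:
    """Variance under within-interval uniformity.

    Second moment of a uniform on [a, b] is (a^2 + a*b + b^2) / 3;
    average that over observations and subtract the squared mean.
    """
    m = len(data)
    second_moment = sum(a * a + a * b + b * b for a, b in data) / (3 * m)
    return second_moment - interval_mean(data) ** 2


def interval_histogram(data: Sequence[Interval],
                       edges: Sequence[float]) -> List[float]:
    """Relative frequency per bin.

    Each observation contributes the fraction of its interval
    that overlaps the bin, then contributions are averaged.
    """
    m = len(data)
    freqs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        total = 0.0
        for a, b in data:
            if not a < b:
                raise ValueError("sketch assumes a < b for every interval")
            overlap = max(0.0, min(b, hi) - max(a, lo))
            total += overlap / (b - a)
        freqs.append(total / m)
    return freqs


# Hypothetical sample: two interval-valued observations.
sample = [(0.0, 2.0), (1.0, 3.0)]
sample_mean = interval_mean(sample)
sample_var = interval_variance(sample)
sample_hist = interval_histogram(sample, [0.0, 1.0, 2.0, 3.0])
```

When rules are present, some portions of an interval (or of a rectangle, in the joint case) are logically excluded, so the overlap fractions would have to be computed over the valid region only; this sketch does not attempt that.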
Appendices
Appendix A - Histogram Algorithm
Appendix B - Joint Histogram Algorithm
Cite this article
Billard, L., Diday, E. Descriptive statistics for interval-valued observations in the presence of rules. Computational Statistics 21, 187–210 (2006). https://doi.org/10.1007/s00180-006-0259-6
DOI: https://doi.org/10.1007/s00180-006-0259-6