Using coarse information for real valued prediction

Dhurandhar, Amit

doi:10.1007/s10618-012-0287-5

Published: 14 August 2012

Volume 27, pages 167–192, (2013)

Data Mining and Knowledge Discovery

Amit Dhurandhar¹

Abstract

In domains such as consumer products and manufacturing amongst others, we have problems that warrant the prediction of a continuous target. Besides the usual set of explanatory attributes, we may also have exact (or approximate) estimates of aggregated targets, which are the sums of disjoint sets of individual targets that we are trying to predict. The question now becomes can we use these aggregated targets, which are a coarser piece of information, to improve the quality of predictions of the individual targets? In this paper, we provide a simple yet provable way of accomplishing this. In particular, given predictions from any regression model of the target on the test data, we elucidate a provable method for improving these predictions in terms of mean squared error, given exact (or accurate enough) information of the aggregated targets. These estimates of the aggregated targets may be readily available or obtained—through multilevel regression—at different levels of granularity. Based on the proof of our method we suggest a criterion for choosing the appropriate level. Moreover, in addition to estimates of the aggregated targets, if we have exact (or approximate) estimates of the mean and variance of the target distribution, then based on our general strategy we provide an optimal way of incorporating this information so as to further improve the quality of predictions of the individual targets. We then validate the results and our claims by conducting experiments on synthetic and real industrial data obtained from diverse domains.
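
The abstract does not spell out the correction itself, so the following is only a minimal sketch of one simple way aggregate information can improve predictions, not necessarily the procedure derived in the paper. It assumes the aggregated targets are known exactly and spreads each group's residual (known sum minus sum of predictions) evenly over the group's members. This is an orthogonal projection onto the set of vectors with the correct group sums, so the squared error against the true targets cannot increase. The function names, group labels, and numbers are all hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def adjust_to_group_sums(
    preds: Sequence[float],
    groups: Sequence[Hashable],
    group_sums: Dict[Hashable, float],
) -> List[float]:
    """Shift predictions within each group so they sum to the known aggregate.

    Assumption: group_sums holds the exact sum of the true targets per group.
    """
    # Collect the indices belonging to each (disjoint) group.
    members: Dict[Hashable, List[int]] = {}
    for i, g in enumerate(groups):
        members.setdefault(g, []).append(i)

    adjusted = list(preds)
    for g, idx in members.items():
        # Gap between the known aggregate and the summed predictions.
        residual = group_sums[g] - sum(preds[i] for i in idx)
        shift = residual / len(idx)  # spread the gap evenly over the group
        for i in idx:
            adjusted[i] = preds[i] + shift
    return adjusted


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical toy data: five targets in two disjoint groups.
y_true = [2.0, 4.0, 6.0, 1.0, 3.0]
groups = ["a", "a", "a", "b", "b"]
y_pred = [3.0, 5.0, 5.5, 0.0, 3.5]       # output of some regression model
group_sums = {"a": 12.0, "b": 4.0}       # exact aggregated targets

y_adj = adjust_to_group_sums(y_pred, groups, group_sums)
print(mse(y_pred, y_true), mse(y_adj, y_true))
```

With approximate rather than exact aggregates the guarantee no longer holds automatically, which is consistent with the abstract's requirement that the aggregate estimates be "accurate enough".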

Author information

Authors and Affiliations

IBM T.J. Watson, Yorktown Heights, NY, USA
Amit Dhurandhar

Corresponding author

Correspondence to Amit Dhurandhar.

Additional information

Responsible editor: Chih-Jen Lin.

About this article

Cite this article

Dhurandhar, A. Using coarse information for real valued prediction. Data Min Knowl Disc 27, 167–192 (2013). https://doi.org/10.1007/s10618-012-0287-5

Received: 13 July 2011
Accepted: 02 August 2012
Published: 14 August 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s10618-012-0287-5
