Abstract
We investigate regression problems in which one can additionally control the conditional variance of the output given the input, by varying the computational time dedicated to supervising each example. For a given upper bound on the total computational time, we optimize the trade-off between the number of examples and their precision, by formulating and solving a suitable optimization problem based on a large-sample approximation of the output of the ordinary least squares algorithm. Considering a specific functional form for that precision, we prove that there are cases in which “many but bad” examples provide a smaller generalization error than “few but good” ones, but also that the converse can occur, depending on the “returns to scale” of the precision with respect to the computational time assigned to supervising each example. Hence, the results of this study highlight that increasing the size of the dataset is not always beneficial when one can instead collect a smaller number of more reliable examples.
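The trade-off described in the abstract can be illustrated with a minimal sketch. The abstract does not state the functional form of the precision, so the power law used here (noise variance `k * delta_t**(-alpha)`) is an assumption made only for illustration, as are the names `approx_excess_error`, `k`, and `alpha`. The sketch also uses the standard large-sample approximation \(p \sigma^2 / N\) for the excess error of ordinary least squares with \(p\) features, and assumes that a budget `total_time` affords `total_time / delta_t` examples.

```python
def approx_excess_error(delta_t, total_time, p, k=1.0, alpha=0.5):
    """Large-sample approximation p * sigma^2 / N of the OLS excess error.

    Assumptions (hypothetical, not taken from the text):
    - the noise variance is k * delta_t ** (-alpha);
    - the budget affords N = total_time / delta_t examples.
    """
    n_examples = total_time / delta_t
    noise_variance = k * delta_t ** (-alpha)
    return p * noise_variance / n_examples


if __name__ == "__main__":
    # Compare cheap supervision (delta_t = 1) with costly supervision (delta_t = 4)
    # under the same total budget, for two values of the assumed exponent.
    for alpha in (0.5, 2.0):
        cheap = approx_excess_error(1.0, 100.0, 5, alpha=alpha)
        costly = approx_excess_error(4.0, 100.0, 5, alpha=alpha)
        print(f"alpha={alpha}: many-but-bad={cheap:.4f}, few-but-good={costly:.4f}")
```

Under these assumptions the approximate error is proportional to `delta_t ** (1 - alpha)`, so an exponent below 1 favours many imprecise examples and an exponent above 1 favours fewer, more precise ones, which matches the qualitative dependence on "returns to scale" stated in the abstract.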
Notes
1. One can notice that the invertibility of \(X' X\) in Eq. (3) requires \(N \ge p\).
2. I.e., for every \(\varepsilon >0\), \(\mathrm{Prob} \left( \left\| \frac{X_{N(\varDelta T)}' X_{N(\varDelta T)}}{N(\varDelta T)} - \mathbb {E} \left\{ \underline{x} \underline{x}'\right\} \right\| > \varepsilon \right) \) (where \(\Vert \cdot \Vert \) is an arbitrary matrix norm) tends to 0 as \(N(\varDelta T)\) tends to \(+\infty \).
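The convergence in probability stated in the second note can be checked numerically with a small sketch. The setting is an assumption chosen for simplicity: two independent standard normal inputs, so that \(\mathbb{E}\{\underline{x}\,\underline{x}'\}\) is the \(2 \times 2\) identity matrix, and the Frobenius norm as the matrix norm. The function name `second_moment_deviation` is hypothetical.

```python
import random


def second_moment_deviation(n_samples, seed=0):
    """Frobenius norm of X'X / N minus the identity, for 2-D standard normal inputs.

    Assumption (illustrative only): the inputs are i.i.d. standard normal,
    so the population second-moment matrix is the identity.
    """
    rng = random.Random(seed)
    s11 = s12 = s22 = 0.0
    for _ in range(n_samples):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
    m11, m12, m22 = s11 / n_samples, s12 / n_samples, s22 / n_samples
    # The off-diagonal entry appears twice in the symmetric matrix.
    return ((m11 - 1.0) ** 2 + 2.0 * m12 ** 2 + (m22 - 1.0) ** 2) ** 0.5


if __name__ == "__main__":
    for n in (20, 2000, 20000):
        print(n, second_moment_deviation(n))
```

In this setting the deviation is expected to shrink roughly like \(1/\sqrt{N}\); a single run per sample size only illustrates the trend and does not establish the limit.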
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Gnecco, G., Nutarelli, F. (2020). On the Trade-Off Between Number of Examples and Precision of Supervision in Regression. In: Oneto, L., Navarin, N., Sperduti, A., Anguita, D. (eds) Recent Advances in Big Data and Deep Learning. INNSBDDL 2019. Proceedings of the International Neural Networks Society, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-16841-4_1
DOI: https://doi.org/10.1007/978-3-030-16841-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16840-7
Online ISBN: 978-3-030-16841-4
eBook Packages: Intelligent Technologies and Robotics (R0)