On the Trade-Off Between Number of Examples and Precision of Supervision in Regression

Gnecco, Giorgio; Nutarelli, Federico

doi:10.1007/978-3-030-16841-4_1

Part of the book series: Proceedings of the International Neural Networks Society (INNS, volume 1)

Included in the following conference series:

INNS Big Data and Deep Learning conference

Abstract

We investigate regression problems for which one is given the additional possibility of controlling the conditional variance of the output given the input, by varying the computational time dedicated to supervise each example. For a given upper bound on the total computational time, we optimize the trade-off between the number of examples and their precision, by formulating and solving a suitable optimization problem, based on a large-sample approximation of the output of the ordinary least squares algorithm. Considering a specific functional form for that precision, we prove that there are cases in which “many but bad” examples provide a smaller generalization error than “few but good” ones, but also that the converse can occur, depending on the “returns to scale” of the precision with respect to the computational time assigned to supervise each example. Hence, the results of this study highlight that increasing the size of the dataset is not always beneficial, if one has the possibility to collect a smaller number of more reliable examples.
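The trade-off described in the abstract can be illustrated with a minimal numerical sketch. It rests on assumptions that are not stated on this page: the conditional variance is taken to be a power law, \(\sigma^2(\varDelta T) = k\,\varDelta T^{-\alpha}\); the number of examples is \(N(\varDelta T) = T/\varDelta T\) (rounding ignored); and the generalization error is replaced by the standard large-sample excess error of ordinary least squares, \(p\,\sigma^2/N\). The names `approx_excess_error`, `best_delta_t`, `k`, and `alpha` are hypothetical, and the paper's exact expressions may differ.

```python
def approx_excess_error(total_time, delta_t, p, k=1.0, alpha=1.0):
    """Large-sample OLS excess error, approximated as p * sigma^2 / N.

    Assumptions (illustrative, not taken from the paper's text):
      - N = total_time / delta_t examples can be supervised in total,
      - the output noise variance is sigma^2 = k * delta_t ** (-alpha).
    """
    n_examples = total_time / delta_t        # more time per example -> fewer examples
    noise_var = k * delta_t ** (-alpha)      # more time per example -> less noise
    return p * noise_var / n_examples


def best_delta_t(alpha, total_time=1000.0, p=5, grid=(0.5, 1.0, 2.0, 4.0)):
    """Return the per-example supervision time on the grid with the smallest error."""
    return min(grid, key=lambda dt: approx_excess_error(total_time, dt, p, alpha=alpha))


if __name__ == "__main__":
    for alpha in (0.5, 1.5):
        print(f"alpha = {alpha}: best delta_t on the grid = {best_delta_t(alpha)}")
```

Under these assumptions the error is proportional to \(\varDelta T^{\,1-\alpha}\). For \(\alpha < 1\) (decreasing returns to scale) the shortest supervision time on the grid wins, which is the "many but bad" regime. For \(\alpha > 1\) (increasing returns to scale) the longest one wins, which is the "few but good" regime. In practice \(\varDelta T\) is also bounded above by the requirement \(N \ge p\) mentioned in Note 1.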

Notes

1. One can notice that the invertibility of \(X' X\) in Eq. (3) requires \(N \ge p\).
2. I.e., for every \(\varepsilon >0\), \(\mathrm{Prob} \left( \left\| \frac{X_{N(\varDelta T)}' X_{N(\varDelta T)}}{N(\varDelta T)} - \mathbb {E} \left\{ \underline{x} \underline{x}'\right\} \right\| > \varepsilon \right) \) (where \(\Vert \cdot \Vert \) is an arbitrary matrix norm) tends to 0 as \(N(\varDelta T)\) tends to \(+\infty \).

Author information

Authors and Affiliations

IMT School for Advanced Studies, Piazza S. Francesco 19, 55100, Lucca, Italy
Giorgio Gnecco & Federico Nutarelli

Corresponding author

Correspondence to Giorgio Gnecco.

Editor information

Editors and Affiliations

Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Genoa, Italy
Luca Oneto
Department of Mathematics, University of Padova, Padua, Italy
Nicolò Navarin
Department of Mathematics, University of Padova, Padua, Italy
Alessandro Sperduti
Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Genoa, Italy
Davide Anguita

About this paper

Cite this paper

Gnecco, G., Nutarelli, F. (2020). On the Trade-Off Between Number of Examples and Precision of Supervision in Regression. In: Oneto, L., Navarin, N., Sperduti, A., Anguita, D. (eds) Recent Advances in Big Data and Deep Learning. INNSBDDL 2019. Proceedings of the International Neural Networks Society, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-16841-4_1

DOI: https://doi.org/10.1007/978-3-030-16841-4_1
Published: 03 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16840-7
Online ISBN: 978-3-030-16841-4
eBook Packages: Intelligent Technologies and Robotics (R0)
