Abstract
It is commonplace to say that the Web has changed everything. Machine learning researchers often say that their projects and results respond to that change with better methods for finding and organizing Web information. However, not much of the theory, or even the current practice, of machine learning takes the Web seriously. We continue to devote much effort to refining supervised learning, but the Web reality is that labeled data is hard to obtain, while unlabeled data is inexhaustible. We cling to the iid assumption, while Web data generation processes drift rapidly and involve many hidden correlations. Much of our theory and many of our algorithms assume data representations of fixed dimension, while in fact the dimensionality of data, for example the number of distinct words in text, grows with data size. While there has been much work recently on learning with sparse representations, the actual patterns of sparsity on the Web have received little attention. Those patterns might be very relevant to the communication costs of distributed learning algorithms, which are necessary at Web scale, but little work has been done on this.
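The claim that dimensionality grows with data size can be made concrete with a small simulation. The sketch below (an illustration, not from the paper; the Zipf exponent and vocabulary size are assumed values) draws tokens from a Zipf-like word distribution and counts distinct words in increasingly long prefixes, showing Heaps'-law-style vocabulary growth:

```python
import random

def zipf_sample(n_tokens, vocab=100_000, s=1.1, seed=0):
    """Draw n_tokens word ids from a Zipf-like distribution over a large vocabulary.

    vocab and s are illustrative assumptions, not parameters from the paper.
    """
    rng = random.Random(seed)
    weights = [1.0 / (rank ** s) for rank in range(1, vocab + 1)]
    return rng.choices(range(vocab), weights=weights, k=n_tokens)

def distinct_words(tokens):
    return len(set(tokens))

# As the "corpus" grows 100x, the number of distinct words keeps growing too:
tokens = zipf_sample(100_000)
counts = [distinct_words(tokens[:n]) for n in (1_000, 10_000, 100_000)]
```

Under a fixed-dimension assumption the last count would plateau at some constant; instead each tenfold increase in corpus size adds many new words, which is exactly why text representations cannot be treated as fixed-dimensional.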
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pereira, F.C.N. (2009). Learning on the Web. In: Gama, J., Costa, V.S., Jorge, A.M., Brazdil, P.B. (eds) Discovery Science. DS 2009. Lecture Notes in Computer Science, vol 5808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04747-3_3
DOI: https://doi.org/10.1007/978-3-642-04747-3_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04746-6
Online ISBN: 978-3-642-04747-3