Inferring Demographic Attributes of Anonymous Internet Users

  • Conference paper
Web Usage Analysis and User Profiling (WebKDD 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1836)

Abstract

Today it is quite common for web page content to include an advertisement. Since advertisers often want to target their message to people with certain demographic attributes, the anonymity of Internet users poses a special problem for them. The purpose of the present research is to find an effective way to infer demographic information (e.g., gender, age, or income) about people who use the Internet but for whom demographic information is not otherwise available. Our hope is to build a high-quality database of demographic profiles covering a large segment of the Internet population without having to survey each individual Internet user. Though Internet users are largely anonymous, they nonetheless provide a certain amount of usage information. Usage information includes, but is not limited to, (a) search terms entered by the Internet user and (b) web pages accessed by the Internet user. In this paper, we describe an application of the Latent Semantic Analysis (LSA) [1] information retrieval technique to construct a vector space in which we can represent the usage data associated with each Internet user of interest. Subsequently, we show how the LSA vector space enables us to produce demographic inferences: each user's LSA vector is supplied as input to a three-layer neural model trained using the scaled conjugate gradient (SCG) method.
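
As an illustration of the LSA step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a small term-by-user count matrix, a truncated singular value decomposition computed with NumPy, and the standard LSA fold-in projection for a user who was not in the original matrix. The toy data and the names `usage`, `build_lsa_space`, `fold_in`, and `cosine` are hypothetical and chosen only for this example.

```python
import numpy as np

# Hypothetical toy data: rows are terms (search terms or page identifiers),
# columns are users, and entries are usage counts.
terms = ["football", "stocks", "recipes", "cars"]
usage = np.array([
    [3.0, 0.0, 2.0, 0.0],
    [0.0, 4.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 5.0],
    [2.0, 0.0, 3.0, 0.0],
])


def build_lsa_space(term_by_user, k):
    """Truncated SVD: keep only the k largest singular triplets."""
    u, s, vt = np.linalg.svd(term_by_user, full_matrices=False)
    # Returns term vectors (terms x k), singular values (k,), user vectors (users x k).
    return u[:, :k], s[:k], vt[:k, :].T


def fold_in(usage_vector, term_vectors, singular_values):
    """Project one user's term counts into the existing k-dimensional space."""
    return usage_vector @ term_vectors / singular_values


def cosine(a, b):
    """Cosine similarity between two vectors in the LSA space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


term_vecs, sing_vals, user_vecs = build_lsa_space(usage, k=2)

# A previously unseen user who searched for "football" once and "cars" twice.
new_user = np.array([1.0, 0.0, 0.0, 2.0])
new_user_vec = fold_in(new_user, term_vecs, sing_vals)

for j in range(user_vecs.shape[0]):
    print(f"similarity to user {j}: {cosine(new_user_vec, user_vecs[j]):.3f}")
```

In a pipeline of the kind the abstract outlines, a vector such as `new_user_vec` would serve as the input to the neural model; the network and its SCG training are not shown here.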

References

  1. Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R., Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 1990.

  2. Landauer, T. K., & Dumais, S. T., How come you know so much? From practical problem to theory. In D. Hermann, C. McEvoy, M. Johnson, & P. Hertel (Eds.), Basic and applied memory: Memory in context. Mahwah, NJ: Erlbaum, 105–126, 1996.

  3. Landauer, T. K., & Dumais, S. T., A solution to Plato’s problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211–240, 1997.

  4. M. W. Berry et al., SVDPACKC: Version 1.0 User’s Guide, Tech. Rep. CS-93-194, University of Tennessee, Knoxville, TN, October 1993.

  5. N. Belkin and W. Croft. Retrieval techniques. In M. Williams, editor, Annual Review of Information Science and Technology (ARIST), volume 22, chapter 4, pages 109–145. Elsevier Science Publishers B.V., 1987.

  6. G. Golub and C. Van Loan. Matrix Computations. Johns-Hopkins, Baltimore, Maryland, second edition, 1989.

  7. Salton, G. (ed), The SMART Retrieval System — Experiments in Automatic Document Processing, Englewood Cliffs, New Jersey: Prentice-Hall, 1971.

  8. William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, Cambridge, 2nd edition, 1992.

  9. A. Zell et al., Stuttgart Neural Network Simulator: User Manual Version 4.1, University of Stuttgart, 1995.

  10. Dumais, S. T., Using LSI for information filtering: TREC-3 experiments. In: D. Harman (Ed.), The Third Text REtrieval Conference (TREC3), National Institute of Standards and Technology Special Publication, in press, 1995.

  11. Dumais, S. T., Improving the retrieval of information from external sources. Behavior Research Methods, Instruments and Computers, 23(2), 229–236, 1991.

  12. Dumais, S. T., Furnas, G. W., Landauer, T. K. and Deerwester, S., Using latent semantic analysis to improve information retrieval. In Proceedings of CHI’88: Conference on Human Factors in Computing, New York: ACM, 281–285, 1988.

  13. Dumais, S. T., “Latent Semantic Indexing (LSI) and TREC-2.” In: D. Harman (Ed.), The Second Text REtrieval Conference (TREC2), National Institute of Standards and Technology Special Publication 500-215, pp. 105–116, 1994.

  14. Dumais, S. T., “LSI meets TREC: A status report.” In: D. Harman (Ed.), The First Text REtrieval Conference (TREC1), National Institute of Standards and Technology Special Publication 500-207, pp. 137–152, 1993.

  15. Kaski, S., Dimensionality reduction by random mapping: Fast similarity computation for clustering. In Proceedings of IJCNN’98, International Joint Conference on Neural Networks, volume 1, pages 413–418. IEEE Service Center, Piscataway, NJ, 1998.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Murray, D., Durrell, K. (2000). Inferring Demographic Attributes of Anonymous Internet Users. In: Masand, B., Spiliopoulou, M. (eds) Web Usage Analysis and User Profiling. WebKDD 1999. Lecture Notes in Computer Science, vol 1836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44934-5_1

  • DOI: https://doi.org/10.1007/3-540-44934-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67818-2

  • Online ISBN: 978-3-540-44934-8

  • eBook Packages: Springer Book Archive
