
The Role of Algorithms in Profiling


Algorithms can play two essential roles in data mining. Firstly, in the form of procedures, they can determine how profiling is conducted by controlling the profiling process itself. For example, methodologies such as CRISP-DM have been designed to control the process of extracting information from the large quantities of data that have become readily available in our modern, data-rich society. In this role, algorithms can be tuned to assist in the capture, verification and validation of data, as discussed in the reply to Chapter 3.

Secondly, algorithms, predominantly in the form of mathematical procedures, can serve as the profiling engine that identifies trends, relationships and hidden patterns in disparate groups of data. Used in this way, algorithms can often compute more effective profiles than would be possible manually. In this chapter we show how algorithms find a natural home at the very heart of the profiling process and how machine learning can be used to address the task of knowledge discovery.
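As a rough illustration of what identifying relationships in data can mean in practice, the following minimal sketch counts how often pairs of items occur together and derives simple "if X then Y" rules from support and confidence. It is not the chapter's own method; the function name `pair_rules`, the thresholds and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return sorted (antecedent, consequent, support, confidence) rules.

    Only rules between two single items are considered (an assumption
    made to keep the sketch short).
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix the pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of records containing both items
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Hypothetical toy records, invented for this example.
records = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]
rules = pair_rules(records)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Even on four records the sketch shows the basic trade-off: lowering the thresholds yields more rules, but each rests on less evidence, which is one reason manual inspection does not scale to large data sets.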




Copyright information

© 2008 Springer Science + Business Media B.V

About this chapter

Cite this chapter

Anrig, B., Browne, W., Gasson, M. (2008). The Role of Algorithms in Profiling. In: Hildebrandt, M., Gutwirth, S. (eds) Profiling the European Citizen. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6914-7_4


  • DOI: https://doi.org/10.1007/978-1-4020-6914-7_4

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-6913-0

  • Online ISBN: 978-1-4020-6914-7

  • eBook Packages: Computer Science (R0)
