The Role of Algorithms in Profiling

Anrig, Bernhard; Browne, Will; Gasson, Mark

doi:10.1007/978-1-4020-6914-7_4

Algorithms play two essential roles in data mining. First, in the form of procedures, they determine how profiling is conducted by controlling the profiling process itself. For example, methodologies such as CRISP-DM have been designed to control the process of extracting information from the large quantities of data that have become readily available in our modern, data-rich society. In this role, algorithms can be tuned to assist in the capture, verification and validation of data, as discussed in the reply to Chapter 3.

Second, algorithms, predominantly in the form of mathematical procedures, can serve as the profiling engine that identifies trends, relationships and hidden patterns in disparate groups of data. Used in this way, algorithms can often compute more effective profiles than would be possible manually. In this chapter we show how algorithms find a natural home at the very heart of the profiling process and how machine learning can be used to address the task of knowledge discovery.

Author information

Bernhard Anrig: VIP, Berne University of Applied Science, Switzerland
Will Browne and Mark Gasson: University of Reading, England

Editor information

Mireille Hildebrandt and Serge Gutwirth: Erasmus University Rotterdam, The Netherlands; Vrije Universiteit Brussel, Belgium

Cite this chapter

Anrig, B., Browne, W., Gasson, M. (2008). The Role of Algorithms in Profiling. In: Hildebrandt, M., Gutwirth, S. (eds) Profiling the European Citizen. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6914-7_4

DOI: https://doi.org/10.1007/978-1-4020-6914-7_4
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6913-0
Online ISBN: 978-1-4020-6914-7
eBook Packages: Computer Science, Computer Science (R0)
