Global Classifier for Confidential Data in Distributed Datasets

Jasso-Luna, Omar; Sosa-Sosa, Victor; Lopez-Arevalo, Ivan

doi:10.1007/978-3-540-88636-5_30

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5317)

Abstract

Every day, many institutions produce huge amounts of data. In most cases these data are stored on centralized servers, where they are analyzed to extract knowledge. This knowledge takes the form of patterns or trends that become valuable assets for decision makers. Data analysis requires high-performance computing, which has motivated the development of Distributed Data Mining (DDM) architectures. DDM uses different distributed data sources to build a global classifier. Building a global classifier implies integrating all of the data sources into a single global dataset, which means that every participant has to share private data. Data owners sometimes regard this as an unwanted intrusion on data privacy. This paper describes a DDM application in which participants work interactively to build a global classifier for the data mining process without sharing the original data. Results show that the global classifier created in this way performs better than classifiers built individually and avoids data privacy intrusion.
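
The abstract does not say how the participants combine their work, so the following is only a minimal illustrative sketch of the general idea of building a global classifier without pooling raw data. It assumes, purely for illustration, that each participant trains a simple nearest-centroid model on its private data, shares only the fitted model, and that the global classifier is a majority vote over those local models. The names train_local, predict_local, and predict_global are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]   # (feature vector, class label)
Model = Dict[str, List[float]]         # class label -> centroid


def train_local(samples: List[Sample]) -> Model:
    """Fit a nearest-centroid model on one participant's private data.

    Only the returned centroids leave the site; the raw samples do not.
    (Assumed scheme for illustration, not the paper's method.)
    """
    sums: Dict[str, List[float]] = {}
    counts: Counter = Counter()
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict_local(model: Model, features: Sequence[float]) -> str:
    """Return the label whose centroid is closest to the feature vector."""
    def sq_dist(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def predict_global(models: List[Model], features: Sequence[float]) -> str:
    """Combine the shared local models by majority vote."""
    votes = Counter(predict_local(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical private datasets held by three participants.
site_a = [((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((5.0, 5.0), "high")]
site_b = [((0.9, 1.1), "low"), ((4.8, 5.2), "high"), ((5.1, 4.9), "high")]
site_c = [((1.1, 1.0), "low"), ((5.2, 5.1), "high")]

# Each site trains locally; only the models are collected.
shared_models = [train_local(data) for data in (site_a, site_b, site_c)]
print(predict_global(shared_models, (1.1, 0.9)))
```

In this sketch the privacy property rests on exchanging model parameters rather than records. Whether shared parameters leak information about the underlying data depends on the model and is outside the scope of the example.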

Author information

Authors and Affiliations

Laboratory of Information Technology, Center for Research and Advanced Studies, Cd. Victoria, Tam., Mexico
Omar Jasso-Luna, Victor Sosa-Sosa & Ivan Lopez-Arevalo

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, 07738, Mexico City, México
Alexander Gelbukh
Ciencias Computacionales, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Luis Enrique Erro #1 , Sta. María Tonantzintla, 72840, Puebla, México
Eduardo F. Morales

About this paper

Cite this paper

Jasso-Luna, O., Sosa-Sosa, V., Lopez-Arevalo, I. (2008). Global Classifier for Confidential Data in Distributed Datasets. In: Gelbukh, A., Morales, E.F. (eds) MICAI 2008: Advances in Artificial Intelligence. MICAI 2008. Lecture Notes in Computer Science, vol 5317. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88636-5_30

DOI: https://doi.org/10.1007/978-3-540-88636-5_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88635-8
Online ISBN: 978-3-540-88636-5
eBook Packages: Computer Science, Computer Science (R0)