Abstract
Knowledge discovery from large acoustic images is a computationally intensive task. The data-mining step in the knowledge discovery process that involves unsupervised learning (clustering) consumes the bulk of the computation. We have developed a technique that allows us to partition the data, distribute it to different processors for training, and train a single system to join the results of the independent categorizers. We report preliminary results using this approach for knowledge discovery with large acoustic images having more than 10,000 training instances.
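The partition, train, and join flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: a small k-means routine stands in for the unsupervised categorizer, Python threads stand in for separate processors, and the join step simply re-clusters the centers produced by each partition. All function names (`kmeans`, `partition`, `train_partitioned`, `assign`) are hypothetical and chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist2(point, centers[i]))


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization.

    Assumption: this stands in for whatever categorizer is trained on
    each partition; the paper's abstract does not prescribe k-means.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next seed is the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centers))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centers)].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group happens to be empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def partition(points, n_parts):
    """Split the data into n_parts interleaved subsets."""
    return [points[i::n_parts] for i in range(n_parts)]


def train_partitioned(points, k, n_parts):
    """Cluster each partition independently, then join the results.

    Assumption: the join is done by clustering the pooled local centers.
    Threads are used only to mimic independent workers.
    """
    parts = partition(points, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        local_centers = list(pool.map(lambda part: kmeans(part, k), parts))
    pooled = [c for centers in local_centers for c in centers]
    return kmeans(pooled, k)


if __name__ == "__main__":
    # Two well-separated synthetic groups of 40 points each.
    group_a = [(0.1 * (i % 5), 0.1 * (i // 5)) for i in range(40)]
    group_b = [(x + 10.0, y + 10.0) for x, y in group_a]
    data = group_a + group_b
    joined = train_partitioned(data, k=2, n_parts=4)
    print(joined)
    print(assign(data[0], joined), assign(data[-1], joined))
```

In this sketch every partition is trained without seeing the others, so the per-partition work could run on separate machines; only the small set of local centers has to be gathered for the final join.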
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wooley, B., Bridges, S., Hodges, J., Skjellum, A. (2000). Scaling the Data Mining Step in Knowledge Discovery Using Oceanographic Data. In: Logananthara, R., Palm, G., Ali, M. (eds) Intelligent Problem Solving. Methodologies and Approaches. IEA/AIE 2000. Lecture Notes in Computer Science, vol 1821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45049-1_11
DOI: https://doi.org/10.1007/3-540-45049-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67689-8
Online ISBN: 978-3-540-45049-8
eBook Packages: Springer Book Archive