Scaling the Data Mining Step in Knowledge Discovery Using Oceanographic Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1821)

Abstract

Knowledge discovery from large acoustic images is a computationally intensive task. The data-mining step in the knowledge discovery process that involves unsupervised learning (clustering) consumes the bulk of the computation. We have developed a technique that allows us to partition the data, distribute it to different processors for training, and train a single system to join the results of the independent categorizers. We report preliminary results using this approach for knowledge discovery with large acoustic images having more than 10,000 training instances.
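The partition, distribute, and join pattern described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: plain k-means stands in for the unsupervised categorizer, a thread pool stands in for separate processors, and the joining step simply clusters the centroids returned by the independent categorizers. The function names `kmeans`, `partition`, and `distributed_cluster` are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means over a list of equal-length tuples; returns k centroids.

    Assumption: a stand-in for whatever clustering system is actually used.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if a group happens to be empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def partition(points, n_parts):
    """Split the data into n_parts interleaved subsets."""
    return [points[i::n_parts] for i in range(n_parts)]


def distributed_cluster(points, k, n_parts):
    """Cluster each partition independently, then join the local results."""
    parts = partition(points, n_parts)
    # Threads are only a stand-in for separate processors in this sketch.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        local_results = list(pool.map(lambda part: kmeans(part, k), parts))
    # Joining step (assumed): train one model on the pooled local centroids.
    pooled = [c for centroids in local_results for c in centroids]
    return kmeans(pooled, k)


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(200)]
    data += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(200)]
    print(sorted(distributed_cluster(data, k=2, n_parts=4)))
```

Each partition is clustered without seeing the others, so the expensive step parallelizes; only the small set of local centroids has to be brought together for the final joining model.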

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wooley, B., Bridges, S., Hodges, J., Skjellum, A. (2000). Scaling the Data Mining Step in Knowledge Discovery Using Oceanographic Data. In: Logananthara, R., Palm, G., Ali, M. (eds) Intelligent Problem Solving. Methodologies and Approaches. IEA/AIE 2000. Lecture Notes in Computer Science, vol 1821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45049-1_11

  • DOI: https://doi.org/10.1007/3-540-45049-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67689-8

  • Online ISBN: 978-3-540-45049-8

  • eBook Packages: Springer Book Archive
