
Searching for Bent-Double Galaxies in the First Survey

  • Chapter
Data Mining for Scientific and Engineering Applications

Part of the book series: Massive Computing (MACO, volume 2)

Abstract

Data mining techniques are gaining popularity in many scientific domains as viable approaches to the analysis of massive data sets. In this chapter, we describe our experiences in applying data mining to a problem in astronomy, namely, the identification of radio-emitting galaxies with a bent-double morphology. Until recently, astronomers associated with the FIRST (Faint Images of the Radio Sky at Twenty-cm) survey identified these galaxies through visual inspection of images. This manual approach is not only subjective and tedious but is also becoming increasingly infeasible as the survey grows in size. Upon completion, FIRST will include almost a million galaxies, making the use of semi-automated analysis methods necessary. We describe the FIRST data set and the problem of identifying bent-double galaxies. We discuss our solution approach, focusing on the challenges we face in applying data mining to a scientific data set. We explain why, in contrast with most commercial data mining applications, data preprocessing requires considerable effort in scientific applications. We then describe our ongoing work on detecting bent-double galaxies using decision tree classifiers. Our results indicate that data mining techniques, steered by proper domain knowledge, can greatly enhance the manual exploration of massive data sets.




Copyright information

© 2001 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Kamath, C., Cantú-Paz, E., Fodor, I.K., Tang, N.A. (2001). Searching for Bent-Double Galaxies in the First Survey. In: Grossman, R.L., Kamath, C., Kegelmeyer, P., Kumar, V., Namburu, R.R. (eds) Data Mining for Scientific and Engineering Applications. Massive Computing, vol 2. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1733-7_6


  • DOI: https://doi.org/10.1007/978-1-4615-1733-7_6

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4020-0114-7

  • Online ISBN: 978-1-4615-1733-7

  • eBook Packages: Springer Book Archive
