Steps Toward a Large-Scale Solar Image Data Analysis to Differentiate Solar Phenomena
Abstract
We report the first application of several dissimilarity measures to large-scale solar image data analysis. Using a solar-domain-specific benchmark dataset that contains multiple types of phenomena, we analyzed combinations of image parameters and dissimilarity measures to determine which combinations allow us to differentiate between solar phenomena from both intra-class and inter-class perspectives, where a class is a single type of solar phenomenon. We also address dimensionality reduction by applying multidimensional scaling (MDS) to the dissimilarity matrices produced from these combinations. As an early investigation, we examine how many MDS components are needed to maintain a good representation of our data in the new artificial data space and how many can be discarded to improve querying performance. Finally, we present a comparative analysis of several classifiers to determine the quality of the dimensionality reduction achieved with these combinations of image parameters, dissimilarity measures, and MDS.
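To make the intra-class and inter-class comparison concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes each image is summarized by a small normalized histogram of one image parameter, and it uses the Euclidean distance and the Jensen-Shannon divergence as two illustrative dissimilarity measures. The toy histograms, the class labels, and all function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two histograms that each sum to 1."""
    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def dissimilarity_matrix(vectors, measure):
    """Symmetric pairwise dissimilarity matrix with a zero diagonal."""
    n = len(vectors)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = measure(vectors[i], vectors[j])
    return d


def intra_inter_means(d, labels):
    """Mean dissimilarity over same-class pairs and over different-class pairs."""
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        (intra if labels[i] == labels[j] else inter).append(d[i][j])
    return sum(intra) / len(intra), sum(inter) / len(inter)


# Hypothetical toy data: four images, two classes, four-bin histograms.
histograms = [
    [0.7, 0.1, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.1, 0.1, 0.1, 0.7],
]
labels = ["class_a", "class_a", "class_b", "class_b"]

MEASURES = {"euclidean": euclidean, "jensen_shannon": jensen_shannon}
results = {}
for name, measure in MEASURES.items():
    matrix = dissimilarity_matrix(histograms, measure)
    results[name] = intra_inter_means(matrix, labels)
    print(name, results[name])
```

Under these assumptions, a combination of image parameter and measure separates the classes well when the mean intra-class dissimilarity is clearly smaller than the mean inter-class dissimilarity. The same matrices would be the input to an MDS step.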
Keywords
Solar image data analysis; Content-based image retrieval (CBIR); Dissimilarity measures
Acknowledgements
This work was supported in part by the NASA Grant Award No. 08-SDOSC08-0008, funded from NNH08ZDA001N-SDOSC: Solar Dynamics Observatory Science Center solicitation. We would also like to thank our internal reviewers Michael Schuh and Richard McAllister.