Hierarchical Clustering of Spectral Images with Spatial Constraints for the Rapid Processing of Large and Heterogeneous Data Sets

  • Original Research
SN Computer Science

Abstract

When dealing with spectral images, i.e. images in which each pixel is characterized by a full spectrum, standard segmentation methods such as k-means or hierarchical clustering may be either inapplicable or inappropriate; one reason is the multi-GB size of such data sets, which leads to very expensive computations. In the present contribution, we propose an approach to spectral image segmentation that combines hierarchical clustering and spatial constraints. On the one hand, spatial constraints make it possible to implement an algorithm that obtains a segmentation in a reasonable computation time and with a certain level of robustness with respect to the signal-to-noise ratio, since the prior knowledge injected by the spatial constraint partially compensates for the increase in noise level. On the other hand, hierarchical clustering provides a statistically sound and well-known framework that allows accurate reporting of the instrument noise model. In terms of applications, this segmentation problem is encountered particularly in the study of ancient materials, which benefits from the wealth of information provided by the acquisition of spectral images. In the last few years, data collection has been considerably accelerated, enabling the characterization of samples with a high dynamic range in both the spatial dimensions and composition, and leading to an average size of a single data set in the tens of GB range. Hence, we also considered computational and memory complexity when developing the proposed algorithm. Within this application domain, we illustrate the proposed algorithm on an X-ray fluorescence spectral image collected on a ca. 100 Myr-old fossil fish, as well as on simulated data to assess the sensitivity of the results to the noise level. For such experiments, the lower sensitivity to noise simultaneously leads to an increase in the spatial definition of the collected spectral image, thanks to the faster acquisition time, and to a reduction in the potentially harmful radiation dose density to which the samples are subjected.
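The general idea of combining hierarchical clustering with a spatial constraint can be illustrated with a small sketch. It is not the algorithm of the article: it assumes a plain Ward criterion on Euclidean distances (the article's noise-adapted dissimilarity is not reproduced here), a 4-connected pixel neighbourhood, and a naive search for the best merge that would not scale to multi-GB data. The names `ward_cost` and `constrained_agglomeration` are hypothetical and chosen for this illustration only.

```python
from itertools import product


def ward_cost(size_a, sum_a, size_b, sum_b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    sq_dist = sum((sa / size_a - sb / size_b) ** 2 for sa, sb in zip(sum_a, sum_b))
    return size_a * size_b / (size_a + size_b) * sq_dist


def constrained_agglomeration(image, n_clusters):
    """Merge spatially adjacent clusters until n_clusters remain.

    image: list of rows, each row a list of spectra (lists of floats).
    Returns a label map (list of rows of integer labels).
    """
    height, width = len(image), len(image[0])
    pixels = list(product(range(height), range(width)))
    # Every pixel starts as its own cluster, keyed by its flat index.
    size = {r * width + c: 1 for r, c in pixels}
    total = {r * width + c: list(image[r][c]) for r, c in pixels}
    members = {k: [k] for k in size}
    # Spatial constraint (assumed here): only clusters sharing a pixel edge may merge.
    neighbours = {k: set() for k in size}
    for r, c in pixels:
        k = r * width + c
        if c + 1 < width:
            neighbours[k].add(k + 1)
            neighbours[k + 1].add(k)
        if r + 1 < height:
            neighbours[k].add(k + width)
            neighbours[k + width].add(k)

    while len(size) > max(n_clusters, 1):
        # Naive exhaustive search for the cheapest merge among adjacent pairs.
        _, a, b = min(
            (ward_cost(size[a], total[a], size[b], total[b]), a, b)
            for a in neighbours
            for b in neighbours[a]
            if a < b
        )
        # Merge cluster b into cluster a and rewire the adjacency graph.
        size[a] += size.pop(b)
        total[a] = [x + y for x, y in zip(total[a], total.pop(b))]
        members[a].extend(members.pop(b))
        for n in neighbours.pop(b):
            neighbours[n].discard(b)
            if n != a:
                neighbours[n].add(a)
                neighbours[a].add(n)

    # Relabel the surviving clusters 0, 1, 2, ... in order of their smallest pixel index.
    labels = [[0] * width for _ in range(height)]
    for label, k in enumerate(sorted(members)):
        for p in members[k]:
            labels[p // width][p % width] = label
    return labels


# Toy 2 x 4 image with two-channel "spectra": the left and right halves differ.
image = [
    [[10.0, 1.0], [11.0, 1.0], [1.0, 9.0], [1.0, 10.0]],
    [[10.0, 2.0], [12.0, 1.0], [2.0, 10.0], [1.0, 11.0]],
]
labels = constrained_agglomeration(image, 2)
```

Restricting candidate merges to adjacent clusters keeps the number of pairs to examine proportional to the number of cluster borders rather than to the square of the number of clusters, which is one way a spatial constraint can reduce computation time.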

Acknowledgements

We thank S. Charbonnier, G. Clément, N.-E. Jalil, Didier B. Dutheil (MNHN, Paris), A. Tourani (Cadi Ayyad University, Marrakesh), P.M. Brito (Rio de Janeiro State University, Rio de Janeiro), F. Khaldoune, H. Bourget and B. Khalloufi for organizing and/or participating in the field work during which the fossil was collected. This field expedition to Morocco was supported by the Muséum national d’Histoire naturelle through the “ATM Biodiversité actuelle et fossile” and by UMR 7207 CR2P. We acknowledge Synchrotron SOLEIL for provision of beamtime, and C. Mocuta and D. Thiaudière for assistance at the DiffAbs beamline. The authors would also like to thank the reviewers of the manuscript for their constructive comments and advice, which helped us improve the presentation of our results.

Author information

Contributions

This work arose from discussions between GC, SXC and AG. SXC proposed the exploitation of spatial constraints and the use of \(\chi^2\) as an adapted dissimilarity measure for XRF spectra. GC proposed the heuristic rule for deciding when to stop applying the spatial constraint to the segmentation. AG proposed a version of the \(\chi^2\) metric that is consistent between the spatially constrained initial steps and the unconstrained agglomerative steps, so that the Lance and Williams formulae could be used in the latter part. SXC and AG implemented the algorithm and the representations of its results in R. SXC proposed and implemented the color matching in the segmentation representation and the zero noise models. PG performed all the experimental measurements and interpretations on the fossil, and oriented the algorithm design to ensure that the results are valuable for the practitioner. All authors contributed to the writing of this manuscript.

Corresponding author

Correspondence to Serge X. Cohen.

Ethics declarations

Conflict of interest statement

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Cite this article

Celeux, G., Cohen, S.X., Grimaud, A. et al. Hierarchical Clustering of Spectral Images with Spatial Constraints for the Rapid Processing of Large and Heterogeneous Data Sets. SN COMPUT. SCI. 3, 194 (2022). https://doi.org/10.1007/s42979-022-01074-4

  • DOI: https://doi.org/10.1007/s42979-022-01074-4
