1 Introduction

Solar activity is currently increasing as the Sun nears its maximum for Solar Cycle 25, predicted to be around 2025 (Benson et al., 2020). Geoeffective solar eruptions are thus becoming more frequent and intense, as evidenced, for example, by the aurorae observed at unusually low latitudes (e.g. in the UK or France) in February 2023. Beyond aurorae, this increased activity may also jeopardise astronauts, satellites, aviation electronics and radio communications through the massive influx of highly energetic particles ejected by the Sun during these active times. Ground-based electronic systems can also suffer disruptions caused by geomagnetically induced currents. Our modern society and technosphere are therefore directly affected by these vigorous space weather (SW) events, whose scope of impact is not yet fully known. Maehara et al. (2017) even infer that the physical processes at play are the same on the Sun and on solar-type stars, where super-flares have been observed with energies between \(10^{33}\) and \(10^{35}\) erg, compared with a range of \(10^{28}\) to \(10^{32}\) erg for typical solar flares. Shibata et al. (2013) suggest that a super-flare may have occurred on the Sun in the 8th century; the non-zero probability of such events is a further impetus to analyse solar magnetism. A better understanding of the Sun’s activity and improved SW forecasts are therefore much needed.

A growing volume of solar activity data is at our disposal and must be used wisely in order to unveil the triggers of SW. For instance, sunspots are easily recognisable in optical white-light images as small dark patches, and counting them has long been a classic and convenient way of measuring solar activity. Sunspots are found in solar active regions (ARs), where they can reach very high magnetic-field strengths, often exceeding 3000 G. They consist of a darker umbra surrounded by a lighter outer penumbra (Chevalier, 1907; Siu-Tapia et al., 2017). Smaller compact magnetic flux concentrations without penumbrae (i.e. reduced to umbrae) are called pores. Okamoto and Sakurai (2018) reported the largest value recorded to date within an umbra, 6200 G, while van Noort et al. (2013) and Siu-Tapia et al. (2019) found values of more than 7000 G in some penumbrae. Sunspots appear dark in the photosphere, and are thus very easy to identify in photospheric images, because they are colder than the regions around them: the magnetic field is so intense that it inhibits convective motions (De Jager, 1963).

The most powerful manifestations of solar activity, such as flares and coronal mass ejections (CMEs), originate mainly from these ARs and sunspots (Zirin, 1970; Tend and Kuperus, 1978; Forbes et al., 2006; Mikić and Lee, 2006; Georgoulis, 2008; Liu et al., 2021). In particular, Patty and Hagyard (1986) and Shi and Wang (1993, 1994) found that complex delta-sunspot configurations are strongly correlated with the onset of X-class flares. Quantitative information about these solar regions (e.g. their evolution, lifespan, location, area and magnetic flux) is thus essential for a better understanding of the underlying physics and for improving SW forecasting. For example, Korsós, Baranyi, and Ludmány (2014), Korsós et al. (2015a,b), Korsós, Chatterjee, and Erdélyi (2018), Korsós, Yang, and Erdélyi (2019) and Erdélyi et al. (2022) investigated the correlation between X-class flares and the behaviour of the horizontal magnetic gradient inside delta-sunspots, with the aim of providing earlier flare warnings.

To investigate sunspot configurations, it has historically been usual practice to record them manually. However, visual inspection and hand-drawing are long and tedious. They are also not always objective, as the result depends heavily on how one counts. In recent years, automated methods for identifying sunspots have been implemented based on machine learning (e.g. the simulated annealing genetic method developed by Yang et al., 2018) or on more classical image processing techniques such as thresholding (e.g. the adaptive thresholding method of Hanaoka, 2022). Hanaoka (2022) also emphasises the importance of automated sunspot detection: while this method achieves a performance similar to that of sunspot drawing, it offers an undeniable saving of time, greater efficiency, and better generalisation to data from different sources.

In this paper, we present a mathematical morphology (MM) approach based on simple yet powerful mathematical transforms exploiting the geometrical shape, size, directionality, and local intensity of the objects of interest in images (e.g. sunspots). MM is a fast and easy-to-implement method. Other works, led by Barata et al. (2018) and Carvalho et al. (2020), have already exploited MM for automated solar plage and sunspot detection, respectively, and validated this approach qualitatively by visual inspection (more references to studies exploring MM in the field of heliophysics are given in Section 2.1).

Here, we seek to apply, further develop and validate the MM method in a quantitative manner, with the future goal of applying it to more complex solar image segmentation problems. We show that MM delivers results that are accurate and reliable compared to the current sunspot drawing databases. Although some fine-tuning may be needed, the method can deal with solar images from different instruments, whether space-borne or ground-based, and measure the area of sunspots in a consistent way, comparable to that of existing catalogues (Baranyi, Győri, and Ludmány, 2016; Győri, Ludmány, and Baranyi, 2017 and Mandal et al., 2020).

This work is organised as follows. In Section 2, we present the data and methods. In particular, Section 2.1 describes the mathematical morphology method, how it works, and its main transforms. Section 2.2 briefly introduces the satellite data needed for this work, as well as two solar catalogues selected for comparing and validating MM. Section 3 focuses on the detailed procedure for sunspot extraction and sunspot area measurement. In Section 4, we compare the sunspot areas obtained by MM with the data reported in the solar catalogues introduced in Section 2.2. We also discuss the performance, strengths and weaknesses of the MM method. Finally, we conclude and outline some desirable future prospects in Section 5.

2 Methodology

2.1 Mathematical Morphology

2.1.1 Solar Physics Applications

In this work, we adopt an automated method for sunspot detection based on MM algorithms. MM comprises a set of nonlinear image processing operations. Conceived jointly in 1964 by Georges Matheron and Jean Serra, MM was first used to study porous media (Matheron, 1967; Haas, Matheron, and Serra, 1967; Serra and École nationale supérieure des mines de Paris, 1969; Serra, 1982; Jeulin, 1989; Heijmans, 1995; Soille, 1999; Matheron and Serra, 2001; Serra, 2020). It was a groundbreaking technique for assessing porosity in images by considering the geometrical shape of image structures. The MM transforms, originally defined for binary images, were quickly extended to greyscale, colour and 3D images, thus opening up a wider scope of applicability. Examples of applications of this method can be found in other research areas such as medical imaging (Prêteux, 1992; Zhao, Zhang, and Ma, 2012), but its use in heliophysics is not yet common. Nevertheless, since the early 2000s, valuable studies have been carried out on sunspot detection through MM, providing promising results in terms of efficiency and accuracy. Zharkov et al. (2005) and Curto, Blanca, and Martínez (2008) proposed early applications of MM to automatic sunspot detection on full-disk solar images. Barata et al. (2018) pointed out, in their work on facular region identification with MM in chromospheric images from the Observatory of the University of Coimbra, that MM is a very suitable tool for solar image processing. It is able to detect, with high accuracy, complex and irregularly shaped objects in solar images that are of interest to SW forecasting. For instance, Qu et al. (2005) used MM techniques to automatically identify solar filaments in H\(\alpha\) full-disk images from the Big Bear Solar Observatory in order to investigate filament disappearances, while Koch and Rosolowsky (2015) applied them to far-infrared/submillimetre dust emission data from the Herschel Gould Belt Survey to probe the stability and brightness of interstellar filaments in star-forming cloud complexes.

Similarly, Shih and Kowalski (2003) implemented robust MM algorithms that excel in detecting large filaments and provide reasonable detection rates for smaller ones. MM can also be applied to simulation data to extract the flux rope structures present in the early phase of CMEs (Wagner et al., 2023). Stenning et al. (2013) classified sunspots according to the Mount Wilson scheme with MM and, more recently, MM algorithms have been used by Carvalho et al. (2020) and Ling et al. (2020) to identify sunspots, by du Toit et al. (2020) to track them for CME forecasting purposes, and by Hou et al. (2022) to extract both umbrae and penumbrae in order to investigate the long-term variation of the penumbra to umbra area ratio using hand-drawn sunspot data. Here, we use MM to automatically identify, construct, and draw the contours of sunspots and measure their area.

2.1.2 Structuring Element and Main Transforms

Applying MM transforms to images requires the user to fine-tune, a priori, the size and shape of a particular object called the structuring element (SE). This object plays the role of a kernel probing the image: the goal is to compare all the features of the image with the SE, which must be simpler than the features to be analysed. For extracting sunspots from full-disk solar images, for example, it is appropriate to select a circular SE in order to avoid any distortion of the solar disk (Carvalho et al., 2020) and also because the features of interest (e.g. sunspots) are roughly isotropic. If the image features have a preferential direction, it may be more suitable to choose a line-shaped SE, for instance in the case of coronal jets, which are thin, straight-line structures. The SE therefore has to be chosen carefully, taking into account (i) its shape, (ii) its dimensions, and (iii) its directionality (Soille, 1999). The different features in the image undergo transformations (e.g. they are excluded or included, reduced or enlarged, thinned or thickened, reconstructed, etc.) according to the transform that we apply to the image.

MM is built on two elementary yet important transforms: erosion and dilation. Since MM rests primarily on set theory, we first recall a set of fundamental definitions that are needed in what follows.

The erosion of a set X gathers the centre points \(x\) of an SE, denoted \(S\), such that \(S\) translated to \(x\), written \(S_{x}\), is included in X (Equation 1). This definition can be extended to greyscale images in general: Soille (1999) defined the erosion of an image f as the “minimum of the translations of f by the vectors −b of S” (Equation 2). In practical terms, the erosion computes an output pixel in an image as the minimum value of all the pixels present in a neighbourhood defined by the chosen SE.

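$$\begin{aligned} \epsilon _{S}(X) & = X \ominus S = \bigcap _{b \in S} X_{- b} = \{x | S_{x} \subseteq X\}. \end{aligned}$$
(1)
$$\begin{aligned} \epsilon _{S}(f) & = f \ominus S = \bigwedge _{b \in S} f_{- b}, \\ \forall x, \quad (\epsilon _{S}(f))(x) & = \min _{b \in S} f(x + b). \end{aligned}$$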
(2)

The dilation of a set X is defined by all the centre points \(x\) of \(S\) such that the translated SE \(S_{x}\) intersects X (Equation 3). This extends to greyscale image processing through Equation 4 (Soille, 1999): the dilation computes an output pixel as the maximum value of all the pixels present in a neighbourhood defined by the SE.

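$$\begin{aligned} \delta _{S}(X) & = X \oplus S = \bigcup _{b \in S} X_{- b} = \{x | S_{x} \cap X \ne \emptyset \}. \end{aligned}$$
(3)
$$\begin{aligned} \delta _{S}(f) & = f \oplus S = \bigvee _{b \in S} f_{- b}, \\ \forall x, \quad (\delta _{S}(f))(x) & = \max _{b \in S} f(x + b). \end{aligned}$$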
(4)
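
The greyscale definitions in Equations 2 and 4 can be illustrated with a minimal sketch. It assumes the `scipy.ndimage` implementations of flat greyscale erosion and dilation; the helper `disk` and the toy image are illustrative choices and are not taken from the processing chain of this work.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean disk-shaped structuring element (hypothetical helper)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2


# Toy greyscale image: uniform background with one dark and one bright pixel.
f = np.full((7, 7), 5.0)
f[2, 2] = 1.0  # dark "valley"
f[4, 5] = 9.0  # bright "peak"

S = disk(1)  # centre pixel plus its four direct neighbours

eroded = ndi.grey_erosion(f, footprint=S)    # local minimum over S (Equation 2)
dilated = ndi.grey_dilation(f, footprint=S)  # local maximum over S (Equation 4)
```

In this toy case the erosion spreads the dark pixel to its four neighbours, while the dilation does the same for the bright pixel; everywhere, the eroded image lies below the original and the dilated image above it.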

While, in linear image processing, the fundamental laws are based on the addition and multiplication operations with the scalar product, nonlinear MM processing builds upon infimum (⋀ for images, set intersection ⋂ in set theory) and supremum (⋁ for images, set union ⋃ in set theory) with the ordering relation (≤ for images, inclusion ⊆ in set theory) – an ordering relation satisfying the three properties of reflexivity (Equation 5), antisymmetry (Equation 6) and transitivity (Equation 7).

$$\begin{aligned} \forall x, & \quad x \leqslant x. \end{aligned}$$
(5)
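$$\begin{aligned} \forall x, y, & \quad x \leqslant y \quad \textrm{and} \quad y \leqslant x \Rightarrow x = y. \end{aligned}$$
(6)
$$\begin{aligned} \forall x, y, z, & \quad x \leqslant y \quad \textrm{and} \quad y \leqslant z \Rightarrow x \leqslant z. \end{aligned}$$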
(7)

Rather than the convolution of classical linear filtering, the basic operations in MM processing are the erosion and dilation transforms. An operation is considered a morphological filter if, and only if, it is increasing (that is, it preserves ordering) and idempotent (that is, applying it a second time does not change the result). Erosion and dilation are both increasing, as they preserve the inclusion relation (see Equation 8). They are not idempotent in general, however: eroding or dilating again with the same structuring element keeps shrinking or enlarging the image objects. Idempotence is obtained by combining them into the opening \(\gamma _{S}\) and the closing \(\phi _{S}\) (defined in Section 3, Equations 13 to 16), which are therefore morphological filters (see Equation 9).

$$\begin{aligned} (0,0) \in S \quad \textrm{and} \quad X \subseteq Y & \Rightarrow X \ominus S \subseteq Y \ominus S \quad \textrm{and} \quad X \oplus S \subseteq Y \oplus S. \end{aligned}$$
(8)
$$\begin{aligned} \gamma _{S}(\gamma _{S}(X)) & = \gamma _{S}(X) \quad \textrm{and} \quad \phi _{S}(\phi _{S}(X)) = \phi _{S}(X). \end{aligned}$$
(9)

Provided that the SE contains its origin, erosion is anti-extensive (an eroded image object is equal in size to or smaller than the original image object, see Equation 10), while dilation is extensive (a dilated image object is equal in size to or larger than the original image object, see Equation 11). Erosion and dilation are dual with respect to complementation (Equation 12): performing an erosion on the background of an image is equivalent to performing a dilation on its foreground (the background and foreground being interchangeable). In this paper, we aim to identify sunspot structures inside the solar disk (foreground). These two fundamental operations, erosion and dilation, are the foundation for the transforms we use in this study (detailed in Section 3).

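$$\begin{aligned} (0,0) \in S & \Rightarrow X \ominus S \subseteq X. \end{aligned}$$
(10)
$$\begin{aligned} (0,0) \in S & \Rightarrow X \subseteq X \oplus S. \end{aligned}$$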
(11)
$$\begin{aligned} X \oplus S & = (X^{\mathsf{c}} \ominus S)^{\mathsf{c}}. \end{aligned}$$
(12)
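
The properties in Equations 10 to 12 can be checked numerically on a small binary image. The sketch below assumes the binary operators of `scipy.ndimage` and a symmetric disk-shaped SE containing its origin; the duality is tested in the form "dilating X equals complementing the erosion of the complement of X by the same SE". The helper `disk` and the test set `X` are illustrative.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean disk-shaped structuring element (hypothetical helper)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2


# Toy binary set X: one large and one small rectangle.
X = np.zeros((40, 40), dtype=bool)
X[10:25, 8:20] = True
X[28:31, 30:33] = True
S = disk(3)

eroded = ndi.binary_erosion(X, structure=S)
dilated = ndi.binary_dilation(X, structure=S)

anti_extensive = bool(np.all(eroded <= X))  # Equation 10
extensive = bool(np.all(X <= dilated))      # Equation 11

# Duality: border_value=1 treats pixels outside the array as belonging
# to the complement of X, which is what the set-theoretic statement requires.
dual = ~ndi.binary_erosion(~X, structure=S, border_value=1)
duality_holds = bool(np.array_equal(dilated, dual))
```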

2.2 Data

In order to explore and validate the use of MM for sunspot identification, we rely on the SDO contrast-enhanced data (Section 2.2.1) and two solar catalogues (Section 2.2.2).

2.2.1 SDO Contrast Enhanced Data

In this work, we use the image data provided by the Debrecen Heliophysical Observatory (DHO) catalogue. The DHO catalogue consists of \(4096 \times 4096\) Solar Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI) full-disk images (Schou et al., 2012; Pesnell, Thompson, and Chamberlin, 2012; Couvidat et al., 2016). These images are contrast enhanced by the DHO (see an example in Figure 1). We have selected 61 images, taken on the 1st and 15th of each month between the 1st of January 2012 and the 1st of July 2014. In this study, we compare our results with two solar catalogues of sunspot data: the DHO and Mandal et al. (2020) catalogues.

Figure 1

The Sun on the 1st of January 2014: Solar Dynamics Observatory/Helioseismic and Magnetic Imager full-disk contrast enhanced intensity image - processed by the Debrecen Heliophysical Observatory.

2.2.2 DHO and Mandal et al. (2020) Catalogues

DHO Catalogue

The DHO catalogue is a rather precise database that has been recorded manually (e.g. hand-drawn sunspots). Daily observations of sunspot areas began at this observatory in 1958. The DHO took over the Royal Greenwich Observatory (RGO) campaign in 1976, and various catalogues have been created since, based on both space-borne (SDO, SOHO, etc.) and ground-based observatories (e.g. the Gyula Bay Zoltán Solar Observatory, GSO). In this paper, we use the DHO processed intensity images originating from the SDO satellite (see Section 2.2.1), and the sunspot area values provided by the DHO catalogue using SDO/HMI data (HMIDD).

Mandal et al. (2020) Catalogue

To further validate the results obtained with the MM method (sunspot contours and areas), we have considered a second database. Developed by Mandal et al. (2020), this catalogue covers a long period, between 1874 and 2019, with data coming from nine different observatories (RGO, Kislovodsk, Pulkovo, Kodaikanal, Solar Optical Observing Network (SOON), Rome, Catania, Yunnan and DHO). The data have been compared to one another and cross-calibrated according to their quality and temporal coverage. Mandal et al. (2020) found that the RGO data are the most reliable, as they include small sunspot areas (down to 1 \(\mu \)Hem), while the SOON data underestimate sunspot areas by almost 50% in comparison to RGO. According to Mandal et al. (2020), this is largely due to small sunspots, with an area smaller than 10 \(\mu \)Hem, not being detected in the SOON data. The differences between sunspot records are indeed quite significant, depending on the facilities, seeing conditions, data processing techniques, etc. The values obtained by Mandal et al. (2020) thus differ slightly from those provided by the DHO database, as shown in Section 4 (Figures 7 and 9).

3 Procedure

3.1 Pre-Processing

3.1.1 Resizing and Normalization

The (\(4096 \times 4096\)) SDO images are reduced to (\(1024 \times 1024\)) dimensions for quicker pre-processing and an easier generalisation of the algorithm to other types of images. The resized images are standardised, that is, we set the image mean \(\mu \) to 0 and the standard deviation \(\sigma \) to 1.
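
As an illustration of this step, the sketch below reduces an image by a factor of four and standardises it. The resampling scheme is not specified in the text, so block averaging is an assumption made here for simplicity, and the random array only stands in for a real SDO/HMI image. The function names are hypothetical.

```python
import numpy as np


def downsample_mean(image, factor):
    """Reduce an image by an integer factor by averaging factor x factor blocks."""
    h, w = image.shape
    if h % factor or w % factor:
        raise ValueError("image dimensions must be multiples of the factor")
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def standardise(image):
    """Shift and scale pixel values to zero mean and unit standard deviation."""
    image = image.astype(float)
    return (image - image.mean()) / image.std()


# Random stand-in for a 4096 x 4096 intensity image (not real data).
rng = np.random.default_rng(0)
full = rng.random((4096, 4096), dtype=np.float32)

small = standardise(downsample_mean(full, 4))  # 1024 x 1024, mean 0, std 1
```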

3.1.2 Reshaping and Resizing the Sun’s Contour

At this stage, we seek to smooth the contour of the Sun, as its borders are sometimes discontinuous in the DHO pre-processed images (see, for example, Figure 2a). We also chose to set the Sun’s radius to the reference value of 450 pixels used by Barata et al. (2018), who obtained good results with this choice. The main steps of this procedure are given below.

Figure 2

Pre-processing of the SDO images provided by the DHO database.

Firstly, a threshold (with a value set to \(10^{-4}\)) and a fill-holes transform are applied to identify the entire Sun in the image, disregarding small holes inside the solar disk that correspond to patches of lower intensity (e.g. sunspots). Then, opening (with a disk-shaped SE of size 300) and erosion (with a disk-shaped SE of size 5) transforms are performed to even out the borders of the Sun (some problematic borders are shown in Figure 2a). In set theory, the opening of a set \(X\) is defined in Equation 13 as the union of all the translated SEs \(S_{x}\) contained in \(X\) (Soille, 1999). For an image \(f\), the opening \(\gamma _{S}(f)\) corresponds to the erosion of \(f\) by \(S\), followed by the dilation of the eroded image with the same SE \(S\) (Equation 14). The erosion used in an opening operation is very powerful in the sense that it can completely wipe some details out of the initial image. In other words, the opening operation is irreversible, and the amount of filtered noise depends on the SE.

$$\begin{aligned} \gamma _{S}(X) & = \bigcup _{x} \{S_{x} | S_{x} \subseteq X\}. \end{aligned}$$
(13)
$$\begin{aligned} \gamma _{S}(f) & = \delta _{S}(\epsilon _{S}(f)). \end{aligned}$$
(14)

Once the contour of the Sun is smoothed out, we can get its label (see example in Figure 2b) and measure its area. Finally, the image is resized for the Sun to have a standard radius in each image: \(R_{target} = 450\) pixels.
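
The sequence above (threshold, fill holes, opening, erosion, area measurement, rescaling) can be sketched as follows. This is a minimal illustration under several assumptions: it uses `scipy.ndimage`, a synthetic 256-pixel image instead of a real full-disk image, SE sizes scaled down accordingly, an equivalent radius derived from the disk area, and linear interpolation for the final resizing. None of these choices is prescribed by the text, and the function names are hypothetical.

```python
import numpy as np
from scipy import ndimage as ndi

R_TARGET = 450  # target solar radius in pixels


def disk(radius):
    """Boolean disk-shaped structuring element (hypothetical helper)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2


def solar_disk_mask(std_image, threshold=1e-4, open_radius=300, erode_radius=5):
    """Binary mask of the solar disk with smoothed borders."""
    mask = std_image > threshold                                   # bright pixels
    mask = ndi.binary_fill_holes(mask)                             # fill dark patches inside the disk
    mask = ndi.binary_opening(mask, structure=disk(open_radius))   # remove ragged limb features
    mask = ndi.binary_erosion(mask, structure=disk(erode_radius))  # even out the border
    return mask


# Synthetic image: bright disk, one dark "sunspot", one bump on the limb.
n, r_true = 256, 100
yy, xx = np.mgrid[:n, :n]
image = ((xx - n // 2) ** 2 + (yy - n // 2) ** 2 <= r_true**2).astype(float)
image[120:126, 140:146] = 0.0  # dark patch inside the disk
image[24:28, 126:130] = 1.0    # irregularity attached to the limb
std_image = (image - image.mean()) / image.std()

# SE sizes are scaled down to suit the small synthetic image.
mask = solar_disk_mask(std_image, open_radius=30, erode_radius=2)

area_px = int(mask.sum())
radius_px = np.sqrt(area_px / np.pi)  # radius of a disk with the same area
scale = R_TARGET / radius_px
rescaled = ndi.zoom(image, scale, order=1)  # Sun now has a radius close to R_TARGET
```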

3.1.3 Area and Centre Coordinates of the Sun

Once the Sun, and thus the image, is resized, the same steps are repeated (standardisation, thresholding, filling holes, labelling) to calculate the new area \(A\) of the solar disk and the coordinates of its centre in pixels. Since the solar hemisphere has twice the area of the projected disk, one millionth of the solar hemisphere (\(\mu \)Hem) corresponds to \(A \times 2 \times 10^{-6}\) pixels.

3.2 MM Processing

Now that we have cleaned and resized the images, we can run MM algorithms on them to extract sunspots. The MM operation best suited to recovering small dark spots in greyscale images is the black top-hat transform (top left panel in Figure 3). The top-hat transform is so called because it compares areas with low levels of intensity, called valleys, to high level intensity areas, called peaks, by means of a top-hat-shaped object. The black top-hat transform, as described in Equation 17, is the difference between the closed image and the original image, a closing operation (see Equation 15) being a dilation followed by an erosion with the same structuring element (Equation 16). The black top-hat extracts the valleys, while a white top-hat transform would find the peaks in an image. Therefore, the transform defined in Equation 17 is well suited to identifying the dark sunspots inside ARs. In this work, we use a disk as SE with a size of 70 pixels in the black top-hat operation.

$$\begin{aligned} \phi _{S}(X) & = [ \,\bigcup _{x} \{S_{x} | S_{x} \subseteq X^{ \mathsf{c}}\}] \,^{\mathsf{c}}. \end{aligned}$$
(15)
$$\begin{aligned} \phi _{S}(f) & = \epsilon _{S}(\delta _{S}(f)). \end{aligned}$$
(16)
$$\begin{aligned} Bth(f) & = \phi _{S}(f) - f. \end{aligned}$$
(17)

Figure 3

Main steps of the MM algorithm. From top to bottom and left to right: Black top-hat, Fixed threshold (binary image), Opening by reconstruction, External gradient.

Then, we apply a threshold (set to 0.33) and an opening by reconstruction (with a disk-shaped SE of size 2.00001). In comparison to a traditional opening, this latter operation applies additional dilations to the eroded image, constrained by the original image, thus enabling a better reconstruction of the original structures. In other words, the image structures that survive the erosion are recovered with the shape they had before the transformation: pepper noise is filtered out by the erosion, while the larger dark structures (like sunspots) are preserved by the reconstruction.
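
The difference between a traditional opening and an opening by reconstruction can be seen on a toy binary image. The sketch below is one possible realisation, assuming `scipy.ndimage`: the eroded image serves as a marker, and `binary_propagation` repeats unit dilations restricted to the original image until stability. The SE used here, `disk(2)`, is a stand-in and does not reproduce the SE size quoted above; `opening_by_reconstruction` is a hypothetical helper.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean disk-shaped structuring element (hypothetical helper)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2


def opening_by_reconstruction(binary, selem):
    """Erode to obtain markers, then regrow the surviving objects inside `binary`."""
    marker = ndi.binary_erosion(binary, structure=selem)
    # Iterated dilation constrained by the original image (reconstruction by dilation).
    return ndi.binary_propagation(marker, structure=np.ones((3, 3), bool), mask=binary)


# Toy image: one object with a thin tail, plus two isolated noise pixels.
X = np.zeros((30, 30), dtype=bool)
X[5:15, 5:15] = True   # main object
X[9, 15:20] = True     # thin tail attached to it
X[22, 22] = True       # pepper noise
X[25, 5] = True        # pepper noise

S = disk(2)
plain = ndi.binary_opening(X, structure=S)  # traditional opening
recon = opening_by_reconstruction(X, S)     # opening by reconstruction
```

In this example, both operations remove the isolated pixels, but the traditional opening also clips the corners and the tail of the main object, whereas the reconstruction restores it exactly.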

Finally, a morphological gradient is applied to delimit sunspot contours. Just like the opening and top-hat transforms, it is defined from the key erosion and dilation transforms (Equation 18). Solar images show strong intensity variations at the boundaries around sunspots, and the morphological gradient helps to outline them. In particular, when the boundary is thick, we may apply half-gradients. Equation 19 gives the internal gradient, used to delineate the inner edges of objects in images, and Equation 20 gives the external gradient, used to delineate the outer ones. The external gradient, also called half-gradient by dilation, is applied here to obtain the outer boundaries of sunspot penumbrae.

$$\begin{aligned} \text{grad}(f) & = \delta _{S}(f) - \epsilon _{S}(f). \end{aligned}$$
(18)
$$\begin{aligned} \text{grad}^{-}(f) & = f - \epsilon _{S}(f). \end{aligned}$$
(19)
$$\begin{aligned} \text{grad}^{+}(f) & = \delta _{S}(f) - f. \end{aligned}$$
(20)

The internal and external gradients are complementary, and their sum is equal to the morphological gradient. When applying these gradient operators, one needs to be extremely careful about the signal-to-noise ratio in the image. Any spurious noise must be filtered out beforehand, as it would otherwise be enhanced by the morphological gradient operation, hence the importance of applying an opening operation first.
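
Equations 18 to 20 can be verified on a small example. The sketch assumes `scipy.ndimage` and a toy image containing a single square object; the helper `disk` is illustrative.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean disk-shaped structuring element (hypothetical helper)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2


# Toy image: a 4 x 4 object of value 1 on a background of value 0.
f = np.zeros((12, 12))
f[4:8, 4:8] = 1.0
S = disk(1)

dil = ndi.grey_dilation(f, footprint=S)
ero = ndi.grey_erosion(f, footprint=S)

grad = dil - ero     # morphological gradient (Equation 18)
grad_int = f - ero   # internal gradient (Equation 19)
grad_ext = dil - f   # external gradient (Equation 20)

sum_matches = bool(np.allclose(grad, grad_int + grad_ext))
```

The internal gradient marks the boundary pixels inside the object, and the external gradient marks the ring of pixels just outside it, which is the type of contour retained for the sunspots.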

To illustrate the method, the main processing operations for sunspot extraction are shown in Figure 3 and Figure 4 and summarised below:

  • Black top-hat in Figure 3 top left panel.

  • Fixed threshold in Figure 3 top right panel.

  • Opening by reconstruction in Figure 3 bottom left panel.

  • Half-gradient by dilation = external gradient in Figure 3 bottom right panel.

  • External gradient overlaid on the original image resized to the new dimensions in Figure 4.

    Figure 4

    External sunspot contours overlaid on the original SDO/HMI intensity image.
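
The four steps listed above can be chained in a short sketch. It assumes `scipy.ndimage` and a synthetic image with two dark "sunspots" and two noise pixels on a uniform background; the SE sizes are scaled to this toy image and are not the values used on the SDO data. The helpers `disk` and `paint_disk` are hypothetical.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean disk-shaped structuring element (hypothetical helper)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2


def paint_disk(image, centre, radius, value):
    """Set a filled disk of pixels to `value` (toy sunspot)."""
    yy, xx = np.ogrid[:image.shape[0], :image.shape[1]]
    image[(yy - centre[0]) ** 2 + (xx - centre[1]) ** 2 <= radius**2] = value


# Synthetic image: bright background, two dark spots, two dark noise pixels.
img = np.ones((64, 64))
paint_disk(img, (20, 20), 5, 0.2)
paint_disk(img, (40, 45), 3, 0.4)
img[10, 50] = 0.3
img[55, 12] = 0.3

# 1. Black top-hat: closing minus original, highlights dark structures.
bth = ndi.black_tophat(img, footprint=disk(8))
# 2. Fixed threshold.
binary = bth > 0.33
# 3. Opening by reconstruction: erosion, then constrained regrowth.
marker = ndi.binary_erosion(binary, structure=disk(1))
spots = ndi.binary_propagation(marker, structure=np.ones((3, 3), bool), mask=binary)
# 4. External gradient of the binary image: ring of pixels just outside each spot.
contour = ndi.binary_dilation(spots, structure=disk(1)) & ~spots

n_spots = ndi.label(spots)[1]
```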

3.3 Role of the Structuring Element

All the MM transforms that we have presented in this section (e.g. erosion, opening, black top-hat, opening by reconstruction) use a different SE of their own. These structuring elements must be carefully selected. It is not difficult to designate their shape (e.g. a disk) as sunspots are roughly circular, and any distortion of the solar disk is to be avoided when we apply different MM transforms on SDO/HMI image data. By contrast, their size has a major impact on the final results.

Figure 5 shows how the total projected sunspot area varies with the size of the SE for an intensitygram recorded on the 1st of January 2014. In Figure 5a, we present the sunspot area variations obtained when the size of the SE used for the black top-hat operation ranges from 2 to 300 pixels with a step of 2. The size of the SE has the greatest impact between 0 and 50 pixels, that is, roughly within the range of sunspot sizes in the processed images with a solar radius of 450 pixels. When the SE is too small, the black top-hat operation struggles to capture the smallest sunspots and thus underestimates the total projected sunspot area: an example is given in Figure 6b for a disk-shaped SE with a size of 10 pixels, where smaller and fainter sunspots are missed. By contrast, when the SE is too large, we outline larger boundaries around sunspots, for instance by including darker granules around penumbrae. This is not a major issue here, as the recorded sunspot area reaches a plateau when the size of the SE is increased further. In Figure 5b, the SE is the one used by the opening by reconstruction operation (one of the last steps of the MM sunspot extraction algorithm), and its size ranges from 1.5 to 3 pixels with a step of 0.01. This operation filters the image, so the SE has to be small. However, an SE that is too small would result in dark noise being retained, as can be seen in Figure 6a, where the SE has a size of 1.9 pixels. The choice of the SE size is therefore very important. Although finding the optimal MM parameters (i.e. the structuring elements) for one image may be time-consuming and require a lot of trial and error, the same parameters can then be applied to continuum images spanning an entire solar cycle, regardless of the variations in solar activity and sunspot area, making MM an effective and automated tool.
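
A scan of this kind can be sketched on synthetic data. The example below assumes `scipy.ndimage` and a noise-free toy image with two dark disks, so it only reproduces the qualitative behaviour of the black top-hat (underestimation for a small SE, then a plateau) and not the curves of Figure 5. The helpers `disk`, `paint_disk` and `detected_area` are hypothetical.

```python
import numpy as np
from scipy import ndimage as ndi


def disk(radius):
    """Boolean disk-shaped structuring element (hypothetical helper)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2


def paint_disk(image, centre, radius, value):
    """Set a filled disk of pixels to `value` (toy sunspot)."""
    yy, xx = np.ogrid[:image.shape[0], :image.shape[1]]
    image[(yy - centre[0]) ** 2 + (xx - centre[1]) ** 2 <= radius**2] = value


def detected_area(image, se_radius, threshold=0.33):
    """Number of pixels retained after black top-hat and thresholding."""
    bth = ndi.black_tophat(image, footprint=disk(se_radius))
    return int((bth > threshold).sum())


# Toy image: two dark disks of radius 6 and 3 pixels on a bright background.
img = np.ones((80, 80))
paint_disk(img, (30, 30), 6, 0.0)
paint_disk(img, (55, 55), 3, 0.0)
true_area = int((img == 0).sum())

# Detected area for a few SE radii: too small, large enough, larger still.
areas = {r: detected_area(img, r) for r in (2, 8, 12)}
```

With an SE smaller than the dark structures, the closing leaves their interior dark and the detected area falls short of the true one; once the SE exceeds the largest structure, the detected area no longer changes.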

Figure 5

Total projected sunspot area varying as a function of the SE size (on the left: for the black top-hat, on the right: for the opening by reconstruction).

Figure 6

Influence of the SE size used in the opening by reconstruction and black top-hat transforms on the resulting sunspot contours.

3.4 Sunspot Area Measurements

3.4.1 Projected Areas

Let us now calculate the areas enclosed by the sunspot contours found with the method described in Section 3.2. Each sunspot is labelled: its centre is located, and its area is measured and then converted to \(\mu \)Hem.
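
A minimal sketch of this step is given below. It assumes `scipy.ndimage` for the labelling, a solar radius of 450 pixels, and the conversion that one \(\mu \)Hem corresponds to \(2A \times 10^{-6}\) pixels, where \(A\) is the area of the solar disk in pixels (taken here as that of an ideal disk). The binary image is a toy example.

```python
import numpy as np
from scipy import ndimage as ndi

R_SUN_PX = 450                              # solar radius in pixels
disk_area_px = np.pi * R_SUN_PX**2          # area A of the projected solar disk
px_per_muhem = 2.0 * disk_area_px * 1e-6    # pixels in one millionth of the hemisphere

# Toy binary image with two "sunspots" of 200 and 25 pixels.
spots = np.zeros((100, 100), dtype=bool)
spots[10:20, 10:30] = True
spots[50:55, 60:65] = True

labels, n_spots = ndi.label(spots)                  # one integer label per sunspot
areas_px = np.bincount(labels.ravel())[1:]          # pixel count per label (0 = background)
centres = ndi.center_of_mass(spots, labels, range(1, n_spots + 1))  # (row, column)
areas_muhem = areas_px / px_per_muhem               # projected areas in muHem
```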

3.4.2 Corrected Areas

The foreshortening effect is a perspective effect occurring when a 3D object (the Sun) is projected onto a 2D imaging plane. It makes sunspots near the limb appear smaller than they actually are. It is therefore important to account for this effect in order to correctly estimate the true total sunspot area.

The heliographic latitude \(B_{0}\) and longitude \(L_{0}\) of the solar centre in the SDO processed images, along with the position angle \(P\) of the solar north pole, are needed at each date to correct the measured area for the foreshortening effect. We take these values from the DHO database. For each sunspot, the centre coordinates are calculated in pixels. From these coordinates, the fixed radius of the Sun (450 pixels), the apparent angular diameter (which varies slightly but is fixed here at 0.5°), and \(B_{0}\), \(L_{0}\) and \(P\), we can compute the heliographic latitude \(B\) and longitude \(L\) of each sunspot centre. The projected area \(A_{proj}\) is then corrected for the foreshortening effect using a correction factor \(d\): the corrected area is \(A_{corr} = A_{proj}/d\), with \(d\) given in Equation 21.

$$ d = (\sin (B_{0}) \times \sin (B)) + (\cos (B_{0}) \times \cos (B) \times \cos (L_{0} - L)). $$
(21)
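
The correction can be sketched as follows. The example rests on simplifying assumptions that are not part of the procedure described above: an orthographic projection (the finite Sun-observer distance, and hence the apparent angular diameter, is neglected), an image already rotated so that solar north is up (\(P = 0\)), and pixel offsets counted positive towards solar west and north. The function name and arguments are hypothetical.

```python
import numpy as np


def heliographic_position(dx_px, dy_px, radius_px, b0_deg, l0_deg):
    """Latitude B, longitude L (degrees) and factor d of Equation 21.

    dx_px, dy_px: offsets from the disk centre, positive towards solar
    west and solar north (assumed convention).
    """
    x = dx_px / radius_px
    y = dy_px / radius_px
    rho2 = x**2 + y**2
    if rho2 >= 1.0:
        raise ValueError("point lies outside the solar disk")
    z = np.sqrt(1.0 - rho2)  # component towards the observer

    b0 = np.radians(b0_deg)
    l0 = np.radians(l0_deg)
    b = np.arcsin(np.clip(y * np.cos(b0) + z * np.sin(b0), -1.0, 1.0))
    l = l0 + np.arctan2(x, z * np.cos(b0) - y * np.sin(b0))

    # Equation 21
    d = np.sin(b0) * np.sin(b) + np.cos(b0) * np.cos(b) * np.cos(l0 - l)
    return np.degrees(b), np.degrees(l) % 360.0, d


# A spot halfway between the disk centre and the western limb.
B, L, d = heliographic_position(225.0, 0.0, 450.0, b0_deg=5.0, l0_deg=100.0)
a_proj = 100.0       # projected area in muHem (illustrative value)
a_corr = a_proj / d  # corrected area
```

Under these assumptions, \(d\) reduces to the cosine of the heliocentric angle of the spot, so a spot halfway to the limb has \(d = \sqrt{0.75} \approx 0.87\) and its corrected area is about 15% larger than its projected area.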

4 Validation of the MM Method

In order to validate the MM method, we compare its output with the standard and relatively reliable solar catalogues introduced in Section 2.2, namely the DHO catalogue (Baranyi, Győri, and Ludmány, 2016; Győri, Ludmány, and Baranyi, 2017) and the catalogue created by Mandal et al. (2020), which is based on the cross-referencing of data from several observatories. Both projected and corrected sunspot areas are compared.

4.1 Comparison of Projected Areas

In Figure 7, we display in red the values of the projected sunspot areas obtained with the MM method. Sunspot areas are given in millionths of the solar hemisphere as a function of time, over approximately two and a half years, on the 1st and 15th of each month from the 1st of January 2012 to the 1st of July 2014, as Solar Cycle 24 progressed towards its maximum (reached in April 2014). For comparison, the values of the projected sunspot areas recorded at the DHO and in Mandal et al. (2020) on the same dates are also plotted in Figure 7, in blue and green, respectively. The sunspot areas measured by the different sources follow the same overall trend, although they vary slightly from one source to another. This is mostly due to different recording conditions at the observing facilities. In general, the values found by Mandal et al. (2020) seem to be larger than the other two: this catalogue is likely to include very small areas that the DHO and the MM method do not consider. The MM method, on the other hand, appears to persistently underestimate the area values in comparison to the two catalogues. This is due to the use of the MM opening by reconstruction operation, which performs well in eliminating small dark noise but also removes pores and small sunspots of the same size. Indeed, when solar activity is enhanced (until the maximum is reached in April 2014), the total sunspot area increases, with larger sunspots appearing on the solar disk, but also with small sunspots of low intensity emerging all over the solar disk and especially around the larger spots. While the former are accurately determined by the MM method, the latter are missed due to their small size and low intensity, although they can be numerous in photospheric images displaying a large total sunspot area.

Figure 7

Projected sunspot areas measured by the MM method (in red), the DHO catalogue (in blue), and Mandal et al. (2020) (in green).

In Figure 8, we investigate the correlation between these three data sets by plotting the linear fits between the MM results and the DHO data (Figure 8, top panel), between the MM results and Mandal et al. (2020) (Figure 8, middle panel), and between the standard Mandal et al. (2020) and DHO catalogues (Figure 8, bottom panel). The correlation coefficients between the three data sets are very good: 0.95 for DHO/MM, 0.96 for Mandal et al. (2020)/MM, and 0.97 for DHO/Mandal et al. (2020); the values are reported in Table 1. The correlation is better for MM/Mandal et al. (2020) than for MM/DHO, although Figure 8 highlights a larger dispersion at large sunspot areas, the catalogue provided by Mandal et al. (2020) being more precise in the count of pores and small sunspots and thus recording a higher total sunspot area. It seems that MM/Mandal et al. (2020) present a higher correlation because the gap between these two measurements remains more consistent (even if this gap is larger than the one between MM/DHO). All in all, we obtain correlation coefficients of 0.95 or above, which indicates mutually consistent results and, as might be expected, the two reference catalogues (Baranyi, Győri, and Ludmány, 2016; Győri, Ludmány, and Baranyi, 2017 and Mandal et al., 2020) provide the best correlation (see Figure 8, bottom panel).
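
For reference, a comparison of this kind can be computed as sketched below. The type of correlation coefficient is not specified above, so the Pearson coefficient is assumed here. The two arrays are synthetic stand-ins generated for the example (61 values, like the number of dates in this study); they are not the measured or catalogue areas, and the resulting numbers have no connection with Table 1.

```python
import numpy as np

# Synthetic stand-ins for 61 pairs of total sunspot areas (not real data).
rng = np.random.default_rng(42)
reference = np.linspace(200.0, 3000.0, 61)                     # "catalogue" areas
measured = 0.85 * reference - 30.0 + rng.normal(0.0, 80.0, 61)  # "measured" areas

# Pearson correlation coefficient between the two series.
r = np.corrcoef(reference, measured)[0, 1]

# Least-squares linear fit: measured = slope * reference + intercept.
slope, intercept = np.polyfit(reference, measured, 1)
```

A slope below one in such a fit would indicate a systematic underestimation relative to the reference series, independently of how high the correlation coefficient is.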

Figure 8 Comparison of the projected sunspot areas between the MM method and the reference Mandal et al. (2020) and DHO catalogues.

Table 1 Summary of the correlation coefficients between the three data sets (MM, DHO and Mandal et al., 2020) regarding both projected and corrected sunspot areas.

4.2 Comparison of Corrected Areas

We now compare the values of the corrected sunspot areas, which correspond to the true areas once the foreshortening effect is accounted for. As before, Figure 9 shows in red the corrected sunspot areas obtained with the MM method, expressed in millionths of the solar hemisphere, for the same dates: every 1st and 15th of each month, from 1st of January 2012 to 1st of July 2014. The DHO data are plotted in blue and the Mandal et al. (2020) values in green.
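For reference, the foreshortening correction is commonly written in the following form. This is a sketch in notation introduced here for illustration (\(A_{\mathrm{proj}}\), \(A_{\mathrm{corr}}\), \(r\), \(R_\odot\)), assuming that a spot is small enough to be assigned a single heliocentric angle \(\theta\) and that both areas are expressed in the same unit:

\[ A_{\mathrm{corr}} = \frac{A_{\mathrm{proj}}}{\cos\theta}, \qquad \cos\theta = \sqrt{1 - \left(\frac{r}{R_\odot}\right)^{2}}, \]

where \(r\) is the projected distance of the spot from the disk centre and \(R_\odot\) is the solar radius in the image. At \(r = 0.8\,R_\odot\), \(\cos\theta = 0.6\) and the corrected area is about 1.7 times the projected one; at \(r = 0.95\,R_\odot\), \(\cos\theta \approx 0.31\) and the factor exceeds 3. Any error in a contour drawn near the limb is amplified by the same factor.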

Figure 9 Corrected sunspot areas measured by the MM method (in red), the DHO catalogue (in blue), and Mandal et al. (2020) (in green).

Once again, we investigate in Figure 10 the correlation between these three data sets by plotting the linear fits between the MM results and the DHO data (Figure 10, top panel), between the MM results and Mandal et al. (2020) (Figure 10, middle panel), and between the DHO and Mandal et al. (2020) catalogues (Figure 10, bottom panel). Although the correlation coefficients are lower than those obtained with the projected areas, they remain high: Table 1 gives 0.91 for DHO/MM, 0.88 for Mandal et al. (2020)/MM, and 0.93 for DHO/Mandal et al. (2020). As with the projected areas, Figure 10 shows a larger dispersion for MM/Mandal et al. (2020) than for MM/DHO (due to the more thorough account of small-sized sunspots by Mandal et al., 2020) and, this time, the corresponding correlation coefficient is also the lowest, below 0.9. This could reflect shortcomings of the MM method in accurately drawing the contours of sunspots located near the solar limb, especially when they are numerous. For example, in the top and middle panels of Figure 10, a few points lie further from the fitted line in the top left-hand corner of the plots, showing that MM does not identify all the sunspots found by the other methods. Since these points do not appear as outliers in Figure 8, we conclude that the sunspots missed by MM, because of their small size and lower intensity, are located near the solar limb.

Figure 10 Comparison of the corrected sunspot areas between the MM method and the reference Mandal et al. (2020) and DHO catalogues.

5 Conclusion and Future Prospects

We have identified sunspot contours with an MM approach and measured the areas defined inside these contours. A good agreement is achieved between these values and the areas reported in reliable and standard catalogues. The analysis carried out validates the MM method, not only qualitatively by visual inspection, but also quantitatively, as the comparison with the two reference catalogues shows.

In this study, MM proves to be an effective tool for sunspot detection, and more effective than simple thresholding methods. Indeed, an appropriate threshold can be difficult to determine because of the limb darkening present in solar images. The MM black top-hat, on the other hand, factors in the local background around sunspots, thus counteracting local variations in brightness. Besides, MM can be used not only on low-resolution images (e.g. spectroheliograms from the Observatory of the University of Coimbra, Portugal, exploited by Barata et al., 2018 and Carvalho et al., 2020), but also on higher-resolution satellite image data, as shown in this work. These encouraging results give us confidence in the MM approach and in its application to more complex solar feature identification for SW forecasting purposes (e.g. umbra/penumbra segmentation, delta-sunspot and PIL detection). This can be achieved, for example, by using more complex MM transforms and/or a combination of the most straightforward operations.
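The advantage of the black top-hat over a global threshold under limb darkening can be sketched on a one-dimensional intensity profile across the disk. This is an illustration only, not the pipeline of this work: the quadratic darkening law, the spot depth, the SE length of 15 pixels and both threshold values are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _window_op(signal, k, op):
    """Apply `op` (np.min or np.max) over every window of k samples (k odd)."""
    padded = np.pad(signal, k // 2, mode="edge")
    return op(sliding_window_view(padded, k), axis=-1)


def black_top_hat(signal, k):
    """Closing (dilation then erosion) minus the signal: highlights dark features narrower than k."""
    closed = _window_op(_window_op(signal, k, np.max), k, np.min)
    return closed - signal


# Hypothetical profile: r is the normalised distance from disk centre.
r = np.linspace(-1.0, 1.0, 201)
intensity = 1.0 - 0.4 * r**2   # assumed limb-darkening law
intensity[98:104] -= 0.25      # spot near disk centre
intensity[180:186] -= 0.25     # spot near the limb

top_hat = black_top_hat(intensity, k=15)
tophat_mask = top_hat > 0.1    # threshold on local contrast

global_mask = intensity < 0.76  # loose global threshold: catches both spots...
strict_mask = intensity < 0.55  # ...strict global threshold: avoids the limb

print("top-hat pixels:", np.flatnonzero(tophat_mask).tolist())
print("loose global threshold pixels:", int(global_mask.sum()))
print("strict threshold finds central spot:", bool(strict_mask[98:104].any()))
```

In this toy profile, the loose global threshold flags quiet pixels near the limb along with the two spots, and the strict one misses the central spot, whereas thresholding the top-hat isolates both spots because the closing estimates the local background.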

However, one must keep in mind that the MM method requires a lot of trial and error to find the optimal size and shape for the SE, the appropriate transform to use and, above all, the ideal sequence in which the different MM operations should be combined to achieve a successful feature detection. Combining MM with machine learning (ML) in what is called a morphological neural network (MNN) could help to automate this process. Indeed, ML techniques applied to solar feature recognition have achieved outstanding success, such as convolutional neural networks for sunspot detection (Santos et al., 2023) or deep learning for sunspot classification (Chola and Benifa, 2022), among others. Mondal, Dey, and Chanda (2020) show that MM can be associated with ML in an MNN to yield better results than those provided by each method independently. While the ML algorithm would allow us to find the optimal MM parameters through an automated process, MM, on the other hand, would speed up the algorithm by significantly reducing the number of parameters that have to be learnt by the model. Besides, MM is able to provide some insight into the black box, since the image in an MNN is processed in terms of geometrical and topological operations, with the weights corresponding to the size and shape of the MM structuring elements. Therefore, a hybrid method combining the strengths of these two methods (MM and ML) could be considered in future work for the identification of complex-shaped solar features.
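To illustrate how the weights of an MNN can play the role of an SE, one common formulation of morphological neurons replaces the weighted sum of a conventional neuron with a maximum or minimum of sums. The notation below is an assumption made for illustration and is not necessarily the architecture of Mondal, Dey, and Chanda (2020):

\[ y_{\mathrm{dil}} = \max_{j}\left(x_j + w_j\right), \qquad y_{\mathrm{ero}} = \min_{j}\left(x_j - w_j\right), \]

where \(x_j\) are the pixel values in the neighbourhood of the output pixel and \(w_j\) are learnable weights. With all \(w_j = 0\), these reduce to dilation and erosion by a flat SE covering the neighbourhood, while non-zero \(w_j\) define a non-flat (grey-scale) SE whose shape is learnt during training.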