1 Introduction

It has been known for almost two hundred years that the occurrence of sunspots is cyclic, although not strictly periodic. The length of the sunspot cycle (SC) has varied between 9.0 and 13.7 years, and the shape of the cycle has also changed somewhat. Waldmeier noticed that sunspot cycles are asymmetric, with the ascending phase typically shorter than the declining phase, and that cycle amplitude is anti-correlated with the length of the ascending phase (Waldmeier, 1935, 1939). The number of sunspots has been observed at least since the 17th century, although the early measurements were scarce and somewhat inaccurate. These early observations also include hand-drawn figures, but only for short periods (Neuhäuser, Arlt, and Richter, 2018; Carrasco et al., 2021, 2022). The positions and areas of sunspots have been systematically recorded by the Royal Greenwich Observatory (RGO) since 1874 (Hathaway, 2013; Mandal et al., 2020). Recently, efforts have been made to estimate sunspot areas, and sometimes even to separate umbral and penumbral structures, from the early drawings (see the excellent review by Arlt and Vaquero, 2020). Since the late 19th century, photographic recordings have been made at many observatories. As stated above, RGO started its recordings in 1874 and continued them until 1976 (Hathaway, 2013); Kodaikanal Observatory in India has taken white-light images of sunspots since 1904 (Ravindra et al., 2013; Mandal et al., 2017; Jha et al., 2022); Mt. Wilson has collections from 1917 to 1982 (Howard, Gilman, and Gilman, 1984; Bogdan et al., 1988); and Debrecen has continued the RGO recordings from 1974 until today (Baranyi et al., 2001; Győri, Ludmány, and Baranyi, 2017).

There were several studies of the penumbra-umbra ratio (\(q\)) of sunspots during the 20th century. Nicholson (1933) and Waldmeier (1939) consistently found that \(q\) decreased as the sunspot size increased. Jensen, Nordø, and Ringnes (1955) also reported a slight decrease in \(q\) as a function of sunspot size, and found that \(q\) was higher during cycle maxima than during minima. During cycle minima, \(q\) also varied less with increasing sunspot size. Antalová (1971), in contrast, found that \(q\) is an increasing function of sunspot size and that, in double-maximum cycles, the ratio is higher at the first maximum than at the second. She also stated that \(q\)-values do not change with heliographic latitude.

Hathaway (2013) studied the daily records of sunspot group areas compiled by the Royal Observatory, Greenwich from May 1874 through 1976. He found that, on average, \(q\) increases from about 5 to 6 as the group area increases from 100 to 2000 \(\mu \)Hem and does not change with latitude or phase of the cycle. However, he found a peculiar change of \(q\) with time such that it decreased smoothly from more than 7 in 1905 to lower than 3 by 1930 and then increased again to over 7 in 1961. Carrasco et al. (2018) studied sunspot data from the catalog published by the Coimbra Astronomical Observatory (CAO) for the period 1929 – 1941. They did not find any such change during the analyzed period. There were also significant differences in \(q\)-values for the smaller groups between the RGO and CAO analyses. They state that the main differences are in the measurements of the umbral area and that, while the CAO instrument and methods did not change during 1929 – 1941, there were changes in the RGO arrangements, which could have contributed to the discrepancy.

Jha, Mandal, and Banerjee (2019) and Jha et al. (2022) studied \(q\)-values from the recently published Kodaikanal digitized white-light images from 1904 to 2017 and compared their results with the \(q\)-values of the RGO and Debrecen datasets for the same period. They found that \(q\) increases from 5.5 to 6 as the sunspot size increases from 100 to 2000 \(\mu \)Hem. They did not find any systematic trend in smaller (\(<\,100~\mu \text{Hem}\)) sunspots such as that reported earlier for the RGO database by Hathaway (2013). Furthermore, they found that the average \(q\) does not vary with latitude. They also compared the \(q\) time series calculated from the Kodaikanal dataset with those calculated from the RGO and Debrecen sunspot datasets. For sunspots \(<\,100~\mu \text{Hem}\), the Kodaikanal \(q\)-values fluctuate between 4 and 5 throughout, and for sunspots \(\geq\, 100~\mu \text{Hem}\) between 5 and 6, while the variation is much larger in the other two datasets (RGO and Debrecen). The Kodaikanal sunspot dataset therefore seems to be more homogeneous than the other two.

Recently, Hou et al. (2022) published an article where they analyze the recently digitized sunspot drawings observed from Yunnan Observatories (1957 – 2021) to study \(q\) for the Solar Cycles 19 – 24. They found that \(q\) is \(6.63 \pm 0.98\), and its probability distribution fits very well with the log-normal distribution function. They did not notice any dependence of the \(q\)-value on the latitude or the phase of the cycle.

In this study, we analyze the latitudinal distribution and temporal evolution of the sunspot penumbra-umbra ratio (\(q\)) for SC12 – SC23 (occasionally also SC24, which is, however, covered only until 2017). To get more data for solid statistics, we superimpose the even and odd cycles separately. To this end, we harmonize the duration of the cycles such that all have the same length, 128 months, which is about the average for Solar Cycles 16 – 23, i.e., the interval for the Kodaikanal dataset. This paper is organized as follows. Section 2 presents the databases used in this study. In Section 3, we study the \(q\)-values of the Debrecen dataset for Cycles 21 – 24. Section 4 deals with the \(q\)-values of Kodaikanal Cycles 16 – 24 and Section 5 with the \(q\)-values of RGO Cycles 12 – 20. In Section 6, we compare the results for the overlapping intervals of the datasets above, and we give our conclusions in Section 7.

2 Sunspot Area Databases

In the analyses of the sunspot area, we use three databases. The longest dataset for umbral and total sunspot areas is the recently published Kodaikanal database (Mandal et al., 2017; Jha et al., 2022), which is reconstructed from the white-light images of the same observatory since 1904. Because there are gaps before 1921, and Cycle 24 is incomplete, we use this dataset mainly for the Solar Cycles 16 – 23. Notice that this data contains areas of separate sunspots. Another database is that of the Royal Observatory, Greenwich-USAF/NOAA Sunspot Data for the years 1874 – 2016. This database contains, among others, time, latitude and whole area size (in millionths of solar hemisphere, \(\mu \)Hem) for individual sunspots for SC12 – SC23, and also umbra sizes of the sunspots for SC12 – SC20 (Hathaway, 2013). We use here the data for Solar Cycles 12 – 20, i.e., the period containing the whole and umbra areas of the sunspot group data. The third dataset is the Debrecen Photoheliographic Data (DPD), which consists of daily, group, and sunspot data (Baranyi et al., 2001; Győri, Ludmány, and Baranyi, 2017). Here we use whole and umbra areas of the sunspot group data, marked in the database with “g”. Because the Debrecen database seems the most precise for the recent Solar Cycles 21 – 24, we use it as a preliminary database. The minima and lengths for the Solar Cycles 12 – 24 used in this study are listed in Table 1.

Table 1 Sunspot-cycle lengths and dates [fractional years, and year and month] of (starting) sunspot minima for Solar Cycles 12 – 24 (NGDC, 2013).

In this study, we define the penumbra-umbra ratio (\(q\)) as (Antalová, 1971; Hathaway, 2013)

$$ q = \left (A_{\mathrm{Sp}}-A_{\mathrm{Um}}\right )/A_{\mathrm{Um}} , \tag{1} $$

where \(A_{\mathrm{Sp}}\) is the sunspot total area, and \(A_{\mathrm{Um}}\) is the umbral area of the sunspot.
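As a minimal sketch of Equation 1, the following Python function computes \(q\) for a single sunspot or sunspot group from its total and umbral areas. The function name and the input checks are illustrative assumptions and are not part of any of the databases used here.

```python
def penumbra_umbra_ratio(total_area: float, umbral_area: float) -> float:
    """Return q = (A_Sp - A_Um) / A_Um for one sunspot or sunspot group.

    Both areas must be in the same unit, e.g. millionths of the solar
    hemisphere (muHem). The validity checks below are an assumption of
    this sketch, not a rule taken from the databases.
    """
    if umbral_area <= 0:
        raise ValueError("umbral area must be positive")
    if total_area < umbral_area:
        raise ValueError("total area cannot be smaller than umbral area")
    return (total_area - umbral_area) / umbral_area


# Hypothetical example: a 120 muHem spot with a 20 muHem umbra.
print(penumbra_umbra_ratio(120.0, 20.0))  # 5.0
```

Because the umbral area is the divisor, records with very small or missing umbral areas need separate handling before \(q\) is averaged.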

3 Debrecen Sunspot Group Dataset

Figure 1 shows the temporal distribution of the penumbra-umbra ratio, \(q\), for Solar Cycles 21 – 24 (Cycle 24 only until 19 June 2018) of the Debrecen sunspot group data. To get more data for more solid statistics, we harmonize the solar cycles so that each has 128 months, which is about the average cycle length in the longest database of this study, i.e., Kodaikanal Solar Cycles 16 – 24. We then study the superimposed cycles of the Debrecen data for the period above. The blue vertical errorbars show the standard deviation of the monthly measurements at each point. Interestingly, there is a huge valley in the maximum region of the cycles, i.e., around 40 – 50 months. To analyze this valley in more detail, we plot the temporal distributions separately for areas smaller than or equal to 100 \(\mu \)Hem (small) and over 100 \(\mu \)Hem (large). Figure 2 shows \(q\) for the small and large sunspot groups as blue (red errorbars) and black (magenta errorbars) curves, respectively. The valley exists in both categories but is somewhat deeper for the small groups (a more detailed study showed that the decrease is deepest for groups of 50 – 100 \(\mu \)Hem). We believe that this decrease in \(q\) is related to the Gnevyshev gap (GG) (Gnevyshev, 1967) at about 35 – 40% from the start of the cycles (Takalo and Mursula, 2018). Figure 3a shows the number of small (blue) and large (red) sunspot groups during Cycles 21 – 23 (Cycle 24 is omitted because it is incomplete). Note that the number of large groups decreases more than the number of small groups during the GG. This is consistent with the result of Takalo (2020a) that the GG is more visible in large than in small sunspots. It is then understandable that, when the relative number of smaller sunspot groups increases, the \(q\)-values are lower in the GG region. There is another drop after 60 months, but it is caused mainly by Cycle 24, in which large groups disappear abruptly after the cycle maximum in 2014. Although the errorbars are quite wide, the shape of the penumbra-umbra curves is very clear and consistent with the fact that small groups dominate at the beginning and end of the cycles, while large groups are relatively more abundant during the cycle maximum (see the inset of Figure 3a). The mean \(q\)-value for small groups (\(\leq \,100~\mu \text{Hem}\)) is 5.44 (standard deviation, std = 0.90), for large groups (\(>\,100~\mu \text{Hem}\)) 6.09 (std = 0.82), and for all groups (Figure 1) 5.74 (std = 0.82). Note that the standard deviation in the brackets, here and later, is different from the errorbars in the figures: it is calculated from the average monthly (or average latitudinal) \(q\)-values during the cycles, so the larger this std is, the more variation (trends) there is in the distribution of the \(q\)-values.
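The text does not spell out how the cycles are rescaled to 128 months. One simple possibility, assumed here purely for illustration, is to map each observation time linearly onto 128 equal phase bins between consecutive cycle minima and then to pool the \(q\)-values of all cycles bin by bin. The function names, the use of the population standard deviation, and the cycle boundaries in the test data are hypothetical and are not the values of Table 1.

```python
from collections import defaultdict
from statistics import mean, pstdev

N_MONTHS = 128  # common (harmonized) cycle length used in the text


def harmonized_month(t, cycle_start, cycle_end, n_months=N_MONTHS):
    """Map time t (fractional years) within a cycle to a bin 0..n_months-1.

    Assumption of this sketch: the cycle is stretched or compressed
    linearly between its starting minimum and the next minimum.
    """
    if not cycle_start <= t < cycle_end:
        raise ValueError("t lies outside the cycle")
    phase = (t - cycle_start) / (cycle_end - cycle_start)
    return min(int(phase * n_months), n_months - 1)


def superpose(records, cycles, n_months=N_MONTHS):
    """Superimpose cycles and return {bin: (mean q, std q, count)}.

    records: iterable of (t, q) pairs, t in fractional years.
    cycles:  list of (start, end) pairs in fractional years.
    """
    bins = defaultdict(list)
    for t, q in records:
        for start, end in cycles:
            if start <= t < end:
                bins[harmonized_month(t, start, end, n_months)].append(q)
                break
    return {m: (mean(v), pstdev(v), len(v)) for m, v in sorted(bins.items())}
```

In this sketch, the per-bin standard deviation plays the role of the errorbars in the figures, whereas the std quoted in brackets in the text would correspond to the spread of the per-bin means along the cycle.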

Figure 1

Debrecen temporal penumbra-umbra ratio of all sunspots for Cycles SC21 – 24 (black curve). The blue errorbars show the standard deviations at each monthly point of the ratio.

Figure 2

Debrecen temporal penumbra-umbra ratio of the sunspot groups for Cycles SC21 – 24 separately for areas \(\leq\, 100~\mu \text{Hem}\) and \(>\,100~\mu \text{Hem}\) as blue and black curves, respectively. The errorbars are shown with red and magenta colors, respectively.

Figure 3

a) Number of small (blue) and large (red) sunspot groups as a function of time for Debrecen Solar Cycles 21 – 23. The inset shows the ratio of small/large groups along the cycles. b) Number of small (blue) and large (red) sunspot groups as a function of latitude for Debrecen Solar Cycles 21 – 23. The inset shows the ratio of small/large groups along the latitudes.

Figure 4a shows the Debrecen latitudinal distribution of \(q\) for the even and odd cycles of SC21 – SC24. The \(q\)-graph for the even cycles has a valley around zero latitude, apart from a deviating value at −3 degrees of latitude. The maximum values seem to be between 10 and 20 degrees in both hemispheres. This is also the region where the larger sunspots (\(>\,100~\mu \text{Hem}\)) are most abundant (see Figure 3b). It is, however, clear that the penumbra-umbra ratio is smaller near the Equator of the Sun. This is because smaller sunspots dominate near the Equator, which leads to a smaller \(q\)-value. Note from the inset of Figure 3b that the ratio of small to large sunspots is about four between −2 and 2 degrees, while it is under 2 from 5 to 25 degrees in both hemispheres. It is also evident that the odd cycles have higher \(q\)-values than the even cycles. This is because the odd cycles (21, 23) have more large sunspot groups than the even cycles (22, 24). Figure 4b shows the latitudinal distribution of the number of large sunspot groups for the even (blue) and odd (black) cycles. Note that the difference in \(q\)-values is largest at the latitudes where the difference in the number of groups is also largest. Figure 4b also shows that the number of large groups has a local minimum at or near 15 degrees of latitude. This is especially clear for the even cycles and is consistent with the results of Takalo (2020a,b) that large sunspots and sunspot groups have local minima at the time when the average latitude of sunspots crosses 15 degrees. We believe that this coincides with the Gnevyshev gap in the SSN and sunspot areas. The mean \(q\)-values for the sunspot groups of the even and odd Cycles 21 – 24 between latitudes −35 and 35 degrees are 5.53 (std = 0.56) and 6.33 (std = 0.55), respectively.

Figure 4

a) Debrecen latitudinal \(q\) for all sunspots of the even (blue with magenta errorbars) and odd (black with red errorbars) Cycles 21 – 24. b) Latitudinal distribution for the number of sunspot groups of the even (blue) and odd (black) Cycles 21 – 24.

4 Kodaikanal Sunspot Dataset

We start the study of the Kodaikanal dataset with a plot of the hemispheric umbral areas for Solar Cycles 18 – 24 in Figure 5 (note that Cycle 24 is incomplete). We restrict this figure to these cycles because of the base map, which shows the so-called Homogeneous Coronal Data Set (HCDS). The HCDS is the irradiance of the Sun as a star in the coronal green line (Fe XIV, 530.3 nm). It is derived from ground-based observations of the green corona made by the network of coronal stations (Kislovodsk, Lomnický Štít, Norikura, and Sacramento Peak). The coronal intensities have been measured at 72 points separated by five degrees, starting from the North Pole and proceeding counterclockwise around the Sun, at a height of about 50 arcsec above the solar surface. The values are calibrated to the center of the solar disk to get absolute values of intensity, i.e., absolute coronal units (ACU). One ACU represents the intensity of the continuous spectrum of the center of the solar disk in a width of one Ångström at the same wavelength as the observed coronal spectral line (1 ACU = 3.89 W m\(^{-2}\) sr\(^{-1}\) at 530.3 nm) (Minarovjech, Rušin, and Saniga, 2011; Takalo, 2022). This database exists only for SC18 – 24, and the colorbar on the right side of the figure gives the scale of the temporal and spatial coronal intensities. The green and red dots are sunspots with umbral areas of 75 to 150 \(\mu \)Hem and greater than or equal to 150 \(\mu \)Hem, respectively (for clarity, we do not show the smaller sunspots, so that the corona is also visible). The white curves show the (relative) hemispheric total umbral areas, and the vertical black lines mark the official maxima of the solar cycles. Significant information can be extracted from Figure 5. It is evident that the corona and the stronger sunspots coincide in time and location, although the corona covers a larger latitudinal region. The maximal corona starts at latitudes 35 – 40 degrees, similarly to sunspots, but migrates both toward the Equator and poleward as the cycle evolves, although it is fainter toward the Poles than on the equatorward side. Solar Cycle 19, the strongest cycle, is the most symmetric around the cycle maximum in the corona and in the northern-hemisphere sunspots, while most of the largest sunspots in the southern hemisphere are located in the ascending phase of the cycle. The other cycles, except Cycle 24, have a prolonged tail such that the majority of the large sunspots occur after the cycle maximum. Solar Cycle 24 has a hemispheric asymmetry such that the largest northern-hemisphere sunspots occur before the cycle maximum and the southern-hemisphere sunspots around it. Note also that Solar Cycles 23 and 24 have a much fainter HCDS corona than the other cycles.

Figure 5

The butterfly diagram of Kodaikanal dataset umbral areas for Solar Cycles 18 – 24. The green and red dots are sunspots with umbral area 75 to 150 \(\mu \)Hem and greater than or equal to 150 \(\mu \)Hem, respectively. The slightly smoothed white curves show the total umbral area for the northern and southern hemispheres of the Sun. The colors on the base of the figure show HCDS corona, whose intensities are shown on the right. The black vertical lines show the sites of the maxima of the cycles.

The recently published Kodaikanal sunspot dataset is the longest sunspot dataset recorded at a single station that also contains the areas of sunspot umbrae. We again harmonize Solar Cycles 16 – 23 so that each has 128 months. Figures 6a and b show the penumbra-umbra ratio (\(q\)) for all sunspots of the even and odd cycles of SC16 – SC23, respectively. These figures show that the odd cycles have a moderately higher ratio than the even cycles in the ascending phase of the average cycle. This means that the odd cycles have considerably more large sunspots in the ascending phase than after the maximum. On the other hand, the shape of the \(q\)-graph for the even cycles more closely resembles the shape of the sunspot cycle itself, i.e., the largest sunspots are in the middle of the cycle. The error limits are smaller than for the Debrecen data, especially for the even cycles. The mean \(q\)-values for the even and odd cycles are 5.27 and 5.43, with standard deviations of 0.36 and 0.47 calculated from the monthly \(q\)-values, respectively.

Figure 6

a) Kodaikanal temporal penumbra-umbra ratio (\(q\)) for all sunspots of even Cycles 16 – 23. b) \(q\) for all sunspots of odd Cycles 16 – 23.

Figures 7a and b show the latitudinal penumbra-umbra ratio for sunspots smaller than 100 \(\mu \)Hem and larger than 100 \(\mu \)Hem, respectively, for the even (blue with red errorbars) and odd (black with magenta errorbars) cycles. We again restrict the latitudes to between −35 and 35 degrees because the small number of umbral data (the divisor in the calculation of the penumbra-umbra ratio) causes quite erratic behavior at higher latitudes. Note that the error limits are nevertheless large around zero latitude and at latitudes higher than 25 degrees. The latitudinal distribution of \(q\)-values is quite flat for small sunspots but varies more for large sunspots. For large sunspots of the odd cycles, there is again a valley around zero latitude, whereas for the even cycles there are unreliable values (very wide errorbars) on both sides of zero latitude. The mean values of \(q\) between −35 and 35 degrees for the small and large sunspots are 4.79 (std = 0.32) and 6.02 (std = 0.37) for the even cycles, and 5.00 (std = 0.48) and 6.29 (std = 0.36) for the odd cycles, respectively.
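The latitudinal analysis can be sketched as follows: records are split into a small and a large category at 100 \(\mu \)Hem, restricted to latitudes between −35 and 35 degrees, and \(q\) is averaged in latitude bins. This is a minimal sketch under stated assumptions: the one-degree bin width, the use of the total area for the size split, the treatment of exactly 100 \(\mu \)Hem as small, and the function name are illustrative choices, not details given in the text.

```python
import math
from collections import defaultdict
from statistics import mean


def latitudinal_q(records, threshold=100.0, max_lat=35.0):
    """Average q per one-degree latitude bin for small and large spots.

    records: iterable of (latitude_deg, total_area, umbral_area),
             with areas in muHem.
    Returns {"small": {bin: mean q}, "large": {bin: mean q}}, where each
    bin is labelled by its lower edge in degrees.
    """
    groups = {"small": defaultdict(list), "large": defaultdict(list)}
    for lat, total, umbra in records:
        # Skip high latitudes and records without a usable umbral area
        # (the divisor in the definition of q).
        if abs(lat) > max_lat or umbra <= 0:
            continue
        size = "small" if total <= threshold else "large"
        groups[size][math.floor(lat)].append((total - umbra) / umbra)
    return {
        size: {b: mean(v) for b, v in sorted(bins.items())}
        for size, bins in groups.items()
    }


# Hypothetical records: (latitude, total area, umbral area).
sample = [(12.3, 80.0, 16.0), (12.8, 60.0, 10.0),
          (-3.5, 300.0, 50.0), (40.0, 500.0, 80.0)]
print(latitudinal_q(sample))
# {'small': {12: 4.5}, 'large': {-4: 5.0}}
```

Bins with only a few records give unstable means, which is one way to read the wide errorbars near zero latitude and above 25 degrees.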

Figure 7

a) Kodaikanal latitudinal penumbra-umbra ratio (\(q\)) of all sunspots smaller than 100 \(\mu \)Hem for even and odd Cycles 16 – 23. b) Latitudinal \(q\) for sunspots larger than 100 \(\mu \)Hem for even and odd Cycles 16 – 23.

5 RGO Sunspot Group Dataset

The Royal Greenwich Observatory (RGO) compiled sunspot group observations from a small network of observatories starting in May 1874. These observations lasted until 1976, after which the measurements have been continued in Debrecen. The RGO sunspot group dataset also contains umbral areas, which is why we can study penumbra-umbra ratios for the complete Solar Cycles 12 – 20 using this dataset. We use the same common cycle length as before in order to compare the results with the Kodaikanal distributions. Figures 8a and b show the temporal penumbra-umbra ratios of the RGO dataset for the even and odd cycles, respectively. In this case, \(q\) for the even cycles seems to be somewhat higher than for the odd cycles. The monthly \(q\)-graph is flatter for the odd cycles than for the even cycles, but both have slightly higher values in the descending part of the cycles. The mean values of \(q\) are 5.20 and 4.75, with standard deviations of 0.46 and 0.41, for the even and odd cycles, respectively. Note that here (Cycles 12 – 20) the even cycles have a higher average \(q\)-value than the odd cycles, while for Kodaikanal (Cycles 16 – 23) the even cycles have a lower average \(q\)-value than the odd cycles.

Figure 8

a) RGO temporal penumbra-umbra ratio (\(q\)) of all sunspots for the even Cycles 12 – 20. b) RGO temporal \(q\) of all sunspots for the odd Cycles 12 – 20.

Figures 9a and b show the latitudinal penumbra-umbra ratios for sunspot groups smaller than 100 \(\mu \)Hem and larger than 100 \(\mu \)Hem, respectively. The overall shape of the ratio for small groups is flat, or at most slightly convex upwards, with no valleys or humps. There are, however, huge errorbars in the small-group plots. These are due to the large variation in the RGO penumbra-umbra ratio, especially for small groups (Hathaway, 2013; Jha et al., 2022). As stated earlier, Carrasco et al. (2018) did not find such a large variation in the Coimbra Astronomical Observatory measurements for the period 1929 – 1941. There is a shallow valley around zero degrees for the larger sunspot groups, except for a couple of anomalous values at low negative latitudes. The maxima are between 5 and 20 degrees of latitude for the even cycles but are wider and flatter for the odd cycles. Note that, although there are valleys around zero latitude, they have only minor significance because the errorbars of the large-group \(q\)-values are also quite wide. The mean \(q\)-values between −35 and 35 degrees are 4.97 (std = 0.56) and 4.47 (std = 0.58) for the small groups, and 5.61 (std = 0.33) and 5.13 (std = 0.45) for the large groups, for the even and odd cycles, respectively.

Figure 9

a) RGO latitudinal penumbra-umbra ratio (\(q\)) of sunspots smaller than 100 \(\mu \)Hem for even and odd Cycles 12 – 20. b) RGO latitudinal \(q\) for sunspots larger than 100 \(\mu \)Hem for even and odd Cycles 12 – 20.

6 Comparison of Kodaikanal, RGO and Debrecen Sunspot Datasets

The RGO and Kodaikanal datasets overlap in Cycles 16 – 20, and the Kodaikanal and Debrecen datasets in Cycles 21 – 23. It is therefore appropriate to compare the datasets over these intervals. Figure 10 shows the penumbra-umbra ratios for RGO and Kodaikanal Solar Cycles 16 – 20. It is clear that the Kodaikanal \(q\) is moderately higher than the RGO \(q\), except at the end of the average cycle. The shape of the Kodaikanal \(q\)-graph is quite similar to that of the Kodaikanal even cycles (Figure 6), but the RGO \(q\)-graph has no trend. This is because the \(q\)-graphs of the separate cycles have opposite trends (C16 decreasing, C17 increasing, etc.; see also Jha et al., 2022), which cancel each other when the cycles are superimposed, so that the result is quite flat. The mean values for the average cycle of Cycles 16 – 20 are 5.35 and 4.75, with standard deviations of 0.33 and 0.26, for the Kodaikanal and RGO cycles, respectively.

Figure 10

Comparison of Kodaikanal and RGO temporal penumbra-umbra ratios for Solar Cycles 16 – 20.

Figure 11 shows the \(q\)-graphs for the Kodaikanal and Debrecen sunspot datasets for Solar Cycles 21 – 24. We show the errorbars only for the Kodaikanal \(q\)-values because the Debrecen graph is the same as in Figure 1. It is evident that Debrecen has higher values, except at the very beginning and at the end of the cycles. Moreover, the Debrecen \(q\)-graph has much more structure than the Kodaikanal \(q\)-graph. The most interesting difference between these \(q\)-graphs is the GG-related decrease at around 40 – 50 months in the Debrecen \(q\), which does not exist in the Kodaikanal \(q\). Note that the Kodaikanal \(q\) is highest in the ascending phase of the cycle and also has a shallow valley somewhat later than Debrecen, i.e., between 50 and 60 months. The mean \(q\)-value is 5.90 (std = 0.76) for Debrecen and 5.34 (std = 0.56) for Kodaikanal.

Figure 11

Comparison of Kodaikanal and Debrecen temporal penumbra-umbra ratios for Solar Cycles 21 – 24.

Figures 12a and b show the total and umbral areas for the Debrecen and Kodaikanal data, respectively. This figure may explain why the Kodaikanal \(q\) does not show the GG-related valley that is so distinct in the Debrecen temporal \(q\). The Debrecen data have a deep decrease in both the total sunspot area and the umbral area between 42 and 51 months, which is related to the GG phenomenon. The Kodaikanal data, on the other hand, have two drops: the first at 35 – 42 months and the second, very short one in the Debrecen GG region (47 months). It turns out that the first is related to the odd cycles and the second to the even cycles. The latter drop is narrow because the odd cycles have maxima at the time of the even-cycle minima. Neither of these drops, however, seems to cause the \(q\)-value to decrease during the GG regions. Note that the Kodaikanal areas are calculated from separate sunspots and the Debrecen areas from sunspot groups. However, we have also done a similar analysis for the Debrecen sunspot data (marked with “s” in the database), and it gives results consistent with the Debrecen \(q\)-values of the groups.

Figure 12

a) Total sunspot group areas and group umbral areas of Debrecen data for Solar Cycles 21 – 24. b) Total sunspot areas and sunspot umbral areas of Kodaikanal data for Solar Cycles 21 – 24. (Sunspot/group areas as blue color with the axis on the left side and umbral areas as red with the axis on the right side).

Figure 13a compares the RGO and Kodaikanal latitudinal \(q\)-values for Cycles 16 – 20 for areas larger than 100 \(\mu \)Hem. Here we also show a 13-point trapezoidal smoothing of the \(q\)-values. Trapezoidal smoothing is a common moving-average smoothing in which the end points of the window have half the weight of the inner points. The smoothed \(q\)-graphs have very similar shapes, with a shallow valley at the Equator. The average of the RGO \(q\) is again smaller than the average of the Kodaikanal \(q\), i.e., the mean values (standard deviations) are 5.11 (0.15) and 5.35 (0.15) for RGO and Kodaikanal, respectively. Figure 13b compares the Kodaikanal and Debrecen latitudinal \(q\)-values for Cycles 21 – 24 for areas larger than 100 \(\mu \)Hem. It is evident that the shape of the Kodaikanal \(q\)-graph is very similar to that of the Kodaikanal \(q\)-graph in Figure 13a, with a slightly deeper valley at the Equator in the smoothed curve. Note that the errorbars are very erratic, especially for Debrecen, because the \(q\)-values of the odd cycles are much higher than those of the even cycles. The valley of the Debrecen \(q\)-graph at the Equator is, however, still deeper, with clear maxima between 10 and 20 degrees in both hemispheres. The mean \(q\)-values (standard deviations) are 6.22 (0.34) and 6.36 (0.41) for Kodaikanal and Debrecen, respectively. It should, however, be noted that, while it is understandable that the \(q\)-values are smaller near the Equator because sunspots and sunspot groups are smaller there than at around 10 to 25 degrees of latitude, the error limits are so wide that the significance of this result is uncertain.
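The trapezoidal smoothing described above can be sketched in a few lines of Python. The window weights follow the description in the text (end points with half the weight of the inner points); the function name and the choice to return only those points where the full window fits are assumptions of this sketch, since the treatment of the edges is not specified.

```python
def trapezoidal_smooth(values, window=13):
    """Moving average whose two end points carry half the inner weight.

    Returns only the points where the full window fits, so the output
    is shorter than the input by window - 1 elements (an assumption of
    this sketch; the edge handling is not specified in the text).
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    weights = [0.5] + [1.0] * (window - 2) + [0.5]
    norm = sum(weights)  # equals window - 1
    smoothed = []
    for i in range(half, len(values) - half):
        segment = values[i - half:i + half + 1]
        smoothed.append(sum(w * v for w, v in zip(weights, segment)) / norm)
    return smoothed


# A linear series is left unchanged by any symmetric smoothing window.
print(trapezoidal_smooth([float(i) for i in range(15)]))  # [6.0, 7.0, 8.0]
```

With a 13-point window the weights sum to 12, so the smoothing averages over 12 effective bins centered on each point.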

Figure 13

a) Comparison of the latitudinal penumbra-umbra ratios (\(q\)) for Kodaikanal and RGO Cycles 16 – 20 with areas larger than 100 \(\mu \)Hem. b) Comparison of the latitudinal \(q\)-values for Debrecen and Kodaikanal Cycles 21 – 24 with areas larger than 100 \(\mu \)Hem.

7 Conclusions

We have analyzed the latitudinal and temporal distributions of the sunspot penumbra-umbra ratio (\(q\)) for RGO SC12 – SC20, Kodaikanal SC16 – SC23 (also the incomplete Cycle 24 until the end of 2017), and Debrecen SC21 – SC24 (SC24 until June 2018) to get enough data for solid statistics.

The analysis of \(q\)-values for the Kodaikanal Cycles 16 – 24 shows that the odd cycles have considerably more large sunspots in the ascending phase than after the maximum. For the even cycles, on the other hand, the shape of the \(q\)-graph more closely resembles the shape of the sunspot cycle itself, i.e., the largest sunspots are in the middle of the cycle. The mean \(q\)-values for the even and odd cycles are 5.27 and 5.43, respectively. In contrast, our analysis shows that sunspots under 100 \(\mu \)Hem are quite evenly distributed throughout the cycle for both the odd and even cycles. The latitudinal \(q\) for large sunspots and for all sunspots seems to be lowest around the Equator and to increase slightly toward higher latitudes. The latitudinal variation is, however, within the errorbars, and we consider it insignificant. For the small sunspots (\(<\,100~\mu \text{Hem}\)), our analysis again shows that the latitudinal distribution of \(q\) is flat. The latitudinal average \(q\) for all sunspots is of similar size to the temporal average \(q\), i.e., 5.41 and 5.65 for the even and odd cycles, respectively.

The analysis of \(q\) for the RGO Cycles 12 – 20 shows quite similar results, but here the \(q\)-values for the even cycles are higher than for the odd cycles. The mean values of \(q\) are 5.20 and 4.75 for the even and odd cycles, respectively. Note that the \(q\)-values are moderately smaller for the RGO dataset than for the Kodaikanal dataset. The latitudinal analysis of the RGO sunspot group data confirms the result mentioned above that \(q\) is lowest around the Equator and increases toward higher latitudes for the large (\(>\,100~\mu \text{Hem}\)) groups. Note that the errorbars for the RGO data are the largest because the level of the \(q\)-values changes so much between cycles, especially for small sunspot groups.

For the Debrecen sunspot data, we superimposed Cycles 21 – 24 in the temporal evolution of the \(q\)-values, because the Debrecen database covers a shorter interval, i.e., fewer cycles, than the other datasets. In this case, we find an interesting phenomenon. The \(q\)-graph has two clear maxima and a deep valley between them for both small and large sunspot groups. It turns out that the decrease in the \(q\)-values is simultaneous with the drop in the total and umbral areas of the Debrecen sunspot groups. We believe that this is related to the Gnevyshev gap, as the deepest point of the total and umbral areas is located 36% from the start of the cycle (Takalo and Mursula, 2018). The average \(q\)-value for the Debrecen cycles is 5.74. The latitudinal \(q\)-values for the Debrecen sunspot groups have the largest variation, being clearly lowest (except for some outlying values) at the Equator of the Sun. The decrease, as calculated separately for the even and odd cycles, seems to be deeper than the errorbars of one standard deviation. The mean values for the even and odd Cycles 21 – 24 between latitudes −35 and 35 degrees are 5.53 and 6.33, respectively.

We have also compared the overlapping intervals of the RGO and Kodaikanal datasets for Cycles 16 – 20 and of the Kodaikanal and Debrecen datasets for Cycles 21 – 23. The results are consistent with the separate analyses in that RGO has the smallest \(q\)-values and Debrecen the highest. The reason for the differences in \(q\) is probably the difficulty of locating the border between umbra and penumbra, especially for sunspot groups (Steinegger, Bonet, and Vázquez, 1997). This issue is quite complicated and is beyond the scope of this study. Furthermore, Foukal (2014) states that the main reason why spot areas recorded using photographic or CCD observations are larger than those based on drawings seems to be that spots too small to draw are still individually measurable on good plates and CCD images.

The Debrecen dataset, however, seems to have the smallest umbral areas, at least for large sunspots. The gradient method used in the analysis of the Debrecen sunspot data is described by Györi (1998). The Debrecen dataset also seems to give the most precise results in the sense that it shows the effect of the Gnevyshev gap in the temporal evolution of its \(q\)-values. The Gnevyshev gap is somehow related to the reversal of the magnetic field of the Sun. The underlying cause of the GG is beyond the scope of this study and remains an open question, but some promising explanations have been proposed (Georgieva, 2011; Karak, Mandal, and Banerjee, 2018). There have also been suggestions of a relic field in the Sun (Bravo and Stewart, 1996; Mursula, Usoskin, and Kovaltsov, 2001; Song and Wang, 2005). If the relic field exists, it might explain why the GG is more intense in the even cycles than in the odd cycles (Takalo and Mursula, 2020; Takalo, 2020b, 2021). It should be noted that the RGO and Kodaikanal datasets do not show the temporal decrease of the \(q\)-values, although they both have a clear twofold GG in their total area as a function of time for Solar Cycles 16 – 20 (see Figure 15). Note that the GG region in this case is a few months later than the GG of Debrecen for Solar Cycles 21 – 24.

The Kodaikanal research group uses its own procedure for defining the border, which is based on the threshold method introduced by Otsu (1979). The Kodaikanal data seem to be the most compactly concentrated around the mean \(q\)-value. This is seen in Figure 14, where we show the relative densities of the \(q\)-values for the RGO, Kodaikanal, and Debrecen databases. Kodaikanal has by far the smallest standard deviation of the \(q\)-distributions. Note that this standard deviation is related to the errorbars in the \(q\)-graphs shown earlier for each database. Of these three datasets, RGO seems to be the most heterogeneous, which is also seen as a curious drop of \(q\)-values in the first third of the 20th century (Hathaway, 2013, see also solarscience.msfc.nasa.gov/greenwch.shtml). Although the RGO data are one of the best sunspot area datasets, the drop in \(q\)-values for spots smaller than 100 \(\mu \)Hem has not been seen in any other dataset so far. Thus, one must proceed with caution when comparing \(q\)-values between different observatories. It should also be noted that corrections to the RGO database have been made recently (Erwin et al., 2013).
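How the relative densities of Figure 14 were computed is not described. One plausible reading, assumed here only for illustration, is a histogram of \(q\)-values normalized by the total number of values, which makes datasets of different sizes directly comparable. The function name and the bin width are hypothetical.

```python
import math
from collections import Counter


def relative_density(q_values, bin_width=0.5):
    """Return {bin lower edge: fraction of q-values falling in that bin}.

    The fractions sum to one, so distributions from datasets with very
    different numbers of records can be compared on the same axis.
    """
    if not q_values:
        raise ValueError("q_values must not be empty")
    counts = Counter(math.floor(q / bin_width) for q in q_values)
    n = len(q_values)
    return {k * bin_width: c / n for k, c in sorted(counts.items())}


# Hypothetical q-values, binned in steps of 1.0.
print(relative_density([4.1, 4.9, 5.5, 6.2], bin_width=1.0))
# {4.0: 0.5, 5.0: 0.25, 6.0: 0.25}
```

A narrow, tall distribution in this representation corresponds to a small standard deviation of \(q\), which is the property the text attributes to the Kodaikanal data.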

Figure 14

The relative densities of \(q\)-values for RGO, Kodaikanal, and Debrecen databases.

Figure 15

Total areas of a) Kodaikanal and b) RGO datasets for Solar Cycles 16 – 20.