Introduction

Cancer is one of the leading causes of death in Spain. In particular, lung cancer is one of the most fatal cancers, as well as one of the most common (Alonso et al., 1996; Cayuela et al., 2011). The increase in lung cancer female mortality has been documented in the literature, particularly at younger ages (Bosetti et al., 2012; Janssen et al., 2021; Levi et al., 2007; Rafiemanesh et al., 2016) and could be considered by some authors as a modern epidemic (Bray & Weiderpass, 2010; MacRosty & Rivera, 2020). Many studies previously mentioned that male mortality from lung-related cancer has decreased over the last few years in Spain, while female mortality has increased instead (Cayuela et al., 2007; Izarzugaza et al., 2010; Ocaña-Riola et al., 2013). According to the National Institute of Statistics (INE), lung cancer was responsible for more than 4000 deaths in 2019, a big leap compared to 1980 (slightly above 1000 deaths). While male mortality is still above the levels of female mortality, the direction of the ongoing trends for women is certainly a reason for concern (Remon et al., 2021).

Cigarette smoking (and other forms of tobacco consumption) is one of the main drivers that can cause not only lung cancer but also larynx, bronchial, and other respiratory tumors as well (which we will broadly consider as lung cancers in this article). Changes in lung cancer mortality are mainly considered a result of changes in smoking patterns (Franco et al., 2002). Smoking is currently one of the main hazards in population health in Spain, being responsible for a significant number of disability-adjusted life years and deaths (Haeberer et al., 2020; Soriano et al., 2018).

Since smoking is an acquired habit, it is considered an avoidable risk behavior, and it is also well known that prevalence and consumption are heavily mediated by generational components (associated with birth cohorts). This study aims to investigate the different risks of lung cancer mortality among cohorts of females across Spain and its regions and to identify factors associated with potential divergences in mortality risk across birth cohorts.

Background

It is accepted that the smoking epidemic in many Western societies began at the end of the 19th Century and the beginning of the 20th Century (Proctor, 2004; Slade, 1992). However, in the case of Spain, cigarette consumption was fairly limited before 1960, because many smokers were facing poverty and cigarettes were a rather expensive product (Villalbí et al., 2019). Later on, consumption increased alongside economic growth and the expansion of Tabacalera, a state-owned monopoly that made anti-tobacco regulation a difficult task. Furthermore, the Francoist government failed to prevent an increase in mass consumption by neglecting the health hazards induced by cigarettes, even as some of the first studies finding associations between cigarette smoking and lung cancer were published (Villalbí et al., 2019). From 1977 (when democracy in Spain was restored and the first general election of the new democratic period was held) until 2007, consumption of tobacco increased for females aged 45 and above, while decreasing for males of all ages (Fernández et al., 2003; Regidor et al., 2010). The later expansion of tobacco consumption when compared to other Western countries, the increased smoking prevalence among females, and the lagged health effects of the habit made lung cancer a health problem not only in contemporary Spain (Cayuela et al., 2020a, 2020b; Fernández et al., 2003; Franco et al., 2002) but also in its regions (Cayuela et al., 2004, 2011). This made lung cancer one of the few conditions among females that presented a mortality increase instead of a decline over time, making it a relevant phenomenon to study.

However, not everyone has the same propensity to develop lung cancer. Generally, the concept of a cohort is associated with the collective exposure of a certain group to a certain phenomenon (Glenn, 1976; Ryder, 1965) that has (or has not) differential outcomes when compared to other cohorts who had a different exposure. In this case, cohort effects would imply that some birth cohorts have a greater risk of dying of lung cancer than others, either because of smoking or because of exposure to other health hazards. Therefore, a cohort approach to analyze long-term mortality trends in Spanish females would investigate generational differences in lung cancer mortality and regional variations. Even in the light of some anti-tobacco policies that tried to mitigate active and passive consumption (Pinilla & Abásolo, 2017), diseases like cancer are the result of developing long-term exposures. In other words, birth cohorts have special consideration when analyzing trends of lung cancer mortality, revealing inequalities in health that otherwise may remain hidden given the generational association between lung cancer and cohorts.

Furthermore, the propensity to smoke is not equal across socioeconomic strata. Factors such as educational attainment or income (classic proxies of socioeconomic position) are strongly associated with tobacco use, with individuals with a lower social standing presenting higher levels of consumption in developed nations (Cavelaars et al., 2000; Huisman et al., 2005). However, that was not always the case: the tobacco epidemic in developed nations (Giskes et al., 2005; Graham, 1996) began as a habit associated with individuals of higher social standing (and probably wealthier regions) before individuals with a lower socioeconomic position caught up, as tobacco became more affordable. After that, divergence was present again, but with higher educated individuals presenting a lower prevalence, as awareness of the negative health outcomes of smoking became more widespread (Pärna et al., 2014; Regidor et al., 2001). That does not mean that consumption is absent in higher-educated women, and differences in prevalence are small, particularly for the older generations (Cavelaars et al., 2000; Haeberer et al., 2020; Huisman et al., 2005; Regidor et al., 2001). Evidence also points out that reducing or quitting smoking may result in lower lung cancer mortality (Godtfredsen et al., 2005; Hecht et al., 2004). In other words, behavioral changes (expressed in changes of prevalence at a population level) may be associated with the choice of smoking and, ultimately, with lung cancer mortality. Those behavioral changes may in turn be associated with structural factors, such as the social standing of groups of individuals or public policies limiting consumption.

After the new Spanish Constitution was enacted in 1978, a slow process that was completed in 2002 decentralized health care, policy, and prevention services to the autonomous communities, the largest subnational government unit in Spain (Marqués Fernández, 2003). The first Spanish legislation that acknowledged tobacco as a harmful substance was passed in 1988, with the Royal Decree Nº192 (Sasco et al., 2003), but no strong anti-tobacco laws such as Law 28/2005 (Peruga et al., 2021; Pinilla & Abásolo, 2017) were passed until 2005. This indicates that the responsibility for the prevention and regulation of tobacco consumption fell mostly on the regional (autonomous community) health care systems. Historically, the success of treatments in dealing with lung cancer has been limited (Jones & Baldwin, 2018). Hence, prevention and early detection are critical for improving the health situation of a population. Those aspects (and many other possible health determinants) can vary critically across space, with Spain not being an exception (Borrell et al., 2010; Rottenberg et al., 2019; Santos-Sánchez et al., 2020). Factors such as gross domestic product (La Torre et al., 2018) or public health expenditure (Crémieux et al., 1999; Martín Cervantes et al., 2019; Nixon & Ulmann, 2006; Rentería & Zueras, 2022) may also be related to effective policies that result in lower mortality or morbidity.

In Spain, lung cancer mortality trends across cohorts have been described and investigated, but only at a national, aggregate level (Bray & Weiderpass, 2010; Franco et al., 2002), or only at a specific province or autonomous community level (Cayuela et al., 2007; Ocaña-Riola et al., 2013). Considering that some of those females that could be exposed to lung cancer were born at the dawn of the 20th Century, we expect to identify differential risks across cohorts (in this case, females born between the years 1900 and 1975): Franco et al. (2002) identified an increase in the relative risk of lung cancer deaths between the cohorts of 1935 and 1955 but did not consider cohorts beyond that point. All of these studies have used statistical age-period-cohort models to arrive at such conclusions.

Therefore, it is critical to determine if the mortality risks across cohorts continued or declined afterward, to establish better healthcare policies, and identify which cohorts are more at risk, not only at a national level but also at a regional level, given the heterogeneous nature of mortality across space. To the best of our knowledge, there are no studies analyzing and comparing patterns of lung cancer across autonomous communities in Spain from a cohort perspective, identifying which are the most vulnerable birth cohorts at a regional level, nor trying to identify factors associated with regional divergence in mortality in the country. While we could expect that lower lung cancer mortality is associated with more developed regions (associated with higher GDP per capita, higher overall education, and lower prevalence of smoking), the temporal and generational nature of the smoking epidemic makes such associations less obvious when considering mortality across cohorts. Therefore, this study is more interested in broadly describing and understanding the dynamics of mortality across cohorts of females in different regions of Spain.

Materials and methods

For this study, we used a combination of data sources, all publicly available on the website of the Spanish National Statistics Institute (Instituto Nacional de Estadística, also known as INE, its acronym in Spanish). Materials and code for reproducing this paper are available at https://osf.io/x47sf/.

We considered lung, bronchial, and trachea cancer (cause 018 in the Spanish mortality file classification, and groups C34 for lung and bronchus and group C33 for trachea in the ICD-10 classification), larynx malignant tumors (cause 017 in the Spanish mortality file classification and group C32 in the ICD-10 classification), and other respiratory and thoracic-related tumors (cause 019 in the Spanish mortality file classification and group C76.1 in the ICD-10 classification) deaths as lung/respiratory cancer deaths (see Footnote 1). We also obtained the population exposures at a given year (on July 1st of each year) as the denominator to estimate death rates (see Footnote 2). Both death counts and INE population estimates are given in five-year age groups, starting at age 40. Figure 9 in the appendix presents the deaths attributable to lung cancer between 1980 and 2019 in Spain.

While there are various indirect methods to describe smoking-attributable mortality, we resorted to the Aggregate Exposure Method, also known as the Basic Method by Pérez-Ríos and Montes (2008), which consists in simply describing lung cancer death rates, and acknowledging that the causes behind such rates may not entirely be due to tobacco consumption. Prevalence-dependent methods to estimate smoking-attributable mortality (Piñeiro et al., 2022 is a good example of this type of study) rely on age-specific smoking rates for the whole period and autonomous communities, which is something we do not have for this study. And other methods that do not rely on prevalence such as the one proposed by Peto et al. (1992), have to rely upon lung cancer death rates for non-smokers in other target populations (using data from other countries such as the United States, given the unavailability of such estimates for Spain). This assumption was used in the past to estimate smoking-attributable mortality in the Spanish case (for instance, see Rey-Brandariz et al., 2022). Since we are not interested in calculating deaths attributable to tobacco specifically, but rather describing and identifying mortality risks of lung cancer across cohorts and regions, we believe that the use of the Aggregate Exposure Method is reasonable for our analysis, just like many of the previous studies that analyzed the evolution of lung cancer in Spain and its regions.

Given that lung cancer mortality is relatively scarce before age 40 (there were fewer than 50 deaths below age 39 at a national level per year in the 1980–2019 period), we model mortality based on death counts in five-year age groups lower-truncated at age 40 and upper-truncated at age 80. In cancer epidemiology, age-period-cohort analysis is a widely used technique because it can detect how some specific cohorts may have differential risks in a given health outcome (Murphy & Yang, 2018). Similarly, we pooled periods from 1980 to 2019 at five-year intervals, to adequately represent the required tabulation for cohort analysis symmetrically, but also to make sure we had observations (deaths) in all age groups for all periods. We believe this is better for the modeling tool we used to estimate age, period and cohort effects, because using single years would have resulted in multiple age-period tabulations without deaths (potentially skewing parts of the analysis due to the large presence of zeroes in the numerator). Therefore, birth cohorts from 1900 to 1975 are represented in the analysis, with Table 1 in the appendix presenting its tabulation. However, we acknowledge the possibility of using five-year age groups/one-year period bins for the analysis as an alternative. This implies a trade-off between a more detailed temporal estimation and less long-term stability, since some autonomous communities have observations with zero deaths in given age groups in certain years. We ultimately decided to prioritize stability in results in this paper, so we kept the five-year age groups/five-year periods tabulation.
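The tabulation described above can be illustrated with a minimal sketch. This is not the authors' code: the variable names are illustrative, and it assumes nine five-year age groups (40–44 to 80–84) and eight five-year periods (1980–84 to 2015–19), with each cell assigned to the cohort centred at the period midpoint minus the age midpoint.

```python
# Illustrative sketch: derive birth-cohort midpoints from five-year
# age groups and five-year periods (cohort = period - age).
age_starts = list(range(40, 85, 5))         # 40, 45, ..., 80 (nine groups, assumed)
period_starts = list(range(1980, 2020, 5))  # 1980, ..., 2015 (eight periods)

age_mid = [a + 2.5 for a in age_starts]
period_mid = [p + 2.5 for p in period_starts]

# Each age-period cell belongs to the cohort centred at period - age.
cells = [(a, p, p - a) for a in age_mid for p in period_mid]
cohorts = sorted({c for _, _, c in cells})

print(len(cells))               # 72 age-period cells
print(cohorts[0], cohorts[-1])  # 1900.0 1975.0
print(len(cohorts))             # 16 overlapping cohorts
```

Under these assumptions the sketch reproduces the 1900 to 1975 cohort range mentioned in the text; each cohort label is a midpoint of overlapping ten-year birth intervals, which is inherent to five-by-five tabulations.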

We chose all autonomous communities, except for the overseas territories of Ceuta and Melilla, due to their small size. Those autonomous communities correspond to the NUTS-2 level regional classification in Spain. Figure 10 in the appendix presents a map of such regions.

Before proceeding to the age-period-cohort modeling phase, we performed a set of exploratory procedures that are customary for this kind of analysis, starting by estimating the age-standardized death rates for each autonomous community. We used the average population structure of Spain in the 2019 period as the standard required by the direct method of standardization (considering the population aged between 40 and 84). Furthermore, we produced the classic combination of four plots: age-by-period, period-by-age, age-by-cohort, and cohort-by-age for observed rates (available in the appendix from Figs. 10, 11, 12, 13 and 14), to visually inspect the presence of period and cohort features.
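As a hedged illustration of the direct method of standardization mentioned above, the following sketch computes an age-standardized rate as the weighted mean of age-specific rates, with weights taken from a standard population. The function name and all numbers are made up for illustration; they are not INE data and not the authors' code.

```python
def age_standardized_rate(deaths, exposure, standard_pop):
    """Directly standardized rate: weighted mean of age-specific rates."""
    if not (len(deaths) == len(exposure) == len(standard_pop)):
        raise ValueError("all inputs must have one entry per age group")
    total_std = sum(standard_pop)
    weights = [s / total_std for s in standard_pop]    # standard age structure
    rates = [d / e for d, e in zip(deaths, exposure)]  # age-specific death rates
    return sum(w * r for w, r in zip(weights, rates))

# Made-up numbers for three age groups (not INE data).
deaths = [10, 30, 60]
exposure = [100_000, 100_000, 100_000]
standard_pop = [50_000, 30_000, 20_000]

asdr = age_standardized_rate(deaths, exposure, standard_pop)
print(round(asdr * 100_000, 1))  # 26.0 per 100,000
```

Because every region is weighted by the same standard age structure, differences between the resulting rates cannot be attributed to differences in regional age composition.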

Lung cancer death rates were modeled as estimable age, period, and birth cohort functions. We followed Carstensen’s procedure (2007), based on Holford’s approach (Clayton & Schifflers, 1987; Holford, 1983) to estimate age-period-cohort models.

In this approach, the APC effects are treated as non-linear continuous functions f(a), h(c), and g(p), respectively, along with a linear trend (drift), whose allocation depends on the chosen parameterization. While there is no way to solve the linear identification problem (given that age plus cohort equals period), we can identify non-linear effects (also known as second-order effects) on one given dimension, assign the linear trend to a different dimension and estimate the log-rates of the remaining dimension (generally age, because of its higher predictive power). In other words, in APC models only second-order (non-linear) effects are fully identifiable from a mathematical standpoint for the period and cohort dimensions. Previous analyses at the national or regional level had presented age-standardized mortality rates with an almost constant growth pattern (Cayuela et al., ). Therefore, we explicitly assumed that linear changes in mortality were attributable to period effects (also known as period-major parameterization), meaning that birth-cohort effects were estimated in a non-linear fashion, as relative risks with respect to the average cohort trend (known as “APC” parameterization in the Epi package), as was done previously in lung cancer studies (Takahashi et al., 2001).
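The linear identification problem can be made concrete with a small numerical sketch. The effect functions below are arbitrary toy choices (assumptions, not estimates from this study); the point is only that shifting a linear trend of any size k between the three dimensions leaves the predicted log-rate unchanged, because cohort equals period minus age.

```python
# Illustrative sketch of the linear identification problem (c = p - a).
K = 0.05  # arbitrary linear trend moved between dimensions

def f(a):  # toy age effect (assumed)
    return 0.08 * a

def g(p):  # toy period effect (assumed)
    return 0.01 * (p - 1982.5)

def h(c):  # toy cohort effect with curvature (assumed)
    return 0.02 * (c - 1935) - 0.0004 * (c - 1935) ** 2

def log_rate(a, p):
    return f(a) + g(p) + h(p - a)

def log_rate_shifted(a, p):
    # Add K*a to age, subtract K*p from period, add K*c to cohort:
    # K*a - K*p + K*(p - a) = 0, so the sum cannot change.
    c = p - a
    return (f(a) + K * a) + (g(p) - K * p) + (h(c) + K * c)

cells = [(a + 2.5, p + 2.5)
         for a in range(40, 85, 5)
         for p in range(1980, 2020, 5)]
max_gap = max(abs(log_rate(a, p) - log_rate_shifted(a, p)) for a, p in cells)
print(max_gap < 1e-9)  # True: both parameterizations fit identically
```

Only the curvature of each function survives such shifts, which is why the non-linear (second-order) components are the identifiable part of the model.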

This means that the dimension of interest (in this case, cohort) is constrained to have a zero slope and a zero average, becoming “detrended” as a result. This dimension is expressed as an interaction of the two remaining dimensions (Chauvel & Schröder, 2014) in a Poisson generalized linear model, with the maximum-likelihood estimates expressing the rate ratio compared to the average trend. That way, cohorts whose relative risks are above the trend have higher mortality than the average trend, and changes over time in the relative risks indicate the direction of the trend. The dimension carrying the drift (in this case, period) is presented in terms of the relative risk to a reference period of choice, considering both linear and non-linear effects together. While the choice is arbitrary, we chose this particular model because we wanted to determine the extent of cohort effects that are fully identifiable (meaning they are second-order).
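A minimal sketch of what "detrending" means in practice: the cohort effects are replaced by their residuals from a straight line, so that they have zero average and zero slope. This sketch uses an unweighted least-squares line on toy values, which is an assumption for illustration; the Epi package offers its own options for extracting the drift, and the numbers below are not estimates from this study.

```python
def detrend(x, y):
    """Remove the least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    residuals = [yi - mean_y - slope * (xi - mean_x) for xi, yi in zip(x, y)]
    return slope, residuals

cohorts = [1900 + 5 * i for i in range(16)]  # 1900, 1905, ..., 1975
# Toy log-effects: a linear trend plus a bump for 1935-1955 (assumed values).
raw = [0.03 * (c - 1900) + (0.3 if 1935 <= c <= 1955 else 0.0) for c in cohorts]

drift, h = detrend(cohorts, raw)
residual_slope, _ = detrend(cohorts, h)

print(abs(sum(h)) < 1e-9)           # True: zero average
print(abs(residual_slope) < 1e-9)   # True: zero slope
```

What remains in `h` is the curvature only: values above zero mark cohorts faring worse than the average trend, and values below zero mark cohorts faring better.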

The “Epi” package was developed in R software (Carstensen et al., 2021) and provided the necessary tools for the analyses, and allowed the user to choose among a series of options for modeling. Equation 1 presents the basic Age-Period-Cohort equation:

$${\text{ln}}\left( {d\left( {a, p} \right)} \right) \, = r_{p0} \left( a \right) \, + \delta \left( {p - p_{0} } \right) \, + g\left( p \right) \, + h\left( c \right)$$
(1)

where rp0(a) are the age-specific prevalence rates in the reference period p0; δ represents the slope of the drift relative to the period of reference; h(c) is the non-linear cohort function, and g(p) is the non-linear period function relative to the period of reference. The sum of period effects is interpretable as the log relative risk to the period of reference p0. Apart from that, we also presented the contribution of each parameter to deviance reduction, considering the linear trend, the possible presence of non-linear period effects, and a full APC model (Carstensen, 2007; Clayton & Schifflers, 1987). This was done to determine the intensity of the drift (the average linear change over time), which could be potentially useful for the interpretations made. Given that in this approach dimensions are considered continuous variables, we chose the 1980–84 period (that has the year 1982.5 as the mid-point) as the reference for the APC model to visualize period changes relative to the beginning of the trend.

However, for a different perspective of the cohort effects we also presented an alternative parameterization, presented in Eq. 2, which is cohort major (“ACP” parameterization in the Epi package). In this case, age-specific rates are longitudinal to the reference cohort c0, the linear trend δ is attached to the cohort dimension relative to the reference cohort c0, and period effects are detrended and shown as non-linear compared to the average trend. In this case, the reference cohort is the 1935 birth cohort (midpoint of the range between 1932.5 and 1937.5 birth cohorts which are expressed as the difference of period minus age).

$${\text{ln}}\left( {d\left( {a, c} \right)} \right) \, = r_{c0} \left( a \right) \, + \delta \left( {c - c_{0} } \right) \, + g\left( p \right) \, + h\left( c \right)$$
(2)

For easier visualization, we chose natural cubic splines to fit the models (Carstensen, 2007). To extract the drift, we chose the standard naïve weights, and five spline knots to fit each one of the age, period, and cohort dimensions. Finally, we produced a Lexis Surface of the smoothed age-specific death rates (using the smooth functions that the Epi package provides in the apc.fit function), to visually inspect the presence of period or cohort features based on the curvature of the surfaces (Acosta & van Raalte, 2019; Rau et al., 2018).

Results

Figure 1 presents the age-standardized death rates for the analyzed period for each autonomous community. Death rates increased across all regions and at a national level. However, death rates were relatively stable until the 1990–1995 period (except for Madrid, which presents an increase right from the start) when they started to increase across all regions of Spain. At the end of the analyzed period, regions such as Cantabria, the Canary Islands, and the Basque Country had the highest death rates, while Andalusia, Castile-La Mancha, and Murcia had some of the lowest death rates.

Fig. 1 Lung cancer age-standardized death rate in Spanish females by autonomous community, 1980–2019 (Source: author's calculations based on INE)

Along with the standardized rates, the traditional set of two-dimensional plots (that can be found in the corresponding appendix in Figs. 11, 12, 13, 14, 15 and 16) indicated a combination of period and cohort effects present for females, as indicated by the frequent crossing of trend lines within each time perspective, and the fact that death rates were the highest in certain cohorts (such as the one between 1956 and 1960) that were in the middle of the distribution.

Figure 2 presents the Lexis Surface of the smoothed death rates (with contour lines for values ranging from 1 to 4 by 0.5), with yellow for the highest values and purple for the lowest. The diagonal lines underline effects that are indicative of the progressive cohort effects, indicating that over time newer cohorts are performing worse than older cohorts when it comes to lung cancer.

Fig. 2 Lexis Surface of smoothed lung cancer death rates for females across autonomous communities of Spain (Source: author's calculations based on INE)

However, in some regions, we can also see some lines that are more vertical than oblique, which should be indicative of a combination of period and cohort effects (such as the Basque Country, Asturias, or Cantabria) since many age groups were similarly affected in a given moment. The shape of the curves at the right end of the surface is also telling: autonomous communities with a higher curvature of the contour lines at recent periods (Basque Country, Balearic Islands, Madrid, Navarra among others) are indicative of mortality stabilization in the last five years, which is coincident with some of the trends shown in Fig. 1. And those who present less curvature (like Castile & Leon or Extremadura) suggest that relative risks have not stabilized yet.

Figure 3 indicates the contribution of each additional parameter to the reduction of deviance (when compared to a single-factor age model), presenting two different scenarios: one in which two-factor Age-Period (AP) models are considered and the other in which Age-Cohort (AC) models are considered instead. Overall, the age-drift (AD, which could be period or cohort-dependent) model is responsible for almost seventy percent of the deviance reduction in Spain. The left half of the figure suggests that the full APC model (which in this case contains both the non-linear period and non-linear cohort effects, in other words, whatever is not explained already by the AD and AP models) accounts for a quarter of the total reduction (being the largest for Galicia, Murcia, and the Canary Islands) when compared to a single-factor age model. The contribution of the AP models (considering only non-linear period effects) was small in almost all regions, ranging from virtually non-existent (as was the case for Madrid) to a quantity nearing 30% (as was the case for Extremadura). On average, those effects represented five percent of all deviance reduction when compared to a single-factor age model. The right half of the figure indicates, however, that an APC model does not explain much more than a two-factor Age-Cohort model, since the gains with a full APC model are much more modest. (Note: the sums of AC/AP and APC are not necessarily symmetrical, since APC is essentially the difference of deviance that is not explained by AD + AP/AC, respectively, and the number of degrees of freedom in AC and AP models is not the same in this case.) However, we believe incorporating both non-linear period and non-linear cohort effects in a full APC model is a far better alternative when analyzing reduction of deviance.
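The decomposition described above can be sketched as follows. The Poisson deviance formula is standard; the function names and the four model deviances are hypothetical, chosen only so that the shares resemble the national proportions quoted in the text (about 70%, 5%, and 25%). They are not results from the fitted models.

```python
import math

def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(d * log(d / mu) - (d - mu))."""
    total = 0.0
    for d, mu in zip(observed, fitted):
        log_term = d * math.log(d / mu) if d > 0 else 0.0  # 0 * log(0) := 0
        total += log_term - (d - mu)
    return 2.0 * total

def reduction_shares(dev_age, dev_ad, dev_two_factor, dev_apc):
    """Share of the Age -> APC deviance reduction added by each nested model."""
    total = dev_age - dev_apc
    return {
        "drift": (dev_age - dev_ad) / total,
        "two_factor": (dev_ad - dev_two_factor) / total,
        "full_apc": (dev_two_factor - dev_apc) / total,
    }

# Hypothetical deviances for the nested sequence Age, AD, AP, APC.
shares = reduction_shares(dev_age=1200.0, dev_ad=500.0,
                          dev_two_factor=450.0, dev_apc=200.0)
print({name: round(value, 2) for name, value in shares.items()})
# {'drift': 0.7, 'two_factor': 0.05, 'full_apc': 0.25}
```

Replacing the two-factor deviance with that of an Age-Cohort model gives the second scenario; the drift share stays the same, while the split between the two-factor and full APC shares changes.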

Fig. 3 Contribution to deviance reduction between Age and APC models to estimate lung cancer mortality in females in Spanish autonomous communities. All parameters have a P value < 0.0001 (Source: author's calculations based on INE)

However, while Fig. 3 indicates that the linear trend has the largest contribution in terms of deviance reduction, the intensity of this trend is not clear. Therefore, Fig. 4 presents the maximum likelihood estimates of the drift values (the average linear change in risk every five years) from a full APC model (with 95% CI). Figure 4 indicates that for Spain, the relative risk increased on average by a factor slightly above 1.03 every five years (marked with the red vertical line). In some autonomous communities, such as Andalusia, Castile-La Mancha, Extremadura, and Murcia, the average linear change was below the national drift. Other regions, such as the Basque Country, Cantabria, Asturias, and the Valencian Community, presented an average linear change above the national trend. Recall that the change of the linear drift in an APC model is a relative measure and not an indicator of the level of overall mortality. In any case, those values are the expression of strong growing linear trends of lung cancer mortality, with the average drift in all regions being above their respective baseline mortality (in the 1980–84 period).

Fig. 4 ML-estimated values of the linear drift by autonomous region (Source: author's calculations based on INE)

Figures 5, 6 and 7 present the different components of the APC model results, as Age, Period, and Non-Linear Cohort effects, respectively. As Fig. 5 shows, in all autonomous communities, lung cancer mortality increases exponentially with age (with the 1980–1984 period as a baseline reference). The increase was practically identical in all regions, with a (log) death rate three times higher for the final age group when compared to the 40–44 age group. Figure 13 in the appendix, which presents the age effects of the reference cohort (1935), also indicates a very similar result as Fig. 5, with higher rates as age increases.

Fig. 5 Cross-sectional Age effects (age-specific log death rates) of lung cancer mortality for females across autonomous communities of Spain of the 1980–84 period (Source: author's calculations based on INE)

Fig. 6 Period effects of lung cancer mortality for females across autonomous communities of Spain compared to the reference period 1980–84 (Source: author's calculations based on INE)

Fig. 7 Non-linear Cohort effects of lung cancer mortality for females across autonomous communities of Spain compared to the average trend (Source: author's calculations based on INE)

Figure 6 presents the linear (drift) and non-linear period effects together. Once again, in all regions it is evident that the relative risk of period effects (contextual factors) increased since the reference period of 1980–1984 (highlighted with the horizontal line). At a national level, in the period 2015–2019, the relative risk was almost three times higher when compared to the baseline. However, there was regional variation in the timing of the increase: while for Madrid and Catalonia the increase was almost instantaneous after the baseline period, other regions such as Extremadura and Murcia presented a more recent, lagged increase. Despite this, increases in relative risk ranging between two and four times above their respective regional baselines were found in all regions across the analyzed period. Figure 14 in the appendix presents the results of non-linear period effects, in which there is a small positive period effect during the 1990s in some autonomous communities and at the national level as well.

Figure 7 presents the non-linear cohort effects (generational factors) in relative risk when compared to the average trend (highlighted in the horizontal red line). Complementarily, Fig. 8 presents the cohort effects (relative to the 1935 cohort, with the vertical green line), with the model in which the linear trend is attached to the cohort dimension. From the inspection of Figs. 7 and 8 we found a stable relative risk virtually in all regions and at a national level until the cohorts born in 1935. Between that point and the 1955 cohort, both figures find an increase in relative risk in all autonomous communities, indicating that such cohorts presented higher mortality in all regions when compared to the older cohorts. However, unlike Figs. 5 and 6, we could find regional divergence in the cohorts born after 1955 (highlighted in the blue line). The change in the curve for cohorts born between 1955 and 1975 in the figure indicates divergent non-linear patterns across different autonomous communities: most autonomous communities presented a sharp decline in risk, indicating an improvement in mortality (Basque Country, Asturias, Madrid, Navarra as some examples). Fig. 8 shows the sum of linear and non-linear mortality risk compared to the 1935 cohort, indicating a stagnation of the increasing mortality.

Fig. 8 Cohort effects of lung cancer mortality for females across autonomous communities of Spain relative to the 1935 cohort, depicted in a vertical green line (Source: author's calculations based on INE)

In other communities such as Castile & Leon, Extremadura, and Galicia mortality keeps ever increasing: both Figs. 7 and 8 indicate there is an increasing mortality trend for the younger cohorts, driven mostly by linear effects but also by some slight non-linear cohort effects as well.

A third scenario includes Catalonia, Aragon, and the Valencian Community, for instance, and to a lesser extent, the average of Spain. In this case, Fig. 8 indicates that the linear trend of mortality is increasing for the younger cohorts as well. However, unlike the previous scenario, non-linear cohort effects seem to be on par with the average mortality trend or even show slight improvement, indicating that the change in mortality for these cohorts is essentially linear (meaning that it can be attributed to either period or cohort from a mathematical standpoint).

Discussion

This study consisted of a descriptive analysis of female mortality attributed to lung cancer across birth cohorts in Spain, exploring national and regional trends over 40 years. The results indicate that age-standardized mortality in all regions increased between 1980 and 2019. Linear effects on mortality presented the same pattern. In some cases, like Madrid, the region in which the capital is located, the increase in mortality occurred right from the beginning of the analyzed period. Madrid has one of the highest levels of purchasing power in Spain and it is possibly the place where smoking was first adopted as a widespread habit in the country. This also could explain why in other regions, like Galicia and Extremadura, the increase in mortality was only observed since the early 2000s. Apart from late adoption, it is possible that the strong anti-drug campaigns in the 1980s and 1990s (Fuente et al., 2006; Miguel-Arias et al., 2016; Sánchez-Niubò et al., 2009) had a spillover effect on the consumption of other substances such as tobacco as well. That could also explain the presence of non-linear period effects during that period.

In all autonomous communities, the relative risk of dying decreased for cohorts born before 1935 and then increased for cohorts born between 1935 and 1955, who arguably were the ones most exposed to the expansion of smoking in Spanish society. This is consistent with a previous study (Franco et al., 2002), which suggested that female cohorts born in 1955 had higher relative risks of dying from lung cancer in Spain, although it did not indicate what happened for later cohorts. We now know that at the national level, non-linear cohort relative risks stalled after that point, but the increasing overall (linear) mortality trend continued. While in some communities, such as Madrid or the Basque Country, the relative risk among cohorts was contained (around the cohorts born near 1935), this was not the case for other communities, such as Galicia or Castile & Leon, whose younger cohorts kept a mortality risk above their respective average trends.

These diverging cohort trends could be related to the role of the autonomous communities in managing and controlling their own health policies. That being said, the lower mortality among younger cohorts was mostly found in regions such as Madrid, Navarra, or the Basque Country, which coincidentally are the regions with a higher per capita GDP (footnote 3) or a higher share of females with post-secondary education (footnote 4), as Table 2 of the appendix shows. In other words, the fact that more developed regions present recent improvements (or stagnation) in lung cancer mortality might be related to structural factors that favor the adoption of habits leading to less exposure to the condition. However, we are also aware that this might be an ecological fallacy and that changes in lung cancer mortality could be purely driven by individual behaviors, independently of where individuals live.

Among other limitations, we were unable to follow migration patterns, which may introduce a degree of population selection when analyzing mortality: mortality levels by place at advanced ages are strongly influenced by the composition of migrants depending on where they were born (Aldea et al., 2023), but we believe that this is not enough to significantly change the overall trends that were identified. Also, with the current data we were unable to establish why the risk across cohorts varies in each autonomous community (and whether females were affected differently based on their social standing or educational attainment). Furthermore, as we mentioned in the methods section, not all lung cancer deaths are due to the direct consumption of tobacco, and such deaths could not be identified with the data sources analyzed in this paper, due to the lack of supporting information for the whole period. However, the main interest of this study was not to establish smoking-attributed mortality but to define the divergent risk that different generations of women face concerning lung cancer as a whole. Finally, the role of environmental factors (passive exposure to tobacco smoke) in lung cancer mortality (Lee et al., 2000) is also unclear in the analyzed data.

We must remind readers that the chosen modeling strategy depends on the prior allocation of the drift, and so the results presented in this paper depend on that particular allocation. However, the figures based on a cohort-major parameterization and the Lexis surface mostly supported similar conclusions. Given the strength of the linear trend and the scarcity of non-linear period effects (apart from the non-linear effects mentioned for the 1990–2000 period in some regions), we considered that allocating the drift to the period dimension offered a clear visualization of the non-linear cohort effects (for instance, as shown in Trias-Llimós et al., 2017).
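The dependence on the drift allocation can be illustrated with a generic age-period-cohort decomposition. The notation below (the symbols \mu, \alpha, \beta, \gamma, \lambda, and \delta) is assumed here for illustration and is not necessarily the parameterization used in the methods section; it is a sketch of the standard identification argument rather than the exact model fitted in this paper.

```latex
% Generic log-linear APC model for the mortality rate m at age a, period p, cohort c
\begin{align}
  \log m_{a,p} &= \mu + \alpha_a + \beta_p + \gamma_c, \qquad c = p - a .
\end{align}

% Because a - p + c = 0 for every cell, any constant \lambda can be shifted
% between the three linear slopes without changing the fitted rates:
\begin{align}
  \tilde{\alpha}_a = \alpha_a + \lambda a, \quad
  \tilde{\beta}_p  = \beta_p  - \lambda p, \quad
  \tilde{\gamma}_c = \gamma_c + \lambda c
  \;\;\Longrightarrow\;\;
  \tilde{\alpha}_a + \tilde{\beta}_p + \tilde{\gamma}_c
  = \alpha_a + \beta_p + \gamma_c + \lambda\,(a - p + c)
  = \alpha_a + \beta_p + \gamma_c .
\end{align}

% Only the sum of the linear period and cohort slopes (the drift) is identified:
\begin{align}
  \delta = \beta_L + \gamma_L .
\end{align}
```

Under this sketch, a period-major parameterization assigns all of \delta to the period term and leaves the cohort term with its non-linear deviations only, while a cohort-major parameterization does the reverse. The non-linear components are the same in both cases, which is why the two sets of figures can be expected to support similar conclusions about cohort curvature.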

The findings of this study suggest that female lung cancer mortality increased in most regions of Spain between 1980 and 2019. For some authors, the growing trend in female lung cancer mortality could be considered an epidemic (Cayuela et al., 2007; Estève & Coleman, 2009; López-Campos et al., 2014; MacRosty & Rivera, 2020). Determining the incidence threshold at which a particular phenomenon should be considered an epidemic is a complicated and delicate task that exceeds the purposes of this study. However, we can say that while the linear trends are increasing (and this also holds true if we arbitrarily consider that cohort effects might be linear as well), the non-linear patterns observed in younger cohorts might be an encouraging sign that this could be reversed soon.

It is clear that in Spain some generations of females are more susceptible than others to dying from lung cancer and smoking, due to behaviors that were adopted during the second half of the twentieth century and part of the twenty-first century as well. Moreover, the generations that are more susceptible to dying from lung cancer sometimes differ across autonomous communities, and structural and behavioral aspects may be playing a part in such differences. Therefore, this study is a reminder that space is relevant to the analysis of mortality not only from a period perspective but also from a cohort perspective, and that cohort analysis is a useful tool when addressing the consequences of such historical processes.