1 Introduction

The study of the social and economic impacts of disasters has become increasingly important due to the global population's greater exposure to these shocks. However, research on the effects on labour is less abundant (Kirchberger 2017), with most studies documenting impacts on aggregated labour outputs such as unemployment, participation rates, and wages (Brown et al. 2006; Kirchberger 2017; Xiao and Feser 2014; Zissimopoulos and Karoly 2010). Such aggregated analyses might hide impacts on particular labour sub-groups or sectorial labour (How and Kerr 2019; Zissimopoulos and Karoly 2010). For example, little attention has been paid to labour employed by the ICT (Information and Communications Technologies) sector in the context of the current technological change in production, which is thought to be driven mainly by ICT and related computer-based technologies (Acemoglu and Autor 2011; Almeida et al. 2020; Hwang and Shin 2017). Notably, the adoption and integration of ICT across sectors have led to significant advancements in productivity, communication, and information exchange, reshaping the global economic and social landscape; a range of scholarly work discusses this notion (e.g., Jordan 2008; Matthews 2007). In this regard, some suggest that disasters can be considered episodes or substantial events affecting the pace of technological change (Crespo Cuaresma et al. 2008; Okuyama 2003; Okuyama et al. 2004; Skidmore and Toya 2002). Nevertheless, it remains uncertain whether disasters can speed up the advancement of ICT-related technology by replacing destroyed machinery with updated, ICT-compatible equipment; there is, as yet, no conclusive answer to this question. We test this assumption by evaluating changes in demand for ICT-related employment as a proxy for a faster technology adoption rate, using the 27th February 2010 Biobío earthquake, which struck Chile's central regions, as a substantial event affecting ICT labour.

Regarding our assumption that disasters may positively affect the demand for ICT employment through subsequently higher levels of ICT technology adoption, it must be borne in mind that no existing study has examined the role of disasters in explaining changes in demand for ICT labour. Most studies have analysed changes in aggregated labour, hiding impacts on sub-groups (How and Kerr 2019; Zissimopoulos and Karoly 2010). Overall, the disasters literature emphasises the importance of ICT technologies in coping with problems in the aftermath of catastrophes, where ICT plays a vital role in reducing disaster fatalities, managing recovery costs and dealing with other aspects of disaster management (Benali and Feki 2018; Toya and Skidmore 2015; Walker 2012). More attention has been paid to ICT labour in the context of other shocks, such as pandemics. To illustrate, the COVID-19 pandemic affected the ICT workforce relatively less than other occupations, given the prevalence of teleworking in the sector and its lower exposure to social or face-to-face interactions (Pouliakas and Branka 2020; Redmond and Mcguinness 2020). Still, the overall lack of studies impedes our understanding of disasters' impacts on disaggregated groups or specialised labour. More importantly, this research field requires a cumulative body of cases to support an explanatory framework robust enough to enable us to understand how employment is affected (Jara and Faggian 2018). We contribute to this limited literature by focusing on ICT labour.

Examining how disasters might accelerate the rate of ICT-intensive technical change, proxied by changes in demand for ICT labour, is relevant to countries like Chile. First, the country offers an environment particularly suitable for studying the impacts of disasters like earthquakes. Ten of the most destructive earthquakes, i.e., Mw 8 and above, hit Chile in the past century (Barrientos and CSN Team 2018). In the last decade, three earthquakes of this magnitude affected different Chilean regions in 2010, 2014 and 2015, characterising Chile as a site of recurring earthquakes. Second, technical change has been an essential driver of the economic development seen by Chile in the last 40–50 years (Beyer et al. 1999; Campos-González and Balcombe 2024; Gallego 2012), and indicators covering assets like hardware, telecommunications, and software show that the share of ICT in total investment for Chile has been growing, resulting in important ICT capital formation (ECLAC 2013). In this regard, examining the impacts of disasters and how they relate to technical change is an added step towards understanding changes in demand for employment, especially ICT labour. We define ICT labour by identifying occupations involved in providing goods and services related to the ICT sector, following the International Standard Classification of Occupations (ILO 2012). Some examples of these ICT occupations are Systems Analysts, Software Developers, Database Designers and Administrators, Computer Programmers, and Computer Network and Systems Professionals and Technicians.

We explore the impact of disasters on technological replacement by examining the text content of a collection of 4136 online job postings published two years before and two years after the event in the Chilean regions most affected by the 27th February 2010 Biobío earthquake (Mw 8.8). Our pre-disaster period reflects data availability, beginning in January 2008; the decision to use a two-year post-disaster span assumes that the economic scenario in the second year after the earthquake might provide a more stable basis for making technological replacement decisions.

We apply a set of techniques based on the text data of our collection of job postings to evaluate changes in demand for ICT labour. However, our sample lacks a variable to filter ICT-specific job postings, and ICT occupations or job titles can vary widely. Hence, our modelling and estimation strategy relies on the Structural Topic Model, STM (Roberts et al. 2013, 2014, 2016). As a topic model, STM uncovers word co-occurrence patterns across a collection of documents, i.e., our sample of job posting ads, to estimate a set of word clusters or topics. Next, we identify the ICT-related topic that best represents ICT labour, and we examine changes in its prevalence by applying a treatment effect estimation. We identify whether the job postings were published before or after the disaster, with the post-disaster period corresponding to our treated period. Unlike standard topic model approaches, STM incorporates document metadata, i.e., the date of job posting publication, to structure the document collection. In terms of results, we expect a higher prevalence of ICT labour after the disaster because of the rapid adoption of ICT-compatible equipment. Also, as pointed out in the literature, we expect our natural experiment to positively influence topics standing for, among others, Construction labour, given the recovery and reconstruction activities.

Our results show that the prevalence of the topic representing ICT labour does not significantly change after the earthquake. Conversely, as with some other identified economic sectors (e.g., Occupational Risk Prevention), the prevalence of the Construction labour topic differs significantly after the disaster, i.e., it increased. These findings suggest that reconstruction activities led to differences in Construction employment, but we do not observe changes in ICT labour. Thus, our results do not support the view that substantial technological replacement occurred after the 27th February 2010 Biobío earthquake of a kind that impacted the labour market, particularly the demand for ICT labour.

The structure of our research is as follows: firstly, we introduce the evidence from the literature and our conceptual framework. Following that, we provide details regarding the context of the disaster, labour impacts and data. Then, we elaborate on our methodological strategy and present and discuss our findings. Finally, in the conclusion section, we summarise our argument and offer policy recommendations based on our results.

2 Past evidence and conceptual framework

Studies show inconclusive evidence regarding disasters as forces affecting technological upgrading and economic and labour outputs. On the one hand, some show that replacing damaged capital goods with updated equipment in the aftermath of disasters can improve economic growth (Benson and Clay 2004; Crespo Cuaresma et al. 2008; Loayza et al. 2012; Toya and Skidmore 2007). Disasters can lead to increased industrial growth (Loayza et al. 2012) and increased physical capital accumulation (Leiter et al. 2009). However, others reported that disasters do not significantly affect subsequent economic growth (Cavallo et al. 2013). In addition, benefits from capital upgrading have been linked to countries with higher levels of development because of better institutions, policies, and financial systems, among other factors (Crespo Cuaresma et al. 2008; Toya and Skidmore 2007). Technology upgrades in post-disaster scenarios usually face financial and time constraints (Benson and Clay 2004; Di Pietro and Mora 2015). More importantly, some analyses of various disasters from a pool of countries have suggested significant adverse impacts on technological innovation, measured by the number of patent applications (Chen et al. 2021). In contrast, some emphasise the importance of the various approaches communities take towards innovation in the face of disasters by highlighting the vital role of innovation in mitigating hazards, responding to emergencies, and recovering from disasters (Wachtendorf et al. 2018). Therefore, we cannot establish that disasters are unequivocally a source of adjustment for technological innovation and, consequently, for changes in economic outcomes such as growth or demand for labour.

On the other hand, employment adjustments can result from reconstruction efforts unrelated to technological improvements. For instance, when labour substitutes for damaged or missing physical equipment, a disaster will lead to positive employment impacts (e.g., more demand), especially in the construction sector (Belasen and Polachek 2009; Skidmore and Toya 2002). Also, Leiter et al. (2009) reported employment growth, given the higher physical capital accumulation in regions affected by disasters. However, even if a catastrophe promotes a greater capital stock, it does not necessarily imply positive impacts on labour participation. Tanaka (2015) found a negative impact on employment despite over-investment in physical capital and speculates that a decreased population in the affected area may be a possible reason. A lower population might result from direct impacts on labour (e.g., deaths, injuries) or indirect ones, such as forced displacement. The extent to which workers can stay in the labour market after a disaster also influences potential technological replacements.

Nevertheless, the issue is not only the inconclusive evidence regarding disasters' impacts on labour but also the scarcity of studies per se. According to Jiménez et al. (2020), between 1900 and November 2019, only 118 articles on the effects of disasters on labour were published in indexed journals, most of them referring to Japan, the US and China. Only a few studies exist for Chile, for example, Jiménez and Cubillos (2010) and Jiménez et al. (2020). Some additional research can be found in other sources, with Jara and Faggian (2018) and Sanhueza et al. (2012) being the only studies referencing impacts on labour. This lack of studies also motivates us since, as introduced, despite the increasing frequency and impact of disasters globally, there remains a significant gap in our understanding of how such events affect labour markets (Kirchberger 2017). The recent COVID-19 pandemic has further underscored the importance of comprehending and preparing for the impacts of large-scale disruptions on employment. The lack of detailed research in this area has real-world implications; for example, during the COVID-19 pandemic, many governments were unprepared for the massive shifts in labour demand, particularly in sectors like healthcare and essential services (Mazzucato and Kattel 2020). This unpreparedness resulted in inefficient resource allocation and suboptimal policy responses, exacerbating the negative socio-economic aspects of the crisis. The scarcity of published studies might also partly reflect publication bias, whereby significant findings have higher chances of publication (Klomp and Valckx 2014), and the tendency of research to focus on shorter-term impacts, since long-term effects are more difficult to identify (Jiménez et al. 2020).

Conceptually, we set out a simple framework to examine the interactions between disasters, labour markets, and technological change. Our approach combines extensions of growth models like the Solow–Swan model (Solow 1956; Swan 1956) with a more literal reading of the Schumpeterian creative-destruction hypothesis (Aghion and Howitt 1992; Schumpeter 1976). In its original version, the Solow–Swan model evaluates economic growth based on the shape of the neoclassical production function. We apply the Solow–Swan model to a disaster situation in per capita terms following Okuyama (2003), who uses the model with labour-augmenting technological progress as described by Barro and Sala-i-Martin (2004). A detailed demonstration of the Solow–Swan model in a disaster situation is available in the Appendix (see Sect. A1).

An extended Solow–Swan model provides insights into resource allocation involving labour and capital for economic recovery after disasters (Okuyama 2003). It can compare the effects resulting from the destruction and subsequent upgrading of capital goods on the economy's steady state and eventual recovery. The central assumption is that older and outdated capital goods are more prone to be damaged by a catastrophe because of vulnerabilities, including weaker structures, mechanical fatigue due to age, and outdated regulations, from which updated equipment is free (Okuyama 2003). Turning to the creative-destruction hypothesis, this conceptual idea originally gives prominence to the effects of competition among new consumer goods, new markets, and new technologies. These dynamics incessantly transform the economic structure from within: the creative-destruction process permanently destroys the old and creates the new. In the disaster literature, this process refers to technological replacement after a catastrophe (Crespo Cuaresma et al. 2008). This sudden turnover of capital might represent a positive jump in technological improvement.

We develop our research according to the frameworks above to evaluate how disasters can positively affect the pace of technical change, resulting in positive impacts on employment. We assume that a technical change embodied in ICT capital goods covering assets like hardware equipment, telecommunications, and software, among others, drives much of the technological change in production (Acemoglu and Autor 2011; Almeida et al. 2020; Hwang and Shin 2017). In this sense, the expected jump in technological adoption would imply that much of the technology replacement after a disaster will be based on ICT capital goods. This rapid move towards ICT-compatible equipment might improve the demand for ICT labour.

The described conceptualisation responds to researchers' attempts to develop conceptual and theoretical foundations, such as the works of Okuyama (2003) and Okuyama et al. (2004). In this regard, our approach has been extensively used in this literature, with some variations in the established theoretical assumptions (Crespo Cuaresma et al. 2008; Hallegatte and Dumas 2009; Leiter et al. 2009; Lynham et al. 2017; Panwar and Sen 2019). To illustrate, Hallegatte and Dumas (2009) applied the NEDyM model, or Non-Equilibrium Dynamic Model, which extends the traditional Solow growth model by incorporating short-term dynamics and disequilibria. It introduces delays and dynamic relationships to capture transient periods of imbalance, allowing it to reproduce short-term Keynesian features following economic shocks such as disasters. NEDyM changes the core set of equations of the Solow model, including the introduction of a stock of liquid assets, a goods inventory, and a more sophisticated trade-off between consumption and saving. Additionally, the model incorporates a simple representation of technical change and its embodiment through investment, allowing investigation of the productivity effect on the economic consequences of disasters. Others have combined different approaches to investigate the impacts of disasters on labour markets (e.g., Kirchberger 2017). We do not test any post-disaster theoretical predictions since there is no comprehensive theory in this literature, and assumptions regarding expected impacts from the aftermath of disasters are many and varied (Coffman and Noy 2011).

3 Disaster and labour market overview and data

The 27th of February 2010 Biobío earthquake is considered the second most severe in Chile's history and one of the ten strongest worldwide since these events have been recorded by instruments (Barrientos and CSN Team 2018; Contreras and Winckler 2013; Jiménez et al. 2020; Sanhueza et al. 2012). The seismic event and subsequent tsunami affected several regions in the central and central-south parts of the country, inhabited by approximately 80% of the Chilean population. The estimated destruction included 500,000 damaged houses, 12,000 injured people, over 400 deaths and an economic cost of US$30,000 million (NOAA 2019). This earthquake has been used as a natural experiment in other studies examining the link between disasters and labour, such as its impact on perceived stress and job satisfaction (Jiménez and Cubillos 2010) and on employment participation, the unemployment rate and lack of access to social security (Jiménez et al. 2020; Karnani 2015; Sanhueza et al. 2012; Sehnbruch 2017). Most of this evidence suggests that the earthquake negatively affected the labour market in the short run. However, in the long term, it has been suggested that the recovery process attenuated these negative impacts, facilitated by the government's efforts and other institutional factors (Jiménez et al. 2020). We add to this literature by considering the potential role of the 27th February 2010 Biobío earthquake in explaining changes in workforce sub-groups like ICT labour.

The Post-earthquake Survey conducted by the Ministry of Social Development of Chile (Ministry of Social Development 2010) after the 27th February 2010 Biobío earthquake provides insights into the disaster's labour impacts in the most affected regions, as reported by past studies (Jiménez et al. 2020). The survey revealed that the earthquake had notable effects on labour activity across the O'Higgins, Maule, and Biobío regions, with 2–5% of the population not seeking work due to the disaster. Additionally, 11–20% of workers in these regions reported significant workplace damage, disrupting normal productive activities. Furthermore, for 3–13% of the workforce, the disaster led to changes in working conditions, including contracts, social contributions, working hours, and income. Around 22% of independent workers and self-employed individuals in Biobío reported significant impacts on their productive activities. Additional and more detailed insights are reported by Jiménez et al. (2020).

Our data correspond to a sample from the online job ads dataset provided by www.trabajando.com, one of Chile's principal internet labour market intermediaries. Regarding the representativeness and reliability of our sample, past studies have used these data to examine the impact of job skills on wages and job search behaviour, among other aspects of labour markets (Banfi et al. 2019, 2022; Banfi and Villena-Roldán 2019; Ramos et al. 2013). Notably, Banfi et al. (2022) have suggested that this data source effectively represents the broader dynamics of the Chilean economy. Our sample, comprising 4136 job postings from the regions most affected by the earthquake (see details below), collected over four years, offers a comprehensive view of labour market shifts post-disaster. While it does not explicitly filter for ICT-specific job postings or identify other specific economic sectors showing a potential increase in employment in the aftermath of disasters (e.g., the public sector), applying STM helps overcome these limitations.

We filter for the Chilean regions most affected by the earthquake, i.e., the regions (in Spanish) VI de O'Higgins, VII del Maule, and VIII del Biobío (ECLAC 2010; Sanhueza et al. 2012). Other studies included additional regions such as the Región Metropolitana, V de Valparaíso and IX de La Araucanía (Jiménez et al. 2020; Karnani 2015), but these regions were less affected (ECLAC 2010).

We use the job posts published from January 2008 to March 2012. Using the job posting publication date, we create a dummy indicating whether the job post was published after the disaster (treated period), \(27F\), which is specified as follows:

$$ 27F = \begin{cases} 1 & \text{if the post is published between March 2010 and March 2012} \\ 0 & \text{if the post is published between January 2008 and February 2010} \end{cases} $$
(1)

Our pre-disaster period spans from January 2008, the start of data availability, to the disaster's occurrence on 27 February 2010. The post-disaster definition relies on short-run impacts, considering the first and second years after the disaster's occurrence. Unlike past studies evaluating only one post-disaster year (see, e.g., Karnani 2015), we regard one year as too short a period for decisions on technological replacement and potential ICT labour hiring. Besides, firms might face several potential restrictions (e.g., financial constraints and labour shortages) during the first post-disaster year, whereas the economic scenario in the second year might supply a more stable basis for making these decisions. We have not considered more years in the post-disaster span in order to balance the number of observations between the pre- and post-disaster periods. A sketch of the resulting filtering and treatment indicator follows.
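As a minimal illustration in R, the study window and the dummy in Eq. (1) can be constructed as below. The data frame `ads` and its column names are hypothetical stand-ins for our job-posting sample, and the dummy is named `F27` only because R identifiers cannot begin with a digit.

```r
# Sketch: study window and treatment dummy from Eq. (1); `ads` is hypothetical.
ads$date_published <- as.Date(ads$date_published)

# Keep the study window: January 2008 - March 2012.
ads <- subset(ads,
              date_published >= as.Date("2008-01-01") &
                date_published <= as.Date("2012-03-31"))

# F27 = 1 for posts published from March 2010 onwards (post-disaster period).
ads$F27 <- as.integer(ads$date_published >= as.Date("2010-03-01"))
```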

As noted above, after filtering by affected regions and periods before and after the disaster, our sample consists of 4136 online job posts. Table 1 shows the distribution of our sample according to pre- and post-disaster periods.

Table 1 Distribution of online job ads in the most affected regions by pre- and post-disaster periods

From our collection of job posts, we concatenated three open-text variables (job title, job description, and job-specific requirements). These concatenated text variables and the date of publication correspond to our input for performing our estimation strategies, as detailed in the next section and sketched below.
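Continuing the hypothetical `ads` data frame from the sketch above, this step reduces to concatenating the three fields and wrapping the result in a quanteda corpus that carries the \(27F\) dummy as a document variable:

```r
# Sketch: build the single text input from the three open-text fields
# (column names are hypothetical stand-ins for the actual variable names).
ads$text <- paste(ads$job_title, ads$job_description, ads$job_requirements,
                  sep = " ")

library(quanteda)
corp <- corpus(ads, text_field = "text")  # remaining columns (e.g., F27) become docvars
```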

4 Structural topic modelling, STM

The probabilistic or statistical topic models, TM, pioneered by Latent Dirichlet Allocation, LDA (Blei et al. 2003), are tools designed for analysing and understanding large text corpora based on word co-occurrence. TM are known as "unsupervised techniques" since they infer topics' content from a collection of texts or corpus, rather than assuming topics ex ante, as supervised techniques require (Roberts et al. 2014). Since we only observe the documents, TM aim to infer the latent or hidden topics by estimating how words are distributed over topics and topics over documents. Conceptually, we refer to topics as distributions or mixtures of words that belong to a topic with a certain probability or weight. These weights indicate how important a word is in a given topic. In this context, documents are distributed over topics, where a single document can be composed of multiple topics, and words can be shared across topics. Thus, we can represent a document as a vector of proportions showing the share of words belonging to each topic (Roberts et al. 2014).

TM allow us to evaluate the importance of topics in the documents. The sum of the shares of all topics in a document, the so-called document-topic proportions, is one. Equally, the sum of the word probabilities or topic-word distributions for a given topic is also one (Roberts et al. 2019). The input for TM is the collection of our raw job postings transformed into a document-term matrix representation, DTM, which represents our corpus as a bag of words or terms. The DTM is usually sparse and allows us to analyse the data using vectors and matrix algebra to filter and weigh the essential features of our document collection. Also, a critical input is the number of topics to be considered in the model. The researcher must choose this number based on some criterion (e.g., the held-out log-likelihood proposed by Wallach et al. (2009)), or it can be estimated following strategies developed for this purpose (e.g., the Anchor Words algorithm developed by Lee and Mimno (2014)).

Most TM assume that document collections are unstructured since all documents arise from the same generative model without considering additional information (Roberts et al. 2014). Instead, we implement the STM (Roberts et al. 2013, 2016). STM incorporates document metadata into the standard TM approach to structure the document collection, i.e., STM accommodates corpus structure through document-level covariates affecting topical prevalence. This feature contrasts with other TMs like LDA. Thus, the critical contribution of STM is to include the covariates in the prior distributions for document-topic proportions and topic-word distributions. These document-level covariates can affect the topical prevalence, i.e., the proportion of each document devoted to a given topic, and we can measure these changes (Roberts et al. 2013). Also, we can evaluate the topical content, which refers to the rate of word use within a given topic, but we do not implement this evaluation here.

We applied the STM topical prevalence model, which examines how much each topic contributes to a document as a function of explanatory variables or topical prevalence covariates. In our case, the covariate corresponds to our dummy \(27F\) stated by Eq. (1), showing that our collection of job postings comes from the pre- and post-disaster periods. Next, we examine the topical prevalence variation between these two periods using a treatment effect regression.

4.1 STM topic-prevalence model specification

This section and Sect. 4.2 follow the descriptions and technical guidelines detailed in Roberts et al. (2013, 2014, 2016, 2019) and Grajzl and Murrell (2019). As a model based on word counts, STM defines a data-generating process for each document, and the observed data are used to find the most likely values for the parameters specified by the model. As a step-by-step process, the documents are indexed, and each word within them is uniquely identified, forming our DTM representation. The model incorporates covariates through a design matrix, \(X\), to analyse topical prevalence in documents. The number of topics, \(K\), is pre-determined, and the model views each document as a collection of empty positions filled with terms from a vocabulary. The key process involves generating a topic-prevalence vector influenced by document covariates, which dictates the probability of each topic being assigned to a position in a document. This vector is drawn from a logistic-normal distribution, and a topic is sampled for each word in a document. The probability of selecting specific vocabulary words for a topic is determined using the word frequency and a topic-specific deviation. The model concludes by drawing an observed word to fill each position in the document, guided by these probabilities. Regularising prior distributions are applied to specific parameters for model stability. The process emphasises how document properties and the observed metadata can influence topic distribution within a corpus. In the following, we formally describe this specification.

The specification starts by indexing the documents by \(d \in \left\{ {1 \ldots D} \right\}\) and each word in the documents by \(n \in \left\{ {1 \ldots N_{d} } \right\}\) in our DTM representation. The observed words, \(w_{d,n}\), are unique instances of terms from a vocabulary of size \(V\) (our corpus of interest) indexed by \(v \in \left\{ {1 \ldots V} \right\}\). Regarding the addition of covariates for examining topical prevalence, a design matrix denoted by \(X\) holds this information. Each row of \(X\), represented by \(x_{d}\), defines a vector of document covariates for a given document, so \(X\) has dimension \(D \times P\), where \(p \in \left\{ {1 \ldots P} \right\}\) indexes the covariates. Finally, the \(K\) topics are indexed by \(k \in \left\{ {1 \ldots K} \right\}\).

Overall, the generative process considers each document, \(d\), as beginning with a collection of \(N_{d}\) empty positions, which are filled with terms. Since our data is represented as a DTM or bag of words representation, we can assume that, for a given document, all positions are interchangeable, i.e., the choice of topic for any empty position is the same for all positions in that document (Grajzl and Murrell 2019). The filling process starts with the number of topics chosen by the researcher (details below in Sect. 4.2.2) to build a vector of parameters of dimension \(K\) of a distribution that produces one of the topics \(k \in \left\{ {1 \ldots K} \right\}\) for each position in \(d\). This vector is the so-called topic-prevalence vector since it contains the probabilities that each of the \(k\) topics is assigned to a singular empty position. STM models the topic-prevalence vector as a function of the covariates to estimate the document properties’ influence on topic-prevalence. The process continues with selecting terms from the \(V\) vocabulary to generate a \(k\)-specific vector of dimension \(V,\) which will contain the probabilities of each term chosen to fill an empty position.

Formally, the generative process for each \(d\), given the vocabulary of size \(V\) and observed words \(\left\{ {w_{d,n} } \right\}\), the number of topics \(K\), and the design matrix \(X\), for our STM Topic-prevalence model specification can be represented as a four-step method. First, we draw the topic-prevalence vector from a logistic-normal generalised linear distribution (Roberts et al. 2019), with a mean vector parameterised as a function of the vector of covariates. This specification allows the expected topic proportions to vary as a function of the document-level covariates, as follows:

$$ \vec{\theta }_{d} |X_{d} \gamma , \Sigma \sim {\text{LogisticNormal}}\left( {X_{d} \gamma , \Sigma } \right), $$
(2)

where \(\vec{\theta }_{d}\) is the topic-prevalence vector for document \(d\), \(X_{d}\) is the 1-by-\(P\) vector of covariates, and \(\gamma\) is the \(P\)-by-\(\left( {K - 1} \right)\) matrix of coefficients. \(\Sigma\) is a \(\left( {K - 1} \right)\)-by-\(\left( {K - 1} \right)\) covariance matrix that allows for correlations in the topic proportions across documents. The covariates' addition to the model allows the observed metadata to influence the frequency of discussion of a given topic in the corpus. In our specification, the covariate corresponds to the \(27F\) dummy stated by Eq. (1).

Secondly, given the topic-prevalence vector \(\vec{\theta }_{d}\) from Eq. (2), for each word \(n\) within document \(d\), i.e., the process of filling the empty positions \(n \in \left\{ {1 \ldots N_{d} } \right\}\), a topic is sampled and assigned to that position from a multinomial distribution as follows:

$$ z_{d,n} \sim {\text{Multinomial}}\left( {\vec{\theta }_{d} } \right), $$
(3)

where \(z_{d,n}\) is the topic assignment of words based on the document-specific distribution over topics, where the \(k^{th}\) element of \(z_{d,n}\) is one and the rest are zero for the selected \(k\).

Thirdly, we form the document-specific distribution over terms representing each topic \(k\), choosing specific vocabulary words \(v\) as follows:

$$ \beta_{d,k,v} |z_{d,n} \propto \exp \left( {m_{v} + \kappa_{k,v} } \right), $$
(4)

where \(\beta_{d,k,v}\) is the probability of drawing the \(v\)-th word in the vocabulary to fill a position in document \(d\) for topic \(k\). \(m_{v}\) is the marginal log frequency estimated from the total word counts of term \(v\) in the vocabulary \(V\), representing the baseline word distribution across all documents. \(\kappa_{k,v}\) is the topic-specific deviation for each topic \(k\) and term \(v\) over the baseline log-transformed rate for term \(v\); it represents the importance of the term, given the topic. Exponentiating and normalising the sum of \(m_{v}\) and \(\kappa_{k,v}\) converts it into probabilities for use in the subsequent and final step, which refers to drawing an observed word conditional on the chosen topic.
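Concretely, the proportionality in Eq. (4) resolves to a softmax normalisation over the vocabulary, so that the word probabilities for each topic sum to one:

$$ \beta_{d,k,v} = \frac{\exp \left( {m_{v} + \kappa_{k,v} } \right)}{\sum_{v' = 1}^{V} \exp \left( {m_{v'} + \kappa_{k,v'} } \right)} $$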

Fourthly, the observed word \(w_{d,n}\) is drawn from its distribution over the vocabulary \(V\) to fill a position \(n\) in document \(d\) as follows:

$$ w_{d,n} \sim {\text{Multinomial}}\left( {\beta_{d,k,1} , \ldots ,\beta_{d,k,V} } \right) $$
(5)

Also, default regularising prior distributions are used for \(\gamma\) in Eq. (2) and \(\kappa\) in Eq. (4). These priors are zero-mean Gaussian distributions with a shared variance parameter, i.e., \(\gamma_{p,k} \sim {\text{Normal}}\left( {0,\sigma_{k}^{2} } \right)\) and \(\sigma_{k}^{2} \sim {\text{Inverse-Gamma}}\left( {1,1} \right)\) (Roberts et al. 2016), where \(p\) and \(k\) index the covariates and topics, respectively, as shown above.

4.2 STM topic-prevalence model and effect estimation

This section outlines the techniques used to process our text data, estimate the number of topics, and infer the parameters of our STM Topic-prevalence model. Based on these inferred parameters, we estimate the effect of our natural experiment on topic prevalence. We start by applying standard techniques (e.g., cleaning) to create a DTM. Then the number of topics is estimated. The core of the process is the estimation of the STM Topic-prevalence model, which accounts for the observed data, the chosen number of topics, and the specified covariate. From the fitted model, we interpret the proportion of each job posting devoted to the various topics and examine how these proportions change, applying regression analysis to the STM-fitted parameters to investigate shifts in topic prevalence post-disaster. Results are derived from simulations and presented visually and numerically, highlighting changes in topic prevalence. Finally, the robustness of the results is assessed using permutation tests, ensuring the observed treatment effects are not simply artefacts of the modelling process. We use R packages like quanteda (Benoit et al. 2018) to manage and analyse text data. The STM specification, estimation and treatment effect analysis are performed using the stm R package (Roberts et al. 2016, 2019, 2020).

4.2.1 Pre-processing and DTM representation

We perform standard pre-processing procedures on our collection of 4136 job postings (see Sect. 3 for details). As pointed out above, since our analysis does not deal directly with raw text but with specific text features such as word frequencies, we construct a DTM representation (Welbers et al. 2017). We apply cleaning, tokenisation, and stemming, among other standard pre-processing procedures, to construct our DTM. We use unigrams (unique words) and bigrams (two consecutive words) as tokens or features. Using bigrams allows us to capture text structure or context that we cannot see using single words. For example, in the case of job titles with generic words like "Engineer", including bigrams makes tokens more comprehensible since we observe terms like "Software Engineer", "Construction Engineer", etc. We also remove infrequent terms by dropping features that do not appear in at least ten documents. A sketch of this pipeline follows.
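A minimal quanteda sketch of this pipeline, under the stated assumptions (`corp` is the hypothetical corpus built in Sect. 3; Spanish stopwords and stemmer):

```r
# Sketch: standard pre-processing and DTM construction with quanteda.
library(quanteda)

toks <- tokens(corp, remove_punct = TRUE, remove_numbers = TRUE,
               remove_symbols = TRUE)
toks <- tokens_tolower(toks)
toks <- tokens_remove(toks, stopwords("es"))         # drop Spanish stopwords
toks <- tokens_wordstem(toks, language = "spanish")  # stemming
toks <- tokens_ngrams(toks, n = 1:2)  # unigrams and bigrams (e.g., "ejecut_vent")

dtm <- dfm(toks)
dtm <- dfm_trim(dtm, min_docfreq = 10)  # drop features in fewer than 10 documents

stm_input <- convert(dtm, to = "stm")   # documents, vocab and metadata for stm
```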

4.2.2 Estimating the number of topics, \(K\), and the STM topic prevalence model parameters

We estimate \(K\) by applying the Anchor Words algorithm (Lee and Mimno 2014). This technique infers \(K\) by finding an approximated convex hull, or smallest convex polygon, in a multi-dimensional word co-occurrence space given by our DTM representation. The central assumption of the Anchor Words algorithm is separability, i.e., each topic has a specific term that appears only in the context of that topic. This separability assumption implies that the terms corresponding to the vertices are anchor words for topics, whereas the non-anchor words correspond to points within the convex hull. We expect a \(K\) of between 5 and 50, the range suggested for a small collection of documents, i.e., a few hundred to a few thousand (Roberts et al. 2020), like our sample.

Also, since there is no true \(K\) parameter (Lee and Mimno 2014; Roberts et al. 2016, 2019), we apply a data-driven search for \(K\) as a confirmatory analysis. Therefore, we examine different topic numbers to select the proper specification based on diagnostics such as the held-out log-likelihood (Wallach et al. 2009) and residual analysis (Taddy 2012). The held-out log-likelihood test evaluates the prediction of words within a document when those words have been removed, estimating the probability of unseen held-out documents given some training data. The best specification will, on average, assign a higher probability to held-out documents, indicating a better predictive model. In practical terms, we plot the number of topics against the held-out likelihood and look for breaks in this relationship as a diagnostic showing that additional topics do not improve the likelihood much. The residual analysis evaluates the overdispersion of the variance of the multinomial described by Eq. (3) within the data-generating process. An appropriate number of topics will restrict this dispersion, so we are interested in the numbers of topics with lower values in a plot of \(K\) against the estimated dispersion or residual level. Both routes can be run with the stm package, as sketched below.
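A sketch of both routes to \(K\), assuming `stm_input` from the pre-processing sketch above (object names hypothetical). In the stm package, setting `K = 0` with spectral initialisation selects \(K\) via the Lee and Mimno (2014) anchor-words algorithm:

```r
# Sketch: choosing K with the stm package.
library(stm)

# Drop documents left empty after trimming (hence 4136 -> 4129 documents).
prep <- prepDocuments(stm_input$documents, stm_input$vocab, stm_input$meta)

# Anchor Words route: K = 0 with spectral initialisation lets stm select K.
fit_anchor <- stm(prep$documents, prep$vocab, K = 0,
                  prevalence = ~ F27, data = prep$meta,
                  init.type = "Spectral")

# Data-driven confirmation: held-out log-likelihood and residual dispersion
# over a grid of candidate K values (cf. Fig. 1).
k_search <- searchK(prep$documents, prep$vocab, K = c(30, 40, 50, 60, 70),
                    prevalence = ~ F27, data = prep$meta)
plot(k_search)
```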

Regarding the STM Topic-prevalence model estimation, the strategy takes the DTM, \(K\) and the covariate and returns fitted model parameters. Put differently, given the observed data, \(K\) and our \(27F\) dummy, we estimate the most likely values for the model parameters specified by maximising the posterior likelihood (see Sect. 4.1). As a result, we can examine the proportion of job postings devoted to a given topic, or topical prevalence, over the \(27F\) dummy. However, as occurs in this kind of probabilistic model, the STM posterior distribution is intractable. Therefore, we apply the approximate inference method that Roberts et al. (2019) implemented. This method, the so-called partially-collapsed variational expectation–maximization algorithm, posterior variational EM, gives us, upon convergence, the estimates of our STM Topic-prevalence model. We discuss our convergence evaluation below.

Another complexity that follows from the intractable nature of the posterior is the starting values of the parameters: in our case, the initial mixture of words for a given topic. This complexity is known as initialisation, and our estimation depends on how we approach it. We specified the initialisation method using the default choice named "Spectral". The spectral algorithm is recommended for a large number of documents like ours (Roberts et al. 2020). The described estimation is executed with a maximum of 200 posterior variational EM iterations, subject to meeting convergence. Convergence is examined by observing the change in the approximate variational lower bound; the model is considered converged when this change between iterations becomes very small (the default value is 1e−5). We use functionalities included in the stm R package to estimate \(K\) and the STM topic-prevalence model parameters, as sketched below.
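A minimal sketch of the final fit, continuing the hypothetical objects above and using the validated number of topics (53, as reported in Sect. 5.2):

```r
# Sketch: STM Topic-prevalence fit with topical prevalence as a function of 27F.
fit <- stm(prep$documents, prep$vocab, K = 53,
           prevalence = ~ F27,        # Eq. (2): prevalence depends on the dummy
           data = prep$meta,
           init.type = "Spectral",    # default spectral initialisation
           max.em.its = 200,          # cap on variational EM iterations
           emtol = 1e-5)              # convergence tolerance on the lower bound
```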

In practical terms, the STM Topic-prevalence estimation described above allows us to measure how much a given topic contributes to each of our online job postings. We interpret our results by inspecting the estimated mixture of terms associated with topics. We include the most important terms for each topic using metrics like the highest probability and FREX terms (Roberts et al. 2019), where FREX measures the exclusivity of a term to a given topic. This association between terms, documents and topics results from the estimated model. However, for clarity, we name each topic according to our interpretation of the set of terms that motivates each of them. Thus, we can find topics associated with ICT labour. Since we specified the topical prevalence as a function of the \(27F\) dummy (see Eq. (2) and related statements), we can measure the variation in ICT labour topic prevalence between the pre- and post-disaster periods.
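As an illustration, these inspections are one-liners in the stm package (topic indices here anticipate the labels assigned in Sect. 5):

```r
# Sketch: highest-probability and FREX terms for selected topics,
# e.g., the topics we later label ICT (#33) and Construction (#13).
labelTopics(fit, topics = c(33, 13), n = 10)

# Expected topic proportions with top words (cf. Fig. 2).
plot(fit, type = "summary", n = 3)
```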

4.2.3 Treatment effect estimation and evaluation

Once we have estimated our STM Topic-prevalence model, the fitted parameters allow us to estimate a regression using the online job postings as units or documents, \(d\), to evaluate the influence of our dummy \(27F\) defined by Eq. (1) on the prevalence of a topic \(j\) (Roberts et al. 2019). Since \(27F\) indicates whether a job posting was published before the earthquake or after, i.e., in the post-disaster or "treated" period (see Sect. 3), we can study how the prevalence of topics changes in the disaster's aftermath. In other words, we evaluate the "treatment effect" of the disaster on topical prevalence by examining changes in topics' proportions over our sample of job postings published after the earthquake. The effect estimates are analogous to Generalised Linear Model (GLM) coefficients (Roberts et al. 2013).

We compute the topic proportions from the \(\theta\) matrix, where each row is the topic-prevalence vector \(\vec{\theta }_{d}\) for document \(d\) (see Eq. (2)) and each column corresponds to a topic. Thus, each element \(\theta_{d,j}\) is the probability of a job posting \(d\) being assigned to the topic \(j\). As an illustration, in a model with only two topics, we consider the probability of each job posting for each of these two topics. In this example, for a job posting \(d\), we can denote its proportions over the two topics as \(\theta_{d,1}\) and \(\theta_{d,2}\), where \(\theta_{d,1} + \theta_{d,2} = 1\). Thus, the regression to evaluate the treatment effect, where the topic proportions for a given topic \(j\) are the outcome variable, can be represented as

$$ \theta_{d,j} = \alpha + \beta \cdot 27F_{d} + e $$
(6)

where \(\alpha\) is the intercept, \(\beta\) is the coefficient to be estimated, and \(e\) stands for the error term. A significant \(\beta\) can be interpreted as a change (positive or negative) in topical prevalence due to our dummy standing for the post-disaster period.

The effect estimation procedure in the stm R package relies on simulated draws of topic proportions from the variational EM posterior (see Sect. 4.2.2) to compute the coefficients. We use the default value of 25 simulated draws and average over the results. This procedure randomly samples topic proportions from each job posting's estimated topic-proportion distribution and repeats the effect estimation across these draws. Also, as suggested by the software's authors, we include the estimation uncertainty of the topic proportions in the uncertainty estimates, or "Global" uncertainty, using the method of composition (Roberts et al. 2019). Regression tables display the various quantities of interest (e.g., coefficients, standard errors, a t-distribution approximation). The procedure uses 500 simulations (the default value) to obtain the required confidence intervals in the standard error computation (drawn from the covariance matrix of each simulation) and the t-distribution approximation (Roberts et al. 2020). We also show our results visually by displaying the contrast produced by the change in topical prevalence when shifting from the pre-disaster to the post-disaster period, using the mean difference estimates in topic proportions. A sketch follows.
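A minimal sketch of this step with stm's `estimateEffect`, continuing the hypothetical objects above:

```r
# Sketch: treatment-effect regression of Eq. (6) across all 53 topics.
eff <- estimateEffect(1:53 ~ F27, fit,
                      metadata = prep$meta,
                      uncertainty = "Global",  # propagate topic-proportion uncertainty
                      nsims = 25)              # simulated draws from the posterior

summary(eff, topics = c(33, 13))  # GLM-style coefficient tables, e.g., ICT and Construction

# Mean difference in topic prevalence, post- vs pre-disaster (cf. Fig. 3).
plot(eff, covariate = "F27", topics = c(33, 13),
     model = fit, method = "difference",
     cov.value1 = 1, cov.value2 = 0)
```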

Regarding the evaluation of our estimation, although the robustness of the treatment effect estimation implemented here against spurious effects has been validated by several tests (e.g., the Monte Carlo experiments detailed by Roberts et al. (2014)), we still apply a permutation test to evaluate the robustness of our findings. The procedure estimates our model 100 times, where each run applies a random permutation of our \(27F\) dummy to the job postings or documents. Then, the largest effect on our topics of interest is calculated. If the results connecting treatment to topics were an artefact of the model, we would find a substantial effect regardless of how the treatment was assigned to documents. Alternatively, we would find a treatment effect only when the assignment of our \(27F\) dummy aligned with the true data. We present the results of our permutation tests by plotting the contrast between our permuted models and the true model for our topics of interest, as sketched below.
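A sketch of the permutation test with stm's `permutationTest`, under the same hypothetical object names (the exact call assumes our binary \(27F\) covariate is the permuted treatment):

```r
# Sketch: re-estimate the model 100 times with randomly permuted 27F labels
# and compare effect sizes with the true assignment.
perm <- permutationTest(formula = ~ F27, stmobj = fit, treatment = "F27",
                        nruns = 100,
                        documents = prep$documents, vocab = prep$vocab,
                        data = prep$meta, uncertainty = "Global")

plot(perm, topic = 33)  # ICT labour topic (cf. Fig. 4, left-hand plot)
plot(perm, topic = 13)  # Construction labour topic (right-hand plot)
```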

Moreover, given the lack of additional control variables due to data availability, we also apply a Differences-in-Differences (DiD) approach as an alternative specification to address the potential limitations of the model specified in Eq. (6). To define treatment and control groups, we follow Jiménez et al. (2020), using the geographical location of the job postings to classify regions into treatment and control groups. The treatment group includes regions with the highest peak ground acceleration (PGA) values, indicating significant earthquake impact, while the control group consists of regions with the lowest PGA values, which experienced minimal impact. This approach ensures that the groups are defined based on an instrumental variable (PGA) that is not correlated with other confounding factors. The DiD model can be represented as:

$$ \theta_{d,j} = \alpha + \beta_{1} \cdot 27F_{d} + \beta_{2} \cdot Treatment_{d} + \beta_{3} \cdot \left( {27F_{d} \times Treatment_{d} } \right) + e $$
(7)

where \(Treatment\) is a dummy variable identifying whether the job posting belongs to regions classified in the treatment group, with the rest of the variables and notation as in Eq. (6). Here \(\beta_{3}\) represents the DiD estimator, which would indicate a change in the topical prevalence of ICT labour due to the earthquake. This method helps to improve the robustness of the causal inference by controlling for time-invariant differences between the treatment and control groups and common trends affecting both groups (Jiménez et al. 2020). A sketch of this specification follows.
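A minimal sketch of Eq. (7), assuming a model `fit_pooled` refit on the pooled treatment/control sample with `prevalence = ~ F27 * Treatment` and metadata `meta_pooled` holding both dummies (all names hypothetical):

```r
# Sketch: DiD via an interaction in the effect-estimation formula.
eff_did <- estimateEffect(1:53 ~ F27 * Treatment, fit_pooled,
                          metadata = meta_pooled,
                          uncertainty = "Global")

# The coefficient on the F27:Treatment interaction is the DiD estimator
# (beta_3 in Eq. (7)).
summary(eff_did, topics = c(33, 13))
```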

5 Results

5.1 Pre-processing and DTM representation

Once we applied cleaning, tokenisation and stemming, our DTM comprised 4136 documents, 63,038 features (99.9% sparse) and one covariate (the \(27F\) dummy). However, a substantial number of features appear in only a few documents, so we remove infrequent terms by dropping features that do not appear in at least ten documents. As a result, our DTM has 4129 documents and 2748 terms whose frequency is in the range [11, 2095]. Table 2 shows the 15 most frequently used terms in our DTM representation. Overall, the terms refer to the most frequent words in job titles and job areas that characterise our collection of job postings, such as sales, customer service, commercial, and management. Also, in the "Document frequency" column, the last column in Table 2, we can observe how frequently the features are allocated to documents. For example, in the second row, "client" in Spanish ("customer" in English) is the most represented feature since it is found in 1210 job postings.

Table 2 The 15 most frequent DTM terms

5.2 Estimating \(K\) and STM topic-prevalence model parameters

This section shows the findings from our estimation strategies detailed in Sect. 4.2.2. Applying the Anchor Words algorithm yielded \(K = 53\). Our alternative data-driven search for \(K\) produces similar results, as shown in Fig. 1. The left-hand plot corresponds to the held-out log-likelihood application. We see a "break" between 40 and 50 topics; after that point, adding more topics yields only minor improvements in the log-likelihood. In the case of the residual analysis, the right-hand plot of Fig. 1 shows the lowest dispersion levels between 50 and 60 topics. In this regard, we can validate \(K = 53\) since this quantity falls approximately within the ranges estimated by both data-driven measures.

Fig. 1 Diagnostics values of held-out log-likelihood (left-hand plot) and residuals (right-hand plot) by number of topics

Figure 2 shows the distribution of the expected topic proportions for the 53 topics over our collection of job postings. The x-axis corresponds to the expected topic proportion, and topic labels highlight the three words of the highest probability (stem words in Spanish).

Fig. 2 Expected topic proportions (x-axis) and the three highest probability words (in Spanish) for the 53 topics

The highest topic proportion in Fig. 2 corresponds to Topic #50, with the associated terms "vent", "ejecut", and "ejecut_vent". Translated into English, these terms are sales, executive, and sales executive, respectively, implying that most of our collection of job postings is devoted to sales-related jobs. We examine the 53 topics and name them based on the ten most probable words and FREX terms. In the Appendix (see Sect. A2), we show the full details of the high-probability and FREX terms and our proposed names for the topics (in Spanish and English).

Returning to Fig. 2, we look at topics standing for ICT labour. We find that Topic #33 (top half of Fig. 2) can be interpreted as an ICT labour topic, given that the most probable terms, i.e., stem words in Spanish, are "informat", "desarroll" and "program". As non-stem English words, these would be informatics, development and programming, respectively. Additional FREX terms include English words like data, support, and database (see Topic #33 in the Appendix, Sect. A2). Furthermore, software and programming languages belong to this topic (e.g., SQL, PHP). We do not observe other topics with similar terms, suggesting that only our topic of interest contains the expected mixture of ICT-related words.

We adopt the same approach to interpreting the rest of our topics: analysing the highest-probability and FREX top words. Some topics refer to broad economic sectors (e.g., Construction, Health), whereas others refer to specific job titles (e.g., Retail Store Manager, Management Assistants) or job posting sections (e.g., job posting benefits, job posting qualification requirements). Furthermore, we cannot interpret some topics (denoted as "Undefinable") since no clear concept emerges from their mixture of words.

5.3 Effect estimation of the earthquake

This section examines the treatment effect of the disaster on the topical prevalence of our ICT labour topic, as described in Sect. 4.2.3. For comparative purposes, we also examine the Construction labour topic (Topic #13 in the top half of Fig. 2), since reconstruction activities are expected to encourage this topic's post-disaster prevalence, as well as some topics that represent broader employment categories (e.g., Health). Table 3 presents the results for the regression represented by Eq. (6). A full list of regression results for the whole set of 53 topics is provided in the Appendix (see Sect. A3). Firstly, we focus on the prevalence of the ICT (Topic #33) and Construction (Topic #13) labour topics in Table 3. We can see that the \(27F\) covariate is not statistically significant when using the ICT topical prevalence as the outcome variable. In contrast, \(27F\) is significant (p-value < 0.01) and positive for the Construction labour topical prevalence. These findings show that the prevalence of the ICT labour topic does not change, indicating no difference in demand for ICT labour. Conversely, the prevalence of the Construction topic is significantly different and positive after the disaster, suggesting that reconstruction activities occurred in the earthquake's aftermath.

Table 3 Effect treatment regression results for ICT, Construction, and other selected topics (see full list of results in Appendix, Sect. A3)

Regarding the rest of the topics in Table 3 that show statistical significance at the 1%, 5% or 10% level for the \(27F\) covariate, we observe changes in their topic prevalence before and after the disaster. To illustrate, on the one hand, sectors like Business Management (Topic #6, p-value < 0.1), Banking and Finance (Topic #35, p-value < 0.01), and Health_41 (Topic #41, p-value < 0.01) show topic prevalence significantly different and negative after the disaster, suggesting a contraction of these economic activities in the earthquake's aftermath. On the other hand, and like Construction, topical prevalence for the topics representing Logistics_5 (#5, p-value < 0.05), Agriculture (#28, p-value < 0.05) and Occupational Risk Prevention (#48, p-value < 0.1) is significantly different and positive after the earthquake. For Agriculture, we can speculate that this shift reflects recovery activities, since the earthquake immediately affected this sector (Jiménez et al. 2020). For Logistics_5, we could link this shift to recovery activities in the supply chain sector. For Occupational Risk Prevention, a higher topic prevalence would suggest the need for these professionals in recovery activities or a heightened awareness of such risks in a post-disaster scenario. Visually, Fig. 3 shows that topical prevalence differed significantly and positively between the pre-disaster and post-disaster periods for, among others, the Construction labour topic.

Fig. 3 Difference in topic prevalence between pre-disaster and post-disaster periods for ICT and construction labour topics. Negative and positive values indicate that the topic is more prevalent in the pre- and post-disaster periods, respectively (confidence intervals at 95%)

Figure 4 shows the results of our permutation test (see Sect. 4.2.3 for details). For the ICT labour topic (left-hand plot), the permutation output suggests that our finding of no change in topic proportions is robust, since both the models with a random permutation of our \(27F\) dummy and the model with the true assignment of our variable, shown by the red line at the top of the plot, have effect sizes around zero. In the case of the Construction labour topic (right-hand plot), most estimated models have effect sizes grouped around zero; however, the model including the true assignment of our \(27F\) dummy, shown by the red line at the top of the plot, yields an effect far to the right of zero. Thus, the relationship between the treatment and the examined topics arises within the sample and is not driven by the estimation method. The same results are obtained for the rest of the topics.

Fig. 4 Permutation test results for the ICT labour topic (left-hand plot) and construction labour topic (right-hand plot) (confidence intervals at 95%)

Regarding the application of our DiD framework stated in Eq. (7), Table 4 shows the estimated parameters based on a sample of 73,000 job ads (63,996 in the treatment group). These results are similar to our original effect estimation discussed above: there are no significant coefficients for the \(27F \times Treatment\) interaction (our DiD estimator) for the ICT-related topics (in this additional analysis, two topics are related to ICT jobs), and the DiD estimator is positive and significant for our Construction-related topic. We also evaluate the trends in the outcome variable (the probability of a job posting being assigned to a topic) for the treatment and control groups to show compliance with the parallel trends assumption of the DiD approach, focusing on the pre-disaster period. As expected, we observe similar trends for the topics of interest (see Appendix A4 for details).

Table 4 DiD regression results for ICT and construction related topics

6 Discussion

This study examined the impact of the 27th of February 2010 Biobío earthquake on demand for ICT labour as a proxy for technological replacement. We do not find evidence that this large earthquake (Mw > 8) influenced the demand for ICT labour, represented by a topic featuring ICT-related terms from our job postings collection. This ICT labour topic is one of the 53 discovered by applying our STM-Topical Prevalence modelling and estimation strategy. The estimated number of topics is in line with expectations, given the size of our job postings collection and the data-driven selection measures.

Our treatment effect regression results show that the ICT labour topic prevalence did not change in the earthquake’s aftermath, suggesting no substantive technological change in the most affected regions (we do not have enough data to measure region-specific impacts). This lack of evidence does not support our conceptual framework’s main prediction that the expected technological upgrading with ICT-compatible equipment would lead to faster growth in demand for ICT labour. While ICT labour has been examined in the context of other shocks, such as pandemics and recessions, this is, as far as we know, the first study to attempt to link ICT labour with disasters; most of the literature emphasises the importance of ICT and related technologies in coping with disaster prevention and disaster management.

We can speculate as to the reasons why we have not observed evidence of technological upgrading after the earthquake. First, there is the sectorial structure of the Chilean economy. Assuming that older and outdated physical assets are more prone to earthquake damage because of weaker structures, mechanical fatigue, and other vulnerabilities (Okuyama 2003), the sectors that typically account for such tangible physical assets, like the manufacturing industry, are relatively underrepresented. As Chile has grown, its economic development has become more concentrated in the services sector, which relies mainly on intangible assets, while manufacturing and other sectors have declined (de la Torre et al. 2013; Parro and Reyes 2017). In the Chilean GDP structure, the services sector accounts for more than half of GDP, whereas the manufacturing sector’s share decreased from over 20% in the 1980s to 10% by 2010 (World Bank 2022). Consequently, the potential negative impact of a disaster on an underrepresented sector like manufacturing might be untraceable. In addition, the predominance of the services sector can also explain the lack of evidence, since it has been suggested that this sector, given the intangible nature of its assets and operations, does not suffer disaster impacts as severe as those in, for example, manufacturing (Doytch 2020).

Second, comparative studies suggest Chile may be well equipped to cope with disasters thanks to its building policies and economic conditions. For example, severe economic damage was expected in the aftermath of the 27th of February Biobío earthquake because it affected the central regions of the country, where most of the economic activity and population are concentrated. However, the detrimental effects on the economy were much smaller than those observed in low-income countries like Haiti, which was hit by a less severe earthquake (Mw 7.0) in January 2010 (Cavallo and Noy 2010; Congressional Research Service 2010). Another possibility suggested by past studies is that economic innovations usually appear once the economy has completely recovered from a disaster (Park et al. 2017). In this regard, a longer-term analysis could capture technological upgrading by observing changes in demand for ICT labour.

A third reason could be the exclusion of regions less affected by the earthquake (see Sect. 3 for details) but with higher concentrations of ICT-related labour, such as Región Metropolitana, where 40% of the Chilean population resides. Following past studies (Jiménez et al. 2020; Karnani 2015), we extended the STM estimation by including these regionsFootnote 10 (now based on about 63,000 job ads). This estimation yields 79 topics, two of which we can identify as ICT-related employment. For these topics, the treatment effect estimates remain negative but, unlike in the main analysis of this study, are statistically significant.

Our findings on Construction employment align with our expectations and with past studies (Belasen and Polachek 2009; Skidmore and Toya 2002). The positive impact on this labour sector suggests reconstruction activities occurred in the earthquake’s aftermath. This positive influence might arise as labour is substituted for damaged or missing physical capital in this sector. In this regard, some authors suggest that rebuilding activities favour unskilled and less-educated workers due to increased demand in the Construction sector, a highly intensive employer of unskilled labour (Di Pietro and Mora 2015). Less favoured groups, like migrants, can also see improvements in their labour outputs during recovery stages (How and Kerr 2019). The analysis of these positive influences of disasters on Construction is beyond the scope of this study, but it represents an opportunity for further research. Similarly, our findings regarding positive and negative impacts on other economic sectors also offer elements for future studies.

There are some caveats to the study that deserve mention. We focus on three aspects: extending the analysis’s post-disaster period beyond one year, methodological issues, and data representativeness. Regarding the former, while extending the post-disaster period offers a more comprehensive view of the labour market’s response, it also introduces the possibility that other macroeconomic or local factors could influence the observed changes, potentially confounding the effects attributed to the earthquake. To illustrate, some emphasise the importance of considering labour mobility and flexibility when modelling labour market responses after a disaster, since post-disaster economic conditions can lead to divergent outcomes (Grinberger and Samuels 2018; Venn 2012). These conditions include general economic trends, policy changes, or unrelated local developments in the affected regions that could either counteract or amplify the earthquake’s impact on labour demand.

Regarding methodological issues, first, there is potential ambiguity in the discovered topics, and as a result, we cannot interpret some of them. This difficulty might be greater for researchers without prior knowledge of the data or those analysing text in a foreign language, and it could result in the misinterpretation or misclassification of topics in this study. In this regard, we have added the full list of discovered topics and their associated keywords in the Appendix (see Sect. A.2) to make transparent the judgement involved in our topic identification process and to support the reliability of the topic chosen to represent ICT labour.

A second methodological caveat is that the model does not include control variables for effect estimation (see Eq. (6)), which may raise concerns about consistency and potential biases in the estimates. Our data only include the title and job description of each job post, its publication date, and its region, so we cannot add further controls. Besides, it has been difficult to find publicly available data matching the daily frequency of job ads, and aggregating the data to months or quarters would reduce its variability given the limited period analysed. However, as detailed in our methodology (see Sect. 4.2.3), the validation exercises and permutation tests are steps toward addressing potential biases in the estimates (see, e.g., Roberts et al. 2014, and the specialised literature, e.g., Good 2005). When utilised in STM, the permutation test serves as a robust non-parametric method to assess the significance of relationships between topics and treatments, effectively helping to mitigate potential biases in treatment effect estimates.Footnote 11 In addition to this validation analysis, and as detailed in our treatment effect estimation section, we apply a DiD framework to address the potential limitations of the lack of control variables in our model specification. As discussed above, the main results remain unaltered (see the statements related to Table 4).

A third estimation issue is that STM is a relatively recent method, and its utility and limitations are still being explored. For the treatment effect estimation implemented in this study, there have been some warnings about the modelling of topic proportions: STM ignores the fact that proportions belong to the interval [0, 1], and its regression approach combines Bayesian and frequentist methodsFootnote 12 (Schulze et al. 2021). Improvements tackling these limitations should be implemented in future versions of STM.
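To make the first concern concrete, the following sketch fits a fractional-logit GLM, one possible workaround (not part of STM and not used in our estimation) that, unlike a linear model on raw proportions, keeps fitted topic proportions inside the unit interval; the data and names are simulated.

```python
# Illustrative sketch of the [0, 1] concern raised by Schulze et al. (2021):
# a fractional-logit GLM models proportions with a logit link, so fitted
# values stay in (0, 1). Simulated data; hypothetical workaround only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 5_000
after_27f = rng.integers(0, 2, n).astype(float)
# Simulated topic proportions, kept strictly inside (0, 1).
theta = np.clip(
    0.05 + 0.03 * after_27f + rng.normal(0, 0.02, n), 1e-4, 1 - 1e-4)

X = sm.add_constant(after_27f)
# Binomial family with proportions as the response ("fractional logit"):
# predictions are bounded, unlike a linear model on raw proportions.
frac_logit = sm.GLM(theta, X, family=sm.families.Binomial()).fit()
print(frac_logit.predict(X).min(), frac_logit.predict(X).max())
```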

Related to data representativeness limitations, the public sector played a vital role in the aftermath of the 2010 earthquake, but we cannot identify this kind of labour in our analysis.Footnote 13 In Chile, government initiatives demonstrated effective disaster response and reconstruction strategies (Arbour et al. 2011; Siembieda et al. 2012). By implementing such policies and actions, the public sector can significantly affect its own labour demand. Although our STM approach helps us effectively identify shifts in other economic sectors, it falls short of capturing the dynamics of post-disaster government-related employment. We recognise this as a limitation of our study and suggest that future research specifically address the role of the public sector in disaster recovery.

Further suggestions for future research include focusing on more disaggregated analyses, theoretical development, and extending the post-disaster period under examination. The importance of research differentiating labour groups or other distributions of workers lies in its ability to identify the worst affected or most favoured workers, either in the aftermath of a disaster or during the economic recovery; typically, aggregated analysis hides the impact on sub-groups (How and Kerr 2019; Zissimopoulos and Karoly 2010). Regarding theoretical developments, some authors have made economic generalisations about disaster dynamics by combining conceptual frameworks (e.g., Kirchberger 2017; Okuyama 2003), as reproduced in this study; however, much theoretical work remains to be done. Regarding the extension of the post-disaster period, as pointed out above, technological replacement might be not only a short-run but also a medium- or long-run decision.

Our analysis also suggests some potential policy implications. First, policymakers can take advantage of recovery activities by giving greater consideration to the potential for technological upgrading. This is particularly relevant for disaster-exposed countries or regions like Chile, where the absence of technological upgrading in the planning of recovery activities might explain why we cannot observe technological replacements. Policymakers usually emphasise aspects like disaster risk reduction to improve resilience, where upgrading is mainly planned for infrastructure, since disasters threaten sustainable development (Bello et al. 2021). However, a recovery process promoting technological replacement in firms could exploit and encourage potential technological adoption after disasters (Benson and Clay 2004; Doytch 2020). For example, policies could promote upgrading through fiscal incentives (e.g., tax reductions and financial support). Countries receiving greater inflows of external capital in the aftermath of disasters, such as foreign direct investment (FDI), could attract this investment by focusing on technological upgrading (Doytch 2020). Other highly seismic countries, like Japan, supply abundant liquidity to mitigate the financial constraints on businesses located in affected areas (Okazaki et al. 2019).

In the case of Chile, as one of the region’s strongest economies, it had a good chance after the 27th of February 2010 Biobío earthquake of receiving support from international financial institutions (e.g., the World Bank, the International Monetary Fund), not only for reconstruction (Congressional Research Service 2010) but also for technological upgrading. Nevertheless, to our knowledge, no strategy considered the issue discussed here. Therefore, we would encourage policymakers to take advantage of reconstruction activities by promoting potential technological upgrading through fiscal incentives, the mitigation of financial restrictions, and policies targeting the replacement of industrial technology, as discussed above. In turn, this “forced” upgrading might improve demand for highly skilled workers in the ICT and related technologies sector.

A second policy aspect relates to paying more attention to disaggregated labour, for example, less favoured workers employed in recovery activities. These activities supply job opportunities for such workers that might not exist otherwise, which is desirable from a policy perspective. However, reconstruction activities typically employ low-skilled or unskilled workers, as usually occurs in the Construction sector (Rodríguez-Oreggia 2013), and this unskilled labour sits at the lowest end of the Construction sector’s wage distribution (Sisk and Bankston 2014). In addition, these low-paying jobs are often dangerous. For example, it has been suggested that in the aftermath of Hurricane Katrina, an undocumented and foreign-born labour force carried out the most unsafe reconstruction activities, like demolition (Trujillo-Pagan 2012). Bearing this in mind, policymakers should promote strategies focused on these most vulnerable workers, such as improving workers’ prospects through retraining to mitigate the eventual loss of income once the recovery process finishes. Also, more attention should be paid to work safety policies, since hard and hazardous jobs usually employ less favoured workers.

Beyond the policy implications discussed above, our study’s findings indicate a distinct increase in labour demand within the Construction sector post-disaster, while the demand for ICT labour remains relatively unchanged. This observation prompts a reconsideration of policy recommendations that focus solely on promoting technological change. Given the evidence, it seems more prudent to address the labour market mismatches that have become apparent in the aftermath of the disaster. The surge in construction employment, predominantly unskilled or low-skilled as discussed above, suggests a need for policies that not only meet the immediate reconstruction requirements but also consider the longer-term career prospects and skills development of those employed in this sector.

Furthermore, while it may seem counterintuitive to prioritise technological advancement in a context where ICT demand is stagnant, it is essential to view this within a broader economic framework. Technological advancement should not be seen in isolation but as part of a comprehensive strategy that addresses the evolving needs of the economy. Policies should, therefore, focus on facilitating transitions from temporary to permanent employment, investing in training and upskilling programs, and ensuring that workers, especially the vulnerable, are equipped to adapt to changing labour market demands. This approach aligns with the idea of supporting a dynamic labour market that can absorb workers from sectors like construction into more stable and potentially higher-skilled jobs in the long run.

7 Conclusion

The effect on ICT employment of technological upgrading in the aftermath of disasters has received little attention. Nevertheless, disasters can be an opportunity to accelerate technology adoption, which can positively impact demand for specialised labour like ICT workers.

We explored the influence of the 27th of February 2010 Biobío earthquake on demand for ICT labour as a proxy for a technological replacement event. Our findings, based on open-text data from online job postings alongside our topic modelling and treatment effect estimations, show that demand for ICT labour did not significantly change in the aftermath of the earthquake. Given these results, we would assert that there was no significant technological upgrading through the replacement of destroyed equipment with ICT-compatible capital goods in the most affected regions. However, we observed an increase in Construction labour; therefore, and as expected, reconstruction activities featured strongly in the recovery process.

The lack of evidence for an influence on ICT labour of shocks like the examined earthquake might reflect features characteristic of Chile, such as its building policies, economic conditions, and the size of its manufacturing sector. Furthermore, technological replacements might occur in the medium or long run or, possibly, once recovery activities finish. In this regard, future research should examine periods beyond our post-disaster span of two years. We also encourage further research analysing disaggregated labour and developing stronger theoretical foundations for conceptualising the interactions between disasters, labour, and technology.

Finally, as discussed earlier, the policy recommendations arising from our study advocate a balanced approach that acknowledges immediate post-disaster needs, such as reconstruction, while also preparing the workforce for future economic shifts. This involves a dual focus: supporting sectors with immediate demand, like construction, and fostering an environment conducive to technological upgrading and skills development, in order to mitigate the risks of labour market mismatches and ensure sustainable economic recovery and growth.