Understanding the influence of news on society decision making: application to economic policy uncertainty

The abundance of digital documents offers a valuable chance to gain insights into public opinion, social structure, and dynamics. However, the scale and volume of these digital collections make manual analysis approaches extremely costly and unscalable. In this paper, we study the potential of using automated methods from natural language processing and machine learning, in particular weak supervision strategies, to understand how news influences decision making in society. Besides proposing a weak supervision solution for the task, which replaces manual labeling to a certain extent, we propose an improvement of a recently published economic index. This index, known as the economic policy uncertainty (EPU) index, has been shown to correlate with indicators such as firm investment, employment, and excess market returns. In summary, we present an automated, data-efficient approach based on weak supervision and deep learning (BERT + WS) for identifying news articles about economic uncertainty and adapt the calculation of the EPU index to the proposed strategy. Experimental results reveal that our approach (BERT + WS) improves over the baseline method centered on keyword search, which is currently used to construct the EPU index. The improvement is over 20 points in precision, reducing the false positive rate typical of keyword-based methods.


Introduction
With the rise of technology, a growing portion of human interactions, communications, and cultural activities is being documented as digital text. This encompasses everything from social media posts to news articles and transcripts of spoken exchanges in political, legal, and economic spheres. By analyzing these digital documents, we can gain valuable insights into the correlation between language usage and human thoughts, behaviors, and social structures. However, the massive proliferation of digital collections has rendered manual analysis methods inefficient, lacking in scalability, and less able to meet time constraints.
A novel research area, referred to as text-as-data, which utilizes digital collections of data through computational methods, is emerging within the discipline of computational social science [1]. This domain has a diverse array of applications in sociology, economics, and political science [2]. In economics, for example, textual sources such as newspapers [3], central bank communications [4], financial earnings calls [5], and social media data [6] have been leveraged to understand social processes.
One prominent example of a text-as-data application in economics is the Economic Policy Uncertainty (EPU) index proposed by Baker et al. [3]. This index was established by counting news articles that contained keywords related to the economy, policy, and uncertainty as a proportion of all news articles published within a certain time frame. The resulting index has proven useful, demonstrating the potential to extract economic signals from text data. In fact, it has been found to be predictive of various economic indicators such as recessions, economic growth, and investment [3]. However, the keyword-search methodology used in the identification stage is less rigorous and was found to be prone to many false positives and negatives [3].
The most effective approach to minimize false positives and negatives in the detection of Economic Policy Uncertainty (EPU)-related news articles is to adopt an automated methodology. Several research efforts have explored the use of unsupervised machine learning models, such as topic models [7,8], to identify topics that pertain to EPU. However, the topics identified through unsupervised methods may not be easily understandable or align with actual EPU categorizations. Fully supervised methods have also been explored, such as that of Keith et al. [25], who classify news articles with respect to EPU categories using supervised machine-learning models. The challenge with fully supervised methods is the requirement for high-quality training data. The process of annotating data is a labor-intensive task that often requires the expertise of trained individuals and can be costly. This has limited the potential of machine learning techniques in fields such as economics, which have limited resources and a lack of annotated public text data.
In this paper, we present BERT + WS for the automatic classification of news articles related to Economic Policy Uncertainty (EPU). The approach employs weak supervision in conjunction with neural language models. The process begins by generating noisy labels using weak sources, in the form of labeling functions, as described in Sect. 3. These noisy labels are then used to fine-tune a language model. Our approach reduces the false positive rate associated with traditional keyword-based methodologies by over 20%. Fully supervised approaches typically achieve better performance, but that gain comes at the cost of a time-consuming and often expensive labeling process that requires the expertise of domain experts. Our approach, on the other hand, is efficient and cost-effective, as it only requires domain experts to provide labeling functions rather than annotating individual news articles. The key contributions of this work are as follows:

• Proposing a weak supervision approach (BERT + WS) for automatic classification of economic policy uncertainty from news pieces.
• Generating an Irish weak-supervision-based economic policy uncertainty index based on BERT + WS.
• Conducting extensive econometric analysis with Irish macroeconomic indicators to understand whether the generated index is predictive of macroeconomic indicators.

Background and related work
The increasing importance of text data as a timely source of information is gaining significant recognition in economics and social science research for formulating policies in the age of big data. Different textual sources such as news articles [3], central bank communications [9], financial earning calls [10], and social media [2] have been explored in that context. To extract meaningful information from text data, advanced natural language processing techniques such as topic modeling [11], word embeddings [12], and pre-trained language models [13,14] have been employed.
In this work, we focus on the application of text data originating in news articles to measuring economic policy uncertainty. Frank Knight, in his work [15], defines uncertainty as the incapacity of individuals to predict the occurrence of future events with certainty. According to this definition, people are unable to create a valid probability distribution for their beliefs about future happenings. Knight also defines risk as the individuals' established probability distribution for future events. Despite these definitions, economists tend to use the terms "risk" and "uncertainty" interchangeably.
Alternative measures for understanding economic uncertainty, beyond the traditional use of the VIX, have been explored in recent research. Li et al. proposed a graph neural network with a metapath attention mechanism for predicting stock market volatility using multi-source heterogeneous data [16]. Xu et al. introduced a hierarchical graph neural network (HGNN) to analyze the market state at a hierarchical level, predicting whether a stock that reaches its daily price limit will close at the same price level in the next trading day [17]. In a recent study conducted by Kishor et al. [18], the impact of emotional biases (including overconfidence bias, self-control bias, loss aversion bias, and regret aversion bias) on the risk preferences of individual investors was examined through the use of structural equation modeling based on survey data. In 2020, Drakopoulos et al. [19] explored a suite of essential graph analytics for recommending cultural content. Monken et al. [20] proposed AINET, an artificial intelligence-powered technique that assesses the underlying causes of exceptional events in bilateral trade modeling using neural networks. These solutions are distinct from our proposed approach, as they concentrate on the stock market and do not investigate how societal decisions impacted by recent events can be measured by an economic uncertainty index. In that sense, our work adds to the suite of new tools for analyzing economic factors in society.
The utilization of text data as a potential source of timely data is becoming a crucial aspect in measuring Economic Policy Uncertainty (EPU). One of the most notable studies in text-based EPU measurement is that of Baker et al. [3], which serves as a foundation for our own research. They measure EPU by counting the number of news articles that contain keywords associated with economy, policy, and uncertainty, and then dividing that number by the total number of articles published in the same newspaper and month. Our work differs from theirs by incorporating an automated machine-learning approach in identifying articles that describe EPU. Previous studies [7,21] have attempted to detect economic policy uncertainty using unsupervised machine learning techniques such as latent Dirichlet allocation (LDA) [8]. However, this approach creates a time series of the counts of topics generated by the algorithm, which may not provide an accurate representation of economic policy uncertainty. Our work takes a different approach by employing proper classification techniques instead of relying on topic modeling or counting methods. Miranda et al. [22] also developed a real-time EPU index that leverages semantic clustering with word embeddings for topic modeling of digital news.
Recently, Lolic et al. [23] demonstrated the effectiveness of ensemble techniques, including random forests and gradient boosting, in improving the accuracy of the economic policy uncertainty index. On the other hand, Nyman and Ormerod [24] explored the potential of expanding the uncertainty keywords originally proposed by Baker et al. by using nearest neighbor embeddings and identified the Granger causality between their expanded keyword list and the existing EPU index. Keith et al. [25] also proposed a supervised machine learning approach for EPU classification and analyzed annotator uncertainty and causal assumptions in measuring EPU from newspapers. However, the reliance on supervision in their approach restricts its applicability. To address this issue, we propose to incorporate weak supervision into the classification process as a potential solution.
Despite the success of most machine learning algorithms, the performance of these models is largely enabled by high-quality annotated data, which may not be readily available or can be expensive to create. Hence, the application of these models is challenging, especially in low-resource settings where there are few to no annotated datasets, as is the case with the EPU application. Transfer learning [26] is one approach, where models are pretrained on generic tasks and fine-tuned on domain-specific tasks. Data augmentation [27] creates more training data points by artificially modifying the existing dataset. Active learning [28] trains models with few labeled data points selected from a large pool of unlabeled data using acquisition functions that select the most informative points. Meta-learning, or few-shot learning [29], aims to learn from very few examples by learning across learning algorithms, discovering structure among tasks that enables faster learning on new tasks.
Weak supervision is an evolving machine learning approach for the automated creation and modeling of training datasets, commonly referred to as data programming [56]. In data programming, domain experts or users provide simple rules or heuristics, termed labeling functions, such as patterns, keywords, pre-trained models, existing knowledge bases, or similar domain-specific strategies, as a substitute for manual labeling. Unlabeled data are processed through the expert-defined labeling functions, resulting in the assignment of multiple labels to each record, which may be conflicting or correlated. To tackle the noise and conflicting outcomes from the labeling functions, data programming models them as a generative process using a factor model. This generative model outputs probabilistic labels, which are then used to train the final classifier, allowing for generalization beyond the labeling functions. The original methodology proposed by [56] was limited to user-written heuristics, and various alternative methods for aggregating labeling functions, beyond its generative model, have since been proposed.
Rühling Cachay et al. (2021) present an innovative end-to-end approach for learning the downstream model by maximizing its agreement with probabilistic labels generated through re-parameterizing probabilistic posteriors with a neural network [30]. Fu et al. [30] introduce FlyingSquid, a weak supervision framework that operates much faster than the commonly used Snorkel [31]. Shin et al. present a universal technique for weak supervision over any type of label while maintaining theoretical guarantees [32]. Yu et al. propose a method to reduce overfitting in language models trained with weak labels through fine-tuning with contrastive regularization and confidence-based weak supervision [33]. To further improve weak supervision methods, automatic generation of labeling functions [34-37] has been proposed to reduce the burden of writing them. Labeled data have also been used to supplement labeling functions through semi-supervised learning [38-40], active learning [41], or subset selection [42-44]. Data programming has been employed in various real-world applications, such as medical applications [45,46], social media analysis [47,48], and autonomous driving [49]. Our work explores the application of weak supervision to measuring Economic Policy Uncertainty, aiming to replace current query-based approaches.

Overview
Our proposed framework, as shown in Fig. 1, involves three key stages. The first stage uses expert-defined labeling functions (LFs) to automatically generate a label matrix, in which each news article is assigned a noisy label by each LF. The second stage uses an unsupervised generative factor model to combine the outputs of the multiple LFs into a single auto-generated noisy label for each news article, denoted Y. In the third stage, we fine-tune BERT to perform EPU classification using the auto-generated noisy labels. The following subsections present these stages in more detail.

Labeling functions
Labeling functions are a means of expressing domain knowledge to label a subset of data points without individually labeling each data point. The input of labeling functions is a dataset $X = \{x_1, \ldots, x_n\}$ of independent and identically distributed (i.i.d.) news articles and $m$ labeling functions (LFs), denoted by $\lambda = \{\lambda_1, \ldots, \lambda_m\}$, that are derived from heuristics provided by the domain experts or existing prior knowledge about the task. Each labeling function $\lambda_j$ noisily labels each news article with $\lambda_j(x) \in \{-1, 0, 1\}$, where 1 indicates that the news article is about EPU, 0 indicates that it is not about EPU, and $-1$ means that the labeling function abstained from labeling the article. Data programming applies these $m$ labeling functions to $n$ unlabeled news articles to produce a label matrix $\Lambda \in \{-1, 0, 1\}^{n \times m}$.
The label matrix is then processed by the generative factor model to produce a vector of noisy labels $Y = \{y_1, \ldots, y_k\}$, where $k \le n$ counts only those articles for which the LFs did not all abstain; these labels are then used to fine-tune BERT.
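As an illustration of this formalism, the sketch below shows how two toy labeling functions could be written with the Snorkel library used in this work (Sect. 4.2). The keyword lists, the sports-section heuristic, and the toy articles are illustrative assumptions, not the labeling functions actually used.

```python
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier

ABSTAIN, NOT_EPU, EPU = -1, 0, 1

# Hypothetical keyword lists; the actual expert-provided lists differ.
UNCERTAINTY_WORDS = {"uncertain", "uncertainty"}
POLICY_WORDS = {"regulation", "legislation", "deficit"}

@labeling_function()
def lf_uncertainty_and_policy(article):
    """Label EPU when uncertainty and policy keywords co-occur."""
    text = article.text.lower()
    if any(w in text for w in UNCERTAINTY_WORDS) and any(w in text for w in POLICY_WORDS):
        return EPU
    return ABSTAIN

@labeling_function()
def lf_sports_section(article):
    """A hypothetical negative heuristic: sports articles rarely describe EPU."""
    return NOT_EPU if article.section == "sport" else ABSTAIN

df = pd.DataFrame({
    "text": ["Uncertainty over the new deficit legislation grows.",
             "The team won the cup final."],
    "section": ["business", "sport"],
})

# Apply the m LFs to the n unlabeled articles: the n x m label matrix Lambda.
applier = PandasLFApplier(lfs=[lf_uncertainty_and_policy, lf_sports_section])
L = applier.apply(df)  # entries in {-1, 0, 1}
```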
(Fig. 1: Several weak supervision sources, such as domain heuristics, knowledge bases, and pretrained language models, are expressed as labeling functions (LFs). Each labeling function assigns a news article a label or may abstain from labeling it; the combination of all labels from the labeling functions results in a label matrix. The label matrix is fed into a generative model that combines the outputs of these multiple, potentially conflicting noisy labels into a single label per news article. The generated labels are then used to fine-tune BERT for EPU classification on the news articles.)

We describe some of the details of the labeling functions used in our case as follows:

• Keywords: The experts are only required to provide a few keywords associated with each class. This is plausible because experts cannot always give an exhaustive list of keywords associated with the classes. We expand the expert-provided keywords by mining the nearest words to the provided words in the semantic space (see the sketch after this list). This was achieved by first obtaining the embedding of these expert-provided keywords and an embedding for each word in the corpus; both sets of embeddings are normalized to reside on a unit sphere representing a joint semantic space from which we can retrieve the top k-nearest-neighbor words. We choose to focus on the spherical space as opposed to the Euclidean space due to its superiority in exploiting the structure and geometry of the manifold [50-52]. More formally, let U be the embeddings of the few user-provided seed words for each class; the semantics of each class is modeled as a von Mises-Fisher (vMF) distribution. A d-dimensional unit random vector u is said to have a d-variate von Mises-Fisher (vMF) distribution if its probability density function (pdf) is given by [50]

$$f(u \mid \mu, \kappa) = c_d(\kappa)\, e^{\kappa \mu^T u}$$

where $\|\mu\| = 1$ and $\kappa \ge 0$. The normalizing constant $c_d(\kappa)$ is given by [50]

$$c_d(\kappa) = \frac{\kappa^{d/2-1}}{(2\pi)^{d/2}\, I_{d/2-1}(\kappa)}$$

where $I_t(\cdot)$ is a modified Bessel function of order $t$.
We used an expectation-maximization (EM) algorithm to find the parameters of the von Mises-Fisher (vMF) distributions. The initial pdf can be modified to represent a mixture of $k$ vMF distributions [50]

$$f(u \mid \Theta) = \sum_{j=1}^{k} \alpha_j\, f_j(u \mid \theta_j)$$

where $\Theta = \{\alpha_1, \ldots, \alpha_k, \theta_1, \ldots, \theta_k\}$ and $\theta_j = (\mu_j, \kappa_j)$. Let $H = \{h_1, \ldots, h_n\}$ be a set of hidden variables indicating the vMF component from which each point was sampled.
Assuming the values of $H$ were observed, we could have obtained the parameter values using the complete log-likelihood of the observed data; since they are not observed, we optimize the incomplete likelihood by alternating an E-step and an M-step [50,51]:

- E-step: in the expectation step, we use the observed data to estimate the posterior distribution of the missing data [50,51]:

$$p(h_i = j \mid u_i, \Theta) = \frac{\alpha_j f_j(u_i \mid \theta_j)}{\sum_{l=1}^{k} \alpha_l f_l(u_i \mid \theta_l)}$$

- M-step: in the maximization step, we use the expectations from the E-step to update the parameters of the model [50,51]:

$$\alpha_j = \frac{1}{n} \sum_{i=1}^{n} p(h_i = j \mid u_i, \Theta), \qquad r_j = \sum_{i=1}^{n} u_i\, p(h_i = j \mid u_i, \Theta)$$

$$\hat{\mu}_j = \frac{r_j}{\|r_j\|}, \qquad \hat{\kappa}_j \approx \frac{\bar{r}_j d - \bar{r}_j^3}{1 - \bar{r}_j^2}, \quad \text{where } \bar{r}_j = \frac{\|r_j\|}{\sum_{i=1}^{n} p(h_i = j \mid u_i, \Theta)}$$

The expanded keyword list is used as a keyword lookup for labeling functions, as well as for semantic similarity tasks.
• Semantic similarity: We computed contextualized vector representations (embeddings) of our expanded keywords and of each news article $x$ in our corpus using a Siamese sentence transformer [53]. These embeddings are used to generate soft labeling functions based on the cosine similarity between the expanded user-provided keywords and each news article in our corpus. The hypothesis behind this labeling function is that a news article with a higher cosine similarity to the embedding of the keywords for an EPU class is more likely to belong to that class. We assign a news article $x$ to a pseudo-EPU class $Y$ if $\text{cosine}(x, k) \ge \phi$, where $\phi \in [0, 1]$ is a hyperparameter.
• Patterns: We also searched news articles for the co-occurrence of keywords related to EPU and words known to be associated with uncertainty. For example, articles that describe uncertain events like Brexit or a financial crisis are more likely to be describing policy uncertainty.
• Sentiment polarity: We also hypothesized that articles describing policy uncertainty are more likely to have a negative sentiment polarity. This is our hypothesis and is not necessarily supported by the literature in economics.
• Zero-shot classifier: We used BART (Bidirectional and Auto-Regressive Transformers) [54] to perform zero-shot inference on our news articles, producing noisy labels for each EPU class.
The labeling functions can be adjusted or extended based on their performance on a validation set.
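To make the embedding-based steps above concrete, the following sketch expands a seed keyword list via nearest neighbors on the unit sphere and then applies a cosine-similarity labeling function with threshold $\phi$. It is a minimal sketch under stated assumptions: the sentence-transformer checkpoint, seed words, vocabulary, and threshold are placeholders, not the actual values used in this work.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

ABSTAIN, EPU = -1, 1
PHI = 0.5  # hypothetical similarity threshold, tuned on the validation set

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder checkpoint

def expand_keywords(seeds, vocabulary, top_k=3):
    """Expand seed keywords with nearest neighbors on the unit sphere."""
    seed_vecs = encoder.encode(seeds, normalize_embeddings=True)
    vocab_vecs = encoder.encode(vocabulary, normalize_embeddings=True)
    # On the unit sphere, cosine similarity reduces to a dot product.
    best = (vocab_vecs @ seed_vecs.T).max(axis=1)
    return [vocabulary[i] for i in np.argsort(-best)[:top_k]]

seeds = ["uncertainty", "economy", "policy"]  # hypothetical expert seeds
vocabulary = ["regulation", "deficit", "football", "legislation", "weather"]
expanded = expand_keywords(seeds, vocabulary)

# Soft LF: compare each article to the embedding of the expanded keyword list.
keyword_emb = encoder.encode(" ".join(seeds + expanded), normalize_embeddings=True)

def lf_semantic_similarity(article_text):
    emb = encoder.encode(article_text, normalize_embeddings=True)
    return EPU if float(emb @ keyword_emb) >= PHI else ABSTAIN
```

A zero-shot labeling function can be built analogously with the Hugging Face zero-shot-classification pipeline over a BART checkpoint such as facebook/bart-large-mnli.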

Generative model
The objective of this stage was to automatically assign each news article a noisy label in an unsupervised fashion. The outputs of the labeling functions described above can be conceptualized as multiple annotators labeling the same news article just like in the crowd-sourcing setting [55].
The produced labels will have conflicts and potentially correlations; this happens even when news articles are labeled by domain experts [25]. Instead of simply taking a majority vote to obtain a final label, we adopt a probabilistic framework to exploit the structure and correlations within the label matrix. The probabilistic model $P_w(\Lambda, Y)$ is formulated as a joint probability of the outputs of the labeling functions (the label matrix) $\Lambda$ and the latent (unobserved) true class labels of the news articles $Y$. In particular, we encoded the label matrix with a factor model using three factor types representing the conflicts, correlations, and propensity (where LFs did not abstain) of the labeling functions. The generative model can be defined as follows [31]:

$$P_w(\Lambda, Y) = \frac{1}{Z_w} \exp\left( \sum_{i=1}^{n} w^T \phi_i(\Lambda_i, y_i) \right)$$

where $Z_w$ is the normalizing constant and $\phi_i(\Lambda_i, y_i)$ aggregates the factors for all labeling functions given a sample news article $x_i \in X$, with $y_i$ the latent class label.
To learn the parameters $w$ of the model $P_w(\Lambda, Y)$, we minimize the negative marginal log-likelihood given the observed label matrix $\Lambda$, observing only the agreements and disagreements in $\Lambda$, since we do not have access to the ground-truth labels [31]:

$$\hat{w} = \arg\min_w \; -\log \sum_{Y} P_w(\Lambda, Y)$$

The learned parameters are used to generate the probabilistic noisy labels $\tilde{Y} = P_{\hat{w}}(Y \mid \Lambda)$, which are then used to fine-tune the BERT classifier.
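In practice, this generative stage corresponds to fitting Snorkel's LabelModel on the label matrix. A minimal sketch, assuming L and df come from the labeling-function stage above; the hyperparameter values are illustrative.

```python
from snorkel.labeling import filter_unlabeled_dataframe
from snorkel.labeling.model import LabelModel

# Fit the generative label model on the n x m label matrix L (entries in
# {-1, 0, 1}), learning LF accuracies and correlations from agreements
# and disagreements alone -- no ground-truth labels are used.
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train=L, n_epochs=500, lr=0.01, seed=42)

# Probabilistic labels P(Y | Lambda) for every article.
probs = label_model.predict_proba(L)

# Drop articles where every LF abstained; the rest are used to fine-tune BERT.
df_train, probs_train = filter_unlabeled_dataframe(X=df, y=probs, L=L)
```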

Discriminative model
In this stage, we fine-tuned BERT (BERT + WS) using the noisy labels generated by weak supervision sources instead of human-annotated labels. This was achieved by adding a feed-forward neural network on the last layer and leaving the other layers frozen to facilitate adaptation to our downstream EPU classification task. The model was trained by minimizing the expected loss using a noise-aware objective function [31]:

$$\hat{\theta} = \arg\min_\theta \frac{1}{k} \sum_{i=1}^{k} \mathbb{E}_{\tilde{y} \sim \tilde{Y}_i}\left[ L\big(h_\theta(x_i), \tilde{y}\big) \right]$$

where $\mathbb{E}$ denotes expectation, $L$ denotes the loss function, $h_\theta$ is the classifier, and $k$ denotes the number of training examples. The objective is the same as a standard supervised learning loss, except that we minimize the expected value with respect to the noisy probabilistic labels $\tilde{Y}$ generated by the label model. Theoretical analysis guarantees that the generalization error of the discriminative model decreases at the same asymptotic rate as with traditional hand-labeled data [56].
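A minimal PyTorch sketch of this noise-aware objective: the expected cross-entropy under probabilistic labels reduces to a probability-weighted negative log-softmax. The checkpoint name and single-example batch are placeholders; freezing the encoder mirrors the setup described above.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Freeze the encoder; only the classification head adapts, as described above.
for p in model.bert.parameters():
    p.requires_grad = False

def noise_aware_loss(logits, probs):
    """Expected loss over probabilistic labels:
    -(1/k) * sum_i sum_y P(y_i = y | Lambda) * log softmax(logits_i)_y."""
    return -(probs * torch.log_softmax(logits, dim=-1)).sum(dim=-1).mean()

batch = tokenizer(["Uncertainty grows over the new deficit legislation."],
                  return_tensors="pt", truncation=True, max_length=512)
probs = torch.tensor([[0.2, 0.8]])  # probabilistic label from the label model

loss = noise_aware_loss(model(**batch).logits, probs)
loss.backward()
```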

Time and trade-off analysis
This section describes the time trade-off of using weak supervision (BERT + WS) against other models. In this work, we use a two-stage weak supervision framework: we first generate the noisy labels using weak supervision sources and then use them to train a discriminative classifier (BERT). The keyword search employed by economists is faster than our proposed approach (BERT + WS). Moreover, it is on the articles identified by keyword search that we apply our labeling functions. However, the articles retrieved by keywords contain considerable amounts of false positives and negatives [3]. The key difference between our proposed method and other discriminative classifiers is label generation. We use the time taken by humans to generate labels as a proxy for the time required to produce quality training labels, and the time required to write labeling functions as a proxy for the time to generate noisy labels. It took one of the authors of this work (who is trained in economics) just a week to write proper labeling functions for both datasets, with inspiration from the coding guide provided by [3], while it took a team of 14 expert human annotators 6 months to label 12,000 USA news articles [3]. The difference between the time required to produce labels and that required to design labeling functions is thus more than an order of magnitude, and the labeling functions can be adjusted once new information about the labels arrives. Training the discriminative classifier takes the same time regardless of whether noisy labels or human labels are used.

Results and discussion
This section describes the implementation of our weak supervision framework (BERT + WS) and its comparison to other available models for economic policy uncertainty (EPU) classification.

Datasets
We employed news articles published in Irish and USA newspapers, as well as data on economic indicators, to assess the usefulness of our approach in predicting economic fundamentals.

News datasets
Irish newspapers: We retrieved news articles from the Irish Times (https://www.irishtimes.com/) and the Irish Independent (https://www.independent.ie/) in the time frame of January 1992 to August 2021. These newspapers were selected because they were among those with the highest coverage in the country and had been in publication for a long time compared to other newspapers. The articles retrieved are those that contained the following keyword combination: {('uncertain' OR 'uncertainty') AND ('economy' OR 'economic') AND ('regulation' OR 'legislation' OR 'dail' OR 'deficit' OR 'Taoiseach')}. In total, 10,070 articles were retrieved by the keyword query. 10% of the retrieved articles were randomly selected and manually labeled for our experiments. The annotation process followed the coding guide provided by Baker et al. (2016) [3]. In this guide, a newspaper article is considered to be describing economic policy uncertainty if:

1. The article talks about uncertainty over who makes or will make policy decisions that have economic consequences.
2. The article talks about current and past uncertainty over what economic policy actions will be undertaken.
3. The article talks about uncertainty regarding the economic effects of policy actions.

The labeled dataset (1070 news pieces) was split into training, validation, and testing sets in the ratio 8:1:1.

USA newspapers: We used 12,000 news articles that were selected as an audit sample from the retrieved USA articles containing the terms 'economy' and 'uncertainty'. The domain experts labeled these selected news articles into binary categories of presence or absence of EPU, and into further EPU categories such as taxes, fiscal policy, and monetary policy. For our case, we only used the binary EPU categories for the experiments. Further details about the dataset can be obtained from the authors' website (https://www.policyuncertainty.com) [3]. The dataset was split into training, validation, and testing sets in the ratio 8:1:1.

Economic indicators
The economic indicators used for the econometric analysis are listed in Table 1.

Evaluation setup and metrics
We implemented the BERT [13] and RoBERTa [14] models alongside their embeddings using the Hugging Face (https://huggingface.co/) and Simple Transformers [59] libraries. Weak supervision approaches were implemented using Snorkel [60] (https://www.snorkel.org/) and Rubrix (https://rubrix.readthedocs.io/en/stable/guides/weak-supervision.html). Long short-term memory (LSTM) [61] was implemented using the Keras library [62] (https://keras.io/) and the support vector machine (SVM) classifier using the scikit-learn library [63] (https://scikit-learn.org/). The experiments for the neural models (BERT, RoBERTa, BERT + WS) were conducted with the Adam optimizer [64] (an algorithm for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments), an initial learning rate of 2e-5, a batch size of 8, and a maximum sequence length of 512 tokens. BERT and RoBERTa generated their own embeddings to be used for classification, BERT + WS used the embeddings generated by BERT, and standard bag-of-words representations using term frequency-inverse document frequency (TF-IDF) were fed into the classification pipeline for LSTM and SVM. The training and validation losses for each training epoch were monitored, and the models with the best accuracy on the validation set were saved before comparison on the test set.
The comparison of our proposed solution with the other models was done using F1-score, precision, and accuracy (Eqs. 13 to 15):

$$\text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \quad \text{Recall} = \frac{TP}{TP + FN} \qquad (13)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (14)$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (15)$$

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.

Experiments
This section describes the experimental results and evaluation of our weak supervision framework (BERT + WS), involving a BERT model fine-tuned with noisy labels generated by weak sources. We compared this framework with the models described in Sect. 4.2. Our baseline comparison is with keyword search, the current solution employed by economists to generate the EPU index from news articles. The results on the Irish news articles dataset (see Table 2) demonstrate that BERT + WS presents a significant improvement in precision (+20%) over just using keyword occurrences. BERT + WS also outperforms SVM and LSTM, but it is outperformed by RoBERTa (62.40% against 66.34%) and by BERT fine-tuned using human-annotated data (62.40% against 64.90%). Table 3 illustrates the results on the USA test dataset, which is 10 times larger than the Irish dataset. BERT fine-tuned with weak supervision presents a significant improvement (+19%) over keyword search, the current state-of-the-art solution for EPU indices. State-of-the-art neural models trained on human-annotated data still outperform models trained with weak-supervision-based noisy labels, by about 5% in precision and accuracy, but with comparable F1-scores.
It should be stressed that while the other models reported in this work require large amounts of expert-annotated labels, that effort is replaced in our method (BERT + WS) by designing labeling functions. To illustrate the difference in human effort involved, it took one of the authors of this work (who is trained in economics) just a week to write proper labeling functions for either dataset, with inspiration from the coding guide provided by [3], while it took a team of 14 expert human annotators 6 months to label 12,000 USA news articles [3]. The difference in performance (3%) can be traded off in most applications of economics, also making BERT + WS largely reproducible in a new context. Figure 4 shows the performance of the models using the receiver operating characteristic (ROC) curves on both the Irish and USA test datasets. The area under the curve for most models is larger on the USA dataset than on the Irish dataset; this indicates that the models have better discriminative power with respect to the EPU classes on the USA dataset than on the Irish dataset. There is also much more area covered by BERT + WS on the larger dataset (USA news articles), implying that fine-tuning with weak supervision also benefits from more training examples. Similar results were observed earlier with the evaluation metrics in Tables 2 and 3. Figures 2 and 3 show t-distributed stochastic neighbor embedding (t-SNE) [65] multidimensional projections of the neural embedding space of the news articles by the different models. We evaluated the models both visually and analytically using the silhouette coefficient [66]. In both datasets, BERT had the best silhouette coefficients. The visual separation and silhouette coefficients of all models are not particularly good; this is expected given the difficulty of EPU classification on text.

(Fig. 2: t-SNE projections of the embedding space of Irish news articles segregated according to economic policy uncertainty (EPU) categories. Point colors represent EPU categories and the values shown in parentheses represent the silhouette coefficient in the projected space. Projections labeled "human annotation" are the model's embedding space segregated by human-provided labels for the articles; BERT and BERT + WS projections were generated using BERT embeddings, LSTM and SVM projections using TF-IDF representations, and RoBERTa projections on top of RoBERTa model embeddings.)
Experimental results suggest that when a significant labeling budget and time are available, policy makers and economists can get better results employing neural models trained with human labels. In most applications, however, that trade-off (3-5%) in performance is not attractive. Additionally, designing labeling functions that accommodate new 'views' of the dataset is a straightforward activity, allowing flexibility in the process of adaptation to new tasks.

Limitations of the proposed approach
In this paper, we acknowledge the limitations of our proposed method, which is based on weak supervision. These limitations are inherent to weak supervision as a framework. The method's performance depends on the quality of the labeling functions that are used to encode task knowledge, and writing labeling functions can be challenging in domains with complex or high-dimensional features. Additionally, the accuracy of the generative models used to combine the labeling functions can be sensitive to the initialization, training epochs, and learning rates. In future studies, we aim to explore the impact of incorporating a small amount of labeled data alongside labeling functions on the performance of our approach for the identification of economic uncertainty and the calculation of the Economic Policy Uncertainty (EPU) index.

Economic policy uncertainty index
We used the predictions from BERT + WS to construct an Irish monthly EPU index from January 1992 to July 2021 using the following steps, adapted from Baker et al. (2016) [3]:

Step 1: We collected news pieces from relevant newspapers in the time window of interest (January 1992 to August 2021).
Step 2: We used BERT + WS to classify news articles describing policy uncertainty.
Step 3: We counted the number of articles with a positive label for economic policy uncertainty for each newspaper across months.
Step 4: We then computed the time-series variance $\sigma_i^2$ over the selected time interval for each newspaper and normalized each newspaper's time series of positive-EPU article counts by its standard deviation.
Step 5: We computed the mean M of the normalized time series of counts of news articles.

Step 6: We generated the EPU index by multiplying the normalized time series of counts by (100/M).
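Steps 3-6 can be summarized in a few lines of pandas. A sketch assuming `counts` is a months-by-newspapers table of positive-EPU article counts (column names are hypothetical); the cross-newspaper averaging follows Baker et al. [3].

```python
import pandas as pd

def epu_index(counts: pd.DataFrame) -> pd.Series:
    """Compute the EPU index from monthly counts of positive-EPU articles.

    counts: rows = months, columns = newspapers, values = number of
    articles classified as positive for EPU in that month (Step 3).
    """
    # Step 4: normalize each newspaper's series by its own standard deviation.
    normalized = counts / counts.std()
    # Average across newspapers into a single monthly series.
    series = normalized.mean(axis=1)
    # Steps 5-6: rescale so the index has mean 100 over the sample.
    M = series.mean()
    return (100.0 / M) * series
```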

Econometric modeling with the weak supervision EPU index
The generated Irish EPU index captures both local and global events that are known to have caused economic uncertainty, not only in Ireland but in the entire world, as shown in Fig. 5. The index spiked highest in 2008, which we suspect is due to the uncertainty caused by the global financial crisis following the failure of Lehman Brothers [67,68]. The spike in 2016 may be attributed to uncertainty due to Brexit effects and local uncertain events like austerity protests. We can learn from the index that the Irish economy is considerably prone to policy uncertainty from abroad, which is partly explained by the open nature of the Irish economy [69].

Econometric model
We conducted an econometric analysis using vector auto-regression (VAR) models [70] to exploit time-series variation within the Irish macroeconomic indicators.

Vector auto-regression models (VAR)
More formally, consider $n \times 1$ macroeconomic time-dependent variables $Y_t = (y_{1t}, \ldots, y_{nt})^T$. We define a p-lag vector auto-regression, VAR(p), as

$$Y_t = c + \Phi_1 Y_{t-1} + \cdots + \Phi_p Y_{t-p} + \epsilon_t$$

where $c$ is an $n \times 1$ intercept vector, the $\Phi_i$ are $n \times n$ coefficient matrices, and $\epsilon_t$ is an $(n \times 1)$ unobserved zero-mean white noise vector process with time-invariant covariance matrix $\Sigma$ [71].
Our goal was to use the estimated model to understand whether our generated EPU index can be predictive of standard macroeconomic variables by using the impulse response functions generated from the estimated VAR model. Impulse response functions [72] are well-established tools in econometric analysis used to investigate how a change in a policy variable at time t causes changes in another variable after time t, taking into consideration the interactions among the variables.
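A sketch of this analysis with statsmodels, assuming `df` holds the monthly macroeconomic series used below (the column names are hypothetical). The Cholesky-orthogonalized impulse responses depend on the variable ordering in `df`.

```python
import pandas as pd
from statsmodels.tsa.api import VAR

# df: monthly DataFrame with (hypothetical) columns
# ['epu', 'unemployment', 'log_ip', 'log_cci', 'interest_rate', 'log_cpi'].
model = VAR(df)
results = model.fit(maxlags=12, ic="aic")  # lag order selected by AIC

# Orthogonalized impulse responses over a 10-month horizon; the Cholesky
# decomposition of the residual covariance makes column ordering matter.
irf = results.irf(10)
irf.plot(orth=True, impulse="epu")
```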

Granger causality
In econometric analysis, we are often interested in determining how variables affect or influence one another. Even though this is a well-studied problem in the statistics literature, determining the actual cause of a certain phenomenon is a non-trivial task. The complexity of the problem arises due to the existence of confounders, confusion from spurious correlations, and the difficulty of determining the direction of the relationship. For example, consider two variables X and Y; there are four possible relationships: X causes Y (X → Y), Y causes X (X ← Y), X and Y cause each other (X ↔ Y), or the two variables are independent (X ⊥ Y). Granger causality assumes that if X is a cause of Y, then X (cause) must occur before Y (effect) and can be used to predict it. We can thus deduce that forecasting future values of $Y_t$ with both the past target and the past source time series, $E(Y_t \mid Y_{<t}, X_{<t})$, is significantly more powerful than only using the past target time series, $E(Y_t \mid Y_{<t})$.

(Fig. 5: Economic Policy Uncertainty (EPU) index for Ireland, plotted from 1992 to 2021, generated using the weak supervision-based methodology; higher index values mean that more policy uncertainty was perceived at that time. The graph is annotated with the potential events responsible for the extreme high values.)

(Fig. 6: Impulse response function of the response of the consumer price index to policy uncertainty shocks for 10 months ahead.)
To infer the Granger causality of two variables, we consider the following two equations:

$$Y_t = \sum_{i=1}^{p} a_i Y_{t-i} + N_t$$

$$Y_t = \sum_{i=1}^{p} a_i Y_{t-i} + \sum_{i=1}^{p} b_i X_{t-i} + \tilde{N}_t$$

where $N_t$ is the noise term for predictions without X and $\tilde{N}_t$ is the noise term for predictions including X. X is said to Granger-cause Y whenever the noise term $\tilde{N}_t$, with predictions of X included, has a significantly smaller variance than the noise term $N_t$ obtained without X [73].
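The corresponding test is available in statsmodels; a sketch over the same hypothetical `df`, testing whether the EPU index Granger-causes the (log) consumer price index.

```python
from statsmodels.tsa.stattools import grangercausalitytests

# Does the EPU index Granger-cause the (log) consumer price index?
# Column order matters: the test asks whether the second column
# helps predict the first.
res = grangercausalitytests(df[["log_cpi", "epu"]], maxlag=4)
# Each lag reports an F-test; p < 0.05 rejects the null of no Granger causality.
```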

Econometric analysis
We fitted a VAR model to monthly Irish data from January 1992 to August 2021 using a Cholesky decomposition [74] to recover the orthogonal shocks, with the following macroeconomic variables: the EPU index, the unemployment rate, the logarithm of industrial production, the logarithm of the consumer confidence index, short-term interest rates, and the logarithm of the consumer price index (CPI).
To ensure that the series are stationary (a precondition for VAR analysis), we conducted unit root tests on the variables using the augmented Dickey-Fuller (ADF) test [75]. The results are shown in Table 4.
Observations from Table 4 reveal that our macroeconomic variables of interest are stationary, since their p-values are less than the 0.05 significance level. We therefore reject the null hypothesis of the presence of a unit root and conclude that our variables are stationary at the 5% level of significance. Akaike's information criterion (AIC) [76] was then used to find an optimal time lag for fitting the VAR model, and 4 months was found to be optimal.

(Fig. 7: Impulse response function of the response of industrial production to policy uncertainty shocks for 10 months ahead.)

(Impulse response function of the reaction of the Business Confidence Index to policy uncertainty shocks for 10 months ahead.)
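A sketch of this unit root screening with statsmodels' adfuller, again over the hypothetical `df` used above.

```python
from statsmodels.tsa.stattools import adfuller

# Augmented Dickey-Fuller unit root test for each macroeconomic series.
for col in df.columns:
    adf_stat, p_value, *_ = adfuller(df[col].dropna())
    verdict = "stationary" if p_value < 0.05 else "non-stationary"
    print(f"{col}: ADF = {adf_stat:.3f}, p = {p_value:.4f} -> {verdict}")
```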
The impulse response function shown in Fig. 6 indicates that the consumer price index responds slowly to policy uncertainty shocks, but the effects of policy uncertainty continue to reduce the consumer price index for the next 10 months after the shock. This negative relationship is further supported by our causality analysis with the Granger causality test [77], which is statistically significant at the 5% level of significance (F-value = 3.42014, p = 0.009255). This confirms that policy uncertainty negatively impacts the consumer price index.
Industrial production responds sharply to a policy uncertainty shock within the first two months, then rises back to normal and starts to decline gradually after the 4th month, as shown in Fig. 7. Our Granger causality test, however, shows that policy uncertainty is not predictive of industrial production at the 5% level of significance, although the test is significant at the 10% level (F-value = 2.3126, p = 0.08783).
The unemployment rate responds gradually to a policy uncertainty shock, with the smallest effects experienced within a month after the shock, but the effects continue to be felt even after 10 months. We note that this relationship is not supported by the Granger causality test at the 5% level of significance. Figure 7 demonstrates that policy uncertainty reduces business confidence with immediate effect, and the effect of the policy uncertainty shock continues to be felt until the 6th month. Statistical analysis at the 5% level of significance shows that policy uncertainty can also be predictive of business confidence (F-value = 2.4835, p = 0.04412).

Conclusion
In this paper, we explore the great promise of leveraging news articles for understanding decision making in society, through a novel economic index known as the economic policy uncertainty (EPU) index. The EPU index measures economic policy uncertainty from keyword occurrences in news articles. Our contribution is in the identification phase, where we propose to identify news articles that are about EPU with computational methods from natural language processing and machine learning. This is in contrast to the simple keyword occurrences [3] currently employed in constructing the index.
Our solution involves fine-tuning BERT with noisy labels generated by weak supervision (BERT + WS). The rationale for the methodology is that it does not require experts to label news articles but rather to provide relevant keywords, which is often much easier for experts than labeling. The proposed method provides competitive performance in terms of accuracy on the Irish news dataset (62.40% versus 39.00%) and the USA news dataset (69.40% versus 38.31%), as well as across the other evaluation metrics shown in Tables 2 and 3, compared to the baseline method of keyword occurrences.
We also present further analysis and comparison of our proposed method (BERT + WS) with other state-of-the-art methods fine-tuned with human-annotated data. Models trained with human-annotated data still have a performance advantage over our method. However, this advantage comes at the cost of a painstaking data labeling process that requires domain-level knowledge and time not readily available. The gap in performance is small, and the trade-off can be accommodated in most economic applications. The weak supervision framework presented here (BERT + WS) aims at timely results for policy decisions compared to spending hundreds of hours on data annotation. For future work, we intend to explore complementing weak supervision with a small set of carefully selected human-annotated examples through active learning or data subset selection, as well as working on strategies for labeling functions and multi-label classification with regard to different types of policy uncertainty.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.