News and ESG investment criteria: What’s behind it?

Pikatza-Gorrotxategi, Naiara; Borregan-Alvarado, Jon; Ruiz-de-la-Torre-Acha, Aitor; Alvarez-Meaza, Izaskun

doi:10.1007/s13278-024-01209-w

Original Article
Open access
Published: 27 February 2024

Volume 14, article number 47, (2024)

Social Network Analysis and Mining

Abstract

News written in the press about different companies generates consumer feelings that can condition the reputation of these companies and, consequently, their financial results. One of the practices that might improve a company’s reputation is the Environmental, Social and Governance (ESG) investment criteria. In this research, using Natural Language Processing techniques like Sentiment Analysis and Word2Vec, we detected those ESG-related terms that the written press uses in news articles about companies. Thus, we have been able to discover and analyze those terms that improve sympathy toward companies, and those that worsen it. Our findings show that those terms related to sustainable development, good social practices and ethical governance improve the general public’s opinion of a company, while those related to greenwashing and socialwashing worsen it. Therefore, this methodology is valid for enabling companies to detect those terms that improve or worsen their reputation, and thus help them make decisions that improve their image.

1 Introduction

News published in the written press about different companies originates from the practices and events of these companies themselves. In turn, once these news items are published they project an image of these companies, which influences their reputation. Therefore, business practices (social, productive, economic, environmental and/or corporate) influence social opinion through what is said about them in the news, which, in turn, causes society to influence such practices through the image that is projected of them. Moreover, society is increasingly demanding social responsibility from companies, and requesting that they account for the social and environmental consequences of their actions. One way of measuring these social and environmental consequences is the Environmental, Social and Governance (ESG) investment criteria. These ESG criteria are a set of standards for a company’s behavior and are used as a tool for analysis, with which companies can try to measure their Corporate Social Responsibility (CSR), i.e., the degree of responsibility that the company adopts toward society (Porter and Kramer 2006). The ESG criteria for companies refer to the environmental, social and corporate governance factors that can be taken into account when investing in a company (Initiative 2005), as they influence the company in the form of corporate image. It is therefore a tool for analyzing the company’s environmental and social policies, which, in turn, can influence the company’s finances, in the form of reputation and image (good or bad).

ESG investment criteria are increasingly relevant when it comes to investing in a company. Indeed, they were priority topics at the World Economic Forum and the Davos Forum 2022 (ESG and Sustainable Finance Data Skills and Capacity Building Directory, 2020), (Davos 2022: How Businesses Can Deliver on ESG Promises | World Economic Forum, n.d.). In fact, for several years, many authors have studied the relationship between the application of ESG criteria and financial performance of the companies. Thus, Friede et al. (2015) demonstrate, through an exhaustive review, that applying ESG criteria in companies leads to better financial results. According to Amir and Serafeim, the main motivations for companies to use ESG information are, in order of importance: return on investment, customer demand, product strategy and, lastly, ethical considerations (Amir and Serafeim 2018). Brooks and Oikonomou also address the relationship between ESG criteria and financial performance. These authors find a link that is positive and statistically significant—but economically modest—between ESG criteria and financial performance on a company level. According to their article, there is an asymmetry in the financial impacts of ESG, whereby the negative financial effects of corporate social irresponsibility are greater than the positive financial effects of corporate social responsibility (Brooks and Oikonomou 2018). In their research, Fatemi et al. (2018) conclude that the strengths of ESG criteria increase company value and ESG concerns decrease it. Finally, Lee et al. (2016) find a significant positive relationship on a company level between environmental responsibility and financial performance, and between environmental responsibility and operational performance.

In this relationship between ESG investment criteria and the company’s financial results, the company’s reputation or image is a vitally important variable since it affects consumer satisfaction (Chun 2005). One way of measuring a company’s image is by taking into account two indicators: The first one is the sympathy that the company generates in society in general, and the second one is the company’s good financial results (Raithel et al. 2010). Society receives this data about companies, from external sources such as word of mouth, news, advertising, etc., and then forms an image of the company’s reputation (Kossovsky 2012). That is why, by performing a sentiment analysis (SA) of written news about companies, it is possible to measure the reputation they have in society. A positive Sentiment Analysis of news about companies will generate sympathy toward them, improving their reputation.

In this context, SA—a sub-discipline within data mining and computational semantics—is one way of measuring the image projected by news sentiments. According to Pang and Lee (2008), SA is a dynamic and extensively researched subject in the field of natural language processing (NLP). Its main objective is to computationally process the subjectivity in a given text and analyze the opinions, emotions, evaluations, and feelings of individuals. This powerful technique allows for a deeper understanding of data gathered from sentiment-rich sources such as news articles, social media platforms, reviews, and other similar content (Kim 2015). As a result, SA serves the purpose of extracting sentiments and emotions from text, finding applications in various domains, ranging from assessing customer satisfaction to understanding political opinions (Mäntylä et al. 2018; Pak and Paroubek 2010).

One limitation of SA is that it scores the degree of positivity or negativity of a sentiment without explaining the underlying reasons for it. SA only indicates the extent to which a sentiment is better or worse, as it provides degrees of sentiment. Consequently, when sentiments are extracted from news articles about companies, the analysis remains incomplete, because the aim is to understand why those sentiments arise. Upon identifying this limitation, a bibliometric review was conducted, which revealed no existing research examining the meaning of ESG-related terms within written news articles based on previously generated sentiment degrees (Liu et al. 2023; Mandas et al. 2023; Park et al. 2022; Salas-Zárate et al. 2017; Zeidan 2022).

The aim of this article, therefore, is to identify from written news those issues related to ESG investment criteria that influence whether a company has a better or worse reputation among consumers.

To achieve this, we will firstly identify news written in the press about certain companies. Then, from these news items and using SA techniques, a distinction will be made between those that generate positive and those that generate negative feelings. Finally, we will detect those terms related to ESG investment criteria through Word2Vec techniques executed in Python. It is possible to quantitatively obtain the vector distances between the different terms or words analyzed (word-embeddings), in order to observe those that are closer to—and therefore have greater affinity (Banawan et al. 2023) with—the term or terms of study in this research.

Therefore, thanks to NLP techniques (the combination of SA and Word2Vec methods or models), it is possible to detect, through the terms extracted from the news, the factors that influence whether a company has a better or worse reputation among consumers. As a result, companies will be able to identify, from the published news, those terms close to the ESG investment criteria that have a positive or negative influence on their own image. This can be a useful tool for helping companies identify which of their ESG-related practices improve their reputation and which worsen it. In this way, they will be able to make strategic decisions to improve their image and, consequently, their financial results, through consumer behavior.

2 Methodology

The methodological process applied in this research is shown below (Fig. 1):

2.1 Database definition

The first step in the methodological process was to choose the business sample. A sample of financially consistent companies was sought. For this purpose, we selected the companies from the Eurostoxx 50 that had obtained the best dividend yield at the search date (May 2021). The eight companies with the best financial performance were as follows: Allianz, BASF, BNP Paribas, Daimler, Engie, Eni, ING and Intesa Sanpaolo (Cotizacion de EURO STOXX 50®—Indice—Resumen—Rentabilidad-Dividendo, n.d.).

2.2 Data extraction

After choosing the companies, the next objective was to retrieve the news written in the press about those companies. To do this, the original source was used to locate these news items. The query used in each case was the name of the company about which the search was being performed. The 500 most relevant news items per year were chosen for each of the companies from a time period covering 2017 to 2021. Where a company did not reach 500 news items in a given year, all of them were chosen. In total, 19,953 news items were downloaded, distributed as follows according to the year (Table 1):

Table 1 Number of news items analyzed

Therefore, 2500 news items per company were downloaded (500 news items per year for 5 years) except in three cases: Intesa Sanpaolo, with 1768 news items, and ING with 661.

2.3 Cleaning and classification

Once the news items had been downloaded, they were imported into the data mining software Vantage Point (Liu and Liao 2017). The data were then structured for subsequent export.

2.4 Main corpus creation

Once the data had been cleaned and classified, we then had a corpus with which to proceed to the next step—Sentiment Analysis. The aim here was to detect the topics that influence the reputation of the companies, both positively and negatively. For this purpose, two news corpora were created: the first made up of those news items that obtained a positive Sentiment Analysis, and the second of the news items that had negative results.

2.5 NLP: Sentiment analysis (main corpus)

The news items could then be exported to Orange, a machine learning and data mining suite for data analysis through Python scripting (Demšar et al. 2013). A Sentiment Analysis of the extracted news was performed using the VADER and Hu Liu tools:

The Python tool Valence Aware Dictionary and Sentiment Reasoner (VADER) is a Sentiment Analysis framework that employs a lexicon-based approach to ascertain the sentiment values of a sentence. VADER has proven to be highly effective in analyzing social media texts, NY Times editorials, movie reviews, and product reviews (Abdul-Rahman et al. 2020; Thu and Aung 2018; Shapiro et al. 2020; Yu et al. 2021; Medhat et al. 2014). The success of VADER stems from its ability to provide not only Positivity and Negativity scores but also to quantify the degree of positivity or negativity in a given sentiment (Tunca et al. 2023; Simplifying Sentiment Analysis Using VADER in Python (on Social Media Text) | by Parul Pandey | Analytics Vidhya | Medium, n.d.).
The Hu and Liu lexicon is another commonly utilized tool designed specifically for Sentiment Analysis of customer reviews. It classifies words into three resulting categories: Sentiment (a global measure of positivity), Positive, and Negative. The reason for selecting this tool is that it has been predominantly used in studies that do not center around textual production in social media. Its application has shown effectiveness in analyzing customer feedback and reviews in various domains (Khoo and Johnkhan 2018).

Given that there are two suitable tools, the first step in measuring the reputation of companies will be through Sentiment Analysis of published news, measured with VADER and Hu Liu.
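The lexicon-based idea behind both tools can be illustrated with a minimal sketch. The word lists below are a hypothetical mini-lexicon invented for this example, and the scoring formula is an assumption; neither reproduces the actual VADER or Hu and Liu lexicons or their scoring rules.

```python
# Toy lexicon-based sentiment scorer (illustrative only).
# POSITIVE_WORDS and NEGATIVE_WORDS are hypothetical mini-lexicons,
# not the VADER or Hu and Liu word lists.
POSITIVE_WORDS = {"growth", "sustainable", "profit", "strong"}
NEGATIVE_WORDS = {"fraud", "loss", "scandal", "fine"}


def lexicon_score(text: str) -> float:
    """Return 100 * (positive hits - negative hits) / number of tokens.

    A score above zero suggests positive sentiment, below zero negative.
    """
    # Split on whitespace, strip surrounding punctuation, lowercase.
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE_WORDS for t in tokens)
    neg = sum(t in NEGATIVE_WORDS for t in tokens)
    return 100.0 * (pos - neg) / len(tokens)
```

A call such as `lexicon_score("Strong growth and sustainable profit.")` yields a positive number, while a headline dominated by words from the negative list yields a negative one.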

2.6 Sub-corpora creation

Once the results of the Sentiment Analysis had been obtained, two differentiated corpora were created from the main corpus, with all the news items. The first corpus was comprised of all those news items that had obtained a positive number in the Sentiment Analysis with both tools (VADER and Hu Liu). The second corpus was composed of all those news items that had obtained at least one negative Sentiment Analysis with either of the two tools.
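The splitting rule described above can be sketched as follows. The function name and the tuple layout are hypothetical; the text does not state how items with a zero score and no negative score were handled, so this sketch assumes they are left out of both corpora.

```python
def split_corpora(items):
    """Split scored news items into positive and negative corpora.

    items: iterable of (text, vader_score, hu_liu_score) tuples
           (hypothetical layout chosen for this example).
    """
    positive, negative = [], []
    for text, vader, hu_liu in items:
        if vader > 0 and hu_liu > 0:
            # Positive with BOTH tools -> positive corpus.
            positive.append(text)
        elif vader < 0 or hu_liu < 0:
            # Negative with AT LEAST ONE tool -> negative corpus.
            negative.append(text)
        # Assumption: remaining items (a zero score, no negative score)
        # are assigned to neither corpus.
    return positive, negative
```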

2.7 NLP: correlation

2.7.1 NLP: terms related to ESG

The identification of the terms most related to ESG was carried out in each of the two corpora (positive and negative), via Natural Language Processing (NLP) techniques. Those terms were environment, environmentally, social, socially and governance. This was done through Word2Vec (NLP) models generated and executed in Python, in order to quantitatively obtain the vector distances of several terms, with a value of zero corresponding to the word vectorially closest to the chosen terms, and a value of one to that furthest away.
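As a minimal sketch of how such vector distances can be computed, the example below assumes that the distance is the cosine distance (one minus the cosine similarity), which is zero for vectors pointing in the same direction. The two-dimensional vectors are toy values invented for illustration, not values from the trained models.

```python
import math


def cosine_distance(u, v):
    """Cosine distance: 1 - cos(angle between u and v)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_terms(query, embeddings, topn=3):
    """Return the topn (term, distance) pairs closest to the query term."""
    q = embeddings[query]
    distances = [
        (term, cosine_distance(q, vec))
        for term, vec in embeddings.items()
        if term != query
    ]
    # Smaller distance means greater affinity.
    return sorted(distances, key=lambda pair: pair[1])[:topn]


# Toy embeddings (hypothetical values for illustration only).
toy = {
    "social": [1.0, 0.0],
    "socially": [0.9, 0.1],
    "dividend": [0.0, 1.0],
}
closest = nearest_terms("social", toy)
```

Here `closest` lists "socially" before "dividend", because its vector points in nearly the same direction as that of "social".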

2.7.2 NLP: visual representation

A visual representation of the data was obtained. By means of a conversion to a tabular structure in Python, this new information format, comprising the vector distances of the words and their metadata, was imported into the TensorBoard Embedding Projector tool, thus obtaining a visual representation of the set of words that make up the word-embedding developed in step 2. The terms obtained were analyzed by comparing both corpora, detecting those terms that may have a positive or negative influence on the company’s reputation.

Following the prior generation of two corpora (positive and negative) and their subsequent cleaning, a Word2Vec model—using NLP techniques through Python—was then obtained for each corpus, with information on the set of vectors of the terms that make up the corpus (word-embedding). The set of vectors provides us with the vectorial distance between the different terms (or terms to be analyzed), so that we can establish those that are most similar to each other (Savytska et al., 2021).

The terms analyzed, in order to identify the words that are closer and therefore related (the smaller the vector distance, the greater the affinity), were: “environment,” “environmentally,” “social,” “socially,” and “governance.” These terms were chosen because they are the ones that make up the acronym ESG (Environmental, Social and Governance). In addition, when applying NLP techniques using Python, it was observed that the words “environmentally” and “socially” appear with a high frequency in the two generated corpora; so in order to cover the maximum number of terms referring to the ESG concept, these two terms were also analyzed and their corresponding Word2Vec model created.

The most important configuration used in Python during the application of NLP techniques in the generation of Word2Vec models was as follows:

Vector size: The word vectors used have a dimension of n = 200.
The architecture used to train the algorithm was the so-called skip-gram.
Negative sampling was used to train the model.
min_count: All terms with a total frequency of less than five were not taken into consideration.
Window: The maximum distance between the term to be studied and the word to be predicted within the corpus sentences was five.
Epochs: The number of iterations performed on each corpus was 10.

Next, Fig. 2 displays the Python code developed, incorporating within it, as an example, the term “social.”
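A minimal sketch of how the listed settings could be expressed is given here, assuming the gensim library (version 4.x parameter names); the text does not name the library actually used. The number of negative samples is not stated in the text, so gensim's default of 5 is assumed, and the function name `closest_terms` is hypothetical.

```python
# Settings from the list above, written with gensim 4.x parameter names.
# The choice of gensim is an assumption.
W2V_PARAMS = {
    "vector_size": 200,  # word vectors of dimension n = 200
    "sg": 1,             # skip-gram architecture
    "hs": 0,             # no hierarchical softmax, so negative sampling is used
    "negative": 5,       # assumed: gensim default, not stated in the text
    "min_count": 5,      # ignore terms with total frequency below five
    "window": 5,         # maximum distance between target and predicted word
    "epochs": 10,        # iterations over each corpus
}


def closest_terms(sentences, term="social", topn=10):
    """Train a model on tokenized sentences; return (word, distance) pairs.

    sentences: list of token lists, one per sentence of a corpus.
    """
    # Imported lazily so the settings above can be inspected without gensim.
    from gensim.models import Word2Vec

    model = Word2Vec(sentences=sentences, **W2V_PARAMS)
    # most_similar returns cosine similarities; convert them to distances.
    return [
        (word, 1.0 - similarity)
        for word, similarity in model.wv.most_similar(term, topn=topn)
    ]
```

Under these assumptions, one model would be trained per corpus (positive and negative), and `closest_terms` would be called once for each of the five ESG terms.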

Subsequently, in order to provide another approach, these terms and their related terms were visualized in two dimensions using the Tensorflow Embedding Projector tool (Visualizing Data Using the Embedding Projector in TensorBoard|TensorFlow, 2022). For this purpose, the final Word2Vec models using Python were converted to tabular format, and these were imported into the Tensorflow Embedding Projector for subsequent mapping of the terms to be analyzed. Within this tool, the most important configuration applied was the following:

Data option: Word2Vec 10 K, as it adjusts to the dimension of n = 200 defined above.
Cosine distance: since the data distribution is unbalanced.
Number of iterations: 10,000 (stable projection).
Projection type: t-distributed stochastic neighbor embedding (t-SNE), since it fits correctly to two and three-dimensional displays (Skublov et al. 2022).
Data points: Since these are corpora with many terms, and in order to eliminate unwanted and non-valuable information, the number of points (terms) was reduced to 1000.

The described configuration is as follows:

Once the NLP analysis in Python has been exported to Word2Vec format, it is uploaded in tabular format to the online tool TensorFlow Embedding Projector, as shown in Fig. 3 below.
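The conversion to tabular format can be sketched as follows. The Embedding Projector accepts a tab-separated file of vectors (one row per term) and a metadata file of labels (one per row, with no header when there is a single column). The function name and the toy embeddings are hypothetical.

```python
def to_projector_tsv(embeddings):
    """Build the two TSV payloads expected by the Embedding Projector.

    embeddings: dict mapping each term to its vector (list of floats).
    Returns (vectors_tsv, metadata_tsv) as strings, rows in the same order.
    """
    vector_lines, label_lines = [], []
    for term, vector in embeddings.items():
        # One tab-separated row of coordinates per term.
        vector_lines.append("\t".join(str(x) for x in vector))
        # Matching label on the same row number of the metadata file.
        label_lines.append(term)
    return "\n".join(vector_lines) + "\n", "\n".join(label_lines) + "\n"


# Toy two-dimensional embeddings (hypothetical values).
vectors_tsv, metadata_tsv = to_projector_tsv(
    {"social": [0.1, 0.2], "governance": [0.3, 0.4]}
)
```

The two strings can then be written to files (for example `vectors.tsv` and `metadata.tsv`, names chosen here for illustration) and loaded into the tool.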

With the parameters set according to the defined methodology and after over 10,000 iterations, we obtain, as depicted in Fig. 4, the visual representation of words related to a positive outcome (and vectorially closer) concerning the term “environment.” For the remaining analyzed terms, the steps and configurations used are identical, except that when observing words with vectorially closer negative meanings to a term, the negative Word2Vec model, previously generated, has been loaded in tabular format instead of the positive one. Hence, as the configuration utilized for visual study remains standardized throughout this scientific work, for better reader comprehension and observation, the forthcoming images exclusively capture the visual analysis.

As seen on the right-hand side of Fig. 4, the TensorBoard Embedding Projector also provides us with terms that are vectorially closest to the search word (in this case, “environment”). The limitation present in this case is that, in order for the system to perform adequately within an acceptable computation time, we must significantly reduce the word sample, as indicated by the TensorBoard Embedding Projector itself, as depicted in the following Fig. 5.

The reduction of the sample to a maximum of 10,000 words or points would involve a reduction (or non-utilization) of 68% of the terms from the positive corpus and 59% from the negative corpus. Therefore, by reducing the sample and eliminating such a significant number of terms, the list of terms that are vectorially closest to the search word provided by the TensorBoard Embedding Projector and their vector distances differ from the results of our unfiltered corpora. This is precisely why the NLP methodology was applied using Python. This approach ensures that we consider all terms from our corpora (a wider terminology) and obtain a more accurate calculation of vector distances concerning the term under study.

3 Results and conclusions

The results and conclusions, outlined in their respective sections, were derived from the methodology described earlier. As detailed in the methodology, the primary corpus yielded the initial results. The results for each company were obtained after applying the SA with VADER and Hu Liu to the corpus of news. The relevant conclusions were then drawn based on these results. By utilizing NLP to extract terms from the sub-corpora and visualizing the data, we interpreted the outcomes to arrive at the final conclusions.

4 Results

4.1 NLP: sentiment analysis (main corpus)

Table 2 shows the results obtained from applying Sentiment Analysis to the different news corpora. In this case they have been divided by company and year, from 2017 to 2021. The numbers indicate the degree of “sentiment” obtained by each company each year, when applying the two SA techniques, VADER and Hu Liu. Figures below zero (shown in red) indicate a negative result, i.e., the sentiments extracted from those news items were negative. Conversely, if the figure is greater than zero, those news items generated positive sentiments or connotations.

Table 2 Sentiment analysis applied to the news corpus

In order to study the reliability of the two Sentiment Analysis tools, the Pearson correlation coefficient was calculated, with the results giving a coefficient between VADER and Hu Liu of 0.5624. Pearson’s correlation coefficient ranges from minus one to one. A value close to one indicates a strong positive correlation, while a value close to minus one indicates a strong negative correlation. A value close to zero indicates a weak or no correlation. In this case, a correlation coefficient of 0.5624 suggests that there is a moderately positive relationship between the two columns of VADER and Hu Liu numbers.
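For reference, the Pearson coefficient between two columns of scores can be computed as in the sketch below. The function is a plain implementation of the standard formula; the score lists in the usage note are hypothetical and are not the values of Table 2.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each column.
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

With two hypothetical columns such as `vader = [0.12, -0.05, 0.30, 0.08]` and `hu_liu = [1.1, 0.2, 2.4, -0.3]`, the call `pearson(vader, hu_liu)` returns a value between minus one and one that can be read as described above.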

4.2 NLP: correlation

Word2Vec (NLP) techniques were used in each of the two corpora obtained by applying SA (the one formed from news that obtained a positive result and the one formed from news with a negative result). This was done by introducing terms related to ESG in the Python code. The terms were: environment, environmentally, social, socially and governance.

4.3 Environment and environmentally

The first study terms, environment and environmentally, were entered into the Python code. In this way, we quantitatively obtained, for the positive and negative corpora, the terms with the lowest vectorial distance (see Table 3). A low distance indicates related words, since these terms repeatedly appear close to environment and environmentally within the sentences that make up the corpora generated from news in the written press about companies.

Table 3 Terms classified by ESG term (environment/environmentally) and corpus

It should be noted that the greater the existing affinity, the closer the vectorial distance is to the value of zero; and, consequently, the lower the affinity, the closer the value will be to one.

4.4 Social and socially

The same process was then carried out, but this time introducing the terms social and socially into the model. The terms that were retrieved according to the vectorial distance in each corpus (positive and negative) are shown in Table 4.

Table 4 Terms classified by ESG term (social/socially) and corpus

4.5 Governance

Finally, the process was repeated, but this time with the third component of the initials ESG, Governance. Once again, the terms that were retrieved according to the vector distance in each corpus (positive and negative) were those shown in Table 5.

Table 5 Terms classified by ESG term (governance) and corpus

In order to draw conclusions about these terms, it was decided to classify them. The terms obtained in each corpus (positive and negative) were classified by topic: on the one hand, those terms related to ESG investment criteria were grouped together; on the other hand, those related to the ECONOMY; and finally, those with POSITIVE and NEGATIVE connotations were also grouped together. Those terms that did not belong to any of these sections were grouped in the “NON CLASSIFIED TERMS” section. Any term belonging to more than one section appears in all of the sections to which it belongs. This process was carried out three times: first with the data obtained from the terms “environment” and “environmentally” (from Table 3 to Table 6); secondly, with the results obtained by introducing the terms “social” and “socially” into the model (from Table 4 to Table 7); and finally, with the data obtained by introducing the term “governance” into the model (from Table 5 to Table 8). The results obtained in each of the three cases are as follows:
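The classification rule, in which a term may appear in several sections and unmatched terms go to “NON CLASSIFIED TERMS”, can be sketched as below. The text does not describe how the classification was carried out in practice; the section memberships here are an illustrative assumption built from a few terms mentioned in the discussion, and the function name is hypothetical.

```python
# Illustrative section memberships (assumed for this example only).
SECTIONS = {
    "ESG": {"lower-carbon", "zero-carbon", "wellbeing", "cleanest"},
    "ECONOMY": {"cost-effective", "cost-efficient", "value-add"},
    "POSITIVE": {
        "wellbeing", "cleanest", "lower-carbon", "zero-carbon",
        "cost-effective", "cost-efficient", "value-add",
    },
    "NEGATIVE": {"irresponsible"},
}


def classify(terms):
    """Assign each term to every section it belongs to."""
    table = {name: [] for name in SECTIONS}
    table["NON CLASSIFIED TERMS"] = []
    for term in terms:
        matched = [name for name, members in SECTIONS.items() if term in members]
        for name in matched:
            # A term in more than one section is listed in all of them.
            table[name].append(term)
        if not matched:
            table["NON CLASSIFIED TERMS"].append(term)
    return table
```

For example, `classify(["lower-carbon", "value-add"])` lists "lower-carbon" under both ESG and POSITIVE, and "value-add" under both ECONOMY and POSITIVE.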

Table 6 Terms from Table 3 classified by section and corpus

Table 7 Terms from Table 4 classified by section and corpus

Table 8 Terms from Table 5 classified by section and corpus

In order to provide a visual appreciation of the vectorial distances, thanks to the conversion to tabular format using Python and the subsequent import into the TensorFlow Embedding Projector tool, different analyses were carried out on the basis of the new perspectives and/or visual models (Figs. 6, 7 and 8).

5 Discussion and conclusions

5.1 NLP: sentiment analysis. main corpus

To check the reliability of the data obtained from the Sentiment Analysis of the news, we first analyzed the tools used, in this case VADER and Hu Liu. For this purpose, the Pearson correlation was calculated between the data obtained with VADER and Hu Liu. In this case, the correlation coefficient of 0.5624 suggests that there is a moderately positive relationship between the two columns of numbers in the two tools. As the analysis coincides, it can be concluded that both techniques are valid for calculating news Sentiment Analysis, and therefore the data obtained are reliable.

Another result which allows us to conclude that the data obtained in the Sentiment Analysis are reliable is that negative results were only obtained in 14 out of 45 total cases, i.e., in 31.1%. The companies that the news reports refer to are financially consistent, and those news reports produce sentiments with positive connotations. In other words, financially consistent companies “produce” positive sentiments, and one of the variables for measuring the good reputation or image of a company is its financial consistency (Raithel et al. 2010). From this, it can be concluded once again that the data obtained are reliable.

5.2 NLP: Correlation. sub-corpora: terms related to ESG

5.2.1 Environmental and environmentally

If we look at the data visualization of the term environment, with regard to the positive terms (green box), three main clusters can be observed. One of these clusters is composed of the term “environment” together with its related words. In addition, this cluster includes a considerable number of related terms, thus generating a significant and noteworthy area, which reflects its importance and influence and its high frequency of appearance in the different news items in the written press. As for the negative terms (red box), two main clusters can be seen, which indicates a lower segmentation, but the same explanation holds; i.e., the cluster generated by the term “environment” and its related terms is relevant and, therefore, remarkable within the “negative” corpus of news in the written press.

Regarding the terms related to “environmental” and “environmentally,” the following was highlighted: The positive corpus contains many terms associated with ESG investment criteria, and several of them have a positive connotation (wellbeing, cleanest, lower-carbon, zero-carbon); in turn, the negative corpus has only one term associated with ESG criteria, and it has a neutral connotation (socially). As for the terms associated with ECONOMY, there are several characteristics: Among the terms extracted from the positive corpus, some of them have a positive connotation (cost-effective, cost-efficient, industry leading, value-add), and several of them are related to productivity. In the negative corpus, on the other hand, some economic terms refer to capital or property (owning, rewarding). Moreover, terms associated with intentionality, i.e., actions that can help to achieve a desired result, were also detected: influenced, manageable, calculate, geared, and facilitating. Finally, the positive corpus contains many terms with positive connotations (8), and none with negative connotations. The negative corpus, on the other hand, despite containing several positive terms (4), has many more negative ones (13).

Several conclusions can be drawn from the results obtained. On the one hand, the fact that there are terms with a positive connotation in the positive corpus and terms with a negative connotation in the negative corpus confirms the reliability of the data and of the methodological process. On the other hand, terms related to the ESG criteria appear in the positive corpus, meaning that ESG criteria are associated with good practices. Moreover, the fact that there are so many terms associated with the economy indicates the close relationship between the environment (keyword) and the economy, supporting the initial thesis that ESG investment criteria are closely linked to the company’s reputation and, therefore, to its financial results. It can also be seen that several of the economic terms extracted from the positive corpus indicate good results in terms of productivity; i.e., they focus on the process, on how things are done, which, linked to ESG terms, can be related to sustainable development. The concept of sustainable development implies imposing limits on technology and the social organization of environmental resources to absorb the effects of human activity (Kates et al. 2005; Geissdoerfer et al. 2016). In contrast, the economic terms in the negative corpus refer to raising capital. If we relate this to the fact that there are also many terms that indicate intentionality, it can be associated with the “use” of the environment as a reputation-enhancing tool, i.e., with greenwashing, or how companies deceive consumers about their environmental performance. Such practices can have negative effects on consumer and investor confidence (Delmas and Burbano 2011; Strauß 2022; Mendonça et al. 2023).

Therefore, we have detected the practices related to the environment within the ESG investment criteria that improve and worsen the reputation of companies in the news: those related to sustainable development improve it while those related to greenwashing worsen it.

5.2.2 Social and socially

Regarding the visualization of the data with the term social, among the positive terms (green box), the term social belongs to the main cluster, but does not stand out as an independent cluster. Therefore, it is an important but not crucial term in the various news items analyzed. This visual information is consistent with the analysis of vectorial distances (see Table 4), which also shows that most of the words related to the term social have vector distances greater than 0.5. As for the negative terms (red box), there is no segmentation since there is only one cluster, which includes the term “social.” In this case, as with the positive terms, it is a notable but not crucial term, which coincides with the quantitative analysis corresponding to the vectorial distances.

Once again, the reliability of the data and of the methodological process is confirmed. On the one hand, in the positive corpus there are more terms with positive connotations (15) than in the negative corpus (9). On the other hand, in the negative corpus there are more terms with negative connotations (6) than in the positive corpus (1). As for the terms associated with ESG investment criteria and the environment, almost all of them appear in the positive corpora (6) (environmentally, culturally, lower-carbon, greener, environmental, governance), while in the negative corpora only one associated term appears—governance. In other words, ESG investment criteria have a positive connotation in the press, and this can have an impact on the good image of the company.

The positive terms in the positive corpus can be classified into three large blocks: those related to ESG (greener, healthier and nurture); those related to the economy (value-add, prosperity, security, wellbeing and cohesion); and those related to ways of doing or acting (minded, professionally, trustworthy, proactive, conscious, emotionally). These terms relate mainly to a strong work ethic and to positive environmental, social and economic results. Therefore, news items that evaluate ESG investment criteria positively link work ethic to good environmental and social performance and to financial prosperity. As for the terms with a negative connotation (almost all of which appear in the negative corpus), they once again indicate either intentionality (influenced, manageable, calculate, facilitating, geared) or bad practices (butt, critic, misunderstood, irresponsible, discriminate). Since all these terms derive from the keywords social and socially, they can be read as pointing to a “use” of the company’s social dimension to achieve a good image, i.e., “socialwashing.” In fact, Nardi suggests that CSR communication can be decisive in discouraging “socialwashing” (Nardi 2022).

It is therefore clear that good social practices in companies get “good press” and, consequently, improve their image. On the other hand, social practices whose sole objective is to improve their image have the opposite effect.

5.2.3 Governance

Finally, with regard to the term governance, four clusters can be observed among the positive terms (green box). Two of these clusters are practically insignificant (“ssga” and “not-for-profit”), and the third (“thresholds,” “glow,” “values,” etc.) has little influence on the main one. The main cluster, which has the largest area, contains the term governance together with its related words, so both the cluster and the term are notable and influential in the news items analyzed. As for the negative terms (red box), there is a single cluster, which includes the term governance, so there is no segmentation. In this case, in contrast to the negative terms associated with social, the quantitative analysis of the vector distances supports the importance of the term governance and its related terms and, therefore, its high frequency of appearance and prominence within the “negative” corpus generated from the analyzed news.

When we introduce the term governance, unlike in the two previous cases (environmental/environmentally and social/socially), the differences between terms with positive and negative connotations are not apparent. In fact, in neither of the two corpora are there any terms with negative connotations. Once again, most of the extracted terms can be classified into ESG and ECONOMY.

The terms in the positive corpus are related to corporate management on the one hand (accountability, chairmanship, boardrooms) and to social responsibility on the other (diversity, responsibility, inclusivity). In other words, they deal with the responsible management of companies. The terms in the negative corpus also deal with corporate social responsibility (transparency, diversity, engagement, ethical). Among these, transparency and ethical stand out, in clear reference to a “clean” management of the company. However, unlike in the positive corpus, they do not appear in a general context but focus on specific companies and entrepreneurs (Landed-mills, Sarasin, Zeb, Deka, Black-Rock); that is, they deal with the ethical and transparent management of particular companies. As there are no adjectives or nouns with a negative connotation in the corpus, it is not possible to tell whether the coverage is critical or favorable. It can therefore be concluded that when news items refer to corporate governance, the focus is on the ethical and responsible management of certain companies.

5.3 NLP: Correlation—Sub-corpora—visual representation

In terms of data visualization using the NLP technique and Word2Vec models, the results obtained are, as expected, in accordance with the graphical representations observed in the TensorFlow Embedding Projector tool. The terms “environment” and “governance” appear in both the positive and negative variants, where they have overall vector distances between 0.325 and 0.474 and are always part of the main or large clusters. Therefore, their frequency of use and, consequently, their importance and influence in news about companies in the written press are quite remarkable. The term “social,” on the other hand, has an overall vector distance between 0.441 and 0.561 (except for the words “security” and “media”) and is not always part of the main or large clusters. Therefore, although it appears in the news, its frequency of appearance, and consequently its influence, is lower than that of “environment” and “governance.” This suggests that companies today give greater importance, within ESG, to the environmental and governance aspects than to the social aspect.
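The two reported distance ranges can be compared with a short sketch. Only the range endpoints come from the text. Using the midpoint as a crude summary and checking for overlap are assumptions of this example, not steps described by the authors.

```python
# Overall vector-distance ranges reported in the text, as (low, high).
reported_ranges = {
    "environment/governance": (0.325, 0.474),
    "social": (0.441, 0.561),
}

def midpoint(bounds):
    """Centre of a (low, high) range; a crude one-number summary."""
    low, high = bounds
    return (low + high) / 2

def overlap(a, b):
    """Return the interval shared by two ranges, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None

mid_env_gov = midpoint(reported_ranges["environment/governance"])
mid_social = midpoint(reported_ranges["social"])
shared = overlap(reported_ranges["environment/governance"], reported_ranges["social"])
```

The midpoints are about 0.40 and 0.50, in line with the reading that the terms related to “social” sit farther away. The ranges also share the interval from 0.441 to 0.474, so the separation between the keywords is a matter of degree rather than a clean split.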

In any case, it can be concluded that the news items about companies that appear in the written press deal with the issue of ESG business investment criteria. First, when news about companies discusses the environment, business practices related to sustainable development improve the company’s image, whereas those related to greenwashing worsen it. Second, with regard to corporate social practices, good practices improve the company’s image, while social practices whose sole objective is to improve that image (known as socialwashing) have the opposite effect. Finally, when news items refer to corporate governance, the focus is on the ethical and responsible management of certain companies.

5.4 Implications and limitations of the study and future research

The implications of the study for academics, business and society have been identified.

For academics, as this is a new methodology, it opens up a new perspective on SA research. In terms of interdisciplinary research, it facilitates collaboration between areas such as linguistics, computer science and the social sciences by merging text analysis and sentiment processing, thus fostering the exchange of knowledge and approaches. Moreover, because it is applicable to a wide range of subjective texts, from news to social media posts, it broadens the scope of research in areas such as psychology, sociology and communication.

For companies, it becomes a strategic tool for understanding and improving their brand image. By identifying terms that generate negative sentiment, companies can adjust their communication and marketing strategies to address issues and improve brand perception. It also provides an agile tool for monitoring brand reputation in real time, enabling a rapid response to changing trends and perceptions.

For society, enabling SA on various types of texts makes it possible to better understand perceptions, opinions and reactions to issues, products or companies. This promotes greater transparency in the information that is disseminated and helps people make more conscious consumer decisions and engage in informed discussions on social networks and in other media. In addition, society can influence companies to act more responsibly, as public perception can affect their image and reputation. Finally, this analysis can provide information on emerging social trends, changes in cultural perceptions and evolving attitudes toward different issues, which can be useful for governments, non-profit organizations and other actors in decision-making and strategic planning.

In terms of the limitations of the study, the feelings generated by certain topics and their associated words can evolve over time. The models and studies therefore require constant updating, as they may otherwise become obsolete. Access to the texts to be analyzed may also be difficult. In addition, privacy and ethical concerns must be considered, as misuse of personal data or misidentification of emotions could lead to unintended consequences. These limitations should be taken into account when applying this methodology, as they could affect the accuracy, applicability and ethics of the results obtained.

Future research in this field could focus on several aspects to improve and broaden its application. First, it could explore how this model can be automatically adapted and updated to reflect changing trends. Second, research could be extended to address linguistic and cultural diversity by developing SA models that are applicable to different languages and cultures. Finally, integration with other areas, such as artificial intelligence, psychology or sociology, could be explored to gain a deeper understanding of how emotions relate to other human aspects.

References

Amir AZ, Serafeim G (2018) Why and how investors use ESG information: evidence from a global survey. Financ Anal J 74(3):87–103. https://doi.org/10.2469/faj.v74.n3.2
Banawan MP, Shin J, Arner T, Balyan R, Leite WL, McNamara DS (2023) Shared language: linguistic similarity in an algebra discussion forum. Computers 12(3):53. https://doi.org/10.3390/computers12030053
Brooks C, Oikonomou I (2018) The effects of environmental, social and governance disclosures and performance on firm value: a review of the literature in accounting and finance. Br Account Rev 50(1):1–15. https://doi.org/10.1016/J.BAR.2017.11.005
Chun R (2005) Corporate reputation: meaning and measurement. Int J Manag Rev 7(2):91–109. https://doi.org/10.1111/j.1468-2370.2005.00109.x
Cotizacion de EURO STOXX 50® - Indice - resumen - Rentabilidad-Dividendo. (n.d.). https://www.eleconomista.es/indice/EUROSTOXX-50/resumen/Rentabilidad-Dividendo. Accessed on 14 July 2023
Delmas MA, Burbano VC (2011) The drivers of greenwashing
Demšar J, Curk T, Erjavec A, Gorup Č, Hočevar T, Milutinovič M, Možina M, Polajnar M, Toplak M, Starič A, Štajdohar M, Umek L, Žagar L, Žbontar J, Žitnik M, Zupan B (2013) Orange: data mining toolbox in Python. J Mach Learn Res 14(August):2349–2353
ESG & Sustainable Finance Data Skills and Capacity Building Directory. (2020). https://initiatives.weforum.org/sustainable-finance-data-skills-and-capacity-building/home. Accessed on 13 June 2023
Fatemi A, Glaum M, Kaiser S (2018) ESG performance and firm value: the moderating role of disclosure. Glob Financ J 38:45–64. https://doi.org/10.1016/J.GFJ.2017.03.001
Friede G, Busch T, Bassen A (2015) ESG and financial performance: aggregated evidence from more than 2000 empirical studies. J Sustain Financ Invest 5(4):210–233. https://doi.org/10.1080/20430795.2015.1118917
Geissdoerfer M, Savaget P, Bocken NM, Hultink EJ (2016) The circular economy–a new sustainability paradigm? J Clean Prod 143:757–768. https://doi.org/10.1016/j.jclepro.2016.12.048
Initiative F (2005) UNEP FI 2005 overview.
Kates RW, Parris TM, Leiserowitz AA (2005) What is sustainable development? Goals, indicators, values, and practice. Environment 47(3):8–21. https://doi.org/10.1080/00139157.2005.10524444
Khoo CSG, Johnkhan SB (2018) Lexicon-based sentiment analysis: comparative evaluation of six sentiment lexicons. J Inf Sci 44(4):491–511. https://doi.org/10.1177/0165551517703514
Kim Y (2015) Convolutional neural networks for sentence classification. Master’s thesis, University of Waterloo
Kossovsky N (2012) Reputation, stock price, and you: Why the market rewards some companies and punishes others. In: Reputation, stock price, and you: why the market rewards some companies and punishes others, vol 9781430248. https://doi.org/10.1007/978-1-4302-4891-0
Lee KH, Cin BC, Lee EY (2016) Environmental responsibility and firm performance: the application of an environmental, social and governance model. Bus Strateg Environ 25(1):40–53. https://doi.org/10.1002/BSE.1855
Liu W, Liao H (2017) A Bibliometric analysis of fuzzy decision research during 1970–2015. Int J Fuzzy Syst. https://doi.org/10.1007/s40815-016-0272-z
Liu M, Luo X, Lu W-Z (2023) Public perceptions of environmental, social, and governance (ESG) based on social media data: evidence from China. J Cleaner Prod. https://doi.org/10.1016/j.jclepro.2022.135840
Mandas M, Lahmar O, Piras L, De Lisa R (2023) ESG in the financial industry: what matters for rating analysts? Res Int Bus Financ. https://doi.org/10.1016/j.ribaf.2023.102045
Mäntylä MV, Graziotin D, Kuutila M (2018) The evolution of sentiment analysis—a review of research topics, venues, and top cited papers. Comput Sci Rev 27:16–32. https://doi.org/10.1016/j.cosrev.2017.10.002
Medhat W, Hassan A, Korashy H (2014) Sentiment analysis algorithms and applications: a survey. Ain Shams Eng J 5(4):1093–1113. https://doi.org/10.1016/j.asej.2014.04.011
Mendonça AMB, Leal Filho W, Alves F (2023) Written press’s approach to climate change in the Autonomous Region of Madeira and the Autonomous Community of the Canary Islands. Clim Change Manag Part F5:459–474. https://doi.org/10.1007/978-3-031-28728-2_22
Nardi L (2022) The corporate social responsibility price premium as an enabler of substantive CSR. Acad Manag Rev 47(2):282–308. https://doi.org/10.5465/AMR.2019.0425
Pak A, Paroubek P (2010) Twitter as a corpus for sentiment analysis and opinion mining. http://tumblr.com. Accessed on 29 July 2023
Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(2):1–135
Park J, Choi W, Jung S-U (2022) Exploring trends in environmental, social, and governance themes and their sentimental value over time. Front Psychol. https://doi.org/10.3389/fpsyg.2022.890435
Porter ME, Kramer MR (2006) Strategy & society: the link between competitive advantage and corporate social responsibility. Harv Bus Rev 84(12):78–92. https://doi.org/10.1108/sd.2007.05623ead.006
Raithel S, Wilczynski P, Schloderer MP, Schwaiger M (2010) The value/relevance of corporate reputation during the financial crisis. J Prod Brand Manag 19(6):389–400. https://doi.org/10.1108/10610421011085703
Salas-Zárate MDP, Valencia-García R, Ruiz-Martínez A, Colomo-Palacios R (2017) Feature-based opinion mining in financial news: an ontology-driven approach. J Inf Sci 43(4):458–479. https://doi.org/10.1177/0165551516645528
Savytska L, Vnukova N, Bezugla I, Pyvovarov V, Turgut Sübay M (2021) Using Word2vec technique to determine semantic and morphologic similarity in embedded words of the Ukrainian language.
Skublov SG, Gavrilchik AK, Berezin AV (2022) Geochemistry of beryl varieties: comparative analysis and visualization of analytical data by principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE). J Min Inst 255(3):455–469. https://doi.org/10.31897/PMI.2022.40
Strauß N (2022) Covering sustainable finance: Role perceptions, journalistic practices and moral dilemmas. Journalism 23(6):1194–1212. https://doi.org/10.1177/14648849211001784
Tunca S, Sezen B, Balcioglu YS (2023) Content and sentiment analysis of the New York Times coronavirus (2019-nCOV) articles with natural language processing (NLP) and Leximancer. Electronics 12(9):1964. https://doi.org/10.3390/ELECTRONICS12091964
Visualizing Data using the Embedding Projector in TensorBoard | TensorFlow. (2022) https://www.tensorflow.org/tensorboard/tensorboard_projector_plugin?hl=en. Accessed on 10 July 2023
Zeidan R (2022) Why don’t asset managers accelerate ESG investing? A sentiment analysis based on 13,000 messages from finance professionals. Bus Strategy Environ 31(7):3028–3039. https://doi.org/10.1002/bse.3062


Funding

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.

Author information

Authors and Affiliations

Faculty of Engineering, Industrial Organization and Management Engineering Department, University of the Basque Country, Paseo Rafael Moreno “Pitxitxi” 2, Bilbao, Spain
Naiara Pikatza-Gorrotxategi, Jon Borregan-Alvarado & Izaskun Alvarez-Meaza
Faculty of Engineering, Industrial Organization and Management Engineering Department Vitoria-Gasteiz, University of the Basque Country, Nieves Cano 12, Vitoria-Gasteiz, Spain
Aitor Ruiz-de-la-Torre-Acha

Authors

Naiara Pikatza-Gorrotxategi
Jon Borregan-Alvarado
Aitor Ruiz-de-la-Torre-Acha
Izaskun Alvarez-Meaza

Contributions

NP and JB wrote the main text of the manuscript. NP and AR carried out the analysis and interpretation of the Sentiment Analysis. NP and JB carried out the analysis and interpretation of the Sentiment NLP. IA and AR prepared the figures. AR and IA revised the manuscript critically. All authors reviewed the manuscript.

Corresponding author

Correspondence to Naiara Pikatza-Gorrotxategi.

Ethics declarations

Conflict of interest

There are no financial or non-financial interests that are directly or indirectly related to the work submitted for publication.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Pikatza-Gorrotxategi, N., Borregan-Alvarado, J., Ruiz-de-la-Torre-Acha, A. et al. News and ESG investment criteria: What’s behind it?. Soc. Netw. Anal. Min. 14, 47 (2024). https://doi.org/10.1007/s13278-024-01209-w


Received: 05 September 2023
Revised: 18 January 2024
Accepted: 22 January 2024
Published: 27 February 2024
DOI: https://doi.org/10.1007/s13278-024-01209-w
