Introduction

The U.S. Securities and Exchange Commission (SEC) requires all non-U.S. companies engaging in securities trading in the U.S. to file an annual Form 20-F. Recently, the SEC finalized rules to implement the Holding Foreign Companies Accountable Act (HFCAA), which allow the delisting of foreign firms if their auditors do not comply with requests for information from U.S. regulators. The goal of this requirement is to standardize the reporting of foreign-based companies and inspect the audits of Chinese firms that list and trade in the U.S. market. It has been documented that the quality and accuracy of financial information reported by foreign firms cross-listed in the U.S. is low (Ang et al. 2016). Furthermore, Beckmann et al. (2019) claim that foreign firms have incentives to manage their earnings around cross-listing events. For example, the Chinese firm Luckin Coffee was the subject of an internal fraud probe in 2020, increasing calls for action. In addition, information asymmetry gives rise to window-dressing in cash holdings (Khokhar 2011), qualitative impression management is significantly and positively associated with abnormal accruals in earnings management (Boudt and Thewissen 2019), and executives likely attempt to hide their earnings management activities in these disclosures (Jaspersen et al. 2021). Therefore, it is imperative for investors, analysts, and legislators to make special efforts to identify earnings management behaviors in foreign firms’ 20-F financial statements.

The Form 20-F consists of several sections, including key information, information on the firm, operating and financial prospects, directors, senior management, and employees. The SEC requires all “foreign private issuers” with listed equity shares on the U.S. exchanges to provide investors with detailed information about a firm’s annual business activities. Therefore, it is assumed that texts conveying uncertainty make it more challenging for investors to identify earnings management behaviors. Additionally, the tone of such documents may impact investors’ assessments of the firm (Loughran and McDonald 2016). Although the 20-Fs indicate that firms follow the U.S. rules and offer opportunities for investors on the U.S. exchanges, the tone of the financial statements may be different from the tone used by the U.S. companies. This study adopts a novel Chinese dataset because Asian cultures have more implicit communication styles than nonverbal cues (Wu et al. 2021). Moreover, Chinese individuals are more likely to hide their true feelings than Americans because of the cultural characteristics of collectivism (Sims et al. 2015). Particularly, Chinese stocks play an important role in the global economy in international portfolio diversification (Luo et al. 2012). However, understanding the true meaning of the language used in Chinese companies’ financial reports is challenging for international investors because the tone of language represents the underlying corporate culture. Specifically, firm performance dimensions are positively affected by organizational culture and innovation (Uzkurt et al. 2013). In general, this study attempts to identify earnings management behaviors in the U.S.-listed Chinese companies by analyzing the qualitative and quantitative information contained in the 20-F disclosures.

The existing literature identifies and focuses on factors affecting earnings management from both macro and micro perspectives. Studies include state-level unemployment benefits (Dou et al. 2016), trust culture (Pevzner et al. 2015), controlling ownership (Bao and Lewellyn 2017), dual-class status (Li and Zaitas 2017), corporate governance (El Diri et al. 2020), and stakeholder orientation (Ni 2020). However, there is a lack of research on how investors can use effective tools to analyze financial disclosures, such as Form 20-Fs, to identify earnings management behaviors. This study contributes to the literature by applying existing dictionary-based textual analysis methods to analyze the U.S.-listed Chinese companies.

Another stream of research examines the relationship between the textual characteristics of financial disclosures and the financial market’s evaluations of public firms (Loughran and McDonald 2013; Lang and Stice-Lawrence 2015). These studies focus on the various outcomes of textual analysis, but there are few studies on how textual analysis can help identify earnings management behavior. This study applies textual analysis to the U.S.-listed Chinese firms because an individual country sample can support an in-depth analysis of the business culture characteristics in that country. Textual analysis can effectively identify the tone and sentiment expressed in mandated company disclosures, analyst reports, and financial press releases. Loughran and McDonald (2016) find that the language used in financial statements, especially the tone and words indicating sentiment, correlate with earnings management and may signal future fraudulent activities. Several studies use textual analysis to quantify the tone and sentiment of the language used in corporate annual reports. They examine the association between the tone of the language and performance outcomes and market interpretations of such information (Loughran and McDonald 2011, 2013; Jegadeesh and Wu 2013; Lang and Stice-Lawrence 2015). Moreover, the tone used in the Management Discussion and Analysis (MD&A) section in financial disclosures is highly associated with market responses to publicly listed firms (Wu et al. 2021).

This study investigates the relationship between the tone of 20-F and earnings management behaviors of Chinese companies listed in the U.S. exchanges. Corporate managers typically have an information advantage over outside investors, causing information asymmetry and window-dressing problems in cash holdings (Khokhar 2011). Specifically, Jegadeesh and Wu (2013) indicate a significant relationship between document tone and market reactions for both negative and positive words. Recently, Bian et al. (2021) confirm that management tone is significantly and positively related to a firm’s post-market operating performance. Therefore, we examine whether managers have an incentive to select their textual tone in financial reports when communicating company information to outsiders.

It is crucial to adopt an appropriate method to measure the tone of 20-Fs. Previous literature shows that the Loughran and McDonald word list is widely used in textual analyses (Jegadeesh and Wu 2013; Loughran and McDonald 2013). This study adopts the six sentiment word lists (positive, negative, uncertain, litigious, strong modal, and weak modal) following Loughran and McDonald (2011). It adds two more word lists (moderate modal and irregular verbs) from McDonald’s updated 2014 Master Dictionary, providing a broader set of sentiment categories and more examples of words for each category (Footnote 1). Specifically, this study adds a moderate modal list because the strong and weak modal subcategories alone are too broad; the additional list presents a more complete picture of the relationship between modal words and earnings management. It adds a list of irregular verbs because these words are believed to reflect major events occurring in the company and thus convey additional information on the most recent merger and acquisition activities.

This study extends Kim et al. (2017) and Loughran and McDonald (2011). For example, Kim et al. (2017) examine the relationship between the grammatical structure of texts and financial reporting characteristics, separating texts into two types based on how they encode time: strong future-time reference texts and weak future-time reference texts. In contrast, this study conducts a textual analysis of the tone of Form 20-Fs filed by the U.S.-listed Chinese firms and examines whether the tone correlates with earnings management, distinct from other accounting characteristics. Specifically, this study examines whether a higher proportion of positive, uncertain, or modal words like “good, may, could, depend, and approximately” in the 20-Fs signals more earnings management behaviors.

This study applies ordinary least squares (OLS) regression models to a sample of 153 U.S.-listed Chinese firms during 2002–2014. The results show that Chinese firms with a higher proportion of positive, uncertain, or modal words in their 20-Fs exhibit higher earnings management, suggesting that managers use positive, uncertain, and modal terms to hide their earnings management behaviors. A series of robustness tests are also conducted. First, an alternative proxy for earnings management is used as the dependent variable, and the model is re-estimated; the main results hold. Second, an examination of the impact of the 2008 financial crisis reveals that the relationship between tone and earnings management is more significant in the post-crisis period than in the pre-crisis period. Third, a robustness test of cross-industry variations finds that the relationship between tone and earnings management is more significant in the information technology industry than in other sectors.

This study makes two significant contributions to the literature. First, it adds the Chinese context to the current textual analysis literature (Loughran and McDonald 2011, 2013; Price et al. 2012; Jegadeesh and Wu 2013; Kim et al. 2017; Shan 2019; Wu et al. 2021) and the predictability of qualitative disclosures on earnings management (Hu 2021; Jaspersen et al. 2021). This is the first study using the U.S.-listed Chinese firms to investigate the correlation between textual tone and earnings management behaviors. The findings provide novel empirical evidence that managers tend to use positive, uncertain, or modal words in financial statements to conceal their earnings management behaviors, supported by Li’s (2008) obfuscation theory and Boudt and Thewissen’s (2019) impression management. Second, this study contributes to the literature on the methodology used to identify potential earnings management. For example, Li et al. (2021a) claim that in many financial applications, such as fraud detection, reject inference, and credit evaluation, it is critical to understand the sub-patterns of the data to infer users’ behaviors, identifying potential risks. Specifically, this study illustrates that the textual tone in financial statements can be used as an effective tool to identify earnings management behaviors because empirical studies show the correlation between tones and earnings management, providing informative value for identifying earnings management behavior. However, previous studies highlight the correlation between investor relations, ownership, corporate governance, firms’ financial characteristics, shareholder activism, environmental regulation, and corporate culture with earnings management (Frankel et al. 2010; Bao and Lewellyn 2017; Dou et al. 2016; El Diri et al. 2020; García Lara et al. 2020; Ni 2020; Chen et al. 2021; Li et al. 2021b; Ng et al. 2021).

The remainder of this paper is organized as follows. The “Literature review and hypothesis development” section presents a comprehensive literature review and develops our main hypotheses. The “Data and methodology” section introduces the data and methodology. The “Results” section reports the empirical analysis results, and the “Conclusions” section concludes the paper.

Literature review and hypothesis development

Opinion dynamics in finance and business have been studied extensively, and a series of opinion dynamics models with different opinion evolution rules have been proposed in various fields. Opinion dynamics models typically include three basic elements: opinion expression formats, opinion evolution rules, and opinion dynamics environments (Zha et al. 2020). Textual analysis extracts information from textual resources from a new perspective (Loughran and McDonald 2016). Therefore, the tone in the text is used to represent investor sentiment and analyze issues in accounting and finance behaviors. Specifically, it helps us understand behavioral patterns across individuals, institutions, and markets (Kearney and Liu 2014; Yang and Luo 2014). In addition, these techniques may help researchers analyze hidden clues or seek additional information to that observed through financial information, increasing the quantity and quality of information (Gandía and Huguet 2021) and creating different methodologies and artifacts to advance knowledge in accounting, auditing, and finance (Fisher et al. 2016). Numerous studies examine the association between textual tone or textual characteristics and firm performance and the market reaction to the tone of financial disclosures. For example, Loughran and McDonald (2014) find that file size (in megabytes) is an accurate indicator of a text’s readability and is related to post-filing abnormal returns, volatility, and analyst dispersion. Bodnaruk et al. (2015) use the percentage of constraining words in financial statements to gauge firms that would become financially constrained and find that this measure is associated with liquidity events. Loughran and McDonald (2013) use sentiment word lists to gauge the tone used in the initial registration forms of new securities and find that initial public offerings (IPOs) with high levels of uncertain words have higher first-day returns and larger aftermarket volatility. 
In addition, Jegadeesh and Wu (2013) propose a return-based term-weighting scheme for quantifying document tone and find that the tone of financial statements is significantly related to the market returns of firms around their financial statement filing dates. Mai et al. (2019) assess the predictive power of textual data for forecasting bankruptcy and find that textual data can complement traditional accounting- and market-based variables in predicting bankruptcy. Wei et al. (2019) adopt the text mining approach to identify corporate energy risk factors in the risk disclosures reported in financial statements and provide a preliminary method for corporate energy risk estimation. Jiang et al. (2019) construct a manager sentiment index based on the aggregated textual tone of corporate financial disclosures and find that manager sentiment is a strong negative predictor of stock market returns.

It is well documented that people’s opinions drive human decisions and behaviors, and some information in their environment is always reflected in their language (Hofstede and Hofstede 2010). People express their opinions and sentiments through oral or written communication (Abbasi et al. 2008). For example, people may use vague statements to express their views or a strong tone to deny negative behaviors. In business environments, managers have an information advantage over outside investors. Communication about the firm’s business and financial performance may partially reflect managers’ subjective opinions about their stock returns, financial matters, fraudulent activities, or future earnings expectations (Henry 2008; Blau et al. 2015; Demaline 2019; Jiang et al. 2019). For example, Jegadeesh and Wu (2013) find that positive and negative tones in Form 10-K are useful in predicting the date of filing returns. Bian et al. (2021) find that managers’ positive tone in online roadshows can indicate a higher first-day return on IPOs in Chinese stock markets, suggesting that the sentiment or tone of managers has informative value for investors in certain events or activities. In addition, Jaspersen et al. (2021) argue that managers may be more likely to hide some earnings management behaviors in qualitative disclosures, making it difficult for investors to make correct decisions. This is consistent with impression management theory, which states that qualitative impression management positively correlates with the use of abnormal accruals in earnings management (Boudt and Thewissen 2019). Therefore, it is predicted that a significant relationship exists between sentiment expressed in the language of these texts and earnings management. Specifically, this study hypothesizes that:

Hypothesis 1

The tone used in Form 20-Fs is significantly correlated with managers’ earnings management behavior.

A natural question arises: How does tone relate to earnings management? A higher proportion of positive, uncertain, or modal words such as “good, may, could, depend, and approximately” reflect good or uncertain information about the firm, suggesting that managers convey accurate information to the public. In this situation, it can be inferred that managers do not conceal earnings management. However, managers may also use a positive or uncertain tone to hide their earnings management behavior. Studies show that the annual reports of public firms with low earnings can be difficult to read (Li 2008) and the Chinese firms with more influential largest shareholders are more prone to real earnings management (Dong et al. 2020), indicating that managers may be reluctant to report accurate information when the news is bad. Kang et al. (2018) find that managers use a flexible tone in the narrative sections of annual reports to express their specific intentions according to different firm situations. Specifically, managers may strategically use language when a firm violates regulations (Huang et al. 2014). Managers may also attempt to deceive investors out of self-interest or hide bad news that may damage the firm’s value (Kang et al. 2018). Accordingly, textual analysis can reveal firms’ fraudulent activities (Chen 2014). Jaeschke et al. (2018) find that the tone of financial reports correlates with the likelihood of future prosecution under the Foreign Corrupt Practices Act (FCPA) and that after the prosecution, the FCPA violators tend to change the tone in their financial reports. Therefore, we construct an alternative hypothesis as follows:

Hypothesis 1A

A positive, uncertain, or modal tone in 20-Fs positively correlates with managers’ earnings management behavior.

Data and methodology

Data

The sample is obtained from the S&P Capital IQ with supplements from the Wind Database and the China Stock Market & Accounting Research Database (CSMAR). The initial sample includes 153 Chinese non-financial firms (Footnote 2) listed on the National Association of Securities Dealers Automated Quotations (NASDAQ), New York Stock Exchange (NYSE), and American Stock Exchange (AMEX) from 2002 to 2014. Next, companies that released 10-K filings (Footnote 3) without any other SEC filings are removed. Finally, filings with incomplete financial information are excluded. The final sample consists of 75 companies and 449 firm-year observations.

Word lists

Following the 2014 Master Dictionary of McDonald and the Loughran and McDonald (2011) word lists, this study classifies the words into eight categories: positive (positive), negative (negative), uncertain (uncertainty), litigious (litigious), strong modal (strongml), moderate modal (moderml), weak modal (weakml), and irregular verbs (irrverb). This study uses the following examples for each category to clarify the definitions of these sentiment words: examples of positive words are “beneficial, successful, good, achieved, and empower”; examples of negative words are “loss, failure, abandon, and decline”; examples of uncertainty words are “almost, and nearly”; examples of litigious words are “abovementioned, abrogated, and certiorari”; examples of strong modal words are “must, never, definitely, and will”; examples of moderate modal words are “can, generally, and usually”; examples of weak modal words are “could, should, and ought to”; and the examples of irregular verbs are “beat, cut, and forgot.” Keywords extraction plays an important role in the financial sector. “With this combination of keyword extraction and big data, we can extract what we need in a very short time with a fully automated process eliminating most manual work and increasing speed of data collection (Pejic Bach et al. 2019).”

The textual analysis method applies to 10-K filings by industry, as described in Appendix 2 (Bodnaruk et al. 2015). Initially, this study removes all the HTML and ASCII-encoded segments from each filing; then, any text that the HTML markup identifies as a table is eliminated if more than 15% of its characters are numeric. After removing ambiguous proper nouns, the text is parsed into a vector of words, which are tabulated using the eight word-list categories. Finally, this study calculates the percentage of words in each category using the approach proposed by Loughran and McDonald (2013).
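
The parsing and counting steps described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the pipeline used in this study: the `WORD_LISTS` dictionary contains just the example words quoted in this section (the actual Master Dictionary lists are far longer), the names `tone_shares`, `TAG_RE`, and `TOKEN_RE` are hypothetical, and the 15% numeric-table filter and the removal of proper nouns are omitted for brevity.

```python
import re
from collections import Counter

# Hypothetical miniature word lists, seeded only with the example words
# quoted in the text. The two-word example "ought to" is left out because
# this sketch counts single-word tokens only.
WORD_LISTS = {
    "positive": {"beneficial", "successful", "good", "achieved", "empower"},
    "negative": {"loss", "failure", "abandon", "decline"},
    "uncertainty": {"almost", "nearly"},
    "litigious": {"abovementioned", "abrogated", "certiorari"},
    "strongml": {"must", "never", "definitely", "will"},
    "moderml": {"can", "generally", "usually"},
    "weakml": {"could", "should"},
    "irrverb": {"beat", "cut", "forgot"},
}

TAG_RE = re.compile(r"<[^>]+>")    # crude HTML tag stripper (assumption)
TOKEN_RE = re.compile(r"[a-z]+")   # alphabetic tokens after lowercasing


def tone_shares(filing_text):
    """Return, for each word list, the share of all tokens found in that list."""
    text = TAG_RE.sub(" ", filing_text).lower()
    tokens = TOKEN_RE.findall(text)
    total = len(tokens)
    counts = Counter(tokens)
    shares = {}
    for name, words in WORD_LISTS.items():
        hits = sum(counts[w] for w in words)
        shares[name] = hits / total if total else 0.0
    return shares
```

Each returned value is a proportion of total words, which matches the scale of the sentiment variables reported in the summary statistics.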

Variables

This study estimates discretionary revenues as a proxy for earnings management, following McNichols and Stubben (2008) and Stubben (2010). Each approach to identifying earnings manipulation has advantages and disadvantages (McNichols and Stubben 2008). Discretionary accrual measures are commonly used to identify firms that engage in earnings management. This measure does not involve selection bias but may have a greater measurement error. Several studies have shown that discretionary accrual models provide biased, low-power estimates of discretion. We use the measure of discretionary revenues presented by Stubben (2010) as a proxy for earnings manipulation to increase the power of our tests. Specifically, we run the following regression in Eq. (1):

$$\Delta AR_{i,t} = \alpha_{0} + \alpha_{1} \Delta Rev_{i,t} + \varepsilon_{i,t}$$
(1)

where the subscripts refer to firm i in year t. \(\Delta AR_{i,t}\) represents the annual change in accounts receivable and \(\Delta Rev_{i,t}\) is the annual change in revenues; both are scaled by lagged total assets. Discretionary revenues are the residuals estimated using Eq. (1). The absolute value of discretionary revenues is then used as the measure of earnings management (EM) (Footnote 4).
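
As an illustration of Eq. (1), the residual-based measure can be sketched with an ordinary least squares fit. The function name `discretionary_revenues` is hypothetical, the inputs are assumed to be already scaled by lagged total assets, and a single pooled regression is assumed here; the text does not state whether Eq. (1) is estimated on the pooled sample or within groups.

```python
import numpy as np


def discretionary_revenues(delta_ar, delta_rev):
    """Fit dAR = a0 + a1 * dRev by OLS and return (coefficients, residuals, EM).

    Both inputs are assumed to be scaled by lagged total assets already.
    EM is the absolute value of the residual, as described in the text.
    """
    y = np.asarray(delta_ar, dtype=float)
    x = np.asarray(delta_rev, dtype=float)
    design = np.column_stack([np.ones_like(x), x])   # intercept and slope
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, resid, np.abs(resid)


# Made-up numbers, purely to show the call; they are not sample data.
coef, resid, em = discretionary_revenues(
    delta_ar=[0.02, 0.06, -0.01, 0.05, 0.04],
    delta_rev=[0.10, 0.25, -0.05, 0.40, 0.15],
)
```

Because the regression includes an intercept, the residuals sum to zero by construction, so taking absolute values is what turns them into a non-negative magnitude of discretion.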

This study includes various control variables that previous studies have found to affect EM. First, it uses a dummy variable to control for the listing place (LP), which equals 1 if the firm is listed on the NASDAQ and 0 otherwise. The variable for firm nature (FN) is set to 1 if the controlling shareholder is the state and 0 otherwise, because state control may increase earnings management (Bao and Lewellyn 2017). It creates a dummy variable for the presence of an auditor going-concern opinion (AOT), which equals 1 if an auditor has issued a going-concern opinion and 0 otherwise. It is assumed that firms have less incentive to engage in earnings management when auditors issue going-concern opinions. It further controls for auditing firm characteristics with the variable BIG4, which equals 1 if a firm is audited by one of the Big 4 auditing companies and 0 otherwise (Hu 2021), because these auditors may exercise stronger supervision, resulting in less earnings management. We also control for the following firm characteristics. Leverage (LEV) is defined as the ratio of total debt to total assets; firms with high leverage are expected to manage earnings to avoid debt covenant violations. Firm size (LNSIZE) is the natural logarithm of a firm’s total assets and is used to proxy for size effects. Most studies control for this variable, but their results are inconsistent (Hu 2021; Ng et al. 2021). In addition, we control for cash and cash equivalents (CH), defined as the ratio of cash and cash equivalents to total assets, and revenue (REV), measured as the change in sales scaled by the lagged sales of the company. Cash flow (CF) is defined as the ratio of cash flow from operations to total assets. It is assumed that firms with high cash flow engage in less earnings management because good cash flow represents strong operations. The controls also include return on assets (ROA), measured by net income over total assets. However, the sign of its impact on earnings management is uncertain because firms with both good and bad financial performance may engage in earnings management. Finally, we include firm, year, and industry fixed effects to control for heterogeneity. Appendix 1 summarizes the variable definitions.

Methodology

The following OLS regression is employed to test the main hypotheses and controls for the firm, year, and industry fixed effects in Eq. (2):

$$EM_{it} = \beta *WordLt_{it} + \gamma *X_{it} + Firm_{i} + Year_{t} + Industry_{j} + \varepsilon_{it}$$
(2)

where i, t, and j refer to firm, year, and industry, respectively. \(EM_{it}\) represents the earnings management of firm i in year t; \(\beta\) and \(\gamma\) are sets of coefficients; \(WordLt_{it}\) represents the set of eight sentiment word lists; and \(X_{it}\) represents the control variables (\(LP_{it}\), \(FN_{it}\), \(AOT_{it}\), \(BIG4_{it}\), \(LEV_{it}\), \(LNSIZE_{it}\), \(CH_{it}\), \(REV_{it}\), \(CF_{it}\), and \(ROA_{it}\)). \(Firm_{i}\) represents firm fixed effects, \(Year_{t}\) represents year fixed effects, and \(Industry_{j}\) represents industry fixed effects.
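
A regression of the form in Eq. (2) can be sketched with dummy variables for the fixed effects. The sketch below is an assumption-laden illustration rather than the estimation code behind the reported tables: `dummies` and `fe_ols` are hypothetical names, the data are synthetic and noise-free, and no standard errors are computed. When firms do not change industry, the industry dummies are absorbed by the firm dummies; the least squares routine returns a minimum-norm solution in that case, and the slope coefficients on tone and the controls remain uniquely identified.

```python
import numpy as np


def dummies(labels):
    """Return one 0/1 column per distinct label."""
    labels = np.asarray(labels)
    levels = np.unique(labels)
    return (labels[:, None] == levels[None, :]).astype(float)


def fe_ols(em, tone, controls, firm, year, industry):
    """OLS of EM on one tone measure, controls, and firm/year/industry dummies.

    Returns (beta, gamma): the tone coefficient and the control coefficients.
    """
    em = np.asarray(em, dtype=float)
    slopes = np.column_stack([np.asarray(tone, dtype=float),
                              np.asarray(controls, dtype=float)])
    design = np.column_stack([slopes, dummies(firm), dummies(year),
                              dummies(industry)])
    coef, *_ = np.linalg.lstsq(design, em, rcond=None)
    k = slopes.shape[1]
    return coef[0], coef[1:k]


# Synthetic panel: 6 firms observed for 5 years, two industries.
rng = np.random.default_rng(0)
firm = np.repeat(np.arange(6), 5)
year = np.tile(np.arange(5), 6)
industry = firm % 2                      # each firm stays in one industry
tone = rng.normal(size=30)
controls = rng.normal(size=(30, 2))
firm_fx = rng.normal(size=6)[firm]
year_fx = rng.normal(size=5)[year]
# The outcome is built with known slopes so the fit can be checked.
em = 2.0 * tone + controls @ np.array([0.5, -0.3]) + firm_fx + year_fx

beta, gamma = fe_ols(em, tone, controls, firm, year, industry)
```

Entering one tone measure at a time, as in this sketch, mirrors the choice to run a separate regression for each word-list category.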

Furthermore, this study implements a multicollinearity test for the main regression model. In addition, it uses an alternative proxy for earnings management and re-estimates Eq. (2). Finally, to test the robustness of the empirical results, we divide the full sample into two sub-samples, each with different periods and industries. This study also uses the fixed effects model in Eq. (2) to test the impact of the 2008 financial crisis on the relationship between independent variables and earnings management and the cross-industry variations between information and non-information technology industries.

Results

Summary statistics

The industry subsamples are based on the Global Industry Classification Standard (GICS) developed by the S&P Dow Jones Indices, a leading provider of global equity indices. After excluding the financial industry (see Appendix 2), there were nine industries in the sample. Over the entire sample period, the information technology industry accounts for 42.98% of the total observations, while the GICS utility industry accounts for only 1.78%. The weights of the other industries vary from 2.67% (communication services) to 16.7% (consumer discretionary). In general, sample firms are widely dispersed across various industries.

Table 1 reports the summary statistics for the selected variables. For example, for the dependent variable, EM, the minimum is 0, and the maximum is 0.341, with a mean of 0.031 and a median of 0.017. These results suggest variations in the amount of earnings management in the sample firms. Table 1 also reports the descriptive statistics of the eight sentiment measures, that is, the proportions of words in Form 20-F drawn from each word list. The mean values are greater than the median values for most sentiment variables, implying that the data distribution is skewed and contains outliers. Specifically, Negative, Litigious, and Uncertainty have the highest means (0.007, 0.007, and 0.006, respectively). In addition, firm size (LNSIZE) and change in sales (REV) have the highest standard deviations (0.927 and 0.480, respectively), suggesting that the sample firms differ markedly in size and in revenue growth.

Table 1 Summary of descriptive statistics

This study also examines the effect of the 2008 financial crisis on the selected variables. Specifically, it divides the sample into two sub-samples: 86 observations during 2002–2007 and 363 observations during 2008–2014. This study uses the beginning of 2008 as the cut-off point for the financial crisis and compares the means of selected variables before and after 2008. The mean EM is slightly higher in the post-crisis sub-sample than in the pre-crisis sub-sample (0.031 vs. 0.025). We also find that the values for all sentiment variables are lower in the post-crisis period. For example, the mean of Litigious drops from 0.009 to 0.006. Finally, the results show that the debt ratio (0.403 vs. 0.322) increases, while the cash ratio (0.234 vs. 0.300), cash flow ratio (0.074 vs. 0.137), and change in sales (0.251 vs. 0.431) decrease after the financial crisis. The results suggest that most firms experienced financial constraints during the economic downturn.

Table 2 reports the correlations between key variables. Most sentiment variables are positively associated with EM, except for Litigious. For example, Positive, Uncertainty, Strongml, Moderml, and Weakml are positively correlated with EM (the coefficients are 0.196, 0.169, 0.092, 0.132, and 0.170, respectively), although the correlations between EM and Negative (0.029) and Irrverb (0.028) are weak. These results are consistent with those of Loughran and McDonald (2013), although there is an overlap between some categories in the word list. For example, there are strong correlations between Weakml and Uncertainty (0.954), Uncertainty and Positive (0.895), and Moderml and Positive (0.887), suggesting that multicollinearity would be a problem if all the sentiment variables were included in one regression. Therefore, this study follows Loughran and McDonald (2013) and runs the regressions individually for each category of sentiment variables. In addition, there is a high correlation coefficient between CF and ROA (0.755), suggesting that firms with higher cash flows are more likely to have better performance. A possible explanation is that these firms have lower fixed asset costs and more capital turnover, resulting in higher profits.

Table 2 Correlation matrix

Baseline regression results

To further examine the link between earnings management and sentiment, we first report the baseline regression results, including the firm, year, and industry fixed effects, in Table 3. Column (1) shows that the coefficient of firm nature is positive (0.013) and significant at the 5% level. This suggests that the managers of state-owned enterprises engage in more earnings management, consistent with studies of the Chinese financial markets (Bao and Lewellyn 2017). The firm size and cash flow coefficients are negative and significant, suggesting that larger firms and firms with higher cash flows have fewer earnings management activities. In other words, small firms and firms with inadequate cash flow are more likely to manipulate their earnings. The coefficient of firm leverage is positive and significant, suggesting that firms with high debt ratios engage in earnings management, consistent with the original assumption.

Table 3 OLS Regression on earnings management

Columns (2)–(9) of Table 3 examine each of the eight sentiment categories as independent variables to explore whether the language used in financial disclosures can explain earnings management. The results show that most sentiment variables (Positive, Uncertainty, Strongml, Moderml, and Weakml) have positive and statistically significant coefficients for EM. In Column (2), Positive has a positive and significant coefficient (6.482), suggesting that firms using a higher proportion of positive words in their 20-Fs engage in more earnings management. This finding implies that companies use positive words more frequently in financial statements to conceal their true earnings status. Similarly, Uncertainty has a positive coefficient (3.793), suggesting that firms using higher proportions of uncertain words in their 20-Fs engage in more earnings management. A one standard deviation increase in the proportion of uncertainty words is associated with a 0.009 increase in EM (3.793 multiplied by the standard deviation of 0.00234). This suggests that companies that frequently use words indicating uncertainty about the future may engage in more earnings management activities than others. Similarly, Columns (6)–(8) show that Strongml, Moderml, and Weakml have positive and significant coefficients (6.372, 8.741, and 5.357, respectively), suggesting that firms that use a high proportion of modal words engage in more earnings management. Specifically, a one standard deviation change in the proportion of Strongml, Moderml, and Weakml is linked to a 0.005 (6.372 times 0.0008), 0.005 (8.741 times 0.00058), and 0.008 (5.357 times 0.00153) increase in earnings management, respectively (Footnote 5). We also find that the coefficients of REV and ROA are positive and significant. A possible explanation is that companies with better growth prospects or higher profitability ratios may have more motives for earnings management and manipulating numbers (Espahbodi et al. 2021).
Overall, the results demonstrate that the tone of 20-Fs relates to earnings management. Specifically, a positive association exists between a positive, uncertain, or modal tone in 20-Fs and earnings management behaviors, suggesting that sentiment words such as “good,” “may,” “could,” “depend,” and “approximately” may reflect managers’ immoral behaviors. In summary, the results support both hypotheses and indicate that qualitative textual tones can provide valuable information to investors.
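
The one-standard-deviation magnitudes quoted for Table 3 can be reproduced with a few lines of arithmetic. The following is a minimal sketch that uses only the coefficients and standard deviations stated in the text; the function and variable names are illustrative and do not come from the original analysis.

```python
# Coefficients (Table 3) and standard deviations exactly as quoted in the text.
estimates = {
    "Uncertainty": (3.793, 0.00234),
    "Strongml": (6.372, 0.0008),
    "Moderml": (8.741, 0.00058),
    "Weakml": (5.357, 0.00153),
}


def one_sd_effect(coefficient, std_dev):
    """Change in EM associated with a one-standard-deviation change in a regressor."""
    return coefficient * std_dev


# Round to three decimals, the precision used in the text.
effects = {
    name: round(one_sd_effect(coefficient, std_dev), 3)
    for name, (coefficient, std_dev) in estimates.items()
}

for name, effect in effects.items():
    print(f"{name}: {effect:.3f}")
```

Because the sentiment variables are small word shares, multiplying each coefficient by the regressor's standard deviation gives a more interpretable scale than the raw coefficient.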

These results are consistent with those obtained for U.S. companies in previous studies (Jegadeesh and Wu 2013; Kang et al. 2018; Jaspersen et al. 2021). For example, Kang et al. (2018) claim that it is risky to interpret business managers’ positive expressions because the data can include their subjective opinions and viewpoints from the companies’ perspectives. Furthermore, they employ a text-mining method to identify the tones of the 10-K narratives to determine whether the changes are consistent with the current earnings levels. Jaspersen et al. (2021) suggest that qualitative disclosures are additional sources of information about a company’s financial situation, although executives likely hide their earnings management activities in these disclosures. Nevertheless, their results demonstrate that qualitative disclosures can predict earnings management and are useful for learning about companies’ accounting choices. In addition, Jegadeesh and Wu (2013) find a significant relationship between document tone and market reaction to 10-K filings for negative and positive words. Furthermore, they indicate a need to partition words into positive and negative word lists subjectively. Unlike these studies, which examine 10-Ks, our study focuses on 20-Fs.

Multicollinearity test

A possible problem in the baseline regression models is multicollinearity arising from strong correlations among the variables in a relatively small sample. To address this issue, this study conducts a multicollinearity test commonly used in multiple regression models and reports the variance inflation factors (VIFs) in Table 4. The average VIF values are less than 10 for all regressions, suggesting that there are no significant multicollinearity issues among the sentiment variables. These results support the robustness of the baseline regression results.
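
For readers unfamiliar with the statistic, the sketch below shows how a VIF is typically computed: each regressor is regressed on the remaining regressors, and VIF = 1 / (1 - R²). This is a plain-Python illustration under stated assumptions, not the procedure or software used in this study; the data are hypothetical word shares in percent, and the threshold of 10 is the one mentioned in the text.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, columns):
    """R^2 from an OLS regression of y on the given columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vifs(columns):
    """Return VIF_j = 1 / (1 - R_j^2) for each column, using the others as regressors."""
    result = []
    for j, target in enumerate(columns):
        others = [col for i, col in enumerate(columns) if i != j]
        result.append(1.0 / (1.0 - r_squared(target, others)))
    return result


# Hypothetical word shares in percent (NOT the paper's data).
sample = {
    "Positive": [1.0, 1.2, 0.9, 1.5, 1.1, 1.3],
    "Uncertainty": [2.0, 1.8, 2.5, 2.1, 1.9, 2.4],
    "Weakml": [0.5, 0.4, 0.7, 0.6, 0.4, 0.7],
}
values = vifs(list(sample.values()))
mean_vif = sum(values) / len(values)
print(dict(zip(sample, values)), "mean VIF below 10:", mean_vif < 10)
```

A VIF of 1 means a regressor is uncorrelated with the others; larger values indicate that more of its variance is explained by the remaining regressors.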

Table 4 Multicollinearity test

An alternative earnings management variable

To check the robustness of the empirical results, this study further investigates whether the relationship between textual tone and earnings management persists when an alternative measure of earnings management is used. The alternative measure is the performance-adjusted discretionary accruals variable developed by Kothari et al. (2005), which is estimated from the regression in Eq. (3):

$$TAccr_{it} = \alpha_{0} + \alpha_{1} \left( {\frac{1}{{Asset_{i,t - 1} }}} \right) + \alpha_{2} \Delta Rev_{it} + \alpha_{3} PPE_{it} + \alpha_{4} ROA_{it} + \varepsilon_{it}$$
(3)

where \(TAccr_{it}\) is total accruals for firm i in year t, measured as the change in non-cash current assets minus the change in current non-interest-bearing liabilities, minus depreciation and amortization expenses, scaled by lagged total assets (\(Asset_{i,t - 1}\)); \(\Delta Rev_{it}\) is the annual change in revenue scaled by lagged total assets; \(PPE_{it}\) is property, plant, and equipment costs for firm i in year t, scaled by lagged total assets; and \(ROA_{it}\) is the return on assets for firm i in year t. The residuals from the regression model are discretionary accruals. This study uses the absolute value of discretionary accruals (AVDA) as an alternative proxy for earnings management. Table 5 shows the regression results. As shown in Columns (2), (4), (7), and (8) of Table 5, the significant relationship between most sentiment words and earnings management continues to hold under the alternative measure of earnings management. This result confirms the main results presented in Table 3: companies that use more positive or uncertain words in their financial reports are more likely to engage in earnings management.
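
The construction of the alternative proxy can be restated compactly. In the expression below, the symbol \(DA_{it}\) and the hats denoting estimated coefficients are notation assumed here for illustration; only AVDA and the terms of Eq. (3) come from the text. Discretionary accruals are the fitted residuals of Eq. (3), and AVDA is their absolute value:

$$DA_{it} = TAccr_{it} - \left[ \hat{\alpha}_{0} + \hat{\alpha}_{1} \left( \frac{1}{Asset_{i,t - 1}} \right) + \hat{\alpha}_{2} \Delta Rev_{it} + \hat{\alpha}_{3} PPE_{it} + \hat{\alpha}_{4} ROA_{it} \right], \qquad AVDA_{it} = \left| DA_{it} \right|$$

Taking the absolute value treats income-increasing and income-decreasing accruals symmetrically, so AVDA captures the magnitude rather than the direction of earnings management.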

Table 5 Regressions using an alternative measure of earnings management

Controlling for the length of financial statements

The percentage of sentiment words used in financial statements may be affected by the text length. We proxy for the length of the text with Lnfsize, the natural logarithm of the size in megabytes of the “complete submission text file” for each Form 20-F filing. As shown in Table 6, the significance levels and signs of the coefficients of the sentiment word variables do not change materially, indicating that the baseline regression results are robust after controlling for the length of the financial statements.
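
As a small illustration of the length proxy, the sketch below converts a file size in bytes to Lnfsize. The text does not say whether a megabyte is taken as 10^6 or 2^20 bytes, so the decimal definition used here is an assumption, and the function name is hypothetical.

```python
import math

# Assumption: decimal megabytes (10**6 bytes); the text does not specify the convention.
BYTES_PER_MEGABYTE = 1_000_000


def lnfsize(size_in_bytes):
    """Natural logarithm of a filing's size expressed in megabytes."""
    if size_in_bytes <= 0:
        raise ValueError("file size must be positive")
    return math.log(size_in_bytes / BYTES_PER_MEGABYTE)


# Hypothetical filing sizes in bytes (not taken from the sample).
for size in (1_000_000, 2_500_000, 8_000_000):
    print(size, round(lnfsize(size), 4))
```

The choice of megabyte convention shifts every observation by the same constant, so it affects only the regression intercept and not the coefficient on Lnfsize.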

Table 6 Regressions after controlling for the length of financial statement

Impact of the 2008 financial crisis

This study also analyzes the impact of the financial crisis on earnings management because firms may face different regulations during crisis periods, influencing how they report earnings. Additionally, firms may change the tone of their financial disclosures during a crisis period because of increased uncertainty. Therefore, this study divides the sample period into two subsamples: 2002–2007 (pre-crisis) and 2008–2014 (post-crisis). This analysis identifies whether the relationship between the tone of financial statements and earnings management differs before and after the crisis.
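
The subsample split described above amounts to a simple partition of firm-year observations by fiscal year. A minimal sketch follows, assuming each observation carries a `year` field; the record layout and values are hypothetical and serve only to show the year boundaries stated in the text.

```python
PRE_CRISIS_YEARS = range(2002, 2008)   # 2002 through 2007
POST_CRISIS_YEARS = range(2008, 2015)  # 2008 through 2014


def split_by_crisis(observations):
    """Partition firm-year observations into pre-crisis and post-crisis subsamples."""
    pre = [obs for obs in observations if obs["year"] in PRE_CRISIS_YEARS]
    post = [obs for obs in observations if obs["year"] in POST_CRISIS_YEARS]
    return pre, post


# Hypothetical firm-year records (not the paper's data).
firm_years = [
    {"firm": "A", "year": 2005, "em": 0.04},
    {"firm": "A", "year": 2009, "em": 0.06},
    {"firm": "B", "year": 2007, "em": 0.03},
    {"firm": "B", "year": 2008, "em": 0.05},
    {"firm": "C", "year": 2014, "em": 0.02},
]

pre_crisis, post_crisis = split_by_crisis(firm_years)
print(len(pre_crisis), "pre-crisis and", len(post_crisis), "post-crisis observations")
```

The baseline regressions would then be re-estimated separately on each subsample.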

The results for the two subsamples are reported in Panels A and B of Table 7. The first column of each panel includes only control variables. Columns (2)–(9) add the variables for each sentiment category as the main independent variables. Positive has positive coefficients on earnings management in both periods (10.10 and 6.991, respectively), suggesting no significant difference in the relationship between positive words and earnings management across these periods. Column (9) of Panel A shows that Irrverb has a marginally significant negative coefficient of −5.993. However, this result may be biased by the small sample size for this category (only 86 observations); therefore, Irrverb has less power to explain earnings management in the pre-crisis subsample. Columns (3)–(9) of Panel B in Table 7 show that the sentiment variables Uncertainty, Strongml, Moderml, and Weakml have significant positive coefficients on EM after the crisis, consistent with the results for the full sample in Table 3. The results confirm that these variables have similar and significant relationships with EM in the pre- and post-crisis periods.

Table 7 The impact of financial crisis on earnings management

Overall, there is no significant change in the relationship between Positive, Uncertainty, Strongml, Moderml, and Weakml and EM between the pre- and post-crisis periods. Therefore, the main results are robust after controlling for the effects of the 2008 financial crisis.

Information technology versus non-information technology industries

In the sample, 42.98% of observations are from the information technology industry. Thus, we divide the sample into two subsamples: information technology and non-information technology industries. Panels A and B of Table 8 present the regression results for the information technology and non-information technology subsamples, respectively. Columns (4)–(6) of Panel A show that the coefficients of Litigious, Strongml, and Moderml are significant, indicating that firms in information technology are more likely to use these tones in their financial disclosures to conceal earnings management. However, there are no significant changes in the signs of the estimated coefficients, indicating that the empirical results are robust across industries.

Table 8 Information technology industry versus non-information technology industry

Discussion of the empirical results

The main results are summarized as follows. First, positive, uncertain, and modal words are positively and significantly associated with earnings management. Table 3 shows that firms with a higher proportion of positive, uncertain, and modal words in financial reports are more likely to engage in earnings management. The results imply that companies use more positive words when concealing earnings management, probably to attract potential investors or other businesses. Second, although strong correlations among the selected variables raise the possibility of multicollinearity, the mean VIFs are less than 10 in each regression, suggesting no significant multicollinearity issues in the main regression models. Third, this study uses the absolute values of discretionary accruals as an alternative proxy for earnings management and re-estimates the main regression. The results provide further support for the main results in Table 3, which also hold after controlling for the length of financial statements. In addition, this study considers the impact of the financial crisis and finds that the main results hold in the pre- and post-crisis periods. Finally, it divides the sample into information technology and non-information technology industries and finds that the results do not exhibit significant differences between these two subsamples.

Conclusions

This study uses a sample of 153 U.S.-listed Chinese firms to investigate the relationship between earnings management behavior and the tone of 20-F filings. We find that firms that employ more positive words in their 20-Fs engage in more earnings management, as concealing the firms’ true earnings status may help attract investors. More broadly, firms using more positive, uncertain, or modal words in their 20-Fs engage in more earnings management behaviors, suggesting that textual tone is informative. Textual analysis can therefore be a valuable tool for investors seeking to identify earnings management. The main results are robust after conducting a multicollinearity test, re-estimating the regression models using an alternative proxy for earnings management, controlling for the length of the financial statements, considering the effect of the financial crisis, and testing the results separately for information technology and non-information technology industries.

This study provides an innovative textual analysis method that uses financial information from Form 20-F to identify earnings management in the Chinese context. Li et al. (2021b) claim that owing to the complexity of human behaviors and changing social environments, the distributions of financial data are usually complex, and it is challenging to find patterns and provide reasonable interpretations. This study extends the literature by using Loughran and McDonald’s word lists as earnings management indicators. The results confirm that the tone in 20-Fs correlates with earnings management behaviors; specifically, a higher proportion of positive, uncertain, or modal words such as “good,” “may,” “could,” “depend,” and “approximately” in the 20-Fs implies more earnings management behaviors. The results are consistent with impression management theory, under which the strategic positioning of positive and negative words within a chief executive officer (CEO) letter is a subtle form of impression management and managers present information in such an order that the reader of the CEO letter has a more positive perception of the underlying message (Boudt and Thewissen 2019). Moreover, Li (2008) finds that the annual reports of firms with lower earnings are harder to read and that firms with easier-to-read annual reports have more persistent positive earnings. This study has important implications for international investors, analysts, and legislators seeking an effective tool for identifying earnings management behaviors in financial statements.

This study has some limitations. First, the Loughran and McDonald (2011) list is a six-category list that was later modified to contain the eight categories used in our study. This method is based on the unique words used in financial reports; thus, a better theoretical framework is needed to explain the association between tone categories and earnings management. Second, the sample ends in 2014. It could be extended to more recent years, but we consider only the pre- and post-2008 subsamples. Third, the main measure of earnings management is a simplified version of the traditional short-term Jones Model. Although we also use a modified version of the Jones Model proposed by Kothari et al. (2005), problems associated with the main measure may still be present because both measures are derived from the same model. Future studies should include alternative measures not based on the Jones Model.