Introduction

The coronavirus disease 2019 (COVID-19) pandemic, which began in late 2019 and is caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to international lockdowns that halted daily activities, overwhelmed health systems and depleted medical resources globally. As of June 2020, according to World Health Organization (WHO) data, over eight million cases had been recorded globally and the death toll stood at 455,535. In the United Kingdom (UK), the numbers of infections and deaths were 299,255 and 42,153, respectively [1]. Thus, as per official figures, despite forming under 1% of the world population, Britain had witnessed approximately 10% of all reported COVID-19 deaths worldwide [1].

Amongst the information surrounding COVID-19, increased attention has been drawn towards the disproportionate rate of severe adverse outcomes within Black, Asian and minority ethnic (BAME) groups in the UK. A study by the Intensive Care National Audit and Research Centre (ICNARC) reported that approximately a third of UK COVID-19-related deaths within critical care were in those who identified as BAME [2]. UK Government figures have reported that, even after adjusting for confounding variables, Bangladeshi (hazard ratio, HR = 2.02), Pakistani (HR = 1.44), other Black (HR = 1.35), Chinese (HR = 1.28), Indian (HR = 1.22) and Black Caribbean (HR = 1.10) individuals had greater rates of mortality compared with White males [3]. This is underscored by a higher incidence of COVID-19 witnessed in ethnic minorities [4]. A limitation of the Government’s inquiry was its lack of recommendations, which are paramount to reducing the burden of disease in BAME groups.

Information dissemination amongst a population is key during a pandemic, and the Internet plays a vital role in it. In recent times, online traffic to “.gov.uk” sites (i.e. official UK Government online material) has been high, peaking at over 29 million unique users from 23rd of March, when lockdown measures were introduced [5]. Whilst there has been a surge in the number of users accessing the Government’s material on public health, whether the information is intellectually accessible and understandable is another question. Basch et al. investigated the readability of public health material for the average native English-speaking American and showed that much of this literature was of an average-to-high grade reading level, which is beyond the grasp of many [6]. Given the gravity of the public health crisis, it is imperative that this information is not only physically accessible to the average layperson but also intellectually accessible. Furthermore, given the diverse range of expatriates living in the UK, who may have variable degrees of English literacy, the language in which this online information is written may hinder the understanding of COVID-19 in some communities, regardless of its readability [7].

In this study, we investigated the readability of a broad base of online information relating to COVID-19 and the rulings made by the UK Government for British Internet users. This included information on behavioural measures for the population, such as social distancing, as well as employment rulings, such as the furlough scheme, as these were put in place to provide a continued income to those who are unable to work due to lockdown and are therefore potentially unable to socially isolate. We also investigated the accessibility of this information for non-native English speakers by counting how often it was available in other languages and how often it was accompanied by graphic information. Finally, we assessed whether the source of the information and the type of information being conveyed were associated with readability and with the availability of multilingual text.

Methods

Data Collection

A cross-sectional study was performed. The terms “Coronavirus”, “COVID-19”, “Lockdown”, “Social Distancing”, “Handwashing”, “Furlough Scheme” and “Sick pay” were entered into Google, which was selected due to its popularity. All websites that were based upon or were relevant to the UK lockdown rulings were included in our analysis. Articles were excluded if the content was based upon other nations’ rulings or reported news on non-health or non-population policy issues, such as political or diplomatic events transpiring in these times. Social media posts were also excluded, as was specialised material such as academic articles. The included articles were categorised by their source type: government (both local and national), non-governmental organisations (NGO; including trade unions, charities and support groups), commercial sites (news and businesses) or National Health Service (NHS). The articles were also categorised by the nature of the material each page was covering: general information, information on population practice rulings and employment rules.

Text from all websites was analysed with internationally recognised readability tools to obtain the following measures: Simple Measure of Gobbledygook (SMOG), Gunning Fog Index (GF), Flesch-Kincaid Grade Level (FK), Coleman-Liau Index (CL) and Automated Readability Index (ARI), all calculated via the online tool Readable.io. These models have been shown to be valid measures of readability and are based upon the number of syllables, the length of sentences and the number of sentences within the selected text. SMOG is, however, considered to be the international standard measure of readability [9]. The resultant scores correspond to the equivalent United States (US) school grade that the inputted text would be suited to; Table 1 shows the ages corresponding to each US grade. The average reading age in the UK is 13–14 years, which is equivalent to US grade 8. The reading age generated was the defined outcome variable of our analysis.
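
As an illustration of how these grade-level scores are derived, the sketch below implements the commonly published forms of the five formulas from pre-computed text counts. It is not a description of how Readable.io calculates its scores, which may differ in how words, sentences and syllables are counted. The function and parameter names, and the counts in the usage lines, are assumptions made for illustration only.

```python
import math


def smog(polysyllables: int, sentences: int) -> float:
    """SMOG grade; polysyllables = words with three or more syllables."""
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index; complex_words = words with three or more syllables."""
    return 0.4 * (words / sentences + 100 * complex_words / words)


def flesch_kincaid(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level."""
    return 0.39 * words / sentences + 11.8 * syllables / words - 15.59


def coleman_liau(letters: int, words: int, sentences: int) -> float:
    """Coleman-Liau Index, using letters and sentences per 100 words."""
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def ari(characters: int, words: int, sentences: int) -> float:
    """Automated Readability Index."""
    return 4.71 * characters / words + 0.5 * words / sentences - 21.43


# Hypothetical counts for a 100-word passage (not data from this study).
print(round(flesch_kincaid(words=100, sentences=5, syllables=150), 2))
print(round(gunning_fog(words=100, sentences=5, complex_words=10), 2))
```

All five formulas reward shorter sentences and shorter words, which is why sentence length and polysyllabic vocabulary dominate the resulting grade.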

Table 1 A table showing the school grades of the US education system and their corresponding ages

Fig. 1 Health-social-behavioural model for disease acquisition, adapted from Pareek, M. et al. Ethnicity and COVID-19: an urgent public health research priority, Lancet 2020 [8]

Furthermore, websites were marked for the availability of other languages. According to 2011 census data, the top 5 languages spoken in the UK other than English were Polish, Punjabi, Bengali, Urdu and Gujarati [22], and as such these languages were specifically searched for within the selected websites. The presence of graphics accompanying the text was also recorded.

Statistical Analysis

The pooled sample and its categorisation by source and by content type were characterised. The distribution of the scores was assessed using the Kolmogorov-Smirnov test. As the data were non-normally distributed, they are displayed as medians with 25th and 75th percentile bounds (interquartile range, IQR) (Table 2).

Table 2 A summary of the included websites with corresponding median readability score with interquartile ranges (IQR). The number of readable websites (i.e. a score ≤ 8) with corresponding percentages is also displayed. SMOG, Simple Measure of Gobbledygook; GF, Gunning Fog Index; FK, Flesch-Kincaid Grade Level; CL, Coleman-Liau Index; ARI, Automated Readability Index

The scores were then categorised as “readable” or “difficult to read” according to how they compared with the average reading age of the UK. Scores ≤ 8 (i.e. equivalent to US grade 8 and below, corresponding to ages 14 and below) were graded as “readable”, and scores > 8 were deemed “difficult to read”. The number of websites scoring ≤ 8 was displayed as counts and percentages. The presence of other languages and the use of graphical aids were also displayed as counts and percentages.
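
The descriptive summary described above can be sketched as follows. This is a minimal illustration using the Python standard library rather than SPSS, so the percentile interpolation method is an assumption and may not match the software used in the study. The function name `summarise` and the example scores are hypothetical.

```python
from statistics import median, quantiles

READABLE_MAX_GRADE = 8  # US grade 8, the "readable" threshold used in the text


def summarise(scores):
    """Return the median, the IQR bounds and the count/percentage of scores <= 8."""
    # quantiles() with n=4 returns the 25th, 50th and 75th percentiles.
    # The default "exclusive" method is an assumption, not the study's setting.
    q1, _, q3 = quantiles(scores, n=4)
    readable = sum(1 for score in scores if score <= READABLE_MAX_GRADE)
    return {
        "median": median(scores),
        "iqr": (q1, q3),
        "readable_n": readable,
        "readable_pct": 100 * readable / len(scores),
    }


# Hypothetical grade-level scores for illustration only, not data from the study.
example_scores = [7, 8, 9, 10, 11, 12, 13]
print(summarise(example_scores))
```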

The difference between the medians of each group of continuous data (i.e. readability scores) was analysed with the ANOVA (analysis of variance) test. The significance of the observed differences between the various groups of nominal data (i.e. the number of readable websites and the number of websites that used supplementary graphic information or alternative languages) was assessed using the chi-squared test. P values of 0.05 or less were considered to be statistically significant.
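
To make the comparison of nominal data concrete, the following is a minimal sketch of a Pearson chi-squared test on a contingency table of website categories against readable and difficult-to-read counts. It is an illustration only and not the SPSS procedure used in the study: no continuity correction is applied, significance is judged against standard critical values at the 0.05 level rather than by computing a P value, and the counts in the example table are hypothetical.

```python
def chi_squared(table):
    """Return the Pearson chi-squared statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column categories.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Standard chi-squared critical values at the 0.05 level for 1 to 3 degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def is_significant(table):
    """True if the statistic reaches the 0.05 critical value for its df."""
    statistic, df = chi_squared(table)
    return statistic >= CRITICAL_0_05[df]


# Hypothetical counts (rows: four source types; columns: readable, difficult to read).
example_table = [[10, 30], [12, 28], [15, 25], [9, 11]]
statistic, df = chi_squared(example_table)
print(round(statistic, 3), df, is_significant(example_table))
```

The test is only reliable when expected counts are not too small, which is a relevant caveat for sparse categories such as websites offering translated material.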

All statistical analysis was performed using IBM SPSS Version 25 (IBM Corp., released 2017. IBM SPSS Statistics for Windows, Version 25.0, Armonk, NY: IBM Corp).

Results

A total of 148 websites met our inclusion criteria for analysis. Categorisation by source type was as follows: government (n = 41), NGO (n = 34), commercial (n = 56) and NHS (n = 17). Categorisation by content type was as follows: general information (n = 46), population practice (n = 80) and employment rules (n = 22). The characteristics of our data are shown in Tables 3 and 4. The median readability scores for all calculators exceeded 8, suggesting that the reading age of the material was above the average UK reading age of 13–14 years.

The included websites deemed readable (i.e. with a score ≤ 8) were then counted and categorised by their source and by their content (Table 3).

Table 3 A comparison of the count of readable websites (a score ≤ 8) by category. SMOG, Simple Measure of Gobbledygook; GF, Gunning Fog Index; FK, Flesch-Kincaid Grade Level; CL, Coleman-Liau Index; ARI, Automated Readability Index; NGO, non-governmental organisations; NHS, National Health Service

Overall, few of the websites in our data were readable. In the vast majority of calculations, more than half of the websites were deemed difficult to read. The only exception was NHS websites, of which 52.9% were readable according to the FK and ARI measures. No statistically significant difference was observed between the readability scores of the different categories, suggesting that readability did not depend on the source or theme of the information. Nevertheless, scores were consistently high, indicating that the text was difficult to read, and the counts of readable websites (i.e. those scoring ≤ 8) were low within our cohort.

Very few of the included URLs contained visual graphics or the same information in alternative languages. Only 3.4% made the same information easily available in other languages and only 6.8% provided accompanying graphical information (Table 4). The observed differences between the source and content categories were small and not statistically significant, suggesting that there is no association between the presence of these features and the URL type.

Table 4 The total number and resultant percentages of URLs containing graphical information and the same content in alternative languages. NGO, non-governmental organisations; NHS, National Health Service

Discussion

We believe the significance of this study lies in offering insight into a key yet overlooked facet of the COVID-19 pandemic: the lack of appropriate COVID-19 education materials available to BAME groups.

Our study is the first, to our knowledge, to investigate the readability of material relating to COVID-19 available in a UK cohort with a broad search strategy, encompassing both general and specific terms relating to the pandemic. This was accompanied by an analysis of the presence of graphics and of available translated material. We demonstrate that the vast majority of information readily available to the British public is not easily readable for most of the adult population. We also show that there is a lack of visually supportive and translated material, which can alienate many who do not speak English as a first language, such as some of those from BAME communities, as well as those who have learning difficulties. In addition, the poor readability of this information is important to consider when assessing why the UK has had a high mortality rate amongst the general population.

Our results may provide one explanation for the higher rate of infection seen in BAME groups: possibly lower appreciation and understanding, and subsequently poorer practice, of the important principles of the pandemic, such as social distancing and symptom recognition. This explanation is supported by the widely demonstrated finding that health literacy is significantly lower in ethnic minority and migrant populations, even when adjusting for education levels, and that these populations subsequently have poorer health outcomes for the same medical conditions compared with their native counterparts [10,11,12,13].

Furthermore, our results show that there is a lack of official resources easily available to BAME groups who do not speak adequate English. The vast majority of websites did not have readily available translated material that the authors could find intuitively whilst on the websites. This issue was raised in the UK Parliament [14], whose response was that “Public Health England’s health and safety information on COVID-19 is currently not available in Roma, Urdu, Polish or Bengali. However, resources are available in [Mandarin, Cantonese, Thai, Japanese, Korean, Malay, Farsi, modern Arabic and Italian]. These materials are available at international airports, ports and international train stations”. These members of society may be forced to rely excessively on word of mouth, which has previously been documented as prone to embellishment and distortion of information [15,16,17]. As a result of the lack of readily available information directly from the Government to non-English speakers, there is a risk of the spread of misinformation and the misinterpretation of key rulings.

The importance of language is clear given the trends we are witnessing in the UK. The Bengali population has been highlighted as amongst the highest at-risk ethnic minorities in the UK, ranking second with respect to adverse outcomes [18]. According to British Government data, 3% of the British population from a Bengali background “could not speak English” and 13.2% “could not speak English well”, the highest reported rates in these respective categories of all ethnicities. Only 47% of the British-Bengali population reported speaking English as their main language [19]. We cannot imply causation from this observation; however, the trend we have observed raises an important question as to whether the lack of educational resources contributes to the rate of infection and poor outcomes, and this needs further investigation.

The lack of readable and translated material has significance when considering the model suggested by Pareek et al. [8]. We must reflect on the biological, social and behavioural factors displayed in Fig. 1 when investigating the disproportionate effect COVID-19 has had on BAME groups. The latter two factors have been major targets in attempts to decrease the reproduction number of COVID-19, otherwise known as the “R number”. The R number can be substantially reduced by online health campaigns, which, as discussed previously, are a primary method of information dissemination. Much-needed focus has been given to the exploration of biological causes of BAME COVID-19 deaths, such as vitamin D and ACE receptor levels in BAME groups [4]. In conjunction with this, we propose that behavioural and social constructs of BAME communities need to be addressed thoroughly. With respect to COVID-19, many BAME communities exhibit high-risk behaviours and social activities. These include, but are not limited to, living in extended families, weekly large religious gatherings, potential lack of knowledge of coronavirus due to low health literacy and delays in health seeking, a phenomenon that is well documented in BAME groups for other disease profiles [13, 20]. These factors can increase the risk of acquiring COVID-19 and can be significantly altered by online health campaigns.

The lack of readable, translated and graphic information may have limited the ability of online health campaigns to influence the behavioural and social facets of the model described above. This may offer an insight into the high mortality in BAME communities, as well as in the UK population in general.

Building on this, considering that many BAME groups fall into lower socio-economic groups [21] and have been documented to have lower health and functional literacy levels, these findings have an amplified level of importance for BAME groups and, like COVID-19, will disproportionately affect them. Although this may not be the leading cause behind the increased risk BAME groups face, this potential institutional barrier to information acquisition through poorly readable and untranslated material should not be overlooked as a contributing factor. This is especially important when considering that poor health literacy has been a documented cause of bad outcomes and reduced information retention [13].

Our data hold major significance with respect to the COVID-19 pandemic as they highlight one simple, yet vital, concept that health authorities and governments have overlooked: public messaging of key information across ethnic minority groups is not intellectually accessible. Alterations to simplify and create widely accessible online literature may be key to educating the population and, in turn, may assist in preventing further waves of COVID-19 cases in areas of the UK with high proportions of BAME populations. Our results highlight an active problem and are especially vital to consider when new developments arise that can ease the burden of COVID-19 and require effective dissemination of information, such as potential vaccines, new treatments, potential second waves or updates in social distancing rules. Key information such as symptom recognition, social distancing rules, treatments, hygiene recommendations and any key developments must be made available in all commonly spoken languages and uploaded on key central web addresses such as Governmental websites, key independent bodies such as the BBC and official health bodies such as the NHS. In addition, this translated material must be tested in focus groups and sample populations prior to release to ensure that it is intellectually accessible and readable to the whole spectrum of people in BAME populations. This should be accompanied by making all official literature available in graphic formats to further enhance its intellectual accessibility. Furthermore, our results have major implications for wider medical issues and offer insight into why other diseases may disproportionately affect the BAME population. Similar analysis must be carried out on these diseases to investigate whether publicly available health literature is at an appropriate level for BAME groups.

Limitations

The main limitation of this paper is that it reflects a snapshot in time. The Internet has the potential to change drastically over a short time, and information on COVID-19, in particular, is liable to change given the situation’s volatile nature. The study was conducted during the initial stages of the pandemic in May 2020; therefore, the data sets analysed are likely to reflect only this stage of the pandemic. Whilst we aimed to analyse only reputable sources such as established news websites and avoided social media posts, the quality of information on web addresses was not formally assessed. Further studies will be needed to assess the quality of the content uploaded. In addition, we recommend formal analysis of the accessibility of web addresses. Our searches were location-specific; whilst we gathered regional URLs from a range of areas across the UK, the order and priority of search results may vary between different regions and cities in the UK. Lastly, there was a variety of video resources which we were unable to analyse as part of our study and which may be used widely by the public for self-education.

Conclusion

We have demonstrated that digital text-based resources for the general British reader may be inappropriate in terms of readability and thus may offer an insight into the alarming statistics seen regarding the effect of COVID-19 in the UK. Factors contributing to this are sentence length and use of jargon. Only a small number of websites included graphics to accompany their content, which are an especially useful tool for clearly explaining concepts. The information disseminated online during the pandemic in the UK also lacks translated materials, which can alienate those who do not speak English as a first language. This can subsequently leave BAME populations, the populations with the worst outcomes from COVID-19, relatively ill-informed compared with their English-speaking counterparts. We suggest that resources from public health bodies within the UK take into consideration the length of their sentences, the number of words and the use of words with three or more syllables. It is also imperative that official sources provide readily available translated resources, accompanied by graphical information to ensure ease of comprehension. Urgent health promotion campaigns initiated by health authorities must target BAME populations to tackle this public health emergency.