Predictors of Epstein-Barr virus serostatus in young people in England
Epstein-Barr virus (EBV) is an important human pathogen that causes lifelong infection in more than 90% of people globally and is linked to infectious mononucleosis (arising from infection in the later teenage years) and several types of cancer. Vaccines against EBV are in development. To determine the most cost-effective public health strategy for vaccine deployment, setting-specific data on the age at EBV acquisition and risk factors for early infection are required. Such data are also important to inform mathematical models of EBV transmission that can determine the required target product profile of vaccine characteristics. We thus aimed to examine risk factors for EBV infection in young people in England, in order to improve our understanding of EBV epidemiology and guide future vaccination strategies.
The Health Survey for England (HSE) is an annual, cross-sectional representative survey of households in England during which data are collected via questionnaires and blood samples. We randomly selected individuals who participated in the HSE 2002, aiming for 25 participants of each sex in each single year age group from 11 to 24 years. Stored samples were tested for EBV and cytomegalovirus (CMV) antibodies. We undertook descriptive and regression analyses of EBV seroprevalence and risk factors for infection.
Demographic data and serostatus were available for 732 individuals. EBV seroprevalence was strongly associated with age, increasing from 60.4% in 11–14 year olds throughout adolescence (68.6% in 15–18 year olds) and stabilising by early adulthood (93.0% in those aged 22–24 years). In univariable and multivariable logistic regression models, ethnicity was associated with serostatus (adjusted odds ratio for seropositivity among individuals of other ethnicity versus white individuals 2.33 [95% confidence interval 1.13–4.78]). Smoking was less strongly associated with EBV seropositivity.
By the age of 11 years, EBV infection is present in over half the population, although age is not the only factor associated with serostatus. Knowledge of the distribution of infection in the UK population is critical for determining future vaccination policies, e.g. comparing general versus selectively targeted vaccination strategies.
Keywords: Epstein-Barr virus, Serostatus, Infectious mononucleosis, Cancer, Transmission, Risk factors
aOR: Adjusted odds ratio
BMI: Body mass index
ELISA: Enzyme-linked immunosorbent assay
HSE: Health Survey for England
NS-SEC: National Statistics Socio-economic classification
VCA: Virus capsid antigen
Epstein-Barr virus (EBV) is a herpesvirus that infects 90–95% of humans, causing lifelong infection [1, 2]. EBV infection during childhood is generally asymptomatic; however, acquisition of EBV during adolescence or early adulthood often causes infectious mononucleosis (IM), which can cause substantial morbidity during important educational periods in adolescents and young adults [4, 5]. EBV is associated with 1% of global cancers, particularly Hodgkin’s lymphoma, Burkitt’s lymphoma, nasopharyngeal cancer and gastric cancer.
EBV infection is currently neither treatable nor preventable by vaccination; however, vaccine candidates are in development. In phase II trials, a first-generation vaccine administered to healthy seronegative volunteers aged 16–25 years demonstrated protection against IM but not EBV infection. Second-generation vaccines elicited higher levels of antibody responses in animal models, and first-in-human trials are likely to begin soon. Mathematical modelling is essential to determine the effectiveness and cost-effectiveness of different vaccination strategies for reducing rates of EBV infection, IM, and EBV-associated cancers, taking into account factors such as vaccine efficacy, duration of protection and differing outcomes according to age at infection.
A greater understanding of EBV epidemiology, including the dynamics of EBV infection in different sub-populations, is necessary for the development of such models. EBV seroprevalence increases with age; 90–95% of people globally are infected by age 25, whilst 5–10% remain seronegative throughout life. The best public health strategy for the deployment of an infection-preventing vaccine may vary between settings; infection appears to occur at younger ages in resource-limited countries and thus children will need to be vaccinated early [10, 11, 12]. However, if the duration of vaccine-induced protection is not lengthy, vaccinated individuals may become susceptible to natural infection at an age where the consequences of infection are more severe, for example leading to IM or cancer.
Additionally, sub-optimal vaccine coverage even of a vaccine with a long duration of protection will lead to a higher age at infection amongst those who remain unvaccinated. In such situations it may be better to delay vaccination until the pre-teenage years, targeting individuals who remain EBV seronegative. Alternatively, a vaccine protecting against IM and EBV-associated diseases (such as certain cancers) could be administered to older children as they approach adolescence, which may be effective even with a shorter duration of protection. After the licensing of vaccine candidates, strategic discussions will need to take place nationally and be informed by accurate national data on the epidemiology of EBV infection.
In the United Kingdom, EBV seroprevalence increases rapidly in very young children, reaching 21% and 51% by the age of two years in children of white and Pakistani ethnicity, respectively. Another study showed that EBV seroprevalence then remained relatively constant, at around 55%, between the ages of five and 11 years. EBV seroprevalence was estimated at 75% in university students at 19 years and 92% by the age of 22 years. We recently published summary data on the seroprevalence of EBV in adolescents in England; however, to date no study has investigated factors associated with seropositivity that could inform a targeted vaccination strategy.
Our aim was to investigate the sociodemographic and lifestyle factors, particularly age, associated with EBV serostatus in children and young adults in England, and to discuss the implications of our findings for future EBV vaccination policy.
The Health Survey for England (HSE) is an annual, cross-sectional, representative survey of households in England. Its methods are described in detail elsewhere. For this study, and in order to parameterise a model of EBV transmission, we randomly selected individuals who participated in the 2002 HSE; 2002 was the most recent year in which survey participants gave consent for future studies to test their blood samples for blood-borne viruses. Our aim was to include 25 participants of each sex in each single year age group from 11 to 24 years, in order to fill a gap in the literature and capture the years at which infection is most likely to have clinical consequences. The participant IDs were selected randomly by the HSE; however, it was not possible at the time of sampling to determine whether the samples had already been used. As a result, more than 25 IDs were selected for each age-sex group to ensure there were sufficient samples for our analysis, and therefore there are not exactly 25 samples in each group (Additional file 1: Table S1).
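The sampling design described above can be illustrated with a minimal sketch of stratified random selection with oversampling. The sampling frame, the ID format, the function name `draw_stratified_ids` and the oversampling figure of 30 per group are all hypothetical; the text states only that more than 25 IDs were drawn per age-sex group.

```python
import random

AGES = range(11, 25)          # single-year age groups 11 to 24 inclusive
SEXES = ("female", "male")
TARGET_PER_GROUP = 25         # stated aim in the text


def draw_stratified_ids(ids_by_group, per_group, seed=0):
    """Randomly draw up to `per_group` IDs from each (age, sex) group."""
    rng = random.Random(seed)
    selected = {}
    for group, ids in ids_by_group.items():
        k = min(per_group, len(ids))  # a group may hold fewer IDs than requested
        selected[group] = rng.sample(sorted(ids), k)
    return selected


# Hypothetical sampling frame: 40 made-up IDs per age-sex group.
frame = {
    (age, sex): [f"{sex[0]}{age}-{i:03d}" for i in range(40)]
    for age in AGES
    for sex in SEXES
}

# Oversample (an assumed 30 rather than 25) to allow for samples already used.
drawn = draw_stratified_ids(frame, per_group=30)

# 14 age groups x 2 sexes x 25 participants
target_total = len(AGES) * len(SEXES) * TARGET_PER_GROUP
print(target_total)  # 700
```

The target of 700 follows directly from the stated design; the 732 individuals analysed exceed it because of the oversampling.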
Measuring seroprevalence of Epstein-Barr virus and cytomegalovirus infection
Stored blood serum samples collected between January 2002 and March 2003 were obtained from the HSE. Samples were posted to the laboratory within two days, where they were centrifuged, and the remaining serum was frozen and stored at −40 °C until analysis, which was completed in September 2017.
EBV virus capsid antigen (VCA)-specific IgG and CMV-specific IgG were detected in serum samples using commercial ELISA kits obtained from EUROIMMUN, Germany (EI2791–9601-G, EI2570-9601G). Assays were performed according to the manufacturer’s instructions and serum antibody concentrations were calculated using a standard curve. Data on the performance of the assays are detailed in Additional file 1: Table S2. Results were presented in relative units (RU/mL); samples with <16 RU/mL were considered negative, ≥16 to <22 RU/mL borderline and ≥22 RU/mL positive. Borderline results from the EBV VCA IgG ELISA were subsequently re-analysed with an EBV immunoblot assay (EUROIMMUN, Germany, DY2790G), which revealed that all borderline serum samples (n = 5) had reactivity to alternative EBV antigens; they were therefore considered seropositive.
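The cut-offs above can be written as a small classification rule. This is a sketch only, taking the borderline band to run from 16 to below 22 RU/mL; the function names and the `borderline_as_positive` switch are illustrative and not part of the assay documentation. The switch mirrors the study's handling of borderline samples (seropositive in the main analysis after immunoblot confirmation, seronegative in a sensitivity analysis).

```python
def classify_vca_igg(ru_per_ml):
    """Map an ELISA result in RU/mL to the band used in the text."""
    if ru_per_ml < 0:
        raise ValueError("concentration cannot be negative")
    if ru_per_ml < 16:
        return "negative"
    if ru_per_ml < 22:
        return "borderline"
    return "positive"


def serostatus(ru_per_ml, borderline_as_positive=True):
    """Collapse the three bands into a binary serostatus.

    borderline_as_positive=True follows the main analysis;
    False follows the sensitivity analysis.
    """
    band = classify_vca_igg(ru_per_ml)
    if band == "borderline":
        return "seropositive" if borderline_as_positive else "seronegative"
    return "seropositive" if band == "positive" else "seronegative"


# Illustrative values only, not study data.
for value in (3.0, 16.0, 21.9, 22.0):
    print(value, classify_vca_igg(value), serostatus(value))
```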
Data were analysed in Stata version 15.0. We weighted our sample, using the svy commands in Stata, to be representative of the English population in 2002 with respect to age and sex, utilising data from the Office for National Statistics. All stated percentages are weighted. Descriptive analyses of the study population were undertaken. ArcMap 10.3.1 was used to create a map of EBV seroprevalence by English Government Office Region.
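To illustrate what weighting to population age-sex totals does to a percentage, the following sketch computes post-stratification weights and a weighted prevalence in plain Python. It is not the Stata svy procedure used in the study and does not reproduce survey variance estimation; the strata, population counts and records are invented for the example.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_counts):
    """Weight per stratum = population share / sample share."""
    n = len(sample_strata)
    sample_counts = Counter(sample_strata)
    pop_total = sum(population_counts[s] for s in sample_counts)
    return {
        s: (population_counts[s] / pop_total) / (sample_counts[s] / n)
        for s in sample_counts
    }


def weighted_prevalence(records, weights):
    """records: list of (stratum, is_seropositive) pairs."""
    positive = sum(weights[s] for s, is_pos in records if is_pos)
    total = sum(weights[s] for s, _ in records)
    return positive / total


# Hypothetical data: two strata sampled equally but unequal in the population.
records = (
    [("A", True)] * 2 + [("A", False)] * 2   # stratum A: 2 of 4 seropositive
    + [("B", True)] * 4                      # stratum B: 4 of 4 seropositive
)
population = {"A": 300, "B": 100}

weights = poststratification_weights([s for s, _ in records], population)
crude = sum(is_pos for _, is_pos in records) / len(records)
weighted = weighted_prevalence(records, weights)
print(crude, weighted)
```

In this toy example the crude proportion overstates prevalence because the higher-prevalence stratum is over-represented in the sample relative to the population; weighting corrects for that.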
To investigate factors associated with being seropositive for EBV, we undertook logistic regression modelling. A causal inference framework was used to determine a priori factors to be included in multivariable models, from the available data collected in the HSE. We built two multivariable regression models.
A ‘whole-population’ model, which included our entire study population, examined the following factors: age, sex, ethnicity (categorised as ‘white’ or ‘other’ due to small numbers of non-white participants), body mass index (BMI; categorised as ‘underweight’ [BMI < 20], ‘healthy weight’ [20–<25], ‘overweight’ [25–<30] or ‘obese’ [≥30]), region of England and CMV serostatus.
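The BMI banding used in the model can be expressed as a simple lookup. This is a sketch using the cut-offs stated above; the function name is illustrative.

```python
def bmi_category(bmi):
    """Return the BMI band used in the whole-population model."""
    if bmi <= 0:
        raise ValueError("BMI must be positive")
    if bmi < 20:
        return "underweight"
    if bmi < 25:
        return "healthy weight"
    if bmi < 30:
        return "overweight"
    return "obese"


# Boundary values fall into the upper band, matching the half-open intervals.
for value in (19.9, 20.0, 25.0, 30.0):
    print(value, bmi_category(value))
```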
A second ‘adults-only’ model was restricted to individuals aged ≥16 years, and additionally included factors for which data were only available for adults: smoking status (never smoked, current smoker, smoked in past) and occupational category from the National Statistics Socio-economic classification (NS-SEC). The NS-SEC categorises occupations into higher managerial and professional roles (involving strategy/supervision), intermediate occupations (typically clerical, sales, service or technical positions which do not involve general planning or supervision), routine and manual occupations (involving basic labour), never worked or long-term unemployed, and other. We excluded individuals missing data on one or more variables.
Planned sensitivity analyses investigated the impact of excluding CMV serostatus as a predictor of EBV serostatus, and the impact of classifying the originally indeterminate serological results as seronegative rather than seropositive.
This study was approved by the University College London Research Ethics Committee (5683/002). The HSE obtained informed written consent for blood samples to be collected and stored for future analyses.
Table: The number and weighted percentage of individuals seropositive for EBV in England in 2002. Column headings: Number (weighted %). Row labels that survive: Age last birthday; East of England; Yorkshire and The Humber; Smoked in past; Higher managerial and professional; Routine and manual occupations; Never worked or long-term unemployed. [Table values not recoverable.]
Table: Univariable and multivariable logistic regression models of factors associated with Epstein-Barr virus seropositivity in England in 2002. Column headings: Univariable, OR (95% CI); Multivariable (whole population), aOR (95% CI); Multivariable (adults only), aOR (95% CI). Row labels that survive: Age group (years); Region of UK; East of England; Yorkshire and The Humber; Smoked in past; Higher managerial and professional; Routine and manual occupations; Never worked or long-term unemployed. [Table values not recoverable.]
Among adults, EBV seropositivity was higher among those who currently smoked than among those who had never smoked (aOR 4.29 [2.13–8.65]). There was no evidence of associations between sex, BMI or occupational category and EBV serostatus.
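As a reading aid for the odds ratio above, the sketch below shows that a confidence interval from a logistic regression is symmetric on the log scale, so the log odds ratio and an approximate standard error can be recovered from the reported figures. It assumes a normal critical value of 1.96; the survey-weighted model used in the study may have used a slightly different critical value, so the recovered standard error is approximate. The function names are illustrative.

```python
import math


def log_scale_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover log(OR), an approximate SE and the z statistic from OR and 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se, log_or / se


def wald_ci(log_or, se, z_crit=1.96):
    """Rebuild the confidence interval on the odds-ratio scale."""
    return math.exp(log_or - z_crit * se), math.exp(log_or + z_crit * se)


# Reported estimate for current smokers versus never smokers.
log_or, se, z = log_scale_summary(4.29, 2.13, 8.65)
low, high = wald_ci(log_or, se)

# The geometric mean of the bounds should sit close to the point estimate.
geometric_mean = math.sqrt(2.13 * 8.65)
print(round(log_or, 3), round(se, 3), round(geometric_mean, 2))
```

The rebuilt interval agrees with the reported bounds to within rounding, which is a quick consistency check on any reported odds ratio and interval.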
In sensitivity analyses, we firstly excluded CMV serostatus as a predictor of EBV serostatus, and secondly we classed indeterminate serology results (n = 5) as seronegative rather than seropositive. Both sensitivity analyses showed results consistent with our main analyses (Additional file 1: Table S3, Table S4).
The importance of EBV as a cancer-causing pathogen has generated international interest in developing an anti-infection vaccine. The cost-effectiveness of different strategies to deploy such vaccines will vary from setting to setting and is dependent on the epidemiology of the infection. For example, EBV’s association with IM means that vaccines that do not produce lifelong immunity may be better targeted towards subgroups which are likely to acquire infection in adolescence. In this observational study of factors associated with EBV seroprevalence among young people in England in 2002, we explored the distribution of seroprevalence by age and the sources of additional variability. We found a substantial increase in EBV seroprevalence with age among our sample population, associations with ethnicity and smoking, and a potential association with CMV seroprevalence.
A series of studies have demonstrated that EBV is generally acquired pre-adulthood, and that this varies between settings. Our findings regarding smoking fit with the prevailing narrative that there is an association between EBV and socioeconomic status, rather than smoking being an independent risk factor. Unfortunately, we did not have a good measure of socioeconomic status in our analysis; the NS-SEC does not account for familial socioeconomic status during childhood, which is probably more relevant to EBV seroprevalence than individual occupational status in young adults, and we were unable to measure socioeconomic status in children at all.
We found that EBV prevalence varied substantially between regions of the UK in univariable analyses and in the whole-cohort model, but not in the adults-only model, suggesting confounding between region and socioeconomic status. There was also a strong association between EBV seropositivity and ethnicities other than white, in both univariable and multivariable models. This may be the result of different mixing patterns (as people of ethnic minorities are more likely to live in larger households), different feeding practices, or residual confounding of socioeconomic status. CMV is another herpesvirus which infects a high proportion of the population from a young age, and has also been associated with EBV in other settings [24, 25].
In England, EBV infects 55% of the population by the age of 12, i.e. prior to adolescence, when the risk of IM increases. Cost-effective deployment of a cheap, infection-preventing vaccine with a lifelong duration of protection would thus likely involve targeting the early years. However, future vaccines may produce a shorter duration of immunity, potentially delaying infection and resulting in an increasing incidence of IM (and IM-associated cancers). This could be compounded by sub-optimal vaccine coverage increasing the average age at infection and consequently potentially increasing rates of IM, similarly to how sub-optimal coverage of the MMR vaccine led to an increase in congenital rubella syndrome in Greece [27, 28].
In such a scenario, targeted vaccine deployment to the social groups who acquire infection later (when the likelihood of IM is higher) might be considered, possibly with repeated dosing if required. Such targeting could be informed by the risk factors detected within this analysis, and data such as those presented here should be considered in conjunction with the characteristics of the vaccine available when determining what a vaccine policy should look like. If a vaccine were cheap and effective, then universal coverage would be appropriate. If the duration of protection were short, it may be prudent to give repeat doses of the vaccine to people who pick up the infection at the youngest age, which is linked to ethnicity and likely to socioeconomic status. The use of an expensive vaccine could be stratified on the basis of who is most likely to suffer EBV-related disease after infection, which we have studied separately.
The limitations of our work include the age of the data and the use of a cross-sectional study design, preventing determination of the temporality of the correlation between EBV and CMV infection. In our analysis, EBV seroprevalence was higher than CMV seroprevalence in all age groups, and both increased with age. We found that CMV was associated with EBV in univariable analyses, and in the adults-only model, but not in the whole-cohort multivariable model. As both EBV and CMV are associated with increasing age, particularly during adolescence, we would not expect an association between CMV and EBV to persist in the whole-cohort multivariable model. It is possible that as the association between age and EBV seroprevalence was less strong in the adults-only multivariable model (as EBV seroprevalence starts to saturate as people reach adulthood), there was enough of a residual effect that the association between EBV and CMV could be detected. Unfortunately, our sample size was not large enough to investigate the interactions between EBV, CMV and age in more detail. The association may result from shared genetic, immunological and/or sociodemographic risk factors, or one infection could increase susceptibility to the other. Longitudinal studies with serial testing are necessary to explore this association, and additional risk factors, in more detail.
We elected to measure IgG antibodies to the EBV VCA protein and whole CMV virus, as these antibodies are present in all infected individuals and persist for life. Although we did not test for IgM antibodies, and cannot exclude the possibility that some seronegative individuals may have been recently infected, we note that VCA-specific IgG and IgM antibodies usually appear contemporaneously and therefore we would expect the number of such individuals in our study to be low.
Knowledge of the distribution of EBV infection among young population groups in England is critical for determining future vaccination policies, including the cost-effectiveness of general versus selective approaches. Data such as those presented here should be used together with detailed information on vaccine characteristics, the implications of remaining EBV-uninfected for life, the ramifications of delayed infection, and the financial costs of IM and EBV-associated cancers to inform such policies.
We thank our colleagues at UCL and NatCen Social Research, and the interviewers, research nurses and participants of the Health Survey for England, and Shaun Scholes for assistance weighting the HSE data for analysis.
HRS, JL and GT designed the study. OT and GT conducted the serological testing. JRW conducted the data analysis and drafted the paper. JRW, CJ, HRS, JL and GT interpreted the results. All authors critically revised the paper and approved the final version for publication.
This work was supported by the Wellcome Trust. The funding source had no role in the study design, collection, analysis or interpretation of the data, the writing of the paper or the decision to submit for publication. The corresponding author had full access to all data in the study and had final responsibility to submit the paper for publication.
Ethics approval and consent to participate
This study was approved by the University College London Research Ethics Committee (5683/002). The HSE obtained informed written consent for blood samples to be collected and stored for future analyses. We were given permission to access the data by the Health Survey for England.
Consent for publication
GT reports personal fees from Genocea Biosciences, outside the submitted work. CJ is an Associate Editor at BMC Public Health. All other authors have no competing interests.
- 7. Sokal EM, et al. Recombinant gp350 vaccine for infectious mononucleosis: a phase 2, randomized, double-blind, placebo-controlled trial to evaluate the safety, immunogenicity, and efficacy of an Epstein-Barr virus vaccine in healthy young adults. J Infect Dis. 2007;196(12):1749–53.
- 9. Hjalgrim H, Friborg J, Melbye M. The epidemiology of EBV and its association with malignant disease. In: Arvin A, et al., editors. Human herpesviruses: Biology, Therapy and Immunoprophylaxis. Cambridge: Cambridge University Press; 2007.
- 12. Winter JR, et al. Predictors of Epstein-Barr virus serostatus and implications for vaccine policy: a systematic review of the literature. Under review.
- 15. Kessell I, et al. Epstein-Barr virus infection in children entering a paediatric unit. J Infect. 1980;2(3):269–74.
- 17. Deverill C, et al. Health Survey for England 2002: The Health of Children and Young People. Methodology & Documentation. Sproston K, Primatesta P, editors. The Stationery Office; 2002.
- 18. Health & Social Care Information Centre. The Health Survey for England Bloodbank Project: Requests for extraction and analysis of HSE stored blood samples. 2013.
- 19. Office for National Statistics. Dataset: Population Estimates for UK, England and Wales, Scotland and Northern Ireland. 2018 [cited 16/01/2018]. Available from: https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/populationestimatesforukenglandandwalesscotlandandnorthernireland.
- 20. Office for National Statistics. 2011 Census: boundary data (United Kingdom) [data collection]. 2011 [cited 16/01/2018]. Available from: http://census.ukdataservice.ac.uk/get-data/boundary-data.aspx.
- 21. Office for National Statistics. The National Statistics Socio-economic classification (NS-SEC). 2019 [cited 06/10/2019].
- 26. Vynnycky E, White RG. An introduction to infectious disease modelling. Oxford University Press; 2010.
- 29. Bakkalci D, et al. Risk factors for Epstein-Barr virus-associated cancers: a systematic review, critical appraisal, and mapping of the epidemiological evidence. Under review.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.