Variability of E2 protein-coding sequences of bovine viral diarrhea virus in Polish cattle

Bovine viral diarrhea virus (BVDV) belongs to the Pestivirus genus of the Flaviviridae family and has a worldwide distribution, being one of the main causes of economic losses in cattle raising. The genome of pestiviruses is a single strand of positive-sense RNA with a length of 12.3 kb, which encodes one open reading frame flanked by untranslated regions. The E2 glycoprotein is required for binding to cell-surface receptors and also contains major antigenic determinants. The nucleotide sequence coding for E2 is the most variable part of the viral genome. The heterogeneity that exists among circulating strains causes problems in the development of effective vaccines and reliable diagnostics. In this study, the E2 glycoprotein coding sequences of 14 Polish BVDV-1 strains were analyzed for the first time. The strains belong to four subtypes: 1b (n = 7), 1f (n = 3), 1s (n = 3), and 1r (n = 1). These sequences showed evidence of strong purifying (negative) selection. However, we also identified positively selected sites. The availability of E2 sequences of Polish BVDV strains for reference, knowledge gained through epitope prediction attempts, and information on protein glycosylation sites can afford a better understanding of host–pathogen interactions.

The virion has a diameter of 40-60 nm and is surrounded by a lipid membrane. The lipid membrane contains three glycosylated envelope proteins: Erns, E1, and E2 [5]. The two envelope glycoproteins E1 and E2 recognize host cells by binding to the cell-surface receptors CD46 and LDL-R [6] and are required for membrane fusion and cell entry [7]. E1 is assumed to function as a membrane anchor for E2 [8], which also contains major antigenic determinants [9,10]. The BVDV genome has a high mutation rate, and the E2 glycoprotein coding fragment is the most variable part of the genome. The heterogeneity that exists among circulating strains causes problems in the development of effective vaccines and sensitive diagnostics.
The present study is the first to provide sequence information for the E2 glycoprotein of Polish BVDV strains belonging to the subtypes most often isolated in Poland. We focused on determining which E2 glycoprotein regions are subject to positive selection and on detecting protein glycosylation sites. These data may be a key indicator of the nature of the host-virus interaction.
In this study, we used BVDV-positive serum samples from previous detection and genotyping studies of clinical suspects, herds undergoing virus eradication, and herds vaccinated with a killed BVDV-1a vaccine [11]. Total RNA was extracted from 500 µL of serum using TRI Reagent (Sigma-Aldrich, USA) following the manufacturer's instructions. Reaction mixes for standard RT-PCR were prepared as described previously [11]. A mix of four primer pairs specific to the E2-encoding fragment [12] and to regions flanking the E2-encoding sequence [13-15] was used. The primer sequences are listed in Table 1.
Positive RT-PCR results, visible as a band of about 1019-1200 nucleotides on an agarose gel, were obtained for 16 of 30 samples. However, only 14 of the resulting viral sequences were of sufficient quality for further analysis; these were submitted to GenBank under the accession numbers MK675059-MK675072. The strains for which sequences were obtained are listed in Table 2, together with the geographical origin of the samples. The analyzed sequences formed four groups on the phylogenetic tree (Fig. 1), corresponding to the same subtypes as those assigned in the previous study based on the 5′UTR: 1b (n = 7), 1f (n = 3), 1s (n = 3), and 1r (n = 1) [11]. Subtype 1b is currently the most often isolated subtype of BVDV in Poland. Almost a quarter of all isolated viruses belonged to the 1f subtype. The remaining two subtypes, 1r and 1s, were identified recently and are rare.
We were able to amplify the E2 region from a much smaller number of strains than was possible for the 5′ untranslated region in a previous study [11]. Owing to its high variability, the E2 region is not suitable as a target for diagnostic purposes, but it can provide useful information when results from the 5′UTR are unclear. The analyzed fragment of the E2 region is longer and less conserved than the 5′UTR, and the phylogenetic tree constructed for this region has higher bootstrap values.
The percentage of sequence identity of our strains was calculated based on an 892-nucleotide fragment (nucleotide positions 55-946 and amino acid positions 19-315 of the C86 vaccine strain in the E2 gene). The fragment analyzed is shorter than the full-length E2 sequence due to unsatisfactory sequence quality at its ends. The identity between strains ranged from 70.4 to 98.5%, with the smallest differences occurring between subtypes 1f and 1s (79.5-81.75% identity) and the largest between 1b and 1f (70.4-73% identity). Identity among strains within subtypes 1b, 1s, and 1f was 84.7-96%, 94.2-98.5%, and 83.9-98.5%, respectively. The analyzed region of E2 thus displayed evolutionary distances of 1.5-16.1% within subtypes and 18.3-29.6% among subtypes. Similar evolutionary distances were also found among Chinese strains [16]. The E2 sequences from this study displayed 437/892 (48.9%) variable sites at the nucleic acid level: 83 singleton variable sites and 354 parsimony-informative sites. The total number of mutations was 597. At the amino acid level, 43.77% of sites were variable, with up to six amino acid variants per site. Six variants occurred at positions 57, 159, and 198, and five variants at 13 different positions.
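The identity and site-count statistics above can be illustrated with a minimal Python sketch. It assumes a gap-free alignment of equal-length nucleotide sequences; the toy sequences and the function names `percent_identity` and `classify_sites` are hypothetical and are not the data or software used in this study.

```python
from collections import Counter
from itertools import combinations


def percent_identity(a, b):
    """Percentage of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def classify_sites(alignment):
    """Count singleton and parsimony-informative variable columns.

    A column is parsimony-informative when at least two different states
    each occur in at least two sequences; any other variable column is
    counted as a singleton site.
    """
    singleton = informative = 0
    for column in zip(*alignment):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # invariant column
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
        else:
            singleton += 1
    return singleton, informative


# Toy alignment (hypothetical, not sequences from this study).
toy = ["ATGCCA", "ATGCTA", "ACGCTA", "ACGTTA"]

for (i, a), (j, b) in combinations(enumerate(toy), 2):
    identity = percent_identity(a, b)
    # The uncorrected p-distance is 100 minus the percent identity.
    print(f"seq{i} vs seq{j}: identity {identity:.1f}%, p-distance {100 - identity:.1f}%")

singleton, informative = classify_sites(toy)
print(f"singleton sites: {singleton}, parsimony-informative sites: {informative}")
```

In this sketch the distance is the simple uncorrected proportion of differing sites, which is why a within-subtype identity of 83.9-98.5% corresponds to a distance of 1.5-16.1%.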
We investigated the nature of the selection pressure acting on the envelope glycoprotein E2 gene by estimating the ratio of nonsynonymous to synonymous substitution rates (dN/dS, ω) using DnaSP 6.11 [17]. The E2 glycoprotein coding region showed evidence of strong purifying selection throughout the sequence (ω = 0.143), i.e., selection that stabilizes the sequence by removing deleterious genetic polymorphisms that arise through random mutations. Positive selection is a type of natural selection in which a specific phenotype is favored over other phenotypes. We identified several sites in the E2-coding sequence that are under positive selection. The E2 glycoprotein, which is the most immunodominant viral protein of pestiviruses, contains major type-specific epitopes recognized by specific antibodies. This protein is presented by antigen-presenting cells and was identified as a target for cytotoxic T-cells [18]. Positive selection might contribute to the avoidance of T-cell recognition of the E2 glycoprotein. For BVDV-1, amino acid positions essential for neutralization by monoclonal antibodies (mAbs) have been mapped in the N-terminal part of E2 [19]. Two antigenic regions of the E2 glycoprotein were mapped between amino acid positions 1-70 and 70-77 [20]. In this study, we identified five sites showing positive selection in the 1-70 fragment. Other studies showed that a single nucleotide mutation changing one of the four amino acids in the 71-74 region can defeat neutralization by a single mAb [20]. Changes at position 72 observed in mutants of the Singer BVDV-1a strain affected the binding and neutralization of mAbs 157 and 348 [20]. Leucine at position 74 was also important for the binding of the WB166 antibody [21]. The Singer strain showing a different change, at position 32, managed to evade antibody 157. Similarly, the Hastings strain of BVDV-1 also escaped this antibody through a change at the same position [20]. We identified mild positive selection at this position in Polish strains of BVDV-1.
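For reference, the selection ratio ω discussed in this paragraph is conventionally defined and read as follows. This is the standard textbook interpretation, not a detail taken from the DnaSP analysis itself:

```latex
\[
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying (negative) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
\]
```

Here $d_N$ is the number of nonsynonymous substitutions per nonsynonymous site and $d_S$ is the number of synonymous substitutions per synonymous site. A gene-wide value such as ω = 0.143 is an average over all codons, so it is compatible with individual codons being under positive selection.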
The BVDV-1-specific epitope X1 (amino acids 60-90 of the E2 glycoprotein), recognized by mAb 921-6 [21], was mapped to the C-terminal portion of the A domain. In this region, we found one site showing positive selection: amino acid 63.
An immunodominant region at amino acid positions 121-150, corresponding to the linear neutralizing epitope of the E2 protein of CSFV, was recognized by polyclonal antibodies [22]. Our strains differed at nine positions in this region, particularly in its C-terminal part. We also observed positive selection in this region, specifically at position 139.
Amino acids 96 and 152 showed the strongest positive selection in the strains from this study. However, the significance of this selection requires further research.
The sequence coding for the receptor binding domain, which usually mediates binding of the virion to the host cell surface, is located at amino acid positions 141-170 of E2 in CSFV, indicating that binding of the receptor CD46 might occur through this region [23]. This corresponds to the polymorphic 142-172 segment of BVDV E2. We did not detect positive selection in the area of the receptor binding site, but high variability within a protein that has affinity for a cell receptor may contribute to a change in tropism.
Four N-glycosylation sites in the E2 glycoprotein were predicted by NetNGlyc 1.0 at positions N117, N186, N230, and N298. One extra glycosylation site at position N25 (Fig. 1) was found in the 164-DM/15 and 165-DM/15 strains (subtype 1f), which were detected in samples from the same herd (Table 2). N-linked glycosylation is crucial for protein functions such as entry into host cells, protein antigenic properties, proteolytic processing, and protein trafficking [24].
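Candidate N-glycosylation sites of the kind reported above are defined by the N-X-S/T sequon, where X is any residue except proline. The following minimal Python sketch only locates that motif. It does not reproduce NetNGlyc, which additionally scores each sequon with neural networks. The peptide and the function name `find_sequons` are hypothetical.

```python
import re

# N-X-S/T sequon, where X is any residue except proline. The lookahead
# lets overlapping sequons (e.g. "NNSS") be reported individually.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein, first_position=1):
    """Return positions of asparagines that start an N-X-S/T sequon.

    first_position is the numbering of the first residue in `protein`,
    so a fragment that starts at residue 19 can keep reference numbering.
    """
    return [m.start() + first_position for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, not an E2 sequence from this study.
toy_peptide = "MKNATLLNPSGGNCTAANGS"
print(find_sequons(toy_peptide))                     # numbering starts at 1
print(find_sequons(toy_peptide, first_position=19))  # numbering starts at 19
```

The asparagine at position 8 of the toy peptide is skipped because it is followed by proline, which is the main exclusion rule of the sequon.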
Strains 187-AN/17 and 194-TC/17 were isolated from vaccinated animals (Table 2). The vaccine was based on the C86 strain (GenBank accession number Y19123), which belongs to subtype 1a. The nucleotide sequence identity between the tested field strains and the C86 sequence taken from the GenBank database was 72.6-73.2%, and the identity at the protein level was 75.4-75.7%.
BVDV vaccines are mostly based on single subtypes, including BVDV-1a, BVDV-1b, or BVDV-2a. Our results and those of other investigators indicate that current vaccines based on a limited number of subtypes do not provide effective protection against other subtypes of BVD virus [25]. The humoral immune response directed against E2 is considered the main line of defense against the BVD virus circulating in the field [19]. Sequence identity between our strains and the vaccine strain C86 in the region spanning amino acid positions 19-90, where many antibody-recognizable epitopes are located, was only 55.5-63.8%. In this fragment, the sequence identity between our strains and the C86 vaccine strain is therefore even lower than in the entire E2 sequence studied here.
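The comparison between identity over a whole fragment and identity within an epitope-rich window can be sketched as follows. This is a minimal Python illustration that assumes a gap-free protein alignment. The two short sequences and the function name `window_identity` are hypothetical and do not represent C86 or any field strain.

```python
def window_identity(reference, query, start, end, first_position=1):
    """Percent identity between two aligned protein sequences within a window.

    start and end are inclusive positions in the reference numbering;
    first_position is the reference position of the first aligned residue.
    Assumes a gap-free alignment, so columns map directly to positions.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    lo = start - first_position
    hi = end - first_position + 1
    if lo < 0 or hi > len(reference) or lo >= hi:
        raise ValueError("window lies outside the aligned fragment")
    matches = sum(r == q for r, q in zip(reference[lo:hi], query[lo:hi]))
    return 100.0 * matches / (hi - lo)


# Hypothetical 10-residue fragments whose first residue is position 19.
vaccine = "HLDCKPEFSY"
field = "HLECRPDFSY"

whole = window_identity(vaccine, field, 19, 28, first_position=19)
window = window_identity(vaccine, field, 21, 25, first_position=19)
print(f"whole fragment: {whole:.1f}%, positions 21-25: {window:.1f}%")
```

When substitutions cluster in one region, as in this toy pair, the windowed identity falls below the identity over the whole fragment.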
In summary, the present study provides, for the first time, E2 glycoprotein sequences of Polish BVDV strains belonging to the subtypes most often isolated in Poland. We showed that the E2 glycoprotein, with its high genetic variability, contains fragments that are positively selected. Some of these fragments may be epitopes. We believe that the strain used in vaccine production should show greater similarity to strains circulating in a given region, and that particular emphasis should fall on similarity at sites encoding immunogenic epitopes, which are present mainly in the E2 glycoprotein. In contrast to the most frequently studied 5′UTR and Npro region, the positively selected BVDV sites require further analysis to establish whether amino acid substitutions can lead to changes in host tropism, evasion of the epitope-specific CD8 T-cell response, or avoidance of antibody recognition. Understanding the importance of amino acid changes at sites of positive selection can be helpful in studying the virulence of strains and predicting future responses to vaccine strains of BVDV.
Author contributions PM conducted and coordinated the study including laboratory and computer analysis and drafted the manuscript. MP drafted and revised the manuscript. All authors read and approved the final manuscript.
Funding Funded by the KNOW (Leading National Research Centre) Scientific Consortium "Healthy Animal-Safe Food", Ministry of Science and Higher Education Resolution No. 05-1/KNOW2/2015. The funding body was involved solely in funding and had no role in the design of the study, the collection, analysis, or interpretation of the data, or in writing the manuscript.

Compliance with ethical standards
Conflict of interest We declare that all authors have no conflict of interests.
Ethical approval This article does not contain any studies on animals performed by any of the authors.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.