Classification of viral zoonosis through receptor pattern analysis

Bae, Se-Eun; Son, Hyeon Seok

doi:10.1186/1471-2105-12-96


Methodology article
Open access
Published: 13 April 2011

Volume 12, article number 96, (2011)

BMC Bioinformatics


Se-Eun Bae^1,2 &
Hyeon Seok Son^1,2


Abstract

Background

Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins. Zoonosis can occur not only through direct transmission from vertebrates to humans, but also through intermediate reservoirs or other environmental factors. Viruses can be categorized according to genome type (ssDNA, dsDNA, ssRNA and dsRNA viruses). Among them, the RNA viruses exhibit particularly high mutation rates and are especially problematic for this reason. Most zoonotic viruses are RNA viruses that change their envelope proteins to facilitate binding to various receptors of host species. In this study, we sought to predict zoonotic propensity through the analysis of receptor characteristics. We hypothesized that the major barrier to interspecies virus transmission is that receptor sequences vary among species; in other words, that the specific amino acid sequence of the receptor determines the ability of the viral envelope protein to attach to the cell.

Results

We analysed host-cell receptor sequences for their hydrophobicity/hydrophilicity characteristics. We then analysed these properties for similarities among receptors of different species and used a statistical discriminant analysis to predict the likelihood of transmission among species.

Conclusions

This study is an attempt to predict zoonosis through simple computational analysis of receptor sequence differences. Our method may be useful in predicting the zoonotic potential of newly discovered viral strains.


Background

Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins [1]. Zoonosis can occur not only through direct transmission, but also through intermediate reservoirs or other environmental factors [2–4]. Zoonotic viruses can be categorized according to genome type; of the various classes of viruses, the RNA viruses exhibit the highest mutation rates [5]. Most zoonotic viruses are RNA viruses that change their envelope proteins to facilitate binding to various receptors of host species [6, 7]. The high mutation rate of envelope proteins [5] hinders the development of effective vaccines, as does the pronounced ability of RNA viruses to infect host species and exploit host proteins for viral reproduction [8].

Lacking the ability to self-replicate, viruses must utilize the replication apparatus of their host cells [9]. Viral infection of a cell begins with attachment of the virus to the cell surface [6, 10, 11]. During attachment to the cell membrane, the viral envelope protein (a structural protein) interacts with the host-cell receptor protein(s) [12]. In non-enveloped viruses, the capsid plays this role. The cell receptors that play a major role in viral attachment are predominantly membrane proteins of the immunoglobulin superfamily [13–15]. The identification of virus-binding cellular receptors accelerated rapidly in the late 1980s owing to developments in the use of monoclonal antibodies and molecular cloning techniques [15]. The various receptors that have been found are surface matrix structures containing carbohydrate, lipid, and protein moieties [1, 16, 17]. In some cases, viral attachment also exploits co-receptors. For example, HIV, which uses the CD4 molecule as its receptor, uses the CXCR4 and CCR5 co-receptors to strengthen the effectiveness of infection [1, 14, 18, 19]. Similarly, hepatitis C virus utilizes CD81 as a receptor and LDLR as a co-receptor [20].

Since the host-cell range of a specific virus is predetermined by its ability to recognize specific receptors, the similarities between the receptors of its primary reservoir host cell and the potential human host cell play a major role in determining the likelihood of viral zoonosis. Here, we analysed zoonotic and non-zoonotic RNA viruses along with their cellular receptors in human and (non-human) primary reservoir species to extract the receptor characteristics common to zoonosis. Viruses not previously reported to infect humans were classified as non-zoonotic viruses. We excluded all viruses known to utilize co-receptors; i.e., only virus-receptor interactions occurring through virus tropism and pathogenesis were considered [5, 21]. The receptors and viruses examined in this study are listed in Table 1.

Table 1 Similarity scores of host receptor pairs.


We hypothesized that the major barrier to the transmission of viruses between species is the difference in cellular receptor sequences. In other words, the specific amino acid sequence of the receptor should be the major determinant of the ability of the viral envelope protein to attach to the cell. Ordinary sequence alignment reports overall sequence similarity, which we considered useful but insufficient: most receptors are membrane proteins, and membrane proteins consist of distinct hydrophobic and hydrophilic parts. Therefore, we analysed host-cell receptor sequences for their hydrophobicity/hydrophilicity characteristics. We then analysed these properties for similarities among receptors of different species to predict the likelihood of transmission across species, including humans. To the best of our knowledge, this study is the first attempt to predict zoonosis through a simple analysis of receptor sequence similarities and differences. This method may be useful in predicting the zoonotic potential of newly discovered viral strains.

Results and Discussion

The pair-wise receptor sequence similarities (^gS_i,1, ^gS_i,2, and ^gS_i,3) between host-species pairs for each virus family are shown in Table 1. For a meaningful comparison, each virus has at least one infected host (the primary reservoir, designated as "#" in Table 1). As shown in Table 1, the similarity scores for the infected group (g = 1) were high, ranging from 0.790 to 0.988 for ¹S_i,1, from 0.841 to 0.996 for ¹S_i,2, and from 0.794 to 0.962 for ¹S_i,3. All pair-wise comparisons in group 1 (human vs. primary reservoir, primary reservoir vs. host, and human vs. host) yielded high similarity scores, indicating a high similarity among receptor sequences. The similarity scores were comparatively low in the non-infection group (g = 2), ranging from 0.092 to 0.440 for ²S_i,1, from 0.108 to 0.432 for ²S_i,2, and from 0.130 to 0.416 for ²S_i,3. For group 2, both the primary host species and non-infected species are listed to illustrate the differences in similarity. In pair-wise comparisons, all the non-infection cases yielded low similarity values, i.e., the receptor sequences differed substantially from each other.

We assume that a low similarity in receptor sequences disfavors infection despite the existence of a common receptor. For example, enterovirus infects only Sus scrofa (pig); it does not infect Rattus norvegicus (rat) or Homo sapiens (human) because of the high transmission barrier. Similarly, for leukovirus, only Gallus gallus (chicken) is infected as a primary reservoir; because of the high transmission barrier, R. norvegicus and H. sapiens are not infected. These results imply that for non-infection cases, species barriers exist, and the propensity to cross the barrier is determined by the sequence similarity between the potential and primary host receptors.

Similarity scores for rabies virus were low between Canis lupus familiaris (domestic dog) and Bos taurus (domestic cow) (²S_i,1 = 0.280, ²S_i,2 = 0.373, and ²S_i,3 = 0.366) and also between B. taurus and H. sapiens (²S_i,1 = 0.267, ²S_i,2 = 0.371, and ²S_i,3 = 0.416) but were high between C. l. familiaris and H. sapiens (¹S_i,1 = 0.947, ¹S_i,2 = 0.985, and ¹S_i,3 = 0.962). Clearly, C. l. familiaris is the primary reservoir, and transmission of the disease to H. sapiens is possible only because of the high human/dog receptor similarity. Thus, for particular viruses, transmission of disease may be species-selective, although common receptors exist among species. Furthermore, infection specificity may be determined by the species barrier, which results from receptor differences.

The values in Table 1 are plotted in Figure 1 to illustrate the differences among groups. The x- and y-axes denote ^gS_i,1 and ^gS_i,2, respectively, where "g" is the group classification. All pair-wise similarity scores are shown. Groups 1, 2 and 3 are each well separated in the colour-coded two-dimensional space. The results provide clear evidence that the receptor sequences from cases of cross-species infection are well separated from those of other infection cases. From these observations, we conclude that receptor differences are a major contributing factor to the potential of a specific viral strain to cross species barriers for transmission. In other words, the species dependence of infection is indirectly related to the receptor sequence similarity. This finding implies that once the receptor sequences of the primary reservoir and possible hosts are known, we might be able to predict the likelihood of viral disease transmission. The accuracy of these classifications can be judged by subsequent assessment of cases of actual zoonotic transmission to humans.

Our analysis revealed significant differences in receptor similarity between infection and non-infection cases. The similarity values and the experimentally determined group categories were fed into a statistical discriminant analysis to predict infection (or zoonosis, in the case of human infection). As described in the Methods section, the values D_i² (i = 1, 2, 3) were calculated from the data in Table 1 to yield the results of the discriminant analysis.

The statistical discriminant analysis was verified using a test set of four viruses that were deliberately excluded from the training set. The viruses whose groups were predicted using the discriminant analysis are shown in Table 2. The first virus, feline immunodeficiency virus (FIV), uses Felis catus (domestic cat) as its primary host and CD4 as its receptor. According to the literature [22, 23], FIV infection of humans is rare but has been reported. Our method categorized this case as near-infection (G = 3). The second virus, classical swine fever virus, is known to be non-zoonotic and was classified as such by our method (G = 2). The third, encephalomyocarditis virus, infects S. scrofa but has been known to cause sporadic infections in H. sapiens; it was classified as group 1 (G = 1) by our method. Finally, Lassa virus is known to be zoonotic and was classified as group 1 (G = 1) by our method.

Table 2 Virus group prediction.


In Table 2, the hydrophilic similarity scores (S₁) are less consistent with the predicted groups (G) than the hydrophobic similarity scores (S₂). This suggests that the hydrophobic characteristics of the receptor sequence might be the key contributor to the prediction. However, this observation should be interpreted with care, because the variables (S₁, S₂, S₃) are complementary in the statistical process.

Conclusions

Our analysis of viral receptor sequences shows that the likelihood of viral infection correlates with the similarity in sequence of the primary and host receptors. This result is not surprising, because viral infection also inversely correlates with the inhibition of viral coat protein binding to the receptors. Importantly, we were able to establish this relationship at the amino acid sequence level, allowing for the prediction of possible human infection at an early stage of a viral outbreak, before the structures of viral coat proteins and receptors are known. Therefore, once the receptor sequences of the primary reservoir and the potential host are known, the likelihood of viral infection can be predicted, provided the virus does not mutate too abruptly. Our simplistic approach needs further refinement because the complex processes of viral host tropism are largely ignored in our current method. For example, the host immune response could be included for better prediction of zoonosis. Although further refinements of our methods and analyses of larger databases are needed, this simple conceptual approach may be useful, even now, as a basic tool for the classification of zoonosis of new viral species.

Methods

Data collection

Viral infection requires the insertion of viral genes into host cells. Such a process begins with the binding of coat proteins to host receptors and, in some cases, co-receptors [24]. Ten RNA viruses (seven zoonotic viruses and three non-zoonotic viruses) were investigated. Viruses that use co-receptors were excluded from the study. Receptor sequence data for each virus were collected from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/), and the research literature was examined to determine the specific species tropism of each virus ([25], http://www.ictvonline.org/). The viruses, host species, receptors, receptor sequences, and infection information for each host are shown in Table 1. We selected viruses that each represent a different family, with different primary reservoirs. Viruses with unknown or poorly defined host receptors (particularly human receptors) were excluded from the study. Orthologues of the human receptor sequences for the non-zoonotic viruses were collected to allow for clear comparison with zoonosis cases.

Discriminant analysis for data analysis

To calculate sequence similarities among host receptors for each virus, we first conducted a pair-wise sequence alignment using Clustal X [26, 27]. We verified the alignment results with BLAST [28] and PRANK ([29], http://www.ebi.ac.uk/goldman-srv/prank/); both tools produced the same results as Clustal X. From the resulting alignment, we counted the numbers of matched amino acids and calculated three kinds of sequence similarity scores. The total sequence similarity scores were defined as:

and

where N_tot is the total number of amino acids in one sequence string; n_tot is the total number of matched amino acids in the sequence; N_phi and N_pho are the numbers of hydrophilic and hydrophobic amino acids in the sequence, respectively; N_others is the number of deleted amino acids (gaps/insertions in sequence) plus the number of amino acids with undetermined properties; n_phi and n_pho are the numbers of hydrophilic and hydrophobic amino acids matched, respectively; and ^gS_i,1 is the similarity score for hydrophilic residues of the i-th row of infection group g. The interspecies infection information was classified into three infection states: group 1 (g = 1) represents infection; group 2 (g = 2) represents non-infection; and group 3 (g = 3) represents near-infection. By definition, if a group 1 species pair includes humans, then the infection is zoonotic. Decisions for grouping were made on the basis of experimental and epidemiological studies reported in the literature [4, 30–33].
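The counting scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the split of residues into hydrophobic and hydrophilic classes, the function name `similarity_scores`, and the choice of denominators (S1 = n_phi / N_phi, S2 = n_pho / N_pho, S3 = n_tot / N_tot, counted over the first aligned sequence) are all assumptions made here for illustration.

```python
# Assumed residue classes; the article does not list which amino acids
# it treats as hydrophobic or hydrophilic.
HYDROPHOBIC = set("AVLIMFWPGC")
HYDROPHILIC = set("STYNQDEKRH")


def similarity_scores(ref, other):
    """Return (S1, S2, S3) for two aligned sequences of equal length.

    Assumed definitions (hypothetical, for illustration only):
      S1 = n_phi / N_phi, S2 = n_pho / N_pho, S3 = n_tot / N_tot,
    where the N counts are taken over the aligned reference sequence.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")

    N_phi = N_pho = 0          # hydrophilic / hydrophobic residues in ref
    n_phi = n_pho = n_tot = 0  # matched residues per class and overall

    for a, b in zip(ref.upper(), other.upper()):
        matched = a == b and a != "-"   # a gap never counts as a match
        if a in HYDROPHILIC:
            N_phi += 1
            n_phi += matched
        elif a in HYDROPHOBIC:
            N_pho += 1
            n_pho += matched
        n_tot += matched                # any identical non-gap position

    N_tot = len(ref)                    # N_phi + N_pho + N_others

    def ratio(n, N):
        return n / N if N else 0.0

    return ratio(n_phi, N_phi), ratio(n_pho, N_pho), ratio(n_tot, N_tot)


# Toy aligned fragments, not real receptor sequences.
s1, s2, s3 = similarity_scores("AKDLS", "AKELT")
print(round(s1, 3), round(s2, 3), round(s3, 3))
```

In practice the two input strings would come from a pair-wise alignment (for example, the Clustal X output), with "-" marking gaps.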

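The variables (shown in Table 1) were arranged in matrices to allow for discriminant analysis, a method of multivariate analysis that can determine the group related to variables [34]. Each group has three columns and l, m, or n rows, depending on the numbers of variable sets. Here, the matrix for group 1 is defined as:

\[
{}^{1}S =
\begin{pmatrix}
{}^{1}S_{1,1} & {}^{1}S_{1,2} & {}^{1}S_{1,3} \\
\vdots & \vdots & \vdots \\
{}^{1}S_{l,1} & {}^{1}S_{l,2} & {}^{1}S_{l,3}
\end{pmatrix}
\]

Similarly, ²S and ³S were defined as:

\[
{}^{2}S =
\begin{pmatrix}
{}^{2}S_{1,1} & {}^{2}S_{1,2} & {}^{2}S_{1,3} \\
\vdots & \vdots & \vdots \\
{}^{2}S_{m,1} & {}^{2}S_{m,2} & {}^{2}S_{m,3}
\end{pmatrix}
\]

and

\[
{}^{3}S =
\begin{pmatrix}
{}^{3}S_{1,1} & {}^{3}S_{1,2} & {}^{3}S_{1,3} \\
\vdots & \vdots & \vdots \\
{}^{3}S_{n,1} & {}^{3}S_{n,2} & {}^{3}S_{n,3}
\end{pmatrix}
\]

All of the related variables were tabulated as shown in Table 1. From the above matrices, three averages were found for each group:

\[
{}^{1}\bar{S}_{j} = \frac{1}{l} \sum_{i=1}^{l} {}^{1}S_{i,j}, \qquad j = 1, 2, 3
\]

The averages ²S̄_1, ²S̄_2, and ²S̄_3 for group 2 and ³S̄_1, ³S̄_2, and ³S̄_3 for group 3 were calculated similarly.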
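As a small illustration of the group matrices and their per-column averages, the sketch below averages the three score columns over the rows of one group. The helper name `column_means` is hypothetical. The two rows are the rabies virus non-infection scores quoted in the Results (dog vs. cow and cow vs. human); they are only a subset of group 2, so the printed averages are not the group averages used in the study.

```python
def column_means(matrix):
    """Average each of the three score columns over the rows of one group."""
    rows = len(matrix)
    if rows == 0:
        raise ValueError("group matrix must have at least one row")
    return [sum(row[j] for row in matrix) / rows for j in range(3)]


# Two group-2 (non-infection) rows for rabies virus quoted in the text.
# Columns are S1 (hydrophilic), S2 (hydrophobic), S3 (total).
group2_subset = [
    [0.280, 0.373, 0.366],  # C. l. familiaris vs. B. taurus
    [0.267, 0.371, 0.416],  # B. taurus vs. H. sapiens
]

means = column_means(group2_subset)
print([round(m, 4) for m in means])
```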

Three covariance matrices were constructed as:

where

and

where

and

Similar treatments yielded the ²C and ³C matrices, resulting in three covariance matrices (¹C, ²C, and ³C). We then created a pooled within-class covariance matrix P. If we define L = 3l-1, M = 3m-1, and N = 3n-1, then:

where

also
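For readers who want to experiment with this step, the following is a rough sketch of per-group covariance matrices combined into one pooled matrix. Two things are assumptions rather than the article's stated formulas: each group covariance is the ordinary sample covariance with denominator (rows - 1), and P is taken to be the weighted average of ¹C, ²C, and ³C with the weights L, M, and N defined in the text. The function names and the toy matrices are hypothetical.

```python
def column_means(matrix):
    """Average each of the three score columns over the rows of one group."""
    rows = len(matrix)
    return [sum(row[j] for row in matrix) / rows for j in range(3)]


def covariance(matrix):
    """3x3 sample covariance of one group (denominator rows - 1, assumed)."""
    rows = len(matrix)
    if rows < 2:
        raise ValueError("need at least two rows to estimate a covariance")
    mean = column_means(matrix)
    return [
        [
            sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in matrix) / (rows - 1)
            for k in range(3)
        ]
        for j in range(3)
    ]


def pooled_covariance(groups, weight=lambda rows: 3 * rows - 1):
    """Weighted average of the group covariances.

    The default weight mirrors L = 3l - 1 from the text; using it as an
    averaging weight is an assumption of this sketch.
    """
    weights = [weight(len(g)) for g in groups]
    covs = [covariance(g) for g in groups]
    total = sum(weights)
    return [
        [sum(w * c[j][k] for w, c in zip(weights, covs)) / total for k in range(3)]
        for j in range(3)
    ]


# Toy matrices chosen for easy hand-checking; not data from Table 1.
toy_a = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
toy_b = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
P = pooled_covariance([toy_a, toy_b])
print(P)
```

Passing a different `weight` function (for example `lambda rows: rows - 1`) gives the more conventional pooled estimate, which makes it easy to compare the two conventions.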

We next found the inverse matrix I, where I = P^-1. Because there were three groups in our study, we predicted the likelihood of infection for a virus of unknown infection condition by calculating the Mahalanobis distance to each group (generally D² = dᵀ × C^-1 × d, where d is the vector of differences between the observation and the group mean and C is the covariance matrix).

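Here, expansion of D² yielded three equations:

\[
D_i^{2} = d_i^{T} \, I \, d_i, \qquad i = 1, 2, 3
\]

where

\[
d_i = \left( S_1 - {}^{i}\bar{S}_1, \; S_2 - {}^{i}\bar{S}_2, \; S_3 - {}^{i}\bar{S}_3 \right)^{T}
\]

and S₁, S₂, and S₃ are the input variables; here, they were the similarity variables of a virus of an unknown infection group.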

Group classification (G) was identified using the criterion:

\[
G = \underset{i \in \{1, 2, 3\}}{\arg\min} \; D_i^{2}
\]

For example, if D₁² is the minimum among the three values from the above set of three equations, then G = 1; i.e., "group 1" is the group classification. To automate the mathematical process described above, we developed a Java computer program named ZOO. To evaluate the accuracy of our method and software, we analysed a test data set (described in the Results and Discussion section).
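The minimum-distance rule can be sketched as follows. This is not the ZOO program; it is a short Python illustration of classifying one set of scores by its squared Mahalanobis distance to each group mean, using the inverse of a pooled covariance matrix. The group means and the pooled matrix below are hypothetical toy values, and the function names are invented for this sketch. Only the input scores (dog vs. human for rabies virus) are taken from the Results.

```python
def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate and determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def mahalanobis_sq(x, mean, inv):
    """Squared Mahalanobis distance d^T * inv * d with d = x - mean."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d[j] * inv[j][k] * d[k] for j in range(3) for k in range(3))


def classify(x, group_means, pooled):
    """Return (G, distances), where G is the group with the smallest D^2."""
    inv = inverse_3x3(pooled)
    distances = {g: mahalanobis_sq(x, mean, inv) for g, mean in group_means.items()}
    return min(distances, key=distances.get), distances


# Hypothetical group means and pooled covariance (toy values only).
toy_means = {1: [0.9, 0.9, 0.9], 2: [0.3, 0.3, 0.3], 3: [0.6, 0.6, 0.6]}
toy_pooled = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]

# S1, S2, S3 for C. l. familiaris vs. H. sapiens (rabies virus, from the text).
G, D2 = classify([0.947, 0.985, 0.962], toy_means, toy_pooled)
print(G, D2)
```

With real data, `toy_means` and `toy_pooled` would be replaced by the group averages and the pooled matrix P estimated from the training rows in Table 1.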

References

1. Baranowski E, Ruiz-Jarabo CM, Domingo E: Evolution of Cell Recognition by Viruses. Science 2001, 292: 1102–1105. 10.1126/science.1058613
2. Schwabe CW: Veterinary medicine and human health. Baltimore, Williams & Wilkins; 1984.
3. Webber R: Communicable disease epidemiology and control. Am J Epidemiol 1998, 147: 791–792.
4. Hugh-Jones ME, Hubbert WT, Hagstad HV: Zoonoses-recognition, control and prevention. Iowa: Iowa State University Press; 2008.
5. Schneider-Schaulies J: Cellular receptors for viruses: links to tropism and pathogenesis. J Gen Virol 2000, 81: 1413–1429.
6. Dimmock NJ: Initial Stages in infection with Animal viruses. J Gen Virol 1982, 59: 1–22. 10.1099/0022-1317-59-1-1
7. Wiley DC, Wilson IA, Skehel JJ: Structural identification of the antibody-binding sites of Hong Kong influenza haemagglutinin and their involvement in antigenic variation. Nature 1981, 289: 373–378. 10.1038/289373a0
8. Duffy S, Shackelton LA, Holmes EC: Rates of evolutionary change in viruses: patterns and determinants. Nat Rev Genet 2008, 1–10.
9. Horsfall FL Jr, Hardy PH Jr, Davenport FM: The significance of combinations between viruses and host cells. Bull N Y Acad 1948, 24: 470–475.
10. Dales S: Early Events in Cell-Animal Virus Interactions. Bacteriol Rev 1973, 37: 103–135.
11. Lentz TL: The recognition event between virus and host cell receptor: a target for antiviral agents. J Gen Virol 1990, 71: 751–766. 10.1099/0022-1317-71-4-751
12. Vrublevskaya VV, Korney AN, Smirnow SV, Morenkov OS: Cell-binding properties of glycoprotein B of Aujeszky's disease virus. Virus Res 2002, 86: 7–19. 10.1016/S0168-1702(02)00032-1
13. Myszka DG, Sweet RW, Hensley P, Brigham-Burke M, Kwong PD, Hendrickson WA, Wyatt R, Sodroski J, Doyle ML: Energetics of the HIV gp120-CD4 binding reaction. PNAS 2000, 97: 9026–9031. 10.1073/pnas.97.16.9026
14. Wu L, Gerard NP, Wyatt R, Choe H, Parolin C, Ruffing N, Borsetti A, Cardoso AA, Desjardin E, Newman W, Gerard C, Sodroski J: CD4-induced interaction of primary HIV-1 gp120 glycoproteins with the chemokine receptor CCR-5. Nature 1996, 384: 179–183. 10.1038/384179a0
15. Hyypiä T: Virus Host Cell Receptors. Encyclopedia of life science 2006, 1–8.
16. Wang JH: Protein recognition by cell surface receptors: physiological receptors versus virus interactions. Trends Biochem Sci 2002, 27: 122–126. 10.1016/S0968-0004(01)02038-2
17. Wimmer E: Cellular receptors for animal viruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; 1994.
18. Haywood AM: Virus Receptors: Binding, Adhesion Strengthening, and Changes in Viral Structure. J Virol 1994, 68: 1–5.
19. Reeves JD, Gallo SA, Ahmad N, Miamidian JL, Harvey PE, Sharron M, Pohlmann S, Sfakianos JN, Derdeyn CA, Blumenthal R, Hunter E, Doms RW: Sensitivity of HIV-1 to entry inhibitors correlates with envelope/coreceptor affinity, receptor density, and fusion kinetics. Proc Natl Acad Sci 2002, 99: 16249–16254. 10.1073/pnas.252469399
20. Pileri P, Uematsu Y, Campagnoli S, Galli G, Falugi F, Petracca R, Weiner AJ, Houghton M, Rosa D, Grandi G, Abrignani S: Binding of Hepatitis C Virus to CD81. Science 1998, 282: 938–941. 10.1126/science.282.5390.938
21. Jenkins GM, Rambaut A, Pybus OG, Holmes EC: Rates of Molecular Evolution in RNA Viruses: A Quantitative Phylogenetic Analysis. J Mol Evol 2002, 54: 156–165. 10.1007/s00239-001-0064-3
22. Faure E: Could FIV zoonosis responsible of the breakdown of the pathocenosis which has reduced the European CCR5-Delta32 allele frequencies? Virol J 2008, 5: 119. 10.1186/1743-422X-5-119
23. VandeWoude S, Apetrei C: Going Wild: Lessons from Naturally Occurring T-Lymphotropic Lentiviruses. Clin Microbiol Rev 2006, 19: 728–762. 10.1128/CMR.00009-06
24. Berger EA, Murphy PM, Farber JM: Chemokine receptors as HIV-1 coreceptors; roles in viral entry, tropism, and disease. Annu Rev Immunol 1999, 17: 657–700. 10.1146/annurev.immunol.17.1.657
25. Fauquet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA: 8th Reports of the International Committee on Taxonomy of Viruses. Academic Press;
26. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 1997, 25: 4876–4882. 10.1093/nar/25.24.4876
27. Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD: Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res 2003, 31: 3497–3500. 10.1093/nar/gkg500
28. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389–3402. 10.1093/nar/25.17.3389
29. Löytynoja A, Goldman N: An algorithm for progressive multiple alignment of sequences with insertions. PNAS 2005, 102: 10557–10562.
30. Greger M: The Human/Animal Interface: emergence and resurgence of zoonotic infectious diseases. Crit Rev Microbiol 2007, 33: 243–299. 10.1080/10408410701647594
31. Ryou WS: Virology. Life Science publishing; 2007.
32. Woolhouse MEJ, Gowtage-Sequeria S: Host Range and Emerging and Reemerging Pathogens. Emerg Infect Dis 2005, 11: 1842–1847.
33. Baltimore D: Expression of Animal Virus Genomes. Bacteriol Rev 1971, 35: 235–241.
34. Fisher RA: The use of multiple measurements in taxonomic problems. Ann Eugenics 1936, 7: 179–188. 10.1111/j.1469-1809.1936.tb02137.x


Acknowledgements

We acknowledge the invaluable contribution of the researchers who have made their data publicly available. We thank K.T. No (Yonsei University) for his support. This work was partly supported by the Brain Korea 21 project.

Author information

Authors and Affiliations

Laboratory of Computational Biology & Bioinformatics, Institute of Health and Environment, Graduate School of Public Health, Seoul National University, 599 Gwanak-ro, Gwanak-gu, Seoul, 151-742, Korea
Se-Eun Bae & Hyeon Seok Son
Interdisciplinary Graduate Program in Bioinformatics, College of Natural Science, Seoul National University, 599 Gwanak-ro, Gwanak-gu, Seoul, 151-742, Korea
Se-Eun Bae & Hyeon Seok Son


Corresponding author

Correspondence to Hyeon Seok Son.

Additional information

Authors' contributions

SEB and HSS have developed the methods and have conducted subsequent data analysis. Both authors have drafted, read and approved the manuscript.


Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Bae, SE., Son, H.S. Classification of viral zoonosis through receptor pattern analysis. BMC Bioinformatics 12, 96 (2011). https://doi.org/10.1186/1471-2105-12-96


Received: 07 September 2010
Accepted: 13 April 2011
Published: 13 April 2011
DOI: https://doi.org/10.1186/1471-2105-12-96
