Structure modeling and spatial epitope analysis for HA protein of the novel H1N1 influenza virus

In recent months, a novel H1N1 influenza virus has broken out around the world. Using bioinformatics methods, we obtained the 3D structure of its HA protein and predicted the epitope residues with a method developed in our group. In total, 58 amino acids were identified as potential epitope residues, the majority of which clustered at the surface of the globular head of the HA protein. Although located at a similar position, the epitope of the HA protein of the novel H1N1 virus differs markedly in electrostatic potential from those of HA proteins from previous flu viruses.

Keywords: novel H1N1 influenza virus, HA protein, spatial epitope, bioinformatics
Since the outbreak of the novel H1N1 influenza virus in Mexico in March 2009, the number of infected cases has increased daily in more than 30 countries [1−3]. It has been reported that humans have not acquired resistance to this new virus [4]. Although antibodies and related reagents are highly sensitive and specific for the diagnosis and treatment of flu viruses, progress in developing antibodies against this novel virus is seriously restrained by the scarcity of, and usage restrictions on, imported virus strains. It is therefore highly desirable to predict the potential epitope from the virus sequences released by the National Center for Biotechnology Information (NCBI) alone, in order to support the development of specific antibodies.
Previous investigations found that the antigenic determinants of flu viruses are determined by the HA protein located in the viral envelope [5]. Based on the complete gene sequence of the HA protein from the first strain of the novel H1N1 virus released by NCBI (FJ966082, A/California/04/2009(H1N1)), the 3D structure of the HA protein was obtained by homology modeling, and the epitope residues were predicted with SEPPA, a tool developed in our group [6].
Homology modeling was performed with the program Modeller 9v6 [7]. Because the released HA structure files contain two separate segments, templates were selected for each segment separately before the segments were joined into one complete protein. Six HA proteins of flu viruses were selected as templates: the HA proteins of swine H9 influenza virus (PDB ID: 1JSD), avian H5 influenza virus (PDB ID: 1JSM), 1930 swine H1 influenza virus (PDB ID: 1RV0), 1934 H1 influenza virus (PDB ID: 1RVX), H3N2 influenza virus (PDB ID: 2VIU) and H5 influenza virus (PDB ID: 2IBX). These templates were chosen as representative structures of HA proteins that had been grouped at a sequence similarity above 95%. The sequence similarity between the templates and the query sequence was 40%-84% for the HA1 segment, with an E-value of 0.0, and 63%-92% for the HA2 segment, with an E-value of 0.0. The sequence alignment is shown in Figure 1(a). To avoid the bias caused by relying on a single modeled structure, ten different initial structures of the HA protein were built in the homology modeling process. CHARMm (InsightII program) was used to eliminate bad contacts in the initial structures; the modeled structures are provided in online supplementary material 1. The PROCHECK program was then applied to check the modeled structures [8,9]. The detailed results of this analysis are given in online supplementary material 2.

Figure 1 Modeled structure of HA protein for novel H1N1 influenza virus. (a) Sequence alignment of the HA proteins. The residues of the potential spatial epitope are highlighted in yellow in the sequences. Asterisks indicate the conserved residues for the template proteins. (b) Modeled structure of HA protein and potential epitope. The HA protein is shown in solid ribbon mode; the spatial epitope is shown in surface mode.

BRIEF COMMUNICATION
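The template similarities quoted above can be illustrated with a small calculation. The following is a minimal sketch in Python, assuming the reported percentages are pairwise identities counted over aligned, non-gap positions; the text does not state which measure was actually used. The function name `percent_identity` and the toy sequences are hypothetical and not taken from the alignment in Figure 1(a).

```python
def percent_identity(aligned_query: str, aligned_template: str, gap: str = "-") -> float:
    """Percent identity over positions where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, so they must
    have equal length. This definition is an assumption; other conventions
    divide by the query length or the alignment length instead.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # aligned positions without a gap in either sequence
    matches = 0   # compared positions with identical residues
    for q, t in zip(aligned_query, aligned_template):
        if q == gap or t == gap:
            continue
        compared += 1
        if q == t:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    query = "MKAIL-VVL"
    template = "MKAKLLVV-"
    print(round(percent_identity(query, template), 1))  # 85.7
```

Here 7 positions are compared and 6 match. Choosing a different denominator changes the value, which is why the convention should be stated alongside any reported percentage.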
The modeled structures of the HA protein were submitted to SEPPA to predict the possible spatial epitope, using the default threshold of 1.8. A residue was accepted as an epitope residue only if it was a consensus prediction among all 10 modeled structures. The prediction results showed that the potential spatial epitope residues were mainly located at the surface of the globular head and tail segments of the HA protein. Because the tail part is usually buried inside the virus envelope [5], only the epitope residues in the globular head segment were retained for further analysis. In the end, 58 residues were identified as potential epitope residues; they are highlighted in yellow in Figure 1(a), and the majority of them clustered at the surface of the globular head, as shown in Figure 1(b).
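The consensus step described above can be sketched as follows. This is an illustrative Python sketch, not the SEPPA implementation: it assumes each model yields a per-residue score, that a residue is predicted when its score is at least the threshold, and that "consensus" means predicted in every model by default. The function names, the `min_models` parameter and the residue labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def predicted_residues(scores: Dict[str, float], threshold: float = 1.8) -> Set[str]:
    """Residues whose score reaches the threshold in one modeled structure."""
    return {residue for residue, score in scores.items() if score >= threshold}


def consensus_epitope(
    per_model_scores: List[Dict[str, float]],
    threshold: float = 1.8,
    min_models: Optional[int] = None,
) -> Set[str]:
    """Residues predicted in at least `min_models` models.

    With min_models=None a residue must be predicted in every model,
    which is the strictest reading of "consensus" (an assumption here).
    """
    if not per_model_scores:
        raise ValueError("at least one model is required")
    if min_models is None:
        min_models = len(per_model_scores)

    counts: Counter = Counter()
    for scores in per_model_scores:
        counts.update(predicted_residues(scores, threshold))
    return {residue for residue, n in counts.items() if n >= min_models}


if __name__ == "__main__":
    # Three toy models with made-up scores (the study used ten models).
    models = [
        {"A:156": 2.1, "A:157": 1.9, "A:300": 1.2},
        {"A:156": 1.8, "A:157": 1.7, "A:300": 2.0},
        {"A:156": 2.5, "A:157": 2.0},
    ]
    print(sorted(consensus_epitope(models)))                # ['A:156']
    print(sorted(consensus_epitope(models, min_models=2)))  # ['A:156', 'A:157']
```

Requiring agreement across all models trades sensitivity for robustness: a residue that is exposed in only a few of the modeled conformations is dropped.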
The predicted epitope was also compared with the corresponding epitopes of the HA protein templates from previous flu viruses, identified by SEPPA under the same parameters. The detailed results of epitope prediction for the six HA protein templates can be found in online supplementary material 3. As shown in Figure 2, the epitopes were located at similar positions on the HA proteins, but the chemical features of the epitope patches differed among them. The electrostatic potential, calculated from Gasteiger charges, was mapped onto the surface of the epitope residues. For the HA proteins of previous flu viruses, the epitope contained residues with strong positive potential (blue), or residues with both strong positive and strong negative potential (red). For the HA protein of the novel H1N1 virus, the predicted epitope was found to have weakly positive or neutral potential, and no residues with negative potential were found there. This difference in electrostatic potential in the epitope area may affect recognition and binding between antibody and antigen. Further experiments are under way to validate the results of this work.

Figure 2 Structure of the whole protein is shown in solid ribbon mode, and the potential epitope is shown in surface mode. The surface is colored according to the electrostatic potential, red for negative potential and blue for positive potential.
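To make the idea of mapping electrostatic potential from atomic partial charges concrete, the following Python sketch evaluates a plain Coulomb sum at a surface point and assigns a display color. It is an assumption-laden illustration, not the procedure used in this work: the charges are taken as given (for example, precomputed Gasteiger charges), the dielectric constant is uniform, and the function names and the `neutral_band` cutoff are hypothetical.

```python
import math
from typing import Iterable, Tuple

# Coulomb constant in kcal*Angstrom/(mol*e^2), for charges in units of e
# and distances in Angstrom.
COULOMB_KCAL = 332.0637

Atom = Tuple[float, float, float, float]  # x, y, z, partial charge


def coulomb_potential(
    point: Tuple[float, float, float],
    atoms: Iterable[Atom],
    dielectric: float = 1.0,
) -> float:
    """Potential at `point` in kcal/(mol*e), summed over point charges."""
    total = 0.0
    for x, y, z, charge in atoms:
        r = math.dist(point, (x, y, z))
        if r == 0.0:
            raise ValueError("surface point coincides with an atom")
        total += charge / r
    return COULOMB_KCAL * total / dielectric


def color_for(potential: float, neutral_band: float) -> str:
    """Blue for positive, red for negative, white inside the neutral band."""
    if potential > neutral_band:
        return "blue"
    if potential < -neutral_band:
        return "red"
    return "white"


if __name__ == "__main__":
    # Two toy point charges of opposite sign (illustrative values only).
    atoms = [(1.0, 0.0, 0.0, 0.3), (-1.0, 0.0, 0.0, -0.3)]
    near_positive = coulomb_potential((2.0, 0.0, 0.0), atoms)
    midpoint = coulomb_potential((0.0, 0.0, 0.0), atoms)
    print(color_for(near_positive, neutral_band=5.0))  # blue
    print(color_for(midpoint, neutral_band=5.0))       # white
```

The choice of `neutral_band` determines what is displayed as "weakly positive or neutral", so comparisons between proteins are only meaningful when the same cutoff and color scale are used for all of them.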