Introduction

The interferon-induced transmembrane protein 3 (IFITM3) gene is an endogenous immune-related gene classified as a small interferon-stimulated gene (ISG) (Friedman et al. 1984). The IFITM3 protein prevents viral infection by restricting viral membrane hemifusion between the host and viral membrane and exhibits a broad spectrum of potent antiviral capacity against enveloped viruses, including influenza A viruses (IAVs), Ebola virus (EBOV), Marburg virus (MARV), severe acute respiratory syndrome coronavirus (SARS-CoV), dengue virus (DEV), West Nile virus (WNV) and Zika virus (ZIKV) (Diamond and Farzan 2013; Li et al. 2013; Santhakumar et al. 2017; Savidis et al. 2016).

The IFITM3 protein is localized in the late endosome, where it prevents viral invasion (Feeley et al. 2011). To reach this compartment, the IFITM3 protein carries a conserved sorting signal that is needed to enter the endosomal pathway. This sorting signal, referred to as Yxxϕ, a tyrosine-based lysosomal targeting motif, is located in the N-terminal domain (NTD) of the IFITM3 protein (Jia et al. 2014). The motif is conserved in species including mammals and birds and is used as a criterion to distinguish IFITM1 from IFITM3, because these two proteins share highly similar sequences (Wang et al. 2017). Mutagenesis of the Yxxϕ motif caused mislocalization of the IFITM3 protein and reduced its antiviral capacity in mammals (Jia et al. 2012). However, in a recent study performed in ducks, the Yxxϕ motif was not essential for correct localization of the IFITM3 protein, and its mutation did not affect antiviral capacity (Blyth et al. 2015). Although mammals and birds share a very similar protein structure called CD225, which is significantly associated with antiviral ability, birds likely do not depend on the Yxxϕ motif for correct IFITM3 protein localization. Thus, a structural comparison of the IFITM3 protein between mammals and birds is highly desirable to identify differences between these two classes.

Previous studies have shown that single nucleotide polymorphisms (SNPs) in the human IFITM3 gene are associated with antiviral ability. The rs12252 SNP, which is located in a splice acceptor site and results in an N-terminally truncated form of the human IFITM3 protein, has been related to the severity of H1N1 influenza infection in the 2009 pandemic. Two studies in British and Han Chinese populations reported that the rs12252 SNP is related to disease severity, and one meta-analysis reaffirmed that individuals with the CC genotype have a high risk of influenza infection (Everitt et al. 2012; Kim and Jeong 2017b; Xuan et al. 2015; Zhang et al. 2013). Furthermore, susceptibility to ulcerative colitis (UC) and hemorrhagic fever with renal syndrome is associated with polymorphisms in the IFITM3 gene (Seo et al. 2010; Xu-Yang et al. 2016). Although genetic polymorphisms in the IFITM3 gene play a crucial role in immunity against viral diseases in humans and mice, polymorphisms in this gene have not yet been investigated in chickens.

The purpose of this study was to investigate the genetic characteristics of the chicken IFITM3 gene by comparison among several species. We performed sequence alignment using ClustalW2 and predicted the transmembrane domains of the IFITM3 protein using TMpred and SOSUI in humans, monkeys, mice, rats, ducks, geese and chickens. We also investigated the genotype, allele, and haplotype frequencies and the linkage disequilibrium (LD) among the polymorphisms in the chicken IFITM3 gene, used PolyPhen-2 to predict whether the non-synonymous SNPs are benign or damaging, and evaluated the impact of the non-synonymous SNPs on transmembrane topology. Furthermore, we investigated the 300 bp upstream of the transcription start site (TSS) of the chicken IFITM3 gene to compare its promoter structure with that of several species.

Materials and methods

Ethical statement

Dekalb White and Ross chickens (3 weeks old) were obtained from a slaughterhouse in South Korea. All experimental procedures and animal care in the present study were approved in accordance with the recommendations of the Guide of the Animal Care and Use Committee of Chonbuk National University (IACUC Number: CBNU 2017–0030), and all efforts were made to minimize suffering.

Genetic analysis of the IFITM3 gene

Genomic DNA was isolated from 20 mg muscle tissue using the LaboPass Tissue Genomic DNA Isolation Kit (Cosmo Genetech CO., Ltd., Korea) following the manufacturer’s instructions. Polymerase chain reaction (PCR) was performed with forward and reverse primers as follows: chicken IFITM3–1F (CACTTGACGGGGACACAGTT) and chicken IFITM3–1R (CTCTCCCGACGCCATCATTT), chicken IFITM3–2F (CATGCATCCCACAGAGCTCC) and chicken IFITM3–2R (ATCCCTGTCACGCTCCAGAA). The PCR reagents contained 25 pmol of each primer, 5 μl of 10 × Taq DNA polymerase buffer, 1 μl of 10 mM dNTPs and 2.5 units of Taq DNA polymerase (Promega, USA). The PCR conditions were as follows: 94 °C for 2 min to denature, and 35 cycles of 94 °C for 45 s, 63 °C for 45 s, and 72 °C for 1 min 30 s, and then 1 cycle of 72 °C for 10 min to extend the reaction. The S-1000 Thermal Cycler (Bio-Rad Laboratories, USA) was used. A 5 μl aliquot of the PCR product was analyzed by electrophoresis on a 1% agarose gel stained with ethidium bromide (EtBr) to determine the target band size (IFITM3–1, 710 bp; IFITM3–2, 630 bp). The purification of PCR products for DNA sequencing was performed using a QIAquick Gel Extraction Kit (Qiagen, USA). The PCR products were directly sequenced on an ABI 3730 automatic sequencer using a Taq Dideoxy Terminator Cycle Sequencing Kit (ABI, USA).
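As a quick sanity check on the primer pairs listed above, the following minimal Python sketch reports the length and GC fraction of each primer. The primer sequences are copied from the text; the function names (`gc_fraction`, `primer_summary`) are illustrative and are not part of the original protocol.

```python
# Primer sequences are taken from the Methods text above.
# Function names are hypothetical helpers used only for illustration.
PRIMERS = {
    "IFITM3-1F": "CACTTGACGGGGACACAGTT",
    "IFITM3-1R": "CTCTCCCGACGCCATCATTT",
    "IFITM3-2F": "CATGCATCCCACAGAGCTCC",
    "IFITM3-2R": "ATCCCTGTCACGCTCCAGAA",
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def primer_summary(primers: dict) -> dict:
    """Map each primer name to (length in nt, GC fraction)."""
    return {name: (len(seq), gc_fraction(seq)) for name, seq in primers.items()}


if __name__ == "__main__":
    for name, (length, gc) in primer_summary(PRIMERS).items():
        print(f"{name}: {length} nt, GC = {gc:.0%}")
```

Such a summary only checks basic primer properties; it does not replace a melting-temperature or specificity analysis.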

Statistical analysis

Genotype and allele frequencies of the chicken IFITM3 gene were compared between the Dekalb White and Ross breeds by chi-square test using SAS 9.4 software (SAS Institute Inc., Cary, NC, USA). Haplotypes and LD among the fourteen polymorphisms were analyzed using Haploview version 4.2 (Broad Institute, Cambridge, MA, USA).
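The allele-frequency comparison can be illustrated with a minimal Python sketch of a Pearson chi-square test on a 2 × 2 table (breed × allele). This is not the SAS procedure used in the study: it assumes no continuity correction, it covers only the one-degree-of-freedom allele comparison (a genotype comparison with three classes would have two degrees of freedom), and the allele counts in the usage lines are hypothetical, chosen only to match the chromosome totals implied by 108 and 72 birds.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table is [[a, b], [c, d]]: rows are breeds, columns are alleles.
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one count")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts: 216 chromosomes (108 birds) vs 144 chromosomes (72 birds).
    hypothetical_counts = [[150, 66], [60, 84]]
    chi2, p = chi_square_2x2(hypothetical_counts)
    print(f"chi2 = {chi2:.3f}, p = {p:.3g}")
```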

Prediction of IFITM3 protein functional alterations

Possible impacts on the IFITM3 protein caused by non-synonymous SNPs were predicted by PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/index.shtml). The PolyPhen-2 score corresponds to the probability of a substitution being damaging and ranges from 0.0 to 1.0. The prediction outcome can be presented as ‘benign’, ‘possibly damaging’ or ‘probably damaging’. The prediction algorithm is based on phylogenetic, structural, and sequence information.

Sequence comparison and transmembrane domain prediction of the IFITM3 protein

Protein sequence alignment was performed using ClustalW2 (http://www.ebi.ac.uk/Tools/msa/clustalo). The transmembrane domains in the IFITM3 protein were predicted by TMpred (http://www.ch.embnet.org/software/TMPRED_form.html) and SOSUI (http://harrier.nagahama-i-bio.ac.jp/sosui/sosui_submit.html). IFITM3 protein sequences were obtained from GenBank at the National Center for Biotechnology Information (NCBI), including those of human (Homo sapiens, AFF60355.1), monkey (Cercopithecus albogularis, ANJ01447.1), mouse (Mus musculus, NP_079654.1), rat (Rattus norvegicus, NP_001129596.1), duck (Anas platyrhynchos, AQX83312.1), goose (Anser cygnoides, AQM74179) and chicken (Gallus gallus, in this study).
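To give an intuition for how hydrophobic stretches are flagged as candidate transmembrane segments, the sketch below computes a sliding-window Kyte-Doolittle hydropathy profile. This is a swapped-in, simpler technique and not the algorithm used by TMpred or SOSUI. The 19-residue window and the 1.6 threshold are a commonly used rule of thumb and are assumptions here, as are the function names and the toy sequence.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return [(1-based centre position, mean hydropathy)] for each full window."""
    scores = [KD[aa] for aa in seq.upper()]
    half = window // 2
    profile = []
    for i in range(half, len(scores) - half):
        segment = scores[i - half:i + half + 1]
        profile.append((i + 1, sum(segment) / window))
    return profile


def candidate_tm_centres(seq, window=19, threshold=1.6):
    """Positions whose window-averaged hydropathy reaches the (assumed) threshold."""
    return [pos for pos, score in hydropathy_profile(seq, window) if score >= threshold]


if __name__ == "__main__":
    # Toy sequence (hypothetical): charged flanks around a 21-residue leucine core.
    toy = "DEKR" * 5 + "L" * 21 + "DEKR" * 5
    print(candidate_tm_centres(toy))
```

A real comparison would run the GenBank sequences listed above through the dedicated predictors; the profile here only shows why a hydrophobic core stands out from charged flanking regions.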

Promoter comparison of the IFITM3 gene

DNA sequences of the promoter and open reading frame (ORF) of the IFITM3 gene were obtained from GenBank at NCBI, including those of human (Homo sapiens, NC_000011.10), mouse (Mus musculus, NC_000073.6) and chicken (Gallus gallus, in this study). Among the promoter sequences, TATA box and CpG islands (CGIs), which are important promoter elements, were investigated. The promoter elements were searched for using GPMiner (http://gpminer.mbc.nctu.edu.tw/index.php), which is based on the Naive Bayes model.
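For readers unfamiliar with how a CpG island is recognized, the sketch below applies the classic sequence-composition criteria (length of at least 200 bp, GC content above 50%, and an observed/expected CpG ratio above 0.6). This is an assumption made for illustration; the criteria used internally by GPMiner may differ, and the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Observed/expected CpG = (number of CpG * length) / (number of C * number of G).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def looks_like_cgi(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the classic CpG island thresholds (assumed here, not taken from GPMiner)."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    # Toy sequences (hypothetical), each 300 bp long like the analyzed upstream region.
    print(looks_like_cgi("CG" * 150), looks_like_cgi("AT" * 150))
```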

Results

Sequence comparison and transmembrane prediction of the IFITM3 protein

DNA sequences of the IFITM3 gene ORF sequenced in 108 Dekalb White and 72 Ross chickens were identical to those of the Gallus gallus gene registered in GenBank (NP_001336990.1). Multiple sequence alignment showed very low homology in the NTD and C-terminal domain (CTD) between mammals and birds (Fig. 1). Since the amino acid sequence of the IFITM3 protein determines the transmembrane structure, we predicted transmembrane domains using TMpred and SOSUI (Fig. 2, Table 1). Notably, the NTD in mammals is 11 or 12 amino acids longer than that in birds. In contrast, the CTD in mammals (4 amino acids in humans and monkeys, and 7 amino acids in mice and rats) is shorter than that in birds. The CTD in chickens (16 amino acids) is shorter than that in ducks and geese, whereas the lengths of the other domains in chickens are very similar to those in ducks and geese. In addition, prediction by TMpred indicated that the IFITM3 protein in birds prefers the outside-to-inside topology for transmembrane domain 1 (TM1) and the inside-to-outside topology for transmembrane domain 2 (TM2). This predicted topology is the opposite of that in mammals (i.e., humans, monkeys, mice and rats prefer the inside-to-outside topology for TM1 and the outside-to-inside topology for TM2).

Fig. 1

Comparison of IFITM3 amino acid sequences in humans, monkeys, mice, rats, ducks, geese and chickens. IFITM3 protein sequences were obtained from GenBank at the National Center for Biotechnology Information (NCBI), including those of human (Homo sapiens, AFF60355.1), monkey (Cercopithecus albogularis, ANJ01447.1), mouse (Mus musculus, NP_079654.1), rat (Rattus norvegicus, NP_001129596.1), duck (Anas platyrhynchos, AQX83312.1), goose (Anser cygnoides, AQM74179) and chicken (Gallus gallus, in this study). Protein sequences were aligned using ClustalW2. Colors indicate the chemical properties of amino acids; blue: acidic, red: small and hydrophobic, magenta: basic, green: hydroxyl, sulfhydryl, amine and glycine

Fig. 2

Comparison of IFITM3 protein structure in humans, monkeys, mice, rats, ducks, geese and chickens. Transmembrane topology was predicted by SOSUI. IFITM3 protein sequences were obtained from GenBank at NCBI, including those of human (Homo sapiens, AFF60355.1), monkey (Cercopithecus albogularis, ANJ01447.1), mouse (Mus musculus, NP_079654.1), rat (Rattus norvegicus, NP_001129596.1), duck (Anas platyrhynchos, AQX83312.1), goose (Anser cygnoides, AQM74179) and chicken (Gallus gallus, in this study). Numbers in boxes indicate the number of amino acids in each domain. Abbreviations in boxes are as follows: NTD (N-terminal domain), TM1 (transmembrane domain 1), CIL (conserved intracellular loop), TM2 (transmembrane domain 2), and CTD (C-terminal domain)

Table 1 Transmembrane domains of IFITM3 protein predicted by TMpred and SOSUI

Identification of polymorphisms in the chicken IFITM3 gene and analysis of haplotype frequencies and LD

To investigate the genotype and allele frequencies of IFITM3 gene polymorphisms in chickens, we screened for polymorphisms within two exons of the chicken IFITM3 gene by automatic direct sequencing in 108 Dekalb White and 72 Ross chickens. We found a total of thirteen SNPs and one insertion/deletion, including three non-synonymous SNPs: c.298C>A (L100M), c.307G>A (V103I) and c.373A>C (N125H) (Fig. 3). Interestingly, the genotype distributions of nine polymorphisms (c.-338G>A, c.-330G>C, c.-295G>T, c.213+18G>T, c.214-5C>T, c.307G>A, c.373A>C, c.657+29T>G and c.657+47_657+48insAG) differed significantly between the Dekalb White and Ross breeds. In addition, the allele distributions of ten polymorphisms (c.-330G>C, c.-295G>T, c.213+18G>T, c.214-5C>T, c.307G>A, c.373A>C, c.493A>C, c.657+29T>G, c.657+47_657+48insAG and c.657+64T>C) differed significantly between the two breeds (Table 2).
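The correspondence between the coding-sequence positions and the residue numbers of the three non-synonymous SNPs can be checked with simple integer arithmetic. The following minimal Python sketch (the function name `codon_of` is illustrative) maps a 1-based coding position to its residue number and to the position within the codon.

```python
def codon_of(cds_pos):
    """Map a 1-based coding-sequence position to (residue number, position in codon)."""
    if cds_pos < 1:
        raise ValueError("coding positions are 1-based and must be positive")
    residue = (cds_pos - 1) // 3 + 1
    position_in_codon = (cds_pos - 1) % 3 + 1
    return residue, position_in_codon


if __name__ == "__main__":
    # Coding positions of the three non-synonymous SNPs reported in the text.
    for name, pos in {"c.298C>A": 298, "c.307G>A": 307, "c.373A>C": 373}.items():
        residue, offset = codon_of(pos)
        print(f"{name}: residue {residue}, codon position {offset}")
```

Under this arithmetic, positions 298, 307 and 373 each fall on the first base of codons 100, 103 and 125, which agrees with the reported L100M, V103I and N125H substitutions.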

Fig. 3

Gene map and polymorphisms identified in the chicken interferon-induced transmembrane protein 3 (IFITM3) gene on chromosome 5. The open reading frame (ORF) within the exons is indicated by shaded blocks, and the 5′ and 3′ untranslated regions (UTRs) are indicated by white blocks. Edged horizontal bars indicate the regions sequenced. Arrows indicate the polymorphisms found in this study. Asterisks denote non-synonymous single nucleotide polymorphisms (SNPs). The Y-shaped bar indicates the insertion/deletion identified in the IFITM3 gene

Table 2 Genotype and allele frequencies of IFITM3 gene polymorphisms in chickens

To determine whether there was strong LD among the fourteen polymorphisms in the chicken IFITM3 gene, |D’| was calculated. Detailed |D’| values are given in Table 3. To analyze the haplotype frequencies, we investigated the distribution of haplotypes using Haploview version 4.2. Seven major haplotypes of the chicken IFITM3 gene were found in the Dekalb White breed (Table 4). Among the seven haplotypes, the GGGTCCGAGATWtCT haplotype was the most frequently observed (39.6%). Interestingly, the Ross breed had a significantly different haplotype distribution (Table 5). A total of nine haplotypes were identified in the Ross breed, and the most frequently observed haplotype was ACGGTCAAGAGWtCT. Thus, the most frequent haplotype differed between the two chicken breeds.
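As an illustration of the |D’| statistic, the sketch below computes Lewontin's normalized disequilibrium coefficient for two biallelic loci. It assumes the haplotype frequency is already known; Haploview instead estimates haplotype frequencies from unphased genotype data, so this is a simplified sketch rather than the program's procedure. The function name and the example frequencies are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Return |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic, so LD is undefined
    return abs(d) / d_max


if __name__ == "__main__":
    # Hypothetical frequencies for illustration only.
    print(d_prime(p_ab=0.5, p_a=0.5, p_b=0.5))
    print(d_prime(p_ab=0.25, p_a=0.5, p_b=0.5))
```

A value of 1 indicates that at least one of the four possible haplotypes is absent (complete LD), whereas a value of 0 indicates that the two loci are independent.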

Table 3 Linkage disequilibrium (LD) among fourteen polymorphisms in the chicken IFITM3 gene
Table 4 Haplotype frequencies of chicken IFITM3 gene polymorphisms in Dekalb White breed
Table 5 Haplotype frequencies of chicken IFITM3 gene polymorphisms in Ross breed

Predicting the impact of polymorphisms in the chicken IFITM3 gene

In a previous study in humans, the rs12252 polymorphism in the IFITM3 gene was found to alter protein structure and to be associated with greater disease severity in IAV-infected patients during the 2009 pandemic. To evaluate the potential damage caused by the three non-synonymous SNPs, L100M, V103I and N125H, we used PolyPhen-2 and SOSUI. L100M and N125H were predicted to be ‘probably damaging’ with scores of 0.993 and 0.969, respectively (data not shown). It is also important to note that L100M, V103I and N125H are located in the TM2 region according to SOSUI; therefore, we assumed that these three non-synonymous SNPs would affect transmembrane structure. Thus, we combined these three non-synonymous SNPs into eight haplotypes (100L/103V/125N, 100L/103V/125H, 100L/103I/125N, 100L/103I/125H, 100M/103V/125N, 100M/103V/125H, 100M/103I/125N and 100M/103I/125H) and performed transmembrane prediction by SOSUI (Table 6). Interestingly, the 100M allele shifted the position of TM2 from residues 99–121 to 101–123 and increased the length of the conserved intracellular loop (CIL) from 29 to 31 amino acids (data not shown).
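The eight protein haplotypes follow directly from the three biallelic non-synonymous sites (2 × 2 × 2 combinations). A minimal Python sketch of this enumeration is given below; the data structure and function name are illustrative, and the residue positions and alleles are those reported in the text.

```python
from itertools import product

# (residue position, (reference residue, variant residue)) for each non-synonymous SNP.
SITES = [(100, ("L", "M")), (103, ("V", "I")), (125, ("N", "H"))]


def enumerate_haplotypes(sites):
    """Return labels such as '100L/103V/125N' for every allele combination."""
    labels = []
    for combo in product(*(alleles for _, alleles in sites)):
        label = "/".join(f"{pos}{aa}" for (pos, _), aa in zip(sites, combo))
        labels.append(label)
    return labels


if __name__ == "__main__":
    for label in enumerate_haplotypes(SITES):
        print(label)
```

Each resulting combination corresponds to one variant protein sequence that can be submitted separately to a transmembrane predictor.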

Table 6 Transmembrane domain changes in IFITM3 protein according to polymorphisms predicted by SOSUI

Comparison of promoter structure

Previous studies have reported that the immune system and immune regulatory mechanisms differ significantly between mammals and birds. The promoter is one of the major regulatory factors of protein expression. Thus, we postulated that the promoter structure may differ between mammals and birds. We analyzed 300 bp upstream of the positive strand of the IFITM3 gene in humans, mice and chickens (Fig. 4). Notably, mammals do not contain a CGI in the proximal promoter. In contrast, a CGI in chickens spans the proximal promoter region and the gene body of the IFITM3 gene; thus, the promoter structure in chickens differs markedly from that of mammals.

Fig. 4

Comparison of the IFITM3 gene promoter architecture in humans, mice and chickens. IFITM3 nucleotide sequences were obtained from GenBank at NCBI, including those of humans (Homo sapiens, NC_000011.10), mice (Mus musculus, NC_000073.6), and chickens (Gallus gallus, in this study). The TATA box was predicted by the Naive Bayes model and CpG islands (CGIs) were predicted by GPMiner. The left dotted box indicates the proximal promoter region, which is located 300 bp upstream of the transcription start site. The right dotted box indicates the body of the IFITM3 gene

Discussion

The IFITM3 protein is a transmembrane protein and acts as the first line of host defense against a wide range of viruses (Brass et al. 2009; Schoggins et al. 2011; Weidner et al. 2010). The IFITM3 protein has a well-conserved structure: CD225, which consists of two major domains, TM1 and CIL. Previous studies have focused on conserved amino acid residues within two domains, including F75, F78, R87 and Y99. F75 and F78 within TM1 participate in the physical association between IFITM proteins, whereas R87 and Y99 within the CIL play a major role in the inhibition of orthomyxovirus. These residues are well-conserved among several species and the substitution of these residues reduces the antiviral capacity of the IFITM3 protein (John et al. 2013). Another important domain of the IFITM3 protein is the NTD. NTD exhibits very low homology among species. However, its sorting signal motif, Yxxϕ within the NTD, is well-conserved. Previous studies in human cell lines have reported that 20-YEML-23 is necessary for trafficking IFITM3 protein from the cell surface to the endosomal pathway. IFITM3 mutants targeting Y20 and L23 caused the IFITM3 protein to relocalize to the cell periphery (Jia et al. 2012; Jia et al. 2014). However, in a recent study in ducks, IFITM3 mutants targeting the Yxxϕ motif were correctly localized in the LAMP-1-expressing late endosome. Because the canonical sorting signal motif of duck IFITM3 protein does not participate in correct localization, we performed sequence alignment among humans, mice and several bird species to identify differences in the IFITM3 protein between mammals and birds. The NTD and CTD of the IFITM3 protein showed very low homology between mammals and birds (Fig. 1). Differences in the amino acid sequence can change the topology of the transmembrane structure. Thus, we performed transmembrane prediction using TMpred and SOSUI to compare the structure of IFITM3 proteins (Table 1, Fig. 2). 
Interestingly, the prediction revealed three differences in the IFITM3 protein between mammals and birds. However, because this prediction was carried out on a limited number of mammalian and avian species registered in GenBank, further confirmation using recently reported IFITM3 sequences is needed (Bassano et al. 2017). In a previous topology study of the human IFITM3 protein, substantial evidence indicated that the human IFITM3 protein favors a cytosolic N-terminus. Our prediction suggested that the IFITM3 protein in birds has an inverted topology compared to the human IFITM3 protein, implying that the chicken IFITM3 protein prefers an extracellular N-terminus. Because the clathrin adaptor protein complex interacts with the cytoplasmic tails of membrane proteins, the position of the signal motif is important for the endosomal trafficking of a protein (Traub and Bonifacino 2013). In humans, the clathrin adaptor protein complex AP-2 recognizes the YEML motif, and the human IFITM3 protein prefers a cytosolic N-terminus (Bailey et al. 2013). However, the IFITM3 protein in birds likely does not use the Yxxϕ motif and shows a topology inverted relative to that in mammals. Since the evolutionarily conserved sorting sequence of the IFITM3 protein does not function in birds, it will be valuable to study whether the well-conserved CTD of birds, which is predicted to reside in the same compartment as the clathrin adaptor protein complex, acts as an atypical sorting signal.

Genetic polymorphisms in disease-associated genes can influence susceptibility to disease onset (Jeong et al. 2005a; Jeong et al. 2005b; Kim and Jeong 2017a). Previous studies have reported that SNPs in the IFITM3 gene are associated with several diseases, notably influenza during the 2009 IAV pandemic and UC. The rs12252 SNP, which is located in the ORF, showed a significant association with disease severity in the 2009 IAV pandemic. In addition, a previous genome-wide association study (GWAS) indicated that UC is associated with an SNP in the IFITM3 gene. A subsequent study reported that the distribution of the rs3888188 SNP in the IFITM3 gene correlated with the number of UC patients (Seo et al. 2010; Wu et al. 2007). Furthermore, the rs3888188 SNP is associated with the infection rate of tuberculosis, and recent studies reported that the IFITM3 protein participates in immune responses to a broad spectrum of pathogens, including not only enveloped viruses but also non-enveloped viruses and bacteria (Anafu et al. 2013; Naderi et al. 2016; Ranjbar et al. 2015). Because the IFITM3 protein participates in the host immune response and polymorphisms in the IFITM3 gene can strongly affect immune capacity, we performed direct sequencing and found thirteen SNPs and one insertion/deletion in the chicken IFITM3 gene (Table 2, Fig. 3), suggesting that the chicken IFITM3 gene is highly polymorphic. Because the IFITM3 gene is an important immune-related gene, its highly polymorphic nature may affect antiviral ability. Remarkably, L100M and N125H were predicted to be ‘probably damaging’ by PolyPhen-2 (data not shown). Since the non-synonymous SNPs are located in TM2 and can influence the topology of the transmembrane domain, we performed transmembrane prediction (Table 6). Interestingly, the L100M polymorphism changed the position of the TM2 domain and the length of the CIL domain. 
Since the CIL domain belongs to CD225, which is a well-conserved structure of the IFITM3 protein among species and is significantly associated with antiviral capacity, it will be important to determine these influences on host immune capacity.

The expression of the IFITM3 gene in the duck is strongly elevated in response to highly pathogenic avian influenza infection, whereas that in the chicken shows a different pattern. This result suggests that ducks are more resistant to avian influenza infection than chickens (Smith et al. 2015). In several studies, broiler breeds appeared to be generally more resistant to avian influenza infection than layer breeds. Thus, we investigated genetic differences in the IFITM3 gene between Dekalb White, a layer breed, and Ross, a broiler breed. Interestingly, the genotype and allele frequencies of the V103I polymorphism of the IFITM3 gene showed significantly different distributions between the two chicken breeds (P < 0.0001) (Table 2). Further study is needed to determine whether antiviral function differs according to the alleles of the V103I polymorphism, which differed significantly between Dekalb White and Ross.

The immune system differs substantially between mammals and birds (Kaiser 2010). Because the structure and predicted topology of the IFITM3 protein differ between these two groups, we hypothesized that the structure of the IFITM3 gene also differs. We investigated important promoter elements, including the TATA box and CGIs, in humans, mice and chickens (Fig. 4). In mammals, the IFITM3 gene does not contain a TATA box or CGI within the proximal promoter. Interestingly, the promoter structure of the chicken IFITM3 gene differs markedly from that of mammals: the chicken promoter contains a CGI only. Although promoters can be classified into more than ten types according to their structure, they can be grouped more simply into four types based on the presence or absence of two major components, the TATA box and CGIs. Among them, a TATA-/CGI- promoter is regulated in a tissue-specific manner and a TATA-/CGI+ promoter is regulated in a house-keeping manner (Danino et al. 2015; Juven-Gershon and Kadonaga 2010; Lenhard et al. 2012; Zhu et al. 2008). According to these classifications, the IFITM3 gene in mammals is predicted to be regulated in a tissue-specific manner and the chicken IFITM3 gene is predicted to be regulated in a house-keeping manner. It is important to note that the proximal promoter region of the IFITM3 gene differs markedly among species. How this difference affects the immune system remains to be confirmed in a future study. Further study based on our baseline data is highly desirable to verify the differences in promoter structure of the IFITM3 gene, because the IFITM3 protein is evolutionarily well-conserved in several species and has a crucial role in host immune systems (Chen et al. 2017; Smith et al. 2013; Zhang et al. 2012). 
Because our analysis was limited to the 300 bp region upstream of the IFITM3 gene, further investigation of an enlarged upstream region (~500 bp) and of the distal promoter is highly desirable to validate the features of the chicken IFITM3 gene identified in the present study.
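The four-way promoter classification described above can be written as a small lookup. In the minimal Python sketch below, only the two regulation modes stated in the text (TATA-/CGI- and TATA-/CGI+) are filled in; the two TATA+ classes are deliberately left unspecified because the text does not describe them. The function name is illustrative.

```python
def promoter_class(has_tata, has_cgi):
    """Return (class label, regulation mode) from TATA box and CGI presence."""
    label = f"TATA{'+' if has_tata else '-'}/CGI{'+' if has_cgi else '-'}"
    # Only the two classes described in the text are mapped to a regulation mode.
    regulation_modes = {
        (False, False): "tissue-specific",
        (False, True): "house-keeping",
    }
    regulation = regulation_modes.get((has_tata, has_cgi), "not described in the text")
    return label, regulation


if __name__ == "__main__":
    # Promoter elements as reported in the text for the proximal promoter.
    print("mammals:", promoter_class(has_tata=False, has_cgi=False))
    print("chicken:", promoter_class(has_tata=False, has_cgi=True))
```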

Conclusion

In summary, we investigated the structure of the IFITM3 protein and the genetic characteristics of the IFITM3 gene. We identified structural differences in the IFITM3 protein between mammals and birds and visualized the topological differences. We also report, for the first time, the genetic distribution of polymorphisms in the chicken IFITM3 gene, and we applied a novel approach to evaluate non-synonymous SNPs using transmembrane topology prediction. Lastly, we identified differences in promoter architecture between mammals and chickens.