Isolation, characterization and genetic diversity of NBS-LRR class disease-resistant gene analogs in multiple virus resistant line of chilli (Capsicum annuum L.)

Viruses are a serious threat to chilli crop production worldwide. Resistance screening against several viruses identified a multiple virus resistant genotype, 'IHR 2451'. Degenerate primers based on the conserved regions between the P-loop and GLPL motifs of resistance genes (R-genes) were used to amplify nucleotide binding site (NBS)-encoding regions from genotype 'IHR 2451'. Alignment of deduced amino acid sequences and phylogenetic analyses separated the isolated sequences into two groups, toll interleukin-1 receptor (TIR) and non-TIR, with different families within each group, confirming the hypothesis that dicots have both types of NBS-LRR genes. The alignment of deduced amino acid sequences revealed conservation of the subdomains P-loop, RNBS-A, kinase-2, RNBS-B and GLPL. Five distinctive resistance gene analogs (RGAs) showing specific conserved motifs were subjected to BLASTp and showed high homology at the deduced amino acid level with known R-genes such as the Pvr9 gene for potyvirus resistance, the putative late blight resistance protein homolog R1B-23 and other disease resistance genes, suggesting a high correlation with resistance to different pathogens. These pepper RGAs could be regarded as candidate sequences of resistance genes for marker development.


Introduction
Chilli (Capsicum annuum L.) is an indispensable commodity in Indian cuisine owing to the spicy taste, appealing colour and flavour of its fruits. The viruses Cucumber mosaic cucumovirus (CMV), Chilli veinal mottle potyvirus (ChiVMV), Chilli leaf curl virus (ChiLCV) and Groundnut bud necrosis virus (GBNV) pose a serious threat to chilli crop production, as their vectors are surging because of their wide adaptability to climate change. In recent years, farmers in India have become reluctant to cultivate chilli, mainly because of the severe incidence of viruses. Most of these viruses infect the host systemically: after invading cells, they express their proteins, replicate and move from one cell to another (Pallas and Garcia 2011). To counteract this, plants have evolved defence mechanisms that are mediated by resistance genes (R-genes) (Kang et al. 2005). Most R-genes encode proteins with a nucleotide binding site (NBS) and a leucine rich repeat (LRR) region (Ellis et al. 2000).
Resistance gene analogues (RGAs) are potential genomic regions that contain structural motifs such as the NBS and LRR regions. The first plant R-gene to be cloned was Hm1 in maize, through transposon tagging using the Mu transposon (Johal and Briggs 1992). Later, several R-genes such as N, M, PRF, Gpa2 and others were cloned in different crops such as eggplant (Reddy et al. 2015), grapevine (Donald et al. 2002), citrus (Deng et al. 2000) and other species (Sharma et al. 2009). Comparison of the sequences of these R-genes has shown that the majority possess conserved amino acid motifs consisting of an NBS domain and a hydrophobic domain (HD) with the consensus amino acid sequence Gly-Leu-Pro-Leu (GLPL) downstream of the NBS (Reddy et al. 2015). Molecular markers facilitate and accelerate disease resistance breeding programmes by reducing the frequency of phenotyping during selection across generations. Of the several molecular marker systems available, one with particular relevance to disease resistance is the resistance gene analog polymorphism (RGAP) marker technique (Chen et al. 1998). RGAP markers have been used to identify markers linked to different pathogen resistances in different crops (Yi et al. 2013; Basak et al. 2004; Mutlu et al. 2008).
The conserved motifs within these NBS domains make it possible to use degenerate primers and PCR to isolate RGAs (Kanazin et al. 1996; Leister et al. 1998). These RGAs have potential application in the development of closely linked markers or functional resistance gene markers for marker assisted breeding (Hammond-Kosack and Jones 1997). Leister et al. (1996) developed a PCR based method to easily isolate R-gene analogues (RGAs) from a wide variety of plant species, using degenerate primers that amplify the region between the kinase 1a motif of the NB-ARC domain and the GLPL motif. RGAs have been successfully isolated using this PCR based approach from a wide range of plant species (Sharma et al. 2009), including Cucumis (Wan et al. 2010), hot pepper (Wan et al. 2012) and wild eggplant (Zhuang et al. 2012; Reddy et al. 2015). Cloning of resistance genes and development of molecular markers will be highly useful in marker assisted breeding programmes.
The main objective of the study was to isolate RGAs from the multiple virus resistant hot pepper line 'IHR 2451' using degenerate primers designed to amplify conserved NBS and LRR regions of RGAs. The isolated RGAs would be helpful in developing PCR based markers linked to disease resistance in peppers and in cloning R-genes of other crops in the future.

Plant material and DNA isolation
'Jackpot' genotypes, which serve as a source of resistance to a number of different diseases (Grube et al. 2000a), are very useful and can be exploited by breeders. In the present study, one such jackpot genotype, 'IHR 2451', was used. It has been identified as a source of resistance to Cucumber mosaic cucumovirus (CMV) and Chilli veinal mottle potyvirus (ChiVMV) (Naresh et al. 2016), and was also reported to be resistant to Tobacco mosaic virus (TMV) and other leaf curl viruses (Singh and Thakur 1979). This genotype was therefore used to isolate RGAs using a PCR based approach. Leaf samples were collected from 15-day-old seedlings of 'IHR 2451', and DNA was isolated by the cetyl trimethyl ammonium bromide (CTAB) method (Doyle and Doyle 1990) with minor modifications. The integrity of the purified DNA was determined by gel electrophoresis, and the DNA was quantified using a Nanodrop (Thermo Scientific) and diluted to a final concentration of 40 ng/µl for PCR analysis.

Degenerate primers and PCR amplification
Two pairs of degenerate primers (Table 1) were used to isolate the RGA sequences by amplifying the region between the two conserved domains, i.e. the P-loop and GLPL, in plant R-genes. These two sets of degenerate primers amplify DNA fragments of approximately 500 bp. The PCR reaction was performed in a final volume of 25 µl containing 50 ng template DNA, 1× PCR buffer, 2.5 mmol/l MgCl2, 0.5 mmol/l dNTPs, 0.25 mmol/l of each primer and 1.5 units of Taq DNA polymerase (Bangalore Genei). The reaction was performed in an Eppendorf thermal cycler with an initial denaturation at 95°C for 5 min, followed by 35 cycles of 94°C for 45 s, 55°C for 60 s and 72°C for 60 s, and a final extension at 72°C for 10 min.
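A degenerate primer is in practice a pool of oligonucleotides, because each ambiguity code stands for two or more bases. The minimal Python sketch below, assuming the standard IUPAC nucleotide codes, expands a degenerate primer into its concrete sequences and counts them. The example primer is hypothetical (chosen only because it could encode a P-loop-like GGVGKT stretch) and is not one of the primers listed in Table 1.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand_degenerate(primer):
    """List every concrete sequence in the degenerate primer pool."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    example = "GGNGGNGTNGGNAARAC"  # hypothetical primer, not from Table 1
    print(degeneracy(example), "sequences in the pool")
    print(expand_degenerate("GGR"))
```

Higher degeneracy widens the range of NBS sequences that can be amplified, at the cost of a lower effective concentration of each individual oligonucleotide in the reaction.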

Cloning and sequencing of PCR products
The PCR products were run on a 1.5% agarose gel, visualised and documented using a UV Pro gel documentation unit. Bands of the expected size (~500 bp) were excised from the agarose gel and purified using a DNA gel purification kit (Sigma Aldrich). The purified product was cloned into the pTZ57R/T vector (Thermo Scientific) and transformed into competent Escherichia coli DH5α cells by the heat-shock method (Sambrook and Russel 2001). A total of sixty recombinant clones, i.e. thirty from each of the two primer combinations, were randomly picked. Plasmid DNA was isolated from the transformed clones using a plasmid elution kit (Sigma Aldrich) and confirmed by plasmid PCR. The plasmid DNAs were then sequenced for further analysis.

Sequence analysis and phylogenetic tree construction
All the isolated DNA sequences were trimmed to remove vector sequence contamination, and only the conserved region between the P-loop and GLPL, which represents the RGA, was considered for further analysis. Assembly of DNA sequences and translation to the predicted amino acid sequences were performed using Transeq software (http://www.ebi.ac.uk/Tools/st/emboss_transeq/). DNAMAN software (version 8.0) was used to compute the homology matrix between the isolated sequences and other related R-gene sequences. Identity and similarity searches of nucleotide and amino acid sequences were performed using BLAST against the non-redundant GenBank database at the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/BLAST/). MEME software (Multiple Expectation Maximization for Motif Elicitation) (Bailey et al. 2009) was used to analyse the conserved motif regions of the predicted RGAs. The deduced amino acid sequences of the RGAs isolated from the pepper line 'IHR 2451' were aligned with other related plant RGAs and used for phylogenetic tree construction with MEGA (Molecular Evolutionary Genetics Analysis) version 6.0 (Tamura et al. 2013). Cluster analysis was carried out using CLUSTALX (Thompson et al. 1997) based on the neighbour joining method (Saitou and Nei 1987) with default settings.
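The translation step can be illustrated with a short Python sketch. It is not the Transeq implementation; it assumes the standard genetic code, considers only the three forward reading frames (a simplification), and reports the frames that contain no stop codon, which is one way to read the "uninterrupted ORF" criterion used in this study. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, frame=0):
    """Translate a DNA string in forward frame 0, 1 or 2.

    Codons containing characters other than A, C, G, T become 'X'.
    A trailing incomplete codon is ignored.
    """
    dna = dna.upper()
    residues = []
    for i in range(frame, len(dna) - 2, 3):
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(residues)


def open_frames(dna):
    """Return {frame: protein} for forward frames without any stop codon."""
    result = {}
    for frame in range(3):
        protein = translate(dna, frame)
        if "*" not in protein:
            result[frame] = protein
    return result


if __name__ == "__main__":
    fragment = "ATGGGTGGAGTTGGGAAAACC"  # hypothetical P-loop-like fragment
    print(translate(fragment))
    print(open_frames(fragment))
```

For a real clone, the frame of interest would be the one whose translation carries the expected P-loop and GLPL motifs at its two ends.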

Isolation of resistance gene analogs (RGAs)
Nucleotide binding site-encoding DNA regions were amplified from the jackpot genotype 'IHR 2451' using the degenerate primers. The resulting fragments of the expected size of approximately 500 bp were purified and cloned into the pTZ57R/T vector. Thirty recombinant clones from each primer combination were randomly selected for sequencing. Cloning and characterization of these amplicons revealed that each fragment included many RGA sequences. Of the 60 clones sequenced, the 28 sequences with an uninterrupted ORF were considered as RGAs. A homology matrix was computed for these 28 sequences (CaTRGA1-28), and five sequences (CaTRGA1, CaTRGA4, CaTRGA14, CaTRGA15 and CaTRGA18) with <61% homology were selected for further analysis and deposited in the GenBank database (http://www.ncbi.nlm.nih.gov/genbank) under the accession numbers KX758993 (CaTRGA1), KX758994 (CaTRGA4), KX758995 (CaTRGA14), KX758996 (CaTRGA15) and KX758997 (CaTRGA18). The protein sequences of these RGAs were subjected to BLASTp, which indicated high scores against RGAs or R-genes identified in other plant species (Table 2). These five sequences were also compared for homology with the well characterized R-genes RPM1, Gpa2, M, N and PRF (Table 3).
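A homology matrix of this kind can be sketched in a few lines of Python. The example below is not the DNAMAN algorithm, whose exact scoring is not described here; it assumes the sequences have already been aligned to equal length with '-' as the gap character and defines identity as the percentage of matching residues over columns where neither sequence has a gap. The sequence names and fragments are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity of two pre-aligned sequences, ignoring gapped columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Map every (name1, name2) pair to its percent identity."""
    names = list(aligned)
    return {
        (p, q): percent_identity(aligned[p], aligned[q])
        for p in names
        for q in names
    }


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    toy = {"seqA": "GGVGKTTLA", "seqB": "GGIGKTTLA", "seqC": "GG-GKSTIA"}
    for pair, value in identity_matrix(toy).items():
        print(pair, round(value, 1))
```

Sequences whose pairwise values all fall below a chosen threshold can then be picked as mutually distinct representatives.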

Multiple amino acid sequence alignment
The region between the P-loop and GLPL was used for multiple amino acid sequence alignment between other plant R-genes and the deduced amino acid sequences of the Capsicum RGAs. In the present study, the isolated NBS-LRR sequences were classified as TIR or non-TIR according to the characteristic amino acid at the end of the kinase-2 domain. Sequences with aspartic acid (D) at the end of the kinase-2 domain are classified as TIR; on this basis, 13 sequences of the present study (CaTRGA1-13) belong to the TIR group along with N and M. Sequences with tryptophan (W) at the end of the kinase-2 domain are classified as non-TIR; CaTRGA14-28 fall in this group along with RPM1, Gpa2 and PRF (Fig. 1).
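The classification rule described above can be expressed as a small Python sketch. The regular expression used to locate the kinase-2 motif (three hydrophobic residues, then DD, then V or I, then the diagnostic residue) is a simplified assumption made for illustration and is not the procedure used in this study, where the residue was read from the multiple alignment. The example fragments are hypothetical.

```python
import re

# Simplified, assumed kinase-2 pattern: hydrophobic stretch + DD + V/I,
# followed by the diagnostic final residue captured in group 1.
KINASE2 = re.compile(r"[LIVFM]{3}DD[VI](.)")


def classify_nbs(protein):
    """Classify an NBS fragment by the residue that ends the kinase-2 motif.

    Returns 'TIR' for aspartic acid (D), 'non-TIR' for tryptophan (W),
    and 'unclassified' if the motif is not found or ends otherwise.
    """
    match = KINASE2.search(protein.upper())
    if match is None:
        return "unclassified"
    return {"D": "TIR", "W": "non-TIR"}.get(match.group(1), "unclassified")


if __name__ == "__main__":
    # Hypothetical fragments, for illustration only.
    print(classify_nbs("GMGGVGKTTLAKRVLIVLDDVDGLPL"))
    print(classify_nbs("GMGGLGKTTLAKRYLVVLDDVWGLPL"))
```

With real data, the motif position is better taken from the alignment itself, since a loose pattern can match elsewhere in a divergent sequence.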
The MEME (Multiple Expectation Maximization for Motif Elicitation) motif analyser was used to identify the structural diversity of conserved domains and their relative positions in the chilli RGAs (Fig. 2I). Six highly conserved motifs (P-loop, RNBS-A, kinase-2, RNBS-B, RNBS-C and GLPL) were observed across the sequences of the 28 isolated RGAs, representing the structural characteristics of NBS-LRR R-genes (Fig. 2II).

Discussion
Because chilli is an important commercial and industrial spice crop, improvement of disease resistance, mainly against viruses, is the top priority in pepper breeding programmes. In earlier screening for virus resistance, the jackpot genotype 'IHR 2451' showed resistance to two major viruses (Naresh et al. 2016). Identifying RGAs and using them to develop markers for disease resistance is an ideal approach for marker assisted selection (Yang et al. 2013; An and Yang 2014; Mutlu et al. 2008). These studies indicate the potential of R-gene analogues for identifying markers linked to disease resistance, as R-gene markers target conserved motifs of resistance genes (Joshi et al. 2012). To our knowledge, this is the first report of NBS-LRR class RGAs isolated from a multiple virus resistant chilli genotype, and the phylogenetic results suggested wide genetic diversity among the isolated Capsicum RGAs relative to those of previous studies (Fig. 3). NBS-LRR proteins are mainly of two types, TIR and non-TIR, both involved in pathogen recognition but distinct in sequence and signalling pathways (Michale et al. 2006). In our study, the cloned sequences belonged to both TIR and non-TIR NBS-LRR type RGAs, which is similar to results reported in eggplant (Reddy et al. 2015), cucumber (Wan et al. 2010) and peppers (Wan et al. 2012) using the same degenerate primers, indicating the success of homology based PCR cloning of NBS-LRR class disease resistance gene analogs in Capsicum. Of the 28 sequences, CaTRGA1-13 were of the TIR type, with a highly conserved aspartic acid (D) at the end of the kinase-2 domain, and CaTRGA14-28 were classified as non-TIR type, as indicated by the final residue tryptophan (W) at the end of the kinase-2 motif. In monocots only non-TIR-NBS-LRR genes are present, whereas in dicots both types have been reported (Joshi et al. 2010).
This confirms that chilli, being dicotyledonous, has both TIR and non-TIR NBS-LRR class genes. In our study, the ratio of TIR to non-TIR type NBS-LRR RGAs was approximately 1:1 (13:15), whereas high ratios were reported in other crops (Donald et al. 2002). In eggplant, however, only non-TIR RGAs were isolated by Zhuang et al. (2012). This indicates that the number and ratio of TIR and non-TIR types vary across plant species (An and Yang 2014; Reddy et al. 2015).
Of the 28 RGAs, five that have high homology with other resistance genes were selected, and these sequences were translated into protein sequences. The homology matrix results indicate that the isolated chilli RGAs are similar to well characterized R-genes, and they were therefore considered as NBS-LRR class genes. These RGAs were subjected to BLASTp, which indicated high scores against RGAs or R-genes and high identity with plant disease resistance proteins (Table 2). Resistance gene analogs are the best way to develop PCR based markers linked to resistance genes, as they are candidates involved in the plant defence mechanism (Peter et al. 2013). Major Tomato mosaic virus (ToMV) resistance has been reported to be conferred by the Tm-1 gene. Kang et al. (2010) developed SNP markers for CMV resistance by comparative mapping using a homolog of the tomato Tm-1 gene, as the Cmr1 gene (a single dominant gene conferring resistance to CMV in Bukang) is located on LG2, which is syntenic to the ToMV resistance locus (Tm-1). Similarly, Yang et al. (2009) isolated RGAs from the pepper variety Yv2007001, a line highly resistant to CMV, using degenerate primers and a PCR approach, and found that RGA 46 showed high homology with Tm-2² (ToMV), suggesting its correlation with CMV resistance. The RGA sequences isolated in the present study showed high identity with different resistance protein sequences in other crops. A lower E value indicates a more reliable match (Totad et al. 2005). Based on E value, CaTRGA1 showed high identity with ToMV resistance protein N-like in chilli (XP_016574092.1, 99% identity with E value 5e-110) and CaTRGA4 showed high identity with TMV resistance protein N-like in Solanum pennellii (XP_015062483.1, 85% identity with E value 2e-90). CaTRGA18 showed 51% identity with Pvr9, a Pepper veinal mottle virus resistance gene characterized from the resistant pepper genotype CM 334 (Tran et al. 2015).
CaTRGA15 and CaTRGA18 could therefore be potential candidates for studying nematode and virus resistance in chilli.
In plants, R-genes are usually present in clusters of tightly linked genes (Hulbert et al. 2001). It has already been reported that the L gene occurs in a cluster of resistance genes located on chromosome 11, in a position that apparently overlaps with resistance gene clusters in tomato and pepper (Grube et al. 2000b). Lefebvre (2004) reported the existence of an R-gene cluster containing disease resistance QTLs for potyvirus, cucumber mosaic virus (CMV), Tomato spotted wilt virus (TSWV) and Phytophthora capsici. Further mapping of the RGAs isolated from the jackpot genotype 'IHR 2451' is required to identify the R-gene cluster.
Fig. 4 Phylogenetic tree of the five selected CaTRGAs and other R genes; bootstrap values are expressed as a percentage of 1000 replicates

Conclusion
The present study identified five candidate RGAs from the multiple virus resistant pepper genotype 'IHR 2451' that can be used for characterization of different resistance genes. These RGAs could be regarded as candidate sequences for disease resistance and may be explored in developing molecular markers for disease resistance. Mapping of these RGAs will help in better understanding multiple virus resistance and in identifying and cloning R-genes in peppers for marker assisted breeding programmes for virus resistance.