Molecular mimicry, genetic homology, and gene sharing proteomic “molecular fingerprints” using an EBV (Epstein-Barr virus)-derived microarray as a potential diagnostic method in autoimmune disease

EBV (Epstein-Barr Virus) and other human DNA viruses are associated with autoimmune syndromes in epidemiologic studies. In this work, the immunoglobulin G (IgG) response to EBV-encoded proteins that share regions with human immune response proteins was characterized using a laser-printed microarray (PEPperprint.com). The proteins studied include ZEBRA (BZLF-1 encoded protein) and the BALF-2 recombinase, both expressed primarily during the viral lytic replication cycle, and EBNA-1 (Epstein-Barr Virus Nuclear Antigen-1), expressed during the viral latency cycle. The IgG responses to conserved “A/T hooks” in EBV-encoded proteins such as EBNA-1 and the BALF-2 recombinase, which are related to host DNA-binding proteins including RAG-1 recombinase and histones, and to EBV-encoded virokines such as the IL-10 homologue BCRF-1 suggest further directions for clinical research. The author suggests that proteomic “molecular fingerprints” of the immune response to viral proteins shared with human immune response genes are potentially useful in early diagnosis and in monitoring of autoantibody production and response to therapy in EBV-related autoimmune syndromes.


Introduction
DNA viruses such as EBV (Epstein-Barr Virus) have been associated with numerous autoimmune syndromes in epidemiologic studies, but the specific role of infection remains unresolved [1][2][3]. Conventional serologic testing for EBV is based on specific immunoglobulin G (IgG) binding to viral proteins such as EBNA-1 (Epstein-Barr Virus Nuclear Antigen-1), VCA (Viral Capsid Antigen), and EA (viral components termed Early Antigen). A correlation between disease and antigen-specific IgG levels against particular EBV proteins has been reported in SLE, MS (multiple sclerosis), and other autoimmune syndromes [4][5][6][7][8][9]. Since the vast majority of the adult population is positive for EBV, it would be useful to have additional laboratory analyses for clinical evaluation and monitoring of autoimmune disease based on the response to past EBV infection [10][11][12][13].
Recent developments in proteomic analysis and technology have dramatically increased the power and lowered the cost of many immunological tests [14]. Presently, the standard of diagnosis for many autoimmune diseases is based on ANA (anti-nuclear antibody) assays, in which immunoglobulin binding to the cell cytoplasm and nucleus is characterized, and on related IgG binding assays. While ANA and immunoassay technology can in principle be automated, a more recent approach is the use of specific antigenic host and viral proteins printed in a microarray, with sensitivity similar to or greater than that of the ANA [15,16]. Using proteomic technology, hundreds or even thousands of epitopes can be analyzed simultaneously with a single serum sample. Detailed mapping of autoreactive viral epitopes could suggest epitopes capable of differentiating healthy individuals from patients with autoimmune syndromes such as SLE and scleroderma, in which EBV reactivation, EBNA-1, and ZEBRA expression are evident [17][18][19].
One proposed mechanism by which viral proteins trigger autoreactive IgG is “molecular mimicry,” in which short regions of similarity between viral and host proteins are proposed to cross-react and, through “epitope spreading,” lead to IgG against host proteins in the presence of viral inflammation and defects in host suppressor cells [20][21][22][23][24]. EBV-encoded proteins such as ZEBRA (BZLF-1 protein) share regions with host transcription factors in the fos/jun family and also with host ankyrin proteins that anchor the cytoskeleton and regulate host transcription factors such as p53 and NF-kB that are important in the immune response [25]. A well-characterized example of “molecular mimicry” between EBV and host proteins is the similarity between regions of EBNA-1 and the “Smith Antigen” in SLE [26,27]. EBNA-1 and other proteins, including the viral-encoded recombinase BALF-2 protein, share a DNA-binding domain termed an A/T hook with both vertebrate and invertebrate transcription factors and DNA-binding proteins such as histones and recombinases [28]. Chronic viruses also encode “virokines” similar to host cytokines [29]. It would therefore be useful to have a sensitive and inexpensive method to generate a “molecular fingerprint” of IgG binding to shared regions of viral and host proteins that could be compared between healthy individuals and patients with autoimmune disease or at risk of it, as a complement to available, more labor-intensive and costly technology such as the ANA. Using “PEPperprint” technology, the authors characterized peptide epitopes that might provide such a proteomic “molecular fingerprint” [30]. Proteomic technology appears to be highly flexible and inexpensive relative to previous whole protein-based technologies and also permits localization of binding to precise intervals of the viral proteins.
A limitation of the technology is that some epitopes which require complex folding and tertiary interactions of viral proteins or secondary modifications evident in ELISA binding assays will not be detected [12,31]. Conclusions regarding clinical utility therefore must await confirmation in additional studies based on direct comparison with other diagnostic systems [14].

Methods
Molecular images of “A/T hook” DNA-binding domains shown in the figures were generated by the authors from published, publicly accessible X-ray crystallography coordinates of the crystallized RAG-1 nonamer (pdb file 3GNA) and Herpes Simplex ICP8 (pdb file 1URJ). The “pymol” program was used through license from the author. The immunoglobulin G response to immunologically important viral proteins containing regions shared with human immune response proteins was characterized. Serum from a healthy EBV-positive donor was obtained with consent for research as part of an ongoing research study of SLE and scleroderma [32,33].
Using proprietary technology (PEPperprint.com, Heidelberg, Germany), overlapping arrays of 15 amino acid long peptides were synthesized on a microchip [30]. Peptides were printed progressively across the region of interest with a 13 amino acid overlap between consecutive peptides, that is, a progression of two amino acids per peptide. Chips were incubated with serum, developed, and analyzed with antibodies specific for immunoglobulin G. Analysis of the immunoglobulin binding was provided by the chip manufacturer (PEPperprint.com) using 1 to 100 and 1 to 500 dilutions of serum. Results are presented for both serum dilutions.
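The tiling scheme described above (15-residue peptides advancing two residues at a time) can be illustrated with a minimal Python sketch. The function name `tile_peptides` and the toy sequence used in the usage note are hypothetical and are not taken from the manufacturer's software; how the proprietary process handles trailing residues that do not fill a complete window is not stated in this work, so the sketch simply omits them.

```python
def tile_peptides(sequence, length=15, step=2):
    """Return (start, peptide) pairs for overlapping peptides.

    start is the 1-based position of the first residue of each peptide.
    With length=15 and step=2, consecutive peptides overlap by 13 residues.
    Assumption: trailing residues that do not fill a full window are skipped.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    return [
        (i + 1, sequence[i:i + length])
        for i in range(0, last_start + 1, step)
    ]
```

For a hypothetical 20-residue sequence such as `"ACDEFGHIKLMNPQRSTVWY"`, this yields three peptides starting at positions 1, 3, and 5, each sharing 13 residues with the next.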
In the figures presented in this work, the results of IgG binding are presented in two columns to the right of the printed epitopes. The left column presents the results from the 1/500 serum dilution, and the right column presents the results from the 1/100 serum dilution, to which an arbitrary factor of “2000” has been added to assist in graphic analysis of results. Results shown in this work suggested that both dilutions of serum gave similar results. Results of IgG binding to shared epitopes were also determined in a patient with an EBV-related autoimmune syndrome, scleroderma, and were significantly different from those of the healthy control; for example, the IgG response to BZLF-1-associated epitopes was elevated relative to EBNA-1 epitopes when compared with the healthy control, as might be expected due to increased lytic viral replication. However, a complete summary of responses in healthy individuals versus autoimmune disease patients is beyond the scope of this work and will be reported elsewhere.
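The display convention described above amounts to simple arithmetic, sketched here in Python. The function name `display_columns` and the intensity values in the usage note are hypothetical; only the offset of 2000 applied to the 1/100 dilution comes from the text.

```python
def display_columns(binding_1_500, binding_1_100, offset=2000.0):
    """Pair the two dilution readouts for side-by-side display.

    The 1/500 values are left unchanged; the arbitrary offset is added
    only to the 1/100 values so the two columns separate visually.
    """
    if len(binding_1_500) != len(binding_1_100):
        raise ValueError("both dilutions must cover the same peptides")
    return [
        (low, high + offset)
        for low, high in zip(binding_1_500, binding_1_100)
    ]
```

For example, `display_columns([10, 250], [40, 900])` pairs each 1/500 value with its shifted 1/100 counterpart. The offset must be subtracted again before the two dilutions are compared quantitatively.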
In the tables shown in this work, each row shows the location of the printed peptide on the chip, followed to the right by the actual peptide sequence and then by the corrected binding intensity of donor IgG at 1/500 and 1/100 dilutions of serum. In each 15 amino acid sequence, the approximate position of the amino acids with maximal detected IgG binding is underlined. A similar profile or “fingerprint” of response to defined viral peptides could be defined for patients with EBV-related syndromes as well as syndromes associated with reactivation of other herpes viruses such as herpes simplex (an alpha herpes virus latent in neuronal tissues) and CMV or HHV-6 (beta herpes viruses latent in hematopoietic stem cells).
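This work does not state how the underlined positions of maximal binding were chosen. One hypothetical way to approximate such a region from overlapping-peptide data is to take the residues shared by the strongest-binding peptide and its contiguous neighbors whose signal stays above a chosen fraction of the peak. The Python sketch below illustrates that idea only; the function name `core_epitope`, the `fraction` parameter, and the 50% default are assumptions and not the authors' or the manufacturer's method.

```python
def core_epitope(peptides, intensities, fraction=0.5):
    """Estimate the residues shared by the peak peptide and its neighbors.

    peptides: list of (start, sequence) with 1-based start positions,
              ordered along the protein.
    intensities: binding intensity for each peptide, same order.
    Returns (start, end, core_sequence), or None if no residues are shared.
    """
    if not peptides or len(peptides) != len(intensities):
        raise ValueError("peptides and intensities must align")
    peak = max(range(len(intensities)), key=intensities.__getitem__)
    cutoff = fraction * intensities[peak]
    lo = hi = peak
    # Extend to contiguous neighbors that remain above the cutoff.
    while lo > 0 and intensities[lo - 1] >= cutoff:
        lo -= 1
    while hi < len(intensities) - 1 and intensities[hi + 1] >= cutoff:
        hi += 1
    # Shared residues run from the latest start to the earliest end.
    start = peptides[hi][0]
    first_start, first_seq = peptides[lo]
    end = first_start + len(first_seq) - 1
    if start > end:
        return None
    core = first_seq[start - first_start:end - first_start + 1]
    return start, end, core
```

The cutoff fraction strongly affects the width of the reported region, so any region obtained this way should be read as approximate.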

Results
IgG response to ZEBRA protein
EBV-encoded ZEBRA protein is a component of the molecular switch between latency and lytic cycles, and IgG response to ZEBRA is present in many autoimmune syndromes. For example, a high level of ZEBRA protein expression is also evident in SLE and scleroderma [32,33]. As shown in Fig. 1, ZEBRA has a modular structure with three regions, each shared with a different host immune response protein. The amino terminus of ZEBRA is a transcription activating domain similar to host immune response transcription activating domains, the central region is a DNA-binding domain similar to the fos/jun DNA-binding proteins that regulate the host immune response, and the terminal region is related to host anchoring proteins regulating NF-kB and innate immune activation, resulting in potential antigen sharing with multiple host DNA-binding proteins including the p53 tumor suppressor [34,35].
IgG binding in a healthy EBV-positive donor to the entire approximately 220 amino acid long ZEBRA sequence was analyzed using a laser-printed peptide chip containing sequences from both the laboratory strain B95-8 and a more virulent human strain of EBV termed “Akata.” Data are shown for the Akata-based peptides; results were similar or identical to those obtained with B95-8 peptides (B95-8 data not shown). As shown in Table 1, the immunoglobulin G response in a healthy donor was confined to two epitopes, both located in the amino-terminal region of the ZEBRA protein. No immunoglobulin G response was evident to the DNA-binding or ZANK (Zebra ANKyrin-like) regions of ZEBRA in either B95-8 or Akata protein sequences, suggesting that these regions of ZEBRA are not the source of autoantibodies to p53 and related proteins in SLE [36,37]. Similar IgG binding epitopes in ZEBRA were evident in a patient with scleroderma, as will be discussed in more detail elsewhere (data not shown). IgG binding regions of ZEBRA could be used as a part of a molecular fingerprint of IgG binding to regions of EBV shared with host genes (see Discussion).
Table 1 IgG binding to ZEBRA (BZLF-1) peptides TPDPYQV (aa 15-21) and PTGSWFP (aa 70-76)
IgG response to EBNA-1 peptide regions shared with host Smith antigen and P53/TRAF binding domains
EBNA-1 protein is the primary protein expressed during EBV latency and is highly antigenic for the lifetime of the host due to periodic viral reactivation and cell lysis releasing small amounts of EBNA-1 into the circulation [38]. Despite the nuclear localization of EBNA-1 suggested by the name “Epstein-Barr virus nuclear antigen,” EBNA-1 generates a persistent immunoglobulin G response in both healthy individuals and patients with autoimmune syndromes. The unique role of EBNA-1 in viral latency and persistence has prompted decades of research on both the humoral and cellular immune response to EBNA-1 as a tool for viral diagnosis and potentially a tool for antiviral therapy and vaccines [7,17,19]. Because of the highly repetitive sequences of certain EBNA-1 regions and the large size of the protein, only specific defined regions of EBNA-1 were analyzed in this work, in contrast to the entire amino acid sequence of ZEBRA protein from two different EBV strains (Table 1).
Research has confirmed that suppressor mechanisms in healthy individuals normally limit “epitope spreading” between the similar regions of EBNA-1 and host antigens [26,27,39]. Several decades ago, molecular mimicry between EBNA-1 and the SLE Smith antigen was observed and subsequently validated in animal models as a mechanism of viral pathogenesis. To facilitate comparison with previous studies, the PEPperprint microchip was applied to a well-characterized EBNA-1 protein epitope similar to the Smith antigen (Table 2). As shown in Table 3, another region of EBNA-1, unrelated to the Smith antigen-like regions, that binds to host transcription factors P53 and TRAF also generated a significant host IgG response [40]. This region of EBNA-1 is not related to the ankyrin-like regions of ZEBRA and is also unlike the regions of p53 that bind autoantibodies described in SLE and related syndromes [36,37].
As with the ZEBRA protein regions, specific IgG binding epitopes of EBNA-1 protein might be a useful diagnostic marker for characterizing a normal response to EBV infection and also for identifying defects in immune suppression in a variety of EBV-related syndromes. Remarkably, a peak of IgG binding was evident overlapping the EBNA-1 Smith antigen-like sequence, which also co-localizes to a longer peptide, “PPRRPPPGRRPFFHPVGEADYFEYHQE (EBNA-1 391-420),” previously shown by ELISA assay to correlate with diagnosis of MS [18]. Similar results were reported by another MS case-control study, which correlated progression of MS with both IgG binding by ELISA and T lymphocyte response to EBNA-1 peptide region 400-641, which contains the Smith antigen-like region [7,17,19]. If in fact the specific epitope in EBNA-1 correlating with progression in MS is similar or identical to the EBV-associated “Smith antigen” identified in EBV as an early antigen in SLE, this observation might suggest that “proteomic footprinting” can identify previously unknown relationships between pathogenesis and prognosis of MS and SLE (see “Discussion”).
IgG response to A/T hook region peptides of EBNA-1 shared with viral recombinases and host RAG-1 recombinase
The “A/T hook” domain defines a family of conserved DNA-binding proteins that bind A/T-rich regions of DNA such as the immunoglobulin nonamer region (Fig. 2). A/T hook domains are also present in host histone proteins as well as both vertebrate and invertebrate DNA-binding proteins including the RAG-1 recombinase [41,42]. The vertebrate RAG-1 protein is required for generation of the host immune repertoire and binds A/T-rich regions of immunoglobulin and T cell receptor genes with an A/T hook domain shared with a conserved herpes virus recombinase [43].
Two different amino-terminal EBNA-1 A/T hook regions are required for viral replication during genome latency, probably by linking the host DNA replication and transcription process to specific regions of the viral episome [28]. A comparison of the A/T hook regions of RAG-1, EBNA-1 (A/T hook 1), EBV-encoded BALF-2, and BALF-2-like proteins in herpes simplex (ICP8) and CMV is shown in Fig. 2. The crystal structure of the herpes simplex ICP8 A/T hook is similar to that of the RAG-1 A/T hook, although little primary amino acid homology is evident (Fig. 3). As shown in Fig. 3, the A/T hook is exposed on the surface of DNA-binding proteins (published data from host RAG-1 and unpublished data derived from viral ICP8 proteins shown) and therefore would presumably provoke a strong immune response. As shown in Table 4, a high level of immunoglobulin G response is present to the first A/T hook region of EBNA-1 and a significantly lower response to the second hook region of EBNA-1. The difference in response to these two similar EBNA-1 structural regions suggests that specificity is present, favoring recognition of the first EBNA-1 hook region over the second hook region despite similar protein tertiary structure. Preliminary data suggest that a much lower response to A/T hook regions of EBNA-1 is present in a patient with scleroderma relative to the data shown from a healthy control, suggesting that A/T hook regions could be useful to categorize normal and abnormal responses to EBNA-1 (unpublished observations). The A/T hook regions described could provide a molecular fingerprint for diagnosis and evaluation of autoimmune syndromes as well as a specific marker for previous EBV and related herpes virus infection. As shown in Table 5, the A/T hook region of herpes simplex ICP8 protein is highly antigenic in a healthy donor. IgG response to the BALF-2 A/T hook and host RAG-1 protein was present but at a much decreased level relative to the A/T hooks of EBNA-1 or ICP8 (data not shown).
The authors suggest that in pathologic inflammation, EBNA-1 and related virus A/T hooks could serve as autoantigens, since the A/T hook is highly immunogenic in EBNA-1 and other herpes viruses.

IgG response to EBV-encoded “virokine” BCRF-1-encoded protein
Many viral pathogens including EBV encode cytokine-like molecules termed “virokines” and also viral-encoded cytokine receptors [44][45][46]. Virokines may also be reduced or masked as targets of the host immune response due to their small size and secondary modifications such as carboxylation [29]. Virokines could interfere with host-encoded cytokines through a variety of mechanisms. Preliminary results obtained with a healthy EBV-positive donor and also a patient with scleroderma, an EBV-associated autoimmune disease, suggest that most of the IgG response against the virokine IL-10 homologue is at a very low level and restricted primarily to regions of the BCRF-1 protein that are identical to the host IL-10 (data not shown). These preliminary results are consistent with previous studies demonstrating that most of the IgG response to virokines is directed at shared epitopes between virokine and host cytokine (Fig. 4).
If further studies confirm that virokines such as the BCRF-1-encoded EBV protein are poorly antigenic due to the extensive sharing of epitopes with host cytokines, then the host would be vulnerable to the partial agonist and partial antagonist properties of EBV-encoded proteins such as BCRF-1 acting on the IL-10 receptor, and this could suggest a paradigm for the pathogenic effects of other shared genes [47][48][49][50][51]. In addition, current studies that characterize levels of IL-10 in autoimmune syndromes may in fact be measuring a combination of viral-encoded virokines and host cytokines. The authors suggest that characterization of IgG response to virokines, viral-encoded cytokine receptors, and host cytokines such as IL-10 is a novel use of proteomic molecular fingerprinting that could also help to distinguish between healthy controls and patients with autoimmune syndromes.
Fig. 3 A/T hook domain from host RAG-1 and herpes simplex virus ICP8. IgG-binding regions of herpes ICP8 and corresponding regions of RAG-1 are shown (box). The human RAG-1 protein (recombination activating gene encoded protein) A/T hook has been crystallized bound to the A/T-rich nonamer region of immunoglobulin and T cell receptor genes, and the corresponding region of the herpes simplex ICP8 protein has also been crystallized, although not bound to DNA. Very similar structures are evident in A/T hook proteins despite divergent amino acid sequences

Discussion
Our preliminary results with defined IgG-binding regions of the ZEBRA, EBNA-1, and virokine BCRF-1 proteins in a healthy donor suggest that a proteomic “molecular fingerprinting” tool could significantly lower the cost and time required for analysis. Further population-based studies would be useful to compare healthy individuals with patients who have autoimmune disease or are at risk of it, as a complement to available, more labor-intensive and costly technology such as the ANA. Proteomic assays could potentially be used both for diagnosis and to monitor response to therapy. In particular, using inexpensive and highly automated “molecular fingerprints,” it might be possible in the future to identify patients at risk of autoimmune syndromes prior to development of symptoms based solely on their response to specific epitopes in viral proteins and shared host proteins. Targeting therapy prior to onset of disease could limit disease progression and therapy-related adverse events.
Published data from an FDA-approved protein microarray showed that levels of IgG response to host proteins and EBV-encoded EBNA-1 protein and other viral proteins could diagnose both SLE and scleroderma with sensitivity and specificity similar to or greater than conventional ANA testing [15]. Addition of viral protein antigens in this assay significantly improved the sensitivity and specificity, although the specific IgG-binding epitopes recognized by immunoglobulin G were not disclosed [15,16]. In this work, IgG-binding results are presented from a proprietary laser-printed peptide microchip containing overlapping peptides from immunologically important EBV proteins to illustrate the potential of this new technology. While abnormal immune response to EBV-specific proteins such as EBNA-1 has been described in autoimmune syndromes, the specific short epitopes described in this work could facilitate proteomic “molecular fingerprints” in autoimmune disease [7,17,18,19]. The A/T hook region of EBNA-1 and many herpes virus and host proteins, as well as short IgG binding regions in the lytic ZEBRA switch protein described in this work, also appear to be promising antigens for “molecular footprints” of other inflammatory and autoimmune syndromes. Viral proteins such as the viral recombinase BALF-2 protein in Epstein-Barr virus and its homolog ICP8 in herpes simplex and other alpha, beta, and gamma Herpesviridae could also provide epitopes defining parameters such as viral reactivation and molecular mimicry with host proteins [25,43].
Table 4 IgG response was present to EBNA-1 protein A/T hook region GRPGAPGG (aa 51-57). Binding to the EBNA-1-like peptide GRPGAPGG is less than to EBNA-1 protein A/T hook region GRPGAPGG and similar to maximal binding to ZEBRA epitope TPDPYQV (aa 15-21)
The A/T hook regions of viral and host proteins are shared among both eukaryotic and prokaryotic DNA-binding proteins, such as the RAG-1 recombinase and histones, in both vertebrates and invertebrates, including the developmentally regulated “hin” and “engrailed” families [41]. Other authors have also recently noted significant overlap between proteins of the human immune response and proteins expressed by papilloma virus, another DNA virus correlated with SLE and other human autoimmune syndromes [47][48][49][50][51]. Thus, it seems that “gene sharing” with host immune response genes is not limited to transcription factors and virokines but may include all aspects of the host immune response. Preliminary data on the immune response to some shared gene-encoded proteins reported in this work suggest that while some shared gene-encoded proteins such as BZLF-1 and EBNA-1 are highly antigenic and thus trigger IgG against self-proteins, other EBV-encoded shared genes such as virokines may be poorly antigenic, permitting the viral-encoded products to function as antagonists or partial agonists of the host immune response. The author suggests that relatively inexpensive new technology such as “proteomic fingerprinting” may be useful to define differences in immune response between patients with autoimmune syndromes and healthy controls, including immune response to “shared genes.”

Acknowledgements
The author acknowledges Yehuda Shoenfeld, MD and his research group (Tel Shomer, Sheba Medical Center, Israel) and Erwin Gelfand (retired; National Jewish Medical and Research Center, Denver, CO) for their contributions to mentoring of DHD and helpful comments. The author also acknowledges Eugenia Spanopoulou (deceased) for her unique and enduring contributions to the study of RAG protein A/T hooks and RAG function, and Jaap Middeldorp for his pioneering work on EBV peptide arrays and helpful comments to the author of this work.
Data processing and IgG binding analysis shown were provided on a complimentary basis to Dr. David Dreyfus, Keren LLC, New Haven, CT by PEPperprint.com. All other analysis of the data was self-funded by the authors.
Fig. 4 Virokines and viral-encoded cytokine receptors as targets for host IgG response. A schematic diagram illustrates several mechanisms through which host IgG against viral proteins similar to host cytokines (virokines) and cytokine receptors could interfere with endogenous regulation of suppressor cell regulatory cytokines such as IL-10. EBV-encoded IL-10-like BCRF-1 is approximately 90% identical to host IL-10, and preliminary results suggest that the immune response to BCRF-1-encoded protein is extremely low and restricted to areas of homology to the host IL-10 cytokine (see “Discussion”)