Antibody Modeling, Engineering, and Design
Initially, antibodies directed toward specific targets were obtained by immunization and hybridoma technology (Lindl 1996; Liu 2014). The development of surface display technologies has made it possible to obtain antibodies in vitro as well. These tools use large combinatorial libraries of antibody sequences (usually in formats that include mostly the variable domains) that are expressed on the surface of a vehicle such as phage or yeast. Such approaches couple a DNA sequence to a functional readout (binding of an antigen), thus allowing for selection and in vitro evolution of antibodies to bind a target molecule (Gai and Wittrup 2007; Sheehan and Marasco 2015). Library display methods can screen large numbers of antibody variants, sometimes on the order of 10^10 or more. Advances in next-generation sequencing of synthetic libraries have further improved antibody engineering, allowing for detailed analysis of the selection process and identification of specific variants of interest (Glanville et al. 2015; Koenig et al. 2017b; Wrenbeck et al. 2017).
While immunization and display technologies successfully procure antibodies that bind to almost any target antigen of interest, for many applications binding is not sufficient. For example, to be considered as a drug candidate, an antibody may need to agonize or antagonize the target protein. The aforementioned methods select the antibodies with the highest affinity, regardless of their functional effect on the target, so additional stages of functional screening are usually required. Finding a functional antibody with these methods is thus sometimes analogous to finding a needle (the binder with desired function) in a large haystack (the immune repertoire or synthetic library). Better models of the antibody-antigen complex may help in identifying, or even designing, functional antibody sequences.
Modeling Antibodies and Antibody-Antigen Complexes
With the growing importance of antibodies as reagents, diagnostic tools, and therapeutics, the need for modeling and engineering them has increased.
The challenges and requirements of modeling antibodies are different from those of modeling the structures of other proteins. In general, protein structure modeling is initially dependent on the prediction of the correct fold for the protein of interest, which can be done by homology or fold recognition methods (Dunbrack 2006; Goldsmith-Fischman and Honig 2003). Once the correct protein fold is identified, additional components, including optimization of the amino acid side chains and modeling the conformations of protein loops, can be addressed. In contrast, the overall fold of antibodies, as well as most of the CDR loops (North et al. 2011), is conserved, and therefore modeling them is rather straightforward (Almagro et al. 2014). Challenges arise in modeling CDR H3, which is variable both in sequence and structure (Regep et al. 2017), as well as the interface between the VH and VL domains, as their relative orientation may also vary. Since H3 forms part of the H-L interface, the modeling of these two components is interdependent. For an in-depth review of H3 modeling, see Marks and Deane (2017). Recent advances in CDR H3 and VH-VL interface modeling have been achieved using multiple template structures and CDR grafting, as well as implementing machine learning methods (Bujotzek et al. 2015; Dunbar et al. 2013; Marze et al. 2016; Weitzner et al. 2014). Several programs and web servers are available for antibody modeling and are detailed in Almagro et al. (2014). Both homology-based and non-homology-based methods can be employed for modeling (Leem et al. 2016; Marcatili et al. 2008; Norn et al. 2017; Sircar et al. 2009).
A detailed three-dimensional model of the antibody-antigen complex often serves as the basis for the computational design of new antibody sequences. However, modeling the antibody bound to the antigen introduces an additional layer of complexity beyond that of antibody modeling. A model of the antibody-antigen complex can be obtained using in silico protein-protein docking. Docking programs can generate hundreds or thousands of docked models representing the complex, which are then evaluated using different scoring functions based on physicochemical properties of the docked complex, such as shape and/or electrostatic complementarity and binding energies (Gromiha et al. 2017; Zhang et al. 2016). The key to success in protein-protein docking is identifying the correct (native-like) model from the many docking poses that docking programs generate. Antibody-antigen docking is somewhat more focused than protein-protein docking in general, as constraints can be implemented to direct the docking to the CDR regions of the antibody. However, similar to what is observed in protein-protein docking (Lensink et al. 2016), antibody-antigen docking has been shown to have moderate success, at best. One assessment showed that even when the structures of the antibody and the antigens are known, a state-of-the-art docking program correctly identified the native pose as its top-ranked model in only 35% of the cases. When using modeled structures, success rate was substantially worse (Kilambi and Gray 2017).
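The ranking step described above, and the way a top-ranked success rate is measured on a benchmark, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Pose` record, the convention that a lower score is better, the 4.0 Å interface-RMSD cutoff for calling a pose native-like, and the toy benchmark values. It does not reproduce the scoring function or the assessment protocol of any cited docking program.

```python
from dataclasses import dataclass


@dataclass
class Pose:
    pose_id: str
    score: float           # hypothetical docking score; lower is assumed better
    interface_rmsd: float  # Å from the native complex (known only in a benchmark)


NATIVE_LIKE_RMSD = 4.0  # assumed cutoff for a "native-like" pose, illustrative only


def rank_poses(poses):
    """Sort docked poses from best (lowest) to worst score."""
    return sorted(poses, key=lambda p: p.score)


def top_n_success(poses, n=1, cutoff=NATIVE_LIKE_RMSD):
    """True if any of the n best-scored poses is native-like."""
    return any(p.interface_rmsd <= cutoff for p in rank_poses(poses)[:n])


def success_rate(benchmark, n=1):
    """Fraction of benchmark complexes with a native-like pose in the top n."""
    hits = sum(top_n_success(poses, n) for poses in benchmark.values())
    return hits / len(benchmark)


# Toy benchmark with invented numbers: two complexes, two poses each.
benchmark = {
    "complex_A": [Pose("a1", -32.0, 1.8), Pose("a2", -28.5, 12.4)],
    "complex_B": [Pose("b1", -30.1, 15.0), Pose("b2", -29.7, 2.2)],
}

if __name__ == "__main__":
    print(success_rate(benchmark, n=1))
    print(success_rate(benchmark, n=2))
```

In this toy case the native-like pose of the second complex is generated but scored slightly worse than a wrong pose, which is the failure mode the text refers to: sampling can succeed while scoring fails to put the correct model first.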
Antibody Design: Predicting the Determinants of Specificity and Affinity
The primary goal for antibody design is to increase affinity, specificity, and in some cases stability, expression, or bioavailability. Computational antibody design aims to identify positions that can mediate such improvements. As such, a thorough understanding of both the paratope and epitope, specifically, the regions that are responsible for specificity and affinity, is essential for successful antibody engineering.
As the CDRs are largely responsible for the interactions that guide antibody function, understanding both their amino acid composition relative to the rest of the protein and distinctive roles of each CDR sheds light onto the determinants of specificity and affinity.
The prevalence of certain amino acids, such as Tyr, in the CDRs of antibodies, and their role in antigen binding, has been observed (Burkovitz et al. 2014; Krawczyk et al. 2013; Kunik and Ofran 2013; Ofran et al. 2008). Furthermore, the overrepresentation of Tyr, Ser, and Trp in antigen-contacting residues in germlines, but not in the antigen-contacting residues that are introduced during somatic hypermutation, has been demonstrated (Burkovitz et al. 2014). The prevalence of these amino acids, therefore, comes from the original germline sequences of the V, D, and J genes and is not strongly selected for during somatic hypermutation.
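One simple way to quantify such overrepresentation is a log-ratio of amino acid frequencies in two sets of positions. The Python sketch below is a hedged illustration, not the method of the cited studies: the function name `log2_enrichment`, the pseudocount of 1, and the two toy residue strings are all assumptions introduced here.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log2_enrichment(cdr_residues, framework_residues, pseudocount=1.0):
    """Per-residue log2(frequency in CDRs / frequency in framework).

    A pseudocount (an assumption of this sketch) keeps residues that are
    absent from one set from producing a division by zero.
    """
    cdr = Counter(cdr_residues)
    fw = Counter(framework_residues)
    cdr_total = sum(cdr[aa] for aa in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    fw_total = sum(fw[aa] for aa in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {
        aa: log2(((cdr[aa] + pseudocount) / cdr_total)
                 / ((fw[aa] + pseudocount) / fw_total))
        for aa in AMINO_ACIDS
    }


# Toy input, invented for illustration: residues pooled from CDR positions
# and from framework positions of some set of antibodies.
cdr_positions = "YYSYWGYSY"
framework_positions = "EVQLVESGGGLVQPGGSLRLSCAAS"

if __name__ == "__main__":
    scores = log2_enrichment(cdr_positions, framework_positions)
    for aa, value in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
        print(aa, round(value, 2))
```

Positive values indicate residues that are more frequent in the CDR set than in the framework set. The same calculation could be applied to germline-encoded versus SHM-introduced contact residues to compare the two origins discussed above, given suitably annotated data.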
Each of the six antigen binding regions (ABRs, analogous to CDRs) has been shown to have unique characteristics not only in its amino acid composition but also in the type of non-covalent contacts that it preferably forms with the antigen (Kunik and Ofran 2013). For example, H2 has been shown to specialize in forming salt bridges, while H3 contributes more H-bonds and L1 and L3 contribute to polar interactions.
Antibody engineering relies not only on knowing what variability to introduce when designing an antibody but also on understanding where in the sequence to introduce it. These positions may not necessarily be in the CDRs. A large-scale study of germline antibody sequences identified positions that undergo somatic hypermutation (SHM) and predicted the contributions of these SHM positions to the antibody binding affinity (Burkovitz et al. 2014). SHM positions in distinct structural regions of the antibody, including those not in the CDRs, contribute to the binding affinity of the antibody to its antigen (Burkovitz et al. 2014). In addition, it has been shown that framework residues may contribute to the binding specificity or affinity of the antibody via non-local effects (Baran et al. 2017; Burkovitz et al. 2014; Koenig et al. 2017a; Sela-Culang et al. 2012; Yang et al. 2017).
Detailed information on the epitope is important for optimization of existing antibodies. This may be obtained experimentally through X-ray crystallography or through hydrogen/deuterium-exchange assays. In the absence of experimental data, epitope identification, particularly for conformational epitopes, remains challenging. Several computational approaches have been proposed to address the challenge of predicting general epitopes in proteins (Gao and Kurgan 2014). When the sequence of the antibody is known, methods for antibody-specific epitope prediction provide better epitope mapping even in the absence of a 3-D structure of the antibody (Hua et al. 2017; Krawczyk et al. 2014; Sela-Culang et al. 2015, 2014).
Attempts to improve binding affinity or alter specificity of existing antibodies often rely on structural analysis (of either an experimental structure or a predicted model) and on methods to predict the biophysical effects of mutations (Clark et al. 2006; Farady et al. 2009; Lippow et al. 2007). However, while such methods can often identify mutations that have large effects, predicting which mutations will have moderate or small effects on binding remains a challenge (Sirin et al. 2016). Nevertheless, recent ambitious studies have attempted to go beyond improvement of existing antibodies to computationally design antibodies with completely new functions, such as novel binding specificities.
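The effect of a mutation on binding is commonly expressed as a change in binding free energy, which is related to the dissociation constants of the wild-type and mutant antibodies by the standard thermodynamic relation ΔΔG = RT ln(Kd,mut / Kd,wt). The Python sketch below only illustrates that conversion. The function name `ddg_from_kd`, the temperature of 298.15 K, and the example Kd values are assumptions, and the sign convention (negative means tighter binding) is the one adopted here.

```python
from math import log

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd(kd_wt, kd_mut, temperature=298.15):
    """Change in binding free energy (kcal/mol) upon mutation.

    ddG = RT * ln(Kd_mut / Kd_wt); negative values mean the mutant
    binds more tightly than the wild type. Both Kd values must be in
    the same units.
    """
    if kd_wt <= 0 or kd_mut <= 0:
        raise ValueError("dissociation constants must be positive")
    return R_KCAL * temperature * log(kd_mut / kd_wt)


if __name__ == "__main__":
    # Hypothetical example: a mutation that improves Kd from 10 nM to 1 nM.
    print(round(ddg_from_kd(kd_wt=1e-8, kd_mut=1e-9), 2))
    # Hypothetical example: a mutation that improves Kd only twofold.
    print(round(ddg_from_kd(kd_wt=1e-8, kd_mut=5e-9), 2))
```

Because the relation is logarithmic, a tenfold change in affinity corresponds to roughly 1.4 kcal/mol at room temperature and a twofold change to roughly 0.4 kcal/mol, so moderate or small affinity effects correspond to small energy differences.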
Challenges and Guidelines
Current challenges to the computational design of antibodies are related to antibody and antibody-antigen complex modeling as well as to the ability to screen synthetic libraries for functional antibodies. As described above, while the framework of antibodies, as well as most of the CDR loops, can be modeled with a high level of confidence due to sequence and structural conservation, modeling CDR H3 proves more difficult as it is highly variable in sequence, structure, and length. In addition, this variability impacts the reliability of modeling the interface between the VH and VL domains since CDR H3 contributes to this interface. Even when a reliable model of the antibody is available, modeling the antibody-antigen complex is limited by the current capabilities of protein-protein docking and scoring functions. Some of the challenges described here may be overcome by focusing on modeling functional interactions between antibody and antigen, rather than a high-resolution full atomistic model of the complex.
In addition to the issues relating to computational design, challenges in engineering functional antibodies lie in the ability to select functional antibodies from a large pool. While large-scale screens of antibody libraries may yield high-affinity binders to the antigen of interest, these antibodies may not necessarily have the desired function (e.g., agonism or antagonism). Large-scale library screening can select binders based on their affinity, but there is currently no feasible systematic method for screening a large library for function. One possible solution for obtaining functional antibodies from a large-scale screen is the use of focused libraries, designed to elicit antibodies that target a specific epitope for which a given functional effect is predicted or known. Such libraries could be expected to yield antibodies with the desired function, rather than simply the tightest binders to the most immunodominant site on the antigen.
Due to their importance in biotechnology, antibodies are the focus of intense attempts at design and engineering. Advances in modeling of antibodies and in predicting and modeling antibody-antigen complexes have opened the door for new methods of antibody design, focusing on improvement of affinity, specificity, and other biophysical characteristics. The next generation of methods for computational antibody design focuses on introducing new function, rather than on improving existing antibodies.
- Clark LA, Boriack-Sjodin PA, Eldredge J, Fitch C, Friedman B, Hanf KJ, Jarpe M, Liparoto SF, Li Y, Lugovskoy A, Miller S, Rushe M, Sherman W, Simon K, Van Vlijmen H (2006) Affinity enhancement of an in vivo matured therapeutic antibody using structure-based computational design. Protein Sci 15(5):949–960