Abstract
The development of therapeutic antibodies is an important part of new drug discovery pipelines. A particularly important step in this process is assessing an antibody's developability, that is, its suitability for large-scale production and therapeutic use. Because experimental assays for assessing antibody developability at scale are expensive and time-consuming, computational methods offer a more efficient alternative. However, the antibody research community faces a significant challenge: readily accessible data on antibody developability, which are essential for training and validating computational models, are scarce. To address this gap, DOTAD (Database Of Therapeutic Antibody Developability) was built as the first database dedicated exclusively to curating therapeutic antibody developability information. DOTAD aggregates all available therapeutic antibody sequence data together with various developability metrics from the scientific literature, and gives researchers a platform for data storage, retrieval, exploration, and download. Beyond serving as a repository, DOTAD integrates a web-based interface featuring state-of-the-art tools for assessing antibody developability, so users can analyze and interpret the data as well as access them. DOTAD is a valuable resource for the scientific community and facilitates the advancement of therapeutic antibody research. It is freely accessible at http://i.uestc.edu.cn/DOTAD/ as an open data platform that supports the continued growth of computational methods for antibody development.
Funding
This work was supported by grants from the National Natural Science Foundation of China (62071099, 62371112).
Ethics declarations
Conflict of Interest
The authors declare no conflict of interest, financial or otherwise.
About this article
Cite this article
Li, W., Lin, H., Huang, Z. et al. DOTAD: A Database of Therapeutic Antibody Developability. Interdiscip Sci Comput Life Sci (2024). https://doi.org/10.1007/s12539-024-00613-2
DOI: https://doi.org/10.1007/s12539-024-00613-2