Biologics continue to play an increasingly dominant role in the modern pharmaceutical market. The primary objective of formulation development is to efficiently deliver a high-quality drug product that ensures the desired bioavailability and stability for the benefit of patients. For protein and other biomolecule therapeutics, however, structural flexibility offers advantages in terms of enhanced binding to the target while also presenting challenges in terms of biophysical and biochemical stability. For instance, protein drugs often exhibit a greater propensity for oligomerization, aggregation, and fibrillization. Consequently, a crucial aspect of biological drug product development involves identifying critical quality attributes and preserving the structural integrity and stability of the drug substances within the formulation. The formulation of biological drugs involves examining various structural details, such as protein-protein and protein-excipient interactions, as they reflect fundamental mechanisms that contribute to stabilizing or destabilizing behaviors. With the increasing complexity of biological drug products, both in terms of modality and delivery routes (including biologic coformulation, high concentration, crystalline suspension, or subcutaneous delivery), the use of advanced characterization tools becomes necessary. These analytical methods help establish foundational knowledge of product attributes, with the aim of accurately predicting the stability, processability, and, ultimately, the bioperformance of the drug.

This special issue highlights the significance of advanced biophysical and biochemical techniques in characterizing a diverse range of biopharmaceuticals, as illustrated in Fig. 1. These techniques allow researchers to explore and understand critical attributes of biomolecules, facilitating efficient formulation design across a broad range of applications. The scope of this issue encompasses peptides [1, 2], proteins [2,3,4], oligonucleotides [5], monoclonal antibodies (mAbs) [3, 6,7,8,9], vaccines [10], injectable drug products [11, 12], process development [4, 13], and stability issues [14, 15]. The issue highlights the application of modern analytical methods, including light scattering [2, 6], chromatography [4, 7, 8], infrared (IR) spectroscopy [15], hyperspectral Raman spectroscopy [13], mass spectrometry (MS) [7,8,9], nuclear magnetic resonance (NMR) [2,3,4,5, 10], and machine learning [11, 13]. These techniques can characterize a comprehensive array of critical pharmaceutical attributes of biomolecules, such as chemical and higher-order structural integrity, across a wide range of molecular sizes and dynamics. By leveraging these advanced analytical tools, researchers can gain valuable insights into the characteristics and behavior of biopharmaceuticals. This knowledge is pivotal in understanding and optimizing the formulation design process, as such data can correlate critical product parameters with the pharmaceutical quality attributes of biomolecules.

Fig. 1. Illustration showcasing the analytical-driven design of biological drug products. The dynamic evolution and intricate nature of these products necessitate advanced biophysical and biochemical characterizations, augmented by artificial intelligence (AI) methodologies.

Herein, a variety of analytical techniques are discussed for the identification, quantification, and characterization of different aspects of biopharmaceuticals in liquid formulations. These methods have been applied in the analysis of various biomolecules and formulations. One such approach is parallel reaction monitoring (PRM) LC–MS/MS, which enables accurate identification and quantification of mAb glycans [7]. Another technique, peptide mapping-based high-resolution (HR) MS, has the potential to detect and quantify sequence variants in mAbs [9]. Additionally, MS has been utilized to follow asparagine deamidation through succinimide formation in the complementarity-determining region (CDR) of mAbs [8]. For glycation chemistry analysis, 2D NMR has proven to be an effective tool [3]. NMR has also been employed to profile the emerging modality of nucleic acids [5]. Improved infrared (IR) spectroscopy has been used to profile secondary structure, contributing to higher-order structure (HOS) characterization [15]. Dynamic light scattering (DLS) and sedimentation velocity (SV) have been employed to characterize reversible mAb oligomerization [6]. Chemical modifications, such as conjugation with fatty acids or polyethylene glycol (PEG), have been shown to increase the serum half-life of proteins or other injectable drug products. Herein, DLS and diffusion-ordered spectroscopy (DOSY) NMR have been used to accurately measure the oligomerization state of chemically modified peptides and proteins [2]. Water NMR spin relaxation-based methods have been demonstrated for measuring the stability of aluminum adjuvant or antigen-adjuvant complexes, playing a role in vaccine quality assurance and control [10]. Furthermore, the development of peptide reference standards to support synthetic peptide characterization has been described [1]. Lastly, a review discusses light-induced stability issues in drug products [14].
These analytical techniques contribute significantly to the understanding of biopharmaceutical properties, ranging from glycan analysis to structural characterization and stability assessment. Their application facilitates the development, optimization, and safety evaluation of biopharmaceutical products.
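
As a rough illustration of how diffusion-based measurements such as DLS and DOSY NMR relate to oligomerization state, the sketch below converts a translational diffusion coefficient into a hydrodynamic radius using the Stokes-Einstein relation, R_h = k_B T / (6 π η D). It is not the procedure used in any of the cited papers. The function names, the default temperature and viscosity (water at about 25 °C), and the compact-sphere scaling used to estimate an apparent oligomer number are all assumptions made for illustration only.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def hydrodynamic_radius(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=0.00089):
    """Stokes-Einstein hydrodynamic radius in metres.

    Defaults assume a dilute aqueous sample near 25 degrees C; real
    formulations (especially high-concentration ones) may deviate.
    """
    if diffusion_m2_s <= 0:
        raise ValueError("diffusion coefficient must be positive")
    return BOLTZMANN_J_PER_K * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_s
    )


def apparent_oligomer_number(d_monomer_m2_s, d_observed_m2_s):
    """Crude apparent oligomer number from two diffusion coefficients.

    Assumption: monomer and oligomer are compact spheres of equal density,
    so R_h scales as N**(1/3) and D scales as 1/R_h, giving
    N ~ (D_monomer / D_observed)**3. Extended or PEGylated species
    will not follow this scaling.
    """
    if d_monomer_m2_s <= 0 or d_observed_m2_s <= 0:
        raise ValueError("diffusion coefficients must be positive")
    return (d_monomer_m2_s / d_observed_m2_s) ** 3


if __name__ == "__main__":
    # Hypothetical diffusion coefficients, not taken from the cited studies.
    d_monomer = 1.0e-10   # m^2/s
    d_observed = 0.5e-10  # m^2/s
    print(f"R_h (monomer):  {hydrodynamic_radius(d_monomer) * 1e9:.2f} nm")
    print(f"R_h (observed): {hydrodynamic_radius(d_observed) * 1e9:.2f} nm")
    print(f"Apparent N:     {apparent_oligomer_number(d_monomer, d_observed):.1f}")
```

In practice, the apparent size of a chemically modified peptide or protein also depends on shape, hydration, and concentration-dependent interactions, so a calculation of this kind is a first-pass estimate rather than a substitute for orthogonal methods.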

In addition to liquid peptide and protein formulations, dry powder forms of protein biologics can be produced through lyophilization (freeze-drying) or spray drying processes. The characterization of these dry protein biologics involves several solid-state analytical techniques. Commonly utilized techniques include solid-state Fourier transform infrared spectroscopy (ssFTIR), powder X-ray diffraction (PXRD), solid-state hydrogen/deuterium exchange with mass spectrometry (ssHDX-MS), and solid-state nuclear magnetic resonance spectroscopy (ssNMR) [4], which provide insights into the structural and chemical properties of the dried protein formulations. Moreover, differential scanning calorimetry (DSC) and various scattering techniques, such as wide-angle X-ray diffractometry (WAXD), small-angle neutron scattering (SANS), and small-angle X-ray scattering (SAXS), have been extensively reviewed for the design and characterization of lyophilized protein formulations [12]. These methods are valuable for understanding physical properties, including thermal behavior and molecular arrangement. Together, solid-state analytical techniques play a crucial role in assessing the stability, integrity, and physical characteristics of dry protein biologics, enabling informed formulation design and optimization efforts.

Machine learning (ML) has gained significant traction in the realm of biopharmaceutical research and development. In particular, a simplified convolutional neural network (CNN) algorithm has been developed to analyze micro-flow imaging (MFI) results [11]. This ML-based approach enabled the identification and characterization of aggregates by examining particle morphological differences, thereby providing valuable insights into the chemical properties of the aggregates. This technique has shown promise in enhancing our understanding of aggregate formation and its impact on biopharmaceutical products. Furthermore, Raman hyperspectral imaging coupled with machine learning has been employed for the analysis of immobilized enzymes used in biocatalysis [13]. This combination of techniques allows for the rapid and accurate assessment of enzymatic activity, as well as the monitoring of enzyme distribution and spatial organization. By leveraging machine learning algorithms, researchers can extract meaningful information from the hyperspectral data and gain deeper insights into enzyme behavior and performance. These examples highlight the potential of machine learning in advancing biopharmaceutical research and development. By integrating ML algorithms with sophisticated imaging and analytical techniques, scientists can extract valuable knowledge from complex data sets and make science-based decisions during the development and optimization of biopharmaceutical processes.

To summarize, this special issue encompasses methods and tools essential for characterizing the intricate physical and chemical properties of therapeutic biomolecules. Additionally, it focuses on measuring the subtle molecular interactions within multi-component systems, which significantly influence critical formulation characteristics. While the collection of papers may not cover all aspects of biomolecule formulation, the objective is to inspire colleagues and encourage researchers to delve into advanced analytics for complex and biological drug formulations. The goal is to foster further exploration and understanding in this field, driving advancements in quality control and the development of biopharmaceuticals.