Automated tomographic assessment of structural defects of freeze-dried pharmaceuticals



I. INTRODUCTION - STRUCTURAL DEFECTS IN FREEZE-DRIED PHARMACEUTICALS
Freeze-drying (lyophilization) enables the heat-free removal of water from products through sublimation. This process substantially improves the stability of various pharmaceutical products [1][2][3][4]. For many biopharmaceutical products, particularly those based on therapeutic proteins, lyophilization is essential for ensuring stability [5][6][7][8]. The resulting products, sealed in glass vials, can be conveniently handled and later reconstituted to their original form by rehydration, for example, for injection.
The pore morphology of freeze-dried products is directly influenced by the formulation components and process parameters, playing a crucial role in drying performance and product quality attributes [9]. In a typical freeze-drying cycle, the primary drying step is optimized to maintain a minimal safety margin below the critical temperature of the formulation. This optimization aims to achieve maximum drying efficiency while preserving the desired pore morphology [10]. Product defects, such as the collapse of the drying matrix, may result from temperatures surpassing the critical formulation temperature [11,12]. This is of significance for product quality as it can lead to issues such as water entrapment, reduced specific surface areas, and higher residual moisture contents [10,13].
In line with industry guidelines, correct volume and cake appearance are cited as critical attributes during the visual inspection of freeze-dried products [14]. A recent commentary provides an overview of macroscopical product appearances and their potential impact on product quality attributes for freeze-dried products [15]. Various morphological defects, each with varying degrees of severity as defined in Table I, are listed below. Example images illustrating each defect and its degrees of severity are presented in Figure 1.
In this study, we investigate several morphological defects observed during the freeze-drying process of pharmaceuticals:
• Shrinkage/Collapse: The lyophilisate undergoes shrinkage, and in severe cases, it detaches from the vial wall. While some shrinkage is an anticipated [...]
• Crack Formation: Large stresses generated during the drying process cause the sample to break into multiple pieces.
• Foaming/Bubble Formation: Reduced pressure and applied heat may induce the formulation to boil, resulting in foaming during the process. Depending on the formulation's ability to stabilize the gas-liquid surface, this may lead to the presence of large gas bubbles in the lyophilisate, alongside dense filaments of dried product.
• Blowout: [...] the sample's volume. In severe cases, liquid sample is forced through the vial opening and lost.
• Loose Skin: Evaporation occurring at the fluid surface leads to the formation of a solid skin or crust. Depending on formulation and process parameters, the skin can either form a cap or subsequently detach from the dried product.
As structural defects significantly impact the lyophilisate quality, freeze-dried products undergo routine visual inspections [1,2,16-20]. Additional inspections involve sporadic checks of microstructure and pore structure using destructive analytical techniques such as SEM [21] or SSA [22]. However, these destructive techniques are impractical for high-throughput screening on an industrial scale. Human visual inspection is time-consuming, expensive, and tiring for workers, which makes it error-prone. Automated camera systems with downstream image analysis are already in use [23]. While these automated methods reduce the need for human interaction, they are limited to examining surface properties of the lyophilisate.
Quality-relevant defects, such as the formation of cracks, can also occur inside the product, hindering visual inspection. In this study, we propose quality control using X-ray computed tomography (CT) for a volumetric assessment of freeze-dried products. Although CT has been manually applied to investigate individual freeze-dried products [24][25][26], we introduce an apparatus that eliminates the need for human intervention through the use of robotics and artificial intelligence. While the system presented is a conceptual study not immediately suitable for industrial use, it serves as a starting point for potential future high-throughput processes.
CT has been employed to analyze defects in tablets [27][28][29][30][31] and powders used in dry powder inhalers [32]. An essential application of freeze-drying is to safeguard protein-based pharmaceutical products against damage during transport and handling. The question of whether such substances withstand exposure to X-rays was addressed in [33], where freeze-dried biopharmaceuticals were irradiated with approximately 100 Gy. No adverse effects on the chemical and physical stability of three model formulations from different substance classes were observed. Consequently, CT can be considered a valuable tool for non-destructive quality control of freeze-dried pharmaceuticals. Exemplary tomograms of lyophilisates are presented in the second column of Fig. 1.

II. SKETCH OF THE AUTOMATED X-RAY SETUP
The fully automatic tomographic control of the structural quality of freeze-dried pharmaceuticals encompasses the following tasks:
• automatic feeding of the samples contained in standard vials
• detection and identification of the vials through individual QR codes
The apparatus comprises components for positioning and handling the vials, along with the tomograph itself. A schematic representation is provided in Fig. 2: The samples (A) are placed on a conveyor belt (B) and transported to a pickup area. A camera (D) positioned above the pickup area recognizes the presence of a sample, determines the vial's position, and reads its QR code. A robotic gripper (E, F) with a control interface (H) and controller (I) picks up the vial and deposits it inside the CT device (G), which has been equipped with an actuator to operate the safety cover. The radiograms of the vial captured by the tomograph are reconstructed into a full tomogram by the computer (C). The motion of the robotic arm and the automatic classification of the sample are managed by a second PC (D). To achieve high-quality tomograms, 400 individual radiograms are recorded for each sample, each with a resolution slightly exceeding 1 million pixels and a bit depth of 16 bits. Figure 3 illustrates the connection of the process components.
Although high throughput was not the primary objective of this conceptual study, we conducted a benchmark of the data [...] In our setup, a single radiogram requires approximately 400 ms of exposure time. This duration can be significantly reduced by employing a faster image sensor. Similarly, the reconstruction time can be reduced by using more powerful CPU/GPU workstations. The same applies to the time required for copying and storing radiograms and performing volumetric reconstruction. Another option is to assess whether the number of radiograms recorded for each sample can be decreased.
It is worth noting that the automatic classification of the tomogram, as detailed in Section III, only takes a few microseconds and is thus negligible in terms of the overall process time.
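As a rough plausibility check of the figures quoted above, the following sketch estimates the pure exposure time and the raw data volume per sample. It uses only numbers stated in the text (400 radiograms, about 400 ms exposure each, roughly one million pixels at 16 bits). The exact pixel count of 1024 × 1024 is an assumption, and the time spent on handling, data transfer, and reconstruction is not included.

```python
# Rough per-sample budget derived from the figures quoted in the text.
N_RADIOGRAMS = 400      # radiograms per sample (from the text)
EXPOSURE_S = 0.4        # approximate exposure per radiogram in seconds (from the text)
PIXELS = 1024 * 1024    # ASSUMPTION: "slightly exceeding 1 million pixels"
BYTES_PER_PIXEL = 2     # 16 bits per pixel (from the text)


def exposure_time_s(n_radiograms=N_RADIOGRAMS, exposure_s=EXPOSURE_S):
    """Total pure exposure time per sample in seconds."""
    return n_radiograms * exposure_s


def raw_data_bytes(n_radiograms=N_RADIOGRAMS, pixels=PIXELS,
                   bytes_per_pixel=BYTES_PER_PIXEL):
    """Uncompressed size of all radiograms of one sample in bytes."""
    return n_radiograms * pixels * bytes_per_pixel


if __name__ == "__main__":
    print(f"exposure per sample: {exposure_time_s():.0f} s")
    print(f"raw data per sample: {raw_data_bytes() / 2**20:.0f} MiB")
```

Under these assumptions, halving the number of radiograms halves both quantities, which is why reducing the projection count is listed among the options for speeding up the process.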
FIG. 4: Flowchart depicting the timing of numerical processing.

A. Concept
We classify the samples based on their X-ray tomograms, recorded as described in Sec. II. To achieve this, we export 400 equidistant horizontal slices (2D images), from the bottom to the top of the vial, from each tomogram. Each 2D image is classified for possible defects. From the resulting individual assessments, we calculate an overall rating for the sample.
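The slice-then-aggregate idea can be sketched as follows. The first function picks equidistant layer indices along the vertical axis of a tomogram. The second combines per-slice defect probabilities into one rating per sample. The text does not specify how the overall rating is calculated, so the rule used here (flag the sample if a minimum fraction of slices exceeds a probability threshold) and the names `slice_indices`, `overall_rating`, `threshold`, and `min_fraction` are purely illustrative assumptions.

```python
def slice_indices(n_layers, n_slices=400):
    """Indices of n_slices equidistant horizontal layers, bottom (0) to top."""
    if n_layers < 1 or n_slices < 2:
        raise ValueError("need at least one layer and two slices")
    step = (n_layers - 1) / (n_slices - 1)
    return [round(i * step) for i in range(n_slices)]


def overall_rating(slice_probs, threshold=0.5, min_fraction=0.05):
    """ASSUMED aggregation rule, not taken from the paper.

    Returns True if at least min_fraction of the slices have a defect
    probability above threshold.
    """
    if not slice_probs:
        raise ValueError("no slice assessments given")
    flagged = sum(1 for p in slice_probs if p > threshold)
    return flagged / len(slice_probs) >= min_fraction
```

With a reconstructed volume stored as a sequence of horizontal layers, `[volume[i] for i in slice_indices(len(volume))]` would yield the 2D images that are passed to the classifier.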
For the classification of individual slices, we employ supervised machine learning. The objective is to identify a function capable of taking the image data of a 2D slice and providing the probability that the considered slice of the sample belongs to a specific class. In our case, examples of these classes could include major crack formation or severe shrinkage, as defined in Tab. I. Technically, our input data consist of the grayscale values of each pixel in the 2D images, forming a matrix of grayscale values.
In the realm of machine learning, the functions used to predict labels from input data are commonly referred to as hypotheses. In supervised learning, hypotheses are typically determined by employing functions with numerous parameters. These parameters are chosen to predict the correct labels as accurately as possible for a given set of input data. The quality of this optimization is measured by a cost function, quantifying the difference between the predicted labels and the ground truth labels. Various metrics, such as mean squared error or cross-entropy, can be utilized here.
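To make the notion of a cost function concrete, the two metrics named above can be written in a few lines. This is a minimal, generic sketch using the textbook definitions; the function names and the clipping constant `eps` are illustrative choices and are not taken from the paper.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average squared difference between ground-truth and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean categorical cross-entropy.

    y_true: list of one-hot rows (ground-truth class per example).
    y_pred: list of rows with predicted class probabilities.
    eps avoids log(0) for probabilities that are exactly zero.
    """
    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(y_true)
```

Both functions return zero for a perfect prediction and grow as the predicted labels move away from the ground truth. For an uninformative two-class prediction of (0.5, 0.5), the cross-entropy equals ln 2.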
In machine learning, the process of optimizing parameters is termed training, and the set of labeled input data used is known as the training set. We assume that the hypothesis derived from training can be generalized to data not included in the training set. To assess this generalization, another set of labeled input data, distinct from the training set, is required. This data is referred to as the test set. The following two subsections describe the training set and the assumptions applied for classification.

B. Training set
The quality of any supervised learning algorithm heavily relies on the training set. A sufficiently large number of training examples is essential for each category to be classified, and ideally, these examples should be evenly distributed across different categories. Creating a comprehensive training set is challenging, especially when deliberately inducing defects during freeze-drying is difficult. Initially, we prepared a total of 720 samples from four different formulations of arginine, sucrose, and albumin-based systems at varying concentrations, as indicated in Tab. II. Formulations and drying conditions were selected based on prior experiences with these model systems [21,34], ensuring a wide range of expected defects.
The samples were prepared by measuring 6.5 ml of the formulation into a standard 10R vial (dimensions OD × H: 24 mm × 45 mm). For freeze-drying, the vials were semi-sealed with a crimp neck stopper, and after the drying process, they were hermetically sealed. Subsequently, the finished samples underwent visual classification for defects outlined in Tab. I by two operators. Both the actual sample and the CT sectional images, captured using the apparatus described in Sec. II, were examined. Results from individual inspections were compared, and samples with differing outcomes were re-inspected by both operators. During the preparation phase, each sample vial was marked with an individual QR code on its top for identification purposes.
The entire collection of visual classification outcomes is listed in the supplementary material. Here, we provide a summary of the key findings. The concentration of the active ingredient appears to be the most important factor influencing both the frequency and variety of defects, while the chemical composition of the formulation seems to play a subordinate role. The highest defect rate and variety are observed for a concentration of 20%. The predominant defect is shrinkage/collapse, followed by loose skin and foaming/bubble formation. Crack formation and blowout occur at a much lower rate. The dataset is therefore imbalanced.

A type of hypothesis that has proven highly effective for image classification is the deep neural network, especially the convolutional neural network (CNN). The architecture and functionality of these machine-learning algorithms are beyond the scope of this presentation. At this point, it suffices to understand that convolutional networks are complex functions that map multiple scalar input values to a single scalar output value, with the exact functional relationship influenced by freely selectable parameters. In the case of image classification, the input values include the grayscale values of individual pixels in the image, and the output value is a real number indicating the probability that the analyzed image belongs to a certain class.
To classify images with many thousands of pixels, CNNs contain a large number of parameters that need to be determined through training. In this work, we utilize the Inception v3 CNN architecture [35][36][37][38], which comprises approximately 24 million parameters. Training a classifier with such a substantial number of parameters requires more training examples than our 720 samples provide. To address this challenge, we apply the concept of transfer learning [39][40][41]. Transfer learning leverages the fact that parameters learned for large parts of the network are relatively independent of specific image material, focusing on more general aspects of image processing, such as edge detection [42][43][44][45]. We start with the Inception v3 network pre-trained on the extensive ImageNet dataset [46][47][48][49], consisting of more than 14 million labeled images. Our dataset is then used to train the end area of the network, which learns class-specific features. This process, known as re-training, tailors the pre-trained Inception v3 framework to answer specific questions about the classification of freeze-dried products.
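The principle of re-training only the end of a network can be illustrated with a deliberately tiny toy model. In the sketch below, `frozen_features` is a fixed function standing in for the pre-trained layers and is never updated, and only the weights of a final logistic layer are fitted by gradient descent on the cross-entropy loss. This is not Inception v3 and not the training code used in the study: the two-value inputs, the synthetic labels, and all function names are hypothetical and serve only to show which parameters change during re-training.

```python
import math
import random


def frozen_features(x):
    """Toy stand-in for the frozen, pre-trained layers (never updated)."""
    return [x[0] + x[1], x[0] - x[1], 1.0]  # the constant 1.0 acts as bias input


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    """Probability of the positive class from the trainable final layer."""
    return sigmoid(sum(w * f for w, f in zip(weights, frozen_features(x))))


def train_head(samples, labels, steps=500, lr=0.5):
    """Gradient descent on the cross-entropy loss; only the head weights change."""
    feats = [frozen_features(x) for x in samples]
    weights = [0.0] * len(feats[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for f, y in zip(feats, labels):
            p = sigmoid(sum(w * fj for w, fj in zip(weights, f)))
            for j, fj in enumerate(f):
                grad[j] += (p - y) * fj / len(feats)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def make_toy_data(n=200, seed=0):
    """Synthetic two-value 'images'; label 1 if the sum exceeds 1 (hypothetical task)."""
    rng = random.Random(seed)
    samples = [(rng.random(), rng.random()) for _ in range(n)]
    labels = [1 if a + b > 1.0 else 0 for a, b in samples]
    return samples, labels
```

Because the feature map is fixed, only three numbers have to be learned here, which mirrors why a few hundred labeled samples can suffice when most of a large network is kept frozen. In practice, one would use a deep-learning framework and a real pre-trained model instead of this toy.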
Our overall goal is to inspect the CT slice images of a sample vial for the defect categories described in Tab. I. As shown in Fig. 5, we distribute this task among six independently trained networks: The first network is trained to detect whether a defect occurs without specifying it precisely. If a defect is anticipated from this pre-analysis, the slice images of the vial are evaluated by five subsequent neural networks, each trained to detect the markedness of one of the defects from Tab. I. This separation ensures a faster overall analysis and increases accuracy.
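The two-stage structure described above can be sketched as a small cascade. The callables `detect_defect` and `graders` stand in for the six trained networks. The integer severity grades, the triggering rule (one suspicious slice is enough), and the per-sample aggregation (worst grade over all slices) are assumptions made for illustration, since the text does not specify them.

```python
DEFECTS = ("shrinkage_collapse", "crack_formation", "foaming", "blowout", "loose_skin")


def classify_sample(slices, detect_defect, graders, threshold=0.5):
    """Two-stage cascade over the slices of one vial.

    detect_defect(slice) -> probability that any defect is present (stage 1).
    graders[name](slice) -> integer severity grade for one defect (stage 2).
    """
    # Stage 1: screening. ASSUMPTION: one suspicious slice triggers stage 2.
    if not any(detect_defect(s) >= threshold for s in slices):
        return {name: 0 for name in graders}
    # Stage 2: ASSUMPTION: the sample grade is the worst grade over all slices.
    return {name: max(grade(s) for s in slices) for name, grade in graders.items()}
```

The speed advantage mentioned in the text comes from the early return: for a defect-free vial, only the screening network is evaluated and the five grading networks are skipped.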
To monitor the training process, we examine the cost or loss function as a function of the training progress, i.e., the number of training steps. Here, we use the cross-entropy [50][51][52] to calculate the loss. As seen in Fig. 6, the difference between the ground truth labels and the predicted labels decreases with the training progress for both the training-set data and the test-set data. This indicates a successful learning process, with learned parameters generalizing well to data not used for training.

D. Performance of the learning algorithm
To assess the performance of automatic quality assessment, we partition the available data into three segments: 80% of the images serve as training data, 10% of the images are used for tuning hyperparameters such as the learning rate, and the remaining 10% are employed as a final test. Due to the absence of certain error categories in samples with low drug concentrations, we limit this evaluation to data from samples with a 20% active ingredient concentration, comprising 144 samples.
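A three-way partition of this kind can be sketched with a shuffled index split. The function name, the fixed seed, and the rounding convention are illustrative assumptions. The text only states the 80/10/10 proportions "of the images", so the sketch operates on generic item indices and could equally be applied to vial IDs instead of slice images.

```python
import random


def split_indices(n_items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle item indices and split them into train / validation / test lists."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    n_train = round(fractions[0] * n_items)
    n_val = round(fractions[1] * n_items)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]      # remainder, so nothing is dropped
    return train, val, test
```

For 144 samples with 400 slices each, i.e. 57,600 images, this yields 46,080 training, 5,760 validation, and 5,760 test images.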
The performance of the learning algorithm varies significantly for two groups of defect categories: For the collapse, cracks, and foaming categories, we achieve a robust absolute accuracy of over 93%, making it suitable for practical use. However, for the blowout and loose skin categories, the accuracy drops to approximately 50%, indicating that automatic assessment does not function effectively. This can be attributed to three issues. First, for this study, we exclusively consider horizontal slices of the tomogram for classification. The defects of loose skin and, particularly, the blowout effect predominantly affect the vertical direction. This error could potentially be mitigated by additionally considering vertical slices. However, implementing this seemingly straightforward correction is challenging. Choosing slices away from the symmetry axis of the cylindrical sample geometry would cover different fractions of the vial depending on their distance from the symmetry axis. Conversely, selecting slices through the symmetry axis would result in an over-representation of data closer to the axis of the vial. This issue will be addressed in future enhancements of the machine learning algorithm.
Second, in addition to the vertical nature of the defects (loose skin and blowout), which has not been considered thus far, these defects were under-represented in our dataset, further diminishing the accuracy of the learning algorithm. Finally, it must be acknowledged that these defects, especially their grading, are challenging to classify based on cross-sectional CT images, even for an experienced individual.
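The geometric difficulty with vertical slices follows from elementary geometry of a cylinder and can be made explicit with a short sketch. The first function gives the width of a vertical slice plane at a given distance from the symmetry axis. The other two compare how much of the cross-sectional area lies within a radius r with how much of an axial slice lies within that radius. The function names are hypothetical, and the radius of 12 mm in the usage note is taken from the outer vial diameter of 24 mm, ignoring the wall thickness, which is an assumption.

```python
import math


def chord_width(radius, offset):
    """Width of a vertical slice plane at distance `offset` from the cylinder axis."""
    if abs(offset) >= radius:
        return 0.0
    return 2.0 * math.sqrt(radius ** 2 - offset ** 2)


def area_fraction_within(r, radius):
    """Fraction of the circular cross-section lying within radius r of the axis."""
    return (r / radius) ** 2


def axial_slice_fraction_within(r, radius):
    """Fraction of a slice through the axis lying within radius r of the axis."""
    return r / radius
```

For a radius of 12 mm, a slice plane 6 mm off the axis is about 20.8 mm wide instead of 24 mm, so off-axis slices cover a smaller part of the vial. Conversely, the inner half of the radius contains 25% of the cross-sectional area but occupies 50% of every slice through the axis, which is the over-representation of near-axis data mentioned in the text.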

IV. CONCLUSION AND OUTLOOK
We present a proof of concept demonstrating the feasibility of fully automated assessment of the structural quality of freeze-dried pharmaceuticals using X-ray tomography. We have outlined an apparatus capable of feeding, scanning, and separating freeze-dried samples based on their assessments. Additionally, we have introduced a machine-learning algorithm for automatically classifying samples according to their X-ray tomograms. Once the learning algorithm is trained, the system can operate with minimal human interaction.
It is important to note that the described setup is not immediately suitable for industrial-scale applications, where even small product batches consist of several thousand vials. However, we have established that the fundamental concept of the entire process chain is viable and has the potential for future scaling to an industrial level. Potential avenues for achieving this include reducing the number of X-ray projections used for each tomogram, employing more sensitive detectors to decrease the exposure time for each radiogram, and exploring the possibilities of parallelizing the process or implementing continuous process tomographic methods.
In this study, the machine learning algorithm encountered challenges in reliably classifying certain defects listed in Table I, which are pertinent for regulatory approval. However, this limitation can be ascribed to the inadequacy of the training dataset and does not represent a fundamental flaw in our concept. With an automatic setup now available, much more extensive data records can be generated. In future iterations, we intend to incorporate vertical CT slices of the sample, a modification expected to notably improve the accuracy of classification for the defect categories blowout and loose skin.
The approval documents for freeze-dried products typically specify a specific visual appearance. In this study, we focused on analyzing the product structure. However, other attributes related to product quality, especially the reconstitution behavior of the lyophilisate, can be more crucial in practical applications than the product's appearance. Ultimately, the decisive factor is the quality of the reconstituted lyophilisate rather than its visual characteristics.
A promising avenue for future research is to develop a learning algorithm capable of predicting the reconstitution behavior of freeze-dried products based on their computed tomography images. In this approach, the slices of the tomograms in the corresponding training set should be labeled according to attributes such as reconstitution time or the quality of the reconstituted product, rather than structural properties like the presence of cracks in the lyophilisate.

Automated tomographic assessment of structural defects of freeze-dried pharmaceuticals: supplementary material
Patric Müller,1 Achim Sack,1 Jens Dümler,1 Michael Heckel,1 Tim Wenzel,2,3 Sonja Schuldt-Lieb,4 Henning Gieseler,3 and Thorsten Pöschel1

FIG. 1: Each line (subfigure) depicts one of the typical morphological defects in freeze-dried pharmaceuticals as listed in Table I. The first column of each line presents a characteristic image of the respective defect, the second column displays a tomographic vertical slice of the lyophilisate, and the third column illustrates the various degrees of severity associated with the corresponding defect.

FIG. 5: Flowchart of the image analysis procedure

1 Institut für Multiskalensimulation, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
2 Division of Pharmaceutics, Freeze Drying Focus Group, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
3 GILYOS GmbH, Würzburg, Germany
4 medac GmbH, Wedel, Germany
(Dated: April 18, 2024)

I. TRAINING SET
The table below shows the result of the visual inspection of the vials of the training data set by two independent experts. Each line corresponds to one of the 720 samples. The first column indicates the ID of the vial and the second column shows the formulation of the sample and its concentration. Columns 3-7 indicate the severity of the defects discussed in Sec. I of the manuscript. As explained in Sec. III B of the manuscript, these assessments serve as the labels for the training data set, i.e., the ground truth for the machine learning algorithm.

TABLE I: Criteria for characterization by optical inspection

TABLE II: Formulations of the training set