Investigation of Preprocessing and Validation Methodologies for PAT: Case Study of the Granulation and Coating Steps for the Manufacturing of Ethenzamide Tablets

  • Research Article
  • AAPS PharmSciTech

Abstract

After the Food and Drug Administration (FDA) in the USA published guidelines on the enhanced use of process analytical technology (PAT) and continuous manufacturing, many studies regarding PAT and continuous manufacturing have been published. This paper describes a case study involving granulation and coating steps with ethenzamide to investigate interference for PAT model construction and model management. We investigated what factors should be considered and addressed when PAT is implemented for continuous manufacturing and how predictive models should be constructed. The product qualities monitored were moisture content and particle size in the granulation step, and tablet weight and moisture content in the coating step. We constructed models for the granulation step and validated their predictive capability against an external dataset. A partial least squares (PLS) model with manual wavelength selection had the best predictive accuracy for loss on drying against the external validation set. We found that the prediction of loss on drying was accurate, but the prediction of particle size was not sufficiently accurate. In the coating step, because of the small amount of data, we performed three-fold cross-validation and y-scrambling 10 times to select the optimal hyper-parameters and to check whether the models were fitted to chance correlations. We confirmed that the coating agent weights, tablet weights, and water content could be accurately predicted based on the mean of the R² score for cross-validation. Adding other variables alongside the absorbance slightly improved the predictive accuracy.

Acknowledgments

We acknowledge the support of the Core Research for Evolutionary Science and Technology (CREST) project “Development of a knowledge-generating platform driven by big data in drug discovery through production processes,” grant number JPMJCR1311 of the Japan Science and Technology Agency (JST). The author S. S. acknowledges financial support of the Japan Society for the Promotion of Science (JSPS) in Grant-in-Aid for JSPS Fellows (DC2 and PD) program. The authors wish to thank members of the Facility of the Future in ISPE Japan for their kind support. We thank Takashi Terada at Freund Corp. and Takuya Nagato and Yosuke Tomita at Powrex Corp. for their kind help, data acquisition, and fruitful discussions. Victoria Muir, PhD, from Edanz Group (https://en-author-services.edanzgroup.com/ac) edited a draft of this manuscript.

Author information

Corresponding author

Correspondence to Kimito Funatsu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1

The data-driven prediction methods used in this study are outlined below for readers' reference.

Genetic Algorithm

A genetic algorithm (GA) is an evolutionary optimization algorithm that mimics the way organisms on Earth have evolved. In GA, a candidate numerical solution of an optimization problem is encoded as an organism, or “chromosome.” Every chromosome, which takes the form of a vector, returns a fitness score, which is the value of the objective function to be maximized (or minimized) in the optimization. Chromosomes improve over “generations,” the iterations of the optimization, through mutation of solutions or cross-over of solutions among chromosomes that have higher (or lower) fitness scores; that is, the elements of a chromosome vector are either randomly altered or exchanged between chromosomes.
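The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the toolbox or settings used in the study: the bit-vector encoding, the population size, the mutation rate, the single-point cross-over, and the toy fitness function (which rewards selecting the first five of twelve hypothetical wavelengths) are all assumptions made for the example.

```python
import random

def run_ga(fitness, n_genes, pop_size=30, generations=40, mutation_rate=0.05, seed=0):
    """Maximize `fitness` over bit-vector chromosomes (illustrative settings only)."""
    rng = random.Random(seed)
    # Initial population: random bit vectors, one per chromosome.
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Chromosomes with higher fitness scores are kept as parents.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0][:]]  # carry the current best over unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Cross-over: exchange vector elements between two chromosomes.
            cut = rng.randint(1, n_genes - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: randomly flip individual elements.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best

# Hypothetical objective: agreement with a known "ideal" wavelength mask.
TARGET = [1] * 5 + [0] * 7

def toy_fitness(chromosome):
    return sum(1 for gene, wanted in zip(chromosome, TARGET) if gene == wanted)

best = run_ga(toy_fitness, n_genes=12)
print(best, toy_fitness(best))
```

In a wavelength selection setting, the fitness function would typically be a cross-validated prediction score of a model built on the selected wavelengths rather than agreement with a known mask.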

Partial Least Squares

Partial least squares (PLS) is a basic linear regression method. The concept of PLS is to extract latent variables that are best suited to predicting the objective variables. The input vector is weighted by a weight vector, which is the first eigenvector of XᵀYYᵀX for input matrix X and output matrix Y. The inner product of the input and weight vectors is a scalar latent variable that best represents the objective variables for prediction. Afterward, a linear multivariate regression model is fitted to the latent variables and objective variables by least squares.
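For a single objective variable y, the matrix XᵀYYᵀX has rank one, so its first eigenvector is proportional to Xᵀy. The sketch below uses that special case to build a one-component PLS model in plain Python. The function names and the toy data are hypothetical, and a practical model would usually extract several components with deflation between them.

```python
import math

def fit_pls1_one_component(X, y):
    """One-component PLS for a single response; returns (x_mean, y_mean, w, q)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]
    # Weight vector: X^T y normalized to unit length (first eigenvector of X^T y y^T X).
    xty = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in xty))
    w = [v / norm for v in xty]
    # Latent variable (score): inner product of each centered input with w.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Least-squares regression of y on the latent variable.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, w, q

def predict_pls1(model, x):
    x_mean, y_mean, w, q = model
    score = sum((xj - mj) * wj for xj, mj, wj in zip(x, x_mean, w))
    return y_mean + q * score

# Toy data: three collinear "absorbances" driven by one factor s, with y = 3s + 1.
X = [[s, 2.0 * s, 0.5 * s] for s in (0.0, 1.0, 2.0, 3.0, 4.0)]
y = [3.0 * s + 1.0 for s in (0.0, 1.0, 2.0, 3.0, 4.0)]
model = fit_pls1_one_component(X, y)
prediction = predict_pls1(model, [5.0, 10.0, 2.5])
print(prediction)
```

Because the toy inputs vary along a single direction, one latent variable is enough to reproduce the linear relationship exactly; real spectra generally need more components.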

Support Vector Regression

Support vector regression (SVR) is a sparse modeling method for regression that is based entirely on the support vector machine (SVM). The concept of SVM is to fit a classification model to data by sparse modeling so that measurement noise in the objective variable does not affect the model. In a similar manner, SVR fits a regression model to data while disregarding measurement noise.
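The text does not name the mechanism, but in the standard SVR formulation this insensitivity to small noise comes from the ε-insensitive loss, which charges nothing for residuals smaller than a tolerance ε. A minimal sketch follows, with an arbitrary ε of 0.1 chosen purely for illustration:

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the tolerance band, linear in the excess residual outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# A residual of 0.05 falls inside the band and is ignored.
loss_small = epsilon_insensitive_loss(1.00, 1.05)
# A residual of 0.50 exceeds the band by 0.40 and is penalized by that amount.
loss_large = epsilon_insensitive_loss(1.00, 1.50)
print(loss_small, loss_large)
```

Only samples whose residuals reach or exceed the band end up influencing the fitted model, which is the source of the sparsity mentioned above.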

Both SVM and SVR have attracted researchers' interest for decades because they are easily combined with the kernel trick. The kernel trick replaces the inner product with a kernel function defined in a reproducing kernel Hilbert space. SVR with a radial basis function (RBF) kernel is a model that expresses a non-linear relationship between input and output variables.
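The substitution can be illustrated with the usual form of a support vector prediction, a weighted sum of kernel evaluations against the support vectors plus a bias. In the sketch below, the support vectors, dual coefficients, bias, and the kernel width `gamma` are hypothetical values rather than a fitted model; the point is only that swapping `linear_kernel` for `rbf_kernel` changes the model from linear to non-linear without touching the prediction formula.

```python
import math

def linear_kernel(x, z):
    """Ordinary inner product."""
    return sum(a * b for a, b in zip(x, z))

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def kernel_predict(x, support_vectors, dual_coefs, bias, kernel):
    """f(x) = sum_i coef_i * k(sv_i, x) + bias."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias

# Hypothetical model parameters, chosen by hand for illustration.
support_vectors = [[0.0], [1.0]]
dual_coefs = [1.0, -1.0]
bias = 0.5

linear_prediction = kernel_predict([2.0], support_vectors, dual_coefs, bias, linear_kernel)
rbf_prediction = kernel_predict([0.0], support_vectors, dual_coefs, bias, rbf_kernel)
print(linear_prediction, rbf_prediction)
```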

Random Forest

Random forest (RF), which can be used for both regression and classification, is an ensemble of decision trees that form a non-continuous regression or classification surface. RF attempts to minimize the variance of predictions by aggregating sub-models through bagging.

A decision tree model splits the space formed by the input variables into sub-spaces, each of which has a corresponding predicted value or predicted label. Once the input variables are given, the sub-space containing the input returns its value or label. By counting how many times each variable is used to split the space into sub-spaces, a decision tree can return a feature importance. The frequency is averaged over samples, and the ratio based on the averaged frequency is then calculated as the feature importance. Feature importance reflects which variables affect the prediction results; therefore, when RF is used to classify training samples versus test samples, variables with large feature importance are those that differ most between the training and test samples.
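The train-versus-test use of feature importance can be sketched with a deliberately simplified stand-in for RF: bagged depth-one trees (stumps) whose importance is the fraction of bootstrap rounds in which each variable is chosen for the split. This is not a full random forest (there is no random feature subsampling and no deeper splitting), and the synthetic data, in which only the second variable is shifted in the test set, are an assumption made for the example.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2.0 * p1 * (1.0 - p1)

def best_split_feature(X, labels):
    """Index of the variable whose single threshold split gives the lowest impurity."""
    n = len(X)
    best_feature, best_score = None, float("inf")
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [l for row, l in zip(X, labels) if row[j] <= threshold]
            right = [l for row, l in zip(X, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_feature, best_score = j, score
    return best_feature

def split_count_importance(X, labels, n_trees=25, seed=0):
    """Fraction of bagged stumps that split on each variable."""
    rng = random.Random(seed)
    counts = [0] * len(X[0])
    for _ in range(n_trees):
        # Bagging: each stump sees a bootstrap resample of the rows.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        feature = best_split_feature([X[i] for i in idx], [labels[i] for i in idx])
        if feature is not None:
            counts[feature] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

# Synthetic data: variable 0 has the same distribution in both sets, variable 1 is shifted.
rng = random.Random(1)
train = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
test = [[rng.gauss(0.0, 1.0), rng.gauss(5.0, 1.0)] for _ in range(20)]
X = train + test
labels = [0] * len(train) + [1] * len(test)  # 0 = training sample, 1 = test sample

importance = split_count_importance(X, labels)
print(importance)
```

A variable that dominates the importance ranking in such a train-versus-test classifier is a candidate source of distribution shift between the two datasets.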

Principal Component Analysis

Principal component analysis (PCA) is a basic latent variable modeling method that extracts latent variables by maximizing the variance and covariance of the input variables. The mean-centered input variables are projected onto new bases that capture the largest covariances of the data, and the new measurements after projection are the latent variables, which account for the input matrix in fewer dimensions using standardized orthogonal bases.
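As a small illustration of the projection step, the sketch below finds the first principal component of mean-centered data by power iteration on the covariance matrix and then computes the corresponding latent variable (score) for each sample. Power iteration is used here only to keep the example dependency-free; the two-variable toy data are hypothetical.

```python
import math

def first_principal_component(X, n_iter=100):
    """Return (unit-length basis vector, scores) for the first principal component."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # mean centering
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(p)] for a in range(p)]
    # Power iteration: repeated multiplication by the covariance matrix converges to
    # its leading eigenvector, provided the start vector is not orthogonal to it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Latent variables: projections of the centered samples onto the new basis.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in Xc]
    return v, scores

# Toy data lying close to the line x2 = 2 * x1.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
pc, scores = first_principal_component(data)
print(pc, scores)
```

Further components would be obtained the same way after removing the variation already captured by the first one.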

Cite this article

Shibayama, S., Funatsu, K. Investigation of Preprocessing and Validation Methodologies for PAT: Case Study of the Granulation and Coating Steps for the Manufacturing of Ethenzamide Tablets. AAPS PharmSciTech 22, 41 (2021). https://doi.org/10.1208/s12249-020-01911-w

  • DOI: https://doi.org/10.1208/s12249-020-01911-w
