Background

Pharmaceutical excipients are substances other than the active pharmaceutical ingredients (APIs) added to the pharmaceutical dosage forms. They are added to achieve the desired consistent volume of a dosage form since it is not convenient to administer the raw API directly to patients in most cases. Pharmaceutical excipients include solvents, diluents, or bulking agents such as lactose, sweetening agents, fillers, wetting agents, flavors, sustained-release matrices, preservatives, absorption enhancers, coloring agents, emulsifiers, and so on. The principal role of pharmaceutical excipients is to provide a defined volume and uniformity of dose of an API in a dosage form be it in solid, semisolid, transdermal, parenteral, or liquid formulations, throughout the production process [1]. Moreover, in some cases, they provide convenience in terms of the administration of medicine to the patient, e.g., sweetening agents and flavors.

In the past, excipients were considered inert, especially those derived from natural origins. However, due to advancements in medicine and pharmaceutical technology in the search for newer drug delivery systems and dosage forms, the scope of pharmaceutical excipients has expanded tremendously to include those from synthetic and semi-synthetic origins. Consequently, excipients are presently not considered completely inert substances because they often interact with the loaded API and could generate unwanted impurities or impede the biopharmaceutical performance of the API loaded in a dosage form. To ensure safety and qualification for use in pharmaceuticals, manufacturers of pharmaceutical excipients must ensure the absence of intrinsic toxicity of excipients, and materials for excipients development are chosen from pharmaco-toxicologically inert substances [1, 2]. To achieve these goals, many regulatory agencies across the globe such as the US Food and Drug Administration (FDA), the European Medicines Agency (EMA), and the Japanese Ministry of Health, Labour and Welfare have set guidelines to guarantee the safety of excipients used by the pharmaceutical industries [1].

Pharmaceutical solid dosage forms, especially tablets, account for over 60% of the global medicines consumed [3]. Starch is prominent in the pharmaceutical dosage form application, particularly for tablet formulations due to its natural origin with limited toxicity potential and wide availability from numerous sources [4]. However, in its natural unmodified form, starch has some limitations for pharmaceutical application. The low quality of the native starches is attributed to their poor functionality in terms of particle characteristics that directly affect their flow rate, compactibility, and compressibility characteristics. Through modification (chemical, physical, or biotechnological), these properties could be improved to obtain pharmaceutical-grade starches [3, 5]. The process of such modification can be complex, consuming significant resources and time. Therefore, coupling some of these excipient development technologies with artificial intelligence (AI) based modeling could reduce the time, resources, and manpower needed for the development of pharmaceutical-grade excipients including starch.

AI-based models are currently gaining popularity in different areas of prediction and simulation in engineering, basic science, health science, and pharmaceutical sciences owing to their promising capacities, fast learning speed, accuracy, and precision [6]. The major motivation for employing these models in this work is to generate consistent prediction results across various models, although this is not always possible owing to the dynamic nature of experimental data. It is, therefore, necessary for scientists to develop efficient and robust models from the experimental data available. In the literature, traditional linear models have been used widely, although they generally exhibit lower accuracy and precision; this gives room for the implementation of AI-based techniques, which are regarded as more accurate and precise non-linear computational tools [7]. For instance, Simões et al. [8] reported the application of artificial neural networks (ANNs) within a quality-by-design approach. The drug particle size distribution was considered a critical parameter in the study, and the ANN technique was employed to simulate the in vitro dissolution of the drug manufacturing process. Barmpalexis et al. [9] demonstrated the application of various data-driven approaches, including multi-linear regression (MLR) and non-linear methods such as ANN, genetic programming (GP), and particle swarm optimization (PSO), in an experimental design to determine the effect of various diluents and particle size fractions of three commonly used compression diluents. The results showed the reliability of these data-driven models. Many other studies have reported the effectiveness of such techniques [10,11,12,13,14,15,16].

It can be seen from previous studies involving data-driven models that most of the works employed linear regression methods such as MLR and non-linear data-driven approaches such as ANN. Because traditional neural networks can suffer from modeling issues such as slow learning and overfitting, a hybrid model, the adaptive neuro-fuzzy inference system (ANFIS), is employed here to tackle these drawbacks; it combines the concepts of both ANN and fuzzy logic. However, since the development of computational models in the area of pharmaceutical sciences, there has been no published work in the literature indicating the implementation of non-linear data-driven algorithms (ANFIS and ANN) combined with the classical linear regression MLR for modeling the compaction performance of pharmaceutical excipients. The aim of the present study is therefore to apply AI modeling in predicting the compaction and compressibility performances of modified starches derived from Livingstone potato for application in tablet formulations.

Methods

Materials

Materials include stearic acid, glycerol, and talc (BDH Chemicals Ltd, Poole, England); microcrystalline cellulose (Avicel® PH101) (ATOZ Pharmaceuticals Ltd, Ambaltur, India); absolute ethanol and hydrochloric acid (E. Merck, Darmstadt, Germany); xylene (Loba Chemie Laboratory Ltd., Mumbai, India); and sodium hydroxide pellets (Avondale Laboratories Ltd., Banbury, England).

Starch preparation and modifications

Livingstone potato (Plectranthus esculentus) was obtained from the Vom area of Plateau State, Nigeria. The tubers were identified in the herbarium of the Department of Biological Sciences, Ahmadu Bello University, Zaria, Nigeria, (Voucher number 28448). Starch extraction was done by wet milling technique, and starch modifications were performed by three methods, namely pregelatinization, ethanol dehydrated pregelatinization, and acid hydrolysis [3]. The modified starches were labeled accordingly as pregelatinized starch (PS), ethanol dehydrated pregelatinized starch (ES), and acid hydrolyzed starch (AS), respectively. Microcrystalline cellulose (Avicel® PH101) was adopted as a standard for comparison which is used commonly in direct compression tablet formulations.

Tablet compaction studies

Powder samples of the modified starches were made into compacts by compressing 500 mg using a 10.5-mm die and flat-faced punches on an Apex hydraulic hand press (184 models, Apex Construction LTD, London). Varied pressure (28–170 MNm−2) was used with a 30 s dwell time. The tablet compacts were kept in a desiccator filled with silica for 1 day to enable elastic recovery and hardening and also to prevent low yield values. The tablet properties thickness, diameter, and weights (W) were then determined. The relative densities (D) of the tablets were then calculated according to Eq. 1 below [3]:

$$ D=\frac{W}{V\,{\rho}_s} $$
(1)

where V is the tablet volume (cm3) and ρs is the particle density (g/cm3) of the compact material. The Heckel plots [ln (1/(1 − D)) versus the compression pressure P (MNm−2)] and Kawakita plots of applied pressure (P) divided by the degree of volume reduction (C) [P/C versus P] were generated [3]. Also, the compressibility indices of the materials were obtained from the plot of compact density (g/cm3) against the log of compaction pressure. Bulk, tapped, and true densities were determined according to the method described by Khalid et al. [3].
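The calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the compact is assumed to be a flat-faced cylinder, and the Kawakita relation is assumed to take its commonly quoted linear form P/C = P/a + 1/(ab), which the text does not state explicitly.

```python
import math

def tablet_volume(diameter_cm, thickness_cm):
    """Volume (cm^3) of a compact, assuming a flat-faced cylinder."""
    return math.pi * (diameter_cm / 2.0) ** 2 * thickness_cm

def relative_density(weight_g, volume_cm3, particle_density):
    """Eq. 1: D = W / (V * rho_s)."""
    return weight_g / (volume_cm3 * particle_density)

def heckel_y(d):
    """Heckel ordinate ln[1 / (1 - D)] for a relative density 0 < D < 1."""
    return math.log(1.0 / (1.0 - d))

def linear_fit(xs, ys):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def heckel_fit(pressures, relative_densities):
    """Slope and intercept of the Heckel plot, ln[1/(1 - D)] versus P."""
    return linear_fit(pressures, [heckel_y(d) for d in relative_densities])

def kawakita_fit(pressures, volume_reductions):
    """Fit P/C versus P and return the constants (a, b) of the
    assumed form P/C = P/a + 1/(a*b)."""
    ratios = [p / c for p, c in zip(pressures, volume_reductions)]
    slope, intercept = linear_fit(pressures, ratios)
    a = 1.0 / slope
    b = 1.0 / (a * intercept)
    return a, b
```

Pressures would be supplied in MNm−2 and C as a fraction between 0 and 1; any numbers passed to these functions for a trial run are the reader's own, since the raw compaction data are not reproduced in the text.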

Proposed methodology for AI modeling

In this work, various data-driven approaches were proposed separately for modeling the performance of these novel excipients. The primary data of this study were collected from our experimental results. Two performance measures of these excipients were used as the output variables, i.e., tablet density and degree of volume reduction. The drug/excipient ratio (D/E ratio), friability (%), crushing strength (C/strength), compression pressure (C/pressure), and log of compression pressure (log C/pressure) were used as the input variables for the corresponding output parameters of the modified starches PS, ES, and AS, with Avicel® PH101 as the standard for comparison. This work therefore proposes the development of three different data-driven models: two non-linear models, namely ANN (the most widely used data-driven model) and ANFIS (a hybrid learning algorithm), and a traditional linear regression model (MLR, the most commonly used linear model). The main aim of employing different data intelligence algorithms is to understand the nature and behavior of the models towards different data sets, since this variability makes it difficult for modelers to select a single model in advance for a given data set. This difficulty can be addressed by evaluating several models, including linear data-driven algorithms, despite their weakness in handling complex non-linear data. The models were evaluated by applying various performance indices.

Artificial neural networks (ANN)

ANNs are relatively new computational tools that have broad uses in resolving many complicated real-world problems. The attraction of ANNs originates from their outstanding information processing traits, related mostly to nonlinearity, fault and noise tolerance, learning, and generalization abilities [17]. ANNs are also referred to as neural networks (NNs) or connectionist models. An ANN is an algorithmic numerical model that mimics the behavioral characteristics of biological neural networks and performs distributed data processing.

Adaptive neuro-fuzzy inference system (ANFIS)

ANN tools are among the most broadly used AI-based models. They are inspired by the human brain and are valued for their ability to capture highly complex relationships between the inputs and outputs of a data set [18].

ANFIS has been demonstrated to be a successful approach that incorporates the fuzzy Sugeno model and benefits from both fuzzy logic and ANN in one system. ANFIS has recently been used in predicting and modeling complex datasets [15]. ANFIS is also a real-world estimator because of its capacity to approximate real functions. In practice, several membership functions (MFs) are used, including trapezoidal, triangular, sigmoid, and Gaussian, although the Gaussian function is the most frequently used MF [19].

Assume the FIS contains two inputs “x” and “y” and one output “f”; a first-order Sugeno fuzzy model then has the following rules:

$$ \mathrm{Rule}\ 1:\ \mathrm{if}\ \mu (x)\ \mathrm{is}\ {A}_1\ \mathrm{and}\ \mu (y)\ \mathrm{is}\ {B}_1\ \mathrm{then}\ {f}_1={p}_1x+{q}_1y+{r}_1 $$
(2)
$$ \mathrm{Rule}\ 2:\ \mathrm{if}\ \mu (x)\ \mathrm{is}\ {A}_2\ \mathrm{and}\ \mu (y)\ \mathrm{is}\ {B}_2\ \mathrm{then}\ {f}_2={p}_2x+{q}_2y+{r}_2 $$
(3)

A1, B1, A2, and B2 are the membership function parameters for the inputs x and y.

p1, q1, r1, p2, q2, and r2 are the output function parameters. The structure and formulation of ANFIS follow a five-layer neural network arrangement.
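A forward pass through the two rules above can be sketched as follows. This is an illustrative sketch only: it assumes Gaussian membership functions, a product operator for the "and" in each rule, and the usual weighted-average defuzzification of first-order Sugeno systems; none of these choices, nor the function names and parameter layout, is specified in the text, and no training of the parameters is shown.

```python
import math

def gaussian_mf(value, center, sigma):
    """Gaussian membership grade of `value` (assumed MF shape)."""
    return math.exp(-((value - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_infer(x, y, premise, consequent):
    """Evaluate a first-order Sugeno system with one rule per entry.

    premise:    list of ((center_A, sigma_A), (center_B, sigma_B)) pairs,
                i.e. the parameters of A_i and B_i.
    consequent: list of (p_i, q_i, r_i) tuples for f_i = p_i*x + q_i*y + r_i.
    """
    weights = []
    outputs = []
    for (mf_a, mf_b), (p, q, r) in zip(premise, consequent):
        # Firing strength of the rule: product of the two membership grades.
        weights.append(gaussian_mf(x, *mf_a) * gaussian_mf(y, *mf_b))
        # Linear rule output f_i.
        outputs.append(p * x + q * y + r)
    total = sum(weights)
    # Weighted average of the rule outputs (normalized firing strengths).
    return sum(w * f for w, f in zip(weights, outputs)) / total
```

With two rules, `premise` and `consequent` each hold two entries corresponding to Eqs. 2 and 3. In ANFIS these parameters are learned from data rather than set by hand, which is the part this sketch leaves out.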

Multi-linear regression (MLR)

Regression is generally categorized into two major domains of simple and multiple linear regression; each one can be applied according to the purpose of the simulation. For example, if we aim to estimate a linear regression, which exists between a single input and single output, such a model is known as a simple linear regression (SLR). Furthermore, if we want to simulate the linear relation between a single output and multiple input parameters, it is called a multiple linear regression (MLR) [20]. Usually, MLR is the linear regression type that is generally used, and it involves analysis such that each parameter of the inputs is correlated with the output parameter [21]. Generally, MLR consists of estimating the rate of the relationship that exists between each parameter, i.e., between the output and two or more input parameters [22]. The entire expression of MLR is shown in Eq. (4).

$$ Y={b}_0+{b}_1{x}_1+{b}_2{x}_2+\dots +{b}_i{x}_i $$
(4)

where xi is the value of the ith predictor, b0 is the regression constant, and bi stands for the coefficient of the ith predictor.
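To make Eq. 4 concrete, the coefficients can be estimated by ordinary least squares. The sketch below solves the normal equations with a small Gauss-Jordan elimination so that it needs no external library; the function names are hypothetical, and the code assumes the predictors are not collinear (no singularity check is performed).

```python
def fit_mlr(rows, targets):
    """Least-squares coefficients [b0, b1, ..., bn] for Y = b0 + sum(b_i * x_i).

    rows:    list of predictor tuples (x_1, ..., x_n), one per observation.
    targets: list of observed Y values.
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    # Normal equations (X^T X) b = X^T y as an augmented k x (k + 1) matrix.
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]

def predict_mlr(coeffs, row):
    """Evaluate Eq. 4 for one observation."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))
```

In the setting of this study, each row would hold the five input variables (D/E ratio, friability, crushing strength, compression pressure, and its logarithm) and the target would be tablet density or degree of volume reduction.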

Evaluation criteria and validation method for data-driven models

Usually, for any data-driven algorithm, the performance of the models is evaluated using various performance indices by comparing the simulated and experimental values. In this work, the coefficient of determination (R2) and the correlation coefficient (R) were used as goodness-of-fit measures, and two statistical errors, the root-mean-square error (RMSE) and the mean-square error (MSE), were used for the evaluation of the models:

$$ {R}^2=1-\frac{\sum_{i=1}^N{\left({Y}_{\mathrm{obs},i}-{Y}_{\mathrm{com},i}\right)}^2}{\sum_{i=1}^N{\left({Y}_{\mathrm{obs},i}-{\overline{Y}}_{\mathrm{obs}}\right)}^2} $$
(5)
$$ R=\frac{\sum_{i=1}^N\left({Y}_{\mathrm{obs},i}-{\overline{Y}}_{\mathrm{obs}}\right)\left({Y}_{\mathrm{com},i}-{\overline{Y}}_{\mathrm{com}}\right)}{\sqrt{\sum_{i=1}^N{\left({Y}_{\mathrm{obs},i}-{\overline{Y}}_{\mathrm{obs}}\right)}^2\sum_{i=1}^N{\left({Y}_{\mathrm{com},i}-{\overline{Y}}_{\mathrm{com}}\right)}^2}} $$
(6)
$$ \mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^N{\left({Y}_{\mathrm{obs},i}-{Y}_{\mathrm{com},i}\right)}^2}{N}} $$
(7)
$$ \mathrm{MSE}=\frac{1}{N}\sum_{i=1}^N{\left({Y}_{\mathrm{obs},i}-{Y}_{\mathrm{com},i}\right)}^2 $$
(8)

where N, Yobsi, \( \overline{Y} \), and Ycomi are the data number, observed data, the average value of the observed data, and computed values, respectively.
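The four indices in Eqs. 5 to 8 translate directly into code. The following sketch uses hypothetical function names and plain Python lists of observed and computed values; it is meant only to show how the formulas are evaluated, not how the authors computed them in their software.

```python
import math

def mse(observed, computed):
    """Eq. 8: mean-square error."""
    n = len(observed)
    return sum((o - c) ** 2 for o, c in zip(observed, computed)) / n

def rmse(observed, computed):
    """Eq. 7: root-mean-square error."""
    return math.sqrt(mse(observed, computed))

def r_squared(observed, computed):
    """Eq. 5: coefficient of determination."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - c) ** 2 for o, c in zip(observed, computed))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def correlation(observed, computed):
    """Eq. 6: Pearson correlation coefficient between observed and computed."""
    mean_obs = sum(observed) / len(observed)
    mean_com = sum(computed) / len(computed)
    cov = sum((o - mean_obs) * (c - mean_com) for o, c in zip(observed, computed))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_com = sum((c - mean_com) ** 2 for c in computed)
    return cov / math.sqrt(var_obs * var_com)
```

A model that reproduces the observations exactly gives MSE = RMSE = 0 and R2 = R = 1, which is the limiting case the indices are compared against.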

Different validation methods can be applied, such as k-fold cross-validation, holdout, and leave-one-out. In this work, k-fold cross-validation is used, which helps to reduce the problem of overfitting [22]. In this technique, the initial data set is partitioned into k subsets of equal size [23].
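The partitioning step can be sketched as below. The value of k, the shuffling, and the seed are assumptions for illustration, since the text does not report them; each fold in turn serves as the test set while the remaining k − 1 folds form the training set.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    The sample indices are shuffled once (seed is an illustrative choice)
    and dealt into k folds whose sizes differ by at most one.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

A model would be fitted on each training split and scored on the matching test split, and the k scores would then be averaged.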

Results

Figure 1 represents the Kawakita plot showing that a linear relationship was attained at all compression pressures with a correlation coefficient value of 0.999 for all the modified starches and Avicel® PH101. From the Kawakita profiles, the slopes and intercepts have shown that the packed initial relative densities of the modified starches with the application of small pressures and/or tapping declined in the sequence: PS>ES>AS>Avicel® PH101.

Fig. 1 Kawakita profiles for modified starches (ES, PS, AS) and Avicel® PH101 of compact excipients

The order of the results confirmed what was observed in the Heckel profiles (Fig. 2) where Avicel® PH101 exhibited a low value for loose initial relative density in comparison to the modified starches. A similar finding was reported for microcrystalline starch obtained from Manihot esculenta [24].

Fig. 2 Heckel profiles for modified starches (ES, PS, AS) and Avicel® PH101 of compact excipients as such

Compressibility implies the ability of a pharmaceutical excipient to undergo a substantial volume reduction when it is subjected to compression pressure. Previous works have established a linear relationship between the compaction pressure and the density of tablets [3]. Therefore, the rate of increase of tablet density with an increase in compaction pressure is expressed as compressibility of a powder material used in pharmaceutical tablet formulations [3]. The resultant slope from the profile of tablet density plotted against the logarithm of the compression pressure is considered to express the compressibility index of a pharmaceutical powder material; the greater the slope, the better the compressibility behavior and the better its ability to reduce in volume when pressure is applied [3, 22].
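The compressibility index described above is simply the least-squares slope of tablet density against the logarithm of compression pressure. A minimal sketch follows; the function name is hypothetical, and a base-10 logarithm is assumed because the text does not state the base.

```python
import math

def compressibility_index(pressures, densities):
    """Slope of compact density (g/cm^3) versus log10 of compression pressure.

    A larger slope indicates a greater gain in density per unit increase
    in log pressure, i.e. better compressibility.
    """
    logs = [math.log10(p) for p in pressures]
    n = len(logs)
    mean_x = sum(logs) / n
    mean_y = sum(densities) / n
    sxx = sum((x - mean_x) ** 2 for x in logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(logs, densities))
    return sxy / sxx
```

Computing this index for each material and sorting the results would give a ranking of the kind reported from Fig. 3.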

According to the values generated from Fig. 3, the compressibility indices ranking is as follows: Avicel® PH101>AS>PS>ES. Despite small variations, the results for AS and PS showed closer characteristics to Avicel® PH101 indicating an increase in the tablets’ compact densities when compression pressure is increased, and the ability to undergo plastic deformation which is an indication of good compressibility. In contrast, ES exhibited the lowest compressibility properties.

Fig. 3 Profiles for compressibility indices of modified starches (ES, PS, AS) and Avicel® PH101 of compact excipients as such

Data-driven algorithms results

AI-based models (ANFIS and ANN) with a linear model (MLR) were employed to predict the performance of three different novel pharmaceutical excipients (PS, ES, AS), obtained from a natural origin (Livingstone potato) and Avicel® PH101 using drug/excipient (D/E) ratio, friability (%), crushing strength, compaction pressure and log compaction pressure, and degree of volume reduction as the corresponding input variables. The performance accuracy of the models was evaluated using four different performance indices: determination coefficient (R2), root mean square error (RMSE), mean square error (MSE), and correlation coefficient (R).

In the development of these models, the simulation was done in MATLAB 9.3 (R2019a). For the ANN model, the Levenberg–Marquardt algorithm was used with 1000 iterations, a momentum coefficient of 0.9, a learning rate of 0.01, and an MSE goal of 0.0001. The best architecture of the model was selected by trial and error.

Additionally, in modeling ANFIS, different kinds of membership functions and numbers of epochs were tried by trial and error in order to determine the desired structure. Tables 1 and 2 show the performance of the data-driven models in modeling the performance of these excipients in terms of tablet density and degree of volume reduction.

Table 1 Tablet compact density modeling
Table 2 Degree of volume reduction modeling

Figure 4 shows the response time series plot for Avicel® PH101. According to the plot, the spread between the experimental and predicted values is consistent with the results in Table 1.

Fig. 4 Time series for the prediction of Avicel® PH101

Detailed and comprehensive results of the degree of volume reduction for simulating the performance of these novel excipients are shown in Table 2. According to the results obtained from both the training and the testing stages, the ANFIS model demonstrated better fitness compared with the other two data-driven algorithms. It is not surprising that the ANFIS data-driven algorithm as a hybrid model as well as an emerging non-linear system for simulation has shown a strong and promising ability in elucidating complex data. Based on the models’ performance efficiency, the hierarchical order is as follows ANFIS>ANN>MLR.

Discussion

Table 1 demonstrates the comparative prediction of the tablet density of four different excipients using three different models. It has clearly shown that the non-linear hybrid model ANFIS outperformed the traditional ANN and the classical linear regression MLR. The table further demonstrates that all the three models indicated strong results for simulating the tablet density of the four excipients in terms of the performance indices R2, R, RMSE, and MSE. This can be attributed to the cross-validation process conducted prior to the modeling, which is very important in model evaluation.

The predictive result in terms of the determination coefficient (R2) indicated that ANFIS outperformed the other two models (ANN and MLR) and enhanced their performance efficiency up to 5% and 0.05% for Avicel® PH101, 0.37% and 2% for ES, 0.1% and 0.2% for PS, and finally, 0.3% and 0.034% for AS, respectively. On the other hand, the simulated values have been demonstrated graphically using a scatter plot to show the goodness-of-fit between the measured values and the predicted values (Fig. 5). It is clear from the scatter plots that ANFIS demonstrates the best fitting agreement between the measured and simulated values. These findings are in line with the literature [23, 25,26,27,28].

Fig. 5 Scatter plots of tablet compact densities for ANN, ANFIS, and MLR. a ES. b PS. c AS

The quantitative results based on the performance indices R, R2, RMSE, and MSE indicate that the ANFIS model achieved higher performance accuracy and outperformed the other two models (ANN and MLR) in both the training and testing phases. The performance of the models for these excipients, e.g., PS, can be further compared, as shown in Table 2. Analysis of the results shows that the AI-based data-driven algorithms (ANFIS and ANN) emerged as satisfactory and reliable models. The predictive accuracy of these models has likewise been reported in the technical literature [15]. Furthermore, the correlation coefficient (R) values in Table 2 demonstrate the performance of these two data-driven algorithms compared with the linear regression MLR. MLR generally underperforms to a certain extent, especially when it encounters highly non-linear and complex data, because it follows the least-squares method, which models the relationship between the inputs and the output parameters in linear form. Moreover, MLR simulation may generate many negative values, which can affect the performance efficiency of the model. Figure 6 shows the performance of the novel excipient PS in a surface radar chart, which depicts the R values in both the training and testing stages of the models.

Fig. 6 Radar chart of modified starch PS in both training and testing

The radar chart shows that the results in terms of the correlation coefficient (R) follow the order ANFIS (0.9995, 0.9980), ANN (0.9992, 0.9966), and MLR (0.9976, 0.9958) in the training and testing stages, respectively. Generally, the scale of the radar chart ranges between 0 and 1, where the best performing model approaches one. Based on the radar chart, the models can be ranked in the order ANFIS>ANN>MLR. Figure 7 presents a comparative analysis of these data-driven approaches based on their root-mean-square error (RMSE). ANFIS shows higher performance accuracy compared with the other two models, as it records the lowest error values in both the training and testing stages.

Fig. 7 Comparison of the root-mean-square error (RMSE) of the data-driven algorithms in the simulation of AS

Conclusion

Modifications of the native starch made by acid hydrolysis (AS) and pregelatinization followed by ethanol dehydration (ES) gave rise to good tablet excipients that can be used for direct compression tablet formulations based on their compaction and compressibility characteristics. This work applied two artificial intelligence-based models (ANFIS and ANN) and a linear model (MLR) for modeling the performance of three novel pharmaceutical excipients and one standard (Avicel® PH101) based on their compact tablet densities as well as their degrees of volume reduction. The results indicated the reliability of the AI-based models over the linear model, and the comparative results indicate that ANFIS outperformed the other two models in modeling the performance of all four excipients with considerable performance accuracy. This method can be further exploited in the development of other pharmaceutical excipients, not only for solid dosage forms, but also for other excipients used in semi-solid, transdermal, liquid, or injectable formulations. The predictive results further suggest that other data-driven models, as well as optimization algorithms such as principal component analysis (PCA), Hammerstein-Wiener (HW), genetic algorithms (GA), and fuzzy logic (FL), could be employed in order to improve the performance accuracy of the models.