A series acceleration algorithm for the gamma-Pareto (type I) convolution and related functions of interest for pharmacokinetics

Abstract The gamma-Pareto type I convolution (GPC type I) distribution, which has a power function tail, was recently shown to describe the disposition kinetics of metformin in dogs precisely and better than sums of exponentials. However, this had very long run times and lost precision for its functional values at long times following intravenous injection. An accelerated algorithm and its computer code is now presented comprising two separate routines for short and long times and which, when applied to the dog data, completes in approximately 3 min per case. The new algorithm is a more practical research tool. Potential pharmacokinetic applications are discussed. Graphic abstract Supplementary Information The online version contains supplementary material available at 10.1007/s10928-021-09779-4.


Introduction
Gamma-Pareto convolutions (GPC), the convolution of a gamma distribution with some type of Pareto distribution, are increasingly used for modelling diverse random processes like traffic patterns [1], flood rates, fatigue life of aluminium, confused flour beetle populations [2], and extreme rainfall events [3]. Although there are multiple possible GPC models and different nomenclatures used to describe them, a natural classification would arise from Pareto distribution classification, types I through IV, and the Lomax distribution, a type II subtype, which is the classification scheme of reference [4] and the Mathematica computer language [5]. 1 Convolution was first introduced to pharmacokinetics in 1933 by Gehlen who used the convolution of two exponential distributions, to describe plasma concentration-time data, and as originally developed in 1910 by Bateman to model radioactive decay [6][7][8]. Much later, in 2006, the Bateman equation was generalised as an exact gamma-gamma convolution (GDC) by Di Salvo [9]. Ten years later, this was then applied to 90 min continuous recordings of radioactivity in human thyroid glands following injection of 99m Tc-MIBI [10]. 2 In 1919, Widmark identified integration of a monoexponential as a model for constant infusion [11]. That integration from zero to t to find a constant infusion model can be applied not just to exponential functions, but applies equally well to any area under the curve (AUC) scaled density-function (pdf) 3 model, was shown subsequently by Wesolowski et al. [12,13]. Recently, the disposition of metformin was described precisely using the type I GPC model, which as it is asymptotically a Pareto function, has a power function tail [13]. Using direct comparison rather than classification, power function tails were shown to be always heavier than exponential tails, see the Appendix Subsection entitled Relative tail heaviness of reference [13]. A power function tail, in turn, implies an underlying fractal structure, where fractal in this context signifies a scale invariant model of vascular arborisation [14]. The GPC computer algorithm used in 2019 had long run times and was not accurate beyond 96 h [13]. These problems were corrected in order to make predictions for multiple dosing over longer times [15,16]. Since computer implementation of new functions is highly specialised, not easily arrived at by induction and yet indispensable for any practical application, documentation of a more practical type I GPC algorithm may facilitate its more widespread implementation. Accordingly, we now present a series acceleration computer implementation of a more generally applicable GPC type I function, with markedly shorter run times.

Background
The gamma-Pareto convolution distribution function family A classification for gamma-Pareto convolutions (GPC) is proposed that arises from the types of Pareto distributions [4]. These are types I through IV plus the Lomax distribution, a subtype of II. The Pareto type I distribution is where a is the shape parameter, b is a scale parameter and hðÁÞ is the unit step function such that hðt À bÞ is the unit step function time-delayed by b, and is used to make a product that is non-zero only when t [ b. 4 A type II Pareto distribution can be written as Setting l ¼ 0, this becomes the Lomax distribution; PD Lomax ðt; a; bÞ ¼ a b À 1 þ t b Á ÀaÀ1 hðtÞ; which was used to derive a Lomax gamma-Pareto distribution [1]. The relevance of this is that the GPC type I and Lomax GPC derivations are similar. As yet, the type II (not Lomax) through type IV gamma-Pareto convolutions have not been published. These convolutions are likely to be infinite sums and may require series acceleration to be of practical use. By substitution and reduction of the number of parameters, there are closed form GPC-like expressions, types II through IV, that are different distributions [2]. As a full set of solutions for the entire GPC function family has not been characterised, it is not known what additional applications there could be for the GPC family of functions. Unlike the Lomax GPC, the GPC type I does not start at t ¼ 0, but at t ¼ b. For pharmacokinetic modelling, b [ 0 is a measure of the circulation time between injection of an intravenous bolus of drug (t ¼ 0), and its arrival at a 1 Wolfram Research, Inc., (2021) Mathematica, Version 12.3, Champaign, IL https://reference.wolfram.com/language/ref/ParetoDis tribution.html. 2 740 MBq technetium-99m labeled hexakis-methoxy-isobutylisonitrile. 3 We retain the acronym pdf without a probability; p, but use f(t) preferentially. Concentration models are the product of area-underthe-curve of concentration and density functions whose total areaunder-the-curve is 1 (dimensionless dose fraction). This balances the classical mechanical units of Mass, Length, and Time, as follows, The unit step function, hðxÞ, is zero for x\0 and 1 for x ! 0, such that hðxÞ is continuous everywhere except at x ¼ 0. When x ¼ t À b and b [ 0, then hðt À bÞ is a unit step function shifted to later time (i.e., to the right) by b units in the new coordinate system; t. The unit step function is faster for numerical computations than the Heaviside theta function, which later is sometimes also symbolised as hðxÞ. The Heaviside theta is more mathematically useful when it is continuous everywhere such that its derivative and Laplace transform are defined.
peripheral venous sampling site (t ¼ b). The four-parameter gamma Pareto (type I) convolution (GPC) density function was developed to model the disposition of metformin in dogs, which exhibited an unexpectedly heavy tail poorly described by an exponential decay [13]. This heavy tail implies a prolonged buildup of the body burden of the drug [15,17] that may require dose tapering on long-term use, especially in patients with renal impairment [16]. where a is a dimensionless shape parameter, b is a rate per unit time, is the reciprocal of its scale parameter, and CðÁÞ is the gamma function. 5 This yields the GPC function, GPCðt; a; b; a; bÞ where B z ðÁ; ÁÞ is the incomplete beta function. 6 This is a density function (a pdf, or more simply an f, with units per time; t À1 ). Equation (5) is from convolution following Maclaurin series expansion of e Àb t , i.e., it is analytic. An analytic function has any number of sequential multiple integrals and derivatives, as illustrated in the following equations. Compared to their prior expression [13], the equations that follow have been put in simpler terms. GPC type I integral: Equation (6) is the cumulative density function (CDF) of the GPC, symbolised by F, the integral of the f(t) density; FðtÞ ¼ R t 0 f ðsÞ ds, GPC F ðt; a; b; a; bÞ This equation, because it is a CDF, expresses the dimensionless fraction of a unit drug dose eliminated from the body as a function of time, and was used to calculate a prolonged retention of metformin in dogs and to explain its incomplete urinary recovery at 72 h following intravenous injection in humans [13,15,17]. GPC type I double integral: Equation (7) is the double integral of the density function, f, which is also the single integral of F, the CDF, and is sometimes called a ''supercumulative'' distribution [18]. It is symbolised by F , i.e., This equation (units t) was used to construct an intravenous bolus multidose loading regimen that maintains the same mean amount of metformin in the body during successive dose intervals [13] and to predict metformin buildup during constant multidosing in humans both with normal renal function and with renal insufficiency [16]. A further use of this equation is to predict the cumulative distribution function following a period of constant infusion given only its bolus intravenous-concentration, fit function. GPC type I derivative: Equation (8) is the derivative of the GPC density, GPC 0 , or in general an f 0 , GPC 0 ðt; a; b; a; bÞ This equation (units t À2 ) is useful for finding the peaks of the GPC function by searching for when it equals zero, and for calculating disposition half-life from its general definition, which is Eq. (6) of reference [13]. Note that there is a pattern in the sequential integrals and derivatives that illustrates the analyticity of the GPC function. The integrals and derivatives above follow directly from integration or differentiation of the GPC formula, for which the following identity from integration by parts 7 is useful for simplifying the results. 5 The gamma function, or generalised factorial, is Methods, algorithms for GPC type I series acceleration and their computation

Data sources and regression methods
The source data for regression analysis and subsequent algorithm testing consists of seven intravenous bolus metformin studies performed in healthy mixed-breed dogs [19]. The 19 to 22 samples per case drawn between 20 min to 72 h postinjection are listed as open data in Supplementary material 1 (SLSX 49kb) in [13]. 8 The regression target was the so-called 1=C 2 weighted ordinary least squares (OLS) method, implemented as minimisation of the proportional norm, which is also the relative root mean square (rrms) error, as per the Concentration data and fitting it Appendix Subsection of [13]. 9 The loss function chosen to be minimised agreed with the error type the measurement system assay calibration curve. Both the metformin assay (5.2% rrms) [20], and the GPC residuals (8.6% rrms) exhibited proportional error. The reuse of assay loss functions for regression loss functions is systemically consistent and appears in these references [10,13]. The regression method used was Nelder-Mead Constrained Global Numerical Minimisation as implemented in Mathematica, a global search technique [5]. 10  Post processing is disallowed because it launches a constrained convex gradient solution refinement protocol; the interior point method, which does not converge. The use of parameter starting value ranges close to the solution helps speed up convergence. Note that regression can start with 65 significant figure accuracy but finish with less than half of that for some parameter values due to error propagation from the fit function itself and/or the regression process. In order to calculate the confidence intervals (CI) of the parameters, model-based bootstrap [21] was performed, as follows.
Care was taken to verify the normality of fit residuals and the homoscedasticity of residuals-see [13]-as suggested by [22]. Those conditions allow for the residuals to be randomly sampled with replacement, then added to the model at the sample-times to create synthetic data having the same properties as the original data, but which have altered regression parameter solutions. The bootstrap parameter values so obtained can provide more information than gradient method parameter CV's, as the latter only provides single case-wise estimates, which are not as statistically useful as case-wise distributed parameter information [13]. Table 1 shows both case-wise and populationwise coefficients of variation from an early version of a GPC algorithm. The table was amalgamated from Tables 1, 3, and 12 of [13] representing 24 h of 8-core parallel processing of 42 time-sample serum curves. There is thus an obvious need for a faster algorithm for regression analysis.
GPC type I primary definition: the shortt algorithm The primary definition of a gamma-Pareto type I convolution, Eq. (5), is GPCðt; a; b; a; bÞ ¼ GDðx; a; bÞ Ã PDðx; a; bÞ ðtÞ ¼ ðb xÞ a e Àb x x CðaÞ hðxÞ This contains alternating terms in the summation such that the sum is rapidly convergent for t not much greater than its lower limit, b. However, for sufficiently large values of t, the individual terms of the summation both alternate in sign and become extremely large in magnitude (i.e., absolute value) before absolute series convergence. For absolute convergence of an alternating series the infinite sum of the absolute values is bounded above, which permits rewrite of the summation sequence of infinite sums. This, and the ratio test [23] for it are shown in the Short-t GPC convergence Appendix Subsection. Thus, the order of infinite summation can be changed to obtain shorter run times when t ) b, and the algorithm is accelerated through an algebraic rewrite of Eqs. (9) as (10) below. Alternating infinite series with large magnitude terms occurring before absolute convergence are common, for example, the infinite-series, primary definition of sinðxÞ ¼ 9! À Á Á Á has that same property for larger magnitude xvalues. Acceleration for the sine function could include converting the x-values to be principal sine values (À p 2 to p 2 ), and adjusting the output accordingly. 11 For the GPC(t) function a similar result, i.e., adjusting the algorithmic behaviour to be accelerated for long-t values, can be obtained as follows.
GPC type I secondary definition: the longt algorithm

Theorem 1 The long-t algorithm is
GPCðt; a; b; a; bÞ Proof This is shown by substitution of the identities, 12 B z ðA; BÞ ¼ BðA; BÞ À B 1Àz ðB; AÞ, and BðA; BÞ ¼ CðAÞCðBÞ CðAþBÞ into the incomplete beta function of Eq. (9) above and yields, Substituting this into the right hand side of Eq. (9) yields, ðÀb tÞ n n! CðÀaÞCða þ nÞ Cða þ n À aÞ À ðÀb tÞ n n! Bb t ðÀa; a þ nÞ the left hand summand of which simplifies to a GPC asymptote for long times, t, where 1F1 ðÁ; Á; zÞ is the regularised confluent hypergeometric function. 13 The above formula, as 14 Àp cscðp aÞ ¼ CðÀaÞCða þ 1Þ ¼ aCðÀaÞCðaÞ, can be written alternatively as where a 6 ¼ 0; 1; 2; 3; . . ., which is Eq. (25) of the first type I GPC publication [13]. Note that not only is the summation of the above absolutely convergent, but as the last line above is an asymptote for t ! 1 of the GPC function [13], the summation converges to zero as t ! 1 relatively more rapidly than the asymptote. The summation terms, ðÀb tÞ n n! Bb t ðÀa; a þ nÞ ; are rearranged for acceleration at long times using the infinite series definition of the incomplete beta function, 15 by substituting it into the summation and simplifying to yield, Given absolute convergence (Short-t GPC convergence Appendix Subsection) the order of infinite summation can be changed with impunity by distributing the outer sum over the inner sum, and factoring, as follows, ðÀb tÞ n ð1 À a À nÞ k n! : Fortunately, the inner sum in the above formula simplifies to a closed form, allowing it to be rewritten as The k ¼ 0 term of that sum simplifies to be the gamma distribution function part of the GPC convolution. Splitting off that term and adjusting the lower summation index Next, the quickly convergent sum term, Eq. (19), is added to the gamma distribution plus asymptotic formula Eq. (13) to create a series accelerated algorithm rewrite of Eq. (5) for long t-values, GPCðt; a; b; a; bÞ This is identically Eq. (10), which completes the proof of the long-t theorem. h The term beginning with þ h ðt À bÞ of the above equation is an asymptote of the GPC function. The above equation's first line when written as a list of terms to be summed has all negative elements when k [ a, which was the case for metformin [13]. If k\a for the first few k, then the simplified summation terms are initially positive until k [ a, but in any case the magnitude of those terms is strictly monotonically decreasing such that increasing precision to sum those terms is unnecessary. The confluent hypergeometric functions in those terms and their effects on convergence are presented in detail in the Long-t GPC convergence rapidity Appendix Subsection, which shows that the absolute value of the ratio of the ðk þ 1Þth to kth terms is approximately b k t , where the k in the denominator insures that the absolute values of the simplified terms of the summand for the above formula are monotonically decreasing, and that each ðk þ 1Þ st term is many times closer to zero than the k th term, such that it is unnecessary to test for convergence using the sum to infinity of all the remainder terms, i.e., in practice it is sufficient to test the absolute value of the last term and to stop the summation when that magnitude is less than the desired precision (e.g., \10 À65 ).
Other long-t functions; the integrals and derivative GPC type I long-t integral: The derivation of a similarly accelerated series for t ! 4 b of the CDF of GPC, i.e., its 0 to t integral, GPC F , follows from its primary definition, Eq. (6), using the same procedure as Eqs. (9) to (10), leading to, GPC F ðt; a; b; a; bÞ where Qða; b tÞ ¼ Cða; b tÞ CðaÞ is the regularised upper incomplete gamma function, and is the complementary cumulative density function (CCDF ¼ 1ÀCDF) of the gamma distribution. 16 Note that GPC F is a CDF, such that the upper limit of Eq. (20) as t increases is 1 or 100% of initial dose eliminated from the body.
GPC type I long-t double integral: Similarly, the super cumulative distribution, i.e., the integral from s ¼ 0 to t of the CDF is, GPC F ðt; a; b; a; bÞ Note that the sum term is now indexed from k ¼ 2, for which each simplified summation element has a negative value when k [ a, and a multiplied out positive first term when a\2.
GPC type I long-t derivative: The GPC derivative's algorithm for t [ 4b, i.e., long-t, is GPC 0 ðt; a; b; a; bÞ " ða þ 1Þ 1F1 ða; a À a; Àb tÞ À a 1F1 ða þ 1; a À a; Àb tÞ # þ a CðaÞ The combined short-and long-t algorithm for GPC series acceleration There are now two algorithms, an algorithm that converges quickly only for short t-values, and another that converges quickly only when t-values are long. This section describes how the algorithms are combined to produce a new accelerated algorithm for any value of t. A full set of functions for the derivative and integrals of the GPC algorithm follows the same pattern as the Mathematica source code of the GPC type I accelerated algorithm Appendix Subsection. The two algorithms are combined by choosing t ¼ 4 b as the floor (least) value for use of the long-t algorithm, makes the next term at worst approximately 1/4 that of the current term. Given a next term fraction of b k t times the current term, the t ¼ 4 b floor value is not critical, the trick is to avoid second to first term ratios that initially approach 1 as t ! b, for which the shortt algorithm has fewer terms and converges faster. See the Choosing when to use the short-t and long-t algorithms Appendix Subsection for further information (Fig. 1).
The program uses so-called infinite magnitude numbers such that numbers like AE10 AE100000 can be used without overflow or underflow (code: $MaxExtraPrecision = 1). However, there is another concern; precision. Machine precision was 53 bits, or approximately 16 significant figures. When 10 À100 and 1 are added, one has to have a precision of 100 significant figures to avoid truncation. For the short-t algorithm the extended precision needed is precalculated using machine precision of large numbers, which are stored as simplified terms, and are searched to find the largest magnitude number (code: Ordering[storage,À1½½1 À 1). It is then that number as a rounded base 10 exponent (code: Round[Log10[Abs [outmax]]]) plus 65 significant figures that is used as the required precision of the computation. The terms of the summand are then recalculated to that high precision, then summed, such that the result has approximately 65 significant figures remaining even though the calculation itself may have needed a thousand or more significant figures to yield that result. The same approach could be used to calculate sinðxÞ from its infinite series definition. As mentioned above, in practice that is not used, and instead the equivalent principal sine values of x are computed. For the GPC(t) computation, one can invert the range related extra precision problem by reordering the series to make it increasingly less demanding to calculate long-t values by direct application of Eq. (10) and that is precisely what the long-t GPC type I algorithms does. The value t ¼ 4b is used to transition between shorter t-values for use by the short-t algorithm, and longer t-values for use with the long-t algorithm. As mentioned, that time of transition between long and short algorithms is not critical and is more formally presented in the Choosing when to use the short-t and long-t algorithms Appendix Subsection.

Results
This Results section shows examples for GPC algorithm run times and diagnostics, of how it can and should be used including the use for extended same dose multidosing, and a subsection illustrating confidence interval (CI) and coefficient of variation (CV) diagnostic quality assurance.

Algorithm run time analysis
The GPC combined short-and long-t algorithm was defined in terms of how to calculate it efficiently, as above. Implementation of the combined short and long time algorithm using Mathematica 12.3 without parallel processing on a 2.3 GHz 8-Core Intel i9 processor allows long t-value execution times of around 1.2 millisecond with typical 63 to 67 decimal place accuracy. (The full range of run times is approximately from 42 to 1.2 milliseconds for t-values ranging 30 s to 1/2 year.) This contrasts with the short-t implementation of GPC Eq. (9), which, as t increases, needs more terms and higher precision to maintain a given precision of the final result, with a processing time that progressively becomes intractably long. Figure 2 shows the relative performance of the algorithms in these respects using the GPC parameters from fitting metformin data for dog 1 [13]. This dog showed the median regression error of 8.7% of the seven studied. Despite having the fastest elimination at 72 h, the concentration level for that dog was predicted to be 2 Â 10 À7 of peak at one year, a small number but much larger than could be produced assuming a terminal exponential tail. For the short-t algorithm the run-time to calculate concentration at one-half year following injection was 1809 s, versus 1.2 milliseconds for the new algorithm. This difference is because the short-t algorithm used at long times had 8883 terms to sum, and the call to gpcshort was used twice; once at machine precision to find the maximum absolute value term ð1:0796 Ã 10 1392 Þ of all of the summand terms in order to calculate that 1457 place precision was required to obtain 65 place precision in the output, and once again to do the 1457 place summation arithmetic. For the combined (new) algorithm this is not needed as for short times the short-t algorithm does not have large oscillating terms, and the long-t algorithm has monotonically decreasing term magnitude both for each sequentially summed term, and as t increases, for each first term magnitude. For example, the first (and only) term of the long-t algorithm's summand at one-half year was negligible ðÀ1:851 Ã 10 À1403 Þ. These effects are illustrated in Fig. 3.
For our test case example, the two algorithms, short-t and long-t, agreed to within 63 to 67 decimal places. In practice, the short-t algorithm is used for short times and the long-t algorithm is used for long times. It makes little difference what cut-point time between short-and long-t algorithms is used, and the time 4b, albeit around 100-120 s, was chosen as a division point between algorithms short enough to ensure that extra precision padding for the shortt algorithm would be unnecessary.

Regression processing elapsed times and extended multidosing
For evaluating the 72 h data for seven dogs, the new, combined short-and long-t algorithm run time for curve fitting was approximately 1:15 to 3:00 (min:s) average values, the program prior version with hardware and software accelerations for the short-t algorithm and without sufficiently extended precision (despite using at least 65 place arithmetic) had run times in the approximate range of 34 to 35 min, but with occasional errors in parameter values of AE2 Â 10 À14 . With proper precision extension the error dropped below 10 À20 for all 5 parameters and 7 cases, but the run time increased to 50 min, using a partly accelerated short-t algorithm (Eq. (14)) and 8-core hardware acceleration. The current combined short-and longt algorithm does not use those additional accelerations. Forty model-based bootstrap cases generated for the first dog's data-see next Subsection-took 49:45 (min:s), or 1:15 per case. That is a lot faster than the 33:51 per case it took to generate 35 bootstrap models using the old software (19:44:55). Overall, the run time is very approximately 27 times faster than prior, but is variable depending on the problem being solved, the computer hardware used, and the other software running on the computer at that time. For example, Fig. 4a, with a current run time of 7.1 s, could not be computed at all using the earlier software version.
Notice that if we wish to glean information during metformin multidosing with plasma or serum sampling, the best time to do so is just prior to the next scheduled dosing as those concentrations change for each dose interval, whereas the peak concentration change over time is very small. However, because the tissue dosage 17 accumulates, the amount of drug in the body (Fig. 4b) cannot be predicted from serum (or plasma) concentration alone. Note that approximately one entire dose has accumulated in tissue by 14 days despite most of the cumulative dosage having been eliminated over that time. That is, during the first dose interval, the mean drug mass remaining in the body was 0.175 doses, and during the 14th dose interval the mean drug mass remaining in the body was 1.118 doses, where 12.88 dose masses were eliminated.

Which are better, confidence intervals or coefficients of variation?
With reference to Table 2, confidence intervals (CI) of the mean were extracted from model-based bootstrap with 40 cases generated for the first dog's data. For calculating CI's of the mean, the Student's-t method was used (Verified assumptions: Central Limit Theorem, n [ 30, light-tailed distributions). However, as a result of extensive testing the degrees of freedom were set at n rather than the more typical n À 1, as it was found that for smaller values of n, physically impossible results were obtained, whereas even for n ¼ 2, when n was used, rather than n À 1, the results were accurate. For n ¼ 40 it made very little difference whether n À 1 or n were used. Also shown are CI's of the model based bootstrap (A.K.A., parametric bootstrap) results calculated directly from the n ¼ 40 data using the nonparametric quantile (A.K.A, percentile) method of Weibull [24]. 18 Note that the Pareto rate parameter, b was not presented. Since many (38 of 40) of the results were at the constraint boundaries of 25 to 30 s, one already knows what the confidence interval largely is; the constraint values themselves. Another situation entirely exists for coefficients of variation (CV). Note in the table that when n ¼ 5 as during the prior study, that the values so obtained were too small. It is theoretically possible to use bootstrap (in our case that would be bootstrap of model-based bootstrap) to obtain confidence interval quantiles for the median CV, and although median values of CV's have shown sufficient robustness to construct confidence intervals for n sufficiently large [25], the correction for n-small is problematic as per Table 2 and the Discussion Section that follows.

Discussion
Wise [26] first proposed that power functions or gamma distributions should be used in pharmacokinetic modelling as superior alternatives to sums of exponential terms. This has been reinforced more recently, for example by Macheras [27]. While convolution models and fractal Fig. 1 Standard flow chart for the Mathematica source code of the GPC type I accelerated algorithm Appendix Subsection. The predefined shortand long-t routines are also described in the text 17 For a single dose, body drug mass is MðtÞ ¼ Dose ½1 À GPC F ðtÞ. 18 This uses the Weibull method for extracting confidence intervals, which in Microsoft Excel (2007) would format for the lower tail as PERCENTILE.EXC(A1:A40,0.025) and from Mathematica 12.3 [5] as Quantile[data, 0.025, ff0; 1g; f0; 1gg], https://mathworld.wolfram. com/Quantile.html. consistent models have been shown to be superior models in some cases and find occasional use [10,12,13,28,29] compartmental modelling software is widely available and is used by default. For example, compared to biexponential clearance evaluation of 412 human studies using a glomerular filtration rate agent, adaptively regularised gamma distribution (Tk-GV method [30,31]) testing was able to reduce sampling from 8 to 4 h and from nine to four samples for a more precise and more accurate, yet more easily tolerated and simplified clearance test [29]. Despite this, few institutions have implemented the Tk-GV method at present. In the case of metformin, a highly polar ionised base, the extensive, obligatory active transport of the drug into tissue produces a rate of expansion of the apparent volume of distribution having the same units as renal clearance, yielding the Pareto (power function) tail. This heavy tail, and Fig. 4, help to explain why metformin control of plasma glucose levels had delayed onset, e.g., following 4-weeks of oral dosing [32], and provides hints concerning the lack of a direct correlation between drug effect and blood metformin concentrations [33]. Other basic drugs whose active transport dominates their disposition may show similar behaviour. The long tail in the disposition of amiodarone may be a reflection of its very high lipid solubility rather than, or in association with, active tissue uptake. Weiss [34] described its kinetics after a 10 min intravenous infusion with an s-space Laplace transform convolution of a monoexponential cumulative distribution with an inverse Gaussian distribution and a Pareto type I density, which lacked a real or t-space inverse transform such that the modelling information had to be extracted numerically. A real space f(t) model convolution  [13]. Panel a shows that the long or short-t algorithm is more than one million times faster than the short-t algorithm when the time after injection is very long, e.g., for predicting a serum concentration at one half year (4396 h), and more than 20 times faster than the long-t algorithm for t ¼ 30 s. Panel b shows the number of summed terms (n) for each method. Note that for long t-values, the combined algorithm used the long-t method and only calculates one term, whereas the short-t algorithm would have used many terms. Panel c shows the precision needed to accommodate the largest of the terms for each algorithm. As t increases, the short-t function uses an increasing number of significant figures, but for the long-t or combined algorithms that number increases only slightly to preserve accuracy for lower concentrations at longer times. Panel d shows the largest term magnitude as powers of 10 for the short-and long-t algorithms. For long times, the short-t algorithm alternating sign intermediate terms reach quite large magnitude, while for the long-t algorithm the largest magnitude term collapses to vanishingly small values. (Color  figure online) of time-limited infusion effects of a GPC type I distribution is simple to construct and would be the same as Weiss's model in the one essential aspect that matters; testing of the amiodarone power function tail hypothesis, for which a GPC derived model would have the advantage of being more transparently inspectable. Similarly, Claret et al. [35] used finite time difference power functions to investigate cyclosporin kinetics for which GPC and related model testing may be appropriate.
We were able to use Nelder-Mead global search regression model-based bootstrap to provide more information and better information for parameter variability than would be available from a gradient matrix. Some readers would prefer to use the Levenberg-Marquardt algorithm convex gradient regression method, so that the gradients can be used to estimate case-wise coefficients of variation. The logarithm of sums of exponential terms is always convex. The GPC-metformin loss function is nonconvex, as shown by failure of the interior point method to improve on solutions as reported in the Data sources and regression Methods Subsection. Constrained nonconvex gradient methods are comparatively rarely implemented; there appears to be no such implementation in Mathematica at present.
Correction of standard deviation (SD) for small numbers (n\30) using bootstrap of model-based bootstrap and v 2 were used as mentioned elsewhere [36], and led to using n rather than n À 1 for Student's-t degrees of freedom. Whereas variance is unbiased, when the square-root of variance is taken, the result, standard deviation becomes Fig. 3 Individual terms of the sums for the short-t and long-t algorithms for the GPC model of metformin disposition in dog 1 of reference [13]. These terms are stored in a list variable called, somewhat unimaginatively, storage. Note that f ðt ¼ 4b hÞ, is an explicit replacement rule meaning that an f(t) is Note that there are fewer terms than for the short-t algorithm at that same time (panel b), that all terms are negative (red), and that they decrease by one or more orders of magnitude between successive terms. Panel d shows that by 100 h for the long-t algorithm, there are fewer terms than at 12 h, and that their magnitude is very small even for the largest magnitude, first, term. (Color figure online) biased. Arising from v 2 , a standard deviation from only two samples, is on average only 79.8%, ffiffi 2 p q , of the population standard deviation [24]. 19 Gradient methods lack pre hoc testing of the implicit assumption of residual normality and do not post hoc provide any parameter distribution information. From the trace of the gradient matrix, one obtains a standard deviation with degrees of freedom that are n À p À 1 (n-samples, p parameters) [36]. For standard deviations in the case where n À p À 1 is small, the corrections for standard deviation are large. Overall, the ratio between gradient based error propagation results and that from bootstrap is not unusually a factor of two larger or smaller [37]. Moreover, average fit errors using any loss function [ 10%, for assay methods with errors\10% may suggest that the algorithm/data combination is suspect [10,13,22,38], and for the metformin dog data that is the case for two-and three-compartment models, but not for   19 Given only two samples, the population mean is not located midway between them, however, the midpoint (mean) is used to estimate the population mean in the standard deviation formula. The correction formula multiplier for an unbiased estimator (r) of population standard deviation (r) from sample standard deviation (s) isr ¼ c n s, where c n ¼ ffiffiffiffiffiffi the GPC model, which latter model was the only one to fit the data better than 10% (average 8.6% rrms with assay error of 5.2% rrms), as well as being the only model to exhibit normality and homoscedasticity of residuals [13]. When the fit error is [10%, one should, at a minimum, test residuals for homoscedasticity and normality, and if these are not present, a better fit model should be sought for its own sake, and bootstraping becomes problematic [22]. The use of coefficients of variation is sometimes problematic. Suppose that we have lots of data, but because CV ¼ SD=mean, if by chance in a particular case especially if we have small n, some of the multiply generated mean values may approach zero, which injects some erratically high CV-values into a distribution of values. It is for that reason, numerically instability, that the more data one has, the worse the mean CV-value can be, with the solution being to first calculate many CV values, and then take their median value [25]. Even though the mean value may be not useful, the median may be, and confidence intervals for CV could be established using bootstrap quantiles, but not by using the gradient matrix approach because correction for n-small is problematic. That is, for mean values that can be rewritten as being proportional and having an established maximum range, e,g., Likert scale minus 1 variables, correcting CV underestimation for small values of n is possible. However, if, as is the case here, there is no theoretical maximum CV, one needs to invent a correction based upon the observed confidence intervals of the mean [39], such that CI-values are unavoidable for determining the meaning of the preponderance of CV results. Finally, comparison for significant differences between parameters for one subject versus another are easy to construct using CI, but more difficult to obtain for CV. Thus, CV-values cited without explicit quality assurance should be regarded as qualitative results.

Limitations
A major deficiency of the first article that applied and compared the gamma-Pareto type I convolution (GPC) model to other models [13] was the lack of an algorithm that could be used for independent investigation and/or for application to other drugs. The accelerated algorithm presented herein is the first publication of code for a gamma-Pareto type I convolution (GPC). As such, the algorithm was kept in a simple form without using all possible acceleration tools or stopping conditions. While it could be optimised for even shorter run-times using vector addition of subsets of the summands, by using Eq. (14) to reduce summand magnitudes for the short-t algorithm and/or combining partial sums of the summands for the short-or long-t algorithms, by eliminating diagnostic parameters such as run-time calculations, by compiling it, and by multiple other means. However, that would be at the expense of clarity and/or simplicity of presentation. It is complicated to compute the values of functions like the sinðxÞ efficiently. For example, an answer with precalculated exact precision can be quickly generated for sin x using the CORDIC procedure, which is optimised for binary register operations at the machine level [40]. At a much higher and slower level, compared to the GPC(t) short-t algorithm, the sinðtÞ function's series expansion has even larger magnitude terms for long t-values. In its current form, the combined short-and long-t GPC algorithm is so much faster than the previously published run times using the seven dogs 72 h data and more generally valid that it is now a practical algorithm. The current implementation is no longer limited as to how long t is, and the propagated error of up to 2 Â 10 À14 for parameter values obtained from regression of 72 h data has been reduced to \10 À20 . That error demonstrates the major extent to which errors from 65 decimal place precision can propagate during processing of tens of thousands of calculations, especially during regression, which typically, by default, halves the number of significant figures-see the Data sources and regression Methods Subsection. This does not affect any of the parameter values listed in Table 1, but the ability to quickly calculate a larger number of model-based bootstrap results would improve the parameter CI estimates. Another consideration is how to obtain exactly n significant figures precision when n are requested. Currently, for 65 significant figures requested, a result precise to several significant figures greater or lesser than 65 is returned and the algorithm is written only for 65 significant figure precision. Generalising this to request to obtain an arbitrary specific precision for a GPC functional height awaits the next several generations of algorithmic refinement.

Conclusions
The new GPC type I algorithm consists of two parts, one for very early times, and another for late times. At times less than approximately 4 b, i.e., 100-120 s for the metformin data, the short-t algorithm is actually faster than the long-t algorithm. For early data, the short-t algorithm has alternating sign terms of monotonically decreasing magnitude. However, when used at long times, the short-t GPC algorithm required precalculation of the precision needed for later summation, which represents an improvement over the algorithm previously used [13]. In the newlyproposed, combined short and long-t algorithm this precalculation is unnecessary because of the long-t algorithm usage for all but the shortest t-values, resulting in markedly accelerated convergence, and the new ability to predict concentration at any time, no matter how long.
such that the sum of absolute values of summands of Eq. (9) is bounded above by some positive constant value times an exponential function of t, and Eq. (9) is absolutely convergent.

Long-t GPC algorithm convergence rapidity
This subsection examines the rapidity of convergence of the long-t GPC algorithm. In Lemma 1 directly above, it was shown that the short-t algorithm is absolutely convergent. Therefore, its infinite series rewrite as the longt Theorem 1, Eq. (10), is also convergent but how many summation terms are needed for convergence and which parameters determine this convergence can be clarified using the substituted definition of the confluent hypergeometric series 21 as follows. 1 F 1 ða; a À k; Àb tÞ ¼ X 1 j¼0 ðaÞ j ðÀb tÞ j j!ða À kÞ j ¼ 1 À a b t a À k þ aða þ 1Þðb tÞ 2 2!ða À kÞða À k þ 1Þ À . . . : Note that in the limit as k ! 1 the above equation is asymptotically ( $ ) 1. Next, the ratio of the ðk þ 1Þth to kth term is asymptotic to b k t for k sufficiently large, b t a À k ðk þ 1Þða À k À 1Þ 1 À a aÀkÀ1 b t þ aðaþ1Þ 2ðaÀkÀ1ÞðaÀkÞ ðb tÞ 2 À Á Á Á 1 À a aÀk b t þ aðaþ1Þ 2ðaÀkÞðaÀkþ1Þ ðb tÞ 2 À Á Á Á $ b k t : For that reason, for longer t-values, one can expect faster convergence of the long-t algorithm with fewer terms summed.
Choosing when to use the short-t and longt algorithms As above, the absolute value of the ratio of the next term to the current term for the short-t algorithm is bounded above by b ðtÀbÞ nþ1 . For the long-t algorithm, the ratio of the ðk þ 1Þth to kth term approaches b k t for k sufficiently large. Note that these are in the opposite direction, that is, while t-values increase, b ðtÀbÞ nþ1 increases and b k t decreases. It is not critical exactly at what t value one elects to use the short-and longt algorithms, as the major cost in computational time and number of terms needed occurs at the extreme values of t, but in an opposite direction for each algorithm. Figure 5 shows the tradeoff for dog 1 of the metformin series between numbers of terms for summation, time following bolus injection, and the magnitude of the natural logarithm of GPCðtÞ, where GPCðtÞ ¼ CðtÞ AUC . Selecting t ¼ 4b as a cut point for switching between algorithms means that the short-t algorithm absolute sum of terms is bounded above, from substitution into e bðtÀbÞ , by e 3b b times the first term's value, not a large number, and the long-t algorithm has an approximate maximum ðk þ 1Þth to kth term ratio of 1 4k for the shortest t-value used, which can still be made as small as desired for k sufficiently large.

Mathematica source code of the GPC type I accelerated algorithm
Computer implementation of the GPC type I function is provided as a notebook (nb file type) in the Mathematica language as Supplementary Materials 1. Without access to Mathematica itself that file cannot be easily reviewed. Therefore, an image of that file's contents is provided below. Fig. 5 The tradeoff between the time elapsed following bolus intravenous injection (x-axis), the number of terms to be summed for calculating the GPC function's value (y-axis), and the natural logarithm of the GPC function at that time (z-axis). Panel a shows short-t algorithm performance and panel b shows the long-t algorithm performance. Note that only for very early elapsed times does the short-t algorithm have fewer terms than the long-t algorithm Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s10928-021-09779-4.