Abstract
Fractional polynomials (FPs) have been shown to be more flexible than polynomial models for fitting data from a univariate regression model with a continuous outcome, but design issues for FP models have lagged behind. We focus on FPs with a single variable and construct D-optimal designs for estimating model parameters and I-optimal designs for prediction over a user-specified region of the design space. Some analytic results are given, along with a discussion of model uncertainty. In addition, we provide an applet that helps users find tailor-made optimal designs for their problems. As applications, we construct optimal designs for three studies that used FPs to model risk assessments of (a) testosterone levels from magnesium accumulation in certain areas of the brains of songbirds, (b) rats exposed to different chemicals, and (c) hormetic effects due to small toxic exposures. In each case, we elaborate on the benefits of having an optimal design in terms of cost and quality of the statistical inference.
References
Atkinson A, Donev A, Tobias R (2007) Optimum experimental designs, with SAS, vol 34. Oxford University Press, Oxford. https://doi.org/10.1080/10543406.2010.481801
Atzpodien J, Royston P, Stoerkel S, Reitz M (2007) Fractional polynomials in a new metastatic renal carcinoma continuous prognostic index involving histology, laboratory, and clinical predictors. Cancer Biotherapy Radiopharm 226:812–818. https://doi.org/10.1089/cbr.2007.375
Austin PC, Park-Wyllie LY, Juurlink DN (2014) Using fractional polynomials to model the effect of cumulative duration of exposure on outcomes: applications to cohort and nested case-control designs. Pharmacoepidemiol Drug Saf 238:819–829. https://doi.org/10.1002/pds.3607
Calabrese EJ (2004) Hormesis: a revolution in toxicology, risk assessment and medicine. EMBO Rep 5(S1):S37–S40. https://doi.org/10.1038/sj.embor.7400222
Calabrese EJ, Baldwin LA (2001) The frequency of U-shaped dose responses in the toxicological literature. Toxicol Sci 62(2):330–338. https://doi.org/10.1093/toxsci/62.2.330
Casero-Alonso V, Pepelyshev A, Wong WK (2018) A web-based tool for designing experimental studies to detect hormesis and estimate the threshold dose. Stat Pap 59(4):1307–1324. https://doi.org/10.1007/s00362-018-1038-5
Chang F-C, Lay C-F (2002) Optimal designs for a growth curve model. J Stat Plan Inference 104(2):427–438. https://doi.org/10.1016/s0378-3758(01)00255-5
Duong H, Volding D (2015) Modelling continuous risk variables: introduction to fractional polynomial regression. Vietnam J Sci 2:19–26. https://doi.org/10.1080/17476933.2019.1631287
Faes C, Geys H, Aerts M, Molenberghs G (2003) Use of fractional polynomials for dose-response modelling and quantitative risk assessment in developmental toxicity studies. Stat Modell 3(2):109–125. https://doi.org/10.1191/1471082x03st051oa
Fedorov VV (1972) Theory of optimal experiments. Elsevier, Amsterdam
Gasull A, Lázaro JT, Torregrosa J (2012) On the Chebyshev property for a new family of functions. J Math Anal Appl 387(2):631–644. https://doi.org/10.1016/j.jmaa.2011.09.019
Geys H, Molenberghs G, Declerck L, Ryan L (2000) Flexible quantitative risk assessment for developmental toxicity based on fractional polynomial predictors. Biometric J J Math Methods Biosci 42(3):279–302. https://doi.org/10.1002/1521-4036(200007)42:3<279::aid-bimj279>3.0.co;2-f
Groten JP, Schoen ED, Van Bladeren PJ, Kuper CF, Van Zorge JA, Feron VJ (1997) Subacute toxicity of a mixture of nine chemicals in rats: detecting interactive effects with a fractionated two-level factorial design. Fundam Appl Toxicol 36(1):15–29. https://doi.org/10.1093/toxsci/36.1.15
Kiefer J, Wolfowitz J (1960) The equivalence of two extremum problems. Can J Math 12:363–366. https://doi.org/10.1007/978-1-4615-6660-1_5
Kiefer J, Wolfowitz J (1964) Optimum extrapolation and interpolation designs, I. Ann Inst Stat Math 16(1):79–108. https://doi.org/10.1007/BF02868564
Kiefer JC (1985) Collected papers III: design of experiments. Springer, Berlin
Knafl GJ (2015) Adaptive fractional polynomial modeling in SAS®
Krishnan E, Tugwell P, Fries JF (2004) Percentile benchmarks in patients with rheumatoid arthritis: health assessment questionnaire as a quality indicator (QI). Arthritis Res Ther 66:1–9. https://doi.org/10.1186/ar1220
López-Fidalgo J, Tommasi C, Trandafir PC (2007) An optimal experimental design criterion for discriminating between non-normal models. J R Stat Soc B 69(2):231–242. https://doi.org/10.1111/j.1467-9868.2007.00586.x
Mayer B, Keller F, Syrovets T, Wittau M (2016) Estimation of half-life periods in nonlinear data with fractional polynomials. Stat Methods Med Res 25(5):1791–1803. https://doi.org/10.1177/0962280213502403
Namata H, Aerts M, Faes C, Teunis P (2008) Model averaging in microbial risk assessment using fractional polynomials. Risk Anal Int J 28(4):891–905. https://doi.org/10.1111/j.1539-6924.2008.01063.x
Pázman A (1986) Foundations of optimum experimental design, vol 14. Springer, Berlin
Pukelsheim F (1993) Optimal design of experiments, vol 50. SIAM, New Delhi. https://doi.org/10.1137/1.9780898719109
Royston P, Altman DG (1994) Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. J R Stat Soc Ser C (Appl Stat) 43(3):429–467. https://doi.org/10.2307/2986270
Royston P, Ambler G, Sauerbrei W (1999) The use of fractional polynomials to model continuous risk variables in epidemiology. Int J Epidemiol 28(5):964–974. https://doi.org/10.1093/ije/28.5.964
Royston P, Reitz M, Atzpodien J (2006) An approach to estimating prognosis using fractional polynomials in metastatic renal carcinoma. Br J Cancer 94(12):1785–1788. https://doi.org/10.1038/sj.bjc.6603192
Royston P, Sauerbrei W (2004) A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials. Stat Med 23(16):2509–2525. https://doi.org/10.1002/sim.1815
Sauerbrei W, Royston P, Bojar H, Schmoor C, Schumacher M (1999) Modelling the effects of standard prognostic factors in node-positive breast cancer. Br J Cancer 79(11):1752–1760. https://doi.org/10.1038/sj.bjc.6690279
Serroyen J, Molenberghs G, Verhoye M, Van Meir V, Van der Linden A (2005) Dynamic manganese-enhanced MRI signal intensity processing based on nonlinear mixed modeling to study changes in neuronal activity. J Agric Biol Environ Stat 10(2):170–183. https://doi.org/10.1198/108571105x46426
Shkedy Z, Aerts M, Molenberghs G, Beutels P, Van Damme P (2006) Modelling age-dependent force of infection from prevalence data using fractional polynomials. Stat Med 25(9):1577–1591. https://doi.org/10.1002/sim.2291
Silke B, Kellett J, Rooney T, Bennett K, O’riordan D (2010) An improved medical admissions risk system using multivariable fractional polynomial logistic regression modelling. QJM An Int J Med 103(1):23–32. https://doi.org/10.1093/qjmed/hcp149
Song D, Wong WK (1999) On the construction of \(G_{rm}\)-optimal designs. Stat Sin 263–272. http://www3.stat.sinica.edu.tw/statistica/j9n1/j9n115/j9n115.htm
Stigler SM (1971) Optimal experimental design for polynomial regression. J Am Stat Assoc 66(334):311–318. https://doi.org/10.1080/01621459.1971.10482260
Studden W (1982) Some robust-type \({D}\)-optimal designs in polynomial regression. J Am Stat Assoc 77(380):916–921. https://doi.org/10.2307/2287327
Wolfe F (2000) A reappraisal of HAQ disability in rheumatoid arthritis. Arthritis Rheumatism 43(12):2751–2761. https://doi.org/10.1002/1529-0131(200012)43:12<2751::aid-anr15>3.0.co;2-6
Wong WK, Lachenbruch PA (1996) Designing studies for dose response. Stat Med 15(4):343–359. https://doi.org/10.1002/0470023678.ch3a
Acknowledgements
All authors were sponsored by Ministerio de Economía y Competitividad and fondos FEDER MTM2016–80539–C2–1–R and by Junta de Comunidades de Castilla–La Mancha SBPLY/17/180501/000380. Dr. Wong wishes to acknowledge the support from the University of Castilla–La Mancha and the program FEDER of Castilla–La Mancha 2014–2020 that made his visit possible. He also wishes to acknowledge the hospitality of the department during the visit. We also thank Diego Urruchi-Mohíno for his helpful assistance in the development of the Mathematica codes to generate optimal designs for FP models. Dr. Wong was also partially supported by a grant from the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM107639. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health. We also want to acknowledge the valuable comments from reviewers.
Funding
All authors were sponsored by Ministerio de Economía y Competitividad and fondos FEDER MTM2016–80539–C2–1–R and by Junta de Comunidades de Castilla–La Mancha SBPLY/17/180501/000380.
Dr. Wong was also partially supported by a grant from the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM107639.
Ethics declarations
Conflict of interest
Not applicable.
Appendix
This appendix provides proofs for Theorem 1: D-optimal designs for FP1 and FP2 models and Theorem 2: I-optimal designs for the FP1 models.
Proof (Theorem 1: construction of D-optimal designs)
Suppose that a design \(\xi \) for an FP1 model defined on \([\epsilon ,a]\) is supported at 3 points \(\epsilon \le s_1< s_2 < s_3 \le a\). From the General Equivalence Theorem of Kiefer and Wolfowitz (1960), if \(\xi \) were D-optimal, its sensitivity function c(x) must satisfy \(c(x) =f^T(x) M^{-1}(\xi ) f(x) -2 \le 0\) for all \(x \in [\epsilon ,a]\) with equality at the support points. If \(p \ne 0\), the component functions in c(x) are \(1,x^p,x^{2p}\). These 3 component functions form a Tchebycheff system on the interval \([\epsilon ,a]\) because the determinant of the matrix
\[ \begin{pmatrix} 1 & x_1^p & x_1^{2p} \\ 1 & x_2^p & x_2^{2p} \\ 1 & x_3^p & x_3^{2p} \end{pmatrix} \]
has the same sign for any \(\epsilon \le x_1< x_2 < x_3 \le a\). It follows that for each \(0 \ne p \in {\mathcal {P}}\), the sensitivity function has at most 2 zeros and, since the D-optimal design for an FP1 model needs at least 2 points for its information matrix to be nonsingular, the D-optimal design is equally supported at 2 points (Pukelsheim 1993; Fedorov 1972). A similar argument shows that the same conclusion applies when \(p=0\) and the component functions are \(\{1,\ln [x],\ln [x]^2\}\). Direct calculation shows that the sensitivity function of the design equally supported at the two end-points is \(c(x) =4 \left( a^p-x^p\right) \left( \epsilon ^p-x^p\right) /\left( a^p-\epsilon ^p\right) ^2\) when \(p\ne 0\). Clearly, this function satisfies the conditions required in the equivalence theorem for D-optimality in (3) and so the design is D-optimal for FP1 models. A similar argument applies when \(p=0\).
Note that the previous reasoning about \(p \ne 0\) is valid for any real number value of p. Thus the result proved here is more general than for \(p \in {\mathcal {P}}\). The same applies for the following results.
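The closed-form sensitivity function for the FP1 end-point design can be checked numerically. The following is a minimal sketch, not the authors' Mathematica code: it builds the information matrix of the design with weight 1/2 at each of \(\epsilon \) and a, evaluates \(f^T(x) M^{-1}(\xi ) f(x) -2\) directly, and compares it with the closed form on a grid. The function names, the interval \([0.1, 5]\) and the list of powers (the usual FP powers other than 0) are illustrative assumptions.

```python
def fp1_sensitivity(x, p, eps, a):
    """Return c(x) = f(x)^T M^{-1} f(x) - 2 with f(x) = (1, x^p), p != 0,
    for the design putting weight 1/2 on each end-point eps and a."""
    t, t1, t2 = x ** p, eps ** p, a ** p
    # Entries of M = 0.5 * f(eps) f(eps)^T + 0.5 * f(a) f(a)^T
    m11 = 1.0
    m12 = 0.5 * (t1 + t2)
    m22 = 0.5 * (t1 ** 2 + t2 ** 2)
    det = m11 * m22 - m12 ** 2
    # Quadratic form via the explicit inverse of a symmetric 2x2 matrix
    quad = (m22 - 2.0 * m12 * t + m11 * t ** 2) / det
    return quad - 2.0


def fp1_closed_form(x, p, eps, a):
    """Closed-form sensitivity function stated in the proof (p != 0)."""
    return 4.0 * (a ** p - x ** p) * (eps ** p - x ** p) / (a ** p - eps ** p) ** 2


def check_fp1(p, eps, a, n=200):
    """Return (largest gap between the two formulas, largest c(x)) on a grid."""
    xs = [eps + (a - eps) * i / n for i in range(n + 1)]
    gap = max(abs(fp1_sensitivity(x, p, eps, a) - fp1_closed_form(x, p, eps, a))
              for x in xs)
    return gap, max(fp1_closed_form(x, p, eps, a) for x in xs)


if __name__ == "__main__":
    # Illustrative powers and interval (assumptions, not taken from the paper)
    for p in (-2, -1, -0.5, 0.5, 1, 2, 3):
        gap, cmax = check_fp1(p, eps=0.1, a=5.0)
        print(f"p={p:>4}: max |difference| = {gap:.2e}, max c(x) = {cmax:.2e}")
```

On such a grid the two expressions should agree up to rounding error and the largest value of c(x) should be zero, attained at the two end-points, which is what the equivalence theorem requires.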
For the FP2 models, we first establish that the D-optimal design \(\xi _D\) has three points. For such models, the sensitivity function c(x) has at most 6 component functions, i.e. \(1, x^p, x^{2p}, x^q, x^{2q}\) and \(x^{p+q}\). In selected cases, such as when \(p=-0.5\) and \(q=-1\), the sensitivity function has only 5 component functions. A direct calculation shows that the associated Wronskians for this system of functions are positive for any values of p and q and so the component functions form a Tchebycheff system (Gasull et al. 2012). It follows that there are at most 5 zeroes (counting multiplicities). The interior support points have multiplicity two, because the maximum value of the sensitivity function of the D-optimal design has to be less than or equal to zero in the interval with the maximum value attained at the support points. This implies only three support points are possible, either two interior points and one extreme point of the design interval or one interior support point and the two extreme points of the interval \([\epsilon ,a]\). Because the number of support points is the same as the number of parameters, the D-optimal design is equally weighted (Pukelsheim 1993). We next argue that the optimal designs have to include the two extreme points.
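Suppose the equally weighted design \(\xi \) is supported at \(s_1<s_2<a\), where \(\epsilon <s_1\). The determinant of the information matrix of this design for model FP2(p, q) with \(0 \ne p\ne q \ne 0\) is
\[ |M(\xi )| = \frac{1}{27} D^2 > 0, \qquad D = \begin{vmatrix} 1 & s_1^p & s_1^q \\ 1 & s_2^p & s_2^q \\ 1 & a^p & a^q \end{vmatrix}, \]
since D is always either positive or negative for any values of \(s_1,s_2\) and a because we have a Chebyshev system. Further,
\[ \frac{\partial |M(\xi )|}{\partial s_1} = \frac{2}{27}\, D\, \frac{\partial D}{\partial s_1} \ne 0 \]
implies that
\[ \frac{\partial D}{\partial s_1} = q s_1^{q-1}\left( a^p-s_2^p\right) - p s_1^{p-1}\left( a^q-s_2^q\right) \ne 0 \]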
and so D is a decreasing function of \(s_1\). Consequently, \(\epsilon \) is a support point of the D-optimal design. We note that the last equation holds if and only if \(s_1^{q-p}\frac{q (a^p-s_2^p) }{p (a^q - s_2^q) }\ne 1\), i.e. \(\left( \frac{s_1}{c} \right) ^{q-p} \ne 1.\) For the first equivalence we note that \(ps_1^{p-1}(a^q-s_2^q) \ne 0\). The last equivalence holds because, by the Cauchy mean value theorem, there exists a \(c \in [s_2,a]\) such that \( (a^p-s_2^p) /(a^q - s_2^q) =c^{p-q} p/q \). Consequently, since \(s_1<c\) and, without loss of generality, \(q>p\), we have \((s_1/c) ^{q-p}<1\), proving the result. It follows that \(\partial |M(\xi ) |/\partial s_1\) is negative, otherwise the optimal design is singular. Hence the determinant of the information matrix is a decreasing function of \(s_1\) and its maximum is attained at \(s_1=\epsilon \), the left end-point of the design space. Similar reasoning applied to a design supported at \(\epsilon<s_2<s_3\) shows that its determinant is maximized when \(s_3=a\). The upshot is that the D-optimal design is equally supported at the two end-points of the design space and at an interior point. The above arguments apply to the other cases of p and q (p and q unequal and one of them zero, p and q equal, and \(p=q=0\)). In each case, the interior support point s is the unique root of the derivative of the sensitivity function.\(\square \)
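The FP2 construction can be illustrated numerically for \(0 \ne p\ne q \ne 0\). The sketch below is an assumption-laden illustration rather than the authors' algorithm. It takes D to be the determinant of the \(3\times 3\) matrix whose rows are \((1, s^p, s^q)\) at the three support points, so that an equally weighted design has \(|M(\xi )| = D^2/27\). It finds the interior point by a ternary search over \(D^2\), and evaluates the sensitivity function \(f^T(x) M^{-1}(\xi ) f(x) -3\) through Cramer's rule. All function names and numerical settings are hypothetical.

```python
def fp2_row(x, p, q):
    """Regression vector f(x) = (1, x^p, x^q) of an FP2(p, q) model, 0 != p != q != 0."""
    return [1.0, x ** p, x ** q]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def design_det(points, p, q):
    """D = determinant of the matrix whose rows are f(s_i) at the three support points."""
    return det3([fp2_row(s, p, q) for s in points])


def interior_point(p, q, eps, a, iters=200):
    """Ternary search for the s in (eps, a) maximizing D^2, i.e. |M| = D^2 / 27.
    Assumes D^2 is unimodal in s, in line with the uniqueness claim in the proof."""
    lo, hi = eps, a
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if design_det((eps, m1, a), p, q) ** 2 < design_det((eps, m2, a), p, q) ** 2:
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def fp2_sensitivity(x, points, p, q):
    """c(x) = f(x)^T M^{-1} f(x) - 3 for the equally weighted design on `points`.
    For a saturated design this is 3 * sum_i l_i(x)^2 - 3, where l_i(x) is D with
    row i replaced by f(x), divided by D (Cramer's rule)."""
    d = design_det(points, p, q)
    total = 0.0
    for i in range(3):
        rows = [fp2_row(s, p, q) for s in points]
        rows[i] = fp2_row(x, p, q)
        total += (det3(rows) / d) ** 2
    return 3.0 * total - 3.0


if __name__ == "__main__":
    # FP2(1, 2) is ordinary quadratic regression; the interval is illustrative
    eps, a = 0.5, 2.5
    s = interior_point(1, 2, eps, a)
    worst = max(fp2_sensitivity(eps + (a - eps) * k / 400, (eps, s, a), 1, 2)
                for k in range(401))
    print(f"interior support point = {s:.6f}, max c(x) on grid = {worst:.2e}")
```

For FP2(1, 2) the model is a quadratic polynomial, so the search should return the midpoint of the interval, the classical D-optimal design for quadratic regression. Evaluating `design_det` at increasing values of the left support point, with the other two points fixed, gives a quick check that \(D^2\) decreases in \(s_1\).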
Proof (Theorem 2: construction for I-optimal designs)
The previous reasoning can be used to directly prove that I-optimal designs for FP1 models are unequally supported at the end-points of the design space. We note that the component functions in the sensitivity function of (4) are \(1,x^p,x^{2p}\) when \(p \ne 0\) and \(\{1,\ln [x],\ln [x]^2\}\) when \(p=0\), and they form a Tchebycheff system on \([\epsilon ,a]\). Similarly, the weights in Sect. 3.2 are found by finding the roots of the sensitivity function of the I-optimal design evaluated at \(x=\epsilon \) and \(x=a\).
In addition, a direct calculation, from 1/w given in 1., shows the result 3. for \(p>0\) and \(\epsilon =0\). \(\square \)
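To see why the end-point weights of an I-optimal design for an FP1 model are generally unequal, the following sketch may help. It is not the weight formula of Sect. 3.2. It assumes \(p \ne 0\), takes the prediction region to be the whole interval \([\epsilon ,a]\) with uniform weighting, and uses the criterion \(\mathrm {tr}(M^{-1}(\xi ) B)\) with \(B=\int f(x) f^T(x)\,dx\). For a design with weight w at \(\epsilon \) and \(1-w\) at a, this criterion reduces to \(A_1/w + A_2/(1-w)\), where \(A_i\) is the integral of the squared interpolating function attached to the i-th end-point, and it is minimized at \(w=\sqrt{A_1}/(\sqrt{A_1}+\sqrt{A_2})\). All names below are hypothetical.

```python
def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(lo + k * h)
    return total * h / 3.0


def lagrange_integrals(p, eps, a):
    """A1, A2 = integrals over [eps, a] of l1(x)^2 and l2(x)^2, where l1, l2 lie in
    span{1, x^p} with l1(eps) = 1, l1(a) = 0 and l2(eps) = 0, l2(a) = 1 (p != 0)."""
    t1, t2 = eps ** p, a ** p
    a1 = simpson(lambda x: ((t2 - x ** p) / (t2 - t1)) ** 2, eps, a)
    a2 = simpson(lambda x: ((x ** p - t1) / (t2 - t1)) ** 2, eps, a)
    return a1, a2


def i_criterion(w, p, eps, a):
    """tr(M^{-1} B) for the design with weight w at eps and 1 - w at a, where B is
    the integral of f(x) f(x)^T over [eps, a] (assumed prediction region)."""
    a1, a2 = lagrange_integrals(p, eps, a)
    return a1 / w + a2 / (1.0 - w)


def i_optimal_weight(p, eps, a):
    """Weight at eps that minimizes A1 / w + A2 / (1 - w) over 0 < w < 1."""
    a1, a2 = lagrange_integrals(p, eps, a)
    return a1 ** 0.5 / (a1 ** 0.5 + a2 ** 0.5)


if __name__ == "__main__":
    # Illustrative interval and powers (assumptions, not taken from the paper)
    for p in (-1, 0.5, 1, 2):
        w = i_optimal_weight(p, eps=1.0, a=3.0)
        print(f"p={p:>4}: weight at eps = {w:.4f}, "
              f"criterion = {i_criterion(w, p, 1.0, 3.0):.4f}")
```

Under these assumptions the weights are equal only when \(A_1=A_2\), for instance for \(p=1\) by symmetry; for other powers the two end-points receive different weights.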
Cite this article
Casero-Alonso, V., López–Fidalgo, J. & Wong, W.K. Optimal designs for health risk assessments using fractional polynomial models. Stoch Environ Res Risk Assess 36, 2695–2710 (2022). https://doi.org/10.1007/s00477-021-02155-1
DOI: https://doi.org/10.1007/s00477-021-02155-1