Abstract
A new method of Geometrically Designed least squares (LS) splines with variable knots, named GeDS, is proposed. It is based on the property that the spline regression function, viewed as a parametric curve, has a control polygon and, due to the shape-preserving and convex hull properties, closely follows the shape of this control polygon. The vertices of the control polygon have x-coordinates that are certain knot averages and y-coordinates that are the regression coefficients. Thus, manipulating the position of the control polygon may be interpreted as estimating the knots and coefficients of the spline curve. These geometric ideas are implemented in the two stages of the GeDS estimation method. In stage A, a linear LS spline fit to the data is constructed and viewed as the initial position of the control polygon of a higher-order (\(n>2\)) smooth spline curve. In stage B, the optimal set of knots of this higher-order spline curve is found, so that its control polygon is as close as possible to the initial polygon of stage A; finally, the LS estimates of the regression coefficients of this curve are found. The GeDS method simultaneously produces linear, quadratic, cubic (and possibly higher-order) spline fits with one and the same number of B-spline coefficients. Numerical examples are provided and further supplemental materials are available online.
References
Antoniadis A, Gijbels I, Verhasselt A (2012) Variable selection in additive models using P-splines. Technometrics 54(4):425–438
Beliakov G (2004) Least squares splines with free knots: global optimization approach. Appl Math Comput 149:783–798
Belitser E, Serra P (2014) Adaptive priors based on splines with random knots. Bayesian Anal 9(4):859–882
Biller C (2000) Adaptive Bayesian regression splines in semiparametric generalized linear models. J Comput Graph Stat 9:122–140
Cohen E, Riesenfeld RF, Elber G (2001) Geometric modelling with splines: an introduction. A K Peters, Natick
De Boor C (2001) A practical guide to splines, revised edition. Springer, New York
Denison D, Mallick B, Smith A (1998) Automatic Bayesian curve fitting. J R Stat Soc B 60:333–350
Donoho D, Johnstone I (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81:425–455
Eubank R (1988) Spline smoothing and nonparametric regression. Dekker, New York
Fan J, Gijbels I (1995) Data-driven bandwidth selection in local polynomial fitting: variable bandwidth and spatial adaptation. J R Stat Soc B 57:371–394
Farin G (2002) Curves and surfaces for CAGD, 5th edn. Morgan Kaufmann, San Francisco
Friedman JH (1991) Multivariate adaptive regression splines (with discussion). Ann Stat 19:1–141
Friedman JH, Silverman BW (1989) Flexible parsimonious smoothing and additive modeling (with discussion). Technometrics 31:3–39
Hansen MH, Kooperberg C (2002) Spline adaptation in extended linear models (with comments and a rejoinder by the authors). Stat Sci 17(1):2–51
Hastie T (1989) [Flexible Parsimonious Smoothing and Additive Modeling]: Discussion. Technometrics 31:23–29
Huang JZ (2003) Local asymptotics for polynomial spline regression. Ann Stat 31:1600–1635
Jupp D (1978) Approximation to data by splines with free knots. SIAM J Numer Anal 15:328–343
Kaishev VK (1984) A computer program package for solving spline regression problems. In: Havranek T, Sidak Z, Novak M (eds) Proceedings in computational statistics, COMPSTAT. Physica-Verlag, Wien, pp 409–414
Kang H, Chen F, Li Y, Deng J, Yang Z (2015) Knot calculation for spline fitting via sparse optimization. Comput Aided Des 58:179–188
Kimber SAJ, Kreyssig A, Zhang YZ, Jeschke HO, Valenti R, Yokaichiya F, Colombier E, Yan J, Hansen TC, Chatterji T, McQueeney RJ, Canfield PC, Goldman AI, Argyriou DN (2009) Similarities between structural distortions under pressure and chemical doping in superconducting \(\text{BaFe}_2\text{As}_2\). Nat Mater 8:471–475
Lee TCM (2000) Regression spline smoothing using the minimum description length principle. Stat Probab Lett 48:71–82
Lee TCM (2002a) Automatic smoothing for discontinuous regression functions. Stat Sin 12:823–842
Lee TCM (2002b) On algorithms for ordinary least squares regression spline fitting: a comparative study. J Stat Comput Simul 72:647–663
Lindstrom MJ (1999) Penalized estimation of free-knot splines. J Comput Graph Stat 8(2):333–352
Luo Z, Wahba G (1997) Hybrid adaptive splines. J Am Stat Assoc 92:107–115
Mammen E, Van der Geer S (1997) Locally adaptive regression splines. Ann Stat 25(1):387–413
Marx BD, Eilers PHC (1996) Flexible smoothing with B-splines and penalties. Stat Sci 11(2):89–121
Miyata S, Shen X (2003) Adaptive free-knot splines. J Comput Graph Stat 12(1):197–231
Molinari N, Durand J-F, Sabatier R (2004) Bounded optimal knots for regression splines. Comput Stat Data Anal 45(2):159–178
Pittman J (2002) Adaptive splines and genetic algorithms. J Comput Graph Stat 11(3):1–24
Ruppert D (2002) Selecting the number of knots for penalized splines. J Comput Graph Stat 11(4):735–757
Ruppert D, Carroll RJ (2000) Spatially-adaptive penalties for spline fitting. Aust N Z J Stat 42:205–223
Schwetlick H, Schütze T (1995) Least squares approximation by splines with free knots. BIT Numer Math 35:854–866
Smith PL (1982) Curve fitting and modeling with splines using statistical variable selection techniques. Report NASA 166034, Langley Research Center, Hampton
Smith M, Kohn R (1996) Nonparametric regression using Bayesian variable selection. J Econom 75:317–344
Stone CJ, Hansen MH, Kooperberg C, Truong YK (1997) Polynomial splines and their tensor products in extended linear modeling. Ann Stat 25:1371–1470
Van Loock W, Pipeleers G, De Schutter J, Swevers J (2011) A convex optimization approach to curve fitting with B-splines. In: Preprints of the 18th international federation of automatic control (IFAC), Milano (Italy), 2290–2295
Wahba G (1990) Spline models for observational data. SIAM, Philadelphia
Will G (2006) Powder diffraction: the Rietveld method and the two stage method. Springer, Berlin
Wood SN (2003) Thin plate regression splines. J R Stat Soc B 65(1):95–114
Yuan Y, Chen N, Zhou S (2013) Adaptive B-spline knots selection using multi-resolution basis set. IIE Trans 45(12):1263–1277
Zhou S, Shen X (2001) Spatially adaptive regression splines and accurate knot selection schemes. J Am Stat Assoc 96:247–259
Acknowledgments
The authors would like to acknowledge support received through a research grant from the UK Institute of Actuaries. The authors would also like to thank Simon Kimber for providing them with the \(\text{BaFe}_2\text{As}_2\) dataset and the results from the Rietveld fit given in Kimber et al. (2009). The sincere encouragement received from David van Dyk, and his help in discussing the paper and providing invaluable advice on ways to improve it, are greatly appreciated.
Appendices
1.1 Proofs of the results of section 3.1
Proof of Theorem 3.4
Note that, for \(n=2\), \(\xi _i\equiv \xi _i^*\), \(i=1,\ldots ,p\), hence \(V^a[g]\equiv V[g]\) and the bound in (23), which is zero, is sharp. For \(n>2\), from (4) it follows that \(\xi _1^*\equiv a\equiv \xi _1\) and \(\xi _p^*\equiv b\equiv \xi _p\), and from the definitions of V[g] and \(V^a[g]\), (9) and (22) respectively, we have
where the last equality follows from the partition of unity property of B-splines (see Sect. 2). Applying the definition of the modulus of continuity to (27) we have
From the definition (4) of the Greville sites \(\xi _i^*\) we have \(\xi _j^*=(t_{j+1}+\ldots + t_{j+n-1})/(n-1)\), \(j=2,\ldots ,p-1\). From (21), it follows that \(t_{j+1}=\left( \xi _{j-(n-2)}+\ldots +\xi _j \right) /(n-1),\ldots , t_{j+n-1}=\left( \xi _{j}+\ldots +\xi _{j+(n-2)} \right) /(n-1)\), where we have defined \(\xi _{1-l}:=a\) and \(\xi _{p+l}:=b\), \(l=1,2,\ldots \). Consider the \(\text{ max }_{j\in \{2,\ldots ,p-1\}}\left| \xi _j-\xi _j^*\right| \) and assume it is achieved for some \(j^m\), \(2\le j^m<p-1\). Expressing \(\xi _{j^m}^*\) in terms of \(\xi _{j^m}\), using the above equalities, after some algebra, it is not difficult to see that
and if we now rearrange the terms in the sum in (29), we obtain
Assume that \(\sum _{i=1}^{n-2}i\left( \xi _{j^m+(n-1-i)}-\xi _{j^m}\right) >\sum _{i=1}^{n-2}i\left( \xi _{j^m}-\xi _{j^m-(n-1-i)}\right) \). In this case, it is not difficult to see that (30) is bounded by
Similarly, it can be shown that if \(\sum _{i=1}^{n-2}i\left( \xi _{j^m+(n-1-i)}-\xi _{j^m}\right) \le \sum _{i=1}^{n-2}i\Big (\xi _{j^m}- \xi _{j^m-(n-1-i)}\Big )\) the bound in (31) also holds. Thus, from (31) and (28) we have
Using the monotonicity and subadditivity of \(\omega (g;h)\) in h, from (32) we finally obtain
where \(\lceil \nu \rceil :=\min \{z\in \mathbb {Z}:\nu \le z\}\). This completes the proof of Theorem 3.4. \(\square \)
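The relation between the vertex abscissae \(\xi _j\) and the Greville sites \(\xi _j^*\) used in the proof can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the knot vector is taken to be clamped (the boundary knots \(a\) and \(b\) repeated \(n\) times), and the interior knots are taken to be averages of \(n-1\) consecutive interior \(\xi \)'s, which is how the relation quoted from (21) reads for interior indices. The function names `averaged_knots` and `greville_sites` and the sample values of \(\xi \) are hypothetical and are not taken from the paper.

```python
def averaged_knots(xi, n):
    """Clamped order-n knot vector built from vertex abscissae xi.

    Assumed rule (an illustration, not the paper's code): each interior knot
    is the mean of n - 1 consecutive interior abscissae xi[1], ..., xi[-2].
    """
    a, b = xi[0], xi[-1]
    inner = xi[1:-1]
    m = n - 1
    interior = [sum(inner[i:i + m]) / m for i in range(len(inner) - m + 1)]
    return [a] * n + interior + [b] * n


def greville_sites(t, n):
    """Greville sites xi*_j = (t_{j+1} + ... + t_{j+n-1}) / (n - 1), j = 1..p."""
    p = len(t) - n  # number of B-spline coefficients
    return [sum(t[i + 1:i + n]) / (n - 1) for i in range(p)]


# Hypothetical vertex abscissae with xi_1 = a = 0 and xi_p = b = 8.
xi = [0.0, 1.0, 2.0, 4.0, 7.0, 8.0]
for n in (2, 3, 4):
    t = averaged_knots(xi, n)
    xi_star = greville_sites(t, n)
    gap = max(abs(u - v) for u, v in zip(xi, xi_star))
    print(n, xi_star, gap)
```

In this sketch the Greville sites coincide with the \(\xi _j\) for \(n=2\), in line with the first sentence of the proof, while for \(n=3\) the end sites still equal \(a\) and \(b\) and the interior sites move away from the \(\xi _j\); every order keeps the same number \(p\) of coefficients.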
Proof of Corollary 3.5
This follows directly from (32) and from the definition (24) of \(\omega (g;h)\), i.e.
\(\square \)
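The monotonicity and subadditivity of \(\omega (g;h)\) in \(h\), and the resulting inequality \(\omega (g;\lambda h)\le \lceil \lambda \rceil \omega (g;h)\), can be checked numerically. The sketch below assumes the standard definition \(\omega (g;h)=\sup \{|g(x)-g(y)|:|x-y|\le h\}\) on \([a,b]\), which is presumably what (24) states, and approximates the supremum on a uniform grid. The function name `modulus_of_continuity`, the grid size and the test function are hypothetical choices made for illustration.

```python
import math


def modulus_of_continuity(g, a, b, h, m=400):
    """Grid approximation of omega(g; h) = sup{|g(x) - g(y)| : |x - y| <= h} on [a, b]."""
    step = (b - a) / m
    # Largest index gap k with k * step <= h (small tolerance for rounding).
    k = min(m, int(h / step + 1e-9))
    vals = [g(a + i * step) for i in range(m + 1)]
    return max(
        abs(vals[i] - vals[j])
        for i in range(m + 1)
        for j in range(i, min(m, i + k) + 1)
    )


def g(x):
    return math.sin(2.0 * math.pi * x)


w_small = modulus_of_continuity(g, 0.0, 1.0, 0.10)
w_large = modulus_of_continuity(g, 0.0, 1.0, 0.25)  # 0.25 = 2.5 * 0.10

print(w_small <= w_large)                               # monotonicity in h
print(w_large <= math.ceil(2.5) * w_small + 1e-12)      # omega(g; 2.5 h) <= 3 omega(g; h)
```

On a uniform grid with \(h\) a multiple of the grid step, subadditivity holds exactly for the discrete approximation, so both checks are expected to print `True`.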
Proof of Corollary 3.6
From (27), for \(n=3\) and \(g=\hat{f}\), we have
Recall that \(n=3\) and hence, \(\xi _j^*=(t_{j+1}+t_{j+2})/2\), and \(t_{j+1}=(\delta _j+\delta _{j+1})/2\), \(t_{j+2}=(\delta _{j+1}+\delta _{j+2})/2\). Therefore, we need to consider the cases when \(\delta _{j}<\xi _j^*\le \delta _{j+1}\), or \(\delta _{j+1}\le \xi _j^*<\delta _{j+2}\), \(2\le j\le p-1\). In the first case, applying the Mansfield-De Boor-Cox recurrence formula we know that if \(\delta _j<\xi _j^*<\delta _{j+1}\), then \(\sum _{i=j-1}^{j+1}\hat{\alpha }_i N_{i,2}\left( \xi _j^*\right) =\hat{\alpha }_{j-1} N_{j-1,2}\left( \xi _j^*\right) +\hat{\alpha }_{j}N_{j,2}\left( \xi _j^*\right) \), which is a convex combination of only two B-spline coefficients. Thus, (34) becomes
where we have used the fact that \(\delta _{j+2}-\delta _{j+1}>0\) to arrive at the last inequality. Similarly, it is not difficult to see that the same bound as in (35) holds in the case when \(\delta _{j+1}\le \xi _j^*\le \delta _{j+2}\). This completes the proof of Corollary 3.6. \(\square \)
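The key step in this proof, that a linear spline evaluated strictly inside a knot interval is a convex combination of only two B-spline coefficients, can be illustrated with a short sketch. It evaluates order-2 B-splines with the Cox-de Boor recurrence on a clamped knot vector. The function name `bspline`, the knots `delta`, the coefficients `alpha` and the evaluation point are hypothetical values chosen for illustration, and the indexing is 0-based rather than the 1-based indexing of the text.

```python
def bspline(i, n, t, x):
    """N_{i,n}(x): i-th (0-based) B-spline of order n on knots t (Cox-de Boor recurrence)."""
    if n == 1:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    value = 0.0
    if t[i + n - 1] > t[i]:  # skip terms with a zero-length support
        value += (x - t[i]) / (t[i + n - 1] - t[i]) * bspline(i, n - 1, t, x)
    if t[i + n] > t[i + 1]:
        value += (t[i + n] - x) / (t[i + n] - t[i + 1]) * bspline(i + 1, n - 1, t, x)
    return value


# Hypothetical clamped knots of a linear (order 2) spline and its coefficients.
delta = [0.0, 0.0, 1.0, 2.0, 4.0, 7.0, 8.0, 8.0]
alpha = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]

x = 2.5  # lies strictly between the knots 2.0 and 4.0
weights = [bspline(i, 2, delta, x) for i in range(len(alpha))]
active = [i for i, w in enumerate(weights) if w > 0.0]
value = sum(a * w for a, w in zip(alpha, weights))

print(active, [weights[i] for i in active], value)
```

Only two weights are nonzero at this point; they are nonnegative and sum to one by the partition of unity property, so the spline value lies between the two corresponding coefficients.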
Cite this article
Kaishev, V.K., Dimitrova, D.S., Haberman, S. et al. Geometrically designed, variable knot regression splines. Comput Stat 31, 1079–1105 (2016). https://doi.org/10.1007/s00180-015-0621-7
DOI: https://doi.org/10.1007/s00180-015-0621-7