Nonlinear Regression

Abstract

Until now we have considered only linear regression; in this chapter we turn to the nonlinear case, in which the relationship between the covariates and the response is not linear. In the linear regression of Chap. 2 with p variables, we compute the \(p+1\) coefficients of a basis consisting of the \(p+1\) functions \(1,x_1,\ldots ,x_p\). This chapter addresses regression with a general basis. For example, if the response is expressed as a degree-p polynomial of a single covariate x, the basis consists of \(1,x,\ldots ,x^p\). We also derive a basis for spline regression; in that case, the coefficients can be found in the same manner as for linear regression. Moreover, we consider local regression, for which the response cannot be expressed by a finite number of basis functions. Finally, we present a unified framework (the generalized additive model) and the back-fitting algorithm.


Notes

  1. For \(N>100\), we could not compute the inverse matrix; errors occurred due to a memory shortage.

  2. We call such a kernel a kernel in the broad sense.


Appendices

Appendix: Proof of Propositions

Proposition 20

The function f(x) has the K functions \(h_1(x)=1,\ h_2(x)=x,\ h_{j+2}(x)=d_{j}(x)-d_{K-1}(x),\ j=1,\ldots ,K-2\) as a basis, and if we define

$$\gamma _1:=\beta _1,\ \gamma _2:=\beta _2,\ \gamma _3:=(\alpha _K-\alpha _1)\beta _3,\ \ldots ,\ \gamma _K:=(\alpha _K-\alpha _{K-2})\beta _K$$

for each \(\beta _1,\ldots ,\beta _K\), we can express f as \(\displaystyle f(x)=\sum _{j=1}^K\gamma _j h_j(x)\), where we have

$$d_j(x)=\frac{(x-\alpha _j)^3_+-(x-\alpha _K)_+^3}{\alpha _K-\alpha _j} ,\ j=1,\ldots ,K-1\ .$$

Proof

First, the condition (7.3) \(\displaystyle \beta _{K+1}=-\sum _{j=3}^K\frac{\alpha _K-\alpha _{j-2}}{\alpha _K-\alpha _{K-1}}\beta _j \) can be expressed as

$$\begin{aligned} \gamma _{K+1}=-\sum _{j=3}^K\gamma _j \end{aligned}$$
(7.9)

with \(\gamma _{K+1}:=(\alpha _K-\alpha _{K-1})\beta _{K+1}\).

In the following, we show that \(\gamma _1,\ldots ,\gamma _K\) are coefficients when the basis consists of \(h_1(x)=1\), \(h_2(x)=x\), \(h_{j+2}(x)=d_{j}(x)-d_{K-1}(x)\), \(j=1,\ldots ,K-2\), where

$$d_j(x)=\frac{(x-\alpha _j)^3_+-(x-\alpha _K)_+^3}{\alpha _K-\alpha _j} ,\ j=1,\ldots ,K-1\ $$

for each of the cases \(x \le \alpha _K\) and \(\alpha _K \le x\).

In fact, for \(x\le \alpha _{K}\), using (7.9), we obtain

$$\begin{aligned} \sum _{j=3}^{K+1}\gamma _j\frac{(x-\alpha _{j-2})_+^3}{\alpha _K-\alpha _{j-2}}= & {} \sum _{j=3}^{K}\gamma _j\frac{(x-\alpha _{j-2})_+^3}{\alpha _K-\alpha _{j-2}} -\sum _{j=3}^{K}\gamma _j\frac{(x-\alpha _{K-1})_+^3}{\alpha _K-\alpha _{K-1}}\\= & {} \sum _{j=3}^{K}\gamma _j\left\{ \frac{(x-\alpha _{j-2})_+^3}{\alpha _K-\alpha _{j-2}} -\frac{(x-\alpha _{K-1})_+^3}{\alpha _K-\alpha _{K-1}}\right\} \\= & {} \sum _{j=3}^K \gamma _j\{d_{j-2}(x)-d_{K-1}(x)\}\ , \end{aligned}$$

which means

$$\begin{aligned} f(x)= & {} \beta _1+\beta _2x+\sum _{j=3}^{K+1}\beta _j(x-\alpha _{j-2})_+^3\\= & {} \gamma _1+\gamma _2x+\sum _{j=3}^{K+1}\gamma _j\frac{(x-\alpha _{j-2})_+^3}{\alpha _K-\alpha _{j-2}}\\= & {} \gamma _1+\gamma _2x+\sum _{j=3}^{K}\gamma _j(d_{j-2}(x)-d_{K-1}(x))=\sum _{j=1}^K\gamma _j h_j(x)\ . \end{aligned}$$

For \(x\ge \alpha _{K}\), according to the definition, for \(j=1,\ldots ,K-2\), we have

$$\begin{aligned} h_{j+2}(x)= & {} \frac{(x-\alpha _j)^3-(x-\alpha _K)^3}{\alpha _K-\alpha _j} - \frac{(x-\alpha _{K-1})^3-(x-\alpha _K)^3}{\alpha _K-\alpha _{K-1}}\nonumber \\= & {} (x-\alpha _j)^2+(x-\alpha _K)^2+(x-\alpha _j)(x-\alpha _K)-(x-\alpha _K)^2\nonumber \\&-(x-\alpha _{K-1})^2-(x-\alpha _{K-1})(x-\alpha _K)\nonumber \\= & {} (\alpha _{K-1}-\alpha _j)(2x-\alpha _j-\alpha _{K-1})+(x-\alpha _K)(\alpha _{K-1}-\alpha _j)\end{aligned}$$
(7.10)
$$\begin{aligned}= & {} (\alpha _{K-1}-\alpha _j)(3x-\alpha _j-\alpha _{K-1}-\alpha _K)\ , \end{aligned}$$
(7.11)

where the second-to-last equality is obtained by factoring the first and fifth terms and the third and sixth terms (the second and fourth terms cancel). Therefore, if we substitute \(x=\alpha _K\) into \(f(x)=\sum _{j=1}^K\gamma _jh_j(x)\) and \(f'(x)=\sum _{j=1}^K\gamma _jh_j'(x)\), we obtain

$$\begin{aligned} f(\alpha _K)=\gamma _1+\gamma _2\alpha _K+\sum _{j=3}^K\gamma _j(\alpha _{K-1}-\alpha _{j-2})(2\alpha _K-\alpha _{j-2}-\alpha _{K-1}) \end{aligned}$$
(7.12)

and

$$\begin{aligned} f'(\alpha _K)=\gamma _2+3\sum _{j=3}^K\gamma _j(\alpha _{K-1}-\alpha _{j-2}). \end{aligned}$$
(7.13)

Thus, for \(x\ge \alpha _K\), \(f(x)=\sum _{j=1}^K\gamma _jh_j(x)\) is a line. On the other hand, using the expression \(\displaystyle f(x)=\gamma _1+\gamma _2 x +\sum _{j=3}^{K+1}\gamma _j\frac{(x-\alpha _{j-2})_+^3}{\alpha _K-\alpha _{j-2}}\), valid for \(x\le \alpha _K\), to compute the value and the derivative at \(x=\alpha _K\), from (7.9) we obtain

$$\begin{aligned} f(\alpha _K)= & {} \gamma _1+\gamma _2\alpha _K+\sum _{j=3}^{K+1}\gamma _j\frac{(\alpha _K-\alpha _{j-2})^3}{\alpha _K-\alpha _{j-2}} =\gamma _1+\gamma _2\alpha _K+\sum _{j=3}^{K+1}\gamma _j(\alpha _K-\alpha _{j-2})^2\end{aligned}$$
(7.14)
$$\begin{aligned}= & {} \gamma _1+\gamma _2\alpha _K+\sum _{j=3}^{K}\gamma _j(\alpha _K-\alpha _{j-2})^2-\sum _{j=3}^{K}\gamma _j(\alpha _K-\alpha _{K-1})^2\nonumber \\= & {} \gamma _1+\gamma _2\alpha _K+\sum _{j=3}^{K}\gamma _j(\alpha _{K-1}-\alpha _{j-2})(2\alpha _K-\alpha _{j-2}-\alpha _{K-1}) \end{aligned}$$
(7.15)

and

$$\begin{aligned} f'(\alpha _K)= & {} \gamma _2+3\sum _{j=3}^{K+1}\gamma _j\frac{(\alpha _K-\alpha _{j-2})^2}{\alpha _K-\alpha _{j-2}}= \gamma _2+3\sum _{j=3}^{K+1}\gamma _j(\alpha _K-\alpha _{j-2})\\= & {} \gamma _2+3\sum _{j=3}^K\gamma _j(\alpha _K-\alpha _{j-2})-3\sum _{j=3}^K\gamma _j(\alpha _K-\alpha _{K-1})=\gamma _2+3\sum _{j=3}^K\gamma _j(\alpha _{K-1}-\alpha _{j-2}).\nonumber \end{aligned}$$
(7.16)

Since (7.12) coincides with (7.15) and (7.13) coincides with (7.16), the proposition also holds for \(x\ge \alpha _K\).
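The identity can also be checked numerically. The following R fragment (not part of the original text; the knots, coefficients, and grid are arbitrary choices) verifies that the truncated-power form under constraint (7.3) and the basis representation \(\sum _{j=1}^K\gamma _j h_j(x)\) agree for \(x\le \alpha _K\), and that the latter is linear for \(x\ge \alpha _K\):

pos <- function(z) pmax(z, 0)                               # (z)_+
alpha <- c(-2, -1, 0, 1, 2); K <- length(alpha)             # K = 5 arbitrary knots
set.seed(1); beta <- rnorm(K)                               # beta_1, ..., beta_K
beta.K1 <- -sum((alpha[K] - alpha[1:(K - 2)]) /
                (alpha[K] - alpha[K - 1]) * beta[3:K])      # condition (7.3)
f <- function(x) beta[1] + beta[2] * x +
  sum(c(beta[3:K], beta.K1) * pos(x - alpha[1:(K - 1)])^3)  # truncated-power form
d <- function(j, x) (pos(x - alpha[j])^3 - pos(x - alpha[K])^3) / (alpha[K] - alpha[j])
h <- function(j, x) if (j == 1) 1 else if (j == 2) x else d(j - 2, x) - d(K - 1, x)
gamma <- c(beta[1:2], (alpha[K] - alpha[1:(K - 2)]) * beta[3:K])
g <- function(x) sum(gamma * sapply(1:K, h, x = x))         # sum_j gamma_j h_j(x)
xs <- seq(-4, alpha[K], by = 0.1)
max(abs(sapply(xs, f) - sapply(xs, g)))                     # numerically zero
xs2 <- seq(alpha[K], 4, by = 0.1)
max(abs(diff(sapply(xs2, g), differences = 2)))             # ~ 0: g is a line beyond alpha_K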

Proposition 21

(Green and Silverman, 1994) The natural spline  f with knots \(x_1,\ldots , x_N\) minimizes L(f).

Proof

Let f(x) be an arbitrary function that minimizes (7.5), let g(x) be a natural spline with knots \(x_1,\ldots , x_N\), and let \(r(x):=f(x)-g(x)\). Since the space of natural splines with these knots has dimension N, we can determine the coefficients \(\gamma _1,\ldots ,\gamma _N\) of the basis functions \(h_1(x),\ldots ,h_N(x)\) in \(g(x)=\sum _{i=1}^N\gamma _ih_i(x)\) such that

$$g(x_1)=f(x_1),\ldots , g(x_N)=f(x_N).$$

In fact, we can solve the following linear equation.

$$\begin{aligned} \left[ \begin{array}{c@{\quad }c@{\quad }c} h_1(x_1)&{}\cdots &{}h_N(x_1)\\ \vdots &{}\ddots &{}\vdots \\ h_1(x_N)&{}\cdots &{}h_N(x_N)\\ \end{array} \right] \left[ \begin{array}{c} \gamma _1\\ \vdots \\ \gamma _N\\ \end{array} \right] = \left[ \begin{array}{c} f(x_1)\\ \vdots \\ f(x_N)\\ \end{array} \right] \ . \end{aligned}$$

Then, note that \(r(x_1)=\cdots =r(x_N)=0\) and that g(x) is a line for \(x\le x_1\) and \(x_N\le x\) and a cubic polynomial inside, which means that \(g'''(x)\) is a constant, say \(\gamma _i\), on each interval \([x_i, x_{i+1}]\) and that \(g''(x_1)=g''(x_N)=0\). Thus, we have

$$\int _{x_1}^{x_N} g''(x)r''(x)dx=[g''(x)r'(x)]_{x_1}^{x_N}-\int _{x_1}^{x_N} g'''(x)r'(x)dx=-\sum _{i=1}^{N-1} \gamma _i[r(x)]_{x_{i}}^{x_{i+1}}=0\ .$$

Hence, we have

$$\begin{aligned} \int _{-\infty }^{\infty }\{f''(x)\}^2dx\ge & {} \int _{x_1}^{x_N} \{g''(x)+r''(x)\}^2dx\\ = & {} \int _{x_1}^{x_N}\{g''(x)\}^2dx + \int _{x_1}^{x_N}\{r''(x)\}^2dx + 2\int _{x_1}^{x_N}g''(x)r''(x)dx\\ \ge & {} \int _{x_1}^{x_N}\{g''(x)\}^2dx \ , \end{aligned}$$

which means that for each function f that minimizes \(L(\cdot )\) in (7.5), there exists a natural spline g such that

$$\begin{aligned} L(f)= & {} \sum _{i=1}^N(y_i-f(x_i))^2+\lambda \int _{-\infty }^\infty \{f''(x)\}^2dx\\\ge & {} \sum _{i=1}^N(y_i-g(x_i))^2+\lambda \int _{-\infty }^\infty \{g''(x)\}^2dx=L(g)\ . \end{aligned}$$

Proposition 22

The elements \(g_{i,j}\) defined in (7.6) are given by

$$\begin{aligned}&g_{i,j}\\&=\frac{\displaystyle (x_{N-1}-x_{j-2})^2\left( 12x_{N-1}+6x_{j-2}-18x_{i-2}\right) +12(x_{N-1}-x_{i-2})(x_{N-1}-x_{j-2})(x_{N}-x_{N-1}) }{(x_N-x_{i-2})(x_N-x_{j-2})}\ , \end{aligned}$$

where \(x_i\le x_j\) and \(g_{i,j}=0\) for either \(i\le 2\) or \(j\le 2\).

Proof

Without loss of generality, we may assume \(x_i\le x_j\). Then, we have

$$\begin{aligned} \int _{x_{1}}^{x_N}h_i''(x)h_j''(x)dx= & {} \int _{x_{j-2}}^{x_N}h_i''(x)h_j''(x)dx\nonumber \\= & {} \int _{x_{j-2}}^{x_{N-1}}h_i''(x)h_j''(x)dx+\int _{x_{N-1}}^{x_N}h_i''(x)h_j''(x)dx\ , \end{aligned}$$
(7.17)

where we have used \(h_i''(x)=0\) for \(x\le x_{i-2}\) and \(h_j''(x)=0\) for \(x\le x_{j-2}\). The two terms on the right-hand side can be computed as follows. The second term is

$$\begin{aligned} \int _{x_{N-1}}^{x_N}h_i''(x)h_j''(x)dx= & {} 36\int _{x_{N-1}}^{x_N}\left( \frac{x-x_{i-2}}{x_N-x_{i-2}}-\frac{x-x_{N-1}}{x_N-x_{N-1}}\right) \left( \frac{x-x_{j-2}}{x_N-x_{j-2}}-\frac{x-x_{N-1}}{x_N-x_{N-1}}\right) dx\nonumber \\= & {} 36\frac{(x_{N-1}-x_{i-2})(x_{N-1}-x_{j-2})}{(x_N-x_{i-2})(x_N-x_{j-2})}\int _{x_{N-1}}^{x_N}\left( \frac{x-x_{N}}{x_N-x_{N-1}}\right) ^2dx\nonumber \\= & {} 12 \frac{(x_{N-1}-x_{i-2})(x_{N-1}-x_{j-2})(x_{N}-x_{N-1})}{(x_N-x_{i-2})(x_N-x_{j-2})}\ , \end{aligned}$$
(7.18)

where the second equality is obtained via the following equations:

$$(x-x_{i-2})(x_N-x_{N-1})-(x-x_{N-1})(x_N-x_{i-2})=(x-x_N)(x_{N-1}-x_{i-2})$$
$$(x-x_{j-2})(x_N-x_{N-1})-(x-x_{N-1})(x_N-x_{j-2})=(x-x_N)(x_{N-1}-x_{j-2})\ .$$

For the first term of (7.17), we have

$$\begin{aligned}&\int _{x_{j-2}}^{x_{N-1}}h_i''(x)h_j''(x)dx =36\int _{x_{j-2}}^{x_{N-1}}\frac{x-x_{i-2}}{x_N-x_{i-2}}\cdot \frac{x-x_{j-2}}{x_N-x_{j-2}}dx\nonumber \\&=36\frac{x_{N-1}-x_{j-2}}{(x_N-x_{i-2})(x_N-x_{j-2})}\nonumber \\&\qquad \times \bigg \{\frac{1}{3}(x_{N-1}^2+x_{N-1}x_{j-2}+x_{j-2}^2)-\frac{1}{2}(x_{N-1}+x_{j-2})(x_{i-2}+x_{j-2}) +x_{i-2}x_{j-2}\bigg \}\nonumber \\&=36\frac{x_{N-1}-x_{j-2}}{(x_N-x_{i-2})(x_N-x_{j-2})}\left\{ \frac{1}{3}x_{N-1}^2-\frac{1}{6}x_{N-1}x_{j-2}-\frac{1}{6}x_{j-2}^2-\frac{1}{2}x_{i-2}(x_{N-1}-x_{j-2})\right\} \nonumber \\&=\frac{(x_{N-1}-x_{j-2})^2}{(x_N-x_{i-2})(x_N-x_{j-2})}\left( 12x_{N-1}+6x_{j-2}-18x_{i-2}\right) \ , \end{aligned}$$
(7.19)

where to obtain the last equality in (7.19), we used

$$\begin{aligned}&\frac{1}{3}x_{N-1}^2-\frac{1}{6}(x_{j-2}+3x_{i-2})x_{N-1}-\frac{1}{6}x_{j-2}(x_{j-2}-3x_{i-2})\\= & {} (x_{N-1}-x_{j-2})\left( \frac{1}{3}x_{N-1}+\frac{1}{6}x_{j-2}-\frac{1}{2}x_{i-2}\right) . \end{aligned}$$
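As a quick sanity check (not part of the original text), the closed form can be compared against numerical integration in R; the knots and the pair (i, j) below are arbitrary choices:

x <- c(0.3, 0.9, 1.7, 2.2, 3.1, 4.0); N <- length(x)         # knots x_1 < ... < x_N
pos <- function(z) pmax(z, 0)
h.dd <- function(i, t) {                                      # second derivative of h_i on [x_1, x_N]
  if (i <= 2) return(0 * t)
  6 * (pos(t - x[i - 2]) / (x[N] - x[i - 2]) - pos(t - x[N - 1]) / (x[N] - x[N - 1]))
}
g.formula <- function(i, j) {                                 # the closed form, assuming i <= j
  if (i <= 2 || j <= 2) return(0)
  ((x[N - 1] - x[j - 2])^2 * (12 * x[N - 1] + 6 * x[j - 2] - 18 * x[i - 2]) +
   12 * (x[N - 1] - x[i - 2]) * (x[N - 1] - x[j - 2]) * (x[N] - x[N - 1])) /
  ((x[N] - x[i - 2]) * (x[N] - x[j - 2]))
}
i <- 3; j <- 5
c(numerical = integrate(function(t) h.dd(i, t) * h.dd(j, t), x[1], x[N])$value,
  closed.form = g.formula(i, j))                              # the two values should agree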

Exercises 57–68

  57.

    For each of the following two quantities, given data \((x_1,y_1),\ldots ,(x_N,y_N)\in {\mathbb R}\times {\mathbb R}\), find a condition under which the \(\beta _0,\beta _1,\ldots ,\beta _p\) that minimize it are unique, and give the solution:

    (a)

      \(\displaystyle \sum _{i=1}^N\left( y_i-\sum _{j=0}^p \beta _jx_i^j\right) ^2\)

    (b)

      \(\displaystyle \sum _{i=1}^N\left( y_i-\sum _{j=0}^p \beta _jf_j(x_i)\right) ^2\), \(f_0(x)=1\), \(x\in {\mathbb R}\), \(f_j: {\mathbb R}\rightarrow {\mathbb R}\), \(j=1,\ldots ,p\)

  58.

    For \(K\ge 1\) and \(-\infty =\alpha _0<\alpha _1<\cdots<\alpha _K<\alpha _{K+1}=\infty \), we define a cubic polynomial \(f_{i}(x)\) for \(\alpha _i\le x\le \alpha _{i+1}\), \(i=0,1,\ldots ,K\), and assume that \(f_i\), \(i=0,1,\ldots ,K\) satisfy \(f_{i-1}^{(j)}(\alpha _{i})=f_i^{(j)}(\alpha _i)\), \(j=0,1,2\), \(i=1,\ldots ,K\), where \(f^{(0)}(\alpha ),f^{(1)}(\alpha ),f^{(2)}(\alpha )\) denote the value and the first and second derivatives of f at \(x=\alpha \).

    (a)

      Show that there exists \(\gamma _i\) such that \(f_{i}(x)=f_{i-1}(x)+\gamma _i (x-\alpha _i)^3\).

    (b)

      Consider a piecewise cubic polynomial \(f(x)=f_i(x)\) for \(\alpha _{i}\le x\le \alpha _{i+1}\) \(i=0,1,\ldots ,K\) (spline curve). Show that there exist \(\beta _1,\beta _2,\ldots ,\beta _{K+4}\) such that

      $$f(x)=\beta _1+\beta _2x+\beta _3x^2+\beta _4x^3 +\sum _{i=1}^{K}\beta _{i+4} (x-\alpha _i)_+^3\ ,$$

      where \((x-\alpha _i)_+\) denotes the function that takes the value \(x-\alpha _i\) for \(x>\alpha _i\) and zero for \(x\le \alpha _i\).

  59.

    We generate artificial data and execute spline regression for \(K=5,7,9\) knots. Define the following function f and draw spline curves.

    [The book's R listing appears here as a figure.]
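    A possible sketch (not the book's listing; the data-generating function, the sample size, and the plotting details are assumptions):

    set.seed(123)
    n <- 100
    x <- runif(n, -5, 5)
    y <- sin(x) + rnorm(n, sd = 0.3)                            # artificial data
    plot(x, y, col = "gray")
    cols <- c("red", "blue", "green")
    for (k in 1:3) {
      K <- c(5, 7, 9)[k]
      knots <- seq(-4, 4, length.out = K)
      X <- cbind(1, x, x^2, x^3,
                 sapply(knots, function(a) pmax(x - a, 0)^3))   # cubic spline basis
      beta <- solve(t(X) %*% X, t(X) %*% y)                     # least squares
      u <- seq(-5, 5, length.out = 200)
      U <- cbind(1, u, u^2, u^3,
                 sapply(knots, function(a) pmax(u - a, 0)^3))
      lines(u, U %*% beta, col = cols[k])
    }
    legend("topright", legend = paste("K =", c(5, 7, 9)), col = cols, lty = 1)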
  60.

    For \(K\ge 2\), we define the following cubic spline curve g (natural spline): it is a line for \(x\le \alpha _1\) and \(\alpha _{K}\le x\) and a cubic polynomial for \(\alpha _{i}\le x\le \alpha _{i+1}\), \(i=1,\ldots ,K-1\), where the values and the first and second derivatives coincide on both sides of the K knots \(\alpha _1,\ldots ,\alpha _K\).

    (a)

      Show that \(\displaystyle \gamma _{K+1}=-\sum _{j=3}^K\gamma _j\) when

      $$g(x)=\gamma _1+\gamma _2x+\gamma _3\frac{(x-\alpha _1)^3}{\alpha _K-\alpha _1}+\cdots +\gamma _K\frac{(x-\alpha _{K-2})^3}{\alpha _K-\alpha _{K-2}} +\gamma _{K+1}\frac{(x-\alpha _{K-1})^3}{\alpha _K-\alpha _{K-1}}$$

      for \(\alpha _{K-1}\le x\le \alpha _{K}\). Hint: Derive the result from \(g''(\alpha _K)=0\).

    (b)

      g(x) can be written as \(\displaystyle \sum _{i=1}^K\gamma _ih_i(x)\) with \(\gamma _1,\ldots ,\gamma _K\in {\mathbb R}\) and the functions \(h_1(x)=1\), \(h_2(x)=x\), \(h_{j+2}(x)=d_{j}(x)-d_{K-1}(x)\), \(j=1,\ldots ,K-2\), where

      $$d_j(x)=\frac{(x-\alpha _j)^3_+-(x-\alpha _K)_+^3}{\alpha _K-\alpha _j} ,\ j=1,\ldots ,K-1\ .$$

      Show that

      $$h_{j+2}(x)=(\alpha _{K-1}-\alpha _j)(3x-\alpha _j-\alpha _{K-1}-\alpha _K) ,\ j=1,\ldots ,K-2$$

      for each \(\alpha _K\le x\).

    (c)

      Show that g(x) is a linear function of x for \(x\le \alpha _1\) and for \(\alpha _K\le x\).

  61.

    We compare the ordinary and natural spline functions. Define the functions \(h_1,\ldots ,h_K\), \(d_1,\ldots ,d_{K-1}\), and g, and execute the following (a possible sketch is given after the hint).

    [The book's R listing appears here as a figure.]

    Hint: The functions h and d need to compute the number K of knots. Inside the function g, the knots may be treated as a global variable.
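    A possible sketch under these hints (not the book's listing; the knots and the artificial data are assumptions):

    d <- function(j, x, knots) {                             # d_j of Proposition 20
      K <- length(knots)
      (pmax(x - knots[j], 0)^3 - pmax(x - knots[K], 0)^3) / (knots[K] - knots[j])
    }
    h <- function(j, x, knots) {                             # natural spline basis h_j
      K <- length(knots)
      if (j == 1) return(rep(1, length(x)))
      if (j == 2) return(x)
      d(j - 2, x, knots) - d(K - 1, x, knots)
    }
    g <- function(x, gamma) {                                # knots is looked up as a global variable
      K <- length(knots)
      drop(sapply(1:K, function(j) h(j, x, knots)) %*% gamma)
    }
    set.seed(1)
    n <- 100; x <- runif(n, -5, 5); y <- x - 0.02 * x^3 + rnorm(n)
    knots <- seq(-4, 4, length.out = 6); K <- length(knots)
    X <- sapply(1:K, function(j) h(j, x, knots))             # natural spline design matrix
    gamma <- solve(t(X) %*% X, t(X) %*% y)
    Z <- cbind(1, x, x^2, x^3, sapply(knots, function(a) pmax(x - a, 0)^3))
    beta <- solve(t(Z) %*% Z, t(Z) %*% y)                    # ordinary cubic spline
    u <- seq(-8, 8, length.out = 200)
    plot(x, y, col = "gray", xlim = c(-8, 8))
    lines(u, g(u, gamma), col = "blue")                      # natural spline: linear beyond the knots
    lines(u, cbind(1, u, u^2, u^3,
                   sapply(knots, function(a) pmax(u - a, 0)^3)) %*% beta, col = "red")
    legend("topleft", c("natural spline", "ordinary spline"), col = c("blue", "red"), lty = 1)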

  62.

    We wish to prove that for an arbitrary \(\lambda \ge 0\), a function \(f: {\mathbb R}\rightarrow {\mathbb R}\) that minimizes

    $$\begin{aligned} RSS(f,\lambda ):=\sum _{i=1}^N (y_i-f(x_i))^2+\lambda \int _{-\infty }^{\infty }\{f''(t)\}^2dt \end{aligned}$$
    (7.20)

    given data \((x_1,y_1),\ldots ,(x_N,y_N)\in {\mathbb R}\times {\mathbb R}\) can be found among the natural splines g with knots \(x_1<\cdots <x_N\) (the smoothing spline).

    (a)

      Show that there exist \(\gamma _1,\ldots ,\gamma _{N-1}\in {\mathbb R}\) such that

      $$\int _{x_1}^{x_N} g''(x)h''(x)dx=-\sum _{i=1}^{N-1}\gamma _i\{h(x_{i+1})-h(x_i)\}.$$

      Hint: Use the facts that \(g''(x_1)=g''(x_N)=0\) and that the third derivative of g is constant for \(x_i\le x\le x_{i+1}\).

    (b)

      Show that if the function \(h: {\mathbb R} \rightarrow {\mathbb R}\) satisfies

      $$\begin{aligned} \int _{x_1}^{x_N}g''(x)h''(x)dx=0\ , \end{aligned}$$
      (7.21)

      then for any \(f(x)=g(x)+h(x)\), we have

      $$\begin{aligned} \int _{-\infty }^{\infty }\{g''(x)\}^2dx \le \int _{-\infty }^{\infty } \{f''(x)\}^2dx\ . \end{aligned}$$
      (7.22)

      Hint: For \(x\le x_1\) and \(x_N\le x\), g(x) is a linear function and \(g''(x)=0\). Moreover, (7.21) implies

      $$\int _{x_1}^{x_N}\{g''(x)+h''(x)\}^2dx=\int _{x_1}^{x_N}\{g''(x)\}^2dx + \int _{x_1}^{x_N}\{h''(x)\}^2dx\ .$$
    (c)

      Show that a natural spline curve g is contained in the set of functions \(f: {\mathbb R}\rightarrow {\mathbb R}\) that minimize (7.20). Hint: Show that if \(RSS(f,\lambda )\) attains the minimum value, then \(h(x_i)=0\), \(i=1,\ldots ,N\) implies (7.21) for the natural spline g such that \(g(x_i)=f(x_i)\), \(i=1,\ldots ,N\).

  63.

    It is known that \(\displaystyle g_{i,j}:=\int _{-\infty }^{\infty } h_i''(x)h_j''(x)dx\) is given by

    $$\frac{\displaystyle (x_{K-1}-x_{j-2})^2\left( 12x_{K-1}-18x_{i-2}+6x_{j-2}\right) +12(x_{K-1}-x_{i-2})(x_{K-1}-x_{j-2})(x_{K}-x_{K-1}) }{(x_K-x_{i-2})(x_K-x_{j-2})}\ ,$$

    where \(h_1,\ldots ,h_K\) is the natural spline basis with the knots \(x_1<\cdots <x_K\) and \(g_{i,j}=0\) for either \(i\le 2\) or \(j\le 2\). Write an R function G that outputs the matrix G with elements \(g_{i,j}\) from the K knots \(x\in {\mathbb R}^{K}\) (a possible sketch follows).
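    One possible implementation (a sketch, not the book's answer):

    G <- function(x) {                         # x: the vector of knots
      n <- length(x)
      g <- matrix(0, n, n)
      for (i in 3:n) for (j in i:n) {          # the closed form assumes i <= j; symmetrize
        g[i, j] <- ((x[n - 1] - x[j - 2])^2 *
                      (12 * x[n - 1] - 18 * x[i - 2] + 6 * x[j - 2]) +
                    12 * (x[n - 1] - x[i - 2]) * (x[n - 1] - x[j - 2]) *
                      (x[n] - x[n - 1])) /
                   ((x[n] - x[i - 2]) * (x[n] - x[j - 2]))
        g[j, i] <- g[i, j]
      }
      g
    }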

  64.

    We assume that there exist \(\gamma _1,\ldots ,\gamma _N\in {\mathbb R}\) such that \(\displaystyle g(x)=\sum _{j=1}^{N}g_{j}(x)\gamma _j\) and \(\displaystyle g''(x)=\sum _{j=1}^{N}g_{j}''(x)\gamma _j\) for a smoothing spline function g with knots \(x_1<\cdots <x_N\), where \(g_{j}\), \(j=1,\ldots ,N\) are piecewise cubic basis functions. Show that the coefficients \(\gamma =[\gamma _1,\ldots ,\gamma _N]^T\in {\mathbb R}^N\) can be expressed as \(\gamma =(G^TG+\lambda G'')^{-1}G^Ty\) with \(G=(g_{j}(x_i))\in {\mathbb R}^{N\times N}\) and \(\displaystyle G''=\left( \int _{-\infty }^\infty g_{j}''(x)g_{k}''(x)dx\right) \in {\mathbb R}^{N\times N}\). Moreover, we wish to compute \(\hat{\gamma }\) for each \(\lambda \) and draw the corresponding smoothing spline curve. Fill in the blanks and execute the procedure.

    [The book's R listing (with blanks to fill in) appears here as a figure.]
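    A possible sketch of the procedure (not the book's listing), reusing h from the Exercise 61 sketch and G from the Exercise 63 sketch; the artificial data and the values of \(\lambda \) are assumptions:

    set.seed(2)
    n <- 100
    x <- sort(runif(n, -5, 5)); y <- x + 2 * sin(x) + rnorm(n)
    knots <- x                                               # the knots are the data points
    X <- sapply(1:n, function(j) h(j, x, knots))             # G = (g_j(x_i)) in the notation above
    GG <- G(x)                                               # G'' = (int g_j'' g_k'' dx)
    plot(x, y, col = "gray")
    u <- seq(-5, 5, length.out = 200)
    cols <- c("red", "blue", "green"); lambdas <- c(1, 30, 1000)
    for (k in 1:3) {
      # (G^T G + lambda G'')^{-1} G^T y; as the chapter's footnote notes,
      # this direct inversion becomes infeasible for much larger N
      gamma <- solve(t(X) %*% X + lambdas[k] * GG, t(X) %*% y)
      lines(u, drop(sapply(1:n, function(j) h(j, u, knots)) %*% gamma), col = cols[k])
    }
    legend("topleft", legend = paste("lambda =", lambdas), col = cols, lty = 1)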
  65.

    Because the scale of \(\lambda \) depends on the setting, it is difficult to judge from \(\lambda \) itself how strongly it constrains the estimate of \(\gamma \). To this end, we often use the effective degrees of freedom, the trace of \(H[\lambda ]:=X(X^TX+\lambda G)^{-1}X^T\), instead of \(\lambda \) to evaluate the balance between fit and simplicity. For \(N=100\) and \(\lambda \) ranging from 1 to 50, we draw the graph of the effective degrees of freedom (the trace of \(H[\lambda ]\)) and the CV prediction error \(CV[\lambda ]\). Fill in the blanks and execute the procedure.

    [The book's R listing (with blanks to fill in) appears here as a figure.]
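    A possible sketch (not the book's listing), continuing from the Exercise 64 sketch (X, GG, y); the leave-one-out shortcut for linear smoothers is used for \(CV[\lambda ]\):

    lambdas <- 1:50
    edf <- cv <- numeric(length(lambdas))
    for (k in seq_along(lambdas)) {
      H <- X %*% solve(t(X) %*% X + lambdas[k] * GG, t(X))   # smoother matrix H[lambda]
      yhat <- H %*% y
      edf[k] <- sum(diag(H))                                 # effective degrees of freedom
      cv[k] <- mean(((y - yhat) / (1 - diag(H)))^2)          # leave-one-out CV shortcut
    }
    par(mfrow = c(1, 2))
    plot(lambdas, edf, type = "l", xlab = "lambda", ylab = "effective degrees of freedom")
    plot(edf, cv, type = "l", xlab = "effective degrees of freedom", ylab = "CV error")
    par(mfrow = c(1, 1))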
  66.

    Using the Nadaraya-Watson estimator

    $$\hat{f}(x)=\frac{\sum _{i=1}^N K_\lambda (x,x_i)y_i}{\sum _{i=1}^N K_\lambda (x,x_i)}$$

    with \(\lambda >0\) and the following kernel

    $$\begin{aligned} K_\lambda (x,y)= & {} D\left( \frac{|x-y|}{\lambda }\right) \\ D(t)= & {} \left\{ \begin{array}{l@{\quad }l} \displaystyle \frac{3}{4}(1-t^2),&{}|t|\le 1\\ \displaystyle 0,&{}\mathrm{Otherwise} \end{array} \right. \ , \end{aligned}$$

    we draw a curve that fits the \(n=250\) data points. Fill in the blanks and execute the procedure. When \(\lambda \) is small, how does the curve change?

    [The book's R listing (with blanks to fill in) appears here as a figure.]
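    A possible sketch (not the book's listing); the data-generating model and the bandwidth values are assumptions:

    set.seed(3)
    n <- 250
    x <- runif(n, -3, 3); y <- sin(2 * x) + rnorm(n) / 3        # artificial data
    D <- function(t) ifelse(abs(t) <= 1, 0.75 * (1 - t^2), 0)   # Epanechnikov kernel
    K <- function(x, y, lambda) D(abs(x - y) / lambda)
    f.hat <- function(z, lambda) sum(K(z, x, lambda) * y) / sum(K(z, x, lambda))
    plot(x, y, col = "gray")
    u <- seq(-3, 3, length.out = 200)
    cols <- c("red", "blue", "green"); lams <- c(0.1, 0.25, 1)
    for (k in 1:3) lines(u, sapply(u, f.hat, lambda = lams[k]), col = cols[k])
    legend("topright", legend = paste("lambda =", lams), col = cols, lty = 1)
    # the smaller lambda is, the more wiggly the curve becomes (it follows the data locally)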
  67.

    Let K be a kernel. We can obtain the predictive value \([1,x] \beta (x)\) for each \(x \in {\mathbb R}^p\) using the \(\beta (x)\in {\mathbb R}^{p+1}\) that minimizes

    $$\sum _{i=1}^NK(x,x_i)(y_i-\beta (x)^T[1,x_i])^2$$

    (local regression).

    (a)

      When we write \(\beta (x)=(X^TW(x)X)^{-1}X^TW(x)y\), what is the matrix W?

    (b)

      Using the same kernel as in Problem 66 with \(p=1\), we apply local regression to the data \(x_1,\ldots ,x_N\), \(y_1,\ldots ,y_N\). Fill in the blanks and execute the procedure.

      [The book's R listing (with blanks to fill in) appears here as a figure.]
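      A possible sketch (not the book's listing), reusing the kernel K from the Exercise 66 sketch; here W(x) is the diagonal matrix with entries \(K(x,x_i)\), applied below through elementwise multiplication:

      local.regression <- function(x, y, z, lambda = 0.5) {  # fitted values at the points z
        X <- cbind(1, x)
        sapply(z, function(z0) {
          w <- K(z0, x, lambda)                               # diagonal entries of W(z0)
          beta <- solve(t(X) %*% (w * X), t(X) %*% (w * y))   # (X^T W X)^{-1} X^T W y
          drop(c(1, z0) %*% beta)
        })
      }
      set.seed(4)
      n <- 30; x <- runif(n, -1, 1); y <- x^3 - x + rnorm(n) / 5
      plot(x, y)
      u <- seq(-1, 1, length.out = 100)
      lines(u, local.regression(x, y, u, lambda = 0.5), col = "red")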
  68.

    If the number of basis functions is finite, the coefficients can be obtained via least squares in the same manner as in linear regression. However, when the number of basis functions is large, as for the smoothing spline, computing the inverse matrix becomes difficult. Moreover, local regression, for example, cannot be expressed by a finite number of basis functions. In such cases, a method called back-fitting is often applied. To decompose the function into the sum of a polynomial regression and a local regression, we constructed the following procedure. Fill in the blanks and execute the process.

    [The book's R listing (with blanks to fill in) appears here as a figure.]
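    A possible sketch of the back-fitting loop (not the book's listing), reusing local.regression from the Exercise 67 sketch; the data-generating model, number of iterations, and bandwidth are assumptions:

    poly.fit <- function(x, y, u) {                          # cubic polynomial regression, fitted at u
      X <- cbind(1, x, x^2, x^3); U <- cbind(1, u, u^2, u^3)
      drop(U %*% solve(t(X) %*% X, t(X) %*% y))
    }
    set.seed(5)
    n <- 100; x <- runif(n, -2, 2); y <- 0.5 * x^2 + sin(3 * x) + rnorm(n) / 4
    f1 <- f2 <- rep(0, n)
    for (iter in 1:10) {                                     # back-fitting iterations
      f1 <- poly.fit(x, y - f2, x)                           # refit the polynomial part to residuals
      f1 <- f1 - mean(f1)                                    # centering keeps the split identifiable
      f2 <- local.regression(x, y - f1, x, lambda = 0.3)     # refit the local-regression part
    }
    ord <- order(x)
    par(mfrow = c(1, 2))
    plot(x[ord], f1[ord], type = "l", xlab = "x", ylab = "polynomial component")
    plot(x[ord], f2[ord], type = "l", xlab = "x", ylab = "local regression component")
    par(mfrow = c(1, 1))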
