Proofs of Sect. 3
Proof for the one-variable case
Proof of Theorem 3.1
We have to solve the system
$$\begin{aligned} \left\{ \begin{array}{ll}S(\varvec{\vartheta }) = 0\\ \varvec{R}\varvec{\vartheta }=0. \end{array}\right. \end{aligned}$$
(23)
The system \(S(\varvec{\vartheta })=0\) is
$$\begin{aligned} \left\{ \begin{array}{ll}\displaystyle \sum _{i=1}^n\ell '(\eta _i)\left( y_i - b'\circ \ell (\eta _i)\right) = 0\\ \displaystyle \sum _{i=1}^nx_i^{(2),j}\ell '(\eta _i)\left( y_i - b'\circ \ell (\eta _i)\right) = 0,\quad \forall j\in J. \end{array}\right. \end{aligned}$$
that is
$$\begin{aligned} \left\{ \begin{array}{ll}\displaystyle \sum _{j\in J}\ell '(\vartheta _{(1)}+\vartheta _{(2),j})\left( \sum _{i=1}^nx_i^{(2),j}y_i - m_jb'\circ \ell (\vartheta _{(1)} + \vartheta _{(2),j})\right) = 0\\ \ell '(\vartheta _{(1)}+\vartheta _{(2),j})\left( \displaystyle \sum _{i=1}^nx_i^{(2),j}y_i - m_jb'\circ \ell (\vartheta _{(1)} + \vartheta _{(2),j})\right) = 0,\quad \forall j\in J. \end{array}\right. \end{aligned}$$
The first equation in the previous system is redundant, and
$$\begin{aligned} S(\varvec{\vartheta }) = 0 \Leftrightarrow \ell '(\vartheta _{(1)}+\vartheta _{(2),j})\left( \sum _{i=1}^nx_i^{(2),j}y_i - m_jb'\circ \ell (\vartheta _{(1)} + \vartheta _{(2),j})\right) = 0,\quad \forall j\in J. \end{aligned}$$
Hence, if \(Y_i\) takes values in \(\mathbb {Y}\subset b'(\varLambda )\) and \(\ell \) is injective, we have
$$\begin{aligned} \vartheta _{(1)}+\vartheta _{(j)} = g(\overline{Y}_n^{(j)})\quad \forall j\in J. \end{aligned}$$
The system (23) is
$$\begin{aligned} \left\{ \begin{array}{ll}\varvec{Q}\varvec{\vartheta }= \varvec{g({\bar{Y}})}\\ \varvec{R}\varvec{\vartheta }=0. \end{array}\right. \Leftrightarrow \left( \begin{array}{c} \varvec{Q} \\ \varvec{R}\end{array}\right) \varvec{\vartheta }=\left( \begin{array}{c}\varvec{g({\bar{Y}})}\\ 0\end{array}\right) . \end{aligned}$$
(24)
Let us compute the determinant of the matrix \(M_d = \left( \begin{array}{c} \varvec{Q} \\ \varvec{R}\end{array}\right) \). Consider \(\varvec{R} = (r_0,r_1,\ldots ,r_d)\). We have
$$\begin{aligned} M_d = \left( \begin{array}{c@{\quad }c} \varvec{1}_d &{} I_d \\ r_0 &{} \varvec{r} \end{array}\right) = \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} \dots \\ 1 &{} 0 &{} 1 &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{} \ddots \\ 1 &{} 0 &{} \dots &{} 0 &{} 1 \\ r_0 &{} r_1 &{} &{} \dots &{} r_d \\ \end{array}\right) , \text { with } \varvec{r}= \left( \begin{array}{c@{\quad }cc} r_1 &{} \dots &{} r_d \\ \end{array}\right) , \varvec{1}_d = \left( \begin{array}{c} 1 \\ \vdots \\ 1 \end{array}\right) . \end{aligned}$$
The determinant can be computed recursively
$$\begin{aligned} \det (M_d) = r_d \left| \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} \dots \\ 1 &{} 0 &{} \ddots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} 1 \\ 1 &{} 0 &{} \dots &{} 0 \\ \end{array}\right| - \left| \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1 &{} 1 &{} 0 &{} \dots \\ 1 &{} 0 &{} \ddots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} 1 \\ r_0 &{} r_1 &{} \dots &{} r_{d-1} \\ \end{array}\right| = (-1)^{d+1} r_d - \det (M_{d-1}). \end{aligned}$$
Since \( \det (M_1) = -r_0+ r_1 \) and \( \det (M_2) = -r_2 -(-r_0 +r_1) = r_0 - r_1 -r_2, \) we get \(\det (M_d) = (-1)^d r_0+ (-1)^{d+1}(r_1+\dots + r_d) =(-1)^d( r_0 - r_1-\dots -r_d)\). This determinant is nonzero as long as \(r_0 \ne \sum _{j=1}^d r_j\).
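As a quick sanity check of this closed form, the identity can be verified symbolically for small d. The following sketch is illustrative only (it is not part of the proof); it builds \(M_d\) from \(\varvec{Q}=(\varvec{1}_d\ I_d)\) and an arbitrary contrast row, assuming SymPy is available.

```python
# Illustrative sketch only: verify det(M_d) = (-1)^d (r_0 - r_1 - ... - r_d) for small d.
import sympy as sp

for d in range(1, 6):
    r = sp.symbols(f"r0:{d + 1}")           # symbols r0, r1, ..., rd
    Q = sp.ones(d, 1).row_join(sp.eye(d))   # Q = (1_d  I_d), size d x (d+1)
    M = Q.col_join(sp.Matrix([list(r)]))    # M_d = (Q over R), with R = (r_0, ..., r_d)
    expected = (-1) ** d * (r[0] - sum(r[1:]))
    assert sp.simplify(M.det() - expected) == 0
```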
Now we compute the inverse of the matrix \(M_d\) by direct inversion.
$$\begin{aligned} \left( \begin{array}{c@{\quad }c} \varvec{1}_d &{} I_d \\ r_0 &{} \varvec{r} \end{array}\right) \left( \begin{array}{c@{\quad }c} \varvec{a}' &{} b \\ C &{} \varvec{d} \end{array}\right) = \left( \begin{array}{c@{\quad }c} I_d &{} \varvec{0} \\ \varvec{0}' &{} 1 \end{array}\right) \Leftrightarrow \left\{ \begin{array}{ll}\varvec{1}_d \varvec{a}' + I_d C = I_d \\ b \varvec{1}_d + I_d \varvec{d} = \varvec{0} \\ r_0 \varvec{a}' + \varvec{r} C = \varvec{0}' \\ b r_0 + \varvec{r} \varvec{d} = 1 \end{array}\right. \Leftrightarrow \left\{ \begin{array}{ll}C = I_d - \frac{1}{-r_0 + \varvec{r} \varvec{1}_d}\varvec{1}_d\varvec{r} \\ \varvec{d}= \frac{1}{-r_0+\varvec{r} \varvec{1}_d} \varvec{1}_d \\ \varvec{a}' = \frac{\varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d} \\ b = \frac{-1}{-r_0+\varvec{r} \varvec{1}_d} \\ \end{array}\right. \end{aligned}$$
Let us check the inverse of \(M_d\)
$$\begin{aligned}&\left( \begin{array}{c@{\quad }c} \varvec{1}_d &{} I_d \\ r_0 &{} \varvec{r} \end{array}\right) \left( \begin{array}{c@{\quad }c} \frac{\varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d} &{} \frac{-1}{-r_0+\varvec{r} \varvec{1}_d} \\ I_d - \frac{\varvec{1}_d\varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d} &{} \frac{\varvec{1}_d}{-r_0+\varvec{r} \varvec{1}_d} \end{array}\right) \\&\qquad = \left( \begin{array}{c@{\quad }c} \frac{\varvec{1}_d \varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d} + I_d - \frac{\varvec{1}_d\varvec{r} }{-r_0 + \varvec{r} \varvec{1}_d} &{} \frac{-\varvec{1}_d}{-r_0+\varvec{r} \varvec{1}_d} +\frac{\varvec{1}_d}{-r_0+\varvec{r} \varvec{1}_d}\\ r_0 \frac{\varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d}+ \varvec{r} - \frac{\varvec{r}\varvec{1}_d\varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d} &{} \frac{-r_0}{-r_0+\varvec{r} \varvec{1}_d}+ \frac{\varvec{r}\varvec{1}_d}{-r_0+\varvec{r} \varvec{1}_d} \end{array}\right) \\&\qquad = \left( \begin{array}{c@{\quad }c} I_d &{} 0 \\ 0 &{} 1 \end{array}\right) . \end{aligned}$$
So as long as \(r_0 \ne \sum _{j=1}^d r_j\)
$$\begin{aligned} \widehat{\varvec{\vartheta }}_n = \left( \begin{array}{c@{\quad }c} \frac{\varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d} &{} \frac{-1}{-r_0+\varvec{r} \varvec{1}_d} \\ I_d - \frac{\varvec{1}_d\varvec{r}}{-r_0 + \varvec{r} \varvec{1}_d} &{} \frac{\varvec{1}_d}{-r_0+\varvec{r} \varvec{1}_d} \end{array}\right) \left( \begin{array}{c} \varvec{g}({\varvec{{\bar{Y}}}})\\ 0 \end{array}\right) = \left( \begin{array}{c} \frac{\varvec{r} \varvec{g({\bar{Y}})}}{-r_0 + \varvec{r} \varvec{1}_d} \\ {\varvec{g({\bar{Y}})}} - \varvec{1}_d\frac{\varvec{r} \varvec{g({\bar{Y}})}}{-r_0 + \varvec{r} \varvec{1}_d} \end{array}\right) . \end{aligned}$$
Alternatively, the system (24) is equivalent to
$$\begin{aligned} (\varvec{Q}',\varvec{R}')\left( \begin{array}{c} \varvec{Q} \\ \varvec{R}\end{array}\right) \varvec{\vartheta }= \varvec{Q}' \varvec{g({\bar{Y}})}, \end{aligned}$$
and when the stacked matrix \(\left( \begin{array}{c} \varvec{Q} \\ \varvec{R}\end{array}\right) \) has full column rank, the matrix \(\varvec{Q}'\varvec{Q} + \varvec{R}'\varvec{R}\) is invertible and \( {\varvec{\vartheta }} = (\varvec{Q}'\varvec{Q} + \varvec{R}'\varvec{R})^{-1}\varvec{Q}'\varvec{g({\bar{Y}})}. \)\(\square \)
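The agreement between the block-inverse expression of \(\widehat{\varvec{\vartheta }}_n\) and this normal-equations form can also be checked numerically. The following sketch is illustrative only and not part of the proof: it takes \(\varvec{Q}=(\varvec{1}_d\ I_d)\) as above, an arbitrary contrast row satisfying \(r_0\ne \sum _j r_j\), and made-up values standing in for \(\varvec{g({\bar{Y}})}\).

```python
# Illustrative sketch: the block-inverse formula and the normal-equations form
# (Q'Q + R'R)^{-1} Q' g(Ybar) give the same estimator when r_0 != sum_j r_j.
import numpy as np

rng = np.random.default_rng(0)
d = 4
g_bar = rng.normal(size=d)                     # plays the role of (g(Ybar_n^(1)), ..., g(Ybar_n^(d)))
r0, r = 0.5, rng.normal(size=d)                # arbitrary contrast row R = (r_0, r_1, ..., r_d)
Q = np.column_stack([np.ones(d), np.eye(d)])   # Q = (1_d  I_d)
R = np.concatenate([[r0], r])

theta_ne = np.linalg.solve(Q.T @ Q + np.outer(R, R), Q.T @ g_bar)   # normal equations

denom = -r0 + r.sum()                          # block-inverse formula of the proof
theta_1 = r @ g_bar / denom
theta_bi = np.concatenate([[theta_1], g_bar - theta_1])

assert np.allclose(theta_ne, theta_bi)
```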
Examples—Choice of the contrast vector \(\varvec{R}\)
- 1.
Taking \(r_0=1, \varvec{r}=\varvec{0}\) leads to \( -r_0 + \varvec{r} \varvec{1}_d=-1 \Rightarrow \widehat{\varvec{\vartheta }}_n = \left( \begin{array}{c} 0 \\ \varvec{g({\bar{Y}})} \end{array}\right) . \)
- 2.
Taking \(r_0=0, \varvec{r}=(1,\varvec{0})\) leads to
$$\begin{aligned} -r_0 + \varvec{r} \varvec{1}_d=1 \Rightarrow \widehat{\varvec{\vartheta }}_n = \left( \begin{array}{c} g({\bar{Y}}_n^{(1)})\\ 0\\ g({\bar{Y}}_n^{(2)}) - g({\bar{Y}}_n^{(1)})\\ \vdots \\ g({\bar{Y}}_n^{(d)}) - g({\bar{Y}}_n^{(1)}) \end{array}\right) . \end{aligned}$$
- 3.
Taking \(r_0=0, \varvec{r}=\varvec{1}\) leads to
$$\begin{aligned} -r_0 + \varvec{r} \varvec{1}_d=d \Rightarrow \widehat{\varvec{\vartheta }}_n = \left( \begin{array}{c} \overline{\varvec{g({\bar{Y}})}}\\ g({\bar{Y}}_n^{(1)}) - \overline{\varvec{g({\bar{Y}})}}\\ \vdots \\ g({\bar{Y}}_n^{(d)}) - \overline{\varvec{g({\bar{Y}})}} \end{array}\right) , \text { with } \overline{\varvec{g({\bar{Y}})}} = \dfrac{1}{d}\displaystyle \sum _{j=1}^dg(\overline{Y}_n^{(j)}). \end{aligned}$$
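For a concrete illustration of the three choices above, the following sketch (with made-up numbers, not from the paper) computes \(\widehat{\varvec{\vartheta }}_n\) under each contrast and checks that the fitted values \(\varvec{Q}\widehat{\varvec{\vartheta }}_n=\varvec{g({\bar{Y}})}\) do not depend on the contrast.

```python
# Illustrative sketch: the three contrast choices give different parameterizations but
# identical fitted linear predictors Q theta_hat = g(Ybar).
import numpy as np

d = 3
g_bar = np.array([0.2, 1.1, -0.4])             # stands for (g(Ybar_n^(1)), ..., g(Ybar_n^(d)))
Q = np.column_stack([np.ones(d), np.eye(d)])

contrasts = {
    "1. r_0=1, r=0        ": np.array([1.0, 0.0, 0.0, 0.0]),
    "2. r_0=0, r=(1,0,...)": np.array([0.0, 1.0, 0.0, 0.0]),
    "3. r_0=0, r=1        ": np.array([0.0, 1.0, 1.0, 1.0]),
}
for name, R in contrasts.items():
    theta = np.linalg.solve(Q.T @ Q + np.outer(R, R), Q.T @ g_bar)
    assert np.allclose(Q @ theta, g_bar)       # fitted values do not depend on the contrast
    print(name, np.round(theta, 3))
```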
Proof of Remark 3.4
We have to solve the system
$$\begin{aligned} S(\vartheta ) = 0 \Leftrightarrow \displaystyle \sum _{i=1}^n\ell '(\eta )\left( y_i - b'\circ \ell (\eta )\right) = 0. \end{aligned}$$
If \(\ell \) is injective, the system simplifies to
$$\begin{aligned} \displaystyle \sum _{i=1}^n y_i - n\, b'\circ (b')^{-1}\circ g^{-1}(\eta ) = 0 \Leftrightarrow \eta = g\left( {\overline{y}}_n\right) \Leftrightarrow \theta = g\left( {\overline{y}}_n\right) . \end{aligned}$$
\(\square \)
Proof of Remark 3.5
Let \(Y_i\) be from the exponential family \(F_{exp}(a,b,c,\lambda ,\phi )\). It is well known that the moment generating function of \(Y_i\) is
$$\begin{aligned} \mathbf {E}e^{t Y_i} =\exp \left( \frac{b(\lambda +ta(\phi )) - b(\lambda )}{a(\phi )}\right) . \end{aligned}$$
Hence, the moment generating function of the average \({\overline{Y}}_m\) is
$$\begin{aligned} M_{{\overline{Y}}_m}(t) = \left( \exp \left( \frac{b(\lambda +\frac{t}{m} a(\phi )) - b(\lambda )}{a(\phi )}\right) \right) ^m = \exp \left( \frac{b(\lambda +t a(\phi )/m) - b(\lambda )}{a(\phi )/m}\right) . \end{aligned}$$
We thus recover the known result that \({\overline{Y}}_m\) belongs to the exponential family \(F_{exp}(x\mapsto a(x)/m,b,c,\lambda ,\phi )\) (e.g. McCullagh and Nelder 1989).
In our setting, the random variables entering the average \(\overline{Y}_n^{(j)}\) are i.i.d. with functions a, b, c and parameters \(\lambda =\ell (\vartheta _{(1)}+\vartheta _{(j)})\) and \(\phi \). Hence \({\overline{Y}}_n^{(j)}\) also belongs to the exponential family with the same parameters but with the function \({\bar{a}}:x\mapsto a(x)/m_j\). In particular,
$$\begin{aligned} \mathbf {E}{\overline{Y}}_n^{(j)} = b'(\ell (\vartheta _{(1)}+\vartheta _{(j)})) = g^{-1}(\vartheta _{(1)}+\vartheta _{(j)}),~ \text{ Var }{\overline{Y}}_n^{(j)} = \frac{a(\phi )}{m_j} b''(\ell (\vartheta _{(1)}+\vartheta _{(j)})). \end{aligned}$$
However, the computation of \(\mathbf {E}g({\overline{Y}}_n^{(j)})\) remains difficult unless g is a linear function. By the strong law of large numbers, as \(m_j\rightarrow +\infty \), the estimator is consistent since
$$\begin{aligned} {\overline{Y}}_n^{(j)}{\mathop {\underset{n\rightarrow +\infty }{\longrightarrow }}\limits ^{\text {a.s.}}} g^{-1}(\vartheta _{(1)}+\vartheta _{(j)}) \Rightarrow g({\overline{Y}}_n^{(j)}){\mathop {\underset{n\rightarrow +\infty }{\longrightarrow }}\limits ^{\text {a.s.}}} g(g^{-1}(\vartheta _{(1)}+\vartheta _{(j)}))=\vartheta _{(1)}+\vartheta _{(j)}. \end{aligned}$$
By the Central Limit Theorem (\({\overline{Y}}_n^{(j)}\) is asymptotically normal) and the Delta Method, we obtain the following:
$$\begin{aligned}&\sqrt{m_j}\left( g({\overline{Y}}_n^{(j)}) - (\vartheta _{(1)}+\vartheta _{(j)})\right) {\mathop {\underset{n\rightarrow +\infty }{\longrightarrow }}\limits ^{\mathcal {L}}} \\&\quad {\mathcal {N}}\left( 0, a(\phi )b''(\ell (\vartheta _{(1)}+\vartheta _{(j)})) g'(g^{-1}(\vartheta _{(1)}+\vartheta _{(j)}))^2 \right) . \end{aligned}$$
\(\square \)
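The asymptotic normality above can be illustrated by simulation for a concrete family. The sketch below is not from the paper: it assumes a Poisson model with canonical link, i.e. \(b(\lambda )=e^{\lambda }\), \(\ell \) the identity, \(g=\log \) and \(a(\phi )=1\), in which case the asymptotic variance reduces to \(e^{-(\vartheta _{(1)}+\vartheta _{(j)})}\).

```python
# Minimal simulation sketch (assumed Poisson family, canonical link): check that
# sqrt(m) * (g(Ybar) - theta) has variance close to exp(-theta), as the Delta Method predicts.
import numpy as np

rng = np.random.default_rng(1)
theta = 0.7                                    # plays the role of theta_(1) + theta_(j)
m, n_rep = 2000, 5000                          # group size and Monte Carlo replications

y_bar = rng.poisson(np.exp(theta), size=(n_rep, m)).mean(axis=1)
stat = np.sqrt(m) * (np.log(y_bar) - theta)

print("empirical variance  :", round(float(stat.var()), 4))
print("theoretical variance:", round(float(np.exp(-theta)), 4))
```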
Proof of Corollary 3.1
The log-likelihood evaluated at \(\widehat{\varvec{\vartheta }}_n\) is
$$\begin{aligned} \log L(\widehat{\varvec{\vartheta }}_n\,|\,\underline{\varvec{y}}) = \frac{1}{a(\phi )}\sum _{i=1}^n \left( y_i \ell ({\widehat{\eta }}_i) - b(\ell ({\widehat{\eta }}_i))\right) + \sum _{i=1}^nc(y_i,\phi ). \end{aligned}$$
In fact, we must verify that \(\ell ({\widehat{\eta }}_i)\) does not depend on the function g. If we consider \(\widehat{\varvec{\vartheta }}_n\) defined by (8), we have \(\varvec{Q}\widehat{\varvec{\vartheta }}_n = \varvec{g({\bar{y}})}\), since \(\widehat{\varvec{\vartheta }}_n\) is a solution of the system (23), i.e. \(\varvec{Q}(\varvec{Q}'\varvec{Q} + \varvec{R}'\varvec{R})^{-1}\varvec{Q}'=I\). Using \({\widehat{\eta }}_i= (\varvec{Q}\widehat{\varvec{\vartheta }}_n)_j\) for i such that \(x_i^{(2),j}=1\), we obtain
$$\begin{aligned} \ell ({\widehat{\eta }}_i)= \displaystyle \sum _{j=1}^d\ell \circ g(\bar{y}_n^{(j)})x_i^{(2),j} = \displaystyle \sum _{j=1}^d\ell \circ \ell ^{-1}\circ (b')^{-1}({\bar{y}}_n^{(j)})x_i^{(2),j} = \displaystyle \sum _{j=1}^d (b')^{-1}({\bar{y}}_n^{(j)})x_i^{(2),j}, \end{aligned}$$
and
$$\begin{aligned} \log L(\widehat{\varvec{\vartheta }}_n\,|\,\underline{\varvec{y}}) = \frac{1}{a(\phi )}\sum _{j=1}^d\sum _{i, x_i^{(2)}=v_j} \left( y_i (b')^{-1}\left( {\overline{y}}_n^{(j)}\right) - b\left( \left( b'\right) ^{-1}\left( {\overline{y}}_n^{(j)}\right) \right) \right) + \sum _{i=1}^nc(y_i,\phi ). \end{aligned}$$
In the same way,
$$\begin{aligned} \widehat{\mathbf {E}Y_i}= & {} b'(\ell ({\widehat{\eta }}_i)) = \sum _{j=1}^d \bar{y}_n^{(j)}x_i^{(2),j}, \quad \widehat{\text{ Var }Y_i} = a(\phi )b''(\ell ({\widehat{\eta }}_i)) \\= & {} a(\phi )\sum _{j=1}^d b''\circ (b')^{-1}({\bar{y}}_n^{(j)})x_i^{(2),j}. \end{aligned}$$
\(\square \)
Proof for the two-variable case
Proof of Theorem 3.2
The system \(S(\varvec{\vartheta })=0\) is
$$\begin{aligned} \left\{ \begin{array}{ll}\displaystyle \sum _{i=1}^n\ell '(\eta _i)\left( y_i - b'\circ \ell (\eta _i)\right) = 0\\ \displaystyle \sum _{i=1}^nx_i^{(3),l}\ell '(\eta _i)\left( y_i - b'\circ \ell (\eta _i)\right) = 0,\quad \forall l\in L\\ \displaystyle \sum _{i=1}^nx_i^{(2),k}\ell '(\eta _i)\left( y_i - b'\circ \ell (\eta _i)\right) = 0,\quad \forall k\in K\\ \displaystyle \sum _{i=1}^nx_i^{kl}\ell '(\eta _i)\left( y_i - b'\circ \ell (\eta _i)\right) = 0,\quad \forall (k,l)\in KL^\star . \end{array}\right. \end{aligned}$$
that is
$$\begin{aligned} \left\{ \begin{array}{ll}\displaystyle \sum _{(k,l)\in KL^\star }\ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\left( \sum _{i=1}^nx_i^{(k,l)}y_i - m_{k,l}b'\circ \ell (\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\right) = 0\\ \displaystyle \sum _{k\in K_l^\star }\ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\left( \sum _{i=1}^nx_i^{(k,l)}y_i - m_{k,l}b'\circ \ell (\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\right) = 0\quad \forall l\in L\\ \displaystyle \sum _{l\in L_k^\star }\ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\left( \sum _{i=1}^nx_i^{(k,l)}y_i - m_{k,l}b'\circ \ell (\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\right) = 0\quad \forall k\in K\\ \ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\left( \displaystyle \sum _{i=1}^nx_i^{(k,l)}y_i - m_{k,l}b'\circ \ell (\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\right) = 0\quad \forall (k,l)\in KL^\star . \end{array}\right. \end{aligned}$$
The system has exactly \(1+d_2+d_3\) redundancies, and \(S(\varvec{\vartheta })=0\) reduces to
$$\begin{aligned}&\ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl})\left( \displaystyle \sum _{i=1}^nx_i^{(k,l)}y_i - m_{k,l}b'\circ \ell (\vartheta _{(1)}+\vartheta _{(2),k} \right. \nonumber \\&\quad \left. + \vartheta _{(3),l} + \vartheta _{kl}) \right) = 0\quad \forall (k,l)\in { KL}^\star . \end{aligned}$$
Hence the system has rank \(|{ KL}^\star |\), and if \(Y_i\) takes values in \(\mathbb {Y}\subset b'(\varLambda )\) and \(\ell \) is injective, we have
$$\begin{aligned} \vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl} = g({\bar{Y}}_n^{(k,l)})\quad \forall (k,l)\in KL^\star . \end{aligned}$$
As in the proof of Theorem 3.1, we have to solve
$$\begin{aligned} \left\{ \begin{array}{ll}\varvec{Q}\varvec{\vartheta }= \varvec{g({\bar{Y}})}\\ \varvec{R}\varvec{\vartheta }=\varvec{0}. \end{array}\right. \end{aligned}$$
(25)
that is, because \(\varvec{Q}'\varvec{Q}+\varvec{R}'\varvec{R}\) has full rank, as in the proof of Theorem 3.1,
$$\begin{aligned} {\varvec{\vartheta }} = (\varvec{Q}'\varvec{Q} + \varvec{R}'\varvec{R})^{-1}\varvec{Q}'\varvec{g({\bar{Y}})}. \end{aligned}$$
In that case, the MLE solves a least-squares problem with response variable \(\varvec{g({\bar{Y}})}\) and design matrix \(\varvec{Q}\), under the linear constraint \(\varvec{R}\varvec{\vartheta }=\varvec{0}\).
- 1.
Under the linear contrasts (\({\tilde{C}}_0\)), the model (10) is equivalent to model (6) with \(J=KL^\star \) modalities. Hence the solution is immediate.
- 2.
Under the linear contrasts (\({\tilde{C}}_\varSigma \)), the system
$$\begin{aligned} \vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl} = g({\bar{Y}}_n^{(k,l)})\quad \forall (k,l)\in KL^\star \end{aligned}$$
implies that
$$\begin{aligned} \sum _{(k,l)\in KL^\star }m_{k,l}(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l} + \vartheta _{kl}) = \sum _{(k,l)\in KL^\star }m_{k,l} g({\bar{Y}}_n^{(k,l)}). \end{aligned}$$
Using
$$\begin{aligned} \sum _{(k,l)\in KL^\star }m_{k,l}= & {} n,\quad \sum _{(k,l)\in KL^\star }m_{k,l}\vartheta _{(2),k} = \sum _{k\in K}\sum _{l\in L^\star _k}m_{k,l}\vartheta _{(2),k}\nonumber \\= & {} \sum _{k\in K}m^{(2)}_k\vartheta _{(2),k}= 0,\\ \sum _{(k,l)\in KL^\star }m_{k,l}\vartheta _{(3),l}= & {} \sum _{l\in L}\sum _{k\in K^\star _l}m_{k,l}\vartheta _{(3),l}= \sum _{l\in L}m^{(3)}_l\vartheta _{(3),l}= 0,\nonumber \\&\quad \sum _{(k,l)\in KL^\star }m_{k,l}\vartheta _{kl} =0, \end{aligned}$$
we get \(\vartheta _{(1)} = \dfrac{1}{n}\displaystyle \sum \nolimits _{(k,l)\in KL^\star }m_{k,l} g({\bar{Y}}_n^{(k,l)}).\) In the same way, taking summations over \(K^\star _l\) for \(l\in L\) and over \(L^\star _k\) for \(k\in K\), we obtain \(\vartheta _{(2),k}\) and \(\vartheta _{(3),l}\), and then \(\vartheta _{kl}\).
With main effect only, the system \(S(\varvec{\vartheta })=0\) is
$$\begin{aligned} \left\{ \begin{array}{ll}\displaystyle \sum _{i=1}^n\ell '(\eta _i)y_i = \sum _{i=1}^n g^{-1}(\eta _i)\ell '(\eta _i) \\ \displaystyle \sum _{i=1}^nx_i^{(3),l}\ell '(\eta _i) y_i = \sum _{i=1}^nx_i^{(3),l} g^{-1}(\eta _i)\ell '(\eta _i) \quad \forall l\in L\\ \displaystyle \sum _{i=1}^nx_i^{(2),k}\ell '(\eta _i)y_i = \sum _{i=1}^nx_i^{(2),k} g^{-1}(\eta _i)\ell '(\eta _i),\quad \forall k\in K \end{array}\right. \end{aligned}$$
There are \(1+d_2+d_3\) equations for \(1+d_2+d_3\) parameters, but the explanatory variables are collinear. So the two additional constraints \(\varvec{R}\varvec{\vartheta }=0\) ensure that a solution exists for the remaining \(d_2+d_3-1\) parameters. Using \(\sum _k x_i^{(2),k}=1\), the second set of equations becomes, \(\forall l\in L\),
$$\begin{aligned}&\displaystyle \sum _{k\in K}\ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l}) {\bar{y}}_n^{(k,l)} m_{k,l} \\&\quad = \sum _{k\in K} g^{-1}(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l}) \ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l}) m_{k,l} \end{aligned}$$
Similarly, the third set of equations becomes, \(\forall k\in K\),
$$\begin{aligned}&\displaystyle \sum _{l\in L}\ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l}) {\bar{y}}_n^{(k,l)} m_{k,l} \\&\quad = \sum _{l\in L} g^{-1}(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l}) \ell '(\vartheta _{(1)}+\vartheta _{(2),k} + \vartheta _{(3),l}) m_{k,l} \end{aligned}$$
Even with the canonical link \(\ell (x)=x\), so that \(\ell '(x)=1\), this system is not a least-squares problem when g is nonlinear. \(\square \)
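To make this concrete, the main-effects score equations can be solved with a generic root finder. The sketch below is illustrative only: it assumes a Poisson family with canonical link (so \(g^{-1}=b'=\exp \)), made-up cell counts \(m_{k,l}\) and cell means \({\bar{y}}_n^{(k,l)}\), and the reference contrasts \(\vartheta _{(2),1}=\vartheta _{(3),1}=0\) in place of a generic \(\varvec{R}\varvec{\vartheta }=0\).

```python
# Illustrative sketch: solve the main-effects score equations numerically (no closed form).
import numpy as np
from scipy.optimize import root

m = np.array([[10., 12.], [8., 15.], [11., 9.]])        # cell counts m_{k,l}, K = 3, L = 2 (made up)
ybar = np.array([[2.1, 3.4], [1.7, 2.9], [2.8, 4.0]])   # cell means ybar_n^{(k,l)} (made up)
K, L = m.shape

def score(free):
    theta1 = free[0]
    t2 = np.r_[0.0, free[1:K]]                           # theta_(2),k with theta_(2),1 = 0
    t3 = np.r_[0.0, free[K:]]                            # theta_(3),l with theta_(3),1 = 0
    eta = theta1 + t2[:, None] + t3[None, :]             # eta_{k,l}
    resid = m * (ybar - np.exp(eta))                     # g^{-1}(eta) = b'(eta) = exp(eta)
    return np.r_[resid.sum(), resid.sum(axis=1)[1:], resid.sum(axis=0)[1:]]

sol = root(score, x0=np.zeros(1 + (K - 1) + (L - 1)))
print(sol.success, np.round(sol.x, 4))                   # (theta_(1), theta_(2),2.., theta_(3),2)
```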
Computation of the log-likelihoods appearing in Sects. 4 and 5
Consider the Pareto GLM described in (13) and (15). The function b is \(b(\lambda ) = -\log (\lambda )\); using Corollary 3.1, we have \(\ell (\hat{\eta }_i) = (b')^{-1}(\overline{z}_n^{(j)})=-(\overline{z}_n^{(j)})^{-1}\) for j such that \(x_i^{(2),j}=1\), and
$$\begin{aligned} \log L(\widehat{\varvec{\vartheta }}_n\,|\,\underline{\varvec{z}}) = \sum _{j=1}^d\sum _{i, x_i^{(2),j}=1} \left( -z_i/{\overline{z}}_n^{(j)} - \log \left( -{\overline{z}}_n^{(j)} \right) \right) = -n -\sum _{j=1}^d m_j \log \left( -{\overline{z}}_n^{(j)} \right) . \end{aligned}$$
Let us compute the original log-likelihood of the Pareto 1 distribution:
$$\begin{aligned} \log L(\varvec{\vartheta }\,|\,\underline{\varvec{y}}) = \sum _{i=1}^n \big (\log \ell (\eta _i) + \ell (\eta _i)\log \mu - (\ell (\eta _i) +1)\log y_i \big ). \end{aligned}$$
Hence with \(z_i=-\log (y_i/\mu )\),
$$\begin{aligned} \log L(\widehat{\varvec{\vartheta }}_n\,|\,\underline{\varvec{y}})= & {} \sum _{j=1}^d\sum _{i, x_i^{(2),j}=1}\left( -\log (-\overline{z}_n^{(j)}) -\frac{\log \mu }{{\overline{z}}_n^{(j)}} + \frac{\log (y_i)}{ {\overline{z}}_n^{(j)}} - \log y_i \right) \\= & {} -n - \sum _{j=1}^d m_j\log (-{\overline{z}}_n^{(j)}) - \sum _{i=1}^n\log y_i =\log L(\widehat{\varvec{\vartheta }}_n\,|\,\underline{\varvec{z}})- \sum _{i=1}^n\log y_i. \end{aligned}$$
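As a numerical cross-check of the two expressions (illustrative only, with simulated Pareto 1 data and \(\mu =1\)):

```python
# Illustrative check: direct Pareto 1 log-likelihood at alpha_hat_j = -1/zbar^(j) versus
# the closed form -n - sum_j m_j log(-zbar^(j)) - sum_i log(y_i).
import numpy as np

rng = np.random.default_rng(2)
mu = 1.0
groups = [rng.pareto(2.5, 30) + 1.0, rng.pareto(1.5, 40) + 1.0]   # Pareto 1 samples, minimum mu = 1

n = sum(len(y) for y in groups)
direct, closed = 0.0, -float(n)
for y in groups:
    z = -np.log(y / mu)
    zbar = z.mean()
    alpha = -1.0 / zbar                                            # group-wise MLE of the shape
    direct += np.sum(np.log(alpha) + alpha * np.log(mu) - (alpha + 1.0) * np.log(y))
    closed += -len(y) * np.log(-zbar) - np.sum(np.log(y))

assert np.isclose(direct, closed)
```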
Now consider the shifted log-normal GLM described in (18) and (19). Here the function b is \(b(\lambda )=\lambda ^2/2\); hence, using Corollary 3.1, we have \(\ell (\hat{\eta }_i) = (b')^{-1}(\overline{z}_n^{(j)})=\overline{z}_n^{(j)}\) for j such that \(x_i^{(2),j}=1\), and Eq. (21) holds.
Let us compute the original log-likelihood of the shifted log-normal distribution:
$$\begin{aligned} \log L(\varvec{\vartheta }\,|\,\underline{\varvec{y}})= & {} \sum _{i=1}^n\left( - \log (y_i-\mu ) - \log (\sqrt{2\pi \phi }) -\dfrac{(\log (y_i-\mu ) - \ell (\eta _i))^2}{2\phi }\right) \\= & {} - \sum _{i=1}^n z_i - n\log (\sqrt{2\pi \phi }) - \sum _{i=1}^n \dfrac{(z_i - \ell (\eta _i))^2}{2\phi }, \end{aligned}$$
with \(z_i=\log (y_i-\mu )\). Hence
$$\begin{aligned} \log L(\widehat{\varvec{\vartheta }}\,|\,\underline{\varvec{y}})= & {} - \sum _{i=1}^n z_i - n\log (\sqrt{2\pi \phi }) - \frac{1}{2\phi }\sum _{j=1}^d\sum _{i, x_i^{(2),j}=1} (z_i - \overline{z}_n^{(j)})^2. \end{aligned}$$
Using \( {\widehat{\phi }} = \frac{1}{n}\sum _{j\in J}\sum _{i, x_i^{(2),j}=1}\left( z_i - {\bar{z}}_n^{(j)}\right) ^2 \) leads to the desired result.
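For completeness, substituting \({\widehat{\phi }}\) into the previous display (the last double sum then equals \(n{\widehat{\phi }}\)) gives
$$\begin{aligned} \log L(\widehat{\varvec{\vartheta }}\,|\,\underline{\varvec{y}}) = - \sum _{i=1}^n z_i - n\log (\sqrt{2\pi {\widehat{\phi }}}) - \frac{n}{2}. \end{aligned}$$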
Link functions and descriptive statistics
See Fig. 4 and Table 10.
Table 10 Empirical quantiles and moments (in euros)