1 Introduction

Wiśniewski (2009) proposed a method for estimating parameters in split functional models of geodetic observations. Such a split occurs when two functional models that differ from each other in terms of mutually competing versions of the same parameter can correspond to a single observation. For example, in a network deformation analysis carried out based on an aggregate set of observations obtained during two measurement epochs, any given observation from this set corresponds to one of the following models: a functional model of observations from the first measurement epoch or a functional model of observations obtained during the second measurement epoch. Another example concerns the sets containing outliers. In that case, any given observation from this set can be a “good” observation or a wrong observation with a functional model appropriate for it.

The proposed method, called Msplit estimation, assumes that if a particular observation occurs, it brings two mutually competing pieces of f-information (Jones and Jones 2000) determined in relation to two versions of the same parameter (Wiśniewski 2009, 2010). Msplit estimators of these versions are the quantities that minimise the aggregate information, being the product of the competing pieces of information. Similar assumptions are also adopted in the maximum likelihood method (ML-method) (e.g. Rao 1973; Wiśniewski 2017). However, that method does not allow for the existence of several versions of the parameters in a functional model relating to the same observation. From this perspective, Msplit estimation can be regarded as a particular kind of development of the ML method. In the absence of competing versions of the parameter, Msplit estimators become ML-estimators. Since the study conducted by Huber (1964), a generalisation of the ML-method known as M-estimation, in which f-information is replaced by certain arbitrary functions, has been very popular. A similar substitution can also be observed in Msplit estimation. Msplit estimation based on the L1 norm condition was developed to take advantage of this possibility (Wyszkowska and Duchnowski 2019).

The general theory of Msplit estimation was developed without detailed assumptions about probabilistic observation models, which enables the creation of Msplit estimation varieties corresponding to specific models of this nature. The most commonly accepted probabilistic model of geodetic observations is the normal distribution. The family of normal distributions corresponds to the basic variant of Msplit estimation called “squared Msplit estimation”. This variant of Msplit estimation can be regarded as a particular type of expansion of the least-squares (LS) method (Wiśniewski 2009). In the absence of competing parameter versions, squared Msplit estimators become LS-estimators. Whenever Msplit estimation is mentioned further on, it refers to this particular variant.

Msplit estimation was applied inter alia in the analysis of geodetic network deformation (Duchnowski and Wiśniewski 2012, 2014; Zienkiewicz 2014; Zienkiewicz et al. 2017; Wiśniewski and Zienkiewicz 2016). In these problems, Msplit estimation is particularly effective in identifying stable potential reference points (PRPs) (Nowel 2019). Janowski and Rapiński (2013) applied Msplit estimation in 3D modelling, primarily for the detection of surface structures (e.g. roof planes) of engineering structures. The modelling was carried out based on laser scanning data. A similar problem is also analysed in a study by Janicka et al. (2020), where Msplit estimation was proposed as a means of detecting and determining the displacements of adjacent planes. Laser scanning data also provided the basis for determining the terrain profiles using Msplit estimation (Błaszczak-Bąk et al. 2015; Wyszkowska et al. 2021).

Msplit estimation can provide an alternative to M-estimation which is robust to gross errors. The possibility of such applications was indicated in the following studies (Wiśniewski 2009, 2010; Yang et al. 2010; Ge et al. 2013; Janicka and Rapinski 2013; Amiri-Simkooei et al. 2017). Wiśniewski and Zienkiewicz (2021a, b) demonstrated that with properly established, competitive functional models, the robustness of Msplit estimators to gross errors is their inherent feature. The robustness of these estimators in a wider context (e.g. to poorly chosen models) was analysed in detail by Duchnowski and Wiśniewski (2019, 2020).

In both the traditional models and split functional models constructed on their basis (in Msplit estimation), it is assumed that only observations are affected by random errors. Currently, an errors-in-variables (EIV) model, in which design matrix elements are also affected by random errors, is applied in many geodetic problems. For example, this model was applied in geodetic datum transformation (Teunissen 1988; Davis 1999; Acar et al. 2006; Akyilmaz 2007; Schaffrin and Felus 2008; Mahboub 2012; Fang 2015; Aydin et al. 2018; Mercan et al. 2018) as well as in remote sensing (Felus and Schaffrin 2005), in function approximation (Wang and Zhao 2019), in linear regression (Schaffrin and Wieser 2008; Amiri-Simkooei and Jazaeri 2012; Zeng et al. 2018; Lv and Sui 2020) and in least-squares collocation (Schaffrin 2020; Wiśniewski and Kamiński 2020). The effect of the random design matrix on the weighted LS estimate is presented in Xu et al. (2014). That study also proposed a bias-corrected weighted LS estimate for the EIV model. An extended EIV stochastic model and the estimation of its components (using, inter alia, the MINQUE method) are presented in Xu and Liu (2014).

The estimation of parameters in functional models extended to the EIV form is most commonly carried out using the total least-squares (TLS) method. The optimisation problem of this method, as well as its solution based on the singular value decomposition (SVD), was presented by Golub and Van Loan (1980). TLS using the SVD procedure was developed and adapted to geodetic purposes as well (e.g. Felus 2004; Akyilmaz 2007; Schaffrin and Felus 2008). Another way to solve the TLS optimisation problem, based on a nonlinear Lagrange function, is proposed in Schaffrin et al. (2006).

In the practical applications of the TLS method, besides having effective algorithms at one's disposal, the possibility of taking into account the weights of the random EIV model components is also very important. The basic solutions in this regard were presented by Van Huffel and Vandewalle (1991), who established the generalised total least-squares (GTLS) method. On the other hand, Schaffrin and Wieser (2008) proposed an expansion of the TLS method, in which weights were derived from the adopted covariance matrix models (stochastic models). In the method proposed in the cited study, called “the weighted total least-squares (WTLS) method”, stochastic models can apply to both observation vectors and the vectors created from random errors affecting the design matrix.

WTLS is still being developed and analysed. For example, Fang (2013) analysed the necessary and sufficient conditions for WTLS optimality. Amiri-Simkooei (2017, 2018) presented the theory behind the constrained weighted total least-squares (CWTLS) method. The WTLS optimisation problem was also formulated and solved using a second-order approximation function (Wang and Zhao 2019). Due to their nonlinear nature, the WTLS estimators are biased. Bias-corrected versions of these estimators are presented in studies by Xu et al. (2012) and Tong et al. (2015). Moreover, an important problem in WTLS theory and practice is the assessment of the accuracy of the determined estimators. What might be helpful in this regard are the strategies for determining the covariance matrix of WTLS estimates (Amiri-Simkooei et al. 2016) and methods for estimating the variance components in EIV models (Xu and Liu 2014).

In the optimisation problem of the WTLS method based on the Lagrange approach, the objective function is minimised with the conditions defined by the nonlinear EIV model. An iterative algorithm to solve this problem was proposed by Schaffrin and Wieser (2008). Shen et al. (2011), based on the Newton–Gauss algorithm of nonlinear LS adjustment (Pope 1974), proposed another iterative method for solving WTLS problems, which is easier to apply in practice. In this method, the nonlinear EIV model is replaced with a linear approximation, which significantly facilitates the organisation of a corresponding computational algorithm.

The origin of TLS or WTLS estimation is the LS-method which is neutral for all observations. The WTLS estimators' lack of robustness to gross errors is, therefore, an inherent feature, which may restrict the scope of the practical application of these estimators. Therefore, a robust estimation of EIV model parameters is of interest to many authors. For example, Wang et al. (2016) proposed a robust total least-squares (RTLS) method in which the robustness of WTLS estimators was obtained by means of the application of weight functions adopted in robust M-estimation. Another proposal, based on the least trimmed squares (LTS) method, was presented by Lv and Sui (2020). In that method, the authors used the inherent robustness of estimators minimising the sum of the squared orthogonal errors. They called the LTS version adjusted to EIV models “total least trimmed squares” (TLTS).

The current study will apply the EIV model in the basic Msplit estimation variant. As in the WTLS method, models of the covariance matrices of the random components of this model will also be taken into account. According to the basic principles of Msplit estimation, it will be assumed that the parameters in the EIV model will have two mutually competing versions (which consequently leads to the split of this model). The objective function of the proposed method, called “Total Msplit (TMsplit) estimation”, will be created through the application of the Lagrange approach (Schaffrin and Wieser 2008) using the approach adopted in Shen et al. (2011), i.e. the split EIV models will be replaced with their linear approximations.

The paper is organised as follows. As the proposed method is an expansion of Msplit estimation, which takes into account the basic assumptions used in the WTLS method, it appears necessary to review the theoretical foundations of both these methods. These foundations, set in the context relevant to this study, are provided in Sect. 2. The theory behind TMsplit estimation and its algorithm are provided in Sect. 3. In Sect. 4, examples of the method application will be provided. TMsplit estimation will be applied to estimate parameters in competing bias models (Sect. 4.1). The obtained results will be compared with classical Msplit estimators calculated in Wiśniewski (2010). In Sect. 4.2, the data provided in Neri et al. (1989) and also used, inter alia, in studies by Schaffrin and Wieser (2008), Shen et al. (2011) and Mahboub (2012), will be used to determine TMsplit estimators of linear regression parameters. It will be assumed that the basic set of “good” observations is disturbed by “strange” observations for which a corresponding regression line also exists. Moreover, in this section, the behaviour of TMsplit estimators will be checked in the event that the set contains one observation affected by a gross error of different magnitudes. The determined TMsplit estimators will be compared with the WTLS estimators published in the cited studies. TMsplit estimators' robustness to gross errors is additionally analysed using an example of a two-dimensional affine transformation (Sect. 4.3). The data for this example are derived from Lv and Sui (2020). The RTLS and TLTS estimators shown in the cited paper will be compared with TMsplit estimators. The paper concludes with a summary.

2 Review of Msplit and WTLS estimation

2.1 Msplit estimation

Let \({\mathbf{y}} = {\mathbf{AX}} + {\mathbf{v}}\) be a functional model of the observation vector \({\mathbf{y}} = [y_{1} , \ldots ,y_{n} ]^{T}\), where \({\mathbf{A}}\) is the \(n \times m\) coefficient matrix (\(rank({\mathbf{A}}) = m\)), \({\mathbf{X}}\) is the m-vector of unknown parameters to be estimated, and \({\mathbf{v}}\) is the n-vector of random observation errors. In Msplit estimation, this model is split into two models:

$$ {\mathbf{y}} = {\mathbf{AX}}_{\alpha } + {\mathbf{v}}_{\alpha } \quad {\text{and}}\quad {\mathbf{y}} = {\mathbf{AX}}_{\beta } + {\mathbf{v}}_{\beta } $$
(1)

where \({\mathbf{X}}_{\alpha }\) and \({\mathbf{X}}_{\beta }\) are mutually competing versions of the same vector of parameters X. The vectors \({\mathbf{v}}_{\alpha }\),\({\mathbf{v}}_{\beta }\) are respective versions of the vector v, which result from the observation errors and the errors of the functional models.

Msplit estimators of parameters \({\mathbf{X}}_{\alpha }\) and \({\mathbf{X}}_{\beta }\) are quantities that minimise the following general objective function (Wiśniewski 2009).

$$ \varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } ) = \sum\limits_{i = 1}^{n} {\rho_{\alpha } (v_{i\alpha })\rho_{\beta } (} v_{i\beta }) $$
(2)

where \(\rho_{\alpha }\) and \(\rho_{\beta }\) are arbitrary functions. In the context of cross-weighting that is natural in Msplit estimation, function (2) can also be expressed in the following form:

$$ \varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } ) = \sum\limits_{i = 1}^{n} {\rho_{\alpha } (v_{i\alpha } )} w_{\alpha } (v_{i\beta } ) = \sum\limits_{i = 1}^{n} {\rho_{\beta } (v_{i\beta } } )w_{\beta } (v_{i\alpha } ) $$
(3)

where \(w_{\alpha } (v_{i\beta } ) = \rho_{\beta } (v_{i\beta } )\) and \(w_{\beta } (v_{i\alpha } ) = \rho_{\alpha } (v_{i\alpha } )\) are now regarded as a special type of weight function. The specific character of weighting is that the contribution of function \(\rho_{\alpha } (v_{i\alpha } )\) to the optimisation problem is enhanced (or weakened) by the weight function whose argument is quantity \(v_{i\beta }\) competing in relation to \(v_{i\alpha }\) (and vice versa). The weight functions are not like those in M-estimation, which are modified to make the estimator robust. Mutual “cross weighting” functions \(w_{\alpha } (v_{i\beta } )\) and \(w_{\beta } (v_{i\alpha } )\) are applied to determine mutually competitive estimates related to the same observation set (Wiśniewski 2009). In the case of Msplit estimation, one supposes that the observation set might be a mixture of realisations of two different random variables that differ from each other in the parameters of the functional models. One of those variables might be regarded as a “strange” one and its realisations as outliers in a particular case. Then results of Msplit estimation are estimates of the parameters of the “good” variable (like in robust M-estimation) but also estimates of the parameters of the “strange” variable.

This study will use the basic variant of Msplit estimation, in which \(\rho (v_{\alpha } ) = v_{i\alpha }^{2} q_{i}^{ - 1}\) and \(\rho (v_{\beta } ) = v_{i\beta }^{2} q_{i}^{ - 1}\). The quantities \(q_{i}\) are diagonal elements of the \({\mathbf{Q}}_{{\mathbf{y}}}\) cofactor matrix occurring in the \({\mathbf{C}}_{{\mathbf{y}}} = \sigma_{0}^{2} {\mathbf{Q}}_{{\mathbf{y}}}\) covariance matrix model (\(\sigma_{0}^{2}\)—unknown variance component). The adopted functions can be associated (although this is not necessary) with normal distributions as probabilistic observation models. Taking these functions into account, based on Eqs. (2) and (3), the following will be recorded (Wiśniewski 2009; Zienkiewicz 2018a, 2018b)

$$ \begin{aligned} \varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } ) & = \sum\limits_{i = 1}^{n} {\rho_{\alpha } (v_{i\alpha } )\rho_{\beta } (} v_{i\beta } ) = \sum\limits_{i = 1}^{n} {v_{i\alpha }^{2} } v_{i\beta }^{2} q_{i}^{ - 2} \\ & = \sum\limits_{i = 1}^{n} {v_{i\alpha }^{2} } w_{\alpha } (v_{i\beta } ) = \sum\limits_{i = 1}^{n} {v_{i\beta }^{2} } w_{\beta } (v_{i\alpha } ) \\ & = {\mathbf{v}}_{\alpha }^{T} {\mathbf{W}}_{\alpha } ({\mathbf{v}}_{\beta } ){\mathbf{v}}_{\alpha } = {\mathbf{v}}_{\beta }^{T} {\mathbf{W}}_{\beta } ({\mathbf{v}}_{\alpha } ){\mathbf{v}}_{\beta } \\ \end{aligned} $$
(4)

where

$$ \begin{aligned} w_{\alpha } (v_{i\beta } ) & = v_{i\beta }^{2} q_{i}^{ - 2} ,\,\,w_{\beta } (v_{i\alpha } ) = v_{i\alpha }^{2} q_{i}^{ - 2} \\ {\mathbf{W}}_{\alpha } ({\mathbf{v}}_{\beta } ) & = {\text{Diag}}\left( {w_{\alpha } (v_{1\beta } ), \ldots ,w_{\alpha } (v_{n\beta } )} \right), \\ {\mathbf{W}}_{\beta } ({\mathbf{v}}_{\alpha } ) & = {\text{Diag}}\left( {w_{\beta } (v_{1\alpha } ), \ldots ,w_{\beta } (v_{n\alpha } )} \right) \\ \end{aligned} $$
(5)

The solution to the optimisation problem \(\varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } ) \to \min\) includes such quantities \({\hat{\mathbf{X}}}_{\alpha }\) and \({\hat{\mathbf{X}}}_{\beta }\) (Msplit estimators) for which the following is true:

$$ \begin{aligned} \frac{1}{2}\left. {\frac{{\partial \varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } )}}{{\partial {\mathbf{X}}_{\alpha } }}} \right|_{{{\mathbf{X}}_{\alpha } = {\hat{\mathbf{X}}}_{\alpha } ,{\mathbf{X}}_{\beta } = {\hat{\mathbf{X}}}_{\beta } }} & = {\mathbf{A}}^{T} {\mathbf{W}}_{\alpha } ({\tilde{\mathbf{v}}}_{\beta } ){\tilde{\mathbf{v}}}_{\alpha } = {\mathbf{A}}^{T} {\mathbf{W}}_{\alpha } ({\tilde{\mathbf{v}}}_{\beta } )({\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}_{\alpha } ) = {\mathbf{0}} \\ \frac{1}{2}\left. {\frac{{\partial \varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } )}}{{\partial {\mathbf{X}}_{\beta } }}} \right|_{{{\mathbf{X}}_{\alpha } = {\hat{\mathbf{X}}}_{\alpha } ,{\mathbf{X}}_{\beta } = {\hat{\mathbf{X}}}_{\beta } }} & = {\mathbf{A}}^{T} {\mathbf{W}}_{\beta } ({\tilde{\mathbf{v}}}_{\alpha } ){\tilde{\mathbf{v}}}_{\beta } = {\mathbf{A}}^{T} {\mathbf{W}}_{\beta } ({\tilde{\mathbf{v}}}_{\alpha } )({\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}_{\beta } ) = {\mathbf{0}} \\ \end{aligned} $$
(6)

where \({\tilde{\mathbf{v}}}_{\alpha } = {\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}_{\alpha }\) and \({\tilde{\mathbf{v}}}_{\beta } = {\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}_{\beta }\) are residual vectors. The above equations are solved by means of iteration. The iterative procedure can be organised in such a manner that in the steps \(l = 1, \ldots ,s\), the following quantities are determined (Wiśniewski and Zienkiewicz 2021a, 2021b):

$$ \begin{aligned} {\mathbf{X}}_{\alpha (l + 1)} & = \left( {{\mathbf{A}}^{T} {\mathbf{W}}_{\alpha } ({\mathbf{v}}_{\beta (l)} ){\mathbf{A}}} \right)^{ - 1} {\mathbf{A}}^{T} {\mathbf{W}}_{\alpha } ({\mathbf{v}}_{\beta (l)} ){\mathbf{y}},\quad {\mathbf{v}}_{\alpha (l + 1)} = {\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}_{\alpha (l + 1)} \\ {\mathbf{X}}_{\beta (l + 1)} & = \left( {{\mathbf{A}}^{T} {\mathbf{W}}_{\beta } ({\mathbf{v}}_{\alpha (l)} ){\mathbf{A}}} \right)^{ - 1} {\mathbf{A}}^{T} {\mathbf{W}}_{\beta } ({\mathbf{v}}_{\alpha (l)} ){\mathbf{y}},\quad {\mathbf{v}}_{\beta (l + 1)} = {\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}_{\beta (l + 1)} \\ \end{aligned} $$
(7)

(the iterative procedure using gradients and Hessians of the function \(\varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } )\) is presented in Wiśniewski 2009, 2010). The iterative process defined by Eq. (7) is convergent and ends for such \(l = s\) that \({\mathbf{X}}_{\alpha (s)} = {\mathbf{X}}_{\alpha (s - 1)}\) and \({\mathbf{X}}_{\beta (s)} = {\mathbf{X}}_{\beta (s - 1)}\). Then, \({\hat{\mathbf{X}}}_{\alpha } = {\mathbf{X}}_{\alpha (s)}\) and \({\hat{\mathbf{X}}}_{\beta } = {\mathbf{X}}_{\beta (s)}\).
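Since the scheme of Eq. (7) is the computational core of the method, a short numerical sketch may be helpful. The following Python/NumPy code (the function name msplit_squared, the toy data and the stopping tolerance are illustrative assumptions, not part of the original formulation) performs the cross-weighted updates sequentially, i.e. in the ordering later used in steps 1(1)–1(3) of Sect. 3.2, so that the two parameter versions can separate from the common LS start:

```python
import numpy as np

def msplit_squared(A, y, q, tol=1e-10, max_iter=200):
    """Squared Msplit estimation: the cross-weighted iteration of Eq. (7),
    carried out sequentially as in steps 1(1)-1(3) of Sect. 3.2 (with A^j = A)."""
    W0 = np.diag(1.0 / q)                                    # classical weights Q_y^{-1}
    x_a = np.linalg.solve(A.T @ W0 @ A, A.T @ W0 @ y)        # LS start, cf. Eq. (44)
    x_b = x_a.copy()
    v_b = y - A @ x_b
    for _ in range(max_iter):
        W_a = np.diag(v_b**2 / q**2)                         # W_alpha(v_beta), Eq. (5)
        x_a_new = np.linalg.solve(A.T @ W_a @ A, A.T @ W_a @ y)
        v_a = y - A @ x_a_new
        W_b = np.diag(v_a**2 / q**2)                         # W_beta(v_alpha), Eq. (5)
        x_b_new = np.linalg.solve(A.T @ W_b @ A, A.T @ W_b @ y)
        v_b = y - A @ x_b_new
        if max(np.linalg.norm(x_a_new - x_a),
               np.linalg.norm(x_b_new - x_b)) < tol:
            return x_a_new, x_b_new
        x_a, x_b = x_a_new, x_b_new
    return x_a, x_b

# toy example: observations mixed from two competing linear trends
t = np.linspace(0.0, 1.0, 12)
A = np.column_stack([np.ones_like(t), t])
y = np.where(np.arange(12) % 2 == 0, 1.0 + 2.0 * t, 3.0 - 1.0 * t)
X_a, X_b = msplit_squared(A, y, np.ones(12))
print(X_a, X_b)
```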

In Msplit estimation, the stochastic model \({\mathbf{C}}_{{\mathbf{v}}} = {\mathbf{C}}_{{\mathbf{y}}} = \sigma_{0}^{2} {\mathbf{Q}}_{{\mathbf{y}}}\), similar to the functional model, is split. The split results in covariance matrices \({\mathbf{C}}_{{{\mathbf{v}}_{\alpha } }} = \sigma_{0\alpha }^{2} {\mathbf{Q}}_{{\mathbf{y}}}\) and \({\mathbf{C}}_{{{\mathbf{v}}_{\beta } }} = \sigma_{0\beta }^{2} {\mathbf{Q}}_{{\mathbf{y}}}\), which are two versions of the covariance matrix \({\mathbf{C}}_{{\mathbf{v}}}\) (Wiśniewski and Zienkiewicz 2021b). The invariant and unbiased estimators of variance coefficients \(\sigma_{0\alpha }^{2}\) and \(\sigma_{0\beta }^{2}\) are the following quantities (Wiśniewski and Zienkiewicz, 2021a, b):

$$ \begin{aligned} \hat{\sigma }_{0\alpha }^{2} & = \frac{{{\tilde{\mathbf{v}}}_{\alpha }^{T} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\alpha } }}^{ - 1} {\mathbf{Q}}_{{\mathbf{y}}} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\alpha } }}^{ - 1} {\tilde{\mathbf{v}}}_{\alpha } }}{{{\text{Tr(}}{\mathbf{N}}_{\alpha }^{T} {\mathbf{N}}_{\alpha } )}}\quad {\text{and}} \\ \hat{\sigma }_{0\beta }^{2} & = \frac{{{\tilde{\mathbf{v}}}_{\beta }^{T} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\beta } }}^{ - 1} {\mathbf{Q}}_{{\mathbf{y}}} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\beta } }}^{ - 1} {\tilde{\mathbf{v}}}_{\beta } }}{{{\text{Tr(}}{\mathbf{N}}_{\beta }^{T} {\mathbf{N}}_{\beta } )}} \\ \end{aligned} $$
(8)

where

$$ \begin{aligned} {\mathbf{N}}_{\alpha } & = {\mathbf{Q}}_{{\mathbf{y}}} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\alpha } }}^{ - 1} {\mathbf{M}}_{\alpha } ,\quad {\mathbf{N}}_{\beta } = {\mathbf{Q}}_{{\mathbf{y}}} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\beta } }}^{ - 1} {\mathbf{M}}_{\beta } \\ {\mathbf{M}}_{\alpha } & = {\mathbf{I}}_{n} - {\mathbf{A}}({\mathbf{A}}^{T} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\alpha } }}^{ - 1} {\mathbf{A}})^{ - 1} {\mathbf{A}}^{T} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\alpha } }}^{ - 1} , \\ {\mathbf{M}}_{\beta } & = {\mathbf{I}}_{n} - {\mathbf{A}}({\mathbf{A}}^{T} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\beta } }}^{ - 1} {\mathbf{A}})^{ - 1} {\mathbf{A}}^{T} {\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\beta } }}^{ - 1} \\ \end{aligned} $$
(9)

and \({\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\alpha } }} = [{\mathbf{W}}_{\alpha } ({\mathbf{v}}_{\beta } )]^{ - 1} {\mathbf{Q}}_{{\mathbf{y}}}\), \({\overline{\mathbf{Q}}}_{{{\mathbf{v}}_{\beta } }} = [{\mathbf{W}}_{\beta } ({\mathbf{v}}_{\alpha } )]^{ - 1} {\mathbf{Q}}_{{\mathbf{y}}}\), \({\mathbf{I}}_{n}\) denotes an \(n \times n\) identity matrix (\({\text{Tr}}\)-matrix trace).
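For completeness, Eqs. (8)–(9) translate directly into a few lines of linear algebra. The sketch below (the helper name sigma0_sq_alpha and its argument list are illustrative assumptions) evaluates the alpha-version estimator; the beta-version follows by exchanging the roles of the two residual vectors and weight matrices:

```python
import numpy as np

def sigma0_sq_alpha(A, v_a, W_a, Q_y):
    """Variance coefficient estimator of Eq. (8) for the alpha version.
    v_a is the residual vector, W_a = W_alpha(v_beta) the cross-weight matrix of Eq. (5)."""
    Qv_bar_inv = np.linalg.solve(Q_y, W_a)       # (Q-bar_{v_alpha})^{-1} = Q_y^{-1} W_alpha
    M = np.eye(len(v_a)) - A @ np.linalg.solve(A.T @ Qv_bar_inv @ A,
                                               A.T @ Qv_bar_inv)    # M_alpha, Eq. (9)
    N = Q_y @ Qv_bar_inv @ M                                        # N_alpha, Eq. (9)
    return (v_a @ Qv_bar_inv @ Q_y @ Qv_bar_inv @ v_a) / np.trace(N.T @ N)
```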

2.2 Weighted TLS method

The total least-squares method is applied where the classical model \({\mathbf{y}} = {\mathbf{AX}} + {\mathbf{v}}\) is replaced by the EIV model of the following form:

$$ {\mathbf{y}} = ({\mathbf{A}} - {\mathbf{E}}){\mathbf{X}} + {{\varvec{\upupsilon}}} = {\mathbf{AX}} + {{\varvec{\upupsilon}}} - {\mathbf{EX}} $$
(10)

where \({\mathbf{E}}\) is an \(n \times m\) random matrix corresponding to the matrix \({\mathbf{A}}\) being observed. The random vector corresponding to observation vector y in the EIV model was denoted as \({{\varvec{\upupsilon}}}\). If we assume that \({\mathbf{v}} = {\mathbf{y}} - {\mathbf{AX}}\) is the error vector in the classical functional model, then \({{\varvec{\upupsilon}}} = {\mathbf{v}} + {\mathbf{EX}}\). It should be considered that \({\mathbf{EX}} = ({\mathbf{X}}^{T} \otimes {\mathbf{I}}_{n} ){\mathbf{e}}\), where \({\mathbf{e}} = {\text{vec}}({\mathbf{E}})\) is a vector formed from successive columns of matrix \({\mathbf{E}}\) (\(\otimes\)—the Kronecker product, \({\mathbf{I}}_{n}\)—an identity matrix of dimensions \(n \times n\)). Moreover, in order to simplify further notation, additional designations \({\mathbf{X}}_{ \otimes } = {\mathbf{X}} \otimes {\mathbf{I}}_{n} = ({\mathbf{X}}^{T} \otimes {\mathbf{I}}_{n} )^{T}\), \({\mathbf{X}}_{ \otimes }^{T} = {\mathbf{X}}^{T} \otimes {\mathbf{I}}_{n} = ({\mathbf{X}} \otimes {\mathbf{I}}_{n} )^{T}\) are introduced. Model (10) can then be expressed in the following form:

$$ {\mathbf{y}} = {\mathbf{AX}} + {{\varvec{\upupsilon}}} - {\mathbf{EX}} = {\mathbf{AX}} + {{\varvec{\upupsilon}}} - ({\mathbf{X}}^{T} \otimes {\mathbf{I}}_{n} ){\mathbf{e}} = {\mathbf{AX}} + {{\varvec{\upupsilon}}} - {\mathbf{X}}_{ \otimes }^{T} {\mathbf{e}} $$
(11)

In the WTLS method, in addition to the stochastic model of the observation vector \({\mathbf{C}}_{{\mathbf{y}}} = \sigma_{0}^{2} {\mathbf{Q}}_{{\mathbf{y}}}\), a stochastic model of \({\mathbf{e}}\) vector is also adopted. In the simplest case, it can be assumed that such a model is the expression \({\mathbf{C}}_{{\mathbf{e}}} = \sigma_{0}^{2} {\mathbf{Q}}_{{\mathbf{e}}}\), where \({\mathbf{Q}}_{{\mathbf{e}}}\) is the known cofactor matrix. However, there are examples in which not all columns of matrix \({\mathbf{A}}\) are affected by random disturbances (e.g. in the linear regression analysis). In that case, matrix \({\mathbf{Q}}_{{\mathbf{e}}}\) can be subject to appropriate decomposition \({\mathbf{Q}}_{{\mathbf{e}}} = {\mathbf{Q}}_{0} \otimes {\mathbf{Q}}_{{\mathbf{x}}}\), where \({\mathbf{Q}}_{{\mathbf{x}}}\) denotes a nonnegative definite diagonal matrix of size \(n \times n\) (Schaffrin and Wieser 2008; Shen et al. 2011).
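The Kronecker-product relations used in Eqs. (10)–(11) and the decomposition of \({\mathbf{Q}}_{{\mathbf{e}}}\) can be checked numerically. The short sketch below (with arbitrary toy dimensions; it is an illustration, not part of the method) verifies that \({\mathbf{EX}} = ({\mathbf{X}}^{T} \otimes {\mathbf{I}}_{n} ){\mathbf{e}}\) with \({\mathbf{e}} = {\text{vec}}({\mathbf{E}})\) and builds \({\mathbf{Q}}_{{\mathbf{e}}} = {\mathbf{Q}}_{0} \otimes {\mathbf{Q}}_{{\mathbf{x}}}\) for a regression-type design in which the first column of \({\mathbf{A}}\) is error-free:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 2
E = rng.standard_normal((n, m))               # random errors of the design matrix
X = rng.standard_normal(m)

e = E.flatten(order="F")                      # e = vec(E): columns of E stacked
lhs = E @ X
rhs = np.kron(X[None, :], np.eye(n)) @ e      # (X^T ⊗ I_n) e
print(np.allclose(lhs, rhs))                  # True: EX = (X^T ⊗ I_n) vec(E)

# Q_e = Q_0 ⊗ Q_x for a regression-type design: the first column of A
# (e.g. a column of ones) is error-free, only the second column is observed.
Q_x = np.eye(n)
Q_0 = np.diag([0.0, 1.0])
Q_e = np.kron(Q_0, Q_x)                       # = Diag(0, Q_x)
Q_e_plus = np.linalg.pinv(Q_e)                # Moore–Penrose inverse = Diag(0, Q_x^{-1})
```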

In the Lagrange approach applied in Schaffrin and Wieser (2008) and Lv and Sui (2020), the WTLS method is based on the following objective function:

$$ \varphi ({{\varvec{\upupsilon}}},{\mathbf{e}},{{\varvec{\uplambda}}},{\mathbf{X}}) = {{\varvec{\upupsilon}}}^{T} {\mathbf{Q}}_{{\mathbf{y}}}^{ - 1} {{\varvec{\upupsilon}}} + {\mathbf{e}}^{T} {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\mathbf{e}} - 2{{\varvec{\uplambda}}}^{T} ({\mathbf{y}} - {\mathbf{AX}} - {{\varvec{\upupsilon}}} + {\mathbf{X}}_{ \otimes }^{T} {\mathbf{e}}) $$
(12)

where \({{\varvec{\uplambda}}}\) denotes an \(n \times 1\) vector of “Lagrange multipliers”. Matrix \({\mathbf{Q}}_{{\mathbf{e}}}^{ + }\) is the Moore–Penrose inverse of \({\mathbf{Q}}_{{\mathbf{e}}}\). For example, when \({\mathbf{Q}}_{0} = {\text{Diag}}(0,1)\), and hence \({\mathbf{Q}}_{{\mathbf{e}}} = {\text{Diag}}({\mathbf{0}},{\mathbf{Q}}_{{\mathbf{x}}} )\), and \({\mathbf{Q}}_{{\mathbf{x}}}\) is regular, then \({\mathbf{Q}}_{{\mathbf{e}}}^{ + } = {\text{Diag}}({\mathbf{0}},{\mathbf{Q}}_{{\mathbf{x}}}^{ - 1} )\) (e.g. Rao 1973; Felus 2004). The iteration procedures that solve the optimisation problem \(\varphi ({{\varvec{\upupsilon}}},{\mathbf{e}},{{\varvec{\uplambda}}},{\mathbf{X}}) \to \min\) with the nonlinear conditions \({\mathbf{y}} - {\mathbf{AX}} - {{\varvec{\upupsilon}}} + {\mathbf{X}}_{ \otimes }^{T} {\mathbf{e}} = {\mathbf{0}}\) are, in general, complicated. The application of a similar objective function in Msplit estimation would generate numerical procedures and computational algorithms of even greater complexity. From the perspective of computational process optimisation, however, the approach adopted in Shen et al. (2011), also applied in Wang et al. (2016), is particularly interesting for the purposes of this study. The iterative method proposed in that study is based on the Newton–Gauss algorithm of nonlinear LS adjustment, proposed by Pope (1974). In this method, the EIV model (11) is replaced, in the j-th iteration, by a linear approximation of the following form:

$$ \begin{aligned} {\mathbf{y}} & = {\mathbf{AX}} + {{\varvec{\upupsilon}}} - {\mathbf{EX}} \\ & = {\mathbf{AX}}^{j} + {\mathbf{A}}^{j} \delta {\mathbf{X}} + {{\varvec{\upupsilon}}} - {\mathbf{EX}}^{j} \\ & = {\mathbf{AX}}^{j} + {\mathbf{A}}^{j} \delta {\mathbf{X}} + {{\varvec{\upupsilon}}} - ({\mathbf{X}}_{ \otimes }^{j} )^{T} {\mathbf{e}} \\ \end{aligned} $$
(13)

where \({\mathbf{A}}^{j} = {\mathbf{A}} - {\tilde{\mathbf{E}}}^{j}\), \({\mathbf{X}} = {\mathbf{X}}^{j} + \delta {\mathbf{X}}\) and \({\mathbf{X}}_{ \otimes }^{j} = {\mathbf{X}}^{j} \otimes {\mathbf{I}}_{n}\). \({\tilde{\mathbf{E}}}^{j}\) is the residual matrix built on the basis of the residual vector \({\tilde{\mathbf{e}}}^{j}\) determined in the j-th iteration. The vector \(\delta {\mathbf{X}}\) is a small quantity to be determined in the iteration. After taking into account the condition \({\mathbf{y}} - {\mathbf{AX}}^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}} - {{\varvec{\upupsilon}}} + ({\mathbf{X}}_{ \otimes }^{j} )^{T} {\mathbf{e}} = {\mathbf{0}}\) resulting from model (13), the objective function (12) can be rewritten in the following form (Shen et al. 2011):

$$ \varphi ({{\varvec{\upupsilon}}},{\mathbf{e}},{{\varvec{\uplambda}}},\delta {\mathbf{X}}) = {{\varvec{\upupsilon}}}^{T} {\mathbf{Q}}_{{\mathbf{y}}}^{ - 1} {{\varvec{\upupsilon}}} + {\mathbf{e}}^{T} {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\mathbf{e}} - 2{{\varvec{\uplambda}}}^{T} \left( {{\mathbf{y}} - {\mathbf{AX}}^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}} - {{\varvec{\upupsilon}}} + ({\mathbf{X}}_{ \otimes }^{j} )^{T} {\mathbf{e}}} \right) $$
(14)

The minimum of this function is obtained through satisfying the following Euler–Lagrange necessary conditions (Shen et al. 2011)

$$ \begin{aligned} & \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {{\varvec{\upupsilon}}}}}} \right|_{{{{\varvec{\upupsilon}}} = {\tilde{\boldsymbol{\upupsilon}}},{\mathbf{e}} = {\tilde{\mathbf{e}}},\delta {\mathbf{X}} = \delta {\hat{\mathbf{X}}},{{\varvec{\uplambda}}} = {\hat{\boldsymbol{\lambda}}}}} = {\mathbf{Q}}_{{\mathbf{y}}}^{ - 1} {\tilde{\boldsymbol{\upupsilon}}} + {\hat{\boldsymbol{\lambda}}} = {\mathbf{0}} \\ & \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {\mathbf{e}}}}} \right|_{{{{\varvec{\upupsilon}}} = {\tilde{\boldsymbol{\upupsilon}}},{\mathbf{e}} = {\tilde{\mathbf{e}}},\delta {\mathbf{X}} = \delta {\hat{\mathbf{X}}},{{\varvec{\uplambda}}} = {\hat{\boldsymbol{\lambda}}}}} = {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\tilde{\mathbf{e}}} - {\mathbf{X}}_{ \otimes }^{j} {\hat{\boldsymbol{\lambda}}} = {\mathbf{0}} \\ & \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial \delta {\mathbf{X}}}}} \right|_{{{{\varvec{\upupsilon}}} = {\tilde{\boldsymbol{\upupsilon}}},{\mathbf{e}} = {\tilde{\mathbf{e}}},\delta {\mathbf{X}} = \delta {\hat{\mathbf{X}}},{{\varvec{\uplambda}}} = {\hat{\boldsymbol{\lambda}}}}} = ({\mathbf{A}}^{j} )^{T} {\hat{\boldsymbol{\lambda}}} = {\mathbf{0}} \\ & \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {{\varvec{\uplambda}}}}}} \right|_{{{{\varvec{\upupsilon}}} = {\tilde{\boldsymbol{\upupsilon}}},{\mathbf{e}} = {\tilde{\mathbf{e}}},\delta {\mathbf{X}} = \delta {\hat{\mathbf{X}}},{{\varvec{\uplambda}}} = {\hat{\boldsymbol{\lambda}}}}} = {\mathbf{y}} - {\mathbf{AX}}^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}} - {\tilde{\boldsymbol{\upupsilon}}} + ({\mathbf{X}}_{ \otimes }^{j} )^{T} {\tilde{\mathbf{e}}} = {\mathbf{0}} \\ \end{aligned} $$
(15)

The solutions to the equations contained in Eq. (15) are the following quantities:

$$ \begin{aligned} {\hat{\boldsymbol{\lambda}}} & = - ({\mathbf{Q}}_{l}^{j} )^{ - 1} ({\mathbf{y}} - {\mathbf{AX}}^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}}) \\ \delta {\hat{\mathbf{X}}}^{j + 1} & = \left[ {({\mathbf{A}}^{j} )^{T} ({\mathbf{Q}}_{l}^{j} )^{ - 1} {\mathbf{A}}^{j} } \right]^{ - 1} ({\mathbf{A}}^{j} )^{T} ({\mathbf{Q}}_{l}^{j} )^{ - 1} ({\mathbf{y}} - {\mathbf{AX}}^{j} ) \\ {\mathbf{X}}^{j + 1} & = {\hat{\mathbf{X}}}^{j} + \delta {\hat{\mathbf{X}}}^{j + 1} \\ & = \left[ {({\mathbf{A}}^{j} )^{T} ({\mathbf{Q}}_{l}^{j} )^{ - 1} {\mathbf{A}}^{j} } \right]^{ - 1} ({\mathbf{A}}^{j} )^{T} ({\mathbf{Q}}_{l}^{j} )^{ - 1} ({\mathbf{y}} - {\mathbf{E}}^{j} {\mathbf{X}}^{j} ) \\ \end{aligned} $$
(16)

and

$$ \begin{aligned} & {\tilde{\boldsymbol{\upupsilon}}}^{j + 1} = {\mathbf{Q}}_{{\mathbf{y}}} ({\mathbf{Q}}_{l}^{j} )^{ - 1} ({\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}}^{j + 1} ) \\ & {\tilde{\mathbf{e}}}^{j + 1} = - {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes }^{j} ({\mathbf{Q}}_{l}^{j} )^{ - 1} ({\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}}^{j + 1} ) \\ \end{aligned} $$
(17)

where

$$ {\mathbf{Q}}_{l}^{j} = {\mathbf{Q}}_{{\mathbf{y}}} + ({\mathbf{X}}_{ \otimes }^{j} )^{T} {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes }^{j} $$
(18)

(\({\tilde{\boldsymbol{\upupsilon}}}\)—residual vector corresponding to the observation vector y). Shen et al. (2011) also apply the iterative process on the assumption that \({\mathbf{E}}^{j} \delta {\mathbf{X}}\) is a negligible quantity. Then, \({\mathbf{A}}^{j} \delta {\mathbf{X}} = ({\mathbf{A}} - {\mathbf{E}}^{j} )\delta {\mathbf{X}} = {\mathbf{A}}\delta {\mathbf{X}}\) and \({\mathbf{y}} = {\mathbf{AX}}^{j} + {\mathbf{A}}\delta {\mathbf{X}} + {{\varvec{\upupsilon}}} - {\mathbf{EX}}^{j}\).
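To make the above scheme concrete, a minimal sketch of the iteration of Eqs. (16)–(18) is given below (Python/NumPy; the function name wtls, the zero starting matrix for \({\tilde{\mathbf{E}}}\) and the convergence tolerance are assumptions of this illustration, not a reproduction of the cited algorithms):

```python
import numpy as np

def wtls(A, y, Q_y, Q_e, tol=1e-12, max_iter=100):
    """Linearised WTLS iteration in the spirit of Shen et al. (2011), Eqs. (13), (16)-(18)."""
    n, m = A.shape
    W = np.linalg.inv(Q_y)
    X = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)        # LS start
    E = np.zeros((n, m))                                 # residual matrix, starts at zero
    for _ in range(max_iter):
        Aj = A - E                                       # A^j = A - E~^j
        Xk = np.kron(X[None, :], np.eye(n)).T            # X_kron = X ⊗ I_n
        Ql_inv = np.linalg.inv(Q_y + Xk.T @ Q_e @ Xk)    # (Q_l^j)^{-1}, Eq. (18)
        dX = np.linalg.solve(Aj.T @ Ql_inv @ Aj,
                             Aj.T @ Ql_inv @ (y - A @ X))            # Eq. (16)
        lam = -Ql_inv @ (y - A @ X - Aj @ dX)                        # Eq. (16)
        e = Q_e @ Xk @ lam                               # stationarity in e, cf. Eq. (17)
        E = e.reshape((n, m), order="F")                 # E~^{j+1} rebuilt from e~ = vec(E~)
        X = X + dX
        if np.linalg.norm(dX) < tol:
            break
    return X, E
```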

The estimation of variance coefficient \(\sigma_{0}^{2}\), common to stochastic models \({\mathbf{C}}_{{\mathbf{y}}} = \sigma_{0}^{2} {\mathbf{Q}}_{{\mathbf{y}}}\) and \({\mathbf{C}}_{{\mathbf{e}}} = \sigma_{0}^{2} {\mathbf{Q}}_{{\mathbf{e}}}\), is also of interest in WTLS. Schaffrin and Wieser (2008) proposed a biased estimator in the following form:

$$ \hat{\sigma }_{0}^{2} = ({\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}})^{T} {\mathbf{Q}}_{l}^{ - 1} ({\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}})/(n - m) $$
(19)

A correction of the estimator (19) by means of introducing a bias term \(\delta b\) into it was presented by Shen et al. (2011). More complex EIV stochastic models are also under consideration. For example, Xu and Liu (2014) introduced a model containing variance components and proposed a way to estimate these components.

3 Total Msplit estimation

3.1 Theoretical foundations

Let it be assumed that according to the Msplit estimation rules, the EIV model (10) is split into two mutually competing models

$$ \begin{aligned} {\mathbf{y}} & = ({\mathbf{A}} - {\mathbf{E}}){\mathbf{X}}_{\alpha } + {{\varvec{\upupsilon}}}_{\alpha } = {\mathbf{AX}}_{\alpha } + {{\varvec{\upupsilon}}}_{\alpha } - {\mathbf{EX}}_{\alpha } \\ {\mathbf{y}} & = ({\mathbf{A}} - {\mathbf{E}}){\mathbf{X}}_{\beta } + {{\varvec{\upupsilon}}}_{\beta } = {\mathbf{AX}}_{\beta } + {{\varvec{\upupsilon}}}_{\beta } - {\mathbf{EX}}_{\beta } \\ \end{aligned} $$
(20)

By applying the approach used in Shen et al. (2011) and Wang et al. (2016), see Eq. (13), the above models in the j-th iteration will be replaced with the following linear approximations:

$$ \begin{aligned} {\mathbf{y}} & = {\mathbf{AX}}_{\alpha }^{j} + {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\alpha } + {{\varvec{\upupsilon}}}_{\alpha } - {\mathbf{EX}}_{\alpha }^{j} \\ & = {\mathbf{AX}}_{\alpha }^{j} + {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\alpha } + {{\varvec{\upupsilon}}}_{\alpha } - ({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} {\mathbf{e}} \\ {\mathbf{y}} & = {\mathbf{AX}}_{\beta }^{j} + {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\beta } + {{\varvec{\upupsilon}}}_{\beta } - {\mathbf{EX}}_{\beta }^{j} \\ & = {\mathbf{AX}}_{\beta }^{j} + {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\beta } + {{\varvec{\upupsilon}}}_{\beta } - ({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} {\mathbf{e}} \\ \end{aligned} $$
(21)

where \({\mathbf{X}}_{\alpha } = {\mathbf{X}}_{\alpha }^{j} + \delta {\mathbf{X}}_{\alpha }\), \({\mathbf{X}}_{\beta } = {\mathbf{X}}_{\beta }^{j} + \delta {\mathbf{X}}_{\beta }\) and \({\mathbf{X}}_{ \otimes \alpha }^{j} = {\mathbf{X}}_{\alpha }^{j} \otimes {\mathbf{I}}_{n}\), \({\mathbf{X}}_{ \otimes \beta }^{j} = {\mathbf{X}}_{\beta }^{j} \otimes {\mathbf{I}}_{n}\). After determining the quantities \({\mathbf{E}}^{j}\), \({\mathbf{X}}_{\alpha }^{j}\) and \({\mathbf{X}}_{\beta }^{j}\), the models that are valid in the j-th iteration, contained in Eq. (20), take the following forms:

$$ \begin{aligned} {\mathbf{y}} & = ({\mathbf{A}} - {\mathbf{E}}^{j} ){\mathbf{X}}_{\alpha }^{j} + {{\varvec{\upupsilon}}}_{\alpha }^{j} = {\mathbf{A}}^{j} {\mathbf{X}}_{\alpha }^{j} + {{\varvec{\upupsilon}}}_{\alpha }^{j} \\ {\mathbf{y}} & = ({\mathbf{A}} - {\mathbf{E}}^{j} ){\mathbf{X}}_{\beta }^{j} + {{\varvec{\upupsilon}}}_{\beta }^{j} = {\mathbf{A}}^{j} {\mathbf{X}}_{\beta }^{j} + {{\varvec{\upupsilon}}}_{\beta }^{j} \\ \end{aligned} $$
(22)

Total Msplit estimators of parameters \({\mathbf{X}}_{\alpha }\) and \({\mathbf{X}}_{\beta }\) are quantities \({\hat{\mathbf{X}}}_{\alpha }\) and \({\hat{\mathbf{X}}}_{\beta }\) which minimise the following objective function:

$$ \varphi ({\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } ) = {{\varvec{\upupsilon}}}_{\alpha }^{T} {\mathbf{W}}_{\alpha } ({{\varvec{\upupsilon}}}_{\beta } ){{\varvec{\upupsilon}}}_{\alpha } + {\mathbf{e}}^{T} {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\mathbf{e}} = {{\varvec{\upupsilon}}}_{\beta }^{T} {\mathbf{W}}_{\beta } ({{\varvec{\upupsilon}}}_{\alpha } ){{\varvec{\upupsilon}}}_{\beta } + {\mathbf{e}}^{T} {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\mathbf{e}} $$
(23)

This function [like the functions in Eqs. (3) and (4)] is not designed to make the method robust and also does not “predict” outliers among the observations or the elements of matrix A. In view of the following conditions resulting from Eq. (21)

$$ \begin{aligned} {\mathbf{y}} - {\mathbf{AX}}_{\alpha }^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\alpha } - {{\varvec{\upupsilon}}}_{\alpha } + ({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} {\mathbf{e}} & = {\mathbf{0}} \\ {\mathbf{y}} - {\mathbf{AX}}_{\beta }^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\beta } - {{\varvec{\upupsilon}}}_{\beta } + ({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} {\mathbf{e}} & = {\mathbf{0}} \\ \end{aligned} $$
(24)

the original objective function (23) will be extended to the following form:

$$ \begin{aligned} & \varphi ({{\varvec{\upupsilon}}}_{\alpha } ,{{\varvec{\upupsilon}}}_{\beta } ,{\mathbf{e}},{\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } ,{{\varvec{\uplambda}}}_{\alpha } ,{{\varvec{\uplambda}}}_{\beta } ) \\ & \quad = {{\varvec{\upupsilon}}}_{\alpha }^{T} {\mathbf{W}}_{\alpha } ({{\varvec{\upupsilon}}}_{\beta } ){{\varvec{\upupsilon}}}_{\alpha } + {\mathbf{e}}^{T} {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\mathbf{e}} \\ & \qquad - 2{{\varvec{\uplambda}}}_{\alpha }^{T} \left( {{\mathbf{y}} - {\mathbf{AX}}_{\alpha }^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\alpha } - {{\varvec{\upupsilon}}}_{\alpha } + ({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} {\mathbf{e}}} \right) \\ & \qquad - 2{{\varvec{\uplambda}}}_{\beta }^{T} \left( {{\mathbf{y}} - {\mathbf{AX}}_{\beta }^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\beta } - {{\varvec{\upupsilon}}}_{\beta } + ({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} {\mathbf{e}}} \right) \\ & \quad = {{\varvec{\upupsilon}}}_{\beta }^{T} {\mathbf{W}}_{\beta } ({{\varvec{\upupsilon}}}_{\alpha } ){{\varvec{\upupsilon}}}_{\beta } + {\mathbf{e}}^{T} {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\mathbf{e}} \\ & \qquad - 2{{\varvec{\uplambda}}}_{\alpha }^{T} \left( {{\mathbf{y}} - {\mathbf{AX}}_{\alpha }^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\alpha } - {{\varvec{\upupsilon}}}_{\alpha } + ({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} {\mathbf{e}}} \right) \\ & \qquad - 2{{\varvec{\uplambda}}}_{\beta }^{T} \left( {{\mathbf{y}} - {\mathbf{AX}}_{\beta }^{j} - {\mathbf{A}}^{j} \delta {\mathbf{X}}_{\beta } - {{\varvec{\upupsilon}}}_{\beta } + ({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} {\mathbf{e}}} \right) \\ \end{aligned} $$
(25)

where \({{\varvec{\uplambda}}}_{\alpha }\) and \({{\varvec{\uplambda}}}_{\beta }\) are Lagrange multiplier vectors corresponding to conditions (24). It is established that the Euler–Lagrange necessary conditions have the following forms in the optimisation problem \(\varphi ({{\varvec{\upupsilon}}}_{\alpha } ,{{\varvec{\upupsilon}}}_{\beta } ,{\mathbf{e}},{\mathbf{X}}_{\alpha } ,{\mathbf{X}}_{\beta } ,{{\varvec{\uplambda}}}_{\alpha } ,{{\varvec{\uplambda}}}_{\beta } ) \to \min\):

$$ \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {{\varvec{\upupsilon}}}_{\alpha } }}} \right|_{\Omega } = {\mathbf{W}}_{\alpha } ({\tilde{\boldsymbol{\upupsilon}}}_{\beta } ){\tilde{\boldsymbol{\upupsilon}}}_{\alpha } + {\hat{\boldsymbol{\lambda}}}_{\alpha } = {\mathbf{0}} $$
(26a)
$$ \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {{\varvec{\upupsilon}}}_{\beta } }}} \right|_{\Omega } = {\mathbf{W}}_{\beta } ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha } ){\tilde{\boldsymbol{\upupsilon}}}_{\beta } + {\hat{\boldsymbol{\lambda}}}_{\beta } = {\mathbf{0}} $$
(26b)
$$ \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {\mathbf{e}}}}} \right|_{\Omega } = {\mathbf{Q}}_{{\mathbf{e}}}^{ + } {\tilde{\mathbf{e}}} - {\mathbf{X}}_{ \otimes \alpha }^{j} {\hat{\boldsymbol{\lambda}}}_{\alpha } - {\mathbf{X}}_{ \otimes \beta }^{j} {\hat{\boldsymbol{\lambda}}}_{\beta } = {\mathbf{0}} $$
(26c)
$$ \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial \delta {\mathbf{X}}_{\alpha } }}} \right|_{\Omega } = ({\mathbf{A}}^{j} )^{T} {\hat{\boldsymbol{\lambda}}}_{\alpha } = {\mathbf{0}} $$
(26d)
$$ \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial \delta {\mathbf{X}}_{\beta } }}} \right|_{\Omega } = ({\mathbf{A}}^{j} )^{T} {\hat{\boldsymbol{\lambda}}}_{\beta } = {\mathbf{0}} $$
(26e)
$$ \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {{\varvec{\uplambda}}}_{\alpha } }}} \right|_{\Omega } = {\mathbf{y}} - {\mathbf{AX}}_{\alpha }^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}}_{\alpha } - {\tilde{\boldsymbol{\upupsilon}}}_{\alpha } + ({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} {\tilde{\mathbf{e}}} = {\mathbf{0}} $$
(26f)
$$ \left. {\frac{1}{2}\frac{\partial \varphi }{{\partial {{\varvec{\uplambda}}}_{\beta } }}} \right|_{\Omega } = {\mathbf{y}} - {\mathbf{AX}}_{\beta }^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}}_{\beta } - {\tilde{\boldsymbol{\upupsilon}}}_{\beta } + ({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} {\tilde{\mathbf{e}}} = {\mathbf{0}} $$
(26g)

The set of simultaneous substitutions: \({{\varvec{\upupsilon}}}_{\alpha } = {\tilde{\boldsymbol{\upupsilon}}}_{\alpha }\), \({{\varvec{\upupsilon}}}_{\beta } = {\tilde{\boldsymbol{\upupsilon}}}_{\beta }\), \({\mathbf{e}} = {\tilde{\mathbf{e}}}\), \(\delta {\mathbf{X}}_{\alpha } = \delta {\hat{\mathbf{X}}}_{\alpha }\), \(\delta {\mathbf{X}}_{\beta } = \delta {\hat{\mathbf{X}}}_{\beta }\), \({{\varvec{\uplambda}}}_{\alpha } = {\hat{\boldsymbol{\lambda}}}_{\alpha }\), \({{\varvec{\uplambda}}}_{\beta } = {\hat{\boldsymbol{\lambda}}}_{\beta }\), introduced to simplify the notation, was denoted as \(\Omega\). Based on Eqs. (26a)–(26c), the following residual vectors are determined:

$$ \begin{aligned} {\tilde{\boldsymbol{\upupsilon}}}_{\alpha } & = - {\mathbf{W}}_{\alpha }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\beta } ){\hat{\boldsymbol{\lambda}}}_{\alpha } \\ {\tilde{\boldsymbol{\upupsilon}}}_{\beta } & = - {\mathbf{W}}_{\beta }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha } ){\hat{\boldsymbol{\lambda}}}_{\beta } \\ {\tilde{\mathbf{e}}} & = {\mathbf{Q}}_{{\mathbf{e}}} ({\mathbf{X}}_{ \otimes \alpha }^{j} {\hat{\boldsymbol{\lambda}}}_{\alpha } + {\mathbf{X}}_{ \otimes \beta }^{j} {\hat{\boldsymbol{\lambda}}}_{\beta } ) \\ \end{aligned} $$
(27)

By substituting the quantities obtained above to Eqs. (26f) and (26g), a system of normal equations relating to vectors \({\hat{\boldsymbol{\lambda}}}_{\alpha }\) and \({\hat{\boldsymbol{\lambda}}}_{\beta }\) is obtained. The system has the following form:

$$ \begin{aligned} {{\varvec{\Theta}}}_{\alpha }^{j} {\hat{\boldsymbol{\lambda}}}_{\alpha } + {{\varvec{\Theta}}}_{\alpha \beta }^{j} {\hat{\boldsymbol{\lambda}}}_{\beta } & = - ({\mathbf{y}} - {\mathbf{AX}}_{\alpha }^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}}_{\alpha } ) \\ {{\varvec{\Theta}}}_{\beta \alpha }^{j} {\hat{\boldsymbol{\lambda}}}_{\alpha } + {{\varvec{\Theta}}}_{\beta }^{j} {\hat{\boldsymbol{\lambda}}}_{\beta } & = - ({\mathbf{y}} - {\mathbf{AX}}_{\beta }^{j} - {\mathbf{A}}^{j} \delta {\hat{\mathbf{X}}}_{\beta } ) \\ \end{aligned} $$
(28)

where

$$ \begin{array}{*{20}l} {{{\varvec{\Theta}}}_{\alpha }^{j} = {\mathbf{W}}_{\alpha }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\beta } ) + ({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes \alpha }^{j} ,} \hfill & {{{\varvec{\Theta}}}_{\alpha \beta }^{j} = ({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes \beta }^{j} } \hfill \\ {{{\varvec{\Theta}}}_{\beta \alpha }^{j} = ({{\varvec{\Theta}}}_{\alpha \beta }^{j} )^{T} = ({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes \alpha }^{j} ,} \hfill & {{{\varvec{\Theta}}}_{\beta }^{j} = {\mathbf{W}}_{\beta }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha } ) + ({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes \beta }^{j} } \hfill \\ \end{array} $$
(29)

After introducing block matrices

$$ \begin{aligned} {{\varvec{\Theta}}}^{j} & = \left[ {\begin{array}{*{20}l} {{{\varvec{\Theta}}}_{\alpha }^{j} } \hfill & {{{\varvec{\Theta}}}_{\alpha \beta }^{j} } \hfill \\ {{{\varvec{\Theta}}}_{\beta \alpha }^{j} } \hfill & {{{\varvec{\Theta}}}_{\beta }^{j} } \hfill \\ \end{array} } \right], \\ \mathop{\mathbf{A}}\limits^{\frown} & = {\mathbf{I}}_{2} \otimes {\mathbf{A}} = \left[ {\begin{array}{*{20}c} {\mathbf{A}} & {\mathbf{0}} \\ {\mathbf{0}} & {\mathbf{A}} \\ \end{array} } \right], \\ \mathop{\mathbf{y}}\limits^{\frown} & = {\mathbf{1}}_{2} \otimes {\mathbf{y}} = \left[ {\begin{array}{*{20}c} {\mathbf{y}} \\ {\mathbf{y}} \\ \end{array} } \right] \\ \end{aligned} $$
(30)

(\({\mathbf{1}}_{2} = [1,\;1]^{T}\)) and after including the mutually competing quantities being determined in combined vectors, i.e. having introduced vectors

$$ \begin{aligned} {{\varvec{\uplambda}}} & = \left[ {{{\varvec{\uplambda}}}_{\alpha }^{T} ,\;{{\varvec{\uplambda}}}_{\beta }^{T} } \right]^{T} ,\delta {\mathbf{X}} = \left[ {\delta {\mathbf{X}}_{\alpha }^{T} ,\;\delta {\mathbf{X}}_{\beta }^{T} } \right]^{T} , \\ {\mathbf{X}} & = \left[ {{\mathbf{X}}_{\alpha }^{T} ,\;{\mathbf{X}}_{\beta }^{T} } \right]^{T} \\ \end{aligned} $$
(31)

the system of Eqs. (28) can also be expressed as follows:

$$ {{\varvec{\Theta}}}^{j} {\hat{\boldsymbol{\lambda}}} = - (\mathop{\mathbf{y}}\limits^{\frown} - \mathop{\mathbf{A}}\limits^{\frown}\mathbf{X}^{j} - \mathop{\mathbf{A}}\limits^{\frown}{}^{j} \delta {\hat{\mathbf{X}}}) $$
(32)
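For orientation, the block matrix \({{\varvec{\Theta}}}^{j}\) of Eqs. (29)–(30) and the combined quantities entering Eq. (32) can be assembled as in the following sketch (the helper name build_combined_system and its argument list are illustrative assumptions):

```python
import numpy as np

def build_combined_system(A, E, y, W_a, W_b, Q_e, X_a, X_b):
    """Assemble Theta^j (Eqs. (29)-(30)) and the combined quantities of Eq. (30);
    W_a = W_alpha(upsilon_beta), W_b = W_beta(upsilon_alpha); X_a, X_b are the
    current parameter versions."""
    n = len(y)
    Xa_k = np.kron(X_a[None, :], np.eye(n)).T            # X_alpha ⊗ I_n
    Xb_k = np.kron(X_b[None, :], np.eye(n)).T            # X_beta ⊗ I_n
    T_a  = np.linalg.inv(W_a) + Xa_k.T @ Q_e @ Xa_k      # Theta_alpha^j
    T_ab = Xa_k.T @ Q_e @ Xb_k                           # Theta_alphabeta^j
    T_b  = np.linalg.inv(W_b) + Xb_k.T @ Q_e @ Xb_k      # Theta_beta^j
    Theta = np.block([[T_a, T_ab], [T_ab.T, T_b]])       # Theta^j, Eq. (30)
    A_big = np.kron(np.eye(2), A)                        # I_2 ⊗ A
    Aj_big = np.kron(np.eye(2), A - E)                   # I_2 ⊗ (A - E~^j)
    y_big = np.concatenate([y, y])                       # 1_2 ⊗ y
    return Theta, A_big, Aj_big, y_big
```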

The solution to Eq. (32) is the combined Lagrange multiplier vector of the following form:

$$ {\hat{\boldsymbol{\lambda}}} = \left[ {\begin{array}{*{20}c} {{\hat{\boldsymbol{\lambda}}}_{\alpha } } \\ {{\hat{\boldsymbol{\lambda}}}_{\beta } } \\ \end{array} } \right] = - ({{\varvec{\Theta}}}^{j} )^{ - 1} (\mathop{\mathbf{y}}\limits^{\frown} - \mathop{\mathbf{A}}\limits^{\frown}\mathbf{X}^{j} - \mathop{\mathbf{A}}\limits^{\frown}{}^{j} \delta {\hat{\mathbf{X}}}) $$
(33)

After substituting Eq. (33) to conditions (26d) and (26e), jointly recorded as

$$ \left. {\begin{array}{*{20}c} {({\mathbf{A}}^{j} )^{T} {\hat{\boldsymbol{\lambda}}}_{\alpha } = {\mathbf{0}}} \\ {({\mathbf{A}}^{j} )^{T} {\hat{\boldsymbol{\lambda}}}_{\beta } = {\mathbf{0}}} \\ \end{array} } \right\}\quad \Leftrightarrow ( \mathop{\mathbf{A}}\limits^{\frown}{}^{j})^{T} {\hat{\boldsymbol{\lambda}}} = {\mathbf{0}} $$
(34)

the following normal equation is obtained:

$$ ( \mathop{\mathbf{A}}\limits^{\frown}{}^{j})^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} (\mathop{\mathbf{y}}\limits^{\frown} - \mathop{\mathbf{A}}\limits^{\frown}\mathbf{X}^{j} - \mathop{\mathbf{A}}\limits^{\frown}{}^{j} \delta {\hat{\mathbf{X}}}) = {\mathbf{0}} $$
(35)

The solution to this equation is vector

$$ \delta {\hat{\mathbf{X}}}^{j + 1} = \left[ {\begin{array}{*{20}c}{\delta {\hat{\mathbf{X}}}_{\alpha }^{j + 1} } \\ {\delta{\hat{\mathbf{X}}}_{\beta }^{j + 1} } \\ \end{array} } \right] =\left( {(\mathop{\mathbf{A}}\limits^{\frown}{}^{j})^{T}({{\varvec{\Theta}}}^{j} )^{ -1}\mathop{\mathbf{A}}\limits^{\frown}{}^{j}} \right)^{ - 1} (\mathop{\mathbf{A}}\limits^{\frown}{}^{j})^{T}({{\varvec{\Theta}}}^{j} )^{ - 1} (\mathop{\mathbf{y}}\limits^{\frown} -\mathop{\mathbf{A}}\limits^{\frown}\mathbf{X}^{j} ) $$
(36)

which represents an evaluation of the combined vector of \(\delta {\mathbf{X}}\) increments in the (\(j + 1\)) iteration. In order to determine the combined vector \({\mathbf{X}} = {\mathbf{X}}^{j} + \delta {\mathbf{X}}\) that is valid in this iteration, it will be taken into account that \({\mathbf{A}} = {\mathbf{A}}^{j} + {\mathbf{E}}^{j}\), and thus, \(\mathop{\mathbf{A}}\limits^{\frown} = \mathop{\mathbf{A}}\limits^{\frown}{}^{j} + \mathop{\mathbf{E}}\limits^{\frown}{}^{j}\), where \(\mathop{\mathbf{E}}\limits^{\frown}{}^{j} = {\mathbf{I}}_{2} \otimes {\mathbf{E}}^{j}\). Equation (35) then takes the following form:

$$ (\mathop{\mathbf{A}}\limits^{\frown}{}^{j} )^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} (\mathop{\mathbf{y}}\limits^{\frown} - \mathop{\mathbf{A}}\limits^{\frown} {}^{j} {\mathbf{X}}^{j} - \mathop{\mathbf{E}}\limits^{\frown}{}^{j} {\mathbf{X}}^{j} - \mathop{\mathbf{A}}\limits^{\frown}{}^{j} \delta {\hat{\mathbf{X}}}) = {\mathbf{0}} $$
(37)

which yields the following:

$$ \begin{aligned} {\hat{\mathbf{X}}}^{j + 1} & = \left[ {\begin{array}{*{20}c} {{\hat{\mathbf{X}}}_{\alpha }^{j + 1} } \\ {{\hat{\mathbf{X}}}_{\beta }^{j + 1} } \\ \end{array} } \right] = {\hat{\mathbf{X}}}^{j} + \delta {\hat{\mathbf{X}}}^{j + 1} \\ & = \left( {({\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{A} }}^{j} )^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{A} }}^{j} } \right)^{ - 1} ({\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{A} }}^{j} )^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} ({\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{y} }} - {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{E} }}^{j} {\mathbf{X}}^{j} ) \\ \end{aligned} $$
(38)

Based on Eqs. (26a) and (26b), jointly recorded as

$$ \left. {\begin{array}{*{20}c} {{\tilde{\boldsymbol{\upupsilon}}}_{\alpha } = - {\mathbf{W}}_{\alpha }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\beta } ){\hat{\boldsymbol{\lambda}}}_{\alpha } } \\ {{\tilde{\boldsymbol{\upupsilon}}}_{\beta } = - {\mathbf{W}}_{\beta }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha } ){\hat{\boldsymbol{\lambda}}}_{\beta } } \\ \end{array} } \right\}\quad \Leftrightarrow \quad {\tilde{\boldsymbol{\upupsilon}}} = - {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{W} }}^{ - 1} {\hat{\boldsymbol{\lambda}}} $$
(39)

the combined residual vector that is valid in the \((j + 1)\) iteration can be determined:

$$ {\tilde{\boldsymbol{\upupsilon}}}^{j + 1} = \left[ {\begin{array}{*{20}c} {{\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j + 1} } \\ {{\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j + 1} } \\ \end{array} } \right] = {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{W} }}^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j} ,{\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j} )({{\varvec{\Theta}}}^{j} )^{ - 1} ({\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{y} }} - {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{A} }}{\hat{\mathbf{X}}}^{j} - {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{A} }}^{j} \delta {\hat{\mathbf{X}}}^{j + 1} ) $$
(40)

where

$$ {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{W} }}^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j} ,{\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j} ) = {\text{Diag}}\left( {{\mathbf{W}}_{\alpha }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j} ),\;{\mathbf{W}}_{\beta }^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j} )} \right) $$
(41)

On the other hand, based on Eq. (26c), expressed in the following form:

$$ {\tilde{\mathbf{e}}} = {\mathbf{Q}}_{{\mathbf{e}}} ({\mathbf{X}}_{ \otimes \alpha }^{j} {\hat{\boldsymbol{\lambda}}}_{\alpha } + {\mathbf{X}}_{ \otimes \beta }^{j} {\hat{\boldsymbol{\lambda}}}_{\beta } ) = {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes }^{j} {\hat{\boldsymbol{\lambda}}} $$
(42)

the following residual vector is determined:

$$ {\tilde{\mathbf{e}}}^{j + 1} = - {\mathbf{Q}}_{{\mathbf{e}}} {\mathbf{X}}_{ \otimes }^{j} ({{\varvec{\Theta}}}^{j} )^{ - 1} ({\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{y} }} -\mathop{\mathbf{A}}\limits^{\frown} \hat{\mathbf{X}}^{j} - {\mathbf{\overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\frown}$}}{A} }}^{j} \delta {\hat{\mathbf{X}}}^{j + 1} ) $$
(43)

where \({\mathbf{X}}_{ \otimes }^{j} = \left[ {({\mathbf{X}}_{ \otimes \alpha }^{j} )^{T} ,({\mathbf{X}}_{ \otimes \beta }^{j} )^{T} } \right]^{T}\). Taking that vector, we can obtain \({\tilde{\mathbf{E}}}^{j + 1}\), and hence, the matrix \({\mathbf{A}}^{j + 1} = {\mathbf{A}} - {\tilde{\mathbf{E}}}^{j + 1}\).

Variance coefficient estimators in TMsplit estimation should be derived by referring to the split EIV models (e.g. by applying the theory presented in Wiśniewski and Zienkiewicz (2021b)). However, this problem, which requires additional detailed theoretical and empirical analyses, is beyond the scope of this paper. With minor random disturbances of matrix A, the variance coefficient estimators appropriate for Msplit estimation can also be recommended here. Then, in Eq. (8), the vectors \({\tilde{\mathbf{v}}}_{\alpha }\) and \({\tilde{\mathbf{v}}}_{\beta }\) should be replaced by the vectors \({\tilde{\boldsymbol{\upupsilon}}}_{\alpha } = {\mathbf{y}} - ({\mathbf{A}} - {\tilde{\mathbf{E}}}){\hat{\mathbf{X}}}_{\alpha }\) and \({\tilde{\boldsymbol{\upupsilon}}}_{\beta } = {\mathbf{y}} - ({\mathbf{A}} - {\tilde{\mathbf{E}}}){\hat{\mathbf{X}}}_{\beta }\). These estimators of variance coefficients remain unbiased; however, they lose their invariance as the values of \({\tilde{\mathbf{E}}}\hat{\mathbf{X}}_{\alpha }\) and \({\tilde{\mathbf{E}}}\hat{\mathbf{X}}_{\beta }\) grow.

3.2 Algorithm

The Total Msplit estimation algorithm contains the following basic elements: Step 0—a starting step; Step 1—iterative calculation of Msplit estimators for the valid split EIV models (with internal iterations \(l = 0, \ldots ,s\)); Step 2—updating the EIV models' parameters and returning to Step 1 (until the adopted criterion for stopping the iterative process is met, \(j = 0, \ldots ,k\)); Step 3—adopting the final values of the Total Msplit estimators. Each of these steps is described in more detail below.

Step 0: Similar to Msplit estimation, the iterative process can also be initiated here using the following classical least-squares (LS) estimators

$$ {\hat{\mathbf{X}}}_{LS} = ({\mathbf{A}}^{T} {\mathbf{WA}})^{ - 1} {\mathbf{A}}^{T} {\mathbf{Wy}}\quad {\text{and}}\quad {\tilde{\mathbf{v}}}_{LS} = {\mathbf{y}} - {\mathbf{A}}\hat{\mathbf{X}}_{LS} $$
(44)

where \({\mathbf{W}} = {\mathbf{Q}}_{{\mathbf{y}}}^{ - 1}\) is the weight matrix. Therefore, the following are adopted: \({\mathbf{X}}_{\alpha (0)}^{0} = {\hat{\mathbf{X}}}_{LS}\), \({\mathbf{X}}_{\beta (0)}^{0} = {\hat{\mathbf{X}}}_{LS}\), \({{\varvec{\upupsilon}}}_{\alpha (0)}^{0} = {\tilde{\mathbf{v}}}_{LS}\), \({{\varvec{\upupsilon}}}_{\beta (0)}^{0} = {\tilde{\mathbf{v}}}_{LS}\). Moreover, \({\mathbf{E}}^{0} = {\mathbf{0}}\), \({\mathbf{A}}^{0} = {\mathbf{A}}\) and \(\delta {\hat{\mathbf{X}}}^{0} = {\mathbf{0}}\).
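Step 0 thus amounts to a single weighted LS solve; a minimal sketch (the function name step0 is an illustrative assumption) is:

```python
import numpy as np

def step0(A, y, Q_y):
    """Step 0: classical weighted LS start of Eq. (44); both parameter versions
    and both residual vectors are initialised with the same LS quantities."""
    W = np.linalg.inv(Q_y)                               # W = Q_y^{-1}
    X_ls = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    v_ls = y - A @ X_ls
    E0 = np.zeros_like(A)                                # E^0 = 0, so A^0 = A
    return X_ls, v_ls, E0
```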

Step 1: Calculate Msplit estimators \({\hat{\mathbf{X}}}_{\alpha }^{j}\) and \({\hat{\mathbf{X}}}_{\beta }^{j}\):

step \(1_{(1)}\): Based on the valid vector \({{\varvec{\upupsilon}}}_{\beta (l)}^{j}\), the following weight matrix is constructed

$$ {\mathbf{W}}_{\alpha } ({{\varvec{\upupsilon}}}_{\beta (l)}^{j} ) = {\text{Diag}}\left( {\left( {{{\varvec{\upupsilon}}}_{1\beta (l)}^{j} } \right)^{2} q_{1}^{ - 2} , \ldots ,\left( {{{\varvec{\upupsilon}}}_{n\beta (l)}^{j} } \right)^{2} q_{n}^{ - 2} } \right) $$
(45)

and the following is calculated:

$$ \begin{aligned} {\mathbf{X}}_{\alpha (l + 1)}^{j} & = \left( {{\mathbf{A}}^{T} {\mathbf{W}}_{\alpha } ({{\varvec{\upupsilon}}}_{\beta (l)}^{j} ){\mathbf{A}}} \right)^{ - 1} {\mathbf{A}}^{T} {\mathbf{W}}_{\alpha } ({{\varvec{\upupsilon}}}_{\beta (l)}^{j} ){\mathbf{y}} \\ {{\varvec{\upupsilon}}}_{\alpha (l + 1)}^{j} & = {\mathbf{y}} - {\mathbf{A}}^{j} {\mathbf{X}}_{\alpha (l + 1)}^{j} \\ \end{aligned} $$
(46)

step \(1_{(2)}\): Based on the vector \({{\varvec{\upupsilon}}}_{\alpha (l + 1)}^{j}\), the following weight matrix is constructed

$$ {\mathbf{W}}_{\beta } ({{\varvec{\upupsilon}}}_{\alpha (l + 1)}^{j} ) = {\text{Diag}}\left( {\left( {{{\varvec{\upupsilon}}}_{1\alpha (l + 1)}^{j} } \right)^{2} q_{1}^{ - 2} , \ldots ,\left( {{{\varvec{\upupsilon}}}_{n\alpha (l + 1)}^{j} } \right)^{2} q_{n}^{ - 2} } \right) $$
(47)

and the following is calculated:

$$ \begin{aligned} {\mathbf{X}}_{\beta (l + 1)}^{j} & = \left( {{\mathbf{A}}^{T} {\mathbf{W}}_{\beta } ({{\varvec{\upupsilon}}}_{\alpha (l + 1)}^{j} ){\mathbf{A}}} \right)^{ - 1} {\mathbf{A}}^{T} {\mathbf{W}}_{\beta } ({{\varvec{\upupsilon}}}_{\alpha (l + 1)}^{j} ){\mathbf{y}} \\ {{\varvec{\upupsilon}}}_{\beta (l + 1)}^{j} & = {\mathbf{y}} - {\mathbf{A}}^{j} {\mathbf{X}}_{\beta (l + 1)}^{j} \\ \end{aligned} $$
(48)

step \(1_{(3)}\): Repeat steps \(1_{(1)}\) and \(1_{(2)}\) until

$$ \begin{gathered} \left\| {{\mathbf{X}}_{\alpha (l + 1)}^{j} - {\mathbf{X}}_{\alpha (l)}^{j} } \right\| < \varepsilon_{0} \quad {\text{and}} \hfill \\ \left\| {{\mathbf{X}}_{\beta (l + 1)}^{j} - {\mathbf{X}}_{\beta (l)}^{j} } \right\| < \varepsilon_{0} \quad ({\text{for a given}}\,\varepsilon_{0} ) \hfill \\ \end{gathered} $$
(49)

Once criterion (49) has been satisfied, the following Msplit estimators, valid in the j-th iteration,

$$ {\hat{\mathbf{X}}}_{\alpha }^{j} = {\mathbf{X}}_{\alpha (l + 1)}^{j} ,{\hat{\mathbf{X}}}_{\beta }^{j} = {\mathbf{X}}_{\beta (l + 1)}^{j} $$
(50)

and residual vectors

$$ {\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j} = {\mathbf{y}} - {\mathbf{A}}^{j} {\hat{\mathbf{X}}}_{\alpha }^{j} ,{\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j} = {\mathbf{y}} - {\mathbf{A}}^{j} {\hat{\mathbf{X}}}_{\beta }^{j} $$
(51)

are adopted.
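Steps \(1_{(1)}\)–\(1_{(3)}\) can be sketched as the inner loop below, a hedged numpy illustration of Eqs. (45)–(51); the function name, the convergence defaults and the interface are assumptions made for this sketch.

```python
import numpy as np

def msplit_inner(A_j, y, q, v_alpha, v_beta, eps0=1e-8, max_iter=100):
    """Step 1: inner Msplit iteration for the currently valid EIV model.

    A_j             : current design matrix of the j-th external iteration (A for j = 0)
    y               : observation vector
    q               : vector of q_i used in the weights w = v**2 / q**2
                      (in Example 1, q_i = sigma_y**2, since q_i**-2 = sigma_y**-4)
    v_alpha, v_beta : starting residual vectors (from Step 0 or the previous step)
    """
    X_alpha = X_beta = None
    for _ in range(max_iter):
        # step 1(1): weights built from the competing residuals v_beta, Eq. (45)
        W_alpha = np.diag(v_beta**2 / q**2)
        X_alpha_new = np.linalg.solve(A_j.T @ W_alpha @ A_j, A_j.T @ W_alpha @ y)
        v_alpha = y - A_j @ X_alpha_new                        # Eq. (46)

        # step 1(2): weights built from the residuals v_alpha, Eq. (47)
        W_beta = np.diag(v_alpha**2 / q**2)
        X_beta_new = np.linalg.solve(A_j.T @ W_beta @ A_j, A_j.T @ W_beta @ y)
        v_beta = y - A_j @ X_beta_new                          # Eq. (48)

        # step 1(3): stopping criterion, Eq. (49)
        if (X_alpha is not None
                and np.linalg.norm(X_alpha_new - X_alpha) < eps0
                and np.linalg.norm(X_beta_new - X_beta) < eps0):
            X_alpha, X_beta = X_alpha_new, X_beta_new
            break
        X_alpha, X_beta = X_alpha_new, X_beta_new

    # Eqs. (50)-(51): adopted estimators and residual vectors of the j-th iteration
    return X_alpha, X_beta, y - A_j @ X_alpha, y - A_j @ X_beta
```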

Step 2: For the determined iterative Msplit estimators \({\hat{\mathbf{X}}}_{\alpha }^{j}\), \({\hat{\mathbf{X}}}_{\beta }^{j}\), and residual vectors \({\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j}\), \({\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j}\), weight matrices that are valid in the j-th iteration are determined:

$$ \begin{aligned} {\mathbf{W}}_{\alpha } ({\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j} ) & = {\text{Diag}}\left( {\left( {{\tilde{\boldsymbol{\upupsilon}}}_{1\beta }^{j} } \right)^{2} q_{1}^{ - 2} , \ldots ,\left( {{\tilde{\boldsymbol{\upupsilon}}}_{n\beta }^{j} } \right)^{2} q_{n}^{ - 2} } \right) \\ {\mathbf{W}}_{\beta } ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j} ) & = {\text{Diag}}\left( {\left( {{\tilde{\boldsymbol{\upupsilon}}}_{1\alpha }^{j} } \right)^{2} q_{1}^{ - 2} , \ldots ,\left( {{\tilde{\boldsymbol{\upupsilon}}}_{n\alpha }^{j} } \right)^{2} q_{n}^{ - 2} } \right) \\ \end{aligned} $$
(52)

Based on these matrices and taking \({\hat{\mathbf{X}}}_{ \otimes \alpha }^{j} = {\hat{\mathbf{X}}}_{\alpha }^{j} \otimes {\mathbf{I}}_{n}\), \({\hat{\mathbf{X}}}_{ \otimes \beta }^{j} = {\hat{\mathbf{X}}}_{\beta }^{j} \otimes {\mathbf{I}}_{n}\), we can determine matrix \({{\varvec{\Theta}}}^{j}\), Eq. (30); then, the combined vector of increments is calculated:

$$ \delta {\hat{\mathbf{X}}}^{j + 1} = \left( {(\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} (\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )} \right)^{ - 1} (\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} (\overset{\frown}{\mathbf{y}} - \overset{\frown}{\mathbf{A}}{\hat{\mathbf{X}}}^{j} ) $$
(53)

This vector enables the calculation of the Lagrange multiplier vector valid in the (\(j + 1\))-th iteration

$$ {\hat{\boldsymbol{\lambda}}}^{j + 1} = - ({{\varvec{\Theta}}}^{j} )^{ - 1} \left( {\overset{\frown}{\mathbf{y}} - \overset{\frown}{\mathbf{A}}{\hat{\mathbf{X}}}^{j} - (\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )\delta {\hat{\mathbf{X}}}^{j + 1} } \right) $$
(54)

and then the calculation of combined residual vectors

$$ {\tilde{\boldsymbol{\upupsilon}}}^{j + 1} = \overset{\frown}{\mathbf{W}}{}^{ - 1} ({\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{j} ,{\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{j} )({{\varvec{\Theta}}}^{j} )^{ - 1} \left( {\overset{\frown}{\mathbf{y}} - \overset{\frown}{\mathbf{A}}{\hat{\mathbf{X}}}^{j} - (\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )\delta {\hat{\mathbf{X}}}^{j + 1} } \right) $$
(55)

and

$$ {\tilde{\mathbf{e}}}^{j + 1} = - {\mathbf{Q}}_{{\mathbf{e}}} {\hat{\mathbf{X}}}_{ \otimes }^{j} ({{\varvec{\Theta}}}^{j} )^{ - 1} \left( {\overset{\frown}{\mathbf{y}} - \overset{\frown}{\mathbf{A}}{\hat{\mathbf{X}}}^{j} - (\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )\delta {\hat{\mathbf{X}}}^{j + 1} } \right) $$
(56)

Based on vector \({\tilde{\mathbf{e}}}^{j + 1}\), the matrix of disturbances \({\tilde{\mathbf{E}}}^{j + 1}\) is built and the parameter vector valid in the (\(j + 1\))-th iteration is calculated:

$$ {\hat{\mathbf{X}}}^{j + 1} = \left( {(\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} (\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )} \right)^{ - 1} (\overset{\frown}{\mathbf{A}} - \overset{\frown}{\mathbf{E}}{}^{j} )^{T} ({{\varvec{\Theta}}}^{j} )^{ - 1} (\overset{\frown}{\mathbf{y}} - \overset{\frown}{\mathbf{E}}{}^{j} {\hat{\mathbf{X}}}^{j} ) $$
(57)

Step 3: Repeat Step 1 and Step 2 until

$$ \left\| {{\hat{\mathbf{X}}}^{j + 1} - {\hat{\mathbf{X}}}^{j} } \right\| < \varepsilon_{0} $$
(58)

Once criterion (58) has been satisfied, the TMsplit estimator of the combined parameter vector \({\mathbf{X}} = \left[ {{\mathbf{X}}_{\alpha }^{T} ,\;{\mathbf{X}}_{\beta }^{T} } \right]^{T}\) is assumed to be \({\hat{\mathbf{X}}} = {\hat{\mathbf{X}}}^{j + 1}\). The blocks of this vector are the TMsplit estimators \({\hat{\mathbf{X}}}_{\alpha }\) and \({\hat{\mathbf{X}}}_{\beta }\), which can be extracted from \({\hat{\mathbf{X}}}\) using the relationships

$$ {\hat{\mathbf{X}}}_{\alpha } = {\mathbf{D}}_{{X_{\alpha } }} {\hat{\mathbf{X}}}\quad {\text{and}}\quad {\hat{\mathbf{X}}}_{\beta } = {\mathbf{D}}_{{X_{\beta } }} {\hat{\mathbf{X}}} $$
(59)

where \({\mathbf{D}}_{{X_{\alpha } }} = \left[ {{\mathbf{I}}_{m} ,\;{\mathbf{0}}_{m,m} } \right]\) and \({\mathbf{D}}_{{X_{\beta } }} = \left[ {{\mathbf{0}}_{m,m} ,{\mathbf{I}}_{m} \;} \right]\)(\({\mathbf{0}}_{m,m}\)—zero matrix with dimensions of \(m \times m\)). Once the iterative process is complete, the final residual vectors \({\tilde{\boldsymbol{\upupsilon}}} = {\tilde{\boldsymbol{\upupsilon}}}^{k}\) and \({\tilde{\mathbf{e}}} = {\tilde{\mathbf{e}}}^{k}\) are also determined. Based on vector \({\tilde{\boldsymbol{\upupsilon}}} = [{\tilde{\boldsymbol{\upupsilon}}}_{\alpha }^{T} ,{\tilde{\boldsymbol{\upupsilon}}}_{\beta }^{T} ]^{T}\), two versions of the residual vector corresponding to the observation vector y, i.e.

$$ {\tilde{\boldsymbol{\upupsilon}}}_{\alpha } = {\mathbf{D}}_{{\upsilon_{\alpha } }} {\tilde{\boldsymbol{\upupsilon}}}\quad {\text{and}}\quad {\tilde{\boldsymbol{\upupsilon}}}_{\beta } = {\mathbf{D}}_{{\upsilon_{\beta } }} {\tilde{\boldsymbol{\upupsilon}}} $$
(60)

where \({\mathbf{D}}_{{\upsilon_{\alpha } }} = \left[ {{\mathbf{I}}_{n} ,\;{\mathbf{0}}_{n,n} } \right]\) and \({\mathbf{D}}_{{\upsilon_{\beta } }} = \left[ {{\mathbf{0}}_{n,n} ,{\mathbf{I}}_{n} \;} \right]\), are obtained. On the other hand, vector \({\tilde{\mathbf{e}}}\) provides the basis for the construction of residual matrix \({\tilde{\mathbf{E}}}\) corresponding to matrix A, hence also \({\tilde{\mathbf{A}}} = {\mathbf{A}} - {\tilde{\mathbf{E}}}\).
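In code, the extraction in Eqs. (59)–(60) amounts to simple block slicing; the short numpy sketch below (with assumed example dimensions and placeholder vectors) makes the role of the selection matrices explicit.

```python
import numpy as np

m, n = 2, 10                                      # example dimensions (assumed)
D_Xa = np.hstack([np.eye(m), np.zeros((m, m))])   # D_{X_alpha}, Eq. (59)
D_Xb = np.hstack([np.zeros((m, m)), np.eye(m)])   # D_{X_beta}
D_va = np.hstack([np.eye(n), np.zeros((n, n))])   # D_{v_alpha}, Eq. (60)
D_vb = np.hstack([np.zeros((n, n)), np.eye(n)])   # D_{v_beta}

X_hat = np.arange(2.0 * m)                        # placeholder combined parameter vector
v_tilde = np.arange(2.0 * n)                      # placeholder combined residual vector

X_alpha, X_beta = D_Xa @ X_hat, D_Xb @ X_hat       # equivalent to X_hat[:m], X_hat[m:]
v_alpha, v_beta = D_va @ v_tilde, D_vb @ v_tilde   # equivalent to v_tilde[:n], v_tilde[n:]
```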

The above-described iterative process of determining TMsplit estimates is also summarised in the flowchart presented in Fig. 1.

Fig. 1

Flowchart of Total Msplit estimation

4 Examples

4.1 Example 1: competitive models of systematic errors

In one of the examples provided in a study by Wiśniewski (2010), it was assumed that \(y_{i}\), \(i = 1, \ldots ,n\) were observations of a certain value of \(Y\) disturbed not only with random errors \(v_{i}\) but also with systematic errors \(s_{i} = s(t_{i} ) = a + bt_{i}\) (e.g. Wiśniewski 1985; Kubáčková and Kubáček 1991; Yang and Zhang 2005). The problem, however, is that two versions of this model can be used: \(s_{\alpha } (t_{i} ) = a_{\alpha } + b_{\alpha } t_{i}\) and \(s_{\beta } (t_{i} ) = a_{\beta } + b_{\beta } t_{i}\), whereas it is not known which of them concerns specific observation \(y_{i}\). For this reason, in Msplit estimation, the classical observation model

$$ y_{i} = Y + s(t_{i} ) + v_{i} = (Y + a) + bt_{i} + v_{i} = X + bt_{i} + v_{i} $$
(61)

is split into the following models:

$$ \begin{aligned} y_{i} & = (Y + a_{\alpha } ) + b_{\alpha } t_{i} + v_{i\alpha } = X_{\alpha } + b_{\alpha } t_{i} + v_{i\alpha } \\ y_{i} & = (Y + a_{\beta } ) + b_{\beta } t_{i} + v_{i\beta } = X_{\beta } + b_{\beta } t_{i} + v_{i\beta } \\ \end{aligned} $$
(62)

where \(X = Y + a\). In these models, two mutually competing versions of the parameters occur, namely \(X_{\alpha } = Y + a_{\alpha }\) and \(X_{\beta } = Y + a_{\beta }\). Observations were simulated under the assumption of theoretical values of the parameters \(X_{\alpha } ,\;b_{\alpha }\) and \(X_{\beta } ,\;b_{\beta }\). Theoretical observations \(\overline{y}_{i}\), \(i = 1, \ldots ,10\), were affected by Gaussian errors with an expected value of 0 and a standard deviation of \(\sigma_{y}\). For the theoretical values \(X_{\alpha } = 6.0\), \(b_{\alpha } = 0.5\), \(X_{\beta } = 3.0\), \(b_{\beta } = 1.0\), and the standard deviation \(\sigma_{y} = 0.14\), the set of observations presented in Table 1 was obtained. The table also presents the Msplit estimates \(\hat{X}_{\alpha }\), \(\hat{b}_{\alpha }\), \(\hat{X}_{\beta }\), \(\hat{b}_{\beta }\), the mutually competing residuals \(\hat{v}_{i\alpha }\), \(\hat{v}_{i\beta }\), and the weights related to such residuals, \(w_{\alpha } (\hat{v}_{i\beta } ) = \hat{v}_{i\beta }^{2} q_{i}^{ - 2}\) and \(w_{\beta } (\hat{v}_{i\alpha } ) = \hat{v}_{i\alpha }^{2} q_{i}^{ - 2}\) (for \(q_{i}^{ - 2} = \sigma_{y}^{ - 4}\)). A graphical illustration of the set of observations and a graphical interpretation of the obtained results (as compared to the LS-estimators determined using model (61)) are shown in Fig. 2. For the sake of clarity, the figure shows only the competitive residuals of the observation \(y_{10}\). In this figure, the mutually competing results of Msplit estimation are conventionally denoted as Msplit(α) and Msplit(β) (when convenient, these notations will also be used further on in this paper).
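For illustration, a minimal numpy sketch of how such a mixed observation set can be simulated is given below; the values of \(t_i\) and the assignment of observations to the α and β models are assumptions made for this sketch, not taken from Table 1.

```python
import numpy as np

rng = np.random.default_rng(1)

# theoretical parameters and noise level from the example
X_a, b_a = 6.0, 0.5        # alpha version of the split model, Eq. (62)
X_b, b_b = 3.0, 1.0        # beta version
sigma_y = 0.14

# assumed design of the simulation: t_i = 1,...,10, the first five observations
# follow the alpha model and the remaining five the beta model
t = np.arange(1.0, 11.0)
y_bar = np.where(t <= 5, X_a + b_a * t, X_b + b_b * t)   # theoretical observations
y = y_bar + sigma_y * rng.standard_normal(t.size)        # simulated observed values
```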

Table 1 Observed data and results of Msplit estimation (Wiśniewski 2010)
Fig. 2

Set of observations and results of Msplit estimation (as compared to the results of LS-estimation); the example competitive residuals \(\hat{v}_{10,\alpha }\) and \(\hat{v}_{10,\beta }\) are shown for the observation \(y_{10} = 11.2\)

The models contained in Eq. (62) will now be replaced with EIV models of the following form:

$$ \begin{aligned} y_{i} & = X_{\alpha } + b_{\alpha } (t_{i} - e_{{t_{i} }} ) + \upsilon_{i\alpha } \\ y_{i} & = X_{\beta } + b_{\beta } (t_{i} - e_{{t_{i} }} ) + \upsilon_{i\beta } \\ \end{aligned} $$
(63)

where \(e_{{t_{i} }}\) is a random error affecting the variable \(t_{i}\). For n observations, based on Eq. (63), the models \({\mathbf{y}} = ({\mathbf{A}} - {\mathbf{E}}){\mathbf{X}}_{\alpha } + {{\varvec{\upupsilon}}}_{\alpha }\) and \({\mathbf{y}} = ({\mathbf{A}} - {\mathbf{E}}){\mathbf{X}}_{\beta } + {{\varvec{\upupsilon}}}_{\beta }\) are constructed, where

$$ \begin{aligned} {\mathbf{A}} & = \left[ {\begin{array}{*{20}l} {1_{1} } \hfill & {t_{1} } \hfill \\ \vdots \hfill & \vdots \hfill \\ {1_{n} } \hfill & {t_{n} } \hfill \\ \end{array} } \right], \\ {\mathbf{E}} & = \left[ {\begin{array}{*{20}l} {0_{1} } \hfill & {e_{{t_{1} }} } \hfill \\ \vdots \hfill & \vdots \hfill \\ {0_{n} } \hfill & {e_{{t_{n} }} } \hfill \\ \end{array} } \right] = \left[ {{\mathbf{0}}_{n} ,\;{\mathbf{e}}_{t} } \right], \\ {\mathbf{X}}_{\alpha } & = \left[ {\begin{array}{*{20}c} {X_{\alpha } } \\ {b_{\alpha } } \\ \end{array} } \right],{\mathbf{X}}_{\beta } = \left[ {\begin{array}{*{20}c} {X_{\beta } } \\ {b_{\beta } } \\ \end{array} } \right] \\ \end{aligned} $$
(64)

(the first column of matrix A is not random). Vector \({\mathbf{e}} = {\text{vec}}({\mathbf{E}})\), built from the columns of matrix E, has the following form:

$$ {\mathbf{e}} = \left[ {0_{1} , \ldots ,0_{n} ,e_{{t_{1} }} , \ldots ,e_{{t_{n} }} } \right]^{T} = \left[ {{\mathbf{0}}_{n}^{T} ,\;{\mathbf{e}}_{t}^{T} } \right]^{T} $$
(65)

In view of the structure of this vector, the cofactor matrix \({\mathbf{Q}}_{{\mathbf{e}}}\), similarly as in Schaffrin and Wieser (2008) and Shen et al. (2011), is expressed in the following form:

$$ {\mathbf{Q}}_{{\mathbf{e}}} = {\mathbf{Q}}_{0} \otimes {\mathbf{Q}}_{{\mathbf{x}}} = {\mathbf{Q}}_{0} \otimes {\mathbf{Q}}_{{{\mathbf{e}}_{t} }} \quad {\text{with}}\quad {\mathbf{Q}}_{0} = \left[ {\begin{array}{*{20}c} 0 & 0 \\ 0 & 1 \\ \end{array} } \right] $$
(66)

where \({\mathbf{Q}}_{{{\mathbf{e}}_{t} }}\) is the cofactor matrix of vector \({\mathbf{e}}_{t}\). Note that here \({\mathbf{Q}}_{{{\mathbf{e}}_{t} }}\) is regular, but \({\mathbf{Q}}_{0}\) is not; thus, \({\mathbf{Q}}_{{\mathbf{e}}}\) is not regular either (a similar situation occurs in the example given by Schaffrin and Wieser 2008). Random errors \(e_{{t_{i} }}\) are simulated as Gaussian quantities with an expected value of 0 and a standard deviation of \(\sigma_{e}\). Msplit and TMsplit estimators of the model (63) parameters will be determined for four values of this standard deviation: \(\sigma_{e} = 0\) (variant I), \(\sigma_{e} = 0.13\) (variant II), \(\sigma_{e} = 0.28\) (variant III), \(\sigma_{e} = 0.37\) (variant IV). In each of these variants, the observations \(y_{i}\), \(i = 1, \ldots ,10\), are as in the example cited above (Table 1). The data adopted for the calculations are listed in Table 2, while the Msplit and TMsplit estimators obtained for these data are presented in Table 3. Additionally, the residuals \(\tilde{e}_{{t_{i} }}\) are presented in Table 4. Table 5 shows the competitive residuals and the respective weights for variant III.
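A numpy sketch of the structures in Eqs. (64)–(66) is given below; the values of \(t_i\) and the choice of \({\mathbf{Q}}_{{{\mathbf{e}}_{t} }}\) (taken here simply as \(\sigma_{e}^{2}\mathbf{I}\)) are assumptions made only for the illustration.

```python
import numpy as np

n = 10
t = np.arange(1.0, n + 1.0)                      # assumed values of t_i
sigma_e = 0.28                                   # variant III

A = np.column_stack([np.ones(n), t])             # Eq. (64): rows [1, t_i]
rng = np.random.default_rng(2)
e_t = sigma_e * rng.standard_normal(n)           # simulated disturbances of the t-column
E = np.column_stack([np.zeros(n), e_t])          # only the second column of A is random

e = E.flatten(order="F")                         # e = vec(E), Eq. (65)

Q0 = np.array([[0.0, 0.0],
               [0.0, 1.0]])
Q_et = sigma_e**2 * np.eye(n)                    # cofactor matrix of e_t (assumed here)
Q_e = np.kron(Q0, Q_et)                          # Eq. (66); singular, like Q0
```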

Table 2 Observed data for Total Msplit estimation (variant I, II, III, IV)
Table 3 Results of Msplit and Total Msplit estimation (variant I, II, III, IV)
Table 4 Residuals \(\tilde{e}_{t}\) of the second column of matrix A
Table 5 Residuals and weights in Msplit and Total Msplit estimations (variant III, \(\sigma_{e} = 0.28\))

In variant I, since matrix A is not affected by random disturbances, the TMsplit estimators are equal to the Msplit estimators. With an increase in the standard deviation \(\sigma_{e}\), the differences between these estimators increase, with the TMsplit estimators remaining close to the theoretical values (as do the Msplit estimators for the variant without random errors in matrix A, \(\sigma_{e} = 0\)). This is well illustrated by the norm values of the vector of differences between the vector of theoretical parameters \({\mathbf{X}} = [{\mathbf{X}}_{\alpha }^{T} ,{\mathbf{X}}_{\beta }^{T} ]^{T}\) and the vector of obtained estimators \({\hat{\mathbf{X}}} = [{\hat{\mathbf{X}}}_{\alpha }^{T} ,{\hat{\mathbf{X}}}_{\beta }^{T} ]^{T}\), provided in the last row of Table 3.

In general, the iterative process involved in the determination of TMsplit estimators ended after 4 to 6 steps of the “external” iteration. In each of these steps, 6 to 7 “internal” iterations were carried out, resulting in the Msplit estimators valid for that step. The course of the iterative process in Total Msplit estimation, based on the example of variant II, is presented in Fig. 3.

Fig. 3

Iterative process resulting in TMsplit estimators of parameters \(X_{\alpha } ,\;b_{\alpha }\) and parameters \(X_{\beta } ,\;b_{\beta }\) competing in relation to them (j—the number of “external” iterations, l—the number of “internal” iterations)

Each of the examples given above applies a single observation set. The additional analyses are based on Monte Carlo (MC) simulations; their main objective is to determine the empirical accuracy of Total Msplit estimation and the measures of its efficacy.

The accuracy of Msplit estimation can be determined by applying the asymptotic covariance matrices proposed in Wiśniewski and Zienkiewicz (2021b); their diagonal elements allow us to compute the estimated standard deviations \(\sigma_{{\hat{X}_{\alpha } }}\), \(\sigma_{{\hat{b}_{\alpha } }}\), \(\sigma_{{\hat{X}_{\beta } }}\), \(\sigma_{{\hat{b}_{\beta } }}\) of the respective Msplit estimates. Applying such an approach to Total Msplit estimation would require developing the theory presented in the paper mentioned, which is beyond the scope of the present paper. An empirical assessment based on simulated observation sets can serve as an alternative to the analytical one. Total Msplit estimates \(\hat{X}_{\alpha }^{k}\), \(\hat{b}_{\alpha }^{k}\), \(\hat{X}_{\beta }^{k}\), \(\hat{b}_{\beta }^{k}\) are computed for each simulation \(k = 1, \ldots ,N\), which is the basis for determining the following MC-estimates of the parameters \(X_{\alpha }\), \(b_{\alpha }\), \(X_{\beta }\), \(b_{\beta }\):

$$ \begin{gathered} \hat{X}_{\alpha }^{MC} = \frac{1}{N}\sum\limits_{k = 1}^{N} {\hat{X}_{\alpha }^{k} } ,\quad \hat{b}_{\alpha }^{MC} = \frac{1}{N}\sum\limits_{k = 1}^{N} {\hat{b}_{\alpha }^{k} } , \hfill \\ \hat{X}_{\beta }^{MC} = \frac{1}{N}\sum\limits_{k = 1}^{N} {\hat{X}_{\beta }^{k} } ,\quad \hat{b}_{\beta }^{MC} = \frac{1}{N}\sum\limits_{k = 1}^{N} {\hat{b}_{\beta }^{k} } \hfill \\ \end{gathered} $$
(67)

The Monte Carlo estimators of the parameter standard deviations can be computed in the following way (e.g. Koch 2013; Nowel 2016; Lv and Sui 2020):

$$ \begin{aligned} \hat{\sigma }_{{\hat{X}_{\alpha } }}^{MC} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{X}_{\alpha }^{k} - \hat{X}_{\alpha }^{MC} )^{2} } } , \\ \hat{\sigma }_{{\hat{b}_{\alpha } }}^{MC} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{b}_{\alpha }^{k} - \hat{b}_{\alpha }^{MC} )^{2} } } \\ \hat{\sigma }_{{\hat{X}_{\beta } }}^{MC} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{X}_{\beta }^{k} - \hat{X}_{\beta }^{MC} )^{2} } } , \\ \hat{\sigma }_{{\hat{b}_{\beta } }}^{MC} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{b}_{\beta }^{k} - \hat{b}_{\beta }^{MC} )^{2} } } \\ \end{aligned} $$
(68)

Such quantities determine the accuracy of the estimates obtained. They can also be used to compare the stability of Msplit and Total Msplit estimators.

Msplit estimation and its several developments (including Total Msplit estimation) focus on optimally fitting the competing functional models to the observation set. The efficacy of Msplit estimation, like that of other methods, can be described by the differences between the parameter estimates obtained and the actual parameter values. Considering N simulations, the efficacy is usually measured by the root mean squared error (RMSE) (e.g. Kargoll et al. 2018; Lv and Sui 2020). Here, the efficacy of Msplit or Total Msplit estimates is determined by the following RMSEs:

$$ \begin{aligned} {\text{RMSE}}_{{\hat{X}_{\alpha } }} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{X}_{\alpha }^{k} - X_{\alpha } )^{2} } } , \\ {\text{RMSE}}_{{\hat{b}_{\alpha } }} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{b}_{\alpha }^{k} - b_{\alpha } )^{2} } } \\ {\text{RMSE}}_{{\hat{X}_{\beta } }} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{X}_{\beta }^{k} - X_{\beta } )^{2} } } , \\ {\text{RMSE}}_{{\hat{b}_{\beta } }} & = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {(\hat{b}_{\beta }^{k} - b_{\beta } )^{2} } } \\ \end{aligned} $$
(69)

Additionally, the global root mean squared error, concerning the whole parameter vector, is determined (e.g. Wiśniewski 2014)

$$ {\text{RMSE}}_{{{\hat{\mathbf{X}}}}} = \sqrt {\frac{1}{N}\sum\limits_{k = 1}^{N} {({\hat{\mathbf{X}}}^{k} - {\mathbf{X}})^{T} ({\hat{\mathbf{X}}}^{k} - {\mathbf{X}})/r} } $$
(70)

where r is the number of estimated parameters (here \(r = 4\)).
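A compact numpy sketch of the empirical measures in Eqs. (67)–(70) is given below; the function name and interface are assumptions made for this illustration.

```python
import numpy as np

def mc_accuracy_efficacy(X_hat_all, X_true):
    """Empirical MC measures, Eqs. (67)-(70).

    X_hat_all : (N, r) array of estimates [X_a, b_a, X_b, b_b] from N simulations
    X_true    : (r,) vector of theoretical parameter values
    """
    X_mc = X_hat_all.mean(axis=0)                                  # MC-estimates, Eq. (67)
    sigma_mc = np.sqrt(((X_hat_all - X_mc)**2).mean(axis=0))       # empirical std devs, Eq. (68)
    rmse = np.sqrt(((X_hat_all - X_true)**2).mean(axis=0))         # per-parameter RMSEs, Eq. (69)
    diff = X_hat_all - X_true
    r = X_true.size
    rmse_global = np.sqrt((diff * diff).sum(axis=1).mean() / r)    # global RMSE, Eq. (70)
    return X_mc, sigma_mc, rmse, rmse_global
```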

The simulated errors \(v_{i}^{k}\), \(e_{t,i}^{k}\), \(i = 1, \ldots ,10\) (for each \(k = 1, \ldots ,N\)) affect the theoretical observations \(\overline{y}_{i}\) or the elements \(a_{i,2}\) of the second column of matrix A (the theoretical observations and the theoretical matrix A remain unchanged). The simulations are performed by applying the Gaussian random number generator of the MATLAB system, i.e. \(\sigma_{y} \,{\text{randn}}(n,1)\) or \(\sigma_{e} \,{\text{randn}}(n,1)\), respectively.

First, let us examine the efficacy of Msplit estimates, which apply the models in Eq. (62). Since matrix A is constant, only observation errors are simulated. The computations are carried out for several variants of the observation standard deviation, i.e. \(\sigma_{y} = 0.05,\;\;0.1,\;\;0.2,\;\;0.3\). Table 6 presents the results obtained for \(N = 3000\) and the theoretical parameter values \({\mathbf{X}} = [X_{\alpha } ,\;b_{\alpha } ,\;X_{\beta } ,\;b_{\beta } ]^{T} =\) \([6.0,\;0.5,\;3.0,\;1.0]^{T}\). Figure 4 shows the Msplit estimates obtained in each simulation and the MC estimates (for \(\sigma_{y} = 0.1\)).

Table 6 Msplit estimates and their accuracy and efficacy (for sets without random disturbances in coefficients in the functional models)
Fig. 4

Msplit estimates obtained in each simulation \(k = 1, \ldots ,3000\) (for \(\sigma_{y} = 0.1\) and the theoretical values of parameters \(X_{\alpha } = 6.0\), \(b_{\alpha } = 0.5\), \(X_{\beta } = 3.0\), \(b_{\beta } = 1.0\))

The accuracy and efficacy of the estimates \(\hat{b}_{\alpha }\) and \(\hat{b}_{\beta }\) are the most satisfactory. The values of \(\hat{\sigma }_{{\hat{b}_{\alpha } }}^{MC}\), \(\hat{\sigma }_{{\hat{b}_{\beta } }}^{MC}\) and \({\text{RMSE}}_{{\hat{b}_{\alpha } }}\), \({\text{RMSE}}_{{\hat{b}_{\beta } }}\) are relatively small for all values of the observation standard deviation. The values obtained for the estimates \(\hat{X}_{\alpha }\), \(\hat{X}_{\beta }\) are higher; however, they are still acceptable.

Let us now use the models (63) to examine how random disturbances of matrix A might influence the accuracy and efficacy of the estimates. The observation errors are simulated assuming the constant standard deviation \(\sigma_{y} = 0.1\), whereas the errors \(e_{t,i}\) are simulated in several variants with \(\sigma_{e} = 0,\;\;0.05,\;\;0.1,\;\;0.2,\;\;0.3\). Table 7 presents the results for \(N = 3000\).

Table 7 Msplit estimates and their accuracy and efficacy (for sets with random disturbances in coefficients in the functional models)

Both the accuracy and the efficacy of Msplit estimates decrease when the coefficient matrix is disturbed by random errors. That effect can be reduced by using Total Msplit estimation, for which the measures in question are smaller. As in the previous case, the TMsplit estimates of the parameters \(b_{\alpha }\) and \(b_{\beta }\) have the most satisfactory accuracy and efficacy. The efficacy of Total Msplit estimation is confirmed by the parameter estimates obtained in each simulation. Example estimates and the MC estimates are presented in Fig. 5 (for \(\sigma_{e} = 0.2\)).

Fig. 5

TMsplit estimates obtained in each simulation \(k = 1, \ldots ,3000\) (for \(\sigma_{y} = 0.1\), \(\sigma_{e} = 0.2\) and the theoretical values of parameters \(X_{\alpha } = 6.0\), \(b_{\alpha } = 0.5\), \(X_{\beta } = 3.0\), \(b_{\beta } = 1.0\))

4.2 Example 2: linear regression

Schaffrin and Wieser (2008) as well as Shen et al. (2011) and Mahboub (2012) applied WTLS for the estimation of the intercept \(\xi_{1}\) and slope \(\xi_{2}\) of the regression line

$$ y_{i} = \xi_{1} + (x_{i} - e_{i} )\xi_{2} - \upsilon_{i} $$
(71)

Let it now be assumed that the set of observations contains not only observations concerning model (71) but also observations for which the regression line differs in the parameters \(\xi_{1}\) and \(\xi_{2}\) (in Total Msplit estimation, these will be the parameters \(\xi_{1\beta }\) and \(\xi_{2\beta }\)). These observations will hereinafter be referred to conventionally as outliers. In contrast to the classical approach, their outlying character is of a different nature and is not necessarily related to the existence of gross errors. If the assignment of observation \(y_{i}\) to its respective regression line is not known, then this observation may correspond both to the following model

$$ y_{i} = \xi_{1} + (x_{i} - e_{i} )\xi_{2} - \upsilon_{i} = \xi_{1\alpha } + (x_{i} - e_{i} )\xi_{2\alpha } - \upsilon_{i\alpha } $$
(72)

and to the model that is competing in relation to it, namely:

$$ y_{i} = \xi_{1\beta } + (x_{i} - e_{i} )\xi_{2\beta } - \upsilon_{i\beta } $$
(73)

Total Msplit estimation will be applied to estimate the parameters in models (72) and (73). The calculations will be carried out using the data provided in Neri et al. (1989) and also used in Schaffrin and Wieser (2008), Shen et al. (2011) and Mahboub (2012). These data will be supplemented with two variants of bias. In the first variant, regression line (73) with theoretical parameters \(\xi_{1\beta } = 2.0\), \(\xi_{2\beta } = 0.75\) is adopted, while in the second variant, \(\xi_{1\beta } = 4.5\), \(\xi_{2\beta } = - 0.70\) is adopted. For the outliers, the weights \(W_{{x_{i} }}\) were determined based on the information on the weights of the coordinates \(x_{i}\) concerning the group of original observations (where necessary, also using interpolation). Moreover, the outliers are assigned equal weights \(W_{{y_{i} }} = 50\), which corresponds to the standard deviation \(\sigma_{y} = 0.14\). The values of these weights are very important to the success of Total Msplit estimation: too high an accuracy (high weights) of the added observations may cause the original observations to be “ignored”, while too low an accuracy (low weights) may cause the added observations to be “ignored”. In the presented example, satisfactory results that did not differ much from each other were obtained for \(40 < W_{{y_{i} }} < 80\). The original observations (\(i = 1, \ldots ,10\)) (Neri et al. 1989), the outliers (\(i = 11,12,13\)), and the corresponding weights \(W_{{x_{i} }}\) and \(W_{{y_{i} }}\) are listed in Table 8. The results of model (71) parameter estimation using WTLS for the original observations, transcribed from Shen et al. (2011), are provided in columns 2 and 3 of Table 9 [the estimate \(\hat{\sigma }_{0}\) is computed by applying Eq. (19)]. Based on the original observations and using models (72) and (73), the TMsplit estimators of the parameters occurring therein were calculated (column 4, Table 9). A graphical interpretation of the original set of observations and the location of the regression lines (determined based on the WTLS and TMsplit estimators) within this set are shown in Fig. 6. The WTLS and TMsplit estimators determined for the sets extended to include outliers are provided in the remaining columns of Table 9; these sets and the corresponding regression lines are shown in Fig. 7.
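The way the listed weights enter the computation can be sketched as follows, assuming, for illustration only, that the cofactor matrices are obtained as the reciprocals of the given diagonal weights and that only the x-column of the design matrix \([1,\;x_i]\) is treated as random (analogously to Eq. (66)).

```python
import numpy as np

def regression_cofactors(W_x, W_y):
    """Sketch: cofactor matrices for the EIV regression models (72)-(73).

    Assumes Q = W^{-1} element-wise for the given diagonal weights; this is an
    illustrative convention, not necessarily the exact one used by the authors.
    """
    W_x = np.asarray(W_x, dtype=float)
    W_y = np.asarray(W_y, dtype=float)
    Q_y = np.diag(1.0 / W_y)                 # cofactor matrix of the observations
    Q_ex = np.diag(1.0 / W_x)                # cofactor matrix of the x-coordinate errors
    Q0 = np.array([[0.0, 0.0],
                   [0.0, 1.0]])              # only the x-column of A = [1, x_i] is random
    Q_e = np.kron(Q0, Q_ex)                  # structure analogous to Eq. (66)
    return Q_y, Q_e
```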

Table 8 Observed data and corresponding weights
Table 9 WTLS and Total Msplit estimation results for the sets containing biases (additional regression line) (for TMsplit: \(\xi_{1} : = \xi_{1\alpha }\), \(\xi_{2} : = \xi_{2\alpha }\))
Fig. 6

The set of original observations (Shen et al. 2011) and the positions of the regression lines determined based on the WTLS and TMsplit estimators

Fig. 7

Sets containing outliers (green) and the positions of the regression lines determined based on the WTLS and TMsplit estimators

The TMsplit estimators determined for the original set of observations may appear not wholly satisfactory (column 4, Table 9). Due to the lack of outliers, Total Msplit estimation, which predicts the existence of two mutually competing regression lines, forces two regression lines to fit the set of observations (see Fig. 6). It should be noted that the regression line established by the WTLS estimators lies between these lines. The average values of the TMsplit estimators, \(\hat{\xi }_{{1{\text{M}}_{{{\text{split}}}} }} = (\hat{\xi }_{1\alpha } + \hat{\xi }_{1\beta } )/2 = 5.4069\) and \(\hat{\xi }_{{{\text{2M}}_{{{\text{split}}}} }} = (\hat{\xi }_{2\alpha } + \hat{\xi }_{2\beta } )/2 = - 0.4612\), as compared to the WTLS estimators \(\hat{\xi }_{1} = 5.4799\) and \(\hat{\xi }_{2} = - 0.4805\), can already be considered satisfactory. For both sets containing outliers, the TMsplit estimators \(\hat{\xi }_{1\alpha }\) and \(\hat{\xi }_{2\alpha }\) are close to the corresponding WTLS estimators obtained for the original set of observations. On the other hand, the estimators \(\hat{\xi }_{1\beta }\) and \(\hat{\xi }_{2\beta }\) are close to the true values of the parameters \(\xi_{1\beta } = 2.0\), \(\xi_{2\beta } = 0.75\) (Variant I) and \(\xi_{1\beta } = 4.5\), \(\xi_{2\beta } = - 0.70\) (Variant II). In such cases, the WTLS estimators did not yield good results; this is particularly true for Variant II, for which the relevant comparisons are particularly unfavourable. The results obtained using WTLS are not surprising, as the lack of robustness to outlying observations is an inherent feature of WTLS estimators.

The example presented above concerned a situation where the outliers can be assigned a regression line that is appropriate for them. In practice, however, the outlying character of observations may relate to single observations and result from, for example, the effect of gross errors. In order to check the response of TMsplit and WTLS estimators to such errors, one of the observations from the original set will be affected by a gross error with several different values. For example, let it be assumed that such an observation is \(y_{5} = 3.5\) (\(x_{5} = 3.3\)) with weights \(W_{{x_{5} }} = 200\) and \(W_{{y_{5} }} = 20\) (Table 8). This observation will be affected by a gross error with the values \(g = 1\), \(g = 2\), \(g = 5\), \(g = 10\), respectively. The data adopted for the calculations are provided in Table 10, while the TMsplit and WTLS estimators of the parameters \(\xi_{1}\) and \(\xi_{2}\) are provided in Table 11.

Table 10 Observations and weights corresponding to them (observation \(y_{5}\), highlighted in bold, is affected by gross error g with different values)
Table 11 TMsplit and WTLS estimators determined for the set containing an observation affected by gross error (for TMsplit:\(\xi_{1} : = \xi_{1\alpha }\), \(\xi_{2} : = \xi_{2\alpha }\))

The TMsplit estimators \(\hat{\xi }_{1\alpha }\) and \(\hat{\xi }_{2\alpha }\) are satisfactory for each adopted gross error value, especially in comparison with the WTLS estimators obtained, which are unacceptable even for small gross errors. The quantities \(\hat{\xi }_{1\beta }\) and \(\hat{\xi }_{2\beta }\) compete with \(\hat{\xi }_{1\alpha }\) and \(\hat{\xi }_{2\alpha }\); they are estimators of the parameters of the \(y_{5} = \xi_{1\beta } + \xi_{2\beta } x_{5}\) regression line on which the observation affected by the gross error should lie. By using the equation \(\tilde{y}_{5} = \hat{\xi }_{1\beta } + \hat{\xi }_{2\beta } x_{5}\), it is possible to calculate the prediction of the observation \(y_{5}\) affected by the gross error. The predictions of this observation for the adopted gross error values, as compared to its simulated values, are presented in Table 12. A graphical interpretation of the obtained results is provided in Fig. 8.

Table 12 Prediction of an observation affected by gross error
Fig. 8

Regression lines determined using Total Msplit estimation based on the set containing a single observation affected by gross error \(g\) with different values

4.3 Example 3: two-dimensional affine transformation

The estimators of parameters in the EIV model, which are robust to gross errors, include inter alia robust total least-squares (RTLS) and total least trimmed squares (TLTS) estimators (Wang et al. 2016; Lv and Sui 2020). One of the examples provided in Lv and Sui (2020) concerns a two-dimensional affine transformation carried out based on observations affected by gross errors. In that case, the authors applied the following transformation model:

$$ \left[ {\begin{array}{*{20}c} {u_{t} } \\ {\upsilon_{t} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}l} {u_{s} } \hfill & {\upsilon_{s} } \hfill & 1 \hfill & 0 \hfill & 0 \hfill & 0 \hfill \\ 0 \hfill & 0 \hfill & 0 \hfill & {u_{s} } \hfill & {\upsilon_{s} } \hfill & 1 \hfill \\ \end{array} } \right]\left[ {\begin{array}{*{20}l} {a_{1} } \hfill \\ {b_{1} } \hfill \\ {c_{1} } \hfill \\ {a_{2} } \hfill \\ {b_{2} } \hfill \\ {c_{2} } \hfill \\ \end{array} } \right] $$
(74)

where \(u_{s}\), \(\upsilon_{s}\) and \(u_{t}\), \(\upsilon_{t}\) are the coordinates of the common points in the start and target coordinate systems, while \(a_{1}\), \(b_{1}\), \(c_{1}\), \(a_{2}\),\(b_{2}\), \(c_{2}\) are the parameters being determined. Table 13 presents 15 observation points simulated in the start and target coordinate systems (Lv and Sui 2020). Some of these observations (highlighted in bold) are affected by gross errors. For these data, Lv and Sui (2020) applied TLTS estimation (using two different algorithms yielding the same results) as well as TLTS with RTLS as a starting point (hereafter denoted as TLTS/RTLS). The authors compared the obtained estimators with WTLS and RTLS estimators. These results may now be supplemented with TMsplit estimators (the seventh column of Table 14). TMsplit estimators will also be calculated for the data without gross errors. All the cited and calculated estimators are provided in Table 14.
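A short sketch of how the design matrix of model (74) is stacked for all common points is given below (the function name is hypothetical).

```python
import numpy as np

def affine_design(u_s, v_s):
    """Design matrix of the two-dimensional affine transformation, Eq. (74),
    stacked for all common points (two rows per point: one for u_t, one for v_t)."""
    u_s = np.asarray(u_s, dtype=float)
    v_s = np.asarray(v_s, dtype=float)
    rows = []
    for u, v in zip(u_s, v_s):
        rows.append([u, v, 1.0, 0.0, 0.0, 0.0])   # equation for the target coordinate u_t
        rows.append([0.0, 0.0, 0.0, u, v, 1.0])   # equation for the target coordinate v_t
    return np.array(rows)

# usage sketch: y = affine_design(u_s, v_s) @ np.array([a1, b1, c1, a2, b2, c2])
```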

Table 13 Observed points with outliers in the start and target coordinate systems (Lv and Sui 2020). Boldface numbers indicate outliers
Table 14 A comparison of the estimated parameters from different methods (for TMsplit: \(a_{1} : = a_{1\alpha }\), \(b_{1} : = b_{1\alpha }\), \(c_{1} : = c_{1\alpha }\), \(a_{2} : = a_{2\alpha }\), \(b_{2} : = b_{2\alpha }\), \(c_{2} : = c_{2\alpha }\))

Table 14 shows that the TMsplit(α) estimates are generally the closest to the TLTS estimates. However, in the case of the parameters \(a_{1}\), \(b_{1}\), \(c_{1}\) and \(b_{2}\), the TMsplit(α) estimators are closest to the TLTS/RTLS estimators and thus to the true parameter values. The interpretation of the TMsplit(β) estimators is similar to that in the previous example, i.e. these are estimators of the parameters of the model for the observations affected by gross errors. It is worth noting here that in the absence of gross errors, both versions of the TMsplit estimators differ only slightly from each other.

5 Summary

By using Msplit estimation, it is possible to determine the estimators of mutually competing parameters in classical functional models. However, there are cases in geodetic practice in which classical models need to be replaced with EIV models. The method proposed in this paper, called “Total Msplit estimation”, is a development of Msplit estimation that accepts such models. The Total Msplit estimation objective function was determined by applying the Lagrange approach, as in the case of the WTLS method. The mutually competing EIV models occurring in this function were replaced with their linear approximations. This enabled the construction of a relatively simple yet efficient algorithm for determining TMsplit estimators. The basis of this algorithm is the iterative updating of the EIV models (external iterations) based on the Msplit estimators (internal iterations) obtained in the previous iterative step. The proposed algorithm is efficient in all cases presented in the paper, as regards both the results and the course of the iterative process. Problems with the convergence of the external iterations might occur because of the applied linear approximation of the EIV models; this is especially evident when the errors disturbing matrix A are too large.

The examples presented in the paper showed that the properties of TMsplit estimators are, in general, similar to the properties of Msplit estimators. If the elements of matrix A are not affected by random errors, then the TMsplit and Msplit estimators are equal to each other. The possibility of determining estimators of competing parameters is of particular importance when the sets of observations are a mixture of the realisations of two random variables with mutually competing positional parameters. Such a situation occurs in Examples 1 and 2, where each observation group can be assigned a corresponding regression line. The problem, however, is that it is not known which of these lines is the best for a particular observation. TMsplit estimators, similarly to Msplit estimators for classical models, yield satisfactory results here. Due to their theoretical origin (the neutral LS method), WTLS estimators are not robust to gross errors. Where the sets are realisations of only a single random variable, Total Msplit estimation still offers two mutually competing solutions. These are forced solutions, yet so close to each other that even in such a situation it is possible to evaluate the functional model parameters (e.g. after the calculation of the average values of the respective estimators).

Total Msplit estimation can also be applied for the estimation of EIV model parameters in the case where the outlying of observations results from their being affected by gross errors. From the perspective of the Msplit and TMsplit estimation theory, such a case is not significantly different from that discussed earlier. In the second part of Example 2, it was shown that the determined TMsplit estimators enabled the determination of not only the regression line appropriate for “good” observations (TMsplit(α) solution), but also of the regression line on which the observation affected by gross error lies (TMsplit(β) solution).

In EIV models, it is assumed that matrix A is also observed. Therefore, its elements can also be affected by gross errors arising for various reasons. Such a situation occurs in Example 3, in which certain coordinates, both in the start and target systems of the two-dimensional affine transformation, are affected by gross errors. In this example, TMsplit estimators are close to the robust TLTS estimators and, for certain transformation parameters, they are also close to the results of TLTS estimation with RTLS estimation as a starting step.

The accuracy of the estimates is an important issue (especially when comparing estimation methods). The accuracy of Msplit estimation can be determined by applying asymptotic covariance matrices. Applying such an approach to Total Msplit estimation would require developing this theory further, which is beyond the scope of the present paper. Section 4.1 presents assessments of the accuracy of the Msplit and Total Msplit estimates (empirical standard deviations) obtained from the Monte Carlo simulations. Generally, the accuracy of Msplit estimates decreases with the growing standard deviation of the errors disturbing matrix A; in such a context, Total Msplit estimates have smaller standard deviations than the respective Msplit estimates. Similar relations concern the measures of efficacy, namely the RMSE values. Thus, when the competing functional models are supplemented with EIV models, Msplit estimation should be replaced by Total Msplit estimation. This is especially advisable when the disturbances of matrix A have large standard deviations (here, the application of Total Msplit estimation is justified for \(\sigma_{e} = \sigma_{y}\)).