1 Introduction

Breeders of forest trees must often consider conservation of genetic diversity, while at the same time maximizing response to selection. The selection of new breeding parents must make rapid progress in terms of genetic gain, while maintaining genetic diversity for future genetic improvement. Buyers of planting stock produced by selected parental genotypes managed in seed orchards want maximum performance, while satisfying a restriction, sometimes legislated, on the diversity deployed to the forest.

The use of mathematical optimization approaches has been applied increasingly by breeders to the optimum selection problem [1, 15, 22, 25]. Meuwissen [15] expressed this selection of optimal contributions as an optimization problem of the form:

$$\begin{aligned} \begin{array}{lcl} \max &{}:&{} \varvec{g}^T \varvec{x}\\ \hbox {subject to} &{}:&{} \varvec{e}^T \varvec{x}= 1 \\ &{} &{} \varvec{l}\le \varvec{x}\le \varvec{u}\\ &{} &{} \varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta . \end{array} \end{aligned}$$

Through this paper, we use m to denote the number of the candidate members. We use the vector \(\varvec{e}\in \mathbb {R}^m\) to denote a vector of ones, and the superscript T to denote the transpose of a vector or a matrix. The variable vector \(\varvec{x}\in \mathbb {R}^m\) corresponds to the contributions (or the proportions) of the candidates, hence the first constraint \(\varvec{e}^T \varvec{x}= 1\) requires that the total contribution of candidate members be unity. In the second constraint, \(\varvec{l}\in \mathbb {R}^m\) and \(\varvec{u}\in \mathbb {R}^m\) are element-wise lower and upper bounds.

The estimated breeding values (EBVs) [13] are often used as the coefficient vector \(\varvec{g}\in \mathbb {R}^m\) in the objective function, since the maximization of the weighted EBVs is expected to produce the best genetic performance. From the viewpoint of numerical optimization, we can assume that \(\varvec{g}\) is given.

The most interesting property in (1) is the last constraint \(\varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta \). This constraint was originally introduced to consider relatedness or coancestry among candidates due to common ancestors in their pedigree. These effects accumulate over cycles of breeding as the pedigree becomes more complex. Cockerham [4] extended the definition of coancestry coefficients to describe group coancestry. The group coancestry of the contributions \(\varvec{x}\) is calculated with the formula \(\frac{\varvec{x}^T \varvec{A}\varvec{x}}{2}\), where \(\varvec{A}\in \mathbb {R}^{m \times m}\) is the numerator relationship matrix of Wright [29] (we will review a formula for the numerator relationship matrix in Sect. 2). The constraint \(\varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta \) is therefore employed to keep group coancestry \(\frac{\varvec{x}^T \varvec{A}\varvec{x}}{2}\) under an appropriate level \(\theta \in \mathbb {R}\).

To solve the optimal selection problem (1) efficiently, Meuwissen [15] developed an iterative method based on Lagrangian multipliers and his method has been widely used in breeding, for example, [7, 10, 28]. It is a characteristic of this method that some variables \(x_i\) may be fixed to their lower or upper bounds (\(l_i\) or \(u_i\)) forcibly, and it was further demonstrated by [22] that the output solution of Meuwissen’s method is not always truly optimal.

Pong-Wong and Woolliams [22] utilized semidefinite programming (SDP) to derive a formula equivalent to (1). An SDP is a convex optimization problem that maximizes a linear objective function over the constraints described as linear matrix inequalities. Many studies in the 1990s extended interior-point methods from linear programming problems to SDP problems (e.g., [8, 11]). Based on primal-dual interior-point methods, several software solver packages have been developed, such as SDPA [30], SDPARA [31], SDPA-C [32], SDPT3 [27], and SeDuMi [26]. Utilizing the SDP formulation, Pong-Wong and Woolliams [22] obtained the exact optimal value of (1).

An important obstacle encountered in the SDP approach is the heavy computation cost. The number of candidate members discussed in [22] was thus limited, \(m \le 10\). Ahlinder et al. reported in [1] that the SDP approach required 5 h of computation time for a problem of the size \(m = 12{,}000\). This computation time is rather long for operational application and may require significant truncation of the candidate list prior to solving the SDP.

The objective of this paper is to propose an efficient approach based on second-order cone programming (SOCP). SOCP is a convex optimization that maximizes a linear objective function over linear constraints and second-order cone constraints. Lobo et al. [12] discussed a wide range of SOCP applications, for example, filter design and truss design, and Sasakawa and Tsuchiya [24] applied SOCP to magnetic shield design. Alizadeh and Goldfarb [2] surveyed theoretical and algorithmic aspects of SOCP. The software packages SDPT3 [27] and SeDuMi [26] can solve not only SDP but also SOCP using primal-dual interior-point methods. In addition, ECOS [5] was implemented recently to solve SOCP problems.

We first show that an SOCP formulation can attain the exact optimal solution of (1), in the same way as SDP. From the fact that SOCP is a special case of SDP, we might expect that a simple SOCP formulation would be sufficient to reduce the computation time, but preliminary numerical tests revealed that the simple SOCP formulation was no better than the SDP approach.

We thus focus on exploiting the sparsity embedded in the numerator relationship matrix \(\varvec{A}\) and establish a sparse SOCP formulation. Furthermore, we integrate Henderson’s algorithm [9] to resolve a bottleneck in the sparse formulation and propose a compact SOCP formulation. This compact formulation allows us to remove dense matrices from the computation steps of the SOCP solution.

Numerical tests with data from a sample of Scots pine (Pinus sylvestris L.) selection candidates showed that the compact SOCP formulation greatly reduced computation time. For the case of \(m=10{,}100\), we attained a computation speed improvement of 20,000-times compared to the SDP approach. In addition, the compact SOCP formulation is more efficient from the viewpoint of memory utilization.

The remainder of this paper is organized as follows: Sect. 2 describes the SDP approach of Pong-Wong and Woolliams and discusses a simple SOCP formulation; in Sect. 3, we propose SOCP formulations and derive an efficient method; Sect. 4 presents numerical results to verify the reduction in computation time for problems of various sizes; and finally, Sect. 5 gives conclusions and discusses future directions.

2 SDP formulation and simple SOCP formulation

Since a principal characteristic of the optimal selection problem (1) is determined by the numerator relationship matrix \(\varvec{A}\), we first review an algorithm to evaluate \(\varvec{A}\). We then describe the SDP formulation of Pong-Wong and Woolliams [22], and compare the performance of the SDP formulation and a simple SOCP formulation.

To evaluate \(\varvec{A}\), we separate the set of pedigree candidate members \(\mathcal {P}:=\{1, 2, \ldots , m\}\) into the three disjoint groups:

$$\begin{aligned} \mathcal {P}= \mathcal {P}_0 \cup \mathcal {P}_1 \cup \mathcal {P}_2, \end{aligned}$$


$$\begin{aligned} \left\{ \begin{array}{rcl} \mathcal {P}_0 &{}=&{} \{ i \in \mathcal {P}: \hbox {both parents} \ p(i) \ \hbox {and} \ q(i) \ \hbox {are unknown} \} \\ \mathcal {P}_1 &{}=&{} \{ i \in \mathcal {P}: \hbox {one parent} \ p(i) \ \hbox {is known and the other parent} \ q(i) \ \hbox {is unknown} \} \\ \mathcal {P}_2 &{}=&{} \{ i \in \mathcal {P}: \hbox {both parents} \ p(i) \ \hbox {and} \ q(i) \ \hbox {are known} \} . \end{array} \right. \end{aligned}$$

Figure 1 gives an example of a pedigree having \(m=9\) members and illustrates its genealogical chart. In this example, \(\mathcal {P}_0 = \{1, 2\}, \mathcal {P}_1 = \{5\}, \mathcal {P}_2 = \{3, 4, 6, 7, 8, 9\}\), and the parents of the 8th member are \(p(8) = 7\) and \(q(8) = 6\). If parent p(i) or q(i) is unknown, we assign \(p(i) = 0\) or \(q(i) = 0\), respectively, and we can assume \(i > p(i) \ge q(i)\) for all \(i \in \mathcal {P}\) without loss of generality.

Fig. 1
figure 1

An example of a coded pedigree and its diagram

The numerator relationship matrix \(\varvec{A}\) was defined by Wright [29], and its simplified formula was presented in [9]. The formula in [9] gives the elements \(A_{11}, \ldots , A_{nn}\) in a recursive style as follows:

$$\begin{aligned} \left\{ \begin{array}{ll} A_{ij} = A_{ji} = \frac{A_{j,p(i)} + A_{j,q(i)}}{2} &{} \quad \hbox {for} \;\quad i = 1, \ldots , m, \ j = 1,\ldots ,i-1 \\ A_{ii} = 1 + \frac{A_{p(i),q(i)}}{2} &{} \quad \hbox {for} \;\quad i = 1, \ldots , m, \end{array}\right. \end{aligned}$$

where \(A_{pq} = 0\) if \(p = 0\) or \(q = 0\). When we apply this recursion to the example of Fig. 1, we obtain the corresponding matrix \(\varvec{A}\) as follows:

$$\begin{aligned} \varvec{A}= \frac{1}{32} \left( \begin{array}{r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r} 32 &{} 0 &{} 16 &{} 16 &{} 0 &{} 16 &{} 16 &{} 16 &{} 8 \\ 0 &{} 32 &{} 16 &{} 16 &{} 16 &{} 16 &{} 8 &{} 12 &{} 12 \\ 16 &{} 16 &{} 32 &{} 16 &{} 8 &{} 24 &{} 12 &{} 18 &{} 10 \\ 16 &{} 16 &{} 16 &{} 32 &{} 8 &{} 24 &{} 12 &{} 18 &{} 10 \\ 0 &{} 16 &{} 8 &{} 8 &{} 32 &{} 8 &{} 16 &{} 12 &{} 24 \\ 16 &{} 16 &{} 24 &{} 24 &{} 8 &{} 40 &{} 12 &{} 26 &{} 10\\ 16 &{} 8 &{} 12 &{} 12 &{} 16 &{} 12 &{} 32 &{} 22 &{} 24 \\ 16 &{} 12 &{} 18 &{} 18 &{} 12 &{} 26 &{} 22 &{} 38 &{} 17 \\ 8 &{} 12 &{} 10 &{} 10 &{} 24 &{} 10 &{} 24 &{} 17 &{} 48 \end{array}\right) . \end{aligned}$$

Pong-Wong and Woolliams [22] devised an SDP formulation to solve the problem (1). They observed that the matrix \(\varvec{A}\) is always positive definite, and they used the Schur complement to replace the group-coancestry constraint \(\varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta \) with an equivalent condition \(\left( \begin{array}{cc} 2 \theta &{} \varvec{x}^T \\ \varvec{x}&{} \varvec{A}^{-1} \end{array}\right) \in \mathbb {S}_+^{1+m}\), where \(\mathbb {S}_+^{1+m}\) is the space of positive semidefinite matrices of dimension \(1+m\). Consequently, they solved the following SDP problem that is equivalent to the optimal selection problem (1):

$$\begin{aligned} \begin{array}{lcl} \max &{}:&{} \varvec{g}^T \varvec{x}\\ \hbox {subject to} &{}:&{} \varvec{e}^T \varvec{x}= 1 \\ &{} &{} \varvec{l}\le \varvec{x}\le \varvec{u}\\ &{} &{} \left( \begin{array}{cc} 2 \theta &{} \varvec{x}^T \\ \varvec{x}&{} \varvec{A}^{-1} \end{array}\right) \in \mathbb {S}_+^{1+m}. \end{array} \end{aligned}$$

Detailed conversions of this problem into a standard SDP format that can be passed to SDP solvers are described in [1, 22]. Version 1 of the software package OPSEL [18] automatically performs the conversion and passes the problem to SDPA [30] for solving.

Table 1 displays the computed optimal values and the computation times for Meuwissen’s implementation (GENCONT) [16] and the SDP formulation (3). We executed numerical tests using Matlab R2015a on a Windows 8.1 PC with Xeon CPU E3-1231 (3.40 GHz, 4 cores) and 8 GB of RAM. We used Windows, since GENCONT can run on only Windows. To solve the SDP (3), we employed SDPA [30] with four-core parallel processing.

Table 1 Optimal values and computation times for GENCONT and SDP formulations (time in seconds)

We observe from Table 1 that the SDP formulation (3) attained a better optimal value than GENCONT. Actually, as shown in [22], the optimal value obtained from (3) was the exact optimal value, while the Lagrangian multiplier method implemented in GENCONT cannot guarantee optimality. Furthermore, for the large problem (\(m = 10{,}100\)), GENCONT gave up on the computation, while the SDP formulation again obtained the exact value. On the other hand, the disadvantage of the SDP formulation is its computation time. Even using parallel processing implemented in SDPA, the SDP formulation was slower than GENCONT when \(m = 2045\). Furthermore, the computation time for the large problem exceeded 10 h.

To reduce the heavy computation time of the SDP formulation, we express the group-coancestry constraint \(\varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta \) as a second-order cone (SOC) constraint. The SOC constraint for \(\varvec{v}\in \mathbb {R}^n\) and \(v_0 \in \mathbb {R}\) is the inequality constraint \(||\varvec{v}|| \le v_0\), where \(||\varvec{v}|| := \sqrt{\varvec{v}^T \varvec{v}} = \sqrt{\sum _{k=1}^n v_k^2}\) is the Euclidean norm of \(\varvec{v}\). Since the matrix \(\varvec{A}\) is positive definite, the Cholesky factorization gives an upper triangular matrix \(\varvec{U}\) such that \(\varvec{A}= \varvec{U}^T \varvec{U}\). Due to the relation \(\varvec{x}^T \varvec{A}\varvec{x}= \varvec{x}^T \varvec{U}^T \varvec{U}\varvec{x}= ||\varvec{U}\varvec{x}||^2\), the group-coancenstry constraint \(\varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta \) can be substituted by an SOC constraint \(|| \varvec{U}\varvec{x}|| \le \sqrt{2 \theta }\). Consequently, we can convert (1) into an SOCP problem, which will be referred to as a simple SOCP formulation or more concisely simple SOCP:

$$\begin{aligned} \begin{array}{lcl} \max &{}:&{} \varvec{g}^T \varvec{x}\\ \hbox {subject to} &{}:&{} \varvec{e}^T \varvec{x}= 1 \\ &{} &{} \varvec{l}\le \varvec{x}\le \varvec{u}\\ &{} &{} || \varvec{U}\varvec{x}|| \le \sqrt{2 \theta } . \end{array} \end{aligned}$$

From the equivalence, it is clear that this simple SOCP formulation also gives the exact optimal value of the optimal-selection problem (1).

It is well known that SOC constraints are a special case of positive semidefinite constraints. We thus expected that the simple SOCP (4) could be solved more quickly than SDP (3). Table 2 reports the results of a preliminary numerical experiment for the SDP and simple SOCP formulations. We used ECOS [5] as the SOCP solver for solving (4). For the small problem (\(m = 2045\)), simple SOCP successfully reduced the computation time from 70.21 to 0.28 s, a speed improvement of 250-times. For the large problem (\(m = 10{,}100\)), however, the speed improvement was only 7-times. When we consider the computational complexity, simple SOCP would be even slower than SDP for still larger problems. Despite that SOCP is a special case of SDP, this result suggests that a simple SOCP is not so promising.

Table 2 Computation time on SDP and simple SOCP formulations (time in seconds)

3 Efficient formulation based on second-order cone programming

To obtain a more efficient formulation based on SOCP, we investigate properties hidden in the simple SOCP (4). In particular, noting that SDP (3) relies on only \(\varvec{A}^{-1}\), we focus on the sparsity of \(\varvec{A}\) and its inverse \(\varvec{A}^{-1}\). Taking the inverse of \(\varvec{A}\) in (2) reveals that \(\varvec{A}^{-1}\) contains many more zero-elements than \(\varvec{A}\);

$$\begin{aligned} \varvec{A}^{-1} = \frac{1}{42} \left( \begin{array}{r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r@{\quad }r} 105 &{} 42 &{} -\,42 &{} -\,42 &{} 21 &{} 0 &{} -\,42 &{} 0 &{} 0 \\ 42 &{} 98 &{} -\,42 &{} -\,42 &{} -\,28 &{} 0 &{} 0 &{} 0 &{} 0 \\ -\,42 &{} -\,42 &{} 105 &{} 21 &{} 0 &{} -\,42 &{} 0 &{} 0 &{} 0 \\ -\,42 &{} -\,42 &{} 21 &{} 105 &{} 0 &{} -\,42 &{} 0 &{} 0 &{} 0 \\ 21 &{} -\,28 &{} 0 &{} 0 &{} 98 &{} 0 &{} -\,21 &{} 0 &{} -\,42 \\ 0 &{} 0 &{} -\,42 &{} -\,42 &{} 0 &{} 108 &{} 24 &{} -\,48 &{} 0 \\ -\,42 &{} 0 &{} 0 &{} 0 &{} -\,21 &{} 24 &{} 129 &{} -\,48 &{} -\,42 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} -\,48 &{} -\,48 &{} 96 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} -\,42 &{} 0 &{} -\,42 &{} 0 &{} 84 \end{array}\right) . \end{aligned}$$

Figure 2 illustrates the positions of the non-zero elements of \(\varvec{A}\) and \(\varvec{A}^{-1}\) for the problem of size \(m=10{,}100\). The dimensions of \(\varvec{A}\) and \(\varvec{A}^{-1}\) correspond to size m.

Fig. 2
figure 2

The positions of non-zero elements of \(\varvec{A}\) (left) and \(\varvec{A}^{-1}\) (right) for the problem of size \(m=10{,}100\)

Figure 2 again indicates that \(\varvec{A}^{-1}\) is much sparser than \(\varvec{A}\). The numbers of non-zero elements in \(\varvec{A}\) and \(\varvec{A}^{-1}\) are 56,754,980 and 56,092, respectively, hence their density against the fully-dense matrix (\(m^2\) non-zero elements) are 55.6 and 0.0549%, respectively. When we apply the Cholesky factorization to \(\varvec{A}\), the upper triangular matrix \(\varvec{U}\) inherits the dense property. The number of non-zero elements in \(\varvec{U}\) is 23,171,296, and its density against the fully-dense upper triangular matrix is still 45.4%.

3.1 Sparse SOCP formulation

We can exploit the property that \(\varvec{A}^{-1}\) is remarkably sparse compared to \(\varvec{A}\) and \(\varvec{U}\). The first key step in our approach is to replace \(\varvec{x}\) with \(\varvec{A}^{-1} \varvec{y}\), where \(\varvec{y}\) is a new variable defined by \(\varvec{y}: = \varvec{A}\varvec{x}\). We can transform \(\varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta \) into another quadratic constraint \(\varvec{y}^T \varvec{A}^{-1} \varvec{y}\le 2 \theta \). This new constraint \(\varvec{y}^T \varvec{A}^{-1} \varvec{y}\le 2 \theta \) is composed of the sparse matrix \(\varvec{A}^{-1}\) and does not involve the dense matrix \(\varvec{A}\). The second step is to utilize the Cholesky factor of \(\varvec{A}^{-1}\). It is given by \(\varvec{U}^{-T}\) (the transposed matrix of \(\varvec{U}^{-1}\)) from the relation \(\varvec{A}^{-1} = (\varvec{U}^{-T})^T \varvec{U}^{-T}\), so \(\varvec{y}^T \varvec{A}^{-1} \varvec{y}\le 2 \theta \) is now transformed into an SOC constraint \(||\varvec{U}^{-T} \varvec{y}|| \le \sqrt{2 \theta }\). Using these two steps, we derive another SOCP problem which will be referred to as a sparse SOCP formulation or more concisely sparse SOCP:

$$\begin{aligned} \begin{array}{lcl} \max &{}:&{} (\varvec{A}^{-1}\varvec{g})^T \varvec{y}\\ \hbox {subject to} &{}:&{} (\varvec{A}^{-1}\varvec{e})^T \varvec{y}= 1 \\ &{} &{} \varvec{l}\le \varvec{A}^{-1}\varvec{y}\le \varvec{u}\\ &{} &{} ||\varvec{U}^{-T} \varvec{y}|| \le \sqrt{2 \theta }. \end{array} \end{aligned}$$

We should emphasize that when \(\varvec{y}^*\) is obtained as the optimal solution of (5), the optimal solution \(\varvec{x}^*\) of the original problem (1) can be computed through the relation \(\varvec{x}^* = \varvec{A}^{-1} \varvec{y}^*\) without depending on the dense matrix \(\varvec{A}\).

In Table 3, we compare the performance of SDP (3), simple SOCP (4), sparse SOCP (5), and compact SOCP (6) formulations. (We will discuss the compact SOCP formulation later in Sect. 3.2). The first column ‘nnz’ is the total number of non-zero elements that appear in a standard format that are acceptable by SDP/SOCP solvers (Technically speaking, we utilized the SeDuMi format [26] to count the non-zero elements). The second column is the computation time to convert a pedigree like Fig. 1 and EBVs (estimated breeding values) into the SDP or SOCP formulations, and the third column is the computation time of the solvers. We applied SDPA with four-core parallel processing to the SDP formulation and ECOS to the SOCP formulations. The fourth (last) column is the total computation time.

Table 3 Performance comparison of the SDP and SOCP formulations (time in seconds)

Table 3 indicates that for the large problem (\(m=10{,}100\)), the computation time of sparse SOCP was much shorter than that of simple SOCP. The sparse SOCP seems to have more complex structure than the simple SOCP, as it contained \(\varvec{A}^{-1}\) three times in the input data. Nevertheless, the total number of non-zero elements was reduced from 23,231,801 in the simple SOCP to 159,570 in the sparse SOCP, due to the change from \(\varvec{U}\) (which contains 23,171,296 non-zero elements) to \(\varvec{U}^{-T}\) (36,607 non-zero elements). This led to a reduction in computation time by the SOCP solver.

3.2 Compact SOCP formulation

To reduce the total time further, we consider the computation time required to prepare the SOCP formulations. In the case of \(m = 10{,}100\) in Table 3, the conversion comprised 94% of the total time. In particular, the construction of dense matrix \(\varvec{A}\) and its inversion are the principal bottlenecks. We evaluate \(\varvec{A}\) to formulate the sparse SOCP, but \(\varvec{A}\) will be no longer required when we solve the resultant SOCP problem with interior-point methods.

It is thus highly desirable to generate directly the sparse matrix \(\varvec{A}^{-1}\) without the dense matrix \(\varvec{A}\), and this leads us to a compact algorithm proposed by Henderson [9] to construct \(\varvec{A}^{-1}\). In Henderson’s original algorithm, the vector of inbreeding coefficients defined by \(\varvec{h}:= diag(\varvec{A}) - \varvec{e}\) appears in the computation process. Here, \(diag(\varvec{A})\) is a vector composed of the diagonal elements of \(\varvec{A}\), so Henderson’s algorithm still depends on \(\varvec{A}\). Quaas [23] devised an efficient short-cut to compute \(\varvec{h}\) without constructing the matrix \(\varvec{A}\) itself, and Masuda et al. [14] utilized this method to implement their YAMS package. The compact algorithm of [9] with the enhancement [23] is shown in Table 4. In the algorithm, the indicator function \(\delta (p)\) stands for \(\delta (0) = 1\) and \(\delta (p) = 0\) for \(p \ne 0\), and \(A_{ij}^{-1}\) denotes the (ij)th element of \(\varvec{A}^{-1}\).

By using the values \(b_i\) computed in Table 4, the mathematical formula for \(\varvec{A}^{-1}\) can be given as follows:

$$\begin{aligned} \varvec{A}^{-1}= & {} \sum _{i \in \mathcal {P}_0} b_i \varvec{e}_i \varvec{e}_i^T + \sum _{i \in \mathcal {P}_1} b_i \left( \varvec{e}_i - \frac{1}{2} \varvec{e}_{p(i)} \right) \left( \varvec{e}_i - \frac{1}{2} \varvec{e}_{p(i)} \right) ^T \\&+ \sum _{i \in \mathcal {P}_2} b_i \left( \varvec{e}_i - \frac{1}{2} \varvec{e}_{p(i)} - \frac{1}{2} \varvec{e}_{q(i)}\right) \left( \varvec{e}_i - \frac{1}{2} \varvec{e}_{p(i)} - \frac{1}{2} \varvec{e}_{q(i)}\right) ^T, \end{aligned}$$

where \(\varvec{e}_i\) is a vector whose components are all zero except the ith component being one.

It is known that \(b_i > 0\) for any \(i = 1, 2, \ldots , m\) from properties of the inbreeding coefficients, therefore, the group-coancestry constraint can be transformed into another SOC constraint:

$$\begin{aligned} \varvec{x}^T \varvec{A}\varvec{x}\le 2 \theta\Leftrightarrow & {} \varvec{y}^T \varvec{A}^{-1} \varvec{y}\le 2 \theta \nonumber \\\Leftrightarrow & {} \sum _{i \in \mathcal {P}_0} b_i y_i^2 + \sum _{i \in \mathcal {P}_1} b_i \left( y_i - \frac{1}{2} y_{p(i)} \right) ^2 \\&\quad +\,\sum _{i \in \mathcal {P}_2} b_i \left( y_i - \frac{1}{2} y_{p(i)} - \frac{1}{2} y_{q(i)}\right) ^2 \le 2 \theta \nonumber \\\Leftrightarrow & {} ||\varvec{B}\varvec{y}|| \le \sqrt{2 \theta } , \end{aligned}$$

where \(\varvec{B}\) is the matrix whose ith row vector is determined by

$$\begin{aligned} B_{i*} = \left\{ \begin{array}{lcl} \sqrt{b_i} \varvec{e}_i^T &{} \quad \hbox {for} &{}\quad i \in \mathcal {P}_0 \\ \sqrt{b_i} \left( \varvec{e}_i - \frac{1}{2} \varvec{e}_{p(i)} \right) ^T &{} \quad \hbox {for} &{}\quad i \in \mathcal {P}_1 \\ \sqrt{b_i} \left( \varvec{e}_i - \frac{1}{2} \varvec{e}_{p(i)} - \frac{1}{2} \varvec{e}_{q(i)} \right) ^T &{} \quad \hbox {for} &{}\quad i \in \mathcal {P}_2. \end{array}\right. \end{aligned}$$

Using the new SOC constraint \(||\varvec{B}\varvec{y}|| \le \sqrt{2 \theta }\), we develop another SOCP which will be referred to as a compact SOCP formulation or concisely compact SOCP:

$$\begin{aligned} \begin{array}{lcl} \max &{}:&{} (\varvec{A}^{-1}\varvec{g})^T \varvec{y}\\ \hbox {subject to} &{}:&{} (\varvec{A}^{-1}\varvec{e})^T \varvec{y}= 1 \\ &{} &{} \varvec{l}\le \varvec{A}^{-1}\varvec{y}\le \varvec{u}\\ &{} &{} ||\varvec{B}\varvec{y}|| \le \sqrt{2 \theta }. \end{array} \end{aligned}$$

The combination of the compact algorithm in Table 4 and the SOC constraint \(||\varvec{B}\varvec{y}|| \le \sqrt{2 \theta }\) gives us a formulation that does not depend on any dense matrices. Indeed, the number of non-zero elements of the matrix \(\varvec{B}\) is just 30,100 for the problem of size \(m = 10{,}100\), much smaller than that of \(\varvec{U}\) (23,171,296) for the simple SOCP and comparable to that of \(\varvec{U}^{-T}\) (36,607) for the sparse SOCP formulation.

Table 4 A compact algorithm to obtain the inverse of the numerator relationship matrix \(\varvec{A}^{-1}\) based on [9]

From Table 3, we observe that the solver time for the compact SOCP was slightly less than for the sparse SOCP. Further reduction in computation time was achieved during the formulation of the SOCP from the pedigree. In the case \(m = 10{,}100\), the conversion was reduced from 24.14 s to 0.37 s.

4 Numerical tests

We conducted numerical evaluations of the SDP and SOCP formulations on several datasets. The datasets are problems of sizes 2045, 5050, 15,100, 15,222, 50,100, 100,100 and 300,100. The datasets with sizes 2045 and 15,222 were derived from actual data for Scots pine and loblolly pine ((Pinus taeda L.), respectively (available at the Dryad Digital Repository http://dx.doi.org/10.5061/dryad.9pn5m). The other data were produced by simulation of five cycles of breeding in a closed population using the approach of [20, 21].

The computation environment for the small problems (i.e., \(m \le 10{,}100\)) was Matlab R2015a on a Windows 8.1 PC with Xeon E3-1231 (3.40 GHz) and 8 GB RAM. For the larger problems (\(m \ge 15{,}100\)), we used Matlab R2014b on a Debian Linux server with Opteron 4386 (3.10 GHz) and 128 GB RAM, since 8 GB was insufficient for the SDP formulation. We utilized SDPA [30] with four-core parallel processing as the SDP solver and ECOS [5] as the SOCP solver. For the compact SOCP (6), we also conducted the numerical experiment on CVX [6] using SDPT3 [27] and MOSEK [17] as the SOCP solvers. The use of CVX makes it much easier to write SOCP problems in the Matlab environment, and SDPT3 is the default SOCP/SDP solver in CVX while MOSEK is one of state-of-the-art commercial solvers for SDP/SOCP.

Although we mainly used Matlab for the comparison, we also implemented compact SOCP with C++. We can directly know the positions of non-zero elements that appear in the matrix \(\varvec{B}\) from the pedigree list, therefore, a specified data structure can accelerate the computation time to build \(\varvec{B}\). If we implement simple or sparse SOCP with C++, we would need to embed certain numerical routines for the Cholesky factorization to compute \(\varvec{U}\) or \(\varvec{U}^{-T}\). In addition, the sparse Cholesky factorization requires a preprocessing by, for example, AMD [3] to derive better performance. One of the advantages in compact SOCP is that we can manage the computation without numerical routines for the Cholesky factorization for obtaining \(\varvec{B}\). Our C++ implementation for compact SOCP internally calls ECOS as the SOCP solver.

Table 5 shows the numerical results for the SDP and SOCP formulations. We can see that compact SOCP (6) solved the problems much faster than other formulations. In particular, for the problem \(m=15{,}222\), the compact SOCP with C++ required only 2.21 s, compared with 22,566 s for the SDP formulation, an improvement in speed by a factor of \(10\,210\).

Table 5 Numerical results on SDP and SOCP formulations (time in seconds)

Another significant advantage of compact SOCP is memory consumption. The SDP formulation consumed 31 GB RAM to solve the problem \(m = 15{,}222\), and it was unable to handle \(m \ge 50{,}100\), despite having 128 GB available. A rough estimate suggests that the memory required for the largest problem \(m = 300{,}100\) would be 12,000 GB. Simple SOCP also suffered from a large memory requirement to store the dense matrix \(\varvec{U}\). The sparse SOCP did not require the dense matrix \(\varvec{A}\) in the resultant SOCP problem, but it still utilized \(\varvec{A}\) for computing \(\varvec{A}^{-1}\), and thus required a long conversion time in the same way as the SDP formulation. In contrast, the compact SOCP required less than 766 MB RAM to solve even the huge problem \(m = 300{,}100\), due mainly to the compact SOCP not having to process any dense matrices.

From the numerical results, we also observe that the C++ implementation of compact SOCP is faster than the implementation in Matlab. The discrepancy between Matlab (99.65 s) and C++ (79.62 s) in the largest problem was due to a specified data structure written in C++. In particular, the structure was useful when we arranged the pedigree information before building the SOCP problems.

As for the performance of CVX using SDPT3 and MOSEK, the latter was much faster, thus we focus on the comparison between CVX using MOSEK and the C++ implementation. Up to the relatively large problems (\(m \le 100{,}100\)), the total time for CVX using MOSEK was not so different from that of the C++ implementation. For the largest problem (\(m = 300{,}100\)), CVX using MOSEK was faster than the C++ implementation, mainly because the commercial solver MOSEK [17] was more efficient than ECOS [5] for large problems in the compact SOCP. ECOS is, however, a free software package and the problem size \(m = 300{,}100\) is huge even for operational use in breeding. Thus, the C++ implementation using ECOS is certainly attractive from a practical standpoint.

5 Conclusions and future directions

We examined the SOCP formulations for the optimal selection problem arising from tree breeding. We employed the transformation \(\varvec{x}= \varvec{A}^{-1} \varvec{y}\) based on the sparsity of \(\varvec{A}^{-1}\) and the efficient method for building \(\varvec{A}^{-1}\) by Henderson’s algorithm with the Quaas enhancement. As a result, the compact SOCP formulation removed the requirement for processing of large dense matrices from the computations. The numerical results demonstrated that the compact SOCP formulation obtained the optimal solution significantly faster than the SDP formulation available previously to breeders.

While the SOCP formulations proposed in this paper may look simple to researchers in the field of mathematical optimization, the reduction in computation time to solve the optimal selection problem will help improve operational application in tree breeding. We expect that this paper will be one of bridges to introduce efficient approaches cultivated in mathematical optimization to breeding. The proposed approach is now available as a part of Version 2 of the software package OPSEL [19], available free at http://www.skogforsk.se/opsel.

In this paper, we discussed optimizing unequal contributions of parental genotypes deployed in mating designs and seed orchards. In some situations it might be preferable to optimize selection of populations for equal contributions deployment. This would require a method to solve a mixed-integer SOCP problem, i.e., an SOCP problem in which some variables are constrained to be integers. The structure of the SOCP formulation developed in this paper will be a basis for an efficient method now under development to consider the mixed-integer SOCP problem for equal-deployment problems.