Analytical enclosure of the set of solutions of the three-species multivariate curve resolution problem

In 1985 Borgen and Kowalski published a geometry-based mathematical approach in order to determine the set of feasible solutions of the multivariate curve resolution problem for chemical systems with three species. Twenty years later Rajkó and István devised an algorithm for the analytical derivation of the feasible regions. They showed that the precise boundary of the solution set is piecewise representable in terms of analytical expressions for the boundary curve. This paper generalizes the approach for finding analytical boundary curves by means of duality arguments, provides the precise functional form of the curves in detail, shows how to determine the contact change values and suggests improved analytical expressions which can numerically be evaluated in a stable way. Additionally, it offers detailed instructions for an algorithmic solution and provides the underlying analysis. A program code is presented which generates a piecewise functional representation of the boundary curve.

The multivariate curve resolution (MCR) problem in chemometrics consists of factorizing a component-wise nonnegative data matrix D ∈ R k×n + of spectral data according to the bilinear Lambert-Beer law as If, for instance, D contains row-wise a sequence of spectra taken from an on-going chemical reaction with s chemical species, then (ideally) the columns of C ∈ R k×s + are the s time concentration profiles of the s chemical species. Further, the columns of S ∈ R n×s + are the associated pure component spectra. Typically, the factorization problem (1) has many solutions, namely component-wise nonnegative matrices C and S of the respective dimensions so that C S T reconstructs D. The chemically true factors are among these solutions. This fact is known under the keyword of rotational ambiguity [1,16,17]. The sets of all feasible nonnegative factors C and S can be represented with respect to a low-dimensional basis of left-and right singular vectors of D by the set of feasible solutions (SFS), see among others [6,9,20,24,27,28]. For systems with three chemical species (with s = 3) the SFS is a bounded twodimensional set. For the basic and important case of three-component systems Borgen and Kowalski [6] published in 1985 a geometric algorithm to construct the SFS. The construction of the SFS is explained in Section 2 and this geometric representation is also known as Borgen plots or Borgen-Rajkó plots in literature. A generalization for noisy, experimental data has been suggested in [12]. Various other geometrical and numerical approaches have already been developed, see the survey [9] and [13,26,28] for some specific techniques. In 2005 Rajkó and István [24] published analytical expressions for the inner boundary curve (IBC) of the SFS also known as the limiting curve and laid the foundations for an analytical solution of the three-component MCR problem. An improvement of this method was provided by Beyramysoltan et al. in 2014 [5]. Most of the above-mentioned publications underlies the important case of chemical systems with three species. We also make this restriction for the present work and assume that rank(D) = rank(C) = rank(S) = 3. This means for a singular value decomposition of the spectral data matrix D that three singular values are non-zero. Then the associated three left singular vectors of D are a basis of the column vectors of C and the associated right singular vectors also form a basis for representing the column vectors of the matrix factor S.

Data sets
The analytical studies in this work are accompanied and illustrated by evaluations for model problems. For the purpose of their comprehensibility the chosen dimensions are relatively small. A nonnegative 9-by-8 model matrix D is introduced in Appendix A. Its small dimensions ensure that the construction steps of the boundary curve elements of SFS can easily be studied. We remark that this example matrix has no background of a spectroscopic data set; however, this fact does not imply any restrictions on the generality of our approach.
Moreover, to illustrate practical chemical applications, we apply our approach to an experimental FT-IR spectroscopic data set of the catalyst formation within the rhodium-catalyzed hydroformylation process, for more details see [14]. This reaction subsystem consists of three main absorbing components-the olefin species, the acyl complex and the hydrido complex. The aldehyde product of this reaction is not seen in the spectral window. The spectral data matrix D is a 850 × 645 matrix, see

Organization of the paper
First, Sect. 2 introduces the geometric background of the nonnegative matrix factorization problem underlying multivariate curve resolution and introduces the SFS. The third section presents a geometric approach to construct the inner boundary curve of the SFS together with closed-form expressions of the coordinates of the boundary curve. So-called contact change values which are the key quantities for an algorithmic implementation of an IBC computation algorithm are discussed in Sect. 4. Finally, Sect. 5 contains a complexity analysis and proves further properties of the IBC. Complicated mathematical formula, two proofs and the pseudocode of the algorithms are placed in the appendices.

Notation
Component-wise nonnegative matrices and vectors are simply called nonnegative and are represented for example by A ∈ R c×d + . The colon notation [10] is used to specify columns or rows of a matrix. For instance, A i,: designates the ith row vector of A and A :, j is the jth column vector of A.

Inner polygon, outer polygon and the SFS
A standard approach to solving the MCR problem is based on singular value decompositions (SVD) [6,20] for dimensionality reduction. Let D = U V T be a truncated singular value decomposition of the k × n matrix D with the rank 3. Then the three nonzero singular values form the diagonal of the diagonal matrix ∈ R 3×3 . The columns of U ∈ R k×3 and V ∈ R n×3 are the associated left and right singular vectors. An SVD is not unique even if the singular values are pairwise different. Then there is still an orientation ambiguity, namely that pairs of left and right singular vectors (U :,i , V :,i ) can be substituted by (−U :,i , −V :,i ) without changing the properties of an SVD. Within the SVD plane these substitutions correspond to an axis flip. To ensure that we use fixed axes orientations for the following analysis, we assume that the SVD satisfies max(V :,i ) ≥ max(−V :,i ), i = 1, 2, 3.
In words, the entry with absolute maximum in each column of V is chosen to be positive. This orientation is used in FACPACK [26]. However it is possible to use different sign conventions depending on the application, for example, the SignFlip Function as suggested by Bro et al. in [7], which aims to provide stable orientations for series of slightly altered data matrices. It has to be noted that the orientation has no impact on the resulting nonnegative factorizations.
Any nonnegative factorization D = C S T can be represented with respect to the bases of left and right singular vectors [21]. This means that a transformationT ∈ R 3×3 exists so that The rows ofT can be scaled according to which further reduces some degree of freedom of the set of solutions of the MCR problem. Each row of the transformation is element-wise multiplied with the reciprocal of its first element. Thus the first column T :,1 is an all-ones vector. This scaling is only permitted under the weak assumption of irreducibility of D T D, see for example Theorem 2.2 in [25], which ensures that the first columnT :,1 contains no zeros. This mathematical property is a consequence of the Perron-Frobenius theory of nonnegative matrices. This row-scaling can be called the first singular vector scaling. Rajkó and István in [24] use a different scaling based on row sums of D. A one-one nonlinear transformation between these two scaling approaches exists showing their comparability [12]. This is closely related to Borgen norms, see [23]. Usually many matrices T exist which result in nonnegative factors C and S. This refers to the so-called factor ambiguity or rotational ambiguity underlying the set of solutions of the MCR problem, see [1] for an overview. For MCR problems with three chemical species, the set of possible nonnegative factors S, which can be called an SFS, can be represented by bounded sets in the two-dimensional real space. The same properties hold for the factor C. A formal definition of the SFS reads as follows.

Definition 1
The SFS (Set of Feasible Solutions) for the factor S is defined as with the three-dimensional all-ones column vector 1. Analogously, the SFS M C for the other (dual) factor C can be determined by a similar construction based on T −1 .
Note that the existence of a regular T in Definition 1 implies the correctness of the equation D = C S T , because The inherent structure of the nonnegative matrix factorization problem imposes various mathematical and geometric restrictions on the SFS. Geometric construction approaches and many numerical approximation algorithms have been suggested to construct or to compute the SFS. See the references [6,12,15,19,24] and many others. A geometrically constructed SFS for noise-free, ideal data goes back to the influential work by Borgen and Kowalski [6]. A generalization to noisy, experimental data has been suggested in [12,29].
Key objects for the construction of the SFS are an outer polygon F, where the letter F refers to the outdated name FIRPOL, and an inner polygon I which stands for INNPOL. These polygons exist for the factor C and also for S. The respective target can be expressed by an index, that is we consider I C and F C for the factor C and I S and F S for the factor S. The precise definitions of these polygons read as follows.

Definition 2
The polygon F is the intersection of affine half-planes and I is defined as where for the factor S the a i and b i are given by and similarly for the factor C The outer polygon F is a superset of M. This means that F C ⊃ M C and F S ⊃ M S . Further, F encloses the inner polygon I, see for example [20]. See Section 2.3 for more details on the geometric construction and in which way the inner polygon serves as an "interior" restriction for the SFS.

Duality principles in MCR
Definition 2 strongly motivates the concept of duality, as F C and I S as well as F S and I C are generated by the same vectors. The duality theory for these pairs of inner and outer polygons has its roots in [11,22] and has also been explored in [28]. The basic relations of duality are briefly presented below.
Definition 3 A nonzero (point) a ∈ R 2 is dual to the affine line

Remark 1
The line construction (4) is closely related to the Hesse normal form of a line. From x T a = −1 we get the equivalent condition in the shape of the Hesse normal form for a nonzero vector a ∈ R 2 This describes an affine straight line oriented orthogonal to the vector a whose smallest distance 1/ a to the origin is attained in the point −a/ a 2 .
Duality induces various relations between the inner and outer polygon and the SFS for the two factors C and S. If we want to refer to the so-called primal factor, then we use the notation F, I, M and add a subscript d for the dual factor, namely F d , I d , M d . In the following examples, S is chosen as the primal factor, unless stated otherwise. Definition 3 enables useful relationships between the two (dual) geometric representations. Most notably, a vertex of I is dual to a line that contains a facet of F d and a line that contains a facet of I is dual to a vertex of F d . Some duality relations are illustrated in Fig. 2. For the mathematical proofs of these relations see [3,28]; a deepened analysis is planned for a forthcoming publication. These relations allow us to continue the theoretical and practical considerations without having to specify which of the two factors is the primal one.
With the help of duality the inner and outer polygon can be determined easily. This as well as the convex hull algorithm from Matlab [18] was used in the improved method of Beyramysoltan et. al. [5]. In our method we assume that the inner and outer polygons have already been calculated.

Geometric construction of the set of feasible solutions
The polygons F and I are the key-geometric objects for the construction of the SFS. The construction approach was introduced by Borgen and Kowalski [6] with their simplex rotation algorithm. According to (2) and (3) a factorization of D corresponds to a triangle with the three vertices T 1,2:3 , T 2,2:3 and T 3,2:3 . Such a factorization D = C S T is feasible, that is it yields nonnegative factors C and S, if and only if the triangle includes the inner polygon I and is enclosed by the outer polygon F; for these fundamental relations see for example Theorem 3.11 in [12]. Each vertex of the triangle represents a possible spectral profile or a possible concentration profile of a pure component. Consequently, the nonnegative matrix factorization problem is equivalent to the geometric problem of determining a nested triangle with the mentioned properties. The SFS is the set of all vertices of all feasible triangles and expresses the factor ambiguity of the nonnegative matrix factorization problem. According to [6] the geometric construction of the set of all factorizations can be achieved by the construction of boundary cases. These boundary cases are explained next.
A Borgen-a, b-triangle is defined as a triangle that lies between the boundaries of the inner polygon I and the outer polygon F, that has a vertices located on the boundary of F, and b of its edges touch I. It is a generalization of the Borgen triangle, as given in [5,24] or the limiting simplex in [6]. Such triangles approximate the SFS and are called Borgen-3, 3-triangles according to our definition.
Furthermore, each feasible matrix T , according to Definition 1, results in a Borgena, b-triangle. If the triangle does not touch neither F nor I, then it is a Borgen-0, 0triangle.
Main focus for the construction of the the SFS is on Borgen-2, 3-triangles T . They consist of two outermost vertices with three innermost edges, that is they have two vertices located on the boundary of F, and all their edges touch the inner polygon I. Without rotation this third vertex is as close as possible to the inner polygon and represents a boundary of the SFS through the limitations imposed by the positions of F and I, and we call it a free vertex. Such innermost boundary points of the SFS are the free vertices. A typical triangle satisfying these conditions is illustrated for the primal factor in Fig. 3. The primal triangle is plotted green with vertices plotted in cyan. Figure 3 also shows the dual lines (in cyan) which form the dual triangle T d . Note that the dual triangle of a Borgen-2, 3-triangle is always a Borgen-3, 2-triangle.
By rotating the Borgen-2, 3-triangle around the inner polygon I a continuous curve of innermost boundary points is constructed. During the rotation process the two outermost vertices are continuously moved to the boundary of the outer polygon and the innermost vertex is always constructed according to the rules given above. The curve is given by the positions of all free vertices which have been constructed during the triangle rotation process. Parts of this curve form the inner boundary of the SFS whereas the outer boundary of the SFS is determined by the boundary of the outer polygon. Hence we call the constructed curve the inner boundary curve (IBC) of the given three-component problem and refer to it as B. If we intersect the IBC with the boundary of the outer polygon F, then the enclosed finite regions which are located in the outer polygon define the SFS. These areas are plotted gray in Fig. 3. In this figure, the IBC is plotted by a black line. Its parts in the outer polygon are of importance for the SFS construction. However, the IBC curve also exists outside the polygon F, see Fig. 3.
In many cases the IBC is a continuous (closed) curve. However, in some situations the IBC can tend to infinity which prevents the existence of a continuous or closed IBC. An example situation is shown in Fig. 4. A singularity of the IBC occurs, if the dual triangle T d does not include the origin for certain rotation angles. The SFS is not affected by such singularities of the IBC since this occurs outside the outer polygon where the feasibility of triangles is not satisfied.

Function representation of the IBC
In 2005 Rajkó and István [24] published analytical expressions for a piecewise representation of the IBC. Here we deepen this analysis, introduce a general approach of presenting the IBC by piecewise analytical expressions in function form on the basis of contact change values, generalize the classical approach by using duality arguments and derive analytical expressions with a focus on their numerical stability. The goal is to present a simple, efficient and stable algorithmic implementation by using duality relations.

Derivation of a closed-form expression for points on the IBC
The Borgen-2, 3-triangles construction only requires the coordinates of the vertices of I and also the edges of F. The location of the free vertex depends on these given parameters, as well as on the triangle rotation. See Fig. 5 for the variable names of the Borgen-2, 3-triangle T and its dual Borgen-3, 2-triangle T d . All points are denoted by lowercase letters with a dot above. The corresponding dual lines are written with a bar above the same lowercase letter. The edges of the polygons are also represented by their corresponding line equations and no further distinction between lines and edges is required. The subscript letters l and r stand for left and right with the view direction from the pointṫ of the dual plot.
Rajkó and István [24] and Beyramysoltan et. al. [5] use coordinate pairs for the pointsė = (x, y) and slope-intercept form for lines We prefer to write line equations in the Hesse form, see (4), This offers a double advantage. The lineē is dual to the pointė; compare this with Definition 3. Consequently, no computation is required for this transformation and no numerical precision is lost. Only lines passing through the origin cannot be of the form (5). However, duality does not allow lines through the origin. Thus (5) is the appropriate notation for our scope of application and increases the numerical stability.
The approach [24] considers a rotation of a triangle as a function of the x-coordinate ofṡ l . Here we choose the angle α between the positive x-axis and a vector towardṡ t in the dual plot as the free variable. This representation has some advantages with respect to the piecewise boundary construction. Thanks to duality the change in α corresponds to the rotation oft around I. Thus, a full rotation of the tangent around the I takes place for α ∈ [−π, π). This is possible because the origin is included in I and F, see [20].
The goal is to construct a closed-form expression for the pointsṗ on the IBC that depends on α and has five additional parameters , s l , s r , o l and o r . These parameters fix the local geometric setup and allow us to determine the associated point on the IBC. Both the point form and the line form can be used for the parameter representation because they are equivalent. The construction begins with the calculation of the poinṫ t = (t 1 , t 2 ) T . For this purpose we consider the geometric situation shown in Fig. 5 where the intersection of the vertexṫ with the line¯ is to be determined. The line equation reads simply holds in the first quadrant, but is also true in all other quadrants. These quadrants refer to a coordinate system fixed at the origin (0, 0). Figure 5 shows (upper plot) a case in whichṫ is located in the second quadrant. Elimination of t 1 and t 2 in Equations (6) and (7) results in , Among other mathematical representations, we prefer (8) due to its numerical stability for α values close to ± π 2 , since forṫ = 0, −1 2 and in the limit α → ± π 2 we get Further construction steps consist of two generic, repeated operations: by solving a 2-by-2 linear system of equations where degenerate cases of identical or non-crossing lines are excluded if g 1 h 2 − g 2 h 1 = 0. 2. Solving the dual problem of determining a lineē = (x, y) T ∈ R 2 : e 1 x + e 2 y = −1} that passes through two different pointsġ = (g 1 , This yields the same linear system of equations as considered above with the solution Numerical stability is a critical issue in the evaluation of the various mathematical terms derived here. In general, we present a set of expressions with the best numerical stability as found by testing various ways of computing the IBC curve. The steps for the construction of the IBC pointṗ can be accomplished via the primal triangle T and/or the dual triangle T d , as the choice of line equation allows a simple transformation from one to another. All the possible construction routes forṗ = ( p 1 , p 2 ) are illustrated in Fig. 6. Route #3 in Fig. 6 only performs computations within the primal plot to determineṗ, and this route is similar to the approach of Rajkó and István [24] as well as Beyramysoltan et. al. [5]. A discussion on the best route for an algorithmic implementation follows below.
In dependence on the angle α and the two-dimensional vector parameters , s l , s r , o l and o r we get the following analytical expressions forṗ = ( p 1 , where complicated formula for the nine coefficients c x0 , c x1 , c x2 , c y0 , c y1 , c y2 , c d0 , c d1 and c d2 are given in Appendix B.

Contact change values
The IBC is the set of all constructable pointsṗ. The true complexity of the Expression Each subinterval α i , α i+1 ) stands for a unique set of parameters used in the construction of the Borgen-2, 3-triangle. The parameters change at each contact change value. Closed-form expressions can be derived forṗ in each interval. All this results in the IBC function for all angle values α ∈ [−π, π).

Definition 5
The IBC function is an interval-wise defined function with changing sets of parameters depending on the interval. The dependence on the particular interval is expressed by an additional superindex in brackets. Thus we consider where p [i] (α) equalsṗ by (11) in the i-th interval α ∈ α i , α i+1 ) with its specific (and in this interval constant) set of parameters.
The repetition of the function p [m c ] (α) in the first and the last subinterval of [−π, π) reflects the fact that α = π is not necessarily a contact change value, but only a point in which the upper and lower semicircles meet.
The approach of Beyramysoltan et. al. [5] suggests to only calculate the IBC function for the intervals which result in the pointsṗ that are located in F, however this requires the search for all Borgen-3, 3-triangles. This method does not provide any significant improvement of the calculation time for our implementation.

Algorithmic approach for constructing the boundary curve
The main complexity of the boundary curve construction consists in finding the contact change values α i . Interval-wise the associated parameter sets are fixed and they change from one to the next interval. This section deals with some overall considerations of an algorithmic solution of this problem, but is not an attempt at a complete generalization of all possible approaches. We explain an approach based on determining tangents of a polygon by using a convex hull construction. Alternative algorithms are possible, for instance those which are based on determining the intersection of an affine line with a general polygon. Here, we prefer to compute the contact change values by means of a convex hull algorithm. This provides the parameters for the construction route #1 shown in Fig. 6. Contact change values can be categorized into three main types and different steps are to be taken to find them all. The pseudo-code for the three types and both approaches can be found in Appendix C.

Contact change values of the first kind˛1: Primal inner edge coincidence
The contact change values of the first kind represent the changing parameter values in the top level of Fig. 6. This is a situation in which an edge of the primal triangle T completely covers an edge of the inner polygon. This situation is dual to the geometric incidence of a vertex of the dual outer polygon with a vertex of the dual triangle T d . Hence, each vertex of the dual outer polygon F d corresponds to a specific contact change value of the first kind. Therefore, we find all contact change values α 1 by calculating the two-argument arctangent values (inverse tangent value) of each vertex of F d . By running through such contact change values of the first kind, the present parameter is replaced by the next, directly following edge of F d . Figure 7 illustrates such a contact change value of the first kind α 1 in the primal and dual representation. Additionally, see Fig. 8 for the construction. We call this situation a primal inner edge coincidence.

Contact change values of the second kind˛2: Primal outer vertex coincidence
Contact change values of the second kind concern a primal outer vertex coincidence, which is explained next. Such contact change values correspond to changing parameter values in the second level (from the top) of Fig. 6, that is,u l oru r changes its value or equivalentlyū l orū r in the dual plot. We calculate two tangents of I for each vertex of F by computing the convex hull of vertices of I together with a single vertex of F. Without additional effort we transform the two tangents to the linet and the dual pointṫ. Then α 2 is the two-argument inverse tangent value of the pointṫ. The change of parameters u l or u r for the IBC function is given by the following edge of F. See Fig. 9 for an illustration and Fig. 10 for the construction.

Contact change values of the third kind˛3: A second primal inner edge coincidence
The contact change values of the third kind represent the changing parameter values in the third level (from the top) of Fig. 6, that is, o l or o r changes its value. So we Then we use the dual pointṡ l of the lines l . The pointṡ l lies on F and it allows us to obtain the left and the right tangents of I by means of the convex hull algorithm. One of the tangents is the lineq l and can be discarded. The other tangent ist. Then it can be transformed to the dual pointṫ and its 2-argument inverse tangent is the desired contact change value α 3 . A right contact change value of the third kind can be obtained by following the same steps with the other tangent of the I d .
The change of either parameter o l or o r for the construction of the IBC function consists of substituting the present edge by the following edge of F. See Fig. 11 for an illustration of the geometric setup and Fig. 12 for the construction.

Computing the IBC function
Once we have determined all contact change values together with their corresponding changing parameters, they are sorted in ascending order. Each contact change value changes only a single pair of parameters whereas the remaining parameters can be adopted from the previous subinterval. This easily allows us to fill the parameter table for all subinterval as introduced in Definition 5. A subsequent evaluation of the analytical expressions (11) yields the IBC as a function of the rotation angle α in each of the subintervals.
The IBC function can be evaluated with any desired precision. The precision is only limited by the computer arithmetic used. Hence the presented approach is a truly Fig. 11 A right contact change value of the third kind α 3r for the model problem. The edgeq r of T covers an edge of I and the vertexq r of T d coincides with a vertex of F d Fig. 12 The calculation of the right contact change value of the third kind α 3r for the geometric situation as shown in Fig. 7 analytical approach. Further, the accuracy of the IBC curve can be adapted to any level of precision for regions of particular interest. However, in most practical situations the double precision (64 bit) arithmetic of a standard personal computer appears to be sufficient for most applications. In a final step the SFS is found as the intersection of the exterior of the IBC curve with the outer polygon F.

Discussion and further result
Next, an application is presented for the three-component consecutive chemical reaction system as introduced in Section 1.1. Further, we provide some properties of the IBC functions and of its algorithmic evaluation. Our analysis also encompasses a complexity analysis of the algorithm and the discussion of some problematic cases.

Application to an experimental problem
The structure of the FT-IR data set is explained in 1.1 and the three concentration profiles and the associated spectra of the pure components are shown in Fig. 1. Here the analysis with the IBC algorithm follows. F S is spanned by 217 points and F Cby 103 points. The outer boundaries required for our algorithm are calculated using the polygon inflation algorithm for noisy data, see [25,26,29] and then the inner boundaries are found with the help of duality. This step is also very similar to the use of duality in [5]. Then the IBC algorithm is applied and Fig. 13 shows the resulting SFS for both factors in gray color. B S contains 743 subintervals and subfunction. B C contains 857 subintervals. The total calculation time for the IBC of both factors typically was less than 0.6 seconds in a Matlab 2019b (single core) implementation running on a standard desktop computer (3.6GHz Intel CPU).

Number of contact change values
including possibly overlapping values. The constants m a and m b in Eq. (12) are introduced in Definition 2 and equal the dimensions k or n of the k × n spectral mixture data matrix D. If the SFS for the spectral factor S is considered, then an upper bound on the number of subfunctions equals 3k + 2n. For the concentration factor C the number of IBC subfunctions is always less or equal to 3n + 2k. For large k and n the analytical approach to determining the IBC may be too expensive. Then a purely numerical SFS approximation, for example by the polygon inflation [26], may seem more appropriate.

Time and computer storage complexity of the IBC algorithm
Computational complexity theory is concerned with the required computer resources for the solution of a problem, mainly the computing time and required computer storage. We discuss these issues by using basic concepts of the complexity theory and algorithms, for example see [8,31]. First, computer storage is not a problematic resource for the IBC algorithm because it scales only linearly with the dimension of the input data. Roughly, for each interval with constant contact change values ten parameters are to be stored in a table. Second, the time complexity class of the IBC calculation is determined by the choice of the algorithm for solving the contact problem, namely the convex hull algorithm or a comparable line-polygon intersection algorithm. Therein, the big O notation, which is also called the Landau notation, serves to describe the asymptotic growth behavior. The proof of Theorem 1 is given in Appendix D. The selection of the convex hull algorithm and of the line-polygon intersection algorithm significantly influence the asymptotic computing time. This was the reasoning for providing the pseudo-code of both versions in Appendix C. In our implementation the convex hull algorithm uses the Quickhull algorithm as implemented in Matlab [18]. For the sake of simplicity we assume that the balance conditions hold and that all points are processed. Then the complexity class of the Quickhull algorithm equals O(m log(m)) for m points, see [4]. This results in the time complexity class O(m 2 log(m)). m can be estimated to be less or equal than the maximal dimension of the matrix D.

Intersection of the outer polygon with the IBC
As explained in Sect. 4.4, the SFS is the intersection of the outer polygon F and the exterior of the IBC, which is the unbounded set if the IBC is considered to split the plane R 2 into two sets (namely a set around the origin which is enclosed by the IBC and its unbounded complement). The SFS contains the full information on the factor ambiguity of the nonnegative matrix factorization problem. If the IBC is determined, then the SFS can easily be computed numerically. However, the IBC also devises an analytical route for a precise (analytical) representation of the SFS. This approach is explained next.
Each edge of the outer polygon F is given by an affine line of the form e 1 x + e 2 y = −1 with proper parameters e 1 and e 2 . Furthermore, let us substitute z = tan(α) in Eq. (11). Then the precise coordinates of a point of intersection of an edge of F with the IBC are given by the solution of the equation if the considered subfunction of the IBC and the considered edge of F intersect. The solution can be derived by solving the quadratic equation The result can be interpreted as Borgen-3, 3-triangles from [24]. We note that the advantages of this approach are limited to certain problematic cases. In general, the numerical approach is much faster and some minor loss of precision is not meaningful.

A conic section approach to the IBC
The IBC representation (11) has a parametric form. It is also possible to represent it in the form of an implicit equation. This allows us to prove an interesting geometric property, namely that the IBC is composed of conic sections, see Fig. 14.

Theorem 2 Each of the IBC subfunctions is a segment of a conic section (namely an ellipse or circle, parabola or hyperbola or also a degenerate conic section as a line segment or a point).
The proof is given in Appendix E. This theorem has some intriguing implications, as it places the IBC construction in a classical branch of mathematics. Possibly, this can provide more insight into the problem and to prove additional mathematical properties. If the conic section is non-degenerate (not just a line or single point), then the coefficients o l and o r lie on the conic section defined by the IBC subfunction. Further, the intersection of the lines that are given by the coefficients u l and u r (if it exists), also lies on the conic section. This can be proved by inserting them in Equation (14).

Poorly conditioned cases of a numerical implementation of the IBC algorithm
The IBC algorithm is by its nature a precise analytical approach. However, the numerical and graphical representations require that the IBC is evaluated numerically. Some geometrical situations can result in poorly conditioned numerical computations. This is more or less inevitable for the suggested and other algorithms working with triangle constructions and rotations. Most of these cases occur if a vertex of I is located very close to F or if a vertex of I coincides with a vertex of F. To discuss this point, we recall that the triangle construction for finding the IBC parameters consists of finding tangents of the inner polygon through a given point of the outer polygon. If a vertex of I is located close to F, then this tangent can be defined by two points that are close to each other. Therefore, a small numerical error in one of those points can result in a large error for the tangent and for the IBC coordinates. In numerical mathematics such a problem is called ill-conditioned and so is the construction of certain triangles if vertices of I are close to those of F. In particular, if single-point or line-shaped subsets of SFS are expected, then numerical approximation methods, such as polygon inflation [26] should be preferred.
The limit case of coinciding vertices of I and F can occur if the spectral data matrix D contains multiple zero entries. Then Borgen-2, 3-triangles are not necessarily unique as illustrated in Fig. 15. We assume that this case corresponds to a degenerated cone intersection with line shape. The suggested algorithmic implementation of the IBC construction cannot deal with this special situation and constructs only a single point. The algorithmic extension would be a linear interpolation in order to close gaps in the IBC curve which originate from this special geometric constellation. Further, we must expect a reduced numerical accuracy of a possible numerical implementation of the IBC algorithm in the case of coinciding vertices of I and F. This problem cannot be avoided easily.

Conclusion
The analytical boundary curve construction is not only an absolutely precise method in order to determine the SFS for a nonnegative matrix factorization problem with three chemical species, but can also be practically implemented by the suggested algorithm. Closed-form expressions for the coordinates of the IBC have been derived which have revealed that the IBC piecewise consists of conic sections. This result joins the chemometric factor ambiguity problem with the classical field of mathematical geometry.
The IBC algorithm can support theoretical and analytical investigations with a focus on special and limit cases. For practical applications on high-dimensional experimental data the IBC algorithm is still applicable, but a numerical approximation approach might prove to be more robust. Nevertheless the IBC algorithm is promising in both the computing time and the precision of the results. The Matlab Code of our implementation can be found on the FACPACK homepage [30]. This program code computes the list of sorted contact change values for a given (chemical data) matrix and provides a Matlab function which enables the user to evaluate the boundary of the SFS with a high precision.
Funding Open Access funding enabled and organized by Projekt DEAL.

Conflicts of interest
The authors declare that they have no conflict of interest or other ethical conflicts concerning this paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendices Appendix A: Model matrix
We consider the following 9×8 nonnegative model matrix with the rank 3. This matrix has many nonnegative factorizations with factors also of the rank 3.
Appendix B: Coefficients of the IBC for Eq. (11) The following eight coefficients underlying the IBC coordinates as given in (11)  The following five pseudocode elements serve to determine the contact change values of the first up to the third kind. For explanations and details see Sect. 4. An implementation in Matlab using the convex hull approach can be found on the FACPACK homepage [30]. Find the left tangent t r and the right tangent t l of I from s lr with a convex hull algorithm. (The orientation is towards the origin.) 4: Set α 2l i = atan2(t l ).

5:
Save the edge of F that follows s lr counter-clockwise as (u l ) i . 6: Set α 2r i = atan2(t r ). 7: Save the edge of F that follows s lr counter-clockwise as (u r ) i . 8: Set i = i + 1. Find the left intersection t r and the right intersection t l of s lr with F d using a line-polygon intersection algorithm.

5:
Save the right vertex of the chosen edge of I d as (u l ) i . 6: Set α 2R i = atan2(t R ).

7:
Save the right vertex of the chosen edge of I d as (u r ) i . 8: Set i = i + 1. Find the left tangent s r and the right tangent s l from q lr to I d with a convex hull algorithm. 4: Find the left tangent q lr and the right tangent t l from s l to I with a convex hull algorithm. 5: Set α 3l i = atan2(t l ). 6: Save the edge of F d that follows q lr counter-clockwise as (o l ) i . 7: Find the left tangent t r and the right tangent q lr from s r to I with a convex hull algorithm. 8: Set α 3r i = atan2(t r ). 9: Save the edge of F d that follows q lr counter-clockwise as (o r ) i . 10: Set i = i + 1. Find the left intersection s r and the right intersection s l of q lr with F using a line-polygon intersection algorithm 4: Find the right intersection t l of s l with F d using a line-polygon intersection algorithm (the left intersection is q lr ) 5: Set α 3l i = atan2(t l ) 6: Save the right vertex of the chosen edge of I as (o l ) i 7: Find the left intersection t r of s 2 with F d using a line-polygon intersection algorithm (the right intersection is q lr ) 8: Set α 3r i = atan2(t r ) 9: Save the right vertex of the chosen edge of I as (o r ) i 10: Set i = i + 1 11: end for

Appendix D: Proof of Theorem 1
Proof Each step of the construction of the IBC is considered concerning its computing time complexity, starting with the computation of the contact change values.
To compute the contact change values of the first kind, m I iterations with a constant number of operations are performed. Therefore, this step has the time complexity class O(m I ).
To determine the contact change values of the second kind, m F iterations are performed. Each of the iterations consists of either a convex hull algorithm over m I + 1 vectors or a line-polygon intersection algorithm over m I +1 lines and a constant number of further operations. Thus, this step has the time complexity class O(m F alg(m I )).
To compute the contact change values of the third kind, m I iterations are performed. Each of the iterations consists of either a convex hull algorithm over m F +1 vectors and two more repetitions of a convex hull algorithm over m I + 1 vectors or a line-polygon intersection algorithm over m F + 1 lines and two more repetitions of a line-polygon intersection algorithm over m I + 1 lines. There is also a constant number of further operations. Therefore, O(m I (alg(m I ) + alg(m F ))) is the time complexity class of this step.
Further, the sorting of the 3m I + 2m F contact change values is required. A good sorting algorithm has the time complexity of O(n log(n)), see [8]. Thus, this step has the time complexity class O((m I + m F ) log(m I + m F )).
Finally, the resulting functions have to be calculated for each of the 3m I + 2m F subintervals. This requires a constant number of operations for each iteration. Therefore, this step has the time complexity class O(3m I + 2m F ).
The resulting time complexity of the IBC algorithm is given by where all coefficients are listed in Appendix B. These relations can be verified by observing that the parametric form (x, y) = (x(z), y(z)) fulfills Equation (14) with z = tan(α).