Seeking consistency with paired comparisons: a systems approach

It is well known that decision methods based on pairwise rankings can suffer from a wide range of difficulties. These problems are addressed here by treating the methods as systems, where each pair is looked upon as a subsystem with an assigned task. In this manner, the source of several difficulties (including Arrow’s Theorem) is equated with the standard concern that the “whole need not be the sum of its parts.” These problems arise because the objectives assigned to subsystems need not be compatible with that of the system. Knowing what causes the difficulties leads to resolutions.


Difficulties with paired comparisons
For reasons that include cost and convenience, paired comparisons are widely used to make decisions even though examples exist that cast doubt on the trustworthiness of certain approaches. As these techniques continue to be used, a useful goal (developed here) is to determine how they can be modified to yield more reliable outcomes.
The basic idea mimics the least-squares methodology by projecting information (e.g., data, results about pairs, etc.) into a space where consistent outcomes are assured. Unfortunately, an appropriate "consistency space" is not known even for the widely required condition of transitivity. This motivates finding a natural "transitivity consistency" space. Even after identifying a desired space of outcomes, the appropriate projection need not be obvious. This tends to be true with nonlinear structures. The expectation is that

$$\text{Data} = \text{Desired outcome} + \text{Error} \qquad (1)$$

where, if the summation is an orthogonal vector addition, standard projections apply. But if Eq. 1 fails (which, as shown below, is true with AHP, the Analytic Hierarchy Process), rather than helping, projections can aggravate the analysis by introducing new types of mistakes. To avoid these problems, the structure of the error term must be found.

Some techniques use modified forms of paired comparisons. An example is the Pugh matrix method (Pugh, 1991), where a particular alternative (often the status quo) serves as the base against which other alternatives are compared over several criteria. A standard approach is, for each criterion, to assign a score from $\{-, 0, +\}$ to an alternative, where "$-$" means it is poorer than the base, "$0$" that it is about the same, and "$+$" that it is better. The final score is the number of $+$'s minus the number of $-$'s. Refined options include "double $-$'s and double $+$'s," or perhaps 1, 2, 3, 4, 5, where 3 is equivalent to the base comparison. It is shown how to identify and overcome weaknesses of these approaches.

Donald G. Saari (dsaari@uci.edu). My thanks for helpful comments from George Hazelrigg and for comments and the careful reading of two anonymous referees. This work was supported by the National Science Foundation under NSF Award Number CMMI-1923164.
A search to improve decision outcomes is desirable, but is it futile? The pessimism derives from Arrow's Theorem (Arrow, 1963), which often is described as asserting that no decision approach is fair with three or more alternatives. Fortunately for this project, it is shown that this negative commentary is overstated.
Addressing these difficulties is part of a general project to understand systems whether from the social sciences, engineering, or biology. Toward this end, paired comparisons are treated as ''subsystems;'' i.e., the decision methods combine information from the subsystems to create an answer for the whole system. An advantage of first studying decision methods for this project is that their structures are specified, so they form a more tractable test bed to discover why system problems arise and how to correct them. Taking this point of view, Arrow's Theorem's negative conclusion becomes the standard ''the whole can differ from the sum of its parts'' concern (Sect. 2). The goal is to understand what causes conflicts between a system and its subsystems and how to avoid them.
A first step (Sect. 3) is to analyze the structure of the subsystems, which here is the space of paired comparisons. Guided by Eq. 1 and emphasizing summation methods, this space is orthogonally divided into two components; informally, call them the "desired" and "error" subspaces. Nothing goes wrong when any standard decision method uses information from the desired component. Consequently, all difficulties are caused by decision methods using information about pairs, or portions of pairs, from the error space. Accompanying the discussion are easily used computational tools.
An obvious message is to avoid subsystem outcomes (i.e., collections of paired outcomes) that rely upon the error space: this is how (in Sect. 4) the approaches described above are modified. A second class of decision methods (such as AHP) uses multiplicative procedures. In Sect. 5, results about these systems are derived by transferring material from Sect. 3. Most proofs are in Sect. 7.

Arrow's Theorem from a systems approach
A system starts with a stated objective. Arrow's modest goal was to rank $n \ge 3$ specified alternatives. The content of the alternatives is immaterial; they could be design plans for a project, ways to invest money, or even names for a new puppy.
For notation, with alternatives $A_i$ and $A_j$, the symbol "$A_i \succ A_j$" denotes "$A_i$ is ranked above $A_j$," while "$A_i \sim A_j$" means "$A_i$ and $A_j$ are ranked the same." Arrow's goal follows:
1. Objective: The ranking outcome for the $n \ge 3$ alternatives is complete and transitive.
The inputs can be essentially anything. For finance, they might be how various experts rank the alternatives, or how the alternatives fare on different markets. For an engineering or management plan, the alternatives could be ranked over different criteria such as, perhaps, taxes or availability of resources. For voting, they are the voters' preference rankings. Namely, conditions are imposed on the structure, not the content, of the inputs.
alternatives that the $\{A_1, A_2\}$ outcome always is $A_1 \sim A_2$, and the $\{A_2, A_3\}$ and $\{A_1, A_3\}$ outcomes always are $A_2 \succ A_3$ and $A_1 \succ A_3$. These constant methods satisfy the three conditions with a fixed $A_1 \sim A_2 \succ A_3$ transitive ranking, but they are not of any real interest.
Another trivial choice is where each pair's outcome always depends on a single source's ranking, perhaps the foreman on a project, which may provide efficiency, or only tax information when selecting a city for a new plant, which could be disastrous. To understand systems, methods that rely upon inputs from more than one source must be explored.
The sole purpose of the following is to acknowledge that these uninteresting methods satisfy the first three conditions and then to exclude them.
4. Eliminating undesired rules:
   a. For each pair of alternatives, the method that is designed to rank the pair does not have a fixed outcome. That is, for at least two of the three possible rankings of the pair, each is the outcome for some profile.
   b. All outcomes for all of the pairs cannot always be determined by the ranking of the same single source.
As the role of #4 is to identify and dismiss methods that are of no interest, only condition #3 of the methodology applies. That is, the system's outcome is determined by results coming from the individual subsystems. As asserted next, no such methodology exists.
Theorem 1 (Saari, 2018) For $n \ge 3$ alternatives and $a \ge 2$ sources of inputs, there do not exist ways to rank the individual pairs so that the above four conditions always hold.
The objective is modest. Yet this theorem asserts that no way can be found to always assemble information from the subsystems to achieve the objective of the full system. Thus, Theorem 1 manifests the system conundrum where ''the whole can differ from its parts.'' [A slightly stronger version of Theorem 1 is proved in Chap. 6 of (Saari, 2018).] To establish that Arrow's theorem is a special case of Theorem 1, note that conditions 1 and 2 are the same for both results. Arrow imposes a Pareto condition whereby if all sources rank a pair in the same strict (i.e., no ties) manner, then that unanimous choice is the pair's ranking. The Pareto condition, then, is a very special case of 4a; it specifies a particular way to ensure that each pair has non-constant outcomes. There are, of course, many alternative choices. Arrow's ''no dictator'' condition is a special case of 4b.
What remains is Arrow's IIA (Independence of Irrelevant Alternatives); it asserts that if all sources rank a particular pair in the same way for any two profiles, the pair's outcome remains unchanged. As shown below (Theorem 2), if a decision rule satisfies IIA, it can be expressed as a collection of independent paired comparisons; thus IIA satisfies #3.
To illustrate with the pair $\{A_1, A_2\}$ (underlined in Eq. 2 to assist comparisons), each source has the same $\{A_1, A_2\}$ ranking in the two profiles $p_1$ and $p_2$ of Eq. 2. If the decision function $F$ satisfies IIA for the $\{A_1, A_2\}$ pair, then both profiles must have the same $\{A_1, A_2\}$ ranking, perhaps $A_1 \succ A_2$ (Eq. 3). With this agreement, $F$ defines a paired comparison rule for $\{A_1, A_2\}$ denoted by $g_{\{A_1,A_2\}}$. Namely, let $g_{\{A_1,A_2\}}(p)$ be $F(p)$'s ranking of the pair $\{A_1, A_2\}$. Thus, with Eq. 3,

$$g_{\{A_1,A_2\}}(p_1) = g_{\{A_1,A_2\}}(p_2) = A_1 \succ A_2. \qquad (4)$$

Only profile information concerning $\{A_1, A_2\}$ rankings is used to define $g_{\{A_1,A_2\}}(p)$, so it follows from Eq. 4 that this mapping can be re-expressed as

$$g_{\{A_1,A_2\}}(A_1 \succ A_2,\; A_2 \succ A_1,\; A_1 \succ A_2) = A_1 \succ A_2, \qquad (5)$$

where whenever the first and third sources prefer $A_1 \succ A_2$ and the second has $A_2 \succ A_1$, the outcome must be $A_1 \succ A_2$. A general assertion follows.
Theorem 2 If a mapping F satisfies conditions 2, IIA, and ranks each pair, then, for each pair of alternatives, F defines a pairwise comparison rule where the outcome strictly depends on how each source ranks that pair. That is, IIA and #3 are equivalent.
The proof (Sect. 7) mimics the above discussion. Because Theorem 2 equates IIA and condition 3, Arrow's theorem becomes a special case of Theorem 1.
An interesting consequence of Theorem 2 is how it identifies IIA's main role to be a filter; it eliminates certain decision methods from consideration. (From a system perspective, it limits which systems are admissible.) In particular, if a decision rule fails to satisfy IIA, such as the plurality vote, this failure merely means that the rule cannot be described as a collection of independent paired comparison rules. By itself, this is not a negative feature. Instead, based on the following comments, it could be a positive one.
Because the IIA filter allows only paired comparison rules to be subject to the negative consequence of Arrow's Theorem, rather than the traditional ''with three or more alternatives, no rule is fair,'' a more accurate representation of Arrow's Theorem is that ''With three or more alternatives, no decision method based on a collection of independent paired comparison rules is fair.'' From a system's perspective, Arrow's theorem is a warning that there need not exist ways to determine the outcomes of independent subsystems (here, the pairs) that always are consistent with the objective of the whole. Beyond decision rules, expect similar negative assertions to arise should a system have a rich source of inputs along with a consistency condition on the ''whole'' that mandates how independent outcomes of the parts are related.

Connecting information
The source of "the parts do not agree with the whole" conclusion can be illustrated with Eq. 2, where the $A_1 \succ A_2$ outcome is reasonable for $p_1$ but questionable for $p_2$. Doubt about the $p_2$ outcome has nothing to do with how each source ranks $\{A_1, A_2\}$ (it is the same in each profile), but with how this pair is situated with respect to the other alternatives. Stated differently, the negative conclusions of Arrow's result and Theorem 1 reflect the fact that each pair's ranking is determined solely by the properties of the rule designed for that particular pair. As such, information coming from these subsystems need not adhere to the system's requirements. In fact, condition 3 (or IIA) prohibits checking whether the subsystems' (pairs') outcomes agree with the system's requirement (transitivity).
For an example, return to the above discussion of where to locate a plant. It could be that the status of taxes and the availability of land lead to the rankings Boston $\succ$ Chicago and Chicago $\succ$ Atlanta. The ranking of {Atlanta, Boston} is based on the ease of parking, where Atlanta $\succ$ Boston creates a cycle. The source of the problem is clear; each pairwise rule carefully makes a choice based on its specified responsibility. But there is nothing in these assignments to assure a transitive outcome.
The above holds even should the pairwise methods be designed to achieve excellence. Thus, a corollary of the above observation is the counterintuitive statement that even imposing a high level of excellence in determining the outcome for each independent subsystem need not lead to an acceptable outcome for the full system.
Clearly, something other than achieving excellence in each subsystem's outcome is involved. This underscores the need to discover whether the subsystems' outcomes (here, pairs) can interact with one another in a manner that supports the system's objective. The approach developed next assigns only those tasks to the subsystems where the outcomes will be consistent with the system's objectives. (In general, this is difficult to do.)

The structure of pairs
Fundamentals that are known about the structure of the space of pairwise summation techniques (Saari, 2014) are expanded here. This structure holds independent of whether the ranking of pairs is determined by voting, decision rules, cost, size, or any other factor. After defining the space, a convenient way of computing is introduced.
The Eq. 1 objective is to describe the data (or preliminary paired comparison results) $d$ as an orthogonal sum (Eq. 6) of a desired term and an error term. To achieve Eq. 6, the space of paired comparisons is orthogonally divided into a subspace where the pairs satisfy a strong form of transitivity (this assures satisfying the system's objective of having transitive outcomes) and a subspace of cyclic behavior. It turns out that items from this second subspace create all of the ranking difficulties that can arise with paired comparison methods; they constitute the system's error terms that force outcomes from the parts (the subsystems) to deviate from the full system's transitivity requirement.

With $n \ge 3$ alternatives $\{A_1, A_2, \ldots, A_n\}$, let $d_{i,j}$ be a numerical difference comparison between $A_i$ and $A_j$ where

$$d_{i,j} = -d_{j,i}. \qquad (7)$$

The comparisons can be almost anything, such as differences in costs of internet plans where $d_{i,j}$ is the monthly cost of plan $A_i$ minus that of plan $A_j$. Or, $d_{i,j}$ could be the difference between the weights, or maybe the lengths, of objects $A_i$ and $A_j$; it could even be the clockwise angle between two vectors. In an election, a natural choice for $d_{i,j}$ is the difference between $A_i$'s and $A_j$'s tallies. Perhaps $d_{i,j}$ comes from a physics or chemistry experiment where it is the difference in temperatures of objects $A_i$ and $A_j$. Although the origin of the $d_{i,j}$ terms is immaterial as long as Eq. 7 is satisfied, it plays a role when evaluating conclusions; e.g., comparing costs of internet plans need not identify the optimal choice.

Thanks to Eq. 7, it suffices to know the $\binom{n}{2} = \frac{n(n-1)}{2}$ independent $d_{i,j}$, $i < j$, values rather than all $n^2$ of them. Thus, the space of $d_{i,j}$ paired comparisons is identified with the $\binom{n}{2}$-dimensional Euclidean space $\mathbb{R}^{\binom{n}{2}}$, where

$$d = (d_{1,2}, d_{1,3}, \ldots, d_{1,n}, d_{2,3}, \ldots, d_{2,n}, d_{3,4}, \ldots, d_{n-1,n}) \in \mathbb{R}^{\binom{n}{2}}. \qquad (8)$$

Definition 1 (Saari 2014) A vector $d \in \mathbb{R}^{\binom{n}{2}}$ is strongly transitive if each triplet of indices $\{i, j, k\}$ leads to the equality

$$d_{i,j} + d_{j,k} = d_{i,k}. \qquad (10)$$

The set of all strongly transitive vectors, denoted by $ST^n$, is the space of strongly transitive rankings.
The choices of comparing internet costs or weights of objects always are strongly transitive. But due to interaction effects, temperatures of pairs of objects in a chemistry experiment, or the angles between vectors in $\mathbb{R}^3$, need not satisfy this condition. For a voting example that fails the condition, suppose that with 33 sources (Eq. 11) the $\{A_1, A_2\}$ pairwise vote is $A_2 \succ A_1$ with a 17:16 tally, the $\{A_2, A_3\}$ outcome of $A_2 \succ A_3$ has tally 18:15, and the $\{A_1, A_3\}$ outcome of $A_1 \succ A_3$ has tally 25:8. Trivially, the values cannot satisfy the strong transitivity of Eq. 10 even though the outcome is transitive; here $d_{1,2} + d_{2,3} = -1 + 3 = 2 \neq 17 = d_{1,3}$. Any paired comparison decision tool that allows non-transitive outcomes has examples that violate strong transitivity. A simple illustration with eight sources has 4 preferring $A_1 \succ A_2 \succ A_3$, 1 preferring $A_2 \succ A_3 \succ A_1$, 2 preferring $A_3 \succ A_2 \succ A_1$, and 1 preferring $A_3 \succ A_1 \succ A_2$. The $A_1 \succ A_2$ and $A_2 \succ A_3$ outcomes both have a 5:3 tally. But the tied $A_1 \sim A_3$ vote with tally 4:4 violates transitivity. Here $d_{1,2} = 5 - 3 = 2$, $d_{2,3} = 2$, but $d_{1,3} = 0$, which, because $2 + 2 \neq 0$, fails the strong transitivity condition.
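The strong transitivity test is easy to mechanize. The following is a minimal Python sketch, not part of the original analysis: it tallies the eight-source example above and checks whether $d_{i,j} + d_{j,k} = d_{i,k}$ holds for every triplet. The function names and the `(count, ranking)` profile encoding are illustrative assumptions.

```python
from itertools import permutations

def pairwise_differences(profile, n):
    """d[(i, j)] = (sources ranking A_i above A_j) - (sources ranking A_j above A_i)."""
    d = {(i, j): 0 for i, j in permutations(range(1, n + 1), 2)}
    for count, ranking in profile:
        position = {alt: place for place, alt in enumerate(ranking)}
        for i, j in d:
            d[(i, j)] += count if position[i] < position[j] else -count
    return d

def is_strongly_transitive(d, n):
    """True when d[i,j] + d[j,k] == d[i,k] for every triplet of distinct indices."""
    return all(d[(i, j)] + d[(j, k)] == d[(i, k)]
               for i, j, k in permutations(range(1, n + 1), 3))

# Eight-source profile from the text: (number of sources, ranking listed best to worst).
profile = [(4, (1, 2, 3)), (1, (2, 3, 1)), (2, (3, 2, 1)), (1, (3, 1, 2))]
d = pairwise_differences(profile, 3)
print(d[(1, 2)], d[(2, 3)], d[(1, 3)])  # 2 2 0
print(is_strongly_transitive(d, 3))      # False, because 2 + 2 != 0
```

Storing both $d_{i,j}$ and $d_{j,i}$ keeps the triplet check free of sign bookkeeping, at the cost of some redundancy.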
The structure of this space $ST^n$ is captured by the following definition:

Definition 2 For each $i = 1, \ldots, n$, let $B_i \in \mathbb{R}^{\binom{n}{2}}$ be the vector where $d_{i,j} = 1$ for each $j \neq i$, $j = 1, \ldots, n$, and $d_{k,j} = 0$ if $k, j \neq i$.

Thus, for alternative $A_j$, the $B_j$ nonzero components occur only when $A_j$ is compared with any other alternative, and the $d_{j,i} = 1$ value favors $A_j$. For $n = 4$, with components ordered as $(d_{1,2}, d_{1,3}, d_{1,4}, d_{2,3}, d_{2,4}, d_{3,4})$,

$$B_1 = (1, 1, 1, 0, 0, 0),\quad B_2 = (-1, 0, 0, 1, 1, 0),\quad B_3 = (0, -1, 0, -1, 0, 1),\quad B_4 = (0, 0, -1, 0, -1, -1).$$

Theorem 3 For $n \ge 3$, $ST^n$ is an $(n-1)$-dimensional linear subspace of $\mathbb{R}^{\binom{n}{2}}$ that is spanned by any $(n-1)$ of the $\{B_i\}_{i=1}^{n}$ vectors.

According to Theorem 3, algebraic combinations of strongly transitive terms are strongly transitive. As properties of summing terms in $ST^n$ resemble Eq. 9 addition, nothing goes wrong with $ST^n$ terms. So, for decision methods based on algebraic combinations of $d_{i,j}$ values, nothing goes wrong with data $d \in ST^n$. Thus, $ST^n$ is the sought-after consistency space for transitivity. In turn, all terms causing system difficulties are orthogonal to this subspace.
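To make the $B_i$ vectors concrete, here is a small Python sketch under stated assumptions: entries are ordered as $(d_{1,2}, d_{1,3}, \ldots, d_{n-1,n})$, and the value $d_{i,j} = 1$ favoring $A_i$ is stored as $-1$ in the $d_{j,i}$ slot when $j < i$. The helper names are hypothetical. The sketch checks that each $B_i$ is strongly transitive, that the $n$ vectors sum to zero (so only $n - 1$ are independent), and that a linear combination remains strongly transitive.

```python
from itertools import combinations

def pairs(n):
    """Index pairs (i, j) with i < j, in the order d_{1,2}, d_{1,3}, ..., d_{n-1,n}."""
    return list(combinations(range(1, n + 1), 2))

def basis_vector(i, n):
    """B_i: +1 where A_i is the first index of a pair, -1 where it is the second."""
    return [1 if a == i else -1 if b == i else 0 for a, b in pairs(n)]

def is_strongly_transitive(vec, n):
    """With antisymmetry, checking d_ij + d_jk == d_ik for i < j < k suffices."""
    d = dict(zip(pairs(n), vec))
    return all(d[(i, j)] + d[(j, k)] == d[(i, k)]
               for i, j, k in combinations(range(1, n + 1), 3))

n = 4
B = {i: basis_vector(i, n) for i in range(1, n + 1)}
for i, vec in B.items():
    print(f"B_{i} =", vec)

print(all(is_strongly_transitive(v, n) for v in B.values()))  # True
print([sum(column) for column in zip(*B.values())])            # [0, 0, 0, 0, 0, 0]

combo = [3 * x - 2 * y for x, y in zip(B[1], B[3])]            # 3*B_1 - 2*B_3
print(is_strongly_transitive(combo, n))                        # True
```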

Cyclic effects
Clearly, cycles of paired outcomes violate the system's objective of transitivity. A longstanding mystery has been whether other effects exist. The answer, as developed next by determining the subspace orthogonal to $ST^n$, is that cyclic terms are the sole cause of all pairwise ranking difficulties.
A typical example of a cycle is given in Eq. 13. Using fixed $d_{i,j}$ differences, this cycle defines a vector in $\mathbb{R}^{\binom{n}{2}}$. The Eq. 13 cycle subscripts can be identified with a list specified on a loop or circle (Fig. 1a). Here, 1 is followed by 4, but, because of the loop structure, 1 follows 2 to reflect the concluding $A_2 \succ A_1$ that completes the cycle. All cycles can be described with such lists.
Definition 3 Let $k = (i, j, k, \ldots, s)$ be a list of at least three indices specified in a circular manner (so $i$ is followed by $j$, and $i$ follows $s$); each index appears only once. Define the cyclic vector $C_k \in \mathbb{R}^{\binom{n}{2}}$ as follows: If $i$ and $j$ are adjacent in the listing, where $j$ immediately follows $i$, then $d_{i,j} = 1$. If $i$ and $j$ are not adjacent, then $d_{i,j} = 0$. Vector $C_k$ is the "cyclic direction defined by $k$".
Proposition 1 Any $C_k$ can be expressed as a sum of cyclic vectors defined by triplets.
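A brief Python sketch can illustrate Definition 3 and Proposition 1 for $n = 4$. It assumes entries ordered as $(d_{1,2}, d_{1,3}, d_{1,4}, d_{2,3}, d_{2,4}, d_{3,4})$ and uses hypothetical helper names. It builds $C_{(1,2,3,4)}$, confirms that it is orthogonal to each vector $B_i$ (the vector with $d_{i,j} = 1$ for all $j \neq i$), and verifies one particular triplet decomposition, $C_{(1,2,3,4)} = C_{(1,2,3)} + C_{(1,3,4)}$. This is one instance, not a proof of the proposition.

```python
from itertools import combinations

def pairs(n):
    return list(combinations(range(1, n + 1), 2))

def cyclic_vector(listing, n):
    """C_k: d_{i,j} = 1 when j immediately follows i on the circular listing."""
    d = {p: 0 for p in pairs(n)}
    successors = listing[1:] + listing[:1]
    for i, j in zip(listing, successors):
        if i < j:
            d[(i, j)] = 1
        else:
            d[(j, i)] = -1  # d_{i,j} = 1 is stored as d_{j,i} = -1
    return [d[p] for p in pairs(n)]

def basis_vector(i, n):
    """B_i: +1 where A_i is the first index of a pair, -1 where it is the second."""
    return [1 if a == i else -1 if b == i else 0 for a, b in pairs(n)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

n = 4
C1234 = cyclic_vector((1, 2, 3, 4), n)
print(C1234)                                                      # [1, 0, -1, 1, 0, 1]
print([dot(C1234, basis_vector(i, n)) for i in range(1, n + 1)])  # [0, 0, 0, 0]

C123 = cyclic_vector((1, 2, 3), n)
C134 = cyclic_vector((1, 3, 4), n)
print([x + y for x, y in zip(C123, C134)] == C1234)               # True
```

The shared edge between the two triplets cancels: $C_{(1,2,3)}$ contributes $d_{1,3} = -1$ and $C_{(1,3,4)}$ contributes $d_{1,3} = 1$.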
The next assertion provides a convenient choice of bases for $ST^n$ and $CT^n$.
Corollary 1 For a vector $d_{ST} \in ST^n$, there are unique scalars $b_j$ so that $d_{ST}$ can be uniquely expressed as a combination of the $B_j$ vectors (Eq. 17). Each cyclic component of $d_{CT} \in CT^n$ that includes alternative $A_i$ can be uniquely represented, with scalars $c_{j,k}$, by cyclic vectors with three indices (Eq. 18).
There is no mystery about Eq. 18; it is just an analytic representation of the Fig. 1c division of a cycle about node $i$. Comparing the $(n-1)$ dimension of $ST^n$ with the $\binom{n-1}{2}$ dimension of $CT^n$ makes it clear that the space of problem-causing terms, $CT^n$, quickly overwhelms $ST^n$. As the $CT^n$ dimension is $\frac{n-2}{2}$ times that of $ST^n$, we must definitely anticipate cyclic error terms.

Computing
According to Theorem 4, paired comparison outcomes have an Eq. 1 orthogonal structure given by Eq. 16. This means that to eliminate system problems and the limitations of the decision methods discussed above, it suffices to project the paired conclusions defined by $d$ into $ST^n$. Doing so eliminates the troubling cyclic $d_{CT}$ component, where it is arguable that the outcome should be a complete tie. A simple way to achieve this objective is, for each $j$, to compute the average $\{d_{j,i}\}$ value (where $d_{j,j} = 0$).
Definition 4 For $n \ge 3$ and $d \in \mathbb{R}^{\binom{n}{2}}$, the Borda Rule (BR) assigns the number

$$b_j = \frac{1}{n} \sum_{i=1}^{n} d_{j,i} \qquad (19)$$

to each alternative $A_j$. The BR ranking is determined by the $b_j$ values where larger is better.
To illustrate with Eq. 12 and its $d = (d_{1,2}, d_{1,3}, d_{2,3}) = (-1, 17, 3)$, we have that

$$b_1 = \tfrac{1}{3}(-1 + 17) = \tfrac{16}{3},\quad b_2 = \tfrac{1}{3}(1 + 3) = \tfrac{4}{3},\quad b_3 = \tfrac{1}{3}(-17 - 3) = -\tfrac{20}{3},$$

so the BR ranking is $A_1 \succ A_2 \succ A_3$. A way to interpret BR is that it does not treat the $d_{i,j}$ paired comparison outcomes as the final results of a decision analysis. Instead, BR treats them as the penultimate step. As the BR ranking is defined by scalar values (which are always transitive), BR avoids cycles. So the effect of the final BR step is to remove error-cyclic terms that violate the system's objective of transitivity. Other BR properties are described next.
Theorem 5 If $d \in CT^n$, then all of its $b_j$ values equal zero. If $d \in ST^n$, then for each $(i, j)$ pair,

$$d_{i,j} = b_i - b_j. \qquad (20)$$

According to Theorem 5, the projection of $d$ to the transitive consistency space $ST^n$ can be quickly determined from Eq. 20 and the Borda Rule (BR) outcome. This is because the linearity of addition requires the $b_j$ values of $d = d_{ST} + d_{CT}$ to be the sum of the $b_j$ values of $d_{ST}$ and of $d_{CT}$, where the latter are zero (Eq. 21). To illustrate with Eq. 12, rather than using linear algebra to project $d = (-1, 17, 3)$ to $ST^3$, the above $3b_1 = 16$, $3b_2 = 4$, $3b_3 = -20$ values and Eq. 20 give $d_{ST} = (b_1 - b_2, b_1 - b_3, b_2 - b_3) = (4, 12, 8)$. As two of the vectors in $d = d_{ST} + d_{CT}$ are known, it follows that $d_{CT} = d - d_{ST}$, which here is $d_{CT} = (-1, 17, 3) - (4, 12, 8) = (-5, 5, -5) = -5(1, -1, 1) = -5C_{1,2,3}$. Again, it is arguable that the outcome for this cyclic $C_{1,2,3}$ should be a complete tie.

It is interesting that, although $d = (-1, 17, 3)$ is not strongly transitive (it includes cyclic terms), it defines the transitive rankings $A_2 \succ A_1$, $A_2 \succ A_3$, $A_1 \succ A_3$, or $A_2 \succ A_1 \succ A_3$. By beating all other alternatives, $A_2$ is called a Condorcet winner. The reason the Condorcet ($A_2$) and Borda ($A_1$) winners differ is strictly because the Condorcet winner is influenced by information that includes the cyclic error term $d_{CT}$. Indeed, $d_{CT} = 5(-1, 1, -1)$ is created by the profile $p_{CT}$ where 5 sources have each of the rankings $A_2 \succ A_1 \succ A_3$, $A_1 \succ A_3 \succ A_2$, $A_3 \succ A_2 \succ A_1$. In contrast, $p_{ST}$ has the strongly transitive $d_{ST} = (4, 12, 8)$ with the $A_1 \succ A_2 \succ A_3$ conclusion. Thus (as always true), the cyclic noise component, where no alternative is favored, is what causes the Condorcet and Borda winners to differ.
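The computation just described can be sketched in a few lines of Python. This is an illustrative sketch, not the author's code: it assumes the Borda Rule value is the average $b_j = \frac{1}{n}\sum_i d_{j,i}$, takes the strongly transitive part to be $d_{i,j} = b_i - b_j$, and obtains the cyclic part by subtraction. Function names are hypothetical, and exact fractions avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

def diff(d, j, i):
    """Look up d_{j,i} from a dict that stores only the pairs with first index smaller."""
    if j == i:
        return 0
    return d[(j, i)] if j < i else -d[(i, j)]

def borda_rule(d, n):
    """b_j = average of d_{j,i} over i = 1..n (with d_{j,j} = 0)."""
    return {j: Fraction(sum(diff(d, j, i) for i in range(1, n + 1)), n)
            for j in range(1, n + 1)}

def decompose(d, n):
    """Split d into a strongly transitive part and a cyclic remainder."""
    b = borda_rule(d, n)
    d_st = {(i, j): b[i] - b[j] for i, j in combinations(range(1, n + 1), 2)}
    d_ct = {p: d[p] - d_st[p] for p in d_st}
    return d_st, d_ct

d = {(1, 2): -1, (1, 3): 17, (2, 3): 3}
b = borda_rule(d, 3)
d_st, d_ct = decompose(d, 3)

print({j: str(v) for j, v in b.items()})          # {1: '16/3', 2: '4/3', 3: '-20/3'}
print([int(v) for v in d_st.values()])            # [4, 12, 8]
print([int(v) for v in d_ct.values()])            # [-5, 5, -5]
print(set(borda_rule(d_ct, 3).values()) == {0})   # True: the cyclic part has zero Borda values
```

The two parts are orthogonal here: $4(-5) + 12(5) + 8(-5) = 0$.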
A way to think about this is that there are two sources of information: $p_{ST}$ and $p_{CT}$. The first group has a well-defined ranking outcome. Because it is arguable that the second group does not favor any alternative, combining the two sources $p_{ST} + p_{CT}$ may seem to be innocuous. It is not; doing so changes the outcome.
A common difficulty in experiments is if some $d_{i,j}$ data values for certain pairs are close to each other. Such settings can raise doubt about the selection of an optimal choice. Suppose differences between temperatures in a chemistry experiment have $d_{1,2} = -1$, $d_{2,3} = 3$, $d_{1,3} = 9$; although $A_2$ is warmer than $A_1$ or $A_3$, the small $d_{1,2} = -1$ value can raise doubts about $A_2$. The Borda scores are $3b_1 = -1 + 9 = 8$, $3b_2 = 1 + 3 = 4$, $3b_3 = -3 - 9 = -12$, where $A_1$, not $A_2$, is the Borda winner. This difference arises because values supporting $A_2$, which are not as decisive as those supporting $A_1$, reflect cyclic noise in the data generated by a component in which "no alternative is better." As shown next, this is a general phenomenon. (Equation 22 can be extended to all $n \ge 3$, but the expression is messy.)

Theorem 6 For $n = 3$, if the Borda winner $A_1$ and the Condorcet winner $A_2$ disagree, then $d$ is not strongly transitive (it includes cyclic components). The Borda ranking for $d$ is $A_1 \succ A_2 \succ A_3$ (so the Condorcet winner is not bottom ranked), the Borda difference $b_1 - b_2 > 0$ is smaller than $b_2 - b_3 > 0$, and the cyclic term is $c(1, -1, 1) = cC_{1,2,3}$, where $c$ is specified by Eq. 22. For $n \ge 3$, the Condorcet winner never is Borda bottom ranked; if there is a Condorcet loser (an alternative that loses all paired comparisons), the Condorcet winner is Borda ranked over the Condorcet loser.
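The temperature example can be checked with a short Python sketch. The helper names are hypothetical, and the cyclic coefficient is computed as the ordinary orthogonal-projection coefficient of $d$ onto $(1, -1, 1)$, namely $c = (d_{1,2} - d_{1,3} + d_{2,3})/3$; that formula is an assumption of this sketch and is not claimed to reproduce Eq. 22.

```python
from fractions import Fraction

def diff(d, j, i):
    """Look up d_{j,i} from a dict that stores only the pairs with first index smaller."""
    if j == i:
        return 0
    return d[(j, i)] if j < i else -d[(i, j)]

def borda_rule(d, n):
    return {j: Fraction(sum(diff(d, j, i) for i in range(1, n + 1)), n)
            for j in range(1, n + 1)}

def condorcet_winner(d, n):
    """The alternative that beats every other one in paired comparisons, if any."""
    for j in range(1, n + 1):
        if all(diff(d, j, i) > 0 for i in range(1, n + 1) if i != j):
            return j
    return None

d = {(1, 2): -1, (1, 3): 9, (2, 3): 3}   # temperature differences from the text
b = borda_rule(d, 3)
borda_winner = max(b, key=b.get)
c = Fraction(d[(1, 2)] - d[(1, 3)] + d[(2, 3)], 3)

print(condorcet_winner(d, 3), borda_winner)   # 2 1
print(c)                                      # -7/3
print(b[1] - b[2] < b[2] - b[3])              # True, as Theorem 6 asserts
```

Subtracting $c(1, -1, 1)$ from $d$ leaves $(4/3, 20/3, 16/3)$, which equals $(b_1 - b_2, b_1 - b_3, b_2 - b_3)$.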
The message: to have consistent, transitive outcomes, use the Borda Rule. But are there other approaches? The next theorem asserts that any such choice is equivalent to the Borda Rule. One possibility is the Borda Count, which is where an $n$-candidate ballot is tallied by assigning $n - j$ points to the $j$th positioned candidate. As another example, the Kruskal-Wallis procedure from non-parametric statistics is equivalent to the Borda Count (Haunsperger, 1989) and hence to the Borda Rule.

Theorem 7 Any linear method of removing the error term of $d$ to obtain $d_{ST}$ is equivalent to the Borda Rule. If each $d_{i,j}$ value is the difference of $A_i$'s and $A_j$'s majority vote tallies coming from $a \ge 2$ strictly transitive rankings of the $n \ge 3$ alternatives $\{A_1, \ldots, A_n\}$, then $A_j$'s Borda Count tally, denoted by $s_j$, and Borda Rule value $b_j$ satisfy

$$n b_j = 2 s_j - (n - 1) a. \qquad (23)$$

To illustrate Eq. 23 with Eq. 11, it follows from the above computations that $n = 3$, $a = 33$ (the number of sources), $3b_1 = 16$, $3b_2 = 4$, $3b_3 = -20$, while $s_1 = 41$, $s_2 = 35$, $s_3 = 23$. Elementary computations prove that these values satisfy Eq. 23; e.g., $2s_1 - (n - 1)a = 82 - 66 = 16 = 3b_1$.
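The link between Borda Count tallies and Borda Rule values can be checked numerically. The values quoted above ($3b_1 = 16$ with $s_1 = 41$, and so on, for $a = 33$) fit the relation $n b_j = 2 s_j - (n-1)a$; the sketch below tests that same relation on the eight-source profile used earlier (4 sources with $A_1 \succ A_2 \succ A_3$, 1 with $A_2 \succ A_3 \succ A_1$, 2 with $A_3 \succ A_2 \succ A_1$, 1 with $A_3 \succ A_1 \succ A_2$). Function names and the profile encoding are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

def borda_count(profile, n):
    """s_j: n - position points for the alternative in each position (1 = top)."""
    s = {j: 0 for j in range(1, n + 1)}
    for count, ranking in profile:
        for position, alt in enumerate(ranking, start=1):
            s[alt] += count * (n - position)
    return s

def borda_rule(profile, n):
    """b_j = (1/n) * sum_i d_{j,i}, with d_{j,i} the pairwise tally difference."""
    totals = {j: 0 for j in range(1, n + 1)}
    for count, ranking in profile:
        place = {alt: r for r, alt in enumerate(ranking)}
        for j, i in permutations(range(1, n + 1), 2):
            totals[j] += count if place[j] < place[i] else -count
    return {j: Fraction(t, n) for j, t in totals.items()}

profile = [(4, (1, 2, 3)), (1, (2, 3, 1)), (2, (3, 2, 1)), (1, (3, 1, 2))]
n = 3
a = sum(count for count, _ in profile)   # number of sources

s = borda_count(profile, n)
b = borda_rule(profile, n)
print(s)                                                               # {1: 9, 2: 8, 3: 7}
print(all(n * b[j] == 2 * s[j] - (n - 1) * a for j in range(1, n + 1)))  # True
```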

Modifying methods
Knowing what causes paired comparison problems identifies how to modify decision approaches to yield more acceptable conclusions. As developed in the previous section, the "transitive consistency space" is the strongly transitive $ST^n$.
The projection of $d$ to $ST^n$ is done as follows:
• Find each pair's pairwise $d_{i,j}$ value; e.g., for voting, compute each pair's majority vote difference.
• Compute each alternative's Borda tally (Eq. 19).
The success of this approach is captured by the following theorem, which asserts that by treating paired comparison outcomes as the penultimate step in a decision analysis, where the final ranking is determined by the above projection approach, the negativity of Arrow's Theorem is avoided. That is, from a system perspective, the whole and the sum of parts agree. This holds for whatever methods are used to compute the paired outcomes.
Theorem 8 For $n \ge 3$ alternatives and $a \ge 2$ sources of inputs, the above projection approach defines pairwise rankings that always satisfy the four conditions of Sect. 2. The ranking coming from these tallies agrees with the BR ranking.
Stated differently, Arrow's result is totally caused by the subsystem information where cyclic combinations compromise the system's objective of transitivity. This makes sense; data with cyclic components must be expected to compromise the goal of transitivity. In contrast, the $ST^n$ terms ensure agreement between the system and the subsystems.

Pugh matrix
According to the above, anticipate difficulties with any decision method based on paired comparisons where the pairs are separately analyzed. (This separation causes the system's requirements to be ignored.) This message can be illustrated with the Pugh matrix (Pugh, 1991) approach, where the paired comparisons are made with respect to a base alternative. Even if the outcome for each pair is determined in a sound and excellent manner, the final conclusion could violate excellence.
As a five-alternative example, suppose this approach judges the status quo $A_2$ as being better than $A_3$, $A_4$, $A_5$; only $A_1$ is better than $A_2$. Presumably, this means that $A_1$ is the superior choice. But applying the Pugh approach to the same data where the supposedly superior $A_1$ now is the base of comparison, while $A_1$ remains better than $A_2$, the previously "inferior" $A_3$, $A_4$, $A_5$ could be judged as being better than the supposedly superior $A_1$! A supporting example is given in Eq. 24. The paired outcomes involving the status quo $A_2$ are $A_1 \succ A_2$, $A_2 \succ A_3$, $A_2 \succ A_4$, $A_2 \succ A_5$, so $A_2$ is beaten only by $A_1$. That $A_1$ is not the superior choice follows from the paired outcomes involving $A_1$, which are $A_1 \succ A_2$, $A_3 \succ A_1$, $A_4 \succ A_1$, $A_5 \succ A_1$; that is, $A_1$ beats only $A_2$. Finally, $A_3 \succ A_4$, $A_3 \succ A_5$, $A_4 \succ A_5$.

This difficulty arises because valued information is ignored; e.g., when $A_2$ is the basis of comparison, $A_1$ is not compared with any other alternative. Consequently, any assertion that $A_1$ is superior is dubious; it is based on a tacit, unsupported assumption that the system has transitive outcomes. This leads to a second problem: cyclic error terms introduce difficulties, and the cycle here is $A_1 \succ A_2$, $A_2 \succ A_3$, $A_3 \succ A_4$, $A_4 \succ A_5$, $A_5 \succ A_1$. Which alternative should be chosen?
A resolution follows from Sect. 3 and Theorem 7: eliminate the problem-causing system error terms. For each criterion, rank all $n$ alternatives where the base choice (e.g., the status quo) does not have a special status. According to Theorem 7, the Borda Count ranking of the alternatives captures what would happen should the paired comparisons be projected to $ST^n$ to eliminate the system's cyclic error terms. So, for each criterion's ranking of the $n$ choices, assign the score of $(n - j)$ to the $j$th ranked alternative. Sum each alternative's assigned values, and use them to rank the alternatives. Doing so for Eq. 24, where $A_2$ is the status quo, leads to the Borda Count tallies

$$s_1 = 5,\quad s_2 = 7,\quad s_3 = 9,\quad s_4 = 6,\quad s_5 = 3. \qquad (25)$$

The $A_3 \succ A_2 \succ A_4 \succ A_1 \succ A_5$ outcome indicates that only $A_3$ is better than the status quo $A_2$. This approach eliminates the above concern about missing information and the worry about how hidden cyclic components can distort answers.
As an answer is given, there is no reason to find $d_{ST}$. But if there is a need, use the Sect. 3 machinery to compute the $b_j$ values, which from Eq. 23 are $5b_j = 2s_j - 12$, or $5b_1 = -2$, $5b_2 = 2$, $5b_3 = 6$, $5b_4 = 0$, $5b_5 = -6$. Indeed, if $A_j \succ A_k$ in this Eq. 25 outcome, we know (e.g., Theorem 5) that the outcome for that part of $d_{ST}$ has $d_{j,k} > 0$. And so, finally, the whole and parts agree.
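The Pugh discussion can be reproduced with a small Python sketch. The three criterion rankings below are hypothetical data, chosen only because they generate the paired outcomes listed in the text; they are not claimed to be the Eq. 24 example. The sketch computes each pair's majority outcome and then the Borda Count tallies, with $(n - j)$ points for the $j$th ranked alternative.

```python
from itertools import combinations

# Hypothetical rankings (best to worst) over three criteria; an assumption of this sketch.
rankings = [(1, 2, 3, 4, 5), (2, 3, 4, 5, 1), (3, 4, 5, 1, 2)]
n = 5

def majority_winner(i, j):
    """Alternative preferred by most criteria in the pair {A_i, A_j}."""
    votes_for_i = sum(r.index(i) < r.index(j) for r in rankings)
    return i if 2 * votes_for_i > len(rankings) else j

beats = {(i, j): majority_winner(i, j)
         for i, j in combinations(range(1, n + 1), 2)}

# Pairs involving the status quo A_2: only A_1 beats it.
print([w for (i, j), w in beats.items() if 2 in (i, j)])   # [1, 2, 2, 2]
# Pairs involving A_1: it beats only A_2.
print([w for (i, j), w in beats.items() if 1 in (i, j)])   # [1, 3, 4, 5]

# Borda Count: (n - j) points for the j-th ranked alternative, j = 1..n.
borda = {alt: sum(n - 1 - r.index(alt) for r in rankings)
         for alt in range(1, n + 1)}
order = sorted(borda, key=borda.get, reverse=True)
print(borda)   # {1: 5, 2: 7, 3: 9, 4: 6, 5: 3}
print(order)   # [3, 2, 4, 1, 5]
```

With these assumed rankings the Borda outcome is $A_3 \succ A_2 \succ A_4 \succ A_1 \succ A_5$, and $2s_j - 12$ gives $-2, 2, 6, 0, -6$, which sum to zero as Borda Rule values must.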

Rank reversal
A stronger comment about agreement between the pairs and whole comes by explaining how dropping alternatives can cause a method to change, even reverse the ranking of the remaining alternatives. As shown next, again, this problem is caused by the cyclic terms.
Theorem 9 If $d \in ST^n$, and if $P_k$ is a projection mapping that drops alternative $A_k$, then $P_k(d)$ is strongly transitive and the ranking of the $(n-1)$ alternatives in $P_k(d)$ agrees with the ranking of the same alternatives in $d$. But if $d \in CT^n$ and $A_k$ is in a cycle of $d$, then $P_k(d)$ is the sum of nonzero terms in $ST^{n-1}$ and $CT^{n-1}$.
To see how this behavior can cause changes in outcomes, the BR ranking of $C_{\{1,2,3,4\}}$ is a complete tie, while the BR ranking of $P_4(C_{\{1,2,3,4\}})$ is $A_1 \succ A_2 \succ A_3$, which manifests the strongly transitive term created from the cycle by dropping $A_4$. So, the BR ranking of $d = 3B_3 + 2B_2 + B_1 + 12C_{\{1,2,3,4\}}$ is $A_3 \succ A_2 \succ A_1 \succ A_4$, but because of the strong cyclic component, the BR ranking of $P_4(d)$ is $A_1 \succ A_2 \succ A_3$. It follows from Theorem 9 that the only way to create a reversal example with a given method is to create a profile with appropriate cyclic terms. By dropping an alternative, the cyclic terms alter the strongly transitive portion of a profile, which, with a careful construction, allows almost any ranking to emerge. This means, for instance, that the Kruskal-Wallis approach from nonparametric statistics can suffer these problems.
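The reversal example can be verified with the following Python sketch. It is an illustration under stated assumptions: vectors are stored as dicts keyed by pairs $(i, j)$ with $i < j$, $B_i$ has $d_{i,j} = 1$ for every $j \neq i$, the cyclic vector follows the circular listing $(1, 2, 3, 4)$, and the ranking uses the sums $\sum_i d_{j,i}$ (that is, $n b_j$). All function names are hypothetical.

```python
from itertools import combinations

ALTS = (1, 2, 3, 4)

def pairs(alts):
    return list(combinations(alts, 2))

def basis(i, alts):
    """B_i as a dict: +1 on d_{i,j}, stored as -1 on d_{j,i} when j < i."""
    return {(a, b): 1 if a == i else -1 if b == i else 0 for a, b in pairs(alts)}

def cycle(listing, alts):
    """C_k as a dict: d_{i,j} = 1 when j immediately follows i on the circular listing."""
    d = {p: 0 for p in pairs(alts)}
    for i, j in zip(listing, listing[1:] + listing[:1]):
        d[(i, j) if i < j else (j, i)] = 1 if i < j else -1
    return d

def borda_totals(d, alts):
    """n * b_j for each alternative: the sum of d_{j,i} over i != j."""
    def diff(j, i):
        return d[(j, i)] if j < i else -d[(i, j)]
    return {j: sum(diff(j, i) for i in alts if i != j) for j in alts}

def ranking(d, alts):
    totals = borda_totals(d, alts)
    return sorted(alts, key=totals.get, reverse=True)

def drop(d, k):
    """P_k: discard every paired comparison that involves A_k."""
    return {p: v for p, v in d.items() if k not in p}

B1, B2, B3 = (basis(i, ALTS) for i in (1, 2, 3))
C = cycle((1, 2, 3, 4), ALTS)
d = {p: 3 * B3[p] + 2 * B2[p] + B1[p] + 12 * C[p] for p in pairs(ALTS)}

print(borda_totals(C, ALTS))            # every total is 0: a complete tie
print(ranking(d, ALTS))                 # [3, 2, 1, 4]
print(ranking(drop(d, 4), (1, 2, 3)))   # [1, 2, 3]
```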
The construction of examples follows the lead of the above example. To suggest what can be done, suppose the Kruskal-Wallis ranking for $d$ is $A_3 \succ A_2 \succ A_1 \succ A_4$ and we want to have an example where, by dropping $A_4$, the new ranking is $A_1 \succ A_2 \succ A_3$. As above, just add to $d$ a sufficiently large multiple of $C_{\{1,2,3,4\}}$. [To see how to design raw data values from $d$, see (Bargagliotti and Saari 2010).] A way to underscore the importance of Theorem 9 is to recognize that IIA in Arrow's Theorem, or assumption 3 in Theorem 1, are projections of data to pairs. Slight extensions of Theorem 9 (e.g., consider all $P_k$ projections, $k = 1, \ldots, n$) lead to Arrow-type conclusions based not on pairs, but on triplets, or on quadruples, and so on. This means that subsystem-system difficulties are far more pervasive than suggested by Theorem 1 and Arrow's result.
The continuing message is that the myriad of difficulties experienced in this wide general area, whether it be Arrow's Theorem or problems in decision or statistical methods, are strictly caused by cyclic terms. To avoid problems, all analysis should strictly depend upon a profile's strongly transitive portion; that is, remove this cyclic virus and retain only portions of the profile that are compatible with the objective of the system. From a systems perspective, expect independent subsystems to admit features that counter the system's objective. They must be identified and removed.

Ratio scale methods
The ratio scaled comparison techniques constitute another important class of paired decision methods. Here, a comparison of $A_i$ and $A_j$ is represented by a value $a_{i,j} > 0$, which is intended to be the multiple of how much $A_i$ is better than $A_j$. The defining comparisons for the $n$ alternatives $\{A_1, A_2, \dots, A_n\}$ are the positive values $a_{i,j} > 0$ where
$$a_{j,i} = \frac{1}{a_{i,j}} \text{ for each } i, j; \text{ so } a_{i,i} = 1 \text{ for each } i. \tag{26}$$
According to Eq. 26 (and similar to Eq. 8), all information about the $n^2$ paired comparisons is contained in the $\binom{n}{2}$-dimensional vector of positive entries
$$a = (a_{1,2}, a_{1,3}, \dots, a_{1,n}, a_{2,3}, \dots, a_{2,n}, a_{3,4}, \dots, a_{n-1,n}) \in \mathbb{R}_+^{\binom{n}{2}}.$$
The system goal is to rank the $n$ alternatives by using information from the subsystems, the $a_{i,j}$ paired comparisons. Because $a_{i,j}$ is a multiple of how much better $A_i$ is over $A_j$, this suggests there are scalar weights $w_j$ for $A_j$, $j = 1, \dots, n$, that define the $a_{i,j}$ values. These desired $w_j$ weights exist if and only if they satisfy the consistency condition
$$a_{i,j} = \frac{w_i}{w_j} \text{ for all pairs } (i, j). \tag{28}$$
Of importance is the following known result (which is proved for completeness).
Theorem 10 Equation 28 is true for all pairs if and only if
$$a_{i,j}\, a_{j,k} = a_{i,k} \text{ for all triplets } (i, j, k). \tag{29}$$

Proof To prove the statement in one direction, if weights can be found where Eq. 28 always holds, then $a_{i,j}\, a_{j,k} = \frac{w_i}{w_j} \frac{w_j}{w_k} = \frac{w_i}{w_k} = a_{i,k}$, which is the desired Eq. 29. To prove the converse, because only $a_{i,j}$ values are known, candidate choices for the $w_j$ weights must be found. Choose an alternative as the basis of comparison, say $A_n$, and select a positive value for $w_n$. Define $w_j$, the candidate weight for $A_j$, to be
$$w_j = a_{j,n} w_n \quad \text{or} \quad a_{j,n} = \frac{w_j}{w_n}, \quad j = 1, \dots, n. \tag{30}$$
If Eq. 29 always holds, then $a_{i,j} = a_{i,n}\, a_{n,j} = a_{i,n} \frac{1}{a_{j,n}} = \left(\frac{w_i}{w_n}\right)\left(\frac{w_n}{w_j}\right) = \frac{w_i}{w_j}$, which is Eq. 28. The selected $w_n > 0$ value does not matter because for any scalar $\mu > 0$, the $a_{i,j}$ values defined by $\{w_1, w_2, \dots, w_n\}$ are those given by $\{\mu w_1, \mu w_2, \dots, \mu w_n\}$. $\square$

With this Eq. 29 structure, the ideal situation is if $a$ belongs to the consistency space
$$CS^n = \{a \in \mathbb{R}_+^{\binom{n}{2}} \mid a_{i,j}\, a_{j,k} = a_{i,k} \text{ for all triplets } (i, j, k)\}. \tag{31}$$
Properties of $CS^n$ follow from Eq. 29, which requires each $a_{j,k} = a_{j,1}\, a_{1,k}$. This means that all $a_{j,k}$ values can be determined by Eq. 29 and the $(n-1)$ values of $a_{1,j}$, $j = 2, \dots, n$. Thus, $CS^n$ is a smooth $(n-1)$-dimensional manifold. If $a \in CS^n$, then Eq. 30 can be used to assign a weight $w_j$ to alternative $A_j$, $j = 1, \dots, n$.
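As a concrete check of Theorem 10, the sketch below tests the triplet condition $a_{i,j} a_{j,k} = a_{i,k}$ and, when it holds, recovers weights through $w_j = a_{j,n} w_n$. It is only an illustration under stated assumptions: the function names (`reciprocal_matrix`, `is_consistent`, `weights`) and the two sample vectors are hypothetical, indices are 0-based, and exact rational arithmetic is used so that equality tests are meaningful.

```python
from fractions import Fraction
from itertools import combinations

def reciprocal_matrix(a, n):
    """Expand a = (a_{1,2}, ..., a_{n-1,n}) into the full matrix with a_{j,i} = 1/a_{i,j}."""
    m = [[Fraction(1)] * n for _ in range(n)]
    for (i, j), value in zip(combinations(range(n), 2), a):
        m[i][j] = Fraction(value)
        m[j][i] = 1 / Fraction(value)
    return m

def is_consistent(a, n):
    """True when a_{i,j} * a_{j,k} == a_{i,k} for every triplet."""
    m = reciprocal_matrix(a, n)
    idx = range(n)
    return all(m[i][j] * m[j][k] == m[i][k] for i in idx for j in idx for k in idx)

def weights(a, n, w_n=Fraction(1)):
    """Candidate weights w_j = a_{j,n} * w_n, using the last alternative as the base."""
    m = reciprocal_matrix(a, n)
    return [m[j][n - 1] * w_n for j in range(n)]

F = Fraction
consistent = [F(5, 3), F(5, 2), F(5), F(3, 2), F(3), F(2)]      # illustrative values
perturbed = [F(5, 3), F(15, 2), F(5, 3), F(3, 2), F(3), F(6)]   # illustrative values

w = weights(consistent, 4)
print(is_consistent(consistent, 4), w, is_consistent(perturbed, 4))
```

For the consistent vector the recovered weights reproduce every entry as $w_i / w_j$; for the perturbed vector the triplet test fails, so no such weights exist.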

A difficulty
In general, $a \notin CS^n$, which requires removing $a$'s error component. As this error term had not been identified, methods were developed to adjust the $w_j$ and/or $\{a_{i,j}\}$ values. A natural choice is to project $a$ to the space $CS^n$; e.g., find the $CS^n$ point closest to $a$. But this approach carries the tacit assumption that $a$ has an Eq. 1 form.
Moreover, $CS^n$ is a nonlinear submanifold of $\mathbb{R}^{\binom{n}{2}}$. The nonlinearity follows immediately by finding all $a_{1,j}$ and $a_{j,2}$ values so that $a_{1,j}\, a_{j,2} = a_{1,2} = 1$, which has the hyperbolic equation $xy = 1$ form; e.g., see Fig. 2. The reality that projections can introduce new types of errors is illustrated in Fig. 2 where the dark hyperbola represents the space $CS^n$. The nonlinearity of $CS^n$ makes it reasonable to expect that a decomposition of $a$ is given by the light curve passing through $a$ and $CS^n$. That is, the curve constitutes error terms; where the curve passes through $CS^n$ (point $b$ in the figure) is the corrected form of $a$. The dashed arrow, which represents a projection, defines $c \in CS^n$. If the above scenario is correct (as shown in Theorem 13, it is), then standard projection approaches must introduce new errors, which, in Fig. 2, is the $c - b$ difference.
Certain methods harness the power of the Perron-Frobenius result (from linear algebra) that a nonsingular $n \times n$ matrix with positive entries has only one eigenvector with positive entries. The AHP approach, promoted by Saaty (Saaty, 1977; Saaty & Alexander, 1989), uses the matrix $((a_{i,j}))$ and defines $w_j$ to be this eigenvector's $j$th component. A linear algebra exercise shows that these weights satisfy Eq. 28 if and only if $a \in CS^n$. Otherwise, adjustments are needed.
The geometric mean technique developed by Crawford and Williams (Crawford & Williams, 1985; Crawford, 1987) defines
$$w_j = (a_{j,1}\, a_{j,2} \cdots a_{j,n})^{1/n}, \quad j = 1, \dots, n, \tag{32}$$
so the $w_j$ weight assigned to $A_j$ is the product of all $a_{j,k}$ multiples raised to the $\frac{1}{n}$ power. While these weights do not satisfy Eq. 28 if $a \notin CS^n$, by replacing each $a_{j,k}$ with $\frac{w_j}{w_k}$, the Eq. 29 consistency equation holds. This means that this technique does convert an $a \notin CS^n$ into a $CS^n$ point. But is it the desired point? While numerical experiments suggest this geometric mean approach is better than several others [e.g., (Golany & Kress, 1993)], to the best of my knowledge, there has not been a theoretical justification. An argument is in Sect. 5.2.

Fig. 2 The consistency space

A resolution
The error term of $a \notin CS^n$ is what causes the system's "parts and whole" problem. Resolving the difficulty is simple: transfer $a$ to the space of paired comparisons that use summations (Sect. 3) where answers are now known.
To do so, identify each $a \in \mathbb{R}_+^{\binom{n}{2}}$ with a unique $d \in \mathbb{R}^{\binom{n}{2}}$ through the componentwise logarithm and exponential mappings, $d_{i,j} = \ln(a_{i,j})$ and $a_{i,j} = e^{d_{i,j}}$. The Eq. 26 defining condition of $a_{i,j} = \frac{1}{a_{j,i}}$ becomes, with the logarithm mapping, $\ln(a_{i,j}) = -\ln(a_{j,i})$; this is the required Eq. 7. Conversely, the $d_{i,j} = -d_{j,i}$ expression becomes $e^{d_{i,j}} = \frac{1}{e^{d_{j,i}}}$; this is the required Eq. 26 condition. Thanks to these diffeomorphisms, finding $a$'s error component reduces to finding the error component of $\ln(a)$ where the Sect. 3 structure provides answers. The next statement is central to this discussion.
Theorem 11 For $n \ge 3$, the subspaces $ST^n$ and $CS^n$ are diffeomorphic; the Borda Rule (Eq. 19) and the geometric mean approach (Eq. 32) are equivalent.
The last sentence means that an error free approach to rank the alternatives in a ratio scale space is the geometric mean approach. The theorem also means that all of the above Sects. 3, 4 results have parallel conclusions for the multiplicative approach.
Proof What mainly needs to be shown is that $ST^n$ and $CS^n$ are mapped onto each other with the indicated mappings. The $CS^n$ space requires all triplets to satisfy Eq. 31. With the logarithm mapping, the condition $a_{i,j}\, a_{j,k} = a_{i,k}$ becomes $\ln(a_{i,j}) + \ln(a_{j,k}) = \ln(a_{i,k})$, which is the strongly transitive condition (Def. 1). Thus, each $a \in CS^n$ is mapped to a unique $d \in ST^n$. Conversely, if $d \in ST^n$, then for all triplets $d_{i,j} + d_{j,k} = d_{i,k}$, which under the exponential mapping becomes $e^{d_{i,j} + d_{j,k}} = e^{d_{i,k}}$. Because $e^{d_{i,j} + d_{j,k}} = e^{d_{i,j}} e^{d_{j,k}}$, every $d \in ST^n$ is mapped to a unique $a \in CS^n$.
As for the geometric mean technique, $\ln\left[(a_{j,1}\, a_{j,2} \cdots a_{j,n})^{1/n}\right] = \frac{1}{n} \sum_{i=1}^{n} \ln(a_{j,i})$, which is the average $\{\ln(a_{j,k})\}$ value, or the Borda Rule value for alternative $A_j$.
Conversely, $e^{\frac{1}{n} \sum_{k=1}^{n} d_{j,k}} = \left[e^{d_{j,1}} e^{d_{j,2}} \cdots e^{d_{j,n}}\right]^{1/n}$, so the BR is mapped to the geometric mean rule. $\square$

The structures of $CS^n$ and error terms follow by using the exponential mapping to transfer the $B_j$ and $C_{i,j,k}$ basis vectors of Corollary 1 to this ratio scale space. In doing so, componentwise addition in the pairwise summation space transfers to coordinatewise multiplication in the ratio scale space; the vector $0 \in ST^n$ is mapped to the vector $e = (1, \dots, 1) \in CS^n$ that has unity for each component. For notation, let $a_{ST}$ and $a_{CT}$ represent, respectively, the image of strongly transitive and cyclic terms; the first provide a basis for $CS^n$ and the second for error terms.
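The equivalence between the Borda Rule and the geometric mean can be seen numerically. The sketch below is only an illustration: the ratio vector `a` is hypothetical data, indices are 0-based, and the helper names (`row`, `borda_values`, `geometric_means`) are introduced here for convenience. It takes logarithms of `a`, computes Borda values $b_j = \frac{1}{n}\sum_k d_{j,k}$, and compares $e^{b_j}$ with the geometric mean of row $j$.

```python
import math
from itertools import combinations

def row(vec, n, j, flip):
    """Entries v_{j,k}, k != j; flip() converts a stored v_{k,j} into v_{j,k}."""
    look = dict(zip(combinations(range(n), 2), vec))
    return [look[(j, k)] if j < k else flip(look[(k, j)])
            for k in range(n) if k != j]

def borda_values(d, n):
    """b_j = (1/n) * sum_k d_{j,k}, with d_{k,j} = -d_{j,k} (d_{j,j} = 0 adds nothing)."""
    return [sum(row(d, n, j, lambda x: -x)) / n for j in range(n)]

def geometric_means(a, n):
    """w_j = (prod_k a_{j,k})^(1/n), with a_{k,j} = 1/a_{j,k} (a_{j,j} = 1 changes nothing)."""
    return [math.prod(row(a, n, j, lambda x: 1 / x)) ** (1 / n) for j in range(n)]

n = 4
a = [2.0, 0.5, 4.0, 3.0, 1.5, 0.25]        # hypothetical ratio data
d = [math.log(x) for x in a]               # transfer to the summation space
from_borda = [math.exp(b) for b in borda_values(d, n)]
from_gm = geometric_means(a, n)
print(from_borda)
print(from_gm)
```

The two printed lists agree up to floating-point rounding, so both rules induce the same ranking.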
Definition 5 For $a, b \in \mathbb{R}_+^{\binom{n}{2}}$, define $a \odot b$ to be
$$a \odot b = (a_{1,2} b_{1,2}, a_{1,3} b_{1,3}, \dots, a_{1,n} b_{1,n}, \dots, a_{n-1,n} b_{n-1,n}).$$
Represent the product $c_1 \odot c_2 \odot \cdots \odot c_k$ by $\bigodot_{j=1}^{k} c_j$. For each $j = 1, \dots, n-1$, let $a_{ST}(d_j)$ be a vector such that each $a_{j,k}$ component equals $d_j$ and all other components equal unity. Similarly, if the error component involves $A_i$, let $a_{CT}(c_{j,k})$ be the vector where the $a_{i,j}$ and $a_{j,k}$ components equal $c_{j,k}$, and the $a_{i,k}$ component equals $\frac{1}{c_{j,k}}$; all others equal unity.
To illustrate with $n = 4$, vector $2B_2 = (-2, 0, 0, 2, 2, 0) \in ST^4$ is transferred to $a_{ST}(d_2) = \left(\frac{1}{d_2}, 1, 1, d_2, d_2, 1\right)$ where $d_2 = e^2$. The nonlinearity in the ranked comparison space is captured by the form of $a_{ST}(d_2)$ where a term (here, $d_2$) is the value of some component in the vector but the denominator in another component. In a similar manner the $3C_{1,2,3} = (3, -3, 0, 3, 0, 0)$ cyclic term involves $A_1$ so it is mapped to the $a_{CT}(c_{2,3}) = \left(c_{2,3}, \frac{1}{c_{2,3}}, 1, c_{2,3}, 1, 1\right)$ error term where $c_{2,3} = e^3$. By letting the value of $c_{2,3}$ vary, it becomes clear that an error term has the nonlinear form suggested by Fig. 2.
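Theorem 12 Vector $a \in \mathbb{R}_+^{\binom{n}{2}}$ can be uniquely expressed as $a = a_{ST} \odot a_{CT}$ where $a_{ST} \in CS^n$ and all $n$ of the geometric means of $a_{CT}$ agree. If $A_i$ is involved in the error term, the vectors have the representation
$$a_{ST} = \bigodot_{j=1}^{n-1} a_{ST}(d_j), \qquad a_{CT} = \bigodot_{j,k \ne i;\ j<k} a_{CT}(c_{j,k}).$$

As an $n = 4$ example, which demonstrates the quick and simple computations, consider $a = \left(\frac{5}{3}, \frac{15}{2}, \frac{5}{3}, \frac{3}{2}, 3, 6\right)$. The geometric mean yields the weights $w_1 = \left(\frac{5}{3} \times \frac{15}{2} \times \frac{5}{3}\right)^{1/4}$, $w_2 = \left(\frac{3}{5} \times \frac{3}{2} \times 3\right)^{1/4}$, $w_3 = \left(\frac{2}{15} \times \frac{2}{3} \times 6\right)^{1/4}$, and $w_4 = \left(\frac{3}{5} \times \frac{1}{3} \times \frac{1}{6}\right)^{1/4}$, which, after scaling so that $w_4 = 1$, are $(w_1, w_2, w_3, w_4) = (5, 3, 2, 1)$. Thus $a_{ST} = \left(\frac{5}{3}, \frac{5}{2}, 5, \frac{3}{2}, 3, 2\right)$, and the error term is $a_{CT} = a \odot a_{ST}^{-1}$, where $a_{ST}^{-1}$ inverts each $a_{ST}$ component. (For any $a \in \mathbb{R}_+^{\binom{n}{2}}$, $a \odot a^{-1} = e$.) The error term for the example is
$$a_{CT} = \left(\tfrac{5}{3}, \tfrac{15}{2}, \tfrac{5}{3}, \tfrac{3}{2}, 3, 6\right) \odot \left(\tfrac{3}{5}, \tfrac{2}{5}, \tfrac{1}{5}, \tfrac{2}{3}, \tfrac{1}{3}, \tfrac{1}{2}\right) = \left(1, 3, \tfrac{1}{3}, 1, 1, 3\right),$$
or $a_{CT} = a_{CT}(c_{3,4})$ for $c_{3,4} = 3$.

The structures of $CS^n$ and the error terms follow from this mathematics. Illustrating with $n = 3$, a parametric representation of $CS^3$ is $\left(\frac{s}{t}, s, t\right)$ for $0 < d_1 = s,\ d_2 = t < \infty$. Similarly, weights $w_1$, $w_2$, and $w_3 = 1$ define the point $w^* = \left(\frac{w_1}{w_2}, w_1, w_2\right) \in CS^3$. All possible terms with errors emanating from $w^*$ define the nonlinear error curve
$$\mathbf{u} = \left(\frac{w_1}{w_2} u, \frac{w_1}{u}, w_2 u\right), \quad 0 < u < \infty. \tag{38}$$
It follows from these nonlinearities that projecting an $a \notin CS^n$ to $CS^n$ in order to eliminate $a$'s error will only rarely provide a correct answer.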
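The $n = 4$ computation can be reproduced in a few lines. This is a minimal sketch, assuming the component order $(a_{1,2}, a_{1,3}, a_{1,4}, a_{2,3}, a_{2,4}, a_{3,4})$; the function names `geometric_means` and `decompose` are hypothetical, and floating-point arithmetic is used, so results match the fractions in the text only up to rounding.

```python
import math
from itertools import combinations

def geometric_means(a, n):
    """Geometric mean of each row of the reciprocal matrix built from a."""
    m = [[1.0] * n for _ in range(n)]
    for (i, j), v in zip(combinations(range(n), 2), a):
        m[i][j], m[j][i] = v, 1.0 / v
    return [math.prod(r) ** (1.0 / n) for r in m]

def decompose(a, n):
    """Split a into a consistent part w_i / w_j and the componentwise quotient."""
    w = geometric_means(a, n)
    a_st = [w[i] / w[j] for i, j in combinations(range(n), 2)]
    a_ct = [x / y for x, y in zip(a, a_st)]
    return w, a_st, a_ct

a = [5/3, 15/2, 5/3, 3/2, 3, 6]
w, a_st, a_ct = decompose(a, 4)
scaled = [x / w[3] for x in w]            # rescale so the last weight equals 1
ct_means = geometric_means(a_ct, 4)       # all four should coincide
print(scaled)
print(a_st)
print(a_ct)
```

The rescaled weights come out as approximately $(5, 3, 2, 1)$, the quotient as approximately $(1, 3, \frac{1}{3}, 1, 1, 3)$, and the four geometric means of the quotient agree, as the theorem requires of the error term.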
Theorem 13 If $a \notin CS^n$, the closest point in $CS^n$ to $a$ does not, in general, eliminate $a$'s error term. Indeed, for $n = 3$ and $u \ne 1$, the nearest $CS^3$ point to $\mathbf{u}$ (from Eq. 38) is the accurate $w^*$ if and only if
$$w_1 = w_2^2 \quad \text{and} \quad u = w_1. \tag{39}$$
With $w_3 = 1$, $w_2 = 2$, the projection approach works only in the special case where $w_1 = 4$, and even then only for the single point $u = 4$ on the error curve.

Eigenvalue approaches
It is well known that $a \in CS^n$ if and only if the eigenvector with positive entries from the matrix $((a_{i,j}))$ has an eigenvalue equal to $n$. Otherwise the eigenvalue is larger. This larger value is a direct consequence of the $a_{CT}$ portion of $a$.
This is easy to prove with $n = 3$ by using $\mathbf{u}$ (Eq. 38), which is a general $n = 3$ form of a ratio scaled term with the corrected value of $w^*$. The matrix defined by $\mathbf{u}$ has the eigenvector $(w_1, w_2, 1)$ with eigenvalue (Eq. 40) $1 + u + \frac{1}{u} \ge 3$; equality arises if and only if $u = 1$, which is the error free term. The $n = 3$ eigenvector does yield correct weights, but, in general, this is not true for $n \ge 4$. This is easily verified by using the general form of an $a \notin CS^n$.
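The $n = 3$ eigenvalue claim can be verified directly. The sketch below assumes the error-curve form $\left(\frac{w_1}{w_2}u, \frac{w_1}{u}, w_2 u\right)$ implied by the function $G$ in the proof of Theorem 13; the numerical values of `w1`, `w2`, `u` and the helper names `ratio_matrix` and `apply` are hypothetical choices for illustration.

```python
def ratio_matrix(w1, w2, u):
    """3 x 3 reciprocal matrix whose upper triangle is (w1*u/w2, w1/u, w2*u)."""
    a12, a13, a23 = w1 * u / w2, w1 / u, w2 * u
    return [[1.0, a12, a13],
            [1.0 / a12, 1.0, a23],
            [1.0 / a13, 1.0 / a23, 1.0]]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(r, v)) for r in m]

w1, w2, u = 4.0, 2.0, 3.0          # hypothetical weights and error parameter
m = ratio_matrix(w1, w2, u)
v = [w1, w2, 1.0]                  # candidate eigenvector: the error-free weights
lam = 1 + u + 1 / u                # candidate eigenvalue
mv = apply(m, v)
print(mv, [lam * x for x in v])
```

Both printed vectors coincide up to rounding, so $(w_1, w_2, 1)$ is an eigenvector with eigenvalue $1 + u + \frac{1}{u}$, which exceeds 3 whenever $u \ne 1$.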

Concluding thoughts
Although decision methods based on paired comparisons constitute relatively simple systems, they indicate which features of more general systems must be examined. In all of the difficulties discussed here, the problems reflected a structural incongruity between the system and the subsystems. This comment makes it clear that, in general and in some manner, the information being used from each subsystem must be made consistent with the system's requirements. For methods based on paired comparisons, this objective is achieved by identifying what causes the parts to deviate from the system's requirement of transitivity. This issue is being explored.

Proofs
Theorem 2: For $n \ge 3$ alternatives, let $F$ be a mapping as specified in Theorem 2. For any pair $\{A_i, A_j\}$, let $g_{(A_i, A_j)}$ be a mapping that ranks these two alternatives based strictly on how the $\alpha \ge 2$ sources rank that particular pair. Namely, if $p_{i,j}$ is a list of how each source ranks the pair, let $p$ be a profile over the $n$ alternatives and $\alpha$ sources where each source's ranking of $\{A_i, A_j\}$ is as in $p_{i,j}$. (As $p_{i,j}$ ranks only one pair for each source, it is clear that these rankings can be embedded in a complete transitive profile for the $\alpha$ sources.) Let $g_{(A_i, A_j)}(p_{i,j})$ be the $\{A_i, A_j\}$ ranking of $F(p)$. To establish that $g_{(A_i, A_j)}$ is well defined, let $p^*$ be another profile where each source's $\{A_i, A_j\}$ ranking is as in $p_{i,j}$. Because $F$ satisfies IIA, the $\{A_i, A_j\}$ ranking in $F(p)$ is the same as in $F(p^*)$. $\square$

Theorem 3: To show that $ST^n$ is a linear $(n-1)$-dimensional subspace, notice from Eq. 10 that the component $d_{s,k}$ of a vector in $ST^n$ can be expressed as $d_{s,k} = -d_{1,s} + d_{1,k}$. Thus, $ST^n$ is uniquely and linearly determined by the $(n-1)$ values $\{d_{1,s}\}_{s=2}^{n}$. To show for a given $j$ that $B_j$ is strongly transitive, note that each $d_{j,k} = 1$, while $d_{k,s} = 0$ for $s, k \ne j$. Thus, trivially, $d_{j,k} + d_{k,s} = d_{j,s}$. The reason $\sum_{j=1}^{n} B_j = 0$ is that each $d_{i,j}$, $i < j$, component appears in only two $B_s$ terms. In $B_i$ it has the value $d_{i,j} = 1$, while in $B_j$ it has $d_{j,i} = -d_{i,j} = 1$; in the summation, the two terms cancel. To show that any $(n-1)$ of the terms are linearly independent, remove $B_k$. For each $j \ne k$, $B_j$ is the only vector with a nonzero $d_{j,k}$ component, so it cannot be expressed as a linear sum of the remaining vectors. $\square$

Proposition 1: Let $C_\lambda$ be defined by $\lambda = (j, k, l, m, \dots, y, z)$ that has four or more entries. The first cycle defined by triplets is $(j, k, l)$, where the remaining indices define the cycle $(j, l, m, \dots, y, z)$. In $C_{(j,k,l)}$, $d_{j,l} = -1$ while $d_{j,l} = 1$ in $C_{(j,l,m,\dots)}$. In the sum $C_{(j,k,l)} + C_{(j,l,m,\dots)}$, the $d_{j,l}$ terms cancel leaving $C_\lambda$. The next triplet is from the first three terms in $(j, l, m, \dots, y, z)$, or $(j, l, m)$, leaving $(j, m, \dots, y, z)$. At each step, one new index is involved, so the process continues creating the triplets $(j, k, l), (j, l, m), \dots, (j, y, z)$. $\square$

Theorem 4: To prove that any cyclic vector is orthogonal to a $d \in ST^n$, consider an arbitrary cyclic vector defined by the triplet $(i, j, k)$. The only nonzero components of $C_{\{i,j,k\}}$ are $d^*_{i,j} = d^*_{j,k} = 1$ and $d^*_{i,k} = -1$. Therefore, the dot product with $d$ equals $d_{i,j} + d_{j,k} - d_{i,k}$, which equals zero (Eq. 10). Thus, orthogonality is proved.
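For the dimension of $CT^n$, list the $C_{1,j,k}$ vectors, $2 \le j < k \le n$, as the rows of a matrix. For $n = 5$, with the rows ordered as $C_{1,2,3}, C_{1,2,4}, C_{1,2,5}, C_{1,3,4}, C_{1,3,5}, C_{1,4,5}$, this is
$$A = \begin{pmatrix} 1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & -1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}. \tag{41}$$
The last $\binom{n-1}{2} \times \binom{n-1}{2}$ block always is an identity matrix. Thus, the matrix ($A$ for $n = 5$) has maximal rank, which means that the dimension of $CT^n$ is at least $\binom{n-1}{2}$. But vectors in $CT^n$ are orthogonal to the space $ST^n$, so the dimension of $CT^n$ is no more than $\binom{n}{2} - (n-1) = \binom{n-1}{2}$, which completes the proof. $\square$

This computation means that any cycle involving $A_1$ can be uniquely expressed in terms of $C_{1,j,k}$ vectors. But cycles not involving $A_1$ cannot; this is a consequence of Theorem 9.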
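The two facts used in this dimension count, the identity block and the orthogonality to the $B_j$ vectors, can be checked mechanically for $n = 5$. This is a small sketch with hypothetical helper names (`C1`, `B`), assuming the component order $(d_{1,2}, \dots, d_{1,5}, d_{2,3}, \dots, d_{4,5})$ and that $C_{1,j,k}$ has $d_{1,j} = d_{j,k} = 1$ and $d_{1,k} = -1$.

```python
from itertools import combinations

n = 5
pairs = list(combinations(range(1, n + 1), 2))

def C1(j, k):
    """C_{1,j,k}: d_{1,j} = d_{j,k} = 1 and d_{1,k} = -1, all other components 0."""
    val = {(1, j): 1, (j, k): 1, (1, k): -1}
    return [val.get(p, 0) for p in pairs]

def B(j):
    """B_j: d_{j,k} = 1 for all k != j (stored as -1 in the slot of a pair (k, j))."""
    return [1 if i == j else -1 if k == j else 0 for i, k in pairs]

# Rows C_{1,2,3}, C_{1,2,4}, C_{1,2,5}, C_{1,3,4}, C_{1,3,5}, C_{1,4,5}
A = [C1(j, k) for j, k in combinations(range(2, n + 1), 2)]

tail = [r[n - 1:] for r in A]      # columns for the pairs that do not contain A_1
identity = [[int(r == c) for c in range(len(A))] for r in range(len(A))]
dots = [sum(x * y for x, y in zip(r, B(j))) for r in A for j in range(1, n + 1)]
print(tail == identity, set(dots))
```

The trailing block is the $6 \times 6$ identity, and every dot product with a $B_j$ vector vanishes.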
Corollary 1: The corollary follows from the facts that $\{B_j\}_{j=1}^{n-1}$ is a basis for $ST^n$ (Theorem 3) and $\{C_{1,j,k}\}_{2 \le j < k}$ is a basis for $CT^n$ (from the proof of Theorem 4 and Eq. 41). $\square$

Theorem 5: In a cyclic vector's defining list, index $j$ is adjacent to two other indices, $i$ that precedes $j$ and $k$ that follows $j$. The cyclic vector, then, has only two nonzero $d_{j,s}$ values of $d_{j,i} = -d_{i,j} = -1$ and $d_{j,k} = 1$. In the sum defining $b_j$ these terms cancel, which proves the assertion.
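To prove Eq. 20, let $d \in ST^n$ and $i \ne j$. The $A_i$, $A_j$ Borda values are (Eq. 19) $b_i = \frac{1}{n} \sum_{s \ne i} d_{i,s}$ and $b_j = \frac{1}{n} \sum_{s \ne j} d_{j,s}$. After removing the $d_{i,j}$ values from both sums the difference is $b_i - b_j = \frac{1}{n}\left[2 d_{i,j} + \sum_{s \ne i,j} (d_{i,s} - d_{j,s})\right]$. Because $d$ is strongly transitive, each $d_{i,s} - d_{j,s} = d_{i,s} + d_{s,j} = d_{i,j}$, so the difference equals $\frac{1}{n}\left[2 d_{i,j} + (n-2) d_{i,j}\right] = d_{i,j}$.

If $A_3$ could be the Condorcet winner, $A_3$ beats $A_1$ and $A_2$, or $d_{3,1} > 0$, $d_{3,2} > 0$. Thus, $d$'s $d_{1,3}, d_{2,3} < 0$, which requires $c > 0$ in $d_{1,3}$ and $c < 0$ for $d_{2,3}$. As this is impossible, the Condorcet winner cannot be Borda bottom ranked. If $A_2$ is the Condorcet winner, $A_2$ beats $A_1$, so $d_{2,1} = -d_{1,2} > 0$, or $0 < (b_1 - b_2) < -c$. For $A_2$ to beat $A_3$, $d_{2,3}$ must be positive, or $(b_2 - b_3) > -c$. This proves Eq. 22.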
For $n \ge 3$, if $A_1$ is the Condorcet winner, then $A_1$ beats all other alternatives, so $d_{1,j} > 0$ for $j = 2, \dots, n$. This means that $b_1 > 0$. Because $\sum_{j=1}^{n} b_j = 0$, there must be some $j$ where $b_j < 0$, which proves that the BR ranking never has the Condorcet winner bottom ranked. Similarly, if $A_2$ is the Condorcet loser, then $d_{2,j} < 0$ for $j = 1, 3, \dots, n$. This forces $b_2 < 0$, so the Condorcet loser $A_2$ is BR ranked below Condorcet winner $A_1$. $\square$

Theorem 7: The first sentence follows from the fact that the image of any $d$ is known, which means that, after expressing the linear process in matrix form, the associated matrix is known; it is a projection mapping that is equivalent to the Borda rule.
As known (e.g., (Saari 2018)), the Borda tally $s_j$ for $A_j$ is the sum of the tally $A_j$ receives in each $\{A_j, A_k\}$ paired comparison election. With $\alpha$ sources, a tied pairwise outcome is $\frac{\alpha}{2} : \frac{\alpha}{2}$. As $d_{j,k}$ is the difference between the $\{A_j, A_k\}$ tallies, $A_j$ receives $\frac{\alpha + d_{j,k}}{2}$ votes. Summing over all $(n-1)$ pairs that include $A_j$ leads to $s_j = \frac{1}{2}[\alpha(n-1) + n b_j]$, or Eq. 23. $\square$

Theorem 8: That the ranking outcome is complete and transitive follows from the fact that it is determined by the $b_j$, $j = 1, \dots, n$, scalars. That IIA is satisfied follows immediately from Eq. 20. $\square$

Theorem 9: The first part of the theorem follows from the following computational result.
Corollary 2 For $n \ge 3$, if $d = \sum_{j=1}^{n} \beta_j B_j^n$, then for each $j$, $b_j = \beta_j - \frac{1}{n}\beta$ where $\beta = \sum_{j=1}^{n} \beta_j$.

To prove the corollary, notice that $d = \sum_{j=1}^{n} \beta_j B_j^n$ becomes $d = (\beta_1 - \beta_2, \dots, \beta_1 - \beta_n, \beta_2 - \beta_3, \dots, \beta_2 - \beta_n, \dots, \beta_{n-1} - \beta_n)$. According to Eq. 19, for each $j$, $n b_j = \sum_{s=1}^{n} d_{j,s} = (n-1)\beta_j - \sum_{s \ne j} \beta_s$. By adding and subtracting $\beta_j$ to this expression, we have $n b_j = n\beta_j - \beta$, which is as specified in Corollary 2. Thus, the $\beta_j$ coefficients and BR values are closely related.
For the proof of the theorem, by dropping $A_k$ all $d_{i,j}$ terms with a subscript $k$ are dropped. If $d \in ST^n$, the only difference between $d$ and $P_k(d)$ are the missing $d_{i,j}$ terms with a subscript $k$. Thus, any triplet $(i, j, s)$ without a $k$ satisfies $d_{i,j} + d_{j,s} = d_{i,s}$ for both $d$ and $P_k(d)$. As $d$ is strongly transitive, so is $P_k(d)$.
Let $\hat{b}_j$ be the BR weight for $P_k(d)$. Thus (Eq. 19) $(n-1)\hat{b}_j = \sum_{s \ne k} d_{j,s}$. By adding and subtracting $\beta_j - \beta_k$ and using Corollary 2, we have $(n-1)\hat{b}_j = n b_j - (\beta_j - \beta_k) = n b_j - \left(b_j + \frac{1}{n}\beta - \beta_k\right) = (n-1) b_j - \left(\frac{1}{n}\beta - \beta_k\right)$. That is, for each $j \ne k$, $\hat{b}_j$ is found by subtracting the same amount $\frac{1}{n-1}\left(\frac{1}{n}\beta - \beta_k\right)$ from $b_j$. This subtraction can force the $\hat{b}_j$ and $b_j$ values to differ, but subtracting the same value from each $b_j$ requires the ranking for each to remain the same.
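Both Corollary 2 and the common-shift argument can be confirmed with exact arithmetic. In the sketch below, $\beta_j$ denotes the coefficient of $B_j$ and $b_j$ the Borda value; the coefficients `beta`, the dropped index `k`, and the helper names `st_vector` and `borda` are hypothetical choices for illustration, and indices are 0-based.

```python
from fractions import Fraction
from itertools import combinations

def st_vector(beta):
    """d = sum_j beta_j B_j, so that d_{i,j} = beta_i - beta_j."""
    return {(i, j): beta[i] - beta[j]
            for i, j in combinations(range(len(beta)), 2)}

def borda(d, alts):
    """Borda values over the alternatives in alts: (1/len(alts)) * sum_s d_{j,s}."""
    n = len(alts)
    return {j: Fraction(sum(d[(j, s)] if j < s else -d[(s, j)]
                            for s in alts if s != j), n)
            for j in alts}

beta = [1, 2, 3, 0]                      # hypothetical integer coefficients
n, total = len(beta), sum(beta)
d = st_vector(beta)

b = borda(d, range(n))                   # Borda values with all alternatives
k = 3                                    # drop the last alternative
b_hat = borda(d, [j for j in range(n) if j != k])
shift = (Fraction(total, n) - beta[k]) / (n - 1)
print(b, b_hat, shift)
```

Each $b_j$ equals $\beta_j - \frac{1}{n}\beta$, and every surviving Borda value moves by the same amount, so the ranking of the remaining alternatives is unchanged.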
The second part of Theorem 9 follows from the $B_j^n$ basis vectors. A cyclic term defined by $\lambda = (\dots, i, j, k, l, \dots)$ is orthogonal to $B_j^n$ because it contains only two $d$ terms with subscript $j$; they are $d_{j,i} = -d_{j,k}$, where the difference in sign ensures orthogonality. Now, if $A_k$ is dropped, then the $d_{j,k}$ term no longer exists, so the vector has a component in the strongly transitive $B_j^n$ direction. Using the same argument shows the projection is not strongly transitive, so it also has a component in the cyclic directions.
Theorem 12: This is a direct consequence of Theorem 10 and the image of the basis vectors of Corollary 1. $\square$

Theorem 13: For the projection to eliminate the error in $a$, $a$ must be situated on a straight line passing through the correct value $a' \in CS^n$ and orthogonal to $CS^n$ (i.e., a tangent space). The derivative of the error function proves it is not a straight line. This can be illustrated with $n = 3$ by using $\mathbf{u}$ (Eq. 38) where $\frac{d\mathbf{u}}{du} = \left(\frac{w_1}{w_2}, -u^{-2} w_1, w_2\right)$. A parametric representation of $CS^3$ is given by $\left(\frac{s}{t}, s, t\right)$ where the general weight for $A_1$ is $s > 0$, for $A_2$ is $t > 0$, and for $A_3$ is 1 for the scaling. Thus the closest point to $\mathbf{u}$ on $CS^3$, where $u \ne 1$, is given by the minimal value of
$$G(s, t) = \left(\frac{s}{t} - \frac{w_1 u}{w_2}\right)^2 + \left(s - \frac{w_1}{u}\right)^2 + (t - w_2 u)^2.$$
Setting the partials equal to zero yields, after dividing by 2,
$$\frac{1}{t}\left(\frac{s}{t} - \frac{w_1 u}{w_2}\right) + \left(s - \frac{w_1}{u}\right) = 0 \quad \text{and} \quad -\frac{s}{t^2}\left(\frac{s}{t} - \frac{w_1 u}{w_2}\right) + (t - w_2 u) = 0.$$
The closest point is the accurate $w^*$, or $s = w_1$, $t = w_2$, iff
$$\frac{w_1}{w_2^2}(1 - u) + w_1\left(1 - \frac{1}{u}\right) = 0 \quad \text{and} \quad \left[-\frac{w_1^2}{w_2^3} + w_2\right](1 - u) = 0.$$
Because $u \ne 1$, the second equation is satisfied iff $w_1 = w_2^2$, which verifies part of Eq. 39. With $w_1 = w_2^2$, multiplying the remaining equation by $u$ and collecting terms gives $u^2 - (1 + w_1)u + w_1 = (u - 1)(u - w_1) = 0$. Because $u \ne 1$, the only solution is $u = w_1$, which completes Eq. 39.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.