1 Introduction

The trend towards more democratic organizational structures, global networking and advanced technical possibilities often leads to strategic managerial decisions being made collectively rather than by single decision makers. Organizations try to exploit the potential strengths of collective decision making, assuming that a group of experts makes better or at least more objective decisions than individuals, who are subject to natural cognitive restrictions. Compared with decision making by a single person, also termed individual decision making (IDM), groups offer a broader base of information, more experience and alternatives, better diversification of the individuals’ cognitive restrictions, fewer evaluation mistakes, and increased acceptance of the solution (Sims 2002; Kreitner and Cassidy 2011). Participation in decision making processes may lead to better utilization of knowledge as well as to a higher level of individual involvement and responsibility for finding an optimal solution to organizational problems (De Haas and Kleingeld 1999). At the same time, however, group decision making carries potential disadvantages such as the domination of the group interaction by single members who are more assertive and vocal than other, quieter people (Kreitner and Cassidy 2011). This creates the need for systematic group decision support.

The group decision and negotiation literature often treats voting procedures as one of the fundamental principles for solving collective decision problems. Voting procedures rest on the premise that decision problems are solved by collectives within the fixed procedures of democratic systems. With majority-oriented voting rules, for example, they make it possible to determine ordinally scaled group preference orders. Areas of application include mass voting situations such as political elections, district or national referendums, or parliaments with a very large number of members, which are limited to voting procedures for practical reasons alone (little time, large group sizes, very heterogeneous experience and knowledge within the group). All these decision contexts have in common that they demand from the decision makers only the competence to judge on a purely ordinal preference level. In addition, the decision is implicitly compressed into a mono-criterion decision problem, so that the decision makers have to mentally analyze, evaluate and assess all aspects relevant to their opinion simultaneously. Despite the overall favorability of such an approach, there are decision situations in which the functionality of voting procedures is not sufficient. One may think of situations in which the decision problem is so complex and extensive that its complete solution requires decomposition and restructuring, an intensive examination of its influencing criteria and dependencies and, furthermore, quantitative preference distances within the ranking of alternatives instead of ordinal preference orders.
The solution of such long-term oriented and highly aggregated strategic decision problems by (a restricted number of) members of a decision group with high individual expert knowledge requires the representation of complex individual information by adequate decision support methods as well as the application of group preference aggregation procedures. Although the well-known and often discussed voting procedures may lead to a group consensus even in such a strategic decision context, a research gap can be identified in supporting strategic group decision making by small expert groups with cardinally scaled and technically ambitious methods.

Moreover, such strategic decision problems are usually determined by multiple, differently dimensioned and partially conflicting goals. The complexity of strategic decision settings therefore requires not only systematic, multi-criteria decision support but also the explicit consideration and systematization of the information, experience and preferences of multiple decision makers, whose participation may significantly improve the outcome in a strategic decision environment.

The analytic hierarchy process (AHP) (Saaty 1980; Saaty and Vargas 2001) and its generic form, the analytic network process (ANP) (Saaty 1996; Saaty and Vargas 2013), are important discrete multiple criteria decision making (MCDM) methods that are gaining importance in research and practice. Apart from their advantages, in particular the ability to handle quantitative as well as qualitative information, efficient software support and the compensatory treatment of criteria, a further important benefit in comparison to other methods is their ability to represent decision processes realistically, i.e. with multiple actors. AHP and ANP thus provide a wide range of extension options for multi-personal decision contexts. Although their application may become very time-consuming for highly complex decision problems, AHP and ANP offer, in contrast to voting procedures, the advantage of cardinally scaled priorities that measure the distances between the alternatives. They therefore offer greater potential for solving the multi-personal decision problem than the ordinal voting procedures mentioned above. Within an MCDM taxonomy, AHP and ANP can be positioned between ordinal voting procedures and approaches that require substitution rates and utility functions to represent the decision maker’s preferences. AHP and ANP differ from the (more complex) utility theory based approaches in that they use cardinal priorities instead of continuous utility functions to approximate individual preference structures. Utility function based approaches determine optimal decision alternatives more precisely, provided that the underlying conditions of utility theory (e.g. substitutability) can be fulfilled.
In situations in which members of a decision group with highly individual expert knowledge have to solve a strategic decision problem collectively under the restrictive conditions of reality, the necessary explication of the relevant utility function(s) will often not succeed. Such decision settings mostly do not provide enough information to construct substitution rates between criteria or to evaluate minimal variations of consequences, and the attempt will often fail because of the decision makers’ cognitive limits. For a theoretical discussion of the important topic of cardinal utility in the context of collaborative group decisions see e.g. Keeney and Kirkwood (1975), Keeney (1976, 2009) as well as Dias and Sarabando (2012). Unlike multi-attribute utility theory (MAUT), a typical representative of the utility function based approaches, both AHP and ANP can be used equally with cardinal and ordinal information on attributes and generate cardinal rankings of the alternatives. AHP and ANP thereby satisfy ambitious requirements while being more robust for practical purposes than MAUT. Although this cardinality is an important scientific topic, it is not mentioned in most research contributions.

Nevertheless, keeping in mind the identified research gap of strategic decision making with well-informed expert groups, the application of AHP and ANP to group decision making (GDM) is not as unproblematic as it seems. In the field of MCDM research, many authors already recognize the immense importance of supporting group decision making. In recent years, numerous methods have been developed, especially for the well-known and often best suited AHP and ANP. Faced with this variety of approaches, however, users in science as well as in practice can lose their orientation regarding which group preference building methods are available and which is appropriate for the decision making context concerned. A significant lack of systematization and assessment in scientific publications can therefore be identified. Against this background, the research gap can be specified as the lack of a structured view of existing approaches and of recommendations on how to use them. As a vast number of aggregation approaches exist, the aim and contribution of this research is, on the one hand, a comprehensive literature review and, on the other hand, an evaluation of selected approaches to group preference building within the AHP and ANP with regard to the requirements of the specific group decision problem to be solved.

The structure of this paper is as follows. Section 2 will point out the importance of AHP/ANP especially in the field of strategic GDM. Section 3 is focused on a literature review of the various approaches applicable for AHP/ANP group aggregation. The more detailed presentation and evaluation of selected approaches is the topic of Sect. 4 before the results of the evaluation and the comparison are summarized in Sect. 5. The paper ends with concluding remarks and future prospects in Sect. 6.

2 AHP/ANP for Supporting Strategic Decisions

The AHP is an MCDM method for dealing with multiple goals and multiple criteria in complex decision settings (Saaty 1980; Dyer and Forman 1992; Saaty 2001b). Its procedure comprises the application of analytical methods, the decomposition and hierarchical structuring of the decision problem and a systematic, comprehensive framework. The AHP is based on three principles: decomposition, comparative judgments and synthesis of priorities (Saaty 1986; Dyer and Forman 1992). As a generic form of the AHP, the ANP provides further possibilities for structuring and analyzing complex strategic decision problems (Saaty and Vargas 2013). In contrast to the hierarchically structured AHP, the ANP facilitates explicit modeling of dependencies and feedback. Problems of standard complexity should be solved with the AHP, whereas increasing interconnection between the elements of the problem calls for the ANP. Both the AHP and the ANP methodologies have been the subject of various books and papers (see e.g. Dyer and Forman 1992; Forman and Gass 2001; Saaty 2001b, 2009; Saaty and Vargas 2013).

To underpin the findings of past bibliometric investigations (see e.g. Vaidya and Kumar 2006; Ho 2008; Sipahi and Timor 2010; Hülle et al. 2013), we perform a comprehensive bibliometric study (see e.g. Van Raan 2003; White 2004; Jarneving 2005) for this paper in order to highlight the continuing rise in importance of AHP and ANP within the literature. As we use the Business Source® Complete database via EBSCOhost, SciVerse® ScienceDirect and Thomson Reuters Web of Science™ to achieve a maximum scope of literature, our analysis is considerably broader in scope than the other bibliometric studies in this field. Our time horizon covers the decade from 2004 to 2013 (database accesses for 2004–2011: July 16, 2012; for 2012: January 28, 2013; and for 2013: January 30, 2014). We consider all available publications within the databases. To obtain significant results, the titles, key words and abstracts of the relevant journal contributions were searched for “analytic hierarchy process”/“analytical hierarchy process” and “analytic network process”/“analytical network process”. Because we use three databases, inconsistencies in the databases’ coding as well as duplicates had to be corrected manually. Figure 1 shows the synthesized and manually adjusted results.

Fig. 1 Number of AHP and ANP publications by year (bibliometric analysis)

The time distribution of the AHP and ANP publications shows clear growth, which can be interpreted as a rising importance of both approaches in research and practice (given the large proportion of case studies). Among all MCDM methods, the AHP has shown the highest number of publications since the early 1990s (Wallenius et al. 2008; Sipahi and Timor 2010; Hülle et al. 2011). Potential fields of application range from selection, evaluation or allocation to cost-benefit analysis and forecasting problems. Moreover, the AHP and ANP are beneficial tools for supporting strategic decision settings (Vaidya and Kumar 2006). Although the following results may be transferable to the ANP, the subsequent sections primarily focus on the AHP for several reasons: the AHP is more widely known, its hierarchical structure is intuitively easier to understand for inexperienced users, and because of its simplicity it is better suited to illustrating and evaluating group aggregation techniques. To address the identified research gap by recommending an approach for group decision making in small expert groups, a differentiated analysis of the AHP publications was subsequently conducted. The articles published in the first and the last year of the examined time horizon were further analyzed as to whether they assume an individual or a multi-personal decision making context (e.g. an explicit GDM, a survey of different experts or at least a group-based identification of decision parameters). Figure 2 shows the results of this analysis as the total number of publications and the proportion of GDM publications as a percentage of that total in 2004 and in 2013.

Fig. 2 Total number and proportion of GDM on total number of AHP publications in 2004 and 2013 (bibliometric analysis)

A comparison of the first and the last year examined in the bibliometric analysis shows that both the total number and the relative proportion of GDM publications increased over time. This indicates that GDM remains relevant and that scientific interest in solutions and applications of decision support methods in a multi-personal environment persists. Against the background of an increasing number of AHP publications in general (see again Fig. 1), it is noticeable that, despite a growing range of topics in AHP-related research, the share of GDM articles not only remains stable over time but even increases from 8.84 to 10.51 %. A total of 80 publications embedded in a group decision making context in a single year emphasizes the relevance of adequate multi-personal decision support and confirms the identified research gap. Furthermore, comparing 2004 with 2013, the key words of the AHP publications as a whole have become more varied. From a realistic point of view, the increase of the GDM proportion is therefore even higher than the nominal figures suggest.

The more detailed bibliometric analysis showed that most strategic decision problems solved by AHP have in common that more than one decision maker needs to be included in order to cope with the inherent complexity and to achieve a broader, holistic view of the decision environment. This poses the challenge of considering the individual expert judgments in the desired manner. As there is no generally accepted way of aggregating individual preferences/judgments into a consensus, the aim of this paper is to perform a comparative analysis of the different methods and “inside-approach” possibilities for aggregating the individual expert preferences into a GDM solution. “Inside-approach” possibilities are the variants of aggregation procedures within a certain approach, for example the option of using either the arithmetic or the geometric mean method within the approach of aggregating individual judgments. In order to derive recommendations for varying demands within different decision settings, the subsequent section starts with a categorization of various group aggregation techniques. The most appropriate methods are then selected and evaluated with regard to certain criteria.

3 Literature-Based Overview of Group Aggregation Techniques for AHP/ANP

Although AHP and ANP were originally developed for an IDM context, both approaches have various methodological extensions for coping with a strategic multi-person GDM context. There are numerous group aggregation techniques that can be adapted to different group decision situations, taking into account individual as well as shared values (Dyer and Forman 1992; Aull-Hyde et al. 2006). Depending on the particular decision context, aggregation techniques differ strongly in complexity, in the mathematical auxiliary methods available, in the consideration of risk and in the appropriate structure of the decision problem. Apart from striving for a consensus by discussion, which is possible when all group members basically have similar objectives, the basic option is to intervene in the decision making process at various stages and to aggregate either individual judgments or individual priorities (Dyer and Forman 1992). However, the structure of the group should be analyzed before formal aggregation techniques are applied. In some decision settings it is appropriate to waive the assumption of equally weighted group members in favor of a differentiated, relative weighting of votes. Marked differences in knowledge, experience, management level or competence between the decision makers should be reflected in their respective influence on the overall ranking of the alternatives. To deal with unequal decision power among group members, a supra decision maker could be appointed who has the competence to assign weights to each group member exogenously. Where such a supra decision maker does not exist or is not acceptable to the participants, the group can try to allocate the voting power endogenously, for example by adding a hierarchy level in which the relative power of each group member is judged.
Thereby, decision-relevant evaluation criteria could be decision makers’ personality traits such as perceived intelligence, experience, and power or management level. All participants could then be compared according to their relative influence with respect to underlying evaluation criteria (Saaty 1989, 2001a; Ramanathan and Ganesh 1994).

Having determined the weightings \(w^{r}\) (where \(r=1,\ldots,R\) and \(\sum_{r=1}^{R} w^{r}=1\)) for all \(R\) decision makers, the subsequent step is the selection of an appropriate formal aggregation technique. As indicated above, the two fundamentally accepted aggregation procedures within AHP and ANP GDM are the aggregation of individual judgments (AIJ) and the aggregation of individual priorities (AIP).

If the group structure is homogeneous and the decision makers are willing to act like one single individual, a synergistic AIJ is possible. Each decision maker conducts the pairwise comparisons individually. Afterwards, the (weighted) geometric mean method (WGMM) can be used to obtain the group judgment for each entry of the comparison matrices (Saaty 1989; Forman and Peniwati 1998). The arithmetic mean should not be used here, because it violates the reciprocal property (one of the power conditions): the resulting collective pairwise comparison matrices are in general no longer reciprocal (Aczél and Saaty 1983).
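
The entry-wise aggregation described above can be sketched in a few lines. The following is a minimal illustration, not part of the original study: the function names, the three comparison matrices and the weights are hypothetical and serve only to show that the weighted geometric mean keeps the group matrix reciprocal, whereas the weighted arithmetic mean does not.

```python
import numpy as np

def aggregate_judgments_wgmm(matrices, weights):
    """AIJ: entry-wise weighted geometric mean of pairwise comparison matrices."""
    matrices = np.asarray(matrices, dtype=float)   # shape (R, n, n)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()              # enforce sum of weights = 1
    # prod_r a_ij(r) ** w_r, evaluated in log space
    return np.exp(np.tensordot(weights, np.log(matrices), axes=1))

def is_reciprocal(matrix, tol=1e-9):
    """Check a_ij * a_ji = 1 for all i, j."""
    return np.allclose(matrix * matrix.T, 1.0, atol=tol)

# Hypothetical judgments of three decision makers on three elements
dm1 = np.array([[1, 3, 5], [1/3, 1, 2], [1/5, 1/2, 1]])
dm2 = np.array([[1, 1/2, 4], [2, 1, 6], [1/4, 1/6, 1]])
dm3 = np.array([[1, 2, 2], [1/2, 1, 1], [1/2, 1, 1]])
weights = [0.5, 0.3, 0.2]                          # hypothetical weights

group_gm = aggregate_judgments_wgmm([dm1, dm2, dm3], weights)
# Weighted arithmetic mean of the same entries, for comparison only
group_am = np.tensordot(np.asarray(weights), np.stack([dm1, dm2, dm3]), axes=1)

print("geometric mean reciprocal: ", is_reciprocal(group_gm))
print("arithmetic mean reciprocal:", is_reciprocal(group_am))
```

The group priorities would then be derived from `group_gm` with the usual AHP prioritization step, which is omitted here.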

In case of a decision context attended by a conflict of interests, wherein the group members are individually acting with their own value systems, a consensus may be reached by using AIP. After each decision maker comes to his independent AHP or ANP based ranking, the resulting individual priorities are aggregated to a final group preference using either a (weighted) arithmetic (WAMM) or a (weighted) geometric mean method (Van den Honert and Lootsma 1996; Forman and Peniwati 1998).
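
A minimal sketch of AIP, assuming that each decision maker has already derived a normalized priority vector over the same alternatives. The function name, the priority values and the weights below are hypothetical illustrations and are not taken from the case example of this paper; the final renormalization of the geometric variant is likewise an assumption of the sketch.

```python
import numpy as np

def aggregate_priorities(priorities, weights, method="arithmetic"):
    """AIP: combine individual priority vectors into a group priority vector."""
    priorities = np.asarray(priorities, dtype=float)   # shape (R, n)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                  # enforce sum of weights = 1
    if method == "arithmetic":                         # WAMM
        group = weights @ priorities
    elif method == "geometric":                        # WGMM
        group = np.exp(weights @ np.log(priorities))
    else:
        raise ValueError("method must be 'arithmetic' or 'geometric'")
    return group / group.sum()                         # renormalize to sum 1

# Hypothetical individual priorities for four alternatives A1..A4
individual = [
    [0.30, 0.40, 0.10, 0.20],   # decision maker 1
    [0.40, 0.10, 0.20, 0.30],   # decision maker 2
    [0.25, 0.24, 0.11, 0.40],   # decision maker 3
]
weights = [0.5, 0.3, 0.2]       # hypothetical weights

group_wamm = aggregate_priorities(individual, weights, "arithmetic")
group_wgmm = aggregate_priorities(individual, weights, "geometric")
print("WAMM:", np.round(group_wamm, 4))
print("WGMM:", np.round(group_wgmm, 4))
```

Comparing the two result vectors for the same input is a simple way to see how far the choice between WAMM and WGMM affects the group ranking in a given setting.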

Aside from AIJ and AIP, there is a multitude of further techniques for reaching a group consensus (preference). In order to provide an adequate and structured literature review, we use a taxonomy of six categories with comparable subject areas to classify the numerous publications in the field of AHP and ANP GDM. For a transparent presentation, the identified methods and subject areas are categorized by their characteristic features and specific application purposes. The result is a comprehensive literature review that contributes to the AHP literature as well as to the group decision and negotiation literature by providing a clearly structured compendium of possible group aggregation approaches. This compendium simplifies the selection of appropriate methods and may even help the reader to find an interesting topic for further research. The categorized literature review is presented in Table 1. The numbers in square brackets represent the related references in “Appendix 1”.

Category I includes all publications which are based on the basic aggregation procedures and the consideration of direct information. The publications in this category deal with different AIJ and AIP approaches (see e.g. [18] or [20]) or present corresponding case studies (see e.g. [26] or [32]). If a real, practical group decision of high complexity is to be solved, the user will most likely find a suitable method in this first category, which comprises techniques with broad applicability. In addition, this category offers the potential to support extensive case studies in a GDM context. It therefore probably has the closest relation to real decision problems.

Table 1 Literature review of various approaches applicable for AHP/ANP group aggregation

The concepts listed in category II additionally consider indirect information, such as the levels of consistency reached, preferential differences or the distributions of the preference structure. Cho and Cho ([13]), for instance, use the consistency information (especially the consistency ratio) as well as an evaluation reliability function to define a so-called “loss function approach”, which allows the group preference to be identified. Huang et al. ([25]) provide a “Group aggregation on preferential differences and rankings”. A further approach is proposed by Escobar and Moreno-Jiménez ([19]), who suggest conducting the group aggregation on the basis of individual preference structures. Compared with category I, these methods can still be related to real decision problems, but their application is technically more demanding, more time-consuming and requires more information, which is not always available in a concrete decision situation.

Category III comprises methodologies which suggest deriving and/or aggregating priorities using mathematical extensions, such as the Bayesian prioritization procedure (see e.g. [2] and [3]), the Maximum Likelihood method ([7]) or other statistical methods ([6]). A further part of this category is the explicit consideration of uncertainty in preference articulation by using fuzzy approaches. There are different variants of combining Group AHP/ANP with fuzzy set theory, e.g. Group AHP with fuzzy preference relations ([42]) or Group Fuzzy-AHP and Goal Programming ([47]). While the application of fuzzy approaches may be useful for special kinds of practical decision cases, the other subgroups of category III are more relevant for researchers than for practical users. These techniques require considerable mathematical know-how and are often applicable only under strict assumptions.

Publications in category IV describe extensions of the AHP/ANP, especially group preference building approaches in combination with additional Operations Research (OR) methods, for instance GDM with AHP and Data Envelopment Analysis ([22]; [43]) or AHP and Goal Programming ([10]; [29]; [47]). This category thus emphasizes again the multifaceted application possibilities and shows that both the AHP and its group aggregation techniques can be combined with a wide variety of other decision support and analysis methods.

Approaches to handle the group preference building in the multiplicative AHP are the subject of category V. While Van den Honert and Lootsma ([41]) as well as Srdjevic ([36]) describe the basic aggregation procedures AIJ and AIP in the multiplicative form of the AHP, other publications suggest again a Bayesian approach ([3]; [21]) or a stochastic group preference model for the multiplicative AHP ([39]). The degree of complexity varies within this category which strongly relates to the multiplicative variant of the AHP. Therefore, the transferability to the application of ANP is quite limited.

The final category VI contains no independent aggregation procedures but summarizes publications which focus on general multi-personal particularities in the context of AHP/ANP applications, for example consistency analysis, social choice axioms ([1]) or approaches for deriving group members’ weights ([33]; [40]). This category may be of special interest to mathematical and theoretical researchers.

Summarizing the above results, the categorization of relevant publications illustrates the wide range of possible AHP and ANP group decision support options. The variety of approaches and methods found while analyzing the database of the bibliometric analysis in Sect. 2 could be structured in a clear and easily understandable way. It gives both users and researchers the opportunity to discover a research gap of their own or to find pointers to an aggregation technique relevant to their specific interest. Although there are indications that the first two categories may fit best to the small expert group based strategic decision context defined in Sect. 1, no recommendation as to which method is superior to the others can be found yet. In terms of our research aim, the first part, answering the question of which methods exist, is thus complete. The second part, evaluating which of the identified and categorized methods should be used for solving highly complex decision problems in a small expert group, is still open and is therefore the subject of the subsequent Sect. 4. As stated above, the approaches of the first two categories are in particular focus for the subsequent representation and evaluation because they are easier to operationalize in practice. Even though some of the approaches in categories III to V seem theoretically very sound, they have practical limitations with respect to the time required and the demanding prerequisite of deeper insight into complex mathematical techniques and additional OR methods. Furthermore, a manageable number of evaluation alternatives has to be selected.
Therefore, the AIJ and AIP methods (category I) as well as the loss function approach and the group aggregation on preferential differences (category II) are chosen for a more detailed representation and evaluation in order to identify an appropriate method to support GDM in the AHP.

4 Evaluation

4.1 Case Example Description

The following sections are based on a case example with four test scenarios to facilitate a better understanding of the formal representation as well as the subsequent evaluation of the selected aggregation techniques. To show the particularities of the aggregation methods we use the AHP procedure, as it is more popular and easier to comprehend. All results apply equally to the ANP.

The developed case is as follows. An interdisciplinary group consisting of three decision makers \((\textit{DM}_r;\ r\in [1,2,3])\) from different departments of the company is assumed. For the purpose of the subsequent evaluation it is assumed that the group has already agreed on a common AHP decision model as well as on the group members’ weights \((w^{1}=0.5;\ w^{2}=0.3;\ w^{3}=0.2)\). To solve the decision problem, the group has the choice between four alternatives \(A_{i},\ i\in [1,\ldots,4]\). The structure of the AHP model is represented in Fig. 3. The decision hierarchy consists of three criteria \(C_j\) where \(j\in [1,2,3]\), whereby criterion \(C_2\) is further operationalized by three subcriteria \(SC_j\), \(j\in [1,2,3]\).

Fig. 3 AHP reference model

Additionally, it is assumed that the individual preferences and judgments concerning the evaluation of criteria as well as of alternatives diverge. Table 2 shows the global individual priorities \(P_i^{gl} \left( {\textit{DM}_{r } } \right) \) and the resulting individual rankings of alternatives for test scenario 1 which acts as a basic or reference scenario for further variants of evaluation.

Table 2 Individual global priorities for the alternatives (test scenario 1—reference scenario)

While \(\textit{DM}_1\) gives the highest priority to alternative \(A_2\), \(\textit{DM}_2\) ranks \(A_2\) in last position and prefers alternative \(A_1\). \(\textit{DM}_3\) is nearly indifferent between \(A_1\) and \(A_2\) and would, in turn, prefer alternative \(A_4\). There is obviously agreement on the relative assessment of alternatives \(A_1\) and \(A_3\), because all group members prefer \(A_1\) in direct comparison to \(A_3\). The conflicting preferences emphasize the need for transparent and systematic decision support with particular regard to AHP and ANP group preference building methods. To evaluate the aggregation techniques by comparison, four test scenarios are assumed whose specific characteristics and resulting individual priorities can be found in “Appendix 2”. Test scenarios 2 to 4 are based on reference scenario 1 and mainly differ in the intensity of individual priorities and/or the rank order of alternatives. In test scenario 2, for example, all decision makers still prefer alternative \(A_1\) to \(A_3\) but the individual priorities are closer to each other, while in test scenario 3 even a constant ratio of preferences \({\upmu}_{1,3}\) between alternatives \(A_1\) and \(A_3\) is observable. Test scenario 4 focuses on a constant ratio of preferences \({\upmu}_{4,3}\) between alternatives \(A_4\) and \(A_3\).

4.2 Evaluation Criteria

In order to ensure a reasonable and understandable evaluation of appropriate group aggregation techniques, the subsequent analysis is based on three specific evaluation criteria: type of the decision environment, consistency and social choice axioms. To reduce complexity, the number of evaluation aspects is limited to these three criteria groups, without any claim to completeness. Against the background of a small and well-informed expert group that has to solve a complex strategic decision problem (see Sect. 1), the selection of these evaluation criteria results from specific characteristics of the defined decision situations (criterion 1) as well as from fundamentals of group decision making (criteria 2 and 3).

The criterion “type of decision environment” relates to the characteristics of the kind of decision problem considered and makes it possible to answer the question of whether an aggregation method such as AIJ or AIP is suitable for a specific type of decision situation, for example for a small heterogeneous group with five members. It thus describes the environmental conditions, i.e. the decision contexts, under which the aggregation techniques can be applied. The operationalized subcriteria are the group size (small groups with two to five persons or large groups with more than five persons) and the decision making situation. The subcriterion decision making situation is further divided into three types: situations (a) with mainly common objectives (all group members have basically the same objectives), (b) with diverging objectives (the group members have non-common or partly hidden individual objectives) and (c) with conflicting objectives (the group members see themselves as opponents) (Dyer and Forman 1992).

Consistency represents a necessary condition for ensuring rational decisions. Since it ensures the reliability of the results determined with AHP, consistency can be seen as an essential requirement of AHP application in both IDM and GDM. Therefore, the group aggregation techniques are evaluated with regard to an explicit consideration of consistency on the one hand and to the capability to improve the individual consistency levels on the other hand. The subcriterion “consideration” means that consistency is explicitly taken into account in the respective method. The subcriterion “improve” examines whether an aggregation technique is able to improve on the consistency levels reached by each individual by building a group consensus.
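To make the notion of a consistency level concrete, the following minimal sketch computes Saaty's consistency ratio for a single pairwise comparison matrix. It is an illustration under stated assumptions, not the procedure used in the case example: the definition CR = CI / RI with CI = (λ_max − n)/(n − 1) and the random index values are the commonly published ones and are not given in this section, and the function names and both example matrices are hypothetical.

```python
# Saaty's random indices (RI) by matrix order. These are the commonly published
# values and an assumption of this sketch; the section itself does not list them.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def principal_eigenvalue(matrix, iterations=200):
    """Approximate the largest eigenvalue of a positive matrix by power iteration."""
    n = len(matrix)
    vector = [1.0 / n] * n  # start from a uniform vector that sums to 1
    eigenvalue = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(product)  # equals lambda_max once the vector has converged
        vector = [value / eigenvalue for value in product]
    return eigenvalue


def consistency_ratio(matrix):
    """Return CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # reciprocal matrices of order 1 or 2 are always consistent
    consistency_index = (principal_eigenvalue(matrix) - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments: a consistent matrix built from the weights (0.5, 0.3, 0.2) ...
example_weights = [0.5, 0.3, 0.2]
consistent = [[wi / wj for wj in example_weights] for wi in example_weights]
# ... and a circular matrix in which 1 beats 2, 2 beats 3, but 3 beats 1.
circular = [[1.0, 2.0, 0.5], [0.5, 1.0, 2.0], [2.0, 0.5, 1.0]]

cr_consistent = consistency_ratio(consistent)
cr_circular = consistency_ratio(circular)
print(cr_consistent, cr_circular)
```

The consistent matrix yields a ratio of practically zero, while the circular matrix ends up at roughly 0.43, far above the threshold of 0.1 that is commonly used as an acceptance limit.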

One of the fundamental principles of GDM is compliance with so-called social choice axioms, which were originally developed for ordinally scaled decision support methods, especially for voting procedures. Due to their great importance for GDM, some of these axioms are taken into account in the subsequent evaluation. The most common social choice axioms were proposed by Arrow (see e.g. Arrow 1978). In detail, these are universal domain (axiom 1), Pareto Optimality (axiom 2), independence of irrelevant alternatives (axiom 3) and non-dictatorship (axiom 4) (Keeney and Raiffa 1976; Arrow 1978; Ramanathan and Ganesh 1994; Saaty and Vargas 2013). Owing to the ordinal character of Arrow’s social choice axioms, adjustments for a cardinal scale are necessary. As these adjustments constitute a comprehensive research objective of Keeney in terms of cardinal utility functions (Keeney and Kirkwood 1975; Keeney 1976, 2009), the four AHP/ANP adapted conditions of Aczél and Saaty (1983) will subsequently serve as the basis of further investigation: the separability condition, the unanimity condition, the homogeneity condition and the power conditions (Aczél and Saaty 1983; Saaty and Vargas 2013).

4.3 Representation and Evaluation of Appropriate Group Aggregation Techniques

4.3.1 Representation

As mentioned in Sect. 3, the AIJ and AIP methods as well as the loss function approach and the Group AHP on preferential differences and rankings (seem to) provide a particular applicability for GDM in AHP. For a better understanding of the subsequent method evaluation, a short description of the formal features of these aggregation techniques is given:

  I. The AIJ procedure (Saaty 1989) is conducted by determining the (weighted) geometric or arithmetic mean of the individual judgments for each entry of the pairwise comparison matrices. If \(a_{i,j}^r \) are the individual judgments of the group members \(\textit{DM}_r \) with \(r=1, \ldots , R\) when comparing element i with element j, the aggregated pairwise comparison judgment \(A\left( {i,j} \right) \) is computed by the weighted arithmetic mean method (WAMM): \(A^{\textit{WAMM}}\left( {i,j} \right) =\mathop \sum \nolimits _{r=1}^R w^{r}\cdot a_{i,j}^r \) or the weighted geometric mean method (WGMM): \(A^{\textit{WGMM}}\left( {i,j} \right) =\mathop \prod \nolimits _{r=1}^R \left( {a_{i,j}^r } \right) ^{w^{r}},\) where \(w^{r}\) is the weight of the group member r.

  II. The AIP procedure (Ramanathan and Ganesh 1994; Forman and Peniwati 1998) is based on the aggregation of the individual resulting priorities \(p^{r}\). The synthesized group priorities \(P\left( {A_i } \right) \) for an alternative \(A_i \) can be obtained by \(P^{\textit{WAMM}}\left( {A_i } \right) =\mathop \sum \nolimits _{r=1}^R w^{r}{\cdot } p^{r}\left( {A_i } \right) \) or \( P^{\textit{WGMM}}\left( {A_i } \right) =\mathop \prod \nolimits _{r=1}^R \left( {p^{r}\left( {A_i } \right) } \right) ^{w^{r}}\).

  III. The loss function approach to group aggregation (LFA) (Cho and Cho 2008) is characterized by aggregating individual judgments using the inconsistency ratio \((CR_r )\). Cho and Cho define the inconsistency ratio “as the loss that impacts evaluation quality” (Cho and Cho 2008) and suggest a concept of a loss function approach as well as a so-called ‘evaluation reliability function’ to determine group priorities under explicit consideration of consistency. The smaller the expected loss X, the lower is the inconsistency of the judgments. The procedure of the LFA is divided into four steps (see Table 3), which consist of the computation of a mean consistency ratio and the variance of inconsistency, the calculation of an expected loss, the determination of a group weight from the evaluation reliability function and, finally, the aggregation of group priorities (Cho and Cho 2008).

Table 3 Steps for application of the LFA
  IV. A further aggregation technique which attempts to provide a more realistic and rational design of group preference building than the mean of judgments is the method Group AHP with aggregation on preferential differences and rankings (Group AHP model) (Huang et al. 2009). Starting from the assumption that it is hard to find a stable and satisfactory consensus by arithmetic or geometric mean method, preferential differences and rankings of the alternatives are explicitly taken into account. Therefore, a collective priority vector for the alternatives is computed by multiplication of two weighting vectors which are on the one hand derived by consideration of the preferential differences and on the other hand by consideration of the individual rankings and rank adjusting factors for each decision maker (for further details of the application procedure see Table 4 and Huang et al. 2009).

Table 4 Steps for application of the Group AHP model
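The formulas given for AIJ (item I) and AIP (item II) can be sketched in a few lines of code. This is a minimal illustration under assumptions, not the implementation behind the case example: all function names are hypothetical, the group member weights are assumed to sum to one, and the example judgments and priorities are invented.

```python
import math


def aij_wamm(matrices, weights):
    """AIJ with the weighted arithmetic mean: A(i,j) = sum_r w^r * a^r_ij."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights))
             for j in range(n)] for i in range(n)]


def aij_wgmm(matrices, weights):
    """AIJ with the weighted geometric mean: A(i,j) = prod_r (a^r_ij) ** w^r."""
    n = len(matrices[0])
    return [[math.prod(m[i][j] ** w for m, w in zip(matrices, weights))
             for j in range(n)] for i in range(n)]


def aip_wamm(priorities, weights):
    """AIP with the weighted arithmetic mean: P(A_i) = sum_r w^r * p^r(A_i)."""
    count = len(priorities[0])
    return [sum(w * p[i] for p, w in zip(priorities, weights)) for i in range(count)]


def aip_wgmm(priorities, weights):
    """AIP with the weighted geometric mean: P(A_i) = prod_r p^r(A_i) ** w^r.

    The result is returned as defined by the formula, i.e. without renormalizing
    the group priorities to sum to one.
    """
    count = len(priorities[0])
    return [math.prod(p[i] ** w for p, w in zip(priorities, weights)) for i in range(count)]


# Hypothetical input: two decision makers with equal weights.
group_weights = [0.5, 0.5]
judgments = [
    [[1.0, 2.0], [0.5, 1.0]],    # DM_1 judges element 1 twice as important as element 2
    [[1.0, 8.0], [0.125, 1.0]],  # DM_2 judges it eight times as important
]
individual_priorities = [[0.6, 0.4], [0.2, 0.8]]

group_matrix_arithmetic = aij_wamm(judgments, group_weights)
group_matrix_geometric = aij_wgmm(judgments, group_weights)
group_priorities_arithmetic = aip_wamm(individual_priorities, group_weights)
group_priorities_geometric = aip_wgmm(individual_priorities, group_weights)
```

With AIJ the group priorities still have to be derived from the aggregated matrix afterwards, whereas AIP starts from priorities that each decision maker has already derived individually.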

The AIJ and AIP methods are suitable for both AHP and ANP. If the LFA is modified so that its specific computations focus on the local priorities instead of the suggested global priorities, the LFA may also be applicable to group aggregation in the ANP. Provided that the decision model has only a small number of decision criteria and interdependencies and that some technical adjustments are made, the idea of the Group AHP model is also transferable to ANP. Apart from pairwise comparison judgments that relate to dependencies between criteria, the elements necessary for the implementation of the Group AHP model may also be identified within the network structure. Judgments relating to criteria can implicitly be captured by the consideration of resulting preference differences and rankings.

Before starting the evaluation of the proposed methods, it is instructive to contrast the amount of information each method uses (Table 5 with symbols \(+\)/\(-\)).

Table 5 Information used by appropriate group aggregation techniques

The AIJ procedure utilizes only pairwise comparison judgments; any other individual information is postulated to be irrelevant (Forman and Peniwati 1998). The AIP procedure considers only the individual (usually global) priorities, so that individual pairwise comparisons are incorporated into the assessment only indirectly. Apart from local priorities, the LFA method uses individual consistency ratios as indicators for the quality of the individual judgments. The Group AHP model makes the greatest use of available information, as it integrates individual judgments and priorities as well as the preferential differences and rankings.

4.3.2 Evaluation

4.3.2.1 Decision context

Owing to the synergistic character of the AIJ procedure, which treats the group as an individual decision maker, it is particularly suited to small groups. In a decision setting with two to five persons, the participants are assumed to interact and influence each other more than would be the case in anonymous groups of larger size (Basak and Saaty 1993). Assuming a greater cohesion of a small group, the synthesis of individual judgments into a common compromise solution in the form of collective pairwise comparison matrices works better with only a few decision makers, but it is conceptually applicable to large groups, too. The AIP procedure as well as the LFA and the Group AHP model are suitable for any group size.

Regarding the proposed types of decision making situations, all selected group aggregation techniques may be used in a common objective environment. In situations with divergent or conflicting objectives, the application of the AIJ procedure is quite difficult, as the synthesis into collective pairwise comparison matrices is often only accepted by the participants if the group is small and homogeneous. Experts with strong self-interests or entrenched personal viewpoints usually do not accept an aggregation of individual judgments into an integrated overall assessment (Saaty 2006). Therefore, in situations with conflicting objectives the AIP procedure, which preserves the identification of personal rankings, can be seen as superior to AIJ. The LFA and the Group AHP model are designed to consider explicitly the different competences, preference differences and divergent rankings of the group members. Therefore, an application to all defined decision making situations is possible. Nevertheless, it should be noted that, apart from the AIP method, all aggregation techniques require an agreement on a common decision model. As a direct consequence, the application in a decision setting with conflicting goals may be restricted in such a way that a supra decision maker is needed who determines the decision model for the whole group.

4.3.2.2 Consistency

With regard to the handling of consistency, the AIJ procedure shows some particularities. The arithmetic form of AIJ (AIJ (WAMM)) generates inconsistent collective pairwise comparison matrices, even if all individual judgments were consistent. This is because arithmetic aggregation violates the property of reciprocity, so that inconsistent matrices arise. As a direct consequence, the use of AIJ (WAMM) must generally be excluded for group preference building with AHP and ANP. In contrast, AIJ (WGMM) provides the particular advantage of increasing the consistency for the whole group: geometric aggregation compensates for single inconsistent matrices and yields consistent group judgments (Aull-Hyde et al. 2006).
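The different behaviour of the two AIJ variants can be reproduced with a small numerical check. The sketch below is illustrative only: the helper names and the two decision makers' priorities are hypothetical, and a matrix is called consistent here if \(a_{i,j}\cdot a_{j,k}=a_{i,k}\) holds for all entries. It only covers the case of perfectly consistent individual matrices; the compensation of inconsistent individual matrices is not demonstrated.

```python
import math


def aggregate_judgments(matrices, weights, geometric):
    """Entry-wise AIJ aggregation with the weighted geometric or arithmetic mean."""
    n = len(matrices[0])
    if geometric:
        return [[math.prod(m[i][j] ** w for m, w in zip(matrices, weights))
                 for j in range(n)] for i in range(n)]
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights))
             for j in range(n)] for i in range(n)]


def is_reciprocal(matrix):
    """Check a_ij * a_ji = 1 for all pairs of elements."""
    n = len(matrix)
    return all(math.isclose(matrix[i][j] * matrix[j][i], 1.0, rel_tol=1e-9)
               for i in range(n) for j in range(n))


def is_consistent(matrix):
    """Check a_ij * a_jk = a_ik for all triples of elements."""
    n = len(matrix)
    return all(math.isclose(matrix[i][j] * matrix[j][k], matrix[i][k], rel_tol=1e-9)
               for i in range(n) for j in range(n) for k in range(n))


def matrix_from_priorities(priorities):
    """Build a perfectly consistent pairwise comparison matrix a_ij = p_i / p_j."""
    return [[pi / pj for pj in priorities] for pi in priorities]


# Hypothetical decision makers with opposing but individually consistent views.
dm_1 = matrix_from_priorities([0.6, 0.3, 0.1])
dm_2 = matrix_from_priorities([0.2, 0.3, 0.5])
equal_weights = [0.5, 0.5]

geometric_group = aggregate_judgments([dm_1, dm_2], equal_weights, geometric=True)
arithmetic_group = aggregate_judgments([dm_1, dm_2], equal_weights, geometric=False)

print(is_reciprocal(geometric_group), is_consistent(geometric_group))
print(is_reciprocal(arithmetic_group), is_consistent(arithmetic_group))
```

In this example the geometric group matrix passes both checks, while the arithmetic group matrix fails both: its entries for elements 1 and 2 are about 1.33 and 1.0, whose product is not 1.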

With regard to the evaluation criterion of an explicit consideration of consistency, only the LFA can be mentioned. Although this concept rests on the demand for an explicit consideration of inconsistency, no direct changes of the consistency levels for the whole group are measurable. This can be ascribed to the missing collective pairwise comparison matrices and the highly complex calculations. Consistency analysis is not an integral part of the remaining aggregation methods. There, the handling of inconsistent matrices is left to the users, who may, e.g., exclude answers that are heavily inconsistent or insist on a repetition of the inconsistent pairwise comparison judgments.

4.3.2.3 Social Choice Axioms of Arrow and Saaty/Aczél

Arrow’s first axiom ‘universal domain’ requires that a group aggregation technique is able to generate a group preference for any particular set of individual preferences. As there are no restrictions on the individual preference values in all presented aggregation procedures the condition of universal domain is satisfied (Ramanathan and Ganesh 1994; Saaty and Vargas 2005).

The ‘Pareto optimality axiom’ says that if all group members prefer alternative \(A_1 \) to alternative \(A_2 \), then the group should prefer \(A_1 \) to \(A_2 \), too (Ramanathan and Ganesh 1994; Van den Honert and Lootsma 1996). Although the statement of this axiom is almost universally accepted, the literature disagrees on whether a certain aggregation technique is able to satisfy the unanimity requirement. While Van den Honert and Lootsma (1996), for example, negate the validity of the Pareto principle for both variants of the AIJ procedure, Saaty (2006) indicates that at least the geometric mean method is able to satisfy the condition of unanimity. However, the evaluation based on the case example of Sect. 4.1 confirms a possible violation of the Pareto principle when aggregating pairwise comparisons (for further confirmation of this result see Ramanathan and Ganesh 1994; Van den Honert and Lootsma 1996; Chwolka and Raith 2001). The individual preference values in Table 2 show that already in the initial situation all group members unanimously prefer alternative \(A_1 \) to \(A_3\) (test scenario 1). To increase the validity of the results, a further test scenario 2 is assumed, in which the individual global priorities for the alternatives \(A_1 \) and \(A_3\) are closer to each other (e.g. \(A_1 \) with 0.161 and \(A_3\) with 0.159 for \(\textit{DM}_1\)). In terms of the Pareto axiom, the evaluated aggregation techniques have to ensure that \(A_1 \) retains the higher priority in the collective ranking, too.
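Checking the Pareto axiom for a given aggregation result amounts to comparing every unanimously held pairwise preference with the group ranking. The following sketch shows one possible way to do this; the function names are hypothetical and the priorities are invented for illustration, they are not the values of Table 2 or Table 6.

```python
def unanimous_pairs(individual_priorities):
    """Return all pairs (i, j) where every decision maker ranks alternative i above j."""
    count = len(individual_priorities[0])
    return [(i, j) for i in range(count) for j in range(count)
            if i != j and all(p[i] > p[j] for p in individual_priorities)]


def pareto_violations(individual_priorities, group_priorities):
    """Return the unanimous pairs (i, j) that the group priorities do not preserve."""
    return [(i, j) for i, j in unanimous_pairs(individual_priorities)
            if not group_priorities[i] > group_priorities[j]]


# Hypothetical priorities of three decision makers for three alternatives (index 0..2).
individuals = [
    [0.40, 0.35, 0.25],
    [0.45, 0.15, 0.40],
    [0.30, 0.50, 0.20],
]

# Every decision maker prefers alternative 0 to alternative 2, and nothing else is unanimous.
agreed = unanimous_pairs(individuals)

# A group result that keeps alternative 0 ahead of alternative 2 ...
compliant_group = [0.38, 0.33, 0.29]
# ... and one that reverses this unanimous preference.
violating_group = [0.30, 0.36, 0.34]

print(agreed)
print(pareto_violations(individuals, compliant_group))
print(pareto_violations(individuals, violating_group))
```

An empty list of violations for one data set shows only that the axiom holds in that instance; as the text notes, a general guarantee needs either a proof or evidence across many preference constellations.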

Table 6 Case example: results of evaluation regarding the Pareto Optimality axiom

Evidently, neither variant of the AIJ procedure can guarantee Pareto Optimality (violation in test scenario 2). The results in Table 6 suggest that, in addition to the AIJ procedure, the LFA cannot ensure the unanimity requirement for every set of individual preference values either (violation in test scenario 2, LFA (WGMM)). A possible explanation could lie in the high sensitivity of both concepts towards changes in preference statements. Ramanathan and Ganesh (1994) see the reasons for the violation of the Pareto Principle by the geometric mean in “its tendency to use the compromise (or aggregated) responses even in the construction of pairwise comparisons matrices [which] (...) probably causes the method to be sensitive to even slight differences in member responses due to the nonlinearities of aggregation using the GMM” (Ramanathan and Ganesh 1994). For the Group AHP model, sufficient evidence neither for nor against compliance with the Pareto Optimality condition can be given, owing to a lack of comparable studies. Only its potential to generate a unanimous group preference order can be verified using the case example (Pareto Optimality is satisfied in test scenarios 1 and 2). Since both the results in the literature (Saaty 2006 for AIJ (WGMM); Ramanathan and Ganesh 1994 for AIP (WGMM); Forman and Peniwati 1998 for AIJ (WGMM) and AIP (WGMM)) and the two test scenarios in the case example confirm that the AIP method fulfills the unanimity condition, a general validity of the Pareto Principle for both variants of the priority aggregation procedure may be assumed.

Owing to the generally known problem of potential rank reversals in the AHP and ANP (see for example Saaty 1994; Forman and Gass 2001; Maleki and Zahir 2013), which arises even for an individual decision maker, the fulfillment of the axiom ‘independence of irrelevant alternatives’ cannot be guaranteed in a GDM context (Van den Honert and Lootsma 1996).

Compliance with the ‘non-dictatorship’ axiom is primarily determined by the constellation of the group and/or the possible existence of a supra decision maker, who has the responsibility to ensure that all group members participate in the collective decision making process. It has to be guaranteed that no participant gets the weight “1” while all others are assigned the weight “0” (Ramanathan and Ganesh 1994). Each of the appropriate aggregation techniques considers all individual preference values (judgments and/or priorities) and additionally provides the possibility to assign different weights to the group members. Therefore, a “dictatorship” of one single decision maker can be prevented and in the given context the axiom of non-dictatorship may be regarded as fulfilled (for a critical discussion of the dictatorship axiom and the work of Arrow see also Dias and Sarabando 2012).

Saaty and Aczél’s separability condition which requires that a separate identification of individual preferences must remain possible in the overall aggregation function, is satisfied by all appropriate aggregation techniques. The conducted evaluation emphasizes that all methods use explicit individual preference values (judgments and/or priorities as well as consistency ratios in the LFA) so that changes in individual evaluations are directly affecting the group ranking of the alternatives. Although the procedure of AIJ generates one single, composed decision maker, the individual influences are maintained because the group preference is based on the individual pairwise comparison judgments.

Since the unanimity condition nearly corresponds with the required Pareto Principle of Arrow the above results are applicable analogously with regard to this evaluation criterion.

In a multi-attribute decision making context the homogeneity condition says that if each individual judges an alternative \(A_1 \) as \({\upmu }\) times as large as another alternative \(A_2 \), then the synthesized judgment on alternative \(A_1\) should be \({\upmu }\) times as large as that on alternative \(A_2 \), too (Aczél and Saaty 1983). To check the homogeneity within the proposed case example, the individual preferences are changed in such a way that in test scenario 3 all decision makers prefer alternative \(A_1 \) to \(A_3\) with a constant factor of \({\upmu }_{1,3}= 2\). Furthermore, in test scenario 4 all group members judge alternative \(A_4 \) as 1.6 times as preferable as alternative \(A_3\,({\upmu }_{4,3} = 1.6)\). The evaluation results presented in Table 7 indicate that only the AIP procedure is able to satisfy this axiomatic requirement (see e.g. Saaty and Vargas 2005 for a confirmation of the results). Contrary to the statements in the literature (e.g. Aczél and Saaty 1983; Saaty 2006), the results based on the case example imply that even with the geometric mean the AIJ approach cannot guarantee homogeneity of the synthesized group judgments.
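Why priority aggregation preserves a constant preference ratio can be seen from a short derivation. It is a sketch under two assumptions: every decision maker's priorities satisfy \(p^{r}\left( {A_1 } \right) ={\upmu }\cdot p^{r}\left( {A_2 } \right) \), and, for the geometric variant, the group member weights sum to one, \(\sum \nolimits _{r=1}^R w^{r}=1\), which the AIP formulas in Sect. 4.3.1 do not state explicitly. Inserting the first assumption into these formulas gives

\[
P^{\textit{WAMM}}\left( {A_1 } \right) =\sum _{r=1}^R w^{r}\cdot {\upmu }\cdot p^{r}\left( {A_2 } \right) ={\upmu }\cdot \sum _{r=1}^R w^{r}\cdot p^{r}\left( {A_2 } \right) ={\upmu }\cdot P^{\textit{WAMM}}\left( {A_2 } \right)
\]

\[
P^{\textit{WGMM}}\left( {A_1 } \right) =\prod _{r=1}^R \left( {{\upmu }\cdot p^{r}\left( {A_2 } \right) } \right) ^{w^{r}}={\upmu }^{\sum _{r=1}^R w^{r}}\cdot \prod _{r=1}^R \left( {p^{r}\left( {A_2 } \right) } \right) ^{w^{r}}={\upmu }\cdot P^{\textit{WGMM}}\left( {A_2 } \right)
\]

A subsequent normalization of the group priorities divides both sides by the same constant and therefore leaves the ratio \({\upmu }\) unchanged.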

Table 7 Case example: results of evaluation regarding the homogeneity condition

Regarding the power conditions of Aczél and Saaty (see e.g. Basak and Saaty 1993 for a formal description), the special case of the ‘reciprocal property’ is of particular importance in the GDM context using AHP or ANP. Analogous to the axiomatic foundation of AHP and ANP (Saaty 1986), reciprocity of the aggregated pairwise comparison matrices is required. While the aggregation of individual judgments using the geometric mean (AIJ (WGMM)) generates reciprocal group judgments, the application of the arithmetic mean (AIJ (WAMM)) violates this property. The evaluations resulting from the case example confirm this hypothesis.

Figure 4 illustrates three exemplary consistent and reciprocal individual pairwise comparison matrices as well as the resulting aggregated matrices, one synthesized with the geometric mean and one with the arithmetic mean. Owing to the emphasized importance of the reciprocal property, the arithmetic aggregation of judgments should not be used (see also Forman and Peniwati 1998, who postulated that for AIJ the “geometric mean must be used”, too). As the remaining methods for group preference building do not use individual pairwise comparison matrices, the evaluation regarding the power conditions has to be modified. The reciprocal property “means that the synthesized value of the reciprocal of the individual judgments should be the reciprocal of the synthesized value of the original judgments” (Basak and Saaty 1993). The results of a corresponding verification using the reciprocals of the individual assessments or of the individual priorities, respectively, are shown in Table 8. Only the AIP procedure using the geometric mean is able to satisfy the reciprocal property as a special case of the power conditions.

Fig. 4 Case example: evaluation of AIJ regarding the reciprocal property

Table 8 Case example: evaluation results regarding the reciprocal property
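The reciprocal property quoted from Basak and Saaty (1993) can be tested directly on priority values: aggregate the reciprocals and compare the result with the reciprocal of the aggregate. The sketch below does this for the two means used in AIP. The function names and numbers are hypothetical and are not the values reported in Table 8; the weights are assumed to sum to one.

```python
import math


def weighted_arithmetic_mean(values, weights):
    """Weighted arithmetic mean: sum_r w^r * x^r."""
    return sum(w * v for v, w in zip(values, weights))


def weighted_geometric_mean(values, weights):
    """Weighted geometric mean: prod_r (x^r) ** w^r."""
    return math.prod(v ** w for v, w in zip(values, weights))


def satisfies_reciprocal_property(mean, values, weights):
    """Check that mean(1/x) equals 1/mean(x) for the given values."""
    reciprocal_of_mean = 1.0 / mean(values, weights)
    mean_of_reciprocals = mean([1.0 / v for v in values], weights)
    return math.isclose(reciprocal_of_mean, mean_of_reciprocals, rel_tol=1e-9)


# Hypothetical priorities that three equally weighted decision makers assign to one alternative.
priorities_for_alternative = [0.2, 0.5, 0.3]
member_weights = [1 / 3, 1 / 3, 1 / 3]

geometric_ok = satisfies_reciprocal_property(
    weighted_geometric_mean, priorities_for_alternative, member_weights)
arithmetic_ok = satisfies_reciprocal_property(
    weighted_arithmetic_mean, priorities_for_alternative, member_weights)
print(geometric_ok, arithmetic_ok)
```

For the geometric mean the check holds for any positive values, because the reciprocal can be pulled out of the product. For the arithmetic mean it fails here: the reciprocal of the mean is about 3, whereas the mean of the reciprocals is about 3.44.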

5 Results and Discussion

To show particular differences with regard to certain decision environments AHP/ANP group aggregation techniques were evaluated with respect to relevant criteria. The evaluation was thereby supported by an AHP based case example and different test scenarios. Table 9 summarizes the identified results of the evaluation for the selected aggregation techniques and facilitates a direct comparison of their respective suitability.

Table 9 Results of comparative evaluation

In addition, Table 10 compares the resulting priorities and alternative rankings for each aggregation method. Obviously, it cannot be guaranteed that all methods indicate the same alternative as the best problem solution. While the procedure of AIJ and the LFA come to the conclusion that alternative \(A_2 \) should be chosen (case example, test scenario 1), the procedure of AIP and the Group AHP model recommend alternative \(A_1 \).

Table 10 Case example: group priorities and resulting rankings of the alternatives

The aim of the performed evaluation is to support the selection of the most suitable aggregation method depending on the underlying decision environment.

To summarize, it can be noted that due to the violation of the indispensable condition of reciprocity and the generation of inconsistent group matrices (even for perfectly consistent individual judgments), the AIJ (WAMM) has to be excluded from any application.

However, the geometric variant (AIJ (WGMM)) shows an acceptable applicability for certain decision problems, although only a small part of the available information is used (only individual judgments, see Table 5) and the fulfillment of both the Pareto Optimality axiom and the homogeneity condition is restricted. The individual decision makers should be willing to act as a synergistic unit and to give up the separate identification of their personal rankings in favor of the collective group preferences. Under such conditions, that is, in a homogeneous group with fixed structures and common objectives, the AIJ (WGMM) provides the advantage of an improvement of the collective consistency level. Despite possibly inconsistent individual judgments, the geometric aggregation into collective pairwise comparison matrices generates a higher or even a complete consistency for the group judgments, so that the quality of the decision can be increased.

The results of our evaluation imply a particular suitability of the AIP procedure. No other aggregation technique is able to fulfill a comparable number of evaluation criteria. AIP provides a wide range of application possibilities, although the amount of information that is used (only individual priorities, see Table 5) is quite small, too. Nevertheless, this priority based aggregation technique can be applied to multi-personal decision making problems regardless of the group size, the group composition or the decision setting. Furthermore, AIP shows great potential to support decisions with diverging or conflicting goals. Since AIP (WAMM) cannot guarantee the fulfillment of the power conditions, the AIP (WGMM) is even more suitable in the sense of a rational group decision support.

The specific characteristic of the LFA can be seen in an explicit consideration of individual consistency as a suitable indicator of rational decision making. However, the advantage of a higher procedural quality is directly linked with the problem of maintaining the transparency and objectivity of the aggregation mechanism for the whole group. In comparison to AIJ and AIP, the results of the LFA seem to be theoretically sound on a higher level and contain more information. Nevertheless, the implementation is more complex and time consuming. Regarding the distinction between the arithmetic and the geometric form of the LFA, the evaluation results do not yield any significant differences. Consequently, both variants of the LFA can be considered as relatively equivalent.

The holistic use of available preference information is seen as an advantage of the Group AHP model. Owing to the lack of comparable evaluations, only tendencies in the results could be identified, which indicate a potential to fulfill the rationality requirement. The objectivity is increased by a formal mathematical procedure. As the application of the Group AHP model requires specific technical know-how to handle the comparatively sophisticated aggregation process, an implementation barrier could arise from a practical point of view. Considering that AHP and ANP aim to increase the transparency and comprehensibility of decision settings, the complexity represents a further potential disadvantage of the Group AHP model. Difficulties especially increase when supporting more comprehensive decision models due to the sophisticated requirements of the concept. Therefore, depending on the underlying decision setting, the mentioned limitations and the advantage of an almost complete utilization of the available individual information should be carefully weighed before applying the Group AHP model.

As a critical appraisal of this research, it should be pointed out that the fulfillment of the consistency criteria was observed through a discrete examination and not by a continuous simulation study. Continuous simulation studies offer opportunities such as the evaluation of reactions to varying consistency levels (see e.g. Aull-Hyde et al. 2006 for a simulation study on the consistency of geometrically aggregated randomized pairwise comparison judgments). For the purpose of a broader, comparative evaluation with 7 objects, however, the use of scenarios seemed more fruitful than a restricted application of the aggregation approaches to a single matrix. The consideration of only one matrix can also be seen as unrealistic from the point of view of practical settings.

A further limitation for the practical implementation of nearly all approaches lies in the field of software support, as no current AHP/ANP Decision Support System provides any advanced feature for GDM. The LFA and the Group AHP model are not supported by the available software at all. For AIJ and AIP, several restrictions occur which depend on the underlying software product.

6 Conclusion

Complex and extensive strategic decision problems are often characterized by a great number of different influencing criteria and dependencies. The solution of such strategic problems therefore needs not only the competences and experiences of a decision group with a sufficient but not too large number of persons who possess highly individual expert knowledge, but also an adequate multi-criteria decision support method. For a multi-personal decision context with quantitative and qualitative criteria, special MCDM methods like AHP and ANP are better suited than the often used ordinal voting procedures. AHP and ANP are important MCDM methods for solving complex, strategic decision problems, which provide the possibility to generate cardinally scaled evaluations as well as a high inherent variety of group aggregation procedures. In contrast to other methods, especially MAUT methods, which also aggregate additively but need a continuous utility function, they have to satisfy less restrictive conditions, thus have a higher tolerance of practical imperfections and can therefore be considered the more robust procedures for the sketched group decision problem.

Based on methodical suggestions in the area of AHP and ANP, their relevance for group decisions was investigated. For this purpose, a comprehensive bibliometric analysis was performed in Sect. 2. First, with respect to the temporal development in the literature, the results underpin the ongoing importance of these methods. Second, a more detailed analysis of the AHP database showed that the share of publications related to a GDM context increased over time, although many new research topics had been developed in the meantime. Due to the great variety of AHP and ANP group aggregation methods, the first aim of this paper was an identification and categorization of AHP- and ANP-based methods supporting GDM problems. To this end, GDM-relevant AHP and ANP literature was investigated and presented in a comprehensive literature review on various aggregation possibilities in Sect. 3, within which six categories were differentiated and explained. As a definite solution for highly complex decision problems in a small expert group could not be identified yet, different approaches of the first two categories were analyzed as the second aim of this research.

To point out the particularities of the selected methods (AIJ (WAMM), AIJ (WGMM), AIP (WAMM), AIP (WGMM), LFA (WAMM), LFA (WGMM) and Group AHP model), the aggregation approaches were evaluated in Sect. 4 on the basis of their fulfillment of derived criteria in the fields of underlying decision context, consistency and social choice axioms. For this purpose, four test scenarios were introduced to take the possible variety of preference combinations into account.

The AIP procedure offers, at low time effort, the highest fulfillment of relevant rationality axioms and the broadest range of applications. It is suitable for small and large groups as well as for decision settings with common and conflicting goals. At the price of a complex and time-consuming implementation, the LFA provides an explicit consideration of the consistency of pairwise comparison judgments. If it can be assumed that the individual decision makers put aside their own interests in favor of a preferably small and homogeneous group, the WGMM variant of AIJ may also be recommended. Its arithmetic variant, however, cannot be used. Although the Group AHP model integrates a high degree of available information, its applicability is rather limited to small groups and simply structured decision problems. So, the AIP remains the most recommendable aggregation technique within the AHP and ANP to support highly complex decision problems in a small expert group.