Is VHB-JOURQUAL2 a Good Measure of Scientific Quality? Assessing the Validity of the Major Business Journal Ranking in German-Speaking Countries

This study examines whether the journal ranking VHB-JOURQUAL 2 can be considered a good measure of the construct 'scientific quality'. Various rankings in business research provide the database for the analysis. The correlations between these rankings are used to assess the validity of VHB-JOURQUAL 2 along various validity criteria. The correlations with rankings that measure the same construct based on different methods show that VHB-JOURQUAL 2 has acceptable, but moderate convergent validity. The validity varies considerably across disciplines, showing that the heterogeneity of business administration is not sufficiently represented by this overall ranking. The variability is related to the variation in members per discipline represented by the German Association for Business Research. Furthermore, the measure shows a weak correlation with acceptance rates as an indicator of nomological validity in some disciplines.


Introduction
This study assesses the validity of the journal ranking VHB-JOURQUAL 2 as a measure of scientific quality. Journal rankings are an empirical means to determine the relative value of publications in a field. They have become a common instrument for evaluating the quality of scholarly work and the academic performance of scholars, schools, and even nations. They determine academicians' careers, promotion of scholars and the prestige of schools to a large extent. Furthermore, journals are the major vehicle for presentation of academic work to the public and journal publications demonstrate the accepted knowledge on which research traditions are founded. The more important a journal, the more its publications influence the visibility and prestige of the stakeholders in a discipline (Lewis, Templeton, and Luo 2007). The apparent importance of these rankings as a measure of scientific quality justifies their careful evaluation, since poor quality of such measures bears severe consequences for career and prestige in the academic world. The most prominent ranking of business research journals in German-speaking countries is VHB-JOURQUAL, the journal ranking of the German Association for Business Research (Verband der Hochschullehrer für Betriebswirtschaftslehre, VHB). It was originally developed with the purpose of providing a joint assessment of German journals and international ones. The second edition of the ranking (VHB-JOURQUAL 2) is based on a survey conducted amongst VHB members in 2008; an update was undertaken in 2011 (VHB-JOURQUAL 2.1). This update integrates journals that were not considered in the 2008 ranking. Given the importance the ranking has achieved amongst German scholars over the years, the data collection and journal evaluation procedure have been continuously refined since the first edition (see Hennig-Thurau, Walsh, and Schrader 2004; Schrader and Hennig-Thurau 2009).
The responsible researchers have carefully performed data collection and analytical procedures and have provided some validation tests to ensure measurement quality. Nevertheless, criticism exists regarding whether the ranking is a good one and serves the intended purpose of measuring scientific quality (e.g., Nienhüser and Ridder 2009). The criticism is taken seriously by the VHB, which has tabled three public panel discussions during its annual meetings since 2003 to respond to such criticism. The current study provides an attempt to assess the validity of VHB-JOURQUAL 2 as a measure of scientific quality that goes beyond the previous validation procedures provided by Schrader and Hennig-Thurau (2009) in at least three ways: (1) the study refers to and compares both VHB-JOURQUAL 2 and VHB-JOURQUAL 2.1, (2) the study applies additional validation procedures, in particular nomological and convergent validity, and (3) the study explores whether VHB-JOURQUAL 2 sufficiently represents discipline heterogeneity in business administration. By this, the study examines the question of whether VHB-JOURQUAL 2 provides a good measure of scientific quality. The outline of the paper is as follows: First, concepts from measurement theory along with how they can be applied to assess the validity of VHB-JOURQUAL 2 are discussed. Second, the database used for the study and with which different validity criteria of VHB-JOURQUAL 2 have been assessed is described. After the results are presented, a brief discussion and implications for scholars and future developments of journal rankings, in particular for VHB-JOURQUAL, follow.

How to assess journal standing: basic concepts and validity
Rankings are based on different constructs and their measures, with journals being the objects of measurement. Journal standing (i.e., journals that are ranked along their quality) is one such construct; citation records or acceptance rates are other constructs rankings are based on. The concept of interest in this study is scientific quality, that is, the adherence to scientific principles such as methodological rigor and substantial relevance (Buchholz 1995). For the purpose of this study, journal standing is treated as a proxy for measuring scientific quality, and therefore differs from other rankings that intend to measure reputation or impact (Schrader and Hennig-Thurau 2009). The construct journal standing is measured using different methods: surveys (either assessment of peers in a field or views of researchers within a particular institution), expert ratings, or hybrid lists (any combination of the former methods). The quality of a measure is commonly assessed by various criteria of validity and reliability. A measure cannot be deemed valid unless it is found to be reliable. Reliability is concerned with the dependability of a measure over successive trials and in different contexts (Cronbach 1951). Schrader and Hennig-Thurau (2009) reported a correlation of r = .94 between VHB-JOURQUAL 2 and VHB-JOURQUAL 1 (i.e., the first VHB journal ranking from 2003, see Hennig-Thurau, Walsh, and Schrader 2004), indicating consistency of the measure over time and thus test-retest reliability. The current study focusses on validity criteria. Validity is concerned with how appropriately a measure represents the concept of interest (Cronbach and Meehl 1955). Approaches to ensure validity in this study are convergent and nomological validity. The following paragraphs describe how these criteria can be specified and translated into comparable criteria that are appropriate to ensure validity of VHB-JOURQUAL 2.

Convergent validity
Convergent validity is the tendency for a given measure to exhibit a strong relationship with other measures of the same concept (Campbell and Fiske 1959). Evidence for convergent validity can be provided by correlations with measures of the same construct (journal standing) measured with different methods (peer survey, institutional survey, expert opinions). In this study, convergent validity is examined by correlating rankings that aimed to measure scientific quality, but applied different methods than VHB-JOURQUAL 2.
For convergent validity to be achieved, the correlations of VHB-JOURQUAL 2 and any of these rankings should be equally high and not differ significantly, while correlations with different constructs such as citation-based impact factors or acceptance rates should be lower, following the idea of discriminant validity, that is, the propensity of a measure to show a low correlation with measures of other concepts (Campbell and Fiske 1959).

Nomological validity
Nomological validity is proven when a measure empirically demonstrates findings consistent with conceptual expectations (Cronbach and Meehl 1955). In this study, acceptance rates and impact factors are distinct and empirically sufficiently discriminant constructs that can conceptually be linked to journal standing.
Previous studies have provided empirical support that acceptance rates differ from scientific quality (e.g., Coe and Weinstock 1984; Van Fleet, McWilliams, and Siegal 2000). Since the quality of submitted papers differs across journals, acceptance rates are not unambiguously linked to the quality of the papers that are eventually published in a journal. For instance, a top journal can receive high-quality papers only, whereas a lower-ranked journal might primarily receive papers that have been rejected by higher-ranked journals. The number of submissions to both journals might be the same and thus also the acceptance rate, but the quality of publications and journals obviously differs. However, VHB-JOURQUAL 2 can be argued to serve as a predictor of acceptance rates. VHB-JOURQUAL 2 is employed for evaluating candidates for academic positions. Candidates will therefore try to maximize their VHB-JOURQUAL 2-based output as efficiently as possible. A rational strategy for candidates is to submit to journals with low rejection rates and high VHB-JOURQUAL 2 ranking, which would drive up rejection rates for these journals until eventually equilibrium is reached. The combination of both high acceptance rates and high VHB-JOURQUAL 2 ranking positions for particular journals provides opportunities for exploitation, which would question the value of VHB-JOURQUAL 2 in case candidates are being evaluated based on these journals. Consequently, correlations between VHB-JOURQUAL 2 and acceptance rates are an indicator of nomological validity.¹ A similar rationale applies to impact factors. Impact factors are based on citations and measure reputation and impact.
Impact factors tap different constructs than scientific quality, because impact factors are influenced by many factors beyond scientific quality such as article type, topic, citation cartels, or even severe shortcomings of an article that might drive up the citations to that article (Garfield 2006; Rost and Frey 2011; Schrader and Hennig-Thurau 2009; Seglen 1997).
Still, scientific quality is one of the predictors of impact: if articles in a journal provide only papers with low relevance and rigor, they are less likely to become groundwork for other studies and might thus receive fewer citations than journals that provide papers with high relevance and rigor. This relationship should be reflected by a journal ranking that measures scientific quality. If this conceptual relationship does not hold for VHB-JOURQUAL 2, the ranking would motivate scholars to publish high-quality papers, but the quality would then not relate to the impact of the papers.

Comparison of rankings by discipline
An overall ranking of business administration journals is affected by discipline heterogeneity in business administration. That is, an overall ranking is not only influenced by scientific quality, but also by many factors such as the number of scholars active in a field, the homogeneity/heterogeneity of journal evaluations per discipline or the relative standing of a discipline in a country.² Comparing overall and cross-discipline rankings from other countries with VHB-JOURQUAL 2, as is done in this study, can therefore bias overall validity estimates. Furthermore, while journal standing is actually intended to measure heterogeneity of scientific quality, it is not intended to compare the relative value of disciplines. Therefore, the validity of VHB-JOURQUAL 2 as a measure of scientific quality might lead to different results when comparing the overall ranking and the rankings for each discipline separately. The following analysis is performed for all journals together, as well as separately for the journals of the disciplines as identified by VHB-JOURQUAL 2, in order to address the question of the validity of VHB-JOURQUAL 2 in different disciplines.

¹ The author thanks a reviewer for pointing out and explaining this relationship between VHB-JOURQUAL 2 and acceptance rate.
² The author thanks a reviewer for pointing out this problem and for providing an interesting opportunity for further exploration of the dataset.

Data
Data for this study were retrieved from four databases: (1)  A description of the rankings is provided in Table 1.
To compare rankings by discipline, the study refers to the disciplines as identified by Schrader and Hennig-Thurau (2009), who distinguished between 16 disciplines in business research. These disciplines correspond to the commissions of VHB. Since the journals that were additionally included in the new ranking VHB-JOURQUAL 2.1 have not yet been assigned to these disciplines, they were assigned independently by two coders (one of them the author of this study); agreement was achieved in 96% of the cases and the remaining cases were resolved after discussion with experts. Hence, another 101 journals could be assigned to the disciplines. In the following analysis, only ten disciplines are considered, excluding disciplines with a low number of journals because these data would be insufficient for the purpose of the analysis. Table 2 provides an overview of the disciplines and information on inclusion and exclusion of the disciplines in further analysis. Table 3 presents the matrix of correlations between the different rankings over all fields in business administration (here: overall ranking). The matrices of correlations between the rankings for each of the ten selected disciplines are presented in Appendix A (Tables 7 to 16). The correlation matrix is ordered along the different methods that were used for each ranking and within each method along publication year, in order to ease interpretation of the correlation coefficients for the purpose of validity assessments.

Results
Because most of the data are ordinal, the nonparametric Spearman's rho correlation coefficient was used. It also allows correlations between ranks and continuous variables (e.g., impact factor) to be computed and interpreted in a meaningful way.
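To illustrate how a rank-based coefficient handles a ranking position and a continuous variable together, the following minimal sketch computes Spearman's rho as the Pearson correlation of rank-transformed data, with tied values sharing their average rank. The function names and the six data points are hypothetical and are not taken from the study's dataset.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical illustration: ranking positions (1 = best) and impact factors.
rank_position = [1, 2, 3, 4, 5, 6]
impact_factor = [4.1, 3.9, 1.2, 2.5, 0.8, 0.9]
rho = spearman_rho(rank_position, impact_factor)
```

In this made-up example rho is negative because a better ranking position is a smaller number, so the sign of a coefficient depends on how the ranking is coded.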

Validity of overall ranking
In order to check for convergent validity as assessed by measures of the same construct based on different methods, differences between correlations of JQ 2 (VHB-JOURQUAL 2) with any other journal ranking are compared. As for a convergent validity test, it is more appropriate to focus on rankings from the same period (2008 and 2009) since several factors can cause rankings from remote periods to differ and a comparison might therefore bias the validity test (e.g., the quality of particular journals might indeed have changed over time and thus variations are not only due to validity issues). The figures related to rankings from the same period are provided in the gray shadowed area in Table 4.
The results show only one significant difference between the correlations of JQ 2 with AST 08 and the correlation of JQ 2 with CNRS 08. All other correlations of JQ 2 with any of the rankings are equally high, supporting convergent validity. At the same time, the correlations with journal rankings based on surveys and expert opinions are significantly higher than correlations with impact factor and acceptance rate, thereby providing an indication for discriminant validity.
Relying on significance tests assesses only minimum requirements for validity. Unfortunately, the literature does not provide numerical threshold values for construct validity assessments, but suggests focusing on an interpretation of explained variances (Cronbach and Meehl 1955).
The correlation values suggest that on average 33% (max. of 48% and min. of 21%) of variance is shared between JQ 2 and any other ranking, meaning that more than 50% of variance is to be explained by factors other than a common understanding of the concept that should be captured by these rankings. Whether these figures are appropriate values for convergent validity can further be assessed by looking at the correlations that the remaining five rankings from the same period exhibit when correlating them with each other. The average correlation of JQ 2 with these rankings is r = .616 (explained variance = .379). The figures (average values) when correlating the remaining rankings with each other are: CNRS 08: r = .602 (explained variance = .362), ABS 09: r = .692 (explained variance = .479), ABDC 08: r = .699 (explained variance = .488), AST 08: r = .660 (explained variance = .435), CRA 09: r = .678 (explained variance = .460). That is, four out of five rankings reveal a better common understanding of the underlying concept of scientific quality and show more convergent validity than JQ 2. All in all, though, the unexplained variance of more than 50% shows that these rankings do not succeed in arriving at a consistent journal ranking, despite following rigorous methods. The understanding of scientific quality may simply be too heterogeneous to develop a meaningful journal ranking that reflects a broad field such as business administration. While there may be no general consensus on the exact ranking of individual journals, rankings may agree on what constitutes different categories of journals (A, B, C journals etc.). However, when correlating the categories (A+, A, B, C, D, and E) instead of the ranking positions of JQ 2 with the categories of other rankings, none of the correlations improves significantly and they even tend to be weaker.
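The explained-variance figures above are squared correlation coefficients. As a small check, the sketch below squares the average correlations quoted in the text and counts how many rankings exceed JQ 2. The dictionary and variable names are illustrative; because the inputs are the rounded values from the text, the third decimal can differ slightly from the reported figures.

```python
# Average correlations quoted in the text for rankings from the same period.
avg_r = {
    "JQ 2": 0.616,
    "CNRS 08": 0.602,
    "ABS 09": 0.692,
    "ABDC 08": 0.699,
    "AST 08": 0.660,
    "CRA 09": 0.678,
}

# Shared (explained) variance is the squared correlation coefficient.
explained = {name: r ** 2 for name, r in avg_r.items()}

# Rankings whose average correlation with the others exceeds that of JQ 2.
higher_than_jq2 = [
    name for name, r in avg_r.items()
    if name != "JQ 2" and r > avg_r["JQ 2"]
]
```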
Still, JQ 2 might at least do a good job when identifying and ranking the leading journals versus other journals in a field, since scientific quality is more difficult to assess for lower-ranked journals (due to smaller groups of readers, higher heterogeneity of assessments, etc.), while rankings might agree on what constitutes a leading journal. This, however, seems only partly to tell the truth: when testing the correlations of JQ 2 with other rankings for A+ and A journals versus the correlations of lower ranked journals, the correlation coefficients for top journals are higher, but the difference is significant (p < .05) for only six out of 14 rankings (namely the correlations with NL 99, Theo 05, HKB 05, WU 01, EJL 05, and IMPACT). To summarize: based on the results of the significance tests, JQ 2 (and in a similar way JQ 2.1) can be considered to pass the tests for acceptable convergent validity, but the interpretation of the size of correlations indicates that this convergent validity is moderate. As for nomological validity, the significant correlations with impact factor and acceptance rates show that the nomological validity of the overall scale is acceptable.

Comparison of rankings by discipline
Tables 7-16 in the Appendix provide correlations between rankings for each discipline. The correlations show considerable variation across disciplines. For a simple and meaningful comparison of convergent validity, the explained variance can be compared across rankings. Table 5 provides figures for the mean explained variance of the relationship between all rankings (second column) and the sub-set of rankings from the same period (third column). For instance, the mean explained variance based on the correlations of JQ 2 with any other ranking in the field of Accounting and Auditing is 35%; when looking at the six rankings from the same period only, explained variance reaches 36% in the field of Accounting and Auditing. Table 5 additionally provides the explained variance for impact factors and acceptance rates (fourth and fifth column).
The results show that convergent validity is higher for General Management, Accounting and Auditing, Marketing, and Production Management, while all other disciplines show lower values than the overall ranking, with JQ 2 sharing even less than 15% of variance with any other ranking for Business Information Systems. The results suggest that the moderate convergent validity for the overall scale is driven by a few disciplines and does not apply to each discipline in the same way. As for nomological validity, only six out of eleven disciplines have a significant correlation with impact factor and only five disciplines show a significant correlation with acceptance rate (Table 5; Tables 7 to 16 in the Appendix). Explained variance for impact factors is highest for Marketing, while explained variance for acceptance rate is highest for International Management. The lowest value for impact factor is found for Business Information Systems, and the lowest value for acceptance rate is found for General Management, with less than 1% of explained variance in both cases. The results show that nomological validity varies considerably across disciplines and suggest low nomological validity for the majority of disciplines. Because correlations with other rankings differ across disciplines, JQ 2 appears more consistent with foreign colleagues' perceptions in some disciplines (e.g., Marketing; Table 14 in the Appendix) than in others (e.g., Business Information Systems; Table 10 in the Appendix). What are the reasons for these discrepancies? One reason might be that scholars differ from foreign colleagues in their perceptions of and approaches to evaluating scientific quality. This could be reflected by the fact that only a small number of German scholars strive for publications and actually publish in the discipline's leading journals.
To empirically test this possibility, a database by Eisend and Schmidt (2010) [...] been authored by VHB members (as of 2008). All articles in A+ and A journals were selected (593 articles, see Table 6) and assigned to the above disciplines based on the classification that was used throughout the paper (Schrader and Hennig-Thurau 2009). Then, the number of authors of these articles was counted for each discipline (Table 6) and the following two measures were computed: (a) the percentage of top publications per VHB member in a particular discipline, and (b) the percentage of VHB members who are authors of top publications in a particular discipline. When correlating these figures with the explained variance in Table 5 (second column) as an indicator for the consistency of rankings, using weights for the underlying number of correlations (i.e., each mean value in Table 5 is based on 12 correlations of JQ 2 with any other ranking), the results are not significant: (a) r = -.062 and (b) r = -.047 (both p's > .52). This result shows that the heterogeneity across disciplines cannot be explained by scholars differing from foreign colleagues in their perceptions of and approaches to evaluating scientific quality, as indicated by scholars' activities in top journals.

Note: Members were assigned to disciplines according to the VHB commissions into which they self-selected. Members who self-selected into more than one commission were counted for each of these commissions. Publications were assigned to disciplines according to the procedure defined above (i.e., based on the journals they were published in; journals were assigned to disciplines in line with the approach by Schrader and Hennig-Thurau 2009). The number of authors refers to all VHB authors, not only authors who self-selected into the corresponding VHB commission.

BuR - Business Research Official Open Access Journal of VHB German Academic Association for Business Research (VHB) Volume 4 | Issue 2 | December 2011 | 241-274

Alternatively, VHB might not represent German scholars in the respective field adequately, and the number of professors in a discipline being members of VHB might be too low for consistent journal assessments. The consistency of journal assessments can be related to the number of members per discipline for several reasons. For instance, disciplines with a low number of members might not be fully organized in VHB and therefore they might not be representative of all scholars in the respective discipline. Furthermore, outliers in the VHB-JOURQUAL survey data resulting from, for instance, strategic responses from individual VHB members can influence the survey results more strongly, the smaller the number of members in a discipline. When correlating the number of members of the corresponding commissions as indicated by the VHB membership list in 2008 (Table 6) with the explained variance in Table 5, using weights for the number of correlations (i.e., each mean value in Table 5 is based on 12 correlations of JQ 2 with any other ranking), the correlation is positive and significant (r = .519, p < .01), indicating that the number of VHB members in a particular discipline reduces discrepancies between rankings; that is, the more members, the more consistent the perceptions of journal standing and scientific quality with the perception of foreign colleagues.
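The weighted correlations reported above can be sketched as a weighted Pearson coefficient, in which each discipline's mean explained variance carries a weight equal to the number of correlations behind it. This is only an assumption about how the weighting could be done, since the text does not spell out the exact procedure; the function name and the five data points are hypothetical. The sketch also leaves out the significance test, where weights would additionally affect the degrees of freedom.

```python
def weighted_pearson(x, y, w):
    """Pearson correlation in which observation i counts with weight w[i]."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / total
    return cov / (var_x * var_y) ** 0.5

# Hypothetical illustration (not the values of Table 5 or Table 6):
# members per discipline, mean explained variance, and the number of
# correlations each mean is based on.
members = [120, 80, 200, 60, 150]
mean_explained_variance = [0.30, 0.22, 0.41, 0.15, 0.35]
weights = [12, 12, 12, 12, 12]

r_weighted = weighted_pearson(members, mean_explained_variance, weights)
```

With identical weights for every discipline, as in this illustration, the coefficient equals the unweighted Pearson correlation; the weights then only matter for the significance test.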

In-depth analysis of the relationship of VHB-JOURQUAL 2 and acceptance rate by discipline
The most striking difference of the cross-discipline comparison refers to the relationship with acceptance rate that lacks significance for several disciplines (General Management, Banking and Finance, Business Information Systems, Management of Technology and Innovation, Operations Research). This result suggests a problem with the ranking when used for evaluating candidates for academic positions. As explained above, candidates will attempt to submit to journals with low rejection rates and high ranking positions, which should eventually lead to equilibrium. As long as equilibrium is not reached, though, the result proves VHB-JOURQUAL 2 to be a weak measure for this particular journal in the short run. Figure 1 illustrates the relationship between journal ranking position (1 to 666, see Table 1) and acceptance rate. The regression line drawn through the graph can be interpreted as a dividing line for exploitation opportunities: journals above the line provide a comparatively high acceptance rate in relation to their VHB-JOURQUAL 2 ranking position, while journals below the line have quite high rejection rates relative to their ranking position. Figure 1 shows labels for journals that show a high positive deviation from the regression line (i.e., at least the mean standardized residual + 1.5 standard deviations), that is, journals that provide opportunities for exploitation. The corresponding figures of the relationship between journal ranking position and acceptance rate for each discipline are presented in the Appendix (Figures 2 to 10). The "outlier" data of journals with high acceptance rates and favorable positions (and vice versa) also provide an explanation for the weak correlations between ranking and acceptance rate by discipline.
Notably, the relationship of ranking positions and acceptance rate can be low due to a self-selection bias of authors; that is, the fact that authors do not necessarily always send their papers to the highest-ranked journal in their field, but to the journal the authors feel will view their research favorably. Hence, a lower-ranked journal might achieve a high number of submissions and thus lower acceptance rates simply because authors have decided to refrain from sending their work to a higher-ranked journal. Such a self-selection bias and a comparable number of submissions to journals with different ranking positions might also be a reason why some disciplines demonstrate comparatively low variation in acceptance rates (e.g., General Management, Business Information Systems; Figures 2 and 5 in the Appendix).
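The labeling rule for journals far above the regression line can be sketched as follows. The sketch fits a simple least-squares line of acceptance rate on ranking position, standardizes the residuals, and flags journals whose standardized residual is at least the mean plus 1.5 standard deviations. All names and data are hypothetical, and dividing residuals by their population standard deviation is a simplification; statistical packages may standardize residuals differently.

```python
from statistics import mean, pstdev

def ols_fit(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def flag_high_acceptance(names, rank_pos, acc_rate, k=1.5):
    """Return journals whose standardized residual is >= mean + k * SD."""
    slope, intercept = ols_fit(rank_pos, acc_rate)
    residuals = [y - (intercept + slope * x) for x, y in zip(rank_pos, acc_rate)]
    sd = pstdev(residuals)  # simplification: population SD of the residuals
    standardized = [r / sd for r in residuals]
    threshold = mean(standardized) + k * pstdev(standardized)
    return [n for n, z in zip(names, standardized) if z >= threshold]

# Hypothetical journals: ranking position (1 = best) and acceptance rate in %.
names = ["Journal A", "Journal B", "Journal C", "Journal D",
         "Journal E", "Journal F", "Journal G", "Journal H"]
rank_pos = [10, 50, 100, 200, 300, 400, 500, 600]
acc_rate = [8, 10, 40, 15, 20, 22, 27, 30]

flagged = flag_high_acceptance(names, rank_pos, acc_rate)
```

In this made-up data only "Journal C" lies far above the line: it combines a favorable ranking position with an unusually high acceptance rate.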

Discussion
This study attempted to assess the validity of VHB-JOURQUAL 2 as a measure of scientific quality. The findings show that VHB-JOURQUAL 2 provides acceptable but moderate convergent validity and acceptable nomological validity for the overall ranking. The convergent validity is lower than for most other rankings from the same period, indicating that other rankings were able to develop a more coherent understanding of scientific quality, although the unexplained variance of at least 50% shows that there is only a weak general consensus across journal rankings. The understanding of scientific quality across rankings might simply be too heterogeneous to develop consistent journal rankings. The results differ when disciplines are considered separately. Apparently, for Business Information Systems, VHB-JOURQUAL 2 shows lower validity values and for marketing, it shows higher validity values than for most other disciplines. An explanation for the discrepancy is provided by the variation of VHB members per discipline. The more members per discipline, the more consistent are the perceptions and evaluations of these members with foreign colleagues' journal perceptions. As for nomological validity, the correlation between VHB-JOURQUAL 2 and acceptance rate and impact factors is low for the majority of disciplines. The variation of the relationship between VHB-JOURQUAL 2 and acceptance rate across disciplines shows opportunities for exploitation for some journals with high ranking positions and low rejection probabilities. Authors can retrieve information on rejection rates from different sources, such as journal websites, databases (e.g., Campbell's directory), meet-the-editor sessions at conferences, or peer discussion. 
Another simple indicator for rejection probabilities is the numbering typically used for journal submissions: most submissions are counted per year, so an author can infer the number of submissions per year and relate it to the number of papers that are published in a particular outlet (e.g., if an author submits a paper in July and the submission is numbered 100, s/he might infer that the journal receives around 200 papers per year; if the outlet publishes 40 papers per year, the acceptance rate would be 20%). It is very likely that the fact that some journals have high ranking positions and low rejection probabilities is a temporary phenomenon, because submissions to these journals should increase, driving up rejection rates. As for the next ranking (VHB-JOURQUAL 3), it would be interesting to investigate whether these particular journals have indeed reached equilibrium and whether the rise in submissions to these journals is due to the activities of German scholars. In the short run, however, it is important to identify these journals in order to provide a more valid assessment of candidates and research output based on journal rankings.
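The back-of-the-envelope inference in the example above can be written out as a short calculation. The function and parameter names are hypothetical, and July is treated as the middle of the year, as the example in the text does.

```python
def estimate_acceptance_rate(submission_number, fraction_of_year_elapsed,
                             papers_published_per_year):
    """Extrapolate annual submissions from a running submission number
    and relate them to the number of papers published per year."""
    annual_submissions = submission_number / fraction_of_year_elapsed
    return papers_published_per_year / annual_submissions

# Submission numbered 100 at mid-year, outlet publishing 40 papers per year.
rate = estimate_acceptance_rate(100, 0.5, 40)
```

The estimate is only as good as its assumptions: submissions spread evenly over the year, numbering that restarts each year, and a stable number of published papers.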

The study provides some general implications for the further development of VHB-JOURQUAL, which are: (1) considering approaches of foreign rankings that apparently show higher convergent validity; (2) indicating consistency or validity values for each discipline as well as the number of VHB members and survey participation rate for each discipline to encourage the discussion on whether the ranking could and should have the same meaning for each discipline; (3) updating the ranking on a regular basis and applying adjustments at shorter intervals, in order to avoid problems of low nomological validity as related to acceptance rate. The methodological approach of assessing quality of rankings as a measure of scientific quality refrained from using further validity criteria, since their application is not without problems. One of the tests that could be applied is criterion validity that could be performed by measuring the success of scholars or schools based on the quality of their publication output as assessed by VHB-JOURQUAL 2. Success of scholars and schools, however, is often defined and measured in terms of publication output (e.g., "Handelsblatt" ranking (Müller and Storbeck 2009)), which renders the results of such a criterion validity test somewhat tautological (e.g., the "Handelsblatt" ranking is based on ranking weights derived from different journal rankings, amongst them VHB-JOURQUAL 2). In a similar way, known-groups validity can be assessed by focusing on very successful scholars and by testing whether their publication output is higher than that of other scholars (e.g., Seggie and Griffith 2009). Success criteria for scholars could be received grants (e.g., as received by the German Research Foundation), promotions, or salary, which typically increases with the number of job offers a candidate receives from different universities.
Since publication output is used to assess whether a scholar receives a grant or a job offer, such tests would also be tautological. The same problem applies to content validity that is usually assessed by experts in a field. These experts would be scientists, leading to the problem of self-organization of science and the system's self-reference (e.g., Krohn and Küppers 1990; Maturana 1990): scientists develop evaluation criteria and quality standards which they apply to evaluate the quality of their own work. Although increasing experience might allow scientists to have a more comprehensive and less biased view of the science system and of journal quality, the evaluation always runs the risk of a self-serving bias. In a way, the increasing acceptance of VHB-JOURQUAL 2 as outlined in the beginning of the article is a result that already reflects a content-valid assessment of the scientific community including the experts in the field. The ranking of journals based on their quality as a major criterion for measuring scientific quality has been criticized for a variety of other reasons (e.g., Albers 2009; Frey and Rost 2010). However, these peculiarities and perils are not the focus of this paper; in other words, this paper does not discuss the advantages or disadvantages of journal rankings as a measure of scientific quality per se, but rather takes a pragmatic approach to test the measurement quality of VHB-JOURQUAL 2 as a commonly applied measure by business researchers in German-speaking countries. The overall findings encourage a critical use and a further development of VHB-JOURQUAL as a measure of scientific quality. A final, but important limitation of this study lies in the fact that the question of whether German journals are adequately ranked is not empirically addressable, since this study compares only international journals.
Although German journals might have ended up in a ranking position that reflects their scientific quality, the findings of this paper show a moderate validity of the overall ranking and thus indicate the possibility that individual journals might not have been ranked correctly.

Appendix A: Correlation matrices for each discipline