Abstract
Voting advice applications (VAAs), online tools that provide voters with an estimate of their ideological congruence with political parties or candidates, have become increasingly popular in recent years. Many VAAs draw on low-dimensional spatial representations to match voters to political elites. Yet VAA spatial maps tend to be defined purely on a priori grounds. Thus fundamental psychometric properties, such as unidimensionality and reliability, remain unchecked and potentially violated. This practice can be damaging to the quality of spatial matches. In this paper we propose dynamic scale validation (DSV) as a method to empirically validate and thereby improve VAA spatial maps. The basic logic is to draw on data generated by users who access the VAA soon after its launch for an evaluation (and potential adjustment) of the spatial maps. We demonstrate the usefulness of DSV drawing on data from three actual VAAs: ParteieNavi, votulmeu and choose4cyprus.
Notes
See Mendez (2012, 2014) for a discussion of matching in the high-dimensional space. There is an ongoing debate about the relative merits of high- versus low-dimensional matching (e.g. Lobo et al. 2010; Louwerse and Rosema 2014). This paper does not address this debate directly. For the present purposes it suffices to say that as neither is likely to disappear soon, research into possible ways to improve both low- and high-dimensional techniques should be welcomed.
O’Leary-Kelly and Vokurka (1998) in addition consider nomological validity as a third step.
A potential objection is that minimum sample recommendations do not apply in the scenario in question as they relate to situations where we sample from a population, but that we are dealing with the population of parties in a given election. However, minimum sample requirements still apply, for at least two reasons. First, the argument that the parties at hand represent the population is questionable. Arguably, there is always some sort of ‘super-population’; the scales should e.g. also apply to yesterday’s or tomorrow’s parties. Second, the data necessarily contains some measurement error. Thus, even if one is willing to assume that the parties at hand represent the population, results are bound to be unstable due to the low number of observations.
Of course, VAA designers could collect more data, e.g. at the candidate-level, but given the high costs involved this hardly constitutes practical advice.
Referring to Converse (1964), some might object that voters’ ideology is not structured enough for the construction of ideological scales. However, ordinary citizens display a degree of ideological constraint that is sufficient for spatial mapping purposes. Achen (1975) and Ansolabehere et al. (2008), among others, showed that Converse’s findings owed much to inadequate methodological choices. Note also that if it were true that voters’ issue constraint is too limited, low-dimensional representations in VAAs should probably be abandoned altogether.
To guard against early users’ possible unrepresentativeness and thus improve early user-based inference, VAA designers can test for the equivalence of their spatial dimensions across different levels of, say, political interest or political knowledge. A more straightforward method would be to repeat DSV at a later point in time to check whether the scales continue to work. Note that there is another potential problem with early user-based inference: overfitting on the early user sample. Cross-validation constitutes a powerful method to avoid overfitting.
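The cross-validation logic mentioned here can be illustrated with a minimal hold-out sketch. The function names (`cronbach_alpha`, `holdout_check`) and the use of Cronbach’s alpha as the evaluation statistic are illustrative assumptions, not the paper’s actual procedure: item selection would be done on one half of the early-user sample and the resulting scale’s reliability checked on the held-out half. A large drop from the training to the hold-out half would suggest overfitting.

```python
import numpy as np

def cronbach_alpha(X):
    """Cronbach's alpha for a respondents-by-items response matrix."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def holdout_check(X, seed=0, train_frac=0.5):
    """Split early-user responses at random; report reliability on both
    halves. A large train-to-test drop suggests an overfitted scale."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(len(X) * train_frac)
    train, test = X[idx[:cut]], X[idx[cut:]]
    return cronbach_alpha(train), cronbach_alpha(test)
```

In practice the training half would also drive the item selection itself; the sketch only shows the evaluation step.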
See Straat et al. (2014) for sample size requirements in Mokken scaling. Drawing on the first 2–5000 entries also avoids the problem that the very first users may be fundamentally unrepresentative of the average VAA user, for instance if a VAA diffuses from a university setting.
A drawback of Mokken scaling is its requirement of listwise deletion, although most alternative scaling techniques share this requirement.
Quasi- rather than fully inductive since the nature of the identified dimensions depends on the nature of the item bank (see Benoit and Laver 2012).
Since both exploratory techniques draw exclusively on the H-statistic, all scales must be subjected to the monotonicity test after the quasi-inductive search.
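For the dichotomous case, the scale-level H-statistic referred to here (Loevinger’s H) can be sketched as follows. This is an illustrative implementation, not the mokken package’s code; it compares observed Guttman errors for each item pair against the errors expected under marginal independence.

```python
import numpy as np
from itertools import combinations

def loevinger_H(X):
    """Scale-level Loevinger H for dichotomous items (respondents x items).
    H = 1 - (observed Guttman errors) / (errors expected under independence).
    H = 1 for a perfect Guttman scale; H near 0 for unrelated items."""
    X = np.asarray(X)
    n, k = X.shape
    p = X.mean(axis=0)           # item popularities
    F = E = 0.0
    for i, j in combinations(range(k), 2):
        # let a be the 'harder' (less popular) item of the pair
        a, b = (i, j) if p[i] <= p[j] else (j, i)
        # Guttman error: endorsing the harder item but not the easier one
        F += np.sum((X[:, a] == 1) & (X[:, b] == 0))
        E += n * p[a] * (1 - p[b])
    return 1 - F / E
```

The polytomous generalization used in actual Mokken scale analysis follows the same observed-versus-expected-errors logic.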
The assumption of tau-equivalence can be thought of as a factor model in which all factor loadings are constrained to equality.
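In factor-analytic notation, this footnote’s point can be written out as follows (a standard formulation, with $\eta$ the latent trait and $\varepsilon_i$ the item-specific error):

```latex
% Congeneric one-factor model for item i:
X_i = \mu_i + \lambda_i \eta + \varepsilon_i
% Tau-equivalence constrains all loadings to a common value:
\lambda_1 = \lambda_2 = \dots = \lambda_k = \lambda
```

Under this equality constraint Cronbach’s alpha equals the scale’s reliability; when loadings differ (the congeneric case), alpha is only a lower bound.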
See the data and methods section for the requirement of clean data.
All data can be downloaded from http://www.preferencematcher.org/.
To guarantee external consistency (the property that an item is associated with only one latent trait), we always checked whether an item associated with the X-scale can also be attributed to the Y-scale, and vice versa. ParteieNavi’s items 10 and 19 were identified as violating external consistency and excluded from the quasi-inductive analysis. Furthermore, note that all items were included in both original and reversed order because Mokken scaling can only associate items which point in the same direction. The exploratory analysis therefore outputs each scale twice, once in each direction. Only one of the duplicates is reported.
More than two scales emerged in the case of votulmeu and choose4cyprus. However, the extra scales consist of a mere two or three items and it is difficult to make substantive sense of them since they all reflect issues already covered by the two main dimensions. Thus the extra scales can hardly be considered stand-alone dimensions. Exploratory factor analysis, another quasi-inductive technique, invariably suggests a two-dimensional structure as well.
Note though that the GA algorithm attributed an additional item to votulmeu’s left–right scale (item 21) that had to be manually removed since it failed the monotonicity test.
Note that there is some cross-case variation in the match between early and late user samples, with ParteieNavi and possibly votulmeu performing better than choose4cyprus. Potential reasons include variation in early users’ representativity of late users and varying degrees of overfitting in the early user models. See footnote 7 for avenues to improve early user-based inference.
Throughout this section we only consider late users who accessed the site after the previously set cut-off since only these would have been affected by DSV.
Missing party/candidate positions are imputed with the neutral middle category. Note that the use of the Euclidean distance implies a simplification since it is tantamount to assuming a proximity voting logic. In reality, users of VAA spatial maps are free to apply whatever logic they prefer.
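The matching step described in this note can be sketched in a few lines. The function name `euclidean_match` and the 1–5 Likert coding with a neutral middle category of 3 are illustrative assumptions; the point is only that missing elite positions are imputed with the neutral category before computing Euclidean distances.

```python
import numpy as np

def euclidean_match(user, parties, neutral=3):
    """Rank parties by Euclidean distance to a user's issue positions.
    Missing party positions (np.nan) are imputed with the neutral
    middle category before computing distances."""
    P = np.asarray(parties, dtype=float)
    P = np.where(np.isnan(P), neutral, P)      # impute neutral category
    u = np.asarray(user, dtype=float)
    d = np.sqrt(((P - u) ** 2).sum(axis=1))    # distance to each party
    return np.argsort(d), d                    # ranking and raw distances
```

As the note observes, ranking by Euclidean distance amounts to assuming a proximity voting logic; users themselves are free to read the map differently.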
References
Achen, C.H.: Mass political attitudes and the survey response. Am. Polit. Sci. Rev. 69(4), 1218–1231 (1975)
Andreadis, I.: Data quality and data cleaning. In: Garzia, D., Marschall, S. (eds.) Matching Voters with Parties and Candidates, pp. 79–92. ECPR Press, Colchester (2014)
Ansolabehere, S., Rodden, J., Snyder, J.M.: The strength of issues: using multiple measures to gauge preference stability, ideological constraint, and issue voting. Am. Polit. Sci. Rev. 102(2), 215–232 (2008)
Benoit, K., Laver, M.: The dimensionality of political space: epistemological and methodological considerations. Eur. Union Polit. 13(2), 194–218 (2012)
Carmines, E.G., Zeller, R.A.: Reliability and Validity Assessment. Sage Publications, Thousand Oaks, CA (1979)
Cedroni, L., Garzia, D. (eds.): Voting Advice Applications in Europe: The State of the Art. ScriptaWeb, Naples (2010)
Clark, L.A., Watson, D.: Constructing validity: basic issues in objective scale development. Psychol. Assess. 7(3), 309–319 (1995)
Converse, P.E.: The nature of belief systems in mass publics. In: Apter, D.E. (ed.) Ideology and Discontent, pp. 206–261. Free Press, New York, NY (1964)
Cortina, J.M.: What is coefficient alpha? An examination of theory and applications. J. Appl. Psychol. 78(1), 98–104 (1993)
Davidov, E.: Measurement equivalence of nationalism and constructive patriotism in the ISSP: 34 Countries in a comparative perspective. Polit. Anal. 17(1), 64–82 (2009)
Gadermann, A.M., Guhn, M., Zumbo, B.D.: Estimating ordinal reliability for Likert-type and ordinal item response data: a conceptual, empirical, and practical guide. Pract. Assess. Res. Eval. 17(3), 1–13 (2012)
Garzia, D., Angelis, Ad, Pianzola, J.: The impact of voting advice applications on electoral participation. In: Garzia, D., Marschall, S. (eds.) Matching Voters with Parties and Candidates, pp. 105–114. ECPR Press, Colchester (2014)
Gemenis, K.: Estimating parties’ policy positions through voting advice applications: some methodological considerations. Acta Polit. 48(3), 268–295 (2013)
Gemenis, K., Rosema, M.: Voting advice applications and electoral turnout. Elect. Stud. 36, 281–289 (2014). doi:10.1016/j.electstud.2014.06.010
Gemenis, K., van Ham, C.: Comparing methods for estimating parties’ positions in voting advice applications. In: Garzia, D., Marschall, S. (eds.) Matching Voters with Parties and Candidates, pp. 33–47. ECPR Press, Colchester (2014)
Gerbing, D.W., Anderson, J.C.: An updated paradigm for scale development incorporating unidimensionality and its assessment. J. Mark. Res. 25(2), 186–192 (1988)
Germann, M., Mendez, F., Wheatley, J., Serdült, U.: Exploiting smartvote data for the ideological mapping of Swiss political parties. Paper presented at the 2012 Convegno SISP, September 13–15, Rome (2012)
Germann, M., Mendez, F., Wheatley, J., Serdült, U.: Spatial maps in voting advice applications: the case for dynamic scale validation. Acta Politica (2014). doi:10.1057/ap.2014.3
Hattie, J.: Methodology review: assessing unidimensionality of tests and items. Appl. Psychol. Meas. 9(2), 139–164 (1985)
Hemker, B.T., Sijtsma, K., Molenaar, I.W.: Selection of unidimensional scales from a multidimensional item bank in the polytomous Mokken IRT model. Appl. Psychol. Meas. 19(4), 337–352 (1995)
Kriesi, H., Grande, E., Lachat, R., Dolezal, M., Bornschier, S., Frey, T.: Globalization and the transformation of the national political space: six European countries compared. Eur. J. Polit. Res. 45(6), 921–956 (2006)
Lefevere, J., Walgrave, S.: A perfect match? The impact of statement selection on voting advice applications’ ability to match voters and parties. Elect. Stud. 36, 252–262 (2014). doi:10.1016/j.electstud.2014.04.002
Lin, L.I.K.: A concordance correlation coefficient to evaluate reproducibility. Biometrics 45(1), 255–268 (1989)
Linzer, D.A., Lewis, J.B.: poLCA: an R package for polytomous variable latent class analysis. J. Stat. Softw. 42(10), 1–29 (2011)
Lobo, M.C., Vink, M., Lisi, M.: Mapping the political landscape: a vote advice application in Portugal. In: Cedroni, L., Garzia, D. (eds.) Voting Advice Applications in Europe, pp. 143–172. ScriptaWeb, Naples (2010)
Lord, F.M., Novick, M.R.: Statistical Theories of Mental Test Scores. Addison-Wesley, Reading (1968)
Louwerse, T., Otjes, S.: Design challenges in cross-national VAAs: the case of the EU profiler. Int. J. Electron. Gov. 5(3/4), 279–297 (2012)
Louwerse, T., Rosema, M.: The design effects of voting advice applications: comparing methods of calculating matches. Acta Polit. 49(3), 286–312 (2014)
Marks, G., Hooghe, L., Nelson, M., Edwards, E.: Party competition and European integration in the east and west: different structure, same causality. Comp. Polit. Stud. 39(2), 155–175 (2006)
Marschall, S.: Profiling users. In: Garzia, D., Marschall, S. (eds.) Matching Voters with Parties and Candidates, pp. 93–104. ECPR Press, Colchester (2014)
Marschall, S., Schultze, M.: Voting advice applications and their effect on voter turnout: the case of the German Wahl-O-Mat. Int. J. Electron. Gov. 5(3/4), 349–366 (2012)
McDonald, R.P.: Generalizability in factorable domains: domain validity and generalizability. Educ. Psychol. Meas. 38(1), 75–79 (1978)
Mendez, F.: Matching voters with political parties and candidates: an empirical test of four algorithms. Int. J. Electron. Gov. 5(3/4), 264–278 (2012)
Mendez, F.: What’s behind a matching algorithm? A critical assessment of how voting advice applications produce voting recommendations. In: Garzia, D., Marschall, S. (eds.) Matching Voters with Parties and Candidates, pp. 49–66. ECPR Press, Colchester (2014)
Mokken, R.J.: A Theory and Procedure of Scale Analysis: With Applications in Political Research. Springer, New York (1971)
O’Leary-Kelly, S., Vokurka, R.J.: The empirical assessment of construct validity. J. Oper. Manag. 16(4), 387–405 (1998)
Otjes, S., Louwerse, T.: Spatial models in voting advice applications. Elect. Stud. 36, 263–271 (2014). doi:10.1016/j.electstud.2014.04.004
Sijtsma, K.: On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika 74(1), 107–120 (2009)
Smits, I.A.M., Timmerman, M.E., Meijer, R.R.: Exploratory Mokken scale analysis as a dimensionality assessment tool: why scalability does not imply unidimensionality. Appl. Psychol. Meas. 36(6), 516–539 (2012)
Straat, J.H., van der Ark, L.A., Sijtsma, K.: Comparing optimization algorithms for item selection in Mokken scale analysis. J. Classif. 30(1), 75–99 (2013)
Straat, J.H., van der Ark, L.A., Sijtsma, K.: Minimum sample size requirements for Mokken scale analysis. Educ. Psychol. Meas. 74(5), 809–822 (2014)
van Camp, K., Lefevere, J., Walgrave, S.: The content and formulation of statements in voting advice applications: a comparative analysis of 26 VAAs. In: Garzia, D., Marschall, S. (eds.) Matching Voters with Parties and Candidates, pp. 11–31. ECPR Press, Colchester (2014)
van de Pol, J., Holleman, B., Kamoen, N., Krouwel, A., de Vreese, C.: Beyond young, highly educated males: a typology of VAA users. J. Inf. Technol. Polit. 11(4), 397–411 (2014)
van der Ark, L.A.: Mokken scale analysis in R. J. Stat. Softw. 20(11), 1–19 (2007)
van der Ark, L.A.: New developments in Mokken scale analysis in R. J. Stat. Softw. 48(5), 1–27 (2012)
van der Ark, L.A., van der Palm, D.W., Sijtsma, K.: A latent class approach to estimating test-score reliability. Appl. Psychol. Meas. 35(5), 380–392 (2011)
van Schuur, W.H.: Mokken scale analysis: between the Guttman scale and parametric item response theory. Polit. Anal. 11(2), 139–163 (2003)
Vassil, K.: Voting Smarter: The Impact of Voting Advice Applications on Political Behavior. PhD thesis, European University Institute, Florence (2012)
Wagner, M., Ruusuvirta, O.: Matching voters to parties: voting advice applications and models of party choice. Acta Polit. 47(4), 400–422 (2012)
Walgrave, S., Nuytemans, M., Pepermans, K.: Voting aid applications and the effect of statement selection. West Eur. Polit. 32(6), 1161–1180 (2009)
Wall, M., Krouwel, A., Vitiello, T.: Do voters follow the recommendations of voter advice application websites? A study of the effects of kieskompas.nl on its users’ vote choices in the 2010 Dutch legislative elections. Party Polit. 20(3), 416–428 (2014)
Acknowledgments
We thank L. Andries van der Ark, Hendrik Straat, Kostas Gemenis, Jonas Lefevere, Simon Otjes, Jonathan Wheatley, the anonymous reviewer, and seminar participants at the 2013 ECPR General Conference (Bordeaux) and the 2013 EU Vox workshop (Twente) for helpful comments. Any remaining errors are, of course, our own. The authors gratefully acknowledge financial support from the e-Democracy project funded by the Swiss cantons of Argovia (main contributor), Basel-City, Geneva, Grisons and Schaffhausen, as well as the Swiss Federal Chancellery. We used R v.3.1 for much of the scaling analysis, in particular the mokken (van der Ark 2007, 2012) and poLCA (Linzer and Lewis 2011) packages, and Stata v.11.2 for all remaining analyses. Replication files are available at http://www.preferencematcher.org/. Users’ data privacy is fully protected.
Germann, M., Mendez, F. Dynamic scale validation reloaded. Qual Quant 50, 981–1007 (2016). https://doi.org/10.1007/s11135-015-0186-0