1 Introduction

Low voter turnout is a phenomenon that affects most advanced democracies, leaving many citizens without proper representation. Ladner and Pianzola [1] specifically mentioned Switzerland, where voter turnout has not exceeded 50% since 1975. Citizens, partly because of their lack of knowledge of the political issues, tend to avoid this democratic decision-making process. E-democracy tools use Information and Communications Technologies (ICTs) to inform people about the political stances of the parties taking part in upcoming elections, aiming to increase citizen participation and to promote direct involvement in political activities [2].

Voting Advice Applications (VAAs) (according to Fivaz et al. [3] this term has been widely used since 2007) are e-democracy tools that undertake the role of the ‘tipster’ and facilitate citizens’ decision-making process by matching their political stances with those of parties. Findings have shown that VAAs’ recommendations affect the decision-making process of a significant share of voters, especially those who are undecided, women, aged under 34, or voting for the first time [3, 4]. In addition, as reported in some studies [4, 5], in many cases VAAs were responsible for increasing the participation of citizens in elections.

VAAs ask users and parties to fill in a specific questionnaire that contains a number of policy statements (see a policy statement example in Figure 1), which are created according to issues that concern the nation at the time of the elections and represent important political, economic and social issues [6, 7]. The recommendation process that a VAA traditionally follows contains two main steps: first the users’ answers are matched with the parties’ answers, and then the VAA ranks the parties in decreasing order of the matching score, i.e., according to party-user ‘similarity’ (see Figure 2 for party ranking based on party-user similarity as computed in traditional VAAs).

Figure 1: A policy statement that was included in EUVox 2014 with the given set of answer choices.

Figure 2: Party ranking based on party-user similarity as computed in traditional VAAs (EUVox 2014, Germany data).

In addition to the policy statements, many VAAs ask users to answer a number of supplementary questions. One of these supplementary (opt-in) questions is the vote intention of the user (i.e., which party the user intends to vote for in the upcoming election). An example of how the supplementary questions appear in EUVox, the EU-wide VAA that was designed for the 2014 elections to the European Parliament, is shown in Figure 3, where the vote intention of the user for the European elections is the second question. The vote intention variable is commonly used to cluster the community of VAA users into disjoint groups in a social variation of VAAs, called SVAA, that is based on the collaborative filtering [8] philosophy.

Figure 3: The supplementary questions as they appear in EUVox 2014.

SVAAs use the same policy questionnaire as traditional VAAs but provide a radically different matching technique. In SVAA recommendation, parties’ answers are ignored and the recommendation is based on the community of VAA users, using collaborative filtering techniques [8, 9]. During the training phase of the SVAA design, a number of groups are created, corresponding to the parties that participate in the elections. Each group includes the users who expressed vote intention for a specific party; thus, the training phase includes only the filled-in questionnaires of VAA users who answered the vote intention supplementary question and expressed a preference for a particular party (i.e., they did not choose answers such as ‘not decided yet’ or ‘I will not vote’). VAA users who are clustered into the same group, which means they support the same party, are likely to produce similar answer patterns (i.e., fill in the policy questionnaire in a similar way), since they commonly share the same political opinions. Party models can therefore be created for each group to capture the common way, if any, in which the users in that group fill in the online questionnaire. Once party models are created, the SVAA starts its operation phase. The sequence of answers (i.e., the filled-in questionnaire) of a VAA user is fed to the party models, which output party-user similarity scores. The recommendation is given to the user in decreasing order of these similarity scores. An example of the matching scores presented to the user according to the SVAA philosophy is shown in Figure 4. SVAAs have been shown to make better voting predictions than the traditional VAA matching schemes between users’ and parties’ profiles [10, 11]. In addition, as recorded by users’ feedback through the emoticons shown in the right part of Figure 2 and Figure 4, SVAA recommendation surpasses VAA recommendation in terms of users’ satisfaction [12].

Figure 4: Party ranking based on matching scores between party models and user’s answer pattern (EUVox 2014, Germany data).

To tackle the recommendation problem of SVAAs, machine learning techniques [13, 14] can be used to indicate the likelihood that a user belongs to a class of the VAA community of users; as already mentioned, each class corresponds to a specific party. In essence, what machine learning accomplishes is the modelling of party groups based on their users’ answer patterns, which are created by their responses to the online questionnaire. In this paper we investigate the application of Hidden Markov Model (HMM) classifiers for party-user similarity estimation, in an effort to improve the effectiveness of social vote recommendation. HMM classifiers provide a way to apply machine learning to data represented as a sequence of correlated observations [13]. Although the order in which policy statements are displayed to users is not important in VAAs, the policy statements are usually correlated and grouped into categories (e.g., external policy, economy, society, etc.). Thus, the answer chosen for each policy statement might be related to the answers chosen for previous and subsequent ones. Given that the order of policy statements is kept fixed within each VAA, one can assume that (a) answer patterns, i.e., sequences of choices for all policy statements included in the VAA, can be found that characterise ‘typical’ voters of particular parties, and (b) the answer choice in each policy statement can be ‘predicted’ from previous answer choices. When users answer the policy statements, they incrementally produce a sequence of symbols. Whenever a process includes a sequence of dependent observations, HMM classifiers can be used to model input sequences as generated by a parametric random process. This is our basic rationale for employing HMMs to obtain similarity matching between parties and users in SVAAs.

An HMM classifier models a sequence of symbols from observed data without knowing the sequence of states it has to follow to generate these observations [13]. We assume that VAA users who support the same party produce similar sequences of symbols (answer patterns). Therefore, HMM classifiers can be used (a) to create simple and compact models for each party that show the ‘path’ that users who support the same party follow to answer the online questionnaire, and (b) to classify every new user into the party to which they are most likely to belong, according to their given answers/symbols on the policy statements.

In short, the purpose of our paper is to introduce an SVAA method for similarity matching between parties and users based on HMMs and to investigate its performance in terms of the accuracy of predicting users’ voting intention. We show that, even if the order in which the policy statements are answered in a VAA does not really matter, the HMM classifier performs quite well in estimating the vote intention of unseen users. Nevertheless, the HMMs’ performance relies on a smooth distribution of samples per party and on the consistency between the answers of the users who are classified as belonging to these parties. Therefore, in cases where these conditions are not met, the results may not be satisfactory and outlier and/or rogue detection may be required [15]. Experiments were also conducted to compare the HMM classifier with the traditional VAA party-user matching method and with other SVAA solutions for recommending parties to users. We observe that the HMM classifier shows the highest performance among the traditional party-coding recommendation method and the other party-supporters modelling methods.

To the best of our knowledge this is the first time HMMs are used to compute party-user similarity either in VAAs or in SVAAs. For our experiments we use three datasets derived from EUVox 2014. EUVox was sponsored financially by the Open Society Initiative for Europe (European Elections 2014) and the Directorate-General for Communication of the European Parliament (area of Internet-based activities/online media 2014) to give voters quick access to information on the political positions of the parties that took part in the 2014 elections to the European Parliament (see http://www.euvox2014.eu/). The datasets differ in size, in the number of parties participating in the elections and in the population’s distribution percentage among the various parties. The corresponding datasets are available to other researchers working in the areas of VAA and Web-based recommender systems through the Preference Matcher website (see footnote 1).

2 Problem formulation

The basic aim of a traditional VAA is to recommend parties to users. In such a case there is a set of N users \(X=\{\mathbf {x_{1}},\mathbf {x_{2}}, \ldots , \mathbf {x_{N}}\}\), a set of U policy statements (or issues) \(Q=\{q_{1}, q_{2}, \ldots, q_{U}\}\), and a set of D political parties \(P=\{\mathbf {p_{1}},\mathbf {p_{2}},\ldots, \mathbf {p_{D}}\}\). Each user \(\mathbf {x_{j}} \in X\) and each political party \(\mathbf {p_{i}}\in P\) have answered each policy statement \(q_{k} \in Q\).

Based on their answers, every political party or user can be represented in a vector space model:

$$\begin{aligned}& \mathbf {x_{j}}=\{x_{(j,1)},x_{(j,2)}, \ldots,x_{(j,k)},\ldots ,x_{(j,U)}\} \end{aligned}$$
$$\begin{aligned}& \mathbf {p_{i}}=\{p_{(i,1)},p_{(i,2)},\ldots,p_{(i,k)}, \ldots,p_{(i,U)}\}, \end{aligned}$$

where \(x_{(j,k)}, p_{(i,k)} \in L\) are the answers of the j-th user and i-th party, respectively, to the k-th policy statement. The vectors \(\mathbf {x_{j}}\) and \(\mathbf {p_{i}}\) are usually named user and party profiles, respectively.

A typical set of answers is a 6-point Likert scale: \(L=\{1\ \text{(Completely disagree)}, 2\ \text{(Disagree)}, 3\ \text{(Neither agree nor disagree)}, 4\ \text{(Agree)}, 5\ \text{(Completely agree)}, 6\ \text{(No opinion)}\}\). In several cases, and in the majority of SVAA methods proposed so far, the sixth point is not taken into consideration since it does not correspond to a particular stance, and it is usually replaced with the third point (Neither agree nor disagree). In this work we keep the sixth point as a distinct emission symbol but not as a distinct state (see also Section 3). As a result the set L, in the context of this work, becomes: \(L=\{1,2,3,4,5,6\}\). Figure 1 shows an example of how the answer choices appear to VAA users.

The VAA recommendation task tries to approximate the unknown relevance \(h(j,i)\) of user j to party i given the user and party answers \(\mathbf {x_{j}}\) and \(\mathbf {p_{i}}\) respectively, and then to suggest a ranking of political parties based on user-party relevance. In machine learning terms, the task is to approximate the hidden function \(h(j,i)\) with a function \(\hat{h}:\mathbb{R}^{U} \times\mathbb{R}^{U} \rightarrow\mathbb {R}\), where \(\hat{h}(\mathbf {x_{j}},\mathbf {p_{i}})\) is the estimate of the relevance of user j to political party i. Typically \(\hat{h}(\mathbf {x},\mathbf {p})\in[0,1]\). In any case, the top suggestion \(p_{q}^{j}\) for user j should be:

$$ p_{q}^{j} = \underbrace{\operatorname {argmax}}_{i} \bigl( \hat{h}(\mathbf {x_{j}},\mathbf {p_{i}}) \bigr). $$
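As a minimal sketch of this ranking rule, the snippet below scores each party profile against a user profile and sorts the parties by decreasing score. The relevance function used here (one minus a normalised city-block distance) is an assumption chosen only for illustration, since the text does not fix a particular \(\hat{h}\); the sketch also assumes answers on the five-point part of the scale (1 to 5), and the names `similarity` and `rank_parties` are hypothetical.

```python
def similarity(user, party, max_dist_per_item=4):
    """Hypothetical h-hat: 1 minus the normalised city-block distance, in [0, 1].

    Assumes answers coded 1..5, so two answers differ by at most 4.
    """
    total = sum(abs(u - p) for u, p in zip(user, party))
    return 1.0 - total / (max_dist_per_item * len(user))


def rank_parties(user, parties):
    """Return party ids sorted by decreasing similarity to the user profile."""
    scores = {pid: similarity(user, profile) for pid, profile in parties.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy party profiles over four policy statements (illustrative values only).
parties = {"A": [1, 2, 5, 4], "B": [5, 4, 1, 2]}
user = [1, 1, 5, 5]
ranking = rank_parties(user, parties)  # the first element is the top suggestion
```

The first element of `ranking` plays the role of the argmax in the equation above; the remaining elements give the full ranking shown to the user.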

One of the supplementary (opt-in) questions (see Figure 3) that many VAAs ask users to answer, in addition to the U policy statements, is the vote intention of the user, i.e., which party the user intends to vote for in the upcoming election. SVAAs usually use the vote intention variable \(vi_{j}\) to cluster VAA users into parties a priori, i.e., party supporters as indicated by the vote intention variable are clustered together. Then statistical or machine learning approaches are used to create party models. Thus, for every party i a model \(\mathbf {M_{i}}\) is created using as training examples the subset \(\mathbf {Tr_{i}}\) of profiles of users who expressed voting intention for party i, that is \(\mathbf {Tr_{i}} = [\mathbf {x_{j}}|vi_{j}=i]\). These models can then be exploited to provide a recommendation based on collaborative filtering [16] that takes advantage of the VAA’s user community. In this case the top recommendation \(p_{q}^{j}\) for user j is given by:

$$ p_{q}^{j} = \underbrace{\operatorname {argmax}}_{i} \bigl( \hat{h}(\mathbf {x_{j}},\mathbf {M_{i}}) \bigr). $$
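The construction of the training subsets \(\mathbf {Tr_{i}}\) can be sketched as a simple grouping of user profiles by vote intention. This is an illustrative sketch only: the function name `build_training_subsets` and the use of `None` for users without a stated party preference are assumptions, not part of the original method description.

```python
from collections import defaultdict


def build_training_subsets(profiles, vote_intentions):
    """Group user profiles by stated vote intention (one subset per party).

    Users whose vote intention is None (assumed encoding for 'no party
    stated') are left out, mirroring the a-priori clustering described above.
    """
    subsets = defaultdict(list)
    for profile, vi in zip(profiles, vote_intentions):
        if vi is None:
            continue
        subsets[vi].append(profile)
    return dict(subsets)


# Toy data: four users, three policy statements, two parties.
profiles = [[1, 2, 5], [2, 2, 4], [5, 4, 1], [3, 3, 3]]
vote_intentions = [1, 1, 2, None]
tr = build_training_subsets(profiles, vote_intentions)
```

Each value of `tr` would then serve as the training data for one party model.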

In this work we use Hidden Markov Models to create the party models \(\mathbf {M_{i}}\). Thus, Eq. 4 becomes:

$$ p_{q}^{j} = \underbrace{\operatorname {argmax}}_{i} \bigl(\hat{h} \bigl(V^{j},\lambda_{i}\bigr)\bigr), $$

where \(V^{j}\) is the set of observations corresponding to user profile \(\mathbf {x_{j}}\) and \(\lambda_{i}\) is the HMM created for party i (see Section 3). The solution of Eq. 5 is obtained with the aid of the Viterbi algorithm, as is usual in HMM classifiers [13].

An HMM is a double stochastic process that models data evolving in time. It is defined by a latent Markov chain, which consists of a finite number of states, and a number of observation probability distributions for each state. At each discrete time instant, the system switches from one state to another, while an observation is produced by the probability distribution according to the current state [17]. In an HMM, the states are not observable (they are ‘hidden’), but an observation is generated as a probabilistic function of the state, when the system visits the state [13, 18].

An HMM is described by three parameters: \(\lambda= (A, B, \pi) \) (see more details in Section 3), which can be estimated with specialised Expectation Maximisation techniques, such as the Baum-Welch algorithm [19]. These parameters are calculated through several training iterations, each using the entire training data set, until an objective function is maximised. To avoid knowledge corruption, the data should be stored in memory and training should start from scratch at each iteration; this is costly and time consuming. Therefore, in real life, the datasets for training HMMs are often limited, and this can significantly reduce their performance, since it heavily depends on the availability of a sufficient quantity of representative training data to calculate the model’s parameters [17].

HMMs have not been used in SVAAs so far; this is probably due to the fact that within a VAA the observations corresponding to a user’s answer choices are not time dependent. However, as we already mentioned, user answer choices can be considered as a sequence of correlated observations, while the answer options (‘Completely disagree’, ‘Disagree’, ‘Neither agree nor disagree’, ‘Agree’, ‘Completely agree’) can be used as the HMM states. Under these circumstances HMMs can be applied to VAAs, as we have a sufficient number of states and a fairly rich set of data.

3 Methodology

An HMM is characterised by the following ([13, 18]):

  • A set of W discrete states \(S = \{S_{1}, S_{2}, S_{3},\ldots, S_{W}\}\), with \(G = g_{1}g_{2}\ldots g_{T}\) being the state sequence (i.e., \(g_{t}=S_{i}\) means that at time t the system is in state \(S_{i}\)).

  • A set of E observations \(V = \{v_{1}, v_{2}, v_{3},\ldots,v_{E}\}\), with \(O = O_{1}O_{2}\ldots O_{T}\) being the sequence of observations corresponding to states G.

  • A state transition matrix A, that shows the probability of going from state \(S_{i}\) to state \(S_{j}\): \(A\equiv[a_{ij}]\) where \(a_{ij}\equiv P (g_{t + 1} = S_{j} | g_{t} = S_{i})\).

  • An observation emission matrix B, that describes the probability of observing \(v_{e}\) in state \(S_{j}\): \(B\equiv[b_{j} (e)]\) where \(b_{j} (e) \equiv P (O_{t} = v_{e} | g_{t} = S_{j})\).

  • The probability distribution of being in the first state of a sequence: \(\pi\equiv[\pi_{i}]\) where \(\pi_{i} \equiv P(g_{1} = S_{i})\).
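To make the role of the parameters listed above concrete, the following sketch scores an observation sequence against several models \(\lambda = (A, B, \pi)\) and picks the best-scoring one. It assumes that the score is the log-probability of the most likely state path (the standard Viterbi recursion), that observations are 0-based symbol indices, and that each model is stored as a plain `(A, B, pi)` tuple of nested lists; these representation choices and the function names are assumptions made for illustration.

```python
import math


def safe_log(x):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(x) if x > 0 else float("-inf")


def viterbi_log_score(obs, A, B, pi):
    """Log-probability of the most likely state path for `obs`.

    A[r][s]: transition probability from state r to state s.
    B[s][o]: probability of emitting symbol o in state s.
    pi[s]:   probability of starting in state s.
    """
    n_states = len(pi)
    delta = [safe_log(pi[s]) + safe_log(B[s][obs[0]]) for s in range(n_states)]
    for o in obs[1:]:
        delta = [
            max(delta[r] + safe_log(A[r][s]) for r in range(n_states))
            + safe_log(B[s][o])
            for s in range(n_states)
        ]
    return max(delta)


def classify(obs, models):
    """Return the best-scoring model key and all scores."""
    scores = {key: viterbi_log_score(obs, *lam) for key, lam in models.items()}
    return max(scores, key=scores.get), scores


# Toy models with two states and two symbols (illustrative values only).
A = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "p1": (A, [[0.8, 0.2], [0.2, 0.8]], [0.5, 0.5]),
    "p2": (A, [[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5]),
}
best, scores = classify([0, 0, 0], models)
```

Working in log space avoids numerical underflow for sequences as long as a full questionnaire.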

In our implementation we consider three states of the HMMs, i.e., \(W=3\), \(S=\{S_{1}, S_{2}, S_{3}\}\), labelled as \(S_{1}\): ‘Negative’, \(S_{2}\): ‘Neutral’ and \(S_{3}\): ‘Positive’, corresponding to answer choices \(S_{1}\): (Completely disagree, Disagree), \(S_{2}\): (Neither agree nor disagree, I have no opinion), and \(S_{3}\): (Agree, Completely agree) that could be given in the U policy statements of the VAA questionnaire. We chose to group the answer choices because respondents, who are asked to think along multiple dimensions, often find it difficult to separate direction (agree/disagree) from intensity (completely). In addition, VAA users tend to avoid taking ‘extreme’ positions on the Likert scale; as a result, differences between ‘Completely Agree’ and ‘Agree’, and between ‘Completely Disagree’ and ‘Disagree’, create noise rather than a more subtle classification. This phenomenon can lead to measurement contamination [20]. Finally, during initial experimentation we observed that HMMs can be created more easily, in terms of both efficiency (training time) and effectiveness (performance), if only three states are considered.
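The grouping of the six answer choices into the three states can be written down as a small lookup table. This is only a sketch of the grouping itself, using the numeric answer codes of Section 2; the names `STATE_OF_ANSWER` and `to_states` are hypothetical, and how the grouping enters model training is not specified by this snippet.

```python
# Answer codes follow the set L of Section 2:
# 1 Completely disagree, 2 Disagree, 3 Neither agree nor disagree,
# 4 Agree, 5 Completely agree, 6 No opinion.
STATE_OF_ANSWER = {
    1: "Negative",
    2: "Negative",
    3: "Neutral",
    6: "Neutral",
    4: "Positive",
    5: "Positive",
}


def to_states(profile):
    """Map a sequence of answer codes to the three grouped state labels."""
    return [STATE_OF_ANSWER[answer] for answer in profile]


states = to_states([1, 6, 4, 3, 5, 2])
```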

Every state sequence G has length equal to the number of policy statements, i.e., \(T=U=30\), while the mapping from a user profile \(\mathbf {x_{j}}\) (see also Eq. 1) to an emission sequence \(V^{j} = \{ v_{1}^{j}, v_{2}^{j}, v_{3}^{j},\ldots,v_{E}^{j}\}\) is obtained as follows:

$$ v_{q}^{j} = x_{(j,q)}+\vert L\vert \cdot(q-1), $$

where \(x_{(j,q)}\) is the answer choice of user j to policy statement q (\(q = 1, \ldots, E\)), L is the set of answer options (see also Section 2) and \(\vert L\vert \) is its cardinality (the number of answer options in the policy statements, in our case \(\vert L\vert =6\)). For instance, if a VAA user selected ‘Completely disagree’ in the 1st policy statement, then the observation recorded in the 1st place of the voter’s answer sequence would be \(1+6\cdot(1-1)=1\); whereas if the answer choice in the 23rd policy statement was ‘Agree’, then the observation \(4+6\cdot(23-1)=136\) would be registered in the 23rd place of the \(V^{j}\) sequence.
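The mapping above is easy to check in code. The sketch below applies it to a full 30-statement profile and reproduces the two worked values from the text; the function name `to_emissions` and the toy profile are illustrative assumptions.

```python
NUM_OPTIONS = 6  # |L|, the number of answer options per policy statement


def to_emissions(profile, num_options=NUM_OPTIONS):
    """Map answers x_(j,q) to emission symbols x_(j,q) + |L| * (q - 1).

    `profile` lists the answer codes in questionnaire order, so q is the
    1-based position of each policy statement.
    """
    return [
        answer + num_options * (q - 1)
        for q, answer in enumerate(profile, start=1)
    ]


# Toy profile: 'Completely disagree' (1) on statement 1, 'Agree' (4) on
# statement 23, and 'Neither agree nor disagree' (3) everywhere else.
profile = [1] + [3] * 21 + [4] + [3] * 7
emissions = to_emissions(profile)
```

Because each policy statement gets its own block of six symbols, the same answer code given to two different statements always yields two different emission symbols.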

As already mentioned, an HMM is fully described by three parameters: \(\lambda= (A, B, \pi) \). In this work we consider that the users of each party can be modelled by an HMM \(\lambda_{i}\). The way VAA users respond to the first policy statement differs among users who support different parties, which is reflected in different \(\pi_{i}\); the same holds for any other policy statement, which is reflected in different \(B_{i}\); and the way answer choices are given in two consecutive policy statements also varies among different party supporters, which is reflected in different \(A_{i}\).

4 Related work

4.1 Recommender systems in politics

Recommender Systems (RSs) are software tools and techniques that recommend products or services to users, in an effort to help them decide what they really need from the sheer volume of data that many modern online applications manage. Focusing on the problem of information overload [21], these systems are widely used in e-commerce [8, 9], where they suggest products for consumers to buy, as well as in e-government, e-business, e-library, e-learning, e-tourism, e-resource services and e-group activities [22].

E-government is a way to use the combination of information technology, structural changes and new skills in public administration to improve the quality of public services, reinforce the democratic process and support community objectives [23]. Teran et al. [24] used a fuzzy recommender system for e-elections in e-government to inform citizens about candidates and enhance their participation in democratic processes. They introduced the fuzzy clustering analysis, which provides a graphical representation of political parties distributed in clusters, so as to give the opportunity to citizens to examine the behaviour of candidates and find similarities among them.

Dyczkowski and Stachowiak [25] presented a content-based recommender system for elections that suggests a candidate to a voter according to intuitionistic fuzzy (IF)-set theory. They found that IF-set theory can sufficiently model incomplete knowledge about the political positions of a candidate and operate successfully on that information. They also developed a Web application that is intended to be a universal platform for creating recommender systems for elections.

4.2 SVAA methods

Researchers from different research fields deal with many aspects of VAAs [26]. Some of them are concerned with whether VAAs urge citizens to vote and whether the recommendation made by these systems affects the final vote decision [3, 4]. Others are interested in the design of VAAs, and especially in party-user similarity estimation methods that can be adopted to predict voting intention [27–30].

Katakis et al. [6] noticed that voters often do not agree with the political positions of the party that they intend to vote for, but support it because they are influenced by family, friends and community. Inspired by this community influence, they proposed an alternative matching technique in VAAs that compares the answer sequence of a user (i.e., their profile \(\mathbf {x_{j}}\)) to those of other VAA users to find similar ones. The recommendation is then given based on the distribution of voting intentions of the similar users. Their rationale was that VAAs are, in essence, recommendation tools applied in e-politics scenarios, so collaborative filtering approaches could easily be adopted. Based on this conceptualisation they presented the so-called Social VAA (SVAA), which proved to make better voting predictions than the traditional matching schemes between users’ and parties’ profiles. In their paper they resorted to clustering and classification approaches for generating vote advice in SVAAs. They showed that party-supporters modelling based on data mining classifiers and Support Vector Machines achieves the best performance.

Tsapatsoulis and Mendez [31] dealt with building party models for SVAAs based not on voting intention but on the probability of voting for each of the parties participating in the German elections in 2013. They compared a Mahalanobis Classifier, a Weighted Mahalanobis Classifier and function approximation approaches (i.e., regression), and they concluded that there is not much gain when using the probability to vote instead of the vote intention. Among the compared methods, they noticed that non-linear party modelling techniques, such as neural network based ones, outperform linear methods like Mahalanobis.

Tsapatsoulis et al. [29], in an effort to provide practical design guidelines for SVAAs, dealt with the problem of finding the minimum number of VAA users required to build effective party models. They limited their analysis to the Mahalanobis Classifier to minimise the factors influencing their research questions. They found that, as the number of parties modelled increases, the performance of recommendation decreases in terms of the Mean Average Precision (MAP) [32] and F-measure [33]. In addition, they showed that effective party-supporters models can be built from a rather small number of user profiles.

4.3 HMM for similar problems

When there is a sufficient number of hidden states with a rich class of observation distributions, the HMMs can accurately represent probability distributions in complex real world problems and create simple and compact models [34]. Thus there are various applications of HMMs in different research areas such as in diverse sequence recognition tasks [13], in the design of automatic speech recognition systems [18, 35], in natural language processing [36], in online character recognition of handwriting [13] and signature verification [37], in bioinformatics [38], as well as in automatic translation tasks [39].

Netzer et al. [40] referred to an HMM that relates latent relationship states to the observed buying behaviour of a customer, in the context of customer relationship management. The proposed HMM enables the company to update the customer profile over time by tracking the evolution of customer relationships. It also makes it possible to model long-term purchasing behaviour by recognising the marketing activities that are preferred in building customer-brand relationships, and it predicts future choices better than other benchmark models.

Sahoo et al. [41] assumed that the user’s behaviour changes with time and proposed HMMs to correctly interpret the blog reading behaviour of users and make personalised article recommendations. They found that the proposed method leads to better article recommendation than the existing recommender systems do. Sahebi et al. [42] applied HMMs and other clustering algorithms in Web usage-based recommender systems. By comparing their performance, they showed that HMMs outperformed the other algorithms.

Although there is enough evidence about the appropriateness of HMM classifiers for SVAA recommendation, they have not been applied so far, probably because there are simpler machine learning techniques that can be used instead. However, we strongly believe that HMMs have an advantage compared to those methods: they can capture the correlation between answers in different policy statements.

5 Experimental results

5.1 Datasets

In this paper, experiments were conducted to measure the performance of voting prediction, i.e., the accuracy of predicting the vote intention of the users, by applying an HMM classifier to VAA data (filled-in questionnaires). Comparisons with the traditional VAA party-user matching method and other party modelling techniques were also made. Three datasets derived from EUVox 2014, corresponding to the electorates of three different countries (Cyprus, Germany and Greece), were used in the experiments. EUVox is an EU-wide VAA that was designed for the 2014 elections to the European Parliament. Its questionnaire consists of 30 policy statements and is based on European-wide issues, issues that are salient for citizens in a particular region, and country-specific issues. All policy statements were devised using the same criteria across all cases, which ensures that users can find many issues that are relevant to their daily lives while the policy statements capture both the supranational and the national dynamics of party competition. The policy statements are clustered into three groups according to the main issue to which they refer: (a) European Union, (b) Economy and (c) Society.

To be able to calculate the performance of the voting prediction, we took into consideration only the users who expressed vote intention for a specific party. Therefore, the questionnaires of the users who did not answer the supplementary question on voting intention, or who answered either ‘not decided yet’ or ‘I will not vote’, were excluded. Approximately 40% of the VAA users expressed voting intention for a specific party. The main characteristics of the filtered datasets are summarised in Table 1. For the evaluation, we randomly divided the users in each of the datasets into a training set and a test set [43] in a 60:40 proportion. Figure 5 presents the distribution of samples per party in the training sets of the three datasets.
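A random 60:40 split of this kind can be sketched as follows. The fixed seed, the rounding rule for the cut point and the function name `split_users` are assumptions for illustration; the text does not state how the random division was implemented.

```python
import random


def split_users(users, train_fraction=0.6, seed=0):
    """Shuffle a copy of `users` and split it into (training, test) lists."""
    shuffled = list(users)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


users = list(range(10))  # stand-in for (profile, vote intention) pairs
train, test = split_users(users)
```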

Figure 5: Distribution of samples per party in the training set for (a) Cypriot dataset, (b) German dataset, (c) Greek dataset.

Table 1 Datasets’ characteristics

As training set \(\mathit {Tr}= \{ (\mathbf {x}_{j},vi_{j} ) | j =1,\ldots ,N_{l},vi_{j} \neq\emptyset\} \) we set the vectors \(\mathbf {x}_{j}\) corresponding to users’ answers to the online questionnaire, along with the corresponding vote intention \(vi_{j}\) that refers to the party number. After training, the created HMM classifier (i.e., the set of party models) was used to predict the vote intention of the users in the test set \(\mathit {Te}= \{ (\mathbf {x}_{t},vi_{t} ) | (\mathbf {x}_{t},vi_{t} ) \notin \mathit {Tr}, t =1,\ldots ,N_{t},vi_{t} \neq\emptyset\}\), which is a set of vector and vote intention pairs \((\mathbf {x}_{t}, vi_{t})\) not used in the training set.

The datasets were chosen so as to differ in size. The Cypriot dataset has few samples, since fewer than 2,000 of its users properly filled in the online questionnaire and also expressed their vote intention. The Greek dataset is approximately 13 times bigger than the Cypriot dataset. The German dataset is only slightly larger than the Cypriot one, but it was preferred because it is characterised by a rather smooth distribution of samples per party, which is not the case in the Greek and Cypriot datasets. Furthermore, both the number of parties and the population’s distribution percentage among the parties vary across the selected datasets. These differences helped us to examine the behaviour of HMMs when the number of data points per party is insufficient and when the number of samples varies among parties.

5.2 Evaluation framework

To evaluate the voting prediction performance of HMMs, we resort to well-known measures defined in the field of information retrieval [33]. Specifically Precision, Recall, F-measure [44] and Mean Average Precision [33] are computed for all users, who support a particular party and a weighted average is calculated. The Appendix provides a clear definition of these metrics in the context of the current work.
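For orientation, a per-party computation of Precision, Recall and F-measure followed by a weighted average can be sketched as below. The sketch assumes the standard information-retrieval definitions and weights each party by its share of users in the evaluated set; the exact definitions used in this work are those of the Appendix, which is not reproduced here, and Mean Average Precision is omitted. The function name `weighted_prf` is hypothetical.

```python
from collections import Counter


def weighted_prf(true_labels, predicted_labels):
    """Support-weighted average of per-party precision, recall and F-measure."""
    support = Counter(true_labels)
    n = len(true_labels)
    pairs = list(zip(true_labels, predicted_labels))
    w_precision = w_recall = w_f = 0.0
    for party, count in support.items():
        tp = sum(1 for t, p in pairs if t == party and p == party)
        n_predicted = sum(1 for p in predicted_labels if p == party)
        precision = tp / n_predicted if n_predicted else 0.0
        recall = tp / count
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        weight = count / n  # share of users who support this party
        w_precision += weight * precision
        w_recall += weight * recall
        w_f += weight * f_measure
    return w_precision, w_recall, w_f


# Toy example: three supporters of party 1, one supporter of party 2.
true_vi = [1, 1, 1, 2]
predicted_vi = [1, 1, 2, 2]
wp, wr, wf = weighted_prf(true_vi, predicted_vi)
```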

5.3 Results and discussion

Experiments were designed to investigate the performance of social voting recommendation using HMMs for similarity matching between parties and users in VAAs. The HMMs were trained with the aid of the HMM Toolbox for Matlab, which was built by Kevin Murphy in 1998 and uses the Baum-Welch algorithm for estimating the parameters of HMMs with discrete outputs [45]. We created an HMM \(\lambda_{i} = (A_{i}, B_{i}, \pi_{i})\) for every party in each of the datasets. Thus, we ended up with seven HMMs for each of the Cypriot and German datasets and nine HMMs for the Greek dataset.

The experimental process adopted for each of the datasets involved the following steps. First, the parameters of the party models were initialised by random guess. Then the algorithm updated the parameters iteratively until convergence, using the training set Tr. After training the party models, the created HMM classifier was applied to the test set Te to classify unseen users into the most probable party class, i.e., if the user’s answer pattern was most likely to fit the i-th party model, then the user was classified into party \(p_{i}\). Finally, to examine the voting prediction performance of HMMs, the real vote intention of each user in the test set was compared to their predicted vote intention (the id of the party into which they were classified) and an overall score of how well the algorithm performed was calculated. For that purpose, Precision, Recall and F-measure were computed for all users in a particular party and then a total weighted average was estimated. Tables 2-4 show the results for each party of the Cypriot, German and Greek datasets, respectively.
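The first step of this process, random initialisation of \(\lambda = (A, B, \pi)\), can be sketched in a few lines. This is a generic illustration, not the toolbox routine used in the experiments: the function names, the seed, and the choice of drawing uniform random numbers and normalising each row are assumptions. The symbol count of 180 follows from the emission mapping of Section 3 (6 answer options times 30 policy statements).

```python
import random


def random_stochastic_rows(n_rows, n_cols, rng):
    """Rows of positive random numbers, each normalised to sum to 1."""
    rows = []
    for _ in range(n_rows):
        raw = [rng.random() + 1e-9 for _ in range(n_cols)]  # avoid exact zeros
        total = sum(raw)
        rows.append([value / total for value in raw])
    return rows


def init_hmm(n_states, n_symbols, seed=0):
    """Randomly initialised (A, B, pi) for a discrete-output HMM."""
    rng = random.Random(seed)
    A = random_stochastic_rows(n_states, n_states, rng)   # transitions
    B = random_stochastic_rows(n_states, n_symbols, rng)  # emissions
    pi = random_stochastic_rows(1, n_states, rng)[0]      # initial states
    return A, B, pi


A0, B0, pi0 = init_hmm(n_states=3, n_symbols=180)
```

An iterative re-estimation procedure such as Baum-Welch would then start from these values and update them on the training subset of each party.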

Table 2 The results of HMMs for each party - Cypriot dataset

Additional experiments were conducted to compare the performance of the HMM classifier with the traditional party-user matching method of VAAs and with other classifiers that have been applied for similarity matching between parties and users in SVAAs. Tables 5-7 show the total weighted averages for Precision, Recall and F-measure, as well as the Mean Average Precision (MAP), obtained by applying various algorithms to each of the datasets. The traditional VAA method of voting recommendation is referred to as ‘Party Coding’, while KNN refers to k-nearest neighbour classification [44, 46]. The aggregate results of HMMs obtained in the German and Cypriot datasets are better than those obtained in the Greek dataset. Overall, the HMM classifier achieved better performance than the other applied methods.

The HMM classifier achieved a very good prediction performance for the Cypriot dataset. In particular, it performed extremely well on the first two parties, which account for the largest share of users (see Figure 5(a)) and whose users appear to have consistent answer patterns. The low performance in vote prediction for the users of small parties is mainly due to an insufficient number of samples (see the results in Table 2). However, even though the first two parties hold the majority of users, these users are fewer than the \(N = 20U\) samples (U is the number of policy statements in the questionnaire) that Tsapatsoulis et al. [29] mentioned as the required number for training party models when the Mahalanobis classifier is used. This suggests that HMMs can be effectively trained even with few training samples, when these samples form a single cluster in the U-dimensional hyperspace.

The fairly even performance across parties in the German dataset, as can be seen in Table 3, is due to the even distribution of samples per party (see Figure 5(b)) along with the homogeneity of the answer patterns among the users in each party. Even so, the prediction performance for the sixth party, which holds the majority of the users, exceeds that of the others. Conversely, the results for the seventh party, which has the smallest share of samples in the training set, are the worst.

Table 3 The results of HMMs for each party - German dataset

The vote prediction performance of HMMs for the Greek dataset is mixed (see Table 4) and varies significantly among parties. Once again the HMM for the party with the highest number of users, i.e., the second party, achieved the best score. The inaccurate results for the small parties are caused mainly by an insufficient number of samples. Nevertheless, there are parties with fewer samples, such as the fifth and sixth, whose HMMs performed better than those of parties with more samples, such as the first and eighth. A careful examination of these cases in Table 4 shows that a low number of samples is reflected in unbalanced recall and precision scores, which in turn lead to low F-scores, while the poor performance for the other parties is possibly due to non-homogeneity of user profiles, which leads to low scores in both recall and precision. Non-homogeneity among the users of a party occurs for various reasons, such as different political backgrounds and different views on the various categories of policy statements. For instance, the supporters of the same party might share a common view on the economy but hold totally different views on EU policy issues. As we explain later in the Conclusion section, within-party clusters can be investigated separately by modelling the data from each specific cluster through a Gaussian distribution and then generating a mixture of Gaussians that takes into account the ratio of each source [47, 48]. It is known that whenever the data distribution is asymmetric and multi-modal, a mixture of Gaussians can be used to model it [35].

Table 4 The results of HMMs for each party - Greek dataset

The overall performance of the HMM classifier in predicting vote intention in SVAAs is very satisfactory (see the aggregate results of HMMs obtained in each of the datasets in Tables 5-7). Thus, the use of HMMs, which are based on the conditional probabilities of the VAA users’ answers, seems to be quite effective. This was expected, since the policy statements in VAA questionnaires are usually correlated and grouped into categories representing specific political issues. Therefore the answer choice to each policy statement can be ‘predicted’ from previous answer choices. Moreover, the policy statements are answered in a specific display order, from the first to the last one, which is kept constant for a specific VAA and thus creates sequences of symbols. Users who support the same party are likely to create similar sequences of choices to policy statements (answer patterns), since they commonly share the same political opinions. So, an HMM classifier, by recognising the answer patterns of users, who are grouped by party according to their voting intention, is able to create simple and compact models for each party and to make quite good predictions on unseen data. We noticed, however, that imperfect modelling might occur due to an insufficient number of samples in a party or because of low or no coherence between the profiles of users who are classified into the same party. Even so, the inaccurate results for small parties do not critically affect the design of social recommendation, something that was also reported by Tsapatsoulis et al. [29].
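To make the idea of ‘predicting’ an answer from the previous one concrete, the sketch below estimates the conditional probability of each answer given the preceding answer from a set of answer sequences. It is a plain first-order Markov chain over the observed answers, which is simpler than an HMM and is shown only as an illustration; the function name `transition_table` and the additive smoothing constant are assumptions of this sketch.

```python
import numpy as np


def transition_table(sequences, n_symbols, smoothing=1.0):
    """Estimate P(current answer | previous answer) from answer sequences.

    Row r of the result is the distribution of the next answer given that
    the previous answer was r. Additive smoothing (an assumption of this
    sketch) keeps unseen transitions from getting probability zero.
    """
    counts = np.full((n_symbols, n_symbols), float(smoothing))
    for seq in sequences:
        for prev, cur in zip(seq[:-1], seq[1:]):
            counts[prev, cur] += 1
    return counts / counts.sum(axis=1, keepdims=True)


# Toy answer patterns of two hypothetical supporters of the same party,
# with answers encoded as 0 = disagree and 1 = agree.
table = transition_table([[0, 0, 1], [0, 0, 0]], n_symbols=2)
```

An HMM generalises this picture by letting the correlations pass through hidden states rather than directly between consecutive answers.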

Table 5 The aggregate results of party models by applying various algorithms to the Cypriot dataset
Table 6 The aggregate results of party models by applying various algorithms to the German dataset
Table 7 The aggregate results of party models by applying various algorithms to the Greek dataset

By applying HMMs to SVAAs we found that the performance of the HMM classifier is closest to that of the Mahalanobis classifier (see the aggregate results in Tables 5-7). Overall, however, the HMM classifier achieved better performance than the Mahalanobis classifier and the other machine learning algorithms applied. In almost all cases machine learning techniques outperform the traditional VAA party-user matching method; this agrees with the conclusions of previous studies (see Agathokleous et al. [10], Katakis et al. [6], Tsapatsoulis and Mendez [31], Tsapatsoulis et al. [29]). Finally, Table 7 shows that k-nearest neighbour classification (KNN) has better Recall, F-measure and MAP scores than the HMM classifier in the Greek dataset. The KNN classifier finds the k users in the training set whose answer patterns are nearest to the answer pattern of the user in question. It then assigns to this user the id of the party that has the smallest expected misclassification cost among the parties of the k nearest users [13]. In our case we considered the three nearest users (i.e., \(k=3\)) and used the Euclidean distance metric. The good results of the KNN classifier further reinforce the earlier remark about the multi-modal distribution of user profiles within the same party in the Greek dataset, since KNN takes into account only the nearest users and not all the users in a party group.
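The KNN rule described above can be sketched as follows. This is a minimal illustration that assumes uniform misclassification costs, in which case choosing the party with the smallest expected misclassification cost reduces to a majority vote among the k nearest users. The function name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter

import numpy as np


def knn_predict(user_answers, train_answers, train_parties, k=3):
    """Assign the party that is most frequent among the k training users
    whose answer patterns are closest in Euclidean distance.

    With uniform misclassification costs (an assumption of this sketch),
    this is equivalent to picking the minimum expected cost party.
    Ties are resolved in favour of the party of the nearer user.
    """
    train = np.asarray(train_answers, dtype=float)
    query = np.asarray(user_answers, dtype=float)
    distances = np.linalg.norm(train - query, axis=1)
    nearest = np.argsort(distances, kind="stable")[:k]
    votes = Counter(train_parties[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy training set: answers to three policy statements on a 1-5 scale.
train_answers = [[1, 1, 1], [1, 2, 1], [2, 1, 1], [5, 5, 5], [5, 4, 5]]
train_parties = ["p1", "p1", "p1", "p2", "p2"]

predicted_party = knn_predict([5, 5, 4], train_answers, train_parties, k=3)
```

Because only the k nearest users vote, a party whose supporters form several separate clusters can still be recovered, as long as the user in question falls close to one of them.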

6 Conclusion

This study was conducted to investigate whether HMMs could improve the effectiveness of social voting recommendation. We built on the idea that, while users are answering the VAA policy statements, they incrementally produce sequences of observations (answer patterns) that might characterise ‘typical’ voters of particular parties; thus, an HMM classifier, with its ability to capture correlations in symbol sequences, would be beneficial.

The performance of the HMM classifier in SVAAs, according to Recall, Precision and F-measure, is quite promising. We observed that, even if the order in which policy statements are displayed in VAAs does not actually matter, HMMs perform very well in estimating the vote intention of users by taking into account the intra-sequence correlations. This is not a surprise, as SVAAs are based on party-supporters models and the HMM classifier creates simple and compact models by identifying the ‘path’ that the users of each party follow when answering the online questionnaire. Also, the policy statements in VAAs are grouped according to the issue category that they represent. Statements that refer to the same subject are correlated and are evaluated in a similar way by users with similar political views. Therefore, the answer given to a policy statement depends on what was observed for a previous statement from the same subject category. By estimating the conditional probability of an answer given an answer that has already been observed, HMMs can effectively provide vote recommendation.

From our experiments we found that the HMM classifier outperforms the traditional party-coding recommendation method and other party-supporters modelling methods. In addition, we noticed that the prediction performance of HMMs depends on the consistency between the answers of the users in each party and on the distribution of samples per party. In general, the parties with the majority of users achieved the best performance in all three datasets, even in the case of the Cypriot dataset, where the first two parties had the largest share of samples in the training set but the number of these samples is small. This led us to the observation that HMMs can be effectively trained even with few training samples, when these samples form a single cluster in the policy statements hyperspace. In cases where the profiles of party supporters form multiple clusters in the policy statements hyperspace, due to different political backgrounds and different views on the various categories of policy statements, the results tend to be poor. Under these circumstances, the use of a mixture of Gaussians [35, 47] or different clustering techniques [46, 49, 50] could be beneficial. In the near future we plan to tackle this problem by using HMMs built per party and per category of policy statements. Thus, a combination of HMMs for party-supporters modelling will be pursued to account for the multi-modal distribution of VAA user profiles within the same party.

7 Declarations

List of abbreviations: This list shows the abbreviations in the order in which they appear in the text:

Voting Advice Application (VAA)
Hidden Markov Model (HMM)
Information and Communications Technologies (ICTs)
European Union (EU)
Social Voting Advice Application (SVAA)
Recommender Systems
Intuitionistic Fuzzy set
Mean Average Precision (MAP)
k-nearest neighbour (KNN)