Retrieve the Hidden Leaves in the Forest: Prevent Voting Spamming in Zhihu

Zhang, Jun; Labiod, Houda

doi:10.1007/978-981-15-0758-8_13

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1095)

Included in the following conference series:

International Symposium on Security and Privacy in Social Networks and Big Data

Abstract

More and more people post their opinions on online social networks, such as commercial product evaluation websites, forums, and crowdsourced Q&A websites. In practice, most majority-vote schemes cannot reveal the true distribution of opinions because of spam. Public relations companies can recruit people or use automatic commenting machines to promote target products and ruin the reputation of their opponents, so the opinions on these websites may not be reliable. Many studies in the literature detect such spam based on the characteristics of posted content, social relationships, user activity, posting time, etc. We find that most spam detection schemes rely heavily on the experience and preferences of experts. This is dangerous, as it can lead to bias and dictatorship. In this work, we take Zhihu, a popular Chinese Q&A website, as a case study and propose a voting scheme based on time diversity to reduce the impact of voting spam. We illustrate that our proposed opinion-tolerant system can maintain a good balance in the appearance of different opinions.

Notes

1.
Where does a wise man hide a leaf? In the forest. But what does he do if there is no forest? He grows a forest to hide it in.
By G.K. Chesterton, The Innocence of Father Brown, [17].

References

1. Amazon. http://www.amazon.com/
2. Ebay. https://www.ebay.com/
3. Facebook. https://www.facebook.com/
4. How answers are ranked in Zhihu? https://zhuanlan.zhihu.com/p/19902495/
5. Imdb. https://www.imdb.com/
6. Is GMO harmful to health or environment? https://www.zhihu.com/question/64850604/
7. QQ International. http://www.imqq.com/English1033.html/
8. Quora. https://www.quora.com/
9. Stack overflow. http://stackoverflow.com/
10. Twitter. https://www.twitter.com/
11. Wikipedia. https://www.wikipedia.org/
12. Youtube. https://www.youtube.com/
13. Zhihu. http://www.zhihu.com/
14. Arrow, K.: A difficulty in the concept of social welfare. J. Polit. Econ. 58(4), 328–346 (1950)
15. Chen, C., Wu, K., Srinivasan, V., Zhang, X.: Battling the internet water army: detection of hidden paid posters. In: IEEE/ACM ASONAM, pp. 116–120 (2013)
16. Chen, Y., Chen, H.: Opinion spam detection in web forum: a real case study. In: WWW, pp. 173–183 (2015)
17. Chesterton, G.K.: The Innocence of Father Brown. John Lane Company (1911)
18. Danezis, G., Mittal, P.: SybilInfer: detecting sybil nodes using social networks. In: NDSS, pp. 1–15 (2009)
19. Ghosh, A., Kale, S., McAfee, P.: Who moderates the moderators? Crowdsourcing abuse detection in user-generated content. In: ACM EC, pp. 167–176 (2011)
20. Harris, C.G.: Detecting deceptive opinion spam using human computation. In: Workshops at AAAI on AI (2012)
21. Morris, M.R., Counts, S., Roseway, A., Hoff, A., Schwarz, J.: Tweeting is believing? Understanding microblog credibility perceptions. In: CSCW, pp. 441–450 (2012)
22. Shi, L., Yu, S., Lou, W., Hou, Y.T.: SybilShield: an agent-aided social network-based sybil defense among multiple communities. In: IEEE INFOCOM, pp. 1034–1042 (2013)
23. Thomas, K., McCoy, D., Grier, C., Kolcz, A., Paxson, V.: Trafficking fraudulent accounts: the role of the underground market in twitter spam and abuse. In: USENIX Security, pp. 195–210 (2013)
24. Tuomisto, H.: A consistent terminology for quantifying species diversity? Yes, it does exist. Oecologia 164(4), 853–860 (2010)
25. Wang, G., Konolige, T., Wilson, C., Wang, X., Zheng, H., Zhao, B.Y.: You are how you click: clickstream analysis for sybil detection. In: USENIX Security, pp. 241–256 (2013)
26. Wang, G., et al.: Social turing tests: crowdsourcing sybil detection. http://arxiv.org/pdf/1205.3856.pdf (2012)
27. Wang, G., et al.: Serf and turf: crowdturfing for fun and profit. In: ACM WWW, pp. 679–688 (2012)
28. Wei, W., Xu, F., Tan, C., Li, Q.: SybilDefender: defend against sybil attacks in large social networks. In: IEEE INFOCOM, pp. 1951–1959 (2012)
29. Wilson, E.B.: Probable inference, the law of succession, and statistical inference. J. Am. Stat. Assoc. 22(158), 209–212 (1927)
30. Yang, Z., Wilson, C., Wang, X., Gao, T., Zhao, B.Y., Dai, Y.: Uncovering social network sybils in the wild. ACM Trans. Knowl. Discov. Data 8(1), 1–29 (2014)
31. Yu, H., Gibbons, P.B., Kaminsky, M., Xiao, F.: SybilLimit: a near-optimal social network defense against sybil attacks. In: IEEE S&P, pp. 3–17 (2008)
32. Yu, H., Kaminsky, M., Gibbons, P.B., Flaxman, A.: Sybilguard: defending against sybil attacks via social networks. SIGCOMM 36(4), 267–278 (2006)
33. Zheng, X., Zeng, Z., Chen, Z., Yu, Y., Rong, C.: Detecting spammers on social networks. Neurocomputing 159, 27–34 (2015)

Author information

Authors and Affiliations

INFRES, Telecom Paris, Institut Polytechnique de Paris, Paris, France
Jun Zhang & Houda Labiod

Authors

Jun Zhang
Houda Labiod

Corresponding authors

Correspondence to Jun Zhang or Houda Labiod.

Editor information

Editors and Affiliations

Technical University of Denmark, Lyngby, Denmark
Weizhi Meng
University of Plymouth, Devon, UK
Steven Furnell

A Definition of Diversity of Visibility

To formally evaluate how diverse the displayed answers are, we borrow the concept of true diversity from [24]. Consider a set of groups $G$ in which each group $g_i \in G$ has proportion $q_i$. The true diversity of order 1 (the exponential of the Shannon entropy) is

$$\begin{aligned} D=\exp \left( -\sum _{g_i \in G} q_i \ln (q_i) \right) . \end{aligned}$$

(12)

A large true diversity indicates a good balance among the proportions of the groups (the species, in the original ecological setting).
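
As a rough illustration of Eq. (12), the following Python sketch computes the true diversity from a list of group proportions. The function name `true_diversity` and the normalisation step are assumptions made for this example; they are not taken from the paper.

```python
import math


def true_diversity(proportions):
    """Exponential of the Shannon entropy of the proportions, as in Eq. (12).

    `proportions` is a hypothetical input format: one non-negative
    weight per group. The weights are normalised so that they sum to 1.
    """
    if any(q < 0 for q in proportions):
        raise ValueError("proportions must be non-negative")
    total = sum(proportions)
    if total <= 0:
        raise ValueError("proportions must have a positive sum")
    # Groups with q_i == 0 are skipped: q ln q tends to 0 as q tends to 0.
    entropy = -sum((q / total) * math.log(q / total) for q in proportions if q > 0)
    return math.exp(entropy)
```

With perfectly balanced groups the value equals the number of groups, and it falls towards 1 as a single group comes to dominate.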

In our case, the diversity depends not only on the number of answers in each group but also on their positions. Let $L(a, p)$ denote the position of an answer $a$ under the ranking policy $p$. The visibility of an answer decreases as it is ranked lower. Assuming that visibility decreases exponentially with rank, we define the visibility index of an answer at position $L$ as $\lambda ^L$, where $\lambda $ is the decay factor (necessarily $0< \lambda < 1$ for the visibility to decrease). For a set of answers $N$, we denote the set of its groups as $G(N)$, which partitions $N$:

$$\begin{aligned}&\forall g \in G(N),\ g \in 2^{N}, \\&\forall g_i,g_j \in G(N),\ g_i \ne g_j \Rightarrow g_i \cap g_j = \varnothing , \\&\bigcup _{g \in G(N)} g = N. \end{aligned}$$

(13)

The total visibility index of each group $g \in G(N)$ is

$$\begin{aligned} V(g,p)=\sum _{a \in g} \lambda ^{L(a,p)}. \end{aligned}$$

(14)

The corresponding visibility ratio of each group $g \in G(N)$ is

$$\begin{aligned} q(g,p)=\frac{V(g,p)}{\sum _{g' \in G(N)} V(g',p)}. \end{aligned}$$

(15)

The diversity of visibility of a set of answers $N$ under the ranking policy $p$ is then defined as the true diversity of the groups' visibility ratios:

$$\begin{aligned} D_{vis}(N,p)= \exp \left( - \sum _{g \in G(N)} q(g,p) \ln \left( q(g,p) \right) \right) . \end{aligned}$$

(16)
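
Equations (14) to (16) can be chained into a short computation. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `diversity_of_visibility`, the input format (a list of group labels ordered from the top-ranked answer down), the 1-based positions, and the default decay factor of 0.9 are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict


def diversity_of_visibility(ranked_groups, decay=0.9):
    """Diversity of visibility D_vis of a ranked list of answers.

    ranked_groups[k] is the group label of the answer at position k + 1
    (assumed 1-based positions); `decay` plays the role of lambda.
    """
    if not 0 < decay < 1:
        raise ValueError("decay must lie strictly between 0 and 1")
    if not ranked_groups:
        raise ValueError("at least one answer is required")

    # Eq. (14): total visibility index of each group.
    visibility = defaultdict(float)
    for position, group in enumerate(ranked_groups, start=1):
        visibility[group] += decay ** position

    # Eq. (15): visibility ratio of each group.
    total = sum(visibility.values())
    ratios = [v / total for v in visibility.values()]

    # Eq. (16): true diversity of the visibility ratios.
    entropy = -sum(q * math.log(q) for q in ratios if q > 0)
    return math.exp(entropy)
```

For example, with two opinion groups, a ranking that alternates between them (`["pro", "con", "pro", "con"]`) yields a higher value than one that places all answers of one group first (`["pro", "pro", "con", "con"]`). The choice of starting position does not affect the result, because shifting every position by a constant scales all visibility indices by the same factor, which cancels in Eq. (15).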

Cite this paper

Zhang, J., Labiod, H. (2019). Retrieve the Hidden Leaves in the Forest: Prevent Voting Spamming in Zhihu. In: Meng, W., Furnell, S. (eds) Security and Privacy in Social Networks and Big Data. SocialSec 2019. Communications in Computer and Information Science, vol 1095. Springer, Singapore. https://doi.org/10.1007/978-981-15-0758-8_13

DOI: https://doi.org/10.1007/978-981-15-0758-8_13
Published: 24 October 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0757-1
Online ISBN: 978-981-15-0758-8
eBook Packages: Computer Science, Computer Science (R0)
