Cluster Computing, Volume 22, Supplement 6, pp 14461–14476

Distributed and scalable Sybil identification based on nearest neighbour approximation using big data analysis techniques

  • Chinnaiah Valliyammai
  • Ramalingam Devakunchari


The problem of Sybil detection has been examined across multiple social media platforms such as Twitter, LinkedIn and Facebook. Detecting Sybils (fake accounts or social bots) in online social networks has emerged as a major challenge because of the rapid growth of these networks, which continuously generate very large data sets (big data). An open-source framework, Spark-based distributed, fast and scalable nearest neighbor search (S-DFS-NNS), is proposed for profile-based fake account detection across large-scale online social networks. The proposed work parallelizes the nearest neighbor (NN) search problem efficiently. The performance of k-nearest neighbor (k-NN) search degrades significantly on huge data sets because the task is computationally expensive. The framework is fast and adapts to large-scale settings. Using in-memory computation, suspected users are identified based on a novel private feature. The Spark-DFS-NN search technique provides a substantial performance improvement over conventional nearest neighbor computation in large-scale networks. The framework is evaluated on detection accuracy and is able to expose and block a large fraction of suspicious accounts during account creation. The proposed S-DFS-NN framework maintains approximately consistent performance of 89–95% as the number of attacks increases, with a latency of 58 ms.
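The idea of parallelizing a k-NN search can be illustrated with a minimal sketch. This is not the authors' S-DFS-NNS implementation and does not use Spark; it only simulates the general map/reduce pattern in plain Python, assuming the data is split into partitions, each partition returns its local k nearest candidates, and the partial results are merged into a global top-k. The two-dimensional profile features, the labels "genuine" and "sybil", and all function names are hypothetical.

```python
import heapq
import math

def local_knn(partition, query, k):
    """Map step: the k closest (distance, label) pairs within one partition."""
    scored = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, scored)

def merge_knn(candidates_a, candidates_b, k):
    """Reduce step: keep the k closest candidates from two partial results."""
    return heapq.nsmallest(k, candidates_a + candidates_b)

def global_knn(partitions, query, k):
    """Combine the per-partition results into the global k nearest neighbours."""
    merged = []
    for partition in partitions:
        merged = merge_knn(merged, local_knn(partition, query, k), k)
    return merged

def classify(partitions, query, k):
    """Label the query by majority vote over its global k nearest neighbours."""
    votes = {}
    for _, label in global_knn(partitions, query, k):
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical profile feature vectors, split across two partitions.
partitions = [
    [((0.1, 0.2), "genuine"), ((0.9, 0.8), "sybil")],
    [((0.2, 0.1), "genuine"), ((0.8, 0.9), "sybil"), ((0.85, 0.85), "sybil")],
]

print(classify(partitions, (0.82, 0.88), 3))
```

Because every partition contributes at most k candidates, the merge step handles only a small amount of data regardless of partition size. In a real deployment the map and reduce steps would run on distributed workers rather than in a local loop.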


Sybil attack, Online social networks, Big data, In-memory, Resilient distributed dataset



The authors gratefully acknowledge the Department of Science & Technology (DST), New Delhi, for providing financial support to carry out this research work under the Promotion of University Research and Scientific Excellence (PURSE) scheme. One of the authors, Ms. Devakunchari Ramalingam, is thankful to DST, New Delhi, for the award of a DST PURSE fellowship. The authors thank the Big Data Analytics laboratory, MIT Campus, Anna University, for the infrastructure and support for carrying out the research.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Technology, Faculty of Information and Communication, MIT Campus, Anna University, Chennai, India
