Framework for Social Media Big Data Quality Analysis

  • Dua’a Al-Hajjar
  • Nouf Jaafar
  • Manal Al-Jadaan
  • Reem Alnutaifi
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 312)

Abstract

Vast amounts of unstructured data are captured and analyzed on social media. This paper highlights the lack of standard quality control approaches that could be applied across all social media sites, a gap that stems from the variety of big data formats these sites accept. The problem poses a challenge not only for capturing big data but also for analyzing it and extracting valuable data, which in turn affects decision-making. The paper reviews a collection of archived documents in the field of big data and social media. It presents a framework that identifies the issues of quality analysis of big data on social media, examines the techniques social media companies currently use to capture and analyze big data, and maps social media sites and the appropriate combinations of capture and analysis techniques to data quality control requirements.

Keywords

Big data, Social Media, Framework, Quality Analysis

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Dua’a Al-Hajjar (1)
  • Nouf Jaafar (1)
  • Manal Al-Jadaan (1)
  • Reem Alnutaifi (1)
  1. Prince Sultan University – College for Women, Riyadh, Saudi Arabia
