New Trends in Database and Information Systems II, pp 301-314
Framework for Social Media Big Data Quality Analysis
Abstract
A seemingly unlimited amount of unstructured data is being captured and analyzed on social media. This paper highlights the lack of standard quality control approaches that could be applied across all social media sites, a gap that stems from the variety of big data formats these sites accept. The issue poses a challenge not only for capturing big data but also for analyzing it and extracting valuable data, which in turn affects decision-making. The paper reviews a collection of archived documents in the field of big data and social media. It presents a framework that identifies the issues of quality analysis of big data on social media, examines the techniques social media companies currently use to capture and analyze big data, and maps social media sites and the appropriate combinations of capture and analysis techniques to the data quality control requirements.
Keywords
Big data, Social media, Framework, Quality analysis