Fulmqa: a fuzzy logic-based model for social media data quality assessment

  • Original Article
Social Network Analysis and Mining

Abstract

With the advancement of information technology and the growing importance of social media, platforms such as Twitter and Facebook have become deeply entrenched in our lives and are growing rapidly around the world. On these platforms, users are free to share and publish whatever they want without control, which speeds up the generation and dissemination of information. As a result, rumors and false information can spread and reach a large number of people. Consequently, assessing the quality of these data is of major importance. In this paper, we present a new and efficient quality assessment model, including the quality metrics needed for an accurate assessment. Our work extends the SMDQM (Social Media Data Quality Model) (Reda and Zellou, in: International conference on innovative research in applied science, engineering and technology, IEEE, 2022), which is used to assess the quality of data provided by social media platforms. We suggest using fuzzy logic to describe data quality metrics in order to overcome imprecision and subjectivity. We then evaluate the performance of our model in a real-world implementation through extensive experiments on two separate Twitter data sets. The findings indicate that the model can successfully evaluate all tweets with high performance.
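
As an illustration of what describing a quality metric with fuzzy logic can look like, the following minimal Python sketch maps a crisp metric score in [0, 1] to membership degrees in three linguistic sets and combines two metrics with a fuzzy AND. It is not the model from the paper: the set labels, the trapezoid breakpoints, the names `metric_a` and `metric_b`, and the use of the minimum operator are all assumptions chosen for the example.

```python
from typing import Dict, Tuple

# Hypothetical trapezoid parameters (a, b, c, d) on a [0, 1] metric scale.
# These breakpoints are illustrative assumptions, not values from the paper.
FUZZY_SETS: Dict[str, Tuple[float, float, float, float]] = {
    "low": (0.0, 0.0, 0.2, 0.5),
    "medium": (0.2, 0.5, 0.5, 0.8),
    "high": (0.5, 0.8, 1.0, 1.0),
}


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership degree: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify(score: float) -> Dict[str, float]:
    """Map a crisp score to its membership degree in each linguistic set."""
    return {label: trapezoid(score, *params) for label, params in FUZZY_SETS.items()}


def rule_and(*degrees: float) -> float:
    """Fuzzy AND of several membership degrees, using the minimum operator."""
    return min(degrees)


if __name__ == "__main__":
    # Two hypothetical metric scores for a single tweet.
    metric_a = fuzzify(0.9)
    metric_b = fuzzify(0.65)
    # Firing strength of a rule "IF metric_a is high AND metric_b is high".
    strength = rule_and(metric_a["high"], metric_b["high"])
    print(metric_b, strength)
```

A score near a boundary belongs partly to two sets at once (here 0.65 is partly "medium" and partly "high"), which is how a fuzzy description avoids the hard cut-offs that make crisp thresholds sensitive to imprecise or subjective measurements.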

Notes

  1. Twitter API. https://developer.twitter.com/en/docs/tweets/search/overview.

References

  • Abbasi MA, Liu H (2013) Measuring user credibility in social media. In: International conference on social computing, behavioral-cultural modeling, and prediction. Springer, Berlin, Heidelberg, pp 441–448

  • Ajarroud O, Zellou A, Idri A (2018) A new filtering-based query processing: improving semantic caching efficiency in mediation systems. In: Proceedings of the 12th international conference on intelligent systems: theories and applications, SITA’18, October 2018, Article no. 12, ACM international conference proceeding series. Rabat, Morocco, pp 1–6

  • Al-Hajjar D, Jaafar N, Al-Jadaan M, Alnutaifi R (2015) Framework for social media big data quality analysis. New Trends Database Inf Syst II:301–314. https://doi.org/10.1007/978-3-319-10518-5_23

  • Alizamini FG, Pedram MM, Alishahi M, Badie K (2010) Data quality improvement using fuzzy association rules. In: 2010 International conference on electronics and information engineering, vol 1. IEEE, pp V1–468

  • Alrubaian M, Al-Qurishi M, Al-Rakhami M, Hassan MM, Alamri A (2017) Reputation-based credibility analysis of twitter social network users. Concurr Comput Pract Exp 29(7):e3873

  • Alrubaian M, Al-Qurishi M, Alamri A, Al-Rakhami M, Hassan MM, Fortino G (2018) Credibility in online social networks: a survey. IEEE Access 7:2828–2855

  • Ardagna D, Cappiello C, Samá W, Vitali M (2018) Context-aware data quality assessment for big data. Futur Gener Comput Syst 89:548–562

  • Arolfo F, Rodriguez KC, Vaisman A (2020) Analyzing the quality of twitter data streams. Inf Syst Front 24:1–21. https://doi.org/10.1007/s10796-020-10072-x

  • Berlanga R, Lanza-Cruz I, Aramburu MJ (2019) Quality indicators for social business intelligence. In: 2019 6th international conference on social networks analysis, management and security (SNAMS). https://doi.org/10.1109/snams.2019.8931862

  • Berti-Équille L (1999) Qualité des données multi-sources et recommandation multi-critère. In: Actes du congrès francophone INFormatique des ORganisations et systèmes d’INformation décisionnels (INFORSID’99), pp 185–204

  • Bird S (2006) NLTK: the natural language toolkit. In: Proceedings of the COLING/ACL 2006 interactive presentation sessions, pp 69–72

  • Caballero I, Verbo E, Serrano M, Calero C, Piattini M (2009) Tailoring data quality models using social network preferences. In: International conference on database systems for advanced applications, Springer, Berlin, Heidelberg, pp 152–166

  • Cai L, Zhu Y (2015) The challenges of data quality and data quality assessment in the big data era. Data Sci J 14:2

  • Chai K, Potdar V, Dillon T (2009) Content quality assessment related frameworks for social media. Lecture notes in computer science, pp 791–805. https://doi.org/10.1007/978-3-642-02457-3_65

  • Crosby PB (1979) Quality is free. McGraw-Hill, New York, p 309

  • Deming WE (1982) Quality, productivity and competitive position. Massachusetts Institute of Technology Center for Advanced Engineering Study, Cambridge, MA, USA

  • Earley J (1970) An efficient context-free parsing algorithm. Commun ACM 13(2):94–102

  • Ehrlinger L, Wöß W (2018) A novel data quality metric for minimality. In: International workshop on data quality and trust in big data, Springer, Cham, pp 1–15

  • El Alaoui I, Gahi Y, Messoussi R (2019) Big data quality metrics for sentiment analysis approaches. In: Proceedings of the 2019 international conference on big data engineering, pp 36–43

  • Elmasri R, Navathe SB (2000) Fundamentals of database systems, 3rd edn. Addison-Wesley, Reading, MA

  • Even A, Shankaranarayanan G (2009) Dual assessment of data quality in customer databases. J Data Inf Qual (JDIQ) 1(3):1–29

  • Fagroud FZ, Ajallouda L, Lahmar EHB, Zellou A, El Filali S (2021) A brief survey on internet of things (IoT). In: 1st International conference on digital technologies and applications. Lecture notes in networks and systems, 211 LNNS, ICDTA, pp 335–344

  • Firmani D, Mecella M, Scannapieco M, Batini C (2015) On the meaningfulness of big data quality (invited paper). Data Sci Eng 1(1):6–20. https://doi.org/10.1007/s41019-015-0004-7

  • Gabr MI, Yehia MH, Doaa SE (2021) Data quality dimensions, metrics, and improvement techniques. Futur Comput Inform J 6(1):3

  • Gupta P, Pathak V, Goyal N, Singh J, Varshney V, Kumar S (2019) Content credibility check on twitter. Commun Comput Inf Sci 899:197–212

  • Hassenstein MJ, Vanella P (2022) Data quality-concepts and problems. Encyclopedia 2022(2):498–510

  • Hitzler P, Zaveri A, Rula A, Maurino A, Pietrobon R, Lehmann J, Auer S (2016) Quality assessment for linked data: a survey a systematic literature review and conceptual framework. Semant Web 1:1–5

  • Hoyle D (2006) ISO 9000 quality systems handbook. Routledge, Oxford

  • http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=42180

  • Hutto C, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the international AAAI conference on web and social media, vol 8, pp 216–225

  • Immonen A, Paakkonen P, Ovaska E (2015) Evaluating the quality of social media data in big data architecture. IEEE Access 3:2028–2043. https://doi.org/10.1109/access.2015.2490723

  • International Organization for Standardization ISO/IEC 25012:2008(E) (2008) Software engineering-software product quality requirements and evaluation (SQuaRE)-data quality model. International Organization for Standardization, Geneva, Switzerland

  • International Organization for Standardization–ISO (1994) Quality management and quality assurance: vocabulary ISO 8402:1994

  • International Standards Organization (ISO) 8402 (1994) Quality management and quality assurance

  • Juran JM (2003) Juran on leadership for quality. Simon and Schuster, New York

  • Laranjeiro N, Soydemir SN, Bernardino J (2015) A survey on data quality: classifying poor data. In: 2015 IEEE 21st Pacific rim international symposium on dependable computing (PRDC), IEEE, pp 179–188

  • Larousse. Qualité, www.larousse.fr/dictionnaires/francais/qualit%C3%A9/65477

  • Larsen PM (1980) Industrial application of fuzzy logic control. Int J Man Mach Stud 12:3–10

  • Lee YW, Pipino LL, Funk JD, Wang RY (2006) Journey to data quality. MIT Press, Cambridge, MA

  • Müller H, Naumann F, Freytag JC (2003) Data quality in genome databases. Humboldt University of Berlin, Berlin

  • Nikiforova A (2020) Definition and evaluation of data quality: user-oriented data object-driven approach to data quality assessment. Balt J Modern Comput 8(3):391–432

  • Olson JE (2003) Data quality: the accuracy dimension. Elsevier, Amsterdam

  • Ossorio Arroyo A, Onorati T, Diaz P (2018) Quality assessment of social media: lessons learnt from the literature. In: 2018 22nd International conference information visualisation (IV). https://doi.org/10.1109/iv.2018.00055

  • Pääkkönen P, Jokitulppo J (2017) Quality management architecture for social media data. J Big Data 4(1):1–26. https://doi.org/10.1186/s40537-017-0066-7

  • Radulovic F, Mihindukulasooriya N, García-Castro R, Gómez-Pérez A (2018) A comprehensive quality model for linked data. Semant Web 9(1):3–24

  • Reda O, Zellou A (2023) Assessing the quality of social media data: a systematic literature review. Bull Electr Eng Inform 12(2):1115–1126

  • Reda O, Sassi I, Zellou A, Anter S (2020) Towards a data quality assessment in big data. In: Proceedings of the 13th international conference on intelligent systems: theories and applications. https://doi.org/10.1145/3419604.3419803

  • Reda O, Zellou A (2022) SMDQM-social media data quality assessment model. In: 2022 2nd International conference on innovative research in applied science, engineering and technology (IRASET), IEEE, pp 1–7

  • Reuter C, Ludwig T, Ritzkatis M, Pipek V (2015) Social-QAS: tailorable quality assessment service for social media content. In: International symposium on end user development, Springer, Cham, pp 156–170

  • Ross TJ (2012) Fuzzy logic with engineering applications, 3rd edn. Wiley, New York, p 585

  • Salvatore C, Biffignandi S, Bianchi A (2020) Social media and twitter data quality for new social indicators. Soc Indic Res. https://doi.org/10.1007/s11205-020-02296-w

  • Scannapieco M (2006) Data quality: concepts, methodologies and techniques. Data-centric systems and applications. Springer, Cham

  • Sint R, Schaffert S, Stroka S, Ferstl R (2009) Combining unstructured, fully structured and semi-structured information in semantic wikis. In: 4th Workshop on semantic wikis: the semantic wiki web, 6th European semantic web conference, Hersonissos, Crete, Greece, June 2009, p 73

  • Tayi GK, Ballou DP (1998) Examining data quality. Commun ACM 41(2):54–57

  • Verma PK, Sharma V, Agarwal S (2019) Credibility investigation for tweets and its users. In: Proceedings of the 3rd international conference on computing methodologies and communication, ICCMC 2019, pp 925–928

  • Wang RY, Strong DM (1996) Beyond accuracy: what data quality means to data consumers. J Manag Inf Syst 12:5–33. https://doi.org/10.1080/07421222.1996.1151809

  • Wang X, Ruan D, Kerre EE (2009) Mathematics of fuzziness-basic issues. Studies in fuzziness and soft computing, vol 245. Springer, Berlin/Heidelberg, p 220

  • Wayne SR (1983) Quality control circle and company wide quality control. Qual Prog 16(10):14–17

  • Woodall P, Parlikad AK (2010) A hybrid approach to assessing data quality. In: ICIQ

  • Yang J, Yu M, Qin H, Lu M, Yang C (2019) A twitter data credibility framework-hurricane Harvey as a use case. ISPRS Int J Geo Inf 8(3):111

  • Yousfi A, El Yazidi MH, Zellou A (2018) Assessing the performance of a new semantic similarity measure designed for schema matching for mediation systems. In: Nguyen N, Pimenidis E, Khan Z, Trawinski B (eds) Computational collective intelligence: 10th international conference on computational collective intelligence, ICCCI’18, Bristol, UK, September 5–7, proceedings, part I, vol 11055. Springer, pp 64–74. Print ISBN: 978-3-319-98442-1. Online ISBN: 978-3-319-98443-8

  • Zadeh LA (1965) Fuzzy sets. Inf Control 8(3):338–353. https://doi.org/10.1016/S0019-9958(65)90241-X

Author information

Contributions

Both authors contributed equally to this work.

Corresponding authors

Correspondence to Oumaima Reda or Ahmed Zellou.

Ethics declarations

Competing interests

The authors declare no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Reda, O., Zellou, A. Fulmqa: a fuzzy logic-based model for social media data quality assessment. Soc. Netw. Anal. Min. 13, 150 (2023). https://doi.org/10.1007/s13278-023-01148-y

  • DOI: https://doi.org/10.1007/s13278-023-01148-y
