Quantifying the trustworthiness of social media content

Distributed and Parallel Databases

Abstract

The growing popularity of social media in recent years has resulted in the creation of an enormous amount of user-generated content. A significant portion of this information is useful and has proven to be a great source of knowledge. However, because much of it has been contributed by strangers with little or no apparent reputation, there is no easy way to tell whether the content is trustworthy. Search engines are the gateways to knowledge, but search relevance cannot guarantee that the content in the search results is trustworthy, and a casual observer might not be able to differentiate between trustworthy and untrustworthy content. This work focuses on the problem of quantifying the value of such shared content with respect to its trustworthiness. In particular, it focuses on shared health content, because the negative impact of acting on untrustworthy content is high in this domain. Health content from two social media applications, Wikipedia and Daily Strength, is used for this study. Sociological notions of trust are used to motivate the search for a solution. A two-step unsupervised, feature-driven approach is proposed for this purpose: a feature identification step, in which relevant information categories are specified and suitable features are identified, and a quantification step, for which various unsupervised scoring models are proposed. Results indicate that this approach is effective and can be adapted to disparate social media applications with ease.
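
To make the two-step idea concrete, the following is a minimal sketch of what an unsupervised, feature-driven scoring pass could look like. It is not the scoring model from the paper, which the abstract does not specify. It assumes that each item already has numeric features from the feature identification step, rescales each feature to [0, 1] across the collection, and averages the results with equal weights. The feature names (`num_references`, `num_editors`) and the function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant feature maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def trust_scores(
    items: Dict[str, Dict[str, float]], features: List[str]
) -> Dict[str, float]:
    """Score each item as the equal-weight mean of its normalized features.

    Assumption: every feature is oriented so that a larger value suggests
    more trustworthy content. No labels are used, so the scoring is
    unsupervised.
    """
    names = list(items)
    if not names or not features:
        return {}
    # Step 2 (quantification): normalize each feature column, then average.
    columns = {
        f: min_max_normalize([items[n][f] for n in names]) for f in features
    }
    return {
        n: sum(columns[f][i] for f in features) / len(features)
        for i, n in enumerate(names)
    }


# Step 1 (feature identification) is assumed to have produced these
# hypothetical features for three articles.
articles = {
    "article_a": {"num_references": 40, "num_editors": 25},
    "article_b": {"num_references": 10, "num_editors": 5},
    "article_c": {"num_references": 25, "num_editors": 15},
}
scores = trust_scores(articles, ["num_references", "num_editors"])
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because the scores are relative to the collection, they are suited to ranking items against each other rather than to an absolute judgment about any single item. Equal weighting is the simplest choice; a different unsupervised model could replace the averaging step without changing the feature identification step.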

Author information

Corresponding author

Correspondence to Sai T. Moturu.

Additional information

Communicated by Anupam Joshi.

All the work of S.T. Moturu was performed while he was at ASU.

About this article

Cite this article

Moturu, S.T., Liu, H. Quantifying the trustworthiness of social media content. Distrib Parallel Databases 29, 239–260 (2011). https://doi.org/10.1007/s10619-010-7077-0

  • DOI: https://doi.org/10.1007/s10619-010-7077-0
