Abstract
The open nature of the World Wide Web makes evaluating webpage credibility challenging for users. In this paper, we aim to automatically assess web credibility by investigating various characteristics of webpages. Specifically, we first identify features from the textual content, link structure, and design of webpages, as well as from their social popularity as learned from popular social media sites (e.g., Facebook, Twitter). A set of statistical analysis methods is applied to select the most informative features, which are then used to infer webpage credibility with supervised learning algorithms. Experiments on a real dataset under two application settings show that we attain an accuracy of 75% for classification and, for regression, a 53% improvement in mean absolute error (MAE) over a random baseline.
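As a rough illustration of how an MAE improvement over a random baseline can be computed, the sketch below assumes the improvement is the relative reduction in MAE. The 1 to 5 rating scale, the ratings, and the function names are hypothetical and are not taken from the paper.

```python
import random


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(mae_baseline, mae_model):
    """Fraction by which the model reduces MAE relative to the baseline."""
    return (mae_baseline - mae_model) / mae_baseline


# Hypothetical credibility ratings on an assumed 1-5 scale (illustrative only).
y_true = [1, 2, 3, 4, 5, 4, 3, 2]
y_model = [2, 2, 3, 4, 4, 4, 3, 3]  # hypothetical model predictions

# Random baseline: draw a rating uniformly at random for each page.
rng = random.Random(0)
y_random = [rng.randint(1, 5) for _ in y_true]

mae_model = mean_absolute_error(y_true, y_model)
mae_random = mean_absolute_error(y_true, y_random)
print(f"model MAE = {mae_model:.3f}, random MAE = {mae_random:.3f}")
if mae_random > 0:
    gain = relative_improvement(mae_random, mae_model)
    print(f"relative improvement = {gain:.1%}")
```

Under this reading, a 53% improvement would mean the model's MAE is 47% of the random baseline's MAE.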
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Olteanu, A., Peshterliev, S., Liu, X., Aberer, K. (2013). Web Credibility: Features Exploration and Credibility Prediction. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_47
DOI: https://doi.org/10.1007/978-3-642-36973-5_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36972-8
Online ISBN: 978-3-642-36973-5
eBook Packages: Computer Science (R0)