Using Transitivity to Increase the Accuracy of Sample-Based Pearson Correlation Coefficients

Phillips, Taylor; GauthierDickey, Chris; Thurimella, Ramki

doi:10.1007/978-3-642-15105-7_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6263)

Included in the following conference series: International Conference on Data Warehousing and Knowledge Discovery

Abstract

Pearson product-moment correlation coefficients are a widely used measure of linear dependence across many fields. When a correlation coefficient is calculated from a sample, the accuracy of the estimate depends on the quality and quantity of that sample. Like all statistical models, these correlation coefficients can suffer from overfitting, in which the estimate represents random error instead of an underlying trend.
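As a minimal sketch of the sample-based estimate described above, the following computes the standard Pearson product-moment coefficient from only the observations two sparse items share. The function names `pearson` and `pearson_on_overlap`, and the dictionary layout of the ratings, are illustrative assumptions and do not come from the paper.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations; the 1/(n-1) factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def pearson_on_overlap(ratings_a, ratings_b):
    """Correlate two sparse items using only the keys they have in common.

    ratings_a and ratings_b are hypothetical dicts mapping an observer id
    to a numeric value. Returns (correlation, number_of_shared_keys).
    """
    shared = sorted(set(ratings_a) & set(ratings_b))
    xs = [ratings_a[k] for k in shared]
    ys = [ratings_b[k] for k in shared]
    return pearson(xs, ys), len(shared)


if __name__ == "__main__":
    item_a = {"u1": 1, "u2": 2, "u3": 3, "u4": 5}
    item_b = {"u2": 4, "u3": 6, "u5": 1}
    r, n_shared = pearson_on_overlap(item_a, item_b)
    print(r, n_shared)
```

In this made-up example the two items share only two observations, so the estimate is forced to a perfect correlation regardless of any underlying trend. This is the kind of sparse-sample overfitting the abstract refers to.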

In this paper, we discuss how Pearson product-moment correlation coefficients can utilize information beyond the two items for which the correlation is being computed. By introducing a transitive relationship with one or more additional items that meet specified criteria, our Transitive Pearson product-moment correlation coefficient can significantly reduce the error of sparse, sample-based estimations, in some cases by more than 50%. Finally, we demonstrate that if the data is too dense or too sparse, transitivity is detrimental to reducing the correlation estimation errors.
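One general reason a third item can carry information about a pair is the standard positive semidefiniteness constraint on correlation matrices. It is stated here as background only and is not necessarily the formulation the authors use. For items $A$, $B$, and an additional item $C$ (symbols chosen for illustration), the correlation $r_{AB}$ is bounded by the other two:

```latex
\[
  r_{AC}\, r_{BC} - \sqrt{\bigl(1 - r_{AC}^{2}\bigr)\bigl(1 - r_{BC}^{2}\bigr)}
  \;\le\; r_{AB} \;\le\;
  r_{AC}\, r_{BC} + \sqrt{\bigl(1 - r_{AC}^{2}\bigr)\bigl(1 - r_{BC}^{2}\bigr)}
\]
```

The interval tightens as $|r_{AC}|$ and $|r_{BC}|$ approach 1. The bound holds exactly for population correlations and for sample correlations computed over one common set of observations. It is not guaranteed when each pairwise coefficient is estimated from a different sparse overlap.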

Author information

Authors and Affiliations

Department of Computer Science, University of Denver, USA
Taylor Phillips, Chris GauthierDickey & Ramki Thurimella

Editor information

Editors and Affiliations

Department of Computer Science, Aalborg University, Selma Lagerløfs Vej 300, 9220, Aalborg, Denmark
Torben Bach Pedersen
IBM India Research Lab, 4, Block C, Institutional Area, Vasant Kunj, 110 070, New Delhi, India
Mukesh K. Mohania
Institute of Software Technology, Vienna University of Technology, Favoritenstr. 9-11/188, 1040, Vienna, Austria
A Min Tjoa

About this paper

Cite this paper

Phillips, T., GauthierDickey, C., Thurimella, R. (2010). Using Transitivity to Increase the Accuracy of Sample-Based Pearson Correlation Coefficients. In: Bach Pedersen, T., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2010. Lecture Notes in Computer Science, vol 6263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15105-7_13

DOI: https://doi.org/10.1007/978-3-642-15105-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15104-0
Online ISBN: 978-3-642-15105-7
eBook Packages: Computer Science, Computer Science (R0)
