International Conference on Web Engineering

ICWE 2012: Web Engineering, pp 46–60

Scaling Pair-Wise Similarity-Based Algorithms in Tagging Spaces

  • Damir Vandic,
  • Flavius Frasincar &
  • Frederik Hogenboom

Part of the Lecture Notes in Computer Science book series (LNISA, volume 7387)

Abstract

Users of Web tag spaces, e.g., Flickr, find it difficult to get adequate search results due to syntactic and semantic tag variations. In most approaches that address this problem, the cosine similarity between tags plays a major role. However, the use of this similarity introduces a scalability problem as the number of similarities that need to be computed grows quadratically with the number of tags. In this paper, we propose a novel algorithm that filters insignificant cosine similarities in linear time complexity with respect to the number of tags. Our approach shows a significant reduction in the number of calculations, which makes it possible to process larger tag data sets than ever before. To evaluate our approach, we used a data set containing 51 million pictures and 112 million tag annotations from Flickr.
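To make the scalability problem concrete, the following is a minimal sketch of the brute-force baseline the abstract describes: every pair of tags gets a cosine similarity, so n tags require n(n-1)/2 computations. It is not the filtering algorithm proposed in the paper. The representation of each tag as a sparse vector of co-occurrence counts, the toy `pictures` data, and all function names are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_vectors(pictures):
    """Map each tag to a sparse vector (dict) of co-occurrence counts.

    Assumption: a tag is represented by how often it appears on the
    same picture as every other tag.
    """
    vectors = {}
    for tags in pictures:
        unique = set(tags)
        for a in unique:
            row = vectors.setdefault(a, {})
            for b in unique:
                if a != b:
                    row[b] = row.get(b, 0) + 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def all_pairs_similarity(vectors):
    """Brute force: one cosine computation per unordered tag pair."""
    return {
        (a, b): cosine(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }


# Hypothetical toy data: each inner list holds the tags of one picture.
pictures = [
    ["sunset", "beach", "sea"],
    ["sunset", "beach"],
    ["cat", "pet"],
    ["cat", "pet", "kitten"],
]

vectors = cooccurrence_vectors(pictures)
sims = all_pairs_similarity(vectors)
n = len(vectors)
print(f"{n} tags -> {len(sims)} similarity computations")
```

Even in this toy example, many of the computed similarities are zero because the two tags share no co-occurring tag. Pairs like these are the kind of insignificant similarity that a filtering step aims to avoid computing at all.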

Keywords

  • Input Vector
  • Parameter Combination
  • Cosine Similarity
  • Scalability Issue
  • Inverted Index

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Authors and Affiliations

  1. Erasmus University Rotterdam, P.O. Box 1738, NL-3000 DR, Rotterdam, The Netherlands

    Damir Vandic, Flavius Frasincar & Frederik Hogenboom

Editor information

Editors and Affiliations

  1. Dipartimento di Elettronica e Informazione, Politecnico di Milano, Via Ponzio 34/5, 20133, Milano, Italy

    Marco Brambilla

  2. Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Oookayama, 152-8552, Tokyo, Japan

    Takehiro Tokuda

  3. Institut für Informatik, Freie Universität Berlin, Königin-Luise-Strasse 24-26, 14195, Berlin, Germany

    Robert Tolksdorf

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vandic, D., Frasincar, F., Hogenboom, F. (2012). Scaling Pair-Wise Similarity-Based Algorithms in Tagging Spaces. In: Brambilla, M., Tokuda, T., Tolksdorf, R. (eds) Web Engineering. ICWE 2012. Lecture Notes in Computer Science, vol 7387. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31753-8_4

  • DOI: https://doi.org/10.1007/978-3-642-31753-8_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31752-1

  • Online ISBN: 978-3-642-31753-8

  • eBook Packages: Computer Science, Computer Science (R0)
