Multi-feature Based Similarity Among Entries on Media Portals

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 538)

Abstract

Similarity measures play an important role in information processing and have been widely investigated in computer science. With the growth of social media such as YouTube, Wikipedia, and Facebook, a huge number of entries have been posted on these portals. They are often described by short texts or sets of words. Discovering similar entries from such texts has become a challenge in building information search and filtering engines and has attracted considerable research interest. In this paper, we first introduce a model of entries posted on media or entertainment portals, based on four features: title, category, tags, and content. We then present a novel similarity measure among entries that incorporates these features. The experimental results show that the combined measure outperforms the other measures it was compared with.
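The abstract names the four features of an entry but does not give the measure itself, so the following is only a minimal sketch of the general idea: compute one similarity per feature, then combine them. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Entry` class, whitespace tokenization, Jaccard overlap for title, tags, and content, exact matching for category, and the weights in `WEIGHTS`.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """An entry described by the four features named in the abstract."""
    title: str
    category: str
    tags: set[str] = field(default_factory=set)
    content: str = ""


def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens (a simplifying assumption)."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap |a & b| / |a | b|, taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical weights chosen only for illustration; they sum to 1.
WEIGHTS = {"title": 0.3, "category": 0.2, "tags": 0.3, "content": 0.2}


def entry_similarity(x: Entry, y: Entry, weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of per-feature similarities, each in [0, 1]."""
    per_feature = {
        "title": jaccard(tokens(x.title), tokens(y.title)),
        # Category is treated as a single label: match or no match.
        "category": 1.0 if x.category.lower() == y.category.lower() else 0.0,
        "tags": jaccard({t.lower() for t in x.tags}, {t.lower() for t in y.tags}),
        "content": jaccard(tokens(x.content), tokens(y.content)),
    }
    return sum(weights[name] * score for name, score in per_feature.items())


# Two made-up entries, used only to exercise the function.
a = Entry("Learn Python basics", "Education", {"python", "tutorial"},
          "An introduction to Python for beginners")
b = Entry("Python basics tutorial", "Education", {"python", "beginner"},
          "A short Python introduction")
print(round(entry_similarity(a, b), 3))
```

With weights that sum to 1, the combined score stays in [0, 1] and equals 1 for identical entries. A semantic measure for the text features could replace the Jaccard overlap without changing the combination step.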

Author information

Corresponding author

Correspondence to Manh Hung Nguyen.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Nguyen, T.H., Tran, D.Q., Dam, G.M., Nguyen, M.H. (2017). Multi-feature Based Similarity Among Entries on Media Portals. In: Akagi, M., Nguyen, TT., Vu, DT., Phung, TN., Huynh, VN. (eds) Advances in Information and Communication Technology. ICTA 2016. Advances in Intelligent Systems and Computing, vol 538. Springer, Cham. https://doi.org/10.1007/978-3-319-49073-1_41

  • DOI: https://doi.org/10.1007/978-3-319-49073-1_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49072-4

  • Online ISBN: 978-3-319-49073-1

  • eBook Packages: Engineering, Engineering (R0)
