Building a Vietnamese SentiWordNet Using Vietnamese Electronic Dictionary and String Kernel

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 8863)

Abstract

In this paper, we propose a novel approach to constructing a Vietnamese SentiWordNet (VSWN), a lexical resource that supports sentiment analysis in Vietnamese. A SentiWordNet is typically generated from WordNet, with each synset assigned numerical scores that indicate its opinion polarity. However, a Vietnamese WordNet is not yet available. We therefore propose a method to construct a VSWN from a Vietnamese electronic dictionary instead of from WordNet. The main drawback of constructing a VSWN from a dictionary is that it easily suffers from the sparsity problem, since dictionary glosses are generally short. To address this problem, we adopt a string kernel function that measures string similarity based on both contiguous and non-contiguous common subsequences. Our experimental results show, first, that the string kernel outperforms a baseline model that uses the standard bag-of-words kernel and, second, that the Vietnamese SentiWordNet is competitive with the English SentiWordNet, which was constructed from WordNet. These results show that our methodology is effective and efficient for constructing a SentiWordNet from an electronic dictionary.
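As a rough illustration of the kind of string kernel described above, the following Python sketch implements a standard gap-weighted subsequence kernel, which rewards subsequences shared by two sequences and discounts them by how spread out they are. It is not the authors' implementation: the function names `ssk` and `normalized_ssk`, the subsequence length `n`, the decay factor `lam`, and the choice to compare glosses as lists of word tokens are all assumptions made for this example.

```python
from math import sqrt


def ssk(s, t, n=2, lam=0.5):
    """Gap-weighted subsequence kernel K_n(s, t) (unnormalized).

    s, t : sequences (strings or lists of tokens); hypothetical inputs.
    n    : length of the subsequences being matched (assumed value).
    lam  : decay factor in (0, 1]; each position spanned costs one factor.
    """
    ls, lt = len(s), len(t)
    if min(ls, lt) < n:
        return 0.0

    # kp[i][a][b] holds the auxiliary kernel K'_i(s[:a], t[:b]).
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0  # K'_0 is 1 for every pair of prefixes

    for i in range(1, n):
        for a in range(i, ls + 1):
            kpp = 0.0  # running value of K''_i(s[:a], t[:b])
            for b in range(i, lt + 1):
                kpp *= lam  # one more position of t is spanned
                if s[a - 1] == t[b - 1]:
                    kpp += lam * lam * kp[i - 1][a - 1][b - 1]
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp

    # Close each length-n match on a pair of equal final symbols.
    total = 0.0
    for a in range(n, ls + 1):
        for b in range(n, lt + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[n - 1][a - 1][b - 1]
    return total


def normalized_ssk(s, t, n=2, lam=0.5):
    """Kernel value scaled to [0, 1] so that gloss length matters less."""
    k_ss = ssk(s, s, n, lam)
    k_tt = ssk(t, t, n, lam)
    if k_ss == 0.0 or k_tt == 0.0:
        return 0.0
    return ssk(s, t, n, lam) / sqrt(k_ss * k_tt)
```

For instance, with `n=2` and `lam=0.5`, `ssk("not very good".split(), "not good".split())` evaluates to `0.5**5 = 0.03125`: the pair (not, good) spans three tokens in one gloss and two in the other. A comparison based only on contiguous bigrams would find nothing in common here, which is why a kernel of this kind can help with short, sparse glosses.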

Keywords

  • SentiWordNet
  • Vietnamese SentiWordNet
  • Opinion Mining
  • String Kernel

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Vu, XS., Song, HJ., Park, SB. (2014). Building a Vietnamese SentiWordNet Using Vietnamese Electronic Dictionary and String Kernel. In: Kim, Y.S., Kang, B.H., Richards, D. (eds) Knowledge Management and Acquisition for Smart Systems and Services. PKAW 2014. Lecture Notes in Computer Science, vol 8863. Springer, Cham. https://doi.org/10.1007/978-3-319-13332-4_18

  • DOI: https://doi.org/10.1007/978-3-319-13332-4_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13331-7

  • Online ISBN: 978-3-319-13332-4

  • eBook Packages: Computer Science, Computer Science (R0)