Design and validation of annotation schemas for aspect-based sentiment analysis in the tourism sector


The use of linguistic resources beyond the scope of language studies, e.g., for commercial purposes, has become commonplace since massive amounts of data and the software tools to process them became available. An interesting perspective on these data is provided by sentiment analysis, which attempts to identify the polarity of a text but can also pursue more challenging aims, such as automatically identifying the specific entities and aspects discussed in the evaluative speech act, along with the polarity associated with them. This approach, known as aspect-based sentiment analysis, seeks to offer fine-grained information from raw text, but its success depends largely on the existence of pre-annotated domain-specific corpora, which in turn calls for the design and validation of an annotation schema. This paper examines the methodological aspects involved in creating such an annotation schema and is motivated by the scarcity of information found in the literature. We describe the insights we obtained from the annotation schema generation and validation process within our project, whose objectives include the development of advanced sentiment analysis software for user reviews in the tourism sector. We focus on the identification of the relevant entities and attributes in the domain, which we extract from a corpus of user reviews, and go on to describe the schema creation and validation process. In doing so, we describe the corpus annotation process and its iterative refinement by means of several inter-annotator agreement measurements, which we believe is key to a successful annotation schema.
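To make the notion of an inter-annotator agreement measurement concrete, the following minimal sketch computes Cohen's kappa (Cohen 1960) for two annotators who assign one entity–attribute label to each review segment. It is an illustration only: the abstract does not state which coefficients were used, and the label names (e.g., `HOTEL#PRICE`), the sample annotations, and the function name `cohen_kappa` are hypothetical rather than taken from the schema described in the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items that received identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when both annotators use one identical label
        # throughout; returning 1.0 here is a convention of this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ENTITY#ATTRIBUTE labels for six review segments.
annotator_1 = ["HOTEL#PRICE", "ROOM#CLEANLINESS", "FOOD#QUALITY",
               "HOTEL#PRICE", "ROOM#CLEANLINESS", "FOOD#QUALITY"]
annotator_2 = ["HOTEL#PRICE", "ROOM#CLEANLINESS", "FOOD#QUALITY",
               "HOTEL#PRICE", "FOOD#QUALITY", "FOOD#QUALITY"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

In an iterative refinement setting, a coefficient of this kind would be recomputed after each annotation trial, with low values pointing to categories whose guidelines need revision.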




  1.

    As the nomenclature used in aspect-based sentiment analysis can be confusing, in this work we follow the definitions of Pontiki et al. (2016). Thus, an entity is the object of evaluation, defined by an “is–a” relationship, whereas an attribute (or aspect) is a feature characterizing an entity through a “has–a” association.

  2.

    A complete survey of the approaches employed in ABSA can be found in Schouten and Frasincar (2016).

  3.

    The SentiTur project is a continuation of Lingmotif2. The proposal, submitted to the regional government of Andalusia under the European Regional Development Fund (ERDF) funding program, is currently under evaluation. The project’s main objective is the creation of an online system that displays detailed information on users’ opinions of Andalusian tourism resources, employing the ABSA methodology described here as well as an advanced visualization system based on Visual Analytics (Kohlhammer et al. 2011) and Formal Concept Analysis (Ganter and Wille 1999).

  4.

    Data obtained from the official website of the Andalusian Government

  5.

  6.

    The software used was modified from the one available at

  7.

    Available at

  8.

    Annotators are also referred to by the synonyms raters and coders.

  9.

  10.

    The t-test shows a significant difference between Trials 1 and 2 (p < 0.01).


  1. Andreevskaia A, Bergler S (2007) CLaC and CLaC-NB: knowledge-based and corpus-based approaches to sentiment tagging, pp 117–120

  2. Anthony L (2014) AntConc 3.4.3: computer software. Waseda University, Tokyo, Japan. Accessed 15 Sept 2018

  3. Artstein R, Poesio M (2008) Inter-coder agreement for computational linguistics. Comput Linguist 34:555–596.


  4. Aue A, Gamon M (2005) Customizing sentiment classifiers to new domains: a case study. In: Proceedings of the international conference on recent advances in natural language processing. Borovets, Bulgaria

  5. Bennett EM, Alpert R, Goldstein AC (1954) Communications through limited-response questioning. Public Opin Q 18:303–308.

  6. Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20:37–46

  7. Das SR, Chen MY (2001) Yahoo! for Amazon: opinion extraction from small talk on the web. In: Proceedings of the 8th Asia Pacific Finance Association annual conference 2001, pp 1–16.

  8. Davies M, Fleiss JL (1982) Measuring agreement for multinomial data. Biometrics 38:1047–1051


  9. De Clercq O, Lefever E, Jacobs G, et al (2017) Towards an integrated pipeline for aspect-based sentiment analysis in various domains. In: Proceedings of the 8th workshop on computational approaches to subjectivity, sentiment and social media analysis, pp 136–142

  10. Deng D, Jing L, Yu J, Ng MK (2018) Topic-adaptive sentiment lexicon construction. In: 2018 first Asian conference on affective computing and intelligent interaction (ACII Asia), pp 1–6

  11. Duan W, Yu Y, Cao Q, Levy S (2016) Exploring the impact of social media on hotel service performance: a sentimental analysis approach. Cornell Hosp Q 57:282–296.


  12. Fu P, Lin Z, Yuan F, et al (2018) Learning sentiment-specific word embedding via global sentiment representation. In: Proceedings of the thirty-second AAAI conference on artificial intelligence (AAAI-18). AAAI Press, New Orleans, USA, pp 4808–4815

  13. Gamon M, Aue A, Corston-Oliver S, Ringger E (2005) Pulse: mining customer opinions from free text. Adv Intell Data Anal VI:121–132

  14. Ganter B, Wille R (1999) Formal concept analysis: mathematical foundations. Springer, Berlin


  15. Ganu G (2009) Beyond the stars: improving rating predictions using review text content. In: Twelfth international workshop on the web and databases, pp 1–6

  16. Jo Y, Oh A (2011) Aspect and sentiment unification model for online review analysis. In: WSDM ’11 proceedings of the fourth ACM international conference on web search and data mining. Hong Kong, China, February 9–12, pp 815–824

  17. Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 1: Long Papers). Association for Computational Linguistics, Baltimore, Maryland, pp 655–665

  18. Kilgarriff A, Jakubíček M, Rychlý P et al (2014) The sketch engine: ten years on. Lexicography 1:7–36.


  19. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pp 1746–1751

  20. Kohlhammer J, Keim D, Pohl M et al (2011) Solving problems with visual analytics. Procedia Comput Sci 7:117–120.

  21. Krippendorff K (2004) Content analysis: an introduction to its methodology, 2nd edn. Sage Publications, Thousand Oaks


  22. Li B, Cardier B, Wang T, et al (2015) Annotating high-level structures of short stories and personal anecdotes

  23. Litvin SW, Goldsmith RE, Pan B (2008) Electronic word-of-mouth in hospitality and tourism management. Tour Manag 29:458–468.


  24. Liu B (2011) Web data mining. Springer, Heidelberg


  25. Liu B (2015) Sentiment analysis: mining opinions, sentiments, and emotions. Cambridge University Press, Cambridge


  26. Liu B, Zhang L (2012) A survey of opinion mining and sentiment analysis. In: Aggarwal CC, Zhai C (eds) Mining text data. Springer, Boston, pp 415–463

  27. Marine-Roig E, Anton Clavé S (2015) Tourism analytics with massive user-generated content: a case study of Barcelona. J Destination Mark Manage 4(3):162–172


  28. Marrese-Taylor E, Velásquez JD, Bravo-Marquez F (2014) A novel deterministic approach for aspect-based opinion mining in tourism products reviews. Expert Syst Appl 41:7764–7775.

  29. Moreno-Ortiz A (2017) Tecnolengua Lingmotif at EmoInt-2017: a lexicon-based approach. In: Proceedings of the 8th workshop on computational approaches to subjectivity, sentiment and social media analysis, Copenhagen, Denmark, September 7–11, 2017, pp 225–232

  30. Moreno-Ortiz A, Pérez-Hernández C (2018) Lingmotif-lex: a wide-coverage, state-of-the-art lexicon for sentiment analysis. In: Proceedings of the eleventh international conference on language resources and evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan, pp 2653–2659

  31. Morinaga S, Yamanishi K, Tateishi K, Fukushima T (2002) Mining product reputations on the Web. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’02), p 341.

  32. Nakov P (2016) Sentiment analysis in Twitter: a SemEval perspective. In: Proceedings of the 7th workshop on computational approaches to subjectivity, sentiment and social media analysis. Association for Computational Linguistics, San Diego, California, pp 171–172

  33. Pang B, Lee L (2008) Opinion mining and sentiment analysis. Foundations Trends Inf Retr 2(1–2):1


  34. Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP), Philadelphia, PA, USA, July 2002, pp 79–86

  35. Pawar AB, Jawale MA, Kyatanavar DN (2016) Fundamentals of sentiment analysis: concepts and methodology. Springer, New York.


  36. Pontiki M, Galanis D, Pavlopoulos J, Papageorgiou H, Androutsopoulos I, Manandhar S (2014) SemEval-2014 task 4: aspect based sentiment analysis. In: Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014). Association for Computational Linguistics, Dublin, Ireland, pp 27–35

  37. Pontiki M, Galanis D, Papageorgiou H (2015) SemEval-2015 task 12: aspect based sentiment analysis. In: Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015), Denver, Colorado, June 4–5, 2015

  38. Pontiki M, Galanis D, Papageorgiou H et al (2016) SemEval-2016 task 5: aspect based sentiment analysis. In: Proceedings of the 10th international workshop on semantic evaluation. Association for Computational Linguistics, San Diego, California, pp 19–30

  39. Riloff E, Patwardhan S, Wiebe J (2006) Feature subsumption for opinion analysis. In: Proceedings of the 2006 conference on empirical methods in natural language processing (EMNLP 2006), Sydney, July 2006. Association for Computational Linguistics, pp 440–448

  40. Rossetti M, Stella F, Zanker M (2016) Analyzing user reviews in tourism with topic models. Inf Technol Tour 16:5–21.


  41. Saroufim C, Almatarky A, AbdelHady M (2018) Language independent sentiment analysis with sentiment-specific word embeddings. In: Proceedings of the 9th workshop on computational approaches to subjectivity, sentiment and social media analysis. Association for Computational Linguistics, Brussels, Belgium, pp 14–23

  42. Schouten K, Frasincar F (2016) Survey on aspect-level sentiment analysis. IEEE Trans Knowl Data Eng 28:813–830


  43. Scott WA (1955) Reliability of content analysis: the case of nominal scale coding. Public Opin Q 19:321–325.


  44. Siegel S (1988) Nonparametric statistics for the behavioral sciences. McGraw-Hill, New York

  45. Stenetorp P, Pyysalo S, Topić G, et al (2012) BRAT: a web-based tool for NLP-assisted text annotation. In: Proceedings of the 13th conference of the European chapter of the association for computational linguistics, Avignon, France, April 23–27 2012, pp 102–107

  46. Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Comput Linguist 37(2):267–307


  47. Tang D, Wei F, Yang N, et al (2014) Learning sentiment-specific word embedding for Twitter sentiment classification. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (volume 1: Long Papers). Association for Computational Linguistics, pp 1555–1565

  48. Thelwall M, Buckley K, Paltoglou G, Cai D (2010) Sentiment strength detection in short informal text. J Am Soc Inf Sci Technol 61:2544–2558.

  49. Turney PD (2002) Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th annual meeting of the association for computational linguistics (ACL), Philadelphia, July 2002. pp 417–424

  50. Wang B, Liu M (2015) Deep learning for aspect-based sentiment analysis. Stanford University report.

  51. Ye Q, Law R, Gu B, Chen W (2011) The influence of user-generated content on traveler behavior: an empirical investigation on the effects of e-word-of-mouth to hotel online bookings. Comput Hum Behav 27:634–639.


  52. Zaenen A (2006) Mark-up barking up the wrong tree. Comput Linguist 32:577–580.



This research has been sponsored by the Spanish Government under Grant FFI2016-78141-P (Lingmotif2).

Author information



Corresponding author

Correspondence to Soluna Salles-Bernal.


About this article


Cite this article

Moreno-Ortiz, A., Salles-Bernal, S. & Orrequia-Barea, A. Design and validation of annotation schemas for aspect-based sentiment analysis in the tourism sector. Inf Technol Tourism 21, 535–557 (2019).



  • Annotation schema
  • Aspect-based sentiment analysis
  • Inter-rater agreement
  • Tourism industry
  • User-generated content