A Unified Framework to Identify and Extract Uncertainty Cues, Holders, and Scopes in One Fell-Swoop

  • Rania Al-Sabbagh
  • Roxana Girju
  • Jana Diesner
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9041)

Abstract

We present a unified framework based on supervised sequence labelling methods to identify and extract uncertainty cues, holders, and scopes in one fell swoop, with an application to Arabic tweets. The underlying technology employs Support Vector Machines with a rich set of morphological, syntactic, lexical, semantic, pragmatic, dialectal, and genre-specific features, and yields an average F1 score of 0.759.
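
The sequence-labelling formulation can be illustrated with a minimal sketch. It assumes a single-layer BIO tag scheme with non-overlapping cue, holder, and scope spans, exact-match span scoring, and a macro average over the three span types; none of these details is stated in the abstract, and the English example sentence, the label names, and the function names are hypothetical. The classifier itself (an SVM in the paper) is left out: the sketch only shows how one tag sequence can carry all three span types and how an average F1 might be computed from it.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label); hypothetical layout


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping labelled spans as one BIO tag per token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence back into labelled spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a final open span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((start, i, label))
        if tag.startswith(("B-", "I-")):  # a stray I- tag opens a new span
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def span_f1(gold: List[Span], pred: List[Span], label: str) -> float:
    """Exact-match F1 for one span type (an assumed scoring scheme)."""
    g = {s for s in gold if s[2] == label}
    p = {s for s in pred if s[2] == label}
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(g)
    return 2 * precision * recall / (precision + recall)


# Toy English sentence standing in for a tweet (illustrative only).
tokens = ["Ali", "thinks", "prices", "will", "rise"]
gold = [(0, 1, "HOLDER"), (1, 2, "CUE"), (2, 5, "SCOPE")]
gold_tags = spans_to_bio(len(tokens), gold)

# A hypothetical tagger output that misses the first scope token.
pred_tags = ["B-HOLDER", "B-CUE", "O", "B-SCOPE", "I-SCOPE"]
pred = bio_to_spans(pred_tags)

labels = ["CUE", "HOLDER", "SCOPE"]
scores = {lab: span_f1(gold, pred, lab) for lab in labels}
average_f1 = sum(scores.values()) / len(labels)
print(gold_tags, scores, round(average_f1, 3))
```

Because every token receives exactly one tag, a single pass of a token-level classifier yields cues, holders, and scopes together, which is one plausible reading of the "unified" framing. Under exact-match scoring, the truncated scope in the toy prediction counts as a miss for that span type.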

Keywords

Uncertainty Automatic Analysis, Supervised Sequence Labeling, Unified Frameworks, Morphologically-Rich Languages, Twitter

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Linguistics and Beckman Institute, University of Illinois Urbana-Champaign, Champaign, USA
  2. Graduate School of Library and Information Science, University of Illinois Urbana-Champaign, Champaign, USA
