Research on Chinese negation and speculation: corpus annotation and identification

  • Research Article
  • Published in: Frontiers of Computer Science

Abstract

Distinguishing negative or speculative narrative fragments from facts is crucial for deep language understanding in natural language processing (NLP). In this paper, we first construct a Chinese corpus consisting of three sub-corpora drawn from different resources. We then present a general framework for Chinese negation and speculation identification. First, we propose a feature-based sequence labeling model to detect negative or speculative cues, together with a cross-lingual cue expansion strategy that increases the coverage of cue detection. On this basis, we present a new syntactic structure-based framework, in place of the traditional chunking-based framework, to identify the linguistic scope of a negative or speculative cue. Experimental results justify the usefulness of our Chinese corpus and the appropriateness of our syntactic structure-based framework, which shows significant improvement over the state of the art on Chinese negation and speculation identification.
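To make the cue detection step concrete, the following is a minimal illustrative sketch of cue detection framed as token-level sequence labeling. It is not the authors' implementation: the cue lexicons, the feature names, and the functions `token_features` and `baseline_cue_labels` are all hypothetical, and a plain lexicon lookup stands in for the trained sequence labeling model the abstract describes.

```python
from typing import Dict, List

# Hypothetical toy cue lexicons (assumption): the paper's actual cue
# inventory is not given in the abstract.
NEGATION_CUES = {"不", "没有", "未"}
SPECULATION_CUES = {"可能", "也许", "大概"}


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for token i, of the kind a feature-based
    sequence labeler could consume. The feature set is illustrative only."""
    return {
        "token": tokens[i],
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        "in_neg_lexicon": tokens[i] in NEGATION_CUES,
        "in_spec_lexicon": tokens[i] in SPECULATION_CUES,
    }


def baseline_cue_labels(tokens: List[str]) -> List[str]:
    """Assign one label per token using lexicon lookup only.
    A trained model would predict these labels from the features instead."""
    labels = []
    for tok in tokens:
        if tok in NEGATION_CUES:
            labels.append("B-NEG")
        elif tok in SPECULATION_CUES:
            labels.append("B-SPEC")
        else:
            labels.append("O")
    return labels


# Pre-segmented example sentence: "He may not come."
tokens = ["他", "可能", "不", "来"]
features = [token_features(tokens, i) for i in range(len(tokens))]
labels = baseline_cue_labels(tokens)
```

In this framing, scope identification is a separate second step that takes each detected cue as input; the sketch covers only the first step.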

Author information

Corresponding author

Correspondence to Qiaoming Zhu.

Additional information

Bowei Zou is a PhD candidate at Soochow University, China. He received his BS and MS degrees in computer science from Harbin Institute of Technology, China, in 2007 and 2009, respectively. His research interests include natural language processing, information extraction, and text mining.

Guodong Zhou received his PhD degree in computer science from the National University of Singapore, Singapore, in 1999. He joined the Institute for Infocomm Research, Singapore, in 1999, where he served as an associate scientist, scientist, and associate lead scientist until August 2006. Currently, he is a distinguished professor at the School of Computer Science and Technology, Soochow University, China. His research interests include natural language processing, information extraction, and machine learning.

Qiaoming Zhu received his PhD degree from Soochow University, China, in 2008. Currently, he is a professor at the university and serves as the deputy director of the Department of Science, Technology and Industry. His research interests include natural language processing, information extraction, and embedded systems.

About this article

Cite this article

Zou, B., Zhou, G. & Zhu, Q. Research on Chinese negation and speculation: corpus annotation and identification. Front. Comput. Sci. 10, 1039–1051 (2016). https://doi.org/10.1007/s11704-015-5101-2

  • DOI: https://doi.org/10.1007/s11704-015-5101-2
