Co-training for Implicit Discourse Relation Recognition Based on Manual and Distributed Features

Neural Processing Letters

Abstract

Implicit discourse relation recognition aims to identify the semantic relation between two sentences when no discourse connective is present. Because labeled data are scarce, previous work has generated additional training data automatically by removing the discourse connectives from explicit discourse relation instances. However, using these artificial data indiscriminately has been shown to degrade the performance of implicit discourse relation recognition. To address this problem, we propose a co-training approach based on manual features and distributed features, which selects useful instances from the artificial data to enlarge the labeled set. The distributed features are learned with recursive autoencoder based approaches, which capture, to some extent, the sentence semantics that are valuable for implicit discourse relation recognition. Experimental results on both the PDTB and CDTB data sets indicate that (1) the learned distributed features are complementary to the manual features and are thus suitable for co-training, and (2) the proposed co-training approach uses the artificial data effectively and significantly outperforms the baselines.
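The co-training idea in the abstract can be illustrated with a minimal sketch. This is a generic two-view loop in the style of Blum and Mitchell [13], not the paper's Algorithm 1, which is not reproduced on this page. The names `CentroidClassifier`, `co_train`, `iterations`, and `per_view` are hypothetical, the nearest-centroid classifier is a toy stand-in for the per-view classifier, and the pool of artificial instances is assumed to be a list of (view A, view B) feature pairs, where the two views stand for the manual and the distributed features.

```python
from math import dist


class CentroidClassifier:
    """Toy nearest-centroid model; a stand-in for the real per-view classifier."""

    def fit(self, X, y):
        # One centroid per label: the column-wise mean of that label's vectors.
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_with_confidence(self, x):
        # Confidence is the margin between the two closest centroids
        # (assumes at least two labels were seen during fit).
        ranked = sorted((dist(x, c), label) for label, c in self.centroids.items())
        (d_best, label), (d_second, _) = ranked[0], ranked[1]
        return label, d_second - d_best


def co_train(labeled, pool, iterations=5, per_view=1):
    """Grow `labeled` with pool instances that either view labels confidently.

    labeled: list of (view_a, view_b, label) triples.
    pool:    list of (view_a, view_b) pairs, treated here as unlabeled.
    """
    labeled, pool = list(labeled), list(pool)
    for _ in range(iterations):
        if not pool:
            break
        labels = [y for _, _, y in labeled]
        clf_a = CentroidClassifier().fit([a for a, _, _ in labeled], labels)
        clf_b = CentroidClassifier().fit([b for _, b, _ in labeled], labels)
        chosen = {}  # pool index -> label assigned by the selecting view
        for view_index, clf in ((0, clf_a), (1, clf_b)):
            scored = []
            for i, inst in enumerate(pool):
                label, conf = clf.predict_with_confidence(inst[view_index])
                scored.append((conf, i, label))
            # Each view contributes its `per_view` most confident instances.
            for _conf, i, label in sorted(scored, reverse=True)[:per_view]:
                chosen.setdefault(i, label)
        labeled.extend((pool[i][0], pool[i][1], label) for i, label in chosen.items())
        pool = [inst for i, inst in enumerate(pool) if i not in chosen]
    return labeled, pool
```

Calling `co_train(seed_triples, artificial_pairs)` returns the enlarged labeled set and whatever remains of the pool. Because each view only adds the instances it is most confident about, the labeled set grows selectively rather than absorbing the whole artificial pool.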

Notes

  1. In the PDTB, discourse relations are mainly defined between two clauses or sentences. Here, we use sentences for simplicity.

  2. For example, "nonetheless" is mapped to the Comparison relation.

  3. In this paper, we model implicit discourse relation recognition as four binary classification tasks (see Sect. 4.1).

  4. A number of classifiers can be used, such as Maximum Entropy (ME). In our experiments, SVM achieves the best results.

  5. We randomly duplicate positive instances (sampling with replacement) until the numbers of positive and negative instances are equal.

  6. FBIS is a sentence-aligned bilingual corpus consisting of 237,870 Chinese-English sentence pairs, with 6.72M Chinese words and 8.85M English words.

  7. We can also obtain artificial implicit instances from arbitrary text following the method in [5]. However, these artificial instances are much noisier because it is hard to identify the positions of their arguments.

  8. We use all the artificial instances selected up to iteration \(K=200\) of Algorithm 1.

  9. The pdtb-parse toolkit also marks EntRel (entity-based coherence) instances as implicit discourse relations.

  10. https://github.com/percyliang/brown-cluster.

  11. In Chinese, explicit instances account for about 18.0%.
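The balancing step described in note 5 can be sketched as follows. This is only an illustration under stated assumptions: the function name `oversample_positives` and the `seed` parameter are hypothetical, and instances are treated as opaque Python objects.

```python
import random


def oversample_positives(positives, negatives, seed=0):
    """Duplicate randomly drawn positives until both classes have equal size.

    Sampling is with replacement, so the same positive instance may be
    duplicated more than once. `seed` is a hypothetical parameter added
    here only to make the sketch reproducible.
    """
    if not positives and negatives:
        raise ValueError("cannot oversample an empty positive set")
    rng = random.Random(seed)
    balanced = list(positives)  # keep every original positive instance
    while len(balanced) < len(negatives):
        balanced.append(rng.choice(positives))
    return balanced, list(negatives)
```

For example, with two positive and five negative instances the function returns five positives: the two originals followed by three random duplicates. If the positives already outnumber the negatives, nothing is added.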

References

  1. Verberne S, Boves L, Oostdijk N, Coppen PA (2007) Evaluating discourse-based answer extraction for why-question answering. In: Proceedings of SIGIR, NY, USA, pp 735–736

  2. Louis A, Joshi A, Nenkova A (2010) Discourse indicators for content selection in summarization. In: Proceedings of SIGDIAL, pp 147–156

  3. Tu M, Zhou Y, Zong C (2014) Enhancing grammatical cohesion: generating transitional expressions for SMT. In: Proceedings of ACL. Maryland, USA, pp 850–860

  4. Prasad R, Dinesh N, Lee A, Miltsakaki E, Robaldo L, Joshi A, Webber B (2008) The Penn Discourse Treebank 2.0. In: Proceedings of LREC, vol 24, pp 2961–2968

  5. Marcu D, Echihabi A (2002) An unsupervised approach to recognizing discourse relations. In: Proceedings of ACL. PA, USA, pp 368–375

  6. Sporleder C, Lascarides A (2008) Using automatically labelled examples to classify rhetorical relations: an assessment. Nat Lang Eng 14(3):369–416

  7. Braud C, Denis P (2014) Combining natural and artificial examples to improve implicit discourse relation identification. In: Proceedings of COLING. Dublin, Ireland, pp 1694–1705

  8. Ji Y, Zhang G, Eisenstein J (2015) Closing the gap: domain adaptation from explicit to implicit discourse relations. In: Proceedings of EMNLP, Lisbon, Portugal, pp 2219–2224

  9. Rutherford A, Xue N (2015) Improving the inference of implicit discourse relations via classifying explicit discourse connectives. In: Proceedings of NAACL. Denver, Colorado, pp 799–808

  10. Lan M, Xu Y, Niu Z (2013) Leveraging synthetic discourse data via multi-task learning for implicit discourse relation recognition. In: Proceedings of ACL. Sofia, Bulgaria, pp 476–485

  11. Liu Y, Li S, Zhang X, Sui Z (2016) Implicit discourse relation classification via multi-task neural networks. In: Proceedings of AAAI. Arizona, USA, pp 2750–2756

  12. Wu C, Shi X, Chen Y, Huang Y, Su J (2016) Bilingually-constrained synthetic data for implicit discourse relation recognition. In: Proceedings of EMNLP. Austin, USA, pp 2306–2312

  13. Blum A, Mitchell T (1998) Combining labeled and unlabeled data with co-training. In: Proceedings of conference on learning theory, pp 92–100

  14. Wellner B, Pustejovsky J, Havasi C, Rumshisky A, Sauri R (2006) Classification of discourse coherence relations: an exploratory study using multiple knowledge sources. In: Proceedings of 7th Sigdial workshop on discourse and dialogue, pp 117–125

  15. Pitler E, Louis A, Nenkova A (2009) Automatic sense prediction for implicit discourse relations in text. In: Proceedings of ACL-IJCNLP. PA, USA, pp 683–691

  16. Lin Z, Kan MY, Ng HT (2009) Recognizing implicit discourse relations in the Penn Discourse Treebank. In: Proceedings of EMNLP. PA, USA, pp 343–351

  17. Rutherford AT, Xue N (2014) Discovering implicit discourse relations through Brown cluster pair representation and coreference patterns. In: Proceedings of EACL, pp 645–654

  18. Socher R, Pennington J, Huang EH, Ng AY, Manning CD (2011) Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Proceedings of EMNLP. PA, USA, pp 151–161

  19. Zhang J, Liu S, Li M, Zhou M, Zong C (2014) Bilingually-constrained phrase embeddings for machine translation. In: Proceedings of ACL. Maryland, USA, pp 111–121

  20. Li Y, Feng W, Sun J, Kong F, Zhou G (2014) Building Chinese discourse corpus with connective-driven dependency tree structure. In: Proceedings of EMNLP. Doha, Qatar, pp 2105–2114

  21. Kiritchenko S, Matwin S (2001) Email classification with co-training. In: Proceedings of the 2001 conference of the centre for advanced studies on collaborative research, pp 301–312

  22. Wan X (2009) Co-training for cross-lingual sentiment classification. In: Proceedings of ACL-IJCNLP. Suntec, Singapore, pp 235–243

  23. Sun S, Zhang Q (2011) Multiple-view multiple-learner semi-supervised learning. Neural Process Lett 34(3):229–240

  24. Caragea C, Bulgarov F, Mihalcea R (2015) Co-training for topic classification of scholarly data. In: Proceedings of EMNLP, Lisbon, Portugal, pp 2357–2366

  25. Balcan MF, Blum A, Yang K (2004) Co-training and expansion: towards bridging theory and practice. In: Proceedings of NIPS, pp 89–96

  26. Snoek CGM, Worring M, Smeulders AWM (2005) Early versus late fusion in semantic video analysis. In: Proceedings of the 13th annual ACM international conference on multimedia. ACM, New York, USA, pp 399–402

  27. Levin B (1993) English verb classes and alternations: a preliminary investigation. University of Chicago Press, Chicago

  28. Chen J, Zhang Q, Liu P, Qiu X, Huang X (2016) Implicit discourse relation detection via a deep architecture with gated relevance network. In: Proceedings of ACL, Berlin, Germany, pp 1726–1735

  29. Bengio Y, Ducharme R, Vincent P, Janvin C (2003) A neural probabilistic language model. J Mach Learn Res 3(6):1137–1155

  30. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. CoRR abs/1301.3781

  31. Li JJ, Nenkova A (2014) Addressing class imbalance for improved recognition of implicit discourse relations. In: Proceedings of SIGDIAL, Philadelphia, USA, pp 142–150

  32. Manning CD, Surdeanu M, Bauer J, Finkel J, Bethard SJ, McClosky D (2014) The Stanford CoreNLP natural language processing toolkit. In: Proceedings of ACL (system demonstrations), pp 55–60

  33. Turian J, Ratinov L, Bengio Y (2010) Word representations: a simple and general method for semi-supervised learning. In: Proceedings of ACL. PA, USA, pp 384–394

  34. Joachims T (2002) Learning to classify text using support vector machines. Springer, Boston

  35. Kuncheva LI, Whitaker CJ (2003) Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach Learn 51(2):181–207

  36. Mann WC, Thompson SA (1988) Rhetorical structure theory: toward a functional theory of text organization. Text 8(3):243–281

  37. Lin Z, Ng HT, Kan MY (2014) A PDTB-styled end-to-end discourse parser. Nat Lang Eng 20(02):151–184

  38. Ji Y, Eisenstein J (2015) One vector is not enough: entity-augmented distributed semantics for discourse relations. Trans Assoc Comput Linguist 3:329–344

  39. Zhou ZM, Xu Y, Niu ZY, Lan M, Su J, Tan CL (2010) Predicting discourse connectives for implicit discourse relation recognition. In: Proceedings of COLING. PA, USA, pp 1507–1514

  40. Patterson G, Kehler A (2013) Predicting the presence of discourse connectives. In: Proceedings of EMNLP, Washington, USA, pp 914–923

  41. Hernault H, Bollegala D, Ishizuka M (2010) A semi-supervised approach to improve classification of infrequent discourse relations using feature vector extension. In: Proceedings of EMNLP. Massachusetts, USA, pp 399–409

  42. Biran O, McKeown K (2013) Aggregated word pair features for implicit discourse relation disambiguation. In: Proceedings of ACL. Sofia, Bulgaria, pp 69–73

  43. Fisher R, Simmons R (2015) Spectral semi-supervised discourse relation classification. In: Proceedings of ACL-IJCNLP. Beijing, China, pp 89–93

  44. Xu C, Tao D, Xu C (2013) A survey on multi-view learning. arXiv:1304.5634

  45. Yu J, Rui Y, Tao D (2014) Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 23:2019–2032

  46. Xu C, Tao D, Xu C (2015) Multi-view intact space learning. IEEE Trans Pattern Anal Mach Intell 37(12):2531–2544

  47. Yu J, Yang X, Gao F, Tao D (2016) Deep multimodal distance metric learning using click constraints for image ranking. In: IEEE transactions on cybernetics, pp 1–11

Acknowledgements

We would like to thank all the reviewers for their constructive and helpful suggestions on this paper. This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 61573294, 61303082, 61672440), the Ph.D. Programs Foundation of Ministry of Education of China (Grant No. 20130121110040), the Fund of Research Project of Tibet Autonomous Region of China (Grant No. Z2014A18G2-13), and the Natural Science Foundation of Fujian Province (Grant No. 2016J05161).

Author information

Corresponding author

Correspondence to Xiaodong Shi.

About this article

Cite this article

Wu, C., Shi, X., Su, J. et al. Co-training for Implicit Discourse Relation Recognition Based on Manual and Distributed Features. Neural Process Lett 46, 233–250 (2017). https://doi.org/10.1007/s11063-017-9582-x

  • DOI: https://doi.org/10.1007/s11063-017-9582-x
