
Mining Uncertain Sentences with Multiple Instance Learning

  • Feng Ji
  • Xipeng Qiu
  • Xuanjing Huang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6440)

Abstract

Distinguishing uncertain information from factual information in online texts is essential for information extraction, because uncertain information can mislead systems into extracting useless or even false information. In this paper, we propose a method for detecting uncertain sentences with multiple instance learning (MIL). Starting from the basic MIL assumption, we define a probability margin and derive two new constraints for estimating the weight vector, which are then used in an online learning algorithm known as the Passive-Aggressive algorithm. To demonstrate the effectiveness of our method, we run experiments on a biomedical corpus. Compared with an intuitive baseline built on conventional single instance learning (SIL), our method raises performance from 79.07% to 82.55%, an improvement of more than 3 percentage points.
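As a rough illustration of how MIL and a Passive-Aggressive update can fit together, the sketch below applies the standard PA-I step (Crammer et al., 2006) to the highest-scoring instance of each bag. It is not the probability-margin formulation or the two constraints derived in the paper, which the abstract does not spell out. Treating a sentence as a bag of per-token feature vectors, the hinge loss with margin 1, and all names (`bag_score`, `pa_mil_update`, `train`, `predict`, the parameter `C`) are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Bag = List[Vector]  # assumed: one bag per sentence, one instance per token


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def bag_score(w: Vector, bag: Bag) -> Tuple[float, Vector]:
    """Score a bag by its highest-scoring instance (standard MIL assumption:
    a bag is positive if at least one of its instances is positive)."""
    best = max(bag, key=lambda x: dot(w, x))
    return dot(w, best), best


def pa_mil_update(w: Vector, bag: Bag, label: int, C: float = 1.0) -> Vector:
    """One PA-I step on a bag; label is +1 (uncertain) or -1 (factual)."""
    score, x = bag_score(w, bag)
    loss = max(0.0, 1.0 - label * score)  # hinge loss with margin 1
    sq_norm = dot(x, x)
    if loss == 0.0 or sq_norm == 0.0:
        return list(w)  # passive: the margin is already satisfied
    tau = min(C, loss / sq_norm)  # aggressive step size, capped at C
    return [wi + tau * label * xi for wi, xi in zip(w, x)]


def train(bags: List[Bag], labels: List[int], dim: int,
          epochs: int = 5, C: float = 1.0) -> Vector:
    """Run several online passes over the labelled bags."""
    w = [0.0] * dim
    for _ in range(epochs):
        for bag, y in zip(bags, labels):
            w = pa_mil_update(w, bag, y, C)
    return w


def predict(w: Vector, bag: Bag) -> int:
    """Label a bag +1 if its best instance scores above zero, else -1."""
    return 1 if bag_score(w, bag)[0] > 0 else -1


# Toy data: feature 0 marks a hypothetical hedge cue, feature 1 any other token.
bags = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
labels = [1, -1]
w = train(bags, labels, dim=2)
print(w, predict(w, bags[0]), predict(w, bags[1]))
```

Updating only on the highest-scoring instance means a positive bag needs just one instance pushed above the margin, while a negative bag has its most suspicious instance pushed below it, which bounds every other instance in that bag as well.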

Keywords

Uncertain sentence; Multiple instance learning; Passive-Aggressive algorithm



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Feng Ji
    • 1
  • Xipeng Qiu
    • 1
  • Xuanjing Huang
    • 1
  1. School of Computer Science and Technology, Fudan University, Shanghai, China
