Clause Boundary Identification Using Conditional Random Fields

  • R. Vijay Sundar Ram
  • Sobha Lalitha Devi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4919)


This paper discusses the detection of clause boundaries using a hybrid approach. Conditional Random Fields (CRFs), which take linguistic rules as features, identify the boundaries initially. The marked boundaries are then checked for false markings using an Error Pattern Analyser, and the false boundary markings are re-analysed using linguistic rules. Experiments with our approach show encouraging results that are comparable with those of other approaches.
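The three stages named in the abstract (a sequence labeller fed rule-derived features, an error-pattern check, and a rule-based re-analysis) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the S/E/O label scheme, the subordinator list, the "unbalanced boundary" error pattern, the repair rule, and every function name are hypothetical, and a toy labeller stands in for a trained CRF.

```python
from typing import Callable, Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Hypothetical word list; the paper's actual linguistic rules are not given here.
SUBORDINATORS = {"that", "because", "when", "which", "who", "if"}


def token_features(sent: List[Token], i: int) -> Dict[str, object]:
    """Rule-derived features for token i (assumed feature set)."""
    word, pos = sent[i]
    return {
        "word": word.lower(),
        "pos": pos,
        "is_subordinator": word.lower() in SUBORDINATORS,
        "is_verb": pos.startswith("VB"),
        "prev_pos": sent[i - 1][1] if i > 0 else "BOS",
        "next_pos": sent[i + 1][1] if i < len(sent) - 1 else "EOS",
    }


def find_unbalanced(labels: List[str]) -> List[int]:
    """Assumed error pattern: clause starts 'S' and ends 'E' that have no partner."""
    errors: List[int] = []
    open_starts: List[int] = []
    for i, lab in enumerate(labels):
        if lab == "S":
            open_starts.append(i)
        elif lab == "E":
            if open_starts:
                open_starts.pop()
            else:
                errors.append(i)  # end with no open start
    return sorted(errors + open_starts)  # leftover starts are errors too


def repair(labels: List[str], errors: List[int]) -> List[str]:
    """Toy re-analysis rule: close an unmatched start at the final token
    when that token is unlabelled; otherwise drop the unmatched boundary."""
    fixed = list(labels)
    last = len(fixed) - 1
    for i in errors:
        if fixed[i] == "S" and i != last and fixed[last] == "O":
            fixed[last] = "E"
        else:
            fixed[i] = "O"
    return fixed


def identify_clauses(
    sent: List[Token],
    labeller: Callable[[List[Dict[str, object]]], List[str]],
) -> List[str]:
    """Label, check for the error pattern, and repair only if needed."""
    feats = [token_features(sent, i) for i in range(len(sent))]
    labels = labeller(feats)
    errors = find_unbalanced(labels)
    return repair(labels, errors) if errors else labels


def toy_labeller(feats: List[Dict[str, object]]) -> List[str]:
    """Stand-in for a trained CRF: marks starts only, so it leaves errors behind."""
    return ["S" if f["is_subordinator"] else "O" for f in feats]


sentence = [("He", "PRP"), ("said", "VBD"), ("that", "IN"),
            ("she", "PRP"), ("left", "VBD")]
print(identify_clauses(sentence, toy_labeller))
```

In a real system the `labeller` argument would wrap a trained CRF model, and the check and repair functions would encode the error patterns and linguistic rules described in the full paper.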





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • R. Vijay Sundar Ram (1)
  • Sobha Lalitha Devi (1)
  1. AU-KBC Research Centre, MIT Campus, Anna University, Chromepet, Chennai, India
