Chunking in Turkish with Conditional Random Fields

Yıldız, Olcay Taner; Solak, Ercan; Ehsani, Razieh; Görgün, Onur

doi:10.1007/978-3-319-18111-0_14

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9041)

Included in the following conference series: International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

In this paper, we report our work on chunking in Turkish. Our data come from a manual translation of a subset of the Penn Treebank. We exploited the tags already available in the trees to automatically identify and label chunks in the Turkish translations, and we trained a conditional random field (CRF) model on the annotated data. We report our results at different levels of chunk resolution.
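
As a rough illustration only, the sketch below shows one common way labeled chunk spans are turned into per-token labels that a sequence model such as a CRF can be trained on: BIO encoding. The abstract does not state which label scheme the authors used, so the BIO scheme, the function name `chunks_to_bio`, the span format, and the toy Turkish sentence are all assumptions made for this example.

```python
from typing import List, Tuple

# A chunk is (start, end_exclusive, label); this span format is an
# assumption for illustration, not the paper's data format.
Chunk = Tuple[int, int, str]


def chunks_to_bio(tokens: List[str], chunks: List[Chunk]) -> List[str]:
    """Convert chunk spans into one BIO tag per token.

    Tokens not covered by any chunk receive the tag "O".
    """
    tags = ["O"] * len(tokens)
    for start, end, label in chunks:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"chunk span out of range: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first token of the chunk
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the chunk
    return tags


if __name__ == "__main__":
    # Hypothetical sentence: "Küçük çocuk kitabı okudu"
    # ("The little child read the book").
    tokens = ["Küçük", "çocuk", "kitabı", "okudu"]
    chunks = [(0, 2, "NP"), (2, 3, "NP"), (3, 4, "VP")]
    for token, tag in zip(tokens, chunks_to_bio(tokens, chunks)):
        print(token, tag)
```

A coarser or finer set of chunk labels would change only the `label` values passed in, which is one simple way to think about different levels of chunk resolution under this encoding.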

Author information

Authors and Affiliations

Işık University, Istanbul, Turkey
Olcay Taner Yıldız, Ercan Solak, Razieh Ehsani & Onur Görgün
Alcatel Lucent Teletaş Telekomünikasyon A.Ş., Istanbul, Turkey
Onur Görgün

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico DF, Mexico
Alexander Gelbukh

Cite this paper

Yıldız, O.T., Solak, E., Ehsani, R., Görgün, O. (2015). Chunking in Turkish with Conditional Random Fields. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_14

DOI: https://doi.org/10.1007/978-3-319-18111-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18110-3
Online ISBN: 978-3-319-18111-0
eBook Packages: Computer Science, Computer Science (R0)
