Abstract
Our research aims to develop a system that paraphrases written-language text into spoken-language style. Such a system must distinguish the words in an input text that are appropriate for spoken language from those that are not. We call this task lexical choice for paraphrasing. In this paper, we describe a lexical choice method that takes the topic of the text into account. The method is based on word probabilities estimated from written and spoken language corpora. Its novelty is topic adaptation: the corpora are classified into topic categories, and the probabilities are estimated from the corpora that share the topic of the input text. Our evaluation showed that topic adaptation is effective.
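The idea of comparing topic-matched word probabilities can be illustrated with a minimal sketch. The abstract does not give the estimator or the decision rule, so everything below is an assumption made for illustration only: the function names, the add-one smoothing, the log-ratio threshold, and the toy corpora are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def word_probability(word, counts, vocab_size, alpha=1.0):
    """Additively smoothed unigram probability (smoothing is an assumption)."""
    total = sum(counts.values())
    return (counts[word] + alpha) / (total + alpha * vocab_size)


def is_appropriate_for_spoken(word, topic, written, spoken, threshold=0.0):
    """Hypothetical decision rule: accept the word if, within the corpora
    of the given topic, it is at least as probable in spoken language as
    in written language (log-ratio >= threshold)."""
    w_counts = written[topic]  # written-language counts for this topic
    s_counts = spoken[topic]   # spoken-language counts for this topic
    vocab_size = len(set(w_counts) | set(s_counts) | {word})
    p_written = word_probability(word, w_counts, vocab_size)
    p_spoken = word_probability(word, s_counts, vocab_size)
    return math.log(p_spoken / p_written) >= threshold


# Toy corpora keyed by topic category (invented data, not from the paper).
written = {
    "politics": Counter("the government shall promulgate the statute".split()),
}
spoken = {
    "politics": Counter("the government will announce the law you know".split()),
}

for w in ("promulgate", "announce"):
    print(w, is_appropriate_for_spoken(w, "politics", written, spoken))
```

In this sketch, topic adaptation amounts to indexing both corpora by the topic of the input text before estimating probabilities; a non-adapted baseline would pool the counts over all topics instead.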
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kaji, N., Kurohashi, S. (2005). Lexical Choice via Topic Adaptation for Paraphrasing Written Language to Spoken Language. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_85
DOI: https://doi.org/10.1007/11562214_85
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29172-5
Online ISBN: 978-3-540-31724-1
eBook Packages: Computer Science (R0)