Introduction: Modeling, Learning and Processing of Text-Technological Data Structures
Textual Units as Data Structures
Researchers in many disciplines, sometimes working in close cooperation, have been concerned with modeling textual data in order to account for texts as the prime information unit of written communication. These disciplines include computer science and linguistics as well as more specialized fields such as computational linguistics and text technology. Many of these efforts share the aim of modeling textual data by means of abstract data types or data structures that support at least the semi-automatic processing of texts in any area of written communication.
Keywords
Semantic Relation; Textual Data; Semantic Distance; Document Structure; Textual Unit
Copyright information
© Springer-Verlag Berlin Heidelberg 2011