An empirical approach to syntax learning

Naumann, Sven; Schrepp, Jürgen

doi:10.1007/978-3-642-77809-4_22

Conference paper

Part of the book series: Informatik aktuell (INFORMAT)

Abstract

This paper outlines a system designed to infer a grammar from a collection of linguistic data (a corpus). An incremental learning algorithm produces a sequence of grammars that approximates the target grammar of the data provided.

In each step, a small set of sentences is selected and analysed by a special parser, which produces partial structural descriptions for sentences not covered by the current grammar. The sentence that minimizes the inductive leap for the learner is selected. For this sentence, several hypotheses for completing its partial structural description are formulated and evaluated. The “best” hypothesis is then used to infer a new grammar. This process continues until the grammar covers the corpus completely.
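
The loop described above can be illustrated with a toy sketch in Python. It is not the authors' implementation: the abstract does not specify the parser, the measure of the inductive leap, or the hypothesis evaluation, so everything below is an assumption made for illustration. Sentences are taken to be tuples of part-of-speech tags, the "special parser" is replaced by a greedy bottom-up reduction whose leftover symbols stand in for the partial structural description, the inductive leap is approximated by the number of leftover symbols, and a hypothesis is scored by how many corpus sentences the extended grammar covers. The names `partial_parse`, `hypotheses`, `coverage` and `learn` are hypothetical, and the seed grammar is assumed to contain no unit rules.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

Symbols = Tuple[str, ...]
Rule = Tuple[str, Symbols]   # (left-hand side, right-hand side)
GOAL: Symbols = ("S",)       # a sentence counts as covered if it reduces to S


def partial_parse(sentence: Symbols, grammar: Set[Rule]) -> Symbols:
    """Greedy bottom-up reduction (a stand-in for the paper's parser).
    The symbols left over play the role of a partial description."""
    rules = sorted(grammar, key=lambda r: (-len(r[1]), r))  # longest RHS first
    symbols = tuple(sentence)
    while True:
        for lhs, rhs in rules:
            n = len(rhs)
            hit = next((i for i in range(len(symbols) - n + 1)
                        if symbols[i:i + n] == rhs), None)
            if hit is not None:  # reduce the leftmost match, then start over
                symbols = symbols[:hit] + (lhs,) + symbols[hit + n:]
                break
        else:
            return symbols       # no rule applies any more


def coverage(grammar: Set[Rule], corpus: Sequence[Symbols]) -> int:
    """Number of corpus sentences that reduce to S."""
    return sum(partial_parse(s, grammar) == GOAL for s in corpus)


def hypotheses(residual: Symbols, grammar: Set[Rule]) -> List[Rule]:
    """Candidate rules: S -> whole residual, plus N -> span for every
    known nonterminal N and every contiguous span of length >= 2."""
    nonterminals = sorted({lhs for lhs, _ in grammar} | {"S"})
    cands: List[Rule] = [("S", residual)]
    for i in range(len(residual)):
        for j in range(i + 2, len(residual) + 1):
            for nt in nonterminals:
                rule = (nt, residual[i:j])
                if rule not in grammar and rule not in cands:
                    cands.append(rule)
    return cands


def learn(corpus: Sequence[Symbols], seed: Set[Rule]) -> List[FrozenSet[Rule]]:
    """Return the sequence of grammars produced, one per learning step."""
    grammar = set(seed)
    history = [frozenset(grammar)]
    while True:
        residuals = {s: partial_parse(s, grammar) for s in corpus}
        uncovered = [s for s in corpus if residuals[s] != GOAL]
        if not uncovered:
            return history
        # Assumed leap measure: the fewest leftover symbols.
        target = min(uncovered, key=lambda s: len(residuals[s]))
        # Assumed evaluation: most sentences covered, then shortest rule.
        best = max(hypotheses(residuals[target], grammar),
                   key=lambda h: (coverage(grammar | {h}, corpus), -len(h[1])))
        if coverage(grammar | {best}, corpus) <= len(corpus) - len(uncovered):
            raise RuntimeError("no hypothesis improves coverage")
        grammar.add(best)
        history.append(frozenset(grammar))


seed = {("NP", ("DET", "N")), ("S", ("NP", "V", "NP"))}
corpus = [
    ("DET", "N", "V", "DET", "N"),
    ("DET", "ADJ", "N", "V", "DET", "N"),
    ("DET", "N", "V", "DET", "ADJ", "N"),
]
history = learn(corpus, seed)
```

On this toy corpus the sketch needs one step: it adds the rule NP -> DET ADJ N, which covers both previously uncovered sentences, whereas the rule S -> DET ADJ N V NP would cover only one of them. Greedy reduction is not a complete parser, which is why the loop stops with an error when no hypothesis improves coverage instead of assuming progress.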

Summary (translated from the German)

We describe the main features of a system that, given a set of linguistic data (a corpus), generates a syntax for these data. The core of the system is an incremental learning algorithm that generates a sequence of grammars reflecting the course of the learning process.

In each step, a small set of sentences is selected from the corpus. They are analysed with a special parser that generates partial descriptions for the sentences not covered by the current syntax. Of these sentences, the one that minimizes the inductive step needed to generate the new syntax is selected. The partial structural description of this sentence forms the basis for formulating hypotheses for extending the syntax. The process terminates as soon as the current syntax covers the corpus completely.

Author information

Authors and Affiliations

Computational Linguistics, University of Trier, P.O. Box 3825, W-5500, Trier, Germany
Sven Naumann & Jürgen Schrepp

Editor information

Editors and Affiliations

Informatik (IMMD) VIII und FORWISS Erlangen, Universität Erlangen-Nürnberg, Am Weichselgarten 9, W-8520, Erlangen, Germany
Günther Görz

About this paper

Cite this paper

Naumann, S., Schrepp, J. (1992). An empirical approach to syntax learning. In: Görz, G. (eds) Konvens 92. Informatik aktuell. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-77809-4_22

DOI: https://doi.org/10.1007/978-3-642-77809-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55959-7
Online ISBN: 978-3-642-77809-4
eBook Packages: Springer Book Archive