Journal of Logic, Language and Information, Volume 19, Issue 1, pp 53-73

Querying Linguistic Trees

  • Catherine Lai (Department of Linguistics, University of Pennsylvania)
  • Steven Bird (Department of Computer Science and Software Engineering, University of Melbourne; Linguistic Data Consortium, University of Pennsylvania)


Large databases of linguistic annotations are used for testing linguistic hypotheses and for training language processing models. These linguistic annotations are often syntactic or prosodic in nature, and have a hierarchical structure. Query languages are used to select particular structures of interest, or to project out large slices of a corpus for external analysis. Existing languages suffer from a variety of problems in the areas of expressiveness, efficiency, and naturalness for linguistic query. We describe the domain of linguistic trees and discuss the expressive requirements for a query language. Then we present a language that can express a wide range of queries over these trees, and show that the language is first-order complete over trees.
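
As a rough illustration of what selecting "particular structures of interest" from a hierarchical annotation can mean, the sketch below hand-codes one query over a toy phrase-structure tree: find every NP that immediately dominates a PP. The tuple encoding, the example sentence, and the function names (`subtrees`, `leaves`, `np_dominating_pp`) are assumptions made for this sketch; it is not the query language presented in the article. If the tree were encoded as XML with node labels as element names, a comparable XPath expression would be roughly `//NP[PP]`.

```python
# Illustrative only: a toy phrase-structure tree and one hand-written query.
# The encoding and all names below are assumptions for this sketch, not the
# query language described in the article.

# A non-leaf node is (label, [children]); a leaf (word) is a plain string.
TREE = ("S", [
    ("NP", [("DT", ["the"]), ("NN", ["dog"])]),
    ("VP", [
        ("VBD", ["saw"]),
        ("NP", [
            ("DT", ["a"]),
            ("NN", ["man"]),
            ("PP", [
                ("IN", ["with"]),
                ("NP", [("DT", ["a"]), ("NN", ["telescope"])]),
            ]),
        ]),
    ]),
])


def subtrees(node):
    """Yield every non-leaf node in pre-order (document order)."""
    if isinstance(node, str):
        return
    yield node
    for child in node[1]:
        yield from subtrees(child)


def leaves(node):
    """Return the words covered by a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1]:
        words.extend(leaves(child))
    return words


def np_dominating_pp(tree):
    """Select NP nodes that have a PP as an immediate child."""
    return [
        n for n in subtrees(tree)
        if n[0] == "NP"
        and any(not isinstance(c, str) and c[0] == "PP" for c in n[1])
    ]


# Project each match out as the string of words it covers.
matches = [" ".join(leaves(n)) for n in np_dominating_pp(TREE)]
print(matches)
```

Hand-writing a traversal like this for every question is what a dedicated tree query language is meant to avoid; the sketch only shows the kind of structural condition (here, immediate dominance) such a language has to express.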


Keywords: Linguistic databases; Treebank; Tree query; XPath; Annotation; First order logic