XPath, XPointer, and XInclude

  • Deborah Nolan
  • Duncan Temple Lang
Chapter
Part of the Use R! book series (USE R)

Abstract

In this chapter, we focus on XPath, a domain-specific language that we can use from within R (among other languages) to query an XML tree for sets of nodes that match a pattern. XPath is quite simple but very powerful. Much as a file path identifies a file within a directory hierarchy, an XPath expression identifies nodes of interest by specifying paths through the tree, based on node names, node content, and a node’s relationship to other nodes in the hierarchy. We typically use XPath to locate nodes in a tree and then use R functions to extract data from those nodes and bring the data into R. The combination of R and XPath gives us very powerful and flexible facilities for working with XML, and anyone working with XML on a regular basis should learn the details of XPath. XPath is the primary tool for working with XML content, whether we are scraping data from Web pages, querying Web services, or processing local XML documents.
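The idea of selecting nodes by name, by content, and by relationship can be sketched in a few lines. The example below is not taken from the chapter. It is a minimal illustration that assumes Python's standard-library `xml.etree.ElementTree`, which implements only a subset of XPath 1.0, in place of the R tools the chapter discusses. The `catalog` document and all of its element names and values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented purely for illustration.
XML_TEXT = """
<catalog>
  <book year="2004">
    <title>First Title</title>
    <author>A. Writer</author>
  </book>
  <book year="2012">
    <title>Second Title</title>
    <author>B. Writer</author>
  </book>
  <journal year="2012">
    <title>Third Title</title>
  </journal>
</catalog>
"""

root = ET.fromstring(XML_TEXT)

# By node name: every <title> that is a child of a <book>, at any depth.
# After locating the nodes, .text pulls their content into Python.
book_titles = [node.text for node in root.findall(".//book/title")]

# By attribute value: <book> nodes whose year attribute equals "2012".
books_2012 = root.findall(".//book[@year='2012']")

# By node content: <book> nodes that have an <author> child with this text.
by_author = root.findall(".//book[author='A. Writer']")

# By relationship: from each <author>, step up to its parent (..) and then
# down to the sibling <title>. The <journal> has no <author>, so its title
# is not selected.
titles_with_author = [node.text for node in root.findall(".//author/../title")]
```

Each query follows the same two-step pattern the abstract describes: an XPath expression locates the nodes, and ordinary host-language code (here `.text` and `.get()`) extracts the data from them.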

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Deborah Nolan (1)
  • Duncan Temple Lang (2)
  1. Department of Statistics, University of California, Berkeley, USA
  2. Department of Statistics, University of California, Davis, USA
