International Conference on Intelligent Computer Mathematics

CICM 2012: Intelligent Computer Mathematics pp 422-426

MaxTract: Converting PDF to LaTeX, MathML and Text

  • Josef B. Baker
  • Alan P. Sexton
  • Volker Sorge
Conference paper

DOI: 10.1007/978-3-642-31374-5_29

Volume 7362 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Baker J.B., Sexton A.P., Sorge V. (2012) MaxTract: Converting PDF to LaTeX, MathML and Text. In: Jeuring J. et al. (eds) Intelligent Computer Mathematics. CICM 2012. Lecture Notes in Computer Science, vol 7362. Springer, Berlin, Heidelberg

Abstract

In this paper we present the first public, online demonstration of MaxTract, a tool that converts PDF files containing mathematics into multiple formats including LaTeX, HTML with embedded MathML, and plain text. Using a bespoke PDF parser and image analyser, we directly extract character and font information to use as input for a linear grammar which, in conjunction with specialised drivers, can accurately recognise and reproduce both the two-dimensional relationships between symbols in mathematical formulae and the one-dimensional relationships present in standard text.

The main goals of MaxTract are to provide translation services into standard mathematical markup languages and to add accessibility to mathematical documents on multiple levels. This includes accessibility in the narrow sense of providing access to content for print-impaired users, such as those with visual impairments, dyslexia or dyspraxia, and, more generally, enabling any user to access the mathematical content at more reusable levels than the merely visual. MaxTract enriches the regular text with mathematical markup, producing output compatible with web browsers, screen readers, and tools such as copy and paste. The output can also be used directly, within the limits of the presentation MathML produced, as machine-readable mathematical input to software systems such as Mathematica or Maple.
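
To make concrete the idea of turning two-dimensional symbol relationships into linear markup, the following is a minimal illustrative sketch. It is not MaxTract's grammar or drivers. The `Glyph` record, the `script_kind` and `to_latex` functions, and the `shift_ratio` threshold are all hypothetical names and values chosen for this example. It assumes each extracted character arrives with a horizontal position, a baseline height (PDF-style, where larger values are higher on the page) and a font size, and it only distinguishes superscripts and subscripts from inline symbols.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical record for one extracted character (not MaxTract's format)."""
    char: str        # the character itself
    x: float         # left edge of the glyph
    baseline: float  # baseline height; larger means higher on the page
    size: float      # font size in points


def script_kind(glyph, base, shift_ratio):
    """Return '^' or '_' if glyph looks like a script of base, else None."""
    shift = glyph.baseline - base.baseline
    # Assumed heuristic: a script is smaller than its base and clearly shifted.
    if glyph.size < base.size and abs(shift) > shift_ratio * base.size:
        return "^" if shift > 0 else "_"
    return None


def to_latex(glyphs, shift_ratio=0.25):
    """Linearise glyphs left to right, grouping runs of scripts in braces."""
    ordered = sorted(glyphs, key=lambda g: g.x)
    if not ordered:
        return ""
    base = ordered[0]
    out = [base.char]
    i = 1
    while i < len(ordered):
        kind = script_kind(ordered[i], base, shift_ratio)
        if kind is None:
            # An inline symbol becomes the new base for later scripts.
            base = ordered[i]
            out.append(base.char)
            i += 1
            continue
        # Collect consecutive glyphs that share the same script direction.
        run = []
        while i < len(ordered) and script_kind(ordered[i], base, shift_ratio) == kind:
            run.append(ordered[i].char)
            i += 1
        out.append(f"{kind}{{{''.join(run)}}}")
    return "".join(out)


example = [
    Glyph("x", 0, 100, 10), Glyph("2", 6, 104, 7),
    Glyph("+", 12, 100, 10),
    Glyph("y", 20, 100, 10), Glyph("i", 26, 97, 7), Glyph("j", 29, 97, 7),
]
print(to_latex(example))
```

A real system has to handle far more than this sketch does, such as fractions, radicals, matrices, nested scripts and the boundary between formulae and running text.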


Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Josef B. Baker (1)
  • Alan P. Sexton (1)
  • Volker Sorge (1)

  1. School of Computer Science, University of Birmingham, UK