Document Mark-Up for Different Users and Purposes

  • David King
  • David R. Morse
Part of the Communications in Computer and Information Science book series (CCIS, volume 390)


Semantic enhancement of texts aids their use by researchers. However, marking up large bodies of text is slow and consumes scarce expert time. The task could be automated if marked-up texts were available to train and test mark-up tools. This paper examines the re-purposing of texts, originally marked up to support taxonomists, to provide computer scientists with training and test data for their mark-up tools. The re-purposing highlighted key differences between taxonomists and computer scientists, both in their requirements and in their approaches to mark-up.


Keywords: mark-up, XML, annotation, stand-off annotation, biodiversity





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • David King (1)
  • David R. Morse (1)
  1. Department of Computing and Communications, The Open University, Milton Keynes, UK
