Extracting Nanopublications from IR Papers

  • Aldo Lipani
  • Florina Piroi
  • Linda Andersson
  • Allan Hanbury
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8849)


Published scientific results should be reproducible; otherwise the community values the reported findings less. Several initiatives, such as myExperiment, RunMyCode, and DIRECT, contribute to the availability of data, experiments, and algorithms, and some of these experiments and algorithms are referenced or mentioned in later publications. In general, however, research articles that present experimental results only summarize the algorithms and data used. In the better cases, the articles point to a web link where the code can be found. We report here on our experience with extracting the data needed to potentially reproduce IR experiments. We also consider how to automate this information extraction and store the data as IR nanopublications, which can later be queried and aggregated by automated processes as the need arises.


Keywords: Natural Language Processing, Information Extraction, Information Retrieval System, SPARQL Query, Name Entity
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Aldo Lipani (1)
  • Florina Piroi (1)
  • Linda Andersson (1)
  • Allan Hanbury (1)
  1. Institute of Software Technology and Interactive Systems (ISIS), Vienna University of Technology, Austria
