
Comparing Small Graph Retrieval Performance for Ontology Concepts in Medical Texts

  • Daniel R. Schlegel
  • Jonathan P. Bona
  • Peter L. Elkin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9579)

Abstract

Some terminologies and ontologies, such as SNOMED CT, allow for post-coordinated as well as pre-coordinated expressions. Post-coordinated expressions are, essentially, small segments of the terminology graphs. Compositional expressions add logical and linguistic relations to the standard technique of post-coordination. When medical text is indexed, many instances of compositional expressions must be stored, and retrieval on that index must search both entire compositional expressions and sub-parts of those expressions. The problem thus becomes a small graph query against a large collection of small graphs, further complicated by the need to find sub-graphs within that collection. In previous systems using compositional expressions, such as iNLP, the index was stored in a relational database. We compare the retrieval characteristics of relational databases, triplestores, and general graph databases to determine which is most efficient for this task.
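
To make the retrieval problem concrete, the following is a minimal sketch, not the authors' implementation and not any of the systems compared in the paper. It assumes each compositional expression can be flattened into a set of (source, relation, target) triples, and it treats sub-graph retrieval as triple-set containment with exact label matching (no variable binding or graph isomorphism). The concept and relation labels, the expression ids, and the function names build_index and find_containing are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Assumption: an expression is a small graph given as labeled edges.
Triple = Tuple[str, str, str]  # (source concept, relation, target concept)


def build_index(expressions: Dict[str, Iterable[Triple]]) -> Dict[Triple, Set[str]]:
    """Map each triple to the ids of the expressions that contain it."""
    index: Dict[Triple, Set[str]] = defaultdict(set)
    for expr_id, triples in expressions.items():
        for triple in triples:
            index[triple].add(expr_id)
    return index


def find_containing(index: Dict[Triple, Set[str]], query: Iterable[Triple]) -> List[str]:
    """Return ids of expressions whose triples include every query triple."""
    candidate_sets = [index.get(triple, set()) for triple in query]
    if not candidate_sets:
        return []  # an empty query matches nothing in this sketch
    return sorted(set.intersection(*candidate_sets))


# Hypothetical expressions with illustrative labels (not real SNOMED CT codes).
expressions = {
    "e1": [("fracture", "finding_site", "femur"), ("fracture", "laterality", "left")],
    "e2": [("fracture", "finding_site", "femur")],
    "e3": [("fracture", "finding_site", "tibia")],
}
index = build_index(expressions)

# A one-edge query is a sub-graph of both e1 and e2.
print(find_containing(index, [("fracture", "finding_site", "femur")]))
# ['e1', 'e2']

# A two-edge query matches only the expression that contains both edges.
print(find_containing(index, [("fracture", "finding_site", "femur"),
                              ("fracture", "laterality", "left")]))
# ['e1']
```

The inverted index keeps each lookup proportional to the size of the query rather than the size of the collection, which is one way to read "a small graph query against a large collection of small graphs"; a relational database, triplestore, or graph database would realize the same idea with its own storage and query model.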

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Daniel R. Schlegel (1)
  • Jonathan P. Bona (1)
  • Peter L. Elkin (1)
  1. Department of Biomedical Informatics, University at Buffalo, Buffalo, USA
