Combined Search in Structured and Unstructured Medical Data

Heller, David

doi:10.1007/978-3-319-03035-7_8

Part of the book series: In-Memory Data Management Research (IMDM)

Abstract

Today, structured medical data is often considered separately from its unstructured counterpart. A search for a specific piece of information examines either structured sources, e.g. genomic variant lists and electronic medical records, or unstructured sources, e.g. medical papers, research documentation, and trial descriptions. However, structured data, such as a patient’s genomic data, can be valuable when searching unstructured data, such as clinical trial proposals, for information relevant to that patient. Consequently, today’s separation of the two source types impedes insight into the relationships between them. In this contribution, I propose using in-memory databases to combine search results from structured and unstructured medical data, and I introduce a research prototype of a clinical trial search tool. The prototype suggests matching clinical trials based on a patient’s genome and benefits from the analytical performance of the in-memory database. Furthermore, I investigate how an increasing amount of medical input data affects the performance of the prototype.
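
To illustrate the idea of combining the two source types, the following is a minimal sketch in plain Python and not the prototype described in the chapter, which builds on an in-memory database. It assumes that the structured side has already been reduced to a set of gene symbols in which a patient carries variants, and that the unstructured side is a handful of free-text trial descriptions. The trial IDs, texts, gene set, and the function name `rank_trials` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical structured input: genes in which the patient carries variants.
patient_variant_genes = {"BRAF", "EGFR"}

# Hypothetical unstructured input: free-text trial descriptions with made-up IDs.
trials = {
    "TRIAL-A": "A phase II study of a BRAF inhibitor in melanoma patients "
               "with BRAF V600E mutations.",
    "TRIAL-B": "Evaluation of an EGFR-targeted therapy in non-small cell lung cancer.",
    "TRIAL-C": "Exercise intervention for patients with chronic lower back pain.",
}

# Split text into alphanumeric tokens, so "EGFR-targeted" yields "EGFR".
TOKEN = re.compile(r"[A-Za-z0-9]+")


def rank_trials(genes, trial_texts):
    """Rank trials by how often the patient's affected genes are mentioned."""
    ranked = []
    for trial_id, text in trial_texts.items():
        counts = Counter(token.upper() for token in TOKEN.findall(text))
        score = sum(counts[gene] for gene in genes)
        if score > 0:
            ranked.append((trial_id, score))
    # Highest score first; ties are broken by trial ID for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


print(rank_trials(patient_variant_genes, trials))
```

Exact token matching is a deliberate simplification here. Gene names in real text are ambiguous and have synonyms, so a realistic system would need named entity recognition and a gene dictionary rather than a plain string comparison.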

Author information

Authors and Affiliations

Potsdam, Germany
David Heller

Corresponding author

Correspondence to David Heller.

Editor information

Editors and Affiliations

Enterprise Platform and Integration Concepts, Hasso Plattner Institute, Potsdam, Germany
Hasso Plattner
Enterprise Platform and Integration Concepts Chair, Hasso Plattner Institute, Potsdam, Germany
Matthieu-P. Schapranow

About this chapter

Cite this chapter

Heller, D. (2014). Combined Search in Structured and Unstructured Medical Data. In: Plattner, H., Schapranow, MP. (eds) High-Performance In-Memory Genome Data Analysis. In-Memory Data Management Research. Springer, Cham. https://doi.org/10.1007/978-3-319-03035-7_8

DOI: https://doi.org/10.1007/978-3-319-03035-7_8
Published: 19 November 2013
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03034-0
Online ISBN: 978-3-319-03035-7
eBook Packages: Biomedical and Life Sciences (R0)
