Abstract
Today, structured medical data is often considered separately from its unstructured counterpart. A search for a specific piece of information examines either structured sources, e.g. genomic variant lists and electronic medical records, or unstructured sources, e.g. medical papers, research documentation, and trial descriptions. However, structured data, such as a patient’s genomic data, can be valuable when searching unstructured data, such as clinical trial proposals, for information relevant to that patient. Consequently, today’s separation of the two source types impedes insight into the relationships between them. In this contribution, I propose using in-memory databases to combine search results from structured and unstructured medical data, and I introduce a research prototype of a clinical trial search tool. The prototype suggests matching clinical trials based on a patient’s genome and benefits from the analytical performance of the in-memory database. Furthermore, I investigate how an increasing amount of medical input data affects the prototype’s performance.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Heller, D. (2014). Combined Search in Structured and Unstructured Medical Data. In: Plattner, H., Schapranow, MP. (eds) High-Performance In-Memory Genome Data Analysis. In-Memory Data Management Research. Springer, Cham. https://doi.org/10.1007/978-3-319-03035-7_8
DOI: https://doi.org/10.1007/978-3-319-03035-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03034-0
Online ISBN: 978-3-319-03035-7
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)