Summary
Despite decades of progress in database research, scientists in the life sciences community still struggle with inefficient and awkward tools for querying biological datasets. This work addresses one specific problem: searching large protein datasets by secondary structure. In this chapter we define an intuitive query language for expressing queries on secondary structure and develop several algorithms for evaluating these queries. We have implemented these algorithms in Periscope, a native database management system that we are building for declarative querying on biological datasets. Experiments based on our implementation show that the choice of algorithm can have a significant impact on query performance. As part of the Periscope implementation, we have also developed a framework for optimizing these queries and for accurately estimating the costs of the various query evaluation plans. Our performance studies show that the proposed techniques are very efficient and can provide scientists with interactive secondary structure querying even on large protein datasets.
Keywords
- Protein Secondary Structure
- Query Evaluation
- Query Optimizer
- Query Plan
- Selective Predicate
Copyright information
© 2005 Springer-Verlag London Limited
About this chapter
Cite this chapter
Patel, J.M., Huddler, D.P., Hammel, L. (2005). Declarative and Efficient Querying on Protein Secondary Structures. In: Wu, X., Jain, L., Wang, J.T., Zaki, M.J., Toivonen, H.T., Shasha, D. (eds) Data Mining in Bioinformatics. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-059-1_11
DOI: https://doi.org/10.1007/1-84628-059-1_11
Publisher Name: Springer, London
Print ISBN: 978-1-85233-671-4
Online ISBN: 978-1-84628-059-7
eBook Packages: Computer Science, Computer Science (R0)
