Summary
Proteomics, the comprehensive and systematic study of the properties of all expressed proteins, has become a major research area in computational biology and bioinformatics. Among these properties, knowledge of the specific subcellular structures in which a protein is located is perhaps the most critical to a complete understanding of the protein’s roles and functions. Subcellular location is most commonly determined via fluorescence microscopy, an optical method relying on target-specific fluorescent probes. The resulting images are routinely analyzed by visual inspection, which may lead to ambiguous, inconsistent, and even inaccurate conclusions about subcellular location. We describe in this chapter an automatic and accurate system that can distinguish all major protein subcellular location patterns. This system employs numerous informative features extracted from the fluorescence microscope images. By selecting the most discriminative features from the entire feature set and applying various state-of-the-art classifiers, the system can outperform human experts in distinguishing protein patterns. The discriminative features can also be used for routine statistical analyses, such as selecting the most typical image from an image set and objectively comparing two image sets. The system can also be applied to cluster images from randomly tagged genes into statistically indistinguishable groups. Coupled with high-throughput imaging instruments, these methods represent a promising approach for the new discipline of location proteomics.
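As a rough illustration of how numeric features can support "selecting the most typical image from an image set", the sketch below picks the image whose feature vector lies closest to the centroid of the set. It assumes each image has already been reduced to a list of numbers. The function name `most_typical`, the z-score scaling, the Euclidean distance, and the example values are all assumptions made for illustration; the summary does not state which distance measure or procedure the chapter uses.

```python
from math import sqrt
from statistics import mean, pstdev


def most_typical(features):
    """Return the index of the feature vector closest to the set's centroid.

    `features` is a list of equal-length numeric lists, one per image.
    Each feature is z-scored first so that no single feature dominates.
    This is an illustrative choice, not the chapter's stated method.
    """
    n_features = len(features[0])
    columns = [[row[j] for row in features] for j in range(n_features)]
    means = [mean(col) for col in columns]
    # A constant feature has zero spread; use 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]

    def distance(row):
        # Euclidean distance to the centroid in z-scored feature space.
        return sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(row, means, stds)))

    distances = [distance(row) for row in features]
    return min(range(len(features)), key=distances.__getitem__)


# Hypothetical feature vectors (made-up numbers, for illustration only).
example = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [10.0, 90.0],
]
typical_index = most_typical(example)
```

Scaling each feature before measuring distance keeps features with large numeric ranges from swamping the others; a covariance-aware distance would be a natural alternative when features are strongly correlated.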
Keywords
- Support Vector Machine
- Kernel Principal Component Analysis
- Data Mining Method
- Fractal Dimensionality
- Stepwise Discriminant Analysis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2005 Springer-Verlag London Limited
About this chapter
Cite this chapter
Huang, K., Murphy, R.F. (2005). Data Mining Methods for a Systematics of Protein Subcellular Location. In: Wu, X., Jain, L., Wang, J.T., Zaki, M.J., Toivonen, H.T., Shasha, D. (eds) Data Mining in Bioinformatics. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-059-1_8
DOI: https://doi.org/10.1007/1-84628-059-1_8
Publisher Name: Springer, London
Print ISBN: 978-1-85233-671-4
Online ISBN: 978-1-84628-059-7
eBook Packages: Computer Science, Computer Science (R0)
