Abstract
Whenever a programmer writes a loop, or a mathematician does a proof by induction, an invariant is involved. The discovery and understanding of invariants often underlie problem solving in many domains. In this tutorial, I discuss powerful invariants in some problems relevant to biology and medicine. In the process, we learn several major paradigms (invariants, emerging patterns, guilt by association), some important applications (active sites, key mutations, origin of species, protein functions, disease diagnosis), some interesting technologies (sequence comparison, multiple alignment, machine learning, signal processing, microarrays), and the economics of bioinformatics.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wong, L. (2007). Manifestation and Exploitation of Invariants in Bioinformatics. In: Anai, H., Horimoto, K., Kutsia, T. (eds) Algebraic Biology. AB 2007. Lecture Notes in Computer Science, vol 4545. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73433-8_26
DOI: https://doi.org/10.1007/978-3-540-73433-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73432-1
Online ISBN: 978-3-540-73433-8
eBook Packages: Computer Science, Computer Science (R0)