A Kernel-Based Case Retrieval Algorithm with Application to Bioinformatics
- Cite this paper as:
- Fu Y. et al. (2004) A Kernel-Based Case Retrieval Algorithm with Application to Bioinformatics. In: Zhang C., Guesgen H.W., Yeap W.K. (eds) PRICAI 2004: Trends in Artificial Intelligence. PRICAI 2004. Lecture Notes in Computer Science, vol 3157. Springer, Berlin, Heidelberg
Case retrieval in case-based reasoning relies heavily on the design of a good similarity function. This paper presents an approach that uses the correlative information among features to compute the similarity of cases for case retrieval. This is achieved by extending dot-product-based linear similarity measures to nonlinear versions by means of kernel functions. An application to the peptide retrieval problem in bioinformatics shows the effectiveness of the approach. In this problem, the objective is to retrieve, from a large database of known peptides, the peptide that corresponds to an input tandem mass spectrum. Because a kernel function implicitly maps the tandem mass spectrum to a high-dimensional space, the correlative information among fragment ions in the spectrum can be modeled, which dramatically reduces stochastic mismatches. An experiment on a dataset of real spectra shows a significant reduction of 10% in the error rate compared with a common linear similarity function.
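The idea of replacing a dot product with a kernel can be illustrated with a minimal sketch. The abstract does not say which linear measure or which kernel the authors use, so everything below is an assumption made for illustration: cosine similarity stands in for the linear measure, a degree-2 polynomial kernel stands in for the nonlinear one, and the names `kernel_cosine`, `polynomial_kernel`, and `retrieve` are hypothetical. A degree-2 polynomial kernel corresponds to a feature space that contains products of feature pairs, which is one way co-occurring features can contribute to the similarity score.

```python
import math


def dot(x, y):
    """Plain dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """Linear kernel: the ordinary dot product."""
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree.

    Assumption: chosen only for illustration; the paper's kernel may differ.
    """
    return (dot(x, y) + coef0) ** degree


def kernel_cosine(x, y, kernel):
    """Cosine similarity computed in the feature space induced by `kernel`."""
    denom = math.sqrt(kernel(x, x) * kernel(y, y))
    return kernel(x, y) / denom if denom > 0 else 0.0


def retrieve(query, case_base, kernel):
    """Return the name of the stored case most similar to the query."""
    return max(
        case_base,
        key=lambda name: kernel_cosine(query, case_base[name], kernel),
    )


if __name__ == "__main__":
    # Hypothetical toy case base: each case is a binary feature vector.
    case_base = {
        "case_a": [1, 1, 0, 0],
        "case_b": [0, 0, 1, 1],
    }
    query = [1, 1, 0, 0]
    print(retrieve(query, case_base, linear_kernel))
    print(retrieve(query, case_base, polynomial_kernel))
```

With `linear_kernel`, `kernel_cosine` reduces to ordinary cosine similarity, so the same retrieval routine covers both the linear baseline and its kernelized counterpart; only the `kernel` argument changes.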