A General Method for Combining Predictors Tested on Protein Secondary Structure Prediction

Hansen, Jakob V.; Krogh, Anders

doi:10.1007/978-1-4471-0513-8_39

Part of the book series: Perspectives in Neural Computing (PERSPECT.NEURAL)

Abstract

Ensemble methods, which combine several classifiers, have been successfully applied to decrease the generalization error of machine learning methods. For most ensemble methods, the ensemble members are combined by a weighted summation of their outputs, called the linear average predictor. The logarithmic opinion pool ensemble method instead uses a multiplicative combination of the ensemble members, which treats their outputs as independent probabilities. The advantage of the logarithmic opinion pool is its connection to the Kullback-Leibler error function, which can be decomposed into two terms: an average of the errors of the ensemble members, and the ambiguity. The ambiguity is independent of the target function and can be estimated using unlabeled data. The advantage of the decomposition is that an unbiased estimate of the generalization error of the ensemble can be obtained while still training on the full training set. These properties can be used to improve classification. The logarithmic opinion pool ensemble method is tested on the prediction of protein secondary structure. The focus is on how much improvement the general ensemble method can give rather than on outperforming existing methods, because that typically involves several more steps of refinement.
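
The decomposition described in the abstract can be illustrated with a small numerical sketch. The abstract gives no formulas, so the notation here is an assumption: the pooled prediction is taken to be the normalized weighted geometric mean of the member distributions, the ensemble error is KL(target || pooled), and the ambiguity is the weighted average of KL(pooled || member). The function names, the three example members, the uniform weights, and the three-class (helix, strand, coil) target are all hypothetical and are not taken from the chapter. The sketch also assumes every member assigns strictly positive probability to every class.

```python
import math


def log_opinion_pool(member_probs, weights):
    """Pool class distributions multiplicatively (weighted geometric mean).

    member_probs: list of per-member distributions over the same classes,
                  each with strictly positive entries (assumption).
    weights:      non-negative member weights that sum to 1 (assumption).
    Returns (pooled distribution, normalizer Z).
    """
    n_classes = len(member_probs[0])
    unnormalized = [
        math.exp(sum(w * math.log(p[c]) for p, w in zip(member_probs, weights)))
        for c in range(n_classes)
    ]
    z = sum(unnormalized)
    return [u / z for u in unnormalized], z


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q); classes with p[c] == 0 add 0."""
    return sum(pc * math.log(pc / qc) for pc, qc in zip(p, q) if pc > 0)


def ambiguity(member_probs, weights):
    """Weighted average of KL(pooled || member); it never looks at a target."""
    pooled, _ = log_opinion_pool(member_probs, weights)
    return sum(w * kl(pooled, p) for p, w in zip(member_probs, weights))


# Hypothetical outputs of three ensemble members for one residue,
# over the classes (helix, strand, coil).
members = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
]
weights = [1 / 3, 1 / 3, 1 / 3]
target = [1.0, 0.0, 0.0]  # hypothetical true class: helix

pooled, z = log_opinion_pool(members, weights)
ensemble_error = kl(target, pooled)
mean_member_error = sum(w * kl(target, p) for p, w in zip(members, weights))
amb = ambiguity(members, weights)

print("pooled prediction:       ", pooled)
print("ensemble error:          ", ensemble_error)
print("mean member error - amb.:", mean_member_error - amb)
```

Under these assumptions the two printed error values agree up to floating-point rounding, and the ambiguity equals -log Z. Because `ambiguity` takes no target argument, it can be evaluated on unlabeled inputs, which is the property the abstract relies on.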

Author information

Authors and Affiliations

Department of Computer Science, University of Aarhus, Ny Munkegade, Bldg. 540, DK-8000, Aarhus C, Denmark
Jakob V. Hansen
Center for Biological Sequence Analysis, Technical University of Denmark, Building 208, DK-2800, Lyngby, Denmark
Anders Krogh

Editor information

Editors and Affiliations

Department of Philosophy, Göteborg University, Box 200, SE-405 30, Göteborg, Sweden
Helge Malmgren BA, PhD, MD
Department of Electrical Engineering, Linköping University, SE-585 83, Linköping, Sweden
Magnus Borga MSc, PhD
Department of Computer Science, University of Skövde, PO Box 408, SE-541 28, Skövde, Sweden
Lars Niklasson BSc, MSc, PhD

Cite this paper

Hansen, J.V., Krogh, A. (2000). A General Method for Combining Predictors Tested on Protein Secondary Structure Prediction. In: Malmgren, H., Borga, M., Niklasson, L. (eds) Artificial Neural Networks in Medicine and Biology. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0513-8_39

DOI: https://doi.org/10.1007/978-1-4471-0513-8_39
Publisher Name: Springer, London
Print ISBN: 978-1-85233-289-1
Online ISBN: 978-1-4471-0513-8
eBook Packages: Springer Book Archive
