Abstract
Searching for genes involved in traits (e.g. diseases), based on genetic data, is considered from a computational learning perspective. This leads to the problem of learning relevant variables of functions from data sampled from a certain class of distributions generalizing the uniform distribution. The Fourier transform of Boolean functions is applied to translate the problem into searching for local extrema of certain functions of observables. We work out the combinatorial structure of this approach and illustrate its potential use.
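As a rough illustration of the Fourier-based idea mentioned in the abstract, the sketch below computes the Fourier coefficients of a small Boolean function and reads off its relevant variables: a variable is relevant exactly when some nonzero coefficient belongs to a subset containing it. It assumes the plain uniform distribution and exhaustive enumeration of all inputs, not the crossover distribution or the sampled data treated in the paper. The function names and the example function are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def fourier_coefficients(f, n):
    """Exact Fourier coefficients of f: {0,1}^n -> {0,1}.

    Assumption: inputs are uniformly distributed, so each coefficient is
    the average of (-1)^f(x) * chi_S(x) over all 2^n points, where
    chi_S(x) = (-1)^(sum of x_i for i in S).
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            total = 0
            for x in points:
                parity = sum(x[i] for i in subset) % 2
                chi = -1 if parity else 1
                total += (1 - 2 * f(x)) * chi  # map f from {0,1} to {+1,-1}
            coeffs[subset] = total / len(points)
    return coeffs


def relevant_variables(f, n, tol=1e-12):
    """Indices that occur in at least one subset with a nonzero coefficient."""
    coeffs = fourier_coefficients(f, n)
    return sorted({i for s, c in coeffs.items() if abs(c) > tol for i in s})


def example(x):
    # Hypothetical 5-variable function depending only on x0, x1 and x3.
    return x[0] ^ (x[1] & x[3])


if __name__ == "__main__":
    print(relevant_variables(example, 5))
```

Enumerating all subsets is exponential in the number of variables, so this only serves to show the connection between nonzero coefficients and relevance; it is not a practical search procedure.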
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Damaschke, P. (2001). Approximate Location of Relevant Variables under the Crossover Distribution. In: Steinhöfel, K. (ed.) Stochastic Algorithms: Foundations and Applications. SAGA 2001. Lecture Notes in Computer Science, vol 2264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45322-9_13
DOI: https://doi.org/10.1007/3-540-45322-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43025-4
Online ISBN: 978-3-540-45322-2
eBook Packages: Springer Book Archive