Algorithms for Identification Key Generation and Optimization with Application to Yeast Identification
Algorithms for the automated creation of low-cost identification keys are described, and theoretical and empirical justifications are provided. The algorithms are shown to handle differing test costs, prior probabilities for each potential diagnosis, and tests that produce uncertain results. The approach is then extended, by allowing tests to be performed in batches, to cover situations where more than one measure of cost is of importance. Experiments are performed on a real-world case study involving the identification of yeasts.
Keywords: Greedy Algorithm, Shannon Entropy, Material Cost, Test Cost, Weighted Cost
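To make the idea of a greedy, cost-aware key concrete, the following is a minimal sketch and not the algorithm from the paper itself. It assumes tests with certain (deterministic) outcomes and a selection criterion of Shannon entropy reduction per unit test cost, which is one common heuristic and may differ from the criteria the authors study. The taxa, tests, costs, and prior probabilities are all hypothetical illustration data.

```python
import math

def entropy(weights):
    """Shannon entropy (bits) of a list of non-negative weights, normalised to sum to 1."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

def split(taxa, test, table):
    """Group taxa by their (assumed certain) outcome on the given test."""
    groups = {}
    for taxon in taxa:
        groups.setdefault(table[taxon][test], []).append(taxon)
    return groups

def best_test(taxa, tests, table, priors, costs):
    """Greedy step: pick the test with the largest entropy reduction per unit cost."""
    total = sum(priors[t] for t in taxa)
    h_before = entropy([priors[t] for t in taxa])
    best, best_score = None, 0.0
    for test in tests:
        groups = split(taxa, test, table)
        if len(groups) < 2:
            continue  # this test does not separate the remaining taxa
        h_after = sum(
            (sum(priors[t] for t in g) / total) * entropy([priors[t] for t in g])
            for g in groups.values()
        )
        score = (h_before - h_after) / costs[test]
        if score > best_score:
            best, best_score = test, score
    return best

def build_key(taxa, tests, table, priors, costs):
    """Return a nested key: a taxon name, a list of inseparable taxa, or (test, branches)."""
    if len(taxa) == 1:
        return taxa[0]
    test = best_test(taxa, tests, table, priors, costs)
    if test is None:
        return sorted(taxa)  # no remaining test distinguishes these taxa
    remaining = [t for t in tests if t != test]
    branches = {
        outcome: build_key(group, remaining, table, priors, costs)
        for outcome, group in split(taxa, test, table).items()
    }
    return (test, branches)

def expected_cost(key, priors, costs):
    """Prior-weighted mean cost of the tests performed to reach a leaf of the key."""
    def walk(node, spent):
        if isinstance(node, str):
            return priors[node] * spent
        if isinstance(node, list):
            return sum(priors[t] for t in node) * spent
        test, branches = node
        return sum(walk(child, spent + costs[test]) for child in branches.values())
    return walk(key, 0.0) / sum(priors.values())

# Hypothetical data: four taxa, three binary tests, unequal costs and priors.
table = {
    "A": {"t1": "+", "t2": "+", "t3": "+"},
    "B": {"t1": "+", "t2": "-", "t3": "-"},
    "C": {"t1": "-", "t2": "+", "t3": "-"},
    "D": {"t1": "-", "t2": "-", "t3": "-"},
}
priors = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
costs = {"t1": 1.0, "t2": 1.0, "t3": 0.5}

key = build_key(list(table), list(costs), table, priors, costs)
cost = expected_cost(key, priors, costs)
print(key)
print(cost)
```

In this toy data the cheap test `t3` is chosen first because it isolates the most probable taxon at half the cost of the other tests. Handling uncertain test results or batched tests, as the abstract describes, would need extensions that this sketch does not attempt.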