
Data Reduction for Pattern Recognition and Data Analysis

Chapter in: Computational Intelligence: A Compendium

Part of the book series: Studies in Computational Intelligence (SCI, volume 115)

Pattern recognition underlies many human activities of great practical significance, such as data-based bankruptcy prediction, speech/image recognition, machine fault detection, and cancer diagnosis. Clearly, it would be immensely useful to build machines that perform pattern recognition tasks reliably and efficiently. The most general and natural pattern recognition frameworks rely mainly on statistical characterizations of patterns, under the assumption that the patterns are generated by a probabilistic system. Research on neural pattern recognition has also been widely conducted over the past few decades; in contrast to statistical methods, neural frameworks require no a priori assumptions about the data. Although different pattern recognition systems employ different working mechanisms, their basic procedure is essentially the same. A typical pattern recognition procedure consists of three sequential parts: a sensing model for collecting and preprocessing raw data from real sites, a data processing model (which includes feature extraction/selection and pattern selection), and a recognition/classification model [13, 58]. In building a pattern recognition system, the following basic issues must be addressed:

  • How to process the raw data for a pattern recognition task? This issue concerns the sensing and preprocessing stage of pattern recognition;

  • How to determine appropriate data for a given pattern recognition model? This is a central concern of the data processing stage: deleting noisy or redundant data (both features and patterns) generally improves recognition performance, as the sketch following this list illustrates;

  • How to design an appropriate classifier for a given data set? This topic has been widely discussed in the pattern recognition community, and various learning algorithms and models have been proposed to make recognition as accurate as possible while keeping the classifier as simple as possible.
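The data processing stage is where data reduction takes place. Below is a minimal sketch of the two reduction steps just named, assuming a data matrix X (samples by features) and non-negative integer class labels y in a NumPy array; the helper names (mutual_information, rank_features, condense) are our own illustration, not the chapter's code. The feature ranking follows the general mutual-information idea of Battiti [3]; the pattern selection implements Hart's condensed nearest neighbour rule [25].

```python
import numpy as np

def mutual_information(f, y, bins=10):
    """Estimate I(F; Y) for one continuous feature f against discrete
    labels y, using a simple histogram discretization of f."""
    f_disc = np.digitize(f, np.histogram_bin_edges(f, bins=bins)[1:-1])
    joint = np.zeros((bins, y.max() + 1))
    for fi, yi in zip(f_disc, y):
        joint[fi, yi] += 1.0
    joint /= joint.sum()
    pf = joint.sum(axis=1, keepdims=True)   # marginal over feature bins
    py = joint.sum(axis=0, keepdims=True)   # marginal over class labels
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pf @ py)[nz])).sum())

def rank_features(X, y, k):
    """Feature selection: keep the k features that individually carry
    the most information about the class labels."""
    scores = np.array([mutual_information(X[:, j], y)
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]

def condense(X, y):
    """Pattern selection (Hart's CNN): keep only the patterns a 1-NN
    rule needs to classify every training sample correctly."""
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            dists = np.linalg.norm(X[keep] - X[i], axis=1)
            if y[keep[int(np.argmin(dists))]] != y[i]:
                keep.append(i)   # misclassified: add it to the subset
                changed = True
    return np.array(keep)
```

For example, `cols = rank_features(X, y, k=20)` followed by `rows = condense(X[:, cols], y)` yields a reduced training set `X[rows][:, cols]`, `y[rows]` on which a downstream classifier can then be trained.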


References

  1. Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. National Academy of Sciences, 96(12): 6745-6750.

  2. Astrahan MM (1970) Speech analysis by clustering, or the hyperphoneme method. Stanford AI Project Memo, Stanford University, CA.

  3. Battiti R (1994) Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Networks, 5: 537-550.

  4. Bins J, Draper B (2001) Feature selection from huge feature sets. In: Proc. Intl. Conf. Computer Vision, July, Vancouver, Canada: 159-165.

  5. Bishop CM (1995) Neural Networks for Pattern Recognition. Oxford University Press, New York, NY.

  6. Blum AL, Langley P (1993) Selecting concise training sets from clean data. IEEE Trans. Neural Networks, 4(2): 305-318.

  7. Blum AL, Langley P (1997) Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2): 245-271.

  8. Bonnlander B (1996) Nonparametric selection of input variables for connectionist learning. PhD Thesis, Department of Computer Science, University of Colorado at Boulder, CU-CS-812-96.

  9. Caruana RA, Freitag D (1994) Greedy attribute selection. In: Cohen WW, Hirsh H (eds) Proc. 11th Intl. Conf. Machine Learning, New Brunswick, NJ, July. Morgan Kaufmann, San Francisco, CA: 28-36.

  10. Catlett J (1991) Megainduction: machine learning on very large databases. PhD Thesis, Department of Computer Science, University of Sydney, Australia.

  11. Chow TWS, Huang D (2005) Estimating optimal feature subsets using efficient estimation of high-dimensional mutual information. IEEE Trans. Neural Networks, 16(1): 213-224.

  12. Devijver PA, Kittler J (1982) Pattern Recognition: a Statistical Approach. Prentice Hall, Englewood Cliffs, NJ.

  13. Duda RO, Hart PE, Stork DG (2001) Pattern Classification. Wiley, New York, NY.

  14. Fraser AM, Swinney HL (1986) Independent coordinates for strange attractors from mutual information. Physical Review A, 33(2): 1134-1140.

  15. Freund Y, Seung H, Shamir E, Tishby N (1997) Selective sampling using the query by committee algorithm. Machine Learning, 28: 133-168.

  16. Friedman JH (1997) Data mining and statistics: what's the connection? In: Scott DW (ed) Proc. 29th Symp. Interface Between Computer Science and Statistics, Houston, TX, May (available online at http://www.stat.stanford.edu/~jhf/ftp/dm-stats.ps - last accessed March 2007).

  17. Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286: 531-537.

  18. Gray RM (1984) Vector quantization. IEEE ASSP Magazine, 1(2): 4-29.

  19. Gui J, Li H (2005) Penalized Cox regression analysis in the high-dimensional and low sample size settings, with application to microarray gene expression. Bioinformatics, 21(13): 3001-3008.

  20. Guyon I, Weston J, Barnhill S, Vapnik V (2002) Gene selection for cancer classification using support vector machines. Machine Learning, 46: 389-422.

  21. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J. Machine Learning Research, 3: 1157-1183.

  22. Hall MA (1999) Correlation-based feature selection for machine learning. PhD Thesis, Department of Computer Science, University of Waikato, New Zealand.

  23. Hall MA, Holmes G (2000) Benchmarking attribute selection techniques for data mining. Working Paper 00/10, Department of Computer Science, University of Waikato, New Zealand (available online at http://citeseer.ist.psu.edu/382752.html - last accessed March 2007).

  24. Han JW, Kamber M (2001) Data Mining: Concepts and Techniques. Morgan Kaufmann, San Francisco, CA.

  25. Hart PE (1968) The condensed nearest neighbour rule. IEEE Trans. Information Theory, 14: 515-516.

  26. Huang D, Chow TWS (2005) Efficiently searching the important input variables using Bayesian discriminant. IEEE Trans. Circuits and Systems - Part I, 52(4): 785-793.

  27. Huang D, Chow TWS (2006) Enhancing density-based data reduction using entropy. Neural Computation, 18: 470-495.

  28. Jain AK, Zongker D (1997) Feature selection: evaluation, application, and small sample performance. IEEE Trans. Pattern Analysis and Machine Intelligence, 19(2): 153-158.

  29. John GH, Kohavi R, Pfleger K (1994) Irrelevant features and the subset selection problem. In: Cohen WW, Hirsh H (eds) Proc. 11th Intl. Conf. Machine Learning, New Brunswick, NJ, July. Morgan Kaufmann, San Francisco, CA: 121-129.

  30. John GH, Langley P (1996) Static versus dynamic sampling for data mining. In: Simoudis E, Han J, Fayyad UM (eds) Proc. 2nd Intl. Conf. Knowledge Discovery and Data Mining, Portland, OR, August. AAAI Press, Menlo Park, CA: 367-370.

  31. Kohavi R, John GH (1998) The wrapper approach. In: Liu H, Motoda H (eds) Feature Extraction, Construction and Selection. Kluwer Academic Publishers, New York, NY: 33-50.

  32. Kohonen T (2001) Self-Organizing Maps. Springer-Verlag, London, UK.

  33. Kudo M, Sklansky J (1997) A comparative evaluation of medium and large-scale feature selectors for pattern classifiers. In: Pudil P, Novovicova J, Grim J (eds) Proc. 1st Intl. Workshop Statistical Techniques in Pattern Recognition, Prague, Czech Republic, June: 91-96.

  34. Kudo M, Sklansky J (2000) Comparison of algorithms that select features for pattern classifiers. Pattern Recognition, 33: 25-41.

  35. Kwak N, Choi C-H (2002) Input feature selection for classification problems. IEEE Trans. Neural Networks, 13: 143-159.

  36. Kwak N, Choi C-H (2002) Input feature selection by mutual information based on Parzen window. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(12): 1667-1671.

  37. Last M, Kandel A, Maimon O, Eberbach E (2000) Anytime algorithm for feature selection. In: Ziarko W, Yao Y (eds) Rough Sets and Current Trends in Computing (Proc. 2nd Intl. Conf. RSCTC), October, Banff, Canada. Springer-Verlag, London, UK: 16-19.

  38. Law M, Figueiredo M, Jain A (2002) Feature saliency in unsupervised learning. Technical Report, Department of Computer Science, Michigan State University (available at http://www.cse.msu.edu/~lawhiu/papers/TR02.ps.gz - last accessed March 2007).

  39. Lazzerini B, Marcelloni F (2001) Feature selection based on similarity. Electronics Letters, 38(3): 121-122.

  40. Lewis DD, Catlett J (1994) Heterogeneous uncertainty sampling for supervised learning. In: Cohen WW, Hirsh H (eds) Proc. 11th Intl. Conf. Machine Learning, New Brunswick, NJ, July. Morgan Kaufmann, San Francisco, CA: 148-156.

  41. Liu H, Motoda H (1998) Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic Publishers, London, UK.

  42. Liu H, Motoda H, Dash M (1998) A monotonic measure for optimal feature selection. In: Nedellec C, Rouveirol C (eds) Proc. European Conf. Machine Learning, Chemnitz, Germany, April. Springer-Verlag, London, UK: 101-106.

  43. Liu H, Motoda H, Yu L (2002) Feature selection with selective sampling. In: Sammut C, Hoffmann A (eds) Proc. 19th Intl. Conf. Machine Learning, Sydney, Australia, July. Morgan Kaufmann, San Francisco, CA: 395-402.

  44. MacKay D (1992) A practical Bayesian framework for backpropagation networks. Neural Computation, 4: 448-472.

  45. Mitra P, Murthy CA, Pal SK (2002) Density-based multi-scale data condensation. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(6): 734-747.

  46. Mitra P, Murthy CA, Pal SK (2002) Unsupervised feature selection using feature similarity. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(3): 301-312.

  47. Molina LC, Belanche L, Nebot A (2002) Feature selection algorithms: a survey and experimental evaluation. Technical Report, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya.

  48. Moon Y, Rajagopalan B, Lall U (1995) Estimation of mutual information using kernel density estimators. Physical Review E, 52: 2318-2321.

  49. Moore J, Han E, Boley D, Gini M, Gross R, Hastings K, Karypis G, Kumar V, Mobasher B (1997) Web page categorization and feature selection using association rule and principal component clustering. Proc. 7th Intl. Workshop Information Technologies and Systems, Atlanta, GA, December (available online at http://citeseer.ist.psu.edu/15436.html - last accessed March 2007).

  50. Narendra PM, Fukunaga K (1977) A branch and bound algorithm for feature subset selection. IEEE Trans. Computers, C-26(9): 917-922.

  51. Pal SK, De RK, Basak J (2000) Unsupervised feature evaluation: a neuro-fuzzy approach. IEEE Trans. Neural Networks, 11(2): 366-376.

  52. Plutowski M, White H (1993) Selecting concise training sets from clean data. IEEE Trans. Neural Networks, 4(2): 305-318.

  53. Provost F, Kolluri V (1999) A survey of methods for scaling up inductive algorithms. Data Mining and Knowledge Discovery, 3(2): 131-169.

  54. Pudil P, Novovicova J, Kittler J (1994) Floating search methods in feature selection. Pattern Recognition Letters, 15: 1119-1125.

  55. Roy N, McCallum A (2001) Toward optimal active learning through sampling estimation of error reduction. In: Brodley CE, Danyluk AP (eds) Proc. 18th Intl. Conf. Machine Learning, Williamstown, MA, June. Morgan Kaufmann, San Francisco, CA: 441-448.

  56. Setiono R, Liu H (1997) Neural network feature selector. IEEE Trans. Neural Networks, 8(3): 654-661.

  57. Siedlecki W, Sklansky J (1989) A note on genetic algorithms for large-scale feature selection. Pattern Recognition Letters, 10: 335-347.

  58. Theodoridis S, Koutroumbas K (1998) Pattern Recognition. Academic Press, London, UK.

  59. Tong S, Koller D (2000) Support vector machine active learning with applications to text classification. In: Langley P (ed) Proc. 17th Intl. Conf. Machine Learning, Stanford, CA, June. Morgan Kaufmann, San Francisco, CA: 999-1006.

  60. Wang H, Bell D, Murtagh F (1999) Axiomatic approach to feature subset selection based on relevance. IEEE Trans. Pattern Analysis and Machine Intelligence, 21(3): 271-277.

  61. Wang W, Jones P, Partridge D (2001) A comparative study of feature-salience ranking techniques. Neural Computation, 13: 1603-1623.

  62. Weston J, Mukherjee S, Chapelle O, Pontil M, Poggio T, Vapnik V (2001) Feature selection for SVMs. In: Solla SA, Leen TK, Muller K-R (eds) Advances in Neural Information Processing Systems 13. MIT Press, Cambridge, MA: 668-674.

  63. Wilson DR, Martinez TR (2000) Reduction techniques for instance-based learning algorithms. Machine Learning, 38: 257-286.

  64. Wolf L, Shashua A (2003) Feature selection for unsupervised and supervised inference: the emergence of sparsity in a weight-based approach. Technical Report 2003-58, June, Hebrew University, Israel.

  65. Xing EP, Jordan MI, Karp RM (2001) Feature selection for high-dimensional genomic microarray data. In: Brodley CE, Danyluk AP (eds) Proc. 18th Intl. Conf. Machine Learning, Williamstown, MA, June. Morgan Kaufmann, San Francisco, CA.

  66. Xu L, Yan P, Chang T (1988) Best first strategy for feature selection. Proc. 9th Intl. Conf. Pattern Recognition, Rome, Italy, November. IEEE Computer Society Press, Piscataway, NJ: 706-708.

  67. Yang J, Honavar VG (1998) Feature subset selection using a genetic algorithm. IEEE Intelligent Systems, 13(2): 44-49.

  68. Yang ZP, Zwolinski M (2001) Mutual information theory for adaptive mixture models. IEEE Trans. Pattern Analysis and Machine Intelligence, 23(4): 396-403.


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Chow, T.W.S., Huang, D. (2008). Data Reduction for Pattern Recognition and Data Analysis. In: Fulcher, J., Jain, L.C. (eds) Computational Intelligence: A Compendium. Studies in Computational Intelligence, vol 115. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78293-3_2


  • DOI: https://doi.org/10.1007/978-3-540-78293-3_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78292-6

  • Online ISBN: 978-3-540-78293-3
