Robust classification of high-dimensional data using artificial neural networks

Statistics and Computing

Abstract

This paper is concerned with the application of artificial neural networks (ANNs) to a practical, difficult and high-dimensional classification problem: discrimination between selected underwater sounds. The application allows a comparison of the relative performance of time-delay and fully connected network architectures in the analysis of temporal data. More originally, suggestions are given for adapting the conventional backpropagation algorithm to give greater robustness to misclassification errors in the training examples, a particular problem with underwater sound data and one which may arise in other realistic applications of ANNs. An informal comparison is made between the generalisation performance of various architectures in classifying real dolphin sounds when networks are trained using the conventional least squares norm, L2, the least absolute deviation norm, L1, and the Huber criterion, which involves a mixture of both L1 and L2. The results suggest that L1 and Huber may provide performance gains. In order to evaluate these ‘robust’ adjustments more formally under controlled conditions, an experiment is then conducted using simulated dolphin sounds with known levels of random noise and misclassification error. Here, the results are more ambiguous, and significant interactions are indicated which raise issues for future research.
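
The three training criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the textbook forms of the L2, L1 and Huber criteria applied to a single output residual, and the function names and the threshold `delta` are assumptions chosen for illustration. The derivative functions give the error signal that a backpropagation pass would start from at an output unit, before multiplying by the derivative of the unit's activation function.

```python
# Textbook forms of the three criteria for one residual r = target - output.
# All names and the default `delta` are illustrative assumptions, not values
# taken from the paper.

def l2_loss(r):
    """Least squares criterion: grows quadratically with the residual."""
    return 0.5 * r * r

def l1_loss(r):
    """Least absolute deviation criterion: grows linearly with the residual."""
    return abs(r)

def huber_loss(r, delta=0.5):
    """Huber criterion: quadratic for |r| <= delta, linear beyond it."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)

def l2_grad(r):
    """Derivative of l2_loss: proportional to the residual (unbounded)."""
    return r

def l1_grad(r):
    """Derivative of l1_loss: the sign of r (0 is used at r == 0)."""
    return float((r > 0) - (r < 0))

def huber_grad(r, delta=0.5):
    """Derivative of huber_loss: the residual clipped to [-delta, delta]."""
    return max(-delta, min(delta, r))

if __name__ == "__main__":
    # The last residual mimics a confidently classified example whose
    # training label is wrong, so the residual is large.
    for r in (0.05, -0.2, 0.95):
        print(r, l2_grad(r), l1_grad(r), huber_grad(r))
```

Under L2 the error signal grows with the residual, so a mislabelled training example pulls hardest on the weights. Under L1 and Huber the signal is bounded, which is the usual motivation for calling these criteria robust.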


Cite this article

Smith, D.J., Bailey, T.C. & Munford, A.G. Robust classification of high-dimensional data using artificial neural networks. Stat Comput 3, 71–81 (1993). https://doi.org/10.1007/BF00153066
