Testing and Evaluation of Statistical Software
Data sets analyzed by statisticians are likely to be ill-conditioned for the kinds of computations ordinarily performed on them. For this reason, most of the activity in testing and validation of statistical software has centered on numerical error analysis. The most generally effective testing and validation method has been test data generation. Once the parameters of numerical condition are identified, methods for systematically generating test data sets can be developed. In the case of least squares computations, for example, collinearity and stiffness seem to be the major components of condition. Test data sets with prespecified collinearity and stiffness are discussed.
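As a rough illustration of test data generation for least squares, the sketch below builds a two-predictor problem whose exact solution is known in advance, with one knob for collinearity and one for stiffness. It is not the construction used in the paper. The function names (`make_test_data`, `naive_fit`, `correct_digits`), the parameters `eps` and `offset`, and the reading of "stiffness" as a predictor mean that is large relative to its spread are all assumptions made for this example.

```python
import math


def make_test_data(n, eps, offset):
    """Two-predictor, no-intercept problem with known solution beta = (1, 2).

    offset: large values make the predictors stiff (mean much larger than spread).
    eps:    small values make x2 nearly collinear with x1.
    With an integer offset and eps a power of two, the values stay exactly
    representable in double precision for moderate sizes, so the known
    solution is not blurred by rounding in the data themselves.
    """
    beta = (1.0, 2.0)
    x1 = [float(offset + i) for i in range(n)]
    signs = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
    x2 = [a + eps * s for a, s in zip(x1, signs)]
    # Zero-residual response: the exact least squares solution is beta.
    y = [beta[0] * a + beta[1] * b for a, b in zip(x1, x2)]
    return x1, x2, y, beta


def naive_fit(x1, x2, y):
    """Deliberately naive routine under test: 2x2 normal equations, Cramer's rule."""
    a11 = sum(a * a for a in x1)
    a12 = sum(a * b for a, b in zip(x1, x2))
    a22 = sum(b * b for b in x2)
    b1 = sum(a * c for a, c in zip(x1, y))
    b2 = sum(b * c for b, c in zip(x2, y))
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        return None  # numerically singular in floating point
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def correct_digits(estimate, beta):
    """Approximate count of correct significant decimal digits, worst coefficient."""
    if estimate is None:
        return 0.0
    rel = max(abs(e - b) / abs(b) for e, b in zip(estimate, beta))
    if rel == 0.0:
        return 16.0  # roughly the limit of double precision
    return max(0.0, -math.log10(rel))


# Sweep from a benign problem to a stiff, nearly collinear one.
for eps, offset in [(1.0, 0), (2.0 ** -10, 1000), (2.0 ** -20, 10 ** 6)]:
    x1, x2, y, beta = make_test_data(10, eps, offset)
    print(eps, offset, correct_digits(naive_fit(x1, x2, y), beta))
```

Because the true coefficients are fixed by construction, any routine can be substituted for `naive_fit` and scored the same way as the two condition parameters are made more severe.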
The problem of software validation goes beyond the tests of the kernels that perform numerical computations. A program may involve the use of several computational modules, and errors in the program often occur in the setup stages in moving from one computational module to another. A stochastic model for errors remaining in a program after a sequence of tests is presented and discussed.
There are many statistical computations for which perturbation methods can be used easily to assess correctness. Perturbation methods can be employed by the user so as to avoid the bigger question of testing the software; the test is for the performance of the software on the specific problem of interest.
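A minimal sketch of such a user-side perturbation check, for a straight-line fit, might look like the following. The helper names (`fit_line`, `perturbation_check`), the relative perturbation size, and the number of trials are illustrative assumptions, not taken from the paper. The idea is to rerun the computation on slightly perturbed copies of the user's own data and see how far the results move.

```python
import random


def fit_line(x, y):
    """Least squares intercept and slope, computed from centered sums."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def perturbation_check(fit, x, y, rel_size=1e-8, trials=20, seed=0):
    """Largest relative change in any coefficient under small random
    relative perturbations of the data (hypothetical helper)."""
    rng = random.Random(seed)
    base = fit(x, y)
    worst = 0.0
    for _ in range(trials):
        xp = [xi * (1.0 + rel_size * rng.uniform(-1.0, 1.0)) for xi in x]
        yp = [yi * (1.0 + rel_size * rng.uniform(-1.0, 1.0)) for yi in y]
        for b, p in zip(base, fit(xp, yp)):
            scale = abs(b) if b != 0.0 else 1.0
            worst = max(worst, abs(p - b) / scale)
    return worst


# Well-scaled data versus the same line observed at a large offset in x.
x_easy = [float(i) for i in range(1, 11)]
x_stiff = [1.0e6 + xi for xi in x_easy]
for x in (x_easy, x_stiff):
    y = [3.0 + 2.0 * xi for xi in x]
    print(perturbation_check(fit_line, x, y))
```

If a perturbation at about the level of the data's own uncertainty moves the coefficients by much more than that level, the answer for this particular problem deserves suspicion, whatever general testing the software has received.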