Abstract
An important problem in statistics is to study the effect of one or two factors on a dependent variable. This type of problem can be formulated as a regression problem (by using dummy (0,1) variables to represent the levels of factors), and the standard least squares (LS) analysis is well known. The least absolute value (LAV) analysis is less well known but is becoming more widely used, especially in exploratory data analysis. The purpose of this report is to present a didactic treatment of visual display methods useful in exploratory data analysis. These visual display techniques (stem-and-leaf, box-and-whisker, and two-way plots) are presented for both the least squares and the least absolute value analyses of a two-way classification model.
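To make the contrast between a least squares fit and a resistant fit of a two-way table concrete, the following is a minimal sketch and not the authors' procedure. It assumes a complete table with one observation per cell and an additive model (overall level plus row effect plus column effect). The LS residuals come from the usual row and column means. As a stand-in for the LAV analysis, which is normally solved by linear programming, the sketch uses median polish, a simple resistant fit that is not guaranteed to reach the LAV minimum. The function names and the small data table are hypothetical.

```python
from statistics import mean, median


def ls_residuals(table):
    """Residuals from the additive least squares fit of a complete two-way table."""
    rows, cols = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(table[i][j] for i in range(rows)) for j in range(cols)]
    # Fitted value is row mean + column mean - grand mean.
    return [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(cols)] for i in range(rows)]


def median_polish_residuals(table, max_iter=20, tol=1e-9):
    """Residuals after repeatedly sweeping out row and column medians.

    Assumption: used here only as a resistant stand-in for an LAV fit.
    """
    resid = [list(row) for row in table]
    rows, cols = len(resid), len(resid[0])
    for _ in range(max_iter):
        change = 0.0
        for i in range(rows):                      # sweep row medians
            m = median(resid[i])
            resid[i] = [v - m for v in resid[i]]
            change += abs(m)
        for j in range(cols):                      # sweep column medians
            m = median(resid[i][j] for i in range(rows))
            for i in range(rows):
                resid[i][j] -= m
            change += abs(m)
        if change <= tol:                          # nothing left to sweep
            break
    return resid


def sum_abs(resid):
    """Sum of absolute residuals, the criterion an LAV fit minimizes."""
    return sum(abs(v) for row in resid for v in row)


# Hypothetical data: an exactly additive 3x3 table with 30 added to the first cell.
table = [[30, 10, 20],
         [1, 11, 21],
         [2, 12, 22]]

ls = ls_residuals(table)
mp = median_polish_residuals(table)
print("LS residuals:           ", [[round(v, 2) for v in row] for row in ls])
print("median polish residuals:", [[round(v, 2) for v in row] for row in mp])
print("sum |resid|: LS =", round(sum_abs(ls), 2), " median polish =", round(sum_abs(mp), 2))
```

On this table the LS fit spreads the disturbance in the first cell across every residual, while the median-based fit leaves it concentrated in that one cell. That difference is what residual displays such as stem-and-leaf or box-and-whisker plots are meant to show.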
Cite this article
Sklar, M.G., Armstrong, R.D. Robust estimation procedures and visual display techniques for a two-way classification model. Qual Quant 28, 283–304 (1994). https://doi.org/10.1007/BF01098945
DOI: https://doi.org/10.1007/BF01098945