Cleaning and Assessing Continuous Data using MEANS, UNIVARIATE, and BOXPLOT

  • Charles DiMaggio
Chapter

Abstract

In this chapter, we expand on the brief introduction to continuous data that began with our consideration of PROC MEANS in Chap. 6. From an epidemiological perspective, the descriptive procedures in this chapter may be all that is needed to produce summary statistics such as means and medians, or simple graphical comparisons. They are often useful for data “cleaning”: they provide tools to look for missing data, identify outlying or erroneous values, and get an overall sense of the data. These procedures, which in addition to PROC MEANS include PROC UNIVARIATE and PROC BOXPLOT, are also used to check whether data meet the assumptions of analyses such as ANOVA or linear regression. We will be talking (a lot) about statistical significance, variability, etc., and it’s easy to get caught up in the idea of statistical significance as an end in itself. But remember: clinical or epidemiological importance is what we’re really interested in.
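
As a rough illustration of how the three procedures named above might be combined in a first cleaning pass, consider the minimal sketch below. It is not taken from the chapter: the data set work.patients, the variables id, age, los (length of stay), and dispo (discharge disposition), and all data values are hypothetical and made up for demonstration only.

```sas
/* Hypothetical toy data: every value below is invented for illustration. */
data work.patients;
    input id age los dispo $;
    datalines;
1 34 3 home
2 67 12 snf
3 . 5 home
4 45 2 home
5 81 60 snf
6 52 4 home
;
run;

/* PROC MEANS: counts of non-missing (N) and missing (NMISS) values
   plus basic summary statistics for each continuous variable. */
proc means data=work.patients n nmiss mean median std min max;
    var age los;
run;

/* PROC UNIVARIATE: fuller description of one variable, including
   extreme observations, tests for normality, and diagnostic plots. */
proc univariate data=work.patients normal plot;
    var los;
    histogram los / normal;
run;

/* PROC BOXPLOT expects the data sorted by the grouping variable. */
proc sort data=work.patients;
    by dispo;
run;

/* Side-by-side box plots of length of stay by discharge disposition. */
proc boxplot data=work.patients;
    plot los*dispo;
run;
```

In a sketch like this, the NMISS column from PROC MEANS would flag the missing age, while the extreme observations table and the box plots would draw attention to the unusually long stay. Whether such a value is an error or a genuine observation is a judgment about the data, not something the procedures decide.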

Keywords

Sample Standard Deviation; Normal Probability Plot; Proc UNIVARIATE; Default Statistic; Discharge Disposition


Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Charles DiMaggio (1)
  1. Departments of Anesthesiology and Epidemiology, College of Physicians and Surgeons, Mailman School of Public Health, Columbia University, New York, USA
