Getting to Know Your Data

Wooditch, Alese; Johnson, Nicole J.; Solymosi, Reka; Medina Ariza, Juanjo; Langton, Samuel

doi:10.1007/978-3-030-50625-4_2

The original version of this chapter was revised. The ESM information has been updated. The correction to this chapter is available at https://doi.org/10.1007/978-3-030-50625-4_16

Abstract

Now that you are familiar with creating data in different formats in R, we can turn to one of the most important steps of the data analysis process: transforming your data into what Hadley Wickham (Wickham, Journal of Statistical Software, 59(10), 1–23, 2014) calls tidy data. Fittingly, many useful functions for data tidying live in a set of packages called the tidyverse. Data tidying is an essential step that ensures your data are in the format you need to conduct your analyses. It includes viewing data types; viewing, editing, and adding both variable labels and value labels; formatting classes; and recoding and creating new variables. Learning basic techniques to determine, for example, how the variables in your dataset are stored or whether a variable has too many missing cases is extremely useful when you are planning what you can feasibly analyze and how to do it. In almost any research project involving data analysis, you can expect that your data will require some level of manipulation. We rarely receive data that are perfectly clean and set up for our purpose! Fortunately, R offers a great deal of flexibility in how to accomplish these tasks. In this section, we will walk through examples of common data transformations you may need to perform in your own analysis, while practicing the concept of levels of measurement using data from the National Crime Victimization Survey.

Reference

Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1–23.

Author information

Authors and Affiliations

Department of Criminal Justice, Temple University, Philadelphia, PA, USA
Alese Wooditch & Nicole J. Johnson
School of Social Sciences, University of Manchester, Manchester, UK
Reka Solymosi
Department of Criminal Law and Crime Science, School of Law, University of Seville, Seville, Spain
Juanjo Medina Ariza
Netherlands Institute for the Study of Crime and Law Enforcement, Amsterdam, The Netherlands
Samuel Langton

Electronic Supplementary Material

Data 2.1 (SAV, 1.41 MB)

Key Terms

Project: A self-contained working directory.
Nominal variables: Categorical, unordered variables.
Ordinal variables: Categorical, ordered variables.
Interval/ratio variables: Numeric variables with equal intervals between values; they are functionally the same, yet ratio-level variables have a true zero.
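
The levels of measurement above map onto how R stores variables. The following is a minimal sketch in base R, assuming a small made-up data frame: the object `victims` and its columns `sex`, `severity`, and `age` are hypothetical and are not taken from the National Crime Victimization Survey file. It illustrates one way to store a nominal variable as a factor, an ordinal variable as an ordered factor, and a ratio variable as numeric, and then to check storage classes and count missing cases.

```r
# Hypothetical toy data for illustration only; not the NCVS data.
victims <- data.frame(
  sex      = c("Female", "Male", "Female", NA, "Male"),
  severity = c("Low", "High", "Medium", "Low", NA),
  age      = c(34, 51, 27, 45, 62),
  stringsAsFactors = FALSE
)

# Nominal: categories with no inherent order, stored as a factor.
victims$sex <- factor(victims$sex)

# Ordinal: categories with a meaningful order, stored as an ordered factor.
victims$severity <- factor(victims$severity,
                           levels = c("Low", "Medium", "High"),
                           ordered = TRUE)

# Ratio: age has equal intervals and a true zero, so it stays numeric.

# Check how each variable is stored (first class only).
var_classes <- sapply(victims, function(x) class(x)[1])
print(var_classes)

# Count missing cases per variable.
missing_counts <- colSums(is.na(victims))
print(missing_counts)

# Confirm which factor carries an ordering.
print(is.ordered(victims$sex))
print(is.ordered(victims$severity))
```

Checking classes and missing counts before any analysis helps reveal variables that were read in with an unexpected type or that have too few valid cases to analyze.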

About this chapter

Cite this chapter

Wooditch, A., Johnson, N.J., Solymosi, R., Medina Ariza, J., Langton, S. (2021). Getting to Know Your Data. In: A Beginner’s Guide to Statistics for Criminology and Criminal Justice Using R. Springer, Cham. https://doi.org/10.1007/978-3-030-50625-4_2

DOI: https://doi.org/10.1007/978-3-030-50625-4_2
Published: 04 June 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50624-7
Online ISBN: 978-3-030-50625-4
eBook Packages: Law and Criminology, Law and Criminology (R0)
