Advanced R pp 115-140

Introduction to Data Management Using data.table

  • Matt Wiley
  • Joshua F. Wiley
Chapter

Abstract

We already briefly introduced the data.table package. This package is the heart of this chapter, which covers the basics of accessing, editing, and manipulating data under the broad term data management. Although not glamorous, data management is a critical first step toward data visualization or analysis. Furthermore, data management may take up the majority of the time spent on a given analysis project. For example, running a linear model in R can take one line of code, once the data is clean and in the format that the lm() function expects. Data management can be challenging because raw data comes in all types, shapes, and formats; missing data is common; and you may also have to combine or merge separate data sources. In this chapter, we introduce both mechanical and philosophical techniques for approaching data management. All packages used in this chapter are already in our checkpoint.R file, so you need only source the file to get started.

Keywords

Data Table, Function Call, Data Frame, Character String, Package Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Matt Wiley and Joshua F. Wiley 2016

Authors and Affiliations

  • Matt Wiley
    • 1
  • Joshua F. Wiley
    • 1
  1. Elkhart Group Ltd. & Victoria College, Columbia City, USA
