R Recipes, pp 201-214

Dealing with Big Data

  • Larry A. Pace
Chapter

Abstract

Datasets are big and getting bigger, as mentioned previously in this book. People talk of the five Vs of big data: volume, variety, velocity, veracity (or lack thereof), and value (or lack thereof). The value gleaned from big data may be short-lived, simply because of the velocity with which the data can change. To deal with bigger data, we need faster processing for larger datasets. To accomplish this, we can use parallel processing, speed up individual operations, use more efficient algorithms, or combine these approaches. In this chapter, you will learn how to use R packages to perform parallel processing, how to extend (and speed up) the capabilities of the traditional R data frame by using data tables instead, and how to speed up computations in R by using compiled C++ code as well as by preallocating result objects.

Keywords

Parallel Processing, Data Table, Work Process, Data Frame, Fibonacci Sequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Larry A. Pace 2014

Authors and Affiliations

  • Larry A. Pace (1)
  1. SC, United States