R Recipes, pp. 201-214

Dealing with Big Data

  • Larry A. Pace


As mentioned earlier in this book, datasets are big and getting bigger. People talk of the five Vs of big data: volume, variety, velocity, veracity (or lack thereof), and value (or lack thereof). The value gleaned from big data may be short-lived, simply because of the velocity with which the data can change. To deal with bigger data, we need faster processing for larger datasets. To accomplish this, we can use parallel processing, speed up individual operations, use more efficient algorithms, or combine these approaches. In this chapter, you will learn how to use R packages to perform parallel processing, how to extend (and speed up) the capabilities of the traditional R data frame by using data tables instead, and how to speed up computations in R by calling compiled C++ code and by preallocating result objects.


Keywords: Parallel Processing, Data Table, Work Process, Data Frame, Fibonacci Sequence

Copyright information

© Larry A. Pace 2014

Authors and Affiliations

  • Larry A. Pace (1)

  1. SC, United States
