Abstract
In the last several chapters we have covered the main topics of traditional scientific computing. These topics provide a foundation for most computational work. Starting with this chapter, we move on to explore data processing and analysis, statistics, and statistical modeling. As a first step in this direction, we look at the data analysis library pandas. This library provides convenient data structures for representing series and tables of data, and makes it easy to transform, split, merge, and convert data. These are important steps in the process of preparing data for analysis.
Notes
1. Also known as data munging or data wrangling.
2. CSV, or comma-separated values, is a common text format where rows are stored in lines and columns are separated by a comma (or some other text delimiter). See Chapter 18 for more details about this and other file formats.
3. This dataset was obtained from the Wikipedia page: http://en.wikipedia.org/wiki/Largest_cities_of_the_European_Union_by_population_within_city_limits
4. We can also directly use the month attribute of the DatetimeIndex index object, but for the sake of demonstration we use a more explicit approach here.
5. There are a large number of available time-unit codes. See the sections on “Offset aliases” and “Anchored offsets” in the pandas reference manual for details.
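As a rough illustration of notes 2, 4, and 5, the following minimal sketch reads a small CSV table with pandas, extracts the month from a DatetimeIndex both directly and with an explicit mapping, and resamples using a time-unit code. The CSV text, the column names time and temperature, and the variable names are hypothetical and are not taken from the chapter.

```python
import io

import pandas as pd

# Hypothetical CSV data (note 2): one row per line, columns separated by commas.
csv_text = """time,temperature
2014-01-01 00:00,1.0
2014-01-01 12:00,3.0
2014-01-02 00:00,2.0
2014-02-01 00:00,5.0
2014-02-01 12:00,7.0
"""

# Parse the "time" column as dates and use it as the index (a DatetimeIndex).
df = pd.read_csv(io.StringIO(csv_text), index_col="time", parse_dates=True)

# Note 4: the month is available directly as an attribute of the DatetimeIndex ...
months_direct = df.index.month

# ... or, more explicitly, by mapping a function over each timestamp.
months_explicit = df.index.map(lambda timestamp: timestamp.month)

# Group by month to obtain one mean temperature per month.
df["month"] = months_direct
monthly_mean = df.groupby("month")["temperature"].mean()

# Note 5: "D" is the offset alias (time-unit code) for calendar days.
# Days without observations appear as NaN in the resampled series.
daily_mean = df["temperature"].resample("D").mean()
```

Other offset aliases can be passed to resample in the same way; the pandas reference manual lists the codes that are available in a given version.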
Copyright information
© 2015 Robert Johansson
About this chapter
Cite this chapter
Johansson, R. (2015). Data Processing and Analysis. In: Numerical Python. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-0553-2_12
DOI: https://doi.org/10.1007/978-1-4842-0553-2_12
Published:
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-0554-9
Online ISBN: 978-1-4842-0553-2
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)