Abstract
Effective data visualization is an important aspect of data science, for at least three distinct reasons:
- Exploratory data analysis: What does your data really look like? Getting a handle on what you are dealing with is the first step of any serious analysis. Plots and visualizations are the best way I know of to do this.
- Error detection: Did you do something stupid in your analysis? Feeding unvisualized data to any machine learning algorithm is asking for trouble. Problems with outlier points, insufficient cleaning, and erroneous assumptions reveal themselves immediately when you properly visualize your data. Too often a summary statistic (77.8% accurate!) hides what your model is really doing. Taking a good hard look at what you are getting right vs. wrong is the first step to performing better.
- Communication: Can you present what you have learned effectively to others? Meaningful results become actionable only after they are shared. Your success as a data scientist rests on convincing other people that you know what you are talking about. A picture is worth 1,000 words, especially when you are giving a presentation to a skeptical audience.
At their best, graphics are instruments for reasoning.
– Edward Tufte
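The error-detection point above, that a single summary statistic can hide what a model is really doing, can be illustrated with a small sketch. Everything in it is an assumption made up for illustration: the labels are synthetic, and the "model" is a hypothetical classifier that always predicts the majority class. It is not taken from the chapter.

```python
from collections import Counter

# Hypothetical, made-up labels: 778 negatives and 222 positives.
y_true = [0] * 778 + [1] * 222

# A hypothetical "model" that always predicts the majority class (0).
y_pred = [0] * len(y_true)

# The headline summary statistic: fraction of predictions that match.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Count (true label, predicted label) pairs: a minimal confusion matrix.
confusion = Counter(zip(y_true, y_pred))

print(f"accuracy: {accuracy:.1%}")
for true_label in (0, 1):
    row = [confusion[(true_label, pred_label)] for pred_label in (0, 1)]
    print(f"true={true_label}: predicted 0 -> {row[0]}, predicted 1 -> {row[1]}")
```

The accuracy looks respectable, yet the per-class breakdown shows that this model never identifies a single positive example. Looking at right vs. wrong by class, or better still plotting it, exposes a failure that the one-number summary conceals.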
Copyright information
© 2017 The Author(s)
About this chapter
Cite this chapter
Skiena, S.S. (2017). Visualizing Data. In: The Data Science Design Manual. Texts in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-55444-0_6
DOI: https://doi.org/10.1007/978-3-319-55444-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55443-3
Online ISBN: 978-3-319-55444-0
eBook Packages: Computer Science, Computer Science (R0)