The Beginner's Guide to Data Science

  • Book
  • © 2022

Overview

  • A practitioner’s guide to data science that can also be used effectively as a textbook for students. For example, the application of statistics and statistical methods is linked to an immediate real-world data science problem, not presented as an isolated theoretical field of investigation
  • Presumes some familiarity with the basics of programming and purposefully guides the reader through a series of topics essential to data science, providing both the managerial and technical foundations of the discipline
  • Chapters can be read independently of the others, a benefit for both the data science novice and professionals focused on learning a specific topic
  • Most figures are created by code examples that may be downloaded and executed from https://github.com/robertball/Beginners-Guide-Data-Science
  • For figures in this book of special interest to the reader, the code that generated the figure may be customized to align with the reader’s current needs

About this book

This book discusses the principles and practical applications of data science, addressing key topics including data wrangling, statistics, machine learning, data visualization, natural language processing and time series analysis. Detailed investigations of techniques used in the implementation of recommendation engines and the proper selection of metrics for distance-based analysis are also covered.

Utilizing numerous comprehensive code examples, figures, and tables to help clarify and illuminate essential data science topics, the authors provide an extensive treatment and analysis of real-world questions, focusing especially on the task of determining and assessing answers to these questions as expeditiously and precisely as possible. This book addresses the challenges related to uncovering the actionable insights in “big data,” leveraging database and data collection tools such as web scraping and text identification.

This book is organized into 11 chapters, structured as independent treatments of the following crucial data science topics:

  • Data gathering and acquisition techniques including data creation
  • Managing, transforming, and organizing data to ultimately package the information into an accessible format ready for analysis
  • Fundamentals of descriptive statistics intended to summarize and aggregate data into a few concise but meaningful measurements
  • Inferential statistics that allow us to infer (or generalize) trends about the larger population based only on the sample portion collected and recorded
  • Metrics that measure some quantity such as distance, similarity, or error and which are especially useful when comparing two or more data observations
  • Recommendation engines representing a set of algorithms designed to predict (or recommend) a particular product, service, or other item of interest a user or customer wishes to buy or utilize in some manner
  • Machine learning implementations and associated algorithms, comprising core data science technologies with many practical applications, especially predictive analytics
  • Natural Language Processing, which expedites the parsing and comprehension of written and spoken language in an effective and accurate manner
  • Time series analysis, techniques to examine and generate forecasts about the progress and evolution of data over time

Data science provides the methodology and tools to accurately interpret an increasing volume of incoming information in order to discern patterns, evaluate trends, and make the right decisions. The results of data science analysis provide real-world answers to real-world questions. Professionals working on data science and business intelligence projects, as well as advanced-level students and researchers in data science, computer science, business, and mathematics programs, will benefit from this book.

Authors and Affiliations

  • Weber State University, Ogden, USA

    Robert Ball, Brian Rague

About the authors

Robert Ball has devoted a significant part of his adult years to thinking about data. From visualizing data on 100-monitor displays and exploring migration patterns to understanding the provenance and evolution of data through time, he has explored data in many uses and forms. Dr. Ball both teaches and works with the private sector, public sector, and government in various projects and capacities. However, no matter the origin of the data, the ultimate question that almost universally needs to be answered is: what insight can be discovered in the data?

Brian Rague joined the School of Computing faculty at Weber State University in 2003 after working on various data science and engineering research projects throughout his early career at MIT, Caltech, and NASA’s Jet Propulsion Laboratory. He has consulted with industry partners on how to effectively leverage the ongoing deluge of available data for both operations and research purposes. His areas of interest emphasize the platforms and technologies that wrangle and process data, such as machine learning, parallel computing, and distributed systems.