Data Profiling and Scrubbing

  • Francis Rodrigues
  • Michael Coles
  • David Dye

Abstract

Projects that bring together data from multiple sources, such as data warehouse, data mart, or operational data store (ODS) projects, are extremely common. You could spend months gathering business requirements, putting together technical specifications, designing target databases, and coding and testing your ETL process. You could then spend an eternity in “ad hoc maintenance mode,” rewriting large sections of code that don’t handle unanticipated bad or nonconforming data. This scenario results from a failure to properly plan and execute data integration projects, a phenomenon known as “code, load, and explode.”

Keywords

Postal Code; Data Viewer; Fuzzy Grouping; String Data; Connection Manager

Copyright information

© Francis Rodrigues, Michael Coles, and David Dye 2012
