
Augmenting ETL Processes

  • Bryan Cafferky

Abstract

This chapter covers how to use PowerShell to augment ETL development. We often need to load external files into a SQL Server database. However, these files usually need some preparation before they can be loaded. Perhaps they arrive via FTP from an external source. Such files may be compressed. Perhaps they need to be scrubbed of bad values or modified to make them easier to load. After the files are loaded, the business may want them archived and retained for a period of time. Before PowerShell, legacy-style batch files were often employed to do these tasks. However, batch files are cryptic, difficult to maintain, and lack support for reusability. In this chapter, we will see how PowerShell scripts can be used to accomplish these tasks. Rather than define a specific business scenario, we treat this as a common ETL pattern from which we can choose the tasks that apply. In this pattern, files arrive in a folder and are copied to a local server, then decompressed, loaded, and archived. Typically, email notifications are sent out when the job starts and ends. Sometimes there are additional requirements. We will discuss functions that help with these tasks. Let’s consider the potential ETL steps already mentioned as a template from which we can pick what we need.

Keywords

Output File, Column Heading, File Function, Flat File, Switch Parameter
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Bryan Cafferky 2015

Authors and Affiliations

  • Bryan Cafferky (1)
  1. MA, US
