PolyBase in SQL Server 2019: Data Virtualization with SQL Server, Cosmos DB, PostgreSQL, and Other Database Engines

  • Kevin Feasel


This video dives into PolyBase, Microsoft’s technology for data virtualization, and its ability to virtualize data from a variety of data sources into SQL Server 2019. You will learn how to configure SQL Server and PolyBase to connect to another SQL Server instance, and how PolyBase differs from linked server connections.

Next, you will learn how to connect to additional data sources from SQL Server 2019, such as Cosmos DB and PostgreSQL. You will see the configuration steps needed, as well as the options available for retrieving data. From there, you will learn how to use statistics to assist the optimizer, obtain key query tuning tips, and implement a reference for loading a data warehouse using PolyBase.

What You Will Learn

  • Connect to remote instances of SQL Server, Cosmos DB, and PostgreSQL

  • Contrast linked servers with PolyBase and learn when to choose each method

  • Use statistics to assist the SQL Server optimizer in executing PolyBase queries

  • Tune queries against external tables

  • Load data from multiple external data sources into a data warehouse

Who This Video Is For

Data platform specialists who need to work on multiple data platforms but want to use one unifying language to tie them together. Viewers should be familiar with SQL Server and the T-SQL language, but do not need prior experience with any other technologies. Viewers should have seen the first video in the series, PolyBase Concepts and Configuration.

This video covers PolyBase in SQL Server 2019 for data virtualization across SQL Server, Cosmos DB, PostgreSQL, and for data warehouse ETL.

About The Author

Kevin Feasel

Kevin Feasel is a Microsoft Data Platform MVP and CTO at Envizage, where he specializes in data analytics with T-SQL and R, forcing Spark clusters to do his bidding, fighting with Kafka, and pulling rabbits out of hats on demand. He is the lead contributor to the Curated SQL blog and author of PolyBase Revealed. A resident of Durham, North Carolina, he can be found cycling the trails along the Triangle whenever the weather’s nice enough.

 

Supporting material

View source code at GitHub.

About this video

Author(s)
Kevin Feasel
DOI
https://doi.org/10.1007/978-1-4842-7047-9
Online ISBN
978-1-4842-7047-9
Total duration
1 hr 15 min
Publisher
Apress
Copyright information
© Kevin Feasel 2021

Video Transcript

[MUSIC]

Hi, I’m Kevin Feasel, and this is PolyBase in SQL Server 2019: Data Virtualization with SQL Server, Cosmos DB, PostgreSQL, and More. In the first video, PolyBase Concepts and Configuration, we looked at how to install and configure PolyBase. Additionally, we covered some of the core ideas behind PolyBase, including the trio of external objects.

In the second video, Microsoft PolyBase, Hadoop and Azure Blob Storage, we learned about PolyBase V1 and its built-in integrations. Specifically, we learned where PolyBase does and does not work when connecting to Hadoop, understood the configuration settings needed to get everything working, and learned what to do when things break down.

In this third video, we will see how PolyBase V2, introduced in SQL Server 2019, opens the doors to connectivity with other SQL Server instances, Cosmos DB, PostgreSQL, and a whole lot more. Join me as we join together all the data sources with PolyBase in SQL Server 2019: Data Virtualization with SQL Server, Cosmos DB, PostgreSQL, and More.