Introduction to SQL Server Query Optimization

Identifying and Responding to Unwanted Table Scans

Your browser needs to be JavaScript capable to view this video

Try reloading this page, or reviewing your browser settings

This video delves into table scans, how they can negatively impact query performance, and how to resolve performance problems related to them.

Keywords

  • Clustered index scan
  • table scan
  • index scan
  • query optimization

About this video

Author(s)
Edward Pollack
First online
28 June 2019
DOI
https://doi.org/10.1007/978-1-4842-5144-7_7
Online ISBN
978-1-4842-5144-7
Publisher
Apress
Copyright information
© Ed Pollack 2019

Video Transcript

When a query is slow, the cause is typically that we are reading or processing too much data. This may be the direct result of querying for a massive amount of data or may be the result of a poorly written query, missing index, or another underlying problem.

A table scan refers to the action of reading the entire contents of a table in order to return the data we are looking for. We intend on reading the whole table, either because it is small or we need all the data in it. And this is not a bad thing if our goal is to read a small portion of the table.

Then churning through the rest of the data on that table results in a significant performance hit. For smaller table, its latency may not be noticeable. But for a table with millions or billions of rows, being forced to read through all of its data could take more time and resources than we have available.

To be specific, the table scan is the action of reading entire heap or a table with no clustered index. A clustered index scan is reading an entire table that has a clustered index. We can demonstrate this easily by selecting everything from a table.

Resulting execution plan is as simple as it gets. It indicates a clustered index scan on production dot product. In addition, we can see that 504 rows out of 504 on the table were read, confirming that we did, indeed, access every row in the table to complete our query.

Table scans are often caused by queries that do not have a supporting index and a target table. For example, if we select a few columns from production dot product filtering on cell start date, you can see that our results set is returned that is a fraction of the total rows on the table.

Despite that, we scanned the entire table anyway. The reason for this is that there is no index on a cell start date column. Without an index to search against, we are forced to read the entire table. We can also see that we’ve had 15 pages in the process of scanning the product table.

Assign index in this column and run our query again. The same results are returned. But we can see that our execution plan is changed. The cross-street index scan has been replaced by index C against the index we just created. You see that only 136 rows are read in the index.

In addition, our reads have dropped from 15 to four pages. If this were an important query, it was executed often. Or if this were a much larger table, than the index we added would be essential to our query’s performance. Not all table scans are bad, though. If our intention is to read most or all of the data on a table, then scanning the entire table is like what we want to do.

For example, if we wanted to adjust our query to return data starting in 2010, we would still use the index. We change the column list from two columns to all columns, though. Now, we’re turning such a large volume of data within the table that is more efficient to return everything rather than using index to return a majority of what is already contained in the table.

This results in a clustered index scan against the table. When a more complex query is slow, we can look for table scans in conjunction with higher reads against a given table. If we find both, then we’ve identified a target for optimization.

Table scans are indicative of a potential optimization challenge where we are reading more data than intended. If unindexed columns are being queried against, then there may be value in adding or modifying index to speed up an important query.