Understanding SQL Server Statistics

by Sep 11, 2011

Donabel Santos (twitter (@sqlbelle) | blog) – April 25, 2011

“Statistics provides tools that you need in order to react intelligently to information you hear or read” – David Lane, 2003

 

Identify out-of-date SQL Server statistics with Idera’s free tool – SQL Update Statistics!

If there’s an upcoming election and you are running for office and getting ready to go from town to town city to city with your flyers, you will want to know approximately how many flyers you’re going to bring.

If you’re the coach of a sports team, you will want to know your players’ stats before you decide who to play when, and against who. You will often play a matchup game, even if you have 20 players, you might be allowed to play just 5 at a time, and you will want to know which of your players will best match up to the other team’s roster. And you don’t want to interview them one by one at game time (table scan), you want to know, based on their statistics, who your best bets are.

Just like the election candidate or the sports coach, SQL Server tries to use statistics to “react intelligently” in its query optimization. Knowing number of records, density of pages, histogram, or available indexes help the SQL Server optimizer “guess” more accurately how it can best retrieve data. A common misnomer is that if you have indexes, SQL Server will use those indexes to retrieve records in your query. Not necessarily. If you create, let’s say, an index to a column City and <90% of the values are ‘Vancouver’, SQL Server will most likely opt for a table scan instead of using the index if it knows these stats.

For the most part, there *may* be minimal we need to do to keep our statistics up-to-date (depending on your configurations), but understanding statistics a little bit better is in order to help us understand SQL Server optimization a little bit more.

How are statistics created?

Statistics can be created different ways
– Statistics are automatically created for each index key you create.

– If the database setting autocreate stats is on, then SQL Server will automatically create statistics for non-indexed columns that are used in queries.

– CREATE STATISTICS

What do statistics look like?

If you’re curious, there’s a couple ways you can peek at what statistics look like.

Option 1 – you can go to your Statistics node in your SSMS, right click > Properties, then go to Details. Below is a sample of the stats and histogram that’s collected for one of the tables in my database

Option 2 – you can use DBCC SHOW_STATISTICS WITH HISTOGRAM

The histograms are a great way to visualize the data distribution in your table.

How are statistics updated?

The default settings in SQL Server are to autocreate and autoupdate statistics.

Notice that there are two (2) options with the Auto Update statistics.
Auto Update Statistics basically means, if there is an incoming query but statistics are stale, SQL Server will update statistics first before it generates an execution plan.
Auto Update Statistics Asynchronously on the other hand means, if there is an incoming query but statistics are stale, SQL Server uses the stale statistics to generate the execution plan, then updates the statistics afterwards.

However, Idera offers a cool free tool which simplifies the process of finding and updating out-of-date SQL Server table statistics in an easy to understand and use UI. Check it out here: https://www.idera.com/productssolutions/freetools/sql-server-statistics

How do we know statistics are being used?

One good check you can do is when you generate execution plans for your queries:

check out your “Actual Number of Rows” and “Estimated Number of Rows”.

If these numbers are (consistently) fairly close, then most likely your statistics are up-to-date and used by the optimizer for the query. If not, time for you to re-check your statistics create/update frequency.

What configuration settings should we set?

There may be cases when you may want to disable statistics update temporarily while you’re doing massive updates on a table, and you don’t want it to be slowed down by the autoupdate.

However, for the most part, you will want to keep the SQL Server settings:
– auto create statistics
– auto update statistics

References:

Rob Carrol. http://blogs.technet.com/b/rob/archive/2008/05/16/sql-server-statistics.aspx

Elisabeth Redei has an excellent 3-part series on SQL Server Statistics:
http://sqlblog.com/blogs/elisabeth_redei/archive/2009/03/01/lies-damned-lies-and-statistics-part-i.aspx
http://sqlblog.com/blogs/elisabeth_redei/archive/2009/08/10/lies-damned-lies-and-statistics-part-ii.aspx
http://sqlblog.com/blogs/elisabeth_redei/archive/2009/12/17/lies-damned-lies-and-statistics-part-iii-sql-server-2008.aspx

Excellent Books that touch on statistics
– Apress. Grant Fritchey & Sajal Dam. SQL Server 2008 Query Performance Tuning Distilled.
RedGate. Holger Schmeling. SQL Server Statistics.


MORE RESOURCES

White Paper – Waiting on Wait Stats

Webcast – What Are You Waiting For?

Idera Free Performance Monitoring Tool – SQL check

Idera Performance Tuning Product Trial – SQL doctor