SQL Server Code,Tips and Tricks, Performance Tuning: files

Showing posts with label files. Show all posts

Wednesday, October 04, 2017

Sizing database files

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts again to see if anything has changed, I also wanted to see if I could add some additional info.

This post will demonstrate that there is a difference in performance if you don't size your database file accordingly. It is a good practice to have your database sized correctly for the next 6 to 12 months, you don't want your server wasting cycles with growing files all the time.

Figure out how big your files are now, figure out how much they will grow in the next year and size your files accordingly, check back every month or so to see if your estimates were correct.

By default SQL Server will create databases with very small files when you create a database and you don't specify the sizes. If you have people creating databases on your servers, consider adding a DDL trigger to notify you when a new DB is added so that you can talk to the database creator and size the files. You also can change the defaults on the server so that you don't have the 10% growth either.

First let's see what the difference is when we have a database where the files will have to grow versus one where the files are big enough for the data that will be inserted.

Here we are creating two databases, one with much bigger files than the other one

The TestBigger database is correctly sized for the data that will be inserted

CREATE DATABASE [TestBigger]
 ON  PRIMARY 
( NAME = N'TestBigger', FILENAME = N'f:\TempTestBigger.mdf' , 
SIZE = 509600KB , FILEGROWTH = 1024KB )
 LOG ON 
( NAME = N'TestBigger_log', FILENAME = N'f:\TempTestBigger_log.ldf' , 
SIZE = 502400KB , FILEGROWTH = 10%)
GO

The TestSmaller database is very small, files will have to be expanded many times to accommodate all the data I will be inserting

CREATE DATABASE [TestSmaller]
 ON  PRIMARY 
( NAME = N'TestSmaller', FILENAME = N'f:\TempTestSmaller.mdf' , 
SIZE = 5280KB , FILEGROWTH = 1024KB )
 LOG ON 
( NAME = N'TestSmaller_log', FILENAME = N'f:\TempTestSmaller_log.ldf' , 
SIZE = 504KB , FILEGROWTH = 10%)
GO

These two stored proc calls are just to verify that the files match with what we specified, you can use sp_helpdb to check the size of a database that you created when you don't specify the file sizes

EXEC sp_helpdb 'TestBigger'

name         filename             filegroup SIZE
TestBigger f:\TempTestBigger.mdf      PRIMARY 509632 KB
TestBigger_log f:\TempTestBigger_log.ldf NULL 502400 KB

EXEC sp_helpdb 'TestSmaller'

name         filename              filegroup SIZE
TestSmaller f:\TempTestSmaller.mdf      PRIMARY 5280 KB
TestSmaller_log f:\TempTestSmaller_log.ldf NULL  512 KB

Next, we are creating two identical tables, one in each database

USE TestSmaller
GO
CREATE TABLE test (SomeName VARCHAR(100), 
SomeID VARCHAR(36), SomeOtherID VARCHAR(100), SomeDate DATETIME)

USE TestBigger
GO
CREATE TABLE test (SomeName VARCHAR(100),
 SomeID VARCHAR(36), SomeOtherID VARCHAR(100), SomeDate DATETIME)

This query is just used so that the data is cached for the two inserts later on, this way the data doesn't have to be fetched from disk for either inserts, you can discard the results after the query is done

USE master
GO


SELECT TOP 1000000 c1.name,NEWID(),NEWID(),GETDATE() 
FROM sys.sysobjects c1
CROSS JOIN sys.sysobjects c2
CROSS JOIN sys.sysobjects c3
CROSS JOIN sys.sysobjects c4

Here is the first insert into the bigger database

INSERT TestBigger.dbo.test
SELECT TOP 1000000 c1.name,NEWID(),NEWID(),GETDATE() 
FROM sys.sysobjects c1
CROSS JOIN sys.sysobjects c2
CROSS JOIN sys.sysobjects c3
CROSS JOIN sys.sysobjects c4

Here is the second insert into the smaller database

INSERT TestSmaller.dbo.test
SELECT TOP 1000000 c1.name,NEWID(),NEWID(),GETDATE() 
FROM sys.sysobjects c1
CROSS JOIN sys.sysobjects c2
CROSS JOIN sys.sysobjects c3
CROSS JOIN sys.sysobjects c4

On several machines I tested on, it takes half the time or less to insert the data in the bigger database compared to the smaller database. On some machines it is almost 5 times faster to insert into the bigger database.

How about on your machine, do you see that the insert into the bigger database takes less than half the time it takes to insert into the smaller database?

Check the sizes of the databases again

EXEC sp_helpdb 'TestBigger'

name         filename            filegroup size
TestBigger f:\TempTestBigger.mdf      PRIMARY 509632 KB
TestBigger_log f:\TempTestBigger_log.ldf NULL 502400 KB

EXEC sp_helpdb 'TestSmaller'

name         filename              filegroup size
TestSmaller f:\TempTestSmaller.mdf      PRIMARY 215296 KB
TestSmaller_log f:\TempTestSmaller_log.ldf NULL 427392 KB

As you can see, the bigger database did not expand, the smaller database expanded a lot.

Autogrow
If you do use autogrow, then make sure you don't use the default 10%, take a look at this message

Date 10/03/2017 12:57:56 PM
Log SQL Server (Current - 11/25/2012 5:00:00 AM)
Source spid62
Message
Autogrow of file 'MyDB_Log' in database 'MyDB' took 104381 milliseconds. Consider using ALTER DATABASE to set a smaller FILEGROWTH for this file.

See that, it took a long time, you don't want to grow a one terabyte file by ten percent, that would be one hundred gigabytes, that is huge. Use something smaller and don't use percent, the bigger the file gets the longer it will take to expand the file.

File placement
Separate the log files from the data files by placing them on separate hard drives. Placing the files on separate drives allows I/O activity to occur at the same time for both the data and log files. Instead of having huge files consider having smaller files in separate filegroups. Put different tables used in the same join queries in different filegroups as well. This will improve performance, because of parallel disk I/O searching for joined data.

Put heavily accessed tables and the nonclustered indexes that belong to those tables on different filegroups. This will improve performance, because of parallel I/O if the files are located on different physical disks. Just remember that you can't separate the clustered indexes from the base table, you can only do this for non clustered indexes. Of course people can get very creative, I have worked with a database once where each table was placed in its own filegroups, there were hundreds of files....what a mess

Tempdb
There are all kinds of recommendations about how many data files you should have for tempdb. Start with 4 files and add more files if you see contention. Paul Randal, has a detailed post here: A SQL Server DBA myth a day: (12/30) tempdb should always have one data file per processor core.

If you can, place tempdb on its own physical drive as well, separated from the user databases.

Consider Solid State hard drives or flash storage for tempdb

See also http://support.microsoft.com/kb/2154845 for recommendations by Microsoft Customer Service and Support

Test, test, test
Never ever blindly follow what you read on the internet, make sure that you test it out first on a QA server before promoting the changes to production!!

Saturday, October 31, 2015

TempDB Database changes in SQL Server 2016

According to Books On Line, there are several changes made in SQL Server 2016 in regards to TempDB. Here is what Books On Line has on the topic

There are several enhancements to TempDB:

Trace Flags 1117 and 1118 are not required for tempdb anymore. If there are multiple tempdb database files all files will grow at the same time depending on growth settings. In addition, all allocations in tempdb will use uniform extents.
By default, setup adds as many tempdb files as the CPU count or 8, whichever is lower.
During setup, you can configure the number of tempdb database files, initial size, autogrowth and directory placement using the new UI input control on the Database Engine Configuration - TempDB section of SQL Server Installation Wizard.
The default initial size is 8MB and the default autogrowth is 64MB.
You can specify multiple volumes for tempdb database files. If multiple directories are specified tempdb data files will be spread across the directories in a round-robin fashion.

I decided to check this out, I downloaded and installed SQL Server 2016 Community Technology Preview 3.0 During the install I did not make any changes to TempDB

Once the install was done, I executed the following to see how many files TempDB had on my laptop

exec sp_helpdb 'tempdb'

Here is what I saw in the output

name fileid filename filegroup size maxsize growth usage
tempdev 1 C:\DATA\tempdb.mdf PRIMARY 8192 KB Unlimited 65536 KB data only
templog 2 C:\DATA\templog.ldf NULL 8192 KB Unlimited 65536 KB log only
temp2 3 C:\DATA\tempdb_mssql_2.ndf PRIMARY 8192 KB Unlimited 65536 KB data only
temp3 4 C:\DATA\tempdb_mssql_3.ndf PRIMARY 8192 KB Unlimited 65536 KB data only
temp4 5 C:\DATA\tempdb_mssql_4.ndf PRIMARY 8192 KB Unlimited 65536 KB data only

As you can see, there are 4 data files, when I looked at my SQL Server 2014 instance which was installed with defaults, there is only 1 data file. Another difference is that SQL Server 2014 uses 10% growth as default while SQL Server 2016 uses 65536 KB

This is a good change for all the casual installs by people who just want to experiment and learn, however most shops I would think already would have changed TempDB to have more than 1 file. But then again maybe Microsoft noticed that a lot of support calls that they are getting have only 1 data file for TempDB and thus they decided to implement this change.