Showing posts with label Best Practices.

Friday, January 04, 2019

Proactive notifications

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts to see if anything has changed and to see if I could add some additional info.


Today we are going to look at proactive notifications.

In the SQL Server Maintenance post from yesterday I touched upon proactive notifications a little; today I want to dive a little deeper into this subject.
The last thing you want to hear as a DBA is any of the following things from the end users or developers:
  • The transaction log is full
  • The database is very slow
  • The latest backup we have is 9 days old
  • The table that was created has 2 extra columns this morning
  • Everything is locked up, you can't get any results back from a query
  • Deadlocks are occurring
What you really want to have at your shop is a tool like Quest Foglight, SolarWinds, Red Gate SQL Monitor or similar. The benefit of these tools is that there is a central location where you can look at all the alerts at a glance. You get a lot of stuff out of the box, and all you have to do is tell the tool which server to start monitoring. I would suggest starting with the trial version to see if it is something that would be beneficial for your organization.
Of course you can roll your own solution as well, but this will involve work, and unless your time is worthless or you are bored out of your mind after work I wouldn't do it.

Utilize the logs

You need to scan the errorlog periodically to see if there are errors. You can automate this; there is no need to start opening log files every 5 minutes. Create a SQL Agent job that runs every 5 minutes and checks if there are any errors since it last ran. You can use the xp_readerrorlog proc to read the error log from within SQL Server with T-SQL.
Here is a small example of what you can do if you have this in a SQL Agent job that runs every 5 minutes or so. You can of course email yourself the results, or dump them into a table that is perhaps shown on a dashboard in the office; there are many possibilities.


--This will hold the rows
CREATE TABLE #ErrorLog (LogDate DATETIME, ProcessInfo VARCHAR(10),
ErrorMessage VARCHAR(MAX))

-- Dump the errorlog into the table
INSERT INTO #ErrorLog
EXEC master.dbo.xp_readerrorlog

-- Delete everything older than 5 minutes
-- ideally you will store the max date from when it last ran
DELETE FROM #ErrorLog
WHERE LogDate < DATEADD(mi, -5, GETDATE())

-- Some stuff you want to check for
-- Failed backups...you want to know this
SELECT * FROM #ErrorLog
WHERE ErrorMessage LIKE 'BACKUP failed%'

-- Why does it take so looong to grow a file, maybe rethink your settings
SELECT * FROM #ErrorLog
WHERE ErrorMessage LIKE 'Autogrow of file%'

-- What is going on, are any backups or statistics updates running at this time?
SELECT * FROM #ErrorLog
WHERE ErrorMessage LIKE 'SQL Server has encountered %occurrence(s) of I/O requests taking longer than%'

-- My mirror might not be up to date
SELECT * FROM #ErrorLog
WHERE ErrorMessage LIKE 'The alert for ''unsent log'' has been raised%'


DROP TABLE #ErrorLog

Those are just small samples; you might want to look for other kinds of messages from the errorlog.
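Since the whole point is to be told about problems instead of stumbling over them, here is a minimal sketch of the email piece using Database Mail; it assumes Database Mail is already configured, and the profile name and recipient address are made-up placeholders. Run it before the DROP TABLE at the end of the job step.

-- Count the failed backup messages found above and send a notification
DECLARE @cnt INT, @body NVARCHAR(MAX)

SELECT @cnt = COUNT(*)
FROM #ErrorLog
WHERE ErrorMessage LIKE 'BACKUP failed%'

IF @cnt > 0
BEGIN
    SET @body = N'Found ' + CONVERT(NVARCHAR(10), @cnt)
              + N' failed backup message(s) in the errorlog on ' + @@SERVERNAME

    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = N'DBA Mail',         -- placeholder profile name
        @recipients = N'dba@yourshop.com',   -- placeholder address
        @subject = N'Errorlog alert',
        @body = @body
END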

The transaction log is full

You want to make sure that you know you are running out of space before you actually run out of space. I covered this in the SQL Server Maintenance post; take a look at the sections Make sure that you have enough space left on the drives and Make sure that you have enough space left for the filegroups. In those two sections I described what to look for and also supplied code that you can plug into your own solution.
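For the transaction log specifically, you can also ask the server directly how full the log is. Here is a minimal sketch, assuming SQL Server 2012 or later for the sys.dm_db_log_space_usage DMV; put it in the same kind of SQL Agent job and alert when the percentage crosses a threshold you are comfortable with.

-- Percent of the current database's transaction log that is in use (SQL Server 2012+)
SELECT DB_NAME(database_id) AS DatabaseName,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage

-- On older versions, DBCC SQLPERF(LOGSPACE) returns the log size
-- and the percent used for every database on the instance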

The database is very slow

You hear this complaint every now and then; there are several things that could be happening, here is a list:
  • Someone decided to take a backup of that 1 TB database in the middle of the day
  • The update statistics job is still running
  • Statistics are stale and haven't been updated in a long time
  • The virus scan is running amok and nobody told it to ignore the database files
  • Someone decided to query all the data all at once
If you have a tool like Quest Foglight, Confio Ignite, Red Gate SQL Monitor or similar, then you can see what query ran at what time, what it did and how long it ran.
You can of course also use sp_who2 with its BlkBy column, together with DBCC INPUTBUFFER, to see what is going on.
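A minimal sketch of that workflow; the session id passed to DBCC INPUTBUFFER is just a made-up example.

-- Look at current activity; the BlkBy column shows which session is blocking which
EXEC sp_who2

-- Once you have a suspect session id, see the last batch it sent
-- (52 is a placeholder, use the SPID you found above)
DBCC INPUTBUFFER(52)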
If you like to use Dynamic Management Views, then take a look at Glenn Berry's SQL Server 2005 Diagnostic Information Queries posts; there is a .sql file in each post with all kinds of queries to discover all kinds of stuff about your server.
It could also be that your hardware is having issues; make sure the I/O numbers look good, and check the event log for any clues.

The latest backup we have is 9 days old

The following query will give you, for each database, the last time it was backed up, or display NEVER if it has never been backed up.

SELECT s.name AS DatabaseName,
       'Database backup was taken on ' +
       CASE WHEN MAX(b.backup_finish_date) IS NULL THEN 'NEVER!!!'
            ELSE CONVERT(VARCHAR(12), MAX(b.backup_finish_date), 101) END AS LastBackUpTime
FROM sys.sysdatabases s
LEFT OUTER JOIN msdb.dbo.backupset b ON b.database_name = s.name
GROUP BY s.name

Here is what the output will look like

DatabaseName  LastBackUpTime
------------  ---------------------------------------
model         Database backup was taken on NEVER!!!
msdb          Database backup was taken on 12/10/2012
ReportServer  Database backup was taken on NEVER!!!


As you can see, that is not great; all the databases should be backed up on a regular basis. Scroll up to the Utilize the logs section to see how you can check the errorlog for failed backup messages.

Everything is locked up, you can't get any results back from a query

Usually this indicates that there is an open transaction somewhere that has not finished, or someone did the BEGIN TRAN part but never did a COMMIT or ROLLBACK.
Some people just restart the server to 'fix' the issue; of course if you do that you will never know what the root cause was, and you never know when it will happen again.
We can easily show what happens when you have an open transaction. By the way, don't do this on a production server.

In one query window, run this; replace SomeTable with a real table name.

BEGIN TRAN

SELECT TOP 1 * FROM SomeTable WITH(UPDLOCK, HOLDLOCK)

You will get a message that the query completed successfully.
In another window, run this:

SELECT TOP 1 * FROM SomeTable WITH(UPDLOCK, HOLDLOCK)

That query won't return anything unless the first one is committed or rolled back.

Now run the query below; the first column should have the text AWAITING COMMAND.

SELECT   spr.cmd
        ,spr.last_batch
        ,lok.resource_type
        ,lok.resource_subtype
        ,DB_NAME(lok.resource_database_id) AS DatabaseName
        ,lok.resource_description
        ,lok.resource_associated_entity_id
        ,lok.resource_lock_partition
        ,lok.request_mode
        ,lok.request_type
        ,lok.request_status
        ,lok.request_owner_type
        ,lok.request_owner_id
        ,lok.lock_owner_address
        ,wat.waiting_task_address
        ,wat.session_id
        ,wat.exec_context_id
        ,wat.wait_duration_ms
        ,wat.wait_type
        ,wat.resource_address
        ,wat.blocking_task_address
        ,wat.blocking_session_id
        ,wat.blocking_exec_context_id
        ,wat.resource_description
FROM    sys.dm_tran_locks lok
JOIN    sys.dm_os_waiting_tasks wat
ON      lok.lock_owner_address = wat.resource_address
JOIN    sys.sysprocesses spr   -- aliased spr so it doesn't shadow the sys schema
ON      wat.blocking_session_id = spr.spid


As you can see you have a blocking_session_id and a session_id; this tells you which session is being blocked by which. You can now verify that the open transaction's session_id is the one blocking the other session.
Go back to that first command window and execute a rollback:

ROLLBACK

The query with that second select should now be done as well, and if you rerun the query that checks for the waits it should come back clean.
Of course you could have done the same exercise by running sp_who2, looking at the BlkBy column, and then finding out what that session is doing by running DBCC INPUTBUFFER(session_id) with that session id.

Deadlocks are occurring

There is already a post written on LessThanDot explaining how you can get emailed when deadlocks occur. Ted Krueger wrote that post and it can be found here: Proactive Deadlock Notifications

Summary

I only touched the surface of what can be done in this post. I want you to find out if there is any monitoring being done in your shop, and who gets notified. I have worked in places where the end user was the proactive notification; as long as we fixed things before the business users started to complain, life was good. Manual notifications and homebrew solutions might work for a while, but when you add more and more servers and more people to the team this becomes laborious and error prone.

Thursday, January 03, 2019

Cursors and loops

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts to see if anything has changed and to see if I could add some additional info.


Today we are going to look at cursors.

Why do we hate those poor cursors?

Let's first see why people tend to use cursors. Say you come from a procedural language and this is the first time you are using SQL. In the procedural language you know how to traverse a list, so of course you will look for something similar in SQL... bingo!!! You found it... the almighty cursor... the crusher of almost all SQL Server performance. You start using it, your code works, you are happy, life is good.

Now a team member tells you that the cursor is evil and should never, ever be used. You are confused: if a cursor is never to be used, then why is it part of the language? Well, you might say the same about the GOTO statement, which also exists in T-SQL. Edsger Dijkstra's letter Go To Statement Considered Harmful was published in the March 1968 Communications of the ACM.

The reason that cursors are evil is that they tend to be slower than a set based solution. Cursors are not needed in 99% of cases. SQL is a set based language; it works best with sets of data, not row by row processing. When you do something set based it will generally perform hundreds of times faster than using a cursor.

Take a look at this code

CREATE FUNCTION dbo.cursorEnroll ()
    RETURNS INT AS
    BEGIN
        DECLARE @studentsEnrolled INT = 0
        DECLARE @enrollmentID INT
        DECLARE myCursor CURSOR FOR
            SELECT enrollementID
                FROM courseEnrollment
        OPEN myCursor;

        -- fetch into a throwaway variable; we only care about counting rows
        FETCH NEXT FROM myCursor INTO @enrollmentID

        WHILE @@FETCH_STATUS = 0
            BEGIN
                SET @studentsEnrolled = @studentsEnrolled + 1
                FETCH NEXT FROM myCursor INTO @enrollmentID
            END;
        CLOSE myCursor
        DEALLOCATE myCursor
        RETURN @studentsEnrolled

    END;

That whole flawed cursor logic can be replaced with one line of T-SQL

SELECT @studentsEnrolled = COUNT(*) FROM courseEnrollment

Which one do you think will perform faster?

What is more evil than a cursor?

If cursors are evil, then what is more evil than a cursor? Nested cursors, of course, especially three nested cursors. Here is an example of some horrible code where a cursor is not needed.

CREATE PROCEDURE [dbo].[sp_SomeMadeUpName]
AS

DECLARE @SomeDate DATETIME

SET @SomeDate =  CONVERT(CHAR(10),getDate(),112)

EXEC sp_createSomeLinkedServer @SomeDate,@SomeDate,12

DECLARE @sql NVARCHAR(2000), @SomeID VARCHAR(20)



DECLARE SomeCursor CURSOR 
FOR
SELECT DISTINCT SomeID
FROM SomeTable
WHERE getDate() BETWEEN SomeStart and SomeEnd


OPEN SomeCursor

FETCH NEXT FROM SomeCursor INTO @SomeID

WHILE @@FETCH_STATUS = 0

 BEGIN
 
 PRINT @SomeID
 
 SET @sql = ''
 SET @sql = @sql + N'DECLARE @Date DATETIME, @Value FLOAT' + char(13) + char(13)
 SET @sql = @sql + N'DECLARE curData CURSOR FOR' + char(13)
 SET @sql = @sql + N'SELECT * ' + char(13)
 SET @sql = @sql + N'FROM OPENQUERY(LinkedServerName,''SELECT Date,' + RTRIM(@SomeID) + ' FROM SomeTable'')' + char(13) + char(13)
 SET @sql = @sql + N'OPEN curData' + char(13) + char(13)
 SET @sql = @sql + N'FETCH NEXT FROM curData INTO @Date,@Value' + char(13)
 SET @sql = @sql + N'WHILE @@FETCH_STATUS = 0' + char(13)
 SET @sql = @sql + N'BEGIN' + char(13)
 SET @sql = @sql + N'INSERT INTO SomeTAble' + char(13)
 SET @sql = @sql + N'VALUES(''' + @SomeID + ''',@Date,@Value)' + char(13)
 SET @sql = @sql + N'FETCH NEXT FROM curData INTO @Date,@Value' + char(13)
 SET @sql = @sql + N'END' + char(13)
 SET @sql = @sql + N'CLOSE curData' + char(13) + char(13)
 SET @sql = @sql + N'DEALLOCATE curData' 

 PRINT @sql + char(13) + char(13)

 EXEC sp_ExecuteSQL @sql

 FETCH NEXT FROM SomeCursor INTO @SomeID

 END

CLOSE SomeCursor
DEALLOCATE SomeCursor


Why the need to loop over the list of IDs? Join to the linked server and do all of this in a few lines of code, something like the sketch below.
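Here is a hedged sketch of what that set-based version could look like; all the table, column and linked server names are made up to match the example, and it assumes the remote table can expose SomeID, Date and Value as regular columns.

-- One set-based statement instead of a cursor per ID and a cursor per remote row
INSERT INTO SomeTable (SomeID, SomeDate, SomeValue)
SELECT ids.SomeID, r.[Date], r.[Value]
FROM (SELECT DISTINCT SomeID
      FROM SomeOtherTable
      WHERE GETDATE() BETWEEN SomeStart AND SomeEnd) AS ids
JOIN OPENQUERY(LinkedServerName,
     'SELECT SomeID, [Date], [Value] FROM SomeRemoteTable') AS r
ON   r.SomeID = ids.SomeID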

I have seen some really horrible code with nested cursors. One example I saw was when someone needed to sum up some data: he created three nested cursors, the first one to loop over years, the second to loop over months and the third to loop over days. This thing ran forever.

All you need is a simple GROUP BY, for example:

;WITH cte AS (SELECT DATEADD(dd, number, '20190101') AS TheDate, number
              FROM master..spt_values WHERE type = 'p')
SELECT SUM(number) AS Value,
       YEAR(TheDate) AS TheYear,
       MONTH(TheDate) AS TheMonth
FROM cte
GROUP BY YEAR(TheDate), MONTH(TheDate)
ORDER BY TheYear, TheMonth

Output


Replacing one evil with another

If you are using WHILE loops instead of cursors then you really have not replaced the cursor at all; you are still not doing a set based operation, everything is still going row by row. Aaron Bertrand has a good post about this, so there is no need for me to repeat it here: Bad Habits to Kick: Thinking a WHILE loop isn't a CURSOR.

Loops in triggers

Usually you will find a loop in a trigger that calls some sort of stored procedure that needs to perform some kind of task for each row affected by the trigger. In the Triggers, when to use them, when not to use them post I already explained how to handle this scenario.

Loops in stored procedures

Here is where you will find the biggest offenders. All you have to do to find the procs with cursors is run the following piece of code:

SELECT * FROM sys.procedures
WHERE OBJECT_DEFINITION(object_id) LIKE '%DECLARE%CURSOR%'

For WHILE loops, just change the '%DECLARE%CURSOR%' part to '%WHILE%'.
Look at those procs and investigate whether you can rewrite them using a set based operation instead.

Loops for maintenance purposes

One of the few times I will use cursors or WHILE loops is when I need to gather information about tables or databases and that information only comes from a stored procedure.
Here is an example:

CREATE TABLE #tempSpSpaceUsed (TableName VARCHAR(100),
        Rows INT,
        Reserved VARCHAR(100),
        Data VARCHAR(100),
        IndexSize VARCHAR(100),
        Unused VARCHAR(100))

-- number the tables so the loop has something to step through
SELECT name, ROW_NUMBER() OVER(ORDER BY name) AS id
INTO #temp
FROM sys.tables

DECLARE @TableName VARCHAR(300)
DECLARE @loopid INT = 1, @maxID INT = (SELECT MAX(id) FROM #temp)

WHILE @loopid <= @maxID
BEGIN
    SELECT @TableName = name FROM #temp WHERE id = @loopid

    INSERT #tempSpSpaceUsed
    EXEC('EXEC sp_spaceused ''' + @TableName + '''')

    SET @loopid += 1
END

SELECT * FROM #tempSpSpaceUsed
 
Of course this can be simplified as well; you can just run this instead of the WHILE loop and then execute the output:

SELECT DISTINCT 'INSERT #tempSpSpaceUsed
EXEC sp_spaceused ''' + [name] + ''''
FROM sys.tables
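Another option, if you don't mind leaning on an undocumented and unsupported procedure, is sp_MSforeachtable; a quick sketch:

-- sp_MSforeachtable is undocumented and unsupported, but widely known;
-- the ? placeholder is replaced with each table name
INSERT #tempSpSpaceUsed
EXEC sp_MSforeachtable 'EXEC sp_spaceused ''?'''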


Finally, if you need more cursor goodness, take a look at these three posts, which are also mentioned in the wiki article.


Tuesday, January 01, 2019

SQL Server Maintenance

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts to see if anything has changed and to see if I could add some additional info.

In this post we are going to look at SQL Server maintenance.

Just like with a car or a house, you need to do maintenance on databases as well. SQL Server has gotten better over the years, and there are fewer knobs you need to turn out of the box, but maintenance is still required.


In this post I will be looking at some stuff that you need to be aware of. Some of the things I will mention can be thought of as maintenance as well as regular checks. Think of a DBA as a car mechanic: instead of an oil change, a tune up or checking the tire pressure, the DBA will check index fragmentation, run DBCC CHECKDB and make sure you have enough space for the database to grow for the next predetermined period.

The things I will cover in this post are: fragmentation of indexes, free drive space, space in filegroups, running DBCC CHECKDB, and finally making sure that you have the latest source code of your objects in a source control system.

Check fragmentation of indexes

Indexes will get fragmented over time if you do a lot of updates, inserts and deletes.

Now, instead of rolling your own solution, you should take a look at the solutions that are out there and used by many people. Take a look at SQL Server Index and Statistics Maintenance by Ola Hallengren. You can also get the scripts from GitHub here: https://github.com/olahallengren/sql-server-maintenance-solution
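Once the solution is installed, a call like the one below is typical. This sketch uses the thresholds and actions suggested in Ola's documentation, so double check them against your own requirements.

-- Reorganize lightly fragmented indexes and rebuild heavily fragmented ones
-- in all user databases (thresholds are the commonly suggested defaults)
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationLevel1 = 5,
    @FragmentationLevel2 = 30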


Check that your database is healthy by running DBCC CHECKDB

What does DBCC CHECKDB do? Here is the explanation from Books Online:
Checks the logical and physical integrity of all the objects in the specified database by performing the following operations:

  • Runs DBCC CHECKALLOC on the database.
  • Runs DBCC CHECKTABLE on every table and view in the database.
  • Runs DBCC CHECKCATALOG on the database.
  • Validates the contents of every indexed view in the database.
  • Validates link-level consistency between table metadata and file system directories and files when storing varbinary(max) data in the file system using FILESTREAM.
  • Validates the Service Broker data in the database.
So how frequently should you be running DBCC CHECKDB? Ideally you should be running it as frequently as possible. Do you want to find out that there is corruption when it is very difficult to fix because two weeks have passed, or do you want to find out the same day so that you can fix the table immediately?
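Running it is a one-liner; a minimal sketch, with the database name as a placeholder:

-- Check the whole database; NO_INFOMSGS cuts the noise,
-- ALL_ERRORMSGS shows every error per object instead of just the first few
DBCC CHECKDB ('YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS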

Paul Randal who worked on DBCC CHECKDB has a whole bunch of blog posts about DBCC CHECKDB, the posts can be found here http://www.sqlskills.com/blogs/paul/category/checkdb-from-every-angle.aspx

Make sure that you have enough space left on the drives

Running out of space on a drive is no fun; suddenly you can't insert any more data into your tables because no new pages can be allocated. If you have tools in your shop like Cacti then this is probably already monitored. If you don't have any tools then either get a tool or roll your own. Here is how you can get the free space of the drives with T-SQL.


CREATE TABLE #FixedDrives(Drive CHAR(1),MBFree INT)

INSERT #FixedDrives
EXEC xp_fixeddrives

SELECT * FROM #FixedDrives


Here is the output for one of my servers

Drive MBFree
-------------------
C     6916   -- System
D     28921  -- Apps
L     52403  -- Log
M     4962   -- System databases
T     86208  -- Temps
U     71075  -- User databases
V     212075 -- User databases


Here is a simple example of T-SQL that you can put in a SQL Agent job that runs every 10 minutes; it will send an email if you go below the thresholds that you specified. This code is very simple and is just to show you that you can do this in T-SQL. You can make it more dynamic/configurable by not hard-coding the drives or thresholds.

DECLARE @MBFreeD INT
DECLARE @MBFreeE INT

CREATE TABLE #FixedDrives(Drive CHAR(1), MBFree INT)

INSERT #FixedDrives
EXEC xp_fixeddrives

SELECT @MBFreeD = MBFree
FROM #FixedDrives
WHERE Drive = 'D'

SELECT @MBFreeE = MBFree
FROM #FixedDrives
WHERE Drive = 'E'

DROP TABLE #FixedDrives

IF @MBFreeD < 30000 OR @MBFreeE < 10000
BEGIN
    DECLARE @Recipients VARCHAR(8000)
    SELECT @Recipients = 'SomeGroup@SomeEmail.com'

    DECLARE @p_body NVARCHAR(MAX), @p_subject NVARCHAR(MAX)

    SET @p_subject = @@SERVERNAME + N' Drive Space is running low'
    SET @p_body = 'Drive Space is running low <br><br><br>' + CHAR(13) + CHAR(10) + 'Drive D has '
        + CONVERT(VARCHAR(20), @MBFreeD) + ' MB left <br>' + CHAR(13) + CHAR(10) + 'Drive E has '
        + CONVERT(VARCHAR(20), @MBFreeE) + ' MB left'

    EXEC msdb.dbo.sp_send_dbmail
        @recipients = @Recipients,
        @body = @p_body,
        @body_format = 'HTML',
        @subject = @p_subject
END

Make sure that you have enough space left for the filegroups

In the Sizing database files post I talked about the importance of sizing database files. Just like you can run out of hard drive space, you can also fill up a file used by SQL Server. Here is a query that will tell you how big each file is, how much space is used and how much free space is left. You can use a query like this to alert you before you run out of space.


SELECT
    a.fileid,
    [FILE_SIZE_MB] =
        CONVERT(DECIMAL(12,2), ROUND(a.size / 128.000, 2)),
    [SPACE_USED_MB] =
        CONVERT(DECIMAL(12,2), ROUND(FILEPROPERTY(a.name, 'SpaceUsed') / 128.000, 2)),
    [FREE_SPACE_MB] =
        CONVERT(DECIMAL(12,2), ROUND((a.size - FILEPROPERTY(a.name, 'SpaceUsed')) / 128.000, 2)),
    NAME = LEFT(a.name, 35),
    FILENAME = LEFT(a.filename, 60)
FROM dbo.sysfiles a

Have the latest scripts of all your objects

You might say that you have all the code for your objects in the database. But what if you want to go back to the version of a proc from 3 days ago? Is it really easier to restore an 800 GB backup from 3 days ago just to get the stored proc code?

Of course not. Make sure that you have DDL scripts of every object in source control; your life will be much easier.
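If you need a quick way to pull the current definitions out of the database, for instance to diff them against what is in source control, sys.sql_modules has them all; a small sketch:

-- Grab the definition of every procedure, view, function and trigger in the database
SELECT s.name AS SchemaName, o.name AS ObjectName, o.type_desc, m.definition
FROM sys.sql_modules m
JOIN sys.objects o ON o.object_id = m.object_id
JOIN sys.schemas s ON s.schema_id = o.schema_id
ORDER BY s.name, o.name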
I only touched on a couple of points here; some of the things mentioned will also show up in the proactive notifications post in a couple of days. There is much more to maintenance than this, so keep yourself informed and make sure you understand what needs to be done.


Some more repos for you to use


Take a look at the GitHub repositories mentioned in this post: Five great SQL Server GitHub repos that every SQL Server person should check out. There are some good ones like dbatools and tigertoolbox.

Monday, February 19, 2018

Reinventing the wheel

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts to see if anything has changed and to see if I could add some additional info.

In this post we are going to look at something called reinventing the wheel. Just in case you are not familiar with this metaphor, or maybe you are not a native English speaker, I will use Wikipedia's description of what reinventing the wheel means:
To reinvent the wheel is to duplicate a basic method that has already previously been created or optimized by others.
The inspiration for this idiomatic metaphor lies in the fact that the wheel is the archetype of human ingenuity, both by virtue of the added power and flexibility it affords its users, and also in the ancient origins which allow it to underlie much, if not all, of modern technology. As it has already been invented, and is not considered to have any operational flaws, an attempt to reinvent it would be pointless and add no value to the object, and would be a waste of time, diverting the investigator's resources from possibly more worthy goals which his or her skills could advance more substantially.



So now that you have read the paragraph above, how many times did you write some code only to find out that it already exists in the language as part of some library or function? How many times have you written code that you could have easily grabbed from GitHub, CodePlex or other repositories for your own use?


Why write your own solution when you can use something that is robust and tested?


To start, let's take a look at the GitHub repositories mentioned in this post: Five great SQL Server GitHub repos that every SQL Server person should check out.
You will find code that does index maintenance, helps you with performance issues and setup, and more. Check out that post for more details.


Find out who the community leaders are for a particular skill set that you are interested in, and start following these people: follow them on Twitter, subscribe to their blogs and podcasts. Go to their presentations, talk to them, find out what they use, and find out if they have made code available for the public to use. You will find that a good percentage of these people have made a whole bunch of libraries, stored procedures, functions, maintenance routines and much more available for you to use, and it is all free.
Don't be scared to ask for help on Twitter; if you don't know any of the SQL Server tweeple, use the #sqlhelp hash tag and ask for help. Here is an example of what it looks like: #sqlhelp
Here is an image of the replies on Twitter after I asked a question with the #sqlhelp tag


Besides Twitter, you can also use Slack. I like Slack more because you are not limited to 280 characters. Here is the link to the relevant Slack channel: https://sqlcommunity.slack.com/messages/C1MS1RA4B/

Here is a screen shot of what it looks like

That looks a little better than Twitter, don't you think?


Some commercial firms will also have community editions of their code and tools for you to use. Take advantage of this; these are great, and if you like the tools then maybe you will find a need for the pro editions, which have more bells and whistles and are not limited.

Some examples of available solutions:

SQL Server activity
Want to know what is going on right now? Try Adam Machanic's procedure Who Is Active.
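Once the procedure is installed (it usually lives in master), calling it is as simple as the sketch below; @get_plans is just one of its many optional parameters.

-- Show active sessions; @get_plans = 1 also pulls in the query plans
EXEC dbo.sp_WhoIsActive @get_plans = 1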

Execution Plans
Check out SentryOne's Plan Explorer. This plan explorer does much more than the one that comes with SQL Server Management Studio.

SQL Search and other tools
Red Gate has a bunch of free tools; you can get those here: https://www.red-gate.com/products/free-tools. I started to use Red Gate's tools back in 2003; SQL Compare is the one I used the most. SQL Search is free, and if you need to find anything in your DB it is invaluable.
Idera free tools
Idera has a bunch of free tools available for download; you can find them all here: https://www.idera.com/productssolutions/freetools

Get involved

If you have created some cool code and you know there is nothing similar out there, why not give back to the community? Put it out there, solicit feedback, and in the end the code will be better because more eyes will have looked at it. Accept contributions as well. All of these things will make the community as a whole grow; if the community grows then the platform will grow as well, and when the platform grows, there will be more demand for someone with your skill set. You are responsible for making sure your community doesn't turn into a ghost town.

Saturday, January 05, 2008

The World Is Small, The Risk Of Your Data Being Stolen Is Not!

Remember the How Is Your Sensitive Data Encrypted In The Database? post I wrote a while back? A colleague just informed me that he got a letter from that same datacenter. The letter states that his personal data was on one of the servers that got stolen. I told him that this is the reason we encrypt our data, and also why we encrypt outside of the DB. The world is small indeed.

Here is a pic of the letter

(image: IdentityTheft)

Friday, November 23, 2007

Whitepaper on Malware to Attack Databases

Brian Kelly mentions on his blog a whitepaper by Cesar Cerrudo: Data0: Next generation malware for stealing databases. The whitepaper describes how malware could be crafted to steal information out of databases.



The attack will use the following techniques:
  • Discovery
  • Exploitation
  • Escalate Privileges (if necessary)
  • Cover Tracks


Print it out and read it while you wait in line on Black Friday

Wednesday, September 19, 2007

SQL Injection Cheat Sheet

What is SQL Injection? From Wikipedia: SQL injection is a technique that exploits a security vulnerability occurring in the database layer of an application. The vulnerability is present when user input is either incorrectly filtered for string literal escape characters embedded in SQL statements, or user input is not strongly typed and is thereby unexpectedly executed.
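To make that concrete, here is a tiny illustration; the Users table and the input value are made up.

-- Building SQL by concatenating user input is the vulnerability
DECLARE @UserInput NVARCHAR(100) = N''' OR 1=1 --'
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT * FROM Users WHERE name = ''' + @UserInput + N''''
-- @sql now reads: SELECT * FROM Users WHERE name = '' OR 1=1 --'
-- which would return every row instead of one user

-- Parameterizing the query with sp_executesql removes the problem
EXEC sp_executesql
    N'SELECT * FROM Users WHERE name = @name',
    N'@name NVARCHAR(100)',
    @name = @UserInput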

Here is a nice SQL injection cheat sheet. Currently it covers MySQL and Microsoft SQL Server, with some Oracle and some PostgreSQL.

http://ferruh.mavituna.com/makale/sql-injection-cheatsheet/

Table Of Contents
About SQL Injection Cheat Sheet
Syntax Reference, Sample Attacks and Dirty SQL Injection Tricks

Line Comments
SQL Injection Attack Samples

Inline Comments
Classical Inline Comment SQL Injection Attack Samples
MySQL Version Detection Sample Attacks

Stacking Queries
Language / Database Stacked Query Support Table
About MySQL and PHP
Stacked SQL Injection Attack Samples

If Statements
MySQL If Statement
SQL Server If Statement
If Statement SQL Injection Attack Samples

Using Integers

String Operations
String Concatenation

Strings without Quotes
Hex based SQL Injection Samples

String Modification & Related

Union Injections
UNION – Fixing Language Issues

Bypassing Login Screens

Enabling xp_cmdshell in SQL Server 2005
Other parts are not so well formatted, but check it out for yourself; there are drafts, notes and other stuff if you scroll down.

Saturday, July 14, 2007

Best Practice: Backups

What if I told you to take your latest production backup, restore it on a different machine and try using the database? Are you comfortable with that task? Do you think it will work? When was the last time you tested your backups?

Do you even have a backup?
Why am I asking all these things? Because your data is only as good as your last good backup. Is your data backed up regularly? You will say “Of course it is, we use [Insert expensive backup solution here] for all our enterprise backups”. Prove it: go to work on Monday and ask them to give you the latest backup. I bet that out of 100 people who ask their backup team this question, several will be without a backup file.
Here is another problem: three years ago the backups were taking about 1 hour. The backup started at 12 and would be done at 1; at 1:30 a job from another machine would FTP the file down. Two years later the backup takes 2 hours to complete, and you didn't realize this. Can you guess what will happen if you try to restore one of those backups that were moved by FTP? I will tell you: it won't work. And what if there is no backup at all when the FTP runs? Oh yes, a 0 KB file will be created.
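While nothing replaces actually restoring the backup somewhere else, you can at least catch corrupt files early. A minimal sketch, with the database name and path as placeholders:

-- Back up with checksums and verify the file afterwards
BACKUP DATABASE YourDatabase
TO DISK = N'X:\Backups\YourDatabase.bak'
WITH CHECKSUM, INIT

RESTORE VERIFYONLY
FROM DISK = N'X:\Backups\YourDatabase.bak'
WITH CHECKSUM

-- RESTORE VERIFYONLY checks the file, it does not prove the database restores;
-- a periodic test restore on another server is still the real test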

Where do you keep your backups?
Are your backups in the same building? If you said yes, then you have a big problem. Let me tell you a little story. I worked for a company in New York City between 2001 and 2005. This company had their office in WTC tower one. To be safe they kept their backups in WTC tower two. Well, I don't have to tell you what happened to those backups. If you do store your backups offsite (and why wouldn't you?), make sure it is at least 100 miles away. If you don't want to go that far from your current location, then pick a location which is safe from floods and fires and not worth attacking.

Where is your Source Code?
Do you backup your source code? Most people will say they keep it in Subversion or Visual SourceSafe. But does that get backed up? What happens if your building goes up in flames? What we do is take a full source code backup every day. In addition to that we also have differential backups every n revisions. We have jobs that create these backups and then FTP them to 3 different locations. If you have 20 developers and you lose 6 hours of work, then you have lost 120 hours * $$ (you do the math), and that is the best case scenario. If the backup was in the building together with all the workstations, then you have a much bigger problem to deal with.
SQL developers are notorious for not using source control. They will tell you that the database backup is their source control. A source control system does not have to be expensive; we use Subversion (which is free and better than VSS). You can use either TortoiseSVN or the plugin for Visual Studio to do your check-ins.

Friday, June 08, 2007

Three New SQL Server Best Practices Articles On TechNet

Predeployment I/O Best Practices

The I/O system is important to the performance of SQL Server. When configuring a new server for SQL Server or when adding or modifying the disk configuration of an existing system, it is good practice to determine the capacity of the I/O subsystem prior to deploying SQL Server. This white paper discusses validating and determining the capacity of an I/O subsystem. A number of tools are available for performing this type of testing. This white paper focuses on the SQLIO.exe tool, but also compares all available tools. It also covers basic I/O configuration best practices for SQL Server 2005.
On This Page
  • Overview
  • Determining I/O Capacity
  • Disk Configuration Best Practices & Common Pitfalls
  • SQLIO
  • Monitoring I/O Performance Using System Monitor
  • Conclusion
  • Resources



Partial Database Availability

This white paper outlines the fundamental recovery and design patterns involving the use of filegroups in implementing partial database availability in SQL Server 2005. As databases become larger and larger, the infrastructure assets and technology that provide availability become more and more important.

The database filegroups feature introduced in previous versions of SQL Server enables the use of multiple database files in order to host very large databases (VLDB) and minimize backup time. With data spanning multiple filegroups, it is possible to construct a database layout whereby failure of certain data resources does not render the entire solution unavailable. This increases the availability of solutions that use SQL Server and further reduces the surface area of failure that would render the database totally unavailable.



Comparing Tables Organized with Clustered Indexes versus Heaps

In SQL Server 2005, any table can either have a clustered index or be organized as a heap (without a clustered index). This white paper summarizes the advantages and disadvantages, the difference in performance characteristics, and other behaviors of tables that are ordered as lists (clustered indexes) or heaps. The performance for six distinct scenarios where DML operations are performed on these tables is measured and detailed observations are presented. This white paper provides best practice recommendations on the merits of the two types of table organization, along with examples of when you might want to use one or the other.
On This Page
  • Introduction
  • Clustered Indexes and Heaps
  • Test Objectives
  • Test Methodology
  • Test Results and Observations
  • Recommendations
  • Appendix: Test Environment

Monday, October 17, 2005

Do Not Drop And Create Indexes On Your Tables

When you do this, the nonclustered indexes are dropped and recreated twice: once when you drop the clustered index, and then again when you create the clustered index.

Use the DROP_EXISTING clause of the CREATE INDEX statement instead; this recreates the clustered index in one atomic step, avoiding recreating the nonclustered indexes, since the clustered index key values used by the row locators remain the same.

Here is an example:

CREATE UNIQUE CLUSTERED INDEX pkmyIndex ON MyTable(MyColumn)
WITH (DROP_EXISTING = ON)