
Wednesday, October 11, 2017

Sargable Queries...

It has been a while since I wrote one of my best practices posts. I decided to revisit these posts again to see if anything has changed, and also to see if I could add some additional info.

Today we are going to look at sargable queries. You might ask yourself, what is this weird term sargable? Sargable comes from searchable argument, sometimes also written as Search ARGument ABLE. It means that the query can use an index: a seek will be performed instead of a scan. In general, any time you have a function wrapped around a column, an index seek won't be used.


Some examples that are not sargable
WHERE LEFT(Name,1) = 'S'
WHERE Year(SomeDate) = 2012
WHERE OrderID * 3 = 33000


Those three should be rewritten like this in order to become sargable

WHERE Name LIKE 'S%'
WHERE SomeDate >= '20120101' AND SomeDate < '20130101'
WHERE OrderID = 33000/3


Let's create a table and insert some data so that we can look at the execution plan.
Create this simple table:

CREATE TABLE Test(SomeID varchar(100))


Let's insert some data that will start with a letter followed by some digits

INSERT Test
SELECT LEFT(v2.type,1) +RIGHT('0000' + CONVERT(varchar(4),v1.number),4) 
FROM master..spt_values v1
CROSS JOIN (SELECT DISTINCT LEFT(type,1) AS type 
FROM master..spt_values) v2
WHERE v1.type = 'p'


That insert should have generated 32768 rows
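
You can verify the row count with a quick query:

SELECT COUNT(*) FROM Test -- should return 32768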


Now create this index on that table

CREATE CLUSTERED INDEX cx_test ON Test(SomeID)

Let's take a look at the execution plan. Hit CTRL + M; this will include the execution plan once the query is done running.

SELECT * FROM Test
WHERE SomeID LIKE 's%'

SELECT * FROM Test
WHERE LEFT(SomeID,1) = 's'

Here is what the plans look like


As you can see, it is 9% versus 91% between the two queries; that is a big difference.
Hit CTRL + M again to disable the inclusion of the plan.

Run this code block; it will give you the plans in text format:

SET SHOWPLAN_TEXT ON
GO

SELECT * FROM Test
WHERE SomeID LIKE 's%'

SELECT * FROM Test
WHERE LEFT(SomeID,1) = 's'
GO

SET SHOWPLAN_TEXT OFF
GO

Here are the two plans
|--Clustered Index Seek(OBJECT:([master].[dbo].[Test].[cx_test]),
SEEK:([master].[dbo].[Test].[SomeID] >= 'Rþ' AND [master].[dbo].[Test].[SomeID] < 'T'),
WHERE:([master].[dbo].[Test].[SomeID] like 's%') ORDERED FORWARD)
|--Clustered Index Scan(OBJECT:([master].[dbo].[Test].[cx_test]),
WHERE:(substring([master].[dbo].[Test].[SomeID],(1),(1))='s'))
As you can see, the top one, while looking more complicated, is actually giving you a seek.

Making a case sensitive search sargable

Now let's take a look at how we can make a case-sensitive search sargable as well.
In order to do a case-sensitive search, you have to have a case-sensitive collation; if your table was not created with a case-sensitive collation, you can supply one as part of the query.
Here is an example to demonstrate what I mean


This is a simple table created without specifying a collation

CREATE TABLE TempCase1 (Val CHAR(1))
INSERT TempCase1 VALUES('A')
INSERT TempCase1 VALUES('a')

Running this select statement will return both rows

SELECT * FROM TempCase1
WHERE Val = 'A' 

Val
-----
A
a

Now create the same kind of table but with a case sensitive collation


CREATE TABLE TempCase2 (Val CHAR(1) COLLATE SQL_Latin1_General_CP1_CS_AS)
INSERT TempCase2 VALUES('A')
INSERT TempCase2 VALUES('a')
Run the same query

SELECT * FROM TempCase2
WHERE Val = 'A' 

Val
-----
A


As you can see, you now only get the one row that matches the case

To return both rows from the case sensitive table, you can supply a case insensitive collation in the query itself

SELECT * FROM TempCase2
WHERE Val = 'A' COLLATE SQL_Latin1_General_CP1_CI_AS


Val
-----
A
a
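
If you are not sure which collation a column ended up with, you can check it in sys.columns; for example, for the TempCase2 table we just created:

SELECT name, collation_name
FROM sys.columns
WHERE object_id = OBJECT_ID('TempCase2')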


Now let's take a look at how we can make the case sensitive search sargable
First create this table and insert some data


CREATE TABLE TempCase (Val CHAR(1))
 
INSERT TempCase VALUES('A')
INSERT TempCase VALUES('B')
INSERT TempCase VALUES('C')
INSERT TempCase VALUES('D')
INSERT TempCase VALUES('E')
INSERT TempCase VALUES('F')
INSERT TempCase VALUES('G')
INSERT TempCase VALUES('H')


Now we will insert some lowercase characters

INSERT TempCase
SELECT LOWER(Val) FROM TempCase


Now we will create our real table which will have 65536 rows

CREATE TABLE CaseSensitiveSearch (Val VARCHAR(50))

We will do a couple of cross joins to generate the data for our queries

INSERT CaseSensitiveSearch
SELECT t1.val + t2.val + t3.val + t4.val
FROM TempCase t1
CROSS JOIN TempCase t2
CROSS JOIN TempCase t3
CROSS JOIN TempCase t4

Create an index on the table

CREATE INDEX IX_SearchVal ON CaseSensitiveSearch(Val)

Just like before, if we run this we will get back the exact value we passed in and also all the upper case and lower case variations

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' 

Here are the results of that query
Val
-----
AbCd
ABcd
Abcd
ABCd
aBCd
abCd
aBcd
abcd
abCD
aBcD
abcD
aBCD
ABCD
AbCD
ABcD
AbcD


If you add the case sensitive collation to the query, you will get only what matches your value

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS

Here is the result; it matches what was passed in
Val
---
ABCD


The problem with the query above is that it will cause a scan. So what can we do, how can we make it perform better? It is simple... combine the two queries.

First grab all rows that match regardless of case (this part is sargable), and then filter out the ones that don't match the case exactly.
Here is what that query looks like:

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS
AND Val LIKE 'ABCD'

The AND Val LIKE 'ABCD' part gives you a seek, and the Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS part then filters that small set down to the row that exactly matches your value.
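
The same trick works when the search value arrives in a variable (a small sketch; @SearchVal is just an illustrative name):

DECLARE @SearchVal varchar(50) = 'ABCD'

SELECT * FROM CaseSensitiveSearch
WHERE Val = @SearchVal COLLATE SQL_Latin1_General_CP1_CS_AS
AND Val LIKE @SearchVal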

If you run both queries, you can look at the plan difference (hit CTRL + M so that the plan is included)

SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS



SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS
AND Val LIKE 'ABCD'

Here is the plan


As you can see, there is a big difference between the two
Here is the plan in text as well

SET SHOWPLAN_TEXT ON
GO
 
SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS



SELECT * FROM CaseSensitiveSearch
WHERE Val = 'ABCD' COLLATE SQL_Latin1_General_CP1_CS_AS
AND Val LIKE 'ABCD'

GO
 
SET SHOWPLAN_TEXT OFF
GO
|--Table Scan(OBJECT:([tempdb].[dbo].[CaseSensitiveSearch]),
WHERE:(CONVERT_IMPLICIT(varchar(50),[tempdb].[dbo].[CaseSensitiveSearch].[Val],0)=CONVERT(varchar(8000),[@1],0)))
|--Index Seek(OBJECT:([tempdb].[dbo].[CaseSensitiveSearch].[IX_SearchVal]), SEEK:([tempdb].[dbo].[CaseSensitiveSearch].[Val] >= 'ABCD'
AND [tempdb].[dbo].[CaseSensitiveSearch].[Val] <= 'ABCD'),
WHERE:(CONVERT_IMPLICIT(varchar(50),[tempdb].[dbo].[CaseSensitiveSearch].[Val],0)='ABCD' AND [tempdb].[dbo].[CaseSensitiveSearch].[Val] like 'ABCD') ORDERED FORWARD)

I really wish Microsoft would take the time to internally rewrite queries like these two when they hit the optimizer:


WHERE LEFT(Name,1) = 'S'
WHERE Year(SomeDate) = 2012



It should rewrite them like this, and then performance would be much better:

WHERE Name LIKE 'S%'
WHERE SomeDate >= '20120101' AND SomeDate < '20130101'

I think there are probably some SQL Server consultants cursing me now for even trying to suggest this :-)

Thursday, October 05, 2017

Data types storage differences

It has been a while since I wrote one of my best practices posts. I decided to revisit these posts again to see if anything has changed, and also to see if I could add some additional info.


Today we are going to take a look at how data types can have an impact on queries and on the size of your database.

Char vs NChar

SQL Server has two families of data types to store character data[1]; both come in fixed and variable length sizes. The char and varchar data types use one byte of storage per character, while the nchar and nvarchar data types use two bytes of storage per character. The nchar and nvarchar data types are used to store Unicode data.
Let's think about that for a second: char and varchar can store twice the number of characters in the same amount of storage as nchar and nvarchar. Why does this matter, space is cheap right? True, space is getting cheaper, but we are also storing more and more data every year.
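
You can see the two-bytes-per-character difference with DATALENGTH:

SELECT DATALENGTH('SQL') AS char_bytes,  -- 3
       DATALENGTH(N'SQL') AS nchar_bytes -- 6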


Now think about what happens when you have everything stored as Unicode data
  • What happens to your backup and restore process, will it be faster or slower, will the files be bigger if not compressed?
  • What about when transferring results to and from your database server, can the packets carry the same number of characters?
  • What about the amount of data on a page, what does this do to indexes and index lookups, how does it affect index maintenance?

If you don't need it, then don't use Unicode data.
Some examples of what I have seen stored in nchar and nvarchar when really it shouldn't be:
Zip Code for US addresses
US addresses
Social Security Numbers (which were stored in plain text, no less)
Integer data (enforced by constraints or the app layer to make sure these were only digits)



Let's take a quick look by running some T-SQL
First create these two tables


CREATE TABLE TestChar (SomeCol char(10))
GO

CREATE TABLE TestNChar (SomeCol nchar(10))
GO

CREATE index ix_test on TestChar(SomeCol)
GO
CREATE index ix_test on TestNChar(SomeCol)
GO
Now populate each with some data


INSERT TestChar
SELECT TOP 1000000 '1234567890'
FROM sys.sysobjects c1
CROSS JOIN sys.sysobjects c2
CROSS JOIN sys.sysobjects c3
CROSS JOIN sys.sysobjects c4
GO

INSERT TestNChar
SELECT TOP 1000000 '1234567890'
FROM sys.sysobjects c1
CROSS JOIN sys.sysobjects c2
CROSS JOIN sys.sysobjects c3
CROSS JOIN sys.sysobjects c4
GO
Let's see how much space is used by each table

EXEC sp_spaceused 'TestChar'
EXEC sp_spaceused 'TestNChar'
TestChar:  reserved = 42768 KB
TestNChar: reserved = 62736 KB

If you look at the reserved column, you will see that the nchar data is using about 20 MB more than the char data. That is about what you would expect: ten extra bytes per row over a million rows is roughly 10 MB in the table, and the index on the column accounts for roughly the same again.


Implicit conversions

Besides the storage increase, there is also a problem when querying for data that looks like varchar but is stored as Unicode. Run the code below.


SET SHOWPLAN_TEXT ON
GO
DECLARE @v varchar(10) = '0123456789'

SELECT * FROM TestChar WHERE SomeCol LIKE  @v +'%'
GO

SET SHOWPLAN_TEXT OFF
GO

Here is the plan for that query
  |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Expr1009], [Expr1010]))
       |--Compute Scalar(DEFINE:([Expr1008]=LikeRangeStart([@v]+'%'), [Expr1009]=LikeRangeEnd([@v]+'%'), [Expr1010]=LikeRangeInfo([@v]+'%')))
       |    |--Constant Scan
       |--Index Seek(OBJECT:([Performance].[dbo].[TestChar].[ix_test]), SEEK:([Performance].[dbo].[TestChar].[SomeCol] > [Expr1008] AND [Performance].[dbo].[TestChar].[SomeCol] < [Expr1009]),  WHERE:([Performance].[dbo].[TestChar].[SomeCol] like [@v]+'%') ORDERED FORWARD)
If we look at the plan, we can see that this looks pretty good: a seek with a dynamic range.
People will sometimes change the data type of a column but will not change any of the code that accesses that column. Let's now change the data type of the column to nchar.


DROP INDEX TestChar.ix_test
GO

ALTER TABLE TestChar ALTER COLUMN SomeCol nchar(10)
GO

CREATE INDEX ix_test on TestChar(SomeCol)
GO
Run the query that gives you the plan again



SET SHOWPLAN_TEXT ON
GO
DECLARE @v varchar(10) = '0123456789'

SELECT * FROM TestChar WHERE SomeCol LIKE  @v +'%'
GO

SET SHOWPLAN_TEXT OFF
GO

Here is the plan
  |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Expr1009], [Expr1010]))
       |--Compute Scalar(DEFINE:([Expr1008]=LikeRangeStart(CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0)), [Expr1009]=LikeRangeEnd(CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0)), [Expr1010]=LikeRangeInfo(CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0))))
       |    |--Constant Scan
       |--Index Seek(OBJECT:([Performance].[dbo].[TestChar].[ix_test]), SEEK:([Performance].[dbo].[TestChar].[SomeCol] > [Expr1008] AND [Performance].[dbo].[TestChar].[SomeCol] < [Expr1009]),  WHERE:([Performance].[dbo].[TestChar].[SomeCol] like CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0)) ORDERED FORWARD)
As you can see, there is now an implicit conversion going on.
In order to get rid of the conversion, use the correct data type for the variable:



SET SHOWPLAN_TEXT ON
GO
DECLARE @v nvarchar(10) = '0123456789'

SELECT * FROM TestChar WHERE SomeCol LIKE  @v +'%'
GO

SET SHOWPLAN_TEXT OFF
GO

  |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Expr1009], [Expr1010]))
       |--Compute Scalar(DEFINE:([Expr1008]=LikeRangeStart([@v]+N'%'), [Expr1009]=LikeRangeEnd([@v]+N'%'), [Expr1010]=LikeRangeInfo([@v]+N'%')))
       |    |--Constant Scan
       |--Index Seek(OBJECT:([Performance].[dbo].[TestChar].[ix_test]), SEEK:([Performance].[dbo].[TestChar].[SomeCol] > [Expr1008] AND [Performance].[dbo].[TestChar].[SomeCol] < [Expr1009]),  WHERE:([Performance].[dbo].[TestChar].[SomeCol] like [@v]+N'%') ORDERED FORWARD)



Implicit conversions were also an issue when ORMs first burst onto the scene. If you used NHibernate or LINQ to SQL with .NET, since strings in .NET are Unicode, all text would be sent over as Unicode and you would see all kinds of conversions.
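
Here is a minimal T-SQL sketch of that ORM behavior; the OrmDemo table is just an illustrative name, and the nvarchar variable stands in for the Unicode parameter an ORM would send:

CREATE TABLE OrmDemo (SomeCol varchar(10))
CREATE INDEX ix_ormdemo ON OrmDemo(SomeCol)
GO

DECLARE @p nvarchar(10) = N'0123456789' -- what a .NET string maps to by default
SELECT * FROM OrmDemo WHERE SomeCol = @p -- the varchar column gets converted to nvarchar, which can prevent a plain index seek
GO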


Using larger datatypes when it is not needed

I see this problem mostly with the integer data types. Below is a list of the integer data types together with their storage size and range


tinyint
Storage size is 1 byte. Integer data from 0 through 255.
smallint
Storage size is 2 bytes. Integer data from -2^15 (-32,768) through 2^15 - 1 (32,767).
int
Storage size is 4 bytes. Integer data from -2^31 (-2,147,483,648) through 2^31 - 1 (2,147,483,647).
bigint
Storage size is 8 bytes. Integer data from -2^63 (-9,223,372,036,854,775,808) through 2^63-1 (9,223,372,036,854,775,807).
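
A quick way to verify those storage sizes:

SELECT DATALENGTH(CAST(1 AS tinyint))  AS tinyint_bytes,  -- 1
       DATALENGTH(CAST(1 AS smallint)) AS smallint_bytes, -- 2
       DATALENGTH(CAST(1 AS int))      AS int_bytes,      -- 4
       DATALENGTH(CAST(1 AS bigint))   AS bigint_bytes    -- 8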


Now imagine Facebook, with a billion users, decided to use bigint as the CountryID in their Country table, and this key is then used as a foreign key in the user demographics table. This is wasteful: either use a smallint, since we won't go through 32 thousand countries in the foreseeable future, or use the 2 or 3 character ISO code. With a billion rows, the difference between a bigint and a smallint is 6 bytes per row, roughly 6 GB in the table alone, and again in every index that includes the column.

The problem is even worse if you have a compound 6 column key and it is used as a foreign key in tons of other tables... that was real fun to clean up... use a surrogate 1 column key in that case... but be sure to test... normalize till it hurts, then denormalize till it works. I will cover normalization in another post; I just wanted to mention it.



[1] I know there are also text and ntext, but those are deprecated

Friday, November 04, 2016

Are your foreign keys indexed? If not, you might have problems



When you add a primary key constraint to a table in SQL Server, an index will be created automatically. When you add a foreign key constraint, no index will be created. This might cause issues if you don't know that this is the behavior in SQL Server. Maybe there should be an option to automatically index foreign keys in SQL Server, what do you think?

The other day some deletes on a newer table in the test environment became really slow. We had a primary table with a couple of hundred rows, and we loaded between 200 and 300 million rows into the child table. Then we deleted the child rows; this was fast. After that, we deleted one row from the primary table and this took several seconds.

When I looked at this I noticed something interesting: most of the time during the delete was spent doing a lookup against the child table. Then I noticed that the foreign key was not indexed. After we added the index, the delete became thousands of times faster.

Let's try to replicate that behavior here

First create the primary table; we will add 2048 rows to this table


--Table that will have 2048 rows
CREATE TABLE Store(StoreID int not null,
  DateCreated datetime not null,
  StoreName varchar(500),
  constraint pk_store primary key   (StoreID))
GO

--insert 2048 rows
INSERT Store
SELECT ROW_NUMBER() OVER(ORDER BY t1.number) AS StoreID,DATEADD(dd,t1.number,'20161101') 
AS datecreated, NEWID()
FROM master..spt_values t1
WHERE t1.type =  'p'

Now create the child table and add 500K rows, this might take up to 1 minute to run since the rows are pretty wide.


-- table that will also have 500000 rows, fk will be indexed  
CREATE TABLE GoodsSold (TransactionID int not null,
   StoreID int not null,
   DateCreated datetime not null,
   SomeValue char(5000),
   constraint pk_transaction primary key   (TransactionID))

INSERT GoodsSold
SELECT top 500000 ROW_NUMBER() OVER(ORDER BY t1.number) AS TransactionID, t2.StoreID, 
DATEADD(dd,t1.number,'20161101') AS datecreated, REPLICATE('A',5000)
FROM master..spt_values t1
CROSS JOIN  Store t2
WHERE t1.type =  'p'


Now it is time to add the foreign key constraint and index this foreign key constraint

-- adding the foreign key
ALTER TABLE GoodsSold  WITH CHECK ADD  CONSTRAINT FK_StoreID FOREIGN KEY(StoreID)
REFERENCES Store(StoreID)
GO

ALTER TABLE GoodsSold CHECK CONSTRAINT FK_StoreID
GO

-- index the foreign key
CREATE index ix_StoreID on GoodsSold(StoreID)
GO


We will create another set of tables. Let's start with the primary table; we will just insert into it all the rows from the primary table we created earlier.

-- create another primary table
CREATE TABLE StoreFK(StoreID int not null,
  DateCreated datetime not null,
  StoreName varchar(500),
  constraint pk_storefk primary key   (StoreID))
GO


-- add the same 2048 rows from the primary table with indexed FK
INSERT StoreFK
SELECT * FROM Store
GO


For the child table it is the same deal: we will add all the rows from the child table we created earlier into this table.

-- Add another FK table
CREATE TABLE GoodsSoldFKNoIndex (TransactionID int not null,
   StoreID int not null,
   DateCreated datetime not null,
   SomeValue char(5000),
   constraint pk_transactionfk primary key   (TransactionID))

-- add same 500K rows from table with FK index
INSERT GoodsSoldFKNoIndex
SELECT * FROM GoodsSold

Let's add the foreign key constraint, but this time we are not indexing the foreign key constraint

-- add the FK but do not index this
ALTER TABLE GoodsSoldFKNoIndex  WITH CHECK ADD  CONSTRAINT FK_StoreID_FK 
FOREIGN KEY(StoreID)
REFERENCES StoreFK(StoreID)
GO

ALTER TABLE GoodsSoldFKNoIndex CHECK CONSTRAINT FK_StoreID_FK
GO

Let's make sure that the tables have the same number of rows

-- check that the tables have the same rows
exec sp_spaceused 'GoodsSold'
exec sp_spaceused 'GoodsSoldFKNoIndex'



name                 rows    reserved    data        index_size  unused
GoodsSold            500000  4024976 KB  4000000 KB  22520 KB    2456 KB
GoodsSoldFKNoIndex   500000  4015440 KB  4000000 KB  14936 KB    504 KB



Now that we are set up, let's wipe out all the rows from the child tables for a specific StoreID; the SELECT statements afterwards should return 0 rows.

DELETE GoodsSoldFKNoIndex
WHERE StoreID = 507


DELETE GoodsSold
WHERE StoreID = 507


SELECT * FROM GoodsSoldFKNoIndex
WHERE StoreID = 507

SELECT * FROM  GoodsSold
WHERE StoreID = 507


Now we are getting to the interesting part. Turn on Include Actual Execution Plan (CTRL + M), turn on statistics IO and time (as shown below), or run this in Plan Explorer.
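
To turn on the statistics output shown further down:

SET STATISTICS IO ON
SET STATISTICS TIME ON
GO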


DELETE Store
WHERE StoreID = 507


DELETE StoreFK
WHERE StoreID = 507

You will see something like this

So 75% compared to 25%; not good, but it doesn't look catastrophic. If you have statistics time on, you will see the following:

 SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 16 ms.


 SQL Server Execution Times:
   CPU time = 561 ms,  elapsed time = 575 ms.


Now it looks much worse

What about statistics io?

Table 'GoodsSold'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Store'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'GoodsSoldFKNoIndex'. Scan count 1, logical reads 501373, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'StoreFK'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

That is terrible....

Here is also the view from Plan Explorer, look at Est. CPU Cost and Reads




There you have it: not indexing foreign keys can have a big impact, even when the child table has no matching rows at all.
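
If you want to find foreign keys like this in your own database, here is a sketch of one way to do it; it is simplified in that it only looks at single-column foreign keys and only counts an index if the column is its leading key column:

SELECT OBJECT_NAME(fkc.parent_object_id) AS table_name,
       COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS column_name
FROM sys.foreign_key_columns fkc
WHERE NOT EXISTS (SELECT 1 FROM sys.index_columns ic
                  WHERE ic.object_id = fkc.parent_object_id
                  AND ic.column_id = fkc.parent_column_id
                  AND ic.key_ordinal = 1)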


See also When did SQL Server stop putting indexes on Foreign Key columns? by Kimberly Tripp

Sunday, August 17, 2008

Only In A Database Can You Get 1000% + Improvement By Changing A Few Lines Of Code

Take a look at this query.

SELECT * FROM
(
SELECT customer_id, 'MTD' AS record_type, COUNT(*), SUM(), AVG()
FROM payment_table
WHERE YEAR(payment_dt) = YEAR(GETDATE())
AND MONTH(payment_dt) = MONTH(GETDATE())
GROUP BY customer_id) MTD_payments
UNION ALL
(
SELECT customer_id, 'YTD' AS record_type, COUNT(*), SUM(), AVG()
FROM payment_table
WHERE YEAR(payment_dt) = YEAR(GETDATE())
GROUP BY customer_id) YTD_payments
UNION ALL
(
SELECT customer_id, 'LTD' AS record_type, COUNT(*), SUM(), AVG()
FROM payment_table) LTD_payments
) payments_report
ORDER BY customer_id, record_type



    Can you see the problem?

    A person had this query; it would run for over 24 hours. Wow, that is pretty bad. I don't think I had ever written something that ran over an hour, and the ones I did were mostly defragmentation and update statistics jobs.

    The problem is that the following piece of code

    where year(payment_dt) = year(getDate())
    and month(payment_dt) = month(getDate())

    is not sargable. First what does it mean to be sargable? A query is said to be sargable if the DBMS engine can take advantage of an index to speed up the execution of the query (using index seeks, not covering indexes). The term is derived from a contraction of Search ARGument Able.

    This query is not sargable because there is a function on the column; whenever you use a function on a column you will not get an index seek but an index scan. The difference between an index seek and an index scan can be explained like this: when searching for something in a book, if you go to the index in the back, find the page number, and go to that page, that is an index seek. If instead you go from page one to the last page and read all the words on all the pages to find what you need, that is an index scan. Do you see how much more expensive that is in terms of performance?

    Let's get back to the query. What can we do to make this piece of code use an index seek?
    where year(payment_dt) = year(getDate())
    and month(payment_dt) = month(getDate())

    You would change it to this:
    where payment_dt >= dateadd(mm, datediff(mm, 0, getdate())+0, 0)
    and payment_dt < dateadd(mm, datediff(mm, 0, getdate())+1, 0)
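
    The dateadd/datediff pair rounds the current date down to the start of a month. To see what the two expressions evaluate to (the first day of the current month and the first day of the next month):

    select dateadd(mm, datediff(mm, 0, getdate())+0, 0) as first_of_this_month,
           dateadd(mm, datediff(mm, 0, getdate())+1, 0) as first_of_next_month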

    You can see the complete question on the MSDN forum site here:
    http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=3746751&SiteID=1
    The person said that his query went from over 24 hours to 36 seconds. Wow!! That is very significant. Hardware cannot help you out if you have bad queries like that.

    The same exact day I answered a very similar question, take a look here: http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=3752248&SiteID=1

    The person had this

    AND DATEDIFF(d, '08/10/2008', DateCreated) >= 0

    AND DATEDIFF(d, DateCreated, '08/15/2008') >= 0


    I told him to change it to this

    AND DateCreated >= '08/10/2008'
    and DateCreated <= '08/15/2008'


    And that solved that query. If you are interested in some more performance tips, I have written some Query Optimization items on the LessThanDot Wiki. Below are some direct links:

    Case Sensitive Search
    No Functions on Left Side of Operator
    Query Optimizations With Dates
    Optimization: Set Nocount On
    No Math In Where Clause
    Don't Use (select *), but List Columns


    If you are interested in some blog posts about dates, take a look at these two which I wrote earlier:
    How Are Dates Stored In SQL Server?
    Do You Know How Between Works With Dates?

    Sunday, December 16, 2007

    EXISTS or COUNT(*)

    Do you use this

    IF (SELECT COUNT(*) FROM SomeTable
    WHERE SomeColumn = SomeValue ) > 0

    Or do you use this

    IF EXISTS (SELECT * FROM SomeTable WHERE SomeColumn = SomeValue )
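
    A quick way to compare the two for yourself is to look at the plans and the I/O for both forms (a sketch reusing the CaseSensitiveSearch table from the sargable post above; any table with some duplicate values will do):

    SET STATISTICS IO ON
    GO

    IF EXISTS (SELECT * FROM CaseSensitiveSearch WHERE Val = 'ABCD')
        PRINT 'found'

    IF (SELECT COUNT(*) FROM CaseSensitiveSearch WHERE Val = 'ABCD') > 0
        PRINT 'found'
    GO

    SET STATISTICS IO OFF
    GO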

    If you answered COUNT(*) then maybe you should take a look at these two articles.

    Andrew Kelly has a nice post on SQLBlog
    http://sqlblog.com/blogs/andrew_kelly/archive/2007/12/15/exists-vs-count-the-battle-never-ends.aspx

    Matija Lah has a good post on his snaps & snippets blog
    http://milambda.blogspot.com/2006/10/exists-or-not-exists-that-is-question.html

    Monday, October 15, 2007

    Interview With Kalen Delaney About Inside Microsoft SQL Server 2005 Query Tuning and Optimization

    I am a big fan of Inside Microsoft SQL Server 2005: The Storage Engine, so when I saw that yet another book got published in this series I just had to check it out. Inside Microsoft SQL Server 2005: Query Tuning and Optimization is very well written, gets to the point, and gives you the answers that you need. I decided to contact Kalen to see if she would be willing to do an interview; by now you know that the answer is yes, of course ;-)


    The question-and-answer session with Kalen that follows was conducted via email.

    What is the audience for this book, is it the enterprise user or can a small department benefit from the tips in this book?
    Because this book deals with query tuning, anyone who writes SQL queries for SQL Server can benefit. Very little in the book is geared towards system tuning, so the size of the machine doesn’t really matter. Now of course, if you have a very small system with very small tables, you won’t get as much benefit out of tuning your queries. However, if you have any tables of more than a few thousand rows, and you do any joins, you will need to tune your queries. In addition, the issues of blocking and concurrency control can impact any system, no matter how small.

    What new technologies in SQL Server 2005 do you think are the most beneficial for performance?
    For very large databases, the best new technology is partitioning. For any size system, if you have had serious performance problems due to blocking, you might find a big performance benefit by using one of the snapshot-based isolation levels, but you really need to understand the resource costs that come along with the improved performance. For your individual queries, I think the new optimizer hints and query level recompiles can make a big difference. For indexes, the ability to add included columns to nonclustered indexes can give some of your hard-to-tune queries a major performance boost.

    What will a person who reads this book gain in terms of understanding how to performance tune a server?
    The focus of this book is not so much on tuning the server, but on tuning queries. There is more in Inside SQL Server 2005: The Storage Engine on server issues such as memory and processor management. The biggest server wide issues are covered in Chapter 5, when I talk about managing the plan cache, and how and when query plans are reused.

    Is the book geared towards a beginner/intermediate level user or do you have to be an advanced user to really utilize the information in this book?
    The book is not geared towards beginners, but everyone should be able to get something out of it. It’s primarily geared to SQL Server developers and DBA’s who have been working with SQL Server for a while, and have encountered performance problems that they are trying to find solutions for.

    With all the changes in SQL Server 2005, how critical has the tempdb become in regards to performance?
    Tempdb has always been important. In SQL Server 2005, if you are using one of the snapshot-based isolation levels, you are going to have to be more aware of the demands placed on tempdb, both in the sizing requirements and the additional I/O resources needed. Fortunately, SQL Server 2005 provides tools to monitor tempdb, including a dozen new performance monitor counters, and a dynamic management view, sys.dm_db_file_space_usage, that keeps track of how much space in tempdb is being used for each of the different kinds of objects stored in tempdb.

    I understand that this is the first time you wrote with a team of other writers; can you tell us something about that experience?
    I initially thought that not having to write the whole volume by myself meant that I could get it done sooner, but that was not the case. Everyone had their own schedule and their own way of writing. The personal aspect of working with the other authors was great. I deeply respect all of the others and it was an honor to be working so closely with them. I had some concerns about the depth of coverage and I wondered whether all the chapters would end up being as deep as I hoped for, but that turned out not to be a major problem. The only real issues were agreeing on a common terminology and coding style, and even that wasn’t that big of an issue, because I got to do a final editing pass on everyone’s chapters.

    What SQL Server books are on your bookshelf?
    All of the Inside SQL Server books are there, of course, and all of Ken Henderson’s books. Bob Beauchemin’s book is in my car, to read while I am waiting for the ferryboat, and while on the ferry. I also have technical books that aren’t SQL Server specific, such as Jim Gray’s Transaction Processing, Russinovich’s and Solomon’s Windows Internals, Chris Date’s Introduction to Database Systems and Mike Stonebraker’s Readings in Database Systems.

    Why do you write technical books?
    I love working with SQL Server and trying to find out all I can about it. When I found that I could explain difficult concepts in a training environment, I thought I could do the same thing in a written format, and reach more people that way. I have always loved explaining things, ever since I was a teaching assistant for High School Math.

    Will you be updating your books for SQL Server 2008?
    I have just started meeting with my editors at Microsoft Press about SQL Server 2008, and it looks like a revision is in the plans. We’re really looking at it as just a revision, with the same structures as the current books, with straightforward changes and the inclusion of new features.

    Name three things that are new in SQL Server 2005 that you find are the most valuable?
    Dynamic Management Views, Dynamic Management Views, and Dynamic Management Views!
    Oh, you wanted three different things? ;-) How about XML query plans and optimization hints. (I’m also very fond of many of the new TSQL constructs, but I was only talking about things that I cover in my new book.)

    Name three things which are coming in SQL Server 2008 that you are most excited about?
    You’ll have to ask me this next time. I have actually been avoiding SQL Server 2008 while I was getting my Query Tuning and Optimization book finished, because I didn’t want to get distracted.

    Can you list any third party tools that you find useful to have as a SQL Server developer/admin?
    I’ve tried a few other products, but usefully I find that it is much easier to just stick with the Microsoft line and use the tools provided with the product.

    Name some of your favorite non-technical books.
    Oh, I love to read. It would be impossible to list my favorite books, but I can tell you my favorite authors, most of whom write science fiction: Lois McMaster Bujold, Ursula LeGuin, Sheri Tepper, Orson Scott Card, Octavia Butler, Elizabeth Moon. I also love to read historical fiction like Leon Uris.