Monday, December 30, 2019

Top 10 posts from the last decade

Out of focus Christmas tree

As we finish the tumultuous 2010s and are ready for the roaring 2020s, I decided to take a quick look at the ten most viewed posts from the past decade. Two of these posts were made posted before 2010

Without any fanfare, here is the list


10. Some cool SQL Server announcements SQL Graph, Adaptive Query Plan, CTP1 of SQL vNext, SQL Injection detection
This is my recap of the chalkboard session with the SQL Server team at the SQL Server PASS summit in Seattle.

09. Convert Millisecond To "hh:mm:ss" Format
A very old post showing you how to convert from milliseconds to "hh:mm:ss" format

08. Can adding an index make a non SARGable query SARGable?
A post showing you how adding an index can make a query use that index even though the index column doesn't match the query

07. A little less hate for: String or binary data would be truncated in table
Can you believe they actually managed to accomplish this during the past decade  :-)

06. Some numbers that you will know by heart if you have been working with SQL Server for a while
After working with SQL Server for a while, you should know most of these

05. Use T-SQL to create caveman graphs
One of the shortest post on this site, show you how you can make visually appealing output with a pipe symbol

04. Ten SQL Server Functions That You Hardly Use But Should
A post from 2007 showing some hardly used functions like NULLIF, PARSENAME and STUFF

03. Your lack of constraints is disturbing
A post showing the type of constraints available in SQL Server with examples

02. Five Ways To Return Values From Stored Procedures
A very old post that shows you five ways to return values from a stored proc

01. After 20+ years in IT .. I finally discovered this...
What can I say, read it and let me know if you knew this one....



Wednesday, October 30, 2019

SQLSTATE 4200 Error 666 and what to do.




This morning I was greeted by the following message from a job email

The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID 72059165481762816. Dropping and re-creating the index may resolve this; otherwise, use another clustering key. [SQLSTATE 42000] (Error 666)


Almost Halloween? check!
Error 666? check!
Ever seen this error before? no!


The job has a step that inserts into a bunch of tables... 
The table in question had a clustered index that was created without the UNIQUE property. When you create such an index, SQL Server will create a uniqueifier internally

This part is from the  CSS SQL Server Engineers blog post

A uniqueifier (or uniquifier as reported by SQL Server internal tools) has been used in the engine for a long time (since SQL Server 7.0), and even being known to many, referenced in books and blogs, The SQL Server documentation clearly states that you will not see it exposed externally in the engine (https://docs.microsoft.com/en-us/sql/relational-databases/sql-server-index-design-guide).


"If the clustered index is not created with the UNIQUE property, the Database Engine automatically adds a 4-byte uniqueifier column to the table. When it is required, the Database Engine automatically adds a uniqueifier value to a row to make each key unique. This column and its values are used internally and cannot be seen or accessed by users."

While it´s unlikely that you will face an issue related with uniqueifiers, the SQL Server team has seen rare cases where customer reaches the uniqueifier limit of 2,147,483,648, generating error 666.


Msg 666, Level 16, State 2, Line 1

The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID <PARTITIONID>. Dropping and re-creating the index may resolve this; otherwise, use another clustering key.


So I ran into this rare case :-(

How can you quickly find out what table and index name the error is complaining about?

You can use the following query, just change the partitionid to match the one from your error message


SELECT SCHEMA_NAME(o.schema_id) as SchemaName, 
  o.name as ObjectName, 
  i.name as IndexName, 
  p.partition_id as PartitionID
FROM sys.partitions p
JOIN sys.objects o on p.object_id = o.object_id
JOIN sys.indexes i on p.object_id = i.object_id
WHERE p.partition_id = 72059165481762816

After running the query, you will  now have the schema name, the table name and the index name. That is all you need to find the index, you can now drop and recreate it

In my case this table was not big at all... 5 million rows or so, but we do delete and insert a lot of data into this table many times a day.
Also we have rebuild jobs running, rebuild jobs do not reset the uniqifier (see also below about a change from the CSS SQL Server Engineers)


To fix this, all I had to do was drop the index and recreate the index (after filling out tickets and testing it on a lower environment first). 

DROP INDEX [IX_IndexName] ON [SchemaName].TableName] 
GO

CREATE CLUSTERED INDEX [IX_IndexName] ON [SchemaName].[TableName] 
(
 Col1 ASC,
 Col2 ASC,
 Col3 ASC
) ON [PRIMARY]
GO


After dropping and recreating the index.. the code that threw an error earlier did not throw an error anymore

Since my table only had 5 million rows or so.. this was not a big deal and completed in seconds. If you have a large table you might have to wait or think of a different approach


If you want to know more, check out this post by the CSS SQL Server Engineers Uniqueifier considerations and error 666

The interesting part is

As of February 2018, the design goal for the storage engine is to not reset uniqueifiers during REBUILDs. As such, rebuild of the index ideally would not reset uniquifiers and issue would continue to occur, while inserting new data with a key value for which the uniquifiers were exhausted. But current engine behavior is different for one specific case, if you use the statement ALTER INDEX ALL ON <TABLE> REBUILD WITH (ONLINE = ON), it will reset the uniqueifiers (across all version starting SQL Server 2005 to SQL Server 2017).

Important: This is something that is not documented and can change in future versions, so our recommendation is that you should review table design to avoid relying on it.




Edit.. it turns out I have seen this before and have even blogged about it  http://sqlservercode.blogspot.com/2017/06/having-fun-with-maxed-out-uniqifiers-on.html



Tuesday, June 11, 2019

Can adding an index make a non SARGable query SARGable?


This question came up the other day from a co-worker, he said he couldn't change a query but was there a way of making the same query produce a better plan by doing something else perhaps (magic?)

He said his query had a WHERE clause that looked like the following

WHERE RIGHT(SomeColumn,3) = '333'

I then asked if he could change the table, his answer was that he couldn't mess around with the current columns but he could add a column

Ok, that got me thinking about a solution, let's see what I came up with


First create the following table


USE tempdb
GO


CREATE TABLE StagingData (SomeColumn varchar(255) NOT NULL )

ALTER TABLE dbo.StagingData ADD CONSTRAINT
 PK_StagingData PRIMARY KEY CLUSTERED 
 (
 SomeColumn
 )  ON [PRIMARY]

GO



We will create some fake data by appending a dot and a number between 100 and 999 to a GUID

Let's insert one row so that you can see what the data will look like

DECLARE @guid uniqueidentifier
SELECT @guid = 'DEADBEEF-DEAD-BEEF-DEAD-BEEF00000075' 

INSERT StagingData
SELECT CONVERT(varchar(200),@guid) + '.100'

SELECT * FROM StagingData


Output

SomeColumn
--------------------------------
DEADBEEF-DEAD-BEEF-DEAD-BEEF00000075.100




Time to insert 999,999 rows

Here is what the code looks like

INSERT StagingData
SELECT top 999999 CONVERT(varchar(200),NEWID()) 
 +  '.' 
 + CONVERT(VARCHAR(10),s2.number)
FROM master..SPT_VALUES s1
CROSS JOIN master..SPT_VALUES s2
WHERE s1.type = 'P'
AND s2.type = 'P'
and s1.number between 100 and 999
and s2.number between 100 and 999




With that completed we should now have one million rows


If we run our query to look for rows where the last 3 characters are 333 we can see that we get a scan

SET STATISTICS IO ON
GO

SELECT SomeColumn FROM StagingData
WHERE RIGHT(SomeColumn,3) = '333'


SET STATISTICS IO OFF
GO




(900 rows affected)
Table 'StagingData'. Scan count 1, logical reads 5404, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

We get 900 rows back and 5404 reads


Here is what the execution plan looks like







If we always query for the last 3 characters, what we can do is add a computed column to the table that just contains the last 3 characters and then add a nonclustered index to that column

That code looks like this

ALTER TABLE StagingData ADD RightChar as RIGHT(SomeColumn,3)
GO


CREATE INDEX ix_RightChar on StagingData(RightChar)
GO


Now let's check what we get when we use this new column

SET STATISTICS IO ON
GO

SELECT SomeColumn  FROM StagingData
WHERE RightChar  = '333'


SET STATISTICS IO OFF
GO



(900 rows affected)
Table 'StagingData'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.


The reads went from 5404 to 10, that is a massive improvement, here is what the execution plan looks like



However there is a small problem.....

We said we would not modify the query...

What happens if we execute the same query from before?  Can the SQL Server optimizer recognize that our new column and index is pretty much the same as the WHERE clause?

SET STATISTICS IO ON
GO

SELECT SomeColumn FROM StagingData
WHERE RIGHT(SomeColumn,3) = '333'


SET STATISTICS IO OFF
GO

(900 rows affected)
Table 'StagingData'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Damn right, the optimizer can, , there it is, it uses the new index and column although we specify the original column..... (must be all that AI built in... (just kidding))
If you look at the execution plan, you can see it is indeed a seek

So there you have it.. sometimes, you can't change the query, you can't mess around with existing column but you can add a column to the table, in this case a technique like the following can be beneficial



PS

Betteridge's law of headlines is an adage that states: "Any headline that ends in a question mark can be answered by the word no." It is named after Ian Betteridge, a British technology journalist who wrote about it in 2009

In this case as you can plainly see...this is not true  :-) The answer to "Can adding an index make a non SARGable query SARGable?" is clearly yes