SQL Server Code,Tips and Tricks, Performance Tuning

Tuesday, June 11, 2019

Can adding an index make a non SARGable query SARGable?

This question came up the other day from a co-worker, he said he couldn't change a query but was there a way of making the same query produce a better plan by doing something else perhaps (magic?)

He said his query had a WHERE clause that looked like the following

WHERE RIGHT(SomeColumn,3) = '333'

I then asked if he could change the table, his answer was that he couldn't mess around with the current columns but he could add a column

Ok, that got me thinking about a solution, let's see what I came up with

First create the following table

USE tempdb
GO


CREATE TABLE StagingData (SomeColumn varchar(255) NOT NULL )

ALTER TABLE dbo.StagingData ADD CONSTRAINT
 PK_StagingData PRIMARY KEY CLUSTERED 
 (
 SomeColumn
 )  ON [PRIMARY]

GO

We will create some fake data by appending a dot and a number between 100 and 999 to a GUID

Let's insert one row so that you can see what the data will look like

DECLARE @guid uniqueidentifier
SELECT @guid = 'DEADBEEF-DEAD-BEEF-DEAD-BEEF00000075' 

INSERT StagingData
SELECT CONVERT(varchar(200),@guid) + '.100'

SELECT * FROM StagingData

Output

SomeColumn
--------------------------------
DEADBEEF-DEAD-BEEF-DEAD-BEEF00000075.100

Time to insert 999,999 rows

Here is what the code looks like

INSERT StagingData
SELECT top 999999 CONVERT(varchar(200),NEWID()) 
 +  '.' 
 + CONVERT(VARCHAR(10),s2.number)
FROM master..SPT_VALUES s1
CROSS JOIN master..SPT_VALUES s2
WHERE s1.type = 'P'
AND s2.type = 'P'
and s1.number between 100 and 999
and s2.number between 100 and 999

With that completed we should now have one million rows

If we run our query to look for rows where the last 3 characters are 333 we can see that we get a scan

SET STATISTICS IO ON
GO

SELECT SomeColumn FROM StagingData
WHERE RIGHT(SomeColumn,3) = '333'


SET STATISTICS IO OFF
GO

(900 rows affected)
Table 'StagingData'. Scan count 1, logical reads 5404, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

We get 900 rows back and 5404 reads

Here is what the execution plan looks like

If we always query for the last 3 characters, what we can do is add a computed column to the table that just contains the last 3 characters and then add a nonclustered index to that column

That code looks like this

ALTER TABLE StagingData ADD RightChar as RIGHT(SomeColumn,3)
GO


CREATE INDEX ix_RightChar on StagingData(RightChar)
GO

Now let's check what we get when we use this new column

SET STATISTICS IO ON
GO

SELECT SomeColumn  FROM StagingData
WHERE RightChar  = '333'


SET STATISTICS IO OFF
GO

(900 rows affected)
Table 'StagingData'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The reads went from 5404 to 10, that is a massive improvement, here is what the execution plan looks like

However there is a small problem.....

We said we would not modify the query...

What happens if we execute the same query from before? Can the SQL Server optimizer recognize that our new column and index is pretty much the same as the WHERE clause?

SET STATISTICS IO ON
GO

SELECT SomeColumn FROM StagingData
WHERE RIGHT(SomeColumn,3) = '333'


SET STATISTICS IO OFF
GO


(900 rows affected)

Table 'StagingData'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.



Damn right, the optimizer can, , there it is, it uses the new index and column although we specify the original column..... (must be all that AI built in... (just kidding))



If you look at the execution plan, you can see it is indeed a seek

So there you have it.. sometimes, you can't change the query, you can't mess around with existing column but you can add a column to the table, in this case a technique like the following can be beneficial

PS

Betteridge's law of headlines is an adage that states: "Any headline that ends in a question mark can be answered by the word no." It is named after Ian Betteridge, a British technology journalist who wrote about it in 2009

In this case as you can plainly see...this is not true :-) The answer to "Can adding an index make a non SARGable query SARGable?" is clearly yes

Wednesday, April 24, 2019

How to count NULLS without using IS NULL in a WHERE clause

This came up the other day, someone wanted to know the percentage of NULL values in a column

Then I said "I bet you I can run that query without using a NULL in the WHERE clause, as a matter of fact, I can run that query without a WHERE clause at all!!"

The person then wanted to know more, so you know what that means.. it becomes a blog post :-)

BTW, the PostgreSQL version of this blog post can be found here: A quick and easy way to count the percentage of nulls without a where clause in PostgreSQL

To get started, first create this table and verify you have 9 rows

CREATE TABLE foo(bar int)
INSERT foo values(1),(null),(2),(3),(4),
 (null),(5),(6),(7)

SELECT * FROM foo

Here is what the output should be

bar
1
NULL
2
3
4
NULL
5
6
7

To get the NULL values and NON NULL values, you can do something like this

SELECT COUNT(*) as CountAll FROM foo WHERE bar IS NOT NULL
SELECT COUNT(*) as CountAll FROM foo WHERE bar IS  NULL

However, there is another way

Did you know that COUNT behaves differently if you use a column name compared to when you use *

Take a look

SELECT COUNT(*) as CountAll, 
  COUNT(bar) as CountColumn
FROM foo

If you ran that query, the result is the following

CountAll CountColumn
----------- -----------
9 7

Warning: Null value is eliminated by an aggregate or other SET operation.

And did you notice the warning? That came from the count against the column

Let's see what Books On Line has to say

COUNT(*) returns the number of items in a group. This includes NULL values and duplicates.

COUNT(ALL expression) evaluates expression for each row in a group, and returns the number of nonnull values.

COUNT(DISTINCT expression) evaluates expression for each row in a group, and returns the number of unique, nonnull values.

This is indeed documented behavior

So now, lets change our query to return the percentage of non null values in the column

SELECT COUNT(*) as CountAll, 
  COUNT(bar) as CountColumn, 
  (COUNT(bar)*1.0/COUNT(*))*100 as PercentageOfNonNullValues 
FROM foo

Here is the output

CountAll CountColumn percentageOfNonNullValues
----------- ----------- ---------------------------------------
9 7 77.777777777700

I just want to point out one thing, the reason I have this * 1.0 in the query

(COUNT(bar)*1.0/COUNT(*))*100

I am doing * 1.0 here because count returns an integer, so you will end up with integer math and the PercentageOfNonNullValues would be 0 instead of 77.7777...

That's it for this short post.. hopefully you knew this, if not, then you know it now :-)

Sunday, April 14, 2019

How to check if an Extended Event session exists before dropping it

Are you still running profiler or have you transferred to using Extended Events? I use Extended Events almost exclusively now because it's so much easier compared to using profiler or trace from T-SQL. Not to mentioned you can capture more things

The other day someone checked in some code and every now and then the build would fail with the error

Msg 15151, Level 16, State 19, Line 51
Cannot drop the event session 'ProcsExecutions', because it does not exist or you do not have permission.

I decided to take a look at the code and saw what the problem was. I will recreate the code here and then show you what needs to be changed. This post will not go into what Extended Events are, you can look that up in the SQL Server Extended Events documentation

Start by creating the Extended Event session by executing the following T-SQL

CREATE EVENT SESSION ProcsExecutions ON SERVER 
ADD EVENT sqlserver.rpc_completed(
    ACTION(sqlos.task_time,sqlserver.client_app_name,
 sqlserver.client_hostname,sqlserver.database_name,sqlserver.sql_text)
    WHERE ([sqlserver].[database_name]=N'Test')
 ),
ADD EVENT sqlserver.rpc_starting(
    ACTION(sqlos.task_time,sqlserver.client_app_name,
 sqlserver.client_hostname,sqlserver.database_name,sqlserver.sql_text)
    WHERE ([sqlserver].[database_name]=N'Test')
  )

To start the Extended Event session from T-SQL, execute the following command

ALTER EVENT SESSION ProcsExecutions
   ON SERVER  
   STATE = START;

Below is what the code looked like that was checked in.

You can run it and it will execute without a problem

IF EXISTS (SELECT name FROM sys.dm_xe_sessions  WHERE name = 'ProcsExecutions')
DROP EVENT SESSION ProcsExecutions ON SERVER
GO


CREATE EVENT SESSION ProcsExecutions ON SERVER 
ADD EVENT sqlserver.rpc_completed(
    ACTION(sqlos.task_time,sqlserver.client_app_name,
 sqlserver.client_hostname,sqlserver.database_name,sqlserver.sql_text)
    WHERE ([sqlserver].[database_name]=N'Test')
 ),
ADD EVENT sqlserver.rpc_starting(
    ACTION(sqlos.task_time,sqlserver.client_app_name,
 sqlserver.client_hostname,sqlserver.database_name,sqlserver.sql_text)
    WHERE ([sqlserver].[database_name]=N'Test')
  )


  ALTER EVENT SESSION ProcsExecutions
   ON SERVER  
   STATE = START; 
   GO

However if you run the following command now

 ALTER EVENT SESSION ProcsExecutions
   ON SERVER  
   STATE = STOP;

And then execute the same create Extended Event T-SQL Query again

IF EXISTS (SELECT name FROM sys.dm_xe_sessions  WHERE name = 'ProcsExecutions')
DROP EVENT SESSION ProcsExecutions ON SERVER
GO


CREATE EVENT SESSION ProcsExecutions ON SERVER 
ADD EVENT sqlserver.rpc_completed(
    ACTION(sqlos.task_time,sqlserver.client_app_name,
 sqlserver.client_hostname,sqlserver.database_name,sqlserver.sql_text)
    WHERE ([sqlserver].[database_name]=N'Test')
 ),
ADD EVENT sqlserver.rpc_starting(
    ACTION(sqlos.task_time,sqlserver.client_app_name,
 sqlserver.client_hostname,sqlserver.database_name,sqlserver.sql_text)
    WHERE ([sqlserver].[database_name]=N'Test')
  )


 ALTER EVENT SESSION ProcsExecutions
   ON SERVER  
   STATE = START; 
   GO

You will get the error

Msg 25631, Level 16, State 1, Line 29
The event session, "ProcsExecutions", already exists. Choose a unique name for the event session.

So why is that? There are 2 DMV that exist sys.dm_xe_sessions and sys.server_event_sessions. The DMV sys.dm_xe_sessions only returns a row for Extended Event sessions that are in the running state, the DMV sys.server_event_sessions will return a row even if the Extended Event session is not currently running

Lets' take a look at what that looks like by running some queries and commands

First we are going to stop the session and then query the sys.dm_xe_sessions DMV

--Stop the session if is running
 ALTER EVENT SESSION ProcsExecutions
   ON SERVER  
   STATE = STOP; 
   GO

-- this query returns only the running Extended Event sessions
SELECT dxs.name,
dxs.create_time,*
FROM sys.dm_xe_sessions AS dxs;

Output
-----------------
hkenginexesession
system_health
sp_server_diagnostics session
telemetry_xevents

As you can see our Extended Event session is not returned because it is not in a running state

Now lets's query the sys.server_event_sessions DMV and check if our Extended Event session is returned

-- this query returns also Extended Event sessions that are not currently running
 SELECT *
 FROM sys.server_event_sessions

Output
-----------------
system_health
AlwaysOn_health
telemetry_xevents
ProcsExecutions

As you can see our Extended Event session is returned even though it is not in a running state

If we now start the session again and then check the sys.dm_xe_sessions DMV, we will get back out session

-- start the session again
   ALTER EVENT SESSION ProcsExecutions
   ON SERVER  
   STATE = START; 
   GO


SELECT dxs.name,
dxs.create_time,*
FROM sys.dm_xe_sessions AS dxs;

Output
-----------------
hkenginexesession
system_health
sp_server_diagnostics session
telemetry_xevents
ProcsExecutions

So now our Extended Event session is returned because it is in a running state

Instead of this query to check for the existence

IF EXISTS (SELECT name FROM sys.dm_xe_sessions  WHERE name = 'ProcsExecutions')
DROP EVENT SESSION ProcsExecutions ON SERVER
GO

What we really want is this

IF EXISTS (SELECT name FROM sys.server_event_sessions  WHERE name = 'ProcsExecutions')
DROP EVENT SESSION ProcsExecutions ON SERVER
GO

So basically we change the dmv from sys.dm_xe_sessions to sys.server_event_sessions in IF EXISTS check

So it is a pretty easy change, just swap out the DMV

If you want to stop a session if it is running, you can go ahead and implement something like this

 IF EXISTS (SELECT name FROM sys.dm_xe_sessions  WHERE name = 'ProcsExecutions')
 BEGIN
 PRINT 'The Session Was Running'

 ALTER EVENT SESSION ProcsExecutions
   ON SERVER  
   STATE = STOP; 
END

IF NOT EXISTS (SELECT name FROM sys.dm_xe_sessions  WHERE name = 'ProcsExecutions')
 PRINT 'The Session is NOT Running'

That's all for this post, hopefully it will be useful to someone

Saturday, April 13, 2019

How to improve your tech skills

Today we are going to look at how to improve your tech skills. This really is a continuation of the Stay relevant and marketable post from a couple of weeks ago. Here are some things that you can do to improve your tech skills

Attend usergroups

Attend your local usergroup meetings, there is always some expert that comes to do presentations.

Answer questions

I still think answering questions is one of the best ways to improve your skill. Join a QA site like stackoverflow, head on to a specialized site on stackexchange, here is a list of all of them http://stackexchange.com/sites

If you are not comfortable with answering yet or if you realize that the questions are too difficult, don't worry about, just start by lurking. What you will find out over time is that every month you will be able to answer more and more of these question. This is because the questions are pretty much the same but some little detail might be different. After a while you will notice that there will be very few questions that you won't be able to answer in your field of expertise

Lunch and learns

No time you say to improve your skills, do you take lunch breaks? If so consider doing lunch and learns, get into a conference room, fire up the projector and then either look at code with the team, do design, watch videos, whatever floats your boat

Get involved with an open source project

A good way to improve your skills is to get involved with an open source project. Pick a project download it, then pick it apart. Start reading through the code, notice how things are done, ask yourself why it was done that way. Would you do it the same way? If you pick a big enough project, there will be many contributors, can you tell that the code was put together or does it pretty much look like it was written by one person. Are standards followed, how many design patterns are used

Read books, read code, read blogs

There are many classic list of books that every programmer should read
Here is just a small list that you can choose from, I grabbed this from stackoverflow

Code Complete (2nd edition) by Steve McConnell
The Pragmatic Programmer
Design Patterns by the Gang of Four
Refactoring: Improving the Design of Existing Code
The Mythical Man Month
Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin
CODE by Charles Petzold
Working Effectively with Legacy Code by Michael C. Feathers
Peopleware by Demarco and Lister
Coders at Work by Peter Seibel
Patterns of Enterprise Application Architecture by Martin Fowler
Test-Driven Development: By Example by Kent Beck
Practices of an Agile Developer
Don't Make Me Think
Agile Software Development, Principles, Patterns, and Practices by Robert C. Martin
Domain Driven Designs by Eric Evans
The Design of Everyday Things by Donald Norman
JavaScript - The Good Parts
Getting Real by 37 Signals
The Annotated Turing
Agile Principles, Patterns, and Practices in C# by Robert C. Martin
The Soul of a New Machine by Tracy Kidder
Here Comes Everybody: The Power of Organizing Without Organizations by Clay Shirky
Pragmatic Unit Testing in C# with NUnit by Andy Hunt and Dave Thomas with Matt Hargett
Rework by Jason Freid and DHH
JUnit in Action

Reading code is also a good way to improve your skills, head over to the frameworks you use the most and start digging around in the API, look at the example code.
Read blogs of subject expert, study their code and techniques, if something is not clear don't hesitate to leave a comment asking for some info or further explanation

Practice by doing katas

If you have ever done Karate you will know what a kata is, it is basically the practice of forms. A kata, or code kata, is defined as an exercise in programming which helps hone your skills through practice and repetition. Dave Thomas, started this movement for programming. You can find a list of awesome katas here: https://github.com/gamontal/awesome-katas

Blog

I found that blogging has been very good for my tech skills. It keeps me sharp and since I blog about new things it keeps my skill set up to date. When blogging, your readers will tell you when the code is wrong, so you have to make sure everything is tested and will run as shown in the post. Since you will have to do some research when writing these blog posts, your skills will improve and expand.
An added bonus is that I have a code library that I can access anytime I want.

Write a book

If you are a masochistic type of person then I recommend you write a book, almost everybody in the tech world that I know swore that they would never write a book again when they were done.......and yet they did. In order to write a book you have to spend a LOT of time doing research, making sure your code is correct and much more. Once you are done with this if you were not a subject expert you will be now. The worst part of writing a book is the initial feedback you get pointing out all your mistakes, if you are not thick skinned this could become a problem.

Listen to podcast, watch webinars

I get a lot of my tech info from podcasts, I like it better than listening to music at times and it makes the commute or run more enjoyable. The benefit is that you will learn something, you also might hear about some new shiny thing and then you will want to check it out when you get to the computer. There are many things I have learned from podcast, I also look forward to the next episode

Wednesday, March 20, 2019

Some numbers that you will know by heart if you have been working with SQL Server for a while

This is just a quick and fun post.

I was troubleshooting a deadlock the other day and it got me thinking.... I know the number 1205 by heart and know it is associated to a deadlock. What other numbers are there that you can associate to an event or object or limitation. For example 32767 will be known by a lot of people as the database id of the ResourceDb, master is 1, msdb is 4 etc etc.

So below is a list of numbers I thought of

Leave me a comment with any numbers that you know by heart

BTW I didn't do the limits for int, smallint etc etc, those are the same in all programming languages...so not unique to SQL Server

-1
You use -1 with DBCC TRACESTATUS to see what trace flags are enabled on your system

For example on a brand new instance, I turned on these 3 trace flags, then when I check tracestatus, I get them back in the output

DBCC TRACEON (3605,1204,1222,-1)
DBCC TRACESTATUS(-1)

TraceFlag Status Global Session
1204 1 1 0
1222 1 1 0
3605 1 1 0

1
You can only have 1 clustered index per table. This is also a favorite interview question, asking people to explain why there can only be 1 clustered index

3
The smallest fraction second number in a datetime datatype is 3

Fractions of a seconds are rounded to increments of .000, .003, or .007 seconds

This means the value after 000 midnight is .003 seconds

Take a look at this

DECLARE @d DATETIME = '2019-03-19 23:59:59.997'

SELECT @d AS orig,
 dateadd(ms,1,@d) AS orig1ms,
 dateadd(ms,2,@d) AS orig2ms,
 dateadd(ms,3,@d) AS orig3ms,
 dateadd(ms,4,@d) AS orig4ms,
 dateadd(ms,5,@d) AS orig5ms

Output

2019-03-19 23:59:59.997
2019-03-19 23:59:59.997
2019-03-20 00:00:00.000
2019-03-20 00:00:00.000
2019-03-20 00:00:00.000
2019-03-20 00:00:00.003

This is also the reason you will see datetimes in queries ending in the following values for the time portion '23:59:59.997'. It will mostly be used with BETWEEN

For example

SELECT ...
FROM SomeTable
WHERE SomeDAte BETWEEN '2019-03-19' and '2019-03-19 23:59:59.997'

WHICH of course is the same as the query below

SELECT ...
FROM SomeTable
WHERE SomeDAte >='2019-03-19'
AND SomeDAte  < '2019-03-20'

But it's less typing to use between :-)

Another one with the number 3 is the /3GB flag you could set in the boot.ini file. In that case if you had a 32 bit 4 GB system, SQL Server could now use 3GB instead of only 2GB.... oh the good old times :-)

10
STATS = 10

When you script out a BACKUP or RESTORE command, it will by default use STATS =10, so every 10% you will get a message like below

10 percent processed.
20 percent processed.
30 percent processed.
40 percent processed.
50 percent processed.

For big databases, I like to use STATS = 1

15
If you have been using SQL Server for a while, you might see this in the error log

SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [D:\SomeFilename.ldf] in database
The OS file handle is 0x0000000000000950. The offset of the latest long I/O is: 0x000000545eb200

There are several reasons why this might happen

1. SQL Server is spawning more I/O requests than what the I/O disk subsystem could handle.

2 . There could be an issue with the I/O subsystem (or) driver/firmware issue (or) Misconfiguration in the I/O Subsystem (or) Compression is turned on, so the Disks are performing very slow and thus SQL Server is affected by this

3. Some other process on the system is saturating the disks with I/O requests. Common application includes AV Scan,System Backup Etc.

50
Session ids which are smaller than 50 are system... You would filter this out from sp_who2 to get all the user generated sessions (not always true I have seen mirroring spids being between 51 and 70 on one my servers)

These days you would use is_user_process instead

So instead of this query

SELECT *
FROM sys.dm_exec_sessions AS es WITH (NOLOCK)
WHERE es.session_id > 50

You would use this one

SELECT *
FROM sys.dm_exec_sessions AS es WITH (NOLOCK)
WHERE es.is_user_process = 1

99.999
The five nines.. everyone knows this number... a 99.999% uptime certification means that the service or application will only be down for approximately five minutes and 15 seconds every year.

100
The default for MAXRECURSION in a recursive CTE

128
Identifier length (name of table, column etc etc)

128 is plenty, I still remember some FoxPro databases where the length could not exceed 8, then you would end up with Addrln1 etc etc

Here is a repo script that will attempt to create a table where the name is 130 characters in length

DECLARE @Ident VARCHAR(150) = REPLICATE('A', 150)

DECLARE @sql VARCHAR(500)  = 'create table ' + @Ident +'(id int)'

EXEC( @sql )

And it blows up with the following error
Msg 103, Level 15, State 4, Line 1
The identifier that starts with 'AAAA.....AAA' is too long. Maximum length is 128.

300
Page Life Expectancy is 300 seconds, meaning SQL Server can only keep those pages in memory for 300 seconds after reading them. This number is quoted all over the place that indicates you have issues if you fall below that. Is the number 300 still correct? Start here https://www.sqlskills.com/blogs/paul/page-life-expectancy-isnt-what-you-think/

900
900 bytes for a clustered index. But then again if you have such a wide clustered index and several nonclustered indexes... good luck!

999
Nonclustered indexes you can have per table
I believe this number used to be 249 or 254 back in the day... but I guess it changed after that monstrosity sharepoint came into existence

1,000
Ah yes, who doesn't remember this number. It usually starts with someone saying that they don't see any job history for the job they created on the new server

Hmmm, you already know the answer don't you?
You go and open up SQL Agent-->Properties-->History
And what do you see?

Maximum job history log size: 1000
Maximum job history rows per job: 100

Ah yes..those nasty defaults

Or someone was evil and executed the proc sp_purge_jobhistory for your job :-)

1,205
Transaction (Process ID %d) was deadlocked on %.*ls resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

1,222
Use this trace flag to return the resources and types of locks that are participating in a deadlock and also the current command affected

1,433
The default port SQL Server is listening on

1,700
You can have 1,700 bytes for a nonclustered index.

2,100
Parameters per stored procedure or user-defined function

Tried that one as well...

declare @d varchar(max) = 'create procedure prtest  '

;with cte as (
select number from master..spt_values
where type = 'p'
union
select number + 2048 from master..spt_values
where type = 'p'
)


select top 2101 @d += '@i' + convert(varchar(10),number)  + ' int ,'
from  cte


select @d = left(@d, len(@d) -1) + 'as select 1 as Test'


exec(@d)

And here is the error

Msg 180, Level 15, State 1, Procedure prtest, Line 1
There are too many parameters in this CREATE PROCEDURE statement. The maximum number is 2100.

3,226
Oh your errorlog is full of messages like these?

Those are not really errors are they?

To stop logging all of your backup success entries to the error log, use traceflag 3226

3,605
Like I showed in the section for number -1, you would use traceflag 3605 alongside traceflags 1204 and 1222 to send deadlock information to the error log

4,096
Columns per SELECT statement. Really who has such a query?

Hmm, I just had to try that out

declare @d varchar(max) = 'select top 1 '


;with cte as (
select number from master..spt_values
where type = 'p'
union
select number + 2048 from master..spt_values
where type = 'p'
union all
select 4096)


select @d += 'name as [' + convert(varchar(10),number)  + '] ,'
from  cte

select @d = left(@d, len(@d) -1) + 'from sys.objects'

exec(@d)

(1 row(s) affected)
Msg 1056, Level 15, State 1, Line 1
The number of elements in the select list exceeds the maximum allowed number of 4096 elements.

Also 4096 is the default network packet size

4,199
Traceflag 4199 Enables query optimizer (QO) fixes released in SQL Server Cumulative Updates and Service Packs. See also the hint 'ENABLE_QUERY_OPTIMIZER_HOTFIXES' from SQL Server 2016 SP1 onwards

8,060
Bytes per page
The number of bytes a page can hold.. and also the number of bytes a row can hold (not taking into account row overflow data types)

15,000
Partitions per partitioned table or index
I think this used to be either 999 or 1000...don't remember exactly

32,767
This is database id of the ResourceDb

Monday, March 18, 2019

Stay relevant and marketable

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts again to see if anything has changed, I also wanted to see if I could add some additional info.

Whenever I interview people for a DB position, I always ask what the latest version is that they have used. I will also ask if they have used the version that is in beta now. More often than not I will get an answer that they are still running something that is two versions behind. Then I ask if they have installed the latest and greatest on their home computer. The answer to that question is usually no.

This to me is crazy, surely you can't be in this field just for the money, where is the passion for discovering new things? Imagine you are a person who makes tables, after a while you will become a master and there is not much more for you to learn. The curse and the blessing of technology is that it is changing, and it is changing rapidly. If you don't invest time after work, before work and over the weekend to discover new things, play with the newest versions and sharpen your skill you will become obsolete, there are many people like that, they fall apart during the interview process.

But I have no time to do all this additional work

People will complain that they don't have enough time to do these additional things. Here are some things you can do if you are short on time. If commuting by public transportation read a book or download the latest papers about the newest versions of the product and read them. If you workout try listening to podcast while doing your aerobic exercises. Perhaps you can set the speed on your player to be 30% faster, this way you can get more podcasts in the same amount of time.

Ask yourself what will help you advance, having up to date skill or being up to date with the latest episodes of Billions, Narcos or American Gods? Another option is to watch the shows on the train on a tablet and then do the tech stuff at home.

Attend an launch event or go to your local usergroup meeting, there are usually sessions on the latest and greatest versions of the software. Youtube also has tons of free sessions.

Here are the Amazon AWS re:Invent 2018 breakout sessions https://www.youtube.com/user/AmazonWebServices/playlists?shelf_id=33&view=50&sort=dd

The Microsoft Ignite sessions can be found here:
https://www.youtube.com/channel/UCrhJmfAGQ5K81XQ8_od1iTg

The Google Developer sessions can be found here:
https://www.youtube.com/user/GoogleDevelopers

Database related sessions: Azure SQL Data Warehouse, Azure SQL Database and SQL Server sessions from the Build 2018 conference

Do you eat lunch? If so, instead of going out for lunch, bring your lunch to the office. While eating your lunch at your desk, watch some of the sessions listed above or read some documentation or books about technologies that you are interested in. If you prefer to read and you want to learn about SQl Server Execution Plans? Head on over to Hugo Kornelis' SQL Server Execution Plan Reference site

Step outside your comfort zone

Try out some other things, become a polyglot programmer. If you are a Java developer, give Scala or Clojure a try. If you are a .NET developer then try out F#, IronPython or Boo. If you are a SQL Server guy why not start playing around with Oracle, PostgreSQL or perhaps even a flavor of NoSQL, take a look at MongoDB, CouchDB, Cassandra and other solutions.

Have you looked at Big Data? This is already getting big and it will only get bigger. Ever heard of Hadoop? No, heard of facebook? Sure you have, guess what, facebook claims to have the biggest Hadoop cluster, It is over 100 Petabytes and it grows by about half a Petabyte per day, and you thought your database was big :-)

Trends are cyclical, every 10 years or so something new and big comes along. In the late 90s this was data warehousing, OLAP cubes, dimensions, fact tables. Every company these days does some sort of data warehousing. Now it is big data, data science, AI and machine learning that is new and shiny. Take a look at some of that stuff

Finally have you looked at the cloud yet? If you have not... I beg you to take a look.

If you think that you need to have money to get started, this is not true

You can start with the free tiers to get some experience. All 3 major could providers offer free tiers

Here are the links to these free tiers in order of cloud vendor by market share

Amazon Webservices

https://aws.amazon.com/free/

Microsoft Azure

https://azure.microsoft.com/en-us/free/

Google Cloud

https://cloud.google.com/free/

That's it for this post.... happy learning.....

Saturday, March 16, 2019

Use sys.configurations to find features that are enabled but perhaps not used

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts again to see if anything has changed, I also wanted to see if I could add some additional info.

Today we are going to look at servers where everything is installed and enabled. Before we start this post let's look back in time a little. Before SQL Server 2005 came out when you installed SQL Server pretty much everything was turned on by default. This of course widened the attack vector against the database servers. With SQL Server 2005 pretty much everything is turned off and you have to turn the features on if you want to use them. Now sometimes some admins will just turn everything on because that way they don't have to deal with this later, these are also the same kind of people who insist that the account needs to be db_owner otherwise their code won't work.

To see what these features are and if they are turned off, you can use the following query.

SELECT name, value,value_in_use 
FROM sys.configurations
WHERE name IN (
'Agent XPs',
'SQL Mail XPs',
'Database Mail XPs',
'SMO and DMO XPs',
'Ole Automation Procedures',
'SQL Mail XPs',
'external scripts enabled',
'Web Assistant Procedures',
'xp_cmdshell',
'Ad Hoc Distributed Queries',
'hadoop connectivity',
'polybase enabled',
'Replication XPs',
'clr enabled')

Here is what the output looks like on my laptop

The difference between value and value_in_use is that value_in_use is what is currently used and value will be used next time the server is restarted. If you want to have the change take effect immediately then use RECONFIGURE

As you can see xp_cmdshell is turned on

Here is an example that will turn off xp_cmdshell

EXECUTE sp_configure 'show advanced options', 1
RECONFIGURE
GO
 
EXECUTE sp_configure 'xp_cmdshell', '0'
RECONFIGURE
GO
 
EXECUTE sp_configure 'show advanced options', 0
RECONFIGURE
GO

To enable a feature use the value 1, to disable a feature use the value 0

If you prefer the GUI, you can also use that, right click on the database server name in SSMS, select Facets, from the Facet drop down select Surface Area Configuration. You will see the following Surface Area Configuration properties

Here you can enable or disable the features you are interested in. You can also export these properties as a policy to use on other servers so that the features are the same on all your servers

Installing everything by default

Here I have mixed feelings myself about what to do. On one hand I don't like to install SSAS or SSRS if there is no need for it, on the other hand I don't feel like adding that stuff 6 months down the road if there is suddenly a need for it. If I do install it, I make sure it at least doesn't run by default but it is disabled.

There is no benefit in having SSAS, SSRS or SSIS running and using CPU cycles as well as RAM if nobody is using these services. If you do install it and nobody uses it, disable the services, you will have more RAM and CPU cycles for the SQL Server service available.

One more thing I want to mention is that I have run into something like this many times.... The Dev or Test server has everything enabled. We deploy to production and oops... CLR is not enabled (quick fix) or SSRS/SSAS is not installed (need to install it, will take longer)

So make sure before deploying that you check what is enabled and installed on the production box