SQL Server Code,Tips and Tricks, Performance Tuning: SQL Server

Showing posts with label SQL Server. Show all posts

Monday, December 30, 2019

Wednesday, October 30, 2019

SQLSTATE 4200 Error 666 and what to do.

This morning I was greeted by the following message from a job email

The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID 72059165481762816. Dropping and re-creating the index may resolve this; otherwise, use another clustering key. [SQLSTATE 42000] (Error 666)

Almost Halloween? check!
Error 666? check!
Ever seen this error before? no!

The job has a step that inserts into a bunch of tables...
The table in question had a clustered index that was created without the UNIQUE property. When you create such an index, SQL Server will create a uniqueifier internally

This part is from the CSS SQL Server Engineers blog post

A uniqueifier (or uniquifier as reported by SQL Server internal tools) has been used in the engine for a long time (since SQL Server 7.0), and even being known to many, referenced in books and blogs, The SQL Server documentation clearly states that you will not see it exposed externally in the engine (https://docs.microsoft.com/en-us/sql/relational-databases/sql-server-index-design-guide).

"If the clustered index is not created with the UNIQUE property, the Database Engine automatically adds a 4-byte uniqueifier column to the table. When it is required, the Database Engine automatically adds a uniqueifier value to a row to make each key unique. This column and its values are used internally and cannot be seen or accessed by users."

While it´s unlikely that you will face an issue related with uniqueifiers, the SQL Server team has seen rare cases where customer reaches the uniqueifier limit of 2,147,483,648, generating error 666.

Msg 666, Level 16, State 2, Line 1

The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID <PARTITIONID>. Dropping and re-creating the index may resolve this; otherwise, use another clustering key.

So I ran into this rare case :-(

How can you quickly find out what table and index name the error is complaining about?

You can use the following query, just change the partitionid to match the one from your error message

SELECT SCHEMA_NAME(o.schema_id) as SchemaName, 
  o.name as ObjectName, 
  i.name as IndexName, 
  p.partition_id as PartitionID
FROM sys.partitions p
JOIN sys.objects o on p.object_id = o.object_id
JOIN sys.indexes i on p.object_id = i.object_id
WHERE p.partition_id = 72059165481762816

After running the query, you will now have the schema name, the table name and the index name. That is all you need to find the index, you can now drop and recreate it

In my case this table was not big at all... 5 million rows or so, but we do delete and insert a lot of data into this table many times a day.
Also we have rebuild jobs running, rebuild jobs do not reset the uniqifier (see also below about a change from the CSS SQL Server Engineers)

To fix this, all I had to do was drop the index and recreate the index (after filling out tickets and testing it on a lower environment first).

DROP INDEX [IX_IndexName] ON [SchemaName].TableName] 
GO

CREATE CLUSTERED INDEX [IX_IndexName] ON [SchemaName].[TableName] 
(
 Col1 ASC,
 Col2 ASC,
 Col3 ASC
) ON [PRIMARY]
GO

After dropping and recreating the index.. the code that threw an error earlier did not throw an error anymore

Since my table only had 5 million rows or so.. this was not a big deal and completed in seconds. If you have a large table you might have to wait or think of a different approach

If you want to know more, check out this post by the CSS SQL Server Engineers Uniqueifier considerations and error 666

The interesting part is

As of February 2018, the design goal for the storage engine is to not reset uniqueifiers during REBUILDs. As such, rebuild of the index ideally would not reset uniquifiers and issue would continue to occur, while inserting new data with a key value for which the uniquifiers were exhausted. But current engine behavior is different for one specific case, if you use the statement ALTER INDEX ALL ON <TABLE> REBUILD WITH (ONLINE = ON), it will reset the uniqueifiers (across all version starting SQL Server 2005 to SQL Server 2017).

Important: This is something that is not documented and can change in future versions, so our recommendation is that you should review table design to avoid relying on it.

Edit.. it turns out I have seen this before and have even blogged about it http://sqlservercode.blogspot.com/2017/06/having-fun-with-maxed-out-uniqifiers-on.html

Wednesday, March 20, 2019

Some numbers that you will know by heart if you have been working with SQL Server for a while

This is just a quick and fun post.

I was troubleshooting a deadlock the other day and it got me thinking.... I know the number 1205 by heart and know it is associated to a deadlock. What other numbers are there that you can associate to an event or object or limitation. For example 32767 will be known by a lot of people as the database id of the ResourceDb, master is 1, msdb is 4 etc etc.

So below is a list of numbers I thought of

Leave me a comment with any numbers that you know by heart

BTW I didn't do the limits for int, smallint etc etc, those are the same in all programming languages...so not unique to SQL Server

-1
You use -1 with DBCC TRACESTATUS to see what trace flags are enabled on your system

For example on a brand new instance, I turned on these 3 trace flags, then when I check tracestatus, I get them back in the output

DBCC TRACEON (3605,1204,1222,-1)
DBCC TRACESTATUS(-1)

TraceFlag Status Global Session
1204 1 1 0
1222 1 1 0
3605 1 1 0

1
You can only have 1 clustered index per table. This is also a favorite interview question, asking people to explain why there can only be 1 clustered index

3
The smallest fraction second number in a datetime datatype is 3

Fractions of a seconds are rounded to increments of .000, .003, or .007 seconds

This means the value after 000 midnight is .003 seconds

Take a look at this

DECLARE @d DATETIME = '2019-03-19 23:59:59.997'

SELECT @d AS orig,
 dateadd(ms,1,@d) AS orig1ms,
 dateadd(ms,2,@d) AS orig2ms,
 dateadd(ms,3,@d) AS orig3ms,
 dateadd(ms,4,@d) AS orig4ms,
 dateadd(ms,5,@d) AS orig5ms

Output

2019-03-19 23:59:59.997
2019-03-19 23:59:59.997
2019-03-20 00:00:00.000
2019-03-20 00:00:00.000
2019-03-20 00:00:00.000
2019-03-20 00:00:00.003

This is also the reason you will see datetimes in queries ending in the following values for the time portion '23:59:59.997'. It will mostly be used with BETWEEN

For example

SELECT ...
FROM SomeTable
WHERE SomeDAte BETWEEN '2019-03-19' and '2019-03-19 23:59:59.997'

WHICH of course is the same as the query below

SELECT ...
FROM SomeTable
WHERE SomeDAte >='2019-03-19'
AND SomeDAte  < '2019-03-20'

But it's less typing to use between :-)

Another one with the number 3 is the /3GB flag you could set in the boot.ini file. In that case if you had a 32 bit 4 GB system, SQL Server could now use 3GB instead of only 2GB.... oh the good old times :-)

10
STATS = 10

When you script out a BACKUP or RESTORE command, it will by default use STATS =10, so every 10% you will get a message like below

10 percent processed.
20 percent processed.
30 percent processed.
40 percent processed.
50 percent processed.

For big databases, I like to use STATS = 1

15
If you have been using SQL Server for a while, you might see this in the error log

SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [D:\SomeFilename.ldf] in database
The OS file handle is 0x0000000000000950. The offset of the latest long I/O is: 0x000000545eb200

There are several reasons why this might happen

1. SQL Server is spawning more I/O requests than what the I/O disk subsystem could handle.

2 . There could be an issue with the I/O subsystem (or) driver/firmware issue (or) Misconfiguration in the I/O Subsystem (or) Compression is turned on, so the Disks are performing very slow and thus SQL Server is affected by this

3. Some other process on the system is saturating the disks with I/O requests. Common application includes AV Scan,System Backup Etc.

50
Session ids which are smaller than 50 are system... You would filter this out from sp_who2 to get all the user generated sessions (not always true I have seen mirroring spids being between 51 and 70 on one my servers)

These days you would use is_user_process instead

So instead of this query

SELECT *
FROM sys.dm_exec_sessions AS es WITH (NOLOCK)
WHERE es.session_id > 50

You would use this one

SELECT *
FROM sys.dm_exec_sessions AS es WITH (NOLOCK)
WHERE es.is_user_process = 1

99.999
The five nines.. everyone knows this number... a 99.999% uptime certification means that the service or application will only be down for approximately five minutes and 15 seconds every year.

100
The default for MAXRECURSION in a recursive CTE

128
Identifier length (name of table, column etc etc)

128 is plenty, I still remember some FoxPro databases where the length could not exceed 8, then you would end up with Addrln1 etc etc

Here is a repo script that will attempt to create a table where the name is 130 characters in length

DECLARE @Ident VARCHAR(150) = REPLICATE('A', 150)

DECLARE @sql VARCHAR(500)  = 'create table ' + @Ident +'(id int)'

EXEC( @sql )

And it blows up with the following error
Msg 103, Level 15, State 4, Line 1
The identifier that starts with 'AAAA.....AAA' is too long. Maximum length is 128.

300
Page Life Expectancy is 300 seconds, meaning SQL Server can only keep those pages in memory for 300 seconds after reading them. This number is quoted all over the place that indicates you have issues if you fall below that. Is the number 300 still correct? Start here https://www.sqlskills.com/blogs/paul/page-life-expectancy-isnt-what-you-think/

900
900 bytes for a clustered index. But then again if you have such a wide clustered index and several nonclustered indexes... good luck!

999
Nonclustered indexes you can have per table
I believe this number used to be 249 or 254 back in the day... but I guess it changed after that monstrosity sharepoint came into existence

1,000
Ah yes, who doesn't remember this number. It usually starts with someone saying that they don't see any job history for the job they created on the new server

Hmmm, you already know the answer don't you?
You go and open up SQL Agent-->Properties-->History
And what do you see?

Maximum job history log size: 1000
Maximum job history rows per job: 100

Ah yes..those nasty defaults

Or someone was evil and executed the proc sp_purge_jobhistory for your job :-)

1,205
Transaction (Process ID %d) was deadlocked on %.*ls resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

1,222
Use this trace flag to return the resources and types of locks that are participating in a deadlock and also the current command affected

1,433
The default port SQL Server is listening on

1,700
You can have 1,700 bytes for a nonclustered index.

2,100
Parameters per stored procedure or user-defined function

Tried that one as well...

declare @d varchar(max) = 'create procedure prtest  '

;with cte as (
select number from master..spt_values
where type = 'p'
union
select number + 2048 from master..spt_values
where type = 'p'
)


select top 2101 @d += '@i' + convert(varchar(10),number)  + ' int ,'
from  cte


select @d = left(@d, len(@d) -1) + 'as select 1 as Test'


exec(@d)

And here is the error

Msg 180, Level 15, State 1, Procedure prtest, Line 1
There are too many parameters in this CREATE PROCEDURE statement. The maximum number is 2100.

3,226
Oh your errorlog is full of messages like these?

Those are not really errors are they?

To stop logging all of your backup success entries to the error log, use traceflag 3226

3,605
Like I showed in the section for number -1, you would use traceflag 3605 alongside traceflags 1204 and 1222 to send deadlock information to the error log

4,096
Columns per SELECT statement. Really who has such a query?

Hmm, I just had to try that out

declare @d varchar(max) = 'select top 1 '


;with cte as (
select number from master..spt_values
where type = 'p'
union
select number + 2048 from master..spt_values
where type = 'p'
union all
select 4096)


select @d += 'name as [' + convert(varchar(10),number)  + '] ,'
from  cte

select @d = left(@d, len(@d) -1) + 'from sys.objects'

exec(@d)

(1 row(s) affected)
Msg 1056, Level 15, State 1, Line 1
The number of elements in the select list exceeds the maximum allowed number of 4096 elements.

Also 4096 is the default network packet size

4,199
Traceflag 4199 Enables query optimizer (QO) fixes released in SQL Server Cumulative Updates and Service Packs. See also the hint 'ENABLE_QUERY_OPTIMIZER_HOTFIXES' from SQL Server 2016 SP1 onwards

8,060
Bytes per page
The number of bytes a page can hold.. and also the number of bytes a row can hold (not taking into account row overflow data types)

15,000
Partitions per partitioned table or index
I think this used to be either 999 or 1000...don't remember exactly

32,767
This is database id of the ResourceDb

Wednesday, May 16, 2018

Azure SQL Data Warehouse, Azure SQL Database and SQL Server sessions from the Build 2018 conference

The Build 2018 conference happened last week, I decided to put all the Azure SQL Data Warehouse, Azure SQL Database and SQL Server sessions from the Build 2018 conference into one place.

Here are the youtube videos, the descriptions as well as download links to video, audio and slides where available

Enjoy

Looking ahead with SQL Operations Studio

Join the SQL Operations Studio Engineering team as we highlight our journey towards our upcoming release announcement. We will cover major features delivered since our Public Preview announcement including developing our Extensibility story through the extension marketplace and extension API’s, improving dashboard experiences, and experimenting with deeper community involvement to help deliver this awesome cross-platform tool. You may even hear special announcements and key roadmap initiatives that have yet to be shared.

You can also download the sessions in different formats here:

MP3 (32.2 MB)
Low Quality MP4 (41.4 MB)
High Quality MP4 (87.9 MB)
Mid Quality MP4 (79.2 MB)

Query Processing Innovations for data intensive, modern applications

Learn how your application can benefit from new capabilities in the Azure SQL Database and SQL Server platform, including graph data processing to model complex relationships between objects, and advanced self-tuning query processing to solve or avoid performance related problems.

You can also download the sessions in different formats here:

MP3 (72.3 MB)
Low Quality MP4 (81.6 MB)
High Quality MP4 (207.6 MB)
Mid Quality MP4 (157.8 MB)

Migrating and modernizing your data estate to Azure with Data Migration Services

Come learn all about Database Migration Service, and how anyone can utilize this service to migrate and modernize your data estate to fully managed services in Azure

You can also download the sessions in different formats here:

MP3 (76.0 MB)
Low Quality MP4 (82.9 MB)
High Quality MP4 (208.0 MB)
Mid Quality MP4 (160.5 MB)

Azure SQL Database the intelligent database – Your database on Autopilot

Come learn how Azure SQL DB, the most intelligent cloud database, uses machine learning and best practices to ensure your database is always performing at its best. Come learn about features like Adaptive Query Processing, Autotuning and Performance Recommendations, to see how Azure SQL Database can help you spend more time developing applications and less time managing your databases.

You can also download the sessions in different formats here:

MP3 (70.6 MB)
Low Quality MP4 (80.8 MB)
High Quality MP4 (201.3 MB)
Mid Quality MP4 (155.6 MB)

Azure SQL DB Managed Instances - Built to easily modernize application data layer

Managed Instances is the latest fully managed deployment model for Azure SQL Database that enables friction-free migration for SQL Server applications running on-premises. Come and learn why Managed Instance is the best PaaS destination for all SQL Server workloads and how to start your cloud modernization at scale now, using Azure SQL Database Managed Instances.

You can also download the sessions in different formats here:

MP3 (72.8 MB)
Low Quality MP4 (83.6 MB)
High Quality MP4 (207.9 MB)
Mid Quality MP4 (158.2 MB)

Build Intelligent Apps with the Microsoft Data & AI Platform

Description Join Rohan Kumar, Corporate Vice President of Data Platform, to learn how Microsoft provides the most comprehensive data platform for your modern, intelligent applications. Come see our latest innovations that enable you to easily modernize and provide new, differentiated experiences in your applications. Hear from customers like dv01, Finastra, Jet.com and Rubikloud as they share their stories of how they’re each transforming their industries by leveraging Microsoft’s data platform.

You can also download the sessions in different formats here:

MP3 (73.4 MB)
Low Quality MP4 (97.1 MB)
High Quality MP4 (286.7 MB)
Mid Quality MP4 (209.8 MB)

Develop scalable analytical solutions with Azure Data Factory & Azure SQL Data Warehouse

In this session you will learn how to develop data pipelines in Azure Data Factory and build a Cloud-based analytical solution adopting modern data warehouse approaches with Azure SQL Data Warehouse and implementing incremental ETL orchestration at scale. With the multiple sources and types of data available in an enterprise today Azure Data factory enables full integration of data and enables direct storage in Azure SQL Data Warehouse for powerful and high-performance query workloads which drive a majority of enterprise applications and business intelligence applications.

You can also download the sessions in different formats here:

MP3 (55.1 MB)
Low Quality MP4 (61.7 MB)
High Quality MP4 (151.2 MB)
Mid Quality MP4 (116.3 MB)

Slides
View Slides Online

Enhancing DevOps with SQL Server on Linux + containers

SQL Server 2017 runs now on both Linux and containers. In this session you will learn how SQL Server 2017 supports Continuous Deployment/Continuous Integration (CD/CI) and other DevOps processes. You will also learn the latest on cross-platform tooling, APIs, and how to develop apps with SQL Server running anywhere. We will also demo how easy it is to use SQL Server in your modern database development workflows.

You can also download the sessions in different formats here:

MP3 (58.7 MB)
Low Quality MP4 (64.1 MB)
High Quality MP4 (162.3 MB)
Mid Quality MP4 (121.0 MB)

Modernizing Mission-Critical Apps with SQL Server

Learn about the features that can help you modernize your mission critical applications, where security and performance can go hand in hand. From the wide range of SQL Server features available, we will take a closer look at In-Memory performance, Automatic Tuning, Advanced Security Features like Always Encrypted, Polybase and integration with Machine Learning through R and Python.

You can also download the sessions in different formats here:

MP3 (70.1 MB)
Low Quality MP4 (82.2 MB)
High Quality MP4 (231.5 MB)
Mid Quality MP4 (169.6 MB)

Slides
View Slides Online

Securing your data with Azure SQL DB

Come learn about new security features like Vulnerability Assessment, Information Protection, Thread Detection and Always Encrypt to see how Azure SQL Database is securing your data in the most secure database on the planet.

You can also download the sessions in different formats here:

MP3 (73.3 MB)
Low Quality MP4 (85.5 MB)
High Quality MP4 (238.8 MB)
Mid Quality MP4 (178.0 MB)

Slides
View Slides Online

Monday, April 24, 2017

How to shut down SQL Server

The question how to shutdown SQL Server came up at work last week, I figured it would be a nice topic for the newer DBAs and SQL Server programmers

Note: I assume SQL Server is not part of a clustered environment here, if your SQL Server instance is part of a cluster, then use the failover cluster manager instead!

There are a bunch of way to shut down SQL Server, here is how you can do it

Configuration Manager

To shut down SQL Server for the SQL Server Configuration Manager, navigate to your instance, right click and select Stop

SSMS Object Explorer

The process from SSMS is very similar to Configuration Manager, right click on the instance and select Stop

Shutdown command

From SSMS open a new query window and type shutdown or shutdown with nowait

Unless the WITHNOWAIT option is used, SHUTDOWN shuts down SQL Server by:

Disabling logins (except for members of the sysadmin and serveradmin fixed server roles).
Waiting for currently running Transact-SQL statements or stored procedures to finish. To display a list of all active processes and locks, run sp_who and sp_lock, respectively.
Inserting a checkpoint in every database.

That seems kinda dangerous can someone pass that command to a proc by masking it?

declare @d varchar(30)= reverse('tiawon htiw nwodtuhs')
exec(@d)

The SHUTDOWN statement cannot be executed within a transaction or by a stored procedure. Msg 0, Level 11, State 0, Line 3 A severe error occurred on the current command. The results, if any, should be discarded.

That is good to know :-)

Command prompt

Open a command prompt as administrator and type net stop MSSQLSERVER

Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. All rights reserved.

C:\WINDOWS\system32>net stop MSSQLSERVER
The SQL Server (MSSQLSERVER) service is stopping.
The SQL Server (MSSQLSERVER) service was stopped successfully.


C:\WINDOWS\system32>net start MSSQLSERVER
The SQL Server (MSSQLSERVER) service is starting.
The SQL Server (MSSQLSERVER) service was started successfully.


C:\WINDOWS\system32>

To start SQL Server again, change net stop to net start

Below are a couple of more ways but you should really not be using those!!

Task Manager

Yes, you can kill SQL Server from task manager

Taskkill

You can use the plain old DOS command taskkill as well

Shutdown your server
That is one way of course but you might get a bunch of people yelling at you

Unplug the power to your server after making sure you have no UPS battery backup plugged in
Great way to test recovery... .. sarcasm people.....

Shutdown the data center
You laugh but this actually happened to me once, someone did some work and shut down the whole development part of the data canter. We had SQL Server 2000 running on NT 4, it was not restarted in 6 years (meaning wasn't patched either) It was an isolated box, internal ip, couldn't be reached from outside the dev network

There you have all the ways that you can use to shut down SQL Server

Friday, December 09, 2016

Connecting to SQL Server when your default database is not available

One of our database on the development went in suspect mode today. This database was the default for a bunch of logins. These people could not login now. Someone needed to use a different database but he couldn’t login because the database that was in suspect mode was the default database for the login he was using.

I told this person to click on the Options button in the connection dialog and specify another database. I guess there was an misunderstanding because this person couldn’t get it to work. This means it is time for a blog post.

Let's take a look how this all works

Here is a script that will create 2 databases

CREATE DATABASE Good
GO
 
CREATE DATABASE OopsBad
GO

Now create a new login named TestLogin with a password of Test. We are also adding the login we just created to the OopsBad database and we will make the login part of the db_owner role

USE [master]
GO
CREATE LOGIN [TestLogin] WITH PASSWORD=N'Test', DEFAULT_DATABASE=[OopsBad]
USE [OopsBad]
GO
CREATE USER [TestLogin] FOR LOGIN [TestLogin]
GO
USE [OopsBad]
GO
ALTER ROLE [db_owner] ADD MEMBER [TestLogin]
GO

Add the login we just created to the Good database as well and make the login part of the db_owner role

USE [Good]
GO
CREATE USER [TestLogin] FOR LOGIN [TestLogin]
GO
USE [Good]
GO
ALTER ROLE [db_owner] ADD MEMBER [TestLogin]
GO

Make sure that you can login with the TestLogin account

Now that you know that you can login with the TestLogin account, use another account and put the OopsBad database in offline mode

Take the database online

ALTER DATABASE OopsBad SET OFFLINE

Now if you try to login with the TestLogin account, you will see the following error

Here is what you need to do, on the connect to server window, click on the Options button

One the next screen, click on the Connection Properties tab, do not click on Browse server... from the drop down. If you click that you might have to wait for a long time

Instead just type in the database name, in our case the database name is Good

Click on Connect and you should be in.... That's it, pretty simple, you just have to know where to find it.

Wednesday, November 16, 2016

SQL Server 2016 SP1 released, SQL Server vNext available for download

Today Microsoft announced the CTP of the next version of SQL Server, you can download it here https://www.microsoft.com/evalcenter/evaluate-sql-server-vnext-ctp

As you can see I am already downloading this version

What's New in SQL Server vNext https://msdn.microsoft.com/en-US/library/mt788653(SQL.130).aspx

Install SQL Server on Linux https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-setup

SQL Server on Linux Documentation https://docs.microsoft.com/en-us/sql/linux/

SQL Server Service Pack 1

Also announced was Service Pack 1 of SQL Server 2016, you can download that here https://go.microsoft.com/fwlink/?linkid=835368

There are so many cool things in this service pack. For one, all the editions now support all the programmability features like indexed views, columnstore indexes, partitioning etc etc, see image below

You are still bound by the memory and CPU limits but at least your code doesn't have to change, this is good news for ISVs.

A couple of more tidbits....

CREATE OR ALTER (Yes, we heard you !!!) – New CREATE OR ALTER support makes it easier to modify and deploy objects like Stored Procedures, Triggers, User–Defined Functions, and Views. This was one of the highly requested features by developers and SQL Community.

New DMF sys.dm_exec_query_statistics_xml – Use this DMF to obtain actual query execution showplan XML (with actual number of rows) for a query which is still being executed in a given session (session id as input parameter). The showplan with a snapshot of current execution statistics is returned when profiling infrastructure (legacy or lightweight) is on.
New DMF for incremental statistics – New DMF sys.dm_db_incremental_stats_properties to expose information per–partition for incremental stats.
Better correlation between diagnostics XE and DMVs – Query_hash and query_plan_hash are used for identifying a query uniquely. DMV defines them as varbinary(8), while XEvent defines them as UINT64. Since SQL server does not have “unsigned bigint”, casting does not always work. This improvement introduces new XEvent action/filter columns equivalent to query_hash and query_plan_hash except they are defined as INT64 which can help correlating queries between XE and DMVs.

There is much more, see all the news here: https://blogs.msdn.microsoft.com/sqlreleaseservices/sql-server-2016-service-pack-1-sp1-released/

I also played around with some of the new stuff after installing vNext, see here:
Playing around with the new stuff in SQL Server vNext and SQL Server 2016 SP1

Friday, October 28, 2016

Some cool SQL Server announcements SQL Graph, Adaptive Query Plan, CTP1 of SQL vNext, SQL Injection detection

There were a bunch of cool announcement at the SQL QA and chalk talk sessions at the PASS Summit

Here are the people presenting that session(not all pictured here) Lindsey Allen Jakub Szymaszek Jovan Popovic Kevin Farlee Tomas Talius Sunil Agarwal Joseph Sack and Tomas Talius

Resumable Index Rebuilds

For now this will be row-store only, this will not be available for columnstore indexes for the coming release. I assume it will work like showcontig back in the day. I need nore info and have the following questions....

What happens if you stop the index rebuild and you add more data?
Are statistics update only at the end, when everything is done?
Is this just a reorg in disguise?
Is this online or offline only?

SQL Graph

Think about this like Neo4J but then in SQL Server and probably better

SQL Graph adds graph processing capabilities to SQL Server, which will help you link different pieces of connected data to help gather powerful insights and increase operational agility. Graphs are well suited for applications where relationships are important, such as fraud detection, risk management, social networks, recommendation engines, predictive analysis, dependence analysis, IoT suites, etc.
Initially, SQL Server will support CRUD graph operations and multi-hop graph navigation, and the following functionality will be available in the private preview:
Create graph objects, that is, nodes to represent entities and edges to represent relationships between any 2 given nodes. Both Nodes and Edges can have properties associated to them.
SQL language extensions to support join free, pattern matching queries for multi-hop navigation.

If you like to signup for the private preview...go here: http://aka.ms/GraphPreview

Adaptive Query Plans
I was sitting in this session, and this is more or less what I remembered

This is for starters about Multi-statement Table Valued Functions (MSTVF). The legacy cardinality estimator will use 1, in 2014 the new new cardinality estimator will use the number 100. Both are of course wrong. The idea with Adaptive Query Plans is when the tree is build, grab the correct value from the MSTVF, inject that into the plan and pass it down the pipeline

Adjust the plan..consecutive execution, if you use too much memory and start spilling, this will be adjusted and you will stop spilling
Adaptive joins, the idea is if they think that it is appropriate that they can have a nested loop operation but there might be a chance that the estimate is wrong, they will revert to a hash join otherwise they will stick with their original nested loop join

The plan is changed in flight, there will be an intra-plan change, same thing with the adaptive join.

For starters, they will just address MSTVF, other stuff will follow later

If you like to signup for the private preview...go here: http:/aka.ms/adaptiveqppreview

SQL Server vNext CTP1 coming next month
CTP1 for SQL Server vNext will be available next month, this is both for the Linux as well as the Windows version. This is available now, here is the link https://www.microsoft.com/evalcenter/evaluate-sql-server-vnext-ctp

See also: SQL Server 2016 SP1 released, SQL Server vNext available for download

SQL Server Standard Edition announcement on November 15th

There will be a big announcement about SQL Server Standard Edition on November 15th.... mmmm maybe the ability to use more memory? We will have to wait and see what will be announced

SQL Injection detection
The way this will work is that if they detect a query that has been SQL injected with for example DROP TABLE, they will block it and sent an email to a DL or group that you have defined explaining that the statement has been blocked

Python coming to SQL Server
It looks like in addition to R, Python will be coming to SQL Server as well

P.S.

If you want to read about my time at SQL Pass 2016, read these posts

SQL Pass Summit 2016... day 2
SQL Pass Summit 2016... day 3
SQL Pass Summit 2016... day 4
SQL Pass Summit 2016... day 5

You might be thinking, where is day 1? The day 1 posts is mostly about Seattle, if you are interested in that one, here it is: SQL Pass Summit 2016... day 1.... just looking around

Wednesday, September 28, 2016

If you are going to PASS Summit 2016, build your schedule today

If you are going to the SQL Server Pass summit this year, you can start building your schedule already. Don't wait too long because some events seat only a limited number of people. Take the Chalk Talk with the SQL Server Team [DBA-501-M] session for example
Here is what is says on the page for that session

Speaker(s): Lindsey Allen

Duration: 75 minutes

Track: Enterprise Database Administration & Deployment

Join us in an open dialog with members of the SQL Server 2016 Program Management team. Don't miss this unique opportunity to ask the hard questions. Seating is limited to 50 so arrive early. Chalk Talk sessions are not recorded.

As you can see this session sits only 50 people, so make sure to make a note of that, this way you know to get there early

Some other sessions are also going to be either very popular or they might be in a smaller room, so add those sessions to your schedule first.

You can see all the sessions here: http://www.sqlpass.org/summit/2016/Sessions/Schedule.aspx

That's all

See you at the PASS summit in October

Sunday, February 21, 2016

How to store twitter search results from R into SQL Server

I was messing around with R and grabbing twitter searches from within R, after that I decided to store the result in a csv file. Instead of stroring the results in a file, what if I want to store that info into a SQL Server database? This is pretty easy to do. If you want to try this out yourself, first head on over to my other post Using twitter from the R console. There you will be able to see how to setup OAuth so that you can use twitter within R, you will also see how to install the R twitteR package as well.

Time to get started....

First let's see how to connect to SQL Server from R
To load data from SQL Server into R, you first need to create an ODBC Data Source, I created one named LocalSqlServer, this DSN Data Source points to my local SQL Server 2016 instance.

After you have created your ODBC data source, you need to load the (RODBC) package, create a connection, execute a query, display the results and close the connection.. Here is what it looks like in the R Console

> require(RODBC)

>db <- odbcConnect("LocalSqlServer")

> Sample <- sqlQuery(db, "select * from sysobjects", stringsAsFactors=FALSE)

> head(Sample)

> close(db)  #close ORDBC connection
     name          id xtype uid info status base_schema_ver
1      sp_MSalreadyhavegeneration -1073624922    P    4    0      0               0
2      sp_MSwritemergeperfcounter -1072815163    P    4    0      0               0
3       TABLE_PRIVILEGES -1072372588    V    3    0         
4       sp_replsetsyncstatus -1071944761    X    4    0      0               0
5       sp_replshowcmds -1070913306    P    4    0      0               0
6       sp_publishdb -1070573756    P    4    0      0               0

Now that we know how to connect to SQL Server, let's see if we can get the twitteR package to work, head on over to my other post Using twitter from the R console to see how to install twitterR and how some queries are executed.

Assuming the twitteR package is installed, we need to load it to use it

> library(twitteR)

The next command is the command to enable OAuth, to see how to set this up go here Using twitter from the R console

> setup_twitter_oauth("API key", "API secret", "Access token", "Access secret")

[1] "Using direct authentication"

Now it is time to do a simple search

> tweets <- searchTwitter('#rstats', n=6)
> tweets

Running that command should have given you some results. However what we need as output is a dataframe, this way we can easily load this into a table in SQL Server. You can use the twListToDF function

> tweets <- searchTwitter('#rstats', n=6)
> twListToDF(tweets)

The output now looks different

1 RT @rquintino: @Mairos_B #sqlsatportugal session: all about R in #SqlServer 2016 #rstats https://t.co/DHrqIZrz1e
2 a quick script to use imgcat in #rstats https://t.co/fpUlgWNX33 https://t.co/AhCCMLewCH
3 RT @KirkDBorne: Useful packages (libraries) for Data Analysis in R: https://t.co/haRKopFyly #DataScience #Rstats by @analyticsvidhya https:…
4 Hey #rstats tweeps, do you have any readings to recommend on sensitivity analysis? Books/articles/websites all welcome.
5 RT @KirkDBorne: A Complete Tutorial on Time Series Modeling in R: https://t.co/7oI6JKyU4E #MachineLearning #DataScience #Rstats by @Analyti…
6 RT @KirkDBorne: A Complete Tutorial on Time Series Modeling in R: https://t.co/7oI6JKyU4E #MachineLearning #DataScience #Rstats by @Analyti…
favorited favoriteCount replyToSN created truncated replyToSID
1 FALSE 0 NA 2016-02-20 20:29:54 FALSE NA
2 FALSE 0 NA 2016-02-20 20:24:50 FALSE NA
3 FALSE 0 NA 2016-02-20 20:16:25 FALSE NA
4 FALSE 0 NA 2016-02-20 20:11:08 FALSE NA
5 FALSE 0 NA 2016-02-20 20:11:06 FALSE NA
6 FALSE 0 NA 2016-02-20 20:02:05 FALSE NA
id replyToUID
1 701141750161784834 NA
2 701140474019577856 NA
3 701138356466483204 NA
4 701137026075140096 NA
5 701137018508722176 NA
6 701134750296227840 NA
statusSource
1 Mobile Web (M5)
2 Tweetbot for Mac
3   Twitter for Android
4   Twitter Web Client
5   Twitter for iPhone
6   Twitter Web Client
screenName retweetCount isRetweet retweeted longitude latitude
1 psousa75 3 TRUE FALSE NA NA
2 millerdl 0 FALSE FALSE NA NA
3 diana_nario 50 TRUE FALSE NA NA
4 emjonaitis 0 FALSE FALSE NA NA
5 caryden 41 TRUE FALSE NA NA
6 ArkangelScrap 41 TRUE FALSE NA NA

The output now has a lot more stuff, you can see if it has been retweeted or favorited as well as the latitude, longtitude and more.

Time to send the result to SQL Server

There is a problem with how the sqlSave method creates the table, it makes the created column a float and then all inserts fail.

Run the following DDL statement in your database

CREATE TABLE [dbo].[tweets](
	[rownames] [nvarchar](255) NULL,
	[text] [nvarchar](255) NULL,
	[favorited] [nvarchar](5) NULL,
	[favoriteCount] [float] NULL,
	[replyToSN] [nvarchar](5) NULL,
	[created] [datetime] NULL,
	[truncated] [nvarchar](5) NULL,
	[replyToSID] [nvarchar](5) NULL,
	[id] [nvarchar](255) NULL,
	[replyToUID] [nvarchar](5) NULL,
	[statusSource] [nvarchar](255) NULL,
	[screenName] [nvarchar](255) NULL,
	[retweetCount] [float] NULL,
	[isRetweet] [nvarchar](5) NULL,
	[retweeted] [nvarchar](5) NULL,
	[longitude] [nvarchar](255) NULL,
	[latitude] [nvarchar](255) NULL
) ON [PRIMARY]

Now we are ready to save our twitter data in SQL Server, here is how you do it.

> sqlSave(db,append = TRUE,colnames = FALSE,verbose = FALSE, addPK = FALSE, twListToDF(tweets))

Here is what the results look in SSMS

Make sure that you have append = TRUE, if you make it FALSE, the table will be recreated every time but since the command issued creates the wrong data type for the created column, you can't use it

So to recap, here are the commands you need after you loaded the twitterR package and have setup OAuth

> tweets <- searchTwitter('#rstats', n=6) #store the twitter search results
> twListToDF(tweets)  #convert to dataframe

> require(RODBC)  # load RODBC package
> db <- odbcConnect("LocalSqlServer")  #connect
> sqlSave(db,append = TRUE,colnames = FALSE,verbose = FALSE, addPK = FALSE, twListToDF(tweets))  #insert data into your table
> close(db)  #close ORDBC connection