Saturday, November 12, 2016

What's new in SQL Server 2016: Truncate partitions


In my The best thing about the PASS summit for me is... post I said that I would do the following

I will also do a series about new stuff in SQL Server 2016, but that will be in between all the other posts. Whenever someone asked during a session how many people were already on SQL Server 2016, very few hands went up, this is the reason, I want to blog about this as well.
For all these series of posts, I plan to have a repo on github with a yet to be determined name. Then I will have all the SQL code for all the series organized there
This is the first post in that series

You can find the github repo for the series here whatisnewinsqlserver2016, all the code for this post can be found here TruncatePartitions

New in SQL Server 2016 is the ability to truncate a single partition or several partitions at once. This is very nice because we all know that a truncate is much faster than a delete. Let's see how this all works.

First create this simple table


CREATE TABLE SalesPartitioned(
 YearCol SMALLINT NOT NULL,
 OrderID INT NOT NULL, 
 SomeData UNIQUEIDENTIFIER DEFAULT newsequentialid())
GO


Now it is time to insert some data. We are going to create data for 6 years and then use those years to partition the data. This query will insert 6 times 2048 rows, 2048 rows per year.


INSERT SalesPartitioned (YearCol,OrderID)
SELECT 2013,number
FROM master..spt_values
WHERE type = 'P'
UNION ALL
SELECT 2014,number + 2048
FROM master..spt_values
WHERE type = 'P'
UNION ALL
SELECT 2015,number + 4096
FROM master..spt_values
WHERE type = 'P'
UNION ALL
SELECT 2016,number + 6144
FROM master..spt_values
WHERE type = 'P'
UNION ALL
SELECT 2017,number + 8192
FROM master..spt_values
WHERE type = 'P'
UNION ALL
SELECT 2018,number + 10240
FROM master..spt_values
WHERE type = 'P'


Now let's create the partition function

CREATE PARTITION FUNCTION pfFiscalYear(SMALLINT)
AS RANGE LEFT FOR VALUES(2013,2014,2015,2016,2017)
GO


Create the partition scheme


CREATE PARTITION SCHEME psFiscalYear
AS PARTITION pfFiscalYear ALL TO ([PRIMARY])
GO


Add a primary key to the table and place it on the partition scheme we created above


ALTER TABLE dbo.SalesPartitioned ADD CONSTRAINT
    PK_Sales PRIMARY KEY CLUSTERED (YearCol,OrderID)
ON psFiscalYear(YearCol)
GO

Now let's see what we have as far as counts for each partition


SELECT partition_number,rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('SalesPartitioned')

Here are the results

partition_number rows
1 2048
2 2048
3 2048
4 2048
5 2048
6 2048

As you can see we have 6 partitions, each partition has 2048 rows


We can verify this by running a count ourselves


SELECT YearCol, $PARTITION.pfFiscalYear(YearCol) AS Partition,
COUNT(*) AS [COUNT] FROM SalesPartitioned
GROUP BY $PARTITION.pfFiscalYear(YearCol),YearCol
ORDER BY Partition;
GO


YearCol Partition COUNT
2013 1 2048
2014 2 2048
2015 3 2048
2016 4 2048
2017 5 2048
2018 6 2048


Now it is time to truncate a partition.
In order to truncate a partition, you use the following syntax




TRUNCATE TABLE ........
WITH ( PARTITIONS ( { <partition_number_expression> | <range> } [ , ...n ] ) )

So if we just want to truncate partition 2, we would execute the following


TRUNCATE TABLE SalesPartitioned
WITH (PARTITIONS (2));
GO


Checking those same count queries from before shows that partition 2 has no rows after we executed the truncate command


SELECT partition_number,rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('SalesPartitioned')


partition_number rows
1 2048
2 0
3 2048
4 2048
5 2048
6 2048



SELECT YearCol, $PARTITION.pfFiscalYear(YearCol) AS Partition,
COUNT(*) AS [COUNT] FROM SalesPartitioned
GROUP BY $PARTITION.pfFiscalYear(YearCol),YearCol
ORDER BY Partition;
GO


YearCol Partition COUNT
2013 1 2048
2015 3 2048
2016 4 2048
2017 5 2048
2018 6 2048



There are two ways to truncate several partitions at once. For example, if you want to empty partitions 4, 5 and 6, you can list them with WITH (PARTITIONS (4, 5, 6)) or you can use a range, as shown in the code below


TRUNCATE TABLE SalesPartitioned
WITH (PARTITIONS (4 TO 6));
GO

After we execute that and check the counts again, we see that only partitions 1 and 3 still contain rows


SELECT partition_number,rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('SalesPartitioned')

Here are the results from that query

partition_number rows
1 2048
2 0
3 2048
4 0
5 0
6 0



SELECT YearCol, $PARTITION.pfFiscalYear(YearCol) AS Partition,
COUNT(*) AS [COUNT] FROM SalesPartitioned
GROUP BY $PARTITION.pfFiscalYear(YearCol),YearCol
ORDER BY Partition;
GO

Here are the results

YearCol Partition COUNT
2013 1 2048
2015 3 2048
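
The list form and the range form can also be combined in a single statement. The sketch below is not part of the walkthrough above. It assumes the same SalesPartitioned table with all six partitions still populated, and the partition numbers are picked only for illustration.

```sql
-- Truncate partition 1, partition 3 and the range 5 through 6 in one statement
TRUNCATE TABLE SalesPartitioned
WITH (PARTITIONS (1, 3, 5 TO 6));
GO
```

Starting from six populated partitions, this would leave rows only in partitions 2 and 4.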



What happens if you specify an invalid partition in your truncate statement?


TRUNCATE TABLE SalesPartitioned 
WITH (PARTITIONS (12)); 

You get the following message

Msg 7722, Level 16, State 2, Line 1
Invalid partition number 12 specified for table 'SalesPartitioned', partition number can range from 1 to 6.
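
One way to avoid this error is to look up the valid partition numbers before truncating. The following is a minimal sketch against the same table. The column aliases are made up for this example, and the filter assumes the table is a heap or has a clustered index (index_id 0 or 1).

```sql
-- Highest valid partition number for the table
SELECT MAX(partition_number) AS HighestPartition
FROM sys.partitions
WHERE object_id = OBJECT_ID('SalesPartitioned')
AND index_id IN (0, 1);

-- Partition number that a given YearCol value maps to
SELECT $PARTITION.pfFiscalYear(2016) AS PartitionNumber;
GO
```

With the partition function defined earlier, 2016 maps to partition 4, which matches the counts shown in the results above.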


There you have it, this is how truncate partitions work




Friday, November 11, 2016

Interesting SQL related links for the week of Nov 11, 2016


Here are some interesting articles I read and tweeted about this past week. I think you will like these as well. If you are bored this weekend, some of these might be good for you to read.


.Net 4.6.2. Framework client driver for Always Encrypted resulting in intermittent failures to decrypt individual rows

The SQL Product team has identified an issue with .Net 4.6.2 framework client driver for Always Encrypted enabled database on SQL Server 2016 and Azure SQL Database. The issue can lead to intermittent failure while trying to decrypt the records from the Always Encrypted enabled database with following error message
Decryption failed. The last 10 bytes of the encrypted column encryption key are: ‘7E-0B-E6-D3-39-CE-35-86-2F-AA’.The first 10 bytes of ciphertext are: ’01-C3-D7-39-33-2F-E6-44-C3-B1′.Specified ciphertext has an invalid authentication tag. 

In-Memory OLTP: Is your database just in memory or actually optimized for memory?

In my many conversations with customers during Microsoft events, people often confuse between the terms ‘In Memory’ and ‘Memory-Optimized’ and many think that they are one and the same. If you continue reading this blog, you will realize that they are somewhat related but can lead to very different performance/scalability.



Paper Review: Why Does the Cloud Stop Computing? Lessons from Hundreds of Service Outages

This paper conducts a cloud outage study of 32 popular Internet services, and analyzes outage duration, root causes, impacts, and fix procedures. The paper appeared in SOCC 2016, and the authors are Gunawi, Hao, Suminto Laksono, Satria, Adityatama, and Eliazar.

Availability is clearly very important for cloud services. Downtimes cause financial and reputation damages. As our reliance to cloud services increase, loss of availability creates even more significant problems. Yet, several outages occur in cloud services every year. The paper tries to answer why outages still take place even with pervasive redundancies.



Daniel H Pink: employees are faster and more creative when solving other people's problems

Recent research reveals that people are more capable of mental novelty when thinking on behalf of others than for themselves. This has far-reaching practical implications at every level of business.


I’ve gotten a few questions about shrinking SQL Server data files lately. What’s the best way to get shrink to run? And why might it fail in some cases?

Traditionally, every time you ask a DBA how to make shrinking suck less, they start ranting how shrinking is bad and you just shouldn’t do it. Sometimes it sounds kinda angry.

What people are trying to say is that shrinking data files is generally slow, frustrating, and painful for you.



How It Works (It Just Runs Faster): Non-Volatile Memory SQL Server Tail Of Log Caching on NVDIMM

SQL Server 2016 has been enhanced along with Windows 2016 to detect a direct access volume (/dax.) When you create or alter the SQL Server database you simply add the LOG ON clause, pointing to the DAX enabled volume. SQL Server will detect the request on the DAX based volume and create the non-volatile, tail of log cache of fixed size for the database. From all external aspects (DMVs for example) the tail of log cache looks like another file that is part of the database’s LOG file group.


NASA Is Harnessing Graph Databases To Organize Lessons Learned From Past Projects

NASA famously maintains a "lessons learned" database containing valuable information from its past programs and projects. But the vast system, which has been online since 1994, is not always easy to navigate. Now the agency is modernizing it with help from a tool more familiar to social media than space missions: graph databases.

The genesis of the change began about a year and a half ago when an engineer, attempting to search "lessons learned" for relevant documents, found the number of possible results overwhelming. "He was getting things that really were not relevant to what he was looking for," David Meza, NASA’s chief knowledge architect, recalls.

Looking to make the database more useful, and help users investigate relationships beyond what basic keyword searches could uncover, Meza experimented with storing the information in a graph database—that is, a database optimized to store information in terms of data records and the connections between them. In recent years, such network graphs have become a familiar feature of online social networks.


Microsoft Azure SQL Database provides unparalleled performance with In-Memory technologies

Azure SQL Database built-in In-Memory technologies are now generally available for the Premium database tier including Premium pools. In-memory technology helps optimize the performance of transactional (OLTP), analytics (OLAP), as well as mixed workloads (HTAP). These technologies allow you to achieve phenomenal performance with Azure SQL Database – 75,000 transactions per second for order processing (11X perf gain) and reduced query execution time from 15 seconds to 0.26 (57X perf). You can also use them to reduce cost – on a P2 database obtain 9X perf gain for transactions or 10X perf gain for analytics queries by implementing In-Memory technologies, without any additional cost!


Offshoring roulette: lessons from outsourcing to India, China and the Philippines

In what is now a former life, I spent a great deal of time sending work overseas. In fact, I spent quite a bit of time over there myself in some of the places I'm going to talk about. That former life was as a software architect for Pfizer and in the last half dozen years of my 14-year tenure, I had responsibility for software architecture in the Asia Pacific region. For those not from this corner of the world, Asia Pacific is a sizeable chunk of the globe



TDD Doesn't Work

TDD Doesn't work.

It doesn't? That's odd. I've always found it to work quite well.

Not according to a new study.

Another study?

Yeah, an in-depth study that repeated another study that was done a few years back. Both showed that TDD doesn't work. The new one uses a multi-site, blind analysis, approach. It looks conclusive.



Oops Recovery with Temporal Tables

Have you ever got that sinking feeling after hitting the Execute button in SSMS, thinking “I should not have done that”? DML statements with missing WHERE clause, DROP statements accidentally targeting slightly mistyped (but existing) tables or databases, RESTORE statements overwriting databases with new data that haven’t been backed up, are all examples of actions prompting an “Oops…” (or worse) shortly thereafter. “Oops recovery” is the term that became popular to describe the process of fixing the consequences.

For most of these scenarios, the usual, and often the only, recovery mechanism is to restore the database from backup to a point in time just before the “oops”, known as point-in-time recovery (PITR). Even though PITR remains the most general and the most effective recovery mechanism, it does have some drawbacks and limitations: the recovery process requires a full database restore, taking the time proportional to the size of the database; a sequence of restores may be needed if multiple “oops” transactions have occurred; in the general case, there will be difficulties reconciling recovered data with data modified after the “oops” point in time, etc. Nevertheless, PITR remains the most widely applicable recovery method for SQL Server databases, both on-premises and in the cloud.



Introduction to Machine Learning for Developers

Today’s developers often hear about leveraging machine learning algorithms in order to build more intelligent applications, but many don’t know where to start.


One of the most important aspects of developing smart applications is to understand the underlying machine learning models, even if you aren’t the person building them. Whether you are integrating a recommendation system into your app or building a chat bot, this guide will help you get started in understanding the basics of machine learning.

This introduction to machine learning and list of resources is adapted from my October 2016 talk at ACT-W, a women’s tech conference.

And that is all for this week, try to read some of these over the weekend....

Wednesday, November 09, 2016

What was the first computer you bought, what were some of your favorite games?

A non SQL post today and I posted a variation of this before. What was the first computer you bought, what were some of your favorite games?

The first computer I ever bought was a Commodore 128 (I actually received it as a gift for my 16th birthday)


This baby had 128K (not MB) of RAM, 4 sound channels and 16 colors.
With the C128 you had a C64 built in and you could run CP/M (it came with a floppy).
I almost always booted up in C64 mode, which gave you 39KB of free memory to use. The speed was 1 MHz; the C128 could run at 2 MHz, but then the screen would go dark until you switched back to 1 MHz. The C128 came with BASIC built in. I had a tape recorder so that I could store and retrieve programs or games. This was such a nuisance: if your friend gave you a game and the heads on his recorder were aligned differently, you could not load the game and you would have to use a screwdriver to fix the azimuth. It could take up to 30 minutes to load a game if you didn’t have a turbo.

Later on, I bought one of these cartridges you can see in the picture below

Using this cartridge, you could 'freeze' the Commodore 64 and then store what was in memory on tape. Now all you had to do was load that again from tape and you could continue where you left off. The cartridge also had an assembler built in, it was pretty cool technology for the late 80s.

I still remember the great games from that time, here are some of my favorites

1942

This was a very fun game, basically you had to shoot a bunch of enemy planes and stay clear of any bombs


Yie Ar Kung-Fu

This game was very good for its time, you would fight characters who each specialized in a different martial art


Kung-Fu Master

I played this one in the arcades as well as on my Commodore, fun game to pass the time



Zaxxon

Probably one of the more difficult games of that time, it was 3D so you had to think about 3 dimensions and you would always crash into something















Ghosts 'n Goblins

Fun game and also fun music

The best part about the games is that you could change the value at a memory address after you loaded a game but before typing RUN.
You would use POKE for that, examples:
POKE 43719,234: POKE 43720,234: POKE 43721,234 (invincibility)
POKE 44731,76: POKE 44732,253: POKE 44733,174 (all doors unlocked)
POKE 34202,200: SYS 2060 (unlimited lives)
Programming on the Commodore was primarily done in BASIC or assembler (built in) but you could also buy a C compiler, Oxford Pascal or many other languages.
Here is an example of BASIC

10 PRINT "THIS IS THE MAIN PROGRAM",
20 GOSUB 1000
30 PRINT "AND AGAIN";
40 GOSUB 1000
50 PRINT "AND THAT IS ALL."
60 STOP
1000 REM SUBROUTINE STARTS HERE
1010 PRINT "THIS IS THE SUBROUTINE,"
1020 RETURN


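Here is some assembler language, this snippet multiplies the byte stored at $5000 by 3
LDA $5000    ; load the byte at address $5000 into the accumulator
ASL          ; shift left one bit, doubling the value
CLC          ; clear the carry flag so it does not affect the addition
ADC $5000    ; add the original value: 2x + x = 3x
STA $5000    ; store the result back at $5000
BRK          ; break, ending the routine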


So what was your first computer?

Monday, November 07, 2016

Sometimes, you can stare at something for 5 minutes before seeing it


The other day I was running some code and the variable that I created was not being populated. I looked at the code for a minute or so, ran it again (like that would change it) and it was still null. Before I show you the code, I want you to look at this image, what do you see?

 If you have never seen this image you will probably see "Paris in the spring". But look closer, do you see the second "the" on the third line?

The way this works is that your brain eliminates the second "the" since it has already processed one.

Take a look at this text below, you probably have no problem "reading" it


"I cdn'uolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg: the phaonmneel pweor of the hmuan mnid. Aoccdrnig to a rseearch taem at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Scuh a cdonition is arppoiatrely cllaed Typoglycemia .

"Amzanig huh? Yaeh and you awlyas thguoht slpeling was ipmorantt."



So back to my code..here is a simple example


DECLARE @name varchar(100)

SELECT @name = @name FROM sys.databases
WHERE database_id = 6


SELECT @name

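Do you see the problem? The table has a column named name and the variable is @name, and these look very similar. The SELECT assigns @name to itself instead of assigning the name column, so the variable stays NULL.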
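
For comparison, here is what the query was presumably meant to look like, with the column on the right-hand side of the assignment. This is just a sketch. The database_id of 6 is taken from the example above and may not exist on your server, in which case @name is still NULL.

```sql
DECLARE @name varchar(100)

-- assign the name column (not the variable) to @name
SELECT @name = name FROM sys.databases
WHERE database_id = 6

SELECT @name
```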


Let's take a look at another example, one of my co-workers was inserting data into a temp table from another table.

INSERT #Temp
SELECT lastrun, cpu_busy, io_busy, idle, pack_received,  connections, pack_errors, 
total_read, total_write, total_errors
FROM master.dbo.spt_monitor

He then needed to add a column named pack_sent. He modified the temp table to add this column and then added the column to his insert query


INSERT #Temp
SELECT lastrun, cpu_busy, io_busy, idle, pack_received pack_sent, connections, 
pack_errors, total_read, total_write, total_errors
FROM master.dbo.spt_monitor

Running that gave him this error

Msg 213, Level 16, State 1, Line 1
Column name or number of supplied values does not match table definition.


Do you see what happened? He added the column but forgot a comma, so pack_sent became an alias for the pack_received column. Sometimes I really wish AS were required to alias a column, at least that would eliminate errors like these.
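
The intended statement simply has a comma between pack_received and pack_sent. Here is a sketch of the corrected insert. Since the definition of #Temp is not shown in this post, the first statement builds an empty #Temp with matching columns, which is an assumption made only so the example stands on its own.

```sql
-- Assumption: create an empty #Temp with matching columns, only for this sketch
SELECT lastrun, cpu_busy, io_busy, idle, pack_received, pack_sent, connections,
pack_errors, total_read, total_write, total_errors
INTO #Temp
FROM master.dbo.spt_monitor
WHERE 1 = 0

-- Corrected insert: note the comma between pack_received and pack_sent
INSERT #Temp
SELECT lastrun, cpu_busy, io_busy, idle, pack_received, pack_sent, connections,
pack_errors, total_read, total_write, total_errors
FROM master.dbo.spt_monitor
```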


So how many times have you done stuff like this? Also, do you have any other examples of similar stuff?

Why you need additional privileges to truncate tables compared to delete statements



One of the people on our team wanted to have the ability to truncate tables on the staging database while this person was doing some testing.

 Here is what Books Online has about permissions for the TRUNCATE TABLE statement

The minimum permission required is ALTER on table_name. TRUNCATE TABLE permissions default to the table owner, members of the sysadmin fixed server role, and the db_owner and db_ddladmin fixed database roles, and are not transferable.

Before I answer why someone would need ALTER TABLE permissions when the person already has DELETE permissions, let’s run some code that will show the ‘problem’.

CREATE DATABASE Test
go
 
USE Test
GO
 
CREATE TABLE TestTruncate(Id int)
GO
 
INSERT TestTruncate values(1)
GO

Now create a new user and give the user datareader and datawriter permissions

USE master
GO
CREATE LOGIN TestLogin WITH PASSWORD=N'Test', 
DEFAULT_DATABASE=master, CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
GO
USE Test
GO
CREATE USER TestLogin FOR LOGIN TestLogin
GO
USE Test
GO
ALTER ROLE db_datareader ADD MEMBER TestLogin
GO
USE Test
GO
ALTER ROLE db_datawriter ADD MEMBER TestLogin
GO

Now that the user is created, log in as that user and run the TRUNCATE TABLE command


TRUNCATE TABLE TestTruncate

Msg 1088, Level 16, State 7, Line 1
Cannot find the object “TestTruncate” because it does not exist or you do not have permissions.
As you can see, you don’t have permission. Executing a delete will work just fine


DELETE TestTruncate


(1 row(s) affected)
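
If you would rather check the permissions without logging in as the test user, something like the following sketch can be run from an admin session. It impersonates TestLogin with EXECUTE AS USER and asks HAS_PERMS_BY_NAME about the table. The column aliases are made up, and it assumes the TestTruncate table lives in the dbo schema.

```sql
USE Test
GO
-- Impersonate the test user, check its effective permissions, then switch back
EXECUTE AS USER = 'TestLogin'
SELECT HAS_PERMS_BY_NAME('dbo.TestTruncate', 'OBJECT', 'DELETE') AS CanDelete,
       HAS_PERMS_BY_NAME('dbo.TestTruncate', 'OBJECT', 'ALTER') AS CanAlter
REVERT
GO
```

For the setup above, CanDelete should come back as 1 and CanAlter as 0.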

Before I give you a workaround, let’s try to figure out why the minimum requirement is ALTER TABLE.
What is the difference between a DELETE and a TRUNCATE in terms of logging? When a TRUNCATE occurs, the operation does not log individual row deletions, a DELETE operation does. The reason this is important is because if you have a trigger on the table, it needs to be disabled before the TRUNCATE occurs. Now you know why ALTER TABLE is required, triggers need to be disabled.


ALTER TABLE SomeTable DISABLE TRIGGER SomeTrigger



And in order to disable the trigger, ALTER TABLE permissions are required as a minimum.

But I don’t want people altering tables on our staging and QA servers, so here is one way of giving the person the ability to TRUNCATE a table without giving them permissions explicitly. Create a stored procedure and use WITH EXECUTE AS, this will define the execution context of the stored procedure. In the example below, I picked a user that has sufficient privileges to perform the TRUNCATE.



CREATE PROCEDURE prTruncate
WITH EXECUTE AS 'SuperUser' -- must be an existing database user with ALTER permission on the table
AS
TRUNCATE TABLE TestTruncate
GO


All you have to do is give your user execute permissions to the stored procedure you just created



GRANT EXECUTE ON prTruncate TO TestLogin
GO


Now if you execute the stored procedure as the TestLogin user, you will see it will run just fine



EXEC prTruncate

Hope this helps someone in the future who is filling up his or her transaction log these days with all those DELETE statements

Saturday, November 05, 2016

Did you know that the MSSQL Tiger Team has a GitHub repo?



I just found out that the Microsoft SQL Server Tiger team has a GitHub repository where they host a bunch of SQL Server scripts

It is called the tigertoolbox. In the image below, you can see what is in this repository



You can find that repository here: https://github.com/Microsoft/tigertoolbox

If you want, you can watch this repo, you will then be notified of any changes to any of the files

Also make sure to follow the @mssqltiger twitter account to stay up to date with any news from the MS Tiger Team.

Finally, make sure to add their blog to your favorite rss reader: https://blogs.msdn.microsoft.com/sql_server_team/




Friday, November 04, 2016

Interesting bug with milliseconds and datetime on SQL Server 2016



The other day, I was messing around and ran code like the following on SQL Server 2016


DECLARE @ms int
DECLARE @datetime datetime = '2016-11-03T10:18:17.847'   --846 on SQL 2016
SET @ms = DATEPART(ms, @datetime)

SELECT  @ms
GO

DECLARE @ms int  
DECLARE @datetime datetime2 = '2016-11-03T10:18:17.847'  --847
SET @ms = DATEPART(ms, @datetime)

SELECT  @ms
GO

DECLARE @ms int
DECLARE @datetime time = '2016-11-03T10:18:17.847'   --847
SET @ms = DATEPART(ms, @datetime)

SELECT  @ms
GO


When I ran the code on SQL Server 2014 and SQL Server 2012, I got back 847 for all 3 queries. Running it on SQL Server 2016 returns 846 instead of 847 for the datetime data type.

Interesting, something must have changed. I then did a couple of searches on the internetz and .....


....I actually found this comment from Erland Sommarskog on the MS forums

Yes, this is an intended change.
datetime has a resolution of 3 1/3 ms and the exact value of what looks like 2016-09-27 18:03:03.297 is 2016-09-27 18:03:03.296666...
Previously this was not reflected in the internal represenation, but this was changed in SQL 2016.
It seems that you found the earlier thread about this. The poster there expressed concern about Entity Framework, but there was a suggestion from Erik Eskjov Jensen, which I am not able to evaluate myself. Does his suggestion help you?
From there I found that someone already filed a bug for this: https://connect.microsoft.com/SQLServer/feedback/details/3104723

And just 3 days ago, a comment was posted

Posted by Dan Guzman on 10/1/2016 at 9:04 PM
Yes, this is an intended change but has not yet been documented in the Breaking Changes topic in the SQL Server 2016 Books Online as of this writing.
The datetime value of '2016-09-27 18:03:03.297' is actually '2016-09-27 18:03:03.2966666...' because SQL Server internally uses 1/300 second interval units for the time portion. The infinitely repeating value is rounded to the fixed datetime precision of 3, resulting in the '2016-09-27 18:03:03.297' value. However, when the value is converted to a higher precision type, the repeating value is rounded to the target data type precision, which is '2016-09-27 18:03:03.2966667' in the case of datetime2(7).

This behavior is different than before SQL 2016, where the internal value was first rounded to a precision of 3, losing the available sub-millisecond value. As you observed, the conversion behavior may be controlled with the database compatibility level.
So there you have it, now you know the reason and this is indeed a breaking change.
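
To see this on your own server, you can convert a datetime value to datetime2(7) and check the compatibility level of the current database. This is a small sketch using the value from the comment above. The column alias is made up.

```sql
DECLARE @datetime datetime = '2016-09-27T18:03:03.297'

-- According to the comment above, compatibility level 130 gives ...03.2966667,
-- while the older behavior first rounds to a precision of 3
SELECT CAST(@datetime AS datetime2(7)) AS ConvertedValue

-- Compatibility level of the current database (130 = SQL Server 2016)
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME()
GO
```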

So I guess it is a feature not a bug


Are your foreign keys indexed? If not, you might have problems



When you add a primary key constraint to a table in SQL Server, an index will be created automatically. When you add a foreign key constraint no index will be created. This might cause issues if you don't know that this is the behavior in SQL Server. Maybe there should be an option to automatically index the foreign keys in SQL Server, what do you think?

The other day some deletes on a newer table in the test environment became really slow. We had a primary table with a couple of hundred rows, we loaded up between 200 and 300 million rows into the child table. Then we deleted the child rows, this was fast. After this, we deleted one row from the primary table and this took several seconds.

When I looked at this I noticed something interesting: most of the time during the delete was spent doing a lookup on the child table. Then I noticed that the foreign key was not indexed. After we added the index, the delete became thousands of times faster.

Let's try to replicate that behavior here

First create this primary table, we will add 2048 rows to this table


--Table that will have 2048 rows
CREATE TABLE Store(StoreID int not null,
  DateCreated datetime not null,
  StoreName varchar(500),
  constraint pk_store primary key   (StoreID))
GO

--insert 2048 rows
INSERT Store
SELECT ROW_NUMBER() OVER(ORDER BY t1.number) AS StoreID,DATEADD(dd,t1.number,'20161101') 
AS datecreated, NEWID()
FROM master..spt_values t1
WHERE t1.type =  'p'

Now create the child table and add 500K rows, this might take up to 1 minute to run since the rows are pretty wide.


-- table that will also have 500000 rows, fk will be indexed  
CREATE TABLE GoodsSold (TransactionID int not null,
   StoreID int not null,
   DateCreated datetime not null,
   SomeValue char(5000),
   constraint pk_transaction primary key   (TransactionID))

INSERT GoodsSold
SELECT top 500000 ROW_NUMBER() OVER(ORDER BY t1.number) AS TransactionID, t2.StoreID, 
DATEADD(dd,t1.number,'20161101') AS datecreated, REPLICATE('A',5000)
FROM master..spt_values t1
CROSS JOIN  Store t2
WHERE t1.type =  'p'


Now it is time to add the foreign key constraint and index this foreign key constraint

-- adding the foreign key
ALTER TABLE GoodsSold  WITH CHECK ADD  CONSTRAINT FK_StoreID FOREIGN KEY(StoreID)
REFERENCES Store(StoreID)
GO

ALTER TABLE GoodsSold CHECK CONSTRAINT FK_StoreID
GO

-- index the foreign key
CREATE index ix_StoreID on GoodsSold(StoreID)
GO


We will create another set of tables, let's start with the primary table, we will just insert into this table all the rows from the primary table we created earlier

-- create another primary table
CREATE TABLE StoreFK(StoreID int not null,
  DateCreated datetime not null,
  StoreName varchar(500),
  constraint pk_storefk primary key   (StoreID))
GO


-- add the same 2048 rows from the primary table with indexed FK
INSERT StoreFK
SELECT * FROM Store
GO


For the child table, it is the same deal, we will add all the rows from the child table we created earlier into this table

-- Add another FK table
CREATE TABLE GoodsSoldFKNoIndex (TransactionID int not null,
   StoreID int not null,
   DateCreated datetime not null,
   SomeValue char(5000),
   constraint pk_transactionfk primary key   (TransactionID))

-- add same 500K rows from table with FK index
INSERT GoodsSoldFKNoIndex
SELECT * FROM GoodsSold

Let's add the foreign key constraint, but this time we are not indexing the foreign key constraint

-- add the FK but do not index this
ALTER TABLE GoodsSoldFKNoIndex  WITH CHECK ADD  CONSTRAINT FK_StoreID_FK 
FOREIGN KEY(StoreID)
REFERENCES StoreFK(StoreID)
GO

ALTER TABLE GoodsSoldFKNoIndex CHECK CONSTRAINT FK_StoreID_FK
GO

Let's make sure that the tables have the same number of rows

-- check that the tables have the same rows
exec sp_spaceused 'GoodsSold'
exec sp_spaceused 'GoodsSoldFKNoIndex'



name rows reserved data index_size unused
GoodsSold 500000 4024976 KB 4000000 KB 22520 KB 2456 KB
GoodsSoldFKNoIndex 500000 4015440 KB 4000000 KB 14936 KB 504 KB



Now that we are set up, let's wipe out all the rows from the child table for a specific StoreID, the SELECT statements should return 0 rows

DELETE GoodsSoldFKNoIndex
WHERE StoreID = 507


DELETE GoodsSold
WHERE StoreID = 507


SELECT * FROM GoodsSoldFKNoIndex
WHERE StoreID = 507

SELECT * FROM  GoodsSold
WHERE StoreID = 507


Now we are getting to the interesting part, turn on Include Actual Execution Plan, run statistics IO or run this in Plan Explorer


DELETE Store
WHERE StoreID = 507


DELETE StoreFK
WHERE StoreID = 507

You will see something like this

So 75% compared to 25%, not good but doesn't look catastrophic, if you have statistics time on, you will see the following

 SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 16 ms.


 SQL Server Execution Times:
   CPU time = 561 ms,  elapsed time = 575 ms.


Now it looks much worse

What about statistics io?

Table 'GoodsSold'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Store'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'GoodsSoldFKNoIndex'. Scan count 1, logical reads 501373, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'StoreFK'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

That is terrible....

Here is also the view from Plan Explorer, look at Est. CPU Cost and Reads




There you have it, not indexing foreign keys can have a big impact even though the child table might not have any data at all
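
If you want to check your own database, the catalog views can tell you which foreign key columns have no index that starts with that column. The query below is a rough sketch, not a complete audit: it only looks at the leading key column of each index, so multi-column foreign keys may need a closer look. The index name in the last statement is made up.

```sql
-- List foreign key columns that are not the leading key column of any index
SELECT OBJECT_NAME(fkc.parent_object_id) AS TableName,
       fk.name AS ForeignKeyName,
       COL_NAME(fkc.parent_object_id, fkc.parent_column_id) AS ColumnName
FROM sys.foreign_keys fk
JOIN sys.foreign_key_columns fkc
  ON fkc.constraint_object_id = fk.object_id
WHERE NOT EXISTS (SELECT 1
                  FROM sys.index_columns ic
                  WHERE ic.object_id = fkc.parent_object_id
                    AND ic.column_id = fkc.parent_column_id
                    AND ic.key_ordinal = 1)
ORDER BY TableName, ForeignKeyName
GO

-- For the demo table above, an index like this one covers the foreign key
CREATE INDEX ix_StoreID_NoIndex ON GoodsSoldFKNoIndex(StoreID)
GO
```

In the demo database, the first query should list GoodsSoldFKNoIndex.StoreID but not GoodsSold.StoreID, since only the latter has an index on the foreign key column.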


See also When did SQL Server stop putting indexes on Foreign Key columns? by Kimberly Tripp

Thursday, November 03, 2016

Want to know if your connect item is fixed? There is a twitter account for that




Did you know there is a twitter account that will tweet when a connect item is closed or resolved as fixed? I first heard about this account at the Pass Summit last week, then I promptly forgot about it. This morning I was listening to SQL Server Radio show 59, in this show the hosts Matan Yungman and Guy Glantser were talking about this twitter account. I decided to check out this account and follow this account


Here are some examples of tweets from the Closed as Fixed twitter account


Publishing dacpac to SQL Database fails when database uses memory optimized features & target database is in database pool


In the actual execution plan, display Reads, writes, CPU ,duration top 3 waitypes/wait duration info for each query

SSMS silently reconnects to the default database when connection is lost

Wednesday, November 02, 2016

The best thing about the PASS summit for me is...


This past week I attended the PASS Summit 2016. I went there with 6 co-workers and I had a terrific time. This time the weather wasn't that bad, when I left Princeton, NJ, it looked like it was going to rain for 5 days straight. It was actually sunny on a couple of occasions  :-)

One thing I really liked were the chalk talk sessions they had with the program managers of SQL Server and Azure SQL DB. People attending these sessions could ask questions and the team then answered these questions. There were also some cool announcements, I documented those here: Some cool SQL Server announcements SQL Graph, Adaptive Query Plan, CTP1 of SQL vNext, SQL Injection detection

Another thing I really liked, and this is not specific to this PASS summit, is that I got to see some of the people I befriended over the years. I chatted and hung out with some of them.

I liked all the sessions I attended, the one negative feedback I have is that there were a couple of sessions at the same time, so I had to pick one of them. This is not such a big deal because once the sessions are made available online, I will just watch the ones I really wanted to attend first.


So now back to the title of this post..... what was the best thing about this PASS summit for me? It is the fact that going to this summit reignited my love for SQL Server and the passion for learning about it. I have laid low for the last couple of years but now I feel like a spark has been lit again, and I have so much to learn.

One of the best ways for me to learn is to blog about it. So here is what I am thinking I will do:

For December I will do a SQL advent series, this series will be about windowing functions. Everyone knows about the basic ones that were introduced in 2005, but very few people are using the stuff introduced later.

After that I will do a series about R and SQL Server, it will take me around 3 months, and I will probably do two posts per week.

After the R series, I want to do a series about the Query Store.

I will also do a series about new stuff in SQL Server 2016, but that will be in between all the other posts. Whenever someone asked during a session how many people were already on SQL Server 2016, very few hands went up, this is the reason, I want to blog about this as well.


For all these series of posts, I plan to have a repo on github with a yet to be determined name. Then I will have all the SQL code for all the series organized there


So that is my plan


P.S.

If you want to read about my time at SQL Pass 2016, read these posts


SQL Pass Summit 2016... day 2
SQL Pass Summit 2016... day 3
SQL Pass Summit 2016... day 4
SQL Pass Summit 2016... day 5


You might be thinking, where is day 1? The day 1 post is mostly about Seattle, if you are interested in that one, here it is:  SQL Pass Summit 2016... day 1.... just looking around