Wednesday, September 26, 2018

A little less hate for: String or binary data would be truncated in table


The error String or binary data would be truncated is one of the more annoying things in SQL Server. You would insert some data and you would get back the error String or binary data would be truncated. Then you would have to spend a good amount of time to see what caused the error.

I even posted about this as part of  T-SQL Tuesday #86: String or binary data would be truncated

I read the SQL Server 2019 CTP 2 whitepaper and on page 17 it has the following

Improve truncation message for ETL DW scenarios—the error message ID 8152 String or binary data would be truncated is familiar to many SQL Server developers and administrators who develop or maintain data movement workloads; the error is raised during data transfers between a source and a destination with different schemas when the source data is too large to fit into the destination data type. This error message can be time-consuming to troubleshoot because of its lack of specificity. SQL Server 2019 introduces a new, more specific error message for this scenario: ID 2628 String or binary data would be truncated in table '%.*ls', column '%.*ls'. Truncated value: '%.*ls'. The new error message provides more context for the problem, simplifying the troubleshooting process. So that it cannot break existing applications parsing message ID 8152, this new message ID 2628 is an opt-in replacement, which can be enabled with trace flag 460. 


Oh really... they fixed this? Let's take a look

First I downloaded SQL Server 2019 CTP 2 and installed it. Then I created a database with a simple table, I also inserted some data that wouldn't fit

CREATE DATABASE truncatetest
GO

USE truncatetest
GO

CREATE TABLE TruncateMe(somevalue varchar(5),somevalue2 varchar(5), somedecimal decimal(5,2))
GO

INSERT TruncateMe
SELECT '333333','444444',5.3

I then received the following error message, so this is the same as in SQL Server 2018 and earlier, notice message id 8152

Msg 8152, Level 16, State 30, Line 10
String or binary data would be truncated.
The statement has been terminated.

To enable the new functionality, we need to enable trace flag 460, you can do that by running the DBCC TRACEON command like this

DBCC TRACEON(460)

Now let's try that insert statement again

INSERT TruncateMe
SELECT '333333','444444',5.3

And there we go, you get the table name, the column name as well as the value, notice that the message id changed from 8152 to 2628 now

Msg 2628, Level 16, State 1, Line 20
String or binary data would be truncated in table 'truncatetest.dbo.TruncateMe', column 'somevalue'. Truncated value: '33333'.
The statement has been terminated.


So it looks it only returns the first value that generates the error, let's change the first value to fit into the column and execute the insert statement again

INSERT TruncateMe
SELECT '3','444444',5.3

Now you will see that the error is for the somevalue2 column


Msg 2628, Level 16, State 1, Line 27
String or binary data would be truncated in table 'truncatetest.dbo.TruncateMe', column 'somevalue2'. Truncated value: '44444'.
The statement has been terminated.



What will happen if you have more than one row that fails?

insert TruncateMe
select  '333333','444444',5.3 union all
select '3','444444',5.3

Here is the error

Msg 2628, Level 16, State 1, Line 37
String or binary data would be truncated in table 'truncatetest.dbo.TruncateMe', column 'somevalue'. Truncated value: '33333'.
The statement has been terminated.

As you can see the error is only for the first row, not the second one

What about Table Variables, will you also get an error with the column and value like for real tables?

declare  @table table (somevalue varchar(5),somevalue2 varchar(5), somedecimal decimal(5,2))


insert @table
select '333333','444444',5.3


Here is the error

Msg 2628, Level 16, State 1, Line 53
String or binary data would be truncated in table 'tempdb.dbo.#A6AD698B', column 'somevalue'. Truncated value: '33333'.

As you can see you also get the error, the table name is the internal table name for the table variable tied to your session


What about Table Valued Parameters?

CREATE TYPE TestTypeTrunc AS TABLE   
( somevalue varchar(5),somevalue2 varchar(5));  
GO

DECLARE @table TestTypeTrunc
INSERT @table
SELECT '555555555','444444'

Here is the error for that 

Msg 2628, Level 16, State 1, Line 58
String or binary data would be truncated in table 'tempdb.dbo.#AC6642E1', column 'somevalue'. Truncated value: '55555'.


Pretty much the same message as for the table variable


I do applaud Microsoft for finally fixing this.

Here it is also in beautiful technicolor





Monday, September 24, 2018

First public preview of SQL Server 2019 is available for download



The first public preview of SQL Server 2019 is available for download

Download it here https://www.microsoft.com/en-us/sql-server/sql-server-2019

Aaron Bertrand blogged about some of the new stuff here: https://sqlperformance.com/2018/09/sql-server-2019/first-public-preview

No new T-SQL that stands out, but that might come in a future CTP

You can install this CTP on Windows, Linux, Docker and Kubernetes

Here is also a small video


Wednesday, May 16, 2018

Azure SQL Data Warehouse, Azure SQL Database and SQL Server sessions from the Build 2018 conference



The Build 2018 conference happened last week, I decided to put all the Azure SQL Data Warehouse, Azure SQL Database and SQL Server sessions from the Build 2018 conference into one place.

Here are the youtube videos, the descriptions as well as download links to video, audio and slides where available

Enjoy



Looking ahead with SQL Operations Studio

Join the SQL Operations Studio Engineering team as we highlight our journey towards our upcoming release announcement. We will cover major features delivered since our Public Preview announcement including developing our Extensibility story through the extension marketplace and extension API’s, improving dashboard experiences, and experimenting with deeper community involvement to help deliver this awesome cross-platform tool. You may even hear special announcements and key roadmap initiatives that have yet to be shared.



You can also download the sessions in different formats here:

MP3 (32.2 MB)
Low Quality MP4 (41.4 MB)
High Quality MP4 (87.9 MB)
Mid Quality MP4 (79.2 MB)



Query Processing Innovations for data intensive, modern applications 

Learn how your application can benefit from new capabilities in the Azure SQL Database and SQL Server platform, including graph data processing to model complex relationships between objects, and advanced self-tuning query processing to solve or avoid performance related problems.

 

You can also download the sessions in different formats here:

MP3 (72.3 MB)
Low Quality MP4 (81.6 MB)
High Quality MP4 (207.6 MB)
Mid Quality MP4 (157.8 MB)


Migrating and modernizing your data estate to Azure with Data Migration Services

Come learn all about Database Migration Service, and how anyone can utilize this service to migrate and modernize your data estate to fully managed services in Azure



You can also download the sessions in different formats here: 


MP3 (76.0 MB)
Low Quality MP4 (82.9 MB)
High Quality MP4 (208.0 MB)
Mid Quality MP4 (160.5 MB)


Azure SQL Database the intelligent database – Your database on Autopilot

Come learn how Azure SQL DB, the most intelligent cloud database, uses machine learning and best practices to ensure your database is always performing at its best. Come learn about features like Adaptive Query Processing, Autotuning and Performance Recommendations, to see how Azure SQL Database can help you spend more time developing applications and less time managing your databases.



You can also download the sessions in different formats here: 


MP3 (70.6 MB)
Low Quality MP4 (80.8 MB)
High Quality MP4 (201.3 MB)
Mid Quality MP4 (155.6 MB)



Azure SQL DB Managed Instances - Built to easily modernize application data layer

Managed Instances is the latest fully managed deployment model for Azure SQL Database that enables friction-free migration for SQL Server applications running on-premises. Come and learn why Managed Instance is the best PaaS destination for all SQL Server workloads and how to start your cloud modernization at scale now, using Azure SQL Database Managed Instances.



You can also download the sessions in different formats here: 


MP3 (72.8 MB)
Low Quality MP4 (83.6 MB)
High Quality MP4 (207.9 MB)
Mid Quality MP4 (158.2 MB)


Build Intelligent Apps with the Microsoft Data & AI Platform

Description Join Rohan Kumar, Corporate Vice President of Data Platform, to learn how Microsoft provides the most comprehensive data platform for your modern, intelligent applications. Come see our latest innovations that enable you to easily modernize and provide new, differentiated experiences in your applications. Hear from customers like dv01, Finastra, Jet.com and Rubikloud as they share their stories of how they’re each transforming their industries by leveraging Microsoft’s data platform.



You can also download the sessions in different formats here: 


MP3 (73.4 MB)
Low Quality MP4 (97.1 MB)
High Quality MP4 (286.7 MB)
Mid Quality MP4 (209.8 MB)


Develop scalable analytical solutions with Azure Data Factory & Azure SQL Data Warehouse

In this session you will learn how to develop data pipelines in Azure Data Factory and build a Cloud-based analytical solution adopting modern data warehouse approaches with Azure SQL Data Warehouse and implementing incremental ETL orchestration at scale. With the multiple sources and types of data available in an enterprise today Azure Data factory enables full integration of data and enables direct storage in Azure SQL Data Warehouse for powerful and high-performance query workloads which drive a majority of enterprise applications and business intelligence applications.

 

You can also download the sessions in different formats here: 



MP3 (55.1 MB)
Low Quality MP4 (61.7 MB)
High Quality MP4 (151.2 MB)
Mid Quality MP4 (116.3 MB)



Enhancing DevOps with SQL Server on Linux + containers

SQL Server 2017 runs now on both Linux and containers. In this session you will learn how SQL Server 2017 supports Continuous Deployment/Continuous Integration (CD/CI) and other DevOps processes. You will also learn the latest on cross-platform tooling, APIs, and how to develop apps with SQL Server running anywhere. We will also demo how easy it is to use SQL Server in your modern database development workflows.

 


You can also download the sessions in different formats here: 



MP3 (58.7 MB)
Low Quality MP4 (64.1 MB)
High Quality MP4 (162.3 MB)
Mid Quality MP4 (121.0 MB)



Modernizing Mission-Critical Apps with SQL Server

Learn about the features that can help you modernize your mission critical applications, where security and performance can go hand in hand. From the wide range of SQL Server features available, we will take a closer look at In-Memory performance, Automatic Tuning, Advanced Security Features like Always Encrypted, Polybase and integration with Machine Learning through R and Python.

 


You can also download the sessions in different formats here: 



MP3 (70.1 MB)
Low Quality MP4 (82.2 MB)
High Quality MP4 (231.5 MB)
Mid Quality MP4 (169.6 MB)

Slides
View Slides Online


Securing your data with Azure SQL DB

Come learn about new security features like Vulnerability Assessment, Information Protection, Thread Detection and Always Encrypt to see how Azure SQL Database is securing your data in the most secure database on the planet.

 


You can also download the sessions in different formats here: 



MP3 (73.3 MB)
Low Quality MP4 (85.5 MB)
High Quality MP4 (238.8 MB)
Mid Quality MP4 (178.0 MB)

Slides
View Slides Online

Monday, February 19, 2018

Webinar: SQL Server on Linux tools: SQL Server Integration Services


Microsoft is hosting a SQL Server on Linux tools: SQL Server Integration Services webinar/town hall on March 1st, 2018 10:00-11:00 AM Pacific Time (UTC-5)

In this session, They will demonstrate how easy it is to install, run and monitor SSIS on Linux. They will also explain how you can use ODBC to connect common data sources on Linux. Specifically, they will cover:


  • Current system architecture
  • Supported/unsupported features
  • A live demo of SSIS installation on Linux
  • A live demo of how to use ODBC to connect data sources like MySQL and Oracle
  • Future roadmap of SSIS on Linux and other platforms


If you are interested in SSIS running on Linux, make sure to register for this webinar

The link to register is here: https://info.microsoft.com/sql-server-on-linux-town-hall-sql-server-integration-services-registration.html

Reinventing the wheel

It has been a while since I wrote some of my best practices posts. I decided to revisit these posts again to see if anything has changed, I also wanted to see if I could add some additional info.

In this post  we are going to look at something called reinventing the wheel. Just in case your are not familiar with this metaphor or maybe you are not a native English speaker, I will use wikipedia's description of what reinventing the wheel means.
To reinvent the wheel is to duplicate a basic method that has already previously been created or optimized by others.
The inspiration for this idiomatic metaphor lies in the fact that the wheel is the archetype of human ingenuity, both by virtue of the added power and flexibility it affords its users, and also in the ancient origins which allow it to underlie much, if not all, of modern technology. As it has already been invented, and is not considered to have any operational flaws, an attempt to reinvent it would be pointless and add no value to the object, and would be a waste of time, diverting the investigator's resources from possibly more worthy goals which his or her skills could advance more substantially.



So now that you have read the paragraph above, how many times did you write some code only to find out that it already exists in the language as part of some library or function?. How many times have you written code that you could have easily grabbed from GitHub, CodePlex and other repositories for your own use?


Why write your own solution when you can use something that is robust and tested?


To start let's take a look at the GitHub repositories mentioned in this post: Five great SQL Server GitHub repos that every SQL Server person should check out
You will find code that does index maintenance, helps you with performance issues, setup and more. Check out that post for more details


Find out who the community leaders are for a particular skill set that you are interested in, start following these people, follow them on twitter, subscribe to their blogs and podcasts. Go to their presentations, talk to them, find out what they use, find out if they have made code available for the public to use. You will find out that a good percentage of these people have made available a whole bunch of libraries, stored procedures, functions, maintenance routines and much more for you to use and it is all free.
Don't be scared to ask for help on twitter, if you don't know any of the SQL Server tweeple, use the #sqlhelp hash tag and ask for help, here is an example of what it looks like #sqlhelp
Here is an image of the replies on twitter after I asked a question with the #sqlhelp tag


Besides twitter, you can also use slack. I like slack more because you are not limited to 280 characters. Here is the link to the relevant slack channel: https://sqlcommunity.slack.com/messages/C1MS1RA4B/

Here is a screen shot of what it looks like

That looks a little better than twitter don't you think?


Some commercial firms will also have community editions of code and tools for you to use. Take advantage of this, these are great, if you like the tools then maybe you will find a need for the pro editions, these have more bells and whistles and are not limited.

Some examples of available solutions:

SQL Server activity
Want to know what is going on right now? Try Adam Machanic's procedure Who Is Active

Execution Plans
Check out SentryOne's  Plan Explorer. This plan explorer does much more than the one that comes with SQL Server Management Studio

SQL Search and other tools
Red Gate has a bunch of free tool, you can get those here https://www.red-gate.com/products/free-tools. I started to use Red Gate's tools back in 2003, SQL Compare is the one I used the most. SQL Search is free and if you need to find anything in your DB it is invaluable.
Idera free tools
Idera has a bunch of free tools available for download, you can find those all here: https://www.idera.com/productssolutions/freetools

Get involved

If you have created some cool code and you know there is nothing similar, why now give back to the community? Put it out there, solicit feedback and in the end the code will be better because more eyes will have looked at it. Accept contributions as well. All of these things will make the community as a whole grow, if the community grows then the platform will grow as well. When the platform grows, this means there will be more demand for someone with your skill set. You are responsible that your community doesn't turn into a ghost town.

Monday, January 15, 2018

When uncompressing data, make sure to use the correct function



SQL Server has offered data compression for a while now, you could either use PAGE compression or ROW compression, you had no ability so specify which columns.  That changed in SQL Server 2016

SQL Server 2016 added the ability to compress data yourself by using the COMPRESS function. The COMPRESS function compresses the input expression using the GZIP algorithm. The result of the compression is byte array of type varbinary(max).


Let's take a quick look at how this function works

We will create a table, insert one row and then update the SomeOtherColumn value by using the COMPRESS function on the SomeColumn column

CREATE TABLE test (Somecolumn varchar(200), SomeOtherColumn varbinary(max))
go
INSERT test VALUES ('aaaaaaaaaaaaa', null)


UPDATE test
SET SomeOtherColumn = COMPRESS(Somecolumn)
go

SELECT Somecolumn,SomeOtherColumn 
FROM test

Here is what the output looks like


What if if told you now to create a new table by uncompressing the data via a SELECT INTO query followed by dropping the old table

No problem, if there is a COMPRESS function then surely there must be an UNCOMPRESS function... right?  Let's find out......


SELECT Somecolumn,CAST( UNCOMPRESS(SomeOtherColumn) AS varchar(max))  AS SomeOtherColumn
INTO SomeNewTable
FROM test

DROP TABLE test

SELECT *
FROM SomeNewTable


Wrong, while the UNCOMPRESS function exists, it is not the correct function

Hopefully you ran a test before just dropping a table after doing select into

The correct function is DECOMPRESS

Here is what it all looks like

SELECT Somecolumn,SomeOtherColumn,
 CAST( DECOMPRESS(SomeOtherColumn) AS varchar(max)) AS CastedDecompressed,
 DECOMPRESS(SomeOtherColumn) as Decompressed FROM test

 SELECT Somecolumn,SomeOtherColumn, 
CAST( UNCOMPRESS(SomeOtherColumn) AS varchar(max)) AS CastedUncompressed,
 UNCOMPRESS(SomeOtherColumn) as Uncompressed FROM test

This is the output



As you can see the UNCOMPRESS function returns a bunch of nonsense while the DECOMPRESS function works as expected

So the question is really wth is the uncompress function?


Friday, December 29, 2017

The 15 most popular posts in 2017


Another year is almost over. As the last post of this year I decided to share with you the 15 most popular posts in 2017. I just used Google Analytics to grab this info, I excluded the home page from the top 15.  Some of these post are so old, we didn't even have windowing functions in SQL Server when these were written.,....

Here they are in order of popularity

01. Ten SQL Server Functions That You Hardly Use But Should
A post from 2007 showing some hardly used functions like NULLIF, PARSENAME and STUFF

02. Five Ways To Return Values From Stored Procedures
A very old post that shows you five ways to return values from a stored proc

03. Your lack of constraints is disturbing
A post showing the type of constraints available in SQL Server with examples

04.  Use T-SQL to create caveman graphs
One of the shortest post on this site, show you how you can make visually appealing output with a pipe symbol

05. Convert Millisecond To "hh:mm:ss" Format
A very old post showing you how to convert from milliseconds to "hh:mm:ss" format

06. T-SQL Tuesday #92, Lessons learned the hard way
Some of my mistakes put together in 1 post

07. SQL Server 2017: SQL Graph
Me messing around with the Graph functionality in SQL Server 2017

08. Query Analyzer Trick
A very old post explaining how you can drag the columns into the query window from object explorer. I think by now everyone knows this one right?

09. ISO-11179 Naming Conventions
A very old post linking to the ISO-11179 Naming Conventions document

10. Not sure that I like the message from Online Resumable Index Rebuild in SQL Server 2017
After playing qround with resumable index rebuilds I think the kill state and severe error occurred is a little over the top

11. Some cool SQL Server announcements SQL Graph, Adaptive Query Plan, CTP1 of SQL vNext, SQL Injection detection
This is my recap of the chalkboard session with the SQL Server team at the SQL Server PASS summit in Seattle.

12.  Five great SQL Server GitHub repos that every SQL Server person should check out
What the titles says, 5 GitHub repos you need to use

13. Chaos Isolation Level In SQL Server
This is linked from  dba.stackexchange.com, it is kind of interesting because it was a silly post I made

14. T-SQL Tuesday #86: String or binary data would be truncated
A pet peeve of many people

15. Microsoft releases SQL Server Horizontica, a Vertica killer
Probably the best version of SQL Server ever created, was only available for download for 24 hours.


That is all for this year... see you in 2018...  the year of Linux on the desktop, DevOps, Ethereum, AR/VR, bendable phones and much more......

Out of focus Christmas tree,

Wednesday, November 29, 2017

Use T-SQL to create caveman graphs



I found this technique on Rich Benner's SQL Server Blog: Visualising the Marvel Cinematic Universe in T-SQL and decided to play around with it after someone asked me to give him the sizes of all databases on a development instance of SQL Server

The way it works is that you take the size of the database and then divide that number against the total size of all databases. You then use the replicate function with the | (pipe) character to generate the 'graph'  so 8% will look like this ||||||||

You can use this for tables with most rows, a count per state etc etc. By looking at the output the graph column adds a nice visual effect to it IMHO


Here is what the final query looks like


SELECT 
       database_name = DB_NAME(database_id) 
     , total_size_GB = CAST(SUM(size) * 8. / 1024/1024 AS DECIMAL(30,2))
  , percent_size = (CONVERT(decimal(30,4),(SUM(size) /
     (SELECT SUM(CONVERT(decimal(30,4),size))  
     FROM sys.master_files WITH(NOWAIT)))) *100.00)
  , graph = replicate('|',((convert(decimal(30,2),(SUM(size) / 
    (SELECT SUM(CONVERT(decimal(30,2),size))  
     FROM sys.master_files WITH(NOWAIT)))) *100)))
FROM sys.master_files WITH(NOWAIT)
GROUP BY database_id
ORDER BY 3 DESC

And here is the output (I blanked out the DB name in the output below), there are 48 databases, 15 of them show a bar, the rest don't because they use less than 0.5% of space.


Do you see how you can quickly tell visually that the top DB is about twice as large as the next DB? 

Those guys in Lascaux would have been so proud, only if they could see this  :-)



Wednesday, October 25, 2017

How to update 2 tables with 1 statement in SQL Server..the hard way....



Every now and then you will get someone asking how to update two tables with one statement in SQL Server. The answer is usually, no that is not possible... the person then walks away muttering something about how MySQL allows it.

So I decided to try to see if I could update two tables with one statement. I decided to try a couple of different things


  • view
  • common table expression
  • indexed view
  • instead of trigger

In order to begin we need two tables, each table will have one row of data so we can update those rows

CREATE TABLE test1(id int primary key, someVal char(1) not null)
CREATE TABLE test2(id int primary key, someVal char(1) not null)

INSERT test1 VALUES(1,'a')
INSERT test2 VALUES(1,'a')
Go

Now we can start with plan A... the mighty view

CREATE VIEW Testview1
AS
SELECT t1.*,t2.SomeVal as SomeVal2
FROM test1 t1
JOIN test2 t2 on t1.id = t2.id

Running a simple select against the view

SELECT * FROM testview1

id someVal SomeVal2
1 a         a

Time to update the view

UPDATE testview1 SET SomeVal = 'b', SomeVal2 = 'b'
WHERE id = 1


Msg 4405, Level 16, State 1, Line 1
View or function 'testview1' is not updatable because the modification affects multiple base tables.

As you can see that didn't work since even though you are updating one view, you are still trying to update two tables.

Time to implement plan B... the versatile common table expression

Since you can update a common table expression, can you update a common table expression if it updates more than one table?  Let's try it out

;WITH cte AS (SELECT t1.*,t2.SomeVal as SomeVal2
FROM test1 t1
JOIN test2 t2 on t1.id = t2.id)

UPDATE cte SET SomeVal = 'b', SomeVal2 = 'b'
WHERE id = 1

Msg 4405, Level 16, State 1, Line 1
View or function 'cte' is not updatable because the modification affects multiple base tables.

So plan B ended like plan A... also pretty much the same error message


Plan C.... the mighty indexed view.

If you ever tried working with indexed views you are probably busy cursing at the moment.
Let's create this view and add an index

CREATE VIEW testviewIndexed 
WITH SCHEMABINDING  
AS  
SELECT t1.id, t1.someVal, t2.SomeVal as SomeVal2
FROM dbo.test1 t1
JOIN dbo.test2 t2 on t1.id = t2.id
GO


CREATE UNIQUE CLUSTERED INDEX IDX_V1   
    ON testviewIndexed (id);  

Fingers crossed....but if the regular view didn't work..why would this?

UPDATE testviewIndexed SET SomeVal = 'b', SomeVal2 = 'b'
WHERE id = 1

Msg 4405, Level 16, State 1, Line 2
View or function 'testviewIndexed' is not updatable because the modification affects multiple base tables.

That is right... plan C is also bad, very bad.

So we are left with one more thing... plan D (as in Denis)....

We will now use an instead of trigger on the regular view from before

CREATE TRIGGER InsteadTrigger on testview1
INSTEAD OF UPDATE
AS
BEGIN

 UPDATE t  SET t.SomeVal = i.SomeVal
 FROM INSERTED i JOIN test1 t on i.id = t.id

 UPDATE t  SET t.SomeVal = i.SomeVal2
 FROM INSERTED i JOIN test2 t on i.id = t.id
END

GO

Let's see what happens now.. fingers crossed

UPDATE testview1 SET SomeVal = 'b', SomeVal2 = 'b'
WHERE id = 1

No error, let's see what is in the table

SELECT * FROM test1
SELECT * FROM test2

id someVal SomeVal2
1 b         b

And as you can see, you can update two tables with one statement.

Should you really go through all this trouble because you don't want to do something like this?

BEGIN TRAN

 UPDATE test1 SET SomeVal = 'c'
 WHERE id = 1

 UPDATE test2 SET SomeVal = 'c'
 WHERE id = 1

COMMIT-- hmm where is the error checking/begin try xact_state()?

Nope I would not go this route,  if the table changes you now need to also update the trigger. What if someone drops the trigger? There are too many ways this can go wrong


Foreign Keys don't always need a primary key

In the post Your lack of constraints is disturbing we touched a little upon foreign key constraints but today we are going to take a closer look at foreign keys. The two things that we are going to cover are the fact that you don't need a primary key in order to define a foreign key relationship, SQL Server by default will not index foreign keys

You don't need a primary key in order to have a foreign key

Most people will define a foreign key relationship between the foreign key and a primary key. You don't have to have a primary key in order to have a foreign key, if you have a unique index or a unique constraint then those can be used as well.

Let's take a look at what that looks like with some code examples


A foreign key with a unique constraint instead of a primary key


First create a table to which we will add a unique constraint after creation

CREATE TABLE TestUniqueConstraint(id int)
GO
Add a unique constraint to the table

ALTER TABLE TestUniqueConstraint ADD CONSTRAINT ix_unique UNIQUE (id)
GO
Insert a value of 1, this should succeed

INSERT  TestUniqueConstraint VALUES(1)
GO

Insert a value of 1 again, this should fail

INSERT  TestUniqueConstraint VALUES(1)
GO

Msg 2627, Level 14, State 1, Line 2
Violation of UNIQUE KEY constraint 'ix_unique'. Cannot insert duplicate key in object 'dbo.TestUniqueConstraint'. The duplicate key value is (1).
The statement has been terminated.



Now that we verified that we can't have duplicates, it is time to create the table that will have the foreign key



CREATE TABLE TestForeignConstraint(id int)
GO
Add the foreign key to the table

ALTER TABLE dbo.TestForeignConstraint ADD CONSTRAINT
 FK_TestForeignConstraint_TestUniqueConstraint FOREIGN KEY 
(id) REFERENCES dbo.TestUniqueConstraint(id) 



Insert a value that exist in the table that is referenced by the foreign key constraint


INSERT TestForeignConstraint  VALUES(1)
INSERT TestForeignConstraint  VALUES(1)

Insert a value that does not exist in the table that is referenced by the foreign key constraint


INSERT TestForeignConstraint  VALUES(2)

Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "FK_TestForeignConstraint_TestUniqueConstraint". The conflict occurred in database "tempdb", table "dbo.TestUniqueConstraint", column 'id'.

The statement has been terminated.


As you can see, you can't insert the value 2 since it doesn't exist in the TestUniqueConstraint table



A foreign key with a unique index instead of a primary key

This section will be similar to the previous section, the difference is that we will use a unique index instead of a unique constraint
First create a table to which we will add a unique index after creation

CREATE TABLE TestUniqueIndex(id int)
GO

Add the unique index



CREATE UNIQUE NONCLUSTERED INDEX ix_unique ON TestUniqueIndex(id)
GO
Insert a value of 1, this should succeed


INSERT  TestUniqueIndex VALUES(1)
GO
Insert a value of 1 again , this should now fail


INSERT  TestUniqueIndex VALUES(1)
GO

Msg 2601, Level 14, State 1, Line 2
Cannot insert duplicate key row in object 'dbo.TestUniqueIndex' with unique index 'ix_unique'. The duplicate key value is (1).
The statement has been terminated.


Now that we verified that we can't have duplicates, it is time to create the table that will have the foreign key


CREATE TABLE TestForeignIndex(id int)
GO

Add the foreign key constraint

ALTER TABLE dbo.TestForeignIndex ADD CONSTRAINT
 FK_TestForeignIndex_TestUniqueIndex FOREIGN KEY 
 (id) REFERENCES dbo.TestUniqueIndex(id)  




Insert a value that exist in the table that is referenced by the foreign key constraint

INSERT TestForeignIndex  VALUES(1)
INSERT TestForeignIndex  VALUES(1)

Insert a value that does not exist in the table that is referenced by the foreign key constraint


INSERT TestForeignIndex  VALUES(2)


Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "FK_TestForeignIndex_TestUniqueIndex". The conflict occurred in database "tempdb", table "dbo.TestUniqueIndex", column 'id'.
The statement has been terminated.


That failed because you can't insert the value 2 since it doesn't exist in the TestUniqueIndex table

As you have seen with the code example, you can have a foreign key constraint that will reference a unique index or a unique constraint. The foreign key does not always need to reference a primary key

Foreign keys are not indexed by default

When you create a primary key, SQL Server will by default make that a clustered index. When you create a foreign key, there is no index created

Scroll up to where we added the unique constraint to the TestUniqueConstraint table, you will see this code

ALTER TABLE TestUniqueConstraint ADD CONSTRAINT ix_unique UNIQUE (id)

All we did was add the constraint, SQL Server added the index behind the scenes for us in order to help enforce uniqueness more efficiently

Now run this query below


SELECT OBJECT_NAME(object_id) as TableName,
name as IndexName, 
type_desc as StorageType
FROM sys.indexes
WHERE OBJECT_NAME(object_id) IN('TestUniqueIndex','TestUniqueConstraint')
AND name IS NOT NULL

You will get these results

TableName         IndexName StorageType
---------------------   -----------     --------------
TestUniqueConstraint ix_unique NONCLUSTERED
TestUniqueIndex         ix_unique NONCLUSTERED

As you can see both tables have an index

Now let's look at what the case is for the foreign key tables. Run the query below


SELECT OBJECT_NAME(object_id) as TableName,
name as IndexName, 
type_desc as StorageType
FROM sys.indexes
WHERE OBJECT_NAME(object_id) IN('TestForeignIndex','TestForeignConstraint')

Here are the results for that query

TableName       IndexName StorageType
--------------------- --------- -------------
TestForeignConstraint NULL HEAP
TestForeignIndex NULL HEAP




As you can see no indexes have been added to the tables. Should you add indexes? In order to answer that let's see what would happen if you did add indexes. Joins would perform faster since it can traverse the index instead of the whole table to find the matching join conditions. Updates and deletes will be faster as well since the index can be used to find the foreign keys rows to update or delete (remember this depends if you specified CASCADE or NO ACTION when you create the foreign key constraint)

I wrote about deletes being very slow because the columns were not indexed here: Are your foreign keys indexed? If not, you might have problems
So to answer the question, yes, I think you should index the foreign key columns