In the A challenge for 2016....... accepted! post I said that I would write a post once every month that would explain very basic stuff about SQL Server. Today is the first post, in this post we will take a look at how you can return rows from one table which do not exist in another table.
First we need to create our two tables, these tables will be very simple, each will have only 1 column
Now that our tables are set up it is time to look at the various queries.
NOT INThis is one of the simpler ways, it is almost a direct translation from English, select all from this table which is not in the other table
Here is what the query looks like
This is the output
The output is correct, our MainData table has only these values: 1,2,4,5 and 7. The SomeData table has the values 3, 6 and 8, these values do not exist in our MainData table. As you can see the NOT IN query is very simple and will work most of the time, there are however two scenarios where it could be problematic using NOT IN. We will look at these two scenarios at the end of this post.
NOT EXISTSUsing NOT EXISTS is very similar to NOT IN, one addition you have to make is adding a WHERE clause, you are in essence doing a JOIN condition but not returning anything that satisfied this condition. Here is what the query looks like
The results for this query are the same as the NOT IN query
LEFT JOINA LEFT JOIN query returns all the data from both table, it will return NULL values for the rows that don't have a matched in the outer joined table. If you run the following query
Here is what the results look like
ID2 ID ---- ----- 1 1 2 2 3 NULL 4 4 6 NULL 8 NULL
As you can see the rows with the values 3,6 and 8 in the 1st column have a NULL value in the 2nd column.
The LEFT JOIN query is more complex compared to NOT IN and NOT EXISTS. You also need to know that this query will return data from both tables, this is why we need to specify SELECT t2.*. I also specified t2.* in the other two queries, this was however not needed since NOT IN and NOT EXISTS only return data from one table. Like in the NOT EXISTS query, you also specify a JOIN condition in the LEFT JOIN query. Finally in the WHERE clause you are filtering out the data which does exists by asking for all the rows where t1.ID is NULL
Here is what the query looks like and the results are the same as for the NOT IN and the NOT EXISTS queries
EXCEPTUsing EXCEPT is very easy, the query is basically return every from one table EXCEPT what is returned by the bottom query. Here is what the query looks like
This query will return the same data as the other queries. Using EXCEPT is pretty simple and straightforward but I have to warn you, EXCEPT does a sort and is the worst performing query of all the ones mentioned here. I don't recommend using EXCEPT for any big tables
Some problems you might encounter when using NOT IN
Take a look at the query below, do you see anything wrong? Run the following query
You don't get any errors but the query returns nothing. The problem is that the MainData table does not contain a column named ID2, it is named ID. Since the ID2 table does exists in the SomeData table SQL Server does not throw an error. If you use LEFT JOIN or NOT EXISTS, you cannot make this mistake. Run the following two queries to see what happens
Here is the output
Msg 207, Level 16, State 1, Line 75 Invalid column name 'ID2'. Msg 207, Level 16, State 1, Line 79 Invalid column name 'ID2'.
As you can see you got an error.
NULL values will also cause a problem when using NOT IN. Add the following NULL value to the MainData table
If you run all the 3 different queries again, you will notice something
Here are the results
Not in ----------- Not Exists ----------- 3 6 8 Left Join ----------- 3 6 8 Except ----------- 3 6 8
Do you see what happened? The NOT IN query is not returning anything, a NULL value is not equal to anything, not even to another NULL value.
So that's it for this post, I showed you four ways to return values from one table which do not exists in another table. I also showed you why NOT IN might cause some problems. In general I like to use EXISTS and NOT EXISTS unless I need data from both table, in that case I will use a JOIN.
Till next month. If you want me to cover a topic leave me a comment.