Monday, January 20, 2020

TVPs vs Memory Optimized TVPs


The other day I was thinking about the blog post Faster temp table and table variable by using memory optimization I read a while back. Since you can't believe anything on the internets (no disrespect to whoever wrote that post) , I decided to take this for a test

In this post I will be creating 2 databases, one is a plain vanilla database and the other, a database that also has a file group that contains memory optimized data

I will also be creating a table type in each database, a plain one and a memory optimized one in the memory optimized database

So lets get started, first I will create the regular database and the regular table type

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
CREATE DATABASE TempTVP
GO

USE TempTVP
GO

CREATE TYPE dbo.DataProcessingType AS TABLE(
 SomeDate datetime NOT NULL,
 SomeSymbol varchar(40) NOT NULL,
 SomeValue numeric(24, 10) NOT NULL,
 SomeDescription varchar(100),
 index tvp_temp (SomeDate, SomeSymbol))
GO


Now I will create the memory optimized database and the memory optimized table type
In order for the database to be able to use memory optimized code, you need to add a filegroup and tell SQL Server it contains memory optimized data, after that is created, you add a file to that file group.

The table type syntax is identical except for the line (WITH   (MEMORY_OPTIMIZED = ON);) at the end

Here is what the script looks like



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
CREATE DATABASE TempTVPHekaton
GO

USE TempTVPHekaton
GO


ALTER DATABASE [TempTVPHekaton] ADD FILEGROUP [Hekaton_Data] 
CONTAINS MEMORY_OPTIMIZED_DATA 
GO


ALTER DATABASE [TempTVPHekaton] ADD FILE (NAME='Hekaton_Data_file',
 FILENAME='C:\Data\ekaton_Data_file.mdf') TO FILEGROUP Hekaton_Data;
GO

CREATE TYPE dbo.DataProcessingType AS TABLE(
 SomeDate datetime NOT NULL,
 SomeSymbol varchar(40) NOT NULL,
 SomeValue numeric(24, 10) NOT NULL,
 SomeDescription varchar(100),
 index tvp_temp (SomeDate, SomeSymbol))
  WITH   (MEMORY_OPTIMIZED = ON); 
GO

Now that we have our two database, lets create a very simple stored proc in each database, all it does is store the row count from the table valued parameter passed in into a variable


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
CREATE PROCEDURE prTestTVP @tvp DataProcessingType readonly

AS

SET NOCOUNT ON

DECLARE @Count int

SELECT @Count = COUNT(*) FROM @tvp
GO

Now it is time to generate the test script

The text script will call the stored procedure 1000 times passing in a table valued parameter
The test script will populate the table type with 1000 rows, the data looks like this


That data is pushed into the table valued parameter, the proc is called, the table type is cleared out and every 100 iterations the current iteration will be printed

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
SET NOCOUNT ON
DECLARE @LoopID int = 1

WHILE @LoopID <= 1000
BEGIN
 DECLARE @tvp DataProcessingType
 INSERT @tvp -- add some values
  SELECT DATEADD(d,number,'20200101') as SomeDate,
  'X' + STR(number) + STR(@LoopID) as SomeSymbol,
   number * @LoopID * 1.11 as SomeValue,
   LEFT(REPLICATE(number,100),100) as SomeDescription
 FROM master..spt_values
  WHERE type = 'p' -- only numbers
  and number < 1000
 ORDER BY NEWID() --pseudo-random sort


 EXEC prTestTVP @tvp -- CALL proc with 1000 rows
 
 DELETE @tvp -- delete the data since it will be populated again

  if @LoopID %100 = 0 -- print every 100 iterations
  PRINT STR(@LoopID)
 SET @LoopID += 1 -- add 1 to counter

END


What I did now is take the code, I then pasted the code in 2 different SSMS windows and connected to the TempTVP database, I then executed the code in both windows and let it run.  Once it was finished, I noted down how long it took and then changed the connections to the database TempTVPHekaton which is memory optimized and ran the code in both windows as well. I played around with loops of 100, 1000, 2000, I played around as well by populating the table with rows between 1000 and 2048

Here are some of the results



DB Storage Iterations * rows Percentage of time
Disk 1000 * 1000 85.37%
Memory 1000 * 1000 14.63%
Disk 1500 * 1000 76.36%
Memory 1500 * 1000 23.64%
Disk 5000 * 100 92.31%
Memory 5000 * 100 7.69%


So it looks like it is at least 4 times faster, if the table is smaller and you have more iterations, it gets even faster


I did run into an issue while testing, if I made it execute 5000 times with a 2000 rows table.. I was greeted by the following error

Msg 701, Level 17, State 154, Procedure prTestTVP, Line 7 [Batch Start Line 0]
There is insufficient system memory in resource pool 'default' to run this query.

This code was running on a laptop where I had 40 tabs open in chrome so there was not a lot of free memory, I also didn't create a resource pool, everything was a default setup

If you look at the code you will see that I clear out the table after each iteration.

However the table variable doesn't get out of scope until the loop is finished. In my real time scenario, I don't have this issue, my procs are called by many processes but not in a loop

To read more about this error start here

Be aware of 701 error if you use memory optimized table variable in a loop

This is actually by-design behavior documented in “Memory-Optimized Table Variables”).  Here is what is state “Unlike memory-optimized tables, the memory consumed (including deleted rows) by table variables is freed when the table variable goes out of scope)”.  With a loop like above, all deleted rows will be kept and consume memory until end of the loop.


There you go.. if you are using table types, switching them to in memory table types might help your application perform better. But of course as I said before...  since you can't believe anything on the internets, test for yourself