Monday, January 15, 2018

When uncompressing data, make sure to use the correct function



SQL Server has offered data compression for a while now, you could either use PAGE compression or ROW compression, you had no ability so specify which columns.  That changed in SQL Server 2016

SQL Server 2016 added the ability to compress data yourself by using the COMPRESS function. The COMPRESS function compresses the input expression using the GZIP algorithm. The result of the compression is byte array of type varbinary(max).


Let's take a quick look at how this function works

We will create a table, insert one row and then update the SomeOtherColumn value by using the COMPRESS function on the SomeColumn column

CREATE TABLE test (Somecolumn varchar(200), SomeOtherColumn varbinary(max))
go
INSERT test VALUES ('aaaaaaaaaaaaa', null)


UPDATE test
SET SomeOtherColumn = COMPRESS(Somecolumn)
go

SELECT Somecolumn,SomeOtherColumn 
FROM test

Here is what the output looks like


What if if told you now to create a new table by uncompressing the data via a SELECT INTO query followed by dropping the old table

No problem, if there is a COMPRESS function then surely there must be an UNCOMPRESS function... right?  Let's find out......


SELECT Somecolumn,CAST( UNCOMPRESS(SomeOtherColumn) AS varchar(max))  AS SomeOtherColumn
INTO SomeNewTable
FROM test

DROP TABLE test

SELECT *
FROM SomeNewTable


Wrong, while the UNCOMPRESS function exists, it is not the correct function

Hopefully you ran a test before just dropping a table after doing select into

The correct function is DECOMPRESS

Here is what it all looks like

SELECT Somecolumn,SomeOtherColumn,
 CAST( DECOMPRESS(SomeOtherColumn) AS varchar(max)) AS CastedDecompressed,
 DECOMPRESS(SomeOtherColumn) as Decompressed FROM test

 SELECT Somecolumn,SomeOtherColumn, 
CAST( UNCOMPRESS(SomeOtherColumn) AS varchar(max)) AS CastedUncompressed,
 UNCOMPRESS(SomeOtherColumn) as Uncompressed FROM test

This is the output



As you can see the UNCOMPRESS function returns a bunch of nonsense while the DECOMPRESS function works as expected

So the question is really wth is the uncompress function?