Char vs NChar
- What happens to your backup and restore process, will it be faster or slower, will the files be bigger if not compressed?
- What about when transferring the results to and from your database server, are the packets able to store the same number of characters.
- What about the amount of data on a page, what does this do to indexes and index lookups, how does it affect index maintenance?
Some examples of what I have seen stored in nchar and nvarchar when realy you shouldn't:
US addresses
Social Security Numbers (which were stored in plain text none the less)
Integer data (enforced by constraints or the app layer to make sure these were only digits)
CREATE TABLE TestChar (SomeCol char(10))
GO
CREATE TABLE TestNChar (SomeCol nchar(10))
GO
CREATE index ix_test on TestChar(SomeCol)
GO
CREATE index ix_test on TestNChar(SomeCol)
GO
INSERT TestChar
SELECT TOP 1000000 '1234567890'
FROM sys.sysobjects c1
CROSS JOIN sys.sysobjects c2
CROSS JOIN sys.sysobjects c3
CROSS JOIN sys.sysobjects c4
GO
INSERT TestNChar
SELECT TOP 1000000 '1234567890'
FROM sys.sysobjects c1
CROSS JOIN sys.sysobjects c2
CROSS JOIN sys.sysobjects c3
CROSS JOIN sys.sysobjects c4
GO
EXEC sp_spaceused 'TestChar'
EXEC sp_spaceused 'TestNChar'
62736 KB
Implicit conversions
SET SHOWPLAN_TEXT ON
GO
DECLARE @v varchar(10) = '0123456789'
SELECT * FROM TestChar WHERE SomeCol LIKE @v +'%'
GO
SET SHOWPLAN_TEXT OFF
GO
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Expr1009], [Expr1010]))
|--Compute Scalar(DEFINE:([Expr1008]=LikeRangeStart([@v]+'%'), [Expr1009]=LikeRangeEnd([@v]+'%'), [Expr1010]=LikeRangeInfo([@v]+'%')))
| |--Constant Scan
|--Index Seek(OBJECT:([Performance].[dbo].[TestChar].[ix_test]), SEEK:([Performance].[dbo].[TestChar].[SomeCol] > [Expr1008] AND [Performance].[dbo].[TestChar].[SomeCol] < [Expr1009]), WHERE:([Performance].[dbo].[TestChar].[SomeCol] like [@v]+'%') ORDERED FORWARD)
DROP INDEX TestChar.ix_test
GO
ALTER TABLE TestChar ALTER COLUMN SomeCol nchar(10)
GO
CREATE INDEX ix_test on TestChar(SomeCol)
GO
SET SHOWPLAN_TEXT ON
GO
DECLARE @v varchar(10) = '0123456789'
SELECT * FROM TestChar WHERE SomeCol LIKE @v +'%'
GO
SET SHOWPLAN_TEXT OFF
GO
|--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1008], [Expr1009], [Expr1010]))
|--Compute Scalar(DEFINE:([Expr1008]=LikeRangeStart(CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0)), [Expr1009]=LikeRangeEnd(CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0)), [Expr1010]=LikeRangeInfo(CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0))))
| |--Constant Scan
|--Index Seek(OBJECT:([Performance].[dbo].[TestChar].[ix_test]), SEEK:([Performance].[dbo].[TestChar].[SomeCol] > [Expr1008] AND [Performance].[dbo].[TestChar].[SomeCol] < [Expr1009]), WHERE:([Performance].[dbo].[TestChar].[SomeCol] like CONVERT_IMPLICIT(nvarchar(11),[@v]+'%',0)) ORDERED FORWARD)
SET SHOWPLAN_TEXT ON
GO
DECLARE @v nvarchar(10) = '0123456789'
SELECT * FROM TestChar WHERE SomeCol LIKE @v +'%'
GO
SET SHOWPLAN_TEXT OFF
GO
|--Compute Scalar(DEFINE:([Expr1008]=LikeRangeStart([@v]+N'%'), [Expr1009]=LikeRangeEnd([@v]+N'%'), [Expr1010]=LikeRangeInfo([@v]+N'%')))
| |--Constant Scan
|--Index Seek(OBJECT:([Performance].[dbo].[TestChar].[ix_test]), SEEK:([Performance].[dbo].[TestChar].[SomeCol] > [Expr1008] AND [Performance].[dbo].[TestChar].[SomeCol] < [Expr1009]), WHERE:([Performance].[dbo].[TestChar].[SomeCol] like [@v]+N'%') ORDERED FORWARD)
Using larger datatypes when it is not needed
Storage size is 1 byte. Integer data from 0 through 255.
Storage size is 2 bytes. Integer data from -2^15 (-32,768) through 2^15 - 1 (32,767).
Storage size is 4 bytes. Integer data from -2^31 (-2,147,483,648) through 2^31 - 1 (2,147,483,647).
Storage size is 8 bytes. Integer data from -2^63 (-9,223,372,036,854,775,808) through 2^63-1 (9,223,372,036,854,775,807).
Now imagine facebook with a billion users decided to use bigint as CountryID in their Country table, this key is then uses as a foreign key in the user demographics table. This is wasteful,either use a smallint since we won't go through 32 thousand countries in the foreseeable feature or use the 2 or 3 character ISO code.
The problem is even worse if you have a compound 6 column key and it is used as a foreign key in tons of other tables...that was real fun to clean up....use a surrogate 1 column key in that case...but be sure to test....normalize till it hurts then denormalize till it works....I will cover normalization in another post...just wanted to mention it