Monday, February 20, 2017

Importing The Metropolitan Museum of Art's Open Access Initiative data into SQL Server

The Met just made some data available for download.

The Metropolitan Museum of Art presents over 5,000 years of art from around the world for everyone to experience and enjoy. The Museum lives in three iconic sites in New York City—The Met Fifth Avenue, The Met Breuer, and The Met Cloisters. Millions of people also take part in The Met experience online. 

Since it was founded in 1870, The Met has always aspired to be more than a treasury of rare and beautiful objects. Every day, art comes alive in the Museum's galleries and through its exhibitions and events, revealing both new ideas and unexpected connections across time and across cultures. The Metropolitan Museum of Art provides select datasets of information on more than 420,000 artworks in its Collection for unrestricted commercial and noncommercial use.


I decided to take a look. The first thing we will do is download the CSV file from their GitHub repository. You can find that here: https://github.com/metmuseum/openaccess

If you go to that link, grab the MetObjects.csv file.

Be warned, this file is 214 MB.

One word of caution: if you try to import this file with a regular BULK INSERT command, good luck, and let me know how many tries you need. No problem, I will just use a format file... and now you have 2 problems. The real issue is that the file is somewhat problematic: there are quotes where there shouldn't be quotes, and there are no quotes where there should be quotes. So what do you do?

Are you on SQL Server vNext 1.1 or higher? If you are, good news: you can use BULK INSERT with the CSV format, which is new in vNext 1.1.


OK, let's get started. First create the following table

CREATE TABLE MetOpenData(
ObjectNumber nvarchar(4000),
IsHighlight nvarchar(4000),
IsPublicDomain nvarchar(4000),
ObjectID nvarchar(4000),
Department nvarchar(4000),
ObjectName nvarchar(4000),
Title nvarchar(4000),
Culture nvarchar(4000),
Period nvarchar(4000),
Dynasty nvarchar(4000),
Reign nvarchar(4000),
Portfolio nvarchar(4000),
ArtistRole nvarchar(4000),
ArtistPrefix nvarchar(4000),
ArtistDisplayName nvarchar(4000),
ArtistDisplayBio nvarchar(4000),
ArtistSuffix nvarchar(4000),
ArtistAlphaSort nvarchar(4000),
ArtistNationality nvarchar(4000),
ArtistBeginDate nvarchar(4000),
ArtistEndDate nvarchar(4000),
ObjectDate nvarchar(4000),
ObjectBeginDate nvarchar(4000),
ObjectEndDate nvarchar(4000),
Medium nvarchar(4000),
Dimensions nvarchar(4000),
CreditLine nvarchar(4000),
GeographyType nvarchar(4000),
City nvarchar(4000),
State nvarchar(4000),
County nvarchar(4000),
Country nvarchar(4000),
Region nvarchar(4000),
Subregion nvarchar(4000),
Locale nvarchar(4000),
Locus nvarchar(4000),
Excavation nvarchar(4000),
River nvarchar(4000),
Classification nvarchar(4000),
RightsandReproduction nvarchar(4000),
LinkResource nvarchar(4000),
MetadataDate nvarchar(4000),
Repository nvarchar(4000))

GO

Now it is time to import the file.
Just to let you know, you will get a couple of errors; however, all data except for these 4 rows will be imported

Msg 4864, Level 16, State 1, Line 62
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 213266, column 25 (Medium).
Msg 4864, Level 16, State 1, Line 62
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 217661, column 25 (Medium).
Msg 4863, Level 16, State 1, Line 62
Bulk load data conversion error (truncation) for row 226222, column 16 (ArtistDisplayBio).
Msg 4863, Level 16, State 1, Line 62
Bulk load data conversion error (truncation) for row 258639, column 16 (ArtistDisplayBio).

Here is what the BULK INSERT with FORMAT= 'CSV' command looks like. Change the filepath to point to the location where you have the MetObjects.csv file saved

BULK INSERT MetOpenData
FROM 'c:\Data\MetObjects.csv'
WITH (FORMAT = 'CSV'); 
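
If you want more control over the load, BULK INSERT accepts further options next to FORMAT = 'CSV'. The following is only a sketch and has not been verified against this particular file: it assumes the file is UTF-8 encoded and starts with a header row, and the ERRORFILE path is a made-up location. There is no guarantee that it makes the four rejected rows load.

```sql
-- Sketch only: the same load with a few extra BULK INSERT options.
-- Assumptions: the file is UTF-8, it starts with a header row, and
-- c:\Data\MetObjects_errors.log is a hypothetical path that does not exist yet.
BULK INSERT MetOpenData
FROM 'c:\Data\MetObjects.csv'
WITH (
    FORMAT = 'CSV',
    FIELDQUOTE = '"',       -- quote character (this is also the default)
    FIRSTROW = 2,           -- skip the header row (assumption)
    CODEPAGE = '65001',     -- read the file as UTF-8 (assumption)
    MAXERRORS = 50,         -- tolerate more bad rows than the default of 10
    ERRORFILE = 'c:\Data\MetObjects_errors.log'  -- rejected rows are written here
);
```

The ERRORFILE is the useful part: it gives you the rejected rows in a file you can inspect instead of only the row numbers in the error messages.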

Let's do a quick count


SELECT COUNT(*) FROM MetOpenData

The file that I imported resulted in 446026 rows. I downloaded this file on 2/10/2017, so your file might have more data if they updated it after that date
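
Since every column was loaded as nvarchar(4000), nothing has been validated yet. A minimal sanity check could look like the sketch below. It assumes that ObjectID is meant to be a unique integer, which the source file does not promise.

```sql
-- Sketch: quick sanity checks on the staged data.
-- Assumption: ObjectID is supposed to be a unique integer per row.
SELECT COUNT(*) AS TotalRows,
       COUNT(DISTINCT ObjectID) AS DistinctObjectIDs,
       -- counts values that are NULL or cannot be converted to int
       SUM(CASE WHEN TRY_CONVERT(int, ObjectID) IS NULL THEN 1 ELSE 0 END) AS NullOrNonNumericObjectIDs
FROM MetOpenData;
```

If the first two numbers differ, or the last one is not 0, some rows probably shifted columns during the import.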


Now that we have the data we need, we can run some queries.
Let's see what kind of objects are in the collection. We will grab the 15 most common object names

 SELECT TOP 15  ObjectName,count(*)
 FROM MetOpenData
 GROUP BY ObjectName
 ORDER BY 2 DESC

Here is what the results look like


ObjectName Count
Print 88582
Photograph 28071
Drawing 24905
Book 13360
Fragment 9405
Piece 8638
Negative 6258
Painting 5862
Baseball card, print 4985
Bowl 3534
Figure 3081
Baseball card 3046
Polaroid 2706
Vase 2698
Dress 2473

I don't know why, but I somehow thought Painting would be the most frequently occurring object... but what do I know
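
To put those counts in perspective, here is a sketch that adds each object name's share of the whole table. The column aliases are my own; the window function SUM(COUNT(*)) OVER () totals the counts of all groups, not just the 15 that are returned.

```sql
-- Sketch: top 15 object names with their share of all rows.
SELECT TOP 15 ObjectName,
       COUNT(*) AS ObjectCount,
       CAST(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () AS decimal(5, 2)) AS PercentOfCollection
FROM MetOpenData
GROUP BY ObjectName
ORDER BY ObjectCount DESC;
```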

You can also treat this table as your own museum catalog. Let's say you want to look at van Gogh's Madame Roulin and Her Baby painting. No problem, run this query

 SELECT * FROM MetOpenData
 WHERE ArtistDisplayName like'%van%gogh%'
 and title = 'Madame Roulin and Her Baby'

Scroll to the LinkResource column, you will see the following: http://www.metmuseum.org/art/collection/search/459123

Clicking on that link takes you to the page for this painting on the Met website


Now you can download this image and do something with it; it is, after all, in the public domain
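
If you would rather not scroll through all 43 columns, you can name just the ones you care about. This is the same query as above with a column list; which columns to pick is of course up to you.

```sql
-- Same filter as before, but only the columns needed to find the object online.
SELECT Title, ArtistDisplayName, ObjectDate, Medium, IsPublicDomain, LinkResource
FROM MetOpenData
WHERE ArtistDisplayName LIKE '%van%gogh%'
AND Title = 'Madame Roulin and Her Baby';
```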

Here are a couple more queries you can play around with



 
 SELECT city, count(*)
 FROM MetOpenData
 GROUP BY city
 ORDER BY 2 DESC

 SELECT Dynasty, count(*)
 FROM MetOpenData
 GROUP BY Dynasty
 ORDER BY 2 DESC

 SELECT Period, count(*)
 FROM MetOpenData
 GROUP BY Period
 ORDER BY 2 DESC

 SELECT ArtistNationality, count(*)
 FROM MetOpenData
 GROUP BY ArtistNationality
 ORDER BY 2 DESC


 SELECT * FROM MetOpenData
 WHERE ArtistDisplayName like'%pablo picasso%'

 SELECT * FROM MetOpenData
 WHERE ArtistDisplayName like'%rembrandt van rijn%'
 
 SELECT * FROM MetOpenData
 WHERE ObjectName like'%Postage stamps%'

I am not a big art person, but if you are and you have some interesting queries that you ran against this data please let me know in the comments

Also if you manage to get this file to import with plain old BCP or BULK INSERT with or without a format file...let me know the magic you used.... :-)


Tuesday, January 17, 2017

Using bigint with FORMATMESSAGE



FORMATMESSAGE itself is not new, but SQL Server 2016 added the ability to pass it a message string directly instead of a message number. According to Books Online, FORMATMESSAGE constructs a message from an existing message in sys.messages or from a provided string. The functionality of FORMATMESSAGE resembles that of the RAISERROR statement. However, RAISERROR prints the message immediately, while FORMATMESSAGE returns the formatted message for further processing.

So let's take a look at this functionality. Run the following


SELECT FORMATMESSAGE('Signed int %i, %i', 50, -50) 
SELECT FORMATMESSAGE('Unsigned int %u, %u', 50, -50); 

Here is the output if you run that

--------------------------------------------
Signed int 50, -50
Unsigned int 50, 4294967246

Here are the type specifications that you can use

Type specification Represents
d or i Signed integer
o Unsigned octal
s String
u Unsigned integer
x or X Unsigned hexadecimal

We used i to denote a signed integer, and u to denote an unsigned integer
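
The 4294967246 in the unsigned output is not random: -50 is reinterpreted as an unsigned 32-bit value, which is 2^32 - 50. The sketch below checks that arithmetic and also tries the other type specifications from the table (the sample values 255, 8 and 'abc' are mine).

```sql
-- 2^32 - 50 = 4294967246, the value %u printed for -50
SELECT POWER(CAST(2 AS bigint), 32) - 50 AS UnsignedEquivalent;

-- The remaining specifiers: hexadecimal, octal and string
SELECT FORMATMESSAGE('Hex %x, octal %o, string %s', 255, 8, 'abc');
```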


Let's look at another example, this time using a variable. The variable will be an integer and we are using i as the type specification


DECLARE @Val int = 1
SELECT FORMATMESSAGE('The value you supplied %i is incorrect!', @Val);

Here is the output
---------------------------------------
The value you supplied 1 is incorrect!


That worked without a problem. Now let's use a variable of the bigint data type, we are using the same type specification as before


DECLARE @Val bigint = 1
SELECT FORMATMESSAGE('The value you supplied %i is incorrect!', @Val);


Here is the output
---------------------------------------------------------------------------
Error: 50000, Severity: -1, State: 1. (Params:).
The error is printed in terse mode because there was error during formatting.
Tracing, ETW, notifications etc are skipped.


As you can see that did not work, so what can we do?

One thing we can do is convert the value to a varchar and then use s as the type specification


DECLARE @Val bigint = 1
SELECT FORMATMESSAGE('The value you supplied %s is incorrect!',
   CONVERT(VARCHAR(100),@Val));

You will again get this as output

---------------------------------------
The value you supplied 1 is incorrect!

So converting to varchar worked, but what if we want to use a bigint data type without converting to a varchar?

Another way is to use I64d as the type specification


DECLARE @Val bigint = 1
SELECT FORMATMESSAGE('The value you supplied %I64d is incorrect!', @Val);


You will get this

---------------------------------------
The value you supplied 1 is incorrect!

So there you have it, if you want to use bigint with FORMATMESSAGE use I64d as the type specification, or convert to varchar and use s as the type specification
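
Because FORMATMESSAGE returns the string instead of printing it, you can hand the result to something else, for example THROW. Here is a minimal sketch; the error number 50001 is an arbitrary choice of mine, and I have not listed the output here.

```sql
DECLARE @Val bigint = 9223372036854775807;  -- largest bigint value
DECLARE @Msg nvarchar(2048) =
    FORMATMESSAGE('The value you supplied %I64d is incorrect!', @Val);

-- THROW accepts a variable for the message, so the formatted string is raised as-is
THROW 50001, @Msg, 1;
```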

Monday, January 16, 2017

Creating the Wide World Importers sample database v1.0




Before SQL Server 2005, we had the Northwind and pubs sample databases, and these actually shipped with the product. If you did a SQL Server 2000 install, you would see these two databases in addition to the system databases.

With SQL Server 2005 this changed, and there were no more sample databases included. Instead, new databases were made available for download; for a while now we have had AdventureWorks. Today I wanted to see if there was an AdventureWorks 2016 database available. I couldn't find it. Instead I found the Wide World Importers sample database v1.0.

These sample databases now live on GitHub


Here is the link  https://github.com/Microsoft/sql-server-samples


As a normal person you will probably navigate to the databases folder https://github.com/Microsoft/sql-server-samples/tree/master/samples/databases  and in there you will probably go to https://github.com/Microsoft/sql-server-samples/tree/master/samples/databases/wide-world-importers/wwi-database-scripts

Then you will see the following scripts

Now if your name is Denis or you like to waste time (or both), you think..cool I will download this repo and run these files.  That is not unreasonable to think......  but

To save you some time, here is the easiest way to put this database on your SQL Server instance

Instead of running those scripts, go to this page https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0  scroll down to downloads, download the 121 MB file named WideWorldImporters-Full.bak

While that is downloading, create a folder on your C drive named SQL, and in that folder create 2 folders, one named Data and one named Log




Now assuming the backup file is on the C drive, here is what the restore command should look like


USE [master]
RESTORE DATABASE [WideWorldImporters] 
FROM  DISK = N'C:\WideWorldImporters-Full.bak' WITH  FILE = 1,  
MOVE N'WWI_Primary' TO N'C:\SQL\DATA\WideWorldImporters.mdf',  
MOVE N'WWI_UserData' TO N'C:\SQL\DATA\WideWorldImporters_UserData.ndf',  
MOVE N'WWI_Log' TO N'C:\SQL\Log\WideWorldImporters.ldf',  
MOVE N'WWI_InMemory_Data_1' TO N'C:\SQL\DATA\WideWorldImporters_InMemory_Data_1',  
NOUNLOAD,  REPLACE,  STATS = 5

GO

On SQL Server vNext 1.1, the output looks like this from the command above

5 percent processed.
10 percent processed.
15 percent processed.
20 percent processed.
25 percent processed.
30 percent processed.
Processed 1464 pages for database 'WideWorldImporters', file 'WWI_Primary' on file 1.
Processed 53096 pages for database 'WideWorldImporters', file 'WWI_UserData' on file 1.
Processed 33 pages for database 'WideWorldImporters', file 'WWI_Log' on file 1.
Processed 3862 pages for database 'WideWorldImporters', file 'WWI_InMemory_Data_1' on file 1.
Converting database 'WideWorldImporters' from version 852 to the current version 860.
Database 'WideWorldImporters' running the upgrade step from version 852 to version 853.
Database 'WideWorldImporters' running the upgrade step from version 853 to version 854.
Database 'WideWorldImporters' running the upgrade step from version 854 to version 855.
Database 'WideWorldImporters' running the upgrade step from version 855 to version 856.
Database 'WideWorldImporters' running the upgrade step from version 856 to version 857.
Database 'WideWorldImporters' running the upgrade step from version 857 to version 858.
Database 'WideWorldImporters' running the upgrade step from version 858 to version 859.
Database 'WideWorldImporters' running the upgrade step from version 859 to version 860.
100 percent processed.
RESTORE DATABASE successfully processed 58455 pages in 6.105 seconds (74.803 MB/sec).
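
The logical file names in the MOVE clauses (WWI_Primary and so on) come from the backup itself. If the restore complains about them, you can list them first, and afterwards confirm that the database is online. A small sketch, assuming the same backup path as above:

```sql
-- List the logical and physical file names stored in the backup
RESTORE FILELISTONLY
FROM DISK = N'C:\WideWorldImporters-Full.bak';

-- After the restore: the database should show up as ONLINE
SELECT name, state_desc, compatibility_level
FROM sys.databases
WHERE name = N'WideWorldImporters';
```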

To make sure everything is cool after the restore is done, run the following

USE WideWorldImporters
GO

select [Website].[CalculateCustomerPrice](1,number,'20160101')
from master..spt_values
where type = 'p'


select [Website].[CalculateCustomerPrice](customerid,1,'20160101')
from Sales.customers


And now we can even check that the function is visible in the new sys.dm_exec_function_stats DMV

SELECT TOP 10 d.object_id, d.database_id, 
OBJECT_NAME(object_id, database_id) AS 'function name',   
    d.cached_time, d.last_execution_time, d.total_elapsed_time,  
    d.total_elapsed_time/d.execution_count AS [avg_elapsed_time],  
    d.last_elapsed_time, d.execution_count  
FROM sys.dm_exec_function_stats AS d  
ORDER BY [total_worker_time] DESC;  

And here is the output


As you can see, this function was executed a bunch of times

So there you have it, this is how you create the DB. I also now understand why Brent Ozar uses the Stack Overflow DB for his posts: at least it is easy to find. If you prefer that over this example, head over here: https://www.brentozar.com/archive/2015/10/how-to-download-the-stack-overflow-database-via-bittorrent/

Tuesday, January 10, 2017

T-SQL Tuesday #86: String or binary data would be truncated

This month's T-SQL Tuesday is hosted by Brent Ozar. He proposed the following

Find the most interesting bug or enhancement request (and it can be your own), and write a blog post about it (including a link to the Connect item so that folks who agree with you can upvote the item)

This one was pretty easy for me, it is the following Connect item: Please fix the "String or binary data would be truncated" message to give the column name

This error drives me crazy as well, it should be fairly easy to tell me if nothing else what damn column barfed on the data inserted, but no.. all you get is something like

Msg 8152, Level 16, State 6, Procedure <ProcName>, Line 61 String or binary data would be truncated.

This is like not having the black box after a plane crashed, you know the plane crashed, but you don't know why exactly.

Dealing with this issue on a semi-regular basis, I have even written my own T-SQL helper to quickly see where the issue is

declare @ImportTable varchar(100)
declare @DestinationTable varchar(100)
select @ImportTable = 'temp'
select @DestinationTable = 'TestTrunc'
 
declare @ImportTableCompare varchar(100)
declare @DestinationTableCompare varchar(100)
select @ImportTableCompare = 'MaxLengths'
select @DestinationTableCompare = 'TempTrunc'
 
 
declare @sql varchar(8000)
select @sql  = ''
select @sql = 'select  0 as _col0 ,'
select @sql +=   'max(len( ' + column_name+ ')) AS ' + column_name + ',' 
from information_schema.columns
where table_name = @ImportTable
and data_type in('varchar','char','nvarchar','nchar')
 
select @sql = left(@sql,len(@sql) -1)
select @sql +=' into ' + @ImportTableCompare + ' from ' + @ImportTable
 
--select @sql -debugging so simple, a caveman can do it
 
exec (@sql)
 
 
 
select @sql  = ''
select @sql = 'select 0 as _col0, '
select @sql +=   '' + convert(varchar(20),character_maximum_length)
+ ' AS ' + column_name + ',' 
from information_schema.columns
where table_name = @DestinationTable
and data_type in('varchar','char','nvarchar','nchar')
 
select @sql = left(@sql,len(@sql) -1)
select @sql +=' into ' + @DestinationTableCompare
 
--select @sql -debugging so simple, a caveman can do it
 
exec (@sql)
 
 
select @sql  = ''
select @sql = 'select  '
select @sql +=   '' + 'case when  t.' + column_name + ' > tt.' + column_name
+ ' then ''truncation'' else ''no truncation'' end as '+ column_name
+ ',' 
from information_schema.columns
where table_name = @ImportTableCompare
and column_name <> '_col0'
select @sql = left(@sql,len(@sql) -1)
select @sql +='  from ' + @ImportTableCompare + ' t
join ' + @DestinationTableCompare + ' tt on t._col0 = tt._col0 '
 
--select @sql -debugging so simple, a caveman can do it
 
exec (@sql)
 
 
exec ('drop table ' + @ImportTableCompare+ ',' + @DestinationTableCompare )


Something like this only helps you if you have the data readily available, what if it is from an application? In that case you need profiler or extended events to capture the statement
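
For the extended events route, a minimal sketch could look like this. The session name TruncationErrors is made up, and the ring_buffer target is only one option; it keeps the events in memory, which is fine for a quick look but not for long-term capture.

```sql
-- Sketch: capture the statement text whenever error 8152 is raised.
-- TruncationErrors is a hypothetical session name.
CREATE EVENT SESSION TruncationErrors ON SERVER
ADD EVENT sqlserver.error_reported
(
    ACTION (sqlserver.sql_text, sqlserver.client_app_name)
    WHERE ([error_number] = 8152)
)
ADD TARGET package0.ring_buffer;
GO

ALTER EVENT SESSION TruncationErrors ON SERVER STATE = START;
GO

-- Read what has been captured so far (returned as one XML document)
SELECT CAST(t.target_data AS xml) AS CapturedEvents
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
    ON t.event_session_address = s.address
WHERE s.name = N'TruncationErrors';
```

This still only tells you which statement failed, not which column, so the helper above remains useful once you have the statement and its data.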

It is also crazy that this Connect item is almost 9 years old; it was opened in April 2008

We do have someone from Microsoft commenting on this issue last August

Posted by David [MSFT] on 8/5/2016 at 1:39 PM
Latest update - the developer working on it understands the challenges involved in creating a full fix. It may be tricky to plumb the information about columns needed to generate a full error message down to the actual conversion function in such a way that won't impact insert or update performance. We may implement something cheap in the short term such as logging the type and length of the data being truncated. It's still too early to know when such a fix would reach a publicly visible release.

This connect item has 1328 upvotes as of today, it also has 5 downvotes (who are these people..probably working on the SQL Server team :-) )

So there you have it that is my contribution to T-SQL Tuesday # 86, keep track of Brent's blog here https://www.brentozar.com/blog/ there will be a recap posted on Tuesday, January 2017

Saturday, January 07, 2017

BULK INSERT and csv format, new in vNext 1.1



In SQL Server vNext 1.1 we now have the ability to import a csv via the BULK INSERT command without having to specify the field or row terminator

You still need to specify the format. If you only do something like the following

BULK INSERT AlexaSongs  
   FROM 'c:\Songs played with Alexa.csv'  

You will be greeted with these errors


Msg 4832, Level 16, State 1, Line 10
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 10
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 10
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

So let's take a look at how this all works

First create the following table


USE tempdb
GO

CREATE TABLE AlexaSongs(PlayDate varchar(100), 
   SongName varchar(200), 
   Artist varchar(200), 
   Album varchar(200))
GO

Now grab the csv file from here: Songs played with Alexa.csv. Either download the whole project and grab the file, or open it in raw mode, copy and paste it into a file, and save it as Songs played with Alexa.csv

Now that you have the file and table ready, let's first take a look at how this was done before vNext 1.1

Here is what it looked like


BULK INSERT AlexaSongs  
   FROM 'c:\Songs played with Alexa.csv'  
   WITH  
     (  
        FIELDTERMINATOR =',',  
        ROWTERMINATOR = '\n'
      ); 

As you can see, we specified a comma as the field terminator and a newline as the row terminator

You could also get it to work by just specifying the field terminator in this case

BULK INSERT AlexaSongs  
   FROM 'c:\Songs played with Alexa.csv'  
   WITH  
     (  
        FIELDTERMINATOR =','
      ); 


So what does the new syntax look like?

Here is the code that accomplishes the same as above by using the new WITH (FORMAT = 'CSV') option

BULK INSERT AlexaSongs
FROM 'c:\Songs played with Alexa.csv' 
WITH (FORMAT = 'CSV'); 


I guess you could say it is a little cleaner, but all this really is is syntactic sugar

For Azure, it looks like this. I grabbed this straight from the Books Online page

First you need to create a data source


CREATE EXTERNAL DATA SOURCE MyAzureInvoices
    WITH  (
        TYPE = BLOB_STORAGE,
        LOCATION = 'https://newinvoices.blob.core.windows.net', 
        CREDENTIAL = UploadInvoices  
    );

And then you use that data source

BULK INSERT Colors2
FROM 'week3/inv-2017-01-19.csv'
WITH (DATA_SOURCE = 'MyAzureInvoices',
      FORMAT = 'CSV'); 
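
The data source above refers to a credential named UploadInvoices, which has to exist before the data source can be created. Here is a sketch of how that credential could be created for a shared access signature; the password and the SAS token are placeholders, and the master key step is only needed if the database does not have one yet.

```sql
-- Placeholders throughout: replace the password and the SAS token with your own.
-- Skip CREATE MASTER KEY if the database already has a master key.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<a strong password>';
GO

CREATE DATABASE SCOPED CREDENTIAL UploadInvoices
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<SAS token without the leading ?>';
GO
```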

For more examples including accessing data in a CSV file referencing a container in an Azure blob storage location go here https://msdn.microsoft.com/en-us/library/mt805207.aspx

That's all for today

Sunday, January 01, 2017

Running queries against the songs you played with Alexa



My son got an Amazon Echo for Christmas. We use the Echo mostly to play music. I have set up IFTTT (If This Then That) to save the name of any song we play in a Google Sheet.

Between December 26 and January 1st we played a little over 1000 songs. Most of the time I would just say something like "Alexa, play 80s music" or "Alexa, play 70s music", which is why you might see songs from the same period played in a row

It is no coincidence that a lot of George Michael songs were played, he died on Christmas day. The most played song was requested by my youngest son Nicholas, he loves Demons by Imagine Dragons

I decided to import the Alexa data into SQL Server and run some queries. If you want to follow along, you can get the file here from GitHub: Songs played by Alexa

I exported the Google Sheet to a tab-delimited file and saved it on my C drive. Then I created a table and did a BULK INSERT to populate it with the data from this file


USE tempdb
GO

CREATE TABLE AlexaSongs(PlayDate varchar(100), 
   SongName varchar(200), 
   Artist varchar(200), 
   Album varchar(200))
GO

BULK INSERT AlexaSongs  
   FROM 'c:\Songs played with Alexa.tsv'  
   WITH  
     (  
        FIELDTERMINATOR ='\t',  
        ROWTERMINATOR = '\n'
      );  

The date in the file is not in a format that can be converted automatically. It looks like this: December 26, 2016 at 09:53AM

I decided to add a date column and then convert that value with T-SQL. I did this by using the REPLACE function and replacing ' at ' with  ' '

ALTER TABLE  AlexaSongs ADD DatePlayed datetime
GO


UPDATE AlexaSongs
SET DatePlayed =  CONVERT(datetime, replace(playdate,' at ',' '))
GO
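
To see what the conversion does to a single value, and to find rows that would not convert, something like the following sketch can help. Note that parsing a month name such as December depends on the session language, so this assumes an English language setting.

```sql
-- One value from the file, with ' at ' replaced by a space
SELECT CONVERT(datetime, REPLACE('December 26, 2016 at 09:53AM', ' at ', ' ')) AS Converted;

-- Rows whose PlayDate cannot be converted; TRY_CONVERT returns NULL instead of raising an error
SELECT PlayDate
FROM AlexaSongs
WHERE TRY_CONVERT(datetime, REPLACE(PlayDate, ' at ', ' ')) IS NULL
AND PlayDate IS NOT NULL;
```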


Now that this is all done, we can run some queries
Which artist did we play the most?

SELECT Artist, count(SongName) As SongCount 
FROM AlexaSongs
GROUP BY Artist
ORDER BY SongCount DESC

Artist SongCount
George Michael 33
Nirvana 32
Imagine Dragons 22
Josh Groban 19
Eagles 17
Stone Temple Pilots 17
Mariah Carey 16
Meghan Trainor 15
Simon & Garfunkel 13
Pearl Jam 12

As you can see, that is George Michael

How about if we want to know how many unique songs we played by artist?

SELECT Artist, count(DISTINCT SongName) As DistinctSongCount 
FROM AlexaSongs
GROUP BY Artist
ORDER BY DistinctSongCount DESC

Artist DistinctSongCount
Nirvana 25
Stone Temple Pilots 16
George Michael 15
Eagles 12
Simon & Garfunkel 12
Josh Groban 12
Mariah Carey 11
Michael Bublé 9
Snoop Dogg 9
Harry Connick Jr. 9

In this case Nirvana wins

How about the 10 most played songs? To answer that question and grab ties, we can use WITH TIES

SELECT TOP 10 WITH TIES Artist, SongName, COUNT(*) As SongCount 
FROM AlexaSongs
GROUP BY Artist,SongName
ORDER BY SongCount DESC

Here are the results
Artist SongName SongCount
Imagine Dragons Radioactive 12
Jason Mraz I'm Yours 9
Pearl Jam Yellow Ledbetter 6
Josh Groban When You Say You Love Me 5
Oasis Wonderwall (Remastered) 4
House Of Pain Jump Around [Explicit] 4
Meghan Trainor Lips Are Movin 4
Imagine Dragons Round And Round 4
Nirvana Smells Like Teen Spirit 4
Sir Mix-A-Lot Baby Got Back [Explicit] 4
George Michael Careless Whisper 4
George Michael Faith (Remastered) 4
George Michael Father Figure 4
George Michael Freedom! '90 4


So what other interesting queries can you come up with? How about how many Christmas related songs were there? Would the query look something like this?

SELECT TOP 10 WITH TIES Artist, SongName, COUNT(*) As SongCount 
FROM AlexaSongs
WHERE SongName LIKE '%christmas%'
OR SongName LIKE '%xmas%'
OR SongName LIKE '%santa%'
GROUP BY Artist,SongName
ORDER BY SongCount DESC

Maybe you would want to know how many songs you played per day?

SELECT CONVERT(date, DatePlayed) as TheDate, count(*)
FROM AlexaSongs
GROUP BY CONVERT(date, DatePlayed)
ORDER BY TheDate


Or maybe you want to know how many songs with the same title were sung by more than 1 artist?
Is this what the query would look like?

SELECT SongName, count(DISTINCT Artist) As SongCount 
FROM AlexaSongs
GROUP BY SongName
HAVING COUNT(DISTINCT Artist) > 1
ORDER BY SongCount DESC

If you want the song as well as the artist, you can use a windowing function with DENSE_RANK


;WITH cte AS(
SELECT   Artist, SongName, 
  DENSE_RANK() OVER (PARTITION BY   SongName
    ORDER BY Artist ) AS SongCount
FROM AlexaSongs )  

SELECT * FROM cte WHERE SongCount > 1
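
One thing to be aware of with the DENSE_RANK version: it leaves out the artist that ranks first for each title, and it repeats a row for every time the song was played. If you want each shared title listed once per artist, a sketch with a derived table could look like this (the alias names are mine):

```sql
-- Sketch: every title performed by more than one artist, with all of its artists.
SELECT DISTINCT a.SongName, a.Artist
FROM AlexaSongs AS a
JOIN (
    SELECT SongName
    FROM AlexaSongs
    GROUP BY SongName
    HAVING COUNT(DISTINCT Artist) > 1   -- titles with at least two different artists
) AS shared
    ON shared.SongName = a.SongName
ORDER BY a.SongName, a.Artist;
```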

That is all for this post, I will keep collecting this data till next Christmas and hopefully will be able to run some more interesting queries

Thursday, December 29, 2016

The 15 most popular posts in 2016


Another year is almost over. As the last post of this year I decided to share with you the 15 most popular posts in 2016. I just used Google Analytics to grab this info, I excluded the home page from the top 15.  Four of these posts were written in 2006 and one was written in 2005

Here they are in order of popularity

1 Some cool SQL Server announcements SQL Graph, Adaptive Query Plan, CTP1 of SQL vNext, SQL Injection detection
This is my recap of the chalkboard session with the SQL Server team at the SQL Server PASS summit in Seattle.

2 Five Ways To Return Values From Stored Procedures
A very old post that shows you five ways to return values from a stored proc

3 SQL Server 2016 SP1 released, SQL Server vNext available for download
Posted this while the connect event was going on

4 Installing SQL Server on Linux, connecting and running queries from Linux
A quick walk through on how to install SQL Server on Ubuntu, I also included a video in the post

5 Are your foreign keys indexed? If not, you might have problems
A post showing you that indexing foreign keys might be beneficial if you do deletes

6 Convert Millisecond To "hh:mm:ss" Format
A very old post showing you how to convert from milliseconds to "hh:mm:ss" format

7 What's new in SQL Server 2016: CREATE OR ALTER
A post showing how the new CREATE OR ALTER syntax works

8 Why you need additional privileges to truncate tables compared to delete statements
The reason why you need these privileges is shown

9 What's new in SQL Server 2016: Truncate partitions
This post shows how truncate partitions works

10 What we learned from the reddit SQL Server AMA thread on /r/sqlserver
A recap of the AMA SQL Server thread on reddit

11 Playing around with the new stuff in SQL Server vNext and SQL Server 2016 SP1
Just a post showing some of the new stuff released in SP 1 of SQL Server 2016 and vNext

12 ISO-11179 Naming Conventions
A very old post linking to the ISO-11179 Naming Conventions document

13 Query Analyzer Trick
A very old post explaining how you can drag the columns into the query window from object explorer. I think by now everyone knows this one right?

14 What was the first computer you bought, what were some of your favorite games?
Mine was the Commodore 128, what about you?

15 Chaos Isolation Level In SQL Server
This is linked from  dba.stackexchange.com, it is kind of interesting because it was a silly post I made


That is all... see you in 2017...


Tuesday, December 27, 2016

Didn't go to PASS but want to watch the sessions?


You didn't go to the PASS summit this year, but you would still like to watch the sessions? There is a way now: you can buy the USB stick with all the sessions, or you can download just the sessions you are interested in. The passboutique site has the details. They also have a sale going on at the moment.

So if you got a bunch of money over the holidays, this would be a great investment.....

Saturday, December 17, 2016

SQL Server vNext version Community Technology Preview 1.1 available


A nice holiday present for you all has just arrived: SQL Server next version Community Technology Preview 1.1


Here is what is new in terms of the SQL Engine

  • Language and performance enhancements to natively compiled T-SQL modules, including support for OPENJSON, FOR JSON, JSON built ins as well as memory-optimized tables support for computed columns.
  • Improved the performance of updates to non-clustered columnstore indexes in the case when the row is in the delta store.
  • Batch mode queries now support “memory grant feedback loops,” which learn from memory used during query execution and adjusts on subsequent query executions; this can allow more queries to run on systems that are otherwise blocking on memory.
  • New T-SQL language features:
    • Introducing three new string functions: TRIM, CONCAT_WS, and TRANSLATE
    • BULK IMPORT supports CSV format and Azure Blob Storage as file source
    • STRING_AGG supports WITHIN GROUP (ORDER BY)
TRIM
So finally we can write the following instead of doing LTRIM and RTRIM

SELECT TRIM( '     NoSpaces    ') AS Result;

That will return just the value NoSpaces

You can also specify what characters to trim
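
A sketch of that second form, using the syntax as it is documented for the released SQL Server 2017 (I am assuming the CTP behaves the same way). The characters listed before FROM are removed from both ends of the string, but not from the middle:

```sql
-- Trim periods, commas, exclamation marks and spaces from both ends
SELECT TRIM('.,! ' FROM '     #     test    .') AS Result;
-- Result: #     test
```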


TRANSLATE 
This acts like a bunch of replace statements. Instead of REPLACE(REPLACE(REPLACE(REPLACE(SomeVal,'[','('),']',')'),'{','('),'}',')') you can do the following, which is much cleaner


SELECT TRANSLATE('2*[3+4]/{7-2}', '[]{}', '()()');

Running that will return 2*(3+4)/(7-2)
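
The other two items from the list, CONCAT_WS and STRING_AGG with WITHIN GROUP, can be tried without any tables. This is a small sketch with made-up input values:

```sql
-- CONCAT_WS: the first argument is the separator
SELECT CONCAT_WS(', ', 'TRIM', 'CONCAT_WS', 'TRANSLATE') AS NewFunctions;
-- TRIM, CONCAT_WS, TRANSLATE

-- STRING_AGG with WITHIN GROUP: concatenate the rows in sorted order
SELECT STRING_AGG(v.Name, ', ') WITHIN GROUP (ORDER BY v.Name) AS SortedFunctions
FROM (VALUES ('TRIM'), ('CONCAT_WS'), ('TRANSLATE')) AS v(Name);
-- CONCAT_WS, TRANSLATE, TRIM
```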

Everything that is new in version 1.1 can be found here https://msdn.microsoft.com/en-us/library/mt788653.aspx

What are you waiting for??..... pick where you want to install the next version.....


See...you won't be bored this weekend.....

Friday, December 16, 2016

Interesting SQL related links for the week of Dec 16, 2016


Here are some interesting articles I read and tweeted about this past week, I think you will like these as well. If you are bored this weekend, some of these might be good for you to read


SQL Server + Python — What’s new

Python is one of the most popular and fastest-growing languages used today. Pyodbc (Python-SQL Server Connector) is an open source Python module maintained by Michael Kleehammer that uses ODBC Drivers to connect to SQL Server. It allows you to connect from the platform of your choice to SQL Server on-premises and in the cloud. Pyodbc is also cross platform and can be installed using pip.

We recently announced SQL Server v.Next CTP1 on Linux and Windows, which brings the power of SQL Server to both Windows and — for the first time ever — Linux. You can now connect your applications to SQL Server running on Linux, Windows and macOS (with Docker).

For our Python developers, we have a few updates that will improve Python connectivity with SQL Server. Pyodbc is now:



SQL Server 2016 SP1: Know your limits

With the recent announcement of SQL Server 2016 SP1, we announced the consistent programmability experience for developers and ISVs, who can now maintain a single code base and build intelligent database applications which scale across all the editions of SQL Server. The processor, memory and database size limits do not change and remain as-is in all editions as documented in the SQL Server editions page. We have made the following changes in our documentation to accurately reflect the memory limits on lower editions of SQL Server. This blog post is intended to clarify and provide more information on the memory limits starting with SQL Server 2016 SP1 on Standard, Web and Express Editions of SQL Server.



Parameterization for Always Encrypted – Using SSMS to Insert into, Update and Filter by Encrypted Columns

SQL Server Management Studio 17.0 (the next major update of SSMS, currently available as a Release Candidate) introduces two important capabilities for Always Encrypted:
  • Ability to insert into, update and filter by values stored in encrypted columns from a Query Editor window.
  • The new online encryption algorithm, exposed in the Set-SqlColumnEncryption PowerShell cmdlet, which makes tables available for both reads and writes during the initial encryption and column encryption key rotation.

This article addresses the first of the two enhancements.




While we often worry about sophisticated digital attacks, the most common attacks for accessing news organizations’ accounts depend on only a few simple weaknesses. These weaknesses are usually a combination of predictable passwords, phishing emails designed to steal login credentials, as well as malicious file attachments in email and elsewhere. While the attacks are simple, so are the defenses. This collection of resources and learning materials will walk you through practices recommended by security specialists for defending your newsroom against common attacks on your accounts.



SQL Server on Linux: How? Introduction

Making SQL Server run on Linux involves introducing what is known as a Platform Abstraction Layer (“PAL”) into SQL Server. This layer is used to align all operating system or platform specific code in one place and allow the rest of the codebase to stay operating system agnostic. Because of SQL Server’s long history on a single operating system, Windows, it never needed a PAL. In fact, the SQL Server database engine codebase has many references to libraries that are popular on Windows to provide various functionality. In bringing SQL Server to Linux, we set strict requirements for ourselves to bring the full functional, performance, and scale value of the SQL Server RDBMS to Linux. This includes the ability for an application that works great on SQL Server on Windows to work equally great against SQL Server on Linux. Given these requirements and the fact that the existing SQL Server OS dependencies would make it very hard to provide a highly capable version of SQL Server outside of Windows in reasonable time it was decided to marry parts of the Microsoft Research (MSR) project Drawbridge with SQL Server’s existing platform layer SQL Server Operating System (SOS) to create what we call the SQLPAL. The Drawbridge project provided an abstraction between the underlying operating system and the application for the purposes of secure containers and SOS provided robust memory management, thread scheduling, and IO services. Creating SQLPAL enabled the existing Windows dependencies to be used on Linux with the help of parts of the Drawbridge design focused on OS abstraction while leaving the key OS services to SOS. We are also changing the SQL Server database engine code to by-pass the Windows libraries and call directly into SQLPAL for resource intensive functionality.