Tuesday, May 09, 2017

SQL Server 2017: SQL Graph



With the CTP2 build of SQL Server 2017, you now have the ability to use SQL Graph

I decided to play around with it a little. I based the code here on the sample I found in Books On Line. To follow along, first create the database below; you have to be on SQL Server 2017 CTP2 or higher in order for the code to work



CREATE DATABASE GraphDBTest;
GO

USE GraphDBTest;
GO


In SQL Graph, you have node tables and edge tables. Here is the description of both from Books On Line


A node table represents an entity in a graph schema. Every time a node table is created, along with the user defined columns, an implicit $node_id column is created, which uniquely identifies a given node in the database. The values in $node_id are automatically generated and are a combination of object_id of that node table and an internally generated bigint value

It is recommended that users create a unique constraint or index on the $node_id column at the time of creation of node table, but if one is not created, a default unique, non-clustered index is automatically created.



An edge table represents a relationship in a graph. Edges are always directed and connect two nodes. An edge table enables users to model many-to-many relationships in the graph. An edge table may or may not have any user defined attributes in it. Every time an edge table is created, along with the user defined attributes, three implicit columns are created in the edge table:

Column name   Description

$edge_id      Uniquely identifies a given edge in the database. It is a generated column and the value is a combination of the object_id of the edge table and an internally generated bigint value. However, when the $edge_id column is selected, a computed value in the form of a JSON string is displayed. $edge_id is a pseudo-column that maps to an internal name with a hex string in it. When you select $edge_id from the table, the column name will appear as $edge_id_<hex_string>. Using pseudo-column names in queries is the recommended way of querying the internal $edge_id column; using the internal name with the hex string should be avoided.
$from_id      Stores the $node_id of the node from where the edge originates.
$to_id        Stores the $node_id of the node at which the edge terminates.


Similar to the $node_id column, it is recommended that users create a unique index or constraint on the $edge_id column at the time of creation of the edge table, but if one is not created, a default unique, non-clustered index is automatically created on this column. It is also recommended, for OLTP scenarios, that users create an index on ($from_id, $to_id) columns, for faster lookups in the direction of the edge.


Here is what a node table looks like; the only difference from a regular table is the AS NODE at the end

CREATE TABLE Person (
  ID INT PRIMARY KEY,
  name VARCHAR(100)
) AS NODE;

Inserting into a node table is the same as inserting into a regular table


INSERT INTO Person VALUES (1,'Denis');
INSERT INTO Person VALUES (2,'Shidhar');
INSERT INTO Person VALUES (3,'Geoff');
INSERT INTO Person VALUES (4,'Kevin');
INSERT INTO Person VALUES (5,'Liz');
INSERT INTO Person VALUES (6,'Melanie');

Let's see what the output looks like

SELECT * FROM Person


$node_id_A0703BD81C174451BEC7688009010A5B ID name
{"type":"node","schema":"dbo","table":"Person","id":0} 1 Denis
{"type":"node","schema":"dbo","table":"Person","id":1} 2 Shidhar
{"type":"node","schema":"dbo","table":"Person","id":2} 3 Geoff
{"type":"node","schema":"dbo","table":"Person","id":3} 4 Kevin
{"type":"node","schema":"dbo","table":"Person","id":4} 5 Liz
{"type":"node","schema":"dbo","table":"Person","id":5} 6 Melanie

As you can see, there is JSON in the $node_id column

The sys.tables view has 2 new columns, is_edge and is_node. If we query the sys.tables view now for the Person table we just created, you will see that is_node = 1

SELECT t.is_edge,t.is_node,*
FROM sys.tables t
WHERE name = 'Person'

Here is the result

is_edge  is_node  name    object_id  schema_id  type_desc   create_date
0        1        Person  901578250  1          USER_TABLE  2017-05-09 08:55:30.513


Now we will add another table, this table will have some data about DB products



CREATE TABLE DBProduct (
  ID INT NOT NULL, 
  name VARCHAR(100)
) AS NODE;



INSERT INTO DBProduct VALUES (1,'SQL Server');
INSERT INTO DBProduct VALUES (2,'Cassandra');
INSERT INTO DBProduct VALUES (3,'Oracle');
INSERT INTO DBProduct VALUES (4,'FoundationDB');
INSERT INTO DBProduct VALUES (5,'MongoDB');
INSERT INTO DBProduct VALUES (6,'Vertica');
INSERT INTO DBProduct VALUES (7,'Volt');



And now it is time for our edge tables. We will have 2 of them: a likes table and a friendOf table

CREATE TABLE likes (rating INT) AS EDGE;
CREATE TABLE friendOf AS EDGE;
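
Remember the Books On Line recommendation above about indexing the ($from_id, $to_id) columns for OLTP scenarios. As a hedged sketch, assuming the pseudo-column names can be used in CREATE INDEX the way the documentation describes, that would look like this

-- index the direction of the edge for faster from -> to lookups
CREATE INDEX ix_likes_from_to ON likes ($from_id, $to_id);
CREATE INDEX ix_friendOf_from_to ON friendOf ($from_id, $to_id);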


To insert into the likes table, we have to say which person likes what product. The person will go into the $from_id column, the DBProduct will go into the $to_id column. You use $node_id to grab the ids that populate this table


The node_id for Person with an id of 1 will go into the $from_id column, the node_id for DBProduct with an id of 1 will go into the $to_id column. That looks like this

((SELECT $node_id FROM Person WHERE id = 1), 
       (SELECT $node_id FROM DBProduct WHERE id = 1),5);

Here are the insert statements to populate this table


INSERT INTO likes VALUES ((SELECT $node_id FROM Person WHERE id = 1), 
       (SELECT $node_id FROM DBProduct WHERE id = 1),5);
INSERT INTO likes VALUES ((SELECT $node_id FROM Person WHERE id = 1), 
       (SELECT $node_id FROM DBProduct WHERE id = 2),15);
INSERT INTO likes VALUES ((SELECT $node_id FROM Person WHERE id = 2), 
      (SELECT $node_id FROM DBProduct WHERE id = 1),6);
INSERT INTO likes VALUES ((SELECT $node_id FROM Person WHERE id = 3), 
      (SELECT $node_id FROM DBProduct WHERE id = 3),7);
INSERT INTO likes VALUES ((SELECT $node_id FROM Person WHERE id = 4), 
      (SELECT $node_id FROM DBProduct WHERE id = 4),8);
INSERT INTO likes VALUES ((SELECT $node_id FROM Person WHERE id = 5), 
      (SELECT $node_id FROM DBProduct WHERE id = 5),9);


Let's look at what it looks like in the likes edge table

SELECT * FROM likes

Here is what it looks like; as you can see, it is a bunch of JSON


 

Now we will also insert some data into the friendOf table. It is the same as with the likes table: look up the node ids and insert the data

INSERT INTO friendof VALUES ((SELECT $NODE_ID FROM person WHERE ID = 1), 
    (SELECT $NODE_ID FROM person WHERE ID = 2));
INSERT INTO friendof VALUES ((SELECT $NODE_ID FROM person WHERE ID = 1), 
    (SELECT $NODE_ID FROM person WHERE ID = 5));

INSERT INTO friendof VALUES ((SELECT $NODE_ID FROM person WHERE ID = 2), 
    (SELECT $NODE_ID FROM person WHERE ID = 3));
INSERT INTO friendof VALUES ((SELECT $NODE_ID FROM person WHERE ID = 3), 
    (SELECT $NODE_ID FROM person WHERE ID = 1));
INSERT INTO friendof VALUES ((SELECT $NODE_ID FROM person WHERE ID = 4), 
    (SELECT $NODE_ID FROM person WHERE ID = 2));
INSERT INTO friendof VALUES ((SELECT $NODE_ID FROM person WHERE ID = 5), 
    (SELECT $NODE_ID FROM person WHERE ID = 4));


Now it is time to run some queries
This query will return all the products that Denis likes

-- Find DBProducts that Denis likes
SELECT DBProduct.name
FROM Person, likes, DBProduct
WHERE MATCH (Person-(likes)->DBProduct)
AND Person.name = 'Denis';

name
-------------
SQL Server
Cassandra

Not surprisingly, SQL Server is in that list

What about my friends? What DB products do they like?

-- Find DBProducts that Denis' friends like
SELECT DBProduct.name as DBName,person2.name  
FROM Person person1, Person person2, likes, friendOf, DBProduct
WHERE MATCH(person1-(friendOf)->person2-(likes)->DBProduct)
AND person1.name='Denis';


DBName                  name
---------------         ------------
SQL Server              Shidhar
MongoDB                 Liz


Okay, SQL Server and MongoDB, I can live with that

What about Liz, what does her friend Kevin like?

SELECT DBProduct.name as DBName,person2.name 
FROM Person person1, Person person2, likes, friendOf, DBProduct
WHERE MATCH(person1-(friendOf)->person2-(likes)->DBProduct)
AND person1.name='Liz';


DBName           name
------------     -------
FoundationDB     Kevin

FoundationDB? Didn't Apple buy that and then shut down the downloads?


So what product is there that both friends like?

To do that, we specify that person1 is a friendOf person2, and that both person1 and person2 like the same product. Since you cannot use the same table twice, you need to use an alias; in this case I have used l2 as an alias for the likes table


SELECT DBProduct.name as DBName,person1.name,person2.name 
FROM Person person1, Person person2, likes, friendOf, DBProduct, likes as l2
WHERE MATCH(person1-(friendOf)->person2-(likes)->DBProduct 
  AND person1-(l2)->DBProduct)


DBName        name       name
-----------   --------   --------
SQL Server    Denis      Shidhar

What a surprise, can you believe it, it is SQL Server!!



Let's take a look at some metadata. This time we are adding some columns from the sys.columns view: graph_type_desc, along with is_edge and is_node from sys.tables

select t.name, c.name, c.graph_type_desc,t.is_edge,t.is_node 
from sys.columns c
join sys.tables t on c.object_id = t.object_id
where graph_type is not null
order by t.name, c.name

Here is the output from that query

name name graph_type_desc is_edge is_node
DBProduct $node_id_4C8D1144EF7745FD8BACDB7178353A65 GRAPH_ID_COMPUTED 0 1
DBProduct graph_id_8FA26FC01AE94CF9921364ECEFB71621 GRAPH_ID 0 1
friendOf $edge_id_FBBA05CB4E0F49E7919EC346DBB21F92 GRAPH_ID_COMPUTED 1 0
friendOf $from_id_143FE9B1B3E942EB99FF8396B292CD19 GRAPH_FROM_ID_COMPUTED 1 0
friendOf $to_id_DFF0DCFA07FE45E3A31E1925F5E6F886 GRAPH_TO_ID_COMPUTED 1 0
friendOf from_id_71FB1C6C6D82428889276A07A9AF32E4 GRAPH_FROM_ID 1 0
friendOf from_obj_id_4310B8C357D641EC990E91329687BDAA GRAPH_FROM_OBJ_ID 1 0
friendOf graph_id_3722B217CA404686BE44855E883AA6FA GRAPH_ID 1 0
friendOf to_id_D01976015F664F91A0A29819D79EE6A5 GRAPH_TO_ID 1 0
friendOf to_obj_id_8C26534B00DF47FBA87FBED6D14C3201 GRAPH_TO_OBJ_ID 1 0
likes $edge_id_CA70E11B83A44BD99DAD3B14EFCC2E08 GRAPH_ID_COMPUTED 1 0
likes $from_id_FF1F089BA1224A00A1C94F68415C6B08 GRAPH_FROM_ID_COMPUTED 1 0
likes $to_id_7728587E08684D94961B4F2234702B87 GRAPH_TO_ID_COMPUTED 1 0
likes from_id_ACD968F7815C4A3A9B5B6C5506235EDE GRAPH_FROM_ID 1 0
likes from_obj_id_185C6B5EB64F4C0A8E3E1CF65092F706 GRAPH_FROM_OBJ_ID 1 0
likes graph_id_16575BE2C0764232B5B440524166DF15 GRAPH_ID 1 0
likes to_id_D164826C35374D3E98EDBC3FA77457B3 GRAPH_TO_ID 1 0
likes to_obj_id_373354D4D7014EC89771C48F961561BB GRAPH_TO_OBJ_ID 1 0
Person $node_id_A0703BD81C174451BEC7688009010A5B GRAPH_ID_COMPUTED 0 1
Person graph_id_40FFD8418C1343639EA62CA66B6CE414 GRAPH_ID 0 1



You might have noticed that SQL Graph queries use old style joins
So instead of this

SELECT DBProduct.name as DBName,person1.name,person2.name 
FROM Person person1, Person person2, likes, friendOf, DBProduct, likes as l2

I can do this right?

SELECT DBProduct.name as DBName,person1.name,person2.name 
FROM Person person1
CROSS JOIN Person person2
CROSS JOIN likes
CROSS JOIN friendOf
CROSS JOIN DBProduct
CROSS JOIN likes as l2


Yes that works, both queries return 54432 rows

Will it work with the WHERE clause?

SELECT DBProduct.name as DBName,person1.name,person2.name 
FROM Person person1
CROSS JOIN Person person2
CROSS JOIN likes
CROSS JOIN friendOf
CROSS JOIN DBProduct
CROSS JOIN likes as l2
WHERE MATCH(person1-(friendOf)->person2-(likes)->DBProduct 
  AND person1-(l2)->DBProduct)


Msg 13920, Level 16, State 1, Line 164
Identifier 'person1' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 164
Identifier 'friendOf' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 164
Identifier 'person2' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 164
Identifier 'person2' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 164
Identifier 'likes' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 164
Identifier 'DBProduct' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 165
Identifier 'person1' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 165
Identifier 'l2' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.
Msg 13920, Level 16, State 1, Line 165
Identifier 'DBProduct' in a MATCH clause is used with a JOIN clause or APPLY operator. JOIN and APPLY are not supported with MATCH clauses.


Boooo ... that is a big fat nope



Also keep the following in mind
There are certain limitations on node and edge tables in this release:
  • Local or global temporary tables cannot be node or edge tables.
  • Table types and table variables cannot be declared as a node or edge table.
  • Node and edge tables cannot be created as system-versioned temporal tables.
  • Node and edge tables cannot be memory optimized tables.
  • Users cannot update the $from_id and $to_id columns of an edge using an UPDATE statement. To update the nodes that an edge connects, users will have to insert a new edge pointing to the new nodes and delete the previous one (a sketch of this follows below).
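
Here is a hedged sketch of what that last bullet means in practice, re-pointing the friendOf edge that connects person 1 to person 2 so it connects person 1 to person 3 instead. This assumes the $from_id and $to_id pseudo-columns can be used in a WHERE clause, which is how the documentation describes filtering edges

-- delete the old edge...
DELETE FROM friendOf
WHERE $from_id = (SELECT $node_id FROM Person WHERE ID = 1)
  AND $to_id   = (SELECT $node_id FROM Person WHERE ID = 2);

-- ...and insert a new edge pointing at the new node
INSERT INTO friendOf VALUES ((SELECT $node_id FROM Person WHERE ID = 1),
    (SELECT $node_id FROM Person WHERE ID = 3));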


This was just a first look at SQL Graph. I will have to play around with it a little more to get more familiar with it and see if there are any use cases for me


In the meantime, visit Books On Line and start playing around with SQL Graph: https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-overview

Friday, May 05, 2017

SQL Server 2017: SELECT INTO File Group



With the CTP2 build of SQL Server 2017, you now have the ability of doing a SELECT INTO operation into a specific filegroup.


The syntax looks like this

SELECT * INTO TableName ON FileGroup FROM SomeQuery

 What is the use of this you might ask? Well maybe you have some ad-hoc queries where you save some data but you don't want it sitting on your expensive SAN. Or maybe you populate staging tables on the fly and you want it to end up on a specific SSD because you need the speed of the SSD disk for these operations.

Of course all this can be done by creating the table first, but this saves you at least some time... especially when all you have is a query and the data types are maybe unknown



Let's see how this all works, first create a database with just 1 data file and 1 log file



CREATE DATABASE [TestSelectInto]
 CONTAINMENT = NONE
 ON  PRIMARY 
( NAME = N'TestSelectInto', 
 FILENAME = N'C:\DATA\TestSelectInto.mdf' , 
 SIZE = 6MB , FILEGROWTH = 6MB )
 LOG ON 
( NAME = N'TestSelectInto_log', 
 FILENAME = N'C:\DATA\TestSelectInto_log.ldf' , 
 SIZE = 6MB , FILEGROWTH = 6MB )
GO

Now, add a new filegroup and name it ScratchFG

ALTER DATABASE [TestSelectInto] ADD FILEGROUP ScratchFG
GO

ALTER DATABASE [TestSelectInto]
ADD FILE
(   NAME= 'ScratchData',
    FILENAME = N'C:\DATA\ScratchData.mdf', SIZE = 6MB , FILEGROWTH = 6MB
) TO FILEGROUP ScratchFG;

Now that we have added the filegroup and its file, let's verify that the database has 3 files

USE TestSelectInto
GO

SELECT file_id,type_desc,name,physical_name 
FROM sys.database_files



file_id type_desc name physical_name
1 ROWS TestSelectInto C:\DATA\TestSelectInto.mdf
2 LOG TestSelectInto_log C:\DATA\TestSelectInto_log.ldf
3 ROWS ScratchData C:\DATA\ScratchData.mdf

As you can see, we have 3 files, and the new ScratchData file lives in the ScratchFG filegroup

Now it is time to do our select into filegroup command, we will also do a select into without the filegroup specified

SELECT * INTO TestScratchFG ON ScratchFG 
FROM master..spt_values


SELECT * INTO TestDefault  
FROM master..spt_values

How can we be sure where the table ended up? You can use the query below

SELECT OBJECT_SCHEMA_NAME(t.object_id) AS schema_name,
t.name AS table_name,
i.index_id,
i.name AS index_name,
fg.name AS filegroup_name
FROM sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
INNER JOIN sys.filegroups fg ON i.data_space_id=fg.data_space_id

Here is what we see: the TestScratchFG table got created on the ScratchFG filegroup, and the TestDefault table got created on the PRIMARY filegroup

schema_name table_name index_id index_name filegroup_name
dbo TestScratchFG 0 NULL ScratchFG
dbo TestDefault 0 NULL PRIMARY
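
For comparison, here is a sketch of the pre-2017 way of getting the same result: create the table on the filegroup first, then insert into it. The column list below mirrors master..spt_values, so treat it as illustrative rather than gospel

-- the old way: CREATE TABLE ... ON <filegroup>, then INSERT ... SELECT
CREATE TABLE TestScratchFGOld
(   name   nvarchar(35) NULL,
    number int          NOT NULL,
    type   nchar(3)     NOT NULL,
    low    int          NULL,
    high   int          NULL,
    status int          NULL
) ON ScratchFG; -- the ON clause pins the table to the filegroup

INSERT INTO TestScratchFGOld
SELECT * FROM master..spt_values;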




What about temporary tables, can you use this syntax? Let's try.....


SELECT * INTO #TestScratchFG ON ScratchFG 
FROM master..spt_values

Msg 1921, Level 16, State 1, Line 60
Invalid filegroup 'ScratchFG' specified.

So as you can see, you cannot specify a filegroup from the database you are in; the temporary table is created in tempdb, so only tempdb's filegroups are valid... what you can do however is specify primary

SELECT * INTO #TestScratchFG ON [primary]
FROM master..spt_values

Since you cannot add filegroups to tempdb, primary is all you can specify... but then again why would you, there is no point. So if you are using a temporary table, this syntax is useless.

That is all for this post; hopefully you have seen how easy this is to use, and it might be of use to you in the future

Monday, April 24, 2017

How to shut down SQL Server

The question of how to shut down SQL Server came up at work last week; I figured it would be a nice topic for the newer DBAs and SQL Server programmers

Note: I assume SQL Server is not part of a clustered environment here, if your SQL Server instance is part of a cluster, then use the failover cluster manager instead!


There are a bunch of ways to shut down SQL Server; here is how you can do it

Configuration Manager

To shut down SQL Server from SQL Server Configuration Manager, navigate to your instance, right click and select Stop

SSMS Object Explorer



The process from SSMS is very similar to Configuration Manager, right click on the instance and select Stop

Shutdown command

From SSMS open a new query window and type shutdown or shutdown with nowait
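
For reference, the two forms look like this

-- waits for running statements to finish and checkpoints every database
SHUTDOWN;

-- stops immediately, without inserting a checkpoint in every database
SHUTDOWN WITH NOWAIT;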

Unless the WITH NOWAIT option is used, SHUTDOWN shuts down SQL Server by:

  • Disabling logins (except for members of the sysadmin and serveradmin fixed server roles).
  • Waiting for currently running Transact-SQL statements or stored procedures to finish. To display a list of all active processes and locks, run sp_who and sp_lock, respectively.
  • Inserting a checkpoint in every database.

That seems kinda dangerous. Can someone pass that command to a proc by masking it?


declare @d varchar(30)= reverse('tiawon htiw nwodtuhs')
exec(@d)

The SHUTDOWN statement cannot be executed within a transaction or by a stored procedure.
Msg 0, Level 11, State 0, Line 3
A severe error occurred on the current command. The results, if any, should be discarded.

That is good to know  :-)


Command prompt

Open a command prompt as administrator and type net stop MSSQLSERVER

Microsoft Windows [Version 10.0.14393]
(c) 2016 Microsoft Corporation. All rights reserved.

C:\WINDOWS\system32>net stop MSSQLSERVER
The SQL Server (MSSQLSERVER) service is stopping.
The SQL Server (MSSQLSERVER) service was stopped successfully.


C:\WINDOWS\system32>net start MSSQLSERVER
The SQL Server (MSSQLSERVER) service is starting.
The SQL Server (MSSQLSERVER) service was started successfully.


C:\WINDOWS\system32>

To start SQL Server again, change net stop to net start


Below are a couple more ways, but you should really not be using those!!

Task Manager


Yes, you can kill SQL Server from task manager

Taskkill

You can use the plain old DOS command taskkill as well
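
As a sketch, assuming a default instance (the SQL Server process is named sqlservr.exe), it would look something like this from an elevated command prompt

C:\WINDOWS\system32>taskkill /F /IM sqlservr.exe

The /F switch forces termination and /IM matches on the image (process) name. Needless to say, this is the equivalent of pulling the rug out from under SQL Server, so don't do this on a server you care about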

Shutdown your server
That is one way of course but you might get a bunch of people yelling at you

Unplug the power to your server after making sure you have no UPS battery backup plugged in
Great way to test recovery... .. sarcasm people.....

Shutdown the data center
You laugh, but this actually happened to me once: someone did some work and shut down the whole development part of the data center. We had SQL Server 2000 running on NT 4; it had not been restarted in 6 years (meaning it wasn't patched either). It was an isolated box with an internal IP that couldn't be reached from outside the dev network


There you have all the ways that you can use to shut down SQL Server

Saturday, April 15, 2017

Security in SQL Server on Linux Microsoft Engineering Town Hall


On April 20, 2017, 10:00-11:00 AM Pacific Time (UTC-7), Microsoft will host a Security in SQL Server on Linux Microsoft Engineering Town Hall


Here is what will be covered

Learn about recent developments and get answers to your questions at twice-monthly virtual town halls, where Microsoft connects you with the team behind SQL Server on Linux. Each session will begin with a new topic, but all questions about SQL on Linux are welcome. At this event, you’ll:


  • Engage with—and ask questions of—the SQL Server on Linux engineering team.
  • Learn about how SQL Server on Linux supports the same enterprise-grade security capabilities that customers rely on with SQL Server on Windows. Get an overview of how to secure SQL Server on Linux, with demos of popular scenarios.


You can register here: https://info.microsoft.com/sql-server-on-linux-town-hall-security-register.html

Monday, February 20, 2017

The strange case of the missing indexes....



This past week I needed to run some queries on production to verify there were indexes added on a table. There were several scripts that needed to be run and the last one was the addition of the indexes.  The query given to me was something like the following


SELECT *
FROM LinkedServerName.DatabaseName.sys.indexes
WHERE object_id =(OBJECT_ID('TableName'))

So I ran the query..nothing. Aha maybe they are still running the scripts before that, setting up replication, snapshotting the table etc etc. I will check again in a bit I thought.

Then I checked 2 more times nothing. So I then contacted them and asked if they ran everything..yes they said and no errors.  Mmm, okay..what could be wrong. The only way I can access this server is through a linked server call. I decided to look at the query again.... bingo.. I see what the problem is.....

Let's take a look. First create the following database with 2 tables and 2 indexes on the Test table


CREATE DATABASE Test1
GO


USE Test1
GO

CREATE TABLE Foo(ID int)
GO

CREATE TABLE Test(ID int, ID2 int, ID3 int)
GO

CREATE INDEX ix_Test on Test(ID)
CREATE INDEX ix_Test2 on Test(ID2)


Now if you run the following query against the sys.indexes object catalog view


SELECT * FROM sys.indexes
WHERE object_id = OBJECT_ID('Test')

You will get back the following information

object_id name index_id type type_desc
901578250 NULL 0 0 HEAP
901578250 ix_Test 2 2 NONCLUSTERED
901578250 ix_Test2 3 2 NONCLUSTERED

As you can see, we have information about our two indexes and the table itself; since there is no clustered index on this table, we get a row back that shows the table is a heap

Now let us connect to another database on the same server, in this case we will use the tempdb database

Create a table with the same name

USE tempdb
GO

CREATE TABLE Test(foo int)
GO


Now run the same query again but point to the Test1 database


SELECT * FROM Test1.sys.indexes
WHERE object_id = OBJECT_ID('Test')

And you get nothing back.  What does the OBJECT_ID() function return?


SELECT OBJECT_ID('Test')

For me it returns the number 965578478. The problem is that that object_id is the id for that object in the tempdb database, not for the one in the Test1 database

So what can you do?  There are two ways to do this

One way is to join sys.indexes with the sys.tables object catalog view

Here is what the query looks like

SELECT i.* 
FROM Test1.sys.indexes i
JOIN Test1.sys.tables t ON i.object_id = t.object_id
WHERE t.name = 'Test'

Running the query like that displays the correct information

object_id name index_id type type_desc
901578250 NULL 0 0 HEAP
901578250 ix_Test 2 2 NONCLUSTERED
901578250 ix_Test2 3 2 NONCLUSTERED


If you want to use this against a linked server, use 4 part notation, just prefix Test1.sys with the linked server name, I used LinkedServerName as an example


SELECT i.* 
FROM LinkedServerName.Test1.sys.indexes i
JOIN LinkedServerName.Test1.sys.tables t ON i.object_id = t.object_id
WHERE t.name = 'Test'

That query with the join between sys.indexes and sys.tables can be used for a linked server as well as for a different database on the same instance. If you just have to go to another database like we have done before, you can simplify it like this

SELECT  * FROM Test1.sys.indexes
WHERE object_id = OBJECT_ID('Test1.dbo.Test')

Basically, you pass in the database name, schema name and object name to the OBJECT_ID() function

That's all for today..

Importing The Metropolitan Museum of Art's Open Access Initiative data into SQL Server

The Met just made some data available for download.

The Metropolitan Museum of Art presents over 5,000 years of art from around the world for everyone to experience and enjoy. The Museum lives in three iconic sites in New York City—The Met Fifth Avenue, The Met Breuer, and The Met Cloisters. Millions of people also take part in The Met experience online. 

Since it was founded in 1870, The Met has always aspired to be more than a treasury of rare and beautiful objects. Every day, art comes alive in the Museum's galleries and through its exhibitions and events, revealing both new ideas and unexpected connections across time and across cultures. The Metropolitan Museum of Art provides select datasets of information on more than 420,000 artworks in its Collection for unrestricted commercial and noncommercial use.


I decided to take a look. The first thing we will do is download the CSV file from their GitHub repository. You can find that here: https://github.com/metmuseum/openaccess

If you go to that link, grab the MetObjects.csv file

Be warned, this file is 214 MB.

One word of caution... if you try to import this file with a regular BULK INSERT command....good luck...let me know how many tries you need. No problem, I will just use a format file... and now you have 2 problems..  The real issue is that the file is somewhat problematic, there are quotes where there shouldn't be quotes,  there are no quotes where there should be quotes.  So what do you do?

Are you on SQL Server vNext 1.1 or higher? If you are, good news: you can use BULK INSERT with the csv format, which is new in vNext 1.1


Ok let's get started, first create the following table

CREATE TABLE MetOpenData(
ObjectNumber nvarchar(4000),
IsHighlight nvarchar(4000),
IsPublicDomain nvarchar(4000),
ObjectID nvarchar(4000),
Department nvarchar(4000),
ObjectName nvarchar(4000),
Title nvarchar(4000),
Culture nvarchar(4000),
Period nvarchar(4000),
Dynasty nvarchar(4000),
Reign nvarchar(4000),
Portfolio nvarchar(4000),
ArtistRole nvarchar(4000),
ArtistPrefix nvarchar(4000),
ArtistDisplayName nvarchar(4000),
ArtistDisplayBio nvarchar(4000),
ArtistSuffix nvarchar(4000),
ArtistAlphaSort nvarchar(4000),
ArtistNationality nvarchar(4000),
ArtistBeginDate nvarchar(4000),
ArtistEndDate nvarchar(4000),
ObjectDate nvarchar(4000),
ObjectBeginDate nvarchar(4000),
ObjectEndDate nvarchar(4000),
Medium nvarchar(4000),
Dimensions nvarchar(4000),
CreditLine nvarchar(4000),
GeographyType nvarchar(4000),
City nvarchar(4000),
State nvarchar(4000),
County nvarchar(4000),
Country nvarchar(4000),
Region nvarchar(4000),
Subregion nvarchar(4000),
Locale nvarchar(4000),
Locus nvarchar(4000),
Excavation nvarchar(4000),
River nvarchar(4000),
Classification nvarchar(4000),
RightsandReproduction nvarchar(4000),
LinkResource nvarchar(4000),
MetadataDate nvarchar(4000),
Repository nvarchar(4000))

GO

Now it is time to import the file
Just to let you know, you will get a couple of errors; however, all data except for these 4 rows will be imported

Msg 4864, Level 16, State 1, Line 62
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 213266, column 25 (Medium).
Msg 4864, Level 16, State 1, Line 62
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 217661, column 25 (Medium).
Msg 4863, Level 16, State 1, Line 62
Bulk load data conversion error (truncation) for row 226222, column 16 (ArtistDisplayBio).
Msg 4863, Level 16, State 1, Line 62
Bulk load data conversion error (truncation) for row 258639, column 16 (ArtistDisplayBio).

Here is what the BULK INSERT with FORMAT= 'CSV' command looks like. Change the filepath to point to the location where you have the MetObjects.csv file saved

BULK INSERT MetOpenData
FROM 'c:\Data\MetObjects.csv'
WITH (FORMAT = 'CSV'); 
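
If you want to capture the rejected rows instead of just reading the error messages, BULK INSERT also has ERRORFILE and MAXERRORS options. Here is a hedged sketch; the error-file path is just an example

BULK INSERT MetOpenData
FROM 'c:\Data\MetObjects.csv'
WITH (FORMAT = 'CSV',
      ERRORFILE = 'c:\Data\MetObjects_errors.log', -- rejected rows get written here
      MAXERRORS = 10); -- keep going past a handful of bad rows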

Let's do a quick count


SELECT COUNT(*) FROM MetOpenData

The file that I imported resulted in 446026 rows. I downloaded this file on 2/10/2017, your file might have more data if they updated the file after the date I downloaded it


Now that we have the data we need, we can run some queries.
Let's see what kind of objects are in the collection, we will grab the top 15 objects

 SELECT TOP 15  ObjectName,count(*)
 FROM MetOpenData
 GROUP BY ObjectName
 ORDER BY 2 DESC

Here is what the results looks like


ObjectName Count
Print 88582
Photograph 28071
Drawing 24905
Book 13360
Fragment 9405
Piece 8638
Negative 6258
Painting 5862
Baseball card, print 4985
Bowl 3534
Figure 3081
Baseball card 3046
Polaroid 2706
Vase 2698
Dress 2473

I don't know why... but I somehow thought Painting would be the most occurring object... but what do I know

You can also treat this table as your own museum catalog. Let's say you want to look at van Gogh's Madame Roulin and Her Baby painting? No problem, run this query

 SELECT * FROM MetOpenData
 WHERE ArtistDisplayName like'%van%gogh%'
 and title = 'Madame Roulin and Her Baby'

Scroll to the LinkResource column, you will see the following: http://www.metmuseum.org/art/collection/search/459123

Clicking on that link will bring you to the painting's page on the Met website


Now you can download this image and do something with it, it is after all in the public domain

Here are a couple of more queries you can play around with



 
 SELECT city, count(*)
 FROM MetOpenData
 GROUP BY city
 ORDER BY 2 DESC

 SELECT Dynasty, count(*)
 FROM MetOpenData
 GROUP BY Dynasty
 ORDER BY 2 DESC

 SELECT Period, count(*)
 FROM MetOpenData
 GROUP BY Period
 ORDER BY 2 DESC

 SELECT ArtistNationality, count(*)
 FROM MetOpenData
 GROUP BY ArtistNationality
 ORDER BY 2 DESC


 SELECT * FROM MetOpenData
 WHERE ArtistDisplayName like'%pablo picasso%'

 SELECT * FROM MetOpenData
 WHERE ArtistDisplayName like'%rembrandt van rijn%'
 
 SELECT * FROM MetOpenData
 WHERE ObjectName like'%Postage stamps%'

I am not a big art person, but if you are and you have some interesting queries that you ran against this data please let me know in the comments

Also if you manage to get this file to import with plain old BCP or BULK INSERT with or without a format file...let me know the magic you used.... :-)


Tuesday, January 17, 2017

Using bigint with FORMATMESSAGE



SQL Server 2016 added the FORMATMESSAGE function.  According to Books On Line, FORMATMESSAGE constructs a message from an existing message in sys.messages or from a provided string. The functionality of FORMATMESSAGE resembles that of the RAISERROR statement. However, RAISERROR prints the message immediately, while FORMATMESSAGE returns the formatted message for further processing.

So let's take a look at this new function, run the following


SELECT FORMATMESSAGE('Signed int %i, %i', 50, -50) 
SELECT FORMATMESSAGE('Unsigned int %u, %u', 50, -50); 

Here is the output if you run that

--------------------------------------------
Signed int 50, -50
Unsigned int 50, 4294967246
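
That covers the provided-string flavor. FORMATMESSAGE can also pull the format string from sys.messages; here is a quick hedged sketch using a made-up custom message number 50001

-- add a custom message to sys.messages, then format it
EXEC sp_addmessage @msgnum = 50001, @severity = 16,
     @msgtext = N'The value you supplied %s is incorrect!';

SELECT FORMATMESSAGE(50001, '1');
-- The value you supplied 1 is incorrect!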

Here are the type specifications that you can use

Type specification   Represents
d or i               Signed integer
o                    Unsigned octal
s                    String
u                    Unsigned integer
x or X               Unsigned hexadecimal

We used i to denote a signed integer; we also used u to denote an unsigned integer


Let's look at another example, this time using a variable. The variable will be an integer and we are using i as the type specification


DECLARE @Val int = 1
SELECT FORMATMESSAGE('The value you supplied %i is incorrect!', @Val);

Here is the output
---------------------------------------
The value you supplied 1 is incorrect!


That worked without a problem. Now let's use a variable of the bigint data type, we are using the same type specification as before


DECLARE @Val bigint = 1
SELECT FORMATMESSAGE('The value you supplied %i is incorrect!', @Val);


Here is the output
---------------------------------------------------------------------------
Error: 50000, Severity: -1, State: 1. (Params:).
The error is printed in terse mode because there was error during formatting.
Tracing, ETW, notifications etc are skipped.


As you can see that did not work, so what can we do?

One thing we can do is convert the value to a varchar and then use s as the type specification


DECLARE @Val bigint = 1
SELECT FORMATMESSAGE('The value you supplied %s is incorrect!',
   CONVERT(VARCHAR(100),@Val));

You will again get this as output

---------------------------------------
The value you supplied 1 is incorrect!

So converting to varchar worked, but what if we want to use a bigint data type without converting to a varchar?

Another way is to use I64d as the type specification


DECLARE @Val bigint = 1
SELECT FORMATMESSAGE('The value you supplied %I64d is incorrect!', @Val);


You will get this

---------------------------------------
The value you supplied 1 is incorrect!

So there you have it, if you want to use bigint with FORMATMESSAGE use I64d as the type specification, or convert to varchar and use s as the type specification

Monday, January 16, 2017

Creating the Wide World Importers sample database v1.0




Before SQL Server 2005, we had the northwind and pubs sample databases; these actually shipped with the product. If you did a SQL Server 2000 install, you would see these two databases in addition to the system databases.

With SQL Server 2005 this changed; there were no more sample databases included. Instead, new databases were made available for download, and for a while now we have had AdventureWorks available. Today I wanted to see if there was an AdventureWorks 2016 database available. I couldn't find it. Instead I found the Wide World Importers sample database v1.0.

These sample databases now live on GitHub


Here is the link  https://github.com/Microsoft/sql-server-samples


As a normal person you will probably navigate to the databases folder https://github.com/Microsoft/sql-server-samples/tree/master/samples/databases  and in there you will probably go to https://github.com/Microsoft/sql-server-samples/tree/master/samples/databases/wide-world-importers/wwi-database-scripts

Then you will see a bunch of setup scripts

Now if your name is Denis or you like to waste time (or both), you think..cool I will download this repo and run these files.  That is not unreasonable to think......  but

To save you some time, here is the easiest way to put this database on your SQL Server instance

Instead of running those scripts, go to this page https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0  scroll down to downloads, download the 121 MB file named WideWorldImporters-Full.bak

While that is downloading, create a folder on your C drive named SQL; in that folder create 2 folders, one named Data and one named Log
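
If you want to double-check the logical file names stored in the backup before restoring (they are the names used in the MOVE clauses below), run this

-- lists the logical and physical file names stored in the backup
RESTORE FILELISTONLY
FROM DISK = N'C:\WideWorldImporters-Full.bak'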




Now assuming the backup file is on the C drive, here is what the restore command should look like


USE [master]
RESTORE DATABASE [WideWorldImporters] 
FROM  DISK = N'C:\WideWorldImporters-Full.bak' WITH  FILE = 1,  
MOVE N'WWI_Primary' TO N'C:\SQL\DATA\WideWorldImporters.mdf',  
MOVE N'WWI_UserData' TO N'C:\SQL\DATA\WideWorldImporters_UserData.ndf',  
MOVE N'WWI_Log' TO N'C:\SQL\Log\WideWorldImporters.ldf',  
MOVE N'WWI_InMemory_Data_1' TO N'C:\SQL\DATA\WideWorldImporters_InMemory_Data_1',  
NOUNLOAD,  REPLACE,  STATS = 5

GO

On SQL Server vNext 1.1, the output from the command above looks like this

5 percent processed.
10 percent processed.
15 percent processed.
20 percent processed.
25 percent processed.
30 percent processed.
Processed 1464 pages for database 'WideWorldImporters', file 'WWI_Primary' on file 1.
Processed 53096 pages for database 'WideWorldImporters', file 'WWI_UserData' on file 1.
Processed 33 pages for database 'WideWorldImporters', file 'WWI_Log' on file 1.
Processed 3862 pages for database 'WideWorldImporters', file 'WWI_InMemory_Data_1' on file 1.
Converting database 'WideWorldImporters' from version 852 to the current version 860.
Database 'WideWorldImporters' running the upgrade step from version 852 to version 853.
Database 'WideWorldImporters' running the upgrade step from version 853 to version 854.
Database 'WideWorldImporters' running the upgrade step from version 854 to version 855.
Database 'WideWorldImporters' running the upgrade step from version 855 to version 856.
Database 'WideWorldImporters' running the upgrade step from version 856 to version 857.
Database 'WideWorldImporters' running the upgrade step from version 857 to version 858.
Database 'WideWorldImporters' running the upgrade step from version 858 to version 859.
Database 'WideWorldImporters' running the upgrade step from version 859 to version 860.
100 percent processed.
RESTORE DATABASE successfully processed 58455 pages in 6.105 seconds (74.803 MB/sec).

To make sure everything is cool after the restore is done, run the following

USE WideWorldImporters
GO

select [Website].[CalculateCustomerPrice](1,number,'20160101')
from master..spt_values
where type = 'p'


select [Website].[CalculateCustomerPrice](customerid,1,'20160101')
from Sales.customers


And now we can even check that the function is visible in the new sys.dm_exec_function_stats DMV

SELECT TOP 10 d.object_id, d.database_id, 
OBJECT_NAME(object_id, database_id) AS 'function name',   
    d.cached_time, d.last_execution_time, d.total_elapsed_time,  
    d.total_elapsed_time/d.execution_count AS [avg_elapsed_time],  
    d.last_elapsed_time, d.execution_count  
FROM sys.dm_exec_function_stats AS d  
ORDER BY [total_worker_time] DESC;  

Looking at the output, you can see this function was executed a bunch of times

So there you have it, this is how you create the DB.... I also now understand why Brent Ozar uses the Stack Overflow DB for his posts, at least it is easy to find. If you prefer that over this example, head over here: https://www.brentozar.com/archive/2015/10/how-to-download-the-stack-overflow-database-via-bittorrent/

Tuesday, January 10, 2017

T-SQL Tuesday #86: String or binary data would be truncated

This month's T-SQL Tuesday is hosted by Brent Ozar, he proposed the following

Find the most interesting bug or enhancement request (and it can be your own), and write a blog post about it (including a link to the Connect item so that folks who agree with you can upvote the item)

This one was pretty easy for me, it is the following Connect item: Please fix the "String or binary data would be truncated" message to give the column name

This error drives me crazy as well, it should be fairly easy to tell me if nothing else what damn column barfed on the data inserted, but no.. all you get is something like

Msg 8152, Level 16, State 6, Procedure <ProcName>, Line 61 String or binary data would be truncated.

This is like not having the black box after a plane crashed, you know the plane crashed, but you don't know why exactly.

Dealing with this issue on a semi-regular basis, I have even written my own T-SQL helper to quickly see where the issue is

declare @ImportTable varchar(100)
declare @DestinationTable varchar(100)
select @ImportTable = 'temp'
select @DestinationTable = 'TestTrunc'
 
declare @ImportTableCompare varchar(100)
declare @DestinationTableCompare varchar(100)
select @ImportTableCompare = 'MaxLengths'
select @DestinationTableCompare = 'TempTrunc'
 
 
declare @sql varchar(8000)
select @sql  = ''
select @sql = 'select  0 as _col0 ,'
select @sql +=   'max(len( ' + column_name+ ')) AS ' + column_name + ',' 
from information_schema.columns
where table_name = @ImportTable
and data_type in('varchar','char','nvarchar','nchar')
 
select @sql = left(@sql,len(@sql) -1)
select @sql +=' into ' + @ImportTableCompare + ' from ' + @ImportTable
 
--select @sql -debugging so simple, a caveman can do it
 
exec (@sql)
 
 
 
select @sql  = ''
select @sql = 'select 0 as _col0, '
select @sql +=   '' + convert(varchar(20),character_maximum_length)
+ ' AS ' + column_name + ',' 
from information_schema.columns
where table_name = @DestinationTable
and data_type in('varchar','char','nvarchar','nchar')
 
select @sql = left(@sql,len(@sql) -1)
select @sql +=' into ' + @DestinationTableCompare
 
--select @sql -debugging so simple, a caveman can do it
 
exec (@sql)
 
 
select @sql  = ''
select @sql = 'select  '
select @sql +=   '' + 'case when  t.' + column_name + ' > tt.' + column_name
+ ' then ''truncation'' else ''no truncation'' end as '+ column_name
+ ',' 
from information_schema.columns
where table_name = @ImportTableCompare
and column_name <> '_col0'
select @sql = left(@sql,len(@sql) -1)
select @sql +='  from ' + @ImportTableCompare + ' t
join ' + @DestinationTableCompare + ' tt on t._col0 = tt._col0 '
 
--select @sql -debugging so simple, a caveman can do it
 
exec (@sql)
 
 
exec ('drop table ' + @ImportTableCompare+ ',' + @DestinationTableCompare )


Something like this only helps you if you have the data readily available. What if it comes from an application? In that case you need Profiler or Extended Events to capture the statement
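
Here is a hedged sketch of the Extended Events route; the session name is made up, the rest is standard XEvents plumbing for catching statements that raise error 8152

-- capture statements that raise "String or binary data would be truncated"
CREATE EVENT SESSION TruncationErrors ON SERVER
ADD EVENT sqlserver.error_reported
    (ACTION (sqlserver.sql_text, sqlserver.database_name)
     WHERE error_number = 8152)
ADD TARGET package0.ring_buffer;
GO

ALTER EVENT SESSION TruncationErrors ON SERVER STATE = START;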

It is also crazy that this connect item is almost 9 years old, it was opened in April 2008

We did get someone from Microsoft commenting on this issue last August

Posted by David [MSFT] on 8/5/2016 at 1:39 PM
Latest update - the developer working on it understands the challenges involved in creating a full fix. It may be tricky to plumb the information about columns needed to generate a full error message down to the actual conversion function in such a way that won't impact insert or update performance. We may implement something cheap in the short term such as logging the type and length of the data being truncated. It's still too early to know when such a fix would reach a publicly visible release.

This connect item has 1328 upvotes as of today, it also has 5 downvotes (who are these people..probably working on the SQL Server team :-) )

So there you have it that is my contribution to T-SQL Tuesday # 86, keep track of Brent's blog here https://www.brentozar.com/blog/ there will be a recap posted on Tuesday, January 2017

Saturday, January 07, 2017

BULK INSERT and csv format, new in vNext 1.1



In SQL Server vNext 1.1 we now have the ability to import a csv via the BULK INSERT command without having to specify the field or row terminator

You still need to specify the format; if you only do something like the following

BULK INSERT AlexaSongs  
   FROM 'c:\Songs played with Alexa.csv'  

You will be greeted with these errors


Msg 4832, Level 16, State 1, Line 10
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 10
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 10
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

So let's take a look at how this all works

First create the following table


USE tempdb
GO

CREATE TABLE AlexaSongs(PlayDate varchar(100), 
   SongName varchar(200), 
   Artist varchar(200), 
   Album varchar(200))
GO

Now grab the csv file from here: Songs played with Alexa.csv. Either download the whole project and grab the file, or open it in raw mode, copy and paste it into a file, and save it as Songs played with Alexa.csv

Now that you have the file and table ready, let's first take a look at how this was done before vNext 1.1

Here is what it looked like


BULK INSERT AlexaSongs  
   FROM 'c:\Songs played with Alexa.csv'  
   WITH  
     (  
        FIELDTERMINATOR =',',  
        ROWTERMINATOR = '\n'
      ); 

As you can see, we specified a comma as the field terminator and a newline as the row terminator

You could also get it to work by just specifying the field terminator in this case

BULK INSERT AlexaSongs  
   FROM 'c:\Songs played with Alexa.csv'  
   WITH  
     (  
        FIELDTERMINATOR =','
      ); 


So what does the new syntax look like?

Here is the code that accomplished the same as above but by using the new WITH FORMAT = CSV option

BULK INSERT AlexaSongs
FROM 'c:\Songs played with Alexa.csv' 
WITH (FORMAT = 'CSV'); 


I guess you could say it is a little cleaner, but all this really is is syntactic sugar

For Azure, it looks like this; I grabbed this straight from Books On Line

First you need to create a data source


CREATE EXTERNAL DATA SOURCE MyAzureInvoices
    WITH  (
        TYPE = BLOB_STORAGE,
        LOCATION = 'https://newinvoices.blob.core.windows.net', 
        CREDENTIAL = UploadInvoices  
    );

And then you use that data source

BULK INSERT Colors2
FROM 'week3/inv-2017-01-19.csv'
WITH (DATA_SOURCE = 'MyAzureInvoices',
      FORMAT = 'CSV'); 

For more examples including accessing data in a CSV file referencing a container in an Azure blob storage location go here https://msdn.microsoft.com/en-us/library/mt805207.aspx

That's all for today