Wednesday, November 29, 2006

Interview With SQL Server MVP Louis Davidson: Author Of Pro SQL Server 2005 Database Design and Optimization

After interviewing Ken Henderson I decided to try my luck with Louis Davidson. By now you already know that he said yes ;-). Louis is the author of Pro SQL Server 2005 Database Design and Optimization and a SQL Server MVP. I have a copy of his book and I highly recommend it to anyone who wants to learn about designing and optimizing databases. The question-and-answer session with Louis that follows was conducted via email.

Louis, the first time I came across your name was in the Microsoft public SQL Server programming group. Can you explain to us why you participate in newsgroups and forums?

I have been involved with newsgroups since back in college when I was an avid reader of And when I found the SQL Server groups (back then, before Microsoft’s groups it was, I started reading them, and finally giving some feedback. I then spent time in the newsgroups trying to answer people’s questions to help them, and help me extend my knowledge past school and technical books; because the real situations that people get into are more interesting than anything you can get in a textbook.

I quit hanging around for quite a while because the public newsgroups (just got too doggone noisy. I mean come on, there is only so often that one can have certain, “personal” products sold to them.

But about three years ago, I went to a session with Kalen Delaney (who was then, and is now one of the people who I look up to greatly,) who was talking about things to do to promote your books. I started then trying to answer three questions a day (or more) on the Microsoft newsgroups (instead of the public ones) and now on the MSDN Forums (

I don’t know that it helped my book sales in any overwhelmingly large way, but it certainly has helped me in my career. I find that so many things I have learned while helping others end up being useful at my day job. Plus, my philosophy in regards to SQL Server is simple: the more people who like SQL Server the more likely that the product will be around when I retire in 30-40 years.

I have noticed that you have explained the 12 rules of Codd, do you think that it is important to know these rules as a database developer?

Absolutely, if you don’t know why you do something, then it is just too easy to think that it isn’t worth it. And you can’t really bend the rules until you know them. Codd’s rules are the foundation that the RDBMS was built upon, which the begat normalization. Once people get that SQL Server is essentially a tool largely created to manipulate and serve up normalized data (a concept that was built upon the original 12 rules) it actually becomes essential to normalize, not just something that some square geezer is telling you that to do.

Why did you decide to write technical books?

Two reasons:
1. So I would have a reference to refer to when I need to recall something.
2. Because I hate having free time.

Seriously, I had really done quite badly in my senior level database design class in college, but good enough (I think I made a C.) So when I was thrust into the database programmer role, I wanted to learn how to do it right (something my father had not so subtlety driven into my brain years before.) What made things even better was that I had several mentors who believed in doing things right during my first six or seven years of programming, so I learned essentially only good practices, but I still wanted more. I couldn’t find a book on database design that was not written as a text book, so I suggested a book on database design to the editors at WROX, whom I was working for as a tech editor on a Visual Basic/SQL Server book. I expected to tech edit this design book, or possibly write a chapter or two, but before I knew it they had contracted me to write the entire book (Chuck Hawkins wrote a chapter for me, but I wrote the rest) and paid me what seemed like a ton of money (of course, if I average it out to an hourly wage, I averaged far less than minimum wage when all was said and done.)

The process of writing the book was an amazing journey, one because I had never written a paper greater than like 15 pages (my degree is in CS, so all of our papers were really quite small in comparison) and I had to learn a good deal of stuff along the way because I was not really an “expert” at the time (and the more I know, the less I feel like an expert now!) The thing about writing is that you have to be more correct than in almost any medium. When you speak on a subject, if you say something, the people in the audience go, “whu?” and if you talk fast enough and right enough, they will likely forgive you (as long as you don’t do it again.) But if you mess up in writing in a book, it just hangs there over your head forever.

When it was over I had something I was proud of and I said what every technical book writer I have talked with says: “NEVER AGAIN! Too much work, not enough gain, I could make more money cleaning out stalls at the local pet store!!!” I have now said more or less the same thing three times now, and I hope to say it again around the time Katmai ships.

Which chapter was the hardest to write and can you explain why?

Well, the ones that were the hardest were actually the easiest, since I got the Kevin Kline and Kurt Windisch to write them. These guys were excellent co-authors, and did great work. Kurt’s section on CLR programming is great, with some awesome example code that I often use when I need to build a CLR function. Kevin wrote a chapter about cross-platform development that rounded out the book.

Of the chapters I wrote, the most troublesome was probably the security chapter. Security is a very complex topic and there are many, many ways to enforce security. Even worse, there are too many ways to get around it once you have it set up. It is the one chapter that I am not completely happy with, and the one that I plan to do some writing on again in the next few months, most likely after the holidays, during which I hope to do very very little that isn’t concerned with eating, sleeping, or going to Disney World.

What is the audience for this book?

Well, with this being my third book I was kind of hoping that my audience had grown to Dogbert
-like proportions, but I have to be reasonable. I would simply be satisfied if I could get every SQL Server programmer to read it (and buy it!) The overall goal was to make a practical guide going through the boring bits (theory, modeling), the stuff you hear about but rarely really understand (normalization) and to include the more practical topics like constraints, triggers, security, etc that even people who have years of experience would like.

I think that most people will get something from the book, from the newbie to the pro. Of course the payoff will be far less for someone who already knows the theory, but there is something in there for everyone that will at least be worth your 60 bucks (which is the full retail price. Don’t pay full retail price!)

I also try to keep it as light as possible, considering the very dry topic. There are several little humorous bits thrown in, and for fun I added a hidden message for a contest to run later (actually going on right now in the end of 2006.) All of the clues are leading to a message that I “encrypted” into a bit of code in the book (and yes, that is bonus clue for reading through this really long interview!) for a bit of fun.

Ultimately, I want the book to be something that I would purchase in each version at the very least for the practical aspects of the book. You might think that since I wrote the book I wouldn’t refer to it, but I often use the templates for triggers, procedures, error handling, etc from the book whenever I need to write one without using some tool.

Which SQL Server books are on your bookshelf?

Well, to tell the truth, I have quite a few on my desk and no time to read them. I have thumbed through Itzik’s book, “Inside Microsoft SQL Server 2005: T-SQL Programming”, and it is pretty darn good. Dejan Sarka, who was one of my technical editors worked on that book too. Beauchemin and Sullivan’s “A Developer’s Guide to SQL Server 2005” seems pretty good, and I have used Tachev’s book, “Applied Microsoft Analysis Services 2005” several times to try to figure out how different Analysis Services was in the 2005 version.

And I would be remiss to not mention Kalen’s previous “Inside SQL Server 2005” book. What a great reference that book was. I can’t wait to get my hands on her new books (not that I would have the time to read them either!)

What are/could be some consequences if your data model is not in Third Normal Form or even First Normal Form?

The earth could stop spinning on its axis, at least that is what I try to tell people. The more reasonable answer is that you will have nagging, annoying issues with your data that will require more and more management just to keep straight.

Normalization pushes data consistency issues to the foreground and forces you to deal with them. For example, take first normal form. One of the rules of first normal form is that every row is unique. A constant question on newsgroups/forums is: “I have duplicate rows, how do I get rid of them?” Well, go through this following messy set of steps, and then PROTECT YOURSELF USING KEYS!
Another concern of first normal form is scalar attributes. I like to define this as keeping your data broken down to the lowest form you will manipulate using SQL. So if you have to use substring for any common operation, they you likely have an issue. For example take an address, if you use addresses just as the user types them in, then just have a text column, put the address in it. But if you have a mailing system and you have to cleanse the data, and then break the address down into all of its constituent parts instead of going through the process every time you need an address. SQL is great for taking bits and pieces and constructing what you need, but lousy at breaking things down.

For third normal form, the problems are more ominous. If the data in the table isn’t really describing the thing that the table is modeling, you will end up with redundant data. Redundant data doesn’t sound so bad, until you have need to update the data. Consider the case where you have a table modeling a person, and in this table you have the person’s manager, the manager’s name and address. From this sentence, it almost sounds plausible and correct. It certainly is something you might want to see on a screen. But is the manager just the manager for the one person? If not you might have ten copies of the managers information. Now you have to make sure and deal with all ten copies, which is a real pain, to say the least.

How did the massive changes between SQL Server 2000 and SQL Server 2005 affect the research for your book?

The beauty for me was that the truly massive changes made little difference. Since the book I have written is about database design, things haven’t changed so incredibly much from that angle in any version of SQL Server. SQL evolves and gets better and there are optimization changes, but the basics remain the same. Fortunately as machines get more powerful we need to tweak performance less and less and we can just focus on data integrity and usability through normalization.

The biggest differences that I am interested in from an authoring standpoint have been in the ability to do more with T-SQL (I wrote the T-SQL for Developer’s chapter in the Pro SQL Server 2005 book for Apress), and now using the CLR functions. I had Kurt Windisch (a friend who was a board member for PASS at the time) write a very large CLR section, and I am so glad. Now I have a nice reference and excellent code samples for CLR objects. It is one of the great things about a book like this in that it is not overview book where you simply give the new features and come up with some reason you might want to use the features. If the feature isn’t tremendously useful, we can say that it isn’t useful. I think we got it pretty much right by saying that functions are really the most useful part of the new CLR objects, as well as to a lesser extent aggregations.

I admit that I don’t cover XML, something that I lament occasionally. I had waffled on including it for months because I just don’t feel that it is that important to OLTP systems, but on the other hand it could certainly be useful to implement a set of sparse properties (a topic I do touch on) and had arranged for another coauthor set up to write me a chapter about XML, but they dropped out too late to get another one. Even today I am considering commissioning an “add-on” chapter about XML to send out to readers (if you want it, vote with your email to and your wallets by purchasing the book so I can afford to pay someone. )

Name three things that are new in SQL Server 2005 that you find are the most valuable?

  1. Dynamic Management Objects. Wow, I can’t even begin to start on just how great these are. The best one to me is outlined in this blog entry:!80677FB08B3162E4!1443.entry . Automatic index tuning is awesome. I have used the basis of this system to optimize a third party system in excellent ways using the number two feature.
  2. Included columns on indexes. These have really made it easier for dba’s to optimize queries by building covering indexes for queries without bloating indexes to death. What’s more, because you aren’t limited to 900 bytes, you can cover some queries in larger, more unwieldy systems that you just couldn’t do before. They aren’t the end all as they still have to be maintained, but they are certainly excellent in many situations. We have used them extensively to optimize a third party system that was very heavy on reads to cut down tremendously on full table scans.
  3. The new BI Paradigm. The attribute oriented nature of Analysis Services is just plain cool. It gives users this extremely rich set of possibilities for viewing data using their cubes that is just so much better than they had in 2000. It was a tremendous change in gears though, and I am very glad that the fundamentals of dimensional modeling didn’t change, so the Kimball paradigm data warehouse we had was still quite viable using either platform, just easier to use.

Name three things that you wish would be in Katmai (the next version of SQL Server)?

I should have finished this interview a few weeks earlier, and I could have spoken a bit more freely about this subject, but at this point I am under NDA for what I have heard about Katmai. What I will say is that I hope to see as much work done in the relational arena (T-SQL and the engine) as they do for the other areas like XML, or tools. The heart and soul of SQL Server should always be right there, lurking in the name: SQL. All of the other new stuff they have added over the years has made things better for us, but I would have traded many, many non relational features to have had the ROW_NUMBER() and RANK() functions in an earlier version.

Of course, this is just my very selfish view on things as a Data Architect/Relational Programmer. In this particular case, even more than normal, I particularly speak only for me.

How critical is a proper data model for a good performing database?

Freaking essential! This is like asking how important is it to know the race route to win the race. The SQL Server engine is amazingly good at serving up data, but it is up to the implementer to know what data is needed and to put the data into a format that works the way that SQL Server works.

I would also be remiss if I didn’t take this one step further. How can you build anything without design? Not one programmer would call up a contractor and say “build me a house, three bedrooms, two baths” and then just let the contractor sketch out a rough drawing and then start working would they? And while I realize that it seems that software is a lot more pliable than building a house, once you have built the foundation classes, the database, and started a UI, rework can be just as costly as running a bulldozer though a regular house, and on a particularly a large software project, far more expensive.

What are some of the biggest mistakes that people make when designing a database?

I did something this for Greg Low’s really cool SQL Down Under podcast (where you get a good idea that I am terrified of speaking publicly) but lets just boil it down to two here:

  1. Treating database design like application design - The reason why the concept of a distinct business layer is so intriguing to me is that we can transform the user’s views/needs from the GUI through this layer to the data layer. Each layer of the application has it’s own needs (UI, Object Layer Representation, Data Access Layer, Stored Procedure Layer, Database Structures) each very important in it’s own right. More importantly, each has very different design goals.
  2. Forgetting about the future – During the design phase of any project, changes are very, very cheap. So if you realize you want to have a table with a domain of values instead of a simple varchar column, it is as simple as adding a rectangle to your model and connecting rectangles. Same goes with the discovery that the cardinality between two of your tables was not what you originally thought (“oh, a person is allowed to have more than one phone number in this system.”) However, as time passes, and more code is written, and more data is added to the system, the costs go up almost exponentially. Of course when the project manager has saddled you up and is giving you the crop, demanding results (that is their job after all) it can be all too easy to simply ignore the future and start coding. Always think about Future-You and how annoyed he is going to be when he realizes that had you just spent another hour three months ago, two weeks of work (of course corresponding to Future-You’s vacation to Disney World) could have been avoided.

There are so many others, bad naming, poor constraint usage, under-using stored procedures, ignoring security, weak testing, and so on and so on. But I think I can safely say that if you start with these two concepts, a great number of these bad habits will correct themselves.

Can you list any third party tools that you find useful to have as a SQL Server developer/admin?

I have become quite enamored with the Red-Gate tools, especially the database compare tool. I use it quite often to compare databases to find changes that have been made, especially for indexes and stuff done for tuning a database. The Apex tools are pretty good too.

The one tool I have used a lot lately has been the Automatic Indexing stuff from the query optimization team: ( I am working right now on optimizing a homegrown/third party “object oriented”-esque tool that has a great deal of object abstraction built in, and have found tremendous performance gains from some really esoteric indexes this thing added to some of the queries that I would have never thought of in a month of Sundays. I wrote about it in more detail here: (!80677FB08B3162E4!1443.entry and yes, I know this is the second time I have mentioned it, it is that important!)

In general, I am not so much a tool guy as I am a script guy. I use the UI to script out objects quite often, but then I try to edit everything with T-SQL so I know how it is done. The people who get too comfortable in the wizards or GUI often forget or don’t learn all of the tricks that you can do with the raw SQL commands, because it would just be too hard to build the perfect UI. So they give you 50% in the wizards and 90% in the GUI, and the rest is “advanced” stuff. If you don’t know about the nuts and bolts, you might be tempted to say “SQL Server doesn’t…”.

Lastly I have always been a big fan of ERwin for modeling. It does a really good job of representing my structures, and I like their macro language for building very customized scripts to build tables/views/procedures/etc.

What is the biggest Database in GB/PB that you have ever worked with, how many rows was your biggest table ever?

It was about ten feet across, six feet high. I had just started my new job as a LAN administrator and occasionally I had to fill in for the night operator. We had these 14 inch across reels of something that looked like a reel to reel tape, but they….wait, do you mean gigabytes and petabytes? I was thinking Great Big, or Pretty Big.

Well definitely no petabytes, and only recently have we started thinking in terms of terabytes, some of which is due to the fact that we have started storing everything as Unicode, essentially doubling the size of the database. We probably have around a half of a terabyte in our OLTP system and Data warehouse combined. I am never too excited about the size of a database as I am of the number of rows in a key table. You could have 10 billion rows and a very small database because it was designed nicely, or to the contrary, a million rows and have a terabyte if you just stored all of your data in text blobs (which is a favorite kind of storage for too many people.)

The largest database I designed I never got to see come to fruition because our company went through a couple of downsizing “events”, which I was part of the next to last one. I only know it was large because a prior coworker of mine called me up one day and told me we had gone past the 2 billion row mark and they were to changing to use bigint rather than basic int types for the surrogate keys. Tons of rows, but really they were quite small rows. We were keeping information on Internet equipment that the company had deployed. It was a pretty cool system that I wish I could see in action today (if it is still being used. The concept was neat, but as the implementers we were never 100% sure if the data would be useful later in the process.)

When did you start blogging and why?
    When is easy, almost two years ago when I started writing my last book. The answer to why is pretty much the same as the answer to the technical books question, but increased an order of magnitude (as a reminder, that answer was: Two reasons: so I would have a reference to look into when I need it; and because I hate having free time.)

    Blogging is possibly the biggest time sink on earth really. When I am working a “normal” 40 or so hour week in my day (read: paying) job, it is easy to spend an hour or so a day blogging, but when my job squeezes my time, it gets really hard.

    But the most exciting part of blogging is the incredible active reference it gives me, in a format that really works for me. I can also use more of a shotgun approach and just write about what I want to write about on any topic, SQL or otherwise, rather than just writing about database design exclusively.

    Over the years, I have tried creating libraries of code on my laptop, but that never has seemed to work for me. As I have aged (a little) I have started to realize that commenting, blogging, writing, etc is not for you, it is for Future Me and other people. I have so much code on my laptop from the past 10 years that I don’t know what the heck to do with it. I have VB6, SQL from past ages, failed article attempts, diagrams, etc all over my projects directory. So Past Me (that jerk!) is punishing me with his laziness, something I now vow not to do to Future Me. Confused? Good.

    I started to realize that my blog had gotten out of control with so much information that I could no longer search and find when I found, a Microsoft site that allows you to create a free website for your small business. So I created as a reference for my books, presentations, and my blog. On there you will find a decent enough index to my blog entries, usually within a month of being up to date, as well as a list of about ten articles that I can get to from the front page just about instantly.

    About the book:
    By Louis Davidson, Kevin Kline, Kurt Windisch
    ISBN: 1-59059-529-7
    672 pp.
    Published: May 2006

    Table Of Contents
    CHAPTER 1 Introduction to Database Concepts
    CHAPTER 2 Data Modeling
    CHAPTER 3 Conceptual Data Modeling
    CHAPTER 4 The Normalization Process
    CHAPTER 5 Implementing the Base Table Structures
    CHAPTER 6 Protecting the Integrity of Your Data
    CHAPTER 7 Securing Access to the Data
    CHAPTER 8 Table Structures and Indexing
    CHAPTER 9 Coding for Concurrency
    CHAPTER 10 Code-Level Architectural Decisions
    CHAPTER 11 Database Interoperability
    APPENDIX A Codd’s 12 Rules for an RDBMS
    APPENDIX B Datatype Reference

    Sample Chapter:
    Ch. 02 - Data Modeling

    Amazon Links: CA FR DE UK JP US

    Tuesday, November 28, 2006

    Use SCHEMABINDING In Views To Protect Against Alteration Or Dropping Of Tables

    My post from last night triggered something in me to write a little bit about SCHEMABINDING.
    If you create a view and someone drops one of the tables that the view is dependent on then the view will fail. How can you protect yourself against that? You can use WITH SCHEMABINDING. Here is how you use it:

    Let's first create some tables and the view

    CREATE TABLE dbo.Test1 (ID INT, SomeName VARCHAR(49))
    CREATE TABLE dbo.Test2 (ID2 INT, SomeOtherName VARCHAR(49))

    INSERT dbo.Test1 VALUES 1,'Name1')
    INSERT dbo.Test1 VALUES( 2,'Name2')
    INSERT dbo.Test2 VALUES( 1,'OtherName1')
    INSERT dbo.Test2 VALUES( 2,'OtherName2')

    CREATE VIEW dbo.TestTables
    ID,SomeName, ID2,SomeOtherName
    FROM dbo.Test1 t1
    JOIN dbo.Test2 t2 on t1.ID =t2.ID2

    --SELECT from the view and everything is fine
    SELECT *
    FROM dbo.TestTables

    -- Drop table test2
    DROP TABLE dbo.Test2

    --Let's do the select again
    SELECT * FROM dbo.TestTables

    --And this is the error message that we get back
    Server: Msg 208, Level 16, State 1, Procedure TestTables, Line 3
    Invalid object name 'Test2'.
    Server: Msg 4413, Level 16, State 1, Line 1
    Could not use view or function 'TestTables' because of binding errors.

    --Let's create the Test2 table again
    CREATE TABLE dbo.Test2 (ID2 INT, SomeOtherName VARCHAR(49))

    INSERT dbo.Test2 VALUES( 1,'OtherName1')
    INSERT dbo.Test2 VALUES( 2,'OtherName2')

    --Now let's alter the view and we will use SCHEMABINDING
    ID,SomeName, ID2,SomeOtherName
    FROM dbo.Test1 t1
    JOIN dbo.Test2 t2 on t1.ID =t2.ID2

    --Try dropping the Test2 table again
    DROP TABLE dbo.Test2

    --And here is the message
    Server: Msg 3729, Level 16, State 1, Line 1
    Cannot DROP TABLE 'dbo.Test2' because it is being referenced by object 'TestTables'.

    --Let's try altering the table by adding a column
    ALTER TABLE dbo.Test2 ADD Col3 INT

    -- That seems to work
    SELECT * FROM dbo.Test2
    SELECT * FROM dbo.TestTables

    --Let's try altering the table by dropping a column
    ALTER TABLE dbo.Test2 DROP COLUMN SomeOtherName

    --Can't do that
    Server: Msg 5074, Level 16, State 3, Line 1
    The object 'TestTables' is dependent on column 'SomeOtherName'.
    Server: Msg 4922, Level 16, State 1, Line 1
    ALTER TABLE DROP COLUMN SomeOtherName failed because one or more objects access this column.

    --Let's try altering the table by changing a column size
    ALTER TABLE dbo.Test2 ALTER COLUMN SomeOtherName VARCHAR(48)

    --Same here; this is not allowed
    Server: Msg 5074, Level 16, State 3, Line 1
    The object 'TestTables' is dependent on column 'SomeOtherName'.
    Server: Msg 4922, Level 16, State 1, Line 1
    ALTER TABLE ALTER COLUMN SomeOtherName failed because one or more objects access this column.

    --Let's drop the column that is not used by the view

    --No problem here either, if a column is not used by the view it can be dropped

    --Clean up the mess
    DROP VIEW dbo.TestTables
    DROP TABLE dbo.Test2, dbo.Test1

    So there you have it; SCHEMABINDING is a nice way to protect your views from dropped tables, modified columns or dropped columns on tables that are used in views.

    --You can add columns without a problem to the table used by a view.
    --You can modify a column that is not used in a view.
    --You can also drop a column that is not used in the view

    Monday, November 27, 2006

    Do You Use Partitioned Views Or Indexed Views In SQL Server 2000

    I am in the process of interviewing people for a position at work. The people I have interviewed so far have between 8 and 10 years SQL experience. I noticed one thing; none of these people knew what partition views or horizontal partitioning is or what is involved with creating an index view (the SET options, SCHEMABINDING etc etc). I must say that I was surprised that people who worked with SQL Server since 1998 didn’t know about isolation levels, audit trails, difference between a unique constraint and primary key and the list goes on

    So here is my question to you: Do you use partitioned views or indexed views?

    I must admit that I have created indexed views only twice on a production box. Partitioned views I am using a lot more since the table that we had grew to 180 million rows and you can query it on a combination of columns. Creating partitioned views is not a big deal but it involves some maintenance if you add years or whatever your partitioned column is. Then you have to create does tables and add them to the view. Luckily SQL Server 2005 has partitioned functions and this makes it a lot easier

    So is this an unreasonable question? It’s not like I asked the difference between a conversion and a cycle deadlock is it?

    Saturday, November 25, 2006

    SQL Server - Best Practices Site Available On TechNet

    Microsoft TechNet has a page named SQL Server - Best Practices; this site has white papers, scrips a toolbox and top 10 lists
    From the site:
    Get the real-world guidelines, expert tips, and rock-solid guidance to take your SQL Server implementation to the next level. Drawing on the extensive experience and expertise from respected developers and engineers at Microsoft as they walk you through the specifics on solving particularly difficult issues.

    Here is a preview of some of the material that is available right now:

    Technical White Papers
    Deep level technical papers on specific SQL Server topics that were tested and validated by SQL Development

    • SQL Server 2005 Performance Tuning using Waits and Queues

    • TEMPDB Capacity Planning and Concurrency Considerations for Index Create and Rebuild

    • Loading Bulk Data into a Partitioned Table

    • DBCC SHOWCONTIG Improvements and Comparison between SQL Server 2000 and SQL Server 2005

    • Database Mirroring Best Practices and Performance Considerations

    Best Practices ToolBox
    • Scripts and tools for performance tuning and troubleshooting SQL Server 2005

    Top 10 Lists
    Summary lists (usually consisting of 10 items) of recommendations, best practices and common issues for specific customer scenarios by the SQL Server Customer Advisory Team.

    • Storage Top 10 Best Practices

    • OLTP Top 6 Performance Issues for OLTP Applications

    Best Practices in SQL Server Books Online• Best Practices for Replication Administration

    • Replication Security Best Practices

    A must for every SQL Server developer: SQL Server - Best Practices

    Friday, November 24, 2006

    Auto Update Statistics Asynchronously.

    In SQL Server 2000 if the engine determined that the statistics are out of date it would update the statistics and then run the query. If you had a very long table this could take a long time. In SQL server 2005 by turning Auto Update Statistics Asynchronously On the query will run with the old statistics which is not optimal but will probably still be faster than updating the statistics first on a large table and then running the query. The statistics will be updated after the query and the next queries can take advantage of the updated statistics.


    Queries that initiate an automatic update of out-of-date statistics will not wait for the statistics to be updated before compiling. Subsequent queries will use the updated statistics when they are available.
    Queries that initiate an automatic update of out-of-date statistics, wait until the updated statistics can be used in the query optimization plan.
    Setting this option to ON has no effect unless AUTO_UPDATE_STATISTICS is set to ON.

    So how do you enable this? Here is an example for the AdventureWorks database however setting this option to ON has no effect unless AUTO_UPDATE_STATISTICS is also set to ON.


    What does Books On Line have to say about this?

    Asynchronous Statistics Updates

    A query that initiates an update of out-of-date statistics must wait for those statistics to be updated before compiling and returning a result set. This can cause unpredictable query response times and may cause applications that have aggressive time-outs to fail.

    In SQL Server 2005, the database option AUTO_UPDATE_STATISTICS_ASYNC provides asynchronous statistics updating. When this option is set to ON, queries do not wait for the statistics to be updated before compiling. Instead, the out-of-date statistics are put on a queue for updating by a worker thread in a background process. The query and any other concurrent queries compile immediately by using the existing out-of-date statistics. Because there is no delay for updated statistics, query response times are predictable; however, the out-of-date statistics may cause the query optimizer to choose a less-efficient query plan. Queries that start after the updated statistics are ready will use those statistics. This may cause the recompilation of cached plans that depend on the older statistics version.

    The AUTO_UPDATE_STATISTICS_ASYNC option is set at the database level and determines the update method for all statistics in the database. It is applicable only to statistics updating and cannot be used to asynchronously create statistics. Setting this option to ON has no effect unless AUTO_UPDATE_STATISTICS is also set to ON. By default, the AUTO_UPDATE_STATISTICS_ASYNC option is OFF

    Thursday, November 23, 2006

    Happy Thanksgiving

    This is not going to be a SQL post (again) So to have at least some SQL I am going to link to a post I did last year: Calculating Thanksgiving By Using SQL in SQL Server

    So that’s that. I have some people coming over today and still have to cook all the stuff. The best part about thanksgiving besides the food and seeing the family is that we pick our secret Santa. Up until last year we would exchange with every family member and have a limit of $25. No we do it differently we pick one person from a hat and spend $125. This is much better because you only have to buy a gift for one person and you can get a nicer gift instead of several smaller gifts. So I usually give my list of Amazon links to the person and they can choose what to get me. Yes it’s not really a ‘secret’ Santa. I will post my list on Saturday so that you can see if I have any taste or not

    Now another topic: weight gain. Yes the time between Thanksgiving and New Years day is the time when most people start putting on the pounds. A friend of mine has a blogpost named Fat America but I must warn you if you are a heavier person you might get offended with some of the language at the end of the post (the elevator part)

    Anyway enjoy the food and drinks and have a good time

    Windows Vista Consumer Launch Countdown Gadget

    There is a Windows Vista Consumer Launch Countdown Gadget available. There are 3 versions of this gadget (widget for you Mac fans) available:
    one for pages, another for Windows Live Spaces, and a third for installation to the Windows Vista Sidebar.

    If you need more info about this gadget then visit this link:

    Wednesday, November 22, 2006

    Upgrading My Skills To SQL Server 2005

    At work we are required to get training at least once a year. So I decided to take the Upgrading Your Database Development Skills To Microsoft SQL server 2005

    At the end of this post you will see what I should learn. The things that I am most excited about are: Service Broker and Using SQL Management Objects. The new Transact SQL enhancements I know already; I have covered several of them in this blog.

    I doubt that I will use the .NET CLR in SQL Server 2005; but I must say that I was amazed at how easy it is to set it up. Notification services could be something I could use but right now I have build my own.
    The Native HTTP Support was really cool; it was strange to see WSDL generated from SQL Server. It is something that can be deployed internally since I know that we will never open up our SQL boxes at work to the outside world.

    So today is the last day of the class and we have a break now. I must say going to training beats a book anytime. it just saves you so much time. This was a 3 day course. If I had to do it by myself with a book it would take me at least 2 weeks.

    Talking about books I am also in the middle of reading Inside Microsoft (r) SQL Server (tm) 2005: The Storage Engine and Pro SQL Server 2005 Database Design and Optimization and both are excellent books

    So here is what was covered in this course:

    Module 1: SQL Server 2005 Overview

    The information in this module introduces the main new features and enhancements in SQL Server 2005 and describes the new development tools.

    SQL Server 2005 Components and Architecture
    SQL Server Developer Tools
    SQL Server 2005 Security Implementation

    Lab 1: Exploring SQL Server 2005
    Using SQL Server Management Studio
    Using Schemas

    Module 2: Transact-SQL Enhancements in SQL Server 2005

    In this module, students learn about the improvements in Transact-SQL in SQL Server 2005.

    Data Definition Language Enhancements
    Data Manipulation Language Enhancements
    Structured Exception Handling

    Lab 2: Programming with Transact-SQL
    Partitioning a Table
    Using Common Table Expressions
    Using Relational Operators

    Module 3: Using XML in SQL Server 2005

    In this module, students learn about the XML-related functionality in the SQL Server 2005 database engine will be described.

    XML Enhancements in SQL Server 2005
    The xml Data Type
    Using XQuery

    Lab 3: Working with XML
    Retrieving XML from Relational Data
    Storing XML Natively in the Databases
    Using XQuery with xml Methods

    Module 4: Using Service Broker

    The information in this module introduces Microsoftr SQL ServerT 2005 Service Broker-a message-based platform for building service-oriented database solutions.

    Service Broker Architecture
    Using the Service Broker

    Lab 4: Using Service Broker
    Creating Service Broker Objects
    Implementing the Customer Service
    Implementing the E-mail Service

    Module 5: Using Native HTTP Support

    In this module, students learn how to create HTTP endpoints that make database services available to Web services clients.

    Native HTTP Support in SQL Server 2005
    Configuring Native HTTP Support

    Lab 5: Implementing a Web Service with HTTP Endpoints
    Create an HTTP Endpoint
    Test the HTTP Endpoint
    Secure the HTTP Endpoint

    Module 6: Using Notification Services

    The information in this module introduces Notification Services and how to develop Notification Services applications.

    Notification Services Architecture
    Building Notification Services Solutions

    Lab 6: Using Notification Services
    Creating a Notification Services Application
    Creating a Subscription Management Application
    Creating an Event Provider

    Module 7: Using the .NET CLR in SQL Server 2005

    In this module, students learn how to implement managed code in a SQL Server 2005 database.

    SQL Server and the .NET CLR
    Implementing Managed Code in SQL Server 2005

    Lab 7: Implementing Managed Code in the Database
    Creating a Managed Stored Procedure
    Creating a Managed Function
    Creating a Managed User-Defined Type

    Module 8: Developing Client Applications

    The information in this module introduces database developers to client application development techniques that use the data access classes provided in the .NET Framework.

    Developing Client Applications with ADO.NET
    Data Binding in the .NET Framework 2.0

    Lab 8: Developing Client Applications
    Creating a Data Bound Windows Application
    Creating a Data bound ASP.NET Application

    Module 9: Using SQL Management Objects

    The information in this module introduces the SQL Management Objects (SMO) Application Programming Interface (API) in Microsoftr SQL ServerT 2005, which provides classes that you can use to manage SQL Server. SQL Server 2005 also introduces Replication Management Objects (RMO) to allow automation of replication.

    Introduction to SQL Management Objects
    Replication Management Objects

    Lab 9: Using SQL Management Objects
    Using SMO to Retrieve Server Information
    Using SMO to Create Database Objects
    Using SMO to Modify Database Objects

    SQL Hacks

    O'Reilly came out with their latest SQL book: SQL Hacks
    From their site:
    Whether you're running Access, MySQL, SQL Server, Oracle, or PostgreSQL, this book will help you push the limits of traditional SQL to squeeze data effectively from your database. The book offers 100 hacks -- unique tips and tools -- that bring you the knowledge of experts who apply what they know in the real world to help you take full advantage of the expressive power of SQL. You'll find practical techniques to address complex data manipulation problems. Learn how to:

    --Wrangle data in the most efficient way possible
    --Aggregate and organize your data for meaningful and accurate reporting
    --Make the most of subqueries, joins, and unions
    --Stay on top of the performance of your queries and the server that runs them
    --Avoid common SQL security pitfalls, including the dreaded SQL injection attack

    Let SQL Hacks serve as your toolbox for digging up and manipulating data. If you love to tinker and optimize, SQL is the perfect technology and SQL Hacks is the must-have book for you.

    Sample Chapter
    A sample chapter is available: Hack 4: Date Handling (PDF Format)

    Table Of Contents:
    Chapter 1. SQL Fundamentals
    1. Run SQL from the Command Line
    2. Connect to SQL from a Program
    3. Perform Conditional INSERTs
    4. UPDATE the Database
    5. Solve a Crossword Puzzle Using SQL
    6. Don't Perform the Same Calculation Over and Over

    Chapter 2. Joins, Unions, and Views
    7. Modify a Schema Without Breaking Existing Queries
    8. Filter Rows and Columns
    9. Filter on Indexed Columns
    10. Convert Subqueries to JOINs
    11. Convert Aggregate Subqueries to JOINs
    12. Simplify Complicated Updates
    13. Choose the Right Join Style for Your Relationships
    14. Generate Combinations

    Chapter 3. Text Handling
    15. Search for Keywords Without LIKE
    16. Search for a String Across Columns
    17. Solve Anagrams
    18. Sort Your Email

    Chapter 4. Date Handling
    19. Convert Strings to Dates
    20. Uncover Trends in Your Data
    21. Report on Any Date Criteria
    22. Generate Quarterly Reports
    23. Second Tuesday of the Month

    Chapter 5. Number Crunching
    24. Multiply Across a Result Set
    25. Keep a Running Total
    26. Include the Rows Your JOIN Forgot
    27. Identify Overlapping Ranges
    28. Avoid Dividing by Zero
    29. Other Ways to COUNT
    30. Calculate the Maximum of Two Fields
    31. Disaggregate a COUNT
    32. Cope with Rounding Errors
    33. Get Values and Subtotals in One Shot
    34. Calculate the Median
    35. Tally Results into a Chart
    36. Calculate the Distance Between GPS Locations
    37. Reconcile Invoices and Remittances
    38. Find Transposition Errors
    39. Apply a Progressive Tax
    40. Calculate Rank

    Chapter 6. Online Applications
    41. Copy Web Pages into a Table
    42. Present Data Graphically Using SVG
    43. Add Navigation Features to Web Applications
    44. Tunnel into MySQL from Microsoft Access
    45. Process Web Server Logs
    46. Store Images in a Database
    47. Exploit an SQL Injection Vulnerability
    48. Prevent an SQL Injection Attack

    Chapter 7. Organizing Data
    49. Keep Track of Infrequently Changing Values
    50. Combine Tables Containing Different Data
    51. Display Rows As Columns
    52. Display Columns As Rows
    53. Clean Inconsistent Records
    54. Denormalize Your Tables
    55. Import Someone Else's Data
    56. Play Matchmaker
    57. Generate Unique Sequential Numbers

    Chapter 8. Storing Small Amounts of Data
    58. Store Parameters in the Database
    59. Define Personalized Parameters
    60. Create a List of Personalized Parameters
    61. Set Security Based on Rows
    62. Issue Queries Without Using a Table
    63. Generate Rows Without Tables

    Chapter 9. Locking and Performance
    64. Determine Your Isolation Level
    65. Use Pessimistic Locking
    66. Use Optimistic Locking
    67. Lock Implicitly Within Transactions
    68. Cope with Unexpected Redo
    69. Execute Functions in the Database
    70. Combine Your Queries
    71. Extract Lots of Rows
    72. Extract a Subset of the Results
    73. Mix File and Database Storage
    74. Compare and Synchronize Tables
    75. Minimize Bandwidth in One-to-Many Joins
    76. Compress to Avoid LOBs

    Chapter 10. Reporting
    77. Fill in Missing Values in a Pivot Table
    78. Break It Down by Range
    79. Identify Updates Uniquely
    80. Play Six Degrees of Kevin Bacon
    81. Build Decision Tables
    82. Generate Sequential or Missing Data
    83. Find the Top n in Each Group
    84. Store Comma-Delimited Lists in a Column
    85. Traverse a Simple Tree
    86. Set Up Queuing in the Database
    87. Generate a Calendar
    88. Test Two Values from a Subquery
    89. Choose Any Three of Five

    Chapter 11. Users and Administration
    90. Implement Application-Level Accounts
    91. Export and Import Table Definitions
    92. Deploy Applications
    93. Auto-Create Database Users
    94. Create Users and Administrators
    95. Issue Automatic Updates
    96. Create an Audit Trail

    Chapter 12. Wider Access
    97. Allow an Anonymous Account
    98. Find and Stop Long-Running Queries
    99. Don't Run Out of Disk Space
    100. Run SQL from a Web Page

    And for $19.79 on Amazon this is a great gift

    Tuesday, November 21, 2006

    SQL Server 2005 - Service Pack 2 - Web Chat Today At 9AM PST

    The SQL Server Product Team will have a web chat later today
    From their site:

    If you have any questions regarding Service Pack 2, please join us in a public web chat tomorrow between 9:00am - 10:00am Pacific Time.
    Here are a couple of links where you might have already stumbled on this:

    Event: Service Pack 2 Web Chat
    Date: Tuesday, November 21, 2006
    Time: 9:00am - 10:00am (Pacific)


    Paul A. Mestemaker II
    Program Manager
    Microsoft SQL Server Manageability

    Sunday, November 19, 2006

    Twisted Sister: Twisted Christmas

    At work I overheard two people getting excited about the Twisted Sister Christmas album. What?

    How in the world can you get excited about that? So I decided to check it out on Amazon. There are 7 reviews and the average is 4 stars. so if you are stuck in the 80's go check it out here

    The CD has 10 songs:

    1. Have Yourself A Merry Little Christmas
    2. Oh Come All Ye Faithful
    3. White Christmas
    4. I'll Be Home For Christmas
    5. Silver Bells
    6. I Saw Mommy Kissing Santa Claus
    7. Let It Snow, Let It Snow, Let It Snow
    8. Deck The Halls
    9. The Christmas Song (Chestnuts Roasting On An Open Fire)
    10. Heavy Metal Christmas (The Twelve Days Of Christmas)

    The biggest surpise is the song Heavy Metal Christmas" which is the Twisted Sister version of "The Twelve Days of Christmas"

    Instead of this:

    twelve drummers drumming
    eleven pipers piping
    ten lords a leaping
    nine ladies dancing
    eight maids a milking
    seven swans a swimming
    six geese a laying
    five gold rings
    four calling birds
    three french hens
    two turtle doves
    and a partridge in a pear tree.

    You get this gem:

    On my heavy metal Christmas my true love gave to me,
    12 Silver crosses
    11 Black mascaras
    10 Pairs of platforms
    9 Tattered t-shirts
    8 Pentagrams
    7 Leather jackets
    6 Cans of hairspray
    5 Skull earrings
    4 Quarts of Jack
    3 Studded belts
    2 Pairs of spandex pants
    And a tattoo of Ozzy!

    Fun stuff

    Wednesday, November 15, 2006

    Visual Studio Team Edition for Database Professionals RTM has been set for Nov. 30, 2006

    Microsoft has made a couple of announcements at PASS today; one of them is that the release to manufacture date for Visual Studio Team Edition for Database Professionals has been set for Nov. 30, 2006. Here is what has been announced:

    The availability of the community technology preview (CTP) of SP2 for SQL Server 2005 delivers key updates, including data compression, increased business intelligence functionality, security updates relating to Common Criteria, manageability enhancements, support for Windows Vista™ and optimization for the 2007 Microsoft Office system environment. More details about SP2 are available. The SP2 November CTP can be downloaded immediately.

    • The RC of SQL Server 2005 Compact Edition is a new offering for essential relational database functionality in a compact footprint. By sharing a familiar SQL syntax and common ADO.NET programming model with other SQL Server editions, SQL Server Compact Edition allows developers and administrators to apply their existing skills and be immediately productive while building and supporting the next generation of applications. The RC is available via Web download.

    • The release to manufacture date for Visual Studio Team Edition for Database Professionals has been set for Nov. 30, 2006. Visual Studio Team Edition for Database Professionals provides a foundation for change management, development, testing and deployment of databases through integrated functionality that enables database developers and administrators to be more productive, reduce risk and drive quality. More information can be found on MSDN.

    • Plans to provide process guidance for database professionals are included in the Microsoft Solutions Framework. This process guidance is the first of its kind in the industry and demonstrates Microsoft’s commitment to making the database professional a full-fledged participant in the application life cycle.

    Going Mobile with SQLServer Compact Edition Podcast On Channel 9

    Channel 9 has a podcast about SQL Server Compact Edition. From the site:
    "Do you remember when having a mobile phone was a big deal? I can still recall the time when I bought my first “Car Phone” as we called it. It came with a big box that you put under the seat and worked only in the car. It was pretty cool.

    Then later I got a big phone that I carried with me on trains and planes and when I made a call people would stare and comment on how important I must be to have such a device. Today even my 12 year old has a mobile phone and my 8 year old wants one. No doubt about it; mobile devices are becoming more and more important every day and I believe that they will become the most pervasive computing platform over time.

    The question I have is, what do architects need to know about designing solutions for this platform? Well on this episode I’m joined by Nick Randolph .Net Compact Framework MVP who has been building solutions and has some thoughts for us. "

    Download the mp3 or wma version of the podcast here

    Tuesday, November 14, 2006


    You want to create an indexed view; however creating an indexed view is not as straight forward as creating an index on a table. The view must be created with the SCHEMABINDING option.

    The following SET options must be set to ON when the CREATE INDEX statement is executed:

    The NUMERIC_ROUNDABORT option must be set to OFF.

    The view must not reference any other views, only base tables. There are more restrictions, you can look them up by reading Creating an Indexed View in Books On Line.

    So how do you find out if NUMERIC_ROUNDABORT is set to ON or OFF? Since we are using SQL server 2005 we can use the sys.dm_exec_sessions view

    SELECT *
    FROM sys.dm_exec_sessions
    WHERE session_id = @@spid

    But wait there is no column named numeric_roundabort. Okay then let's use the old SQL server 2000 DBCC USEROPTIONS
    After executing DBCC USEROPTIONS this is the output for me

    textsize 64512
    language us_english
    dateformat ymd
    datefirst 7
    lock_timeout -1
    quoted_identifier SET
    arithabort SET
    ansi_null_dflt_on SET
    ansi_warnings SET
    ansi_padding SET
    ansi_nulls SET
    concat_null_yields_null SET
    isolation level read committed

    Still no luck. But there is hope because SQL Server 2005 offers the SESSIONPROPERTY function

    Let's do a quick test

    SESSIONPROPERTY('numeric_roundabort') --1

    SESSIONPROPERTY('numeric_roundabort') --0

    Now let's create our own view with all the set options needed for creating indexed views included

    CREATE VIEW UserOptionsExpanded
    arithabort,SESSIONPROPERTY('numeric_roundabort') AS numeric_roundabort
    FROM sys.dm_exec_sessions

    Let's select from the view

    SELECT *
    FROM UserOptionsExpanded
    WHERE session_id = @@spid

    And yes all the columns are there

    Of course we could have just executed the following and be done ;-)


    Let's see what else we can pass into the SESSIONPROPERTY function

    Specifies whether the SQL-92 compliant behavior of equals (=) and not equal to (<>) against null values is applied.
    1 = ON
    0 = OFF

    Controls the way the column stores values shorter than the defined size of the column, and the way the column stores values that have trailing blanks in character and binary data.
    1 = ON
    0 = OFF

    Specifies whether the SQL-92 standard behavior of raising error messages or warnings for certain conditions, including divide-by-zero and arithmetic overflow, is applied.
    1 = ON
    0 = OFF

    Determines whether a query is ended when an overflow or a divide-by-zero error occurs during query execution.
    1 = ON
    0 = OFF

    Controls whether concatenation results are treated as null or empty string values.
    1 = ON
    0 = OFF

    Specifies whether error messages and warnings are generated when rounding in an expression causes a loss of precision.
    1 = ON
    0 = OFF

    Specifies whether SQL-92 rules about how to use quotation marks to delimit identifiers and literal strings are to be followed.
    1 = ON
    0 = OFF

    [Any other string]
    NULL = Input is not valid.

    So it looks like SESSIONPROPERTY takes only the SET options required to create an indexed view. Interesting indeed.

    Monday, November 13, 2006

    sys.sp_estimated_rowsize reduction_for_vardecimal

    By now you probably know that Service Pack 2 has added new functionality in the SQL Server 2005 Enterprise Edition to provide an alternate storage format that can be used to minimize the disk space needed to store existing decimal and numeric data types. No application changes area are required to use its benefits.

    This new storage format, known as vardecimal storage format, stores decimal and numeric data as variable length columns and can be enabled or disabled at a table level on new or existing tables. If you have declared a decimal or numeric column with high precision but most values do not require it, you can potentially save the disk space needed to store the table. A new stored procedure is provided to estimate the reduction in average row size with the new storage format.

    So what is the name of this new stored procedure? The name is sys.sp_estimated_rowsize_reduction_for_vardecimal . The proc returns the following columns: avg_rowlen_fixed_format, avg_rowlen_vardecimal_format and row_count

    How do you call this proc?
    exec sys.sp_estimated_rowsize_reduction_for_vardecimal 'TableName'

    If your table is named OrderDetails you would call it like this
    exec sys.sp_estimated_rowsize_reduction_for_vardecimal 'OrderDetails'

    Read this post from the SQL Server Storage Engine team for more info

    Wednesday, November 08, 2006

    Tuesday, November 07, 2006

    SQL Server 2005 Service Pack 2 CTP

    Microsoft is releasing the SQL Server 2005 Service Pack 2 Community Technology Preview Tuesday, November 7th.

    Here are some of the changes:

    Database Engine
    • Maintenance plans are now supported by the SQL Server Database Services installation. Before SP2, you were required to install SQL Server 2005 Integration Services (SSIS) to run maintenance plans on a server-only installation.

    • Maintenance plans now support multiserver environments, logging to remote servers, and multiple schedules. For more information, see How to: Create Multiserver Maintenance Plans How to: Create a Maintenance Plan and How to: Add or Modify Maintenance Plan Subplan Schedules.

    • Added new functionality in the SQL Server 2005 Enterprise Edition to provide an alternate storage format that can be used to minimize the disk space needed to store existing decimal and numeric data types. No application changes area are required to use its benefits. This new storage format, known as vardecimal storage format, stores decimal and numeric data as variable length columns and can be enabled or disabled at a table level on new or existing tables. If you have declared a decimal or numeric column with high precision but most values do not require it, you can potentially save the disk space needed to store the table. A new stored procedure is provided to estimate the reduction in average row size with the new storage format.

    • Added logon triggers and a common criteria compliance enabled Option for sp_configure to support common criteria in the SQL Server 2005 Enterprise Edition.

    • The sqllogship application is now supported. This application performs a backup, copy, or restore operation and associated clean-up tasks for a log shipping configuration.

    • Plan cache improvements that provide improved system performance, better use of the available physical memory for database pages, and the ability to return text XML query plans that contain an XML nesting level greater than or equal to 128 by using the new sys.dm_exec_text_query_plan table-valued function.

    • SMO for Relational Engine Features

      • Table.CheckIdentityValue() correctly generates the schema name for the fully qualified object name.

      • Column.AddDefaultConstraint() works against table columns for SQL Server 2000 database instances.

    To see all the changes go here

    SQL Server 2005 Script Repository

    Found this on the TechNet site; sample scripts and stored procedures for managing and working with SQL Server 2005. However on the TechNet site the code is spread ove several pages. I decided to make a user-friendly list instead

    Buffer Cache
    Sample scripts and stored procedures for managing and monitoring the SQL Server buffer cache.

    Retrieve Buffer Counts by Object and Index

    CPU and Optimization
    Sample scripts and stored procedures for optimizing CPU performance in SQL Server.

    Determine CPU Resources Required for Optimization
    Retrieve Parallel Statements With the Highest Worker Time
    Retrieve Statements with the Highest Plan Re-Use Counts
    Retrieve Statements with the Lowest Plan Re-Use Counts

    Indexes and Indexing
    Sample scripts and stored procedures for working with SQL Server indexes.

    Analyze Index Statistics
    Create/Truncate an Indexstats Table
    Determine Index Cost Benefits
    Identify Missing Indexes
    List Indexes With the Most Contention
    Retrieve Index Statistics
    Retrieve Index Usage Statistics
    Retrieve Indexes Not Used Since the Last Recycle Time
    Retrieve Object and Index Fragmentation Information
    Retrieve Tables, Indexes, Files, and File Groups Information

    Sample scripts and stored procedures for managing and monitoring input and output in SQL Server.

    Calculate Average Stalls
    List Queries That Could Benefit From an Index
    List Rarely-Used Indexes
    List Statements By Input/Output Usage

    Performance (General)
    Sample scripts and stored procedures for managing and monitoring general performance issue in SQL Server.

    List Cached Plans Where Worker Time Exceeds Elapsed Time
    List Currently-Executing Parallel Plans
    List Recompiled Statements
    List Runnable Queues
    List Statements With the Highest Average CPU Time
    List Statements with the Highest Execution Counts
    List Top Wait Types for a Workload

    Processor Cache
    Sample scripts and stored procedures for managing the SQL Server processor cache.

    Compare Single-Use and Re-Used Plans
    List Statements By Plan Re-Use Count

    SQL Text
    Sample scripts and stored procedures for working with SQL text.

    Retrieve a SQL Statement with a Specified .SQL_Handle
    Retrieve SQL Text and XML Plans

    Sample scripts and stored procedures for working with the new SQL Server Operating System Layer.

    Compare Signal Waits and Resource Waits
    List Currently-Executing Statements
    List Scheduler Wait List Information
    List Schedulers, Workers, and Runnable Queues
    List Session and Scheduler ID Information
    List SQLOS Execution Model Information
    List Statements from a Specified Waiter List

    Sample scripts and stored procedures for working with the tempdb system database.

    List Real Time Tempdb Task Usage
    List Real-Time Tempdb Statements

    Transactions and Logging
    Sample scripts for working with SQL Server transactions and locks.

    Compare Locking and Repeatable Reads
    Configure a Block Condition
    Handle a Block Condition
    List Lock and Serializable Information
    List Locks and Reads
    List Locks and Repeatable Reads
    List Real-Time Blocker and Waiter Statements
    Report Blocker and Waiter SQL Statements

    Sample scripts and stored procedures for working with SQL Server waitstats.

    Retrieve Waiter List Information
    Retrieve Waitstat Snapshots

    Saturday, November 04, 2006

    How Are Dates Stored In SQL Server?

    Internally dates are stored as 2 integers. The first integer is the number of dates before or after the base date (1900/01/01). The second integer stores the number of clock ticks after midnight, each tick is 1/300 of a second.

    So if we run the following code for the base date (1900/01/01)

    SELECT @d = '1900-01-01 00:00:00.000'


    The results are

    DateInt DateBinary
    ----------- ----------
    0 0x00000000

    TimeInt TimeBinary
    ----------- ----------
    0 0x00000000

    If we use the max date 9999/12/31

    SELECT @d = '9999-12-31 23:59:59.997'


    we get the following result

    DateInt DateBinary
    ----------- ----------
    2958463 0x002D247F

    TimeInt TimeBinary
    ----------- ----------
    25919999 0x018B81FF

    If you take binary values and convert to datetime you get the following results

    SELECT CONVERT(DATETIME,0x0000000000000001) --1 Tick 1/300 of a second
    --1900-01-01 00:00:00.003

    SELECT CONVERT(DATETIME,0x000000000000012C) -- 1 minute = 300 ticks
    --1900-01-01 00:00:01.000

    SELECT CONVERT(INT,0x12C) --= 300
    SELECT CONVERT(VARBINARY(3),300) --= 0x00012C

    SELECT CONVERT(DATETIME,0x0000000100000000) --add 1 day
    --1900-01-02 00:00:00.000

    For smalldatetime the time is stored as the number of minutes after midnight

    Now here is some fun stuff

    SELECT @d = .0
    SELECT @d
    --1900-01-01 00:00:00.000

    SELECT @d = .1
    SELECT @d
    --1900-01-01 02:24:00.000

    SELECT @d = .12
    SELECT @d
    --1900-01-01 02:52:48.000

    SELECT @d = '0'
    SELECT @d
    Server: Msg 241, Level 16, State 1, Line 2
    Syntax error converting datetime from character string.

    SELECT @d = 0
    SELECT @d
    --1900-01-01 00:00:00.000

    So there is no implicit conversion, o is fine 'o' is not

    SELECT @d = 20061030
    SELECT @d
    Server: Msg 8115, Level 16, State 2, Line 2
    Arithmetic overflow error converting expression to data type datetime.

    SELECT @d = '20061030'
    SELECT @d
    --2006-10-30 00:00:00.000

    Here we have the reverse, the varchar value is fine but the int is not.
    This happens because the max integer value that a datetime can take is 36523
    If we run the following we are okay

    SELECT @d = 2958463
    SELECT @d
    --9999-12-31 00:00:00.000

    Friday, November 03, 2006

    ISO-11179 Naming Conventions

    Straight from the man himself comes this statement posted in the microsoft.public.sqlserver.programming forum: "You need to read ISO-11179 so you use proper data element names. You
    actually had "tbl-"on the table names! Sometimes "id" id a
    prefix and sometimes it is a postfix.

    Of course you know who I am talking about? No? Joe Celko of course. So what is ISO-11179?

    The 11179 standard is a multipart standard that includes the following parts:

    Part 1: Framework, introduces and discusses fundamental ideas of data elements, value domains, data element concepts, conceptual domains, and classification schemes essential to the understanding of this set of standards and provides the context for associating the individual parts of ISO/IEC 11179.

    Part 2: Classification, provides a conceptual model for managing classification schemes. There are many structures used to organize classification schemes and there are many subject matter areas that classification schemes describe. So, this Part also provides a two-faceted classification for classification schemes themselves.

    Part 3: Registry Metamodel and Basic Attributes, specifies a conceptual model for a metadata registry. It is limited to a set of basic attributes for data elements, data element concepts, value domains, conceptual domains, classification schemes, and other related classes, called administered items. The basic attributes specified for data elements in ISO/IEC 11179-3:1994 are provided in this revision.

    Part 4: Formulation of Data Definitions, provides guidance on how to develop unambiguous data definitions. A number of specific rules and guidelines are presented in ISO/IEC 11179-4 that specify exactly how a data definition should be formed. A precise, well-formed definition is one of the most critical requirements for shared understanding of an administered item; well-formed definitions are imperative for the exchange of information. Only if every user has a common and exact understanding of the data item can it be exchanged trouble-free.

    Part 5: Naming and Identification Principles, provides guidance for the identification of administered items. Identification is a broad term for designating, or identifying, a particular data item. Identification can be accomplished in various ways, depending upon the use of the identifier. Identification includes the assignment of numerical identifiers that have no inherent meanings to humans; icons (graphic symbols to which meaning has been assigned); and names with embedded meaning, usually for human understanding, that are associated with the data item's definition and value domain.

    Part 6: Registration, provides instruction on how a registration applicant may register a data item with a central Registration Authority and the allocation of unique identifiers for each data item. Maintenance of administered items already registered is also specified in this document.

    The one that deals with naming conventions is 11179-5 The link will point to a zip file which has a pdf file in it. The TOC of this pdf file is below

    1 Scope
    2 Normative references
    3 Terms and definitions
    4 Data Identifiers within a registry
    5 Identification
    6 Names
    6.1 Names in a registry
    6.2 Naming conventions
    7 Development of naming conventions
    7.1 Introduction
    7.2 Scope principle
    7.3 Authority principle
    7.4 Semantic principle
    7.5 Syntactic principle
    7.6 Lexical principle
    7.7 Uniqueness principle
    Annex A (informative) Example naming conventions for names within an MDR registry
    Annex B (informative) Example naming conventions for Asian languages

    So check it out and hopefully you and your team can adapt a common naming conventions instead of having things named like employee_address, EmployeeAddress, employeeAddress and tblEmployeeAddress.

    Thursday, November 02, 2006

    SQL Teaser

    Try to guess what the output will be of the following SQL Select statement (don't worry, you can't get it right ;-)

    First guess then run this

    SELECT 123.654, 123d6, 123e4, '123'e4


    I lifted/stole/borrowed the select statement from Mladen Prajdic's blog

    Top 5 Posts For October 2006

    Below are the top 5 posts according to Google Analytics for the month of October 2006 in order by pageviews descending

    Five Ways To Return Values From Stored Procedures
    SQL Query Optimizations
    Do You Know How Between Works With Dates?
    NULL Trouble In SQL Server Land
    SQL Server 2000 Undocumented Procedures For Files, Drives and Directories

    The reason Five Ways To Return Values From Stored Procedures was number one is because it was posted on digg. Once something makes it to digg and people seemed to like it then you can expect thousands of hits or even more (The Digg Effect) So far 33 people have dugg this story. The link to the digg URL is below

    After that 23 people saved it on that link is below

    I will update the top 10 of all time (below the masthead) later today

    [edit]Why wait? I have updated the top 10, you can see it here:[/edit]