Showing posts with label Interview. Show all posts

Wednesday, July 02, 2008

Tuesday, May 06, 2008

What Would You Like To Ask Erland Sommarskog?

In the Who do you want to see interviewed next? (sqlblog) and Who do you want to see interviewed next? (here) blog posts I asked for the names of people YOU would like to see interviewed. Erland Sommarskog's name was submitted in the comments a couple of times. I contacted Erland and am happy to announce that he has agreed to do this.

The first time I heard of Erland Sommarskog was in the SQL Server programming newsgroups. I also read all the articles on his site. When I answer questions I usually give out links to these four articles:
Arrays and Lists in SQL Server
Implementing Error Handling with Stored Procedures
Error Handling in SQL Server – a Background
and of course The curse and blessings of dynamic SQL

I always wondered why he didn't write a SQL book. Hey, now I can ask him that :-)

Before you think of any questions please visit Erland's page first (http://www.sommarskog.se/index.html) and read some of the articles he wrote.

You can submit questions until Monday May 12th and then I will forward the questions to Erland.

What Would You Like To Ask Craig Freedman?

In the Who do you want to see interviewed next? (sqlblog) and Who do you want to see interviewed next? (here) blog posts I asked for the names of people YOU would like to see interviewed. Craig Freedman's name was submitted in the comments a couple of times. I contacted Craig and am happy to announce that he has agreed to do this.

Before you think of any questions please visit Craig's blog first (http://blogs.msdn.com/craigfr/) to get a feel for what kind of questions to ask. It would make sense to keep the questions focused on query processing, query execution, and query plans. I told Craig that if he doesn't like a question he does not have to answer it.

You can submit questions until Monday May 12th and then I will forward the questions to Craig.

Thursday, May 01, 2008

Who do you want to see interviewed next?

I have interviewed a bunch of people already, some of them on this blog and some of them on SQLBlog.

Below is the list of all the people so far:

Louis Davidson
Itzik Ben-Gan
Ken Henderson
Kalen Delaney
Adam Machanic
Interview With Stéphane Faroult About Refactoring SQL Applications
Database Refactoring Interview With Scott W. Ambler
Interview With Joe Celko About The Forthcoming Book Thinking In Sets

Who else should I interview? I have some ideas but I would like you to participate. Here is my thinking: after we have a list of possible people, I will contact them and ask if they will agree to an interview. If they agree, I will make a post here and ask you what you would like to ask this person. I usually have a good set of questions but there might be someone out there with some really good or interesting questions.
So post the name of the person you would like to see interviewed and hopefully this person will agree.

It would be nice if you did not pick a person who blogs at SQLBlog but if_you_really_wanted_to then I will not stop you :-0

Please keep it database related; it doesn't have to be specific to SQL Server but it has to be applicable to SQL Server.

Monday, April 21, 2008

Interview With Stéphane Faroult About Refactoring SQL Applications

I decided to interview Stéphane Faroult, the author of The Art of SQL, because it is one of my favorite SQL books. Stéphane told me he has a new book coming out soon. The name of this book is Refactoring SQL Applications, and the majority of this interview is focused on that book.

You can find that interview here: Interview With Stéphane Faroult About Refactoring SQL Applications

Wednesday, April 02, 2008

Database Refactoring Interview With Scott W. Ambler

Since I am doing scrum and other agile methodologies myself I decided to interview the authority on agile in the database world.

Scott W. Ambler is the Practice Leader, Agile Development, for IBM Rational.

You can find more about Scott here: http://www.ibm.com/software/rational/bios/ambler.html

I did not feel like duplicating the content here so you can read the interview here: Database Refactoring Interview With Scott W. Ambler

Monday, December 17, 2007

Interview With Joe Celko About The Forthcoming Book Thinking In Sets

I noticed that Joe Celko has a new book coming out soon: Joe Celko's Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL. I decided to send Joe an email to see if he would be willing to answer some questions I had about the book and SQL in general. To my surprise Joe was more than willing to accommodate my request. The question-and-answer session with Joe that follows was conducted via email.

Is the book geared towards a beginner/intermediate level user or do you have to be an advanced user to really utilize the information in this book?
I would say intermediate level. You need to know enough SQL to do some programming in the language and be experienced enough to see that DDL is as important as DML.

What are the most important things a person can do to master SQL?
The most important thing is to make the leap from procedural programming to declarative programming, from sequential files to sets. The only declarative language that most programmers have seen is spreadsheets. They are nothing like SQL!

I assume you wrote this book because people, when they first use a database, tend to do the same thing they do in Java, C# or other procedural languages: get a bunch of rows and manipulate them one by one. Is this the number one mistake beginners make?
That is right up there in the top five, for sure! But I think that the classic error is in not knowing how to design the schema in the first place. A bad schema -- denormalized, bad data element names, no constraints, no proper keys, no referential integrity -- leads to trying to correct the flaws in DDL in the DML. If you have a good schema, then the queries, inserts, updates, and deletes are usually fairly easy. All of the "hard work" should be hidden in the database and not spread over the application code and DML.

What is so bad about attribute splitting (somehow these three tables come to mind: Squids, Automobiles and Britney Spears)?
NO, no, no! I coined the phrase "Squids, Automobiles and Britney Spears" table or column to refer to a table or column which has more than one kind of entity or more than one kind of attribute in it. What makes that phrase so fun is that there is a web page which compared photos of a giant squid and Britney Spears after she cut off her hair. If you actually gave such tables or columns a meaningful name, then you would see that it is not a proper table or column. These nightmares would have names like "hat_or_shoe_size_depending_on_column_five". Attribute splitting is when you take an attribute and make it into two or more tables, columns or rows. The table example is the most common one. How often have you seen a table for each location (split on geography) or a table for each month (temporal split)? This mimics a tape file system, so newbies feel comfortable. When you ask them if they would split a Personnel table into MalePersonnel and FemalePersonnel (split on gender), they immediately see the fallacy. Unfortunately, splits need all kinds of code if they want any data integrity. This code usually re-assembles the data elements back to what they should have been in the first place.
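
The temporal split Joe describes can be sketched in a few lines. This is only an illustration, run through Python's built-in sqlite3 module rather than SQL Server; the table names (sales_2007_11, sales_2007_12, sales), the view, and the sample rows are all hypothetical. It shows the reassembly code that one-table-per-month forces on every reader, next to the unsplit table where the month is an ordinary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Split design (hypothetical): one table per month, the month hides in the table name.
    CREATE TABLE sales_2007_11 (sale_id INTEGER PRIMARY KEY, amount INTEGER NOT NULL);
    CREATE TABLE sales_2007_12 (sale_id INTEGER PRIMARY KEY, amount INTEGER NOT NULL);
    INSERT INTO sales_2007_11 VALUES (1, 50), (2, 70);
    INSERT INTO sales_2007_12 VALUES (1, 20);

    -- Reassembly code: the split has to be undone before anyone can query it.
    CREATE VIEW sales_reassembled AS
        SELECT '2007-11' AS sale_month, sale_id, amount FROM sales_2007_11
        UNION ALL
        SELECT '2007-12' AS sale_month, sale_id, amount FROM sales_2007_12;

    -- Unsplit design: the month is a column and part of the key.
    CREATE TABLE sales (
        sale_month TEXT    NOT NULL,
        sale_id    INTEGER NOT NULL,
        amount     INTEGER NOT NULL,
        PRIMARY KEY (sale_month, sale_id)
    );
    INSERT INTO sales SELECT sale_month, sale_id, amount FROM sales_reassembled;
""")

def monthly_totals(source):
    """Total amount per month from either the reassembled view or the single table."""
    if source not in ("sales", "sales_reassembled"):
        raise ValueError("unknown source")  # only the two fixed names above are allowed
    sql = (f"SELECT sale_month, SUM(amount) FROM {source} "
           "GROUP BY sale_month ORDER BY sale_month")
    return conn.execute(sql).fetchall()

print(monthly_totals("sales_reassembled"))
print(monthly_totals("sales"))
```

Both calls return the same totals, but the view has to be edited every time a new month's table appears, while the single table only needs new rows.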

In chapter two of your new book you talk about the new advances in hardware. I have noticed that somehow the amount of data I have to store outpaces the advances in hardware and the queries don't run any faster. Will Solid-State Disks finally change that?
I have great hopes for solid state devices. Any solid thing that has to physically move is by definition slower than electricity or light. We are looking at nano-tech, better lasers and a ton of new technology almost every month.

I see you have a whole chapter on auxiliary tables, I am a big fan of those tables myself and I use them to create dates on the fly or split comma delimited strings. In your opinion what percentage of developers does not use them and why should you use them?
Not enough. SQL is a language meant to handle data and not to do computations. Auxiliary tables can be shared among sessions and accessed in parallel. Imagine a complicated but deterministic formula. In a procedural computational model, you hang in a loop and re-do the work for each record in a sequential file. Let me give you a real-world example of that. Corky's BBQ does a huge mail-order business at Christmas time. The pick lists need to include the size of the boxes to be used, so part-time help can do their jobs. When I got this problem, they had discovered that using weights was not the right way to go. The approach being attempted was to play "3-D Tetris" with the products. Since that kind of thing is a bitch to program, they were getting nowhere. My approach was to look over a few years of shipping history and find out how many different orders they had shipped and what the smallest box used for each of those orders was. There were only about 5000 configurations and the majority were standard gift packages. Do a relational division and a table look-up to handle 99.98% of the cases and print the phrase "Hand pick this order" on the pick list for the exceptions! In the SQL model, you join an auxiliary table which has the parameters and the result value in each row. You can do this simple multi-column equi-join once in parallel. Wait until the multi-core chips make parallelism the only way to design a computer; then auxiliary tables are going to really fly.
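
A minimal sketch of that look-up idea, using Python's built-in sqlite3 module so it can be run anywhere. Everything here is assumed for illustration: the tables box_lookup and orders, their columns, and the sample data are hypothetical and not Corky's actual schema, and the sketch reduces an order to two quantities instead of doing a real relational division. The point is only the shape of the query: the parameters and the precomputed result sit in one row of the auxiliary table, and a multi-column equi-join replaces the per-order computation.

```python
import sqlite3

def pick_list(orders, box_rows):
    """Return (order_id, box_size) rows by joining orders to an auxiliary look-up table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Auxiliary table: the parameters plus the precomputed answer in each row.
        CREATE TABLE box_lookup (
            ribs_qty  INTEGER NOT NULL,
            sauce_qty INTEGER NOT NULL,
            box_size  TEXT    NOT NULL,
            PRIMARY KEY (ribs_qty, sauce_qty)
        );
        CREATE TABLE orders (
            order_id  INTEGER PRIMARY KEY,
            ribs_qty  INTEGER NOT NULL,
            sauce_qty INTEGER NOT NULL
        );
    """)
    conn.executemany("INSERT INTO box_lookup VALUES (?, ?, ?)", box_rows)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute("""
        SELECT o.order_id,
               COALESCE(b.box_size, 'Hand pick this order') AS box_size
          FROM orders AS o
          LEFT JOIN box_lookup AS b
            ON b.ribs_qty  = o.ribs_qty
           AND b.sauce_qty = o.sauce_qty
         ORDER BY o.order_id
    """).fetchall()
    conn.close()
    return rows

print(pick_list(
    orders=[(1, 1, 2), (2, 2, 2), (3, 7, 9)],
    box_rows=[(1, 2, "small"), (2, 2, "medium")],
))
```

Orders whose configuration is missing from the look-up table fall through the LEFT JOIN and get the hand-pick message instead of a box size.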

I noticed you have a big chapter on VIEWs; do you feel that VIEWs are not utilized enough by developers?
They are used either not enough or too much.

Why do you think we should not use bit or Boolean flags in SQL?
In SQL, to be a data type it must allow NULLs. What is the fundamental nature of a BIT? It is one or zero; there is no NULL concept here! This was a problem for SQL Server, when they made their BIT data type into a numeric that could be NULL-able. The change scrambled a lot of data when it was ported from one release to the next. Bits are low-level, hardware dependent concepts. Are you high end or low end hardware? Do you have 4, 8, 16, 32 or 64 bit machine words? You have to use proprietary operators. This defeats the idea of machine independence. Finally, bit flags are used to destroy First Normal Form (1NF) and thus destroy data integrity. As an example, recently in a newsgroup someone wanted to use a 4-bit column to store all the possible colors for a product (red, green, blue, yellow) and get them out with a bit mask (his hardware has a nibble!). But how do you add purple? How do you set up a constraint that no item can be made with both red and green options? In RDBMS, we discover the state of our data with predicates and not by setting flags at the hardware level.
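
One possible alternative to the 4-bit color column, sketched with Python's built-in sqlite3 module. This is an assumption about what a row-per-color design could look like, not a design taken from Joe's book: the tables colors and product_colors, the trigger name, and the helper functions are all hypothetical. Adding purple becomes an INSERT into a reference table, and the red/green rule becomes a declared check (done here with a trigger, because SQLite has no multi-row assertions).

```python
import sqlite3

def make_db():
    """Build an in-memory database with one row per (product, color) instead of bit flags."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        PRAGMA foreign_keys = ON;
        CREATE TABLE colors (color TEXT PRIMARY KEY);
        INSERT INTO colors VALUES ('red'), ('green'), ('blue'), ('yellow');
        CREATE TABLE product_colors (
            product_id INTEGER NOT NULL,
            color      TEXT    NOT NULL REFERENCES colors (color),
            PRIMARY KEY (product_id, color)
        );
        -- Reject an insert that would give one product both red and green.
        CREATE TRIGGER no_red_with_green
        BEFORE INSERT ON product_colors
        WHEN NEW.color IN ('red', 'green')
         AND EXISTS (SELECT 1 FROM product_colors
                      WHERE product_id = NEW.product_id
                        AND color IN ('red', 'green')
                        AND color <> NEW.color)
        BEGIN
            SELECT RAISE(ABORT, 'red and green are mutually exclusive');
        END;
    """)
    return conn

def add_color(conn, product_id, color):
    """Try to add a color option; return False when a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO product_colors VALUES (?, ?)", (product_id, color))
        return True
    except sqlite3.DatabaseError:
        return False

def colors_of(conn, product_id):
    """The state of the data is found with a predicate, not a bit mask."""
    rows = conn.execute(
        "SELECT color FROM product_colors WHERE product_id = ? ORDER BY color",
        (product_id,))
    return [r[0] for r in rows]

conn = make_db()
print(add_color(conn, 1, "red"))     # first color for product 1
print(add_color(conn, 1, "green"))   # conflicts with red
print(colors_of(conn, 1))
```

Purple is handled by inserting a row into colors; no column has to be widened and no existing data is reinterpreted.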

I have your puzzles book and noticed that you have a paragraph on Sudoku and one on Bin Packing in the last chapter of this book. You have procedural solutions and SQL solutions for most of the material in the chapter; is the SQL solution faster?
There is a funny story on the Sudoku problem. Richard Romley is a retired DBA from Smith-Barney and he decided to play with Sudoku for recreation. He used SQL Server on his home machine and coded a solver in one SELECT statement. The procedure takes 81 parameters (the starting grid) and does an 81-way self-join. It produces ALL the valid answers -- bet you did not know that many published puzzles have multiple solutions! The code is straightforward and depends on the optimizer to handle the search condition logic. Even longer problems with tens of answers run in well under one second. The procedural solvers vary, but I have seen some that stop when they get to the first valid grid. If there is only one solution, they are very fast. But is the right answer actually the *set* of valid grids? Since I am an SQL person, I think so. The procedural solvers can get hung up by backtracking to the starting position when there are a few hundred answers and become very slow. I also strongly recommend getting the Japanese or Chinese editions if you read either language. My two translators cleaned up some old code and added new solutions as we went along.

Should having good naming conventions such as 11179-5 be included in database courses?
Drop them in as soon as you start. If you grow up with good conventions, you will start doing it without thinking about it. When I teach RDBMS, I start with scales and measurement theory so that my students know what data is all about -- whether it is in a database or not.

When can we expect your new book to show up in bookstores?
It was supposed to be out in 2008 February, but we lucked up and it will be out in 2008 January. Production was faster than planned. I guess after seven books, and working with the same people, we have it down pretty well!

A bunch of questions not related to the book

Why do you write technical books?
I have no talent for fiction. I cannot get a plot or characters onto paper to save my life and my dialogues are awful chains of "he said-she said" stiff sentences. My grandfather wrote children's poems in Slovak, and I have even less talent for poetry. I have a number of friends who write detective novels, Science Fiction and Fantasy. They don't consider me a real author because I don't do fiction. I think about trying my hand at YA (Young Adult) books -- Danica McKellar (Winnie from the WONDER YEARS television show) just did a math book for girls, so maybe I could do "A Child's Garden of Normal Forms" or a juvenile detective series called "The Hardware Boys", then go on to a television show called "Query Eye for the Database Guy" or something.

Which of your books was the hardest to write and why?
First edition of SQL FOR SMARTIES! It was my first book and I thought that having written a few hundred magazine columns would make it easy. I was dead wrong -- completely different skill set. I was a year late in delivering the manuscript. After that, I had a system in place.

What is coming down the line? Any new books or updates to current ones?
I am trying to do at least one book per year -- more if I am unemployed and need the advances. My current thoughts are a book on the use of Standards in a database, and one on programming tricks with OLAP functions, CTEs, and other new features in the SQL-2003 Standards. My Morgan-Kaufmann books tend to follow a five year cycle, just like the ANSI/ISO Standards. I also get asked by vendors to do product specific books. I might self-publish something completely off-topic. I have a book on domino games based on my postings and I teach Texas-42 on Royal Caribbean Cruises -- it is a domino game only played in Texas. I also have a book on Pai Gow (a gambling game played with Chinese dominoes) which might sell 10 copies. I will also be doing some video classes, but I don't have details yet.

Which of all the SQL books that you wrote is your favorite?
DATA & DATABASES, which never got the sales of the others. It is more philosophical and concerned with the nature of data instead of programming.

What SQL Server books are on your bookshelf?
Anything I can get by Henderson, Machanic, Moreau, Ben-Gan and Delaney. The SQL Server experts are pretty well-known and they publish. This is not true for other products, especially the open-source RDBMS products.

Why do you participate in newsgroups and do you think it is a good idea for beginners to ask questions in newsgroups?
To do some shameless self-promotion, of course :) Newsgroups are a good source of SQL problems and some clever answers that I can use in books and when I am consulting. I also have a pedantic streak I need to get out. And if I am available on a newsgroup, people don't fill up my mailbox at home. And, yes, beginners should use newsgroups for help. But not to have someone else do their job or their homework for them. I like to see the mindset of people who are just learning SQL. It is not enough to see that someone is making a mistake; you want to figure out what led to that particular mistake. Remember Chernobyl? Everyone did just what they were supposed to do, but there were a few critical assumptions that led to an event cascade.

What are Cowboy Coders and id-iots?
The term "cowboy coder" is an old one. It means someone who starts coding without any design phase, without any overview of the system as a whole, without any research for industry standards or a company data dictionary. They usually love dialect code and tricks that trade maintainability for immediate performance. The heavy dialect code also gives them job security, since they usually only know one product. An "ID-iot" is a newbie who has no RDBMS education and wants to have the comfort of a sequential file system. So he puts an IDENTITY column on every table as the PRIMARY KEY. Never mind that it is proprietary and non-relational; it is the familiar record number from a file system which he can use to mimic pointer chains. He does not understand that rows are not records, tables are not files, columns are not fields and references are constraints and not pointers.

I have been working with Sybase IQ for a little bit; what is your opinion on columnar databases?
Sybase IQ is not the only game in town. I consulted with SAND (nee Marcus, nee Nucleus) years ago. It was one of the first such products. Later I ran into WX2 (nee White Cross) and I am looking at Stonebraker's Vertica now. Their advantages in parallelism and compressing large amounts of data make them the best choice for Data Warehouses. I would also look at Teradata, which uses hashing. That will become more important as the research on minimal perfect hashing functions gets out of the lab and into products.

Where can we expect to see you in 2008? Any conferences, seminars, trade shows or classrooms perhaps?
I will hopefully be doing some more "SQL Saturdays!" on my weekends. I want to do more webcasts, but I am not sure if I am ready for YouTube. My other travel goal for 2008 is to get to Australia or Japan; I have never gone past Hawaii.

Some of Joe Celko's Books:
SQL for Smarties
SQL Programming Style
Trees and Hierarchies in SQL
SQL Puzzles and Answers
Data and Databases

Sunday, December 02, 2007

The Sad State Of Programmers Part 1 : The Phone Interview.

This is going to be a three part series.
Part 1 The phone interview.
Part 2 The face to face interview.
Part 3 Some tips and observations

A while back I posted that we are looking for a SQL/.NET/FoxPro developer. I did this because we had a real hard time finding this person. I am happy to inform you that we did find this person and he will start in two weeks. Interestingly enough we hired the person with the least years of experience (on paper). This person knew more than people with three times his experience in years.

These days when looking for a programmer you have to do phone interviews if you don’t want to waste an incredible amount of time. A phone interview enables you to assess the skill set of a potential employee without wasting time by picking him up, getting a security badge, booking a conference room etc. A phone interview is also good for the candidate since he/she doesn’t have to travel or dress up to do the interview.

Some things are difficult to ask over the phone but if the candidate looks (or should that be sounds) good then you can ask those questions when you bring the person in. Some people will prepare for a phone interview by having all their books and notes in front of them. They will ask you to repeat the question and while you do so you can hear them flipping pages frantically. So you might be able to cheat on the phone interview but be assured that if you do not know your stuff that you will fall flat on your face on a face to face interview (no pun intended).

One thing I never understood is the fact that it takes a person one minute to answer a question. You either know or don’t know the answer. Keep your answers concise; do not spend 3 minutes explaining to me what the difference is between a clustered and a nonclustered index.
I had to reword my questions slightly because when I asked a question like “Do you know what the difference is between a clustered index and a nonclustered index?” some people would reply “yes”. Because of that I changed the question to “Describe the difference between a clustered index and a nonclustered index.”

Do not shoot yourself in the foot by giving me additional information which is wrong. I asked for the fastest way to empty a table. Almost every single person who knew about truncate added that you cannot roll back a truncate statement. I wrote about that myth a couple of months ago: SQL Myth: Truncate Cannot Be Rolled Back Because It Is Not Logged

I tend to ask between 20 and 40 questions; if I see the candidate’s skill is not good enough I don’t ask everything. Some of the questions are esoteric but I simply ask these questions to get a feel for the overall skill level; it doesn’t matter if they answer these wrong. You can find a list of questions here: How Well Do You Interview And Do You Use Wizard Driven Programming?

Here are some interesting answers from the interviews.
Almost every single person answered that an index scan is better than an index seek.
There were several people with SQL Server 2005 experience who couldn’t name a single new thing introduced in SQL Server 2005. I asked about windowing functions, DMVs, pivot, apply and more; this was all Greek to them. One person had on her resume that she developed an app in SQL Server 2005. When I asked about her experience she told me she had just started to read about SQL Server 2005. This is a big show stopper. Sometimes headhunters/recruiters will tell you to just add it to your resume; I wouldn’t do it because it makes you look bad. If the SQL Server 2005 experience is not true, what else could be made up? One person had on his resume that he optimized complex stored procedures. When I asked how he did it, he replied that he only selected the rows he needed instead of the whole table. This obviously didn’t answer my question.
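
On the seek-versus-scan answer, here is a rough way to see the difference for yourself. It is only an analogy: it uses Python's built-in sqlite3 module, and SQLite's EXPLAIN QUERY PLAN reports SEARCH and SCAN where a SQL Server plan would show an index seek and a scan. The table, index and helper function are hypothetical.

```python
import sqlite3

def plan_for(conn, sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, last_name TEXT, city TEXT)")
conn.execute("CREATE INDEX ix_customers_last_name ON customers (last_name)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(i, f"name{i}", f"city{i % 10}") for i in range(1, 1001)])
conn.commit()

# Equality predicate on the indexed column: the engine can go straight to the matching rows.
seek_plan = plan_for(conn, "SELECT * FROM customers WHERE last_name = 'name42'")
# Predicate on a column with no index: every row has to be read and tested.
scan_plan = plan_for(conn, "SELECT * FROM customers WHERE city = 'city3'")

print(seek_plan)
print(scan_plan)
```

The first plan touches only the rows that match; the second reads the whole table. That is why a seek is generally the cheaper operation when a query needs a small fraction of the rows.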

That is it for the phone interview; part 2 will be up in a day or two.

Monday, October 15, 2007

Interview With Kalen Delaney About Inside Microsoft SQL Server 2005 Query Tuning and Optimization

I am a big fan of Inside Microsoft SQL Server 2005: The Storage Engine, so when I saw that yet another book got published in this series I just had to check it out. Inside Microsoft SQL Server 2005: Query Tuning and Optimization is very well written, gets to the point, and gives you the answers that you need. I decided to contact Kalen to see if she would be willing to do an interview. By now you know that the answer is yes, of course ;-)


The question-and-answer session with Kalen that follows was conducted via email.

What is the audience for this book, is it the enterprise user or can a small department benefit from the tips in this book?
Because this book deals with query tuning, anyone who writes SQL queries for SQL Server can benefit. Very little in the book is geared towards system tuning, so the size of the machine doesn’t really matter. Now of course, if you have a very small system with very small tables, you won’t get as much benefit out of tuning your queries. However, if you have any tables of more than a few thousand rows, and you do any joins, you will need to tune your queries. In addition, the issues of blocking and concurrency control can impact any system, no matter how small.

What new technologies in SQL Server 2005 do you think are the most beneficial for performance?
For very large databases, the best new technology is partitioning. For any size system, if you have had serious performance problems due to blocking, you might find a big performance benefit by using one of the snapshot-based isolation levels, but you really need to understand the resource costs that come along with the improved performance. For your individual queries, I think the new optimizer hints and query level recompiles can make a big difference. For indexes, the ability to add included columns to nonclustered indexes can give some of your hard-to-tune queries a major performance boost.

What will a person who reads this book gain in terms of understanding how to performance tune a server?
The focus of this book is not so much on tuning the server, but on tuning queries. There is more in Inside SQL Server 2005: The Storage Engine on server issues such as memory and processor management. The biggest server wide issues are covered in Chapter 5, when I talk about managing the plan cache, and how and when query plans are reused.

Is the book geared towards a beginner/intermediate level user or do you have to be an advanced user to really utilize the information in this book?
The book is not geared towards beginners, but everyone should be able to get something out of it. It’s primarily geared to SQL Server developers and DBAs who have been working with SQL Server for a while, and have encountered performance problems that they are trying to find solutions for.

With all the changes in SQL Server 2005, how critical has the tempdb become in regards to performance?
Tempdb has always been important. In SQL Server 2005, if you are using one of the snapshot-based isolation levels, you are going to have to be more aware of the demands placed on tempdb, both in the sizing requirements and the additional I/O resources needed. Fortunately, SQL Server 2005 provides tools to monitor tempdb, including a dozen new performance monitor counters, and a dynamic management view, sys.dm_db_file_space_usage, that keeps track of how much space in tempdb is being used for each of the different kinds of objects stored in tempdb.

I understand that this is the first time you wrote with a team of other writers; can you tell us something about that experience?
I initially thought that not having to write the whole volume by myself meant that I could get it done sooner, but that was not the case. Everyone had their own schedule and their own way of writing. The personal aspect of working with the other authors was great. I deeply respect all of the others and it was an honor to be working so closely with them. I had some concerns about the depth of coverage and I wondered whether all the chapters would end up being as deep as I hoped for, but that turned out not to be a major problem. The only real issues were agreeing on a common terminology and coding style, and even that wasn’t that big of an issue, because I got to do a final editing pass on everyone’s chapters.

What SQL Server books are on your bookshelf?
All of the Inside SQL Server books are there, of course, and all of Ken Henderson’s books. Bob Beauchemin’s book is in my car, to read while I am waiting for the ferryboat, and while on the ferry. I also have technical books that aren’t SQL Server specific, such as Jim Gray’s Transaction Processing, Russinovich’s and Solomon’s Windows Internals, Chris Date’s Introduction to Database Systems and Mike Stonebraker’s Readings in Database Systems.

Why do you write technical books?
I love working with SQL Server and trying to find out all I can about it. When I found that I could explain difficult concepts in a training environment, I thought I could do the same thing in a written format, and reach more people that way. I have always loved explaining things, ever since I was a teaching assistant for High School Math.

Will you be updating your books for SQL Server 2008?
I have just started meeting with my editors at Microsoft Press about SQL Server 2008, and it looks like a revision is in the plans. We’re really looking at it as just a revision, with the same structures as the current books, with straightforward changes and the inclusion of new features.

Name three things that are new in SQL Server 2005 that you find are the most valuable?
Dynamic Management Views, Dynamic Management Views, and Dynamic Management Views!
Oh, you wanted three different things? ;-) How about XML query plans and optimization hints. (I’m also very fond of many of the new TSQL constructs, but I was only talking about things that I cover in my new book.)

Name three things which are coming in SQL Server 2008 that you are most excited about?
You’ll have to ask me this next time. I have actually been avoiding SQL Server 2008 while I was getting my Query Tuning and Optimization book finished, because I didn’t want to get distracted.

Can you list any third party tools that you find useful to have as a SQL Server developer/admin?
I’ve tried a few other products, but usually I find that it is much easier to just stick with the Microsoft line and use the tools provided with the product.

Name some of your favorite non-technical books.
Oh, I love to read. It would be impossible to list my favorite books, but I can tell you my favorite authors, most of whom write science fiction: Lois McMaster Bujold, Ursula LeGuin, Sheri Tepper, Orson Scott Card, Octavia Butler, Elizabeth Moon. I also love to read historical fiction like Leon Uris.

Wednesday, May 16, 2007

Interview with Adam Machanic Author Of Expert SQL Server 2005 Development

I am a big fan of Adam Machanic and the way he thinks about SQL Server development. How many SQL developers unit test their procedures or use assertions in their code? Well, I can guarantee that there are very few of those. I was very excited when Adam Machanic contacted me to give me a copy of Expert SQL Server 2005 Development for review. I did not finish reading the whole book yet but what I have read so far is very good. If you are an advanced developer and want to kick it up a notch then this is the book for you.


The question-and-answer session with Adam that follows was conducted via email.

What is the audience for this book? Is the book geared towards a beginner/intermediate level user or do you have to be an advanced user to really utilize the information in this book?
The book is certainly not for beginners; I make a lot of assumptions about the level of knowledge possessed by the reader. The reader should understand the basics of writing T-SQL, creating indexes, reading query plans, etc. It's difficult to differentiate between intermediate and advanced users, as everyone has their own opinion of what those levels actually mean! I've covered several not-so-common topics, and my hope is that almost any SQL Server practitioner who has advanced beyond the basics can learn something from the book.


What are the most important things a person can do to master Transact-SQL?
Practice, practice, and more practice! The more complex queries you write, and the greater the variety of problems you solve in T-SQL, the easier it gets. Just like other programming languages, T-SQL programmers use and reuse various patterns as they work on different projects and problems. Gaining skill in T-SQL is really nothing more than a process of learning when and how to properly apply the patterns to any given situation.


Adam, the first time I came across your name was in the Microsoft public SQL Server programming group. Can you explain to us why you participate in newsgroups and forums?
When I first started working with SQL Server and other Microsoft technologies I discovered Microsoft's news server and was shocked to discover the overall quality and high signal-to-noise ratio compared to many other (Usenet) newsgroups I'd previously frequented. Questions I asked were actually answered--by people who knew what they were talking about, no less! After spending quite some time asking my own questions I realized I could answer a few here and there too, and give something back to the community that had helped me so much. Over time I became hooked. What a great way to kill free time (especially while waiting for software to compile or long queries to run)--not only does it feel good to help others solve their problems, but it also helps to strengthen my skills as a developer. As I mentioned in the previous question, the most important key to software development greatness is practice. By answering newsgroup and forum posts, you can get exposure to all sorts of business problems and scenarios that you would otherwise never see. There is no better learning experience that I have found.


Why is it important to think in sets instead of procedurally?
It's often said that thinking in sets takes advantage of the declarative nature of SQL; and while this is certainly true I think it's important to drill down to why that's such a big deal. SQL query performance, generally speaking, comes down to I/O performance. Too many I/Os, or an I/O-bound system, and your query won't return fast enough--simple as that. When we write set-based queries, we allow the query engine to optimize in order to minimize the number of index requests--and, therefore, I/O operations--necessary to satisfy the query. Working procedurally, with loops, means that the opposite happens--the maximum number of index operations will happen on every invocation, and therefore the query will perform in constant time—at the worst possible level.
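
As a rough illustration of the contrast described above, the sketch below applies the same change twice: once row by row with a cursor, and once as a single set-based statement. The table dbo.Orders, its columns, and the 5 percent discount rule are all hypothetical, invented here for illustration only; they do not come from the book.

```sql
-- Hypothetical table, used only for this illustration
CREATE TABLE dbo.Orders
(
  orderid  INT   NOT NULL PRIMARY KEY,
  amount   MONEY NOT NULL,
  discount MONEY NOT NULL DEFAULT 0
);
GO

-- Procedural approach: one UPDATE (and one index operation) per qualifying row
DECLARE @orderid AS INT;

DECLARE c CURSOR LOCAL FAST_FORWARD FOR
  SELECT orderid FROM dbo.Orders WHERE amount >= 1000;

OPEN c;
FETCH NEXT FROM c INTO @orderid;

WHILE @@FETCH_STATUS = 0
BEGIN
  UPDATE dbo.Orders
    SET discount = amount * 0.05
  WHERE orderid = @orderid;

  FETCH NEXT FROM c INTO @orderid;
END

CLOSE c;
DEALLOCATE c;
GO

-- Set-based approach: a single statement, so the query engine is free
-- to choose how to locate and update the qualifying rows
UPDATE dbo.Orders
  SET discount = amount * 0.05
WHERE amount >= 1000;
```

Both versions leave the table in the same state; the difference is how much freedom the optimizer has in getting there.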


Why do you feel it is important to properly unit test procedures?
I cover unit testing in some detail in Chapter 2 of the book, including when and why you want to do it. But what it boils down to is that creating a solid unit test infrastructure helps to guarantee that as you change your code over time, you won't break anything--and if you do, you'll find out about it very quickly. Studies have shown that by far the most expensive part of the software development process is maintenance. This is due to the fact that as we change our software, we break other pieces (dependencies)--and oftentimes, those dependency breaks silently cause subtle issues, that aren't noticed until long after the change was actually made. This can make it extremely difficult to track down and fix the root causes of problems. Comprehensive unit testing makes it possible to eliminate a lot of that delay, by alerting the developer right away if something doesn't look good. This, in turn, results in better quality software, happier customers, more business, and as a result, bigger bonus checks. A win-win situation, if you ask me.


Which chapter was the hardest to write and can you explain why?
Chapter 10, "Working with Temporal Data," took me four months to write. It's my favorite chapter in the book, and one of my favorite topics to think about, but it covers a huge variety of scenarios. As I mention in the chapter, temporal data finds its way into every single database, so it's a large and far-reaching topic. Attempting to cover it in a fairly complete manner turned out to be quite a big task!


Why do you write technical books?
Mainly for the groupies. But aside from that obvious perk, I do it simply because I love sharing technology and techniques with others. Writing a book is similar to participating in a newsgroup or technical forum, but on a much deeper level. Whereas I might answer a question on a forum in 400 words, or write a 3,000-word article, in a book I can cover the same topic in 15,000 or more words--really guide the reader through all of the pitfalls and ins and outs of a given area, and do things that would be impossible in another medium (except, perhaps, in a long series of blog posts or articles). Plus, let's be honest--it's just a real thrill seeing my name on the cover of a book!


Who are your favorite authors?
For technical titles, my favorites are Steve McConnell ("Code Complete," "Rapid Development"), Joe Celko ("SQL For Smarties" and others), and Andrew Troelsen ("Pro C# and the .NET Platform"). These authors manage to convey heavy technical topics in very readable formats--which is something I aspire to doing in my own writing.

Unfortunately, I haven't been reading much fiction recently. The last really great novel I read was "Cloud Atlas" by David Mitchell. I'm also a longtime fan of science fiction writers such as William Gibson ("Neuromancer") and Neal Stephenson ("Snow Crash," "Cryptonomicon").


I noticed you have a chapter that deals with application concurrency. Do you feel that developers don't have a multi-user/multi-app mindset when first developing the data model/procedures?
With the recent trend towards multiprocessor and multicore machines, threading and synchronization have been hot topics. I've read lots of interviews and articles about how many developers don't understand multithreading well enough. Yet as database developers, I feel like we've been there for years--the database is almost always a shared resource, and managing synchronization is no less important there than in the application tier. A lot of the time we tend to think about concurrency merely in the context of isolation levels and the like, but there is a lot more to it than that--an application will often push much of its synchronization logic all the way into the data tier (whether on purpose or not), and how to properly handle things is really more of a business question than a technical one. When writing that chapter, I tried to frame the issues in terms of business requirements, and then show how to use SQL Server to satisfy them. I think that developers are sometimes prone to getting too excited about technology and trying to find a business problem to which they can apply some new technique--the whole "hammer looking for a nail" mentality--rather than the correct route of determining the business problem and then using technology to solve it.


What SQL Server books are on your bookshelf?
I've now gotten rid of most of my SQL Server 2000 books, except for Ken Henderson's "Guru's Guide" series and Kalen Delaney's "Inside SQL Server 2000". My bookshelves (note the plural!) are filled with SQL Server 2005 books--for the first year or so after its release I managed to get almost every new book that was released. I eventually slowed down a bit, but it's hard--there is a lot of great material out there! I'm a really big fan of the "Rational Guides" -- short, concise, yet in-depth guides to just one topic, generally very well-written and easy to digest.


Why did you decide to include a chapter on spatial data?
Through nothing more than luck of the draw I ended up on a couple of projects involving heavy spatial queries, and found that it was an area I really enjoyed. I love to solve difficult problems, and since comparatively very few people are doing spatial in SQL Server, I found that there was a lot of room to innovate--so I ended up having a great time with these projects and totally got hooked. I feel that spatial data is an area we're going to see a lot more of in the future. Spatially-enabled devices are becoming more and more prevalent, and of course all of those devices are going to need data. Unfortunately, due to time constraints I was not able to write that chapter, so my friend Hugo Kornelis tackled it. Hugo did an absolutely fantastic job of covering the topic--I think it's a great chapter, and will help readers get right up to speed.


Are you still making your own beer?
I live in a fairly small apartment at the moment and am somewhat cramped for space as well as time, so I haven't brewed a batch of beer for the last couple of years. However, I have been experimenting with ciders quite a bit, and have made a variety of batches using all sorts of adjuncts including honey, brown sugar, and malt extract. Cider is very forgiving and needs to sit around for a long time before it's drinkable, so it's easy to make in a small space with limited time. I've also made a few batches of ginger beer, using a recipe that we actually designed in an online SQL Server forum!
(http://www.simple-talk.com/community/forums/thread/1232.aspx)



Table Of Contents
CHAPTER 1 Software Development Methodologies for the Database World
CHAPTER 2 Testing Database Routines
CHAPTER 3 Errors and Exceptions
CHAPTER 4 Privilege and Authorization
CHAPTER 5 Encryption
CHAPTER 6 SQLCLR: Architecture and Design Considerations
CHAPTER 7 Dynamic T-SQL
CHAPTER 8 Designing Systems for Application Concurrency
CHAPTER 9 Working with Spatial Data
CHAPTER 10 Working with Temporal Data
CHAPTER 11 Trees, Hierarchies, and Graphs
INDEX

Sample Chapter: Ch. 01 - Software Development Methodologies for the Database World
Amazon Link: Expert SQL Server 2005 Development

Sunday, April 15, 2007

Interview With Itzik Ben-Gan Author Of Inside Microsoft SQL Server 2005: T-SQL Querying


If you are a SQL developer then you know who Itzik Ben-Gan is. If for some strange reason you don’t know then listen up. Itzik is a SQL Server MVP, he writes a monthly column for SQL Server Magazine and he is also a principal mentor and founder of Solid Quality Learning. You can also find him in the SQL Server programming public newsgroups. If you browse the MSDN site you will find several articles written by Itzik.
I purchased Inside Microsoft SQL Server 2005: T-SQL Querying and must say it is my favorite SQL Server 2005 book so far. Chapters 3, 6 and 9 alone are worth the price of the book. I have posted a link to a sample chapter (6) at the end of this post if you are interested in learning more about the style of writing and just to see how awesome this book really is.
Let’s get started with the interview

Q. What is the target audience for this book?
A. T-SQL developers with intermediate to advanced background.
By the way, the book is actually split into two volumes: Inside Microsoft SQL Server 2005: T-SQL Querying and Inside Microsoft SQL Server 2005: T-SQL Programming. Originally I intended to write one book, but I ended up with over 1200 pages; so I had to split it into two. But you should consider them as volumes 1 and 2 of one book. T-SQL Querying should be read before T-SQL Programming.

Also, I’d like to mention that several people contributed to the books, and I had great pleasure working with them. Lubor Kollar wrote Chapter 2 - Physical Query Processing of the T-SQL Querying volume and also provided great help and advice. Dejan Sarka wrote the CLR and XML content and explained what they have to do with the relational model. Roger Wolter wrote Chapter 11 - Service Broker of the T-SQL Programming volume. Steve Kass was the technical editor of the books, but contributed way more than what you would normally expect from technical editors. You can read about the contributors here:

http://www.insidetsql.com/


Q. What new technologies in SQL Server 2005 do you think are the most beneficial for developers?

A. I don’t think that it would be right of me to name two technologies in general, rather, depending on need. For example, for applications that involve XML manipulation, developers will find XML enhancements beneficial (and there are major ones in the product). For applications that need to implement at the database computationally intensive calculations, complex calculations, iterative logic, string manipulations, parsing, and so on, developers will find CLR integration beneficial. For applications that need queuing support, developers will find the new queuing infrastructure and the service broker platform beneficial. And then there’s row versioning embedded in the engine supporting the new snapshot isolations and other aspects of SQL Server; programmers will find this technology beneficial for certain types of systems that suffer from concurrency problems.

As for things that are closer to my heart; in regards to T-SQL, the two new features that I find most beneficial for developers are the OVER clause (e.g., with ranking calculations) and Common Table Expressions (CTE).

The OVER clause is really profound; I feel that it helps bridge the gap (or maybe I should say abyss) between cursors and sets. The OVER clause wasn’t implemented fully in SQL Server 2005, but even the existing implementation (especially with ranking calculations) allows simplifying and optimizing many pre-2005 solutions.
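
For readers who have not used the OVER clause yet, here is a minimal sketch of the ranking calculations mentioned above, plus a partitioned aggregate, all of which SQL Server 2005 supports. The table dbo.Sales, its columns, and the sample rows are hypothetical and serve only as an illustration.

```sql
-- Hypothetical table and sample rows, for illustration only
CREATE TABLE dbo.Sales
(
  empid VARCHAR(10) NOT NULL PRIMARY KEY,
  mgrid VARCHAR(10) NOT NULL,
  qty   INT         NOT NULL
);

INSERT INTO dbo.Sales(empid, mgrid, qty)
  SELECT 'A', 'Z', 300 UNION ALL
  SELECT 'B', 'X', 100 UNION ALL
  SELECT 'C', 'X', 200 UNION ALL
  SELECT 'D', 'Z', 200;
GO

SELECT empid, mgrid, qty,
  -- Unique sequential number; empid is added as a tiebreaker
  ROW_NUMBER() OVER(ORDER BY qty, empid) AS rownum,
  -- Ties share a rank; RANK leaves gaps after ties, DENSE_RANK does not
  RANK()       OVER(ORDER BY qty)        AS rnk,
  DENSE_RANK() OVER(ORDER BY qty)        AS drnk,
  -- Aggregate over a partition, returned alongside the detail row
  SUM(qty)     OVER(PARTITION BY mgrid)  AS mgr_total
FROM dbo.Sales
ORDER BY qty, empid;
```

Before SQL Server 2005, calculations like these typically required correlated subqueries, temporary tables with IDENTITY columns, or cursors.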

As for CTEs, they have both nonrecursive and recursive forms. The nonrecursive form has several important advantages over derived tables, allowing less code and better code maintenance. The recursive form allows for the first time in SQL Server’s history to write recursive queries, and very elegant ones. These are especially useful in manipulation of graphs (e.g., employee organizational chart, bill of materials, etc.).
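
A small sketch of the recursive form, using the organizational chart scenario mentioned above. The table dbo.Employees, its columns, and the sample names are hypothetical, made up for this illustration.

```sql
-- Hypothetical table and sample rows, for illustration only
CREATE TABLE dbo.Employees
(
  empid   INT         NOT NULL PRIMARY KEY,
  mgrid   INT         NULL,      -- NULL marks the top of the hierarchy
  empname VARCHAR(25) NOT NULL
);

INSERT INTO dbo.Employees(empid, mgrid, empname)
  SELECT 1, NULL, 'Ann'  UNION ALL
  SELECT 2, 1,    'Bob'  UNION ALL
  SELECT 3, 1,    'Cara' UNION ALL
  SELECT 4, 2,    'Dan'  UNION ALL
  SELECT 5, 4,    'Eve';
GO

DECLARE @root AS INT;
SET @root = 1;

WITH Subs AS
(
  -- Anchor member: the employee at the root of the requested subtree
  SELECT empid, mgrid, empname, 0 AS lvl
  FROM dbo.Employees
  WHERE empid = @root

  UNION ALL

  -- Recursive member: direct reports of the rows returned by the previous round
  SELECT C.empid, C.mgrid, C.empname, P.lvl + 1
  FROM Subs AS P
    JOIN dbo.Employees AS C
      ON C.mgrid = P.empid
)
SELECT empid, mgrid, empname, lvl
FROM Subs
ORDER BY lvl, empid;
```

The recursion stops on its own once a round returns no more subordinates. Note that the statement preceding WITH must be terminated with a semicolon.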

Q. What will a person who reads this book gain in terms of understanding how the query engine works?

A. The approach I took in the books was first to understand logical query processing, then physical query processing (the engine), and then tackle practical problems where you need to combine your knowledge in both. It’s a problem-based approach; I didn’t attempt to explain the engine for the sake of understanding how the engine works; rather to look at common practical problems developers face, write several solutions to each problem, analyze how the engine processes the solutions, optimize their performance, and simplify them. Ideal solutions are those that are both simple and perform well. A complex solution may have the side-effect of making the author proud of the ability to write such a solution, but is more prone to bugs and obviously involves maintenance overhead. The real beauty is in simple solutions that also perform well.

Regarding querying logic, the books put a lot of emphasis on correct understanding of SQL and thinking in terms of sets; this is a very tough phase that developers have to go through since for most of us mere mortals it’s not a natural way of thinking.

Q. What are the most important things a person can do to master Transact-SQL?
A. The quickest way is to issue the following code:

USE master ;-)

On a more serious note, I believe that there’s no quick way—there’s a way; it’s an ongoing thing. I can give recommendations based on what I try to follow. But some aspects of the way are personal; different things may work differently for different people. Anyway, my two cents worth…

Most importantly, you need to master the basics; or maybe I should use the term fundamentals. Ego and vanity can be big obstacles here. People try to jump directly to what they consider “advanced” and don’t get the fundamentals right. Advanced techniques are a matter of combining fundamental techniques. So be diligent and invest a lot of effort in understanding the fundamentals well. In terms of querying logic this means logical query processing, sets, NULLs, three-valued-logic, all-at-once operations, and so on. In terms of physical processing, this means getting to know internal structures, access methods, analyzing execution plans, and so on.

When facing new problems, try to solve them yourself before looking at others’ solutions. When you try something yourself you understand it best. This will also help you become more creative and develop your own techniques. Also, try different approaches, not just one, and compare the solutions.

Never consider yourself as being finished. Strive for perfection but never consider your solution perfect. Constantly work on problem solving; look for new problems and keep going back to old problems and try to improve the solutions. This way you polish and perfect your techniques and enhance your vocabulary.

I also feel that you get a much better understanding of a subject when you need to explain it to others.

I truly believe in all of the above, but ideally, you don’t need me to tell you these things. In fact I feel awkward and uncomfortable giving such recommendations; so if at this point you think that I’m a fart I won’t hold it against you. I feel more comfortable writing queries and talking about them. ;-)

As I said earlier, this simply feels natural to me. Though some things are common in the way to master any profession, some things are and should be personal; namely, you have to find your own way and not necessarily follow what someone else recommends to the letter. :-)


Q. Itzik, the first time I came across your name was in the Microsoft public SQL Server programming group. Can you explain to us why you participate in newsgroups and forums?
A. In the past I used to be much more active in the newsgroups; today less, but whenever I participate I feel revived—especially when visiting the public programming newsgroup. If I had to name one thing that had the greatest impact on my knowledge in T-SQL it would doubtless be my participation in the newsgroups. It’s a very fertile soil to grow. So many practical problems are posted there daily, and by trying to solve other people’s problems, you end up learning more and more. Also, many practical problems are repeated there frequently, and as time goes by, you keep perfecting your techniques, and also sometimes get to see very unique and interesting solutions posted by other people. People pay so much money to study in schools, colleges and universities; here’s a great place to learn for free, and while you’re at it, you end up helping other people. :-) It’s the perfect Dojo for T-SQL practitioners.


Q. How did the massive changes between SQL Server 2000 and SQL Server 2005 affect the research for your book?
A. I think I went through a different experience than some other authors. I’m so immersed in the T-SQL language that I feel that it’s my mother tongue. Seriously, in many cases I feel that I can express myself better with T-SQL than with English or even Hebrew which is supposed to be my native tongue. I was deeply involved in writing T-SQL code using new language elements from the very early builds of Yukon. Also, our company, Solid Quality Learning, works closely with the SQL Server development team and customer advisory team (CAT). We worked on whitepapers for beta builds and got access to specs in very early stages of the product. I also developed and delivered courses on Advanced T-SQL before writing the books. So writing the books did not involve research, rather the material was essentially part of me, and all I had to do was to convert my thoughts to words.

I have similar experience with writing in general. I have a monthly T-SQL column in SQL Server Magazine, and often people ask me where do I get ideas, and what’s the process involved. I work on so many ideas constantly not for the sake of writing, rather, it’s a natural part of me—the way I was talking about earlier. So when it’s time to write a column, it’s basically pulling something from one of the drawers in my mind and putting it into words.


Q. Name three things that you wish would be in Katmai (the next version of SQL Server)?
A. These are easy to name, but I doubt that my wishes will come true in Katmai:

1. Support the ORDER BY sub-clause of the OVER clause for aggregate functions; BTW, this has nothing to do with ordered aggregates, rather it’s a logical way to define to a window function which window to work with. I’ll provide a link to a paper which I wrote recently for people looking for more details.

2. Support the ROWS and RANGE windowing sub-clauses of the OVER clause for aggregate functions.

3. Add support for row value constructors/vector expressions.
In general I wish to see more focus on T-SQL enhancements, and specifically, a more complete implementation of the profound OVER clause. It bears great benefits within it that I believe many people are not aware of yet. You can find more details in the following paper:

http://www.insidetsql.com/OVER_Clause_and_Ordered_Calculations.doc
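
To make the first two wishes concrete, here is a sketch of what such queries look like. This is not valid in SQL Server 2005; it assumes a later version that implements ORDER BY and ROWS for window aggregates (SQL Server 2012 and later do). The table dbo.EmpOrders and its columns are hypothetical. The third wish, row value constructors, is not shown.

```sql
-- Hypothetical table, for illustration only
CREATE TABLE dbo.EmpOrders
(
  empid      INT      NOT NULL,
  ordermonth DATETIME NOT NULL,
  qty        INT      NOT NULL,
  PRIMARY KEY(empid, ordermonth)
);
GO

-- Requires a version that supports ORDER BY and ROWS in the OVER clause
-- of aggregate functions; SQL Server 2005 rejects this query
SELECT empid, ordermonth, qty,
  -- Wish 1: ORDER BY sub-clause for an aggregate (running total per employee)
  SUM(qty) OVER(PARTITION BY empid
                ORDER BY ordermonth) AS run_qty,
  -- Wish 2: ROWS sub-clause (average over the current and two preceding rows)
  AVG(1.0 * qty) OVER(PARTITION BY empid
                      ORDER BY ordermonth
                      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS avg_last3
FROM dbo.EmpOrders
ORDER BY empid, ordermonth;
```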


Q. Do you think that every developer should have a numbers table in their database?
A. That’s a good question that makes me smile. :-) Yes! If not a table, a table function. I’ve used an auxiliary table of numbers in so many solutions; I find it a great helper tool to solve many querying problems. Just as an example, here’s a query using an auxiliary table of numbers (called Nums with a column called n) that splits strings containing comma separated lists of values (stored in a table called Arrays, with columns arrid and string):

SELECT
  arrid,
  n - LEN(REPLACE(LEFT(string, n), ',', '')) + 1 AS pos,
  SUBSTRING(string, n,
    CHARINDEX(',', string + ',', n) - n) AS val
FROM Arrays JOIN Nums
  ON n <= LEN(string)
  AND SUBSTRING(',' + string, n, 1) = ','
ORDER BY arrid, pos;
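
To try the query above, the Nums and Arrays tables need to exist first. The following is one possible setup, offered as a sketch: the column names follow the description in the text, but the data types, the upper bound of 1,000, and the sample strings are assumptions made here for illustration.

```sql
-- Auxiliary table of numbers holding 1 through @max
CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY);

DECLARE @max AS INT, @rc AS INT;
SET @max = 1000;   -- assumed upper bound; must cover the longest string
SET @rc = 1;

INSERT INTO dbo.Nums(n) VALUES(1);

-- Double the number of rows on each pass
WHILE @rc * 2 <= @max
BEGIN
  INSERT INTO dbo.Nums(n) SELECT n + @rc FROM dbo.Nums;
  SET @rc = @rc * 2;
END

-- Add the remaining rows up to @max
INSERT INTO dbo.Nums(n)
  SELECT n + @rc FROM dbo.Nums WHERE n + @rc <= @max;
GO

-- Sample comma-separated lists (made-up values)
CREATE TABLE dbo.Arrays
(
  arrid  VARCHAR(10)   NOT NULL PRIMARY KEY,
  string VARCHAR(8000) NOT NULL
);

INSERT INTO dbo.Arrays(arrid, string)
  SELECT 'A', '20,22,25,25,14' UNION ALL
  SELECT 'B', '30,33,28' UNION ALL
  SELECT 'C', '12,10,8,12,12,13,12,14,10,9' UNION ALL
  SELECT 'D', '-4,-6,-4,-2';
```

The split query works by matching each n to a position in the string that immediately follows a comma (or to position 1), so every element is returned once, together with its ordinal position.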

And if you need to operate on a single array (e.g., one that you get as an input to a stored procedure), you can encapsulate the logic in an inline table function:
CREATE FUNCTION dbo.fn_split(@arr AS VARCHAR(MAX)) RETURNS TABLE
AS
RETURN
  SELECT
    n - LEN(REPLACE(LEFT(@arr, n), ',', '')) + 1 AS pos,
    SUBSTRING(@arr, n,
      CHARINDEX(',', @arr + ',', n) - n) AS element
  FROM dbo.Nums
  WHERE n <= LEN(@arr) AND SUBSTRING(',' + @arr, n, 1) = ',';  
GO    

SELECT * FROM dbo.fn_split('10248,10249,10250');   
Output:
pos  element
---- --------
1    10248
2    10249
3    10250

This implementation has major advantages (both in terms of security and performance) over the commonly practiced solution utilizing dynamic SQL.

Erland Sommarskog, a fellow MVP, wrote a great paper on the subject. You can find details here:
http://www.sommarskog.se/arrays-in-sql.html

By the way, you can vote for adding the table of numbers as part of the product here:

https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=258733


As a tip, here’s a very fast inline table function that will return a table of numbers for you:

CREATE FUNCTION dbo.fn_nums(@n AS BIGINT) RETURNS TABLE
AS
RETURN
  WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A, L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A, L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A, L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A, L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A, L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY c) AS n FROM L5)
  SELECT n FROM Nums 
  WHERE n <= @n;  
GO     

-- Test function 
SELECT * FROM dbo.fn_nums(10) AS F;  
GO   

Output:

n
---
1
2
3
4
5
6
7
8
9
10

It’s even faster than querying a real table of numbers. ;-)
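
A brief note on how the function above gets its rows, followed by a usage sketch. Each CTE level cross-joins the previous level with itself, so the row count is squared at every step. The date range in the example is arbitrary and chosen only for illustration.

```sql
-- Row counts per level in dbo.fn_nums:
-- L0 = 2, L1 = 4, L2 = 16, L3 = 256, L4 = 65,536, L5 = 4,294,967,296

-- Usage sketch: one row per day in January 2007.
-- The function returns BIGINT and DATEADD expects an INT,
-- hence the explicit CAST.
SELECT DATEADD(day, CAST(n AS INT) - 1, '20070101') AS dt
FROM dbo.fn_nums(31) AS F
ORDER BY dt;
```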


Q. Why is it important to think in sets instead of procedural code?
A. SQL is based on the relational model, which in turn is based on set theory and predicate logic. The relational model has been a proven model for handling data for several decades. It’s important to adopt set-based thinking in order to write simple and efficient solutions; otherwise, you defeat the whole premise on which the model is based. SQL queries (set-based) have several advantages over procedural/iterative code (e.g., using cursors) as far as data manipulation is concerned. Set-based solutions focus on the “what” part of the solution; namely their logic. Iterative solutions spend a lot of code on the “how” part, clouding the logic behind all the code; lengthy code naturally has maintenance overhead. Set-based solutions allow the optimizer to choose an execution plan among several possible plans based on cost estimations, while iterative solutions typically force a very rigid plan. In short, set-based solutions allow you to focus on logic, and speak with the database in its native language. Also, cursors have a lot of overhead associated with the record-by-record manipulation.

I’m not saying that ALL problems are best solved with set-based solutions, but the majority of data manipulation related problems are. Tasks that involve less data manipulation, and more computations, string manipulation, parsing, inherently iterative logic, these can benefit from procedural solutions. But T-SQL (and SQL in general) was not designed to handle such tasks efficiently. That’s an area where .NET is much stronger, richer and more efficient. These are the areas where I see the benefit in the CLR integration in SQL Server 2005.


Q. Which chapter was the hardest to write and can you explain why?
A. Chapter 3 – Query Tuning in the T-SQL Querying volume. It was the hardest for several reasons.

In this chapter I cover internal structures and access methods. It was very important for me to depict internal structures and diagrams of access methods graphically as accurately and as detailed as possible. I created many figures and worked on each for many hours at the pixel level. Several figures took a day of work each. I actually took snapshots of real internal structures like pages of B-Trees with DBCC PAGE and DBCC IND commands, and incorporated real bits of data in the diagrams to make them as reliable as possible.

Also, the chapter appears early in the first volume. I wanted it early in the books to provide important foundations and background that is required for the rest of the books. But I had to include queries used for performance analysis that use advanced techniques that I explain later in the book.

There were also other obstacles.

This was the chapter I wrote last, and it took me several very draining weeks to complete. It ended up with over 100 pages. There’s a lot of query tuning coverage throughout the books, but it’s not the tuning part that was hard, rather the aforementioned reasons.


Q. Why do you write technical books?
A. I’m a teacher in heart and soul. I think that writing is an integral part of teaching. It’s a flowing river; you acquire knowledge and you pass on your knowledge to others. It’s very hard and requires a lot of discipline, but ultimately it’s a process from which you also gain a lot.
Also, there’s a more selfish reason. As I said, I’m a teacher and I want students in my classes. Books give you exposure and I hoped the books would get more students in my courses.


Q. Who are your favorite authors?
A. JRR Tolkien, William Goldman, Douglas Adams, Edgar Allan Poe, Niccolo Machiavelli, Yamamoto Tsunetomo, Miyamoto Musashi, Sun Tzu, J.K. Rowling

By the way, some of my favorite authors did not write any books, yet. :)

Here are a few examples:

David Campbell, who is the General Manager of Strategy, Infrastructure and Architecture of Microsoft SQL Server used to participate in the past in some private SQL newsgroups back in the days when he owned SQL Server’s Storage Engine. His depth of knowledge, teaching and explanatory skills, passion, humility, great English, all were evident in his posts. I asked him to write the foreword to the T-SQL Querying book; you have to read it to understand what I’m talking about. It’s pure joy to read his writings, and I sorely miss his participation in the newsgroups. I wish he had written books back in the days when he was very deeply involved in the technicalities and small details of SQL Server’s engine.

Roy Harvey, a fellow SQL Server MVP. He’s active in the SQL Server community especially in the SQL Server Programming newsgroup. When you read his posts you feel the experience, wisdom and humility flowing from them. I would have very much loved to see work published by him, but you know, some very wise and knowledgeable people shy away from publicity and exposure—unfortunately for us.

Steve Kass is another good example. Steve was the technical editor of my books and there are no words that I can use to describe his capabilities. He has a very unique combination of knowledge, logic, mathematics, SQL, genius, perfect English, and a way of explaining things in a simple manner. Oh, how much I’d love to see his work published in books some day!

I also like reading posts by David Portas; another fellow MVP. I’d love to see a book on modeling published by him some day.


Q. What SQL Server books are on your bookshelf?
A. With your permission, I’d like to broaden my reply to books covering topics that I believe are important for SQL practitioners. On my bookshelf you will find The Art of Computer Programming by Donald E. Knuth, books about logic puzzles, The Thirteen Books of The Elements by Euclid, Elements of Set Theory, Graph Theory, Regular Expressions Pocket Reference by Tony Stubblebine, Mastering Regular Expressions by Jeffrey E. F. Friedl, several SQL books by Joe Celko, SQL-99 Complete, Really by Gulutzan and Pelzer, Inside Microsoft SQL Server (various versions; early by Ron Soukup, later by Kalen Delaney), The Art of SQL by Stephane Faroult with Peter Robson, and others…

You can find a more extensive list here:

http://www.sql.co.il/books/insidetsql2005/resources.htm#Books

I have to say though that I probably spend more time writing stuff than reading…


Q. I know you like math and logic puzzles. Do you think it is beneficial for a SQL developer to be somewhat good at math?

A. Yes, but I would say it is beneficial to be good at logic and not necessarily at math in general. That’s why I included a whole appendix with pure logic puzzles in the T-SQL Querying volume. Some people find it odd, while people that truly follow the SQL way find it natural (at least that’s what I believe). Almost every SQL querying problem is at its heart a logic puzzle. It’s very common to see that people who are deeply immersed in SQL are also very fond of logic puzzles. It’s not by chance. My belief is that if you practice logic puzzles you can improve your SQL. Also, logic puzzles are fun and can be practiced by the whole family.

And there’s another reason to dwell on logic—one coined by Sir Arthur Conan Doyle:

"Crime is common. Logic is rare. Therefore it is upon the logic rather than upon the crime that you should dwell."


Amazon Link: Inside Microsoft SQL Server 2005: T-SQL Querying
Sample Chapter: http://www.sql.co.il/books/insidetsql2005/sample_chapters.htm#T-SQL_Querying_Sample_Chapters

Tuesday, October 10, 2006

An Interview With Ken Henderson About The Forthcoming Book SQL Server 2005 Practical Troubleshooting: The Database Engine

I am a big fan of Ken Henderson’s books; I believe that every SQL Server developer should have a copy of them. When I noticed on Amazon.com that the book SQL Server 2005 Practical Troubleshooting: The Database Engine, which lists Ken Henderson as its editor, would be coming out soon, I got very excited. I decided to send Ken an email to see if he would be willing to answer some questions I had about the book. To my surprise, Ken was more than willing to accommodate my request.
The question-and-answer session with Ken that follows was conducted via email.




Denis: What is the audience for this book, is it the enterprise user or can a small department benefit from the tips in this book?

Ken: Both types of users would benefit. Anyone who’s ever had a problem with SQL Server has probably noticed how few resources there are out there for troubleshooting SQL Server issues. There are plenty of resources that talk about how it works. There are many that discuss how to write code for it. But there are scant few that talk about what to do when something goes wrong. This book is intended for that audience. It is also intended for those who want to better understand how the product works and to be prepared in the event that something does go wrong with it. SQL Server doesn’t break often, but, when it does, this book will help you deal with it.

Denis: For a customer who has a performance problem that is not hardware related, what would you say are the most important chapters in this book (in order of importance)?

Ken: The Query Processor Issues chapter and the Procedure Cache Issues chapter are the two best for this type of problem.

Denis: Seven developers from the SQL Server development team and three support professionals from Microsoft Customer Support Services wrote this book. What took so long to write a book like this, and why wasn’t there a SQL Server 2000 version? Is it because SQL Server has truly grown to be a major player in the enterprise market and there is a definitive need for a book like this now?

Ken: The book took so long because all of the authors are first-time authors, and they are very busy people. There was no SQL Server 2000 version because I was too busy with my own projects to begin this project back then. SQL Server is indeed a major player in the enterprise, but I believe it has been since SQL Server 7.0. That particular aspect had nothing to do with the timing of this book.

Denis: I noticed you are listed as editor of this book. Have you written any chapters in this book?

Ken: No, I did not write any of the chapters of the book. I also tried to preserve each author’s writing style. Each chapter includes its own byline, and I have edited them as little as possible. Because some of the authors were more capable than others, I necessarily had to be involved to varying degrees with each chapter. Some chapters needed lots of editing; some very little. I think each individual perspective represented in the book is a valuable one, but I also think they speak in unison on the important points of practical troubleshooting with SQL Server.

Denis: What new technologies in SQL Server 2005 do you think are the most beneficial for performance?

Ken: There are really too many to list. A few that come to mind at the moment: instant file growth, improved wildcard support, more sophisticated cache management, improved scaling on big hardware, the XML data type and richer XML support in general, CLR integration, etc.

Denis: As I understand it, this book has a lot of internals information from the people who either wrote the product or have supported it, and that information is currently not available anywhere else. Is that right?

Ken: Yes, the majority of the authors were actually developers on the product. A few were support engineers who supported the product. All have had full access to the SQL Server source code for many years.

Denis: What will a person who reads this book gain in terms of understanding how to performance tune a server?

Ken: They will better understand both how the server works and also how to recognize and troubleshoot common performance problems.

Denis: Is the book geared towards a beginner/intermediate level user or do you have to be an advanced user to really utilize the information in this book?

Ken: There is something in this book for everyone. I’d like to think that beginners, intermediates, and advanced users alike would benefit from reading it.

Denis: What are the most important things a person can do to master SQL Server?

Ken: Naturally, the best thing a person could do would be to do what the authors of this book did: study the SQL Server source code. Studying the SQL Server source gives you insight into how the product works that is impossible to gain through any other means. But, given that that excludes pretty much everyone outside of Microsoft, here are some general thoughts:

#1, understand how Windows works at a very low level and how SQL Server utilizes the many facilities it offers

#2, understand how the product was designed and how it was intended to be used

#3, explore it not only as a user, but as a developer. Fire up a debugger and see how it works under the hood

#4, build real apps with it, using its various components as they were intended to be used

Denis: What are the most important things a person can do to master Transact-SQL?

Ken: My initial thought is that, again, studying the SQL Server source code is the shortest path to the deepest understanding of the language. That said, here are some general thoughts in no particular order:

#1, understand how SQL Server works. Understand the intricacies of performance tuning on SQL Server. Know how data is stored. Understand memory management and scheduling at a very low level. Understand logging and tempdb semantics. Remember that SQL Server is just an application. It’s not magical and can be misused and abused just like any other app

#2, learn the syntax and semantics of the language inside-out. Get a feel for its strengths and weaknesses, and code to those strengths. Always lean toward writing set-oriented code when you can

#3, study solutions to hard problems that are available in various forms, and apply the techniques you learn to solve your own problems

#4, learn other SQL dialects so that you can become familiar with their unique attributes and understand how T-SQL compares with them. Gain an understanding of where T-SQL fits in the general taxonomy of SQL dialects

#5, learn other languages besides SQL. If your favorite programming language is T-SQL, you probably don’t know many languages. Learn C#, VB, Perl, Ruby, C++, or any others you can work into your research so that you can better understand software engineering as a discipline and so that you can more clearly see T-SQL’s strengths and weaknesses when compared with those other languages. Try to see where you might apply techniques from those other languages to solve problems you encounter in T-SQL. Familiarize yourself with what a design pattern is, what an idiom is, what refactoring is, and apply these concepts in T-SQL just as you would in any other “real” language

#6, understand the various SQL Server components and other technologies so that you can accurately ascertain when it’s appropriate to use T-SQL. It’s not the solution for every problem. As the old saying goes, “When all you have is a hammer, everything starts to look like a nail.” By broadening your knowledge of the tools and problem solutions available to you, you’ll be more likely to choose the best one when deciding how to design a piece of software or solve a particular problem. T-SQL may turn out not to be the best way to go in a given situation


And I will end with nine questions for Ken that are not related to this book.

Denis: What SQL Server books are on your bookshelf?

Ken: I have Celko’s books, Darren Green’s book, and a few others. Unfortunately, I don’t have time to read as much as I’d like. I spend most of my time either writing code for the product or studying code written by others on the SQL Server development team. The majority of my research into how SQL Server works happens via studying its source code directly.

Denis: Why do you write technical books?

Ken: I write technical books because I enjoy passing on what I’ve learned as a developer. That’s different from enjoying teaching people. I do enjoy teaching people, but that’s not why I write books. Some of the things I’ve learned about SQL Server took me years to master. I enjoy passing that on to people so that they don’t have to travel the same arduous roads that I did. I enjoy helping people. That’s different from teaching for the sake of teaching. I could never train people for a living. I am a programmer by trade, and everything else is an offshoot of that.

If I didn’t think I had something unique to bring to the discussion, I don’t think I’d write books. I don’t ever want to do what has already been done. I want to bring a fresh perspective to technical books, and I want to explore things in ways most other authors wouldn’t. If my work was exactly like everyone else’s, there’d be no reason for it to exist, and I wouldn’t bother. Given that I’ve never written full-time but have always held down a regular day job while writing my books, the work itself is simply too hard to do just to be a clone of someone else. When people pick up one of my books, I hope they know right away that it’s one of mine, that it speaks with a distinctive voice, and I hope they think they might learn something from it simply because they trust me as an author.

Denis: Why did you join Microsoft?

Ken: I joined Microsoft to get inside SQL Server. I felt that the only way to go beyond the books and whitepapers currently out there on SQL Server was to see the source code for myself, and the only way that was going to happen is if I joined the company. I wanted to approach the exploration of SQL Server from a career developer’s standpoint, something that I had not seen done before. Most SQL Server books were written by professional trainers and former DBAs. As a career developer, I thought I could bring a fresh perspective to the coverage of SQL Server, and I felt the only way to really do that was to “go live with the natives” for a few years.

Denis: Who are your favorite authors?

Ken: Mark Twain, Kurt Vonnegut, Bart D. Ehrman, Robert Price, Dean Koontz, Stephen King, Joe Celko, Sam Harris, Richard Carrier, Don Box, David Solomon, Charles Petzold, Kent Beck, Martin Fowler, Bruce Eckel, and many others.

Denis: Who do you consider your rival authors?

Ken: I don’t really think of anyone else out there as a rival. When I write a book, I mainly measure my work against my concept of the perfect book. I write for me. There’s a great book out there titled On Writing Well where the author, William Zinsser, repeats the old truism that quality is its own reward. It really is. I love the fact that people enjoy my books, but, really, the day I finish the final draft of a book and can say that I’m really done with it (at least for the moment :-) ), I’ve accomplished my goal. If it never sold a copy, I’d still feel fulfilled. I do care how it sells against other books, but I don’t really focus on it and don’t get caught up in any type of rivalries with other authors or other books.

Because I always want to write a better book than I wrote last time, I necessarily compete with my previous work and I compete against what I think the ideal technical book is. I think there’s enough room out there for lots of technical authors (it’s not as though people only buy one technical book and no others), and I have special empathy for my comrades out there who have to slog along in the middle of the night to crank their books out.

Denis: Where did the “Guru’s Guide” concept come from?

Ken: Wayne Snyder, one of the MVPs reviewing the manuscript for the first Guru’s Guide (which was at that time unnamed), wrote in the margin, “Hey, Ken, this is really a guru’s guide to solutions to hard T-SQL problems!” at which point the marketing folk at Addison-Wesley saw this and seized upon it. We had kicked around several titles, but hadn’t settled on any of them. As soon as they saw this, they pushed me hard to use it, and I reluctantly agreed. I didn’t like it initially because I thought the title of a technical book should focus on either the subject material or its intended audience, not its author. There was an understanding that we’d revisit the title when we did the second book (I was originally under contract to do three SQL Server books for Addison-Wesley), but then sales of the first book exploded, and there was no way we could change it at that point.

Denis: What do you think of all the accolades the Guru’s Guide books have received?

Ken: I am appreciative of them, but continue to be surprised by the longevity of the books and the reception they’ve garnered. I thought I was writing a niche book when I wrote that first Guru’s Guide book. I just wanted to get down everything I knew about T-SQL before I forgot it ;-). I will continue to write the kinds of books I like to read as long as people will buy them, so I hope that people continue to enjoy my work.

Denis: Will you be updating your Guru’s Guide books for SQL Server 2005? If so, when will they be out?

Ken: Yes. The second editions of the Guru’s Guide books should be out in 2007.

Denis: Describe your most unpleasant experience as an author.

Ken: I had a particularly unpleasant experience during the work on my architecture book when I had to send one of the technical reviewers packing. He was someone who’d provided useful feedback on my work in the past and someone I’d handpicked to review the book for technical issues. I usually appreciate negative feedback during the technical review process and generally consider it the most useful type of feedback, but this reviewer focused more on arguing with me about what should and shouldn’t be in the book than reviewing what was there for technical accuracy. He had a problem with the fact that I spent the first 300 pages of the book (the book ended up being over 1000 pages long) covering fundamental concepts (e.g., Windows internals) that I thought people needed to understand in order to understand the rest of the book.

I had seen people within Microsoft struggle to understand SQL Server internals because they did not have a good grasp of how Windows worked or how XML worked or how COM worked, or whatever, and, assuming readers would likely face the same types of challenges, I set out to remedy that in my book. I also wanted to go deeper than any SQL Server book ever had, and that necessitated being able to assume a certain amount of fundamental knowledge going in. I wrote him back after his first objection to the section and told him that, while I respected his opinion, I had my reasons for including it, and I explained those reasons as best I could.

He suggested I just refer people to authors like Richter and Solomon and those guys, and I told him I’d considered that, but that ultimately I felt that would be cutting corners and would be a huge inconvenience since readers would have to purchase several other books just to understand mine. No single other book had all the technical fundamentals I felt were essential, nor did any of them cover the material the way that I wanted it covered--in a manner that was designed especially for DBAs and database people. At the same time, most readers wouldn’t be able to skip the fundamentals coverage in some form or fashion because they wouldn’t be able to understand my SQL Server internals coverage without it. While it was certainly a huge amount of work for me to include this section (it was much like writing a whole separate book), I felt it was the right thing to do.

He persisted with his objections and continued to voice them not only to me but also to the editing team at Addison-Wesley. I told him on several occasions that I respected his opinion, but that, as the author, the call was mine to make and that I’d made it. This seemed to irritate him, and he continued to consume a certain amount of my time with correspondence related to the subject. At one point, I counted 7 separate threads from him on that one subject in my Inbox, and the folks at Addison-Wesley had begun to complain about him. The fundamentals section, and his negative remarks regarding it, came to dominate all the feedback we got from him. While other reviewers were either indifferent to the coverage of Windows internals in a SQL Server book (it was certainly a novel approach) or embraced it outright, he became increasingly more negative as we went along. We got useful feedback on the entirety of the manuscript from all the other reviewers, but he seemed unable to move on from the fundamentals issue. Eventually, I had my fill of dealing with him and cut him loose from the project. I’m a fairly patient person, but I just didn’t have time to deal with him anymore.

Technical reviewers sometimes get on crusades and attempt to usurp the role of the author to some extent. Until this happened, I’d never personally experienced it, but I’d heard of it. At the end of the day, the decision as to what is and isn’t in a book is the author’s to make, and the role of the technical reviewer is to identify technical issues with whatever it is that will be in the book. Decisions about content belong to the author, and, to a lesser extent, the publisher and the publisher’s editing team. I guess the lesson I learned here was to be more careful with whom I select for involvement with my projects. I always want honest feedback, and, fortunately, I know a lot of people who will happily point out every technical issue they find with my work without trying to become a de facto coauthor.


About the book:
Paperback: 456 pages
Publisher: Addison-Wesley; 1st edition
Language: English
ISBN: 0321447743

Contents
Preface
Chapter 1 Waiting and Blocking Issues
Chapter 2 Data Corruption and Recovery Issues
Chapter 3 Memory Issues
Chapter 4 Procedure Cache Issues
Chapter 5 Query Processor Issues
Chapter 6 Server Crashes and Other Critical Failures
Chapter 7 Service Broker Issues
Chapter 8 SQLOS and Scheduling Issues
Chapter 9 Tempdb Issues
Chapter 10 Clustering Issues
Index



Thanks to Ken for answering all these questions. If there is one reason to buy yourself a holiday gift this year, SQL Server 2005 Practical Troubleshooting: The Database Engine is it.


