Monday, February 04, 2008

Should You Get Your Database Under Version Control?

CodingHorror has a blogpost titled Get Your Database Under Version Control

It is suggested that you also keep the data in version control. I am reading through the comments and I am amazed. One person writes "Once you have used ActiveRecord Migrations it is very hard to go back!"

This might work for a small or medium size database. My database is well over a terabyte, how would you keep that data under version control? Data gets modified every day. Some of the data is encrypted. Some of the data gets inserted into audit tables. Data gets replicated to other servers/databases. There are jobs that pull in data from real time systems every second. I do have different versions of DBs on staging and QA servers but only one on the production server. Changes have to go through change management, you have to open a ticket to do a change. This is not something you would do on a daily basis.

What is your opinion? Do you have the schema in version control? What about the data itself?


rikkus said...

We use migrations. For Ruby code, we use ActiveRecord. For .NET, we use some in-house code (very simple stuff) to migrate SQL Server databases. These are kept in source control with the rest of our code.

Only static data is added in the migrations - the sort of stuff that can be added quickly with some insert statements. Our automated tests build a new database from scratch (using the migrations) - for each of the tests (this can be several hundred), so they need to work as quickly as possible.

JMC said...

i'm not sure you're understanding this correctly. you don't keep your DATA under source control (that's what backups are for). we're talking about the schema... the objects that make up your database (tables, views, functions, etc.).
whether your database is 10 megabytes or 10 terabytes, you should always keep your schema under version control, whether that be in the form of change scripts or snapshots (or both).