Database Migrations Link to heading

Overview Link to heading

Database migrations are part of the larger concept of version controlling your database schema1 the same way you version control your application. There’s a good summary in The Ultimate Guide to Database Version Control, CI/CD, and Deployment - Database Star, so I won’t reiterate it. The goal is to be able to create a new database with your schema as it existed at any specific point in the database’s history. That might seem a little abstract, but the practical application is that each developer can have their own copy of the database at version N+whatever, the latest changes are being tested in a QA environment at version N+1, and production is at version N. This is implemented by creating a new table in the database that tracks every migration that’s ever been run against it, with an incrementing version number.
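To make the tracking table idea concrete, here is a minimal sketch of what such a table might look like; the name and columns are invented for illustration, since real tools such as grate define their own layout.

```sql
-- Hypothetical version-tracking table; every migration tool uses its own names and columns.
CREATE TABLE schema_version (
    version     INTEGER       NOT NULL PRIMARY KEY,  -- incrementing migration number
    script_name VARCHAR(255)  NOT NULL,              -- migration file that was executed
    checksum    VARCHAR(64)   NOT NULL,              -- fingerprint of the script contents
    applied_at  TIMESTAMP     NOT NULL DEFAULT CURRENT_TIMESTAMP,
    applied_by  VARCHAR(128)  NOT NULL               -- developer or deployment agent
);
```

The tool consults this table before running anything: scripts it has already recorded are skipped, and everything newer is applied in order.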

Required Features Link to heading

Database migrations are common practice now, but the features available vary widely by tool/library. My personal favourite is grate, and thanks to .NET native AOT grate can now be installed nearly anywhere as a single self-contained binary. grate has just about every feature you can imagine in a standalone database migration tool, but there are a few I would consider absolutely necessary in any migration tool:

Supports migrations as raw SQL files Link to heading

A lot of ORMs implement migrations entirely as native code. There are enough problems with this that I’m tempted to consider it an anti-pattern.

  • Auto-generated SQL can be inefficient and/or dangerous: whether it’s nested LEFT OUTER JOINs or a DROP TABLE foo; CREATE TABLE foo (both of which I’ve seen in the wild) there’s no guarantee an ORM will produce safe, effective SQL. Being able to inspect the SQL that will be executed is essential.

  • Raw SQL files can be reviewed and analyzed: static analysis can be used to detect unsafe or inefficient SQL, and raw SQL files can be reviewed by an SQL expert as part of the code review process.

  • Raw SQL files are ORM-independent: if you choose to switch to a different ORM or SQL library, your old migrations will still work without any rewriting.

The downside of this approach is that if you’re using an ORM rather than handwriting SQL, it often requires an additional build step in which the ORM exports its migrations to raw SQL files.
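To make the point concrete, a raw SQL migration file can be as simple as the following; the table and column names are invented for the example.

```sql
-- 0042_add_customer_email.sql
-- Plain SQL: easy to review, easy to run through a static analyzer,
-- and completely independent of whatever ORM the application uses.
ALTER TABLE customer ADD COLUMN email VARCHAR(320) NULL;

CREATE INDEX ix_customer_email ON customer (email);
```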

Idempotent migrations Link to heading

It’s possible to ensure that every migration is idempotent through code reviews and static analysis, but it’s even better if the migration library/tool enforces this itself. Some do: they fingerprint every migration, store the fingerprint in the target database, and refuse to run the same migration twice.
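Where the tool doesn’t enforce this for you, the guard can be written into the script itself. A sketch in PostgreSQL-flavoured SQL, with invented object names:

```sql
-- Safe to run any number of times: each statement checks for its own prior effects.
CREATE TABLE IF NOT EXISTS audit_log (
    id         BIGSERIAL   PRIMARY KEY,
    event_type TEXT        NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS ix_audit_log_event_type ON audit_log (event_type);
```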

Baselining Link to heading

Given time, the individual migrations will pile up. It’s good practice to run each one as a transaction, which creates extra per-transaction overhead; it’s also common for a table added early in the project’s lifecycle to be renamed or dropped later, resulting in entirely superfluous SQL commands that just waste time when instantiating a new database. Baselining is the practice of combining a set of historical SQL migrations into a single migration and then recording in the database version table that this new migration is equivalent to the previous set so the migration tool doesn’t try to run it. Your migration tool needs to support this natively.
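The mechanics vary by tool, but conceptually a baseline boils down to two things: one consolidated script for brand-new databases, and a record in the version table telling the tool that existing databases already have it. A hypothetical sketch, reusing the invented schema_version table from earlier:

```sql
-- 0001-0050_baseline.sql contains the full schema as of version 50 and is what
-- a brand-new database runs instead of the original fifty migrations.

-- On databases that already exist, record the baseline as applied so the tool
-- never tries to run it (or the fifty scripts it replaced) again.
INSERT INTO schema_version (version, script_name, checksum, applied_by)
VALUES (50, '0001-0050_baseline.sql', '<checksum of the baseline script>', 'baseline');
```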

Standalone binary CLI Link to heading

However you apply migrations to your local copy of the application database, when you deploy to production the migrations will be executed by a deployment agent. These agents are usually low-spec, and limited in what software is permitted to run on them. Installing a complete development toolchain just to run migrations bloats the agent and is dangerous; the deployment agents have secure access to the production database and handing a malicious actor a compiler is asking for trouble. Often a deployment agent is an ephemeral VM or container; installing and verifying a single small binary is faster and safer than installing an entire compiler toolchain.

Supports different kinds of SQL objects (nice to have) Link to heading

This isn’t essential because it isn’t universal; a simple microservice may never need anything but simple table definitions. The more complex your database, the more you’ll need this.

Although you can represent any SQL object as a SQL script that a migration tool can execute, not all SQL objects are conceptually the same. You might have special permissions which need to be applied once at the beginning. Stored procedures only need to be updated when they change. Indexes only need to be created once.

A migration tool that treats different types of SQL objects differently can help enforce idempotency.
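Some tools do this by sorting scripts into folders by object type: ordinary migrations run exactly once, while stored procedure or view scripts are re-applied whenever their contents change. That works best when the script itself is safe to re-run, as in this sketch (T-SQL syntax, procedure and table names invented):

```sql
-- CREATE OR ALTER lets the same script be re-applied every time it changes,
-- without dropping the object first.
CREATE OR ALTER PROCEDURE dbo.get_active_customers
AS
BEGIN
    SELECT id, name, email
    FROM dbo.customer
    WHERE is_active = 1;
END;
```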

Anti-Patterns Link to heading

There are a number of defaults implemented by migration tools/libraries that are nonetheless anti-patterns.

Migrations performed by the app on startup Link to heading

This is very convenient when developing, but it’s a disaster during production deployments. It’s a given that the app code will be running on an auto-scaling host, and every time a new host is instantiated the migrations will attempt to run again. If a baseline of n instances is deployed to host the app, all n instances will try to run the migrations at the same time.

At best, your migration library will waste time on startup verifying that these migrations have already run. More likely, your database will block while each instance tries to run a series of individual transactions arriving in an undefined order. Worst case, if neither the migrations nor the library/tool enforces idempotency, you’ll get a hopelessly corrupted database.

There is an implication here that might not be obvious. No matter how fast the deployment pipeline, when migrations are performed prior to the new code deploy there will be a measurable period of time where the database schema is at a different version than the one the currently running code expects. Migrations must be written to take this into account by using the expand-contract pattern.
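As an example of expand-contract, renaming a column is spread across deployments instead of being done in one breaking step; the names below are invented.

```sql
-- Migration N (expand): add the new column alongside the old one and backfill it.
-- Code at the old version keeps using surname; the new version writes both and reads last_name.
ALTER TABLE customer ADD COLUMN last_name VARCHAR(100) NULL;
UPDATE customer SET last_name = surname WHERE last_name IS NULL;

-- Migration N+1 (contract), shipped only once no running instance touches surname any more:
-- ALTER TABLE customer DROP COLUMN surname;
```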

Migrations defined in native code only Link to heading

As discussed above, if migrations can only be defined in native code they are difficult to inspect and test, and they force dangerous habits when deploying.

Rollbacks Link to heading

Many migration libraries/tools support (or require!) a rollback for each migration (sometimes called a “down” operation). As discussed in the Database Star link above, this is an anti-pattern. Regardless of the tool’s support for rollbacks (or its promises about them!), don’t rely on generated rollback code to repair a failed migration. Roll forward with the fix instead.

Note that this requires a well-behaved continuous delivery pipeline, as it is not feasible to wait hours or days for a fix when the production environment has a serious database problem.
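In practice, rolling forward just means the fix ships as the next migration in the sequence rather than as an attempt to unwind the last one. A contrived sketch with invented names:

```sql
-- 0058_fix_orders_index.sql
-- Migration 0057 created an index on the wrong column; instead of rolling it back,
-- this migration rolls forward with the correction.
DROP INDEX IF EXISTS ix_orders_created_at;
CREATE INDEX ix_orders_customer_id ON orders (customer_id);
```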

Let me just log in to Prod and fix this… Link to heading

Once you have started version controlling your database with a migration tool, doing this is the equivalent of logging in to the production server and hex-editing the running binary. It might fix the problem, but it’s not documented, it’s not observable, and on the next deploy the fix will be wiped out. Promising to backport your quick fix into a proper migration later is not acceptable, because you will forget.

Again, this is why having a well-behaved continuous delivery pipeline is so important. It removes the temptation to do this.


  1. I’m using the word schema here to include any reference data such as lookup tables or unchanging data necessary for the application to work, but not generated by user input. ↩︎