Introduction
The first session I conducted after initiating the Brown Bag session for developers at Viant was Database Migrations using Go-migrate. In this presentation, I briefly described what database migrations are, what is Go-migrate, why should we use database migrations, and finally the problems that one could face while using them. And I’ll be covering the same points in this post.
Database Migrations
- What are database migrations?
Database migrations or schema migrations refer to the version-controlled
, incremental
and reversible changes
to
a relational database schema. [1]
The 3 highlighted words are what makes database migrations useful -
- version-controlled: Each state in the database is given a version number and every time a migrations is applied (or rolled back), the database “version” is incremented (or decremented) accordingly
- incremental: Each migration is written in a way such that changes to the schema are incremental and if all migrations are applied from the beginning, the resultant state would be the current database state
- reversible changes: ideally all migrations are written in such a way that on applying it, the database version goes
from
V
toV+1
and on rolling back, the database rolls back to versionV
- Advantages vs. Disadvantages
Advantages | Disadvantages |
---|---|
1. Git but for database | |
2. Organized, granular and ideally reversible changes | 2. More dev work - writing rollback migrations + PRs migrations |
3. More efficient than a monolithic file dump of schema changes; reduces deployment time | |
4. Transformations between versions are clearly defined |
- What is Go-migrate?
Go-migrate [2] is a lightweight schema migration tool that reads migrations from sources and applies them to the specified database.
Go-migrate’s philosophy -
Database drivers are dumb. They don’t assume things or try to correct user input. When in doubt, fail.
Install Go-migrate - How to install Go-migrate?
- Go-migrate commands
I will explain all the Go-migrate using a sample schema for a music database.
1. Create
This command is used to create new migrations. Every create
will generate 2 files - up and down migrations
# timestamp-based
migrate create -ext sql -dir <path-to-migration-files> -format <timestamp-format> <migration-name>
# sequence-based
migrate create -ext sql -dir <path-to-migration-files> -seq <migration-name>
2. Up
This command is used to apply migrations. It can be used to apply all the unapplied migrations or it can be used to specify the number of migrations to apply after the current version.
# apply all
migrate -database "<DB DSN>" -path <path-to-migration-files> up
# apply N
migrate -database "<DB DSN>" -path <path-to-migration-files> up N
The first time this command is run, Go-migrate creates a schema_migrations
table in the database against which this
command is executed. This table is used by Go-migrate to keep track of the database version. The dirty column is used
to represent whether that migration was successfully applied or not. If the value is 0, the schema was migrated to
the version was successfully; 1 otherwise.
3. Down
This command is used to rollback the schema changes. It executes the queries in the down migration files. Just like the up command, this can be used to rollback all the migrations or can be used to specify a number of migrations to be rolled back from the current version.
# rollback all
migrate -database "<DB DSN>" -path <path-to-migration-files> down -all
# rollback N
migrate -database "<DB DSN>" -path <path-to-migration-files> down N
Given the current state of the database, if a down migration is executed, all the tables will be dropped. The only table that would then remain is the schema_migrations table.
4. Goto
This command is used to go to a specific version of the database. If the current version is lesser than the specified version, the migrations in between are applied. If the current version is greater, the migrations in between are rolled back.
migrate -database "<DB DSN> -path <path-to-migration-files> goto V"
After adding more migrations,
say the current version is 1, and the goto command is used with V=3.
5. Force
This command is used to force the schema_migrations to point to a specific version without actually applying it. The force command is normally used when a migration fails and the dirty column in the schema_migrations table is 1. When a migration fails, the user needs to inspect the failure and fix it manually, run the force command to force the schema_migration to that version, and then proceed with running the remaining migrations.
migrate -database "<DB DSN>" -pah <path-to-migration-files> force V
- Points to Note
- For a “no-op” migrate down, there should at least be a comment in the file. Leaving the file completely empty will result in an error
- DB DSN format:
<db-driver>://<username>:<password>@<host>:<port>/<db-name>
- Best Practices
- Multiple queries should be executed in a transaction
- Try to make the migrations idempotent i.e. running the same migration twice in a row should give the same result.
eg. adding
IF EXISTS
to DROP andIF NOT EXISTS
to CREATE statements - Before committing the migration to a remote repo, ensure the migrations run properly. Normally, do an up, down, and then an up again.
Late code check-in issues
- Joe’s branch -
migrations/2018-09-15-12:00:00_joe_1.sql
migrations/2018-10-10-12:00:00_joe_2.sql
- Alice’s branch -
migrations/2018-10-01-12:00:00_alice_1.sql
migrations/2018-10-18-12:00:00_alice_2.sql
- Joe’s changes get merged to master
# master
+migrations/2018-09-15-12:00:00_joe_1.sql
+migrations/2018-10-10-12:00:00_joe_2.sql
-
Joe’s changes are released. The prod DB version now points to
2018-10-10-12:00:00
-
Alice’s changes are merged to master later
# master
migrations/2018-09-15-12:00:00_joe_1.sql
+migrations/2018-10-01-12:00:00_alice_1.sql
migrations/2018-10-10-12:00:00_joe_2.sql
+migrations/2018-10-18-12:00:00_alice_2.sql
- Since the
alice_1.sql
migration is behind the prod DB version, one of migrations is never applied.
Possible workarounds -
- Use sequence numbers instead of timestamps - Go-migrate does not run any migrations if there are duplicate sequence numbers in the same folder.
- https://github.com/golang-migrate/migrate/issues/179#issuecomment-947094417
- Scenario #2 - Order dependent migrations
Say, Alice and Bob are working on separate tasks each involving DB changes that are sequence/order dependent. Eg. Alice’s ALTER followed by Bob’s INSERT on the same table.
Possible scenarios -
- Bob does NOT know about Alice’s changes
- Bob’s changes go first
- Bob’s migrations run correctly
- Alice’s migrations may or may not fail
- Alice’s changes go first
- Alice’s migrations run correctly
- Bob’s migrations may or may not fail
- Bob’s changes go first
- Bob knows about Alice’s changes
- Bob’s changes go first
- Bob’s migrations most likely fail
- Alice’s migrations run correctly
- Alice’s changes go first
- Everything works correctly
- Bob’s changes go first
3 out 4 of the times, there could be a partial or a complete failure while applying migrations.
Possible workarounds -
- Have an intermediate “stage” branch to order the migrations correctly before merging to master
- Use one migration file per release
References
[1] https://en.wikipedia.org/wiki/Schema_migration
[2] https://github.com/golang-migrate/migrate
[3] https://github.com/pressly/goose/issues/63#issuecomment-428681694
[4] https://www.prisma.io/dataguide/types/relational/what-are-database-migrations