Database Migrations in Golang using Go-migrate

Golang | 6 minutes (about 1140 words) | Sept 25, 2023

Introduction

The first session I conducted after initiating the Brown Bag session for developers at Viant was Database Migrations using Go-migrate. In this presentation, I briefly described what database migrations are, what is Go-migrate, why should we use database migrations, and finally the problems that one could face while using them. And I’ll be covering the same points in this post.

Database Migrations

- What are database migrations?

Database migrations or schema migrations refer to the version-controlled, incremental and reversible changes to a relational database schema. [1]

The 3 highlighted words are what makes database migrations useful -

  1. version-controlled: Each state in the database is given a version number and every time a migrations is applied (or rolled back), the database “version” is incremented (or decremented) accordingly
  2. incremental: Each migration is written in a way such that changes to the schema are incremental and if all migrations are applied from the beginning, the resultant state would be the current database state
  3. reversible changes: ideally all migrations are written in such a way that on applying it, the database version goes from V to V+1 and on rolling back, the database rolls back to version V

- Advantages vs. Disadvantages

Advantages Disadvantages
1. Git but for database [1. Late code check-in issues](#late-code-check-in-issues)
2. Organized, granular and ideally reversible changes 2. More dev work - writing rollback migrations + PRs migrations
3. More efficient than a monolithic file dump of schema changes; reduces deployment time
4. Transformations between versions are clearly defined

- What is Go-migrate?

Go-migrate [2] is a lightweight schema migration tool that reads migrations from sources and applies them to the specified database.


Go-migrate’s philosophy -

Database drivers are dumb. They don’t assume things or try to correct user input. When in doubt, fail.

Install Go-migrate - How to install Go-migrate?

- Go-migrate commands

I will explain all the Go-migrate using a sample schema for a music database.

1. Create

This command is used to create new migrations. Every create will generate 2 files - up and down migrations


# timestamp-based
migrate create -ext sql -dir <path-to-migration-files> -format <timestamp-format> <migration-name>

# sequence-based
migrate create -ext sql -dir <path-to-migration-files> -seq <migration-name>
Executing the create command
Executing the create command
Create command result
Create command result
Up and down migrations to create initial tables
Up and down migrations to create initial tables

2. Up

This command is used to apply migrations. It can be used to apply all the unapplied migrations or it can be used to specify the number of migrations to apply after the current version.


# apply all
migrate -database "<DB DSN>" -path <path-to-migration-files> up

# apply N
migrate -database "<DB DSN>" -path <path-to-migration-files> up N
Running the up command for the first time
Running the up command for the first time

The first time this command is run, Go-migrate creates a schema_migrations table in the database against which this command is executed. This table is used by Go-migrate to keep track of the database version. The dirty column is used to represent whether that migration was successfully applied or not. If the value is 0, the schema was migrated to the version was successfully; 1 otherwise.

schema_migrations table after applying the first migration
schema_migrations table after applying the first migration

3. Down

This command is used to rollback the schema changes. It executes the queries in the down migration files. Just like the up command, this can be used to rollback all the migrations or can be used to specify a number of migrations to be rolled back from the current version.


# rollback all
migrate -database "<DB DSN>" -path <path-to-migration-files> down -all

# rollback N
migrate -database "<DB DSN>" -path <path-to-migration-files> down N

Given the current state of the database, if a down migration is executed, all the tables will be dropped. The only table that would then remain is the schema_migrations table.

Database state after running the down command
Database state after running the down command

4. Goto

This command is used to go to a specific version of the database. If the current version is lesser than the specified version, the migrations in between are applied. If the current version is greater, the migrations in between are rolled back.


migrate -database "<DB DSN> -path <path-to-migration-files> goto V"

After adding more migrations,

Creating more migrations
Creating more migrations

say the current version is 1, and the goto command is used with V=3.

Goto command execution result
Goto command execution result
schema_migrations after goto command
schema_migrations after goto command

5. Force

This command is used to force the schema_migrations to point to a specific version without actually applying it. The force command is normally used when a migration fails and the dirty column in the schema_migrations table is 1. When a migration fails, the user needs to inspect the failure and fix it manually, run the force command to force the schema_migration to that version, and then proceed with running the remaining migrations.


migrate -database "<DB DSN>" -pah <path-to-migration-files> force V

- Points to Note

  1. For a “no-op” migrate down, there should at least be a comment in the file. Leaving the file completely empty will result in an error
  2. DB DSN format: <db-driver>://<username>:<password>@<host>:<port>/<db-name>

- Best Practices

  1. Multiple queries should be executed in a transaction
  2. Try to make the migrations idempotent i.e. running the same migration twice in a row should give the same result. eg. adding IF EXISTS to DROP and IF NOT EXISTS to CREATE statements
  3. Before committing the migration to a remote repo, ensure the migrations run properly. Normally, do an up, down, and then an up again.

Late code check-in issues

### - Scenario #1 - The timestamp migration order problem [3]

  1. Joe’s branch -
migrations/2018-09-15-12:00:00_joe_1.sql
migrations/2018-10-10-12:00:00_joe_2.sql
  1. Alice’s branch -
migrations/2018-10-01-12:00:00_alice_1.sql
migrations/2018-10-18-12:00:00_alice_2.sql
  1. Joe’s changes get merged to master
# master
+migrations/2018-09-15-12:00:00_joe_1.sql
+migrations/2018-10-10-12:00:00_joe_2.sql
  1. Joe’s changes are released. The prod DB version now points to 2018-10-10-12:00:00

  2. Alice’s changes are merged to master later

# master
migrations/2018-09-15-12:00:00_joe_1.sql
+migrations/2018-10-01-12:00:00_alice_1.sql
migrations/2018-10-10-12:00:00_joe_2.sql
+migrations/2018-10-18-12:00:00_alice_2.sql
  1. Since the alice_1.sql migration is behind the prod DB version, one of migrations is never applied.

Possible workarounds -

  1. Use sequence numbers instead of timestamps - Go-migrate does not run any migrations if there are duplicate sequence numbers in the same folder.
  2. https://github.com/golang-migrate/migrate/issues/179#issuecomment-947094417

- Scenario #2 - Order dependent migrations

Say, Alice and Bob are working on separate tasks each involving DB changes that are sequence/order dependent. Eg. Alice’s ALTER followed by Bob’s INSERT on the same table.

Possible scenarios -

3 out 4 of the times, there could be a partial or a complete failure while applying migrations.

Possible workarounds -

  1. Have an intermediate “stage” branch to order the migrations correctly before merging to master
  2. Use one migration file per release

References

[1] https://en.wikipedia.org/wiki/Schema_migration

[2] https://github.com/golang-migrate/migrate

[3] https://github.com/pressly/goose/issues/63#issuecomment-428681694

[4] https://www.prisma.io/dataguide/types/relational/what-are-database-migrations

Home  |  /til/ Github Linkedin Email Resume Website Hugo theme by Yukuro