Skip to main content

Migrating an RDS instance

This guide covers one way of migrating an RDS instance from one AWS account to another. This is a standard approach usingpg_dump and psql.

This guide assumes the migration complies with the following :

  • The migration happens from a source postresql RDS instance to a target postresql RDS instance
  • Elevated & short-lived sets of postgres credentials are available for both source and target

Generating these credentials for your source database depends on where it is. For the target database, in the Cloud Platform, if your regular credentials have enough access, you can use those, or post in #ask-cloud-platform when you’re ready, and the team can generate credentials with elevated privileges and put them into a secret in your namespace.

Overview

Postgres utilities

It is possible to do a full database migration using only official CLI tools, provided by Postgres. Using pg_dump and psql, this document describes the migration process.

pg_dump

Using this tooling implies having source database downtime. (As you don’t want data being written to it while migrating it.)

The steps will always be the same; on one side we export from source, on the target side we restore.

Pre-Data, Data, Post-Data

pg_dump can be used to export one big archive that can then be restored with pg_restore. One issue with that approach is the difficulty of troubleshooting migration issues.

Since we are trying to make this process as clear as possible, the following guide decomposes pg_dump into its three components:

  • Pre-data: The table structures, functions.
  • Data: data
  • Post-data: indexes, triggers, rules, and constraints

With that deconstructed process, it is easier to debug issues (and get help from the Cloud Platform team) and most importantly your team can perform validation/testing incrementally.

Step 0 - Pod

In order to run postgresql commands against both of those endpoints, you need access to both.

This is solved by launching a pod on the live1 kubernetes cluster, in the team’s namespace. The migration steps outlined below have been tested from a pod running the bitnami/postgresql Docker image.

Network access requirements:

  • The source RDS needs to have its public accessibility config turned on.
  • The RDS security group needs to be opened to the Cluster. For that, add inbound rules from the NAT gateways’ IP addresses to port 5432.
  • The RDS instance needs to support SSL connections.

Step 1 - Pre-Data

First, to export, we run:

pg_dump -U source_username \
     -h source_endpoint \
     -d source_database \
     -O \
     --section=pre-data > pre-data.sql

Here, -O tells RDS to export the table structure without owners. The command above stores the data in a local file.

Then to restore this into the target, we use psql:

psql -U target_username \
     -h target_endpoint \
     -d target_database \
     -f pre-data.sql

If using a local file is problematic, those two commands can be piped together (|)

Step 2 - Sequences

Sequences are essential for your database to know what the latest increment of the primary keys is. Sequences are held in special tables that will not be migrated during step 1.

First, to export, we run :

pg_dump -U source_username \
     -h source_endpoint \
     -d source_database \
     -t '*_seq' > sequences.sql

Here, -t '*_seq' indicates to pg_dump that we only want to export tables with names ending in _seq, which are the sequences.

Then to restore this into the target, we use psql:

psql -U target_username \
     -h target_endpoint \
     -d target_database \
     -f sequences.sql

If using a local file is problematic, those two commands can be piped together (|)

Step 3 - Data

First, to export, we run :

pg_dump -U source_username \
     -h source_endpoint \
     -d source_database \
     -O \
     --section=data > data.sql

Here, -O tells RDS to export the table structure without owners. The command above stores the data in a local file.

Then to restore this into the target, we use psql:

psql -U target_username \
     -h target_endpoint \
     -d target_database \
     -f data.sql

If using a local file is problematic, those two commands can be piped together (|)

Step 4 - Post-Data

Any constraints, indexes and foreign keys are also a special kind of metadata that would not be migrated during any of the steps above. All of data is contained within the post-data section.

The process is almost identical to Step 1:

First, to export, we run:

pg_dump -U source_username \
     -h source_endpoint \
     -d source_database \
     -O \
     --section=post-data > post-data.sql

Then to restore this into the target, we use psql:

psql -U target_username \
     -h target_endpoint \
     -d target_database \
     -f post-data.sql

If using a local file is problematic, those two commands can be piped together (|)

Step 5 - Data Validation (Very Important)

After a migration, it is your team’s responsibility to ensure the data, its integrity and anything required by your application to operate properly have been preserved.

Even though the process above migrates the data and the metadata, it is essential for you to have a data validation strategy to confirm everything is in order.

The Cloud Platform team can’t provide a how-to guide on data validation, as each service’s database migration is different.

Step 6 - Clean up

After a successful migration, we can clean up by:

  • Deleting the pod from STEP 1
  • Disabling the network access from the live-1 cluster to the source RDS
  • Revoking the temporary credentials created for the migration
This page was last reviewed on 14 June 2021. It needs to be reviewed again on 14 September 2021 .
This page was set to be reviewed before 14 September 2021. This might mean the content is out of date.