
RDS Snapshots

RDS snapshots have multiple purposes: migrations, backups, etc. When RDS DB instances are created using the Cloud Platform RDS terraform module, an IAM user account is created for management purposes. This user can create, delete, copy, and restore RDS snapshots.

Examples of managing RDS DB snapshots using the AWS CLI via the Cloud Platform Service Pod can be found in the README within the RDS terraform module.

Considerations

  • The number of manual snapshots per AWS account is limited, so it’s important to clean up old snapshots
  • Daily snapshots are provided “out of the box”, and do not count towards the “Manual Snapshots” total
  • Managing snapshots is the teams’ responsibility (as are snapshot restores), so teams are responsible for cleaning up unneeded manual snapshots in order to avoid hitting our AWS account limits
  • Manual snapshots persist even after the source database is deleted. Automated snapshots are deleted when the source DB is deleted
  • Manual snapshot creation takes approximately 2-3 minutes
  • You cannot restore a snapshot taken from an older DB engine version to a newer DB engine version, or vice versa. The engine version of the snapshot must match the engine version of the target DB instance when restoring from a snapshot.
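To keep an eye on the manual snapshot count and engine versions mentioned above, something like the following could be run from the service pod. This is a minimal sketch: the function names are hypothetical, and it assumes the module's IAM user can call describe-db-snapshots and delete-db-snapshot for your instance.

```shell
# Hypothetical helpers; nothing runs until you call a function.

# List manual snapshots for one DB instance, with creation time and engine version.
# $1 = rds_name (the DB instance identifier)
list_manual_snapshots() {
  aws rds describe-db-snapshots \
    --db-instance-identifier "$1" \
    --snapshot-type manual \
    --query 'DBSnapshots[*].[DBSnapshotIdentifier,SnapshotCreateTime,EngineVersion]' \
    --output table
}

# Delete one manual snapshot that is no longer needed.
# $1 = snapshot identifier
delete_manual_snapshot() {
  aws rds delete-db-snapshot --db-snapshot-identifier "$1"
}

# Usage:
#   list_manual_snapshots <rds_name>
#   delete_manual_snapshot <snapshot name>
```

Comparing the EngineVersion column with the engine version of the target instance before a restore is one way to catch a version mismatch early.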

Restoring live services from an RDS DB snapshot

Note: If your existing RDS instance has deletion protection enabled, you must first disable it before performing a restore.

Important: Database read-write operations are not available during the database restore step. This can take some time. Carry out testing on a representative test database to see how long your restore will take.

There are two restore scenarios:

  • Scenario A: Restore to the SAME database (replace existing)
  • Scenario B: Restore to a NEW database (cross-DB restore)

Scenario A: Restore to Same Database

Use this when restoring a snapshot back to the original database instance.

Preparation

Raise a PR to temporarily set deletion_protection = false in your RDS module configuration. Attempting to restore the RDS instance while deletion protection is enabled will result in an error from AWS and Terraform.

1. Get current DB credentials

cloud-platform decode-secret -n <namespace> -s <rds-secret-name>

Note down: rds_instance_endpoint, database_name, database_username, database_password

The rds_name is the first part of the endpoint (e.g., cloud-platform-xxxxxxxx from cloud-platform-xxxxxxxx.xxxxx.eu-west-2.rds.amazonaws.com).
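If you prefer not to copy the rds_name by hand, it can be derived from the endpoint with shell parameter expansion. A small sketch follows; the function name is hypothetical, and the port handling is only there in case the endpoint value in your secret ends with a port suffix such as :5432.

```shell
# Hypothetical helper: print the rds_name for a given rds_instance_endpoint.
rds_name_from_endpoint() {
  endpoint="${1%%:*}"               # drop a trailing :port, if there is one
  printf '%s\n' "${endpoint%%.*}"   # keep everything before the first dot
}

# Usage:
#   rds_name="$(rds_name_from_endpoint "<rds_instance_endpoint>")"
```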

2. List available snapshots

aws rds describe-db-snapshots \
  --db-instance-identifier <rds_name> \
  --query 'DBSnapshots[*].[DBSnapshotIdentifier,SnapshotCreateTime,Status]' \
  --output table

3. Create manual snapshot (if needed)

Create a snapshot. The suggested format for the snapshot name is rds-name-namespace-purpose-date.

aws rds create-db-snapshot \
  --db-instance-identifier <rds_name> \
  --db-snapshot-identifier <snapshot name>
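As an illustration of the suggested naming format, the sketch below builds a snapshot name and then waits for the snapshot to finish. The function names and the YYYY-MM-DD date layout are assumptions, not a convention mandated by the module; the waiter is the standard AWS CLI db-snapshot-available waiter.

```shell
# Hypothetical helper: build a name in the rds-name-namespace-purpose-date format.
# $1 = rds_name, $2 = namespace, $3 = purpose, $4 = date (optional, defaults to today in UTC)
snapshot_name() {
  printf '%s-%s-%s-%s\n' "$1" "$2" "$3" "${4:-$(date -u +%Y-%m-%d)}"
}

# Block until the named snapshot reports as available.
# $1 = snapshot identifier
wait_for_snapshot() {
  aws rds wait db-snapshot-available --db-snapshot-identifier "$1"
}

# Usage:
#   name="$(snapshot_name <rds_name> <namespace> pre-restore)"
#   aws rds create-db-snapshot --db-instance-identifier <rds_name> --db-snapshot-identifier "$name"
#   wait_for_snapshot "$name"
```

Waiting for the snapshot before moving on avoids pointing Terraform at a snapshot that is still being created.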

4. Update Terraform

Add snapshot_identifier to your RDS module:

module "rds" {
  source = "github.com/ministryofjustice/cloud-platform-terraform-rds-instance?ref=9.2.0"
  # ... existing config ...

  snapshot_identifier = "<snapshot name>"
}

Raise PR and merge.

5. Verify data

Port-forward to the restored DB. First ensure there are no port-forward pods already running; if there are, clean them up before continuing:

kubectl -n <namespace> run port-forward-pod --image=ministryofjustice/port-forward \
  --port=5432 \
  --env="REMOTE_HOST=<db-endpoint>" \
  --env="LOCAL_PORT=5432" \
  --env="REMOTE_PORT=5432"
kubectl -n <namespace> port-forward port-forward-pod 5432:5432
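The check for an existing port-forward pod could be sketched as below. It assumes the pod is named port-forward-pod, as in the commands above; the function name is hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) if a pod named port-forward-pod
# already exists in the given namespace, fails otherwise.
# $1 = namespace
port_forward_pod_exists() {
  kubectl -n "$1" get pod port-forward-pod >/dev/null 2>&1
}

# Usage:
#   if port_forward_pod_exists <namespace>; then
#     kubectl -n <namespace> delete pod port-forward-pod
#   fi
```

Note that this only detects a pod with that exact name. Pods created under other names would still need to be found with kubectl get pods.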

Connect and verify:

PGPASSWORD=<password> psql -h 127.0.0.1 -p 5432 \
  -U <username> -d <dbname> \
  -c "SELECT * FROM <table> LIMIT 5;"

6. Tidy up

IMPORTANT: Do not skip the tidy up. The snapshot_identifier must be removed from your RDS config for data safety, and the port-forward pod must be deleted for security.

  • Remove snapshot_identifier from Terraform config (raise PR)
  • Delete port-forward pod: kubectl -n <namespace> delete pod port-forward-pod
  • Raise another PR to re-enable deletion_protection = true.
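Once the final PR has been applied, you may want to confirm that deletion protection is back on. The sketch below assumes the IAM user available in the service pod is allowed to call describe-db-instances, which may not be the case in your environment; the function name is hypothetical.

```shell
# Hypothetical helper: print the DeletionProtection setting of a DB instance.
# $1 = rds_name (the DB instance identifier)
deletion_protection_status() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].DeletionProtection' \
    --output text
}

# Usage:
#   deletion_protection_status <rds_name>
```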

Scenario B: Restore to NEW Database (Cross-DB)

Scenario B is under review. Ask in #ask-cloud-platform to get an engineer to help with this process.