RDS Snapshots

RDS snapshots have multiple purposes: migrations, backups, etc. When RDS DB instances are created using the Cloud Platform RDS terraform module, an IAM user account is created for management purposes. This user can create, delete, copy, and restore RDS snapshots.

Examples of managing RDS DB snapshots using the AWS CLI, via the Cloud Platform Service Pod can be found in the README within the RDS terraform module

Considerations

The amount of manual snapshots per AWS account is limited, so it’s important to cleanup old snapshots
Daily snapshots are provided “out of the box”, and do not count towards the “Manual Snapshots” total
Managing snapshots is the teams’ responsibility (as are snapshot restores), so teams are responsible for cleaning up unneeded manual snapshots in order to avoid hitting our AWS account limits

Restoring live services from an RDS DB snapshot

If you want to restore your production RDS DB instance from a previous snapshot, either because the database is corrupted or the whole database was deleted, here is the procedure:

Note: If your existing RDS instance has deletion protection enabled, you must first disable it before performing a restore.

Raise a PR to temporarily set deletion_protection = false in your RDS module configuration.

Once the restore is completed, raise another PR to re-enable deletion_protection = true.

Attempting to restore the RDS instance while deletion protection is enabled will result in an error from AWS and Terraform.

1. Get the `db-instance-identifier` of the DB instance you want to restore, using the Cloud Platform CLI;

cloud-platform decode-secret -n <your namespace> -s <the secret storing your RDS details>

Look for a line like this in the output;

"rds_instance_address": "cloud-platform-xxxxxxxxxxxxxxxx.abcdefghijkl.eu-west-2.rds.amazonaws.com",

The db-instance-identifier is the cloud-platform-xxxxxxxxxxxxxxxx part.

2. List the available snapshots using the AWS CLI:

First of all, ensure you have a Service Pod running and configured for your target RDS instance, in order to run the AWS CLI commands.

aws rds describe-db-snapshots --db-instance-identifier <db-instance-identifier>

The output will have a list of snapshots available for that RDS DB instance. Pick the DBSnapshotIdentifier from which you want the RDS to be restored.

3. Contact the Cloud Platform team about the restore.

Before making changes to the RDS manifest, you need to:

Add an APPLY_PIPELINE_SKIP_THIS_NAMESPACE file at the root of your namespace.
Provide the team with your db-instance-identifier so they can rename it via the AWS console.

4. Update your current RDS manifest file (usually called `rds.tf`) in the cloud-platform-environments repo to include `snapshot-identifier`.

snapshot_identifier = "rds:cloud-platform-XXXXX-2020-02-08-04-23"

For example:

module "example_team_rds" {
    source               = "github.com/ministryofjustice/cloud-platform-terraform-rds-instance?ref=5.1"
    cluster_name           = var.cluster_name
    cluster_state_bucket   = var.cluster_state_bucket
    team_name              = var.team_name
    business-unit          = var.business-unit
    application            = var.application
    is-production          = var.is-production
    db_engine_version      = "11.4"
    environment-name       = var.environment-name
    infrastructure-support = var.infrastructure-support
    force_ssl              = "true"
    rds_family             = "postgres11"

    snapshot_identifier = "rds:cloud-platform-XXXXX-2020-02-08-04-23"

    providers = {
    # Can be either "aws.london" or "aws.ireland"
    aws = aws.london
    }
}

5. Check the actual “AllocatedStorage” value described in your RDS instance using the command:

aws rds describe-db-instances --db-instance-identifier <your-rds-db-identifier>

If that is different to what is set for your RDS, update your RDS manifest to include the actual db_allocated_storage by adding:

db_allocated_storage = "<the actual value you got from the above cli command>"

Important: It is crucial to retain other original settings associated with the DB instance when restoring from the snapshot. Details such as db-engine-version, db-engine, rds-family must be the same as the original instance when the snapshot was taken. So don’t make any other changes to this file except the above mentioned.

Warning: If there is an RDS read replica for the DB instance that is being restored, then you must set count = 0 or comment out the read replica code (and any Kubernetes secrets relating to the read replica). This will remove the read replica. This is required as Terraform will throw the following error as Terraform is unable to change the read replica target: Error: cannot elect new source database for replication

When the read replica is re-created, it will be created with a new database identifier.

6. Raise a PR

After making the necessary changes to the manifest, raise a PR and contact the Cloud Platform team via the #ask-cloud-platform channel with the PR link. Once the new DB instance is restored and available, your application should be able to access the database with the same database credentials and endpoint as before.

7. Tidy up the environment

After the database has been successfully restored and verified, raise a PR to:

Remove the APPLY_PIPELINE_SKIP_THIS_NAMESPACE file.
Remove the snapshot_identifier from your Terraform configuration, as it is no longer needed after the instance has been created.

This ensures future Terraform applies do not attempt to recreate the instance or restore again from the snapshot.

8. Only required for Read Replicas - Recreate the DB read replica if it was removed as a part of step 3:

In Terraform, re-enable the DB read replica code by setting count = 1 or uncommenting the code and any Kubernetes secrets that may also be created as a part of it.

Create a PR that includes the changes for the Cloud Platform team and post the link into the #ask-cloud-platform channel.

Once the PR has been approved, Merge to Main and check that the Apply Pipeline creates the new DB read replica instance.

Note: The read replica will have a new database identifier once it is created.

9. Important: Version Downgrades Cannot Be Done In-Place

AWS does not support downgrading a database to an earlier version on the same instance. For example, if a database has been upgraded from PostgreSQL 16.3 to 17, it cannot be downgraded back to 16.3 using the standard snapshot restoration process outlined above.

To properly downgrade an RDS database to a previous version:

Take a snapshot of the current database to preserve it (in case restoration is needed later):

   aws rds create-db-snapshot \
     --db-instance-identifier <current-db-instance-identifier> \
     --db-snapshot-identifier <current-version-snapshot-name>

Create a new database instance with:
- A different identifier than the current database
- Engine version matching the snapshot you want to restore from
- RDS family matching the snapshot you want to restore from
- All other configuration parameters consistent with the source of the snapshot
In the terraform module:

   module "example_team_rds" {
     source               = "github.com/ministryofjustice/cloud-platform-terraform-rds-instance?ref=5.1"
     # ... other parameters ...
     db_instance_identifier = "<new-instance-name>"  # Must be different from current DB
     db_engine_version      = "16.3"  # Earlier version
     rds_family             = "postgres16"  # Family matching earlier version
     snapshot_identifier    = "<old-version-snapshot>"
     # ... other parameters ...
   }

Note: This approach requires a cutover as the new database will have a different endpoint and connection string.

After verifying the new downgraded database works properly, the old one can be deleted if needed.

This page was last reviewed on 24 April 2025. It needs to be reviewed again on 24 October 2025 by the page owner #cloud-platform .