For an updated, ready-to-use CloudFormation template of this code, see the newer post: Complete code: cross-region RDS recovery.

Amazon RDS is a great database-as-a-service that takes care of almost all database-related maintenance tasks for you: everything from automated backups and patching to replication and fail-over to another availability zone.

Unfortunately, all of this fails if the region hosting your RDS instance fails. Region-wide failures are very rare, but they do happen! RDS does not support cross-region replication at the moment, so you cannot simply create a replica of your database in another region (unless you host the database on an EC2 instance and set up the replication yourself). The second-best option for making sure you can restore your service quickly in another region is to always keep a copy of your latest database backup in that region. With RDS, that means copying automated snapshots. AWS has no option to do this automatically, but it can easily be scripted with an AWS Lambda function.

RDS can create an automated snapshot of your database every day. All we need to do is copy that snapshot once it’s ready and remove any old snapshots from the “fail-over region” to save on storage costs.

The following quick-and-dirty Lambda function (in Python) accomplishes just that:

import boto3
import operator

ACCOUNT = 'xxxxx'


def copy_latest_snapshot():
    client = boto3.client('rds', 'eu-west-1')
    frankfurt_client = boto3.client('rds', 'eu-central-1')

    response = client.describe_db_snapshots(
        SnapshotType='automated',
        IncludeShared=False,
        IncludePublic=False
    )

    if len(response['DBSnapshots']) == 0:
        raise Exception("No automated snapshots found")

    snapshots_per_project = {}
    for snapshot in response['DBSnapshots']:
        if snapshot['Status'] != 'available':
            continue

        if snapshot['DBInstanceIdentifier'] not in snapshots_per_project.keys():
            snapshots_per_project[snapshot['DBInstanceIdentifier']] = {}

        snapshots_per_project[snapshot['DBInstanceIdentifier']][snapshot['DBSnapshotIdentifier']] = snapshot[
            'SnapshotCreateTime']

    for project in snapshots_per_project:
        sorted_list = sorted(snapshots_per_project[project].items(), key=operator.itemgetter(1), reverse=True)

        copy_name = project + "-" + sorted_list[0][1].strftime("%Y-%m-%d")

        print("Checking if " + copy_name + " is copied")

        try:
            frankfurt_client.describe_db_snapshots(
                DBSnapshotIdentifier=copy_name
            )
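        except frankfurt_client.exceptions.DBSnapshotNotFoundFault:
            # The copy does not exist in the target region yet, so create it.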
            response = frankfurt_client.copy_db_snapshot(
                SourceDBSnapshotIdentifier='arn:aws:rds:eu-west-1:' + ACCOUNT + ':snapshot:' + sorted_list[0][0],
                TargetDBSnapshotIdentifier=copy_name,
                CopyTags=True
            )

            if response['DBSnapshot']['Status'] != "pending" and response['DBSnapshot']['Status'] != "available":
                raise Exception("Copy operation for " + copy_name + " failed!")
            print("Copied " + copy_name)

            continue

        print("Already copied")


def remove_old_snapshots():
    frankfurt_client = boto3.client('rds', 'eu-central-1')

    response = frankfurt_client.describe_db_snapshots(
        SnapshotType='manual'
    )

    if len(response['DBSnapshots']) == 0:
        raise Exception("No manual snapshots in Frankfurt found")

    snapshots_per_project = {}
    for snapshot in response['DBSnapshots']:
        if snapshot['Status'] != 'available':
            continue

        if snapshot['DBInstanceIdentifier'] not in snapshots_per_project.keys():
            snapshots_per_project[snapshot['DBInstanceIdentifier']] = {}

        snapshots_per_project[snapshot['DBInstanceIdentifier']][snapshot['DBSnapshotIdentifier']] = snapshot[
            'SnapshotCreateTime']

    for project in snapshots_per_project:
        if len(snapshots_per_project[project]) > 1:
            sorted_list = sorted(snapshots_per_project[project].items(), key=operator.itemgetter(1), reverse=True)
            to_remove = [i[0] for i in sorted_list[1:]]

            for snapshot in to_remove:
                print("Removing " + snapshot)
                frankfurt_client.delete_db_snapshot(
                    DBSnapshotIdentifier=snapshot
                )


def lambda_handler(event, context):
    copy_latest_snapshot()
    remove_old_snapshots()


if __name__ == '__main__':
    lambda_handler(None, None)

For the given account (update the ACCOUNT variable at the top of the code), it goes through each of your RDS instances and copies the latest automated snapshot from Ireland (eu-west-1) to Frankfurt (eu-central-1). It then goes through all manual snapshots in Frankfurt and keeps only the latest one for each instance. Note that this applies to every manual snapshot in that region, not just the copies made by this function. The region values can be changed in the script to match your requirements.
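If you would rather not edit the script to change regions, one option is to read them from the function's environment. The sketch below is not part of the original code: the variable names SOURCE_REGION and TARGET_REGION and the helper latest_available_snapshots are hypothetical. It also uses the boto3 paginator for describe_db_snapshots, since a single call returns a limited number of records (100 by default), which could matter in accounts with many snapshots.

```python
import os

# Hypothetical environment variable names; the defaults match the regions used above.
SOURCE_REGION = os.environ.get("SOURCE_REGION", "eu-west-1")
TARGET_REGION = os.environ.get("TARGET_REGION", "eu-central-1")


def latest_available_snapshots(rds_client, snapshot_type="automated"):
    """Return {instance_id: snapshot} holding the newest available snapshot per instance.

    rds_client is expected to be a boto3 RDS client (or anything exposing
    the same get_paginator interface).
    """
    latest = {}
    paginator = rds_client.get_paginator("describe_db_snapshots")
    for page in paginator.paginate(SnapshotType=snapshot_type):
        for snapshot in page["DBSnapshots"]:
            # Snapshots that are still being created cannot be copied yet.
            if snapshot["Status"] != "available":
                continue
            instance = snapshot["DBInstanceIdentifier"]
            current = latest.get(instance)
            if current is None or snapshot["SnapshotCreateTime"] > current["SnapshotCreateTime"]:
                latest[instance] = snapshot
    return latest
```

Inside the Lambda, this could be called as latest_available_snapshots(boto3.client('rds', SOURCE_REGION)), with the copy then made through a client created for TARGET_REGION.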

This Lambda can be scheduled in two ways:
– via CloudWatch Events Schedule, to simply run every day,
– via RDS events (through SNS), to run whenever an RDS backup is finished (the code would benefit from some improvements for this, such as handling only the instance named in the event).
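For the SNS option, the function would need to work out which instance the event refers to. The following is a rough sketch, not part of the original code: the helper name instance_ids_from_sns_event is hypothetical, and it assumes the RDS notification body is a JSON document with a "Source ID" field naming the DB instance, so check this against a real message from your own topic before relying on it.

```python
import json


def instance_ids_from_sns_event(event):
    """Collect RDS source identifiers from an SNS-triggered Lambda event.

    Returns an empty list for scheduled or manual invocations, which
    carry no SNS records.
    """
    instance_ids = []
    for record in (event or {}).get("Records", []):
        # SNS delivers the notification body as a JSON-encoded string.
        message = json.loads(record["Sns"]["Message"])
        # Assumption: "Source ID" holds the DB instance identifier.
        source_id = message.get("Source ID")
        if source_id:
            instance_ids.append(source_id)
    return instance_ids
```

The handler could then restrict the copy step to the returned instances and fall back to processing every instance when the list is empty, which keeps the daily schedule working unchanged.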

You can create this function manually (it does not require any additional libraries, so it can be copied & pasted into AWS Lambda) or use CloudFormation (please do!).
For reference, check out the GitHub repository where you can find other useful Lambdas and CloudFormation templates for their creation: https://github.com/pbudzon/aws-maintenance.



6 Comments

  1. […] At work, the need arose to back up a MySQL database on RDS across regions without having an instance running in the destination region; in other words, read replicas were not wanted. The first suggestion was to use some kind of cron job to copy the backups between regions. Since this had surely been done before, I decided to do a little research and came across this excellent article explaining how to make the copy using a Lambda function in Python: Copying RDS snapshot to another region for cross-region recovery […]

  2. Umesh

    This worked fine and big time for me. Thank you for the information.

  3. Cilla

    Hello, thanks for providing this Lambda function. Can you please provide a README file and explain how best we can test it?

  4. Rohan Khanolkar

    Thanks for such a nice article. Only one question: will this code copy my encrypted DB instance snapshot as well, or do I need to add Python code for that?

    • Paulina Budzon

      Hi Rohan,
      Check out the updated code on GitHub, it now supports encrypted snapshots as well!

