[IMPORTANT] DynamoDB now provides server-side support for cross-region replication using Global Tables. Please use that instead of this client-side library. For more details about Global Tables, please see https://aws.amazon.com/dynamodb/global-tables/
The DynamoDB cross-region replication process consists of two distinct steps: an initial table copy of existing data, followed by real-time replication of live changes.

Table copy: this step is necessary if your source table contains existing data that you would like to sync first. Complete the table copy before enabling real-time replication.
WARNING: If your source table has live writes, make sure the table copy process completes well within 24 hours, because DynamoDB Streams records are only available for 24 hours. If your table copy process takes more than 24 hours, you can potentially end up with inconsistent data across your tables!
Real-time replication: this step sets up a replication process that continuously consumes DynamoDB Stream records from the source table and applies them to the destination table in real time.
Enable DynamoDB Streams on your source table with StreamViewType set to "New and old images" (NEW_AND_OLD_IMAGES). For more information on how to do this, please refer to the official DynamoDB Streams documentation.
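As one way to do this, the stream can be enabled from the AWS CLI (the table name and region below are placeholders for your own values):

```shell
# Enable a stream on the source table that captures both the new and old
# images of each modified item, as required by the replication connector.
aws dynamodb update-table \
  --table-name mySourceTable \
  --region us-east-1 \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
```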
Build the library, then run the connector executable:
java -jar target/dynamodb-cross-region-replication-1.2.1.jar --sourceRegion <source_region> --sourceTable <source_table_name> --destinationRegion <destination_region> --destinationTable <destination_table_name>
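For example, assuming a standard Maven build and a source table `table1` in us-east-1 replicated to us-west-2 (the table names and regions are placeholders):

```shell
# Build the connector jar (standard Maven lifecycle; output lands in target/)
mvn clean package

# Start the replication connector for one source/destination pair
java -jar target/dynamodb-cross-region-replication-1.2.1.jar \
  --sourceRegion us-east-1 \
  --sourceTable table1 \
  --destinationRegion us-west-2 \
  --destinationTable table1
```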
Use the --help option to view all available arguments to the connector executable jar. The connector process accomplishes a few things:

It creates a DynamoDB checkpoint table named after the taskName, used when restoring from crashes. Each replication process must have a unique taskName; overlapping names will result in strange, unpredictable behavior. Please also delete this DynamoDB checkpoint table if you wish to completely restart replication. See how a default taskName is calculated below in the section "Advanced: running replication process across multiple machines".

You can override the source and destination DynamoDB endpoints with the --sourceEndpoint and --destinationEndpoint command line arguments, and the DynamoDB Streams source endpoint with the --sourceStreamsEndpoint command line argument. The main use case for overriding any endpoint is to use DynamoDB Local on one or both ends of the replication pipeline, or for KCL leases and checkpoints.
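For instance, to point both ends of the pipeline at a DynamoDB Local instance (the endpoint URL assumes a default local setup on port 8000; check --help for the exact flag names supported by your build):

```shell
# Run replication between two tables hosted on a single DynamoDB Local
# instance (table names, regions, and the endpoint URL are placeholders).
java -jar target/dynamodb-cross-region-replication-1.2.1.jar \
  --sourceRegion us-east-1 \
  --sourceTable table1 \
  --destinationRegion us-west-2 \
  --destinationTable table1 \
  --sourceEndpoint http://localhost:8000 \
  --destinationEndpoint http://localhost:8000
```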
NOTE: More information on the design and internal structure of the connector library can be found in the design doc. Please note it is your responsibility to ensure the connector process is up and running at all times - replication stops as soon as the process is killed, though upon resuming the process automatically uses the checkpoint table in DynamoDB to restore progress.
Advanced: running replication process across multiple machines

With extremely large tables or tables with high throughput, it might be necessary to split the replication process across multiple machines. In this case, simply kick off the executable jar with the same command on each machine (i.e. one KCL worker per machine). The processes use the DynamoDB checkpoint table to coordinate and distribute work among themselves, so it is essential that you use the same taskName for each process. If you do not specify a taskName, a default one is computed:

taskName = MD5 hash of (sourceTableRegion + sourceTableName + destinationTableRegion + destinationTableName)
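The default can be reproduced outside the library, for example from a shell. This sketch assumes plain string concatenation with no separators, which may differ from the library's exact internal encoding:

```shell
# Compute an MD5-based task name from the replication parameters
# (placeholder values; concatenation order follows the formula above).
SOURCE_REGION=us-east-1
SOURCE_TABLE=table1
DEST_REGION=us-west-2
DEST_TABLE=table1
TASK_NAME=$(printf '%s' "${SOURCE_REGION}${SOURCE_TABLE}${DEST_REGION}${DEST_TABLE}" \
  | md5sum | cut -d' ' -f1)
echo "taskName: $TASK_NAME"   # 32-character hex digest
```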
Each instantiation of the jar executable is for a single replication path only (i.e. one source DynamoDB table to one destination DynamoDB table). To enable replication for multiple tables or create multiple replicas of the same table, a separate instantiation of the cross-region replication library is required. Some examples of replication setup:
Replication Scenario 1: One source table in us-east-1, one replica in each of us-west-2, us-west-1, and eu-west-1
Replication Scenario 2: Two source tables (table1 & table2) in us-east-1, both replicated separately to us-west-2
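Scenario 1 above would therefore require three separate connector processes, one per replica (the table name below is a placeholder; the regions come from the scenario):

```shell
# One connector process per source/destination pair; each could equally
# run on its own machine or under its own supervisor.
for DEST_REGION in us-west-2 us-west-1 eu-west-1; do
  java -jar target/dynamodb-cross-region-replication-1.2.1.jar \
    --sourceRegion us-east-1 --sourceTable myTable \
    --destinationRegion "$DEST_REGION" --destinationTable myTable &
done
```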
Can multiple cross-region replication processes run on the same machine?
How can I ensure the process is always up and running?
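The library does not supervise itself, so one option (an assumption, not official guidance from the library) is to run the jar under a process supervisor such as systemd or supervisord, or a minimal restart loop like the following:

```shell
# Restart the connector whenever it exits; on each restart, progress is
# restored from the checkpoint table in DynamoDB.
while true; do
  java -jar target/dynamodb-cross-region-replication-1.2.1.jar \
    --sourceRegion us-east-1 --sourceTable table1 \
    --destinationRegion us-west-2 --destinationTable table1
  echo "connector exited; restarting in 5 seconds" >&2
  sleep 5
done
```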
How can I build the library and run tests?
Run mvn clean verify -Pintegration-tests on the command line. This will download DynamoDB Local and run an integration test against the local instance with CloudWatch metrics disabled.