How to load production data into a local instance

See the atd-moped/moped-database/README.md for more details

Read replica database access

TPW DTS maintains a read-only copy of the production database, also known as a "read replica." This allows us to more efficiently make certain database queries without risking performance of the primary read/write database. This is also the database system we use to copy data from production for a local environment. The Moped database lives in a unified AWS RDS instance, and you can learn more here.

You will need permission to access the unified database read replica, and your read replica user will need select access on all tables in the Moped and Hasura schemas. More information about user management can be found here.

Access can be requested via the Moped Technical PM, Mike Dilley, or another member of the Data Technology Service dev team who has admin permissions in AWS, where our unified database and read replica are managed.

Environment file setup

Once you have unified read replica credentials, you'll need to add them to an env file. The env_template file shows which values need to be filled out. Copy the template, name the copy env, and fill in the values.

From there, you should be able to run the ./hasura-cluster replicate command to pull a replica of production data into your local development environment.
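The steps above can be sketched roughly as follows. This is a minimal sketch, not the project's own tooling: it assumes the env file uses plain KEY=value lines (the format is not shown here) and that env_template and hasura-cluster are both in the directory you run the commands from. The helper name `missing_env_values` is hypothetical; it only lists keys that were left blank after copying the template.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the name of every key in an env file whose value is empty.
# Assumes one KEY=value entry per line; comment lines (#) and blank lines are skipped.
missing_env_values() {
  local env_file="$1"
  awk -F= '
    /^[ \t]*(#|$)/ { next }                      # skip comments and blank lines
    index($0, "=") > 0 {
      value = substr($0, index($0, "=") + 1)     # everything after the first "="
      gsub(/[ \t\r]/, "", value)                 # whitespace-only counts as empty
      if (value == "") print $1
    }
  ' "$env_file"
}

# Typical flow (commented out so sourcing this snippet has no side effects):
#   cp env_template env          # copy the template to a file named env
#   "$EDITOR" env                # fill in your read replica credentials
#   missing_env_values env       # prints nothing when every key has a value
#   ./hasura-cluster replicate   # pull production data into the local instance
```

Checking for blank values before running the replicate command is optional, but it can rule out one common cause of a failed replication early.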

Troubleshooting suggestions

If you encounter errors, here are a few suggestions:

Check the moped-database/snapshots folder. When the replicate command works properly, it creates a .sql file named with today's date (ex: `moped-database/snapshots/production-2024-10-10.sql`) containing the SQL dumped from production, in a snapshot format that can be inserted into your local Postgres instance. If replication fails, a newly created .sql file may contain an error message in lieu of functioning SQL, and that message can provide clues for troubleshooting.
Test your read replica credentials (the same ones you'd put into the env file) with a SQL GUI like TablePlus to confirm that your DB connection works independently of the Moped codebase. If you can't connect, there may be a problem with your credentials, or you may not have adequate access to the database.
Make sure the IP address you are working from is on the approved whitelist managed in the AWS security groups. Reach out to an AWS admin on the dev team if you need help ensuring your IP is on this list. Your home IP address can change, so you may need to verify and update your current IP in AWS. You may also consider logging into the CoA network via VPN (ex: Cisco AnyConnect) and confirming that your connection works from within the CoA network's IP address range.
If you notice an error during replication like ERROR: role "staff" does not exist, you can ignore it; this message is expected during a successful replication. We do not rely on roles during local testing and development.
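The snapshot check from the first suggestion can be sketched as a pair of small shell helpers. Both function names are hypothetical, and the check is only a heuristic resting on an assumption: that a failed dump leaves either an empty file or an error message near the top of the file, where a healthy dump would have SQL.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the most recently modified .sql file in a directory.
# Prints nothing if the directory contains no .sql files.
latest_snapshot() {
  local dir="$1"
  ls -t "$dir"/*.sql 2>/dev/null | head -n 1
}

# Hypothetical helper: succeed (exit 0) when a snapshot looks like a failed dump.
# Heuristic only: an empty or missing file, or "error"/"fatal" in the first 20 lines.
snapshot_looks_broken() {
  local file="$1"
  [ -s "$file" ] || return 0                      # empty or missing file counts as broken
  head -n 20 "$file" | grep -qiE 'error|fatal'    # exit 0 only when a match is found
}

# Example usage (commented out so sourcing this snippet has no side effects):
#   snap=$(latest_snapshot moped-database/snapshots)
#   if snapshot_looks_broken "$snap"; then
#     echo "Snapshot may contain an error message instead of SQL:"
#     head -n 20 "$snap"
#   fi
```

Only the top of the file is inspected because production data further down could legitimately contain the word "error"; if the helper flags a file, read the printed lines before concluding that replication failed.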
