Terraform State Management: Collaborating Safely with S3 and DynamoDB
How to Prevent State File Conflicts with State Locking, Refresh-Only, and Reliable Storage Solutions
Terraform State:
Terraform state is Terraform's own record of the resources you have created on a cloud platform. It is stored locally by default and lets you inspect the last known status of each resource.
When you create an AWS EC2 instance using Terraform, its default state is "running".
After we run terraform apply, Terraform's state records that the instance is running.
Similarly, we can ask Terraform to list every resource it is managing:
terraform state list
Terraform keeps a record of every resource it has created in the terraform.tfstate file. It stores the same details that are available on the AWS console.
To check the current state of any resource, use the following command:
terraform state show 'resource_address'
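Resource addresses passed to these commands follow the form type.name, with an index in square brackets when count is used. A minimal sketch of a configuration that would produce the address aws_instance.my_instance[0] might look like this; the AMI ID and instance type are placeholder assumptions, not values from the original setup.

```hcl
# Hypothetical configuration for illustration only.
resource "aws_instance" "my_instance" {
  count         = 1                       # one instance, addressed as aws_instance.my_instance[0]
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID, replace with a real one for your region
  instance_type = "t2.micro"              # assumed instance type
}
```

With count set, each instance gets its own entry in the state, so terraform state list prints one indexed address per instance.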
Now we will go to AWS and stop our instance:
The AWS console will show the status as stopped, so you would expect Terraform to show the same status locally. However, if you run terraform state show 'aws_instance.my_instance[0]', you will see that it still shows the state as running.
This discrepancy occurs because the local Terraform state is not automatically updated with changes made outside of Terraform. To resolve this issue and ensure the local state matches the AWS console, you can use the following command to refresh the state of the resource:
terraform apply -refresh-only
Terraform lists the changes it detected and asks whether you would like to update the state to reflect them. Type yes to confirm.
You can then see in terraform.tfstate that your instance state has changed to stopped.
This is how you can sync your local state with AWS or another cloud provider.
State Locking
Real Scenario:
Our terraform.tfstate file holds all the information about our resources, including the AMI ID, the public key and, in some configurations, private keys. We should not commit it to GitHub because it contains sensitive information about our infrastructure.
Suppose User 1 sets the instance count = 1 at 10:00 PM and applies. One instance is created on AWS, and the resulting state file stays on User 1's machine; it is not committed to GitHub because it contains sensitive data.
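Now User 2 sets the instance count = 2 on his own system at 10:01 PM and applies. His local state knows nothing about User 1's instance, so Terraform creates two new instances on AWS instead of adding one to the existing instance.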
This happens because every user has a different tfstate file.
To address this issue, we need a single state file that everyone uses, plus a mechanism that locks it while someone is applying changes (for example, from 10:00 PM until User 1's apply finishes). This should be achieved without relying on GitHub.
Problem Statement:
When multiple users work on Terraform, they each have different state files, leading to disorganization in AWS. To resolve this, we require a single shared state file that doesn't need to be pushed to GitHub.
Solution:
We will store the state file in an S3 bucket, allowing Terraform to read and write the state file from there. S3 on its own only stores the file; to lock it, we pair the bucket with a database table, specifically in DynamoDB.
Note: DynamoDB is a NoSQL key-value database. It stores items in tables, but apart from the key it does not enforce a fixed set of columns.
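A minimal sketch of how this solution is usually wired up, using Terraform's s3 backend with a DynamoDB table for locking. The bucket name, key, region, and table name are hypothetical placeholders, and both the bucket and the table are assumed to exist before terraform init is run.

```hcl
# Hypothetical backend configuration; all names and the region are placeholders.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state-bucket" # S3 bucket that holds the shared state file
    key            = "terraform.tfstate"              # object path of the state file inside the bucket
    region         = "us-east-1"                      # assumed region of the bucket and table
    dynamodb_table = "example-terraform-lock-table"   # DynamoDB table used for state locking
    encrypt        = true                             # store the state object encrypted at rest
  }
}
```

After adding a block like this, running terraform init configures the backend, and every user who shares the same block reads and writes the same state file.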
When we execute terraform apply, Terraform reads the state file from the S3 bucket. Before changing anything, it writes a lock entry to the DynamoDB table; the entry is identified by a LockID and records who holds the lock and which operation is running. As long as that entry exists, User 2's run cannot acquire the lock and has to wait until User 1 completes their task and the lock is released. During this period, no one else can modify the state file in S3. This process is known as state locking.
In state locking, we keep a single state file as the source of truth in an S3 bucket. Whenever we apply our Terraform configuration, a lock entry with a LockID is created in a DynamoDB table. As long as that entry exists, no one else can change the state file. This is how the lock-and-key mechanism works. DynamoDB ensures that only one such entry exists at a time, so two users cannot hold the lock together.
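The lock table itself can be sketched as follows. The S3 backend expects a table whose partition key is a string attribute named LockID; the resource label, table name, and billing mode here are assumptions chosen for illustration.

```hcl
# Hypothetical lock table; the name and billing mode are placeholders.
resource "aws_dynamodb_table" "terraform_lock" {
  name         = "example-terraform-lock-table" # must match the table name given in the backend configuration
  billing_mode = "PAY_PER_REQUEST"              # assumed; avoids provisioning read/write capacity
  hash_key     = "LockID"                       # partition key name the S3 backend looks for

  attribute {
    name = "LockID"
    type = "S" # string attribute
  }
}
```

A table like this is typically created once, in a separate configuration or by hand, because the backend cannot take a lock in a table that does not exist yet.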
Once User 1 finishes their work, the LockID is released, and the state becomes available, allowing another user to access and work with it.
We will see a hands-on example in the next blog. Till then, stay tuned!
Happy Learning :)
Chetan Mohod ✨
For more DevOps updates, you can follow me on LinkedIn.