Tuesday, September 30th, 2025

Using Terraform for Reproducible, High-Fidelity Local Development Environments

Nathan Leung
Engineering

Terraform allows us to quickly spin up high-fidelity local development environments that share the same infrastructure-as-code configuration that we deploy to production. Here's exactly how we use Terraform to configure Docker, Postgres, S3-compatible storage and other required services on developer machines so our local dev environment starts with one command, shares the same IaC, and feels like prod, without the mess of ad-hoc Bash orchestration or the heft of local Kubernetes.

[Illustration: an isometric wireframe laptop labeled "127.0.0.1", connected to three modules labeled "postgres", "minio", and "temporal" on a glowing blueprint grid.]

Once upon a time, local development at Harbor was simple. We had a React frontend with a Node.js backend. We would open two terminal panes and type:

$ cd backend && npm run dev
$ cd frontend && npm run dev

and sure enough, our components would render at http://localhost:3000 and GET /api/user would appear in our backend terminal's output.

We didn't like losing all of our data when the backend process crashed, so we decided to add a database. Not wanting to deal with managing persistent databases on the local filesystem, we set up Docker Compose with an in-memory Postgres instance:

services:
  postgres:
    image: postgres:17.5-alpine
    environment:
      POSTGRES_USER: ${APP_DB_USER}
      POSTGRES_PASSWORD: ${APP_DB_PASSWORD}
      POSTGRES_DB: ${APP_DB_NAME}
    tmpfs:
      - /var/lib/postgresql/data
    ports:
      - "5432:5432"

Local development became three commands:

$ docker compose up -d
$ cd backend && npm run dev
$ cd frontend && npm run dev

We could open Docker Desktop and kill the containers whenever we wanted to reset the database or the sessions. It was simple, and it worked.

It Gets Complicated

Our local development story doesn't end there, though.

At Harbor, we make clinical trial software. Our first product is a database for clinical trial data. Because of our domain, our software is subject to strict regulatory requirements. Some of these requirements, unfortunately, started to push the limits of our simple local dev setup:

  1. Any action that touches clinical trial data needs to produce an immutable audit log with application-level data about the actor (e.g. user ID, permissions, the organization they belong to). The requirement to have application-level context in the audit log means that system logs from an extension like pgAudit aren't enough: we need audit log tables in the app database's schema itself, too. The immutability requirement means that the Postgres role used by the backend web service cannot have UPDATE or DELETE permissions on these audit log tables, only SELECT (e.g., to display the audit logs in-app) and INSERT.1
    • In local dev: How do we ensure that our local dev database's roles and permissions match those of the production database? Major divergences in database roles between dev and prod make it more likely we'll run into difficult-to-debug "but it worked in dev"-type issues.
  2. Data from each individual clinical trial needs to be completely isolated from data in other clinical trials. Depending on one's risk tolerance and interpretation of the regulations, implementing trial-level data isolation may mean anything from a single database with a trial_id column on every table that is always used as a filter (risky if you forget a WHERE somewhere, but operationally simpler) to completely separate databases for each clinical trial (theoretically safer, but greater operational overhead). At Harbor, we create a separate Postgres database for each clinical trial for maximal data isolation.
    • In local dev: How can we spin up individual clinical trial databases locally? At first glance, it seems like Docker Compose is a nonstarter: the docker-compose.yml file is static, so there's no obvious way to dynamically create new Postgres containers with the existing setup.

Altogether, it was a major challenge to meet these requirements with our barebones Docker Compose setup. These challenges, however, are exactly what excite us about building in this domain. So, to get everything set up, we put the team together and spent a weekend writing a 700-line Bash script that dynamically parses and generates the necessary YAML to... just kidding.

Enter Terraform

Like most modern engineering teams, we were already using Terraform to manage production infrastructure. So we asked ourselves: why not use it for local dev too?2

At the outset, it was straightforward to translate our original Docker Compose file into Terraform configuration. We relied heavily on the Terraform Docker provider, which gives Terraform the ability to declaratively manage Docker images and containers.

resource "docker_image" "postgres" {
  name         = "postgres:17.5-alpine"
  keep_locally = true
}

resource "docker_container" "app_db" {
  name  = "app_db"
  image = docker_image.postgres.image_id

  tmpfs = {
    "/var/lib/postgresql/data" = ""
  }

  env = [
    # Uppercase Terraform variables are more robust when dealing
    # with case-insensitive environment/secret managers — some
    # can't distinguish between `TF_VAR_name` and `TF_VAR_NAME`
    "POSTGRES_USER=${var.APP_DB_ADMIN_USER}",
    "POSTGRES_PASSWORD=${var.APP_DB_ADMIN_PASSWORD}",
    "POSTGRES_DB=${var.APP_DB_NAME}"
  ]

  ports {
    internal = 5432
    external = var.APP_DB_PORT
  }
}

There's nothing too novel here — it's almost a one-to-one translation of the original YAML into HCL. We repeated this translation for our other backend dependencies, like Redis (for user sessions), Mailpit (a local SMTP server for testing email sending), and MinIO (local S3-compatible storage):

resource "docker_image" "minio" {
  name         = "minio/minio:latest"
  keep_locally = true
}

resource "docker_container" "minio" {
  name  = "minio"
  image = docker_image.minio.image_id

  command = ["server", "/data"]

  tmpfs = {
    "/data" = ""
  }

  ports {
    internal = 9000
    external = 9000
  }
}
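
One piece of plumbing not shown above: the Docker provider itself has to be declared and pointed at the local daemon. Here's a minimal sketch, assuming the kreuzwerker/docker provider and the default Unix socket path:

terraform {
  required_providers {
    docker = {
      source = "kreuzwerker/docker"
    }
  }
}

provider "docker" {
  # Assumes the default local Docker socket; adjust if your daemon
  # listens elsewhere (e.g. a rootless or Colima socket).
  host = "unix:///var/run/docker.sock"
}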

Now here's the start of the magic: since all of our local services are now managed in Terraform, we can use the Terraform providers for the services themselves to manage their configuration declaratively, using the same variables we used to configure the underlying containers. This is primarily possible because Terraform has providers for almost any service you can think of.

In our specific case, we can use the Terraform Postgres provider to manage roles in our Postgres database, and we can use the MinIO provider to create the storage buckets needed by our application to function. For instance, here's how we could use the Terraform Postgres provider — configured to talk to the Postgres database that was just created by the docker_container.app_db resource shown above — to create a service user for the backend web service with only SELECT and INSERT permissions on the audit log tables.

provider "postgresql" {
  host     = "localhost"

  # These are the same Terraform variables we used to configure the
  # underlying Postgres container above
  port     = var.APP_DB_PORT
  username = var.APP_DB_ADMIN_USER
  password = var.APP_DB_ADMIN_PASSWORD
  database = var.APP_DB_NAME
}

resource "postgresql_role" "service_user" {
  name     = var.APP_DB_SERVICE_USER
  password = var.APP_DB_SERVICE_USER_PASSWORD
  login    = true
}

data "postgresql_tables" "audit_logs_tables" {
  database          = var.APP_DB_NAME
  schemas           = ["public"]
  like_all_patterns = ["%_audit_logs"]
}

resource "postgresql_grant" "service_user_audit_logs_tables" {
  database    = var.APP_DB_NAME
  role        = postgresql_role.service_user.name
  schema      = "public"
  object_type = "table"
  objects     = data.postgresql_tables.audit_logs_tables.tables[*].object_name
  privileges  = ["SELECT", "INSERT"]
}

In practice, though, we already had code like this which we used to configure database roles on our production, cloud-hosted Postgres instance. So instead, we created a Terraform module to share the code and imported it into our local Terraform configuration, enabling us to run the same IaC in both local dev and prod:

# This is the local config; in production, we point the same provider at
# our cloud-hosted instance by passing different host and credential variables
provider "postgresql" {
  host     = "localhost"
  port     = var.APP_DB_PORT
  username = var.APP_DB_ADMIN_USER
  password = var.APP_DB_ADMIN_PASSWORD
  database = var.APP_DB_NAME
}

# Internally, this module has effectively the same Terraform code as
# what we wrote above: it creates roles and locks down the audit log
# tables
module "app_db_config" {
  source = "../modules/db_config"

  db_name               = var.APP_DB_NAME
  service_user_name     = "${var.APP_DB_NAME}_service_user"
  service_user_password = var.APP_DB_SERVICE_USER_PASSWORD
}

Voilà: identical database roles in both environments, provisioned declaratively, with no custom Bash or psql scripting required.
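
For reference, the module's interface is small. A rough sketch of the variables such a module would declare, inferred from the arguments passed above:

# ../modules/db_config/variables.tf (sketch)
variable "db_name" {
  type        = string
  description = "Database whose roles and audit log grants this module manages"
}

variable "service_user_name" {
  type        = string
  description = "Login role used by the backend web service"
}

variable "service_user_password" {
  type        = string
  sensitive   = true
  description = "Password for the service role"
}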

Dynamically created databases, on the other hand, were a bit less straightforward to implement locally. Since trial databases are intended to be created from the application itself, we couldn't hardcode a list of trial_ids in our IaC and have Terraform run for_each; we needed to be able to spin up an arbitrary number of new trial databases, on demand, from the web app.

In brief, the approach we took was to run Terraform at the application level as well: when a new clinical trial database is requested, we apply3 a separate Terraform configuration specifically for provisioning that individual trial's database (in fact, this IaC uses the same db_config module that the app database uses to create roles and lock down the audit log tables). The specifics are out of scope of this post, but in broad terms, the trial-specific IaC is similar to that of the main application, in that there is a local Terraform config that creates Docker containers and a production config that creates resources in the cloud.

Concretely, the local dev trial-specific IaC creates a Docker container4 for the clinical trial database and a MinIO bucket to store trial-specific files, mirroring the hosted Postgres instance and cloud storage bucket that would be created in prod:

resource "random_integer" "trial_db_port" {
  # In prod, we'd get different hostnames or IPs, but in local dev,
  # everything's on `localhost`, so we have to disambiguate by port.
  min = 49152
  max = 65535
}

resource "docker_container" "trial_db" {
  name  = var.TRIAL_DB_NAME
  image = docker_image.postgres.image_id

  env = [
    "POSTGRES_USER=${var.TRIAL_DB_ADMIN_USER}",
    "POSTGRES_PASSWORD=${var.TRIAL_DB_ADMIN_PASSWORD}",
    "POSTGRES_DB=${var.TRIAL_DB_NAME}"
  ]

  ports {
    internal = 5432
    external = random_integer.trial_db_port.result
  }
}

module "trial_db_config" {
  source = "../modules/db_config"

  db_name               = var.TRIAL_DB_NAME
  service_user_name     = "${var.TRIAL_DB_NAME}_service_user"
  service_user_password = var.TRIAL_DB_SERVICE_USER_PASSWORD
}

resource "minio_s3_bucket" "trial_bucket" {
  bucket = "bucket-for-${var.TRIAL_ID}"

  # In production, we'd set up a private bucket with appropriate
  # permissions, versioning, etc.
  acl = "public"

  # This allows the bucket to be destroyed on `terraform destroy`, even
  # if it has content. This is fine for local dev.
  force_destroy = true
}
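
As with the Docker provider, the MinIO resources here assume a provider block pointed at our local instance. A rough sketch, assuming the aminueza/minio provider and MinIO's default root credentials (fine for throwaway local dev, not for anything else):

terraform {
  required_providers {
    minio = {
      source = "aminueza/minio"
    }
  }
}

provider "minio" {
  # Points at the MinIO container created in the main local dev config.
  # These are MinIO's default credentials; real deployments use secrets.
  minio_server   = "localhost:9000"
  minio_user     = "minioadmin"
  minio_password = "minioadmin"
  minio_ssl      = false
}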

Now all of this Terraform is well and good — who could be against declarative, reproducible, reusable infrastructure? — but where does the state live?

In production, we take the typical approach of using a remote backend, pointed at a cloud storage bucket. In local dev — and this is where Terraform really starts to shine — we have our local MinIO instance's S3-compatible API.

Specifically, in the main application's local dev Terraform config, we create a MinIO bucket to store the local trial databases' Terraform state, mirroring the cloud storage bucket that we use to store the trial databases' infrastructure state in production:

resource "minio_s3_bucket" "trial_dbs_terraform_state" {
  bucket = "trial-dbs-tfstate"
  acl    = "public"
}

And in the trial-specific IaC, we simply point Terraform's backend config to the local MinIO bucket we just created:

terraform {
  backend "s3" {
    bucket = "trial-dbs-tfstate"
    region = "local"

    # Backend blocks can't interpolate variables, so the per-trial state
    # key (e.g. "<trial_id>/terraform.tfstate") is supplied at init time
    # via `terraform init -backend-config="key=..."`.

    # These settings tell Terraform to use our local MinIO instance
    # as the S3 backend instead of "real" AWS S3.
    endpoints = {
      s3 = "http://localhost:9000"
    }
    use_path_style              = true
    skip_credentials_validation = true
    skip_requesting_account_id  = true
    skip_region_validation      = true
    skip_metadata_api_check     = true
  }
}

With this all in place, we can now spin up new trial databases from our web app locally, in almost exactly the same way that we do in prod: when a new database is requested, the backend calls terraform apply; since we're running locally, Terraform spins up a new trial-specific Postgres container (as opposed to a cloud-hosted database), and the trial-specific Terraform state is stored in the local bucket. When we want to deprovision a trial database, our backend can just as easily terraform destroy to clean up the trial's resources — again, using the same command in both local and prod.
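
One practical detail worth noting: after apply, the backend needs to know where the freshly provisioned database and bucket actually live. Terraform outputs are a natural channel for that, read back with terraform output -json. A sketch of what the local trial config could expose (output names here are illustrative, not our exact ones):

output "trial_db_host" {
  value = "localhost"
}

output "trial_db_port" {
  value = random_integer.trial_db_port.result
}

output "trial_bucket_name" {
  value = minio_s3_bucket.trial_bucket.bucket
}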

Conclusion

The structural fidelity that Terraform enables between local and prod means that we can write and test features locally, including complex infrastructure provisioning workflows, with high confidence that they will work in production. For our specific regulatory requirements, Terraform abstracts away much of the complexity of provisioning and configuring multiple databases across environments, which has let us maximize data isolation and regulatory compliance for our customers without compromising engineering velocity.

While the specific regulatory constraints we face and the technical solutions we've built are unique to the clinical trial domain, the Terraform approach to local development generalizes to any application with infrastructure requirements beyond just running a few services. In the long run, that tends to be just about any application.

And our local dev still starts with just three commands:

# One command for all of the infrastructure
$ cd infra/local && terraform apply -auto-approve

# And two commands for the apps
$ cd backend && npm run dev
$ cd frontend && npm run dev

Reach out if you're also interested in leveraging great tools to build scalable, maintainable, and compliant software in a mission-critical domain. It's what we do every day at Harbor.

Footnotes

  1. Among other measures we use to enforce immutability, we also use cryptographic signatures to ensure that audit log rows are not tampered with. But that's implemented at the application level as opposed to the infrastructure level, so it's out of scope for this post.

  2. If you're running Kubernetes, the approach to local development outlined here likely isn't news to you. In essence, we're taking production IaC (which in the Kubernetes case would be YAML config files) and applying it locally (for Kubernetes, this would involve a local cluster tool like Minikube). For teams that aren't running Kubernetes, though, this Terraform-based approach is much more lightweight and probably integrates well with the local dev tooling you already use (e.g. Docker).

  3. More precisely, we have a Temporal workflow that clones the relevant Terraform config, and then runs terraform init and terraform apply. There is a similar workflow to run migrations against trial-specific databases as well.

  4. Yes, in local dev, each trial database gets its own Postgres container. Our engineers have the finest MacBooks money can buy.
