Deployment & CI for Infrastructure

This page describes how changes to the blog infrastructure are applied to AWS using Terraform and CircleCI, and how park/unpark workflows interact with the running environment.

The focus here is on infrastructure deployment for this repository (rlhatcher/blog_infra). Application release processes for other repositories (such as blog_data or the Vercel-hosted blog frontend) are documented elsewhere.

High-level deployment flow

The diagram below shows the main steps when a change is made to this repository and rolled out to AWS.

[Diagram: Deployment & CI/CD]

At a high level:

  1. A developer pushes changes to rlhatcher/blog_infra on GitHub.
  2. CircleCI runs workflows that:
    • Plan and apply Terraform changes against infra/envs/prod.
    • Optionally park or unpark the environment.
    • Verify that key endpoints remain healthy.
  3. Terraform updates AWS resources (networking, ECS, RDS, S3, IAM, Lambda, etc.) according to the modules in this repo.

Terraform entrypoints

Terraform for production infrastructure is managed via a single root module:

  • Root: infra/envs/prod
  • Wrapper script: ./scripts/terraform-prod.sh

The wrapper script is invoked by CircleCI jobs to run init, plan and apply with a consistent set of options and backend configuration.
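
As a rough illustration of what "a consistent set of options" can mean, the sketch below shows one way such a wrapper could be structured. It is not the contents of ./scripts/terraform-prod.sh: the function names, the tfplan file name and the choice of flags are assumptions, and only the root path infra/envs/prod comes from this page.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a wrapper in the spirit of ./scripts/terraform-prod.sh.
# Only the root path infra/envs/prod is taken from this page; the rest is assumed.

TF_ROOT="infra/envs/prod"

# Print the Terraform command line for one step (init, plan or apply).
tf_command() {
  case "$1" in
    init)  echo "terraform -chdir=${TF_ROOT} init -input=false" ;;
    plan)  echo "terraform -chdir=${TF_ROOT} plan -input=false -out=tfplan" ;;
    apply) echo "terraform -chdir=${TF_ROOT} apply -input=false tfplan" ;;
    *)     echo "unknown step: $1" >&2; return 1 ;;
  esac
}

# Run the given steps in order, stopping at the first failure.
run_steps() {
  for step in "$@"; do
    cmd=$(tf_command "$step") || return 1
    echo "+ ${cmd}"
    ${cmd} || return 1
  done
}

# Example: run_steps init plan
```

Keeping the command construction in one function means the plan and apply jobs cannot drift apart in the flags they pass.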

CircleCI workflows

The following workflows are responsible for provisioning and managing infrastructure defined in this repository.

| Workflow name | Purpose |
| --- | --- |
| terraform-prod | Plan and apply Terraform changes to prod |
| park-infrastructure | Apply configuration that parks the environment |
| unpark-infrastructure | Restore the environment from parked state |
| refresh-prefect-image | Build/push the ECS-capable Prefect base image |

These workflows are configured in the CircleCI config for this repository.

Terraform jobs

Within the terraform-prod workflow, the primary jobs are:

| Job name | What it does | Notes |
| --- | --- | --- |
| terraform-prod-plan | Runs ./scripts/terraform-prod.sh init/plan | Generates a Terraform plan for review |
| terraform-prod-apply | Runs ./scripts/terraform-prod.sh init/plan/apply | Applies the approved plan to AWS |
| verify-prod-endpoints | Smoke checks against portal and pipelines via CloudFront | Ensures core surfaces are reachable |

The apply step typically requires manual approval in CircleCI before running in production.
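
A smoke check of the kind verify-prod-endpoints performs could look roughly like the sketch below. The real job may work differently; the function names and the PORTAL_URL and PIPELINES_URL variables are placeholders, not values defined in this repository.

```shell
#!/usr/bin/env bash
# Hypothetical smoke check; the real verify-prod-endpoints job may differ.

# Succeed for HTTP status codes in the 2xx or 3xx range.
is_healthy_status() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch a URL and report whether it answered with a healthy status.
check_endpoint() {
  url="$1"
  # curl writes 000 as the status when no HTTP response was received.
  status=$(curl --silent --output /dev/null --max-time 10 \
    --write-out '%{http_code}' "$url") || status="000"
  if is_healthy_status "$status"; then
    echo "OK   ${status} ${url}"
  else
    echo "FAIL ${status} ${url}" >&2
    return 1
  fi
}

# Example (PORTAL_URL and PIPELINES_URL are placeholders):
#   check_endpoint "$PORTAL_URL" && check_endpoint "$PIPELINES_URL"
```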

Park/unpark jobs

To reduce costs when the environment is not in active use, this repository supports park mode:

| Workflow / job | What it does |
| --- | --- |
| park-infrastructure / park-prod-env | Scales down or disables non-essential compute resources while keeping core stateful services intact |
| unpark-infrastructure / unpark-prod-env | Reverses park mode changes and restores normal capacity |

The exact set of resources affected by park/unpark is defined in Terraform variables and modules within this repository. Stateful components such as Aurora and critical S3 buckets remain available so that data is not lost.
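
If park mode is driven by a single boolean Terraform variable, previewing its effect could be sketched as follows. The variable name park_mode is an assumption made for illustration; check the variables in infra/envs/prod for the actual name and shape.

```shell
#!/usr/bin/env bash
# Hypothetical: assumes a boolean Terraform variable named park_mode.

# Map an action (park or unpark) to the matching -var flag.
park_var_flag() {
  case "$1" in
    park)   echo "-var=park_mode=true" ;;
    unpark) echo "-var=park_mode=false" ;;
    *)      echo "usage: park_var_flag park|unpark" >&2; return 1 ;;
  esac
}

# Preview what parking or unparking would change, without applying anything.
preview_park() {
  flag=$(park_var_flag "$1") || return 1
  terraform -chdir=infra/envs/prod plan -input=false "$flag"
}

# Example: preview_park park
```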

Responsibilities and boundaries

From an infra point of view:

  • This repository is the source of truth for AWS infrastructure used by the Backstage portal, Prefect access surface, data platform buckets, Lambda ork processor and related IAM/Secrets.
  • CircleCI is responsible for executing Terraform plans and applies in a controlled, auditable way, including park/unpark operations.
  • Application code and deployments in other repositories (for example Prefect flows in blog_data or the Vercel blog frontend) consume this infrastructure but have their own release pipelines and should not modify core infra directly.

Relationship to other repositories

Only this repository's Terraform is managed by the workflows and jobs described above. The other repositories that make up the system have their own application-level build and deployment pipelines and are not deployed by these CircleCI workflows.

| Repository | Role |
| --- | --- |
| blog_infra | Terraform, AWS infrastructure and CI/CD for the shared platform |
| blog_portal | Backstage application and configuration on ECS |
| blog_data | Prefect flows and Lambda code using S3, Neo4j Aura and Cloudinary |
| blog_code | Next.js public blog application on Vercel |
| blog_docs | Broader system documentation |
| blog_content | Markdown/MDX content for the blog |

When making infrastructure changes, update the Terraform modules under infra/, run terraform plan locally if needed, and then rely on the terraform-prod workflow to apply changes to production.
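
For a local plan, Terraform's -detailed-exitcode flag makes it easy to tell "no changes" apart from "changes present". The sketch below assumes local AWS credentials with access to the prod backend; the function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch of a local plan; assumes credentials and backend access are already set up.

# Interpret the exit code of `terraform plan -detailed-exitcode`.
describe_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    1) echo "error" ;;
    2) echo "changes present" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Initialise the prod root and report whether the plan contains changes.
local_plan() {
  terraform -chdir=infra/envs/prod init -input=false || return 1
  terraform -chdir=infra/envs/prod plan -input=false -detailed-exitcode
  describe_plan_exit "$?"
}

# Example: local_plan
```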

Prefect images and immutable ECR tags

The blog-data ECR repository is used to host both the base Prefect image and a small Prefect deployer image that runs prefect deploy --all inside the VPC on ECS.

  • The base Prefect image is built from the official Prefect 3 Docker Hub image by the refresh-prefect-image workflow and tagged as prefect-3-python3.12-ecs
  • The Prefect deployer image is built by the build-prefect-deployer-image job and tagged as prefect-deployer-3-python3.12-<blog_infra_short_sha>

This scheme has important properties:

  • Immutable tags: the ECR repository is configured with imageTagMutability=IMMUTABLE, so tags are never moved once pushed.
  • Concrete, reproducible mapping: each deployer image tag encodes the blog_infra commit that defined how the image was built.
  • Terraform is explicit: the prod environment pins both images via locals in infra/envs/prod/main.tf.

There is intentionally no mutable latest tag for the deployer image.
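
The tag scheme can be expressed as a small helper. This is a sketch: the function names are illustrative, and it assumes the short SHA is whatever git rev-parse --short produces, which may not match the length the CI job uses.

```shell
#!/usr/bin/env bash
# Sketch of the deployer tag scheme; helper names are illustrative.

# Build the deployer image tag for a given blog_infra short SHA.
deployer_tag() {
  echo "prefect-deployer-3-python3.12-$1"
}

# Tag that would correspond to the current checkout (assumes git's default short SHA).
current_deployer_tag() {
  deployer_tag "$(git rev-parse --short HEAD)"
}

# Print the repository's tag mutability setting (expected: IMMUTABLE).
repo_tag_mutability() {
  aws ecr describe-repositories \
    --repository-names blog-data \
    --region eu-west-2 \
    --query 'repositories[0].imageTagMutability' \
    --output text
}
```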

Running the refresh-prefect-image workflow

Use the refresh-prefect-image workflow when you need to rebuild the base Prefect image. The workflow will:

  1. Build docker/prefect-base-ecs/Dockerfile FROM the official Prefect 3 image.
  2. Install prefect-aws[ecs] so ECS workers can run prefect worker start --type ecs.
  3. Push the result to the blog-data ECR repository as prefect-3-python3.12-ecs.
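
Run by hand, the same steps would look roughly like the sketch below. The account ID argument, the build context (.) and the function names are assumptions; the workflow itself may pass different options.

```shell
#!/usr/bin/env bash
# Hypothetical manual equivalent of the refresh-prefect-image steps.

# Compose a full ECR image URI. Args: account_id region repository tag
ecr_image_uri() {
  echo "$1.dkr.ecr.$2.amazonaws.com/$3:$4"
}

# Log in to ECR, build the base image and push it. Arg: AWS account ID
build_and_push_base() {
  account_id="$1"
  registry="${account_id}.dkr.ecr.eu-west-2.amazonaws.com"
  image=$(ecr_image_uri "$account_id" eu-west-2 blog-data prefect-3-python3.12-ecs)

  aws ecr get-login-password --region eu-west-2 |
    docker login --username AWS --password-stdin "$registry" || return 1

  # The build context "." is an assumption; adjust to match the repository layout.
  docker build -f docker/prefect-base-ecs/Dockerfile -t "$image" . || return 1
  docker push "$image"
}
```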

To run it in CircleCI:

  1. Start a new pipeline for rlhatcher/blog_infra on the desired branch.
  2. Set pipeline parameters: refresh-prefect-image-mode: true, terraform-prod-mode: false.
  3. Once the workflow completes, the tag is available in ECR.
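
The same pipeline can also be started through the CircleCI API v2 instead of the web UI. The sketch below assumes a personal API token in CIRCLE_TOKEN and a GitHub project slug of gh/rlhatcher/blog_infra; projects connected through the CircleCI GitHub App use a different slug format. The branch name is inserted without JSON escaping, so this only suits simple branch names.

```shell
#!/usr/bin/env bash
# Sketch: trigger the refresh workflow via the CircleCI API v2.
# Assumes CIRCLE_TOKEN holds a personal API token.

# Build the JSON request body for a given branch.
pipeline_payload() {
  printf '{"branch":"%s","parameters":{"refresh-prefect-image-mode":true,"terraform-prod-mode":false}}' "$1"
}

# Start a pipeline on the given branch.
trigger_refresh() {
  curl --silent --show-error --fail \
    --request POST \
    --header "Circle-Token: ${CIRCLE_TOKEN}" \
    --header "Content-Type: application/json" \
    --data "$(pipeline_payload "$1")" \
    "https://circleci.com/api/v2/project/gh/rlhatcher/blog_infra/pipeline"
}

# Example: trigger_refresh main
```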

Discovering the latest Prefect deployer image tag

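To find the most recently pushed image tag in the repository (check that the result starts with prefect-deployer-, because blog-data also holds the base Prefect image):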

aws ecr describe-images \
  --repository-name blog-data \
  --region eu-west-2 \
  --query 'reverse(sort_by(imageDetails, &imagePushedAt))[0].imageTags' \
  --output text

Then update infra/envs/prod/main.tf so local.prefect_deployer_image references that concrete tag and run the terraform-prod workflow to roll the ECS task definition forward.
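
Because the query above returns the newest image of any kind, a variant that filters on the prefect-deployer- prefix can be safer. The sketch below lists tags in push order and keeps the last deployer tag; the function names are illustrative, and for repositories with many images the CLI paginates, so ordering across pages is not guaranteed.

```shell
#!/usr/bin/env bash
# Sketch: find the newest deployer tag only, ignoring the base image.

# Read tags (one per line, oldest first) and print the last deployer tag.
latest_deployer_tag() {
  grep '^prefect-deployer-' | tail -n 1
}

# Print every tag in blog-data, one per line, ordered by push time.
list_tags_by_push_time() {
  aws ecr describe-images \
    --repository-name blog-data \
    --region eu-west-2 \
    --query 'sort_by(imageDetails, &imagePushedAt)[].imageTags[]' \
    --output text | tr '\t' '\n'
}

# Example: list_tags_by_push_time | latest_deployer_tag
```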