CircleCI Deployment Guide

Overview

CircleCI automatically builds the Docker image, pushes it to ECR, and deploys flows to your self-hosted Prefect server on every push to the main, pipeline, or ecs branch.

Current Status

  • Yes CircleCI Configuration: Configured in .circleci/config.yml
  • Yes Docker Build: Multi-stage Dockerfile optimized for Python 3.13
  • Yes ECR Push: Images tagged with commit hash
  • Yes Prefect Deployment: Flows deployed to self-hosted Prefect server via an internal Prefect deployer ECS task
  • Yes IAM Permissions: CircleCI assumes OrganizationAccountAccessRole via STS using the shared aws-blog-infra-prod context
  • Yes Self-Hosted Prefect: Deployed at https://pipelines.rocketclub.online

Quick Start

For detailed setup instructions, see CIRCLECI_SETUP.md.

Architecture

At a high level, CircleCI builds the blog_data Docker image and then triggers an internal Prefect deployer ECS task that talks to the self-hosted Prefect API:

Developer Push (main/pipeline/ecs)
  ↓
CircleCI Workflow
  ↓
Job 1: build-and-push-image
  ↓
Job 2: deploy-to-prefect
  ↓
Prefect deployer ECS task (from blog_infra)
  ↓
Self-hosted Prefect API at https://pipelines.rocketclub.online
  ↓
ECS work pool blog-data-pool → flow runs

Visual overview: See the CI & Prefect deployer diagram under docs/diagrams/diagram_2_blog_data_ci_and_deployer.png.

Setup Instructions

Step 1: CircleCI Context

Ensure the aws-blog-infra-prod context exists in CircleCI and is attached to this project's workflow. That context is responsible for providing short-lived AWS credentials via STS; you should not configure static AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY values on this project.

If you want the optional Prefect smoke test in the deploy-to-prefect job to run, set PREFECT_API_URL=https://pipelines.rocketclub.online/api in that same context.

Step 2: Verify Prefect Work Pool

Ensure the blog-data-pool work pool exists:

export PREFECT_API_URL="https://pipelines.rocketclub.online/api"

prefect work-pool ls

If it doesn't exist, create it:

prefect work-pool create blog-data-pool --type ecs
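
The check-then-create steps above can be combined into one re-runnable helper. This is a minimal sketch, assuming the prefect CLI is installed and PREFECT_API_URL is already exported; the function name ensure_work_pool is illustrative and not part of this repository.

```shell
# Sketch: create the work pool only when it is missing, so the step is safe to re-run.
# Note: "inspect" also fails when the API is unreachable; in that case the
# create call fails too and surfaces the connection error.
ensure_work_pool() {
  local pool_name="$1"
  local pool_type="${2:-ecs}"
  if prefect work-pool inspect "$pool_name" >/dev/null 2>&1; then
    echo "work pool '$pool_name' already exists"
  else
    prefect work-pool create "$pool_name" --type "$pool_type"
  fi
}

# Usage:
#   export PREFECT_API_URL="https://pipelines.rocketclub.online/api"
#   ensure_work_pool blog-data-pool ecs
```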

Step 3: Deploy

Push to any of the configured branches:

# Option 1: Push to main
git push origin main

# Option 2: Push to pipeline
git push origin pipeline

# Option 3: Push to ecs
git push origin ecs
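
Pushes to any other branch do not trigger a deployment. A small local guard can make that explicit before pushing. This is a sketch only: the authoritative branch filter lives in .circleci/config.yml, and the helper name is_deploy_branch is hypothetical.

```shell
# Returns 0 when a push to the given branch is expected to trigger the workflow.
# The branch list mirrors the three branches named in this guide.
is_deploy_branch() {
  case "$1" in
    main|pipeline|ecs) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: warn when the current branch will not deploy.
#   branch="$(git rev-parse --abbrev-ref HEAD)"
#   is_deploy_branch "$branch" || echo "push to '$branch' will not deploy"
```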

Step 4: Monitor Deployment

In CircleCI

  1. Go to app.circleci.com
  2. Select your project
  3. Watch the workflow:
    • build-and-push-image (~3-5 minutes)
    • deploy-to-prefect (~2-3 minutes)

In AWS ECR

# Verify image was pushed
aws ecr describe-images --repository-name blog-data --region eu-west-2

# Get the most recently pushed image (imageDetails is not returned in push order)
aws ecr describe-images \
  --repository-name blog-data \
  --region eu-west-2 \
  --query 'sort_by(imageDetails, &imagePushedAt)[-1]'

In Self-Hosted Prefect Server

  1. Go to https://pipelines.rocketclub.online
  2. Log in with your credentials
  3. Navigate to Deployments
  4. Verify that the data-pipeline-daily deployment exists
  5. Check that its version matches the git commit hash

Configuration Files

.circleci/config.yml

Two-job workflow:

  1. build-and-push-image

    • Uses ubuntu-2404:current machine executor
    • Dynamically gets AWS account ID via STS
    • Queries ECR for repository URL
    • Builds Docker image with commit hash tag
    • Pushes to ECR
  2. deploy-to-prefect

    • Uses cimg/python:3.13 Docker executor
    • Assumes an AWS IAM role via STS
    • Starts the Prefect deployer ECS task defined in blog_infra
    • Passes the built image tag via PREFECT_IMAGE_REFERENCE
    • Optionally verifies that the blog-data-pool work pool exists
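
As an illustration of how deploy-to-prefect might hand the image to the deployer task, the sketch below builds the container override that sets PREFECT_IMAGE_REFERENCE and passes it to aws ecs run-task. The cluster, task definition, and container names are placeholders (the real values live in blog_infra), and a Fargate task would typically also need --launch-type and --network-configuration, which are omitted here.

```shell
# Build the --overrides JSON that injects PREFECT_IMAGE_REFERENCE into the
# deployer container. Values are assumed to contain no JSON special characters.
deployer_overrides() {
  local container_name="$1" image_ref="$2"
  printf '{"containerOverrides":[{"name":"%s","environment":[{"name":"PREFECT_IMAGE_REFERENCE","value":"%s"}]}]}' \
    "$container_name" "$image_ref"
}

# Start the deployer task. All four arguments are placeholders for values
# defined elsewhere; this is not the actual CircleCI step.
start_deployer_task() {
  local cluster="$1" task_def="$2" container_name="$3" image_ref="$4"
  aws ecs run-task \
    --cluster "$cluster" \
    --task-definition "$task_def" \
    --overrides "$(deployer_overrides "$container_name" "$image_ref")"
}

# Usage (hypothetical names):
#   start_deployer_task my-cluster prefect-deployer deployer "${ECR_REPO_URL}:${CIRCLE_SHA1}"
```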

prefect.yaml

The canonical deployment configuration lives in the repo prefect.yaml. Below is a simplified excerpt for the main scheduled deployment; see the file itself for the full set of on-demand deployments and tags.

prefect-version: '>=3.0.0'
name: blog-data-pipeline

deployments:
  - name: data-pipeline-daily
    description: 'Complete data pipeline: extract → clean → load'
    schedule:
      cron: '0 2 */7 * *'
      timezone: 'America/New_York'
    entrypoint: flows/main_pipeline.py:data_pipeline_flow
    work_pool:
      name: blog-data-pool
      work_queue_name: default
      job_variables:
        image: '{{ $PREFECT_IMAGE_REFERENCE }}'

Note: Docker build/push steps are handled by CircleCI, not Prefect.
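
Despite the deployment name, the cron expression '0 2 */7 * *' does not run daily. Under standard cron step semantics (which this sketch assumes Prefect follows), */7 in the day-of-month field starts at day 1 and steps by 7, so runs fire at 02:00 New York time on the days listed below, and the gap between the 29th and the 1st of the next month is shorter than seven days.

```shell
# List the days of the month matched by a */N step in cron's
# day-of-month field (range 1-31).
cron_dom_step_days() {
  local step="$1" day=1 out=""
  while [ "$day" -le 31 ]; do
    out="${out:+$out }$day"
    day=$((day + step))
  done
  echo "$out"
}

cron_dom_step_days 7   # prints: 1 8 15 22 29
```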

Dockerfile

Multi-stage build:

  • Stage 1 (builder): Install dependencies
  • Stage 2 (runtime): Copy artifacts, minimal runtime image

Key Decisions & Lessons Learned

1. ECR Tag Immutability

Issue: The ECR repository has immutable tags enabled, so the latest tag can't be overwritten.

Solution: Only push commit hash tags ($CIRCLE_SHA1), not latest.

2. Work Pool Type

Issue: prefect.yaml was initially configured with Docker build/push steps for the ECS work pool.

Solution: Keep blog-data-pool as an ECS work pool, but handle Docker build/push in CircleCI rather than in prefect.yaml.

3. IAM Permissions

Issue: ECR resource ARN pattern blog-data-* didn't match repository blog-data.

Solution: Added both exact repository name and wildcard pattern to IAM policy:

resources = [
  "arn:aws:ecr:${var.aws_region}:${data.aws_caller_identity.current.account_id}:repository/${var.project_name}",
  "arn:aws:ecr:${var.aws_region}:${data.aws_caller_identity.current.account_id}:repository/${var.project_name}-*"
]
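
The mismatch is easy to reproduce with ordinary wildcard matching. The sketch below uses shell globbing as an approximation of IAM's * wildcard (both match zero or more characters); it is not IAM's actual policy evaluator, and blog-data-staging is a made-up repository name.

```shell
# Returns 0 when NAME matches any of the given wildcard patterns.
matches_any() {
  local name="$1"; shift
  local pattern
  for pattern in "$@"; do
    # $pattern is intentionally unquoted so that * acts as a wildcard.
    case "$name" in
      $pattern) return 0 ;;
    esac
  done
  return 1
}

# The wildcard alone requires the trailing hyphen, so the bare name fails:
matches_any blog-data 'blog-data-*' || echo "blog-data is NOT covered by blog-data-*"
# Listing the exact name as well covers both cases:
matches_any blog-data blog-data 'blog-data-*' && echo "blog-data is covered"
matches_any blog-data-staging blog-data 'blog-data-*' && echo "blog-data-staging is covered"
```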

4. Prefect Deploy Options

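Issue: prefect deploy --all --no-prompt failed because the deploy subcommand does not accept a --no-prompt flag.

Solution: Use prefect deploy --all without the --no-prompt flag. If prompts need to be suppressed, --no-prompt is a top-level option and goes before the subcommand: prefect --no-prompt deploy --all.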

Troubleshooting

"AWS credentials not found"

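  1. Verify that the aws-blog-infra-prod context is attached to the failing job. The context supplies short-lived credentials via STS, so AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY should not be set as static project variables
  2. Check that AWS_REGION is available to the job
  3. Check variable names match exactly (case-sensitive)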
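
To confirm which identity a job or local shell is actually using, the caller ARN can be compared against the expected role. This is a sketch assuming the standard assumed-role ARN format in the aws partition; the helper name is_assumed_role is illustrative.

```shell
# Returns 0 when the ARN is an STS assumed-role session for the given role name.
is_assumed_role() {
  local arn="$1" role="$2"
  case "$arn" in
    arn:aws:sts::*:assumed-role/"$role"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   arn="$(aws sts get-caller-identity --query Arn --output text)"
#   is_assumed_role "$arn" OrganizationAccountAccessRole || echo "unexpected identity: $arn"
```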

"Failed to push image to ECR"

# Check ECR repository exists
aws ecr describe-repositories --repository-names blog-data --region eu-west-2

# Verify credentials have ECR permissions
aws ecr get-authorization-token --region eu-west-2

"Prefect connection failed"

  1. Verify PREFECT_API_URL is correct (should end with /api) if you are running Prefect CLI locally or from CI
  2. Check that your machine or runner can reach https://pipelines.rocketclub.online/api
  3. Confirm the Prefect API ECS service is healthy in AWS
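
The first two checks can be scripted. The sketch below validates the URL shape and then probes the server, assuming it exposes the health route under /api as current Prefect server releases do; if the endpoint sits behind authentication, the probe may fail even when the API is healthy. Function names are illustrative.

```shell
# Returns 0 when the URL ends with /api (no trailing slash).
api_url_ok() {
  case "$1" in
    http://*/api|https://*/api) return 0 ;;
    *) return 1 ;;
  esac
}

# Validate the URL, then request <url>/health with a 10-second timeout.
check_prefect_api() {
  local url="${1:-${PREFECT_API_URL:-}}"
  if ! api_url_ok "$url"; then
    echo "PREFECT_API_URL should end with /api: $url" >&2
    return 1
  fi
  curl -fsS --max-time 10 "${url}/health" >/dev/null
}

# Usage:
#   check_prefect_api "https://pipelines.rocketclub.online/api" && echo "API reachable"
```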

"Work pool 'blog-data-pool' not found"

# Create the work pool
prefect work-pool create blog-data-pool --type ecs

"tag invalid: The image tag already exists"

This happens when trying to push the latest tag to an ECR repository with immutable tags enabled.

Solution: Only push commit hash tags, not latest.
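
Re-running a build for a commit whose tag is already in ECR may hit the same error, because a rebuilt image typically has a different digest. One way to guard against that is to check for the tag before pushing. This is a sketch, not the step used in .circleci/config.yml; push_if_absent is a hypothetical helper.

```shell
# Push the image only when its tag is not already present in the repository.
# describe-images exits non-zero when the tag does not exist.
push_if_absent() {
  local repo_name="$1" image_ref="$2"
  local tag="${image_ref##*:}"   # text after the last colon
  if aws ecr describe-images \
       --repository-name "$repo_name" \
       --region eu-west-2 \
       --image-ids "imageTag=${tag}" >/dev/null 2>&1; then
    echo "tag ${tag} already exists, skipping push"
  else
    docker push "$image_ref"
  fi
}

# Usage:
#   push_if_absent blog-data "${ECR_REPO_URL}:${CIRCLE_SHA1}"
```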

Manual Deployment (if needed)

If CircleCI fails, deploy manually:

# Build and push image
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ECR_REGISTRY="${AWS_ACCOUNT_ID}.dkr.ecr.eu-west-2.amazonaws.com"
ECR_REPO_URL="${ECR_REGISTRY}/blog-data"
IMAGE_TAG=$(git rev-parse HEAD)

# Authenticate Docker against the ECR registry
aws ecr get-login-password --region eu-west-2 | \
  docker login --username AWS --password-stdin "${ECR_REGISTRY}"

# Build directly under the ECR name and push the commit-hash tag only
docker build -t "${ECR_REPO_URL}:${IMAGE_TAG}" .
docker push "${ECR_REPO_URL}:${IMAGE_TAG}"

# Deploy flows against self-hosted Prefect (from within the VPC or an admin machine)
export PREFECT_API_URL="https://pipelines.rocketclub.online/api"
# prefect.yaml reads the image from this variable (assumed here to be the full image reference)
export PREFECT_IMAGE_REFERENCE="${ECR_REPO_URL}:${IMAGE_TAG}"
prefect deploy --all

Monitoring

View CircleCI Logs

  • UI: app.circleci.com → Project → Workflow
  • API: curl -H "Circle-Token: $CIRCLE_TOKEN" https://circleci.com/api/v2/workflow/<workflow-id> (CIRCLE_TOKEN stands for a personal API token)

View Prefect Deployments

# List all deployments
prefect deployment ls

# View specific deployment (identifier format: <flow-name>/<deployment-name>)
prefect deployment inspect '<flow-name>/data-pipeline-daily'

View ECR Images

# List images
aws ecr describe-images --repository-name blog-data --region eu-west-2

# Get image details
aws ecr describe-images \
  --repository-name blog-data \
  --region eu-west-2 \
  --image-ids imageTag=<commit-hash>

Next Steps

  1. Yes Environment variables configured in CircleCI
  2. Yes Pipeline tested and working
  3. Yes Docker images in ECR
  4. Yes Flows deployed to self-hosted Prefect
  5. ⏭️ Test flow execution from Prefect UI
  6. ⏭️ Monitor scheduled runs (see prefect.yaml for the canonical cron schedule)

References