Image Factory - Repository Structure

This document describes how the Image Factory system is organized across multiple repositories.

Quick Reference: Analysis Flow

Key Question: Where does Dockerfile parsing happen?

Answer: It depends on Dockerfile location:

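| Dockerfile Location              | Parsed By          | When                    | Result                             |
|----------------------------------|--------------------|-------------------------|------------------------------------|
| Local (same repo as images.yaml) | GitHub Workflow    | On push to main         | Base images discovered immediately |
| External (different repo)        | Kargo Analysis Job | After Warehouse created | Base images discovered at runtime  |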

See “Analysis Responsibilities” section below for detailed breakdown.
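The routing rule in the table can be sketched in Python. This is a minimal illustration, not the project's code: the function name `dockerfile_parser_for` and the `state_repo` parameter are hypothetical, and it assumes an entry counts as "local" when its `source.repo` is the repository that holds images.yaml.

```python
def dockerfile_parser_for(entry: dict, state_repo: str) -> str:
    """Return which component parses the Dockerfile for one images.yaml entry.

    Assumption: a Dockerfile is "local" when source.repo equals the
    repository that holds images.yaml (state_repo).
    """
    source_repo = entry.get("source", {}).get("repo")
    if source_repo == state_repo:
        return "github-workflow"  # parsed on push to main
    return "kargo-analysis"       # parsed after the Warehouse is created


local = {"name": "nginx", "source": {"repo": "craigedmunds/image-factory-exemplar"}}
external = {"name": "backstage", "source": {"repo": "craigedmunds/argocd-eda"}}

print(dockerfile_parser_for(local, "craigedmunds/image-factory-exemplar"))
print(dockerfile_parser_for(external, "craigedmunds/image-factory-state"))
```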


Repository Organization

The Image Factory system is organized across multiple repositories for separation of concerns:

Primary Repositories

image-factory - Code Repository (Public)

Contains all the code, tools, and tests for the Image Factory system.

image-factory/
├── app/                    # Python analysis tool
│   ├── app.py             # Main tool for Dockerfile parsing
│   └── pyproject.toml     # Dependencies
├── cdk8s/                 # CDK8s application
│   ├── main.py           # Manifest generation
│   ├── lib/              # Reusable constructs
│   └── imports/          # Generated Kargo CRD imports
├── tests/                # All tests
│   ├── integration/      # Integration tests
│   └── acceptance/       # Acceptance tests
└── .github/workflows/    # CI/CD workflows
    ├── synth.yml        # CDK8s synthesis workflow
    └── test.yml         # Test runner workflow

Purpose: Houses all the logic for parsing Dockerfiles, generating Kargo manifests, and running tests.

Visibility: Public - allows community contributions and usage


image-factory-state - State Repository (Private)

Contains production configuration and auto-generated state files.

image-factory-state/
├── images.yaml           # ⭐ SOURCE OF TRUTH - User edits this
├── base-images/         # Auto-generated base image state
│   ├── node-22-bookworm-slim.yaml
│   ├── python-3.12-slim.yaml
│   └── distroless-python3-debian12-latest.yaml
├── images/              # Auto-generated app image state
│   ├── backstage.yaml
│   ├── uv.yaml
│   └── metrics-service.yaml
├── dist/                # Auto-generated Kargo manifests (CDK8s output)
│   └── cdk8s/
│       └── image-factory.k8s.yaml
├── lib/                 # Shared code via symlink
│   └── image-factory/   # → Symlink to ../../image-factory (local dev)
├── Taskfile.yaml        # Local task definitions
└── Taskfile.shared.yaml # Shared tasks from lib/image-factory

Purpose:

  • images.yaml is the only file users edit to enroll images
  • base-images/ and images/ contain auto-generated state (DO NOT EDIT)
  • dist/cdk8s/ contains generated Kargo manifests that ArgoCD deploys
  • lib/image-factory/ provides shared code via symlink (local dev)
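The file names under base-images/ in the tree above look as if they are derived from the image reference. The sketch below reproduces that pattern; the function name `state_file_name` and the exact rule (drop a leading registry host, replace `/`, `:` and `@` with `-`) are assumptions inferred from the three example file names, not taken from the tool itself.

```python
import re


def state_file_name(image_ref: str) -> str:
    """Derive a state file name from an image reference (assumed naming rule)."""
    parts = image_ref.split("/")
    # Treat the first component as a registry host if it looks like one.
    if len(parts) > 1 and ("." in parts[0] or ":" in parts[0] or parts[0] == "localhost"):
        parts = parts[1:]
    slug = re.sub(r"[/:@]", "-", "/".join(parts))
    return f"{slug}.yaml"


for ref in (
    "node:22-bookworm-slim",
    "python:3.12-slim",
    "gcr.io/distroless/python3-debian12:latest",
):
    print(ref, "->", state_file_name(ref))
```

The real tool may normalize references differently (for example, stripping a `library/` prefix), so treat this only as a way to read the directory listing.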

Visibility: Private - contains production configuration

Development Setup:

cd repos/image-factory-state
ln -s ../../image-factory lib/image-factory  # Create symlink (target resolves relative to lib/)
task setup                                 # Setup environments
task generate                              # Generate state files
task synth                                 # Generate manifests

User Workflow (Production):

  1. Edit images.yaml to add/modify images
  2. Commit and push
  3. GitHub Actions automatically:
    • Runs synthesis
    • Generates state files
    • Generates Kargo manifests
    • Commits back to this repo
  4. ArgoCD automatically deploys from dist/cdk8s/

image-factory-exemplar - Public Example (Public)

Public demonstration repository showing how to use Image Factory with a complete example.

image-factory-exemplar/
├── README.md            # Comprehensive tutorial and examples
├── images.yaml         # Example configuration (manually edited)
├── base-images/        # Example auto-generated state (committed)
├── images/             # Example auto-generated state (committed)
├── dist/               # Example generated manifests (committed)
│   └── cdk8s/
│       └── image-factory.k8s.yaml
├── example-images/     # Example application code
│   └── nginx/
│       ├── Dockerfile
│       └── index.html
├── lib/                # Shared code via symlink
│   └── image-factory/  # → Symlink to ../../image-factory (local dev)
├── Taskfile.yaml       # Local task definitions
└── .gitignore          # Ignores build artifacts

Purpose:

  • Learning Resource: Complete working example with nginx app
  • Testing Environment: Validate functionality locally
  • Template: Fork this to start your own Image Factory setup
  • Shows the complete workflow with actual generated files committed

Visibility: Public - for community learning and usage

Key Difference from image-factory-state:

  • Contains local Dockerfiles in example-images/
  • Generated files are committed to git to show what Image Factory produces (image-factory-state also commits its generated files, as an audit trail)
  • Includes complete nginx example application
  • Can run entire workflow locally without GitHub Actions

Future Enhancement - Docker Image Distribution:

Package Image Factory as a Docker image that both -state and -exemplar repos can use:

# Instead of: task setup && task generate && task synth
# Could be: docker run -v $(pwd):/workspace ghcr.io/craigedmunds/image-factory:latest synth

Benefits:

  • Eliminates local Python environments and symlinks
  • Provides lightweight shim in each repo
  • Consistent execution environment

Meta-Consideration: The image-factory-exemplar repo should itself enroll the image-factory Docker image in its images.yaml:

- name: image-factory
  registry: ghcr.io
  repository: craigedmunds/image-factory
  source:
    provider: github
    repo: craigedmunds/image-factory
    dockerfile: Dockerfile
    workflow: docker-build.yml
  rebuildDelay: 7d
  autoRebuild: true

This creates a self-referential example: Image Factory builds and monitors its own container image!
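Before synthesis, an enrollment entry like the one above could be checked for completeness. The sketch below is illustrative only: `validate_entry` and `image_ref` are hypothetical helpers, and the set of required fields is an assumption based on the keys that appear in every images.yaml example in this document.

```python
# Fields that appear in every example entry in this document (assumed required).
REQUIRED = ("name", "registry", "repository")
REQUIRED_SOURCE = ("repo", "dockerfile")


def validate_entry(entry: dict) -> list[str]:
    """Return the missing field paths; an empty list means the entry looks complete."""
    missing = [key for key in REQUIRED if not entry.get(key)]
    source = entry.get("source") or {}
    missing += [f"source.{key}" for key in REQUIRED_SOURCE if not source.get(key)]
    return missing


def image_ref(entry: dict) -> str:
    """Registry path that a Warehouse for this entry would monitor."""
    return f"{entry['registry']}/{entry['repository']}"


entry = {
    "name": "image-factory",
    "registry": "ghcr.io",
    "repository": "craigedmunds/image-factory",
    "source": {
        "provider": "github",
        "repo": "craigedmunds/image-factory",
        "dockerfile": "Dockerfile",
        "workflow": "docker-build.yml",
    },
    "rebuildDelay": "7d",
    "autoRebuild": True,
}

print(validate_entry(entry))
print(image_ref(entry))
```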

Usage:

# Clone and setup
git clone https://github.com/craigedmunds/image-factory-exemplar.git
cd image-factory-exemplar
 
ln -s ../../image-factory lib/image-factory  # For local dev (target resolves relative to lib/)
 
# Run workflow manually
task setup      # Setup environments
task generate   # Generate state files from images.yaml
task synth      # Generate Kargo manifests
task status     # View generated files

Platform Repositories

backstage - Developer Portal

Note: Backstage and its plugins are being migrated to a dedicated repository.

Contains Backstage UI integrations for Image Factory:

  • image-factory/ - Backstage UI plugin
  • image-factory-backend/ - Backend API
  • catalog-backend-module-image-factory/ - Catalog integration

Purpose: Provides developer portal UI for viewing and managing images


argocd-eda - EDA Platform Applications

Note: This repository continues to exist for EDA-specific applications. Image Factory components have been extracted.

Current Status:

  • Active repository for event-driven architecture applications
  • Likely a consumer of Image Factory images (it uses images built by Image Factory)
  • Image Factory ArgoCD applications moved to k8s-lab
  • Backstage plugins moved to dedicated backstage repo
  • Image Factory specs migrated to workspace-root .ai/projects/infrastructure/image-factory/

Historical Image Factory Components (now removed):

  • platform/kustomize/seed/image-factory/image-factory-app.yaml → k8s-lab
  • backstage/app/plugins/image-factory* → backstage repo
  • .kiro/specs/image-factory/ → workspace-root .ai/projects/

k8s-lab - Base Infrastructure

Base Kubernetes infrastructure repository.

k8s-lab/
├── components/
│   ├── kargo/           # Kargo installation and CRDs
│   ├── argocd/          # ArgoCD installation (via other location)
│   └── central-secret-store/
│       └── external-secrets/
│           ├── kargo-admin-credentials.yaml
│           ├── kargo-docker-registry.yaml
│           └── kargo-git-credentials.yaml
├── argocd/              # ArgoCD base configuration
│   ├── argocd-namespace.yaml
│   └── argocd-projects.yaml
└── other-seeds/
    ├── image-factory.yaml         # → image-factory repo (Kargo resources)
    ├── image-factory-state.yaml   # → image-factory-state repo (generated manifests)
    ├── image-factory-exemplar.yaml # → image-factory-exemplar repo (optional, for demos)
    └── argocd-eda.yaml            # Legacy: Seeds argocd-eda applications

Purpose:

  • Base platform components (ArgoCD, Kargo, External Secrets, etc.)
  • Infrastructure-level configuration
  • Seeds Image Factory and other application repositories
  • New Home: ArgoCD applications for Image Factory live here (moved from argocd-eda)

Repository Interaction Flow

┌─────────────────────────────────────────────────────────────┐
│                    k8s-lab (Infrastructure)                  │
│  - Installs ArgoCD                                           │
│  - Installs Kargo                                            │
│  - Provides secrets (registry, git credentials)             │
│  - Seeds Image Factory via other-seeds/                     │
│    • image-factory.yaml → Kargo base resources              │
│    • image-factory-state.yaml → Generated manifests         │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────┐
│  User Action: Edit images.yaml in image-factory-state       │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────┐
│  Synthesis Workflow                                          │
│  Current (Python):                                           │
│    1. Checkout image-factory (code)                         │
│    2. Checkout image-factory-state (config)                 │
│    3. Run CDK8s synthesis                                   │
│    4. Commit manifests back to image-factory-state          │
│                                                              │
│  Future (Docker):                                            │
│    1. Pull ghcr.io/craigedmunds/image-factory:latest        │
│    2. docker run -v state:/workspace synth                  │
│    3. Commit manifests back to image-factory-state          │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────┐
│  ArgoCD (deployed from k8s-lab)                             │
│  - Watches image-factory-state/dist/cdk8s/                 │
│  - Detects new/changed manifests                            │
│  - Syncs Kargo resources to cluster                         │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────────────────┐
│  Kargo Orchestration Cycle                                  │
│  1. Warehouse monitors image registries                     │
│  2. New image → Freight created                             │
│  3. Stage triggers Analysis Job                             │
│  4. Analysis discovers base images → commits to git         │
│  5. Git change triggers CDK8s synth (back to top)          │
│  6. New Warehouses created for base images                  │
│  7. Base image updates → trigger app rebuilds               │
└─────────────────────────────────────────────────────────────┘

                  │
                  ▼
┌─────────────────────────────────────────────────────────────┐
│  Kargo (deployed from k8s-lab)                              │
│  - Warehouses monitor image registries                      │
│  - Stages run analysis jobs                                 │
│  - Triggers rebuilds when base images update                │
└─────────────────────────────────────────────────────────────┘

Note: Kargo, the Analysis Job, and CDK8s form a feedback loop: analysis discovers dependencies → CDK8s creates monitoring → monitoring triggers analysis.
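The feedback loop in the diagram settles once analysis stops discovering new base images. A toy simulation makes that termination condition concrete. It is a sketch under stated assumptions: `run_until_stable` is a hypothetical name, and the `base_of` mapping stands in for what the Analysis Job would discover by parsing Dockerfiles.

```python
def run_until_stable(enrolled: list[str], base_of: dict[str, list[str]]):
    """Repeat the analyze -> synth cycle until no new base images appear.

    enrolled: image names from images.yaml
    base_of:  stand-in for analysis results (image -> base images it builds FROM)
    Returns the final set of monitored images and the number of extra synth rounds.
    """
    warehouses = set(enrolled)
    rounds = 0
    while True:
        discovered = {base for image in warehouses for base in base_of.get(image, [])}
        new = discovered - warehouses
        if not new:
            return warehouses, rounds
        warehouses |= new  # CDK8s synth creates Warehouses for the new bases
        rounds += 1


monitored, rounds = run_until_stable(
    ["backstage"],
    {"backstage": ["node:22-bookworm-slim"]},
)
print(sorted(monitored), rounds)
```

Base images pulled from a public registry have no enrolled Dockerfile, so they contribute nothing further and the loop stops after one extra round in this example.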

Analysis Responsibilities - WHO Does WHAT

This section clarifies which component performs each analysis task and when.

Component Responsibilities

Key Question: Why would GitHub parse Dockerfiles if Kargo will parse them later anyway?

Answer: It doesn’t need to! GitHub can skip Dockerfile parsing entirely.

User adds image to images.yaml:

- name: my-app
  registry: ghcr.io
  repository: myorg/my-app
  source:
    provider: github
    repo: myorg/my-app
    dockerfile: Dockerfile

Step 1: GitHub Workflow (CDK8s Synth)

  • Reads: images.yaml only
  • Creates: Kargo Warehouse for my-app (based on images.yaml entry)
  • Does NOT: Parse Dockerfile or discover base images
  • Output: dist/cdk8s/image-factory.k8s.yaml with Warehouse

Step 2: ArgoCD

  • Deploys Warehouse to cluster

Step 3: Kargo Analysis

  • Warehouse monitors registry → creates Freight
  • Analysis Job parses Dockerfile → discovers base images
  • Creates base-images/*.yaml state files
  • Commits to git

Step 4: GitHub Workflow (CDK8s Synth - Round 2)

  • Reads: images.yaml + newly created base-images/*.yaml
  • Creates: Warehouses for base images
  • Output: Updated dist/cdk8s/image-factory.k8s.yaml

Step 5: ArgoCD

  • Deploys base image Warehouses

Result: All Dockerfile parsing happens in Kargo. GitHub just does CDK8s synthesis.
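The two synth rounds described above can be sketched as a pure function from configuration to Warehouse manifests. This is not the CDK8s code in cdk8s/main.py: the function names, the `image-factory` namespace, and the shape of the base-image state entries (`name`, `repoURL`) are hypothetical, and the manifest fields follow the Kargo Warehouse resource as commonly documented (`kargo.akuity.io/v1alpha1`, `spec.subscriptions[].image.repoURL`), which should be verified against the generated CRD imports.

```python
def warehouse_manifest(name: str, repo_url: str, namespace: str = "image-factory") -> dict:
    """Build a minimal Warehouse manifest as a plain dict (namespace is an assumption)."""
    return {
        "apiVersion": "kargo.akuity.io/v1alpha1",
        "kind": "Warehouse",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"subscriptions": [{"image": {"repoURL": repo_url}}]},
    }


def synth(images: list[dict], base_images: list[dict]) -> list[dict]:
    """Round 1 passes no base images; round 2 adds the discovered ones."""
    manifests = [
        warehouse_manifest(img["name"], f"{img['registry']}/{img['repository']}")
        for img in images
    ]
    manifests += [warehouse_manifest(base["name"], base["repoURL"]) for base in base_images]
    return manifests


images = [{"name": "my-app", "registry": "ghcr.io", "repository": "myorg/my-app"}]
round_1 = synth(images, [])
round_2 = synth(images, [{"name": "node-22-bookworm-slim", "repoURL": "docker.io/library/node"}])
print(len(round_1), len(round_2))
```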

Component Responsibilities (Simplified)

| Task                               | Component               | When                        | Input               | Output              |
|------------------------------------|-------------------------|-----------------------------|---------------------|---------------------|
| Read images.yaml                   | CDK8s (GitHub Workflow) | On push to main             | images.yaml         | -                   |
| Create Warehouse from images.yaml  | CDK8s (GitHub Workflow) | On push to main             | images.yaml         | Warehouse manifest  |
| Parse Dockerfile                   | Kargo Analysis          | After Freight created       | Dockerfile from git | base-images/*.yaml  |
| Create Warehouses from base-images | CDK8s (GitHub Workflow) | After state files committed | base-images/*.yaml  | Warehouse manifests |
| Monitor registries                 | Kargo Warehouse         | Continuously                | Registry API        | Freight             |
| Trigger rebuilds                   | Kargo Stage             | On base update              | State files         | GitHub API call     |
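The table lists a "GitHub API call" as the output of a rebuild trigger but does not say which endpoint is used. One plausible option is GitHub's workflow dispatch endpoint, pointed at the `workflow` named in the images.yaml entry. The sketch below only builds the request and does not send it; `build_dispatch_request` is a hypothetical helper, and the target workflow would need an `on: workflow_dispatch` trigger for such a call to succeed.

```python
import json
import urllib.request


def build_dispatch_request(repo: str, workflow: str, token: str, ref: str = "main") -> urllib.request.Request:
    """Prepare (but do not send) a workflow_dispatch call for owner/repo."""
    url = f"https://api.github.com/repos/{repo}/actions/workflows/{workflow}/dispatches"
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_dispatch_request("myorg/my-app", "build.yml", token="<token>")
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req)  (requires a real token)
```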

Why This Is Better

  • ✅ Single source of Dockerfile parsing - Only Kargo does it
  • ✅ GitHub Workflow is simpler - Just runs CDK8s, no Python analysis tool
  • ✅ Consistent behavior - Same process for local and external Dockerfiles
  • ✅ Better for Docker image distribution - GitHub just runs CDK8s container
  • ✅ Kargo has credentials - Can access private repos

Current Implementation Note

The current implementation may still parse Dockerfiles in the GitHub workflow for image-factory-exemplar (local Dockerfiles) to give faster feedback. This step is optional; it can be removed so that Kargo always does the parsing.

Complete Example Flow

Scenario: User enrolls backstage app with Dockerfile in external repo

┌──────────────────────────────────────────────────────────────┐
│ User: Edit images.yaml                                        │
│ + name: backstage                                            │
│   registry: ghcr.io                                          │
│   repository: craigedmunds/backstage                         │
│   source:                                                    │
│     repo: craigedmunds/argocd-eda                           │
│     dockerfile: apps/backstage/Dockerfile                   │
└────────────┬─────────────────────────────────────────────────┘
             │
             ▼
┌──────────────────────────────────────────────────────────────┐
│ GitHub Workflow: CDK8s Synth (Round 1)                       │
│ Reads: images.yaml                                           │
│ Creates: Warehouse for ghcr.io/craigedmunds/backstage       │
│ Does NOT parse Dockerfile                                    │
│ Commits: dist/cdk8s/image-factory.k8s.yaml                  │
└────────────┬─────────────────────────────────────────────────┘
             │
             ▼
┌──────────────────────────────────────────────────────────────┐
│ ArgoCD: Deploy Warehouse                                     │
└────────────┬─────────────────────────────────────────────────┘
             │
             ▼
┌──────────────────────────────────────────────────────────────┐
│ Kargo: Warehouse monitors GHCR                               │
│ Detects existing backstage image → Creates Freight          │
│ Triggers Analysis Job                                        │
└────────────┬─────────────────────────────────────────────────┘
             │
             ▼
┌──────────────────────────────────────────────────────────────┐
│ Kargo Analysis Job (in cluster)                             │
│ 1. Clone craigedmunds/argocd-eda                            │
│ 2. Parse apps/backstage/Dockerfile                          │
│ 3. Discover: FROM node:22-bookworm-slim                     │
│ 4. Create: base-images/node-22-bookworm-slim.yaml          │
│ 5. Commit to image-factory-state repo                       │
└────────────┬─────────────────────────────────────────────────┘
             │
             ▼
┌──────────────────────────────────────────────────────────────┐
│ GitHub Workflow: CDK8s Synth (Round 2)                       │
│ Reads: images.yaml + base-images/node-22-bookworm-slim.yaml │
│ Creates: Warehouse for node:22-bookworm-slim (Docker Hub)    │
│ Commits: Updated dist/cdk8s/image-factory.k8s.yaml         │
└────────────┬─────────────────────────────────────────────────┘
             │
             ▼
┌──────────────────────────────────────────────────────────────┐
│ ArgoCD: Deploy base image Warehouse                         │
└────────────┬─────────────────────────────────────────────────┘
             │
             ▼
┌──────────────────────────────────────────────────────────────┐
│ Kargo: Monitor node:22-bookworm-slim updates                │
│ Trigger backstage rebuilds when base updates                │
└──────────────────────────────────────────────────────────────┘

Key Insight: GitHub never parses the Dockerfile. It only reads:

  • Round 1: images.yaml → creates app Warehouse
  • Round 2: images.yaml + base-images/*.yaml → creates base Warehouses

Kargo Analysis Job (Path 2)

When:

  • New Freight created (new image built)
  • Scheduled re-analysis
  • Manual trigger

Location: Kubernetes cluster, pod running ghcr.io/craigedmunds/image-factory-tool

Steps:

  1. Read images.yaml

    • Clone image-factory-state repo
    • Load enrollment configuration
  2. Fetch External Dockerfile

    • Clone source repository (e.g., craigedmunds/argocd-eda)
    • Navigate to Dockerfile path (e.g., apps/backstage/Dockerfile)
    • Parse Dockerfile
    • Discover base images
  3. Update State Files

    • Create/update base-images/{base}.yaml for newly discovered bases
    • Update images/{name}.yaml with discovery metadata
    • Commit to git → triggers CDK8s synth
  4. Trigger Next Cycle

    • Git commit triggers GitHub workflow
    • Workflow runs CDK8s synth
    • New Warehouses created for newly discovered base images
    • ArgoCD deploys updated manifests
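The "Parse Dockerfile" and "Discover base images" steps come down to reading FROM instructions. Below is a minimal sketch of that idea, not the logic in app/app.py: `discover_base_images` is a hypothetical name, and the parser deliberately ignores ARG substitution and line continuations. It skips `scratch` and references to earlier build stages, since neither is an external base image.

```python
import re

# FROM [--platform=...] <image> [AS <stage>]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(?P<image>\S+)(?:\s+AS\s+(?P<alias>\S+))?\s*$",
    re.IGNORECASE,
)


def discover_base_images(dockerfile_text: str) -> list[str]:
    """Return external base images in order of first appearance."""
    stages: set[str] = set()
    bases: list[str] = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group("image")
        alias = match.group("alias")
        is_stage = image.lower() in stages
        if image.lower() != "scratch" and not is_stage and image not in bases:
            bases.append(image)
        if alias:
            stages.add(alias.lower())
    return bases


dockerfile = """\
FROM node:22-bookworm-slim AS build
RUN npm ci
FROM build AS test
FROM --platform=linux/amd64 node:22-bookworm-slim
COPY --from=build /app /app
"""
print(discover_base_images(dockerfile))
```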

What’s Persisted:

image-factory-state/
├── base-images/
│   ├── node-20.yaml              # Generated ✅ (runtime discovery)
│   └── python-312.yaml           # Generated ✅ (runtime discovery)
└── images/my-app.yaml            # Updated ✅ (discovery metadata)

Why Two Paths?

GitHub Workflow (Path 1):

  • ✅ Fast feedback for local Dockerfiles
  • ✅ No cluster required for initial validation
  • ✅ Works in image-factory-exemplar
  • ❌ Cannot access private repos
  • ❌ Cannot fetch external Dockerfiles

Kargo Analysis (Path 2):

  • ✅ Can fetch external Dockerfiles (has git credentials)
  • ✅ Runs in cluster with proper secrets
  • ✅ Handles private repositories
  • ✅ Re-runs on schedule for drift detection
  • ❌ Slower (cluster scheduling)
  • ❌ Requires Kargo to be running

Complete Flow: External Dockerfile

Example: User enrolls backstage with Dockerfile in craigedmunds/argocd-eda

1. User edits images.yaml
   ├─> GitHub Workflow (Path 1)
   │   ├─> Reads images.yaml ✅
   │   ├─> Cannot fetch external Dockerfile ❌
   │   ├─> Creates images/backstage.yaml (no base images yet)
   │   └─> Creates Warehouse for backstage (monitors GHCR)
   │
   └─> ArgoCD deploys Warehouse
       └─> Kargo detects existing backstage image
           └─> Creates Freight
               └─> Triggers Analysis Job (Path 2)
                   ├─> Clones craigedmunds/argocd-eda ✅
                   ├─> Parses apps/backstage/Dockerfile ✅
                   ├─> Discovers node:22-bookworm-slim ✅
                   ├─> Creates base-images/node-22-bookworm-slim.yaml ✅
                   ├─> Commits to git ✅
                   │
                   └─> Git commit triggers GitHub Workflow
                       ├─> CDK8s generates Warehouse for node:22 ✅
                       └─> ArgoCD deploys → Kargo monitors node:22 ✅

Complete Flow: Local Dockerfile

Example: User enrolls nginx with Dockerfile in image-factory-exemplar/example-images/nginx/

1. User edits images.yaml
   └─> GitHub Workflow (Path 1)
       ├─> Reads images.yaml ✅
       ├─> Parses local Dockerfile ✅
       ├─> Discovers nginx:alpine ✅
       ├─> Creates images/nginx.yaml ✅
       ├─> Creates base-images/nginx-alpine.yaml ✅
       ├─> CDK8s generates Warehouses for nginx + nginx:alpine ✅
       └─> Done! (Kargo Analysis not needed)

Data Flow: User Enrolls a New Image

Step-by-Step Flow

  1. User Action (in image-factory-state repo):

    # Edit images.yaml
    - name: my-app
      registry: ghcr.io
      repository: myorg/my-app
      source:
        provider: github
        repo: myorg/my-app
        dockerfile: Dockerfile
        workflow: build.yml
      rebuildDelay: 7d
      autoRebuild: true
  2. Commit and Push:

    git add images.yaml
    git commit -m "Add my-app to Image Factory"
    git push origin main
  3. image-factory-state GitHub Workflow Triggers (synth.yml):

    Workflow Location: image-factory-state/.github/workflows/synth.yml

    Trigger: Push to main branch (after PR merge)

    Steps (see “Analysis Responsibilities” section above for details):

    • Reads images.yaml enrollment
    • If Dockerfile accessible (local): Parses and discovers base images
    • If Dockerfile external: Skips parse, defers to Kargo Analysis
    • Generates state files
    • Runs CDK8s synthesis → Kargo manifests
    • Commits back to image-factory-state

    For External Dockerfiles: Kargo Analysis Job will later:

    • Clone source repository
    • Parse Dockerfile
    • Discover base images
    • Update state files
    • Trigger another synth cycle
  4. ArgoCD Detects Change:

    • Watching image-factory-state/dist/
    • Sees new manifests
    • Syncs to cluster
  5. Kargo Resources Created:

    • Warehouse for my-app (monitors GHCR for new builds)
    • Warehouse for node:20-bookworm-slim (monitors Docker Hub)
    • Stage for analysis
    • Stage for rebuild triggers
  6. Ongoing Monitoring:

    • When node:20-bookworm-slim updates, Kargo triggers rebuild
    • When my-app is built, Kargo runs analysis
    • Full automation from this point forward

User touched ONE file: images.yaml
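The enrollment entry carries `rebuildDelay: 7d` and `autoRebuild: true`, but this document does not define their semantics. The sketch below assumes one reading: a rebuild becomes due once the delay has elapsed since the base image was updated, and only when `autoRebuild` is enabled. Both helper names and the supported unit suffixes (`m`, `h`, `d`) are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_delay(text: str) -> timedelta:
    """Parse values such as '7d' or '12h' (assumed format: integer + unit)."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if not match:
        raise ValueError(f"unsupported delay: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def rebuild_due(base_updated_at: datetime, rebuild_delay: str, auto_rebuild: bool, now: datetime) -> bool:
    """True when the assumed delay has passed and automatic rebuilds are enabled."""
    if not auto_rebuild:
        return False
    return now >= base_updated_at + parse_delay(rebuild_delay)


updated = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(rebuild_due(updated, "7d", True, datetime(2024, 1, 7, tzinfo=timezone.utc)))
print(rebuild_due(updated, "7d", True, datetime(2024, 1, 8, tzinfo=timezone.utc)))
```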


Key Design Principles

1. Separation of Code and Config

  • Code (image-factory): Tools, synthesis logic, tests - can be updated independently
  • Config (image-factory-state): Production images, state - changes trigger synthesis
  • Platform (k8s-lab): Base components (ArgoCD, Kargo) and ArgoCD app definitions
  • UI (backstage): Developer portal integration
  • Specs (workspace-root): .ai/projects/infrastructure/image-factory/ - architectural documentation

2. Public vs Private

  • Public (image-factory, image-factory-exemplar): Code and examples anyone can use
  • Private (image-factory-state): Production configuration

3. GitOps All the Way

  • All changes via Git commits
  • Workflows automate synthesis
  • ArgoCD automates deployment
  • Full audit trail in Git history

4. Optimal User Experience

User Workflow:

  1. Edit images.yaml
  2. Commit and push
  3. ✨ Everything else automated ✨

What Gets Automated:

  • State file generation
  • Dockerfile parsing
  • Base image discovery
  • Kargo manifest generation
  • ArgoCD deployment
  • Image monitoring
  • Rebuild triggering

Cross-Repository Operations

Adding a New Image

Repos Touched:

  1. image-factory-state - User edits images.yaml
  2. image-factory - Workflow runs synthesis
  3. image-factory-state - Auto-commit of generated files
  4. k8s-lab - ArgoCD syncs from state repo

Automation Level: Fully automated after user edits images.yaml

Updating Image Factory Code

Repos Touched:

  1. image-factory - Developer updates CDK8s code or tools
  2. image-factory-state - Workflow re-synthesizes with new code
  3. k8s-lab - ArgoCD syncs updated manifests

Automation Level: Fully automated after code merge

Changing Base Infrastructure

Repos Touched:

  1. k8s-lab - Update Kargo or ArgoCD versions
  2. image-factory - May need CRD import updates
  3. image-factory-state - Re-synthesis may be triggered

Automation Level: Partially automated, may require manual intervention


Repository Ownership

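| Repository             | Owner               | Primary Users       | Update Frequency |
|------------------------|---------------------|---------------------|------------------|
| image-factory          | Platform Team       | Developers (PRs)    | Weekly           |
| image-factory-state    | App Teams           | All teams           | Daily            |
| image-factory-exemplar | Platform Team       | External users      | Monthly          |
| backstage              | Platform Team       | UI developers       | Weekly           |
| k8s-lab                | Infrastructure Team | Infrastructure team | Monthly          |
| argocd-eda             | Platform Team       | Platform team       | Weekly           |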

Related Documentation

  • Workflow Guide: .ai/steering/image-factory-workflow.md - End-to-end workflow
  • Design Specs: .ai/projects/infrastructure/image-factory/DESIGN.md - Technical design
  • This Document: .ai/projects/infrastructure/image-factory/REPOSITORY-STRUCTURE.md - Repository organization
  • User Tutorial: repos/image-factory-exemplar/README.md - Getting started guide
  • Code Docs: repos/image-factory/README.md - Developer documentation
  • Legacy Specs: repos/argocd-eda/.kiro/specs/image-factory/ - Historical reference (being migrated)