Image Factory - Repository Structure
This document describes how the Image Factory system is organized across multiple repositories.
Quick Reference: Analysis Flow
Key Question: Where does Dockerfile parsing happen?
Answer: It depends on Dockerfile location:
| Dockerfile Location | Parsed By | When | Result |
|---|---|---|---|
| Local (same repo as images.yaml) | GitHub Workflow | On push to main | Base images discovered immediately |
| External (different repo) | Kargo Analysis Job | After Warehouse created | Base images discovered at runtime |
See “Analysis Responsibilities” section below for detailed breakdown.
Repository Organization
The Image Factory system is organized across multiple repositories for separation of concerns:
Primary Repositories
image-factory - Code Repository (Public)
Contains all the code, tools, and tests for the Image Factory system.
image-factory/
├── app/ # Python analysis tool
│ ├── app.py # Main tool for Dockerfile parsing
│ └── pyproject.toml # Dependencies
├── cdk8s/ # CDK8s application
│ ├── main.py # Manifest generation
│ ├── lib/ # Reusable constructs
│ └── imports/ # Generated Kargo CRD imports
├── tests/ # All tests
│ ├── integration/ # Integration tests
│ └── acceptance/ # Acceptance tests
└── .github/workflows/ # CI/CD workflows
├── synth.yml # CDK8s synthesis workflow
└── test.yml # Test runner workflow
Purpose: Houses all the logic for parsing Dockerfiles, generating Kargo manifests, and running tests.
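As an illustration of the parsing `app/app.py` is responsible for, here is a minimal sketch of FROM-line discovery. The function name and the multi-stage handling are assumptions for illustration, not the tool's actual API:

```python
import re

# Illustrative sketch of the Dockerfile parsing app/app.py performs;
# the real tool's function names and edge-case handling may differ.
_FROM = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?",
    re.IGNORECASE | re.MULTILINE,
)

def discover_base_images(dockerfile_text: str) -> list[str]:
    """Return base image references from FROM lines, skipping
    references to earlier build stages in multi-stage builds."""
    stages: set[str] = set()
    bases: list[str] = []
    for image, alias in _FROM.findall(dockerfile_text):
        if image.lower() not in stages:  # skip "FROM builder"-style stage refs
            bases.append(image)
        if alias:
            stages.add(alias.lower())
    return bases
```

For a multi-stage Dockerfile, only true external bases are reported; a `FROM build` referencing an earlier stage alias is ignored.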
Visibility: Public - allows community contributions and usage
image-factory-state - State Repository (Private)
Contains production configuration and auto-generated state files.
image-factory-state/
├── images.yaml # ⭐ SOURCE OF TRUTH - User edits this
├── base-images/ # Auto-generated base image state
│ ├── node-22-bookworm-slim.yaml
│ ├── python-3.12-slim.yaml
│ └── distroless-python3-debian12-latest.yaml
├── images/ # Auto-generated app image state
│ ├── backstage.yaml
│ ├── uv.yaml
│ └── metrics-service.yaml
├── dist/ # Auto-generated Kargo manifests (CDK8s output)
│ └── cdk8s/
│ └── image-factory.k8s.yaml
├── lib/ # Shared code via symlink
│ └── image-factory/ # → Symlink to ../../../image-factory (local dev)
├── Taskfile.yaml # Local task definitions
└── Taskfile.shared.yaml # Shared tasks from lib/image-factory
Purpose:
- `images.yaml` is the only file users edit to enroll images
- `base-images/` and `images/` contain auto-generated state (DO NOT EDIT)
- `dist/cdk8s/` contains generated Kargo manifests that ArgoCD deploys
- `lib/image-factory/` provides shared code via symlink (local dev)
Visibility: Private - contains production configuration
Development Setup:

```bash
cd repos/image-factory-state
ln -s ../image-factory lib/image-factory  # Create symlink
task setup     # Setup environments
task generate  # Generate state files
task synth     # Generate manifests
```

User Workflow (Production):
1. Edit `images.yaml` to add/modify images
2. Commit and push
3. GitHub Actions automatically:
   - Runs synthesis
   - Generates state files
   - Generates Kargo manifests
   - Commits back to this repo
4. ArgoCD automatically deploys from `dist/cdk8s/`
image-factory-exemplar - Public Example (Public)
Public demonstration repository showing how to use Image Factory with a complete example.
image-factory-exemplar/
├── README.md # Comprehensive tutorial and examples
├── images.yaml # Example configuration (manually edited)
├── base-images/ # Example auto-generated state (committed)
├── images/ # Example auto-generated state (committed)
├── dist/ # Example generated manifests (committed)
│ └── cdk8s/
│ └── image-factory.k8s.yaml
├── example-images/ # Example application code
│ └── nginx/
│ ├── Dockerfile
│ └── index.html
├── lib/ # Shared code via symlink
│ └── image-factory/ # → Symlink to ../../../image-factory (local dev)
├── Taskfile.yaml # Local task definitions
└── .gitignore # Ignores build artifacts
Purpose:
- Learning Resource: Complete working example with nginx app
- Testing Environment: Validate functionality locally
- Template: Fork this to start your own Image Factory setup
- Shows the complete workflow with actual generated files committed
Visibility: Public - for community learning and usage
Key Difference from image-factory-state:
- Contains local Dockerfiles in `example-images/`
- Generated files are committed to git to show what Image Factory produces (both repos commit generated files as an audit trail)
- Includes complete nginx example application
- Can run entire workflow locally without GitHub Actions
Future Enhancement - Docker Image Distribution:
Package Image Factory as a Docker image that both -state and -exemplar repos can use:
```bash
# Instead of: task setup && task generate && task synth
# Could be:
docker run -v "$(pwd)":/workspace ghcr.io/craigedmunds/image-factory:latest synth
```

Benefits:
- Eliminates local Python environments and symlinks
- Provides lightweight shim in each repo
- Consistent execution environment
Meta-Consideration:
The image-factory-exemplar repo should itself enroll the image-factory Docker image in its images.yaml:
```yaml
- name: image-factory
  registry: ghcr.io
  repository: craigedmunds/image-factory
  source:
    provider: github
    repo: craigedmunds/image-factory
    dockerfile: Dockerfile
    workflow: docker-build.yml
  rebuildDelay: 7d
  autoRebuild: true
```

This creates a self-referential example: Image Factory builds and monitors its own container image!
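The `rebuildDelay: 7d` value implies a compact duration format. A sketch of parsing it, assuming values take the form `<integer><unit>` with units `s/m/h/d/w` (the real schema is not specified here):

```python
from datetime import timedelta

# Sketch only: assumes rebuildDelay values are "<int><unit>", e.g. "7d".
# The actual images.yaml schema may allow other forms.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

def parse_rebuild_delay(value: str) -> timedelta:
    number, unit = value[:-1], value[-1]
    if unit not in _UNITS or not number.isdigit():
        raise ValueError(f"unrecognised rebuildDelay: {value!r}")
    return timedelta(**{_UNITS[unit]: int(number)})
```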
Usage:
```bash
# Clone and setup
git clone https://github.com/craigedmunds/image-factory-exemplar.git
cd image-factory-exemplar
ln -s ../image-factory lib/image-factory  # For local dev

# Run workflow manually
task setup     # Setup environments
task generate  # Generate state files from images.yaml
task synth     # Generate Kargo manifests
task status    # View generated files
```

Platform Repositories
backstage - Developer Portal
Note: Backstage and its plugins are being migrated to a dedicated repository.
Contains Backstage UI integrations for Image Factory:
- `image-factory/` - Backstage UI plugin
- `image-factory-backend/` - Backend API
- `catalog-backend-module-image-factory/` - Catalog integration
Purpose: Provides developer portal UI for viewing and managing images
argocd-eda - EDA Platform Applications
Note: This repository continues to exist for EDA-specific applications. Image Factory components have been extracted.
Current Status:
- Active repository for event-driven architecture applications
- Likely client of Image Factory images (uses images built by Image Factory)
- Image Factory ArgoCD applications moved to k8s-lab
- Backstage plugins moved to dedicated backstage repo
- Image Factory specs migrated to workspace-root `.ai/projects/infrastructure/image-factory/`

Historical Image Factory Components (now removed):
- `platform/kustomize/seed/image-factory/image-factory-app.yaml` → k8s-lab
- `backstage/app/plugins/image-factory*` → backstage repo
- `.kiro/specs/image-factory/` → workspace-root `.ai/projects/`
k8s-lab - Base Infrastructure
Base Kubernetes infrastructure repository.
k8s-lab/
├── components/
│ ├── kargo/ # Kargo installation and CRDs
│ ├── argocd/ # ArgoCD installation (via other location)
│ └── central-secret-store/
│ └── external-secrets/
│ ├── kargo-admin-credentials.yaml
│ ├── kargo-docker-registry.yaml
│ └── kargo-git-credentials.yaml
├── argocd/ # ArgoCD base configuration
│ ├── argocd-namespace.yaml
│ └── argocd-projects.yaml
└── other-seeds/
├── image-factory.yaml # → image-factory repo (Kargo resources)
├── image-factory-state.yaml # → image-factory-state repo (generated manifests)
├── image-factory-exemplar.yaml # → image-factory-exemplar repo (optional, for demos)
└── argocd-eda.yaml # Legacy: Seeds argocd-eda applications
Purpose:
- Base platform components (ArgoCD, Kargo, External Secrets, etc.)
- Infrastructure-level configuration
- Seeds Image Factory and other application repositories
- New Home: ArgoCD applications for Image Factory live here (moved from argocd-eda)
Repository Interaction Flow
┌─────────────────────────────────────────────────────────────┐
│ k8s-lab (Infrastructure) │
│ - Installs ArgoCD │
│ - Installs Kargo │
│ - Provides secrets (registry, git credentials) │
│ - Seeds Image Factory via other-seeds/ │
│ • image-factory.yaml → Kargo base resources │
│ • image-factory-state.yaml → Generated manifests │
└─────────────────┬───────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ User Action: Edit images.yaml in image-factory-state │
└─────────────────┬───────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Synthesis Workflow │
│ Current (Python): │
│ 1. Checkout image-factory (code) │
│ 2. Checkout image-factory-state (config) │
│ 3. Run CDK8s synthesis │
│ 4. Commit manifests back to image-factory-state │
│ │
│ Future (Docker): │
│ 1. Pull ghcr.io/craigedmunds/image-factory:latest │
│ 2. docker run -v state:/workspace synth │
│ 3. Commit manifests back to image-factory-state │
└─────────────────┬───────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ ArgoCD (deployed from k8s-lab) │
│ - Watches image-factory-state/dist/cdk8s/ │
│ - Detects new/changed manifests │
│ - Syncs Kargo resources to cluster │
└─────────────────┬───────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Kargo Orchestration Cycle │
│ 1. Warehouse monitors image registries │
│ 2. New image → Freight created │
│ 3. Stage triggers Analysis Job │
│ 4. Analysis discovers base images → commits to git │
│ 5. Git change triggers CDK8s synth (back to top) │
│ 6. New Warehouses created for base images │
│ 7. Base image updates → trigger app rebuilds │
└─────────────────────────────────────────────────────────────┘
Note: Kargo/Analysis/CDK8s form a feedback loop - analysis discovers
dependencies → CDK8s creates monitoring → monitoring triggers analysis
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Kargo (deployed from k8s-lab) │
│ - Warehouses monitor image registries │
│ - Stages run analysis jobs │
│ - Triggers rebuilds when base images update │
└─────────────────────────────────────────────────────────────┘
Analysis Responsibilities - WHO Does WHAT
This section clarifies which component performs each analysis task and when.
Component Responsibilities
Key Question: Why would GitHub parse Dockerfiles if Kargo will parse them later anyway?
Answer: It doesn’t need to! GitHub can skip Dockerfile parsing entirely.
Recommended Workflow (Simplified)
User adds image to images.yaml:
```yaml
- name: my-app
  registry: ghcr.io
  repository: myorg/my-app
  source:
    provider: github
    repo: myorg/my-app
    dockerfile: Dockerfile
```

Step 1: GitHub Workflow (CDK8s Synth)
- Reads: `images.yaml` only
- Creates: Kargo Warehouse for `my-app` (based on the images.yaml entry)
- Does NOT: Parse Dockerfile or discover base images
- Output: `dist/cdk8s/image-factory.k8s.yaml` with the Warehouse
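As a sketch, the Warehouse that Step 1 emits for such an entry can be modeled as plain data. The field names follow Kargo's Warehouse CRD (`kargo.akuity.io/v1alpha1`) as an assumption; the project actually generates this via the cdk8s constructs in `cdk8s/lib/`, and the namespace here is illustrative:

```python
# Minimal sketch of the Warehouse manifest produced for an images.yaml
# entry. Field names assume Kargo's Warehouse CRD; the real project
# builds this through cdk8s constructs, not a hand-rolled dict.
def warehouse_manifest(entry: dict) -> dict:
    repo_url = f"{entry['registry']}/{entry['repository']}"
    return {
        "apiVersion": "kargo.akuity.io/v1alpha1",
        "kind": "Warehouse",
        "metadata": {"name": entry["name"], "namespace": "image-factory"},
        "spec": {
            "subscriptions": [
                {"image": {"repoURL": repo_url}}
            ]
        },
    }
```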
Step 2: ArgoCD
- Deploys Warehouse to cluster
Step 3: Kargo Analysis
- Warehouse monitors registry → creates Freight
- Analysis Job parses Dockerfile → discovers base images
- Creates `base-images/*.yaml` state files
- Commits to git
Step 4: GitHub Workflow (CDK8s Synth - Round 2)
- Reads: `images.yaml` + newly created `base-images/*.yaml`
- Creates: Warehouses for base images
- Output: Updated `dist/cdk8s/image-factory.k8s.yaml`
Step 5: ArgoCD
- Deploys base image Warehouses
Result: All Dockerfile parsing happens in Kargo. GitHub just does CDK8s synthesis.
Component Responsibilities (Simplified)
| Task | Component | When | Input | Output |
|---|---|---|---|---|
| Read images.yaml | CDK8s (GitHub Workflow) | On push to main | images.yaml | - |
| Create Warehouse from images.yaml | CDK8s (GitHub Workflow) | On push to main | images.yaml | Warehouse manifest |
| Parse Dockerfile | Kargo Analysis | After Freight created | Dockerfile from git | base-images/*.yaml |
| Create Warehouses from base-images | CDK8s (GitHub Workflow) | After state files committed | base-images/*.yaml | Warehouse manifests |
| Monitor registries | Kargo Warehouse | Continuously | Registry API | Freight |
| Trigger rebuilds | Kargo Stage | On base update | State files | GitHub API call |
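The "Trigger rebuilds" row ends in a GitHub API call. A hedged sketch of constructing that `workflow_dispatch` request with only the standard library (the repo, workflow file, and token are illustrative; the real Stage promotion mechanism may differ):

```python
import json
import urllib.request

# Sketch of the rebuild trigger: a GitHub Actions workflow_dispatch
# call. This only builds the request; the caller would send it with
# urllib.request.urlopen(req) when a base image update requires a rebuild.
def build_dispatch_request(repo: str, workflow: str, token: str,
                           ref: str = "main") -> urllib.request.Request:
    url = f"https://api.github.com/repos/{repo}/actions/workflows/{workflow}/dispatches"
    body = json.dumps({"ref": ref}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
```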
Why This Is Better
- ✅ Single source of Dockerfile parsing - only Kargo does it
- ✅ Simpler GitHub Workflow - just runs CDK8s, no Python analysis tool
- ✅ Consistent behavior - same process for local and external Dockerfiles
- ✅ Better for Docker image distribution - GitHub just runs the CDK8s container
- ✅ Kargo has credentials - can access private repos
Current Implementation Note
The current implementation MAY have GitHub parsing Dockerfiles for image-factory-exemplar (local Dockerfiles) to provide faster feedback. This is optional and can be simplified to always use Kargo.
Complete Example Flow
Scenario: User enrolls backstage app with Dockerfile in external repo
┌──────────────────────────────────────────────────────────────┐
│ User: Edit images.yaml │
│ + name: backstage │
│ registry: ghcr.io │
│ repository: craigedmunds/backstage │
│ source: │
│ repo: craigedmunds/argocd-eda │
│ dockerfile: apps/backstage/Dockerfile │
└────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ GitHub Workflow: CDK8s Synth (Round 1) │
│ Reads: images.yaml │
│ Creates: Warehouse for ghcr.io/craigedmunds/backstage │
│ Does NOT parse Dockerfile │
│ Commits: dist/cdk8s/image-factory.k8s.yaml │
└────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ ArgoCD: Deploy Warehouse │
└────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Kargo: Warehouse monitors GHCR │
│ Detects existing backstage image → Creates Freight │
│ Triggers Analysis Job │
└────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Kargo Analysis Job (in cluster) │
│ 1. Clone craigedmunds/argocd-eda │
│ 2. Parse apps/backstage/Dockerfile │
│ 3. Discover: FROM node:22-bookworm-slim │
│ 4. Create: base-images/node-22-bookworm-slim.yaml │
│ 5. Commit to image-factory-state repo │
└────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ GitHub Workflow: CDK8s Synth (Round 2) │
│ Reads: images.yaml + base-images/node-22-bookworm-slim.yaml │
│ Creates: Warehouse for docker.io/library/node:22-bookworm │
│ Commits: Updated dist/cdk8s/image-factory.k8s.yaml │
└────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ ArgoCD: Deploy base image Warehouse │
└────────────┬─────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Kargo: Monitor node:22-bookworm-slim updates │
│ Trigger backstage rebuilds when base updates │
└──────────────────────────────────────────────────────────────┘
Key Insight: GitHub never parses the Dockerfile. It only reads:
- Round 1: `images.yaml` → creates app Warehouse
- Round 2: `images.yaml` + `base-images/*.yaml` → creates base Warehouses
Kargo Analysis Job (Path 2)
When:
- New Freight created (new image built)
- Scheduled re-analysis
- Manual trigger
Location: Kubernetes cluster, pod running ghcr.io/craigedmunds/image-factory-tool
Steps:
1. Read images.yaml ✅
   - Clone image-factory-state repo
   - Load enrollment configuration
2. Fetch External Dockerfile ✅
   - Clone source repository (e.g., `craigedmunds/argocd-eda`)
   - Navigate to Dockerfile path (e.g., `apps/backstage/Dockerfile`)
   - Parse Dockerfile
   - Discover base images
3. Update State Files ✅
   - Create/update `base-images/{base}.yaml` for newly discovered bases
   - Update `images/{name}.yaml` with discovery metadata
   - Commit to git → triggers CDK8s synth
4. Trigger Next Cycle ✅
   - Git commit triggers GitHub workflow
   - Workflow runs CDK8s synth
   - New Warehouses created for newly discovered base images
   - ArgoCD deploys updated manifests
What’s Persisted:
image-factory-state/
├── base-images/
│ ├── node-20.yaml # Generated ✅ (runtime discovery)
│ └── python-312.yaml # Generated ✅ (runtime discovery)
└── images/my-app.yaml # Updated ✅ (discovery metadata)
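The state filenames above suggest a normalization rule: the tag is folded into the filename, and `latest` is appended when no tag is given. A sketch under that assumption, taking the image reference without its registry host (the real tool's rule may differ):

```python
import re

# Sketch of mapping an image reference (without registry host) to a
# state filename like base-images/node-22-bookworm-slim.yaml. The
# naming rule is inferred from the repository tree, not confirmed.
def state_filename(image_ref: str) -> str:
    if ":" not in image_ref:
        image_ref += ":latest"            # untagged images get "-latest"
    name, tag = image_ref.rsplit(":", 1)
    # fold namespace separators and the tag into one hyphenated slug
    slug = re.sub(r"[^a-z0-9.]+", "-", f"{name}-{tag}".lower()).strip("-")
    return f"base-images/{slug}.yaml"
```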
Why Two Paths?
GitHub Workflow (Path 1):
- ✅ Fast feedback for local Dockerfiles
- ✅ No cluster required for initial validation
- ✅ Works in image-factory-exemplar
- ❌ Cannot access private repos
- ❌ Cannot fetch external Dockerfiles
Kargo Analysis (Path 2):
- ✅ Can fetch external Dockerfiles (has git credentials)
- ✅ Runs in cluster with proper secrets
- ✅ Handles private repositories
- ✅ Re-runs on schedule for drift detection
- ❌ Slower (cluster scheduling)
- ❌ Requires Kargo to be running
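The two-path decision above can be sketched as a small routing function. The entry shape mirrors the `images.yaml` examples; the real workflow's logic may differ:

```python
# Sketch of the Path 1 / Path 2 routing: parse locally when the
# Dockerfile lives in the same repo as images.yaml, otherwise defer
# to the in-cluster Kargo Analysis Job (which has git credentials).
def analysis_path(entry: dict, state_repo: str) -> str:
    source_repo = entry.get("source", {}).get("repo")
    if source_repo is None or source_repo == state_repo:
        return "github-workflow"   # Path 1: local Dockerfile, fast feedback
    return "kargo-analysis"        # Path 2: external repo, needs cluster creds
```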
Complete Flow: External Dockerfile
Example: User enrolls backstage with Dockerfile in craigedmunds/argocd-eda
1. User edits images.yaml
├─> GitHub Workflow (Path 1)
│ ├─> Reads images.yaml ✅
│ ├─> Cannot fetch external Dockerfile ❌
│ ├─> Creates images/backstage.yaml (no base images yet)
│ └─> Creates Warehouse for backstage (monitors GHCR)
│
└─> ArgoCD deploys Warehouse
└─> Kargo detects existing backstage image
└─> Creates Freight
└─> Triggers Analysis Job (Path 2)
├─> Clones craigedmunds/argocd-eda ✅
├─> Parses apps/backstage/Dockerfile ✅
├─> Discovers node:22-bookworm-slim ✅
├─> Creates base-images/node-22.yaml ✅
├─> Commits to git ✅
│
└─> Git commit triggers GitHub Workflow
├─> CDK8s generates Warehouse for node:22 ✅
└─> ArgoCD deploys → Kargo monitors node:22 ✅
Complete Flow: Local Dockerfile
Example: User enrolls nginx with Dockerfile in image-factory-exemplar/example-images/nginx/
1. User edits images.yaml
└─> GitHub Workflow (Path 1)
├─> Reads images.yaml ✅
├─> Parses local Dockerfile ✅
├─> Discovers nginx:alpine ✅
├─> Creates images/nginx.yaml ✅
├─> Creates base-images/nginx-alpine.yaml ✅
├─> CDK8s generates Warehouses for nginx + nginx:alpine ✅
└─> Done! (Kargo Analysis not needed)
Data Flow: User Enrolls a New Image
Step-by-Step Flow
1. User Action (in `image-factory-state` repo):

   ```yaml
   # Edit images.yaml
   - name: my-app
     registry: ghcr.io
     repository: myorg/my-app
     source:
       provider: github
       repo: myorg/my-app
       dockerfile: Dockerfile
       workflow: build.yml
     rebuildDelay: 7d
     autoRebuild: true
   ```

2. Commit and Push:

   ```bash
   git add images.yaml
   git commit -m "Add my-app to Image Factory"
   git push origin main
   ```
3. image-factory-state GitHub Workflow Triggers (`synth.yml`):
   - Workflow Location: `image-factory-state/.github/workflows/synth.yml`
   - Trigger: Push to `main` branch (after PR merge)
   - Steps (see “Analysis Responsibilities” section above for details):
     - Reads `images.yaml` enrollment
     - If Dockerfile accessible (local): Parses and discovers base images
     - If Dockerfile external: Skips parse, defers to Kargo Analysis
     - Generates state files
     - Runs CDK8s synthesis → Kargo manifests
     - Commits back to `image-factory-state`
   - For External Dockerfiles: Kargo Analysis Job will later:
     - Clone source repository
     - Parse Dockerfile
     - Discover base images
     - Update state files
     - Trigger another synth cycle
4. ArgoCD Detects Change:
   - Watching `image-factory-state/dist/`
   - Sees new manifests
   - Syncs to cluster
5. Kargo Resources Created:
   - Warehouse for `my-app` (monitors GHCR for new builds)
   - Warehouse for `node:20-bookworm-slim` (monitors Docker Hub)
   - Stage for analysis
   - Stage for rebuild triggers
6. Ongoing Monitoring:
   - When `node:20-bookworm-slim` updates, Kargo triggers a rebuild
   - When `my-app` is built, Kargo runs analysis
   - Full automation from this point forward
User touched ONE file: images.yaml ✨
Key Design Principles
1. Separation of Code and Config
- Code (image-factory): Tools, synthesis logic, tests - can be updated independently
- Config (image-factory-state): Production images, state - changes trigger synthesis
- Platform (k8s-lab): Base components (ArgoCD, Kargo) and ArgoCD app definitions
- UI (backstage): Developer portal integration
- Specs (workspace-root): `.ai/projects/infrastructure/image-factory/` - architectural documentation
2. Public vs Private
- Public (image-factory, image-factory-exemplar): Code and examples anyone can use
- Private (image-factory-state): Production configuration
3. GitOps All the Way
- All changes via Git commits
- Workflows automate synthesis
- ArgoCD automates deployment
- Full audit trail in Git history
4. Optimal User Experience
User Workflow:
1. Edit `images.yaml`
2. Commit and push
3. ✨ Everything else automated ✨
What Gets Automated:
- State file generation
- Dockerfile parsing
- Base image discovery
- Kargo manifest generation
- ArgoCD deployment
- Image monitoring
- Rebuild triggering
Cross-Repository Operations
Adding a New Image
Repos Touched:
1. `image-factory-state` - User edits `images.yaml`
2. `image-factory` - Workflow runs synthesis
3. `image-factory-state` - Auto-commit of generated files
4. `k8s-lab` - ArgoCD syncs from state repo
Automation Level: Fully automated after user edits images.yaml
Updating Image Factory Code
Repos Touched:
1. `image-factory` - Developer updates CDK8s code or tools
2. `image-factory-state` - Workflow re-synthesizes with new code
3. `k8s-lab` - ArgoCD syncs updated manifests
Automation Level: Fully automated after code merge
Changing Base Infrastructure
Repos Touched:
1. `k8s-lab` - Update Kargo or ArgoCD versions
2. `image-factory` - May need CRD import updates
3. `image-factory-state` - Re-synthesis may be triggered
Automation Level: Partially automated, may require manual intervention
Repository Ownership
| Repository | Owner | Primary Users | Update Frequency |
|---|---|---|---|
| image-factory | Platform Team | Developers (PRs) | Weekly |
| image-factory-state | App Teams | All teams | Daily |
| image-factory-exemplar | Platform Team | External users | Monthly |
| backstage | Platform Team | UI developers | Weekly |
| k8s-lab | Infrastructure Team | Infrastructure team | Monthly |
| argocd-eda | Platform Team | Platform team | Weekly |
Related Documentation
- Workflow Guide: `.ai/steering/image-factory-workflow.md` - End-to-end workflow
- Design Specs: `.ai/projects/infrastructure/image-factory/DESIGN.md` - Technical design
- This Document: `.ai/projects/infrastructure/image-factory/REPOSITORY-STRUCTURE.md` - Repository organization
- User Tutorial: `repos/image-factory-exemplar/README.md` - Getting started guide
- Code Docs: `repos/image-factory/README.md` - Developer documentation
- Legacy Specs: `repos/argocd-eda/.kiro/specs/image-factory/` - Historical reference (being migrated)