Design Document: CascadeGuard CLI Improvements
Overview
CascadeGuard’s CLI currently requires every image in images.yaml to carry its own registry and repository fields, and lacks native CLI commands for state file generation and CI pipeline generation (both currently only accessible via Taskfile wrappers around standalone Python scripts). This feature introduces three interconnected improvements:
- Config inheritance via .cascadeguard.yaml: repo-level defaults for registry, repository, and local.dir that individual images inherit unless they override them. This eliminates repetition and fixes the 25 validation failures in cascadeguard-open-secure-images.
- New CLI commands: cascadeguard images generate and cascadeguard ci generate, which call the existing generate_state.py and generate_ci.py logic directly from the CLI, removing the need for Taskfile/Docker indirection.
- Validation fix: cascadeguard images validate is updated to resolve inherited defaults before checking required fields, so images that rely on repo-level defaults pass validation.
Architecture
graph TD
    subgraph ".cascadeguard.yaml"
        CG[Config File]
    end
    subgraph "images.yaml"
        IMG[Image Entries]
    end
    CG --> LOAD[load_config]
    IMG --> LOAD
    LOAD --> MERGE[merge_defaults]
    MERGE --> VAL[images validate]
    MERGE --> GEN[images generate]
    MERGE --> CI[ci generate]
    GEN --> STATE[state files<br/>base-images/ + images/]
    CI --> WF[.github/workflows/]
Config Resolution Flow
sequenceDiagram
    participant CLI as CLI Command
    participant CFG as ConfigLoader
    participant YAML as .cascadeguard.yaml
    participant IMG as images.yaml
    participant MERGE as merge_defaults()
    CLI->>CFG: load_config(repo_root)
    CFG->>YAML: read file
    YAML-->>CFG: {defaults: {registry, repository, local: {dir}}, ci: {platform}}
    CLI->>IMG: load images.yaml
    IMG-->>CLI: list of image dicts
    CLI->>MERGE: merge_defaults(images, config)
    MERGE-->>CLI: resolved images (each has registry, repository, etc.)
    CLI->>CLI: proceed with validate / generate / ci generate
Components and Interfaces
Component 1: ConfigLoader
Purpose: Loads and validates .cascadeguard.yaml, providing repo-level defaults.
Interface:
def load_config(repo_root: Path) -> dict:
    """Load .cascadeguard.yaml from repo_root. Returns {} if absent."""
    ...

def merge_defaults(images: list[dict], config: dict) -> list[dict]:
    """
    Return a new list of image dicts with repo-level defaults applied.
    Per-image fields take precedence over config defaults.
    Does NOT mutate the input list.
    """
    ...

Responsibilities:
- Parse .cascadeguard.yaml and return a typed config dict
- Apply defaults.registry, defaults.repository, defaults.local.dir to each image that lacks those fields
- Per-image values always override repo-level defaults (shallow merge per top-level key; for local, only a missing dir is filled in)
Component 2: Updated Validator (cmd_validate)
Purpose: Validates images.yaml after merging config defaults.
Responsibilities:
- Load config, load images, merge defaults, then validate
- For enabled images: require name, registry, image (or repository), dockerfile
- For disabled images (enabled: false): require only name
- Report clear errors showing whether a missing field could be fixed by adding it to .cascadeguard.yaml
Component 3: CLI Commands (images generate, ci generate)
Purpose: Expose generate_state.py and generate_ci.py functionality as first-class CLI subcommands.
Responsibilities:
- images generate: calls generate_state.generate_state_for_image() for each image, writing to --output-dir (default: the current working directory)
- ci generate: calls generate_ci.generate_ci() with the platform resolved from the CLI flag or config
- Both commands load config and merge defaults before processing
Data Models
.cascadeguard.yaml Schema (Extended)
# Repo-level defaults applied to every image in images.yaml
defaults:
  registry: ghcr.io/cascadeguard   # default registry for all images
  repository: cascadeguard          # default repository prefix (optional)
  local:
    dir: images                     # default local folder containing Dockerfiles

# CI configuration (existing)
ci:
  platform: github                  # github | gitlab (future)

# Tagging configuration (existing, unchanged)
tagging:
  stateRepo: true
  sourceRepo: false
  sourceRepoSecret: CROSS_REPO_PAT

# Python type representation
from typing import TypedDict

class LocalDefaults(TypedDict, total=False):
    dir: str  # e.g. "images"

class ConfigDefaults(TypedDict, total=False):
    registry: str    # e.g. "ghcr.io/cascadeguard"
    repository: str  # e.g. "cascadeguard"
    local: LocalDefaults

class CIConfig(TypedDict, total=False):
    platform: str  # "github" | "gitlab"

class CascadeGuardConfig(TypedDict, total=False):
    defaults: ConfigDefaults
    ci: CIConfig
    tagging: dict

Validation Rules:
- The defaults section is entirely optional
- Each field within defaults is optional
- defaults.registry must be a non-empty string if present
- defaults.local.dir must be a valid relative path if present
- Unknown keys are silently ignored (forward compatibility)
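The validation rules above could be checked with a small helper along these lines. This is only a sketch: the name validate_config, the error strings, and the reading of "valid relative path" as "not absolute and without .. segments" are assumptions made for illustration, not part of the existing codebase.

```python
from pathlib import PurePosixPath


def validate_config(config: object) -> list[str]:
    """Return a list of error strings; an empty list means the config is acceptable."""
    if not isinstance(config, dict):
        return ["config root must be a mapping"]

    errors: list[str] = []
    defaults = config.get("defaults")
    if defaults is None:
        return errors  # the defaults section is entirely optional
    if not isinstance(defaults, dict):
        return ["defaults must be a mapping"]

    # registry is optional, but must be a non-empty string when present
    if "registry" in defaults:
        registry = defaults["registry"]
        if not isinstance(registry, str) or not registry.strip():
            errors.append("defaults.registry must be a non-empty string")

    # local.dir is optional, but must be a relative path when present
    local = defaults.get("local")
    if isinstance(local, dict) and "dir" in local:
        local_dir = local["dir"]
        if not isinstance(local_dir, str) or not local_dir:
            errors.append("defaults.local.dir must be a non-empty relative path")
        else:
            path = PurePosixPath(local_dir)
            # Assumption: "valid" means not absolute and no ".." segments
            if path.is_absolute() or ".." in path.parts:
                errors.append("defaults.local.dir must be a relative path inside the repo")

    # Unknown keys are deliberately not inspected (forward compatibility)
    return errors
```

Returning a list of messages rather than raising lets a caller print every problem in one pass, which matches how cmd_validate accumulates errors.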
Image Entry (after merge)
An image entry after merge_defaults() has been applied. The merge fills in missing fields from config defaults:
# Before merge (in images.yaml):
{"name": "nginx", "dockerfile": "images/nginx/Dockerfile", "image": "nginx", "tag": "stable-alpine-slim"}

# After merge (with defaults.registry = "ghcr.io/cascadeguard"):
{"name": "nginx", "dockerfile": "images/nginx/Dockerfile", "image": "nginx", "tag": "stable-alpine-slim",
 "registry": "ghcr.io/cascadeguard"}

Key Functions with Formal Specifications
Function 1: merge_defaults()
def merge_defaults(images: list[dict], config: dict) -> list[dict]:
    """Apply repo-level defaults from config to each image."""

Preconditions:
- images is a list of dicts (may be empty)
- config is a dict (may be empty or lack the defaults key)

Postconditions:
- Returns a new list of the same length as images
- For each returned image r[i] and original images[i]:
  - If images[i] has a key, r[i] has the same value for that key
  - If images[i] lacks a key and config["defaults"] has it, r[i] gets the default
  - images[i] is not mutated
- Only these keys are inherited: registry, repository, local.dir

Loop Invariants:
- All previously processed images have defaults applied correctly
- Original images list is never modified
Function 2: cmd_validate() (updated)
def cmd_validate(args) -> int:
    """Validate images.yaml with config inheritance."""

Preconditions:
- args.images_yaml points to a readable file or a missing file (error case)
- The directory containing images.yaml may contain .cascadeguard.yaml (optional)

Postconditions:
- Returns 0 if all images pass validation after merging defaults
- Returns 1 if any validation errors exist, with errors printed to stderr
- Disabled images (enabled: false) only require name
- Enabled images require name, registry, and dockerfile
Function 3: cmd_images_generate()
def cmd_images_generate(args) -> int:
    """Generate state files from images.yaml."""

Preconditions:
- args.images_yaml points to a readable images.yaml
- args.output_dir is a writable directory path

Postconditions:
- State files created/updated in {output_dir}/base-images/ and {output_dir}/images/
- Returns 0 on success, 1 on failure
- Idempotent: re-running produces the same result
Function 4: cmd_ci_generate()
def cmd_ci_generate(args) -> int:
    """Generate CI pipeline files from images.yaml."""

Preconditions:
- args.images_yaml points to a readable images.yaml
- args.output_dir is a writable directory path

Postconditions:
- GitHub Actions workflow files created in {output_dir}/.github/workflows/
- Platform resolved from: CLI flag > .cascadeguard.yaml > default ("github")
- Returns 0 on success, 1 on failure
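The platform precedence (CLI flag, then .cascadeguard.yaml, then the "github" default) can be sketched as a small pure function. The name resolve_platform is hypothetical and used only for illustration; whether the real resolution lives in the CLI handler or inside generate_ci is not specified here.

```python
from typing import Optional

DEFAULT_PLATFORM = "github"


def resolve_platform(cli_platform: Optional[str], config: dict) -> str:
    """Pick the CI platform: CLI flag first, then config, then the default."""
    # "ci" may be missing or explicitly null in the YAML, so fall back to {}
    ci_section = config.get("ci") or {}
    return cli_platform or ci_section.get("platform") or DEFAULT_PLATFORM


# The flag wins over the config file, and the config wins over the default
print(resolve_platform("gitlab", {"ci": {"platform": "github"}}))
print(resolve_platform(None, {"ci": {"platform": "gitlab"}}))
print(resolve_platform(None, {}))
```

Keeping the resolution in one function makes the precedence order easy to unit test without touching the file system.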
Algorithmic Pseudocode
Config Loading and Merging Algorithm
from pathlib import Path

import yaml


def load_config(repo_root: Path) -> dict:
    """Load .cascadeguard.yaml from repo_root. Returns {} if absent or empty."""
    config_path = repo_root / ".cascadeguard.yaml"
    if not config_path.exists():
        return {}
    with open(config_path) as f:
        return yaml.safe_load(f) or {}


def merge_defaults(images: list[dict], config: dict) -> list[dict]:
    """
    Apply repo-level defaults to each image.
    Merge strategy: shallow per-key. Image-level values always win.
    Only specific keys are inherited from defaults.
    """
    # "or {}" also covers a key that is present but null in the YAML
    defaults = config.get("defaults") or {}
    if not defaults:
        return [dict(img) for img in images]  # shallow copy, no defaults to apply

    default_registry = defaults.get("registry")
    default_repository = defaults.get("repository")
    default_local_dir = (defaults.get("local") or {}).get("dir")

    result = []
    for img in images:
        merged = dict(img)  # shallow copy
        # Apply registry default
        if "registry" not in merged and default_registry:
            merged["registry"] = default_registry
        # Apply repository default
        if "repository" not in merged and default_repository:
            merged["repository"] = default_repository
        # Apply local.dir default (nested merge)
        if default_local_dir:
            img_local = merged.get("local") or {}
            if "dir" not in img_local:
                # Build a new nested dict so the original image is not mutated
                merged["local"] = {**img_local, "dir": default_local_dir}
        result.append(merged)
    return result

Updated Validation Algorithm
import sys


def cmd_validate(args) -> int:
    images_yaml = Path(args.images_yaml)
    if not images_yaml.exists():
        print(f"Error: images.yaml not found: {images_yaml}", file=sys.stderr)
        return 1

    with open(images_yaml) as f:
        images = yaml.safe_load(f) or []
    if not isinstance(images, list):
        print("Error: images.yaml must be a list", file=sys.stderr)
        return 1

    # Load config and merge defaults BEFORE validation
    repo_root = images_yaml.parent
    config = load_config(repo_root)
    resolved_images = merge_defaults(images, config)

    errors = []
    for i, image in enumerate(resolved_images):
        name = image.get("name")
        if not name:
            errors.append(f"Image {i}: missing 'name' field")
            continue
        # Disabled images only need a name
        if not image.get("enabled", True):
            continue
        # Enabled images need registry and dockerfile
        if not image.get("registry"):
            errors.append(f"Image '{name}': missing 'registry' (set in image or .cascadeguard.yaml defaults)")
        if not image.get("dockerfile"):
            errors.append(f"Image '{name}': missing 'dockerfile' field")

    if errors:
        print("Validation errors:", file=sys.stderr)
        for err in errors:
            print(f"  - {err}", file=sys.stderr)
        return 1

    print(f"Validated {len(resolved_images)} images in {images_yaml}")
    return 0

CLI Command Registration
# In build_parser(), add to images subcommands:
# images generate
images_generate = images_sub.add_parser(
    "generate", help="Generate state files from images.yaml"
)
# Needed by cmd_images_generate and used in the usage examples
images_generate.add_argument(
    "--images-yaml", default="images.yaml",
    help="Path to images.yaml (default: images.yaml)"
)
images_generate.add_argument(
    "--output-dir", default=".",
    help="Output directory (default: current directory)"
)
images_generate.add_argument(
    "--cache-dir", default=None,
    help="Cache directory for cloned repos"
)

# In build_parser(), add new top-level 'ci' command:
ci = sub.add_parser("ci", help="CI/CD pipeline generation")
ci_sub = ci.add_subparsers(dest="ci_command", metavar="subcommand")
ci_sub.required = True
ci_generate = ci_sub.add_parser(
    "generate", help="Generate CI pipeline files from images.yaml"
)
ci_generate.add_argument(
    "--images-yaml", default="images.yaml",
    help="Path to images.yaml (default: images.yaml)"
)
ci_generate.add_argument(
    "--output-dir", default=".",
    help="Output directory (default: current directory)"
)
ci_generate.add_argument(
    "--platform", default=None,
    help="CI platform (github). Overrides .cascadeguard.yaml"
)
ci_generate.add_argument(
    "--dry-run", action="store_true",
    help="Preview without writing files"
)

Command Handler Implementations
def cmd_images_generate(args) -> int:
    """Generate state files from images.yaml."""
    from generate_state import (
        load_images_yaml, load_config, generate_state_for_image,
        _generate_build_workflow,
    )

    images_yaml = Path(args.images_yaml)
    output_dir = Path(args.output_dir)
    if not images_yaml.exists():
        print(f"Error: images.yaml not found: {images_yaml}", file=sys.stderr)
        return 1

    cache_dir = Path(args.cache_dir) if args.cache_dir else output_dir / ".cascadeguard-cache"
    cache_dir.mkdir(parents=True, exist_ok=True)

    config = load_config(output_dir)
    # Resolve repo-level defaults before generating (see Component 3)
    images = merge_defaults(load_images_yaml(images_yaml), config)
    print(f"Found {len(images)} images in {images_yaml}")

    success = 0
    workflows = 0
    for image in images:
        if generate_state_for_image(image, output_dir, cache_dir):
            success += 1
        if _generate_build_workflow(image, output_dir, config):
            workflows += 1

    print(f"\nGenerated state for {success}/{len(images)} images, {workflows} workflows")
    return 0


def cmd_ci_generate(args) -> int:
    """Generate CI pipeline files from images.yaml."""
    from generate_ci import generate_ci

    images_yaml = Path(args.images_yaml)
    output_dir = Path(args.output_dir)
    if not images_yaml.exists():
        print(f"Error: images.yaml not found: {images_yaml}", file=sys.stderr)
        return 1

    # Platform precedence: CLI flag > .cascadeguard.yaml > default ("github")
    config = load_config(images_yaml.parent)
    platform = args.platform or (config.get("ci") or {}).get("platform") or "github"

    generate_ci(
        images_yaml_path=images_yaml,
        output_dir=output_dir,
        dry_run=args.dry_run,
        platform=platform,
    )
    return 0

Example Usage
# 1. .cascadeguard.yaml with defaults
# ─────────────────────────────────────
# defaults:
#   registry: ghcr.io/cascadeguard
#   local:
#     dir: images
# ci:
#   platform: github

# 2. images.yaml (no registry needed per-image)
# ──────────────────────────────────────────────
# - name: nginx
#   dockerfile: images/nginx/Dockerfile
#   image: nginx
#   tag: stable-alpine-slim
#
# - name: memcached
#   enabled: false
#   namespace: library

# 3. CLI usage
# ────────────
# Validate (now passes with config inheritance):
#   cascadeguard images validate --images-yaml images.yaml
#
# Generate state files:
#   cascadeguard images generate --images-yaml images.yaml --output-dir .
#
# Generate CI pipelines:
#   cascadeguard ci generate --images-yaml images.yaml --output-dir .
#
# Generate CI with explicit platform:
#   cascadeguard ci generate --platform github --dry-run

Correctness Properties
A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.
Property 1: Default inheritance
For any image entry and for any inheritable key (registry, repository, local.dir), if the image lacks that key and the config defaults provide it, then the merged image has the config default value for that key.
Validates: Requirements 2.1, 2.2, 2.3
Property 2: Override precedence
For any image entry and for any inheritable key (registry, repository, local.dir), if the image already has that key set, then the merged image retains the image’s original value regardless of the config default.
Validates: Requirements 2.4, 2.5, 2.6
Property 3: Non-mutation
For any list of image entries and for any config, calling merge_defaults does not modify the original image list or any of its contained dictionaries.
Validates: Requirement 2.7
Property 4: Length preservation
For any list of image entries and for any config, merge_defaults returns a list of the same length as the input.
Validates: Requirement 2.8
Property 5: No-defaults backward compatibility
For any list of image entries and a config with no defaults section (or an empty one), merge_defaults returns copies equivalent to the originals with no fields added or removed.
Validates: Requirements 2.9, 7.1
Property 6: Disabled image leniency
For any image entry with enabled: false that has a name field, validation passes regardless of which other fields are missing.
Validates: Requirement 3.4
Property 7: Missing name always fails validation
For any image entry (enabled or disabled) that lacks a name field, validation reports an error.
Validates: Requirement 3.5
Property 8: Validation correctness for enabled images
For any enabled image entry, validation passes if and only if the image has name, registry, and dockerfile fields present after merging defaults.
Validates: Requirements 3.2, 3.3, 3.6, 3.7
Property 9: Generation idempotency
For any valid images.yaml and config, running images generate or ci generate twice with the same inputs produces identical output files.
Validates: Requirements 4.6, 5.9
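One way to check Property 9 in a test is to run a generator twice into the same directory and compare content hashes of every file. The sketch below assumes a generator can be wrapped as a callable taking an output directory; fake_generate and appending_generate are hypothetical stand-ins, not the real images generate or ci generate handlers.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def snapshot_tree(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def is_idempotent(generate: Callable[[Path], None]) -> bool:
    """Run generate twice into one directory and compare the two snapshots."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        generate(out)
        first = snapshot_tree(out)
        generate(out)
        second = snapshot_tree(out)
    return first == second


def fake_generate(output_dir: Path) -> None:
    """Hypothetical generator that always writes the same state file."""
    target = output_dir / "images" / "nginx.yaml"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("name: nginx\n")


def appending_generate(output_dir: Path) -> None:
    """Hypothetical counter-example: output grows on every run."""
    with open(output_dir / "log.txt", "a") as f:
        f.write("run\n")
```

Note that generated content containing timestamps or other run-specific values would fail this check, so such fields would need to be excluded or made deterministic.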
Error Handling
Error Scenario 1: Missing .cascadeguard.yaml
Condition: File does not exist in repo root
Response: load_config() returns {}, no defaults applied
Recovery: Validation proceeds with per-image fields only (backward compatible)
Error Scenario 2: Malformed .cascadeguard.yaml
Condition: YAML parse error or non-dict root
Response: Print error to stderr, return exit code 1
Recovery: User fixes YAML syntax
Error Scenario 3: Missing required fields after merge
Condition: An enabled image still lacks registry or dockerfile after defaults are applied
Response: Validation error message hints that the field can be set in .cascadeguard.yaml defaults
Recovery: User adds the field to either the image entry or config defaults
Error Scenario 4: images generate with unreachable source repo
Condition: Git clone fails for a source repo during state generation
Response: Warning printed, existing state preserved if available, generation continues for other images
Recovery: Existing behavior from generate_state.py — graceful degradation
Testing Strategy
Unit Testing Approach
- Test merge_defaults() with: empty config, partial defaults, full defaults, per-image overrides, disabled images
- Test updated cmd_validate() with: valid images + config, missing fields, disabled images, no config file
- Test CLI argument parsing for new images generate and ci generate subcommands
- Mock generate_state and generate_ci module calls in command handler tests
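The argument-parsing tests could look roughly like the following. This is a sketch under stated assumptions: build_parser here is a trimmed stand-in that registers only the two new subcommands, the dest names command and images_command are invented for illustration (only ci_command appears in this document), and placing --images-yaml on images generate is inferred from the usage examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Trimmed stand-in for the real build_parser(): new subcommands only."""
    parser = argparse.ArgumentParser(prog="cascadeguard")
    sub = parser.add_subparsers(dest="command", required=True)

    images = sub.add_parser("images")
    images_sub = images.add_subparsers(dest="images_command", required=True)
    images_generate = images_sub.add_parser("generate")
    images_generate.add_argument("--images-yaml", default="images.yaml")
    images_generate.add_argument("--output-dir", default=".")
    images_generate.add_argument("--cache-dir", default=None)

    ci = sub.add_parser("ci")
    ci_sub = ci.add_subparsers(dest="ci_command", required=True)
    ci_generate = ci_sub.add_parser("generate")
    ci_generate.add_argument("--images-yaml", default="images.yaml")
    ci_generate.add_argument("--output-dir", default=".")
    ci_generate.add_argument("--platform", default=None)
    ci_generate.add_argument("--dry-run", action="store_true")
    return parser


def test_ci_generate_flags() -> None:
    args = build_parser().parse_args(["ci", "generate", "--platform", "github", "--dry-run"])
    assert args.ci_command == "generate"
    assert args.platform == "github"
    assert args.dry_run is True
    assert args.images_yaml == "images.yaml"  # default applies when the flag is omitted


def test_images_generate_defaults() -> None:
    args = build_parser().parse_args(["images", "generate"])
    assert args.output_dir == "."
    assert args.cache_dir is None


test_ci_generate_flags()
test_images_generate_defaults()
```

argparse converts --dry-run and --images-yaml to the attribute names dry_run and images_yaml, which is what the command handlers read.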
Property-Based Testing Approach
Property Test Library: hypothesis
- Property: merge_defaults never removes keys that exist on the original image
- Property: merge_defaults output length equals input length
- Property: if config has no defaults, merge_defaults returns copies identical to originals
- Property: disabled images always pass validation regardless of missing fields
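As an illustration of the merge properties, the sketch below checks them over randomly generated inputs using only the standard library random module as a stand-in for hypothesis, so that it runs without extra dependencies. The merge_defaults shown is a compact copy of the algorithm from this document, and random_image and check_properties are hypothetical helper names.

```python
import copy
import random


def merge_defaults(images, config):
    """Compact copy of the merge algorithm, repeated here for self-containment."""
    defaults = config.get("defaults") or {}
    default_local_dir = (defaults.get("local") or {}).get("dir")
    result = []
    for img in images:
        merged = dict(img)
        for key in ("registry", "repository"):
            if key not in merged and defaults.get(key):
                merged[key] = defaults[key]
        img_local = merged.get("local") or {}
        if default_local_dir and "dir" not in img_local:
            merged["local"] = {**img_local, "dir": default_local_dir}
        result.append(merged)
    return result


def random_image(rng: random.Random) -> dict:
    """Build an image entry with a random subset of the inheritable keys."""
    image = {"name": f"img{rng.randint(0, 99)}"}
    if rng.random() < 0.5:
        image["registry"] = "docker.io/custom"
    if rng.random() < 0.5:
        image["repository"] = "custom"
    if rng.random() < 0.5:
        image["local"] = {"dir": "custom-dir"}
    if rng.random() < 0.3:
        image["enabled"] = False
    return image


def check_properties(trials: int = 200, seed: int = 0) -> int:
    rng = random.Random(seed)
    config = {"defaults": {"registry": "ghcr.io/cascadeguard",
                           "repository": "cascadeguard",
                           "local": {"dir": "images"}}}
    for _ in range(trials):
        images = [random_image(rng) for _ in range(rng.randint(0, 5))]
        before = copy.deepcopy(images)
        merged = merge_defaults(images, config)
        assert images == before              # non-mutation
        assert len(merged) == len(images)    # length preservation
        for original, result in zip(images, merged):
            for key, value in original.items():
                assert result[key] == value  # existing keys are never removed or changed
            # Missing keys receive the configured default
            assert result["registry"] == original.get("registry", "ghcr.io/cascadeguard")
            assert result["repository"] == original.get("repository", "cascadeguard")
            assert result["local"]["dir"] == original.get("local", {}).get("dir", "images")
        assert merge_defaults(images, {}) == images  # no defaults: equal copies
    return trials
```

With hypothesis, random_image would typically become a strategy and each assertion group its own @given test, which also gives shrinking of failing inputs.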
Integration Testing Approach
- End-to-end test: create temp directory with .cascadeguard.yaml + images.yaml, run cascadeguard images validate, assert exit code 0
- End-to-end test: run cascadeguard images generate, verify state files created
- End-to-end test: run cascadeguard ci generate, verify workflow files created
- Regression test: validate cascadeguard-open-secure-images repo with the new config defaults
Security Considerations
- .cascadeguard.yaml is a local config file read from the repo root; no remote fetching is involved
- Registry URLs in defaults are used as-is; no URL validation beyond non-empty string check (same as current behavior)
- generate_state.py already handles GitHub token securely via environment variables; no changes needed
Dependencies
- Existing: pyyaml, argparse, pathlib (all already in use)
- Existing modules: generate_state.py, generate_ci.py (imported by new CLI commands)
- No new external dependencies required