Design Document: CascadeGuard CLI v2
Overview
Bring the CascadeGuard CLI (v0.1 pre-release) in line with the getting-started docs. Five changes: add a cg alias; rename the ci command group to build; add a cg init scaffolding command; unify images check so that it absorbs check-upstream and generate into a single pass; and remove the now-redundant check-upstream-tags.yaml workflow from open-secure-images.
Main Algorithm/Workflow
sequenceDiagram
    participant U as User
    participant CLI as cg CLI
    participant Seed as cascadeguard-seed repo
    participant FS as Filesystem
    participant Reg as Container Registry

    Note over U,CLI: cg init
    U->>CLI: cg init
    CLI->>Seed: clone/copy seed repo
    Seed-->>CLI: seed files
    CLI->>FS: scaffold files (skip existing)
    FS-->>U: .cascadeguard.yaml, images.yaml, workflows, etc.

    Note over U,CLI: cg images check (unified)
    U->>CLI: cg images check
    CLI->>FS: load images.yaml + .cascadeguard.yaml defaults
    loop each enabled image
        CLI->>FS: parse Dockerfile (local or remote clone)
        CLI->>FS: write/update .cascadeguard/images/{name}.yaml
        CLI->>FS: write/update .cascadeguard/base-images/{ref}.yaml
    end
    loop each discovered base image
        CLI->>Reg: HEAD manifest (digest check)
        Reg-->>CLI: Docker-Content-Digest
        CLI->>Reg: GET tags (upstream tag check)
        Reg-->>CLI: tag list
    end
    CLI-->>U: results (table or JSON)

    Note over U,CLI: cg build generate (renamed from ci)
    U->>CLI: cg build generate
    CLI->>FS: read images.yaml
    CLI->>FS: emit .github/workflows/*.yaml
Core Interfaces/Types
# --- pyproject.toml entry points ---
# [project.scripts]
# cascadeguard = "app:main"
# cg = "app:main" # NEW alias
# --- init command types ---
@dataclass
class InitOptions:
    seed_repo: str = "https://github.com/cascadeguard/cascadeguard-seed.git"
    target_dir: Path = Path(".")
    branch: str = "main"

SEED_FILES: list[str] = [
    ".cascadeguard.yaml",
    "images.yaml",
    ".github/workflows/check.yaml",
    ".github/workflows/ci.yaml",
    ".github/workflows/build-image.yaml",
    ".cascadeguard/actions-policy.yaml",
    ".gitignore",  # append .cascadeguard/.cache/
    "images/",  # example image directory
]
# --- unified check result types ---
@dataclass
class BaseImageCheckResult:
    name: str
    status: str  # "ok" | "drift" | "new" | "error" | "skipped"
    recorded_digest: str | None = None
    live_digest: str | None = None
    new_upstream_tags: list[str] | None = None
    reason: str | None = None

@dataclass
class CheckResults:
    image_results: list[BaseImageCheckResult]
    has_drift: bool
    has_new_tags: bool

Key Functions with Formal Specifications
Function 1: cmd_init(args) -> int
def cmd_init(args) -> int:
    """Scaffold current directory from cascadeguard-seed."""
Preconditions:
- args.target_dir is a valid writable directory (defaults to .)
- Network access available to clone seed repo (or local seed path exists)
Postconditions:
- All seed files exist in target directory
- No pre-existing files were overwritten (skip with warning)
- .gitignore has .cascadeguard/.cache/ entry (appended if file exists, created if not)
- Returns 0 on success, 1 on fatal error
Loop Invariants:
- For each seed file: if target / file exists, skip; otherwise copy from seed
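The skip-or-copy invariant above can be sketched without any network access by treating the seed as a local directory. This is a minimal illustration, not the actual cmd_init: the helper name scaffold and the throwaway directories are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def scaffold(seed_dir: Path, target: Path) -> tuple[int, int]:
    """Copy every file under seed_dir into target, never overwriting.

    Returns (copied, skipped). Hypothetical helper, not the real cmd_init.
    """
    copied = skipped = 0
    for src in sorted(p for p in seed_dir.rglob("*") if p.is_file()):
        rel_path = src.relative_to(seed_dir)
        dest = target / rel_path
        if dest.exists():
            skipped += 1  # pre-existing file keeps its content
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied += 1
    return copied, skipped


# Demo with throwaway directories standing in for the seed repo and target.
with tempfile.TemporaryDirectory() as tmp:
    seed, target = Path(tmp, "seed"), Path(tmp, "target")
    (seed / ".github" / "workflows").mkdir(parents=True)
    (seed / "images.yaml").write_text("[]\n")
    (seed / ".github" / "workflows" / "check.yaml").write_text("name: check\n")
    target.mkdir()
    (target / "images.yaml").write_text("- name: mine\n")  # user's own file

    result = scaffold(seed, target)
    kept = (target / "images.yaml").read_text()
    second_run = scaffold(seed, target)
```

Running the helper a second time copies nothing and skips everything, which is the behaviour the postconditions ask for when cg init is re-run in an already scaffolded repo.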
Function 2: cmd_check(args) -> int (unified)
def cmd_check(args) -> int:
    """Unified check: generate state, discover bases, check drift, check upstream tags."""
Preconditions:
- images.yaml exists and is a valid YAML list
- .cascadeguard.yaml may or may not exist (defaults apply)
- args.state_dir defaults to .cascadeguard
Postconditions:
- .cascadeguard/images/{name}.yaml written/updated for each enabled image
- .cascadeguard/base-images/{ref}.yaml written/updated for each discovered base
- Registry queried for digest drift on all base images
- Docker Hub queried for new upstream tags on all enrolled images
- Returns 1 if drift detected OR new upstream tags found, 0 otherwise
- Output format controlled by --format (table | json)
Loop Invariants:
- After processing image i: all state files for images 0..i are up to date
- all_base_refs accumulates all unique base image references seen so far
Function 3: cmd_build_generate(args) -> int (renamed from cmd_ci_generate)
def cmd_build_generate(args) -> int:
    """Generate CI pipeline files from images.yaml. Renamed from ci generate."""
Preconditions:
- images.yaml exists at args.images_yaml
- Output directory is writable
Postconditions:
- GitHub Actions workflow files written to {output_dir}/.github/workflows/
- Identical behavior to current cmd_ci_generate
- Returns 0
Loop Invariants: N/A
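As a rough illustration of the renamed command group, the sketch below wires a build group with a generate subcommand through argparse. The handler body is a stand-in, the help text is invented, and the sketch assumes the old ci group is removed outright rather than kept as a deprecated alias.

```python
import argparse


def cmd_build_generate(args: argparse.Namespace) -> int:
    """Stand-in for the real generator; only reports what it would do."""
    mode = "dry-run" if args.dry_run else "write"
    print(f"build generate ({mode})")
    return 0


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cg")
    sub = parser.add_subparsers(dest="command", required=True)
    build = sub.add_parser("build", help="Generate build pipelines")
    build_sub = build.add_subparsers(dest="build_command", required=True)
    generate = build_sub.add_parser("generate")
    generate.add_argument("--dry-run", action="store_true")
    return parser


def main(argv: list[str]) -> int:
    args = build_parser().parse_args(argv)
    # Dispatch on (group, subcommand); only the build group is shown here.
    handlers = {("build", "generate"): cmd_build_generate}
    return handlers[(args.command, args.build_command)](args)
```

With this wiring, main(["build", "generate", "--dry-run"]) reaches the handler, while main(["ci", "generate"]) is rejected by argparse as an invalid choice.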
Algorithmic Pseudocode
cg init Algorithm
def cmd_init(args) -> int:
    target = Path(args.target_dir).resolve()
    seed_dir = clone_seed_repo(SEED_REPO_URL, branch="main", cache=tempdir())
    skipped, copied = 0, 0
    for rel_path in walk_seed_files(seed_dir):
        dest = target / rel_path
        if dest.exists():
            print(f"  skip (exists): {rel_path}", file=sys.stderr)
            skipped += 1
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(seed_dir / rel_path, dest)
        copied += 1
    # Ensure .gitignore has cache entry
    gitignore = target / ".gitignore"
    cache_entry = ".cascadeguard/.cache/"
    if gitignore.exists():
        content = gitignore.read_text()
        if cache_entry not in content:
            with open(gitignore, "a") as f:
                f.write(f"\n{cache_entry}\n")
        # (if .gitignore was copied from seed, it should already have it)
    else:
        # Seed provided no .gitignore: create one so the postcondition holds
        gitignore.write_text(f"{cache_entry}\n")
    print(f"Initialised: {copied} files copied, {skipped} skipped (already exist)")
    return 0

Unified images check Algorithm
def cmd_check(args) -> int:
    images_yaml = Path(args.images_yaml)
    images = yaml.safe_load(images_yaml.read_text()) or []
    config = load_config(images_yaml.parent)
    resolved = merge_defaults(images, config)
    state_dir = Path(args.state_dir)
    images_dir = state_dir / "images"
    base_images_dir = state_dir / "base-images"
    cache_dir = state_dir / ".cache"
    for d in (images_dir, base_images_dir, cache_dir):
        d.mkdir(parents=True, exist_ok=True)
    image_filter = getattr(args, "image", None)
    fmt = getattr(args, "format", "table")
    # Phase 1: Discover base images from Dockerfiles, write image state
    all_base_refs: dict[str, str] = {}  # norm_name -> full_ref
    for image in resolved:
        name = image.get("name")
        if not name or not image.get("enabled", True):
            continue
        if image_filter and name != image_filter:
            continue
        base_images = discover_base_images(image, images_yaml.parent, cache_dir)
        for norm, ref in base_images:
            all_base_refs[norm] = ref
        write_image_state(images_dir / f"{name}.yaml", name, image, base_images)
    # Phase 2: Write base image state files
    for norm_name, full_ref in all_base_refs.items():
        write_or_update_base_image_state(base_images_dir / f"{norm_name}.yaml", full_ref)
    # Phase 3: Check registries for digest drift (existing logic)
    results = check_digest_drift(base_images_dir, image_filter)
    # Phase 4: Check upstream tags (absorbed from check-upstream)
    upstream_findings = check_upstream_tags(resolved, image_filter)
    for finding in upstream_findings:
        results.append({
            "image": finding["image"],
            "status": "new-tags",
            "new_tags": finding["new_tags"],
        })
    has_drift = any(r["status"] == "drift" for r in results)
    has_new_tags = any(r["status"] == "new-tags" for r in results)
    output_results(results, fmt)
    return 1 if (has_drift or has_new_tags) else 0

build command group (rename from ci)
# In build_parser():
# Replace:
# ci = sub.add_parser("ci", ...)
# ci_sub = ci.add_subparsers(dest="ci_command", ...)
# With:
# build = sub.add_parser("build", ...)
# build_sub = build.add_subparsers(dest="build_command", ...)
# In main() dispatch:
# Replace: "ci": cmd_ci
# With: "build": cmd_build
def cmd_build(args) -> int:
    """Dispatch 'build' subcommands."""
    return {"generate": cmd_build_generate}[args.build_command](args)

Example Usage
# Install — both entry points work identically
cascadeguard images validate
cg images validate
# Scaffold a new state repo
mkdir my-images && cd my-images && git init
cg init
# → copies seed files, skips anything that already exists
# Edit images.yaml, then run unified check
cg images check
# → discovers base images from Dockerfiles
# → writes .cascadeguard/images/*.yaml and .cascadeguard/base-images/*.yaml
# → queries registries for digest drift
# → queries Docker Hub for new upstream tags
# → prints table of results
cg images check --format json
# → same but JSON output (for CI consumption)
# Generate CI pipelines (renamed from "ci generate")
cg build generate
cg build generate --dry-run
# Other commands unchanged
cg images validate
cg images enrol --name my-app --registry ghcr.io --repository org/my-app
cg images status
cg pipeline run
cg vuln report --image alpine --dir reports/
cg actions pin
cg scan

Correctness Properties
A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.
Property 1: cg and cascadeguard alias equivalence
For any valid CLI subcommand and argument combination, invoking via cg and invoking via cascadeguard shall produce identical output and identical exit codes.
Validates: Requirement 1.2
Property 2: Init copies new files and skips existing files
For any set of seed files and for any subset of those files that already exist in the target directory, after running cg init: every seed file that did not previously exist is now present in the target directory with content matching the seed, and every file that previously existed retains its original content unchanged.
Validates: Requirements 2.1, 2.2
Property 3: Init .gitignore idempotence
For any target directory state (no .gitignore, .gitignore without cache entry, .gitignore with cache entry), after running cg init, the .gitignore file contains exactly one .cascadeguard/.cache/ entry and any pre-existing content is preserved.
Validates: Requirements 2.3, 2.4, 2.5
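The three starting states named in Property 3 can be exercised with a small helper. This is a sketch under assumptions: ensure_cache_entry is a hypothetical name, and it compares whole lines rather than substrings. It does not deduplicate a file that already lists the entry more than once.

```python
import tempfile
from pathlib import Path

CACHE_ENTRY = ".cascadeguard/.cache/"


def ensure_cache_entry(gitignore: Path) -> None:
    """Make sure gitignore lists CACHE_ENTRY, keeping existing lines."""
    content = gitignore.read_text() if gitignore.exists() else ""
    if CACHE_ENTRY in (line.strip() for line in content.splitlines()):
        return  # already present: leave the file untouched
    if content and not content.endswith("\n"):
        content += "\n"  # avoid gluing the entry onto the last line
    gitignore.write_text(content + CACHE_ENTRY + "\n")


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".gitignore"
    ensure_cache_entry(path)            # state 1: no file
    created = path.read_text()
    path.write_text("node_modules/")    # state 2: file without the entry
    ensure_cache_entry(path)
    appended = path.read_text()
    ensure_cache_entry(path)            # state 3: entry already there
    unchanged = path.read_text()
```

Calling the helper repeatedly leaves the file byte-for-byte identical after the first successful call, which is the idempotence the property describes.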
Property 4: Init summary counts match file disposition
For any set of seed files and for any subset of pre-existing files in the target directory, the summary printed by cg init reports copied = total_seed_files - pre_existing_count and skipped = pre_existing_count.
Validates: Requirement 2.6
Property 5: build generate produces identical output to former ci generate
For any valid images.yaml configuration, running cg build generate produces the same set of workflow files with the same content as the former cg ci generate command.
Validates: Requirement 3.2
Property 6: Dry-run writes no files
For any valid images.yaml configuration, running cg build generate --dry-run writes zero files to the output directory.
Validates: Requirement 3.3
Property 7: Dockerfile base image extraction
For any valid Dockerfile containing one or more FROM statements (excluding scratch and build-stage aliases), the Check_Command parser extracts all base image references in order.
Validates: Requirement 4.2
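One way to read Property 7 is the line-based extractor below. It is an illustrative sketch, not the Check_Command parser: the function name and regular expression are assumptions, and it does not resolve ARG variables or line continuations.

```python
import re

# FROM [--flag=value ...] <image> [AS <name>]
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(?P<image>\S+)(?:\s+AS\s+(?P<alias>\S+))?\s*$",
    re.IGNORECASE,
)


def extract_base_images(dockerfile_text: str) -> list[str]:
    """Return external base image references in order of appearance."""
    aliases: set[str] = set()
    bases: list[str] = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group("image")
        # Skip the empty base and references to earlier build stages.
        if image.lower() != "scratch" and image.lower() not in aliases:
            bases.append(image)
        if match.group("alias"):
            aliases.add(match.group("alias").lower())
    return bases


EXAMPLE = """\
FROM --platform=linux/amd64 golang:1.22 AS builder
RUN go build -o /app .
FROM builder AS tester
FROM scratch AS empty
FROM alpine:3.20
COPY --from=builder /app /app
"""

bases = extract_base_images(EXAMPLE)
```

For the example Dockerfile, only golang:1.22 and alpine:3.20 are returned; the builder stage reference and scratch are filtered out.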
Property 8: State file creation matches discovered images
For any images.yaml listing enabled images with Dockerfiles, after running cg images check, there exists a state file under .cascadeguard/images/{name}.yaml for each enabled image and a state file under .cascadeguard/base-images/{ref}.yaml for each unique base image reference discovered.
Validates: Requirement 4.3
Property 9: Drift detection correctness
For any base image where the recorded digest differs from the live registry digest, the Check_Command reports that image with status drift. For any base image where the digests match, the Check_Command does not report drift for that image.
Validates: Requirement 4.5
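The comparison behind Property 9 is small enough to state directly. In the sketch below only the ok and drift branches come from the property; mapping a missing live digest to error and a missing recorded digest to new is an assumption based on the status values listed for BaseImageCheckResult.

```python
from typing import Optional


def classify(recorded_digest: Optional[str], live_digest: Optional[str]) -> str:
    """Map a recorded/live digest pair to a status string."""
    if live_digest is None:
        return "error"  # assumption: the registry lookup failed
    if recorded_digest is None:
        return "new"  # assumption: nothing recorded yet, nothing to compare
    return "ok" if recorded_digest == live_digest else "drift"
```

For instance, classify("sha256:aaa", "sha256:bbb") yields drift, while equal digests yield ok.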
Property 10: Exit code reflects drift and upstream tag status
For any set of check results, the Check_Command returns exit code 1 if and only if at least one result has status drift or new-tags. Otherwise it returns exit code 0.
Validates: Requirements 4.7, 4.8
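Property 10 reduces to a single predicate over the result list. The sketch below assumes results are dictionaries with a status key, as in the pseudocode; the function name exit_code is hypothetical.

```python
def exit_code(results: list[dict]) -> int:
    """Return 1 if any result reports drift or new upstream tags, else 0."""
    actionable = {"drift", "new-tags"}
    return 1 if any(r.get("status") in actionable for r in results) else 0
```

Read literally, the property means that results with status error or skipped do not change the exit code on their own.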
Property 11: Output format completeness
For any non-empty set of check results, both the table and json output formats include the image name, status, and relevant details for every result entry. The json format additionally produces output that is valid JSON parseable by a standard JSON parser.
Validates: Requirements 7.1, 7.2
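A minimal renderer that would satisfy Property 11 might look like the following. The column layout and the render name are assumptions; the image, status and new_tags keys follow the pseudocode, and reason follows the BaseImageCheckResult field.

```python
import json


def render(results: list[dict], fmt: str = "table") -> str:
    """Render check results as JSON or as a fixed-width text table."""
    if fmt == "json":
        return json.dumps(results, indent=2, sort_keys=True)
    rows = [("IMAGE", "STATUS", "DETAILS")]
    for r in results:
        details = ", ".join(r.get("new_tags", [])) or r.get("reason", "")
        rows.append((r["image"], r["status"], details))
    # Pad every column to its widest cell so the table lines up.
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


sample = [
    {"image": "alpine", "status": "drift", "reason": "digest changed"},
    {"image": "nginx", "status": "new-tags", "new_tags": ["1.27.1", "1.27.2"]},
]
table = render(sample)
as_json = render(sample, "json")
```

Both formats carry one entry per result, and the JSON form round-trips through json.loads back to the original list.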
Property 12: Image filter scoping
For any images.yaml containing multiple images, when --image <name> is provided, the Check_Command output contains results only for the named image and no others.
Validates: Requirement 4.11