Plan: cascadeguard scan CLI Command & One-Shot Install Script

Status: Draft
Domain: cascadeguard.com

Overview

Add a cascadeguard scan subcommand that discovers container-related artifacts in a project directory (Dockerfiles, CI workflows, Compose files, Kubernetes manifests), presents an interactive selection UI, analyses the selected artifacts, and produces a structured report. Alongside the CLI command, ship a one-shot install script (install.sh) hosted at https://get.cascadeguard.com that bootstraps a temporary environment and runs the scan in a single curl | sh invocation.

The scan command reuses existing parsing capabilities already in app.py — specifically CascadeGuardTool.parse_dockerfile_base_images(), parse_image_reference(), and ActionsPinner._USES_RE for action reference detection — and extends them with new discoverer modules for Compose, Kubernetes, and GitLab CI artifacts.

No new Python dependencies are required for the core implementation (Phases 1–4). The only existing dependency is pyyaml, which already covers YAML parsing needs.

Architecture

graph TD
    CLI["cascadeguard scan<br/>CLI Entry Point"]
    CLI --> Discovery["Discovery Engine<br/>run_scan()"]
    Discovery --> DD["Dockerfile<br/>Discoverer"]
    Discovery --> AD["CI Actions<br/>Discoverer"]
    Discovery --> CD["Compose/Stack<br/>Discoverer"]
    Discovery --> KD["Kubernetes<br/>Discoverer"]
    Discovery --> GD["GitLab CI<br/>Discoverer<br/>(stretch)"]

    DD --> Artifacts["List[DiscoveredArtifact]"]
    AD --> Artifacts
    CD --> Artifacts
    KD --> Artifacts
    GD --> Artifacts

    Artifacts --> UI["Interactive Selection<br/>curses / fallback"]
    Artifacts -->|--non-interactive| Analysis

    UI --> Analysis["Analysis Engine<br/>report.py"]
    Analysis --> Output["Report Output<br/>text / json / yaml"]
    Output -->|--output FILE| File["File"]
    Output --> Stdout["stdout"]

    subgraph "Reused from app.py"
        Parse["parse_dockerfile_base_images()"]
        ImgRef["parse_image_reference()"]
        UsesRE["ActionsPinner._USES_RE"]
    end

    DD -.-> Parse
    DD -.-> ImgRef
    AD -.-> UsesRE

Scan Command Flow

sequenceDiagram
    participant User
    participant CLI as cascadeguard scan
    participant Disc as Discovery Engine
    participant UI as Interactive UI
    participant Anal as Analysis Engine
    participant Out as Report Output

    User->>CLI: cascadeguard scan [--dir .] [flags]
    CLI->>Disc: discover(root_dir)

    loop Each Discoverer
        Disc->>Disc: glob for matching files
        Disc->>Disc: parse metadata from matched files
        Disc-->>Disc: yield DiscoveredArtifact per match
    end

    Disc-->>CLI: List[DiscoveredArtifact]

    alt --non-interactive
        CLI->>Anal: analyse(all_artifacts)
    else interactive (default)
        CLI->>UI: present(artifacts, grouped by kind)
        User->>UI: toggle selections
        UI-->>CLI: selected artifacts
        CLI->>Anal: analyse(selected_artifacts)
    end

    Anal-->>Out: ScanResult
    alt --output FILE
        Out->>Out: write to FILE
    else default
        Out->>Out: print to stdout
    end

Data Models

DiscoveredArtifact

from dataclasses import dataclass

@dataclass
class DiscoveredArtifact:
    kind: str          # "dockerfile" | "actions" | "compose" | "helm" | "kustomize" | "k8s" | "gitlab-ci"
    path: str          # relative to scan root, e.g. "services/api/Dockerfile"
    details: dict      # parsed metadata, varies by kind

Details by kind:

| kind | details keys | example values |
|---|---|---|
| dockerfile | base_images, stages, args, env | ["python:3.11-slim", "node:20-alpine"], ["base", "builder", "final"] |
| actions | action_refs, workflow_name | ["actions/checkout@v4", "docker/build-push-action@v5"], "CI" |
| compose | services, image_refs | ["api", "db", "redis"], ["postgres:16", "redis:7-alpine"] |
| k8s | api_version, kind, image_refs, namespace | "apps/v1", "Deployment", ["nginx:1.25"], "default" |
| gitlab-ci | image_refs, stages | ["python:3.11"], ["build", "test", "deploy"] |

ScanResult

@dataclass
class ArtifactAnalysis:
    artifact: DiscoveredArtifact
    findings: list[str]           # human-readable observations
    recommendations: list[str]    # actionable suggestions
    risk_level: str               # "info" | "low" | "medium" | "high"

@dataclass
class ScanSummary:
    total_discovered: int
    total_selected: int
    total_images: int             # unique container image references
    total_actions: int            # unique action references
    by_kind: dict[str, int]       # count per artifact kind
    by_risk: dict[str, int]       # count per risk level

# ScanResult is defined last so that the ArtifactAnalysis and ScanSummary
# annotations resolve when the class body is evaluated.
@dataclass
class ScanResult:
    root_dir: str
    discovered: list[DiscoveredArtifact]
    selected: list[DiscoveredArtifact]
    analysis: list[ArtifactAnalysis]
    summary: ScanSummary

Discovery Modules

Each discoverer implements a common protocol:

from pathlib import Path
from typing import Protocol

class Discoverer(Protocol):
    def discover(self, root: Path) -> list[DiscoveredArtifact]: ...

All discoverers use pathlib.Path.rglob() for file matching — each module declares its glob patterns and rglob handles recursive traversal. A shared _excluded_dirs set (.git, node_modules, vendor, __pycache__, .venv, venv) is used to filter results, keeping the discovery code declarative and readable.
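The rglob-plus-filter approach described above could look roughly like the following. This is a minimal sketch, not the planned implementation: the helper name iter_matches and the constant EXCLUDED_DIRS are hypothetical, and the patterns are passed without the leading **/ because rglob already recurses.

```python
from pathlib import Path
from typing import Iterable, Iterator

# Assumed contents of the shared exclusion set described in the plan.
EXCLUDED_DIRS = {".git", "node_modules", "vendor", "__pycache__", ".venv", "venv"}


def iter_matches(root: Path, patterns: Iterable[str]) -> Iterator[Path]:
    """Yield root-relative files matching any pattern, skipping excluded dirs."""
    seen: set[Path] = set()
    for pattern in patterns:
        for path in sorted(root.rglob(pattern)):
            rel = path.relative_to(root)
            # Filter on results: drop any match that sits below an excluded directory.
            if any(part in EXCLUDED_DIRS for part in rel.parts[:-1]):
                continue
            # A file can match more than one pattern; report it once.
            if path.is_file() and rel not in seen:
                seen.add(rel)
                yield rel
```

One trade-off worth noting: filtering on results keeps the code declarative, but rglob still walks into node_modules and similar directories before their matches are discarded, so very large trees may be slow to scan.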

1. Dockerfile Discoverer

| Attribute | Value |
|---|---|
| Globs | **/Dockerfile, **/Dockerfile.*, **/*.dockerfile, **/Containerfile, **/Containerfile.* |
| Parser | Reuses CascadeGuardTool.parse_dockerfile_base_images() for FROM extraction. Additionally parses ARG and ENV directives, and identifies build stage names from FROM ... AS <name>. |
| Output kind | dockerfile |
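As an illustration of the extra parsing (ARG, ENV, stage names), here is a rough sketch. The existing parse_dockerfile_base_images() is not shown in this plan, so the sketch does its own FROM matching; the function name parse_dockerfile_details and the regexes are assumptions, and line continuations and multi-key ENV lines are deliberately ignored.

```python
import re

# Hypothetical patterns for illustration only.
_FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE
)
_ARG_RE = re.compile(r"^\s*ARG\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)
_ENV_RE = re.compile(r"^\s*ENV\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def parse_dockerfile_details(text: str) -> dict[str, list[str]]:
    """Collect base images, stage names, ARG names and ENV names from a Dockerfile."""
    details: dict[str, list[str]] = {
        "base_images": [], "stages": [], "args": [], "env": []
    }
    for line in text.splitlines():
        if m := _FROM_RE.match(line):
            image, stage = m.group(1), m.group(2)
            # "FROM base AS builder" refers to an earlier stage, not an external image.
            if image not in details["stages"]:
                details["base_images"].append(image)
            if stage:
                details["stages"].append(stage)
        elif m := _ARG_RE.match(line):
            details["args"].append(m.group(1))
        elif m := _ENV_RE.match(line):
            # Only the first key of an ENV line is captured in this sketch.
            details["env"].append(m.group(1))
    return details
```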

2. CI Actions Discoverer

| Attribute | Value |
|---|---|
| Globs | .github/workflows/*.yml, .github/workflows/*.yaml |
| Parser | Reuses ActionsPinner._USES_RE regex to extract third-party action references. Parses workflow name field from YAML. |
| Output kind | actions |

3. Compose/Stack Discoverer

| Attribute | Value |
|---|---|
| Globs | **/docker-compose*.yml, **/docker-compose*.yaml, **/compose*.yml, **/compose*.yaml |
| Parser | YAML parse → extract services keys and image fields from each service. Detects build directives pointing to Dockerfiles. |
| Output kind | compose |
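A possible shape for the Compose extraction step is sketched below. It assumes the file has already been loaded into a dict (for example with yaml.safe_load) so that the sketch itself needs only the standard library. The function name compose_details and the build_contexts key are hypothetical; only services and image_refs appear in the data model above.

```python
from typing import Any


def compose_details(doc: dict[str, Any]) -> dict[str, list[str]]:
    """Extract service names, image refs and build contexts from a parsed Compose file."""
    details: dict[str, list[str]] = {
        "services": [], "image_refs": [], "build_contexts": []
    }
    for name, svc in (doc.get("services") or {}).items():
        details["services"].append(name)
        svc = svc or {}  # a service may be declared with an empty body
        if image := svc.get("image"):
            details["image_refs"].append(image)
        build = svc.get("build")
        if isinstance(build, str):
            # Short form: "build: ./api"
            details["build_contexts"].append(build)
        elif isinstance(build, dict):
            # Long form: "build: {context: ./api, dockerfile: ...}"
            details["build_contexts"].append(build.get("context", "."))
    return details
```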

4. Kubernetes / Infrastructure-as-Code Discoverers

The original monolithic “Kubernetes Manifest Discoverer” is split into tool-aware discoverers, each with appropriate detection heuristics and pinning recommendations.

4a. Helm Chart Discoverer

| Attribute | Value |
|---|---|
| Detection | Chart.yaml present in directory, or path contains charts/ |
| Globs | Scans for Chart.yaml files, then reads templates and values.yaml from the chart directory |
| Parser | Extracts image.repository / image.tag patterns from values.yaml. Scans templates for hardcoded image refs. |
| Output kind | helm |
| Recommendation | Override image tags via values.yaml; don’t edit templates directly |
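One way to find the repository/tag pairs in a parsed values.yaml is a recursive walk, sketched here under the assumption that any mapping with a string repository key describes an image. The function name helm_image_refs is hypothetical, and charts that split the reference differently (for example with a separate registry key) would need more handling.

```python
from typing import Any


def helm_image_refs(values: Any) -> list[str]:
    """Collect "repository[:tag]" strings from a parsed values.yaml structure."""
    refs: list[str] = []
    if isinstance(values, dict):
        repo = values.get("repository")
        if isinstance(repo, str):
            tag = values.get("tag")
            # An empty or missing tag usually means "use the chart's appVersion".
            refs.append(f"{repo}:{tag}" if tag not in (None, "") else repo)
        for child in values.values():
            refs.extend(helm_image_refs(child))
    elif isinstance(values, list):
        for item in values:
            refs.extend(helm_image_refs(item))
    return refs
```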

4b. Kustomize Discoverer

| Attribute | Value |
|---|---|
| Detection | kustomization.yaml or kustomization.yml present |
| Parser | Reads images transformer entries from kustomization file. Also scans referenced resources for image refs. |
| Output kind | kustomize |
| Recommendation | Use the images transformer in kustomization.yaml to pin digests |
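Reading the images transformer could be sketched as follows, assuming the kustomization file has already been parsed into a dict. The entry fields name, newName, newTag and digest are Kustomize's own; the function name kustomize_image_refs is hypothetical.

```python
from typing import Any


def kustomize_image_refs(kustomization: dict[str, Any]) -> list[str]:
    """Turn images transformer entries into effective image references."""
    refs: list[str] = []
    for entry in kustomization.get("images") or []:
        # newName replaces the original image name when present.
        name = entry.get("newName") or entry.get("name")
        if not name:
            continue
        if digest := entry.get("digest"):
            refs.append(f"{name}@{digest}")
        elif (tag := entry.get("newTag")) is not None:
            refs.append(f"{name}:{tag}")
        else:
            refs.append(name)
    return refs
```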

4c. Raw Kubernetes Manifest Discoverer

| Attribute | Value |
|---|---|
| Globs | **/*.yaml, **/*.yml |
| Heuristic | File must contain both apiVersion and kind top-level keys. Skips files already claimed by Helm, Kustomize, Compose, or workflow discoverers. |
| Parser | Walks spec.containers[*].image, spec.initContainers[*].image, and spec.template.spec.containers[*].image paths. Handles multi-document YAML (--- separators). |
| Output kind | k8s |
| Recommendation | cascadeguard images pin --file <path> |
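The heuristic and the image walk might be sketched like this. It assumes the documents come from something like yaml.safe_load_all, so each item is a dict or None. The function names are hypothetical, and the sketch also checks initContainers under the pod template, which goes slightly beyond the three paths listed above. Deeper nestings such as a CronJob's jobTemplate are not covered.

```python
from typing import Any, Iterable


def is_k8s_manifest(doc: Any) -> bool:
    """Content heuristic: a mapping with top-level apiVersion and kind keys."""
    return isinstance(doc, dict) and "apiVersion" in doc and "kind" in doc


def _pod_spec_images(pod_spec: dict[str, Any]) -> list[str]:
    images: list[str] = []
    for key in ("containers", "initContainers"):
        for container in pod_spec.get(key) or []:
            if isinstance(container, dict) and container.get("image"):
                images.append(container["image"])
    return images


def k8s_image_refs(docs: Iterable[Any]) -> list[str]:
    """Collect image refs from every Kubernetes document in a multi-document file."""
    refs: list[str] = []
    for doc in docs:
        if not is_k8s_manifest(doc):
            continue  # empty documents and non-Kubernetes YAML are skipped
        spec = doc.get("spec") or {}
        refs.extend(_pod_spec_images(spec))  # bare Pod
        template_spec = (spec.get("template") or {}).get("spec") or {}
        refs.extend(_pod_spec_images(template_spec))  # Deployment, StatefulSet, ...
    return refs
```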

5. Future Discoverers

| Tool | Detection | Priority | Pinning approach |
|---|---|---|---|
| Flux | apiVersion: source.toolkit.fluxcd.io or kustomize.toolkit.fluxcd.io | Medium | Pin in HelmRelease values or Kustomization patches |
| ArgoCD | apiVersion: argoproj.io with Application/ApplicationSet | Medium | Pin in source repo, not the ArgoCD manifest |
| Terraform/OpenTofu | *.tf files with container or image blocks | Low | Pin in the .tf resource |
| Pulumi | Pulumi.yaml in root | Low | Pin in the Pulumi program |
| GitLab CI | .gitlab-ci.yml | Low | Pin image: fields in job definitions |

Interactive Selection UI

Design

  • Built with Python curses module (no new dependency)
  • Checkbox-style selector with artifacts grouped by kind as collapsible categories
  • Keyboard controls: ↑/↓ navigate, Space toggles selection, a selects all, n deselects all, Enter confirms

Fallback Behaviour

| Condition | Behaviour |
|---|---|
| --non-interactive flag | Skip UI, select all artifacts |
| stdin is not a TTY (piped) | Fall back to numbered list on stderr, read selections from stdin |
| Windows without curses | Fall back to simple numbered list prompt |
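The fallback rules above reduce to a small decision function plus a parser for the numbered-list input. The sketch below is illustrative only: the function names, the returned mode strings, and the "1,3-5" input syntax are assumptions, since the plan does not specify how selections are typed.

```python
def choose_selector(non_interactive: bool, stdin_is_tty: bool, curses_available: bool) -> str:
    """Map the fallback conditions onto a selection mode."""
    if non_interactive:
        return "all"        # skip the UI, select everything
    if not stdin_is_tty or not curses_available:
        return "numbered"   # piped stdin, or Windows without curses
    return "curses"


def parse_selection(text: str, count: int) -> list[int]:
    """Parse input such as "1,3-5" into sorted zero-based indices; "a" selects all."""
    text = text.strip().lower()
    if text in ("a", "all"):
        return list(range(count))
    chosen: set[int] = set()
    for token in text.replace(",", " ").split():
        start, _, end = token.partition("-")
        low = int(start)
        high = int(end) if end else low
        for number in range(low, high + 1):
            if 1 <= number <= count:  # silently ignore out-of-range numbers
                chosen.add(number - 1)
    return sorted(chosen)
```

In practice the caller would pass sys.stdin.isatty() for stdin_is_tty and the result of attempting to import curses for curses_available.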

Display Format

CascadeGuard Scan — Select artifacts to analyse

  Dockerfiles (3)
    [x] services/api/Dockerfile
    [x] services/worker/Dockerfile
    [ ] tools/dev.Dockerfile

  GitHub Actions (2)
    [x] .github/workflows/ci.yml
    [x] .github/workflows/release.yml

  Compose Files (1)
    [x] docker-compose.yml

  Kubernetes Manifests (4)
    [x] k8s/deployment.yaml
    [x] k8s/cronjob.yaml
    [ ] k8s/configmap.yaml
    [ ] k8s/service.yaml

↑↓ Navigate  Space Toggle  a All  n None  Enter Confirm

Analysis & Reporting

Analysis Rules (per artifact kind)

| Kind | Analysis checks |
|---|---|
| dockerfile | Unpinned base images (no digest/tag), use of latest tag, multi-stage build detection, USER root warnings, COPY --from references |
| actions | Unpinned action refs (tag vs SHA), use of @main/@master, known deprecated actions |
| compose | Unpinned image references, exposed ports, privileged mode, volume mounts to sensitive paths |
| k8s | Unpinned image references, imagePullPolicy: Always without digest, securityContext gaps, runAsRoot |
| gitlab-ci | Unpinned image references, use of latest tag |
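The pinning checks shared by several kinds could be built on two small classifiers, sketched below. The function names and the returned labels are hypothetical, and mapping a label to a risk level is left out because the plan does not define it. The image check treats any "@" as a digest reference, and the action check treats only a full 40-character hex SHA as pinned.

```python
import re

_SHA_RE = re.compile(r"[0-9a-f]{40}")


def classify_image_ref(ref: str) -> str:
    """Return "digest", "latest", "untagged" or "tag" for a container image reference."""
    if "@" in ref:
        return "digest"
    # Only the last path component can carry a tag; a colon earlier in the
    # reference belongs to a registry host:port such as localhost:5000.
    last = ref.rsplit("/", 1)[-1]
    if ":" not in last:
        return "untagged"
    tag = last.rsplit(":", 1)[1]
    return "latest" if tag == "latest" else "tag"


def classify_action_ref(ref: str) -> str:
    """Return "sha", "branch" or "tag" for an action reference like owner/repo@version."""
    _, _, version = ref.partition("@")
    if _SHA_RE.fullmatch(version):
        return "sha"
    if version in ("main", "master"):
        return "branch"
    return "tag"
```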

Output Formats

Text (default):

CascadeGuard Scan Report
========================
Scanned: /path/to/project
Artifacts: 10 discovered, 8 selected

Dockerfiles (3)
  services/api/Dockerfile
    ⚠ Base image python:3.11-slim is not pinned to a digest
    ℹ Multi-stage build with 3 stages detected
    → Recommendation: Pin base images to digests for reproducible builds

Summary: 2 high, 3 medium, 5 info findings

JSON: Full ScanResult serialised as JSON.

YAML: Full ScanResult serialised as YAML.
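Because the models are plain dataclasses, structured output can lean on dataclasses.asdict. The sketch below redefines two of the models so that it runs on its own and serialises a list of analyses rather than a full ScanResult; the helper name to_json is hypothetical. YAML output would presumably pass the same dictionaries to pyyaml.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DiscoveredArtifact:
    kind: str
    path: str
    details: dict


@dataclass
class ArtifactAnalysis:
    artifact: DiscoveredArtifact
    findings: list[str]
    recommendations: list[str]
    risk_level: str


def to_json(analyses: list[ArtifactAnalysis]) -> str:
    """Serialise analyses to JSON; asdict() recurses into nested dataclasses."""
    return json.dumps([asdict(a) for a in analyses], indent=2, sort_keys=True)
```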

CLI Interface

cascadeguard scan [--dir PATH] [--non-interactive] [--format json|text] [--output FILE]

| Flag | Default | Description |
|---|---|---|
| --dir | . | Root directory to scan |
| --non-interactive | false | Scan all discovered artifacts without prompting |
| --format | text | Output format: text, json |
| --output | stdout | Write results to file instead of stdout |

Integration with existing CLI

The scan subcommand is added to build_parser() in app.py alongside existing commands (validate, enrol, check, etc.). A new cmd_scan handler is added to the command dispatch dict in main(). The handler delegates to scan.run_scan() which orchestrates discovery → selection → analysis → output.
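The wiring might look something like this with argparse. The real build_parser() in app.py is not shown in this plan, so the version here is a stand-in that registers only the scan subcommand; add_scan_subcommand is a hypothetical helper name, and the flags mirror the table above.

```python
import argparse


def add_scan_subcommand(subparsers) -> argparse.ArgumentParser:
    """Register the scan subcommand and its flags on an existing subparsers object."""
    scan = subparsers.add_parser(
        "scan", help="Discover and analyse container-related artifacts"
    )
    scan.add_argument("--dir", default=".", help="Root directory to scan")
    scan.add_argument(
        "--non-interactive",
        action="store_true",
        help="Scan all discovered artifacts without prompting",
    )
    scan.add_argument("--format", choices=["text", "json"], default="text")
    scan.add_argument(
        "--output", metavar="FILE", default=None,
        help="Write results to FILE instead of stdout",
    )
    return scan


def build_parser() -> argparse.ArgumentParser:
    # Stand-in for the existing build_parser(); the other subcommands are omitted.
    parser = argparse.ArgumentParser(prog="cascadeguard")
    subparsers = parser.add_subparsers(dest="command", required=True)
    add_scan_subcommand(subparsers)
    return parser
```

A dispatch entry would then map args.command == "scan" to cmd_scan, which hands args.dir, args.non_interactive, args.format and args.output to scan.run_scan().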

One-Shot Install Script (install.sh)

Hosted at

https://get.cascadeguard.com

Usage:

curl -sSL https://get.cascadeguard.com | sh

Or with options:

curl -sSL https://get.cascadeguard.com | sh -s -- --keep --format json

Script Flow

graph TD
    Start["install.sh starts"] --> DetectOS["Detect OS + arch<br/>uname -s, uname -m"]
    DetectOS --> CheckPython["Check Python ≥ 3.11"]

    CheckPython -->|Found| CreateVenv["Create temp venv<br/>in mktemp -d"]
    CheckPython -->|Not found| TryBinary["Download pre-built binary<br/>(future: GitHub release)"]

    CreateVenv --> PipInstall["pip install cascadeguard-tool<br/>from PyPI (or GitHub release)"]
    PipInstall --> RunScan["Run: cascadeguard scan<br/>in current directory"]

    TryBinary --> RunScan

    RunScan --> Cleanup{"--keep flag?"}
    Cleanup -->|No| Remove["Remove temp dir"]
    Cleanup -->|Yes| Keep["Keep temp dir,<br/>print path"]
    Remove --> Done["Exit"]
    Keep --> Done

Script Behaviour

| Concern | Approach |
|---|---|
| OS detection | uname -s → Linux/Darwin; uname -m → x86_64/arm64/aarch64 |
| Python check | python3 --version, require ≥ 3.11 |
| Isolation | mktemp -d for temp venv, cleaned up on exit (trap EXIT) |
| Installation | pip install cascadeguard-tool into temp venv (no system pollution) |
| Passthrough args | All args after -- forwarded to cascadeguard scan |
| --keep flag | Consumed by install.sh, prevents cleanup, prints venv path |
| Error handling | set -eu, plus pipefail where the shell supports it (the script is piped to sh, which may be dash, and older dash releases reject set -o pipefail); meaningful error messages on failure |
| No root required | Runs entirely in user space |

File Structure

app/
├── app.py                  # Add scan subcommand + cmd_scan handler
├── scan/
│   ├── __init__.py         # Public API: run_scan()
│   ├── discoverers.py      # All Discoverer implementations
│   ├── models.py           # DiscoveredArtifact, ScanResult, ArtifactAnalysis dataclasses
│   ├── ui.py               # Interactive selection (curses + fallback)
│   └── report.py           # Analysis engine and output formatting
├── tests/
│   ├── test_scan_discovery.py   # Unit tests for discoverers
│   ├── test_scan_analysis.py    # Unit tests for analysis rules
│   └── test_scan_ui.py          # UI tests (mocked curses)
install.sh                  # One-shot wrapper script

Implementation Phases

Phase 1: Core Discovery

Deliverables:

  • scan/models.py: DiscoveredArtifact, ScanResult, ArtifactAnalysis, ScanSummary dataclasses
  • scan/discoverers.py: DockerfileDiscoverer and CIActionsDiscoverer implementations
  • scan/__init__.py: run_scan() orchestrator (discovery only, no UI or analysis yet)
  • app.py: scan subcommand wired into build_parser() and cmd_scan in dispatch dict
  • tests/test_scan_discovery.py: Unit tests for both discoverers with fixture directories
  • Output: plain list of discovered artifacts to stdout

Reuse: CascadeGuardTool.parse_dockerfile_base_images(), ActionsPinner._USES_RE

Phase 2: Extended Discovery

Deliverables:

  • scan/discoverers.py — Add ComposeDiscoverer and KubernetesDiscoverer
  • tests/test_scan_discovery.py — Extended tests with compose and k8s fixtures
  • Integration test: scan a fixture project directory containing all artifact types

Key decisions:

  • K8s detection via apiVersion + kind field presence (not file path heuristics)
  • Multi-document YAML support for K8s manifests
  • Compose file detection by filename pattern, not content heuristics

Phase 3: Interactive UI

Deliverables:

  • scan/ui.py: InteractiveSelector class with curses implementation
  • scan/ui.py: FallbackSelector for non-TTY / Windows environments
  • scan/__init__.py: Wire UI into run_scan() flow
  • --non-interactive flag support
  • tests/test_scan_ui.py: Tests with mocked curses

Phase 4: Analysis & Reporting

Deliverables:

  • scan/report.py: AnalysisEngine with per-kind analysis rules
  • scan/report.py: ReportFormatter with text and JSON output
  • scan/__init__.py: Wire analysis and reporting into run_scan() flow
  • --format and --output flag support
  • tests/test_scan_analysis.py: Tests for analysis rules and output formatting

Phase 5: Install Script & Distribution

Deliverables:

  • install.sh — One-shot wrapper script
  • DNS/hosting setup for get.cascadeguard.com → serve install.sh
  • Smoke tests: run install.sh in clean Docker containers (Ubuntu, Alpine, macOS sim)
  • README updates with install instructions

Technical Decisions & Constraints

| Decision | Rationale |
|---|---|
| No new Python dependencies (Phases 1–4) | Keep the tool lightweight; pyyaml + stdlib covers all needs |
| Reuse existing parsers from app.py | Dockerfile and Actions parsing is already battle-tested |
| pathlib.Path.rglob() for file discovery | Declarative glob patterns per discoverer; cleaner than manual os.walk. Excluded dirs (.git, node_modules, etc.) filtered on results |
| curses for interactive UI | stdlib, no dependency; works on Linux/macOS out of the box |
| Simple numbered-list fallback | Covers Windows and piped-stdin cases without complexity |
| K8s detection via content heuristics | File path alone is unreliable; apiVersion + kind is definitive |
| GitLab CI as stretch goal | Lower priority; most users are on GitHub Actions |
| dataclass for models | Simple, no dependency, good for serialisation to JSON/YAML |
| Discoverer protocol (not ABC) | Keeps it lightweight; structural subtyping via Protocol |
| install.sh uses temp venv | Zero system pollution; clean up on exit |

Phase 6: Scan Report Enhancements

6a. Component Name Inference

Derive a short component name from the folder structure instead of using the full file path as the title.

Heuristics:

  • Dockerfile at components/remote-development/codev/Dockerfile → component name remote-development/codev or just codev
  • Helm chart at components/headlamp/charts/headlamp-0.39.0/headlamp → headlamp
  • Kustomize at components/n8n/kustomization.yaml → n8n
  • GitHub Actions at .github/workflows/build-codev.yaml → build-codev

Rules:

  1. For Helm: use chart_name from Chart.yaml (already parsed)
  2. For Kustomize: use the parent directory name (or parent/grandparent if parent is generic like overlays)
  3. For Dockerfiles: walk up from the Dockerfile, skip generic names (app, src, docker), take the first meaningful directory name
  4. For Actions: use the workflow filename stem
  5. Full path remains in the detail section of the markdown report

6b. Catalogue Integration

Cross-reference discovered images and actions against the CascadeGuard catalogue:

  • Look up images in our registry (cascadeguard managed images)
  • Report known CVEs for discovered image tags
  • For managed images: show what CVEs our pinned version resolves
  • For actions: check against our actions policy catalogue

Requires:

  • API endpoint or local catalogue file for image/action metadata
  • Integration with vuln report data (Grype/Trivy results if available)

6c. Website Links

Every recommendation should link to a cascadeguard.com article explaining the issue:

| Recommendation | Article URL |
|---|---|
| Pin container images to digests | https://cascadeguard.com/docs/why-pin-images |
| Pin GitHub Actions to commit SHAs | https://cascadeguard.com/docs/why-pin-actions |
| Use Kustomize images transformer | https://cascadeguard.com/docs/kustomize-image-pinning |
| Pin Helm chart image tags | https://cascadeguard.com/docs/helm-image-pinning |

In the CLI summary table: deduplicate actions across kinds and show the link once at the bottom. In the markdown report: inline links in recommendations.

Requires:

  • Content creation for each article on cascadeguard.com
  • URL structure decision (docs/ vs blog/ vs guides/)

6d. Unified cascadeguard images pin Command

cascadeguard images pin should handle all artifact types, not just Dockerfiles:

  • Dockerfiles: rewrite FROM lines with digest-pinned references
  • Helm charts: update values.yaml image tags with digests
  • Kustomize: add/update images transformer entries in kustomization.yaml
  • Compose files: rewrite image: fields with digest-pinned references
  • Raw K8s manifests: rewrite image: fields with digest-pinned references

This means the scan recommendation is always cascadeguard images pin regardless of artifact type — the command figures out the right strategy based on the file type.

6e. CLI Summary Deduplication

When the same action applies to multiple kinds (e.g. cascadeguard images pin for Dockerfiles, Helm, Kustomize, Compose, K8s), show it once at the bottom of the CLI output instead of per-kind:

  Kind                Found   Issues
  Dockerfiles             2        2
  GitHub Actions          2        2
  Helm Charts            24       15
  Kustomize              38        8

  Recommended actions:
    cascadeguard images pin    — 25 artifacts across 3 kinds
                                 https://cascadeguard.com/docs/why-pin-images
    cascadeguard actions pin   — 2 workflows
                                 https://cascadeguard.com/docs/why-pin-actions
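The deduplication behind the "Recommended actions" block amounts to grouping per-artifact recommendations by command. A minimal sketch, assuming each artifact with an issue contributes one (kind, command) pair; the function name dedupe_actions is hypothetical:

```python
from collections import defaultdict
from typing import Iterable


def dedupe_actions(recs: Iterable[tuple[str, str]]) -> list[tuple[str, int, int]]:
    """Group (kind, command) pairs into (command, artifact_count, kind_count) rows."""
    counts: dict[str, int] = defaultdict(int)
    kinds: dict[str, set[str]] = defaultdict(set)
    for kind, command in recs:
        counts[command] += 1
        kinds[command].add(kind)
    # Dicts preserve insertion order, so commands appear in first-seen order.
    return [(command, counts[command], len(kinds[command])) for command in counts]
```

Fed with the issue counts from the sample table (2 Dockerfiles, 15 Helm charts, 8 Kustomize files, 2 workflows), this yields 25 artifacts across 3 kinds for cascadeguard images pin and 2 artifacts for cascadeguard actions pin.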

6f. Markdown Links

In the markdown report:

  • Link image names to our registry page if we host them (e.g. [nginx:alpine](https://cascadeguard.com/images/nginx))
  • Link recommendations to website articles
  • Link CascadeGuard commands to CLI docs

Implementation Priority

  1. Component name inference (6a) — quick win, improves readability immediately
  2. Unified images pin (6d) — makes the recommendation story consistent
  3. CLI summary deduplication (6e) — cleaner output
  4. Website links (6c) — needs content, but URL structure can be decided now
  5. Markdown links (6f) — follows from 6c
  6. Catalogue integration (6b) — largest effort, needs API/data work