Plan: `cascadeguard scan` CLI Command & One-Shot Install Script

Status: Draft
Domain: cascadeguard.com

Overview

Add a cascadeguard scan subcommand that discovers container-related artifacts in a project directory (Dockerfiles, CI workflows, Compose files, Kubernetes manifests), presents an interactive selection UI, analyses the selected artifacts, and produces a structured report. Alongside the CLI command, ship a one-shot install script (install.sh) hosted at https://get.cascadeguard.com that bootstraps a temporary environment and runs the scan in a single curl | sh invocation.

The scan command reuses existing parsing capabilities already in app.py — specifically CascadeGuardTool.parse_dockerfile_base_images(), parse_image_reference(), and ActionsPinner._USES_RE for action reference detection — and extends them with new discoverer modules for Compose, Kubernetes, and GitLab CI artifacts.

No new Python dependencies are required for the core implementation (Phases 1–4). The only existing dependency is pyyaml, which already covers YAML parsing needs.

Architecture

graph TD
    CLI["cascadeguard scan<br/>CLI Entry Point"]
    CLI --> Discovery["Discovery Engine<br/>run_scan()"]
    Discovery --> DD["Dockerfile<br/>Discoverer"]
    Discovery --> AD["CI Actions<br/>Discoverer"]
    Discovery --> CD["Compose/Stack<br/>Discoverer"]
    Discovery --> KD["Kubernetes<br/>Discoverer"]
    Discovery --> GD["GitLab CI<br/>Discoverer<br/>(stretch)"]

    DD --> Artifacts["List[DiscoveredArtifact]"]
    AD --> Artifacts
    CD --> Artifacts
    KD --> Artifacts
    GD --> Artifacts

    Artifacts --> UI["Interactive Selection<br/>curses / fallback"]
    Artifacts -->|--non-interactive| Analysis

    UI --> Analysis["Analysis Engine<br/>report.py"]
    Analysis --> Output["Report Output<br/>text / json / yaml"]
    Output -->|--output FILE| File["File"]
    Output --> Stdout["stdout"]

    subgraph "Reused from app.py"
        Parse["parse_dockerfile_base_images()"]
        ImgRef["parse_image_reference()"]
        UsesRE["ActionsPinner._USES_RE"]
    end

    DD -.-> Parse
    DD -.-> ImgRef
    AD -.-> UsesRE

Scan Command Flow

sequenceDiagram
    participant User
    participant CLI as cascadeguard scan
    participant Disc as Discovery Engine
    participant UI as Interactive UI
    participant Anal as Analysis Engine
    participant Out as Report Output

    User->>CLI: cascadeguard scan [--dir .] [flags]
    CLI->>Disc: discover(root_dir)

    loop Each Discoverer
        Disc->>Disc: glob for matching files
        Disc->>Disc: parse metadata from matched files
        Disc-->>Disc: yield DiscoveredArtifact per match
    end

    Disc-->>CLI: List[DiscoveredArtifact]

    alt --non-interactive
        CLI->>Anal: analyse(all_artifacts)
    else interactive (default)
        CLI->>UI: present(artifacts, grouped by kind)
        User->>UI: toggle selections
        UI-->>CLI: selected artifacts
        CLI->>Anal: analyse(selected_artifacts)
    end

    Anal-->>Out: ScanResult
    alt --output FILE
        Out->>Out: write to FILE
    else default
        Out->>Out: print to stdout
    end

Data Models

DiscoveredArtifact

from dataclasses import dataclass

@dataclass
class DiscoveredArtifact:
    kind: str          # "dockerfile" | "actions" | "compose" | "helm" | "kustomize" | "k8s" | "gitlab-ci"
    path: str          # relative to scan root, e.g. "services/api/Dockerfile"
    details: dict      # parsed metadata, varies by kind

Details by kind:

kind	details keys	example values
`dockerfile`	`base_images`, `stages`, `args`, `env`	`["python:3.11-slim", "node:20-alpine"]`, `["base", "builder", "final"]`
`actions`	`action_refs`, `workflow_name`	`["actions/checkout@v4", "docker/build-push-action@v5"]`, `"CI"`
`compose`	`services`, `image_refs`	`["api", "db", "redis"]`, `["postgres:16", "redis:7-alpine"]`
`k8s`	`api_version`, `kind`, `image_refs`, `namespace`	`"apps/v1"`, `"Deployment"`, `["nginx:1.25"]`, `"default"`
`gitlab-ci`	`image_refs`, `stages`	`["python:3.11"]`, `["build", "test", "deploy"]`

ScanResult

@dataclass
class ScanResult:
    root_dir: str
    discovered: list[DiscoveredArtifact]
    selected: list[DiscoveredArtifact]
    analysis: list[ArtifactAnalysis]
    summary: ScanSummary
 
@dataclass
class ArtifactAnalysis:
    artifact: DiscoveredArtifact
    findings: list[str]           # human-readable observations
    recommendations: list[str]    # actionable suggestions
    risk_level: str               # "info" | "low" | "medium" | "high"
 
@dataclass
class ScanSummary:
    total_discovered: int
    total_selected: int
    total_images: int             # unique container image references
    total_actions: int            # unique action references
    by_kind: dict[str, int]       # count per artifact kind
    by_risk: dict[str, int]       # count per risk level
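
As a rough illustration of how the counting fields of ScanSummary might be derived from artifacts, the sketch below de-duplicates image and action references across artifacts. The helper name `count_unique_refs` is hypothetical, and the `DiscoveredArtifact` class is a trimmed copy of the model above so that the block runs on its own.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DiscoveredArtifact:
    kind: str
    path: str
    details: dict


def count_unique_refs(artifacts: list[DiscoveredArtifact]) -> dict:
    """Compute total_images, total_actions and by_kind for a list of artifacts."""
    images: set[str] = set()
    actions: set[str] = set()
    for artifact in artifacts:
        # Dockerfiles store images under "base_images"; other kinds use "image_refs".
        images.update(artifact.details.get("base_images", []))
        images.update(artifact.details.get("image_refs", []))
        actions.update(artifact.details.get("action_refs", []))
    return {
        "total_images": len(images),
        "total_actions": len(actions),
        "by_kind": dict(Counter(a.kind for a in artifacts)),
    }
```

Using sets means an image referenced by both a Dockerfile and a Compose file is counted once, which is what "unique container image references" suggests.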

Discovery Modules

Each discoverer implements a common protocol:

from pathlib import Path
from typing import Protocol

class Discoverer(Protocol):
    def discover(self, root: Path) -> list[DiscoveredArtifact]: ...

All discoverers use pathlib.Path.rglob() for file matching — each module declares its glob patterns and rglob handles recursive traversal. A shared _excluded_dirs set (.git, node_modules, vendor, __pycache__, .venv, venv) is used to filter results, keeping the discovery code declarative and readable.
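
The rglob-plus-filter approach described above could look roughly like the following. This is a minimal sketch, not the planned implementation: the helper name `iter_matches` and the constant `EXCLUDED_DIRS` are assumptions, and patterns are passed as bare filename globs because `rglob` already recurses.

```python
from pathlib import Path
from typing import Iterable, Iterator

# Directory names skipped during discovery (list taken from the plan).
EXCLUDED_DIRS = frozenset(
    {".git", "node_modules", "vendor", "__pycache__", ".venv", "venv"}
)


def iter_matches(root: Path, patterns: Iterable[str]) -> Iterator[Path]:
    """Yield root-relative files matching any pattern, skipping excluded dirs."""
    seen: set[Path] = set()
    for pattern in patterns:
        for path in sorted(root.rglob(pattern)):
            rel = path.relative_to(root)
            # Drop anything with an excluded directory among its parents.
            if EXCLUDED_DIRS.intersection(rel.parts[:-1]):
                continue
            if path.is_file() and rel not in seen:
                seen.add(rel)
                yield rel
```

Because `rglob` still descends into excluded directories before the filter runs, a large `node_modules` tree is walked and then discarded. That is the trade-off implied by filtering on results rather than pruning during traversal.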

1. Dockerfile Discoverer

Attribute	Value
Globs	`**/Dockerfile`, `**/Dockerfile.*`, `**/*.dockerfile`, `**/Containerfile`, `**/Containerfile.*`
Parser	Reuses `CascadeGuardTool.parse_dockerfile_base_images()` for FROM extraction. Additionally parses `ARG` and `ENV` directives, and identifies build stage names from `FROM ... AS <name>`.
Output kind	`dockerfile`
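
The additional parsing of stage names, `ARG` and `ENV` could be sketched as below. The regexes and the function name `parse_dockerfile_metadata` are illustrative assumptions; the existing `parse_dockerfile_base_images()` is not shown in this plan and is not reproduced here.

```python
import re

# Hypothetical patterns for illustration; FROM extraction itself is meant to
# be delegated to CascadeGuardTool.parse_dockerfile_base_images().
_FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(?P<image>\S+)(?:\s+AS\s+(?P<stage>\S+))?",
    re.IGNORECASE,
)
_KEY_RE = re.compile(
    r"^\s*(?P<kw>ARG|ENV)\s+(?P<key>[A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE
)


def parse_dockerfile_metadata(text: str) -> dict:
    """Collect stage names plus ARG and ENV keys from Dockerfile text."""
    details = {"stages": [], "args": [], "env": []}
    for line in text.splitlines():
        if m := _FROM_RE.match(line):
            if m.group("stage"):
                details["stages"].append(m.group("stage"))
        elif m := _KEY_RE.match(line):
            bucket = "args" if m.group("kw").upper() == "ARG" else "env"
            details[bucket].append(m.group("key"))
    return details
```

This sketch only records the first key of each `ARG`/`ENV` line and does not follow backslash line continuations, so a real parser would need to handle both.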

2. CI Actions Discoverer

Attribute	Value
Globs	`.github/workflows/*.yml`, `.github/workflows/*.yaml`
Parser	Reuses `ActionsPinner._USES_RE` regex to extract third-party action references. Parses workflow `name` field from YAML.
Output kind	`actions`
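
To make the extraction step concrete, here is a sketch that uses a stand-in regex. The pattern `USES_RE` below is an assumption written for this example; the plan reuses `ActionsPinner._USES_RE`, whose actual definition may differ.

```python
import re

# Stand-in pattern for illustration only; not the real ActionsPinner._USES_RE.
USES_RE = re.compile(
    r"^[ \t]*(?:-[ \t]*)?uses:[ \t]*['\"]?(?P<ref>[^\s'\"#]+)", re.MULTILINE
)


def extract_action_refs(workflow_text: str) -> list[str]:
    """Return third-party action references in order of first appearance."""
    refs: list[str] = []
    for match in USES_RE.finditer(workflow_text):
        ref = match.group("ref")
        # Local actions ("./path") and docker:// steps are not action refs.
        if ref.startswith(("./", "docker://")):
            continue
        if ref not in refs:
            refs.append(ref)
    return refs
```

Working on raw text rather than parsed YAML keeps line positions available, which may matter if the same pattern is later used to rewrite references in place.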

3. Compose/Stack Discoverer

Attribute	Value
Globs	`**/docker-compose.yml`, `**/docker-compose.yaml`, `**/compose.yml`, `**/compose.yaml`
Parser	YAML parse → extract `services` keys and `image` fields from each service. Detects `build` directives pointing to Dockerfiles.
Output kind	`compose`
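
A sketch of the Compose parser step follows. It takes an already-parsed mapping (for example the result of `yaml.safe_load`) so that the block has no third-party imports. The function name and the `build_contexts` key are assumptions; the plan only names `services` and `image_refs`.

```python
def compose_details(doc: dict) -> dict:
    """Summarise a parsed Compose file: services, images and build contexts."""
    services = doc.get("services") or {}
    details = {"services": list(services), "image_refs": [], "build_contexts": []}
    for spec in services.values():
        spec = spec or {}
        if image := spec.get("image"):
            details["image_refs"].append(image)
        build = spec.get("build")
        if isinstance(build, str):
            # Short form: `build: ./dir`
            details["build_contexts"].append(build)
        elif isinstance(build, dict):
            # Long form: `build: {context: ./dir, dockerfile: Dockerfile.dev}`
            details["build_contexts"].append(build.get("context", "."))
    return details
```

Recording build contexts would let the scan link a Compose service back to a Dockerfile that the Dockerfile discoverer has already found.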

4. Kubernetes / Infrastructure-as-Code Discoverers

The original monolithic “Kubernetes Manifest Discoverer” is split into tool-aware discoverers, each with appropriate detection heuristics and pinning recommendations.

4a. Helm Chart Discoverer

Attribute	Value
Detection	`Chart.yaml` present in directory, or path contains `charts/`
Globs	Scans for `Chart.yaml` files, then reads templates and `values.yaml` from the chart directory
Parser	Extracts `image.repository` / `image.tag` patterns from `values.yaml`. Scans templates for hardcoded image refs.
Output kind	`helm`
Recommendation	Override image tags via `values.yaml` — don’t edit templates directly
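
The `image.repository` / `image.tag` extraction might be approached as in the sketch below, which walks a parsed `values.yaml` mapping. The function name is hypothetical, and charts that use other key layouts (for example a single `image: repo:tag` string) would need extra cases.

```python
def helm_image_refs(values: dict) -> list[str]:
    """Find `repository` (and optional `tag`) under any key named `image`."""
    refs: list[str] = []

    def visit(node: object, key: str = "") -> None:
        if isinstance(node, dict):
            repo = node.get("repository")
            if key == "image" and isinstance(repo, str):
                tag = node.get("tag")
                # Charts often leave tag empty and default to the chart appVersion.
                refs.append(f"{repo}:{tag}" if tag else repo)
            for child_key, child in node.items():
                visit(child, child_key)
        elif isinstance(node, list):
            for item in node:
                visit(item, key)

    visit(values)
    return refs
```

An entry returned without a tag is a candidate for a finding, since the effective tag then depends on the chart rather than on `values.yaml`.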

4b. Kustomize Discoverer

Attribute	Value
Detection	`kustomization.yaml` or `kustomization.yml` present
Parser	Reads `images` transformer entries from kustomization file. Also scans referenced resources for image refs.
Output kind	`kustomize`
Recommendation	Use the `images` transformer in `kustomization.yaml` to pin digests
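
For the `images` transformer, a sketch along these lines would show which entries already pin a digest. It assumes a parsed kustomization mapping and the standard entry fields `name`, `newName`, `newTag` and `digest`; the function name and the result shape are illustrative only.

```python
def kustomize_image_refs(kustomization: dict) -> list[dict]:
    """Describe each `images` entry and whether it pins a digest."""
    results = []
    for entry in kustomization.get("images") or []:
        name = entry.get("newName") or entry.get("name", "")
        digest = entry.get("digest")
        tag = entry.get("newTag")
        if digest:
            ref = f"{name}@{digest}"
        elif tag:
            ref = f"{name}:{tag}"
        else:
            ref = name
        results.append(
            {"name": entry.get("name", ""), "ref": ref, "pinned": bool(digest)}
        )
    return results
```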

4c. Raw Kubernetes Manifest Discoverer

Attribute	Value
Globs	`**/*.yaml`, `**/*.yml`
Heuristic	File must contain both `apiVersion` and `kind` top-level keys. Skips files already claimed by Helm, Kustomize, Compose, or workflow discoverers.
Parser	Walks `spec.containers[*].image`, `spec.initContainers[*].image`, and `spec.template.spec.containers[*].image` paths. Handles multi-document YAML (`---` separators).
Output kind	`k8s`
Recommendation	`cascadeguard images pin --file <path>`
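
The heuristic and the image walk could be sketched as follows, operating on one parsed YAML document at a time. Both function names are assumptions. The walk is deliberately recursive rather than limited to the three listed paths, so that nested pod specs such as a CronJob's job template are also covered.

```python
def is_k8s_manifest(doc: object) -> bool:
    """Heuristic from the plan: both top-level keys must be present."""
    return isinstance(doc, dict) and "apiVersion" in doc and "kind" in doc


def k8s_image_refs(doc: dict) -> list[str]:
    """Collect container images from pod specs at any nesting depth."""
    refs: list[str] = []

    def visit(node: object) -> None:
        if isinstance(node, dict):
            for key in ("containers", "initContainers"):
                for container in node.get(key) or []:
                    if isinstance(container, dict) and container.get("image"):
                        refs.append(container["image"])
            for child in node.values():
                visit(child)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(doc)
    return refs
```

For multi-document files, PyYAML's `yaml.safe_load_all` yields each document in turn, so the two functions can be applied per document.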

5. Future Discoverers

Tool	Detection	Priority	Pinning approach
Flux	`apiVersion: source.toolkit.fluxcd.io` or `kustomize.toolkit.fluxcd.io`	Medium	Pin in `HelmRelease` values or `Kustomization` patches
ArgoCD	`apiVersion: argoproj.io` with `Application`/`ApplicationSet`	Medium	Pin in source repo, not the ArgoCD manifest
Terraform/OpenTofu	`*.tf` files with `container` or `image` blocks	Low	Pin in the `.tf` resource
Pulumi	`Pulumi.yaml` in root	Low	Pin in the Pulumi program
GitLab CI	`.gitlab-ci.yml`	Low	Pin `image:` fields in job definitions

Interactive Selection UI

Design

Built with Python's curses module (no new dependency)
Checkbox-style selector with artifacts grouped by kind as collapsible categories
Keyboard controls: ↑/↓ navigate, Space toggles selection, a selects all, n deselects all, Enter confirms

Fallback Behaviour

Condition	Behaviour
`--non-interactive` flag	Skip UI, select all artifacts
`stdin` is not a TTY (piped)	Fall back to numbered list on stderr, read selections from stdin
Windows without curses	Fall back to simple numbered list prompt
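
The fallback table can be expressed as a small decision function, shown here together with a parser for the numbered-list input. This is a sketch under assumptions: the function names, the returned selector labels, and the `1,3-4` input syntax are not specified by the plan.

```python
def choose_selector(non_interactive: bool, stdin_is_tty: bool, has_curses: bool) -> str:
    """Map the fallback conditions onto a selector name."""
    if non_interactive:
        return "all"
    if not stdin_is_tty or not has_curses:
        return "numbered"
    return "curses"


def parse_selection(text: str, count: int) -> list[int]:
    """Parse input such as "1,3-4" into sorted zero-based indices."""
    chosen: set[int] = set()
    for part in text.replace(" ", "").split(","):
        if not part:
            continue
        start, _, end = part.partition("-")
        lo, hi = int(start), int(end or start)
        # Ignore numbers outside the displayed list.
        chosen.update(i - 1 for i in range(lo, hi + 1) if 1 <= i <= count)
    return sorted(chosen)
```

In practice the inputs would come from `sys.stdin.isatty()` and from whether `import curses` succeeds. Non-numeric input raises `ValueError` here, which a real prompt would catch and re-ask.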

Display Format

CascadeGuard Scan — Select artifacts to analyse

  Dockerfiles (3)
    [x] services/api/Dockerfile
    [x] services/worker/Dockerfile
    [ ] tools/dev.Dockerfile

  GitHub Actions (2)
    [x] .github/workflows/ci.yml
    [x] .github/workflows/release.yml

  Compose Files (1)
    [x] docker-compose.yml

  Kubernetes Manifests (4)
    [x] k8s/deployment.yaml
    [x] k8s/cronjob.yaml
    [ ] k8s/configmap.yaml
    [ ] k8s/service.yaml

↑↓ Navigate  Space Toggle  a All  n None  Enter Confirm

Analysis & Reporting

Analysis Rules (per artifact kind)

Kind	Analysis checks
`dockerfile`	Unpinned base images (no digest/tag), use of `latest` tag, multi-stage build detection, `USER root` warnings, `COPY --from` references
`actions`	Unpinned action refs (tag vs SHA), use of `@main`/`@master`, known deprecated actions
`compose`	Unpinned image references, exposed ports, privileged mode, volume mounts to sensitive paths
`k8s`	Unpinned image references, `imagePullPolicy: Always` without digest, `securityContext` gaps, containers allowed to run as root (`runAsNonRoot` unset or `runAsUser: 0`)
`gitlab-ci`	Unpinned image references, use of `latest` tag
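
The "unpinned image" check that recurs across kinds could be built on a classifier like the one sketched below. The function name and the four labels are assumptions, and the real implementation would presumably build on the existing `parse_image_reference()` rather than on string splitting.

```python
def classify_image_ref(ref: str) -> str:
    """Return "digest", "latest", "tag", or "untagged" for an image reference."""
    if "@sha256:" in ref:
        return "digest"
    # A colon before the last "/" belongs to a registry port, not a tag.
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return "untagged"  # Docker resolves a missing tag to :latest
    tag = last_segment.rsplit(":", 1)[1]
    return "latest" if tag == "latest" else "tag"
```

Checking only the final path segment avoids misreading `localhost:5000/app` as an image named `localhost` with tag `5000/app`.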

Output Formats

Text (default):

CascadeGuard Scan Report
========================
Scanned: /path/to/project
Artifacts: 10 discovered, 8 selected

Dockerfiles (3)
  services/api/Dockerfile
    ⚠ Base image python:3.11-slim is not pinned to a digest
    ℹ Multi-stage build with 3 stages detected
    → Recommendation: Pin base images to digests for reproducible builds

Summary: 2 high, 3 medium, 5 info findings

JSON: Full ScanResult serialised as JSON.

YAML: Full ScanResult serialised as YAML.
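
Since the models are plain dataclasses, JSON serialisation could rely on `dataclasses.asdict`, as in this sketch. The classes are trimmed copies of the models above so the block runs on its own, and `to_json` is a hypothetical helper name.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DiscoveredArtifact:
    kind: str
    path: str
    details: dict


@dataclass
class ArtifactAnalysis:
    artifact: DiscoveredArtifact
    findings: list[str]
    recommendations: list[str]
    risk_level: str


def to_json(analysis: list[ArtifactAnalysis]) -> str:
    """Serialise analysis entries; asdict() recurses into nested dataclasses."""
    return json.dumps([asdict(item) for item in analysis], indent=2)
```

The same `asdict` output could be handed to a YAML dumper, so both structured formats would share one conversion step.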

CLI Interface

cascadeguard scan [--dir PATH] [--non-interactive] [--format json|text] [--output FILE]

Flag	Default	Description
`--dir`	`.`	Root directory to scan
`--non-interactive`	`false`	Scan all discovered artifacts without prompting
`--format`	`text`	Output format: `text`, `json`
`--output`	stdout	Write results to file instead of stdout

Integration with existing CLI

The scan subcommand is added to build_parser() in app.py alongside existing commands (validate, enrol, check, etc.). A new cmd_scan handler is added to the command dispatch dict in main(). The handler delegates to scan.run_scan() which orchestrates discovery → selection → analysis → output.
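
A sketch of how the subcommand might be registered with argparse is shown below. The helper `add_scan_parser` and the standalone parser are assumptions made for the example; in the plan, the equivalent wiring lives inside `build_parser()` in app.py.

```python
import argparse


def add_scan_parser(subparsers) -> argparse.ArgumentParser:
    """Register the scan subcommand with the flags listed in the CLI table."""
    scan = subparsers.add_parser(
        "scan", help="Discover and analyse container artifacts"
    )
    scan.add_argument("--dir", default=".", help="Root directory to scan")
    scan.add_argument("--non-interactive", action="store_true",
                      help="Scan all discovered artifacts without prompting")
    scan.add_argument("--format", choices=["text", "json"], default="text",
                      help="Output format")
    scan.add_argument("--output", metavar="FILE", default=None,
                      help="Write results to FILE instead of stdout")
    return scan


# Standalone parser used only to exercise the sketch.
parser = argparse.ArgumentParser(prog="cascadeguard")
add_scan_parser(parser.add_subparsers(dest="command"))
args = parser.parse_args(["scan", "--dir", "services", "--format", "json"])
```

argparse converts `--non-interactive` into the attribute `args.non_interactive`, which is what a `cmd_scan` handler would read.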

One-Shot Install Script (`install.sh`)

Hosted at

https://get.cascadeguard.com

Usage:

curl -sSL https://get.cascadeguard.com | sh

Or with options:

curl -sSL https://get.cascadeguard.com | sh -s -- --keep --format json

Script Flow

graph TD
    Start["install.sh starts"] --> DetectOS["Detect OS + arch<br/>uname -s, uname -m"]
    DetectOS --> CheckPython["Check Python ≥ 3.11"]

    CheckPython -->|Found| CreateVenv["Create temp venv<br/>in mktemp -d"]
    CheckPython -->|Not found| TryBinary["Download pre-built binary<br/>(future: GitHub release)"]

    CreateVenv --> PipInstall["pip install cascadeguard-tool<br/>from PyPI (or GitHub release)"]
    PipInstall --> RunScan["Run: cascadeguard scan<br/>in current directory"]

    TryBinary --> RunScan

    RunScan --> Cleanup{"--keep flag?"}
    Cleanup -->|No| Remove["Remove temp dir"]
    Cleanup -->|Yes| Keep["Keep temp dir,<br/>print path"]
    Remove --> Done["Exit"]
    Keep --> Done

Script Behaviour

Concern	Approach
OS detection	`uname -s` → Linux/Darwin; `uname -m` → x86_64/arm64/aarch64
Python check	`python3 --version`, require ≥ 3.11
Isolation	`mktemp -d` for temp venv, cleaned up on exit (trap EXIT)
Installation	`pip install cascadeguard-tool` into temp venv (no system pollution)
Passthrough args	All args after `--` forwarded to `cascadeguard scan`
`--keep` flag	Consumed by install.sh, prevents cleanup, prints venv path
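Error handling	`set -eu` (enable `pipefail` only where the invoking shell supports it, since `curl | sh` may run under a POSIX shell such as dash that lacks it), meaningful error messages on failure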
No root required	Runs entirely in user space

File Structure

app/
├── app.py                  # Add scan subcommand + cmd_scan handler
├── scan/
│   ├── __init__.py         # Public API: run_scan()
│   ├── discoverers.py      # All Discoverer implementations
│   ├── models.py           # DiscoveredArtifact, ScanResult, ArtifactAnalysis, ScanSummary dataclasses
│   ├── ui.py               # Interactive selection (curses + fallback)
│   └── report.py           # Analysis engine and output formatting
├── tests/
│   ├── test_scan_discovery.py   # Unit tests for discoverers
│   ├── test_scan_analysis.py    # Unit tests for analysis rules
│   └── test_scan_ui.py          # UI tests (mocked curses)
install.sh                  # One-shot wrapper script

Implementation Phases

Phase 1: Core Discovery

Deliverables:

scan/models.py — DiscoveredArtifact, ScanResult, ArtifactAnalysis, ScanSummary dataclasses
scan/discoverers.py — DockerfileDiscoverer and CIActionsDiscoverer implementations
scan/__init__.py — run_scan() orchestrator (discovery only, no UI or analysis yet)
app.py — scan subcommand wired into build_parser() and cmd_scan in dispatch dict
tests/test_scan_discovery.py — Unit tests for both discoverers with fixture directories
Output: plain list of discovered artifacts to stdout

Reuse: CascadeGuardTool.parse_dockerfile_base_images(), ActionsPinner._USES_RE

Phase 2: Extended Discovery

Deliverables:

scan/discoverers.py — Add ComposeDiscoverer and KubernetesDiscoverer
tests/test_scan_discovery.py — Extended tests with compose and k8s fixtures
Integration test: scan a fixture project directory containing all artifact types

Key decisions:

K8s detection via apiVersion + kind field presence (not file path heuristics)
Multi-document YAML support for K8s manifests
Compose file detection by filename pattern, not content heuristics

Phase 3: Interactive UI

Deliverables:

scan/ui.py — InteractiveSelector class with curses implementation
scan/ui.py — FallbackSelector for non-TTY / Windows environments
scan/__init__.py — Wire UI into run_scan() flow
--non-interactive flag support
tests/test_scan_ui.py — Tests with mocked curses

Phase 4: Analysis & Reporting

Deliverables:

scan/report.py — AnalysisEngine with per-kind analysis rules
scan/report.py — ReportFormatter with text and JSON output
scan/__init__.py — Wire analysis and reporting into run_scan() flow
--format and --output flag support
tests/test_scan_analysis.py — Tests for analysis rules and output formatting

Phase 5: Install Script & Distribution

Deliverables:

install.sh — One-shot wrapper script
DNS/hosting setup for get.cascadeguard.com → serve install.sh
Smoke tests: run install.sh in clean Docker containers (Ubuntu, Alpine, macOS sim)
README updates with install instructions

Technical Decisions & Constraints

Decision	Rationale
No new Python dependencies (Phases 1–4)	Keep the tool lightweight; `pyyaml` + stdlib covers all needs
Reuse existing parsers from `app.py`	Dockerfile and Actions parsing is already battle-tested
`pathlib.Path.rglob()` for file discovery	Declarative glob patterns per discoverer; cleaner than manual `os.walk`. Excluded dirs (`.git`, `node_modules`, etc.) filtered on results
`curses` for interactive UI	stdlib, no dependency; works on Linux/macOS out of the box
Simple numbered-list fallback	Covers Windows and piped-stdin cases without complexity
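K8s detection via content heuristics	File path alone is unreliable; the presence of both `apiVersion` and `kind` is a much stronger signal (files already claimed by the Helm, Kustomize, Compose, or workflow discoverers are skipped)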
GitLab CI as stretch goal	Lower priority; most users are on GitHub Actions
`dataclass` for models	Simple, no dependency, good for serialisation to JSON/YAML
Discoverer protocol (not ABC)	Keeps it lightweight; structural subtyping via `Protocol`
install.sh uses temp venv	Zero system pollution; clean up on exit

Phase 6: Scan Report Enhancements

6a. Component Name Inference

Derive a short component name from the folder structure instead of using the full file path as the title.

Heuristics:

Dockerfile at components/remote-development/codev/Dockerfile → component name remote-development/codev or just codev
Helm chart at components/headlamp/charts/headlamp-0.39.0/headlamp → headlamp
Kustomize at components/n8n/kustomization.yaml → n8n
GitHub Actions at .github/workflows/build-codev.yaml → build-codev

Rules:

For Helm: use chart_name from Chart.yaml (already parsed)
For Kustomize: use the parent directory name (or parent/grandparent if parent is generic like overlays)
For Dockerfiles: walk up from the Dockerfile, skip generic names (app, src, docker), take the first meaningful directory name
For Actions: use the workflow filename stem
Full path remains in the detail section of the markdown report
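
The path-based rules might be sketched as below. The function name and the `GENERIC_DIRS` set are assumptions (the set contains only the generic names mentioned in the rules), the Kustomize rule is read as "use the nearest non-generic ancestor", and the Helm rule is omitted because it uses `chart_name` from Chart.yaml rather than the path.

```python
from pathlib import PurePosixPath

# Generic directory names to skip, taken from the examples in the rules.
GENERIC_DIRS = {"app", "src", "docker", "overlays"}


def infer_component_name(kind: str, path: str) -> str:
    """Derive a short display name from an artifact's relative path."""
    p = PurePosixPath(path)
    if kind == "actions":
        return p.stem  # workflow filename without extension
    # Dockerfile and Kustomize: nearest non-generic ancestor directory.
    for part in reversed(p.parent.parts):
        if part not in GENERIC_DIRS:
            return part
    return p.name  # file sits at the scan root
```

With the examples from the heuristics, `components/remote-development/codev/Dockerfile` yields `codev` and `.github/workflows/build-codev.yaml` yields `build-codev`.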

6b. Catalogue Integration

Cross-reference discovered images and actions against the CascadeGuard catalogue:

Look up images in our registry (cascadeguard managed images)
Report known CVEs for discovered image tags
For managed images: show what CVEs our pinned version resolves
For actions: check against our actions policy catalogue

Requires:

API endpoint or local catalogue file for image/action metadata
Integration with vuln report data (Grype/Trivy results if available)

6c. Website Links in Recommendations

Every recommendation should link to a cascadeguard.com article explaining the issue:

Recommendation	Article URL
Pin container images to digests	`https://cascadeguard.com/docs/why-pin-images`
Pin GitHub Actions to commit SHAs	`https://cascadeguard.com/docs/why-pin-actions`
Use Kustomize images transformer	`https://cascadeguard.com/docs/kustomize-image-pinning`
Pin Helm chart image tags	`https://cascadeguard.com/docs/helm-image-pinning`

In the CLI summary table: deduplicate actions across kinds and show the link once at the bottom. In the markdown report: inline links in recommendations.

Requires:

Content creation for each article on cascadeguard.com
URL structure decision (docs/ vs blog/ vs guides/)

6d. Unified `cascadeguard images pin` Command

cascadeguard images pin should handle all artifact types, not just Dockerfiles:

Dockerfiles: rewrite FROM lines with digest-pinned references
Helm charts: update values.yaml image tags with digests
Kustomize: add/update images transformer entries in kustomization.yaml
Compose files: rewrite image: fields with digest-pinned references
Raw K8s manifests: rewrite image: fields with digest-pinned references

This means the scan recommendation is always cascadeguard images pin regardless of artifact type — the command figures out the right strategy based on the file type.

6e. CLI Summary Deduplication

When the same action applies to multiple kinds (e.g. cascadeguard images pin for Dockerfiles, Helm, Kustomize, Compose, K8s), show it once at the bottom of the CLI output instead of per-kind:

  Kind                Found   Issues
  Dockerfiles             2        2
  GitHub Actions          2        2
  Helm Charts            24       15
  Kustomize              38        8

  Recommended actions:
    cascadeguard images pin    — 25 artifacts across 3 kinds
                                 https://cascadeguard.com/docs/why-pin-images
    cascadeguard actions pin   — 2 workflows
                                 https://cascadeguard.com/docs/why-pin-actions
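
One possible way to produce the "Recommended actions" block is to group issues by the command that resolves them. The function name and the `(kind, command)` input shape are assumptions for this sketch.

```python
from collections import defaultdict


def group_recommendations(issues: list[tuple[str, str]]) -> dict[str, dict]:
    """Group (kind, command) pairs into one summary entry per command."""
    grouped: dict[str, dict] = defaultdict(lambda: {"artifacts": 0, "kinds": set()})
    for kind, command in issues:
        grouped[command]["artifacts"] += 1
        grouped[command]["kinds"].add(kind)
    return dict(grouped)
```

Fed with 2 Dockerfile, 15 Helm and 8 Kustomize issues for `cascadeguard images pin`, this gives 25 artifacts across 3 kinds, matching the sample output.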

6f. Markdown Report Links

In the markdown report:

Link image names to our registry page if we host them (e.g. [nginx:alpine](https://cascadeguard.com/images/nginx))
Link recommendations to website articles
Link CascadeGuard commands to CLI docs

Implementation Priority

1. Component name inference (6a): quick win, improves readability immediately
2. Unified images pin (6d): makes the recommendation story consistent
3. CLI summary deduplication (6e): cleaner output
4. Website links (6c): needs content, but URL structure can be decided now
5. Markdown links (6f): follows from 6c
6. Catalogue integration (6b): largest effort, needs API/data work
