Plan: cascadeguard scan CLI Command & One-Shot Install Script
Status: Draft
Domain: cascadeguard.com
Overview
Add a cascadeguard scan subcommand that discovers container-related artifacts in a project directory (Dockerfiles, CI workflows, Compose files, Kubernetes manifests), presents an interactive selection UI, analyses the selected artifacts, and produces a structured report. Alongside the CLI command, ship a one-shot install script (install.sh) hosted at https://get.cascadeguard.com that bootstraps a temporary environment and runs the scan in a single curl | sh invocation.
The scan command reuses existing parsing capabilities already in app.py — specifically CascadeGuardTool.parse_dockerfile_base_images(), parse_image_reference(), and ActionsPinner._USES_RE for action reference detection — and extends them with new discoverer modules for Compose, Kubernetes, and GitLab CI artifacts.
No new Python dependencies are required for the core implementation (Phases 1–4). The only existing dependency is pyyaml, which already covers YAML parsing needs.
Architecture
graph TD
    CLI["cascadeguard scan<br/>CLI Entry Point"]
    CLI --> Discovery["Discovery Engine<br/>run_scan()"]
    Discovery --> DD["Dockerfile<br/>Discoverer"]
    Discovery --> AD["CI Actions<br/>Discoverer"]
    Discovery --> CD["Compose/Stack<br/>Discoverer"]
    Discovery --> KD["Kubernetes<br/>Discoverer"]
    Discovery --> GD["GitLab CI<br/>Discoverer<br/>(stretch)"]
    DD --> Artifacts["List[DiscoveredArtifact]"]
    AD --> Artifacts
    CD --> Artifacts
    KD --> Artifacts
    GD --> Artifacts
    Artifacts --> UI["Interactive Selection<br/>curses / fallback"]
    Artifacts -->|--non-interactive| Analysis
    UI --> Analysis["Analysis Engine<br/>report.py"]
    Analysis --> Output["Report Output<br/>text / json / yaml"]
    Output -->|--output FILE| File["File"]
    Output --> Stdout["stdout"]
    subgraph "Reused from app.py"
        Parse["parse_dockerfile_base_images()"]
        ImgRef["parse_image_reference()"]
        UsesRE["ActionsPinner._USES_RE"]
    end
    DD -.-> Parse
    DD -.-> ImgRef
    AD -.-> UsesRE
Scan Command Flow
sequenceDiagram
    participant User
    participant CLI as cascadeguard scan
    participant Disc as Discovery Engine
    participant UI as Interactive UI
    participant Anal as Analysis Engine
    participant Out as Report Output
    User->>CLI: cascadeguard scan [--dir .] [flags]
    CLI->>Disc: discover(root_dir)
    loop Each Discoverer
        Disc->>Disc: glob for matching files
        Disc->>Disc: parse metadata from matched files
        Disc-->>Disc: yield DiscoveredArtifact per match
    end
    Disc-->>CLI: List[DiscoveredArtifact]
    alt --non-interactive
        CLI->>Anal: analyse(all_artifacts)
    else interactive (default)
        CLI->>UI: present(artifacts, grouped by kind)
        User->>UI: toggle selections
        UI-->>CLI: selected artifacts
        CLI->>Anal: analyse(selected_artifacts)
    end
    Anal-->>Out: ScanResult
    alt --output FILE
        Out->>Out: write to FILE
    else default
        Out->>Out: print to stdout
    end
Data Models
DiscoveredArtifact
from dataclasses import dataclass

@dataclass
class DiscoveredArtifact:
    kind: str      # "dockerfile" | "actions" | "compose" | "k8s" | "gitlab-ci"
    path: str      # relative to scan root, e.g. "services/api/Dockerfile"
    details: dict  # parsed metadata, varies by kind
Details by kind:
| kind | details keys | example values |
|---|---|---|
| dockerfile | base_images, stages, args, env | ["python:3.11-slim", "node:20-alpine"], ["base", "builder", "final"] |
| actions | action_refs, workflow_name | ["actions/checkout@v4", "docker/build-push-action@v5"], "CI" |
| compose | services, image_refs | ["api", "db", "redis"], ["postgres:16", "redis:7-alpine"] |
| k8s | api_version, kind, image_refs, namespace | "apps/v1", "Deployment", ["nginx:1.25"], "default" |
| gitlab-ci | image_refs, stages | ["python:3.11"], ["build", "test", "deploy"] |
ScanResult
from dataclasses import dataclass

# ArtifactAnalysis and ScanSummary are defined first so that both names
# exist when ScanResult's field annotations are evaluated.
@dataclass
class ArtifactAnalysis:
    artifact: DiscoveredArtifact
    findings: list[str]         # human-readable observations
    recommendations: list[str]  # actionable suggestions
    risk_level: str             # "info" | "low" | "medium" | "high"

@dataclass
class ScanSummary:
    total_discovered: int
    total_selected: int
    total_images: int        # unique container image references
    total_actions: int       # unique action references
    by_kind: dict[str, int]  # count per artifact kind
    by_risk: dict[str, int]  # count per risk level

@dataclass
class ScanResult:
    root_dir: str
    discovered: list[DiscoveredArtifact]
    selected: list[DiscoveredArtifact]
    analysis: list[ArtifactAnalysis]
    summary: ScanSummary
Discovery Modules
Each discoverer implements a common protocol:
from pathlib import Path
from typing import Protocol

class Discoverer(Protocol):
    def discover(self, root: Path) -> list[DiscoveredArtifact]: ...
All discoverers use pathlib.Path.rglob() for file matching: each module declares its glob patterns and rglob handles recursive traversal. A shared _excluded_dirs set (.git, node_modules, vendor, __pycache__, .venv, venv) is applied to the results afterwards, which keeps the discovery code declarative and readable. Note that rglob still descends into the excluded directories; they are only dropped from the output.
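
The rglob-plus-filter approach described above could look roughly like the sketch below. It is a minimal illustration, not the planned implementation: the helper iter_matches and the class ContainerfileDiscoverer are hypothetical names, and the patterns omit the leading **/ because rglob() already matches at any depth.

```python
from dataclasses import dataclass, field
from pathlib import Path

# Directory names to skip, as listed in the plan.
_EXCLUDED_DIRS = {".git", "node_modules", "vendor", "__pycache__", ".venv", "venv"}


@dataclass
class DiscoveredArtifact:
    kind: str
    path: str
    details: dict = field(default_factory=dict)


def iter_matches(root: Path, patterns: list[str]) -> list[Path]:
    """Return relative paths of files under root matching any pattern."""
    seen: set[Path] = set()
    for pattern in patterns:
        for candidate in root.rglob(pattern):
            rel = candidate.relative_to(root)
            # Test only the directory parts of the relative path, so the
            # scan root's own name and the file name are never excluded.
            if any(part in _EXCLUDED_DIRS for part in rel.parts[:-1]):
                continue
            if candidate.is_file():
                seen.add(rel)
    return sorted(seen)


class ContainerfileDiscoverer:
    """Hypothetical discoverer; satisfies the Discoverer protocol structurally."""

    patterns = ["Dockerfile", "Dockerfile.*", "*.dockerfile",
                "Containerfile", "Containerfile.*"]

    def discover(self, root: Path) -> list[DiscoveredArtifact]:
        return [
            DiscoveredArtifact(kind="dockerfile", path=rel.as_posix())
            for rel in iter_matches(root, self.patterns)
        ]
```

Because the class only needs a matching discover() method, it does not have to inherit from anything, which is the point of using a Protocol here.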
1. Dockerfile Discoverer
| Attribute | Value |
|---|---|
| Globs | **/Dockerfile, **/Dockerfile.*, **/*.dockerfile, **/Containerfile, **/Containerfile.* |
| Parser | Reuses CascadeGuardTool.parse_dockerfile_base_images() for FROM extraction. Additionally parses ARG and ENV directives, and identifies build stage names from FROM ... AS <name>. |
| Output kind | dockerfile |
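
As a rough illustration of the extra parsing (stage names, ARG and ENV), the sketch below uses its own regular expressions. These are assumptions for the example only: the real FROM extraction is CascadeGuardTool.parse_dockerfile_base_images(), which is not reproduced here, and the sketch ignores line continuations and parser directives.

```python
import re

# Hypothetical patterns for illustration; not the ones used in app.py.
_FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(?P<image>\S+)"
    r"(?:\s+AS\s+(?P<stage>\S+))?\s*$",
    re.IGNORECASE,
)
_KEY_RE = re.compile(
    r"^\s*(?P<directive>ARG|ENV)\s+(?P<name>[A-Za-z_][A-Za-z0-9_]*)",
    re.IGNORECASE,
)


def parse_dockerfile_details(text: str) -> dict:
    """Collect base images, stage names, ARG names and ENV names."""
    details = {"base_images": [], "stages": [], "args": [], "env": []}
    for line in text.splitlines():
        from_match = _FROM_RE.match(line)
        if from_match:
            image, stage = from_match["image"], from_match["stage"]
            # A FROM that names an earlier stage is not an external image.
            if image not in details["stages"]:
                details["base_images"].append(image)
            if stage:
                details["stages"].append(stage)
            continue
        key_match = _KEY_RE.match(line)
        if key_match:
            bucket = "args" if key_match["directive"].upper() == "ARG" else "env"
            details[bucket].append(key_match["name"])
    return details
```

The returned keys mirror the details keys listed for the dockerfile kind in the data model table.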
2. CI Actions Discoverer
| Attribute | Value |
|---|---|
| Globs | .github/workflows/*.yml, .github/workflows/*.yaml |
| Parser | Reuses ActionsPinner._USES_RE regex to extract third-party action references. Parses workflow name field from YAML. |
| Output kind | actions |
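
The extraction step might look like the following sketch. The pattern below is an assumed stand-in: the actual ActionsPinner._USES_RE is not shown in this plan and may differ. Skipping local (./) references is likewise an assumption about what counts as third-party.

```python
import re

# Assumed stand-in for ActionsPinner._USES_RE (real pattern not shown here).
_USES_RE = re.compile(
    r"^\s*(?:-\s*)?uses:\s*['\"]?(?P<ref>[^\s'\"#]+)",
    re.MULTILINE,
)


def extract_action_refs(workflow_text: str) -> list[str]:
    """Return unique action references in first-seen order."""
    refs: list[str] = []
    for match in _USES_RE.finditer(workflow_text):
        ref = match["ref"]
        # Local actions live in the same repository, so they are skipped.
        if ref.startswith("./"):
            continue
        if ref not in refs:
            refs.append(ref)
    return refs
```

Working on the raw text rather than the parsed YAML keeps trailing comments and quoting out of the result without needing a YAML round trip.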
3. Compose/Stack Discoverer
| Attribute | Value |
|---|---|
| Globs | **/docker-compose*.yml, **/docker-compose*.yaml, **/compose*.yml, **/compose*.yaml |
| Parser | YAML parse → extract services keys and image fields from each service. Detects build directives pointing to Dockerfiles. |
| Output kind | compose |
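
A minimal sketch of the Compose parsing step, assuming the file has already been loaded into a mapping (for example with yaml.safe_load from pyyaml). The build_contexts key is a hypothetical addition for the example; the plan only names services and image_refs.

```python
def extract_compose_details(doc: dict) -> dict:
    """Summarise a parsed Compose document."""
    services = doc.get("services") or {}
    details = {"services": [], "image_refs": [], "build_contexts": []}
    for name, spec in services.items():
        details["services"].append(name)
        spec = spec or {}
        image = spec.get("image")
        if image and image not in details["image_refs"]:
            details["image_refs"].append(image)
        build = spec.get("build")
        if isinstance(build, str):
            # Short form: build: ./dir
            details["build_contexts"].append(build)
        elif isinstance(build, dict):
            # Long form: build: {context: ./dir, dockerfile: ...}
            details["build_contexts"].append(build.get("context", "."))
    return details
```

Recording the build context gives the scan a way to relate a Compose service back to a Dockerfile found by the Dockerfile discoverer.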
4. Kubernetes / Infrastructure-as-Code Discoverers
The original monolithic “Kubernetes Manifest Discoverer” is split into tool-aware discoverers, each with appropriate detection heuristics and pinning recommendations.
4a. Helm Chart Discoverer
| Attribute | Value |
|---|---|
| Detection | Chart.yaml present in directory, or path contains charts/ |
| Globs | Scans for Chart.yaml files, then reads templates and values.yaml from the chart directory |
| Parser | Extracts image.repository / image.tag patterns from values.yaml. Scans templates for hardcoded image refs. |
| Output kind | helm |
| Recommendation | Override image tags via values.yaml — don’t edit templates directly |
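
One way to find image.repository / image.tag pairs in values.yaml is a recursive walk over the parsed mapping, sketched below. The function name and the convention that any mapping with a string repository key is an image block are assumptions for the example; charts that use other layouts would need extra handling.

```python
def find_helm_image_refs(values: dict, prefix: str = "") -> list[tuple[str, str]]:
    """Return (dotted values path, image ref) for each image-like mapping."""
    found: list[tuple[str, str]] = []
    for key, value in values.items():
        if not isinstance(value, dict):
            continue
        path = f"{prefix}.{key}" if prefix else str(key)
        repository = value.get("repository")
        if isinstance(repository, str):
            tag = value.get("tag")
            # YAML may parse an unquoted tag such as 1.25 as a float,
            # so the tag is formatted as text rather than assumed to be str.
            ref = f"{repository}:{tag}" if tag not in (None, "") else repository
            found.append((path, ref))
        else:
            found.extend(find_helm_image_refs(value, path))
    return found
```

Keeping the dotted path alongside the reference makes it possible to tell the user which values key to override.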
4b. Kustomize Discoverer
| Attribute | Value |
|---|---|
| Detection | kustomization.yaml or kustomization.yml present |
| Parser | Reads images transformer entries from kustomization file. Also scans referenced resources for image refs. |
| Output kind | kustomize |
| Recommendation | Use the images transformer in kustomization.yaml to pin digests |
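
Reading the images transformer could be sketched as follows, assuming the kustomization file has already been parsed into a mapping. The entry fields used here (name, newName, newTag, digest) are the ones Kustomize defines for this transformer; the function name and the shape of the returned dicts are hypothetical.

```python
def read_kustomize_images(kustomization: dict) -> list[dict]:
    """Summarise the images transformer entries of a parsed kustomization."""
    results = []
    for entry in kustomization.get("images") or []:
        # newName replaces the image name when present.
        name = entry.get("newName") or entry.get("name")
        digest = entry.get("digest")
        tag = entry.get("newTag")
        if digest:
            ref = f"{name}@{digest}"
        elif tag not in (None, ""):
            ref = f"{name}:{tag}"
        else:
            ref = name
        results.append(
            {"name": entry.get("name"), "ref": ref, "pinned": bool(digest)}
        )
    return results
```

An entry counts as pinned only when it carries a digest, which matches the recommendation in the table above.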
4c. Raw Kubernetes Manifest Discoverer
| Attribute | Value |
|---|---|
| Globs | **/*.yaml, **/*.yml |
| Heuristic | File must contain both apiVersion and kind top-level keys. Skips files already claimed by Helm, Kustomize, Compose, or workflow discoverers. |
| Parser | Walks spec.containers[*].image, spec.initContainers[*].image, and spec.template.spec.containers[*].image paths. Handles multi-document YAML (--- separators). |
| Output kind | k8s |
| Recommendation | cascadeguard images pin --file <path> |
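
The heuristic and the three image paths listed above can be sketched like this. The input is assumed to be the documents of one file as produced by yaml.safe_load_all (so multi-document files arrive as several mappings, possibly with empty documents); the function names are hypothetical. Deeper layouts such as a CronJob's jobTemplate are not covered by the listed paths and are therefore not covered here either.

```python
def is_k8s_manifest(doc) -> bool:
    """A document qualifies if it has both apiVersion and kind at top level."""
    return isinstance(doc, dict) and "apiVersion" in doc and "kind" in doc


def extract_k8s_images(docs) -> list[dict]:
    """Return one details dict per Kubernetes document in a file."""
    results = []
    for doc in docs:
        if not is_k8s_manifest(doc):
            continue
        spec = doc.get("spec") or {}
        # Pod-level spec first, then the pod template used by workloads.
        pod_specs = [spec, (spec.get("template") or {}).get("spec") or {}]
        images: list[str] = []
        for pod_spec in pod_specs:
            for key in ("containers", "initContainers"):
                for container in pod_spec.get(key) or []:
                    image = container.get("image")
                    if image and image not in images:
                        images.append(image)
        results.append({
            "api_version": doc["apiVersion"],
            "kind": doc["kind"],
            "namespace": (doc.get("metadata") or {}).get("namespace"),
            "image_refs": images,
        })
    return results
```

Documents without images (a ConfigMap, for instance) still produce an entry with an empty image_refs list, so the selection UI can show them unselected.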
5. Future Discoverers
| Tool | Detection | Priority | Pinning approach |
|---|---|---|---|
| Flux | apiVersion: source.toolkit.fluxcd.io or kustomize.toolkit.fluxcd.io | Medium | Pin in HelmRelease values or Kustomization patches |
| ArgoCD | apiVersion: argoproj.io with Application/ApplicationSet | Medium | Pin in source repo, not the ArgoCD manifest |
| Terraform/OpenTofu | *.tf files with container or image blocks | Low | Pin in the .tf resource |
| Pulumi | Pulumi.yaml in root | Low | Pin in the Pulumi program |
| GitLab CI | .gitlab-ci.yml | Low | Pin image: fields in job definitions |
Interactive Selection UI
Design
- Built with Python curses module (no new dependency)
- Checkbox-style selector with artifacts grouped by kind as collapsible categories
- Keyboard controls: ↑/↓ navigate, Space toggles selection, a selects all, n deselects all, Enter confirms
Fallback Behaviour
| Condition | Behaviour |
|---|---|
| --non-interactive flag | Skip UI, select all artifacts |
| stdin is not a TTY (piped) | Fall back to numbered list on stderr, read selections from stdin |
| Windows without curses | Fall back to simple numbered list prompt |
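
The fallback table can be read as a small decision function, sketched below. The mode names and the "1,3-5" input syntax for the numbered list are assumptions made for the example; the plan does not fix either.

```python
def choose_selection_mode(non_interactive: bool, stdin_is_tty: bool,
                          curses_available: bool) -> str:
    """Map the conditions in the fallback table to a selection mode."""
    if non_interactive:
        return "all"
    if not stdin_is_tty or not curses_available:
        return "numbered-list"
    return "curses"


def curses_is_available() -> bool:
    """Importing curses fails with ImportError where it is not provided."""
    try:
        import curses  # noqa: F401
    except ImportError:
        return False
    return True


def parse_selection(text: str, count: int) -> list[int]:
    """Turn input such as '1,3-5' into sorted zero-based indices.

    Numbers outside 1..count are ignored; malformed tokens raise ValueError.
    """
    chosen: set[int] = set()
    for token in text.replace(",", " ").split():
        start, _, end = token.partition("-")
        low = int(start)
        high = int(end) if end else low
        for number in range(low, high + 1):
            if 1 <= number <= count:
                chosen.add(number - 1)
    return sorted(chosen)
```

In use, the caller would pass sys.stdin.isatty() and curses_is_available() into choose_selection_mode, which keeps the decision itself free of I/O and easy to unit test.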
Display Format
CascadeGuard Scan — Select artifacts to analyse
Dockerfiles (3)
[x] services/api/Dockerfile
[x] services/worker/Dockerfile
[ ] tools/dev.Dockerfile
GitHub Actions (2)
[x] .github/workflows/ci.yml
[x] .github/workflows/release.yml
Compose Files (1)
[x] docker-compose.yml
Kubernetes Manifests (4)
[x] k8s/deployment.yaml
[x] k8s/cronjob.yaml
[ ] k8s/configmap.yaml
[ ] k8s/service.yaml
↑↓ Navigate Space Toggle a All n None Enter Confirm
Analysis & Reporting
Analysis Rules (per artifact kind)
| Kind | Analysis checks |
|---|---|
| dockerfile | Unpinned base images (no digest/tag), use of latest tag, multi-stage build detection, USER root warnings, COPY --from references |
| actions | Unpinned action refs (tag vs SHA), use of @main/@master, known deprecated actions |
| compose | Unpinned image references, exposed ports, privileged mode, volume mounts to sensitive paths |
| k8s | Unpinned image references, imagePullPolicy: Always without digest, securityContext gaps, containers running as root (runAsNonRoot not set) |
| gitlab-ci | Unpinned image references, use of latest tag |
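
The pinning checks shared by several kinds could be sketched as two small classifiers. The category names returned here are assumptions for the example, and mapping them to risk levels is left open because the plan does not define that mapping.

```python
import re

_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def classify_image_ref(ref: str) -> str:
    """Return 'digest', 'latest', 'untagged' or 'tag' for an image reference."""
    if "@sha256:" in ref:
        return "digest"
    # Look for the tag only in the last path segment, so a registry port
    # such as localhost:5000/app is not mistaken for a tag.
    last_segment = ref.rsplit("/", 1)[-1]
    _, separator, tag = last_segment.partition(":")
    if not separator:
        return "untagged"
    return "latest" if tag == "latest" else "tag"


def classify_action_ref(ref: str) -> str:
    """Return 'sha', 'branch' or 'tag' for an owner/repo@version reference."""
    _, _, version = ref.partition("@")
    if _SHA_RE.match(version):
        return "sha"
    if version in {"main", "master"}:
        return "branch"
    # Anything else is treated as a mutable tag-like reference.
    return "tag"
```

For example, classify_image_ref("python:3.11-slim") yields "tag", which corresponds to the "not pinned to a digest" finding in the sample text report.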
Output Formats
Text (default):
CascadeGuard Scan Report
========================
Scanned: /path/to/project
Artifacts: 10 discovered, 8 selected
Dockerfiles (3)
services/api/Dockerfile
⚠ Base image python:3.11-slim is not pinned to a digest
ℹ Multi-stage build with 3 stages detected
→ Recommendation: Pin base images to digests for reproducible builds
Summary: 2 high, 3 medium, 5 info findings
JSON: Full ScanResult serialised as JSON.
YAML: Full ScanResult serialised as YAML.
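
Because the models are plain dataclasses, serialisation can lean on dataclasses.asdict, which recurses into nested dataclasses, lists and dicts. The sketch below uses trimmed copies of two models from this plan; to_json is a hypothetical helper name.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DiscoveredArtifact:
    kind: str
    path: str
    details: dict


@dataclass
class ArtifactAnalysis:
    artifact: DiscoveredArtifact
    findings: list[str]
    recommendations: list[str]
    risk_level: str


def to_json(result) -> str:
    """Serialise any dataclass instance (including nested ones) to JSON."""
    return json.dumps(asdict(result), indent=2, sort_keys=True)
```

The same asdict() output could presumably be handed to yaml.safe_dump from pyyaml for the YAML format, since it contains only plain dicts, lists, strings and numbers.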
CLI Interface
cascadeguard scan [--dir PATH] [--non-interactive] [--format json|text] [--output FILE]
| Flag | Default | Description |
|---|---|---|
| --dir | . | Root directory to scan |
| --non-interactive | false | Scan all discovered artifacts without prompting |
| --format | text | Output format: text, json |
| --output | stdout | Write results to file instead of stdout |
Integration with existing CLI
The scan subcommand is added to build_parser() in app.py alongside existing commands (validate, enrol, check, etc.). A new cmd_scan handler is added to the command dispatch dict in main(). The handler delegates to scan.run_scan() which orchestrates discovery → selection → analysis → output.
One-Shot Install Script (install.sh)
Hosted at https://get.cascadeguard.com
Usage:
curl -sSL https://get.cascadeguard.com | sh
Or with options:
curl -sSL https://get.cascadeguard.com | sh -s -- --keep --format json
Script Flow
graph TD
    Start["install.sh starts"] --> DetectOS["Detect OS + arch<br/>uname -s, uname -m"]
    DetectOS --> CheckPython["Check Python ≥ 3.11"]
    CheckPython -->|Found| CreateVenv["Create temp venv<br/>in mktemp -d"]
    CheckPython -->|Not found| TryBinary["Download pre-built binary<br/>(future: GitHub release)"]
    CreateVenv --> PipInstall["pip install cascadeguard-tool<br/>from PyPI (or GitHub release)"]
    PipInstall --> RunScan["Run: cascadeguard scan<br/>in current directory"]
    TryBinary --> RunScan
    RunScan --> Cleanup{"--keep flag?"}
    Cleanup -->|No| Remove["Remove temp dir"]
    Cleanup -->|Yes| Keep["Keep temp dir,<br/>print path"]
    Remove --> Done["Exit"]
    Keep --> Done
Script Behaviour
| Concern | Approach |
|---|---|
| OS detection | uname -s → Linux/Darwin; uname -m → x86_64/arm64/aarch64 |
| Python check | python3 --version, require ≥ 3.11 |
| Isolation | mktemp -d for temp venv, cleaned up on exit (trap EXIT) |
| Installation | pip install cascadeguard-tool into temp venv (no system pollution) |
| Passthrough args | All args after -- forwarded to cascadeguard scan |
| --keep flag | Consumed by install.sh, prevents cleanup, prints venv path |
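| Error handling | set -eu, plus pipefail where the shell supports it (the script is piped to sh, and plain sh implementations such as dash may reject set -o pipefail); meaningful error messages on failure |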
| No root required | Runs entirely in user space |
File Structure
app/
├── app.py # Add scan subcommand + cmd_scan handler
├── scan/
│ ├── __init__.py # Public API: run_scan()
│ ├── discoverers.py # All Discoverer implementations
│ ├── models.py # DiscoveredArtifact, ScanResult, ArtifactAnalysis dataclasses
│ ├── ui.py # Interactive selection (curses + fallback)
│ └── report.py # Analysis engine and output formatting
├── tests/
│ ├── test_scan_discovery.py # Unit tests for discoverers
│ ├── test_scan_analysis.py # Unit tests for analysis rules
│ └── test_scan_ui.py # UI tests (mocked curses)
install.sh # One-shot wrapper script
Implementation Phases
Phase 1: Core Discovery
Deliverables:
- scan/models.py: DiscoveredArtifact, ScanResult, ArtifactAnalysis, ScanSummary dataclasses
- scan/discoverers.py: DockerfileDiscoverer and CIActionsDiscoverer implementations
- scan/__init__.py: run_scan() orchestrator (discovery only, no UI or analysis yet)
- app.py: scan subcommand wired into build_parser() and cmd_scan in dispatch dict
- tests/test_scan_discovery.py: Unit tests for both discoverers with fixture directories
- Output: plain list of discovered artifacts to stdout
Reuse: CascadeGuardTool.parse_dockerfile_base_images(), ActionsPinner._USES_RE
Phase 2: Extended Discovery
Deliverables:
- scan/discoverers.py: Add ComposeDiscoverer and KubernetesDiscoverer
- tests/test_scan_discovery.py: Extended tests with compose and k8s fixtures
- Integration test: scan a fixture project directory containing all artifact types
Key decisions:
- K8s detection via apiVersion + kind field presence (not file path heuristics)
- Multi-document YAML support for K8s manifests
- Compose file detection by filename pattern, not content heuristics
Phase 3: Interactive UI
Deliverables:
- scan/ui.py: InteractiveSelector class with curses implementation
- scan/ui.py: FallbackSelector for non-TTY / Windows environments
- scan/__init__.py: Wire UI into run_scan() flow
- --non-interactive flag support
- tests/test_scan_ui.py: Tests with mocked curses
Phase 4: Analysis & Reporting
Deliverables:
- scan/report.py: AnalysisEngine with per-kind analysis rules
- scan/report.py: ReportFormatter with text and JSON output
- scan/__init__.py: Wire analysis and reporting into run_scan() flow
- --format and --output flag support
- tests/test_scan_analysis.py: Tests for analysis rules and output formatting
Phase 5: Install Script & Distribution
Deliverables:
- install.sh: One-shot wrapper script
- DNS/hosting setup for get.cascadeguard.com → serve install.sh
- Smoke tests: run install.sh in clean Docker containers (Ubuntu, Alpine, macOS sim)
- README updates with install instructions
Technical Decisions & Constraints
| Decision | Rationale |
|---|---|
| No new Python dependencies (Phases 1–4) | Keep the tool lightweight; pyyaml + stdlib covers all needs |
| Reuse existing parsers from app.py | Dockerfile and Actions parsing is already battle-tested |
| pathlib.Path.rglob() for file discovery | Declarative glob patterns per discoverer; cleaner than manual os.walk. Excluded dirs (.git, node_modules, etc.) filtered on results |
| curses for interactive UI | stdlib, no dependency; works on Linux/macOS out of the box |
| Simple numbered-list fallback | Covers Windows and piped-stdin cases without complexity |
| K8s detection via content heuristics | File path alone is unreliable; apiVersion + kind is definitive |
| GitLab CI as stretch goal | Lower priority; most users are on GitHub Actions |
| dataclass for models | Simple, no dependency, good for serialisation to JSON/YAML |
| Discoverer protocol (not ABC) | Keeps it lightweight; structural subtyping via Protocol |
| install.sh uses temp venv | Zero system pollution; clean up on exit |
Phase 6: Scan Report Enhancements
6a. Component Name Inference
Derive a short component name from the folder structure instead of using the full file path as the title.
Heuristics:
- Dockerfile at components/remote-development/codev/Dockerfile → component name remote-development/codev or just codev
- Helm chart at components/headlamp/charts/headlamp-0.39.0/headlamp → headlamp
- Kustomize at components/n8n/kustomization.yaml → n8n
- GitHub Actions at .github/workflows/build-codev.yaml → build-codev
Rules:
- For Helm: use chart_name from Chart.yaml (already parsed)
- For Kustomize: use the parent directory name (or parent/grandparent if parent is generic like overlays)
- For Dockerfiles: walk up from the Dockerfile, skip generic names (app, src, docker), take the first meaningful directory name
- For Actions: use the workflow filename stem
- Full path remains in the detail section of the markdown report
6b. Catalogue Integration
Cross-reference discovered images and actions against the CascadeGuard catalogue:
- Look up images in our registry (cascadeguard managed images)
- Report known CVEs for discovered image tags
- For managed images: show what CVEs our pinned version resolves
- For actions: check against our actions policy catalogue
Requires:
- API endpoint or local catalogue file for image/action metadata
- Integration with vuln report data (Grype/Trivy results if available)
6c. Website Links in Recommendations
Every recommendation should link to a cascadeguard.com article explaining the issue:
| Recommendation | Article URL |
|---|---|
| Pin container images to digests | https://cascadeguard.com/docs/why-pin-images |
| Pin GitHub Actions to commit SHAs | https://cascadeguard.com/docs/why-pin-actions |
| Use Kustomize images transformer | https://cascadeguard.com/docs/kustomize-image-pinning |
| Pin Helm chart image tags | https://cascadeguard.com/docs/helm-image-pinning |
In the CLI summary table: deduplicate actions across kinds and show the link once at the bottom. In the markdown report: inline links in recommendations.
Requires:
- Content creation for each article on cascadeguard.com
- URL structure decision (docs/ vs blog/ vs guides/)
6d. Unified cascadeguard images pin Command
cascadeguard images pin should handle all artifact types, not just Dockerfiles:
- Dockerfiles: rewrite FROM lines with digest-pinned references
- Helm charts: update values.yaml image tags with digests
- Kustomize: add/update images transformer entries in kustomization.yaml
- Compose files: rewrite image: fields with digest-pinned references
- Raw K8s manifests: rewrite image: fields with digest-pinned references
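This means the scan recommendation for any image-bearing artifact is always cascadeguard images pin, regardless of artifact type; the command picks the right strategy based on the file type. GitHub Actions workflows keep their own cascadeguard actions pin recommendation.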
6e. CLI Summary Deduplication
When the same action applies to multiple kinds (e.g. cascadeguard images pin for Dockerfiles, Helm, Kustomize, Compose, K8s), show it once at the bottom of the CLI output instead of per-kind:
Kind Found Issues
Dockerfiles 2 2
GitHub Actions 2 2
Helm Charts 24 15
Kustomize 38 8
Recommended actions:
cascadeguard images pin — 25 artifacts across 3 kinds
https://cascadeguard.com/docs/why-pin-images
cascadeguard actions pin — 2 workflows
https://cascadeguard.com/docs/why-pin-actions
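
The deduplication itself amounts to grouping issues by recommended command, as in the sketch below. The input shape (kind, artifact path, command) and the function name are assumptions for the example; the two URLs are the ones shown in the sample output above.

```python
from collections import defaultdict

# Article links taken from the sample output in this section.
DOC_LINKS = {
    "cascadeguard images pin": "https://cascadeguard.com/docs/why-pin-images",
    "cascadeguard actions pin": "https://cascadeguard.com/docs/why-pin-actions",
}


def dedupe_recommendations(issues: list[tuple[str, str, str]]) -> list[dict]:
    """Group (kind, artifact path, command) issues by command.

    Returns one entry per command, in first-seen order, with the number of
    distinct artifacts and kinds it applies to.
    """
    artifacts: dict[str, set[str]] = defaultdict(set)
    kinds: dict[str, set[str]] = defaultdict(set)
    for kind, path, command in issues:
        artifacts[command].add(path)
        kinds[command].add(kind)
    return [
        {
            "command": command,
            "artifacts": len(artifacts[command]),
            "kinds": len(kinds[command]),
            "url": DOC_LINKS.get(command),
        }
        for command in artifacts
    ]
```

The text formatter would then print each entry once under "Recommended actions", followed by its link.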
6f. Markdown Report Links
In the markdown report:
- Link image names to our registry page if we host them (e.g. [nginx:alpine](https://cascadeguard.com/images/nginx))
- Link recommendations to website articles
- Link CascadeGuard commands to CLI docs
Implementation Priority
- Component name inference (6a) — quick win, improves readability immediately
- Unified images pin (6d) — makes the recommendation story consistent
- CLI summary deduplication (6e) — cleaner output
- Website links (6c) — needs content, but URL structure can be decided now
- Markdown links (6f) — follows from 6c
- Catalogue integration (6b) — largest effort, needs API/data work