Design: Contextual Vulnerability Recommendation Engine

Issue: CAS-85
Status: Draft
Author: CTO

Problem Statement

CascadeGuard secure images will always carry some vulnerabilities. A blanket CVE exemption is too blunt — it ignores the fact that the same CVE poses vastly different risk levels depending on who is running the image and how they deploy it. We need a system that asks the right questions about a customer’s business context and workload deployment, then cross-references those answers against the image’s actual vulnerability data to produce personalised, actionable recommendations rather than a binary “fix or exempt” decision.

Design Overview

Core Concept: Risk Profiles → Recommendation Engine

+-----------------+   +------------------+    +-------------------------+
| Company Profile | + | Workload Profile | -> | Risk-Weighted Vulns     |
| (10 questions)  |   | (10 questions)   |    | + Recommendations       |
+-----------------+   +------------------+    +-------------------------+
        |                      |                          |
   Risk factors           Exposure factors         Per-CVE actions:
   (jurisdiction,         (network, env,           - Must Fix (SLA)
    compliance,            data sensitivity,        - Recommended Fix
    industry)              runtime config)          - Acceptable Risk
                                                    - Mitigatable

Company Profile Questionnaire (~10 questions)

These questions establish the regulatory and organisational risk context:

#	Question	Options	Risk Signal
1	Primary jurisdiction?	UK, EU, US, APAC, Other	Determines applicable regulations
2	Industry sector?	Financial services, Healthcare, Government, Technology, Retail, Other	Sector-specific compliance
3	Subject to specific regulations?	PCI-DSS, HIPAA, SOC 2, FedRAMP, DORA, NIS2, ISO 27001, None	Hard compliance requirements
4	Company size?	Startup (<50), SMB (50-500), Enterprise (>500)	Risk tolerance and audit exposure
5	Do you handle PII/PHI?	Yes at scale, Yes limited, No	Data protection obligations
6	Do you process payment card data?	Yes directly, Yes via processor, No	PCI scope
7	Subject to external security audits?	Annually, Quarterly, Ad-hoc, None	Compliance verification frequency
8	Supply chain security requirements?	SBOM required, Signed images required, Both, None	Provenance needs
9	Incident response SLA obligations?	< 24h, < 72h, < 7d, None	Breach notification windows
10	Risk appetite for known vulnerabilities?	Zero tolerance, Low (critical/high only), Moderate, Accept with mitigation	Overall posture

Workload Profile Questionnaire (~10 questions)

A customer may use the same image in multiple workloads (e.g. the same Node.js base image running an internet-facing API server and an internal batch job). Each deployment context gets its own workload profile, and the recommendation engine generates a separate recommendation set per (company profile, workload profile, image) combination. This means the same CVE can receive different actions for different deployments of the same image within the same organisation.

These questions establish the deployment and exposure context for a specific workload using a given image:

#	Question	Options	Risk Signal
1	Network exposure?	Internet-facing / DMZ, Internal network only, Air-gapped	Attack surface
2	Environment type?	Production, Staging, Development, CI/CD only	Blast radius
3	Data classification?	Public, Internal, Confidential, Restricted/Secret	Data sensitivity
4	Authentication to this workload?	Public / anonymous, Authenticated users, Service-to-service only	Access control
5	Container runtime privileges?	Privileged / host network, Standard, Restricted (read-only root, no caps)	Exploitability
6	Runs as root?	Yes, No, Unknown	Privilege escalation risk
7	Persistent storage with sensitive data?	Yes, No	Data exfiltration risk
8	Accepts untrusted input?	Yes user uploads/forms, Yes API input, No	Injection surface
9	Outbound network access?	Unrestricted, Restricted egress, No egress	C2/exfil potential
10	Update frequency tolerance?	Continuous (GitOps), Weekly maintenance window, Monthly, Quarterly	Remediation cadence

Recommendation Engine Logic

Step 1: Compute Risk Score Modifiers

Each answer maps to risk factor weights that modify the base severity of vulnerabilities:

Company modifiers handle jurisdiction multipliers and regulatory flags
Workload modifiers handle severity bumps based on network exposure, environment type, data sensitivity, and runtime configuration
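As a rough illustration of Step 1, the sketch below maps questionnaire answers to modifiers. The answer keys, the numeric weights, and the names `company_modifiers` and `workload_severity_bump` are all assumptions made for this example; the design does not specify concrete values.

```python
from dataclasses import dataclass, field

# All weights below are illustrative placeholders, not values from the design.
JURISDICTION_MULTIPLIER = {"UK": 1.1, "EU": 1.2, "US": 1.0, "APAC": 1.0, "Other": 1.0}
NETWORK_BUMP = {"internet_facing": 1, "internal": 0, "air_gapped": -1}
ENVIRONMENT_BUMP = {"production": 1, "staging": 0, "development": -1, "ci_cd": -1}


@dataclass
class CompanyModifiers:
    jurisdiction_multiplier: float = 1.0
    regulatory_flags: set = field(default_factory=set)


def company_modifiers(answers: dict) -> CompanyModifiers:
    """Derive the jurisdiction multiplier and regulatory flags from company answers."""
    regulations = set(answers.get("regulations", [])) - {"None"}
    return CompanyModifiers(
        jurisdiction_multiplier=JURISDICTION_MULTIPLIER.get(answers.get("jurisdiction"), 1.0),
        regulatory_flags=regulations,
    )


def workload_severity_bump(answers: dict) -> int:
    """Sum the severity bumps (in whole severity steps) implied by workload answers."""
    return (
        NETWORK_BUMP.get(answers.get("network_exposure"), 0)
        + ENVIRONMENT_BUMP.get(answers.get("environment"), 0)
    )
```

Unknown or missing answers fall back to neutral values here (multiplier 1.0, bump 0); a real engine might prefer to treat unknowns pessimistically.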

Step 2: CVE Classification

For each vulnerability, combine:

Base severity (from scanner)
CVSS vector components (network vs local, user interaction, etc.)
Package context (runtime vs build-only dependency)
Fix availability (fixed_version present or not)
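One possible way to combine the first three inputs is sketched below. It parses a CVSS v3 vector string and shifts the scanner severity by whole steps. The adjustment rules (one step down for a non-network attack vector, one step down for a build-only package) and the `workload_bump` parameter are assumptions for illustration only. Fix availability is left out because the next subsection treats it separately.

```python
SEVERITIES = ["low", "medium", "high", "critical"]


def parse_cvss_vector(vector: str) -> dict:
    """Split a CVSS v3 vector such as 'CVSS:3.1/AV:N/AC:L/...' into a metric map."""
    metrics = {}
    for part in vector.split("/"):
        key, _, value = part.partition(":")
        metrics[key] = value
    return metrics


def adjust_severity(base: str, vector: str, runtime_dependency: bool, workload_bump: int) -> str:
    """Shift the scanner severity up or down by whole steps (hypothetical rules)."""
    rank = SEVERITIES.index(base) + workload_bump
    if parse_cvss_vector(vector).get("AV") != "N":
        rank -= 1  # not exploitable over the network
    if not runtime_dependency:
        rank -= 1  # build-only package, absent at runtime
    # Clamp so the result always stays within the known severity levels.
    return SEVERITIES[max(0, min(rank, len(SEVERITIES) - 1))]
```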

Fix Availability and Regulatory Treatment

Fix availability is a first-class factor in the recommendation engine. When no upstream fix exists, the recommendation shifts from “patch it” to “mitigate or accept with documentation”:

Fix Status	Regulatory Treatment	Engine Behaviour
Fix available	All frameworks expect timely remediation (PCI-DSS: 30 days critical, 90 days high; FedRAMP: 30/90/180 by severity; DORA: “without undue delay”)	Must Fix or Recommended Fix with SLA based on profile
Fix pending (upstream aware, no release yet)	Frameworks generally accept documented compensating controls while awaiting a vendor fix. PCI-DSS 6.2 (v3.2.1 numbering; 6.3.3 in v4.0) and ISO 27001 A.12.6.1 (2013 edition; A.8.8 in the 2022 edition) both recognise that remediation depends on the vendor timeline.	Mitigatable: recommend runtime controls (network segmentation, WAF rules, restricted capabilities) with a review trigger when the fix ships
No fix / won’t fix	Regulators accept risk acceptance decisions when formally documented with justification and compensating controls. FedRAMP POA&M process, PCI-DSS compensating controls worksheet, and DORA’s risk assessment all provide mechanisms for this.	Acceptable Risk or Mitigatable depending on exploitability and exposure, with mandatory documented rationale
Not applicable (build-only dep, unreachable code path)	Not typically required to remediate, but must be documented if flagged during audit	Acceptable Risk with rationale noting non-runtime context

The engine records the fix status at recommendation generation time, so that when an upstream fix later becomes available, re-running the recommendation against a new scan automatically escalates previously mitigated items.
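The escalation check could look roughly like the sketch below. It assumes the fix status recorded at generation time is stored alongside each recommendation item; the status strings and the function name `needs_escalation` are hypothetical.

```python
from typing import Optional

# Hypothetical status and action labels, chosen to mirror the table above.
UNFIXED_STATUSES = {"fix_pending", "no_fix"}
DEFERRED_ACTIONS = {"mitigatable", "acceptable_risk"}


def needs_escalation(previous_action: str, previous_fix_status: str,
                     fixed_version_now: Optional[str]) -> bool:
    """True when an item deferred for lack of a fix now has a fixed version."""
    return (
        previous_action in DEFERRED_ACTIONS
        and previous_fix_status in UNFIXED_STATUSES
        and bool(fixed_version_now)
    )
```

Items marked not applicable are deliberately excluded: a new upstream release does not change the fact that the package is absent at runtime.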

Step 3: Per-CVE Recommendation

Recommendation	Meaning	Action
Must Fix	Regulatory/risk profile demands remediation within SLA	Rebuild with fix or replace package
Recommended Fix	Best practice but not compliance-blocking	Schedule in next maintenance window
Acceptable Risk	Context shows low actual risk	Document acceptance, review periodically
Mitigatable	Runtime controls can reduce risk without patching	Apply network policy, seccomp, read-only FS
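A minimal sketch of the tier decision follows, assuming the adjusted severity and fix status have already been computed. The thresholds, the boolean inputs `regulated` and `exposed`, and the function name are illustrative assumptions, not the engine's actual rules.

```python
def recommend_action(adjusted_severity: str, fix_status: str,
                     regulated: bool, exposed: bool) -> str:
    """Pick one of the four recommendation tiers (illustrative thresholds)."""
    serious = adjusted_severity in ("critical", "high")
    if fix_status == "not_applicable":
        return "acceptable_risk"  # build-only dependency or unreachable code
    if fix_status == "fix_available":
        if serious and regulated:
            return "must_fix"
        return "recommended_fix" if adjusted_severity != "low" else "acceptable_risk"
    if fix_status == "fix_pending":
        return "mitigatable"  # runtime controls until the fix ships
    # No fix / won't fix: depends on exploitability and exposure.
    return "mitigatable" if serious or exposed else "acceptable_risk"
```

With these rules, the same high-severity CVE with a fix available is `must_fix` for a regulated customer and `recommended_fix` for an unregulated one.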

Step 4: Summary Report

Risk posture summary (overall rating for image + profile)
CVE breakdown by recommendation tier
Top 3-5 priority actions
Mitigation suggestions (network policies, seccomp, capabilities)
Compliance notes (which regulations require which fixes)
SLA comparison (why Acceptable Risk items are acceptable in context)
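The tier breakdown and priority list could be derived from the per-CVE items along these lines. The item keys match the recommendation_items columns, but the ordering rule (action tier first, then adjusted severity) and the output keys are assumptions. The overall posture rating is omitted because the design does not define how it is calculated.

```python
from collections import Counter

ACTION_PRIORITY = {"must_fix": 0, "recommended_fix": 1, "mitigatable": 2, "acceptable_risk": 3}
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def build_summary(items: list, top_n: int = 5) -> dict:
    """Aggregate per-CVE items into a tier breakdown and a priority list."""
    breakdown = Counter(item["action"] for item in items)
    ranked = sorted(
        items,
        key=lambda i: (ACTION_PRIORITY[i["action"]], SEVERITY_RANK[i["adjusted_severity"]]),
    )
    return {
        "breakdown": dict(breakdown),
        "priority_actions": [i["vulnerability_id"] for i in ranked[:top_n]],
    }
```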

Data Model Changes

New Tables (Additive — no existing table changes)

company_profiles

Column	Type	Notes
id	TEXT PK	UUID
name	TEXT	Profile display name
answers	TEXT (JSON)	Questionnaire answers
risk_flags	TEXT (JSON)	Derived risk flags
created_at	TEXT	ISO timestamp
updated_at	TEXT	ISO timestamp

workload_profiles

Column	Type	Notes
id	TEXT PK	UUID
name	TEXT	Profile display name
answers	TEXT (JSON)	Questionnaire answers
risk_score	REAL	Computed composite score
created_at	TEXT	ISO timestamp
updated_at	TEXT	ISO timestamp

recommendations

Column	Type	Notes
id	TEXT PK	UUID
image_id	TEXT FK	References images.id
scan_id	TEXT FK	References scans.id
company_profile_id	TEXT FK	References company_profiles.id
workload_profile_id	TEXT FK	References workload_profiles.id
summary	TEXT (JSON)	Aggregated stats and posture
generated_at	TEXT	ISO timestamp

recommendation_items

Column	Type	Notes
id	TEXT PK	UUID
recommendation_id	TEXT FK	References recommendations.id
vulnerability_id	TEXT FK	References vulnerabilities.id
original_severity	TEXT	From scanner
adjusted_severity	TEXT	After profile weighting
action	TEXT	must_fix / recommended_fix / acceptable_risk / mitigatable
rationale	TEXT	Human-readable explanation
mitigations	TEXT (JSON)	Suggested runtime mitigations
compliance_notes	TEXT (JSON)	Regulatory references
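The four tables can be expressed as DDL roughly as follows. This sketch assumes a SQLite-style database, which the TEXT and REAL column types suggest but the design does not state. The stub tables for images, scans, and vulnerabilities are stand-ins for the existing schema, and the CHECK constraint on `action` is an added assumption.

```python
import sqlite3

SCHEMA = """
-- Stand-ins for existing tables so the foreign keys resolve in this sketch.
CREATE TABLE IF NOT EXISTS images (id TEXT PRIMARY KEY);
CREATE TABLE IF NOT EXISTS scans (id TEXT PRIMARY KEY);
CREATE TABLE IF NOT EXISTS vulnerabilities (id TEXT PRIMARY KEY);

CREATE TABLE company_profiles (
    id TEXT PRIMARY KEY, name TEXT, answers TEXT, risk_flags TEXT,
    created_at TEXT, updated_at TEXT
);
CREATE TABLE workload_profiles (
    id TEXT PRIMARY KEY, name TEXT, answers TEXT, risk_score REAL,
    created_at TEXT, updated_at TEXT
);
CREATE TABLE recommendations (
    id TEXT PRIMARY KEY,
    image_id TEXT REFERENCES images(id),
    scan_id TEXT REFERENCES scans(id),
    company_profile_id TEXT REFERENCES company_profiles(id),
    workload_profile_id TEXT REFERENCES workload_profiles(id),
    summary TEXT, generated_at TEXT
);
CREATE TABLE recommendation_items (
    id TEXT PRIMARY KEY,
    recommendation_id TEXT REFERENCES recommendations(id),
    vulnerability_id TEXT REFERENCES vulnerabilities(id),
    original_severity TEXT, adjusted_severity TEXT,
    -- Assumed constraint: restrict action to the four documented tiers.
    action TEXT CHECK (action IN
        ('must_fix', 'recommended_fix', 'acceptable_risk', 'mitigatable')),
    rationale TEXT, mitigations TEXT, compliance_notes TEXT
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the additive recommendation tables on the given connection."""
    conn.executescript(SCHEMA)
```

Because every statement creates a new table, running this against a database that already holds images, scans, and vulnerabilities leaves those tables untouched.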

API Design

Authorization and Access Control

All recommendation endpoints sit behind the existing CascadeGuard auth layer. Access control follows the principle that profiles and recommendations are tenant-scoped — a user can only see and manage data belonging to their own organisation.

Resource	Create	Read	Update	Delete	Notes
Company profiles	Admin	All org members	Admin	Admin	Org-wide settings; non-admins consume but don’t modify
Workload profiles	All org members	All org members	Creator + Admin	Creator + Admin	Any team member can define a deployment context
Recommendations	All org members	All org members	N/A (immutable)	Admin	Generated as point-in-time snapshots; no edits, only regenerate
Questionnaire definitions	N/A (system)	Public (unauthenticated)	N/A	N/A	Question schemas are read-only reference data
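The matrix above can be encoded as a lookup table, as in this sketch. The role names (`admin`, `member`, `anonymous`), the `creator` marker, and the function `is_allowed` are hypothetical. Tenant isolation is a separate check at the query layer and is not modelled here.

```python
# Roles allowed per (resource, operation). "creator" matches the user who
# created the row. Operations marked N/A in the matrix are simply absent.
PERMISSIONS = {
    ("company_profile", "create"): {"admin"},
    ("company_profile", "read"): {"admin", "member"},
    ("company_profile", "update"): {"admin"},
    ("company_profile", "delete"): {"admin"},
    ("workload_profile", "create"): {"admin", "member"},
    ("workload_profile", "read"): {"admin", "member"},
    ("workload_profile", "update"): {"admin", "creator"},
    ("workload_profile", "delete"): {"admin", "creator"},
    ("recommendation", "create"): {"admin", "member"},
    ("recommendation", "read"): {"admin", "member"},
    ("recommendation", "delete"): {"admin"},
    ("questionnaire", "read"): {"admin", "member", "anonymous"},
}


def is_allowed(resource: str, operation: str, role: str, is_creator: bool = False) -> bool:
    """Check the matrix; unknown (resource, operation) pairs are denied."""
    allowed = PERMISSIONS.get((resource, operation), set())
    return role in allowed or (is_creator and "creator" in allowed)
```

Denying anything not listed keeps immutable resources safe by default: no role can update a recommendation because no such entry exists.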

Key security constraints:

All profile and recommendation endpoints require a valid session/API key and enforce tenant isolation at the query layer (no cross-org data leakage)
Recommendation generation is rate-limited to prevent abuse (e.g. 10 generations per image per hour)
Questionnaire definition endpoints are public to support unauthenticated preview flows (e.g. marketing “see what we check” pages) — they contain no customer data
PDF export of recommendations inherits the same access controls as the recommendation itself
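The generation rate limit could be enforced with a sliding window, sketched below using the example figure of 10 generations per image per hour. The in-memory store, the (org, image) key, and the class name are assumptions; a multi-instance deployment would need shared storage instead.

```python
from collections import defaultdict, deque


class GenerationRateLimiter:
    """Sliding-window limiter keyed by (org_id, image_id); in-memory only."""

    def __init__(self, limit: int = 10, window_seconds: int = 3600):
        self.limit = limit
        self.window = window_seconds
        self._events = defaultdict(deque)

    def allow(self, org_id: str, image_id: str, now: float) -> bool:
        """Record and permit a generation at time `now` unless the window is full."""
        events = self._events[(org_id, image_id)]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True
```

Passing `now` explicitly (for example from `time.time()`) keeps the limiter deterministic in tests.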

Profile Management

POST /api/profiles/company — create company profile
GET /api/profiles/company — list company profiles
POST /api/profiles/workload — create workload profile
GET /api/profiles/workload — list workload profiles
GET /api/questionnaires/company — question definitions + options
GET /api/questionnaires/workload — question definitions + options
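Since the questionnaire endpoints return question definitions and options, the definitions could live in code along the lines below. The structure, the `QUESTIONNAIRE_VERSION` marker, and the `validate_answers` helper are assumptions; only the question texts and options are taken from the workload table, and only two of the ten questions are shown.

```python
QUESTIONNAIRE_VERSION = 1  # hypothetical version marker, bumped when questions change

WORKLOAD_QUESTIONS = [
    {
        "id": "network_exposure",
        "text": "Network exposure?",
        "options": ["Internet-facing / DMZ", "Internal network only", "Air-gapped"],
    },
    {
        "id": "runs_as_root",
        "text": "Runs as root?",
        "options": ["Yes", "No", "Unknown"],
    },
]


def validate_answers(questions: list, answers: dict) -> list:
    """Return a list of problems; an empty list means the answers are valid."""
    errors = []
    for question in questions:
        value = answers.get(question["id"])
        if value is None:
            errors.append(f"missing answer: {question['id']}")
        elif value not in question["options"]:
            errors.append(f"invalid option for {question['id']}: {value}")
    return errors
```

Storing the version next to the saved answers would let older profiles be interpreted against the question set they were written for.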

Recommendation Generation

POST /api/images/:id/recommendations — generate for given profiles
GET /api/images/:id/recommendations — list recommendation sets
GET /api/images/:id/recommendations/:recId — full recommendation + items
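The design does not specify the request body for the generation endpoint. One plausible shape, assumed here, is a JSON object naming the two profiles, with field names borrowed from the recommendations table. The sketch validates such a body before the engine runs.

```python
import json

# Assumed request fields, mirroring the recommendations table columns.
REQUIRED_FIELDS = ("company_profile_id", "workload_profile_id")


def parse_generation_request(body: str) -> dict:
    """Parse and validate the JSON body of a generation request."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if not payload.get(name)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Return only the known fields so unexpected keys are ignored.
    return {name: payload[name] for name in REQUIRED_FIELDS}
```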

Frontend Changes

Profile Wizard (/profiles/new) — step-by-step questionnaire UI
Recommendation View (/dashboard/:imageId/recommendations/:recId) — summary + filterable CVE table with rationale
Image Detail Enhancement — “Get Recommendations” CTA + previous recommendations sidebar

Implementation Phases

Phase	Scope	Estimated Stories
1	Data model + questionnaire API	1
2	Recommendation engine logic	1-2
3	Frontend profile wizard	1
4	Frontend recommendation dashboard	1
5	Compliance packs + PDF export	1-2

Key Design Decisions

Contextualise, do not exempt — the same CVE can be “Must Fix” for one customer and “Acceptable Risk” for another
Questions defined in code: easy to update, test, and version; answers stored as JSON for forward compatibility
Point-in-time snapshots — recommendations tied to specific scans; regenerate on new scans
Existing SLA untouched — recommendations are an advisory layer, SLA deadlines remain independent
Additive schema — all new tables, zero risk to existing functionality
