Design Document: Secure Proxy (sec-proxy)

Overview

The Secure Proxy (sec-proxy) is a security gateway that intercepts package requests from internal infrastructure and applies configurable security rules before proxying to external repositories. The system uses a plugin-based architecture where each package manager type (Docker, npm, pip, Maven, Go) has its own handler that understands the specific registry protocol and metadata format.

The core design principle is intelligent version resolution with security policies. When a client requests a package (especially with dynamic tags like “latest”), the proxy fetches metadata from the upstream registry, evaluates all available versions against configured rules (age thresholds, CVE checks, etc.), and transparently returns the newest compliant version. This allows builds to succeed while enforcing security policies.
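
The age-based part of this resolution can be sketched as follows. This is a minimal illustration, not the proxy's actual resolver: `versionInfo`, `newestCompliant`, `sampleVersions`, `refTime`, and `day` are hypothetical names, and the sketch assumes "newest" means the latest publication time rather than the highest semantic version.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// versionInfo is a trimmed-down stand-in for the document's VersionInfo type.
type versionInfo struct {
	Version     string
	PublishedAt time.Time
}

const day = 24 * time.Hour

// refTime is a fixed "now" so the example is deterministic.
var refTime = time.Date(2024, 6, 30, 0, 0, 0, 0, time.UTC)

// sampleVersions is made-up data for the demonstration.
var sampleVersions = []versionInfo{
	{"1.0.0", refTime.Add(-90 * day)},
	{"1.1.0", refTime.Add(-20 * day)},
	{"1.2.0", refTime.Add(-2 * day)}, // too new for a 7-day threshold
}

// newestCompliant returns the most recently published version that is at
// least minAge old at time now. The boolean is false when nothing qualifies.
// Assumption: "newest" is decided by publication time, not by version string.
func newestCompliant(versions []versionInfo, minAge time.Duration, now time.Time) (versionInfo, bool) {
	sorted := make([]versionInfo, len(versions))
	copy(sorted, versions)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].PublishedAt.After(sorted[j].PublishedAt)
	})
	for _, v := range sorted {
		if now.Sub(v.PublishedAt) >= minAge {
			return v, true
		}
	}
	return versionInfo{}, false
}

func main() {
	if v, ok := newestCompliant(sampleVersions, 7*day, refTime); ok {
		fmt.Println("latest resolves to", v.Version)
	} else {
		fmt.Println("no compliant version; request would be blocked")
	}
}
```

With a 7-day threshold the two-day-old release is skipped and the request for "latest" resolves to the next newest release.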

Important: The proxy caches metadata only (package versions, publication dates, CVE information), not the actual package artifacts or images. This keeps storage requirements minimal while still providing fast policy evaluation. Package artifacts are always proxied directly from upstream registries.

Architecture

High-Level Architecture

graph TB
    Client[Build Environment / K8s Cluster]
    Proxy[Sec-Proxy Core]
    Config[YAML Configuration]
    Cache[Cache Layer]
    Audit[Audit Log]
    
    subgraph "Registry Handlers"
        Docker[Docker Handler]
        NPM[NPM Handler]
        Pip[Pip Handler]
        Maven[Maven Handler]
        Go[Go Handler]
    end
    
    subgraph "Rule Engine"
        AgeRule[Age Rule]
        CVERule[CVE Rule]
        CustomRule[Custom Rules]
    end
    
    subgraph "External Services"
        DockerHub[Docker Hub]
        NPMRegistry[npmjs.org]
        PyPI[PyPI]
        MavenCentral[Maven Central]
        GoProxy[proxy.golang.org]
        NVD[NVD CVE Database]
    end
    
    Client -->|HTTP Request| Proxy
    Proxy -->|Load Rules| Config
    Proxy -->|Route by Type| Docker
    Proxy -->|Route by Type| NPM
    Proxy -->|Route by Type| Pip
    Proxy -->|Route by Type| Maven
    Proxy -->|Route by Type| Go
    
    Docker -->|Fetch Metadata| DockerHub
    NPM -->|Fetch Metadata| NPMRegistry
    Pip -->|Fetch Metadata| PyPI
    Maven -->|Fetch Metadata| MavenCentral
    Go -->|Fetch Metadata| GoProxy
    
    Docker -->|Apply Rules| AgeRule
    Docker -->|Apply Rules| CVERule
    NPM -->|Apply Rules| AgeRule
    Pip -->|Apply Rules| CVERule
    
    CVERule -->|Query CVEs| NVD
    
    Proxy -->|Cache Hit/Miss| Cache
    Proxy -->|Log Events| Audit
    
    Cache -->|Cached Metadata| Proxy
    Proxy -->|Serve Proxied| Client

Request Flow

  1. Request Reception: Client makes package request (e.g., docker pull myimage:latest)
  2. Handler Selection: Proxy routes to appropriate handler based on request type
  3. Metadata Fetch: Handler fetches package metadata from upstream registry (or from metadata cache if available)
  4. Rule Evaluation: Handler applies configured rules to available versions
  5. Version Resolution: Handler selects newest compliant version
  6. Package Proxy: Proxy streams the resolved package directly from upstream to client
  7. Metadata Cache: Store metadata for future requests
  8. Audit Log: Log the request and security decision

Components and Interfaces

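1. Proxy Core

The central component that accepts HTTP requests, routes them to the matching registry handler, and exposes configuration reload and health checks.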

// ProxyCore is the central component that handles HTTP requests
type ProxyCore interface {
    // Start the proxy server
    Start(config *ProxyConfig) error
    
    // Handle incoming HTTP request
    HandleRequest(w http.ResponseWriter, r *http.Request) error
    
    // Route request to appropriate handler
    RouteToHandler(r *http.Request) (RegistryHandler, error)
    
    // Reload configuration without downtime
    ReloadConfig(config *ProxyConfig) error
    
    // Health check endpoint
    HealthCheck() *HealthStatus
}
 
type HealthStatus struct {
    Status        string                    // "healthy", "degraded", "unhealthy"
    Uptime        time.Duration
    CacheStatus   *CacheHealth
    HandlerStatus map[string]*HandlerHealth
}

2. Registry Handler Interface

Each package manager type implements this interface to handle registry-specific protocols.

// RegistryHandler processes requests for a specific registry type
type RegistryHandler interface {
    // Identify if this handler can process the request
    CanHandle(r *http.Request) bool
    
    // Fetch metadata for a package from upstream registry
    FetchMetadata(ctx context.Context, packageID *PackageIdentifier) (*PackageMetadata, error)
    
    // Resolve version tag to specific version based on rules
    ResolveVersion(
        ctx context.Context,
        packageID *PackageIdentifier,
        requestedVersion string,
        rules []Rule,
    ) (*ResolvedVersion, error)
    
    // Proxy the request to upstream (streams package directly)
    ProxyRequest(ctx context.Context, w http.ResponseWriter, r *http.Request) error
}
 
type PackageIdentifier struct {
    Registry  string  // e.g., "docker.io", "registry.npmjs.org"
    Namespace *string // e.g., "library" for Docker, "@scope" for npm
    Name      string
}
 
type PackageMetadata struct {
    PackageID *PackageIdentifier
    Versions  []*VersionInfo
    Tags      map[string]string // tag name -> version
}
 
type VersionInfo struct {
    Version     string
    PublishedAt time.Time
    Checksums   map[string]string // algorithm -> hash
    Size        int64
    Metadata    map[string]interface{} // registry-specific metadata
}
 
type ResolvedVersion struct {
    Version      string
    Reason       string        // explanation of why this version was selected
    RulesApplied []*RuleResult
}

3. Rule Engine

The rule engine evaluates packages against configured security policies.

// Rule evaluates packages against a security policy
type Rule interface {
    // Unique identifier for the rule
    ID() string
    
    // Human-readable name
    Name() string
    
    // Evaluate a version against this rule
    Evaluate(
        ctx context.Context,
        packageID *PackageIdentifier,
        version *VersionInfo,
        evalCtx *EvaluationContext,
    ) (*RuleResult, error)
    
    // Check if this rule applies to the given package
    AppliesTo(packageID *PackageIdentifier) bool
}
 
type RuleResult struct {
    RuleID   string
    Passed   bool
    Severity string // "block", "warn", "info"
    Message  string
    Details  map[string]interface{}
}
 
type EvaluationContext struct {
    CurrentTime time.Time
    CVEDatabase CVEDatabase
    Config      *RuleConfig
}
 
// AgeRule implements age-based filtering
type AgeRule struct {
    id             string
    minimumAgeDays int
    exclusions     []*PackagePattern
}
 
// CVERule implements CVE-based filtering
type CVERule struct {
    id              string
    maxSeverity     string // "critical", "high", "medium", "low"
    blockOnCritical bool
    exclusions      []*PackagePattern
}
 
type PackagePattern struct {
    Registry  *string
    Namespace *string
    Name      *string // supports wildcards
}

4. Configuration System

YAML-based configuration with global and per-registry rules.

type ProxyConfig struct {
    // Global settings
    Global *GlobalConfig `yaml:"global"`
    
    // Per-registry configurations
    Registries map[string]*RegistryConfig `yaml:"registries"`
    
    // Rule definitions
    Rules []*RuleDefinition `yaml:"rules"`
    
    // Cache configuration
    Cache *CacheConfig `yaml:"cache"`
    
    // Audit logging configuration
    Audit *AuditConfig `yaml:"audit"`
}
 
type GlobalConfig struct {
    ListenAddress string   `yaml:"listenAddress"`
    ListenPort    int      `yaml:"listenPort"`
    TLSEnabled    bool     `yaml:"tlsEnabled"`
    TLSCertPath   string   `yaml:"tlsCertPath,omitempty"`
    TLSKeyPath    string   `yaml:"tlsKeyPath,omitempty"`
    DefaultRules  []string `yaml:"defaultRules"` // rule IDs to apply globally
}
 
type RegistryConfig struct {
    Type           string           `yaml:"type"` // "docker", "npm", "pip", "maven", "go"
    UpstreamURL    string           `yaml:"upstreamUrl"`
    Rules          []string         `yaml:"rules"` // rule IDs to apply (overrides global)
    Exclusions     []*RuleExclusion `yaml:"exclusions"`
    Authentication *RegistryAuth    `yaml:"authentication,omitempty"`
}
 
type RuleDefinition struct {
    ID     string                 `yaml:"id"`
    Type   string                 `yaml:"type"` // "age", "cve", "custom"
    Config map[string]interface{} `yaml:"config"`
}
 
type RuleExclusion struct {
    RuleID   string            `yaml:"ruleId"`
    Packages []*PackagePattern `yaml:"packages"`
    Reason   string            `yaml:"reason"`
}
 
type CacheConfig struct {
    Backend        string `yaml:"backend"`        // "memory", "redis"
    TTLSeconds     int    `yaml:"ttlSeconds"`     // how long to cache metadata
    MaxEntries     int    `yaml:"maxEntries"`     // maximum number of metadata entries
    EvictionPolicy string `yaml:"evictionPolicy"` // "lru", "lfu", "ttl"
}
 
type AuditConfig struct {
    Backend            string `yaml:"backend"` // "file", "syslog", "cloudwatch"
    Level              string `yaml:"level"`   // "debug", "info", "warn", "error"
    IncludeRequestBody bool   `yaml:"includeRequestBody"`
}


5. Metadata Cache Layer

Stores fetched metadata (not package artifacts) to improve performance and reduce upstream API calls.

// MetadataCache stores package metadata
type MetadataCache interface {
    // Check if metadata is cached
    Has(ctx context.Context, key *MetadataCacheKey) (bool, error)
    
    // Get cached metadata
    Get(ctx context.Context, key *MetadataCacheKey) (*CachedMetadata, error)
    
    // Store metadata in cache
    Put(ctx context.Context, key *MetadataCacheKey, metadata *PackageMetadata, ttl time.Duration) error
    
    // Invalidate cached metadata
    Invalidate(ctx context.Context, key *MetadataCacheKey) error
    
    // Get cache statistics
    Stats(ctx context.Context) (*CacheStats, error)
    
    // Evict old entries based on policy
    Evict(ctx context.Context) (int, error)
}
 
type MetadataCacheKey struct {
    Registry  string
    PackageID *PackageIdentifier
}
 
type CachedMetadata struct {
    Metadata      *PackageMetadata
    CacheMetadata *CacheMetadata
}
 
type CacheMetadata struct {
    CachedAt       time.Time
    LastAccessedAt time.Time
    AccessCount    int
    TTL            time.Duration
}
 
type CacheStats struct {
    TotalEntries  int
    HitRate       float64
    EvictionCount int
    MemoryUsage   int64
}

6. Audit Logger

Records all security decisions and package requests for compliance.

// AuditLogger records security events
type AuditLogger interface {
    // Log a package request
    LogRequest(ctx context.Context, event *RequestEvent) error
    
    // Log a security decision
    LogDecision(ctx context.Context, event *DecisionEvent) error
    
    // Log a cache event
    LogCache(ctx context.Context, event *CacheEvent) error
    
    // Query audit logs
    Query(ctx context.Context, filter *AuditFilter) ([]*AuditEntry, error)
}
 
type RequestEvent struct {
    Timestamp        time.Time
    ClientIP         string
    ClientIdentity   *string
    PackageID        *PackageIdentifier
    RequestedVersion string
    UserAgent        string
}
 
type DecisionEvent struct {
    Timestamp       time.Time
    PackageID       *PackageIdentifier
    Version         string
    Decision        string // "allow", "block", "warn"
    RulesApplied    []*RuleResult
    ResolvedVersion *string
}
 
type CacheEvent struct {
    Timestamp time.Time
    PackageID *PackageIdentifier
    Version   string
    EventType string // "hit", "miss", "evict", "invalidate"
}
 
type AuditEntry struct {
    ID        string
    Timestamp time.Time
    EventType string // "request", "decision", "cache"
    Data      interface{} // RequestEvent | DecisionEvent | CacheEvent
}

7. CVE Database Client

Interfaces with external vulnerability databases.

// CVEDatabase queries vulnerability information
type CVEDatabase interface {
    // Query CVEs for a specific package version
    QueryCVEs(
        ctx context.Context,
        packageID *PackageIdentifier,
        version string,
    ) ([]*CVEInfo, error)
    
    // Check if database is available
    IsAvailable(ctx context.Context) (bool, error)
}
 
type CVEInfo struct {
    CVEID            string
    Severity         string // "critical", "high", "medium", "low"
    Score            float64 // CVSS score
    Description      string
    PublishedDate    time.Time
    AffectedVersions []string
    References       []string
}

Data Models

Package Request Flow

// Incoming request from client
type IncomingRequest struct {
    Method  string
    Path    string
    Headers map[string][]string
    Body    []byte
}
 
// Parsed package request
type PackageRequest struct {
    Handler          string // "docker", "npm", etc.
    PackageID        *PackageIdentifier
    RequestedVersion string
    OriginalRequest  *http.Request
}
 
// Response to client
type PackageResponse struct {
    StatusCode int
    Headers    map[string][]string
    Body       io.ReadCloser
    CacheHit   bool
}

Registry-Specific Models

Docker

type DockerManifest struct {
    SchemaVersion int    `json:"schemaVersion"`
    MediaType     string `json:"mediaType"`
    Config        struct {
        Digest string `json:"digest"`
        Size   int64  `json:"size"`
    } `json:"config"`
    Layers []struct {
        Digest string `json:"digest"`
        Size   int64  `json:"size"`
    } `json:"layers"`
}
 
type DockerImageMetadata struct {
    Tags         []string
    Name         string
    Created      time.Time
    Architecture string
}

NPM

type NPMPackageMetadata struct {
    Name     string                         `json:"name"`
    DistTags map[string]string              `json:"dist-tags"`
    Versions map[string]*NPMVersionMetadata `json:"versions"`
    Time     map[string]string              `json:"time"` // version -> ISO timestamp
}
 
type NPMVersionMetadata struct {
    Version string `json:"version"`
    Dist    struct {
        Tarball   string `json:"tarball"`
        Shasum    string `json:"shasum"`
        Integrity string `json:"integrity"`
    } `json:"dist"`
    Dependencies map[string]string `json:"dependencies"`
}
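
As an illustration of how the npm handler might turn the `time` map into publication timestamps for the age rule, here is a minimal sketch. The names `npmPackument`, `publishTimes`, and `samplePackument` are hypothetical, and the payload is made up. The sketch skips the `created` and `modified` keys, which npm places in the same map alongside version keys.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// npmPackument mirrors only the fields of NPMPackageMetadata needed here.
type npmPackument struct {
	Name string            `json:"name"`
	Time map[string]string `json:"time"`
}

// samplePackument is an invented payload in the shape of npm registry metadata.
const samplePackument = `{
  "name": "example-package",
  "time": {
    "created": "2020-01-01T10:00:00.000Z",
    "modified": "2021-06-01T12:30:00.000Z",
    "1.0.0": "2020-01-01T10:00:00.000Z",
    "1.1.0": "2021-06-01T12:30:00.000Z"
  }
}`

// publishTimes decodes npm metadata and returns version -> publication time.
// The "created" and "modified" entries are not versions and are skipped.
func publishTimes(raw []byte) (map[string]time.Time, error) {
	var doc npmPackument
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, fmt.Errorf("decode npm metadata: %w", err)
	}
	out := make(map[string]time.Time, len(doc.Time))
	for version, stamp := range doc.Time {
		if version == "created" || version == "modified" {
			continue
		}
		t, err := time.Parse(time.RFC3339, stamp)
		if err != nil {
			return nil, fmt.Errorf("version %s: %w", version, err)
		}
		out[version] = t
	}
	return out, nil
}

func main() {
	times, err := publishTimes([]byte(samplePackument))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for version, t := range times {
		fmt.Println(version, "published", t.Format(time.RFC3339))
	}
}
```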

PyPI

type PyPIPackageMetadata struct {
    Info struct {
        Name    string `json:"name"`
        Version string `json:"version"`
    } `json:"info"`
    Releases map[string][]*PyPIRelease `json:"releases"`
}
 
type PyPIRelease struct {
    Filename          string            `json:"filename"`
    URL               string            `json:"url"`
    Digests           map[string]string `json:"digests"`
    UploadTimeISO8601 string            `json:"upload_time_iso_8601"`
    Size              int64             `json:"size"`
}

Maven

type MavenMetadata struct {
    GroupID    string `xml:"groupId"`
    ArtifactID string `xml:"artifactId"`
    Versioning struct {
        Latest      string   `xml:"latest"`
        Release     string   `xml:"release"`
        Versions    []string `xml:"versions>version"`
        LastUpdated string   `xml:"lastUpdated"`
    } `xml:"versioning"`
}
 
type MavenArtifact struct {
    GroupID    string
    ArtifactID string
    Version    string
    Packaging  string
    Checksums  map[string]string
}

Go Modules

type GoModuleInfo struct {
    Version string `json:"Version"`
    Time    string `json:"Time"` // RFC3339 timestamp
}
 
type GoModuleMetadata struct {
    Module   string          `json:"module"`
    Versions []*GoModuleInfo `json:"versions"`
}

Correctness Properties

A property is a characteristic or behavior that should hold true across all valid executions of a system—essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.

Property 1: Package Proxying

For any package request (Docker, npm, pip, Maven, or Go), the proxy should fetch the package artifact from the configured upstream repository. Because only metadata is cached, artifacts are always proxied from upstream.

Validates: Requirements 1.1, 2.1, 3.1, 4.1, 5.1

Property 2: Validation Before Serving

For any package fetched from upstream, the appropriate validation (vulnerability scan, checksum, signature) should be performed before the package is served to the client.

Validates: Requirements 1.2, 2.2, 3.2, 4.2, 5.2

Property 3: Caching Metadata

For any package metadata fetched from upstream, the metadata should be stored in the cache layer for future requests.

Validates: Requirements 1.3, 2.3, 3.3, 4.3, 5.3

Property 4: Metadata Cache Hit

For any cached package metadata, when the same package is requested again within the TTL, the metadata should be served from cache without fetching from upstream.

Validates: Requirements 1.4, 2.4, 3.4, 4.4, 5.4

Property 5: Blocking Invalid Packages

For any package that fails validation (security scan, checksum, signature), the request should be blocked and an error returned to the client.

Validates: Requirements 1.5, 2.5, 3.5, 4.5, 5.5

Property 6: Configuration Parsing

For any valid YAML configuration file, the proxy should successfully parse both global rules and per-registry rules.

Validates: Requirements 6.1

Property 7: Global Rule Application

For any registry that does not have specific rules configured, the global rules should be applied to requests for that registry.

Validates: Requirements 6.2

Property 8: Registry Rule Override

For any registry with specific rules configured, those rules should be applied instead of global rules.

Validates: Requirements 6.3
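
Properties 7 and 8 together describe a simple precedence rule, which can be sketched as below. The types are reduced stand-ins for `GlobalConfig` and `RegistryConfig`, and `effectiveRules` is a hypothetical helper. The sketch assumes that an empty or missing rule list means "no registry-specific rules".

```go
package main

import "fmt"

// globalConfig and registryConfig carry only the fields relevant here.
type globalConfig struct {
	DefaultRules []string // rule IDs applied when a registry has none
}

type registryConfig struct {
	Rules []string // rule IDs configured for this registry
}

// effectiveRules returns the registry's own rule IDs when any are configured
// and falls back to the global defaults otherwise.
// Assumption: an empty list is treated the same as "not configured".
func effectiveRules(global globalConfig, registry *registryConfig) []string {
	if registry != nil && len(registry.Rules) > 0 {
		return registry.Rules
	}
	return global.DefaultRules
}

func main() {
	global := globalConfig{DefaultRules: []string{"age-7days", "cve-high"}}
	npm := &registryConfig{Rules: []string{"age-14days"}}
	docker := &registryConfig{}

	fmt.Println("npm:", effectiveRules(global, npm))       // registry rules win
	fmt.Println("docker:", effectiveRules(global, docker)) // global fallback
}
```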

Property 9: Rule Exclusions

For any package matching a configured exclusion pattern, the specified rule should not be applied to that package.

Validates: Requirements 6.4
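
One way exclusion matching could work is sketched below. The document says only that `PackagePattern.Name` "supports wildcards", so the use of `path.Match` glob syntax is an assumption, as are the lowercase type names and the `matches` and `strPtr` helpers. Nil pattern fields are treated as "match anything".

```go
package main

import (
	"fmt"
	"path"
)

type packageIdentifier struct {
	Registry  string
	Namespace *string
	Name      string
}

// packagePattern follows the shape of PackagePattern: nil fields match anything.
type packagePattern struct {
	Registry  *string
	Namespace *string
	Name      *string // glob pattern, e.g. "internal-*"
}

// matches reports whether id is covered by the pattern.
// Assumption: wildcards follow path.Match glob syntax.
func (p packagePattern) matches(id packageIdentifier) bool {
	if p.Registry != nil && *p.Registry != id.Registry {
		return false
	}
	if p.Namespace != nil && (id.Namespace == nil || *p.Namespace != *id.Namespace) {
		return false
	}
	if p.Name != nil {
		ok, err := path.Match(*p.Name, id.Name)
		if err != nil || !ok {
			return false // a malformed pattern never excludes anything
		}
	}
	return true
}

func strPtr(s string) *string { return &s }

func main() {
	pattern := packagePattern{
		Registry: strPtr("registry.npmjs.org"),
		Name:     strPtr("internal-*"),
	}
	for _, name := range []string{"internal-utils", "lodash"} {
		id := packageIdentifier{Registry: "registry.npmjs.org", Name: name}
		fmt.Printf("%s excluded: %v\n", name, pattern.matches(id))
	}
}
```

Treating a malformed pattern as a non-match keeps a configuration typo from silently disabling a rule.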

Property 10: Configuration Validation Errors

For any invalid YAML configuration, the proxy should report specific validation errors and refuse to start.

Validates: Requirements 6.5, 14.2

Property 11: Publication Timestamp Retrieval

For any package request, the proxy should retrieve the publication timestamp from the registry metadata.

Validates: Requirements 7.1

Property 12: Age-Based Filtering

For any package with a publication age less than the configured minimum age threshold (and not excluded), the request should be blocked.

Validates: Requirements 7.2, 7.3

Property 13: Age Rule Exclusions

For any package matching an age rule exclusion pattern, the age check should be bypassed.

Validates: Requirements 7.4

Property 14: Version Metadata Retrieval

For any request using a dynamic version tag (like “latest”), all available versions should be retrieved from the registry metadata.

Validates: Requirements 8.1

Property 15: Newest Compliant Version Selection

For any set of package versions evaluated against age rules, the newest version that meets the age threshold should be selected.

Validates: Requirements 8.2

Property 16: Transparent Version Resolution

For any dynamic version tag request where a compliant version exists, the resolved version should be transparently returned to the client.

Validates: Requirements 8.3

Property 17: No Compliant Version Blocking

For any dynamic version tag request where no version meets the age threshold, the request should be blocked with an error explaining the policy violation.

Validates: Requirements 8.4

Property 18: Specific Version Evaluation

For any request specifying an exact version, only that version should be evaluated against rules (not all available versions).

Validates: Requirements 8.5

Property 19: Handler Routing

For any package request, the proxy should route it to the correct handler based on the registry type (Docker, npm, pip, Maven, Go).

Validates: Requirements 9.1, 9.2, 9.3, 9.4, 9.5
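
A minimal routing sketch built on `CanHandle` is shown below. The path prefixes are assumptions for illustration: `/v2/` mirrors the Docker Registry HTTP API prefix, while `/npm/` and `/pypi/` are hypothetical mount points that this document does not define. `prefixHandler`, `route`, `requestFor`, and `defaultHandlers` are likewise invented names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

var errNoHandler = errors.New("no handler for request")

// registryHandler is a reduced form of the document's RegistryHandler.
type registryHandler interface {
	Name() string
	CanHandle(r *http.Request) bool
}

// prefixHandler claims every request whose path starts with prefix.
type prefixHandler struct {
	name   string
	prefix string
}

func (h prefixHandler) Name() string { return h.name }

func (h prefixHandler) CanHandle(r *http.Request) bool {
	return strings.HasPrefix(r.URL.Path, h.prefix)
}

// route returns the first handler that accepts the request.
func route(handlers []registryHandler, r *http.Request) (registryHandler, error) {
	for _, h := range handlers {
		if h.CanHandle(r) {
			return h, nil
		}
	}
	return nil, fmt.Errorf("%w: %s", errNoHandler, r.URL.Path)
}

// requestFor builds a bare GET request for the given path (demo helper).
func requestFor(p string) *http.Request {
	return &http.Request{Method: http.MethodGet, URL: &url.URL{Path: p}}
}

// defaultHandlers uses assumed mount points; see the note above.
var defaultHandlers = []registryHandler{
	prefixHandler{"docker", "/v2/"},
	prefixHandler{"npm", "/npm/"},
	prefixHandler{"pip", "/pypi/"},
}

func main() {
	for _, p := range []string{"/v2/library/nginx/manifests/latest", "/npm/lodash", "/unknown"} {
		h, err := route(defaultHandlers, requestFor(p))
		if err != nil {
			fmt.Println(p, "->", err)
			continue
		}
		fmt.Println(p, "->", h.Name())
	}
}
```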

Property 20: CVE Database Queries

For any package request (when CVE rules are enabled), the proxy should query the CVE database for known vulnerabilities in that package version.

Validates: Requirements 10.1

Property 21: CVE Severity Evaluation

For any package with CVEs found, the severity scores should be evaluated against configured thresholds.

Validates: Requirements 10.2

Property 22: Critical CVE Blocking

For any package with critical or high-severity CVEs (based on policy configuration), the package should be blocked.

Validates: Requirements 10.3
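
The threshold comparison behind Properties 21 and 22 might look like the sketch below. It assumes `maxSeverity` names the highest severity that is still tolerated, and it conservatively treats unknown severity labels as violations. `severityRank`, `cveInfo`, and `exceedsThreshold` are hypothetical names, and the CVE identifiers in the demo are placeholders rather than real records.

```go
package main

import "fmt"

// severityRank orders the severity labels used in this document.
var severityRank = map[string]int{"low": 1, "medium": 2, "high": 3, "critical": 4}

type cveInfo struct {
	CVEID    string
	Severity string
}

// exceedsThreshold returns the CVEs whose severity is strictly above maxSeverity.
// Assumption: an unknown label (on either side) counts as a violation so that
// bad data fails closed.
func exceedsThreshold(cves []cveInfo, maxSeverity string) []cveInfo {
	limit, limitKnown := severityRank[maxSeverity]
	var violations []cveInfo
	for _, c := range cves {
		rank, known := severityRank[c.Severity]
		if !limitKnown || !known || rank > limit {
			violations = append(violations, c)
		}
	}
	return violations
}

func main() {
	// Placeholder identifiers, not real CVE records.
	found := []cveInfo{
		{"CVE-0000-0001", "low"},
		{"CVE-0000-0002", "critical"},
		{"CVE-0000-0003", "medium"},
	}
	for _, v := range exceedsThreshold(found, "medium") {
		fmt.Printf("block: %s (%s)\n", v.CVEID, v.Severity)
	}
}
```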

Property 23: Low Severity CVE Handling

For any package with low or medium-severity CVEs, the CVEs should be logged and the configured policy action should be applied.

Validates: Requirements 10.4

Property 24: Rule Interface Compatibility

For any rule implementation (built-in or custom), it should work through the common Rule interface.

Validates: Requirements 11.1

Property 25: Rule Evaluation Order

For any package request with multiple applicable rules, the rules should be evaluated in the configured order.

Validates: Requirements 11.2

Property 26: Rule Result Combination

For any package with multiple rule results, the results should be combined according to the configured logic (AND/OR).

Validates: Requirements 11.3
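
A small sketch of AND/OR combination follows. The mode strings `"and"` and `"or"`, the `combine` helper, and the choice to allow a request when no rules apply are all assumptions; the document does not specify them.

```go
package main

import "fmt"

type ruleResult struct {
	RuleID string
	Passed bool
}

// combine folds rule results into a single allow/deny decision.
// Mode "and" requires every rule to pass; mode "or" requires at least one.
// Assumption: an empty result set allows the request under both modes.
func combine(results []ruleResult, mode string) (bool, error) {
	if mode != "and" && mode != "or" {
		return false, fmt.Errorf("unknown combination mode %q", mode)
	}
	if len(results) == 0 {
		return true, nil
	}
	for _, r := range results {
		if mode == "and" && !r.Passed {
			return false, nil // one failure is enough to deny
		}
		if mode == "or" && r.Passed {
			return true, nil // one pass is enough to allow
		}
	}
	// "and": nothing failed; "or": nothing passed.
	return mode == "and", nil
}

func main() {
	results := []ruleResult{{"age-7days", true}, {"cve-high", false}}
	for _, mode := range []string{"and", "or"} {
		allowed, err := combine(results, mode)
		fmt.Println(mode, "->", allowed, err)
	}
}
```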

Property 27: Rule Evaluation Error Handling

For any rule that fails during evaluation, the failure should be logged and remaining rules should continue to be evaluated.

Validates: Requirements 11.4

Property 28: LRU Cache Eviction

For any metadata cache that reaches its configured entry limit, the least-recently-used metadata entries should be evicted.

Validates: Requirements 12.1
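
For the in-memory backend, LRU eviction could be built on the standard library's `container/list`, as in the sketch below. This is a simplified, non-thread-safe illustration keyed by string; `lruCache`, `newLRU`, and `entry` are hypothetical names, and the real `MetadataCache` interface takes contexts and structured keys.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key   string
	value string
}

// lruCache is a minimal LRU cache. It is not safe for concurrent use and
// assumes maxEntries >= 1.
type lruCache struct {
	maxEntries int
	order      *list.List               // front = most recently used
	items      map[string]*list.Element // key -> element in order
}

func newLRU(maxEntries int) *lruCache {
	return &lruCache{
		maxEntries: maxEntries,
		order:      list.New(),
		items:      make(map[string]*list.Element),
	}
}

// Get returns the value for key and marks it as most recently used.
func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put inserts or updates key and returns the evicted key ("" if none).
func (c *lruCache) Put(key, value string) (evicted string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return ""
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.maxEntries {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		evicted = oldest.Value.(*entry).key
		delete(c.items, evicted)
	}
	return evicted
}

func main() {
	cache := newLRU(2)
	cache.Put("npm/lodash", "metadata-a")
	cache.Put("docker/nginx", "metadata-b")
	cache.Get("npm/lodash") // touch: docker/nginx is now least recently used
	fmt.Println("evicted:", cache.Put("pypi/requests", "metadata-c"))
}
```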

Property 29: Cache Staleness

For any cached metadata that exceeds the configured TTL, it should be considered stale and re-fetched on the next request.

Validates: Requirements 12.2

Property 30: Manual Cache Invalidation

For any administrator-requested cache invalidation, the specified metadata entries should be removed from the cache.

Validates: Requirements 12.3

Property 31: Cache Statistics

For any cache statistics request, the response should include hit rates, entry count, and memory usage.

Validates: Requirements 12.4

Property 32: Cache Persistence

For any proxy restart (when using Redis backend), the metadata cache should be persisted and available after restart.

Validates: Requirements 12.5

Property 33: Distributed Cache Sharing

For any multi-instance deployment (when using Redis backend), metadata cache should be shared across all instances.

Validates: Requirements 13.1

Property 34: Metadata Lookup Response Time

For any cached metadata lookup, the response time should be within 100ms.

Validates: Requirements 13.2

Property 35: Offline Metadata Operation

For any cached metadata, when upstream repositories are unavailable, the metadata should still be available from cache for policy evaluation.

Validates: Requirements 13.3
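
Property 29 (stale entries are re-fetched) and Property 35 (cached metadata remains usable when upstream is down) interact, and one possible reading is sketched below: refresh when stale, but fall back to the stale copy if the refresh fails. Serving stale metadata in that case is an assumption about intent, and `cachedEntry`, `lookup`, `refTime`, `demoTTL`, and `errUpstreamDown` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const demoTTL = time.Hour

var (
	refTime         = time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	errUpstreamDown = errors.New("upstream unavailable")
)

type cachedEntry struct {
	Value    string
	CachedAt time.Time
	TTL      time.Duration
}

// stale reports whether the entry has outlived its TTL at time now.
func (e cachedEntry) stale(now time.Time) bool { return now.Sub(e.CachedAt) > e.TTL }

// lookup returns cached data while it is fresh, refetches when it is stale or
// missing, and falls back to the stale copy if the fetch fails.
// The second result reports whether the returned value is stale.
func lookup(cached *cachedEntry, now time.Time, fetch func() (string, error)) (string, bool, error) {
	if cached != nil && !cached.stale(now) {
		return cached.Value, false, nil
	}
	fresh, err := fetch()
	if err == nil {
		return fresh, false, nil
	}
	if cached != nil {
		return cached.Value, true, nil // degraded: serve stale metadata
	}
	return "", false, fmt.Errorf("no cached metadata and fetch failed: %w", err)
}

func main() {
	entry := &cachedEntry{Value: "cached-metadata", CachedAt: refTime, TTL: demoTTL}
	down := func() (string, error) { return "", errUpstreamDown }

	value, stale, err := lookup(entry, refTime.Add(2*demoTTL), down)
	fmt.Println(value, stale, err)
}
```

Returning the stale flag lets the caller log the degraded state or apply a more conservative policy.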

Property 36: Concurrent Request Handling

For any proxy instance, it should handle at least 1000 concurrent requests per second.

Validates: Requirements 13.4

Property 37: Configuration Parsing on Startup

For any valid configuration file provided at startup, the proxy should successfully parse and validate it.

Validates: Requirements 14.1

Property 38: Hot Configuration Reload

For any configuration change detected at runtime, the new settings should be applied without dropping existing connections.

Validates: Requirements 14.3

Property 39: Configuration Completeness

For any proxy configuration, it should support settings for upstream repositories, policies, cache settings, and security thresholds.

Validates: Requirements 14.4

Property 40: Health Check Endpoint

For any health check request, the endpoint should return configuration status and system health information.

Validates: Requirements 14.5

Property 41: Audit Logging for Requests

For any package request received, the request details should be logged to the audit log.

Validates: Requirements 8.1

Property 42: Audit Logging for Decisions

For any security decision made (allow/block/warn), the decision, reasoning, and applicable rules should be logged to the audit log.

Validates: Requirements 8.2

Property 43: Audit Logging for Cache Events

For any metadata cache event (hit/miss/evict/invalidate), the event should be logged to the audit log.

Validates: Requirements 8.3

Property 44: Audit Log Blocking Details

For any blocked package, the audit log should include the block reason, policy rule, and relevant security data (age, CVEs).

Validates: Requirements 8.4

Property 45: Audit Log Completeness

For any audit log entry, it should include timestamps, client identity, package details, and security outcomes.

Validates: Requirements 8.5

Error Handling

Error Categories

  1. Upstream Errors: Registry unavailable, network timeouts, invalid responses
  2. Validation Errors: Checksum mismatch, signature verification failure, malformed metadata
  3. Policy Errors: Age threshold violation, CVE threshold violation, blocked by policy
  4. Configuration Errors: Invalid YAML, missing required fields, conflicting rules
  5. Cache Errors: Cache full, cache corruption, cache backend unavailable
  6. System Errors: Out of memory, disk full, database connection failure

Error Handling Strategy

// ErrorResponse represents an error returned to clients
type ErrorResponse struct {
    StatusCode int                    `json:"statusCode"`
    ErrorCode  string                 `json:"errorCode"`
    Message    string                 `json:"message"`
    Details    map[string]interface{} `json:"details,omitempty"`
    Retryable  bool                   `json:"retryable"`
}
 
// UpstreamError represents upstream registry failures
type UpstreamError struct {
    Registry  string
    Cause     error
    Retryable bool
}
 
func (e *UpstreamError) Error() string {
    return fmt.Sprintf("upstream registry %s error: %v", e.Registry, e.Cause)
}
 
// PolicyViolationError represents policy violations
type PolicyViolationError struct {
    PackageID  *PackageIdentifier
    Version    string
    Violations []*RuleResult
}
 
func (e *PolicyViolationError) Error() string {
    return fmt.Sprintf("package %s:%s violates policy", e.PackageID.Name, e.Version)
}
 
// ValidationError represents validation failures
type ValidationError struct {
    PackageID      *PackageIdentifier
    Version        string
    ValidationType string
    Details        string
}
 
func (e *ValidationError) Error() string {
    return fmt.Sprintf("validation failed for %s:%s: %s", e.PackageID.Name, e.Version, e.Details)
}
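
These error types can be mapped onto client responses with `errors.As`, which also sees through wrapped errors. The sketch below is illustrative only: the HTTP status codes are assumptions, since this document does not fix them, and the types are simplified copies with an added `Unwrap` method.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

type upstreamError struct {
	Registry string
	Cause    error
}

func (e *upstreamError) Error() string {
	return fmt.Sprintf("upstream registry %s error: %v", e.Registry, e.Cause)
}

// Unwrap exposes the cause so errors.Is and errors.As can inspect it.
func (e *upstreamError) Unwrap() error { return e.Cause }

type policyViolationError struct {
	Package string
	Version string
}

func (e *policyViolationError) Error() string {
	return fmt.Sprintf("package %s:%s violates policy", e.Package, e.Version)
}

// statusFor maps an error chain to an HTTP status and a retryable flag.
// Assumption: 403 for policy blocks, 502 for upstream failures, 500 otherwise.
func statusFor(err error) (status int, retryable bool) {
	var upstream *upstreamError
	var policy *policyViolationError
	switch {
	case err == nil:
		return http.StatusOK, false
	case errors.As(err, &policy):
		return http.StatusForbidden, false
	case errors.As(err, &upstream):
		return http.StatusBadGateway, true
	default:
		return http.StatusInternalServerError, false
	}
}

func main() {
	cause := &upstreamError{Registry: "registry.npmjs.org", Cause: errors.New("timeout")}
	wrapped := fmt.Errorf("resolve version: %w", cause)

	status, retry := statusFor(wrapped)
	fmt.Println(status, retry, wrapped)
}
```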

Fallback Behavior

  1. Upstream Unavailable: Use cached metadata, if available, for policy evaluation. The proxied package request still fails if the artifact cannot be reached upstream.
  2. CVE Database Unavailable: Apply configured fallback (block, allow, or warn)
  3. Cache Backend Unavailable: Continue proxying without metadata caching (degraded mode, all metadata fetched from upstream)
  4. Partial Metadata: Use available metadata, log warning, apply conservative policy

Circuit Breaker

Implement circuit breaker pattern for upstream registries:

  • After N consecutive failures, open circuit (stop trying)
  • After timeout period, attempt one request (half-open)
  • If successful, close circuit (resume normal operation)
  • If failed, reopen circuit
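
The four steps above can be sketched as a small state machine. This is a simplified illustration rather than a production breaker: `breaker`, `newBreaker`, `fakeClock`, and `demoCooldown` are hypothetical names, the clock is injected so the example needs no sleeping, and concurrent callers arriving right after the cooldown may all be let through instead of exactly one.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

const demoCooldown = 30 * time.Second

var (
	errCircuitOpen = errors.New("circuit open: upstream calls suspended")
	errUpstream    = errors.New("upstream failure") // stand-in for a real error
)

// fakeClock is a controllable clock for the demonstration.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

type breaker struct {
	mu          sync.Mutex
	maxFailures int           // N consecutive failures that open the circuit
	cooldown    time.Duration // wait before a half-open trial
	now         func() time.Time
	failures    int
	open        bool
	openedAt    time.Time
}

func newBreaker(maxFailures int, cooldown time.Duration, now func() time.Time) *breaker {
	return &breaker{maxFailures: maxFailures, cooldown: cooldown, now: now}
}

// Call runs fn unless the circuit is open and the cooldown has not elapsed.
func (b *breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.open && b.now().Sub(b.openedAt) < b.cooldown {
		b.mu.Unlock()
		return errCircuitOpen
	}
	b.mu.Unlock()

	err := fn() // circuit closed, or half-open trial after the cooldown

	b.mu.Lock()
	defer b.mu.Unlock()
	if err == nil {
		b.failures, b.open = 0, false // success closes the circuit
		return nil
	}
	b.failures++
	if b.open || b.failures >= b.maxFailures {
		b.open, b.openedAt = true, b.now() // open, or reopen after a failed trial
	}
	return err
}

func main() {
	clock := &fakeClock{t: time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)}
	b := newBreaker(2, demoCooldown, clock.Now)
	fail := func() error { return errUpstream }
	ok := func() error { return nil }

	fmt.Println(b.Call(fail)) // failure 1
	fmt.Println(b.Call(fail)) // failure 2 opens the circuit
	fmt.Println(b.Call(ok))   // rejected while open
	clock.Advance(demoCooldown)
	fmt.Println(b.Call(ok)) // half-open trial succeeds, circuit closes
}
```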

Testing Strategy

Dual Testing Approach

This project will use both unit tests and property-based tests to ensure comprehensive coverage:

  • Unit tests: Verify specific examples, edge cases, and error conditions
  • Property-based tests: Verify universal properties across all inputs

Both types of tests are complementary and necessary. Unit tests catch concrete bugs in specific scenarios, while property-based tests verify general correctness across a wide range of inputs.

Property-Based Testing

We will use gopter (github.com/leanovate/gopter) as our property-based testing library. Each correctness property listed above will be implemented as a property-based test.

Configuration:

  • Minimum 100 iterations per property test
  • Each test must reference its design document property
  • Tag format: Feature: sec-proxy, Property N: [property text]

Example Property Test:

import (
    "context"
    "testing"
    "time"

    "github.com/leanovate/gopter"
    "github.com/leanovate/gopter/gen"
    "github.com/leanovate/gopter/prop"
)
 
// Feature: sec-proxy, Property 4: Metadata Cache Hit
func TestCachedMetadataServedWithoutUpstreamFetch(t *testing.T) {
    properties := gopter.NewProperties(nil)
    
    properties.Property("cached metadata is served without upstream fetch", prop.ForAll(
        func(packageID *PackageIdentifier, metadata *PackageMetadata) bool {
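            // Setup: cache the metadata (metadataCache is a shared test fixture)
            ctx := context.Background()
            key := &MetadataCacheKey{Registry: packageID.Registry, PackageID: packageID}
            if err := metadataCache.Put(ctx, key, metadata, time.Hour); err != nil {
                return false
            }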
            
            // Mock upstream to track if it's called
            upstreamCalled := false
            mockUpstream := &MockUpstream{
                FetchMetadataFunc: func(ctx context.Context, id *PackageIdentifier) (*PackageMetadata, error) {
                    upstreamCalled = true
                    return nil, nil
                },
            }
            
            // Act: request metadata for the cached package
            handler := NewDockerHandler(mockUpstream, metadataCache)
            _, err := handler.FetchMetadata(ctx, packageID)
            
            // Assert: the lookup succeeded and upstream was not called
            return err == nil && !upstreamCalled
        },
        genPackageIdentifier(),
        genPackageMetadata(),
    ))
    
    properties.TestingRun(t, gopter.ConsoleReporter(false))
}

Unit Testing

Unit tests will focus on:

  1. Specific Examples: Test known package requests with expected outcomes
  2. Edge Cases: Empty metadata, missing timestamps, malformed checksums
  3. Error Conditions: Network failures, invalid configurations, cache corruption
  4. Integration Points: Handler selection, rule evaluation, cache operations

Example Unit Test:

func TestAgeRule_BlocksRecentPackage(t *testing.T) {
    yesterday := time.Now().Add(-24 * time.Hour)
    packageInfo := &VersionInfo{
        Version:     "1.0.0",
        PublishedAt: yesterday,
    }
    
    rule := &AgeRule{
        id:             "age-7days",
        minimumAgeDays: 7,
    }
    
    ctx := context.Background()
    evalCtx := &EvaluationContext{
        CurrentTime: time.Now(),
    }
    
    result, err := rule.Evaluate(ctx, &PackageIdentifier{Name: "test-package"}, packageInfo, evalCtx)
    
    if err != nil {
        t.Fatalf("unexpected error: %v", err)
    }
    
    if result.Passed {
        t.Error("expected rule to fail for recent package")
    }
    
    if result.Severity != "block" {
        t.Errorf("expected severity 'block', got '%s'", result.Severity)
    }
}
 
func TestAgeRule_AllowsOldPackage(t *testing.T) {
    thirtyDaysAgo := time.Now().Add(-30 * 24 * time.Hour)
    packageInfo := &VersionInfo{
        Version:     "1.0.0",
        PublishedAt: thirtyDaysAgo,
    }
    
    rule := &AgeRule{
        id:             "age-7days",
        minimumAgeDays: 7,
    }
    
    ctx := context.Background()
    evalCtx := &EvaluationContext{
        CurrentTime: time.Now(),
    }
    
    result, err := rule.Evaluate(ctx, &PackageIdentifier{Name: "test-package"}, packageInfo, evalCtx)
    
    if err != nil {
        t.Fatalf("unexpected error: %v", err)
    }
    
    if !result.Passed {
        t.Error("expected rule to pass for old package")
    }
}

Test Organization

tests/
├── unit/
│   ├── handlers/
│   │   ├── docker_test.go
│   │   ├── npm_test.go
│   │   ├── pip_test.go
│   │   ├── maven_test.go
│   │   └── go_test.go
│   ├── rules/
│   │   ├── age_rule_test.go
│   │   ├── cve_rule_test.go
│   │   └── rule_engine_test.go
│   ├── cache/
│   │   ├── memory_cache_test.go
│   │   ├── redis_cache_test.go
│   │   └── cache_eviction_test.go
│   └── config/
│       └── yaml_parser_test.go
├── property/
│   ├── proxying_property_test.go
│   ├── metadata_caching_property_test.go
│   ├── rules_property_test.go
│   └── version_resolution_property_test.go
└── integration/
    ├── end_to_end_test.go
    └── multi_registry_test.go

Test Generators (Arbitraries)

For property-based testing, we need generators for domain objects:

// Generate random package identifiers
func genPackageIdentifier() gopter.Gen {
    return gopter.CombineGens(
        gen.OneConstOf("docker.io", "registry.npmjs.org", "pypi.org"),
        gen.PtrOf(gen.AlphaString()),
        gen.AlphaString().SuchThat(func(s string) bool { return len(s) > 0 }),
    ).Map(func(vals []interface{}) *PackageIdentifier {
        return &PackageIdentifier{
            Registry:  vals[0].(string),
            Namespace: vals[1].(*string),
            Name:      vals[2].(string),
        }
    })
}
 
// Generate random version info
func genVersionInfo() gopter.Gen {
    return gopter.CombineGens(
        gen.AlphaString().SuchThat(func(s string) bool { return len(s) > 0 }),
        gen.Time(),
        gen.MapOf(gen.AlphaString(), gen.AlphaString()),
        gen.Int64Range(0, 1<<40), // artifact sizes are never negative
    ).Map(func(vals []interface{}) *VersionInfo {
        return &VersionInfo{
            Version:     vals[0].(string),
            PublishedAt: vals[1].(time.Time),
            Checksums:   vals[2].(map[string]string),
            Size:        vals[3].(int64),
        }
    })
}
 
// Generate random package metadata
func genPackageMetadata() gopter.Gen {
    return gopter.CombineGens(
        genPackageIdentifier(),
        gen.SliceOf(genVersionInfo()).SuchThat(func(s []*VersionInfo) bool { return len(s) > 0 }),
        gen.MapOf(gen.AlphaString(), gen.AlphaString()),
    ).Map(func(vals []interface{}) *PackageMetadata {
        return &PackageMetadata{
            PackageID: vals[0].(*PackageIdentifier),
            Versions:  vals[1].([]*VersionInfo),
            Tags:      vals[2].(map[string]string),
        }
    })
}
 
// Generate random age rule configurations
func genAgeRuleConfig() gopter.Gen {
    return gen.IntRange(1, 365).Map(func(days int) *AgeRule {
        return &AgeRule{
            id:             "test-age-rule",
            minimumAgeDays: days,
        }
    })
}

Performance Testing

In addition to functional tests, we will include performance tests:

  1. Latency Tests: Verify that responses served from cached metadata complete in under 100ms
  2. Throughput Tests: Verify that the proxy sustains at least 1000 requests/second
  3. Load Tests: Verify behavior under sustained high load
  4. Stress Tests: Verify graceful degradation under extreme load

Security Testing

  1. Fuzzing: Generate malformed requests and configurations
  2. Injection Tests: Test for command injection, path traversal
  3. Authentication Tests: Verify proper credential handling
  4. TLS Tests: Verify secure communication
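
To make the path traversal item concrete, the sketch below shows one possible shape for a validator that such tests would target. The function validatePackageName and its rule set are assumptions for illustration, not the proxy's actual validation logic. It expects names that have already been URL-decoded.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errUnsafeName = errors.New("unsafe package name")

// validatePackageName is a hypothetical validator. It rejects names that
// could escape the intended upstream path: empty names, backslashes, NUL
// bytes, and any empty, "." or ".." path segment (which also covers
// leading slashes and doubled slashes).
func validatePackageName(name string) error {
	if name == "" || strings.ContainsAny(name, "\\\x00") {
		return errUnsafeName
	}
	for _, segment := range strings.Split(name, "/") {
		if segment == "" || segment == "." || segment == ".." {
			return errUnsafeName
		}
	}
	return nil
}

func main() {
	names := []string{"library/nginx", "@scope/pkg", "../etc/passwd", "a/../../b", "a//b"}
	for _, name := range names {
		fmt.Printf("%q rejected: %v\n", name, validatePackageName(name) != nil)
	}
}
```

Injection tests would then feed a corpus of hostile names through the validator and assert that each one is rejected before any upstream request is built.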

Implementation Notes

Technology Stack

  • Language: Go (for performance, concurrency, and infrastructure tooling ecosystem)
  • HTTP Server: Standard library net/http or gorilla/mux for routing
  • Cache Backend: Redis (for distributed metadata caching) or in-memory with sync.Map (for simple deployments)
  • Configuration: gopkg.in/yaml.v3 for YAML parsing with struct validation
  • Testing: Standard library testing + github.com/leanovate/gopter for property-based testing
  • Logging: go.uber.org/zap or github.com/rs/zerolog (structured logging)
  • Metrics: github.com/prometheus/client_golang

Deployment Considerations

  1. Container Image: Provide official Docker image
  2. Kubernetes: Provide Helm chart with HA configuration
  3. Configuration: Support ConfigMaps and Secrets
  4. Monitoring: Export Prometheus metrics
  5. Health Checks: Liveness and readiness probes

Security Considerations

  1. TLS: Support TLS for client connections
  2. Authentication: Support basic auth, token auth, mTLS
  3. Authorization: Support per-registry access control
  4. Secrets: Never log credentials or tokens
  5. Validation: Strict input validation on all requests

Performance Optimizations

  1. Connection Pooling: Reuse connections to upstream registries
  2. Streaming: Stream large packages without buffering in memory
  3. Compression: Support gzip/brotli compression for metadata responses
  4. Parallel Fetching: Fetch metadata and CVE data in parallel
  5. Metadata Caching: Cache metadata separately with appropriate TTLs to reduce upstream API calls

Observability

  1. Metrics: Request rate, error rate, latency, cache hit rate
  2. Tracing: Distributed tracing for request flows
  3. Logging: Structured logs with correlation IDs
  4. Dashboards: Pre-built Grafana dashboards
  5. Alerts: Alert rules for common issues