Design Document: AWS Integration with Crossplane and LocalStack

Overview

This design implements platform-level AWS integration for k8s-lab, using Crossplane as the infrastructure-as-code orchestrator and LocalStack as a local AWS-compatible environment. Application teams declare AWS resources (API Gateway, Lambda, DynamoDB, S3, IAM) as Kubernetes custom resources, backed by the Custom Resource Definitions (CRDs) that the Crossplane AWS provider installs, and manage them through GitOps workflows with ArgoCD. The rest of this document refers to these custom resources as "AWS resource CRDs".

The architecture follows a layered approach:

  • Platform Layer: Crossplane core, AWS provider, LocalStack service
  • Configuration Layer: ProviderConfig for LocalStack connectivity
  • Application Layer: AWS resource CRDs managed by application teams
  • GitOps Layer: ArgoCD for declarative resource management

Architecture

Component Diagram

graph TB
    subgraph "Application Teams"
        AppCRD[AWS Resource CRDs<br/>API Gateway, Lambda, etc.]
    end
    
    subgraph "GitOps Layer"
        ArgoCD[ArgoCD]
        Git[Git Repository]
    end
    
    subgraph "Crossplane Layer"
        CP[Crossplane Core]
        AWSP[AWS Provider]
        PC[ProviderConfig<br/>localstack]
    end
    
    subgraph "LocalStack Service"
        LS[LocalStack Pod]
        LSAPI[LocalStack API<br/>Port 4566]
        LSSVC[LocalStack Service]
        Ingress[Ingress Controller]
    end
    
    subgraph "External Access"
        CLI[awslocal CLI]
        Tools[Testing Tools]
    end
    
    Git -->|sync| ArgoCD
    ArgoCD -->|apply| AppCRD
    AppCRD -->|references| PC
    AppCRD -->|reconcile| AWSP
    AWSP -->|uses| PC
    PC -->|endpoint| LSSVC
    LSSVC -->|routes to| LSAPI
    LSAPI -->|runs in| LS
    Ingress -->|exposes| LSSVC
    CLI -->|connects via| Ingress
    Tools -->|connects via| Ingress

Deployment Architecture

graph LR
    subgraph "crossplane-system namespace"
        CPPod[Crossplane Pod]
        AWSPod[AWS Provider Pod]
    end
    
    subgraph "localstack namespace"
        LSPod[LocalStack Pod]
        LSPVC[PersistentVolumeClaim]
        LSSvc[Service<br/>Port 4566]
        LSIng[Ingress]
    end
    
    subgraph "Cluster-wide"
        PC[ProviderConfig CRD]
        AWSCRD[AWS Resource CRDs]
    end
    
    CPPod -->|manages| AWSPod
    AWSPod -->|reads| PC
    PC -->|points to| LSSvc
    AWSCRD -->|reconciled by| AWSPod
    LSPod -->|uses| LSPVC
    LSSvc -->|routes to| LSPod
    LSIng -->|exposes| LSSvc

Components and Interfaces

1. Crossplane Core

Purpose: Kubernetes-native infrastructure orchestration engine

Installation Method: Helm chart via Kustomize

Kustomize Structure:

components/aws-platform/
├── base/
│   ├── kustomization.yaml
│   ├── namespace-crossplane.yaml
│   ├── namespace-localstack.yaml
│   ├── crossplane/
│   │   ├── helmCharts.yaml
│   │   ├── helmValues.yaml
│   │   └── provider-aws.yaml
│   ├── localstack/
│   │   ├── helmCharts.yaml
│   │   ├── helmValues.yaml
│   │   └── ingress.yaml
│   └── providerconfig/
│       ├── secret.yaml
│       └── providerconfig-localstack.yaml
└── overlays/
    └── lab/
        └── kustomization.yaml

Helm Chart Configuration:

# components/aws-platform/base/crossplane/helmCharts.yaml
# Helm inflation requires: kustomize build --enable-helm
apiVersion: builtin
kind: HelmChartInflationGenerator
metadata:
  name: crossplane
name: crossplane
repo: https://charts.crossplane.io/stable
version: 1.14.0
releaseName: crossplane
namespace: crossplane-system
valuesFile: helmValues.yaml

Helm Values:

# helmValues.yaml
replicas: 1
image:
  repository: crossplane/crossplane
  tag: v1.14.0
args:
  - --enable-composition-revisions
  - --enable-environment-configs
# The Crossplane chart reads resourcesCrossplane, not a top-level resources key
resourcesCrossplane:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 256Mi

Namespace: crossplane-system

Key Resources:

  • Deployment: crossplane
  • ServiceAccount: crossplane
  • ClusterRole: crossplane (manages CRDs, compositions)
  • ClusterRoleBinding: crossplane

Interfaces:

  • Kubernetes API: Watches for provider and resource CRDs
  • Metrics: Prometheus endpoint on port 8080

2. AWS Provider

Purpose: Crossplane provider for AWS resource management

Installation Method: Crossplane Provider CRD

Configuration:

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws
spec:
  package: xpkg.upbound.io/upbound/provider-aws:v0.47.0
  controllerConfigRef:
    name: aws-provider-config

Controller Configuration:

apiVersion: pkg.crossplane.io/v1alpha1
kind: ControllerConfig
metadata:
  name: aws-provider-config
spec:
  args:
    - --poll=1m
    - --max-reconcile-rate=100
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 100m
      memory: 256Mi

Supported AWS Services:

  • API Gateway (REST APIs, Resources, Methods, Deployments, Stages)
  • Lambda (Functions, Permissions, Event Source Mappings)
  • DynamoDB (Tables, Global Tables)
  • S3 (Buckets, Bucket Policies)
  • IAM (Roles, Policies, Role Policy Attachments)

Interfaces:

  • Crossplane API: Registers AWS CRDs
  • LocalStack API: HTTP/HTTPS to LocalStack endpoint
  • Kubernetes API: Updates resource status

3. LocalStack Service

Purpose: Local AWS cloud emulator for development and testing

Installation Method: Helm chart via Kustomize

Kustomize Structure:

infra/kustomize/localstack/
├── kustomization.yaml
├── namespace.yaml
├── helmCharts.yaml
├── helmValues.yaml
├── ingress.yaml
└── pvc.yaml

Helm Chart Configuration:

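# helmCharts.yaml
# Helm inflation requires: kustomize build --enable-helm
apiVersion: builtin
kind: HelmChartInflationGenerator
metadata:
  name: localstack
name: localstack
# Assumed chart repository URL; verify against the LocalStack chart source before use
repo: https://localstack.github.io/helm-charts
version: 0.6.0
releaseName: localstack
namespace: localstack
valuesFile: helmValues.yaml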

Helm Values:

# helmValues.yaml
image:
  repository: localstack/localstack
  tag: "3.0"
 
service:
  type: ClusterIP
  edgeService:
    targetPort: 4566
 
startServices: "apigateway,lambda,dynamodb,s3,iam"
 
debug: true
 
persistence:
  enabled: true
  storageClass: local-path
  accessModes:
    - ReadWriteOnce
  size: 10Gi
 
resources:
  limits:
    cpu: 1000m
    memory: 1Gi
  requests:
    cpu: 200m
    memory: 512Mi
 
livenessProbe:
  httpGet:
    path: /_localstack/health
    port: 4566
  initialDelaySeconds: 30
  periodSeconds: 10
 
readinessProbe:
  httpGet:
    path: /_localstack/health
    port: 4566
  initialDelaySeconds: 10
  periodSeconds: 5

Ingress (separate manifest in kustomization):

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: localstack
  namespace: localstack
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
spec:
  ingressClassName: nginx
  rules:
  - host: localstack.k8s-lab.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: localstack
            port:
              number: 4566

Interfaces:

  • AWS-compatible API: Port 4566 (all AWS services)
  • Health endpoint: /_localstack/health
  • Ingress: http://localstack.k8s-lab.local

4. ProviderConfig for LocalStack

Purpose: Configures AWS provider to use LocalStack endpoint

Configuration:

apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: localstack
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: localstack-creds
      key: credentials
  endpoint:
    url:
      type: Static
      static: http://localstack.localstack.svc.cluster.local:4566
    hostnameImmutable: true
  skip_credentials_validation: true
  skip_metadata_api_check: true
  skip_requesting_account_id: true
  s3_use_path_style: true

Credentials Secret:

apiVersion: v1
kind: Secret
metadata:
  name: localstack-creds
  namespace: crossplane-system
type: Opaque
stringData:
  credentials: |
    [default]
    aws_access_key_id = test
    aws_secret_access_key = test

Interfaces:

  • Referenced by AWS resource CRDs via providerConfigRef
  • Points to LocalStack service endpoint
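
The credentials value in the Secret above uses the AWS shared-credentials INI format. As a minimal sketch, a pre-deployment check could parse that value with Python's standard configparser and confirm that the [default] profile carries both keys. The function name check_credentials is hypothetical and is not part of Crossplane or LocalStack.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def check_credentials(ini_text: str, profile: str = "default") -> list[str]:
    """Return the required keys missing from the profile (empty list if complete)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(profile):
        # The whole profile is absent, so every required key is missing.
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(profile, key)]


# Hypothetical usage with the same content as the localstack-creds Secret.
SAMPLE = "[default]\naws_access_key_id = test\naws_secret_access_key = test\n"
missing = check_credentials(SAMPLE)
```

With the sample above, missing is an empty list. A check like this would catch a malformed Secret before the provider reports a credentials error during reconciliation.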

5. AWS Resource CRDs

Purpose: Kubernetes-native representation of AWS resources

Example: API Gateway REST API:

apiVersion: apigateway.aws.upbound.io/v1beta1
kind: RestAPI
metadata:
  name: example-api
spec:
  forProvider:
    name: example-api
    description: Example API Gateway
    region: us-east-1
  providerConfigRef:
    name: localstack

Example: Lambda Function:

apiVersion: lambda.aws.upbound.io/v1beta1
kind: Function
metadata:
  name: example-function
spec:
  forProvider:
    functionName: example-function
    runtime: python3.11
    handler: index.handler
    role: arn:aws:iam::000000000000:role/lambda-role
    region: us-east-1
    s3Bucket: lambda-code
    s3Key: function.zip
  providerConfigRef:
    name: localstack

Example: DynamoDB Table:

apiVersion: dynamodb.aws.upbound.io/v1beta1
kind: Table
metadata:
  name: example-table
spec:
  forProvider:
    name: example-table
    region: us-east-1
    billingMode: PAY_PER_REQUEST
    hashKey: id
    attribute:
    - name: id
      type: S
  providerConfigRef:
    name: localstack

Status Fields:

  • conditions: Resource reconciliation status
  • atProvider: AWS resource attributes (ARN, ID, etc.)

Interfaces:

  • Managed by AWS Provider
  • Status updated by Crossplane
  • Synced by ArgoCD
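
Crossplane managed resources report their reconciliation state through Synced and Ready entries in status.conditions. The following sketch shows one way a test helper might interpret those fields on a resource that has already been fetched as a dictionary (for example from kubectl get ... -o json). The helper names condition_status and is_available are hypothetical.

```python
def condition_status(resource: dict, cond_type: str) -> str:
    """Return the status string of a condition, or "Unknown" if it is absent."""
    for cond in resource.get("status", {}).get("conditions", []):
        if cond.get("type") == cond_type:
            return cond.get("status", "Unknown")
    return "Unknown"


def is_available(resource: dict) -> bool:
    """True only when both Synced and Ready report "True"."""
    return all(condition_status(resource, t) == "True" for t in ("Synced", "Ready"))


# Hypothetical resource snapshot: synced with the provider but not yet ready.
example = {
    "status": {
        "conditions": [
            {"type": "Synced", "status": "True"},
            {"type": "Ready", "status": "False"},
        ],
        "atProvider": {},
    }
}
available = is_available(example)
```

Treating a missing condition as "Unknown" rather than "False" keeps a freshly applied resource distinguishable from one that has reported a failure.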

Data Models

Crossplane Resource Lifecycle States

stateDiagram-v2
    [*] --> Creating: CRD Applied
    Creating --> Available: Resource Created
    Creating --> Failed: Creation Error
    Available --> Updating: CRD Modified
    Updating --> Available: Update Success
    Updating --> Failed: Update Error
    Available --> Deleting: CRD Deleted
    Failed --> Deleting: CRD Deleted
    Deleting --> [*]: Resource Removed
    Failed --> Creating: Retry
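
The state diagram above can also be expressed as a transition table, which makes it easy to check an observed event sequence against the intended lifecycle. This is a sketch only: the event names and the terminal state name "Removed" are illustrative labels chosen here, not Crossplane identifiers.

```python
# (current state, event) -> next state, mirroring the diagram edge by edge.
TRANSITIONS = {
    ("Creating", "resource_created"): "Available",
    ("Creating", "creation_error"): "Failed",
    ("Available", "crd_modified"): "Updating",
    ("Updating", "update_success"): "Available",
    ("Updating", "update_error"): "Failed",
    ("Available", "crd_deleted"): "Deleting",
    ("Failed", "crd_deleted"): "Deleting",
    ("Failed", "retry"): "Creating",
    ("Deleting", "resource_removed"): "Removed",
}


def next_state(state: str, event: str) -> str:
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def replay(events, start: str = "Creating") -> str:
    """Apply a sequence of events starting from the state entered when a CRD is applied."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```

For example, replay(["creation_error", "retry", "resource_created"]) walks Creating, Failed, Creating, Available, while an event that the diagram does not allow raises ValueError.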

ProviderConfig Connection Model

# Connection flow
ProviderConfig (localstack)
  ├── credentials: Secret reference
  │   ├── aws_access_key_id: test
  │   └── aws_secret_access_key: test
  ├── endpoint: LocalStack service URL
  │   └── http://localstack.localstack.svc.cluster.local:4566
  └── options:
      ├── skip_credentials_validation: true
      ├── skip_metadata_api_check: true
      ├── skip_requesting_account_id: true
      └── s3_use_path_style: true

Resource Ownership Model

graph TD
    ArgoCD[ArgoCD Application]
    CRD[AWS Resource CRD]
    MR[Managed Resource]
    AWS[LocalStack Resource]
    
    ArgoCD -->|owns| CRD
    CRD -->|creates| MR
    MR -->|provisions| AWS
    AWS -->|status| MR
    MR -->|status| CRD

Correctness Properties

A property is a characteristic or behavior that should hold across all valid executions of a system: a formal statement of what the system should do. Properties bridge human-readable specifications and machine-verifiable correctness guarantees.

Property 1: Crossplane Installation Completeness

For any Crossplane installation, all core components (deployment, service account, RBAC resources) should be present, all pods should be running and healthy, and metrics should be exposed before the system is marked as operational.

Validates: Requirements 1.1, 1.2, 1.3, 1.4

Property 2: Configuration Persistence

For any Crossplane or LocalStack configuration, restarting the pods should preserve all configuration and data, with resources remaining accessible after restart.

Validates: Requirements 1.5, 3.4

Property 3: AWS Provider Readiness

For any AWS provider installation with a specified version, the provider pod should be running, healthy, and all AWS resource CRDs (RestAPI, Function, Table, Bucket, Role) should be registered cluster-wide before resources can be created.

Validates: Requirements 2.1, 2.2, 2.3

Property 4: LocalStack Service Availability

For any LocalStack deployment, all configured AWS services (API Gateway, Lambda, DynamoDB, S3, IAM) should respond to health checks, the service should be accessible via its cluster endpoint, and resource limits should be configured to prevent cluster exhaustion.

Validates: Requirements 3.1, 3.3, 3.5
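
As a sketch of how Property 4 might be checked, the helper below inspects the JSON returned by the /_localstack/health endpoint and lists the required services that are not healthy. It assumes the response contains a "services" object mapping service names to state strings, and that "available" and "running" are the healthy states; both assumptions should be confirmed against the LocalStack version in use. The function names are hypothetical.

```python
import json
import urllib.request

# Services enabled through startServices in the Helm values.
REQUIRED_SERVICES = ("apigateway", "lambda", "dynamodb", "s3", "iam")
# Assumed healthy state strings; verify against the deployed LocalStack version.
HEALTHY_STATES = {"available", "running"}


def unhealthy_services(health: dict, required=REQUIRED_SERVICES) -> list[str]:
    """Return required services that are missing or not in a healthy state."""
    services = health.get("services", {})
    return [name for name in required if services.get(name) not in HEALTHY_STATES]


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch and decode the health document from a LocalStack endpoint."""
    url = f"{base_url}/_localstack/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Inside the cluster this could be called as unhealthy_services(fetch_health("http://localstack.localstack.svc.cluster.local:4566")); an empty list would indicate that all five services respond as expected.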

Property 5: ProviderConfig Connectivity

For any ProviderConfig pointing to LocalStack, the configuration should have the correct endpoint URL, use LocalStack-compatible credentials, successfully connect to the LocalStack service, and be marked as ready before resources can reference it.

Validates: Requirements 5.1, 5.2, 5.3, 5.4

Property 6: ProviderConfig Health Reporting

For any ProviderConfig, when the target endpoint (LocalStack) becomes unavailable, the ProviderConfig status should reflect degraded health within a reasonable time period.

Validates: Requirements 5.5

Property 7: Resource Creation and Tracking

For any valid AWS resource CRD, when applied to the cluster, the Platform_Service should validate the specification, create the corresponding resource in LocalStack, track the resource state, and populate the CRD status with resource details (ARN, ID) and availability status.

Validates: Requirements 6.2, 6.3, 8.1, 8.2, 8.3

Property 8: Resource Creation Idempotency

For any AWS resource CRD, applying the same manifest multiple times should result in the same resource state in LocalStack without errors or duplicate resources.

Validates: Requirements 6.3

Property 9: Resource Update Consistency

For any AWS resource CRD update, the changes should be applied to the LocalStack resource without recreating it (unless replacement is required), and the CRD status should reflect the updated state.

Validates: Requirements 6.4, 8.4

Property 10: Resource Deletion Cleanup

For any AWS resource CRD deletion, the corresponding LocalStack resource should be removed before the CRD finalizer is removed, and if deletion fails, the system should retry and report errors in the CRD status.

Validates: Requirements 6.5, 8.5, 8.6

Property 11: Resource Error Reporting

For any AWS resource CRD that fails to create, update, or delete, the error details should be reported in the CRD status field with sufficient context for troubleshooting.

Validates: Requirements 6.6

Property 12: ArgoCD Resource Management

For any Crossplane CRD in Git, ArgoCD should be able to sync and apply the resource, respect resource dependencies and creation order, detect drift when resources are modified outside Git, and preserve resource status fields during sync operations.

Validates: Requirements 7.1, 7.2, 7.3, 7.4
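
One way ArgoCD can respect creation order is through sync waves, set with the argocd.argoproj.io/sync-wave annotation. The sketch below assigns a wave to a manifest based on its kind. The wave numbers in WAVE_BY_KIND are an illustrative assumption (IAM roles, buckets and tables first, then functions, then APIs), not an ordering prescribed by this design, and with_sync_wave is a hypothetical helper.

```python
import copy

# Illustrative ordering only: dependencies first, consumers later.
WAVE_BY_KIND = {"Role": 0, "Bucket": 0, "Table": 0, "Function": 1, "RestAPI": 2}


def with_sync_wave(manifest: dict, default_wave: int = 0) -> dict:
    """Return a copy of the manifest annotated with an ArgoCD sync wave."""
    wave = WAVE_BY_KIND.get(manifest.get("kind"), default_wave)
    result = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    annotations = result.setdefault("metadata", {}).setdefault("annotations", {})
    # Annotation values must be strings in Kubernetes metadata.
    annotations["argocd.argoproj.io/sync-wave"] = str(wave)
    return result


function_manifest = {"kind": "Function", "metadata": {"name": "example-function"}}
annotated = with_sync_wave(function_manifest)
```

ArgoCD applies lower waves before higher ones, so a Function in wave 1 would be applied after the Role and Bucket it depends on in wave 0.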

Property 13: ArgoCD Health Integration

For any Crossplane resource managed by ArgoCD, the ArgoCD application should correctly report the health status based on the resource’s readiness conditions.

Validates: Requirements 7.5

Property 14: Ingress Accessibility

For any external HTTP request to the LocalStack ingress endpoint, the request should be routed to the LocalStack service within the configured timeout period and receive a valid AWS API response.

Validates: Requirements 4.1, 4.2, 4.4

Property 15: Observability and Monitoring

For any running Crossplane installation, Prometheus metrics should be exposed and scrapeable, Kubernetes events should be emitted for resource operations, detailed error logs should be available, health check endpoints should be accessible, and metrics should track resource creation time and success/failure rates.

Validates: Requirements 10.1, 10.2, 10.3, 10.4, 10.5

Error Handling

Crossplane Installation Errors

Scenario: Helm installation fails due to resource constraints

Handling:

  • Helm will report installation failure with specific error
  • Check pod events: kubectl describe pod -n crossplane-system
  • Verify resource quotas and limits
  • Retry installation after addressing constraints

Recovery: Uninstall and reinstall with adjusted resource limits

AWS Provider Installation Errors

Scenario: Provider package fails to download or install

Handling:

  • Check Provider status: kubectl get provider provider-aws -o yaml
  • Verify network connectivity to package registry
  • Check controller logs: kubectl logs -n crossplane-system -l pkg.crossplane.io/provider=provider-aws

Recovery: Delete and recreate Provider resource

LocalStack Startup Failures

Scenario: LocalStack pod fails to start or crashes

Handling:

  • Check pod logs: kubectl logs -n localstack localstack-<pod-id>
  • Verify PVC is bound: kubectl get pvc -n localstack
  • Check resource limits and OOM kills
  • Verify service configuration

Recovery:

  • Delete pod to trigger restart
  • If the failure persists, delete and recreate the PVC (this discards all stored LocalStack data)

ProviderConfig Connection Failures

Scenario: ProviderConfig cannot connect to LocalStack

Handling:

  • Verify LocalStack service is running: kubectl get svc -n localstack
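  • Test connectivity from the provider's namespace with a temporary pod (provider images typically ship without a shell or curl, so kubectl exec into the provider pod is unlikely to work): kubectl run -n crossplane-system -it --rm debug --image=curlimages/curl --restart=Never -- curl http://localstack.localstack.svc.cluster.local:4566/_localstack/health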
  • Check ProviderConfig status: kubectl get providerconfig localstack -o yaml
  • Verify endpoint URL is correct

Recovery: Update ProviderConfig with correct endpoint

Resource Creation Failures

Scenario: AWS resource CRD fails to create LocalStack resource

Handling:

  • Check CRD status: kubectl get <resource-type> <resource-name> -o yaml
  • Review conditions and events
  • Check provider logs for API errors
  • Verify ProviderConfig reference is correct
  • Test resource creation directly in LocalStack using awslocal CLI

Recovery:

  • Fix CRD specification errors
  • Delete and recreate resource
  • Check LocalStack logs for service-specific errors

Resource Update Failures

Scenario: Resource update fails or causes drift

Handling:

  • Check if update is supported by LocalStack
  • Review CRD status for error messages
  • Compare desired state vs actual state
  • Check if resource requires replacement vs update

Recovery:

  • Revert to previous working state
  • Delete and recreate if update is not supported

Resource Deletion Failures

Scenario: Resource deletion hangs or fails

Handling:

  • Check for finalizers: kubectl get <resource-type> <resource-name> -o jsonpath='{.metadata.finalizers}'
  • Verify LocalStack resource exists
  • Check provider logs for deletion errors
  • Check for dependent resources blocking deletion

Recovery:

  • Manually delete LocalStack resource using awslocal CLI
  • Remove finalizer if resource is confirmed deleted: kubectl patch <resource-type> <resource-name> -p '{"metadata":{"finalizers":[]}}' --type=merge
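
A hanging deletion shows up on the object as a deletionTimestamp that is set while finalizers are still present. The sketch below flags such resources from their JSON representation. The five-minute grace period is an arbitrary illustrative threshold, and is_stuck_deleting is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def is_stuck_deleting(resource: dict, now: datetime,
                      grace: timedelta = timedelta(minutes=5)) -> bool:
    """True if deletion was requested more than `grace` ago and finalizers remain."""
    meta = resource.get("metadata", {})
    stamp = meta.get("deletionTimestamp")
    if not stamp or not meta.get("finalizers"):
        return False
    # Kubernetes serializes timestamps as RFC 3339 in UTC, e.g. 2024-01-01T12:00:00Z.
    deleted_at = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return now - deleted_at > grace


# Hypothetical snapshot of a managed resource waiting on its finalizer.
stuck = {
    "metadata": {
        "deletionTimestamp": "2024-01-01T12:00:00Z",
        "finalizers": ["finalizer.managedresource.crossplane.io"],
    }
}
check_time = datetime(2024, 1, 1, 12, 10, tzinfo=timezone.utc)
result = is_stuck_deleting(stuck, check_time)
```

Here result is True because ten minutes have passed since deletion was requested. Flagging resources this way helps decide when the manual recovery steps above are warranted, rather than removing finalizers prematurely.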

ArgoCD Sync Conflicts

Scenario: ArgoCD overwrites resource status or causes sync loops

Handling:

  • Configure ArgoCD to ignore status fields
  • Use sync options: RespectIgnoreDifferences=true
  • Check for spec drift vs status drift

Recovery: Update ArgoCD application with proper ignore rules
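
The ignore rules mentioned above live in the Application's spec.ignoreDifferences, combined with the RespectIgnoreDifferences=true sync option. The sketch below builds such an Application as a Python dictionary that can be serialized to JSON and applied with kubectl. The repository URL, path, project, and the argocd namespace are placeholder assumptions, and crossplane_application is a hypothetical helper.

```python
import json


def crossplane_application(name: str, repo_url: str, path: str, kinds) -> dict:
    """Build an ArgoCD Application that ignores /status on the given (group, kind) pairs."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},  # assumed ArgoCD namespace
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "path": path, "targetRevision": "HEAD"},
            "destination": {"server": "https://kubernetes.default.svc"},
            "ignoreDifferences": [
                {"group": group, "kind": kind, "jsonPointers": ["/status"]}
                for group, kind in kinds
            ],
            "syncPolicy": {"syncOptions": ["RespectIgnoreDifferences=true"]},
        },
    }


app = crossplane_application(
    "aws-resources",
    "https://example.com/k8s-lab.git",  # placeholder repository URL
    "apps/aws-resources",               # placeholder path
    [("apigateway.aws.upbound.io", "RestAPI"), ("dynamodb.aws.upbound.io", "Table")],
)
manifest_json = json.dumps(app, indent=2)
```

The resulting manifest_json could be piped to kubectl apply -f -. Listing each group and kind explicitly keeps the ignore rules narrow, so that drift in spec fields is still detected.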

Ingress Access Failures

Scenario: External tools cannot reach LocalStack via ingress

Handling:

  • Verify ingress is created: kubectl get ingress -n localstack
  • Check ingress controller logs
  • Test internal connectivity: kubectl run -it --rm debug --image=curlimages/curl --restart=Never -- curl http://localstack.localstack.svc.cluster.local:4566/_localstack/health
  • Verify DNS resolution for ingress hostname
  • Check ingress annotations and timeouts

Recovery:

  • Update ingress configuration
  • Restart ingress controller if needed

Testing Strategy

Unit Tests

Unit tests focus on configuration validation and resource specification correctness.

Test Categories:

  1. Helm Chart Validation

    • Validate Crossplane Helm values render correctly
    • Test resource limits and requests are within bounds
    • Verify RBAC permissions are complete
  2. Manifest Validation

    • Validate LocalStack deployment manifests
    • Test ProviderConfig structure
    • Verify CRD examples are syntactically correct
  3. Kustomize Build Tests

    • Test kustomize builds without errors
    • Verify namespace isolation
    • Check resource naming conventions

Example Unit Test:

# Test kustomize build (--enable-helm is required to inflate the Helm charts)
if kustomize build --enable-helm infra/kustomize/crossplane/ > /dev/null; then
  echo "✓ Crossplane kustomize build successful"
else
  echo "✗ Crossplane kustomize build failed"
  exit 1
fi

Integration Tests

Integration tests verify that components work together correctly within the cluster.

Test Categories:

  1. Crossplane Installation Tests

    • Deploy Crossplane via Helm
    • Verify all pods are running
    • Check CRDs are registered
    • Validate RBAC permissions
  2. LocalStack Deployment Tests

    • Deploy LocalStack
    • Verify pod is running and healthy
    • Test service connectivity
    • Verify PVC is bound and writable
  3. Provider Installation Tests

    • Install AWS provider
    • Verify provider pod is running
    • Check AWS CRDs are available
    • Test ProviderConfig creation
  4. Resource Lifecycle Tests

    • Create AWS resource CRD
    • Verify resource is created in LocalStack
    • Update resource and verify changes
    • Delete resource and verify cleanup
  5. ArgoCD Integration Tests

    • Create ArgoCD application for Crossplane resources
    • Verify sync succeeds
    • Test drift detection
    • Verify status preservation

Example Integration Test:

# Test LocalStack connectivity
kubectl run -it --rm test-localstack \
  --image=amazon/aws-cli \
  --restart=Never \
  --env="AWS_ACCESS_KEY_ID=test" \
  --env="AWS_SECRET_ACCESS_KEY=test" \
  --env="AWS_ENDPOINT_URL=http://localstack.localstack.svc.cluster.local:4566" \
  -- s3 ls
 
# Expected: Command succeeds (even if no buckets exist)

Acceptance Tests

Acceptance tests validate end-to-end workflows from an application team’s perspective.

Test Scenarios:

  1. API Gateway Deployment

    • Application team creates API Gateway REST API CRD
    • Verify API is created in LocalStack
    • Test API is accessible via LocalStack endpoint
    • Verify API can be updated and deleted
  2. Lambda Function Deployment

    • Application team creates Lambda function CRD
    • Upload function code to S3 in LocalStack
    • Verify function is created and invocable
    • Test function can be updated with new code
  3. DynamoDB Table Management

    • Application team creates DynamoDB table CRD
    • Verify table is created with correct schema
    • Test data can be written and read
    • Verify table can be deleted
  4. GitOps Workflow

    • Commit AWS resource CRDs to Git
    • ArgoCD syncs resources to cluster
    • Verify resources are created in LocalStack
    • Update CRDs in Git and verify changes propagate
  5. External Tool Access

    • Use awslocal CLI via ingress endpoint
    • Create resources using CLI
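    • Verify CLI-created resources exist in LocalStack; they appear in Kubernetes only once a matching CRD adopts them, since Crossplane does not discover external resources on its own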
    • Test resource management via both CLI and CRDs

Example Acceptance Test:

# Test complete API Gateway workflow
# 1. Create API Gateway CRD
kubectl apply -f - <<EOF
apiVersion: apigateway.aws.upbound.io/v1beta1
kind: RestAPI
metadata:
  name: test-api
spec:
  forProvider:
    name: test-api
    description: Test API
    region: us-east-1
  providerConfigRef:
    name: localstack
EOF
 
# 2. Wait for API to be ready
kubectl wait --for=condition=Ready restapi/test-api --timeout=60s
 
# 3. Get API ID from status
API_ID=$(kubectl get restapi test-api -o jsonpath='{.status.atProvider.id}')
 
# 4. Verify API exists in LocalStack
awslocal apigateway get-rest-api --rest-api-id "$API_ID"
 
# 5. Clean up
kubectl delete restapi test-api
 
# Expected: All steps succeed

Property-Based Tests

Property-based tests validate universal correctness properties across many generated inputs.

Configuration:

  • Minimum 100 iterations per property test
  • Use appropriate PBT library for the test language (e.g., Hypothesis for Python, fast-check for TypeScript)

Property Test Examples:

  1. Property 8: Resource Creation Idempotency

    # Feature: aws-integration-crossplane-localstack, Property 8: Resource Creation Idempotency
    # aws_resource_crd() and the helper functions are assumed to be defined elsewhere in the test suite.
    from hypothesis import given, settings

    @settings(max_examples=100, deadline=None)  # at least 100 iterations; cluster calls are slow
    @given(aws_resource_crd())
    def test_resource_creation_idempotency(resource_crd):
        # Apply the resource and let it settle
        apply_resource(resource_crd)
        wait_for_ready(resource_crd)
        first_status = get_resource_status(resource_crd)

        # Re-apply the identical manifest
        apply_resource(resource_crd)
        wait_for_ready(resource_crd)
        second_status = get_resource_status(resource_crd)

        # Status should be identical
        assert first_status == second_status
  2. Property 10: Resource Deletion Cleanup

    # Feature: aws-integration-crossplane-localstack, Property 10: Resource Deletion Cleanup
    @settings(max_examples=100, deadline=None)
    @given(aws_resource_crd())
    def test_resource_deletion_cleanup(resource_crd):
        # Create resource
        apply_resource(resource_crd)
        wait_for_ready(resource_crd)
        resource_id = get_resource_id(resource_crd)

        # Delete resource
        delete_resource(resource_crd)
        wait_for_deletion(resource_crd)

        # Verify LocalStack resource is gone
        assert not localstack_resource_exists(resource_id)

Test Execution Order

  1. Unit tests - Fast validation of configurations
  2. Integration tests - Component interaction verification
  3. Acceptance tests - End-to-end workflow validation
  4. Property tests - Universal correctness verification

All test layers must pass before considering the implementation complete.
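
The ordering above, where a later layer runs only if every earlier layer passed, can be sketched as a small runner. This is an illustration under assumptions: each stage is modeled as a callable returning True on success, and run_stages is a hypothetical helper rather than part of any test framework.

```python
from typing import Callable, Iterable

Stage = tuple[str, Callable[[], bool]]


def run_stages(stages: Iterable[Stage]) -> list[str]:
    """Run stages in order, stopping at the first failure; return the names that passed."""
    passed = []
    for name, check in stages:
        if not check():
            break  # later, slower layers are skipped once a layer fails
        passed.append(name)
    return passed


# Hypothetical stand-ins for the four test layers.
layers = [
    ("unit", lambda: True),
    ("integration", lambda: True),
    ("acceptance", lambda: False),
    ("property", lambda: True),
]
completed = run_stages(layers)
```

In this example completed is ["unit", "integration"]: the acceptance layer fails, so the property tests never start. Running the fast layers first keeps feedback quick and avoids spending cluster time on property tests when a basic configuration error is present.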