Implementation Plan: N8n Workflow Management

Overview

This implementation plan extends the existing N8n Operator (https://github.com/craigedmunds/n8n-operator) to support declarative workflow management through a new N8nWorkflow CRD. The implementation will add new controller logic, credential management, and API client extensions while integrating with the existing operator codebase.

Tasks

  • 1. Set up development environment and analyze existing operator

    • Fork and clone the existing n8n-operator repository
    • Analyze existing code structure, patterns, and dependencies
    • Set up local development environment with existing operator
    • Requirements: All requirements (foundation)
  • 2. Define N8nWorkflow CRD and data models

    • 2.1 Create N8nWorkflow CRD definition

      • Define OpenAPI schema for N8nWorkflow resource
      • Include spec fields (n8nRef, workflow, credentials)
      • Include status fields (workflowId, active, lastSync, conditions)
      • Add CRD to existing operator manifests
      • Requirements: 1.1, 1.5, 2.1, 3.1, 4.1, 6.1, 6.5
    • [ ]* 2.2 Write property test for CRD schema validation

      • Property 1: CRD Schema Validation
      • Validates: Requirements 1.1, 1.5, 2.1, 3.1, 4.1, 6.1, 6.5
    • 2.3 Create Go data structures for N8nWorkflow

      • Define N8nWorkflowSpec, N8nWorkflowStatus structs
      • Implement JSON marshaling/unmarshaling
      • Add validation tags and methods
      • Requirements: 1.1, 1.5, 2.1, 3.1, 4.1
    • [ ]* 2.4 Write unit tests for data model validation

      • Test struct validation and serialization
      • Test edge cases and error conditions
      • Requirements: 1.1, 1.5, 2.1, 3.1, 4.1
  • 3. Extend N8n API client for workflow operations

    • 3.1 Analyze existing n8n API client in operator

      • Review existing HTTP client implementation
      • Identify reusable authentication and connection logic
      • Document existing API patterns
      • Requirements: 1.2, 1.3, 1.4
    • 3.2 Add workflow management methods to API client

      • Implement CreateWorkflow, UpdateWorkflow, DeleteWorkflow methods
      • Implement GetWorkflow, ActivateWorkflow methods
      • Add proper error handling and response parsing
      • Requirements: 1.2, 1.3, 1.4, 4.2, 4.3
    • [ ]* 3.3 Write unit tests for API client workflow methods

      • Mock HTTP responses for all workflow operations
      • Test error handling and edge cases
      • Requirements: 1.2, 1.3, 1.4, 4.2, 4.3
    • 3.4 Add credential management methods to API client

      • Implement CreateCredential, UpdateCredential, DeleteCredential methods
      • Support multiple credential types (HTTP Basic Auth, API keys, OAuth)
      • Requirements: 3.2, 3.3, 3.4
    • [ ]* 3.5 Write unit tests for API client credential methods

      • Test credential creation and management
      • Test different credential types
      • Requirements: 3.2, 3.3, 3.4
  • 4. Implement Credential Manager component

    • 4.1 Create CredentialManager interface and implementation

      • Implement ProcessCredentials method for secret resolution
      • Implement credential lifecycle management methods
      • Add support for Kubernetes secret references
      • Requirements: 3.1, 3.2, 3.4, 3.5
    • [ ]* 4.2 Write property test for credential injection ordering

      • Property 7: Credential Injection Ordering
      • Validates: Requirements 3.2
    • [ ]* 4.3 Write property test for credential type support

      • Property 8: Credential Type Support
      • Validates: Requirements 3.3, 3.5
    • [ ]* 4.4 Write property test for credential synchronization

      • Property 9: Credential Synchronization
      • Validates: Requirements 3.4
    • 4.5 Add ESO integration for credential management

      • Support ESO-managed secrets alongside regular Kubernetes secrets
      • Handle secret updates and synchronization
      • Requirements: 3.5
  • 5. Checkpoint - Ensure core components pass tests

    • Ensure all tests pass; ask the user if questions arise.
  • 6. Implement N8nWorkflow controller

    • [-] 6.1 Create workflow controller structure

      • Set up controller manager integration with existing operator
      • Implement controller reconciliation loop
      • Add controller to existing operator main.go
      • Requirements: 1.2, 1.3, 1.4, 1.6
    • [ ]* 6.2 Write property test for workflow creation synchronization

      • Property 2: Workflow Creation Synchronization
      • Validates: Requirements 1.2
    • [ ]* 6.3 Write property test for workflow update synchronization

      • Property 3: Workflow Update Synchronization
      • Validates: Requirements 1.3, 1.6
    • [ ]* 6.4 Write property test for workflow deletion cleanup

      • Property 4: Workflow Deletion Cleanup
      • Validates: Requirements 1.4
    • 6.5 Implement n8n instance validation logic

      • Validate n8n instance references exist and are accessible
      • Handle cross-namespace n8n instance references
      • Requirements: 2.2, 2.4
    • [ ]* 6.6 Write property test for n8n instance validation

      • Property 5: N8n Instance Validation
      • Validates: Requirements 2.2, 6.3
    • [ ]* 6.7 Write property test for workflow independence

      • Property 6: Workflow Independence
      • Validates: Requirements 2.4, 2.5
  • 7. Implement workflow lifecycle management

    • 7.1 Add workflow activation/deactivation logic

      • Implement active field processing in controller
      • Handle workflow state changes in n8n
      • Requirements: 4.2, 4.3, 4.4
    • [ ]* 7.2 Write property test for workflow activation control

      • Property 10: Workflow Activation Control
      • Validates: Requirements 4.2, 4.3
    • 7.3 Implement comprehensive status reporting

      • Update resource status with workflow ID, active state, timestamps
      • Report sync status and error conditions
      • Requirements: 4.4, 5.1, 5.2, 5.4
    • [ ]* 7.4 Write property test for status reporting completeness

      • Property 11: Status Reporting Completeness
      • Validates: Requirements 4.4, 5.1, 5.2, 5.4
    • 7.5 Add health monitoring and error handling

      • Implement health checks for workflow existence
      • Add comprehensive error reporting with descriptive messages
      • Requirements: 2.3, 4.5, 5.3, 5.5, 6.2, 6.4
    • [ ]* 7.6 Write property test for health monitoring

      • Property 12: Health Monitoring
      • Validates: Requirements 5.5
    • [ ]* 7.7 Write property test for comprehensive error reporting

      • Property 13: Comprehensive Error Reporting
      • Validates: Requirements 2.3, 4.5, 5.3, 6.2, 6.4
  • 8. Add validation and admission control

    • 8.1 Implement workflow definition validation

      • Validate required fields and workflow structure
      • Add n8n-specific node type and parameter validation
      • Validate credential references
      • Requirements: 6.1, 6.2, 6.3, 6.4
    • [ ]* 8.2 Write unit tests for validation logic

      • Test validation of various workflow configurations
      • Test error cases and edge conditions
      • Requirements: 6.1, 6.2, 6.3, 6.4
  • 8.5. Implement N8n API authentication

    • 8.5.1 Add API key generation to N8n controller

      • Generate random 32-character alphanumeric API key when creating n8n instances
      • Store API key in Kubernetes secret <n8n-name>-api-credentials
      • Mount secret as file in n8n container at /run/secrets/n8n/apiKey
      • Configure n8n deployment with N8N_API_KEY_FILE environment variable (or use a wrapper script if n8n does not support it)
      • Set secret file permissions to 0400 (read-only for owner)
      • Update N8n status with API credentials secret name
      • Requirements: 2.2, 6.3
    • [ ]* 8.5.2 Write unit tests for API key generation

      • Test API key generation and secret creation
      • Test n8n deployment configuration with file-based secret mounting
      • Test secret file permissions
      • Requirements: 2.2, 6.3
    • 8.5.3 Update N8nWorkflow controller to use API authentication

      • Read API credentials secret name from N8n resource status
      • Retrieve API key from secret
      • Pass API key to N8n client constructor
      • Handle missing/invalid API credentials gracefully
      • Requirements: 2.2, 6.3
    • [ ]* 8.5.4 Write unit tests for N8nWorkflow authentication

      • Test API key retrieval from secrets
      • Test error handling for missing/invalid credentials
      • Requirements: 2.2, 6.3
  • 9. Integration and deployment preparation

    • 9.1 Update operator deployment manifests

      • Add N8nWorkflow CRD to operator deployment
      • Update RBAC permissions for new resources
      • Update operator configuration and documentation
      • Requirements: All requirements
    • 9.2 Create example N8nWorkflow resources

      • Create sample workflow definitions for testing
      • Include examples with different credential types
      • Document usage patterns and best practices
      • Requirements: All requirements
    • 9.3 Create and apply N8nWorkflow in k8s-lab n8n project

      • Create N8nWorkflow resource in k8s-lab/supporting-applications/n8n/
      • Reference the existing n8n instance (n8n in n8n namespace)
      • Create necessary secrets for workflow credentials
      • Add workflow to kustomization or install manifest
      • Apply to cluster and verify workflow syncs successfully to n8n instance
      • Requirements: 1.2, 2.1, 2.2, 3.1, 3.2, 4.1
    • [ ]* 9.4 Write integration tests

      • Test end-to-end workflow creation and management
      • Test integration with existing n8n operator functionality
      • Test ArgoCD compatibility and GitOps workflows
      • Requirements: All requirements
  • 10. Final checkpoint - Ensure all tests pass

    • Ensure all tests pass; ask the user if questions arise.

Notes

  • Tasks marked with * are optional and can be skipped for a faster MVP

  • Each task references specific requirements for traceability

  • Checkpoints ensure incremental validation

  • Property tests validate universal correctness properties

  • Unit tests validate specific examples and edge cases

  • Integration with existing operator codebase is prioritized throughout

  • 11. Fix workflow creation to include nodes and connections

    • 11.1 Debug and fix workflow API payload
      • Investigate why workflow nodes and connections aren’t being sent to n8n API
      • Verify workflow data structure matches n8n API expectations
      • Ensure nodes, connections, and settings are properly serialized in CreateWorkflow call
      • Test that created workflows in n8n contain all nodes from the CRD spec
      • FIXED: Three bugs identified and resolved:
        1. active field is read-only in n8n API - removed from create/update requests
        2. tags field is read-only in n8n API - removed from create/update requests
        3. Type mismatches fixed: typeVersion must be float64 (not string), position must be []int (not []string)
      • Created integration tests to verify fixes work correctly
      • Requirements: 1.2, 1.3
    • 11.2 Write unit tests for workflow payload serialization
      • Test that workflow definition is correctly serialized for n8n API
      • Test that nodes array is included in API request
      • Test that connections map is included in API request
      • COMPLETED: Integration tests created in n8n-operator/test/integration/:
        • workflow_creation_test.go - tests API client directly
        • controller_conversion_test.go - tests controller conversion from YAML
      • Requirements: 1.2, 1.3
  • 12. Fix N8n controller to update existing deployments

    • 12.1 Update reconcileResource to handle resource updates
      • Modify reconcileResource function to compare existing resources with desired state
      • Update resources when configuration changes (e.g., new volumes, environment variables)
      • Ensure updates don’t disrupt running instances unnecessarily
      • Add proper diff logic to only update when needed
      • Requirements: 2.2, 6.3
    • 12.2 Add deployment update logic for API credentials
      • Detect when deployment is missing API credentials volume
      • Update deployment to add volume and volume mount
      • Update deployment to add N8N_API_KEY_FILE environment variable
      • Handle existing n8n instances gracefully
      • Requirements: 2.2, 6.3
    • [ ]* 12.3 Write unit tests for deployment update logic
      • Test deployment update detection
      • Test volume and environment variable addition
      • Test that unnecessary updates are avoided
      • Requirements: 2.2, 6.3
  • 13. Implement workflow cleanup tool with pagination support

    • 13.1 Create Python script for paginated workflow listing
      • Create scripts/workflow_cleanup.py in k8s-lab/components/n8n/
      • Implement N8nWorkflowManager class with pagination support
      • Handle n8n API pagination (cursor-based or limit/offset)
      • Report progress per page: “Fetching page X, found Y workflows so far…”
      • Continue fetching until all workflows are retrieved
      • Requirements: 8.1, 8.2, 8.5
    • [ ]* 13.2 Write property test for paginated workflow retrieval
      • Property 14: Paginated Workflow Retrieval
      • Validates: Requirements 8.1, 8.2
    • 13.3 Add filtering and deletion logic to Python script
      • Implement exact name matching filter (not pattern matching)
      • Add support for active/inactive status filtering
      • Add support for empty/blank workflow detection
      • Implement bulk deletion with per-workflow progress reporting
      • Report summary statistics: total, deleted, failed counts
      • Requirements: 8.3, 8.5, 8.6, 8.7
    • [ ]* 13.4 Write property test for filtered workflow deletion
      • Property 15: Filtered Workflow Deletion
      • Validates: Requirements 8.3, 8.7
    • 13.5 Add dry-run mode and safety features to Python script
      • Implement dry-run mode to preview deletions without executing
      • Add error handling with continuation (don’t stop on single failure)
      • Report errors with workflow IDs and continue processing
      • Add summary statistics at end of operation
      • Requirements: 8.4, 8.5, 8.6
    • [ ]* 13.6 Write property test for cleanup error resilience
      • Property 16: Cleanup Error Resilience
      • Validates: Requirements 8.6
    • 13.7 Update Taskfile to call Python script
      • Update workflows:cleanup task to call Python script with NAME parameter
      • Update workflows:cleanup:blank task to call Python script with --blank-only flag
      • Add DRY_RUN parameter support to all cleanup tasks
      • Update workflows:list task to use Python script for pagination
      • Document usage patterns and examples in task descriptions
      • Requirements: 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7
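As a rough illustration of the pagination loop described in 13.1, the Python sketch below assumes the n8n list endpoint returns a JSON body with a data array and a nextCursor field. That response shape, the injected fetch_page callable, and the names iter_workflows and fake_api are assumptions made for illustration, not part of the existing script. The HTTP call is passed in as a function so the loop can be exercised without a live n8n instance.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical page fetcher: receives a cursor (None for the first page) and
# returns a dict shaped like {"data": [...], "nextCursor": "<token>" or None}.
# The response shape is an assumption and should be checked against the n8n API.
FetchPage = Callable[[Optional[str]], dict]


def iter_workflows(fetch_page: FetchPage, report=print) -> Iterator[dict]:
    """Yield every workflow, following cursors until the API stops returning one."""
    cursor = None
    page = 0
    found = 0
    while True:
        page += 1
        # Progress is reported before each request, as described in task 13.1.
        report(f"Fetching page {page}, found {found} workflows so far...")
        body = fetch_page(cursor)
        items = body.get("data", [])
        found += len(items)
        yield from items
        cursor = body.get("nextCursor")
        if not cursor:
            break


def fake_api(pages: List[List[dict]]) -> FetchPage:
    """In-memory stand-in for the API; the cursor is simply the next page index."""
    def fetch(cursor: Optional[str]) -> dict:
        if not pages:
            return {"data": [], "nextCursor": None}
        index = int(cursor) if cursor else 0
        next_cursor = str(index + 1) if index + 1 < len(pages) else None
        return {"data": pages[index], "nextCursor": next_cursor}
    return fetch


if __name__ == "__main__":
    demo_pages = [
        [{"id": "1", "name": "a"}, {"id": "2", "name": "b"}],
        [{"id": "3", "name": "c"}],
    ]
    workflows = list(iter_workflows(fake_api(demo_pages)))
    print(len(workflows), "workflows retrieved")
```

In a real script, fetch_page would wrap the authenticated HTTP request. Keeping it injectable also makes the paginated retrieval property in 13.2 straightforward to test.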
  • 14. Clean up orphaned credentials in n8n instance

    • 14.1 Add credential listing task command
      • Create task command to list all credentials in n8n
      • Display credential ID, name, and type
      • Handle empty credential lists gracefully
      • Note: n8n API does not support listing credentials - must use UI
      • Requirements: 3.1, 3.4
    • 14.2 Add credential cleanup utility or task command
      • Create task command to clean up orphaned credentials
      • Require user confirmation before deletion
      • Delete all credentials that don’t correspond to N8nWorkflow CRs
      • Note: n8n API does not support listing credentials - cleanup must be done manually via UI
      • Requirements: 3.1, 3.4
  • 15. Final checkpoint - Ensure cleanup tool works correctly

    • Test pagination with large workflow counts (hundreds of pages)
    • Test exact name matching (not pattern matching)
    • Test dry-run mode prevents actual deletions
    • Test error handling continues processing after failures
    • Verify progress reporting shows per-page and summary statistics
    • Ensure all tests pass; ask the user if questions arise.
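The behaviours this checkpoint verifies (exact name matching, dry-run, continuing after failures, summary statistics) could look roughly like the Python sketch below. The function name cleanup_workflows, the injected delete callable, and the dict-based summary are hypothetical choices for illustration; the actual script may structure this differently.

```python
from typing import Callable, Iterable


def cleanup_workflows(
    workflows: Iterable[dict],
    name: str,
    delete: Callable[[str], None],
    dry_run: bool = True,
    report=print,
) -> dict:
    """Delete workflows whose name equals `name` exactly and return summary counts."""
    summary = {"total": 0, "deleted": 0, "failed": 0, "errors": []}
    for wf in workflows:
        # Exact comparison only: "demo" must not match "demo-copy".
        if wf.get("name") != name:
            continue
        summary["total"] += 1
        if dry_run:
            # Dry-run previews the deletion and never calls delete().
            report(f"[dry-run] would delete workflow {wf['id']} ({name})")
            continue
        try:
            delete(wf["id"])
            summary["deleted"] += 1
            report(f"Deleted workflow {wf['id']}")
        except Exception as exc:
            # Record the failure with the workflow ID and keep going.
            summary["failed"] += 1
            summary["errors"].append((wf["id"], str(exc)))
            report(f"Failed to delete workflow {wf['id']}: {exc}")
    report(
        f"Summary: total={summary['total']} "
        f"deleted={summary['deleted']} failed={summary['failed']}"
    )
    return summary


if __name__ == "__main__":
    sample = [
        {"id": "1", "name": "demo"},
        {"id": "2", "name": "demo-copy"},
        {"id": "3", "name": "demo"},
    ]
    # Default is dry-run, so nothing is deleted here.
    cleanup_workflows(sample, "demo", delete=lambda workflow_id: None)
```

Defaulting dry_run to True is a deliberate safety choice in this sketch: a caller has to opt in to destructive behaviour explicitly.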
  • 16. Implement administrative user provisioning in N8n controller

    • 16.1 Add database connection logic to N8n controller

      • Add PostgreSQL client library dependency (database/sql with pgx driver)
      • Implement database connection helper that retrieves credentials from secrets
      • Add connection pooling and timeout configuration
      • Test database connectivity during reconciliation
      • Requirements: 10.1, 10.2
    • 16.2 Implement admin user creation logic

      • Generate secure admin credentials (email: operator@n8n.local, random password)
      • Implement bcrypt password hashing using golang.org/x/crypto/bcrypt
      • Execute SQL INSERT with ON CONFLICT to create/update admin user
      • Store admin credentials in n8n-api-credentials secret (adminEmail, adminPassword keys)
      • Handle database errors and connection failures gracefully
      • Requirements: 10.1, 10.2, 10.3
    • 16.3 Implement admin user verification

      • Wait for n8n instance and database to be ready before provisioning
      • Verify admin user can authenticate via n8n API after creation
      • Test basic API operations (list workflows) to confirm permissions
      • Update N8n resource status with provisioning state (AdminUserProvisioned field)
      • Only mark instance as “Ready” after successful verification
      • Requirements: 10.4, 10.5, 10.6
    • [ ]* 16.4 Write unit tests for admin user provisioning

      • Test credential generation and secret creation
      • Test database connection and SQL execution (with mock database)
      • Test bcrypt password hashing
      • Test verification logic and status updates
      • Test error handling for database and API failures
      • Requirements: 10.1, 10.2, 10.3, 10.4, 10.5, 10.6
    • 16.5 Add cleanup logic for admin user on instance deletion

      • Implement finalizer logic to delete admin user from database
      • Execute SQL DELETE for admin user before removing N8n resource
      • Ensure n8n-api-credentials secret is deleted with instance
      • Handle cleanup errors gracefully (log and continue)
      • Requirements: 10.7
    • [ ]* 16.6 Write integration tests for admin user lifecycle

      • Test end-to-end admin user creation during instance setup
      • Test that N8nWorkflow controller can use admin credentials
      • Test cleanup on instance deletion
      • Test recovery from provisioning failures
      • Requirements: 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7
  • 17. Implement emergency data reset task

    • 17.1 Create nuclear cleanup task in n8n Taskfile

      • Display comprehensive warning about data loss
      • Require explicit confirmation phrase “DELETE EVERYTHING”
      • List all resources that will be deleted (workflows, credentials, data, namespace, PVCs)
      • Requirements: 9.1, 9.2, 9.7
    • 17.2 Implement namespace deletion logic

      • Delete n8n namespace with kubectl delete namespace
      • Wait for graceful termination with timeout
      • Handle stuck resources with force deletion if needed
      • Report progress through each step
      • Requirements: 9.3
    • 17.3 Implement PVC cleanup logic

      • Identify all PVCs associated with n8n namespace
      • Force delete stuck PVCs that remain after namespace deletion
      • Use grace period 0 and finalizer removal if necessary
      • Verify all persistent storage is cleaned up
      • Requirements: 9.4
    • 17.4 Implement environment restoration

      • Recreate n8n namespace after cleanup
      • Apply necessary labels for ESO secret distribution
      • Provide clear next steps for redeploying n8n
      • Include commands for verifying that the operator provisions the admin user
      • Requirements: 9.5, 9.6
    • [ ]* 17.5 Test nuclear cleanup task in safe environment

      • Test in isolated namespace (not production)
      • Verify all data is deleted
      • Verify namespace is recreated correctly
      • Verify redeployment works after cleanup
      • Test that operator automatically provisions admin user after redeploy
      • Requirements: 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7
  • 18. Final checkpoint - Ensure all new features work correctly

    • Verify admin user is automatically provisioned when creating n8n instances
    • Verify N8nWorkflow controller can authenticate using admin credentials
    • Verify nuclear cleanup task works and provides proper warnings
    • Verify environment can be restored after nuclear cleanup
    • Ensure all tests pass; ask the user if questions arise.