Waldur Components Architecture

Overview

Waldur is a cloud marketplace platform deployed on Kubernetes. This document describes the main components launched by the Waldur Helm chart, their roles, and how they interact with each other.

High-Level Architecture

graph TB
    subgraph External["External Users"]
        User["Users/Browsers"]
        API["API Clients"]
    end

    subgraph Ingress["Ingress Layer"]
        ING["Ingress Controller"]
    end

    subgraph Frontend["Frontend Layer"]
        HP["Homeport<br/>(React UI)"]
    end

    subgraph Backend["Backend Services"]
        MAPI["Mastermind API<br/>(Django REST)"]
        MW["Mastermind Worker<br/>(Celery Workers)"]
        MB["Mastermind Beat<br/>(Celery Scheduler)"]
    end

    subgraph Optional["Optional Services"]
        ME["Metrics Exporter<br/>(Prometheus)"]
        UVK["UVK Everypay<br/>(Payment Gateway)"]
    end

    subgraph Data["Data Layer"]
        PG["PostgreSQL<br/>(Database)"]
        RMQ["RabbitMQ<br/>(Message Broker)"]
    end

    User --> ING
    API --> ING
    ING --> HP
    ING --> MAPI
    ING --> UVK

    HP --> MAPI
    MAPI --> PG
    MW --> PG
    MB --> PG

    MAPI --> RMQ
    MW --> RMQ
    MB --> RMQ

    ME --> MAPI
    UVK --> MAPI

    style HP fill:#e1f5fe
    style MAPI fill:#c8e6c9
    style MW fill:#c8e6c9
    style MB fill:#c8e6c9
    style PG fill:#fff3e0
    style RMQ fill:#fff3e0
    style ME fill:#f3e5f5
    style UVK fill:#f3e5f5

Core Components

| Deployment | Purpose |
|---|---|
| waldur-homeport | React-based frontend UI for the cloud marketplace |
| waldur-mastermind-api | Django REST API backend handling all API requests, authentication, and resource orchestration |
| waldur-mastermind-worker | Celery workers processing background tasks, provisioning, and long-running operations |
| waldur-mastermind-beat | Celery scheduler managing periodic tasks, cleanup operations, and recurring jobs |

Optional Components

5. Metrics Exporter

Deployment: waldur-metrics-exporter
Container: Prometheus metrics exporter
Enabled by: waldur.metricsExporter.enabled

  • Responsibilities:
      • Exposes Waldur metrics in Prometheus format
      • Provides monitoring data
      • Integrates with monitoring stack

  • Configuration:
      • Requires API token for authentication
      • Exposes metrics on port 8080

6. UVK Everypay Integration

Deployment: waldur-uvk-everypay
Container: Payment gateway integration
Enabled by: waldur.uvkEverypay.enabled

  • Components:
      • Main container: UVK payment processor
      • Sidecar container: HTTP API bridge

  • Responsibilities:
      • Processes payments through Everypay
      • Integrates with Azure AD
      • Handles payment notifications
      • Sends email notifications for transactions
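
Both optional components are switched on through the values keys named above. A minimal sketch of a values override, assuming the dotted paths map directly onto nested YAML keys as is usual for Helm; the additional settings each component needs (for example the metrics exporter's API token) are not shown because their key names are not given here:

```yaml
# Sketch only: enables both optional deployments.
# Further required settings (API token, Everypay credentials) are omitted.
waldur:
  metricsExporter:
    enabled: true   # deploys waldur-metrics-exporter
  uvkEverypay:
    enabled: true   # deploys waldur-uvk-everypay
```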

Dependencies

PostgreSQL Database

Chart: Bitnami PostgreSQL v16.7.26
Enabled by: postgresql.enabled
Images: Uses bitnamilegacy Docker images for compatibility
Environment: Demo/Development only

⚠️ Production Recommendation: Use CloudNativePG Operator for production deployments

  • Options:
      • Simple PostgreSQL deployment
      • PostgreSQL HA deployment (using postgresqlha.enabled)
      • External database configuration
      • Production: CloudNativePG operator with automated failover

  • Purpose:
      • Primary data storage
      • User accounts and permissions
      • Resource state management
      • Billing and accounting data
      • Audit logs

RabbitMQ Message Broker

Chart: Bitnami RabbitMQ v16.0.13
Enabled by: rabbitmq.enabled
Images: Uses bitnamilegacy Docker images for compatibility
Environment: Demo/Development only

⚠️ Production Recommendation: Use RabbitMQ Cluster Operator for production deployments

  • Purpose:
      • Message queue for Celery
      • Task distribution to workers
      • Asynchronous communication
      • Event-driven architecture support
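
For a demo or development install, the bundled dependencies can be toggled with the three flags mentioned in this section. The sketch below assumes that postgresql.enabled and postgresqlha.enabled are alternatives and should not both be true; that is an assumption, not something stated above. For production, both bundled charts would typically be disabled in favour of operator-managed or external services, whose connection settings are not covered here.

```yaml
# Demo/development sketch: use the bundled Bitnami subcharts.
postgresql:
  enabled: true    # simple single-instance PostgreSQL
postgresqlha:
  enabled: false   # assumed to be mutually exclusive with postgresql.enabled
rabbitmq:
  enabled: true    # bundled RabbitMQ broker for Celery
```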

Scheduled Tasks (CronJobs)

graph LR
    subgraph CronJobs["Scheduled Tasks"]
        BK["Database Backup<br/>(Daily)"]
        BR["Backup Rotation<br/>(Weekly)"]
        CL["Session Cleanup<br/>(Daily)"]
        SM["SAML2 Sync<br/>(Configurable)"]
    end

    subgraph Targets["Target Systems"]
        DB[(PostgreSQL)]
        S3[Object Storage]
        IDP[Identity Provider]
    end

    BK --> DB
    BK --> S3
    BR --> S3
    CL --> DB
    SM --> IDP

    style BK fill:#fce4ec
    style BR fill:#fce4ec
    style CL fill:#fce4ec
    style SM fill:#fce4ec

Database Backup

CronJob: cronjob-waldur-db-backup.yaml
Schedule: Daily (configurable)

  • Creates PostgreSQL dumps

  • Uploads to object storage

  • Configurable retention

Backup Rotation

CronJob: cronjob-waldur-db-backup-rotation.yaml
Schedule: Weekly (configurable)

  • Manages backup retention

  • Removes old backups

  • Maintains backup history

Session Cleanup

CronJob: cronjob-waldur-cleanup.yaml
Schedule: Daily

  • Cleans expired sessions

  • Removes old audit logs

  • Database maintenance tasks

SAML2 Metadata Sync

CronJob: cronjob-waldur-saml2-metadata-sync.yaml
Schedule: Configurable

  • Synchronizes SAML2 metadata

  • Updates identity provider configurations

  • Maintains SSO configurations

Data Flow

sequenceDiagram
    participant U as User
    participant H as Homeport
    participant A as API
    participant W as Worker
    participant Q as RabbitMQ
    participant D as Database
    participant E as External Service

    U->>H: Access UI
    H->>A: API Request
    A->>D: Check Permissions
    D->>A: Return Data
    A->>Q: Queue Task
    Q->>W: Deliver Task
    W->>E: Provision Resource
    E->>W: Return Status
    W->>D: Update Status
    W->>Q: Task Complete
    A->>H: Return Response
    H->>U: Display Result

Service Communication

Internal Services

  • waldur-mastermind-api: ClusterIP service on port 80

  • waldur-homeport: ClusterIP service on port 80

  • waldur-metrics-exporter: ClusterIP service on port 8080

  • waldur-uvk-everypay: ClusterIP service on port 8000

External Access

  • Ingress controller routes traffic to services

  • TLS termination at ingress level

  • Support for multiple hostnames per service

Configuration Management

ConfigMaps

  • api-override-config: Django settings overrides

  • api-celery-config: Celery configuration

  • mastermind-config-features-json: Feature flags

  • mastermind-config-auth-yaml: Authentication settings

  • mastermind-config-permissions-override-yaml: Permission overrides

  • icons-config: Custom icons and branding

Secrets

  • waldur-secret: Database credentials, API tokens

  • waldur-saml2-secret: SAML2 certificates

  • waldur-valimo-secret: Valimo authentication certificates

  • waldur-ssh-key-config: SSH private keys

  • waldur-script-kubeconfig: Kubernetes config for script execution

High Availability Considerations

  1. API Layer:
       • Supports multiple replicas
       • Horizontal Pod Autoscaling available
       • Load balanced through service

  2. Worker Layer:
       • Horizontally scalable
       • Multiple workers can process tasks in parallel
       • HPA support for automatic scaling

  3. Beat Scheduler:
       • Single instance only (by design)
       • Handles scheduling, not processing

  4. Database:
       • PostgreSQL HA option available
       • Supports external managed databases
       • Regular backup strategy

  5. Message Queue:
       • RabbitMQ clustering supported
       • External message broker option

Tuning and Extension Hooks

The chart exposes Bitnami-style extension hooks on the api, worker, beat, and homeport deployments so operators can inject site-specific configuration without forking. All defaults are empty — the rendered output is unchanged when the values are not set.

Per-deployment hooks

Available on each of api, worker, beat, homeport:

| Value | Type | Purpose |
|---|---|---|
| <component>.extraEnvVars | list of EnvVar | Extra container env vars (supports value and valueFrom) |
| <component>.extraEnvVarsCM | string | Single ConfigMap name, rendered as envFrom.configMapRef |
| <component>.extraEnvVarsSecret | string | Single Secret name, rendered as envFrom.secretRef |
| <component>.extraVolumes | list of Volume | Extra volumes appended to the pod spec |
| <component>.extraVolumeMounts | list of VolumeMount | Extra mounts appended to the main container |
| <component>.podAnnotations | object | Merged onto the pod template metadata |
| <component>.podLabels | object | Merged onto the pod template metadata |

For beat, hooks affect only the main beat container — the migration / DB-bootstrap init containers are left untouched.

Example: scrape the API pod with Prometheus and mount a shared scratch PVC.

api:
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
  extraVolumes:
    - name: scratch
      persistentVolumeClaim:
        claimName: waldur-api-scratch
  extraVolumeMounts:
    - name: scratch
      mountPath: /scratch
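
The env-var hooks listed in the table can be used in the same way. The following is a minimal sketch for the worker deployment; the variable names, the proxy URL, and the ConfigMap and Secret names are all hypothetical and would need to exist in the release namespace:

```yaml
worker:
  extraEnvVars:
    # Literal value (illustrative variable and URL)
    - name: HTTPS_PROXY
      value: "http://proxy.example.com:3128"
    # Value read from an existing Secret (hypothetical name and key)
    - name: PROXY_PASSWORD
      valueFrom:
        secretKeyRef:
          name: waldur-proxy-credentials
          key: password
  # Hypothetical ConfigMap; every key becomes an env var via envFrom.configMapRef
  extraEnvVarsCM: waldur-worker-env
  # Hypothetical Secret; every key becomes an env var via envFrom.secretRef
  extraEnvVarsSecret: waldur-worker-secrets
```

Using extraEnvVarsCM or extraEnvVarsSecret keeps a long list of variables out of the values file, while extraEnvVars suits one-off settings.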

Gunicorn process tuning (api only)

The gunicorn: block translates into a GUNICORN_CMD_ARGS env var on the api container. Gunicorn parses this variable at startup as if its contents were command-line arguments, so the settings it carries take precedence over the config file. Any value left empty falls back to the gunicorn defaults baked into the image (/etc/waldur/gunicorn.conf.py).

gunicorn:
  timeout: 120           # --timeout
  gracefulTimeout: 60    # --graceful-timeout
  workers: 6             # --workers
  keepalive: 5           # --keep-alive
  maxRequests: 1000      # --max-requests (recycle worker after N requests)
  maxRequestsJitter: 50  # --max-requests-jitter
  extraArgs: ""          # raw passthrough appended verbatim
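
As a rough illustration, the gunicorn: values above would plausibly end up as a container env entry like the one below. This is a sketch, not the chart's actual template output: the flag names come from the comments in the values block, but the ordering, the spacing, and the way extraArgs is appended are assumptions.

```yaml
# Hypothetical rendered fragment of the api container spec
env:
  - name: GUNICORN_CMD_ARGS
    value: "--timeout 120 --graceful-timeout 60 --workers 6 --keep-alive 5 --max-requests 1000 --max-requests-jitter 50"
```

Inspecting this variable on the running pod is a quick way to confirm which overrides are in effect.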

Celery worker concurrency

celery:
  concurrency: 32        # CELERYD_CONCURRENCY (default 10)

This sets the number of child processes per worker pod. Combine with replicaCount.worker (and HPA, if enabled) to scale total parallelism. Note: increasing concurrency raises per-pod memory; size workerResources accordingly.
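
A sketch of how these settings might be combined. Total parallelism is roughly replicas × concurrency, so the values below give 3 × 16 = 48 task processes. The shape of workerResources is assumed to follow the standard Kubernetes resources block, and the memory figures are placeholders rather than recommendations:

```yaml
replicaCount:
  worker: 3          # three worker pods
celery:
  concurrency: 16    # 16 child processes per pod, 48 in total
workerResources:     # assumed to be a standard Kubernetes resources block
  requests:
    memory: "4Gi"    # illustrative; measure real per-process usage first
  limits:
    memory: "8Gi"
```

Spreading the same total across more pods with lower concurrency keeps per-pod memory smaller and limits the impact of a single pod restart.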

Ingress annotations

ingress.annotations is a free-form map merged onto every ingress this chart renders — api, api-admin, homeport, rmq-ws, and uvk-everypay. It sits alongside the className-specific annotations the chart already templates (nginx, haproxy, traefik, openshift-default) and the cert-manager cluster-issuer annotation; ingress-controller and cert-manager keys should stay in their own values so they keep their conditional logic.

The canonical use case is external-dns, which reads annotations on Ingress objects to manage DNS records:

ingress:
  annotations:
    external-dns.alpha.kubernetes.io/ttl: "60"           # low TTL for fast cut-over during deploys
    external-dns.alpha.kubernetes.io/hostname: "api.example.com"
    external-dns.alpha.kubernetes.io/cloudflare-proxied: "false"

A 60-second TTL is a common operator choice for production rollouts: short enough that resolvers pick up an IP change within a minute, long enough to avoid hammering the upstream DNS provider during steady state. The external-dns default is 300 seconds; set it explicitly per environment.

All values must be strings (Kubernetes annotation requirement) — quote numeric and boolean values. Leaving the map empty (the default) keeps the rendered ingresses byte-identical to the baseline chart.