Plugin Development Guide

This guide covers everything needed to build a custom backend plugin for Waldur Site Agent. It is written for both human developers and LLM-based code generators.

Waldur Mastermind concepts

Before implementing a plugin, understand how Waldur Mastermind concepts map to plugin operations.

Waldur concept	Description	Plugin relevance
Offering	Service catalog entry	Config block per offering; picks backend plugin
Resource	Allocation from an offering	CRUD via `BaseBackend`; keyed by `backend_id`
Order	Create/update/terminate request	Triggers `order_process` mode
Component	Measurable dimension (CPU, RAM)	Defined in `backend_components` config
OfferingUser	User linked to an offering	Username backend generates usernames
billing_type	`usage` or `limit`	Metered vs quota accounting
backend_id	Resource ID on the backend	Generated by `_get_resource_backend_id`

Architecture overview

A resource-management plugin consists of two main classes:

Backend (inherits BaseBackend): Orchestrates high-level operations (create resource, collect usage, manage users).
Client (inherits BaseClient): Handles low-level communication with the external system (CLI commands, API calls).

A separate plugin family covers username management (AbstractUsernameManagementBackend) — see the dedicated section below. A single distribution may register both, but the entry point groups are distinct.

graph TB
    WM[Waldur Mastermind<br/>REST API] <-->|Orders, Resources,<br/>Usage, Keys| SA[Site Agent Core<br/>Processor]
    SA -->|"user_context<br/>(ssh_keys, plan_quotas)"| BE[YourBackend<br/>BaseBackend]
    BE --> CL[YourClient<br/>BaseClient]
    CL --> EXT[External System<br/>CLI / API]
    BE -.->|backend_metadata| SA

    classDef waldur fill:#1E3A8A,stroke:#3B82F6,stroke-width:2px,color:#FFFFFF
    classDef core fill:#065F46,stroke:#10B981,stroke-width:2px,color:#FFFFFF
    classDef plugin fill:#581C87,stroke:#8B5CF6,stroke-width:2px,color:#FFFFFF
    classDef external fill:#92400E,stroke:#F59E0B,stroke-width:2px,color:#FFFFFF

    class WM waldur
    class SA core
    class BE,CL plugin
    class EXT external

BaseBackend method reference

Abstract methods (must implement)

`ping(raise_exception: bool = False) -> bool`

Mode: All (health check)
Purpose: Verify backend connectivity.
No-op: Return False.

`diagnostics() -> bool`

Mode: Diagnostics CLI
Purpose: Log diagnostic info and return health status.
No-op: Log a message, return True.

`list_components() -> list[str]`

Mode: Diagnostics
Purpose: Return component types available on the backend.
No-op: Return [].

`_get_usage_report(resource_backend_ids: list[str]) -> dict`

Mode: report, membership_sync
Purpose: Collect usage data for resources.
Return format:

{
    "resource_backend_id_1": {
        "TOTAL_ACCOUNT_USAGE": {"cpu": 1000, "mem": 2048},
        "user1": {"cpu": 500, "mem": 1024},
        "user2": {"cpu": 500, "mem": 1024},
    }
}

Key rules:
Component keys must match backend_components config keys.
Values must be in Waldur units (after unit_factor conversion).
TOTAL_ACCOUNT_USAGE is required and must equal the sum of per-user values.
No-op: Return {}.

`_collect_resource_limits(waldur_resource) -> tuple[dict, dict]`

Mode: order_process (resource creation)
Purpose: Convert Waldur limits to backend limits and back.
Returns: (backend_limits, waldur_limits) where backend_limits has values multiplied by unit_factor.
No-op: Return ({}, {}).

`_pre_create_resource(waldur_resource, user_context=None) -> None`

Mode: order_process (resource creation)
Purpose: Set up prerequisites before resource creation (e.g., parent accounts).
user_context contains pre-resolved data: ssh_keys (UUID → public key), plan_quotas (component → value), team, offering_users.
No-op: Use pass.

`downscale_resource(resource_backend_id: str) -> bool`

Mode: membership_sync
Purpose: Restrict resource capabilities (e.g., set restrictive QoS).
No-op: Return True.

`pause_resource(resource_backend_id: str) -> bool`

Mode: membership_sync
Purpose: Prevent all usage of the resource.
No-op: Return True.

`restore_resource(resource_backend_id: str) -> bool`

Mode: membership_sync
Purpose: Restore resource to normal operation.
No-op: Return True.

`get_resource_metadata(resource_backend_id: str) -> dict`

Mode: membership_sync
Purpose: Return backend-specific metadata for Waldur.
No-op: Return {}.

Hook methods (override as needed)

These have default implementations in BaseBackend. Override only when your backend needs custom behavior.

Method	Default	When to override
`post_create_resource`	No-op	Post-creation setup; set `resource.backend_metadata` to push data to Waldur
`_pre_delete_resource`	No-op	Pre-deletion cleanup (cancel jobs)
`post_delete_resource`	No-op	Post-deletion cleanup (e.g., remove child offerings linked to the resource)
`_pre_delete_user_actions`	No-op	Per-user cleanup before removal
`process_existing_users`	No-op	Process existing users (homedirs)
`check_pending_order`	Returns `True`	Non-blocking order creation (see below)
`evaluate_pending_order`	Returns `ACCEPT`	Custom approval logic for pending orders (see below)
`setup_target_event_subscriptions`	Returns `[]`	STOMP subscriptions to target systems
`get_usage_report_for_period`	Returns `{}`	Historical usage queries for past billing periods
`has_prepaid_components`	Returns `False`	Enable duration-aware limit calculation for prepaid billing
`sync_resource_end_date`	No-op	Synchronise `end_date` between source and target Waldur instances
`sync_resource_effective_id`	No-op	Reflect downstream `backend_id` as `effective_id` on the source resource
`sync_resource_project`	No-op	Push project metadata to backends that manage their own projects
`update_user_attributes`	No-op	Forward `OFFERING_USER` attribute updates to the backend
`sync_offering_user_usernames`	Returns `False`	Pull backend-assigned usernames into Waldur (federation)
`create_user_homedirs`	Provided	Override only to customise homedir quota or path logic

Non-blocking order creation (optional)

Backends that create resources via remote APIs can use non-blocking order creation. Instead of blocking until the remote operation completes, the backend returns immediately with a pending_order_id in BackendResourceInfo.

To opt in, set the supports_async_orders class attribute to True. The processor only inspects order.backend_id for async tracking when this flag is enabled, which prevents conflicts with external systems (e.g. SharePoint) that may set order.backend_id for unrelated purposes.

class MyAsyncBackend(BaseBackend):
    supports_async_orders = True

When enabled, the core processor:

Sets the source order's backend_id to the pending_order_id
Keeps the order in EXECUTING state
On subsequent polling cycles, calls check_pending_order(backend_id) to check completion
When check_pending_order() returns True, marks the source order as DONE

Backends that opt in will typically also override handled_resource_states to include ResourceState.CREATING so that user/limit sync runs while the remote order is still in flight.

`check_pending_order(order_backend_id: str) -> bool`

Default: Returns True (no async orders, always "complete")
Override when: Your backend uses non-blocking resource creation
Returns: True if the remote order completed, False if still pending
Raises: BackendError if the remote order failed or was cancelled

Example (Waldur federation plugin):

def check_pending_order(self, order_backend_id: str) -> bool:
    target_order = self.client.get_order(UUID(order_backend_id))
    if target_order.state == OrderState.DONE:
        return True
    if target_order.state in {OrderState.ERRED, OrderState.CANCELED}:
        raise BackendError(f"Target order failed: {target_order.state}")
    return False  # Still pending

`setup_target_event_subscriptions(source_offering, user_agent, global_proxy) -> list`

Default: Returns [] (no target subscriptions)
Override when: Your backend supports STOMP events from a target system
Returns: List of StompConsumer tuples for lifecycle management
Called by: event_process mode during STOMP setup

Pending order evaluation (optional)

When an order arrives in PENDING_PROVIDER state, the agent calls evaluate_pending_order on the backend before taking any action. The default implementation returns ACCEPT, which preserves the existing auto-approve behaviour. Override this method to implement custom approval logic.

`evaluate_pending_order(order, waldur_rest_client) -> PendingOrderDecision`

Default: Returns PendingOrderDecision.ACCEPT
Override when: You need to inspect or gate orders before approval
Parameters:
order (OrderDetails) — full order data including project_uuid, customer_uuid, created_by_*, attributes, and consumer_message / provider_message fields.
waldur_rest_client (AuthenticatedClient) — authenticated client for fetching additional data from the Waldur API (e.g., project members and roles).
Returns one of:
PendingOrderDecision.ACCEPT — approve the order
PendingOrderDecision.REJECT — reject the order
PendingOrderDecision.PENDING — keep waiting; the order will be re-evaluated on the next polling cycle

Note: This is the only hook that receives waldur_rest_client. Other backend methods receive Waldur data via user_context instead.

Use cases

Scenario	Approach
Wait for a PI	Query project members, return `PENDING` until a PI role exists
Reject unprocessable orders	Inspect `order.attributes`, return `REJECT`
Require a signed agreement	Set `provider_message`, return `PENDING` until `consumer_message` is set

Example: wait for a PI before approving

from waldur_api_client.api.marketplace_provider_resources import (
    marketplace_provider_resources_team_list,
)
from waldur_site_agent.backend.backends import BaseBackend, PendingOrderDecision


class MyBackend(BaseBackend):
    def evaluate_pending_order(self, order, waldur_rest_client):
        team = marketplace_provider_resources_team_list.sync(
            client=waldur_rest_client,
            uuid=order.marketplace_resource_uuid.hex,
        )
        has_pi = any(
            member.role_name == "PI" for member in (team or [])
        )
        if not has_pi:
            return PendingOrderDecision.PENDING
        return PendingOrderDecision.ACCEPT

Example: reject orders that lack a required attribute

from waldur_site_agent.backend.backends import BaseBackend, PendingOrderDecision


class MyBackend(BaseBackend):
    def evaluate_pending_order(self, order, waldur_rest_client):
        attrs = getattr(order, "attributes", None) or {}
        if not attrs.get("project_justification"):
            return PendingOrderDecision.REJECT
        return PendingOrderDecision.ACCEPT

Username management backends

Username management is handled by a separate plugin family inheriting from AbstractUsernameManagementBackend (defined in waldur_site_agent/backend/backends.py). These backends generate or look up local-IDP usernames for OfferingUser records and are wired in the config via username_management_backend.

Required methods

Method	Purpose
`generate_username(offering_user) -> str`	Create a new local username. Return `""` if generation is not supported.
`get_username(offering_user) -> Optional[str]`	Look up the existing local username for the user, or `None`.

get_or_create_username is provided by the base class and calls get_username first, then generate_username only if no username is found.

Optional hook methods

Method	Default	When to override
`sync_user_profiles(offering_users)`	No-op	Push user profiles to the IDP before membership sync runs
`deactivate_users(usernames)`	No-op	Remove departed users from the external system

Error signalling for user-action gates

When username generation requires the user to take an action (e.g. link an existing IdP account, complete a validation form), raise one of these exceptions from waldur_site_agent.backend.exceptions:

OfferingUserAccountLinkingRequiredError(comment, comment_url=None) — user must link an existing account.
OfferingUserAdditionalValidationRequiredError(comment, comment_url=None) — additional validation is required.

The processor moves the offering user into a PENDING_ACCOUNT_LINKING / PENDING_ADDITIONAL_VALIDATION state and surfaces the comment (and URL) to the operator. See docs/offering-users.md for the full state machine.

Entry point group

Register username management plugins under a different entry point group than resource backends:

[project.entry-points."waldur_site_agent.username_management_backends"]
mycustom = "waldur_site_agent_mycustom.username_backend:MyCustomUsernameBackend"

The fallback entry point name is base, provided by the waldur-site-agent-basic-username-management plugin.

BaseClient method reference

All methods below are abstract and must be implemented.

Method	Signature	Purpose
`list_resources`	`() -> list[ClientResource]`	List all resources on backend
`get_resource`	`(resource_id) -> ClientResource or None`	Get single resource or None
`create_resource`	`(name, description, organization, parent_name=None) -> str`	Create resource
`delete_resource`	`(name) -> str`	Delete resource
`set_resource_limits`	`(resource_id, limits_dict) -> str or None`	Set limits (backend units)
`get_resource_limits`	`(resource_id) -> dict[str, int]`	Get limits (backend units)
`get_resource_user_limits`	`(resource_id) -> dict[str, dict[str, int]]`	Per-user limits
`set_resource_user_limits`	`(resource_id, username, limits_dict) -> str`	Set per-user limits
`get_association`	`(user, resource_id) -> Association or None`	Check user-resource link
`create_association`	`(username, resource_id, default_account=None) -> str`	Create user-resource link
`delete_association`	`(username, resource_id) -> str`	Remove user-resource link
`get_usage_report`	`(resource_ids, timezone=None) -> list`	Raw usage data from backend
`list_resource_users`	`(resource_id) -> list[str]`	List usernames for resource

Important: BaseClient also provides:

execute_command(command, silent=False) for running CLI commands with error handling — use it for CLI-based backends.
create_linux_user_homedir(username, umask="") which shells out to /sbin/mkhomedir_helper. Override only if your backend creates home directories some other way; otherwise it works as-is for SLURM-style Linux deployments.

Agent mode method matrix

This table shows which BaseBackend methods are called by each agent mode.

Method	order_process	report	membership_sync	event_process
`ping`	startup	startup	startup	startup
`create_resource` / `create_resource_with_id`	CREATE order	-	-	CREATE event
`_pre_create_resource`	CREATE order	-	-	CREATE event
`post_create_resource`	CREATE order	-	-	CREATE event
`_collect_resource_limits`	CREATE order	-	-	CREATE event
`check_pending_order`	CREATE order (async)	-	-	CREATE event (async)
`evaluate_pending_order`	pending-provider orders	-	-	-
`set_resource_limits`	UPDATE order	-	-	UPDATE event
`delete_resource`	TERMINATE order	-	-	TERMINATE event
`_pre_delete_resource`	TERMINATE order	-	-	TERMINATE event
`pull_resource` / `pull_resources`	CREATE order	usage pull	sync cycle	various events
`_get_usage_report`	-	usage pull	sync cycle	-
`add_users_to_resource`	post-create	-	user sync	role events
`remove_users_from_resource`	-	-	user sync	role events
`add_user` / `remove_user`	-	-	role changes	role events
`downscale_resource`	-	-	status sync	-
`pause_resource`	-	-	status sync	-
`restore_resource`	-	-	status sync	-
`get_resource_metadata`	-	-	status sync	-
`setup_target_event_subscriptions`	-	-	-	STOMP setup
`list_resources`	-	import	-	import event
`get_resource_limits`	-	import	-	import event
`get_resource_user_limits`	-	-	limits sync	-
`set_resource_user_limits`	-	-	limits sync	-
`process_existing_users`	-	-	user sync	-

Usage report format specification

The _get_usage_report method must return data in this exact structure:

{
    "<resource_backend_id>": {
        "TOTAL_ACCOUNT_USAGE": {
            "<component_key>": <int_value>,  # Sum of all per-user values
            ...
        },
        "<username_1>": {
            "<component_key>": <int_value>,
            ...
        },
        "<username_2>": {
            "<component_key>": <int_value>,
            ...
        },
    },
    "<another_resource_backend_id>": { ... },
}

Rules

Component keys must exactly match those in the backend_components YAML config.
Values must be integers in Waldur units (i.e., divide raw backend values by unit_factor).
TOTAL_ACCOUNT_USAGE is a required key and must equal the sum of all per-user values for each component.
If a resource has no usage, return {"TOTAL_ACCOUNT_USAGE": {"cpu": 0, "mem": 0, ...}}.
If usage reporting is not supported, return {} (empty dict).

Example: SLURM CPU and memory

Given config:

backend_components:
  cpu:
    unit_factor: 60000
    measured_unit: "k-Hours"
  mem:
    unit_factor: 61440
    measured_unit: "gb-Hours"

If SLURM reports 120000 cpu-minutes and 122880 MB-minutes for user1:

{
    "hpc_my_allocation": {
        "TOTAL_ACCOUNT_USAGE": {"cpu": 2, "mem": 2},
        "user1": {"cpu": 2, "mem": 2},
    }
}

Calculation: 120000 / 60000 = 2, 122880 / 61440 = 2.

Capability flags and class attributes

BaseBackend exposes three class-level capability flags that change how the core processor treats your backend.

`supports_decreasing_usage: bool = False`

Set to True if usage values can decrease between reports (e.g., a storage backend reporting current disk usage rather than accumulated compute time).

class MyStorageBackend(BaseBackend):
    supports_decreasing_usage = True

When False (default), the reporting processor skips updates where the new usage value is lower than the previously reported value, treating it as a data anomaly.

`supports_cycle_preflight: bool = False`

Set to True for backends that call a remote API during order processing. The order processor runs run_preflight() once per offering per cycle before listing orders. The default implementation calls ping() and raises BackendNotReadyError on failure so orders stay pending instead of ERRED.

Override run_preflight() to probe specific endpoints. Opt in from plugins such as Waldur federation and other HTTP backends can enable it when needed.

`supports_async_orders: bool = False`

Set to True for backends that complete order creation asynchronously on a remote system and report progress via pending_order_id. See the "Non-blocking order creation" section above for the full flow.

class MyAsyncBackend(BaseBackend):
    supports_async_orders = True

`handled_resource_states: list = [ResourceState.OK, ResourceState.ERRED]`

Controls which resource states the membership processor fetches and processes. Override when your backend needs to manage users on resources that are still being provisioned (e.g., async backends that include CREATING).

from waldur_api_client.models.resource_state import ResourceState

class MyAsyncBackend(BaseBackend):
    handled_resource_states = [ResourceState.OK, ResourceState.ERRED, ResourceState.CREATING]

Decision matrix for no-op implementations

If your backend does not support a certain operation, use these return values:

Method	No-op return	Meaning
`ping`	`False`	Backend has no health check
`diagnostics`	`True`	Diagnostics not implemented but OK
`list_components`	`[]`	No component discovery
`_get_usage_report`	`{}`	No usage reporting
`_collect_resource_limits`	`({}, {})`	No limits support
`_pre_create_resource`	`pass`	No pre-creation setup
`downscale_resource`	`True`	No downscaling concept
`pause_resource`	`True`	No pausing concept
`restore_resource`	`True`	No restore concept
`get_resource_metadata`	`{}`	No metadata

Annotated YAML configuration

offerings:
  - name: "My Custom Offering"          # Human-readable name for logging

    # Waldur Mastermind connection
    waldur_api_url: "https://waldur.example.com/api/"
    waldur_api_token: "your-api-token"   # Service provider token
    waldur_offering_uuid: "uuid-here"    # UUID from Waldur offering page

    # Backend selection (entry point names from pyproject.toml)
    order_processing_backend: "mycustom"       # For create/update/terminate
    reporting_backend: "mycustom"              # For usage reporting
    membership_sync_backend: "mycustom"        # For user sync
    username_management_backend: "base"        # Username generation

    # Legacy setting (used if per-mode backends not specified)
    backend_type: "mycustom"

    # Event processing (optional)
    stomp_enabled: false

    # Backend-specific settings (passed to __init__ as backend_settings)
    backend_settings:
      default_account: "root"            # Default parent account
      customer_prefix: "cust_"           # Prefix for customer-level accounts
      project_prefix: "proj_"            # Prefix for project-level accounts
      allocation_prefix: "alloc_"        # Prefix for allocation-level accounts

    # Component definitions (passed to __init__ as backend_components)
    backend_components:
      cpu:
        limit: 100                       # Default limit in Waldur units
        measured_unit: "k-Hours"         # Display unit in Waldur UI
        unit_factor: 60000               # Waldur-to-backend conversion factor
        accounting_type: "usage"         # "usage" = metered, "limit" = quota
        label: "CPU"                     # Display label in Waldur UI
        # Optional Waldur offering component fields:
        # description: "CPU time"        # Component description
        # min_value: 0                   # Minimum allowed value
        # max_value: 10000               # Maximum allowed value
        # max_available_limit: 5000      # Maximum available limit
        # default_limit: 100             # Default limit value
        # limit_period: "month"          # "annual", "month", "quarterly", "total"
        # article_code: "CPU-001"        # Billing article code
        # is_boolean: false              # Boolean (on/off) component
        # is_prepaid: false              # Prepaid billing
      storage:
        limit: 1000
        measured_unit: "GB"
        unit_factor: 1
        accounting_type: "limit"
        label: "Storage"

`unit_factor` explained

The unit_factor converts between Waldur display units and backend-native units:

backend_value = waldur_value * unit_factor
waldur_value = backend_value / unit_factor

Examples:

CPU k-Hours to SLURM cpu-minutes: unit_factor = 60000 (60 min x 1000)
GB-Hours to SLURM MB-minutes: unit_factor = 61440 (60 min x 1024 MB)
GB to GB (no conversion): unit_factor = 1

Entry point registration

Register your plugin in pyproject.toml:

[project]
name = "waldur-site-agent-mycustom"
version = "0.1.0"
dependencies = ["waldur-site-agent>=0.7.0"]

[project.entry-points."waldur_site_agent.backends"]
mycustom = "waldur_site_agent_mycustom.backend:MyCustomBackend"

# Optional: register a username management backend
[project.entry-points."waldur_site_agent.username_management_backends"]
mycustom = "waldur_site_agent_mycustom.username_backend:MyCustomUsernameBackend"

# Optional: component schema validation
[project.entry-points."waldur_site_agent.component_schemas"]
mycustom = "waldur_site_agent_mycustom.schemas:MyCustomComponentSchema"

# Optional: backend settings schema validation
[project.entry-points."waldur_site_agent.backend_settings_schemas"]
mycustom = "waldur_site_agent_mycustom.schemas:MyCustomBackendSettingsSchema"

The entry point name (e.g., mycustom) is what users put in backend_type, order_processing_backend, or username_management_backend in the config YAML. The four entry-point groups are independent — a single distribution may register some or all of them.

Processor-plugin data flow

Plugins generally do not have direct access to the Waldur API client. The core processor pre-resolves any Waldur data the plugin might need and passes it via user_context. Plugins return metadata to Waldur by setting resource.backend_metadata.

Exception: evaluate_pending_order receives waldur_rest_client directly, because the order has not been approved yet and no resource context exists at that point.

Pre-resolved data in `user_context`

The processor enriches the user_context dict before calling backend methods. Plugins read from it without making API calls:

Key	Type	Contents
`team`	`list[dict]`	Team members with usernames
`offering_users`	`list[dict]`	Offering users
`ssh_keys`	`dict[str, str]`	Mapping of SSH key UUID → public key text
`plan_quotas`	`dict[str, int]`	Plan component quotas (component key → value)

Returning metadata via `backend_metadata`

To push metadata back to Waldur (e.g., access credentials, connection endpoints), set resource.backend_metadata in post_create_resource:

def post_create_resource(self, resource, waldur_resource, user_context=None):
    # ... create credentials, gather endpoints ...
    resource.backend_metadata = {
        "username": "admin",
        "password": generated_password,
        "endpoint": "https://service.example.com",
    }
    # The processor pushes this to Waldur automatically

Data flow

sequenceDiagram
    participant P as Processor
    participant B as YourBackend
    participant W as Waldur API
    participant E as External System

    P->>P: Fetch service provider
    P->>B: Set service_provider_uuid

    Note over P,B: Resource creation order arrives

    P->>W: Resolve SSH keys, plan quotas
    W-->>P: Pre-resolved data
    P->>B: _pre_create_resource(resource, user_context)
    B->>B: Read ssh_keys, plan_quotas from user_context
    B->>E: Create resource with resolved data
    E-->>B: Resource created

    P->>B: post_create_resource(resource, waldur_resource, user_context)
    B->>B: Set resource.backend_metadata
    B-->>P: Return
    P->>W: Push backend_metadata to Waldur

Example: resolving an SSH key UUID from `user_context`

When a resource attribute contains a UUID reference (e.g., an SSH key UUID from the order form), look it up in the pre-resolved ssh_keys dict:

from uuid import UUID

class MyBackend(BaseBackend):
    @staticmethod
    def _resolve_ssh_key(key_value: str, ssh_keys: dict[str, str]) -> str:
        """Resolve SSH key from pre-resolved context.

        If key_value is a UUID, look it up. Otherwise treat as raw key text.
        """
        try:
            key_uuid = UUID(key_value.strip())
        except ValueError:
            return key_value  # Raw public key text, use as-is

        return ssh_keys.get(str(key_uuid), "") or ssh_keys.get(key_uuid.hex, "")

    def _pre_create_resource(self, waldur_resource, user_context=None):
        user_context = user_context or {}
        ssh_keys = user_context.get("ssh_keys", {})
        raw_key = waldur_resource.attributes.get("ssh_public_key", "")
        resolved_key = self._resolve_ssh_key(raw_key, ssh_keys)
        # Use resolved_key for resource setup ...

Design principles

Plugins should avoid importing waldur_api_client for runtime API calls. All Waldur data should come via user_context or BaseBackend attributes. The exception is evaluate_pending_order, which receives waldur_rest_client for querying project or order data before approval.
service_provider_uuid is still set on BaseBackend by the processor and can be read by plugins for constructing backend-side identifiers.
Handle missing context gracefully — user_context may be None or missing keys in unit tests. Always default to {} or empty values.

Common pitfalls

1. Unit factor direction

The unit_factor converts from Waldur units to backend units by multiplication. When reporting usage back, you must divide by unit_factor. Getting this backwards causes limits to be set at 1/60000th of the intended value or usage to be reported 60000x too high.

2. Missing TOTAL_ACCOUNT_USAGE

The _get_usage_report return dict must include a "TOTAL_ACCOUNT_USAGE" key for each resource. If missing, the core will substitute zeros, and reported usage will appear as zero in Waldur.

3. Entry point not discovered

Common causes:

Package not installed (uv sync --all-packages)
Entry point group name misspelled. Resource backends use "waldur_site_agent.backends"; username management uses "waldur_site_agent.username_management_backends" (note the plural and the suffix). Validation schemas use "waldur_site_agent.component_schemas" / "waldur_site_agent.backend_settings_schemas".
Entry point value points to wrong class or module

Debug with:

from importlib.metadata import entry_points
print(list(entry_points(group="waldur_site_agent.backends")))
print(list(entry_points(group="waldur_site_agent.username_management_backends")))

4. Forgetting super().init()

Your backend __init__ must call super().__init__(backend_settings, backend_components). This sets up self.backend_settings, self.backend_components, and self.client. Then assign your own client:

def __init__(self, backend_settings, backend_components):
    super().__init__(backend_settings, backend_components)
    self.backend_type = "mycustom"
    self.client = MyCustomClient()

5. Returning wrong types from client methods

get_resource must return None (not raise) when resource is absent.
get_association must return None (not raise) when no association exists.
list_resources must return list[ClientResource], not raw dicts.

6. Component key mismatch

Component keys in _get_usage_report must exactly match the keys in backend_components config. If config has "cpu" but you report "CPU", the usage will be silently ignored.

Testing guidance

What to test per mode

Mode	Test focus
`order_process`	`create_resource`, `delete_resource`, limit conversion
`report`	`_get_usage_report` format, unit conversion math
`membership_sync`	`add_user`, `remove_user`, pause/restore
All	`ping`, error handling, edge cases

Mock patterns

Mock the client to avoid needing a real backend:

from unittest.mock import MagicMock, patch
from waldur_site_agent.backend.structures import ClientResource, Association

def test_create_resource():
    backend = MyCustomBackend(
        backend_settings={"default_account": "root", "allocation_prefix": "test_"},
        backend_components={"cpu": {"unit_factor": 60000, "limit": 10}},
    )
    backend.client = MagicMock()
    backend.client.get_resource.return_value = None  # Resource doesn't exist yet
    backend.client.create_resource.return_value = "created"

    # ... test resource creation

Fixtures

import pytest

@pytest.fixture
def backend_settings():
    return {
        "default_account": "root",
        "customer_prefix": "c_",
        "project_prefix": "p_",
        "allocation_prefix": "a_",
    }

@pytest.fixture
def backend_components():
    return {
        "cpu": {
            "limit": 10,
            "measured_unit": "k-Hours",
            "unit_factor": 60000,
            "accounting_type": "usage",
            "label": "CPU",
        },
    }

@pytest.fixture
def backend(backend_settings, backend_components):
    b = MyCustomBackend(backend_settings, backend_components)
    b.client = MagicMock()
    return b

Key assertions

# Usage report format
report = backend._get_usage_report(["alloc_1"])
assert "TOTAL_ACCOUNT_USAGE" in report["alloc_1"]
assert all(k in report["alloc_1"]["TOTAL_ACCOUNT_USAGE"]
           for k in backend.backend_components)

# Limit conversion
backend_limits, waldur_limits = backend._collect_resource_limits(mock_resource)
assert backend_limits["cpu"] == waldur_limits["cpu"] * 60000

LLM implementation checklist

When implementing a new backend plugin with an LLM, follow these steps in order:

Read existing plugins: Study plugins/slurm/ and plugins/mup/ for patterns.
Copy the template: Start from docs/plugin-template/ and rename.
Implement __init__: Call super().__init__(), set backend_type, create client.
Implement BaseClient methods: Start with get_resource, create_resource, delete_resource, list_resources.
Implement BaseBackend abstract methods: Start with ping, then _pre_create_resource, then _collect_resource_limits, then _get_usage_report.
Handle unit conversion: Verify unit_factor math in both directions.
Write tests: Mock the client, test each abstract method.
Register entry points: Add to pyproject.toml.
Test integration: Install with uv sync --all-packages and run waldur_site_diagnostics.
Verify: Run uv run pytest and uvx prek run --all-files.

Files to study

waldur_site_agent/backend/backends.py — BaseBackend (resource backends) and AbstractUsernameManagementBackend (username plugins), plus PendingOrderDecision enum.
waldur_site_agent/backend/clients.py — Base client class.
waldur_site_agent/backend/structures.py — Data structures (ClientResource, Association, BackendResourceInfo with fields backend_id, parent_id, effective_id, users, usage, limits, pending_order_id, backend_metadata).
waldur_site_agent/backend/exceptions.py — BackendError, DuplicateResourceError, and the OfferingUser*RequiredError exceptions used by username backends.
waldur_site_agent/common/plugin_schemas.py — PluginComponentSchema and PluginBackendSettingsSchema base classes for optional config validation entry points.
plugins/slurm/waldur_site_agent_slurm/backend.py — Reference implementation (CLI-based).
plugins/mup/waldur_site_agent_mup/backend.py — Reference implementation (API-based).
plugins/waldur/waldur_site_agent_waldur/backend.py — Reference for supports_async_orders, handled_resource_states, and the sync_resource_* hooks.
plugins/basic_username_management/ — Minimal reference for AbstractUsernameManagementBackend.

Common mistakes to avoid

Do not forget super().__init__(backend_settings, backend_components).
Do not return raw dicts from list_resources; return ClientResource objects.
Do not raise exceptions from get_resource when resource is absent; return None.
Do not forget the "TOTAL_ACCOUNT_USAGE" key in usage reports.
Do not confuse Waldur units with backend units in _collect_resource_limits.
Do not hardcode component keys; read them from self.backend_components.