CSCS HPC Storage Backend
A Waldur Site Agent backend plugin for managing CSCS HPC Storage systems. This backend provides a REST API proxy to access storage resource information from Waldur.
Overview
The CSCS HPC Storage backend provides a REST API proxy that serves storage resource information from Waldur Mastermind. The proxy translates Waldur resource data into CSCS-specific JSON format for consumption by external web servers and storage management systems.
Features
- REST API Proxy: Provides HTTP API access to storage resource information from Waldur
- Multi-offering support: Aggregates resources from multiple storage system offerings (capstor, vast, iopsstor)
- Hierarchical storage structure: Maps Waldur offering customer → resource customer → resource project to storage tenant → customer → project
- Configurable quotas: Automatic inode quota calculation based on storage size
- External HPC User API integration: Fetches Unix GID values for storage accounts with configurable SOCKS proxy support
- GID caching: Project GID values are cached in memory until server restart to reduce external API calls
- Mock data support: Development/testing mode with generated target item data
- Flexible configuration: Customizable file system types and quota coefficients
- API Filtering: Supports filtering by storage system, data type, status, and pagination
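As a rough illustration of the "Configurable quotas" feature, the sketch below derives soft and hard inode quotas from a storage size. The setting names (`inode_base_multiplier`, `inode_soft_coefficient`, `inode_hard_coefficient`), their default values, and the TB unit are assumptions made for this example, not values confirmed by this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InodeQuotaConfig:
    # Hypothetical settings; real names and defaults live in the backend configuration.
    inode_base_multiplier: int = 1_000_000  # assumed inodes per TB of storage
    inode_soft_coefficient: float = 1.5
    inode_hard_coefficient: float = 2.0


def inode_quotas(storage_tb, config=InodeQuotaConfig()):
    """Derive soft and hard inode quotas from a storage size (assumed to be in TB)."""
    base = storage_tb * config.inode_base_multiplier
    return {
        "soft": int(base * config.inode_soft_coefficient),
        "hard": int(base * config.inode_hard_coefficient),
    }
```

Keeping the coefficients in one immutable config object means a deployment could tune them per file system type without touching the calculation.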
Configuration
Backend Settings
Backend Components
Storage Systems Configuration
The storage proxy supports multiple storage systems through offering slug mapping:
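A minimal sketch of how a storage-system-to-offering-slug mapping might be resolved. The dictionary contents and the function name `resolve_offering_slugs` are hypothetical; only the three storage system names come from this document.

```python
# Hypothetical mapping: storage system name -> Waldur offering slug.
STORAGE_SYSTEMS = {
    "capstor": "capstor",
    "vast": "vast",
    "iopsstor": "iopsstor",
}


def resolve_offering_slugs(storage_system=None):
    """Return the offering slugs to query for an optional storage_system filter."""
    if storage_system is None:
        # No filter: query every configured storage system.
        return list(STORAGE_SYSTEMS.values())
    if storage_system not in STORAGE_SYSTEMS:
        allowed = ", ".join(STORAGE_SYSTEMS)
        raise ValueError(f"Unknown storage_system {storage_system!r}; allowed: {allowed}")
    return [STORAGE_SYSTEMS[storage_system]]
```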
HPC User API Configuration
The backend can integrate with an external HPC User API to retrieve Unix GID values for storage accounts. This configuration is optional - if not provided, mock GID values will be used.
HPC User API Features:
- OAuth2 authentication: Automatic token acquisition and refresh
- GID caching: Project GID values are cached in memory until server restart
- SOCKS proxy support: Configurable SOCKS proxy for accessing APIs behind firewalls
- Fallback to mock data: If API is unavailable, uses generated mock GID values
- Cache statistics: Available via the `get_gid_cache_stats()` method for monitoring
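The caching and fallback behaviour listed above can be sketched as follows. This is an illustrative model only: the class name `GidCache`, the shape of the statistics dictionary, and the mock GID formula are assumptions, and whether the real backend caches mock values is not stated here.

```python
import zlib


class GidCache:
    """In-memory project GID cache with a mock fallback (illustrative only)."""

    def __init__(self, fetch_gid):
        self._fetch_gid = fetch_gid  # callable: project slug -> int, may raise
        self._cache = {}
        self._hits = 0
        self._misses = 0

    def get(self, project_slug):
        if project_slug in self._cache:
            self._hits += 1
            return self._cache[project_slug]
        self._misses += 1
        try:
            gid = self._fetch_gid(project_slug)
        except Exception:
            # API unavailable: return a deterministic mock GID without caching it,
            # so a later call can still pick up the real value.
            return 30000 + zlib.crc32(project_slug.encode()) % 10000
        self._cache[project_slug] = gid
        return gid

    def stats(self):
        # Hypothetical statistics shape, for monitoring.
        return {"size": len(self._cache), "hits": self._hits, "misses": self._misses}
```

Because the cache has no expiry, entries live for the lifetime of the process, which matches the "until server restart" behaviour described above.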
SOCKS Proxy Support:
- Use the `socks_proxy` field to configure SOCKS proxy access to the HPC User API
- Format: `socks5://host:port` or `socks4://host:port`
- Useful when the HPC User API is behind a firewall or requires proxy access
- Optional field: if not specified, a direct connection is used
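A small sketch of validating the `socks_proxy` value against the format above, using only the standard library. The helper name `parse_socks_proxy` is hypothetical and is not part of the backend.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"socks4", "socks5"}


def parse_socks_proxy(value):
    """Validate a socks_proxy setting; return (scheme, host, port) or None if unset."""
    if not value:
        return None  # optional field: fall back to a direct connection
    parts = urlsplit(value)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname or parts.port is None:
        raise ValueError(
            f"Expected socks5://host:port or socks4://host:port, got {value!r}"
        )
    return parts.scheme, parts.hostname, parts.port
```

A value such as `localhost:12345` is rejected because it lacks the scheme, which is the same mistake called out in the troubleshooting notes.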
Architecture
The diagram below (Mermaid source) shows how a request travels from client applications through the storage proxy and the plugin to Waldur, and how GID lookups reach the HPC User API:
graph TD
subgraph "Storage Proxy API"
SP[Storage Proxy Server<br/>FastAPI Application]
API[REST API Endpoints<br/>/api/storage-resources/]
AUTH[Authentication<br/>Keycloak/OIDC]
end
subgraph "CSCS HPC Storage Plugin"
BACKEND[CSCS Backend<br/>Data Processing]
TRANSFORM[Data Transformation<br/>Waldur → CSCS Format]
CACHE[GID Cache<br/>In-Memory Storage]
HPCCLIENT[HPC User Client<br/>OAuth2 + SOCKS Proxy]
end
subgraph "Waldur Integration"
WM[Waldur Mastermind<br/>API Client]
RESOURCES[Multi-Offering<br/>Resource Fetching]
end
subgraph "External Systems"
CLIENT[Client Applications<br/>Web UI, Scripts]
SMS[Storage Management<br/>System]
HPCAPI[HPC User API<br/>Unix GID Service]
PROXY[SOCKS Proxy<br/>localhost:12345]
end
%% API Flow
CLIENT --> AUTH
AUTH --> API
API --> SP
SP --> BACKEND
BACKEND --> TRANSFORM
TRANSFORM --> RESOURCES
RESOURCES --> WM
%% HPC API Flow
BACKEND --> CACHE
CACHE --> HPCCLIENT
HPCCLIENT --> PROXY
PROXY --> HPCAPI
%% Response Flow
WM --> RESOURCES
RESOURCES --> TRANSFORM
HPCAPI --> PROXY
PROXY --> HPCCLIENT
HPCCLIENT --> CACHE
CACHE --> BACKEND
TRANSFORM --> BACKEND
BACKEND --> SP
SP --> API
API --> CLIENT
%% External Integration
CLIENT --> SMS
%% Styling
classDef proxy stroke:#00bcd4,stroke-width:2px,color:#00acc1
classDef plugin stroke:#ff9800,stroke-width:2px,color:#f57c00
classDef waldur stroke:#9c27b0,stroke-width:2px,color:#7b1fa2
classDef external stroke:#4caf50,stroke-width:2px,color:#388e3c
classDef cache stroke:#e91e63,stroke-width:2px,color:#c2185b
class SP,API,AUTH proxy
class BACKEND,TRANSFORM,HPCCLIENT plugin
class WM,RESOURCES waldur
class CLIENT,SMS,HPCAPI,PROXY external
class CACHE cache
API Usage
Start the storage proxy server:
Query storage resources:
Data Mapping
Waldur to Storage Hierarchy
The three-tier hierarchy maps specific Waldur resource attributes to storage organization levels:
Tenant Level Mapping
Target Type: tenant
Waldur Source Attributes:
- `resource.provider_slug`
- `resource.provider_name`
- `resource.offering_uuid`
Generated Fields:
- `itemId`: `str(resource.offering_uuid)`
- `key`: `resource.provider_slug`
- `name`: `resource.provider_name`
- `parentItemId`: `null`
Customer Level Mapping
Target Type: customer
Waldur Source Attributes:
- `resource.customer_slug`
- `customer_info.name` (from API)
- `customer_info.uuid` (from API)
Generated Fields:
- `itemId`: deterministic UUID from customer data
- `key`: `resource.customer_slug`
- `name`: `customer_info.name`
- `parentItemId`: tenant `itemId`
Project Level Mapping
Target Type: project
Waldur Source Attributes:
- `resource.project_slug`
- `resource.project_name`
- `resource.uuid`
- `resource.limits`
Generated Fields:
- `itemId`: `str(resource.uuid)`
- `key`: `resource.project_slug`
- `name`: `resource.project_name`
- `parentItemId`: customer `itemId`
- `quotas`: from `resource.limits`
Key Mapping Details
- Tenant level: Uses the offering owner information (`provider_slug`, `provider_name`)
- Customer level: Uses the resource customer information (`customer_slug`) with details fetched from Waldur API
- Project level: Uses the resource project information (`project_slug`, `project_name`) with resource-specific data
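The three mappings can be sketched together as one function. The field names follow the lists above; the use of `uuid.uuid5` for the deterministic customer `itemId`, the function name `build_hierarchy`, and the sample slugs are assumptions, since the backend's actual derivation is not described here.

```python
import uuid
from types import SimpleNamespace


def build_hierarchy(resource, customer_info):
    """Return tenant, customer and project items linked through parentItemId."""
    tenant = {
        "itemId": str(resource.offering_uuid),
        "key": resource.provider_slug,
        "name": resource.provider_name,
        "parentItemId": None,
    }
    # Assumption: uuid5 over tenant id and customer UUID gives a stable identifier.
    customer_id = uuid.uuid5(uuid.NAMESPACE_OID, f"{tenant['itemId']}:{customer_info.uuid}")
    customer = {
        "itemId": str(customer_id),
        "key": resource.customer_slug,
        "name": customer_info.name,
        "parentItemId": tenant["itemId"],
    }
    project = {
        "itemId": str(resource.uuid),
        "key": resource.project_slug,
        "name": resource.project_name,
        "parentItemId": customer["itemId"],
        "quotas": resource.limits,
    }
    return tenant, customer, project


# Hypothetical sample data standing in for a Waldur resource and customer record.
resource = SimpleNamespace(
    offering_uuid=uuid.UUID(int=1), provider_slug="cscs", provider_name="CSCS",
    customer_slug="ethz", project_slug="msclim", project_name="MSCLIM",
    uuid=uuid.UUID(int=2), limits={"storage": 10},
)
customer_info = SimpleNamespace(name="ETH Zurich", uuid=uuid.UUID(int=3))
tenant, customer, project = build_hierarchy(resource, customer_info)
```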
Mount Point Generation
The storage proxy generates hierarchical mount points for three levels of storage organization:
Hierarchical Structure
Mount points are generated at three levels:
- Tenant Level: `/{storage_system}/{data_type}/{tenant}`
- Customer Level: `/{storage_system}/{data_type}/{tenant}/{customer}`
- Project Level: `/{storage_system}/{data_type}/{tenant}/{customer}/{project}`
Examples
Tenant Mount Point:
Customer Mount Point:
Project Mount Point:
Path Components
Each component is derived from Waldur resource data:
- `storage_system`: From offering slug (`waldur_resource.offering_slug`)
- `data_type`: Storage data type (e.g., `store`, `users`, `scratch`, `archive`)
- `tenant`: Offering customer slug (`waldur_resource.provider_slug`)
- `customer`: Resource customer slug (`waldur_resource.customer_slug`)
- `project`: Resource project slug (`waldur_resource.project_slug`)
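A minimal sketch of assembling the three mount points from these components. The function name `mount_points` and the slug values in the usage note are hypothetical.

```python
def mount_points(storage_system, data_type, tenant, customer, project):
    """Build tenant, customer and project mount points from path components."""
    tenant_path = f"/{storage_system}/{data_type}/{tenant}"
    customer_path = f"{tenant_path}/{customer}"
    project_path = f"{customer_path}/{project}"
    return {"tenant": tenant_path, "customer": customer_path, "project": project_path}
```

With made-up slugs, `mount_points("capstor", "store", "cscs", "ethz", "msclim")` yields `/capstor/store/cscs`, `/capstor/store/cscs/ethz`, and `/capstor/store/cscs/ethz/msclim`.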
Hierarchical Relationships
The three-tier hierarchy provides parent-child relationships:
- Tenant entries have `parentItemId: null` (top-level)
- Customer entries reference their parent tenant via `parentItemId`
- Project entries reference their parent customer via `parentItemId`
Resource Attributes
The backend extracts the following attributes from waldur_resource.attributes.additional_properties:
| Attribute | Type | Required | Default | Description |
|---|---|---|---|---|
| `permissions` | string | No | `"775"` | Octal permissions for storage access (e.g., `"2770"`, `"755"`) |
| `storage_data_type` | string | No | `"store"` | Storage data type classification. Determines target type mapping |
Storage System Source:
- The `storageSystem` value comes from the `offering_slug` field, not from resource attributes
- Each offering represents a different storage system (e.g., offering with slug "capstor" = capstor storage system)
Validation Rules:
- All attributes must be strings if provided (non-string values raise `TypeError`)
- Unknown `storage_data_type` values fall back to the `"project"` target type with a warning
- Empty or missing attributes use their respective default values
Storage Data Type Mapping:
The storage_data_type attribute determines the target structure in the generated JSON:
- Project targets: `"store"`, `"archive"` → target type `"project"`
  - Fields: `status`, `name`, `unixGid`, `active`
- User targets: `"users"`, `"scratch"` → target type `"user"`
  - Fields: `status`, `email`, `unixUid`, `primaryProject`, `active`
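The defaults, validation rules, and type mapping above can be sketched as two small helpers. The function names `read_attributes` and `target_type` are hypothetical; the defaults and mapping values are taken from this section.

```python
import logging

logger = logging.getLogger(__name__)

DEFAULTS = {"permissions": "775", "storage_data_type": "store"}
TARGET_TYPES = {"store": "project", "archive": "project", "users": "user", "scratch": "user"}


def read_attributes(additional_properties):
    """Apply defaults and string type checks to the supported attributes."""
    result = {}
    for name, default in DEFAULTS.items():
        value = additional_properties.get(name)
        if value is None or value == "":
            value = default  # empty or missing: use the default
        elif not isinstance(value, str):
            raise TypeError(f"{name} must be a string, got {type(value).__name__}")
        result[name] = value
    return result


def target_type(storage_data_type):
    """Map a storage data type to a target type, falling back to 'project'."""
    if storage_data_type not in TARGET_TYPES:
        logger.warning("Unknown storage_data_type %r, using 'project'", storage_data_type)
        return "project"
    return TARGET_TYPES[storage_data_type]
```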
API Filtering
The storage proxy API supports filtering capabilities to query specific storage resources:
API Endpoint
/api/storage-resources/ (filters are passed as query parameters)
Filter Parameters
| Parameter | Type | Required | Description | Allowed Values |
|---|---|---|---|---|
| `storage_system` | enum | No | Filter by storage system | `capstor`, `vast`, `iopsstor` |
| `data_type` | string | No | Filter by data type | `users`, `scratch`, `store`, `archive` |
| `status` | string | No | Filter by status | `pending`, `removing`, `active`, `error` |
| `state` | ResourceState | No | Filter by Waldur resource state | `Creating`, `OK`, `Erred` |
| `page` | integer | No | Page number (≥1) | `1`, `2`, `3` |
| `page_size` | integer | No | Items per page (1-500) | `50`, `100`, `200` |
| `debug` | boolean | No | Return raw Waldur data for debugging | `true`, `false` |
Example API Calls
Get all storage resources:
Filter by storage system:
Filter by storage system and data type:
Filter by storage system, data type, and status:
Paginated results with filters:
Debug mode for troubleshooting:
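As a sketch of how the calls described above could be composed, the helper below builds request URLs from the documented filter parameters. The base URL (host and port) and the helper name `storage_resources_url` are assumptions; the endpoint path is the one shown in the architecture diagram.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # hypothetical host and port
ENDPOINT = "/api/storage-resources/"


def storage_resources_url(**filters):
    """Build a request URL, skipping filters that were not provided."""
    params = {}
    for name, value in filters.items():
        if value is None:
            continue
        # Booleans are sent as lowercase true/false, matching the parameter table.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    query = urlencode(params)
    return BASE_URL + ENDPOINT + (f"?{query}" if query else "")
```

For example, `storage_resources_url(storage_system="capstor", data_type="store", page=2, page_size=50)` produces a paginated, filtered query, and `storage_resources_url(debug=True)` requests the raw Waldur data.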
Filter Behavior
- Optional filtering: All filters are optional and applied only when provided
- Value validation: `storage_system` only accepts: `capstor`, `vast`, `iopsstor`
- Default behavior: Without filters, returns resources from all configured storage systems
- Exact matching: All filters use exact string matching (case-sensitive)
- Combine filters: Multiple filters are combined with AND logic
- Empty results: Non-matching filters return empty result arrays
- Post-serialization filtering: Filters are applied after JSON transformation to ensure consistent behavior across single and multi-offering queries
Filter Implementation Details
The filtering system processes resources in the following sequence:
- Resource fetching: Resources are retrieved from Waldur API using offering slugs
- JSON serialization: Raw Waldur resources are transformed to CSCS JSON format
- Filter application: Filters (`data_type`, `status`) are applied to serialized JSON objects
- Pagination: Results are paginated based on filtered resource count
This approach ensures that filters work consistently whether querying a single storage system or multiple storage systems simultaneously.
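The last two steps of that sequence can be sketched as follows. The flat `data_type` and `status` keys on each serialized item, the response shape, and the function name are assumptions for illustration; the real CSCS JSON layout may nest or name these fields differently.

```python
def filter_and_paginate(items, data_type=None, status=None, page=1, page_size=100):
    """Apply exact-match filters to serialized items, then slice out one page."""
    # Hypothetical item keys; only filters that were provided are applied.
    wanted = {"data_type": data_type, "status": status}
    filtered = [
        item
        for item in items
        if all(value is None or item.get(key) == value for key, value in wanted.items())
    ]
    start = (page - 1) * page_size
    # The count reflects the filtered set, so page totals stay accurate.
    return {"count": len(filtered), "results": filtered[start:start + page_size]}
```

Filtering before slicing is what keeps the reported count and the page boundaries consistent; slicing first would return short or empty pages whenever a filter removed items.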
Error Responses
Invalid storage_system value:
Empty storage_system parameter:
Debug Mode
When debug=true is specified, the API returns raw Waldur data without translation to the CSCS
storage JSON format. This is useful for troubleshooting and understanding the source data.
Debug Response Format:
Debug Mode Features:
- Separate configurations: Shows both agent's offering config and live Waldur offering details
- Agent offering config: Configuration from the agent's YAML file (excludes `secret_options`)
- Waldur offering details: Complete live offering data from Waldur API with all available attributes
- Complete attribute exposure: All `ProviderOfferingDetails` attributes are included dynamically
- Raw resource data: Unprocessed Waldur resource data with all fields
- Filter transparency: Shows which filters were applied to the results
- Security: Only `secret_options` is explicitly excluded for security
- Smart serialization: Automatically handles UUIDs, dates, and complex nested objects
- Error handling: Shows errors if offering lookup fails, continues with other attributes
- Useful for debugging: Compare agent config vs Waldur state, see all available offering data
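A minimal sketch of the exclusion and serialization behaviour listed above, using only the standard library. The function name `debug_dump` and the string fallback for unknown types are assumptions; the backend's own serializer may differ.

```python
import json
import uuid
from datetime import date, datetime


def _default(value):
    """Convert values that json cannot encode natively."""
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    return str(value)  # last resort for other non-JSON types


def debug_dump(offering_config):
    """Serialize an offering config for debug output, dropping secret_options."""
    safe = {k: v for k, v in offering_config.items() if k != "secret_options"}
    return json.dumps(safe, default=_default, sort_keys=True)
```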
Recent Improvements
Storage Hierarchy Mapping Update
The storage hierarchy mapping has been updated to better align with multi-tenant storage architectures:
- Tenant level: Now uses `provider_slug` (the customer who owns the offering)
- Customer level: Now uses `customer_slug` (the customer using the resource)
- Project level: Now uses `project_slug` (the project containing the resource)
- Rationale: This mapping provides clearer organizational boundaries in multi-tenant environments
Multi-Offering Storage System Support
The storage proxy now supports aggregating resources from multiple storage system offerings:
- Configurable storage systems: Map storage system names to Waldur offering slugs
- Unified API responses: Single endpoint returns resources from all configured storage systems
- Consistent filtering: Filters work across all storage systems or can target specific ones
- Resource aggregation: Resources from multiple offerings are combined and properly paginated
HPC User API Integration
Integration with external HPC User API for Unix GID management:
- OAuth2 authentication: Automatic token acquisition and management
- SOCKS proxy support: Access APIs behind firewalls via configurable SOCKS proxy
- GID caching: Project GID values cached in memory until server restart
- Graceful fallbacks: Mock GID values used when API is unavailable
Data Type Filtering Fix
Resolved data_type filtering issues that affected multi-storage-system queries:
- Root cause: Filtering was applied before JSON serialization in multi-offering queries
- Solution: Unified filtering approach applied after JSON serialization across all query types
- Behavior: Consistent filtering whether querying single or multiple storage systems
- Impact: The `data_type` parameter now works correctly in all scenarios
Troubleshooting
Common Issues
Data type filtering not working:
- Ensure you're using lowercase values: `data_type=archive` not `data_type=Archive`
- Check that the storage system has resources with the specified data type
- Use `debug=true` to inspect raw data and verify data type values
SOCKS proxy connection issues:
- Verify the proxy is running: `netstat -an | grep 12345`
- Check proxy format: use `socks5://localhost:12345` not just `localhost:12345`
- Ensure the httpx[socks] dependency is installed: `uv add "httpx[socks]"`
GID cache not working:
- Cache statistics are available via the backend's `get_gid_cache_stats()` method
- Cache persists until server restart (no TTL-based expiration)
- Mock values are used if HPC User API is unavailable
Empty filter results:
- Verify filter values match exactly (case-sensitive)
- Use `debug=true` to see available values in raw data
- Check that storage system configuration matches offering slugs
Performance Considerations
- GID caching: Reduces external API calls by caching project GIDs until server restart
- Multi-offering efficiency: Single API call to Waldur with comma-separated offering slugs
- Pagination: Applied after filtering to ensure accurate page counts
- SOCKS proxy overhead: Minimal latency impact for accessing external APIs
Related Plugins
Compute & HPC Plugins
- SLURM Plugin - SLURM cluster management
- MOAB Plugin - MOAB cluster management
- MUP Plugin - MUP portal integration
Container & Cloud Plugins
- OpenShift/OKD Plugin - OpenShift and OKD container platform management
- Harbor Plugin - Harbor container registry management
Storage Plugins
- Croit S3 Plugin - Croit S3 storage management
Accounting Plugins
- CSCS DWDI Plugin - CSCS DWDI accounting integration
Utility Plugins
- Basic Username Management Plugin - Username generation and management