Integrating Open OnDemand with Waldur and SLURM
Components used
Open OnDemand is open source software that gives students, researchers, and industry professionals remote web access to supercomputers.
Waldur is an open source platform for HPC and cloud self-service.
SLURM is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
Keycloak is an open source product that provides single sign-on with identity and access management for modern applications and services.
MyAccessID (optional) is an Identity and Access Management service provided by GEANT that offers a common Identity Layer for Infrastructure Service Domains (ISDs).
Integration overview
flowchart TD
User[👤 Local or federated user] --> Keycloak[🔐 Keycloak<br/>Identity Server]
Keycloak --> Waldur[🏛️ Waldur<br/>Management Platform]
Keycloak --> OOD[💻 Open OnDemand<br/>Web Interface]
LDAP[📋 Existing LDAP<br/>User Directory] --> LDAPService[🔧 LDAP Microservice]
LDAPService --> |SSSD-LDAP connect| SLURM[⚡ SLURM Cluster]
LDAPService --> |SSSD-LDAP connect| OOD
Waldur --> |Pulls users| LDAPService
Agent[🤖 Waldur Site Agent<br/>Resource Manager] --> |Pushes usages| Waldur
Agent --> |Pulls accounts| Waldur
Agent --> |Creates accounts| SLURM
Agent --> |Connects via SSH and launches jobs| SLURM
OOD --> |SSSD-LDAP connect| LDAPService
classDef user fill:#e1f5fe,stroke:#0277bd,stroke-width:2px
classDef waldur fill:#c8e6c9,stroke:#388e3c,stroke-width:2px
classDef external fill:#fff3e0,stroke:#f57c00,stroke-width:2px
classDef infrastructure fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
class User user
class Waldur,LDAPService,Agent waldur
class LDAP,SLURM external
class Keycloak,OOD infrastructure
OOD requirements
- Shared user directory storage available both on SLURM and OOD VMs
- Dedicated hostname for OOD machine (like ood.example.com)
- Open TCP/80 and TCP/443, and the ability to connect to LDAP on the SLURM management machine
- Linux server with at least 4 GB of RAM and 10 GB of disk storage
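The network requirements above can be checked before installation with a small probe. The following is a minimal sketch, not part of the OOD tooling: the host name slurm-mgmt.example.com and the LDAP port 389 are assumptions made for illustration and should be replaced with the values of your site.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_endpoints(endpoints: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Probe each named (host, port) pair and report whether it is reachable."""
    return {
        name: port_is_open(host, port)
        for name, (host, port) in endpoints.items()
    }


# Hypothetical endpoints: the LDAP host and port are assumptions, not from this guide.
EXAMPLE_ENDPOINTS = {
    "OOD HTTP": ("ood.example.com", 80),
    "OOD HTTPS": ("ood.example.com", 443),
    "LDAP on SLURM management machine": ("slurm-mgmt.example.com", 389),
}
```

Calling check_endpoints(EXAMPLE_ENDPOINTS) from the machine that has to reach each service returns a mapping of endpoint names to True or False.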
Open OnDemand (OOD) installation and configuration
Follow the guide at https://github.com/OSC/ood-ansible to automatically install OOD on the Linux server.
Preparation guide:
- Set up a certificate for later use in the Ansible configuration.
- In Keycloak, create a client with authentication enabled for Open OnDemand.
- Populate the Ansible inventory.yaml.
- Apply the playbook.
User authentication flow
- OOD gets the Linux username from the preferred_username claim issued by Keycloak
- OOD launches a "Per User Nginx" (PUN) environment after a successful user login
- OOD connects to the SLURM cluster as the user named in preferred_username
flowchart LR
User[👤 User] --> |Authenticates| OOD[💻 OOD starts per-user<br/>environment on the VM]
OOD --> |Logs in as the specific user| SLURM[⚡ SLURM login node]
classDef user fill:#e1f5fe,stroke:#0277bd,stroke-width:2px
classDef system fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef cluster fill:#fff3e0,stroke:#f57c00,stroke-width:2px
class User user
class OOD system
class SLURM cluster
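The first step of this flow depends on the preferred_username claim being present in the token that Keycloak issues. When a login fails, it can help to inspect a token by hand. The sketch below decodes the payload of a JWT without verifying its signature, so it is suitable for debugging only. It does not reproduce how OOD itself reads the claim, and the helper names are hypothetical.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # base64url encoding in JWTs drops the padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def linux_username(claims: dict) -> str:
    """Return the preferred_username claim, which OOD uses as the Linux username."""
    username = claims.get("preferred_username", "")
    if not username:
        raise ValueError("token has no preferred_username claim")
    return username
```

If linux_username raises an error for a token obtained from the OOD client, the username mapping in Keycloak is the first thing to check.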
Keycloak configuration
Keycloak acts as a central Identity server for Waldur and Open OnDemand.
Steps to configure Keycloak:
- Configure Waldur and Open OnDemand clients
- Configure identity federation or user self-registration. If identity federation is used, a common task is to configure username mapping as described in https://puhuri.neic.no/idp_integration/use-cases/keycloak-integration/
- Install waldur-username-mapper for matching Keycloak or federated users with their respective Linux usernames: https://docs.waldur.com/integrations/waldur-keycloak-mapper/
Open OnDemand cluster configuration
To configure the connection between Open OnDemand and SLURM, manually create the cluster config /etc/ood/config/clusters.d/my_cluster.yml.
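As an illustration of what such a cluster config can contain, the sketch below renders a minimal file from a template. The keys follow the Open OnDemand v2 cluster config layout for the SLURM adapter as commonly documented. The title, login host, and SLURM paths are placeholders chosen for this example and are not values from this guide.

```python
# Minimal cluster config template; all values are filled in by the caller.
CLUSTER_TEMPLATE = """\
---
v2:
  metadata:
    title: "{title}"
  login:
    host: "{login_host}"
  job:
    adapter: "slurm"
    bin: "{slurm_bin}"
    conf: "{slurm_conf}"
"""


def render_cluster_config(
    title: str,
    login_host: str,
    slurm_bin: str = "/usr/bin",                # assumed location of sbatch, squeue, ...
    slurm_conf: str = "/etc/slurm/slurm.conf",  # assumed path to slurm.conf
) -> str:
    """Return the text of a minimal OOD cluster config for a SLURM cluster."""
    return CLUSTER_TEMPLATE.format(
        title=title,
        login_host=login_host,
        slurm_bin=slurm_bin,
        slurm_conf=slurm_conf,
    )


def write_cluster_config(path: str, content: str) -> None:
    """Write the rendered config, e.g. to /etc/ood/config/clusters.d/my_cluster.yml."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
```

For example, render_cluster_config("My Cluster", "login.example.com") produces text that can be reviewed and then written with write_cluster_config.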
Waldur Site Agent configuration
Waldur Site Agent is a microservice that pulls allocations from Waldur and pushes allocation usage statistics back to Waldur.
The microservice supports 2 modes of operation:
- Docker Compose - testing only, requires SLURM to run in the same Docker Compose setup
- Native - production
Follow the Waldur Site Agent documentation for installation instructions. Make sure to enable the enable_user_homedir_account_creation flag: Open OnDemand does not work unless the user's home directory exists.
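Because a missing home directory stops Open OnDemand from working, it is worth verifying on the OOD VM that a given account resolves and that its home directory is present. The following is a minimal sketch using the Python standard library. The lookup goes through the system's name service, so it also covers users served through SSSD and LDAP. The function name and status strings are hypothetical.

```python
import os
import pwd


def home_directory_status(username: str) -> str:
    """Classify an account as 'unknown-user', 'home-missing', or 'ok'."""
    try:
        # Resolved through NSS, so SSSD/LDAP users are found as well as local ones.
        entry = pwd.getpwnam(username)
    except KeyError:
        return "unknown-user"
    if not os.path.isdir(entry.pw_dir):
        return "home-missing"
    return "ok"
```

A result of "unknown-user" points to an SSSD or LDAP problem, while "home-missing" suggests that home directory creation is not enabled or that the shared storage is not mounted.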
Host-based SSH authentication configuration
One way to allow OOD to connect to the SLURM login node is to set up host-based trust (SSH host-based authentication) between the OOD VM and the SLURM login node.
Use https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Host-based_Authentication as a guide.
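Once the trust is configured, a non-interactive test login from the OOD VM shows whether host-based authentication is actually being used. The sketch below builds an ssh command that permits only the hostbased method and disables password prompts. The user and host names are hypothetical, and the check has to be run as the user being tested.

```python
import subprocess


def hostbased_ssh_command(user: str, host: str) -> list[str]:
    """Build an ssh command that only succeeds via host-based authentication."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                       # never prompt for a password
        "-o", "PreferredAuthentications=hostbased",  # try no other method
        "-o", "HostbasedAuthentication=yes",
        "-o", "ConnectTimeout=5",
        f"{user}@{host}",
        "true",                                      # remote no-op command
    ]


def hostbased_login_works(user: str, host: str) -> bool:
    """Return True if the test login exits with status 0."""
    try:
        result = subprocess.run(
            hostbased_ssh_command(user, host),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The ssh client is not installed on this machine.
        return False
    return result.returncode == 0
```

For example, hostbased_login_works("alice", "slurm-login.example.com") returns False if the login node falls back to asking for a password or key.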
Troubleshooting
- OOD login and preferred_username fetching errors are logged in /var/log/httpd/error.log or a similar file
- Per-user application or SLURM configuration errors are logged in /var/log/ondemand-nginx/USERNAME/
- By default, OOD does not tolerate arbitrary URL prefixes well; prefer https://ood.example.com over https://ood.example.com/
- The most common SSSD / LDAP configuration errors include:
  - Wrong LDAP filter
  - SSSD is not able to reach the LDAP server (network error)
  - SSSD installed without the sssd-ldap plugin
- Home directory of the user has not been created
- Make sure to specify the correct SLURM account in the OOD job configuration.
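To find out which SLURM accounts a user may submit under, the accounting database can be queried with sacctmgr on a node where the SLURM client tools are installed. The sketch below is one possible way to do this and assumes that SLURM accounting is enabled. The account names in the usage note are hypothetical.

```python
import subprocess


def parse_accounts(sacctmgr_output: str) -> list[str]:
    """Extract unique account names, in order, from one-column sacctmgr output."""
    accounts: list[str] = []
    for line in sacctmgr_output.splitlines():
        name = line.strip()
        if name and name not in accounts:
            accounts.append(name)
    return accounts


def slurm_accounts_for(user: str) -> list[str]:
    """List the SLURM accounts associated with the given user."""
    # -n suppresses the header; -P prints parsable output without a trailing delimiter.
    cmd = [
        "sacctmgr", "-n", "-P",
        "show", "associations",
        f"user={user}",
        "format=account",
    ]
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_accounts(completed.stdout)
```

If slurm_accounts_for("alice") returns, say, ["proj_a", "proj_b"], one of those names is what the account field of the OOD job form should contain.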