SLURM Storage Quotas
The SLURM plugin can apply two independent kinds of filesystem quotas during normal agent operation:
- Per-user home directory quota (homedir_quota): runs in membership_sync mode when a new user is added to a SLURM account and a homedir is created. Sets a user quota on the user's home directory via CephFS xattrs, XFS user quotas, or Lustre user quotas.
- Per-project directory + Lustre group/project quota (project_directory): runs in order_process mode during resource creation (_pre_create_resource). Creates a shared project directory, applies ownership / permissions / ACLs, and optionally sets a Lustre project quota keyed on the LDAP group GID.
The two are independent — you can enable either, both, or neither. Failures in either are logged but do not abort agent operation.
Subsystem A — Per-user home directory quota
This lives in the core module waldur_site_agent/backend/quota.py and is invoked from BaseBackend.create_user_homedirs whenever the agent creates a new user homedir. It is wired into the SLURM backend through the standard enable_user_homedir_account_creation / default_homedir_umask settings.
The SLURM backend triggers create_user_homedirs from three call sites:
- post_create_resource: during order_process, after a resource is created, for the offering users in the user context.
- add_users_to_resource: during membership_sync, for users newly added to a resource.
- process_existing_users: during membership_sync, to ensure homedirs exist for already-known users.
It is also called by the standalone create_homedirs_for_offering_users
utility (waldur_site_agent/common/utils.py).
Homedir quota configuration
The schema is HomedirQuotaConfig in backend/quota.py. It uses
extra="forbid", so unknown keys are rejected. An invalid configuration is
logged and treated as "no quota" — homedir creation still proceeds.
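The "reject unknown keys, then fall back to no quota" behaviour can be illustrated with a stdlib-only sketch. The real schema is HomedirQuotaConfig with extra="forbid"; the key names in ALLOWED_KEYS and the function parse_homedir_quota below are assumptions made for illustration and are not the actual field list.

```python
import logging
from typing import Optional

logger = logging.getLogger("homedir_quota_sketch")

# Hypothetical key set for illustration only; the real fields live in HomedirQuotaConfig.
ALLOWED_KEYS = {"provider", "mount_point", "block_softlimit", "block_hardlimit"}


def parse_homedir_quota(raw: Optional[dict]) -> Optional[dict]:
    """Return the validated config, or None ("no quota") if it is absent or invalid."""
    if not raw:
        return None
    unknown = set(raw) - ALLOWED_KEYS
    if unknown:
        # Mirrors extra="forbid": a single unknown key invalidates the whole block.
        logger.warning("Invalid homedir_quota config, unknown keys: %s", sorted(unknown))
        return None
    return dict(raw)
```

Returning None rather than raising keeps the caller simple: homedir creation proceeds and only the quota step is dropped.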
Provider: ceph_xattr
Sets quotas via extended attributes on the homedir.
For each configured attribute, the agent sets the value and then reads it back. The verify step compares the read-back value against the configured one and logs a warning on mismatch.
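The set-then-verify pattern could look roughly like the sketch below. It assumes the agent shells out to the standard setfattr and getfattr tools and that the attribute is one of the usual CephFS quota attributes such as ceph.quota.max_bytes; the function names are hypothetical and nothing here is taken from quota.py.

```python
from typing import List, Tuple


def ceph_quota_commands(path: str, attribute: str, value: int) -> Tuple[List[str], List[str]]:
    """Build the set and read-back argv lists for one quota attribute."""
    set_cmd = ["setfattr", "-n", attribute, "-v", str(value), path]
    # --only-values prints just the attribute value, which simplifies comparison.
    get_cmd = ["getfattr", "--only-values", "-n", attribute, path]
    return set_cmd, get_cmd


def quota_matches(read_back: str, configured: int) -> bool:
    """Compare the read-back xattr value with the configured one."""
    return read_back.strip() == str(configured)
```

A caller would run both commands and log a warning when quota_matches returns False.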
Provider: xfs
Sets XFS user quotas via xfs_quota. Block limits accept human-readable
suffixes (g, t, …).
The limits are applied with a single xfs_quota command.
If mount_point is missing or no limits are set, the quota step is
skipped with a log message. The homedir itself is still created.
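As a rough illustration of the skip rule, the sketch below builds an xfs_quota "limit -u" invocation and returns None when there is nothing to apply. The function name and the shape of the limits dictionary are assumptions; only the xfs_quota limit syntax (bsoft, bhard, isoft, ihard) is standard.

```python
from typing import Dict, List, Optional


def xfs_limit_command(username: str, mount_point: Optional[str],
                      limits: Dict[str, Optional[str]]) -> Optional[List[str]]:
    """Return an xfs_quota argv list, or None when the quota step should be skipped."""
    # Keep only limits that are actually set, e.g. {"bsoft": "900g", "bhard": "1t"}.
    active = {name: value for name, value in limits.items() if value}
    if not mount_point or not active:
        return None
    spec = " ".join(f"{name}={value}" for name, value in active.items())
    # -x enables expert mode, which the limit subcommand requires.
    return ["xfs_quota", "-x", "-c", f"limit -u {spec} {username}", mount_point]
```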
Provider: lustre
Sets Lustre user quotas via lfs setquota. Block limits are expressed in
kilobytes.
The limits are applied with a single lfs setquota command.
When the quota is applied
BaseBackend.create_user_homedirs iterates over the set of usernames it is
given. For each one:
- client.create_linux_user_homedir(username, umask) is called.
- If homedir_quota is configured, the quota is then applied on the resolved homedir path.
- A failure for one user is logged but does not stop processing of the remaining users.
- If homedir creation itself fails for a user, the quota step is skipped for that user.
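The per-user error isolation described above can be sketched as follows. This is a simplified stand-in, not the body of BaseBackend.create_user_homedirs: create_homedir and apply_quota are hypothetical callables, and the assumption that creation returns the resolved path is made only to keep the example short.

```python
import logging
from typing import Callable, Iterable, List, Optional

logger = logging.getLogger("homedir_sketch")


def create_homedirs(usernames: Iterable[str],
                    create_homedir: Callable[[str], str],
                    apply_quota: Optional[Callable[[str, str], None]] = None) -> List[str]:
    """Create homedirs one user at a time; return the users that fully succeeded."""
    succeeded = []
    for username in usernames:
        try:
            path = create_homedir(username)  # hypothetical: returns the homedir path
        except Exception:
            # Creation failed: skip the quota step for this user and keep going.
            logger.exception("Failed to create homedir for %s", username)
            continue
        if apply_quota is not None:
            try:
                apply_quota(username, path)
            except Exception:
                logger.exception("Failed to apply homedir quota for %s", username)
                continue
        succeeded.append(username)
    return succeeded
```

Wrapping each user in its own try block is what keeps one bad account from blocking the rest of the batch.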
Subsystem B — Project directory + Lustre group/project quota
This subsystem is specific to the SLURM plugin. Schemas: ProjectDirectoryConfig and LustreQuotaConfig in plugins/slurm/waldur_site_agent_slurm/schemas.py. Implementation: _setup_project_directory / _set_lustre_quota in plugins/slurm/waldur_site_agent_slurm/backend.py.
Project directory configuration
What the agent does on resource creation
When project_directory.enabled: true, _pre_create_resource calls _setup_project_directory(resource_backend_id) after the SLURM account hierarchy is created. It creates the shared project directory and then applies ownership, permissions, and ACLs for the group <group_name>.
<group_name> defaults to the resource backend ID. It can be overridden
through two extra keys that are read by the backend but not (yet) part of
the schema fields — they pass through because the schema has
extra="allow":
- group_owner_source: when set to "project_id" (the default), the group name is the SLURM account name. Any other value falls back to group_name.
- group_name: explicit override of the group used in chown/setfacl.
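One reading of the group selection rule, written out as code. The function resolve_group_name is hypothetical, and the fallback to the backend ID when group_name is absent is an assumption, since the text does not say what happens in that case.

```python
from typing import Mapping


def resolve_group_name(project_directory: Mapping[str, object], resource_backend_id: str) -> str:
    """Pick the group used for ownership and ACLs on the project directory."""
    source = project_directory.get("group_owner_source", "project_id")
    if source == "project_id":
        # Default: the group is the SLURM account name (the resource backend ID).
        return resource_backend_id
    # Assumption: with no explicit group_name, fall back to the backend ID.
    override = project_directory.get("group_name")
    return str(override) if override else resource_backend_id
```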
Lustre group/project quota: prerequisites
The Lustre quota step inside _setup_project_directory runs only if all
three of the following are true:
- lustre_quota is configured.
- The SLURM backend has an LDAP client configured (backend_settings.ldap).
- LdapClient.get_group_gid(group_name) returns a non-None GID.
If any of these is missing the directory is still created but the Lustre
quota is silently skipped. No warning is emitted today — if you intend to
use Lustre project quotas, make sure the offering has ldap: configured
and that the project group exists in LDAP before the resource is created.
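The three prerequisites amount to a short guard. The sketch below uses a hypothetical GidLookup protocol in place of the real LdapClient and returns None whenever the quota step would be skipped; it is an illustration of the gating logic, not the code in backend.py.

```python
from typing import Optional, Protocol


class GidLookup(Protocol):
    """Minimal stand-in for the LDAP client; only the GID lookup is needed."""

    def get_group_gid(self, group_name: str) -> Optional[int]: ...


def lustre_project_id(lustre_quota: Optional[dict], ldap_client: Optional[GidLookup],
                      group_name: str) -> Optional[int]:
    """Return the GID to use as Lustre project ID, or None if the quota step is skipped."""
    if lustre_quota is None or ldap_client is None:
        return None
    # A missing group yields None, which also skips the quota step.
    return ldap_client.get_group_gid(group_name)
```

An operator-facing wrapper could log a warning on each None branch, which would make the silent skip visible.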
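If the prerequisites are met, the agent runs lfs project -r -s to tag the directory tree with the project ID and lfs setquota -p to set the limits on mount_point.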
Each -b/-B/-i/-I flag is included only if the corresponding limit is set
in the configuration. mount_point defaults to /valhalla.
Note that this is a project quota (-p), not a group or user quota. The
GID coming from LDAP is reused as the Lustre project ID, and the directory
tree is tagged with that project ID via lfs project -r -s.
Unlike subsystem A, no verification step is run. Failures from
lfs setquota are logged but do not abort resource creation.
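To make the flag handling concrete, here is a sketch that builds the two lfs invocations as argv lists. The function and parameter names are hypothetical, and running the tagging command before the quota command is an assumed order; the lfs flags themselves (-p, -r, -s, -b, -B, -i, -I) are the standard ones.

```python
from typing import List, Optional


def lustre_project_commands(directory: str, gid: int, mount_point: str = "/valhalla",
                            block_soft: Optional[int] = None,
                            block_hard: Optional[int] = None,
                            inode_soft: Optional[int] = None,
                            inode_hard: Optional[int] = None) -> List[List[str]]:
    """Build argv lists for tagging a directory and setting its project quota."""
    # The LDAP GID doubles as the Lustre project ID.
    tag_cmd = ["lfs", "project", "-p", str(gid), "-r", "-s", directory]
    quota_cmd = ["lfs", "setquota", "-p", str(gid)]
    # Include each flag only when the corresponding limit is set (block limits in KB).
    for flag, value in (("-b", block_soft), ("-B", block_hard),
                        ("-i", inode_soft), ("-I", inode_hard)):
        if value is not None:
            quota_cmd += [flag, str(value)]
    quota_cmd.append(mount_point)
    return [tag_cmd, quota_cmd]
```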
Examples
A complete reference example showing both subsystems is in examples/waldur-site-agent-config.yaml.example:
- The Example SLURM Offering shows homedir_quota placement with the ceph_xattr provider (commented out).
- The Discoverer CPU offering shows homedir_base_path, the lustre and xfs homedir_quota provider examples (commented out), and a full uncommented project_directory block with lustre_quota.
Caveats and known inconsistencies
- HomedirQuotaConfig.block_softlimit / block_hardlimit are typed as Optional[str] (to allow XFS suffixes like "900g"), while LustreQuotaConfig.block_softlimit / block_hardlimit are typed as Optional[int]. Use integers (kilobytes) for project_directory.lustre_quota even if you use string-with-suffix for homedir_quota on Lustre.
- group_owner_source and group_name are accepted by the runtime but are not declared fields of ProjectDirectoryConfig (they pass through via extra="allow"). They are documented here for completeness.
- Lustre project quota silently no-ops without LDAP. This is a current implementation choice: the quota requires the GID and the GID lookup goes through the LDAP client.
- No verification is run after lfs setquota -p (the homedir-quota path does verify). Operators should spot-check with lfs quota -p <gid> <mount> after onboarding the first project.
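Because the two schemas expect different value types, operators who think in suffixed sizes may find a small converter handy when filling in project_directory.lustre_quota. The helper below is not part of the agent; it assumes binary multiples (1g = 1024 * 1024 KB), which should be checked against local conventions.

```python
import re

# Assumption: suffixes are binary multiples, expressed here in kilobytes.
_KB_PER_UNIT = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}


def to_kilobytes(limit: str) -> int:
    """Convert a value such as "900g" into an integer number of kilobytes."""
    match = re.fullmatch(r"(\d+)([kmgt])", limit.strip().lower())
    if match is None:
        raise ValueError(f"Unsupported limit format: {limit!r}")
    number, unit = match.groups()
    return int(number) * _KB_PER_UNIT[unit]
```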
Operator troubleshooting
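Verify a user's home quota with the tool that matches the configured provider: getfattr for ceph_xattr, xfs_quota for xfs, or lfs quota -u for lustre.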
Verify a project directory's Lustre quota with lfs quota -p <gid> <mount>, and its project ID with lfs project -d <directory>.
The agent logs each command it issues and each verification result. Search
the structured logs for homedir, Lustre quota, or
Created project directory to trace activity.