Marketplace Software Catalogs
This guide covers the software catalog system in Waldur's marketplace, including support for EESSI (European Environment for Scientific Software Installations), Spack, and other software catalogs.
Overview
The software catalog system allows marketplace offerings to expose large collections of scientific and HPC software packages from external catalogs. Instead of manually tracking individual software installations, offerings can reference comprehensive software catalogs with thousands of packages. Waldur supports multiple catalog sources including:
- EESSI: Binary runtime environment with pre-compiled HPC software
- Spack: Source-based package manager for scientific computing
- Future support: conda-forge, modules, and custom catalogs
Architecture
Unified Catalog Loader Framework
Waldur uses a unified catalog loader framework that provides:
- BaseCatalogLoader: Abstract base class for all catalog loaders
- EESSICatalogLoader: Loader for EESSI catalogs in the new EESSI API format
- SpackCatalogLoader: Loader for Spack catalogs in the repology.json format
- Extensible design: Support for additional catalog types
Data Models
The system uses relational models for efficient storage and querying:
- SoftwareCatalog: Represents a software catalog (e.g., EESSI 2023.06, Spack 2024.12)
- SoftwarePackage: Individual software packages within catalogs
- SoftwareVersion: Specific versions of packages
- SoftwareTarget: Architecture/platform-specific installations or build variants
- OfferingSoftwareCatalog: Links offerings to available catalogs
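The relationships between these models can be sketched with plain dataclasses. This is only an illustration of the hierarchy described above: the field names (name, version, catalog_type, offering_uuid) and the example package are assumptions, not the actual Waldur model definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SoftwareTarget:
    # Architecture/platform-specific installation or build variant.
    name: str  # e.g. "x86_64/amd/zen3"


@dataclass
class SoftwareVersion:
    version: str
    targets: List[SoftwareTarget] = field(default_factory=list)


@dataclass
class SoftwarePackage:
    name: str
    versions: List[SoftwareVersion] = field(default_factory=list)


@dataclass
class SoftwareCatalog:
    name: str
    version: str
    catalog_type: str
    packages: List[SoftwarePackage] = field(default_factory=list)


@dataclass
class OfferingSoftwareCatalog:
    # Links an offering to a catalog; offering_uuid is a hypothetical field name.
    offering_uuid: str
    catalog: SoftwareCatalog


def count_targets(catalog: SoftwareCatalog) -> int:
    """Count all targets across every package version in a catalog."""
    return sum(len(v.targets) for p in catalog.packages for v in p.versions)


# Hypothetical data: one package, one version, two CPU targets.
catalog = SoftwareCatalog(
    name="EESSI",
    version="2023.06",
    catalog_type="binary_runtime",
    packages=[
        SoftwarePackage(
            name="example-package",
            versions=[
                SoftwareVersion(
                    version="1.0",
                    targets=[
                        SoftwareTarget("x86_64/generic"),
                        SoftwareTarget("aarch64/generic"),
                    ],
                )
            ],
        )
    ],
)
link = OfferingSoftwareCatalog(offering_uuid="hypothetical-uuid", catalog=catalog)
print(count_targets(link.catalog))
```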
Catalog Types
- binary_runtime: Pre-compiled software ready to use (EESSI)
- source_package: Source packages requiring compilation (Spack)
- package_manager: Traditional package managers (future: conda, pip)
- environment_module: Module-based software stacks
Loading Software Catalogs
EESSI Catalog Loading
The EESSI loader uses the new EESSI API format, which covers both main software packages and extensions (Python packages, R packages, etc.).
Load EESSI Catalog
Run the load_eessi_catalog management command to load the catalog.
EESSI Command Options
- --catalog-name: Name of the software catalog (default: EESSI)
- --catalog-version: EESSI version (auto-detected from API if not provided)
- --api-url: Base URL for EESSI API (default: https://www.eessi.io/api_data/data/)
- --extensions/--no-extensions: Include/exclude extension packages (default: include)
- --dry-run: Show what would be done without making changes
- --update-existing: Update existing catalog data if it exists
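As a rough sketch, the options above can be assembled into a command line programmatically. The build_command helper is hypothetical, and the waldur executable name is an assumption about how management commands are invoked in a given deployment; only the command name and flags come from this guide.

```python
import shlex


def build_command(command, executable="waldur", **options):
    """Turn keyword options into a management-command argument list.

    True adds a bare flag, False/None omits the option, and any other
    value is passed as "--flag value".
    """
    argv = [executable, command]
    for key, value in options.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            argv.append(flag)
        elif value is False or value is None:
            continue
        else:
            argv.extend([flag, str(value)])
    return argv


# Dry run for a specific EESSI version, without extension packages.
eessi_argv = build_command(
    "load_eessi_catalog",
    catalog_version="2023.06",
    no_extensions=True,
    dry_run=True,
)
print(shlex.join(eessi_argv))
```

The resulting list can be handed to subprocess.run once the dry run output looks right; dropping dry_run applies the changes.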
Spack Catalog Loading
The Spack loader supports the repology.json format from packages.spack.io, providing access to thousands of scientific computing packages.
Load Spack Catalog
Run the load_spack_catalog management command to load the catalog.
Spack Command Options
- --catalog-name: Name of the software catalog (default: Spack)
- --catalog-version: Spack version (auto-detected from data timestamp if not provided)
- --data-url: URL for Spack repology.json data
- --dry-run: Show what would be done without making changes
- --update-existing: Update existing catalog data if it exists
What Gets Created
Both management commands create:
- SoftwareCatalog entry with detected version and metadata
- SoftwarePackage entries for each software package
- SoftwareVersion entries for each package version
- SoftwareTarget entries for architecture/platform combinations or build variants
Management commands vs daily task: Management commands (load_eessi_catalog, load_spack_catalog) will create new catalog records if none exist. The daily automated task (update_software_catalogs) only updates existing catalog records and never creates new ones. This prevents orphaned catalogs from being auto-created when no offering references them.
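The difference between the two entry points can be modeled in a few lines. This is a toy model of the rule stated above, not Waldur code: sync_catalog and the dictionary of catalogs are hypothetical stand-ins for the real loaders and database records.

```python
def sync_catalog(catalogs, name, version, allow_create):
    """Apply an upstream version to an in-memory catalog registry.

    catalogs maps catalog name -> stored version. Management commands
    correspond to allow_create=True, the daily task to allow_create=False.
    """
    if name in catalogs:
        catalogs[name] = version
        return "updated"
    if allow_create:
        catalogs[name] = version
        return "created"
    # The daily task never creates records; it skips unknown catalogs.
    return "skipped"


catalogs = {}
daily_result = sync_catalog(catalogs, "EESSI", "2023.06", allow_create=False)
command_result = sync_catalog(catalogs, "EESSI", "2023.06", allow_create=True)
later_daily_result = sync_catalog(catalogs, "EESSI", "2025.06", allow_create=False)
print(daily_result, command_result, later_daily_result)
```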
Automated Catalog Updates
Waldur provides automated daily updates for software catalogs through Celery tasks.
Configuration Settings
Configure automated updates through constance settings:
EESSI Settings
- SOFTWARE_CATALOG_EESSI_UPDATE_ENABLED: Enable automated EESSI updates (default: false)
- SOFTWARE_CATALOG_EESSI_VERSION: EESSI version to load (auto-detect if empty)
- SOFTWARE_CATALOG_EESSI_API_URL: Base URL for EESSI API data
- SOFTWARE_CATALOG_EESSI_INCLUDE_EXTENSIONS: Include Python/R extensions (default: true)
Spack Settings
- SOFTWARE_CATALOG_SPACK_UPDATE_ENABLED: Enable automated Spack updates (default: false)
- SOFTWARE_CATALOG_SPACK_VERSION: Spack version to load (auto-detect if empty)
- SOFTWARE_CATALOG_SPACK_DATA_URL: URL for Spack repology.json data
General Settings
- SOFTWARE_CATALOG_UPDATE_EXISTING_PACKAGES: Update existing packages during refresh (default: true)
- SOFTWARE_CATALOG_CLEANUP_ENABLED: Enable automatic cleanup of old catalog data (default: false)
- SOFTWARE_CATALOG_RETENTION_DAYS: Number of days to retain old catalog versions (default: 90)
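To make the interplay of these settings concrete, the sketch below mirrors the documented defaults in a plain dictionary and derives two things from it: which catalogs would be refreshed, and the cutoff date before which old catalog versions become eligible for cleanup. The helper names are hypothetical, the empty string for "auto-detect" is an assumed representation, and how Waldur applies the retention period internally is also an assumption.

```python
from datetime import datetime, timedelta

# Defaults as documented above (URL settings omitted).
DEFAULTS = {
    "SOFTWARE_CATALOG_EESSI_UPDATE_ENABLED": False,
    "SOFTWARE_CATALOG_EESSI_VERSION": "",  # empty means auto-detect
    "SOFTWARE_CATALOG_EESSI_INCLUDE_EXTENSIONS": True,
    "SOFTWARE_CATALOG_SPACK_UPDATE_ENABLED": False,
    "SOFTWARE_CATALOG_SPACK_VERSION": "",  # empty means auto-detect
    "SOFTWARE_CATALOG_UPDATE_EXISTING_PACKAGES": True,
    "SOFTWARE_CATALOG_CLEANUP_ENABLED": False,
    "SOFTWARE_CATALOG_RETENTION_DAYS": 90,
}


def enabled_catalogs(settings):
    """Return the catalog names whose automated updates are switched on."""
    names = []
    if settings["SOFTWARE_CATALOG_EESSI_UPDATE_ENABLED"]:
        names.append("EESSI")
    if settings["SOFTWARE_CATALOG_SPACK_UPDATE_ENABLED"]:
        names.append("Spack")
    return names


def retention_cutoff(settings, now):
    """Return the date before which old versions may be cleaned up, or None."""
    if not settings["SOFTWARE_CATALOG_CLEANUP_ENABLED"]:
        return None
    return now - timedelta(days=settings["SOFTWARE_CATALOG_RETENTION_DAYS"])


custom = dict(
    DEFAULTS,
    SOFTWARE_CATALOG_EESSI_UPDATE_ENABLED=True,
    SOFTWARE_CATALOG_CLEANUP_ENABLED=True,
)
print(enabled_catalogs(DEFAULTS), enabled_catalogs(custom))
print(retention_cutoff(custom, datetime(2025, 4, 1)))
```

With the defaults nothing is updated and nothing is cleaned up, which matches the opt-in behaviour described in this section.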
Scheduled Updates
The update_software_catalogs task runs daily at 3 AM and:
- Updates only existing catalogs: The task never creates new catalog records. If no catalog exists in the database for a given name/type, the task skips it with a warning. Create catalogs first via the API or the management commands; use the discover endpoint to see what is available upstream.
- Independent Processing: Each catalog is updated independently, so a failure in one does not affect the others
- Configuration Validation: Validates settings before attempting updates
- Error Isolation: Individual catalog failures are logged but don't prevent other updates
- Comprehensive Logging: Detailed logging for monitoring and troubleshooting
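The behaviour listed above (skip missing catalogs, isolate failures, log everything) follows a common pattern that can be sketched as below. This is not the actual task implementation: update_all, the loader callables, and the logger name are hypothetical.

```python
import logging

logger = logging.getLogger("software_catalogs")


def update_all(existing, loaders):
    """Run each loader independently and report a per-catalog outcome.

    existing is the set of catalog names already in the database;
    loaders maps catalog name -> zero-argument callable doing the update.
    """
    results = {}
    for name, loader in loaders.items():
        if name not in existing:
            # The task never creates catalogs; it only warns and moves on.
            logger.warning("Catalog %s does not exist, skipping", name)
            results[name] = "skipped"
            continue
        try:
            loader()
            results[name] = "updated"
        except Exception:
            # A failure is logged but must not stop the remaining catalogs.
            logger.exception("Update of catalog %s failed", name)
            results[name] = "failed"
    return results


def failing_loader():
    raise RuntimeError("upstream unavailable")


def working_loader():
    return None


results = update_all(
    existing={"EESSI", "Spack"},
    loaders={
        "EESSI": failing_loader,
        "Spack": working_loader,
        "Other": working_loader,
    },
)
print(results)
```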
Note: Both SOFTWARE_CATALOG_EESSI_UPDATE_ENABLED and SOFTWARE_CATALOG_SPACK_UPDATE_ENABLED default to false. Enable them explicitly after creating the initial catalog records.
Manual Trigger
You can also trigger catalog updates manually instead of waiting for the scheduled run.
Associate Catalogs with Offerings
Link the loaded software catalogs to your marketplace offerings through the offering actions described under Linking Catalogs to Offerings.
Understanding Software Catalog Targets
EESSI Architecture Targets
EESSI provides software optimized for different CPU architectures and microarchitectures:
Common CPU Targets
- x86_64/generic: General x86_64 compatibility
- x86_64/intel/haswell: Intel Haswell and newer
- x86_64/intel/skylake_avx512: Intel Skylake with AVX-512
- x86_64/amd/zen2: AMD Zen2 architecture
- x86_64/amd/zen3: AMD Zen3 architecture
- aarch64/generic: General ARM64 compatibility
- aarch64/neoverse_n1: ARM Neoverse N1 cores
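The target names above follow an architecture/[vendor/]microarchitecture pattern. A small parser makes that structure explicit; parse_target and CpuTarget are hypothetical helpers written for this guide, not part of Waldur or EESSI.

```python
from typing import NamedTuple, Optional


class CpuTarget(NamedTuple):
    arch: str
    vendor: Optional[str]
    microarch: str


def parse_target(target):
    """Split a target such as "x86_64/amd/zen3" into its components."""
    parts = target.split("/")
    if len(parts) == 2:
        # No vendor segment, e.g. "x86_64/generic" or "aarch64/neoverse_n1".
        return CpuTarget(parts[0], None, parts[1])
    if len(parts) == 3:
        return CpuTarget(parts[0], parts[1], parts[2])
    raise ValueError("unexpected target format: " + target)


zen3 = parse_target("x86_64/amd/zen3")
neoverse = parse_target("aarch64/neoverse_n1")
print(zen3)
print(neoverse)
```

Splitting targets this way makes it straightforward to group installations by architecture or to fall back to the generic build when a specific microarchitecture is not available.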
EESSI Extension Support
The new EESSI API format includes support for extension packages:
- Python packages: NumPy, SciPy, TensorFlow, PyTorch, etc.
- R packages: Bioconductor, CRAN packages
- Perl modules: CPAN modules
- Ruby gems: Scientific Ruby libraries
- Octave packages: Signal processing, optimization
Extensions are linked to their parent software (e.g., Python packages linked to Python installation).
Spack Build Variants
Spack supports flexible build configurations through targets:
Target Types
- build_variant/default: Standard build configuration
- platform/windows: Windows-compatible packages
- external/system: System-provided packages (detectable)
- build_system/build-tool: Build tools and compilers
Spack Categories
- build-tools: Compilers, build systems, make tools
- detectable: Externally provided packages
- windows: Windows compatibility
- Custom categories based on package metadata
Why Targets Matter
- Performance: Architecture-specific builds can be 20-50% faster
- Compatibility: Ensures software runs on target hardware
- Instruction Sets: Leverages specific CPU features (AVX, NEON, etc.)
- HPC Requirements: Critical for scientific computing workloads
- Build Flexibility: Spack provides multiple build configurations
Available API Endpoints
The software catalog system provides the following API endpoints:
- marketplace-software-catalogs: View and manage software catalogs
- marketplace-software-packages: Browse software packages within catalogs
- marketplace-software-versions: View software versions for packages
- marketplace-software-targets: View architecture-specific installations
Discover Available Catalog Versions
Staff users can check which catalog versions are available upstream, without creating anything, through the discover endpoint. The response describes each catalog with the following fields:
| Field | Type | Description |
|---|---|---|
| name | string | Catalog name (EESSI or Spack) |
| catalog_type | string | Catalog type identifier |
| latest_version | string or null | Detected upstream version, null if detection failed |
| existing | boolean | Whether a catalog record exists in the database |
| existing_version | string or null | Version of the existing catalog record |
| update_available | boolean | True when upstream version differs from existing |
This endpoint makes lightweight HTTP calls to the upstream sources (EESSI API, Spack repology) to detect the latest version. It does not download package data or modify the database. Requires staff permissions.
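A client could interpret these fields roughly as follows. The sample entries are invented for illustration (the versions are not real upstream data), and describe is a hypothetical helper; only the field names and their meaning come from the table above.

```python
def describe(entry):
    """Summarize one discover entry as a short human-readable status."""
    if entry["latest_version"] is None:
        return entry["name"] + ": upstream version detection failed"
    if not entry["existing"]:
        return entry["name"] + ": not loaded, upstream " + entry["latest_version"]
    if entry["update_available"]:
        return (
            entry["name"] + ": update available "
            + entry["existing_version"] + " -> " + entry["latest_version"]
        )
    return entry["name"] + ": up to date at " + entry["existing_version"]


# Hypothetical entries shaped after the documented fields.
sample = [
    {
        "name": "EESSI",
        "catalog_type": "binary_runtime",
        "latest_version": "2025.06",
        "existing": True,
        "existing_version": "2023.06",
        "update_available": True,
    },
    {
        "name": "Spack",
        "catalog_type": "source_package",
        "latest_version": None,
        "existing": False,
        "existing_version": None,
        "update_available": False,
    },
]

statuses = [describe(entry) for entry in sample]
for line in statuses:
    print(line)
```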
Software Catalog Management Actions
Offering-software catalog associations are managed through offering actions:
- add_software_catalog: Associate a catalog with an offering
- update_software_catalog: Update catalog configuration for an offering
- remove_software_catalog: Remove catalog association from offering
These actions are available on the marketplace-provider-offerings endpoint.
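For orientation, the URL of such an action can be composed as in the sketch below. The /api/ prefix, the position of the offering UUID in the path, and the example host are assumptions based on common REST conventions, so check them against the API reference of your deployment; offering_action_url is a hypothetical helper.

```python
from urllib.parse import urljoin

ACTIONS = (
    "add_software_catalog",
    "update_software_catalog",
    "remove_software_catalog",
)


def offering_action_url(base_url, offering_uuid, action):
    """Build the URL of an offering action (path layout is an assumption)."""
    if action not in ACTIONS:
        raise ValueError("unknown action: " + action)
    path = "api/marketplace-provider-offerings/{}/{}/".format(offering_uuid, action)
    return urljoin(base_url.rstrip("/") + "/", path)


url = offering_action_url(
    "https://waldur.example.com",  # placeholder host
    "0123456789abcdef",            # placeholder offering UUID
    "add_software_catalog",
)
print(url)
```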
API Usage
Browse Available Catalogs
List the available catalogs through the marketplace-software-catalogs endpoint.
Browse Software Packages
Browse the packages of a catalog through the marketplace-software-packages endpoint.
Package Detail with Nested Versions and Targets
When viewing package details, the response includes nested versions with their targets and EESSI-specific metadata.
Version Response Fields (EESSI)
| Field | Type | Description |
|---|---|---|
| module | object | Structured module information with full_module_name, module_name, module_version |
| required_modules | array | List of required module objects with structured info |
| extensions | array | Bundled extensions (e.g., Python packages) with type, name, version |
| toolchain | object | Toolchain info with name and version |
| toolchain_families_compatibility | array | List of compatible toolchain families |
| targets | array | Available architecture targets |
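The sketch below shows how a client might read these fields from a single version object. The sample values (module names, extension names, versions) are invented, the targets field is left out because its item shape is not described here, and summarize_version is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical version object shaped after the documented fields.
version = {
    "module": {
        "full_module_name": "Example/1.0-foss-2023b",
        "module_name": "Example",
        "module_version": "1.0-foss-2023b",
    },
    "required_modules": [
        {
            "full_module_name": "Dependency/2.0",
            "module_name": "Dependency",
            "module_version": "2.0",
        }
    ],
    "extensions": [
        {"type": "python", "name": "example-ext", "version": "0.1"},
        {"type": "python", "name": "other-ext", "version": "0.2"},
        {"type": "r", "name": "example-r-ext", "version": "1.1"},
    ],
    "toolchain": {"name": "foss", "version": "2023b"},
    "toolchain_families_compatibility": ["2023b_foss"],
}


def summarize_version(data):
    """Collect the module name, its requirements, and extensions by type."""
    extensions = defaultdict(list)
    for ext in data["extensions"]:
        extensions[ext["type"]].append(ext["name"] + "==" + ext["version"])
    return {
        "module": data["module"]["full_module_name"],
        "requires": [m["full_module_name"] for m in data["required_modules"]],
        "toolchain": data["toolchain"]["name"] + "/" + data["toolchain"]["version"],
        "extensions": dict(extensions),
    }


summary = summarize_version(version)
print(summary)
```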
Browse Software Versions
Browse the versions of a package through the marketplace-software-versions endpoint.
Browse Installation Targets
Browse architecture-specific installations through the marketplace-software-targets endpoint.
Linking Catalogs to Offerings
Associate Catalog with Offering
Offering-software catalog associations are managed through offering actions, not a separate endpoint. Use the add_software_catalog action on the marketplace-provider-offerings endpoint to create an association.
Update Offering Software Catalog Configuration
Use the update_software_catalog action to change the catalog configuration for an offering.
Remove Software Catalog from Offering
Use the remove_software_catalog action to remove the catalog association from an offering.
Query Offering Software
Key Fields
| Field | Type | Description |
|---|---|---|
| module | object | Structured module info: full_module_name, module_name, module_version |
| required_modules | array of objects | Each with full_module_name, module_name, module_version |
| extensions | array | Bundled packages with type, name, version |
| toolchain_families_compatibility | array | Compatible toolchain families (e.g., "2023b_foss") |
Extension Structure
Each extension entry carries a type, a name, and a version.
Spack Repology Format
Spack uses the repology.json format from packages.spack.io.
Catalog Metadata Comparison
| Feature | EESSI | Spack |
|---|---|---|
| Format | New API (JSON) | Repology (JSON) |
| Type | Binary runtime | Source packages |
| Architecture Support | CPU-specific builds | Build variants |
| Extensions | Python, R, Perl, etc. | Dependencies only |
| Toolchain Info | Full toolchain details | Build dependencies |
| Installation Paths | CVMFS paths | Download URLs |
| Categories | Scientific domains | Package types |
| Updates | API timestamp | Git commit date |
SLURM Partitions and Software Catalogs
For detailed information about SLURM partition configuration and their integration with software catalogs, see the dedicated Marketplace SLURM Partitions guide.
This includes:
- SLURM partition model configuration
- Partition management APIs (add, update, remove)
- Partition-specific software catalog associations
- CPU architecture targeting for different partitions