RabbitMQ Cluster Operator (Production)
For production deployments, it is strongly recommended to use the official RabbitMQ Cluster Kubernetes Operator instead of the Bitnami Helm chart. The operator provides better lifecycle management, high availability, and production-grade features.
Overview
The RabbitMQ Cluster Operator automates:
- Provisioning and management of RabbitMQ clusters
- Scaling and automated rolling upgrades
- Monitoring integration with Prometheus and Grafana
- Backup and recovery operations
- Network policy and security configurations
Prerequisites
- Kubernetes cluster version 1.19 or above
- Configured kubectl access
- Appropriate RBAC permissions
Installation
1. Install the RabbitMQ Cluster Operator
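A minimal sketch of the installation step, assuming the latest released operator manifest is acceptable; for production you may prefer to pin a specific release instead of `latest`:

```shell
# Install the RabbitMQ Cluster Operator (CRD, RBAC objects and the operator
# Deployment) from the manifest published with the upstream GitHub releases.
kubectl apply -f "https://github.com/rabbitmq/cluster-operator/releases/latest/download/cluster-operator.yml"
```

The manifest creates the `rabbitmq-system` namespace and deploys the operator into it.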
Verify the operator is running:
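One way to check the operator, assuming it was installed from the upstream manifest into its default `rabbitmq-system` namespace:

```shell
# The operator pod should reach the Running state.
kubectl get pods -n rabbitmq-system

# The RabbitmqCluster custom resource definition should be registered.
kubectl get crd rabbitmqclusters.rabbitmq.com
```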
2. Create a Production RabbitMQ Cluster
Create a production-ready RabbitMQ cluster configuration:
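The following is an illustrative sketch of such a configuration, not a definitive production manifest. The cluster name `waldur-rabbitmq` follows the service name used in the Monitoring section; the file name, resource sizes, storage size, and configuration values are assumptions to adapt to your environment:

```shell
# Write an example RabbitmqCluster manifest to a local file.
# All sizes and settings below are assumptions, not recommendations.
cat > rabbitmq-cluster.yaml <<'EOF'
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: waldur-rabbitmq
spec:
  replicas: 3                      # odd number of nodes
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      cpu: "2"
      memory: 2Gi
  persistence:
    storage: 20Gi                  # assumed volume size
    # storageClassName: fast-ssd   # hypothetical storage class name
  rabbitmq:
    additionalConfig: |
      cluster_partition_handling = pause_minority
      disk_free_limit.absolute = 2GB
  affinity:
    podAntiAffinity:
      # Prefer spreading RabbitMQ pods across different nodes.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app.kubernetes.io/name: waldur-rabbitmq
            topologyKey: kubernetes.io/hostname
EOF
```

The operator labels the pods it creates with `app.kubernetes.io/name` set to the cluster name, which is what the anti-affinity selector relies on.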
Apply the configuration:
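A sketch of the apply step, assuming the manifest was saved as `rabbitmq-cluster.yaml` (a file name chosen here for illustration) and that the cluster is named `waldur-rabbitmq`:

```shell
# Create (or update) the cluster in the current namespace.
kubectl apply -f rabbitmq-cluster.yaml

# Inspect the resource; the operator reports readiness in its status conditions.
kubectl get rabbitmqcluster waldur-rabbitmq
```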
Configuration for Waldur
1. Retrieve RabbitMQ Credentials
Get the auto-generated credentials:
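The credentials can be read from the default user secret roughly as follows. This sketch assumes the cluster is named `waldur-rabbitmq` and lives in the current namespace:

```shell
# Secret values are base64-encoded, so decode them after extraction.
kubectl get secret waldur-rabbitmq-default-user \
  -o jsonpath='{.data.username}' | base64 --decode; echo

kubectl get secret waldur-rabbitmq-default-user \
  -o jsonpath='{.data.password}' | base64 --decode; echo
```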
2. Configure Waldur Helm Values
Update your Waldur values.yaml so that Waldur connects to the operator-managed cluster and takes its credentials from the default user secret described below.
RabbitMQ Operator Secret Management:
The RabbitMQ Cluster Operator automatically creates a default user secret named [cluster-name]-default-user containing:
- username: Auto-generated username
- password: Auto-generated password
- Other connection details
This approach avoids hardcoding credentials and follows Kubernetes security best practices.
High Availability Configuration
For production high availability, consider these additional configurations:
Pod Disruption Budget
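A minimal sketch of a Pod Disruption Budget for the cluster, assuming the cluster name `waldur-rabbitmq`; the budget name and file name are hypothetical:

```shell
# Allow at most one RabbitMQ pod to be evicted at a time during
# voluntary disruptions such as node drains.
cat > rabbitmq-pdb.yaml <<'EOF'
apiVersion: policy/v1            # use policy/v1beta1 on Kubernetes older than 1.21
kind: PodDisruptionBudget
metadata:
  name: waldur-rabbitmq-pdb      # hypothetical name
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: waldur-rabbitmq
EOF
```

Apply it with `kubectl apply -f rabbitmq-pdb.yaml` in the namespace of the cluster.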
Network Policy (Optional)
Restrict network access to RabbitMQ:
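The sketch below shows one possible policy, assuming the cluster name `waldur-rabbitmq`. The client label `app.kubernetes.io/instance: waldur` is hypothetical and must be replaced with labels that your Waldur pods actually carry:

```shell
# Restrict ingress to the RabbitMQ pods. Ports omit "protocol", which defaults to TCP.
cat > rabbitmq-network-policy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: waldur-rabbitmq-network-policy   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: waldur-rabbitmq
  policyTypes:
    - Ingress
  ingress:
    # Inter-node traffic between RabbitMQ pods of the same cluster.
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: waldur-rabbitmq
      ports:
        - port: 4369     # epmd (peer discovery)
        - port: 25672    # Erlang distribution
    # AMQP clients (hypothetical label for Waldur pods).
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/instance: waldur
      ports:
        - port: 5672     # AMQP
    # Management UI and Prometheus metrics; this rule has no "from",
    # so it allows any source. Narrow it in a real deployment.
    - ports:
        - port: 15672    # management UI / HTTP API
        - port: 15692    # Prometheus metrics
EOF
```

Apply it with `kubectl apply -f rabbitmq-network-policy.yaml`. Network policies only take effect when the cluster's network plugin enforces them.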
Monitoring
The operator automatically enables Prometheus metrics. To access them:
- Prometheus Metrics Endpoint: <http://waldur-rabbitmq:15692/metrics>
- Management UI Access: forward port 15672 of the cluster service to your machine, then access at: <http://localhost:15672>
- Grafana Dashboard: Import RabbitMQ dashboard ID 10991 or similar
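For the management UI item above, a port-forward along these lines should work, assuming the cluster service is named `waldur-rabbitmq` and is in the current namespace:

```shell
# Forward local port 15672 to the management port of the cluster service.
# The command keeps running until interrupted with Ctrl+C.
kubectl port-forward svc/waldur-rabbitmq 15672:15672
```

Log in with the username and password from the default user secret.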
Backup and Recovery
Automated Backup Configuration
The operator supports backup configurations through definitions:
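One possible approach, sketched here under the assumption that the cluster is named `waldur-rabbitmq`, is to export the definitions (users, virtual hosts, permissions, queues, exchanges, bindings, policies) from a running node and store the file externally. Note that definitions do not contain message data:

```shell
# Export definitions inside the first pod of the cluster.
# Operator-managed pods are named <cluster-name>-server-<ordinal>.
kubectl exec waldur-rabbitmq-server-0 -c rabbitmq -- \
  rabbitmqctl export_definitions /tmp/definitions.json

# Copy the exported file to the local machine.
kubectl exec waldur-rabbitmq-server-0 -c rabbitmq -- \
  cat /tmp/definitions.json > definitions.json
```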
For production, implement external backup strategies using tools like Velero or cloud-native backup solutions.
Scaling
Scale the cluster:
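A sketch of scaling up by patching the replica count, assuming the cluster is named `waldur-rabbitmq`; the target of 5 replicas is only an example:

```shell
# Change spec.replicas; the operator reconciles the StatefulSet accordingly.
kubectl patch rabbitmqcluster waldur-rabbitmq --type merge \
  -p '{"spec":{"replicas":5}}'
```

Editing `replicas` in the cluster manifest and re-applying it has the same effect and keeps the manifest as the source of truth.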
Important: Always use an odd number of replicas (1, 3, 5, 7) so that a majority of nodes can still be formed during a network partition, which helps avoid split-brain scenarios.
Troubleshooting
Check Cluster Status
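Some commands that are typically useful here, assuming the cluster is named `waldur-rabbitmq` and lives in the current namespace:

```shell
# Summary and detailed status (conditions, events) of the custom resource.
kubectl get rabbitmqcluster waldur-rabbitmq
kubectl describe rabbitmqcluster waldur-rabbitmq

# Pods that belong to the cluster.
kubectl get pods -l app.kubernetes.io/name=waldur-rabbitmq

# Cluster membership and partitions as seen by RabbitMQ itself.
kubectl exec waldur-rabbitmq-server-0 -c rabbitmq -- \
  rabbitmq-diagnostics cluster_status
```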
View Logs
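A sketch for viewing logs, assuming the cluster name `waldur-rabbitmq` and an operator installed from the upstream manifest into `rabbitmq-system`:

```shell
# Logs of a RabbitMQ node (repeat for -server-1, -server-2, ...).
kubectl logs waldur-rabbitmq-server-0 -c rabbitmq --tail=100

# Logs of the operator itself, useful when the cluster is not reconciled.
kubectl logs -n rabbitmq-system deployment/rabbitmq-cluster-operator --tail=100
```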
Migration from Bitnami Chart
If migrating from the Bitnami chart:
- Backup existing data using RabbitMQ management tools
- Deploy the operator and create a new cluster
- Export/import virtual hosts, users, and permissions
- Update Waldur configuration to point to the new cluster
- Test thoroughly before decommissioning the old setup
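The export/import step above could look roughly like this. The old pod name `rabbitmq-0` is hypothetical and depends on your Bitnami release name; the new cluster is assumed to be named `waldur-rabbitmq`. Definitions cover virtual hosts, users, permissions, and topology, but not queued messages:

```shell
OLD_POD=rabbitmq-0   # hypothetical name of a pod from the Bitnami deployment

# Export definitions from the old deployment to a local file.
kubectl exec "$OLD_POD" -- rabbitmqctl export_definitions /tmp/definitions.json
kubectl exec "$OLD_POD" -- cat /tmp/definitions.json > definitions.json

# Upload the file to a node of the new cluster and import it.
kubectl exec -i waldur-rabbitmq-server-0 -c rabbitmq -- \
  sh -c 'cat > /tmp/definitions.json' < definitions.json
kubectl exec waldur-rabbitmq-server-0 -c rabbitmq -- \
  rabbitmqctl import_definitions /tmp/definitions.json
```

If the old and new clusters live in different namespaces, add the matching `-n` flag to each command.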
Security Considerations
- TLS Configuration: Enable TLS for production by referencing a TLS secret in the cluster spec
- Authentication: Consider integrating with LDAP or other authentication backends
- Network Policies: Implement network policies to restrict access
- RBAC: Ensure appropriate Kubernetes RBAC policies are in place
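For the TLS item above, a minimal sketch, assuming a certificate and key already exist as `server.crt` and `server.key` (placeholder file names) and using the hypothetical secret name `rabbitmq-tls`:

```shell
# Store the certificate and private key in a Kubernetes TLS secret.
kubectl create secret tls rabbitmq-tls --cert=server.crt --key=server.key

# Reference the secret from the cluster spec (spec.tls.secretName).
kubectl patch rabbitmqcluster waldur-rabbitmq --type merge \
  -p '{"spec":{"tls":{"secretName":"rabbitmq-tls"}}}'
```

Clients then need to connect over the TLS port and trust the issuing certificate authority.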
Performance Tuning
For high-throughput scenarios:
- Adjust memory limits based on message volume
- Configure disk I/O with appropriate storage classes
- Tune RabbitMQ parameters in additionalConfig
- Monitor resource usage and scale accordingly
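As an illustration of the memory and additionalConfig items above, the sketch below writes a merge patch for a cluster assumed to be named `waldur-rabbitmq`. The file name and all values are assumptions chosen for the example, not tuned recommendations:

```shell
# Example patch: raise the memory allocation and adjust two rabbitmq.conf settings.
cat > rabbitmq-tuning-patch.yaml <<'EOF'
spec:
  resources:
    requests:
      memory: 4Gi
    limits:
      memory: 4Gi
  rabbitmq:
    additionalConfig: |
      vm_memory_high_watermark.relative = 0.6
      disk_free_limit.absolute = 4GB
EOF
```

With a recent kubectl it can be applied using `kubectl patch rabbitmqcluster waldur-rabbitmq --type merge --patch-file rabbitmq-tuning-patch.yaml`. A merge patch replaces the whole `additionalConfig` string, so include any settings you already rely on.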
Support and Documentation
- Official Documentation: https://www.rabbitmq.com/kubernetes/operator/
- GitHub Repository: https://github.com/rabbitmq/cluster-operator
- Examples: https://github.com/rabbitmq/cluster-operator/tree/main/docs/examples
- Community Support: RabbitMQ Discussions on GitHub