Commit bceb270

Answer some production-ready questions that came up in private threads
I expect this to preempt some follow-up questions that I have already seen come up a few times. I encourage others to answer follow-up questions in the same way. It will save us from repeating the same answers and help us share what we know to be true.

If you think that there is too much information in this README, we can split the questions into a separate file. I think that approach will fragment the information and make it less accessible, but I am open to what others feel would work better.

Signed-off-by: Gerhard Lazu <glazu@vmware.com>
1 parent eef774f commit bceb270

Lines changed: 44 additions & 6 deletions
@@ -1,20 +1,58 @@
11
# Production Example
22

3-
This is an example of a good starting point for a production RabbitMQ deployment. It deploys a 3-node cluster with enough resources to handle reasonable traffic.
3+
This is an example of a good starting point for a production RabbitMQ deployment.
4+
It deploys a 3-node cluster with sufficient resources to handle 1 billion messages per day at 8kB payload and a replication factor of three.
5+
The rest of the workload details are outlined in the monthly cost savings calculator on https://rabbitmq.com/tanzu
46

57
Please keep in mind that:
68

7-
1. It may not be suitable for YOUR production deployment. Please go through the [Production Checklist](https://www.rabbitmq.com/production-checklist.html) to learn more about production deployment considerations.
9+
1. It may not be suitable for **your** production deployment.
10+
The official [RabbitMQ Production Checklist](https://www.rabbitmq.com/production-checklist.html) will help you with some of these considerations.
811

9-
2. While it is important to correctly deploy RabbitMQ cluster for production deployment, it is even more important to correctly use RabbitMQ from your applications. [Production Checklist](https://www.rabbitmq.com/production-checklist.html) covers some of the common issues such as connection churn and polling consumers. Please also consider using [Quorum Queues](https://www.rabbitmq.com/quorum-queues.html) since they provide better data safety.
12+
2. While it is important to correctly deploy a RabbitMQ cluster for production workloads, it is equally important for your applications to use RabbitMQ correctly.
13+
[Production Checklist](https://www.rabbitmq.com/production-checklist.html) covers some of the common issues such as connection churn and polling consumers.
14+
This example was tested with [Quorum Queues](https://www.rabbitmq.com/quorum-queues.html) which provide excellent data safety for workloads that require message replication.
1015

11-
You can deploy this example like this:
16+
Before you can deploy this RabbitMQ cluster, you will need a multi-zone Kubernetes cluster with at least 3 nodes, 12 CPUs, 30Gi RAM and 1.5Ti disk space available.
17+
A `storageClass` named `ssd` will need to be defined too.
18+
We have [a GKE-specific example](ssd-gke.yaml) included in this example.
19+
Read more about the expected disk performance [in Google Cloud Documentation](https://cloud.google.com/compute/docs/disks/performance#ssd_persistent_disk).
20+
For what it's worth, disk write throughput is the limiting factor for persistent messages with a payload of 8kB.
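To see why disk writes dominate, the stated workload can be turned into a rough write rate. This is only a back-of-the-envelope sketch, not a measurement from the benchmark: it assumes traffic is spread evenly over the day, treats 8kB as 8000 bytes, assumes each replica writes every message once, and ignores protocol overhead, write amplification and traffic peaks. All variable names are illustrative.

```shell
# Rough disk write rate for the workload described above.
# Assumptions (not from the benchmark): even traffic, 8kB = 8000 bytes,
# one disk write per message per replica, no overhead or peaks.
messages_per_day=1000000000
payload_bytes=8000
replication_factor=3
nodes=3

# 86400 seconds in a day; integer division rounds down.
messages_per_second=$(( messages_per_day / 86400 ))
publish_bytes_per_second=$(( messages_per_second * payload_bytes ))

# Every message is written once per replica across the cluster.
cluster_write_bytes_per_second=$(( publish_bytes_per_second * replication_factor ))
node_write_bytes_per_second=$(( cluster_write_bytes_per_second / nodes ))

echo "messages per second:       $messages_per_second"
echo "cluster disk writes (MB/s): $(( cluster_write_bytes_per_second / 1000000 ))"
echo "per-node disk writes (MB/s): $(( node_write_bytes_per_second / 1000000 ))"
```

Under these assumptions the sustained rate comes to roughly 11,500 messages per second and a little over 90 MB/s of writes per node, which is the figure to compare against the disk performance documentation linked above. Real peaks will be higher than this daily average.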
21+
22+
To deploy this RabbitMQ cluster, run the following:
1223

1324
```shell
1425
kubectl apply -f rabbitmq.yaml
1526
kubectl apply -f pod-disruption-budget.yaml
1627
```
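Before applying the manifests, it may help to confirm that the prerequisites described above are in place. The following is a minimal sketch: `check_prereqs` is a hypothetical helper and not part of this example's files. It only checks for the `ssd` storageClass and the node count, and does not verify CPU, RAM or disk capacity.

```shell
# Hypothetical pre-flight helper; not shipped with this example.
check_prereqs() {
  # The manifests expect a storageClass named `ssd`.
  if ! kubectl get storageclass ssd >/dev/null 2>&1; then
    echo "missing storageClass: ssd"
    return 1
  fi

  # One line per node when headers are suppressed.
  node_count=$(kubectl get nodes --no-headers | wc -l | tr -d ' ')
  if [ "$node_count" -lt 3 ]; then
    echo "need at least 3 nodes, found $node_count"
    return 1
  fi

  echo "ok"
}
```

Calling `check_prereqs` before the `kubectl apply` commands gives an early failure instead of pods stuck in `Pending`. To eyeball the zone spread, `kubectl get nodes -L topology.kubernetes.io/zone` adds a zone column on clusters that set this well-known label; older clusters may use a different label.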
1728

18-
Please keep in mind that you need a multi-zone Kubernetes cluster with 3 nodes, 12 CPUs, 30Gi RAM, 1.5Ti disk space available as well as a `storageClass` called `ssd` to deploy this example as-is. Of course you can adjust these values to your environment if needed.
29+
## Q & A
30+
31+
### Is 4 CPUs per RabbitMQ node the minimum?
32+
33+
No. The absolute minimum is 2 CPUs.
34+
35+
For our workload - 1 billion messages per day at 8kB payload and a replication factor of three - 4 CPUs is the minimum.
36+
37+
### Will RabbitMQ work with 1 CPU?
38+
39+
Yes. It will work, but poorly, which is why we cannot recommend it for production workloads.
40+
A RabbitMQ with less than 2 full CPUs cannot be considered production.
41+
42+
43+
### Can I assign less than 1 CPU to RabbitMQ?
44+
45+
Yes, this is entirely possible within Kubernetes.
46+
Be prepared for unresponsiveness that cannot be explained.
47+
The kernel will work against RabbitMQ's runtime optimisations, and anything can happen.
48+
A RabbitMQ with less than 2 full CPUs cannot be considered production.
49+
50+
### Does CPU clock speed matter for message throughput?
51+
52+
Yes. Queues are single-threaded, and CPUs with higher clock speeds can run more cycles, which means that the queue process can perform more operations per second.
53+
This will not be the case when disks or network are the limiting factor, but in benchmarks with sufficient network and disk capacity, faster CPUs translate to higher message throughput.
54+
55+
### Are vCPUs (virtual CPUs) OK?
1956

20-
An SSD storage class can be defined using [the example](ssd-gke.yaml) (which is GKE-specific and needs to be adjusted for other environments). Read more about the expected disk performance [in Google Cloud Documentation](https://cloud.google.com/compute/docs/disks/performance#ssd_persistent_disk).
57+
Yes. The workload that was used for this production configuration starting point ran on Google Cloud and used 2 real CPU cores with 2 hyper-threads each, meaning 4 vCPUs.
58+
While we would recommend real CPUs and no hyper-threading, we also operate in the cloud and default to using vCPUs, including for our benchmarks.
