25 posts tagged with "Performance"

RabbitMQ 3.13: Classic Queues Changes

January 11, 2024 · 11 min read

Michał Kuratczyk

We've already announced two major new features of 3.13 in separate blog posts:

This post focuses on the changes to the classic queues in this release:

classic queue storage format version 1 is deprecated
new implementation of the classic queue message store

RabbitMQ 3.12 Performance Improvements

May 17, 2023 · 13 min read

Michał Kuratczyk

RabbitMQ 3.12 will be released soon with many new features and improvements. This blog post focuses on the performance-related differences. The most important change is that the lazy mode for classic queues is now the standard behavior (more on this below). The new implementation should be even more memory efficient while providing higher throughput and lower latency than either the lazy or non-lazy implementations did in earlier versions.

For even better performance, we highly recommend switching to classic queues version 2 (CQv2).

Serving Millions of Clients with Native MQTT

March 21, 2023 · 24 min read

David Ansari

RabbitMQ's core protocol has been AMQP 0.9.1. To support MQTT, STOMP, and AMQP 1.0, the broker transparently proxies via its core protocol. While this is a simple way to extend RabbitMQ with support for more messaging protocols, it degrades scalability and performance.

In the last 9 months, we re-wrote the MQTT plugin to not proxy via AMQP 0.9.1 anymore. Instead, the MQTT plugin parses MQTT messages and sends them directly to queues. This is what we call Native MQTT.

The results are spectacular:

Memory usage drops by up to 95% and hundreds of GBs with many connections.
For the first time ever, RabbitMQ is able to handle millions of connections.
End-to-end latency drops by 50% - 70%.
Throughput increases by 30% - 40%.

Native MQTT turns RabbitMQ into an MQTT broker opening the door for a broader set of IoT use cases.

Native MQTT ships in RabbitMQ 3.12.

Improving RabbitMQ Performance with Flame Graphs

May 31, 2022 · 16 min read

David Ansari

Recent Erlang/OTP versions ship with Linux perf support. This blog post provides step-by-step instructions on how you can create CPU and memory flame graphs in RabbitMQ to quickly and accurately detect performance bottlenecks. We also provide examples of how flame graphs have helped us increase message throughput in RabbitMQ.

RabbitMQ 3.10 Performance Improvements

May 16, 2022 · 12 min read

Michał Kuratczyk

RabbitMQ 3.10 was released on the 3rd of May 2022, with many new features and improvements. This blog post gives an overview of the performance improvements in that release. Long story short, you can expect higher throughput, lower latency and faster node startups, especially with large definitions files imported on startup.

Erlang 24 Support Roadmap

March 23, 2021 · 5 min read

Michael Klishin

TL;DR

Erlang 24 will ship in May and it offers significant performance gains to RabbitMQ users
Supporting Erlang 24 and 22 at the same time is not feasible, so in early May 2021, Erlang 22 support will be dropped
If you run on Erlang 22, upgrade to 23.2 today: it should be a drop-in replacement
Users of the RabbitMQ Kubernetes Operator, the Docker community image and modern releases of VMware Tanzu RabbitMQ for VMs are not affected as those projects all use Erlang 23 today

Cluster Sizing Case Study – Quorum Queues Part 2

June 22, 2020 · 12 min read

Jack Vanlightly

In the last post we started a sizing analysis of our workload using quorum queues. We focused on the happy scenario in which consumers are keeping up, meaning that there are no queue backlogs and all brokers in the cluster are operating normally. By running a series of benchmarks modelling our workload at different intensities, we identified the top 5 cluster size and storage volume combinations in terms of cost per 1000 msg/s per month.

Cluster: 7 nodes, 8 vCPUs (c5.2xlarge), gp2 SSD. Cost: $54
Cluster: 9 nodes, 8 vCPUs (c5.2xlarge), gp2 SSD. Cost: $69
Cluster: 5 nodes, 8 vCPUs (c5.2xlarge), st1 HDD. Cost: $93
Cluster: 5 nodes, 16 vCPUs (c5.4xlarge), gp2 SSD. Cost: $98
Cluster: 7 nodes, 16 vCPUs (c5.4xlarge), gp2 SSD. Cost: $107
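
As a rough illustration of the ranking metric, the sketch below shows one plausible way to compute cost per 1000 msg/s per month. The function name and the input figures are hypothetical and are not taken from the case study; the post's own calculation may account for costs differently.

```python
def cost_per_1000_msgs_per_sec(monthly_cost_usd, sustained_rate_msgs_per_sec):
    """Monthly cluster cost divided by throughput, in units of 1000 msg/s.

    Assumption: cost is normalized by the highest publish rate the cluster
    sustained in the benchmarks. Both inputs here are illustrative only.
    """
    if sustained_rate_msgs_per_sec <= 0:
        raise ValueError("sustained rate must be positive")
    return monthly_cost_usd / (sustained_rate_msgs_per_sec / 1000)


# Made-up example: a cluster costing $1000 per month that sustains 20,000 msg/s.
example = cost_per_1000_msgs_per_sec(1000, 20_000)
print(example)  # 50.0
```

A lower value means a cluster delivers the same throughput for less money, which is why the list is ordered by ascending cost.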

There are more tests to run to ensure these clusters can handle events such as brokers failing and large backlogs accumulating during outages or system slowdowns.

All quorum queues are declared with the following properties:

x-quorum-initial-group-size=3
x-max-in-memory-length=0

The x-max-in-memory-length property forces the quorum queue to remove message bodies from memory as soon as it is safe to do so. You can set it to a higher limit; 0 is the most aggressive setting, designed to avoid large memory growth at the cost of more disk reads when consumers do not keep up. Without this property, message bodies are kept in memory at all times, which can drive memory growth to the point of triggering memory alarms. That severely impacts the publish rate, something we want to avoid in this workload case study.
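
The two properties listed above are optional queue arguments supplied when the queue is declared. A minimal sketch of how they might be assembled follows. The helper name and the queue name in the comment are hypothetical, and the `x-queue-type` entry is added here because a queue must be declared as a quorum queue for these arguments to apply.

```python
def quorum_queue_arguments(initial_group_size=3, max_in_memory_length=0):
    """Build the optional arguments for declaring a quorum queue.

    Defaults mirror the values used in the case study: three replicas and
    no message bodies kept in memory once it is safe to drop them.
    """
    return {
        "x-queue-type": "quorum",
        "x-quorum-initial-group-size": initial_group_size,
        "x-max-in-memory-length": max_in_memory_length,
    }


args = quorum_queue_arguments()
print(args)

# With a client such as pika, the dictionary would be passed at declaration
# time (the queue name is illustrative; quorum queues must be durable):
#   channel.queue_declare(queue="sizing-test", durable=True, arguments=args)
```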

Cluster Sizing Case Study – Quorum Queues Part 1

June 21, 2020 · 16 min read

Jack Vanlightly

In the first post in this sizing series we covered the workload, the tests, and the cluster and storage volume configurations on AWS EC2. In this post we’ll run a sizing analysis with quorum queues. We also ran a sizing analysis on mirrored queues.

In this post we'll run the increasing intensity tests that will measure our candidate cluster sizes at varying publish rates, under ideal conditions. In the next post we'll run resiliency tests that measure whether our clusters can handle our target peak load under adverse conditions.

All quorum queues are declared with the following properties:

x-quorum-initial-group-size=3 (replication factor)
x-max-in-memory-length=0

The x-max-in-memory-length property forces the quorum queue to remove message bodies from memory as soon as it is safe to do so. You can set it to a higher limit; 0 is the most aggressive setting, designed to avoid large memory growth at the cost of more disk reads when consumers do not keep up. Without this property, message bodies are kept in memory at all times, which can drive memory growth to the point of triggering memory alarms. That severely impacts the publish rate, something we want to avoid in this workload case study.

Cluster Sizing Case Study – Mirrored Queues Part 2

June 20, 2020 · 12 min read

Jack Vanlightly

In the last post we started a sizing analysis of our workload using mirrored queues. We focused on the happy scenario in which consumers are keeping up, meaning that there are no queue backlogs and all brokers in the cluster are operating normally. By running a series of benchmarks modelling our workload at different intensities, we identified the top 5 cluster size and storage volume combinations in terms of cost per 1000 msg/s per month.

Cluster: 5 nodes, 8 vCPUs, gp2 SSD. Cost: $58
Cluster: 7 nodes, 8 vCPUs, gp2 SSD. Cost: $81
Cluster: 5 nodes, 8 vCPUs, st1 HDD. Cost: $93
Cluster: 5 nodes, 16 vCPUs, gp2 SSD. Cost: $98
Cluster: 9 nodes, 8 vCPUs, gp2 SSD. Cost: $104

There are more tests to run to ensure these clusters can handle events such as brokers failing and large backlogs accumulating during outages or system slowdowns.

Cluster Sizing Case Study - Mirrored Queues Part 1

June 19, 2020 · 13 min read

Jack Vanlightly

In the first post in this sizing series we covered the workload, cluster and storage volume configurations on AWS EC2. In this post we’ll run a sizing analysis with mirrored queues.

The first phase of our sizing analysis will be assessing what intensities each of our clusters and storage volumes can handle easily and which are too much.

All tests use the following policy:

ha-mode: exactly
ha-params: 2
ha-sync-mode: manual
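
The policy keys above map onto a JSON policy definition. The sketch below builds that definition and a matching `rabbitmqctl set_policy` command line. The policy name `ha-two` and the match-everything pattern `^` are assumptions for illustration, since the excerpt does not say which name or pattern the tests used.

```python
import json
import shlex

# Policy definition from the post: each queue has exactly two replicas
# (one leader plus one mirror), and new mirrors are not synchronized
# automatically.
policy_definition = {
    "ha-mode": "exactly",
    "ha-params": 2,
    "ha-sync-mode": "manual",
}


def set_policy_command(name, pattern, definition):
    """Return a shell-quoted rabbitmqctl command that applies the policy to queues."""
    return " ".join([
        "rabbitmqctl", "set_policy",
        "--apply-to", "queues",
        shlex.quote(name),
        shlex.quote(pattern),
        shlex.quote(json.dumps(definition)),
    ])


# Hypothetical policy name and pattern; adjust to match the target queues.
print(set_policy_command("ha-two", "^", policy_definition))
```

This only applies to RabbitMQ versions that still support classic mirrored queues.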
