Kafka, RabbitMQ, Redis
"One morning, as Gregor Samsa was waking up from anxious dreams, he discovered that in bed he had been changed into a monstrous verminous bug. He lay on his armour-hard back and saw, as he lifted his head up a little, his brown, arched abdomen divided up into rigid bow-like sections."
-- Franz Kafka, Die Verwandlung
When we develop a distributed, highly available and performant system based on messaging, and need to leverage both pub-sub and push-pull, we have to select the broker implementation that best suits our needs and use cases.
Some of these needs are unrelated to the use case. For example, it is a good idea to pick a message queue / pub-sub broker that already has official, or strong community, support for the programming language we are using. Otherwise we have to spend time writing our own bindings, which not only costs time, but can also lead to bugs down the line and prevents us from benefiting from upstream support.
While there are many such implementations, for the sake of simplicity, let's consider three major tools we can use: Apache Kafka, RabbitMQ and Redis. Bet you didn't expect Redis would show up again, did you? :D
RabbitMQ
RabbitMQ, a very popular message queue, is an implementation of AMQP, the Advanced Message Queuing Protocol. With this messaging model, instead of producing messages directly to a queue, we send them to a so-called exchange.
An exchange is a router. We could say it is similar to a post office: it inspects each message and decides which message queue(s) to put it into. The queues are connected to the consuming services, or in other words, consumers.
The exchange is connected to queues through bindings. Each binding can be identified by its binding key.
The ways in which messages can move through the system are extremely flexible. That is because there are many exchange types available, each providing different behavior, and you can also write plugins for RabbitMQ that extend its functionality.
Here are a few of the exchange types available in RabbitMQ:
- fanout - messages are sent to every single queue bound to the exchange
- direct - the producer publishes a message with a particular routing key, the routing key is compared to each binding key, and if it is an exact match, the message goes into that queue. This is, in effect, a point-to-point pattern
- topic - here, the routing key serves as a topic and is matched against binding keys that may contain wildcards. The message goes to all queues whose binding key pattern matches the routing key
- headers - the routing key is ignored completely, and the message is routed according to its header attributes
- default - also called the nameless exchange. It is a direct exchange that the broker declares up front, and every queue is automatically bound to it with a binding key equal to the queue's name. Here, the routing key is therefore tied to the name of the queue itself
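To make the routing rules in the list above more concrete, here is a minimal in-memory sketch in Python. It is a toy model, not a RabbitMQ client: the `ToyExchange` class and the `topic_matches` helper are hypothetical names invented for this illustration, and the queue names and keys are made up. It assumes the usual AMQP topic wildcards, where `*` stands for exactly one dot-separated word and `#` for zero or more words.

```python
from collections import defaultdict


def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""

    def match(p: list, k: list) -> bool:
        if not p:
            return not k
        if p[0] == "#":
            # '#' either matches nothing (drop it) or swallows one more word.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return p[0] in ("*", k[0]) and match(p[1:], k[1:])

    return match(pattern.split("."), routing_key.split("."))


class ToyExchange:
    """Hypothetical in-memory stand-in for an exchange (illustration only)."""

    def __init__(self, kind: str):
        if kind not in ("fanout", "direct", "topic"):
            raise ValueError(f"unsupported exchange type: {kind}")
        self.kind = kind
        self.bindings = []               # list of (binding_key, queue_name)
        self.queues = defaultdict(list)  # queue_name -> delivered messages

    def bind(self, queue: str, binding_key: str = "") -> None:
        self.bindings.append((binding_key, queue))
        self.queues[queue]  # touching the key creates the empty queue

    def publish(self, routing_key: str, body: str) -> list:
        """Route one message and return the names of the queues it reached."""
        targets = []
        for binding_key, queue in self.bindings:
            if self.kind == "fanout":
                hit = True  # the routing key is ignored
            elif self.kind == "direct":
                hit = binding_key == routing_key
            else:  # topic
                hit = topic_matches(binding_key, routing_key)
            if hit and queue not in targets:
                targets.append(queue)
        for queue in targets:
            self.queues[queue].append(body)
        return targets


orders = ToyExchange("topic")
orders.bind("all_orders", "orders.#")
orders.bind("eu_orders", "orders.eu.*")
orders.bind("audit", "#")

print(orders.publish("orders.eu.created", "order 1"))
# ['all_orders', 'eu_orders', 'audit']
print(orders.publish("orders.us.created", "order 2"))
# ['all_orders', 'audit']
```

The producer only ever chooses a routing key. Which queues receive the message is decided entirely by the exchange type and the bindings, which is the flexibility described above.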
These are of course not the only models with which you can route messages. There are more, and you can also build on the existing ones.
In RabbitMQ, the behavior of the message is largely dictated by the message metadata, as opposed to the broker administrator.
RabbitMQ is quite fast, but it does not keep the messages it routes: once a message has been delivered and acknowledged, it is gone from the queue. By default, it is not persistent either, although persistence is available (durable queues and persistent messages) should you wish to use it.
Redis
Redis seems to pop up pretty much everywhere. For some years now, it has supported both message queues (push-pull) and the pub-sub pattern.
The pub-sub pattern works pretty much how you would expect it to work. It is quite basic. Unlike with Kafka and RabbitMQ, the thing we publish to is called a channel rather than a topic.
You can see the pub/sub documentation here: https://redis.io/docs/manual/pubsub/
To provide message-queue / push-pull functionality, Redis streams are used: https://redis.io/docs/manual/data-types/streams/
Unlike RabbitMQ queues, Redis streams keep all received messages by default and give every entry an ID, which plays much the same role as an offset in Kafka, so you can seek and replay in the same way.
However, since we are still talking about Redis, everything is stored in RAM. It is very fast, but RAM size is a constraint, and scaling out is less effective.
You also have to "pick one" in Redis: a pub-sub channel and a stream are separate things, so it is not possible to use pub-sub and message queues on a single topic.
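The difference between a fire-and-forget channel and a replayable stream can be sketched with two tiny in-memory classes. This is a toy model in plain Python, not the Redis client API: `ToyChannel` and `ToyStream` are hypothetical names, and the stream uses a simple integer counter as the entry ID, whereas real Redis stream IDs combine a millisecond timestamp with a sequence number.

```python
class ToyChannel:
    """Pub-sub: a message reaches whoever is subscribed now, then is forgotten."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback) -> None:
        self.subscribers.append(callback)

    def publish(self, message) -> int:
        for callback in self.subscribers:
            callback(message)
        return len(self.subscribers)  # how many subscribers received it


class ToyStream:
    """Stream: an append-only log whose entries stay readable by ID."""

    def __init__(self):
        self.entries = []  # list of (entry_id, message)

    def add(self, message) -> int:
        entry_id = len(self.entries) + 1  # simplified ID, an assumption
        self.entries.append((entry_id, message))
        return entry_id

    def read(self, after_id: int = 0) -> list:
        """Return every entry with an ID greater than after_id."""
        return [entry for entry in self.entries if entry[0] > after_id]


channel = ToyChannel()
channel.publish("early")          # nobody is listening, so this is lost
inbox = []
channel.subscribe(inbox.append)
channel.publish("hello")
print(inbox)                      # ['hello']

stream = ToyStream()
stream.add("early")
stream.add("hello")
print(stream.read())              # [(1, 'early'), (2, 'hello')]
print(stream.read(after_id=1))    # [(2, 'hello')]
```

A late subscriber to the channel never sees "early", while a late reader of the stream can start from the beginning or from any ID it remembers.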
So, when to use what?
The key difference between RabbitMQ and Kafka is that RabbitMQ pairs a smart broker with a dumb consumer. The producer sends the message to the exchange, and the exchange routes the message to a queue. The exchange does the routing, and so it is a smart broker. Meanwhile, in Kafka the broker is dumb and the consumer is smart. It is up to the consumers to decide which topics they are interested in, forming consumer groups and deciding what listens to what.
If your broker doesn't have to be smart, then Kafka is the better option.
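The "dumb broker, smart consumer" side of this comparison can be sketched as follows. This is a simplified model in plain Python, not the Kafka client API: `ToyLog` and `ToyConsumer` are hypothetical names, and partitions and consumer groups are left out on purpose. The broker only appends and serves records, while each consumer picks its topics and tracks its own position.

```python
class ToyLog:
    """Dumb broker: appends records and serves them from an offset, no routing."""

    def __init__(self):
        self.topics = {}  # topic name -> list of records

    def append(self, topic: str, record) -> int:
        log = self.topics.setdefault(topic, [])
        log.append(record)
        return len(log) - 1  # offset of the record just written

    def fetch(self, topic: str, offset: int) -> list:
        return self.topics.get(topic, [])[offset:]


class ToyConsumer:
    """Smart consumer: chooses its topics and remembers its own offsets."""

    def __init__(self, broker: ToyLog, topics: list):
        self.broker = broker
        self.offsets = {topic: 0 for topic in topics}

    def poll(self) -> list:
        received = []
        for topic, offset in list(self.offsets.items()):
            records = self.broker.fetch(topic, offset)
            self.offsets[topic] = offset + len(records)
            received.extend((topic, record) for record in records)
        return received

    def seek(self, topic: str, offset: int) -> None:
        self.offsets[topic] = offset  # rewinding allows a replay


broker = ToyLog()
broker.append("orders", "o1")
broker.append("payments", "p1")
broker.append("orders", "o2")

billing = ToyConsumer(broker, ["orders", "payments"])
shipping = ToyConsumer(broker, ["orders"])

print(billing.poll())   # [('orders', 'o1'), ('orders', 'o2'), ('payments', 'p1')]
print(shipping.poll())  # [('orders', 'o1'), ('orders', 'o2')]
print(shipping.poll())  # []
shipping.seek("orders", 0)
print(shipping.poll())  # [('orders', 'o1'), ('orders', 'o2')]
```

Nothing in `ToyLog` knows who is reading or why. All of the "what listens to what" logic lives in the consumers, which is the opposite of the exchange-driven routing in RabbitMQ.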
Another benefit of Kafka is message persistence. It has the best storage capabilities of the three. But if your messages are not the source of truth, but rather notifications, then RabbitMQ can be the better solution for you. For example, if you want to notify a user at runtime that a new message has arrived (as opposed to showing unread messages in the inbox), then you might use RabbitMQ to distribute notifications to services, which can then create an actual system notification for the end user.
There is often no harm done if such a notification is lost in the case of a power failure, network failure, or system restart, so it's fine to use RabbitMQ.
On the other hand, Redis and Kafka have better persistence capabilities and can replay messages. Redis is very fast, but cannot store as much as Kafka, and sharding and replication may be more complicated.
However, an argument in favor of Redis is that it has a lot of other storage-related functionality, and so you may get away without introducing another technology to your stack. If you are, for example, already using Redis for caching or temporary storage of some data, and your messaging needs are not overly complicated, then you can just leverage Redis and easily extend the functionality of your applications.
Keep in mind that you cannot mix and match pub-sub and streams in Redis, so that is a good threshold for what would be considered "overly complicated messaging".