Appendix: Kafka setup


All [the authorities] did was to guard the distant and invisible interests of distant and invisible masters.

-- Franz Kafka, Das Schloss

Setting up Kafka is a process that can be exactly as involved as the amount of pain you are willing to tolerate.

Here, we will cover two options: Docker and direct installation. It would of course also have been possible to set up Kafka on Kubernetes, but we don't need that for the demo purposes of these chapters.

It can be added later, though.

Docker setup with docker-compose

This is the quick and relatively pain-free way.

Start by installing docker-compose and docker:

# arch/artix
pacman -S docker-compose docker

# ubuntu/debian
apt install docker-compose docker.io

# void linux
xbps-install -S docker-compose docker

# fedora
dnf install docker-compose docker

# older RHEL based distros
yum install docker-compose docker

(Make sure to start the Docker daemon; how this is done depends on your system and init. You might have to install an additional package with init/service scripts if they aren't bundled with the docker package by default.)

Next, let's create a docker-compose.yml file somewhere. It is advisable to put it into its own folder; you might also want to consider putting it under git.
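Concretely, that could look like the following (the folder name kafka-demo is just an example):

```shell
# a dedicated folder for the compose file, tracked by git
mkdir -p kafka-demo
cd kafka-demo
git init -q
touch docker-compose.yml   # we will fill this in next
```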

This content might be enough for us for the first try:

# docker-compose.yml
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1
      KAFKA_CREATE_TOPICS: "test:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

As you can see, Kafka also requires Apache ZooKeeper.

Let's start our containers:

docker-compose up -d

If all went right, you should now have a very small Kafka setup running :)

The KAFKA_CREATE_TOPICS entry has also created the test topic with 1 partition and 1 replica. That's not very safe for production, but "alright, I guess" for us.
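As an aside, the KAFKA_CREATE_TOPICS value follows the wurstmeister image's topic:partitions:replicas convention. A small plain-shell sketch of how such a spec decomposes:

```shell
# split a "name:partitions:replicas" spec into its parts (POSIX sh)
spec="test:1:1"
topic=${spec%%:*}        # everything before the first colon
rest=${spec#*:}          # drop the topic name
partitions=${rest%%:*}   # first field of the remainder
replicas=${rest#*:}      # last field
echo "topic=$topic partitions=$partitions replicas=$replicas"
# → topic=test partitions=1 replicas=1
```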

You can verify that it works using kcat:

First, let's start a consumer on the topic test:

kcat -C -b 127.0.0.1:9092 -t test

And now let's start a producer in another terminal window that endlessly reads from stdin:

kcat -P -b 127.0.0.1:9092 -t test

If you type in something and press enter, you should see your message appear in the consumer terminal window.

The more involved way to set up Kafka

Start by going to kafka.apache.org/downloads and downloading the latest release of Kafka. The versioning might seem a little confusing at first, as the Scala version number comes first in the archive name.
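For instance, a name like kafka_2.13-3.2.0.tgz decomposes as follows (a plain-shell sketch; the concrete versions here are just an example and will differ over time):

```shell
archive="kafka_2.13-3.2.0.tgz"
base=${archive%.tgz}     # kafka_2.13-3.2.0
scala=${base#kafka_}     # 2.13-3.2.0
scala=${scala%%-*}       # 2.13  -> the Scala version Kafka was built with
kafka=${base#*-}         # 3.2.0 -> the actual Kafka version
echo "Scala: $scala, Kafka: $kafka"
# → Scala: 2.13, Kafka: 3.2.0
```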

When you have downloaded it, start by unpacking the tarball:

tar -xzf kafka_2.13-3.2.0.tgz
cd kafka_2.13-3.2.0

Then let's look into the bin folder and start ZooKeeper:

bin/zookeeper-server-start.sh config/zookeeper.properties

This will start a single-node instance of ZooKeeper with the default port and settings.

To make things spicier, and to demonstrate Kafka as a distributed system, let's spin up three brokers:

cp config/server.properties config/server0.properties
cp config/server.properties config/server1.properties
cp config/server.properties config/server2.properties

# Edit each file so it has the following changed properties, respectively

vi config/server0.properties
broker.id=0
listeners=PLAINTEXT://localhost:9092
log.dir=/tmp/kafka-logs-0

vi config/server1.properties
broker.id=1
listeners=PLAINTEXT://localhost:9093
log.dir=/tmp/kafka-logs-1

vi config/server2.properties
broker.id=2
listeners=PLAINTEXT://localhost:9094
log.dir=/tmp/kafka-logs-2

Pay attention to the ports; we need to make sure they differ between brokers. The broker IDs have to be unique as well, and having all Kafka instances write their data into the same log directory would also be a bad idea.
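If you prefer not to edit the files by hand, the three variants can also be generated with sed. A sketch, self-contained in that it fabricates a minimal server.properties first; with a real Kafka download you would start from the stock config/server.properties instead:

```shell
# fabricate a minimal template (stand-in for the real config/server.properties)
mkdir -p config
printf 'broker.id=0\nlisteners=PLAINTEXT://localhost:9092\nlog.dir=/tmp/kafka-logs\n' \
  > config/server.properties

# derive one config per broker: unique id, port, and log directory
for i in 0 1 2; do
  port=$((9092 + i))
  sed -e "s|^broker.id=.*|broker.id=$i|" \
      -e "s|^listeners=.*|listeners=PLAINTEXT://localhost:$port|" \
      -e "s|^log.dir=.*|log.dir=/tmp/kafka-logs-$i|" \
      config/server.properties > "config/server$i.properties"
done

grep -H . config/server?.properties   # show the generated files
```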

Now, we can start each broker:

bin/kafka-server-start.sh config/server0.properties
bin/kafka-server-start.sh config/server1.properties
bin/kafka-server-start.sh config/server2.properties

You will have to run each of these commands in a separate terminal window, or run them in the background by appending & to each command; beware that in the latter case the output of all three brokers will scroll by interleaved, which can be confusing.

Now, let's create a topic. Because we have three brokers running, we can go proverbially buck wild and create it with three partitions and three replicas:

bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic test --partitions 3 --replication-factor 3

If it succeeded, you should see the output:

Created topic test.

See if the topic is there:

bin/kafka-topics.sh --bootstrap-server localhost:9092 --list

If you want to see a bit more of what's under the hood, you can also view the topic layout:

bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic test

We can verify that the deployment was correct by using kcat again.

First, let's start a consumer on the topic test:

kcat -C -b 127.0.0.1:9092 -t test

And now let's start a producer in another terminal window that endlessly reads from stdin:

kcat -P -b 127.0.0.1:9092 -t test

And you can try sending messages again.

It should work.

Alternatively, the Kafka distribution comes with its own commands for running simple CLI consumers and producers:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test