Large message support in Kafka with the built-in claim check pattern.

View the full demo in realtime

You can either follow all the steps manually, or watch the recording

Review the docker compose environment

As docker-compose.yaml shows, the demo environment consists of the following services:

  • cli-aws
  • gateway1
  • gateway2
  • kafka-client
  • kafka1
  • kafka2
  • kafka3
  • minio
  • schema-registry
  • zookeeper
cat docker-compose.yaml

Starting the docker environment

Start all the Docker services in the background and wait until they are up and healthy

  • --wait: Wait for services to be running|healthy. Implies detached mode.
  • --detach: Detached mode: Run containers in the background
docker compose up --detach --wait

Creating virtual cluster teamA

Creating virtual cluster teamA on gateway gateway1 and reviewing the configuration file to access it

# Generate virtual cluster teamA with service account sa
token=$(curl \
--request POST "http://localhost:8888/admin/vclusters/v1/vcluster/teamA/username/sa" \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent \
--data-raw '{"lifeTimeSeconds": 7776000}' | jq -r ".token")

# Create access file
echo """
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='sa' password='$token';
""" >

# Review file

Review credentials

cat credentials
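
If the token request fails (for example because the gateway is not ready yet), `jq -r ".token"` prints either nothing or the literal string `null`, and that value would end up in the access file unnoticed. A minimal sketch of a guard, assuming a hypothetical helper named `require_token` that is not part of the demo:

```shell
# Hypothetical helper (not part of the demo): fail early when the token is unusable.
# jq -r prints the literal string "null" when the .token field is missing.
require_token() {
  case "$1" in
    ""|null)
      echo "no token returned by the gateway" >&2
      return 1
      ;;
  esac
  return 0
}

# Usage, right after the curl call that sets $token:
#   require_token "$token" || exit 1
```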

Let's create a bucket

docker compose exec cli-aws \
aws \
--profile minio \
--endpoint-url=http://minio:9000 \
--region eu-south-1 \
s3api create-bucket \
--bucket bucket

Creating topic large-messages on teamA

Creating on teamA:

  • Topic large-messages with partitions:1 and replication-factor:1
kafka-topics \
--bootstrap-server localhost:6969 \
--command-config \
--replication-factor 1 \
--partitions 1 \
--create --if-not-exists \
--topic large-messages

Adding interceptor large-messages

Let's ask Gateway to offload large messages to S3

cat step-09-large-messages.json | jq

curl \
--request POST "http://localhost:8888/admin/interceptors/v1/vcluster/teamA/interceptor/large-messages" \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent \
--data @step-09-large-messages.json | jq
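
The idea behind the claim check pattern is that the large payload is parked in external storage and only a small reference to it travels through Kafka; the reader then uses the reference to fetch the payload. The sketch below illustrates that round trip locally. It is not how Gateway implements it: a temporary directory stands in for the S3 bucket, and the function names and key format are made up for illustration.

```shell
# Sketch of the claim check idea only; this is not Gateway's implementation.
# A local directory stands in for the S3 bucket.
BUCKET_DIR=$(mktemp -d)

# Store the payload in the "bucket" and print a small reference to it.
# The reference is what would travel through Kafka instead of the payload.
claim_check_store() {
  key="payload-$(cksum < "$1" | awk '{print $1}')"
  cp "$1" "$BUCKET_DIR/$key"
  echo "$key"
}

# Resolve a reference back into the original payload on stdout.
claim_check_fetch() {
  cat "$BUCKET_DIR/$1"
}

# Round trip with a small stand-in payload
payload=$(mktemp)
printf 'pretend this is 40 MB' > "$payload"
ref=$(claim_check_store "$payload")
claim_check_fetch "$ref" | cmp -s - "$payload" && echo "round trip ok"
```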

Let's create a large message

openssl rand -hex $((20*1024*1024)) > large-message.bin 
ls -lh large-message.bin

Sending the large message through Kafka

requiredMemory=$(( 2 * $(cat large-message.bin | wc -c | awk '{print $1}')))

kafka-producer-perf-test \
--producer.config \
--topic large-messages \
--throughput -1 \
--num-records 1 \
--payload-file large-message.bin \
--producer-props \
bootstrap.servers=localhost:6969 \
max.request.size=$requiredMemory
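
The `requiredMemory` value is twice the payload size in bytes, presumably to leave headroom above the record itself. Note that `openssl rand -hex` emits two characters per random byte, so asking for 20 MiB of random data yields a file of roughly 40 MiB. The same arithmetic as a small function, where `required_memory` is an illustrative name the demo does not define:

```shell
# Illustrative helper: twice the size of a file in bytes,
# mirroring the requiredMemory computation above.
required_memory() {
  size=$(wc -c < "$1")
  echo $(( 2 * $size ))
}

# Example with a small 1 KiB file instead of the real payload
sample=$(mktemp)
head -c 1024 /dev/zero > "$sample"
required_memory "$sample"   # prints 2048
rm -f "$sample"
```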

Let's read the message back

kafka-console-consumer  \
--bootstrap-server localhost:6969 \
--consumer.config \
--topic large-messages \
--from-beginning \
--max-messages 1 > from-kafka.bin

Let's compare the files

ls -lH *bin
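
`ls` only shows whether the sizes line up. To check the content as well, a byte-level comparison can be used. A minimal sketch, assuming the consumer output is expected to be byte-identical to the input (line-ending handling by the console consumer could make the two differ); `same_content` is an illustrative name:

```shell
# Illustrative helper: succeed only if two files have identical bytes.
same_content() {
  cmp -s "$1" "$2"
}

# Usage with the demo files:
#   same_content large-message.bin from-kafka.bin && echo "identical" || echo "different"
```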

Let's look at what's inside minio

docker compose exec cli-aws \
aws \
--profile minio \
--endpoint-url=http://minio:9000 \
--region eu-south-1 \
s3 \
ls s3://bucket --recursive --human-readable

Consuming from teamAlarge-messages

Consuming from teamAlarge-messages directly on the kafka1 cluster (ports 19092-19094) rather than through the gateway on port 6969

kafka-console-consumer \
--bootstrap-server localhost:19092,localhost:19093,localhost:19094 \
--topic teamAlarge-messages \
--from-beginning \
--timeout-ms 10000 \
--property print.headers=true | jq

Tearing down the docker environment

Remove all your docker processes and associated volumes

  • --volumes: Remove named volumes declared in the "volumes" section of the Compose file and anonymous volumes attached to containers.
docker compose down --volumes

