What is a Schema Payload Validation Policy Interceptor?

Avoid outages from missing or badly formatted records by ensuring all messages adhere to a schema.

This interceptor also supports validating payloads against specific constraints for Avro and Protobuf schemas.

These are similar to the validations provided by JSON Schema, such as:

  • Number: minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf
  • String: minLength, maxLength, pattern, format
  • Collections: maxItems, minItems
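
For example, a schema fragment combining several of these constraints could look like this (an illustrative sketch, not the demo's actual schema):

{
  "type": "object",
  "properties": {
    "name":    { "type": "string", "minLength": 3, "maxLength": 50, "pattern": "^[A-Za-z ]+$" },
    "age":     { "type": "number", "minimum": 0, "maximum": 120 },
    "hobbies": { "type": "array", "minItems": 2, "maxItems": 10 }
  }
}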

View the full demo in real time

You can either follow all the steps manually or watch the recording.

Review the docker compose environment

As can be seen from docker-compose.yaml, the demo environment consists of the following services:

  • gateway1
  • gateway2
  • kafka-client
  • kafka1
  • kafka2
  • kafka3
  • schema-registry
  • zookeeper
cat docker-compose.yaml

Starting the docker environment

Start all the Docker processes in the background and wait for them to be up and ready.

  • --wait: Wait for services to be running|healthy. Implies detached mode.
  • --detach: Detached mode: Run containers in the background
docker compose up --detach --wait
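
Optionally, confirm that every service reports running or healthy before continuing:

docker compose ps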

Creating virtual cluster teamA

Creating virtual cluster teamA on gateway gateway1, then reviewing the configuration file used to access it

# Generate virtual cluster teamA with service account sa
token=$(curl \
--request POST "http://localhost:8888/admin/vclusters/v1/vcluster/teamA/username/sa" \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent \
--data-raw '{"lifeTimeSeconds": 7776000}' | jq -r ".token")

# Create access file
echo """
bootstrap.servers=localhost:6969
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='sa' password='$token';
""" > teamA-sa.properties

# Review file
cat teamA-sa.properties

Creating topic topic-json-schema on teamA

Creating on teamA:

  • Topic topic-json-schema with partitions:1 and replication-factor:1
kafka-topics \
--bootstrap-server localhost:6969 \
--command-config teamA-sa.properties \
--replication-factor 1 \
--partitions 1 \
--create --if-not-exists \
--topic topic-json-schema
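
To confirm the topic was created, you can list the topics visible through the gateway:

kafka-topics \
--bootstrap-server localhost:6969 \
--command-config teamA-sa.properties \
--list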

Review the example JSON schema

Review the example JSON schema with its validation rules

cat user-schema-with-validation-rules.json
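
The exact file isn't reproduced here, but judging from the validation errors reported later in the audit log, it defines constraints along these lines (an illustrative reconstruction, not the verbatim file):

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "User",
  "type": "object",
  "properties": {
    "name": { "type": "string", "minLength": 3 },
    "email": { "type": "string", "format": "email" },
    "hobbies": { "type": "array", "minItems": 2 },
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string", "maxLength": 15 },
        "city": { "type": "string", "minLength": 2 }
      }
    }
  }
}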

Let's register it with the Schema Registry

curl -s \
http://localhost:8081/subjects/topic-json-schema/versions \
-X POST \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data "{\"schemaType\": \"JSON\", \"schema\": $(cat user-schema-with-validation-rules.json | jq tostring)}"

Review the invalid payload

Review the invalid payload

cat invalid-payload.json
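
The file isn't shown inline, but based on the audit log entry further down, it breaks the rules roughly like this (illustrative, not the verbatim file): a one-character name, a malformed email, a single hobby, an empty city, and an oversized street:

{
  "name": "D",
  "email": "bad email",
  "hobbies": ["gaming"],
  "address": {
    "street": "a street name that far exceeds the fifteen character limit",
    "city": ""
  }
}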

Let's send invalid data

Perfect, the JSON Schema serializer did its magic: it enforced our validation rules and refused to produce the invalid payload.

cat invalid-payload.json | jq -c | \
kafka-json-schema-console-producer \
--bootstrap-server localhost:6969 \
--producer.config teamA-sa.properties \
--topic topic-json-schema \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=1

Let's send invalid data using the protocol

Unfortunately, the message goes through: writing the Confluent wire-format bytes directly with kcat bypasses the client-side serializer, so nothing validates the payload.

MAGIC_BYTE="\x00"              # Confluent wire-format magic byte
SCHEMA_ID="\x00\x00\x00\x01"   # schema id 1, big-endian over 4 bytes
JSON_PAYLOAD=$(cat invalid-payload.json | jq -c)
# Prepend the 5-byte header and produce the raw record
printf "${MAGIC_BYTE}${SCHEMA_ID}${JSON_PAYLOAD}" | \
kcat \
-b localhost:6969 \
-X security.protocol=SASL_PLAINTEXT \
-X sasl.mechanism=PLAIN \
-X sasl.username=sa \
-X sasl.password=$(cat teamA-sa.properties | awk -F"'" '/password=/{print $4}') \
-P \
-t topic-json-schema

Let's consume it back

That's pretty bad: you are about to propagate bad data throughout your system!

kafka-json-schema-console-consumer \
--bootstrap-server localhost:6969 \
--consumer.config teamA-sa.properties \
--topic topic-json-schema \
--from-beginning \
--skip-message-on-error \
--timeout-ms 3000

Adding interceptor guard-schema-payload-validate

Add Schema Payload Validation Policy Interceptor

Creating the interceptor named guard-schema-payload-validate from the plugin io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin using the payload below; it targets every topic matching topic-.* and, with action set to BLOCK, rejects messages that fail validation:

{
  "pluginClass" : "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
  "priority" : 100,
  "config" : {
    "schemaRegistryConfig" : {
      "host" : "http://schema-registry:8081"
    },
    "topic" : "topic-.*",
    "schemaIdRequired" : true,
    "validateSchema" : true,
    "action" : "BLOCK"
  }
}

Save the payload above as step-13-guard-schema-payload-validate.json, then send it:

curl \
--request POST "http://localhost:8888/admin/interceptors/v1/vcluster/teamA/interceptor/guard-schema-payload-validate" \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent \
--data @step-13-guard-schema-payload-validate.json | jq

Listing interceptors for teamA

Listing interceptors on gateway1 for virtual cluster teamA

curl \
--request GET 'http://localhost:8888/admin/interceptors/v1/vcluster/teamA' \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent | jq

Let's send invalid data using the protocol again

Perfect, this time our interceptor did its magic: it validated the payload against the registered schema and blocked the invalid message.

MAGIC_BYTE="\x00"
SCHEMA_ID="\x00\x00\x00\x01"
JSON_PAYLOAD=$(cat invalid-payload.json | jq -c)
printf "${MAGIC_BYTE}${SCHEMA_ID}${JSON_PAYLOAD}" | \
kcat \
-b localhost:6969 \
-X security.protocol=SASL_PLAINTEXT \
-X sasl.mechanism=PLAIN \
-X sasl.username=sa \
-X sasl.password=$(cat teamA-sa.properties | awk -F"'" '/password=/{print $4}') \
-P \
-t topic-json-schema

Check in the audit log that the message was denied

Check in the audit log that the message was denied, in cluster kafka1

kafka-console-consumer \
--bootstrap-server localhost:19092,localhost:19093,localhost:19094 \
--topic _auditLogs \
--from-beginning \
--timeout-ms 3000 \
| jq 'select(.type=="SAFEGUARD" and .eventData.plugin=="io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin")'

returns

Processed a total of 14 messages
{
  "id": "e5594b2c-fdb2-4f32-9ad6-4587d46bd08b",
  "source": "krn://cluster=WR6pYN7oSpSXjdolNbdGtQ",
  "type": "SAFEGUARD",
  "authenticationPrincipal": "teamA",
  "userName": "sa",
  "connection": {
    "localAddress": null,
    "remoteAddress": "/192.168.65.1:21099"
  },
  "specVersion": "0.1.0",
  "time": "2024-02-14T03:24:21.998097138Z",
  "eventData": {
    "level": "error",
    "plugin": "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
    "message": "Request parameters do not satisfy the configured policy. Topic 'topic-json-schema' has invalid json schema payload: [#/hobbies: expected minimum item count: 2, found: 1, #/name: expected minLength: 3, actual: 1, #/email: [bad email] is not a valid email address, #/address/city: expected minLength: 2, actual: 0, #/address/street: expected maxLength: 15, actual: 56]"
  }
}

Let's now produce a valid payload
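
A conforming payload just needs to satisfy every constraint above; something like this would pass (an illustrative sketch, not necessarily the exact valid-payload.json):

{
  "name": "Dominique",
  "email": "dominique@example.com",
  "hobbies": ["reading", "hiking"],
  "address": {
    "street": "123 Main St",
    "city": "Paris"
  }
}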

cat valid-payload.json | jq -c | \
kafka-json-schema-console-producer \
--bootstrap-server localhost:6969 \
--producer.config teamA-sa.properties \
--topic topic-json-schema \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=1

And consume it back

kafka-json-schema-console-consumer \
--bootstrap-server localhost:6969 \
--consumer.config teamA-sa.properties \
--topic topic-json-schema \
--from-beginning \
--skip-message-on-error \
--timeout-ms 3000

Tearing down the docker environment

Remove all your docker processes and associated volumes

  • --volumes: Remove named volumes declared in the "volumes" section of the Compose file and anonymous volumes attached to containers.
docker compose down --volumes

Conclusion

You can enrich your existing schemas to add even more data quality to your systems!