
What is a Schema Payload Validation Policy Interceptor?

Avoid outages from missing or badly formatted records, and ensure all messages adhere to a schema.

This interceptor also supports validating payloads against specific constraints for AvroSchema and Protobuf.

This is similar to the validations provided by JsonSchema, such as:

  • Number: minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf
  • String: minLength, maxLength, pattern, format
  • Collections: maxItems, minItems

This interceptor also supports validating payloads against custom constraint expressions written in CEL (Common Expression Language), a simple language already familiar to many developers.

It also supports validating payloads against custom rules defined in a metadata.rules object in the schema, again using CEL.
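For illustration, here is a hypothetical schema fragment (not the user-schema-with-validation-rules.json file used below) combining standard JsonSchema constraints with CEL expressions in a metadata.rules object; the exact rule-object keys are an assumption for this sketch:

```json
{
  "type": "object",
  "properties": {
    "name": { "type": "string", "minLength": 3 },
    "age": { "type": "number", "minimum": 18 },
    "hobbies": { "type": "array", "minItems": 2 }
  },
  "metadata": {
    "rules": [
      {
        "name": "checkName",
        "expression": "size(name) >= 3",
        "message": "name must be at least 3 characters"
      },
      {
        "name": "checkHobbies",
        "expression": "size(hobbies) >= 2",
        "message": "hobbies must have at least 2 items"
      }
    ]
  }
}
```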

View the full demo in real time

You can either follow all the steps manually or watch the recording.

Review the docker compose environment

As can be seen from docker-compose.yaml, the demo environment consists of the following services:

  • gateway1
  • gateway2
  • kafka-client
  • kafka1
  • kafka2
  • kafka3
  • schema-registry
cat docker-compose.yaml

Starting the docker environment

Start all the Docker processes in the background and wait for them to be up and ready:

  • --wait: Wait for services to be running|healthy. Implies detached mode.
  • --detach: Detached mode: Run containers in the background
docker compose up --detach --wait

Creating topic topic-json-schema on gateway1

Creating on gateway1:

  • Topic topic-json-schema with partitions:1 and replication-factor:1
kafka-topics \
--bootstrap-server localhost:6969 \
--replication-factor 1 \
--partitions 1 \
--create --if-not-exists \
--topic topic-json-schema

Review the example JSON schema

Review the example JSON schema in user-schema-with-validation-rules.json:

cat user-schema-with-validation-rules.json

Let's register it to the Schema Registry

curl -s \
http://localhost:8081/subjects/topic-json-schema/versions \
-X POST \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data "{\"schemaType\": \"JSON\", \"schema\": $(cat user-schema-with-validation-rules.json | jq tostring)}"
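The jq tostring call in the request above is what turns the schema file into an escaped JSON string that can be embedded in the request body. Its effect can be checked on a trivial document:

```shell
# jq's tostring serializes a parsed JSON value back into its compact JSON
# text and returns it as a JSON string, with the inner quotes escaped
echo '{"type": "object"}' | jq tostring
# → "{\"type\":\"object\"}"
```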

Review the invalid payload

Review the invalid payload in invalid-payload.json:

cat invalid-payload.json

Let's send invalid data

Perfect, the JSON Schema serializer did its magic and validated our rules.

cat invalid-payload.json | jq -c | \
kafka-json-schema-console-producer \
--bootstrap-server localhost:6969 \
--topic topic-json-schema \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=1

Let's send invalid data using the protocol

Unfortunately, the message went through.

MAGIC_BYTE="\000"
SCHEMA_ID="\000\000\000\001"
JSON_PAYLOAD=$(cat invalid-payload.json | jq -c)
printf "${MAGIC_BYTE}${SCHEMA_ID}${JSON_PAYLOAD}" | \
kcat \
-E \
-b localhost:6969 \
-X security.protocol=PLAINTEXT \
-X sasl.mechanism=PLAIN \
-P \
-t topic-json-schema
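The printf framing above follows the Confluent wire format: one magic byte (0x00), the schema ID as a 4-byte big-endian integer (here, schema ID 1), and then the raw payload. The framed bytes can be inspected locally with od (using an empty {} payload for brevity):

```shell
# Magic byte 0x00, schema ID 1 as four big-endian bytes, then the payload "{}"
printf "\000\000\000\000\001{}" | od -An -tx1
# → 00 00 00 00 01 7b 7d
```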

Let's consume it back

That's pretty bad: you are going to propagate wrong data within your system!

kafka-json-schema-console-consumer \
--bootstrap-server localhost:6969 \
--topic topic-json-schema \
--from-beginning \
--skip-message-on-error \
--timeout-ms 3000

Adding interceptor guard-schema-payload-validate

Add Schema Payload Validation Policy Interceptor

step-12-guard-schema-payload-validate-interceptor.json:

{
  "kind" : "Interceptor",
  "apiVersion" : "gateway/v2",
  "metadata" : {
    "name" : "guard-schema-payload-validate"
  },
  "spec" : {
    "comment" : "Adding interceptor: guard-schema-payload-validate",
    "pluginClass" : "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
    "priority" : 100,
    "config" : {
      "schemaRegistryConfig" : {
        "host" : "http://schema-registry:8081"
      },
      "topic" : "topic-.*",
      "schemaIdRequired" : true,
      "validateSchema" : true,
      "action" : "BLOCK"
    }
  }
}
curl \
--silent \
--request PUT "http://localhost:8888/gateway/v2/interceptor" \
--header "Content-Type: application/json" \
--user "admin:conduktor" \
--data @step-12-guard-schema-payload-validate-interceptor.json | jq

Listing interceptors

Listing interceptors on gateway1

curl \
--silent \
--request GET "http://localhost:8888/gateway/v2/interceptor" \
--user "admin:conduktor" | jq

Let's send invalid data using the protocol again

Perfect, our interceptor did its magic and validated our rules.

MAGIC_BYTE="\000"
SCHEMA_ID="\000\000\000\001"
JSON_PAYLOAD=$(cat invalid-payload.json | jq -c)
printf "${MAGIC_BYTE}${SCHEMA_ID}${JSON_PAYLOAD}" | \
kcat \
-E \
-b localhost:6969 \
-X security.protocol=PLAINTEXT \
-P \
-t topic-json-schema

Check in the audit log that message was denied

Check in the audit log that message was denied in cluster kafka1

kafka-console-consumer \
--bootstrap-server localhost:9092,localhost:9093,localhost:9094 \
--topic _conduktor_gateway_auditlogs \
--from-beginning \
--timeout-ms 3000 | jq 'select(.type=="SAFEGUARD" and .eventData.plugin=="io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin")'

This returns 1 event:

{
  "id" : "3a2962ce-ef87-4a52-828d-20ee6a50f794",
  "source" : "krn://cluster=p0KPFA_mQb2ixdPbQXPblw",
  "type" : "SAFEGUARD",
  "authenticationPrincipal" : "passthrough",
  "userName" : "anonymous",
  "connection" : {
    "localAddress" : null,
    "remoteAddress" : "/192.168.176.1:35142"
  },
  "specVersion" : "0.1.0",
  "time" : "2024-11-17T21:05:08.689090586Z",
  "eventData" : {
    "interceptorName" : "guard-schema-payload-validate",
    "level" : "error",
    "plugin" : "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
    "message" : "Request parameters do not satisfy the configured policy. Request parameters do not satisfy the configured policy. Topic 'topic-json-schema' has invalid json schema payload: hobbies must have 2 items, age must be greater than or equal to 18, email should end with 'example.com', #/hobbies: expected minimum item count: 2, found: 1, #/name: expected minLength: 3, actual: 1, #/email: [bad email] is not a valid email address, #/address/city: expected minLength: 2, actual: 0, #/address/street: expected maxLength: 15, actual: 56, street does not match expression 'size(street) >= 5 && size(street) <= 15', address does not match expression 'size(address.street) > 1 && address.street.contains('paris') || address.city == 'paris'', hobbies does not match expression 'size(hobbies) >= 2', name does not match expression 'size(name) >= 3', email does not match expression 'email.contains('foo')'"
  }
}
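The jq select filter used on the audit log can be sanity-checked against a minimal sample event; only events matching both the SAFEGUARD type and the payload-validation plugin pass through:

```shell
# Events that match both conditions survive the filter; all others are dropped
echo '{"type":"SAFEGUARD","eventData":{"plugin":"io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin"}}' \
  | jq 'select(.type=="SAFEGUARD" and .eventData.plugin=="io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin") | .type'
# → "SAFEGUARD"
```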

Let's now produce a valid payload

cat valid-payload.json | jq -c | \
kafka-json-schema-console-producer \
--bootstrap-server localhost:6969 \
--topic topic-json-schema \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=1

And consume it back

kafka-json-schema-console-consumer \
--bootstrap-server localhost:6969 \
--topic topic-json-schema \
--from-beginning \
--skip-message-on-error \
--timeout-ms 3000

Tearing down the docker environment

Remove all your docker processes and associated volumes

  • --volumes: Remove named volumes declared in the "volumes" section of the Compose file and anonymous volumes attached to containers.
docker compose down --volumes

Conclusion

You can enrich your existing schemas to add even more data quality to your systems!