What is a Schema Payload Validation Policy Interceptor?

Avoid outages caused by missing or badly formatted records by ensuring that every message adheres to a schema.

This interceptor validates payloads against field-level constraints declared in Avro and Protobuf schemas.

These constraints are similar to the validations provided by JSON Schema, such as:

  • Number: minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf
  • String: minLength, maxLength, pattern, format
  • Collections: maxItems, minItems

The interceptor also supports custom constraint expressions written in CEL (Common Expression Language), a simple expression language familiar to most developers.

It can also validate payloads against rules declared in a custom metadata.rules object in the schema, which are likewise written in CEL.
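
To make these three mechanisms concrete, here is a minimal sketch of what a schema carrying such rules might look like. The file name example-user-rules.avsc, the record layout, the `expression` attribute, and the shape of the `metadata.rules` entries (including the `message` prefix in the CEL expression) are assumptions made for illustration. The demo's own user-schema-with-validation-rules.avsc remains the reference.

```shell
# Write a hypothetical Avro schema that combines the three kinds of rules.
# Attribute names other than the JSON-Schema-style keywords listed above
# are assumptions, not confirmed Gateway syntax.
cat > example-user-rules.avsc <<'EOF'
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string", "minLength": 3, "maxLength": 50},
    {"name": "age", "type": "int", "minimum": 18},
    {"name": "email", "type": "string", "format": "email",
     "expression": "email.contains('foo')"},
    {"name": "hobbies", "type": {"type": "array", "items": "string"}, "minItems": 2}
  ],
  "metadata": {
    "rules": [
      {"name": "check hobbies size",
       "expression": "size(message.hobbies) == 2",
       "message": "hobbies must have 2 items"}
    ]
  }
}
EOF
```

Field-level keywords constrain a single value, a field-level expression adds a CEL check on that field, and a metadata rule can express a condition over the whole record with its own error message.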

View the full demo in realtime

You can either follow all the steps manually, or watch the recording

Review the docker compose environment

As docker-compose.yaml shows, the demo environment consists of the following services:

  • gateway1
  • gateway2
  • kafka-client
  • kafka1
  • kafka2
  • kafka3
  • schema-registry

cat docker-compose.yaml

Starting the docker environment

Start all the Docker services in the background and wait until they are up and healthy.

  • --wait: Wait for services to be running|healthy. Implies detached mode.
  • --detach: Detached mode: Run containers in the background

docker compose up --detach --wait

Creating topic topic-avro on gateway1

Creating on gateway1:

  • Topic topic-avro with partitions:1 and replication-factor:1

kafka-topics \
--bootstrap-server localhost:6969 \
--replication-factor 1 \
--partitions 1 \
--create --if-not-exists \
--topic topic-avro

Review the example avro schema

cat user-schema.avsc

Let's register it to the Schema Registry

curl -s \
http://localhost:8081/subjects/topic-avro/versions \
-X POST \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data "{\"schemaType\": \"AVRO\", \"schema\": $(cat user-schema.avsc | jq tostring)}"

Review invalid payload

cat invalid-payload.json

Let's send invalid data

cat invalid-payload.json | jq -c | \
kafka-avro-console-producer \
--bootstrap-server localhost:6969 \
--topic topic-avro \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=1

Let's consume it back

That's pretty bad: the invalid record was accepted, so wrong data is going to propagate through your system!

kafka-avro-console-consumer \
--bootstrap-server localhost:6969 \
--topic topic-avro \
--from-beginning \
--timeout-ms 3000

Adding interceptor guard-schema-payload-validate

Add Schema Payload Validation Policy Interceptor

step-11-guard-schema-payload-validate-interceptor.json:

{
  "kind" : "Interceptor",
  "apiVersion" : "gateway/v2",
  "metadata" : {
    "name" : "guard-schema-payload-validate"
  },
  "spec" : {
    "comment" : "Adding interceptor: guard-schema-payload-validate",
    "pluginClass" : "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
    "priority" : 100,
    "config" : {
      "schemaRegistryConfig" : {
        "host" : "http://schema-registry:8081"
      },
      "topic" : "topic-.*",
      "schemaIdRequired" : true,
      "validateSchema" : true,
      "action" : "BLOCK"
    }
  }
}

curl \
--silent \
--request PUT "http://localhost:8888/gateway/v2/interceptor" \
--header "Content-Type: application/json" \
--user "admin:conduktor" \
--data @step-11-guard-schema-payload-validate-interceptor.json | jq
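
The topic value in the interceptor configuration, topic-.*, is a regular expression, so the policy covers more than topic-avro. The sketch below shows which names such a pattern would select, assuming the Gateway matches the pattern against the whole topic name. That matching rule, the helper name matches_policy, and the sample topic names other than topic-avro and _conduktor_gateway_auditlogs are assumptions for illustration.

```shell
# Pattern copied from the interceptor config.
topic_pattern='topic-.*'

# Hypothetical helper: succeeds when the whole topic name matches the pattern.
# grep -E enables extended regular expressions, -x requires a full-line match,
# -q suppresses output so only the exit status is used.
matches_policy() {
  printf '%s\n' "$1" | grep -Eqx "$topic_pattern"
}

for t in topic-avro topic-json my-topic-avro _conduktor_gateway_auditlogs; do
  if matches_policy "$t"; then
    echo "$t: validated by the interceptor"
  else
    echo "$t: not covered"
  fi
done
```

Under this assumption, a name must start with topic- to be validated, which leaves internal topics such as the audit log untouched.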

Listing interceptors

Listing interceptors on gateway1

curl \
--silent \
--request GET "http://localhost:8888/gateway/v2/interceptor" \
--user "admin:conduktor" | jq

Review the avro schema with validation rules

cat user-schema-with-validation-rules.avsc

Let's update the schema with our validation rules

curl -s \
http://localhost:8081/subjects/topic-avro/versions \
-X POST \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data "{\"schemaType\": \"AVRO\", \"schema\": $(cat user-schema-with-validation-rules.avsc | jq tostring)}"

Let's assert the number of registered schemas

curl -s http://localhost:8081/subjects/topic-avro/versions
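
The command above only prints the version list. A minimal sketch of turning it into an actual assertion follows, assuming the Schema Registry answers with a JSON array of version numbers such as [1,2]. The helper name assert_schema_versions is hypothetical.

```shell
# Hypothetical helper: compare the version list returned by the Schema Registry
# with the list we expect, for example "[1,2]" after two registrations.
assert_schema_versions() {
  subject=$1
  expected=$2
  # Strip whitespace so "[1, 2]" and "[1,2]" compare equal.
  actual=$(curl -s "http://localhost:8081/subjects/${subject}/versions" | tr -d '[:space:]')
  if [ "$actual" = "$expected" ]; then
    echo "OK: ${subject} has versions ${actual}"
  else
    echo "FAIL: expected ${expected}, got ${actual}" >&2
    return 1
  fi
}

# Usage against the demo environment:
#   assert_schema_versions topic-avro '[1,2]'
```

A non-zero return status makes the check usable in a script that should stop before producing with value.schema.id=2 if the second schema was not registered.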

Let's produce the same invalid payload again

This time the payload is rejected, and the error lists every violated rule:

org.apache.kafka.common.errors.PolicyViolationException: Request parameters do not satisfy the configured policy. 
Topic 'topic-avro' has invalid avro schema payload: hobbies must have 2 items, age must be greater than or equal to 18, email should end with 'example.com', name is too short (1 < 3), name does not match expression 'size(name) >= 3 && size(name) <= 50', email does not match format 'email', email does not match expression 'email.contains('foo')', street is too long (56 > 15), street does not match expression 'size(street) >= 5 && size(street) <= 15', city is too short (0 < 2), address does not match expression 'size(address.street) >= 5 && address.street.contains('paris') || address.city == 'paris'', hobbies has too few items (1 < 2), hobbies does not match expression 'size(hobbies) >= 2', name does not match expression 'size(name) <= 4', age is greater than 10, name does not match expression 'size(name) <= 4', age is greater than 10
cat invalid-payload.json | jq -c | \
kafka-avro-console-producer \
--bootstrap-server localhost:6969 \
--topic topic-avro \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=2
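
The rejection message packs every violation into one long line. As a reading aid, the sketch below splits such a line into one violation per line. The helper name list_violations is hypothetical, and splitting on a comma followed by a space is an assumption that holds for the message shown here but would break if a rule's expression or message contained that sequence.

```shell
# Hypothetical helper: read producer output on stdin and print one violation per line.
list_violations() {
  # Keep only the text after the fixed prefix, then split it on ", ".
  sed -n 's/.*has invalid avro schema payload: //p' |
    awk -F', ' '{ for (i = 1; i <= NF; i++) print $i }'
}

# Example with a shortened excerpt of the message above.
printf '%s\n' "Topic 'topic-avro' has invalid avro schema payload: name is too short (1 < 3), city is too short (0 < 2)" |
  list_violations

# Usage with the demo producer (its error output may go to stderr, hence 2>&1):
#   cat invalid-payload.json | jq -c | kafka-avro-console-producer ... 2>&1 | list_violations
```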

Check in the audit log that message was denied

Check in the audit log that message was denied in cluster kafka1

kafka-console-consumer \
--bootstrap-server localhost:9092,localhost:9093,localhost:9094 \
--topic _conduktor_gateway_auditlogs \
--from-beginning \
--timeout-ms 3000 | jq 'select(.type=="SAFEGUARD" and .eventData.plugin=="io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin")'

This returns 1 event:

{
"id" : "25cb4ba1-37a6-42a8-baad-68be330f1ce9",
"source" : "krn://cluster=p0KPFA_mQb2ixdPbQXPblw",
"type" : "SAFEGUARD",
"authenticationPrincipal" : "passthrough",
"userName" : "anonymous",
"connection" : {
"localAddress" : null,
"remoteAddress" : "/172.29.0.1:58384"
},
"specVersion" : "0.1.0",
"time" : "2024-11-17T20:22:11.698055963Z",
"eventData" : {
"interceptorName" : "guard-schema-payload-validate",
"level" : "error",
"plugin" : "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
"message" : "Request parameters do not satisfy the configured policy. Request parameters do not satisfy the configured policy. Topic 'topic-avro' has invalid avro schema payload: hobbies must have 2 items, age must be greater than or equal to 18, email should end with 'example.com', name is too short (1 < 3), name does not match expression 'size(name) >= 3 && size(name) <= 50', email does not match format 'email', email does not match expression 'email.contains('foo')', street is too long (56 > 15), street does not match expression 'size(street) >= 5 && size(street) <= 15', city is too short (0 < 2), address does not match expression 'size(address.street) >= 5 && address.street.contains('paris') || address.city == 'paris'', hobbies has too few items (1 < 2), hobbies does not match expression 'size(hobbies) >= 2', name does not match expression 'size(name) <= 4', age is greater than 10, name does not match expression 'size(name) <= 4', age is greater than 10"
}
}

Let's now produce a valid payload

cat valid-payload.json | jq -c | \
kafka-avro-console-producer \
--bootstrap-server localhost:6969 \
--topic topic-avro \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=2

And consume it back

kafka-avro-console-consumer \
--bootstrap-server localhost:6969 \
--topic topic-avro \
--from-beginning \
--timeout-ms 3000

Tearing down the docker environment

Remove all your docker processes and associated volumes

  • --volumes: Remove named volumes declared in the "volumes" section of the Compose file and anonymous volumes attached to containers.

docker compose down --volumes

Conclusion

By enriching your existing schemas with validation rules, you can enforce data quality at the gateway, before bad records ever reach your topics!