What is a Schema Payload Validation Policy Interceptor?

Avoid outages from missing or badly formatted records by ensuring all messages adhere to a schema.

This interceptor also supports validating payloads against specific constraints for AvroSchema and JsonSchema.

These constraints are similar to the validations provided by JsonSchema, such as:

  • Number: minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf
  • String: minLength, maxLength, pattern, format
  • Collections: maxItems, minItems

This interceptor also supports validating payloads against custom constraint expressions written in CEL (Common Expression Language), a simple language already familiar to many developers.

It can also validate payloads against custom rules declared in the schema's metadata.rules object, again expressed in CEL.
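
For example, the schema used later in this demo attaches a CEL rule to the Student name field; the expression below (taken from the validation errors shown further down) requires the name to be between 3 and 50 characters:

size(name) >= 3 && size(name) <= 50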

View the full demo in real time

You can either follow all the steps manually, or watch the recording

Review the docker compose environment

As can be seen from docker-compose.yaml, the demo environment consists of the following services:

  • gateway1
  • gateway2
  • kafka-client
  • kafka1
  • kafka2
  • kafka3
  • schema-registry
  • zookeeper
cat docker-compose.yaml

Starting the docker environment

Start all your Docker processes in the background and wait for them to be up and ready

  • --wait: Wait for services to be running|healthy. Implies detached mode.
  • --detach: Detached mode: Run containers in the background
docker compose up --detach --wait
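
If you want to double-check that everything is healthy before moving on, you can optionally list the running services:

docker compose ps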

Creating virtual cluster teamA

Creating virtual cluster teamA on gateway gateway1 and reviewing the configuration file to access it

# Generate virtual cluster teamA with service account sa
token=$(curl \
--request POST "http://localhost:8888/admin/vclusters/v1/vcluster/teamA/username/sa" \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent \
--data-raw '{"lifeTimeSeconds": 7776000}' | jq -r ".token")

# Create access file
echo """
bootstrap.servers=localhost:6969
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username='sa' password='$token';
""" > teamA-sa.properties

# Review file
cat teamA-sa.properties
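
As an optional sanity check, you can confirm the generated credentials work by listing topics through the gateway; at this point the virtual cluster has no topics yet, so the command should simply return without output:

kafka-topics \
--bootstrap-server localhost:6969 \
--command-config teamA-sa.properties \
--list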

Creating topic topic-protobuf on teamA

Creating on teamA:

  • Topic topic-protobuf with partitions:1 and replication-factor:1
kafka-topics \
--bootstrap-server localhost:6969 \
--command-config teamA-sa.properties \
--replication-factor 1 \
--partitions 1 \
--create --if-not-exists \
--topic topic-protobuf
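
Optionally, describe the topic to confirm it was created with the expected partition and replication settings:

kafka-topics \
--bootstrap-server localhost:6969 \
--command-config teamA-sa.properties \
--describe \
--topic topic-protobuf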

Review the example protocol buffer schema


cat user-schema.proto
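
The exact file ships with the demo, but based on the fields referenced in the validation errors later on (name, age, email, address, hobbies, friends), it is roughly a proto3 schema along these lines (illustrative sketch, not the actual file):

syntax = "proto3";

message Student {
  // Nested types inferred from the validation errors shown later in this demo
  message Address {
    string street = 1;
    string city = 2;
  }
  message Friend {
    string name = 1;
    int32 age = 2;
  }

  string name = 1;
  int32 age = 2;
  string email = 3;
  Address address = 4;
  repeated string hobbies = 5;
  repeated Friend friends = 6;
}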

Let's register it with the Schema Registry

curl -s \
http://localhost:8081/subjects/topic-protobuf/versions \
-X POST \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data "{\"schemaType\": \"PROTOBUF\", \"schema\": $(cat user-schema.proto | jq -Rs)}"

Review invalid payload


cat invalid-payload.json
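
The file itself is part of the demo, but judging from the validation errors reported later, it contains (among other issues) a one-character name, an age below 18, an email that fails the email format check, an over-long street, an empty city, a single hobby, and friend entries outside the allowed age range.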

Let's send invalid data

cat invalid-payload.json | jq -c | \
kafka-protobuf-console-producer \
--bootstrap-server localhost:6969 \
--producer.config teamA-sa.properties \
--topic topic-protobuf \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=1

Let's consume it back

That's pretty bad: you are going to propagate wrong data through your system!

kafka-protobuf-console-consumer \
--bootstrap-server localhost:6969 \
--consumer.config teamA-sa.properties \
--topic topic-protobuf \
--from-beginning \
--timeout-ms 3000

Adding interceptor guard-schema-payload-validate

Add Schema Payload Validation Policy Interceptor

cat step-12-guard-schema-payload-validate.json | jq
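
The exact contents live in the demo repository; broadly, an interceptor definition names the plugin class and carries its configuration, something like the sketch below (the config keys shown here are illustrative assumptions, not necessarily the real ones):

{
  "pluginClass": "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
  "priority": 100,
  "config": {
    "topic": "topic-protobuf",
    "schemaIdRequired": true,
    "validateSchema": true,
    "action": "BLOCK"
  }
}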

curl \
--request POST "http://localhost:8888/admin/interceptors/v1/vcluster/teamA/interceptor/guard-schema-payload-validate" \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent \
--data @step-12-guard-schema-payload-validate.json | jq

Listing interceptors for teamA

Listing interceptors on gateway1 for virtual cluster teamA

curl \
--request GET 'http://localhost:8888/admin/interceptors/v1/vcluster/teamA' \
--header 'Content-Type: application/json' \
--user 'admin:conduktor' \
--silent | jq

Review the protocol buffer schema with validation rules


cat user-schema-with-validation-rules.proto
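
Again, the full file is in the demo; from the validation errors produced later, the added rules include (non-exhaustive):

  • Student.name: length between 3 and 50, CEL size(name) >= 3 && size(name) <= 50
  • Student.age: must be at least 18
  • Student.email: must match the email format and the CEL rule email.contains('foo')
  • Student.Address.street: length between 5 and 10
  • Student.Address.city: at least 2 characters
  • Student.hobbies: at least 2 items, CEL size(hobbies) >= 2
  • Student.Friend.name: length between 3 and 10
  • Student.Friend.age: between 2 and 10, CEL age >= 2 && age <= 10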

Let's update the schema with our validation rules

curl -s \
http://localhost:8081/subjects/topic-protobuf/versions \
-X POST \
-H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data "{\"schemaType\": \"PROTOBUF\", \"schema\": $(cat user-schema-with-validation-rules.proto | jq -Rs)}"

Let's assert the number of registered schemas

curl -s http://localhost:8081/subjects/topic-protobuf/versions
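
Two versions should now be listed, e.g. [1,2]: the original schema and the one with validation rules.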

Let's produce the same invalid payload again

This time the payload is rejected with useful errors

org.apache.kafka.common.errors.PolicyViolationException: Request parameters do not satisfy the configured policy. 
Topic 'topic-protobuf' has invalid protobuf schema payload: name length must greater than 2, age must be greater than or equal to 18, Student.name is too short (1 < 3), Student.name does not match expression 'size(name) >= 3 && size(name) <= 50', Student.email does not match format 'email', Student.email does not match expression 'email.contains('foo')', Student.Address.street is too long (56 > 10), Student.Address.street does not match expression 'size(street) >= 5 && size(street) <= 10', Student.Address.city is too short (0 < 2), Student.address does not match expression 'size(address.street) >= 5 && address.street.contains('paris') || address.city == 'paris'', Student.hobbies has too few items (1 < 2), Student.hobbies does not match expression 'size(hobbies) >= 2', Student.Friend.age is greater than 10, Student.Friend.age does not match expression 'age >= 2 && age <= 10', Student.Friend.name is too long (11 > 10), Student.Friend.name does not match expression 'size(name) >= 3 && size(name) <= 10', Student.Friend.age is greater than 10, Student.Friend.age does not match expression 'age >= 2 && age <= 10'
cat invalid-payload.json | jq -c | \
kafka-protobuf-console-producer \
--bootstrap-server localhost:6969 \
--producer.config teamA-sa.properties \
--topic topic-protobuf \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=2

Check in the audit log that message was denied

Check in the audit log that message was denied in cluster kafka1

kafka-console-consumer \
--bootstrap-server localhost:19092,localhost:19093,localhost:19094 \
--topic _conduktor_gateway_auditlogs \
--from-beginning \
--timeout-ms 3000 | jq 'select(.type=="SAFEGUARD" and .eventData.plugin=="io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin")'

returns 1 event

{
  "id" : "ca499d6a-7fa1-49f1-8a01-a7bf3a939554",
  "source" : "krn://cluster=ECCAV3HlS5uwR_NIedQ4Kw",
  "type" : "SAFEGUARD",
  "authenticationPrincipal" : "teamA",
  "userName" : "sa",
  "connection" : {
    "localAddress" : null,
    "remoteAddress" : "/192.168.65.1:30697"
  },
  "specVersion" : "0.1.0",
  "time" : "2024-04-09T23:57:56.097380762Z",
  "eventData" : {
    "level" : "error",
    "plugin" : "io.conduktor.gateway.interceptor.safeguard.SchemaPayloadValidationPolicyPlugin",
    "message" : "Request parameters do not satisfy the configured policy. Topic 'topic-protobuf' has invalid protobuf schema payload: name length must greater than 2, age must be greater than or equal to 18, Student.name is too short (1 < 3), Student.name does not match expression 'size(name) >= 3 && size(name) <= 50', Student.email does not match format 'email', Student.email does not match expression 'email.contains('foo')', Student.Address.street is too long (56 > 10), Student.Address.street does not match expression 'size(street) >= 5 && size(street) <= 10', Student.Address.city is too short (0 < 2), Student.address does not match expression 'size(address.street) >= 5 && address.street.contains('paris') || address.city == 'paris'', Student.hobbies has too few items (1 < 2), Student.hobbies does not match expression 'size(hobbies) >= 2', Student.Friend.age is greater than 10, Student.Friend.age does not match expression 'age >= 2 && age <= 10', Student.Friend.name is too long (11 > 10), Student.Friend.name does not match expression 'size(name) >= 3 && size(name) <= 10', Student.Friend.age is greater than 10, Student.Friend.age does not match expression 'age >= 2 && age <= 10'"
  }
}

Let's now produce a valid payload

cat valid-payload.json | jq -c | \
kafka-protobuf-console-producer \
--bootstrap-server localhost:6969 \
--producer.config teamA-sa.properties \
--topic topic-protobuf \
--property schema.registry.url=http://localhost:8081 \
--property value.schema.id=2

And consume it back

kafka-protobuf-console-consumer \
--bootstrap-server localhost:6969 \
--consumer.config teamA-sa.properties \
--topic topic-protobuf \
--from-beginning \
--timeout-ms 3000

Tearing down the docker environment

Remove all your docker processes and associated volumes

  • --volumes: Remove named volumes declared in the "volumes" section of the Compose file and anonymous volumes attached to containers.
docker compose down --volumes

Conclusion

You can enrich your existing schema to add even more data quality to your systems!