This feature is available with Conduktor Trust only.

Overview

Bad data breaks customer experiences, drives churn and slows growth. Conduktor Trust helps teams catch and fix data quality issues before they impact your business. You define the Rules and we’ll enforce them at the streaming layer.
This page details Conduktor's capability to enforce data quality with Gateway. Find out about data quality capabilities without Gateway.

Prerequisites

Before creating data quality Rules and Policies, you have to:
  • use Conduktor Console 1.34 or later,
  • use Conduktor Gateway 3.9 or later,
  • be logged in as an admin to Console UI or use an admin token for the Conduktor CLI,
  • configure your Gateway cluster in Console and fill in the Provider tab with Gateway API credentials.

Rules

You can define three types of Rules to validate the quality of your Kafka message data: CEL expression Rules, JSON schema Rules and built-in Rules.
Rules do nothing on their own - you have to attach them to a Policy.

Create a Rule

You can create a data quality Rule (CEL or JSON schema) using the Console UI or the Conduktor CLI.
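With the CLI, a CEL Rule is declared as a YAML resource, analogous to the JSON schema example further down this page. A minimal sketch follows; the exact spec field names (such as celExpression and the Cel type value) are assumptions, so verify them against the CLI reference for your version:

```yaml
apiVersion: v1
kind: DataQualityRule
metadata:
    name: valid-customer-id
spec:
    # Assumed field name for the CEL expression; verify against the CLI reference
    celExpression: value.customerId.matches("[0-9]{8}")
    displayName: valid customer id
    description: customerId must be an 8-digit number
    type: Cel
```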

Test Rules before creation

When creating Rules in the Console UI, you can test both CEL expression and JSON schema Rules against sample data before saving. This helps ensure that your Rules work correctly with the expected data format, letting you iterate quickly and verify behavior before deploying to production topics.
  1. Click Validate your Rule against sample messages to open the validation panel.
  2. In the Rule validation panel enter:
    • Key: sample message key (only for CEL rules that reference key properties)
    • Value: sample message value (the main data payload that your Rule will validate)
    • Headers: sample message headers (only for CEL rules that check header values)
  3. Click Test Rule to validate it.
  4. A message will appear explaining whether the Rule has passed, failed or if there’s an evaluation error (an issue with your Rule syntax or data format).
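As an illustration, suppose you're testing a CEL Rule such as the email sample on this page (value.customer.email.matches(...)); you could paste a payload like this into the Value field (the field values here are just examples):

```json
{
  "customer": {
    "email": "jane.doe@example.com",
    "id": "0f8fad5b-d9cb-469f-a165-70867728950e"
  }
}
```

Clicking Test Rule then reports whether the Rule passed against this value.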
To create a Rule using the Console UI:
  1. Go to Rules and click +New Rule.
  2. Define the Rule details:
    • Add a descriptive name for the Rule.
    • The Technical ID will be auto-populated as you type in the name. This is used to identify this Rule in CLI/API.
    • (Optional) Enter a Description that explains the purpose of your Rule.
  3. Select the Rule type (CEL expression or JSON schema) and provide the required logic:
    • CEL is an expression language supporting common operators like == and >, as well as macros like has() to check for the presence of fields. Use matches() to test regex patterns. See the CEL language definition for details. You can access a set of pre-built examples by clicking Show regex library: click the relevant example to copy it, paste it into your Rule expression, then customize it further as necessary.
    • For JSON schema, enter your schema definition. See the JSON schema specification for details.
  4. (Optional) Define a custom message that replaces the default <RULE_NAME> did not pass when the Rule is violated, shown in the Policy violation history.
  5. Click Create.

CEL expression Rules

You can create Rules with CEL expressions that capture business logic of your data. For example: value.customerId.matches("[0-9]{8}"). The Rules page lists your Rules with a preview of their CEL expressions. Open a Rule’s detail page to see its description, full CEL expression and attached Policies.

Sample CEL Rules

Make sure you amend these examples to use your own values.

Validate an email address (email validation via RegEx is complex, so your requirements may differ from this RegEx):
value.customer.email.matches(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

Validate a UUID:
value.customer.id.matches(r"^[0-9a-fA-F]{8}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{4}\b-[0-9a-fA-F]{12}$")

Check a header value:
headers['Content-Type'] == 'application/json'

Built-in Rules

We provide built-in validation Rules for checks that can't be expressed in CEL.
We currently support only Confluent and Confluent-compatible (e.g. Redpanda) schema registries.

EnforceAvro

EnforceAvro ensures that:
  • Your messages have a schema ID prepended to the message content.
  • The schema ID exists within your schema registry.
  • The schema it references is of type Avro.
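As a sketch, a built-in Rule could be declared via the CLI in the same shape as the JSON schema example on this page. The type value EnforceAvro mirrors the Rule name, but this spec is an assumption; verify the exact fields against the CLI reference:

```yaml
apiVersion: v1
kind: DataQualityRule
metadata:
    name: enforce-avro
spec:
    displayName: enforce avro
    description: messages must reference a valid Avro schema in the registry
    type: EnforceAvro   # assumed type value; verify against the CLI reference
```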

JSON schema Rule

Enforce JSON Schema validation on Kafka messages to ensure data consistency and structure compliance.

Configure JSON schema

Add a JSON schema Rule using the CLI:
apiVersion: v1
kind: DataQualityRule
metadata:
    name: json
spec:
    schema:
        {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "required": ["name", "age"],
          "properties": {
            "name": { "type": "string" },
            "age": { "type": "integer", "minimum": 0 },
            "isActive": { "type": "boolean", "default": true }
          }
        }
    displayName: valid user
    description: check that the user is valid
    type: JsonSchema
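Assuming the Rule above is saved as rule.yaml, you can apply it with the Conduktor CLI (authenticated with an admin token, as noted in the prerequisites):

```shell
conduktor apply -f rule.yaml
```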

Schema requirements

Examples

Require specific fields with type constraints:

{
  "type": "object",
  "required": ["id", "name", "age"],
  "properties": {
    "id": { "type": "string" },
    "name": { "type": "string", "minLength": 1 },
    "age": { "type": "integer", "minimum": 0 }
  }
}

Declare the schema draft version explicitly:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["id", "name"],
  "properties": {
    "id": { "type": "string" },
    "name": { "type": "string" }
  }
}

Reject properties not declared in the schema:

{
  "type": "object",
  "required": ["id", "name"],
  "properties": {
    "id": { "type": "string" },
    "name": { "type": "string" }
  },
  "additionalProperties": false
}

Policies

A Policy is made up of Rules that are applied to topics or prefixes. Once created, a Policy can be assigned actions that take effect when certain criteria are met (e.g., a Rule in the Policy is violated). The Policy's detail page shows its description, linked Rules, targets, violation and evaluated message counts, and a history of recent violations. You can also enable (and disable) actions for the Policy from this page. The Policies page lists all of your Policies with their actions, targets and violation counts.

Actions

The available actions to enable for a Policy are:
  • Block: when a violation occurs, prevent data from being processed or transmitted.
  • Mark: when a violation occurs, add a special header to the message so it can be identified and handled downstream. Find out about how marking works.
By default, Policies created using the Console UI don't have any actions enabled. You have to complete the Policy creation first and then enable the required actions. If there are any additional actions you'd like to see, get in touch.
If both Block and Mark are enabled, only blocking will be applied: messages violating the Policy will be blocked entirely with no marking.

Mark action

When the Mark action is enabled, messages that violate the Rule(s) are tagged with a special header:
  • Header name:
  • Header value: JSON object mapping policy names to arrays of failed rule names. For example:
{
  "policy-name": ["rule-1", "rule-2"],
  "another-policy": ["rule-3"]
}
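To inspect marked messages downstream, you can print record headers with standard Kafka tooling, for example the console consumer (the topic name and bootstrap server below are placeholders):

```shell
kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic my-topic \
  --from-beginning \
  --property print.headers=true
```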

Violations metrics

Metrics displayed

Data freshness

Create a Policy

You can create a data quality Policy using the Console UI or the Conduktor CLI.
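With the CLI, a Policy could be declared along these lines. The spec field names below (targets, rules, actions) are assumptions based on the Policy concepts described above, so verify them against the CLI reference:

```yaml
apiVersion: v1
kind: DataQualityPolicy
metadata:
    name: customer-policy
spec:
    displayName: customer policy
    description: validate customer events
    # Assumed field names; verify against the CLI reference
    targets:
        - cluster: gateway
          topic: customer-events
    rules:
        - valid-customer-id
    actions:
        block: false
        mark: true
```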

Manage a Policy

Once a Policy is created, you can view the attached Rule(s), the target(s) of the Policy and change the related actions. You can also view the violations as they occur and the violation count will be shown. To edit the list of Rules attached to a Policy, click Edit selection on the Policy details page. In the dialog that opens, select/deselect the required Rules from the list and save changes.
Since the Block action can stop data from being sent to the targeted topic, you have to confirm enabling it by entering 'BLOCK' when prompted. Conversely, to disable blocking, enter 'UNBLOCK' when prompted.

Assign permissions

Make sure to assign this permission on the Gateway (not the underlying Kafka) cluster. Modifying group permissions won't affect any Policies already associated with the group.
The 'manage data quality' permission

Set up Policy violation alerts

Using multiple Policies

When multiple Policies target the same topic, there are two scenarios that can occur when a record is produced:
  • None of the Policies block the record, so all of them are evaluated:
    • The evaluation count is increased for all of the Policies.
    • The violation count is increased for each violated Policy.
    • An entry will appear in the violation history for each violated Policy.
  • One or more of the Policies block the record. In this scenario, the first blocking Policy stops the record and hides it from the other Policies:
    • For the first blocking Policy, both the violation and evaluation counts are increased and a report is generated.
    • For the other Policies: no counts are increased and no reports are generated.

Troubleshoot

This is the status of a data quality Policy:
  • Pending: the configuration isn’t deployed or refreshed yet
  • Ready: the configuration is up-to-date on Gateway
  • Failed: something unexpected happened during the deployment. Check that the connected Gateway is active.
If a field or header name contains special characters (such as hyphens), use bracket notation instead of dot notation. For example: headers['Content-Type']
Whether your data is sent as JSON or Avro, Conduktor Gateway internally converts the payload to JSON before applying CEL Rules. In JSON, all numeric values are treated as a generic number - there's no distinction between int and double. As a result, expressions like type(value.age) == int may fail unexpectedly, even if:
  • the original value is a valid integer (e.g., 12)
  • you're using an Avro schema where age is explicitly declared as an integer
This is because the Avro type information is lost during the conversion to JSON.

Recommended workaround: use logic-based expressions such as value.age > 0 && value.age < 130. This implicitly checks that the field is numeric and falls within a valid range, avoiding type inference.

Note: CEL currently can't evaluate against Avro schemas directly - it only sees the JSON-converted payload. We recommend enabling Gateway debug logs to inspect how data is interpreted during Rule evaluation and to understand why it may have failed.