- What Kafka Connect is and when to use it
- How to configure a standalone connector
- How to set up the required properties files
- How to verify data flowing into Kafka
What is Kafka Connect?
Kafka Connect is a framework for streaming data between Kafka and external systems using reusable connectors. Instead of writing custom producer/consumer code for common integrations, you can use pre-built connectors.

| Connector type | Direction | Examples |
|---|---|---|
| Source | External → Kafka | Debezium, JDBC, S3, MongoDB, Twitter |
| Sink | Kafka → External | Elasticsearch, S3, JDBC, HDFS, Splunk |
How to use Kafka Connect in standalone mode?
To use Kafka Connect in standalone mode, follow these steps:
- Download a Kafka Connect connector, either from GitHub or Confluent Hub
- Create a configuration file for your connector
- Use the connect-standalone.sh CLI to start the connector
Example: Kafka Connect standalone with Wikipedia data
Create the Kafka topic wikipedia.recentchange with 3 partitions
Create a second topic, wikipedia.dlq, for catching any errors
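Both topics can be created with the `kafka-topics.sh` CLI. The sketch below assumes a broker listening on `localhost:9092`, a single-broker development setup (replication factor 1), and that Kafka's scripts are on the `PATH`. None of these values come from the steps above, so adjust them to your cluster. `create_topic` is a hypothetical helper name.

```shell
# Assumptions: broker address and script location; override either if yours differ,
# e.g. KAFKA_TOPICS=bin/kafka-topics.sh
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"

# create_topic NAME PARTITIONS
# Creates a topic with replication factor 1 (suitable for a single local broker only).
create_topic() {
  "$KAFKA_TOPICS" --bootstrap-server "$BOOTSTRAP_SERVER" \
    --create --topic "$1" --partitions "$2" --replication-factor 1
}
```

With the broker running, `create_topic wikipedia.recentchange 3` creates the main topic. For the error topic, `create_topic wikipedia.dlq 1` would work, where the single partition is an assumption rather than a value from this tutorial.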
Download the connector and place it in a folder of your Kafka installation, for example kafka_2.13-2.8.1/connectors/kafka-connect-sse.
Create a file connectors/kafka-connect-sse/connector.properties with the following properties:
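As a rough sketch, the file could be written with a heredoc like the one below. `name`, `connector.class`, and `tasks.max` are standard Kafka Connect settings. The connector name, the class name, the `topic` and `sse.uri` keys, and the Wikimedia stream URL are all assumptions about the kafka-connect-sse plugin, so check them against the README of the connector you downloaded. How errors are routed to wikipedia.dlq depends on the connector and is not shown here.

```shell
# Assumption: this runs from the Kafka installation directory.
CONNECTOR_DIR="${CONNECTOR_DIR:-connectors/kafka-connect-sse}"
mkdir -p "$CONNECTOR_DIR"

# Quoted 'EOF' keeps the shell from expanding anything inside the file body
cat > "$CONNECTOR_DIR/connector.properties" <<'EOF'
# Standard Kafka Connect settings (the name itself is hypothetical)
name=wikipedia-sse-source
tasks.max=1
# Assumed class name and keys for the kafka-connect-sse plugin; verify before use
connector.class=com.github.cjmatta.kafka.connect.sse.ServerSentEventsSourceConnector
topic=wikipedia.recentchange
sse.uri=https://stream.wikimedia.org/v2/stream/recentchange
EOF
```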
Go to the Kafka directory (where the bin and config folders are)
Edit the content of the config/connect-standalone.properties file
Set the plugin.path config: it indicates the folder where you stored the Kafka connectors you downloaded earlier.
This must be an absolute path (not relative, and with no ~ shortcut) to your connectors directory.
If this path is wrong, Kafka Connect will stop shortly after starting.
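One way to avoid a relative or `~` path is to let the shell expand it for you. The sketch below assumes you are in the Kafka installation directory and that your connectors live in a `connectors` subfolder. It only prints the line and leaves the edit to `config/connect-standalone.properties` to you.

```shell
# Assumption: the current directory is the Kafka installation directory
# and connectors are stored in ./connectors.
connectors_dir="$(pwd)/connectors"

# $(pwd) expands to an absolute path, so the value contains no ~ and is not relative
plugin_path_line="plugin.path=${connectors_dir}"

# Print the line to paste into config/connect-standalone.properties
echo "$plugin_path_line"
```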
Next, we can start our Kafka Connect standalone connector
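The `connect-standalone.sh` script takes the worker properties file first, followed by one or more connector properties files. A minimal sketch, assuming the script is on the `PATH` and you are in the Kafka directory so the relative paths resolve (`start_wikipedia_connector` is a hypothetical helper name):

```shell
# Assumption: script location; override if needed,
# e.g. CONNECT_STANDALONE=bin/connect-standalone.sh
CONNECT_STANDALONE="${CONNECT_STANDALONE:-connect-standalone.sh}"

# Starts a single standalone worker in the foreground.
# Argument order matters: worker config first, then the connector config(s).
start_wikipedia_connector() {
  "$CONNECT_STANDALONE" config/connect-standalone.properties \
    connectors/kafka-connect-sse/connector.properties
}
```

Calling `start_wikipedia_connector` keeps the worker attached to the terminal; stop it with Ctrl+C.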
Verify that data is flowing into the wikipedia.recentchange topic:
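A quick check is to read the topic with the console consumer in a second terminal. The sketch below assumes a broker at `localhost:9092` and Kafka's scripts on the `PATH`; `consume_recentchange` is a hypothetical helper name.

```shell
# Assumptions: broker address and script location; override either if yours differ
BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"
KAFKA_CONSOLE_CONSUMER="${KAFKA_CONSOLE_CONSUMER:-kafka-console-consumer.sh}"

# Reads the topic from the earliest available offset and prints each record value
consume_recentchange() {
  "$KAFKA_CONSOLE_CONSUMER" --bootstrap-server "$BOOTSTRAP_SERVER" \
    --topic wikipedia.recentchange --from-beginning
}
```

If the connector is working, calling `consume_recentchange` should print a steady stream of records. If nothing appears, the Connect worker's log output is the first place to look.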
Standalone vs distributed mode
| Mode | Use case | Scalability |
|---|---|---|
| Standalone | Development, testing, single tasks | Single worker |
| Distributed | Production, high availability | Multiple workers |
See it in practice with Conduktor
Conduktor Console provides a visual interface for managing Kafka Connect clusters, deploying connectors, and monitoring connector health and throughput.
Next steps
- Explore Confluent Hub for available connectors
- Learn about distributed mode for production deployments
- Monitor Kafka clusters including Connect workers