
Overview

This guide provides production-ready recommendations for sizing and scaling your PostgreSQL database for Conduktor Console across AWS, GCP or Azure to ensure a performant and consistent user experience.
The minimum requirements (1-2 vCPU, 1 GB RAM, 10 GB disk) are only suitable for proof-of-concept or test environments. For production deployments, follow this guide to avoid performance issues such as IOPS throttling, CPU spikes and memory exhaustion.

Considerations

Key considerations for sizing your database:
  1. Match database size to production needs: select database specifications that match your production workload - avoid using proof-of-concept configurations in production environments.
  2. Monitor from day one: set up alerts for CPU, memory, IOPS and query latency before going live.
  3. Ensure sufficient IOPS throughput: provision a minimum of 3,000 IOPS (Input/Output Operations Per Second) - background sync processes are write-heavy.
  4. Balance cost and performance: use these recommendations as a baseline, then optimize based on actual usage.

Usage level

To get started, choose the level that matches your expected initial usage:
| Level | Concurrent users | Kafka scale | Estimated maximum DB size |
| --- | --- | --- | --- |
| Standard | Up to 500 | 1-5 clusters, up to 1,000 topics / 10,000 partitions, ~500 consumer groups | Up to 50 GB |
| Mid scale | 500-1,000 | 5-10 clusters, up to 5,000 topics / 50,000 partitions, ~1,000 consumer groups | Up to 100 GB |
| Fully scaled | 1,000-5,000 | 10+ clusters, 5,000+ topics / 50,000+ partitions, 1,000+ consumer groups | Up to 250 GB |
The estimated maximum DB size assumes an installation making full use of all Conduktor features over a year, and accounts for accumulated data such as audit logs.

Target performance

Our recommendations aim to provide the optimal Conduktor experience while avoiding over-provisioning of the database. They are based on:
  • keeping P95 query latency low enough to provide a responsive user experience
  • handling concurrent queries based on your expected user base size
  • supporting background Conduktor metadata updates for your expected Kafka platform size
  • avoiding IOPS throttling or similar cloud provider limitations during normal operations
Conduktor Console continuously syncs Kafka metadata to the database via a background task, which requires sufficient IOPS (Input/Output Operations Per Second) as specified in the recommendations below.
AWS

We recommend that you use AWS RDS PostgreSQL, version 14.8+ or 15.3+.

Standard level
Instance type:
  • Minimum: db.t4g.large (2 vCPU, 8 GB RAM)
  • Recommended: db.m6g.large (2 vCPU, 8 GB RAM) for consistent performance without CPU credits
Storage:
  • Type: general purpose SSD (gp3)
  • Size: 50 GB minimum (allows for growth)
  • IOPS: 3,000 IOPS baseline (included free with gp3)
  • Throughput: 125 MB/s (included free)
Configuration:
  • PostgreSQL version: 14.8+ or 15.3+. See version requirements.
  • Multi-AZ: recommended for production
  • Automated backups: enable with 7-day retention minimum
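
As an illustration, a Standard-level instance matching the specification above could be provisioned with boto3. This is only a sketch: the instance identifier, credentials and networking defaults are placeholders, and it assumes your AWS credentials and region are already configured.

```python
import boto3

# Sketch: provision a Standard-level RDS PostgreSQL instance matching the
# recommendations above. Identifier and credentials are placeholders.
rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="conduktor-console-db",  # placeholder name
    Engine="postgres",
    EngineVersion="15.3",
    DBInstanceClass="db.m6g.large",               # 2 vCPU, 8 GB RAM
    AllocatedStorage=50,                          # GB; gp3 includes 3,000 IOPS / 125 MB/s baseline
    StorageType="gp3",
    MultiAZ=True,                                 # recommended for production
    BackupRetentionPeriod=7,                      # days
    MasterUsername="conduktor",                   # placeholder
    MasterUserPassword="change-me-please",        # placeholder; use a secrets manager in practice
)
```

In practice you would typically manage this through your infrastructure-as-code tooling rather than an ad-hoc script.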

Monitoring and observability

Regardless of your level of usage, we recommend that you implement these monitoring practices for the database:

Critical metrics

| Metric | Warning threshold | Critical threshold | Action |
| --- | --- | --- | --- |
| CPU utilization | >70% sustained | >85% sustained | Scale up compute |
| Memory (freeable) | <25% free | <15% free | Scale up memory |
| IOPS (read/write) | >80% of limit | >95% of limit | Increase provisioned IOPS or storage size |
| Disk space | <20% free | <10% free | Increase storage size |
| Replication lag (if HA) | >30 seconds | >60 seconds | Check network, investigate load |
Conduktor support is available if you have any concerns about the metrics you are monitoring.

Cloud-specific tools

AWS RDS
  • Enable Performance Insights (provides query-level analysis)
  • Enable Enhanced Monitoring (OS-level metrics)
  • Create CloudWatch alarms for critical metrics
  • Monitor: ReadIOPS, WriteIOPS, CPUUtilization, FreeableMemory, DatabaseConnections
  • CloudWatch Metrics for RDS
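
For example, an alarm on the critical CPU threshold from the table above could be created with boto3. This is a sketch; the instance identifier and SNS topic ARN are placeholders.

```python
import boto3

# Sketch: alert when RDS CPU stays above the 85% critical threshold for 15 minutes.
cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="conduktor-db-cpu-critical",
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "conduktor-console-db"}],
    Statistic="Average",
    Period=300,                      # 5-minute datapoints
    EvaluationPeriods=3,             # 3 consecutive datapoints = 15 minutes sustained
    Threshold=85.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:db-alerts"],  # placeholder topic
)
```

Alarms for FreeableMemory, ReadIOPS, WriteIOPS and free storage space follow the same pattern.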

Scaling and performance

When to scale up

  1. CPU consistently >70% for more than 1 hour during business hours
  2. Memory (freeable) <25% sustained, indicating index and working set don’t fit in RAM
  3. IOPS at >80% of limit for more than 30 minutes, causing query slowdowns
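
For the CPU criterion, a quick CloudWatch query can confirm whether utilization has stayed above 70% for the past hour. A minimal sketch; the instance identifier is a placeholder.

```python
import boto3
from datetime import datetime, timedelta, timezone

# Sketch: has CPU been above 70% for every 5-minute datapoint in the last hour?
cloudwatch = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "conduktor-console-db"}],
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    Period=300,
    Statistics=["Average"],
)

datapoints = stats["Datapoints"]
if datapoints and all(dp["Average"] > 70 for dp in datapoints):
    print("CPU has been above 70% for the past hour - consider scaling up.")
```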

Connection pooling

Conduktor Console includes built-in connection pooling. The default is 15 connections per instance but you can change this using the CDK_DATABASE_CONNECTION_POOL_SIZE parameter. Cloud providers have default connection limits based on the provisioned database instance size. Verify that your instance type supports your required connection count. As a general rule, you should also allow for a few more (~10) connections on top of this.
max_connections = (Console instances × connections_per_instance) + 10
Example with 3 Console instances:
max_connections = (3 × 15) + 10 = 55
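
If you want to script this check, here is the same calculation as a small Python helper. The default pool size of 15 mirrors CDK_DATABASE_CONNECTION_POOL_SIZE and the +10 headroom covers ad-hoc admin and monitoring sessions.

```python
# Sketch: compute the PostgreSQL max_connections to plan for.
def required_max_connections(console_instances: int, pool_size: int = 15, headroom: int = 10) -> int:
    return console_instances * pool_size + headroom

print(required_max_connections(3))  # 55, matching the example above
```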

Database maintenance

Backup and recovery

Our suggested backup requirements are:
  • Automated backups: enabled with 7-day retention minimum (14-30 days for production)
  • Backup window: during low-usage periods (e.g., 2-4 AM local time)
  • Point-in-time recovery: enabled (available on all cloud providers)
  • Cross-region backups: for disaster recovery (if required by compliance)
  • Manual snapshots: take a manual backup before upgrades of Conduktor Console
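
On AWS RDS, for instance, the pre-upgrade manual snapshot can be scripted with boto3. This is a sketch; the instance identifier is a placeholder.

```python
import boto3
from datetime import datetime, timezone

# Sketch: take a manual RDS snapshot before upgrading Conduktor Console.
rds = boto3.client("rds")
snapshot_id = "conduktor-pre-upgrade-" + datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")

rds.create_db_snapshot(
    DBSnapshotIdentifier=snapshot_id,
    DBInstanceIdentifier="conduktor-console-db",  # placeholder
)

# Block until the snapshot is available before starting the upgrade.
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=snapshot_id)
print("Snapshot ready:", snapshot_id)
```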

Upgrade paths

Scaling compute (vertical scaling): all providers support instance size changes with brief downtime (typically 5-15 minutes). Check your cloud provider documentation for full information.

Scaling storage:
  • AWS: storage can be scaled up without downtime (gp3 volumes support online resizing)
  • GCP: storage automatically scales up; can be manually increased without downtime
  • Azure: storage can be scaled up without downtime
IOPS scaling:
  • AWS: modify gp3 IOPS or switch to Provisioned IOPS (io1/io2) during a maintenance window
  • GCP: IOPS scale automatically with storage size
  • Azure: Premium SSD v2 allows online IOPS adjustment; Premium SSD requires storage tier change
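
On AWS, for example, growing a gp3 volume is a single modify_db_instance call and happens online. This is a sketch; the instance identifier and target size are placeholders.

```python
import boto3

# Sketch: grow the gp3 volume from 50 GB to 100 GB without downtime.
rds = boto3.client("rds")

rds.modify_db_instance(
    DBInstanceIdentifier="conduktor-console-db",  # placeholder
    AllocatedStorage=100,       # new size in GB
    ApplyImmediately=True,      # otherwise applied in the next maintenance window
)
# To raise gp3 IOPS above the 3,000 baseline, pass Iops=... as well; check the
# RDS documentation for the volume-size limits that apply to provisioned gp3 IOPS.
```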

Cost optimization

  1. Use reserved instances/committed use discounts: save 30-60% for predictable workloads
  2. Right-size early: starting oversized and scaling down is difficult; start with recommendations and scale up as needed
  3. Use gp3 storage on AWS: 20% cheaper than gp2 with better baseline performance, especially for IOPS
  4. Enable multi-AZ only for production: dev/test environments can use single-AZ to save 50% on instance costs
  5. Monitor idle connections: ensure connection pooling is working correctly to avoid over-provisioning
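
A quick way to check that pooling is behaving is to count sessions per state in pg_stat_activity. A minimal sketch, assuming psycopg2 and a placeholder connection string:

```python
import psycopg2

# Sketch: count Console database connections per state to spot idle build-up.
conn = psycopg2.connect("postgresql://conduktor:change-me@db-host:5432/conduktor")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT state, count(*)
        FROM pg_stat_activity
        WHERE datname = current_database()
        GROUP BY state
    """)
    for state, count in cur.fetchall():
        print(f"{state or 'unknown'}: {count}")
conn.close()
```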

Troubleshoot

Can I use Amazon Aurora PostgreSQL?

Yes, Aurora PostgreSQL is compatible with Conduktor Console and is a good option for fully scaled deployments. Aurora provides better scalability, automatic failover and read replicas. Version requirements still apply (14.8+ / 15.3+).

What happens if I hit IOPS limits?

IOPS throttling causes slow queries, timeouts and potential user-facing errors. The background metadata sync process is especially sensitive to IOPS limits. Monitor ReadIOPS and WriteIOPS metrics and scale up before hitting limits.

Can I use burstable instances?

T-series (AWS) or Burstable tier (Azure) instances can work for standard level installs with low, consistent load. However, once CPU credits are exhausted, performance degrades significantly. For production, we recommend general purpose instances (M-series on AWS, General Purpose on Azure/GCP) for predictable performance.

What drives database growth?

Database growth typically depends on:
  • the number of Kafka topics, partitions, subjects (schemas), jobs (Kafka Connect) and consumer groups, as well as
  • the number of users, the level of RBAC and the activity level of these users
Monitor your database size monthly to project growth.
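
A minimal sketch of that monthly size check, assuming psycopg2 and a placeholder connection string:

```python
import psycopg2

# Sketch: report the current size of the Console database to track growth over time.
conn = psycopg2.connect("postgresql://conduktor:change-me@db-host:5432/conduktor")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute("SELECT pg_size_pretty(pg_database_size(current_database()))")
    print("Current database size:", cur.fetchone()[0])
conn.close()
```
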
Do I need read replicas?

Read replicas can help with read-heavy workloads but add complexity. They are typically not needed for Conduktor.

What if my usage doesn't fit neatly into one level?

These levels are guidelines. If you have 150 users but 100,000 topics, use mid or fully scaled sizing. The Kafka scale (topics, consumer groups) drives database size more than user count. When in doubt, start with the next level up and scale down if over-provisioned.