# PostgreSQL sizing for Conduktor Console production

> Size PostgreSQL for Conduktor Console in production on AWS, GCP, or Azure. Covers CPU, RAM, disk IOPS.

## Overview

This guide provides production-ready recommendations for sizing and scaling your PostgreSQL database for Conduktor Console across AWS, GCP or Azure to ensure a performant and consistent user experience.

<Warning>
  The minimum requirements (1-2 vCPU, 1 GB RAM, 10 GB disk) are only suitable for **proof-of-concept** or **test** environments.

  For production deployments, follow this guide to avoid performance issues like IOPS throttling, CPU spikes and memory exhaustion.
</Warning>

## Considerations

Key considerations for sizing your database:

<Steps>
  <Step title="Match database size to production needs">
    Select database specifications that match your production workload. Avoid using proof-of-concept configurations in production environments.
  </Step>

  <Step title="Monitor from day one">
    Set up alerts for CPU, memory, IOPS and query latency before going live.
  </Step>

  <Step title="Ensure sufficient IOPS throughput">
    Provision a minimum of 3,000 IOPS (Input/Output Operations Per Second), because the background sync processes are write-heavy.
  </Step>

  <Step title="Balance cost and performance">
    Use these recommendations as a **baseline**, then optimize based on actual usage.
  </Step>
</Steps>

## Usage level

To get started, choose the level that matches your expected initial usage:

| Level            | Concurrent users | Kafka scale                                                                    | Estimated Maximum DB size |
| ---------------- | ---------------- | ------------------------------------------------------------------------------ | ------------------------- |
| **Standard**     | Up to 500        | 1-5 clusters, up to 1,000 topics / 10,000 partitions, \~500 consumer groups    | up to 50 GB               |
| **Mid scale**    | 500-1,000        | 5-10 clusters, up to 5,000 topics / 50,000 partitions, \~1,000 consumer groups | up to 100 GB              |
| **Fully scaled** | 1,000-5,000      | 10+ clusters, 5,000+ topics / 50,000+ partitions, 1,000+ consumer groups       | up to 250 GB              |

<Info>
  The estimated maximum DB size assumes an installation that makes full use of all Conduktor features over a year, and allows for accumulated data such as audit logs.
</Info>
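
As a rough illustration of how to read the table, the sketch below picks the smallest level whose limits cover every dimension of an installation. The function name `pick_usage_level` and the dictionary keys are hypothetical, and treating the approximate consumer group counts as hard upper bounds is an assumption made for the example.

```python
# Upper bounds per level, copied from the usage level table.
# The "~" consumer group values are treated as hard limits (an assumption).
LEVEL_LIMITS = [
    ("Standard", {"users": 500, "clusters": 5, "topics": 1_000,
                  "partitions": 10_000, "consumer_groups": 500}),
    ("Mid scale", {"users": 1_000, "clusters": 10, "topics": 5_000,
                   "partitions": 50_000, "consumer_groups": 1_000}),
]


def pick_usage_level(**usage):
    """Return the smallest level whose limits cover every supplied dimension."""
    for name, limits in LEVEL_LIMITS:
        if all(usage.get(key, 0) <= limit for key, limit in limits.items()):
            return name
    # Anything beyond the mid scale limits falls into the largest level.
    return "Fully scaled"


# Few users but a very large Kafka estate still lands in the largest level.
print(pick_usage_level(users=150, clusters=3, topics=100_000,
                       partitions=8_000, consumer_groups=300))  # Fully scaled
```

Because a single dimension over the limit is enough to move up a level, a small user base does not keep a large Kafka estate in the standard level.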

## Target performance

Our recommendations are based on providing the optimal Conduktor experience while avoiding over-provisioning of the database. We've based the recommendations on:

* keeping P95 query latency low enough to provide a responsive user experience
* handling concurrent queries based on your expected user base size
* supporting background Conduktor metadata updates for your expected Kafka platform size
* avoiding IOPS throttling or similar cloud provider limitations during normal operations

<Info>
  Conduktor Console continuously syncs Kafka metadata to the database via a background task, which requires sufficient IOPS (Input/Output Operations Per Second) as specified in the recommendations below.
</Info>

## Recommended specifications

<Tabs>
  <Tab title="AWS">
    We recommend that you use AWS RDS PostgreSQL. The Postgres version has to be 14.8+ or 15.3+.

    <Tabs>
      <Tab title="Standard level">
        **Instance type:**

        * **Minimum**: `db.t4g.large` (2 vCPU, 8 GB RAM)
        * **Recommended**: `db.m6g.large` (2 vCPU, 8 GB RAM) for consistent performance without CPU credits

        **Storage:**

        * **Type**: general purpose SSD (gp3)
        * **Size**: 50 GB minimum (allows for growth)
        * **IOPS**: 3,000 IOPS baseline (included free with gp3)
        * **Throughput**: 125 MB/s (included free)

        **Configuration:**

        * PostgreSQL version: 14.8+ or 15.3+. [See version requirements](/guide/conduktor-in-production/deploy-artifacts/deploy-console#configure-postgres-database).
        * Multi-AZ: recommended for production
        * Automated backups: enable with 7-day retention minimum
      </Tab>

      <Tab title="Mid scale">
        **Instance type:**

        * **Recommended**: `db.m6g.xlarge` (4 vCPU, 16 GB RAM)

        **Storage:**

        * **Type**: general purpose SSD (gp3)
        * **Size**: 150 GB minimum
        * **IOPS**: 5,000-8,000 IOPS (provision additional IOPS beyond baseline)
        * **Throughput**: 250-500 MB/s

        **Configuration:**

        * Multi-AZ: strongly recommended
        * Automated backups: 14-day retention
        * Performance insights: enable for monitoring
      </Tab>

      <Tab title="Fully scaled">
        **Instance type:**

        * **Recommended**: `db.m6g.2xlarge` (8 vCPU, 32 GB RAM)

        **Storage:**

        * **Type**: general purpose SSD (gp3) or provisioned IOPS SSD (io1/io2) for >20,000 IOPS
        * **Size**: 500 GB minimum
        * **IOPS**: 15,000-30,000 IOPS
        * **Throughput**: 500-1,000 MB/s

        **Configuration:**

        * Multi-AZ: required
        * Automated backups: 30-day retention
        * Performance insights: enable with extended retention
        * Enhanced monitoring: enable
      </Tab>
    </Tabs>
  </Tab>

  <Tab title="Azure">
    We recommend that you use Azure Database for PostgreSQL. The Postgres version has to be 13 or higher.

    <Tabs>
      <Tab title="Standard level">
        **Compute tier:**

        * **Tier**: general purpose
        * **SKU**: `Standard_D2s_v3` (2 vCPU, 8 GB RAM)

        **Storage:**

        * **Type**: premium SSD
        * **Size**: 64 GB minimum (32 GB is minimum for premium SSD)
        * **IOPS**: depends on VM tier, typically 3,200+ IOPS for D2s\_v3

        **Configuration:**

        * High availability: zone-redundant recommended for production
        * Automated backups: enable with 7-day retention minimum
      </Tab>

      <Tab title="Mid scale">
        **Compute tier:**

        * **Tier**: general purpose
        * **SKU**: `Standard_D4s_v3` (4 vCPU, 16 GB RAM)

        **Storage:**

        * **Type**: premium SSD
        * **Size**: 256 GB
        * **IOPS**: \~7,000-10,000 IOPS (depends on VM tier)

        **Configuration:**

        * High availability: zone-redundant strongly recommended
        * Automated backups: 14-day retention
      </Tab>

      <Tab title="Fully scaled">
        **Compute tier:**

        * **Tier**: general purpose
        * **SKU**: `Standard_D8s_v3` (8 vCPU, 32 GB RAM)

        **Storage:**

        * **Type**: premium SSD v2 (when available) or premium SSD
        * **Size**: 1 TB minimum
        * **IOPS**: 20,000-40,000 IOPS (customize with premium SSD v2)

        **Configuration:**

        * High availability: zone-redundant required
        * Automated backups: 30-day retention
        * Enable query performance insight
      </Tab>
    </Tabs>
  </Tab>

  <Tab title="GCP">
    We recommend that you use GCP Cloud SQL for PostgreSQL. The Postgres version has to be 13 or higher.

    <Tabs>
      <Tab title="Standard level">
        **Machine type:**

        * **Recommended**: `db-custom-2-8192` (2 vCPU, 8 GB RAM)
        * Or use `db-standard-2` (2 vCPU, 7.5 GB RAM) for slightly lower cost

        **Storage:**

        * **Type**: SSD
        * **Size**: 50 GB minimum
        * **IOPS**: 3,000 IOPS (50 GB × 60 IOPS/GB = 3,000 IOPS)

        **Configuration:**

        * High availability: recommended for production
        * Automated backups: enable with 7-day retention minimum
      </Tab>

      <Tab title="Mid scale">
        **Machine type:**

        * **Recommended**: `db-custom-4-16384` (4 vCPU, 16 GB RAM)

        **Storage:**

        * **Type**: SSD
        * **Size**: 150 GB minimum
        * **IOPS**: 9,000 IOPS (150 GB × 60 IOPS/GB)

        **Configuration:**

        * High availability: strongly recommended
        * Automated backups: 14-day retention
        * Enable query insights
      </Tab>

      <Tab title="Fully scaled">
        **Machine type:**

        * **Recommended**: `db-custom-8-32768` (8 vCPU, 32 GB RAM)

        **Storage:**

        * **Type**: SSD
        * **Size**: 500 GB minimum
        * **IOPS**: 30,000 IOPS (500 GB × 60 IOPS/GB)

        **Configuration:**

        * High availability: required
        * Automated backups: 30-day retention
        * Enable query insights and database flags monitoring
      </Tab>
    </Tabs>
  </Tab>
</Tabs>
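
Where IOPS scale with disk size, as in the GCP tab above, the storage size needed for a target IOPS figure can be worked backwards. The sketch below assumes the 60 IOPS/GB ratio quoted in that tab; the function name is hypothetical, and you should confirm the current ratio and any per-instance caps with your cloud provider.

```python
def min_storage_gb_for_iops(target_iops, iops_per_gb=60):
    """Smallest whole-GB disk size that reaches target_iops at the given ratio."""
    # Ceiling division, so the result never falls short of the target.
    return -(-target_iops // iops_per_gb)


for level, target in [("Standard", 3_000), ("Mid scale", 9_000), ("Fully scaled", 30_000)]:
    print(f"{level}: {min_storage_gb_for_iops(target)} GB")
```

With the ratio from the GCP tab, this reproduces the 50 GB, 150 GB and 500 GB minimum sizes listed there.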

## Monitoring and observability

Regardless of your level of usage, we recommend that you implement these monitoring practices for the database:

### Critical metrics

| Metric                      | Warning threshold | Critical threshold | Action                                    |
| --------------------------- | ----------------- | ------------------ | ----------------------------------------- |
| **CPU utilization**         | >70% sustained    | >85% sustained     | Scale up compute                          |
| **Memory (Freeable)**       | \<25% free        | \<15% free         | Scale up memory                           |
| **IOPS (read/write)**       | >80% of limit     | >95% of limit      | Increase provisioned IOPS or storage size |
| **Disk space**              | \<20% free        | \<10% free         | Increase storage size                     |
| **Replication lag** (if HA) | >30 seconds       | >60 seconds        | Check network, investigate load           |

Conduktor support is available if you have any concerns about the metrics you are monitoring.
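
The thresholds in the table can be encoded as a small lookup, for example to label values pulled from your monitoring system. This is a minimal sketch: the metric keys and the `classify` function are hypothetical names, and it evaluates a single reading only, so the "sustained" qualifier is not modelled here.

```python
# (warning, critical, direction) per metric, taken from the critical metrics table.
# "above" means higher values are worse; "below" means lower values are worse.
THRESHOLDS = {
    "cpu_percent": (70, 85, "above"),
    "memory_free_percent": (25, 15, "below"),
    "iops_percent_of_limit": (80, 95, "above"),
    "disk_free_percent": (20, 10, "below"),
    "replication_lag_seconds": (30, 60, "above"),
}


def classify(metric, value):
    """Return "ok", "warning" or "critical" for a single metric reading."""
    warning, critical, direction = THRESHOLDS[metric]
    if direction == "above":
        if value > critical:
            return "critical"
        if value > warning:
            return "warning"
    else:
        if value < critical:
            return "critical"
        if value < warning:
            return "warning"
    return "ok"


print(classify("cpu_percent", 72))        # warning
print(classify("disk_free_percent", 8))   # critical
```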

### Cloud-specific tools

<Tabs>
  <Tab title="AWS RDS">
    * Enable **Performance Insights** (provides query-level analysis)
    * Enable **Enhanced Monitoring** (OS-level metrics)
    * Create CloudWatch alarms for critical metrics
    * Monitor: `ReadIOPS`, `WriteIOPS`, `CPUUtilization`, `FreeableMemory`, `DatabaseConnections`
    * [CloudWatch Metrics for RDS](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/monitoring-cloudwatch.html)
  </Tab>

  <Tab title="GCP Cloud SQL">
    * Enable **Query Insights** (slow query analysis)
    * Use **Cloud Monitoring** dashboards
    * Set up alerting policies in Cloud Monitoring
    * Monitor: `database/cpu/utilization`, `database/memory/utilization`, `database/disk/read_ops_count`, `database/disk/write_ops_count`
    * [Cloud SQL Monitoring](https://cloud.google.com/sql/docs/postgres/monitoring)
  </Tab>

  <Tab title="Azure Database for PostgreSQL">
    * Enable **Query Performance Insight**
    * Use **Azure Monitor** for metrics and alerts
    * Enable **Server Parameters** monitoring
    * Monitor: `cpu_percent`, `memory_percent`, `io_consumption_percent`, `storage_percent`, `active_connections`
    * [Azure PostgreSQL Monitoring](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-monitoring)
  </Tab>
</Tabs>
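
For the AWS RDS case, a CloudWatch alarm on `CPUUtilization` could be defined along the following lines. This sketch only builds the parameters for boto3's `put_metric_alarm` call, so it runs without AWS credentials. The instance identifier, the alarm naming scheme and the one-hour evaluation window are assumptions for illustration; the 70% threshold is the warning level from the critical metrics table.

```python
def cpu_alarm_params(db_instance_id, threshold_percent=70.0, sustained_minutes=60):
    """Build keyword arguments for CloudWatch put_metric_alarm on RDS CPU usage."""
    period_seconds = 300  # evaluate 5-minute averages
    return {
        "AlarmName": f"{db_instance_id}-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": period_seconds,
        # Number of consecutive periods that must breach before the alarm fires.
        "EvaluationPeriods": sustained_minutes * 60 // period_seconds,
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }


params = cpu_alarm_params("conduktor-console-db")  # hypothetical instance name
print(params["EvaluationPeriods"])  # 12

# To create the alarm (requires boto3 and AWS credentials):
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**params)
```

The same pattern applies to the other metrics listed in the AWS tab by changing `MetricName`, `Threshold` and `ComparisonOperator`.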

## Scaling and performance

### When to scale up

1. **CPU consistently >70%** for more than 1 hour during business hours
2. **Memory (freeable) \<25%** sustained, indicating that indexes and the working set don't fit in RAM
3. **IOPS at >80% of limit** for more than 30 minutes, causing query slowdowns
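
These rules depend on a condition holding for a period of time, not on a single reading. One way to check that against a series of evenly spaced samples is sketched below. The function name and the 5-minute sample interval are assumptions, and a run of consecutive breaching samples that covers the full window is treated as "sustained".

```python
def sustained_breach(samples, threshold, min_minutes, interval_minutes=5, below=False):
    """Return True if consecutive samples breach the threshold for min_minutes or longer."""
    run = 0  # length of the current run of breaching samples
    for value in samples:
        breached = value < threshold if below else value > threshold
        run = run + 1 if breached else 0
        if run * interval_minutes >= min_minutes:
            return True
    return False


cpu_samples = [75] * 12                            # one hour of 5-minute samples above 70%
print(sustained_breach(cpu_samples, 70, 60))       # True

spiky_samples = [75] * 11 + [60] + [75] * 5        # the run is broken before the hour is up
print(sustained_breach(spiky_samples, 70, 60))     # False
```

Pass `below=True` for metrics where low values are the problem, such as freeable memory.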

### Connection pooling

Conduktor Console includes built-in connection pooling.

The default is 15 connections per instance but you can change this using the `CDK_DATABASE_CONNECTION_POOL_SIZE` parameter.

Cloud providers have default connection limits based on the provisioned database instance size.

Verify that your instance type supports your required connection count. As a general rule, you should also allow for a few more (\~10) connections on top of this.

```
max_connections = (Console instances × connections_per_instance) + 10
```

**Example with 3 Console instances:**

```
max_connections = (3 × 15) + 10 = 55
```
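
The same calculation as a small helper, which can be useful when comparing the result against the connection limit of a candidate instance type. The function names are hypothetical; the default pool size of 15 and the headroom of 10 come from the text above.

```python
def required_max_connections(console_instances, pool_size=15, headroom=10):
    """Connections the database must accept for the given number of Console instances."""
    return console_instances * pool_size + headroom


def fits_connection_limit(console_instances, instance_limit, pool_size=15):
    """Check a provider's connection limit against the required count."""
    return required_max_connections(console_instances, pool_size) <= instance_limit


print(required_max_connections(3))                 # 55
print(required_max_connections(3, pool_size=30))   # 100
```

If you change `CDK_DATABASE_CONNECTION_POOL_SIZE`, pass the same value as `pool_size` so that the estimate matches your deployment.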

## Database maintenance

### Backup and recovery

Our suggested backup requirements are:

* **Automated backups**: enabled with 7-day retention minimum (14-30 days for production)
* **Backup window**: during low-usage periods (e.g., 2-4 AM local time)
* **Point-in-time recovery**: enabled (available on all cloud providers)
* **Cross-region backups**: for disaster recovery (if required by compliance)
* **Manual snapshots**: take a manual backup before upgrades of Conduktor Console

### Upgrade paths

**Scaling compute (vertical scaling):**

All providers support instance size changes with brief downtime (typically 5-15 minutes). Check your cloud provider documentation for full information.

**Scaling storage:**

* **AWS**: storage can be scaled up without downtime (gp3 volumes support online resizing)
* **GCP**: storage automatically scales up; can be manually increased without downtime
* **Azure**: storage can be scaled up without downtime

**IOPS scaling:**

* **AWS**: modify gp3 IOPS or switch to Provisioned IOPS (io1/io2) during a maintenance window
* **GCP**: IOPS scale automatically with storage size
* **Azure**: Premium SSD v2 allows online IOPS adjustment; Premium SSD requires storage tier change

## Cost optimization

1. **Use reserved instances/committed use discounts**: save 30-60% for predictable workloads
2. **Right-size early**: starting oversized and scaling down is difficult; start with recommendations and scale up as needed
3. **Use gp3 storage on AWS**: 20% cheaper than gp2 with better baseline performance, especially for IOPS
4. **Enable multi-AZ only for production**: dev/test environments can use single-AZ to save 50% on instance costs
5. **Monitor idle connections**: ensure connection pooling is working correctly to avoid over-provisioning

## Troubleshoot

<AccordionGroup>
  <Accordion title="Can I use Aurora PostgreSQL instead of RDS PostgreSQL?">
    Yes, Aurora PostgreSQL is compatible with Conduktor Console and is a good option for fully scaled deployments.

    Aurora provides better scalability, automatic failover, and read replicas. Version requirements still apply (14.8+ / 15.3+).
  </Accordion>

  <Accordion title="What happens if I run out of IOPS?">
    IOPS throttling causes slow queries, timeouts, and potential user-facing errors.

    The background metadata sync process is especially sensitive to IOPS limits. Monitor `ReadIOPS` and `WriteIOPS` metrics and scale up before hitting limits.
  </Accordion>

  <Accordion title="Can I use Burstable tier (T-series) instances for production?">
    T-series (AWS) or Burstable tier (Azure) instances can work for standard level installs with low, consistent load.

    However, once CPU credits are exhausted, performance degrades significantly. For production, we recommend general purpose instances (M-series on AWS, General Purpose on Azure/GCP) for predictable performance.
  </Accordion>

  <Accordion title="How do I estimate my database size growth?">
    Database growth typically depends on:

    * the number of Kafka topics, partitions, subjects (schemas), jobs (Kafka Connect) and consumer groups
    * the number of users, the level of RBAC and how active those users are

    Monitor your database size monthly to project growth.
  </Accordion>

  <Accordion title="Should I use read replicas?">
    Read replicas can help with read-heavy workloads but add complexity. They are typically not needed for Conduktor.
  </Accordion>

  <Accordion title="What if my deployment doesn't match these usage levels exactly?">
    These levels are guidelines. If you have 150 users but 100,000 topics, use mid or fully scaled sizing.

    The Kafka scale (topics, consumer groups) drives database size more than user count. When in doubt, start with the next level up and scale down if over-provisioned.
  </Accordion>
</AccordionGroup>

## Related resources

* [Deploy Conduktor Console](/guide/conduktor-in-production/deploy-artifacts/deploy-console)
* [Deploy Console with Kubernetes](/guide/conduktor-in-production/deploy-artifacts/deploy-console/kubernetes)
* [View Console environment variables](/guide/conduktor-in-production/deploy-artifacts/deploy-console/environment-variables)
* [Give us feedback/request a feature](https://conduktor.io/roadmap) <Icon icon="up-right-from-square" />
