Advanced Scaling Features

This document describes the advanced scaling features implemented in the KAI platform, including predictive scaling, cross-service scaling dependencies, and enhanced HPA event logging.

Overview
Predictive Scaling
Cross-Service Scaling Dependencies
Enhanced HPA Event Logging
Monitoring and Visualization
API Reference

Overview

The KAI platform implements three advanced scaling features to optimize resource utilization and ensure system stability:

Predictive Scaling: Analyzes historical metrics to predict future load and proactively adjust HPA settings
Cross-Service Scaling Dependencies: Ensures that when one service scales, dependent services are also scaled appropriately
Enhanced HPA Event Logging: Provides detailed information about scaling decisions and their triggers

These features are implemented in the Coordinator service and can be enabled or disabled through environment variables.

Predictive Scaling

Predictive scaling analyzes historical metrics to predict future load and proactively adjust HPA settings for services with predictable load patterns.

How It Works

The system collects and analyzes historical metrics to identify patterns in service load
Based on these patterns, it predicts future load for specific time windows
The system proactively adjusts HPA settings to ensure that services have the right number of replicas before load increases
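The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Coordinator's actual algorithm: the `LoadSample` type, the hour-of-day averaging, and the `capacityPerReplica` parameter are all hypothetical, since the document does not say how patterns are detected or how load is converted into replicas.

```typescript
// Hypothetical sample type: the source does not specify how metrics are stored.
interface LoadSample {
  timestamp: Date;           // when the sample was taken
  requestsPerSecond: number; // observed load at that time
}

// Step 1: average historical load per hour of day (0-23, UTC).
function hourlyLoadProfile(samples: LoadSample[]): number[] {
  const sums = new Array<number>(24).fill(0);
  const counts = new Array<number>(24).fill(0);
  for (const s of samples) {
    const hour = s.timestamp.getUTCHours();
    sums[hour] += s.requestsPerSecond;
    counts[hour] += 1;
  }
  // Hours with no samples get a predicted load of 0.
  return sums.map((sum, h) => (counts[h] > 0 ? sum / counts[h] : 0));
}

// Steps 2-3: turn a predicted load into a replica floor, assuming each
// replica handles `capacityPerReplica` requests per second (assumed parameter).
function recommendedMinReplicas(
  predictedLoad: number,
  capacityPerReplica: number,
  floor: number,
  ceiling: number,
): number {
  const needed = Math.ceil(predictedLoad / capacityPerReplica);
  return Math.min(ceiling, Math.max(floor, needed));
}

// Usage: predict the 09:00 UTC load and derive a replica floor for that hour.
const history: LoadSample[] = [
  { timestamp: new Date("2024-01-01T09:00:00Z"), requestsPerSecond: 400 },
  { timestamp: new Date("2024-01-02T09:30:00Z"), requestsPerSecond: 600 },
  { timestamp: new Date("2024-01-01T03:00:00Z"), requestsPerSecond: 20 },
];
const profile = hourlyLoadProfile(history);
const minReplicasAt9 = recommendedMinReplicas(profile[9], 100, 1, 10);
```

In this sketch the result would be applied as the HPA's minimum replica count ahead of the window, so the HPA can still scale further on live metrics.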

Configuration

Predictive scaling is controlled by the following environment variables:

ENABLE_PREDICTIVE_SCALING: Set to true to enable predictive scaling
REDIS_URL: Redis connection URL for storing predictions and patterns
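As a sketch of how a service might read these variables at startup: the variable names come from this document, but the parsing rules (case-insensitive `true`, and failing fast when Redis is not configured) are assumptions rather than documented behavior. The function takes the environment as a parameter so it can be called with `process.env` in Node.js.

```typescript
// Hypothetical config shape for illustration only.
interface PredictiveScalingConfig {
  predictiveScalingEnabled: boolean;
  redisUrl: string;
}

type EnvSource = Record<string, string | undefined>;

function loadPredictiveScalingConfig(env: EnvSource): PredictiveScalingConfig {
  // Assumption: only the literal value "true" (any case) enables the feature.
  const enabled =
    (env["ENABLE_PREDICTIVE_SCALING"] ?? "").trim().toLowerCase() === "true";
  const redisUrl = env["REDIS_URL"] ?? "";
  // Assumption: predictions are stored in Redis, so an enabled feature
  // without REDIS_URL is treated as a configuration error.
  if (enabled && redisUrl === "") {
    throw new Error("REDIS_URL must be set when ENABLE_PREDICTIVE_SCALING is true");
  }
  return { predictiveScalingEnabled: enabled, redisUrl };
}

// Usage with an explicit environment object; the Redis URL is a placeholder.
const exampleConfig = loadPredictiveScalingConfig({
  ENABLE_PREDICTIVE_SCALING: "true",
  REDIS_URL: "redis://localhost:6379",
});
```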

Service Load Patterns

Service load patterns define when a service is expected to experience increased load:

JavaScript code removed for compatibility
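The pattern schema itself is not reproduced in this document. As a rough illustration only, a load pattern could be modeled as a set of recurring time windows, each with a replica floor. Every field name below (`daysOfWeek`, `startHour`, `endHour`, `minReplicas`) is hypothetical and may not match what the API accepts.

```typescript
// Hypothetical shape: the real pattern schema is not shown in this document.
interface LoadWindow {
  daysOfWeek: number[]; // 0 = Sunday ... 6 = Saturday (UTC)
  startHour: number;    // inclusive, 0-23 (UTC)
  endHour: number;      // exclusive, 1-24 (UTC)
  minReplicas: number;  // replica floor to apply during the window
}

interface ServiceLoadPattern {
  service: string;
  windows: LoadWindow[];
}

// Return the first window that covers the given instant, if any.
function activeWindow(pattern: ServiceLoadPattern, at: Date): LoadWindow | undefined {
  const day = at.getUTCDay();
  const hour = at.getUTCHours();
  return pattern.windows.find(
    (w) => w.daysOfWeek.indexOf(day) !== -1 && hour >= w.startHour && hour < w.endHour,
  );
}

// Usage: a made-up service that is busy on weekdays during business hours.
const weekdayPattern: ServiceLoadPattern = {
  service: "example-service",
  windows: [{ daysOfWeek: [1, 2, 3, 4, 5], startHour: 8, endHour: 18, minReplicas: 6 }],
};
const mondayMorning = activeWindow(weekdayPattern, new Date("2024-01-01T09:00:00Z"));
const saturdayMorning = activeWindow(weekdayPattern, new Date("2024-01-06T09:00:00Z"));
```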

API Endpoints

The following API endpoints are available for managing predictive scaling:

GET /api/predictive-scaling/patterns: Get all service load patterns
GET /api/predictive-scaling/patterns/:service: Get service load pattern for a specific service
POST /api/predictive-scaling/patterns/:service: Create or update service load pattern
DELETE /api/predictive-scaling/patterns/:service: Delete service load pattern
GET /api/predictive-scaling/predictions: Get recent predictions
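A client might build requests for these routes along the following lines. The paths are taken from the list above; the `ApiRequest` type, the helper names, and the JSON request body are assumptions, and the sketch only describes a request without sending it.

```typescript
const PREDICTIVE_SCALING_BASE = "/api/predictive-scaling";

// Hypothetical description of a request; not part of the documented API.
interface ApiRequest {
  method: "GET" | "POST" | "DELETE";
  path: string;
  body?: string;
}

// Build the per-service pattern path, escaping the service name.
function patternPath(service: string): string {
  return `${PREDICTIVE_SCALING_BASE}/patterns/${encodeURIComponent(service)}`;
}

// Assumption: the create-or-update endpoint accepts a JSON body.
function upsertPatternRequest(service: string, pattern: object): ApiRequest {
  return { method: "POST", path: patternPath(service), body: JSON.stringify(pattern) };
}

function deletePatternRequest(service: string): ApiRequest {
  return { method: "DELETE", path: patternPath(service) };
}

// Usage with a made-up service name and payload.
const upsertRequest = upsertPatternRequest("example-service", { windows: [] });
```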

Cross-Service Scaling Dependencies

Cross-service scaling dependencies ensure that when one service scales, dependent services are also scaled appropriately to maintain system balance.

How It Works

The system monitors the replica count of source services
When a source service scales, the system automatically adjusts the replica count of dependent services based on the defined dependency type

Dependency Types

Proportional: Scale the target service in proportion to the source service (e.g., at a 2:1 ratio)
Fixed: Set a fixed number of replicas for the target service when the source service scales
Minimum: Ensure that the target service has at least a minimum number of replicas
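The three dependency types can be read as three rules for computing the target's replica count. The sketch below is one interpretation, not the Coordinator's implementation: the field names are hypothetical, and `ratio` is assumed to mean target replicas per source replica, which the document does not specify.

```typescript
// Hypothetical dependency shapes, one per dependency type.
type ScalingDependency =
  | { type: "proportional"; ratio: number }   // assumed: target replicas per source replica
  | { type: "fixed"; replicas: number }
  | { type: "minimum"; minReplicas: number };

function targetReplicas(
  dep: ScalingDependency,
  sourceReplicas: number,
  currentTargetReplicas: number,
): number {
  switch (dep.type) {
    case "proportional":
      // Round up so the target is never under-provisioned; keep at least one replica.
      return Math.max(1, Math.ceil(sourceReplicas * dep.ratio));
    case "fixed":
      // Ignore the source's size; it only triggers the change.
      return dep.replicas;
    case "minimum":
      // Never scale the target down, only raise it to the floor.
      return Math.max(currentTargetReplicas, dep.minReplicas);
  }
}

// Usage: the source has scaled to 6 replicas and the target currently runs 2.
const proportionalResult = targetReplicas({ type: "proportional", ratio: 0.5 }, 6, 2);
const fixedResult = targetReplicas({ type: "fixed", replicas: 4 }, 6, 2);
const minimumResult = targetReplicas({ type: "minimum", minReplicas: 3 }, 6, 2);
```

Modeling the dependency as a tagged union keeps each type's parameters separate, so a proportional dependency cannot accidentally carry a fixed replica count.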

Configuration

Cross-service scaling dependencies are controlled by the following environment variables:

ENABLE_SCALING_DEPENDENCIES: Set to true to enable cross-service scaling dependencies
REDIS_URL: Redis connection URL for storing dependencies

API Endpoints

The following API endpoints are available for managing scaling dependencies:

GET /api/scaling-dependencies: Get all scaling dependencies
GET /api/scaling-dependencies/:sourceService/:targetService: Get a specific scaling dependency
POST /api/scaling-dependencies/:sourceService/:targetService: Create or update a scaling dependency
DELETE /api/scaling-dependencies/:sourceService/:targetService: Delete a scaling dependency
POST /api/scaling-dependencies/:sourceService/:targetService/enable: Enable a scaling dependency
POST /api/scaling-dependencies/:sourceService/:targetService/disable: Disable a scaling dependency

Enhanced HPA Event Logging

Enhanced HPA event logging records detailed information about scaling decisions and their triggers, helping operators understand and optimize scaling behavior.

How It Works

The system monitors HPA objects in the Kubernetes cluster
When a scaling event occurs, the system logs detailed information about the event, including the trigger metric and its value
The system also calculates scaling effectiveness metrics to help optimize scaling behavior

Configuration

Enhanced HPA event logging is controlled by the following environment variables:

ENABLE_HPA_EVENT_LOGGING: Set to true to enable enhanced HPA event logging
REDIS_URL: Redis connection URL for storing events

Event Types

scale-up: HPA decided to increase the number of replicas
scale-down: HPA decided to decrease the number of replicas
no-scale: HPA decided not to change the number of replicas
limited-scale: HPA wanted to scale but was limited by constraints (for example, its minimum or maximum replica bounds)
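One way to derive these four event types from an HPA observation is sketched below. The `HpaObservation` type and its `rawDesiredReplicas` field (the replica count computed from metrics before any bounds are applied) are hypothetical, and treating out-of-bounds requests as the only kind of limit is a simplifying assumption.

```typescript
type HpaEventType = "scale-up" | "scale-down" | "no-scale" | "limited-scale";

// Hypothetical observation of one HPA evaluation.
interface HpaObservation {
  currentReplicas: number;    // replicas running now
  rawDesiredReplicas: number; // replicas computed from metrics, before clamping
  minReplicas: number;        // lower bound configured on the HPA
  maxReplicas: number;        // upper bound configured on the HPA
}

function classifyHpaEvent(o: HpaObservation): HpaEventType {
  const clamped = Math.min(o.maxReplicas, Math.max(o.minReplicas, o.rawDesiredReplicas));
  // The HPA asked for more or fewer replicas than its bounds allow.
  if (clamped !== o.rawDesiredReplicas) return "limited-scale";
  if (clamped > o.currentReplicas) return "scale-up";
  if (clamped < o.currentReplicas) return "scale-down";
  return "no-scale";
}

// Usage: the metrics call for 14 replicas but the HPA allows at most 10.
const limitedExample = classifyHpaEvent({
  currentReplicas: 10,
  rawDesiredReplicas: 14,
  minReplicas: 1,
  maxReplicas: 10,
});
```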

API Endpoints

The following API endpoints are available for accessing HPA event logs:

GET /api/hpa-events: Get recent HPA events
GET /api/hpa-events/:service: Get recent HPA events for a specific service
GET /api/hpa-events/:service/effectiveness: Get scaling effectiveness for a specific service
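This document does not define how scaling effectiveness is calculated. Purely as an assumption, one plausible measure is the share of scaling events after which the trigger metric moved closer to its target. The `ScalingOutcome` type and the formula below are illustrative and may differ from what the effectiveness endpoint returns.

```typescript
// Hypothetical record pairing a scaling event with what happened afterwards.
interface ScalingOutcome {
  metricBefore: number; // trigger metric value when the event fired
  metricAfter: number;  // same metric once the new replica count settled
  target: number;       // HPA target value for that metric
}

// Fraction of events that brought the metric strictly closer to its target.
// Returns undefined when there are no events to evaluate.
function scalingEffectiveness(outcomes: ScalingOutcome[]): number | undefined {
  if (outcomes.length === 0) return undefined;
  const effective = outcomes.filter(
    (o) => Math.abs(o.metricAfter - o.target) < Math.abs(o.metricBefore - o.target),
  ).length;
  return effective / outcomes.length;
}

// Usage with made-up CPU utilization percentages and a 70% target.
const effectivenessExample = scalingEffectiveness([
  { metricBefore: 90, metricAfter: 65, target: 70 }, // closer: effective
  { metricBefore: 80, metricAfter: 85, target: 70 }, // further away: not effective
  { metricBefore: 30, metricAfter: 60, target: 70 }, // closer: effective
  { metricBefore: 75, metricAfter: 75, target: 70 }, // unchanged: not effective
]);
```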

Monitoring and Visualization

The KAI platform provides monitoring and visualization for the advanced scaling features through Grafana dashboards and the admin panel:

Grafana Dashboards

HPA Metrics Dashboard: Shows current and desired replica counts, scaling events, and their triggers
Coordinator Service Dashboard: Shows queue depths, workflow durations, and processing metrics
Supabase Connection Pool Dashboard: Shows database connection pool metrics and performance

Admin Panel Integration

The admin panel includes a dedicated Grafana Dashboards page that embeds these dashboards, providing a unified interface for monitoring the system.

API Reference

Predictive Scaling API

Get All Service Load Patterns

GET /api/predictive-scaling/patterns

Response:

JavaScript code removed for compatibility

Create or Update Service Load Pattern

POST /api/predictive-scaling/patterns/:service

Request:

JavaScript code removed for compatibility

Scaling Dependencies API

Get All Scaling Dependencies

GET /api/scaling-dependencies

Response:

JavaScript code removed for compatibility

Create or Update Scaling Dependency

POST /api/scaling-dependencies/:sourceService/:targetService

Request:

JavaScript code removed for compatibility

HPA Events API

Get Recent HPA Events

GET /api/hpa-events

Response:

JavaScript code removed for compatibility

Get Scaling Effectiveness

GET /api/hpa-events/:service/effectiveness

Response:

JavaScript code removed for compatibility
