
This repository contains demo infrastructure configurations for showcasing Kestrel's cloud incident response capabilities.


KestrelAI/Demos


Kestrel AI Demos

This repository contains demonstration infrastructure configurations for showcasing Kestrel AI's cloud incident response capabilities.

Available Demos

Demonstrates a common VPC peering misconfiguration in which asymmetric routing creates network "blackholes": traffic flows one way, but responses cannot return.
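The asymmetric-routing condition can be illustrated with a small check over route tables. This is a minimal sketch, not the demo's own tooling: the dictionary layout, the names `has_route` and `find_blackholes`, and the peering ID `pcx-example` are all hypothetical and chosen only for illustration.

```python
from ipaddress import ip_network


def has_route(route_table, dest_cidr, peering_id):
    """Return True if some route via peering_id covers dest_cidr."""
    dest = ip_network(dest_cidr)
    return any(
        r["target"] == peering_id and dest.subnet_of(ip_network(r["cidr"]))
        for r in route_table
    )


def find_blackholes(vpc_a, vpc_b, peering_id):
    """List (source, destination) directions that lack a route over the peering."""
    missing = []
    if not has_route(vpc_a["routes"], vpc_b["cidr"], peering_id):
        missing.append((vpc_a["name"], vpc_b["name"]))
    if not has_route(vpc_b["routes"], vpc_a["cidr"], peering_id):
        missing.append((vpc_b["name"], vpc_a["name"]))
    return missing


# Hypothetical data: vpc-a routes to vpc-b over the peering,
# but vpc-b has no return route to vpc-a.
vpc_a = {
    "name": "vpc-a",
    "cidr": "10.0.0.0/16",
    "routes": [{"cidr": "10.1.0.0/16", "target": "pcx-example"}],
}
vpc_b = {
    "name": "vpc-b",
    "cidr": "10.1.0.0/16",
    "routes": [{"cidr": "10.1.0.0/16", "target": "local"}],
}

print(find_blackholes(vpc_a, vpc_b, "pcx-example"))
# → [('vpc-b', 'vpc-a')]
```

A non-empty result means at least one direction has no return path, which is the situation the demo reproduces.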

Demonstrates how undersized MSK brokers become resource-constrained under production load. Creates an MSK cluster with undersized brokers and generates high-volume traffic to trigger CPU and memory exhaustion.

Demonstrates how an undersized 2-broker MSK cluster becomes capacity-bound under high throughput, causing under-replicated partitions. Kestrel detects this and generates a two-step fix: adding brokers via the AWS API, then rebalancing partitions via the Kafka CLI.
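To make the rebalancing step concrete, the sketch below builds a round-robin replica assignment in the JSON shape that `kafka-reassign-partitions.sh` accepts through `--reassignment-json-file`. It is an illustration only, not the fix Kestrel generates: the function name `round_robin_plan`, the topic `demo-topic`, and the broker IDs are assumptions, and the broker-addition step is not shown.

```python
import json


def round_robin_plan(topic, num_partitions, broker_ids, replication_factor):
    """Spread each partition's replicas across brokers in round-robin order."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds broker count")
    partitions = []
    for p in range(num_partitions):
        # Partition p starts at broker index p and wraps around the broker list.
        replicas = [
            broker_ids[(p + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster after growing from 2 to 4 brokers.
plan = round_robin_plan("demo-topic", 6, [1, 2, 3, 4], 2)
print(json.dumps(plan, indent=2))
```

With four brokers, the six partitions are no longer confined to brokers 1 and 2, which is the point of rebalancing after the cluster grows.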

Demonstrates how required pod anti-affinity causes scheduling failures when scaling beyond the number of available nodes. The configuration works in dev/staging with few replicas but fails in production when the HPA tries to scale out during traffic spikes: the classic "works in dev, breaks in prod" scenario.
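The failure mode reduces to simple counting, which the toy scheduler below illustrates. It is a sketch under stated assumptions: the anti-affinity rule is taken to be per node (one pod per host), and the node names, the pod names, and the function `schedule` are hypothetical rather than taken from the demo.

```python
def schedule(replicas, nodes):
    """Place at most one pod per node; pods beyond the node count stay Pending."""
    placed = {}
    pending = []
    free_nodes = list(nodes)
    for i in range(replicas):
        pod = f"web-{i}"
        if free_nodes:
            placed[pod] = free_nodes.pop(0)
        else:
            # A required (hard) rule leaves no node the scheduler may use.
            pending.append(pod)
    return placed, pending


nodes = ["node-a", "node-b", "node-c"]
for replicas in (2, 5):
    placed, pending = schedule(replicas, nodes)
    print(f"replicas={replicas} running={len(placed)} pending={pending}")
# → replicas=2 running=2 pending=[]
# → replicas=5 running=3 pending=['web-3', 'web-4']
```

With two replicas on three nodes everything runs, which matches the dev/staging experience; at five replicas the extra pods have nowhere to go.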

Demonstrates GPU resource fragmentation on Kubernetes clusters running NVIDIA A100/H100 GPUs with Multi-Instance GPU (MIG) enabled. MIG instances must be created from contiguous GPU slices, which leads to fragmentation over time. A dashboard can show 40 GB "available" while training jobs requesting 2g.10gb stay Pending, because no single GPU has two contiguous slices free. Kestrel detects this otherwise invisible problem and shows how to fix it.
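The gap between "free memory" and "placeable instance" can be shown with a deliberately simplified model. The sketch assumes 7 slices per GPU at 5 GB each and checks only for a contiguous free run; real MIG placement has further constraints that are ignored here, and the names `longest_free_run`, `can_place`, and `free_gb` are hypothetical.

```python
SLICES_PER_GPU = 7  # assumption: simplified slice count per GPU
GB_PER_SLICE = 5    # assumption: memory attributed to each slice


def longest_free_run(slices):
    """Length of the longest run of free (0) slices; 1 marks a slice in use."""
    best = run = 0
    for used in slices:
        run = 0 if used else run + 1
        best = max(best, run)
    return best


def can_place(gpus, needed):
    """True if any single GPU has `needed` contiguous free slices."""
    return any(longest_free_run(g) >= needed for g in gpus)


def free_gb(gpus):
    """Total free memory as a naive dashboard would sum it."""
    return sum(g.count(0) for g in gpus) * GB_PER_SLICE


# Two fragmented GPUs: free and used slices alternate on both.
gpus = [
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 1, 0],
]

print(free_gb(gpus))       # → 40
print(can_place(gpus, 2))  # → False
print(can_place(gpus, 1))  # → True
```

The summed figure reports 40 GB free, yet a two-slice request cannot be placed on either GPU, which is why the shortage does not appear on a capacity dashboard.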

License

MIT
