# Memory Leak in ApiServerSpanExporter #933

@bleastrind

 ## 🔴 Required Information

 **Describe the Bug:**
 The `ApiServerSpanExporter` class in the `dev` module has a memory leak. Its `allExportedSpans` list retains every exported span and has no cleanup
 mechanism, so it grows for as long as the application runs and eventually exhausts memory (OOM).

 **Steps to Reproduce:**
 1. Start the ADK application with OpenTelemetry tracing enabled
 2. Run the application for an extended period (several hours or days)
 3. Create multiple agent sessions or execute LLM calls continuously
 4. Monitor memory usage - it will continuously increase

 **Expected Behavior:**
 The application should maintain stable memory usage over time, with old trace data being cleaned up when it's no longer needed.

 **Observed Behavior:**
 Memory usage grows continuously until the application runs out of memory and crashes (OOM). The `allExportedSpans` list contains all historical spans
 without any limit or cleanup strategy.

 **Environment Details:**
  - ADK Library Version: Any
  - OS: Any (macOS, Linux, Windows)
  - Java Version: Any

 **Model Information:**
  - Which model is being used: N/A (affects all models)

 ---

 ## 🟡 Optional Information

 **Regression:**
 Not a regression in the strict sense: the issue exists in all versions that use `ApiServerSpanExporter`

 **Root Cause Analysis:**
 The `ApiServerSpanExporter` class stores all exported spans in a `List<SpanData>` called `allExportedSpans`. This list:
 - Is appended to every time `export()` is called, so every exported span is retained
 - Has no maximum size limit
 - Has no time-based expiration
 - Is never cleared, even when `shutdown()` is called
 - Is accessed by the debug endpoint `/adk-dev/debug/trace/session/{sessionId}` to retrieve session traces

 **Impact:**
 - Memory leak in both production and development environments
 - Increased GC pressure
 - Potential OOM crashes in long-running applications
 - Performance degradation due to growing list size

 **Proposed Fix:**
 Add a capacity limit to `allExportedSpans` (e.g., keep only the most recent 10,000 spans) and clear all storage in the `shutdown()` method.
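
As an illustration of the capacity-limit idea, here is a minimal sketch of a bounded buffer that evicts the oldest entries in O(1) each. `BoundedSpanBuffer` is a hypothetical helper and not part of ADK; it assumes insertion order is the only ordering that matters and uses a generic element type in place of `SpanData`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical bounded buffer (not an ADK class). Keeps only the newest maxSize items. */
final class BoundedSpanBuffer<T> {
  private final int maxSize;
  private final ArrayDeque<T> items = new ArrayDeque<>();

  BoundedSpanBuffer(int maxSize) {
    if (maxSize <= 0) {
      throw new IllegalArgumentException("maxSize must be positive");
    }
    this.maxSize = maxSize;
  }

  /** Appends a batch, then evicts from the head until the cap is respected. */
  synchronized void addAll(Collection<? extends T> batch) {
    items.addAll(batch);
    while (items.size() > maxSize) {
      items.pollFirst(); // O(1) removal of the oldest element
    }
  }

  /** Returns a copy so callers can iterate without holding the lock. */
  synchronized List<T> snapshot() {
    return new ArrayList<>(items);
  }

  synchronized int size() {
    return items.size();
  }

  synchronized void clear() {
    items.clear();
  }
}
```

A deque avoids the element shifting that `remove(0)` causes on an `ArrayList`. The trade-off is that readers such as the debug endpoint would work on a snapshot copy rather than the live list.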

 **Additional Context:**
 The `sessionToTraceIdsMap` also has a similar issue, though less severe as it only stores session-to-trace-id mappings rather than full span data.
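
One possible way to bound a map like this is sketched below, using `LinkedHashMap.removeEldestEntry`. `BoundedMaps` and `newBoundedMap` are hypothetical names, and the actual key and value types of `sessionToTraceIdsMap` are not shown in this issue, so the sketch stays generic.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not an ADK class). */
final class BoundedMaps {
  private BoundedMaps() {}

  /**
   * Returns a synchronized map that drops its oldest-inserted entry
   * whenever an insertion pushes it past maxEntries.
   */
  static <K, V> Map<K, V> newBoundedMap(final int maxEntries) {
    return Collections.synchronizedMap(
        new LinkedHashMap<K, V>() {
          @Override
          protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
          }
        });
  }
}
```

Eviction here is by insertion order, so a long-lived session could lose its mapping while it is still active. Whether that is acceptable depends on how the debug endpoint is used.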

 **Minimal Reproduction Code:**
 ```java
 // Run any ADK application with tracing enabled for an extended period
 // Monitor memory usage
 ```
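
For a quicker illustration that does not need a running ADK application, the growth pattern can be simulated with a stand-in. This is only a sketch: `LeakDemo` and `RetainingExporter` are hypothetical and use strings instead of `SpanData`; they mimic the retain-everything behavior described above rather than the real exporter.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class LeakDemo {
  /** Hypothetical stand-in for an exporter that retains everything it is given. */
  static final class RetainingExporter {
    private final List<String> allExported = Collections.synchronizedList(new ArrayList<>());

    void export(Collection<String> spans) {
      allExported.addAll(spans); // never trimmed, never cleared
    }

    int retained() {
      return allExported.size();
    }
  }

  /** Exports the given number of batches of fake spans and returns how many are retained. */
  static int run(int batches, int batchSize) {
    RetainingExporter exporter = new RetainingExporter();
    for (int b = 0; b < batches; b++) {
      List<String> batch = new ArrayList<>(batchSize);
      for (int i = 0; i < batchSize; i++) {
        batch.add("span-" + b + "-" + i);
      }
      exporter.export(batch);
    }
    return exporter.retained();
  }

  public static void main(String[] args) {
    System.out.println("retained spans: " + run(1_000, 100));
  }
}
```

The retained count equals the total number of spans ever exported, which is the unbounded growth this issue describes.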

 **How often has this issue occurred?:**
  - Always (100%) - This is a deterministic memory leak that will occur in all long-running applications

 ---

 ## 🟢 Fix Details

 **Files Modified:**
 - `dev/src/main/java/com/google/adk/web/service/ApiServerSpanExporter.java`

 **Changes Made:**
 1. Added `MAX_SPANS_TO_KEEP` constant (10,000 spans)
 2. Modified `export()` method to remove oldest spans when limit is exceeded
 3. Modified `shutdown()` method to clear all storage

 **Implementation:**
 ```java
 private static final int MAX_SPANS_TO_KEEP = 10_000;

 @Override
 public CompletableResultCode export(Collection<SpanData> spans) {
   exporterLog.debug("ApiServerSpanExporter received {} spans to export.", spans.size());
   List<SpanData> currentBatch = new ArrayList<>(spans);

   // Append and trim under one lock so concurrent exports cannot interleave
   // between the add and the size check.
   synchronized (allExportedSpans) {
     allExportedSpans.addAll(currentBatch);
     int excess = allExportedSpans.size() - MAX_SPANS_TO_KEEP;
     if (excess > 0) {
       // Drop the oldest spans in one range removal instead of repeated remove(0).
       allExportedSpans.subList(0, excess).clear();
     }
   }
   // ... rest of method
 }

 @Override
 public CompletableResultCode shutdown() {
   exporterLog.debug("Shutting down ApiServerSpanExporter.");
   // Clear all storage so nothing is retained after shutdown
   synchronized (allExportedSpans) {
     allExportedSpans.clear();
   }
   eventIdTraceStorage.clear();
   sessionToTraceIdsMap.clear();
   return CompletableResultCode.ofSuccess();
 }
 ```

Labels: bug