Description
Problem Statement
When using OpenAI-compatible endpoints that wrap Bedrock models (e.g., Databricks Model Serving), context overflow errors are not properly detected, preventing conversation managers (like SummarizingConversationManager) from triggering.
Root Cause:
The OpenAI provider catches only BadRequestError with code "context_length_exceeded". Databricks endpoints serving Bedrock models instead return APIError with Bedrock-style error messages such as:
- "Input is too long for requested model"
- "input length and max_tokens exceed context limit"
- "too many total text bytes"
These errors are not converted to ContextWindowOverflowException, so the agent never attempts to reduce context.
Proposed Solution
Extend the OpenAI provider's exception handling to recognise Bedrock-style error messages in openai.APIError exceptions and convert them to ContextWindowOverflowException.
Changes:
- Add constants for Bedrock-style overflow message patterns
- Catch openai.APIError (after more specific exceptions) and check for these patterns
- Raise ContextWindowOverflowException when detected
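The changes above could look roughly like this. The constant and helper names (`BEDROCK_OVERFLOW_PATTERNS`, `raise_if_context_overflow`) are illustrative, not the SDK's actual identifiers, and the exception class is a local stand-in so the sketch is self-contained:

```python
class ContextWindowOverflowException(Exception):
    """Stand-in for the SDK's ContextWindowOverflowException."""


# Substrings of Bedrock-style overflow errors, matched case-insensitively.
BEDROCK_OVERFLOW_PATTERNS = (
    "input is too long for requested model",
    "input length and max_tokens exceed context limit",
    "too many total text bytes",
)


def raise_if_context_overflow(error: Exception) -> None:
    """Re-raise an error as a context overflow when its message matches.

    Intended to run inside an `except openai.APIError` handler placed
    after the existing, more specific `except openai.BadRequestError`
    clause, so native OpenAI overflow handling is unaffected.
    """
    message = str(error).lower()
    if any(pattern in message for pattern in BEDROCK_OVERFLOW_PATTERNS):
        raise ContextWindowOverflowException(str(error)) from error
```

Because matching is on message substrings rather than error codes, unrelated APIError instances fall through unchanged and are re-raised by the provider as before.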
This enables proper context management for OpenAI-compatible endpoints that wrap Bedrock models, maintaining backward compatibility with native OpenAI endpoints.
Use Case
Users deploying agents with Databricks Model Serving endpoints configured as OpenAI-compatible providers. Databricks serves models from AWS Bedrock but exposes them through an OpenAI-compatible API.
Scenario:
- Agent is configured with SummarizingConversationManager to handle long conversations
- Model provider is OpenAIModel pointing to a Databricks endpoint
- Databricks endpoint serves a Bedrock model (e.g., Claude)
- Conversation grows beyond the model's context window
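The scenario above can be sketched as follows. Module paths follow the strands SDK as I understand it, and the endpoint URL, token, and served model name are placeholders, not real values:

```python
from strands import Agent
from strands.agent.conversation_manager import SummarizingConversationManager
from strands.models.openai import OpenAIModel

# Databricks Model Serving exposes Bedrock models behind an
# OpenAI-compatible API, so OpenAIModel is pointed at that endpoint.
model = OpenAIModel(
    client_args={
        "base_url": "https://<workspace>.cloud.databricks.com/serving-endpoints",
        "api_key": "<databricks-token>",
    },
    model_id="<served-model-name>",  # e.g. a Bedrock-hosted Claude model
)

agent = Agent(
    model=model,
    conversation_manager=SummarizingConversationManager(),
)
```

With this wiring, the conversation manager can only reduce context if the provider surfaces overflow errors as ContextWindowOverflowException, which is exactly what fails today for Bedrock-style messages.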
Expected Behaviour:
Agent catches the overflow error, triggers reduce_context(), summarises old messages, and retries.
Actual Behaviour:
Agent crashes with openai.APIError: 400 - Input is too long for requested model because the error is not recognised as a context overflow, so summarisation never triggers.
Alternative Solutions
No response
Additional Context
No response