decision: add decision record for when to extend HookProvider #457
This PR proposes a decision record aligning with the outcome of the review of the following document.
Should new features implement HookProvider?
Problem Statement
We are introducing RetryStrategy as a first-class parameter in the Agent constructor. RetryStrategy determines whether and how to retry failed model invocations or tool executions. Like SessionManager and ConversationManager, RetryStrategy will leverage HookProvider to integrate with the agent lifecycle. The team has agreed that the public interface should be retry_strategy: RetryStrategy rather than the more generic retry_strategy: HookProvider. The open question is whether the RetryStrategy interface itself should extend HookProvider to perform retries via the hook system, or whether it should expose a simpler contract such as should_retry(exception: Exception) -> bool with the framework handling invocation internally.
While RetryStrategy provides the immediate context for this decision, the question applies broadly to all future interfaces we introduce. Should new capabilities like guardrails, rate limiters, cost trackers, or audit loggers extend HookProvider as SessionManager and ConversationManager do today? Or should we establish a different pattern where domain interfaces remain pure and the framework handles lifecycle integration internally? The decision we make for RetryStrategy will set precedent for the SDK's architectural direction.
This discussion assumes HookProvider is a valid internal mechanism for lifecycle coordination. Whether hooks are the right architectural choice for the framework is a separate question. The question here is narrower: when the framework uses hooks internally, should that implementation detail be exposed in user-facing interfaces?
Context: HookProvider as Internal Primitive
Today, both SessionManager and ConversationManager implement HookProvider. This architectural choice enabled rapid iteration because new capabilities could be built using the same extensibility mechanism available to external contributors. When we needed session persistence, we did not invent a new lifecycle integration pattern; we registered callbacks for AgentInitializedEvent, MessageAddedEvent, and AfterInvocationEvent. The hook system became our internal composition primitive.
This approach follows the principle of dogfooding. By building features through the public extensibility API, we validate that API continuously and ensure external developers have access to the same power we do. This also reinforces a key architectural property: Martin Fowler describes the distinction between a framework and a library as inversion of control—frameworks call your code rather than the other way around. Hooks give us that inversion uniformly whether the code is internal or external.
The Case for Extending HookProvider
Consistency argues for HookProvider. If SessionManager persists state via hooks and ConversationManager reduces context via hooks, then RetryStrategy performing retries via hooks maintains a uniform mental model. Developers who understand one component understand the composition pattern of all components. This aligns with our tenet of composability: primitives are building blocks with each other, and each feature is developed with all other features in mind.
Under this approach, the interface would look like the following:
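A minimal sketch of such an interface, written against self-contained stand-ins rather than the SDK's actual classes, might look like this. The `HookRegistry` stand-in, the `exception` and `retry` fields on the event, and `MaxAttemptsRetryStrategy` are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AfterModelCallEvent:
    """Stand-in event; the `exception` and `retry` fields are assumed."""
    exception: Optional[Exception] = None
    retry: bool = False


class HookRegistry:
    """Stand-in registry: callbacks run in registration order."""

    def __init__(self) -> None:
        self._callbacks: dict[type, list[Callable]] = {}

    def add_callback(self, event_type: type, callback: Callable) -> None:
        self._callbacks.setdefault(event_type, []).append(callback)

    def invoke(self, event):
        for callback in self._callbacks.get(type(event), []):
            callback(event)
        return event


class HookProvider(ABC):
    """Stand-in for the SDK's HookProvider contract."""

    @abstractmethod
    def register_hooks(self, registry: HookRegistry) -> None: ...


class RetryStrategy(HookProvider):
    """Public type for Agent(retry_strategy=...); adds no methods of its own."""


class MaxAttemptsRetryStrategy(RetryStrategy):
    """Hypothetical strategy: request a retry until max_attempts failures."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        self._attempts = 0

    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(AfterModelCallEvent, self._on_after_model_call)

    def _on_after_model_call(self, event: AfterModelCallEvent) -> None:
        if event.exception is None:
            self._attempts = 0  # a successful call resets the counter
            return
        self._attempts += 1
        event.retry = self._attempts < self.max_attempts
```

Note that the implementer has to know which event to subscribe to and how to signal a retry through the event object, which is the cost discussed in the next section.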
Implementing HookProvider also honors our commitment to being extensible by design. A RetryStrategy could respond to multiple events such as AfterModelCallEvent and AfterToolCallEvent, adjust its behavior based on telemetry events, or compose with other hook providers. The Open-Closed Principle suggests we should design for extension, and HookProvider provides that extension surface.
The Case Against: Leaky Abstractions and Implicit Behavior
The counterargument centers on abstraction boundaries. When a user configures session_manager=FileSessionManager(), they cannot easily reason about when persistence occurs. The session is not saved through an explicit session_manager.save() call; instead, persistence happens implicitly through hook callbacks responding to events the user may not know exist. This implicit behavior directly challenges our tenet that the obvious path is the happy path. If understanding when sessions persist requires knowledge of hook registration and event dispatch, we have made the non-obvious path the only path.
Joshua Bloch's guidance on API design emphasizes that interfaces should be easy to use correctly and hard to use incorrectly. A RetryStrategy with a should_retry method is self-documenting:
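As a rough sketch, the simple contract and a framework-side loop that consumes it could look like the following. Only the `should_retry(exception: Exception) -> bool` signature comes from the discussion above; `RetryOnTimeout` and `call_with_retry` are hypothetical names used for illustration.

```python
from typing import Callable, Protocol, TypeVar

T = TypeVar("T")


class RetryStrategy(Protocol):
    def should_retry(self, exception: Exception) -> bool: ...


class RetryOnTimeout:
    """Hypothetical strategy: retry TimeoutError up to max_retries times."""

    def __init__(self, max_retries: int = 2) -> None:
        self.max_retries = max_retries
        self._retries = 0

    def should_retry(self, exception: Exception) -> bool:
        if not isinstance(exception, TimeoutError):
            return False
        self._retries += 1
        return self._retries <= self.max_retries


def call_with_retry(operation: Callable[[], T], strategy: RetryStrategy) -> T:
    """Hypothetical framework code: the only place that knows about the loop."""
    while True:
        try:
            return operation()
        except Exception as exc:
            if not strategy.should_retry(exc):
                raise  # the strategy declined, so surface the original error
```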
The framework calls should_retry, the strategy returns a boolean, and the framework acts accordingly. The user need not understand AfterModelCallEvent, hook registration, or callback ordering. This simplicity embodies our tenet that simple things should be simple. A retry strategy is conceptually simple; its interface should reflect that simplicity rather than exposing the machinery of lifecycle integration.
The Deeper Question: Primitive or Leak?
The core tension is whether HookProvider is a base primitive upon which everything should be built, or whether its internal usage constitutes an abstraction leak that we have simply normalized.
The hook-based approach has enabled customers to solve problems the framework did not anticipate. Consider a real-world example where a team needed to run guardrail validation before session persistence. Because hook ordering is determined by registration order, they embedded guardrail registration inside their session manager subclass:
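The shape of that workaround can be sketched as follows. This is an illustration built on stand-in classes, not the team's actual code: `HookRegistry`, `GuardrailHooks`, `GuardedSessionManager`, the `log` field, and this simplified `FileSessionManager` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MessageAddedEvent:
    """Stand-in event; `log` records the order in which callbacks ran."""
    message: str
    log: list = field(default_factory=list)


class HookRegistry:
    """Stand-in registry: callbacks run in registration order."""

    def __init__(self) -> None:
        self._callbacks: dict[type, list[Callable]] = {}

    def add_callback(self, event_type: type, callback: Callable) -> None:
        self._callbacks.setdefault(event_type, []).append(callback)

    def invoke(self, event):
        for callback in self._callbacks.get(type(event), []):
            callback(event)
        return event


class GuardrailHooks:
    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(MessageAddedEvent, self.validate)

    def validate(self, event: MessageAddedEvent) -> None:
        event.log.append("guardrail")


class FileSessionManager:
    """Simplified stand-in for the real session manager."""

    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(MessageAddedEvent, self.persist)

    def persist(self, event: MessageAddedEvent) -> None:
        event.log.append("persist")


class GuardedSessionManager(FileSessionManager):
    def __init__(self, guardrail: GuardrailHooks) -> None:
        self._guardrail = guardrail

    def register_hooks(self, registry: HookRegistry) -> None:
        # Register the guardrail first so it runs before persistence.
        self._guardrail.register_hooks(registry)
        super().register_hooks(registry)
```

The ordering guarantee lives entirely in the two lines of `register_hooks`, which is why the arrangement is both powerful and fragile.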
The HookProvider abstraction gave them the power to solve their problem without waiting for framework changes. This is extensibility by design working as intended.
However, this flexibility comes with tradeoffs. The workaround violates separation of concerns: a session manager is an unexpected place for guardrail configuration. A reader must understand hook registration sequencing to understand why guardrails appear in a session manager class. If the base class changes its registration order, the workaround may silently break. The question is whether these tradeoffs are acceptable costs of extensibility, or whether they indicate the abstraction is leaking implementation details that force unrelated concerns to merge.
Consider Java's Comparator interface. A comparator exposes compare(a, b) and the framework calls it at the appropriate time. The comparator knows nothing about sorting internals or when comparisons occur. A simple RetryStrategy follows the same pattern: expose should_retry and let the framework call it. Extending HookProvider inverts this relationship, requiring the user to understand the event system, register callbacks, and respond to event objects just to implement retry logic.
Exposing HookProvider in an interface is not inherently a leak—it depends on what the interface requires. SessionManager legitimately needs to respond to initialization, message addition, and invocation completion as separate events. If we chose not to have SessionManager implement HookProvider, we would introduce abstract methods like on_agent_initialized, on_message_added, and on_invocation_complete—strongly coupled to the same lifecycle events, just with extra indirection. RetryStrategy is different: it has a single decision point, so a simple should_retry method captures the entire domain without mirroring event structure. Exposing HookProvider becomes a leak when we apply it uniformly rather than asking what each interface actually requires.
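To make the "extra indirection" concrete, here is a sketch of what a SessionManager without HookProvider might look like. The three method names come from the paragraph above; their parameters, `InMemorySessionManager`, and `run_invocation` are assumptions for illustration.

```python
from abc import ABC, abstractmethod


class SessionManager(ABC):
    """Each abstract method mirrors one lifecycle event one-to-one."""

    @abstractmethod
    def on_agent_initialized(self, agent_id: str) -> None: ...

    @abstractmethod
    def on_message_added(self, agent_id: str, message: str) -> None: ...

    @abstractmethod
    def on_invocation_complete(self, agent_id: str) -> None: ...


class InMemorySessionManager(SessionManager):
    def __init__(self) -> None:
        self.sessions: dict[str, list[str]] = {}
        self.flushed: list[str] = []

    def on_agent_initialized(self, agent_id: str) -> None:
        self.sessions.setdefault(agent_id, [])

    def on_message_added(self, agent_id: str, message: str) -> None:
        self.sessions[agent_id].append(message)

    def on_invocation_complete(self, agent_id: str) -> None:
        self.flushed.append(agent_id)


def run_invocation(manager: SessionManager, agent_id: str, messages: list[str]) -> None:
    """Hypothetical framework code: one direct call per lifecycle point."""
    manager.on_agent_initialized(agent_id)
    for message in messages:
        manager.on_message_added(agent_id, message)
    manager.on_invocation_complete(agent_id)
```

The interface is still shaped by the lifecycle; it has only replaced event subscriptions with method names.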
Recommendation
While RetryStrategy is simple enough that a non-HookProvider interface would suffice, for consistency with SessionManager and ConversationManager, RetryStrategy will extend HookProvider. This maintains a uniform pattern across all agent constructor parameters that integrate with the lifecycle: users implementing any of these interfaces learn one composition model.
Decision Framework for Future Interfaces
When introducing a new internal interface, use the following criteria to determine whether it should extend HookProvider or expose a simple domain interface.
Consider a simple interface (not extending HookProvider) when:
Use HookProvider (or extend it) when:
When uncertain, prefer the simple interface. A simple interface can always be evolved to extend HookProvider if real use cases emerge. The opposite migration path breaks existing implementations.
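One way the forward migration could work is an adapter that wraps an existing simple strategy in a hook provider, so implementations written against should_retry keep working unchanged. The sketch below uses stand-ins throughout: `HookRegistry`, the `exception` and `retry` fields on the event, and `HookRetryAdapter` are assumptions, not SDK APIs.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


@dataclass
class AfterModelCallEvent:
    """Stand-in event; the `exception` and `retry` fields are assumed."""
    exception: Optional[Exception] = None
    retry: bool = False


class HookRegistry:
    """Stand-in registry: callbacks run in registration order."""

    def __init__(self) -> None:
        self._callbacks: dict[type, list[Callable]] = {}

    def add_callback(self, event_type: type, callback: Callable) -> None:
        self._callbacks.setdefault(event_type, []).append(callback)

    def invoke(self, event):
        for callback in self._callbacks.get(type(event), []):
            callback(event)
        return event


class SimpleRetryStrategy(Protocol):
    def should_retry(self, exception: Exception) -> bool: ...


class HookRetryAdapter:
    """Hypothetical adapter: exposes a simple strategy through the hook system."""

    def __init__(self, strategy: SimpleRetryStrategy) -> None:
        self._strategy = strategy

    def register_hooks(self, registry: HookRegistry) -> None:
        registry.add_callback(AfterModelCallEvent, self._on_after_model_call)

    def _on_after_model_call(self, event: AfterModelCallEvent) -> None:
        # Delegate the decision; the strategy never sees the event object.
        if event.exception is not None:
            event.retry = self._strategy.should_retry(event.exception)
```

No comparable adapter exists in the other direction: a strategy written as arbitrary hook callbacks cannot be mechanically reduced to a single should_retry method.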