fix: correct model name aliases and harden dynamic cost calculator#44
Model registry fixes:
- Add claude-sonnet-4-0 alias (Anthropic naming convention) alongside existing claude-4.0-sonnet key
- Add claude-3-7-sonnet alias (hyphen) alongside claude-3.7-sonnet (dot)
- Add claude-3-5-haiku alias (hyphen) alongside claude-3.5-haiku (dot)

Mismatched names caused get_model_config to raise ValueError, which made the x402 middleware fall back to cost_per_request on every request; sessions accumulated to the 0.05 OPG cap and always settled for the full precheck amount instead of the actual usage cost.

Cost calculator hardening (util.py):
- _extract_usage_tokens now raises ValueError instead of returning None when usage data is missing or malformed, so there are no more silent zero charges
- _extract_model_from_context now reads only from request_json (it ignores the response model field, which may be a versioned provider alias)
- _extract_asset_decimals_from_requirements now raises ValueError for unknown or missing asset addresses instead of silently falling back to DEFAULT_ASSET_DECIMALS (which would compute wrong amounts for USDC)

Tests (tests/test_pricing.py):
- 69 unit tests covering every supported model from all four providers, with exact wei-level cost assertions for both OPG (18 decimals) and USDC (6 decimals), plus edge-case error path coverage
- Added to .github/workflows/test.yml so pricing tests run on every PR

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
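To illustrate the hardening described above, here is a minimal sketch of a fail-loud cost calculation. It is not the code in util.py: the function names, the price table, the placeholder asset keys, and the `usd_per_asset` conversion parameter are all assumptions made for the example. It only shows the general idea of raising ValueError on bad input and computing the charge as an exact integer in the asset's base units.

```python
from decimal import Decimal, ROUND_CEILING
from typing import Optional, Tuple

# Hypothetical USD prices per million tokens: (input, output).
PRICES_USD_PER_MTOK = {"example-model": (Decimal("3"), Decimal("15"))}

# Hypothetical asset table; the keys are placeholders, not real contract addresses.
ASSET_DECIMALS = {"0xopg": 18, "0xusdc": 6}


def extract_usage_tokens(response_json: dict) -> Tuple[int, int]:
    """Return (prompt_tokens, completion_tokens) or raise ValueError."""
    usage = response_json.get("usage")
    if not isinstance(usage, dict):
        raise ValueError("usage data is missing")
    counts = []
    for field in ("prompt_tokens", "completion_tokens"):
        value = usage.get(field)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"malformed usage field {field}: {value!r}")
        counts.append(value)
    return counts[0], counts[1]


def asset_decimals(asset_address: Optional[str]) -> int:
    """Look up decimals for an asset; never fall back to a default."""
    key = (asset_address or "").lower()
    if key not in ASSET_DECIMALS:
        raise ValueError(f"unknown or missing asset address: {asset_address!r}")
    return ASSET_DECIMALS[key]


def cost_in_base_units(
    model: str,
    response_json: dict,
    asset_address: Optional[str],
    usd_per_asset: Decimal = Decimal("1"),  # assumed conversion rate
) -> int:
    """Compute the charge as an integer number of base units, rounded up."""
    if model not in PRICES_USD_PER_MTOK:
        raise ValueError(f"unknown model: {model!r}")
    prompt, completion = extract_usage_tokens(response_json)
    price_in, price_out = PRICES_USD_PER_MTOK[model]
    cost_usd = (prompt * price_in + completion * price_out) / Decimal(1_000_000)
    scale = Decimal(10) ** asset_decimals(asset_address)
    units = (cost_usd / usd_per_asset * scale).to_integral_value(rounding=ROUND_CEILING)
    return int(units)
```

Using Decimal rather than float keeps the 18-decimal case exact, which is what makes wei-level equality assertions in tests practical.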
…ic pricing fails

The x402 PaymentMiddleware catches all exceptions from the session cost calculator and silently falls back to the static cap amount (0.05 OPG), causing every request with an unresolvable cost to be charged the full cap rather than failing visibly.

Capture the PaymentMiddleware instance and replace _resolve_session_request_cost with a strict variant that lets exceptions propagate. This means requests with unknown models, missing usage data, or any other pricing error will result in a 500 response instead of an incorrect charge.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
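The override described in this commit can be sketched roughly as follows. `FallbackMiddleware` is a hypothetical stand-in, not the real x402 PaymentMiddleware, and the `(self, context)` signature of `_resolve_session_request_cost` is assumed; only the method name comes from the commit message. The sketch shows the pattern of rebinding one method on a captured instance so that pricing errors propagate.

```python
import types


class FallbackMiddleware:
    """Hypothetical stand-in for a payment middleware (not the x402 class)."""

    def __init__(self, calculator, cap):
        self.calculator = calculator  # callable: context -> cost
        self.cap = cap                # static amount charged on fallback

    def _resolve_session_request_cost(self, context):
        try:
            return self.calculator(context)
        except Exception:
            # Silent fallback: the caller is charged the full cap.
            return self.cap


def strict_resolve(self, context):
    """Strict variant: no try/except, so calculator errors propagate."""
    return self.calculator(context)


def make_strict(middleware):
    """Rebind the resolver on this one instance only."""
    middleware._resolve_session_request_cost = types.MethodType(
        strict_resolve, middleware
    )
    return middleware
```

Patching the instance rather than the class limits the change to the middleware object that was captured, at the cost of depending on a private method name that may change between library versions.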
tee_gateway/model_registry.py
"claude-opus-4-5": SupportedModel.CLAUDE_OPUS_4_5,
"claude-opus-4-6": SupportedModel.CLAUDE_OPUS_4_6,
# claude-3-7-sonnet: Anthropic uses hyphens (3-7), not dots (3.7)
"claude-3-7-sonnet": SupportedModel.CLAUDE_3_7_SONNET,
we don't have sonnet-3.7 in the SDK anymore, why do we need it here?
Same reasoning here: staying compatible with older SDK versions. I'm fine with removing the older unsupported models now, though
Removed old ones + added comments for legacy models
tee_gateway/model_registry.py
"claude-3-5-haiku": SupportedModel.CLAUDE_3_5_HAIKU,
"claude-3.5-haiku": SupportedModel.CLAUDE_3_5_HAIKU,
# claude-sonnet-4-0: Anthropic's newer naming puts family before version
"claude-sonnet-4-0": SupportedModel.CLAUDE_4_0_SONNET,
hmm i'm a bit confused by this, do you expect clients to use this alternative naming?
No, I mostly just wanted to include both to make sure it's robust -- in case folks use older versions of the SDK that may use either the 3.5 or the 3-5 naming convention
i see, can you add a comment to explain that these might come from older SDKs? and that we should remove it eventually
Done -- added legacy section
- Remove CLAUDE_3_7_SONNET from the registry entirely (discontinued by Anthropic)
- Remove its _MODEL_LOOKUP entries ("claude-3-7-sonnet", "claude-3.7-sonnet")
- Group CLAUDE_3_5_HAIKU, CLAUDE_4_0_SONNET, GROK_2, GROK_3, GROK_3_MINI under
a "Legacy models" section in both the enum and _MODEL_LOOKUP — these are
retained only to support requests from older SDK versions and are not in the
current TEE_LLM enum
- Update unit tests: assert claude-3-7-sonnet raises ValueError; remove its
cost/alias tests; update legacy model test descriptions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
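As a rough illustration of the layout this commit describes, the sketch below keeps current and legacy aliases in one lookup table and raises ValueError for anything else. The enum values, the reduced set of models, the `resolve_model` helper, and the strip/lowercase normalization are assumptions for the example; the actual registry in tee_gateway/model_registry.py may differ.

```python
from enum import Enum


class SupportedModel(Enum):
    # Enum values here are illustrative placeholders.
    CLAUDE_OPUS_4_5 = "claude-opus-4-5"
    CLAUDE_OPUS_4_6 = "claude-opus-4-6"
    # Legacy models: retained only for requests from older SDK versions;
    # remove once those SDK versions are no longer in use.
    CLAUDE_4_0_SONNET = "claude-sonnet-4-0"
    GROK_3 = "grok-3"


_MODEL_LOOKUP = {
    "claude-opus-4-5": SupportedModel.CLAUDE_OPUS_4_5,
    "claude-opus-4-6": SupportedModel.CLAUDE_OPUS_4_6,
    # Legacy models (see note on the enum above).
    "claude-sonnet-4-0": SupportedModel.CLAUDE_4_0_SONNET,
    "claude-4.0-sonnet": SupportedModel.CLAUDE_4_0_SONNET,
    "grok-3": SupportedModel.GROK_3,
}


def resolve_model(name: str) -> SupportedModel:
    """Map a client-supplied model name to a SupportedModel or raise ValueError."""
    key = name.strip().lower()  # assumed normalization
    try:
        return _MODEL_LOOKUP[key]
    except KeyError:
        raise ValueError(f"unsupported model: {name!r}") from None
```

With this shape, both spellings of a legacy model resolve to the same enum member, and a removed name such as claude-3-7-sonnet fails at lookup time instead of being priced incorrectly.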
No need to assert the model raises — just remove it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t/tee-gateway into claude/affectionate-shockley
…check to 0.1

- claude-3-5-haiku-20241022 was retired by Anthropic on Feb 19 2026.
- grok-2-latest/grok-2-1212 returns 404 from xAI; the model is discontinued.
- OPG session cap raised from 0.05 to 0.1 to reduce sessions hitting the cap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Google shut down gemini-3-pro-preview on March 9, 2026 — 16 days ago. The replacement is gemini-3.1-pro-preview (not yet added here). gemini-3-flash-preview, grok-3/grok-3-mini, and claude-sonnet-4-0 are still live and retained. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Drop tests for claude-3-5-haiku, grok-2, and gemini-3-pro-preview which were removed from the model registry. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>