fix: 5 critical bugs — TEE randomization, HTTP 402 hint, int dtype, inference None crash, gas estimation #201
Open
amathxbt wants to merge 4 commits into OpenGradient:main from
Conversation
Previously get_llm_tee() always returned tees[0], the first TEE in the registry list. This caused all clients to hit the same TEE, providing no load distribution and no resilience when that TEE starts failing. Fix: use random.choice(tees) so each LLM() construction independently picks from all currently active TEEs. Successive retries or re-initializations will therefore naturally spread across the healthy pool. Closes OpenGradient#200
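The fix above can be sketched as follows. Note this is a minimal illustration: `get_llm_tee` is shown as a free function taking the active-TEE list as a parameter, whereas the real registry method presumably reads that list internally; the empty-list guard is an added assumption, not something the commit message describes.

```python
import random

def get_llm_tee(tees: list) -> str:
    """Pick one TEE uniformly at random from the active pool.

    Sketch only: the real method lives on the TEE registry and
    takes no argument. The guard below is an assumption.
    """
    if not tees:
        raise RuntimeError("no active TEEs in registry")
    # random.choice instead of tees[0]: each LLM() construction
    # independently spreads load across the healthy pool.
    return random.choice(tees)
```

Because the choice is made per construction, retries or re-initializations naturally land on different TEEs without any coordination.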
Previously any HTTP error from the TEE (including 402 Payment Required)
was silently wrapped into a generic RuntimeError("TEE LLM chat failed: ...").
This caused the confusing traceback seen in issue OpenGradient#188 — the real cause
(insufficient OPG allowance) was buried inside the exception message.
Fix:
- Add `import httpx` and intercept httpx.HTTPStatusError before the
generic `except Exception` handler in both _chat_request() and
completion().
- When status == 402, raise a RuntimeError with a clear, actionable hint
telling the user to call llm.ensure_opg_approval(opg_amount=<amount>).
- Also handle 402 in _parse_sse_response() for the streaming path.
- All other HTTP errors continue to surface as before.
Closes OpenGradient#188
Previously convert_to_float32() always returned np.float32 regardless of the decimals field, forcing callers to manually cast integer results (issue OpenGradient#103 — add proper type conversions from Solidity contract to Python).
Fix:
- Add convert_fixed_point_to_python(value, decimals) that returns int when decimals == 0 and np.float32 otherwise. np.array() automatically picks the correct dtype (int64 vs float32) based on the element types.
- Update both convert_to_model_output() and convert_array_to_model_output() to call the new function.
- Keep convert_to_float32() as a deprecated backward-compatible alias so any external code that imports it directly continues to work.
Partially closes OpenGradient#103
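A minimal sketch of the new conversion, assuming the Solidity values are encoded as `value * 10**decimals` (a common fixed-point convention; the source does not spell out the scaling):

```python
import numpy as np

def convert_fixed_point_to_python(value: int, decimals: int):
    """Return a plain int when decimals == 0, np.float32 otherwise.

    The 10**decimals scaling is an assumption about the on-chain
    fixed-point encoding, not taken from the PR text.
    """
    if decimals == 0:
        return int(value)
    return np.float32(value / 10**decimals)
```

`np.array()` then infers the dtype from the element types, so a list of `int` results yields an integer array while a list of `np.float32` results yields a `float32` array.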
… in run_workflow

Bug 4 — None crash in infer(): _get_inference_result_from_node() can legitimately return None when the inference node has no result yet. Previously this None was passed directly to convert_to_model_output(), which calls event_data.get(), causing an AttributeError: 'NoneType' object has no attribute 'get'.
Fix: after calling _get_inference_result_from_node(), check for None and raise a clear RuntimeError with a human-readable message and the inference_id so callers know what to retry. Also guard the precompile log fetch: if parsed_logs is empty, raise RuntimeError instead of crashing with an IndexError on parsed_logs[0].

Bug 5 — hardcoded 30M gas in run_workflow(): run_workflow() built the transaction with gas=30000000 (30 million) unconditionally. This is wasteful (users overpay for gas) and can fail on networks with a lower block gas limit.
Fix: call estimate_gas() first (consistent with infer()), then multiply by 3 for headroom. Fall back to 30M only if estimation itself fails. Also fixes new_workflow() deploy to use the INFERENCE_TX_TIMEOUT constant instead of the previous hardcoded 60 seconds.
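Both fixes can be sketched as small helpers. The names `checked_inference_result` and `workflow_gas` are hypothetical (the real checks are inlined in `infer()` and `run_workflow()`), and `estimate_gas` is shown as a plain callable standing in for the web3 transaction's estimator.

```python
FALLBACK_GAS = 30_000_000  # the previously hardcoded limit

def checked_inference_result(result, inference_id):
    """Bug 4 sketch: fail loudly with the inference_id instead of
    letting a None reach convert_to_model_output()."""
    if result is None:
        raise RuntimeError(
            f"Inference {inference_id} has no result yet; "
            "the node may still be processing. Retry later."
        )
    return result

def workflow_gas(estimate_gas):
    """Bug 5 sketch: estimate with 3x headroom, fall back to 30M
    only when estimation itself fails."""
    try:
        return estimate_gas() * 3
    except Exception:
        # e.g. the node rejected the eth_estimateGas call
        return FALLBACK_GAS
```

The 3x multiplier buys headroom against estimation drift while still staying far below the old unconditional 30M on typical workflows.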
Hey @adambalogh 👋 — this PR addresses 5 critical bugs found during a code audit, 3 of which directly close or partially close open issues you and the team have filed.
All changes are backward-compatible. Happy to adjust anything before review! 🙏
Summary
This PR fixes 5 critical bugs identified from open issues and a code audit, spanning TEE selection, LLM error handling, type conversions, inference safety, and gas estimation.
Bug 1 — TEE always picks the same node (closes #200)

File: `src/opengradient/client/tee_registry.py`
Root cause: `get_llm_tee()` always returned `tees[0]`, routing 100% of traffic to a single TEE with zero load distribution or failover.
Fix: Replace `tees[0]` with `random.choice(tees)` so each `LLM()` construction independently selects from the full pool of active, registry-verified TEEs.

Bug 2 — HTTP 402 swallowed as cryptic RuntimeError (closes #188)

File: `src/opengradient/client/llm.py`
Root cause: All HTTP errors (including 402 Payment Required) were caught by `except Exception` and re-raised as a generic `RuntimeError("TEE LLM chat failed: ...")`. Users had no idea they needed to call `ensure_opg_approval()` first.
Fix: Add `import httpx` and intercept `httpx.HTTPStatusError` before the generic handler. When `status_code == 402`, raise a `RuntimeError` with an explicit, actionable message pointing to `ensure_opg_approval()`. Applied to `_chat_request()`, `completion()`, and `_parse_sse_response()` (streaming path).

Bug 3 — Fixed-point int returns np.float32 instead of int (partially closes #103)

File: `src/opengradient/client/_conversions.py`
Root cause: `convert_to_float32(value, decimals)` always returned `np.float32`, even when `decimals == 0` (i.e., the original value was an integer). Users expecting integer outputs received `np.float32` and had to cast manually.
Fix: Add `convert_fixed_point_to_python(value, decimals)` that returns `int` when `decimals == 0` and `np.float32` otherwise. `np.array()` then infers the correct dtype (`int64` vs `float32`) automatically. The old `convert_to_float32` is kept as a deprecated alias for backward compatibility.

Bug 4 — infer() crashes with AttributeError when inference node returns None

File: `src/opengradient/client/alpha.py`
Root cause: `_get_inference_result_from_node()` returns `None` when the node has no result yet. This `None` was passed directly to `convert_to_model_output()`, which calls `event_data.get("output", {})` → `AttributeError: 'NoneType' object has no attribute 'get'`. Additionally, if `ModelInferenceEvent` logs were empty, `parsed_logs[0]` caused an `IndexError`.
Fix: After calling `_get_inference_result_from_node()`, check for `None` and raise a clear `RuntimeError` that includes the `inference_id` so callers know what to retry. Also guard the precompile log fetch: raise `RuntimeError` when `parsed_logs` is empty instead of crashing with an `IndexError` on `parsed_logs[0]`.

Bug 5 — run_workflow() uses hardcoded 30M gas

File: `src/opengradient/client/alpha.py`
Root cause: `run_workflow()` hardcoded `gas=30000000`, unlike `infer()` which correctly uses `estimate_gas()`. This is both wasteful (users overpay) and fragile on networks with lower block gas limits.
Fix: Call `estimate_gas()` first and multiply by 3 for headroom. Fall back to 30M only if estimation itself fails. Also replaces the hardcoded `timeout=60` in `new_workflow()` with the `INFERENCE_TX_TIMEOUT` constant.

Files Changed

- `src/opengradient/client/tee_registry.py`
- `src/opengradient/client/llm.py`
- `src/opengradient/client/_conversions.py`
- `src/opengradient/client/alpha.py`

Related Issues
Closes #200
Closes #188
Partially closes #103