Graceful failure when extensions fail to start: suppress verbose errors and surface update hints#6832
…tensions Co-authored-by: hemarina <104857065+hemarina@users.noreply.github.com>
spboyer
left a comment
Two concerns:
- Non-timeout startup errors are masked as timeout failures (High)
  The fallback warning always says extensions failed to initialize within the timeout period, even for real startup errors (invalid binary, permission denied, bad args). WaitUntilReady returns the original failure error, not just a timeout, but the message hardcodes timeout wording for all failures. Consider branching the messaging on context.DeadlineExceeded vs. other error types.
- Synchronous network work in failure path (Medium)
  On extension startup failure, the middleware now performs update checks (including a source refresh via FindExtensions, which can issue HTTP requests) before continuing to next(ctx). This adds blocking I/O to a failure path that should be fast. Consider making update checks async with a short timeout, or skipping remote refresh entirely during startup failure handling.
@copilot address feedback
…rk I/O from failure path Co-authored-by: spboyer <7681382+spboyer@users.noreply.github.com>
Both issues addressed in 19b8566:
    var exitErr *exec.ExitError
    if errors.As(err, &exitErr) {
        m.console.Message(ctx, fmt.Sprintf(
            "Extension '%s' failed to start (exit code: %d). Run with --debug for details.",
This is to avoid printing a whole call stack to stderr. What do folks think? Should we print the whole error message instead?
        output.WithHighLightFormat("azd extension upgrade %s", result.ExtensionId)),
    fmt.Sprintf("To upgrade all: %s",
        output.WithHighLightFormat("azd extension upgrade --all")),
    fmt.Sprintf("Fix by running:\n\t%s\n\t%s",
I have a meeting with @hyoshis to discuss the UI further. Will update this after the meeting.
Pull request overview
This PR improves the user experience when extensions fail to start by suppressing verbose error output and surfacing actionable update hints.
Changes:
- Replaced raw extension stdout/stderr console output with debug-level logging only
- Added logic to check for available extension updates using cached registry data when startup fails
- Differentiated timeout vs. real startup errors in failure messaging
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cli/azd/cmd/middleware/extensions.go | Routes verbose extension errors to logs; checks for updates on failed extensions; conditionally shows timeout hint |
| cli/azd/pkg/extensions/update_checker.go | Updated warning message to list outdated extensions with upgrade/uninstall commands |
| cli/azd/pkg/extensions/registry_cache.go | Added semver comparison to correctly identify the latest version regardless of registry sort order |
| cli/azd/extensions/microsoft.azd.concurx/go.mod | Updated dependency from eiannone/keyboard to AlecAivazis/survey/v2 |
| cli/azd/extensions/microsoft.azd.concurx/go.sum | Dependency checksum updates reflecting the module changes |
| cli/azd/extensions/azure.ai.agents/go.mod | Moved stretchr/testify from indirect to direct dependency |
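The registry_cache.go row describes picking the latest version by semver comparison rather than by registry order. A minimal stdlib-only sketch of that idea follows; it is an assumption-laden illustration, not the PR's implementation (which may well use a semver library). The names `version`, `parse`, `compare`, and `latest` are hypothetical, and the parser is deliberately simplified.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a simplified semantic version: MAJOR.MINOR.PATCH[-prerelease].
type version struct {
	core [3]int
	pre  []string // prerelease identifiers; empty for a stable release
}

func parse(s string) (version, error) {
	var v version
	s = strings.TrimPrefix(s, "v")
	if i := strings.IndexByte(s, '+'); i >= 0 {
		s = s[:i] // build metadata does not affect precedence
	}
	core, pre, hasPre := strings.Cut(s, "-")
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("invalid version %q", s)
		}
		v.core[i] = n
	}
	if hasPre {
		v.pre = strings.Split(pre, ".")
	}
	return v, nil
}

// compareIdent orders prerelease identifiers: numeric ones compare as
// numbers and sort below alphanumeric ones.
func compareIdent(x, y string) int {
	xn, xerr := strconv.Atoi(x)
	yn, yerr := strconv.Atoi(y)
	switch {
	case xerr == nil && yerr == nil:
		return xn - yn
	case xerr == nil:
		return -1
	case yerr == nil:
		return 1
	}
	return strings.Compare(x, y)
}

// compare returns a negative, zero, or positive number when a sorts
// before, equal to, or after b.
func compare(a, b version) int {
	for i := range a.core {
		if a.core[i] != b.core[i] {
			return a.core[i] - b.core[i]
		}
	}
	switch {
	case len(a.pre) == 0 && len(b.pre) == 0:
		return 0
	case len(a.pre) == 0:
		return 1 // a stable release outranks its own prereleases
	case len(b.pre) == 0:
		return -1
	}
	for i := 0; i < len(a.pre) && i < len(b.pre); i++ {
		if c := compareIdent(a.pre[i], b.pre[i]); c != 0 {
			return c
		}
	}
	return len(a.pre) - len(b.pre)
}

// latest returns the highest parseable version regardless of input order.
func latest(versions []string) (string, bool) {
	best, found := "", false
	var bestV version
	for _, s := range versions {
		v, err := parse(s)
		if err != nil {
			continue // skip entries that are not valid versions
		}
		if !found || compare(v, bestV) > 0 {
			best, bestV, found = s, v, true
		}
	}
	return best, found
}

func main() {
	got, _ := latest([]string{"1.2.0", "1.10.0-beta.1", "1.9.3"})
	fmt.Println(got) // prints "1.10.0-beta.1"
}
```

Comparing numerically matters here: a plain string sort would place 1.9.3 above 1.10.0-beta.1, and relying on list order would return whichever entry the registry happened to put last.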
…t/improve-extension-error-messages
@copilot Run go mod tidy and update Go-related files
…ommand and prerelease version detection Co-authored-by: hemarina <104857065+hemarina@users.noreply.github.com>
Done in 9b88ddb.
Co-authored-by: hemarina <104857065+hemarina@users.noreply.github.com>

When an extension fails to start, `azd` was dumping the extension's full stdout/stderr to the console (including internal gRPC logs, timestamps, and EOF traces), producing noise irrelevant to the user. There was also no guidance that an outdated extension might be the root cause.

Changes

cmd/middleware/extensions.go
- Extension stdout/stderr on failure is now routed to `log.Printf` only, keeping it available for debug inspection (`AZD_DEBUG`) without polluting normal output.
- An `extensionStartFailure` struct tracks each extension alongside a `timedOut bool` (set via `errors.Is(err, context.DeadlineExceeded)`). The failure message now says "failed to initialize" generically; the `AZD_EXT_TIMEOUT` hint is shown only when at least one failure was actually a deadline expiry.
- `checkUpdatesForExtensions` consults the already-cached registry data for failed extensions (no network requests in the failure path). Any extension with an available upgrade emits a `ux.WarningMessage` with actionable upgrade instructions before the generic startup-failure block.

pkg/extensions/update_checker.go
- `FormatUpdateWarning` now uses the correct `azd extension upgrade` command and displays hints in the "Fix by running:" / "If you don't use these extensions, you can uninstall them:" format, with the specific-extension upgrade listed first, followed by `--all`.

pkg/extensions/registry_cache.go
- `GetExtensionLatestVersion` now finds the highest version via semver comparison across all listed versions, replacing complex prerelease-fallback logic that incorrectly returned a stable version when the only newer entry was a prerelease.

Before (on extension startup failure):
After (timeout failure with a newer version available):
After (real startup error, e.g. invalid binary, permission denied):