-
Notifications
You must be signed in to change notification settings - Fork 42
RHINENG-21760: add additional debug logs for shutdown #2030
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Referenced Jiras: |
Reviewer's guide (collapsed on small PRs)Reviewer's GuideAdds additional observability around application shutdown by logging received termination signals and graceful HTTP server shutdown, and updates tests accordingly. Sequence diagram for enhanced shutdown loggingsequenceDiagram
participant OS as OS
participant HandleSignals as HandleSignals
participant Context as CancelContext
participant Server as RunServer
OS->>HandleSignals: SIGINT or SIGTERM
HandleSignals->>HandleSignals: LogInfo starting grace period
HandleSignals->>HandleSignals: Sleep defaultK8sGracePeriod/4
HandleSignals->>Context: Cancel()
Context->>Server: ctx.Done()
Server->>Server: LogDebug gracefully shutting down server
Server->>Server: Shutdown()
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
b74ca65 to
c38f528
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've left some high level feedback:
- In
HandleSignals, consider logging the actual signal value (e.g., using the value read from the channel) instead of a genericSIGTERM/SIGINTmessage to make it clearer which signal triggered shutdown. - In
RunServer, it may be useful to include the server address/port in the "gracefully shutting down server" log message to aid debugging when multiple servers are running.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `HandleSignals`, consider logging the actual signal value (e.g., using the value read from the channel) instead of a generic `SIGTERM/SIGINT` message to make it clearer which signal triggered shutdown.
- In `RunServer`, it may be useful to include the server address/port in the "gracefully shutting down server" log message to aid debugging when multiple servers are running.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
c38f528 to
488bfa4
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2030 +/- ##
==========================================
+ Coverage 59.18% 59.27% +0.08%
==========================================
Files 133 134 +1
Lines 8599 8618 +19
==========================================
+ Hits 5089 5108 +19
Misses 2967 2967
Partials 543 543
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
488bfa4 to
49c47bb
Compare
3539e2c to
40cb335
Compare
Add a short delay, a portion of the grace period between SIGTERM and SIGKILL, before context cancellation to allow the in-flight requests to finish and thus prevent gRPC CANCELED errors from Kessel middleware.
40cb335 to
fb2b112
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 1 issue, and left some high level feedback:
- The fixed
defaultK8sGracePeriodand unconditionaltime.Sleep(defaultK8sGracePeriod / 4)inHandleSignalsmake shutdown latency rigid and potentially misaligned with actual KubernetesterminationGracePeriodSeconds; consider making this duration configurable or derived from environment/config instead of hard-coded. - Using
time.Sleepin the signal-handling goroutine makes the pre-cancel delay non-interruptible (e.g., on a second signal); consider using atime.Timeror context with timeout so you can still react promptly to additional signals or early-exit conditions during the grace window. - For the new signal and shutdown messages, consider using structured logging (fields for signal type and grace-period duration) rather than string concatenation to make log filtering and analysis easier in production.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The fixed `defaultK8sGracePeriod` and unconditional `time.Sleep(defaultK8sGracePeriod / 4)` in `HandleSignals` make shutdown latency rigid and potentially misaligned with actual Kubernetes `terminationGracePeriodSeconds`; consider making this duration configurable or derived from environment/config instead of hard-coded.
- Using `time.Sleep` in the signal-handling goroutine makes the pre-cancel delay non-interruptible (e.g., on a second signal); consider using a `time.Timer` or context with timeout so you can still react promptly to additional signals or early-exit conditions during the grace window.
- For the new signal and shutdown messages, consider using structured logging (fields for signal type and grace-period duration) rather than string concatenation to make log filtering and analysis easier in production.
## Individual Comments
### Comment 1
<location> `base/base.go:30-32` </location>
<code_context>
signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)
go func() {
- <-c
+ sig := <-c
+ utils.LogInfo("starting grace period for " + sig.String())
+ time.Sleep(defaultK8sGracePeriod / 4)
CancelContext()
utils.LogInfo("SIGTERM/SIGINT handled")
</code_context>
<issue_to_address>
**issue (bug_risk):** Blocking sleep in signal handler delays context cancellation and may reduce actual shutdown time window.
Sleeping in the signal-handling goroutine means shutdown doesn’t start until `defaultK8sGracePeriod/4` has elapsed, reducing the time available for cleanup and delaying all logic tied to this context. Instead, consider calling `CancelContext()` immediately and moving any delay to a separate goroutine that only governs process exit, or make the sleep duration configurable (including zero) so it can be tuned to the actual grace period of the deployment environment.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
/retest |
Secure Coding Practices Checklist GitHub Link
Secure Coding Checklist
Summary by Sourcery
Improve graceful shutdown behavior and observability for the HTTP server and signal handling.
Enhancements:
Tests: