Appframeworks s1 tests stability fix #1712
Conversation
… was able to identify while running CI for 10.2 certification
**Claude analysis**
Bug Explanation: MonitoringConsole Restart Loop Due to Empty SPLUNK_STANDALONE_URL

The Problem
When a Standalone CR with a MonitoringConsoleRef was deployed, the MonitoringConsole pod would enter a restart loop, timing out on its startup probe after ~6.7 minutes and continuously restarting.

Root Cause
The bug stems from a race condition in the operator's reconciliation logic combined with Go's zero-value semantics:
spec := enterpriseApi.StandaloneSpec{

Why the Fix Works
By explicitly setting Replicas: 1 in the test spec:
spec := enterpriseApi.StandaloneSpec{
The Replicas field is already 1 when ApplyMonitoringConsoleEnvConfigMap() is called, so GetSplunkStatefulsetUrls() correctly generates SPLUNK_STANDALONE_URL=splunk--standalone-0...svc.cluster.local instead of an empty string.

Potential Operator-Level Fix
The proper fix in the operator code would be to apply default values before creating dependent resources like ConfigMaps. This would involve reordering the logic in pkg/splunk/enterprise/standalone.go to apply defaults before line 228.
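To make the ordering issue described in the analysis concrete, here is a minimal, self-contained sketch. It does not use the operator's real types or functions: `StandaloneSpec`, `applyDefaults`, and `buildURLs` are hypothetical stand-ins, and the hostname format is made up for illustration. It only shows how a zero-valued `Replicas` can yield an empty URL list if the URLs are computed before defaults are applied.

```go
package main

import (
	"fmt"
	"strings"
)

// StandaloneSpec is a hypothetical stand-in for the operator's spec type;
// only the field relevant to this discussion is modeled.
type StandaloneSpec struct {
	Replicas int32 // Go's zero value is 0 when the field is omitted
}

// applyDefaults mimics a validation step that defaults Replicas to 1.
func applyDefaults(spec *StandaloneSpec) {
	if spec.Replicas == 0 {
		spec.Replicas = 1
	}
}

// buildURLs joins one illustrative hostname per replica. The hostname
// format is invented for this sketch and is not the operator's real one.
func buildURLs(name string, replicas int32) string {
	var urls []string
	for i := int32(0); i < replicas; i++ {
		urls = append(urls, fmt.Sprintf("%s-%d.example.svc.cluster.local", name, i))
	}
	return strings.Join(urls, ",")
}

func main() {
	// Problem ordering: URLs are computed before defaults are applied,
	// so the loop runs zero times and the result is an empty string.
	early := StandaloneSpec{}
	fmt.Printf("before defaults: %q\n", buildURLs("demo", early.Replicas))

	// Suggested ordering: apply defaults first, then build dependent data.
	late := StandaloneSpec{}
	applyDefaults(&late)
	fmt.Printf("after defaults:  %q\n", buildURLs("demo", late.Replicas))
}
```

Whether the operator actually hits the first ordering is exactly what is disputed later in this thread; the sketch only illustrates the mechanism being claimed.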
Pull Request Test Coverage Report for Build 22240607963 (Coveralls)
I think that if this happens as described in the comment, it would be better to fix it at the root instead of producing false-positive test results. Is someone looking into a permanent fix?
Also, validateStandaloneSpec is executed earlier than ApplyMonitoringConsoleEnvConfigMap and it sets replicas to 1 |
…t claude was able to identify while running CI for 10.2 certification" This reverts commit 38298fd.
@kasiakoziol I have implemented a similar fix in the operator-level code
Yes, however it does seem that there is a related race condition, as outlined here by Claude:
Should we not set the standalone minimum replicas to 1 instead of 0? I am unsure whether 0 is correct: when I create a standalone instance, I always see 1 replica created.
vivekr-splunk left a comment
Are we sure standalone does not create any replicas? I have run this locally many times and I can see it creating a single replica by default. Look at the validate() function in standalone; it should have the default set to 1 for standalone.
@patrykw-splunk is already working on adding kubebuilder validations wherever applicable in v4.
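For context on what such kubebuilder validations could look like, here is a hedged sketch using the standard controller-gen markers `+kubebuilder:default` and `+kubebuilder:validation:Minimum`. The struct below is a hypothetical, trimmed-down type, not the operator's actual v4 API, and the chosen values (default 1, minimum 1) are assumptions based on the question raised above, not a decision made in this PR.

```go
package main

import "fmt"

// StandaloneSpec is a hypothetical, trimmed-down spec used only to show
// where the markers would go; it is not the operator's actual type.
type StandaloneSpec struct {
	// Replicas is the number of standalone pods.
	// +kubebuilder:default=1
	// +kubebuilder:validation:Minimum=1
	// +optional
	Replicas int32 `json:"replicas,omitempty"`
}

func main() {
	// Markers only shape the generated CRD schema. A struct literal built
	// directly in Go code still starts at the zero value.
	spec := StandaloneSpec{}
	fmt.Println(spec.Replicas)
}
```

With markers like these, defaulting would happen in the API server when the object is created with the field omitted, so the operator would read a non-zero value from the stored object. Code that constructs the struct in-process and uses it without a round trip through the API server would still see 0.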
patrykw-splunk left a comment
It seems we don't have consensus here. @gabrielm-splunk, from the engineering side it would be good if you could provide a demo/PoC of this behaviour to confirm it.
Didn't want this to be a point of contention/high-effort fix. This seemed to help the
Description
Small fix to appframeworksS1 tests, as the standalone specs that reference the MC cause some issues (as identified by Claude). Will post Claude's explanation of the bug and fix in the comments.
Key Changes
Just adding Replicas: 1 to standalone specs with an MC ref.
Testing and Verification
Ran tests locally.
Related Issues
Stemmed from 10.2 certification: https://splunk.atlassian.net/browse/CSPL-4531
PR Checklist