ci(ZENKO-5142): ignore flaky test failures on re-run via JUnit merging and check_retries#2329
ci(ZENKO-5142): ignore flaky test failures on re-run via JUnit merging and check_retries#2329Copilot wants to merge 15 commits intodevelopment/2.13from
Conversation
…ENKO-5142) Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
…ENKO-5142) Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
|
@copilot Is it really how ´check_retries' behaves? My understanding was that it looked in the report only, not the previous runs. Can you check how it really behaves? Also we really want to keep artifacts of each run, so we need to keep the "run" suffix somehow |
…s flaky tests Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
You're right on both counts — Fixed in 6ae738a:
|
| id: prev-attempt | ||
| if: ${{ github.run_attempt > 1 }} | ||
| shell: bash | ||
| run: echo "number=$(( ${{ github.run_attempt }} - 1 ))" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
@copilot do we actually need a bash step : can't we simply use a javascript expression directly in the next step argument: ${{ github.run_attempt -1 }} ?
There was a problem hiding this comment.
Removed the bash step in 06abf96 — ${{ github.run_attempt - 1 }} is used directly in the artifact name.
| password: ${{ inputs.password }} | ||
| source: /tmp/artifacts | ||
|
|
||
| - name: Fail if tests have failures |
There was a problem hiding this comment.
- is this needed: iill
continue-on-errorreally "ignore" the failure completely, and let the whole job/workflow complete successfully? - if it is, should we not simply move the
Publish test reportstep down here (and remove itscontinue-on-error?
There was a problem hiding this comment.
Done in 06abf96. continue-on-error: true does completely suppress the failure for the job conclusion, which is why the extra "Fail if tests have failures" step was needed as a workaround. Moved "Publish test report" to the very end (after "Upload artifacts"), removed continue-on-error and the workaround step. Added if: always() so it still runs when earlier archiving steps fail.
…move publish step last Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
| with: | ||
| check_name: ${{ inputs.stage }} | ||
| report_paths: /tmp/junit-merged/merged.xml | ||
| job_summary: true |
There was a problem hiding this comment.
@copilot Publish summary only if the tests in this job actually failed (or if this is not the first run, as it means we retried: so there may be flakies)
There was a problem hiding this comment.
Done in 5a16b0a — job_summary: ${{ job.status != 'success' || github.run_attempt > 1 }}.
Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
| password: ${{ inputs.password }} | ||
| source: /tmp/artifacts | ||
|
|
||
| - name: Publish test report |
There was a problem hiding this comment.
@copilot keep this at the beginning, so the report is displayed earlier.
it can fail, followup steps are marked as "always()" anyway
There was a problem hiding this comment.
Done in ab9b374 — "Publish test report" is now immediately after "Upload merged JUnit report", with continue-on-error: true so the archiving steps still run. Added if: always() to "Upload results" and "Upload artifacts" for the same reason.
| shell: bash | ||
| run: npm install --prefix /tmp/xmldom-install @xmldom/xmldom | ||
|
|
||
| - name: Download previous reports and merge |
There was a problem hiding this comment.
@copilot the JS script is way to complex.
download every report with a single bash loop : for report in {1..${{ github.run_attempt - 1 }}} ; do curl ...
There was a problem hiding this comment.
Done in b64308c. The download is now a plain bash step:
for attempt in $(seq 1 ${{ github.run_attempt - 1 }}); do
curl -f -s -u "user:password" \
"{link}/data/{job}.${attempt}/raw-reports.xml" \
-o "/tmp/junit-previous/attempt-${attempt}.xml" \
|| echo "Attempt ${attempt} not available (skipped)"
donemerge-junit-reports.js now only handles XML merging — no HTTP, no auth.
Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
| @@ -0,0 +1,151 @@ | |||
| 'use strict'; | |||
There was a problem hiding this comment.
this script is still way too long and complex. Python is available "natively" in GHA, may use a script like this:
#!/usr/bin/env python3
"""
merge-reports.py - Safe JUnit XML merger
Usage: python3 merge-reports.py output.xml input1.xml input2.xml ...
"""
import xml.etree.ElementTree as ET
import sys
def merge_reports(output_file, input_files):
"""Safely merge JUnit XML reports"""
# Create root element
root = ET.Element('testsuites')
total_tests = 0
total_failures = 0
total_errors = 0
total_skipped = 0
for file in input_files:
try:
tree = ET.parse(file)
source_root = tree.getroot()
# Handle both testsuites and testsuite root elements
if source_root.tag == 'testsuites':
testsuites = source_root.findall('testsuite')
else:
testsuites = [source_root]
for suite in testsuites:
root.append(suite)
# Aggregate stats
total_tests += int(suite.get('tests', 0))
total_failures += int(suite.get('failures', 0))
total_errors += int(suite.get('errors', 0))
total_skipped += int(suite.get('skipped', 0))
except ET.ParseError as e:
print(f"Error parsing {file}: {e}", file=sys.stderr)
sys.exit(1)
# Set aggregated stats on root
root.set('tests', str(total_tests))
root.set('failures', str(total_failures))
root.set('errors', str(total_errors))
root.set('skipped', str(total_skipped))
# Write output
tree = ET.ElementTree(root)
ET.indent(tree, space=" ") # Pretty print
tree.write(output_file, encoding='utf-8', xml_declaration=True)
print(f"✓ Merged {len(input_files)} files -> {output_file}")
if __name__ == '__main__':
if len(sys.argv) < 4:
print(f"Usage: {sys.argv[0]} <output.xml> <input1.xml> <input2.xml> ...", file=sys.stderr)
sys.exit(1)
output = sys.argv[1]
inputs = sys.argv[2:]
merge_reports(output, inputs)
There was a problem hiding this comment.
Done in 76e87fd. Replaced the JS script with a Python script based on your suggestion (stdlib xml.etree.ElementTree, no external deps). The actions/github-script and npm install steps are gone — the "Merge JUnit reports" step is now a plain bash step calling python3 merge-junit-reports.py twice: once for raw-reports.xml (current attempt only) and once for junit-merged.xml (all attempts combined).
…lementTree) Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
| shopt -s nullglob | ||
| mkdir -p /tmp/artifacts/data/${{ inputs.stage }} | ||
| current_files=(${{ inputs.junit-paths }}) | ||
| prev_files=(/tmp/junit-previous/*.xml) | ||
| # Save current-attempt report for future re-runs to download | ||
| [ ${#current_files[@]} -gt 0 ] && python3 "$GITHUB_WORKSPACE/.github/scripts/merge-junit-reports.py" \ | ||
| /tmp/artifacts/data/${{ inputs.stage }}/raw-reports.xml \ | ||
| "${current_files[@]}" | ||
| # Merge current + all previous attempts into a single report | ||
| all_files=("${current_files[@]}" "${prev_files[@]}") | ||
| [ ${#all_files[@]} -gt 0 ] && python3 "$GITHUB_WORKSPACE/.github/scripts/merge-junit-reports.py" \ | ||
| /tmp/artifacts/data/${{ inputs.stage }}/junit-merged.xml \ | ||
| "${all_files[@]}" |
There was a problem hiding this comment.
@copilot too complex and hard to read, makes this simpler
There was a problem hiding this comment.
Simplified in 4b9d805. The Python script now expands glob patterns internally, so the bash step is just two calls:
python3 merge-junit-reports.py raw-reports.xml '${{ inputs.junit-paths }}'
python3 merge-junit-reports.py junit-merged.xml '${{ inputs.junit-paths }}' '/tmp/junit-previous/*.xml'No shopt, no arrays, no conditionals.
Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
| python3 "$GITHUB_WORKSPACE/.github/scripts/merge-junit-reports.py" \ | ||
| /tmp/artifacts/data/${{ inputs.stage }}/raw-reports.xml '${{ inputs.junit-paths }}' | ||
| python3 "$GITHUB_WORKSPACE/.github/scripts/merge-junit-reports.py" \ | ||
| /tmp/artifacts/data/${{ inputs.stage }}/junit-merged.xml \ | ||
| '${{ inputs.junit-paths }}' '/tmp/junit-previous/*.xml' |
There was a problem hiding this comment.
@copilot why 2 calls? Can't we make a single one?
There was a problem hiding this comment.
Done in f88291e. The script now takes output_dir current_glob [previous_glob] and writes both files internally, so the bash step is a single call:
python3 merge-junit-reports.py \
/tmp/artifacts/data/${{ inputs.stage }} \
'${{ inputs.junit-paths }}' '/tmp/junit-previous/*.xml'Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
| current_glob = sys.argv[2] | ||
| previous_glob = sys.argv[3] if len(sys.argv) > 3 else None |
There was a problem hiding this comment.
@copilot why handle this here?
can't we just process every file that is passed in argument (no globbing, just let bash handle wildcard expansion in the gha step)
There was a problem hiding this comment.
Done in ee270a5. Removed glob/os from the Python script — it now just takes plain file paths as sys.argv[2:]. Bash handles glob expansion in the action step:
shopt -s nullglob
python3 "$SCRIPT" "$STAGE/raw-reports.xml" ${{ inputs.junit-paths }}
python3 "$SCRIPT" "$STAGE/junit-merged.xml" ${{ inputs.junit-paths }} /tmp/junit-previous/*.xmlshopt -s nullglob ensures the second call works safely on attempt 1 (empty /tmp/junit-previous/).
Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
| python3 "$SCRIPT" "$STAGE/raw-reports.xml" ${{ inputs.junit-paths }} | ||
| python3 "$SCRIPT" "$STAGE/junit-merged.xml" ${{ inputs.junit-paths }} /tmp/junit-previous/*.xml |
There was a problem hiding this comment.
@copilot merge both line, let bash expand both wildcards
| python3 "$SCRIPT" "$STAGE/raw-reports.xml" ${{ inputs.junit-paths }} | |
| python3 "$SCRIPT" "$STAGE/junit-merged.xml" ${{ inputs.junit-paths }} /tmp/junit-previous/*.xml | |
| python3 "$SCRIPT" "$STAGE/junit-merged.xml" ${{ inputs.junit-paths }} /tmp/junit-previous/*.xml |
There was a problem hiding this comment.
Done in 3b3e9df — removed the raw-reports.xml line. The download loop now fetches junit-merged.xml from each previous attempt (consistent with what's uploaded), so the merge step is a single call.
…ious attempts Co-authored-by: francoisferrand <3909027+francoisferrand@users.noreply.github.com>
raw-reports.xmlpython call — onlyjunit-merged.xmlis generated per attemptjunit-merged.xmlfrom previous attempts (consistent with what's now uploaded)python3line as suggested💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.