[wrangler] Attempts to reduce remote e2e test flakiness #12896

Open
petebacondarwin wants to merge 11 commits into main from pbd/fix-remote-binding-flake
Conversation

@petebacondarwin
Contributor

@petebacondarwin petebacondarwin commented Mar 14, 2026

Several changes to reduce flakiness in the wrangler remote-binding and dev e2e tests:

Consistent waitFor/waitForLong helpers

  • Extract shared waitFor() and waitForLong() helpers that wrap vi.waitFor() with tuned defaults
    • waitFor(): 100ms interval, 5s timeout — for polling synchronous state (e.g. console output)
    • waitForLong(): 500ms interval, 10s timeout — for polling HTTP endpoints
  • Add ESLint rule to enforce using these helpers instead of bare vi.waitFor() in e2e tests
  • Migrate all e2e tests to use the shared helpers
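
As a rough illustration of the two helpers, the sketch below uses a hand-rolled poll() loop as a stand-in for vi.waitFor() so that it runs without Vitest. The interval and timeout defaults are the ones quoted above; the poll() function, the WaitOptions type and the overrides parameter are assumptions made for this example and may not match the helpers in the PR.

```typescript
type WaitOptions = { interval: number; timeout: number };

// Stand-in for vi.waitFor(): re-run `callback` until it stops throwing,
// or rethrow its last error once another wait would pass the deadline.
async function poll<T>(
	callback: () => T | Promise<T>,
	{ interval, timeout }: WaitOptions
): Promise<T> {
	const deadline = Date.now() + timeout;
	for (;;) {
		try {
			return await callback();
		} catch (error) {
			if (Date.now() + interval > deadline) {
				throw error;
			}
			await new Promise<void>((resolve) => setTimeout(resolve, interval));
		}
	}
}

// Defaults quoted in the PR description.
const SHORT: WaitOptions = { interval: 100, timeout: 5_000 };
const LONG: WaitOptions = { interval: 500, timeout: 10_000 };

// For cheap checks on local state, such as captured console output.
function waitFor<T>(
	callback: () => T | Promise<T>,
	overrides: Partial<WaitOptions> = {}
): Promise<T> {
	return poll(callback, { ...SHORT, ...overrides });
}

// For slower checks, such as polling an HTTP endpoint.
function waitForLong<T>(
	callback: () => T | Promise<T>,
	overrides: Partial<WaitOptions> = {}
): Promise<T> {
	return poll(callback, { ...LONG, ...overrides });
}
```

A test would typically pass an assertion as the callback, so the helper keeps retrying until the assertion stops throwing.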

Retry transient API failures in remote preview

  • Wrap createPreviewSession() and createWorkerPreview() in retryOnAPIFailure so transient 5xx errors are retried automatically (up to 3 attempts with linear backoff)
  • Add an optional abortSignal parameter to retryOnAPIFailure so backoff delays can be cancelled immediately when a new bundle arrives
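
The retry behaviour described above could look roughly like the following. This is a sketch under stated assumptions, not the retryOnAPIFailure implementation: retryWithBackoff, abortableSleep, isTransient and the stepMs option are hypothetical names, and "transient" is approximated here as an error object carrying a 5xx status property.

```typescript
const MAX_ATTEMPTS = 3;

// Assumption for this sketch: a transient failure carries a 5xx `status`.
function isTransient(error: unknown): boolean {
	const status = (error as { status?: unknown } | null)?.status;
	return typeof status === "number" && status >= 500 && status < 600;
}

// Sleep for `ms`, rejecting with the signal's reason as soon as it aborts.
function abortableSleep(ms: number, signal?: AbortSignal): Promise<void> {
	return new Promise<void>((resolve, reject) => {
		if (signal?.aborted) {
			reject(signal.reason);
			return;
		}
		const timer = setTimeout(() => {
			signal?.removeEventListener("abort", onAbort);
			resolve();
		}, ms);
		function onAbort() {
			clearTimeout(timer);
			reject(signal?.reason);
		}
		signal?.addEventListener("abort", onAbort, { once: true });
	});
}

// Up to MAX_ATTEMPTS calls with linear backoff: 0ms, then stepMs, ...
async function retryWithBackoff<T>(
	action: () => Promise<T>,
	{ signal, stepMs = 1_000 }: { signal?: AbortSignal; stepMs?: number } = {}
): Promise<T> {
	let backoff = 0;
	for (let attempt = 1; ; attempt++) {
		try {
			return await action();
		} catch (error) {
			if (!isTransient(error) || attempt >= MAX_ATTEMPTS) {
				throw error;
			}
			// An abort during the delay rejects here and ends the retry loop.
			await abortableSleep(backoff, signal);
			backoff += stepMs;
		}
	}
}
```

Passing the signal into the delay, rather than only into the request, is what lets a newly arrived bundle cancel a pending backoff instead of waiting it out.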

Increase e2e timeouts for remote preview and add per-request API timeout

  • startWorker: use waitForLong (10s) instead of waitFor (5s) for remote reload polling, matching the convention for HTTP endpoint polling
  • start-worker-remote-bindings: increase beforeAll deploy timeout from 35s to 60s for slow Windows CI
  • dev.test Workers+Assets: increase waitForReady/waitForReload from 15s to 30s since remote mode involves session creation + asset upload + bundle upload to edge-preview
  • create-worker-preview: add 30s per-request timeout via AbortSignal.any so a hung Cloudflare API response doesn't block the reload indefinitely
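
One way to combine a caller's abort signal with a per-request timeout is shown below. AbortSignal.any() and AbortSignal.timeout() are standard APIs; the combineWithTimeout name and its signature are assumptions for this sketch and are not taken from the PR.

```typescript
const REQUEST_TIMEOUT_MS = 30_000;

// Returns a signal that aborts when the caller's signal aborts or when the
// timeout elapses, whichever happens first.
function combineWithTimeout(
	signal: AbortSignal | undefined,
	ms: number = REQUEST_TIMEOUT_MS
): AbortSignal {
	const timeout = AbortSignal.timeout(ms);
	return signal ? AbortSignal.any([signal, timeout]) : timeout;
}
```

The combined signal can then be passed as the signal option of a fetch() call. A timeout surfaces as a DOMException named TimeoutError, while a caller-initiated abort keeps the caller's own abort reason, so the two cases remain distinguishable.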

Use shared preserve-e2e-* workers for remote-binding tests

  • Add ensureWorkerDeployed() helper to WranglerE2ETestHelper that checks whether a worker is already live (by fetching its devprod-testing7928.workers.dev URL) and only deploys if it returns 404
  • Migrate dev-remote-bindings.test.ts, start-worker-remote-bindings.test.ts, and remote-bindings-api.test.ts to use fixed preserve-e2e-wrangler-remote-worker / preserve-e2e-wrangler-remote-worker-alt names instead of deploying fresh workers with random names on every run
  • These workers persist across test runs and are excluded from the periodic e2e cleanup job by their preserve-e2e- prefix
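
The check-then-deploy flow can be sketched as follows. The function name, the Deps type and the injected fetchStatus/deploy callbacks are hypothetical and exist only to keep the example self-contained; the actual ensureWorkerDeployed() helper on WranglerE2ETestHelper may have a different signature.

```typescript
// Hypothetical dependencies, injected so the sketch needs no network access.
type Deps = {
	fetchStatus: (url: string) => Promise<number>;
	deploy: (workerName: string) => Promise<void>;
};

// Deploys the worker only when its workers.dev URL responds with 404.
// Resolves to true if a deploy was triggered, false otherwise.
async function ensureWorkerDeployedSketch(
	workerName: string,
	deps: Deps
): Promise<boolean> {
	const url = `https://${workerName}.devprod-testing7928.workers.dev`;
	const status = await deps.fetchStatus(url);
	if (status !== 404) {
		return false; // Already live (or failing for some other reason).
	}
	await deps.deploy(workerName);
	return true;
}
```

In a real test, fetchStatus could be backed by fetch(url).then((response) => response.status), and deploy by whatever the test helper uses to run the deploy command.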

  • Tests
    • Tests included/updated
    • Automated tests not possible - manual testing has been completed as follows:
    • Additional testing not necessary because:
  • Public documentation
    • Cloudflare docs PR(s):
    • Documentation not necessary because: test infrastructure and internal retry logic only

@changeset-bot

changeset-bot bot commented Mar 14, 2026

🦋 Changeset detected

Latest commit: 6413596

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@workers-devprod
Contributor

workers-devprod commented Mar 14, 2026

Codeowners approval required for this PR:

  • ✅ @cloudflare/wrangler

devin-ai-integration[bot]

This comment was marked as resolved.

@pkg-pr-new

pkg-pr-new bot commented Mar 14, 2026

create-cloudflare

npm i https://pkg.pr.new/create-cloudflare@12896

@cloudflare/kv-asset-handler

npm i https://pkg.pr.new/@cloudflare/kv-asset-handler@12896

miniflare

npm i https://pkg.pr.new/miniflare@12896

@cloudflare/pages-shared

npm i https://pkg.pr.new/@cloudflare/pages-shared@12896

@cloudflare/unenv-preset

npm i https://pkg.pr.new/@cloudflare/unenv-preset@12896

@cloudflare/vite-plugin

npm i https://pkg.pr.new/@cloudflare/vite-plugin@12896

@cloudflare/vitest-pool-workers

npm i https://pkg.pr.new/@cloudflare/vitest-pool-workers@12896

@cloudflare/workers-editor-shared

npm i https://pkg.pr.new/@cloudflare/workers-editor-shared@12896

wrangler

npm i https://pkg.pr.new/wrangler@12896

commit: 6413596

@petebacondarwin petebacondarwin force-pushed the pbd/fix-remote-binding-flake branch from bef408f to 3de2809 Compare March 14, 2026 11:12
@petebacondarwin petebacondarwin changed the title from "test: use vi.waitFor/vi.waitUntil to wait for logger output in e2e tests" to "consistent use of vi.waitFor/vi.waitUntil to wait for things in e2e tests" Mar 14, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@petebacondarwin petebacondarwin force-pushed the pbd/fix-remote-binding-flake branch from 1e6a215 to ea1dfd6 Compare March 16, 2026 10:20
@petebacondarwin petebacondarwin force-pushed the pbd/fix-remote-binding-flake branch from ea1dfd6 to 0471f8c Compare March 16, 2026 10:37
@petebacondarwin petebacondarwin changed the title from "consistent use of vi.waitFor/vi.waitUntil to wait for things in e2e tests" to "[wrangler] use consistent waitFor/waitForLong helpers in e2e tests" Mar 16, 2026
@petebacondarwin petebacondarwin force-pushed the pbd/fix-remote-binding-flake branch from 0471f8c to e1a9b31 Compare March 16, 2026 13:00
devin-ai-integration[bot]

This comment was marked as resolved.

@petebacondarwin petebacondarwin changed the title from "[wrangler] use consistent waitFor/waitForLong helpers in e2e tests" to "[wrangler] Attempts to reduce remote e2e test flakiness" Mar 17, 2026
@petebacondarwin petebacondarwin requested review from a team as code owners March 17, 2026 14:06
@github-actions
Contributor

github-actions bot commented Mar 17, 2026

✅ All changesets look good

@petebacondarwin petebacondarwin force-pushed the pbd/fix-remote-binding-flake branch 2 times, most recently from a6c37fc to 961354c Compare March 18, 2026 13:00
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

- Extract shared waitFor() and waitForLong() helpers wrapping vi.waitFor() with tuned defaults
- waitFor(): 100ms interval, 5s timeout, for polling synchronous state (e.g. console output)
- waitForLong(): 500ms interval, 10s timeout, for polling HTTP endpoints
- Add ESLint rule to enforce using these helpers instead of bare vi.waitFor() in e2e tests
- Migrate all e2e tests to use the shared helpers

Wrap createPreviewSession() and createWorkerPreview() in
retryOnAPIFailure so transient 5xx errors are retried automatically
(up to 3 attempts with linear backoff). Also add an optional
abortSignal parameter to retryOnAPIFailure so backoff delays can be
cancelled immediately when a new bundle arrives.

- startWorker: use waitForLong (10s) instead of waitFor (5s) for remote
  reload polling, matching the convention for HTTP endpoint polling
- start-worker-remote-bindings: increase beforeAll deploy timeout from
  35s to 60s for slow Windows CI
- dev.test Workers+Assets: increase waitForReady/waitForReload from 15s
  to 30s since remote mode involves session creation + asset upload +
  bundle upload to edge-preview
- create-worker-preview: add 30s per-request timeout via AbortSignal.any
  so a hung Cloudflare API response doesn't block the reload indefinitely

Reuse pre-deployed workers instead of deploying fresh ones on every test
run. The new ensureWorkerDeployed() helper checks whether a worker is
already live before deploying, and the preserve-e2e- prefix keeps the
workers excluded from periodic cleanup.

Thread the abort signal through to the fetchResult call inside
getWorkersDevSubdomain so it gets the same withTimeout protection
as the other API calls in createPreviewSession.

The backoff computation backoff + (MAX_ATTEMPTS - attempts) * 1000
was off-by-one: the computed delay was passed to the next recursive
call but that call would throw before ever sleeping on it. Replace
with a simple backoff + 1000 so the first retry is immediate (0ms)
and the second waits 1000ms, matching the documented intent.
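
The effect of the backoff change can be traced with a small simulation. It assumes, based on the description above, that each failed attempt sleeps for its current backoff and then recurses with the newly computed value, and that the final attempt throws without sleeping. MAX_ATTEMPTS = 3 is taken from the "up to 3 attempts" wording; delaysSlept and NextBackoff are hypothetical names.

```typescript
const MAX_ATTEMPTS = 3;

type NextBackoff = (backoff: number, attempts: number) => number;

// Records the delay slept before each retry. `attempts` counts down from
// MAX_ATTEMPTS; once only one attempt is left, a failure is thrown
// instead of slept on, so nothing is recorded for it.
function delaysSlept(next: NextBackoff): number[] {
	const slept: number[] = [];
	let backoff = 0;
	for (let attempts = MAX_ATTEMPTS; attempts > 1; attempts--) {
		slept.push(backoff);
		backoff = next(backoff, attempts);
	}
	return slept;
}

// Old formula: the 1000ms value is only produced for the last attempt,
// which never sleeps.
const before = delaysSlept((backoff, attempts) => backoff + (MAX_ATTEMPTS - attempts) * 1000);

// New formula: a plain linear step.
const after = delaysSlept((backoff) => backoff + 1000);

console.log(before); // [ 0, 0 ]
console.log(after); // [ 0, 1000 ]
```

Under these assumptions the old formula retried twice with no delay at all, whereas the new one gives the 0ms then 1000ms schedule described in the commit message.
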
@petebacondarwin petebacondarwin force-pushed the pbd/fix-remote-binding-flake branch from dfb8330 to 215ecda Compare March 18, 2026 16:33
devin-ai-integration[bot]

This comment was marked as resolved.

A timed-out API request (DOMException with name TimeoutError) is a
transient failure and should be retried, just like 5xx errors. User-
initiated aborts (AbortError) are still propagated immediately.
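
A minimal sketch of that classification is shown below. The classifyFailure name and the RetryDecision type are hypothetical, and the 5xx check on a status property is an assumption about how API errors are represented; only the TimeoutError/AbortError distinction comes from the description above.

```typescript
type RetryDecision = "retry" | "propagate";

function classifyFailure(error: unknown): RetryDecision {
	const fields =
		typeof error === "object" && error !== null
			? (error as { name?: unknown; status?: unknown })
			: {};
	// A timed-out request is transient, so it is retried.
	if (fields.name === "TimeoutError") {
		return "retry";
	}
	// A deliberate abort must surface immediately.
	if (fields.name === "AbortError") {
		return "propagate";
	}
	// Assumption: API errors expose an HTTP status; 5xx counts as transient.
	return typeof fields.status === "number" && fields.status >= 500
		? "retry"
		: "propagate";
}
```

Checking the error name rather than the error class keeps the two DOMException cases apart, since both timeouts and aborts are reported as DOMException instances.
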
devin-ai-integration[bot]

This comment was marked as resolved.

@github-project-automation github-project-automation bot moved this from Untriaged to Approved in workers-sdk Mar 19, 2026

Projects

Status: Approved

4 participants