Skip to content

Conversation

@mattermoran
Copy link

@mattermoran mattermoran commented Dec 25, 2025

Fixes #4930 from 2023
Closes #23902

Accessing .size or .exists() on a Bun.file(path) that does not yet exist caused the Blob to permanently cache its size as 0. This prevented subsequent reads (like .text(), .slice(), .size) from seeing the file's content after it was created as they relied on the stale cached size.

The fix modifies resolveSize in src/bun.js/webcore/Blob.zig to avoid setting this.size = 0 when resolveFileStat fails (seekable == null).

This way if the file is missing it remains unknown (max_size), getSize returns 0.
Any subsequent calls will keep retrying the stat until the file exists.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 25, 2025

Walkthrough

This pull request addresses file-backed blob size resolution in Bun by improving seekability state handling in Blob operations and adding a regression test to verify that calling exists() before write() does not corrupt subsequent write operations.

Changes

Cohort / File(s) Summary
Blob size resolution logic
src/bun.js/webcore/Blob.zig
Modified getSize and resolveSize methods to properly handle file-backed blobs with uncertain seekability (null state). Added early returns for non-seekable file-backed blobs and simplified conditions triggering size updates. Improves handling of max_size comparisons when seekable state is unknown.
Regression test for file operations
test/regression/issue/4930.test.ts
Added test verifying that Bun.file().exists() does not affect subsequent Bun.write() operations or result in corrupted file sizes. Test exercises the interaction between exists() caching and write() to prevent size poisoning.

Suggested reviewers

  • cirospaciari
  • taylordotfish

Pre-merge checks

✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed All code changes directly address the linked issue requirements: Blob.zig modifications prevent premature size caching when files don't exist [#4930, #22484, #23902], and the regression test verifies that exists() before write() doesn't poison subsequent operations.
Out of Scope Changes check ✅ Passed All changes are scoped to the identified problems: Blob.zig changes handle file stat resolution and size caching, while the regression test verifies the specific bug scenario. No unrelated modifications detected.
Title check ✅ Passed The title clearly summarizes the main fix: preventing size caching when files don't exist, which directly addresses the core issue in Blob.zig's resolveSize logic.
Description check ✅ Passed The PR description clearly explains the issue, the fix, and its behavior with sufficient technical detail about the changes to Blob.zig.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 699d8b1 and d0978b9.

📒 Files selected for processing (2)
  • src/bun.js/webcore/Blob.zig
  • test/js/bun/io/bun-write.test.js
🧰 Additional context used
📓 Path-based instructions (3)
test/**/*.test.{ts,js,jsx,tsx,mjs,cjs}

📄 CodeRabbit inference engine (test/CLAUDE.md)

test/**/*.test.{ts,js,jsx,tsx,mjs,cjs}: Use bun:test with files that end in *.test.{ts,js,jsx,tsx,mjs,cjs}
Do not write flaky tests. Never wait for time to pass in tests; always wait for the condition to be met instead of using an arbitrary amount of time
Never use hardcoded port numbers in tests. Always use port: 0 to get a random port
Prefer concurrent tests over sequential tests using test.concurrent or describe.concurrent when multiple tests spawn processes or write files, unless it's very difficult to make them concurrent
When spawning Bun processes in tests, use bunExe and bunEnv from harness to ensure the same build of Bun is used and debug logging is silenced
Use -e flag for single-file tests when spawning Bun processes
Use tempDir() from harness to create temporary directories with files for multi-file tests instead of creating files manually
Prefer async/await over callbacks in tests
When callbacks must be used and it's just a single callback, use Promise.withResolvers to create a promise that can be resolved or rejected from a callback
Do not set a timeout on tests. Bun already has timeouts
Use Buffer.alloc(count, fill).toString() instead of 'A'.repeat(count) to create repetitive strings in tests, as ''.repeat is very slow in debug JavaScriptCore builds
Use describe blocks for grouping related tests
Always use await using or using to ensure proper resource cleanup in tests for APIs like Bun.listen, Bun.connect, Bun.spawn, Bun.serve, etc
Always check exit codes and test error scenarios in error tests
Use describe.each() for parameterized tests
Use toMatchSnapshot() for snapshot testing
Use beforeAll(), afterEach(), beforeEach() for setup/teardown in tests
Track resources (servers, clients) in arrays for cleanup in afterEach()

Files:

  • test/js/bun/io/bun-write.test.js
src/**/*.zig

📄 CodeRabbit inference engine (src/CLAUDE.md)

src/**/*.zig: Private fields in Zig are fully supported using the # prefix: struct { #foo: u32 };
Use decl literals in Zig for declaration initialization: const decl: Decl = .{ .binding = 0, .value = 0 };
Prefer @import at the bottom of the file (auto formatter will move them automatically)

Files:

  • src/bun.js/webcore/Blob.zig
**/*.zig

📄 CodeRabbit inference engine (CLAUDE.md)

In Zig code, be careful with allocators and use defer for cleanup

Files:

  • src/bun.js/webcore/Blob.zig
🧠 Learnings (15)
📚 Learning: 2025-11-24T18:36:59.706Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: src/bun.js/bindings/v8/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:36:59.706Z
Learning: Applies to src/bun.js/bindings/v8/test/v8/v8.test.ts : Add corresponding test cases to test/v8/v8.test.ts using checkSameOutput() function to compare Node.js and Bun output

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/js/node/test/{parallel,sequential}/*.js : For test/js/node/test/{parallel,sequential}/*.js files without a .test extension, use `bun bd <file>` instead of `bun bd test <file>` since these expect exit code 0 and don't use bun's test runner

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Use `bun:test` with files that end in `*.test.{ts,js,jsx,tsx,mjs,cjs}`

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-24T18:37:11.466Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: src/js/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:11.466Z
Learning: Write JS builtins for Bun's Node.js compatibility and APIs, and run `bun bd` after changes

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to **/*.test.ts?(x) : Verify tests fail with `USE_SYSTEM_BUN=1 bun test <file>` and pass with `bun bd test <file>` - tests are invalid if they pass with USE_SYSTEM_BUN=1

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*-fixture.ts : Test files that spawn Bun processes should end in `*-fixture.ts` to identify them as test fixtures and not tests themselves

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Applies to **/*.test.ts?(x) : For multi-file tests, prefer `tempDir` and `Bun.spawn` over single-file tests

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Use `-e` flag for single-file tests when spawning Bun processes

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : When spawning Bun processes in tests, use `bunExe` and `bunEnv` from `harness` to ensure the same build of Bun is used and debug logging is silenced

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-12-16T00:21:32.179Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-16T00:21:32.179Z
Learning: Only push changes after running `bun bd test <file>` and ensuring tests pass

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-06T00:58:23.965Z
Learnt from: markovejnovic
Repo: oven-sh/bun PR: 24417
File: test/js/bun/spawn/spawn.test.ts:903-918
Timestamp: 2025-11-06T00:58:23.965Z
Learning: In Bun test files, `await using` with spawn() is appropriate for long-running processes that need guaranteed cleanup on scope exit or when explicitly testing disposal behavior. For short-lived processes that exit naturally (e.g., console.log scripts), the pattern `const proc = spawn(...); await proc.exited;` is standard and more common, as evidenced by 24 instances vs 4 `await using` instances in test/js/bun/spawn/spawn.test.ts.

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-24T18:37:30.259Z
Learnt from: CR
Repo: oven-sh/bun PR: 0
File: test/CLAUDE.md:0-0
Timestamp: 2025-11-24T18:37:30.259Z
Learning: Applies to test/**/*.test.{ts,js,jsx,tsx,mjs,cjs} : Always use `await using` or `using` to ensure proper resource cleanup in tests for APIs like Bun.listen, Bun.connect, Bun.spawn, Bun.serve, etc

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-11-08T04:06:33.198Z
Learnt from: Jarred-Sumner
Repo: oven-sh/bun PR: 24491
File: test/js/bun/transpiler/declare-global.test.ts:17-17
Timestamp: 2025-11-08T04:06:33.198Z
Learning: In Bun test files, `await using` with Bun.spawn() is the preferred pattern for spawned processes regardless of whether they are short-lived or long-running. Do not suggest replacing `await using proc = Bun.spawn(...)` with `const proc = Bun.spawn(...); await proc.exited;`.

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-10-08T13:48:02.430Z
Learnt from: Jarred-Sumner
Repo: oven-sh/bun PR: 23373
File: test/js/bun/tarball/extract.test.ts:107-111
Timestamp: 2025-10-08T13:48:02.430Z
Learning: In Bun's test runner, use `expect(async () => { await ... }).toThrow()` to assert async rejections. Unlike Jest/Vitest, Bun does not require `await expect(...).rejects.toThrow()` - the async function wrapper with `.toThrow()` is the correct pattern for async error assertions in Bun tests.

Applied to files:

  • test/js/bun/io/bun-write.test.js
📚 Learning: 2025-09-20T03:39:41.770Z
Learnt from: pfgithub
Repo: oven-sh/bun PR: 22534
File: test/regression/issue/21830.fixture.ts:14-63
Timestamp: 2025-09-20T03:39:41.770Z
Learning: Bun's test runner supports async describe callbacks, unlike Jest/Vitest where describe callbacks must be synchronous. The syntax `describe("name", async () => { ... })` is valid in Bun.

Applied to files:

  • test/js/bun/io/bun-write.test.js
🧬 Code graph analysis (1)
test/js/bun/io/bun-write.test.js (2)
test/js/bun/sqlite/sqlite.test.js (1)
  • tmpbase (9-9)
test/harness.ts (1)
  • tempDir (277-284)
🔇 Additional comments (3)
src/bun.js/webcore/Blob.zig (3)

3105-3107: LGTM: Proper handling of unknown file seekability.

This short-circuit correctly returns 0 when the file's seekability status is unknown (null) while keeping the internal size as max_size. This prevents caching 0 permanently while providing a sensible return value for non-existent files.

The logic ensures subsequent calls will retry stat'ing the file (via resolveSize) since the internal size remains max_size.


3134-3138: LGTM: Core fix preventing size cache poisoning.

This change is the heart of the fix for issue #4930. By returning early when seekable remains null after resolveFileStat, the code avoids setting this.size = 0 at line 3151, which was permanently caching the size as zero for non-existent files.

The logic:

  1. If seekable is unknown (null), call resolveFileStat to stat the file
  2. If stat fails (file doesn't exist), seekable remains null
  3. Return early, keeping this.size as max_size (unknown)
  4. Subsequent calls will retry until the file exists

This allows writes to succeed and size queries to reflect the actual file size once the file is created.


3141-3147: LGTM: More precise condition for size updates.

The condition change from checking seekable to checking max_size != Blob.max_size is a good improvement. This makes the logic more precise:

  • Before: Updated size when file was successfully stat'd (seekable known)
  • After: Updates size only when the file's max_size is actually known

This correctly handles edge cases:

  • Regular files: stat succeeds → max_size set to file size → condition true → size updated ✓
  • Pipes/sockets: stat succeeds → max_size may remain Blob.max_size → condition false → size calculation skipped ✓
  • Non-existent files: stat fails → early return at line 3137 → this code not reached ✓

The new condition aligns better with the actual data availability rather than just file type.

@mattermoran mattermoran force-pushed the fix/bun-file-cache-poisoning branch from d0978b9 to 6609d4c Compare December 26, 2025 11:06
@mattermoran mattermoran changed the title fix(blob): Prevent Bun.file size caching when file does not exist fix(fs): Prevent Bun.file size caching when file does not exist Dec 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BunFile .text() does not return correct content after BunFile.write() Calling BunFile.exists makes Bun.write write nothing

1 participant