Debugging: allow breakpoints to be set at "function start" by slipping forward to first opcode. #12791

Open
cfallin wants to merge 1 commit into bytecodealliance:main from cfallin:lets-make-breakpoints-a-little-fuzzier-please

@cfallin cfallin commented Mar 17, 2026

LLDB, when instructed to `break main`, looks at the DWARF metadata for `main` and finds its PC range, then sets a breakpoint at the first PC. This is reasonable behavior for native ISAs! That PC had better be a real instruction!

On Wasm, however, (i) toolchains typically emit the PC range as *including* the *locals count*, a LEB128 value that precedes both the local type declarations and the first opcode; and (ii) our gdbstub component that bridges LLDB to our debug APIs (#12771) only supports *exact* PCs for breakpoints, so when presented with a PC that does not actually point to an opcode, setting the breakpoint is effectively a no-op. There will always be a difference of at least 1 byte between the start-of-function offset and the first-opcode offset (a one-byte LEB128 `0` when there are no locals), so a breakpoint "on" a function will never work.
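To make the offset gap concrete, here is a minimal sketch (not code from this PR) that walks the locals header of a function body and returns the offset of the first opcode. The helper names `read_uleb128` and `first_opcode_offset` are hypothetical, the slice is assumed to start at the locals count, and each local group is assumed to use a one-byte value type, which does not hold for every Wasm proposal.

```rust
/// Decode an unsigned LEB128 value starting at `pos`.
/// Returns the value and the position just past it, or `None` if the
/// input is truncated or the encoding is too long for a u64.
fn read_uleb128(bytes: &[u8], mut pos: usize) -> Option<(u64, usize)> {
    let mut result: u64 = 0;
    let mut shift: u32 = 0;
    loop {
        let byte = *bytes.get(pos)?;
        pos += 1;
        if shift >= 64 {
            return None;
        }
        result |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((result, pos));
        }
        shift += 7;
    }
}

/// Offset of the first opcode, relative to the start of the function
/// body (hypothetical helper). `body` is assumed to begin at the locals
/// count. Assumption: every local group is `(count: uleb128, valtype: 1 byte)`.
fn first_opcode_offset(body: &[u8]) -> Option<usize> {
    let (groups, mut pos) = read_uleb128(body, 0)?;
    for _ in 0..groups {
        let (_count, next) = read_uleb128(body, pos)?;
        // Skip the one-byte value type that follows the group's count.
        pos = next.checked_add(1)?;
    }
    if pos < body.len() {
        Some(pos)
    } else {
        None
    }
}
```

For a body with no locals, such as the bytes `[0x00, 0x0b]`, this returns `Some(1)`: even in the smallest case the first opcode sits one byte past the function start.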

I initially prototyped a fix that adds a sequence point at the start of every function (which, again, is *guaranteed* to be distinct from the first opcode), and the branch is [here], but I didn't like the developer experience: when a breakpoint at a function start fired, LLDB was left in a weird interstitial state where no line number applied.

The behavior that would be more in line with "native" debug expectations is a bit of fuzzy-ish matching: setting a breakpoint at function start should break at the first opcode, even if that's a few (or many) bytes later. There are two options here: special-case function start, or generally change the semantics of our breakpoint API so that "add breakpoint at `pc`" means "add breakpoint at next opcode at or after `pc`". I opted for the latter in this PR because it's more consistent.
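As an illustration of the "next opcode at or after `pc`" semantics, here is a small sketch. It assumes the valid opcode PCs are available as a sorted set; the function name `snap_to_next_opcode` and the `u32` PC type are hypothetical and not taken from the PR.

```rust
use std::collections::BTreeSet;

/// Return the first valid opcode PC at or after `pc`, if any.
/// `opcode_pcs` is an assumed sorted set of every PC that starts an opcode.
fn snap_to_next_opcode(opcode_pcs: &BTreeSet<u32>, pc: u32) -> Option<u32> {
    // `range(pc..)` iterates in ascending order starting at the first
    // element >= pc, so the first item is the snapped target.
    opcode_pcs.range(pc..).next().copied()
}
```

With opcodes at `0x11`, `0x13`, and `0x15`, a request at `0x10` (say, a function start) and a request at `0x11` both resolve to `0x11`, while a request past the last opcode resolves to nothing.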

The logic is a little subtle because we're effectively defining an n-to-1 mapping with this "snap-to-next" behavior, so we have to refcount each breakpoint (consider setting a breakpoint at function start *and* at the first opcode, then deleting them one at a time). I believe the result is self-consistent, even if a little more complicated now. And, importantly, with #12771 on top of this change, it produces the expected behavior for the (very simple!) debug script "`b main`; `continue`".
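The refcounting scenario can be sketched as follows. This is a simplified model rather than the PR's implementation: the `Breakpoints` type and its methods are hypothetical, and a plain `u32` PC stands in for the real breakpoint key.

```rust
use std::collections::btree_map::Entry;
use std::collections::{BTreeMap, BTreeSet};

/// Simplified model: requested PCs snap forward to an opcode PC, and each
/// snapped PC carries a count of the requests that currently map to it.
struct Breakpoints {
    opcode_pcs: BTreeSet<u32>,
    counts: BTreeMap<u32, usize>,
}

impl Breakpoints {
    fn new(opcode_pcs: impl IntoIterator<Item = u32>) -> Self {
        Self {
            opcode_pcs: opcode_pcs.into_iter().collect(),
            counts: BTreeMap::new(),
        }
    }

    /// First opcode PC at or after `pc`.
    fn snap(&self, pc: u32) -> Option<u32> {
        self.opcode_pcs.range(pc..).next().copied()
    }

    /// Add a breakpoint; returns the PC it actually landed on.
    fn add(&mut self, pc: u32) -> Option<u32> {
        let actual = self.snap(pc)?;
        *self.counts.entry(actual).or_insert(0) += 1;
        Some(actual)
    }

    /// Remove one request; the breakpoint disappears only when the last
    /// request mapping to it is removed. Returns false if none was set.
    fn remove(&mut self, pc: u32) -> bool {
        let actual = match self.snap(pc) {
            Some(actual) => actual,
            None => return false,
        };
        match self.counts.entry(actual) {
            Entry::Occupied(mut entry) => {
                if *entry.get() > 1 {
                    *entry.get_mut() -= 1;
                } else {
                    entry.remove();
                }
                true
            }
            Entry::Vacant(_) => false,
        }
    }

    /// Whether execution should stop at this exact opcode PC.
    fn is_active(&self, actual_pc: u32) -> bool {
        self.counts.contains_key(&actual_pc)
    }
}
```

In this model, adding breakpoints at `0x10` (function start) and `0x11` (first opcode) yields a count of 2 at `0x11`; removing the first leaves the breakpoint active, and only removing the second clears it.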

[here]: https://github.com/cfallin/wasmtime/tree/breakpoint-at-func-start
@cfallin cfallin requested a review from a team as a code owner March 17, 2026 01:16
@cfallin cfallin requested review from alexcrichton and dicej and removed request for a team and dicej March 17, 2026 01:16
/// (possibly slipped-forward) breakpoint key to a reference
/// count. Multiple requested PCs may map to the same actual
/// breakpoint when they are slipped forward.
breakpoints: BTreeMap<BreakpointKey, usize>,

Technically this is required from an lldb/gdbstub perspective anyway, I think, right? In that if I set one breakpoint on a symbol and another on the same address, then remove one, the other should stay.

One option would be to return a bool in Wasmtime if a breakpoint is set and push the refcounting up to the gdbstub itself, but I think it's fine to live in wasmtime too.

@github-actions github-actions bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Mar 17, 2026