Debugging: add builtin gdbstub component. by cfallin · Pull Request #12771 · bytecodealliance/wasmtime

cfallin · 2026-03-12T22:45:44Z

This adds a debug component that makes use of the debug-main world defined in #12756 and serves the gdbstub protocol, with Wasm extensions, compatible with LLDB.

This component is built and included inside the Wasmtime binary, and is loaded using the lower-level -D debugger=... debug-main option; the user doesn't need to specify the .wasm adapter component. Instead, the user simply runs wasmtime run -g <PORT> program.wasm ... and Wasmtime will load and prepare to run program.wasm as the debuggee, waiting for a gdbstub connection on the given TCP port before continuing.

The workflow is:

$ wasmtime run -g 1234 program.wasm
[ wasmtime starts and waits for connection ]

$ /opt/wasi-sdk/bin/lldb  # use LLDB from wasi-sdk release 32 or later
(lldb) process connect --plugin wasm connect://localhost:1234
Process 1 stopped
* thread #1, stop reason = signal SIGTRAP
    frame #0: 0x40000000000001cc
->  0x40000000000001cc: unreachable
    0x40000000000001cd: end
    0x40000000000001ce: local.get 0
    0x40000000000001d0: call   13
(lldb) si
Process 1 stopped
* thread #1, stop reason = instruction step into
    frame #0: 0x4000000000000184
->  0x4000000000000184: block
    0x4000000000000186: block
    0x4000000000000188: global.get 1
    0x400000000000018e: i32.const 3664
[ ... ]

This makes use of the gdbstub third-party crate, into which I've upstreamed support for the Wasm extensions in daniel5151/gdbstub#188, daniel5151/gdbstub#189, daniel5151/gdbstub#190, and daniel5151/gdbstub#192. (I'll add vets as part of this PR.)

cfallin · 2026-03-12T22:46:31Z

This is stacked on top of #12756 until that one lands; only the last commit is new.

I haven't added end-to-end tests that spawn/interact with LLDB yet; depending on how that goes I might be able to include that here or might defer to another PR if that's OK.

This adds a debug component that makes use of the debug-main world defined in bytecodealliance#12756 and serves the gdbstub protocol, with Wasm extensions, compatible with LLDB. This component is built and included inside the Wasmtime binary, and is loaded using the lower-level `-D debugger=...` debug-main option; the user doesn't need to specify the `.wasm` adapter component. Instead, the user simply runs `wasmtime run -g <PORT> program.wasm ...` and Wasmtime will load and prepare to run `program.wasm` as the debuggee, waiting for a gdbstub connection on the given TCP port before continuing. The workflow is: ``` $ wasmtime run -g 1234 program.wasm [ wasmtime starts and waits for connection ] $ /opt/wasi-sdk/bin/lldb # use LLDB from wasi-sdk release 32 or later (lldb) process connect --plugin wasm connect://localhost:1234 Process 1 stopped * thread #1, stop reason = signal SIGTRAP frame #0: 0x40000000000001cc -> 0x40000000000001cc: unreachable 0x40000000000001cd: end 0x40000000000001ce: local.get 0 0x40000000000001d0: call 13 (lldb) si Process 1 stopped * thread #1, stop reason = instruction step into frame #0: 0x4000000000000184 -> 0x4000000000000184: block 0x4000000000000186: block 0x4000000000000188: global.get 1 0x400000000000018e: i32.const 3664 [ ... ] ``` This makes use of the `gdbstub` third-party crate, into which I've upstreamed support for the Wasm extensions in daniel5151/gdbstub#188, daniel5151/gdbstub#189, daniel5151/gdbstub#190, and daniel5151/gdbstub#192. (I'll add vets as part of this PR.)

cfallin · 2026-03-13T19:24:57Z

Rebased out #12756; should be good to review now.

github-actions · 2026-03-13T21:51:52Z

Subscribe to Label Action

cc @fitzgen

Details

This issue or pull request has been labeled: "wizer"

Thus the following users have been cc'd because of the following labels:

fitzgen: wizer

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

alexcrichton · 2026-03-16T16:17:38Z

crates/gdbstub-component/artifact/Cargo.toml

+authors.workspace = true
+edition.workspace = true
+license = "Apache-2.0 WITH LLVM-exception"
+publish = false


Hm ok here's a sticking point I didn't realize: this makes the wasmtime-cli crate non-publishable because there's a dependency on something that doesn't exist on crates.io. I'm also realizing now that it's not as simple as publishing this crate since it's fundamentally not publishable, it relies on being part of this workspace to build a sibling crate, which isn't present when built as a dep from crates.io.

The "true fix" for this is https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#artifact-dependencies, an unstable Cargo feature. Unfortunately we can't rely on that even in a nightly-conditional context I believe, if we tried to use that it would mean that Wasmtime would always require nightly.

Some possible ideas:

Check in the gdbstub binary and verify it's built in CI. I suspect it's a bit large and will receive many changes, so not my first choice.

Download gdbstub from github releases for released wasmtime-cli artifacts. We don't currently have HTTP/network dependencies in the CLI outside of WASI impls, so that'll be hard.

Publish this crate to crates.io, but handle failures in the build.rs script. That'd mean that the gdbstub binary would have to be an off-by-default optional feature which we enable for our release builds but would be off-by-default for crates.io.

Well, "some possible ideas" aka I think randomly in real time until something semi-reasonable pops out... Maybe that last one?

Oh, hmm, I definitely didn't foresee this one being a problem either!

Question on the last option: what do you mean by "handle failures in build.rs"; in other words, why would such a failure be any different than some other build failure for a crate on crates.io (some of which are e.g. crates that wrap C code and use cc, or do other build-time shenanigans)?

Sort of two things:

I forget what cwd is used for build.rs so I don't know what effect cargo build -p thing will have when something is published to crates.io. If that tries to build other random crates in a workspace, for example, I think that'd be bad.

I think this would ideally have a better error message than "package thing not found" when built from crates.io, aka something like "you can't enable the gdbstub component when wasmtime is built from crates.io" or similar.

Basically, yeah, build-time weirdness is expected, but I'd like to ideally tame it. Another example would be printing a better error if wasm32-wasip2 weren't installed, but rustc does a decent job of this already.

alexcrichton · 2026-03-16T16:18:44Z

crates/gdbstub-component/artifact/build.rs

+    let wasm = out_dir
+        .join("wasm32-wasip2")
+        .join("release")
+        .join("wasmtime_internal_gdbstub_component.wasm");


To gauge, how big is this binary? Are we talking ~30M or something more like ~1M? Given that this is included in the CLI uncompressed it might be reasonable to try to apply simple size optimizations where possible if it's extra large.

alexcrichton · 2026-03-16T16:19:24Z

crates/gdbstub-component/Cargo.toml

+[dependencies]
+wit-bindgen = { workspace = true, features = ["macros"] }
+anyhow = { workspace = true }
+structopt = { workspace = true }


Instead of pulling in structopt, can you use clap? They should be pretty similar and easy to transition between, but my impression is that structopt is more-or-less deprecated in favor of clap

alexcrichton · 2026-03-16T16:20:55Z

src/commands/run.rs

+                bail!("-g/--gdb cannot be combined with -Ddebugger=");
+            }
+            self.run.common.debug.debugger = Some("<built-in gdbstub>".into());
+            self.run.common.debug.arg.push(format!("0.0.0.0:{port}"));


To me defaulting to localhost feels more natural, but at the same time if doing so we should probably have the option to handle both 0.0.0.0 and localhost. Maybe the gdbstub_port option is renamed to just gdbstub, and then internally the wasm does parsing to figure out if it's a port or port-and-address?

alexcrichton · 2026-03-16T16:21:10Z

src/commands/run.rs

+            }
+            self.run.common.debug.debugger = Some("<built-in gdbstub>".into());
+            self.run.common.debug.arg.push(format!("0.0.0.0:{port}"));
+            Some(gdbstub_component_artifact::GDBSTUB_COMPONENT.to_vec())


To avoid .to_vec() on a big thing, could the field in RunCommand change to &'static [u8]?

alexcrichton · 2026-03-16T16:23:44Z

Also, to clarify, @cfallin what depth would you like me to review the gdbstub component code itself? I'm happy more-or-less not reviewing it at all in the sense that it's well-sequestered, low-risk, and we'll likely iterate a lot on it in-tree. If you'd prefer though I could give it a closer look in any particular areas of interest.

cfallin · 2026-03-16T17:20:09Z

Also, to clarify, @cfallin what depth would you like me to review the gdbstub component code itself? I'm happy more-or-less not reviewing it at all in the sense that it's well-sequestered, low-risk, and we'll likely iterate a lot on it in-tree. If you'd prefer though I could give it a closer look in any particular areas of interest.

I guess my default answer is "to whatever extent allows us to fulfill policy and be comfortable having this code in-repo" :-) I agree that since it's sandboxed, the bar could be lower than for core runtime code. I guess the spirit of our code-review policies is still that someone should give it a once-over -- but up to you how deep you take that!

alexcrichton · 2026-03-16T17:54:11Z

crates/gdbstub-component/src/lib.rs

+        let listener = TcpListener::bind(&self.options.tcp_address)
+            .await
+            .expect("Could not bind to TCP port");
+


I forget what QEMU does, but this might be a reasonable place to print "Hey I'm listening on address A.B.C.D:XXXX for a debugger" both to signify that's why the process is halted and also be a sort of "join point" for "if you're a test, now's when you can make your TCP connection" synchronization point.

alexcrichton · 2026-03-16T17:55:47Z

crates/gdbstub-component/src/lib.rs

+                    if inner.borrow_conn().flush().await.is_err() {
+                        // Connection closed or other outbound error.
+                        break 'mainloop;
+                    }
+


This may want to print something about an error perhaps? Otherwise the guest is still running and it may be "violently deleted" when dropping

alexcrichton · 2026-03-16T17:59:21Z

crates/gdbstub-component/src/lib.rs

+impl Connection for Conn {
+    type Error = anyhow::Error;
+
+    fn write(&mut self, byte: u8) -> std::result::Result<(), Self::Error> {
+        self.buf.push(byte);
+        Ok(())
+    }
+
+    fn flush(&mut self) -> std::result::Result<(), Self::Error> {
+        // We cannot flush synchronously; we leave this to the `async
+        // fn flush` method called within the main loop. Fortunately
+        // the gdbstub cannot wait for a response before returning to
+        // the main loop, so we cannot introduce any deadlocks by
+        // failing to flush synchronously here.
+        Ok(())
+    }
+}


Question for you: how come these aren't implemented with blocking I/O? My rough assumption is that blocking I/O is expected while gdbstub is doing stuff and the guest is halted, and the only async-y bit is "wait for I/O on the TCP connection or the guest to hit an event"

alexcrichton · 2026-03-16T18:01:20Z

crates/gdbstub-component/src/target.rs

+        Err(TargetError::NonFatal)
+    }
+
+    #[inline(always)]


stray debugging annotation?

alexcrichton · 2026-03-16T18:02:36Z

crates/gdbstub-component/src/target.rs

+    fn add_sw_breakpoint(&mut self, addr: u64, _kind: usize) -> TargetResult<bool, Self> {
+        let Some(wasm_addr) = WasmAddr::from_raw(addr) else {
+            return Ok(false);
+        };
+        let debuggee = self.debuggee;
+        if let AddrSpaceLookup::Module { module, .. } = self.addr_space.lookup(wasm_addr, debuggee)
+        {
+            module
+                .add_breakpoint(debuggee, wasm_addr.offset())
+                .map_err(|_| TargetError::NonFatal)?;
+            Ok(true)
+        } else {
+            Ok(false)
+        }
+    }
+
+    fn remove_sw_breakpoint(&mut self, addr: u64, _kind: usize) -> TargetResult<bool, Self> {
+        let Some(wasm_addr) = WasmAddr::from_raw(addr) else {
+            return Ok(false);
+        };
+        let debuggee = self.debuggee;
+        if let AddrSpaceLookup::Module { module, .. } = self.addr_space.lookup(wasm_addr, debuggee)
+        {
+            module
+                .remove_breakpoint(debuggee, wasm_addr.offset())
+                .map_err(|_| TargetError::NonFatal)?;
+            Ok(true)
+        } else {
+            Ok(false)
+        }
+    }


Could this perhaps support dual setting breakpoints on mapped addresses and seeing breakpoints on raw addresses in the code section?

…g forward to first opcode. LLDB, when instructed to `break main`, looks at the DWARF metadata for `main` and finds its PC range, then sets a breakpoint at the first PC. This is reasonable behavior for native ISAs! That PC better be a real instruction! On Wasm, however, (i) toolchains typically emit the PC range as *including* the *locals count*, a leb128 value that precedes the first opcode and any types of locals; (ii) our gdbstub component that bridges LLDB to our debug APIs (bytecodealliance#12771) only supports *exact* PCs for breakpoints, so when presented with a PC that does not actually point to an opcode, setting the breakpoint is effectively a no-op. There will always be a difference of at least 1 byte between the start-of-function offset and first-opcode offset (for a leb128 of `0` for no locals), so a breakpoint "on" a function will never work. I initially prototyped a fix that adds a sequence point at the start of every function (which, again, is *guaranteed* to be distinct from the first opcode), and the branch is [here], but I didn't like the developer experience: this meant that when a breakpoint at a function start fired, LLDB had a weird interstitial state where no line-number applied. The behavior that would be closer in line with "native" debug expectations is that we add a bit of fuzzy-ish matching: setting a breakpoint at function start should break at the first opcode, even if that's a few (or many) bytes later. There are two options here: special-case function start, or generally change the semantics of our breakpoint API so that "add breakpoint at `pc`" means "add breakpoint at next opcode at or after `pc`". I opted for the latter in this PR because it's more consistent. The logic is a little subtle because we're effectively defining an n-to-1 mapping with this "snap-to-next" behavior, so we have to refcount each breakpoint (consider setting a breakpoint at function start *and* at the first opcode, then deleting them, one at a time). I believe the result is self-consistent, even if a little more complicated now. And, importantly, with bytecodealliance#12771 on top of this change, it produces the expected behavior for the (very simple!) debug script "`b main`; `continue`". [here]: https://github.com/cfallin/wasmtime/tree/breakpoint-at-func-start

cfallin requested review from a team as code owners March 12, 2026 22:45

cfallin requested review from dicej and removed request for a team March 12, 2026 22:45

cfallin requested review from alexcrichton and removed request for dicej March 12, 2026 22:46

cfallin force-pushed the gdbstub-component branch from d7959df to fc1f75a Compare March 12, 2026 22:47

cfallin mentioned this pull request Mar 12, 2026

Debugging: add the debug-main world. #12756

Merged

cfallin force-pushed the gdbstub-component branch 6 times, most recently from 2719201 to 71bd19d Compare March 13, 2026 08:01

cfallin mentioned this pull request Mar 13, 2026

LLDB in latest release cannot read symbols in guest binary WebAssembly/wasi-sdk#612

Closed

cfallin force-pushed the gdbstub-component branch from 71bd19d to 34e9d51 Compare March 13, 2026 18:27

cfallin added 2 commits March 13, 2026 12:23

cargo vets.

c0c1f02

cfallin force-pushed the gdbstub-component branch from 34e9d51 to c0c1f02 Compare March 13, 2026 19:23

This was referenced Mar 13, 2026

Debugging: implement debug component support for wasmtime serve #12776

Open

Debugging: tracking issue for MVP and post-MVP work #12777

Open

github-actions bot added the wizer Issues related to Wizer snapshotting, pre-initialization, and the `wasmtime wizer` subcommand label Mar 13, 2026

alexcrichton reviewed Mar 16, 2026

View reviewed changes

cfallin mentioned this pull request Mar 17, 2026

Debugging: allow breakpoints to be set at "function start" by slipping forward to first opcode. #12791

Open

Handle Trap events as well as breakpoints.

91472fb

Conversation

cfallin commented Mar 12, 2026

Uh oh!

cfallin commented Mar 12, 2026

Uh oh!

cfallin commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 13, 2026

Subscribe to Label Action

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexcrichton commented Mar 16, 2026

Uh oh!

cfallin commented Mar 16, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants