guide: document VMBus channel model #2977

Open
mattkur wants to merge 3 commits into microsoft:main from mattkur:guide/vmbus-channels
Contributor

@mattkur mattkur commented Mar 13, 2026

The VMBus channel model — identity tuples, ring buffer pairs, subchannels, target VP, and lifecycle — is foundational to understanding any VMBus device but was not documented in the Guide. VP index vs APIC ID vs Linux CPU number confusion comes up frequently in debugging.

Changes

  • New Guide page: architecture/vmbus/channels.md covering channel identity (OfferKey), ring buffer pairs (diagram), subchannel lifecycle (mermaid state diagram), target VP semantics, VP index vs CPU number vs APIC ID clarification, ring buffer queuing model, and key types table.
  • New VMBus Architecture section in SUMMARY.md.
  • Expanded rustdoc for VmbusDevice trait — max_subchannels() now documents that it's the device's upper bound (not current count) and how the framework uses it; retarget_vp() now documents when it's called and what devices should do.
  • Expanded rustdoc for ChannelControl — documents the enable_subchannels() error case and usage context (called from device protocol handlers like StorVSP's CREATE_SUB_CHANNELS).

Can be reviewed independently from the StorVSP channels (#2976) and CPU scheduling (#2975) PRs.

Comment on lines +24 to +35
│ VMBus Channel │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Incoming Ring │ │ Outgoing Ring │ │
│ │ (guest → host) │ │ (host → guest) │ │
│ └────────┬─────────┘ └────────┬─────────┘ │
│ │ │ │
│ GPADL-backed memory (guest-allocated) │
│ │
│ Signal: guest → host │ Signal: host → guest │
│ Target VP: set at open time │
└────────────────────────────────────────────────┘
Contributor Author

Nit:

  • put a box around GPADL-backed memory (guest-allocated)
  • The | | between Incoming Ring and Outgoing Ring don't line up

Why is this change being made?
- The VMBus channel model — identity tuples, ring buffer pairs,
  subchannels, target VP, and lifecycle — is foundational to
  understanding any VMBus device but was not documented in the Guide.
- VP index vs APIC ID vs Linux CPU number confusion comes up
  frequently in debugging.

What changed?
- New Guide page: `Guide/src/reference/architecture/vmbus/channels.md`
  covering channel identity (`OfferKey`), ring buffer pairs, subchannel
  lifecycle (mermaid state diagram), target VP semantics, VP index vs
  CPU number vs APIC ID clarification, ring buffer model, and key types.
- New `VMBus Architecture` section in SUMMARY.md.
- Expanded rustdoc for
  [`VmbusDevice`](https://openvmm.dev/rustdoc/linux/vmbus_channel/channel/trait.VmbusDevice.html)
  — `max_subchannels()` and `retarget_vp()` now document semantics and
  usage patterns.
- Expanded rustdoc for
  [`ChannelControl`](https://openvmm.dev/rustdoc/linux/vmbus_channel/channel/struct.ChannelControl.html)
  — documents error handling for `enable_subchannels()` and device
  protocol handler usage context.

How was the change tested?
- ✅ `cargo doc --no-deps -p vmbus_channel` — no warnings
- ✅ Guide cross-links verified

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mattkur mattkur force-pushed the guide/vmbus-channels branch from e4c7bcd to 14db471 Compare March 13, 2026 19:15
@mattkur mattkur marked this pull request as ready for review March 13, 2026 19:21
@mattkur mattkur requested review from a team as code owners March 13, 2026 19:21
Copilot AI review requested due to automatic review settings March 13, 2026 19:21
Contributor

Copilot AI left a comment

Pull request overview

Adds foundational documentation for the VMBus channel model (identity tuples, ring buffers, subchannels, target VP semantics, and lifecycle) to reduce recurring debugging confusion around VP/CPU/APIC identifiers, and aligns relevant rustdoc with the documented behavior.

Changes:

  • Adds a new Guide reference page describing the VMBus channel model (channels.md) and wires it into the Guide navigation.
  • Expands rustdoc on VmbusDevice and ChannelControl to clarify subchannel limits, VP retargeting semantics, and expected caller behavior.
  • Documents a concrete error-mapping expectation for enable_subchannels() over-limit requests.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| vm/devices/vmbus/vmbus_channel/src/channel.rs | Clarifies rustdoc around subchannel limits, subchannel enablement, and interrupt retargeting callbacks. |
| Guide/src/reference/architecture/vmbus/channels.md | New architecture doc page explaining VMBus channel/subchannel concepts and VP identifier mapping. |
| Guide/src/SUMMARY.md | Adds a new navigation section for VMBus architecture docs and links the new Channels page. |

@@ -140,8 +159,10 @@ impl ChannelControl {
///
/// If more than `count` subchannels are already enabled, this does nothing.
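For readers following the rustdoc line above, here is a minimal sketch of the semantics it describes: a request at or below the already-enabled count is a no-op, and a request above the device's `max_subchannels()` bound is rejected. `SubchannelState` and `TooManySubchannels` are hypothetical names invented for this illustration; they are not the types in `vmbus_channel`, and the real error type may differ.

```rust
/// Hypothetical bookkeeping for one primary channel (not the real
/// `ChannelControl`; names and fields are assumptions for illustration).
#[derive(Debug)]
pub struct SubchannelState {
    /// Upper bound the device reports, as `max_subchannels()` is described.
    max_subchannels: u16,
    /// Number of subchannels currently enabled.
    enabled: u16,
}

/// Hypothetical error returned for an over-limit request.
#[derive(Debug, PartialEq, Eq)]
pub struct TooManySubchannels {
    pub requested: u16,
    pub max: u16,
}

impl SubchannelState {
    pub fn new(max_subchannels: u16) -> Self {
        Self { max_subchannels, enabled: 0 }
    }

    /// Enables up to `count` subchannels.
    ///
    /// Fails if `count` exceeds the device's bound. If `count` or more
    /// subchannels are already enabled, the enabled count is left unchanged.
    pub fn enable_subchannels(&mut self, count: u16) -> Result<(), TooManySubchannels> {
        if count > self.max_subchannels {
            return Err(TooManySubchannels {
                requested: count,
                max: self.max_subchannels,
            });
        }
        // Never shrink: keep the larger of the current and requested counts.
        self.enabled = self.enabled.max(count);
        Ok(())
    }

    pub fn enabled(&self) -> u16 {
        self.enabled
    }
}
```

A protocol handler in this sketch would map the `Err` case to a failure status in its reply to the guest rather than panicking.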
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This is used when VPs come online/offline (e.g., CPU hot-remove) and
the guest needs to rebalance channel assignments.

### VP index, CPU number, and APIC ID
Contributor Author

@jstarks, @smalis-msft, @chris-oo, could you review this section? I am not confident in its correctness; it is based on what I have observed.

VMBus server at offer time. The channel's lifecycle is: **offered →
opened → active → closed**.

```text
Contributor

Does this diagram really add anything?

Contributor Author

Personally: I'm a visual person, so I like having the boxes with an incoming and outgoing ring. This helps me conceptualize VMBus channels as rings.
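As a complement to the diagram discussion, the lifecycle quoted in the snippet under review (offered → opened → active → closed) can also be read as a small state machine. The sketch below is only an illustration of that ordering; `ChannelState` and its methods are hypothetical and do not correspond to types in the VMBus crates.

```rust
/// Hypothetical lifecycle states, in the order named by the Guide text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChannelState {
    Offered,
    Opened,
    Active,
    Closed,
}

impl ChannelState {
    /// Returns the state that follows `self`, or `None` once closed.
    pub fn next(self) -> Option<ChannelState> {
        match self {
            ChannelState::Offered => Some(ChannelState::Opened),
            ChannelState::Opened => Some(ChannelState::Active),
            ChannelState::Active => Some(ChannelState::Closed),
            ChannelState::Closed => None,
        }
    }

    /// True only for the single forward step this sketch allows.
    pub fn can_transition_to(self, target: ChannelState) -> bool {
        self.next() == Some(target)
    }
}
```

The sketch deliberately models only the linear path from the Guide text; whether a real channel can be closed or reopened from other states is not something this thread establishes.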

This is used when VPs come online/offline (e.g., CPU hot-remove) and
the guest needs to rebalance channel assignments.

### VP index, CPU number, and APIC ID
Contributor

I feel like this could be its own page somewhere else with just a link from here. It's an important topic, and not VMBus-specific.

Contributor Author

Agreed. It probably makes sense to move it to the page I add in #2975, or even to its own section. I can refactor immediately after I land all these PRs to avoid too much structural churn. Do you mind confirming the correctness of this section as written?

Contributor

I think it's right but Chris should double check

Member

Talked with Matt offline: we assume that `vp_index` == Linux vCPU number via the boot shim; see #190 and `validate_vp_hw_ids` in `openhcl_boot`.

Member

I agree this VP/APIC ID/CPU discussion should not be in this file. But it looks good.
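To make the assumption discussed in this thread concrete, here is a rough sketch of the kind of ordering check it relies on: if CPUs are listed in the order Linux will bring them online, then the CPU at position `i` must have VP index `i`, and the VP index can then be used directly as the Linux CPU number. `BootCpu`, `check_vp_order`, and `linux_cpu_for_vp` are hypothetical names for this illustration. This is not the code of `validate_vp_hw_ids`, which may check different things, and the sketch returns an error where the Guide text says the real check panics.

```rust
/// Hypothetical record for one CPU, listed in the order Linux onlines CPUs.
#[derive(Debug, Clone, Copy)]
pub struct BootCpu {
    /// Hypervisor VP index.
    pub vp_index: u32,
    /// Architectural ID (for example an x86 APIC ID); not used for ordering.
    pub hw_id: u64,
}

/// Checks that the CPU at boot position `i` has VP index `i`.
pub fn check_vp_order(cpus: &[BootCpu]) -> Result<(), String> {
    for (position, cpu) in cpus.iter().enumerate() {
        if cpu.vp_index as usize != position {
            return Err(format!(
                "CPU at boot position {} has VP index {}",
                position, cpu.vp_index
            ));
        }
    }
    Ok(())
}

/// Under the checked assumption, the mapping is the identity, so no
/// separate lookup table is needed.
pub fn linux_cpu_for_vp(vp_index: u32) -> u32 {
    vp_index
}
```

Note that `hw_id` plays no part in the check: the ordering guarantee concerns VP indices only, which is why architectural IDs can stay sparse.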

Contributor

@will-j-wright will-j-wright left a comment

This all seems right to me, but @SvenGroot might want to take a read through for correctness.

ordering matches VP index ordering (panics if not), and controls the
CPU online sequence to maintain the mapping.

The APIC ID is a separate concept. On x86, the APIC ID may not
Member

I would probably reword this section to note that "Each platform has its own architectural way of describing CPUs, with x86 APIC IDs and MPIDR on AArch64. Note that these values cannot be assumed to map directly to VP index, as the physical or virtual topology of a system determines the values for these architectural identifiers."
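One way to picture the point in the suggested wording: because architectural IDs are determined by topology, code that needs one has to look it up rather than derive it from the VP index. The sketch below uses a hypothetical `VpTopology` table indexed by VP index; the type, its methods, and any ID values a caller supplies are assumptions for illustration, not an OpenVMM API.

```rust
/// Hypothetical table of architectural CPU IDs (x86 APIC ID or AArch64
/// MPIDR), indexed by VP index.
pub struct VpTopology {
    hw_ids: Vec<u64>,
}

impl VpTopology {
    pub fn new(hw_ids: Vec<u64>) -> Self {
        Self { hw_ids }
    }

    /// Looks up the architectural ID for a VP index.
    pub fn hw_id_for_vp(&self, vp_index: u32) -> Option<u64> {
        self.hw_ids.get(vp_index as usize).copied()
    }

    /// Reverse lookup: finds the VP index that owns an architectural ID.
    pub fn vp_for_hw_id(&self, hw_id: u64) -> Option<u32> {
        self.hw_ids
            .iter()
            .position(|&id| id == hw_id)
            .map(|index| index as u32)
    }
}
```

With made-up IDs such as `VpTopology::new(vec![0, 2, 4, 6])`, VP 1 has architectural ID 2, and ID 1 belongs to no VP at all, so treating the two numbering schemes as interchangeable would be wrong.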

```

In OpenHCL today, the VP index is used directly as the Linux CPU
number (`let cpu = vp.vp_index().index()`). This is a simplifying
Member

I would say "This mapping is guaranteed by openhcl_boot, as this mapping in a general-purpose guest is not guaranteed. This simplifies code throughout OpenHCL, as there is no need to maintain a separate mapping for Linux CPU number to hypervisor VP indices."

Comment on lines +154 to +155
the OpenHCL threadpool. In OpenVMM, it maps to a dedicated worker
thread (without physical CPU affinity). The strength of the targeting
Member

This OpenVMM bit will probably change soon. Hopefully I'll remember to update the Guide.

7 participants