Skip to content

CoW Developer docs#1305

Open
jsturtevant wants to merge 2 commits intohyperlight-dev:mainfrom
jsturtevant:CoW-docs
Open

CoW Developer docs#1305
jsturtevant wants to merge 2 commits intohyperlight-dev:mainfrom
jsturtevant:CoW-docs

Conversation

@jsturtevant
Copy link
Contributor

@jsturtevant jsturtevant commented Mar 12, 2026

Part of #1261

Docs updated from @syntactically's original proposal 3d59400

Signed-off-by: James Sturtevant jsturtevant@gmail.com

syntactically and others added 2 commits March 12, 2026 15:09
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
Copy link
Member

@syntactically syntactically left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I have some comments here. Mostly fairly minor/nits, but a couple of slightly more significant things. Happy to do a draft as well for some of the ones where I've asked for subsections to be restructured, if that would be useful?

top of memory, is simply a large bag of blank pages: scratch memory
into which this VM can write.
physical address space, which are always populated: one, near the
bottom of memory (starting at GPA `0x1000`), is a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specific base address is sort of a nasty implementation detail that might go away at some point anyway. Morally, it doesn't matter where it is and "at the bottom" is really just a general location. I would rather not have this in the high-level docs because it feels like noise to me.

```
Guest Physical Address Space (GPA)

+-------------------------------+ MAX_GPA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is duplicating a diagram from src/hyperlight_host/src/mem/layout.rs. Maybe we can merge them. I'm ambivalent about whether the merged diagram ends up here or there.


+-------------------------------+ MAX_GPA
| Exn Stack, Bookkeeping |
| (scratch size, allocator |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than explicitly listing metadata/bookkeeping items, I would just in text reference the layout.rs files (primarily the hyperlight_common one) that have fairly detailed comments on each item along with its offsets.

+-------------------------------+
| Free Scratch Memory |
+-------------------------------+
| Output Data |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, the io data being here right now is sort of a stupid hack because we don't have the better phys alloc / buffer management from #1112. I would like to avoid having that in this diagram if possible and we can talk about the fact that that is the stopgap in the section below which I see was not fully written in the original commit (sorry).

ready supply of already-mapped scratch pages to use for replacement
page tables is needed. Currently, the guest accomplishes this by
keeping an identity mapping of the entire scratch memory around.
page tables is set up by the Host. The guest keeps a mapping of the entire scratch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand what this change is trying to say. I prefer the original wording I think.

to be able to take exceptions to it. The remainder of the top page of
the scratch memory is used for this.
to be able to take exceptions to it. The exception stack begins at
offset `0x20` from the top of the scratch region (below the metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I would avoid duplicating the actual offset here (too easy to get out of date) and just say below the metadata.

to be able to take exceptions to it. The exception stack begins at
offset `0x20` from the top of the scratch region (below the metadata
fields described above) and grows downward through the remainder of
the top page.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically I think it's actually 2 pages used for this now.

ignore because we only need 1-2 buffers inflight at a time. We can
emulate the current paradigm by recreating a new buffer out of the
free space in the original buffer on call, etc etc.
I/O buffers are statically allocated at the bottom of the scratch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rewrite this to be a little bit more high-level about how buffer management should work, plus details on how it is done right now (I believe there is also some discussion about buffer management in #1112).

+-------------------------------------------+ (scratch base)
```

The minimum scratch size (`min_scratch_size()`) accounts for these
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is a useful comment to have in this file. If it's useful to have the details of the min scratch size calculation duplicated in here (my first instinct would be to just link to the layout.rs file that has the calculation explained in detail), it should either be a general statement (e.g. "some architecture-specific processor structures that will be written during initialisation" or explicitly split out for each supported architecture).

collisions between the guest stack and the scratch physical page
allocator.
In the current startup path, the host enters the guest with
`RSP` pointing to the exception stack. Early guest init then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please retain the less-architecture-dependent language of "the stack pointer" rather than "rsp", and consider whether or not it makes sense to retain any useful points/implications of the note about exception stack overflows being difficult to detect (thanks for the rest of the rewrite of this section, which is good).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants