Skip to content

Conversation

@psafont
Copy link
Member

@psafont psafont commented Dec 22, 2025

These device IDs are handled in an ad-hoc way, and none of the implementations follow the actual linux behaviour, replace all of them with an in imeplementation that uses the glibc macros to handle these.

Also change vhd_wrapper_tool to print better error messages when there's an error when getting the base VHD.

Reproduction steps:

  • Create a pool with 2 hosts
  • SSH into one of the hosts and run for i in $(seq 1 512) ; do tap-ctl allocate; done. This forces the minor device ID to be higher than 255 for the local VBD created.
  • Create a guest in the local SR of the host where the previous command was run, with the guest tools installed to support live migration.
  • Live-migrate the guest between the hosts:
xe vm-migrate live=true uuid=ff50e9b1-c155-ed39-a2cc-680f9acaada4 host=pau-xcp-01 vdi:f613c010-7c06-4288-b0db-d1bde01fe0ef=23df0f9c-a9d8-5756-c829-e3655a225224 remote-master=pau-xcp-00 remote-username=root remote-password=xxx

Performing a storage live migration. Your VM's VDIs will be migrated with the VM.
Will migrate to remote host: pau-xcp-01, using remote network: Pool-wide network associated with eth0. Here is the VDI mapping:
VDI f613c010-7c06-4288-b0db-d1bde01fe0ef -> SR 23df0f9c-a9d8-5756-c829-e3655a225224
The server failed to handle your request, due to an internal error. The given message may give details useful for debugging the problem.
message: Storage_error ([S(Internal_error);S(Storage_error ([S(Migration_mirror_fd_failure);S(VDI Not Attached)]))])

The migration succeeds when both hosts in the pool have the changes in this PR running.

Some {major; minor}

let encode_st_dev {major; minor} =
0
Copy link
Member Author

@psafont psafont Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 0 |^ is added to make ocamlformat behave and keep the first major term parenthesised and aligned. The same has been done in in decode

Copy link
Contributor

@last-genius last-genius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth explaining the motivation for this - the fact that we saw real-world minors > 256 failing to be handled by xapi properly, as it is unable to figure out the format of the backing file, and has to handle the VDI as raw instead.

@psafont psafont force-pushed the dev/pau/majmin branch 3 times, most recently from d037795 to 21cf444 Compare January 8, 2026 14:37
These device IDs are handled in an ad-hoc way, and none of the
implementations follow the actual linux behaviour.

The glibc-provided macros are used for the implementation, with a
pure-ocaml implementation that was useful to compare against while
implementing, which has been kept in the tests, to detect any
behavioural changes.

Because Unix.stat returns an `int` instead of an int64, the code does
not support all possible values for major supported by glibc. This
shouldn't be an issue in Linux since the device value is 32-bit wide.

Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
There were 5 ad-hoc implementations, they were all getting wrong results
for the minor because only the lower 8 bits were accounted for. Use the
unixext one which handles up to 32-bit-wide minors correctly.

Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
@psafont psafont marked this pull request as ready for review January 8, 2026 15:12
Previously, when a failure happened while getting a base vhd, a
non-descriptive error was printed. Because there can be many reasons
that can cause this code to fail, model the possible errors and print
them when needed.

Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
@psafont psafont enabled auto-merge January 8, 2026 15:14
@psafont psafont added this pull request to the merge queue Jan 8, 2026
Merged via the queue into xapi-project:master with commit 38ff2a7 Jan 8, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants