Skip to content

igvmfilegen: add diff subcommand for IGVM reproducibility investigations#3033

Open
justus-camp-microsoft wants to merge 1 commit intomicrosoft:mainfrom
justus-camp-microsoft:igvm_file_diff
Open

igvmfilegen: add diff subcommand for IGVM reproducibility investigations#3033
justus-camp-microsoft wants to merge 1 commit intomicrosoft:mainfrom
justus-camp-microsoft:igvm_file_diff

Conversation

@justus-camp-microsoft
Copy link
Contributor

This adds a diff command to the igvmfilegen tool to help with current and future investigations into issues with IGVM reproducibility. This tool uses the output IGVM .bin and .map files to deconstruct it into its individual parts and then passes them through diffoscope. The tool assumes you have diffoscope installed.

Disclaimer - this tool was mostly written by copilot but in my experience in using it to diagnose reproducibility issues it's been super useful.

@justus-camp-microsoft justus-camp-microsoft requested review from a team as code owners March 17, 2026 19:52
Copilot AI review requested due to automatic review settings March 17, 2026 19:52
@github-actions github-actions bot added the Guide label Mar 17, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an igvmfilegen diff subcommand to help investigate IGVM reproducibility issues by extracting IGVM contents into structured temp directories and invoking diffoscope on the extracted trees.

Changes:

  • Add Diff subcommand wiring to igvmfilegen CLI.
  • Implement IGVM extraction + region naming/coalescing + diffoscope invocation in a new diff module.
  • Document the new tool/subcommand in the Guide and link it from SUMMARY.md.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
vm/loader/igvmfilegen/src/main.rs Adds Diff subcommand to CLI and dispatches to the new diff implementation.
vm/loader/igvmfilegen/src/diff.rs New extraction/coalescing logic and diffoscope runner for comparing IGVM outputs.
vm/loader/igvmfilegen/Cargo.toml Adds tempfile dependency for managing extracted temp directories.
Guide/src/dev_guide/dev_tools/igvmfilegen.md New documentation page describing igvmfilegen diff usage and workflows.
Guide/src/SUMMARY.md Adds the new igvmfilegen doc page to the Guide table of contents.
Cargo.lock Locks tempfile as a dependency due to the new module.

Comment on lines +154 to +158
let component = lookup_map_name(map, *gpa).unwrap_or("unmapped").to_string();
// Deduplicate pages at the same GPA (different compatibility masks
// produce duplicate PageData entries with identical content).
if seen_gpas.insert(*gpa) {
page_data_entries.push(PageDataEntry {
Comment on lines +26 to +33
/// Parse an IGVM .map file, extracting all layout entries across all isolation sections.
/// Deduplicates by (start_gpa, end_gpa).
fn parse_map_file(path: &Path) -> anyhow::Result<Vec<MapEntry>> {
let content = fs_err::read_to_string(path).context("reading map file")?;
let mut entries: Vec<MapEntry> = Vec::new();
let mut seen: std::collections::HashSet<(u64, u64)> = std::collections::HashSet::new();
let mut in_layout = false;


//! Implements IGVM file diffing by extracting constituent parts and running diffoscope.

use anyhow::Context;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this isn't actually using any other parts of igvmfilegen maybe it should be in a separate crate. After all disassembling files is really the opposite of generating them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, was trying to avoid splitting out a new crate but maybe while I'm at it I can move the Dump command to whatever I name the new crate too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh i think it's fine to leave in here for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah I'd prefer Justus's idea, We should separate out igvm-creation and igvm-processing if they're not sharing any code (beyond the igvm crate itself)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants