Skip to content

PatchedArray#27

Open
a10y wants to merge 8 commits intodevelopfrom
aduffy/patches
Open

PatchedArray#27
a10y wants to merge 8 commits intodevelopfrom
aduffy/patches

Conversation

@a10y
Copy link
Contributor

@a10y a10y commented Mar 6, 2026

Distilling some thoughts from the initial implementation work into RFC so we can all get on the same page before we go any further.

a10y added 7 commits March 4, 2026 09:26
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y a10y requested review from gatesn and joseph-isaacs March 6, 2026 16:03
Comment on lines +16 to +17
This relies on introducing a new encoding to represent exception patching, which would be a forward-compatibility break
as is always the case when adding a new default encoding.
Copy link
Contributor Author

@a10y a10y Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as is always the case when adding a new default encoding

This is not the purpose of this RFC, but just calling out this is going to continue to be annoying, I see a few alternatives here

  1. All new encodings need to be gated behind a Writer flag so they are not written unless you explicitly opt-in. Then after some number of releases they can be enabled by default.
  2. Come back around to the idea of distributing encodings as WASM binaries, seems unlikely to be picked up very widely
  3. NEVER allow new encodings within a single "edition". We'd need to formalize what an edition means, how frequently we drop one, and how we maintain and test encodings on develop between edition releases.


## Summary

Make a backwards compatible change to the serialization format for `Patches` used by the FastLanes-derived encodings:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why limit to fastlanes encodings what about sparse arrays?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly just because the whole "lanes" concept only maps cleanly to primitives.

I suppose this could help us write a data-parallel version of sparsearray though...

Comment on lines +93 to +95
pub(super) indices: BufferHandle,
/// patch values corresponding to the indices. The ptype is specified by `values_ptype`.
pub(super) values: BufferHandle,
Copy link
Contributor

@joseph-isaacs joseph-isaacs Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want these to be uncompressed and never compressed in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we assume that patches are only 0.5-1% of the overall array then I think compression is sort of superfluous, yea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree strongly here, you can always write decompressed arrays but will find it much harder to go the other way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's valid, maybe i'm underestimating how likely this is to change in the future. let me update the PR to hold child values and see how that works out

Comment on lines +86 to +88
pub(super) offset: usize,
/// Total length.
pub(super) len: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these size bounds on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not on len, but offset < 1024. I just used usize just for indexing convenience

The PatchedArray holds buffer handles for the `lane_offsets` which provides chunk/lane-level random indexing
into the patch `indices` and `values`, so these values can live equivalently in device or host memory.

The only operation performed at planning time is slicing, which means that all of its reduce rules would run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will you do here?

Signed-off-by: Andrew Duffy <andrew@a10y.dev>
@a10y a10y marked this pull request as ready for review March 18, 2026 21:25
@a10y
Copy link
Contributor Author

a10y commented Mar 19, 2026

so here's an annoying thing: converting BitPackedArray that holds an Option<Patches> into a PatchedArray that has a BitPacked child means that we have a compat problem to solve.

The easiest way to solve this problem is to, on read, invert the BP w/patches into the new format. That way we can delete all of the code having to deal with interior Patches for BP. However, this forces you to do execution on the read path to load and tranpose patch values/indices. Doing execution in the read path goes against our existing model.

Another alternative is that we copy-paste BitPacked and make a BitPackedV2 that does not have a patches child, and we write that one but continue to leave the existing BitPacked codepaths in place. The downside is that we have to maintain both forever.

@gatesn
Copy link
Contributor

gatesn commented Mar 19, 2026

You could also impl VTable::execute for BitPacked array and use the first iteration to return ExecutionStep::Done(..) with the new inverted array

@a10y
Copy link
Contributor Author

a10y commented Mar 19, 2026

Oh interesting. That might work. I just want to be sure that we stop producing arrays with interior patches after we merge all of this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants