PatchedArray by a10y · Pull Request #27 · vortex-data/rfcs

a10y · 2026-03-06T15:42:16Z

Distilling some thoughts from the initial implementation work into an RFC so we can all get on the same page before we go any further.

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

a10y · 2026-03-06T16:08:34Z

proposed/0027-patches-format.md

+This relies on introducing a new encoding to represent exception patching, which would be a forward-compatibility break
+as is always the case when adding a new default encoding.


as is always the case when adding a new default encoding

This is not the purpose of this RFC, but just calling out that this is going to continue to be annoying. I see a few alternatives here:

All new encodings need to be gated behind a Writer flag so they are not written unless you explicitly opt-in. Then after some number of releases they can be enabled by default.

Come back around to the idea of distributing encodings as WASM binaries, seems unlikely to be picked up very widely

NEVER allow new encodings within a single "edition". We'd need to formalize what an edition means, how frequently we drop one, and how we maintain and test encodings on develop between edition releases.
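As a rough illustration of the first alternative (gating new encodings behind a Writer flag), here is a minimal sketch. `EncodingId`, `WriterOptions`, `with_encoding`, and `may_write` are all hypothetical names made up for this example and are not the actual Vortex writer API.

```rust
use std::collections::HashSet;

/// Hypothetical encoding identifiers; not the real Vortex encoding registry.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum EncodingId {
    /// An encoding that already-released readers understand.
    BitPacked,
    /// A newly added encoding, e.g. the one proposed in this RFC.
    Patched,
}

/// Hypothetical writer options: new encodings stay off unless opted in.
#[derive(Debug, Default)]
pub struct WriterOptions {
    opt_in: HashSet<EncodingId>,
}

impl WriterOptions {
    /// Explicitly enable a gated encoding for this writer.
    pub fn with_encoding(mut self, id: EncodingId) -> Self {
        self.opt_in.insert(id);
        self
    }

    /// Established encodings are always allowed; gated ones only when
    /// the caller has opted in.
    pub fn may_write(&self, id: EncodingId) -> bool {
        match id {
            EncodingId::BitPacked => true,
            EncodingId::Patched => self.opt_in.contains(&id),
        }
    }
}
```

Flipping a gated encoding to on-by-default after some number of releases would then amount to moving it into the always-allowed arm.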

joseph-isaacs · 2026-03-09T11:12:47Z

proposed/0027-patches-format.md

+
+## Summary
+
+Make a backwards compatible change to the serialization format for `Patches` used by the FastLanes-derived encodings:


Why limit this to FastLanes encodings? What about sparse arrays?

Mostly just because the whole "lanes" concept only maps cleanly to primitives.

I suppose this could help us write a data-parallel version of sparsearray though...

joseph-isaacs · 2026-03-09T11:13:19Z

proposed/0027-patches-format.md

+    pub(super) indices: BufferHandle,
+    /// patch values corresponding to the indices. The ptype is specified by `values_ptype`.
+    pub(super) values: BufferHandle,


Do we want these to be uncompressed and never compressed in the future?

If we assume that patches are only 0.5-1% of the overall array then I think compression is sort of superfluous, yea.

I disagree strongly here, you can always write decompressed arrays but will find it much harder to go the other way.

that's valid, maybe i'm underestimating how likely this is to change in the future. let me update the PR to hold child values and see how that works out

joseph-isaacs · 2026-03-09T11:14:09Z

proposed/0027-patches-format.md

+    pub(super) offset: usize,
+    /// Total length.
+    pub(super) len: usize,


Are there size bounds on these?

Not on len, but offset < 1024. I only used usize for indexing convenience.
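To make the bound concrete, here is a small sketch of a constructor that enforces `offset < 1024` while leaving `len` unbounded. `SliceMeta`, `CHUNK_LEN`, and `offset_u16` are hypothetical names for illustration; reading 1024 as a chunk size is an assumption based on the bound stated above.

```rust
/// Assumed chunk size, inferred from the `offset < 1024` bound in the thread.
pub const CHUNK_LEN: usize = 1024;

/// Hypothetical stand-in for the `offset` / `len` pair shown in the diff.
#[derive(Debug, PartialEq, Eq)]
pub struct SliceMeta {
    offset: usize,
    len: usize,
}

impl SliceMeta {
    /// Returns `None` when `offset` is out of bounds. `len` is unbounded.
    pub fn new(offset: usize, len: usize) -> Option<Self> {
        (offset < CHUNK_LEN).then_some(Self { offset, len })
    }

    /// Because `offset < 1024`, it always fits in a `u16` if a narrower
    /// on-disk representation is ever wanted.
    pub fn offset_u16(&self) -> u16 {
        self.offset as u16
    }

    /// Total length, stored as given.
    pub fn len(&self) -> usize {
        self.len
    }
}
```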

joseph-isaacs · 2026-03-09T11:15:55Z

proposed/0027-patches-format.md

+The PatchedArray holds buffer handles for the `lane_offsets` which provides chunk/lane-level random indexing
+into the patch `indices` and `values`, so these values can live equivalently in device or host memory.
+
+The only operation performed at planning time is slicing, which means that all of its reduce rules would run


what will you do here?
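For readers following along, here is one way the chunk/lane-level lookup through `lane_offsets` could look. This is a host-memory toy under stated assumptions: the offsets are assumed to be CSR-style (one entry per chunk/lane slot plus a trailing end marker), and `ToyPatches`, `n_lanes`, `lane_patches`, and the element types are hypothetical. None of this layout is confirmed by the RFC text quoted above.

```rust
/// Toy, host-memory model of the patch buffers. The layout is an assumption.
pub struct ToyPatches {
    /// Number of lanes per chunk (assumed fixed for the whole array).
    pub n_lanes: usize,
    /// Assumed CSR-style offsets: patches for chunk `c`, lane `l` occupy
    /// `lane_offsets[c * n_lanes + l] .. lane_offsets[c * n_lanes + l + 1]`.
    pub lane_offsets: Vec<u32>,
    /// Patch positions, grouped by (chunk, lane) slot.
    pub indices: Vec<u16>,
    /// Patch values, parallel to `indices`.
    pub values: Vec<u32>,
}

impl ToyPatches {
    /// Random access to the patches of a single (chunk, lane) slot
    /// without scanning any other slot.
    pub fn lane_patches(&self, chunk: usize, lane: usize) -> (&[u16], &[u32]) {
        let slot = chunk * self.n_lanes + lane;
        let start = self.lane_offsets[slot] as usize;
        let end = self.lane_offsets[slot + 1] as usize;
        (&self.indices[start..end], &self.values[start..end])
    }
}
```

Since each lookup only reads two adjacent offsets and then a contiguous range, the same access pattern would work whether the buffers sit in host or device memory.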

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

a10y · 2026-03-19T19:29:45Z

so here's an annoying thing: converting BitPackedArray that holds an Option<Patches> into a PatchedArray that has a BitPacked child means that we have a compat problem to solve.

The easiest way to solve this problem is to, on read, invert the BP w/patches into the new format. That way we can delete all of the code having to deal with interior Patches for BP. However, this forces you to do execution on the read path to load and transpose patch values/indices. Doing execution in the read path goes against our existing model.

Another alternative is that we copy-paste BitPacked and make a BitPackedV2 that does not have a patches child, and we write that one but continue to leave the existing BitPacked codepaths in place. The downside is that we have to maintain both forever.

gatesn · 2026-03-19T19:31:44Z

You could also impl VTable::execute for BitPacked array and use the first iteration to return ExecutionStep::Done(..) with the new inverted array

a10y · 2026-03-19T19:34:29Z

Oh interesting. That might work. I just want to be sure that we stop producing arrays with interior patches after we merge all of this
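A minimal sketch of the idea discussed above: the first execution step of a legacy bit-packed array with interior patches returns the inverted shape, so nothing downstream sees interior patches. Only the name `ExecutionStep::Done` comes from the thread; `Array`, `Patches`, and the free function `execute` are simplified, hypothetical stand-ins rather than the real `VTable::execute` signature, and only one `ExecutionStep` variant is modeled. A real conversion would presumably also transpose the patch indices and values into the new layout; this sketch moves them unchanged.

```rust
/// Hypothetical, simplified patch storage.
#[derive(Debug, Clone, PartialEq)]
pub struct Patches {
    pub indices: Vec<u32>,
    pub values: Vec<u32>,
}

/// Hypothetical stand-in for the array types in the thread.
#[derive(Debug, Clone, PartialEq)]
pub enum Array {
    /// Legacy shape: bit-packed data with optional interior patches.
    BitPacked { packed: Vec<u8>, patches: Option<Patches> },
    /// New shape: patches live on a wrapper around a patch-free child.
    Patched { child: Box<Array>, patches: Patches },
}

/// Only the `Done` variant named in the thread is modeled here.
#[derive(Debug, PartialEq)]
pub enum ExecutionStep {
    Done(Array),
}

/// First execution step: hoist interior patches out of a legacy array.
/// Arrays that are already in the new shape pass through untouched.
pub fn execute(array: Array) -> ExecutionStep {
    match array {
        Array::BitPacked { packed, patches: Some(patches) } => {
            ExecutionStep::Done(Array::Patched {
                child: Box::new(Array::BitPacked { packed, patches: None }),
                patches,
            })
        }
        other => ExecutionStep::Done(other),
    }
}
```

With this shape the rewrite happens lazily at execution time rather than on the read path, and running the step again on its own output is a no-op.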

a10y added 7 commits March 4, 2026 09:26

save

e8112f4

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

save

9d51946

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

rename

deb2e49

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

link

3419293

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

more

5fbb7df

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

ok

b2a9140

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

add header

c12dd5e

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

a10y requested review from gatesn and joseph-isaacs March 6, 2026 16:03

a10y commented Mar 6, 2026


joseph-isaacs reviewed Mar 9, 2026


a10y mentioned this pull request Mar 17, 2026

PatchedArray: basics and wiring vortex-data/vortex#7002

Open

update

5fb3cf8

Signed-off-by: Andrew Duffy <andrew@a10y.dev>

a10y marked this pull request as ready for review March 18, 2026 21:25



3 participants
