[AURON #1889] Implement monotonically_increasing_id() function #1955
base: master
Pull request overview
Implements native support for Spark’s monotonically_increasing_id() as a non-deterministic physical expression in Auron, wiring it through the Spark shims, protobuf plan representation, and the Rust planner and execution engine.
Changes:
- Adds a `MonotonicallyIncreasingIdExprNode` to the protobuf `PhysicalExprNode` oneof and wires it through the Scala `ShimsImpl.convertMoreExprWithFallback`.
- Introduces `SparkMonotonicallyIncreasingIdExpr` in `datafusion-ext-exprs`, including unit tests that validate type, nullability, monotonicity, partition offsets, and partition separation.
- Extends the Rust `PhysicalPlanner` to build `SparkMonotonicallyIncreasingIdExpr` from the new protobuf expression type and exposes the module from `datafusion-ext-exprs`.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| spark-extension-shims-spark/src/main/scala/org/apache/spark/sql/auron/ShimsImpl.scala | Maps Spark’s MonotonicallyIncreasingID Catalyst expression to the new protobuf MonotonicIncreasingIdExprNode for native planning. |
| native-engine/datafusion-ext-exprs/src/spark_monotonically_increasing_id.rs | Implements the physical expression that generates 64-bit partition-scoped monotonically increasing IDs and adds unit tests for behavior. |
| native-engine/datafusion-ext-exprs/src/lib.rs | Exposes the new spark_monotonically_increasing_id module from the extension expressions crate. |
| native-engine/auron-planner/src/planner.rs | Deserializes the new protobuf MonotonicIncreasingIdExpr into SparkMonotonicallyIncreasingIdExpr during physical planning. |
| native-engine/auron-planner/proto/auron.proto | Extends the physical expression protobuf with MonotonicIncreasingIdExprNode and its field in PhysicalExprNode, and relocates SparkPartitionIdExprNode to avoid duplication. |
@cxzl25 can I get a re-review on this? Thanks!
Which issue does this PR close?
Closes #1889
Rationale for this change
Adds support for a non-deterministic function, as part of #1833.
What changes are included in this PR?
Implements native support for Spark's `monotonically_increasing_id()` function in Auron.
Functionality TL;DR:
The `monotonically_increasing_id()` function generates unique, monotonically increasing 64-bit integers across all partitions. Each partition generates IDs using the formula:
Are there any user-facing changes?
N/A
How was this patch tested?
Unit tests
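
For reviewers who want a concrete picture of the per-partition ID generation described in the TL;DR, here is a minimal sketch. It assumes the layout documented for Spark's `monotonically_increasing_id()`, where the partition ID occupies the upper 31 bits and the record number within the partition occupies the lower 33 bits. The names `ROW_BITS`, `monotonic_id`, and `IdGenerator` are hypothetical and are not taken from this PR's `SparkMonotonicallyIncreasingIdExpr`.

```rust
/// Low bits reserved for the per-partition row counter.
/// Assumption: this mirrors Spark's documented 31/33 bit split.
const ROW_BITS: u32 = 33;

/// Hypothetical helper: combine a partition index and a row position into one ID.
fn monotonic_id(partition_id: i32, row_in_partition: i64) -> i64 {
    ((partition_id as i64) << ROW_BITS) + row_in_partition
}

/// Hypothetical per-partition generator that hands out IDs batch by batch,
/// remembering how many rows it has already numbered.
struct IdGenerator {
    partition_id: i32,
    next_row: i64,
}

impl IdGenerator {
    fn new(partition_id: i32) -> Self {
        Self { partition_id, next_row: 0 }
    }

    /// Returns IDs for the next `num_rows` rows and advances the row counter.
    fn next_batch(&mut self, num_rows: usize) -> Vec<i64> {
        let start = self.next_row;
        self.next_row += num_rows as i64;
        (start..self.next_row)
            .map(|row| monotonic_id(self.partition_id, row))
            .collect()
    }
}
```

Under this assumed layout, IDs increase within a partition and never collide across partitions, but they are not consecutive across partitions: the first ID of partition 1 is 8589934592 (2^33), not the row count of partition 0.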