Skip to content

Conversation

@ariesdevil
Copy link
Contributor

@ariesdevil ariesdevil commented Dec 27, 2025

Why?

Struct fingerprint mismatch for enum fields

When computing struct version hash for cross-language compatibility, Java and C++ treat enum fields as nullable=true by default. However, Rust's proc-macro cannot determine at compile time whether a field type is an enum, causing fingerprint mismatch.

Cross-language Union serialization fails

Java's AbstractObjectSerializer expects to read type_id for non-final fields, but Rust/C++ skip type_id for Union fields per xlang spec. This caused Type id 104 not registered errors when Java tried to deserialize Rust-serialized Union data.

What does this PR do?

Rust changes:

  • Union-compatible enum handling for xlang mode
  • Fix field_need_write_type_info() to handle UNION TypeId

Java changes:

  • AbstractObjectSerializer: Skip type_id read for Union types
  • UnionSerializer: Add read()/write() delegating to xread()/xwrite()

Tests:

  • Add testUnionXlang for Rust enum <-> Java Union2 interoperability

Related issues

Does this PR introduce any user-facing change?

[ ] Does this PR introduce any public API change?
[x] Does this PR introduce any binary protocol compatibility change?
Note: Struct version hash for structs containing enum fields will now match Java/C++.

Benchmark

Others

It's really hard to determine this bug 😭...

@chaokunyang
Copy link
Collaborator

chaokunyang commented Dec 29, 2025

@ariesdevil Most of such bugs are caused by the type system differences between laungages.

I'm refactoring the whole xlang serialization to mark all fields not null by default, I think this refactor will resolve most such tricky issue.

here is the principle:

Java:

  1. All fields are not null except Optional
  2. All fields don't track ref by default

Rust

  1. All fields are not null except Option/RcWeak/ArcWeak
  2. Only Rc/Arc/RcWeak/ArcWeak track ref

Go

  1. all fields are not null by default, except pointer
  2. slice/map/interface are not null by default

C++

  1. All fields are not null except Optional, shared_ptr/unique_ptr are not null by default
  2. shared_ptr track ref by default

Python

  1. All fields are not null except Optional type hints
  2. All fields don't track ref by default

For native serialization, keep current nullable/ref principles for max compatibility

@ariesdevil ariesdevil force-pushed the feat/xlang-union-rust-java branch 2 times, most recently from 060bf8e to ef83261 Compare December 29, 2025 09:47
@ariesdevil
Copy link
Contributor Author

@chaokunyang Thanks for informing me of the changes. I updated the PR desc and some code, PTAL again.

@ariesdevil ariesdevil changed the title fix(rust): align struct fingerprint nullable flag for enum fields with Java/C++, and fix cross-language Union serialization fix(rust): align struct fingerprint nullable handling with xlang spec and fix Union xlang interoperability Dec 29, 2025
@chaokunyang
Copy link
Collaborator

Hi @ariesdevil , I finished xlang nullable refactor in #3093. It conflict with your pr, could you megre main branch and resolve the conflict? And since all fields are not null in rust by default except Optional, maybe the problem is more simple.

For enum, java and c++ will take it as not null by default now.

@ariesdevil ariesdevil force-pushed the feat/xlang-union-rust-java branch from ef83261 to 33f5fb6 Compare December 30, 2025 05:57
@ariesdevil ariesdevil changed the title fix(rust): align struct fingerprint nullable handling with xlang spec and fix Union xlang interoperability fix(rust): enable Union type cross-language serialization between Rust and Java Dec 30, 2025
@ariesdevil ariesdevil force-pushed the feat/xlang-union-rust-java branch 3 times, most recently from 4e2f868 to dc3fe7d Compare December 30, 2025 08:21
@ariesdevil ariesdevil force-pushed the feat/xlang-union-rust-java branch from dc3fe7d to 10a2913 Compare December 30, 2025 13:17

#[inline(always)]
fn fory_actual_type_id(type_id: u32, register_by_name: bool, compatible: bool) -> u32 {
fn fory_actual_type_id(type_id: u32, register_by_name: bool, compatible: bool, xlang: bool) -> u32 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this xlang param really necessary? could we avoid add this parameter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not in hotpath, add a new parameter is OK. Maybe we should remove this method in future, and handle type id idrectly in type_resolver.rs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's necessary, for union-compatible enum, xlang specific returned TypeId:

  • xlang=true -> return TypeId::UNION for cross language mode
  • xlang=false return enum TypeId otherwise

@ariesdevil ariesdevil force-pushed the feat/xlang-union-rust-java branch from 10a2913 to 5c8e84e Compare December 30, 2025 15:39
@ariesdevil ariesdevil force-pushed the feat/xlang-union-rust-java branch from 5c8e84e to c84bba7 Compare December 31, 2025 01:38
@ariesdevil ariesdevil force-pushed the feat/xlang-union-rust-java branch from c84bba7 to 05bce45 Compare December 31, 2025 01:59
Copy link
Collaborator

@chaokunyang chaokunyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chaokunyang chaokunyang merged commit d6f6461 into apache:main Dec 31, 2025
53 checks passed
@ariesdevil
Copy link
Contributor Author

Thanks for your patient review, I learned a lot.

@ariesdevil ariesdevil deleted the feat/xlang-union-rust-java branch December 31, 2025 05:54
@chaokunyang
Copy link
Collaborator

@ariesdevil #3107 refactored java ObjectSerializer, we don't need special code path for enum now

@ariesdevil
Copy link
Contributor Author

@ariesdevil #3107 refactored java ObjectSerializer, we don't need special code path for enum now

@chaokunyang Wonderful job, happy new year : )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants