Signed-off-by: cheyang <cheyang.cy@alibaba-inc.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED.
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request updates the Fluid project's roadmap from 2025 to 2026, outlining significant advancements in data access, mobility, and real-time availability. Key changes include the introduction of a strategic focus on LLM KV Cache orchestration, enhancements to the generic cache runtime and dynamic configuration, API upgrades, and improved developer tooling to streamline operations and performance.
Pull request overview
Updates the project roadmap document to outline Fluid’s 2026 priorities and planned initiatives across data access (“Anyway”), mobility (“Anywhere”), and workflow/DX (“Anytime”).
Changes:
- Rename the roadmap section from 2025 to 2026 and refresh the planned initiative list.
- Add new 2026 focus areas (e.g., LLM KV cache orchestration, dynamic runtime config, webhook validation, CLI plugin).
- Rework bullets and descriptions across the three roadmap pillars.
ROADMAP.md
Outdated
| - Recommend underutilized Pods for scaling (cost/performance-aware). | ||
| - Ensure cache engines adapt to dynamic throughput post-scaling. | ||
| - **Cloud-Agnostic Recovery**: Rebuild caches across regions using cloud disk snapshots. | ||
| - **Elastic Caching & Cost Optimization** |
This top-level list item appears to have an extra leading space before the "-", which can inadvertently nest it under the previous list in Markdown. Align it with other top-level bullets to keep the section structure consistent.
|  - **Elastic Caching & Cost Optimization** | |
| - **Elastic Caching & Cost Optimization** |
ROADMAP.md
Outdated
| • **Temporal Workflow Integration** | ||
| – **Kueue-Driven Pipelines**: Trigger training/inference jobs automatically upon DataLoad completion; automate post-job cache eviction and data migration. |
This section uses a Unicode bullet character ("•") instead of a Markdown list marker ("-"). Consider switching to "-" for consistency and to ensure proper list rendering across Markdown viewers.
ROADMAP.md
Outdated
| – **Fluid kubectl Plugin**_(P1)_: Native CLI extension (kubectl fluid) for: | ||
The priority tag markup is missing a separating space before the italic text ("Fluid kubectl Plugin(P1)"). Add a space so the emphasis renders reliably: "Fluid kubectl Plugin (P1): ..."
ROADMAP.md
Outdated
| – **Kubernetes Node-Specific Restrictions**_(P1)_: Support node selectors, affinity/anti-affinity, and taint tolerations for cache Pods. | ||
The priority tag markup again lacks a separating space before the italic text ("...(P1)"). Adding a space before "(P1)" will ensure consistent Markdown rendering.
ROADMAP.md
Outdated
| – **Fluid kubectl Plugin**_(P1)_: Native CLI extension (kubectl fluid) for: | ||
| - Dataset status inspection and health diagnostics. |
List item spacing is inconsistent ("- Dataset ..." has two spaces after the dash). Normalizing to a single space after list markers will keep Markdown rendering and linting consistent.
| -  Dataset status inspection and health diagnostics. | |
| - Dataset status inspection and health diagnostics. |
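The spacing fix suggested above can be applied mechanically. The following is a minimal sketch, not part of the PR: the helper name `single_space_after_marker` is hypothetical, and it assumes the only problem is two or more spaces between a list marker and the item text.

```python
import re

# A list marker (-, * or +) at the start of a line, optionally indented,
# followed by two or more spaces and then the item text.
_MARKER_GAP = re.compile(r"^(\s*[-*+]) {2,}(?=\S)")


def single_space_after_marker(line: str) -> str:
    """Collapse extra spaces after a list marker to one; leave other lines alone."""
    return _MARKER_GAP.sub(r"\1 ", line)


if __name__ == "__main__":
    print(single_space_after_marker("    -  Dataset status inspection and health diagnostics."))
```

Indentation before the marker is preserved, so nesting is unaffected.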
ROADMAP.md
Outdated
| - **Generic Cache Runtime** | ||
| – **Pluggable Architecture**: Standardized Cache Runtime Interface for rapid integration of new engines (CubeFS, Dragonfly, Vineyard) with minimal boilerplate. | ||
| – **AdvancedStatefulset-Based Orchestration**: Migrate from StatefulSet to InstanceSet for fine-grained Pod lifecycle management, ordered rollout, and enhanced failover capabilities. |
Markdown list formatting is inconsistent here: sub-items use a Unicode en dash ("–") instead of a standard Markdown list marker ("-") and indentation varies. This can render as plain text instead of nested bullets; consider using consistent "-" markers and indentation for all sub-bullets.
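As an illustration of the marker fix requested here, the sketch below rewrites a leading en dash or bullet character to a standard `-` marker while keeping the existing indentation. It is not code from the PR: the function name `normalize_markers` is hypothetical, and the set of characters handled is an assumption based on the two characters quoted in this review.

```python
import re

# Characters that look like bullets but are not Markdown list markers.
# Assumption: only the en dash and the bullet character occur in ROADMAP.md.
PSEUDO_MARKERS = "–•"

_PATTERN = re.compile(rf"^(\s*)[{PSEUDO_MARKERS}]\s+")


def normalize_markers(text: str) -> str:
    """Replace a leading en dash or bullet with '- ', line by line."""
    return "\n".join(_PATTERN.sub(r"\1- ", line) for line in text.splitlines())


if __name__ == "__main__":
    sample = "- **Generic Cache Runtime**\n  – **Pluggable Architecture**: ..."
    print(normalize_markers(sample))
```

Only a marker at the start of a line is touched, so an en dash used as punctuation inside a sentence is left alone. Note that the two automated reviews on this PR disagree on the target marker; this sketch follows the hyphen recommendation because `-` is a list marker in CommonMark and the en dash is not.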
ROADMAP.md
Outdated
| - Minimum container permission (remove the privileged permission of FUSE Pod) | ||
| - **Generic Cache Runtime** | ||
| – **Pluggable Architecture**: Standardized Cache Runtime Interface for rapid integration of new engines (CubeFS, Dragonfly, Vineyard) with minimal boilerplate. | ||
| – **AdvancedStatefulset-Based Orchestration**: Migrate from StatefulSet to InstanceSet for fine-grained Pod lifecycle management, ordered rollout, and enhanced failover capabilities. |
"AdvancedStatefulset-Based Orchestration" likely intends to reference the Kubernetes resource "StatefulSet" (capital S in the second word). Consider renaming to "Advanced StatefulSet-based Orchestration" (or similar) to match the canonical Kubernetes spelling and improve readability.
| – **AdvancedStatefulset-Based Orchestration**: Migrate from StatefulSet to InstanceSet for fine-grained Pod lifecycle management, ordered rollout, and enhanced failover capabilities. | |
| – **Advanced StatefulSet-based Orchestration**: Migrate from StatefulSet to InstanceSet for fine-grained Pod lifecycle management, ordered rollout, and enhanced failover capabilities. |
ROADMAP.md
Outdated
| – **Hot Parameter Swapping**: Runtime modification of cache engine configurations (e.g., Alluxio block size, Jindo worker threads) for traffic spike handling. | ||
| - **API upgradation to v1alpha2** |
"API upgradation" is non-idiomatic English; consider changing this heading to "API upgrade" / "API upgrade to v1alpha2" to improve clarity.
| - **API upgradation to v1alpha2** | |
| - **API upgrade to v1alpha2** |
ROADMAP.md
Outdated
| - **Validation Webhook** | ||
| – Admission-time CRD validation with auto-correction suggestions to prevent misconfigurations. | ||
| – Policy enforcement for resource quotas and security constraints. |
These lines have inconsistent indentation and use a Unicode en dash ("–") that won't be parsed as a list item by Markdown in many renderers. Using a consistent nested list marker (e.g., two spaces + "- ") will improve readability and ensure bullets render correctly.
ROADMAP.md
Outdated
| - **LLM KV Cache Orchestration**_(P0, New Strategic Focus)_ | ||
The emphasis markup is missing a separating space before the italicized priority tag (e.g., "...(P0...)"). Without the space, some Markdown renderers won't format this as intended. Consider adding a space: "... (P0, ...)."
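The missing-space issue raised in this and the earlier priority-tag comments follows a single pattern, bold text immediately followed by an underscore-italic tag, so it can be fixed with one substitution. This is a hedged sketch rather than anything from the PR: `space_before_tag` is a hypothetical helper, and it assumes every tag has the shape `_( ... )_` as in the quoted diff lines.

```python
import re

# Bold text directly followed by an italic, parenthesised tag,
# e.g. "**Fluid kubectl Plugin**_(P1)_".
_TAG = re.compile(r"(\*\*[^*\n]+\*\*)(_\([^)\n]*\)_)")


def space_before_tag(line: str) -> str:
    """Insert one space between the bold title and the italic tag."""
    return _TAG.sub(r"\1 \2", line)


if __name__ == "__main__":
    print(space_before_tag("- **LLM KV Cache Orchestration**_(P0, New Strategic Focus)_"))
```

Lines that already have the space do not match the pattern, so running the helper twice gives the same result as running it once.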
Code Review
This pull request updates the Fluid roadmap to 2026, introducing significant new objectives and features across the 'Data Anyway', 'Data Anywhere', and 'Data Anytime' sections. The changes provide a clear direction for the project. My review focuses on improving the clarity and consistency of the new roadmap document.
ROADMAP.md
Outdated
| – **Hot Parameter Swapping**: Runtime modification of cache engine configurations (e.g., Alluxio block size, Jindo worker threads) for traffic spike handling. | ||
| - **API upgradation to v1alpha2** |
| - **Distributed Prewarming**: Maximize bandwidth utilization for fast data loading. | ||
| - **Throttling Control**: Limit bandwidth usage during prewarming to avoid saturation. | ||
| - **Rsync Optimization**: Improve cross-region sync efficiency. |
The bullet points for these sub-items use a hyphen (-), which is inconsistent with the en dash (–) used for other sub-items in this document (e.g., lines 9-10). For consistency, please use an en dash.
| - **Distributed Prewarming**: Maximize bandwidth utilization for fast data loading. | |
| - **Throttling Control**: Limit bandwidth usage during prewarming to avoid saturation. | |
| - **Rsync Optimization**: Improve cross-region sync efficiency. | |
| – **Distributed Prewarming**: Maximize bandwidth utilization for fast data loading. | |
| – **Throttling Control**: Limit bandwidth usage during prewarming to avoid saturation. | |
| – **Rsync Optimization**: Improve cross-region sync efficiency. |
ROADMAP.md
Outdated
| - Recommend underutilized Pods for scaling (cost/performance-aware). | ||
| - Ensure cache engines adapt to dynamic throughput post-scaling. | ||
| - **Cloud-Agnostic Recovery**: Rebuild caches across regions using cloud disk snapshots. | ||
| - **Elastic Caching & Cost Optimization** |
ROADMAP.md
Outdated
| - **Dynamic Volume Mounting**: | ||
| - Support dynamic volume mounting capabilities for multi-cloud/hybrid-cloud scenarios. | ||
| - Enable dyanmic data mount operations in Python SDK. | ||
| • **Temporal Workflow Integration** |
ROADMAP.md
Outdated
| - Dataset status inspection and health diagnostics. | ||
| - On-demand prewarming triggering (kubectl fluid warmup). | ||
| - Cache performance profiling and bottleneck analysis. | ||
| - Runtime configuration hot-updates. |
The bullet points and indentation for these sub-items are inconsistent with the rest of the document. For consistency, please use an en dash (–) and adjust the indentation.
| - Dataset status inspection and health diagnostics. | |
| - On-demand prewarming triggering (kubectl fluid warmup). | |
| - Cache performance profiling and bottleneck analysis. | |
| - Runtime configuration hot-updates. | |
| – Dataset status inspection and health diagnostics. | |
| – On-demand prewarming triggering (kubectl fluid warmup). | |
| – Cache performance profiling and bottleneck analysis. | |
| – Runtime configuration hot-updates. |
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5672      +/-   ##
==========================================
- Coverage   61.05%   61.04%   -0.02%
==========================================
  Files         444      444
  Lines       30540    30540
==========================================
- Hits        18647    18643       -4
- Misses      10356    10360       +4
  Partials     1537     1537
Signed-off-by: cheyang <cheyang.cy@alibaba-inc.com>
Signed-off-by: cheyang <cheyang.cy@alibaba-inc.com>
Signed-off-by: cheyang <cheyang.cy@alibaba-inc.com>
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.
| – **Kueue-Driven Pipelines**: Trigger training/inference jobs automatically upon DataLoad completion; automate post-job cache eviction and data migration. | ||
| – **Event-Driven Policies**: Flexible metadata synchronization triggered by workload lifecycle events. | ||
| - **Developer Experience** | ||
| – **Fluid kubectl Plugin**: Native CLI extension (kubectl fluid) for: |
The sub-items under Temporal Workflow Integration / Developer Experience use a Unicode en-dash (–) for the first-level sub-bullets. Replace with standard Markdown list markers so these items render consistently with the - list used below.
| - **ThinRuntime Productization** | ||
| – Production-ready stability for large-scale deployments with **minimum container privileges** (eliminate privileged FUSE Pod requirements). |
This section has a header bullet (- **ThinRuntime Productization**) but the following line isn’t formatted as a nested list item (it starts with an en-dash and inconsistent indentation). Convert it into a proper nested bullet so it renders under the ThinRuntime item.
| – Production-ready stability for large-scale deployments with **minimum container privileges** (eliminate privileged FUSE Pod requirements). | |
| - Production-ready stability for large-scale deployments with **minimum container privileges** (eliminate privileged FUSE Pod requirements). |
| – **Disaggregated KV Cache**: Externalize vLLM/SGLang KV Cache to Fluid-managed distributed storage, enabling 10x+ throughput improvement for long-context inference. | ||
| – **Cross-Pod Cache Sharing**: Live migration of KV Cache between inference instances for preemptive scheduling and spot instance tolerance. | ||
| – **Mooncake Integration**: Official partnership for high-performance KV Cache backend with RDMA acceleration. |
The KV cache sub-items are prefixed with a Unicode en-dash (–) and inconsistent indentation. This won’t render as a nested list in Markdown; use standard list markers (-/*) and consistent indentation.
| – **Master Pod Crash Recovery**: Automatic re-setup and state reconstruction after cache master failure without data loss. | ||
| – **Metadata Persistence**: WAL-based metadata recovery for rapid failover. |
Same Markdown list issue here: the JindoRuntime HA sub-items use a Unicode en-dash (–) instead of a Markdown list marker, so they won’t render as bullets. Switch to - and indent as nested bullets.
| – **Master Pod Crash Recovery**: Automatic re-setup and state reconstruction after cache master failure without data loss. | |
| – **Metadata Persistence**: WAL-based metadata recovery for rapid failover. | |
| - **Master Pod Crash Recovery**: Automatic re-setup and state reconstruction after cache master failure without data loss. | |
| - **Metadata Persistence**: WAL-based metadata recovery for rapid failover. |



Ⅰ. Describe what this PR does
Ⅱ. Does this pull request fix one issue?
fixes #XXXX
Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.
Ⅳ. Describe how to verify it
Ⅴ. Special notes for reviews