client: add onchain reconciler to daemon with enable/disable CLI#3034
Open
590d528 to
988a52e
Compare
Pull request overview
This PR moves tunnel provisioning from the doublezero CLI into an onchain-driven reconciler loop inside the doublezerod daemon, adds runtime enable/disable controls, and updates status/reporting + tests/fixtures accordingly.
Changes:
- Add an onchain reconciler (polling + provisioning/removal) to `doublezerod`, with persisted enable/disable state and new `/enable`, `/disable`, and `/v2/status` endpoints.
- Update CLI flows (`connect`, `status`, plus new `enable`/`disable`) to interact with the reconciler rather than directly provisioning.
- Add a caching onchain fetcher and adjust tests/e2e fixtures to account for reconciler state and updated output.
Reviewed changes
Copilot reviewed 66 out of 66 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| rfcs/rfc17-client-onchain-reconciler.md | RFC documenting reconciler design, state, endpoints, and rollout plan |
| CHANGELOG.md | Changelog entry describing reconciler + CLI enable/disable |
| e2e/user_ban_test.go | Treat missing tunnel interface as successful route withdrawal |
| e2e/multicast_test.go | Update status fixture usage to templated reconciler-aware output |
| e2e/internal/fixtures/diff.go | Make CLI-table diff parsing robust to non-table preamble lines |
| e2e/ibrl_with_allocated_ip_test.go | Update status fixture usage to templated reconciler-aware output |
| e2e/ibrl_test.go | Update status fixture usage to templated reconciler-aware output |
| e2e/ibrl_enable_disable_test.go | New E2E test for connect → disable → enable → restart persistence lifecycle |
| e2e/fixtures/multicast/doublezero_status_disconnected.txt | Remove old disconnected status fixture |
| e2e/fixtures/multicast/doublezero_status_disconnected.tmpl | New reconciler-aware disconnected status fixture template |
| e2e/fixtures/multicast/doublezero_status_connected_subscriber.tmpl | Add “Reconciler” column to connected multicast subscriber fixture |
| e2e/fixtures/multicast/doublezero_status_connected_publisher.tmpl | Add “Reconciler” column to connected multicast publisher fixture |
| e2e/fixtures/ibrl_with_allocated_addr/doublezero_status_disconnected.txt | Remove old disconnected status fixture |
| e2e/fixtures/ibrl_with_allocated_addr/doublezero_status_disconnected.tmpl | New reconciler-aware disconnected status fixture template |
| e2e/fixtures/ibrl_with_allocated_addr/doublezero_status_connected.tmpl | Add “Reconciler” column to connected allocated-IP fixture |
| e2e/fixtures/ibrl/doublezero_status_disconnected.txt | Remove old disconnected status fixture |
| e2e/fixtures/ibrl/doublezero_status_disconnected.tmpl | New reconciler-aware disconnected status fixture template |
| e2e/fixtures/ibrl/doublezero_status_connected.tmpl | Add “Reconciler” column to connected IBRL fixture |
| client/doublezerod/cmd/doublezerod/main.go | Add flags for client IP, reconciler poll interval, and state dir; pass into runtime |
| client/doublezerod/internal/runtime/run.go | Wire reconciler + state migration + caching fetcher; update routes handler wiring; latency now uses fetcher |
| client/doublezerod/internal/runtime/run_test.go | Update runtime tests to new Run() signature; remove statefile recovery assertions |
| client/doublezerod/internal/runtime/clientip.go | New client IP auto-discovery (explicit → interfaces → ifconfig.me) |
| client/doublezerod/internal/runtime/clientip_test.go | Unit tests for public IP classification and explicit IP behavior |
| client/doublezerod/internal/reconciler/reconciler.go | New reconciler loop + enable/disable endpoints + /v2/status |
| client/doublezerod/internal/reconciler/state.go | Persisted reconciler enable/disable state + migration from old doublezerod.json |
| client/doublezerod/internal/reconciler/state_test.go | Unit tests for state load/write + migration behavior |
| client/doublezerod/internal/reconciler/metrics.go | Prometheus metrics for reconciler polls/provisions/removals/matched users |
| client/doublezerod/internal/onchain/fetcher.go | New TTL-based caching fetcher shared by reconciler and latency subsystem |
| client/doublezerod/internal/onchain/fetcher_test.go | Unit tests for caching behavior (TTL, stale-on-error, concurrency) |
| client/doublezerod/internal/manager/manager.go | Remove DB/statefile dependency; add ResolveTunnelSrc + GetProvisionedServices |
| client/doublezerod/internal/manager/http_test.go | Update manager HTTP tests for removed DB + new GetProvisionedServices |
| client/doublezerod/internal/manager/db.go | Remove old on-disk state DB implementation |
| client/doublezerod/internal/manager/db_test.go | Remove tests for deleted DB/statefile system |
| client/doublezerod/internal/manager/fixtures/doublezerod.*.json | Remove fixtures used exclusively for deleted DB/statefile behavior |
| client/doublezerod/internal/services/base.go | Remove DBReaderWriter interface (services now keep ProvisionRequest in memory) |
| client/doublezerod/internal/services/services_test.go | Update service creation tests after DB removal |
| client/doublezerod/internal/services/ibrl.go | Store ProvisionRequest in memory; drop DB usage; expose ProvisionRequest() |
| client/doublezerod/internal/services/edgefiltering.go | Store ProvisionRequest in memory; drop DB usage; expose ProvisionRequest() |
| client/doublezerod/internal/services/multicast.go | Store ProvisionRequest in memory; drop DB usage; expose ProvisionRequest() |
| client/doublezerod/internal/latency/smartcontract.go | Remove old direct smartcontract fetcher module (replaced by fetcher integration) |
| client/doublezerod/internal/latency/manager.go | Support injected Fetcher; adjust SmartContractFunc signature and fetch path |
| client/doublezerod/internal/latency/manager_test.go | Update tests for new smartcontract func signature and removed program ID options |
| client/doublezerod/internal/api/routes.go | Replace DBReader with ServiceStateReader (provisioned services now in manager) |
| client/doublezerod/internal/api/routes_test.go | Update routes tests to use ServiceStateReader mock |
| client/doublezero/src/servicecontroller.rs | Add v2 status + enable/disable calls; remove CLI provisioning/remove/resolve-route APIs |
| client/doublezero/src/routes.rs | Remove resolve_route command surface; keep routes retrieval |
| client/doublezero/src/main.rs | Allow enable/disable commands without version warning gate |
| client/doublezero/src/cli/command.rs | Add doublezero enable / doublezero disable commands |
| client/doublezero/src/command/mod.rs | Register enable/disable subcommands |
| client/doublezero/src/command/enable.rs | New enable command implementation + unit tests |
| client/doublezero/src/command/disable.rs | New disable command implementation + unit tests |
| client/doublezero/src/command/status.rs | Use /v2/status; surface reconciler state in output/table; synthesize disconnected row when empty |
| client/doublezero/src/command/connect.rs | Switch connect flow to best-effort enable reconciler + poll daemon status for provisioning |
| client/doublezero/src/command/disconnect.rs | Stop calling daemon /remove; poll daemon for deprovision completion after onchain user deletion |
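
The table describes `client/doublezerod/internal/onchain/fetcher.go` as a TTL-based caching fetcher whose tests cover TTL, stale-on-error, and concurrency. A minimal Go sketch of that pattern, under stated assumptions, might look like the following. `CachingFetcher`, `NewCachingFetcher`, `Get`, and `ErrNoCachedData` are hypothetical names chosen for illustration, not the PR's actual API, and serving stale data with a nil error is an assumption about the behavior.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrNoCachedData (hypothetical) is returned when a fetch fails and nothing
// has been cached yet, so there is no stale value to fall back on.
var ErrNoCachedData = errors.New("onchain fetch failed and no cached data is available")

// CachingFetcher (hypothetical) wraps an expensive fetch function, such as an
// RPC call for onchain data, and reuses the last result for up to ttl.
type CachingFetcher[T any] struct {
	mu        sync.Mutex
	fetch     func() (T, error)
	ttl       time.Duration
	value     T
	fetchedAt time.Time
	hasValue  bool
}

// NewCachingFetcher builds a fetcher around fetch with the given TTL.
func NewCachingFetcher[T any](fetch func() (T, error), ttl time.Duration) *CachingFetcher[T] {
	return &CachingFetcher[T]{fetch: fetch, ttl: ttl}
}

// Get returns the cached value while it is fresh. Otherwise it refetches.
// If the refetch fails and an older value exists, the older value is served
// (stale-on-error; an assumption in this sketch).
func (c *CachingFetcher[T]) Get() (T, error) {
	// Holding the lock across the fetch serializes concurrent callers, so
	// several subsystems sharing the fetcher trigger at most one fetch at a time.
	c.mu.Lock()
	defer c.mu.Unlock()

	if c.hasValue && time.Since(c.fetchedAt) < c.ttl {
		return c.value, nil
	}
	v, err := c.fetch()
	if err != nil {
		if c.hasValue {
			return c.value, nil
		}
		var zero T
		return zero, fmt.Errorf("%w: %v", ErrNoCachedData, err)
	}
	c.value, c.fetchedAt, c.hasValue = v, time.Now(), true
	return v, nil
}
```

Sharing one such fetcher between the reconciler and the latency subsystem, as the table suggests, would keep both on the same snapshot and avoid duplicate RPC calls within a TTL window.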
packethog
requested changes
Feb 19, 2026
5706744 to
0d5d872
Compare
martinsander00
approved these changes
Feb 25, 2026
Add a reconciliation loop to doublezerod that polls the DZ Ledger and automatically provisions/removes tunnels when users are activated or deactivated onchain. This replaces the CLI-driven /provision flow and the JSON state file crash recovery mechanism.

Key changes:
- Reconciler loop on NetlinkManager polls onchain state every 10s
- Persistent enabled/disabled state with migration from old state file
- Client IP auto-discovery via kernel default route with external fallback
- Caching onchain fetcher wrapping the serviceability SDK
- Drift detection re-provisions services when onchain state changes
- Enable/disable HTTP endpoints and v2 status endpoint
- Remove db.go state file, Recover() path, and DBReaderWriter interface
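
The core of one reconcile pass, as described above, is a comparison between the desired onchain state and what is currently provisioned. The sketch below is a simplified illustration of that diff in Go, not the PR's code: `Service`, `Plan`, and `Reconcile` are hypothetical names, and the `Service` fields stand in for whatever a real provision request carries.

```go
package main

import "sort"

// Service (hypothetical) is a simplified, comparable stand-in for the
// parameters of one provisioned tunnel.
type Service struct {
	UserType  string
	TunnelDst string
}

// Plan (hypothetical) lists the actions one reconcile pass would take,
// keyed by an assumed stable identifier for each onchain user.
type Plan struct {
	Provision   []string // desired onchain but not provisioned locally
	Reprovision []string // provisioned, but parameters drifted from onchain state
	Remove      []string // provisioned locally but no longer desired onchain
}

// Reconcile diffs desired (from onchain data) against provisioned (local state).
func Reconcile(desired, provisioned map[string]Service) Plan {
	var p Plan
	for id, want := range desired {
		have, ok := provisioned[id]
		switch {
		case !ok:
			p.Provision = append(p.Provision, id)
		case have != want:
			p.Reprovision = append(p.Reprovision, id)
		}
	}
	for id := range provisioned {
		if _, ok := desired[id]; !ok {
			p.Remove = append(p.Remove, id)
		}
	}
	// Map iteration order is random in Go; sort for deterministic output.
	sort.Strings(p.Provision)
	sort.Strings(p.Reprovision)
	sort.Strings(p.Remove)
	return p
}
```

Because the plan is computed purely from the two inputs, a daemon restart needs no separate recovery path in this sketch: an empty `provisioned` map simply yields a plan that provisions everything still desired onchain.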
Adapt the doublezero CLI to work with the daemon's onchain reconciler instead of directly driving tunnel provisioning.

- Add enable/disable commands to control the reconciler
- Simplify connect: create onchain user, enable reconciler, poll daemon status until tunnel appears (drops ~400 lines of provisioning logic)
- Simplify disconnect: delete onchain user and let reconciler tear down
- Rewrite status to use daemon v2 endpoint with reconciler state
- Deprecate --client-ip on CLI in favor of daemon flag
- Read client IP from daemon v2 status for onchain user creation
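
The new connect flow waits on the daemon instead of driving provisioning itself. The CLI in this PR is written in Rust; the Go sketch below only illustrates the poll-until-ready idea under stated assumptions. `WaitForTunnel` and `ErrNotProvisioned` are hypothetical names, and `check` stands in for a call to the daemon's status endpoint.

```go
package main

import (
	"errors"
	"time"
)

// ErrNotProvisioned (hypothetical) signals that the tunnel did not appear in time.
var ErrNotProvisioned = errors.New("tunnel not provisioned before the timeout")

// WaitForTunnel calls check every interval until it reports true, returns an
// error, or the timeout would be exceeded by the next wait.
func WaitForTunnel(timeout, interval time.Duration, check func() (bool, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		up, err := check()
		if err != nil {
			return err
		}
		if up {
			return nil
		}
		// Give up instead of sleeping past the deadline.
		if time.Now().Add(interval).After(deadline) {
			return ErrNotProvisioned
		}
		time.Sleep(interval)
	}
}
```

The same helper shape would fit disconnect, with `check` reporting true once the daemon no longer lists the tunnel.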
- Add TestE2E_IBRL_EnableDisable covering enable/disable lifecycle
- Update e2e fixtures for reconciler-driven status output
- Update existing e2e tests for reconciler flow (pass client-ip to daemon)
- Update CHANGELOG, DEVELOPMENT docs
0d5d872 to
5847e99
Compare
Summary of Changes
- Add a reconciliation loop to `doublezerod` that polls the DZ Ledger and automatically provisions/removes tunnels when users are activated or deactivated, replacing the CLI-driven `/provision` flow
- Remove the JSON state file (`doublezerod.json`) crash recovery mechanism; the daemon now re-provisions from onchain state on restart
- Add `doublezero enable` / `doublezero disable` CLI commands to control the reconciler, with `connect` implicitly enabling it
- Move client IP discovery into the daemon (`--client-ip` flag), deprecating `--client-ip` on the CLI
- Add an E2E test (`TestE2E_IBRL_EnableDisable`) covering the enable/disable lifecycle

RFC: rfcs/rfc17-client-onchain-reconciler.md
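
Client IP selection in the daemon is described elsewhere in this PR as explicit flag first, then local discovery, then an external fallback, with unit tests for public IP classification. A rough Go sketch of that ordering is shown here. `IsPublicIPv4`, `PickClientIP`, and `ErrNoClientIP` are hypothetical names, the exact set of excluded ranges is an assumption, and so is the choice to accept an explicit address without a public-range check.

```go
package main

import (
	"errors"
	"net/netip"
)

// ErrNoClientIP (hypothetical) is returned when no usable address is found.
var ErrNoClientIP = errors.New("no public IPv4 address found")

// cgnat is the shared address space 100.64.0.0/10, treated as non-public here.
var cgnat = netip.MustParsePrefix("100.64.0.0/10")

// IsPublicIPv4 reports whether s parses as an IPv4 address outside the
// private, loopback, link-local, multicast, unspecified, and CGNAT ranges.
func IsPublicIPv4(s string) bool {
	addr, err := netip.ParseAddr(s)
	if err != nil || !addr.Is4() {
		return false
	}
	if addr.IsPrivate() || addr.IsLoopback() || addr.IsLinkLocalUnicast() ||
		addr.IsMulticast() || addr.IsUnspecified() {
		return false
	}
	return !cgnat.Contains(addr)
}

// PickClientIP tries, in order: the explicit value, locally discovered
// addresses, and an optional external lookup (for example an HTTP service).
func PickClientIP(explicit string, local []string, external func() (string, error)) (string, error) {
	if explicit != "" {
		// Assumption: an explicit address only needs to parse.
		if _, err := netip.ParseAddr(explicit); err != nil {
			return "", err
		}
		return explicit, nil
	}
	for _, s := range local {
		if IsPublicIPv4(s) {
			return s, nil
		}
	}
	if external != nil {
		s, err := external()
		if err != nil {
			return "", err
		}
		if IsPublicIPv4(s) {
			return s, nil
		}
	}
	return "", ErrNoClientIP
}
```

Passing the external lookup as a function keeps the network call out of the selection logic, so the ordering can be unit tested without reaching any remote service.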
Diff Breakdown
~41% of the net additions are tests; core logic is roughly balanced between new reconciler code and removed CLI/state-file code.
Key files
- `client/doublezerod/internal/manager/reconciler_test.go`: comprehensive tests for reconciler provisioning, removal, drift detection, enable/disable lifecycle
- `client/doublezero/src/command/status.rs`: rewritten to use daemon v2 status endpoint, display reconciler state
- `client/doublezerod/internal/manager/manager.go`: reconciler loop, onchain data diffing, provision request building, state file removal
- `client/doublezero/src/command/connect.rs`: stripped provisioning logic; now enables reconciler and polls daemon for tunnel status
- `client/doublezerod/internal/manager/state.go`: persistent reconciler enabled/disabled state with migration from old state file
- `client/doublezero/src/command/disconnect.rs`: no longer sends `/remove` to daemon; reconciler handles teardown after onchain user deletion
- `client/doublezerod/internal/runtime/clientip.go`: client IP auto-discovery via kernel default route with external fallback
- `client/doublezerod/internal/onchain/fetcher.go`: caching fetcher wrapping the serviceability SDK for RPC calls

Testing Verification
- `TestE2E_IBRL_EnableDisable` validating the full enable → connect → disable → re-enable flow against a live devnet