client: handle existing IBRL tunnel on reconnect #3066

Closed
martinsander00 wants to merge 2 commits into main from ms/fix-dz-connect-error
@martinsander00 martinsander00 commented Feb 20, 2026

Summary

  • doublezero connect ibrl previously reported a false success when the daemon already had a unicast service in memory (e.g. recovered from the state file after a daemon restart or reboot): it printed ❌ Error provisioning service: unicast service already provisioned, then still printed ✅ User Provisioned, leaving the user with a broken tunnel
  • The connect command now checks daemon status before provisioning: if a unicast service is already active (BGP Session Up), it exits successfully without re-provisioning; if the service exists but is not active (e.g. Network Unreachable, BGP Session Down), it returns a clear error directing the user to run doublezero disconnect ibrl first
  • doublezero disconnect now cleans up stale daemon services even when the onchain user account was deleted before disconnect ran (e.g. in a ban/unban flow where the account is deleted as the unban mechanism). Previously, the daemon's tunnel service was silently left behind, so a subsequent connect ibrl failed with "already provisioned", which in turn caused the e2e test TestE2E_UserBan/user-ban-ibrl to fail
  • Two unit tests cover both connect cases: an already-active tunnel skips provisioning, and a stale tunnel returns an actionable error
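
The connect-side check described above can be sketched as a small pure function. This is an illustration only, not the code in this PR: `ConnectAction` and `decide_connect` are hypothetical names, and the daemon status is modelled as an optional string holding values such as "BGP Session Up" or "Network Unreachable", which is an assumption about how the status is exposed.

```rust
// Hypothetical sketch: these names are not taken from the doublezero codebase.

/// What `connect ibrl` should do once the daemon status is known.
#[derive(Debug, PartialEq)]
enum ConnectAction {
    /// A unicast service exists and its BGP session is up: exit successfully.
    AlreadyActive,
    /// The daemon has no unicast service: provision as usual.
    Provision,
}

/// `unicast_status` is `None` when the daemon holds no unicast service,
/// otherwise the session status string it reports (assumed representation).
fn decide_connect(unicast_status: Option<&str>) -> Result<ConnectAction, String> {
    match unicast_status {
        None => Ok(ConnectAction::Provision),
        Some("BGP Session Up") => Ok(ConnectAction::AlreadyActive),
        // Any other status means a stale tunnel: fail with an actionable message
        // instead of attempting to provision on top of it.
        Some(other) => Err(format!(
            "existing IBRL tunnel is not active (status: {other}); \
             run `doublezero disconnect ibrl` first"
        )),
    }
}
```

Keeping the decision separate from the provisioning call makes the three outcomes easy to test without a running daemon.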

Testing Verification

  • test_connect_ibrl_already_active_skips_provision: verifies provisioning is not called when BGP is up (mockall panics on unexpected provisioning() call)
  • test_connect_ibrl_stale_tunnel_returns_error: verifies the error message includes the current status and the disconnect command
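
The shape of these two tests can be approximated without mockall by using a hand-written fake whose `provision()` panics, which mirrors how an unexpected call on a mock fails the test. Everything below is a hypothetical stand-in (the `DaemonClient` trait, `StrictDaemon`, and `connect_ibrl_sketch` are not the PR's real types), so treat it as a sketch of the testing idea rather than the actual test code.

```rust
// Hypothetical stand-ins; the real tests in this PR use mockall.

trait DaemonClient {
    /// Status of the unicast service, or `None` if the daemon has none (assumed shape).
    fn unicast_status(&self) -> Option<String>;
    fn provision(&self) -> Result<(), String>;
}

/// A fake daemon that reports a fixed status and treats any provision call as a bug.
struct StrictDaemon {
    status: Option<&'static str>,
}

impl DaemonClient for StrictDaemon {
    fn unicast_status(&self) -> Option<String> {
        self.status.map(|s| s.to_string())
    }
    fn provision(&self) -> Result<(), String> {
        panic!("unexpected provision() call")
    }
}

fn connect_ibrl_sketch(daemon: &impl DaemonClient) -> Result<&'static str, String> {
    match daemon.unicast_status().as_deref() {
        Some("BGP Session Up") => Ok("already connected"),
        Some(other) => Err(format!(
            "tunnel exists with status '{other}'; run `doublezero disconnect ibrl` first"
        )),
        None => {
            daemon.provision()?;
            Ok("provisioned")
        }
    }
}

#[test]
fn already_active_skips_provision() {
    // Would panic inside StrictDaemon::provision if provisioning were attempted.
    let daemon = StrictDaemon { status: Some("BGP Session Up") };
    assert_eq!(connect_ibrl_sketch(&daemon), Ok("already connected"));
}

#[test]
fn stale_tunnel_returns_actionable_error() {
    let daemon = StrictDaemon { status: Some("BGP Session Down") };
    let err = connect_ibrl_sketch(&daemon).unwrap_err();
    assert!(err.contains("BGP Session Down"));
    assert!(err.contains("doublezero disconnect ibrl"));
}
```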

When an IBRL user is already activated onchain and the daemon has a
unicast service in memory (e.g. recovered from state file after restart),
connect now checks the daemon status before attempting to provision:
- BGP Session Up → exit successfully, tunnel is already active
- Any other status → return a clear error directing the user to run
  `doublezero disconnect ibrl` first

Previously, the provision call would fail silently and the command would
print "✅ User Provisioned" despite the tunnel never being set up.
…y deleted

When the onchain user account was deleted before disconnect ran (e.g. in a
ban/unban flow where the account is deleted as the unban mechanism), list_user
returned no results and controller.remove() was never called, leaving a stale
daemon service behind. A subsequent connect ibrl would then hit the "already
provisioned" guard and fail.

disconnect now falls back to querying the daemon directly via controller.status()
and removes any matching stale services when no onchain user is found.
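
A minimal sketch of this fallback, under stated assumptions: the commit message names `controller.status()` and `controller.remove()`, but their signatures here are guesses, and `DaemonService`, `InMemoryController`, `DisconnectOutcome`, and `disconnect_sketch` are hypothetical names used only for illustration.

```rust
// Hypothetical sketch: method signatures and type names are assumptions.

#[derive(Debug, Clone, PartialEq)]
struct DaemonService {
    kind: String, // e.g. "IBRL" (assumed label for the service type)
}

trait Controller {
    fn status(&self) -> Vec<DaemonService>;
    fn remove(&mut self, kind: &str) -> Result<(), String>;
}

/// In-memory controller used only to exercise the sketch.
struct InMemoryController {
    services: Vec<DaemonService>,
}

impl Controller for InMemoryController {
    fn status(&self) -> Vec<DaemonService> {
        self.services.clone()
    }
    fn remove(&mut self, kind: &str) -> Result<(), String> {
        self.services.retain(|s| s.kind.as_str() != kind);
        Ok(())
    }
}

#[derive(Debug, PartialEq)]
enum DisconnectOutcome {
    /// An onchain user was found; the normal removal path ran.
    RemovedForUser,
    /// No onchain user, but this many stale daemon services were cleaned up.
    RemovedStale(usize),
    /// No onchain user and nothing left in the daemon.
    NothingToDo,
}

fn disconnect_sketch(
    onchain_users: &[&str],
    controller: &mut impl Controller,
    kind: &str,
) -> Result<DisconnectOutcome, String> {
    if !onchain_users.is_empty() {
        // Normal path (onchain account handling elided in this sketch).
        controller.remove(kind)?;
        return Ok(DisconnectOutcome::RemovedForUser);
    }
    // Fallback: the onchain account is already gone, so ask the daemon directly.
    let stale = controller
        .status()
        .iter()
        .filter(|s| s.kind.as_str() == kind)
        .count();
    if stale == 0 {
        return Ok(DisconnectOutcome::NothingToDo);
    }
    controller.remove(kind)?;
    Ok(DisconnectOutcome::RemovedStale(stale))
}
```

With the fallback in place, a later connect no longer finds a leftover service and so does not hit the "already provisioned" guard.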
@martinsander00
Contributor Author

Handled by #3034
