🤖 feat: add analytics dashboard with DuckDB backend#2525
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e23c573d10
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review Addressed all 3 review comments:
💡 Codex Review
Reviewed commit: 6880d71bf7
@codex review Addressed all 3 review comments from round 2:
💡 Codex Review
Reviewed commit: e19e6f3766
@codex review Addressed both review comments from round 3:
💡 Codex Review
Reviewed commit: abc5efa154
@codex review Addressed both review comments from round 4:
💡 Codex Review
Reviewed commit: ba1a297fbb
@codex review Addressed the P2 timezone label issue:
💡 Codex Review
Reviewed commit: 91e5c2c21f
@codex review Added a new Storybook/Chromatic story for the analytics stats page:
Please take another look. |
💡 Codex Review
Reviewed commit: 57b5cc077c
@codex review Addressed the latest feedback:
Re-ran |
💡 Codex Review
Reviewed commit: 6db86f6ed8
@codex review Addressed the latest ETL issues:
Re-ran |
💡 Codex Review
Reviewed commit: 2221495769
@codex review Addressed the remaining P1 ETL issue:
Re-ran |
💡 Codex Review
Reviewed commit: f43d99ae04
@codex review Addressed the truncation edge case:
Re-ran |
💡 Codex Review
Reviewed commit: cb1d8d635d
@codex review Addressed the false-positive regression detector:
Re-ran |
@codex review Applied the final ETL gating adjustment:
Re-ran |
💡 Codex Review
Reviewed commit: e175909ae5
@codex review Addressed the head-truncation+append case where count recovers:
Re-ran |
💡 Codex Review
Reviewed commit: 1b33a51b27
@codex review Addressed the missing
This prevents deleted workspaces from continuing to contribute stale totals in long-running sessions. |
💡 Codex Review
Reviewed commit: 9c6faa7f6a
…uards
- refresh incremental analytics ingest for same-sequence rewrites by replacing rows at matching response_index values before reinserting
- keep node-pty rebuild independent from DuckDB package presence while preserving per-module stamp caching
Add a workspace metadata-null cleanup hook that dispatches clearWorkspace through AnalyticsService/worker/ETL so deleted workspaces are removed from analytics immediately. Also switch summary today_spend_usd to a bound UTC YYYY-MM-DD value instead of DuckDB CURRENT_DATE to avoid local-time drift against UTC event dates.
Include computed ttftMs in stream-end metadata and history updates when a first token timestamp is available, while omitting the field when TTFT cannot be derived. Add stream manager coverage that verifies ttftMs is persisted in finalized history metadata when available and remains absent otherwise.
Detect startup backfill readiness by comparing session workspace IDs with ingest watermark workspace IDs, not only aggregate counts.
- scan session directories for workspace IDs that actually have chat.jsonl
- load watermark workspace IDs and flag any session workspace missing coverage
- thread missing-ID signal into shouldRunInitialBackfill while preserving existing zero-event and wiped-events safeguards
- add stale watermark-ID mismatch test coverage
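The readiness check in this commit message comes down to a set difference between two ID lists. The following is a minimal sketch of that idea, not the code from the commit: `findMissingWatermarkIds`, the `BackfillState` shape, and the reading of the zero-event and wiped-events safeguards as "sessions exist but the events table is empty" are all assumptions made for illustration. Only the name `shouldRunInitialBackfill` comes from the commit message.

```typescript
// Hypothetical sketch. Names and the exact safeguard semantics are assumptions.

/** Session workspace IDs (those with a chat.jsonl) that have no ingest watermark. */
function findMissingWatermarkIds(
  sessionWorkspaceIds: readonly string[],
  watermarkWorkspaceIds: readonly string[],
): string[] {
  const covered = new Set(watermarkWorkspaceIds);
  return sessionWorkspaceIds.filter((id) => !covered.has(id));
}

interface BackfillState {
  eventCount: number; // rows currently in the events table
  sessionWorkspaceIds: readonly string[];
  watermarkWorkspaceIds: readonly string[];
}

function shouldRunInitialBackfill(state: BackfillState): boolean {
  // Nothing on disk to ingest.
  if (state.sessionWorkspaceIds.length === 0) return false;
  // Assumed reading of the zero-event / wiped-events safeguards:
  // sessions exist but the events table is empty.
  if (state.eventCount === 0) return true;
  // Per-ID signal: at least one session workspace has no watermark.
  return (
    findMissingWatermarkIds(state.sessionWorkspaceIds, state.watermarkWorkspaceIds).length > 0
  );
}
```

Comparing IDs rather than counts catches the case where a stale watermark for a deleted workspace keeps the counts equal while a newer workspace has never been ingested.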
Summary
Adds a backend-side analytics engine using DuckDB (`@duckdb/node-api`) that aggregates token usage, costs, and timing data across all workspaces, exposed via oRPC and visualized in a new Analytics dashboard. Works identically in Electron and server mode.
Background
Mux stores per-workspace stats in JSON files (`session-usage.json`, `session-timing.json`, `chat.jsonl`) under `~/.mux/sessions/{workspaceId}/`. These contain rich data (token usage, costs, timing, per-message details), but there is no way to aggregate across workspaces or projects. Users have no visibility into total spend, spend trends, model distribution, or timing performance.
Implementation
Architecture
Key design decisions
- `~/.mux/analytics/analytics.db` for persistence across restarts.
- `lastSequence` per workspace in an `ingest_watermarks` table. Only processes new messages, with full rebuild as self-healing fallback.
- … `chat.jsonl` row), (2) query output (DuckDB result rows), (3) oRPC wire format. Zero ORMs, zero extra deps.
- `@duckdb/node-api` is in the banned eager imports list.
Files added/modified
New files (~1,600 LoC):
- `src/common/orpc/schemas/analytics.ts`: Zod schemas (oRPC contract + DuckDB row validation)
- `src/node/services/analytics/analyticsWorker.ts`: Worker thread with DuckDB
- `src/node/services/analytics/etl.ts`: ETL from chat.jsonl → DuckDB
- `src/node/services/analytics/queries.ts`: Typed SQL queries with Zod validation
- `src/node/services/analytics/analyticsService.ts`: Main-thread service wrapper
- `src/browser/hooks/useAnalytics.ts`: Data fetching hooks
- `src/browser/components/analytics/`: Dashboard components (7 files)
Modified files:
- `package.json`: Added `@duckdb/node-api`, `recharts` deps + `asarUnpack`
- `scripts/postinstall.sh`: DuckDB electron-rebuild with stamp caching
- `Makefile`: `rebuild-native` includes DuckDB
- `scripts/check_eager_imports.sh`: DuckDB in banned imports
- `src/node/orpc/context.ts`: Added `analyticsService` to `ORPCContext`
- `src/node/orpc/router.ts`: Added `analytics` namespace (7 procedures)
- `src/node/services/serviceContainer.ts`: Wire service + stream-end trigger
- `src/browser/App.tsx`: Analytics route rendering + keybind
- `src/browser/components/TitleBar.tsx`: Analytics toggle button
- `src/browser/contexts/RouterContext.tsx`: Analytics navigation
- `src/browser/utils/ui/keybinds.ts`: `OPEN_ANALYTICS` keybind (Ctrl+Shift+Y)
Dashboard features
Risks
- `@duckdb/node-bindings-*` adds ~65 MB to the app. Acceptable for a desktop/server app, but significant. Lazy loading mitigates startup impact.
📋 Implementation Plan
Analytics Dashboard with DuckDB
Context & Goals
Mux stores per-workspace stats in JSON files (`session-usage.json`, `session-timing.json`, `chat.jsonl`) under `~/.mux/sessions/{workspaceId}/`. These contain token usage, costs, timing, and per-message data, but there is no way to aggregate across workspaces or projects.
Goal: Add a backend-side analytics engine using DuckDB native (`@duckdb/node-api`) that aggregates data across all workspaces, exposed via oRPC and visualized in a new Analytics dashboard. Must work identically in Electron and server mode.
Estimated LoC: ~1,600 product code (excluding tests).
Architecture Overview
Implementation Plan
1. Build System — DuckDB native module integration (~20 LoC)
Files:
- `package.json`: add dependency + `asarUnpack`
- `scripts/postinstall.sh`: rebuild for Electron
- `Makefile`: add to `rebuild-native` target
- `scripts/check_eager_imports.sh`: add DuckDB to banned eager imports list
2. Analytics Worker: DuckDB in `worker_thread` (~250 LoC)
New file: `src/node/services/analytics/analyticsWorker.ts`
Follows the existing `tokenizer.worker.ts` pattern but with persistent state (DuckDB connection).
Task dispatch:
- `init`: … `~/.mux/analytics/analytics.db`, creates schema
- `ingest`: … `events`
- `rebuildAll`
- `query`
3. ETL Logic: JSON → DuckDB rows (~300 LoC)
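New file: `src/node/services/analytics/etl.ts`
Runs inside the worker. For each workspace:
- `chat.jsonl`: extract per-assistant-message rows (each has `usage`, `duration`, `model`, `timestamp`)
- `session-usage.json`: used for validation and to detect rolled-up child data
- `session-timing.json`: extract TTFT, streaming time, tool execution time per model
- … `ingest`)
Per-message extraction from `chat.jsonl`:
Each assistant message is extracted, enriched with workspace metadata, and validated against `EventRowSchema` (from section 5) via `safeParse` before insertion. Malformed rows are logged and skipped; they never crash the ETL. Cost fields are computed by reusing existing pricing logic from `usageAggregator.ts`.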
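As a rough illustration of the extraction step, the sketch below reads JSONL text line by line, keeps assistant messages, attaches the workspace ID, and counts malformed lines instead of throwing. The actual `chat.jsonl` layout is not shown in this description, so every field name here (`role`, `historySequence`, `usage.inputTokens`, `usage.outputTokens`) and the function name `extractAssistantEvents` are assumptions. The `lastSequence` parameter stands for the per-workspace watermark from the incremental strategy.

```typescript
// Hypothetical shapes: the real chat.jsonl layout is not shown in this description,
// so every field name below is an assumption made for illustration.
interface ExtractedEvent {
  workspaceId: string;
  historySequence: number;
  model: string;
  timestamp: number;
  inputTokens: number;
  outputTokens: number;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function extractAssistantEvents(
  workspaceId: string,
  jsonl: string,
  lastSequence: number,
): { events: ExtractedEvent[]; skipped: number } {
  const events: ExtractedEvent[] = [];
  let skipped = 0;

  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      skipped += 1; // malformed JSON: count it and move on, never throw
      continue;
    }
    // Non-assistant messages are not analytics rows; ignore them silently.
    if (!isRecord(parsed) || parsed.role !== "assistant") continue;

    const { historySequence, model, timestamp, usage } = parsed;
    if (
      typeof historySequence !== "number" ||
      typeof model !== "string" ||
      typeof timestamp !== "number" ||
      !isRecord(usage)
    ) {
      skipped += 1; // assistant message with an unexpected shape
      continue;
    }
    const { inputTokens, outputTokens } = usage;
    if (typeof inputTokens !== "number" || typeof outputTokens !== "number") {
      skipped += 1;
      continue;
    }

    // Incremental filter: only rows newer than the workspace watermark.
    if (historySequence <= lastSequence) continue;

    events.push({ workspaceId, historySequence, model, timestamp, inputTokens, outputTokens });
  }
  return { events, skipped };
}
```

Returning the skip count alongside the rows lets the caller log how many lines were dropped without the extractor needing a logger of its own.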
EventRowSchema(from section 5) viasafeParsebefore insertion. Malformed rows are logged and skipped — never crash the ETL. Cost fields are computed by reusing existing pricing logic fromusageAggregator.ts.See section 6 for the
extractAndValidateRow()implementation and validation flow.Incremental strategy:
lastSequenceper workspace iningest_watermarkstablehistorySequence > lastSequenceDELETE FROM events; DELETE FROM ingest_watermarks;→ re-scan allRoll-up handling:
session-usage.json→rolledUpFromledgerrolledUpFrom(its data is already counted in the parent's cumulative stats)chat.jsonl, rolled-up child workspaces whose session dirs are deleted are simply absent — no double-counting risk4. AnalyticsService — Backend service (~200 LoC)
New file: `src/node/services/analytics/analyticsService.ts`
Registration in `ServiceContainer`:
Trigger incremental ETL on stream-end:
Hook into the existing `SessionTimingService.handleStreamEnd` or `SessionUsageService.recordUsage` flow: after writing session JSON, call `analyticsService.ingestWorkspace(workspaceId)`.
5. oRPC Endpoints + Dual-Purpose Zod Schemas (~200 LoC)
New file: `src/common/orpc/schemas/analytics.ts`
Zod schemas serve double duty: they define the oRPC contract (input/output validation over the wire) AND validate raw DuckDB result rows inside the worker. This gives compile-time types via `z.infer<>` and runtime validation at both boundaries, with no Drizzle or ORM needed.
Router addition in `src/node/orpc/router.ts` (same pattern as before):
6. Typed SQL Queries Inside Worker (~200 LoC)
New file: `src/node/services/analytics/queries.ts`
Core `typedQuery` helper: validates every DuckDB result row against a Zod schema before returning. This gives runtime safety (catching DB/SQL mismatches immediately) plus compile-time types (via `z.infer<>`), with zero ORM overhead.
Named query functions using the typed helpers + schemas from section 5:
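The code blocks for this section did not survive in this capture, so here is a dependency-free sketch of the pattern rather than the PR's implementation. `RowSchema` is a stand-in for the single Zod method the helper needs (`parse`, which throws on mismatch), `RunSql` is a hypothetical synchronous query runner standing in for the DuckDB connection, and `getSpendByModel` with its column names (`model`, `total_cost_usd`, `date`) is an invented example of a named query.

```typescript
// Stand-in for the one Zod method this sketch needs: `parse`, which throws on mismatch.
interface RowSchema<T> {
  parse(value: unknown): T;
}

// Hypothetical query runner. In the PR this role belongs to a DuckDB connection and
// is presumably asynchronous; it is synchronous here only to keep the sketch short.
type RunSql = (sql: string, params: readonly unknown[]) => unknown[];

/** Runs a query and validates every result row before returning it. */
function typedQuery<T>(
  run: RunSql,
  sql: string,
  params: readonly unknown[],
  schema: RowSchema<T>,
): T[] {
  return run(sql, params).map((row, index) => {
    try {
      return schema.parse(row);
    } catch (cause) {
      const reason = cause instanceof Error ? cause.message : String(cause);
      throw new Error(`Result row ${index} does not match its schema: ${reason}`);
    }
  });
}

// Illustrative row shape and column names (assumptions, not the PR's schema).
interface SpendByModelRow {
  model: string;
  total_cost_usd: number;
}

const SpendByModelRowSchema: RowSchema<SpendByModelRow> = {
  parse(value) {
    if (typeof value !== "object" || value === null) {
      throw new Error("row is not an object");
    }
    const { model, total_cost_usd } = value as Record<string, unknown>;
    if (typeof model !== "string") throw new Error("model must be a string");
    if (typeof total_cost_usd !== "number") throw new Error("total_cost_usd must be a number");
    return { model, total_cost_usd };
  },
};

/** Example of a named query built on the helper. */
function getSpendByModel(run: RunSql, sinceDate: string): SpendByModelRow[] {
  return typedQuery(
    run,
    "SELECT model, SUM(total_cost_usd) AS total_cost_usd FROM events WHERE date >= ? GROUP BY model",
    [sinceDate],
    SpendByModelRowSchema,
  );
}
```

Because the schema is the only source of the row type, a column renamed in SQL but not in the schema fails at the first query rather than surfacing later as an `undefined` in the dashboard.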
ETL insert validation — rows are validated before insertion too:
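On the insert side the policy is the opposite of the query side: a bad row is skipped, not fatal. The sketch below shows that policy with a hand-written stand-in that loosely mirrors the result shape of Zod's `safeParse` (a success flag plus data or an error). The `EventRow` fields and the `validateForInsert` name are hypothetical; the PR's own `EventRowSchema` and `extractAndValidateRow()` are not shown in this capture.

```typescript
// Stand-in loosely mirroring the result shape of Zod's `safeParse`.
interface SafeParseResult<T> {
  success: boolean;
  data?: T;
  error?: string;
}
interface RowSchema<T> {
  safeParse(value: unknown): SafeParseResult<T>;
}

// Hypothetical subset of an event row; the PR's EventRowSchema is not shown here.
interface EventRow {
  workspace_id: string;
  model: string;
  total_cost_usd: number;
}

const EventRowSchema: RowSchema<EventRow> = {
  safeParse(value) {
    if (typeof value !== "object" || value === null) {
      return { success: false, error: "row is not an object" };
    }
    const { workspace_id, model, total_cost_usd } = value as Record<string, unknown>;
    if (typeof workspace_id !== "string") {
      return { success: false, error: "workspace_id must be a string" };
    }
    if (typeof model !== "string") {
      return { success: false, error: "model must be a string" };
    }
    if (typeof total_cost_usd !== "number" || !Number.isFinite(total_cost_usd)) {
      return { success: false, error: "total_cost_usd must be a finite number" };
    }
    return { success: true, data: { workspace_id, model, total_cost_usd } };
  },
};

/** Keeps rows that pass validation; reports and skips the rest without throwing. */
function validateForInsert<T>(
  candidates: readonly unknown[],
  schema: RowSchema<T>,
  onSkip: (reason: string) => void,
): T[] {
  const valid: T[] = [];
  candidates.forEach((candidate, index) => {
    const result = schema.safeParse(candidate);
    if (!result.success || result.data === undefined) {
      onSkip(`row ${index}: ${result.error ?? "unknown validation error"}`);
      return; // skip this row, keep going
    }
    valid.push(result.data);
  });
  return valid;
}
```

Passing the skip handler in as a callback keeps the function free of any particular logger, so the worker can route the reasons wherever it already logs.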
Type-safety flow summary:
7. Frontend — Analytics Dashboard (~400 LoC)
New files:
- `src/browser/components/analytics/AnalyticsDashboard.tsx`: top-level view
- `src/browser/components/analytics/SummaryCards.tsx`
- `src/browser/components/analytics/SpendChart.tsx`
- `src/browser/components/analytics/TimingChart.tsx`
- `src/browser/components/analytics/ModelBreakdown.tsx`
- `src/browser/hooks/useAnalytics.ts`: thin wrapper around `api.analytics.*`
Charting: Add `recharts` (~200 KB gzipped) as a dependency. It is native to React (a React component library), supports bar/line/area/pie charts, and covers all the visualizations needed. No other charting library exists in the codebase today.
Entry point integration:
Add "Analytics" as a top-level view in `src/browser/App.tsx`, alongside `SettingsPage` and `ProjectPage`:
Navigation: Add an Analytics icon button in `TitleBar.tsx` (next to Settings), using a `lucide-react` icon (e.g., `BarChart3`).
Dashboard layout:
Data hook:
8. Wiring — Incremental Ingestion Triggers
After `SessionUsageService.recordUsage()` or `SessionTimingService.handleStreamEnd()` writes to disk, emit an event or directly call:
The `AnalyticsService` is injected into `SessionTimingService` (or both services reference it from the container). The ingest call is fire-and-forget: it runs in the worker thread and doesn't block the response stream.
On startup: `AnalyticsService` does NOT auto-ingest. The first call to any `analytics.*` endpoint triggers `ensureWorker()`, which opens the existing DB. A periodic background refresh (e.g., every 5 minutes) scans for workspaces with newer mtimes than their watermarks.
Implementation Order
- `@duckdb/node-api`, update rebuild/unpack config
- `analyticsWorker.ts` with DuckDB init and table creation
- `etl.ts` with per-message extraction from `chat.jsonl`
- `AnalyticsService` in `ServiceContainer` with lazy init
- `recharts`, navigation entry
Key design decisions and tradeoffs
Why file-backed DB, not in-memory + Parquet?
- `analytics.db` is corrupted/missing, rebuild from source JSONs
- `COPY ... TO 'export.parquet'` for data portability
Why per-message granularity (`chat.jsonl`), not just session summaries?
Session-level summaries (`session-usage.json`) give cumulative totals but no time dimension. To show "spend per day" or "hourly heatmap," we need per-message timestamps. Each assistant message in `chat.jsonl` already carries `timestamp`, `usage`, `duration`, and `model`.
Why `recharts` over custom SVGs?
The existing codebase uses custom Tailwind for simple meters (`TokenMeter.tsx`). But proper time-series charts, histograms, and pie charts are significantly more complex. `recharts` is ~200 KB gzipped, native to React, and avoids reinventing axes, tooltips, legends, and responsive layouts.
Double-counting prevention
Two mechanisms:
- `chat.jsonl` per workspace. Deleted child workspaces have no session dir, so there are no rows and no double-counting.
- `session-usage.json` contains `rolledUpFrom` mapping. During ingest, if a workspace ID is in another's `rolledUpFrom`, we skip it (its cumulative data is already merged into the parent).
Why Zod-validated SQL, not Drizzle ORM?
Drizzle has no official DuckDB dialect. Community options (`@leonardovida-md/drizzle-neo-duckdb`) are experimental (7 stars) and use `drizzle-orm/pg-core` as a compatibility shim. More fundamentally, our queries are analytical (percentiles, date_trunc, window functions, CTEs). Drizzle's query builder can't express them, so you'd fall back to sql`...` tagged templates everywhere, getting no type-safety from the ORM layer. The Zod approach gives stronger guarantees (runtime validation at three boundaries) with zero extra deps.
Binary size impact
`@duckdb/node-bindings-linux-x64` is ~65 MB. For context:
- `node-pty`: ~2 MB
This is the main cost. Mitigations: the user has already accepted DuckDB as the choice; the binary is only loaded lazily; and server-mode deployments can use the system Node.js without Electron overhead.
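The lazy-loading mitigation can be pictured as a memoized loader: nothing heavy is touched until the first call, and later calls reuse the result. This is a generic sketch, not the PR's `ensureWorker()`. The `lazy` helper and the fake engine are hypothetical, and the sketch deliberately does not import DuckDB.

```typescript
// Generic memoized loader. `load` stands in for whatever pulls in the heavy module
// (for example a dynamic import); this sketch does not import DuckDB itself.
function lazy<T>(load: () => T): () => T {
  let loaded = false;
  let value: T | undefined;
  return () => {
    if (!loaded) {
      value = load(); // runs at most once, on first use
      loaded = true;
    }
    return value as T;
  };
}

// Hypothetical stand-in for an expensive native engine.
let loadCount = 0;
const getEngine = lazy(() => {
  loadCount += 1;
  return { name: "fake-engine" };
});
```

If `load` throws, `loaded` stays false, so the next call retries. The same helper also works when `load` returns a promise, in which case the promise itself is what gets cached.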
Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking: `xhigh` • Cost: `$5.31`