
🤖 feat: add analytics dashboard with DuckDB backend #2525

Open

ThomasK33 wants to merge 40 commits into main from analytics-anyz

Conversation

@ThomasK33
Member

Summary

Adds a backend-side analytics engine using DuckDB (@duckdb/node-api) that aggregates token usage, costs, and timing data across all workspaces, exposed via oRPC and visualized in a new Analytics dashboard. Works identically in Electron and server mode.

Background

Mux stores per-workspace stats in JSON files (session-usage.json, session-timing.json, chat.jsonl) under ~/.mux/sessions/{workspaceId}/. These contain rich data (token usage, costs, timing, per-message details) but there's no way to aggregate across workspaces or projects. Users have no visibility into total spend, spend trends, model distribution, or timing performance.

Implementation

Architecture

Main Process (ServiceContainer)
├── AnalyticsService              ← new, lazy-initialized
│   └── analyticsWorker.ts        ← worker_thread holding DuckDB instance
│       ├── DuckDB (file-backed: ~/.mux/analytics/analytics.db)
│       ├── ETL: reads session JSONs → inserts into DuckDB tables
│       └── Query: executes SQL, returns small validated result objects
│
├── oRPC Router
│   └── analytics.*               ← 7 new procedures
│
└── Stream-end events trigger incremental ETL

Renderer
└── AnalyticsDashboard            ← new top-level view with recharts

Key design decisions

  • DuckDB in worker_thread: Keeps the main thread responsive. The DuckDB instance is file-backed at ~/.mux/analytics/analytics.db for persistence across restarts.
  • Incremental ETL with watermarks: Tracks lastSequence per workspace in an ingest_watermarks table. Only processes new messages, with full rebuild as self-healing fallback.
  • Three Zod validation boundaries: (1) ETL input (each chat.jsonl row), (2) query output (DuckDB result rows), (3) oRPC wire format. Zero ORMs, zero extra deps.
  • Lazy initialization: DuckDB worker starts on first query/ingest, not at app startup. @duckdb/node-api is in the banned eager imports list.

Files added/modified

New files (~1,600 LoC):

  • src/common/orpc/schemas/analytics.ts — Zod schemas (oRPC contract + DuckDB row validation)
  • src/node/services/analytics/analyticsWorker.ts — Worker thread with DuckDB
  • src/node/services/analytics/etl.ts — ETL from chat.jsonl → DuckDB
  • src/node/services/analytics/queries.ts — Typed SQL queries with Zod validation
  • src/node/services/analytics/analyticsService.ts — Main-thread service wrapper
  • src/browser/hooks/useAnalytics.ts — Data fetching hooks
  • src/browser/components/analytics/ — Dashboard components (7 files)

Modified files:

  • package.json — Added @duckdb/node-api, recharts deps + asarUnpack
  • scripts/postinstall.sh — DuckDB electron-rebuild with stamp caching
  • Makefile — rebuild-native target includes DuckDB
  • scripts/check_eager_imports.sh — DuckDB in banned imports
  • src/node/orpc/context.ts — Added analyticsService to ORPCContext
  • src/node/orpc/router.ts — Added analytics namespace (7 procedures)
  • src/node/services/serviceContainer.ts — Wire service + stream-end trigger
  • src/browser/App.tsx — Analytics route rendering + keybind
  • src/browser/components/TitleBar.tsx — Analytics toggle button
  • src/browser/contexts/RouterContext.tsx — Analytics navigation
  • src/browser/utils/ui/keybinds.ts — OPEN_ANALYTICS keybind (Ctrl+Shift+Y)

Dashboard features

  • Summary cards: Total spend, today's spend, avg daily, cache hit ratio
  • Spend over time: Stacked bar chart by model (hourly/daily/weekly)
  • Spend by project: Horizontal bar chart
  • Spend by model: Donut chart with token counts
  • Timing distribution: Histogram with p50/p90/p99 reference lines
  • Agent cost breakdown: Bar chart by agent type
  • Filters: Project filter dropdown, time range selector (7d/30d/90d/all)

Risks

  • Binary size: @duckdb/node-bindings-* adds ~65 MB to the app. Acceptable for a desktop/server app but significant. Lazy loading mitigates startup impact.
  • DuckDB native rebuild: Requires Electron ABI-matched rebuild (handled by postinstall + stamp caching). If rebuild fails, analytics features degrade gracefully.

📋 Implementation Plan

Analytics Dashboard with DuckDB

Context & Goals

Mux stores per-workspace stats in JSON files (session-usage.json, session-timing.json, chat.jsonl) under ~/.mux/sessions/{workspaceId}/. These contain token usage, costs, timing, and per-message data — but there's no way to aggregate across workspaces or projects.

Goal: Add a backend-side analytics engine using DuckDB native (@duckdb/node-api) that aggregates data across all workspaces, exposed via oRPC and visualized in a new Analytics dashboard. Must work identically in Electron and server mode.

Estimated LoC: ~1,600 product code (excluding tests).


Architecture Overview

Main Process (ServiceContainer)
├── AnalyticsService              ← new, lazy-initialized
│   └── analyticsWorker.ts        ← worker_thread holding DuckDB instance
│       ├── DuckDB (file-backed: ~/.mux/analytics/analytics.db)
│       ├── ETL: reads session JSONs → inserts into DuckDB tables
│       └── Query: executes SQL, returns small result objects
│
├── oRPC Router
│   └── analytics.*               ← new namespace, ~6 procedures
│
└── Existing services (stream-end events trigger incremental ETL)

Renderer (thin client)
└── AnalyticsDashboard            ← new top-level view
    └── Calls api.analytics.* → renders charts

Implementation Plan

1. Build System — DuckDB native module integration (~20 LoC)

Files:

  • package.json — add dependency + asarUnpack
  • scripts/postinstall.sh — rebuild for Electron
  • Makefile — add to rebuild-native target
  • scripts/check_eager_imports.sh — add DuckDB to banned eager imports list
# package.json — dependencies
"@duckdb/node-api": "^1.4.4"
// package.json — build.asarUnpack (add to existing array)
"**/node_modules/@duckdb/**/duckdb.node"
# scripts/postinstall.sh — add DuckDB to the rebuild command
npx @electron/rebuild -f \
  -m node_modules/node-pty \
  -m node_modules/@duckdb/node-bindings-linux-x64  # platform-specific

Binary size note: @duckdb/node-bindings-* is ~65 MB. This is significant but acceptable for a desktop/server app with an existing ~60 MB Electron binary. The WASM alternative is smaller (~10 MB) but doesn't work in Node.js server mode.


2. Analytics Worker — DuckDB in worker_thread (~250 LoC)

New file: src/node/services/analytics/analyticsWorker.ts

Follows the existing tokenizer.worker.ts pattern but with persistent state (DuckDB connection).

// analyticsWorker.ts — entry point
import { parentPort } from "node:worker_threads";
import { DuckDBInstance } from "@duckdb/node-api";
import type { DuckDBConnection } from "@duckdb/node-api";

let db: DuckDBInstance;
let conn: DuckDBConnection;

async function init(dbPath: string) {
  db = await DuckDBInstance.create(dbPath);
  conn = await db.connect();
  await createSchema(conn);
}

async function createSchema(conn: DuckDBConnection) {
  await conn.run(`
    CREATE TABLE IF NOT EXISTS events (
      -- Identity
      workspace_id      VARCHAR NOT NULL,
      project_path      VARCHAR,
      project_name      VARCHAR,
      workspace_name    VARCHAR,
      parent_workspace_id VARCHAR,
      agent_id          VARCHAR,

      -- Time
      timestamp         TIMESTAMP,
      date              DATE,

      -- Model
      model             VARCHAR,
      thinking_level    VARCHAR,

      -- Tokens
      input_tokens      INTEGER DEFAULT 0,
      output_tokens     INTEGER DEFAULT 0,
      reasoning_tokens  INTEGER DEFAULT 0,
      cached_tokens     INTEGER DEFAULT 0,
      cache_create_tokens INTEGER DEFAULT 0,

      -- Costs (USD)
      input_cost_usd    DOUBLE DEFAULT 0,
      output_cost_usd   DOUBLE DEFAULT 0,
      reasoning_cost_usd DOUBLE DEFAULT 0,
      cached_cost_usd   DOUBLE DEFAULT 0,
      total_cost_usd    DOUBLE DEFAULT 0,

      -- Timing
      duration_ms       INTEGER,
      ttft_ms           INTEGER,
      streaming_ms      INTEGER,
      tool_execution_ms INTEGER,
      output_tps        DOUBLE,

      -- Metadata
      response_index    INTEGER,
      is_sub_agent      BOOLEAN DEFAULT FALSE
    );

    CREATE TABLE IF NOT EXISTS ingest_watermarks (
      workspace_id    VARCHAR PRIMARY KEY,
      last_sequence   INTEGER DEFAULT 0,
      last_modified   TIMESTAMP DEFAULT NOW()
    );
  `);
}

// Message handler dispatches to: init, ingest, query, rebuild
parentPort?.on("message", async (msg) => {
  try {
    const result = await handleMessage(msg);
    parentPort?.postMessage({ messageId: msg.messageId, result });
  } catch (error) {
    const err = error instanceof Error ? error : new Error(String(error));
    parentPort?.postMessage({
      messageId: msg.messageId,
      error: { message: err.message, stack: err.stack },
    });
  }
});

Task dispatch:

  • init — opens/creates the DB at ~/.mux/analytics/analytics.db, creates the schema
  • ingest — ETL for one workspace; reads its JSON files, upserts into events
  • rebuildAll — drops + re-ingests all workspaces (self-healing)
  • query — runs a named query with params, returns result rows
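
For concreteness, a minimal sketch of the dispatcher is below. It continues the worker file above (reusing its conn and init), and assumes the ingest/rebuild/query bodies delegate to etl.ts and queries.ts (sections 3 and 6); the payload shape and the helper names runIngest, runFullRebuild, and runNamedQuery are illustrative, not the final protocol.

// Hypothetical wire format and delegation targets; real export names may differ.
import { runIngest, runFullRebuild } from "./etl";   // assumed exports (section 3)
import { runNamedQuery } from "./queries";           // assumed export (section 6)

interface WorkerRequest {
  messageId: number;
  type: "init" | "ingest" | "rebuildAll" | "query";
  payload: {
    dbPath?: string;
    workspaceId?: string;
    sessionDir?: string;
    meta?: unknown;
    name?: string;
    params?: unknown;
  };
}

async function handleMessage(msg: WorkerRequest): Promise<unknown> {
  switch (msg.type) {
    case "init":
      await init(msg.payload.dbPath!); // open/create analytics.db and the schema
      return { ok: true };
    case "ingest":
      // Incremental ETL for one workspace (section 3)
      return runIngest(conn, msg.payload.workspaceId!, msg.payload.sessionDir!, msg.payload.meta);
    case "rebuildAll":
      await conn.run("DELETE FROM events");
      await conn.run("DELETE FROM ingest_watermarks");
      return runFullRebuild(conn);
    case "query":
      // Named, Zod-validated query (section 6)
      return runNamedQuery(conn, msg.payload.name!, msg.payload.params);
    default:
      throw new Error(`Unknown message type: ${(msg as WorkerRequest).type}`);
  }
}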

3. ETL Logic — JSON → DuckDB rows (~300 LoC)

New file: src/node/services/analytics/etl.ts

Runs inside the worker. For each workspace:

  1. Read chat.jsonl — extract per-assistant-message rows (each has usage, duration, model, timestamp)
  2. Read session-usage.json — used for validation and to detect rolled-up child data
  3. Read session-timing.json — extract TTFT, streaming time, tool execution time per model
  4. Read workspace metadata — from config (passed as input to ingest)

Per-message extraction from chat.jsonl:

Each assistant message is extracted, enriched with workspace metadata, and validated against EventRowSchema (from section 5) via safeParse before insertion. Malformed rows are logged and skipped — never crash the ETL. Cost fields are computed by reusing existing pricing logic from usageAggregator.ts.

See section 6 for the extractAndValidateRow() implementation and validation flow.
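
As a rough sketch of the read path for step 1 (assuming chat.jsonl is newline-delimited JSON; MuxMessage, WorkspaceMeta, and EventRow are the existing types referenced in this plan, and extractAndValidateRow is the helper from section 6):

import { readFile } from "node:fs/promises";
import { join } from "node:path";

async function readEventRows(sessionDir: string, meta: WorkspaceMeta): Promise<EventRow[]> {
  const raw = await readFile(join(sessionDir, "chat.jsonl"), "utf8");
  const rows: EventRow[] = [];
  for (const line of raw.split("\n")) {
    if (!line.trim()) continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // tolerate a truncated trailing line rather than failing the whole ingest
    }
    const row = extractAndValidateRow(parsed as MuxMessage, meta);
    if (row !== null) rows.push(row); // non-assistant / malformed messages are dropped here
  }
  return rows;
}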

Incremental strategy:

  • Track lastSequence per workspace in ingest_watermarks table
  • On ingest: read only messages with historySequence > lastSequence
  • On compaction/corruption: detect via sequence gaps → full re-ingest for that workspace
  • On rebuild: DELETE FROM events; DELETE FROM ingest_watermarks; → re-scan all
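
A hedged sketch of the incremental fast path described above, reusing the typedQuery and bindParam helpers defined in section 6. The historySequence plumbing and the insertEvents batch helper are assumptions about the surrounding ETL code; the regression/rebuild branches are elided.

import { z } from "zod";

async function incrementalIngest(
  conn: DuckDBConnection,
  workspaceId: string,
  parsed: Array<{ row: EventRow; historySequence: number }>,
): Promise<void> {
  // Read the watermark; COALESCE keeps this a single numeric row even for new workspaces
  const [wm] = await typedQuery(
    conn,
    `SELECT COALESCE(MAX(last_sequence), 0) AS last_sequence
     FROM ingest_watermarks WHERE workspace_id = $1`,
    [workspaceId],
    z.object({ last_sequence: z.number() }),
  );

  const fresh = parsed.filter((p) => p.historySequence > wm.last_sequence);
  if (fresh.length === 0) return; // nothing new since the last ingest

  await insertEvents(conn, fresh.map((p) => p.row)); // hypothetical batch INSERT (or Appender)

  // Advance the watermark to the highest sequence just ingested
  const maxSeq = Math.max(...fresh.map((p) => p.historySequence));
  const stmt = await conn.prepare(
    `INSERT OR REPLACE INTO ingest_watermarks (workspace_id, last_sequence, last_modified)
     VALUES ($1, $2, NOW())`,
  );
  bindParam(stmt, 1, workspaceId);
  bindParam(stmt, 2, maxSeq);
  await stmt.run();
}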

Roll-up handling:

  • Check the rolledUpFrom ledger in session-usage.json
  • Skip ingesting any workspace whose ID appears in another workspace's rolledUpFrom (its data is already counted in the parent's cumulative stats)
  • Since we use per-message data from chat.jsonl, rolled-up child workspaces whose session dirs are deleted are simply absent — no double-counting risk
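
A small sketch of that guard, assuming rolledUpFrom in session-usage.json is an array of child workspace IDs (the exact field shape is an assumption here):

// Build the set of workspace IDs whose data is already rolled up into a parent
function collectRolledUpIds(usageByWorkspace: Map<string, { rolledUpFrom?: string[] }>): Set<string> {
  const rolledUp = new Set<string>();
  for (const usage of usageByWorkspace.values()) {
    for (const childId of usage.rolledUpFrom ?? []) rolledUp.add(childId);
  }
  return rolledUp;
}

// During ingest: skip any workspace whose ID appears in another workspace's ledger
// if (collectRolledUpIds(allUsage).has(workspaceId)) return;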

4. AnalyticsService — Backend service (~200 LoC)

New file: src/node/services/analytics/analyticsService.ts

import { Worker } from "node:worker_threads";
import { mkdir } from "node:fs/promises";
import { join, dirname } from "node:path";

export class AnalyticsService {
  private worker: Worker | null = null;
  private pendingPromises = new Map<
    number,
    { resolve: (value: unknown) => void; reject: (error: Error) => void }
  >();
  private messageId = 0;

  constructor(
    private config: Config,
    private workspaceService: WorkspaceService,
  ) {}

  /** Lazy init — worker starts on first query or ingest */
  private async ensureWorker(): Promise<void> {
    if (this.worker) return;
    const dbPath = join(this.config.rootDir, "analytics", "analytics.db");
    await mkdir(dirname(dbPath), { recursive: true });
    this.worker = new Worker(
      join(__dirname, "analyticsWorker.js")
    );
    this.worker.unref(); // Don't block process exit
    this.worker.on("message", this.handleWorkerMessage);
    await this.dispatch("init", { dbPath });
  }

  /** Called on stream-end from SessionTimingService/UsageService */
  async ingestWorkspace(workspaceId: string): Promise<void> {
    await this.ensureWorker();
    const meta = await this.getWorkspaceMeta(workspaceId);
    const sessionDir = getSessionDir(workspaceId);
    await this.dispatch("ingest", { workspaceId, sessionDir, meta });
  }

  /** Full rebuild — self-healing if DB is corrupted */
  async rebuildAll(): Promise<void> { ... }

  /** Query methods matching oRPC procedures */
  async getSummary(projectPath?: string): Promise<SummaryResult> {
    return this.dispatch("query", { name: "summary", params: { projectPath } });
  }

  async getSpendOverTime(params: SpendOverTimeParams): Promise<SpendOverTimeResult> {
    return this.dispatch("query", { name: "spendOverTime", params });
  }

  // ... other query methods
}

Registration in ServiceContainer:

// src/node/services/serviceContainer.ts
// Add to constructor, after workspaceService is created:
this.analyticsService = new AnalyticsService(config, this.workspaceService);

Trigger incremental ETL on stream-end:

Hook into the existing SessionTimingService.handleStreamEnd or SessionUsageService.recordUsage flow — after writing session JSON, call analyticsService.ingestWorkspace(workspaceId).


5. oRPC Endpoints + Dual-Purpose Zod Schemas (~200 LoC)

New file: src/common/orpc/schemas/analytics.ts

Zod schemas serve double duty: they define the oRPC contract (input/output validation over the wire) AND validate raw DuckDB result rows inside the worker. This gives compile-time types via z.infer<> and runtime validation at both boundaries — no Drizzle or ORM needed.

import { z } from "zod";

// ── Reusable row schemas (used by both oRPC output AND worker query validation) ──

/** Single row from DuckDB, validated before crossing worker→main boundary */
export const SummaryRowSchema = z.object({
  total_spend_usd: z.number(),
  today_spend_usd: z.number(),
  avg_daily_spend_usd: z.number(),
  cache_hit_ratio: z.number(),
  total_tokens: z.number(),
  total_responses: z.number(),
});
export type SummaryRow = z.infer<typeof SummaryRowSchema>;

export const SpendOverTimeRowSchema = z.object({
  bucket: z.string(),
  model: z.string(),
  cost_usd: z.number(),
});
export type SpendOverTimeRow = z.infer<typeof SpendOverTimeRowSchema>;

export const SpendByProjectRowSchema = z.object({
  project_name: z.string(),
  project_path: z.string(),
  cost_usd: z.number(),
  token_count: z.number(),
});

export const SpendByModelRowSchema = z.object({
  model: z.string(),
  cost_usd: z.number(),
  token_count: z.number(),
  response_count: z.number(),
});

export const TimingPercentilesRowSchema = z.object({
  p50: z.number(),
  p90: z.number(),
  p99: z.number(),
});

export const HistogramBucketSchema = z.object({
  bucket: z.number(),
  count: z.number(),
});

export const AgentCostRowSchema = z.object({
  agent_id: z.string(),
  cost_usd: z.number(),
  token_count: z.number(),
  response_count: z.number(),
});

/** ETL input validation — each row extracted from chat.jsonl is validated before insert */
export const EventRowSchema = z.object({
  workspace_id: z.string(),
  project_path: z.string().nullable(),
  project_name: z.string().nullable(),
  workspace_name: z.string().nullable(),
  parent_workspace_id: z.string().nullable(),
  agent_id: z.string().nullable(),
  timestamp: z.number().nullable(),  // unix ms
  model: z.string().nullable(),
  thinking_level: z.string().nullable(),
  input_tokens: z.number().default(0),
  output_tokens: z.number().default(0),
  reasoning_tokens: z.number().default(0),
  cached_tokens: z.number().default(0),
  cache_create_tokens: z.number().default(0),
  input_cost_usd: z.number().default(0),
  output_cost_usd: z.number().default(0),
  reasoning_cost_usd: z.number().default(0),
  cached_cost_usd: z.number().default(0),
  total_cost_usd: z.number().default(0),
  duration_ms: z.number().nullable(),
  ttft_ms: z.number().nullable(),
  streaming_ms: z.number().nullable(),
  tool_execution_ms: z.number().nullable(),
  output_tps: z.number().nullable(),
  response_index: z.number().nullable(),
  is_sub_agent: z.boolean().default(false),
});
export type EventRow = z.infer<typeof EventRowSchema>;

// ── oRPC procedure schemas (transform DB snake_case → API camelCase) ──

export const analytics = {
  getSummary: {
    input: z.object({ projectPath: z.string().nullish() }),
    output: z.object({
      totalSpendUsd: z.number(),
      todaySpendUsd: z.number(),
      avgDailySpendUsd: z.number(),
      cacheHitRatio: z.number(),
      totalTokens: z.number(),
      totalResponses: z.number(),
    }),
  },
  getSpendOverTime: {
    input: z.object({
      projectPath: z.string().nullish(),
      granularity: z.enum(["hour", "day", "week"]),
      from: z.coerce.date().nullish(),
      to: z.coerce.date().nullish(),
    }),
    output: z.array(z.object({
      bucket: z.string(),
      costUsd: z.number(),
      model: z.string(),
    })),
  },
  getSpendByProject: {
    input: z.object({}),
    output: z.array(z.object({
      projectName: z.string(),
      projectPath: z.string(),
      costUsd: z.number(),
      tokenCount: z.number(),
    })),
  },
  getSpendByModel: {
    input: z.object({ projectPath: z.string().nullish() }),
    output: z.array(z.object({
      model: z.string(),
      costUsd: z.number(),
      tokenCount: z.number(),
      responseCount: z.number(),
    })),
  },
  getTimingDistribution: {
    input: z.object({
      metric: z.enum(["ttft", "duration", "tps"]),
      projectPath: z.string().nullish(),
    }),
    output: z.object({
      p50: z.number(),
      p90: z.number(),
      p99: z.number(),
      histogram: z.array(z.object({ bucket: z.number(), count: z.number() })),
    }),
  },
  getAgentCostBreakdown: {
    input: z.object({ projectPath: z.string().nullish() }),
    output: z.array(z.object({
      agentId: z.string(),
      costUsd: z.number(),
      tokenCount: z.number(),
      responseCount: z.number(),
    })),
  },
};

Router addition in src/node/orpc/router.ts (same pattern as the existing namespaces):

analytics: {
  getSummary: t
    .input(schemas.analytics.getSummary.input)
    .output(schemas.analytics.getSummary.output)
    .handler(async ({ context, input }) => {
      return context.analyticsService.getSummary(input.projectPath);
    }),
  // ... same pattern for other endpoints
},

6. Typed SQL Queries Inside Worker (~200 LoC)

New file: src/node/services/analytics/queries.ts

Core typedQuery helper — validates every DuckDB result row against a Zod schema before returning. This gives runtime safety (catch DB/SQL mismatches immediately) plus compile-time types (via z.infer<>), with zero ORM overhead.

import type { DuckDBConnection } from "@duckdb/node-api";
import type { z } from "zod";
import { strict as assert } from "node:assert";

/**
 * Execute a DuckDB query and validate every result row against a Zod schema.
 * Throws a clear ZodError if the DB returns unexpected shapes — prevents
 * silent data corruption flowing through to the UI.
 */
async function typedQuery<T extends z.ZodType>(
  conn: DuckDBConnection,
  sql: string,
  params: unknown[],
  schema: T,
): Promise<z.infer<T>[]> {
  const stmt = await conn.prepare(sql);
  // Bind params positionally ($1, $2, ...)
  for (let i = 0; i < params.length; i++) {
    bindParam(stmt, i + 1, params[i]);
  }
  const result = await stmt.run();
  const rows = result.getRowObjects();
  // Validate each row — fast for small analytical result sets
  return rows.map((row) => schema.parse(row));
}

/**
 * Execute a query expecting exactly one row (e.g., summary aggregations).
 * Asserts non-empty result — defensive against empty-table edge cases.
 */
async function typedQueryOne<T extends z.ZodType>(
  conn: DuckDBConnection,
  sql: string,
  params: unknown[],
  schema: T,
): Promise<z.infer<T>> {
  const rows = await typedQuery(conn, sql, params, schema);
  assert(rows.length === 1, `Expected 1 row, got ${rows.length}`);
  return rows[0];
}

Named query functions using the typed helpers + schemas from section 5:

import {
  SummaryRowSchema,
  SpendOverTimeRowSchema,
  SpendByProjectRowSchema,
  SpendByModelRowSchema,
  TimingPercentilesRowSchema,
  HistogramBucketSchema,
  AgentCostRowSchema,
  type SummaryRow,
  type SpendOverTimeRow,
} from "@/common/orpc/schemas/analytics";

async function querySummary(
  conn: DuckDBConnection,
  projectPath: string | null,
): Promise<SummaryRow> {
  return typedQueryOne(conn, `
    SELECT
      COALESCE(SUM(total_cost_usd), 0)     as total_spend_usd,
      COALESCE(SUM(CASE WHEN date = CURRENT_DATE
        THEN total_cost_usd END), 0)        as today_spend_usd,
      COALESCE(SUM(total_cost_usd) / NULLIF(
        DATE_DIFF('day', MIN(date), MAX(date)) + 1, 0
      ), 0)                                  as avg_daily_spend_usd,
      COALESCE(SUM(cached_tokens)::DOUBLE / NULLIF(
        SUM(input_tokens + cached_tokens), 0
      ), 0)                                  as cache_hit_ratio,
      COALESCE(SUM(input_tokens + output_tokens
        + reasoning_tokens + cached_tokens), 0) as total_tokens,
      COUNT(*)                               as total_responses
    FROM events
    WHERE ($1 IS NULL OR project_path = $1)
  `, [projectPath], SummaryRowSchema);
  // Return type is SummaryRow — guaranteed by Zod at runtime
}

async function querySpendOverTime(
  conn: DuckDBConnection,
  granularity: string,
  projectPath: string | null,
  from: Date | null,
  to: Date | null,
): Promise<SpendOverTimeRow[]> {
  return typedQuery(conn, `
    SELECT
      DATE_TRUNC($1, timestamp)::VARCHAR as bucket,
      model,
      COALESCE(SUM(total_cost_usd), 0) as cost_usd
    FROM events
    WHERE ($2 IS NULL OR project_path = $2)
      AND ($3 IS NULL OR timestamp >= $3)
      AND ($4 IS NULL OR timestamp <= $4)
    GROUP BY bucket, model
    ORDER BY bucket
  `, [granularity, projectPath, from, to], SpendOverTimeRowSchema);
}

// queryTimingDistribution, queryAgentCostBreakdown, etc. follow same pattern

ETL insert validation — rows are validated before insertion too:

import { EventRowSchema } from "@/common/orpc/schemas/analytics";

function extractAndValidateRow(msg: MuxMessage, meta: WorkspaceMeta): EventRow | null {
  if (msg.role !== "assistant" || !msg.metadata?.usage) return null;

  const raw = buildRawRow(msg, meta); // construct from message fields
  const parsed = EventRowSchema.safeParse(raw);
  if (!parsed.success) {
    // Log warning, skip malformed row — self-healing, never crash
    log.warn(`Skipping malformed event row: ${parsed.error.message}`);
    return null;
  }
  return parsed.data;
}

Type-safety flow summary:

chat.jsonl message
  → extractAndValidateRow() validates with EventRowSchema
  → INSERT into DuckDB (runtime-validated input)
  → SQL query
  → typedQuery() validates result with SummaryRowSchema / SpendOverTimeRowSchema / etc.
  → worker posts validated result to main process
  → AnalyticsService transforms snake_case → camelCase for oRPC
  → oRPC output schema validates the API response (already standard oRPC behavior)

Three validation boundaries, zero ORMs, zero extra dependencies.

7. Frontend — Analytics Dashboard (~400 LoC)

New files:

  • src/browser/components/analytics/AnalyticsDashboard.tsx — top-level view
  • src/browser/components/analytics/SummaryCards.tsx
  • src/browser/components/analytics/SpendChart.tsx
  • src/browser/components/analytics/TimingChart.tsx
  • src/browser/components/analytics/ModelBreakdown.tsx
  • src/browser/hooks/useAnalytics.ts — thin wrapper around api.analytics.*

Charting: Add recharts (~200 KB gzipped) as a dependency. It's built for React, supports bar/line/area/pie charts, and covers all the visualizations needed. No other charting library exists in the codebase today.

bun add recharts

Entry point integration:

Add "Analytics" as a top-level view in src/browser/App.tsx, alongside SettingsPage and ProjectPage:

// App.tsx — conditional rendering
if (currentView === "analytics") {
  return <AnalyticsDashboard />;
}

Navigation: Add an Analytics icon button in TitleBar.tsx (next to Settings), using a lucide-react icon (e.g., BarChart3).

Dashboard layout:

┌─────────────────────────────────────────────────────┐
│ Summary Cards (total spend, today, avg/day, cache%) │
├──────────────────────┬──────────────────────────────┤
│ [Project filter ▾]   │ [Time range: 7d|30d|90d|all] │
├──────────────────────┴──────────────────────────────┤
│ Daily Spend (stacked bar by model)                  │
├─────────────────────────────────────────────────────┤
│ Spend by Project (bar)  │  Spend by Model (donut)   │
├─────────────────────────────────────────────────────┤
│ TTFT Distribution (histogram)                       │
├─────────────────────────────────────────────────────┤
│ Agent Cost Attribution (bar)                        │
└─────────────────────────────────────────────────────┘

Data hook:

// useAnalytics.ts
import { useEffect, useState } from "react";
// useAPI and SummaryResult come from the app's existing API context and analytics schemas

export function useAnalyticsSummary(projectPath?: string) {
  const { api } = useAPI();
  const [data, setData] = useState<SummaryResult | null>(null);

  useEffect(() => {
    api.analytics.getSummary({ projectPath }).then(setData);
  }, [api, projectPath]);

  return data;
}
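
A sketch of how the spend-over-time rows might feed a stacked recharts bar chart. The component names are recharts' public API; the pivot helper and props are illustrative, since recharts stacks one <Bar> series per model and therefore wants one object per bucket with a numeric key per model.

import { BarChart, Bar, XAxis, YAxis, Tooltip, Legend, ResponsiveContainer } from "recharts";

interface SpendRow { bucket: string; model: string; costUsd: number; }

// Pivot rows like { bucket, model, costUsd } into one object per bucket
function pivotByBucket(rows: SpendRow[]): Array<Record<string, number | string>> {
  const byBucket = new Map<string, Record<string, number | string>>();
  for (const { bucket, model, costUsd } of rows) {
    const entry = byBucket.get(bucket) ?? { bucket };
    entry[model] = (Number(entry[model]) || 0) + costUsd;
    byBucket.set(bucket, entry);
  }
  return [...byBucket.values()];
}

export function SpendChart({ rows, models }: { rows: SpendRow[]; models: string[] }) {
  const data = pivotByBucket(rows);
  return (
    <ResponsiveContainer width="100%" height={280}>
      <BarChart data={data}>
        <XAxis dataKey="bucket" />
        <YAxis />
        <Tooltip />
        <Legend />
        {models.map((model) => (
          <Bar key={model} dataKey={model} stackId="spend" />
        ))}
      </BarChart>
    </ResponsiveContainer>
  );
}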

8. Wiring — Incremental Ingestion Triggers

After SessionUsageService.recordUsage() or SessionTimingService.handleStreamEnd() writes to disk, emit an event or directly call:

// In SessionTimingService.handleStreamEnd:
// After persisting timing file:
this.analyticsService?.ingestWorkspace(workspaceId);

The AnalyticsService is injected into SessionTimingService (or both services reference it from the container). The ingest call is fire-and-forget — it runs in the worker thread and doesn't block the response stream.

On startup: AnalyticsService does NOT auto-ingest. The first call to any analytics.* endpoint triggers ensureWorker(), which opens the existing DB. A periodic background refresh (e.g., every 5 minutes) scans for workspaces with newer mtimes than their watermarks.
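
A sketch of that background refresh, written as additional AnalyticsService methods continuing the class from section 4. The listWorkspaceIds and getWatermarkMtime helpers are hypothetical, and statting chat.jsonl stands in for whatever mtime source the real implementation uses.

import { stat } from "node:fs/promises";
import { join } from "node:path";

// Inside AnalyticsService (class fragment):
private refreshTimer: NodeJS.Timeout | null = null;

private startBackgroundRefresh(intervalMs = 5 * 60_000): void {
  this.refreshTimer = setInterval(() => {
    void this.refreshStaleWorkspaces().catch((err) => log.warn("analytics refresh failed", err));
  }, intervalMs);
  this.refreshTimer.unref(); // never keep the process alive just for analytics
}

private async refreshStaleWorkspaces(): Promise<void> {
  for (const workspaceId of await this.workspaceService.listWorkspaceIds()) {
    const sessionDir = getSessionDir(workspaceId);
    const mtimeMs = await stat(join(sessionDir, "chat.jsonl"))
      .then((s) => s.mtimeMs)
      .catch(() => 0); // no chat history means nothing to ingest
    if (mtimeMs > (await this.getWatermarkMtime(workspaceId))) {
      await this.ingestWorkspace(workspaceId); // incremental: the watermark limits the work
    }
  }
}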


Implementation Order

  1. Build system — Add @duckdb/node-api, update rebuild/unpack config
  2. Worker + schema — analyticsWorker.ts with DuckDB init and table creation
  3. ETL — etl.ts with per-message extraction from chat.jsonl
  4. Service — AnalyticsService in ServiceContainer with lazy init
  5. oRPC — Schemas + router endpoints
  6. Queries — SQL implementations for each endpoint
  7. Frontend — Dashboard with recharts, navigation entry
  8. Triggers — Wire stream-end events to incremental ingestion

Key design decisions and tradeoffs

Why file-backed DB, not in-memory + Parquet?

  • Persistence: Avoids re-ingesting all workspaces on every app restart
  • Incremental inserts: DuckDB file DB supports fast appends via Appender
  • Self-healing: If analytics.db is corrupted/missing, rebuild from source JSONs
  • Parquet as export: Can still COPY ... TO 'export.parquet' for data portability
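
For the export path in the last bullet, a one-liner on the worker's connection suffices (DuckDB's COPY ... TO with FORMAT PARQUET):

// Export the events table to Parquet for external analysis / portability
await conn.run("COPY events TO 'export.parquet' (FORMAT PARQUET)");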

Why per-message granularity (chat.jsonl), not just session summaries?

Session-level summaries (session-usage.json) give cumulative totals but no time dimension. To show "spend per day" or "hourly heatmap," we need per-message timestamps. Each assistant message in chat.jsonl already carries timestamp, usage, duration, and model.

Why recharts over custom SVGs?

The existing codebase uses custom Tailwind for simple meters (TokenMeter.tsx). But proper time-series charts, histograms, and pie charts are significantly more complex. recharts is ~200 KB gzipped, built for React, and avoids reinventing axes, tooltips, legends, and responsive layouts.

Double-counting prevention

Two mechanisms:

  1. Per-message rows: We extract from chat.jsonl per workspace. Deleted child workspaces have no session dir, so no rows — no double-counting.
  2. Roll-up ledger: session-usage.json contains rolledUpFrom mapping. During ingest, if a workspace ID is in another's rolledUpFrom, we skip it (its cumulative data is already merged into the parent).

Why Zod-validated SQL, not Drizzle ORM?

Drizzle has no official DuckDB dialect. Community options (@leonardovida-md/drizzle-neo-duckdb) are experimental (7 stars) and use drizzle-orm/pg-core as a compatibility shim. More fundamentally, our queries are analytical (percentiles, date_trunc, window functions, CTEs) — Drizzle's query builder can't express them, so you'd fall back to sql`...` tagged templates everywhere, getting no type-safety from the ORM layer. The Zod approach gives stronger guarantees (runtime validation at three boundaries) with zero extra deps.

Binary size impact

@duckdb/node-bindings-linux-x64 is ~65 MB. For context:

  • Electron binary: ~60 MB
  • node-pty: ~2 MB
  • Total app size increase: ~65 MB (~50% larger)

This is the main cost. Mitigations: the user has already accepted DuckDB as the choice; the binary is only loaded lazily; and server-mode deployments can use the system Node.js without Electron overhead.


Generated with mux • Model: anthropic:claude-opus-4-6 • Thinking: xhigh • Cost: $5.31

@ThomasK33
Member Author

@codex review


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e23c573d10

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ThomasK33
Member Author

@codex review

Addressed all 3 review comments:

  • P1: Replaced node:assert/strict with browser-safe @/common/utils/assert in useAnalytics.ts
  • P1: Added background backfill on first worker init via rebuildAll so existing workspace history is available immediately after upgrade
  • P2: Histogram now emits real metric values (e.g. ms, tok/s) instead of abstract 1..20 bucket indices, so chart axes and percentile lines are meaningful

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: 6880d71bf7

@ThomasK33
Member Author

@codex review

Addressed all 3 review comments from round 2:

  • P1: COALESCE histogram bucket label when min==max to prevent NULL failing Zod validation
  • P1: Replaced remaining node:assert/strict in analyticsUtils.ts with browser-safe @/common/utils/assert
  • P2: Backfill now awaited in startWorker() so the first query sees complete historical data instead of an empty DB

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: e19e6f3766

@ThomasK33
Member Author

@codex review

Addressed both review comments from round 3:

  • P1: Replaced node:assert/strict in AnalyticsDashboard.tsx with browser-safe @/common/utils/assert
  • P1: Added bigint handling in toFiniteNumber() so DuckDB BIGINT watermark values are properly coerced to JS numbers, preventing repeated re-ingestion of already-processed messages
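
A plausible shape for that helper (a hedged reconstruction, not the exact diff): DuckDB can return BIGINT columns as JS bigint values, so the coercion has to handle that case explicitly.

function toFiniteNumber(value: unknown): number | null {
  if (typeof value === "bigint") {
    const n = Number(value);
    return Number.isSafeInteger(n) ? n : null; // BIGINT watermark becomes a plain JS number
  }
  if (typeof value === "number" && Number.isFinite(value)) return value;
  return null;
}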

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: abc5efa154

@ThomasK33
Member Author

@codex review

Addressed both review comments from round 4:

  • P2: Persisted analytics filters now self-heal to safe defaults via normalizeTimeRange() / normalizeTimingMetric() instead of asserting on unknown values from localStorage
  • P2: Date range computation uses UTC boundaries (Date.UTC()) to prevent the toISOString().slice(0,10) backend serialization from silently shifting the day in positive-offset timezones
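
Roughly what the UTC-boundary computation looks like (the daysBack parameter and helper name are illustrative): anchoring the range start at a UTC midnight keeps toISOString().slice(0, 10) on the intended calendar day regardless of local offset.

function utcRangeStart(daysBack: number, now: Date = new Date()): Date {
  // Date.UTC normalizes day underflow, so "today minus N days" stays correct across month edges
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() - daysBack));
}

// "7d" filter: utcRangeStart(7).toISOString().slice(0, 10) yields a stable UTC YYYY-MM-DD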

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: ba1a297fbb

@ThomasK33
Member Author

@codex review

Addressed the P2 timezone label issue:

  • Date-only chart bucket labels now render with timeZone: "UTC" to prevent west-of-UTC locales from shifting the displayed day backward
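
Illustrative fix (the function name is hypothetical): formatting a date-only bucket with timeZone: "UTC" keeps, say, a 2025-01-05 bucket rendering as Jan 5 in every locale.

function formatDateBucket(bucket: string): string {
  // Date-only strings parse as UTC midnight; format in UTC so the displayed day never shifts
  return new Date(bucket).toLocaleDateString(undefined, {
    timeZone: "UTC",
    month: "short",
    day: "numeric",
  });
}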

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: 91e5c2c21f

@ThomasK33
Member Author

@codex review

Added a new Storybook/Chromatic story for the analytics stats page:

  • src/browser/stories/App.analytics.stories.tsx
  • Uses realistic mocked analytics datasets and navigates via the titlebar analytics button
  • Validated with make lint and make typecheck

Please take another look.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: 57b5cc077c

@ThomasK33
Member Author

@codex review

Addressed the latest feedback:

  • ETL now refreshes rewritten rows with unchanged sequence IDs by replacing rows for matching (workspace_id, response_index) during incremental ingest, and includes sequence == watermark for rewrite coverage
  • scripts/postinstall.sh rebuild guards were split so node-pty rebuild remains independent from DuckDB package presence

Re-ran make lint and make typecheck.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: 6db86f6ed8

@ThomasK33
Member Author

@codex review

Addressed the latest ETL issues:

  • Added rewind/truncation detection before incremental ingest; when sequence rewinds (or assistant events disappear), the workspace analytics slice is rebuilt transactionally (DELETE workspace rows + reinsert current parsed events)
  • Added TTFT extraction in ETL (extractTtftMs) from metadata/providerMetadata timing keys instead of hardcoding ttft_ms: null
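
A hedged guess at the shape of extractTtftMs (the actual metadata key names are assumptions; only the general probing approach is taken from the comment above):

function extractTtftMs(metadata: Record<string, unknown> | undefined): number | null {
  const timing = metadata?.["timing"] as Record<string, unknown> | undefined;
  for (const value of [metadata?.["ttftMs"], timing?.["ttftMs"]]) {
    if (typeof value === "number" && Number.isFinite(value) && value >= 0) return value;
  }
  return null; // leave ttft_ms unset when no timing key is present
}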

Re-ran make lint and make typecheck.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: 2221495769

@ThomasK33
Member Author

@codex review

Addressed the remaining P1 ETL issue:

  • Incremental ingest now detects any parsed event with sequence < watermark.lastSequence
  • On detection, it switches to full workspace-slice replacement for current parsed events and writes watermark from parsed max sequence (allows watermark to move backward)
  • Normal incremental fast path remains unchanged when no regression is present

Re-ran make lint and make typecheck.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: f43d99ae04

@ThomasK33
Member Author

@codex review

Addressed the truncation edge case:

  • Rebuild detection now also checks persisted row-count truncation (parsedEvents.length < persistedEventRowCount) when a watermark exists
  • This catches history truncations where max sequence stays unchanged, forcing workspace-slice rebuild and removing stale deleted rows

Re-ran make lint and make typecheck.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: cb1d8d635d

@ThomasK33
Member Author

@codex review

Addressed the false-positive regression detector:

  • hasSequenceRegression now detects only sequence ordering regressions between adjacent parsed assistant events (current < previous), not any event below watermark
  • Rebuild triggers still include truncation + max-sequence rewind
  • This preserves incremental fast path for normal append-only histories

Re-ran make lint and make typecheck.

@ThomasK33
Member Author

@codex review

Applied the final ETL gating adjustment:

  • Removed per-row sequence-regression checks from parse loop
  • Rebuild predicate now uses only truncation detection + parsed-max rewind conditions
  • Incremental fast path remains active for normal append-only histories

Re-ran make lint and make typecheck.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: e175909ae5

@ThomasK33
Member Author

@codex review

Addressed the head-truncation+append case where count recovers:

  • Added persisted-vs-parsed head signature comparison (timestamp + model + total_cost_usd)
  • If watermark exists and head signatures differ, we force workspace-slice rebuild
  • Rebuild predicate now includes truncation + head-mismatch + max-sequence rewind

Re-ran make lint and make typecheck.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: 1b33a51b27

@ThomasK33
Member Author

@codex review

Addressed the missing chat.jsonl cleanup issue:

  • On ENOENT in ingestWorkspace, ETL now clears workspace analytics state transactionally:
    • DELETE FROM events WHERE workspace_id = ?
    • DELETE FROM ingest_watermarks WHERE workspace_id = ?

This prevents deleted workspaces from continuing to contribute stale totals in long-running sessions.
Re-ran make lint and make typecheck.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Reviewed commit: 9c6faa7f6a

…uards

- refresh incremental analytics ingest for same-sequence rewrites by replacing rows at matching response_index values before reinserting
- keep node-pty rebuild independent from DuckDB package presence while preserving per-module stamp caching

Add a workspace metadata-null cleanup hook that dispatches clearWorkspace through AnalyticsService/worker/ETL so deleted workspaces are removed from analytics immediately.

Also switch summary today_spend_usd to a bound UTC YYYY-MM-DD value instead of DuckDB CURRENT_DATE to avoid local-time drift against UTC event dates.

Include computed ttftMs in stream-end metadata and history updates when a first token timestamp is available, while omitting the field when TTFT cannot be derived.

Add stream manager coverage that verifies ttftMs is persisted in finalized history metadata when available and remains absent otherwise.

Detect startup backfill readiness by comparing session workspace IDs with ingest watermark workspace IDs, not only aggregate counts.

- scan session directories for workspace IDs that actually have chat.jsonl
- load watermark workspace IDs and flag any session workspace missing coverage
- thread missing-ID signal into shouldRunInitialBackfill while preserving existing zero-event and wiped-events safeguards
- add stale watermark-ID mismatch test coverage