New architecture and new visualization capability #231

Open
Chenglong-MS wants to merge 114 commits into main from dev
Chenglong-MS (Collaborator) commented Jan 31, 2026

Redesigned Visualization Architecture & Expanded Chart Capabilities

Summary

This PR introduces a ground-up redesign of Data Formulator's visualization engine and backend architecture, delivering a multi-backend semantic chart compiler, a tiered sandbox system for secure code execution, and cloud-native workspace storage — along with 30+ chart types across four rendering backends.


Highlights

1. agents-chart — A New Semantic Visualization Library

We replaced the previous monolithic chart assembly logic with agents-chart, a standalone, pure-TypeScript visualization library built around a key insight: semantic types as the contract between AI and rendering.

Instead of asking the LLM to produce detailed, low-level chart specs (which look good but break on every user edit), or using library defaults (which are editable but look bad), agents-chart takes a third path:

LLM outputs:    chart type + field assignments + semantic types  (~10-line JSON)
Compiler does:  sizing, zero-baseline, formatting, color schemes, mark templates
User edits:     swap fields, change chart type, add facets → compiler re-derives (no AI call)

Architecture:

| Layer | Role |
| --- | --- |
| core/ | Target-agnostic: semantic type resolution, layout engine, recommendation engine, overflow handling, decision logic |
| vegalite/ | Vega-Lite backend (primary): 30+ chart types |
| echarts/ | ECharts backend: 19 chart types including funnel, gauge, treemap, sunburst, sankey |

Each backend implements the same assemble() interface: data in, native spec out. Users retain full control over aesthetic fine-tuning using each library's own API — no abstraction tax.
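
To make the contract concrete, here is a minimal sketch of the "compact spec in, native spec out" idea. It is written in Python for brevity even though agents-chart itself is TypeScript. The semantic type names, the `SEMANTIC_TYPES` table, the input shape, and the function `assemble_vegalite_sketch` are all hypothetical; only the emitted structure (`mark`, `encoding`, `scale.zero`) follows Vega-Lite's documented spec format.

```python
# Hypothetical mapping: semantic type -> (Vega-Lite type, zero baseline meaningful?)
SEMANTIC_TYPES = {
    "Category": ("nominal", False),
    "Date": ("temporal", False),
    "Amount": ("quantitative", True),        # sums and counts should start at zero
    "Temperature": ("quantitative", False),  # zero is arbitrary, so do not force it
}


def assemble_vegalite_sketch(chart_spec: dict, rows: list) -> dict:
    """Derive a Vega-Lite spec from a compact, LLM-sized chart description."""
    encoding = {}
    for channel, field in chart_spec["encodings"].items():
        vl_type, zero = SEMANTIC_TYPES[chart_spec["semantic_types"][field]]
        enc = {"field": field, "type": vl_type}
        if vl_type == "quantitative":
            # The compiler, not the LLM, decides the baseline from the semantic type.
            enc["scale"] = {"zero": zero}
        encoding[channel] = enc
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": chart_spec["chart_type"],
        "encoding": encoding,
    }


compact = {
    "chart_type": "bar",
    "encodings": {"x": "region", "y": "revenue"},
    "semantic_types": {"region": "Category", "revenue": "Amount"},
}
vl_spec = assemble_vegalite_sketch(compact, [{"region": "EU", "revenue": 3}])
```

Because the derived details live in the compiler, a user edit such as swapping the `y` field only changes the compact spec; re-running the function re-derives the rest without another model call.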

2. Massively Expanded Chart Type Library

| Category | Chart Types |
| --- | --- |
| Scatter & Point | Scatter Plot, Linear Regression, Boxplot, Strip Plot, Jitter Plot |
| Bar | Bar, Grouped Bar, Stacked Bar, Histogram, Lollipop, Pyramid |
| Line & Area | Line, Dotted Line, Bump Chart, Area, Streamgraph |
| Part-to-Whole | Pie, Rose, Heatmap, Waterfall, Treemap, Sunburst |
| Statistical | Density Plot, Ranged Dot Plot, Radar, Candlestick |
| Flow & Hierarchy | Sankey, Funnel, Gauge |
| Geospatial | US Map, World Map |
| Custom | Custom Point, Custom Line, Custom Bar, Custom Rect, Custom Area |

New chart icon assets added for area, bump, candlestick, density, lollipop, pyramid, radar, rose, streamgraph, strip plot, and waterfall charts.

3. Smarter Chart Recommendation

A new recommendation engine (core/recommendation.ts) suggests chart types based on field semantics and adapts field assignments when the user switches between chart types. The UI has been split into a streamlined SimpleChartRecBox component (~1,100 lines) plus a ChartGallery (~930 lines) for browsing the full chart type catalog.
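
As a rough illustration of recommending from field semantics, the sketch below applies a few textbook heuristics. It is written in Python rather than the engine's TypeScript, and the rules, the `KIND_OF` table, and `recommend_chart_sketch` are assumptions for illustration, not the logic in core/recommendation.ts.

```python
# Hypothetical semantic types grouped into coarse kinds.
KIND_OF = {
    "Category": "categorical",
    "Date": "temporal",
    "Amount": "quantitative",
    "Temperature": "quantitative",
}


def recommend_chart_sketch(field_types: dict) -> str:
    """Pick a chart type name from the semantic types of the selected fields."""
    kinds = [KIND_OF[t] for t in field_types.values()]
    n_quant = kinds.count("quantitative")
    if "temporal" in kinds and n_quant >= 1:
        return "Line"            # a measure over time
    if "categorical" in kinds and n_quant >= 1:
        return "Bar"             # a measure per category
    if n_quant >= 2:
        return "Scatter Plot"    # relationship between two measures
    if n_quant == 1:
        return "Histogram"       # distribution of one measure
    return "Custom Point"        # fallback when nothing else fits


choice = recommend_chart_sketch({"month": "Date", "sales": "Amount"})
```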

4. Tiered Sandbox System

Replaced the single sandbox implementation with a three-tier execution model via an abstract Sandbox base class:

| Tier | Isolation | Use Case |
| --- | --- | --- |
| LocalSandbox | Persistent warm subprocess pool with Python audit hooks; blocks file writes, subprocess, and shutil calls; pre-imports numpy/pandas/duckdb for fast first-call | Default for local development |
| DockerSandbox | Full container isolation with read-only workspace mount and bind-mounted output | Production / untrusted environments |
| NotASandbox | Direct exec() with no isolation | Benchmarking only |
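
The tiering can be sketched as an abstract base class with interchangeable implementations. The example below is an assumption-laden illustration, not the code in the sandbox/ package: the class names carry a `Sketch` suffix, `ExecResult` and the `run()` signature are hypothetical, and the local tier spawns a one-shot subprocess instead of the warm pool the PR describes. The audit hook uses Python's real `sys.addaudithook` and the documented `open`, `subprocess.Popen`, `os.system`, `os.posix_spawn`, and `shutil.*` audit events.

```python
import contextlib
import io
import subprocess
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass

# Prelude prepended to user code in the child process. The hook raises on
# write-mode open() calls, process spawning, and shutil operations.
_AUDIT_PRELUDE = '''
import sys

def _deny(event, args):
    if event == "open":
        mode = args[1]  # None when the call came from os.open(); not checked here
        if isinstance(mode, str) and any(c in mode for c in "wax+"):
            raise PermissionError(f"file write blocked: {args[0]!r}")
    elif event in ("subprocess.Popen", "os.system", "os.posix_spawn") or event.startswith("shutil."):
        raise PermissionError(f"blocked operation: {event}")

sys.addaudithook(_deny)
'''


@dataclass
class ExecResult:
    ok: bool
    stdout: str
    stderr: str


class Sandbox(ABC):
    """Common interface: run a code string, return captured output."""

    @abstractmethod
    def run(self, code: str) -> ExecResult: ...


class LocalSandboxSketch(Sandbox):
    """One-shot subprocess guarded by an audit hook."""

    def run(self, code: str, timeout: float = 30.0) -> ExecResult:
        proc = subprocess.run(
            [sys.executable, "-c", _AUDIT_PRELUDE + "\n" + code],
            capture_output=True, text=True, timeout=timeout,
        )
        return ExecResult(proc.returncode == 0, proc.stdout, proc.stderr)


class NotASandboxSketch(Sandbox):
    """Direct exec() in the current process: no isolation at all."""

    def run(self, code: str) -> ExecResult:
        out = io.StringIO()
        try:
            with contextlib.redirect_stdout(out):
                exec(code, {"__name__": "__sandbox__"})
        except Exception as exc:
            return ExecResult(False, out.getvalue(), repr(exc))
        return ExecResult(True, out.getvalue(), "")
```

Callers depend only on `Sandbox.run()`, so swapping tiers does not change agent code. Audit hooks were designed for monitoring rather than as a complete security boundary, which fits the PR's choice to reserve container isolation for untrusted environments.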

5. Cloud-Native Workspace Storage

Adds a new Azure Blob Storage workspace backend (~670 lines) alongside the existing local filesystem workspace, selected via a factory pattern (workspace_factory.py). This enables multi-user cloud deployments with the same API surface: tables are stored as Parquet, sessions as JSON, and metadata as YAML.
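
The factory pattern mentioned here can be sketched as follows. This is an illustration under stated assumptions, not the contents of workspace_factory.py: the `Workspace` interface, its method names, and `create_workspace` are hypothetical, and `InMemoryBlobWorkspace` is a stand-in that mimics a blob store with a dictionary instead of calling the Azure SDK. Only session JSON is shown; Parquet tables and YAML metadata are left out to keep the example dependency-free.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Workspace(ABC):
    """Storage interface shared by every backend (hypothetical method names)."""

    @abstractmethod
    def save_session(self, name: str, state: dict) -> None: ...

    @abstractmethod
    def load_session(self, name: str) -> dict: ...


class LocalWorkspace(Workspace):
    """Stores each session as a JSON file under a root directory."""

    def __init__(self, root: str):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save_session(self, name: str, state: dict) -> None:
        (self.root / f"{name}.json").write_text(json.dumps(state))

    def load_session(self, name: str) -> dict:
        return json.loads((self.root / f"{name}.json").read_text())


class InMemoryBlobWorkspace(Workspace):
    """Stand-in for a blob-backed workspace: blob name -> bytes."""

    def __init__(self):
        self._blobs = {}

    def save_session(self, name: str, state: dict) -> None:
        self._blobs[f"sessions/{name}.json"] = json.dumps(state).encode("utf-8")

    def load_session(self, name: str) -> dict:
        return json.loads(self._blobs[f"sessions/{name}.json"].decode("utf-8"))


def create_workspace(backend: str, **options) -> Workspace:
    """Select a backend by name; callers see only the Workspace interface."""
    registry = {"local": LocalWorkspace, "blob": InMemoryBlobWorkspace}
    if backend not in registry:
        raise ValueError(f"unknown workspace backend: {backend!r}")
    return registry[backend](**options)


ws = create_workspace("local", root=tempfile.mkdtemp())
ws.save_session("demo", {"tables": ["sales"]})
```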

6. Security Hardening

  • Sandbox audit hooks block dangerous operations (file writes, subprocess spawning)
  • Session export strips sensitive fields (API keys, model configs) before persisting
  • Auth improvements for multi-tenant scenarios
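
One way to strip sensitive fields before a session is persisted is a recursive filter over the session state, sketched below. The key names in `SENSITIVE_KEYS` and the function `strip_sensitive` are hypothetical; the PR only states that API keys and model configs are removed, not how.

```python
# Hypothetical key names; the PR does not list the exact fields it removes.
SENSITIVE_KEYS = frozenset({"api_key", "api_base", "model_config"})


def strip_sensitive(value, sensitive=SENSITIVE_KEYS):
    """Return a deep copy of `value` with sensitive dict keys dropped at any depth."""
    if isinstance(value, dict):
        return {
            key: strip_sensitive(item, sensitive)
            for key, item in value.items()
            if key not in sensitive
        }
    if isinstance(value, list):
        return [strip_sensitive(item, sensitive) for item in value]
    return value


session = {
    "tables": ["sales"],
    "model_config": {"endpoint": "openai", "api_key": "sk-example"},
    "charts": [{"type": "bar", "api_key": "sk-example"}],
}
exported = strip_sensitive(session)
```

Filtering on export rather than on load keeps secrets out of the stored file entirely, and returning a copy leaves the in-memory session usable after the export.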

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│  Frontend (React + Redux + TypeScript)                          │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │ Encoding     │  │ Chart        │  │ agents-chart compiler  │ │
│  │ Shelf (DnD)  │  │ Rec Engine   │  │ core → vegalite/       │ │
│  │              │  │              │  │        echarts/        │ │
│  │              │  │              │  │        chartjs/gofish  │ │
│  └──────┬───────┘  └──────┬───────┘  └────────┬───────────────┘ │
│         └─────────────────┴───────────────────┘                 │
│                           │                                     │
│  ┌────────────────────────▼─────────────────────────────────┐   │
│  │  ChartRenderService (headless: vega compile → SVG/PNG)   │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────┬───────────────────────────────────┘
                              │ /api/*
┌─────────────────────────────▼───────────────────────────────────┐
│  Backend (Flask + LiteLLM)                                      │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │ Agent Routes │  │ Table Routes │  │ Session Routes         │ │
│  │ (10+ agents) │  │ (DuckDB)     │  │ (save/load/export)     │ │
│  └──────┬───────┘  └──────────────┘  └────────────────────────┘ │
│         │                                                       │
│  ┌──────▼──────────────────┐  ┌────────────────────────────────┐│
│  │ Sandbox (local/docker)  │  │ Workspace (local/Azure Blob)   ││
│  └─────────────────────────┘  └────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

Breaking Changes

  • The monolithic py_sandbox.py has been replaced by the sandbox/ package with base.py, local_sandbox.py, docker_sandbox.py, and not_a_sandbox.py.
  • Sandbox selection is now configured via CLI argument (--sandbox-type).
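
A minimal sketch of parsing the new flag with `argparse` follows. Only the flag name `--sandbox-type` comes from this PR; the accepted values, the default, and the `parse_cli` helper are assumptions based on the tier names above.

```python
import argparse

# Assumed values, derived from the three tier names; the real choices may differ.
SANDBOX_CHOICES = ("local", "docker", "not_a_sandbox")


def parse_cli(argv=None):
    """Parse command-line options; pass a list to parse something other than sys.argv."""
    parser = argparse.ArgumentParser(description="Sketch of sandbox selection")
    parser.add_argument(
        "--sandbox-type",
        choices=SANDBOX_CHOICES,
        default="local",  # assumed default, matching "Default for local development"
        help="execution tier used for generated code",
    )
    return parser.parse_args(argv)


args = parse_cli(["--sandbox-type", "docker"])
selected = args.sandbox_type  # argparse converts the dash to an underscore
```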

dependabot bot and others added 7 commits January 5, 2026 22:58
Bumps [vega-selections](https://github.com/vega/vega) from 6.1.0 to 6.1.2.
- [Release notes](https://github.com/vega/vega/releases)
- [Commits](vega/vega@v6.1.0...v6.1.2)

---
updated-dependencies:
- dependency-name: vega-selections
  dependency-version: 6.1.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [vega-functions](https://github.com/vega/vega) from 6.1.0 to 6.1.1.
- [Release notes](https://github.com/vega/vega/releases)
- [Commits](vega/vega@v6.1.0...v6.1.1)

---
updated-dependencies:
- dependency-name: vega-functions
  dependency-version: 6.1.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
github-advanced-security bot (Contributor) left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Chenglong-MS and others added 28 commits February 17, 2026 15:44
- Compute bar width/height from layout step and stepPadding
- Add spacing parameter to spread operation
- For grouped bars, divide bar size by number of groups (matching ECharts behavior)
- Ensures grouped bars are properly sized and don't overlap

Co-authored-by: Cursor <cursoragent@cursor.com>
Fix GoFish bar sizing to match other backends
…unctions-6.1.1

Bump vega-functions from 6.1.0 to 6.1.1
…elections-6.1.2

Bump vega-selections from 6.1.0 to 6.1.2
Chenglong-MS changed the title from "upgrade to python 3.11 and with uv support" to "New architecture and new visualization capability" on Feb 24, 2026

3 participants