Handwrite Studio (Generative Letter Model Phase 1)

This project renders handwritten PNG previews in a Next.js app.

Phase 1 adds a real letter-level generative model (python_ai/lettergen/) that learns a distribution over your scanned/labeled glyph crops and samples new glyph pixels (not crop copy/paste) during rendering.

Current Website Flow

  • Frontend calls POST /api/generate
  • Next.js route validates request with Zod
  • lib/generate-service.ts runs the local Python script generate_handwriting_page.py (or a FastAPI/stub provider)
  • Python returns a PNG file; Next.js converts it to imageDataUrl (data:image/png;base64,...)
  • PreviewPanel renders the PNG unchanged
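The conversion step in the flow above can be sketched in Python (the actual logic lives in lib/generate-service.ts; the function name here is hypothetical):

```python
import base64
from pathlib import Path

def png_to_image_data_url(png_path: str) -> str:
    """Encode a PNG file as the data URL string the /api/generate route returns."""
    raw = Path(png_path).read_bytes()
    return "data:image/png;base64," + base64.b64encode(raw).decode("ascii")
```

The frontend can drop this string straight into an `<img src=...>` with no further decoding.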

Dataset Preparation (Phase 1 Letter Model)

The letter model trains from your existing lowercase character dataset:

  • out/labels.csv
    • CSV with at least: filename,label
    • label should be lowercase a-z
  • out/chars/
    • PNG crops matching filenames in out/labels.csv

Notes:

  • Input is scanned-paper-derived glyph crops (no stroke/tablet data required).
  • The model currently targets lowercase a-z.
  • Spaces are handled by the renderer (not the model).
  • Missing letters fall back to the existing crop sampler or fallback glyph path.
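Before training, it can help to audit the dataset against the format above. This hypothetical helper (not part of the repo) counts samples per letter and flags rows whose crop file is missing, assuming the filename,label columns described here:

```python
import csv
import string
from collections import Counter
from pathlib import Path

def audit_dataset(labels_csv: str, chars_dir: str):
    """Count labels per lowercase letter and flag rows whose crop PNG is missing."""
    counts = Counter()
    missing = []
    chars = Path(chars_dir)
    with open(labels_csv, newline="") as f:
        for row in csv.DictReader(f):
            label = row["label"].strip().lower()
            if label not in string.ascii_lowercase:
                continue  # the model targets lowercase a-z only
            counts[label] += 1
            if not (chars / row["filename"]).exists():
                missing.append(row["filename"])
    return counts, missing
```

Letters with very low counts are the ones most likely to fall back to the crop sampler.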

Phase 1: Train the Letter Generator (cVAE)

Recommended (CPU-friendly defaults):

python -m python_ai.lettergen.train --epochs 20 --batch-size 64

Useful options:

python -m python_ai.lettergen.train ^
  --epochs 30 ^
  --batch-size 64 ^
  --image-size 64 ^
  --latent-dim 32 ^
  --beta 0.15 ^
  --beta-warmup-epochs 6 ^
  --val-split 0.12 ^
  --seed 1234 ^
  --checkpoint-dir out/checkpoints ^
  --out-weights out/letter_gen.pt ^
  --out-config out/letter_gen_config.json
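The --beta-warmup-epochs flag suggests the KL weight is ramped up at the start of training. Assuming a linear warmup (the actual schedule in python_ai/lettergen/train is not documented here), the per-epoch weight would look like:

```python
def beta_at_epoch(epoch: int, beta: float = 0.15, warmup_epochs: int = 6) -> float:
    """Linearly ramp the KL weight from 0 to its target over the warmup epochs.

    Assumes 1-indexed epochs; beta holds at its target once warmup completes.
    """
    if warmup_epochs <= 0:
        return beta
    return beta * min(epoch / warmup_epochs, 1.0)
```

Warming up beta lets the reconstruction loss dominate early, which tends to avoid posterior collapse in small cVAEs.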

Outputs:

  • Weights: out/letter_gen.pt
  • Training config/summary JSON: out/letter_gen_config.json
  • Epoch checkpoints: out/checkpoints/letter_gen_epochXXX.pt

The training script always writes the canonical website activation artifact:

  • out/letter_gen.pt

Sanity Check: Sample Generated Letters

Generate a sample grid for one letter:

python -m python_ai.lettergen.sample --letter a --n 16 --out out/samples_a.png

Generate one sample for every letter:

python -m python_ai.lettergen.sample --letter all

Sample grids are written to out/ by default (or --out path if provided).
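The grid layout used by sample.py is not documented here, but one plausible scheme for arranging n samples near-square is:

```python
import math

def grid_shape(n: int) -> tuple[int, int]:
    """Choose a near-square (rows, cols) layout for n sampled glyphs."""
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    return rows, cols
```

For --n 16 this gives a 4x4 grid; non-square counts leave at most one partially filled row.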

Smoke Test (Python)

This script:

  • trains for 1 epoch on a small subset
  • then renders "hello world" via generate_handwriting_page.py using the learned letter model

python -m python_ai.lettergen.smoke_test

Website Integration (Automatic)

The local Python provider now auto-enables the learned letter model when:

  • HANDWRITE_PROVIDER resolves to local mode, and
  • out/letter_gen.pt exists (or HANDWRITE_LOCAL_LETTER_MODEL_PATH points to a weights file)

If the letter model is unavailable or fails:

  • it falls back to the existing crop sampler (out/chars + out/labels.csv)
  • if that also fails for a character, it falls back to the existing synthetic fallback glyph path
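That fallback chain can be sketched as a priority list of glyph sources, where a failing source falls through instead of crashing (hypothetical helper; the real logic lives in the Python renderer):

```python
from typing import Callable, Optional

GlyphSource = Callable[[str], Optional[bytes]]

def render_glyph(char: str, sources: list[GlyphSource]) -> Optional[bytes]:
    """Try each source in priority order: letter model, crop sampler, synthetic glyph."""
    for source in sources:
        try:
            glyph = source(char)
        except Exception:
            glyph = None  # a failing source falls through to the next one
        if glyph is not None:
            return glyph
    return None
```

Because every source is wrapped, a corrupt weights file degrades output quality but never breaks the /api/generate response.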

The frontend Generate Preview flow and response shape remain compatible.

Run the Website and Generate an Image

Start Next.js:

npm run dev

Open the app and click Generate Preview, or use the local smoke script:

npm run smoke:generate

The smoke script posts to http://localhost:3000/api/generate and asserts the response starts with:

  • data:image/png;base64,

SaaS-Style Dashboard UI (Generate / OCR / Dataset / Training)

The web app now ships with a richer dashboard UI:

  • Tabbed workspace: Generate, OCR, Dataset, Training
  • 16 themes with persistence (localStorage key: handwriting_theme)
  • Advanced generation controls (variation, style strength, temperature, seed lock, spacing, page style)
  • Animated preview reveal + shimmer loading state (respects reduced motion)
  • Debug overlay toggles (boxes, labels, fallback markers)
  • Presets save/load in browser storage
  • Export actions: PNG + transparent PNG (PDF UI placeholder)

Theme system

Theme definitions are centralized in:

  • lib/themes.ts

Each theme defines:

  • id
  • name
  • cssVars (--bg, --panel, --text, --muted, --border, --accent, --accent2, --shadow, etc.)

To add a theme:

  1. Add a new object to THEMES in lib/themes.ts
  2. Provide a unique id and readable name
  3. Fill cssVars with colors/gradient values

Backend endpoints used by the dashboard

  • POST /api/generate (existing generator route)
  • GET /api/status (training status; reads local files/logs when available)
  • GET /api/dataset-stats (dataset counts; falls back to public/mock/dataset_stats.json)
  • POST /api/train (stub; UI-ready)
  • POST /api/stop (stub; UI-ready)

If your backend process control is not wired yet, the UI still works and shows stub responses for train/stop controls.

Exact Run Steps (End-to-End)

  1. Install/start the website

npm install
npm run dev

  2. Train the generative letter model

python -m python_ai.lettergen.train --epochs 20 --batch-size 64

  3. Confirm the activation weights file exists

python -c "from pathlib import Path; p=Path('out/letter_gen.pt'); print(p.exists(), p.resolve())"

  4. Generate from the website and confirm the server log shows:

  • USING LETTER GEN MODEL: out/letter_gen.pt

Local Generator Tuning (Optional)

The request schema supports optional hidden fields (not required by the UI) for letter-model tuning:

  • letterModelEnabled
  • letterModelStyleStrength
  • letterModelBaselineJitter
  • letterModelWordSlant
  • letterModelRotationJitter
  • letterModelInkVariation

The renderer also applies realism tweaks with defaults:

  • baseline jitter (correlated within words)
  • pairwise kerning adjustments
  • per-word slant
  • per-letter rotation jitter
  • light ink thickness variation
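The "correlated within words" baseline jitter can be sketched as one shared offset per word plus a small independent wobble per letter (parameter names and sigmas here are illustrative, not the renderer's actual defaults):

```python
import random

def baseline_offsets(words, word_sigma=1.5, letter_sigma=0.4, seed=None):
    """Per-letter vertical offsets: one shared offset per word (the correlated
    component) plus a small independent per-letter wobble."""
    rng = random.Random(seed)
    offsets = []
    for word in words:
        word_offset = rng.gauss(0.0, word_sigma)  # shared by every letter in the word
        offsets.append([word_offset + rng.gauss(0.0, letter_sigma) for _ in word])
    return offsets
```

Correlating the jitter within a word keeps whole words drifting together, which reads as natural handwriting rather than random letter scatter.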

Environment Variables (Local Provider)

See .env.example. Relevant additions:

  • HANDWRITE_LOCAL_USE_LETTER_MODEL=1
  • HANDWRITE_LOCAL_LETTER_MODEL_PATH=out/letter_gen.pt
  • HANDWRITE_LOCAL_USE_CLASSIFIER=1
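The auto-enable rule from the integration section can be mirrored in a small resolver (hypothetical helper; the provider's real resolution logic may differ):

```python
import os
from pathlib import Path

def resolve_letter_model_path(default_path: str = "out/letter_gen.pt"):
    """Return the weights path when the flag is on and the file exists, else None."""
    if os.environ.get("HANDWRITE_LOCAL_USE_LETTER_MODEL", "1") != "1":
        return None
    path = Path(os.environ.get("HANDWRITE_LOCAL_LETTER_MODEL_PATH", default_path))
    return path if path.exists() else None
```

A `None` result means the provider falls back to the crop sampler rather than erroring out.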

Troubleshooting

Missing letters / weak quality for some letters

  • Check class counts in out/labels.csv
  • Add more samples for the weak letters
  • Retrain (python -m python_ai.lettergen.train ...)
  • The renderer will fall back to crop sampling or fallback glyphs for unsupported/low-sample letters

Blurry outputs

  • Increase training epochs
  • Reduce --beta slightly (e.g. 0.10)
  • Keep --image-size at 64 (higher sizes require more data/training)
  • Verify crop labels are clean and not heavily clipped

Letter model not being used in the website

  • Confirm out/letter_gen.pt exists
  • Check .env values (HANDWRITE_PROVIDER, HANDWRITE_LOCAL_USE_LETTER_MODEL)
  • Look for warnings in server logs; the generator falls back instead of crashing

CPU training is slow

  • Use fewer epochs first (--epochs 5) to validate the pipeline
  • Use --max-samples for iteration
  • Reduce --base-channels or --latent-dim for quicker experiments

Phase 2: Sentence/Line Model (Scaffold Only)

Phase 2 is intentionally scaffolded and not implemented yet.

Added placeholders:

  • python_ai/textgen/dataset_lines.py
  • python_ai/textgen/model_lines.py
  • python_ai/textgen/train_lines.py

What Phase 2 will need

Paired data:

  • one text label
  • one handwritten line image
  • exact alignment (the line image must contain exactly that text)

Recommended collection method

  • Use prompt sheets
  • Write one line per prompt
  • Save each line as a separate cropped image
  • Store labels in CSV/JSONL (split,text,image_path)

Expected future dataset format

  • train.csv / val.csv (or JSONL)
  • columns: text,image_path
  • images pre-cropped to a single handwritten line
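When that dataset exists, loading it into (text, image_path) pairs is straightforward; a minimal sketch against the CSV columns above (the Phase 2 dataset loader itself is still a stub):

```python
import csv

def load_line_pairs(csv_path: str) -> list[tuple[str, str]]:
    """Read (text, image_path) pairs for the future line-level dataset."""
    with open(csv_path, newline="") as f:
        return [(row["text"], row["image_path"]) for row in csv.DictReader(f)]
```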

The Phase 2 training stub currently raises NotImplementedError with guidance, giving the repo a clean extension point without pretending a large text-conditioned model is ready.
