This project renders handwritten PNG previews in a Next.js app.
Phase 1 adds a real letter-level generative model (python_ai/lettergen/) that learns a distribution over your scanned/labeled glyph crops and samples new glyph pixels (not crop copy/paste) during rendering.
- Frontend calls `POST /api/generate`; the Next.js route validates the request with Zod
- `lib/generate-service.ts` runs the local Python `generate_handwriting_page.py` (or a FastAPI/stub provider)
- Python returns a PNG file; Next.js converts it to `imageDataUrl` (`data:image/png;base64,...`)
- `PreviewPanel` renders the PNG unchanged
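The conversion to `imageDataUrl` is plain base64 wrapping. A minimal Python sketch of the idea (the real conversion happens on the Node side in `lib/generate-service.ts`; `png_to_data_url` is a hypothetical name):

```python
import base64

def png_to_data_url(png_bytes: bytes) -> str:
    """Wrap raw PNG bytes in a data: URL, mirroring what the route returns as imageDataUrl."""
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")

# Any byte string works for illustration; a real PNG starts with the \x89PNG signature.
url = png_to_data_url(b"\x89PNG\r\n\x1a\n")
```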
The letter model trains from your existing lowercase character dataset:
- `out/labels.csv` - CSV with at least `filename,label` columns; `label` should be lowercase `a-z`
- `out/chars/` - PNG crops matching filenames in `out/labels.csv`
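A quick way to sanity-check the label file before training — a hedged Python sketch (`check_labels` is a hypothetical helper, not part of the repo):

```python
import csv
import io
import string

def check_labels(csv_text: str) -> list[str]:
    """Return a list of problems found in a labels CSV with filename,label columns."""
    problems = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        label = (row.get("label") or "").strip()
        if len(label) != 1 or label not in string.ascii_lowercase:
            problems.append(f"line {i}: label {label!r} is not a single lowercase a-z letter")
    return problems

sample = "filename,label\nglyph_001.png,a\nglyph_002.png,B\n"
# glyph_002.png has an uppercase label, so one problem is reported.
```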
Notes:
- Input is scanned-paper-derived glyph crops (no stroke/tablet data required).
- The model currently targets lowercase `a-z`.
- Spaces are handled by the renderer (not the model).
- Missing letters fall back to the existing crop sampler or fallback glyph path.
Recommended (CPU-friendly defaults):
```shell
python -m python_ai.lettergen.train --epochs 20 --batch-size 64
```

Useful options:
```shell
python -m python_ai.lettergen.train ^
  --epochs 30 ^
  --batch-size 64 ^
  --image-size 64 ^
  --latent-dim 32 ^
  --beta 0.15 ^
  --beta-warmup-epochs 6 ^
  --val-split 0.12 ^
  --seed 1234 ^
  --checkpoint-dir out/checkpoints ^
  --out-weights out/letter_gen.pt ^
  --out-config out/letter_gen_config.json
```

Outputs:
- Weights: `out/letter_gen.pt`
- Training config/summary JSON: `out/letter_gen_config.json`
- Epoch checkpoints: `out/checkpoints/letter_gen_epochXXX.pt`
The training script always writes the canonical website activation artifact: `out/letter_gen.pt`.
Generate a sample grid for one letter:

```shell
python -m python_ai.lettergen.sample --letter a --n 16 --out out/samples_a.png
```

Generate one sample for every letter:

```shell
python -m python_ai.lettergen.sample --letter all
```

Sample grids are written to `out/` by default (or to the `--out` path if provided).
This script:
- trains for 1 epoch on a small subset
- then renders `"hello world"` via `generate_handwriting_page.py` using the learned letter model

```shell
python -m python_ai.lettergen.smoke_test
```

The local Python provider now auto-enables the learned letter model when:
- `HANDWRITE_PROVIDER` resolves to local mode, and
- `out/letter_gen.pt` exists (or `HANDWRITE_LOCAL_LETTER_MODEL_PATH` points to a weights file)
If the letter model is unavailable or fails:
- it falls back to the existing crop sampler (`out/chars` + `out/labels.csv`)
- if that also fails for a character, it falls back to the existing synthetic fallback glyph path
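The chain above can be sketched as follows (the callables are hypothetical stand-ins for the real letter model, crop sampler, and synthetic glyph path):

```python
from typing import Callable, Optional

def render_char(
    ch: str,
    letter_model: Optional[Callable[[str], bytes]],
    crop_sampler: Callable[[str], Optional[bytes]],
    fallback_glyph: Callable[[str], bytes],
) -> bytes:
    """Try the learned model, then crop sampling, then the synthetic fallback glyph."""
    if letter_model is not None:
        try:
            return letter_model(ch)
        except Exception:
            pass  # fall through instead of crashing, matching the renderer's behavior
    crop = crop_sampler(ch)
    if crop is not None:
        return crop
    return fallback_glyph(ch)
```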
The frontend Generate Preview flow and response shape remain compatible.
Start Next.js:
```shell
npm run dev
```

Open the app and click Generate Preview, or use the local smoke script:

```shell
npm run smoke:generate
```

The smoke script posts to `http://localhost:3000/api/generate` and asserts the response starts with:

```
data:image/png;base64,
```
The web app now ships with a richer dashboard UI:
- Tabbed workspace: `Generate`, `OCR`, `Dataset`, `Training`
- 16 themes with persistence (`localStorage` key: `handwriting_theme`)
- Advanced generation controls (variation, style strength, temperature, seed lock, spacing, page style)
- Animated preview reveal + shimmer loading state (respects reduced motion)
- Debug overlay toggles (boxes, labels, fallback markers)
- Presets save/load in browser storage
- Export actions: PNG + transparent PNG (PDF UI placeholder)
Theme definitions are centralized in `lib/themes.ts`.
Each theme defines:
- `id`
- `name`
- `cssVars` (`--bg`, `--panel`, `--text`, `--muted`, `--border`, `--accent`, `--accent2`, `--shadow`, etc.)
To add a theme:
- Add a new object to `THEMES` in `lib/themes.ts`
- Provide a unique `id` and readable `name`
- Fill `cssVars` with colors/gradient values
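Theme objects live in TypeScript, but their shape is easy to check anywhere. An illustrative Python sketch (the required-key list is inferred from the `cssVars` above, and `validate_theme` is hypothetical):

```python
# Keys assumed from the cssVars list documented above; lib/themes.ts is authoritative.
REQUIRED_CSS_VARS = {
    "--bg", "--panel", "--text", "--muted",
    "--border", "--accent", "--accent2", "--shadow",
}

def validate_theme(theme: dict) -> list[str]:
    """Return human-readable problems with a theme-shaped dict."""
    problems = []
    if not theme.get("id"):
        problems.append("missing id")
    if not theme.get("name"):
        problems.append("missing name")
    missing = REQUIRED_CSS_VARS - set(theme.get("cssVars", {}))
    if missing:
        problems.append("cssVars missing: " + ", ".join(sorted(missing)))
    return problems
```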
- `POST /api/generate` (existing generator route)
- `GET /api/status` (training status; reads local files/logs when available)
- `GET /api/dataset-stats` (dataset counts; falls back to `public/mock/dataset_stats.json`)
- `POST /api/train` (stub; UI-ready)
- `POST /api/stop` (stub; UI-ready)
If your backend process control is not wired yet, the UI still works and shows stub responses for train/stop controls.
- Install/start the website:

```shell
npm install
npm run dev
```

- Train the generative letter model:

```shell
python -m python_ai.lettergen.train --epochs 20 --batch-size 64
```

- Confirm the activation weights file exists:

```shell
python -c "from pathlib import Path; p=Path('out/letter_gen.pt'); print(p.exists(), p.resolve())"
```

- Generate from the website and confirm the server log shows:

```
USING LETTER GEN MODEL: out/letter_gen.pt
```
The request schema supports optional hidden fields (not required by the UI) for letter-model tuning:
- `letterModelEnabled`
- `letterModelStyleStrength`
- `letterModelBaselineJitter`
- `letterModelWordSlant`
- `letterModelRotationJitter`
- `letterModelInkVariation`
The renderer also applies realism tweaks with defaults:
- baseline jitter (correlated within words)
- pairwise kerning adjustments
- per-word slant
- per-letter rotation jitter
- light ink thickness variation
See `.env.example`. Relevant additions:

```
HANDWRITE_LOCAL_USE_LETTER_MODEL=1
HANDWRITE_LOCAL_LETTER_MODEL_PATH=out/letter_gen.pt
HANDWRITE_LOCAL_USE_CLASSIFIER=1
```
- Check class counts in `out/labels.csv`
- Add more samples for the weak letters
- Retrain (`python -m python_ai.lettergen.train ...`)
- The renderer will fall back to crop sampling or fallback glyphs for unsupported/low-sample letters
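Checking class counts takes a few lines with `collections.Counter`; a hypothetical helper over the `labels.csv` contents (the `min_count` threshold is an illustrative choice):

```python
import csv
import io
from collections import Counter

def weak_letters(csv_text: str, min_count: int = 20) -> dict[str, int]:
    """Map each under-represented a-z letter to its sample count in labels.csv."""
    counts = Counter(row["label"] for row in csv.DictReader(io.StringIO(csv_text)))
    return {
        letter: counts.get(letter, 0)
        for letter in "abcdefghijklmnopqrstuvwxyz"
        if counts.get(letter, 0) < min_count
    }
```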
- Increase training epochs
- Reduce `--beta` slightly (e.g. `0.10`)
- Keep `--image-size` at `64` (higher sizes require more data/training)
- Verify crop labels are clean and not heavily clipped
- Confirm `out/letter_gen.pt` exists
- Check `.env` values (`HANDWRITE_PROVIDER`, `HANDWRITE_LOCAL_USE_LETTER_MODEL`)
- Look for warnings in server logs; the generator falls back instead of crashing
- Use fewer epochs first (`--epochs 5`) to validate the pipeline
- Use `--max-samples` for iteration
- Reduce `--base-channels` or `--latent-dim` for quicker experiments
Phase 2 is intentionally scaffolded and not implemented yet.
Added placeholders:
- `python_ai/textgen/dataset_lines.py`
- `python_ai/textgen/model_lines.py`
- `python_ai/textgen/train_lines.py`
Paired data:
- one text label
- one handwritten line image
- exact alignment (the line image must contain exactly that text)
- Use prompt sheets
- Write one line per prompt
- Save each line as a separate cropped image
- Store labels in CSV/JSONL (`split,text,image_path`)
Expected files:
- `train.csv` / `val.csv` (or JSONL)
- columns: `text,image_path`
- images pre-cropped to a single handwritten line
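A minimal shape check for the JSONL variant (field names come from the format above; the helper and the PNG-extension assumption are illustrative):

```python
import json

def check_line_record(raw: str) -> list[str]:
    """Validate one JSONL record for the text,image_path line-dataset format."""
    try:
        rec = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    problems = []
    if not str(rec.get("text", "")).strip():
        problems.append("empty text")
    # PNG is assumed here to match the rest of the repo; adjust if lines use another format.
    if not str(rec.get("image_path", "")).endswith(".png"):
        problems.append("image_path should point to a cropped line image")
    return problems
```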
The Phase 2 training stub currently raises `NotImplementedError` with guidance, so the repo has a clean extension point without pretending a large text-conditioned model is ready.