cpu-only
Here are 12 public repositories matching this topic...
🦙 chat-o-llama: A lightweight, modern web interface for AI conversations with support for both Ollama and llama.cpp backends. Features persistent conversation management, real-time backend switching, intelligent context compression, and a clean responsive UI.
Updated Dec 10, 2025 - Python
An LLM-based content moderator: a Firefox extension that blocks webpages unrelated to work, based on page title and URL. Uses local LLMs via Ollama and LangChain so your browsing history never leaves your device, for complete privacy. Google Gemini is also supported.
Updated Dec 12, 2024 - Python
A high-performance Python library for extracting structured content from PDF documents with layout-aware text extraction. pdf_to_json preserves document structure including headings (H1-H6) and body text, outputting clean JSON format.
Updated Jan 6, 2026 - Python
Image classification with on-device inference, built with Flutter; the AI model runs on the mobile CPU.
Updated Jan 29, 2025 - Dart
Ternsig Virtual Mainframe Runtime (TVMR) — extensible VM with 10 standard extensions (121 instructions), Signal ISA, mastery learning, hot-reload firmware, and thermogram persistence.
Updated Feb 3, 2026 - Rust
CPU-friendly experience-based reasoning framework combining meta-learning (MAML), state space models (SSM), and memory buffers for fast few-shot adaptation. Pure NumPy implementation for edge devices and low-compute environments.
Updated Oct 23, 2025 - Python
A new one-shot face-swap approach for image and video domains, in a version tailored to run on CPU.
Updated Aug 20, 2024 - Python
CPU-optimized RAG pipeline reducing latency 2.7× (247 ms → 92 ms). Implements caching, filtering, and quantization for production use. Complete with FastAPI service, Docker setup, benchmarks, and investor materials.
Updated Jan 24, 2026 - Python
Face detection service with very fast inference using a nano model.
Updated Jan 26, 2025 - Python
A lightweight reproduction and analysis inspired by recent work on presentation-aware deepfake / spoofing detection, with a focus on codec-induced presentation mismatch (AMR) under CPU-only constraints.
Updated Feb 3, 2026 - Python
Chat-O-Llama is a user-friendly web interface for managing conversations with Ollama, featuring persistent chat history. Easily set up and start your chat sessions with just a few commands. 🐙💻
Updated Feb 4, 2026 - HTML