Unified LLM Interface
Write once, run everywhere
AbstractCore is a Python library that provides a unified create_llm(...) API across cloud + local LLM providers (OpenAI, Anthropic, Ollama, LMStudio, and more). The default install is intentionally lightweight; add providers and optional subsystems via explicit install extras.
First-class support for:
- sync + async
- streaming + non-streaming
- universal tool calling (native + prompted tool syntax)
- structured output (Pydantic)
- media input (images/audio/video + documents) with explicit, policy-driven fallbacks (*)
- optional capability plugins (`core.voice` / `core.audio` / `core.vision`) for deterministic TTS/STT and generative vision (via `abstractvoice` / `abstractvision`)
- glyph visual-text compression for long documents (**)
- unified OpenAI-compatible endpoint for all providers and models
(*) Media input is policy-driven (no silent semantic changes). If a model doesn’t support images, AbstractCore can use a configured vision model to generate short visual observations and inject them into your text-only request (vision fallback). Audio/video attachments are also policy-driven (audio_policy, video_policy) and may require capability plugins for fallbacks. See Media Handling and Centralized Config.
(**) Optional visual-text compression: render long text/PDFs into images and process them with a vision model to reduce token usage. See Glyph Visual-Text Compression (install with `pip install "abstractcore[compression]"`; for PDFs, also `pip install "abstractcore[media]"`).
Docs: Getting Started · FAQ · Docs Index · https://lpalbou.github.io/AbstractCore
```bash
# Core (small, lightweight default)
pip install abstractcore

# Providers
pip install "abstractcore[openai]"        # OpenAI SDK
pip install "abstractcore[anthropic]"     # Anthropic SDK
pip install "abstractcore[huggingface]"   # Transformers / torch (heavy)
pip install "abstractcore[mlx]"           # Apple Silicon local inference (heavy)
pip install "abstractcore[vllm]"          # NVIDIA CUDA / ROCm (heavy)

# Optional features
pip install "abstractcore[tools]"         # built-in web tools (web_search, skim_websearch, skim_url, fetch_url)
pip install "abstractcore[media]"         # images, PDFs, Office docs
pip install "abstractcore[compression]"   # glyph visual-text compression (Pillow-only)
pip install "abstractcore[embeddings]"    # EmbeddingManager + local embedding models
pip install "abstractcore[tokens]"        # precise token counting (tiktoken)
pip install "abstractcore[server]"        # OpenAI-compatible HTTP gateway

# Combine extras (zsh: keep quotes)
pip install "abstractcore[openai,media,tools]"

# Turnkey "everything" installs (pick one)
pip install "abstractcore[all-apple]"     # macOS/Apple Silicon (includes MLX, excludes vLLM)
pip install "abstractcore[all-non-mlx]"   # Linux/Windows/Intel Mac (excludes MLX and vLLM)
pip install "abstractcore[all-gpu]"       # Linux NVIDIA GPU (includes vLLM, excludes MLX)
```

OpenAI example (requires `pip install "abstractcore[openai]"`):
```python
from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")
response = llm.generate("What is the capital of France?")
print(response.content)
```

Multi-turn session example (requires `pip install "abstractcore[anthropic]"`):

```python
from abstractcore import create_llm, BasicSession

session = BasicSession(create_llm("anthropic", model="claude-haiku-4-5"))
print(session.generate("Give me 3 bakery name ideas.").content)
print(session.generate("Pick the best one and explain why.").content)
```

Streaming example (local model via Ollama):

```python
from abstractcore import create_llm

llm = create_llm("ollama", model="qwen3:4b-instruct")
for chunk in llm.generate("Write a short poem about distributed systems.", stream=True):
    print(chunk.content or "", end="", flush=True)
```

Async example:

```python
import asyncio
from abstractcore import create_llm

async def main():
    llm = create_llm("openai", model="gpt-4o-mini")
    resp = await llm.agenerate("Give me 5 bullet points about HTTP caching.")
    print(resp.content)

asyncio.run(main())
```

Token budgets:

```python
from abstractcore import create_llm

llm = create_llm(
    "openai",
    model="gpt-4o-mini",
    max_tokens=8000,          # total budget (input + output)
    max_output_tokens=1200,   # output cap
)
```

Open-source-first: local providers (Ollama, LMStudio, vLLM, openai-compatible, HuggingFace, MLX) are first-class. Cloud and gateway providers are optional.
Provider environment variables:

- `openai`: `OPENAI_API_KEY`, optional `OPENAI_BASE_URL`
- `anthropic`: `ANTHROPIC_API_KEY`, optional `ANTHROPIC_BASE_URL`
- `openrouter`: `OPENROUTER_API_KEY`, optional `OPENROUTER_BASE_URL` (default: `https://openrouter.ai/api/v1`)
- `portkey`: `PORTKEY_API_KEY`, `PORTKEY_CONFIG` (config id), optional `PORTKEY_BASE_URL` (default: `https://api.portkey.ai/v1`)
- `ollama`: local server at `OLLAMA_BASE_URL` (or legacy `OLLAMA_HOST`)
- `lmstudio`: OpenAI-compatible local server at `LMSTUDIO_BASE_URL` (default: `http://localhost:1234/v1`)
- `vllm`: OpenAI-compatible server at `VLLM_BASE_URL` (default: `http://localhost:8000/v1`)
- `openai-compatible`: generic OpenAI-compatible endpoints via `OPENAI_COMPATIBLE_BASE_URL` (default: `http://localhost:1234/v1`)
- `huggingface`: local models via Transformers (optional `HUGGINGFACE_TOKEN` for gated downloads)
- `mlx`: Apple Silicon local models (optional `HUGGINGFACE_TOKEN` for gated downloads)
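Local providers need no API key. A minimal sketch using the LMStudio provider and its default base URL (the model name is just an example of something you might have loaded):

```python
from abstractcore import create_llm

# LMStudio's OpenAI-compatible server is picked up from LMSTUDIO_BASE_URL
# (default: http://localhost:1234/v1); no API key is required.
llm = create_llm("lmstudio", model="qwen/qwen3-4b-2507")
print(llm.generate("Say hello from a local model.").content)
```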
You can also persist settings (including API keys) via the config CLI:
```bash
abstractcore --status
abstractcore --configure              # alias: --config
abstractcore --set-api-key openai sk-...
```

Features:
- Tools: universal tool calling across providers → Tool Calling
- Built-in tools (optional): web + filesystem helpers (`skim_websearch`, `skim_url`, `fetch_url`, `read_file`, …) → Tool Calling
- Tool syntax rewriting: `tool_call_tags` (Python) and `agent_format` (server) → Tool Syntax Rewriting
- Structured output: Pydantic-first with provider-aware strategies → Structured Output
- Media input: images/audio/video + documents (policies + fallbacks) → Media Handling and Vision Capabilities
- Capability plugins (optional): deterministic `llm.voice` / `llm.audio` / `llm.vision` surfaces → Capabilities
- Glyph visual-text compression: scale long-context document analysis via VLMs → Glyph Visual-Text Compression
- Embeddings and semantic search → Embeddings
- Observability: global event bus + interaction traces → Architecture, API Reference (Events), Interaction Tracing
- MCP (Model Context Protocol): discover tools from MCP servers (HTTP/stdio) → MCP
- OpenAI-compatible server: one `/v1` gateway for chat + optional `/v1/images/*` and `/v1/audio/*` endpoints → Server
By default (`execute_tools=False`), AbstractCore:

- returns clean assistant text in `response.content`
- returns structured tool calls in `response.tool_calls` (the host/runtime executes them)
```python
from abstractcore import create_llm, tool

@tool
def get_weather(city: str) -> str:
    return f"{city}: 22°C and sunny"

llm = create_llm("openai", model="gpt-4o-mini")
resp = llm.generate("What's the weather in Paris? Use the tool.", tools=[get_weather])
print(resp.content)
print(resp.tool_calls)
```

If you need tool-call markup preserved or rewritten in `content` for downstream parsers, pass `tool_call_tags=...` (e.g. `"qwen3"`, `"llama3"`, `"xml"`). See Tool Syntax Rewriting.
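Because tools are not executed by default, the execution loop lives in your host code. Below is a minimal sketch of such a loop; the exact shape of the items in `response.tool_calls` (attribute vs. dict access, field names like `name` and `arguments`) is an assumption here, so check Tool Calling for the actual structure.

```python
# Host-side execution sketch (continues the example above).
registry = {"get_weather": get_weather}

for call in resp.tool_calls or []:
    fn = registry.get(call.name)        # assumed field: call.name
    if fn is not None:
        result = fn(**call.arguments)   # assumed field: call.arguments (a dict)
        print(f"{call.name} -> {result}")
```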
```python
from pydantic import BaseModel
from abstractcore import create_llm

class Answer(BaseModel):
    title: str
    bullets: list[str]

llm = create_llm("openai", model="gpt-4o-mini")
answer = llm.generate("Summarize HTTP/3 in 3 bullets.", response_model=Answer)
print(answer.bullets)
```

Media input example (requires `pip install "abstractcore[media]"`):
```python
from abstractcore import create_llm

llm = create_llm("anthropic", model="claude-haiku-4-5")
resp = llm.generate("Describe the image.", media=["./image.png"])
print(resp.content)
```

Notes:
- Images: use a vision-capable model, or configure vision fallback for text-only models (`abstractcore --config`; `abstractcore --set-vision-provider PROVIDER MODEL`).
- Video: `video_policy="auto"` (default) uses native video when supported; otherwise it samples frames (requires `ffmpeg`/`ffprobe`) and routes them through image/vision handling, so you still need a vision-capable model or vision fallback configured.
- Audio: use an audio-capable model, or set `audio_policy="auto"`/`"speech_to_text"` and install `abstractvoice` for speech-to-text.
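Audio and video attachments are expected to go through the same `media=[...]` parameter as images; the snippet below is a sketch under that assumption (file names are placeholders, and the policies/fallbacks above decide how each attachment is actually handled):

```python
# Sketch: video is passed natively or frame-sampled depending on video_policy;
# audio is passed natively or transcribed depending on audio_policy.
resp = llm.generate("Summarize what happens in this clip.", media=["./demo.mp4"])
print(resp.content)
```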
Configure defaults (optional):

```bash
abstractcore --status
abstractcore --set-vision-provider lmstudio qwen/qwen3-vl-4b
abstractcore --set-audio-strategy auto
abstractcore --set-video-strategy auto
```

See Media Handling and Vision Capabilities.
OpenAI-compatible server:

```bash
pip install "abstractcore[server]"
python -m abstractcore.server.app
```

Use any OpenAI-compatible client and route to any provider/model via `model="provider/model"`:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
resp = client.chat.completions.create(
    model="ollama/qwen3:4b-instruct",
    messages=[{"role": "user", "content": "Hello from the gateway!"}],
)
print(resp.choices[0].message.content)
```

See Server.
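Streaming through the gateway should behave like any other OpenAI-compatible endpoint; a sketch under that assumption, reusing the client from above:

```python
# Assumes the gateway supports OpenAI-style streaming chat completions.
stream = client.chat.completions.create(
    model="ollama/qwen3:4b-instruct",
    messages=[{"role": "user", "content": "Write a haiku about gateways."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```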
Interactive chat:

```bash
abstractcore-chat --provider openai --model gpt-4o-mini
abstractcore-chat --provider lmstudio --model qwen/qwen3-4b-2507 --base-url http://localhost:1234/v1
abstractcore-chat --provider openrouter --model openai/gpt-4o-mini
```

Token limits:

- startup: `abstractcore-chat --max-tokens 8192 --max-output-tokens 1024 ...`
- in-REPL: `/max-tokens 8192` and `/max-output-tokens 1024`
AbstractCore also ships with ready-to-use CLI apps: `summarizer`, `extractor`, `judge`, `intent`, `deepsearch` (see docs/apps/).
Start here:
- Docs Index — navigation for all docs
- Prerequisites — provider setup (keys, local servers, hardware notes)
- Getting Started — first call + core concepts
- FAQ — common questions and setup gotchas
- Examples — end-to-end patterns and recipes
- Troubleshooting — common failures and fixes
Core features:
- Tool Calling — universal tools across providers (native + prompted)
- Tool Syntax Rewriting — rewrite tool-call syntax for different runtimes/clients
- Structured Output — schema enforcement + retry strategies
- Media Handling — images/audio/video + documents (policies + fallbacks)
- Vision Capabilities — image/video input, vision fallback, and how this differs from generative vision
- Glyph Visual-Text Compression — compress long documents into images for VLMs
- Generation Parameters — unified parameter vocabulary and provider quirks
- Session Management — conversation history, persistence, and compaction
- Embeddings — embeddings API and RAG building blocks
- Async Guide — async patterns, concurrency, best practices
- Centralized Config — `~/.abstractcore/config/abstractcore.json` + CLI config commands
- Capabilities — supported features and current limitations
- Interaction Tracing — inspect prompts/responses/usage for observability
- MCP — consume MCP tool servers (HTTP/stdio) as tool sources
Reference and internals:
- Architecture — system overview + event system
- API (Python) — how to use the public API
- API Reference — Python API (including events)
- Server — OpenAI-compatible gateway with tool/media support
- CLI Guide — interactive `abstractcore-chat` walkthrough
Project:
- Changelog — version history and upgrade notes
- Contributing — dev setup and contribution guidelines
- Security — responsible vulnerability reporting
- Acknowledgements — upstream projects and communities
License: MIT