Every developer using LLMs faces the same three problems:
Cost blindness — you cannot answer "how much did I spend today?"
No failover — when OpenAI goes down, your app goes down
Wasted money — identical prompts hit the API over and over instead of being cached
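The caching problem is the easiest to picture: if the same prompt arrives twice, the second call should be answered from memory instead of billed again. A minimal sketch of that idea in Python (class and method names are mine for illustration, not llmux's internals):

```python
import hashlib

class ResponseCache:
    """Cache LLM responses keyed on a hash of (model, prompt)."""

    def __init__(self):
        self._store = {}

    def _key(self, model, prompt):
        # Hash model + prompt so identical requests map to the same entry.
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def get(self, model, prompt):
        return self._store.get(self._key(model, prompt))

    def put(self, model, prompt, response):
        self._store[self._key(model, prompt)] = response

cache = ResponseCache()
cache.put("gpt-4o", "What is 2+2?", "4")
# The identical prompt is now served without a second API call.
assert cache.get("gpt-4o", "What is 2+2?") == "4"
```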
I built llmux to fix all three with zero code changes.
What is llmux?
A single Rust binary (~5MB) that sits between your code and every LLM API. It handles failover, caching, rate limiting, and cost tracking automatically.
Your code (any language)
         |
http://localhost:4000
         |
     ┌───────┐
     │ llmux │ ← single binary, ~5MB
     └───┬───┘
         │
   ┌─────┼─────┬─────┐
   ▼     ▼     ▼     ▼
OpenAI Claude Gemini Ollama
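The failover path in the diagram can be sketched as an ordered list of providers tried until one succeeds. This is a simplified illustration of the idea, not llmux's actual Rust code:

```python
class ProviderError(Exception):
    pass

def complete_with_failover(prompt, providers):
    """Try each (name, call) provider in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except ProviderError as exc:
            errors.append((name, str(exc)))
    # Every backend failed: surface the collected errors.
    raise ProviderError(f"all providers failed: {errors}")

# Example: the first provider is down, the second answers.
def openai_down(prompt):
    raise ProviderError("503 Service Unavailable")

def claude_up(prompt):
    return f"claude: {prompt}"

result = complete_with_failover("hello", [("openai", openai_down),
                                          ("claude", claude_up)])
```

With this shape, "OpenAI goes down" stops meaning "your app goes down": the request transparently lands on the next backend in the list.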
Zero Code Changes
You change one environment variable. That is it.
# Before
export OPENAI_BASE_URL=https://api.openai.com
# After
export OPENAI_BASE_URL=http://localhost:4000
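This works because OpenAI-compatible SDKs resolve their base URL from the environment when the client is constructed, so every request is rerouted without touching application code. A toy illustration of the mechanism (not the actual SDK source):

```python
import os

def resolve_endpoint(path="/v1/chat/completions"):
    """Build the request URL the way an OpenAI-compatible client would:
    read OPENAI_BASE_URL from the environment, else fall back to the
    official API host."""
    base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com")
    return base.rstrip("/") + path

# Point the environment at the proxy; no application code changes.
os.environ["OPENAI_BASE_URL"] = "http://localhost:4000"
url = resolve_endpoint()
```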