You ship your AI feature. It works. A week later your OpenAI bill is $400 and you have no idea which of your users caused which $0.05.
Cost per end-user is the single most underrated metric in production LLM apps, and it's surprisingly easy to instrument once every LLM call is tagged with the user who triggered it.
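To make the metric concrete: cost per end-user is token usage multiplied by the per-token price, summed by user. Here is a minimal sketch of that arithmetic. The `example-model` name and its prices are made up for illustration, and `UsageRecord`, `callCost`, and `costPerUser` are hypothetical names, so substitute your provider's real price list and whatever shape your logs have.

```typescript
// Hypothetical prices in USD per million tokens. These numbers are placeholders,
// not any provider's real pricing.
type Price = { inputPerMTok: number; outputPerMTok: number }

const PRICES: Record<string, Price> = {
  'example-model': { inputPerMTok: 0.5, outputPerMTok: 1.5 },
}

// One record per LLM call, tagged with the end-user who triggered it.
type UsageRecord = {
  userId: string
  model: string
  promptTokens: number
  completionTokens: number
}

// Cost of a single call in USD.
function callCost(r: UsageRecord): number {
  const p = PRICES[r.model]
  if (!p) throw new Error(`no price configured for model ${r.model}`)
  return (r.promptTokens * p.inputPerMTok + r.completionTokens * p.outputPerMTok) / 1_000_000
}

// Sum the cost of every call, grouped by user.
function costPerUser(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>()
  for (const r of records) {
    totals.set(r.userId, (totals.get(r.userId) ?? 0) + callCost(r))
  }
  return totals
}
```

The arithmetic is the easy part. The hard part is getting a `userId` attached to every call in the first place, which is what the approaches below are about.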
Here are the three approaches I've found work in practice, ranked by setup time.
Approach 1: Wrap your provider client (5 minutes)
Works for Express, Next.js Route Handlers, Fastify — anything that has a single OpenAI or Anthropic client instance.
import OpenAI from 'openai'
import { wrapOpenAI, withTrace } from '@voightxyz/openai'

const openai = wrapOpenAI(new OpenAI(), {
  agent: 'production-chat-api',
})

app.post('/api/chat', async (req, res) => {
  await withTrace(
    async () => {
      const r = await openai.chat.completions.create({
        model: 'gpt-4o-mini',
        messages: req.body.messages,
      })
      res.json({ reply: r.choices[0].message })
    },
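If you'd rather see the moving parts, the general pattern behind a wrapper like this can be hand-rolled in Node.js with `AsyncLocalStorage`: one helper puts the user ID into a request-scoped context, and a wrapper around the completion call reads it back and records token usage. The sketch below is an illustration of that pattern only, not a description of how `@voightxyz/openai` works internally. `runAsUser`, `trackUsage`, and `tokensByUser` are hypothetical names, and the in-memory `Map` stands in for a real metrics store.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

// Request-scoped context carrying the end-user ID (hypothetical shape).
const traceContext = new AsyncLocalStorage<{ userId: string }>()

// Minimal shape of a response that reports token usage, matching the
// `usage` field on non-streaming OpenAI chat completion responses.
type Usage = { prompt_tokens: number; completion_tokens: number }
type Completion = { usage?: Usage }

// In-memory token totals per user. Assumption: a real app would persist these.
const tokensByUser = new Map<string, Usage>()

// Hypothetical helper: run `fn` with `userId` visible to every call made inside it.
function runAsUser<T>(userId: string, fn: () => Promise<T>): Promise<T> {
  return traceContext.run({ userId }, fn)
}

// Hypothetical helper: wrap any "create completion" function so that its
// token usage is added to the current user's totals.
function trackUsage<A extends unknown[], R extends Completion>(
  create: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A) => {
    const result = await create(...args)
    // Calls made outside runAsUser are bucketed under 'unknown'.
    const userId = traceContext.getStore()?.userId ?? 'unknown'
    const prev = tokensByUser.get(userId) ?? { prompt_tokens: 0, completion_tokens: 0 }
    tokensByUser.set(userId, {
      prompt_tokens: prev.prompt_tokens + (result.usage?.prompt_tokens ?? 0),
      completion_tokens: prev.completion_tokens + (result.usage?.completion_tokens ?? 0),
    })
    return result
  }
}
```

In a route handler you would wrap the provider call once at startup with `trackUsage`, then call it inside `runAsUser(userId, ...)` for each request. Because the context follows the async call chain, nothing downstream has to pass the user ID along by hand.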