r/ClaudeCode
by dinkinflika0
How to track Claude Code's token usage and costs across multiple API keys
Been using Claude Code for a few weeks and wanted to route requests through a gateway for better observability and cost tracking across multiple API keys.
Expected it to be complicated. Wasn't.
**The setup:**
Bifrost is an open-source LLM gateway ([https://github.com/maximhq/bifrost](https://github.com/maximhq/bifrost)) that sits between Claude Code and Anthropic's API. Written in Go, adds \~11μs latency.
**Why I wanted this:**
1. **Observability** \- See every request/response, token usage, costs in one place
2. **Load balancing** \- Rotate between multiple API keys automatically
3. **Rate limiting** \- Don't hit limits on any single key
4. **Caching** \- Semantic caching for repeated queries
**Installation:**
`git clone https://github.com/maximhq/bifrost && cd bifrost && docker compose up`
Gateway runs on localhost:8080. Add your Anthropic API keys through the UI.
**Claude Code config:**
Change base URL in your config:
`{ "baseURL": "http://localhost:8080/v1", "provider": "anthropic" }`
That's it. Claude Code behaves as if it's talking to Anthropic directly, but every request goes through Bifrost.
**What I'm seeing:**
Dashboard shows every Claude Code request - which files it's reading, what code it's generating, token costs per session. Makes it way easier to see what's actually happening.
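If you want to sanity-check the per-key and per-session numbers yourself, the arithmetic is simple. Here's a rough Python sketch. Everything in it is an assumption for illustration: the `RequestLog` shape, the field names, and especially the prices, which are placeholders and not Anthropic's actual rates or Bifrost's log format.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens. Replace with your real rates.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


@dataclass
class RequestLog:
    """One logged request. Field names are illustrative, not a real schema."""
    api_key_id: str   # a label for the key, never the secret itself
    session_id: str
    input_tokens: int
    output_tokens: int


def request_cost(log: RequestLog) -> float:
    """Cost of a single request in USD under the placeholder prices."""
    return (
        log.input_tokens * PRICE_PER_MTOK["input"]
        + log.output_tokens * PRICE_PER_MTOK["output"]
    ) / 1_000_000


def cost_by(logs, field: str) -> dict:
    """Sum request costs grouped by a RequestLog attribute name."""
    totals = defaultdict(float)
    for log in logs:
        totals[getattr(log, field)] += request_cost(log)
    return dict(totals)


logs = [
    RequestLog("key-a", "s1", 1_000_000, 0),
    RequestLog("key-a", "s2", 0, 1_000_000),
    RequestLog("key-b", "s1", 500_000, 100_000),
]
per_key = cost_by(logs, "api_key_id")
per_session = cost_by(logs, "session_id")
```

Grouping by a key label rather than the key itself keeps secrets out of whatever report you build on top.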
Also helpful: when one API key hits rate limits, gateway automatically switches to another. No more interruptions mid-coding session.
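For anyone curious what that failover looks like in principle, here's a minimal Python sketch of round-robin key rotation with a cooldown. This is not Bifrost's implementation (that's in Go and I'm not reproducing it here). `KeyRotator`, the 60-second cooldown, and the method names are all made up for illustration. The idea is that the caller marks a key as rate limited (for example after an HTTP 429 response) and the rotator skips it until the cooldown expires.

```python
import time


class KeyRotator:
    """Round-robin over key labels, skipping keys that are cooling down."""

    def __init__(self, key_ids, cooldown_s=60.0, clock=time.monotonic):
        if not key_ids:
            raise ValueError("need at least one key")
        self._keys = list(key_ids)
        self._cooldown_s = cooldown_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._blocked_until = {k: float("-inf") for k in self._keys}
        self._next = 0

    def pick(self) -> str:
        """Return the next usable key, starting from the round-robin position."""
        now = self._clock()
        for offset in range(len(self._keys)):
            idx = (self._next + offset) % len(self._keys)
            key = self._keys[idx]
            if self._blocked_until[key] <= now:
                self._next = (idx + 1) % len(self._keys)
                return key
        raise RuntimeError("all keys are cooling down")

    def mark_rate_limited(self, key: str) -> None:
        """Take a key out of rotation until the cooldown has passed."""
        self._blocked_until[key] = self._clock() + self._cooldown_s
```

A fixed cooldown is the simplest choice. A real gateway might instead honor whatever retry hint the upstream API sends back.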
**Performance:**
Haven't noticed any latency difference. The gateway's \~11μs overhead is negligible next to the time an LLM call takes.
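To put the overhead in proportion, here's the arithmetic as a tiny Python snippet. The 11μs figure is the one quoted above. The 2-second LLM call time is purely an assumption for illustration, so plug in your own measurements.

```python
GATEWAY_OVERHEAD_S = 11e-6   # ~11 microseconds, the figure quoted in the post
ASSUMED_LLM_CALL_S = 2.0     # assumption: a typical call takes about 2 seconds


def overhead_fraction(overhead_s: float, call_s: float) -> float:
    """Share of total request time spent in the gateway."""
    return overhead_s / (call_s + overhead_s)


fraction = overhead_fraction(GATEWAY_OVERHEAD_S, ASSUMED_LLM_CALL_S)
print(f"gateway share of total time: {fraction:.6%}")
```

Under that assumption the gateway accounts for roughly 0.00055% of the request time.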
**Caching is interesting:**
If you ask Claude Code the same question twice (like "explain this function"), second request is instant and costs nothing. Semantic cache hits even with slightly different wording.
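If you haven't seen semantic caching before, the general technique is: embed each prompt as a vector, and on a new prompt return the stored response whose vector is similar enough. Here's a toy Python sketch of that idea. It is not how Bifrost does it. The bag-of-words `embed` function is a stand-in for a real embedding model, and the class name and the 0.8 threshold are assumptions picked for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real cache would use a model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Returns a stored response when a new prompt is similar enough."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self._entries = []  # list of (embedding, response) pairs

    def put(self, prompt: str, response: str) -> None:
        self._entries.append((embed(prompt), response))

    def get(self, prompt: str):
        """Best match at or above the threshold, or None on a miss."""
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vec, response in self._entries:
            score = cosine(query, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache()
cache.put("explain this function", "It parses the CLI arguments.")
hit = cache.get("please explain this function")
miss = cache.get("write unit tests for the parser")
```

The threshold is the knob to watch: too low and you get stale or wrong answers for questions that only look similar, too high and the cache rarely hits.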
Full setup guide: [https://www.getmaxim.ai/bifrost/blog/integrating-claude-code-with-bifrost-gateway/](https://www.getmaxim.ai/bifrost/blog/integrating-claude-code-with-bifrost-gateway/)
Anyone else routing Claude Code through a gateway? Curious what you're using and why.
*Disclosure: I work at Maxim (we built Bifrost)*