r/ClaudeCode by dinkinflika0 New

How to track Claude Code's token usage and costs across multiple API keys

Been using Claude Code for a few weeks and wanted to route requests through a gateway for better observability and cost tracking across multiple API keys. Expected it to be complicated. Wasn't.

**The setup:** Bifrost is an open-source LLM gateway ([https://github.com/maximhq/bifrost](https://github.com/maximhq/bifrost)) that sits between Claude Code and Anthropic's API. Written in Go, adds ~11μs latency.

**Why I wanted this:**

1. **Observability** - See every request/response, token usage, costs in one place
2. **Load balancing** - Rotate between multiple API keys automatically
3. **Rate limiting** - Don't hit limits on any single key
4. **Caching** - Semantic caching for repeated queries

**Installation:**

    git clone https://github.com/maximhq/bifrost
    cd bifrost
    docker compose up

Gateway runs on localhost:8080. Add your Anthropic API keys through the UI.

**Claude Code config:** Change base URL in your config:

    {
      "baseURL": "http://localhost:8080/v1",
      "provider": "anthropic"
    }

That's it. Claude Code thinks it's talking to Anthropic directly, but goes through Bifrost.

**What I'm seeing:** Dashboard shows every Claude Code request - which files it's reading, what code it's generating, token costs per session. Makes it way easier to see what's actually happening.

Also helpful: when one API key hits rate limits, gateway automatically switches to another. No more interruptions mid-coding session.

**Performance:** Haven't noticed any latency difference. Gateway overhead is ~11μs which is basically nothing compared to LLM call time.

**Caching is interesting:** If you ask Claude Code the same question twice (like "explain this function"), second request is instant and costs nothing. Semantic cache hits even with slightly different wording.

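For anyone curious what "rotate keys and track cost per key" boils down to, here is a rough Python sketch of the idea. It is not Bifrost's code or API: `KeyPool`, `KeyUsage`, and every method name are made up for illustration, and the per-million-token prices are placeholders to swap for your real rates.

```python
from dataclasses import dataclass, field


@dataclass
class KeyUsage:
    """Running totals for a single API key."""
    input_tokens: int = 0
    output_tokens: int = 0
    requests: int = 0


@dataclass
class KeyPool:
    """Round-robin pool that skips rate-limited keys and tallies usage per key."""
    keys: list[str]
    # Placeholder prices in USD per million tokens (assumption, not real rates).
    input_price_per_mtok: float = 3.0
    output_price_per_mtok: float = 15.0
    usage: dict[str, KeyUsage] = field(default_factory=dict)
    limited: set[str] = field(default_factory=set)
    _next: int = 0  # index of the next key to try

    def __post_init__(self) -> None:
        for key in self.keys:
            self.usage.setdefault(key, KeyUsage())

    def pick(self) -> str:
        """Return the next key in rotation that is not marked rate-limited."""
        for _ in range(len(self.keys)):
            key = self.keys[self._next]
            self._next = (self._next + 1) % len(self.keys)
            if key not in self.limited:
                return key
        raise RuntimeError("all keys are rate-limited")

    def mark_limited(self, key: str) -> None:
        """Take a key out of rotation, e.g. after an HTTP 429 response."""
        self.limited.add(key)

    def record(self, key: str, input_tokens: int, output_tokens: int) -> None:
        """Add one request's token counts to the key's running totals."""
        totals = self.usage[key]
        totals.input_tokens += input_tokens
        totals.output_tokens += output_tokens
        totals.requests += 1

    def cost(self, key: str) -> float:
        """Estimated spend in USD for one key at the configured prices."""
        totals = self.usage[key]
        return (
            totals.input_tokens * self.input_price_per_mtok
            + totals.output_tokens * self.output_price_per_mtok
        ) / 1_000_000
```

Usage would look like `pool = KeyPool(keys=["key-a", "key-b"])`, then `key = pool.pick()` before each request, `pool.record(key, input_tokens, output_tokens)` after it, and `pool.mark_limited(key)` when a key gets rate-limited so the next `pick()` moves on. A real gateway would also put limited keys back into rotation after a cooldown, which this sketch leaves out.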
Full setup guide: [https://www.getmaxim.ai/bifrost/blog/integrating-claude-code-with-bifrost-gateway/](https://www.getmaxim.ai/bifrost/blog/integrating-claude-code-with-bifrost-gateway/)

Anyone else routing Claude Code through a gateway? Curious what you're using and why.

*Disclosure: I work at Maxim (we built Bifrost)*
