Budget Enforcement Fix
The Business Rules
Credit Pack (pay-as-you-go):
- User buys credits ($10/$20/$100 packs)
- Each model has input/output pricing
- Request goes in → response comes back → balance debited
- If balance goes slightly negative on a response, that's OK
- Next request is blocked if balance ≤ 0
Unlimited ($200/mo):
- isUnlimited: true → unlimited requests to Fabric-hosted models
- Balance is irrelevant, whatever its value
- Fabric models always work while subscription is active
Current reality: LiteLLM only serves Fabric-hosted models (Devstral, MiniMax, Qwen Coder). No third-party models (Claude, GPT-4) are available. So Unlimited users' balance should never be decremented.
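The two plans above can be sketched as a pair of predicates. This is a minimal illustration, not the app's actual code: the Team shape and the isFabricModel flag are assumptions, and shouldDecrementBalance is given a hypothetical signature (the document names the function later but does not show it).

```typescript
// Hypothetical shape: only isUnlimited and balanceCents are named in this document.
interface Team {
  isUnlimited: boolean;
  balanceCents: number;
}

// Whether the next request should be admitted under the business rules.
function isRequestAllowed(team: Team, isFabricModel: boolean): boolean {
  // Unlimited: Fabric-hosted models always work while the subscription is active
  // (assumption: isUnlimited is only true for an active subscription).
  if (team.isUnlimited && isFabricModel) return true;
  // Credit pack: a slightly negative balance is tolerated on a response,
  // but the next request is blocked once the balance is at or below zero.
  return team.balanceCents > 0;
}

// Whether a completed request should debit team.balanceCents.
function shouldDecrementBalance(team: Team, isFabricModel: boolean): boolean {
  return !(team.isUnlimited && isFabricModel);
}
```

Because only Fabric-hosted models are currently served, the second predicate is always false for Unlimited teams, which is why their balance should never move.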
Architecture
Two databases:
| Component | Database | What It Stores |
|---|---|---|
| Our App (Vercel) | NeonDB (cloud) | Users, teams, balanceCents, payments, usage records |
| LiteLLM (69.30.85.97) | Local Postgres | API keys with max_budget + spend, spend logs |
api.codewithfabric.com goes directly to LiteLLM via Cloudflare — our app is NOT in the chat request path.
How LiteLLM Enforces Budgets
LiteLLM independently tracks spend on every key. On each request:
- LiteLLM checks: key.spend > key.max_budget → if yes, block ("Budget exceeded")
- Routes request to model
- Increments key.spend by the cost in its own DB
- Fires webhook to our /api/litellm/callback
- Our callback decrements team.balanceCents in NeonDB
Two independent counters that are never synced:
- LiteLLM: key.spend goes UP (accumulates forever)
- NeonDB: team.balanceCents goes DOWN (decremented by callback)
How Data Flows
The Bugs
| # | Bug | Location | Impact |
|---|---|---|---|
| 1 | Unlimited users get maxBudget: 0 on key creation | /api/user/status:143 | New Unlimited users blocked immediately |
| 2 | Dashboard rejects Unlimited users with 402 | /api/keys/generate:44 | Can't generate key from dashboard |
| 3 | LiteLLM default cap max_budget: 50 in config | Server config | Any key without explicit budget capped at $50 |
| 4 | Callback decrements NeonDB balance but never syncs to LiteLLM | /api/litellm/callback | LiteLLM budget drifts, eventually blocks users |
| 5 | Subscription cancellation doesn't update LiteLLM key | /api/stripe/webhooks | Cancelled users keep LiteLLM access |
| 6 | LiteLLM Prisma client was stale | Server | Key generation broken for all users (fixed today) |
Why Ramtin Got Blocked
Ramtin's personal account (credit pack):
- Key created Dec 18 with max_budget: $99.84 (his balance at the time)
- He used MiniMax heavily, and LiteLLM's key.spend accumulated to $99.93
- LiteLLM checked: 99.93 > 99.84 → blocked
- Meanwhile, our NeonDB showed $88.46 remaining (different number!)
- The two systems were never synced
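The mismatch can be reproduced with the numbers above. The sketch below models the gate exactly as this document describes it (block when spend exceeds max_budget); the type and function names are hypothetical, and treating a null max_budget as "no limit" is an assumption that still needs to be verified against LiteLLM.

```typescript
// Hypothetical view of the per-key counters LiteLLM keeps (in dollars).
interface LiteLLMKeyState {
  spend: number;
  maxBudget: number | null; // assumption: null means "no limit"
}

// Gate as described in this document: block when spend exceeds max_budget.
function isBlockedByLiteLLM(key: LiteLLMKeyState): boolean {
  return key.maxBudget !== null && key.spend > key.maxBudget;
}

// Ramtin's state at the time of the incident.
const ramtinKey: LiteLLMKeyState = { spend: 99.93, maxBudget: 99.84 };
const ramtinBalanceCents = 8846; // $88.46 remaining in NeonDB

console.log(isBlockedByLiteLLM(ramtinKey)); // true: LiteLLM refuses the request
console.log(ramtinBalanceCents > 0);        // true: our app thinks he can still spend
```

The gate never reads team.balanceCents, so a healthy balance in NeonDB cannot unblock a key whose own counters say the budget is exhausted.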
The Fix
Key Insight: How to Sync Correctly
LiteLLM blocks when spend > max_budget. Since spend accumulates forever, we can't just set max_budget = balance. Example:
- LiteLLM spend = $50, our balance = $20 remaining
- If we set max_budget: 20 → LiteLLM sees 50 > 20 is true → blocked (wrong!)
- If we set max_budget: 70 → LiteLLM sees 50 > 70 is false → allowed, $20 more to go (correct!)
Solution: On every sync, reset spend: 0 AND set max_budget: balanceCents / 100 (the remaining balance in dollars).
This makes our DB the single source of truth. LiteLLM becomes a simple enforcement gate for the remaining balance.
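The arithmetic behind this choice can be checked with a short sketch. The names syncFromBalance and headroom are hypothetical, and clamping a negative balance to a max_budget of 0 is an assumption taken from the expected values in the test strategy.

```typescript
// Hypothetical payload: the two key fields written on every sync (in dollars).
interface KeySync {
  maxBudget: number;
  spend: number;
}

// Reset-style sync: spend goes back to 0 and max_budget becomes the remaining balance.
function syncFromBalance(balanceCents: number): KeySync {
  // Assumption: a slightly negative balance maps to a budget of 0, not a negative budget.
  return { maxBudget: Math.max(0, balanceCents) / 100, spend: 0 };
}

// Dollars LiteLLM will still allow before the gate trips.
function headroom(key: KeySync): number {
  return key.maxBudget - key.spend;
}

const naive: KeySync = { maxBudget: 20, spend: 50 };  // max_budget = balance, spend untouched
const offset: KeySync = { maxBudget: 70, spend: 50 }; // max_budget = spend + balance
const reset: KeySync = syncFromBalance(2000);         // spend reset, max_budget = balance

console.log(headroom(naive));  // -30: already over budget, blocked
console.log(headroom(offset)); // 20
console.log(headroom(reset));  // 20
```

The offset and reset approaches leave the same $20 of headroom. Resetting has the advantage that the sync needs only our own balance and never has to read LiteLLM's accumulated spend first.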
Change 1: Fix Key Generation for Unlimited Users
Files: src/app/api/user/status/route.ts, src/app/api/keys/generate/route.ts
- Detect isUnlimited before setting maxBudget
- Unlimited users → maxBudget: null (no limit, balance is irrelevant)
- Credit users → maxBudget: balance / 100
- Dashboard route: allow Unlimited users with $0 balance
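One way to share this decision between both routes is a small helper like the one below. It is a sketch under assumptions: decideKeyGeneration and its return type are hypothetical, and it assumes the existing 402 is meant for credit users with no remaining balance.

```typescript
// Hypothetical shape: only isUnlimited and balanceCents are named in this document.
interface Team {
  isUnlimited: boolean;
  balanceCents: number;
}

type KeyDecision =
  | { allowed: true; maxBudget: number | null } // maxBudget in dollars, null = no limit
  | { allowed: false; status: 402 };

function decideKeyGeneration(team: Team): KeyDecision {
  // Check the plan first so an Unlimited team with a $0 balance is never rejected.
  if (team.isUnlimited) return { allowed: true, maxBudget: null };
  // Assumption: credit users with nothing left still receive 402 Payment Required.
  if (team.balanceCents <= 0) return { allowed: false, status: 402 };
  return { allowed: true, maxBudget: team.balanceCents / 100 };
}
```

Checking isUnlimited before looking at the balance addresses both bug 1 (maxBudget: 0 on creation) and bug 2 (402 from the dashboard).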
Change 2: Sync LiteLLM on Every Balance Decrement
File: src/app/api/litellm/callback/route.ts
After decrementing team.balanceCents, call:
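
    await litellm.updateKey({
      key: apiKey.litellmKeyId,
      // Convert cents to dollars; clamp at 0 so a slightly negative balance
      // never produces a negative budget
      maxBudget: Math.max(0, updatedTeam.balanceCents) / 100,
      spend: 0, // Reset spend so max_budget = remaining balance
    });
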
For Unlimited users on Fabric models: shouldDecrementBalance returns false → no decrement → no sync needed → max_budget stays null.
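The sync call should not be able to fail the callback itself, since the balance has already been decremented by then. The sketch below shows one way to wrap it; KeyUpdater is a hypothetical minimal view of the client in src/lib/litellm.ts, and syncKeyAfterDecrement and logError are illustrative names.

```typescript
// Hypothetical minimal view of the LiteLLM client used by the callback.
interface KeyUpdater {
  updateKey(params: { key: string; maxBudget: number | null; spend: number }): Promise<unknown>;
}

// Returns true when the key was synced, false when the LiteLLM call failed.
async function syncKeyAfterDecrement(
  client: KeyUpdater,
  litellmKeyId: string,
  balanceCents: number,
  logError: (message: string, error: unknown) => void = console.error,
): Promise<boolean> {
  try {
    await client.updateKey({
      key: litellmKeyId,
      maxBudget: Math.max(0, balanceCents) / 100, // cents to dollars, never negative
      spend: 0,
    });
    return true;
  } catch (error) {
    // Swallow the failure: the NeonDB balance is already correct, and the next
    // successful sync overwrites both max_budget and spend from that balance.
    logError(`LiteLLM sync failed for key ${litellmKeyId}`, error);
    return false;
  }
}
```

Because every sync writes absolute values derived from team.balanceCents rather than deltas, a missed sync is self-healing: nothing needs to be replayed.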
Change 3: Sync LiteLLM on Stripe Events
File: src/app/api/stripe/webhooks/route.ts
| Event | Current | Fix |
|---|---|---|
| checkout (Unlimited) | Sets maxBudget: undefined | Set maxBudget: null explicitly, verify LiteLLM treats as "no limit" |
| checkout (Credit) | Syncs budget | Also reset spend: 0 |
| invoice.paid (Credit) | Syncs budget | Also reset spend: 0 |
| subscription.updated | No LiteLLM sync | Add sync when status changes |
| subscription.deleted (Unlimited) | No LiteLLM sync | Set maxBudget: 0 (or remaining credit balance) |
| subscription.deleted (Credit) | No LiteLLM sync | Set maxBudget: remaining balance / 100, reset spend |
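The difference between undefined and null in the first row matters if the client builds its request body with JSON.stringify, which is an assumption here since the request helper is not shown. Under that assumption, an undefined field never reaches LiteLLM at all, so an existing max_budget on the key would be left in place:

```typescript
// JSON.stringify drops properties whose value is undefined but keeps null.
const omitted = JSON.stringify({ key: "sk-example", max_budget: undefined });
const explicit = JSON.stringify({ key: "sk-example", max_budget: null });

console.log(omitted);  // {"key":"sk-example"}
console.log(explicit); // {"key":"sk-example","max_budget":null}
```

Sending null explicitly at least puts the field on the wire. Whether LiteLLM then interprets it as "no limit" is the part that still has to be verified.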
Change 4: Remove LiteLLM Default Cap
Server config: /home/farpoint/litellm_config.yaml on 69.30.85.97
Remove max_budget: 50 from key_generation_settings.
Change 5: Add spend Reset to LiteLLM Client
File: src/lib/litellm.ts
The updateKey method needs to support setting spend:

    async updateKey(params: {
      key: string;
      maxBudget?: number | null;
      spend?: number; // Add this
      rpmLimit?: number;
    }): Promise<LiteLLMKey> {
      return this.request<LiteLLMKey>("/key/update", "POST", {
        key: params.key,
        max_budget: params.maxBudget,
        spend: params.spend, // Add this
        rpm_limit: params.rpmLimit,
      });
    }

Change 6: Migration Script
One-time script to fix all existing keys:
For each user with an API key:
- If team.isUnlimited: set max_budget=null, spend=0
- If credit pack: set max_budget=balance/100, spend=0
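The planning half of such a script can be kept pure so it is easy to dry-run before touching any key. The sketch below is illustrative only: TeamKeyRow, KeyUpdate and planMigration are hypothetical names, and the query that produces the rows is left out.

```typescript
// Hypothetical row: one API key joined with its team's plan and balance.
interface TeamKeyRow {
  litellmKeyId: string;
  isUnlimited: boolean;
  balanceCents: number;
}

// Payload in the shape updateKey accepts once it supports spend.
interface KeyUpdate {
  key: string;
  maxBudget: number | null; // dollars, null = no limit
  spend: number;
}

// Compute the update for every existing key without performing any I/O.
function planMigration(rows: TeamKeyRow[]): KeyUpdate[] {
  return rows.map((row): KeyUpdate => ({
    key: row.litellmKeyId,
    // Assumption: negative balances are clamped to a budget of 0.
    maxBudget: row.isUnlimited ? null : Math.max(0, row.balanceCents) / 100,
    spend: 0,
  }));
}

const plan = planMigration([
  { litellmKeyId: "key-a", isUnlimited: true, balanceCents: 0 },
  { litellmKeyId: "key-b", isUnlimited: false, balanceCents: 8846 },
]);
console.log(plan);
```

The script would then log the plan for review and pass each entry to litellm.updateKey in a loop, recording failures so they can be retried.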
State Transitions
Files to Modify
| File | Changes |
|---|---|
| src/app/api/user/status/route.ts | Unlimited-aware key generation |
| src/app/api/keys/generate/route.ts | Allow Unlimited users, fix maxBudget |
| src/app/api/litellm/callback/route.ts | Sync LiteLLM key after balance decrement |
| src/app/api/stripe/webhooks/route.ts | Add missing LiteLLM syncs, reset spend on all syncs |
| src/lib/litellm.ts | Add spend param to updateKey |
| Server: litellm_config.yaml | Remove default max_budget: 50 |
Test Strategy
| Scenario | Expected |
|---|---|
| Unlimited user, first login, $0 balance | Key created with max_budget: null — all Fabric models work |
| Unlimited user generates key from dashboard | Works (no 402 error) |
| Credit user, $10 balance, uses $3 | Callback: balance → $7, LiteLLM synced: spend=0, max_budget=7 |
| Credit user, $0.50 balance, response costs $0.60 | Response completes, balance → -$0.10, LiteLLM synced: spend=0, max_budget=0 → next request blocked |
| Stripe renewal adds $10 to credit user | Balance → $10, LiteLLM synced: spend=0, max_budget=10 |
| Unlimited user cancels | isUnlimited=false, LiteLLM: max_budget=0 (or remaining credits) |
| LiteLLM sync fails in callback | Log error, continue — next sync will catch up |
Risks
| Risk | Likelihood | Mitigation |
|---|---|---|
| LiteLLM API call fails in callback | Low | Log failure; next callback will sync again |
| Resetting spend: 0 on every callback is noisy | Medium | LiteLLM handles it; spend logs are separate from the key counter |
| Concurrent requests race between syncs | Low | Small overspend acceptable (sub-cent per request) |
| LiteLLM config default overrides null | Low | Test explicitly after removing config default |



