Budget Enforcement Fix

The Business Rules

Credit Pack (pay-as-you-go):

  • User buys credits ($10/$20/$100 packs)
  • Each model has input/output pricing
  • Request goes in → response comes back → balance debited
  • If balance goes slightly negative on a response, that's OK
  • Next request is blocked if balance ≤ 0

Unlimited ($200/mo):

  • isUnlimited: true → unlimited requests to Fabric-hosted models
  • Balance is irrelevant — doesn't matter what it is
  • Fabric models always work while subscription is active

Current reality: LiteLLM only serves Fabric-hosted models (Devstral, MiniMax, Qwen Coder). No third-party models (Claude, GPT-4) are available. So Unlimited users' balance should never be decremented.
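The rules above can be sketched as two small functions. This is a minimal illustration, not the app's code: `TeamBilling`, `isRequestAllowed`, and `debit` are hypothetical names, and only `isUnlimited` and `balanceCents` come from this document.

```typescript
// Hypothetical shape; only isUnlimited and balanceCents appear in the document.
interface TeamBilling {
  isUnlimited: boolean;
  balanceCents: number;
}

// Decide whether the NEXT request may start.
function isRequestAllowed(team: TeamBilling): boolean {
  if (team.isUnlimited) {
    return true; // balance is irrelevant for Unlimited
  }
  return team.balanceCents > 0; // credit pack: blocked when balance <= 0
}

// Debit a completed response. The balance may dip slightly below zero;
// Unlimited teams on Fabric-hosted models are never debited.
function debit(team: TeamBilling, costCents: number): TeamBilling {
  if (team.isUnlimited) {
    return team;
  }
  return { ...team, balanceCents: team.balanceCents - costCents };
}
```

A credit user with 50 cents who receives a 60-cent response ends at -10 cents, and only the following request is refused.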

Architecture

[Diagram: Database Architecture]

Two databases:

| Component | Database | What It Stores |
|---|---|---|
| Our App (Vercel) | NeonDB (cloud) | Users, teams, balanceCents, payments, usage records |
| LiteLLM (69.30.85.97) | Local Postgres | API keys with max_budget + spend, spend logs |

Requests to api.codewithfabric.com go directly to LiteLLM via Cloudflare; our app is NOT in the chat request path.

How LiteLLM Enforces Budgets

LiteLLM independently tracks spend on every key. On each request:

  1. LiteLLM checks: key.spend > key.max_budget → if yes, block ("Budget exceeded")
  2. Routes request to model
  3. Increments key.spend by the cost in its own DB
  4. Fires webhook to our /api/litellm/callback
  5. Our callback decrements team.balanceCents in NeonDB

Two independent counters that are never synced:

  • LiteLLM: key.spend goes UP (accumulates forever)
  • NeonDB: team.balanceCents goes DOWN (decremented by callback)
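To make the drift concrete, here is a small simulation of the two counters. It is an illustrative model only: the names are hypothetical, amounts are kept in cents so the arithmetic is exact, and the top-up step assumes the existing behaviour of syncing max_budget to the new balance without touching spend.

```typescript
// Illustrative model only; all amounts are in cents.
interface Counters {
  keySpendCents: number;     // LiteLLM side: only ever goes up
  keyMaxBudgetCents: number; // LiteLLM side: set at key creation or top-up
  balanceCents: number;      // NeonDB side: decremented by the callback
}

// LiteLLM's gate as described above: block when spend > max_budget.
function isBlocked(c: Counters): boolean {
  return c.keySpendCents > c.keyMaxBudgetCents;
}

// One completed request: LiteLLM adds the cost, the callback subtracts it.
function recordRequest(c: Counters, costCents: number): Counters {
  return {
    ...c,
    keySpendCents: c.keySpendCents + costCents,
    balanceCents: c.balanceCents - costCents,
  };
}

// A top-up that syncs max_budget to the new balance but leaves spend untouched.
function topUpWithoutSpendReset(c: Counters, amountCents: number): Counters {
  const balanceCents = c.balanceCents + amountCents;
  return { ...c, balanceCents, keyMaxBudgetCents: balanceCents };
}

let counters: Counters = { keySpendCents: 0, keyMaxBudgetCents: 1000, balanceCents: 1000 };
counters = recordRequest(counters, 600);           // spend 600, balance 400
counters = topUpWithoutSpendReset(counters, 1000); // balance 1400, max_budget 1400, spend still 600
counters = recordRequest(counters, 900);           // spend 1500, balance 500

// LiteLLM refuses the key even though NeonDB still shows credit.
const blockedWithCredit = isBlocked(counters) && counters.balanceCents > 0;
```

In this run the key is blocked at a spend of $15.00 against a $14.00 budget while the team still holds $5.00.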

How Data Flows

[Diagram: Data Flow]

The Bugs

[Diagram: Bug Flow]

| # | Bug | Location | Impact |
|---|---|---|---|
| 1 | Unlimited users get maxBudget: 0 on key creation | /api/user/status:143 | New Unlimited users blocked immediately |
| 2 | Dashboard rejects Unlimited users with 402 | /api/keys/generate:44 | Can't generate key from dashboard |
| 3 | LiteLLM default cap max_budget: 50 in config | Server config | Any key without explicit budget capped at $50 |
| 4 | Callback decrements NeonDB balance but never syncs to LiteLLM | /api/litellm/callback | LiteLLM budget drifts and eventually blocks users |
| 5 | Subscription cancellation doesn't update LiteLLM key | /api/stripe/webhooks | Cancelled users keep LiteLLM access |
| 6 | LiteLLM Prisma client was stale | Server | Key generation broken for all users (fixed today) |

Why Ramtin Got Blocked

Ramtin's personal account (credit pack):

  1. Key created Dec 18 with max_budget: $99.84 (his balance at the time)
  2. He used MiniMax heavily — LiteLLM's key.spend accumulated to $99.93
  3. LiteLLM checked: 99.93 > 99.84 → blocked
  4. Meanwhile, our NeonDB showed $88.46 remaining (different number!)
  5. The two systems were never synced

The Fix

Key Insight: How to Sync Correctly

LiteLLM blocks when spend > max_budget. Since spend accumulates forever, we can't just set max_budget = balance. Example:

  • LiteLLM spend = $50, our balance = $20 remaining
  • If we set max_budget: 20 → LiteLLM sees 50 > 20 → blocked (wrong!)
  • If we set max_budget: 70 → LiteLLM sees 50 > 70 is false → allowed, $20 more to go (correct!)

Solution: On every sync, reset spend to 0 AND set max_budget to balanceCents / 100 (cents converted to dollars, floored at 0 when the balance is slightly negative).

This makes our DB the single source of truth. LiteLLM becomes a simple enforcement gate for the remaining balance.
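The example above can be checked with a short sketch. `KeyBudget`, `isOverBudget`, and `syncedBudget` are hypothetical helpers, the comparison mirrors the `spend > max_budget` rule as stated in this document, and the floor at 0 is an assumption taken from the test scenario where a -$0.10 balance syncs to max_budget=0.

```typescript
// Hypothetical helper types; values are in dollars, as LiteLLM stores them.
interface KeyBudget {
  spend: number;
  maxBudget: number;
}

// The gate as described in this document: block when spend > max_budget.
function isOverBudget(key: KeyBudget): boolean {
  return key.spend > key.maxBudget;
}

// Derive the synced key state from our balance (in cents).
// Assumption: a negative balance is floored to a budget of 0.
function syncedBudget(balanceCents: number): KeyBudget {
  return { spend: 0, maxBudget: Math.max(0, balanceCents) / 100 };
}

// Naive sync: max_budget = balance, spend left at its lifetime total.
const naive: KeyBudget = { spend: 50, maxBudget: 20 };

// Proposed sync: spend reset to 0, max_budget = remaining balance.
const synced: KeyBudget = syncedBudget(2000);
```

Here `isOverBudget(naive)` is true while `isOverBudget(synced)` is false, with $20 of headroom. One thing worth verifying against LiteLLM itself: with spend = 0 and max_budget = 0, a strict `>` would not block, so the expectation that an exhausted balance blocks the next request relies on LiteLLM treating spend equal to max_budget as over budget.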

Change 1: Fix Key Generation for Unlimited Users

Files: src/app/api/user/status/route.ts, src/app/api/keys/generate/route.ts

  • Detect isUnlimited before setting maxBudget
  • Unlimited users → maxBudget: null (no limit, balance is irrelevant)
  • Credit users → maxBudget: balance / 100
  • Dashboard route: allow Unlimited users with $0 balance
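A minimal sketch of the two decisions in this change, under stated assumptions: `TeamForKey`, `maxBudgetForNewKey`, and `keyGenerationStatus` are hypothetical names, and the sketch assumes credit users with no balance should still receive the existing 402.

```typescript
// Hypothetical shape; field names follow the document.
interface TeamForKey {
  isUnlimited: boolean;
  balanceCents: number;
}

// Budget to send to LiteLLM when creating a key (dollars, or null for no limit).
function maxBudgetForNewKey(team: TeamForKey): number | null {
  if (team.isUnlimited) {
    return null; // detect Unlimited BEFORE looking at the balance
  }
  return Math.max(0, team.balanceCents) / 100;
}

// HTTP status for the dashboard key-generation route.
// Assumption: credit users with no balance keep getting 402.
function keyGenerationStatus(team: TeamForKey): 200 | 402 {
  if (team.isUnlimited) {
    return 200; // a $0 balance must not reject Unlimited users
  }
  return team.balanceCents > 0 ? 200 : 402;
}
```

Checking `isUnlimited` first in both helpers is what prevents a $0 balance from turning into `maxBudget: 0` or a 402 for Unlimited users.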

Change 2: Sync LiteLLM on Every Balance Decrement

File: src/app/api/litellm/callback/route.ts

After decrementing team.balanceCents, call:

await litellm.updateKey({
  key: apiKey.litellmKeyId,
  // Floor at 0: a slightly negative balance must not become a negative budget
  maxBudget: Math.max(0, updatedTeam.balanceCents) / 100,
  spend: 0,  // Reset spend so max_budget = remaining balance
});

For Unlimited users on Fabric models: shouldDecrementBalance returns false → no decrement → no sync needed → max_budget stays null.
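Because the test strategy expects a failed sync to be logged without failing the callback, the call could be wrapped as a best-effort step. This is a sketch under assumptions: `KeyUpdater`, `syncKeyAfterDecrement`, and the injected `logError` are hypothetical, and the client is passed in so the function can be exercised without a LiteLLM server.

```typescript
// Minimal client surface this sketch needs; the real client lives in src/lib/litellm.ts.
interface KeyUpdater {
  updateKey(params: { key: string; maxBudget?: number | null; spend?: number }): Promise<unknown>;
}

// Best-effort sync: returns true on success, false if LiteLLM could not be updated.
// A failure is logged and swallowed so the callback still acknowledges the webhook.
async function syncKeyAfterDecrement(
  client: KeyUpdater,
  litellmKeyId: string,
  balanceCents: number,
  logError: (message: string, err: unknown) => void,
): Promise<boolean> {
  try {
    await client.updateKey({
      key: litellmKeyId,
      maxBudget: Math.max(0, balanceCents) / 100, // cents to dollars, floored at 0
      spend: 0, // reset so max_budget equals the remaining balance
    });
    return true;
  } catch (err) {
    logError("LiteLLM sync failed; the next callback will sync again", err);
    return false;
  }
}
```

Returning a boolean rather than throwing keeps the balance decrement authoritative: the NeonDB write has already happened, and the next successful sync overwrites the key with the then-current balance.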

Change 3: Sync LiteLLM on Stripe Events

File: src/app/api/stripe/webhooks/route.ts

| Event | Current | Fix |
|---|---|---|
| checkout (Unlimited) | Sets maxBudget: undefined | Set maxBudget: null explicitly, verify LiteLLM treats as "no limit" |
| checkout (Credit) | Syncs budget | Also reset spend: 0 |
| invoice.paid (Credit) | Syncs budget | Also reset spend: 0 |
| subscription.updated | No LiteLLM sync | Add sync when status changes |
| subscription.deleted (Unlimited) | No LiteLLM sync | Set maxBudget: 0 (or remaining credit balance) |
| subscription.deleted (Credit) | No LiteLLM sync | Set maxBudget: remaining balance / 100, reset spend |
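One way to keep these handlers consistent is to update the team first and then derive the key state from the team alone. The sketch below illustrates that idea; `BillingEvent` and its `kind` values are hypothetical shorthand for the events listed above (not Stripe's payload shapes), and `subscription.updated` is left out because the document does not specify which status changes trigger a sync.

```typescript
// Hypothetical shapes; not Stripe's actual event objects.
interface TeamBillingState {
  isUnlimited: boolean;
  balanceCents: number;
}

interface KeyUpdate {
  maxBudget: number | null; // dollars, or null for no limit
  spend: 0;
}

type BillingEvent =
  | { kind: "checkout.unlimited" }
  | { kind: "checkout.credit"; amountCents: number }
  | { kind: "invoice.paid.credit"; amountCents: number }
  | { kind: "subscription.deleted" };

// Step 1: apply the event to our own record (the source of truth).
function applyBillingEvent(team: TeamBillingState, event: BillingEvent): TeamBillingState {
  switch (event.kind) {
    case "checkout.unlimited":
      return { ...team, isUnlimited: true };
    case "checkout.credit":
    case "invoice.paid.credit":
      return { ...team, balanceCents: team.balanceCents + event.amountCents };
    case "subscription.deleted":
      return { ...team, isUnlimited: false };
  }
}

// Step 2: derive the LiteLLM key state purely from the updated team.
function keyUpdateFor(team: TeamBillingState): KeyUpdate {
  return {
    maxBudget: team.isUnlimited ? null : Math.max(0, team.balanceCents) / 100,
    spend: 0,
  };
}
```

With this split, a cancelled Unlimited team holding leftover credits falls back to its remaining balance, and one with no credits ends at a budget of 0, matching both options in the table.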

Change 4: Remove LiteLLM Default Cap

Server config: /home/farpoint/litellm_config.yaml on 69.30.85.97

Remove max_budget: 50 from key_generation_settings.

Change 5: Add spend Reset to LiteLLM Client

File: src/lib/litellm.ts

The updateKey method needs to support setting spend:

async updateKey(params: {
  key: string;
  maxBudget?: number | null;
  spend?: number;       // Add this
  rpmLimit?: number;
}): Promise<LiteLLMKey> {
  return this.request<LiteLLMKey>("/key/update", "POST", {
    key: params.key,
    max_budget: params.maxBudget,
    spend: params.spend,  // Add this
    rpm_limit: params.rpmLimit,
  });
}
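The difference between `maxBudget: undefined` and `maxBudget: null` matters once the body is serialized. The sketch below assumes the client's `request` helper serializes the body with `JSON.stringify`, which is not confirmed by this document; `buildUpdateBody` is a hypothetical stand-in for that step.

```typescript
// Stand-in for the body construction in updateKey, assuming JSON.stringify is used.
function buildUpdateBody(params: {
  key: string;
  maxBudget?: number | null;
  spend?: number;
  rpmLimit?: number;
}): string {
  return JSON.stringify({
    key: params.key,
    max_budget: params.maxBudget,
    spend: params.spend,
    rpm_limit: params.rpmLimit,
  });
}

// undefined fields are dropped entirely, so LiteLLM never sees max_budget.
const omitted = buildUpdateBody({ key: "k", maxBudget: undefined });

// null is sent explicitly, and spend: 0 survives because 0 is not undefined.
const explicit = buildUpdateBody({ key: "k", maxBudget: null, spend: 0 });
```

Under that assumption, `omitted` is `{"key":"k"}` and `explicit` is `{"key":"k","max_budget":null,"spend":0}`. Whether LiteLLM interprets an explicit null as "no limit" still needs to be verified against the server.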

Change 6: Migration Script

One-time script to fix all existing keys:

For each user with an API key:
  - If team.isUnlimited: set max_budget=null, spend=0
  - If credit pack: set max_budget=balance/100, spend=0
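The pseudocode above could be fleshed out roughly as follows. This is a sketch, not the actual script: `MigrationRow`, `KeyClient`, and `migrateKeys` are hypothetical, the rows are assumed to have been loaded from NeonDB beforehand, and failures are collected rather than aborting the run so the script can be re-run for the stragglers.

```typescript
// One row per user with an API key, assumed to be pre-loaded from NeonDB.
interface MigrationRow {
  litellmKeyId: string;
  isUnlimited: boolean;
  balanceCents: number;
}

// Minimal client surface; the real client lives in src/lib/litellm.ts.
interface KeyClient {
  updateKey(params: { key: string; maxBudget: number | null; spend: number }): Promise<unknown>;
}

// Reset every key to match our DB: no limit for Unlimited, remaining balance otherwise.
async function migrateKeys(
  rows: MigrationRow[],
  client: KeyClient,
): Promise<{ updated: number; failed: string[] }> {
  const failed: string[] = [];
  let updated = 0;
  for (const row of rows) {
    const maxBudget = row.isUnlimited ? null : Math.max(0, row.balanceCents) / 100;
    try {
      await client.updateKey({ key: row.litellmKeyId, maxBudget, spend: 0 });
      updated += 1;
    } catch {
      failed.push(row.litellmKeyId); // keep going; report at the end
    }
  }
  return { updated, failed };
}
```

Keys are updated sequentially to keep load on the LiteLLM server predictable; the returned `failed` list makes a second pass straightforward.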

State Transitions

[Diagram: State Transitions]

Files to Modify

| File | Changes |
|---|---|
| src/app/api/user/status/route.ts | Unlimited-aware key generation |
| src/app/api/keys/generate/route.ts | Allow Unlimited users, fix maxBudget |
| src/app/api/litellm/callback/route.ts | Sync LiteLLM key after balance decrement |
| src/app/api/stripe/webhooks/route.ts | Add missing LiteLLM syncs, reset spend on all syncs |
| src/lib/litellm.ts | Add spend param to updateKey |
| Server: litellm_config.yaml | Remove default max_budget: 50 |

Test Strategy

| Scenario | Expected |
|---|---|
| Unlimited user, first login, $0 balance | Key created with max_budget: null; all Fabric models work |
| Unlimited user generates key from dashboard | Works (no 402 error) |
| Credit user, $10 balance, uses $3 | Callback: balance → $7, LiteLLM synced: spend=0, max_budget=7 |
| Credit user, $0.50 balance, response costs $0.60 | Response completes, balance → -$0.10, LiteLLM synced: spend=0, max_budget=0 → next request blocked |
| Stripe renewal adds $10 to credit user | Balance → $10, LiteLLM synced: spend=0, max_budget=10 |
| Unlimited user cancels | isUnlimited=false, LiteLLM: max_budget=0 (or remaining credits) |
| LiteLLM sync fails in callback | Log error, continue; next sync will catch up |

Risks

| Risk | Likelihood | Mitigation |
|---|---|---|
| LiteLLM API call fails in callback | Low | Log failure; next callback will sync again |
| Resetting spend: 0 on every callback is noisy | Medium | LiteLLM handles it; spend logs are separate from the key counter |
| Concurrent requests race between syncs | Low | Small overspend acceptable (sub-cent per request) |
| LiteLLM config default overrides null | Low | Test explicitly after removing config default |