How Budget Limiting Works
Budget limiting consists of an ordered list of rules. Each rule defines which requests it applies to and how much they can spend. When a request comes in, the gateway evaluates it against the rules from top to bottom. Two things happen during evaluation:
- Budget tracking for all matching rules. If a request matches multiple rules, the cost is counted against every matching rule.
- The first matching rule controls allow/block. The first rule whose conditions match the request decides whether it goes through or is rejected.
Key distinction: The allow/block decision comes from the first matching rule, but budget tracking happens for every matching rule. This is what makes layered budget controls possible.
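The two behaviors described above can be sketched in a few lines of Python. This is an illustrative model only, not the gateway's implementation: the `Rule` class, its `matches` predicate, the `evaluate` function, and the rule IDs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    rule_id: str
    # Hypothetical predicate standing in for the rule's filters.
    matches: Callable[[dict], bool]


def evaluate(rules: list[Rule], request: dict) -> tuple[Optional[str], list[str]]:
    """Return (ID of the rule that decides allow/block, IDs of all rules charged)."""
    # Rules are checked top to bottom, so list order is priority order.
    matching = [r.rule_id for r in rules if r.matches(request)]
    deciding = matching[0] if matching else None
    return deciding, matching


rules = [
    Rule("team-override", lambda req: req.get("team") == "ml-engineering"),
    Rule("default", lambda req: True),  # no filters: matches every request
]

deciding, tracked = evaluate(rules, {"user": "alice", "team": "ml-engineering"})
```

Here `deciding` is the single rule that would control allow/block, while `tracked` lists every rule whose budget would be charged for the request.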
Why Rule Order Matters
Because the first matching rule controls the allow/block decision, the order of rules determines priority. Place higher-priority rules (overrides, exceptions) above lower-priority rules (general defaults).
Example: You want every developer to have a $10/day budget, but the ML team should get $100/day. Place the ML team rule above the default rule. When an ML engineer makes a request, the $100 limit applies (first match). When any other developer makes a request, the $10 limit applies. In both cases, the cost is tracked against all matching rules.
Setting Up Budget Rules
To configure budget limiting, navigate to AI Gateway → Policies → Budget Limiting in the TrueFoundry dashboard. Click Add New Budget Limiting Rule to create a rule. The form has the following fields:
Rule ID
A unique identifier for the rule. This is used in logs, metrics, and API responses to identify which rule acted on a request. Choose a descriptive name like per-user-daily or ml-team-budget.
When Request Comes To (Filters)
Defines which requests this rule applies to. You can filter by one or more of the following. All selected filters use AND logic: a request must match all filters to be matched by the rule.
| Filter | Description | Example |
|---|---|---|
| Subjects | Users, teams, or virtual accounts | user:alice@example.com, team:engineering, virtualaccount:acct_123 |
| Models | Specific model names | openai-main/gpt-4, anthropic-main/claude-3 |
| Metadata | Custom key-value pairs sent via the X-TFY-METADATA header | environment: production, project_id: proj-123 |
If you leave all filters empty (no subjects, no models, no metadata), the rule matches every request. This is useful for setting default budgets that apply to everyone.
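The AND logic across filters, and the match-everything behavior of an empty rule, can be modeled roughly as follows. This is a sketch under stated assumptions: `rule_matches` and the shape of the `request` dict are hypothetical, the treatment of multiple values inside one filter (any listed value suffices) is a guess, and the JSON encoding of the X-TFY-METADATA header value is assumed rather than confirmed here.

```python
import json


def rule_matches(when: dict, request: dict) -> bool:
    """AND across filter types; an empty or missing filter adds no constraint."""
    subjects = when.get("subjects") or []
    models = when.get("models") or []
    metadata = when.get("metadata") or {}

    # Assumption: within a single filter, any listed value is enough.
    subject_ok = not subjects or any(s in request["subjects"] for s in subjects)
    model_ok = not models or request["model"] in models
    # Assumption: every listed metadata pair must be present on the request.
    metadata_ok = all(request["metadata"].get(k) == v for k, v in metadata.items())
    return subject_ok and model_ok and metadata_ok


# Assumption: the header carries a JSON-encoded object of key-value pairs.
headers = {
    "X-TFY-METADATA": json.dumps({"environment": "production", "project_id": "proj-123"})
}

request = {
    "subjects": ["user:alice@example.com", "team:engineering"],
    "model": "openai-main/gpt-4",
    "metadata": json.loads(headers["X-TFY-METADATA"]),
}
```

With this model, `rule_matches({}, request)` is true for any request, which mirrors how a rule with no filters acts as a default.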
Budget
Set the spending limit and time period:
- Budget ($): The dollar amount for the budget limit.
- Limit Unit: The time period over which the budget applies. Choose from:
  - Cost per day: resets at UTC midnight
  - Cost per week: resets on Monday at UTC midnight
  - Cost per month: resets on the 1st of each month at UTC midnight
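The reset boundaries listed above can be computed with the standard library. The sketch below is only an illustration of the stated rules; `period_start` is a hypothetical helper, and the unit strings are borrowed from the YAML field values (`cost_per_day`, `cost_per_week`, `cost_per_month`).

```python
from datetime import datetime, timedelta, timezone


def period_start(unit: str, now: datetime) -> datetime:
    """Return the start of the current budget period in UTC.

    `now` must be timezone-aware.
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "cost_per_day":
        return midnight
    if unit == "cost_per_week":
        # weekday() is 0 on Monday, so this steps back to Monday 00:00 UTC.
        return midnight - timedelta(days=midnight.weekday())
    if unit == "cost_per_month":
        return midnight.replace(day=1)
    raise ValueError(f"unknown unit: {unit}")


now = datetime(2024, 5, 15, 13, 30, tzinfo=timezone.utc)  # a Wednesday
week_start = period_start("cost_per_week", now)
```

Converting to UTC first matters for callers in other time zones: a request made late on Sunday evening in a zone behind UTC may already fall in the new week's budget period.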
Apply Budget Per (Optional)
By default, a single budget is shared across all requests matching the rule. For example, a $100/day rule with a team:engineering filter means the entire team shares a single $100 pool.
To create separate budgets for each individual within the matching group, use the “Apply budget per” option. Available values:
| Value | Effect |
|---|---|
| User | Each user gets their own budget (e.g., Alice has $100/day, Bob has a separate $100/day) |
| Model | Each model gets its own budget |
| Virtual Account | Each virtual account gets its own budget |
| Metadata key | Each unique value of a metadata key gets its own budget (e.g., per project_id) |
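One way to picture the difference between a shared budget and a per-entity budget is as the key under which spending is accumulated. The following is a minimal sketch, not the gateway's data model: `budget_bucket` and the request fields are hypothetical, and the dimension names follow the `budget_applies_per` values (`user`, `model`, `virtualaccount`, `metadata.<key>`).

```python
def budget_bucket(rule_id: str, applies_per: list[str], request: dict) -> tuple:
    """Build the key under which a request's cost is accumulated for one rule."""
    key = [rule_id]
    for dim in applies_per:
        if dim.startswith("metadata."):
            # e.g. "metadata.project_id" reads request["metadata"]["project_id"]
            key.append(request["metadata"].get(dim[len("metadata."):]))
        else:
            key.append(request.get(dim))
    return tuple(key)


alice = {"user": "alice", "model": "openai-main/gpt-4", "metadata": {"project_id": "proj-123"}}
bob = {"user": "bob", "model": "openai-main/gpt-4", "metadata": {"project_id": "proj-123"}}

shared = budget_bucket("team-budget", [], alice)        # one pool for everyone
per_user = budget_bucket("team-budget", ["user"], alice)  # a pool per user
```

With no "Apply budget per" value, Alice and Bob map to the same bucket and share the pool; with `user`, their buckets differ and each gets a separate budget.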
Block If Usage Limit Exceeded
Controls the enforcement behavior:
- ON (default): Requests are blocked when the budget is exceeded.
- OFF (audit mode): Requests go through even when over budget. Budget is still tracked and alerts are still sent.
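The effect of the toggle can be summarized as a small decision function. This is a hedged sketch: `enforce` is a hypothetical name, the parameter borrows the `block_on_budget_exceed` field name from the YAML export, and treating usage equal to the limit as "exceeded" is an assumption.

```python
def enforce(usage: float, limit: float, block_on_budget_exceed: bool = True) -> dict:
    """Model the ON (blocking) and OFF (audit) modes for a single rule."""
    # Assumption: reaching the limit exactly counts as exceeding it.
    over_budget = usage >= limit
    return {
        "over_budget": over_budget,  # still recorded in audit mode
        "allowed": not (over_budget and block_on_budget_exceed),
    }


blocking = enforce(usage=120.0, limit=100.0)
audit = enforce(usage=120.0, limit=100.0, block_on_budget_exceed=False)
```

In both modes the over-budget state is known; only the `allowed` outcome differs.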

Send Alerts On Budget Milestones
Configure notifications when budget usage crosses specified thresholds. Select the percentage thresholds (75%, 90%, 95%, 100%) and choose a notification channel (email, Slack webhook, or Slack bot).
Alert configuration details
Available thresholds: 75%, 90%, 95%, 100%
Each threshold triggers once per budget period. When a new period starts (day/week/month), alerts reset and can be sent again. Alerts are checked every 20 minutes.
Notification channels:
- Email: Send alerts to one or more email addresses via a configured email notification channel
- Slack Webhook: Send alerts to a Slack channel via a webhook notification channel
- Slack Bot: Send alerts to specific Slack channels via a bot notification channel
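The once-per-period behavior can be modeled with a set of thresholds that have already fired. The sketch below is illustrative only: `alerts_due` is a hypothetical function, and it assumes that every newly crossed threshold is reported in the same periodic check and that reaching a threshold exactly counts as crossing it.

```python
THRESHOLDS = (75, 90, 95, 100)


def alerts_due(usage: float, limit: float, enabled: set[int], already_sent: set[int]) -> list[int]:
    """Return enabled thresholds crossed so far that have not fired this period."""
    pct = 100 * usage / limit
    return [t for t in THRESHOLDS if t in enabled and t not in already_sent and pct >= t]


enabled = {75, 90, 100}
sent: set[int] = set()

first_check = alerts_due(92.0, 100.0, enabled, sent)   # crosses 75 and 90
sent.update(first_check)
second_check = alerts_due(96.0, 100.0, enabled, sent)  # 95 is not enabled
sent.clear()  # a new budget period starts, so alerts can fire again
```

Because checks run on an interval rather than per request, a burst of spending between two checks could cross several thresholds at once, which is why the function returns a list.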

Viewing Budget Usage
You can monitor budget usage directly on the budget configuration page. Each rule card displays:
- Current usage amount and percentage
- Budget limit and remaining budget
- Period start time (when the current budget period began)

Practical Examples
Per-developer budgets with team overrides
Give every developer a $10/day budget, but allow the ML team $100/day. Place the override rule above the default.
How it works:
| Order | Rule ID | Filter | Budget | Per |
|---|---|---|---|---|
| 1 | ml-team-budget | Subjects: team:ml-engineering | $100/day | User |
| 2 | default-dev-budget | (no filter — matches all) | $10/day | User |
- ML team member → matched by rule 1 (first match, $100 limit applies). Budget is also tracked against rule 2.
- Any other developer → rule 1 doesn’t match, rule 2 matches ($10 limit applies).
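To make the bookkeeping for this example concrete, the sketch below charges two requests against the two rules. It is a simplified model rather than the gateway's logic: the `RULES` layout, the `charge` function, and the user and team names are hypothetical.

```python
from collections import defaultdict

# Rules in dashboard order; field names are illustrative, not the gateway's schema.
RULES = [
    {"id": "ml-team-budget", "team": "ml-engineering", "limit": 100.0},
    {"id": "default-dev-budget", "team": None, "limit": 10.0},  # no filter: matches all
]

# (rule ID, user) -> dollars spent in the current day, since both rules are per user.
usage = defaultdict(float)


def charge(user: str, team: str, cost: float) -> float:
    """Record cost against every matching rule; return the first match's limit."""
    matching = [r for r in RULES if r["team"] in (None, team)]
    for rule in matching:
        usage[(rule["id"], user)] += cost
    return matching[0]["limit"]


ml_limit = charge("alice", "ml-engineering", 2.50)
dev_limit = charge("bob", "platform", 1.00)
```

After these two calls, Alice's $2.50 appears under both rules while the $100 limit governs her requests, and Bob's $1.00 appears only under default-dev-budget with the $10 limit.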
Model-level cap with per-user limits
Cap total GPT-4 spending at $500/month, while giving each user a $10/day limit.
How it works:
| Order | Rule ID | Filter | Budget | Per |
|---|---|---|---|---|
| 1 | per-user-daily | (no filter) | $10/day | User |
| 2 | gpt4-monthly-cap | Models: openai-main/gpt-4 | $500/month | (shared) |
- A user calls GPT-4 → cost is tracked against both the per-user budget and the model-wide budget. The per-user rule controls allow/block.
- The model-wide cap acts as a safety net — even if individual users are within their limits, total GPT-4 spending is capped at $500/month.
Virtual account budgets
Set spending limits per virtual account (useful when multiple teams or applications share the gateway).
Each virtual account gets an independent $1000/week budget, tracked separately.
| Order | Rule ID | Filter | Budget | Per |
|---|---|---|---|---|
| 1 | va-weekly-budget | (no filter) | $1000/week | Virtual Account |
Project-based budgets using metadata
Track spending per project by using metadata sent in the X-TFY-METADATA header.
Each unique project_id value gets its own $100/day budget. Requests must include the X-TFY-METADATA header with a project_id key.
| Order | Rule ID | Filter | Budget | Per |
|---|---|---|---|---|
| 1 | project-daily-budget | (no filter) | $100/day | metadata.project_id |
YAML Configuration
Budget rules configured via the UI can be exported as YAML. This is useful for version control, programmatic management, or copying configurations across environments.
YAML structure reference
| Field | Description |
|---|---|
| id | Unique rule identifier |
| when.subjects | List of users, teams, or virtual accounts to match |
| when.models | List of model names to match |
| when.metadata | Key-value pairs to match against request metadata |
| limit_to | Budget amount in dollars |
| unit | cost_per_day, cost_per_week, or cost_per_month |
| budget_applies_per | Optional. ['user'], ['model'], ['virtualaccount'], or ['metadata.<key>'] |
| block_on_budget_exceed | true (enforcement) or false (audit mode). Defaults to true |
| alerts.thresholds | List of percentage thresholds: 75, 90, 95, 100 |
| alerts.notification_target | Notification channel configuration (email, slack-webhook, or slack-bot) |
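When rules are kept in version control, a quick pre-commit check against the field reference above can catch typos. The sketch below assumes the exported YAML has already been parsed into a Python dict, that dotted names such as `when.subjects` denote nested keys, and that `limit_to` must be positive; `validate_rule` is a hypothetical helper, not part of any TrueFoundry tooling.

```python
ALLOWED_UNITS = {"cost_per_day", "cost_per_week", "cost_per_month"}
ALLOWED_THRESHOLDS = {75, 90, 95, 100}
ALLOWED_PER = {"user", "model", "virtualaccount"}


def validate_rule(rule: dict) -> list[str]:
    """Return a list of problems found in one parsed rule (empty if none)."""
    errors = []
    if not rule.get("id"):
        errors.append("id is required")
    limit = rule.get("limit_to")
    # Assumption: a budget must be a positive dollar amount.
    if not isinstance(limit, (int, float)) or limit <= 0:
        errors.append("limit_to must be a positive number")
    if rule.get("unit") not in ALLOWED_UNITS:
        errors.append("unit must be one of " + ", ".join(sorted(ALLOWED_UNITS)))
    for dim in rule.get("budget_applies_per", []):
        if dim not in ALLOWED_PER and not dim.startswith("metadata."):
            errors.append(f"unknown budget_applies_per value: {dim}")
    extra = set(rule.get("alerts", {}).get("thresholds", [])) - ALLOWED_THRESHOLDS
    if extra:
        errors.append(f"unsupported alert thresholds: {sorted(extra)}")
    return errors


rule = {
    "id": "ml-team-budget",
    "when": {"subjects": ["team:ml-engineering"]},
    "limit_to": 100,
    "unit": "cost_per_day",
    "budget_applies_per": ["user"],
    "block_on_budget_exceed": True,
    "alerts": {"thresholds": [75, 100]},
}
problems = validate_rule(rule)
```

An empty `problems` list means the rule only uses values named in the reference table; it does not guarantee the gateway will accept the configuration.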
Example: Layered budget config
Example: Budgets with alerts
Example: Comprehensive multi-rule config