This page covers `datasource: "modelMetrics"`. For querying MCP server / tool metrics, see API Access to MCP Metrics.
Access control
- Tenant admins: Can query metrics for the entire organization (tenant-wide).
- Users: Can query their own data and their teams’ data.
- Virtual accounts: Can query their own data and their teams’ data; with tenant-admin permissions, they can access tenant-wide data.
Contents
| Section | Description |
|---|---|
| Overview | Authentication, quick start, and API reference |
| Filtering | Filter operators, fields, and combinations |
| Distribution examples | Aggregated (distribution) query examples |
| Timeseries examples | Time-bucketed (timeseries) query examples |
| Response format | Response JSON structure |
Authentication
Authenticate with your TrueFoundry API key, which can be either a Personal Access Token (PAT) or a Virtual Account Token (VAT).
Get your API key
To generate an API key:
- Personal Access Token (PAT): Go to Access → Personal Access Tokens in your TrueFoundry dashboard
- Virtual Account Token (VAT): Go to Access → Virtual Account Tokens (requires admin permissions)
Quick Start
Distribution Query
Get an aggregated model metrics distribution with multiple aggregations, including count, sum, and percentiles.
Timeseries Query
Get model metrics over time at hourly intervals, including latency percentiles.
API Reference
Endpoint
Request Parameters
ISO 8601 timestamp for the start of the data range (e.g., `"2025-01-21T00:00:00.000Z"`).
ISO 8601 timestamp for the end of the data range (e.g., `"2025-01-22T00:00:00.000Z"`).
The data source to query. Use `"modelMetrics"` for gateway model metrics.
The type of query to execute:
- `"distribution"` - Returns aggregated metrics
- `"timeseries"` - Returns metrics over time intervals
Array of aggregation objects. Each aggregation specifies:
- `type` - The aggregation type
- `column` - The column to aggregate on

| Type | Description |
|---|---|
| `count` | Count of records |
| `sum` | Sum of values |
| `p50` | 50th percentile (median) |
| `p75` | 75th percentile |
| `p90` | 90th percentile |
| `p99` | 99th percentile |

Supported columns for aggregation:
- `costInUSD` - Cost incurred in USD
- `inputTokens` - Number of input tokens
- `outputTokens` - Number of output tokens
- `latencyMs` - Total request latency (ms)
- `interTokenLatencyMs` - Latency between the generation of consecutive tokens (ms)
- `timeToFirstTokenMs` - Time to first token (ms)
- `timePerOutputTokenLatencyMs` - Latency per output token (ms)
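As a minimal sketch, the aggregation objects described above could be assembled and checked client-side before sending a query. The `aggregation` helper and the two constant sets are hypothetical names introduced for illustration. Only the `type` and `column` keys and the listed values come from this page, and the API itself may accept or reject combinations differently.

```python
import json

# Values copied from the tables above; the set names are illustrative only.
SUPPORTED_TYPES = {"count", "sum", "p50", "p75", "p90", "p99"}
SUPPORTED_COLUMNS = {
    "costInUSD",
    "inputTokens",
    "outputTokens",
    "latencyMs",
    "interTokenLatencyMs",
    "timeToFirstTokenMs",
    "timePerOutputTokenLatencyMs",
}


def aggregation(agg_type: str, column: str) -> dict:
    """Build one aggregation object (hypothetical helper, client-side check only)."""
    if agg_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported aggregation type: {agg_type!r}")
    if column not in SUPPORTED_COLUMNS:
        raise ValueError(f"unsupported aggregation column: {column!r}")
    return {"type": agg_type, "column": column}


# Total cost plus median and tail latency.
aggregations = [
    aggregation("sum", "costInUSD"),
    aggregation("p50", "latencyMs"),
    aggregation("p99", "latencyMs"),
]

print(json.dumps(aggregations, indent=2))
```

Validating locally is a design choice, not a requirement: it surfaces a typo such as `latencyMS` before a request is made.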
Array of fields to group the metrics by. Available options:
- `modelName` - Group by model name
- `userEmail` - Group by user email
- `virtualaccount` - Group by virtual account
- `team` - Group by team (unnests the Teams array)
- `virtualModel` - Group by virtual model
- `errorCode` - HTTP error code returned
- `requestType` - Type of model request (e.g. `ChatCompletion`, `Embedding`, etc.)
- `providerAccountType` - Account type of provider (e.g. `model`, `mcp-server`, `guardrail-config`)
- `providerModelName` - Underlying provider model name
- `createdBySubjectType` - Subject type (e.g. `user`, `virtualaccount`)
- `metadata.<key>` - Group by a custom metadata key (e.g., `metadata.environment`)
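The group-by options combine a fixed set of field names with the open-ended `metadata.<key>` form. A small sketch of how a client might check a field list before querying follows. `is_valid_group_by` and `GROUP_BY_FIELDS` are hypothetical names, and the rule that a metadata key must be non-empty is an assumption rather than documented behavior.

```python
# Fixed field names copied from the list above; the set name is illustrative.
GROUP_BY_FIELDS = {
    "modelName",
    "userEmail",
    "virtualaccount",
    "team",
    "virtualModel",
    "errorCode",
    "requestType",
    "providerAccountType",
    "providerModelName",
    "createdBySubjectType",
}

METADATA_PREFIX = "metadata."


def is_valid_group_by(field: str) -> bool:
    """Return True for a listed field or a `metadata.<key>` with a non-empty key."""
    if field in GROUP_BY_FIELDS:
        return True
    # Assumption: an empty key ("metadata.") is not meaningful.
    return field.startswith(METADATA_PREFIX) and len(field) > len(METADATA_PREFIX)


group_by = ["modelName", "team", "metadata.environment"]
invalid = [f for f in group_by if not is_valid_group_by(f)]
print("invalid fields:", invalid)
```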
Required for timeseries queries. The time interval, in seconds, for grouping data points. Common values:
- `60` - 1-minute intervals
- `300` - 5-minute intervals
- `1800` - 30-minute intervals
- `3600` - 1-hour intervals
- `86400` - 1-day intervals
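To get a feel for how the interval interacts with the time range, the sketch below formats a start and end time in the ISO 8601 style used on this page and estimates how many time buckets each common interval would produce. `to_iso_millis` and `expected_buckets` are hypothetical helpers. The page does not state how the API aligns bucket boundaries, so treat the count as an estimate.

```python
import math
from datetime import datetime, timedelta, timezone


def to_iso_millis(dt: datetime) -> str:
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a 'Z' suffix."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def expected_buckets(start: datetime, end: datetime, interval_seconds: int) -> int:
    """Estimate the number of time buckets covering [start, end)."""
    if interval_seconds <= 0:
        raise ValueError("interval must be a positive number of seconds")
    span = (end - start).total_seconds()
    if span < 0:
        raise ValueError("end must not be before start")
    return math.ceil(span / interval_seconds)


start = datetime(2025, 1, 21, tzinfo=timezone.utc)
end = start + timedelta(days=1)

print(to_iso_millis(start))  # 2025-01-21T00:00:00.000Z
print(to_iso_millis(end))    # 2025-01-22T00:00:00.000Z

# One day of data at each of the common intervals listed above.
for interval in (60, 300, 1800, 3600, 86400):
    print(interval, expected_buckets(start, end, interval))
```

For a one-day range, this gives 1440 buckets at 60 seconds but only 24 at 3600 seconds, so a larger interval is usually the better fit for wide ranges.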