Request Logs: Span Attributes - TrueFoundry Docs

Understanding Span Attributes

Each span you query from the LLM Gateway carries attributes that describe the request, the model that served it, and any MCP server or guardrail involved. Knowing what each attribute means helps you analyze and debug usage.

Core Span Attributes

Attribute	Description
`tfy.span_type`	Type of span. Possible values: `"ChatCompletion"` (complete chat request lifecycle), `"Completion"` (text completion requests without chat context), `"MCP"` (Model Context Protocol server interactions and tool calls), `"Rerank"` (document reranking operations for search relevance), `"Embedding"` (vector embedding generation operations), `"Model"` (actual LLM model inference processing), `"AgentResponse"` (multi-tool agent orchestration workflows), `"Guardrail"` (safety, compliance, and content validation checks)
`tfy.input`	Complete input data sent to the model, MCP server, guardrail, etc.
`tfy.output`	Complete output response from the model, MCP server, guardrail, etc.
`tfy.input_short_hand`	Abbreviated version of the input for display purposes
`tfy.error_message`	Error message if the request failed
`tfy.prompt_version_fqn`	Fully qualified name (FQN) of the prompt version used (if applicable)
`tfy.prompt_variables`	Variables used in prompt templating
`tfy.triggered_guardrail_fqns`	List of guardrails that were triggered during the request
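
As a rough illustration of how the core attributes can be used once spans have been fetched, the sketch below counts spans by `tfy.span_type` and picks out those that carry a `tfy.error_message`. It assumes each span is available as a flat Python dict keyed by attribute name; that shape, the sample values, and the helper names are assumptions made for this example, not the gateway's actual response format.

```python
from collections import Counter

# Hypothetical sample data: each span is assumed to be a flat dict of attributes.
spans = [
    {"tfy.span_type": "ChatCompletion", "tfy.input_short_hand": "Summarize this"},
    {"tfy.span_type": "Model", "tfy.error_message": "upstream timeout"},
    {"tfy.span_type": "Guardrail"},
    {"tfy.span_type": "Model"},
]


def count_by_span_type(spans):
    """Count spans per tfy.span_type; spans without the attribute count as 'unknown'."""
    return Counter(span.get("tfy.span_type", "unknown") for span in spans)


def failed_spans(spans):
    """Return spans that carry a non-empty tfy.error_message."""
    return [span for span in spans if span.get("tfy.error_message")]


type_counts = count_by_span_type(spans)
errors = failed_spans(spans)

print(dict(type_counts))
print([span["tfy.error_message"] for span in errors])
```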

Request Context Attributes

Attribute	Description
`tfy.request.model_name`	Name of the model that was requested
`tfy.request.created_by_subject`	Subject (user/service account) that made the request
`tfy.request.created_by_subject_teams`	Teams associated with the requesting subject
`tfy.request.metadata`	Additional metadata associated with the request (e.g., `{'foo': 'bar'}`)
`tfy.request.conversation_id`	Unique identifier for the conversation (if part of a chat)
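
The request context attributes lend themselves to slicing logs by caller or by custom metadata. The following sketch filters spans whose `tfy.request.metadata` contains a given key/value pair and then groups the matches by `tfy.request.conversation_id`. It assumes spans are flat dicts and that the metadata has already been parsed into a dict; the subject strings, conversation IDs, and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample spans; real spans would come from your own query.
request_spans = [
    {
        "tfy.request.model_name": "model-a",
        "tfy.request.created_by_subject": "user:alice",
        "tfy.request.metadata": {"foo": "bar"},
        "tfy.request.conversation_id": "conv-1",
    },
    {
        "tfy.request.model_name": "model-a",
        "tfy.request.created_by_subject": "user:bob",
        "tfy.request.metadata": {"foo": "baz"},
        "tfy.request.conversation_id": "conv-2",
    },
    {
        "tfy.request.model_name": "model-b",
        "tfy.request.created_by_subject": "user:alice",
        "tfy.request.metadata": {"foo": "bar", "env": "dev"},
        "tfy.request.conversation_id": "conv-1",
    },
    # A span with no metadata and no conversation ID.
    {"tfy.request.model_name": "model-b", "tfy.request.created_by_subject": "user:alice"},
]


def matches_metadata(span, required):
    """True if every key/value pair in `required` appears in the span's metadata."""
    metadata = span.get("tfy.request.metadata") or {}
    return all(metadata.get(key) == value for key, value in required.items())


def group_by_conversation(spans):
    """Group spans by conversation ID, skipping spans that have none."""
    grouped = defaultdict(list)
    for span in spans:
        conversation_id = span.get("tfy.request.conversation_id")
        if conversation_id is not None:
            grouped[conversation_id].append(span)
    return dict(grouped)


foo_bar_spans = [s for s in request_spans if matches_metadata(s, {"foo": "bar"})]
conversations = group_by_conversation(foo_bar_spans)
print({cid: len(items) for cid, items in conversations.items()})
```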

Model Attributes

Attribute	Description
`tfy.model.id`	Unique identifier of the model
`tfy.model.name`	Display name of the model
`tfy.model.fqn`	Fully qualified name of the model
`tfy.model.request_url`	URL endpoint used for the model request
`tfy.model.streaming`	Whether the request used streaming mode
`tfy.model.request_type`	Type of request (e.g., `"ChatCompletion"`, `"Completion"`, `"Embedding"`, `"Rerank"`, `"AgentResponse"`, `"MCPGateway"`, `"CreateModelResponse"`)
`tfy.model.metric.cache_read_input_tokens`	Number of input tokens served from the cache, billed at a lower cache read rate instead of the standard input rate
`tfy.model.metric.cache_creation_input_tokens`	Number of input tokens written to the cache, billed at a higher cache write rate to cover the cost of storage

Model Performance Metrics

Attribute	Description
`tfy.model.metric.time_to_first_token_in_ms`	Time taken to receive the first token, in milliseconds (streaming requests)
`tfy.model.metric.latency_in_ms`	Total request latency in milliseconds
`tfy.model.metric.input_tokens`	Number of tokens in the model input
`tfy.model.metric.output_tokens`	Number of tokens in the model output
`tfy.model.metric.cost_in_usd`	Cost of the request in USD
`tfy.model.metric.inter_token_latency_in_ms`	Average latency between tokens, in milliseconds (streaming requests)
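
One common use of these metrics is a per-model summary of tokens, cost, and latency. The sketch below shows one way to compute it, assuming spans are flat dicts and the metric values are numeric; the model names, sample numbers, and the `summarize_by_model` helper are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical sample spans of type "Model" with numeric metric values.
model_spans = [
    {
        "tfy.model.name": "model-a",
        "tfy.model.metric.input_tokens": 100,
        "tfy.model.metric.output_tokens": 50,
        "tfy.model.metric.cost_in_usd": 0.002,
        "tfy.model.metric.latency_in_ms": 1000,
    },
    {
        "tfy.model.name": "model-a",
        "tfy.model.metric.input_tokens": 200,
        "tfy.model.metric.output_tokens": 150,
        "tfy.model.metric.cost_in_usd": 0.004,
        "tfy.model.metric.latency_in_ms": 3000,
    },
    {
        "tfy.model.name": "model-b",
        "tfy.model.metric.input_tokens": 10,
        "tfy.model.metric.output_tokens": 5,
        "tfy.model.metric.cost_in_usd": 0.001,
        "tfy.model.metric.latency_in_ms": 500,
    },
]


def summarize_by_model(spans):
    """Aggregate request count, tokens, cost, and average latency per model name."""
    totals = defaultdict(
        lambda: {
            "requests": 0,
            "input_tokens": 0,
            "output_tokens": 0,
            "cost_in_usd": 0.0,
            "latency_in_ms": 0.0,
        }
    )
    for span in spans:
        entry = totals[span.get("tfy.model.name", "unknown")]
        entry["requests"] += 1
        entry["input_tokens"] += span.get("tfy.model.metric.input_tokens", 0)
        entry["output_tokens"] += span.get("tfy.model.metric.output_tokens", 0)
        entry["cost_in_usd"] += span.get("tfy.model.metric.cost_in_usd", 0.0)
        entry["latency_in_ms"] += span.get("tfy.model.metric.latency_in_ms", 0.0)

    summary = {}
    for name, entry in totals.items():
        summary[name] = {
            "requests": entry["requests"],
            "input_tokens": entry["input_tokens"],
            "output_tokens": entry["output_tokens"],
            "cost_in_usd": entry["cost_in_usd"],
            "avg_latency_in_ms": entry["latency_in_ms"] / entry["requests"],
        }
    return summary


model_summary = summarize_by_model(model_spans)
for name, stats in model_summary.items():
    print(name, stats)
```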

Load Balancing Attributes

Attribute	Description
`applied_loadbalance_rule_ids`	IDs of load balancing rules that were applied (e.g., `['gpt-4-dev-load']`)

Budget Control Attributes

Attribute	Description
`applied_budget_rule_ids`	IDs of budget rules that were applied to this request (e.g., `['virtualaccount1-monthly-budget']`)

Rate Limiting Attributes

Attribute	Description
`applied_ratelimit_rule_ids`	IDs of all rate limiting rules that were applied (e.g., `['virtualaccount1-daily-ratelimit']`)
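
To see which load balancing, budget, and rate limiting rules fire most often, the three `applied_*_rule_ids` attributes can be tallied together. This is a minimal sketch assuming spans are flat dicts and each attribute, when present, holds a list of rule IDs; the rule IDs are taken from the examples in the tables above, and the helper name is hypothetical.

```python
from collections import Counter

RULE_ATTRIBUTES = (
    "applied_loadbalance_rule_ids",
    "applied_budget_rule_ids",
    "applied_ratelimit_rule_ids",
)

# Hypothetical sample spans; a span may carry any subset of the rule attributes.
rule_spans = [
    {
        "applied_loadbalance_rule_ids": ["gpt-4-dev-load"],
        "applied_ratelimit_rule_ids": ["virtualaccount1-daily-ratelimit"],
    },
    {
        "applied_budget_rule_ids": ["virtualaccount1-monthly-budget"],
        "applied_ratelimit_rule_ids": ["virtualaccount1-daily-ratelimit"],
    },
    {},
]


def count_applied_rules(spans):
    """Return, for each rule attribute, a Counter of how often each rule ID was applied."""
    counts = {attribute: Counter() for attribute in RULE_ATTRIBUTES}
    for span in spans:
        for attribute in RULE_ATTRIBUTES:
            counts[attribute].update(span.get(attribute) or [])
    return counts


rule_counts = count_applied_rules(rule_spans)
for attribute, counter in rule_counts.items():
    print(attribute, dict(counter))
```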

MCP (Model Context Protocol) Server Attributes

Attribute	Description
`tfy.mcp_server.id`	Unique identifier of the MCP server
`tfy.mcp_server.name`	Display name of the MCP server
`tfy.mcp_server.url`	URL endpoint of the MCP server
`tfy.mcp_server.fqn`	Fully qualified name of the MCP server
`tfy.mcp_server.server_name`	Internal name of the MCP server
`tfy.mcp_server.method`	MCP method that was called
`tfy.mcp_server.primitive_name`	Name of the MCP primitive used
`tfy.mcp_server.error_code`	Error code if the MCP call failed
`tfy.mcp_server.is_tool_call_execution_error`	Whether the error was from tool call execution
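
When debugging MCP spans, it can help to separate failures raised while executing a tool from other MCP errors. The sketch below does this using `tfy.mcp_server.is_tool_call_execution_error` and `tfy.mcp_server.error_code`. It assumes the flag is a boolean and that the error code is absent or `None` on success; the server names, error code value, and category labels are made up for illustration.

```python
from collections import Counter

# Hypothetical MCP spans; values are invented for the example.
mcp_spans = [
    {"tfy.mcp_server.name": "search", "tfy.mcp_server.primitive_name": "web_search"},
    {
        "tfy.mcp_server.name": "search",
        "tfy.mcp_server.primitive_name": "web_search",
        "tfy.mcp_server.error_code": -32000,
        "tfy.mcp_server.is_tool_call_execution_error": True,
    },
    {
        "tfy.mcp_server.name": "files",
        "tfy.mcp_server.error_code": -32601,
        "tfy.mcp_server.is_tool_call_execution_error": False,
    },
]


def classify_mcp_span(span):
    """Label an MCP span as 'ok', 'tool_execution_error', or 'mcp_error'."""
    if span.get("tfy.mcp_server.is_tool_call_execution_error"):
        return "tool_execution_error"
    if span.get("tfy.mcp_server.error_code") is not None:
        return "mcp_error"
    return "ok"


mcp_outcomes = Counter(classify_mcp_span(span) for span in mcp_spans)
print(dict(mcp_outcomes))
```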

MCP Server Metrics

Attribute	Description
`tfy.mcp_server.metric.latency_in_ms`	Latency of the MCP server call in milliseconds
`tfy.mcp_server.metric.number_of_tools`	Number of tools available in the MCP server

Guardrail Attributes

Attribute	Description
`tfy.guardrail.id`	Unique identifier of the guardrail
`tfy.guardrail.name`	Display name of the guardrail
`tfy.guardrail.fqn`	Fully qualified name of the guardrail
`tfy.guardrail.result`	Result of the guardrail check (e.g., `'pass'`, `'mutate'`, `'flag'`)

Guardrail Applied Entity Attributes

Attribute	Description
`tfy.guardrail.applied_on_entity.type`	Type of entity the guardrail was applied to
`tfy.guardrail.applied_on_entity.id`	ID of the entity
`tfy.guardrail.applied_on_entity.name`	Name of the entity
`tfy.guardrail.applied_on_entity.fqn`	Fully qualified name of the entity
`tfy.guardrail.applied_on_entity.scope`	Scope of the entity

Guardrail Metrics

Attribute	Description
`tfy.guardrail.metric.latency_in_ms`	Time taken for the guardrail check in milliseconds
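
The guardrail attributes can be combined to see how often each guardrail passes, mutates, or flags content and how much latency it adds. The following is a sketch under the assumption that guardrail spans are flat dicts; the guardrail FQNs and latency numbers are hypothetical, while the result values come from the examples listed for `tfy.guardrail.result`.

```python
from collections import Counter, defaultdict

# Hypothetical guardrail spans.
guardrail_spans = [
    {"tfy.guardrail.fqn": "example/pii", "tfy.guardrail.result": "pass",
     "tfy.guardrail.metric.latency_in_ms": 20},
    {"tfy.guardrail.fqn": "example/pii", "tfy.guardrail.result": "mutate",
     "tfy.guardrail.metric.latency_in_ms": 40},
    {"tfy.guardrail.fqn": "example/toxicity", "tfy.guardrail.result": "flag",
     "tfy.guardrail.metric.latency_in_ms": 30},
]


def summarize_guardrails(spans):
    """Per guardrail FQN, count results and compute the average check latency."""
    results = defaultdict(Counter)
    latencies = defaultdict(list)
    for span in spans:
        fqn = span.get("tfy.guardrail.fqn", "unknown")
        results[fqn][span.get("tfy.guardrail.result", "unknown")] += 1
        latency = span.get("tfy.guardrail.metric.latency_in_ms")
        if latency is not None:
            latencies[fqn].append(latency)

    summary = {}
    for fqn, counter in results.items():
        values = latencies[fqn]
        summary[fqn] = {
            "results": dict(counter),
            "avg_latency_in_ms": sum(values) / len(values) if values else None,
        }
    return summary


guardrail_summary = summarize_guardrails(guardrail_spans)
for fqn, stats in guardrail_summary.items():
    print(fqn, stats)
```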

HTTP Response Attributes

Attribute	Description
`http.response.status_code`	HTTP status code of the response
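
For a quick view of success versus client and server errors, `http.response.status_code` can be bucketed by its leading digit. The sketch below assumes spans are flat dicts and the status code is an integer (or a numeric string); the sample codes and the `status_class` helper are illustrative.

```python
from collections import Counter

# Hypothetical spans with HTTP status codes; the last one has no status recorded.
http_spans = [
    {"http.response.status_code": 200},
    {"http.response.status_code": 200},
    {"http.response.status_code": 429},
    {"http.response.status_code": 503},
    {},
]


def status_class(span):
    """Map a span to '2xx', '4xx', '5xx', etc., or 'unknown' if no status code is set."""
    code = span.get("http.response.status_code")
    if code is None:
        return "unknown"
    return f"{int(code) // 100}xx"


status_counts = Counter(status_class(span) for span in http_spans)
print(dict(status_counts))
```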

