
0.1.19

vLLora now includes an MCP Server that exposes trace and run inspection as tools for coding agents. Debug, fix, and monitor your AI agents directly from your terminal or IDE by connecting Claude Desktop, Cursor, or any MCP-capable client to vLLora's MCP endpoint.

The MCP server provides tools to:

  • Search and filter traces by status, time range, model, and more
  • Get run overviews with span trees and error breadcrumbs
  • Inspect individual LLM call payloads and responses
  • Monitor system health with aggregated statistics

Learn more in the MCP Server documentation.
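As a sketch of what connecting a client might look like, here is a hypothetical MCP client configuration (e.g. for Cursor or a similar MCP-capable client). The server name, host, port, and path shown are placeholder assumptions, not values from this release; the actual endpoint is given in the MCP Server documentation.

```json
{
  "mcpServers": {
    "vllora": {
      "url": "http://localhost:8080/mcp"
    }
  }
}
```

Once the client is configured, the trace-search, run-overview, and payload-inspection tools listed above should appear in the client's tool list.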

Features

  • Enhance MCP tools for trace information (bdd770f)
  • Support tool-call events in the Responses API (fbcc376)

Bug Fixes

  • Allow using a custom endpoint for OpenAI (9e44d55)
  • Extract thread cost calculation from api_invoke only (c72c256)