0.1.20
vLLora now includes a CLI tool that brings trace inspection and debugging capabilities directly to your terminal. The CLI enables fast iteration, automation workflows, and local reproduction of LLM traces without leaving the command line.
The CLI provides commands to:
- Search and filter traces by status, time range, model, and operation type
- Get run overviews with span trees and LLM call summaries
- Inspect individual LLM call payloads and responses
- Monitor system health with aggregated statistics
Learn more in the vLLora CLI documentation.
This release also introduces Custom Providers and Models, allowing you to register your own API endpoints and model identifiers. Connect to self-hosted inference engines (like Ollama or LocalAI), private enterprise gateways, or any OpenAI-compatible service using a namespaced format (provider/model-id). Configure providers and models through Settings or the Chat Model Selector.
Learn more in the Custom Providers and Models documentation.
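As a quick way to see what a custom provider points at, the sketch below calls a self-hosted, OpenAI-compatible endpoint directly with the OpenAI Python SDK. It assumes a local Ollama instance; the base URL, API key, and model name are placeholders, and the namespaced identifier in the comment is only an illustration of the provider/model-id format.

```python
# Sanity-check that a self-hosted endpoint speaks the OpenAI-compatible protocol
# before registering it as a custom provider in vLLora's Settings.
# Assumes a local Ollama instance; adjust the base URL and model name as needed.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
    api_key="ollama",                      # dummy key; Ollama ignores it
)

response = client.chat.completions.create(
    # Once registered as a custom provider, this model would be referenced
    # with a namespaced identifier such as "ollama/llama3.1".
    model="llama3.1",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```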
Features
- Add operation names filter to get traces API (41e3523)
- Enhance LLM call handling and tool summary structure (8283e74)
- Handle MCP functionality through the CLI (dc9915a)
- Remove trace_id from GetLlmCallParams and update related handling (4a0de09)
- Support custom providers, models and endpoints (#224) (148f9aa)
Bug Fixes
- Fix response mapping (c64a13f)