Changelog

All notable changes to the Invarium platform are documented here. Entries are organized by version with the most recent release first.

v0.1.0 (Beta) — March 2026

Initial beta release of the Invarium platform.

Added

MCP Server — Five tools for agent testing: invarium_connect, invarium_upload_blueprint, invarium_generate_tests, invarium_get_tests, and invarium_sync_results
Blueprint upload and validation — JSON blueprint format for describing AI agents with tools, constraints, and workflows
Test generation — Behavioral test case generation targeting six failure categories: Hallucination, Tool Misuse, Guardrail Violation, Loop/Stuck, Context Loss, and Safety Violation
Behavioral Safety Score (BSS) — 0-100 reliability score with four components: pass rate, severity weighting, coverage breadth, and consistency
Failure Taxonomy — Structured failure classification with categories, subtypes, and severity levels (Critical, High, Medium, Low)
Agent Intelligence Graph — Interactive visualization of agent behavioral patterns, decision paths, and failure clusters
Behavioral Tracing — Timestamped event logs for every agent action during testing
Dashboard — Web dashboard at app.invarium.dev with test run management, BSS tracking, agent graph visualization, and team management
Quality Gates — Configurable pass/fail thresholds for CI/CD integration, with rules based on BSS and failure counts
Team Management — Workspace-based collaboration with Admin, Member, and Viewer roles
MCP Resources — Blueprint template and prompt template resources for scaffolding new blueprints
Multi-framework support — Blueprint format supports LangChain, CrewAI, AutoGen, and custom agents
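
As an illustration of how a Quality Gate that combines a BSS threshold with failure count rules might be evaluated in a CI step, here is a minimal sketch. The `QualityGate` class, the `evaluate_gate` function, and the default thresholds are hypothetical names and values chosen for this example; they are not the Invarium API or its configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class QualityGate:
    """Hypothetical gate definition, not the Invarium configuration schema."""
    # Assumed minimum acceptable score on the 0-100 BSS scale.
    min_bss: float = 80.0
    # Assumed per-severity failure limits, keyed by the taxonomy's severity levels.
    max_failures: dict = field(default_factory=lambda: {"Critical": 0})


def evaluate_gate(gate, bss, failures):
    """Return (passed, reasons) for one test run.

    `failures` maps a severity level to the number of failures observed.
    """
    reasons = []
    if bss < gate.min_bss:
        reasons.append(f"BSS {bss} is below the minimum {gate.min_bss}")
    for severity, limit in gate.max_failures.items():
        count = failures.get(severity, 0)
        if count > limit:
            reasons.append(f"{count} {severity} failures exceed the limit of {limit}")
    # The gate passes only when no rule was violated.
    return (not reasons, reasons)
```

In a pipeline, a script built along these lines would exit with a non-zero status whenever `passed` is false, so that the build stops and the collected reasons appear in the job log.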

Known limitations

Export functionality for Agent Intelligence Graph is limited to PNG
Trace data retention is 90 days during beta
Maximum of 100 test cases per generation request
Custom failure categories are not yet supported
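
One way to plan around the limit of 100 test cases per generation request is to split a larger target into several requests. The sketch below only computes the batch sizes. The `plan_batches` helper is hypothetical, and it assumes that issuing multiple generation requests for the same blueprint is permitted, which this changelog does not confirm.

```python
MAX_CASES_PER_REQUEST = 100  # beta limit listed under known limitations


def plan_batches(total_cases, limit=MAX_CASES_PER_REQUEST):
    """Split a desired number of test cases into per-request batch sizes.

    Hypothetical helper: it plans request sizes only and calls no Invarium tool.
    """
    if total_cases < 0 or limit <= 0:
        raise ValueError("total_cases must be >= 0 and limit must be > 0")
    full, remainder = divmod(total_cases, limit)
    # Full batches first, then one smaller batch for whatever is left over.
    return [limit] * full + ([remainder] if remainder else [])
```

For example, `plan_batches(250)` returns `[100, 100, 50]`, meaning three generation requests.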
