Changelog
All notable changes to the Invarium platform are documented here. Entries are organized by version with the most recent release first.
v0.1.0 (Beta) — March 2026
Initial beta release of the Invarium platform.
Added
- MCP Server — Five tools for agent testing: invarium_connect, invarium_upload_blueprint, invarium_generate_tests, invarium_get_tests, and invarium_sync_results
- Blueprint upload and validation — JSON blueprint format for describing AI agents with tools, constraints, and workflows
- Test generation — Behavioral test case generation targeting six failure categories: Hallucination, Tool Misuse, Guardrail Violation, Loop/Stuck, Context Loss, and Safety Violation
- Behavioral Safety Score (BSS) — 0-100 reliability score with four components: pass rate, severity weighting, coverage breadth, and consistency
- Failure Taxonomy — Structured failure classification with categories, subtypes, and severity levels (Critical, High, Medium, Low)
- Agent Intelligence Graph — Interactive visualization of agent behavioral patterns, decision paths, and failure clusters
- Behavioral Tracing — Timestamped event logs for every agent action during testing
- Dashboard — Web dashboard at app.invarium.dev with test run management, BSS tracking, agent graph visualization, and team management
- Quality Gates — Configurable pass/fail thresholds for CI/CD integration, with rules based on BSS and failure counts
- Team Management — Workspace-based collaboration with Admin, Member, and Viewer roles
- MCP Resources — Blueprint template and prompt template resources for scaffolding new blueprints
- Multi-framework support — Blueprint format supports LangChain, CrewAI, AutoGen, and custom agents
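As a rough illustration of how a Quality Gate with BSS and failure-count rules might be evaluated in a CI step, here is a minimal Python sketch. The `QualityGate` class, its field names, the default thresholds, and the `evaluate_gate` helper are all hypothetical and chosen for illustration; they are not Invarium's configuration schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityGate:
    """Hypothetical gate definition; field names and defaults are assumptions."""
    min_bss: float = 80.0            # lowest acceptable BSS on the 0-100 scale
    max_critical_failures: int = 0   # allowed failures at Critical severity
    max_total_failures: int = 5      # allowed failures across all severities


def evaluate_gate(gate, bss, failures_by_severity):
    """Return (passed, reasons) for one test run.

    failures_by_severity maps a severity name (Critical, High, Medium, Low)
    to the number of failed test cases at that severity.
    """
    reasons = []
    if bss < gate.min_bss:
        reasons.append(f"BSS {bss:.1f} is below the minimum of {gate.min_bss:.1f}")
    critical = failures_by_severity.get("Critical", 0)
    if critical > gate.max_critical_failures:
        reasons.append(f"{critical} Critical failure(s) exceed the limit of {gate.max_critical_failures}")
    total = sum(failures_by_severity.values())
    if total > gate.max_total_failures:
        reasons.append(f"{total} total failure(s) exceed the limit of {gate.max_total_failures}")
    # The gate passes only when no rule was violated.
    return (not reasons, reasons)
```

A CI job could call `evaluate_gate` with the run's BSS and failure counts, then exit with a non-zero status when the first element of the result is `False`.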
Known limitations
- Export functionality for Agent Intelligence Graph is limited to PNG
- Trace data retention is 90 days during beta
- Maximum of 100 test cases per generation request
- Custom failure categories are not yet supported
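To work within the limit of 100 test cases per generation request, a client that wants a larger suite could plan several smaller requests. The sketch below only computes the batch sizes. It assumes that issuing multiple generation requests is permitted, which this changelog does not state, and `split_into_requests` is a hypothetical helper rather than part of the platform.

```python
MAX_CASES_PER_REQUEST = 100  # beta limit listed under Known limitations


def split_into_requests(total_cases, limit=MAX_CASES_PER_REQUEST):
    """Return the per-request case counts needed to reach total_cases."""
    if total_cases < 0 or limit <= 0:
        raise ValueError("total_cases must be >= 0 and limit must be > 0")
    full_batches, remainder = divmod(total_cases, limit)
    # Full batches first, then one smaller batch for any leftover cases.
    return [limit] * full_batches + ([remainder] if remainder else [])
```

For example, a target of 250 test cases would be planned as two requests of 100 and one of 50.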