Changelog

All notable changes to the Invarium platform are documented here. Entries are organized by version with the most recent release first.


v0.1.0 (Beta) — March 2026

Initial beta release of the Invarium platform.

Added

  • MCP Server — Five tools for agent testing: invarium_connect, invarium_upload_blueprint, invarium_generate_tests, invarium_get_tests, and invarium_sync_results
  • Blueprint upload and validation — JSON blueprint format for describing AI agents with tools, constraints, and workflows
  • Test generation — Behavioral test case generation targeting six failure categories: Hallucination, Tool Misuse, Guardrail Violation, Loop/Stuck, Context Loss, and Safety Violation
  • Behavioral Safety Score (BSS) — 0-100 reliability score with four components: pass rate, severity weighting, coverage breadth, and consistency
  • Failure Taxonomy — Structured failure classification with categories, subtypes, and severity levels (Critical, High, Medium, Low)
  • Agent Intelligence Graph — Interactive visualization of agent behavioral patterns, decision paths, and failure clusters
  • Behavioral Tracing — Timestamped event logs for every agent action during testing
  • Dashboard — Web dashboard at app.invarium.dev with test run management, BSS tracking, agent graph visualization, and team management
  • Quality Gates — Configurable pass/fail thresholds for CI/CD integration, with rules based on BSS and failure count
  • Team Management — Workspace-based collaboration with Admin, Member, and Viewer roles
  • MCP Resources — Blueprint template and prompt template resources for scaffolding new blueprints
  • Multi-framework support — Blueprint format supports LangChain, CrewAI, AutoGen, and custom agents
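
As an illustration of the Quality Gates entry above, here is a minimal sketch of how a CI step might apply BSS and failure-count thresholds to one test run. The names GateRule and evaluate_gate, the field names, and the default thresholds are all hypothetical and chosen for this example; they are not Invarium's configuration schema or API.

```python
from dataclasses import dataclass


@dataclass
class GateRule:
    """Hypothetical thresholds for illustration; not Invarium's real schema."""
    min_bss: float = 80.0           # lowest acceptable BSS on the 0-100 scale
    max_critical_failures: int = 0  # allowed failures with Critical severity
    max_total_failures: int = 5     # allowed failures across all severities


def evaluate_gate(bss: float, failures_by_severity: dict[str, int],
                  rule: GateRule) -> tuple[bool, list[str]]:
    """Return (passed, reasons); reasons lists every rule that was violated."""
    reasons = []
    if bss < rule.min_bss:
        reasons.append(f"BSS {bss:.1f} is below the minimum of {rule.min_bss:.1f}")
    critical = failures_by_severity.get("Critical", 0)
    if critical > rule.max_critical_failures:
        reasons.append(f"{critical} Critical failures exceed the limit of "
                       f"{rule.max_critical_failures}")
    total = sum(failures_by_severity.values())
    if total > rule.max_total_failures:
        reasons.append(f"{total} total failures exceed the limit of "
                       f"{rule.max_total_failures}")
    return (not reasons, reasons)


if __name__ == "__main__":
    passed, reasons = evaluate_gate(72.5, {"Critical": 1, "High": 3, "Low": 2}, GateRule())
    print("gate passed" if passed else "gate failed")
    for reason in reasons:
        print(" -", reason)
```

A CI job could exit with a non-zero status when the gate fails, so that a run below threshold blocks the pipeline.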

Known limitations

  • Export functionality for Agent Intelligence Graph is limited to PNG
  • Trace data retention is 90 days during beta
  • Maximum of 100 test cases per generation request
  • Custom failure categories are not yet supported
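
Given the limit of 100 test cases per generation request listed above, a client that needs a larger suite would have to issue several requests. The sketch below shows one way to plan the request sizes. The function name split_into_requests is hypothetical, and the code only computes batch sizes; it does not call any Invarium tool.

```python
MAX_CASES_PER_REQUEST = 100  # beta limit stated in the known limitations


def split_into_requests(total_cases: int,
                        limit: int = MAX_CASES_PER_REQUEST) -> list[int]:
    """Split a desired number of test cases into per-request batch sizes."""
    if total_cases < 0:
        raise ValueError("total_cases must be non-negative")
    if limit <= 0:
        raise ValueError("limit must be positive")
    full_batches, remainder = divmod(total_cases, limit)
    # Full batches first, then one smaller batch for any leftover cases.
    return [limit] * full_batches + ([remainder] if remainder else [])


if __name__ == "__main__":
    print(split_into_requests(250))  # [100, 100, 50]
```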
