invarium_dashboard

Show the Invarium dashboard overview. Displays fleet health, pass rate, failure breakdown, recent test runs, and quick stats — the same data shown on the web dashboard, directly in your IDE.

When to Use

Call invarium_dashboard when you want a high-level summary of your workspace’s testing status. Common scenarios:

  • At the start of a session to understand the current state of your agents
  • After completing a testing cycle to see the overall impact on pass rates and AQS scores
  • When deciding which agent to focus on next based on fleet health

See Understanding Your Dashboard for interpreting the metrics.

Parameters

This tool takes no parameters. It returns the dashboard overview for the authenticated user’s workspace.
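
Because the tool takes no parameters, a raw MCP `tools/call` request for it carries an empty `arguments` object. The sketch below only builds that JSON-RPC payload; the helper name `build_dashboard_call` is hypothetical, and most users will never write this by hand because the IDE's MCP client sends it for them.

```python
import json


def build_dashboard_call(request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request for invarium_dashboard.

    Hypothetical helper for illustration; transport and authentication
    are handled by your MCP client and are not shown here.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # The tool takes no parameters, so `arguments` is an empty object.
        "params": {"name": "invarium_dashboard", "arguments": {}},
    }
    return json.dumps(request)


print(build_dashboard_call())
```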

Returns

Formatted string with quick stats, fleet health breakdown, failure categories, and the five most recent test runs.

Response

The tool returns a formatted text overview organized into four sections:

Invarium Dashboard Overview

Quick Stats:
  Agents: 4 | Scenarios: 23 | Test Runs: 12 | Pass Rate: 78%

Fleet Health:
  Passing (2): order-agent (87), search-agent (92)
  At Risk (1): support-agent (54)
  Failing (1): billing-agent (31)

Failure Breakdown:
  Tool Misuse: 8 | Hallucination: 5 | Security: 2

Recent Runs:
  1. [order-agent] completed -- 9/10 passed (90%) -- Mar 26
  2. [support-agent] completed -- 6/10 passed (60%) -- Mar 25
  3. [billing-agent] failed -- 3/10 passed (30%) -- Mar 24

View full dashboard: https://app.invarium.dev/dashboard

Quick Stats

| Field | Description |
| --- | --- |
| Agents | Total number of registered agents in the workspace. |
| Scenarios | Total number of test scenarios across all agents. |
| Test Runs | Total number of test runs executed. |
| Pass Rate | Overall pass rate across all test runs, as a percentage. |
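
If you want these numbers as data rather than text, the Quick Stats line can be split on its `|` separators. This is a minimal sketch that assumes the line always follows the layout in the sample response; `parse_quick_stats` is a hypothetical helper, not part of Invarium.

```python
def parse_quick_stats(line: str) -> dict[str, int]:
    """Parse a Quick Stats line into a {field: value} mapping.

    Assumes the 'Name: value | Name: value' layout from the sample
    response; the real format may differ.
    """
    stats: dict[str, int] = {}
    for field in line.strip().split("|"):
        name, _, value = field.partition(":")
        # Drop surrounding whitespace and the percent sign on Pass Rate.
        stats[name.strip()] = int(value.strip().rstrip("%"))
    return stats


line = "  Agents: 4 | Scenarios: 23 | Test Runs: 12 | Pass Rate: 78%"
print(parse_quick_stats(line))
# {'Agents': 4, 'Scenarios': 23, 'Test Runs': 12, 'Pass Rate': 78}
```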

Fleet Health

Agents are grouped into health tiers based on their Agent Quality Score (AQS). Each agent shows its name and AQS score in parentheses.

| Tier | AQS Range | Meaning |
| --- | --- | --- |
| Passing | 61–100 | Agent is performing well. Tests are passing consistently. |
| At Risk | 31–60 | Agent has notable issues. Some tests are failing or reliability is inconsistent. |
| Failing | 0–30 | Agent has critical problems. Significant test failures or safety concerns. |
| No Data | | Agent has been registered but no test runs have been executed yet. |
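
The tier boundaries in the table can be expressed as a small lookup. This is an illustrative sketch of the documented ranges only; `tier_for_aqs` is a hypothetical name, and the server's actual classification logic is not shown in this reference.

```python
from typing import Optional


def tier_for_aqs(score: Optional[int]) -> str:
    """Map an AQS score to the health tier listed in the table.

    Hypothetical helper: `None` stands in for an agent with no test runs.
    """
    if score is None:
        return "No Data"
    if not 0 <= score <= 100:
        raise ValueError(f"AQS must be between 0 and 100, got {score}")
    if score >= 61:
        return "Passing"
    if score >= 31:
        return "At Risk"
    return "Failing"


for score in (92, 54, 12, None):
    print(score, "->", tier_for_aqs(score))
```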

Failure Breakdown

Aggregated failure counts across all recent test runs, sorted by frequency. Categories include:

| Category | Description |
| --- | --- |
| Security | Prompt injection, guardrail bypass, or unauthorized actions. |
| Hallucination | Fabricated information, outdated facts, or source misattribution. |
| Tool Misuse | Wrong tool selected, incorrect parameters, or sequence violations. |
| Policy Violation | Constraint violations or actions outside the agent’s defined scope. |
| Other | Failures that do not fit the above categories. |

Categories with zero failures are omitted from the response.
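
Since zero-count categories are left out of the response, code that consumes the breakdown may want to restore them explicitly. The sketch below assumes the `Name: count | Name: count` layout from the sample response; `parse_failure_breakdown` is a hypothetical helper.

```python
# Category names as listed in the table above.
CATEGORIES = ("Security", "Hallucination", "Tool Misuse", "Policy Violation", "Other")


def parse_failure_breakdown(line: str) -> dict[str, int]:
    """Parse a Failure Breakdown line, filling omitted categories with 0.

    Assumes the layout shown in the sample response; the real format
    may differ.
    """
    counts = {category: 0 for category in CATEGORIES}
    for field in line.strip().split("|"):
        name, _, value = field.partition(":")
        counts[name.strip()] = int(value)
    return counts


print(parse_failure_breakdown("  Tool Misuse: 8 | Hallucination: 5 | Security: 2"))
```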

Recent Runs

The five most recent test runs, showing:

| Field | Description |
| --- | --- |
| Agent name | The agent the test run belongs to. |
| Status | completed, failed, running, or pending. |
| Pass rate | Number of passed tests out of total, with percentage. Shows "in progress" for running tests. |
| Date | When the test run was created. |
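
A Recent Runs entry can be picked apart with a regular expression. This sketch matches only the finished-run layout from the sample response; the exact text for running or pending runs is not shown in this reference, so those lines return `None`. `Run` and `parse_run` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Run(NamedTuple):
    agent: str
    status: str
    passed: int
    total: int
    date: str


# Matches lines like: "1. [order-agent] completed -- 9/10 passed (90%) -- Mar 26"
RUN_LINE = re.compile(
    r"^\s*\d+\.\s+\[(?P<agent>[^\]]+)\]\s+(?P<status>\w+)\s+--\s+"
    r"(?P<passed>\d+)/(?P<total>\d+) passed \(\d+%\)\s+--\s+(?P<date>.+?)\s*$"
)


def parse_run(line: str) -> Optional[Run]:
    """Return a Run for a finished-run line, or None if the line does not match."""
    match = RUN_LINE.match(line)
    if match is None:
        return None  # e.g. a run still in progress, whose layout is not documented here
    return Run(
        match["agent"],
        match["status"],
        int(match["passed"]),
        int(match["total"]),
        match["date"],
    )


print(parse_run("  3. [billing-agent] failed -- 3/10 passed (30%) -- Mar 24"))
```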

Empty State

If no agents have been registered and no test runs exist, the tool returns:

No data yet. Register an agent with invarium_upload_blueprint to get started.

Examples

Basic Usage

Call with no arguments to see the workspace overview:

"Show me the Invarium dashboard."

After Completing a Test Cycle

After syncing test results with invarium_sync_results, call the dashboard to see updated pass rates and fleet health:

"Sync these results for order-agent, then show me the dashboard to see the updated fleet health."

Identifying At-Risk Agents

Review the Fleet Health section to find agents in the “At Risk” or “Failing” tiers. These agents need attention — use invarium_get_agent to drill into a specific agent’s details, or invarium_generate_tests to create targeted test cases for their weak areas.
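
The same triage can be scripted against the text response. The sketch below scans the Fleet Health lines and lists agents outside the Passing tier, lowest AQS first. It assumes the layout from the sample response and that agent names contain only letters, digits, underscores, dots, and hyphens; `agents_needing_attention` is a hypothetical helper.

```python
import re

# Assumed layout: "At Risk (1): support-agent (54)"
FLEET_LINE = re.compile(r"^\s*(?P<tier>Passing|At Risk|Failing)\s+\(\d+\):\s*(?P<agents>.*)$")
AGENT = re.compile(r"(?P<name>[\w.-]+)\s+\((?P<aqs>\d+)\)")


def agents_needing_attention(overview: str) -> list[tuple[str, int, str]]:
    """Return (name, AQS, tier) for At Risk and Failing agents, lowest AQS first."""
    flagged = []
    for line in overview.splitlines():
        match = FLEET_LINE.match(line)
        if match is None or match["tier"] == "Passing":
            continue
        for agent in AGENT.finditer(match["agents"]):
            flagged.append((agent["name"], int(agent["aqs"]), match["tier"]))
    return sorted(flagged, key=lambda item: item[1])


sample = """Fleet Health:
  Passing (2): order-agent (87), search-agent (92)
  At Risk (1): support-agent (54)
  Failing (1): billing-agent (31)
"""
for name, aqs, tier in agents_needing_attention(sample):
    print(f"{name}: AQS {aqs} ({tier})")
```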

Error Responses

| Error | Cause | Fix |
| --- | --- | --- |
| Authentication failed: invalid API key | Invalid or missing API key. | Verify your INVARIUM_API_KEY is set correctly. Run invarium_connect first. |
| Failed to fetch dashboard: ... | Backend API error or network issue. | Check your connection and try again. See status.invarium.dev for outages. |
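
If you wrap the tool in your own script, the two documented errors can be told apart by their message prefix. This is a rough sketch that assumes the error arrives as plain text beginning with the strings in the table; `suggest_fix` is a hypothetical helper.

```python
def suggest_fix(message: str) -> str:
    """Map a dashboard error message to the fix suggested in the table.

    Assumes the message starts with one of the documented prefixes.
    """
    if message.startswith("Authentication failed"):
        return "Verify INVARIUM_API_KEY is set correctly, then run invarium_connect."
    if message.startswith("Failed to fetch dashboard"):
        return "Check your connection and try again; see status.invarium.dev for outages."
    return "Unrecognized error; see Error Codes for the full error reference."


print(suggest_fix("Authentication failed: invalid API key"))
```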

See Error Codes for the full error reference.