invarium_dashboard
Show the Invarium dashboard overview. Displays fleet health, pass rate, failure breakdown, recent test runs, and quick stats — the same data shown on the web dashboard, directly in your IDE.
When to Use
Call invarium_dashboard when you want a high-level summary of your workspace’s testing status. Common scenarios:
- At the start of a session to understand the current state of your agents
- After completing a testing cycle to see the overall impact on pass rates and Agent Quality Scores (AQS)
- When deciding which agent to focus on next based on fleet health
See Understanding Your Dashboard for interpreting the metrics.
Parameters
This tool takes no parameters. It returns the dashboard overview for the authenticated user’s workspace.
Returns
A formatted string containing quick stats, the fleet health breakdown, failure categories, and the five most recent test runs.
Response
The tool returns a formatted text overview organized into four sections:
Invarium Dashboard Overview
Quick Stats:
Agents: 4 | Scenarios: 23 | Test Runs: 12 | Pass Rate: 78%
Fleet Health:
Passing (2): order-agent (87), search-agent (92)
At Risk (1): support-agent (54)
Failing (1): billing-agent (31)
Failure Breakdown:
Tool Misuse: 8 | Hallucination: 5 | Security: 2
Recent Runs:
1. [order-agent] completed -- 9/10 passed (90%) -- Mar 26
2. [support-agent] completed -- 6/10 passed (60%) -- Mar 25
3. [billing-agent] failed -- 3/10 passed (30%) -- Mar 24
View full dashboard: https://app.invarium.dev/dashboard
Quick Stats
| Field | Description |
|---|---|
| Agents | Total number of registered agents in the workspace. |
| Scenarios | Total number of test scenarios across all agents. |
| Test Runs | Total number of test runs executed. |
| Pass Rate | Overall pass rate across all test runs, as a percentage. |
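If you want these numbers in a script rather than as text, the Quick Stats line can be split on its separators. The following is a minimal sketch that assumes the line keeps the `Label: value | Label: value` layout shown in the sample response; `parse_quick_stats` is a hypothetical helper, not part of the tool.

```python
def parse_quick_stats(line: str) -> dict[str, int]:
    """Parse 'Agents: 4 | Scenarios: 23 | ...' into a dict of integers."""
    stats = {}
    for part in line.split("|"):
        label, _, value = part.partition(":")
        # Strip whitespace and a trailing percent sign before converting.
        stats[label.strip()] = int(value.strip().rstrip("%"))
    return stats


stats = parse_quick_stats("Agents: 4 | Scenarios: 23 | Test Runs: 12 | Pass Rate: 78%")
print(stats)
# {'Agents': 4, 'Scenarios': 23, 'Test Runs': 12, 'Pass Rate': 78}
```

The pass rate is stored as a whole-number percentage, matching how it appears in the response.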
Fleet Health
Agents are grouped into health tiers based on their Agent Quality Score (AQS). Each agent is listed by name, followed by its AQS in parentheses.
| Tier | AQS Range | Meaning |
|---|---|---|
| Passing | 61–100 | Agent is performing well. Tests are passing consistently. |
| At Risk | 31–60 | Agent has notable issues. Some tests are failing or reliability is inconsistent. |
| Failing | 0–30 | Agent has critical problems. Significant test failures or safety concerns. |
| No Data | — | Agent has been registered but no test runs have been executed yet. |
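The tier boundaries in the table can be expressed as a small lookup. This sketch assumes AQS values are whole numbers from 0 to 100 and uses `None` to stand for an agent with no test runs; `aqs_tier` is a hypothetical name chosen for illustration.

```python
from typing import Optional


def aqs_tier(aqs: Optional[int]) -> str:
    """Map an AQS value to a health tier using the ranges in the table."""
    if aqs is None:
        # Registered agent with no test runs yet.
        return "No Data"
    if not 0 <= aqs <= 100:
        raise ValueError(f"AQS must be between 0 and 100, got {aqs}")
    if aqs >= 61:
        return "Passing"
    if aqs >= 31:
        return "At Risk"
    return "Failing"


print(aqs_tier(87))  # Passing
print(aqs_tier(54))  # At Risk
```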
Failure Breakdown
Aggregated failure counts across all recent test runs, sorted by frequency. Categories include:
| Category | Description |
|---|---|
| Security | Prompt injection, guardrail bypass, or unauthorized actions. |
| Hallucination | Fabricated information, outdated facts, or source misattribution. |
| Tool Misuse | Wrong tool selected, incorrect parameters, or sequence violations. |
| Policy Violation | Constraint violations or actions outside the agent’s defined scope. |
| Other | Failures that do not fit the above categories. |
Categories with zero failures are omitted from the response.
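Because zero-count categories are left out, code that reads this line may want to fill them back in. A minimal sketch, assuming the `Category: count | Category: count` layout from the sample response; `CATEGORIES` and `parse_failure_breakdown` are illustrative names only.

```python
# The five categories listed in the table above.
CATEGORIES = ["Security", "Hallucination", "Tool Misuse", "Policy Violation", "Other"]


def parse_failure_breakdown(line: str) -> dict[str, int]:
    """Parse the Failure Breakdown line, defaulting omitted categories to 0."""
    counts = {name: 0 for name in CATEGORIES}
    for part in line.split("|"):
        label, _, value = part.partition(":")
        counts[label.strip()] = int(value)
    return counts


counts = parse_failure_breakdown("Tool Misuse: 8 | Hallucination: 5 | Security: 2")
print(counts["Tool Misuse"], counts["Policy Violation"])  # 8 0
```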
Recent Runs
The five most recent test runs, showing:
| Field | Description |
|---|---|
| Agent name | The agent the test run belongs to. |
| Status | completed, failed, running, or pending. |
| Pass rate | Number of passed tests out of total, with percentage. Shows "in progress" for running tests. |
| Date | When the test run was created. |
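A single Recent Runs line can be broken into these fields with a regular expression. The sketch below is based on the line layout in the sample response. The exact text shown for a running test is not documented here, so the code treats any result without `N/M passed` as having no counts yet. `parse_run` is a hypothetical helper.

```python
import re
from typing import Optional

# Layout assumed from the sample: "1. [agent] status -- result -- date"
RUN_LINE = re.compile(
    r"^\d+\.\s+\[(?P<agent>[^\]]+)\]\s+(?P<status>\w+)"
    r"\s+--\s+(?P<result>.+?)\s+--\s+(?P<date>.+)$"
)
RESULT = re.compile(r"^(?P<passed>\d+)/(?P<total>\d+) passed")


def parse_run(line: str) -> Optional[dict]:
    """Return the fields of one Recent Runs line, or None if it does not match."""
    match = RUN_LINE.match(line.strip())
    if match is None:
        return None
    run = match.groupdict()
    counts = RESULT.match(run["result"])
    # Runs without pass counts (for example, still running) get None.
    run["passed"] = int(counts["passed"]) if counts else None
    run["total"] = int(counts["total"]) if counts else None
    return run


run = parse_run("3. [billing-agent] failed -- 3/10 passed (30%) -- Mar 24")
print(run["agent"], run["status"], run["passed"], run["total"])
# billing-agent failed 3 10
```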
Empty State
If no agents have been registered and no test runs exist, the tool returns:
No data yet. Register an agent with invarium_upload_blueprint to get started.
Examples
Basic Usage
Call with no arguments to see the workspace overview:
"Show me the Invarium dashboard."
After Completing a Test Cycle
After syncing test results with invarium_sync_results, call the dashboard to see updated pass rates and fleet health:
"Sync these results for order-agent, then show me the dashboard to see the updated fleet health."
Identifying At-Risk Agents
Review the Fleet Health section to find agents in the “At Risk” or “Failing” tiers. These agents need attention — use invarium_get_agent to drill into a specific agent’s details, or invarium_generate_tests to create targeted test cases for their weak areas.
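The same triage can be scripted against the response text. This sketch pulls the At Risk and Failing agents out of the Fleet Health lines and orders them by lowest AQS. It assumes the `Tier (count): name (score), ...` layout from the sample response and agent names made of letters, digits, underscores, and hyphens; `agents_needing_attention` is a hypothetical helper.

```python
import re

TIER_LINE = re.compile(r"^(?P<tier>Passing|At Risk|Failing) \(\d+\): (?P<agents>.+)$")
# Assumed agent-name characters: letters, digits, underscores, hyphens.
AGENT = re.compile(r"(?P<name>[\w-]+) \((?P<aqs>\d+)\)")


def agents_needing_attention(overview: str) -> list[tuple[str, int, str]]:
    """Return (name, aqs, tier) for At Risk and Failing agents, lowest AQS first."""
    flagged = []
    for line in overview.splitlines():
        match = TIER_LINE.match(line.strip())
        if match is None or match["tier"] == "Passing":
            continue
        for agent in AGENT.finditer(match["agents"]):
            flagged.append((agent["name"], int(agent["aqs"]), match["tier"]))
    return sorted(flagged, key=lambda item: item[1])


overview = """Fleet Health:
Passing (2): order-agent (87), search-agent (92)
At Risk (1): support-agent (54)
Failing (1): billing-agent (31)"""

for name, aqs, tier in agents_needing_attention(overview):
    print(f"{name}: {aqs} ({tier})")
# billing-agent: 31 (Failing)
# support-agent: 54 (At Risk)
```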
Error Responses
| Error | Cause | Fix |
|---|---|---|
| Authentication failed: invalid API key | Invalid or missing API key. | Verify your INVARIUM_API_KEY is set correctly. Run invarium_connect first. |
| Failed to fetch dashboard: ... | Backend API error or network issue. | Check your connection and try again. See status.invarium.dev for outages. |
See Error Codes for the full error reference.
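A caller that receives the response as plain text can tell these cases apart by prefix before attempting to read any metrics. This is a minimal sketch that assumes the error and empty-state messages begin exactly as documented on this page; `classify_response` and its return labels are hypothetical.

```python
def classify_response(text: str) -> str:
    """Return 'auth_error', 'fetch_error', 'empty', or 'ok' for a tool response."""
    stripped = text.strip()
    if stripped.startswith("Authentication failed:"):
        return "auth_error"
    if stripped.startswith("Failed to fetch dashboard:"):
        return "fetch_error"
    if stripped.startswith("No data yet."):
        # Empty state: no agents registered and no test runs.
        return "empty"
    return "ok"


print(classify_response("Authentication failed: invalid API key"))  # auth_error
print(classify_response("Invarium Dashboard Overview\nQuick Stats:"))  # ok
```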