invarium_list_test_runs
List test runs for an agent or across all agents in your workspace. Shows run status, pass rate, source, and date for each run.
When to Use
Call invarium_list_test_runs when you need to review test execution history. Common scenarios:
- After syncing results with invarium_sync_results, verify the test run was created
- Review recent test runs to find a specific test_run_id for deeper inspection with invarium_get_test_run
- Compare pass rates across multiple test runs to track quality trends
- Filter by status to find failed or in-progress runs that need attention
- Get a cross-agent overview by omitting agent_name to see runs across the entire workspace
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| agent_name | string \| null | default: null | Filter by agent name. If omitted, lists runs across all agents in the workspace. |
| status | string \| null | default: null | Filter by run status. Valid values: completed, failed, running, pending. If omitted, all statuses are returned. |
| limit | int | default: 10 | Maximum number of runs to return. Must be a positive integer. |
| offset | int | default: 0 | Number of runs to skip for pagination. Must be a non-negative integer. |
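The constraints in the table can be checked on the client side before the tool is called. The following is a minimal Python sketch, not part of Invarium itself: `build_list_args` and `page_to_offset` are hypothetical helper names, and the returned dictionary simply mirrors the parameter names above. The error messages reuse the wording from the Error Responses table.

```python
from typing import Optional

# Status values accepted by the `status` filter, per the parameter table.
VALID_STATUSES = {"completed", "failed", "running", "pending"}


def build_list_args(
    agent_name: Optional[str] = None,
    status: Optional[str] = None,
    limit: int = 10,
    offset: int = 0,
) -> dict:
    """Validate arguments and return them as a dict (hypothetical helper)."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"Invalid status '{status}'")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        raise ValueError("Invalid limit: must be a positive integer")
    if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
        raise ValueError("Invalid offset: must be a non-negative integer")

    args = {"limit": limit, "offset": offset}
    # Omitted filters are left out so the tool falls back to its defaults.
    if agent_name is not None:
        args["agent_name"] = agent_name
    if status is not None:
        args["status"] = status
    return args


def page_to_offset(page: int, limit: int = 10) -> int:
    """Convert a 1-based page number into an offset (hypothetical helper)."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    return (page - 1) * limit
```

With the default limit of 10, `page_to_offset(2)` gives an offset of 10, which corresponds to asking for "the next 10" runs after the first page.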
Returns
Formatted list of test runs with run ID, date, status, pass rate, and source. Includes total count in the header.
Example
Test runs for 'customer-support-agent' (12 total):
1. Run 8f4a2b1c -- 2025-01-15 14:32
Status: completed | Passed: 8/10 (80%) | Source: mcp
2. Run a3c7e9d2 -- 2025-01-14 09:15
Status: completed | Passed: 15/20 (75%) | Source: ci
3. Run f1b2c3d4 -- 2025-01-13 16:48
Status: failed | Passed: 3/10 (30%) | Source: mcp
Response
Agent-Scoped Listing
When agent_name is provided, runs are listed for that specific agent:
Test runs for 'customer-support-agent' (12 total):
1. Run 8f4a2b1c -- 2025-01-15 14:32
Status: completed | Passed: 8/10 (80%) | Source: mcp
2. Run a3c7e9d2 -- 2025-01-14 09:15
Status: completed | Passed: 15/20 (75%) | Source: ci
3. Run f1b2c3d4 -- 2025-01-13 16:48
Status: failed | Passed: 3/10 (30%) | Source: mcp
Workspace-Wide Listing
When agent_name is omitted, runs include the agent name for each entry:
Recent test runs (24 total):
1. [customer-support-agent] Run 8f4a2b1c -- 2025-01-15 14:32
Status: completed | Passed: 8/10 (80%)
2. [order-processing-agent] Run b5d6e7f8 -- 2025-01-15 11:20
Status: running | Passed: --/15
3. [travel-booking-agent] Run c9a1b2d3 -- 2025-01-14 17:05
Status: completed | Passed: 12/12 (100%)
Per-Run Fields
| Field | Description |
|---|---|
| Run ID | Short identifier (first 8 characters) for the test run. Use the full ID with invarium_get_test_run for details. |
| Date | When the test run was created. |
| Status | Current status: completed, failed, running, or pending. |
| Passed | Pass rate shown as passed/total with percentage. Shows -- if results are not yet available. |
| Source | Where the tests were executed: mcp, ci, vscode, cli, or api. Shown for agent-scoped listings. |
| Agent name | The agent the run belongs to. Shown only for workspace-wide listings (when agent_name is omitted). |
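Because the tool returns formatted text rather than structured data, a client that wants to compare pass rates across runs may need to parse the listing. The sketch below assumes the output follows exactly the layout of the examples above (a numbered header line followed by a status line). `parse_runs` and both regular expressions are illustrative, not an Invarium API, and would need adjusting if the real format differs.

```python
import re

# Header line, e.g. "1. [agent] Run 8f4a2b1c -- 2025-01-15 14:32".
# The "[agent]" prefix is assumed to appear only in workspace-wide listings.
HEADER_RE = re.compile(
    r"^\s*\d+\.\s+(?:\[(?P<agent>[^\]]+)\]\s+)?"
    r"Run\s+(?P<run_id>\S+)\s+--\s+(?P<date>.+?)\s*$"
)
# Status line, e.g. "Status: completed | Passed: 8/10 (80%) | Source: mcp".
# The percentage and the Source field are both treated as optional.
STATUS_RE = re.compile(
    r"^\s*Status:\s*(?P<status>\w+)\s*\|\s*"
    r"Passed:\s*(?P<passed>\d+|--)/(?P<total>\d+)"
    r"(?:\s*\(\d+%\))?(?:\s*\|\s*Source:\s*(?P<source>\w+))?\s*$"
)


def parse_runs(text):
    """Parse a formatted listing into a list of dicts, one per run."""
    runs = []
    current = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.groupdict()
            continue
        status = STATUS_RE.match(line)
        if status and current is not None:
            fields = status.groupdict()
            total = int(fields["total"])
            # "--" means results are not yet available.
            passed = None if fields["passed"] == "--" else int(fields["passed"])
            current.update(
                status=fields["status"],
                passed=passed,
                total=total,
                source=fields["source"],
                pass_rate=(passed / total) if passed is not None and total else None,
            )
            runs.append(current)
            current = None
    return runs
```

Runs whose results are not yet available keep `passed` and `pass_rate` as `None`, so they can be skipped when computing trends.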
Empty State
If no test runs match the query, the tool returns:
No test runs found. Use invarium_sync_results to upload test results.
Examples
Basic — List Recent Runs for an Agent
View the latest test runs for a specific agent:
"List the recent test runs for my customer-support-agent."
List All Runs Across Workspace
Get a cross-agent overview of recent test activity:
"Show me recent test runs across all my agents."
Advanced — Filter by Status
Find all failed runs to investigate issues:
"Show me all the failed test runs for my order-processing-agent."
Find runs that are still in progress:
"Are there any test runs currently running for my order-processing-agent?"
Pagination
Page through a long history of test runs:
"Show me the first 10 test runs for my customer-support-agent."
"Show me the next 10 test runs for my customer-support-agent."
Workflow — Verify a Sync
After syncing results, confirm the test run was created:
"Sync these results for billing-agent as 'Regression Suite -- v3.2', then list the latest runs to confirm it was created."
Status Values
| Status | Meaning |
|---|---|
| completed | All test cases have been evaluated and results are final. |
| failed | The test run encountered an error during execution. |
| running | Test cases are currently being executed or evaluated. |
| pending | The test run has been created but execution has not started. |
The run ID shown in the listing is truncated to the first 8 characters for readability. The full ID is returned by invarium_sync_results and is needed when calling invarium_get_test_run.
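If a client has kept the full IDs returned by invarium_sync_results, a short ID from the listing can be matched back to its full ID by prefix. This is a small sketch under that assumption. `resolve_full_id` is a hypothetical helper, and the full IDs in the usage line are made-up placeholders, since the real ID format is not described here.

```python
def resolve_full_id(short_id, known_full_ids):
    """Return the single full ID that starts with `short_id` (hypothetical helper)."""
    matches = [full_id for full_id in known_full_ids if full_id.startswith(short_id)]
    if not matches:
        raise LookupError(f"No known run ID starts with '{short_id}'")
    if len(matches) > 1:
        # An 8-character prefix is not guaranteed to be unique.
        raise LookupError(f"Short ID '{short_id}' is ambiguous")
    return matches[0]


# Placeholder IDs for illustration only.
known = ["8f4a2b1c-placeholder-one", "a3c7e9d2-placeholder-two"]
full_id = resolve_full_id("8f4a2b1c", known)
```

Raising on an ambiguous prefix avoids inspecting the wrong run when two full IDs share the same first 8 characters.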
Error Responses
| Error | Cause | Fix |
|---|---|---|
| Authentication failed: invalid API key | Invalid or missing API key. | Verify your INVARIUM_API_KEY. Run invarium_connect first. |
| Agent not found: '...' | No blueprint exists for the specified agent name. | Check the agent name with invarium_list_agents. |
| Invalid status '...' | The status filter is not a recognized value. | Use one of: completed, failed, running, pending. |
| Invalid limit: must be a positive integer | The limit value is zero or negative. | Use a positive integer. |
| Invalid offset: must be a non-negative integer | The offset value is negative. | Use 0 or a positive integer. |
| Failed to list test runs: ... | Backend API error or network issue. | Check the error details and try again. |
See Error Codes for the full error reference.
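A client that surfaces these errors to users could attach the suggested fix automatically. The sketch below assumes errors arrive as plain text beginning with the prefixes listed in the table; `suggest_fix` is a hypothetical helper, and the fix strings are copied from the table rather than taken from any Invarium API.

```python
# Message prefixes and suggested fixes, copied from the Error Responses table.
ERROR_FIXES = [
    ("Authentication failed", "Verify your INVARIUM_API_KEY. Run invarium_connect first."),
    ("Agent not found", "Check the agent name with invarium_list_agents."),
    ("Invalid status", "Use one of: completed, failed, running, pending."),
    ("Invalid limit", "Use a positive integer."),
    ("Invalid offset", "Use 0 or a positive integer."),
    ("Failed to list test runs", "Check the error details and try again."),
]


def suggest_fix(message):
    """Return the documented fix for an error message, or None if unrecognized."""
    for prefix, fix in ERROR_FIXES:
        if message.startswith(prefix):
            return fix
    return None
```

Returning `None` for unrecognized messages leaves room for errors outside this table, which the full error reference may cover.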