`invarium_list_test_runs`

List test runs for an agent or across all agents in your workspace. Shows run status, pass rate, source, and date for each run.

When to Use

Call invarium_list_test_runs when you need to review test execution history. Common scenarios:

After syncing results with invarium_sync_results, verify the test run was created
Review recent test runs to find a specific test_run_id for deeper inspection with invarium_get_test_run
Compare pass rates across multiple test runs to track quality trends
Filter by status to find failed or in-progress runs that need attention
Get a cross-agent overview by omitting agent_name to see runs across the entire workspace
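
The trend comparison mentioned above can be sketched in a few lines. This is a minimal illustration, not part of the tool: `pass_rate` and `rate_deltas` are hypothetical helper names, and the passed/total pairs are assumed to have been read off a listing by hand or by your own parsing code.

```python
from typing import List, Optional, Tuple


def pass_rate(passed: int, total: int) -> Optional[float]:
    """Return the pass rate as a percentage, or None when there are no test cases."""
    if total <= 0:
        return None
    return round(100 * passed / total, 1)


def rate_deltas(runs_oldest_first: List[Tuple[int, int]]) -> List[float]:
    """Change in pass rate between consecutive runs, ordered oldest to newest."""
    rates = [pass_rate(passed, total) for passed, total in runs_oldest_first]
    known = [rate for rate in rates if rate is not None]
    return [round(newer - older, 1) for older, newer in zip(known, known[1:])]


# (passed, total) pairs for three runs of one agent, oldest first.
history = [(3, 10), (15, 20), (8, 10)]
print(rate_deltas(history))  # [45.0, 5.0]
```

A positive delta means the pass rate improved relative to the previous run.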

Parameters

Name	Type	Default	Description
`agent_name`	string | null	null	Filter by agent name. If omitted, lists runs across all agents in the workspace.
`status`	string | null	null	Filter by run status. Valid values: completed, failed, running, pending. If omitted, all statuses are returned.
`limit`	int	10	Maximum number of runs to return. Must be a positive integer.
`offset`	int	0	Number of runs to skip for pagination. Must be a non-negative integer.
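
The constraints in the table can be checked on the client before the tool is called. The sketch below is an assumption about how a caller might do that: `build_list_test_runs_args` is a hypothetical helper, not part of Invarium, and it only mirrors the documented defaults, valid values, and error wording.

```python
from typing import Any, Dict, Optional

# Valid values for the `status` filter, as listed in the parameter table.
VALID_STATUSES = {"completed", "failed", "running", "pending"}


def build_list_test_runs_args(
    agent_name: Optional[str] = None,
    status: Optional[str] = None,
    limit: int = 10,
    offset: int = 0,
) -> Dict[str, Any]:
    """Validate the arguments and return them as a dict (hypothetical helper)."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"Invalid status '{status}'")
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        raise ValueError("Invalid limit: must be a positive integer")
    if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
        raise ValueError("Invalid offset: must be a non-negative integer")

    args: Dict[str, Any] = {"limit": limit, "offset": offset}
    # Optional filters are left out entirely when not set.
    if agent_name is not None:
        args["agent_name"] = agent_name
    if status is not None:
        args["status"] = status
    return args


print(build_list_test_runs_args(agent_name="customer-support-agent", status="failed"))
```

Leaving out `agent_name` produces arguments for a workspace-wide listing.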

Returns

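A formatted list of test runs showing the run ID, date, status, and pass rate for each run, plus the source (agent-scoped listings) or the agent name (workspace-wide listings). The header line includes the total count.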

Example

Test runs for 'customer-support-agent' (12 total):

1. Run 8f4a2b1c -- 2025-01-15 14:32
   Status: completed | Passed: 8/10 (80%) | Source: mcp

2. Run a3c7e9d2 -- 2025-01-14 09:15
   Status: completed | Passed: 15/20 (75%) | Source: ci

3. Run f1b2c3d4 -- 2025-01-13 16:48
   Status: failed | Passed: 3/10 (30%) | Source: mcp

Response

Agent-Scoped Listing

When agent_name is provided, runs are listed for that specific agent:

Test runs for 'customer-support-agent' (12 total):

1. Run 8f4a2b1c -- 2025-01-15 14:32
   Status: completed | Passed: 8/10 (80%) | Source: mcp

2. Run a3c7e9d2 -- 2025-01-14 09:15
   Status: completed | Passed: 15/20 (75%) | Source: ci

3. Run f1b2c3d4 -- 2025-01-13 16:48
   Status: failed | Passed: 3/10 (30%) | Source: mcp

Workspace-Wide Listing

When agent_name is omitted, runs include the agent name for each entry:

Recent test runs (24 total):

1. [customer-support-agent] Run 8f4a2b1c -- 2025-01-15 14:32
   Status: completed | Passed: 8/10 (80%)

2. [order-processing-agent] Run b5d6e7f8 -- 2025-01-15 11:20
   Status: running | Passed: --/15

3. [travel-booking-agent] Run c9a1b2d3 -- 2025-01-14 17:05
   Status: completed | Passed: 12/12 (100%)
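
If you need the listing as structured data, the two layouts above can be parsed with a pair of regular expressions. This is a sketch that assumes the output keeps exactly the shape shown in these examples. `parse_listing` is a hypothetical helper, and the tool itself documents only the formatted text.

```python
import re
from typing import Any, Dict, List, Optional

# "1. Run 8f4a2b1c -- 2025-01-15 14:32", optionally prefixed with "[agent-name] ".
HEADER_RE = re.compile(
    r"^\d+\.\s+(?:\[(?P<agent>[^\]]+)\]\s+)?Run (?P<run_id>\S+) -- (?P<date>.+)$"
)
# "Status: completed | Passed: 8/10 (80%) | Source: mcp"; percentage and source are optional.
DETAIL_RE = re.compile(
    r"^Status: (?P<status>\w+) \| Passed: (?P<passed>--|\d+)/(?P<total>\d+)"
    r"(?: \(\d+%\))?(?: \| Source: (?P<source>\w+))?$"
)


def parse_listing(text: str) -> List[Dict[str, Any]]:
    """Turn a formatted listing into one dict per run (assumes the layout shown above)."""
    runs: List[Dict[str, Any]] = []
    current: Optional[Dict[str, Any]] = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            current = dict(header.groupdict())
            runs.append(current)
            continue
        detail = DETAIL_RE.match(line)
        if detail and current is not None:
            passed = detail["passed"]
            current["status"] = detail["status"]
            # "--" means results are not available yet.
            current["passed"] = None if passed == "--" else int(passed)
            current["total"] = int(detail["total"])
            current["source"] = detail["source"]
    return runs


sample = """Recent test runs (24 total):

1. [customer-support-agent] Run 8f4a2b1c -- 2025-01-15 14:32
   Status: completed | Passed: 8/10 (80%)

2. [order-processing-agent] Run b5d6e7f8 -- 2025-01-15 11:20
   Status: running | Passed: --/15
"""
for run in parse_listing(sample):
    print(run["agent"], run["run_id"], run["status"], run["passed"], run["total"])
```

Fields that a layout does not include (`agent` in agent-scoped listings, `source` in workspace-wide ones) come back as `None`.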

Per-Run Fields

Field	Description
Run ID	Short identifier (first 8 characters) for the test run. Use the full ID with `invarium_get_test_run` for details.
Date	When the test run was created.
Status	Current status: `completed`, `failed`, `running`, or `pending`.
Passed	Pass rate shown as passed/total with percentage. Shows `--` if results are not yet available.
Source	Where the tests were executed: `mcp`, `ci`, `vscode`, `cli`, or `api`. Shown for agent-scoped listings.
Agent name	The agent the run belongs to. Shown only for workspace-wide listings (when `agent_name` is omitted).

Empty State

If no test runs match the query, the tool returns:

No test runs found. Use invarium_sync_results to upload test results.

Examples

Basic — List Recent Runs for an Agent

View the latest test runs for a specific agent:

"List the recent test runs for my customer-support-agent."

List All Runs Across Workspace

Get a cross-agent overview of recent test activity:

"Show me recent test runs across all my agents."

Advanced — Filter by Status

Find all failed runs to investigate issues:

"Show me all the failed test runs for my order-processing-agent."

Find runs that are still in progress:

"Are there any test runs currently running for my order-processing-agent?"

Pagination

Page through a long history of test runs:

"Show me the first 10 test runs for my customer-support-agent."

"Show me the next 10 test runs for my customer-support-agent."

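Behind the "first 10" and "next 10" prompts, paging comes down to picking `limit` and `offset` values. The arithmetic can be sketched as follows; `page_args` and `page_offsets` are hypothetical helper names, and the total is assumed to be the count shown in the listing header.

```python
from typing import Dict, List


def page_args(page: int, page_size: int = 10) -> Dict[str, int]:
    """Return limit/offset for a 1-based page number."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if page_size < 1:
        raise ValueError("page_size must be a positive integer")
    return {"limit": page_size, "offset": (page - 1) * page_size}


def page_offsets(total: int, page_size: int = 10) -> List[int]:
    """Return every offset needed to walk through `total` runs."""
    return list(range(0, total, page_size))


print(page_args(1))      # {'limit': 10, 'offset': 0}
print(page_args(2))      # {'limit': 10, 'offset': 10}
print(page_offsets(12))  # [0, 10]
```

With 12 runs in total and the default limit of 10, the second page holds the remaining 2 runs.
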
Workflow — Verify a Sync

After syncing results, confirm the test run was created:

"Sync these results for billing-agent as 'Regression Suite -- v3.2', then list the latest runs to confirm it was created."

Status Values

Status	Meaning
completed	All test cases have been evaluated and results are final.
failed	The test run encountered an error during execution.
running	Test cases are currently being executed or evaluated.
pending	The test run has been created but execution has not started.

The run ID shown in the listing is truncated to the first 8 characters for readability. The full ID is returned by invarium_sync_results and is needed when calling invarium_get_test_run.
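
If you keep the full IDs returned by earlier syncs, a truncated ID from the listing can be matched back to its full form by prefix. The sketch below assumes you store those IDs yourself. `resolve_full_id` is a hypothetical helper, and the full IDs in the example are made up for illustration, since the full ID format is not specified here.

```python
from typing import Iterable, Optional


def resolve_full_id(short_id: str, known_full_ids: Iterable[str]) -> Optional[str]:
    """Return the single full ID that starts with short_id, or None if zero or several match."""
    matches = [full_id for full_id in known_full_ids if full_id.startswith(short_id)]
    return matches[0] if len(matches) == 1 else None


# Made-up full IDs, standing in for values saved from earlier sync responses.
saved_ids = [
    "8f4a2b1c-example-full-id-one",
    "a3c7e9d2-example-full-id-two",
]
print(resolve_full_id("8f4a2b1c", saved_ids))  # 8f4a2b1c-example-full-id-one
```

Returning `None` on an ambiguous prefix avoids inspecting the wrong run.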

Error Responses

Error	Cause	Fix
`Authentication failed: invalid API key`	Invalid or missing API key.	Verify your `INVARIUM_API_KEY`. Run `invarium_connect` first.
`Agent not found: '...'`	No blueprint exists for the specified agent name.	Check the agent name with `invarium_list_agents`.
`Invalid status '...'`	The `status` filter is not a recognized value.	Use one of: `completed`, `failed`, `running`, `pending`.
`Invalid limit: must be a positive integer`	The `limit` value is zero or negative.	Use a positive integer.
`Invalid offset: must be a non-negative integer`	The `offset` value is negative.	Use 0 or a positive integer.
`Failed to list test runs: ...`	Backend API error or network issue.	Check the error details and try again.
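
A client that surfaces these errors can attach the matching fix by checking how the message starts. This is a sketch under the assumption that errors arrive as plain text beginning with the strings in the table. `hint_for` is a hypothetical helper, and the hints restate the "Fix" column.

```python
from typing import List, Optional, Tuple

# (message prefix, suggested fix), taken from the error table above.
ERROR_HINTS: List[Tuple[str, str]] = [
    ("Authentication failed", "Verify your INVARIUM_API_KEY. Run invarium_connect first."),
    ("Agent not found", "Check the agent name with invarium_list_agents."),
    ("Invalid status", "Use one of: completed, failed, running, pending."),
    ("Invalid limit", "Use a positive integer."),
    ("Invalid offset", "Use 0 or a positive integer."),
    ("Failed to list test runs", "Check the error details and try again."),
]


def hint_for(message: str) -> Optional[str]:
    """Return the suggested fix for a known error message, or None for anything else."""
    for prefix, hint in ERROR_HINTS:
        if message.startswith(prefix):
            return hint
    return None


print(hint_for("Agent not found: 'billing-agent'"))
```

Text that does not match any prefix, such as a normal listing, yields `None`.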

See Error Codes for the full error reference.
