invarium_list_test_runs

List test runs for an agent or across all agents in your workspace. Shows run status, pass rate, source, and date for each run.

When to Use

Call invarium_list_test_runs when you need to review test execution history. Common scenarios:

  • After syncing results with invarium_sync_results, verify the test run was created
  • Review recent test runs to find a specific test_run_id for deeper inspection with invarium_get_test_run
  • Compare pass rates across multiple test runs to track quality trends
  • Filter by status to find failed or in-progress runs that need attention
  • Get a cross-agent overview by omitting agent_name to see runs across the entire workspace
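The pass-rate comparison in the list above is simple arithmetic on the passed/total counts. A minimal sketch, using the counts from the sample agent-scoped listing on this page; the helper name `pass_rate` and the oldest-first ordering are assumptions for illustration, not part of the tool:

```python
def pass_rate(passed, total):
    """Return the pass rate as a percentage."""
    return 100.0 * passed / total


# (passed, total) pairs from the sample listing, reordered oldest first.
runs_oldest_first = [(3, 10), (15, 20), (8, 10)]

rates = [pass_rate(p, t) for p, t in runs_oldest_first]
# Change in percentage points from each run to the next one.
deltas = [later - earlier for earlier, later in zip(rates, rates[1:])]

print(rates)   # [30.0, 75.0, 80.0]
print(deltas)  # [45.0, 5.0]
```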

Parameters

Name        Type           Required             Description
agent_name  string | null  No (default: null)   Filter by agent name. If omitted, lists runs across all agents in the workspace.
status      string | null  No (default: null)   Filter by run status. Valid values: completed, failed, running, pending. If omitted, all statuses are returned.
limit       int            No (default: 10)     Maximum number of runs to return. Must be a positive integer.
offset      int            No (default: 0)      Number of runs to skip for pagination. Must be a non-negative integer.
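The constraints in the parameter table can be checked before a call is made. The following is a minimal client-side sketch, not the tool's own validation: the function name `build_list_test_runs_args` is hypothetical, and the error strings are borrowed from the Error Responses table on this page.

```python
VALID_STATUSES = {"completed", "failed", "running", "pending"}


def build_list_test_runs_args(agent_name=None, status=None, limit=10, offset=0):
    """Return an arguments dict, raising ValueError for documented invalid values."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"Invalid status '{status}'")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        raise ValueError("Invalid limit: must be a positive integer")
    if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
        raise ValueError("Invalid offset: must be a non-negative integer")

    args = {"limit": limit, "offset": offset}
    # Omit the optional filters entirely when they are not set.
    if agent_name is not None:
        args["agent_name"] = agent_name
    if status is not None:
        args["status"] = status
    return args
```

Leaving `agent_name` out of the dict, rather than sending null, is a design choice made for this sketch; the table documents null as the default, so either form should describe a workspace-wide listing.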

Returns

Formatted list of test runs with run ID, date, status, pass rate, and source. Includes total count in the header.

Response

Agent-Scoped Listing

When agent_name is provided, runs are listed for that specific agent:

Test runs for 'customer-support-agent' (12 total):

1. Run 8f4a2b1c -- 2025-01-15 14:32
   Status: completed | Passed: 8/10 (80%) | Source: mcp

2. Run a3c7e9d2 -- 2025-01-14 09:15
   Status: completed | Passed: 15/20 (75%) | Source: ci

3. Run f1b2c3d4 -- 2025-01-13 16:48
   Status: failed | Passed: 3/10 (30%) | Source: mcp

Workspace-Wide Listing

When agent_name is omitted, runs include the agent name for each entry:

Recent test runs (24 total):

1. [customer-support-agent] Run 8f4a2b1c -- 2025-01-15 14:32
   Status: completed | Passed: 8/10 (80%)

2. [order-processing-agent] Run b5d6e7f8 -- 2025-01-15 11:20
   Status: running | Passed: --/15

3. [travel-booking-agent] Run c9a1b2d3 -- 2025-01-14 17:05
   Status: completed | Passed: 12/12 (100%)

Per-Run Fields

Field       Description
Run ID      Short identifier (first 8 characters) for the test run. Use the full ID with invarium_get_test_run for details.
Date        When the test run was created.
Status      Current status: completed, failed, running, or pending.
Passed      Pass rate shown as passed/total with percentage. Shows -- if results are not yet available.
Source      Where the tests were executed: mcp, ci, vscode, cli, or api. Shown for agent-scoped listings.
Agent name  The agent the run belongs to. Shown only for workspace-wide listings (when agent_name is omitted).
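If the listing needs to be processed programmatically, the per-run fields can be recovered from the formatted text. The sketch below assumes the output looks exactly like the two sample listings on this page; the regular expressions and the function name `parse_test_runs` are illustrative and may need adjusting if the real format differs.

```python
import re

# Entry line, e.g. "1. [agent] Run 8f4a2b1c -- 2025-01-15 14:32" (agent part optional).
ENTRY_RE = re.compile(
    r"^\d+\.\s+(?:\[(?P<agent>[^\]]+)\]\s+)?Run\s+(?P<run_id>\S+)\s+--\s+(?P<date>.+?)\s*$"
)
# Detail line, e.g. "Status: completed | Passed: 8/10 (80%) | Source: mcp".
DETAIL_RE = re.compile(
    r"^\s*Status:\s*(?P<status>\w+)\s*\|\s*Passed:\s*(?P<passed>\d+|--)/(?P<total>\d+)"
    r"(?:\s*\(\d+%\))?(?:\s*\|\s*Source:\s*(?P<source>\w+))?\s*$"
)


def parse_test_runs(text):
    """Parse a formatted listing into a list of dicts, one per run."""
    runs = []
    for line in text.splitlines():
        entry = ENTRY_RE.match(line)
        if entry:
            runs.append(
                {"agent": entry["agent"], "run_id": entry["run_id"], "date": entry["date"]}
            )
            continue
        detail = DETAIL_RE.match(line)
        if detail and runs:
            passed = detail["passed"]
            runs[-1].update(
                status=detail["status"],
                # "--" means results are not yet available.
                passed=None if passed == "--" else int(passed),
                total=int(detail["total"]),
                source=detail["source"],
            )
    return runs


sample = """Recent test runs (24 total):

1. [customer-support-agent] Run 8f4a2b1c -- 2025-01-15 14:32
   Status: completed | Passed: 8/10 (80%)

2. [order-processing-agent] Run b5d6e7f8 -- 2025-01-15 11:20
   Status: running | Passed: --/15
"""
parsed = parse_test_runs(sample)
```

Fields that a listing does not show (agent name in agent-scoped listings, source in workspace-wide listings) come back as None.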

Empty State

If no test runs match the query, the tool returns:

No test runs found. Use invarium_sync_results to upload test results.

Examples

Basic — List Recent Runs for an Agent

View the latest test runs for a specific agent:

"List the recent test runs for my customer-support-agent."

List All Runs Across Workspace

Get a cross-agent overview of recent test activity:

"Show me recent test runs across all my agents."

Advanced — Filter by Status

Find all failed runs to investigate issues:

"Show me all the failed test runs for my order-processing-agent."

Find runs that are still in progress:

"Are there any test runs currently running for my order-processing-agent?"
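Behind prompts like these, an MCP client sends a `tools/call` request with the filter values as arguments. The sketch below builds such a request for the two status prompts above; the JSON-RPC envelope follows the MCP specification, while the exact arguments an assistant chooses for a given prompt are an assumption here, and `tools_call_request` is a hypothetical helper.

```python
import json


def tools_call_request(request_id, arguments):
    """Build an MCP `tools/call` JSON-RPC request for invarium_list_test_runs."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "invarium_list_test_runs", "arguments": arguments},
    }


# Failed runs for one agent.
failed_runs = tools_call_request(
    1, {"agent_name": "order-processing-agent", "status": "failed"}
)
# Runs still in progress for the same agent.
running_runs = tools_call_request(
    2, {"agent_name": "order-processing-agent", "status": "running"}
)

print(json.dumps(failed_runs, indent=2))
```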

Pagination

Page through a long history of test runs:

"Show me the first 10 test runs for my customer-support-agent."
"Show me the next 10 test runs for my customer-support-agent."
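In terms of parameters, the two prompts above would plausibly map to limit=10 with offset=0, then limit=10 with offset=10. A small sketch of that arithmetic, assuming 1-based page numbers; `page_args` and `page_count` are hypothetical helpers, and the total comes from the count shown in the listing header:

```python
import math


def page_args(page, page_size=10):
    """Return limit/offset arguments for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"limit": page_size, "offset": (page - 1) * page_size}


def page_count(total, page_size=10):
    """Number of pages needed to show `total` runs."""
    return math.ceil(total / page_size)


print(page_args(1))    # {'limit': 10, 'offset': 0}
print(page_args(2))    # {'limit': 10, 'offset': 10}
print(page_count(12))  # 2
```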

Workflow — Verify a Sync

After syncing results, confirm the test run was created:

"Sync these results for billing-agent as 'Regression Suite -- v3.2', then list the latest runs to confirm it was created."

Status Values

Status     Meaning
completed  All test cases have been evaluated and results are final.
failed     The test run encountered an error during execution.
running    Test cases are currently being executed or evaluated.
pending    The test run has been created but execution has not started.

The run ID shown in the listing is truncated to the first 8 characters for readability. The full ID is returned by invarium_sync_results and is needed when calling invarium_get_test_run.
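Because the listing shows only the first 8 characters, a stored full ID can be matched to a listed run by prefix. A minimal sketch; the full IDs below are made-up placeholders, since this page does not specify the full ID format, and `find_full_id` is a hypothetical helper:

```python
def short_id(full_id):
    """Return the 8-character form shown in the listing."""
    return full_id[:8]


def find_full_id(short, known_full_ids):
    """Return the single stored full ID that starts with `short`."""
    matches = [full for full in known_full_ids if full.startswith(short)]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match for {short!r}, found {len(matches)}")
    return matches[0]


# Placeholder full IDs, e.g. saved from earlier invarium_sync_results calls.
saved_ids = ["8f4a2b1c-placeholder-0001", "a3c7e9d2-placeholder-0002"]
full = find_full_id("8f4a2b1c", saved_ids)
```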

Error Responses

Error                                           Cause                                              Fix
Authentication failed: invalid API key          Invalid or missing API key.                        Verify your INVARIUM_API_KEY. Run invarium_connect first.
Agent not found: '...'                          No blueprint exists for the specified agent name.  Check the agent name with invarium_list_agents.
Invalid status '...'                            The status filter is not a recognized value.       Use one of: completed, failed, running, pending.
Invalid limit: must be a positive integer       The limit value is zero or negative.               Use a positive integer.
Invalid offset: must be a non-negative integer  The offset value is negative.                      Use 0 or a positive integer.
Failed to list test runs: ...                   Backend API error or network issue.                Check the error details and try again.
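A caller that receives one of these messages can look up the suggested fix by prefix. This sketch assumes the error arrives as plain text beginning with the documented message; how errors are actually delivered is not specified on this page, and `suggest_fix` is a hypothetical helper.

```python
# (message prefix, suggested fix) pairs taken from the table above.
ERROR_FIXES = [
    ("Authentication failed", "Verify your INVARIUM_API_KEY. Run invarium_connect first."),
    ("Agent not found", "Check the agent name with invarium_list_agents."),
    ("Invalid status", "Use one of: completed, failed, running, pending."),
    ("Invalid limit", "Use a positive integer."),
    ("Invalid offset", "Use 0 or a positive integer."),
    ("Failed to list test runs", "Check the error details and try again."),
]


def suggest_fix(message):
    """Return the documented fix for an error message, or None if unrecognized."""
    for prefix, fix in ERROR_FIXES:
        if message.startswith(prefix):
            return fix
    return None


hint = suggest_fix("Agent not found: 'billing-agnt'")
```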

See Error Codes for the full error reference.
