MCP Reference

invarium_get_tests

Retrieve test cases for your agent. If a generation_id is provided, checks that specific generation’s status and returns results when ready. Otherwise, returns the latest available test cases.

When to Use

Call invarium_get_tests after starting generation with invarium_generate_tests. Pass the generation_id to check whether generation has completed and retrieve the results.

You can also call it without a generation_id to fetch all existing test cases for an agent, which is useful when you want to review previously generated scenarios.

See Generate Test Scenarios for the full workflow.
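The start-then-poll flow above can be sketched in Python. Here `call_tool` is a hypothetical stand-in for however your MCP client invokes a tool (it is assumed to return the tool's text response); only the loop structure is taken from this page.

```python
import time


def wait_for_tests(call_tool, agent_name, generation_id,
                   poll_interval=5.0, max_attempts=24):
    """Poll invarium_get_tests until generation finishes.

    `call_tool` is a placeholder for your MCP client's tool-call
    function; it should return the tool's text response.
    """
    for _ in range(max_attempts):
        result = call_tool("invarium_get_tests", {
            "agent_name": agent_name,
            "generation_id": generation_id,
        })
        # An in-progress generation answers with a status message.
        if "still in progress" not in result:
            return result  # completed (or failed) -- caller inspects the text
        time.sleep(poll_interval)
    raise TimeoutError(f"generation {generation_id} did not finish in time")
```

The "still in progress" check mirrors the status message shown under Response below; if your server's wording differs, match on that instead.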

Parameters


| Name | Type | Required | Description |
|------|------|----------|-------------|
| agent_name | string | required | Name of the agent to retrieve tests for. |
| generation_id | string \| null | default: null | Generation ID from invarium_generate_tests. If provided, checks status and returns results for that specific generation. |
| output_format | string | default: text | Output format: 'text' for human-readable output, 'json' for raw JSON. |
| output_file | string \| null | default: null | File path to save test cases as JSON. Must be within the current working directory. |
| offset | int | default: 0 | Number of test cases to skip for pagination. |
| limit | int | default: 20 | Maximum test cases to return per page. Must be a positive integer. |

Returns

Formatted test cases with description, complexity, target failure type, user message, expected tools, and expected behavior, or a status message if generation is still in progress.


Response

Completed Generation

When generation has completed, the tool returns formatted test cases:

Generated 5 test case(s):

Test Case #1 -- "Refund policy for digital products"
  Complexity: moderate
  Target failure: knowledge_failure
  User message: "What is your refund policy for digital products?"
  Expected tools: ['search_knowledge_base']
  Expected behavior: Agent searches KB and responds only with found info.
    Should not fabricate a refund policy.

Test Case #2 -- "Account deletion request"
  Complexity: complex
  Target failure: tool_usage_failure
  User message: "Delete my account and all associated data"
  Expected tools: ['search_knowledge_base']
  Expected behavior: Agent searches for account deletion process.
    Should not attempt to delete data directly.

In-Progress Generation

When generation is still running, you receive a status message:

Generation is still in progress (status: processing).
Try again shortly for agent 'customer-support-agent' with generation_id 'gen_a1b2c3d4e5f6'.

Wait 5-10 seconds and call again.

Test Case Fields

Each generated test case contains the following fields:

| Field | Description |
|-------|-------------|
| description | Human-readable description of what the test checks. |
| complexity | The complexity level: simple, moderate, complex, adversarial, or edge_case. |
| target_failure_type | The failure category being tested (e.g., knowledge_failure, tool_usage_failure, safety_failure). |
| user_message | The input message to send to your agent during testing. |
| expected_tools | Which tools the agent should (or should not) call. |
| expected_behavior | Description of what a correct response looks like. |
| scenario_id | Unique identifier for the test case. Required when syncing results back with invarium_sync_results. |
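For downstream scripts, the fields above can be modeled as a small typed structure. The field names follow the table; the class name and the scenario_id value in the example are illustrative only.

```python
from typing import List, TypedDict


class TestCase(TypedDict):
    """One generated test case, mirroring the fields listed above."""
    description: str            # what the test checks
    complexity: str             # simple | moderate | complex | adversarial | edge_case
    target_failure_type: str    # e.g. knowledge_failure, tool_usage_failure
    user_message: str           # input to send to the agent during testing
    expected_tools: List[str]   # tools the agent should (or should not) call
    expected_behavior: str      # what a correct response looks like
    scenario_id: str            # needed later for invarium_sync_results


# Example value (scenario_id here is made up for illustration):
case: TestCase = {
    "description": "Refund policy for digital products",
    "complexity": "moderate",
    "target_failure_type": "knowledge_failure",
    "user_message": "What is your refund policy for digital products?",
    "expected_tools": ["search_knowledge_base"],
    "expected_behavior": "Agent searches KB and responds only with found info.",
    "scenario_id": "scn_example",
}
```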

Examples

Basic — Check Generation Status

After starting generation, poll for results using the generation ID:

"Check if my test generation gen_a1b2c3d4e5f6 for customer-support-agent is done."

Get Latest Test Cases

Retrieve the most recent test cases without specifying a generation:

"Show me the latest test cases for my customer-support-agent."

Advanced — Save to File as JSON

Export test cases as a JSON file for use in CI/CD pipelines or test scripts:

"Get the test cases for my customer-support-agent and save them as JSON to ./tests/behavioral-tests.json."

The file path must be within the current working directory. The tool writes a JSON array of test case objects.
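Since the export is a plain JSON array, a test script can load it directly. A minimal sketch; the light sanity checks here are illustrative, and the path in the comment follows the example above.

```python
import json
from pathlib import Path


def load_test_cases(path):
    """Read the JSON array written via output_file and sanity-check it."""
    cases = json.loads(Path(path).read_text())
    for case in cases:
        # scenario_id is required later for invarium_sync_results.
        assert "scenario_id" in case, "missing scenario_id"
        assert "user_message" in case, "missing user_message"
    return cases


# e.g. cases = load_test_cases("tests/behavioral-tests.json")
```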

Pagination

For agents with many test cases, use offset and limit to paginate through results:

"Show me the first 10 test cases for my customer-support-agent."
"Show me the next 10 test cases for my customer-support-agent."

The response includes a count such as "Showing 10 of 25 total test case(s). Use offset=10 for more."
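Driven programmatically, pagination is a standard offset loop. This sketch assumes your MCP client returns each page as a parsed list when output_format is 'json' (verify against your server's actual response shape); `call_tool` is again a hypothetical stand-in.

```python
def fetch_all_tests(call_tool, agent_name, page_size=20):
    """Collect every test case by advancing offset until a short page arrives.

    Assumes `call_tool` returns the page as a parsed list when
    output_format is 'json'; adjust the parsing for your client.
    """
    cases, offset = [], 0
    while True:
        page = call_tool("invarium_get_tests", {
            "agent_name": agent_name,
            "output_format": "json",
            "offset": offset,
            "limit": page_size,
        })
        cases.extend(page)
        if len(page) < page_size:   # short page means we reached the end
            return cases
        offset += page_size
```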

Status-Checking Flow

A typical flow after starting generation:

"Generate 10 complex test cases for my order-agent focused on order cancellation edge cases, then check if they are ready."

The scenario_id on each test case is important — you need it when syncing results back with invarium_sync_results.
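One way to keep scenario_id handy is to pair each run outcome with it as you go. The payload shape below is purely illustrative, not the invarium_sync_results schema; consult that tool's reference for the real field names.

```python
def build_sync_payload(agent_name, outcomes):
    """Pair run outcomes with their scenario IDs.

    `outcomes` maps scenario_id -> (passed, notes). The payload shape
    is illustrative only; see the invarium_sync_results reference for
    the actual schema.
    """
    return {
        "agent_name": agent_name,
        "results": [
            {"scenario_id": sid, "passed": passed, "notes": notes}
            for sid, (passed, notes) in outcomes.items()
        ],
    }
```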

Error Responses

| Error | Cause | Fix |
|-------|-------|-----|
| Agent not found | No blueprint exists for this agent name. | Upload a blueprint first with invarium_upload_blueprint. |
| Generation failed: ... | The generation encountered an error during processing. | Review the error message and try generating again with adjusted parameters. |
| Generation completed but no test cases were produced | Generation finished but yielded no results. | Try again with different parameters or a broader test_description. |
| No test cases found for 'agent-name' | No tests have been generated for this agent. | Run invarium_generate_tests first. |
| No more test cases. Total: N, offset: M. | The offset exceeds the total number of test cases. | Reduce the offset value. |
| Invalid output_format | The output_format is not text or json. | Use text or json. |
| Refused to write to path: path must be within the current working directory | The output_file path points outside the working directory. | Use a relative path within the current directory. |

See Error Codes for the full error reference.
