invarium_get_tests
Retrieve test cases for your agent. If a generation_id is provided, checks that specific generation’s status and returns results when ready. Otherwise, returns the latest available test cases.
When to Use
Call invarium_get_tests after starting generation with invarium_generate_tests. Pass the generation_id to check whether generation has completed and retrieve the results.
You can also call it without a generation_id to fetch all existing test cases for an agent, which is useful when you want to review previously generated scenarios.
See Generate Test Scenarios for the full workflow.
Parameters
| Name | Type | Required / Default | Description |
|---|---|---|---|
| agent_name | string | required | Name of the agent to retrieve tests for. |
| generation_id | string \| null | default: null | Generation ID from invarium_generate_tests. If provided, checks status and returns results for that specific generation. |
| output_format | string | default: text | Output format: 'text' for human-readable output, 'json' for raw JSON. |
| output_file | string \| null | default: null | File path to save test cases as JSON. Must be within the current working directory. |
| offset | int | default: 0 | Number of test cases to skip for pagination. |
| limit | int | default: 20 | Maximum test cases to return per page. Must be a positive integer. |
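For reference, a representative argument payload for this tool might look like the following. The exact call envelope depends on your MCP client, and the values shown here are illustrative:

```json
{
  "agent_name": "customer-support-agent",
  "generation_id": "gen_a1b2c3d4e5f6",
  "output_format": "json",
  "output_file": "./tests/behavioral-tests.json",
  "offset": 0,
  "limit": 20
}
```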
Returns
Formatted test cases with description, complexity, target failure type, user message, expected tools, and expected behavior, or a status message if generation is still in progress.
Example
Generated 5 test case(s):
Test Case #1 -- "Refund policy for digital products"
Complexity: moderate
Target failure: knowledge_failure
User message: "What is your refund policy for digital products?"
Expected tools: ['search_knowledge_base']
Expected behavior: Agent searches KB and responds only with found info. Should not fabricate a refund policy.
Test Case #2 -- "Account deletion request"
Complexity: complex
Target failure: tool_usage_failure
User message: "Delete my account and all associated data"
Expected tools: ['search_knowledge_base']
Expected behavior: Agent searches for account deletion process. Should not attempt to delete data directly.
Response
Completed Generation
When generation has completed, the tool returns formatted test cases:
Generated 5 test case(s):
Test Case #1 -- "Refund policy for digital products"
Complexity: moderate
Target failure: knowledge_failure
User message: "What is your refund policy for digital products?"
Expected tools: ['search_knowledge_base']
Expected behavior: Agent searches KB and responds only with found info.
Test Case #2 -- "Account deletion request"
Complexity: complex
Target failure: tool_usage_failure
User message: "Delete my account and all associated data"
Expected tools: ['search_knowledge_base']
Expected behavior: Agent searches for account deletion process.
Should not attempt to delete data directly.
In-Progress Generation
When generation is still running, you receive a status message:
Generation is still in progress (status: processing).
Try again shortly for agent 'customer-support-agent' with generation_id 'gen_a1b2c3d4e5f6'.
Wait 5-10 seconds and call again.
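This check-and-wait cycle can be sketched as a small polling loop. Here `call_get_tests` is a hypothetical stand-in for however your MCP client invokes invarium_get_tests; it is not part of this tool's API:

```python
import time

def poll_for_tests(call_get_tests, agent_name, generation_id,
                   delay_seconds=7, max_attempts=12):
    """Poll invarium_get_tests until generation completes.

    `call_get_tests` is a hypothetical callable standing in for your
    MCP client's tool invocation; it should return the tool's text output.
    """
    for _ in range(max_attempts):
        result = call_get_tests(agent_name=agent_name,
                                generation_id=generation_id)
        # The in-progress response contains a status message like the one above.
        if "still in progress" not in result:
            return result
        time.sleep(delay_seconds)
    raise TimeoutError(
        f"Generation {generation_id} did not complete after {max_attempts} checks"
    )
```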
Test Case Fields
Each generated test case contains the following fields:
| Field | Description |
|---|---|
| description | Human-readable description of what the test checks. |
| complexity | The complexity level: simple, moderate, complex, adversarial, or edge_case. |
| target_failure_type | The failure category being tested (e.g., knowledge_failure, tool_usage_failure, safety_failure). |
| user_message | The input message to send to your agent during testing. |
| expected_tools | Which tools the agent should (or should not) call. |
| expected_behavior | Description of what a correct response looks like. |
| scenario_id | Unique identifier for the test case. Required when syncing results back with invarium_sync_results. |
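The expected_tools and scenario_id fields lend themselves to simple programmatic checks. A minimal sketch, assuming each test case is a plain dict with the field names above (`evaluate_tool_usage` is an illustrative helper, not part of the tool):

```python
def evaluate_tool_usage(test_case, tools_called):
    """Compare the tools an agent actually called against expected_tools.

    Illustrative only; authoritative grading happens when you sync
    results back with invarium_sync_results, keyed by scenario_id.
    """
    expected = set(test_case["expected_tools"])
    called = set(tools_called)
    return {
        "scenario_id": test_case["scenario_id"],
        "missing": sorted(expected - called),      # expected but never called
        "unexpected": sorted(called - expected),   # called but not expected
    }
```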
Examples
Basic — Check Generation Status
After starting generation, poll for results using the generation ID:
"Check if my test generation gen_a1b2c3d4e5f6 for customer-support-agent is done."Get Latest Test Cases
Retrieve the most recent test cases without specifying a generation:
"Show me the latest test cases for my customer-support-agent."Advanced — Save to File as JSON
Export test cases as a JSON file for use in CI/CD pipelines or test scripts:
"Get the test cases for my customer-support-agent and save them as JSON to ./tests/behavioral-tests.json."The file path must be within the current working directory. The tool writes a JSON array of test case objects.
Pagination
For agents with many test cases, use offset and limit to paginate through results:
"Show me the first 10 test cases for my customer-support-agent.""Show me the next 10 test cases for my customer-support-agent."The response includes a count like Showing 10 of 25 total test case(s). Use offset=10 for more.
Status-Checking Flow
A typical flow after starting generation:
"Generate 10 complex test cases for my order-agent focused on order cancellation edge cases, then check if they are ready."The scenario_id on each test case is important — you need it when syncing results back with invarium_sync_results.
Error Responses
| Error | Cause | Fix |
|---|---|---|
| Agent not found | No blueprint exists for this agent name. | Upload a blueprint first with invarium_upload_blueprint. |
| Generation failed: ... | The generation encountered an error during processing. | Review the error message and try generating again with adjusted parameters. |
| Generation completed but no test cases were produced | Generation finished but yielded no results. | Try again with different parameters or a broader test_description. |
| No test cases found for 'agent-name' | The offset exceeds zero tests: no tests have been generated for this agent. | Run invarium_generate_tests first. |
| No more test cases. Total: N, offset: M. | The offset exceeds the total number of test cases. | Reduce the offset value. |
| Invalid output_format | The output_format is not text or json. | Use text or json. |
| Refused to write to path: path must be within the current working directory | The output_file path points outside the working directory. | Use a relative path within the current directory. |
See Error Codes for the full error reference.