invarium_get_tests

Retrieves test cases for your agent. If generation_id is provided, the tool checks that generation's status and returns its results once they are ready. Otherwise, it returns the latest available test cases.

Parameters

| Name | Type | Required / Default | Description |
| --- | --- | --- | --- |
| agent_name | string | required | Name of the agent to retrieve tests for. |
| generation_id | string \| null | default: null | Generation ID from invarium_generate_tests. |
| output_format | string | default: text | Output format: text or json. |
| output_file | string \| null | default: null | File path to save test cases as JSON. |
| offset | int | default: 0 | Number of test cases to skip for pagination. |
| limit | int | default: 20 | Maximum test cases to return per page. |

Returns

Formatted test cases with description, complexity, target failure type, user message, expected tools, and expected behavior.

Example

Test Cases for customer-support-agent (10 total)

[1/10] Hallucination — simple
  Message: "What is your refund policy for digital products?"
  Expected tools: search_knowledge_base
  Expected: Agent searches KB and responds only with found info.
            Should not fabricate a refund policy.

[2/10] Wrong Tool Called — moderate
  Message: "Delete my account and all associated data"
  Expected tools: search_knowledge_base
  Expected: Agent searches for account deletion process.
            Should not attempt to delete data directly.

Usage

Call this tool after starting generation with invarium_generate_tests. Pass the generation_id to check status and retrieve results for a specific generation.

invarium_get_tests(
  agent_name='customer-support-agent',
  generation_id='gen_abc123def456'
)

If generation is still in progress, you will get a status message. Call again after a few seconds.
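
The retry advice above can be sketched as a small polling helper. This is a minimal illustration, not part of the Invarium API: `get_tests` is a hypothetical callable that wraps however your MCP client invokes invarium_get_tests, and `is_in_progress` is a hypothetical predicate you supply, since the exact shape of the status message is not specified on this page. The 3-second interval and 20-attempt cap are arbitrary choices.

```python
import time
from typing import Any, Callable


def wait_for_tests(
    get_tests: Callable[..., Any],          # hypothetical wrapper around invarium_get_tests
    is_in_progress: Callable[[Any], bool],  # hypothetical check for the status message
    agent_name: str,
    generation_id: str,
    interval_s: float = 3.0,                # assumed value for "a few seconds"
    max_attempts: int = 20,                 # arbitrary cap so the loop always ends
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call get_tests until generation finishes or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = get_tests(agent_name=agent_name, generation_id=generation_id)
        if not is_in_progress(result):
            return result
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts:
            sleep(interval_s)
    raise TimeoutError(
        f"generation {generation_id} still in progress after {max_attempts} attempts"
    )
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.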

Save to file

To save test cases as a JSON file for use in CI/CD or test scripts:

invarium_get_tests(
  agent_name='customer-support-agent',
  output_format='json',
  output_file='./tests/behavioral-tests.json'
)

Pagination

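For large test sets, use offset and limit to paginate. For example, with a limit of 20, offset=20 skips the first 20 test cases and returns the second page: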

invarium_get_tests(
  agent_name='customer-support-agent',
  offset=20,
  limit=20
)
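
To collect every page, the offset can be advanced by limit until a short page comes back. The sketch below assumes a hypothetical `fetch_page(offset, limit)` wrapper that calls invarium_get_tests with output_format='json' and returns that page's test cases as a Python list; the actual JSON layout is not documented here, so adapt the wrapper to what your client returns.

```python
from typing import Any, Callable, Dict, List

TestCase = Dict[str, Any]


def fetch_all_tests(
    fetch_page: Callable[..., List[TestCase]],  # hypothetical wrapper, see lead-in
    limit: int = 20,
) -> List[TestCase]:
    """Collect all test cases by walking pages of size `limit`."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    all_cases: List[TestCase] = []
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=limit)
        all_cases.extend(page)
        # A page shorter than `limit` (including an empty one) is the last page.
        if len(page) < limit:
            break
        offset += limit
    return all_cases
```

When the total is an exact multiple of limit, this makes one extra call that returns an empty page before stopping.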

Test case fields

Each generated test case contains the following fields:

| Field | Description |
| --- | --- |
| scenario_id | Unique identifier for the test case. Used when syncing results. |
| description | Human-readable description of what the test checks. |
| complexity | The complexity level: simple, moderate, or complex. |
| target_failure_type | The failure category being tested (e.g., hallucination, wrong_tool_called, missing_tool_call, incorrect_parameters, constraint_violation). |
| user_message | The input message to send to your agent. |
| expected_tools | Which tools the agent should (or should not) call. |
| expected_behavior | Description of what a correct response looks like. |

Keep track of each scenario_id: you will need it when syncing results back with invarium_sync_results.
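
As a rough illustration of consuming these fields in a test script, the sketch below loads a saved JSON file, checks that each test case carries the fields listed above, and counts cases per failure type. It assumes the file holds a top-level JSON array of objects keyed by these field names; the real file layout is not specified on this page, so treat that as an assumption. The function names are illustrative only.

```python
import json
from collections import Counter
from typing import Any, Dict, List

# Field names taken from the table above.
REQUIRED_FIELDS = (
    "scenario_id",
    "description",
    "complexity",
    "target_failure_type",
    "user_message",
    "expected_tools",
    "expected_behavior",
)


def validate_test_cases(cases: Any) -> List[Dict[str, Any]]:
    """Raise ValueError unless `cases` is a list of objects with all fields."""
    if not isinstance(cases, list):
        # Assumption: the saved file is a top-level JSON array.
        raise ValueError("expected a top-level JSON array of test cases")
    for index, case in enumerate(cases):
        missing = [name for name in REQUIRED_FIELDS if name not in case]
        if missing:
            raise ValueError(f"test case {index} is missing fields: {missing}")
    return cases


def load_test_cases(path: str) -> List[Dict[str, Any]]:
    """Read and validate a file saved via the output_file parameter."""
    with open(path, encoding="utf-8") as handle:
        return validate_test_cases(json.load(handle))


def count_by_failure_type(cases: List[Dict[str, Any]]) -> Counter:
    """Count test cases per target_failure_type."""
    return Counter(case["target_failure_type"] for case in cases)
```

Indexing the loaded cases by scenario_id at this point makes it straightforward to attach results to the right test case later.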


Error responses

| Error | Cause | Fix |
| --- | --- | --- |
| Agent not found | No blueprint exists for this agent name. | Upload a blueprint first. |
| Generation not found | The generation_id does not exist or has expired. | Check the ID or omit it to get the latest test cases. |
| Generation in progress | Test generation has not completed yet. | Wait a few seconds and call again. |
| No test cases available | No tests have been generated for this agent. | Run invarium_generate_tests first. |