invarium_sync_results
Sync test results from your IDE to the Invarium dashboard. After running test cases locally, use this tool to upload the results. It creates a new test run or appends to an existing one.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| agent_name | string | Required | Name of the agent whose tests were run. |
| results | string | Required | JSON array of test result objects. Each must contain scenario_id, user_message, and agent_response. Optional: tools_called, passed, notes. |
| test_run_id | string | Optional (default: null) | Existing test run ID for incremental sync. If provided, results are appended to that run; if omitted, a new test run is created. |
| source | string | Optional (default: mcp) | Where tests were executed: mcp, ci, vscode, cli, or api. |
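Note that results is a JSON string, not a native array, so it is easiest to build the payload as a list of dicts and serialize it once with json.dumps. A minimal sketch (field values are illustrative; the fields follow the result object schema described later on this page):

```python
import json

# Collect result objects as plain dicts, then serialize once.
results = [
    {
        "scenario_id": "sc_001",
        "user_message": "What is your refund policy?",
        "agent_response": "Our refund policy allows...",
    },
    {
        "scenario_id": "sc_002",
        "user_message": "Can I get a refund after 30 days?",
        "agent_response": "Refunds are only available within 14 days.",
        "passed": False,
        "notes": "Agent answered without citing a source.",
    },
]

# This JSON string is what you pass as the `results` argument.
results_json = json.dumps(results)
```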
Returns
Summary with test run ID, agent name, results count, and dashboard link.
Example
Results synced successfully
Test Run: run_xyz789
Agent: customer-support-agent
Results: 10 synced
Dashboard: https://app.invarium.dev/runs/run_xyz789
Usage
After running each generated test case against your agent, collect the results and sync them back:
invarium_sync_results(
agent_name='customer-support-agent',
results='[{"scenario_id": "sc_001", "user_message": "What is your refund policy?", "agent_response": "Our refund policy allows..."}]'
)
Incremental sync
If you are running tests in batches, pass the test_run_id from the first sync to append results to the same run:
invarium_sync_results(
agent_name='customer-support-agent',
results='[...]',
test_run_id='run_xyz789'
)
Source labels
Use the source parameter to indicate where the tests were executed. This helps track where results come from in the dashboard:
| Source | When to use |
|---|---|
| mcp | Running from an MCP client (Claude Desktop, Cursor, etc.) |
| ci | Running in a CI/CD pipeline (GitHub Actions, etc.) |
| vscode | Running from VS Code |
| cli | Running from a command-line script |
| api | Running via the REST API |
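For batched runs, the incremental sync flow above amounts to: serialize each chunk separately, sync the first chunk, then reuse the returned test_run_id for the rest. A sketch of the chunking step (the sync call itself depends on your MCP client, so it is shown only as a comment):

```python
import json

def batch_results(results, batch_size=25):
    """Split a list of result dicts into JSON strings of at most batch_size items."""
    for start in range(0, len(results), batch_size):
        yield json.dumps(results[start:start + batch_size])

# Example: 60 results become three JSON batches of 25, 25, and 10 items.
all_results = [
    {"scenario_id": f"sc_{i:03d}", "user_message": "...", "agent_response": "..."}
    for i in range(60)
]
batches = list(batch_results(all_results))
# First call:  invarium_sync_results(agent_name=..., results=batches[0])
# Later calls: pass the test_run_id from the first response to append.
```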
Result object schema
Each object in the results array should follow this structure:
| Field | Type | Required | Description |
|---|---|---|---|
| scenario_id | string | Yes | The ID of the test case (from invarium_get_tests). |
| user_message | string | Yes | The input message that was sent to the agent. |
| agent_response | string | Yes | The agent's actual output/response. |
| tools_called | array | No | Array of tool call objects, each with a required name field and an optional parameters object. |
| passed | boolean | No | Whether the test passed. If omitted, Invarium will evaluate the result automatically. |
| notes | string | No | Free-text notes about the test execution (e.g., errors encountered, observations). |
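You can catch schema problems before syncing by validating each result object client-side. A minimal sketch that mirrors the table above (this is a hypothetical helper, not part of an official SDK):

```python
REQUIRED_FIELDS = ("scenario_id", "user_message", "agent_response")

def validate_result(result: dict) -> list[str]:
    """Return a list of problems; an empty list means the object is valid."""
    problems = [f"missing required field: {f}" for f in REQUIRED_FIELDS if f not in result]
    # Each tools_called entry needs a `name`; `parameters` is optional.
    for call in result.get("tools_called", []):
        if "name" not in call:
            problems.append("tools_called entry missing required 'name' field")
    # `passed` is optional, but must be a boolean when present.
    if "passed" in result and not isinstance(result["passed"], bool):
        problems.append("'passed' must be a boolean")
    return problems
```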
Example result object
{
"scenario_id": "sc_001",
"user_message": "What is your refund policy for digital products?",
"agent_response": "Based on our knowledge base, digital products can be refunded within 14 days of purchase. Source: KB article #1234.",
"tools_called": [
{
"name": "search_knowledge_base",
"parameters": { "query": "refund policy digital products" }
}
],
"passed": true,
"notes": "Agent correctly searched KB and cited source."
}
If you omit the passed field, Invarium automatically evaluates the agent response against the expected behavior defined in the test case. Manual passed values take precedence over automatic evaluation.
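The precedence rule reduces to: use the manual passed value when present, otherwise fall back to automatic evaluation. As a sketch (auto_eval here stands in for Invarium's server-side evaluation, which is not exposed client-side):

```python
def effective_passed(result: dict, auto_eval: bool) -> bool:
    """Manual `passed` values take precedence over automatic evaluation."""
    return result.get("passed", auto_eval)
```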
Error responses
| Error | Cause | Fix |
|---|---|---|
| Agent not found | No blueprint exists for this agent name. | Upload a blueprint first. |
| Invalid results format | The results string is not valid JSON or is missing required fields. | Ensure each result has scenario_id, user_message, and agent_response. |
| Test run not found | The test_run_id does not exist. | Omit test_run_id to create a new run, or check the ID. |
| Scenario ID not found: sc_xxx | A scenario_id in the results does not match any generated test case. | Verify the scenario IDs match those from invarium_get_tests. |