Quickstart

Test your first agent in under 5 minutes.

This guide walks you through setting up the Invarium MCP server, uploading your first agent blueprint, generating behavioral tests, and viewing results — all from your IDE.

Prerequisites

Create an account

Go to app.invarium.dev and sign up with your email and password.

Get your API key

Once logged in, click API Keys in the dashboard sidebar. Create a new key and copy it — you will need it in the next step.

⚠️ Keep your API key secret. Do not commit it to version control or share it publicly.

Configure the MCP server

Add the Invarium MCP server to your IDE’s configuration. The server entry is identical for every client; only the file it goes in differs:

Add this to your Claude Desktop MCP config file (claude_desktop_config.json):

{
  "mcpServers": {
    "invarium": {
      "command": "uvx",
      "args": ["invarium-mcp"],
      "env": {
        "INVARIUM_API_KEY": "inv_your_key_here"
      }
    }
  }
}

Add this to your Cursor MCP settings (.cursor/mcp.json):

{
  "mcpServers": {
    "invarium": {
      "command": "uvx",
      "args": ["invarium-mcp"],
      "env": {
        "INVARIUM_API_KEY": "inv_your_key_here"
      }
    }
  }
}

For any MCP-compatible client, use this configuration:

{
  "mcpServers": {
    "invarium": {
      "command": "uvx",
      "args": ["invarium-mcp"],
      "env": {
        "INVARIUM_API_KEY": "inv_your_key_here"
      }
    }
  }
}

Replace inv_your_key_here with your actual API key.
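If your config file already lists other MCP servers, the Invarium entry has to be merged in rather than pasted over them. The following is a minimal Python sketch of that merge, assuming the file uses the mcpServers layout shown above. The helper names invarium_server_entry and add_invarium_server are illustrative and not part of Invarium.

```python
import json
from pathlib import Path


def invarium_server_entry(api_key: str) -> dict:
    """Build the server entry shown above with a real key filled in."""
    if not api_key or api_key == "inv_your_key_here":
        raise ValueError("replace the placeholder with a real API key")
    return {
        "command": "uvx",
        "args": ["invarium-mcp"],
        "env": {"INVARIUM_API_KEY": api_key},
    }


def add_invarium_server(config_path: Path, api_key: str) -> dict:
    """Merge the Invarium entry into an MCP config file, keeping other servers."""
    # Start from the existing file if there is one, otherwise from an empty config.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["invarium"] = invarium_server_entry(api_key)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, add_invarium_server(Path(".cursor/mcp.json"), key) would update the Cursor file named above. Because the written file contains the key, it should stay out of version control.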

Verify the connection

After configuring the MCP server, verify it is working by calling invarium_connect:

invarium_connect()

You should see output like:

Connected to Invarium
  Account: you@example.com

If you get an authentication error, double-check that your API key is correct and that the INVARIUM_API_KEY environment variable is set.
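A quick way to rule out configuration mistakes is to inspect the config file itself. The sketch below reports common problems with the Invarium entry. It assumes the mcpServers layout shown earlier; find_key_problems is a hypothetical helper, and it cannot tell whether a key is actually valid on the server side.

```python
import json

PLACEHOLDER = "inv_your_key_here"


def find_key_problems(config_text: str) -> list[str]:
    """Return human-readable problems with the Invarium entry in an MCP config."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"config is not valid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["config must be a JSON object"]

    server = config.get("mcpServers", {}).get("invarium")
    if server is None:
        return ["no 'invarium' entry under 'mcpServers'"]

    key = server.get("env", {}).get("INVARIUM_API_KEY", "")
    problems = []
    if not key:
        problems.append("INVARIUM_API_KEY is missing or empty")
    elif key == PLACEHOLDER:
        problems.append("INVARIUM_API_KEY still holds the placeholder value")
    elif key != key.strip():
        problems.append("INVARIUM_API_KEY has leading or trailing whitespace")
    return problems
```

An empty list means the file looks structurally fine, so any remaining authentication error most likely comes from the key itself.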

Upload a blueprint

A blueprint describes your agent — its name, framework, tools, and expected behaviors. Here is a minimal blueprint for a LangChain customer support agent:

{
  "agent_name": "customer-support-agent",
  "framework": "langchain",
  "description": "Handles customer inquiries by searching a knowledge base and providing accurate answers.",
  "tools": [
    {
      "name": "search_knowledge_base",
      "description": "Searches the internal knowledge base for articles matching the customer query.",
      "parameters": {
        "query": "string"
      },
      "returns": "Array of matching articles with title and content.",
      "side_effects": "none"
    }
  ],
  "constraints": [
    "Never fabricate information not found in the knowledge base",
    "Always cite the source article when answering",
    "Escalate to a human agent if confidence is low"
  ]
}
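Because the blueprint is passed to the tool as a string, building it as a Python dict and serializing it with json.dumps is one way to avoid quoting mistakes. The sketch below does that for the blueprint above. The blueprint_to_string helper and its key check are assumptions based only on this minimal example, not on Invarium's actual schema.

```python
import json

blueprint = {
    "agent_name": "customer-support-agent",
    "framework": "langchain",
    "description": "Handles customer inquiries by searching a knowledge base "
                   "and providing accurate answers.",
    "tools": [
        {
            "name": "search_knowledge_base",
            "description": "Searches the internal knowledge base for articles "
                           "matching the customer query.",
            "parameters": {"query": "string"},
            "returns": "Array of matching articles with title and content.",
            "side_effects": "none",
        }
    ],
    "constraints": [
        "Never fabricate information not found in the knowledge base",
        "Always cite the source article when answering",
        "Escalate to a human agent if confidence is low",
    ],
}


def blueprint_to_string(bp: dict) -> str:
    """Serialize a blueprint dict to a JSON string, failing early on obvious gaps."""
    # Assumption: these keys are checked only because the minimal example uses them.
    missing = [key for key in ("agent_name", "framework", "tools") if key not in bp]
    if missing:
        raise ValueError(f"blueprint is missing: {', '.join(missing)}")
    return json.dumps(bp, indent=2)


blueprint_json = blueprint_to_string(blueprint)
```

The resulting blueprint_json string is what goes into the blueprint argument of the upload call.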

Upload it with invarium_upload_blueprint:

invarium_upload_blueprint(
  blueprint='<the JSON above as a string>',
  agent_name='customer-support-agent'
)

Expected output:

Blueprint uploaded successfully
  Agent: customer-support-agent
  Tools: 1 detected
  Confidence: high
  Dashboard: https://app.invarium.dev/agents/customer-support-agent

Generate tests

Now generate behavioral test cases for your agent:

invarium_generate_tests(
  agent_name='customer-support-agent',
  count=10,
  complexity='mixed'
)

Expected output:

Test generation started
  Generation ID: gen_abc123
  Agent: customer-support-agent
  Count: 10
  Complexity: mixed

Use invarium_get_tests to check status and retrieve results.

Test generation runs asynchronously. It typically completes in 10-30 seconds depending on the number of tests and complexity.
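Since generation is asynchronous, a script that drives this flow needs to poll until the tests are ready. Here is a minimal polling sketch in Python. It assumes a hypothetical fetch callable that returns None while generation is still running and a list of test cases once it finishes; how you wrap invarium_get_tests to behave that way depends on your client.

```python
import time
from typing import Callable, Optional


def wait_for_tests(
    fetch: Callable[[], Optional[list]],
    interval: float = 5.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> list:
    """Call fetch() until it returns a list, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        tests = fetch()  # hypothetical wrapper: None means "still generating"
        if tests is not None:
            return tests
        if clock() >= deadline:
            raise TimeoutError(f"tests not ready after {timeout} seconds")
        sleep(interval)
```

The default interval and timeout are guesses sized to the 10-30 second range mentioned above. The sleep and clock parameters exist only so the loop can be tested without real waiting.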

Get results

Retrieve the generated test cases:

invarium_get_tests(
  agent_name='customer-support-agent',
  generation_id='gen_abc123'
)

Each test case includes:

Description — What the test is checking (e.g., “Agent should not hallucinate when KB has no results”)
Complexity — simple, moderate, or complex
Target failure type — The failure category being tested (e.g., hallucination, tool_misuse)
User message — The input to send to your agent
Expected tools — Which tools the agent should call
Expected behavior — What a correct response looks like

You can now run each test’s user message against your agent and use invarium_sync_results to send the results back to Invarium for scoring.
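The run-and-collect step could look roughly like the Python sketch below. Everything in it is illustrative: the GeneratedTest field names mirror the list above in snake_case, the agent callable stands in for your own agent, and the result dictionaries are a hypothetical shape, not the payload that invarium_sync_results expects.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GeneratedTest:
    """One generated test case; field names are assumed, not Invarium's schema."""
    description: str
    complexity: str
    target_failure_type: str
    user_message: str
    expected_tools: list[str] = field(default_factory=list)
    expected_behavior: str = ""


@dataclass
class AgentRun:
    """What your agent produced for one message."""
    response: str
    tools_called: list[str]


def run_tests(tests: list[GeneratedTest], agent: Callable[[str], AgentRun]) -> list[dict]:
    """Send each test's user message to the agent and record what happened."""
    results = []
    for test in tests:
        run = agent(test.user_message)
        results.append({
            "description": test.description,
            "user_message": test.user_message,
            "response": run.response,
            "tools_called": run.tools_called,
            # Local sanity check only; scoring is done by Invarium.
            "expected_tools_called": set(test.expected_tools) <= set(run.tools_called),
        })
    return results
```

The expected_tools_called flag is a quick local signal that a run went off track before you sync anything; it does not replace Invarium's scoring.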