Quickstart
Test your first AI agent in under 10 minutes.
- ✓ Test your first agent in under 10 minutes — connect, upload blueprint, generate tests, run, view results
- ✓ Works end-to-end from your IDE (MCP) or the web dashboard
- ✓ No SDK integration or code changes required
Why It Matters
The quickstart walks you through Invarium’s complete workflow — from registering an agent to viewing behavioral test results — so you can validate your agent’s reliability before it reaches production.
Prerequisites
Before starting, make sure you have:
- An Invarium account — sign up at app.invarium.dev if you have not already
- An API key — generate one from Settings > API Keys in the dashboard
- Your IDE configured (MCP path only) — see Installation & Setup for config instructions
How to Use It
Dashboard Path
The web dashboard at app.invarium.dev provides a guided interface for the complete workflow.
Create an agent
Navigate to Agents in the sidebar and click Create Agent.
You can create an agent using the guided wizard or by importing a blueprint (a structured description of your agent’s tools, workflows, and constraints):
- Wizard — step-by-step form that walks you through naming your agent, adding tools, defining constraints, and describing workflows. Best for first-time users.
- JSON Import — paste or upload a complete blueprint JSON. Best for power users or when migrating from an MCP-generated blueprint. (MCP generates YAML — convert to JSON or use the wizard to reproduce the same fields.)
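If you are converting an MCP-generated YAML blueprint to JSON, it is worth sanity-checking the result before pasting it into the import dialog. A minimal stdlib check is sketched below; the field names are hypothetical and based on the wizard fields in this quickstart, not a published schema — check your MCP-generated file for the actual structure.

```python
import json

# Hypothetical minimal blueprint — field names are assumptions for illustration;
# the real schema may differ, so compare against your MCP-generated file.
blueprint_json = """
{
  "name": "customer-support-agent",
  "framework": "LangChain",
  "tools": [{"name": "search_knowledge_base"}]
}
"""

# json.loads raises json.JSONDecodeError if the pasted text is malformed.
parsed = json.loads(blueprint_json)
print(sorted(parsed))  # → ['framework', 'name', 'tools']
```

A parse error here means the import dialog would reject the paste as well, so this catches truncated or mis-converted blueprints early.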
For this quickstart, use the wizard and fill in:
- Agent name: `customer-support-agent`
- Description: “Handles customer inquiries by searching a knowledge base and providing accurate answers.”
- Framework: LangChain (or your framework of choice)
- Add one tool: `search_knowledge_base` — “Searches the internal knowledge base for articles matching the customer query.”
- Add constraints: “Never fabricate information not found in the knowledge base” and “Always cite the source article when answering”
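The wizard fields above correspond roughly to a blueprint like the following. This is an illustrative sketch only — the field names are assumptions inferred from the wizard, not a documented schema:

```yaml
# Illustrative blueprint — field names assumed from the wizard, not a published schema.
name: customer-support-agent
description: >
  Handles customer inquiries by searching a knowledge base
  and providing accurate answers.
framework: LangChain
tools:
  - name: search_knowledge_base
    description: Searches the internal knowledge base for articles matching the customer query.
constraints:
  - Never fabricate information not found in the knowledge base
  - Always cite the source article when answering
```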

Generate test scenarios
Navigate to Scenarios in the sidebar and click Generate with AI.
Enter a description of what you want to test, for example:
Generate 10 behavioral test scenarios for my customer support agent,
focusing on edge cases like missing knowledge base results,
frustrated users, and multi-turn conversations.

Invarium generates scenarios targeting your agent’s specific failure modes based on its blueprint.
Review generated scenarios
Each generated scenario includes:
- Description — what the test is checking (e.g., “Agent should not hallucinate when knowledge base returns no results”)
- Complexity — simple, moderate, or complex
- Target failure type — the failure category being tested (e.g., knowledge, tool_usage, safety)
- User message — the input to send to your agent
- Expected behavior — what a correct response looks like
Review the scenarios and remove or edit any that do not apply to your use case.
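Put together, a single generated scenario might look like the sketch below. This is illustrative only — the field names mirror the list above, not a documented export format, and the example content is invented:

```yaml
# Illustrative only — field names mirror the list above, not a documented export format.
description: Agent should not hallucinate when knowledge base returns no results
complexity: simple
target_failure_type: knowledge
user_message: What is your refund policy for enterprise plans?
expected_behavior: >
  The agent states that no matching article was found and offers to escalate,
  without inventing policy details.
```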
Create a test run
Navigate to Test Runs and click Create Test Run.
- Select your agent (`customer-support-agent`)
- Choose which scenarios to include (select all for a comprehensive first run)
- Click Run
The test run executes each scenario against your agent and evaluates the results.
Test runs typically complete in 30-60 seconds depending on the number of scenarios and your agent’s response time.

View results
Click on the completed test run to see detailed results:
- Agent Quality Score (AQS) — a composite score from 0 to 100 reflecting your agent’s behavioral reliability
- Pass/fail breakdown — how many scenarios passed vs. failed
- Failure details — for each failed scenario, the specific failure type (e.g., “Tool Usage > Missing Validation”) and a recommendation for fixing it
- Agent Intelligence Graph — a visual map of your agent’s architecture with nodes color-coded by test coverage
Use the failure details to identify what to fix in your agent before running another round of tests.
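If you copy results out of the dashboard for your own tracking, a pass/fail rollup by failure type is straightforward to compute. The sketch below assumes a hypothetical exported structure with `passed` and `failure_type` fields; it does not reproduce the AQS calculation, which is Invarium’s own composite score.

```python
from collections import Counter

# Hypothetical exported results — field names are assumptions for illustration,
# not a documented Invarium export format.
results = [
    {"scenario": "no-kb-results", "passed": False, "failure_type": "knowledge"},
    {"scenario": "frustrated-user", "passed": True, "failure_type": None},
    {"scenario": "multi-turn", "passed": True, "failure_type": None},
]

passed = sum(r["passed"] for r in results)
failed = len(results) - passed
# Count failures by category to see where to focus fixes first.
by_type = Counter(r["failure_type"] for r in results if not r["passed"])

print(f"{passed}/{len(results)} passed, {failed} failed")  # → 2/3 passed, 1 failed
print(dict(by_type))  # → {'knowledge': 1}
```

Grouping failures by category this way mirrors the dashboard’s failure breakdown and helps prioritize which behavioral issues to fix before the next run.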
What’s Next After Your First Test
Once you have completed your first test run, here are the recommended next steps:
- Improve your agent — use the failure details and recommendations from your test results to fix behavioral issues in your agent’s code
- Generate more tests — create additional test scenarios with different complexity levels (simple, moderate, complex) to increase behavioral coverage
- Explore the Agent Intelligence Graph — visualize your agent’s architecture and identify unguarded paths and missing safeguards. See Agent Intelligence Graph