
invarium_get_audit

Retrieve the latest static audit results for a registered agent. Returns the Agent Readiness Score (ARS), per-category breakdown, and findings grouped by severity with recommended fixes.

When to Use

Call invarium_get_audit after uploading a blueprint with invarium_upload_blueprint to retrieve the authoritative server-side audit results. The audit runs automatically when a blueprint is uploaded — this tool fetches the results.

Common scenarios:

  • Right after invarium_upload_blueprint to get the server-side ARS (the local preview from invarium_prepare_blueprint is informational; the server-side audit is authoritative)
  • To review an agent’s security posture before generating test scenarios
  • To identify high-severity findings that should be addressed in the blueprint
  • To compare audit results after updating a blueprint

See Agent Readiness Audit for a full explanation of scoring, categories, and how to improve your ARS.

Parameters

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| agent_name | string | Yes | | Name of the agent to retrieve audit results for. Must match the name used when the blueprint was uploaded. |
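
Most MCP clients issue this call from a natural-language prompt, but it can help to see what the request might look like on the wire. The sketch below builds a JSON-RPC tools/call message as defined by the Model Context Protocol, with agent_name as the only argument. The helper name build_get_audit_request and the request id are illustrative assumptions, not part of Invarium's API; transport and authentication are left out.

```python
import json


def build_get_audit_request(agent_name: str, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for invarium_get_audit.

    Hypothetical helper: only the tool name and the agent_name
    parameter come from this page.
    """
    if not agent_name:
        raise ValueError("agent_name is required")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "invarium_get_audit",
            "arguments": {"agent_name": agent_name},
        },
    }


if __name__ == "__main__":
    # Print the request an MCP client would send for the example agent.
    print(json.dumps(build_get_audit_request("customer-support-agent"), indent=2))
```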

Returns

Formatted audit results containing the ARS, color label, per-category breakdown, severity summary, finding details with recommendations, and a dashboard link.

Example

"Get the audit results for my customer-support-agent"

Response

The tool returns a formatted text block with the full audit results:

Audit results for 'customer-support-agent'

Agent Readiness Score: 82/100 (GREEN -- Ready)
Based on 11 audit checks | 5 findings

  Per-category breakdown with scores and finding counts
  for Security, Reliability, System Design, and Tool Quality.

  Critical: 0 | High: 1 | Medium: 3 | Low: 1

-- HIGH (1) --
  Security [search_tickets]
    Tool has no input validation defined
    -> Add input validation rules to prevent injection attacks

-- MEDIUM (3) --
  Security
    No PII handling policy defined
    -> Add a PII handling constraint to your blueprint
  Reliability [create_ticket]
    Tool has no error_handling defined
    -> Set error_handling to retry, fallback, or none
  System Design
    No system prompt provided
    -> Add a system_prompt or system_prompt_summary to improve audit score

-- LOW (1) --
  Tool Quality [escalate_to_human]
    Tool has no return type defined
    -> Add a return_type field to document what the tool returns

View on dashboard: https://app.invarium.dev/agent/ag_abc123
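
Because the tool returns formatted text rather than structured data, a script that needs the numbers has to extract them. The following is a minimal sketch that assumes the output keeps the exact line shapes shown in the sample above; the function name parse_audit_summary and the returned dictionary layout are hypothetical, and the patterns would need adjusting if the format changes.

```python
import re

# Header lines copied from the sample response on this page.
SAMPLE = """Audit results for 'customer-support-agent'

Agent Readiness Score: 82/100 (GREEN -- Ready)
Based on 11 audit checks | 5 findings

  Critical: 0 | High: 1 | Medium: 3 | Low: 1
"""


def parse_audit_summary(text: str) -> dict:
    """Extract the score, color label, and counts from the text block."""
    score = re.search(r"Agent Readiness Score:\s*(\d+)/100\s*\((\w+)", text)
    totals = re.search(r"Based on (\d+) audit checks \| (\d+) findings", text)
    sev = re.search(
        r"Critical: (\d+) \| High: (\d+) \| Medium: (\d+) \| Low: (\d+)", text
    )
    if not (score and totals and sev):
        raise ValueError("text does not look like an audit result")
    return {
        "score": int(score.group(1)),
        "label": score.group(2),
        "checks": int(totals.group(1)),
        "findings": int(totals.group(2)),
        "severity": {
            "critical": int(sev.group(1)),
            "high": int(sev.group(2)),
            "medium": int(sev.group(3)),
            "low": int(sev.group(4)),
        },
    }


if __name__ == "__main__":
    print(parse_audit_summary(SAMPLE))
```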

Response Fields

| Field | Description |
| --- | --- |
| Agent Readiness Score | Overall score from 0 to 100. Higher is better. |
| Color label | RED (Critical Risk): 0-30. YELLOW (Notable Issues): 31-60. GREEN (Ready): 61-100. |
| Audit checks | Total number of checks the audit engine ran against the blueprint. |
| Findings | Total number of issues found across all categories. |
| Category breakdown | Per-category earned/budget scores with deduction and finding counts. |
| Severity summary | Count of findings at each severity level: Critical, High, Medium, Low. |
| Finding details | Each finding lists its category, affected tool (if applicable), description, and a recommended fix. |
| Dashboard link | Direct URL to the agent’s page on the Invarium dashboard. |
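
The color label follows directly from the score ranges listed under Color label. As a small illustration, the mapping could be written as follows; the function name color_label is hypothetical, and the thresholds are the ones documented on this page.

```python
def color_label(score: int) -> str:
    """Map an ARS value to its color label using the documented ranges."""
    if not 0 <= score <= 100:
        raise ValueError("ARS must be between 0 and 100")
    if score <= 30:
        return "RED"      # Critical Risk
    if score <= 60:
        return "YELLOW"   # Notable Issues
    return "GREEN"        # Ready


if __name__ == "__main__":
    for value in (30, 58, 82):
        print(value, color_label(value))
```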

ARS Categories

The ARS is scored across four categories (Security, Reliability, System Design, and Tool Quality), weighted by production risk.

| Category | What It Measures |
| --- | --- |
| Security | Guardrails, PII handling, input validation, authorization checks. |
| Reliability | Error handling, fallbacks, retry policies, timeout configuration. |
| System Design | Workflow completeness, constraint coverage, system prompt quality. |
| Tool Quality | Parameter definitions, descriptions, return types, side effect declarations. |

Severity Levels

| Severity | Meaning |
| --- | --- |
| Critical | Fundamental security or reliability gap. Address before going to production. |
| High | Significant issue that will meaningfully lower the ARS. |
| Medium | Moderate issue worth addressing to improve agent quality. |
| Low | Minor improvement opportunity. |
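
One way to act on these levels in a script is to triage by the worst severity present. The sketch below assumes severity counts are already available as a dictionary keyed by the level names above. Treating any Critical finding as a release blocker is an assumption drawn from the guidance in the table, not a rule enforced by the tool, and both function names are hypothetical.

```python
from typing import Dict, Optional

# Levels in descending order of urgency, as listed in the table above.
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def most_severe(counts: Dict[str, int]) -> Optional[str]:
    """Return the highest severity level with at least one finding."""
    for level in SEVERITY_ORDER:
        if counts.get(level, 0) > 0:
            return level
    return None


def ready_for_production(counts: Dict[str, int]) -> bool:
    """Assumed policy: block only when Critical findings remain."""
    return counts.get("Critical", 0) == 0


if __name__ == "__main__":
    sample = {"Critical": 0, "High": 1, "Medium": 3, "Low": 1}
    print(most_severe(sample), ready_for_production(sample))
```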

No Audit Results

If the agent has not had a blueprint uploaded yet, the tool returns:

No audit results for 'customer-support-agent'.
Upload a blueprint to trigger the static audit.

Examples

Basic Usage

Retrieve audit results for an agent:

"Get the audit results for my customer-support-agent"

After Uploading a Blueprint

The standard workflow is to upload the blueprint, then fetch the server-side audit:

// Step 1: Upload the blueprint
"Upload the blueprint for customer-support-agent"
// -> Local Audit Preview: 82/100 (Green)

// Step 2: Get the authoritative server-side audit
"Show me the audit results for customer-support-agent"
// -> Full audit with per-finding details and recommendations

Using Audit Findings to Improve the Blueprint

Review findings, fix the blueprint, re-upload, and check the new score:

// Get current audit
"Get the audit for travel-booking-agent"
// -> ARS: 58/100 (YELLOW), 2 high findings

// Fix: add error_handling, input validation, guardrails
// Re-upload the improved blueprint
"Upload the improved blueprint for travel-booking-agent"

// Verify improvement
"Show the updated audit for travel-booking-agent"
// -> ARS: 87/100 (GREEN), 0 high findings
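
To make the before/after comparison explicit, the two results can be reduced to a short summary. This is a minimal sketch using the numbers from the example above; the AuditSnapshot type and describe_change function are hypothetical, and the values are assumed to have been read from two invarium_get_audit responses.

```python
from dataclasses import dataclass


@dataclass
class AuditSnapshot:
    """The two values compared in the example: ARS and high-severity count."""
    score: int
    high: int


def describe_change(before: AuditSnapshot, after: AuditSnapshot) -> str:
    """Summarize how the ARS and high findings moved between two audits."""
    delta = after.score - before.score
    return (
        f"ARS {before.score} -> {after.score} ({delta:+d}), "
        f"high findings {before.high} -> {after.high}"
    )


if __name__ == "__main__":
    # Values taken from the travel-booking-agent example above.
    print(describe_change(AuditSnapshot(58, 2), AuditSnapshot(87, 0)))
    # prints: ARS 58 -> 87 (+29), high findings 2 -> 0
```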

Error Responses

| Error | Cause | Fix |
| --- | --- | --- |
| Authentication failed: invalid API key | Invalid or missing API key. | Verify your INVARIUM_API_KEY. Run invarium_connect first. |
| Agent '{name}' not found. Upload a blueprint first with invarium_upload_blueprint. | No agent with this name exists in your workspace. | Check the agent name. Use invarium_list_agents to see all registered agents. |
| No audit results for '{name}'. | The agent exists but has no audit data yet. | Upload a blueprint with invarium_upload_blueprint to trigger the audit. |
| Failed to get audit: ... | Backend API error. | Check the error details and try again. |
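
Since errors arrive as text, a caller that automates this tool may want to recognize them by their opening words. The sketch below matches the four messages in the table and returns a short kind plus the documented fix. The kind names and the classify_response function are hypothetical, and the matching assumes the messages begin exactly as shown here.

```python
import re
from typing import Optional, Tuple

# (pattern, hypothetical kind, fix from the table above)
_ERROR_PATTERNS = [
    (re.compile(r"^Authentication failed"), "auth",
     "Verify your INVARIUM_API_KEY. Run invarium_connect first."),
    (re.compile(r"^Agent '.*' not found"), "agent_not_found",
     "Check the agent name. Use invarium_list_agents to see all registered agents."),
    (re.compile(r"^No audit results for '.*'"), "no_audit",
     "Upload a blueprint with invarium_upload_blueprint to trigger the audit."),
    (re.compile(r"^Failed to get audit:"), "backend_error",
     "Check the error details and try again."),
]


def classify_response(text: str) -> Optional[Tuple[str, str]]:
    """Return (kind, fix) for a known error message, or None otherwise."""
    stripped = text.strip()
    for pattern, kind, fix in _ERROR_PATTERNS:
        if pattern.match(stripped):
            return kind, fix
    return None


if __name__ == "__main__":
    print(classify_response("Authentication failed: invalid API key"))
```

A return value of None means the text matched none of the known errors, which in practice would usually be a normal audit result.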

See Error Codes for the full error reference.
