Skip to content
Allure report logoAllure Report
Main Navigation ModulesDocumentationStarter Project

English

Español

English

Español

Appearance

Allure Agent Mode

Test reporting for the agentic era

Allure Agent Mode turns your existing test command into agent-readable test reporting for Codex, Claude, and other coding agents, so they can review what ran before claiming a change is safe.

Try Agent Mode

Keep your test reporting workflow. Let your AI agent use it too.

check checkout feature coverage
Worked for 3m 22s

Coverage Check

I checked checkout coverage with two fresh Allure agent runs:

  • checkout e2e agent run: 7 passed, expectations matched, 0 findings.
  • checkout api agent run: 5 passed, expectations matched, 0 findings.

Covered happy paths:

  • cart summary before payment
  • successful order creation
  • declined-payment error state

Main gaps I see:

  • No focused expired-card test.
  • No direct retry-after-failure coverage.
Why it exists

"Tests passed" is not enough for AI test work

You ask an agent to check a change. It runs a command, sees green, and says it looks safe. But you still have to ask the real questions: did it run the right tests, did it notice what was skipped, and did it understand what those tests actually proved?

01

Green does not tell you what was checked

A passing command can hide a lot. The agent may have run too broad a smoke test, missed the feature you cared about, ignored skipped tests, or treated a weak assertion as real validation.

$ npm test -- --grep smokeRUN v2.0.5 /app/checkout✓ test/checkout.smoke.test.ts (12)  ✓ cart summary  ✓ order successTest Files  1 passed (1)Tests       7 passed, 5 skipped (12)Duration    1.4s
Agent seesSmoke passed
Agent missesA leftover it.only skipped half of the checkout smoke suite
02

The important details are not all in the terminal

The useful story often lives in the report: which tests ran, what steps happened, what retried, what was attached, which screenshots or traces matter, and where the run looks thin.

$ npm test checkout.spec.tsRUN v2.0.5 /app/checkout✓ test/checkout.spec.ts (2)  ✓ creates an order  ✓ declines invalid cardTest Files  1 passed (1)Tests       2 passed (2)Duration    349ms
Agent seesCheckout works as expected
Agent missesThe tests checked the legacy checkout page
03

You still need to trust the agent's conclusion

When an agent says covered or safe, you need more than confidence. You need to see what it checked, what reporting it reviewed, and what gaps it found before you accept the answer.

$ npm test checkout.spec.tsRUN v2.0.5 /app/checkout× test/checkout.spec.ts (2)  ✓ creates an order  × keeps total after retryExpected: "$42.00"  Received: "$39.00"Test Files  1 failed (1)Tests       1 failed, 1 passed (2)Duration    612ms
Agent seesCheckout is broken
Agent missesA flaky test read stale cart data after retry

That is what Allure Agent Mode gives the agent.

It lets your AI agent use the same test reporting workflow your team already uses, so its answer can be based on what actually ran, not just what the terminal happened to print.

Allure agent outputindex.md

Coverage check

I checked checkout coverage with three fresh Allure agent runs.

  • Scope gap:checkout.smoke.test.ts passed, but 5 skipped scenarios were present in the same run.
  • Context gap:checkout.spec.ts covered the legacy checkout route, not the new checkout page.
  • Test quality: the retry failure read stale cart data after navigation, so I would not treat it as proof that checkout is broken.
I would not call checkout fully covered yet. The useful next test is the new-page retry path with fresh cart state.
How it works

A test report for your agent

Allure Agent Mode wraps the test command your agent would run anyway and writes agent-readable output, including a Markdown run report. The Allure skills teach Codex, Claude, and other coding agents how to set it up, when to run it, and how to review the result before answering.

Test authoring

Add or change tests without losing the behavior you meant to protect. The agent runs the intended scope, checks what actually executed, and strengthens weak assertions or evidence before calling the work done.

Coverage review

Review what is covered at runtime, not just what appears covered from source code. The agent compares intended scope with observed tests, skipped cases, attachments, and gaps.

Regression safety checks

Before saying a change is safe, the agent runs the smallest meaningful test scope and reports what passed, what was skipped or retried, and what remains uncertain.

Failure debugging

When tests fail, the agent reviews per-test evidence, findings, logs, screenshots, traces, retries, and stderr before deciding whether it is a product bug, test bug, fixture issue, or environment problem.

Flaky test investigation

Separate real failures from unstable tests. The agent can inspect retries, timing, stale state, attachments, and repeated runs before recommending a fix.

Evidence enrichment

Improve tests that pass but are hard to review. The agent adds meaningful steps, attachments, labels, parameters, or descriptions, then reruns the same scope to confirm the report is clearer.

Getting started

Try Agent Mode in your repo

Start by adding the Allure skills. If your project already has Allure reporting, you can go straight to the agent workflow setup. If not, ask your agent to configure reporting first. The examples use different coding agents to show the workflow is portable; use whichever CLI you prefer.

01
Shell

Add the skills

Install the Allure skills in the repository where your tests live.

$ npx skills add allure-framework/skills
02
Claude

Configure Allure reporting

Optional. Use this when the project does not already emit useful Allure results.

$ claude "configure allure reporting"
03
Codex

Enable Agent Mode

Let the agent discover local wrappers, test commands, result paths, and supported agent options.

$ codex "enable allure agent mode"
04
OpenCode

Ask for real test work

Now use normal testing requests. Agent Mode gives the agent a report to review before it answers.

$ opencode run "review checkout test coverage"
Works your way

Use Agent Mode with the AI workflow you already have

Allure Agent Mode does not require MCP, a hosted service, or a specific AI vendor. It works through the test command your coding agent can already run, then gives that agent a readable report to review before it answers.

  • No MCP requiredRun it from the shell. Your agent only needs to execute project commands and read the generated report.
  • Any agentic code workflowUse Codex, Claude, OpenCode, custom scripts, or CI-driven agents with the same testing setup.
  • Any modelSwitch tools or models without changing how your tests report.
  • Humans still get Allure reportsYour team keeps visual reports for charts, history, CI artifacts, and review.

CI systems

GitHub Actions logo
GitHub Actions
GitLab CI logo
GitLab CI
Jenkins logo
Jenkins
CircleCI logo
CircleCI

Test frameworks

Playwright logo
Playwright
Vitest logo
Vitest
Jest logo
Jest
Pytest logo
Pytest
JUnit logo
JUnit
Mocha logo
Mocha
Allure ReportAgent Mode
same test workflowNo MCP required

AI agents

C
Codex
Claude logo
Claude
OpenCode logo
OpenCode

Readable markdown run reports

Humans

Allure reports

Visual reports, charts, history, and artifacts

Try Agent Mode

Make AI test work easier to trust

Keep the judgment with your team. Let Agent Mode give your coding agent the report context it needs before it claims coverage, safety, or a fix.

Get started
Powered by

Subscribe to our newsletter

Get product news you actually need, no spam.

Subscribe
Allure TestOps
  • Overview
  • Why choose us
  • Cloud
  • Self-hosted
  • Success Stories
Company
  • Documentation
  • Blog
  • About us
  • Contact
  • Events
Join our community

We aim to make Allure Report as reliable and user-friendly as possible, and together with the community, we're here to help when problems arise.

© 2026 Qameta Software Inc. All rights reserved.