QA Wolf AI

How QAW AI stacks up

QA Wolf without AI

19:00

QA Wolf AI

6:00

Playwright recorder

30:00

Meet the agents

QA Wolf’s multi-agent system has specialized agents for specific tasks. This allows the system to use multiple context-heavy inputs, including the video and transcript, DOM snapshots, and browser logs, to outline and code a test. Working together, the agents solve problems faster and with fewer mistakes than a single-agent system could.

The Orchestrator

One agent to rule them all and control the flow of information between agents.

The Outliner

The Outliner develops comprehensive test plans and AAA (Arrange, Act, Assert) outlines after watching a product tour and capturing the client’s testing goals from the audio.

The Code Writer

The Code Writer generates open-source Playwright code. It’s trained on 700+ gym scenarios derived from 40M test runs and can automate any test case that Playwright supports.

The Verifier

The Verifier runs the test code to make sure it works as intended.

+150 other agents
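
As a rough illustration of how an orchestrator might pass work between specialized agents, the sketch below wires three stand-in functions together. Everything in it is hypothetical: the function names (`outliner`, `codeWriter`, `verifier`, `orchestrate`), the data shapes, and the retry limit are assumptions made for this example, not QA Wolf's actual implementation. The stand-in agents return canned values where a real system would call a model or run a browser.

```javascript
// Hypothetical sketch: an orchestrator routing work between stand-in agents.
// None of these names or shapes come from QA Wolf; they are illustrative only.

// Stand-in Outliner: turns a testing goal into an Arrange-Act-Assert outline.
function outliner(goal) {
  return {
    goal,
    arrange: ["Open the login page"],
    act: ["Fill in credentials", "Click the sign-in button"],
    assert: ["The dashboard heading is visible"],
  };
}

// Stand-in Code Writer: renders the outline as commented test source text.
function codeWriter(outline, attempt) {
  const lines = [
    ...outline.arrange.map((s) => `// Arrange: ${s}`),
    ...outline.act.map((s) => `// Act: ${s}`),
    ...outline.assert.map((s) => `// Assert: ${s}`),
  ];
  return { attempt, source: lines.join("\n") };
}

// Stand-in Verifier: a real one would execute the test. This one pretends
// the first draft fails so that the retry path is exercised.
function verifier(draft) {
  return { passed: draft.attempt >= 2 };
}

// Orchestrator: controls the flow of information between the agents and
// records each hand-off so the run can be reviewed afterwards.
function orchestrate(goal, maxAttempts = 3) {
  const log = [];
  const outline = outliner(goal);
  log.push("outliner: produced AAA outline");

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const draft = codeWriter(outline, attempt);
    log.push(`codeWriter: wrote draft ${attempt}`);
    const result = verifier(draft);
    log.push(`verifier: draft ${attempt} ${result.passed ? "passed" : "failed"}`);
    if (result.passed) return { status: "passed", attempts: attempt, draft, log };
  }
  return { status: "failed", attempts: maxAttempts, draft: null, log };
}

const run = orchestrate("User can log in");
```

The design choice worth noting is that only the orchestrator knows about the other agents; each agent takes plain data in and returns plain data out, which keeps them independently replaceable.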

AI agents + human reviewers = 100% accuracy

As the agents create and update E2E tests, human reviewers check their work. The partnership with QA Wolf AI lets our engineers do 5x more work in the same amount of time: new tests are created in minutes, and existing tests are updated almost instantaneously.

700 eval criteria

QA Wolf AI agents are evaluated each night on their ability to handle 700 unique UI scenarios that have been identified from QA Wolf’s history of 50,000,000 test runs.

The evals — known collectively as the training gym — are in turn evaluated for how well they reflect real-world conditions.
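
To make the nightly evaluation idea concrete, here is a minimal sketch of scoring an agent against a set of scenarios. The scenario names, the `runScenario` callback, and the report shape are all assumptions made up for this example; the source does not describe how the training gym is implemented or scored.

```javascript
// Hypothetical sketch of a nightly eval run. Scenario names and the scoring
// scheme are invented for illustration; they are not QA Wolf's actual gym.

// Run every scenario through an agent callback and summarize the results.
function runEvals(scenarios, runScenario) {
  const failures = [];
  let passed = 0;

  for (const scenario of scenarios) {
    let ok = false;
    try {
      ok = runScenario(scenario) === true;
    } catch (err) {
      ok = false; // A thrown error counts as a failed scenario.
    }
    if (ok) passed++;
    else failures.push(scenario.name);
  }

  const total = scenarios.length;
  return { total, passed, passRate: total === 0 ? 0 : passed / total, failures };
}

// Three made-up UI scenarios standing in for the real set.
const scenarios = [
  { name: "date-picker", difficulty: 1 },
  { name: "drag-and-drop", difficulty: 3 },
  { name: "file-upload", difficulty: 2 },
];

// Stand-in agent: succeeds only on scenarios below difficulty 3.
const report = runEvals(scenarios, (s) => s.difficulty < 3);
```

Keeping the list of failed scenario names alongside the pass rate makes it easier to see which UI patterns regressed from one night to the next.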

Under the hood

Explainable decision-making

Every decision the agents make is logged in plain English and auditable by QA engineers and customers, so there’s transparency and accountability.

Figure: UI showing howl's decision-making process
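
A plain-English, auditable record of the kind described above might be structured along these lines. This is only a sketch: the `DecisionLog` class, its field names, and the sample entries are assumptions for illustration, not QA Wolf's actual log format.

```javascript
// Hypothetical sketch of an auditable decision log. The class and field
// names are assumptions for illustration, not QA Wolf's actual format.

class DecisionLog {
  constructor() {
    this.entries = [];
  }

  // Record what an agent decided and why, in plain English.
  // Entries are frozen so they cannot be altered after the fact.
  record(agent, decision, reason) {
    const entry = Object.freeze({
      step: this.entries.length + 1,
      agent,
      decision,
      reason,
    });
    this.entries.push(entry);
    return entry;
  }

  // Return only the entries written by one agent, for a focused audit.
  byAgent(agent) {
    return this.entries.filter((e) => e.agent === agent);
  }

  // Render the whole log as human-readable lines.
  toText() {
    return this.entries
      .map((e) => `${e.step}. [${e.agent}] ${e.decision} (because ${e.reason})`)
      .join("\n");
  }
}

const log = new DecisionLog();
log.record("Outliner", "Added a logout step", "the product tour ends by signing out");
log.record("Code Writer", "Used a role-based locator", "the button has no stable id");
```

Calling `log.toText()` yields one numbered line per decision, which is the form a reviewer would read.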

Arbitrary JavaScript development

Test code generated by QA Wolf AI uses the variables, loops, helper functions, and conditionals required to automate complex workflows.
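
For a sense of what this looks like in practice, the sketch below uses a variable, a loop, a condition, and a helper function to fill out a form. The helper relies only on `fill(selector, value)`, `check(selector)`, and `click(selector)`, which Playwright's `Page` provides, so a real `page` could be passed in. The selectors, field names, and the recording stand-in `fakePage` are hypothetical and exist only so the example runs without a browser; this is not code generated by QA Wolf AI.

```javascript
// Hypothetical sketch: selectors and field names are invented for illustration.

// Helper function: fill every field in a form, then optionally submit it.
// `page` needs only fill(selector, value), check(selector) and click(selector),
// which Playwright's Page provides.
async function fillForm(page, fields, { submit = true } = {}) {
  for (const [name, value] of Object.entries(fields)) { // loop over fields
    const selector = `[name="${name}"]`;                 // variable
    if (typeof value === "boolean") {                    // condition
      // Booleans are treated as checkboxes; only tick them when true.
      if (value) await page.check(selector);
    } else {
      await page.fill(selector, String(value));
    }
  }
  if (submit) await page.click('button[type="submit"]');
}

// Stand-in for a browser page that records calls instead of driving a UI.
function createFakePage() {
  const calls = [];
  return {
    calls,
    async fill(selector, value) { calls.push(`fill ${selector} = ${value}`); },
    async check(selector) { calls.push(`check ${selector}`); },
    async click(selector) { calls.push(`click ${selector}`); },
  };
}

const fakePage = createFakePage();
const done = fillForm(fakePage, { email: "a@example.com", seats: 3, terms: true });
```

In a Playwright test, the same helper would be awaited inside a `test(...)` callback with the real `page` fixture in place of `fakePage`.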

Multi-source context

The agents can reference multiple streams to evaluate a situation and implement an action, including the AAA test outline, browser logs, the HTML of the page, visual screenshots, and video product tours.
