Brilliant makes games for learning in math, science, computer science, and data analysis. They help learners develop intuition through interaction, build understanding through experimentation, and have fun.
Brilliant’s engineering leadership made quality the focus for 2024. Continued success and growth depended on a smooth, delightful product experience for users and on rapid development with minimal rework for developers.
Automated testing was the obvious solution, and they started looking at different options (including an in-house team and LLM-powered vendors) that would meet their needs:
Brilliant’s app introduces a lot of complexity that most testers and most QA tools aren’t able to handle. The lessons are delivered through interactive widgets with drag-and-drop functionality, unique animations triggered by the state of the UI, and responsive scaling for desktop and mobile devices. There are also dozens of permutations per lesson, like specialized hints or feedback triggered by particular errors. The whole system feels more like a game engine than a traditional application. As Brilliant evaluated their options, it became clear that most solutions weren’t built for game-like interactions or complex visual feedback loops.
In searching for a provider, it was really important to us that A) it was even capable of manipulating the browser in this way, because many of them are simply not capable of doing that; and B) have people who are able to think critically about a test’s purpose and goals.
—Jared Silver
When a company moves as fast as Brilliant, the functional requirements of older features aren’t always well documented. Even if the engineer who built a feature still works at the company, they’ve likely forgotten how it was meant to work. That creates a problem with tool-only vendors and other outsourced QA solutions: someone from Brilliant needs to define the test cases.
One of the reasons why QA has historically been a bit of a challenge for us is that functional product requirements for parts of the code base have been lost to time, and we would need to pull resources away from other things to reverse engineer more than a decade of workflows.
—Jared Silver
One of Brilliant’s primary needs was to let developers and lesson designers ship without fear of breaking anything. This becomes extremely difficult when you’re trying to ship quickly, as every test needs to be revised alongside the product roadmap to prevent false positives. Before partnering with QA Wolf, the team at Brilliant relied on manual checks before and after releasing new code, which hurt their productivity and split their focus.
The cognitive overhead required by our developers and content producers to test every permutation, every time they published a new lesson or launched a new feature, definitely slowed us down. It would sometimes take an entire day for QA testers to go through the application.
—Jared Silver
QA cycles dropped from 24 hours to 5 minutes
Manually testing the application after a release took hours, sometimes a full day. During that time, developers and lesson planners split their attention between monitoring the latest deploy and starting the next piece of work. The speed with which QA Wolf delivers test results lets developers finish one thing before starting the next, which reduces mental overhead and rework and makes them more productive in the long run.
Today’s users are extremely unforgiving. QA Wolf is helping Brilliant retain revenue by quickly detecting issues.
We took the expected lifetime value of a customer and the amount we could expect to lose if something is broken before a manual tester was able to find it, and we compared that to how quickly QA Wolf is able to detect issues, and we found that over the course of a year we’re saving hundreds of thousands in top line revenue — and I expect that to be many millions of dollars over time.
—Jared Silver
Of the nine QA solutions that Brilliant evaluated, including an in-house team, QA Wolf delivered the best value for money. Even with advances in AI tools, dedicated resources would still be required to monitor and manage them.
In order to be able to achieve the same coverage that QA Wolf provides, and to account for the necessary redundancy when people get sick or go on vacation, would probably require a minimum of four or five full-time engineers.
—Jared Silver