For any readers who aren’t familiar with the lingo, white-box tests (also called clear-box or open-box tests) are automated tests that are aware of and interact with the underlying code. They’re the opposite of black-box tests, which exercise the product the way an end user experiences it, through the rendered UI or APIs.
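To make the distinction concrete, here’s a rough sketch in TypeScript. The first test is white-box: it imports a hypothetical applyDiscount function and checks its return value directly. The second is black-box: it uses Playwright to drive the rendered UI the way a shopper would, without touching the code. The module path, URL, and selectors are placeholders for illustration.

```ts
import { test, expect } from "@playwright/test";
// Hypothetical module and function, shown only to illustrate direct code access.
import { applyDiscount } from "../src/pricing";

// White-box: calls the function directly and asserts on its return value.
test("applyDiscount subtracts a flat coupon", () => {
  expect(applyDiscount(100, { type: "flat", amount: 10 })).toBe(90);
});

// Black-box: drives the rendered UI like a user would, with no access to the code.
test("shopper sees the discounted total at checkout", async ({ page }) => {
  await page.goto("https://example.com/checkout");
  await page.getByLabel("Coupon code").fill("SAVE10");
  await page.getByRole("button", { name: "Apply coupon" }).click();
  await expect(page.getByTestId("order-total")).toHaveText("$90.00");
});
```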
Let’s start with what makes high-quality code. Every team will have slightly different definitions, and this list is far from exhaustive, but we can say that good code is…
We write white-box tests by the thousands to validate that our code meets those standards. This is work that developers have to do themselves, because the tests need direct access to the underlying functions, and because test-driven development means we structure our code around the tests.
Ironically, or just unfortunately, what we call “good code” is mostly invisible to the user. Yes, they notice inefficient code when it slows down their computer and sends their fans whirring, but what they’re really looking for is a product that is…
We do E2E testing because unit and integration tests can each pass on their own while the final product still has bugs. In a high-quality product, everything works together to create something greater than the sum of its parts.
For end-to-end testing to be most effective, teams need at least 80% test coverage. So when you look at how low test coverage actually is across industries, you start to appreciate how difficult reaching that bar can be.
In 2021, 90% of companies had less than 75% coverage, and two-thirds had less than 50%. And coverage numbers have dropped since 2018 (SmartBear, 2021).
Let’s talk about why that is.
Teams drown under the volume of E2E tests that need maintenance
Maintaining white-box tests is pretty simple. If there’s a change to the code, only the tests that are related to that code need to be updated—if anything. E2E tests are another animal altogether.
E2E tests are full-stack tests. They validate that everything is working as it should, from the front end to the APIs to the integrations. And the tests often overlap one another, with multiple tests running through the same functionality to cover different use cases.
That means if there’s a change to any part of the stack, no matter how small, you could break a couple or a couple dozen of your E2E tests. It puts an enormous maintenance burden on the whole organization to keep the E2E tests running. And the more tests you create, the greater the burden.
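Here’s a rough sketch of how that plays out in Playwright. Several tests share the same sign-in steps (the app, URLs, and selectors below are made up for illustration), so one small UI change, like renaming the sign-in button, fails every test that depends on it.

```ts
import { test, expect, Page } from "@playwright/test";

// Shared setup reused by many E2E tests (hypothetical app, URLs, and selectors).
async function signIn(page: Page) {
  await page.goto("https://app.example.com/login");
  await page.getByLabel("Email").fill("qa@example.com");
  await page.getByLabel("Password").fill("correct-horse-battery");
  // If the product team renames this button to "Log in", every test that
  // calls signIn() starts failing, even though nothing is actually broken.
  await page.getByRole("button", { name: "Sign in" }).click();
}

test("user can create a project", async ({ page }) => {
  await signIn(page);
  await page.getByRole("button", { name: "New project" }).click();
  await expect(page.getByText("Untitled project")).toBeVisible();
});

test("user can invite a teammate", async ({ page }) => {
  await signIn(page);
  await page.getByRole("link", { name: "Members" }).click();
  await expect(page.getByRole("button", { name: "Invite" })).toBeVisible();
});
```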
Writing and maintaining end-to-end tests limits the ability to work on new features
There are lots of hidden costs to shifting all the way left and putting QA responsibilities onto developers. The biggest is probably productivity loss.
Maintaining end-to-end tests can take 20–40% of a developer’s time, and in our experience it’s the first thing that overworked development teams neglect. That sets off a chain reaction of delays: flaky and failing tests hold up deployments, bugs slip into production, and teams spend even more time deploying fixes.
Engineering time is expensive and is much better spent building new features and adding value for customers than maintaining end-to-end tests.
End-to-end testing is rarely a developer’s core competency
This is pretty evident just from the coverage levels that we see, but it’s also shown in research: Just 45% of engineering teams believe they have the right testing strategy or process in place (World Quality Report, 2021).
Another survey found that 70% of teams approach test design by intuition; just 46% have a methodical approach that provides efficient and effective test coverage (Tricentis, 2021).
There’s an art and a science to E2E testing, and it’s not something most front-end developers have learned to do. In fact, one of the most common things we hear from clients is that they simply don’t have the expertise to write E2E tests for their own applications:
The real benefit of QA Wolf is that we can test far more than we were able to test before — and more than we realistically ever would have been able to do — which has led to a much more stable and reliable application.
—Collin Palmer, Product Manager, Padlet
The QA Wolf platform and our in-house QA engineers ramp clients to 80%+ test coverage in four months. When a test fails, your QA concierge will investigate 24 hours a day. Flaky tests get fixed and re-run automatically, while bug reports get flagged for your developers or ticketed.
As your product grows, we grow with you, keeping you at a minimum of 80% coverage at all times. And if you ever want to leave, the test code is written in Microsoft’s open-source Playwright framework, which you can export and run yourself.
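Because the tests are standard Playwright, an exported suite runs with the framework’s own tooling. A minimal, illustrative config might look like this; the values are assumptions for the sketch, not our actual setup.

```ts
// playwright.config.ts -- a minimal, illustrative config for running an
// exported suite locally; values here are assumptions, not QA Wolf's setup.
import { defineConfig } from "@playwright/test";

export default defineConfig({
  testDir: "./tests",
  retries: 2, // automatically re-run tests that fail, which absorbs some flakes
  use: {
    baseURL: "https://staging.example.com", // hypothetical test environment
    trace: "on-first-retry", // record a trace whenever a retry is triggered
  },
});
```

From there, `npx playwright test` runs the whole suite, and the `retries` setting gives failed tests another attempt before they’re reported.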