Replay Testing, but for browsers

Replay testing is one of the most important but least discussed technologies that every major self-driving car company has developed.

Here’s the short version. Car company Alpha has a fleet of cars running on the road. The model is generally very good, so there are no accidents, but there are hundreds of close calls where a car gets too close to another car or a pedestrian. These incidents are an incredibly valuable data set for future testing.

When deciding whether to change how the car behaves, via a model or a code change, company Alpha replays these historical takeovers and near-misses before and after the change and compares the results.
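To make that before/after comparison concrete, here is a minimal sketch. Everything in it is hypothetical: the `Scenario` record, the two braking policies, and the one-dimensional `replay` loop are toy stand-ins for illustration, not any company’s actual tooling.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A policy maps the current gap to the obstacle (meters) to a braking force (m/s^2).
Policy = Callable[[float], float]


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record of one historical near-miss."""
    name: str
    initial_gap_m: float  # distance to the obstacle when the recording starts
    speed_mps: float      # speed of the car when the recording starts


def replay(scenario: Scenario, policy: Policy, dt: float = 0.1) -> float:
    """Re-run one scenario under a policy and return the smallest gap reached."""
    gap, speed = scenario.initial_gap_m, scenario.speed_mps
    min_gap = gap
    while speed > 0 and gap > 0:
        speed = max(0.0, speed - policy(gap) * dt)
        gap -= speed * dt
        min_gap = min(min_gap, gap)
    return max(min_gap, 0.0)  # 0.0 means the car reached the obstacle


def old_policy(gap: float) -> float:
    return 6.0 if gap < 15 else 0.0  # brakes late


def new_policy(gap: float) -> float:
    return 6.0 if gap < 25 else 0.0  # candidate change: brakes earlier


def compare(scenarios: List[Scenario], before: Policy, after: Policy) -> List[Tuple[str, float, float]]:
    """Replay every scenario under both policies: (name, min gap before, min gap after)."""
    return [(s.name, replay(s, before), replay(s, after)) for s in scenarios]


near_misses = [Scenario("pedestrian-crossing", initial_gap_m=30.0, speed_mps=15.0)]
report = compare(near_misses, old_policy, new_policy)
for name, gap_before, gap_after in report:
    print(f"{name}: min gap {gap_before:.1f} m -> {gap_after:.1f} m")
```

The point of the design is that the scenarios stay fixed while only the policy varies, so any difference in the report can be attributed to the change under test.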

The question is how replay testing for cars carries over to browsers.

The first observation is that while it’s possible to design a simulated environment for the ego car and NPCs, it’s very difficult to design a simulation for real-world applications such as redfin.com. This is where Replay.io’s ability to record the browser going to redfin.com, interacting with the application, and later deterministically replaying the environment is valuable.

The second observation is that while it’s possible to collect a lot of sensor data in the real world, the virtual world can be a lot noisier. This is where Replay’s browser serves as a lossless recorder, capable of retrieving any runtime information after the fact, such as the DOM and the React component tree, as well as derived data like clickable elements and element stacking contexts.
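As a rough illustration of derived data, the sketch below computes clickable elements from a recorded DOM snapshot. The snapshot format (nested dicts with `tag`, `attrs`, and `children`) and the clickability rule are assumptions made up for this example; they are not Replay’s actual data model or API.

```python
from typing import Any, Dict, List

# Assumed rule: these tags count as clickable, as does anything with an onclick handler.
CLICKABLE_TAGS = {"a", "button", "input", "select"}


def clickable_elements(node: Dict[str, Any], path: str = "") -> List[str]:
    """Return the paths of visible clickable elements in a snapshot subtree."""
    attrs = node.get("attrs", {})
    if attrs.get("hidden"):
        return []  # a hidden element hides its whole subtree
    here = f"{path}/{node['tag']}"
    found = [here] if node["tag"] in CLICKABLE_TAGS or "onclick" in attrs else []
    for child in node.get("children", []):
        found.extend(clickable_elements(child, here))
    return found


# Hypothetical snapshot taken at one point in a recording.
snapshot = {
    "tag": "body",
    "children": [
        {"tag": "button", "attrs": {"id": "search"}},
        {"tag": "div", "attrs": {"onclick": "openMap()"}},
        {"tag": "div", "attrs": {"hidden": True}, "children": [{"tag": "a"}]},
        {"tag": "span"},
    ],
}

clickable = clickable_elements(snapshot)
print(clickable)
```

Because the snapshot is retained, a different rule (for example, one that also considers stacking order) could be applied later without re-recording anything.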

The third observation is that while it’s possible to explore different strategies or present different counterfactuals in the car simulation, it’s impossible in the virtual world, because the next time you visit redfin.com and execute the same steps, you could get a different outcome. This is where the flexibility of Replay.io’s replayer comes in. When you’re viewing a replay and pause at a point in time, we fork the browser process so that you can call new functions, and even update the DOM and repaint.

This flexibility gives you the ability to try running different counterfactuals, such as: what if the agent had clicked Button B instead of Button A, or had simply clicked Button A sooner? The ability to explore the space, go back, and try different strategies gives models a unique training environment.
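The fork-and-branch idea can be sketched in a few lines. This is a toy model under stated assumptions: `PageState`, the two click actions, and `run_counterfactual` are hypothetical names, and `copy.deepcopy` stands in for forking a paused browser process.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PageState:
    """Hypothetical stand-in for a browser paused at one point in a replay."""
    url: str
    cart: List[str] = field(default_factory=list)


Action = Callable[[PageState], None]


def click_button_a(state: PageState) -> None:
    # Button A: check out. Fails if nothing is in the cart yet.
    state.url = "/checkout" if state.cart else "/cart-empty"


def click_button_b(state: PageState) -> None:
    # Button B: add an item to the cart.
    state.cart.append("item-1")


def run_counterfactual(paused: PageState, actions: List[Action]) -> PageState:
    """Fork the paused state and apply actions to the fork only."""
    branch = copy.deepcopy(paused)  # the paused state itself is never mutated
    for action in actions:
        action(branch)
    return branch


paused = PageState(url="/product")
what_happened = run_counterfactual(paused, [click_button_a])
what_if = run_counterfactual(paused, [click_button_b, click_button_a])
print(what_happened.url, what_if.url)
```

Each branch starts from the same paused state, so the outcomes can be compared directly and the original pause point can be revisited as many times as needed.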

What are the next steps?

Today, there are lots of companies building agents to perform tasks on the web in browsers. The simplest integration would be for these companies to start using the Replay browser to record these runs. Then, whenever a developer wanted to see why the agent failed to perform a task, they could open the replay and inspect it with browser devtools.