Replay is designed for recording and replaying interpreted language runtimes. In previous posts we’ve talked about how Replay’s recorder works and the ways in which recording is specialized to work well on runtimes. This lets us replay what the runtime is doing, but isn’t enough to let us actually inspect the runtime’s state for debugging or other analysis. In this post we’ll discuss the architecture and techniques Replay uses for inspecting runtimes, illustrating what is involved in adapting a new runtime to support Replay and in writing a new client to inspect Replay recordings.
There are two interfaces in play when creating or inspecting Replay recordings:

- The Record Replay Protocol, which clients such as Replay’s devtools use to inspect recordings via Replay’s backend.
- The Recorder API, which the runtime uses to communicate with Replay’s recorder when recording and with the replayer when replaying.
Both of these interfaces are designed to support adaptation/extension to new languages and new clients. Together with Replay’s recorder and backend, they form a platform for time-travel debugging and analysis of interpreted languages. Once a runtime has been integrated with the recorder so that it replays reliably and supports the APIs used to inspect its state, any client using the Record Replay Protocol can debug that runtime’s recordings. Likewise, new clients using the protocol will be able to debug recordings made by any runtime that has been integrated with the recorder.
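Before walking through an example, it may help to see the rough shape of the two sides. This is only an orienting sketch in TypeScript: the two interface names come from above, while every member name below is an illustrative assumption rather than the real API.

```typescript
// Client side: debugging clients speak the Record Replay Protocol to
// Replay's backend, sending requests named like "Debugger.findSources".
// The sendCommand shape here is an illustrative stand-in.
interface RecordReplayProtocolClient {
  sendCommand(method: string, params: unknown): Promise<unknown>;
}

// Runtime side: a runtime integrated with the recorder calls into the
// Recorder API (e.g. RecordReplayOnNewSource, described below) and
// installs callbacks the replayer uses to inspect its state.
interface RecorderApi {
  recordReplayOnNewSource(sourceId: string, url: string): void;
  recordReplayOnInstrument(kind: string, sourceId: string, offset: number): void;
  installCallbacks(callbacks: Record<string, (...args: unknown[]) => unknown>): void;
}
```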
To show how these interfaces work, we’ll use the example of setting a breakpoint in Replay’s devtools. When a breakpoint is added, the devtools console is updated to show every point where that breakpoint is hit. The breakpoint has a message which can be edited to evaluate an expression everywhere the breakpoint is hit, updating the console with the results of those evaluations within a second or two.
To understand how this works, let’s start by describing it from the perspective of the devtools client. After loading the recording, the devtools sends a Debugger.findSources request to get the URLs and identifiers for every source (a piece of JavaScript) that was loaded by the recording. Debugger.getSourceContents and Debugger.getPossibleBreakpoints requests then fetch the text of these sources and the set of places where breakpoints can be added.

Evaluating a user-provided expression everywhere the breakpoint is hit is done with a few Analysis requests: Analysis.createAnalysis specifies an analysis that can run when the program is paused somewhere, Analysis.addLocation indicates that the analysis should run everywhere a specific breakpoint is hit, and Analysis.runAnalysis starts the analysis and returns the results of performing it at all the hits for that breakpoint. The analysis specification causes Pause.getTopFrame and Pause.evaluateInFrame requests to run at each of these hits, so that the result of evaluating the expression at each hit is included in the analysis results and can be shown to the user.
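To make this concrete, here is a minimal sketch in TypeScript of the request sequence for one breakpoint. The method names are the ones described above, but the sendCommand helper, the parameter shapes, and the mapper body are simplified stand-ins rather than the protocol’s exact schema.

```typescript
// Hypothetical transport helper: a real client would send the command over
// a WebSocket session to Replay's backend and await the JSON response.
async function sendCommand<T>(method: string, params: unknown): Promise<T> {
  throw new Error(`not connected: ${method} ${JSON.stringify(params)}`);
}

// Evaluate an expression at every hit of a breakpoint, mirroring the
// request sequence described above.
async function evaluateAtBreakpoint(
  url: string,
  line: number,
  column: number,
  expression: string
): Promise<unknown[]> {
  // Get the URLs and identifiers for every source loaded by the recording.
  const { sources } = await sendCommand<{
    sources: { sourceId: string; url?: string }[];
  }>("Debugger.findSources", {});
  const source = sources.find((s) => s.url === url)!;

  // Fetch the source's text and the places where breakpoints can be added.
  await sendCommand("Debugger.getSourceContents", { sourceId: source.sourceId });
  await sendCommand("Debugger.getPossibleBreakpoints", { sourceId: source.sourceId });

  // Create an analysis whose mapper runs wherever the analysis is applied.
  // At each hit the mapper issues Pause.getTopFrame and Pause.evaluateInFrame
  // requests; the mapper body here is a simplified illustration.
  const { analysisId } = await sendCommand<{ analysisId: string }>(
    "Analysis.createAnalysis",
    {
      mapper: `
        const { frame } = sendCommand("Pause.getTopFrame", {});
        const { result } = sendCommand("Pause.evaluateInFrame", {
          frameId: frame,
          expression: ${JSON.stringify(expression)},
        });
        return [{ key: input.point, value: result }];
      `,
    }
  );

  // Run the analysis everywhere this breakpoint is hit and collect the
  // evaluation results to show in the console.
  await sendCommand("Analysis.addLocation", {
    analysisId,
    location: { sourceId: source.sourceId, line, column },
  });
  const { results } = await sendCommand<{ results: unknown[] }>(
    "Analysis.runAnalysis",
    { analysisId }
  );
  return results;
}
```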
Replay’s backend responds to all of these requests by replaying the recording and fetching the information it needs from the replayed program through the Recorder API. Several kinds of information are needed:

- The sources loaded by the recording, along with their contents and the locations where breakpoints can be added.
- The execution points where breakpoints are hit, so that analyses can run at those points.
- The state of the replayed program when it is paused at a point, such as its stack frames and the results of evaluating expressions.
Every time the runtime’s virtual machine loads or creates a new source for the language being interpreted, it calls a RecordReplayOnNewSource API in the recorder. This call doesn’t do anything when recording, but the replayer can use this API to enumerate the sources in the recording. When it does so, it will use callbacks (which the VM installed at startup) to fetch the source’s contents and breakpoint locations. These callbacks are based on the same requests used in the Record Replay Protocol: in the same way that a client sends a Debugger.getSourceContents request to asynchronously get a source’s contents from the backend, the backend’s replayer sends a Debugger.getSourceContents request to synchronously get those same contents, which it can store and then send to the client when needed.
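On the runtime side, the division of labor might look like the following sketch. RecordReplayOnNewSource and the getSourceContents / getPossibleBreakpoints callbacks come from the description above; the registration function, the callback shapes, and the replayer internals are hypothetical stand-ins, and the real Recorder API is a C interface, modeled here in TypeScript to match the other examples.

```typescript
// Callbacks the VM installs at startup so the replayer can ask it for
// source information. The callback names mirror the protocol requests; the
// shapes are illustrative.
interface SourceCallbacks {
  getSourceContents(sourceId: string): string;
  getPossibleBreakpoints(sourceId: string): { line: number; column: number }[];
}

let sourceCallbacks: SourceCallbacks | null = null;
let replaying = false;
const collectedSources: object[] = [];

// Hypothetical registration call, made once by the VM during startup.
function RecordReplaySetSourceCallbacks(callbacks: SourceCallbacks): void {
  sourceCallbacks = callbacks;
}

// Called by the VM every time it loads or creates a new source. When
// recording this does nothing. When replaying, the replayer may or may not
// call back into the VM for the source's contents and breakpoint
// locations, and replaying must continue to work either way.
function RecordReplayOnNewSource(sourceId: string, url: string): void {
  if (!replaying || !sourceCallbacks) {
    return;
  }
  if (replayerWantsSourceInfo(sourceId)) {
    const contents = sourceCallbacks.getSourceContents(sourceId);
    const breakpoints = sourceCallbacks.getPossibleBreakpoints(sourceId);
    // Store the results so they can be sent to the client when needed.
    collectedSources.push({ sourceId, url, contents, breakpoints });
  }
}

// Stand-in for the replayer's decision about whether it needs this
// source's information right now.
function replayerWantsSourceInfo(_sourceId: string): boolean {
  return false;
}
```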
When replaying, these request callbacks may or may not be invoked during the RecordReplayOnNewSource call, and either way, replaying needs to continue to work afterwards. Essentially, the replayer’s behavior within these API calls is a source of non-determinism. If it asks for source contents or breakpoint locations, the runtime needs to get that information, and in doing so it can change the VM’s state. For example, some JavaScript engines compile functions lazily: normally, compilation happens when the function is first called, but getting its breakpoint locations requires compiling it earlier than that. The point where a function is first compiled is then non-deterministic. JS compilation involves allocating both malloc’ed and garbage-collected objects, but because malloc and the GC are already allowed to behave non-deterministically, replaying can continue without running into problems.
This illustrates one of the main benefits of recording and replaying with effective determinism rather than complete determinism. Extracting information about sources is one of the simpler analyses the replayer will do, but if we insisted on replaying with complete determinism then we would have to get this information without using malloc or allocating GC’ed objects, which would require major, invasive changes to the VM. Replaying with effective determinism allows this analysis to run without needing specialized VM changes, and takes advantage of the fact that lazy function compilation is an optimization that has already been designed to work without affecting the behavior of the running JavaScript.
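A small sketch of the lazy-compilation point may help here; all names are hypothetical, and the sketch only models the behavior described above: asking for breakpoint locations forces compilation to happen earlier than it otherwise would.

```typescript
// Minimal model of a lazily compiled function in a JavaScript engine.
interface FunctionScript {
  compiled: boolean;
  breakpointSites: number[];
}

// Compilation allocates malloc'ed and GC'ed objects. Under effective
// determinism the replayer tolerates the resulting allocator and GC
// non-determinism, so it is safe for this to run at a different point
// than it did while recording.
function compile(fn: FunctionScript): void {
  fn.compiled = true;
  fn.breakpointSites = [0, 10, 25]; // stand-in bytecode offsets
}

// Normally compilation happens when the function is first called, but
// fetching breakpoint locations forces it to happen earlier, making the
// first-compilation point non-deterministic.
function getPossibleBreakpoints(fn: FunctionScript): number[] {
  if (!fn.compiled) {
    compile(fn);
  }
  return fn.breakpointSites;
}
```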
To be able to quickly run analyses against all the points where a breakpoint is hit, the replayer runs an analysis that collects the entire set of hits on every breakpoint location in the recording, so those hits can be indexed and queried rapidly. This is done through the RecordReplayOnInstrument API, which the VM calls while replaying to describe what code is running: calls are made whenever a breakpoint site is reached, and in a few other places. This is similar to the RecordReplayOnNewSource API described above, but unlike that API, the replayer can also notify the VM about whether instrumentation is enabled (and whether instrumentation calls need to be made) using a callback the VM installs at startup.
When replaying, it is non-deterministic whether instrumentation is enabled and what happens within instrumentation calls, but since we replay with effective determinism, replaying will not be affected by any instrumentation-related allocations or other side effects. In fact, because the JITs are allowed to behave non-deterministically, they can optimize away instrumentation logic entirely when instrumentation is disabled, and discard that optimized code if it is enabled later on. This uses the JIT’s existing mechanisms for optimizing and deoptimizing code, and ensures replaying is fast when instrumentation-based analyses aren’t running.
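Putting the pieces together, here is a hedged sketch of how a VM might wire this up. RecordReplayOnInstrument is the API named above; the enabled flag, the callback the replayer uses to toggle it, and the call-site shape are illustrative assumptions.

```typescript
let instrumentationEnabled = false;

// Installed by the VM at startup; the replayer calls it to indicate
// whether instrumentation calls need to be made. When disabled, the JIT
// can compile breakpoint sites down to nothing, and discard that
// optimized code if instrumentation is enabled later on.
function setInstrumentationEnabled(enabled: boolean): void {
  instrumentationEnabled = enabled;
}

// Hypothetical recorder entry point, called while replaying to describe
// what code is running; the replayer uses these calls to index every
// breakpoint hit in the recording. The kinds shown are illustrative.
function RecordReplayOnInstrument(
  kind: "breakpoint" | "enter" | "exit",
  sourceId: string,
  offset: number
): void {
  // ... record a hit at (sourceId, offset) for later indexing ...
}

// What the interpreter effectively does at each breakpoint site. The
// check is cheap when instrumentation is disabled, and JIT-compiled code
// can omit it entirely.
function onBreakpointSite(sourceId: string, offset: number): void {
  if (instrumentationEnabled) {
    RecordReplayOnInstrument("breakpoint", sourceId, offset);
  }
}
```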