How the Mojo REPL works under the hood, and how each engine exploits it.
Mojo doesn't have a standalone interpreter. All Mojo code execution -- including the REPL -- happens through LLDB's expression evaluation infrastructure. When you run `mojo repl`, it:
- Initializes LLDB via `SBDebugger::Initialize()`
- Loads `libMojoLLDB.dylib` (Modular's LLDB plugin that adds Mojo language support), or `libMojoLLDB.so` on Linux
- Creates a target from `mojo-repl-entry-point` (a small binary with a `mojo_repl_main` breakpoint)
- Launches the target and stops at the breakpoint
- Calls `SBDebugger::RunREPL(mojo_lang)` to enter interactive REPL mode
The critical challenge for a Jupyter kernel is variable persistence. Each call to LLDB's `expression` command creates a new scope -- `var x = 42` in one call is invisible to the next.
As discovered by running `strings` on `libMojoLLDB.dylib`, the REPL uses a context-struct pattern:
```mojo
struct __mojo_repl_context__:
    var `x`: __mojo_repl_UnsafePointer[mut=True, __mojo_repl_UnsafePointer[mut=True, Int]]
    pass
```

Each variable declared in the REPL gets a field in this struct as a double-indirection `UnsafePointer`. Every expression is then wrapped in:
```mojo
def __mojo_repl_expr_impl__(mut __mojo_repl_arg: __mojo_repl_context__,
                            mut `x`: Int) -> None:
    var __mojo_repl_expr_failed = True
    @parameter
    def __mojo_repl_expr_body__() -> None:
        # user code goes here
        pass
    __mojo_repl_expr_body__()
    __mojo_repl_expr_failed = False
```

The context struct accumulates fields as you declare variables. Each expression receives the accumulated context as a parameter, giving it access to all previously declared variables. LLDB's `AddPersistentVariable` stores the compiled results across evaluations.
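The persistence trick is easier to see outside LLDB. A minimal Python analogy (purely illustrative -- the real context struct holds double-indirection pointers into the debuggee process, not a dict):

```python
# Illustrative analogy: a context object accumulates one slot per
# declaration, and each "expression" runs as a function that receives
# the accumulated context -- mirroring __mojo_repl_context__ and
# __mojo_repl_expr_impl__. Nothing here is the real mechanism.

class ReplContext:
    def __init__(self):
        self.slots = {}  # name -> value; the real struct stores pointers

def run_expr(ctx, body):
    failed = True  # mirrors __mojo_repl_expr_failed
    try:
        body(ctx.slots)
        failed = False
    except Exception:
        pass
    return failed

ctx = ReplContext()
run_expr(ctx, lambda env: env.__setitem__("x", 42))  # cell 1: var x = 42
seen = []
run_expr(ctx, lambda env: seen.append(env["x"]))     # cell 2: print(x)
print(seen)  # [42] -- cell 2 sees the variable declared in cell 1
```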
This mechanism is driven through the internal `Target::GetREPL()` plus `IOHandlerInputComplete()`. LLDB's public SB API (`SBTarget`) does not expose a `GetREPL()` method. However, the internal `lldb/Target/Target.h` exposes `Target::GetREPL()`, which returns a REPL object. Routing code through this object via `IOHandlerInputComplete()` gives full variable persistence, likely because it reuses the same REPL instance across calls.
Because `SBTarget` wraps the internal `Target` with a single member:

```cpp
class SBTarget {
private:
  lldb::TargetSP m_opaque_sp;
};
```

the C++ server accesses the internal `Target` via `reinterpret_cast`:
```cpp
static lldb::TargetSP get_target_sp(lldb::SBTarget &target) {
  return *reinterpret_cast<lldb::TargetSP *>(&target);
}
```

This works because `SBTarget` has exactly one data member, so its address is the address of `m_opaque_sp`. We reinterpret it and dereference to get the `TargetSP`, then call `Target::GetREPL()` on it.
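The "address of the object equals address of its only member" assumption behind this cast can be sanity-checked with a `ctypes` analogue (the two-pointer `shared_ptr` layout below is a typical-implementation assumption for illustration, not Modular's exact ABI):

```python
import ctypes

# Stand-in for lldb::TargetSP: a shared_ptr is typically two pointers
# (object pointer + control block pointer). Assumed layout, for demo only.
class OpaqueSP(ctypes.Structure):
    _fields_ = [("ptr", ctypes.c_void_p), ("ctrl", ctypes.c_void_p)]

# Stand-in for SBTarget: exactly one data member.
class FakeSBTarget(ctypes.Structure):
    _fields_ = [("m_opaque_sp", OpaqueSP)]

t = FakeSBTarget()
# The object's address equals its sole member's address, which is what
# the reinterpret_cast in get_target_sp() relies on.
assert ctypes.addressof(t) == ctypes.addressof(t.m_opaque_sp)
print("layout assumption holds")
```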
The server is a single-process C++ binary:

- Startup: Initialize LLDB, load `libMojoLLDB.dylib`/`.so`, create a target from `mojo-repl-entry-point`, set a breakpoint on `mojo_repl_main`, launch, and stop at the breakpoint.
- Create the LLDB REPL object:

  ```cpp
  auto mojo_lang = SBLanguageRuntime::GetLanguageTypeFromString("mojo");
  debugger.SetREPLLanguage(mojo_lang);
  TargetSP target_sp = get_target_sp(target);
  lldb_private::Status repl_err;
  REPLSP repl = target_sp->GetREPL(repl_err, mojo_lang, nullptr, true);
  IOHandlerSP io_handler = repl->GetIOHandler();
  ```

  `SBTarget` does not expose `Target::GetREPL()`, so the server unwraps the internal `TargetSP` from `SBTarget` with the single-member layout assumption described above.
- JSON protocol loop: Read JSON from stdin, route execute requests through the persistent REPL instance, and return JSON on stdout.

  ```cpp
  std::string mutable_code = code;
  repl->IOHandlerInputComplete(*io_handler, mutable_code);
  ```

  This uses LLDB's real Mojo REPL path, so `var`/`let` declarations persist across execute requests.
- Output capture: There are two output channels to collect after each execute request:
  - Target process stdout/stderr: output produced by running Mojo code, such as `print(...)`. The server drains this with `SBProcess::GetSTDOUT()` and `SBProcess::GetSTDERR()`.
  - Debugger stdout/stderr: output produced by LLDB's REPL machinery, especially compiler diagnostics, parse errors, and other REPL messages. The server redirects the debugger's output and error file handles to temporary files, then reads and truncates those files after each request.

  The JSON response combines both sources so notebook users see normal program output and compile/runtime diagnostics from the same execute request.
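The drain-and-truncate cycle for the debugger channel can be sketched with a plain temp file (illustrative Python; the real server does this in C++ on the debugger's redirected file handles):

```python
import os
import tempfile

# The server points the debugger's output/error handles at temp files;
# after each execute request it reads whatever accumulated, then
# truncates so the next request starts from an empty file.
diag = tempfile.NamedTemporaryFile(mode="w+", delete=False)

def drain(f):
    f.flush()
    f.seek(0)
    data = f.read()
    f.seek(0)
    f.truncate()  # next request sees only its own diagnostics
    return data

diag.write("error: use of unknown declaration 'bad'\n")
print(repr(drain(diag)))  # the diagnostics for this request
print(repr(drain(diag)))  # '' -- previous output was truncated
os.unlink(diag.name)
```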
The main server uses LLDB's public SB API plus a few LLDB internal headers for the REPL path:

```cpp
#include <lldb/Target/Target.h>    // Target::GetREPL()
#include <lldb/Expression/REPL.h>  // REPLSP, IOHandlerInputComplete()
#include <lldb/Utility/Status.h>   // lldb_private::Status
```

Because these headers expose LLDB/LLVM private types, `server/repl_server.cpp` must be built against matching LLDB headers and linked against Modular's `liblldb` plus LLVM support libraries:

- `-llldb23.0.0git` (from Modular)
- `-lLLVMSupport` (from brew LLVM)
- `-lLLVMDemangle` (from brew LLVM)
```
→ {"type":"execute","code":"var x = 42","id":1}
← {"id":1,"status":"ok","stdout":"","stderr":"","value":""}
→ {"type":"execute","code":"print(x)","id":2}
← {"id":2,"status":"ok","stdout":"42\r\n","stderr":"","value":""}
→ {"type":"execute","code":"print(bad)","id":3}
← {"id":3,"status":"error","stdout":"","stderr":"","ename":"MojoError",
   "evalue":"use of unknown declaration 'bad'","traceback":["..."]}
→ {"type":"shutdown","id":99}
← {"id":99,"status":"ok"}
```
The pexpect engine spawns `mojo repl` with noise-suppressing LLDB settings:

```shell
mojo repl \
  -O 'settings set show-statusline false' \
  -O 'settings set show-progress false' \
  -O 'settings set use-color false' \
  -O 'settings set show-autosuggestion false' \
  -O 'settings set auto-indent false'
```
It sets `TERM=dumb` to minimize terminal escape sequences (though editline still produces some ANSI codes that must be stripped).
- Send code: Each line is sent individually via `sendline()`, followed by a blank line to submit.
- Read output: Read from the PTY until a prompt pattern (`\n\s*\d+>\s`) is detected. After the first prompt match, wait 300 ms of silence ("settle time") to ensure all output has arrived.
- Parse output: Strip ANSI codes, filter prompt lines (`\d+[>.]\s`) and echo lines, and detect `error:` to split normal output from error messages.
PTY data arrives in chunks. The prompt pattern might appear in the middle of a chunk, with more output still buffered. The settle time ensures we've received everything before returning.
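The prompt-then-settle read loop can be sketched in Python with `select` (the prompt regex comes from the description above; `read_until_settled` and its parameters are illustrative):

```python
import os
import re
import select

# Prompt pattern from the pexpect engine: newline, optional spaces,
# a line number, ">", whitespace (e.g. "\n  2> ").
PROMPT = re.compile(rb"\n\s*\d+>\s")

def read_until_settled(fd, settle=0.3, timeout=10.0):
    """Read PTY chunks until the prompt appears AND `settle` seconds
    pass with no further data (the "settle time" described above)."""
    buf = b""
    prompt_seen = False
    while True:
        wait = settle if prompt_seen else timeout
        ready, _, _ = select.select([fd], [], [], wait)
        if not ready:
            if prompt_seen:
                return buf  # quiet after the prompt: output has settled
            raise TimeoutError("no prompt before timeout")
        chunk = os.read(fd, 4096)
        if not chunk:  # EOF
            return buf
        buf += chunk
        if PROMPT.search(buf):
            prompt_seen = True

# Demo with a pipe standing in for the PTY master.
r, w = os.pipe()
os.write(w, b"42\r\n  2> ")
os.close(w)
print(read_until_settled(r, settle=0.05))  # b'42\r\n  2> '
os.close(r)
```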
The parser scans for lines containing `error:` (case-insensitive). Once found, all subsequent lines are treated as error output. This matches how the Mojo compiler reports errors through the REPL.
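The parsing stage can be sketched as follows (regexes are taken from the description above; the echo-filtering rule is simplified here to an exact match against the lines that were sent):

```python
import re

ANSI = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")   # CSI escape sequences
PROMPT_PREFIX = re.compile(r"^\s*\d+[>.]\s?")  # e.g. "  1> " or "  2. "

def parse_repl_output(raw, sent_lines):
    """Split raw REPL output into (stdout, stderr) after stripping
    ANSI codes, prompt prefixes, and echoed input lines."""
    out, err = [], []
    in_error = False
    for line in ANSI.sub("", raw).splitlines():
        stripped = PROMPT_PREFIX.sub("", line)
        if stripped in sent_lines:
            continue  # echo of a line we sent
        if PROMPT_PREFIX.match(line) and not stripped:
            continue  # bare prompt line
        if "error:" in stripped.lower():
            in_error = True  # everything from here on is error output
        (err if in_error else out).append(stripped)
    return "\n".join(out).strip(), "\n".join(err).strip()

print(parse_repl_output("\x1b[1m  1> print(x)\r\n42\r\n  2> ", ["print(x)"]))
print(parse_repl_output("error: use of unknown declaration 'bad'\r\n", []))
```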
`tests/test_pexpect_engine.py` is marked `@pytest.mark.slow`. The pexpect fallback engine is not currently our active execution path, so normal development test runs should skip these tests:

```shell
tools/test.sh
```

Run them only when explicitly working on fallback behavior:

```shell
INCLUDE_SLOW=1 tools/test.sh -m slow
```

Use these scripts to capture reproducible behavior snapshots for offline debugging:

```shell
tools/explore_lsp.py
tools/explore_kernel_client.py
```

Each writes a timestamped JSON report under `meta/` with raw request/response payloads.
`MojoLSPClient` sets `MODULAR_PROFILE_FILENAME` to a temp path by default, so LSP profiling artifacts don't land in the project directory. Set `MODULAR_PROFILE_FILENAME` explicitly to override this.

For live kernel diagnostics, set `MOJO_KERNEL_LSP_DIAG=1` before starting Jupyter. Completion replies will include `_mojokernel_debug` metadata (per-stage success/failure, elapsed ms, and an LSP health snapshot on errors), and kernel logs will include LSP warning details and restarts. If needed, tune the LSP request timeout with `MOJO_LSP_REQUEST_TIMEOUT` (seconds).
This is a C++ version of the pexpect approach. It:
- Creates a PTY pair with `openpty()`
- Redirects LLDB's stdin/stdout/stderr to the PTY slave via `SetInputFileHandle()`/`SetOutputFileHandle()`
- Runs `SBDebugger::RunREPL()` in a detached `std::thread`
- Communicates with the REPL through the PTY master using the same prompt detection and output parsing as the pexpect engine
- Exposes the same JSON protocol on stdin/stdout as the main server
This exists as a fallback. If Modular changes the internal `SBTarget` layout or the `Target::GetREPL()` / `REPL::IOHandlerInputComplete()` APIs used by the main server, the PTY server should still work because it drives `SBDebugger::RunREPL()` through public LLDB APIs and terminal I/O.
The original server used `ci.HandleCommand("expression -l mojo -- " + code)`. This works for `fn`/`struct`/`trait` definitions (which compile into the persistent LLDB module) but NOT for `var`/`let` declarations. Each `HandleCommand` creates a new expression scope -- variables are local to that scope and disappear after evaluation.

The `Target::GetREPL()` path fixes this by reusing LLDB's actual Mojo REPL object across requests. That REPL object owns the accumulated context struct, so each call through `REPL::IOHandlerInputComplete()` sees declarations from previous cells, matching the behavior of the interactive Mojo REPL.