Introduction
xemu is a RISC-V system emulator written in Rust. It boots OpenSBI, xv6, Linux, and Debian 13 to an interactive shell, and is the execution core of the ProjectX computer system project.
What xemu is
- A full-system emulator (M / S / U privilege modes, MMU, devices)
- RV32 / RV64 dual-target: single codebase via cfg
- Supports the IMAFDC + Zicsr + Zifencei extensions, plus the privileged ISA with Sv32 / Sv39 MMU and PMP
- Ships a debugger (xdb) with breakpoints, watchpoints, expression evaluation, disassembly, and reference-comparison differential testing against QEMU and Spike
- Uses a clean, trait-based device bus with lock-free single-hart hot path
What xemu is not
- Not a JIT. It's a pure interpreter with a decoded-instruction cache.
- Not a cycle-accurate simulator — it's faithful to architectural semantics, not microarchitectural timing.
- Not hardware-specific: it models a QEMU-virt-like platform with documented deltas.
How this book is organised
- Getting Started — build xemu, run your first kernel.
- Usage — drive each boot target, use the debugger, run benchmarks.
- Internals — how the CPU, memory subsystem, and devices are implemented.
- Reference — tables of supported ISA, memory map, environment variables, and HAL API.
- Contributing — the iteration workflow and how to propose a new feature.
For the development roadmap and landed-phase status, see
../PROGRESS.md. For feature-level specifications,
see ../spec/.
Overview
Top-level layout
ProjectX/
├── xemu/ RISC-V emulator
│ ├── xcore/ execution engine — CPU, MMU, devices, bus
│ ├── xdb/ debugger / monitor (REPL, breakpoints, difftest)
│ └── xlogger/ logging — colored, levelled, per-instruction trace
├── xam/ bare-metal HAL (abstract-machine) — targets xemu
├── xlib/ freestanding C library (klib) — printf, string, stdio
├── xkernels/ test kernels
│ └── tests/ am-tests, cpu-tests, alu-tests, benchmarks
├── resource/ external boot artifacts — OpenSBI, xv6, Linux, Debian
├── scripts/ CI + perf measurement scripts
└── docs/ this documentation
Component relationships
xkernel source (C / Rust)
│
│ compile with
▼
xam HAL + xlib (klib)
│
│ produces
▼
ELF image
│
│ loaded by
▼
xemu (xdb binary)
│
│ executes through
▼
xcore: CPU → MMU → Bus → Devices (ACLINT / PLIC / UART / VirtIO)
Crates at a glance
| Crate | Role | Key types |
|---|---|---|
| xcore | Execution engine | CPU, RVCore, Bus, Mmu, Pmp, Aclint, Plic, Uart, VirtioBlk |
| xdb | Binary + debugger | Monitor, breakpoint / watchpoint tables, command REPL |
| xlogger | Log facade | trace! / debug! / info! macros with color + timestamp |
| xam | Guest HAL | _putch, mtime, uptime, init_trap, TrapFrame |
| xlib | Guest C library | printf, memset, memcpy, strlen, strcmp, assert.h |
Boot target summary
| Target | Make command | Firmware | Rootfs |
|---|---|---|---|
| am-tests | cd xkernels/tests/am-tests && make run | none (bare) | — |
| xv6 | cd resource && make xv6 | xv6 bootstrap | ramdisk |
| Linux | cd resource && make linux | OpenSBI v1.3.1 | initramfs |
| Linux SMP | cd resource && make linux-2hart | OpenSBI | initramfs |
| Debian 13 | cd resource && make debian | OpenSBI + bootlin kernel | ext4 over VirtIO-blk |
| Debian SMP | cd resource && make debian-2hart | OpenSBI + bootlin kernel | ext4 over VirtIO-blk |
See Boot targets for each in detail.
Building xemu
Prerequisites
- Rust toolchain: auto-detected from rust-toolchain.toml (nightly).
- C cross-compiler: riscv64-unknown-linux-musl for building guest C programs. On macOS, install via brew or fetch from cross-tools/musl-cross.
- axconfig-gen: cargo install axconfig-gen (cached in ~/.cargo/bin).
- clang-format: system package, used by the fmt CI job.
Build modes
| Mode | Invocation | Notes |
|---|---|---|
| Release (default) | make run | LTO + codegen-units = 1. Use for benchmarks. |
| Debug | DEBUG=y make run | Faster to compile; slower at runtime. |
| Difftest-enabled | DIFFTEST=1 make run | Links the QEMU / Spike comparison backends. |
Always set DEBUG=n before benchmarking.
Supported hosts
- macOS (Apple Silicon, Intel) — primary development target.
- Linux (x86_64, aarch64) — CI target.
samply profiling works without entitlement on Linux.
Windows is not supported.
Cargo workspace
xemu is a single Cargo workspace at xemu/. Build the whole thing:
cd xemu
cargo build --release
cargo test --workspace
The resulting binary is xemu/target/release/xdb. In normal
development you don't invoke xdb directly — make run from a
kernel directory wires up the right X_FILE and launch flags.
Running your first kernel
The fastest way to see xemu work is to run the am-tests suite — bare-metal kernels that exercise the HAL, UART, ACLINT, PLIC, CSRs, traps, and interrupts.
cd xkernels/tests/am-tests
make run
You should see UART output, a summary line per sub-test, and a clean SiFive test-finisher exit.
Running a single am-test
cd xkernels/tests/am-tests
make run TEST=u # UART echo
make run TEST=c # CSR sanity
make run TEST=t # trap + interrupt routing
make run TEST=k # keyboard — interactive PTY echo
See xkernels/tests/am-tests/src/tests/ for the full set.
Running the cpu-tests
Two parallel suites — Rust (cpu-tests-rs, 31 tests) and C
(cpu-tests-c, 35 tests):
cd xkernels/tests/cpu-tests-rs && make run
cd xkernels/tests/cpu-tests-c && make run
Both are bare-metal — no OS, just instruction-sequence fixtures.
Troubleshooting
"cannot find riscv64-unknown-linux-musl-gcc" — install the
cross-compiler and export its bin/ on your PATH. See
Building xemu.
No UART output — check you're not in DEBUG=y mode unintentionally
(it routes UART to a PTY, requiring screen to attach). For plain
stdio, use DEBUG=n or omit the flag.
Test hangs — hit Ctrl-A X to exit. xemu intercepts the same
escape sequence QEMU uses.
What to read next
- The xdb debugger — step, break, examine memory / registers.
- Boot targets — run xv6, Linux, Debian.
- Architecture overview — how xemu is built internally.
Boot targets
xemu supports four kinds of boot:
| Target | Privilege entry | Firmware | Guest payload |
|---|---|---|---|
| am-tests | M-mode | none | bare-metal test kernel |
| xv6 | M-mode | xv6 bootstrap | xv6 kernel + ramdisk |
| Linux | M-mode | OpenSBI v1.3.1 | Linux 6.1.44 + initramfs |
| Debian | M-mode | OpenSBI + bootlin kernel | Linux + ext4 rootfs via VirtIO-blk |
All targets share the same machine layout (RAM at 0x8000_0000,
ACLINT at 0x0200_0000, PLIC at 0x0C00_0000, UART0 at
0x1000_0000). The differences are in what gets loaded and
whether a firmware layer is present.
See the individual target pages for details.
Common environment variables
| Var | Default | Effect |
|---|---|---|
| DEBUG | n | y routes UART to a PTY (requires screen attach) and enables extra logging |
| LOG | info | trace / debug / info, controls xlogger verbosity |
| X_HARTS | 1 | Number of guest harts (single-threaded cooperative scheduler) |
| DIFFTEST | 0 | 1 enables per-instruction comparison against QEMU / Spike |
Use make run (or the per-target aliases like make linux) — not
target/release/xdb directly. The Makefiles wire up X_FILE, boot
layout, and DTB compilation for you.
Bare-metal tests (am-tests)
Bare-metal kernels that exercise the HAL and the core device set without any OS layer.
Running
cd xkernels/tests/am-tests
make run # run all
make run TEST=u # UART echo
make run TEST=t # trap + interrupt routing
make run TEST=k # keyboard (interactive PTY echo)
make run TEST=f # float sanity
What each letter covers
| Letter | Subject |
|---|---|
| u | UART TX, THRE interrupt, DLAB divisor |
| c | CSR read / write / WARL masks |
| t | Trap delegation, mret / sret, vectored mtvec |
| i | Interrupt priority, MIE / SIE gating, global enable |
| a | ACLINT (MSWI + MTIMER + SSWI), mtimecmp, Sstc |
| p | PLIC claim / complete, level-trigger semantics |
| k | Keyboard: PTY-backed UART RX |
| f | F / D extension, NaN-boxing, fcsr shifted aliases |
Exit semantics
am-tests use the SiFive test finisher at 0x0010_0000:
- Write 0x5555 → graceful exit, status 0
- Write (code << 16) | 0x3333 → exit with code
xlib/src/stdio.c provides the xam_halt() helper that wraps this.
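The two finisher encodings listed above can be sketched in Rust as follows. This is a minimal illustration, not the code behind xam_halt(): the names ExitRequest, encode_exit, and decode_exit are hypothetical, and mapping exit code 0 to the pass value is an assumption of the sketch.

```rust
/// Physical address of the SiFive test finisher, as stated above.
pub const TEST_FINISHER_ADDR: u64 = 0x0010_0000;

const FINISHER_PASS: u32 = 0x5555;
const FINISHER_FAIL: u32 = 0x3333;

/// Outcome decoded from a 32-bit write to the finisher register.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq, Eq)]
pub enum ExitRequest {
    /// 0x5555: graceful exit, status 0.
    Pass,
    /// (code << 16) | 0x3333: exit with `code`.
    Fail(u16),
    /// Any other value; this sketch leaves its handling to the caller.
    Unknown(u32),
}

/// Guest side: build the word to store for a given exit code.
/// Assumption: code 0 is reported through the pass encoding.
pub fn encode_exit(code: u16) -> u32 {
    if code == 0 {
        FINISHER_PASS
    } else {
        (u32::from(code) << 16) | FINISHER_FAIL
    }
}

/// Emulator side: interpret a store to the finisher register.
pub fn decode_exit(value: u32) -> ExitRequest {
    match value & 0xFFFF {
        FINISHER_PASS => ExitRequest::Pass,
        FINISHER_FAIL => ExitRequest::Fail((value >> 16) as u16),
        _ => ExitRequest::Unknown(value),
    }
}
```

A guest would store encode_exit(code) at TEST_FINISHER_ADDR; the emulator would call decode_exit on the stored word and stop with the decoded status.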
xv6-riscv
Boots the MIT xv6-riscv kernel to an interactive shell.
Running
cd resource
make xv6
Expected time to prompt: ~0.3 s.
What happens
- xemu starts in M-mode at 0x8000_0000.
- The xv6 bootstrap switches to S-mode, sets up page tables, and starts the kernel scheduler.
- Console (sh) runs off a ramdisk embedded in the kernel image.
No firmware is loaded — xv6 runs directly. This makes it the simplest "real OS" target and a good sanity check after touching the trap framework or the MMU.
Exiting
Ctrl-A X — QEMU-style escape, intercepted by xemu's UART.
Linux (OpenSBI + initramfs)
Boots a full Linux 6.1.44 kernel to an interactive shell.
Running
cd resource
make linux # single-hart
make linux-2hart # 2 harts (cooperative scheduler)
Expected time to prompt: ~3 s.
Boot chain
xemu M-mode → OpenSBI v1.3.1 → Linux (S-mode) → static init (busybox lp64d)
- OpenSBI v1.3.1: fw_jump configuration, generic platform.
- Linux 6.1.44: bootlin kernel with rv64imafdc, Sstc timer.
- initramfs: bootlin rootfs (busybox + glibc lp64d), auto-downloaded at first build and packed into initrd.cpio.gz.
Init prompt
The initramfs runs a minimal static init with built-in commands:
ls pwd cd cat echo uname poweroff
poweroff invokes the SiFive test finisher via SBI shutdown — clean
exit.
DTS
resource/xemu-linux.dts declares:
- 1 GiB RAM at 0x8000_0000
- 1 or 2 harts (cpus@0, cpus@1)
- ACLINT, PLIC, UART, test-finisher nodes
- riscv,isa = "rv64imafdcsu_sstc"
- timebase-frequency = 10_000_000
SMP notes
make linux-2hart boots two harts on a single-threaded cooperative
round-robin scheduler. Both cores share the same Bus instance.
True per-hart OS threads are gated by the Phase 11 RFC; see
../PROGRESS.md §Phase 11.
Debian 13 Trixie (VirtIO-blk rootfs)
Boots a full Debian 13 system from an ext4 rootfs mounted via a VirtIO-blk device.
Running
cd resource
make debian # single-hart
make debian-2hart # 2 harts
Expected time to prompt: ~20 s.
What you get
- 4 GiB ext4 root at /dev/vda, mounted read-write.
- 288 dpkg packages pre-installed, including Python3 (verified during boot test).
- Full Debian shell: apt, vim, git, coreutils.
Boot chain
xemu M-mode → OpenSBI v1.3.1 → bootlin kernel → Debian userspace (/sbin/init)
The bootlin kernel is used in place of a custom kernel because it already has F / D extension support and the right Sstc driver, which matches xemu's exposed ISA.
First-run setup
On first make debian, the build system downloads the pre-built
4 GiB image (xemu-debian.img) to resource/debian/. Subsequent
runs reuse the snapshot; changes to the guest filesystem are
persisted across runs.
Two-tier reset
- VirtIO transport reset (issued by guest driver during probe): preserves disk contents; only resets the queue state.
- Emulator hard-reset (via test finisher): restores the disk snapshot, so repeated runs start from a clean Debian install.
DTS
resource/xemu-debian.dts:
- 1 GiB RAM
- virtio,mmio node at 0x1000_1000
- chosen: bootargs = "root=/dev/vda rw ..."
- Same ACLINT / PLIC / UART as the Linux target
Cleanly exiting
poweroff from the Debian shell → SBI shutdown → xemu exits.
Or Ctrl-A X for an immediate abort.
The xdb debugger
xdb is the xemu monitor — a REPL with GDB-flavoured commands for
breakpoints, watchpoints, memory / register inspection, and
single-stepping.
Invoking
When you make run any target without -batch flags, xemu drops
into the xdb REPL after loading. The prompt looks like:
(xdb)
Command reference
Execution
| Command | Effect |
|---|---|
| c / continue | Run until a breakpoint, watchpoint, or program exit. |
| s [N] / step [N] | Single-step N instructions (default 1). |
| r / run | Reset and restart. |
| q / quit | Exit xdb. |
Breakpoints
| Command | Effect |
|---|---|
| b <addr> | Set breakpoint at physical / virtual address. Stable ID returned. |
| info b | List breakpoints with IDs. |
| d <id> | Delete breakpoint by ID. |
Breakpoints are address-based. When execution resumes from a breakpoint hit, xdb skips that breakpoint once so it does not refire immediately.
Watchpoints
| Command | Effect |
|---|---|
| w <expr> | Watch a value: fires when the expression changes. Validated at creation. |
| info w | List watchpoints. |
| d <id> | Delete. |
Expressions can reference registers ($a0), dereference memory
(*0x80000000), and combine terms with arithmetic ($sp + 8), comparisons, and parentheses.
Inspection
| Command | Effect |
|---|---|
| x/N<f> <addr> | GDB-style memory examine: f = i (instruction), x (hex word), b (byte). |
| info reg [<name>] | Dump all registers, or named GPR / CSR / pc. |
| p <expr> | Evaluate and print an expression. |
Example:
(xdb) x/4i $pc
(xdb) x/16x 0x80200000
(xdb) info reg a0
(xdb) p $sp - 0x10
Differential testing
(xdb) dt attach qemu
(xdb) dt attach spike
(xdb) dt status
(xdb) dt detach
See Differential testing for what's compared and how to interpret divergences.
Logging while inside xdb
Set LOG=trace (per-instruction) or LOG=debug (per memory / CSR
event) before make run. Logs interleave with REPL output but do not
interrupt command entry.
Differential testing (QEMU / Spike)
Per-instruction comparison of xemu (the DUT) against a reference implementation (QEMU or Spike). On any divergence, xdb halts and reports the first mismatched register.
Enabling
DIFFTEST=1 make run
(xdb) dt attach qemu # or: dt attach spike
(xdb) c
What gets compared
- PC
- GPRs (x0..x31)
- Current privilege (M / S / U)
- 14 whitelisted CSRs (masked), auto-generated from the csr_table! macro's @ difftest annotation.
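The comparison over the state listed above can be sketched as a first-mismatch search. This is a hedged illustration only: ArchState, Mismatch, and first_mismatch are hypothetical names, the check order (PC, GPRs, privilege, CSRs) is an assumption, and the per-CSR masks stand in for whatever the csr_table! annotation actually generates.

```rust
/// Architectural snapshot compared after each retired instruction.
/// Field names are illustrative, not xemu's actual types.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ArchState {
    pub pc: u64,
    pub gpr: [u64; 32],
    pub privilege: u8, // 0 = U, 1 = S, 3 = M
    /// (CSR address, value) pairs for the whitelisted CSRs.
    pub csrs: Vec<(u16, u64)>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Mismatch {
    Pc { dut: u64, reference: u64 },
    Gpr { index: usize, dut: u64, reference: u64 },
    Privilege { dut: u8, reference: u8 },
    Csr { addr: u16, dut: u64, reference: u64 },
}

/// Returns the first divergence between DUT and REF, or None if they agree.
/// `csr_masks[i]` selects which bits of `csrs[i]` take part in the comparison.
pub fn first_mismatch(
    dut: &ArchState,
    reference: &ArchState,
    csr_masks: &[u64],
) -> Option<Mismatch> {
    if dut.pc != reference.pc {
        return Some(Mismatch::Pc { dut: dut.pc, reference: reference.pc });
    }
    for index in 0..32 {
        let (d, r) = (dut.gpr[index], reference.gpr[index]);
        if d != r {
            return Some(Mismatch::Gpr { index, dut: d, reference: r });
        }
    }
    if dut.privilege != reference.privilege {
        return Some(Mismatch::Privilege {
            dut: dut.privilege,
            reference: reference.privilege,
        });
    }
    for ((d, r), mask) in dut.csrs.iter().zip(&reference.csrs).zip(csr_masks) {
        if (d.1 & *mask) != (r.1 & *mask) {
            return Some(Mismatch::Csr { addr: d.0, dut: d.1, reference: r.1 });
        }
    }
    None
}
```

Masking before comparing lets a CSR stay on the whitelist even when DUT and REF legitimately differ in a few bits.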
MMIO handling
MMIO reads are inherently non-deterministic (they depend on wall-clock time and interrupt state). Difftest skips the comparison for instructions that touch MMIO and syncs the raw values from DUT to REF to keep the two aligned.
Backends
QEMU
- Protocol: GDB Remote Serial over TCP.
- Config: sstep=0x7 (ENABLE + NOIRQ + NOTIMER), PhyMemMode:1.
- Initial state is synced once at attach.
- Easy to reproduce on any host that has qemu-system-riscv64.
Spike
- Protocol: FFI into libriscv, wrapped by tools/difftest/spike/.
- Links libriscv + libsoftfloat + libfesvr + libdisasm.
- Closer to the ISA reference; harder to set up than QEMU.
- Used as the tiebreaker when QEMU and xemu disagree.
Known limitations
- Difftest cannot run in DEBUG=y mode: PTY timing perturbs the reference.
- Very long runs amortize slowly; prefer reproducing divergences on a focused test kernel.
- Not yet wired into CI; tracked as a deferred item in ../PROGRESS.md Phase 6.
Benchmarks
xemu ships three benchmark kernels under xkernels/tests/benchmark/.
| Benchmark | Iterations | Characteristic |
|---|---|---|
| Dhrystone | 500 000 | ALU / GPR / call-heavy |
| CoreMark | 1 000 | Mixed integer + list / matrix |
| MicroBench | 10 sub-benches | Includes C++ workloads (qsort-cpp, string) |
Running
cd xkernels/tests/benchmark/dhrystone && make run
cd xkernels/tests/benchmark/coremark && make run
cd xkernels/tests/benchmark/microbench && make run
Always run with DEBUG=n for stable timing.
Published scores (MacBook Air M4)
| Benchmark | Marks |
|---|---|
| MicroBench | 718 |
| CoreMark | 499 |
| Dhrystone | 255 |
Perf pipeline
To regenerate the measurement baseline:
bash scripts/perf/bench.sh # writes docs/perf/baselines/<today>/data/bench.csv
bash scripts/perf/sample.sh # writes <today>/data/<workload>.sample.txt
python3 scripts/perf/render.py # writes <today>/graphics/*.svg
Run from ProjectX/ root. See
../internals/performance.md for how
buckets are interpreted and
../PROGRESS.md §Phase 9 for landed optimisations.
Reproducing the published numbers
- Use make run (not target/release/xdb directly). The Makefile sets the right boot layout.
- Leave DEBUG unset (defaults to n).
- Close other CPU-heavy processes. macOS samply especially is sensitive to background load; user_s is the stable metric, real_s is noisy.
- Take the mean of 3 runs for any comparison.
Architecture overview
The step loop
CPU::step is the per-instruction driver:
CPU::step()
1. bus.tick() — ACLINT every step, UART/PLIC every 64
2. sync_interrupts() — merge irq_state → mip
3. check_pending_interrupts() — raise trap if priority/gating allows
4. fetch → decode (icache) → execute
5. retire + commit_trap() — commit npc, enter trap vector if any
The loop owns the Bus directly — no Arc<Mutex<Bus>>. Field-level
borrow splitting lets MMU and Bus be accessed simultaneously without
locking. This is Phase P1 of the perf roadmap; see
performance.md.
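Field-level borrow splitting is a plain Rust language feature, and a small stand-alone sketch shows why no lock is needed. The types below (Mmu, Bus, Core, Cpu) are toy stand-ins with hypothetical fields, not xemu's definitions.

```rust
/// Stand-ins for xemu's types; names and fields are illustrative only.
pub struct Mmu {
    pub offset: usize,
}

pub struct Bus {
    pub ram: Vec<u8>,
}

impl Mmu {
    /// Toy "translation": subtract a fixed base address.
    pub fn translate(&self, vaddr: usize) -> usize {
        vaddr - self.offset
    }
}

impl Bus {
    pub fn read_u8(&mut self, paddr: usize) -> u8 {
        self.ram[paddr]
    }
}

pub struct Core {
    pub mmu: Mmu,
}

pub struct Cpu {
    pub core: Core,
    pub bus: Bus,
}

impl Cpu {
    pub fn load_u8(&mut self, vaddr: usize) -> u8 {
        // Two disjoint field borrows taken from the same `&mut self`:
        // a shared borrow of `core.mmu` and a mutable borrow of `bus`.
        // The borrow checker accepts this because the fields do not overlap.
        let mmu = &self.core.mmu;
        let bus = &mut self.bus;
        let paddr = mmu.translate(vaddr);
        bus.read_u8(paddr)
    }
}
```

Because the CPU owns both fields, exclusivity is proven at compile time, so the hot path pays no runtime cost for locking or reference counting.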
Dispatch diagram
┌──────────────────────────────────────────────┐
│ xdb::main │
│ (monolithic under LTO + codegen-units = 1) │
└───────────────┬──────────────────────────────┘
│
┌───────────▼────────────┐
│ CPU<Core, Bus> │
│ ├── Core: CoreOps │ ← arch-agnostic trait
│ └── Bus: owned │
└───────────┬────────────┘
│
┌───────────▼────────────┐
│ RVCore (CoreOps impl) │
│ ├── GPR / PC / NPC │
│ ├── csr: CsrFile │
│ ├── privilege │
│ ├── mmu: Mmu │
│ ├── pmp: Pmp │
│ ├── icache │
│ └── pending_trap │
└───────────┬────────────┘
│
┌───────────▼──────────────────────────┐
│ Bus │
│ ├── Ram [0x8000_0000, 1 GiB] │
│ ├── ACLINT [0x0200_0000] │
│ ├── PLIC [0x0C00_0000] │
│ ├── UART [0x1000_0000] │
│ ├── VirtIO [0x1000_1000] │
│ └── Test [0x0010_0000] (test-only)│
└──────────────────────────────────────┘
Four-layer memory access
A guest load/store walks:
vaddr ─► align check ─► MMU.translate ─► paddr ─► PMP.check ─► Bus.access
│ ▲
└── page walk: pte_paddr ┘ (PMP checks PTE reads too)
Responsibility split (see
../spec/mm/SPEC.md for the canonical table):
| Layer | Knows about | Does NOT know about |
|---|---|---|
| Bus | Physical addresses, device regions | Virtual addresses, privilege, traps, PMP |
| Mmu | Page tables, TLB, PTE bits, SUM / MXR | Trap codes, PMP (receives &Pmp for walks) |
| Pmp | Physical-address permissions, privilege | Virtual addresses, page tables |
| RVCore | Orchestrates: privilege, MPRV, trap mapping | Internal device state |
Lock-free IRQ delivery
Devices raise interrupts through IrqState — a shared Arc<AtomicU64>
bitmap. Each bit maps to an mip hardware bit:
| Bit | Source |
|---|---|
| 1 | SSIP (ACLINT SSWI) |
| 3 | MSIP (ACLINT MSWI) |
| 7 | MTIP (ACLINT MTIMER) |
| 9 | SEIP (PLIC context 1) |
| 11 | MEIP (PLIC context 0) |
sync_interrupts() merges this into the CPU's mip register at the
top of each step. No locks, no downcasts.
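A minimal sketch of such a bitmap, using only std atomics, might look like the following. The bit positions come from the table above; the method names, the memory orderings, and the policy of overwriting every listed mip bit during the merge are assumptions of this sketch rather than xemu's confirmed behaviour.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Bit positions from the table above.
pub const SSIP: u64 = 1 << 1;
pub const MSIP: u64 = 1 << 3;
pub const MTIP: u64 = 1 << 7;
pub const SEIP: u64 = 1 << 9;
pub const MEIP: u64 = 1 << 11;
const HW_MASK: u64 = SSIP | MSIP | MTIP | SEIP | MEIP;

/// Shared interrupt bitmap; cloning hands another handle to a device.
#[derive(Clone, Default)]
pub struct IrqState(Arc<AtomicU64>);

impl IrqState {
    /// Device side: raise one or more lines.
    pub fn set(&self, bits: u64) {
        self.0.fetch_or(bits, Ordering::Release);
    }

    /// Device side: lower one or more lines.
    pub fn clear(&self, bits: u64) {
        self.0.fetch_and(!bits, Ordering::Release);
    }

    /// CPU side: snapshot the current lines.
    pub fn load(&self) -> u64 {
        self.0.load(Ordering::Acquire)
    }
}

/// Replace the device-driven bits of `mip` with the bitmap and keep the rest.
/// Assumption: all five bits are fully owned by devices in this sketch.
pub fn sync_interrupts(mip: u64, irq: &IrqState) -> u64 {
    (mip & !HW_MASK) | (irq.load() & HW_MASK)
}
```

A device keeps its own clone of IrqState and calls set or clear on state changes; the CPU only performs one atomic load per step.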
Related reading
- CPU dispatch & ISA decode
- CSR subsystem
- Memory: MMU, TLB, PMP
- Traps & interrupts
- Devices
- Performance: hot path & baselines
CPU dispatch & ISA decode
The CPU<Core, Bus> generic
pub struct CPU<C: CoreOps, B> {
    cores: Vec<C>,
    current: usize,
    bus: B,
}
- C: CoreOps is the ISA-specific core (e.g. RVCore). The generic boundary means xemu can host multiple ISAs; a LoongArch stub exists in xemu/xcore/src/arch/loongarch/.
- B is the bus type. Single-hart carries Bus inline; multi-hart carries Arc<Mutex<Bus>> when true SMP lands (Phase 11 RFC).
Per-instruction flow
- Tick devices. bus.tick() advances ACLINT mtime (every step), drives the UART and PLIC on a slower cadence (every 64 steps), and collects IRQ lines.
- Sync interrupts. sync_interrupts() copies the atomic IRQ bitmap into mip.
- Check pending interrupts. If an enabled interrupt is pending and priority and gating allow it, pending_trap is set and the rest of the step is skipped.
- Fetch. Read the instruction word at pc via the MMU.
- Decode. First check the decoded-instruction cache (per-hart 4 K direct-mapped). On miss, run the pest-based pattern matcher.
- Execute. Dispatch on DecodedInst, updating npc (and registers / CSRs / memory as side effects).
- Retire. self.pc = self.npc. If pending_trap is set, commit_trap() writes the trap vector address to npc first.
Decoder
xcore/src/arch/riscv/isa/decode/ contains ~200 instruction
patterns expressed in pest. Each pattern captures the opcode fields
into a DecodedInst::* variant:
- R/I/S/B/U/J: standard formats
- FR: floating-point with explicit rm (rounding mode) field
- FR4: FMA-style 4-register (fmadd, fmsub, ...)
- C*: compressed variants
The match tree after decode is the dispatch loop — one big match
on DecodedInst calling per-instruction handlers.
Decoded-instruction cache
Phase P4 of the perf roadmap:
struct ICacheLine {
    pc: usize,            // guest virtual address
    ctx_tag: u32,         // bumped on any mapping change
    raw: u32,             // raw instruction word (sanity)
    decoded: DecodedInst,
}

icache: [ICacheLine; 4096] // per-hart, direct-mapped
ctx_tag invalidates implicitly on:
- satp writes
- sfence.vma
- Privilege-mode transitions that change the effective translation
- fence.i
Self-modifying code: every guest store invalidates the whole icache
(simple, correct, loses the icache effect only on code-writing
guests — rare). See
../spec/perfHotPath/SPEC.md for
the full invariant set.
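The lookup path for a direct-mapped, context-tagged cache like this can be sketched as follows. It is a hedged illustration: the index function, the reserved tag value 0, the miss counter, and the placeholder DecodedInst variants are all assumptions, and the sketch does not model the store-triggered invalidation described above.

```rust
/// Placeholder for the real decoded-instruction enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DecodedInst {
    Invalid,
    Addi,
    Other,
}

#[derive(Clone, Copy)]
pub struct ICacheLine {
    pc: usize,
    ctx_tag: u32,
    raw: u32,
    decoded: DecodedInst,
}

const LINES: usize = 4096;

pub struct ICache {
    lines: Vec<ICacheLine>,
    ctx_tag: u32,
    pub misses: u64,
}

impl ICache {
    pub fn new() -> Self {
        // Tag 0 is reserved for "never filled"; live tags start at 1.
        let empty = ICacheLine { pc: 0, ctx_tag: 0, raw: 0, decoded: DecodedInst::Invalid };
        Self { lines: vec![empty; LINES], ctx_tag: 1, misses: 0 }
    }

    /// Invalidate every line in O(1) by moving to a fresh tag.
    pub fn bump_ctx(&mut self) {
        match self.ctx_tag.checked_add(1) {
            Some(tag) => self.ctx_tag = tag,
            None => {
                // Tag space exhausted: clear for real and restart at 1.
                let misses = self.misses;
                *self = Self::new();
                self.misses = misses;
            }
        }
    }

    /// Return the cached decode, or run `decode` and fill the line on a miss.
    pub fn lookup(
        &mut self,
        pc: usize,
        raw: u32,
        decode: impl FnOnce(u32) -> DecodedInst,
    ) -> DecodedInst {
        // Instructions are at least 2-byte aligned with the C extension.
        let idx = (pc >> 1) % LINES;
        let line = &mut self.lines[idx];
        if line.pc == pc && line.ctx_tag == self.ctx_tag && line.raw == raw {
            return line.decoded;
        }
        self.misses += 1;
        let decoded = decode(raw);
        *line = ICacheLine { pc, ctx_tag: self.ctx_tag, raw, decoded };
        decoded
    }
}
```

Bumping the tag instead of clearing 4096 lines keeps events such as satp writes and sfence.vma cheap, at the cost of one extra comparison per hit.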
Trace
LOG=trace emits one line per instruction with PC, mnemonic,
operands, and the resulting GPR delta. Very verbose — use it only
for focused debugging.
CSR subsystem
Layering
Two layers with a clean split:
CsrFile ← storage + descriptor-driven mask / shadow dispatch
RVCore ← privilege checks, dynamic rules, side effects, trap generation
Key principle: CsrFile knows what a CSR is (address, width,
WARL mask, shadow); RVCore knows when a CSR access is allowed
and what happens after.
Storage
Flat 4096-entry array indexed by the 12-bit CSR address:
pub struct CsrFile {
    regs: [Word; 4096],
}
Shadow registers (sstatus, sip, sie) don't occupy their own
slot — they redirect to the M-mode slot with a mask.
csr_table! macro
xcore/src/arch/riscv/cpu/csr/table.rs declares every CSR in a
single macro invocation. The macro generates:
- The CsrAddr enum.
- The CSR_DESCS descriptor table (address, WARL mask, shadow target, side effects, difftest whitelist membership).
- The O(1) dispatch match for read / write.
One source of truth prevents the enum and descriptor table from drifting apart.
WARL model
Writes go through the descriptor's write mask:
new = (old & !mask) | (incoming & mask)
Read-only-zero bits are enforced by mask = 0 on those fields.
Shadow registers wrap the M-mode register's read/write with an extra
mask (e.g. sstatus only exposes the S-mode subset of mstatus).
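The WARL formula and the shadow wrapping can be written out as a short sketch. The Shadow type and its view_mask field are hypothetical names for illustration, and the single-bit masks in the usage note are toy values, not the real sstatus layout.

```rust
pub type Word = u64;

/// WARL write: only bits set in `mask` take the incoming value.
/// new = (old & !mask) | (incoming & mask)
pub fn warl_write(old: Word, incoming: Word, mask: Word) -> Word {
    (old & !mask) | (incoming & mask)
}

/// A shadow register: a masked view over another register's storage slot.
/// (Hypothetical type, introduced only for this sketch.)
pub struct Shadow {
    /// Bits of the target register that are visible through the shadow.
    pub view_mask: Word,
}

impl Shadow {
    /// Reads expose only the visible subset of the target.
    pub fn read(&self, target: Word) -> Word {
        target & self.view_mask
    }

    /// Writes can change only bits that are both writable and visible.
    pub fn write(&self, target: Word, incoming: Word, write_mask: Word) -> Word {
        warl_write(target, incoming, write_mask & self.view_mask)
    }
}
```

With view_mask = 0b0010 and a target value of 0b1000, writing 0b1010 through the shadow yields 0b1010, while writing 0 yields 0b1000: the bit outside the view is never touched.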
Side effects
Some writes trigger xemu-internal reconfiguration:
| CSR | Side effect |
|---|---|
| satp | mmu.update_satp → reconfigure SvMode + flush TLB + bump icache ctx_tag |
| mstatus | Recompute SUM / MXR flags |
| pmpcfg* / pmpaddr* | Rebuild PMP entries with lock semantics |
| mtimecmp | Recompute next_fire_mtime in ACLINT (Phase P3) |
| fcsr / fflags / frm | Route through shifted-subfield alias to the canonical fcsr |
Traps on CSR access
CSR privilege violations raise architectural traps, not emulator errors. Use:
self.raise_trap(TrapCause::IllegalInst, /*tval=*/ instruction_word);
return Ok(());
Never return Err(XError) from a trap — reserve Err for host
failures (I/O error) and emulator invariant violations. This is the
"err2trap" refactor pattern; see
../spec/err2trap/SPEC.md.
Difftest whitelist
The csr_table! @ difftest annotation marks CSRs whose value is
checked against the reference every step. Currently 14 CSRs are on
the whitelist — architectural state that both DUT and REF model the
same way. CSRs that depend on xemu-specific timing (time,
mcycle) are excluded.
Memory: MMU, TLB, PMP
Access pipeline
vaddr → align check → MMU.translate(vaddr, op, priv) → paddr → PMP.check(paddr, op, priv) → Bus.access
│ ▲
└── page walk (pte_paddr checks PMP)┘
Four layers, each with a narrow responsibility. See
../spec/mm/SPEC.md for the full design.
MMU
- Sv32 (RV32) and Sv39 (RV64): multi-level page walk with hardware A / D bit update.
- SvMode descriptor: runtime switch between Sv32 / 39 / 48 / 57; PTE format-dependent methods take &SvMode.
- satp WARL: RV64 currently restricts to Sv39; writes to unsupported modes are masked.
The MMU caches the effective privilege, SUM / MXR bits, and the
current SvMode — avoiding a CSR read on every translate.
TLB
- 64 entries, direct-mapped.
- ASID-tagged. Global pages (PTE.G = 1) match any ASID.
- Flushed on sfence.vma with the standard ASID / vaddr operand semantics.
PMP
- 16 entries, matching modes: OFF, TOR, NA4, NAPOT.
- Partial-overlap detection: an access that only partially falls inside an entry's range raises the appropriate fault.
- Lock bit: once set, the entry is immutable until the next reset.
- M-mode fast path: when no entries are locked, M-mode bypasses the 16-entry linear scan entirely.
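To make the NAPOT mode and the partial-overlap rule concrete, here is a small sketch that decodes a NAPOT pmpaddr value and classifies an access against it. The decoding follows the RISC-V privileged specification (the count of trailing one bits selects the region size); the function and enum names are hypothetical, and u128 arithmetic is used only to sidestep overflow in the sketch.

```rust
/// Decode a NAPOT-encoded pmpaddr into (base, size_in_bytes).
/// A value with `t` trailing one bits covers 2^(t + 3) bytes.
pub fn napot_range(pmpaddr: u64) -> (u128, u128) {
    let t = pmpaddr.trailing_ones(); // 0..=64
    let size = 1u128 << (t + 3);
    // Clear the trailing ones and the zero bit above them.
    let base_field = if t >= 63 {
        0
    } else {
        pmpaddr & !((1u64 << (t + 1)) - 1)
    };
    // pmpaddr holds address bits [..:2], so shift left by 2.
    ((base_field as u128) << 2, size)
}

#[derive(Debug, PartialEq, Eq)]
pub enum PmpMatch {
    /// No byte of the access lies inside the region.
    Miss,
    /// Every byte of the access lies inside the region.
    Full,
    /// Some, but not all, bytes lie inside: the access must fault.
    Partial,
}

/// Classify the access [addr, addr + len) against one NAPOT entry.
pub fn napot_match(pmpaddr: u64, addr: u64, len: u64) -> PmpMatch {
    let (base, size) = napot_range(pmpaddr);
    let (lo, hi) = (addr as u128, addr as u128 + len as u128);
    let (region_lo, region_hi) = (base, base + size);
    if hi <= region_lo || lo >= region_hi {
        PmpMatch::Miss
    } else if lo >= region_lo && hi <= region_hi {
        PmpMatch::Full
    } else {
        PmpMatch::Partial
    }
}
```

For a 4 KiB region at 0x8000_0000 the encoded value is (0x8000_0000 >> 2) | 0x1FF; an 8-byte access at 0x8000_0FFC crosses the upper bound and is classified as Partial.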
Trap generation
MMU and PMP produce Err(XError::PageFault) or Err(XError::BadAddress)
up the stack. RVCore maps these to the correct TrapCause:
| XError | Trap cause |
|---|---|
| PageFault { access: Load } | LoadPageFault |
| PageFault { access: Store } | StorePageFault |
| PageFault { access: Fetch } | InstPageFault |
| BadAddress { access: Load } | LoadAccessFault |
| BadAddress { access: Store } | StoreAccessFault |
This translation happens once in the step → commit_trap path;
instruction handlers just propagate ?.
MPRV
When mstatus.MPRV is set, loads / stores use mstatus.MPP as the
effective privilege. xemu routes this through
Mmu::effective_privilege(&mstatus, op) — not by clamping the actual
privilege field.
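The MPRV rule can be expressed as a pure function of mstatus, the current privilege, and the access type. This sketch uses the standard mstatus bit positions (MPRV at bit 17, MPP at bits 12:11); the free-function shape and the enum names are illustrative assumptions and differ from the Mmu::effective_privilege signature mentioned above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Privilege {
    User = 0,
    Supervisor = 1,
    Machine = 3,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MemOp {
    Fetch,
    Load,
    Store,
}

const MSTATUS_MPRV: u64 = 1 << 17;
const MSTATUS_MPP_SHIFT: u32 = 11;

/// Privilege used for translation and protection checks of one access.
/// The hart's actual privilege is left untouched.
pub fn effective_privilege(mstatus: u64, current: Privilege, op: MemOp) -> Privilege {
    // MPRV never affects instruction fetches.
    if op == MemOp::Fetch || (mstatus & MSTATUS_MPRV) == 0 {
        return current;
    }
    match (mstatus >> MSTATUS_MPP_SHIFT) & 0b11 {
        0 => Privilege::User,
        1 => Privilege::Supervisor,
        // MPP = 2 is reserved; this sketch treats it like Machine.
        _ => Privilege::Machine,
    }
}
```

Computing the effective privilege per access, rather than overwriting the privilege field, keeps trap entry and CSR permission checks on the real mode.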
Typed RAM access (Phase P6)
Hot-path loads / stores of 1 / 2 / 4 / 8 bytes bypass the generic
memmove shim:
// Pseudocode
if op.aligned() && size ∈ {1,2,4,8} {
    direct_u{size}_read(ram_slice)
} else {
    bytemuck::copy_within(...) // slow path
}
Drops the _platform_memmove + Bus::{read,write} combined bucket
from ~18 % to sub-2 % on the dhrystone / coremark / microbench
profile. See
../spec/perfHotPath/SPEC.md.
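A runnable approximation of the typed fast path, using only the standard library, is sketched below. The function name ram_read and the zero-extending Option<u64> return type are assumptions; the sketch dispatches on size alone and does not check guest alignment, which the real fast path is described as requiring.

```rust
/// Read `size` bytes at `offset` from `ram` as a little-endian value.
/// Returns None when the range is out of bounds or wider than 8 bytes.
pub fn ram_read(ram: &[u8], offset: usize, size: usize) -> Option<u64> {
    let end = offset.checked_add(size)?;
    let bytes = ram.get(offset..end)?;
    let value = match *bytes {
        // Fast path: fixed-width reads the compiler can turn into single loads.
        [a] => u64::from(a),
        [a, b] => u64::from(u16::from_le_bytes([a, b])),
        [a, b, c, d] => u64::from(u32::from_le_bytes([a, b, c, d])),
        [a, b, c, d, e, f, g, h] => u64::from_le_bytes([a, b, c, d, e, f, g, h]),
        // Slow path: odd sizes go through a zero-padded copy.
        _ if bytes.len() < 8 => {
            let mut buf = [0u8; 8];
            buf[..bytes.len()].copy_from_slice(bytes);
            u64::from_le_bytes(buf)
        }
        _ => return None,
    };
    Some(value)
}
```

The common sizes never touch the generic copy, which is the property the profile numbers above depend on.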
Traps & interrupts
Two-phase trap handling
xemu uses a single-commit-point model: architectural state only
changes when step() commits at the retire stage.
Phase 1 (raise): an instruction handler or subsystem detects a
trap condition and sets pending_trap. Control returns up the stack
via Ok(()).
Phase 2 (commit): after execute, step() inspects pending_trap;
if set, commit_trap() updates mepc/sepc, mcause/scause,
mtval/stval, mstatus/sstatus (MPP/MPIE), sets privilege,
and writes self.npc = trap_vector.
The loop then commits: self.pc = self.npc.
PendingTrap
pub struct PendingTrap {
    pub cause: TrapCause,
    pub tval: Word,
}
One canonical representation. Never carried inside XError for
architectural traps (see the err2trap discussion in
csr.md).
Delegation
xemu implements the full medeleg / mideleg model:
- Faults taken in S-mode or U-mode may be delegated to S-mode via medeleg.
- Timer / software / external interrupts are routed via mideleg.
- Delegation happens in commit_trap; handlers just supply the TrapCause.
Vectored mtvec / stvec is supported: when mtvec[1:0] = 1, async
interrupts dispatch to base + 4 * cause; synchronous traps always
jump to base.
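The vectoring rule reduces to a few lines of arithmetic on the tvec value. The sketch below assumes a free function named trap_target (hypothetical) that receives the raw mtvec or stvec value, the cause code without the interrupt bit, and a flag for asynchronous interrupts.

```rust
/// Compute the trap entry address from an mtvec / stvec value.
/// `cause_code` is the exception or interrupt number without the interrupt bit.
pub fn trap_target(tvec: u64, cause_code: u64, is_interrupt: bool) -> u64 {
    let base = tvec & !0b11; // BASE field, 4-byte aligned
    let vectored = (tvec & 0b11) == 1; // MODE field: 0 = direct, 1 = vectored
    if vectored && is_interrupt {
        base.wrapping_add(4 * cause_code)
    } else {
        // Synchronous traps, and all traps in direct mode, enter at BASE.
        base
    }
}
```

For example, with tvec = 0x8000_0101 a machine timer interrupt (cause 7) enters at 0x8000_011C, while an illegal-instruction exception enters at 0x8000_0100.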
Interrupt priority
Per the spec:
MEI > MSI > MTI > SEI > SSI > STI
check_pending_interrupts() walks in priority order, masked by
mie / sie / the global enable bit (mstatus.MIE / sstatus.SIE).
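The priority walk can be sketched as a scan over a fixed order of mip bit numbers. This is a deliberately simplified illustration: highest_pending is a hypothetical name, and a single boolean stands in for the privilege-dependent mstatus.MIE / sstatus.SIE and delegation logic that a full implementation needs.

```rust
/// mip / mie bit numbers in descending priority:
/// MEI, MSI, MTI, SEI, SSI, STI.
const PRIORITY_ORDER: [u32; 6] = [11, 3, 7, 9, 1, 5];

/// Return the bit number of the highest-priority interrupt that is both
/// pending and enabled, or None when nothing can be taken.
/// Assumption: `globally_enabled` already folds in the global enable bit
/// that applies to the current privilege mode.
pub fn highest_pending(mip: u64, mie: u64, globally_enabled: bool) -> Option<u32> {
    if !globally_enabled {
        return None;
    }
    let ready = mip & mie;
    PRIORITY_ORDER
        .iter()
        .copied()
        .find(|&bit| (ready & (1u64 << bit)) != 0)
}
```

Because the order is a constant table, the scan touches at most six bits and needs no sorting at runtime.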
Lock-free IRQ plane
Devices raise interrupts by flipping bits in a shared
Arc<AtomicU64> (IrqState). The CPU merges this into mip at
the top of each step. No locks; no vtable downcasts from the Bus
into the PLIC.
Device-to-PLIC is direct: the UART holds a reference to the
PLIC's source slot (PlicSource) and flips it on state change. No
Bus-mediated round-trip. This is the directIrq fix; see
../spec/directIrq/SPEC.md.
Edge vs level
- ACLINT MSIP / MTIP / SSIP: level-triggered by bit state.
- UART: level; !rx_fifo.is_empty() && (ier & 1).
- PLIC: level on its external sources; claim/complete exclusion prevents re-pending until the handler completes. plicGateway fixed a prior edge/level confusion; see ../spec/plicGateway/SPEC.md.
Devices
Device trait
pub trait Device: Send {
    fn read(&mut self, offset: usize, size: usize) -> XResult<Word>;
    fn write(&mut self, offset: usize, size: usize, value: Word) -> XResult;
    fn tick(&mut self) {}
    fn irq_line(&self) -> bool { false }
    fn notify(&mut self, _irq_lines: u32) {}
}
Five methods. Default-no-op for tick, irq_line, notify — device
authors override only what they need.
Bus
pub struct Bus {
    ram: Ram,
    mmio: Vec<MmioRegion>,
    plic_idx: Option<usize>,
}

struct MmioRegion {
    name: &'static str,
    range: Range<usize>,
    dev: Box<dyn Device>,
    irq_source: u32, // 0 = no IRQ
}
Every access goes through Bus::read / Bus::write:
- Fast path: RAM. Static dispatch, no vtable. Typed-read bypass for aligned 1/2/4/8-byte accesses (Phase P6).
- Slow path: MMIO. Linear scan for the covering region, then dispatch via dyn Device.
tick() split
pub fn tick(&mut self) {
    // ... ACLINT every step (fast path) ...
    // ... UART + PLIC every 64 steps (slow path) ...
}
ACLINT fires on every step because the Mtimer deadline check is on the critical path. UART and PLIC tick less frequently — their state rarely changes per-instruction.
Inside ACLINT, the Mtimer deadline gate (Phase P3) short-circuits 99.99 % of checks:
if self.mtime < self.next_fire_mtime {
    return;
}
self.check_all(); // slow path only when a deadline has arrived
IRQ collection
The Bus collects level-triggered IRQ lines:
let mut irq_lines: u32 = 0;
for r in &mut self.mmio {
    r.dev.tick();
    if r.irq_source > 0 && r.dev.irq_line() {
        irq_lines |= 1 << r.irq_source;
    }
}
if let Some(i) = self.plic_idx {
    self.mmio[i].dev.notify(irq_lines);
}
The PLIC is the only device whose notify is overridden — it
receives the full IRQ-line bitmap and evaluates MEIP/SEIP.
Per-device pages
ACLINT (MSWI / MTIMER / SSWI)
ACLINT replaces the legacy CLINT with three cleanly split sub-devices
sharing the 0x1_0000-byte region at 0x0200_0000. It is wire-compatible with
the CLINT layout for software that expects one.
See the split spec at
../spec/aclintSplit/SPEC.md.
MSWI — Machine Software Interrupt
- msip at offset 0x0000: bit 0 only.
- Writing 1 sets MSIP in irq_state; writing 0 clears it.
MTIMER — Machine Timer
- mtime at 0xBFF8 (lo) / 0xBFFC (hi): host wall clock at 10 MHz. timebase-frequency = 10_000_000.
- mtimecmp at 0x4000 (lo) / 0x4004 (hi).
- Amortized sync: wall-clock samples are taken every 512 ticks, not every step (Phase P1-era optimisation).
- Deadline gate: per-step tick() short-circuits when self.mtime < self.next_fire_mtime (Phase P3).
- When mtime ≥ mtimecmp, MTIP is raised in irq_state.
SSWI — Supervisor Software Interrupt
- setssip at 0xC000: write-only.
- Writing 1 sets SSIP in irq_state.
- Read always returns 0.
Sstc extension
xemu exposes stimecmp for Sstc — an S-mode direct timer register.
The xemu DTS advertises riscv,isa = "rv64imafdcsu_sstc"; Linux and
OpenSBI use Sstc when present, bypassing the SBI timer call.
PLIC
Platform-Level Interrupt Controller at 0x0C00_0000, 64 MiB region.
- 32 sources (source 0 is reserved "no interrupt").
- 2 contexts — context 0 = M-mode, context 1 = S-mode.
- Level-triggered on external sources.
- Per-source priority, per-context threshold and enable bitmap.
- Claim / complete with claimed-exclusion: a claimed source does not re-pend until its complete write.
See ../spec/plicGateway/SPEC.md
for the Gateway + Core split design and the level-trigger invariants.
Register layout (offsets within the PLIC base)
| Offset | Register |
|---|---|
| 0x0000_0000 | Priority[source] (32-bit per source) |
| 0x0000_1000 | Pending bitmap (32 bits, one per source) |
| 0x0000_2000 | Enable bitmap, context 0 (M) |
| 0x0000_2080 | Enable bitmap, context 1 (S) |
| 0x0020_0000 | Threshold + Claim/Complete, context 0 |
| 0x0020_1000 | Threshold + Claim/Complete, context 1 |
Update algorithm
On each bus tick the PLIC receives the current IRQ-line bitmap via
Device::notify(irq_lines: u32):
for src in 1..32:
if src in claimed: continue
if bit(irq_lines, src): pending |= (1 << src)
else: pending &= !(1 << src) # level went low
evaluate(context 0) → MEIP
evaluate(context 1) → SEIP
evaluate(ctx) finds the highest-priority enabled pending source
above threshold[ctx]. If one exists, it sets MEIP/SEIP in
irq_state; otherwise clears.
Claim / Complete
- Claim read: returns the highest-priority pending source, clears its pending bit, and records it in claimed[ctx].
- Complete write: if the value matches claimed[ctx], the slot is released and evaluate() reruns so a subsequent interrupt can re-pend.
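The claim / complete handshake for a single context might look like the sketch below. ClaimState and its methods are hypothetical names; the `best` argument stands for whatever the evaluate step selected, and returning 0 when nothing is pending follows the PLIC convention that source 0 means "no interrupt".

```rust
/// Hypothetical per-context claim state for a 32-source PLIC.
pub struct ClaimState {
    pub pending: u32,
    pub claimed: Option<u32>,
}

impl ClaimState {
    /// Claim read: hand out `best`, clear its pending bit, and record it.
    pub fn claim(&mut self, best: Option<u32>) -> u32 {
        match best {
            Some(src) => {
                self.pending &= !(1 << src);
                self.claimed = Some(src);
                src
            }
            None => 0, // nothing to claim
        }
    }

    /// Complete write: release the slot only when the value matches.
    /// Returns true when the caller should rerun evaluate().
    pub fn complete(&mut self, value: u32) -> bool {
        if self.claimed == Some(value) {
            self.claimed = None;
            true
        } else {
            false
        }
    }
}
```

A mismatched complete write is ignored here, which keeps the claimed source excluded until the guest acknowledges the right one.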
Direct IRQ delivery
Devices like UART hold a reference to their PLIC source slot and
signal state changes directly — no Bus round-trip. See
../spec/directIrq/SPEC.md.
UART 16550
National Semiconductor 16550-compatible UART at 0x1000_0000,
PLIC source 10.
Registers
| Offset | DLAB=0 read | DLAB=0 write | DLAB=1 |
|---|---|---|---|
| 0 | RBR (RX data) | THR (TX data) | DLL (divisor low) |
| 1 | IER | IER | DLM (divisor high) |
| 2 | IIR (read) | FCR (write) | IIR / FCR |
| 3 | LCR | LCR | LCR |
| 4 | MCR | MCR | MCR |
| 5 | LSR | — | LSR |
| 6 | MSR | MSR | MSR |
| 7 | SCR | SCR | SCR |
- LCR[7] is DLAB: toggles register meaning for offsets 0/1.
- LSR.DR = RX ready (derived from rx_fifo).
- LSR.THRE / LSR.TEMT = always set (TX is synchronous to stdout).
Modes
Default (stdio, batch-friendly)
- TX → host stdout.
- RX → host stdin. Non-blocking poll per tick.
PTY mode (DEBUG=y)
- TX → PTY master.
- RX → PTY master.
- Attach the slave with screen /dev/ttysXXX 115200. xemu prints the slave path at startup.
Keyboard am-test
TEST=k runs a bare-metal kernel that polls RBR and echoes to TX —
the canonical interactive smoke test.
Interrupts
irq_line() = !rx_fifo.is_empty() && (ier & 0x1).
THRE interrupts (ier & 0x2) are also supported: when the guest
writes THR and re-arms IER, the next tick promotes thre_pending
into thre_ip and re-syncs the IRQ state.
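The interrupt condition can be written out as a small sketch. The Uart struct here is a reduced stand-in, and folding thre_ip into irq_line() is an assumption made for illustration; the text above only states the RX term explicitly.

```rust
use std::collections::VecDeque;

/// IER bit 0: received-data-available interrupt enable.
pub const IER_RX_AVAIL: u8 = 0x1;
/// IER bit 1: transmitter-holding-register-empty interrupt enable.
pub const IER_THRE: u8 = 0x2;

/// Hypothetical, reduced UART state.
pub struct Uart {
    pub rx_fifo: VecDeque<u8>,
    pub ier: u8,
    pub thre_ip: bool,
}

impl Uart {
    /// Level of the IRQ line presented to the PLIC.
    pub fn irq_line(&self) -> bool {
        let rx = !self.rx_fifo.is_empty() && (self.ier & IER_RX_AVAIL) != 0;
        // Assumed: a latched THRE interrupt also drives the line when enabled.
        let tx = self.thre_ip && (self.ier & IER_THRE) != 0;
        rx || tx
    }
}
```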
Ctrl-A X
xemu intercepts the Ctrl-A X escape sequence (QEMU-style) to exit
cleanly from firmware-boot modes without needing a guest poweroff.
VirtIO-blk
MMIO legacy (version 1) transport backing the Debian rootfs.
Layout
| Region | Address | Size |
|---|---|---|
| VirtIO MMIO | 0x1000_1000 | 0x1000 |
Transport
- Legacy MMIO v1: matches Linux's virtio_mmio driver without needing the modern VIRTIO_F_VERSION_1 path.
- Split virtqueue, 128 entries.
- Synchronous DMA processing: process_dma reads descriptors from guest RAM, dispatches the request, writes status, rings the used ring.
DmaCtx
Bus-mediated guest-memory accessor — the only bridge between VirtIO code and guest RAM:
impl<'a> DmaCtx<'a> {
    pub fn read_bytes(&mut self, gpa: u64, buf: &mut [u8]) -> XResult<()>;
    pub fn write_bytes(&mut self, gpa: u64, buf: &[u8]) -> XResult<()>;
    pub fn read_le<T: LeBytes>(&mut self, gpa: u64) -> XResult<T>;
    pub fn write_le<T: LeBytes>(&mut self, gpa: u64, v: T) -> XResult<()>;
}
The LeBytes trait is the type-safe layer — no unsafe aliasing of
guest memory, just bounded &mut [u8] views through the Bus.
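One plausible shape for such a trait is sketched below. The trait body, the SIZE constant, and the free functions operating on a plain byte slice (standing in for guest RAM reached through the Bus) are all assumptions; the real LeBytes definition may differ.

```rust
/// Hypothetical shape of a little-endian conversion trait.
pub trait LeBytes: Sized {
    const SIZE: usize;
    fn from_le(bytes: &[u8]) -> Self;
    fn put_le(self, out: &mut [u8]);
}

macro_rules! impl_le_bytes {
    ($($t:ty),*) => {$(
        impl LeBytes for $t {
            const SIZE: usize = std::mem::size_of::<$t>();
            fn from_le(bytes: &[u8]) -> Self {
                let mut raw = [0u8; std::mem::size_of::<$t>()];
                raw.copy_from_slice(&bytes[..Self::SIZE]);
                <$t>::from_le_bytes(raw)
            }
            fn put_le(self, out: &mut [u8]) {
                out[..Self::SIZE].copy_from_slice(&self.to_le_bytes());
            }
        }
    )*};
}
impl_le_bytes!(u16, u32, u64);

/// Bounds-checked typed read; None when the access leaves the slice.
pub fn read_le<T: LeBytes>(ram: &[u8], offset: usize) -> Option<T> {
    let end = offset.checked_add(T::SIZE)?;
    ram.get(offset..end).map(T::from_le)
}

/// Bounds-checked typed write; None when the access leaves the slice.
pub fn write_le<T: LeBytes>(ram: &mut [u8], offset: usize, v: T) -> Option<()> {
    let end = offset.checked_add(T::SIZE)?;
    ram.get_mut(offset..end).map(|dst| v.put_le(dst))
}
```

Every access goes through a bounds-checked slice, so no pointer into guest memory is ever reinterpreted as a typed reference.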
BlkStorage
Separated from the transport state so Rust's borrow checker can split them:
struct VirtioBlk {
    transport: TransportState, // queue pointers, device status
    storage: BlkStorage,       // backing snapshot
}
process_dma borrows &mut TransportState + &mut BlkStorage
disjointly — no interior mutability, no runtime borrow tracking.
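The disjoint-borrow pattern can be demonstrated with stand-in types. The field contents (used_idx, a flat data vector) and the handle_write method are invented for this sketch; only the split into TransportState and BlkStorage comes from the text.

```rust
pub struct TransportState {
    pub used_idx: u16,
}

pub struct BlkStorage {
    pub data: Vec<u8>,
}

pub struct VirtioBlk {
    pub transport: TransportState,
    pub storage: BlkStorage,
}

/// Takes both halves as separate mutable borrows.
fn process_dma(
    transport: &mut TransportState,
    storage: &mut BlkStorage,
    sector: usize,
    payload: &[u8; 512],
) {
    let start = sector * 512;
    storage.data[start..start + 512].copy_from_slice(payload);
    transport.used_idx = transport.used_idx.wrapping_add(1);
}

impl VirtioBlk {
    pub fn handle_write(&mut self, sector: usize, payload: &[u8; 512]) {
        // Two different fields of `self` are borrowed mutably at once.
        // The borrow checker accepts this because the fields are disjoint.
        process_dma(&mut self.transport, &mut self.storage, sector, payload);
    }
}
```

Had both pieces lived in one struct passed as a single &mut, the helper could not hold independent mutable references without RefCell or similar runtime tracking.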
Two-tier reset
- Transport reset: QueueReady goes to 0; QueueSel is cleared; disk contents preserved.
- Emulator hard reset: via the test finisher; restores the disk to the snapshot recorded at load.
Debian image
resource/xemu-debian.img is a 4 GiB ext4 filesystem with Debian 13
Trixie pre-installed. The build system downloads it on the first make debian.
Performance: hot path & baselines
Short answer
Over five phases (P1 + P3 + P4 + P5 + P6) the user-time per benchmark dropped by ~57–62 % vs the pre-P1 baseline:
| Benchmark | Pre-P1 | Post-hotPath | Δ |
|---|---|---|---|
| Dhrystone | 8.09 s | 3.48 s | −57 % |
| CoreMark | 14.02 s | 5.82 s | −58 % |
| MicroBench | 85.82 s | 32.91 s | −62 % |
See ../PROGRESS.md §Phase 9 for the full table
and ../spec/perfBusFastPath/SPEC.md,
../spec/perfHotPath/SPEC.md for
per-phase design.
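The Δ column follows directly from the two timing columns. A small helper (delta_percent is a name chosen for this example) reproduces the rounding:

```rust
/// Percentage change from `before` to `after`, rounded to the nearest integer.
pub fn delta_percent(before: f64, after: f64) -> i64 {
    ((after - before) / before * 100.0).round() as i64
}
```

For Dhrystone, (3.48 − 8.09) / 8.09 ≈ −0.570, which rounds to −57 %; the CoreMark and MicroBench rows work out the same way.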
Where time goes today
On the post-hotPath profile, the dominant buckets are roughly:
| Bucket | Share | Character |
|---|---|---|
| xdb::main (dispatch + decode + execute) | ~30 % | Interpreter core |
| MMU entry (checked_* + access_bus) | ~10 % | Per load/store |
| Mtimer deadline gate | <1 % | Per-step (post-P3) |
| Typed RAM access | <2 % | Per load/store (post-P6) |
| Device ticks (UART / PLIC / VirtIO) | <1 % | Slow path, every 64 steps |
The pre-P1 baseline had pthread_mutex_* at 33–40 % — now 0 %
(Bus is owned, not behind Arc<Mutex<_>>).
The five landed phases
| Phase | Subject | Win | Risk |
|---|---|---|---|
| P1 busFastPath | Drop Arc<Mutex<Bus>>, own inline | −45…−52 % wall | Low |
| P3 Mtimer deadline | Cache next_fire_mtime, short-circuit tick | Mtimer bucket → <1 % | Very low |
| P4 icache | Per-hart decoded-inst cache, 4 K entries | xdb::main bucket −10 pp | Medium (invalidation) |
| P5 MMU inline | #[inline] pressure through fast path | MMU bucket −3 pp | Low |
| P6 memmove bypass | Typed reads on aligned 1/2/4/8-byte accesses | memmove bucket → <2 % | Low-Medium (unsafe) |
Measurement pipeline
Always run from ProjectX/ root:
bash scripts/perf/bench.sh # → docs/perf/baselines/<today>/data/bench.csv
bash scripts/perf/sample.sh # → <today>/data/<workload>.sample.txt
python3 scripts/perf/render.py # → <today>/graphics/*.svg
- 3 runs per workload: user_s is the stable metric, real_s is noisy on macOS under system load.
- Use DEBUG=n. PTY mode perturbs timing.
- Commit data/ and graphics/ with the phase's MASTER document.
Phase exit gate pattern
A phase is not done until:
- cargo test --workspace + make linux + make debian all green (and -2hart variants where applicable).
- bench.sh rerun (3 iters per workload).
- sample.sh rerun for each of the three benches.
- Per-phase exit gate hit with ≥ 1 pp margin on the bucket it targets.
- REPORT.md deltas committed to the phase's archived MASTER.
What's next
- P7 multi-hart re-profile — pending; shapes the Phase 11 SMP work. Not an optimisation in itself — a measurement task.
- Phase 11 (RFC) — true per-hart OS threads. Requires atomic RAM, per-hart reservations, per-device MMIO locking. Not in any current perf phase.
Multi-hart
Today's multi-hart is a single-threaded cooperative round-robin
scheduler in CPU::step. N harts are no faster than 1. The
abstraction exists so the ISA code can reason about per-hart state,
not because the host is running them in parallel.
See ../spec/multiHart/SPEC.md for
the Hart abstraction design.
What's shared, what's per-hart
| Shared | Per-hart |
|---|---|
| Bus (RAM + all devices) | GPR / PC / NPC |
| ACLINT mtime (one host wall-clock source) | CsrFile |
| PLIC state (2 contexts route to 2 harts) | privilege |
| IrqState Arc<AtomicU64> (one set of mip/mie bits per hart) | mmu, pmp |
| | icache |
| | pending_trap |
Per-hart icache
Each hart has its own 4 K-entry direct-mapped decoded-instruction cache. A
satp write on one hart does not flush the other hart's icache,
because each has its own ctx_tag. An sfence.vma aimed at one hart
could be confined to that hart in the same way, but the current
implementation flushes both harts' icaches on any sfence.vma for
simplicity (conservative, correct).
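A direct-mapped cache with a context tag can be sketched as below. Everything except the 4 K size and the ctx_tag name is an assumption: the Line layout, indexing by pc >> 1 (chosen because compressed instructions are 2-byte aligned), and a u32 standing in for the decoded instruction.

```rust
const ENTRIES: usize = 4096; // 4 K lines, power of two

#[derive(Clone, Copy, Default)]
struct Line {
    valid: bool,
    pc: u64,
    ctx_tag: u64,
    decoded: u32, // stand-in for a decoded instruction
}

/// Hypothetical per-hart decoded-instruction cache.
pub struct ICache {
    lines: Vec<Line>,
    ctx_tag: u64,
}

impl ICache {
    pub fn new() -> Self {
        ICache { lines: vec![Line::default(); ENTRIES], ctx_tag: 0 }
    }

    fn index(pc: u64) -> usize {
        ((pc >> 1) as usize) & (ENTRIES - 1)
    }

    /// Hit only when the line is valid, matches the PC, and was filled
    /// under the current context tag.
    pub fn lookup(&self, pc: u64) -> Option<u32> {
        let line = &self.lines[Self::index(pc)];
        let hit = line.valid && line.pc == pc && line.ctx_tag == self.ctx_tag;
        hit.then_some(line.decoded)
    }

    pub fn insert(&mut self, pc: u64, decoded: u32) {
        self.lines[Self::index(pc)] =
            Line { valid: true, pc, ctx_tag: self.ctx_tag, decoded };
    }

    /// Lazily invalidates every line of this hart, e.g. on a satp write.
    pub fn bump_ctx(&mut self) {
        self.ctx_tag = self.ctx_tag.wrapping_add(1);
    }
}
```

Because each hart owns its own ICache value, bumping the tag on one hart leaves the other hart's lines untouched.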
Running
cd resource
make linux-2hart # 2 harts, cooperative scheduler
make debian-2hart # same, with VirtIO rootfs
Both cores share the same Bus instance. The scheduler gives each hart a slice of steps in round-robin order before rotating.
Why single-threaded today
P1 (busFastPath) removed the Arc<Mutex<Bus>> that was dead weight
under the cooperative scheduler: there is no real SMP, so the mutex
was pure overhead. Removing it cut wall-clock time by 45–52 %.
True SMP (Phase 11 RFC)
Not in any landed phase. To get per-hart OS threads:
- Guest RAM becomes &[AtomicU8] (or unsafe typed access with explicit fences).
- LR/SC reservations become per-hart AtomicUsize.
- Per-device fine-grained sync (or the QEMU MTTCG "BQL on MMIO only" model).
- A runtime that joins / cancels hart threads cleanly.
None of this fits in the perf roadmap. See
../PROGRESS.md §Phase 11 for reference designs
(QEMU MTTCG, rv8, Guo 2019 on fast TLB simulation).
Pre-conditions before opening Phase 11
- P1, P2 (bus-access API), P5 (MMU inline) shipped. Done.
- A reproducible 2-hart Linux benchmark in docs/perf/baselines/<date>/ showing the fraction of time actually parallelisable. Not yet measured.
- P7 re-profile results.
Supported ISA
xemu implements the RISC-V unprivileged ISA plus the privileged
model, across both RV32 and RV64 via cfg_if.
Base + standard extensions
| Ext | Description | RV32 | RV64 |
|---|---|---|---|
| I | Base integer | ✅ | ✅ |
| M | Multiply / divide | ✅ | ✅ |
| A | Atomic (LR/SC + 9 AMO ops, .w and .d) | ✅ | ✅ |
| F | Single-precision float | ✅ | ✅ |
| D | Double-precision float | ✅ | ✅ |
| C | Compressed | ✅ | ✅ |
| Zicsr | CSR access | ✅ | ✅ |
| Zifencei | fence.i | ✅ | ✅ |
DTS advertisement: riscv,isa = "rv64imafdcsu_sstc".
Privileged ISA
- M / S / U modes with full trap delegation (medeleg / mideleg).
- Vectored and direct mtvec / stvec.
- mret / sret with MPRV handling.
- Sstc: S-mode direct stimecmp.
MMU
| Mode | Support |
|---|---|
| Bare (identity) | ✅ |
| Sv32 (RV32) | ✅ |
| Sv39 (RV64) | ✅ — hardware A/D bit update |
| Sv48 | Descriptor exists; write to satp masks it off |
| Sv57 | Not wired |
- TLB: 64-entry direct-mapped, ASID-tagged, global-page aware.
- PMP: 16 entries, TOR / NA4 / NAPOT, lock semantics, partial-overlap detection.
Float details
- softfloat_pure: pure Rust Berkeley softfloat-3.
- NaN-boxing for F operands when D is also active.
- fcsr / fflags / frm are shifted subfield aliases of one canonical fcsr (see CSR subsystem).
- mstatus.FS tracked as Off / Initial / Clean / Dirty, with SD recomputed on every mstatus / fcsr write.
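NaN-boxing as defined by the RISC-V F/D specification can be illustrated in a few lines. The function names are chosen for this sketch and are not taken from xemu: a single-precision value stored in a 64-bit FP register has its upper 32 bits set to all ones, and a register that is not properly boxed reads back as the canonical quiet NaN.

```rust
/// Canonical single-precision quiet NaN.
const CANONICAL_NAN_F32: u32 = 0x7fc0_0000;

/// Store an f32 in a 64-bit FP register: upper 32 bits all ones.
pub fn nan_box(value: f32) -> u64 {
    0xffff_ffff_0000_0000 | u64::from(value.to_bits())
}

/// Read an f32 operand: improperly boxed inputs become the canonical NaN.
pub fn nan_unbox(reg: u64) -> f32 {
    if reg >> 32 == 0xffff_ffff {
        f32::from_bits(reg as u32)
    } else {
        f32::from_bits(CANONICAL_NAN_F32)
    }
}
```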
What's not implemented
- V (vector) — RVV is not supported.
- H (hypervisor) — no HS-mode.
- Zba / Zbb / Zbc / Zbs (bit-manipulation) — deferred.
- Zicbom / Zicboz (cache management) — no caches modelled.
- Svnapot / Svpbmt — not wired.
Instruction table
For the full per-mnemonic implementation status, see
../../spec/inst/SPEC.md.
Device memory map
Default xemu machine layout, QEMU-virt-compatible in shape with documented deltas.
| Device | Base | Size | IRQ |
|---|---|---|---|
| Test finisher (test-only) | 0x0010_0000 | 0x10 | — |
| ACLINT | 0x0200_0000 | 0x1_0000 | — |
| PLIC | 0x0C00_0000 | 0x400_0000 | — |
| UART0 (NS16550) | 0x1000_0000 | 0x100 | 10 |
| VirtIO MMIO (Debian target) | 0x1000_1000 | 0x1000 | 1 |
| RAM | 0x8000_0000 | 128 MiB (tests) / 1 GiB (Linux) | — |
Intentional deltas from QEMU virt
- ACLINT replaces CLINT. Wire-compatible MMIO layout; offers clean MSWI / MTIMER / SSWI split.
- Test finisher is test-only. Not wired into the default machine used by Linux / Debian.
- timebase-frequency = 10_000_000 (10 MHz), matching the host wall-clock sampling rate.
PLIC source assignments
| Source | Owner |
|---|---|
| 0 | "no interrupt" (reserved) |
| 1 | VirtIO-blk |
| 10 | UART0 |
Higher source numbers are reserved for future devices.
IrqState bitmap
Arc<AtomicU64> where:
| Bit | mip name | Writer |
|---|---|---|
| 1 | SSIP | ACLINT SSWI |
| 3 | MSIP | ACLINT MSWI |
| 7 | MTIP | ACLINT MTIMER |
| 9 | SEIP | PLIC context 1 |
| 11 | MEIP | PLIC context 0 |
sync_interrupts() on CPU step merges this into mip.
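A simplified view of the merge, under stated assumptions: set_line and this free-function form of sync_interrupts are illustrative names and signatures, and the sketch treats all five bits as owned by irq_state (it ignores software writes to SSIP, which a real implementation has to reconcile).

```rust
use std::sync::atomic::{AtomicU64, Ordering};

pub const SSIP: u64 = 1 << 1;
pub const MSIP: u64 = 1 << 3;
pub const MTIP: u64 = 1 << 7;
pub const SEIP: u64 = 1 << 9;
pub const MEIP: u64 = 1 << 11;

/// Bits that devices drive through the shared bitmap.
const HW_BITS: u64 = SSIP | MSIP | MTIP | SEIP | MEIP;

/// Device side: raise or lower one interrupt line.
pub fn set_line(irq_state: &AtomicU64, bit: u64, level: bool) {
    if level {
        irq_state.fetch_or(bit, Ordering::Relaxed);
    } else {
        irq_state.fetch_and(!bit, Ordering::Relaxed);
    }
}

/// CPU side: replace the device-driven bits of mip with the current
/// bitmap and leave every other bit untouched.
pub fn sync_interrupts(mip: u64, irq_state: &AtomicU64) -> u64 {
    (mip & !HW_BITS) | (irq_state.load(Ordering::Relaxed) & HW_BITS)
}
```

Devices never touch the CSR file directly; they only flip bits in the atomic, and the hart picks them up at its next step.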
Boot layout (where the ELF lands)
- Bare-metal tests: entry at 0x8000_0000.
- xv6: entry at 0x8000_0000 (M-mode).
- Linux / Debian: OpenSBI lands at 0x8000_0000 (M-mode), then jumps to the kernel at 0x8020_0000 (S-mode).
- FDT: BootLayout::fdt_addr persists the DTB address so the kernel can find it at a1 on entry.
Environment variables
Recognised by the make run / make linux / etc. entry points.
| Var | Values | Default | Effect |
|---|---|---|---|
| DEBUG | y / n | n | y routes UART to a PTY, enables richer logging, and turns off release optimisations. Always set DEBUG=n when benchmarking. |
| LOG | trace / debug / info / warn / error / off | info | xlogger verbosity. trace is per-instruction. |
| X_HARTS | integer ≥ 1 | 1 | Guest hart count (cooperative scheduler). |
| X_FILE | path | set by per-target Makefile | ELF to execute. Don't set manually; let make run resolve it. |
| DIFFTEST | 0 / 1 | 0 | Compile-in QEMU / Spike difftest backends. |
| AM_HOME | path | ${workspace}/xam | Where xam HAL sources live. |
| XEMU_HOME | path | ${workspace}/xemu | Where xemu workspace lives. |
| XLIB_HOME | path | ${workspace}/xlib | Where xlib (klib) sources live. |
CI-only
| Var | Effect |
|---|---|
| ECC_DISABLED_HOOKS | Disable specific Everything-Claude-Code plugin hooks by hook ID. |
| ECC_HOOK_PROFILE | minimal / standard / strict (coarse toggle). |
Runtime (xdb REPL)
Not env vars — commands inside the monitor. See The xdb debugger.
xam HAL
xam is the bare-metal HAL (abstract-machine) that kernels link
against. It exposes a minimal set of primitives that xemu knows how
to service.
Layout
xam/
├── include/ HAL headers (C + Rust bindings)
├── src/ implementations (arch-agnostic + riscv-specific)
└── scripts/ build_c.mk, link.ld, cross-target cargo support
API
Console
void _putch(char ch); // write one byte to UART TX
Used by xlib's stdio.c to back printf.
Time
uint64_t mtime(void); // read ACLINT mtime
void set_mtimecmp(uint64_t t); // set MTIMECMP for this hart
uint64_t uptime(void); // microseconds since boot
uptime() is derived from mtime() divided by 10 (10 MHz clock).
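The conversion is plain integer arithmetic. The HAL function itself is C; the constant and function names below are illustrative only:

```rust
/// Timebase advertised in the DTS: 10 MHz.
const TIMEBASE_HZ: u64 = 10_000_000;
/// Ticks per microsecond: 10_000_000 / 1_000_000 = 10.
const TICKS_PER_US: u64 = TIMEBASE_HZ / 1_000_000;

/// Microseconds since boot for a given mtime reading (truncating).
pub fn uptime_us(mtime_ticks: u64) -> u64 {
    mtime_ticks / TICKS_PER_US
}
```

One second of guest time is 10,000,000 ticks, so uptime_us(10_000_000) is 1,000,000 µs; readings below 10 ticks truncate to 0.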
Trap entry
pub struct TrapFrame {
    pub regs: [usize; 32],
    pub sstatus: usize,
    pub sepc: usize,
    pub scause: usize,
    pub stval: usize,
}

pub fn init_trap(handler: fn(&mut TrapFrame));
Guest sets the handler once at boot; xemu's trap dispatch lands on it with a populated frame.
Main-args
extern "C" {
    static mainargs: *const u8; // compile-time strings
}
Useful for passing test identifiers into a single kernel binary.
Linker symbols
_heap_start — start of the heap (end of .bss)
_heap_end — end of the heap (derived from RAM size)
MMIO constants
Device addresses match Device memory map.
Building a kernel with xam
cd xkernels/tests/your-kernel
make run
The xam/scripts/build_c.mk and build_rs.mk wrappers handle the
cross-compilation and link script automatically. No target-specific
flags needed in your kernel's Makefile.
xlib (klib)
Freestanding C library for programs built by xam and run on xemu.
Modelled after NEMU's abstract-machine klib — minimal, deterministic,
platform-independent.
See ../../spec/klib/SPEC.md for the
design.
What's included
<string.h> — string.c
void *memset(void *s, int c, size_t n);
void *memcpy(void *dst, const void *src, size_t n);
void *memmove(void *dst, const void *src, size_t n);
int memcmp(const void *s1, const void *s2, size_t n);
size_t strlen(const char *s);
char *strcpy(char *dst, const char *src);
char *strncpy(char *dst, const char *src, size_t n);
char *strcat(char *dst, const char *src);
int strcmp(const char *s1, const char *s2);
int strncmp(const char *s1, const char *s2, size_t n);
char *strchr(const char *s, int c);
char *strrchr(const char *s, int c);
<stdio.h> — stdio.c + format.c
int printf(const char *fmt, ...);
int sprintf(char *buf, const char *fmt, ...);
int snprintf(char *buf, size_t size, const char *fmt, ...);
int vsprintf(char *buf, const char *fmt, va_list ap);
int vsnprintf(char *buf, size_t size, const char *fmt, va_list ap);
int puts(const char *s);
int putch(char ch);
Format specifiers: %d %i %u %x %X %s %c %p %o %%, with l / ll
length modifiers, field width, 0-padding, left-alignment.
No floating-point printf.
<assert.h>
#define assert(x) ...
C- and C++-compatible (carries extern "C" guards).
<stdlib.h> — stdlib.c
int atoi(const char *s);
int abs(int x);
void srand(unsigned seed);
int rand(void);
No malloc / free — cpu-tests don't need them, benchmarks use
local allocators.
<ctype.h> — ctype.c
isspace, isdigit, isalpha, isalnum, toupper, tolower,
etc. Standard shapes.
What's not included
- POSIX APIs, FILE * streams.
- Floating-point printf.
- Locale support.
- Thread-safe allocation.
This is intentional — xlib targets bare-metal test and benchmark kernels, not a hosted C environment.
Using from your kernel
#include <klib.h> /* umbrella header */
This pulls in <stddef.h>, <stdint.h>, <stdbool.h>, <stdarg.h>,
<string.h>, <stdio.h>, <stdlib.h>, <ctype.h>. The xam build
system prepends -I$(XLIB_HOME)/include before system includes.
klib-macros.h
Convenience macros used by benchmarks:
#define LENGTH(arr) (sizeof(arr) / sizeof((arr)[0]))
#define ROUNDUP(x, n) (((x) + (n) - 1) & ~((n) - 1))
#define ROUNDDOWN(x, n) ((x) & ~((n) - 1))
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))
Workflow overview
ProjectX uses a spec- and doc-driven iteration workflow. The
canonical rules live in /AGENTS.md; this page
explains the shape at a glance.
Three locations per feature
- docs/tasks/<feature>/: in-flight workspace. Holds NN_PLAN.md / NN_REVIEW.md / NN_MASTER.md rounds as the design converges.
- docs/spec/<feature>/SPEC.md: landed canonical spec. Authored by extracting the final PLAN's ## Spec section (Goals / Architecture / Invariants / Data Structure / API Surface / Constraints).
- docs/archived/<category>/<feature>/: iteration history, moved out of tasks/ once the feature lands.
Iteration loop
Per feature, up to 5 rounds (00 – 04):
plan-executor → NN_PLAN.md
(main session stops)
external reviewer (codex / human) → NN_REVIEW.md
(optional) user → NN_MASTER.md
→ next plan-executor, or → implementation
Loop cap. If the reviewer returns APPROVED earlier (no CRITICAL / HIGH findings) or after round 04, proceed to implementation. Any surviving MEDIUM / LOW findings are addressed inline during implementation.
Implementation
- Implementation (code and NN_IMPL.md) is authored by the main session directly, not by a sub-agent.
- There is no post-implementation review artifact. Audit findings are applied inline in the same session.
Categories at landing
When a feature lands, choose the archive category that matches the dominant intent:
| Category | Trigger |
|---|---|
| feat | New user-visible or API-visible capability |
| fix | Bug or MANUAL_REVIEW finding that isn't a reorg |
| refactor | Reshape code without changing behavior |
| perf | Measurable speedup under a published exit gate |
| review | Audit / retrospective not tied to one feature |
When a task has mixed intent, split it.
Further reading
- Opening a new feature
- Writing a SPEC
- Adding a benchmark
- /AGENTS.md: canonical workflow spec
- docs/tasks/README.md: active-feature lifecycle and category heuristics
Opening a new feature
Step-by-step guide for starting a new feature. Assumes you've read Workflow overview.
1. Pick a name
A short camelCase identifier — no spaces, no slashes. Examples:
vgaConsole, perfIcacheV2, rvv, sbiDebug.
2. Create the task workspace
mkdir -p docs/tasks/<feature>
cp docs/template/PLAN.template docs/tasks/<feature>/00_PLAN.md
cp docs/template/REVIEW.template docs/tasks/<feature>/00_REVIEW.md
cp docs/template/MASTER.template docs/tasks/<feature>/00_MASTER.md
Create all three files at the start of the round, even if some are empty. Reviewer and user fill them in turn.
3. Author 00_PLAN.md
Dispatch the plan-executor sub-agent from the main session. The
sub-agent produces the plan; the main session never authors it
directly.
The PLAN must include:
- ## Summary: one paragraph.
- ## Log: reviewer-facing changelog. Start empty for round 00.
- ## Spec with [**Goals**] / [**Architecture**] / [**Invariants**] / [**Data Structure**] / [**API Surface**] / [**Constraints**].
- ## Implement: step-by-step engineering plan.
- ## Trade-offs: what was considered and rejected.
- ## Validation: test plan with real code sketches.
4. Get NN_REVIEW.md
The main session stops after round 00's PLAN. You invoke an
external reviewer (codex / human) to produce 00_REVIEW.md.
Classify findings: CRITICAL / HIGH / MEDIUM / LOW.
5. Optional NN_MASTER.md
If you want to override the review or add binding directives, write
00_MASTER.md yourself. MUST directives are binding on the next
PLAN; SHOULD directives need explicit response if rejected.
6. Iterate
Signal the main session to dispatch round 01. The next PLAN must:
- Have a Response Matrix mapping every prior CRITICAL / HIGH finding + MASTER directive to a resolution.
- Address all MASTER MUST directives unconditionally.
7. Implement
After the final approved PLAN (up to round 04), the main session
authors the code changes and NN_IMPL.md directly. Include:
- What shipped vs what was planned.
- Deviations from the plan, with justification.
- Validation results — tests run, exit gates met.
8. Land
- Extract SPEC. Copy the final PLAN's ## Spec section into docs/spec/<feature>/SPEC.md.
- Archive. git mv docs/tasks/<feature> docs/archived/<category>/<feature>.
- Update PROGRESS.md: add the landed feature to the appropriate phase or task table.
Do-nots
- Don't edit previous iteration documents. Always create the next numbered file.
- Don't silently deviate during implementation. If the design changes meaningfully, open a new iteration.
- Don't dispatch reviewer sub-agents for the PLAN review — reviews are external, out-of-session.
Writing a SPEC
The SPEC.md is the landed, canonical description of a feature.
It's extracted from the final PLAN's ## Spec section after the
feature implementation lands.
See ../../template/SPEC.template
for the canonical shape.
Sections
[**Goals**]
What the feature provides, numbered G-1, G-2, ... Each goal is a
one-sentence claim about observable behaviour or a measurable
threshold.
- G-1: All 31 cpu-tests-rs pass with the new MMU implementation.
- G-2: Linux boots to initramfs shell in ≤ 5 seconds on the M4 host.
Follow Goals with Non-Goals NG-1, NG-2, ... — what the feature
explicitly does not cover.
[**Architecture**]
A prose + diagram description of the component's shape. ASCII diagrams are fine; keep them under 80 columns. Show the data-flow arrows, not just boxes.
[**Invariants**]
Numbered I-1, I-2, ... Properties that must hold at all times
across all execution paths.
- I-1: mip hardware bits are modified only via irq_state merge.
- I-2: Tick order: bus.tick → sync → check → fetch → execute → retire.
- I-3: Claimed PLIC sources are excluded from re-pending until complete.
[**Data Structure**]
Core types — structs, enums, traits — with real Rust syntax. This is the type-level signature of the feature.
pub struct Aclint {
    epoch: Instant,
    mtime: u64,
    msip: u32,
    mtimecmp: u64,
    irq_state: Arc<AtomicU64>,
}
[**API Surface**]
Public function signatures and their contracts.
/// Read a word at `addr`. Returns `BadAddress` for unmapped paddrs
/// or `PageFault` for unmapped vaddrs.
pub fn checked_read(&mut self, addr: VirtAddr, size: usize) -> XResult<Word>;
[**Constraints**]
Numbered C-1, C-2, ... Things that would look like bugs but are
intentional design boundaries.
- C-1: xemu internal layout matches QEMU-virt in shape; ACLINT replaces CLINT.
- C-2: Single hart (cooperative scheduler).
- C-3: UART byte-access only; word writes raise SizeMismatch.
Extraction from PLAN
When a feature lands:
- Read the final NN_PLAN.md.
- Locate its ## Spec section.
- Copy everything between ## Spec and the next ## heading into docs/spec/<feature>/SPEC.md.
- Prepend a banner:
# `<feature>` SPEC
> Source: [`/docs/archived/<cat>/<feature>/NN_PLAN.md`](...) —
> iteration history lives under `docs/archived/<cat>/<feature>/`.
---
- Commit the SPEC in the same PR as the IMPL.
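The copy step is mechanical enough to script. The helper below is a hypothetical sketch, not part of the repository's tooling; it returns the lines between the ## Spec heading and the next level-2 heading, and it does not try to skip ## lines that happen to sit inside fenced code blocks.

```rust
/// Returns the text between the `## Spec` heading and the next `## ` heading,
/// or None when the plan has no `## Spec` section.
pub fn extract_spec(plan: &str) -> Option<String> {
    let mut body = Vec::new();
    let mut in_spec = false;
    for line in plan.lines() {
        if line.starts_with("## ") {
            if in_spec {
                break; // reached the heading that follows the Spec section
            }
            in_spec = line.trim_end() == "## Spec";
            continue;
        }
        if in_spec {
            body.push(line);
        }
    }
    if in_spec {
        Some(body.join("\n"))
    } else {
        None
    }
}
```

Level-3 headings such as ### inside the section are kept, since only lines starting with exactly two hashes and a space end the section.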
Pre-workflow features
Some features (e.g. csr, klib, mm) predate the template. Their
SPEC.md contains the original pre-workflow design verbatim with a
banner flagging it. Do not rewrite until the feature next sees
meaningful iteration — the rewrite is its own task.
Updating a SPEC
When a feature iterates, the new PLAN's Response Matrix addresses
all prior CRITICAL / HIGH findings; the implementation lands; the
SPEC is replaced with the new round's ## Spec. Never hand-edit
the SPEC in isolation.
Adding a benchmark
Existing benchmarks: Dhrystone, CoreMark, MicroBench. Adding a new one means a new test kernel + a measurement entry in the perf pipeline.
Kernel
Create a new directory under xkernels/tests/benchmark/<name>/:
xkernels/tests/benchmark/<name>/
├── Makefile # uses xam/scripts/build_c.mk or build_rs.mk
├── src/
│ └── main.c # or .rs — the benchmark itself
└── README.md # what it measures, expected score range
The Makefile should delegate to xam's build system. Link against
xlib for printf / memcpy.
Exit via the SiFive test finisher:
#include <klib.h>
extern void xam_halt(int code);
int main() {
uint64_t t0 = uptime();
/* ... work ... */
uint64_t t1 = uptime();
printf("score = %llu\n", (unsigned long long)compute_score(t1 - t0)); /* compute_score is benchmark-defined */
xam_halt(0);
return 0;
}
Measurement pipeline
Add the new benchmark to scripts/perf/bench.sh so CI / manual
runs capture it:
BENCHES=(dhrystone coremark microbench <name>)
Also teach scripts/perf/sample.sh how to capture its sample
profile (usually the same per-workload path).
Baseline
After landing:
- Run bash scripts/perf/bench.sh --runs 3. This writes the new workload into docs/perf/baselines/<today>/data/bench.csv.
- Run bash scripts/perf/sample.sh to produce the sample traces.
- Run python3 scripts/perf/render.py for the SVG flamegraphs.
- Commit the new data/ + graphics/ files.
Reporting in PROGRESS.md
Add a row to the "Benchmark" table in the project root README.md
if the benchmark is user-facing enough to publish. Update
../PROGRESS.md §Phase 9 if the workload
introduces a new cost centre worth tracking per-phase.
Performance hygiene
- Don't add a benchmark that depends on wall-clock non-determinism (interrupt timing, stdin blocking). Use deterministic work loops.
- Use uptime() (microseconds) for in-guest timing; it's derived from ACLINT mtime, which is frozen during xdb pause.
- Take 3 runs for the published number; user_s is the stable metric.