Introduction

xemu is a RISC-V system emulator written in Rust. It boots OpenSBI, xv6, Linux, and Debian 13 to an interactive shell, and is the execution core of the ProjectX computer system project.

What xemu is

  • A full-system emulator (M / S / U privilege modes, MMU, devices)
  • RV32 / RV64 dual-target — single codebase via cfg
  • Supports the IMAFDC + Zicsr + Zifencei extensions, plus the privileged ISA with Sv32 / Sv39 MMU and PMP
  • Ships a debugger (xdb) with breakpoints, watchpoints, expression evaluation, disassembly, and reference-comparison differential testing against QEMU and Spike
  • Uses a clean, trait-based device bus with lock-free single-hart hot path

What xemu is not

  • Not a JIT. It's a pure interpreter with a decoded-instruction cache.
  • Not a cycle-accurate simulator — it's faithful to architectural semantics, not microarchitectural timing.
  • Not hardware-specific — it models the QEMU-virt-like platform with documented deltas.

How this book is organised

  • Getting Started — build xemu, run your first kernel.
  • Usage — drive each boot target, use the debugger, run benchmarks.
  • Internals — how the CPU, memory subsystem, and devices are implemented.
  • Reference — tables of supported ISA, memory map, environment variables, and HAL API.
  • Contributing — the iteration workflow and how to propose a new feature.

For the development roadmap and landed-phase status, see ../PROGRESS.md. For feature-level specifications, see ../spec/.

Overview

Top-level layout

ProjectX/
├── xemu/           RISC-V emulator
│   ├── xcore/       execution engine — CPU, MMU, devices, bus
│   ├── xdb/         debugger / monitor (REPL, breakpoints, difftest)
│   └── xlogger/     logging — colored, levelled, per-instruction trace
├── xam/            bare-metal HAL (abstract-machine) — targets xemu
├── xlib/           freestanding C library (klib) — printf, string, stdio
├── xkernels/       test kernels
│   └── tests/       am-tests, cpu-tests, alu-tests, benchmarks
├── resource/       external boot artifacts — OpenSBI, xv6, Linux, Debian
├── scripts/        CI + perf measurement scripts
└── docs/           this documentation

Component relationships

xkernel source (C / Rust)
    │
    │ compile with
    ▼
xam HAL  +  xlib (klib)
    │
    │ produces
    ▼
ELF image
    │
    │ loaded by
    ▼
xemu (xdb binary)
    │
    │ executes through
    ▼
xcore: CPU → MMU → Bus → Devices (ACLINT / PLIC / UART / VirtIO)

Crates at a glance

| Crate | Role | Key types |
|---|---|---|
| xcore | Execution engine | CPU, RVCore, Bus, Mmu, Pmp, Aclint, Plic, Uart, VirtioBlk |
| xdb | Binary + debugger | Monitor, breakpoint / watchpoint tables, command REPL |
| xlogger | Log facade | trace! / debug! / info! macros with color + timestamp |
| xam | Guest HAL | _putch, mtime, uptime, init_trap, TrapFrame |
| xlib | Guest C library | printf, memset, memcpy, strlen, strcmp, assert.h |

Boot target summary

| Target | Make command | Firmware | Rootfs |
|---|---|---|---|
| am-tests | cd xkernels/tests/am-tests && make run | none (bare) | (n/a) |
| xv6 | cd resource && make xv6 | xv6 bootstrap | ramdisk |
| Linux | cd resource && make linux | OpenSBI v1.3.1 | initramfs |
| Linux SMP | cd resource && make linux-2hart | OpenSBI | initramfs |
| Debian 13 | cd resource && make debian | OpenSBI + bootlin kernel | ext4 over VirtIO-blk |
| Debian SMP | cd resource && make debian-2hart | OpenSBI + bootlin kernel | ext4 over VirtIO-blk |

See the Boot targets pages for details on each.

Building xemu

Prerequisites

  • Rust toolchain — auto-detected from rust-toolchain.toml (nightly).
  • C cross-compiler — riscv64-unknown-linux-musl for building guest C programs. On macOS, install via brew or fetch from cross-tools/musl-cross.
  • axconfig-gen — cargo install axconfig-gen (cached in ~/.cargo/bin).
  • clang-format — system package, used by the fmt CI job.

Build modes

| Mode | Invocation | Notes |
|---|---|---|
| Release (default) | make run | LTO + codegen-units = 1. Use for benchmarks. |
| Debug | DEBUG=y make run | Faster to compile; slower at runtime. |
| Difftest-enabled | DIFFTEST=1 make run | Links the QEMU / Spike comparison backends. |

Always set DEBUG=n before benchmarking.

Supported hosts

  • macOS (Apple Silicon, Intel) — primary development target.
  • Linux (x86_64, aarch64) — CI target. samply profiling works without entitlement on Linux.

Windows is not supported.

Cargo workspace

xemu is a single Cargo workspace at xemu/. Build the whole thing:

cd xemu
cargo build --release
cargo test --workspace

The resulting binary is xemu/target/release/xdb. In normal development you don't invoke xdb directly — make run from a kernel directory wires up the right X_FILE and launch flags.

Running your first kernel

The fastest way to see xemu work is to run the am-tests suite — bare-metal kernels that exercise the HAL, UART, ACLINT, PLIC, CSRs, traps, and interrupts.

cd xkernels/tests/am-tests
make run

You should see UART output, a summary line per sub-test, and a clean SiFive test-finisher exit.

Running a single am-test

cd xkernels/tests/am-tests
make run TEST=u    # UART echo
make run TEST=c    # CSR sanity
make run TEST=t    # trap + interrupt routing
make run TEST=k    # keyboard — interactive PTY echo

See xkernels/tests/am-tests/src/tests/ for the full set.

Running the cpu-tests

Two parallel suites — Rust (cpu-tests-rs, 31 tests) and C (cpu-tests-c, 35 tests):

cd xkernels/tests/cpu-tests-rs && make run
cd xkernels/tests/cpu-tests-c  && make run

Both are bare-metal — no OS, just instruction-sequence fixtures.

Troubleshooting

"cannot find riscv64-unknown-linux-musl-gcc" — install the cross-compiler and export its bin/ on your PATH. See Building xemu.

No UART output — check you're not in DEBUG=y mode unintentionally (it routes UART to a PTY, requiring screen to attach). For plain stdio, use DEBUG=n or omit the flag.

Test hangs — hit Ctrl-A X to exit. xemu intercepts the same escape sequence QEMU uses.

Boot targets

xemu supports four kinds of boot:

| Target | Privilege entry | Firmware | Guest payload |
|---|---|---|---|
| am-tests | M-mode | none | bare-metal test kernel |
| xv6 | M-mode | xv6 bootstrap | xv6 kernel + ramdisk |
| Linux | M-mode | OpenSBI v1.3.1 | Linux 6.1.44 + initramfs |
| Debian | M-mode | OpenSBI + bootlin kernel | Linux + ext4 rootfs via VirtIO-blk |

All targets share the same machine layout (RAM at 0x8000_0000, ACLINT at 0x0200_0000, PLIC at 0x0C00_0000, UART0 at 0x1000_0000). The differences are in what gets loaded and whether a firmware layer is present.

See the individual target pages that follow.

Common environment variables

| Var | Default | Effect |
|---|---|---|
| DEBUG | n | y routes UART to a PTY (requires screen attach) and enables extra logging |
| LOG | info | trace / debug / info — controls xlogger verbosity |
| X_HARTS | 1 | Number of guest harts (single-threaded cooperative scheduler) |
| DIFFTEST | 0 | 1 enables per-instruction comparison against QEMU / Spike |

Use make run (or the per-target aliases like make linux) — not target/release/xdb directly. The Makefiles wire up X_FILE, boot layout, and DTB compilation for you.

Bare-metal tests (am-tests)

Bare-metal kernels that exercise the HAL and the core device set without any OS layer.

Running

cd xkernels/tests/am-tests
make run                # run all
make run TEST=u         # UART echo
make run TEST=t         # trap + interrupt routing
make run TEST=k         # keyboard (interactive PTY echo)
make run TEST=f         # float sanity

What each letter covers

| Letter | Subject |
|---|---|
| u | UART TX, THRE interrupt, DLAB divisor |
| c | CSR read / write / WARL masks |
| t | Trap delegation, mret / sret, vectored mtvec |
| i | Interrupt priority, MIE / SIE gating, global enable |
| a | ACLINT (MSWI + MTIMER + SSWI), mtimecmp, Sstc |
| p | PLIC claim / complete, level-trigger semantics |
| k | Keyboard — PTY-backed UART RX |
| f | F / D extension, NaN-boxing, fcsr shifted aliases |

Exit semantics

am-tests use the SiFive test finisher at 0x0010_0000:

  • Write 0x5555 → graceful exit, status 0
  • Write (code << 16) | 0x3333 → exit with code

xlib/src/stdio.c provides the xam_halt() helper that wraps this.
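
The encoding above can be sketched as a pair of helpers. The names (finisher_word, decode_finisher) are illustrative, not xemu's actual API:

```rust
// Encode an exit code into a test-finisher write, per the scheme above:
// 0x5555 means graceful exit (status 0); (code << 16) | 0x3333 carries a code.
fn finisher_word(code: u32) -> u32 {
    if code == 0 { 0x5555 } else { (code << 16) | 0x3333 }
}

// The decoding side, as the emulator's test-finisher device might see it.
fn decode_finisher(word: u32) -> Option<u32> {
    match word & 0xFFFF {
        0x5555 => Some(0),          // graceful exit, status 0
        0x3333 => Some(word >> 16), // exit with code from the upper half
        _ => None,                  // not an exit command
    }
}
```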

xv6-riscv

Boots the MIT xv6-riscv kernel to an interactive shell.

Running

cd resource
make xv6

Expected time to prompt: ~0.3 s.

What happens

  • xemu starts in M-mode at 0x8000_0000.
  • The xv6 bootstrap switches to S-mode, sets up page tables, and starts the kernel scheduler.
  • Console (sh) runs off a ramdisk embedded in the kernel image.

No firmware is loaded — xv6 runs directly. This makes it the simplest "real OS" target and a good sanity check after touching the trap framework or the MMU.

Exiting

Ctrl-A X — QEMU-style escape, intercepted by xemu's UART.

Linux (OpenSBI + initramfs)

Boots a full Linux 6.1.44 kernel to an interactive shell.

Running

cd resource
make linux              # single-hart
make linux-2hart        # 2 harts (cooperative scheduler)

Expected time to prompt: ~3 s.

Boot chain

xemu M-mode  →  OpenSBI v1.3.1  →  Linux (S-mode)  →  static init (busybox lp64d)

  • OpenSBI v1.3.1 — fw_jump configuration, generic platform.
  • Linux 6.1.44 — bootlin kernel with rv64imafdc, Sstc timer.
  • initramfs — bootlin rootfs (busybox + glibc lp64d), auto-downloaded at first build and packed into initrd.cpio.gz.

Init prompt

The initramfs runs a minimal static init with built-in commands:

ls   pwd   cd   cat   echo   uname   poweroff

poweroff invokes the SiFive test finisher via SBI shutdown — clean exit.

DTS

resource/xemu-linux.dts declares:

  • 1 GiB RAM at 0x8000_0000
  • 1 or 2 harts (cpus@0, cpus@1)
  • ACLINT, PLIC, UART, test-finisher nodes
  • riscv,isa = "rv64imafdcsu_sstc"
  • timebase-frequency = 10_000_000

SMP notes

make linux-2hart boots two harts on a single-threaded cooperative round-robin scheduler. Both cores share the same Bus instance. True per-hart OS threads are gated by the Phase 11 RFC; see ../PROGRESS.md §Phase 11.

Debian 13 Trixie (VirtIO-blk rootfs)

Boots a full Debian 13 system from an ext4 rootfs mounted via a VirtIO-blk device.

Running

cd resource
make debian            # single-hart
make debian-2hart      # 2 harts

Expected time to prompt: ~20 s.

What you get

  • 4 GiB ext4 root at /dev/vda, mounted read-write.
  • 288 dpkg packages pre-installed, including Python3 (verified during boot test).
  • Full Debian shell — apt, vim, git, coreutils.

Boot chain

xemu M-mode  →  OpenSBI v1.3.1  →  bootlin kernel  →  Debian userspace (/sbin/init)

The bootlin kernel is used in place of a custom kernel because it already has F / D extension support and the right Sstc driver, which matches xemu's exposed ISA.

First-run setup

On first make debian, the build system downloads the pre-built 4 GiB image (xemu-debian.img) to resource/debian/. Subsequent runs reuse the snapshot; changes to the guest filesystem are persisted across runs.

Two-tier reset

  • VirtIO transport reset (issued by guest driver during probe) — preserves disk contents; only resets the queue state.
  • Emulator hard-reset (via test finisher) — restores the disk snapshot, so repeated runs start from a clean Debian install.

DTS

resource/xemu-debian.dts:

  • 1 GiB RAM
  • virtio,mmio node at 0x1000_1000
  • chosen: bootargs = "root=/dev/vda rw ..."
  • Same ACLINT / PLIC / UART as the Linux target

Cleanly exiting

poweroff from the Debian shell → SBI shutdown → xemu exits. Or Ctrl-A X for an immediate abort.

The xdb debugger

xdb is the xemu monitor — a REPL with GDB-flavoured commands for breakpoints, watchpoints, memory / register inspection, and single-stepping.

Invoking

When you launch any target via make run without -batch flags, xemu drops into the xdb REPL after loading. The prompt looks like:

(xdb)

Command reference

Execution

| Command | Effect |
|---|---|
| c / continue | Run until a breakpoint, watchpoint, or program exit. |
| s [N] / step [N] | Single-step N instructions (default 1). |
| r / run | Reset and restart. |
| q / quit | Exit xdb. |

Breakpoints

| Command | Effect |
|---|---|
| b <addr> | Set breakpoint at physical / virtual address. Stable ID returned. |
| info b | List breakpoints with IDs. |
| d <id> | Delete breakpoint by ID. |

Breakpoints are address-based. After a step-after-hit, xdb skips the same breakpoint once to avoid refiring.

Watchpoints

| Command | Effect |
|---|---|
| w <expr> | Watch a value — fires when the expression changes. Validated at creation. |
| info w | List watchpoints. |
| d <id> | Delete watchpoint by ID. |

Expressions can reference registers ($a0), dereference memory (*0x80000000), arithmetic ($sp + 8), comparisons, and parentheses.

Inspection

| Command | Effect |
|---|---|
| x/N<f> <addr> | GDB-style memory examine — f = i (instruction), x (hex word), b (byte). |
| info reg [<name>] | Dump all registers, or a named GPR / CSR / pc. |
| p <expr> | Evaluate and print an expression. |

Example:

(xdb) x/4i $pc
(xdb) x/16x 0x80200000
(xdb) info reg a0
(xdb) p $sp - 0x10

Differential testing

(xdb) dt attach qemu
(xdb) dt attach spike
(xdb) dt status
(xdb) dt detach

See Differential testing for what's compared and how to interpret divergences.

Logging while inside xdb

Set LOG=trace (per-instruction) or LOG=debug (per memory / CSR event) before make run. Logs interleave with REPL output but do not interrupt command entry.

Differential testing (QEMU / Spike)

Per-instruction comparison of xemu (the DUT) against a reference implementation (QEMU or Spike). On any divergence, xdb halts and reports the first mismatched register.

Enabling

DIFFTEST=1 make run
(xdb) dt attach qemu       # or: dt attach spike
(xdb) c

What gets compared

  • PC
  • GPRs (x0..x31)
  • Current privilege (M / S / U)
  • 14 whitelisted CSRs (masked) — auto-generated from the csr_table! macro's @ difftest annotation.

MMIO handling

MMIO reads are intentionally non-deterministic (wall-clock, interrupt state). Difftest skips instructions that touch MMIO and syncs raw values from DUT to REF to keep them aligned.

Backends

QEMU

  • Protocol: GDB Remote Serial over TCP.
  • Config: sstep=0x7 (NOIRQ + NOTIMER), PhyMemMode:1.
  • Initial state is synced once at attach.
  • Easy to reproduce on any host that has qemu-system-riscv64.

Spike

  • Protocol: FFI into libriscv, wrapped by tools/difftest/spike/.
  • Links libriscv + libsoftfloat + libfesvr + libdisasm.
  • Closer to the ISA reference; harder to set up than QEMU.
  • Used as the tiebreaker when QEMU and xemu disagree.

Known limitations

  • Difftest cannot run in DEBUG=y mode — PTY timing perturbs the reference.
  • Very long runs amortize slowly; prefer reproducing divergences on a focused test kernel.
  • Not yet wired into CI; tracked as a deferred item in ../PROGRESS.md Phase 6.

Benchmarks

xemu ships three benchmark kernels under xkernels/tests/benchmark/.

| Benchmark | Iterations | Characteristic |
|---|---|---|
| Dhrystone | 500 000 | ALU / GPR / call-heavy |
| CoreMark | 1 000 | Mixed integer + list / matrix |
| MicroBench | 10 sub-benches | Includes C++ workloads (qsort-cpp, string) |

Running

cd xkernels/tests/benchmark/dhrystone   && make run
cd xkernels/tests/benchmark/coremark    && make run
cd xkernels/tests/benchmark/microbench  && make run

Always run with DEBUG=n for stable timing.

Published scores (MacBook Air M4)

| Benchmark | Marks |
|---|---|
| MicroBench | 718 |
| CoreMark | 499 |
| Dhrystone | 255 |

Perf pipeline

To regenerate the measurement baseline:

bash scripts/perf/bench.sh   # writes docs/perf/baselines/<today>/data/bench.csv
bash scripts/perf/sample.sh  # writes <today>/data/<workload>.sample.txt
python3 scripts/perf/render.py   # writes <today>/graphics/*.svg

Run from ProjectX/ root. See ../internals/performance.md for how buckets are interpreted and ../PROGRESS.md §Phase 9 for landed optimisations.

Reproducing the published numbers

  • Use make run (not target/release/xdb directly). The Makefile sets the right boot layout.
  • Leave DEBUG unset (defaults to n).
  • Close other CPU-heavy processes. samply on macOS is especially sensitive to background load; user_s is the stable metric, real_s is noisy.
  • Take the mean of 3 runs for any comparison.

Architecture overview

The step loop

CPU::step is the per-instruction driver:

CPU::step()
  1. bus.tick()                   — ACLINT every step, UART/PLIC every 64
  2. sync_interrupts()             — merge irq_state → mip
  3. check_pending_interrupts()    — raise trap if priority/gating allows
  4. fetch → decode (icache) → execute
  5. retire + commit_trap()        — commit npc, enter trap vector if any

The loop owns the Bus directly — no Arc<Mutex<Bus>>. Field-level borrow splitting lets MMU and Bus be accessed simultaneously without locking. This is Phase P1 of the perf roadmap; see performance.md.

Dispatch diagram

                ┌──────────────────────────────────────────────┐
                │                 xdb::main                    │
                │  (monolithic under LTO + codegen-units = 1)  │
                └───────────────┬──────────────────────────────┘
                                │
                    ┌───────────▼────────────┐
                    │  CPU<Core, Bus>        │
                    │   ├── Core: CoreOps    │   ← arch-agnostic trait
                    │   └── Bus: owned       │
                    └───────────┬────────────┘
                                │
                    ┌───────────▼────────────┐
                    │  RVCore (CoreOps impl) │
                    │   ├── GPR / PC / NPC   │
                    │   ├── csr: CsrFile     │
                    │   ├── privilege        │
                    │   ├── mmu: Mmu         │
                    │   ├── pmp: Pmp         │
                    │   ├── icache           │
                    │   └── pending_trap     │
                    └───────────┬────────────┘
                                │
                    ┌───────────▼──────────────────────────┐
                    │  Bus                                 │
                    │   ├── Ram [0x8000_0000, 1 GiB]        │
                    │   ├── ACLINT [0x0200_0000]            │
                    │   ├── PLIC   [0x0C00_0000]            │
                    │   ├── UART   [0x1000_0000]            │
                    │   ├── VirtIO [0x1000_1000]            │
                    │   └── Test   [0x0010_0000] (test-only)│
                    └──────────────────────────────────────┘

Four-layer memory access

A guest load/store walks:

vaddr ─► align check ─► MMU.translate ─► paddr ─► PMP.check ─► Bus.access
                             │                         ▲
                             └── page walk:  pte_paddr ┘  (PMP checks PTE reads too)

Responsibility split (see ../spec/mm/SPEC.md for the canonical table):

| Layer | Knows about | Does NOT know about |
|---|---|---|
| Bus | Physical addresses, device regions | Virtual addresses, privilege, traps, PMP |
| Mmu | Page tables, TLB, PTE bits, SUM / MXR | Trap codes, PMP (receives &Pmp for walks) |
| Pmp | Physical-address permissions, privilege | Virtual addresses, page tables |
| RVCore | Orchestrates: privilege, MPRV, trap mapping | Internal device state |

Lock-free IRQ delivery

Devices raise interrupts through IrqState — a shared Arc<AtomicU64> bitmap. Each bit maps to an mip hardware bit:

| Bit | Source |
|---|---|
| 1 | SSIP (ACLINT SSWI) |
| 3 | MSIP (ACLINT MSWI) |
| 7 | MTIP (ACLINT MTIMER) |
| 9 | SEIP (PLIC context 1) |
| 11 | MEIP (PLIC context 0) |

sync_interrupts() merges this into the CPU's mip register at the top of each step. No locks, no downcasts.
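
The merge described above can be sketched as follows; the field names and exact mask handling are assumptions, not xemu's real code:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Bits 1/3/7/9/11 per the table above; software-writable mip bits outside
// this mask are left untouched by the merge.
const MIP_MASK: u64 = (1 << 1) | (1 << 3) | (1 << 7) | (1 << 9) | (1 << 11);

// Shared lock-free IRQ bitmap; devices hold clones of the Arc.
struct IrqState(Arc<AtomicU64>);

impl IrqState {
    fn raise(&self, bit: u32) { self.0.fetch_or(1u64 << bit, Ordering::Release); }
    fn clear(&self, bit: u32) { self.0.fetch_and(!(1u64 << bit), Ordering::Release); }
    fn snapshot(&self) -> u64 { self.0.load(Ordering::Acquire) }
}

// Fold the device-driven bits into mip at the top of each step.
fn sync_interrupts(mip: u64, irq: &IrqState) -> u64 {
    (mip & !MIP_MASK) | (irq.snapshot() & MIP_MASK)
}
```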

CPU dispatch & ISA decode

The CPU<Core, Bus> generic

pub struct CPU<C: CoreOps, B> {
    cores: Vec<C>,
    current: usize,
    bus: B,
}
  • C: CoreOps — ISA-specific core (e.g. RVCore). The generic boundary means xemu can host multiple ISAs; a LoongArch stub exists in xemu/xcore/src/arch/loongarch/.
  • B — the bus type. Single-hart carries Bus inline; multi-hart carries Arc<Mutex<Bus>> when true SMP lands (Phase 11 RFC).

Per-instruction flow

  1. Tick devices. bus.tick() advances ACLINT mtime (every step), drives the UART and PLIC on a slower cadence (every 64 steps), and collects IRQ lines.
  2. Sync interrupts. sync_interrupts() copies the atomic IRQ bitmap into mip.
  3. Check pending interrupts. If an enabled interrupt of sufficient priority is pending, pending_trap is set and the rest of the step is skipped.
  4. Fetch. Read the instruction word at pc via the MMU.
  5. Decode. First check the decoded-instruction cache (per-hart 4 K direct-mapped). On miss, run the pest-based pattern matcher.
  6. Execute. Dispatch on DecodedInst, updating npc (and registers / CSRs / memory as side effects).
  7. Retire. self.pc = self.npc. If pending_trap is set, commit_trap() writes the trap vector address to npc first.

Decoder

xcore/src/arch/riscv/isa/decode/ contains ~200 instruction patterns expressed in pest. Each pattern captures the opcode fields into a DecodedInst::* variant:

  • R / I / S / B / U / J — standard formats
  • FR — floating-point with explicit rm (rounding mode) field
  • FR4 — FMA-style 4-register (fmadd, fmsub, ...)
  • C* — compressed variants

The match tree after decode is the dispatch loop — one big match on DecodedInst calling per-instruction handlers.

Decoded-instruction cache

Phase P4 of the perf roadmap:

struct ICacheLine {
    pc:      usize,      // guest virtual address
    ctx_tag: u32,        // bumped on any mapping change
    raw:     u32,        // raw instruction word (sanity)
    decoded: DecodedInst,
}

icache: [ICacheLine; 4096]   // per-hart, direct-mapped

The ctx_tag comparison implicitly invalidates cached lines on:

  • satp writes
  • sfence.vma
  • Privilege-mode transitions that change the effective translation
  • fence.i

Self-modifying code: every guest store invalidates the whole icache (simple, correct, loses the icache effect only on code-writing guests — rare). See ../spec/perfHotPath/SPEC.md for the full invariant set.
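
A hit check consistent with the line layout above might look like this; field and function names here are assumptions, not xemu's exact code:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum DecodedInst {
    Invalid,
    Addi { rd: u8, rs1: u8, imm: i32 },
    // ... one variant per instruction format in the real decoder
}

#[derive(Clone, Copy)]
struct ICacheLine { pc: usize, ctx_tag: u32, raw: u32, decoded: DecodedInst }

const WAYS: usize = 4096;

fn lookup(icache: &[ICacheLine; WAYS], pc: usize, cur_tag: u32, raw: u32) -> Option<DecodedInst> {
    // Index by pc >> 1: instructions are 2-byte aligned with the C extension.
    let line = &icache[(pc >> 1) & (WAYS - 1)];
    // A hit requires matching pc, the current translation-context tag,
    // and the same raw word (the raw field guards against stale decodes).
    (line.pc == pc && line.ctx_tag == cur_tag && line.raw == raw).then(|| line.decoded)
}
```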

Trace

LOG=trace emits one line per instruction with PC, mnemonic, operands, and the resulting GPR delta. Very verbose — use it only for focused debugging.

CSR subsystem

Layering

Two layers with a clean split:

CsrFile    ← storage + descriptor-driven mask / shadow dispatch
RVCore     ← privilege checks, dynamic rules, side effects, trap generation

Key principle: CsrFile knows what a CSR is (address, width, WARL mask, shadow); RVCore knows when a CSR access is allowed and what happens after.

Storage

Flat 4096-entry array indexed by the 12-bit CSR address:

pub struct CsrFile {
    regs: [Word; 4096],
}

Shadow registers (sstatus, sip, sie) don't occupy their own slot — they redirect to the M-mode slot with a mask.

csr_table! macro

xcore/src/arch/riscv/cpu/csr/table.rs declares every CSR in a single macro invocation. The macro generates:

  • The CsrAddr enum.
  • The CSR_DESCS descriptor table (address, WARL mask, shadow target, side effects, difftest whitelist membership).
  • The O(1) dispatch match for read / write.

One source of truth prevents the enum and descriptor table from drifting apart.

WARL model

Writes go through the descriptor's write mask:

new = (old & !mask) | (incoming & mask)

Read-only-zero bits are enforced by mask = 0 on those fields. Shadow registers wrap the M-mode register's read/write with an extra mask (e.g. sstatus only exposes the S-mode subset of mstatus).
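
A minimal sketch of the masked write and shadow read; the shadow mask value used in the test is illustrative, not the real sstatus mask:

```rust
// WARL write: bits outside the mask keep their old value,
// bits inside the mask take the incoming value.
fn warl_write(old: u64, incoming: u64, mask: u64) -> u64 {
    (old & !mask) | (incoming & mask)
}

// Shadow read: a shadow CSR has no storage of its own — it reads the
// M-mode slot through an extra mask (e.g. sstatus over mstatus).
fn shadow_read(m_reg: u64, shadow_mask: u64) -> u64 {
    m_reg & shadow_mask
}
```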

Side effects

Some writes trigger xemu-internal reconfiguration:

| CSR | Side effect |
|---|---|
| satp | mmu.update_satp → reconfigure SvMode + flush TLB + bump icache ctx_tag |
| mstatus | Recompute SUM / MXR flags |
| pmpcfg* / pmpaddr* | Rebuild PMP entries with lock semantics |
| mtimecmp | Recompute next_fire_mtime in ACLINT (Phase P3) |
| fcsr / fflags / frm | Route through shifted-subfield alias to the canonical fcsr |

Traps on CSR access

CSR privilege violations raise architectural traps, not emulator errors. Use:

self.raise_trap(TrapCause::IllegalInst, /*tval=*/ instruction_word);
return Ok(());

Never return Err(XError) from a trap — reserve Err for host failures (I/O error) and emulator invariant violations. This is the "err2trap" refactor pattern; see ../spec/err2trap/SPEC.md.

Difftest whitelist

The csr_table! @ difftest annotation marks CSRs whose value is checked against the reference every step. Currently 14 CSRs are on the whitelist — architectural state that both DUT and REF model the same way. CSRs that depend on xemu-specific timing (time, mcycle) are excluded.

Memory: MMU, TLB, PMP

Access pipeline

vaddr → align check → MMU.translate(vaddr, op, priv) → paddr → PMP.check(paddr, op, priv) → Bus.access
                              │                                   ▲
                              └── page walk (pte_paddr checks PMP)┘

Four layers, each with a narrow responsibility. See ../spec/mm/SPEC.md for the full design.

MMU

  • Sv32 (RV32) and Sv39 (RV64) — multi-level page walk with hardware A / D bit update.
  • SvMode descriptor — runtime switch between Sv32 / 39 / 48 / 57; PTE format-dependent methods take &SvMode.
  • satp WARL — RV64 currently restricts to Sv39; writes to unsupported modes are masked.

The MMU caches the effective privilege, SUM / MXR bits, and the current SvMode — avoiding a CSR read on every translate.

TLB

  • 64 entries, direct-mapped.
  • ASID-tagged. Global pages (PTE.G = 1) match any ASID.
  • Flushed on sfence.vma with the standard ASID / vaddr operand semantics.
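
A direct-mapped, ASID-tagged lookup along these lines; the entry layout is an assumption based on the bullets above:

```rust
#[derive(Clone, Copy)]
struct TlbEntry {
    vpn: u64,     // virtual page number
    asid: u16,    // address-space ID the entry was filled under
    global: bool, // PTE.G — matches any ASID
    ppn: u64,     // physical page number
    valid: bool,
}

const TLB_SIZE: usize = 64;

fn tlb_lookup(tlb: &[TlbEntry; TLB_SIZE], vpn: u64, asid: u16) -> Option<u64> {
    // Direct-mapped: index by the low VPN bits.
    let e = &tlb[(vpn as usize) & (TLB_SIZE - 1)];
    // Global entries ignore the ASID comparison.
    (e.valid && e.vpn == vpn && (e.global || e.asid == asid)).then(|| e.ppn)
}
```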

PMP

  • 16 entries, matching modes: OFF, TOR, NA4, NAPOT.
  • Partial-overlap detection — a paddr straddling two entries raises the appropriate fault.
  • Lock bit — once set, the entry is immutable until the next reset.
  • M-mode fast path — when no entries are locked, M-mode bypasses the 16-entry linear scan entirely.
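
The NAPOT matching mode encodes the region size in the trailing-one bits of pmpaddr (which holds paddr >> 2): k trailing ones cover 2^(k+3) bytes. A sketch, assuming a valid (not all-ones) encoding:

```rust
// Decode a NAPOT pmpaddr value into (base, size) in bytes.
fn napot_range(pmpaddr: u64) -> (u64, u64) {
    let k = pmpaddr.trailing_ones();           // number of encoding ones
    let mask = (1u64 << k) - 1;
    let base = (pmpaddr & !mask) << 2;         // clear encoding bits, rescale to bytes
    let size = 1u64 << (k + 3);                // k trailing ones => 2^(k+3) bytes
    (base, size)
}
```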

Trap generation

MMU and PMP produce Err(XError::PageFault) or Err(XError::BadAddress) up the stack. RVCore maps these to the correct TrapCause:

| XError | Trap cause |
|---|---|
| PageFault { access: Load } | LoadPageFault |
| PageFault { access: Store } | StorePageFault |
| PageFault { access: Fetch } | InstPageFault |
| BadAddress { access: Load } | LoadAccessFault |
| BadAddress { access: Store } | StoreAccessFault |

This translation happens once in the step → commit_trap path; instruction handlers just propagate ?.
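
The mapping can be sketched as a total match; the enum shapes are assumptions, and the fetch access-fault row is included for completeness even though the table above omits it:

```rust
#[derive(Debug, PartialEq)]
enum Access { Load, Store, Fetch }

#[derive(Debug, PartialEq)]
enum XError {
    PageFault { access: Access },
    BadAddress { access: Access },
}

#[derive(Debug, PartialEq)]
enum TrapCause {
    LoadPageFault, StorePageFault, InstPageFault,
    LoadAccessFault, StoreAccessFault, InstAccessFault,
}

// One exhaustive match: adding an Access or XError variant forces an update here.
fn map_trap(e: XError) -> TrapCause {
    match e {
        XError::PageFault { access: Access::Load }   => TrapCause::LoadPageFault,
        XError::PageFault { access: Access::Store }  => TrapCause::StorePageFault,
        XError::PageFault { access: Access::Fetch }  => TrapCause::InstPageFault,
        XError::BadAddress { access: Access::Load }  => TrapCause::LoadAccessFault,
        XError::BadAddress { access: Access::Store } => TrapCause::StoreAccessFault,
        XError::BadAddress { access: Access::Fetch } => TrapCause::InstAccessFault,
    }
}
```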

MPRV

When mstatus.MPRV is set, loads / stores use mstatus.MPP as the effective privilege. xemu routes this through Mmu::effective_privilege(&mstatus, op) — not by clamping the actual privilege field.

Typed RAM access (Phase P6)

Hot-path loads / stores of 1 / 2 / 4 / 8 bytes bypass the generic memmove shim:

// Pseudocode
if op.aligned() && size ∈ {1,2,4,8} {
    direct_u{size}_read(ram_slice)
} else {
    bytemuck::copy_within(...)  // slow path
}

Drops the _platform_memmove + Bus::{read,write} combined bucket from ~18 % to sub-2 % on the dhrystone / coremark / microbench profile. See ../spec/perfHotPath/SPEC.md.
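
A concrete version of the aligned fast path, using little-endian decoding directly on the RAM slice; the function name is illustrative, and the None return stands in for falling back to the generic slow path:

```rust
// Read a naturally aligned 1/2/4/8-byte little-endian value from RAM.
// Returns None for misaligned or out-of-range accesses (slow path).
fn read_aligned(ram: &[u8], offset: usize, size: usize) -> Option<u64> {
    if offset % size != 0 {
        return None; // misaligned: take the generic byte-copy path
    }
    let bytes = ram.get(offset..offset + size)?;
    Some(match size {
        1 => bytes[0] as u64,
        2 => u16::from_le_bytes(bytes.try_into().ok()?) as u64,
        4 => u32::from_le_bytes(bytes.try_into().ok()?) as u64,
        8 => u64::from_le_bytes(bytes.try_into().ok()?),
        _ => return None,
    })
}
```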

Traps & interrupts

Two-phase trap handling

xemu uses a single-commit-point model: architectural state only changes when step() commits at the retire stage.

Phase 1 (raise): an instruction handler or subsystem detects a trap condition and sets pending_trap. Control returns up the stack via Ok(()).

Phase 2 (commit): after execute, step() inspects pending_trap; if set, commit_trap() updates mepc/sepc, mcause/scause, mtval/stval, mstatus/sstatus (MPP/MPIE), sets privilege, and writes self.npc = trap_vector.

The loop then commits: self.pc = self.npc.

PendingTrap

pub struct PendingTrap {
    pub cause: TrapCause,
    pub tval: Word,
}

One canonical representation. Never carried inside XError for architectural traps (see the err2trap discussion in csr.md).

Delegation

xemu implements the full medeleg / mideleg model:

  • Faults at U-mode may be delegated to S-mode.
  • Timer / software / external interrupts are routed via mideleg.
  • Delegation happens in commit_trap — handlers just supply the TrapCause.

Vectored mtvec / stvec is supported: when mtvec[1:0] = 1, async interrupts dispatch to base + 4 * cause; synchronous traps always jump to base.
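
The vector computation can be sketched as:

```rust
// mtvec[1:0] selects the mode: 0 = direct, 1 = vectored.
// Vectored mode only applies to asynchronous interrupts;
// synchronous traps always jump to the (4-byte aligned) base.
fn trap_vector(mtvec: u64, cause: u64, is_interrupt: bool) -> u64 {
    let base = mtvec & !0b11u64;
    match mtvec & 0b11 {
        1 if is_interrupt => base + 4 * cause,
        _ => base,
    }
}
```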

Interrupt priority

Per the spec:

MEI > MSI > MTI > SEI > SSI > STI

check_pending_interrupts() walks in priority order, masked by mie / sie / the global enable bit (mstatus.MIE / sstatus.SIE).
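
A simplified sketch of the priority walk; the real gating also accounts for the current privilege and delegation, which is collapsed here into a single global-enable flag:

```rust
// mip/mie bit positions in spec priority order:
// MEI(11) > MSI(3) > MTI(7) > SEI(9) > SSI(1) > STI(5)
const PRIORITY: [u32; 6] = [11, 3, 7, 9, 1, 5];

fn pick_interrupt(mip: u64, mie: u64, global_enable: bool) -> Option<u32> {
    if !global_enable {
        return None;
    }
    // First bit in priority order that is both pending and enabled wins.
    PRIORITY.iter().copied().find(|&b| (mip & mie) >> b & 1 == 1)
}
```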

Lock-free IRQ plane

Devices raise interrupts by flipping bits in a shared Arc<AtomicU64> (IrqState). The CPU merges this into mip at the top of each step. No locks; no vtable downcasts from the Bus into the PLIC.

Device-to-PLIC is direct: the UART holds a reference to the PLIC's source slot (PlicSource) and flips it on state change. No Bus-mediated round-trip. This is the directIrq fix; see ../spec/directIrq/SPEC.md.

Edge vs level

  • ACLINT MSIP / MTIP / SSIP — level-triggered by bit state.
  • UART — level; !rx_fifo.is_empty() && (ier & 1).
  • PLIC — level on its external sources; claim/complete exclusion prevents re-pending until the handler completes. plicGateway fixed a prior edge/level confusion; see ../spec/plicGateway/SPEC.md.

Devices

Device trait

pub trait Device: Send {
    fn read(&mut self, offset: usize, size: usize) -> XResult<Word>;
    fn write(&mut self, offset: usize, size: usize, value: Word) -> XResult;
    fn tick(&mut self) {}
    fn irq_line(&self) -> bool { false }
    fn notify(&mut self, _irq_lines: u32) {}
}

Five methods. Default-no-op for tick, irq_line, notify — device authors override only what they need.

Bus

pub struct Bus {
    ram: Ram,
    mmio: Vec<MmioRegion>,
    plic_idx: Option<usize>,
}

struct MmioRegion {
    name: &'static str,
    range: Range<usize>,
    dev: Box<dyn Device>,
    irq_source: u32,       // 0 = no IRQ
}

Every access goes through Bus::read / Bus::write:

  • Fast path — RAM. Static dispatch, no vtable. Typed-read bypass for aligned 1/2/4/8-byte accesses (Phase P6).
  • Slow path — MMIO. Linear scan for the covering region, then dispatch via dyn Device.

tick() split

pub fn tick(&mut self) {
    // ... ACLINT every step (fast path) ...
    // ... UART + PLIC every 64 steps (slow path) ...
}

ACLINT fires on every step because the Mtimer deadline check is on the critical path. UART and PLIC tick less frequently — their state rarely changes per-instruction.

Inside ACLINT, the Mtimer deadline gate (Phase P3) short-circuits 99.99 % of checks:

if self.mtime < self.next_fire_mtime { return; }
self.check_all();  // slow path only when a deadline has arrived
IRQ collection

The Bus collects level-triggered IRQ lines:

let mut irq_lines: u32 = 0;
for r in &mut self.mmio {
    r.dev.tick();
    if r.irq_source > 0 && r.dev.irq_line() {
        irq_lines |= 1 << r.irq_source;
    }
}
if let Some(i) = self.plic_idx {
    self.mmio[i].dev.notify(irq_lines);
}

The PLIC is the only device whose notify is overridden — it receives the full IRQ-line bitmap and evaluates MEIP/SEIP.

Per-device pages

ACLINT (MSWI / MTIMER / SSWI)

ACLINT replaces the legacy CLINT with three cleanly split sub-devices sharing the 0x0200_0000 / 0x1_0000 region. It is wire-compatible with the CLINT layout for software that expects one.

See the split spec at ../spec/aclintSplit/SPEC.md.

MSWI — Machine Software Interrupt

  • msip at offset 0x0000 — bit 0 only.
  • Writing 1 sets MSIP in irq_state; writing 0 clears it.

MTIMER — Machine Timer

  • mtime at 0xBFF8 (lo) / 0xBFFC (hi) — host wall clock at 10 MHz. timebase-frequency = 10_000_000.
  • mtimecmp at 0x4000 (lo) / 0x4004 (hi).
  • Amortized sync: wall-clock samples are taken every 512 ticks, not every step (Phase P1-era optimisation).
  • Deadline gate: per-step tick() short-circuits when self.mtime < self.next_fire_mtime (Phase P3).
  • When mtime ≥ mtimecmp, MTIP is raised in irq_state.

SSWI — Supervisor Software Interrupt

  • setssip at 0xC000 — write-only.
  • Writing 1 sets SSIP in irq_state.
  • Read always returns 0.

Sstc extension

xemu exposes stimecmp for Sstc — an S-mode direct timer register. The xemu DTS advertises riscv,isa = "rv64imafdcsu_sstc"; Linux and OpenSBI use Sstc when present, bypassing the SBI timer call.

PLIC

Platform-Level Interrupt Controller at 0x0C00_0000, 64 MiB region.

  • 32 sources (source 0 is reserved "no interrupt").
  • 2 contexts — context 0 = M-mode, context 1 = S-mode.
  • Level-triggered on external sources.
  • Per-source priority, per-context threshold and enable bitmap.
  • Claim / complete with claimed-exclusion — a claimed source does not re-pend until its complete write.

See ../spec/plicGateway/SPEC.md for the Gateway + Core split design and the level-trigger invariants.

Register layout (offsets within the PLIC base)

| Offset | Register |
| --- | --- |
| 0x0000_0000 | Priority[source] (32-bit per source) |
| 0x0000_1000 | Pending bitmap (32 bits — one per source) |
| 0x0000_2000 | Enable bitmap, context 0 (M) |
| 0x0000_2080 | Enable bitmap, context 1 (S) |
| 0x0020_0000 | Threshold + Claim/Complete, context 0 |
| 0x0020_1000 | Threshold + Claim/Complete, context 1 |

Update algorithm

On each bus tick the PLIC receives the current IRQ-line bitmap via Device::notify(irq_lines: u32):

for src in 1..32:
    if src in claimed:         continue
    if bit(irq_lines, src):    pending |= (1 << src)
    else:                      pending &= !(1 << src)   # level went low
evaluate(context 0) → MEIP
evaluate(context 1) → SEIP

evaluate(ctx) finds the highest-priority enabled pending source above threshold[ctx]. If one exists, it sets MEIP/SEIP in irq_state; otherwise clears.
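A free-standing sketch of that evaluation, with the per-context state passed in as plain values (the real method operates on PLIC struct fields):

```rust
/// Find the highest-priority enabled pending source strictly above the
/// context threshold. Returns Some(source) when MEIP/SEIP should be set.
/// Ties on priority go to the lower source number.
pub fn evaluate(pending: u32, enable: u32, priority: &[u32; 32], threshold: u32) -> Option<usize> {
    let mut best: Option<usize> = None;
    for src in 1..32 {
        let bit = 1u32 << src;
        if pending & enable & bit == 0 {
            continue; // not pending, or not enabled for this context
        }
        if priority[src] <= threshold {
            continue; // must be strictly above the threshold
        }
        match best {
            Some(b) if priority[b] >= priority[src] => {} // keep earlier winner
            _ => best = Some(src),
        }
    }
    best
}
```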

Claim / Complete

  • Claim read — returns the highest-priority pending source, clears its pending bit, and records it in claimed[ctx].
  • Complete write — if the value matches claimed[ctx], the slot is released and evaluate() reruns so a subsequent interrupt can re-pend.
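The claimed-exclusion invariant can be shown in a compact sketch. The state shape is an assumption, and source selection is simplified to lowest-set-bit rather than highest priority:

```rust
/// Per-context claim/complete state, simplified for illustration.
pub struct PlicCtx {
    pub pending: u32,
    pub claimed: u32,
}

impl PlicCtx {
    /// Claim read: take a pending source (lowest set bit here), clear
    /// its pending bit, and record it as claimed. 0 means "no interrupt".
    pub fn claim(&mut self) -> u32 {
        let src = self.pending.trailing_zeros();
        if src == 32 {
            return 0;
        }
        self.pending &= !(1 << src);
        self.claimed |= 1 << src;
        src
    }

    /// Complete write: release the slot only if the value was claimed.
    pub fn complete(&mut self, src: u32) {
        self.claimed &= !((self.claimed >> src & 1) << src);
    }

    /// Level update: claimed sources are excluded from re-pending.
    pub fn update_level(&mut self, irq_lines: u32) {
        for src in 1..32 {
            let bit = 1u32 << src;
            if self.claimed & bit != 0 {
                continue; // claimed-exclusion
            }
            if irq_lines & bit != 0 { self.pending |= bit; } else { self.pending &= !bit; }
        }
    }
}
```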

Direct IRQ delivery

Devices like UART hold a reference to their PLIC source slot and signal state changes directly — no Bus round-trip. See ../spec/directIrq/SPEC.md.

UART 16550

National Semiconductor 16550-compatible UART at 0x1000_0000, PLIC source 10.

Registers

| Offset | DLAB=0 read | DLAB=0 write | DLAB=1 |
| --- | --- | --- | --- |
| 0 | RBR (RX data) | THR (TX data) | DLL (divisor low) |
| 1 | IER | IER | DLM (divisor high) |
| 2 | IIR (read) | FCR (write) | IIR / FCR |
| 3 | LCR | LCR | LCR |
| 4 | MCR | MCR | MCR |
| 5 | LSR | LSR | LSR |
| 6 | MSR | MSR | MSR |
| 7 | SCR | SCR | SCR |
  • LCR[7] is DLAB — toggles register meaning for offsets 0/1.
  • LSR.DR = RX ready (derived from rx_fifo).
  • LSR.THRE / LSR.TEMT = always set (TX is synchronous to stdout).

Modes

Default (stdio, batch-friendly)

  • TX → host stdout.
  • RX → host stdin. Non-blocking poll per tick.

PTY mode (DEBUG=y)

  • TX → PTY master.
  • RX → PTY master.
  • Attach the slave with screen /dev/ttysXXX 115200. xemu prints the slave path at startup.

Keyboard am-test

TEST=k runs a bare-metal kernel that polls RBR and echoes to TX — the canonical interactive smoke test.

Interrupts

irq_line() = !rx_fifo.is_empty() && (ier & 0x1).

THRE interrupts (ier & 0x2) are also supported: when the guest writes THR and re-arms IER, the next tick promotes thre_pending into thre_ip and re-syncs the IRQ state.
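Combining both conditions, the IRQ line looks roughly like this sketch (field names follow the formulas above; the struct shape is an assumption):

```rust
/// Simplified 16550 interrupt state.
pub struct Uart {
    pub rx_fifo: Vec<u8>, // received bytes not yet read via RBR
    pub ier: u8,          // interrupt enable register
    pub thre_ip: bool,    // promoted THRE interrupt pending
}

impl Uart {
    /// Level-triggered IRQ line toward the PLIC.
    pub fn irq_line(&self) -> bool {
        let rx_ready = !self.rx_fifo.is_empty() && (self.ier & 0x1) != 0;
        let thre = self.thre_ip && (self.ier & 0x2) != 0;
        rx_ready || thre
    }
}
```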

Ctrl-A X

xemu intercepts the Ctrl-A X escape sequence (QEMU-style) to exit cleanly from firmware-boot modes without needing a guest poweroff.
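Such an escape detector is essentially a two-state machine over the host input stream. A hypothetical sketch (the real implementation may differ, e.g. in how a literal Ctrl-A is passed through):

```rust
/// Detects the QEMU-style Ctrl-A X exit sequence in the input stream.
pub struct EscapeDetector {
    armed: bool, // true when the previous byte was Ctrl-A (0x01)
}

impl EscapeDetector {
    pub fn new() -> Self {
        Self { armed: false }
    }

    /// Feed one host input byte; returns true when Ctrl-A X is seen and
    /// the emulator should exit cleanly.
    pub fn feed(&mut self, b: u8) -> bool {
        if self.armed {
            self.armed = false;
            return b == b'x' || b == b'X';
        }
        if b == 0x01 {
            self.armed = true; // swallow Ctrl-A, wait for the next byte
        }
        false
    }
}
```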

VirtIO-blk

MMIO legacy (version 1) transport backing the Debian rootfs.

Layout

| Region | Address | Size |
| --- | --- | --- |
| VirtIO MMIO | 0x1000_1000 | 0x1000 |

Transport

  • Legacy MMIO v1 — matches Linux's virtio_mmio driver without needing the modern VIRTIO_F_VERSION_1 path.
  • Split virtqueue, 128 entries.
  • Synchronous DMA processing — process_dma reads descriptors from guest RAM, dispatches the request, writes status, and rings the used ring.

DmaCtx

Bus-mediated guest-memory accessor — the only bridge between VirtIO code and guest RAM:

impl<'a> DmaCtx<'a> {
    pub fn read_bytes(&mut self, gpa: u64, buf: &mut [u8]) -> XResult<()>;
    pub fn write_bytes(&mut self, gpa: u64, buf: &[u8]) -> XResult<()>;
    pub fn read_le<T: LeBytes>(&mut self, gpa: u64) -> XResult<T>;
    pub fn write_le<T: LeBytes>(&mut self, gpa: u64, v: T) -> XResult<()>;
}

The LeBytes trait is the type-safe layer — no unsafe aliasing of guest memory, just bounded &mut [u8] views through the Bus.

BlkStorage

Separated from the transport state so Rust's borrow checker can split them:

struct VirtioBlk {
    transport: TransportState,   // queue pointers, device status
    storage:   BlkStorage,        // backing snapshot
}

process_dma borrows &mut TransportState + &mut BlkStorage disjointly — no interior mutability, no runtime borrow tracking.
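A minimal illustration of that split, with field and type contents reduced to stand-ins:

```rust
/// Simplified stand-ins for the two halves of the device.
struct TransportState { used_idx: u16 }
struct BlkStorage { data: Vec<u8> }

struct VirtioBlk {
    transport: TransportState,
    storage: BlkStorage,
}

/// Takes the two halves as separate &mut. The borrow checker permits
/// this because they are disjoint fields of the same struct.
fn process_dma(t: &mut TransportState, s: &mut BlkStorage) {
    s.data.push(0);  // touch storage ...
    t.used_idx += 1; // ... and transport, in the same call
}

fn step(blk: &mut VirtioBlk) {
    // One call site splits the struct into its disjoint halves.
    process_dma(&mut blk.transport, &mut blk.storage);
}
```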

Two-tier reset

  • Transport reset — QueueReady goes to 0; QueueSel is cleared; disk contents are preserved.
  • Emulator hard reset — via the test finisher; restores the disk to the snapshot recorded at load.

Debian image

resource/xemu-debian.img — 4 GiB ext4 filesystem with Debian 13 Trixie pre-installed. Build system downloads it on first make debian.

Performance: hot path & baselines

Short answer

Over five phases (P1 + P3 + P4 + P5 + P6) the user-time per benchmark dropped by ~57–62 % vs the pre-P1 baseline:

| Benchmark | Pre-P1 | Post-hotPath | Δ |
| --- | --- | --- | --- |
| Dhrystone | 8.09 s | 3.48 s | −57 % |
| CoreMark | 14.02 s | 5.82 s | −58 % |
| MicroBench | 85.82 s | 32.91 s | −62 % |

See ../PROGRESS.md §Phase 9 for the full table and ../spec/perfBusFastPath/SPEC.md, ../spec/perfHotPath/SPEC.md for per-phase design.

Where time goes today

On the post-hotPath profile, the dominant buckets are roughly:

| Bucket | Share | Character |
| --- | --- | --- |
| xdb::main (dispatch + decode + execute) | ~30 % | Interpreter core |
| MMU entry (checked_* + access_bus) | ~10 % | Per load/store |
| Mtimer deadline gate | <1 % | Per-step (post-P3) |
| Typed RAM access | <2 % | Per load/store (post-P6) |
| Device ticks (UART / PLIC / VirtIO) | <1 % | Slow path, every 64 steps |

The pre-P1 baseline had pthread_mutex_* at 33–40 % — now 0 % (Bus is owned, not behind Arc<Mutex<_>>).

The five landed phases

| Phase | Subject | Win | Risk |
| --- | --- | --- | --- |
| P1 busFastPath | Drop Arc<Mutex<Bus>>, own inline | −45…−52 % wall | Low |
| P3 Mtimer deadline | Cache next_fire_mtime, short-circuit tick | Mtimer bucket → <1 % | Very low |
| P4 icache | Per-hart decoded-inst cache, 4 K entries | xdb::main bucket −10 pp | Medium (invalidation) |
| P5 MMU inline | #[inline] pressure through fast path | MMU bucket −3 pp | Low |
| P6 memmove bypass | Typed reads on aligned 1/2/4/8-byte accesses | memmove bucket → <2 % | Low-Medium (unsafe) |

Measurement pipeline

Always run from ProjectX/ root:

bash scripts/perf/bench.sh       # → docs/perf/baselines/<today>/data/bench.csv
bash scripts/perf/sample.sh      # → <today>/data/<workload>.sample.txt
python3 scripts/perf/render.py   # → <today>/graphics/*.svg
  • 3 runs per workload — user_s is the stable metric; real_s is noisy on macOS under system load.
  • Use DEBUG=n. PTY mode perturbs timing.
  • Commit data/ and graphics/ with the phase's MASTER document.

Phase exit gate pattern

A phase is not done until:

  1. cargo test --workspace + make linux + make debian all green (and -2hart variants where applicable).
  2. bench.sh rerun (3 iters per workload).
  3. sample.sh rerun for each of the three benches.
  4. Per-phase exit gate hit with ≥ 1 pp margin on the bucket it targets.
  5. REPORT.md deltas committed to the phase's archived MASTER.

What's next

  • P7 multi-hart re-profile — pending; shapes the Phase 11 SMP work. Not an optimisation in itself — a measurement task.
  • Phase 11 (RFC) — true per-hart OS threads. Requires atomic RAM, per-hart reservations, per-device MMIO locking. Not in any current perf phase.

Multi-hart

Today's multi-hart is a single-threaded cooperative round-robin scheduler in CPU::step. N harts are no faster than 1. The abstraction exists so the ISA code can reason about per-hart state, not because the host is running them in parallel.

See ../spec/multiHart/SPEC.md for the Hart abstraction design.
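The round-robin shape can be sketched in a few lines. The slice size and the `Hart` contents are illustrative stand-ins, not the real types:

```rust
/// Stand-in for per-hart state; the real Hart carries GPRs, PC, CSRs, etc.
pub struct Hart { pub steps: u64 }

pub struct Cpu {
    pub harts: Vec<Hart>,
    pub current: usize, // index of the hart holding the slice
}

impl Cpu {
    /// Give the current hart a slice of steps, then rotate to the next.
    /// Single host thread: harts never run concurrently.
    pub fn run_slice(&mut self, slice: u64) {
        let h = &mut self.harts[self.current];
        for _ in 0..slice {
            h.steps += 1; // stand-in for one fetch/decode/execute step
        }
        self.current = (self.current + 1) % self.harts.len();
    }
}
```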

What's shared, what's per-hart

| Shared | Per-hart |
| --- | --- |
| Bus (RAM + all devices) | GPR / PC / NPC |
| ACLINT mtime (one host wall-clock source) | CsrFile |
| PLIC state (2 contexts route to 2 harts) | privilege |
| IrqState Arc<AtomicU64> (one set of mip/mie bits per hart) | mmu, pmp |
|  | icache |
|  | pending_trap |

Per-hart icache

Each hart has its own 4 K-entry direct-mapped decoded-instruction cache. A satp write on one hart does not flush the other hart's icache — each has its own ctx_tag. An sfence.vma scoped to one hart could behave the same way, but the current implementation flushes both harts on any sfence.vma for simplicity (conservative, correct).
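A sketch of the lookup path, under assumed shapes. The tag-bump flush shown here is one cheap way to realise the per-hart ctx_tag idea; the real invalidation scheme may differ:

```rust
const ICACHE_ENTRIES: usize = 4096; // 4 K entries, direct-mapped

#[derive(Clone, Copy)]
pub struct Entry {
    pub pc: u64,
    pub ctx_tag: u64, // snapshot of the cache's tag when the entry was filled
    pub decoded: u32, // stand-in for the decoded instruction
}

pub struct ICache {
    pub entries: [Entry; ICACHE_ENTRIES],
    pub ctx_tag: u64, // bumped on satp write / sfence.vma
}

impl ICache {
    pub fn index(pc: u64) -> usize {
        // 2-byte granularity (C extension), so index on pc >> 1.
        ((pc >> 1) as usize) & (ICACHE_ENTRIES - 1)
    }

    pub fn lookup(&self, pc: u64) -> Option<u32> {
        let e = self.entries[Self::index(pc)];
        (e.pc == pc && e.ctx_tag == self.ctx_tag).then(|| e.decoded)
    }

    /// O(1) flush: bump the tag so every existing entry misses.
    pub fn flush(&mut self) {
        self.ctx_tag = self.ctx_tag.wrapping_add(1);
    }
}
```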

Running

cd resource
make linux-2hart         # 2 harts, cooperative scheduler
make debian-2hart        # same, with VirtIO rootfs

Both cores share the same Bus instance. The scheduler gives each hart a slice of steps in round-robin order before rotating.

Why single-threaded today

P1 (busFastPath) removed the Arc<Mutex<Bus>> that was dead weight under the cooperative scheduler — there's no real SMP, so the mutex was pure overhead. Removing it gave 45–52 % wall-clock.

True SMP (Phase 11 RFC)

Not in any landed phase. To get per-hart OS threads:

  • Guest RAM becomes &[AtomicU8] (or unsafe typed access with explicit fences).
  • LR/SC reservations become per-hart AtomicUsize.
  • Per-device fine-grained sync (or the QEMU MTTCG "BQL on MMIO only" model).
  • A runtime that joins / cancels hart threads cleanly.

None of this fits in the perf roadmap. See ../PROGRESS.md §Phase 11 for reference designs (QEMU MTTCG, rv8, Guo 2019 on fast TLB simulation).

Pre-conditions before opening Phase 11

  • P1, P2 (bus-access API), P5 (MMU inline) shipped. Done.
  • A reproducible 2-hart Linux benchmark in docs/perf/baselines/<date>/ showing the fraction of time actually parallelisable. Not yet measured.
  • P7 re-profile results.

Supported ISA

xemu implements the RISC-V unprivileged ISA plus the privileged model, across both RV32 and RV64 via cfg_if.

Base + standard extensions

| Ext | Description | RV32 | RV64 |
| --- | --- | --- | --- |
| I | Base integer | ✅ | ✅ |
| M | Multiply / divide | ✅ | ✅ |
| A | Atomic (LR/SC + 9 AMO ops, .w and .d) | ✅ | ✅ |
| F | Single-precision float | ✅ | ✅ |
| D | Double-precision float | ✅ | ✅ |
| C | Compressed | ✅ | ✅ |
| Zicsr | CSR access | ✅ | ✅ |
| Zifencei | fence.i | ✅ | ✅ |

DTS advertisement: riscv,isa = "rv64imafdcsu_sstc".

Privileged ISA

  • M / S / U modes with full trap delegation (medeleg / mideleg).
  • Vectored and direct mtvec / stvec.
  • mret / sret with MPRV handling.
  • Sstc — S-mode direct stimecmp.

MMU

| Mode | Support |
| --- | --- |
| Bare (identity) | ✅ |
| Sv32 (RV32) | ✅ |
| Sv39 (RV64) | ✅ — hardware A/D bit update |
| Sv48 | Descriptor exists; write to satp masks it off |
| Sv57 | Not wired |
  • TLB: 64-entry direct-mapped, ASID-tagged, global-page aware.
  • PMP: 16 entries, TOR / NA4 / NAPOT, lock semantics, partial-overlap detection.
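The TLB lookup described above can be sketched as follows. The entry shape and indexing are assumptions; the key points are the direct mapping, the ASID tag, and the global-page bypass:

```rust
const TLB_ENTRIES: usize = 64; // direct-mapped

#[derive(Clone, Copy, Default)]
pub struct TlbEntry {
    pub vpn: u64,     // virtual page number (tag)
    pub ppn: u64,     // physical page number (payload)
    pub asid: u16,
    pub global: bool, // global pages match any ASID
    pub valid: bool,
}

pub struct Tlb {
    pub entries: [TlbEntry; TLB_ENTRIES],
}

impl Tlb {
    /// Direct-mapped lookup: low VPN bits index, full VPN is the tag.
    pub fn lookup(&self, vpn: u64, asid: u16) -> Option<u64> {
        let e = self.entries[(vpn as usize) & (TLB_ENTRIES - 1)];
        if e.valid && e.vpn == vpn && (e.global || e.asid == asid) {
            Some(e.ppn)
        } else {
            None
        }
    }
}
```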

Float details

  • softfloat_pure — pure Rust Berkeley softfloat-3.
  • NaN-boxing for F operands when D is also active.
  • fcsr / fflags / frm are shifted subfield aliases of one canonical fcsr (see CSR subsystem).
  • mstatus.FS tracked as Off / Initial / Clean / Dirty, with SD recomputed on every mstatus / fcsr write.

What's not implemented

  • V (vector) — RVV is not supported.
  • H (hypervisor) — no HS-mode.
  • Zba / Zbb / Zbc / Zbs (bit-manipulation) — deferred.
  • Zicbom / Zicboz (cache management) — no caches modelled.
  • Svnapot / Svpbmt — not wired.

Instruction table

For the full per-mnemonic implementation status, see ../../spec/inst/SPEC.md.

Device memory map

Default xemu machine layout, QEMU-virt-compatible in shape with documented deltas.

| Device | Base | Size | IRQ |
| --- | --- | --- | --- |
| Test finisher (test-only) | 0x0010_0000 | 0x10 |  |
| ACLINT | 0x0200_0000 | 0x1_0000 |  |
| PLIC | 0x0C00_0000 | 0x400_0000 |  |
| UART0 (NS16550) | 0x1000_0000 | 0x100 | 10 |
| VirtIO MMIO (Debian target) | 0x1000_1000 | 0x1000 | 1 |
| RAM | 0x8000_0000 | 128 MiB (tests) / 1 GiB (Linux) |  |

Intentional deltas from QEMU virt

  • ACLINT replaces CLINT. Wire-compatible MMIO layout; offers clean MSWI / MTIMER / SSWI split.
  • Test finisher is test-only. Not wired into the default machine used by Linux / Debian.
  • timebase-frequency = 10_000_000 (10 MHz), matching the host wall-clock sampling rate.

PLIC source assignments

| Source | Owner |
| --- | --- |
| 0 | "no interrupt" (reserved) |
| 1 | VirtIO-blk |
| 10 | UART0 |

Higher source numbers are reserved for future devices.

IrqState bitmap

Arc<AtomicU64> where:

| Bit | mip name | Writer |
| --- | --- | --- |
| 1 | SSIP | ACLINT SSWI |
| 3 | MSIP | ACLINT MSWI |
| 7 | MTIP | ACLINT MTIMER |
| 9 | SEIP | PLIC context 1 |
| 11 | MEIP | PLIC context 0 |

sync_interrupts() on CPU step merges this into mip.
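A sketch of that merge, restricted to the hardware-driven bits in the table above. The real merge policy for software-writable bits (e.g. SEIP written via CSR) is more subtle; this shows only the device-owned mask:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};

// mip bit positions, matching the table above.
const SSIP: u64 = 1 << 1;
const MSIP: u64 = 1 << 3;
const MTIP: u64 = 1 << 7;
const SEIP: u64 = 1 << 9;
const MEIP: u64 = 1 << 11;

/// Devices set/clear bits in the shared Arc<AtomicU64>; each CPU step
/// snapshots it and overwrites the hardware-owned bits of mip.
pub fn sync_interrupts(irq_state: &Arc<AtomicU64>, mip: &mut u64) {
    let hw = irq_state.load(Ordering::Acquire);
    let hw_mask = SSIP | MSIP | MTIP | SEIP | MEIP;
    *mip = (*mip & !hw_mask) | (hw & hw_mask);
}
```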

Boot layout (where the ELF lands)

  • Bare-metal tests — entry at 0x8000_0000.
  • xv6 — entry at 0x8000_0000 (M-mode).
  • Linux / Debian — OpenSBI lands at 0x8000_0000 (M-mode), then jumps to the kernel at 0x8020_0000 (S-mode).
  • FDTBootLayout::fdt_addr persists the DTB address so the kernel can find it at a1 on entry.

Environment variables

Recognised by the make run / make linux / etc. entry points.

| Var | Values | Default | Effect |
| --- | --- | --- | --- |
| DEBUG | y / n | n | y routes UART to a PTY, enables richer logging, and turns off release optimisations. Always set DEBUG=n when benchmarking. |
| LOG | trace / debug / info / warn / error / off | info | xlogger verbosity. trace is per-instruction. |
| X_HARTS | integer ≥ 1 | 1 | Guest hart count (cooperative scheduler). |
| X_FILE | path | set by per-target Makefile | ELF to execute. Don't set manually — let make run resolve it. |
| DIFFTEST | 0 / 1 | 0 | Compile in the QEMU / Spike difftest backends. |
| AM_HOME | path | ${workspace}/xam | Where xam HAL sources live. |
| XEMU_HOME | path | ${workspace}/xemu | Where the xemu workspace lives. |
| XLIB_HOME | path | ${workspace}/xlib | Where xlib (klib) sources live. |

CI-only

| Var | Effect |
| --- | --- |
| ECC_DISABLED_HOOKS | Disable specific Everything-Claude-Code plugin hooks by hook ID. |
| ECC_HOOK_PROFILE | minimal / standard / strict — coarse toggle. |

Runtime (xdb REPL)

Not env vars — commands inside the monitor. See The xdb debugger.

xam HAL

xam is the bare-metal HAL (abstract-machine) that kernels link against. It exposes a minimal set of primitives that xemu knows how to service.

Layout

xam/
├── include/        HAL headers (C + Rust bindings)
├── src/            implementations (arch-agnostic + riscv-specific)
└── scripts/        build_c.mk, link.ld, cross-target cargo support

API

Console

void _putch(char ch);       // write one byte to UART TX

Used by xlib's stdio.c to back printf.

Time

uint64_t mtime(void);               // read ACLINT mtime
void     set_mtimecmp(uint64_t t);  // set MTIMECMP for this hart
uint64_t uptime(void);              // microseconds since boot

uptime() is derived from mtime() divided by 10 (10 MHz clock).
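With a 10 MHz timebase the conversion is just a fixed divide. A sketch in Rust (names illustrative; the real implementation is C in xam):

```rust
/// ACLINT mtime ticks at 10 MHz, i.e. 10 ticks per microsecond.
pub const TIMEBASE_HZ: u64 = 10_000_000;

/// Microseconds since boot, derived from an mtime sample.
pub fn uptime_us(mtime_ticks: u64) -> u64 {
    mtime_ticks / (TIMEBASE_HZ / 1_000_000)
}
```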

Trap entry

pub struct TrapFrame {
    pub regs: [usize; 32],
    pub sstatus: usize,
    pub sepc: usize,
    pub scause: usize,
    pub stval: usize,
}

pub fn init_trap(handler: fn(&mut TrapFrame));

Guest sets the handler once at boot; xemu's trap dispatch lands on it with a populated frame.

Main-args

extern "C" {
    static mainargs: *const u8;  // compile-time strings
}

Useful for passing test identifiers into a single kernel binary.

Linker symbols

_heap_start      — start of the heap (end of .bss)
_heap_end        — end of the heap (derived from RAM size)

MMIO constants

Device addresses match Device memory map.

Building a kernel with xam

cd xkernels/tests/your-kernel
make run

The xam/scripts/build_c.mk and build_rs.mk wrappers handle the cross-compilation and link script automatically. No target-specific flags needed in your kernel's Makefile.

xlib (klib)

Freestanding C library for programs built by xam and run on xemu. Modelled after NEMU's abstract-machine klib — minimal, deterministic, platform-independent.

See ../../spec/klib/SPEC.md for the design.

What's included

<string.h> — string.c

void  *memset(void *s, int c, size_t n);
void  *memcpy(void *dst, const void *src, size_t n);
void  *memmove(void *dst, const void *src, size_t n);
int    memcmp(const void *s1, const void *s2, size_t n);

size_t strlen(const char *s);
char  *strcpy(char *dst, const char *src);
char  *strncpy(char *dst, const char *src, size_t n);
char  *strcat(char *dst, const char *src);
int    strcmp(const char *s1, const char *s2);
int    strncmp(const char *s1, const char *s2, size_t n);
char  *strchr(const char *s, int c);
char  *strrchr(const char *s, int c);

<stdio.h> — stdio.c + format.c

int printf(const char *fmt, ...);
int sprintf(char *buf, const char *fmt, ...);
int snprintf(char *buf, size_t size, const char *fmt, ...);
int vsprintf(char *buf, const char *fmt, va_list ap);
int vsnprintf(char *buf, size_t size, const char *fmt, va_list ap);
int puts(const char *s);
int putch(char ch);

Format specifiers: %d %i %u %x %X %s %c %p %o %%, with l / ll length modifiers, field width, 0-padding, left-alignment. No floating-point printf.

<assert.h>

#define assert(x) ...

C- and C++-compatible (carries extern "C" guards).

<stdlib.h> — stdlib.c

int     atoi(const char *s);
int     abs(int x);
void    srand(unsigned seed);
int     rand(void);

No malloc / free — cpu-tests don't need them, benchmarks use local allocators.

<ctype.h> — ctype.c

isspace, isdigit, isalpha, isalnum, toupper, tolower, etc. Standard shapes.

What's not included

  • POSIX APIs, FILE * streams.
  • Floating-point printf.
  • Locale support.
  • Thread-safe allocation.

This is intentional — xlib targets bare-metal test and benchmark kernels, not a hosted C environment.

Using from your kernel

#include <klib.h>          /* umbrella header */

This pulls in <stddef.h>, <stdint.h>, <stdbool.h>, <stdarg.h>, <string.h>, <stdio.h>, <stdlib.h>, <ctype.h>. The xam build system prepends -I$(XLIB_HOME)/include before system includes.

klib-macros.h

Convenience macros used by benchmarks:

#define LENGTH(arr)        (sizeof(arr) / sizeof((arr)[0]))
#define ROUNDUP(x, n)      (((x) + (n) - 1) & ~((n) - 1))
#define ROUNDDOWN(x, n)    ((x) & ~((n) - 1))
#define MIN(a, b)          ((a) < (b) ? (a) : (b))
#define MAX(a, b)          ((a) > (b) ? (a) : (b))
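For reference, the two rounding macros rely on `n` being a power of two, exactly like their C counterparts. Rust equivalents, for illustration only:

```rust
/// Round x up to the next multiple of n. n must be a power of two,
/// as in the C ROUNDUP macro above.
pub fn roundup(x: usize, n: usize) -> usize {
    (x + n - 1) & !(n - 1)
}

/// Round x down to a multiple of n (n a power of two).
pub fn rounddown(x: usize, n: usize) -> usize {
    x & !(n - 1)
}
```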

Workflow overview

ProjectX uses a spec- and doc-driven iteration workflow. The canonical rules live in /AGENTS.md; this page explains the shape at a glance.

Three locations per feature

  1. docs/tasks/<feature>/ — in-flight workspace. Holds NN_PLAN.md / NN_REVIEW.md / NN_MASTER.md rounds as the design converges.
  2. docs/spec/<feature>/SPEC.md — landed canonical spec. Authored by extracting the final PLAN's ## Spec section (Goals / Architecture / Invariants / Data Structure / API Surface / Constraints).
  3. docs/archived/<category>/<feature>/ — iteration history, moved out of tasks/ once the feature lands.

Iteration loop

Per feature, up to 5 rounds (00–04):

plan-executor  → NN_PLAN.md
(main session stops)
external reviewer (codex / human) → NN_REVIEW.md
(optional) user → NN_MASTER.md
→ next plan-executor, or → implementation

Loop cap. If the reviewer returns APPROVED earlier (no CRITICAL / HIGH findings) or after round 04, proceed to implementation. Any surviving MEDIUM / LOW findings are addressed inline during implementation.

Implementation

  • Implementation (code and NN_IMPL.md) is authored by the main session directly — not by a sub-agent.
  • There is no post-implementation review artifact. Audit findings are applied inline in the same session.

Categories at landing

When a feature lands, choose the archive category that matches the dominant intent:

| Category | Trigger |
| --- | --- |
| feat | New user-visible or API-visible capability |
| fix | Bug or MANUAL_REVIEW finding that isn't a reorg |
| refactor | Reshape code without changing behavior |
| perf | Measurable speedup under a published exit gate |
| review | Audit / retrospective not tied to one feature |

When a task has mixed intent, split it.

Opening a new feature

Step-by-step guide for starting a new feature. Assumes you've read Workflow overview.

1. Pick a name

A short camelCase identifier — no spaces, no slashes. Examples: vgaConsole, perfIcacheV2, rvv, sbiDebug.

2. Create the task workspace

mkdir -p docs/tasks/<feature>
cp docs/template/PLAN.template   docs/tasks/<feature>/00_PLAN.md
cp docs/template/REVIEW.template docs/tasks/<feature>/00_REVIEW.md
cp docs/template/MASTER.template docs/tasks/<feature>/00_MASTER.md

Create all three files at the start of the round, even if some are empty. Reviewer and user fill them in turn.

3. Author 00_PLAN.md

Dispatch plan-executor sub-agent from the main session. The sub-agent produces the plan — the main session never authors it directly.

The PLAN must include:

  • ## Summary — one paragraph.
  • ## Log — reviewer-facing changelog. Start empty for round 00.
  • ## Spec with [**Goals**] / [**Architecture**] / [**Invariants**] / [**Data Structure**] / [**API Surface**] / [**Constraints**].
  • ## Implement — step-by-step engineering plan.
  • ## Trade-offs — what was considered and rejected.
  • ## Validation — test plan with real code sketches.

4. Get NN_REVIEW.md

The main session stops after round 00's PLAN. You invoke an external reviewer (codex / human) to produce 00_REVIEW.md.

Classify findings: CRITICAL / HIGH / MEDIUM / LOW.

5. Optional NN_MASTER.md

If you want to override the review or add binding directives, write 00_MASTER.md yourself. MUST directives are binding on the next PLAN; SHOULD directives need explicit response if rejected.

6. Iterate

Signal the main session to dispatch round 01. The next PLAN must:

  • Have a Response Matrix mapping every prior CRITICAL / HIGH finding + MASTER directive to a resolution.
  • Address all MASTER MUST directives unconditionally.

7. Implement

After the final approved PLAN (up to round 04), the main session authors the code changes and NN_IMPL.md directly. Include:

  • What shipped vs what was planned.
  • Deviations from the plan, with justification.
  • Validation results — tests run, exit gates met.

8. Land

  • Extract SPEC. Copy the final PLAN's ## Spec section into docs/spec/<feature>/SPEC.md.
  • Archive. git mv docs/tasks/<feature> docs/archived/<category>/<feature>.
  • Update PROGRESS.md — add the landed feature to the appropriate phase or task table.

Do-nots

  • Don't edit previous iteration documents. Always create the next numbered file.
  • Don't silently deviate during implementation. If the design changes meaningfully, open a new iteration.
  • Don't dispatch reviewer sub-agents for the PLAN review — reviews are external, out-of-session.

Writing a SPEC

The SPEC.md is the landed, canonical description of a feature. It's extracted from the final PLAN's ## Spec section after the feature implementation lands.

See ../../template/SPEC.template for the canonical shape.

Sections

[**Goals**]

What the feature provides, numbered G-1, G-2, ... Each goal is a one-sentence claim about observable behaviour or a measurable threshold.

- G-1: All 31 cpu-tests-rs pass with the new MMU implementation.
- G-2: Linux boots to initramfs shell in ≤ 5 seconds on the M4 host.

Follow Goals with Non-Goals NG-1, NG-2, ... — what the feature explicitly does not cover.

[**Architecture**]

A prose + diagram description of the component's shape. ASCII diagrams are fine; keep them under 80 columns. Show the data-flow arrows, not just boxes.

[**Invariants**]

Numbered I-1, I-2, ... Properties that must hold at all times across all execution paths.

- I-1: mip hardware bits are modified only via irq_state merge.
- I-2: Tick order: bus.tick → sync → check → fetch → execute → retire.
- I-3: Claimed PLIC sources are excluded from re-pending until complete.

[**Data Structure**]

Core types — structs, enums, traits — with real Rust syntax. This is the type-level signature of the feature.

pub struct Aclint {
    epoch: Instant,
    mtime: u64,
    msip: u32,
    mtimecmp: u64,
    irq_state: Arc<AtomicU64>,
}

[**API Surface**]

Public function signatures and their contracts.

/// Read a word at `addr`. Returns `BadAddress` for unmapped paddrs
/// or `PageFault` for unmapped vaddrs.
pub fn checked_read(&mut self, addr: VirtAddr, size: usize) -> XResult<Word>;

[**Constraints**]

Numbered C-1, C-2, ... Things that would look like bugs but are intentional design boundaries.

- C-1: xemu internal layout matches QEMU-virt in shape; ACLINT replaces CLINT.
- C-2: Single hart (cooperative scheduler).
- C-3: UART byte-access only; word writes raise SizeMismatch.

Extraction from PLAN

When a feature lands:

  1. Read the final NN_PLAN.md.
  2. Locate its ## Spec section.
  3. Copy everything between ## Spec and the next ## heading into docs/spec/<feature>/SPEC.md.
  4. Prepend a banner:
# `<feature>` SPEC

> Source: [`/docs/archived/<cat>/<feature>/NN_PLAN.md`](...) —
> iteration history lives under `docs/archived/<cat>/<feature>/`.

---
  5. Commit the SPEC in the same PR as the IMPL.

Pre-workflow features

Some features (e.g. csr, klib, mm) predate the template. Their SPEC.md contains the original pre-workflow design verbatim with a banner flagging it. Do not rewrite until the feature next sees meaningful iteration — the rewrite is its own task.

Updating a SPEC

When a feature iterates, the new PLAN's Response Matrix addresses all prior CRITICAL / HIGH findings; the implementation lands; the SPEC is replaced with the new round's ## Spec. Never hand-edit the SPEC in isolation.

Adding a benchmark

Existing benchmarks: Dhrystone, CoreMark, MicroBench. Adding a new one means a new test kernel + a measurement entry in the perf pipeline.

Kernel

Create a new directory under xkernels/tests/benchmark/<name>/:

xkernels/tests/benchmark/<name>/
├── Makefile          # uses xam/scripts/build_c.mk or build_rs.mk
├── src/
│   └── main.c        # or .rs — the benchmark itself
└── README.md         # what it measures, expected score range

The Makefile should delegate to xam's build system. Link against xlib for printf / memcpy.

Exit via the SiFive test finisher:

#include <klib.h>
extern void xam_halt(int code);

int main() {
    uint64_t t0 = uptime();
    /* ... work ... */
    uint64_t t1 = uptime();
    printf("score = %lu\n", compute_score(t1 - t0));
    xam_halt(0);
    return 0;
}

Measurement pipeline

Add the new benchmark to scripts/perf/bench.sh so CI / manual runs capture it:

BENCHES=(dhrystone coremark microbench <name>)

Also teach scripts/perf/sample.sh how to capture its sample profile (usually the same per-workload path).

Baseline

After landing:

  1. Run bash scripts/perf/bench.sh --runs 3. This writes the new workload into docs/perf/baselines/<today>/data/bench.csv.
  2. Run bash scripts/perf/sample.sh to produce the sample traces.
  3. Run python3 scripts/perf/render.py for the SVG flamegraphs.
  4. Commit the new data/ + graphics/ files.

Reporting in PROGRESS.md

Add a row to the "Benchmark" table in the project root README.md if the benchmark is user-facing enough to publish. Update ../PROGRESS.md §Phase 9 if the workload introduces a new cost centre worth tracking per-phase.

Performance hygiene

  • Don't add a benchmark that depends on wall-clock non-determinism (interrupt timing, stdin blocking). Use deterministic work loops.
  • Use uptime() (microseconds) for in-guest timing; it's derived from ACLINT mtime, which is frozen during xdb pause.
  • Take 3 runs for the published number; user_s is the stable metric.