RISC-V Startup Code

Motivation

I work with different microcontroller boards day to day. My job is to write startup code in Rust for board bring-up, especially on RISC-V cores, and to write, generate, and maintain the Peripheral Access Crate (PAC) for each RISC-V SoC.

Working this low in the stack, I wanted to write down what actually makes a RISC-V core come alive and run our code. Concretely:

I’ll answer these using xv6, a small teaching OS that targets a 64-bit RISC-V core and runs on QEMU. Its boot path is the same bare-metal bring-up problem I deal with every day - set up a stack, place code and data, pick a privilege level, jump to our code - just unusually well documented, which makes it a great thing to learn from.

This post follows only the path from power-on to the main Rust function (main); everything the kernel does after that is for a later post.

I’m porting xv6 from C to Rust to learn real OS concepts by building one, reading the C source alongside the xv6 book. You can find my in-progress port on GitHub: xv6rs. Every code snippet in this post is taken from that repo.

RISC-V EEI

EEI stands for Execution Environment Interface, defined by the RISC-V Unprivileged spec.

See the RISC-V Unprivileged spec for the full definition.

Program’s Entry Point

The xv6 operating system runs on a multiprocessor RISC-V machine under QEMU, so QEMU sets up the EEI for us, such as:

With a combination of linker script and assembly, we make our program start at address 0x80000000. In theory we could have done this in Rust instead of assembly, but compiled Rust code, like C, assumes that some things are already set up whenever a function is called: a valid stack pointer, a zeroed .bss, and an initialized .data section. Here we start from an unknown state as far as QEMU is concerned, so we use assembly to set this up before calling any Rust function.

Since DRAM is zero-filled, we need not clear the .bss section, and since the .data section is already in DRAM, we need not copy it from flash to RAM. So the only thing missing is a valid stack pointer.
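For contrast, on a microcontroller that boots from flash, the startup code has to do both of those jobs itself before any Rust function can rely on its statics. Here is a minimal host-side sketch of the two steps, using plain slices as stand-ins for memory regions. The function names are made up for illustration; real startup code would take the region bounds from linker-script symbols and usually works with raw pointers.

```rust
/// Zero the uninitialized-data region, as a reset handler would for .bss.
fn zero_bss(bss: &mut [u8]) {
    bss.fill(0);
}

/// Copy initial values from their load location (e.g. flash) into RAM, as for .data.
/// Panics if the two regions differ in length.
fn init_data(ram: &mut [u8], load_image: &[u8]) {
    ram.copy_from_slice(load_image);
}

fn main() {
    // Stand-ins for memory regions; RAM contents after reset are arbitrary.
    let mut bss = [0xAAu8; 8];
    let mut data = [0x55u8; 4];
    let flash = [1u8, 2, 3, 4];

    zero_bss(&mut bss);
    init_data(&mut data, &flash);

    assert!(bss.iter().all(|&b| b == 0));
    assert_eq!(data, flash);
}
```

Under QEMU neither step is needed here, which is why _entry only has to set up sp.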

entry.rs:

use core::arch::global_asm;

global_asm!(
    "
    .section .text.entry
    .global _entry
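    _entry:
            la sp, stack0         # sp = base of the per-hart stacks
            csrr a1, mhartid      # a1 = this hart's id
            addi a1, a1, 1        # a1 = hartid + 1
            slli a0, a1, 12       # a0 = a1 * 4096; same as `li a0, 4096`; `mul a0, a0, a1`;
            add sp, sp, a0        # sp = top of this hart's stack (stacks grow down)
            call start            # enter Rust; start never returns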
    ",
);

kernel.x:

OUTPUT_ARCH( "riscv" )
ENTRY( _entry )

SECTIONS
{
  . = 0x80000000;

  .text : {
    KEEP(*(.text.entry))
    *(.text .text.*)
  }

  .bss : {
    . = ALIGN(16);
    *(.bss.stack0)
  }
}

start.rs:

#[unsafe(no_mangle)]
#[unsafe(link_section = ".bss.stack0")]
static mut stack0: [u8; 4096 * NCPU] = [0; 4096 * NCPU]; // NCPU: number of CPUs (harts)

In kernel.x, we set the start address to 0x80000000 and place everything from the .text.entry section there first. Since _entry is the only symbol in that section (see entry.rs), the first instruction of our program is la sp, stack0.

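If we write the multiplication with mul instead, as in the comment on the slli line, the assembler rejects it:

error: mul instruction requires the following: ‘Zmmul’ (Integer Multiplication)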

Even though we target riscv64gc-unknown-none-elf, which includes multiplication support, that feature isn’t automatically applied to the assembler for hand-written global_asm!, so the mul version will not compile. We could fix this with .option arch, +m, but as entry.rs shows, we can avoid mul altogether by using a base-ISA instruction (a shift). You can find more details about this here and here.
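To make the stack arithmetic in _entry concrete, here is a small host-side sketch of the same computation. The helper name initial_sp and the example base address are made up for illustration; only the 4096-byte per-hart stack size and the (hartid + 1) offset come from the assembly.

```rust
/// Bytes of stack reserved per hart, matching `4096 * NCPU` in start.rs.
const STACK_SIZE: u64 = 4096;

/// Hypothetical helper mirroring `_entry`: each hart's stack pointer starts
/// at the top of its own 4096-byte slice, because the stack grows downward.
fn initial_sp(stack0: u64, hartid: u64) -> u64 {
    // `slli a0, a1, 12` multiplies by 4096 (2^12) with a base-ISA shift,
    // so no `mul` instruction is needed.
    stack0 + ((hartid + 1) << 12)
}

fn main() {
    // Assumed address, for illustration only; the linker picks the real one.
    let stack0: u64 = 0x8000_1000;
    for hartid in 0..3 {
        let sp = initial_sp(stack0, hartid);
        // The shift and the multiplication agree.
        assert_eq!(sp, stack0 + (hartid + 1) * STACK_SIZE);
        println!("hart {hartid}: sp = {sp:#x}");
    }
}
```

Hart 0 ends up with the lowest slice and each following hart sits 4096 bytes above the previous one, so the stacks never overlap as long as no hart uses more than 4096 bytes.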

S-Mode Kernel

The xv6 kernel runs in S-mode, whereas user programs run in U-mode. Since a hart starts in M-mode, the start function that we call at the end of the assembly sets up what has to be configured from M-mode and then transitions to S-mode.

The following things have to be set up in xv6 during the M-mode to S-mode transition:

start.rs:

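use core::arch::asm;

#[unsafe(no_mangle)]
pub extern "C" fn start() -> ! {
    // Set M Previous Privilege (MPP) to Supervisor, so that mret lands in S-mode.
    let mut x: u64 = r_mstatus();
    x &= !MSTATUS_MPP_MASK;
    x |= MSTATUS_MPP_S;
    w_mstatus(x);

    // Set the M Exception Program Counter to main, so that mret jumps there.
    w_mepc(main as *const () as u64);

    // Disable paging for now.
    w_satp(0);

    // Delegate interrupts and exceptions to S-mode.
    w_medeleg(0xffff);
    w_mideleg(0xffff);
    // Enable S-mode external and timer interrupts.
    w_sie(r_sie() | SIE_SEIE | SIE_STIE);

    // Configure PMP so that S-mode can access all of physical memory.
    w_pmpaddr0(0x3fffffffffffff);
    w_pmpcfg0(0xf);

    // Ask for timer interrupts.
    timerinit();

    // Keep each hart's id in its tp register.
    let id: u64 = r_mhartid();
    w_tp(id);

    // Switch to S-mode and jump to main.
    unsafe {
        asm!("mret", options(noreturn));
    };
}

// Ask each hart to generate timer interrupts.
fn timerinit() {
    // Enable S-mode timer interrupts in mie.
    w_mie(r_mie() | MIE_STIE);
    // Enable the Sstc extension (stimecmp) via menvcfg.STCE (bit 63).
    w_menvcfg(r_menvcfg() | (1 << 63));
    // Allow S-mode to read the time counter (mcounteren.TM, bit 1).
    w_mcounteren(r_mcounteren() | 2);
    // Schedule the very first timer interrupt.
    w_stimecmp(r_time() + 1000000);
}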
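The two PMP writes in start (w_pmpaddr0 and w_pmpcfg0) are terse, so here is a hedged host-side sketch that decodes what those two constants mean. The bit layout follows the RISC-V privileged spec: R, W, X in bits 0 to 2, the address-matching mode A in bits 4:3 (1 means TOR, top of range), and the lock bit in bit 7. The type and function names are my own for illustration and are not part of xv6rs.

```rust
/// Address-matching modes from the `A` field (bits 4:3) of a PMP config byte.
#[derive(Debug, PartialEq)]
enum PmpMode {
    Off,
    Tor,
    Na4,
    Napot,
}

/// Decoded view of one 8-bit PMP config entry (illustrative type).
#[derive(Debug, PartialEq)]
struct PmpCfg {
    read: bool,
    write: bool,
    exec: bool,
    mode: PmpMode,
    locked: bool,
}

fn decode_pmpcfg(byte: u8) -> PmpCfg {
    PmpCfg {
        read: (byte & 0b001) != 0,
        write: (byte & 0b010) != 0,
        exec: (byte & 0b100) != 0,
        mode: match (byte >> 3) & 0b11 {
            0 => PmpMode::Off,
            1 => PmpMode::Tor,
            2 => PmpMode::Na4,
            _ => PmpMode::Napot,
        },
        locked: (byte & 0x80) != 0,
    }
}

/// A pmpaddr register stores a physical address shifted right by two,
/// so the byte address of a TOR bound is the register value shifted left by two.
fn tor_top(pmpaddr: u64) -> u64 {
    pmpaddr << 2
}

fn main() {
    // The values written by start: pmpcfg0 = 0xf, pmpaddr0 = 0x3fffffffffffff.
    let cfg = decode_pmpcfg(0xf);
    println!("{cfg:?}");
    // For entry 0 in TOR mode the region starts at address 0.
    println!("region: [0, {:#x})", tor_top(0x3f_ffff_ffff_ffff));
}
```

So 0xf means read, write, and execute permission in TOR mode with the entry unlocked, and the all-ones pmpaddr0 puts the top of the range at 2^56 - 4, which covers all of physical memory on this machine.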

main function

Once the mret instruction at the end of start executes, the hart drops into S-mode and jumps to main, because mepc holds the address of main and MPP is set to S.
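Here is a minimal sketch of the MPP update on its own, runnable on a host. The constant values are assumptions spelled out from the RISC-V privileged spec (MPP occupies mstatus bits 12:11, with 0b01 meaning S-mode and 0b11 meaning M-mode) rather than copied from the repo, and set_mpp_supervisor is a hypothetical name for the first four lines of start.

```rust
// Assumed values, per the RISC-V privileged spec: MPP is mstatus[12:11].
const MSTATUS_MPP_MASK: u64 = 3 << 11; // both MPP bits
const MSTATUS_MPP_M: u64 = 3 << 11; // 0b11 = Machine
const MSTATUS_MPP_S: u64 = 1 << 11; // 0b01 = Supervisor

/// Hypothetical pure version of the read-modify-write in `start`:
/// clear MPP, then set it to Supervisor, leaving all other bits alone.
fn set_mpp_supervisor(mstatus: u64) -> u64 {
    (mstatus & !MSTATUS_MPP_MASK) | MSTATUS_MPP_S
}

fn main() {
    // Pretend mstatus says "previous mode = Machine" and has one unrelated bit set (bit 3).
    let before = MSTATUS_MPP_M | (1 << 3);
    let after = set_mpp_supervisor(before);

    // MPP now reads Supervisor, and the unrelated bit is untouched.
    assert_eq!(after & MSTATUS_MPP_MASK, MSTATUS_MPP_S);
    assert_eq!(after & (1 << 3), 1 << 3);
    println!("{before:#x} -> {after:#x}");
}
```

Clearing with the mask first matters: OR-ing in MPP_S alone would leave bit 12 set if it was already 1, and mret would then stay in M-mode.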

main.rs:

#[unsafe(no_mangle)]
pub extern "C" fn main() -> ! {
    loop {
        unsafe { core::arch::asm!("wfi") }
    }
}

For now main just parks each hart in a low-power wfi loop, since we don’t handle traps yet. In later posts I will explain how to add UART support and print some helpful messages from each core.

Running on QEMU

Here comes the juicy part: actually running our kernel and verifying that it boots and runs main on all 3 cores.

Follow these steps:

  1. Check out my repo at the specified commit hash
 git clone https://github.com/Karthik-d-k/xv6rs.git

 cd xv6rs/

 git checkout 99b67fe8db4a7d50033a612a0902c9444caa32c3
  2. In terminal 1, run QEMU
 just qemu-gdb
  3. In terminal 2, run GDB
 just gdb
  4. Verify our kernel in the GDB terminal
(gdb) continue
Continuing.
^C               # Press Ctrl+C to interrupt the program and then inspect
Thread 3 received signal SIGINT, Interrupt.
[Switching to Thread 1.3]
0x0000000080000020 in kernel::main () at kernel/src/main.rs:23
23              unsafe { core::arch::asm!("wfi") }
(gdb) info threads
  Id   Target Id                    Frame
  1    Thread 1.1 (CPU#0 [running]) 0x0000000080000020 in kernel::main ()
    at kernel/src/main.rs:23
  2    Thread 1.2 (CPU#1 [running]) 0x0000000080000020 in kernel::main ()
    at kernel/src/main.rs:23
* 3    Thread 1.3 (CPU#2 [running]) 0x0000000080000020 in kernel::main ()
    at kernel/src/main.rs:23
(gdb) thread 1
[Switching to thread 1 (Thread 1.1)]
#0  0x0000000080000020 in kernel::main () at kernel/src/main.rs:23
23              unsafe { core::arch::asm!("wfi") }
(gdb) info reg pc tp
pc             0x80000020       0x80000020 <kernel::main+4>
tp             0x0      0x0
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0  0x0000000080000020 in kernel::main () at kernel/src/main.rs:23
23              unsafe { core::arch::asm!("wfi") }
(gdb) info reg pc tp
pc             0x80000020       0x80000020 <kernel::main+4>
tp             0x1      0x1
(gdb) thread 3
[Switching to thread 3 (Thread 1.3)]
#0  0x0000000080000020 in kernel::main () at kernel/src/main.rs:23
23              unsafe { core::arch::asm!("wfi") }
(gdb) info reg pc tp
pc             0x80000020       0x80000020 <kernel::main+4>
tp             0x2      0x2

Each hart (which GDB shows as one thread per core) is looping in main, and their tp registers hold 0, 1, and 2 respectively.
