Building LMARV-1: a tangible RISC-V processor, part 1

A new project, hooray!


RISC-V is not a processor in the sense of an ARM processor or x86. It is, in fact, an open specification for an Instruction Set Architecture (ISA). This means that the instruction set is standardized by the [RISC-V Foundation], and then anyone can implement [that instruction set] free of any licenses, royalties, or legal requirements.

RISC-V is the result of much study of existing ISAs. The result is regular, which makes an instruction decoder easy to create. It's also modular. Only one subset of the ISA is required to be implemented: the 32-bit integer set. This is called the 32I extension (RV32I in the spec's naming), meaning that all instructions and registers are 32 bits. Everything else is optional. For example, the M extension includes multiplication and division instructions. A is for atomic operations. F and D are for single- and double-precision floating point. The G extension means "General", and means that you implement all of I, M, A, F, and D.
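To make the letter soup concrete, here is a minimal sketch in Python that expands G into its member extensions, following the description above. The function name `expand_extensions` and the table are my own illustration, not anything defined by the spec.

```python
# Extension letters as described in the text above (illustrative table only).
EXTENSIONS = {
    "I": "base integer instructions",
    "M": "multiplication and division",
    "A": "atomic operations",
    "F": "single-precision floating point",
    "D": "double-precision floating point",
}

def expand_extensions(letters):
    """Expand 'G' into 'IMAFD' and drop duplicates, keeping first-seen order."""
    result = []
    for letter in letters.upper():
        # G is shorthand for the general-purpose set I, M, A, F and D.
        for member in ("IMAFD" if letter == "G" else letter):
            if member not in result:
                result.append(member)
    return "".join(result)

for letter in expand_extensions("G"):
    print(letter, "-", EXTENSIONS[letter])
```

So a processor described as "GC" implements the same extensions as one described as "IMAFDC".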

In general, if your implementation doesn't implement a particular G extension, you will have to provide libraries that do the same thing in software. Otherwise a compiler wouldn't be able to compile certain standard expressions.
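As an illustration of what such a library routine does, here is a rough Python sketch of a 32-bit multiply built only from shifts and adds, which is all a 32I-only processor has. The name `mul32` is hypothetical; real toolchains ship their own routines for this (libgcc's `__mulsi3`, for example), and this is not a copy of any of them.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits, like a 32I register

def mul32(a, b):
    """Multiply two 32-bit values using only shift, add, and bit tests."""
    a &= MASK32
    b &= MASK32
    result = 0
    while b:
        if b & 1:                      # lowest bit of b set: add the shifted a
            result = (result + a) & MASK32
        a = (a << 1) & MASK32          # next bit of b is worth twice as much
        b >>= 1
    return result

print(mul32(6, 7))  # 42
```

The loop runs at most 32 times, which is why a hardware multiplier (the M extension) is such a big speedup.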

There's also a 64-bit extension, 64I, and a 128-bit extension, 128I. There is a 32E extension, intended for smaller (e.g. embedded) processors, which halves the number of required integer registers. Quad-precision floating point is in the Q extension. There's even an extension, C, for compressed instructions, which allows 16-bit and variable-length instructions.

Some other interesting extensions, which have not yet been frozen (i.e. fully developed and agreed upon), are V (vector instructions), L (decimal floating point, as in calculators), and B (bit manipulation).

Speaking of extensions, RISC-V is extensible. If you have some piece of specialized hardware that you want custom instructions for, there is a whole range of opcodes reserved for that. Of course, your compiler would have to support those custom opcodes, or you could just wrap the assembly language.

And speaking of instructions, it should be pretty clear that the ISA is a reduced instruction set. There are fewer than 50 instructions in the base 32I set! All the instructions are very simple. There aren't even any condition codes or flags such as carry, zero, or overflow (these can be handled by other instructions).
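For example, a carry out of an addition can be recovered with an unsigned compare: if the 32-bit sum is smaller than one of the operands, the addition must have wrapped around. Here is a small Python model of that trick. The helper names are mine; `add` and `sltu` only mimic what the ADD and SLTU instructions compute.

```python
MASK32 = 0xFFFFFFFF

def add(a, b):
    """Model of ADD: 32-bit wraparound sum, no flags produced."""
    return (a + b) & MASK32

def sltu(a, b):
    """Model of SLTU: 1 if a < b when both are treated as unsigned, else 0."""
    return 1 if (a & MASK32) < (b & MASK32) else 0

def add_with_carry(a, b):
    """Return (sum, carry) using only the two operations above."""
    total = add(a, b)
    carry = sltu(total, a)  # the sum wrapped around exactly when it is below a
    return total, carry

print(add_with_carry(0xFFFFFFFF, 1))  # (0, 1)
print(add_with_carry(2, 3))           # (5, 0)
```

That is one extra instruction in the rare cases where a carry is needed, instead of flag hardware that every instruction has to maintain.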

RISC-V specifies four levels of privilege. The highest level is the machine level, which is the only required level. At this level all instructions have access to all of memory and all peripherals. The next level down is the hypervisor level, which is for things like virtual machines. Then there is the supervisor level below that, for operating systems and kernels. Finally, the lowest privilege level is the user level, which is for applications.

The plan

Most, if not all, projects implementing a RISC-V processor use an FPGA. However, I want to build a RISC-V processor that you can see and touch the insides of, so that you can learn how a RISC-V processor actually works by observing it. I plan to use MSI and LSI chips: things like buffers, flip-flops, and so on, with as few programmable chips as possible.

This first part is going to be about building the registers for the processor, which I call the LMARV-1 (Learn Me A RISC-V, level 1). It is level 1 because I only plan on implementing the 32I extension. Later levels add more features.

An instruction

There are several [formats of instructions], but the most interesting one specifies two source registers and a destination register:


Here, d stands for a destination register, 1 for one source register, and 2 for a second source register. For example, adding two registers and storing the result in a third would use this instruction format.

Another format only specifies a destination and one source register. And a third format only has a destination register, for example for storing an immediate value. But the interesting thing is that the destination and source registers are all in the same position in the instruction. The instruction set is therefore regular. 
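Because the register fields never move, a decoder can slice them out before it even knows which format it is looking at. Here is a minimal Python sketch; the function name `decode_registers` is hypothetical, and the bit positions (destination in bits 11:7, first source in bits 19:15, second source in bits 24:20) are the ones the base ISA uses.

```python
def decode_registers(instruction):
    """Slice the three 5-bit register fields out of a 32-bit instruction word.

    Formats without a given register reuse those bits for something else
    (such as an immediate), so the caller ignores the fields it doesn't need.
    """
    rd = (instruction >> 7) & 0x1F    # destination register, bits 11:7
    rs1 = (instruction >> 15) & 0x1F  # first source register, bits 19:15
    rs2 = (instruction >> 20) & 0x1F  # second source register, bits 24:20
    return rd, rs1, rs2

# add x3, x1, x2: destination plus two sources
print(decode_registers(0x002081B3))        # (3, 1, 2)

# addi x5, x6, 10: destination plus one source (third field is immediate bits)
rd, rs1, _ = decode_registers(0x00A30293)
print(rd, rs1)                             # 5 6

# lui x7, 0x12345: destination only
rd, _, _ = decode_registers(0x123453B7)
print(rd)                                  # 7
```

In hardware this means the register select lines can be wired straight from the instruction word, with no multiplexing between formats.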


There are 32 registers called x0-x31, in addition to a program counter register, pc. In the 32I extension, these are all 32-bit registers. Interestingly, x0 is a fixed value, zero. Writing to it does nothing, and reading from it always yields zero. Compilers often set aside one register for a fixed zero value, and this just formalizes that.
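The behavior of x0 is easy to capture in a small software model. This is only a sketch of the register file's behavior, with a class name of my own choosing. It says nothing about how the hardware is built.

```python
class RegisterFile:
    """Behavioral model of x0-x31 for the 32-bit integer set."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        # x0 always reads as zero, regardless of what was written to it.
        return 0 if index == 0 else self._regs[index]

    def write(self, index, value):
        # Writes to x0 are silently discarded; other registers keep 32 bits.
        if index != 0:
            self._regs[index] = value & 0xFFFFFFFF

regs = RegisterFile()
regs.write(0, 1234)   # does nothing
regs.write(5, 1234)
print(regs.read(0), regs.read(5))  # 0 1234
```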

The RISC-V spec also declares which registers are for what use. This is called the ABI spec, or Application Binary Interface. Each register has an alias name in the ABI. For example the name for x0 is "zero".

I am not too concerned with the ABI, since I'm not writing a compiler. I'm building the hardware that the compiler will write programs for.

So, here's my idea of what a single register should look like:

For this setup, I'm using a 74LVT16374, which is a 16-bit D flip-flop. LV means it's a low-voltage (3.3 volt) part, and T is the technology used, called ABT, or [BiCMOS]. This is like TTL but also low-power like CMOS.


As a destination register, we can clock data from a destination bus into this register. We can also have two source buses, and use the 74LVTH162541 16-bit tristate buffer. This controls which source bus the register outputs on: one, the other, both, or none.
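Here is a rough behavioral model of that arrangement in Python, for one 16-bit slice. All the names (`RegisterSlice`, `enable_source1`, `resolve_bus`) are made up for illustration. A bus that nobody drives is modeled as `None`, and two registers driving the same bus at once is treated as an error, since in hardware that would be bus contention.

```python
class RegisterSlice:
    """One 16-bit flip-flop with a tristate buffer onto each source bus."""

    def __init__(self):
        self.value = 0
        self.enable_source1 = False  # output enable for source bus 1
        self.enable_source2 = False  # output enable for source bus 2

    def clock(self, destination_bus):
        # On a clock edge, latch whatever is on the destination bus.
        self.value = destination_bus & 0xFFFF

def resolve_bus(registers, bus_number):
    """Return the value on a source bus, or None if no buffer is enabled."""
    drivers = [
        r.value for r in registers
        if (r.enable_source1 if bus_number == 1 else r.enable_source2)
    ]
    if len(drivers) > 1:
        raise ValueError("bus contention: more than one register enabled")
    return drivers[0] if drivers else None

a, b = RegisterSlice(), RegisterSlice()
a.clock(0x1234)
b.clock(0xBEEF)
a.enable_source1 = True   # a drives source bus 1
b.enable_source2 = True   # b drives source bus 2
print(hex(resolve_bus([a, b], 1)), hex(resolve_bus([a, b], 2)))  # 0x1234 0xbeef
```

Enabling both buffers on the same register puts its value on both source buses, which is what an instruction using the same register for both sources needs.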

There are LEDs which can show the state of the register, which is important for a visible processor.

Now of course, these are all 16-bit parts. There are no equivalent 32-bit parts. So we would have to multiply the number of chips per register by two, making six (two flip-flop chips and four buffer chips). And then there are 31 registers (aside from the zero register), so 6 × 31 = 186 chips.

Here's an alternate setup:


Here, I store the same value in three places: two sources and a display (the unlabeled box at the bottom, which is also a '16374). This has two advantages: cost and drive capability. The '16374 can drive 32 mA, while the '162541 can only drive 12 mA.

And here it is!


The LEDs on it are specifically 3mm flangeless LEDs. The reason they are flangeless is that the flange on the LED adds to the 3mm width, which would require the board to be about 10% larger.

The two card edges go into two PCIe x8 slots (98 contacts each), for a total of 196 signals. Some of these signals are used for power and ground between bits on the bus, for signal integrity.

You can see a [video version] of this post on my YouTube channel.

And, all of [my schematics] (KiCAD) are on GitHub and are [Open Source Hardware].