# Project Two: RISC Processor Implementation

ECE 485

Chenqi Bao, Peter Chinetti

November 6, 2013

Instructor: Professor Borkar

## 1 Statement of Problem

This project requires the design and test of a RISC processor in VHDL. It focuses especially on the datapath design of the processor and its implementation. In this group's specific case, the required instructions were:

| Name             | Abbrev. | Type |
|------------------|---------|------|
| Load Word        | lw      | I    |
| Store Word       | sw      | I    |
| Add              | add     | R    |
| Branch On Equal  | beq     | I    |
| NAND<sup>1</sup> | nand    | R    |
| OR Immediate     | ori     | I    |
| OR               | or      | R    |
| AND Immediate    | andi    | I    |

<sup>1</sup> NAND does not exist in the MIPS ISA, so the ISA was extrapolated to fill out the table.

## 2 Background

### 2.1 Instruction Types

The MIPS ISA defines three instruction types: R, I, and J. Only R and I type instructions are covered here, as they are the only ones implemented for this project.

#### 2.1.1 R Type

R type, or register type, instructions are the most common form of MIPS instructions. In this instruction format, the 32 bits of the instruction are split as follows:

| Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-11 | Bits 10-6 | Bits 5-0 |
|------------|------------|------------|------------|-----------|----------|
| opcode     | rs         | rt         | rd         | shamt     | funct    |

In these instructions, the opcode is always $000000_2$, and the function code (funct) determines the specific instruction. rs and rt are the two registers the operation works on, and rd is the destination register. Some instructions also need a shift amount, which is specified in shamt.

#### 2.1.2 I Type

I type, or immediate type, instructions are also very common. In this instruction format, the 32 bits of the instruction are split as follows:

| Bits 31-26 | Bits 25-21 | Bits 20-16 | Bits 15-0 |
|------------|------------|------------|-----------|
| opcode     | rs         | rt         | immediate |

In these instructions, the opcode field itself encodes the specific instruction. rt is the destination register, and rs is the register on which the operation acts. The immediate field holds the immediate data that serves as the other operand.

### 2.2 Multicycle Datapath

The microprocessor logically comprises two main components: datapath and control.
The datapath performs the arithmetic operations, and control tells the datapath, memory, and I/O devices what to do according to the instructions of the program [1]. When executing an instruction, the microprocessor steps through five main stages: Instruction Fetch (IF), Instruction Decode (ID), Execution (EX), Memory Operations (MEM), and Write Back (WB).

A multicycle datapath implementation takes advantage of the fact that the stages of the operation can share the same hardware. Rather than using, for example, a separate ALU for PC incrementing and for the addition of two registers, the same ALU can have its input switched from PC incrementation to register reads. This reuse reduces the number of components in the processor, which can lower its cost. A multicycle design, however, requires some additional work in the form of multiplexers to select between the inputs and outputs of each stage. Although this is a non-trivial amount of work, it is still better than duplicating components for each step.

### 2.3 VHDL

VHDL is a hardware description language that can be used to prototype digital systems. According to [2], "VHDL includes facilities for describing logical structure and function of digital system at a number of levels of abstraction, from system level down to the gate level."

## 3 Implementation

### 3.1 Design Decisions

#### 3.1.1 Instruction Set

The first design decision was the format of the requested instructions. Generally, we used the format specified in the MIPS ISA, but, as mentioned earlier, NAND is not implemented in the MIPS ISA.
Below is a list of our choices for opcodes and function codes:

| Opcode | Function Field | Instruction | Example Operation     |
|--------|----------------|-------------|-----------------------|
| 100011 | 000000         | lw          | lw \$t3,200(\$t2)     |
| 101011 | 000000         | sw          | sw \$t3,0(\$t2)       |
| 000000 | 100000         | add         | add \$t1,\$t1,\$t1    |
| 000100 | 000000         | beq         | beq \$t1,\$t4,15      |
| 000000 | 100101         | or          | or \$t0,\$t1,\$t0     |
| 001100 | 000000         | andi        | andi \$t0,\$t0,5      |
| 000000 | 000001         | nand        | nand \$t0,\$t0,\$zero |
| 001101 | 000000         | ori         | ori \$t6,\$t6,61680   |

#### 3.1.2 Memory

Memory was implemented as a simple array of 256 words. Larger memory sizes are possible, but they are unnecessarily complicated for a simple demonstration such as this.

### 3.2 Optimization

Little optimization was done on this project other than omitting obviously useless code. This processor is not pipelined, so it lacks the optimizations that make modern processors so quick.

### 3.3 Improvements

This processor could be improved in many ways. The most obvious are to implement a complete instruction set, add pipelining, and increase the memory size. Currently the processor serves solely as an educational demonstration, but with considerable further work it could grow into a complete implementation of the MIPS ISA.

### 3.4 Failures

Thankfully, we have no failures to report.

### 3.5 Block Diagram

See Figure 1.

### 3.6 Simulation

The output of the simulations can be found in Figures 2-9. The simulations were run sequentially, in the order of presentation, so the values going into an instruction often depend on the output of the previous one.
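As a cross-check of the encodings chosen in Section 3.1.1, the field packing for R and I type instructions can be sketched in a few lines of VHDL. This is a minimal sketch and not part of the submitted design: the entity `encode_demo` and the helper functions `to_bits`, `encode_r`, and `encode_i` are hypothetical names introduced only for illustration, and the register numbers assume the usual MIPS convention (\$t1 = 9, \$t6 = 14). It packs `add $t1,$t1,$t1` and `ori $t6,$t6,61680` and compares the results with the words `0129_4820` and `35CE_F0F0` that appear in the instruction memory of the code listing.

```vhdl
-- Illustrative sketch only; all names in this block are hypothetical.
entity encode_demo is
end entity encode_demo;

architecture sketch of encode_demo is
  subtype word is bit_vector(31 downto 0);

  -- Convert a non-negative integer to a bit vector of the given width,
  -- most significant bit on the left.
  function to_bits(value : natural; width : positive) return bit_vector is
    variable result : bit_vector(width - 1 downto 0);
    variable temp   : natural := value;
  begin
    for i in 0 to width - 1 loop
      if temp mod 2 = 1 then
        result(i) := '1';
      else
        result(i) := '0';
      end if;
      temp := temp / 2;
    end loop;
    return result;
  end function to_bits;

  -- R type: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6); opcode is all zeros.
  function encode_r(rs, rt, rd, shamt : natural;
                    funct : bit_vector(5 downto 0)) return word is
    variable result : word;
  begin
    result := "000000" & to_bits(rs, 5) & to_bits(rt, 5)
              & to_bits(rd, 5) & to_bits(shamt, 5) & funct;
    return result;
  end function encode_r;

  -- I type: opcode(6) rs(5) rt(5) immediate(16).
  function encode_i(opcode : bit_vector(5 downto 0);
                    rs, rt, imm : natural) return word is
    variable result : word;
  begin
    result := opcode & to_bits(rs, 5) & to_bits(rt, 5) & to_bits(imm, 16);
    return result;
  end function encode_i;
begin
  check : process
  begin
    -- add $t1,$t1,$t1 (assumes $t1 is register 9)
    assert encode_r(9, 9, 9, 0, "100000") = X"01294820"
      report "add encoding mismatch" severity failure;
    -- ori $t6,$t6,61680 (assumes $t6 is register 14; 61680 = F0F0 hex)
    assert encode_i("001101", 14, 14, 61680) = X"35CEF0F0"
      report "ori encoding mismatch" severity failure;
    wait;  -- suspend forever once both checks have run
  end process check;
end architecture sketch;
```

Under these assumptions, simulating `encode_demo` should complete without raising either assertion. The same two functions could be used to derive the remaining instruction words from the table in Section 3.1.1.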
### 3.7 Code Listing

#### 3.7.1 Datapath

```vhdl
entity MIPS is
  port (
    clock    : in  bit;                      -- clock record
    PC0      : out integer;                  -- PC counter (32 bits)
    SET      : in  bit;
    Memval   : out bit_vector(31 downto 0);  -- mem word addressable
    Instrval : out bit_vector(31 downto 0);  -- instruction 32 bits wide
    Output   : out bit_vector(31 downto 0);  -- we are working in word size
    Port1, Port2, Port3, Port4 : out bit_vector(31 downto 0));
end MIPS;

architecture INSTRUCTION of MIPS is
  -- Data types
  signal internal_state : integer;
  subtype word is bit_vector(31 downto 0);      -- 32-bit words
  type regfile is array (0 to 31) of word;      -- 32 words
  type ram is array (0 to 255) of word;         -- toy sized ram for testing
  subtype reg_addr is bit_vector(4 downto 0);   -- 2^5 can address 32 regs
  subtype halfword is bit_vector(15 downto 0);  -- 16-bit entities, i.e. immediate value
  subtype byte is bit_vector(7 downto 0);       -- if we need bytes
  constant bvc : bit_vector(0 to 1) := "01";    -- binary value

  -- int -> bits
  procedure int2bits(int : in integer; bits : out bit_vector) is
    variable temp   : integer;
    variable result : bit_vector(bits'range);
  begin
    temp := int;
    if int < 0 then
      temp := -int - 1;
    end if;
    for index in bits'reverse_range loop
      result(index) := bvc(temp rem 2);
      temp := temp / 2;
    end loop;
    if int < 0 then
      result := not result;
      result(bits'left) := '1';
    end if;
    bits := result;
  end int2bits;

  -- bits -> unsigned int
  function bits2int(bits : in bit_vector) return integer is
    variable result : integer := 0;
  begin
    for index in bits'range loop
      result := result * 2 + bit'pos(bits(index));
    end loop;
    return result;
  end bits2int;

  -- Sign extend
  function sign_ext(imm : in halfword) return word is
    variable extended : word;
  begin
    if imm(imm'left) = '1' then
      extended := (31 downto 16 => '1') & imm;
    else
      extended := (31 downto 16 => '0') & imm;
    end if;
    return extended;
  end sign_ext;

  -- addsel = '0' adds a and nb; addsel = '1' subtracts nb from a
  procedure alu_add_subtract(addsel : in bit; result : inout word;
                             a, nb : in word; V, N : out bit) is  -- Overflow -> Cout
    variable sum   : word;
    variable carry : bit := '0';
    variable b     : word;
  begin
    if addsel = '1' then
      b := not nb;
      carry := '1';
    else
      b := nb;
    end if;
    for index in sum'reverse_range loop
      sum(index) := a(index) xor carry xor b(index);
      carry := (a(index) and b(index)) or (carry and (a(index) xor b(index)));
    end loop;
    result := sum;
    V := carry;  -- = '1';
  end procedure alu_add_subtract;

begin
  Proc : process (clock)
    variable i : integer := 0;  -- Execution cycle counter
  begin
    if clock = '1' and clock'event then
      if i = 5 or SET = '1' then  -- reset on SET or 5 cycles
        i := 0;
      end if;
      i := i + 1;
      internal_state <= i;
    end if;
  end process Proc;

  Datapath : process (internal_state)
    variable result, Instr, op1, op2, op3, maddr : word;
    variable opcode, funct : bit_vector(5 downto 0);
    variable rs, rt, rd, dstreg, shamt : reg_addr;
    variable state : integer := 0;  -- == cycle
    variable PC : integer := 0;
    variable Imm : halfword;
    variable mem_index : byte;  -- only need 8 bits
    variable reg : regfile := (9 => X"0000_0001", 10 => X"0000_0002",
                               12 => X"0000_0002", others => X"0000_0000");
    variable mem : ram := (
      -- ... (entries before index 2 are truncated in the source listing)
      2 => X"0129_4820",   -- add $t1,$t1,$t1 [doing 1+1 and store the result in $t1]
      -- ... (entries between index 2 and index 18 are truncated in the source listing)
      18 => X"35CE_F0F0",  -- ori $t6,$t6,61680 [or 0000_F0F0 with 0000_0000 ...]
      others => X"0000_0000");
    variable mem_rw : boolean;  -- Mem Access
    variable mem_r : boolean;   -- Mem Read
    variable i : integer := 0;  -- Exec cycle counter
    variable Dmem : ram := (
      202 => X"7FFF_FFFF",
      others => X"0000_0000");
    variable V, N, RST : bit;
  begin
    state := internal_state;
    case state is
      when 1 =>  -- IF
        Instr := mem(PC);
        PC := PC + 1;  -- If PC is an int, incrementing by 1 works
        RST := '0';    -- init
        mem_rw := false;  -- init
      when 2 =>  -- ID
        opcode := Instr(31 downto 26);
        rs := Instr(25 downto 21);
        rt := Instr(20 downto 16);
        rd := Instr(15 downto 11);
        dstreg := rt;
        Imm := Instr(15 downto 0);
        shamt := Instr(10 downto 6);
        funct := Instr(5 downto 0);
        op1 := reg(bits2int(rs));  -- after filtering to an int, store
        op2 := reg(bits2int(rt));
        op3 := sign_ext(Imm);  -- this is the immediate value after being sign extended
      when 3 =>  -- EX
        case opcode is  -- switch on opcode
          when "100011" =>  -- lw
            alu_add_subtract('0', maddr, op1, op3, V, N);
            mem_rw := true;
            mem_r := true;
          when "101011" =>  -- sw
            alu_add_subtract('0', maddr, op1, op3, V, N);
            mem_rw := true;
            mem_r := false;
          when "000100" =>  -- beq
            alu_add_subtract('1', result, op1, op2, V, N);
            if result = X"0000_0000" then  -- if our ALU had a zero output, take the branch
              PC := PC + bits2int(op3);
              RST := '1';
            end if;
          when "001101" =>  -- ORI
            result := op1 or op3;
          when "001100" =>  -- ANDI
            result := op1 and op3;
          when "000000" =>  -- 0 opcode, therefore R type
            dstreg := rd;  -- R types always have rd as the dest
            case funct is
              when "100000" =>  -- Add
                alu_add_subtract('0', result, op1, op2, V, N);
              when "100001" =>  -- NAND
                result := op1 nand op2;
              when "100100" =>  -- AND
                result := op1 and op2;
              when "100101" =>  -- OR
                result := op1 or op2;
              when others =>
                null;
            end case;
          when others =>
            null;
        end case;
      when 4 =>  -- MEM
        if mem_rw = true then  -- These flags got set above when decoding lw and sw
          if mem_r = true then  -- set on read
            result := Dmem(bits2int(maddr));
          else  -- cleared on write
            Dmem(bits2int(maddr)) := op2;  -- reg2 written to mem
            RST := '1';
          end if;
        end if;
      when 5 =>  -- Write-back cycle
        if RST = '0' then  -- if we didn't write to mem
          reg(bits2int(dstreg)) := result;  -- writeback value to dest. register
        end if;
      when others =>
        null;
    end case;
    Output <= result;
    Memval <= mem(bits2int(maddr));
    PC0 <= PC;
    Instrval <= Instr;
    Port1 <= op1;
    Port2 <= op2;
    Port3 <= op3;
    Port4 <= reg(bits2int(dstreg));
  end process Datapath;
end INSTRUCTION;
```

MIPS.vhd

#### 3.7.2 Simulator

```vhdl
ENTITY sim2 IS
END sim2;

ARCHITECTURE simulation OF sim2 IS
  COMPONENT MIPS
    PORT (
      clock    : In  bit;
      SET      : In  bit;
      Output   : Out BIT_VECTOR (31 DownTo 0);
      PC0      : Out INTEGER;
      Memval   : Out BIT_VECTOR (31 DownTo 0);
      Instrval : Out BIT_VECTOR (31 DownTo 0);
      Port1, Port2, Port3, Port4 : Out BIT_VECTOR (31 DownTo 0)
    );
  END COMPONENT;

  SIGNAL Clock : bit := '0';
  SIGNAL SET   : bit := '0';
  -- The 32-bit initial values are truncated in the source listing;
  -- all zeros are assumed here.
  SIGNAL Output   : BIT_VECTOR (31 DownTo 0) := (OTHERS => '0');
  SIGNAL PC0      : INTEGER := 0;
  SIGNAL Memval   : BIT_VECTOR (31 DownTo 0) := (OTHERS => '0');
  SIGNAL Instrval : BIT_VECTOR (31 DownTo 0) := (OTHERS => '0');
  SIGNAL Port1, Port2, Port3, Port4 : BIT_VECTOR (31 DownTo 0) := (OTHERS => '0');

-- Simulation begins
BEGIN
  UUT : MIPS
    PORT MAP (
      clock    => clock,
      SET      => SET,
      Instrval => Instrval,
      Output   => Output,
      PC0      => PC0,
      Memval   => Memval,
      Port1    => Port1,
      Port2    => Port2,
      Port3    => Port3,
      Port4    => Port4
    );

  -- Free-running clock with a 100 ns period
  PROCESS
  BEGIN
    CL : LOOP
      clock <= '0';
      WAIT FOR 50 ns;
      clock <= '1';
      WAIT FOR 50 ns;
    END LOOP CL;
  END PROCESS;

  PROCESS
  BEGIN
    WAIT FOR 5000 ns;
  END PROCESS;
END simulation;
```

simulator.vhd

## References

- [1] David A. Patterson, John L. Hennessy, *Computer Organization and Design*. Morgan Kaufmann, Massachusetts, 4th Revised Edition, 2012.
- [2] Peter J. Ashenden, *VHDL Tutorial*.
Elsevier Science, USA, 2004.

Figure 1: Block Diagram

Figure 2: lw \$t3,200(\$t2); Loading 7FFF FFFF

Figure 3: sw \$t3,0(\$t2); Storing 7FFF FFFF

Figure 4: add \$t1,\$t1,\$t1; \$t1 initialized to 1

Figure 5: beq \$t1,\$t4,15; \$t4 initialized to 2, \$t1 = 2

Figure 6: or \$t0,\$t3,\$t0; \$t3 = 7FFF FFFF, \$t0 = 0

Figure 7: andi \$t0,\$t0,5; \$t0 = 7FFF FFFF

Figure 8: nand 0,0,0,0

Figure 9: ori \$t6,\$t6,61680; \$t6 = 0