I had a lot of fun working on my 16 bit softcore processor (https://github.com/aslak3/cpu), and thought it would be interesting to extend the design to a 32 bit processor.
I'm one of those odd programmers who enjoys writing code in assembly. I want to produce an ISA which is pleasant to program in assembly, even if this means it does not perform as well in other environments, such as when it is the target of a C compiler.
That said, it would be terrific to look at producing an LLVM target for this design. With that in mind, it should have the necessary ISA features to make running C code reasonably efficient, provided this doesn't compromise the fun of writing code in assembly.
Sitting between RISC and CISC is a nice place to be.
Stacking operations with multiple registers in one instruction is not very RISC-like at all, and would certainly hinder a future pipelined design. Nonetheless it's a big programmer convenience.
On the other hand, being a load/store based processor has obvious benefits.
I'm happy to borrow ideas from other designs.
An eventual goal is to look at introducing a pipeline, though this may entail a partial or even complete redesign of the ISA and the scrapping of most of this implementation.
This project is also a good place to explore my interest in processor design. For instance, it would be interesting to look at switching to a microcoded control unit, just for the experience of doing so.
This seems like a nice logic block to use to explore other areas of computer systems design, such as memory controllers for SDRAM etc.
Long, Word and Byte size memory accesses, with signed/unsigned extension on byte and word reads
Bus error signal on unaligned word transfers
Memory currently must be 32 bits wide
Instructions
Some opcodes (like LOADI, JUMPs, BRANCHes, ALUMI, CALLs) have one following immediate value or address
Load an immediate 32 bit quantity into a register, the value being found at the following longword in the instruction stream
Load the lower 16 bit portion into a register using a single instruction longword; the value is sign extended to 32 bits
Load and store instructions operate either through a register, a register with an immediate displacement, the program counter with an immediate displacement, or an immediate memory address. Displacements may either be found in the following longword or an integrated (termed "quick" in the ISA) 12 bit quantity, which is sign extended.
Clear instruction as an assembler nicety, which uses a quick load of zero
Simple status bits: zero, negative, carry and overflow
ALU operations including
add, add with carry, subtract, subtract with carry, signed and unsigned 16 bit to 32 bit multiply, and, or, xor, not, shift left, shift right, copy, negation, sign extensions, etc
ALU operations are of the form DEST <= OPERAND1 op OPERAND2, or DEST <= op OPERAND
ALUMI operates with an immediate longword operand extracted from the instruction stream, eg. add r0,r1,#123
ALUMQ operates with an embedded sign extended 12 bit quantity inside the instruction word, eg. addq r0,r1,#2
Assembler provides shorthand versions, eg: add r0,#123 which is the same as: add r0,r0,#123
Flow control, including calling subroutines and return: borrows the 15 conditions from ARM
Jump and call subroutine through register
Branch either with a 32 bit displacement or with a quick 12 bit displacement
Return can also be conditional
Flags (currently just the four condition codes) can be manually ORed/ANDed
Nop and Halt instructions
Register to register copy
Stack
Push and pop a single register eg: push (r15),r0 pushes r0 onto r15
push and pop multiple registers eg: pushmulti (r15),R1|R3|R5 - pushes r1, r3 and r5 onto r15 in sequence, decrementing it by 12
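The pushmulti example above can be modelled in a few lines of Python. This is only a sketch: the helper name `pushmulti`, the dictionary used as memory, and the assumption that bit n of the register mask selects rN are all illustrative and are not taken from the implementation.

```python
def pushmulti(regs, memory, sp_index, mask):
    """Push every register selected by mask, lowest numbered register first.

    Assumption: bit n of mask selects rN. Each push pre-decrements the
    stack register by 4 and then stores a longword at the new address.
    """
    for n in range(16):
        if mask & (1 << n):
            regs[sp_index] = (regs[sp_index] - 4) & 0xFFFFFFFF
            memory[regs[sp_index]] = regs[n]


regs = [0] * 16
regs[1], regs[3], regs[5] = 0x11, 0x33, 0x55
regs[15] = 0x200
memory = {}  # sparse longword memory keyed by byte address

# Equivalent of: pushmulti (r15),R1|R3|R5
pushmulti(regs, memory, 15, (1 << 1) | (1 << 3) | (1 << 5))
print(hex(regs[15]))  # r15 has been decremented by 12
```

After the call r15 holds 0x1f4, with r1 stored at 0x1fc, r3 at 0x1f8 and r5 at 0x1f4.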
Started
Register File, Program Counter, Instruction Register
ALU
Bus Interface
Control Unit (no testbench as yet)
DataPath and external entity
Simulation environment
TODO
Expose condition code register to allow it to be stacked/transferred to a register
If condition -> PC := PC + quick memory displacement
| 0x33 | CALLJUMP | Memory address | 5 | If condition -> rSP := rSP - 4 ; (rSP) := PC ; PC := Memory address |
| 0x34 | CALLBRANCH | Memory displacement | 5 | If condition -> rSP := rSP - 4 ; (rSP) := PC ; PC := PC + Memory displacement |
| 0x35 | CALLBRANCHQ | - | 5 | If condition -> rSP := rSP - 4 ; (rSP) := PC ; PC := PC + Quick memory displacement |
| 0x36 | JUMPR | - | 3 | If condition -> PC := rN |
| 0x37 | CALLJUMPR | - | 5 | If condition -> rSP := rSP - 4 ; (rSP) := PC ; PC := rN |
| 0x38 | RETURN | - | 3 | If condition -> PC := (rSP) ; rSP := rSP + 4 |
| 0x40 | ALUM | - | 3 | rD := rOP2 operation rOP3 |
| 0x42 | ALUMI | Operand | 3 | rD := rOP2 operation operand |
| 0x49 | ALUMS | - | 3 | rD := operation rOP2 |
| 0x50 | ALUMQ | - | 3 | rD := rOP2 operation Quick operand |
| 0x60 | PUSH | - | 4 | rSP := rSP - 4 ; (rSP) := rN |
| 0x61 | POP | - | 3 | rN := (rSP) ; rSP := rSP + 4 |
| 0x62 | PUSHMULTI | - | 3 + rN count * 2 | for each rN set do: rSP := rSP - 4 ; (rSP) := rN |
| 0x61 | POPMULTI | - | 3 + rN count * 2 | for each rN set do: rN := (rSP) ; rSP := rSP + 4 |
| 0x70 | COPY | - | 3 | rD := rS |
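To make the call and return descriptions concrete, here is a rough Python model of CALLJUMP and RETURN. The class and method names are hypothetical, memory is a plain dictionary, and the model assumes that PC already points at the instruction after the call when it is stacked.

```python
class Cpu:
    """Tiny model of the call/return register transfers only."""

    def __init__(self, sp=0x200):
        self.pc = 0
        self.sp = sp
        self.memory = {}  # sparse longword memory keyed by byte address

    def calljump(self, address, condition=True):
        # If condition -> rSP := rSP - 4 ; (rSP) := PC ; PC := Memory address
        if condition:
            self.sp = (self.sp - 4) & 0xFFFFFFFF
            self.memory[self.sp] = self.pc
            self.pc = address

    def ret(self, condition=True):
        # If condition -> PC := (rSP) ; rSP := rSP + 4
        if condition:
            self.pc = self.memory[self.sp]
            self.sp = (self.sp + 4) & 0xFFFFFFFF


cpu = Cpu()
cpu.pc = 0x18          # assumed address of the instruction after the call
cpu.calljump(0x40)
print(hex(cpu.pc), hex(cpu.sp))
cpu.ret()
print(hex(cpu.pc), hex(cpu.sp))
```

A call or return whose condition is false leaves PC and the stack register untouched, which is what makes a conditional return possible.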
Condition flags
| 3 | 2 | 1 | 0 |
|---|---|---|---|
| V: Overflow | C: Carry | Z: Zero | N: Negative |
Conditions (jumps, branches, and return)
| Hex value | Assembly postfix | Description | Meaning |
|---|---|---|---|
| 1 | eq AKA zs | Equal / equals zero | Z |
| 2 | ne AKA zc | Not equal | !Z |
| 3 | cs | Carry set | C |
| 4 | cc | Carry clear | !C |
| 5 | mi | Minus | N |
| 6 | pl | Plus | !N |
| 7 | vs | Overflow | V |
| 8 | vc | No overflow | !V |
| 9 | hi | Unsigned higher | !C and !Z |
| A | ls | Unsigned lower or same | C or Z |
| B | ge | Signed greater than or equal | N == V |
| C | lt | Signed less than | N != V |
| D | gt | Signed greater than | !Z and (N == V) |
| E | le | Signed less than or equal | Z or (N != V) |
| 0, F | al | Always | any |
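The condition table translates fairly directly into code. The following Python sketch evaluates a condition against the four flags; the function name `condition_met` is made up for illustration, and the flag bit positions (V in bit 3 down to N in bit 0) are taken from the condition flags layout above.

```python
# Flag bit positions: V=3, C=2, Z=1, N=0
V, C, Z, N = 0b1000, 0b0100, 0b0010, 0b0001


def condition_met(cond, flags):
    """Return True if the 4 bit condition code holds for the given flags."""
    v, c, z, n = (bool(flags & bit) for bit in (V, C, Z, N))
    table = {
        0x0: True,                     # al
        0x1: z,                        # eq / zs
        0x2: not z,                    # ne / zc
        0x3: c,                        # cs
        0x4: not c,                    # cc
        0x5: n,                        # mi
        0x6: not n,                    # pl
        0x7: v,                        # vs
        0x8: not v,                    # vc
        0x9: (not c) and (not z),      # hi
        0xA: c or z,                   # ls
        0xB: n == v,                   # ge
        0xC: n != v,                   # lt
        0xD: (not z) and (n == v),     # gt
        0xE: z or (n != v),            # le
        0xF: True,                     # al
    }
    return table[cond & 0xF]


print(condition_met(0x1, Z))      # eq with Z set
print(condition_met(0xD, N | V))  # gt with N == V and Z clear
```

Both calls print True.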
Registers
| Value | Register |
|---|---|
| 0b0000 | r0 |
| 0b0001 | r1 |
| ... | ... |
| 0b1110 | r14 |
| 0b1111 | r15 |
Transfer types
| Value | Transfer size and extension mode (loads only) |
|---|---|
| 0b000 | Byte unsigned |
| 0b001 | Word unsigned |
| 0b010 | Long unsigned |
| 0b011 | Reserved |
| 0b100 | Byte signed |
| 0b101 | Word signed |
| 0b110 | Long |
| 0b111 | Reserved |
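As an illustration of how a loaded value might be widened according to the transfer type, here is a small Python sketch. It assumes that the low two bits select the size and that bit 2 selects sign extension, so that 0b100 is the signed byte mode alongside 0b101 for signed words. The name `extend_loaded` is hypothetical.

```python
SIZES = {0b00: 1, 0b01: 2, 0b10: 4}  # bytes transferred: byte, word, long


def extend_loaded(raw, transfer_type):
    """Widen a loaded byte/word/long to 32 bits per the transfer type.

    Raises KeyError for the reserved encodings (0b011 and 0b111).
    """
    size = SIZES[transfer_type & 0b011]
    signed = bool(transfer_type & 0b100)
    bits = size * 8
    mask = (1 << bits) - 1
    raw &= mask
    if signed and raw & (1 << (bits - 1)):
        # Fill every bit above the transferred width with ones
        raw |= 0xFFFFFFFF & ~mask
    return raw


print(hex(extend_loaded(0x80, 0b000)))    # byte, zero extended
print(hex(extend_loaded(0x80, 0b100)))    # byte, sign extended
print(hex(extend_loaded(0x8000, 0b101)))  # word, sign extended
```

For long transfers the extension mode makes no difference, since all 32 bits come from memory.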
ALU multi (destination and operand) operations
| Value | Operation |
|---|---|
| 0b0000 | Add |
| 0b0001 | Add with carry |
| 0b0010 | Subtract |
| 0b0011 | Subtract with borrow |
| 0b0100 | Bitwise AND |
| 0b0101 | Bitwise OR |
| 0b0110 | Bitwise XOR |
| 0b0111 | Copy (does not update flags) |
| 0b1000 | Compare |
| 0b1001 | Bitwise test |
| 0b1010 | Unsigned 16 bit to 32 bit multiply |
| 0b1011 | Signed 16 bit to 32 bit multiply |
| 0b1100-0b1111 | Unused |
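The difference between the two multiplies only shows up when an operand has bit 15 set. A hedged Python sketch follows, assuming both operations take the low 16 bits of each source register and produce a full 32 bit result; the function names are illustrative and flag updates are not modelled.

```python
def mulu(a, b):
    """Unsigned 16 bit x 16 bit -> 32 bit multiply (low halves assumed)."""
    return ((a & 0xFFFF) * (b & 0xFFFF)) & 0xFFFFFFFF


def to_signed16(x):
    """Interpret the low 16 bits of x as a two's complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def muls(a, b):
    """Signed 16 bit x 16 bit -> 32 bit multiply (low halves assumed)."""
    return (to_signed16(a) * to_signed16(b)) & 0xFFFFFFFF


print(hex(mulu(0xFFFF, 2)))  # 65535 * 2
print(hex(muls(0xFFFF, 2)))  # -1 * 2
```

The first call gives 0x1fffe and the second 0xfffffffe, the 32 bit two's complement pattern for -2.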
ALU single (destination only) operations
| Value | Operation |
|---|---|
| 0b0000 | Increment |
| 0b0001 | Decrement |
| 0b0010 | Bitwise NOT |
| 0b0011 | Logic shift left |
| 0b0100 | Logic shift right |
| 0b0101 | Arithmetic shift left |
| 0b0110 | Arithmetic shift right |
| 0b0111 | Negation |
| 0b1000 | Byte swap |
| 0b1001 | Compare with zero |
| 0b1010 | Sign extend word |
| 0b1011 | Sign extend byte |
| 0b1100-0b1111 | Unused |
Sample code
The currently used CustomASM CPU definition makes it possible to write very presentable assembly by, for example, combining LOADI, LOADM, LOADR and LOADRD into a single "load" mnemonic with the width represented by .l, .ws, .wu, .bs or .bu. ALU operations are similarly represented.
                #d32 start              ; reset vector
    start:      load.l r15,#0x200       ; setup the stack pointer
                loadq.ws r0,#1          ; initial factorial
                loadq.ws r3,#9          ; getting 1 to this number
                load.l r2,#table        ; output pointer
    loop:       calljump factorial      ; get the factorial for r0 in r1
                store.l (r2),r1         ; save it in the table
                addq r2,#4              ; move to the next row
                addq r0,#1              ; inc the number we are calculating
                compare r0,r3           ; got all the factorials?
                branchqne loop          ; no? loop again
                halt                    ; stop the proc
    factorial:  push (r15),r0           ; save the param, we will use it
                copy r1,r0              ; start from this value
    l:          subq r0,#1              ; loop counter
                branchzs factorialo     ; done?
                mulu r1,r0,r1           ; multiply running total by previous
                branch l                ; get the next one
    factorialo: pop r0,(r15)            ; restore the original param
                return                  ; done
                #d32 -1                 ; start of table marker
    table:                              ; table of output goes here
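As a cross-check of what the sample should leave in the table, here is a rough Python model of the same loop structure. It is not a simulator of the processor: it only mirrors the control flow, and it assumes that mulu multiplies the low 16 bits of each operand and that the outer loop ends once r0 equals r3.

```python
def factorial(r0):
    """Mirror of the factorial subroutine; returns the value left in r1."""
    r1 = r0                                  # copy r1,r0
    while True:
        r0 -= 1                              # subq r0,#1
        if r0 == 0:                          # branchzs factorialo
            return r1
        # mulu r1,r0,r1 (assumed 16 bit x 16 bit -> 32 bit)
        r1 = ((r0 & 0xFFFF) * (r1 & 0xFFFF)) & 0xFFFFFFFF


def run():
    """Mirror of the main loop; returns the longwords stored at table."""
    table = []
    r0, r3 = 1, 9                            # loadq.ws r0,#1 ; loadq.ws r3,#9
    while True:
        table.append(factorial(r0))          # calljump factorial ; store.l (r2),r1
        r0 += 1                              # addq r0,#1
        if r0 == r3:                         # compare r0,r3 ; branchqne loop
            break
    return table


print(run())
```

Under those assumptions the model prints [1, 2, 6, 24, 120, 720, 5040, 40320], the factorials of 1 through 8, and every intermediate operand fits in 16 bits.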