MIPS disassembler
This page is maintained by Alexander Timofeev
As you know, all instructions are just sets of bits. In MIPS, each instruction is a 4-byte word. The processor decodes an instruction according to a set of rules called its format.
Although formatted 4-byte opcodes are compact and fully readable by the processor, they are very hard for a programmer to read. That is why people use assembly language: a language of standardized, human-readable statements. Every statement can be converted to a 4-byte instruction; this is called assembling. Conversely, every 4-byte instruction can be converted back to an assembly statement; this is called disassembling.
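To make the idea of a format concrete, here is a minimal sketch that splits one 32-bit word into the fields of the standard MIPS R format using shifts and masks. The names `RFields` and `splitR` are hypothetical and are not part of the simulator.

```cpp
#include <cstdint>

// Fields of an R-format MIPS instruction word (hypothetical helper type).
struct RFields
{
    uint32_t opcode; // bits 31..26
    uint32_t rs;     // bits 25..21
    uint32_t rt;     // bits 20..16
    uint32_t rd;     // bits 15..11
    uint32_t shamt;  // bits 10..6
    uint32_t funct;  // bits 5..0
};

// Split a raw instruction word into R-format fields.
RFields splitR( uint32_t word)
{
    RFields f;
    f.opcode = ( word >> 26) & 0x3F;
    f.rs     = ( word >> 21) & 0x1F;
    f.rt     = ( word >> 16) & 0x1F;
    f.rd     = ( word >> 11) & 0x1F;
    f.shamt  = ( word >> 6)  & 0x1F;
    f.funct  = word & 0x3F;
    return f;
}
```

For the word `0x012A4020` this gives opcode 0, rs 9, rt 10, rd 8 and funct 0x20, which is the encoding of `add $t0, $t1, $t2`.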
So, a disassembler is a program that converts an assembled program back into a human-readable view. It is extremely useful when developing any simulator: architectural or micro-architectural, functional or performance.
The disassembler should support all instructions listed on this page, excluding pseudo-instructions, and all registers listed on the MIPS register page.
Class FuncInstr is the basic abstraction of an instruction in our simulator.
The implementation provides these interfaces:
    class FuncInstr
    {
        // ...
    public:
        FuncInstr( uint32 bytes);
        const std::string& Dump() const;
    };
    std::ostream& operator<<( std::ostream& out, const FuncInstr& instr);
You are free to create your own internal variables and methods.
FuncInstr specifies:
- Format type (R, I, or J)
- Register addresses
- Type of instruction
The easiest way to represent these options is with C enumerations.
    class FuncInstr
    {
        // ...
        enum Format
        {
            FORMAT_R,
            FORMAT_I,
            FORMAT_J
        } format;
        // ...
    };
The constructor takes the bytes variable and initializes the internal variables (parsing) by applying the MIPS instruction format to these bytes.
    FuncInstr::FuncInstr( uint32 bytes)
    {
        // Determine the format first, then parse the fields accordingly
        this->initFormat(bytes);
        switch (this->format)
        {
            case FORMAT_R:
                this->parseR(bytes);
                break;
            case FORMAT_I:
                this->parseI(bytes);
                break;
            case FORMAT_J:
                this->parseJ(bytes);
                break;
            default:
                assert(0); // unknown format
        }
        // ...
    }
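The page does not show how `initFormat` works. One possible approach, sketched below as a free function with the hypothetical name `guessFormat`, relies on the classic MIPS convention: opcode 0 means R format, opcodes 2 and 3 (`j` and `jal`) mean J format, and everything else is treated as I format. This is a simplification under those assumptions; an actual implementation may instead look the format up in the ISA table.

```cpp
#include <cstdint>

enum Format { FORMAT_R, FORMAT_I, FORMAT_J };

// Hypothetical helper: guess the format from the 6-bit opcode (bits 31..26).
// Assumption: coprocessor and other special opcodes are out of scope and fall into I.
Format guessFormat( uint32_t bytes)
{
    const uint32_t opcode = bytes >> 26;
    switch ( opcode)
    {
        case 0x0: return FORMAT_R; // all R-format instructions share opcode 0
        case 0x2:                  // j
        case 0x3: return FORMAT_J; // jal
        default:  return FORMAT_I;
    }
}
```

With this sketch, `0x012A4020` (`add`) is classified as R, `0x21880020` (`addi`) as I, and `0x08000000` (`j`) as J.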
A good technique for parsing bytes is a C-style union of structures with bit fields.
    class FuncInstr
    {
        // ...
        union
        {
            struct
            {
                unsigned imm:16;
                unsigned t:5;
                unsigned s:5;
                unsigned opcode:6;
            } asI;
            struct
            {
                // ...
            } asR;
            struct
            {
                // ...
            } asJ;
            uint32 raw;
        } bytes;
    };
    // ...
    this->tReg = this->bytes.asI.t;
    // ...
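As an illustration of the union technique, the sketch below fills in all three views using the standard MIPS field layout. The field names in `asR` and `asJ` (`funct`, `shamt`, `d`, `target`) and the type name `InstrBits` are assumptions, not taken from the simulator. Note that bit-field layout is implementation-defined, and reading a union member other than the one last written is formally undefined in C++, although GCC and Clang support it; the sketch assumes a compiler that allocates bit fields starting from the least significant bit, as GCC and Clang do on little-endian x86.

```cpp
#include <cstdint>

// Hypothetical stand-alone version of the union of instruction views.
union InstrBits
{
    struct
    {
        unsigned imm:16;    // bits 15..0
        unsigned t:5;       // bits 20..16
        unsigned s:5;       // bits 25..21
        unsigned opcode:6;  // bits 31..26
    } asI;
    struct
    {
        unsigned funct:6;   // bits 5..0
        unsigned shamt:5;   // bits 10..6
        unsigned d:5;       // bits 15..11
        unsigned t:5;       // bits 20..16
        unsigned s:5;       // bits 25..21
        unsigned opcode:6;  // bits 31..26
    } asR;
    struct
    {
        unsigned target:26; // bits 25..0
        unsigned opcode:6;  // bits 31..26
    } asJ;
    uint32_t raw;
};

// Guard against platforms where the views do not overlay one 32-bit word.
static_assert( sizeof( InstrBits) == 4, "all views must overlay one 32-bit word");
```

Assigning `0x012A4020` to `raw` and then reading `asR.d`, `asR.s` and `asR.t` should give 8, 9 and 10 on such a compiler. If portability matters more than brevity, explicit shifts and masks avoid the dependence on bit-field layout.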
ISA information is stored in a static array. An example of this array is shown below.
    class FuncInstr
    {
        // ...
        struct ISAEntry
        {
            const char* name;
            uint8 opcode;
            uint8 func;
            FuncInstr::Format format;
            FuncInstr::Type type;
            // ...
        };
        static const ISAEntry isaTable[];
        // ...
    };
    // ...
    const FuncInstr::ISAEntry FuncInstr::isaTable[] =
    {
        // name   opcode  func   format               type
        { "add ", 0x0,    0x20,  FuncInstr::FORMAT_R, FuncInstr::ADD /*...*/ },
        // more instructions ...
    };
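A table like this is typically searched by opcode, with the func field used to tell R-format instructions apart. The following is a minimal stand-alone sketch of such a lookup; `findEntry`, the three-entry table, and the reduced `ISAEntry` are illustrative assumptions rather than the simulator's code.

```cpp
#include <cstdint>

enum Format { FORMAT_R, FORMAT_I, FORMAT_J };

// Reduced, hypothetical version of the ISA table entry.
struct ISAEntry
{
    const char* name;
    uint8_t opcode;
    uint8_t func;
    Format format;
};

static const ISAEntry isaTable[] =
{
    // name    opcode func  format
    { "add",  0x0,   0x20, FORMAT_R },
    { "addi", 0x8,   0x00, FORMAT_I },
    { "j",    0x2,   0x00, FORMAT_J },
};

// Return the matching entry, or nullptr if the instruction is not in the table.
const ISAEntry* findEntry( uint32_t bytes)
{
    const uint32_t opcode = bytes >> 26;
    const uint32_t func = bytes & 0x3F;
    for ( const ISAEntry& entry : isaTable)
    {
        if ( entry.opcode != opcode)
            continue;
        // Only R-format entries are distinguished by the func field
        if ( entry.format != FORMAT_R || entry.func == func)
            return &entry;
    }
    return nullptr;
}
```

A linear scan is enough for a table of this size; returning `nullptr` lets the caller decide how to report an unknown instruction.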
You are free to add as many fields to ISAEntry as you wish.
The Dump method returns the disassembly of the instruction with the correct instruction name and register names. The output format is as follows:
<indent><instr name> <reg1>[, <reg2>][, <reg3>][, const]
Examples:
add $t0, $t1, $t2
addi $t0, $t4, 0x20
Constants are printed in hexadecimal format.
Note: the Dump function must not parse any input bytes! It should use only the internal variables of the class.
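The output format above can be produced with a string stream. The sketch below is one way to do it, assuming the conventional MIPS register names and already-parsed register numbers; `regNames`, `dumpR` and `dumpI` are hypothetical helpers, and the `<indent>` prefix is omitted for brevity.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Conventional MIPS register names, indexed by register number (assumed here).
static const char* const regNames[32] =
{
    "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
    "$t0",   "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
    "$s0",   "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
    "$t8",   "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"
};

// Format a three-register instruction: <name> <rd>, <rs>, <rt>
std::string dumpR( const std::string& name, unsigned d, unsigned s, unsigned t)
{
    std::ostringstream oss;
    oss << name << " " << regNames[d] << ", " << regNames[s] << ", " << regNames[t];
    return oss.str();
}

// Format an immediate instruction: <name> <rt>, <rs>, <hex constant>
std::string dumpI( const std::string& name, unsigned t, unsigned s, uint32_t imm)
{
    std::ostringstream oss;
    oss << name << " " << regNames[t] << ", " << regNames[s]
        << ", 0x" << std::hex << imm;
    return oss.str();
}
```

Here `dumpR( "add", 8, 9, 10)` produces `add $t0, $t1, $t2` and `dumpI( "addi", 8, 12, 0x20)` produces `addi $t0, $t4, 0x20`. The helpers work only on values that were parsed earlier, in line with the note above.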
The output operator calls the Dump method.
    std::ostream& operator<<( std::ostream& out, const FuncInstr& instr)
    {
        return out << instr.Dump();
    }
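Returning the stream from the operator is what allows several instructions to be chained in one expression. The sketch below shows this with a stand-in class; `ToyInstr` and `render` are hypothetical and only mimic the `Dump` interface of FuncInstr.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for FuncInstr that stores a ready-made disassembly string.
class ToyInstr
{
    std::string disasm;
public:
    explicit ToyInstr( const std::string& text) : disasm( text) { }
    const std::string& Dump() const { return disasm; }
};

// The operator delegates to Dump and returns the stream for chaining.
std::ostream& operator<<( std::ostream& out, const ToyInstr& instr)
{
    return out << instr.Dump();
}

// Print two instructions, one per line, into a string.
std::string render( const ToyInstr& a, const ToyInstr& b)
{
    std::ostringstream oss;
    oss << a << '\n' << b << '\n';
    return oss.str();
}
```

The same chaining works with `std::cout` or any other `std::ostream`.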
The stand-alone MIPS disassembler is deprecated. Please use the 'mips-objdump' tool provided by GNU Binutils.
MIPT-V / MIPT-MIPS — Cycle-accurate pre-silicon simulation.