A MIPS assembler that translates a subset of MIPS assembly language into binary machine code. This project demonstrates low-level computer architecture concepts, including instruction encoding, memory management, and assembly language processing.
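- R-Type Instructions: `add`, `sub`, `mult`, `div`, `mflo`, `mfhi`, `slt`, `sll`, `srl`
- I-Type Instructions: `addi`, `lw`, `sw`, `beq`, `bne`
- J-Type Instructions: `j`, `jal`, `jr` (`jr` is grouped with the jumps here, but it is encoded in the R-type format)
- System Calls: `syscall`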
- Label Resolution: Automatic handling of instruction and data labels
- Static Memory Management: `.word` directive support with label references
- Multi-file Assembly: Process multiple `.asm` files in a single run
- Comment Stripping: Automatic removal of comments and whitespace
- Binary Output: Generates separate instruction and static memory binaries
- Binary Analysis: Included `readbytes` utility for examining binary output
Project structure:

    Checkpoint 1/
    ├── project1.h          # Header with encoding functions and utilities
    ├── project1.cpp        # Main assembler implementation
    ├── readbytes.cpp       # Binary file analysis utility
    ├── Makefile            # Build configuration
    ├── Testcases/
    │   ├── Assembly/       # Test assembly files
    │   │   ├── test1.asm
    │   │   ├── test2.asm
    │   │   └── ...
    │   └── GoldBinaries/   # Expected binary outputs
    │       ├── test1_inst.bin
    │       ├── test1_static.bin
    │       └── ...
    └── README.md           # This file

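Prerequisites:

- C++ compiler with C++11 support (the `readbytes` build command in this README passes `-std=c++17`, so that utility needs a compiler that accepts C++17)
- Make utility
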
Build the assembler with `make`:

    make

This creates the main executable:

- `assemble`: The main assembler

The `readbytes` utility must be compiled separately:

    g++ -std=c++17 readbytes.cpp -o readbytes

To assemble one or more files:

    ./assemble input1.asm [input2.asm ...] static_output.bin instruction_output.bin

To inspect the generated binaries:

    # First compile readbytes if not already done
    g++ -std=c++17 readbytes.cpp -o readbytes
    # View instruction binary in hex format
    ./readbytes instruction_output.bin
    # View static memory binary
    ./readbytes static_output.bin

A complete workflow, from build to comparison against the expected output:

    # Build everything
    make
    g++ -std=c++17 readbytes.cpp -o readbytes
    # Assemble the code
    ./assemble Testcases/Assembly/test1.asm static.bin inst.bin
    # Examine the generated binaries
    echo "=== Instruction Memory ==="
    ./readbytes inst.bin
    echo "=== Static Memory ==="
    ./readbytes static.bin
    # Compare with expected output
    echo "=== Comparing with Gold Standard ==="
    ./readbytes Testcases/GoldBinaries/test1_inst.bin
    ./readbytes Testcases/GoldBinaries/test1_static.bin

After a one-time setup, any assembly file can be tested:

    # One-time setup
    make
    g++ -std=c++17 readbytes.cpp -o readbytes
    # Now you can test any assembly file
    ./assemble Testcases/Assembly/test1.asm static.bin inst.bin
    ./readbytes inst.bin

To test and verify in one command:

    # Test and verify in one command (assumes readbytes is compiled)
    ./assemble Testcases/Assembly/test2.asm static.bin inst.bin && \
    echo "Generated:" && ./readbytes inst.bin && \
    echo "Expected:" && ./readbytes Testcases/GoldBinaries/test2_inst.bin

The `readbytes` utility is essential for:
- Verifying instruction encoding: Check if your R/I/J-type encodings are correct
- Debugging static memory: Ensure label references resolve to correct values
- Comparing outputs: Side-by-side comparison with gold standard binaries
- Understanding binary format: Learn how MIPS instructions look in machine code
If a test fails, use `readbytes` to investigate:

    # Compile readbytes first
    g++ -std=c++17 readbytes.cpp -o readbytes
    # If your test fails, use readbytes to investigate
    ./assemble Testcases/Assembly/test1.asm static.bin inst.bin
    # Check what you generated vs expected
    echo "Your output:"
    ./readbytes inst.bin
    echo "Expected output:"
    ./readbytes Testcases/GoldBinaries/test1_inst.bin
    # Look for differences in specific instructions
    diff <(./readbytes inst.bin) <(./readbytes Testcases/GoldBinaries/test1_inst.bin)

The gold binaries can also be read on their own:

    # Examine expected outputs to understand correct encoding
    ./readbytes Testcases/GoldBinaries/test1_inst.bin
    ./readbytes Testcases/GoldBinaries/test1_static.bin

Example data section:

    .data
    array: .word 1 2 3 4      # Static array
    ptr:   .word array        # Label reference

Example text section:

    .text
    .globl main
    main:
        addi $s0, $zero, 10   # Load immediate
        add $t0, $s0, $s1     # Register arithmetic
        lw $t1, 4($sp)        # Load word
        beq $t0, $zero, end   # Conditional branch
        jal function          # Jump and link
    end:
        syscall               # System call

Assembly process:

- Parse all input files
- Strip comments and whitespace
- Identify and map instruction labels to line numbers
- Identify and map static data labels to memory addresses
- Resolve label references in `.word` directives
- Convert numeric values (decimal/hex) to binary
- Output static memory binary file
- Encode each instruction using appropriate format (R/I/J-type)
- Resolve branch offsets and jump targets
- Output instruction binary file
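The first few steps above (comment stripping and mapping instruction labels to line numbers) can be sketched as follows. This is a minimal illustration, not the code in `project1.cpp`: the names `strip`, `FirstPass`, and `collect_labels` are hypothetical, and the sketch assumes the input contains only a text section (lines beginning with `.` are skipped, and `.data` contents are not handled).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: drop a trailing '#' comment and surrounding whitespace.
inline std::string strip(const std::string& line) {
    std::string s = line.substr(0, line.find('#'));
    const char* ws = " \t\r\n";
    std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical result type: cleaned instructions plus label -> line number.
struct FirstPass {
    std::vector<std::string> instructions;
    std::map<std::string, int> labels;
};

// Assumption: text-section input only; directives such as .text are skipped.
inline FirstPass collect_labels(const std::vector<std::string>& lines) {
    FirstPass out;
    for (const std::string& raw : lines) {
        std::string s = strip(raw);
        if (s.empty() || s[0] == '.') continue;
        std::size_t colon = s.find(':');
        if (colon != std::string::npos) {
            // A label names the next instruction to be emitted.
            out.labels[strip(s.substr(0, colon))] =
                static_cast<int>(out.instructions.size());
            s = strip(s.substr(colon + 1));
            if (s.empty()) continue;
        }
        out.instructions.push_back(s);
    }
    return out;
}
```

Because every label is recorded before any instruction is encoded, a later pass can resolve a reference to a label that is defined further down the file.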
Instruction formats (field widths in bits):

R-type:

    | opcode | rs | rt | rd | shamt | funct |
    |   6    | 5  | 5  | 5  |   5   |   6   |

I-type:

    | opcode | rs | rt | immediate |
    |   6    | 5  | 5  |    16     |

J-type:

    | opcode | address |
    |   6    |   26    |

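Each format can be packed into a 32-bit word with shifts and masks. The sketch below shows one way to do this; `encode_r`, `encode_i`, and `encode_j` are hypothetical names chosen for illustration and are not necessarily the functions declared in `project1.h`. The opcode and funct values used in the usage note are the standard MIPS ones (`addi` = 0x08, `jal` = 0x03, `add` funct = 0x20).

```cpp
#include <cstdint>

// R-type: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)
inline std::uint32_t encode_r(std::uint32_t opcode, std::uint32_t rs,
                              std::uint32_t rt, std::uint32_t rd,
                              std::uint32_t shamt, std::uint32_t funct) {
    return ((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) |
           ((rt & 0x1F) << 16) | ((rd & 0x1F) << 11) |
           ((shamt & 0x1F) << 6) | (funct & 0x3F);
}

// I-type: opcode(6) | rs(5) | rt(5) | immediate(16)
// The immediate is masked to 16 bits so that negative values do not
// spill into the register and opcode fields.
inline std::uint32_t encode_i(std::uint32_t opcode, std::uint32_t rs,
                              std::uint32_t rt, std::int32_t imm) {
    return ((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) |
           ((rt & 0x1F) << 16) | (static_cast<std::uint32_t>(imm) & 0xFFFF);
}

// J-type: opcode(6) | address(26)
inline std::uint32_t encode_j(std::uint32_t opcode, std::uint32_t target) {
    return ((opcode & 0x3F) << 26) | (target & 0x03FFFFFF);
}
```

With these helpers, `encode_i(0x08, 0, 16, 100)` yields `0x20100064` (`addi $s0, $zero, 100`), `encode_r(0, 16, 0, 4, 0, 0x20)` yields `0x02002020` (`add $a0, $s0, $0`), and `encode_j(0x03, 7)` yields `0x0c000007`.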
The `readbytes` utility requires manual compilation:

    g++ -std=c++17 readbytes.cpp -o readbytes

Example output:

    $ ./readbytes inst.bin
    00000000: 20100064 # addi $s0, $zero, 100
    00000004: 02002020 # add $a0, $s0, $0
    00000008: 0c000007 # jal f
    0000000c: 02001020 # add $v0, $s0, $0

Each line shows:
- Memory address (hexadecimal)
- Instruction encoding (32-bit hex value)
- Optional comment (if you add instruction tracing)
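For reference, the address and encoding columns can be produced with a single formatted print. The function below is a hypothetical sketch that only mirrors the layout described above; it is not the code in `readbytes.cpp`, and it takes the word as an already-assembled 32-bit value, so it makes no claim about the byte order used in the binary files.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical formatter: 8 hex digits of address, a colon, 8 hex digits of word.
inline std::string format_word(std::uint32_t address, std::uint32_t word) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%08lx: %08lx",
                  static_cast<unsigned long>(address),
                  static_cast<unsigned long>(word));
    return buf;
}
```

For example, `format_word(4, 0x02002020)` returns `00000004: 02002020`.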
Common encoding problems to check:

- Wrong opcode: First 6 bits incorrect
- Register mixup: Check rs, rt, rd fields
- Immediate issues: Sign extension or bit masking problems
- Branch offsets: PC-relative calculation errors
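When a generated word differs from the gold binary, splitting both words into their fields shows which of these problems is responsible. The following is a minimal sketch that assumes the standard MIPS field layout; `Fields` and `decode` are hypothetical names and are not part of `project1.h` or `readbytes`.

```cpp
#include <cstdint>

// Every field a 32-bit word could carry; which ones are meaningful
// depends on the instruction format.
struct Fields {
    std::uint32_t opcode, rs, rt, rd, shamt, funct, target;
    std::int32_t imm;  // 16-bit immediate, sign-extended
};

inline Fields decode(std::uint32_t word) {
    Fields f{};
    f.opcode = (word >> 26) & 0x3F;
    f.rs     = (word >> 21) & 0x1F;
    f.rt     = (word >> 16) & 0x1F;
    f.rd     = (word >> 11) & 0x1F;
    f.shamt  = (word >> 6) & 0x1F;
    f.funct  = word & 0x3F;
    f.target = word & 0x03FFFFFF;
    // Sign-extend the low 16 bits without relying on narrowing casts.
    std::int32_t imm = static_cast<std::int32_t>(word & 0xFFFF);
    if (imm & 0x8000) imm -= 0x10000;
    f.imm = imm;
    return f;
}
```

Comparing `decode(mine).rt` against `decode(gold).rt`, for instance, separates a register mixup from an immediate that was masked or sign-extended incorrectly.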
The test cases cover:

- Basic Arithmetic: Addition, subtraction, multiplication
- Memory Operations: Load/store with various addressing modes
- Control Flow: Branches, jumps, function calls
- Complex Programs: Loops, recursion, data structure access

Complete support for MIPS register naming conventions:

- Symbolic names: `$zero`, `$at`, `$v0`-`$v1`, `$a0`-`$a3`, `$t0`-`$t9`, `$s0`-`$s7`, `$k0`-`$k1`, `$gp`, `$sp`, `$ra`
- Numeric equivalents: `$0`-`$31`
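A lookup of this kind can be sketched with a table for the symbolic names and a small parser for the numeric forms. `register_number` is a hypothetical name, and returning `-1` for an unknown register is an assumption made for this example rather than the assembler's documented behavior. The numbering follows the standard MIPS convention.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Returns the register number (0-31), or -1 if the name is not recognized.
inline int register_number(const std::string& name) {
    // Only the symbolic names listed above are included; $fp is not
    // in that list, so register 30 is reachable here only as "$30".
    static const std::unordered_map<std::string, int> names = {
        {"$zero", 0}, {"$at", 1},  {"$v0", 2},  {"$v1", 3},
        {"$a0", 4},   {"$a1", 5},  {"$a2", 6},  {"$a3", 7},
        {"$t0", 8},   {"$t1", 9},  {"$t2", 10}, {"$t3", 11},
        {"$t4", 12},  {"$t5", 13}, {"$t6", 14}, {"$t7", 15},
        {"$s0", 16},  {"$s1", 17}, {"$s2", 18}, {"$s3", 19},
        {"$s4", 20},  {"$s5", 21}, {"$s6", 22}, {"$s7", 23},
        {"$t8", 24},  {"$t9", 25}, {"$k0", 26}, {"$k1", 27},
        {"$gp", 28},  {"$sp", 29}, {"$ra", 31}};
    auto it = names.find(name);
    if (it != names.end()) return it->second;

    // Numeric form: '$' followed by one or two decimal digits.
    if (name.size() >= 2 && name.size() <= 3 && name[0] == '$') {
        int value = 0;
        for (std::size_t i = 1; i < name.size(); ++i) {
            if (name[i] < '0' || name[i] > '9') return -1;
            value = value * 10 + (name[i] - '0');
        }
        if (value <= 31) return value;
    }
    return -1;
}
```

Returning a sentinel keeps the sketch short; a caller would turn `-1` into an "invalid register name" error.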

Label resolution:

- Instruction Labels: Resolved to line numbers (×4 for byte addressing)
- Data Labels: Resolved to memory offsets in static data section
- Forward References: Supported through two-pass assembly
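Once every label is mapped to a line number, branch and jump operands follow by arithmetic. The sketch below assumes 0-based instruction line numbers, instruction memory starting at address 0, and the usual MIPS convention that a branch offset is counted in instructions relative to the instruction after the branch. `LabelTable`, `lookup_label`, `byte_address`, and `branch_offset` are hypothetical names, not taken from the project source.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical table: label -> 0-based instruction line number.
using LabelTable = std::map<std::string, int>;

// Look up a label, reporting unresolved references as an error.
inline int lookup_label(const LabelTable& labels, const std::string& name) {
    auto it = labels.find(name);
    if (it == labels.end())
        throw std::runtime_error("undefined label: " + name);
    return it->second;
}

// Each instruction is 4 bytes, so line N sits at byte address N * 4.
inline int byte_address(int line) { return line * 4; }

// Offset for beq/bne, relative to the instruction after the branch.
inline int branch_offset(int current_line, int target_line) {
    return target_line - (current_line + 1);
}
```

In the example text section of this README, `beq` is instruction 3 and `end` labels instruction 5, so the branch offset under these assumptions is `5 - (3 + 1) = 1`. A backward branch produces a negative offset, which must then be masked to 16 bits when encoded.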

Error handling:

- File I/O error detection
- Invalid register name detection
- Label resolution failure handling
- Malformed instruction detection

To trace assembly, enable the debug output in the source:

    // Enable debug output in source
    cout << "Line " << curr_line_num << ": " << instruction << endl;

The assembler outputs both instruction and static memory maps for debugging.

Recent changes:

- ✅ Fixed I-type instruction encoding for negative immediates
- ✅ Corrected load/store instruction parsing
- ✅ Implemented proper branch offset calculation
- ✅ Added comprehensive `.gitignore` for build artifacts
- ✅ Integrated `readbytes` utility for binary analysis

Known limitations:

- Limited to subset of MIPS instruction set
- No macro expansion support
- No optimization passes

Authors:

- David Nathanson - CMSC301 Fall 2025
- Martin Sang - CMSC301 Fall 2025

This project is part of academic coursework and is subject to university academic integrity policies.

Built with ❤️ for Computer Architecture & Organization

Pro Tip: Always compile `readbytes` first (`g++ -std=c++17 readbytes.cpp -o readbytes`) and use it to verify that your binary output matches the expected results!