Even after all the opcodes are assigned, the syntax of the CPU's assembly language must still be chosen. I decided to use mnemonics similar to the Intel 8051's, but with a left-to-right operand order (source first, destination second) similar to the Motorola 68000's. It seemed like an interesting idea to use little arrows instead of commas to separate operands, to indicate the direction of data flow. Much of the assembly language was worked out while defining the instruction set, but every last detail needs to be resolved to create a usable assembler.

It was also obvious that the assembler must have built-in macros for some common cases, such as loading registers with immediate data, since the instruction set only supports loading 4 bits at a time.
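
As a rough illustration of what such a macro has to do, the Python sketch below splits an immediate value into 4-bit pieces and emits one load per nibble. The opcode layout (high nibble selects the destination nibble, low nibble carries the data) is inferred from the list output further down this page, not from an OSU8 specification, and the names NIBBLE_OPCODES and expand_immediate_load are hypothetical. It is not how the real yacc-based assembler is written.

```python
# Sketch only: the opcode bases below are read off the list output on this
# page (for example "C7   MOVE  #7 -> P1Lb"); they are an inference, not a
# copy of the OSU8 opcode table.
NIBBLE_OPCODES = {
    "a":  [(0x80, "Ab"), (0x90, "At")],
    "b":  [(0xA0, "Bb"), (0xB0, "Bt")],
    "p1": [(0xC0, "P1Lb"), (0xD0, "P1Lt"), (0xE0, "P1Hb"), (0xF0, "P1Ht")],
}


def expand_immediate_load(register, value):
    """Return (opcode, listing text) pairs, lowest nibble first."""
    slots = NIBBLE_OPCODES[register]
    if not 0 <= value < 16 ** len(slots):
        raise ValueError("immediate does not fit in register")
    out = []
    for i, (base, name) in enumerate(slots):
        nibble = (value >> (4 * i)) & 0xF      # pick out the i-th nibble
        out.append((base | nibble, f"MOVE  #{nibble:X} -> {name}"))
    return out


# "move #stack -> p1" with stack at address 0x0027, as in the example below
for opcode, text in expand_immediate_load("p1", 0x0027):
    print(f"{opcode:02X}   {text}")
```

An 8-bit register such as A needs two of these loads and a 16-bit pointer such as P1 needs four, which is why the macro form is so much more convenient to write.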

Writing the Assembler

The assembler is based on a parser created with yacc, which uses a lex-based lexer. Yacc can be frustrating at times (shift/reduce conflicts), but it's a lot less frustrating to catch the problems at compile time than at run time.

I should add more detail about the internals of the assembler... maybe at the bottom of this page...

Example Code

Here's a bit of example source code that makes use of the built-in macros of the assembler.

Example Source Code

; this uses some of the nifty macros,
; it fills all of the machine's memory,
; except for its own space, with values
; between "minval" and "maxval"

	.equ	minval,	'a'
	.equ	maxval,	'z'

	.org	0
begin:	move	#stack -> p1
	move	p1 -> sp
	move	#begin -> p1
restart:move	#end -> p2
	inc	p2

loop0:	move	#minval -> a
loop:	store	a -> @p2
	inc	p2
	jp	begin
	cje	a, #maxval, loop0
	inc	a
	setb	c
	jc	loop
stack:	nop	;some room for the stack.
	nop	;The stack isn't explicitly
	nop	;used, but some macros
	nop	;use the stack
end:	nop

Resulting List Output

0000: C7   MOVE  #7 -> P1Lb
0001: D2   MOVE  #2 -> P1Lt
0002: E0   MOVE  #0 -> P1Hb
0003: F0   MOVE  #0 -> P1Ht
0004: 77   MOVE  P1 -> SP
0005: C0   MOVE  #0 -> P1Lb
0006: D0   MOVE  #0 -> P1Lt
0007: E0   MOVE  #0 -> P1Hb
0008: F0   MOVE  #0 -> P1Ht
0009: 46   PUSH  P1H
000A: 44   PUSH  P1L
000B: CB   MOVE  #B -> P1Lb
000C: D2   MOVE  #2 -> P1Lt
000D: E0   MOVE  #0 -> P1Hb
000E: F0   MOVE  #0 -> P1Ht
000F: 75   MOVE  P1 -> P2
0010: 45   POP   P1L
0011: 47   POP   P1H
0012: 71   INC   P2
0013: 6E   NOP
0014: 81   MOVE  #1 -> Ab
0015: 96   MOVE  #6 -> At
0016: 6E   NOP
0017: 6E   NOP
0018: 59   STORE A -> @P2
0019: 71   INC   P2
001A: 64   MOVE  P -> C
001B: 26   JCU   0000
001C: 42   PUSH  B
001D: AA   MOVE  #A -> Bb
001E: B7   MOVE  #7 -> Bt
001F: 65   CLR   C
0020: 15   SUB   A-B -> B
0021: 43   POP   B
0022: 63   MOVE  Z -> C
0023: 23   JCU   0014
0024: 0C   INC   A
0025: 66   SET   C
0026: 22   JCU   0018
0027: 6E   NOP
0028: 6E   NOP
0029: 6E   NOP
002A: 6E   NOP
002B: 6E   NOP

Here are a few details to look for in the above code:

Macros to load immediate data into an entire register, such as MOVE #stack -> p1, are actually turned into two instructions (for an 8-bit register) or four instructions (for a 16-bit register).
Labels loop0 and loop are targets of branch instructions. The assembler automatically inserted NOP instructions at 0013, 0016 and 0017, so that these labels end up aligned at valid branch locations. This is because the OSU8's opcodes have only 4 bits for a relative offset, so branch targets must fall on 32-bit aligned boundaries (every fourth address, since each opcode is 8 bits).
The OSU8 instruction set only provides a conditional jump on carry. The assembler macros allow conditional jumps on any bit by first moving that bit into the carry, as at 001A and 001B, where jp becomes MOVE P -> C followed by JCU.
The CJE macro does a conditional jump if the two operands are equal. The assembler inserts 8 instructions to accomplish this, including a push and pop of the B register, which is used to hold the constant.
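
Two of the details above can be sketched in a few lines of Python. This is only an illustration built from what the list output shows: the 4-address alignment and the eight-step CJE sequence are read off that listing, the per-step comments are an interpretation of the mnemonics, and the function names are hypothetical. The encoding of the JCU offset itself is deliberately left out, since it is not described on this page.

```python
NOP = 0x6E        # NOP opcode as it appears in the list output
BRANCH_ALIGN = 4  # assumption: branch targets sit on every fourth address


def nops_before_label(address, align=BRANCH_ALIGN):
    """How many NOPs to insert so a label at `address` lands on a boundary."""
    return -address % align


def expand_cje(constant, target):
    """Mnemonic-level expansion of "cje a, #constant, target" (8 steps)."""
    lo, hi = constant & 0xF, (constant >> 4) & 0xF
    return [
        "PUSH  B",                # save B, which will hold the constant
        f"MOVE  #{lo:X} -> Bb",   # low nibble of the constant
        f"MOVE  #{hi:X} -> Bt",   # high nibble of the constant
        "CLR   C",                # clear carry before the subtract
        "SUB   A-B -> B",         # compare by subtracting
        "POP   B",                # restore the saved B
        "MOVE  Z -> C",           # copy the zero bit into the carry
        f"JCU   {target}",        # the only conditional jump: on carry
    ]


print(nops_before_label(0x13))    # loop0 would otherwise land at 0013
print(nops_before_label(0x16))    # loop would otherwise land at 0016
for line in expand_cje(ord("z"), "0014"):
    print(line)
```

The padding counts agree with the listing: one NOP before loop0 at 0014 and two before loop at 0018.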

OSU8: Simple 8-Bit Microprocessor Design; Paul Stoffregen
http://www.pjrc.com/tech/osu8/assembler.html
Last updated: February 24, 2005
Status: These pages are a work-in-progress
Comments, Suggestions: <paul@pjrc.com>