Assembler, Part 1
April 15, 2014
This exercise is trickier than it looks, since we must write a two-pass assembler to handle labels that appear after their first reference; the first pass reads the program and saves the line numbers referred to by labels, and the second pass writes a memory image containing the assembled program.
We keep the line numbers referred to by labels in a global a-list, and define an a-list of opcodes:
(define labels (list)) (define opcodes '(("const" . 0) ("get" . 1) ("put" . 2) ("ld" . 3) ("st" . 4) ("add" . 5) ("sub" . 6) ("jpos" . 7) ("jz" . 8) ("j" . 9) ("halt" . 10)))
The first pass reads the program, writes an intermediate form of the program, and collects the a-list of label/line pairs:
(define (asm1 file-name) ; first-pass (set! labels (list)) (with-input-from-file file-name (lambda () (let loop ((k 0) (line (read-line)) (lines (list))) (if (eof-object? line) (reverse lines) (if (or (string=? line "") (char=? (string-ref line 0) #\#)) (loop k (read-line) lines) (call-with-values (lambda () (split (string->list line))) (lambda (lbl opc obj cmt) (when (not (string=? lbl "")) (set! labels (cons (cons lbl k) labels))) (loop (+ k 1) (read-line) (cons (vector k lbl opc obj cmt) lines))))))))))
The output of the first pass is a list of five-slot vectors: line number, label, opcode, object, comment. Everything at this point remains in symbolic form; we keep the comment because we intend to use it in a future exercise. The
split function extracts up to four fields from each line:
(define (split line) (define (skip-blanks) (let loop () (if (or (null? line) (not (char-whitespace? (car line)))) #f (begin (set! line (cdr line)) (loop))))) (define (get-string) (let loop ((str (list))) (if (or (null? line) (char-whitespace? (car line))) (list->string (reverse str)) (let ((c (car line))) (set! line (cdr line)) (loop (cons c str)))))) (let* ((lbl (get-string)) (_ (skip-blanks)) (opc (get-string)) (_ (skip-blanks)) (obj (get-string)) (_ (skip-blanks))) (values lbl opc obj (list->string line))))
split function runs through the input
line, extracting fields as it sees them and discarding the parts of the line that it has already processed. Our function isn’t perfect — ill-formed input can make it crash — but our exercise is about assembling, not parsing, and our function does work properly on well-formed input. The list output by the first pass is fed to the second pass, which writes a memory image:
(define (asm2 lines) ; second pass (let ((mem (make-vector 1000 0))) (do ((lines lines (cdr lines))) ((null? lines) mem) (let ((num (vector-ref (car lines) 0)) (opc (vector-ref (car lines) 2)) (obj (vector-ref (car lines) 3))) (vector-set! mem num (+ (* (cdr (assoc opc opcodes)) 1000) (if (assoc obj labels) (cdr (assoc obj labels)) (if (not (string=? obj "")) (string->number obj) 0))))))))
The last line (which is spread over four physical lines) that performs the
vector-set! operation is rather intense. One of three things is stored in the memory location. If
obj is a label, memory is set with the numeric value of the opcode in the two high-order digits and the label location in the three low-order digits. If
obj is a number, it is necessarily the object of a
const opcode, and is written literally to memory; the numeric value of the
const opcode is 0, so the two high-order digits are not special, and are just part of the constant. Otherwise, for those opcodes that have no object, the numeric value of the opcode is written to the two high-order digits and the three low-order digits are set to 0.
If our sample program is stored in file
program.asm, it is assembled like this:
> (asm2 (asm1 "program.asm")) #1000(3010 4011 1000 8007 5011 4011 9002 3011 2000 10000 0)
Note that Scheme elides the trailing zeros from the vector.
You can run the program at http://programmingpraxis.codepad.org/3KFpWaoQ. We’ll write a simulator for our hypothetical computer in the next exercise.
Pages: 1 2