The purpose of this project is for you to gain experience using the imperative programming style to implement a number of register machine programs. In the context of the register machine, running time and space can be more explicitly measured, and thus the effects of changes in algorithms and optimization of code can be explored. In this project, we use a register machine simulator to run and measure the performance of register machine programs. In addition, later parts of the project explore procedure calls, stacks, and recursion. Finally, a short side-jaunt examines the separation of syntactical analysis from execution, in order to improve the speed of a register machine simulator using techniques similar to those used in the analyze evaluator.
(assign result (op *) (reg result) (const 2))
(perform (op write-line) (reg tmp))
(test (op <) (reg x) (const 10))
(goto (label loop-top))
(branch (label loop-done))
(save result)
(restore result)
The code that you write is what most programmers would call "assembly code." It is a very simple low-level language consisting of a sequence of instructions, most of which are executed in order from start to finish, by the register machine executing it. The usual process of running code on a register machine has two steps:
; (sum-to-n)
; Input: continue
; Output: result
; Modifies: arg0 result
(define reg-sum-to-n ; save assembled result
(assemble ; assemble the following
'(sum-to-n ; label to identify start
(assign arg0 (const 10)) ; arg0 holds counter 10 downto 0
(assign result (const 0)) ; result holds sum to output
sum-to-n-top ; start of loop
(test (op zero?) (reg arg0)) ; done yet?
(branch (label sum-to-n-done)) ; if so, goto end
(assign result (op +) (reg arg0) (reg result)) ; sum
(assign arg0 (op dec) (reg arg0)) ; decrement counter
(goto (label sum-to-n-top)) ; do it again
sum-to-n-done ; if we're done, goto where we're supposed to go
(goto (reg continue)))))
;Assembling............done
;Value: reg-sum-to-n
(run-code ; load&run
reg-sum-to-n ; the assembled code for sum-to-n
'sum-to-n ; start at label sum-to-n
'machine-done)
;Result = 55 ; prints out value of "result" register
;Cycles used: 56 Max Stack Depth: 0
;Timing - Run: .01s GC: 0.s Real: .01s
;Value: #[unspecified-return-value]
Reading the comments on the right-hand side should give you an idea of what each line is doing. The define accomplishes step 1 from above, the run-code does step 2. Here assemble is called with a list of instructions and labels, which it converts to assembled code. run-code loads some assembled-code into the register machine, and executes it starting with the specified label.
The comment at the top specifies how the code interacts with the registers. It takes the continue register as input, leaves its output in the result register, and modifies both the arg0 and result registers on the way. A register is considered an input if the value is used before the register is assigned to. A register is modified if its value could be changed from its initial value by the code (generally these are registers which are the target of assigns). The description of whether or not a register might change is part of the "contract" for the code, and it will help us when we glue multiple pieces of register machine code together.
What you may not have seen before is the convention in which we use a continue register, which contains a label that indicates where to continue executing when the current code block is done. In this case, when the sum has been computed and left in the result register, the code jumps to the destination in the continue register. For the run-code above, this destination is machine-done, which is an automatically defined label that points at a halt instruction. The halt instruction signals the simulator to stop running.
To see the code that the machine is actually running, evaluate (display-code), as shown below. Note that you should only do this after you have used run-code, so that you can see the code that was actually run on that machine.
(display-code) Assembled code: Regs: (pc cr sp continue arg1 arg2 arg3 result arg0) Ops: (dec + zero?) Code-size: 9 Labels: Label Offset machine-done 0 sum-to-n 1 sum-to-n-top 3 sum-to-n-done 8 Instructions: 0 (halt) 1 (assign arg0 (const 10)) 2 (assign result (const 0)) 3 (test (op zero?) (reg arg0)) 4 (branch (label sum-to-n-done)) 5 (assign result (op +) (reg arg0) (reg result)) 6 (assign arg0 (op dec) (reg arg0)) 7 (goto (label sum-to-n-top)) 8 (goto (reg continue)) ------------- ;Value: #[unspecified-return-value]
If the continue register contains the value of the label machine-done, the (goto (reg continue)) on line 8 will jump to line 0 and (halt). By comparison, the output of the assemble is stored in reg-sum-to-n:
(display-assembled-code reg-sum-to-n) Assembled code: Regs: (result arg0) Ops: (dec + zero?) Code-size: 8 Labels: Label Offset sum-to-n 0 sum-to-n-top 2 sum-to-n-done 7 Instructions: 0 (assign arg0 (const 10)) 1 (assign result (const 0)) 2 (test (op zero?) (reg arg0)) 3 (branch (label sum-to-n-done)) 4 (assign result (op +) (reg arg0) (reg result)) 5 (assign arg0 (op dec) (reg arg0)) 6 (goto (label sum-to-n-top)) 7 (goto (reg continue)) ------------- ;Value: #[unspecified-return-value]
You will note that the simulator adds one instruction (halt), one label (machine-done), and a couple of registers, including:
Our register machine simulator differs from the one in the lecture, in that several registers are predefined and made explicit here. The register machine from lecture has pc and cr, but doesn't really talk about them. The addition of the argument registers and the result registers make it easier to interact with code running on the register machine. For example, the run-code procedure will set one or more of the argN registers, if provided with more arguments. Returning to our previous example, we could instead write the sum-to-n block of code, taking in the value n in the register arg0, as:
; (sum-to-n n)
; Input: arg0 continue
; Output: result
; Modifies: arg0 result
(define reg-sum-to-n
(assemble
'(sum-to-n ; note: no assign to arg0, assume preset!
(assign result (const 0))
sum-to-n-top
(test (op zero?) (reg arg0))
(branch (label sum-to-n-done))
(assign result (op +) (reg arg0) (reg result))
(assign arg0 (op dec) (reg arg0))
(goto (label sum-to-n-top))
sum-to-n-done
(goto (reg continue)))))
And then we can run this code with arg0 initialized to 11 with:
(run-code reg-sum-to-n 'sum-to-n 'machine-done 11) ;Result = 66 ;Cycles used: 60 Max Stack Depth: 0 ;Timing - Run: .01s GC: 0.s Real: .01s ;Value: #[unspecified-return-value]
Up to four arguments can be supplied initially with run-code in this fashion.
To see the state of the machine, use display-machine-status:
(display-machine-status default-machine) Machine Status (@ cycle 60) pc 0 cr #t sp 0 continue 0 arg1 #f arg2 #f arg3 #f arg0 0 result 66 Top of stack: (empty stack) ;Value: #[unspecified-return-value]
In the state of the machine summarized above, the pc is zero because that is the location of the halt instruction that stopped the computation. The cr is true because that is what caused the loop to exit. The sp is 0 because there is nothing on the stack. The continue is 0 because that is the value of the machine-done label. Args 1-3 are #f because they were never set. arg0 is 0 because it started at 11 and was decremented down to 0. And finally result is 66 because that is where the sum was stored.
Sometimes you'll have a bug and want to see which instructions are being executed. For this, instead of run-code, use trace-code:
(trace-code reg-sum-to-n 'sum-to-n 'machine-done 1) pc=1 I=(assign result (const 0)) pc=2 I=(test (op zero?) (reg arg0)) pc=3 I=(branch (label sum-to-n-done)) pc=4 I=(assign result (op +) (reg arg0) (reg result)) pc=5 I=(assign arg0 (op dec) (reg arg0)) pc=6 I=(goto (label sum-to-n-top)) pc=2 I=(test (op zero?) (reg arg0)) pc=3 I=(branch (label sum-to-n-done)) pc=7 I=(goto (reg continue)) pc=0 I=(halt) ;Result = 1 ;Cycles used: 10 Max Stack Depth: 0 ;Timing - Run: 0.s GC: 0.s Real: 0.s ;Value: #t
Sometimes even tracing isn't enough. At this point, you'll want to step through your code, running it one line by one line at a time. For this, use start-stepper:
(start-stepper reg-sum-to-n 'sum-to-n 'machine-done 1) [' ' next inst, 'd' for display, 'q' to quit] pc=1 I=(assign result (const 0)) [' ' next inst, 'd' for display, 'q' to quit] pc=2 I=(test (op zero?) (reg arg0)) [' ' next inst, 'd' for display, 'q' to quit] pc=3 I=(branch (label sum-to-n-done)) [' ' next inst, 'd' for display, 'q' to quit] pc=4 I=(assign result (op +) (reg arg0) (reg result)) ...
By pressing spacebar, you execute the next instruction. Pressing 'd' will print out the status of the machine, just like using display-machine-status. Pressing 'q' will break out of the stepping mode.
Computer Exercise 1: Reversal comes to everyone
Load the code (in order, syntax.scm, structures.scm, and simulator.scm). Write an assembly language program that reverses a list. The list is input in arg0, and the output should be left in result.
(define reg-ireverse
(assemble
'(reverse
the-rest-of-your-code-here
...)))
(run-code
reg-ireverse
'reverse
'machine-done
'(3 2 1))
;Result = (1 2 3)
;Cycles used: 20 Max Stack Depth: 0
;Timing - Run: 0.s GC: 0.s Real: 0.s
;Value: #[unspecified-return-value]
Of course, the number of cycles (a cycle is an execution of a single machine instruction) used will depend on how you implement it (yours need not take exactly the same number of cycles as shown above).
In the next couple of parts of this project, you will be working with mathematical matrices. From our perspective, a matrix is a data structure or data abstraction, defined and supported by the following set of operations. In the following, the Scheme version of these operations is demonstrated. The register machine you are working with will also have these built in as "primitive operations" which take arguments according to the (op ...) syntax as described earlier.
(define M (make-matrix 2)) M ;Value: (2 . #(0 0 0 0)) (print-matrix M) 0 0 0 0 ;Value: #[unspecified-return-value]
Note that the "internal" representation of a matrix is not important. That is, the (2 . #(0 0 0 0)) is some Scheme printout of the internal representation of the matrix; you will want to use print-matrix instead to view a matrix.
(matrix-set! M 0 1 33) (print-matrix M) 0 33 0 0 ;Value: #[unspecified-return-value]
(print-matrix (generate-random-matrix 3 5)) 1 3 1 0 3 2 4 3 1 ;Value: #[unspecified-return-value]
(print-matrix (generate-identity-matrix 3)) 1 0 0 0 1 0 0 0 1 ;Value: #[unspecified-return-value]
Computer Exercise 2: Matrix Revolutions
Matrix multiplication is a very common operation in mathematical programming. The algorithm to multiply two square matrices (A*B=C) of size n looks something like this:
for i = 0 to n-1
for j = 0 to n-1
sum = 0
for k = 0 to n-1
sum = sum + A(i,k)*B(k,j)
end
C(i,j)=sum
end
end
The A( , ) and B( , ) are matrix references, which are done using the provided primitive procedure (matrix-ref M i j). The C( , ) is a matrix set, which is done using (matrix-set! M i j val).
Your job is to implement this matrix multiplication algorithm. It should takes its inputs (matrices) in arg0 and arg1 and leave its output in result.
(define reg-matrix-multiply
(assemble
'(matrix-multiply
(assign n (op matrix-size) (reg arg0))
(perform (op write-line) (const "Arg0:"))
(perform (op print-matrix) (reg arg0))
(perform (op write-line) (const "Arg1:"))
(perform (op print-matrix) (reg arg1))
(assign result (op make-matrix) (reg n)) ; create output matrix
...
(perform (op write-line) (const "Result:"))
(perform (op print-matrix) (reg result))
(goto (reg continue))
)))
The scheme implementation of this is as follows (you can use this to check the answers from your code).
(define (multiply-matrix a b)
(let ((c (make-matrix (matrix-size a)))
(n (matrix-size a)))
(let loop
((i 0) (j 0) (k 0) (sum 0))
(cond ((= i n)
c)
((= j n)
(loop (+ i 1) 0 0 0))
((= k n)
(matrix-set! c i j sum)
(loop i (+ j 1) 0 0))
(else
(loop i j (+ k 1) (+ sum
(* (matrix-ref a i k)
(matrix-ref b k j)))))))))
You can test your code by using generate-random-matrix and generate-identity-matrix. Anything multiplied by the identity matrix is itself, which allows you to quickly tell if your code has major problems. You might also consider starting with a very simple matrix (e.g. a 1x1 matrix) when testing your code. For final testing, you can compare results with the provided matrix-multiply scheme procedure.
(define m1 (generate-random-matrix 4 10)) (define m2 (generate-random-matrix 4 10)) (print-matrix (multiply-matrix m1 m2)) (run-code reg-matrix-multiply 'matrix-multiply 'machine-done m1 m2)
The results should match.
Computer Exercise 3: Demons of Speed
Try running your matrix multiply on matrices of size 16-17. How long does it take? By dividing the number of cycles by the time, we arrive at the cycles/second (execution speed) of the simulator. What is it for your computer? Unfortunately, it's likely to be rather slow (good simulators can easily run at millions of cycles/sec, which is much faster than you are likely to see with our simulator). In order to speed things up a little, let's look into how our simulator works.
Look at code in simulator.scm. The code is divided into two parts: assembling lists of instructions into assembled-code, and executing the assembled code. Hopefully, the register machine simulator will look oddly familier to you -- it is basically an "evaluator" for the register machine language!
The assembler currently looks through the code to produce the label table and compiles a list of the registers and operations used in the code. Every time sweep-insts sees a label, it adds it to the table. Every time it sees an instruction, it adds the result of calling assemble-instruction to the assembled-code. The code for assemble-instruction currently has a big cond statement which figures out what type of instruction it's dealing with, then notes the used registers and ops, and finally returns the instruction unchanged.
The key execution procedure is step-machine. It does some bookkeeping, error-checking, and printing (if tracing), and then gets to the interesting stuff: the cond. Here, it figures out which instruction it has, and then does the right thing. You will notice that step-machine returns #t if the execution should continue, or #f if it has halted.
The issue here is that an instruction can (and often is) executed many times. Each time it's executed, the step-machine code has to figure out what to do with it again. This is a good place to consider using the idea of separating the code analysis from the execution process, as was done in the analyze evaluator.
Look at the code in analyze-sim.scm. In the assembly phase, we've changed things to output each instruction with a procedure that, when given a machine, performs the effect of that instruction on the machine. This is very similar to the analyze evaluator, except that instead of passing in an environment env which holds the execution state, we're passing in a register machine as the state. The execution phase now just calls the procedure (the whole cond in step-machine is replaced by a single procedure call).
Look at the code for assemble-instruction. It is returning the result of cons'ing the original instruction with the result of analyzing it (we keep the original instruction around so we can print it while tracing). The result of the analysis should be a procedure that takes a machine as input. The eval-expression procedure has been re-written into analyze-expr, which returns a (lambda (machine) ...) that ultimately returns the value of the expression given the machine. An example of how to analyze an instruction is given in the analyze-assign code. The used registers and ops are noted. The expression is analyzed and returned as a procedure, which when given a machine does something very much like what was originally in the cond statement in step-machine. Note that the procedure created by the analysis returns #t to indicate that the instruction was not a halt instruction.
Computer Exercise 4: Simulator Hacking
Load the code for analyze-sim.scm.
Computer Exercise 5: Somebody's Calling
As you discovered previously, implementing iterative processes is relatively easy on register machines. If we're trying to implement append, we'd like it to be iterative. Here is a simple attempt at an iterative solution:
(define (append l1 l2)
(if (null? l1)
l2
(append (cdr l1) (cons (car l1) l2))))
(append '(1 2 3) '(4 5 6))
;Value: (3 2 1 4 5 6)
This doesn't work -- what's the problem? Well, l1 gets put on the front of l2, but it's backwards. Hmmmm... We can fix that by reversing l1 first:
(define (append l1 l2)
(define (helper l1 l2)
(if (null? l1)
l2
(helper (cdr l1) (cons (car l1) l2))))
(helper (reverse l1) l2))
Luckily, you've already written the code to reverse a list. In append, we're going to use the code for reverse as a "subroutine" (an assembly language implementation of some set of code, conceptually similar to a "procedure"), rather than just including it verbatim at the beginning of append. Using subroutines or procedures is often called code reuse and is an important component of programming (yes, procedures are good things). The information about how to do a procedure call is located in the lecture on stacks and recursion. If you haven't done so already, you should look over that lecture to familiarize yourself with the material. Hopefully, your reverse code is already written in the form of a procedure (takes its input in args, output in result, and returns (by way of a goto or a branch) to the location specified by the continue register at the end.
You can include more than one chunk of assembled code by using the join-all-code procedure, which merges blocks of code. This allows you to test some chunks of code separately, and then use them together.
(run-code (join-all-code reg-append reg-ireverse) ; glue code together 'append 'machine-done '(1 2 3) '(4 5 6))
Computer Exercise 6: Appendicitus
Now let's implement append the simple recursive way:
(define (append l1 l2)
(if (null? l1)
l2
(cons (car l1)
(append (cdr l1) l2))))
In order to accomplish this, we'll need to use procedure call, but the procedure we're calling is ourself! This type of procedure call is really no different from any other, though it may feel a little odd. You should use the stack to save any state you need to use after the return from a subroutine call, as discussed in the lecture.
Computer Exercise 7: Recursive Reverse
Here's the Scheme implementation of a recursive reverse procedure:
(define (reverse lst)
(if (null? lst)
'()
(append (reverse (cdr lst))
(list (car lst)))))
Following is a register machine implementation for a recursive reverse. You'll note that this requires TWO procedure calls.
(define reg-reverse
(assemble
'(reverse
(test (op null?) (reg arg0))
(branch (label reverse-base-case))
(save arg0)
(save continue)
(assign arg0 (op cdr) (reg arg0))
(assign continue (label after-reverse-call))
(goto (label reverse))
after-reverse-call
(restore continue)
(restore tmp)
(save continue)
(assign arg0 (reg result))
(assign arg1 (op car) (reg tmp))
(assign arg1 (op list) (reg arg1))
(assign continue (label finish-reverse))
(goto (label append))
finish-reverse
(restore continue)
(goto (reg continue))
reverse-base-case
(assign result (const ()))
(goto (reg continue))
)))
For each problem, include your code (with identification of the exercise number being solved), as well as comments and explanations of your code, and demonstrate your code's functionality against a set of test cases. Once you have completed this project, your file should be submitted electronically on the 6.001 on-line tutor, using the Submit Project Files button.
We encourage you to work with others on problem sets as long as you acknowledge it (see the 6.001 General Information handout) and so long as you observe the rules on collaboration and using "bibles". If you cooperated with other students, LA's, or others, please indicate your consultants' names and how they collaborated. Be sure that your actions are consistent with the posted course policy on collaboration.
Remember that this is Project 6; when you have completed all your work and saved it in a file, upload that file and submit it for Project 6.