Module Specifications.Assign3

This assignment is due on Thursday 9/26 by 8:00PM. You should put all of your programming solutions in a file called assign3/lib/assign3.ml. See the file test/test_assign3.ml for example behavior of each function. You should put all your written solutions in a single pdf file.

Note. All implementations in this assignment must be tail-recursive. We may have large test-cases which will fail for non-tail-recursive implementations.

Programming (50%)

Unlimited Register Machines

type registers = (int * int) list

An unlimited register machine (URM) is an abstract representation of an infinite collection of registers indexed by integers.

REGISTERS ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ...
  INDICES      -4  -3  -2  -1   0   1   2   3   4

Programmatically, we can represent a URM rs as an association list, i.e., a list of type (int * int) list where the pair (i, j) indicates rs has the value j in register i. Furthermore, we assume that if (i, j) is not a member of rs for any value j, then register i has the value 0. For example, the list [(-1, 5); (0, 4); (3, 1)] represents the URM:

REGISTERS ... | 0 | 0 | 0 | 5 | 4 | 0 | 0 | 1 | 0 | ...
  INDICES      -4  -3  -2  -1   0   1   2   3   4

where all registers not pictured have the value 0. In this part of the assignment, you'll be filling in an interface for working with URMs. In each of the following functions you must maintain the following invariants:

Hint: There are quite a few functions, but many of them have the same logic. Try to avoiding rewriting this logic. Write a general function and reuse it.

val load : int list -> registers

Implement the function load so that load l is the URM in which the ith element of l (using 0-indexing) is put in register i.

val lookup : registers -> int -> int

Implement the function lookup so that lookup rs i is the value in register i of rs.

val incr : registers -> int -> registers

Implement the function incr so that incr rs i is the result of adding 1 to the value in register i of rs.

val zero : registers -> int -> registers

Implement the function zero so that zero rs i is the result of setting the value in register i of rs to 0.

val transfer : registers -> int -> int -> registers

Implement the function transfer so that transfer rs i j is the result of setting the value in register j of rs to be the value in register i of rs (the value in register i remains unchanged).

URM Programs

A URM program is a sequence of instructions of the following form:

A URM program is whitespace agnostic, so you can write each instruction on its own line, or you can write multiple instructions on the same line. All instructions codes and indices must be separated by whitespace. Here is an example program:

J  1   2   100
   I 0     I 2
   J 0 0 0

We will represent a URM program as a int list list with the following correspondence for instructions:

The above program is represented by the following list:

[[3; 1; 2; 100]; [1; 0]; [1; 2]; [3; 0; 0; 0]]

As with any programming language, we can give a formal semantics for the evaluation of URM programs. In this setting, a configuration is of the form

   \langle \ P \ , R \ , \ n \ \rangle
   

where P is a URM program, R is a URM, and n is a program counter. We use the program to keep track of what instruction to run. An evaluation judgment is of the form

   \langle \ P \ , R \ , \ n \ \rangle
   \longrightarrow
   \langle \ P' \ , R' \ , \ n' \ \rangle
   

and expression that the configuration on the left-hand side evaluates to the configuration on the right-hand side.

   \frac
   {\textcolor{green}{P[n] = \texttt{Z i}}}
   {
   \langle \ P \ , \ R \ , \ n \ \rangle
   \longrightarrow
   \langle \ P \ , \ \texttt{zero} \ R \ \texttt{i}  \ , \ n + 1 \ \rangle
   }
   \text{(zero)}
   
   \frac
   {\textcolor{green}{P[n] = \texttt{I i}}}
   {
   \langle \ P \ , \ R \ , \ n \ \rangle
   \longrightarrow
   \langle \ P \ , \ \texttt{incr} \ R \ \texttt{i}  \ , \ n + 1 \ \rangle
   }
   \text{(incr)}
   
   \frac
   {\textcolor{green}{P[n] = \texttt{T i j}}}
   {
   \langle \ P \ , \ R \ , \ n \ \rangle
   \longrightarrow
   \langle \ P \ , \ \texttt{transfer} \ R \ \texttt{i} \ \texttt{j}  \ , \ n + 1 \ \rangle
   }
   \text{(transfer)}
   
   \frac
   {
   \textcolor{green}{P[n] = \texttt{J i j k}}
   \qquad
   \textcolor{green}{\texttt{lookup} \ R \ \texttt{i} = \texttt{lookup} \ R \ \texttt{j}}
   }
   {
   \langle \ P \ , \ R \ , \ n \ \rangle
   \longrightarrow
   \langle \ P \ , \ R  \ , \ \texttt{k} \ \rangle
   }
   \text{(jump-eq)}
   
   \frac
   {
   \textcolor{green}{P[n] = \texttt{J i j k}}
   \qquad
   \textcolor{green}{\texttt{lookup} \ R \ \texttt{i} \neq \texttt{lookup} \ R \ \texttt{j}}
   }
   {
   \langle \ P \ , \ R \ , \ n \ \rangle
   \longrightarrow
   \langle \ P \ , \ R  \ , \ n + 1 \ \rangle
   }
   \text{(jump-neq)}
   

Evaluating a program P on registers R means starting in the configuration \langle \ P \ , \ R \ , \ 0 \ \rangle applying the above rules until the program counter becomes as least the number of instructions in P.

For example, the above program has the following evaluation behavior on the URM load [5;3]:

⟨   [J 1 2 100] I 0  I 2  J 0 0 0    ,   [(0, 5); (1, 3)]           ,   0   ⟩   ───► (jump-neq)
⟨    J 1 2 100 [I 0] I 2  J 0 0 0    ,   [(0, 5); (1, 3)]           ,   1   ⟩   ───► (incr)
⟨    J 1 2 100  I 0 [I 2] J 0 0 0    ,   [(0, 6); (1, 3)]           ,   2   ⟩   ───► (incr)
⟨    J 1 2 100  I 0  I 2 [J 0 0 0]   ,   [(0, 6); (1, 3); (2, 1)]   ,   3   ⟩   ───► (jump-eq)
⟨   [J 1 2 100] I 0  I 2  J 0 0 0    ,   [(0, 6); (1, 3); (2, 1)]   ,   0   ⟩   ───► (jump-neq)
⟨    J 1 2 100 [I 0] I 2  J 0 0 0    ,   [(0, 6); (1, 3); (2, 1)]   ,   1   ⟩   ───► (incr)
⟨    J 1 2 100  I 0 [I 2] J 0 0 0    ,   [(0, 7); (1, 3); (2, 1)]   ,   2   ⟩   ───► (incr)
⟨    J 1 2 100  I 0  I 2 [J 0 0 0]   ,   [(0, 7); (1, 3); (2, 2)]   ,   3   ⟩   ───► (jump-eq)
⟨   [J 1 2 100] I 0  I 2  J 0 0 0    ,   [(0, 7); (1, 3); (2, 2)]   ,   0   ⟩   ───► (jump-neq)
⟨    J 1 2 100 [I 0] I 2  J 0 0 0    ,   [(0, 7); (1, 3); (2, 2)]   ,   1   ⟩   ───► (incr)
⟨    J 1 2 100  I 0 [I 2] J 0 0 0    ,   [(0, 8); (1, 3); (2, 2)]   ,   2   ⟩   ───► (incr)
⟨    J 1 2 100  I 0  I 2 [J 0 0 0]   ,   [(0, 8); (1, 3); (2, 3)]   ,   3   ⟩   ───► (jump-eq)
⟨   [J 1 2 100] I 0  I 2  J 0 0 0    ,   [(0, 8); (1, 3); (2, 3)]   ,   0   ⟩   ───► (jump-eq)
⟨    J 1 2 100  I 0  I 2  J 0 0 0    ,   [(0, 8); (1, 3); (2, 3)]   ,   100 ⟩

We've put square brackets in the program around the instruction that the program counter points to in order to make the behavior more clear. In this problem, you'll be implementing an interpreter for URM programs.

val sep_on_whitespace : string -> string list

sep_on_whitespace s is the result of removing all whitespace in s and combining the contiugous non-whitespace characters in s. This function is implemented for you.

val parse_urm : string list -> int list list

Implement the function parse_urm so that parse_urm l maps the ouput of sep_on_whitespace s to a URM program representation, given that s is a valid URM program. In particular, the behavior of the function is undefined if s does not represent a valid URM program.

val eval_urm : int list list -> registers -> registers

Implement the function eval_urm so that eval_urm prog rs is the result of evaluating the program prog on the URM rs according to the above rules.

All remaining functions are implemented for you.

val interp_urm : string -> int list -> int

interp_urm s args is the result of interpreting the URM program s on the URM after loading the values in args.

val max_urm : int -> int -> int

max_urm i j is the maximum of i and j, implemented by intpreting a URM program.

val fibonacci_urm : int -> int

fibonacci_urm i is ith Fibonacci number, implemented by intpreting a URM program.

Written (50%)

Typing Derivation (I)

    \def\code#1{\textcolor{purple}{\texttt{#1}}}
    \def\side#1{\textcolor{green}{#1}}
    \{ \code{x} : \code{int} , \code{y} : \code{bool} \} \vdash \code{if y then [] else x :: []} : \code{int list}
    

Give a typing derivation of the above typing judgment using the following rules. Do not use any shorthands in your derivation.

    \def\code#1{\textcolor{purple}{\texttt{#1}}}
    \def\side#1{\textcolor{green}{#1}}
    \frac
    {\side{(x : t) \in \Gamma}}
    {\Gamma \vdash x : t}
    \text{(var)}
    \qquad
    \frac
    {
    \Gamma \vdash e_1 : \code{bool}
    \quad
    \Gamma \vdash e_2 : \tau
    \qquad
    \Gamma \vdash e_3 : \tau
    }
    {\Gamma \vdash \code{if } e_1 \code{ then } e_2 \code{ else } e_3 : \tau}
    \text{(if)}
    \qquad
    \frac{}
    {\Gamma \vdash \code{[]} : \tau \code{ list}}
    \text{(nil)}
    \qquad
    \frac
    {
    \Gamma \vdash e_1 : \tau
    \quad
    \Gamma \vdash e_2 : \tau \code{ list}
    }
    {\Gamma \vdash e_1 \code{ :: } e_2 : \tau \code{ list}}
    \text{(cons)}
    

Typing Derivation (II)

   \def\code#1{\textcolor{purple}{\texttt{#1}}}
   \def\side#1{\textcolor{green}{#1}}
   \Gamma \vdash \code{g (f (g 0))} : \tau
   

Determine a minimal context \Gamma and type \tau such that the above typing judgment is derivable, and then given a derivation. Minimal here means that if you remove a variable declaration from \Gamma, then the judgment is no longer derivable. You may use the following rules. Do not use any shorthands in your derivation.

   \def\code#1{\textcolor{purple}{\texttt{#1}}}
   \def\side#1{\textcolor{green}{#1}}
   \frac
   {\side{(x : t) \in \Gamma}}
   {\Gamma \vdash x : t}
   \text{(var)}
   \qquad
   \frac
   {\side{n {\text{ is an int literal}}}}
   {\Gamma \vdash n : \code{int}}
   \text{(int)}
   \qquad
   \frac
   {
   \Gamma \vdash e_1 : \tau_2 \code{ -> } \tau
   \quad
   \Gamma \vdash e_2 : \tau_2
   }
   {
   \Gamma \vdash e_1 \ e_2 : \tau
   }
   \text{(app)}
   

Can there be more than one choice of \Gamma and \tau? Justify your answer and explain the relationship between \Gamma and \tau.

URM Programs

The way that we formalize the semantic behavior of things like URM programs is by using the transitive closure of the evaluation judgment we defined above. This can be formalized by introduing the following two rules which over a new judgment of the form C_1 \longrightarrow^\star C_2 where C_1 and C_2 are configurations as defined above:

   \frac{}
   {C \longrightarrow^\star C}
   \text{(refl)}
   \qquad
   \frac{
   C_1 \longrightarrow C_2
   \quad
   C_2 \longrightarrow^\star C_3
   }
   {
   C_1 \longrightarrow^\star C_3
   }
   \text{(trans)}
   

Write a derivation of the following judgment using these rules and the rules from the above section on URM programs.

   \def\code#1{\textcolor{purple}{\texttt{#1}}}

   \langle \ \code{J 0 1 2 I 0 I 1} \ , \ \code{[(0, 5); (1, 5)]} \ , \ 0 \ \rangle \longrightarrow^\star
   \langle \ \code{J 0 1 2 I 0 I 1} \ , \ \code{[(0, 5); (1, 6)]} \ , \ 3 \ \rangle
   

You can use P has shorthand for the program \texttt{J 0 1 2 I 0 I 1} but otherwise do not use any shorthands in your derivation.