Final Project

Part 0: Set up
Part 1: Semantics
Part 2: Types
Part 3: Syntax
Part 4: Assertions
Part 5: Everything

Welcome to the final project for CAS CS 392: Rust in Practice and in Theory. As you know, we'll be building an interpreter for Featherweight Rust (FR), a subset of Rust defined and formalized by David J. Pearce. I envisage this as something between a project spec and a tutorial. I'll be adding to it for each part of the project, trying to gently guide you to the finish line. I won't be giving any starter code, I'd like this project to be something that you build from scratch.

Each part of the project will generally be presented in two forms:

hard mode, a high-level description with little detail and many degrees of freedom;
easy mode, a low-level description with more detail and fewer degrees of freedom.

The first form should be read as the actual project spec, what you're required to do. The second form should be read as a guide to the simplest way of satisfy the spec. The point being: if you want to make this a fully featured interpreter, then go for it. As long as you satisfy the spec. If you're just looking to make it to the end, the guide should provide a more direct path there.

Let's get started.

Part 0: Set up

At the beginning, there was a new crate. Choose a "clever" name for a small subset of Rust. If you don't want to be clever, you can use the name I'll be using for this page:¹

cargo new salt

First we're gonna set up our files. We want our project to have the following structure in the directory salt/src:

.
├── bin
│   ├── interp.rs
│   └── playground.rs
├── lib.rs
└── utils.rs

The files interp.rs and playground.rs should have empty main functions, utils.rs should be an empty file, and lib.rs should contain the single line:

pub mod utils;

A couple notes about this structure:

This crate has two executables. If we want to run the playground we can use cargo run --bin playground and if we want to run the interpreter, we can use cargo run --bin interp. This will be useful when we want to quickly check how Rust behaves on some program, and eventually when we want a file sitting around for testing the whole pipeline.
The file utils.rs will contain code that's shared across the project. You can put whatever you want in there (or get rid of it).
The file lib.rs is the entry point for all the code we write. By the end of this project, it'll have just a couple lines, making public the various parts of our interpreter.
```
pub mod utils;  // PART 0
pub mod eval;   // PART 1
pub mod types;  // PART 2
pub mod lexer;  // PART 3
pub mod parser; // PART 3
```

Part 1: Semantics

Then there was an evaluator. We'll start by implementing the semantics for FR. This means defining a couple things:

an enumeration for expressions in FR (called terms in the FR paper);
a structure representing the evaluation context (this is a confluence of terms, it's not the same as an evaluation context in the FR paper (Definition 3.5), it's just a structure with the necessary data to perform evaluation);
a method for evaluating expressions according to the FR spec.

Important: Keep in mind that in the next part we'll be implementing a type/borrow checker. The point of the type/borrow checker is to make sure we only evaluate well-behaved expressions. In practice this means unwrap to your hearts delight.²

Hard-Mode

At a high-level, you need to implement the following according to the FR spec:

struct Expr;     // TODO
struct Value;    // TODO
struct Lifetime; // TODO
struct Context;  // TODO


impl Context {
  fn eval_expr(&mut self, e : &Expr, l: Lifetime) -> Value { todo!() }
}

If that's all you need, then have at it. Feel free to put the code wherever it's clear.

Easy-Mode

Create an empty file src/eval.rs and add the following line to the file src/lib.rs:

pub mod eval;

Programs

Let's start by giving a slightly different grammar for FR:

<expr>  ::= '{' { <stmt> '⨟' } [<expr>] '}'
          | 'Box::new' '(' <expr> ')'
          | <lval>
          | '&' ['mut'] <lval>
          | <int>

<stmt>  ::= <expr>
          | 'let' 'mut' <var> '=' <expr>
          | <lval> '=' <expr>

<lval>  ::= <var>
          | '*' <lval>

<prog>  ::=

The biggest different between this syntax and the given one is that we distinguish between expressions and statements.³

From this grammar we get a natural AST (which we can put into the file src/utils.rs):

type Ident = String;
enum Copyable {Yes, No}
enum Mutable {Yes, No}

struct Lifetime(usize);

struct Lval {
    ident: Ident,
    derefs: usize,
}

enum Expr {
    Unit,
    Int(i32),
    Lval(Lval, Copyable),
    Box(Box<Expr>),
    Borrow(Lval, Mutable),
    Block(Vec<Stmt>, Box<Expr>, Lifetime),
}

enum Stmt {
    Assign(Lval, Expr),
    LetMut(Ident, Expr),
    Expr(Expr),
}

Program Store

Next we need some machinary to define the notion of a program store. A quick reminder of definitions (see the FR paper (pg. 13-15) for more details):

A location is an abstract entity that is either named or unnamed. In reality, unnnamed locations still need unique identifers, but they don't correspond to any variable name in the program.
A value is:
- unit (\(\epsilon\));
- an integer;
- a reference, which a location that is either owned (\(\ell^\bullet\)) or borrowed (\(\ell^\circ\)).
A partial value is either a value or undefined (\(\bot\)).
A slot value is a partial value together with a lifetime (\(\langle v \rangle^m\)).
The program store (\(\mathcal S\)) is a map from locations to slots values.

We can do a pretty direct translation of these constructs, and put them in src/eval.rs:

Location = Ident;
enum Owned {Yes, No}


enum Value {
    Unit,
    Int(i32),
    Ref(Location, Owned),
}

type Pvalue = Option<Value>;

struct Slot {
    value: Pvalue,
    lifetime: Lifetime,
}

struct Store(HashMap<Location, Slot>);

We'll take unnamed locations to be locations given fresh identifiers.

The semantics of FR is presented against an interface of functions on program stores (write returns the old value in the slot):

impl Store {
    fn locate(&self, w: &Lval) -> &Location { todo!() }
    fn read(&self, x: &Lval) -> &Slot { todo!() }
    fn write(&mut self, x: &Lval, v: Pvalue) -> Pvalue { todo!() }
    fn drop(&mut self, values: Vec<Pvalue>) { todo! () }
}

Evaluation

Finally, the evaluation context is a structure with a program store, along with whatever other data we need in order to perform evaluation:

struct Context {
    store: Store,
    // TODO: anything else you need
}

All that's left to do is implement two evaluation functions, one for expressions and one for statements:

impl Context {
    pub fn eval_expr(&mut self, expr: &Expr, l: Lifetime) -> Value { todo!() }
    pub fn eval_stmt(&mut self, stmt: &Stmt, l: Lifetime) { todo!() }
}

Evaluating an expression returns a value, whereas evaluating a statement only modifies the evaluation context (i.e., the program store). This is a bit more natural compared to what is done in the FR paper, in which statements need to artificially be given values in order to define the semantics.

The trick is that we won't implement the semantics given in the FR paper, but instead its equivalent big-step semantics.

I'm going to leave this as part of the challenge of the project, but hopefully a couple examples will get you thinking about how to come up with the right big-step semantics for FR.

Example: Take the rule \(\text{R-Box}\) and it's associated rule in \(\text{R-Sub}\):
\begin{align*} \frac {\ell_n \not \in \mathbf{dom}(\mathcal S_1) \qquad \mathcal S_2 = \mathcal S_1 [\ell_n \mapsto \langle v \rangle^*] } {\langle \ \mathcal S_1 \vartriangleright \texttt{box} \ v \longrightarrow \mathcal S_2 \vartriangleright \ell_n^\bullet \ \rangle^l} \ (\text{R-Box}) \end{align*} \begin{align*} \frac {\langle \ \mathcal S_1 \vartriangleright t_1 \longrightarrow \mathcal S_2 \vartriangleright t_2 \ \rangle^l} {\langle \ \mathcal S_1 \vartriangleright \texttt{box} \ t_1 \longrightarrow \mathcal S_2 \vartriangleright \texttt{box} \ t_2 \ \rangle^l} \ (\text{R-Sub}) \end{align*}
The rule \(\text{R-Sub}\) states that we need to evaluate the argument of a box constructor before evaluating the box expression itself (the '\(v\)' in \(\text{R-Box}\) is a value). We can package this into a single big-step rule:
\begin{align*} &\frac {\langle \ \mathcal S_1 \vartriangleright e \Downarrow \mathcal S_2 \vartriangleright v \ \rangle^l \qquad \ell_n \not \in \mathbf{dom}(\mathcal S_2) \qquad \mathcal S_3 = \mathcal S_2 [\ell_n \mapsto \langle v \rangle^*] } {\langle \ \mathcal S_1 \vartriangleright \texttt{box} \ e \Downarrow \mathcal S_3 \vartriangleright \ell_n^\bullet \ \rangle^l} \ (\text{R-Box-Big}) \end{align*}
Note that "\(\ell_n \not \in \mathbf{dom}(\mathcal S_2)\)" is just a freshness condition, expressing that \(n\) is a fresh unique identifier.

Example: In the case of statements, consider the rules \(\text{R-Declare}\) and its associated rule in \(\text{R-Sub}\):
\begin{align*} \frac {\mathcal S_2 = \mathcal S_1[\ell_x \mapsto \langle v \rangle^l]} {\langle \ \mathcal S_1 \vartriangleright \texttt{let mut} \ x \ \texttt{=} \ v \longrightarrow \mathcal S_2 \vartriangleright \epsilon \ \rangle^l} \ (\text{R-Declare}) \end{align*} \begin{align*} \frac {\langle \ \mathcal S_1 \vartriangleright t_1 \longrightarrow \mathcal S_2 \vartriangleright t_2 \ \rangle^l} {\langle \ \mathcal S_1 \vartriangleright \texttt{let mut} \ x \ \texttt{=} \ t_1 \longrightarrow \mathcal S_2 \vartriangleright \texttt{let mut} \ x \ \texttt{=} \ t_2 \ \rangle^l} \ (\text{R-Sub}) \end{align*}
We can package this into a single big-step rule, except now there is no need to return a value, just a new store:
\begin{align*} \frac { \langle \ \mathcal S_1 \vartriangleright e \Downarrow \mathcal S_2 \vartriangleright v \ \rangle^l \qquad \mathcal S_3 = \mathcal S_2[\ell_x \mapsto \langle v \rangle^l] } {\langle \ \mathcal S_1 \vartriangleright \texttt{let mut} \ x \ \texttt{=} \ v \Downarrow \mathcal S_3 \rangle^l} \ (\text{R-Declare-Big}) \end{align*}

The name of the game is determining how to do this translation for the remaining rules. It's a bit more work conceptually, but I think a fair amount easier in the implementation.

Remarks

One thing we're hand-waving here is our definition of Lifetime. The only thing that's required of lifetimes for the semantics is that every block is labeled with a unique lifetime. This is so the semantics of dropping is correct (we don't want to drop values from the wrong block).
In the \(\text{R-Box-Big}\) rule, you'll need to use the global lifetime. The easiest way to deal with this for now is to define an method for Lifetime:
```
impl Lifetime {
    pub fn global() -> Lifetime {
        Lifetime(0)
    }
}
```
Then when we update the definition of Lifetime for our type/borrow checker, we won't need to update the evaluator at all.
There are many small missing details even in this description of the code. It'll be worthwhile to create some methods, derive some traits, the usual things we need when putting together a Rust project.
One way you might want to extend your implementation to be more useful is to carry metadata in the expression to make error messages more useful.
A quick reminder that this is the first iteration of this course, so I don't really know what y'all are going to struggle with the most. Please ask questions, because it helps me out too!

Part 2: Types

But we have to type/borrow check our programs before we evaluate. We take a step back to build a type/borrow checker. This part will be a fair amount more involved than the last. We need to define:

an enumeration for types in FR, which may become partial during type-checking;
a structure for representing the type-checking context, which will include a typing environment and auxiliary data for managing lifetime information;
a method for type-checking a FR expression according to the FR spec.

Hard Mode

At a high-level, you need to implement the following according to the FR spec:

enum Type;      // TODO
struct Context; // TODO
enum Error;     // TODO

impl Context {
    fn type_expr(&mut self, expr: &mut Expr) -> Result<Type, Error> {
        todo!()
    }
}

Note that type_expr takes a mutable reference to an expression. This is because our surface-level syntax doesn't include annotations which label l-values as copyable. In the parsing stage, we'll label everything as movable, and during the type-checking phase, we'll relabel l-values with copyable types.

Easy Mode

Create a empty file src/types.rs and include the following line to the file src/lib.rs:

pub mod types;

Types

We start with an enumeration for types. This will already be a departure from the FR specification; it will be more convenient if our notion of type includes partial types:

enum Type {
    Unit,
    Int,
    Box(Box<Type>),
    Ref(Lval, Mutable),
    Undefined(Box<Type>),
}

We will think of the Undefined constructor as a way of "walling off" part of a type as undefined.⁴

Typing Environment

The typing environment is a mapping from identifiers to slotted types (i.e., (partial) types labeled with lifetimes), very similar to our program store from the previous part:

struct Slot {
  tipe: Type,
  lifetime: Lifetime,
}

struct Env(HashMap<Ident, Slot>);

For now, you are not required to give meaningful type errors. We can set up an enumeration for errors with a single constructor:

enum Error {
    Dummy,
}

type TypeResult<T> = Result<T, Error>;

The environment should implement functions that appear in the spec.

impl Env {
    fn insert(&mut self, var: &str, tipe: Type, lifetime: Lifetime) { todo!() }

    fn type_lval(&self, lval: &Lval) -> TypeResult<Slot> { todo!() }

    // Returns the type under the boxes of a type, given that the
    // underlying type is defined
    fn contained(&self, var: &Ident) -> Option<&Type> { todo!() }

    fn read_prohibited(&self, lval: &Lval) -> bool { todo!() }

    fn write_prohibited(&self, lval: &Lval) -> bool { todo!() }

    // "move" is a keyword in Rust
    fn moove(&mut self, lval: &Lval) -> TypeResult<()> { todo!() }

    // so is "mut"
    fn muut(&self, lval: &Lval) -> bool { todo!() }

    fn compatible(&self, t1: &Type, t2: &Type) -> bool { todo!() }

    fn write(&mut self, w: &Lval, tipe: Type) -> TypeResult<()> { todo!() }

    fn drop(&mut self, l: Lifetime) { todo!() }
}

Most of the functions take a mutable reference to self. This is all we really mean when we say that we're implementing a flow-sensitive type system: the typing environment changes throughout the process of type checking. This also means you have to make sure that you check premises in the correct order!

Alternative Definitions

A lot of these functions are straightforward translations of the definitions given in the spec. There are couple worth expanding on:

Definition: (Type Containment) The type contained in a partial type (a.k.a. the contained type) is defined inductively as follows:

the type contained in \(\epsilon\) (i.e., the unit type) is \(\epsilon\);

the type contained in int is int;

the type contained in &[mut] w is &[mut] w;

the type contained in \(\square \tau\) is the type contained in \(\tau\);

otherwise, the contained type is undefined.

That is, the contained type is the full type under any number of partial boxes, given that it is defined.

The type contained in the identifier \(x\) with respect to the typing environment \(\Gamma\) is the type contained in \(\sigma\) where \(\Gamma(x) = \langle \sigma \rangle^l\), given that \(x \in \mathsf{dom}(\Gamma)\) and \(\sigma\) contains a type. It's undefined otherwise.

That is, the contained type of an identifier is the contained type of its partial type in the typing environment, if it's defined.

Definition: (Strong Update) We define the (partial) strong update function and the (partial) write function as follows:
\begin{align*} \mathsf{update}(\Gamma, \epsilon , \tau_1, \tau_2) &= (\Gamma, \tau_2) \\ \mathsf{update}(\Gamma, *\pi , \square \tau_1, \tau_2) &= (\Gamma', \square \tau_1') &(\Gamma', \tau_1') = \mathsf{update}(\Gamma, \pi , \tau_1, \tau_2) \\ \mathsf{update}(\Gamma, *\pi , \&\mathsf{mut} \ u, \tau_2) &= (\Gamma', \&\mathsf{mut} \ u) &\Gamma' = \mathsf{write}(\Gamma, \pi u, \tau_2) \end{align*} \begin{align*} \mathsf{write}(\Gamma, w, \tau) &= \Gamma'[x \mapsto \langle \sigma' \rangle^l] \text{ where} \\ w &= \pi \ | \ x \\ \Gamma(x) &= \langle \sigma \rangle^l \\ (\Gamma', \sigma') &= \mathsf{update}(\Gamma, \pi, \sigma, \tau) \end{align*}

write is by far the most difficult functions in the above interface to implement, particularly if you want to avoid unnecessary cloning. As a hint, think about how strong update can be implemented using a mutable reference to a type, which can be used to write the given argument tipe.

Remarks (Part 2.1)

Because we're implementing strong updates, we are ignoring all issues regarding type/environment strengthing/joins.
Many of the above functions depend on working with paths. This is, in part, why we defined l-values as we did, with a usize representing the number of dereferences on a variable. You can use this number as a path.
The functions moove and write, in theory, have the potential to fail, but don't return TypeResult<()>. This is because they should not fail given that the other premises hold. As discussed during lecture, this will require checking that types are not partial in some cases. That was a lie, my apologies. We do need to return errors for moves and writes, e.g., in the cases there is an attempt to move or write through an immutable reference.

Typing Context

As with evaluation, we need more than an environment to perform type/borrow checking. We'll create a typing context to maintain this information.

pub struct Context {
    env: Env,
    // TODO: anything else you need
}

This typing context should implement the following interface:

impl Context {
    // l ≥ m, the ordering relation on liftimes (Note (2) pg. 13)
    fn lifetime_contains(&self, l: Lifetime, m: Lifetime) -> bool { todo!() }

    // Γ ⊢ T ≥ l (Definition 3.21)
    fn well_formed(&self, tipe: &Type, l: Lifetime) -> bool { todo!() }

    fn type_expr(&mut self, expr: &mut Expr) -> TypeResult<Type> { todo!() }

    fn type_stmt(&mut self, stmt: &mut Stmt) -> TypeResult<()> { todo!() }
}

Type Errors

Now that we're putting together the full type/borrow checker, we're gonna fill in our type errors. There are a lot of ways that the type/borrow checker can fail. If you're following along with the structure we've defined here, I'd recommend the following enumeration for type errors:

enum Error {
    UnknownVar(String),
    CannotDeref(Type),
    MovedOut(Lval),
    MoveBehindRef(Lval),
    UpdateBehindImmRef(Lval),
    CopyAfterMutBorrow(Lval),
    MoveAfterBorrow(Lval),
    MutBorrowBehindImmRef(Lval),
    MutBorrowAfterBorrow(Lval),
    BorrowAfterMutBorrow(Lval),
    Shadowing(String),
    IncompatibleTypes(Type, Type),
    LifetimeTooShort(Expr),
    AssignAfterBorrow(Lval),
}

Feel free to add more error kinds if you'd like. Each one roughly captures failing a side condition in a typing rule.

Remarks (Part 2.2)

You'll need some information to keep track of what lifetimes you're contained in during type/borrow checking. I'd recommend a stack of lifetimes. (Important: Contrary to what I eluded to previously, we're going to keep identifying each lifetime with a usize.)
It may be useful to keep around the Dummy constructor in Error as you work through the type/borrow checker and fill in the actual errors later.
Remember that you should update the expression itself in the case that you see an l-value that is copyable. In the parsing phase, we'll mark everything as movable.
The coding here is a bit shorter than that of Part 2.1, but is more error prone. Work through the rules carefully.

Part 3: Syntax

I can't say we saved the best for last. To have a working interpreter, we need a parser. Nothing to it but that. This means defining:

a lexer and an enumeration for tokens in our language;
a method for parsing expressions in our language to the expression enumeration from the first part of this project.

Hard Mode

Pretty straightforward, you need:

struct Parser; // TODO
enum Error;    // TODO

impl Parser {
    fn parse(&mut self) -> Result<Expr, Error> { todo!() }
}

where the input string is something of the form:

fn main() {
    // CODE
    ...
}

and you return the expression in which you read the bracketed part as a block-expression.

Easy Mode

Create two empty files src/lexer.rs and src/parser.rs. Then add the lines:

pub mod lexer;
pub mod parser;

to the file src/lib.rs.

Lexer

We can start with an token enumeration in the lexer file:

pub enum Token {
    Lparen,
    Rparen,
    Lbracket,
    Rbracket,
    Eq,
    Ampersand,
    Star,
    Comma,
    Semicolon,
    Fn,
    Let,
    Mut,
    Box,
    Int(i32),
    Var(String),
}

It is useful to delineate the lexemes in a single constant array:

const LEXEMES : [(&str, Token); 13] = [
    ("(", Token::Lparen),
    (")", Token::Rparen),
    ("{", Token::Lbracket),
    ("}", Token::Rbracket),
    ("=", Token::Eq),
    ("&", Token::Ampersand),
    ("*", Token::Star),
    (",", Token::Comma),
    (";", Token::Semicolon),
    ("fn", Token::Fn),
    ("let", Token::Let),
    ("mut", Token::Mut),
    ("Box::new", Token::Box),
];

We'll assume ASCII (no unicode). We can then use a similar structured lexer to what was used in our assignment on S-expressions:

struct Lexer<'a> {
    contents: Lines<'a>,
    curr_line_num: usize,
    curr_col_num: usize,
    curr_line: &'a str,
}

enum Error {
    Unknown(usize, usize),
}

type LexResult = Result<Token, Error>;

Hint: It's easiest to assume that curr_line is "" if contents is empty. I would then recommend the following interface for the lexer:

impl<'a> Lexer<'a> {
    fn unknown(&self) -> Error {
        Error::Unknown(self.curr_line_num, self.curr_col_num)
    }

    // remove `i` characters and remove any leading whitespace from the
    // result (including empty lines)
    fn consume(&mut self, i: usize) { todo!() }

    fn symbol_or_keyword(&mut self) -> LexResult { todo!() }
        for (lexeme, token) in LEXEMES {
            if self.curr_line.starts_with(lexeme) {
                self.consume(lexeme.len());
                return Ok(token);
            }
        }
        Err(self.unknown())
    }

    // similar to `symbol_or_keyword` but for variables
    fn variable(&mut self) -> LexResult { todo!() }

    // similar to `symbol_or_keyword` for but integer literals
    fn int(&mut self) -> LexResult { todo!() }
}

Finally, you should make Lexer and iterator:

impl<'a> Iterator for Lexer<'a> {
    type Item = LexResult;
    fn next(&mut self) -> Option<LexResult> { todo!() }
}

Parser

As we've done in the past, we'll give our parser a peekable version of our lexer. Hint: You may want to add to this structure…

pub struct Parser<'a> {
    lexer: Peekable<Lexer<'a>>,
    // TODO: anything else you need
}

pub enum Error {
    EndOfFile,
    Lexer(lexer::Error),
    Unexpected(lexer::Token),
}

type ParseResult<T> = Result<T, Error>;

Here's a little interface for Parser that I found useful:

impl<'a> Parser<'a> {
    fn next_token(&mut self) -> ParseResult<Token> {
        match self.lexer.next() {
            Some(Ok(token)) => Ok(token),
            Some(Err(err)) => Err(Error::Lexer(err)),
            None => Err(Error::EndOfFile),
        }
    }

    fn next_token_match(&mut self, t: Token) -> ParseResult<Token> {
        let token = self.next_token()?;
        if token != t {
            Err(Error::Unexpected(token))
        } else {
            Ok(token)
        }
    }

    fn next_token_var(&mut self) -> ParseResult<String> {
        let token = self.next_token()?;
        match token {
            Token::Var(ident) => Ok(ident.clone()),
            _ => Err(Error::Unexpected(token))
        }
    }

    fn peek_token(&mut self) -> Result<&Token, Error> {
        let token = self.lexer.peek();
        match token {
            Some(Ok(token)) => Ok(token),
            _ => Err(Error::EndOfFile),
        }
    }
}

What remains is to implement the parse function.

impl<'a> Parser<'a> {
    fn parse_block(&mut self) -> ParseResult<Expr> { todo!() }

    fn parse(&mut self) -> ParseResult<Expr> {
        self.next_token_match(Token::Fn)?;
        let main = self.next_token()?;
        match main {
            Token::Var(ident) if ident == "main" => {
                self.next_token_match(Token::Lparen)?;
                self.next_token_match(Token::Rparen)?;
                let out = self.parse_block();
                if self.peek_token().is_err() {
                    return out
                }
                Err(Error::Unexpected(self.next_token()?))
            }
            _ => Err(Error::Unexpected(main))
        }
    }
}

This function should take a program of the form:

fn main() {
    ...
}

and return the expression in which you read the bracketed part as a block-expression.

Again, as we've done the past, you should approach this in a recursive-descent fashion, writing mini-parsers for each individual construct. With the interface above, this can be quite clean, e.g.:

fn parse_box(&mut self) -> ParseResult<Expr> {
    self.next_token_match(Token::Box)?;
    self.next_token_match(Token::Lparen)?;
    let e = self.parse_expr()?;
    self.next_token_match(Token::Rparen)?;
    Ok(Expr::Box(Box::new(e)))
}

Remarks

Don't forget to handle comments in the lexer!
Remember that every l-value expression should be parsed as not copyable. These should be made copyable in type/borrow checker.
You'll probably want to add a field to the Parser to help you deal with lifetime annotations on blocks. The only thing that's required is that every block has a distinct lifetime annotation (which is also distinct from the global lifetime), since the ordering of lifetimes is dealt with in the type/borrow checker.

Part 4: Assertions

We ended up not covering extensions as much as I would have liked. If you managed to implement an extension, great! If not, no worries. But there's one extensions that it incredibly useful just for making sure your code works as expected: assertions.

This is an "extra credit" part of the project, but if you're working in a group, you MUST complete this part.

The task: go back through the project and add what you need in order to interpret code of the following form:

fn main() {
    let mut x = Box::new(2);
    let mut y = Box::new(2);
    let mut z = &y;
    assert_eq!(*x, **z);
}

Formally speaking this means adding:

a constructor to the expression enumeration;
a token to the lexer;
a case to the parser;
a case to the type checker; note that assert_eq!(e1, e2) is an expression of unit type; in order to simplify things, we will require that e1 and e2 have integer type;
a constructor to the type error enumeration for ill-typed assertions
a case to the evaluator; assert_eq!(e1, e2) should reflect to an assertion within Rust, i.e., you should evaluate e1 and e2 and then assert within Rust that their corresponding integer values are equal.

Part 5: Everything

All that's left is to fill in our bin/interp.rs file so that we can run our interpreter. We'll play this part a bit fast and loose:

use std::env;
use std::fs::File;
use std::io::{Read, Error, ErrorKind};
use salt::parser::Parser;
use salt::eval;
use salt::types;
use salt::utils::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let filename = {
        let mut args = env::args();
        args.next();
        args.next().ok_or(Error::new(ErrorKind::NotFound, "usage: cargo run --bin interp <filename>"))?
    };

    let mut contents = String::new();
    File::open(filename)?.read_to_string(&mut contents)?;

    let mut e = Parser::new(&contents[..])
        .parse()
        .map_err(|err| Error::new(ErrorKind::Other, format!("{:?}", err)))?;

    types::Context::default()
        .type_expr(&mut e)
        .map_err(|err| Error::new(ErrorKind::Other, format!("{:?}", err)))?;

    eval::Context::default()
        .eval_expr(&e, Lifetime::global());

    Ok(())
}

You may have to go back and derive Debug for some structures in order for this to work.

And that's all folks. I hope you enjoyed building an interpreter of Rust in Rust. At this point, you will almost certainly uncover some unexpected bugs. Play around with some examples, see if you can handle some of the more interesting programs we've seen throughout the course!

Remember that we set things up to have that playground file in our crate. This means we can run:

cargo run --bin interp playground.rs
cargo run --bin playground

and compare what Rust does to what our interpreter does. This should help out in testing out examples.

Footnotes:

I dunno, I guess salt is usually a precursor to rust.

Although you should keep in mind that using unwrap a lot can make code a bit harder to debug.

These are combined in to a notion of a term in the FR paper.

⁴

This enumeration has a lot of ill-defined partial types, e.g., a partial type is required to have a most one undefined part in the core calculus. We'll ignore this, but it's worth keeping in mind.

Final Project

Table of Contents

Part 0: Set up

Part 1: Semantics

Hard-Mode

Easy-Mode

Programs

Program Store

Evaluation

Remarks

Part 2: Types

Hard Mode

Easy Mode

Types

Typing Environment

Alternative Definitions

Remarks (Part 2.1)

Typing Context

Type Errors

Remarks (Part 2.2)

Part 3: Syntax

Hard Mode

Easy Mode

Lexer

Parser

Remarks

Part 4: Assertions

Part 5: Everything

Footnotes: