Assignment 5

The following assignment is due Thursday 3/20 by 8:00 PM. You'll need to submit one crate on Gradescope called assign5.

The specification for this assignment is very simple: implement an S-expressions parser. This is a great way to make sure sure you understand how to use indirection to build and work with recursive data.

You're required to use the following types in your program (you may want to add derived traits):

struct Metadata {
    line_num: usize,
    col_num: usize,
}

enum Sexpr<T> {
    Atom(T),
    List(Box<Vec<Sexpr<T>>>),
}

enum ParseErrorKind {
    NotClosed,
    ExtraClosed,
}

struct ParseError {
    kind: ParseErrorKind,
    metadata: Metadata,
}

You're required to implement the following method:

impl Sexpr<String> {
    fn parse(input: &str) -> Result<Vec<Sexpr<String>>, ParseError> {
        todo!()
    }
}

This function should succeed if str represents a sequence of S-expressions. For example, parsing:

(add
  (mul 2 3)
  4)
((one "1") (two "2"))

should give you the S-expression (eliding boxes):

[
    List([
        Atom("add"),
        List([Atom("mul"), Atom("2"), Atom("3")]), Atom("4")
    ]),
    List([
        List([Atom("one"), Atom("\"1\"")]),
        List([Atom("two"), Atom("\"2\"")])
    ]),
]

There are only two ways that a sequence of general S-expression can be ill-formed:

  1. There is an unclosed opening parenthesis
  2. There is an extra closing parenthesis

In either of these cases, you should return a ParseError. Note that errors include Metadata which refers to the line and column of the offending parenthesis. In the first case, you should return an error with metadata for the first unopened closing parentheis. In the second case, you should return an error with metadata for the first extra closing parenthesis.

Parsing the ill-formed S-expression:

((foo
 bar  )  )  )
( (  ) ) )

should give you:

ParseError {
    kind: ExtraClosed,
    metadata: Metadata { line_num: 2, col_num: 12 }
}

whereas parsing:

((foo
 bar  )) (
  ( (  )

should give you:

ParseError {
    kind: NotClosed,
    metadata: Metadata { line_num: 2, col_num: 9 }
}

You can do this however you'd like, but I'd recommend the following structure:

enum Lexeme {
    Lparen,
    Rparen,
    Atom(String),
}

struct Token {
    lexeme: Lexeme,
    metadata: Metadata,
}


struct Lexer<'a> {
    contents: Lines<'a>,
    curr_line_num: usize,
    curr_col_num: usize,
    curr_line: &'a str,
}

impl<'a> Lexer<'a> {
    fn new(input: &'a str) -> Lexer<'a> {
        Lexer {
            contents: input.lines(),
            curr_line_num: 0,
            curr_col_num: 1,
            curr_line: "",
        }
    }
}

impl<'a> Iterator for Lexer<'a> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        todo!()
    }
}

Happy coding.