Module 4b: Semantics

BIN (aka Your First Language)

  • ROTATE ROLES and write the name of the new driver here:

Consider the following BNF grammar:

<bin> ::= '#'
        | '(' <bin> <bin> ')'
  • Write an algebraic data type called Bin which corresponds to <bin>. That is, Bin should encode the abstract syntax trees corresponding to the concrete syntax <bin>.

  • Write a function prettyBin which turns an abstract syntax tree into concrete syntax.

  • Write a function parseBin :: String -> Maybe (Bin, String) which turns concrete syntax into abstract syntax.

One way we could give a semantics (meaning) to abstract Bin trees is as follows: a leaf has value 1; the value of a branch with left and right subtrees is the value of the left subtree plus twice the value of the right subtree.

  • Write a function interpBin :: Bin -> Integer which implements this semantics.

We can think of this as a little programming language for computing integers; let’s call it BIN. (Of course, BIN is not a very useful language, but don’t let it hear you say that because you might hurt its feelings.) For example, "((##)(##))" is a program to compute the value 9. parseBin is a parser for the language, and interp is an interpreter.

  • Put the pieces together to create a function evalBin :: String -> Maybe Integer. For example,

      evalBin "#"          --> Just 1
      evalBin "((##)(##))" --> Just 9
      evalBin "(##"        --> Nothing
  • In class we considered three different sources of error in a programming language. Which kind(s) of errors are possible in BIN?

  • Why can’t the other kind(s) of errors occur in BIN?


  • ROTATE ROLES and write the name of the new driver here:

The language EBIN (which stands, of course, for extended BIN) is just like BIN except it also allows single digits, which can stand in for an entire tree. In other words, a digit can go anywhere a # can go in a BIN program. For example, "((4#)(#2))" is a valid EBIN program. Note that "((##)(42))" is also a valid EBIN program, which does not contain the number 42 but rather the single digits 4 and 2.

The digit n stands for a full binary tree of height n. For example,

  • 0 stands for #
  • 1 stands for (##)
  • 2 stands for ((##)(##))

and so on. Thus, EBIN is not a completely new, separate language, but just adds some syntax sugar to BIN. That is, single digits are just a convenient shorthand for more cumbersome but equivalent expressions in BIN. We define EBIN in terms of BIN, and we can think of EBIN programs as abbreviated BIN programs.

  • Write down a BNF grammar for EBIN.

  • Make an algebraic data type EBin to represent EBIN abstract syntax. (Note, at this stage you should not be expanding digits into full binary trees; we will do that later.) Just record the possibilities that EBIN expressions could contain a digit, a #, or a parenthesized pair of EBIN expressions.

  • Write a parser for EBIN. (Hints: you may find the isDigit and read functions useful. Note that Data.Char was imported at the beginning of the module so you can use isDigit. You are always welcome to import other library modules as needed. read has a more general type but for the purposes of this assignment you can think of it as having type String -> Integer; for example, read "34" = 34.)

  • Write a function desugar :: EBin -> Bin which desugars EBIN abstract syntax into equivalent BIN abstract syntax. (Hint: you will probably find it useful to write a separate function grow :: Integer -> Bin which converts an integer into a full tree of the given height.)

  • Would it be possible to do the desugaring phase before the parsing phase? If so, what would be the type of desugar?

  • Now put all the pieces together to define an evaluator evalEBin :: String -> Maybe Integer. That is, evalEBin should encompass the following phases:

    Concrete syntax parse EBIN AST desugar BIN AST interpret Integer

    For example, evalEBin "(#2)" should yield 19, and "((4#)(#2))" should evaluate to 121.

As an (optional) fun aside, make a conjecture about the value of EBIN programs consisting of a single digit. That is, what can you say about evalEBin "0", evalEBin "1", evalEBin "2", and so on? Can you prove your conjecture?