COMP3161/9164 23T3 Assignment 1
hindsight
Version 1.0.6
Marks: 17.5% of the mark for the course.
Due date: Friday, Week 8, 1st of November 2024, 23:59:59 Sydney time
Overview
In this assignment you will implement an interpreter for MinHS, a small functional language similar to ML and Haskell. It is fully typed, with types specified by the programmer.
However, we will not evaluate MinHS directly; instead, we’ll first compile it to an intermediate language we call hindsight. hindsight uses neither call-by-value nor call-by-name evaluation, but call-by-push-value. This means the programmer decides the evaluation order herself, using explicit operators to steer the control flow. Once we have implemented an evaluator for hindsight, we can then give MinHS either a call-by-value or a call-by-name evaluator, by going to hindsight via different compilation strategies.
The assignment consists of a base compulsory component, worth 70%, and four additional components which collectively are worth 50%, meaning that not all must be completed to earn full marks.
Your total mark can go up to 120%. Any marks above 100% will be converted to bonus exam marks, at a 20-to-3 exchange rate. For example, earning 110% on the assignment will yield 1.5 bonus marks on the final exam.
• Task 1 (70%)
Implement an interpreter for hindsight, using an environment semantics, including support for recursion and closures.
• Task 2 (10%)
Extend the interpreter to support partially applied primops.
• Task 3 (10%)
Extend the interpreter to support multiple bindings in the one let form.
• Task 4 (10%)
Implement an optimisation pass for hindsight.
• Task 5 (20%)
Implement a call-by-name compiler from MinHS to hindsight.
The front end of the interpreter (lexer, parser, type checker) is provided for you, along with the type of the evaluate function (found in the file Hindsight/Evaluator.hs) and an implementation stub. The function evaluate returns an object of type Value. You may modify the constructors for Value if you wish, but not the type for evaluate. The return value of evaluate is used to check the correctness of your assignment.
You must provide an implementation of evaluate, in Hindsight/Evaluator.hs. It is this file you will submit for Task 1. The only other files that can be modified are Hindsight/Optimiser.hs (for Task 4) and Hindsight/CBNCompile.hs (for Task 5).
You can assume the type checker has done its job and will only give you type-correct programs to evaluate. The type checker will, in general, rule out type-incorrect programs, so the interpreter does not have to consider them.
Please use the Ed forum for questions about this assignment.
Submission
Submit your (modified) Hindsight/Evaluator.hs, Hindsight/Optimiser.hs and Hindsight/CBNCompile.hs using the CSE give system, by typing the command
give cs3161 Eval Evaluator.hs Optimiser.hs CBNCompile.hs
or by using the CSE give web interface. Note that Optimiser.hs and CBNCompile.hs are optional, and should only be included if you completed the corresponding bonus tasks.
1 Primer on call-by-push-value
As mentioned, hindsight is a call-by-push-value language. The core of the language is similar to MinHS as seen in the lectures. This section will describe some of the key differences.
Following the call-by-push-value paradigm, hindsight distinguishes between two kinds of expressions: value expressions and computation expressions. A value expression denotes a value, and a computation expression denotes a process that might produce a value if we run it.
Computations can be suspended using the thunk operator, and suspended computations can be passed around as value expressions, and later resumed using force. Here’s an example program:
main :: F Bool
  = let y :: U(F Bool) = thunk(1 == 2);
    in
      reduce 1 < 2
      to x in
        if x
          then produce True
          else force y
The type annotation main :: F Bool means that main is a computation expression which produces a boolean result. The U in y :: U(F Bool) means that y is a suspended computation which, if resumed, would produce a boolean result. reduce 1 < 2 to x means that the computation 1 < 2 is evaluated, producing a value which is saved in the local binding x. If the True branch is chosen, we’ll run the trivial computation produce True which immediately produces a value True. Otherwise, we’ll resume the suspended computation from before.
Thus, in this case the equality comparison 1 == 2 is never evaluated. If we want the equality comparison to be evaluated first (despite the fact that we don’t need its result), we can refrain from suspending it:
main :: F Bool
  = reduce 1 == 2
    to y in
      reduce 1 < 2
      to x in
        if x
          then produce True
          else produce y
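The difference between the two programs can be made visible with a small Haskell model. This is only an illustrative sketch, not part of the provided code: the names Comp, Thunk, prim, trace and result are invented here, and a computation is modelled as the list of primitive operations it ran together with the value it produced.

```haskell
-- Toy model (not the assignment's types): a computation is the list of
-- primitive operations it ran, paired with the value it produced.
newtype Comp a = Comp ([String], a)

-- A suspended computation is a computation that has not been run yet.
newtype Thunk a = Thunk (Comp a)

produce :: a -> Comp a
produce x = Comp ([], x)

-- prim records that a primitive operation was executed.
prim :: String -> a -> Comp a
prim name x = Comp ([name], x)

-- reduce c1 to x in c2: run c1 first, then run c2 with its result.
reduce :: Comp a -> (a -> Comp b) -> Comp b
reduce (Comp (log1, x)) k =
  let Comp (log2, y) = k x
  in Comp (log1 ++ log2, y)

thunk :: Comp a -> Thunk a
thunk = Thunk

force :: Thunk a -> Comp a
force (Thunk c) = c

-- First program: 1 == 2 is suspended and never forced.
suspended :: Comp Bool
suspended =
  let y = thunk (prim "1 == 2" ((1 :: Integer) == 2))
  in reduce (prim "1 < 2" ((1 :: Integer) < 2)) $ \x ->
       if x then produce True else force y

-- Second program: 1 == 2 is reduced before 1 < 2.
eager :: Comp Bool
eager =
  reduce (prim "1 == 2" ((1 :: Integer) == 2)) $ \y ->
    reduce (prim "1 < 2" ((1 :: Integer) < 2)) $ \x ->
      if x then produce True else produce y

-- Which primitive operations ran, in order.
trace :: Comp a -> [String]
trace (Comp (ops, _)) = ops

-- The value the computation produced.
result :: Comp a -> a
result (Comp (_, x)) = x
```

In this model, trace suspended is ["1 < 2"] while trace eager is ["1 == 2", "1 < 2"]; both produce True.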
2 Task 1
This is the core part of the assignment. You are to implement an interpreter for hindsight. The following expressions must be handled:
• variables. x, y, z
• integer constants. 1, 2, ..
• boolean constants. True, False
• some primitive arithmetic and boolean operations. +, *, <, <=, ..
• constructors for lists. Nil, Cons
• destructors for lists. head, tail
• inspectors for lists. null
• function application. f x
• if v then c1 else c2
• suspending computations. thunk c
• resuming suspended computations. force v
• let x :: τx = v; in c
• reduce c1 to x in c2
• produce v
• recfun f :: (τ1 → τ2) x = c expressions
The conceptual meaning of these expressions is explained in detail below, and their semantics are specified more precisely, in a big-step style, in Section 3. The abstract syntax defining these syntactic entities is in Hindsight/Syntax.hs, which inherits some definitions from MinHS/Syntax.hs. You should understand the Hindsight data types VExp, CExp, CBind and VBind well.
In the syntax above and elsewhere in this section, variables named v, v1 etc. represent value expressions, and variables named c, c1 etc. represent computation expressions. The types of the constructors of the VExp and CExp types also clarify this.
Your implementation is to follow the dynamic semantics described in this document. You are not to use substitution as the evaluation strategy, but must use an environment/heap semantics. If a runtime error occurs, which is possible, you should use Haskell’s error :: String → a function to emit a suitable error message (a program that calls error exits with a non-zero exit code, which is what will be checked for; the actual error message is not important).
2.1 Program structure
A program in hindsight may evaluate to either an integer, a list of integers, or a boolean, depending on the type assigned to the main function. The main function is always defined (this is checked by the implementation). You need only consider the case of a single top-level binding for main, for example:
main :: F Int = 1 + 2
2.2 Variables, Literals and Constants
hindsight is a spartan language. We have to consider the following six forms of types:
Int
Bool
[Int]
U ct
F vt
vt -> ct
The first four are value types, and the latter two are computation types. We use vt to denote value types and ct to denote computation types.
Note the Int type of MinHS and hindsight denotes an unbounded precision integer, which is the same as the Integer type in Haskell. This is different to the Int type of Haskell, which is either a 32-bit or 64-bit integer depending on the platform.
The only literals you will encounter are integers. The only non-literal constructors are True and False for the Bool type, and Nil and Cons for the [Int] type.
2.3 Function application
A function in hindsight accepts exactly one argument, which must be a value. The body of the function must be a computation. Inside the body of a recursive function f :: vt -> ct, any recursive references to f are considered suspended; that is, they are regarded as having type f :: U(vt -> ct).
The result of a function application may in turn be a function.
2.4 Primitive operations
You need to implement the following primitive operations:
+      :: Int -> Int -> F Int
-      :: Int -> Int -> F Int
*      :: Int -> Int -> F Int
/      :: Int -> Int -> F Int
%      :: Int -> Int -> F Int
negate :: Int -> F Int
>      :: Int -> Int -> F Bool
>=     :: Int -> Int -> F Bool
<      :: Int -> Int -> F Bool
<=     :: Int -> Int -> F Bool
==     :: Int -> Int -> F Bool
/=     :: Int -> Int -> F Bool
head   :: [Int] -> F Int
tail   :: [Int] -> F [Int]
null   :: [Int] -> F Bool
These operations are defined over Ints, [Int]s, and Bools, as usual. negate is the primop representation of the unary negation function, i.e. negate applied to 1 results in -1. The abstract syntax for primops is inherited from MinHS/Syntax.hs.
2.5 if-then-else
hindsight has an if v then c1 else c2 construct. The types of c1 and c2 are the same. The type of v is Bool.
2.6 let
For the first task you only need to handle simple let expressions of the kind we have discussed in the lectures, such as:
main :: F Int
  = let x :: Int = 3;
    in produce x
or
main :: F Int
  = let f :: U (Int -> F Int)
          = thunk (recfun f :: (Int -> F Int) x = x + x);
    in force f 3
For the base component of the assignment, you do not need to handle let bindings of more than one variable at a time (as is possible in Haskell). Remember, a let may bind a (suspended) recursive function defined with recfun.
2.7 force and thunk
thunk c is a value expression called a thunk or a suspended computation. A suspended computation value v can be evaluated later in the computation expression force v.
2.8 reduce
reduce c1 to x in c2 is a computation which first executes c1 until a value is produced. This value is then bound to the name x in the evaluation of c2. It is similar to let, but instead of binding a value expression to a name, it binds the value produced by a computation expression to a name.
2.9 recfun
The recfun expression introduces a new, named function computation. It has the form:
(recfun f :: (Int -> F Int) x = x + x)
Unlike in Haskell (and MinHS), a recfun is not a value, but a computation. It can be bound in let expressions, but only if suspended by thunk. The value ‘f’ is bound in the body of the function, so it is possible to write recursive functions:
recfun f :: (Int -> F Int) x =
  reduce x < 10
  to b in
    if b
      then
        reduce x + 1 to y in
          force f y
      else produce x
Note that inside the body of ‘f’, ‘f’ is considered suspended, hence force must be used to explicitly resume recursive calls.
Be very careful when implementing this construct, as there can be problems when using environments in a language allowing functions to be returned by functions.
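One possible way to handle recfun with an environment is sketched below. It is a minimal illustration under stated assumptions, not the required design: the mini-syntax (Less, Plus, Recfun and friends), the Value and Terminal constructors, and the association-list environment are all invented for this example and differ from Hindsight/Syntax.hs and MinHS/Env.hs. The function name f is bound to a thunk value that re-creates the closure when forced, which matches the rule that recursive references are suspended.

```haskell
-- Hypothetical mini-syntax, NOT the constructors from Hindsight/Syntax.hs.
data VExp = Var String | Num Integer | Thunk CExp
  deriving (Show, Eq)

data CExp
  = Produce VExp
  | Force VExp
  | Less VExp VExp              -- stands in for the < primop
  | Plus VExp VExp              -- stands in for the + primop
  | Reduce CExp String CExp     -- reduce c1 to x in c2
  | If VExp CExp CExp
  | Recfun String String CExp   -- recfun f x = body (types omitted)
  | App CExp VExp
  deriving (Show, Eq)

-- Assumed environment representation: newest binding first.
type Env = [(String, Value)]

data Value = VInt Integer | VBool Bool | VThunk Env CExp
  deriving (Show, Eq)

-- A closure remembers the environment in which the recfun was evaluated.
data Terminal = P Value | Closure Env String String CExp
  deriving (Show, Eq)

evalV :: Env -> VExp -> Value
evalV env (Var x) = case lookup x env of
  Just v  -> v
  Nothing -> error ("unbound variable: " ++ x)
evalV _   (Num n)   = VInt n
evalV env (Thunk c) = VThunk env c

evalC :: Env -> CExp -> Terminal
evalC env (Produce v) = P (evalV env v)
evalC env (Force v) = case evalV env v of
  VThunk saved c -> evalC saved c
  _              -> error "force: not a thunk"
evalC env (Less a b) = case (evalV env a, evalV env b) of
  (VInt x, VInt y) -> P (VBool (x < y))
  _                -> error "<: expected integers"
evalC env (Plus a b) = case (evalV env a, evalV env b) of
  (VInt x, VInt y) -> P (VInt (x + y))
  _                -> error "+: expected integers"
evalC env (Reduce c1 x c2) = case evalC env c1 of
  P val -> evalC ((x, val) : env) c2
  _     -> error "reduce: computation did not produce a value"
evalC env (If v c1 c2) = case evalV env v of
  VBool True  -> evalC env c1
  VBool False -> evalC env c2
  _           -> error "if: expected a boolean"
evalC env (Recfun f x body) = Closure env f x body
evalC env (App c v) = case evalC env c of
  Closure saved f x body ->
    -- f is bound to a suspended copy of the function itself.
    let self = VThunk saved (Recfun f x body)
    in evalC ((x, evalV env v) : (f, self) : saved) body
  _ -> error "application of a non-function"

-- The recursive example above: count up until x reaches 10.
countTo10 :: CExp
countTo10 = Recfun "f" "x"
  (Reduce (Less (Var "x") (Num 10)) "b"
    (If (Var "b")
      (Reduce (Plus (Var "x") (Num 1)) "y"
        (App (Force (Var "f")) (Var "y")))
      (Produce (Var "x"))))
```

With these definitions, evalC [] (App countTo10 (Num 0)) evaluates to P (VInt 10). Note that the function body runs in the closure's saved environment extended with the argument, not in the caller's environment.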
2.10 Evaluation strategy
We have seen in the tutorials how it is possible to evaluate expressions via substitution. This is an extremely inefficient way to run a program. In this assignment you are to use an environment instead. You will be penalised for an interpreter that operates via substitution.
The module MinHS/Env.hs provides a data type suitable for most uses. The lecture notes may give a guide on the use of environments in dynamic semantics. In general, you will need to use empty, lookup, add and addAll to begin with an empty environment, look up a variable in the environment, or add binding(s) to the environment, respectively.
As these functions clash with functions in the Prelude, a good idea is to import the module Env qualified:
import qualified Env
This makes the functions accessible as Env.empty and Env.lookup, to disambiguate from the Prelude versions.
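To illustrate how an environment behaves, here is a stand-in written from scratch. It is an assumption-laden sketch: the real MinHS/Env.hs may use different types and signatures, and the names emptyEnv, lookupEnv, addEnv, addAllEnv and lookupOrFail are invented here (suffixed to avoid the Prelude clash within a single file).

```haskell
-- Stand-in environment; the real MinHS/Env.hs may differ in types and names.
newtype Env v = Env [(String, v)]

emptyEnv :: Env v
emptyEnv = Env []

-- Prelude's lookup returns the first match, i.e. the newest binding.
lookupEnv :: Env v -> String -> Maybe v
lookupEnv (Env bs) x = lookup x bs

-- Newer bindings go at the front, so they shadow older ones.
addEnv :: Env v -> (String, v) -> Env v
addEnv (Env bs) b = Env (b : bs)

-- Add several bindings, left to right; later ones shadow earlier ones.
addAllEnv :: Env v -> [(String, v)] -> Env v
addAllEnv = foldl addEnv

-- Look up a variable, reporting a runtime error via error if it is unbound.
lookupOrFail :: Env v -> String -> v
lookupOrFail env x =
  case lookupEnv env x of
    Just v  -> v
    Nothing -> error ("unbound variable: " ++ x)
```

Because an inner binding shadows an outer one, looking up "x" after adding ("x", 1) and then ("x", 2) yields Just 2.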
3 Dynamic Semantics of hindsight
Big-step semantics
We define two mutually recursive judgements: a big-step semantics for value expressions, Γ ⊢ v ↓v V, and a big-step semantics for computation expressions, Γ ⊢ c ↓c T. The first relates an environment Γ, mapping variables to values, and a value expression v to the resultant value V of that expression. The second relates the same kind of environment Γ and a computation expression c to a terminal computation T. Our value set for V will, to start with, consist of:
• Machine integers
• Boolean values
• Lists of integers
Our terminal computations T consist of:
• P V, a computation that immediately produces the value V.
• function terminals, whose shape you must decide.
We will use t to range over terminal computations, and v to denote values. Note that v can also denote value expressions; it should be clear from context which one is intended.
We will also need to add closures or function terminals to our terminal computation set, to deal with the recfun construct in a sound way, and a constructor for thunk values to our value set to deal with thunk. There are some design decisions to be made here, and they’re up to you.
Environment
The environment Γ maps variables to values, and is used in place of substitution. It is specified as follows:
Γ ::= · | Γ, x = v
Values bound in the environment are closed: they contain no free variables. This requirement creates a problem for thunk values created with thunk whose bodies contain variables bound in an outer scope. We must bundle them with their associated environment. The same problem will also arise for computations created with recfun, and requires introducing closures. Care must also be taken to support suspended functions in thunk values.
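The following sketch shows why a thunk value has to carry its environment. Everything in it is hypothetical: the three-constructor mini-syntax, the Value type and the Mode switch exist only for this illustration. The Capturing mode evaluates a forced thunk in the environment saved at creation time; the Naive mode (the bug to avoid) uses whatever environment is current at the force site.

```haskell
-- Hypothetical mini-syntax, not the constructors from Hindsight/Syntax.hs.
data VExp = Var String | Num Integer | Thunk CExp
data CExp = Produce VExp | Force VExp | Let String VExp CExp

type Env = [(String, Value)]

-- A thunk value carries the environment that was current when it was created.
data Value = VInt Integer | VThunk Env CExp

-- Capturing is the intended behaviour; Naive is the bug to avoid.
data Mode = Capturing | Naive

evalV :: Env -> VExp -> Value
evalV env (Var x) = case lookup x env of
  Just v  -> v
  Nothing -> error ("unbound variable: " ++ x)
evalV _   (Num n)   = VInt n
evalV env (Thunk c) = VThunk env c

-- Returns the produced value directly (no function terminals in this sketch).
evalC :: Mode -> Env -> CExp -> Value
evalC _    env (Produce v) = evalV env v
evalC mode env (Let x v c) = evalC mode ((x, evalV env v) : env) c
evalC mode env (Force v) = case evalV env v of
  VThunk saved c -> case mode of
    Capturing -> evalC mode saved c   -- environment from thunk creation
    Naive     -> evalC mode env c     -- environment at the force site
  VInt _ -> error "force: not a thunk"

-- let x = 1; in let t = thunk (produce x); in let x = 2; in force t
shadowing :: CExp
shadowing =
  Let "x" (Num 1)
    (Let "t" (Thunk (Produce (Var "x")))
      (Let "x" (Num 2)
        (Force (Var "t"))))

run :: Mode -> CExp -> Integer
run mode c = case evalC mode [] c of
  VInt n     -> n
  VThunk _ _ -> error "program produced a thunk"
```

Here run Capturing shadowing gives 1, the value x had when the thunk was created, whereas run Naive shadowing gives 2 because the inner binding of x leaks into the thunk body.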
Constants and Boolean Constructors
Γ ⊢ Num n ↓v n
Γ ⊢ Con True ↓v True
Γ ⊢ Con False ↓v False
Primitive operations
Γ ⊢ v1 ↓v v1′    Γ ⊢ v2 ↓v v2′
--------------------------------
Γ ⊢ Add v1 v2 ↓c P(v1′ + v2′)
Similarly for the other arithmetic and comparison operations (as for the language of arithmetic expressions).
Note that division by zero should cause your interpreter to throw an error using Haskell’s error function.
The abstract syntax of the interpreter re-uses function application to represent application of primitive operations, so Add e1 e2 is actually represented as:
App (App (Prim Add) e1) e2
For this first part of the assignment, you may assume that primops are never partially applied: they are always fully supplied with arguments, so the term App (Prim Add) e1 will never occur in isolation.
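A fully applied binary primop can be recognised by pattern matching on the nested application. The sketch below assumes a simplified syntax invented for this example (the Op, VExp, CExp, Value and Terminal types here are not the provided ones, and the environment is omitted for brevity). It also assumes that / and % truncate towards zero, which this document does not specify.

```haskell
-- Hypothetical, simplified types; the provided syntax has more constructors.
data Op = Add | Sub | Mul | Quot | Rem | Lt | Eq
  deriving (Show, Eq)

newtype VExp = Num Integer
  deriving (Show, Eq)

data CExp = Prim Op | App CExp VExp
  deriving (Show, Eq)

data Value = I Integer | B Bool
  deriving (Show, Eq)

newtype Terminal = P Value
  deriving (Show, Eq)

evalV :: VExp -> Value
evalV (Num n) = I n

-- Only the fully applied shape App (App (Prim op) v1) v2 is handled here.
evalC :: CExp -> Terminal
evalC (App (App (Prim op) v1) v2) = applyOp op (evalV v1) (evalV v2)
evalC _ = error "unsupported or partially applied primop"

applyOp :: Op -> Value -> Value -> Terminal
applyOp Add  (I x) (I y) = P (I (x + y))
applyOp Sub  (I x) (I y) = P (I (x - y))
applyOp Mul  (I x) (I y) = P (I (x * y))
-- Division by zero is reported with error, as required.
applyOp Quot (I _) (I 0) = error "division by zero"
applyOp Quot (I x) (I y) = P (I (x `quot` y))   -- assumed: truncating division
applyOp Rem  (I _) (I 0) = error "division by zero"
applyOp Rem  (I x) (I y) = P (I (x `rem` y))    -- assumed: sign follows dividend
applyOp Lt   (I x) (I y) = P (B (x < y))
applyOp Eq   (I x) (I y) = P (B (x == y))
applyOp op   _     _     = error ("ill-typed operands for " ++ show op)
```

For example, evalC (App (App (Prim Add) (Num 1)) (Num 2)) yields P (I 3).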
Evaluation of if-expression
Γ ⊢ v ↓v True    Γ ⊢ c1 ↓c t
------------------------------
Γ ⊢ If v c1 c2 ↓c t

Γ ⊢ v ↓v False    Γ ⊢ c2 ↓c t
-------------------------------
Γ ⊢ If v c1 c2 ↓c t