Statements

Statements are executable lines or blocks of code, which appear inside procedures. The Mars execution model is, when evaluating a procedures, to execute each statement, in sequence.

The body of a procedure is comprised of a sequence of statements. Some statement forms (such as while) contain a nested sequence of statements.

Statements are divided into two categories: basic statements (simple single-line statements) and compound statements (statements which contain nested statements or affect control flow).

statement ::=  basic_statement | compound_statement

Basic statements

A basic statement is a single line statement, which doesn’t alter control flow.

basic_statement ::=  expression_stmt
                     | assignment_stmt
                     | field_update_stmt
                     | pass_stmt

Expression statements

An expression statement consists of a single expression:

expression_stmt ::=  expression NEWLINE

Executing this statement simply evaluates the expression, and throws away its result. The only effect of this statement are the side-effects of the expression’s evaluation. (Typically the expression is a function call, with side-effects).

Assignment statements

An assignment statement is used to bind or re-bind a local variable or variables to a new value:

assignment_stmt ::=  pattern "=" expression NEWLINE

Executing this statement evaluates the expression, and matches the expression’s value against the pattern using the pattern matching algorithm defined below. This has the effect of binding any variables mentioned in the pattern. Any side-effects of the expression’s evaluation take place during the execution of the statement.

The expression’s type must equal the pattern’s type (which includes matching the types of any variables that have already been declared or bound), or the compiler MUST reject the program.

It must be statically provable that all possible values are handled by the pattern, or the compiler MUST reject the program. This means that it is not legal to use number literal patterns, and matching a user-defined data type is only allowed if the type has a single constructor.

An assignment statement of the form pattern = expression is equivalent to the switch statement:

switch expression:
    case pattern:
        pass

Examples of assignment statements follow. A simple assignment simply assigns the value of an expression to a local variable:

x = 7

An assignment can deconstruct a value if the type has only one constructor. This is commonly used to access the values of a tuple:

p = Pair(1, 2)
Pair(fst, snd) = p

The deconstruction can include recursive patterns and underscores can be used to avoid binding or matching:

q = Pair(Pair(1, 2), Pair(3, 4))
Pair(_, Pair(x, y)) = q

Lastly, a non-binding assignment is equivalent to a simple expression statement:

_ = print_string("Hello\n")

Note

Deconstructing assignment statements are currently not supported in the interactive prompt. Only simple (variable or non-binding) assignment statements may be executed in that environment.

Field update statements

A field update statement is used to destructively modify a named field of a user-defined type:

field_update_stmt ::=  primary "." lower_identifier "=!" expression
                           NEWLINE

Warning

Mars is, fundamentally, a pure language. This means that no statement can modify an existing value or global state, only assign local variables (and perform input/output, if the statement appears in an I/O context).

The field update statement breaks that purity by modifying a given value and all of its aliases. It should not be considered part of the language semantics, and is not intended to be used by most code (think of it as an “unofficial feature” – the “!” in the syntax is designed as a warning to this effect). It is provided to increase performance in certain situations. Use at your own risk.

See Field replace expressions for a pure way to update fields.

A field update statement of the form obj.field =! e modifies obj and all values aliased to it, replacing field field with the value of e. The same compile-time and runtime rules as field reference expressions apply. The expression e MUST have the same type as the type of the parameter named field in any constructor of the type of obj.

Executing this statement causes obj to be evaluated first. If the constructor used to construct obj has a parameter named field, then that paremeter is mutated to have the value of e. If the constructor used to construct obj does not have a parameter named field, then a runtime error occurs, terminating the program.

Note that obj is evaluated before e. If obj does not have an appropriate constructor, it is unspecified whether e is evaluated or not.

For example:

?> x = Cons(1, Nil)
?> y = x
?> x.head =! 2
?> x.tail =! Cons(2, Nil)
?> x
Cons(2, Cons(2, Nil))
?> y
Cons(2, Cons(2, Nil))
?> Nil.head =! 1
Runtime Error: List instance has no field 'head'

The pass statement

The pass statement does nothing:

pass_stmt ::=  "pass" NEWLINE

It is useful in the body of a procedure or compound statment which does nothing, for syntactic reasons. (At least one statement is required, so to have a body which does nothing, a pass statement is used as a dummy no-op statement).

Compound statements

Compound statements are statements with nested statement blocks, or which affect control flow in some way:

compound_statement ::=  return_stmt
                        | if_stmt
                        | while_stmt
                        | switch_stmt

The return statement

The return statement terminates the current procedure, returning a computed value:

return_stmt ::=  "return" expression NEWLINE

The expression is evaluated, and any side-effects of evaluating the expression take place during the execution of the return statement. The current procedure then terminates. The result of the entire evaluation of the procedure is the value of the expression.

The expression’s type must equal the procedure’s declared return type, or the compiler MUST reject the program.

Note

return is not a basic statement, because it alters control flow. This means, for example, it doesn’t make sense for it to be used at the command-line.

The if statement

The if statement conditionally executes a sequence of statements:

if_stmt ::=  "if" expression ":" NEWLINE
                 INDENT statement+ DEDENT
             ( "elif" expression ":" NEWLINE
                 INDENT statement+ DEDENT )*
             [ "else" ":" NEWLINE
                 INDENT statement+ DEDENT ]

The expression on the first line is called the “condition”. The condition’s type must be Num, or the compiler MUST reject the program. The first nested block of statements is called the “then” clause. If one or more elif parts are supplied, then their conditions are called the “elif” conditions, and their corresponding nested blocks of statements are called the “elif” clauses. If the optional else part is supplied, then the final nested block of statements is called the “else” clause.

The condition is evaluated, and any side-effects take place immediately. If the value is not 0, the “then” clause is executed. If the value is 0, then the conditions of the “elif” clauses are tested in the same way, and the clause of the first successful “elif” condition is executed. If no “elif” condition succeeds, the “else” clause, if supplied, is executed.

The elif part is syntactic sugar for nested if and else clauses. For example, the code:

if a:
    return w
elif b:
    return x
elif c:
    return y
else:
    return z

is syntactic sugar for:

if a:
    return w
else:
    if b:
        return x
    else:
        if c:
            return y
        else:
            return z

The while statement

The while statement repeatedly executes a sequence of statements as long as a condition is true:

while_stmt ::=  "while" expression ":" NEWLINE
                    INDENT statement+ DEDENT

The expression on the first line is called the “condition”. The condition’s type must be Num, or the compiler MUST reject the program.

On each iteration, the condition is evaluated, and any side-effects take place immediately. If the value is 0, execution of the while loop halts, and execution continues with the statement following it. If the value is not 0, the nested block of statements is executed, and then the while statement is executed again.

The switch statement

The switch statement is used to select from a number of alternatives, and also unpack values of types with several alternatives:

switch_stmt ::=  "switch" expression ":" NEWLINE
                     INDENT case+ DEDENT
case        ::=  "case" pattern ":" NEWLINE
                     INDENT statement+ DEDENT

The expression is first evaluated, and any side-effects take place immediately.

The expression’s value is then matched against each case’s pattern (in the order the cases appear) using the pattern matching algorithm defined below, until a match succeeds. After any variable binding (which may be caused by a successful match), the statements inside the successful case are executed.

It must be statically provable that all possible values are handled by at least one case statement, or the compiler MUST reject the program. This can be proven either by having a wildcard pattern (underscore or variable), or by matching all alternatives of a user-defined data type.

Note that the effects of variable binding outlast the switch statement. Their binding is like an assignment statement.

Pattern matching

Pattern matching is an algorithm for matching a value with a pattern. It is used on the left-hand side of assignment statements and cases inside switch statements.

pattern      ::=  "_"
                  | variable
                  | [ "-" ] num_literal | char_literal
                  | constructor [ "(" [ pattern_list ] ")" ]
pattern_list ::=  pattern ("," pattern)*

The result is either failure or success. Failure means the case statement in question is rejected, and another one is tried. Success may be accompanied by variable binding; that is, one or more variables may be defined and initialised, or assigned a new value if they already exist.

Pattern matching is defined by the following rules:

  • If a pattern is _, the match succeeds unconditionally, with no variables bound.
  • If a pattern is a variable, x, then the match succeeds unconditionally. The variable x is bound or re-bound to the value being matched. If the variable has been declared or already bound, it must have the same type as the value, or the compiler MUST reject the program. If variable x has not been declared, then this binding has the effect of declaring the variable, with the same type as the value. This declaration has the scope of the procedure. Subsequent references to the variable name, even after this switch statement, must honour this declaration and its type.
  • If a pattern is an num_literal_expr (or negative number literal), then the value must have type Num, or the compiler MUST reject the program. If the value is equal to the given number or character literal, then the match succeeds, with no variables bound. Otherwise, it fails.
  • If a pattern is a constructor name, optionally with parameter patterns, then the value must have the same type as the typedef containing the constructor, and the pattern must have the same arity as the constructor, as defined there [1]. If the value was constructed using the named constructor, all of the value’s parameter values are matched against the pattern’s parameter patterns. If all sub-matches succeed, then the match succeeds, with all variables bound in the sub-matches being bound. If any sub-match fails, or the value was constructed with a different constructor, the match fails.

Note

The pattern matching algorithm is similar to a specialised version of the unification algorithm in mathematical logic. For a general description of this algorithm, see Unification on Wikipedia.

[1]For a pattern to have the same arity as a constructor, it must match the constructor declaration exactly. Parameter-less constructors must have parameter-less patterns, and nullary constructors (with empty parentheses) must have nullary patterns. This is due to the very different semantics of parameter-less and nullary functions. See Procedure header.