.. Mars Documentation
Copyright (C) 2008 Matt Giuca
11/12/2008
.. This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
.. This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
.. You should have received a copy of the GNU General Public License
along with this program. If not, see .
.. _ref-exprs:
***********
Expressions
***********
.. sectionauthor:: Matt Giuca
Expressions are a description of the computation of a value, given some
environment. Expressions appear inside source code, and are evaluated at
runtime. The environment in which an expression is evaluated is the
environment of its surrounding scope, which is computed at runtime.
This section outlines the syntax and semantics of expressions in Mars.
Atoms
=====
Atoms are the most basic expressions. They are comprised of single tokens, and
do not recurse.
.. productionlist::
atom: `num_literal_expr` | `variable` | `constructor`
.. _ref-numlit-expr:
Number literals
---------------
Number literal expressions are usually just formed from number literal tokens.
However, a character literal token also forms a number literal expression:
.. productionlist::
num_literal_expr: `num_literal` | `char_literal`
Because Mars has no "character" data type, a :token:`char_literal` is simply
interpreted as a number, with the byte value of the character. Thus, ``'a'``
is equivalent to ``97``, and ``'\x8c'`` to ``140``.
The type of a number literal expression is always :type:`Num`. It evaluates to
the constant value of the number literal (rounded towards the nearest
representable number).
.. _ref-variable-references:
Variable references
-------------------
An identifier refers to a variable or constructor.
.. productionlist::
variable: `lower_identifier`
constructor: `upper_identifier`
Statically, the variable must exist in the expression's scope. If the
identifier matches the name of a local variable or argument of the current
procedure, then it refers to that variable, regardless of whether there is a
global variable of the same name (hence, local variables shadow globals), and
the expression has the type of that variable.
If the identifier matches the name of a global variable or constructor, then
it refers to that variable, and the expression has the type of that global.
If the identifier does not match a local or global variable, it is a compiler
error, and the program MUST be rejected.
The dynamic semantics of a variable reference expression are to evaluate to
the current value of the variable, at the point at which the expression is
evaluated. Since Mars is a strict language, that point is the exact point in
the code at which the expression appears, so even though local variables may
change value, the evaluation of such expressions is quite deterministic.
.. note::
A constructor reference is just a special type of variable
reference. Constructors are variables too, only syntactically distinguished
by having an uppercase letter. They are semantically equivalent.
Complex expressions
===================
We now present the grammar and semantics for all of the complex (non-atom)
expressions in Mars, in order of precedence from highest to lowest. We begin
with "primary expressions" -- expressions that never need to be surrounded in
parentheses because they have such high precedence. Parenthesised expressions
*are* primary because they may be freely placed wherever an expression is
expected.
Note that we do not yet give a production for :token:`expression`: as that
rule encompasses all possible expressions, it is given last (as the production
with the lowest precedence).
.. productionlist::
primary: `atom`
: | `array_literal_expr`
: | `apply_expr`
: | `field_reference`
: | "(" `expression` ")"
expression_list: `expression` ("," `expression`)*
Array literals
--------------
An array literal expression allows an array value to be created and
initialised with a single expression. The value of an array literal does not
need to be constant, as it may contain nested expressions which are evaluated
at runtime.
Just as a character literal forms a numeric expression, so does a string
literal form an array of numbers. Hence, a string literal is just special
syntax for an array literal.
.. productionlist::
array_literal_expr: `array_literal` | `string_literal`
array_literal: "[" [`expression_list`] "]"
An array literal consists of a list of zero or more sub-expressions. The
sub-expressions may be of any type, but the types of all sub-expressions must
unify, or the compiler MUST reject the program (with a type error). The type
of the array literal expression is :type:`Array(a)`, where :type:`a` is the
unified type of all of the sub-expressions.
The dynamic semantics of an array literal expression are to construct an array
value with as many elements as there are sub-expressions. The value of each
element is the evaluation of its corresponding sub-expression.
A string literal consists of zero or more byte values. The type of a string
literal expression is :type:`Array(Num)`. The evaluation of a string literal
is an array value, with one :type:`Num` element per character, with each
element evaluating to the corresponding character's byte value. Thus,
``"hello"`` is equivalent to ``['h', 'e', 'l', 'l', 'o']``, which is
equivalent to ``[104, 101, 108, 108, 111]``.
.. _ref-function-application-expr:
Function application
--------------------
Function application expressions allow arguments to be passed to functions,
and functions to be evaluated:
.. productionlist::
apply_expr: `primary` "(" [`expression_list`] ["," "..."] ")"
Primarily, function application has the form ``f(v1, v2, ..., vn)``. This is
known as **full function application** (due to the lack of a trailing
":token:`...`" token). ``f`` must have type :type:`(t1, t2, ..., tn) -> [io]
t`,
or the expression is a type error, and the compiler MUST reject the program.
Each value ``vi`` must have type :type:`ti`, where 1 <= *i* <= *n*. This
expression has type :type:`t`.
The evaluation of this expression first causes ``f`` and each ``vi`` to be
evaluated. This expression evaluates to the result of the full application of
function ``f`` to all actual parameters ``v1`` ... ``vn`` (see :ref:`procedure
evaluation `).
If ``f`` has type :type:`(t1, t2, ..., tn) -> io t` (that is, ``f`` is an I/O
function), an additional static constraint applies: if the function
application expression is not :ref:`in an I/O context `, the
compiler MUST reject the program. In other words, it is illegal to call an I/O
function from a non-I/O function.
If an expression performs multiple I/O effects, it may have unspecified
behaviour (see :ref:`ref-evaluation-order`).
An extended form ``f(v1, v2, ..., vm, "...")`` is known as **partial function
application** (it has a trailing ":token:`...`" token). ``f`` must have type
:type:`(t1, t2, ..., tn) -> t`, where *m* <= *n*, or the expression is a type
error, and the compiler MUST reject the program.
Each value ``vi`` must have type :type:`ti`, where 1 <= *i* <= *m*.
This expression has type :type:`(tm+1, tm+2, ..., tn) -> t`.
The evaluation of this expression first causes ``f`` and each ``vi`` to be
evaluated. This expression evaluates to the result of the partial application
of function ``f`` to all actual parameters ``v1`` ... ``vm`` (see
:ref:`procedure evaluation `).
.. note::
Partial function application is not subject to the I/O context constraint:
it is legal to partially-apply an :keyword:`io` function from a non-I/O
context, as long as the resulting function is not applied within a non-I/O
context.
.. _ref-field-reference-expr:
Field references
----------------
A field reference retrieves the value of a named field of a user-defined type:
.. productionlist::
field_reference: `primary` "." `lower_identifier`
Recall that :ref:`user-defined types ` consist of one
or more constructors, each with zero or more parameters, each with an optional
name. The same name may appear in multiple constructors, but must have the
same type in each.
For a field reference of the form ``obj.field``, *obj* MUST have a
user-defined type, and MUST have at least one constructor with a parameter
named *field*, or the compiler must reject the program. The type of the field
reference is the type of the parameter named *field* in any constructor of the
type of *obj*.
The evaluation of the expression causes *obj* to be evaluated first. If the
constructor used to construct *obj* has a parameter named *field*, then the
value of the field reference is the value of the parameter *field*. If the
constructor used to construct *obj* does not have a parameter named *field*,
then a runtime error occurs, terminating the program.
.. highlight:: marscon
For example, the :type:`List` type in the :mod:`prelude` has two
constructors: `Cons` with parameters :attr:`head` and :attr:`tail`, and `Nil`,
with no parameters. Hence::
?> x = Cons(1, Nil)
?> x.head
1
?> x.tail
Nil
?> Nil.head
Runtime Error: List instance has no field 'head'
.. highlight:: mars
Due to the possibility of runtime errors, it is recommended that field
references be used only on types with a single constructor, or fields which
appear in all constructors of a type. For other types, the more powerful
:keyword:`switch` statement should be used. The :keyword:`switch` statement
can also be used to access fields without names.
Unary negation operator
-----------------------
The unary negation operator negates its numeric argument; it is equivalent to
:func:`neg` (but note that use of the ``-`` operator does not require
importing the :mod:`prelude`).
.. productionlist::
unary_expr: `primary`
: | "-" `unary_expr`
This is the tightest-binding arithmetic operator, so a negated complex
expression must be surrounded by parentheses (``-x+y`` negates only *x*;
``-(x+y)`` negates the sum of *x* and *y*).
Binary arithmetic operators
---------------------------
The binary arithmetic operators perform the usual mathematical operations,
with the usual precedence and associativity.
.. productionlist::
multiplicative_expr: `unary_expr`
: | `multiplicative_expr` "*" `unary_expr`
: | `multiplicative_expr` "/" `unary_expr`
: | `multiplicative_expr` "//" `unary_expr`
: | `multiplicative_expr` "%" `unary_expr`
additive_expr: `multiplicative_expr`
: | `additive_expr` "+" `multiplicative_expr`
: | `additive_expr` "-" `multiplicative_expr`
The ``*``, ``/``, ``//``, ``%``, ``+`` and ``-`` operators are equivalent to
:func:`mul`, :func:`fdiv`, :func:`div`, :func:`mod`, :func:`add` and
:func:`sub`, respectively. The ``*``, ``/``, ``//`` and ``%`` operators all
have the same precedence, and bind more tightly than the ``+`` and ``-``
operators. The ``+`` and ``-`` operators have the same precedence as each
other.
Note that there are two division operators: ``/``, which may return a
fractional result, and ``//``, which always returns an integer (rounding
towards negative infinity). The ``//`` operator should be used when dealing
with integer arithmetic.
All of the operators are left-associative (so, for example, ``2 - 3 - 4`` is
equivalent to ``(2 - 3) - 4``).
Comparison operators
--------------------
The comparison operators compare two operands and return a truth value (0 for
false and 1 for true). They bind less tightly than the arithmetic operators.
.. productionlist::
relational_expr: `additive_expr`
: | `additive_expr` "<" `additive_expr`
: | `additive_expr` "<=" `additive_expr`
: | `additive_expr` ">" `additive_expr`
: | `additive_expr` ">=" `additive_expr`
equality_expr: `relational_expr`
: | `relational_expr` "==" `relational_expr`
: | `relational_expr` "!=" `relational_expr`
The relational operators (``<``, ``<=``, ``>`` and ``>=``) operate on numeric
arguments and are equivalent to :func:`lt`, :func:`le`, :func:`gt` and
:func:`ge`, respectively. They bind more tightly than the equality operators.
The equality operators (``==`` and ``!=``) operate on values of any type and
are equivalent to :func:`eq` and :func:`ne`, respectively.
.. note::
The logic of ``==`` and ``!=`` for non-numeric values is complicated;
please see the documentation for :func:`eq`.
.. note::
All of these operators are non-associative; it is an error to mix
relational operators or equality operators within the same expression
level, although it is permitted to mix relational operators with equality
operators.
For example, the expression ``x < y < z`` is invalid, as is ``x == y ==
z``. If you really want to say "the truth value of *x* < *y* is less than
*z*," use parentheses: ``(x < y) < z``. However, it is permitted to mix
equality and relational operators: ``a < b == x > y`` means "the truth
value of *a* < *b* is the same as the truth value of *x* > *y*."
.. _ref-field-replace-expr:
Field replace expressions
-------------------------
A field replace expression is used to non-destructively modify a named field
of a user-defined type:
.. productionlist::
expression: `equality_expr`
: | `primary` "." `lower_identifier` ":=" `expression`
.. note::
The ``:=`` operator is right-associative, and has lower precedence than
field references, function application or binary operators. Hence, ``x.foo
:= y.bar := z`` is equivalent to ``x.foo := (y.bar := z)``.
Updating multiple fields simultaneously therefore requires parentheses, for
example, ``(x.foo := y).bar := z``.
A field replace expression of the form ``obj.field := e`` produces a new value
which is the same as *obj* but with field *field* replaced with the value of
*e*. The original *obj* is not modified. The same compile-time and runtime
rules as :ref:`field reference ` expressions apply.
The expression *e* MUST have the same type as the type of the parameter named
*field* in any constructor of the type of *obj*.
The type of the field replace expression is the type of *obj*. The evaluation
of the expression causes *obj* to be evaluated first. If the constructor used
to construct *obj* has a parameter named *field*, then the value of the field
replace is a value with the same constructor as *obj*, and all parameters the
same as *obj*, except the parameter named *field* having the value of *e*. If
the constructor used to construct *obj* does not have a parameter named
*field*, then a runtime error occurs, terminating the program.
Note that *obj* is evaluated before *e*. If *obj* does not have an appropriate
constructor, it is unspecified whether *e* is evaluated or not.
.. highlight:: marscon
For example::
?> x = Cons(1, Nil)
?> x.head := 2
Cons(2, Nil)
?> x.tail := Cons(2, Nil)
Cons(1, Cons(2, Nil))
?> x
Cons(1, Nil)
?> Nil.head := 1
Runtime Error: List instance has no field 'head'
As with any other expression, field replace expressions do not update the
variable *obj*. Therefore, it is common to perform an "update" as follows::
?> x = x.head := 2
?> x
Cons(2, Nil)
.. highlight:: mars
.. note::
Unlike conventional languages, this will not modify any aliases of `x` (it
is literally an assignment of a new value to the variable named `x`). This
can help avoid confusing bugs, and make your software more reliable overall
(indeed, it is the reason behind Mars being a declarative language), but it
can be confusing if you are used to conventional languages.
See also: :ref:`ref-field-update-statements`.
Precedence table
================
The precedence and associativity of all of the Mars expressions is summarised
in the following table. Higher-numbered precedence levels indicate tighter
binding.
+------------+----------------+------------------------------+---------------+
| Precedence | Type | Operator(s) | Associativity |
+============+================+==============================+===============+
| 6 | Primary | primary expressions | N/A |
+------------+----------------+------------------------------+---------------+
| 5 | Unary | unary ``-`` | Right |
+------------+----------------+------------------------------+---------------+
| 4 | Multiplicative | ``*``, ``/``, ``//``, ``%`` | Left |
+------------+----------------+------------------------------+---------------+
| 3 | Additive | ``+``, ``-`` | Left |
+------------+----------------+------------------------------+---------------+
| 2 | Relational | ``<``, ``<=``, ``>``, ``>=`` | None |
+------------+----------------+------------------------------+---------------+
| 1 | Equality | ``==``, ``!=`` | None |
+------------+----------------+------------------------------+---------------+
| 0 | Field replace | ``:=`` | Right |
+------------+----------------+------------------------------+---------------+
.. _ref-evaluation-order:
Evaluation order
================
Mars imposes a strict evaluation order between statements: for a statement
sequence *s1*; *s2*, all computation involved in *s1* will be completed before
the execution of *s2* begins. However, within an expression, the evaluation
order of sub-expressions is unspecified.
This is usually only relevant in an I/O context. For example, the following
I/O code fragment has unspecified behaviour::
put_char('x') + put_char('y') # Unspecified evaluation order
However, note that Mars has several "hidden" features which may perform
side-effects in pure code (such as :ref:`field update statements
`, :mod:`impure` functions and :mod:`debug`
functions) -- all of these features operate outside of the Mars semantics and
should not be used in normal code. Such features can be affected by the
unspecified evaluation order.
Pure code can also have unspecified termination behaviour. Consider the
following pure function::
def weird() :: Num:
return error("A") + error("B") + weird() # Unspecified termination
Here it is unspecified whether the program will a) print the error message
"A", b) print the error message "B", or c) loop forever or until the program
runs out of stack space. However, in any case, the program will not continue
executing (in the semantics, all of these cases have the denotation ⊥,
pronounced "bottom"), so they can be considered equivalent.