.. Mars Documentation Copyright (C) 2008 Matt Giuca 11/12/2008 .. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. .. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. .. You should have received a copy of the GNU General Public License along with this program. If not, see . .. _ref-exprs: *********** Expressions *********** .. sectionauthor:: Matt Giuca Expressions are a description of the computation of a value, given some environment. Expressions appear inside source code, and are evaluated at runtime. The environment in which an expression is evaluated is the environment of its surrounding scope, which is computed at runtime. This section outlines the syntax and semantics of expressions in Mars. Atoms ===== Atoms are the most basic expressions. They are comprised of single tokens, and do not recurse. .. productionlist:: atom: `num_literal_expr` | `variable` | `constructor` .. _ref-numlit-expr: Number literals --------------- Number literal expressions are usually just formed from number literal tokens. However, a character literal token also forms a number literal expression: .. productionlist:: num_literal_expr: `num_literal` | `char_literal` Because Mars has no "character" data type, a :token:`char_literal` is simply interpreted as a number, with the byte value of the character. Thus, ``'a'`` is equivalent to ``97``, and ``'\x8c'`` to ``140``. The type of a number literal expression is always :type:`Num`. It evaluates to the constant value of the number literal (rounded towards the nearest representable number). .. _ref-variable-references: Variable references ------------------- An identifier refers to a variable or constructor. .. productionlist:: variable: `lower_identifier` constructor: `upper_identifier` Statically, the variable must exist in the expression's scope. If the identifier matches the name of a local variable or argument of the current procedure, then it refers to that variable, regardless of whether there is a global variable of the same name (hence, local variables shadow globals), and the expression has the type of that variable. If the identifier matches the name of a global variable or constructor, then it refers to that variable, and the expression has the type of that global. If the identifier does not match a local or global variable, it is a compiler error, and the program MUST be rejected. The dynamic semantics of a variable reference expression are to evaluate to the current value of the variable, at the point at which the expression is evaluated. Since Mars is a strict language, that point is the exact point in the code at which the expression appears, so even though local variables may change value, the evaluation of such expressions is quite deterministic. .. note:: A constructor reference is just a special type of variable reference. Constructors are variables too, only syntactically distinguished by having an uppercase letter. They are semantically equivalent. Complex expressions =================== We now present the grammar and semantics for all of the complex (non-atom) expressions in Mars, in order of precedence from highest to lowest. We begin with "primary expressions" -- expressions that never need to be surrounded in parentheses because they have such high precedence. Parenthesised expressions *are* primary because they may be freely placed wherever an expression is expected. Note that we do not yet give a production for :token:`expression`: as that rule encompasses all possible expressions, it is given last (as the production with the lowest precedence). .. productionlist:: primary: `atom` : | `array_literal_expr` : | `apply_expr` : | `field_reference` : | "(" `expression` ")" expression_list: `expression` ("," `expression`)* Array literals -------------- An array literal expression allows an array value to be created and initialised with a single expression. The value of an array literal does not need to be constant, as it may contain nested expressions which are evaluated at runtime. Just as a character literal forms a numeric expression, so does a string literal form an array of numbers. Hence, a string literal is just special syntax for an array literal. .. productionlist:: array_literal_expr: `array_literal` | `string_literal` array_literal: "[" [`expression_list`] "]" An array literal consists of a list of zero or more sub-expressions. The sub-expressions may be of any type, but the types of all sub-expressions must unify, or the compiler MUST reject the program (with a type error). The type of the array literal expression is :type:`Array(a)`, where :type:`a` is the unified type of all of the sub-expressions. The dynamic semantics of an array literal expression are to construct an array value with as many elements as there are sub-expressions. The value of each element is the evaluation of its corresponding sub-expression. A string literal consists of zero or more byte values. The type of a string literal expression is :type:`Array(Num)`. The evaluation of a string literal is an array value, with one :type:`Num` element per character, with each element evaluating to the corresponding character's byte value. Thus, ``"hello"`` is equivalent to ``['h', 'e', 'l', 'l', 'o']``, which is equivalent to ``[104, 101, 108, 108, 111]``. .. _ref-function-application-expr: Function application -------------------- Function application expressions allow arguments to be passed to functions, and functions to be evaluated: .. productionlist:: apply_expr: `primary` "(" [`expression_list`] ["," "..."] ")" Primarily, function application has the form ``f(v1, v2, ..., vn)``. This is known as **full function application** (due to the lack of a trailing ":token:`...`" token). ``f`` must have type :type:`(t1, t2, ..., tn) -> [io] t`, or the expression is a type error, and the compiler MUST reject the program. Each value ``vi`` must have type :type:`ti`, where 1 <= *i* <= *n*. This expression has type :type:`t`. The evaluation of this expression first causes ``f`` and each ``vi`` to be evaluated. This expression evaluates to the result of the full application of function ``f`` to all actual parameters ``v1`` ... ``vn`` (see :ref:`procedure evaluation `). If ``f`` has type :type:`(t1, t2, ..., tn) -> io t` (that is, ``f`` is an I/O function), an additional static constraint applies: if the function application expression is not :ref:`in an I/O context `, the compiler MUST reject the program. In other words, it is illegal to call an I/O function from a non-I/O function. If an expression performs multiple I/O effects, it may have unspecified behaviour (see :ref:`ref-evaluation-order`). An extended form ``f(v1, v2, ..., vm, "...")`` is known as **partial function application** (it has a trailing ":token:`...`" token). ``f`` must have type :type:`(t1, t2, ..., tn) -> t`, where *m* <= *n*, or the expression is a type error, and the compiler MUST reject the program. Each value ``vi`` must have type :type:`ti`, where 1 <= *i* <= *m*. This expression has type :type:`(tm+1, tm+2, ..., tn) -> t`. The evaluation of this expression first causes ``f`` and each ``vi`` to be evaluated. This expression evaluates to the result of the partial application of function ``f`` to all actual parameters ``v1`` ... ``vm`` (see :ref:`procedure evaluation `). .. note:: Partial function application is not subject to the I/O context constraint: it is legal to partially-apply an :keyword:`io` function from a non-I/O context, as long as the resulting function is not applied within a non-I/O context. .. _ref-field-reference-expr: Field references ---------------- A field reference retrieves the value of a named field of a user-defined type: .. productionlist:: field_reference: `primary` "." `lower_identifier` Recall that :ref:`user-defined types ` consist of one or more constructors, each with zero or more parameters, each with an optional name. The same name may appear in multiple constructors, but must have the same type in each. For a field reference of the form ``obj.field``, *obj* MUST have a user-defined type, and MUST have at least one constructor with a parameter named *field*, or the compiler must reject the program. The type of the field reference is the type of the parameter named *field* in any constructor of the type of *obj*. The evaluation of the expression causes *obj* to be evaluated first. If the constructor used to construct *obj* has a parameter named *field*, then the value of the field reference is the value of the parameter *field*. If the constructor used to construct *obj* does not have a parameter named *field*, then a runtime error occurs, terminating the program. .. highlight:: marscon For example, the :type:`List` type in the :mod:`prelude` has two constructors: `Cons` with parameters :attr:`head` and :attr:`tail`, and `Nil`, with no parameters. Hence:: ?> x = Cons(1, Nil) ?> x.head 1 ?> x.tail Nil ?> Nil.head Runtime Error: List instance has no field 'head' .. highlight:: mars Due to the possibility of runtime errors, it is recommended that field references be used only on types with a single constructor, or fields which appear in all constructors of a type. For other types, the more powerful :keyword:`switch` statement should be used. The :keyword:`switch` statement can also be used to access fields without names. Unary negation operator ----------------------- The unary negation operator negates its numeric argument; it is equivalent to :func:`neg` (but note that use of the ``-`` operator does not require importing the :mod:`prelude`). .. productionlist:: unary_expr: `primary` : | "-" `unary_expr` This is the tightest-binding arithmetic operator, so a negated complex expression must be surrounded by parentheses (``-x+y`` negates only *x*; ``-(x+y)`` negates the sum of *x* and *y*). Binary arithmetic operators --------------------------- The binary arithmetic operators perform the usual mathematical operations, with the usual precedence and associativity. .. productionlist:: multiplicative_expr: `unary_expr` : | `multiplicative_expr` "*" `unary_expr` : | `multiplicative_expr` "/" `unary_expr` : | `multiplicative_expr` "//" `unary_expr` : | `multiplicative_expr` "%" `unary_expr` additive_expr: `multiplicative_expr` : | `additive_expr` "+" `multiplicative_expr` : | `additive_expr` "-" `multiplicative_expr` The ``*``, ``/``, ``//``, ``%``, ``+`` and ``-`` operators are equivalent to :func:`mul`, :func:`fdiv`, :func:`div`, :func:`mod`, :func:`add` and :func:`sub`, respectively. The ``*``, ``/``, ``//`` and ``%`` operators all have the same precedence, and bind more tightly than the ``+`` and ``-`` operators. The ``+`` and ``-`` operators have the same precedence as each other. Note that there are two division operators: ``/``, which may return a fractional result, and ``//``, which always returns an integer (rounding towards negative infinity). The ``//`` operator should be used when dealing with integer arithmetic. All of the operators are left-associative (so, for example, ``2 - 3 - 4`` is equivalent to ``(2 - 3) - 4``). Comparison operators -------------------- The comparison operators compare two operands and return a truth value (0 for false and 1 for true). They bind less tightly than the arithmetic operators. .. productionlist:: relational_expr: `additive_expr` : | `additive_expr` "<" `additive_expr` : | `additive_expr` "<=" `additive_expr` : | `additive_expr` ">" `additive_expr` : | `additive_expr` ">=" `additive_expr` equality_expr: `relational_expr` : | `relational_expr` "==" `relational_expr` : | `relational_expr` "!=" `relational_expr` The relational operators (``<``, ``<=``, ``>`` and ``>=``) operate on numeric arguments and are equivalent to :func:`lt`, :func:`le`, :func:`gt` and :func:`ge`, respectively. They bind more tightly than the equality operators. The equality operators (``==`` and ``!=``) operate on values of any type and are equivalent to :func:`eq` and :func:`ne`, respectively. .. note:: The logic of ``==`` and ``!=`` for non-numeric values is complicated; please see the documentation for :func:`eq`. .. note:: All of these operators are non-associative; it is an error to mix relational operators or equality operators within the same expression level, although it is permitted to mix relational operators with equality operators. For example, the expression ``x < y < z`` is invalid, as is ``x == y == z``. If you really want to say "the truth value of *x* < *y* is less than *z*," use parentheses: ``(x < y) < z``. However, it is permitted to mix equality and relational operators: ``a < b == x > y`` means "the truth value of *a* < *b* is the same as the truth value of *x* > *y*." .. _ref-field-replace-expr: Field replace expressions ------------------------- A field replace expression is used to non-destructively modify a named field of a user-defined type: .. productionlist:: expression: `equality_expr` : | `primary` "." `lower_identifier` ":=" `expression` .. note:: The ``:=`` operator is right-associative, and has lower precedence than field references, function application or binary operators. Hence, ``x.foo := y.bar := z`` is equivalent to ``x.foo := (y.bar := z)``. Updating multiple fields simultaneously therefore requires parentheses, for example, ``(x.foo := y).bar := z``. A field replace expression of the form ``obj.field := e`` produces a new value which is the same as *obj* but with field *field* replaced with the value of *e*. The original *obj* is not modified. The same compile-time and runtime rules as :ref:`field reference ` expressions apply. The expression *e* MUST have the same type as the type of the parameter named *field* in any constructor of the type of *obj*. The type of the field replace expression is the type of *obj*. The evaluation of the expression causes *obj* to be evaluated first. If the constructor used to construct *obj* has a parameter named *field*, then the value of the field replace is a value with the same constructor as *obj*, and all parameters the same as *obj*, except the parameter named *field* having the value of *e*. If the constructor used to construct *obj* does not have a parameter named *field*, then a runtime error occurs, terminating the program. Note that *obj* is evaluated before *e*. If *obj* does not have an appropriate constructor, it is unspecified whether *e* is evaluated or not. .. highlight:: marscon For example:: ?> x = Cons(1, Nil) ?> x.head := 2 Cons(2, Nil) ?> x.tail := Cons(2, Nil) Cons(1, Cons(2, Nil)) ?> x Cons(1, Nil) ?> Nil.head := 1 Runtime Error: List instance has no field 'head' As with any other expression, field replace expressions do not update the variable *obj*. Therefore, it is common to perform an "update" as follows:: ?> x = x.head := 2 ?> x Cons(2, Nil) .. highlight:: mars .. note:: Unlike conventional languages, this will not modify any aliases of `x` (it is literally an assignment of a new value to the variable named `x`). This can help avoid confusing bugs, and make your software more reliable overall (indeed, it is the reason behind Mars being a declarative language), but it can be confusing if you are used to conventional languages. See also: :ref:`ref-field-update-statements`. Precedence table ================ The precedence and associativity of all of the Mars expressions is summarised in the following table. Higher-numbered precedence levels indicate tighter binding. +------------+----------------+------------------------------+---------------+ | Precedence | Type | Operator(s) | Associativity | +============+================+==============================+===============+ | 6 | Primary | primary expressions | N/A | +------------+----------------+------------------------------+---------------+ | 5 | Unary | unary ``-`` | Right | +------------+----------------+------------------------------+---------------+ | 4 | Multiplicative | ``*``, ``/``, ``//``, ``%`` | Left | +------------+----------------+------------------------------+---------------+ | 3 | Additive | ``+``, ``-`` | Left | +------------+----------------+------------------------------+---------------+ | 2 | Relational | ``<``, ``<=``, ``>``, ``>=`` | None | +------------+----------------+------------------------------+---------------+ | 1 | Equality | ``==``, ``!=`` | None | +------------+----------------+------------------------------+---------------+ | 0 | Field replace | ``:=`` | Right | +------------+----------------+------------------------------+---------------+ .. _ref-evaluation-order: Evaluation order ================ Mars imposes a strict evaluation order between statements: for a statement sequence *s1*; *s2*, all computation involved in *s1* will be completed before the execution of *s2* begins. However, within an expression, the evaluation order of sub-expressions is unspecified. This is usually only relevant in an I/O context. For example, the following I/O code fragment has unspecified behaviour:: put_char('x') + put_char('y') # Unspecified evaluation order However, note that Mars has several "hidden" features which may perform side-effects in pure code (such as :ref:`field update statements `, :mod:`impure` functions and :mod:`debug` functions) -- all of these features operate outside of the Mars semantics and should not be used in normal code. Such features can be affected by the unspecified evaluation order. Pure code can also have unspecified termination behaviour. Consider the following pure function:: def weird() :: Num: return error("A") + error("B") + weird() # Unspecified termination Here it is unspecified whether the program will a) print the error message "A", b) print the error message "B", or c) loop forever or until the program runs out of stack space. However, in any case, the program will not continue executing (in the semantics, all of these cases have the denotation ⊥, pronounced "bottom"), so they can be considered equivalent.