.. Mars Documentation Copyright (C) 2014 Matt Giuca 2014-07-05 .. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. .. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. .. You should have received a copy of the GNU General Public License along with this program. If not, see . .. Note the careful use of ..highlight for switching between the 'mars' and 'marscon' languages. 'mars' should be used for code examples, and 'marscon' for Bash and Mars console session examples. .. _destructive: Automatic destructive update ============================ .. sectionauthor:: Matt Giuca One of the exciting features of Mars is that it supports *automatic destructive update* for array operations. The compiler can, in some circumstances, turn inefficient copy-and-update operations into in-place destructive operations. Mars will do this behind the scenes, without you needing to worry about it, but knowing some of the details of this feature can help you make sure you get the best performance out of your array operations. Because Mars is a purely declarative language, functions cannot normally modify their arguments, which means that updating data structures --- particularly arrays --- can be inefficient. Recall this code from the previous chapter:: def array_increment(a :: Array(Num)) :: Array(Num): i = 0 while i < array_length(a): a = array_replace(a, i, array_ref(a, i) + 1) i = i + 1 return a In a naïve implementation, this algorithm has quadratic running time, because every call to :func:`array_replace` makes a full copy of the array. But what if the compiler was smart enough to realise that the array did not always need to be copied, because the old value of the array would never be used again? In *certain cases*, the Mars compiler is clever enough to optimize away the copy operations. Before we discuss this example, let's look at some simpler cases. Direct calls to array operations -------------------------------- Consider a simple Mars program that performs two updates to an array:: import prelude def main() :: io Num: a = [1, 2, 3] print_value(array_replace(a, 0, 10)) put_char('\n') print_value(array_replace(a, 1, 20)) put_char('\n') return 0 .. highlight:: marscon The output of this program is:: [10, 2, 3] [1, 20, 3] .. highlight:: mars It is important to note that :func:`array_replace` never changes *a*, so the second line has the original value of the first array element (1, not 10). You might realise, however, that *a* is never used again after that second call to :func:`array_replace`. The Mars compiler realises this too, and takes the liberty of automatically telling :func:`array_replace` to just mutate the array it is given, and not bother copying it (effectively transforming it into a call to :func:`array_set`). You won't be able to tell the difference in the output, but if you do a lot of array updates, you might notice the performance difference. Mars automatically transforms calls to the following array functions: * :func:`array_replace` * :func:`array_add` * :func:`array_concat` (destructive on the first argument only) * :func:`array_remove` In principle, we could apply the same techniques to updating other data structures as well, but for the time being, Mars only applies this optimization to arrays. .. highlight:: marscon But, aside from profiling, how do you know when Mars has transformed an update operation? In the Mars interactive mode, you can use the ":uc" command to see which calls to a particular function are destructive. For example:: $ mars -i test.mar ?> :uc main:array_replace [N] [D] Here, we are asking the compiler: show the call mode of every call to :func:`array_replace` from :func:`main`. As there are two calls to :func:`array_replace`, it prints two lines. The first is "[N]", meaning non-destructive or normal --- we cannot update *a* because it will be used later on. The second is "[D]", meaning destructive. This shows that the compiler is indeed destructively updating the array *a* on the second call to :func:`array_replace`. .. highlight:: mars Calls to functions that update arrays ------------------------------------- Mars is clever enough to track usage information across function calls, so the analysis still works if a function tries to update an array it was given as an argument. Consider this program, which generates an array of numbers from 0 to 99:: import prelude # Appends the length of array as a new element to the end. def push_length(array :: Array(Num)) :: Array(Num): len = array_length(array) return array_add(array, len) def main() :: io Num: a = [] while array_length(a) < 100: a = push_length(a) print_value(a) put_char('\n') return 0 .. highlight:: marscon Can Mars optimize the call to :func:`array_add` in :func:`push_length`? The answer is "it depends". The function is modifying an array it was given as an argument, so it cannot perform destructive update unless it is sure that the *caller* is not going to need it again. Let's see what ":uc" says:: ?> :uc push_length:array_add [{array}] The answer is neither N nor D, but "{array}", which means "the call to :func:`array_add` will be destructive if the caller does not need *array* any more." Essentially, :func:`push_length` has two modes, just like the primitive array operations. So, to find out whether it will be destructive, we must ask :func:`main`:: ?> :uc main:push_length [D] .. highlight:: mars So even though :func:`main` is going to call :func:`push_length` one hundred times, it only has a single static call to the function, and the "D" tells us that it will call the destructive mode of :func:`push_length`. This means that our program will be highly efficient --- it will create only a single array and modify it one hundred times. But what about another program that calls :func:`push_length`:: def main() :: io Num: a = [1, 2, 3] b = push_length(a) print_string("Length was ") print_value(array_length(a)) print_string("; now it is ") print_value(array_length(b)) put_char('\n') return 0 .. highlight:: marscon Notice that this program reads from *a* after calling :func:`push_length`. Unfortunately, this disqualifies it from using the destructive version of :func:`push_length` (or else ``array_length(a)`` would return the wrong result). As we can see, the program is now calling the non-destructive version:: ?> :uc main:push_length [N] .. highlight:: mars To fix this problem, we can make sure there are no usages of *a* after :func:`push_length` is called, by pre-caching its original length:: def main() :: io Num: a = [1, 2, 3] a_len = array_length(a) # Must be before push_length(a). b = push_length(a) print_string("Length was ") print_value(a_len) print_string("; now it is ") print_value(array_length(b)) put_char('\n') return 0 .. highlight:: marscon Now the array is destructively updated:: ?> :uc main:push_length [D] .. highlight:: mars Iteration inside a function --------------------------- Now let's return to our original :func:`array_increment` example:: def array_increment(a :: Array(Num)) :: Array(Num): i = 0 while i < array_length(a): a = array_replace(a, i, array_ref(a, i) + 1) i = i + 1 return a .. highlight:: marscon Can this quadratic-time algorithm be automatically made into a linear-time one? Unfortunately, the answer, for the moment, is "it depends":: ?> :uc array_increment:array_replace [{a}] .. highlight:: mars As with the :func:`push_length` example, :func:`array_increment` is destructive only if the array is never used again by the caller. That means we have perfect linear behaviour for a simple caller like this:: def main() :: io Num: a = [1, 2, 3] print_value(array_increment(a)) # [D] put_char('\n') But if a caller needs to use the original *a* again after the call to :func:`array_increment`, there will be no optimization:: def main() :: io Num: a = [1, 2, 3] print_value(array_increment(a)) # [N] put_char('\n') print_value(array_add(a, 4)) # [D] put_char('\n') This is not ideal: the fact that the caller needs *a* again means that :func:`array_increment` needs to copy it once, but it doesn't need to copy it every time around the loop! There is an inherent limitation in the compiler that each call site has a fixed mode: it can't behave one way on the first iteration of the loop, and another way for each subsequent iteration. Armed with the knowledge from ":uc", we can manually work around this limitation to always get linear behaviour. We simply need to manually unroll the loop, so the first iteration is separate from the main loop:: def array_increment(a :: Array(Num)) :: Array(Num): if array_length(a) == 0: return a # Manually unrolled for destructive update. a = array_replace(a, 0, array_ref(a, 0) + 1) i = 1 while i < array_length(a): a = array_replace(a, i, array_ref(a, i) + 1) i = i + 1 return a .. highlight:: marscon Now we can ask the compiler how it has optimized the function:: ?> :uc array_increment:array_replace [{a}] [D] .. highlight:: mars Success! The *first* update will be destructive if the caller allows it, or make a copy if necessary. Either way, the array now belongs to us, and we can update it however we wish. All calls to :func:`array_replace` in the body of the loop are always destructive, regardless of the caller's instructions. Either way, :func:`array_increment` is now a linear-time operation. .. note:: Future versions of the Mars compiler may perform this unrolling automatically, but for now, such functions must be hand-optimized. Aliases and advanced troubleshooting ------------------------------------ So far, we have used the rather hand-wavy "if it will never be used again" to describe when it is safe to destructively update. In reality, deciding whether a variable will never be used again is the most complex part of the process, due to *aliases* --- other variables that point to the same object as the variable under consideration. .. note:: Aliases are not normally something you must worry about in Mars. It is more or less the whole point of a purely declarative language that you should not need to care whether a two variables are aliased or just contain the same value. Consider the following example:: import prelude def main() :: io Num: a = [1, 2, 3] b = a print_value(array_replace(a, 0, 10)) # [N] put_char('\n') print_value(array_replace(b, 1, 20)) # [D] put_char('\n') return 0 In this example, the first call to :func:`array_replace` is non-destructive. Why? It is the last time *a* is mentioned, but that is not the only consideration: *a* and *b* are pointers to the same array object, so mutating *a* would inadvertently cause *b* to change as well. Luckily, the Mars compiler is clever enough to detect that *a* and *b* are aliased, so it prevents the destructive update in this case. In general, there are four conditions for optimizing the update to a variable when calling a function: * The variable in question must not be used again in the remainder of the function. * The variable must not be aliased to another variable that will be used in the remainder of the function. * If the variable is an argument to this function, or is aliased to an argument, the caller must have authorized its update. * The variable must not appear as two arguments to this function call. If you are finding that a function call is not being optimized as it should be, look for other variables that could potentially be aliased to the variable you are trying to update. The Mars aliasing inference system is fairly sophisticated; it can precisely detect aliases across function calls and inside complex data structures. However, the compiler is not perfect, and might think variables could be aliased when in fact they can't be. You might have to try rearranging the code to remove any doubt. The best way to get a feel for what can be optimized is to experiment with programs and the ":uc" command. If you think that the Mars compiler is not optimizing some code that it could optimize, make your example as simple as possible and `file a bug `_. If the Mars compiler is doing destructive update on a variable that is still alive (giving incorrect behaviour), then that is definitely a bug and should be reported. If you need to disable automatic destructive update for some reason, you can pass the ``-d-`` command-line argument to either :command:`mars` or :command:`marsc`.