Switch factoring algorithm

This is the algorithm in Mars for transforming an AST-level switch statement to a CFG-level one. This flattens patterns out into multi-level switch statements, so it is a non-recursive data structure (like the rest of CFG), and represents the actual execution paths taken by the machine. It also generates errors and warnings for malformed switch statements in the source.

First, we transform each case into a “matcher”, a temporary data structure representing a single case in an equivalent, but flattened, form. For each case, recurse through the pattern structure, and generate a flat “matcher”. This results in a list of matchers, corresponding to the original list of cases.

matcher ::=  pair(list(match), stmt_block)

A matcher list is an assoc list associating sequences of “match“es with blocks, just like a switch is an assoc list associating patterns with blocks. A “match” is a pattern without arguments (ctors have arities rather than arguments) – unlike patterns, matches don’t recurse.

match ::=  match_any | match_var | match_ctor(Name, Arity) | match_intlit(Val)

A key operation in this algorithm is generate_matcher_code, which generates low-level code from a matcher list. This recursive operation performs most of the work, once the matcher list is generated. It is called as:

generate_matcher_code(arg_list, matcher_list)

Where arg_list is a list of atoms (variable names), denoting the variables holding the switch values for the match list. The arg_list is always shorter or equal in length to the matcher_list, as arg_list only has the arguments for the current level, while matcher_list is fully expanded out.

Example:

switch x:
    case Pair(A, 1)
    case Pair(B, 2)
    case Pair(A, 3)
    case Pair(q, _)
    case Pair(A, 5)

We want the final output to be:

switch x:
    case Pair:
        a = x.Pair(0)
        n = x.Pair(1)
        switch a:
            case A:
                switch n:
                    case 1
                    case 3
                    default: q = a
            case B:
                switch n:
                    case 2
                    default: q = a
            default: q = a

The arguments to pair are patterns, so this data structure is recursive. We want to flatten it out. Therefore, the matchers we generate are:

[(Pair/2,A,1)-block1, (Pair/2,B,2)-block2, (Pair/2,A,3)-block3, (Pair/2,q,_)-block4, (Pair/2,A,5)-block5]

(All the match arguments would have an arity, such as A/0 and 1/0, but we don’t show /0 in the list above. But not the default arguments, such as q.)

Now the constructors have no arguments – we have a flat list of patterns. Note that we recursively flatten this data, so if A had two arguments, we could see a matcher like:

(Pair/2,A/2,J,K,1)-block1

This reads operationally, “match a Pair, match the Pair’s first argument with A, match the A’s first argument with J and its second with K, then match the Pair’s second argument with 1.” So matchers are just a flat data structure for representing the same information as the case statement, but with an additional operational reading. Not all matchers have the same number of steps, as different ctors have different arities.

Now before we generate code for the matchers, we add a layer of abstraction over the blocks, to avoid duplicating blocks when we duplicate matchers. Hence, our matcher list becomes:

[(Pair/2,A,1)-“block1”, (Pair/2,B,2)-“block2”, (Pair/2,A,3)-“block3”, (Pair/2,q,_)-“block4”, (Pair/2,A,5)-“block5”]

“block1”: block1

“block2”: block2

“block3”: block3

“block4”: block4

“block5”: block5

(The unquoted block names represent the full block code; the quoted ones represent a thin reference to the block name).

The overall algorithm, outside of generating matcher code, goes, for switch(switchvar, cases):

  1. Construct the matcher list, matcher_list, from cases, by flattening out the switch statement.
    • Generate the code for each target block, and store the Block Ids in the matcher list, not the blocks themselves.
  2. Call generate_matcher_code ([switchvar], matcher_list).

The result is a CFG-level switch statement. CFG-level switch statements are very simple jump tables. They switch over an atom, and map ints and ctor names to block IDs. They have an optional default clause, which is required to be last. It is up to the backend to decide what to do if there is no default clause, but the backend is free to exhibit undefined behaviour if a switch doesn’t match and there is no default clause, as the code generator will never generate such an instruction due to exhaustive check, described below.

In this example, we call:

generate_matcher_code(["x"], [(Pair/2,A,1)-"block1", (Pair/2,B,2)-"block2",
                              (Pair/2,A,3)-"block3", (Pair/2,q,_)-"block4",
                              (Pair/2,A,5)-"block5"])

Generating code for a list of matchers

generate_matcher_code(arg_list, matcher_list)

Summary of the algorithm: Focus only on the first argument for each matcher – the rest will be handled by recursion. Convert variable-binding matchers into wildcard/default matchers which bind a variable in their code. For every relevant head value, group together every possible matcher (those with an exact match, and wildcard/default matchers). Make these groupings into switch statements, and recurse to generate the body.

Note that we assume we will switch on arguments from left to right. We could also pick a more intelligent order, such as choosing the argument with the most distinct cases first (which means earlier switches eliminate the most cases). In this case, the example below would choose the 2nd argument first, but we’ll assume left-to-right for now. Note that the flattening means we have to make this decision before generating the matcher list.

Each call to generate_matcher_code is given a block to write to (this block will be completed by the end of the call). When a choice is required, the block will be finished with a switch statement, and new blocks will be created for each case (and finished off by recursive calls). When no choice is required, some code may be appended to the block, and then the (single) recursive call will complete the same block.

Calls to generate_matcher_code may also append phi instructions to the target blocks (which is the mechanism used to make explicit variable bindings).

The function also needs to track two tables, for keeping variable bindings. Ultimately, variable bindings are stored in phi instructions in the target blocks, but during the code generation, we track them separately, because we can’t build phi nodes piece-by-piece. The binding table only moves down through the stack, so any changes made to it are undone on the way up. This maps blockid to varname to varname, such that if BB-BoundVar maps to SourceVar, it says “if BB wants BoundVar, it is currently available in SourceVar.” The phi table moves down and up through the stack, so changes made to it are permanent throughout the algorithm. This is a multimap mapping blockid to varname to pair(blockid,varname), such that if TargetBlock-BoundVar maps to SourceBlock-SourceVar, it says “TargetBlock should have a phi node for BoundVar, where one entry maps SourceBlock to SourceVar.” Both tables start out empty.

Base case

When arg_list is empty. It should always be the case that matcher_list contains exactly one element (but assume it has multiple and pick the first), whose match list is empty. In this case, generate a branch to the block named by the first element of matcher_list.

Also add entries to the phi table for the target block. Get the target block out of the binding table. For each entry in this table (BoundVar-SourceVar), create a phi table entry (TargetBlock-BoundVar) = (CurrentBlock-SourceVar).

Step 1 – Update binding table with explicit variable bindings

The first step of the operation (which in the current example will do nothing, but we’ll see it working later) is to update the binding table for explicit variable bindings (as distinct from the implicit temporary variables, which are created during switch generation). For each matcher whose first argument is a variable name:

  1. Add an entry in the binding table mapping (that matcher’s block - the first variable in arg_list) to the given variable name. For example, if arg_list [1] is “a” and the first entry in a matcher is “q”, then the variable “q” must be bound to the first switch value. Therefore, add to the binding table (<matcher's block> - q) -> a.
  2. Replace the first argument of the matcher with match_any (“_”). This is sound, as variables match anything, and we no longer need to worry about the variable binding, as it will be taken care of as soon as we branch to the code.

Step 2 – Create association list

The next step is to create an assoc list for every head value of interest, for the first argument. The assoc list maps “match”es onto match lists. “Head value” means the ctor without its arguments, or the number. “Of interest” means the head values referred to by any matcher. Head values which are never referred to (e.g., all the numbers which aren’t explicitly mentioned) are all the same as far as we’re concerned.

Now we have to be very careful to preserve order, because default matchers which are before more specific matchers need to supersede the specific ones. (This assumes a language like Haskell, where the first match is chosen, not Prolog/Mercury, where the order technically doesn’t matter.) The grouping of matchers with the same head will conceptually change the order, but this never matters, because when we make a grouping, it is a committed choice. In other words, once a match is made, we cover all possible subsequent cases inside the match, and never fall back out to a default case. So we don’t re-order within a group, but once we have selected the groupings, their order doesn’t matter (though we keep them in order for the sake of predictable code generation).

Let’s return to our example, with the matcher list:

[(Pair/2,A,1)-“block1”, (Pair/2,B,2)-“block2”, (Pair/2,A,3)-“block3”, (Pair/2,q,_)-“block4”, (Pair/2,A,5)-“block5”]

The resulting assoc list is very basic at this level, since all of the matchers have the same head, Pair/2:

Pair/2: [(A,1)-“block1”, (B,2)-“block2”, (A,3)-“block3”, (q,_)-“block4”, (A,5)-“block5”]

It’s difficult to see at this stage, but the operation is to take each head value of interest, and give it its own entry in the assoc list. Then, place in the assoc list any matcher which begins with either the head value, or a “_”.

  • For matchers which begin with the head value, remove the first column of the matcher (so the matched head value no longer appears in the match list).
  • For matchers which begin with a “_”, replace the first column of the matcher (the “_”) with one “_” for each argument of the head value (0 or more). For example, if placing a default matcher in the Cons/2 row, replace its front “_” with “_”, “_” – one for each argument. If placing a default matcher in the Nil row, remove its front “_”.

If there are any default cases, also create an assoc list entry for “_”, and only place in the assoc list any matcher which begins with a “_”. Remove the first column of such matchers, as if matching a 0-arity head value.

Horizontal rule: For any matcher (looking at the remaining matches), if it is covered by all of the matchers already on a row, remove it, as it is redundant. By “covered”, I mean for all of the remaining matches, if it is an exact match, it already appears in a previous matcher, or if it is a default, there is already a default in the previous matchers or between them they include all constructors of that type. When removing a matcher, if the first argument is not a default match, give a warning, as it is a useless clause.

Vertical rule: For any assoc list entry, if its head is covered by all of the entries already in the table, remove it, as it is redundant. By “covered”, I mean if it is an exact match, that head already appeared in the assoc list. If it is a default, there is already a default in the assoc list or between them, all entries include all constructors of that type. When removing an assoc list entry, give a warning for each matcher, as they are all useless. (Perhaps do not give a warning for a default matcher covered by all constructors, as that probably represents defensive programming. For example, the programmer might just be using the default rule as an assert because he doesn’t trust the compiler to generate “not exhaustive” errors – then again, maybe it should be a warning because he should trust the compiler!) When a default entry is placed in the assoc list, it will always be the last (this is a structural requirement of the CFG-level switch statement).

Once this grouping is done (at each level), test that there is either a default case, or that all ctors of the switch type have a case. If not, it is a compiler error (“switch not exhaustive”). The error message should include all of the missing cases (“requires default case or cases for A, B, C”), if it is an ADT. If it is an number type, it must have a default case, so the error message should just read “switch not exhaustive: default case required for switch on number”.

Step 3 – Generate switch statement and implicit unpacking instructions

This enables us to generate the top-level switch statement. Switch over the first variable of arg_list, and generate a case statement for each assoc list entry. If the assoc list entry is default, generate the “default” clause of the switch statement (which must be last). Within each case statement, we unpack all arguments, giving them a temporary name.

Unnecessary optimisation: If the variable’s original pattern is _ in all matchers, don’t unpack it – note that to know this, you actually shouldn’t erase the variable name in step 1, or you won’t know which are needed and which aren’t. Also, if their pattern is a variable name in all matchers, we can use that variable name and avoid the mov instruction on a later recursion of step 1, but for an optimising backend like LLVM, that won’t save anything.

<continue current block>
    switch x:
        case Pair: goto casepair

casepair:
    a = x.Pair(0)
    n = x.Pair(1)
    <generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2",
                                   (A,3)-"block3", (q,_)-"block4",
                                   (A,5)-"block5"])>

Optimisation: If arg_list has a single element, and there were no variables unpacked, then all recursive calls will be base cases with a simple goto. Avoid generating a bunch of new goto-only blocks by making the case statements directly branch to the first target block in each case, as the base case would. Must also update the phi table as the base case does.

Optimisation: I naively generated a single-case switch statement. Instead, we should avoid generating such a thing. If there is exactly one entry in the assoc list, just generate the unpacking code directly:

<continue current block>
    a = x.Pair(0)
    n = x.Pair(1)
    <generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2",
                                   (A,3)-"block3", (q,_)-"block4",
                                  (A,5)-"block5"])>

Step 4 – Recurse

Now we recurse on generate_matcher_code. The recursive calls are shown in the example above. Note that we recurse once for each assoc list entry (so we iterate, then recurse).

The arg_list on the recursive calls is the current arg_list with the first element removed, and the unpacked temporary variable names prepended. These new arguments correspond to the first entries in the new matcher lists. The matcher_list on the recursive calls is simply the corresponding value in the assoc list, for this head.

Afterwards – Generating phi code for explicit variable bindings

After all calls to generate_matcher_code have ended, the phi table has been built. This describes exactly where each target block should read the variable bindings from, conditionally dependent on the block it came from.

These should be converted into phi instructions for each block, in the obvious way. Unnecessary optimisation: Any entries in the phi table which have the same source variable name for every source block can be converted into mov instructions instead.

Completing the example

I now present the rest of the worked example, stepping through the recursive calls.

generate_matcher_code([a,n], [(A,1)-"block1", (B,2)-"block2", (A,3)-"block3",
                              (q,_)-"block4", (A,5)-"block5"])

First, we apply step 1, binding q for the matcher for (q,_).

  1. prepend to the actual code of block 4, the instruction “q = a” (where a is the first variable we are matching against).

  2. Replace q with _ in the matchers list. Hence:

    [(A,1)-“block1”, (B,2)-“block2”, (A,3)-“block3”, (_,_)-“block4”, (A,5)-“block5”]

Now we again group together the matchers by their first argument – this time we have multiple ctors and default matchers in the first argument. For every matched ctor, we duplicate all of the default matchers. Thus, the groups are:

A: [(1)-“block1”, (3)-“block3”, (_)-“block4”, (5)-“block5”]

B: [(2)-“block2”, (_)-“block4”]

_: [(_)-“block4”]

Note that block4 was duplicated for the A group, even though it did not explicitly match A. This preserves the semantics of the switch statement – if we had not duplicated it, then matching A would discount that default case. As explained above, when duplicating a default value, first test to see if it is covered by all of the matchers before it, and if so, do not duplicate.

This allows us to generate the next switch statement:

switch a:
    case A:
        <generate_matcher_code([n], [(1)-"block1", (3)-"block3",
                                     (_)-"block4", (5)-"block5"])>
    case B:
        <generate_matcher_code([n], [(2)-"block2", (_)-"block4"])>
    default:
        <generate_matcher_code([n], [(_)-"block4"])>

Now that we have three cases, we iterate over all of them, and recurse three times.

The assoc list for case A:

1: [()-“block1”]

3: [()-“block3”]

_: [()-“block4”]

(Applying the vertical rule to remove the entry “5: [()-“block5”]”).

The assoc list for case B:

2: [()-“block2”]

_: [()-“block4”]

The assoc list for the default case:

_: [()-“block4”]

This brings us to seven base cases, of <generate_matcher_code([], [()-“blockn”])>, which just generates the goto statement to the given block.

We end up with this code:

switch a:
    case A:
        switch n:
            case 1: goto "block1"
            case 3: goto "block3"
            default: goto "block4"
    case B:
        switch n:
            case 2: goto "block2"
            default: goto "block4"
    default: goto "block4"

Then we generate all of the code for each block. Note that each block only has code generated once, but may be targeted by multiple case statements, such as block4.

Further examples

Above example, continued

Now, let us imagine a slightly different scenario, where we had these matchers:

A: [(1,_)-“block1”, (3,7)-“block3”, (_,_)-“block4”, (5,_)-“block5”]

(Same as above for case A, but with an extra argument.) This means, by the rule above, we need to copy the default match for each concrete match. Hence, we generate the following grouping for case A:

1: [(_)-“block1”, (_)-“block4”]

3: [(7)-“block3”, (_)-“block4”]

_: [(_)-“block4”]

5: [(_)-“block5”, (_)-“block4”]

This is inefficient, but perfectly correct. Operationally, it reads: “If you match 1, match all and go to block1, and if that fails, match all and go to block4.” (Obviously, we will pick block1). Then, “If you match 3, match 7 and go to block3, or match all and go to block4”. Then it says “Match all and go to block4.” Finally, “If you match 5, match all and go to block5, and if that fails, match all and go to block4.” There are two forms of redundancy here.

First, once a default case occurs, all subsequent cases can be removed (preferably with a warning):

1: [(_)-“block1”, (_)-“block4”]

3: [(7)-“block3”, (_)-“block4”]

_: [(_)-“block4”]

Second, once a matcher which cannot fail is seen, all subsequent matchers can be removed:

1: [(_)-“block1”]

3: [(7)-“block3”, (_)-“block4”]

_: [(_)-“block4”]

This still perfectly preserves the semantics of the above matcher list – there is no possibility of going to block5, or block4 from a match of 1.

However, we would like to generate a warning in this last case as well, which would be spurious if we duplicated any defaults then eliminated them again. So we need to avoid duplicating in the first place.

Therefore: When duplicating a default value, first test to see if any of the matchers before it will always succeed, and if so, do not duplicate.

Now, we can be even more judicious in preventing duplication. First, don’t just assume a matcher set cannot fail if it has a default case – also check for exhaustive ctor coverage. Secondly, it’s not necessary for the matchers to be unable to fail, it is merely sufficient for them to cover the duplicated patterns. So: When duplicating a default value, first test to see if it is covered by all of the matchers before it, and if so, do not duplicate.

Also: Do not think of it as “duplicating” the default values. Simpler to think of it as “projecting” away all values which don’t match. So, include all exact matches and also default values. Then, eliminate all later matches which are covered by earlier matches, giving a warning iff the eliminated matches are exact (and therefore are useless). If they are default matches, then they aren’t useless unless covered by all constructors on the outside – they should have already been checked by the vertical rule.

An example of this:

[(1,A)-“block1”, (1,B)-“block2”, (2,A)-“block3”, (_,B)-“block4”, (_,_)-“block5”]

Naively transformed:

1: [(A)-“block1”, (B)-“block2”, (B)-“block4”, (_)-“block5”]

2: [(A)-“block3”, (B)-“block4”, (_)-“block5”]

_: [(B)-“block4”, (_)-“block5”]

However, we didn’t need to duplicate the (_,B) case to case 1, as it is covered by the (1,B) case. So:

1: [(A)-“block1”, (B)-“block2”, (_)-“block5”]

2: [(A)-“block3”, (B)-“block4”, (_)-“block5”]

_: [(B)-“block4”, (_)-“block5”]

If A and B are the only ctors for the switch type, then it would not be necessary to copy the (_,_) case either if 1 or 2 is matched, as it is covered in both cases:

1: [(A)-“block1”, (B)-“block2”]

2: [(A)-“block3”, (B)-“block4”]

_: [(B)-“block4”, (_)-“block5”]

Tricky variable binding example

Consider this code:

switch x:
    case Pair(49, Box(3))
    case Pair(m, Box(n))

This example is important because in this case, the binding for n depends on the path taken to the second block. This explains the need for Phi instructions when making variable bindings. In this example, the second case’s n variable needs to be copied from a temporary unpack variable – which variable will depend on whether the first number was 49.

Args: [x]

Binds: []

Matchers:

Pair, 49, Box, 3 - block1
Pair, m, Box, n - block2

Assoc:

Pair: [49, Box, 3]-block1, [m, Box, n]-block2

Codegen:

entry:
    a = x.Pair(0)
    b = x.Pair(1)
    <continue>

Args: [a,b]

Binds: []

Matchers:

49, Box, 3 - block1
m, Box, n - block2

Bind m=a in block2’s binding table.

49, Box, 3 - block1
_, Box, n - block2

Assoc list:

49: [Box, 3]-block1, [Box, n]-block2
_: [Box, n]-block2

Codegen:

    <continue entry>

    switch a:
        case 49: goto case49
        default: goto default
case49:
    <continue>
default:
    <continue>

(case 49)

Args: [b]

Binds: [(block2-m)=a]

Matchers:

Box, 3 - block1
Box, n - block2

Assoc list:

Box: [3]-block1, [n]-block2

Codegen:

<continue case49>
c = b.Box(0)
<continue>

(case 49, Box)

Args: [c]

Binds: [(block2-m)=a]

Matchers:

3 - block1
n - block2

Bind n=c in block2’s binding table.

Note

A phi is required, as the binding is different to that below.

3 - block1
_ - block2

Codegen:

<continue case49>
switch c:
    case 3: goto block1
    default: goto block2

Note

This isn’t quite the base case. We could recurse once more, generating two new blocks, and then each block would consist entirely of a goto. However, this pre-base-case allows more efficient code generation.

Write to phi table: (block2-m)=(case49-a)

Write to phi table: (block2-n)=(case49-c)


(case default)

Args: [b]

Binds: [(block2-m)=a]

Matchers:

Box, n - block2

Assoc list:

Box: [n]-block2

Codegen:

<continue default>
d = b.Box(0)
<continue>

(case default, Box)

Args: [d]

Binds: [(block2-m)=a]

Matchers:

n - block2

Bind n=d in block2’s binding table.

Note

A phi is required, as the binding is different to that above.

_ - block2

Assoc list:

_: []-block2

Codegen:

<continue default>
goto block2

Write to phi table: (block2-m)=(default-a)

Write to phi table: (block2-n)=(default-d)


(Final generated code)

Final phi table:

(block2-m)=[(case49-a), (default-a)]
(block2-n)=[(case49-c), (default-d)]
entry:
    a = x.Pair(0)
    b = x.Pair(1)
    switch a:
        case 49: goto case49
        default: goto default

case49:
    c = b.Box(0)
    switch c:
        case 3: goto block1
        default: goto block2

default:
    d = b.Box(0)
    goto block2

block1:
    ...

block2:
    n = phi(case49 = c, default = d)
    m = phi(case49 = a, default = a)
    ...

Note

The phi node for m in block2 may be converted into a mov instruction, because the binding is the same for all predecessors.