Loops
loop-Statement
A typical search loop checks all the elements in a data structure for some search criterion and returns the first matching element or signals that none such element exists. This is very ugly in many languages.
Type result = null; int index = 0; while index < data.size { element = data.get(index); if element.is_what_we_want { result = element; break; } index++; } if result != null { use(result); } else { print("not found"); }
Many developers turn this into
int index = 0; while index < data.size { element = data.get(index); if element.is_what_we_want { use(element); break; } index++; } if index >= data.size { print("not found"); }
or even
int index = 0; while (index < data.size) && !((element = data.get(index)).is_what_we_want) { index++; } if index < data.size { use(element); } else { print("not found"); }
Both these approaches are ugly. The use of break
adds a non-obvious exit condition to the loop, while the assignment to the element within the exit condition has a non-obvious side-effect.
How can we get nice code without break
and without side-effects in conditions?
int index = 0; while index < data.size { element = data.get(index); if element.is_what_we_want { result = element; break; } index++; } if result != null { use(result); } else { print("not found"); }
Using for_all
Having a for_all-construct avoids writing the iteration code and the index variable, but the overall result is not satisfying:
found := null; for element : data { if element.is_what_we_want { result = element; break; } } if result != null { use(result); } else { print("not found"); }
A data.iterate
function that receives a function argument that signals by its return value if processing shall continue is also not very convincing:
found := data.iterate((x) => { if (x.is_what_we_want) { use(x); true } else { false } }); if (!found) { print("not found"); }
Using Sequence and options
A Sequence, filter and option type provides a nice solution, e.g.,
option<Type> o := data.filter((x) => x.is_what_we_want).first_or_none; match o { Type element => use(element); None => print("not found"); }
The iteration and filtering is clearly separated from the code that processes the result.
Additionally nice in this approach is that there are no variables that contain state. There are no assignments to variables other than the initial values.
data.filter((x) => x.is_what_we_want).first_or_none ? Type element => use(element), None => print("not found");
Using a loop in a function
option<Type> find is result = None index := 0 while (result = None) && (index < data.size) { element = data.get(index); if element.is_what_we_want result = element index++ find ? Type t => use(t), None => print("not found");
This clearly separates the search for the element from its use.
Using explicit loop variables
option<Type> find is for index := 0, index + 1 result = None while (result = None) && (index < data.size) element = data.get index if element.is_what_we_want result = element find ? Type t => use t, None => print "not found"
This allows the code that sets and updates the index variable to be at one single location, it can also be used to make the compiler forbid any modifications of index at other places in the loop body.
Combining while and until loops
for index := 0, index + 1 while index < data.size element := data.get index until element.is_what_we_want use element else print "not found"
This gives the clearer structure for this case. A bit strange is the visibility of element
outside of the block that defines it, but the same holds for index.
Nesting until in while
for index := 0, index + 1 while index < data.size element := data.get index until element.is_what_we_want use element else print "not found"
This solves the extended scope of element, but it does not seem to improve readability.
Adding variant and invariant
for index := 0, index + 1 variant data.size - index invariant (0..index-1).for_each(x -> !data.get(x).is_what_we_want) while index < data.size element := data.get index until element.is_what_we_want use element else check (0..data.size-1).for_each(x -> !data.get(x).is_what_we_want) print "not found"
This can be used for debugging or analysis tools to proof the correctness of the loop, in particular
- the variant is a non-negative integer expression that decreases during each loop iteration, so it makes sure the loop terminates. The initial value of the variant gives an upper bound for the number of loop iterations
- the invariant states that all elements up to index are not what we are searching for. Together with a false while-condition, this implies that none of the elements in data are what we are searching for.
Loop using tailcall
Variables declared within a loop that are referenced from inner features via closures must be duplicated for each loop iteration, so they cannot be allocated in the stackframe of the feature containing the loop. One way to avoid the need for special handling of these variables is to consider a loop syntactic sugar and to convert it into a recursive function and have the compiler perform optimizations such as tail recursion. The general loop code
for index := 0, index + 1 variant data.size - index invariant (0..index-1).for_each(x -> !data.get(x).is_what_we_want) while index < data.size element := data.get index until element.is_what_we_want use element else print "not found"
can be converted to a function as follows:
loop (index int, var option int) pre /* variant */ (var ? None => true, int v => v > vari), /* invariant */ (0..index-1).for_each(x -> !data.get(x).is_what_we_want) is vari => data.size - index if index < data.size element := data.get index if element.is_what_we_want use element else loop (index + 1) vari // tail recursion, compiler should convert this into a goto else print "not found" loop 0 None
With this conversion, the compiler no longer has to handle loops. But it is required to detect and optimize tail recursion. Loop variables have become local variables of the loop function, so they will be duplicated and stored on the heap automatically in case they are part of closures.
If the compiler supports inlining and tail recursion, we could avoid the special handling of the variant on the first iteration by using a second feature loop2 that is used for the second and following iterations:
// loop for 2nd and following iterations: check variant and call loop loop2 (index, last_vari int) : loop index /* check variant is decreasing */ pre last_vari > vari && vari >= 0 is loop (index int) /* invariant */ pre (0..index-1).for_each(x -> !data.get(x).is_what_we_want) is // loop variant vari => data.size - index if index < data.size element := data.get index if element.is_what_we_want use element else loop2 (index + 1) vari // tail recursion, compiler should inline and convert this into a goto else print "not found" loop 0
Loop for iteration
A very common application of loops it to iterate over the elements of some set/array/list etc. This can be achieved by extending the index variable clause to allow a notation to iterate the elements of such a data structure, e.g.,
for x in myset do print x
Typical problems in Java are
- How to combine this with an index variable
- How to iterate two sets simultaneously
- How to define a termination condition without using ugly "if cond break" constructs
The first two can be solved by allowing multiple entries in the index vars section. E.g, the code to copy the elements of two lists as tuples in an array:
for a in lista b in listb i := 0, i+1 array := array (tuple A B) (min a.size b.size) do array[i] := (a,b)
A termination condition should still be possible even when iterating a data structure, in this example with an empty loop body:
for x in mylist until x.is_what_we_want print "we found what we want" else print "we don't have what we want"