Functional Lists

Lists are by far the most commonly used data structure in functional programming. We say that a data structure is functional if it is heap-allocated and immutable and can only be freed through garbage collection (GC). In contrast, a data structure is said to be linear if it is either stack-allocated or heap-allocated and can be freed by the user as well as by the GC.

The datatype for functional lists in ATS is (essentially) declared as follows:

datatype list (a:t@ype, int) =
  | {n:nat} list_cons (a, n+1) of (a, list (a, n))
  | list_nil (a, 0) of ()
// end of [list]

There are two data constructors associated with list: list_nil forms an empty list and list_cons for a list of a given head and tail. Given a type T and an integer I, the type list(T, I) is for lists of length I in which each element is of type T. Note that the sort t@ype indicates that the element type of a list can be unboxed (i.e., flat).

Often the following abbreviations are introduced for the list constructors so as to make the code involving list-processing less verbose:

#define nil list_nil
#define cons list_cons
#define :: list_cons // [::] is an infix operator

As an example of list creation, the following expression evaluates to a list consisting of integers 1, 2 and 3:

cons (1, cons (2, cons (3, nil ()))) // [nil ()] can be replaced with [nil]

Clearly, this kind of syntax is a bit unwieldy if longer lists need to be constructed. The following alternatives can also be used to create lists:

val xs = '[1, 2, 3] // the first character is quote (')
val xs = $lst (1, 2, 3) // this is equivalent to '[1, 2, 3]
val xs = $lst {Nat} (1, 2, 3) // [Nat] is given as the type for the list elements

The interfaces for various functions on lists can be found in prelude/SATS/list.sats.

Let us now see some list-processing code in ATS. The following program implements a function template that computes the length of a given list:

length {n:nat} .<n>.
  (xs: list (a, n)): int n =
  case+ xs of _ :: xs => 1 + length xs | nil () => 0
// end of [length]

As this is not a tail-recursive implementation, the function length may have difficulty handling long lists (e.g., (e.g., those containing more than 1 million elements). A tail-recursive implementation of length that can handle lists of any length is given as follows:

length {n:nat} .<>.
  (xs: list (a, n)): int n = let
  fun loop {i,j:nat} .<i>.
    (xs: list (a, i), j: int j): int (i+j) =
    case+ xs of _ :: xs => loop (xs, j+1) | nil () => j
  // end of [loop]
  loop (xs, 0)
end // end of [length]

Let us see another example. The following function append returns the concatenation of two given lists:

append {m,n:nat} .<m>. (
  xs: list (a, m), ys: list (a, n)
) : list (a, m+n) =
  case+ xs of
  | cons (x, xs) => cons (x, append (xs, ys)) | nil () => ys
// end of [append]

This is not a tail-recursive implementation. As a consequence, append may have difficulty handling a case where its first argument is of a large length (e.g., 1 million). Can append be given a tail-recursive implementation in ATS? The answer is affirmative. For instance, a tail-recursive implementation of append is available in prelude/DATS/list.dats. As the implementation makes use of linear types, it is to be explained elsewhere.