Functional programming tips
Use functional style by default
Code should be written in a functional style (no ref
s, arrays, or records
with mutable fields) except where indicated. Marks will be taken off if this
rule is violated. We will let you know when it's OK to use imperative idioms.
Certain imperative idioms (like while
and for
loops) are almost never
used, but we do use ref
cells on occasion to make code simpler.
Many of the tips below describe ways to effectively use functional style. They are not in any particular order, so feel free to browse until you find something interesting.
Use tail recursion by default
As you recall, a lot of CS 4 was spent teaching you to write functions using recursion instead of loops, and specifically using tail recursion. We use tail recursion because it has desirable space properties, so in general, when you write a recursive function, we generally try to make it tail recursive. In particular, any function that can operate on arbitrarily-sized data structures should really be tail-recursive so as not to use up excessive amounts of stack space.
However, there is no need to be obsessive about this. Most of the code compiled by a compiler will be comparatively short (e.g. the size of a typical function), so if the input to a function is that code, optimizing for space efficiency may not buy you very much. So if it's significantly easier to write a particular recursive function in a non-tail-recursive style, it's OK to do that. But be prepared to defend this choice in code reviews.
Also note that it's often desirable to use a fold (see below) instead of using recursion at all!
Persistent data structures
Probably the hardest part of using functional programming effectively is getting used to persistent data structures (like lists) instead of mutable data structures (like arrays).
In this course, almost all data structures are persistent, which means that when you change something in the data structure, it returns the updated version without altering the input version. This is very different to how imperative languages work.
Tip
If you think you need to use List.iter
(which would be appropriate for a mutable data structure)
you almost always want to actually use List.fold_left
or an iterative helper function
(which is appropriate for a persistent data structure).
Even though you may find functional programming awkward at first, it is vastly easier to write correct code in a functional style than in an imperative style. There is simply less that can go wrong.
Maps are your friend!
Consider this tail-recursive function:
let rec convert_nums (scale : int) (ns : int list) (acc : int list) : int list =
match ns with
| [] -> acc
| h :: t ->
convert_nums scale t (acc @ [scale * h])
This function would be called like this:
When you look at what the function is actually doing, it becomes clear that it is simply multiplying each element of the list by 10. A much simpler way to achieve this would be to use a map:
You should develop your intuition for where a recursive function
can be replaced by a higher-order function,
and List.map
is a simple example of this.
Note on efficiency
The other thing wrong about this function is that it's gratuitously inefficient, because of the line
The acc @ [scale * h]
expression
adds a new value scale * h
to the end of a list,
which is an \(O(n)\) operation,
giving the entire function an \(O(n^2)\) time complexity.
A better approach would be to add to the front of the list
and reverse it at the end.
Now consider this (non tail-recursive) function:
let rec spread_nums (ns : int list) : int list =
match ns with
| [] -> []
| h :: t -> [h - 1; h; h + 1] @ spread_nums t
It could be used like this:
You might think you can't use a map here:
# List.map (fun n -> [n - 1; n; n + 1]) [1;2;3];;
- : int list list = [[0; 1; 2]; [1; 2; 3]; [2; 3; 4]]
However, you can combine the list of lists into a list
using the List.concat
function:
# List.concat (List.map (fun n -> [n - 1; n; n + 1]) [1;2;3]);;
- : int list list = [0; 1; 2; 1; 2; 3; 2; 3; 4]
This is so common that there is a function called List.concat_map
which does both the mapping and the combining:
# List.concat_map (fun n -> [n - 1; n; n + 1]) [1;2;3];;
- : int list list = [0; 1; 2; 1; 2; 3; 2; 3; 4]
So again there is no need for a recursive function
when all the function is doing is mapping a function over a list of values
and flattening the resulting list of lists;
this is just the same as using List.concat_map
.
Folds are your friend!
Programmers new to functional programming are often frustrated by what they
perceive as the difficulty of doing simple things. A good example of this is
accumulation. You have a list and want to compute some value from the list
elements. For instance, say you want to get the maximum value of the list (and
assume that you only have a two-element max
function to compute maximums).
A Python programmer could immediately write this code:
def max_list(lst):
"""Compute the maximum of a list of positive integers."""
if len(lst) == 0:
return 0
mx = lst[0]
for item in lst:
mx = max(mx, item)
return mx
In OCaml, we do this sort of thing using folds, specifically left folds.
This can be simplified further:
Whenever you need to write code which accumulates something over a list, you generally want to use a fold.
A common beginner's style error is to do this with an iterative helper function:
let max_list lst =
let rec iter rest mx =
match rest with
| [] -> mx
| h :: t -> iter t (max h mx)
in
iter lst 0
This is not so bad, but the iterative helper function can be replaced by
List.fold_left
and the entire function becomes a one-liner!
Be alert for this kind of situation --
there's no benefit to using a functional language
if you don't take advantage of what it offers.
On the other hand,
sometimes you are accumulating more than one thing in an iteration,
or you may be passing more than just the thing you are accumulating
through each iteration of the computation.
In these cases, you can still use List.fold_left
by creating a tuple of everything that needs to be passed
from one iteration to another
and using that as the accumulation value,
but in my experience,
it's often easier to just write an iterative helper function.
Left vs. right folds
95% of the time or more, if you want to use a fold in OCaml,
you want to use a left fold (List.fold_left
)
instead of a right fold (List.fold_right
).
Left folds are tail recursive (space efficient) and are more natural
than right folds, which are not tail recursive
and thus not space efficient.
On the other hand, there are cases where right folds are the right thing to use. When you want to accumulate things in a list from the right going back to the left, a right fold is usually going to be simpler than a left fold, and you shouldn't feel bad about using one in that case (unless you expect that the list could be very long, in which case you don't want to use a right fold because it would use up too much stack space).
Nested pattern matching
It's very common to want to write a pattern match inside another pattern match. There are two cases to consider.
-
Where the two pattern matches can be combined. Consider this code:
let f (maybe_lst : int list option) : int option = match maybe_lst with | None -> None | Some lst -> begin match lst with | h :: _ -> Some h | [] -> None end
You can merge both pattern matches as follows:
let f (maybe_lst : int list option) : int option = match maybe_lst with | Some (h :: _) -> Some h | _ -> None
Note how much shorter and clearer the code is. Also note the use of the wildcard (
_
) pattern for "don't care" situations.Merging pattern matches is almost always possible if the inner pattern match doesn't depend on a value computed using information from the outer pattern match. Unnecessary pattern matches are bad style and will be marked down (but we'll try to alert you of this issue during code reviews).
-
Where the two pattern matches can't be combined. Consider this code:
let f (lst : int list) : (int * int) option = match lst with | h :: t -> begin match List.sort compare t with | h' :: _ -> Some (h, h') | _ -> None end | _ -> None
The inner match is matching on the result of the
List.sort
function applied to the tail of the list. You can't combine these pattern matches because the inner one depends on a value which must be computed (it's not structural). For very simple cases, you can use awhen
clause instead of the inner match, but often this isn't feasible. So you have nestedmatch
expressions.When you have nested
match
expressions, you must surround them by eitherbegin
/end
delimiters (as we did above) or parentheses, which looks like this:let f (lst : int list) : (int * int) option = match lst with | h :: t -> (match List.sort compare t with | h' :: _ -> Some (h, h') | _ -> None) | _ -> None
Using parentheses looks more concise, but it's easy to forget to close the open parenthesis, so we prefer to use
begin
/end
. (It reads better, too.) What you definitely do not want to do is this:let f (lst : int list) : (int * int) option = match lst with | h :: t -> match List.sort compare t with | h' :: _ -> Some (h, h') | _ -> None | _ -> None
If you do this, you will get strange error messages because OCaml will consider the last line to be part of the inner pattern match. Remember, OCaml is not whitespace sensitive! So don't forget the
begin
/end
.