Recursive factorial in 14 characters
I’ve been having some fun showing off the J array programming language. But there are a number of important aspects of this language that I haven’t talked about yet. In this post, I’ll demonstrate them in the context of a very simple example: a recursive factorial function.
TL/DR
So as not to keep you in suspense, here is how you can define a recursive factorial function in J in 14 characters:
factorial =: 1:`(*$:@<:)@.*
factorial 0
1
factorial 10
3628800
(1:`(*$:@<:)@.*) 10
3628800
Not only is 1:`(*$:@<:)@.*
a definition of factorial,
it’s an anonymous function as well.
You can see how J gets its reputation as a “line noise” language. But there is also a lot of cool stuff in this definition, as we’ll see.
No J programmer would ever use this definition of factorial,
since factorial is a primitive function in J,
using the (prefix) !
character:
!10
3628800
Simple definitions
We’ll work up to the 14-character definition in stages. First, let’s see how to define a recursive factorial function in Python:
def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n - 1)
(Of course, we could leave off the else:
if we wanted to.)
You can write it like this in J too:
factorial =: {{
if. y = 0 do.
1
else.
y * factorial (y - 1)
end.
}}
This illustrates some of the conventional control constructs in J:
if.
, do.
, else.
and end.
.
The .
at the end is to be consistent with J’s naming conventions,
which often use .
or :
at the end of primitives.
(It also means that they can’t conflict with variable names,
which can’t end in a .
.)
The y
means “right argument”;
you can’t give a special name for a function argument.
This function only takes a right argument;
if it also had a left argument, that would be referred to as x
.
This example also shows that J code doesn’t have to be obfuscated; you can write very straightforward code if you want to. In fact, because of J’s strict right-to-left evaluation order, you could leave off some parentheses:
factorial =: {{
if. y = 0 do.
1
else.
y * factorial y - 1
end.
}}
Whether this is an improvement is arguable.
In Python, you would probably use a for
loop instead of recursion
to compute factorials:
def factorial(n):
result = 1
for i in range(1, n + 1):
result *= i
return result
You can write this in J too:
factorial =: {{
result =. 1
for_i. 1 + i. y do.
result =. result * i
end.
result
}}
The for_i.
control structure is basically a “foreach” loop,
with the item looped over given the name of i
.
The =.
is a local assignment operator,
in contrast to the global assignment operator =:
.
The i.
primitive (no connection to for_i.
)
is equivalent to Python’s range
with one argument.
So here we generate an array of numbers from 0 up to (but not including) y
,
add 1 to all of them, and loop over them,
multiplying result
by them all to get the answer.
Having said this, the J way is all about brevity, and we can get a much smaller factorial function as follows:
factorial =: {{ */ 1 + i. y }}
factorial 10
3628800
Here, we are generating an array from 1 to y
and multiplying it all together with the */
operation.
No recursion is necessary!
We can easily turn this into a tacit definition:
factorial =: [: */ 1 + i.
factorial 10
3628800
This is a factorial function in 8 (non-whitespace) characters. The recursive definition is clearly going to be larger, but surprisingly, not that much larger.
Getting rid of conditionals
Let’s go back to the earlier recursive factorial definition:
factorial =: {{
if. y = 0 do.
1
else.
y * factorial (y - 1)
end.
}}
Bulk-wise, what jumps out at you is the if.
, do.
, else.
and end.
control structure keywords.
We could make the definition a lot shorter if we got rid of those.
Some languages (C, C++, Java) have conditional expressions
(the infamous ? :
trinary operator)
which allows you to evaluate one of two expressions
based on the value of a third expression
which evaluates to true or false.
J generalizes this, but in a very unusual way.
There are two key ingredients:
- gerunds
- the
@.
primitive
Gerunds
Gerunds exist because functions aren’t fully first-class objects in J. For instance, you can’t create an array of functions. However, sometimes you really want something like this, and gerunds allow you to have a version of it. A gerund is a function (verb) represented as a noun (data).
The primitive that creates gerunds is the “tie” operator,
which is represented by the backtick (`
) character.
It’s called “tie” because it’s normally used
to “tie” multiple functions together
into an array of gerunds.
For instance,
you can define an array of three gerunds by using the tie as follows:
+`-`* NB. array of three gerunds, made from the +, - and * functions
+-+-+-+
|+|-|*|
+-+-+-+
The boxed output suggests that these functions aren’t represented as themselves; they are in an internal representation. Despite this, a tie of functions is a true array:
fs =: +`-`*
$ fs NB. get shape of array
3
0 { fs NB. get first element
+-+
|+|
+-+
To convert the gerund back into a function,
you can use the `:
primitive (“evoke gerund”) like this:
f0 =: 0 { fs
f =: f0 `: 6
f
+
2 f 3
5
`:
(6) is pretty weird;
see the J documentation for more on this.
As I understand it,
it’s the right argument to use in most cases.Fortunately, in practice, we almost never need to use
the `:
primitive;
we use the @.
primitive instead.
The @.
primitive
One of the most common uses of gerunds is as arguments
to the @.
primitive, known as “agenda”.
This primitive creates conditional expressions.
It takes two arguments:
- the left argument is an array of gerunds,
normally produced by the
`
(tie) primitive; - the right argument is an integer index into the array.
Assuming the index is valid,
the @.
primitive fetches the corresponding gerund from the array
and converts it back into a function.
fs =: +`-`* NB. array of three gerunds
2 (fs @. 0) 3 NB. select +
5
2 (fs @. 1) 3 NB. select -
_1
2 (fs @. 2) 3 NB. select *
6
2 (fs @. 3) 3
|index error
| 2(fs @.3)3
The way to use @.
and gerunds to implement conditional expressions
is to create separate functions for each arm of the conditional,
tie them together into an array of gerunds using the `
primitive,
and the select the one you want using the @.
primitive.
Unlike conditional expressions in many languages,
this works fine with multi-way conditionals
(i.e. more than two conditions)
as long as you can express the behavior you want
as an integer between 0 and the maximum index of the gerund array.
For the very common case where there are only two cases,
this still works because “true” and “false” in J
are written as the integers 1
and 0
respectively.
Schematically, what we would need is:
NB. This is not valid J syntax.
{false case function}`{true case function} @. {boolean expression}
where the boolean expression (an expression which evaluates to a boolean) would select either the false case function or the true case function. This function would then need to be applied to something else to get the final result.
Aside from the odd syntax, the tricky part of using this pattern is encoding the false case actions and the true case actions as functions. Although this is very much in keeping with the function-level programming which is one of the hallmarks of J (seen also in trains of functions, which I’ve discussed previously), it’s not exactly intuitive. We normally think “If this condition holds, do this, otherwise do that.” J asks us to write this as “If this condition holds, select this function, otherwise select that function, then apply the function you’ve selected.”
Let’s try to kludge away the problem by noting that
you can turn any value into a function that returns that value
by using the [
(left argument) primitive:
10 [ 42
10
10 [ 2 + 2 + 2
10
always_ten =: 10 & [
always_ten 42
10
always_ten 2 + 2 + 2
10
Recall that the &
(bond) conjunction
binds a value to (in these cases)
the left argument to the functions.
(For instance, 2 & +
is the “add two” function.)
So if you have an expression which represents either
the true case or the false case of your conditional,
you can convert it into a function this way
and use @.
on it:
(10&[)`(20&[) @. 0 NB. computing a function
10&[
(10&[)`(20&[) @. 1
20&[
((10&[)`(20&[) @. 0) '' NB. '' is a dummy argument
10
((10&[)`(20&[) @. 1) ''
20
We use the ''
(empty string) argument to the functions because
in a function like 10&[
,
the result will always be 10 regardless of the right argument,
but we still need a right argument to execute the function.
OK, with what we now know,
we can try to write a factorial function using @.
.
factorial =: {{
((1&[)`((y * factorial (y - 1))&[) @. (y > 0)) ''
}}
On the positive side, this is pretty concise. On the negative side, it doesn’t work!
Let’s see how it was supposed to work.
In explicit definitions like this,
y
represents the right argument,
which is the only argument to the factorial
function.
The expression (y > 0)
is 0 (false) when y
is 0 and 1 (true) otherwise.
When y
is 0, the @.
primitive picks out the
0th gerund in the gerund array,
which is (1&[)
.
This is a function which, when applied to anything, always returns 1.
(You can also write this more concisely in J as 1:
,
which we’ll do below.)
If this function is selected,
it’s applied to the ''
dummy argument to get 1,
which is the correct result for factorial 0
.
When y
is 1, the @.
primitive picks out the
1st gerund in the gerund array,
which is ((y * factorial (y - 1))&[)
This is made up of two parts.
The (y * factorial (y - 1))
part
computes the factorial of interest using recursion.
The &[
part returns that value for any argument,
including the dummy argument ''
.
So why doesn’t this work?
It has to do with J’s evaluation model.
When creating a gerund from a function,
the code representing the function
(here, ((y * factorial (y - 1))&[)
)
is evaluated as far as possible.
For instance, if the code was (2 + 2)&[
,
it would be evaluated to 4&[
before creating the gerund.
So the code (y * factorial (y - 1))
gets evaluated
before the function gets created,
and this leads to an infinite recursion.
What we need is some way to delay evaluation of the
(y * factorial (y - 1))
expression
until it’s needed.
In a functional language like OCaml, this is trivial;
you can just define a function taking a unit argument (say)
which wraps the expression of interest.
It would look something like this:
(fun () -> y * factorial (y - 1))
But J can’t do this.
First of all, J doesn’t allow nested function definitions
(unless you encode the function as a string and evaluate it,
which we’ll do below).
And also, in OCaml, functions are closures,
which means that they have access to variables in outer scopes
(not just global scope).
Here, we need access to the y
variable,
which is defined outside of this function.
J doesn’t support closures either.
So we need a different approach.
There is a really hacky way to do what we’re trying to do. It’s kind of cringe-inducing, but I’ll show it to you anyway. (Later, we’ll discard this and do things the Right Way™.) Here goes:
factorial =: {{
a =: y
ff =. (('1'&[)`('a * factorial (a - 1)'&[) @. (a > 0)) ''
(3 : ff) ''
}}
What I’m doing here is taking the true case and false case function code
and representing them as strings.
(I’m also replacing y
with a
; I’ll explain why below.)
You can certainly use the @.
operator to pick out
one of two functions which return a string on any input:
(('foo'&[)`('bar'&[) @. 0) ''
foo
(('foo'&[)`('bar'&[) @. 1) ''
bar
Here, we’re encoding the code we want as a string:
(('1'&[)`('a * factorial (a - 1)'&[) @. 0) ''
1
(('1'&[)`('a * factorial (a - 1)'&[) @. 1) ''
a * factorial (a - 1)
Now here’s the kicker: we can evaluate that code on-the-fly
to get the result we want!
That’s what the (3 : ff) ''
expression does.
The :
primitive is called “definition”
and has a whole lot of uses related to defining various kinds of things in J.
When its left argument is the number 3
,
it defines a monadic (one argument) function,
which is what we want here,
since we want the expression in the string to be evaluated.
Technically speaking, we’re defining a function
which takes one argument (called y
),
ignores that argument,
computes the value of an expression and returns it.
(This is kind of like the OCaml code shown above.)
This is why we had to rename y
to a
;
if we didn’t, y
would represent the right argument
of the newly constructed function
(since y
always means this in an explicit function definition),
not the original value of y
.
Since this is a function,
we still have to give it an argument,
so we give it the dummy argument ''
,
which is ignored when we evaluate the body of the function.
Now, you might wonder how we avoided an infinite recursion this way.
It’s because we only created the function containing the recursive expression
when we actually wanted the value of that expression
i.e. when the argument to factorial
was greater than 0.
When it’s 0, the recursive expression
is never converted into a function and evaluated.
OK, so this works, but it’s extremely ugly
and certainly not the intended use of the @.
primitive.
So what is the right way to do this?
To understand this, we have to back up and realize that
the @.
primitive is picking out a different function to call
based on some condition
(here, whether the input argument was zero or not).
Whatever information this function needs has to be passed into it;
you can’t just paste in the expression you want
and hope for the best, as we’ve seen.
The only information available is the value of
the argument to factorial
i.e. y
.
So the functions are going to take y
as their argument.
The structure of the entire expression will therefore be
(schematically):
NB. This is not valid J syntax.
({false case function}`{true case function}@.(y > 0)) y
The y = 0
case is easy: you just want to return 1.
That means you need a function which returns 1 no matter what the argument.
(The argument will always be 0 when it’s called, but we don’t care.)
We could use 1&[
as the function.
J also has a primitive called 1:
which does the same thing:
(1&[) 10
1
1: 10
1
Since the y
argument will always be 0 when this function is called,
we could alternatively use the >:
primitive (increment),
which would return 1 when applied to 0.
But we’ll stick with 1:
because it’s clearer.
For the y > 0
case,
we need to figure out how to write y * factorial (y - 1)
as a function.
It turns out that we can reuse the technique we used previously:
(3 : 'y * factorial (y - 1)')
This is a function defined by evaluating the string
'y * factorial (y - 1)'
.
Because of the left argument of 3,
this will be a monadic (one argument) function.
It’s used like this:
factorial =: {{
(1:`(3 : 'y * factorial (y - 1)')@.(y > 0)) y
}}
Don’t confuse the two uses of the y
variable!
One is the argument to factorial
(in the (y > 0)
expression and the final y
)
and the other is the argument to the internal function
(3 : 'y * factorial (y - 1)')
.
The first y
is passed on to the second one
during evaluation, but they are distinct variables.
(Think of the true case function as being like
the OCaml function fun y -> y * factorial (y - 1)
applied to a variable also called y
;
the two y
s are distinct.)
This function actually works:
factorial 0
1
factorial 10
3628800
On the other hand, this definition is still pretty clunky. You might wonder why we can’t just write it like this:
factorial =: {{
(1:`({{ y * factorial (y - 1) }})@.(y > 0)) y
}}
Unfortunately, the {{
}}
syntax doesn’t nest.
(You can do this kind of thing in
Dyalog APL, though).
Instead of that, we’ll try a different approach.
It should be possible to come up with a definition
of the (3 : 'y * factorial (y - 1)')
function
that doesn’t use y
.
In other words, we want a tacit definition!
So let’s just ask J to compute it
using the magic 13 :
idiom:
13 : 'y * factorial (y - 1)'
] * [: factorial 1 -~ ]
The ~
primitive is an adverb that flips the arguments of its
(dyadic) function operand
(like the flip
function in Haskell).
So ~-
is -
with the arguments flipped,
and 1 -~ ]
is equivalent to y - 1
,
with ]
(right argument) standing in for y
.
The [:
(“cap”) syntax composes factorial
with 1 -~ ]
to get factorial (y - 1)
.
Then the ] *
is equivalent to y *
to get
] * [: factorial 1 -~ ]
which is equivalent to
y * factorial (y - 1)
.
Let’s try it out:
factorial =: {{
(1:`(] * [: factorial 1 -~ ]) @. (y > 0)) y
}}
factorial 0
1
factorial 10
3628800
It works! But we can simplify it even more.
Getting rid of y
The y
explicit argument is now only used in two places.
Can we get rid of it? Yes!
First, let’s make a simple change:
factorial =: {{
(1:`(] * [: factorial 1 -~ ]) @. ((0 < ]) y)) y
}}
All we’ve done is change (y > 0)
to ((0 < ]) y)
.
The 0 < [
function is actually a fork;
it’s converted to 0: < [
where 0:
is the constant function that always returns 0,
so (0: < ]) y
evaluates as ((0: y) < (] y)
which is just (0 < y)
.
We could also have written it as (] > 0:)
,
but not as (] > 0)
since the promotion from 0
to 0:
doesn’t work on the rightmost argument.
(Yes, it’s complicated!)
So far,
we’ve been using the @.
primitive with an integer as its right argument.
But there is another use case of this primitive:
when the right argument is a function (verb).
In that case, the pattern fs @. v y
(where v
represents a verb)
is equivalent to (fs @. (v y)) y
.
Using this pattern, we can simplify the definition to:
factorial =: {{
(1:`(] * [: factorial 1 -~ ]) @. (0 < ])) y
}}
Now we only have one y
in the definition.
The outermost parentheses are unnecessary, so let’s remove them:
factorial =: {{
1:`(] * [: factorial 1 -~ ]) @. (0 < ]) y
}}
And now, we don’t need an explicit definition at all; we can write it as a purely tacit function:
factorial =: 1:`(] * [: factorial 1 -~ ]) @. (0 < ])
The 0 < ]
function returns 0 for an argument of 0
and 1 for a positive argument.
We can replace this with the (monadic i.e. unary) *
primitive,
which means “signum”:
* 0
0
* 1
1
* 42
1
* _10
_1
We aren’t using negative numbers here, but it will work fine with 0 or positive numbers. This gives us this definition:
factorial =: 1:`(] * [: factorial 1 -~ ]) @. *
This is already pretty short, but we have a few tricks left.
J has a decrement primitive called <:
;
we can use it to replace 1 -~ ]
:
factorial =: 1:`(] * [: factorial <:) @. *
We can also use the composition primitive @
(“atop”)
in place of the cap [:
:
factorial =: 1:`(] * factorial @ <:) @. *
We could also have used the other composition primitive @:
(“at”)
in place of @
;
both work here, but we’re going for the most concise solution.
Now comes a weird one.
Observe that the expression ] * factorial @ <:
is a fork of the three functions ]
, *
, and factorial @ <:
.
When applied to an argument y
,
it evaluates as follows:
(] * (factorial @ <:)) y
--> ((] y) * ((factorial @ <:) y)
--> y * (factorial (<: y))
--> y * (factorial (y - 1))
as desired. But we can leave off the ]
on the left to form a hook.
A hook is a train of two functions,
and the “monadic” hook (which is what we are using here)
has the interpretation:
(f g) y --> y f (g y)
So if we evaluate * (factorial @ <:)
applied to y
,
we get:
(* (factorial @ <:)) y
--> y * ((factorial @ <:) y)
--> y * (factorial (<: y))
--> y * (factorial (y - 1))
which is the same thing. So we can simplify our definition to:
factorial =: 1:`(* factorial @ <:) @. *
Now the only thing left to tackle is the factorial
name.
The $:
primitive (“self-reference”)
You might ask why we even care about getting rid of the
factorial
name from inside the definition.
After all, it is a recursive definition,
so you’d expect to see the name replicated inside the body of the function.
However, if we wanted to use the function anonymously, we couldn’t;
we’d have to define the name to make it work.
(Admittedly, in most cases this isn’t much of a hardship.)
In functional languages,
we use the
Y combinator
to create anonymous recursive functions.
This is much less straightforward in J
because J doesn’t have closures,
but there is an easy alternative that works in many cases:
the $:
primitive (“self-reference”).
Inside a single-phrase function (verb) definition
(which is what we have here),
$:
stands for the function being defined.
So we can swap out factorial
for $:
to get:
factorial =: 1:`(* $: @ <:) @. *
This works, with or without names:
factorial =: 1:`(* $: @ <:) @. *
factorial 0
1
factorial 10
3628800
(1:`(* $: @ <:) @. *) 0
1
(1:`(* $: @ <:) @. *) 10
3628800
We can shrink it down by removing whitespace:
factorial =: 1:`(*$:@<:)@.*
and then we have a recursive definition of the factorial function in 14 characters.
Observations
What do I think of this definition?
On the one hand, it’s incredibly terse, and (once you understand it) quite elegant. On the other hand, it’s extremely obscure, relying critically on such J features as hooks, forks, gerunds, and the agenda and self-reference primitives. This is a pretty high bar for most programmers to internalize.
It also has nothing to do with rank polymorphism,
which is the unique selling point of array languages.
There’s nothing particularly array-ish here,
other than the @.
primitive
(which doesn’t use rank polymorphism).
Other than the $:
primitive,
you could easily write this definition in OCaml:
let agendaf fun_array f n = fun_array.(f n) n
let compose f g y = f (g y)
let mhook f g y = f y (g y)
let one _ = 1
let decr n = n - 1
let signum n = if n = 0 then 0 else if n < 0 then -1 else 1
let rec factorial n =
agendaf [| one; (mhook ( * ) (compose factorial decr)) |] signum n
You could even remove the recursion by standard techniques:
let rec y f x = f (fun z -> (y f) z) x
let factorial =
y (fun f n -> agendaf [| one; (mhook ( * ) (compose f decr)) |] signum n)
Of course, this is a lot more than 14 characters. So this example primarily highlights J as a notationally-concise language.
Ken Iverson was strongly influenced by both combinatory logic and John Backus’ FP language, both of which are mainly concerned with building up functions from other functions. This is reflected in the use of forks, hooks, agendas etc. in J. It’s a very elegant model of computation, but it comes with a high learning curve.
And there is a strange aspect to it.
If you want to write a function in a Python-like explicit form,
J provides all the tools to do that.
If you want to write a function in an incredibly compressed
function-level form (similar to FP or combinatory logic),
J provides all the tools to do that too.
But if you want to write a function in a way
between those two extremes, J is not very pleasant to use.
Even the {{
and }}
function brackets were only added to J very recently;
before that, you would have to define an explicit function like this:
factorial =: 3 : 0
if. y = 0 do.
1
else.
y * factorial (y - 1)
end.
)
(Yes, that’s a bare close paren on the last line!) For explicit one-liners, you would have to write e.g.:
factorial =: 3 : '(1:`(] * [: factorial 1 -~ ]) @. ((0 < ]) y)) y'
which is also not very nice. So apparently, writing functions with explicit arguments wasn’t a priority when J was being developed. I think Ken hoped that programmers would gravitate to function-level programming, as he had. But this is a big hurdle, and in my opinion, is much easier to tackle once concise explicit definitions are mastered first.
An instructive contrast is with the APL language, as represented by Dyalog APL. After Ken left the APL world to develop J, APL continued its evolution on its own. In particular, John Scholes developed the syntax for “direct functions” (also called “dfns”). Here’s what the factorial function looks like in Scholes’ notation:
fact ← {0 = ⍵ : 1 ⋄ ⍵ × ∇ ⍵-1}
Here, ⍵
represents the right argument to the function.
The definition states that if the argument is zero,
return 1.
Otherwise, multiply the right argument by
the recursive call (using the ∇
self-reference primitive)
of the right argument minus 1.
It’s completely comprehensible once you understand the symbols,
it uses an explicit argument,
and it clocks in at (wait for it) 14 non-whitespace characters!
This is the intermediate level between fully explicit definitions and compressed function-level definitions that J has yet to emulate successfully. (I really hope they do someday!)
As a result of this, I think that APL developers spend much less time thinking about function composition, trains, gerunds etc. than J programmers do. APL programmers can make very terse definitions just by using a fairly conventional style of programming. And they have full access to rank polymorphism as well.
As for J, the two biggest omissions that make it hard to write concise function definitions are:
- no nested functions,
- no closures.
With nested functions, we could have written our factorial function like this:
factorial =: {{
({{ 1 }}`({{ y * factorial (y - 1) }})@.(y > 0)) y
}}
This is reasonably easy to understand. and if closures were supported as well, we could write something like this:
factorial =: {{
a =: y
({{ 1 }}`({{ a * factorial (a - 1) }})@.(y > 0)) ''
}}
and it would work. (Neither APL nor J support closures.)
I still don’t like using @.
to select functions
instead of expressions
(requiring the ''
dummy argument),
but fixing that would require a pretty major refactoring
of the syntax.
If J were to go down that road,
a better plan in my opinion would be to adapt
Scholes’ APL notation.
Oh well, enough griping. Languages are the way they are because of a long process of gradual evolution. J has some beautiful stuff in it, but it’s also got some warts.
Summing up
I feel like array programming is a great idea, but we haven’t created the ideal array language yet. As a functional programmer, I see no reason why we can’t have functional features like closures, nested functions, tail call optimization etc. while also having rank polymorphism and a concise syntax. The research language Remora is a functional rank-polymorphic language which could serve as the basis for such a language. (The name “Remora” is the Latin name for sucker fishes; perhaps the authors are cautioning us against taking it too seriously?) Remora has everything I want in an array language with the exception of a concise syntax, but you could imagine using it as the back end for a language with a syntax that was similar to J. There’s lots of interesting work to do in this space!