Array programming, part 1

Ken Iverson, inventor of APL and J

There are a number of well-known programming paradigms, such as imperative programming, object-oriented programming, and (my favorite) functional programming. Most experienced programmers will have some experience in all of these. There are also some less well-known paradigms, such as logic programming (e.g. Prolog) that many programmers have heard of but few have actually used (I include myself in that list).

I want to talk about a different paradigm which at one time was fairly popular, then almost completely disappeared, and now may be on the cusp of a resurgence. That paradigm is array programming, and it’s exemplified by languages like APL, J, K, BQN, and a few others.

It’s unfortunate that comparatively few programmers are even aware of this paradigm. Despite this, array programming is a fascinating take on the whole notion of programming, and is well worth learning. It’s also a lot of fun!

Alan Perlis had a famous epigram to the effect that “A language that doesn’t affect the way you think about programming is not worth knowing.” I’m pretty sure he was speaking about array programming (specifically APL) when he wrote that. So let’s see if we can figure out what he meant.

What is a “programming paradigm”?

First off, we have to clarify the ill-defined notion of a “programming paradigm”. This is a way of dividing programming languages into groups which are in some way fundamentally similar to each other and fundamentally dissimilar to other languages. But what kinds of similarities and dissimilarities are we talking about?

Programmers who learn a few languages often notice that the languages they’ve learned differ primarily by their syntax. Compare (say) Python and Ruby, or Java and C#, and you’ll see:

  • tremendous swaths of overlap in the language semantics and the programming process
  • some syntax differences
  • some features present (or built-in) in one language but absent (or not built-in) in the other

and few other differences. So if you know one of them, it’s easy to learn the other.1

On the other hand, if you know Python it’s not as easy to learn Java, since Java is a statically-typed language, whereas Python is dynamically-typed.2 Does this mean that Python and Java are languages in different paradigms?

The answer is no, because you could take a Python program, add types to it (using mypy for instance), and be working in a language fairly close to Java. The presence or absence of types (at least, of the Java variety), isn’t a fundamental enough difference to separate two languages into two paradigm-groups.

In contrast, if you compared C (an imperative language) with Haskell (a functional language), almost nothing about the programming process would be the same. Sure, both languages have functions and operate on some kind of data, but that’s about it. The whole way you approach problems in C is completely different from the way you approach problems in Haskell. In contrast, the way you approach problems in OCaml (another functional language) is quite similar to the way you approach problems in Haskell. So both Haskell and OCaml share a “paradigm” (the functional paradigm) whereas C is in a different paradigm (the imperative paradigm). The functional paradigm is concerned mainly with function application and using functions as data, whereas the imperative paradigm is mainly concerned with updating mutable variables over time.3

I want to argue that array programming is a separate paradigm. Even though the “array paradigm” overlaps a lot with both the imperative and functional paradigms, and you can program in a basically just imperative or just functional manner in array languages, the array approach is fundamentally different to either of those paradigms. (This is one reason why array programming is best appreciated in languages specifically designed to embody this paradigm.) This also makes array languages quite hard to learn to use in an idiomatic manner, because (as with all new paradigms) you have to learn to think in a completely new way.

The key distinguishing feature of the array paradigm is that program elements involve aggregates of data instead of on individual pieces of data. Sounds simple, right? Let’s see what that looks like.

Simple examples

I’m going to show some simple examples of array programming in the J language, since it’s the array language I’m most familiar with. Also, unlike APL, J only uses the ASCII character set (which means characters on traditional keyboards). Expressions are entered in a REPL indented three spaces, and the output is displayed below unindented.

In most array languages, including J, you can enter literal arrays by typing scalar values separated by spaces:

   1 2 3 4 5  NB. <-- you enter
1 2 3 4 5     NB. <-- the interpreter prints out

The NB. is the way you write a comment in J. You’ve just entered a five-element array of integers, and J has evaluated it (to itself) and printed it out. The interpreter “prompt” is just a three-space indent.

This syntax does require that the elements are all of the same type, so you can’t mix e.g. characters and numbers. However, you can mix different kinds of numbers:

   1 2.2 3 4.4
1 2.2 3 4.4

All numbers in such an array get promoted to the “larger” type (here, floating-point numbers). Booleans are represented as the numbers 0 and 1; this will become important later on.

Only one-dimensional arrays can be entered as literals. If you want to enter a multidimensional array, you have to enter a one-dimensional array of its contents and then reshape it with the $ reshape operator:

   2 3 $ 1 2 3 4 5 6
1 2 3
4 5 6

The $ operator reshapes the array on its right according to the shape array on its left. So this code reshapes the 1 2 3 4 5 6 array to have the “shape” 2 3 i.e. two rows by three columns. The output shows the actual shape of the array.

Arrays can be added element-by-element without any extra syntax:

   1 2 3 + 4 5 6
5 7 9

This is the first really distinctive feature of array languages we’ve seen. There’s no need for loops or a map operator when applying functions to equal-sized arrays, because the “looping” or “mapping” is implicit. (Loop constructs do exist, but you rarely need them.)

This also works for multidimensional arrays:

   (2 3 $ 0 1 2 3 4 5) + (2 3 $ 1 1 1 2 2 2)
1 2 3
5 6 7

The i. (iota) operator generates arrays of consecutive elements.

   i. 6
0 1 2 3 4 5

It can also generate multidimensional arrays:

   i. 2 3
0 1 2
3 4 5
   i. 2 3 4
 0  1  2  3
 4  5  6  7
 8  9 10 11

12 13 14 15
16 17 18 19
20 21 22 23

The result of i. 2 3 4 is a three-dimensional array. Note the way it’s displayed.

You can add a scalar to an array:

   1 + i. 6
1 2 3 4 5 6
   1 + i. 2 3
1 2 3
4 5 6

This is another key feature of array languages: operators that work elementwise on dissimilar arrays can often (but not always) scale up the smaller array to match the larger one. Generally, the result is what you would want.

Array languages have a lot of operators, so to make this more manageable, they have very simple operator precedence rules. All operators act on the entire expression to their right. For instance:

   2 3 $ 1 + i. 6    NB. same as: 2 3 $ (1 + (i. 6))
1 2 3
4 5 6

You can assign expressions to names:

   a =: 2 3 $ 1 + i. 6
   a
1 2 3
4 5 6
   1 + a
2 3 4
5 6 7
   2 3 + a
3 4 5
7 8 9

The =: is not a typo; that is in fact the (global) assignment operator.

The last expression requires some explanation. If a is a 2x3 array, then 2 3 + a will add 2 to the first row of a and 3 to the second row. This is called prefix agreement and can get a bit complicated. In this case, the 2 3 array is converted to:

2 2 2
3 3 3

before being added to:

1 2 3
4 5 6

If you wanted to add a one-dimensional array (also known as a vector) to a two-dimensional array (also known as a matrix or table), then by default it is added using prefix agreement, which (in this case) means that the vector must have the same number of elements as the number of rows in the matrix, and each element of the vector will be added to the corresponding row of the array.

If you wanted to add each element of a vector to the corresponding column of the array, you have to use the “rank conjunction”, which is just the double-quote character ":

   a =: 2 3 $ 1 + i. 6
   a
1 2 3
4 5 6
   1 2 3 + a
|length error
|   1 2 3    +a
   1 2 3 +"1 a
2 4 6
5 7 9

J refuses to add 1 2 3 to a because a doesn’t have three rows. You can add it columnwise by modifying the + operator using the rank conjunction ("1) to make the derived operator +"1, which means “add columnwise”. Understanding how rank works is probably the deepest idea in array languages, and I’ll have more to say about this. Note that what this complexity buys you is not having to define explicit loops to do operations on multidimensional arrays.

Finally, there are a number of operators (called adverbs and conjunctions in J) which modify the behavior of functions (which are called verbs).4 Functional programmers will recognize these as higher-order functions, but they form a separate syntactic class in array languages, and in general are not quite “first class”. One of the most common ones is the insert adverb (often known as “over”), which is like a fold in functional programming. It inserts an operator between elements of an array.

   0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9
45
   +/0 1 2 3 4 5 6 7 8 9   NB. Same thing.
45
   +/i.9   NB. Same thing.
45

Adverbs combine syntactically with verbs (functions) on their immediate left, so the adverb / combines with the verb + to form +/, which sums the elements of an array. Note how incredibly concise the syntax is.

One of the curious aspects of this syntactic arrangement is that the syntax of array languages is context-dependent: in order to parse an expression, you have to know which operators are regular functions and which are operators. Since you can define new operator-like entities as names, you have to look up whether a name corresponds to a variable (a noun), a regular function (a verb), or an operator (an adverb or conjunction) before you can correctly parse an expression containing that name. This rubs me the wrong way, but it works out reasonably well in practice.

Conjunctions combine two verbs or a verb and a noun to get a derived verb. They are written between the two verbs or the verb/noun pair, like regular operators. We’ve already seen the " conjunction, which combines a verb and a noun to get a verb. The @: (“at”) conjunction composes two verbs to get a derived verb:

   +: 10  NB. "double" operator
20
   *: 10  NB. "square" operator
100
   (+: @: *:) 10
200       NB. "double at square" i.e. twice the square
   +:@:*:10    NB. You can drop parens and whitespace here.
200

I’ll show more array code later, but first I’d like to talk about where array programming came from.

Some history

Iverson notation

Array programming was invented by Ken Iverson. Ken grew up in Alberta (fellow Canadian!) and studied at Harvard under Howard Aiken, one of the earliest computer pioneers. At the time, they needed a concise notation for describing algorithms. They originally tried using normal mathematical notation, but that turned out to be insufficient for their needs, so Ken5 invented his own. This notation was creatively referred to as “Iverson notation”, and in fact contained enough features to be considered a full-fledged programming language. It was heavily influenced by matrix algebra, since Ken’s Ph.D. thesis work used a lot of matrix algebra. Ken published a book about the notation called A Programming Language.

The first implementations

At that point (1962), the language described in the book was purely a notation; no interpreters for the language existed. That changed in 1966, when the language was implemented on the IBM System/360 mainframe computer. The language was called “APL”, after the initials of the book.

Compared to other languages of the time (or now!), APL’s syntax was incredibly terse and symbolic. For instance, here is APL code which (inefficiently!) computes an array of all prime numbers from 1 to R:

(~RR∘.×R)/R1↓⍳R
Note
In J, this would be written as: (-.R e.,*/~R)#R=.2}.i.R.

I’ll explain this in detail later, but what jumps out is that APL contains many symbolic operators not found in most programming languages (and not typeable on a standard keyboard!). IBM even offered a special “APL typewriter ball” for use on its keyboards with interchangeable balls used to select different character sets. Along with this, keyboard overlays were common so you could see which APL characters you were typing.

The cryptic nature of the syntax turned out to be highly polarizing. Advocates loved it because it allowed them to write programs in a fraction of the time it would take to write a program in FORTRAN or COBOL (the other popular languages of the time), and it was also one of the first interpreted languages (along with LISP). Others found it simply incomprehensible (which it certainly is if you haven’t learned what the characters mean!). I’ll have more to say about the syntax issue in later posts.

Ken Iverson won a Turing Award for his work on APL. His Turing Award lecture “Notation as a Tool of Thought” can be read here.

Business data processing and time-sharing

APL settled into a useful niche in computation-heavy environments, especially since the alternative (FORTRAN) was non-interactive, compiled, and generally much harder to work with. First IBM, then other companies like I. P. Sharp in Toronto, started offering APL time-sharing services, which were for a long time highly lucrative. APL turned out to be very effective for business users, particularly for stock market data analysis. Ken Iverson eventually left IBM and worked at I. P. Sharp for many years. While there, APL continued to evolve under his watchful eye.

Spin-offs: K and J

APL’s non-standard character set proved to be a serious barrier to wider adoption of the language, so several APL variants sprung up which used conventional ASCII notation only.6 The best-known of these are K and J.

Arthur Whitney (another fellow Canadian!) worked with Ken at I. P. Sharp, then moved to other companies and developed a series of APL dialects. Most notably, he invented the K language, which used ASCII characters only, and later the Q language, which was a more “novice-friendly” version of K which also incorporated ideas from databases. K and Q have been used primarily in the financial community for time-series analysis of stock data. Arthur7 continues to develop new versions of K; the latest is called Shakti.

Meanwhile, Ken Iverson retired from I. P. Sharp in 1987 and began to work on a new dialect of APL which used only ASCII characters and which incorporated all the new ideas he and others had come up with. He called this language J, and implemented it with Roger Hui, who had also been at I. P. Sharp (and yes, he was also a fellow Canadian!). J kept the incredibly terse nature of APL, but since it was restricted to conventional characters, J code has a distinctive “line-noise” feel to it. I’ve shown you some simple J code above; here’s J code to computes e, the base of the natural logarithms:

e =: +/%!i.20

What this is doing is generating an array of integers from 0 to 19 (i. 20), computing their factorials (!), computing their reciprocals (%), summing them (+/), and assigning the result to the variable e. So this tiny bit of code is effectively summing an infinite series, or at least as much of it as is needed to compute the value desired. In fact, e can be computed more simply in J with just two characters: (^1), but the code above is a nice demonstration of the power and conciseness of array programming.

J incorporated many improvements that had been added to APL over the years, including a notion of function rank which generalizes the way that operators apply to arrays in a systematic way, and syntactic support for “trains”, a kind of super-powered point-free programming, which allows the user to compose functions in various useful ways. On the other hand, other improvements to APL, such as dfns (basically, anonymous functions) were not added to J until very recently.

The long decline

One reason that time-sharing worked so well for APL was that APL (and other array languages) tends to require quite a bit of memory. Nowadays this isn’t a problem, since memory is ridiculously cheap, but in the early days of personal computers, memory was very scarce. The first personal computers had less than 64K (!!!) of memory, which isn’t enough to do APL in any reasonable way. So as personal computers became adopted in greater and greater numbers, languages that could work in small memory spaces predominated. This included BASIC, C, Pascal, raw assembly language and a few others, but definitely not APL. APL still hung on in corporate environments that could afford to pay time-sharing fees, but it was bucking the trends of the industry. APL (and other array languages) are now extremely niche and exotic languages, and the vast majority of computer science students have never even heard of them.

A possible rebirth?

On the other hand, you can’t keep a good idea down forever, and since memory is now cheap, there is no technical reason why array languages couldn’t become popular again. More importantly, modern computers ship with GPUs (graphics processing units) that are basically special-purpose array programming units, so array programming, in principle at least, is a very good match for them.

In fact, you see the influence of array languages in modern software libraries such as NumPy (numerical Python), which was strongly influenced by J. One author referred to languages that incorporated aspects of array languages as “Iverson ghosts” This includes not just NumPy but Matlab, Julia, among others. So since programmers are being exposed to watered-down versions of array languages, is it possible that some of them might be interested in learning the real thing?

For reasons I’ll get into later, I very much doubt that array languages will ever become broadly popular, but I do see some signs of an uptick of interest.

One indication of this is that array programming podcasts have started springing up, starting with the excellent ArrayCast podcast, hosted by Conor Hoekstra, and more recently the APL Show, hosted by Adám Brudzewsky & Richard Park. Anyone interested in array languages could hardly do better than to listen to these podcasts; I’ve learned an incredible amount from them.

Another thing that has increased the profile of array languages is the Dyalog APL problem solving competition, sponsored by Dyalog APL, which is (to the best of my knowledge) the only commercial APL implementation remaining. Dyalog also produces a lot of high-quality free tutorial material, including the very convenient TryAPL online APL interpreter. Dyalog’s products are also available for free for non-commercial use. The J language is also freely-available, and the J wiki is a very useful resource for that language.

Coming up

I’m not done talking about array languages yet! In the next installment, I’m going to try to dig down into what exactly makes a language an “array language”, as opposed to a language which happens to have good support for array computations (like Python with NumPy). After that, I want to show some more examples of array code, highlighting the ways that the array programming paradigm is fundamentally different from other paradigms.


  1. Actually, maybe not that easy. When two languages are extremely similar, they interfere with each other, so small errors can be maddeningly hard to avoid. ↩︎

  2. Academics prefer the term unityped, meaning there is only one (formal) type, but we’ll ignore that here. ↩︎

  3. This is a gross oversimplification, of course, but bear with me. ↩︎

  4. The J language uses linguistic terms like noun, verb, adverb, and conjunction to identify language constructs, whereas other array languages like APL use more conventional terms like function, operator, etc. ↩︎

  5. I never met him, but I’m going to call him “Ken” despite that, because “Iverson” feels too formal. ↩︎

  6. Nowadays we have Unicode, which contains all the APL characters. However, most keyboards still only support inputting conventional characters. In practice, APL implementations use multi-character sequences to enter the unusual characters. ↩︎

  7. Yup, I’m first-naming again. ↩︎