Camels All the Way Down

In this post, I’ll present (in a somewhat playful manner) a common critique of Haskell, which famously has a lazy evaluation semantics, and see how it can also apply to OCaml, which is eager. In other words, I take the cheap route and bash various programming languages to get readers. To that end, let me start with a parable.


The parable of the lazy programmer

Having studied programming languages at CMU and worked at a certain proprietary trading firm, I’ve now been firmly indoctrinated into the Cult of ML,1 a family of functional programming languages. Originally this was Standard ML, but nowadays I write more OCaml.

Anyway, my wise mentors have always warned me about straying too far into the error of the Haskell programmers, lest I become swayed by their heresy and enter into apostasy against the Cult of ML.2 “Lo!” they warned me, “the central vice of Haskell lies in its fundamental laziness.3 Believe not, therefore, their wicked lies!”

The most oft-cited issue with pervasive laziness is that it obscures the runtime behavior of code. Because of this, it’s quite hard to analyze the time and space complexity of any nontrivial Haskell code. Additionally, exceptions never show up quite where you expect them to, and debug print statements need to be handled with care.4 There are some subtler reasons why we might not like laziness, too: for example, Hask is famously not a category, and one way to see this involves the built-in seq function, which lets the programmer selectively force eager evaluation.
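
To make the complexity complaint concrete, here is a small, well-worn sketch of my own (the function names are mine, not drawn from any particular source): two left folds that are both linear-time on paper but behave very differently in memory, plus a hand-rolled sum that uses seq to force the accumulator.

-- Haskell

import Data.List (foldl')

-- Both folds are "O(n)" on paper, but the lazy left fold first builds a
-- chain of unevaluated thunks (((0 + 1) + 2) + ...) and only collapses it
-- at the very end, which can exhaust memory on large inputs.
lazySum, strictSum :: [Integer] -> Integer
lazySum   = foldl  (+) 0   -- accumulator stays a growing thunk
strictSum = foldl' (+) 0   -- accumulator is forced at each step

-- seq evaluates its first argument to weak head normal form before
-- returning its second, giving the programmer selective eagerness.
forcedSum :: [Integer] -> Integer
forcedSum = go 0
  where
    go acc []       = acc
    go acc (x : xs) = let acc' = acc + x in acc' `seq` go acc' xs

-- Prints 500000500000.
main :: IO ()
main = print (strictSum [1 .. 1000000])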

“Stay your judgment but a minute,” the devilishly devious devotee of Haskell replies. “Yea, there may be certain trade-offs, but one benefit of laziness ye cannot deny: that both the finite and the infinite may be contained within a single type! With laziness, ye may fearlessly work with both the finite and the infinite. Be thralls no more to those who would limit your grasp to the finite.” The Haskell programmer, in his error, speaks of this:

-- Haskell

-- We can define an infinite list of natural numbers.
nats = [0..]

-- We can do normal operations on the infinite list, like filtering.
evens = filter even nats

-- Prints out [0,2,4,6,8].
print $ take 5 evens

This is, prima facie, a strong argument. The ability to deal with infinite data structures just as easily as finite ones seems like a strong selling point of Haskell.

But Harper,5 the defender of the ML orthodoxy, is quick with the reply: “Not so! The Haskell programmer would erase the distinction between the finite and the infinite: a grave error indeed. For how can one speak of the finite if it is always liable to be confused with the infinite? Or what communion can the finite have with the infinite which does not destroy them both? Were we not taught by the Ancients the method of induction, which may be properly applied only to the finite? To mix the two cardinalities is a cardinal sin indeed, for with it we must forsake the most noble method of induction.”

The point here is somewhat subtle: one of the great benefits of functional languages is that we can rigorously reason about our code using structural induction. In our introductory functional programming class at CMU, we have the students write many such proofs, to the point that they probably get sick of it. But there’s a bit of a problem here: induction can only be done on a type whose elements all have finite “depth,” so to speak. Any first-year undergraduate will be familiar with the problem: a proof by mathematical induction can only show that some proposition holds for every natural number. It cannot show that it holds “at infinity,” whatever that means.6

Now if you’ve done any functional programming before, you’ve probably seen how a good type system makes it easy to express concepts in code, and the beautiful thing is that we can prove theorems about inductively defined types using structural induction (I’ll sketch one such proof just after the definition below). For example, we can define the natural numbers of good old Peano arithmetic like so:

-- Haskell

data Nat = Zero | Succ Nat

three = Succ $ Succ $ Succ Zero

-- Less than or equal
leq :: Nat -> Nat -> Bool
leq Zero _ = True
leq _ Zero = False
leq (Succ n) (Succ m) = leq n m

-- Prints True, then False
print $ leq Zero three
print $ leq three Zero
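
To give a flavor of the structural-induction arguments mentioned earlier, here is a quick sketch of my own (in the style of such a course) of a proof that leq is reflexive, reusing the Nat and leq just defined; the helper reflUpTo is a made-up name and only a finite sanity check, not part of the proof.

-- Haskell

-- Claim: for every n :: Nat built from finitely many constructors,
-- leq n n = True.
--
-- Proof by structural induction on n.
--   Case Zero:    leq Zero Zero = True                  (first clause of leq)
--   Case Succ n': leq (Succ n') (Succ n') = leq n' n'   (third clause of leq)
--                                         = True        (inductive hypothesis)
--
-- A finite sanity check of the first k cases (not a proof, of course):
reflUpTo :: Int -> Bool
reflUpTo k = all (\n -> leq n n) (take k (iterate Succ Zero))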

But now here’s the “gotcha” moment: we would like to think of Nat as the type of natural numbers, in the sense that every value n :: Nat is a natural number. Indeed, Zero corresponds to 0 and three corresponds to 3. What about the following value of type Nat?

-- Haskell

-- What natural number is this?
ω = Succ ω

-- We can do useful computations with infinity, like determining that
-- it's at least as large as three.
print $ leq three ω

This is ω, the fixed point of the successor function (i.e. the “number” that is its own successor, an infinite ordinal). This is not a natural number, despite being a value of type Nat! Because of this, the type Nat cannot properly be said to be the type of natural numbers. Likewise, the “list” type [a] in Haskell actually contains much more than just lists; it also contains infinite streams whose elements are of type a. This, one might imagine, is a rather severe deficiency, and a source of some embarrassment for the Haskell programmer.

The hypocrisy unveiled

Upon hearing this, the Haskell programmer is momentarily taken aback. “Touché,” he acknowledges, “I may concede the merit in your point.” Yet ever quick on his feet, he continues thus: “But this cannot be considered a point in favor of your position, O arrogant eager programmer! For behold, does not this OCaml code exhibit the same behavior?”

(* OCaml *)

type nat = Zero | Succ of nat

let rec leq n m =
  match (n, m) with
  | (Zero, _) -> true
  | (_, Zero) -> false
  | (Succ n', Succ m') -> leq n' m'

(* Uh oh... *)
let rec omega = Succ omega

(* We can show that three is less than infinity! *)
let three = Succ (Succ (Succ Zero))
let true = leq three omega

The OCaml programmer hangs her head in shame. “Alas, this I must confess: thou hast made plain a shameful mistake in our blessed language’s design. We have too freely allowed some expressions to inhabit the right-hand side of a recursive let binding.7 Yet I would submit this to thee: the mistake is not so great as it is in Haskell.” Why might this be? For one, this class of counterexamples is actually quite small in OCaml; if we make the situation even a little trickier, the compiler will complain about it:

(* OCaml *)

let succ n = Succ n

(* The compiler rejects this: this kind of expression is not allowed
   as the right-hand side of a `let rec` *)
let rec omega' = succ omega'

But another, more fundamental difference is this: in Haskell, the type Nat contains not only the “conatural” numbers (i.e. the natural numbers extended with a point at infinity), but also many other divergent expressions wrapped in a Succ call!

-- Haskell

loop () = loop ()

-- This value cannot be thought of even as a conatural number
diverge = Succ $ loop ()
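
One way to see that diverge is even less well-behaved than ω (a small experiment of my own, reusing leq, three, ω, and diverge from above; the binding names are made up): any genuine conatural number, ω included, still lets leq produce an answer when compared against a finite bound, whereas diverge makes it hang.

-- Haskell

-- ω has a constructor at every level, so leq can keep peeling it off
-- until the finite argument runs out; this evaluates to False.
omegaAnswers :: Bool
omegaAnswers = leq ω three

-- diverge does not: after one step, leq must force the loop () hidden
-- under the Succ, and that computation never reaches a constructor.
divergeHangs :: Bool   -- evaluating this never terminates
divergeHangs = leq diverge three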

In OCaml, this is not the case. Due to eager evaluation, the attempt to build such a value simply never finishes, so no value of type nat can hide a divergent computation inside it.

(* OCaml *)

let rec loop () = loop ()

(* This will run forever *)
let diverge = Succ (loop ())

So although yes, the behavior of let rec declarations in OCaml was arguably a mistake, the language’s eagerness at least guards us somewhat: the resulting infinities are limited to a relatively “well-behaved” sort. In Haskell, pervasive laziness leads to even more disastrous effects, and all bets are off.


  1. Here I mean “meta language,” not “machine learning.” I hate that I always have to make this distinction; now I know how cryptographers feel every time they have to explain that “crypto” means “cryptography,” not “cryptocurrency.”↩︎

  2. I’m joking, of course. I owe a lot to Haskell: it was actually the first functional language that I learned. When I was a high school student, it was somewhat trendy to learn functional programming, so I went through a few Haskell tutorials and was immediately hooked.↩︎

  3. In fact, according to Catholic doctrine, sloth—laziness—is one of the seven cardinal sins!↩︎

  4. Well, I’m aware that there are libraries that make debugging pretty easy.↩︎

  5. Bob Harper, whose class on programming languages I’m taking right now. Actually, I happen to be procrastinating on doing my homework for that class by writing this post.↩︎

  6. Well, one can study such a thing as coinduction, but the point is that this is dual to, not the same as, induction.↩︎

  7. I actually asked this question on the course Piazza for 15-312, and a TA linked the relevant language extension page.↩︎

