Exceptions as a Universal Type

A while back, Stephen Weeks wrote a blog post containing the following fun little OCaml challenge: implement a “universal” type into which all other types can be embedded. Essentially, implement the following module type:

module type Univ = sig
  type t

  val embed : unit -> ('a -> t) * (t -> 'a option)

That is, embed () returns a pair (inj, proj) of functions that inject into the universal type t and project out of it respectively. This is actually quite a fun little challenge; I’d encourage you to give it a try if you haven’t seen it before. Stephen suggests the following test cases:

open! Core

module Test (U : Univ) = struct

  (* Needed because Core hides the built-in polymorphic compare. This
     isn't essential to our tests, but I wanted to preserve
     compatibility with the original blog post. *)
  let ( = ) = Poly.( = )

  let of_int, to_int = U.embed ()
  let of_string, to_string = U.embed ()
  let r : U.t ref = ref (of_int 13)

  let () =
    assert (to_int !r = Some 13);
    assert (to_string !r = None);
    r := of_string "foo";
    assert (to_int !r = None);
    assert (to_string !r = Some "foo")

I haven’t seen the following solution posted on the Internet yet,1 so I’ll give what I think is the most obvious solution to this problem: using exceptions. One way to think of exceptions in ML languages like OCaml is that they’re constructors for an extensible sum type. What does this mean? When you define an ordinary variant (or “sum type”) in OCaml, the constructors are fixed at the moment the type is defined. Take a look at the following type:

type foo =
  | A
  | B
  | C

Once you’ve defined the type foo, you’re not allowed to go back and say “hold on, I want to add a constructor D.” This would make things obviously unsafe at runtime: a program that matches against foo only knows how to handle A, B, and C; it has no idea how to handle a D!

match (_ : foo) with
| A -> ...
| B -> ...
| C -> ...

Now polymorphic variants (available in OCaml but not in Standard ML) get you part of the way there, since they allow you to have some form of subtyping. You could have some types:

type bar = [ `A | `B | `C ]
type baz = [ `A | `B | `C | `D ]

Then you’re allowed to coerce a value of type bar into one of type baz, like this: (_ : bar) :> baz. But still, the compiler won’t allow you to pass something of type baz to a function that takes an argument of type bar, because this is obviously unsafe. You’re still not allowed to add constructors to a polymorphic variant after you’ve defined it, even if you are allowed to define a more general polymorphic variant.

So an extensible sum type isn’t achievable using ordinary variants or polymorphic ones.2 But wait: there’s a built-in OCaml type that behaves exactly like we want: the type exn of exceptions! An interesting perspective on exn is that it’s a variant of all the previously declared exceptions. When you declare a new exception, you’re effectively adding a constructor to this type, like so:

exception A
exception B
exception C

let _ : exn = A

But you can also add extra constructors:

exception D

let _ : exn = D

And of course, you can match on any exception that’s already been declared. In fact, when I took his programming languages class, Bob Harper claimed to us (after his typical manner of making outlandish but entertaining claims) that the name exn is a pun on “exception” and “extensible” for this very reason.

But back to the puzzle of the universal type. This perspective on exceptions allows us to see that in fact, OCaml already has a universal type, namely exn! So we can implement the module type Univ as follows:

module Make (M : T) = struct
  exception Embed of M.t

  let inj x = Embed x

  let proj x = function
    | Embed x -> Some x
    | _ -> None

module U_exn : Univ = struct
  type t = exn

  let embed (type a) () =
    let module Embedded =
      Make (struct
        type t = a
    Embedded.inj, Embedded.proj

(* It works! *)
module Test_exn = Test (U_exn)

Of course, this is not the only way to solve this puzzle. Another classic solution, based on this fifteen year-old Reddit thread, is to close over a ref cell hidden behind some kind of unique id:

module U_ref : Univ = struct
  type t = int

  let next_id = ref 0

  let embed () =
    next_id := !next_id + 1;
    let id = !next_id in
    let value = ref None in
    let set x =
      value := Some x;
    let get id' = if id' = id then !value else None in
    set, get

(* This also works *)
module Test_ref = Test (U_ref)

Personally, I find the exception solution much cuter than the ref cell solution, but you know, de gustibus and all that.

  1. Because I’ve said that, I know someone will soon email me to inform me that the solution I’m about to present can be found on some obscure blog post from 2010…↩︎

  2. There is, however, an OCaml language extension, introduced in OCaml 4.02, which adds extensible variants.↩︎