Exceptions are Ok

Let's look at how Results and Exceptions stack up to each other, and against ideal error handling ergonomics.

Nov 09, 2024

I used to see Exceptions in OCaml as a panic in Rust. It’d mean that something is absolutely fatal and the entire program should finish.

Then I wrote Riot and an uncaught exception became simply a terminated actor, so… throwing an exception wasn’t so bad anymore! Right?

Kind of.

Signaling and Intent

For most code, it is still very good to signal that something can fail in several ways, to let the caller decide whether they want to deal with this (so it’s really a recoverable error) or if they want to crash.

In Lisps and Ruby there’s this notation at the naming level where a function that ends with a bang (!) will mutate the value it’s called on.

It is quite nice to look at, and it doubles down on the idea that imperatives (!) are about mutating state.

However, when Elixir borrowed heavily from Ruby’s syntax, it repurposed this same symbol to signal that a function would raise an exception instead.

This is great! You read `File.read` and you’re expecting a result tuple, but if you read `File.read!` then you know this may blow up. And if you write `File.read!` then you are the one signaling that this operation is unrecoverable and it’s ok to crash.

After all in Elixir and in Riot, like in Erlang, a single actor crashing is no big deal.

But in OCaml, and in Riot, there’s no way to track which functions raise exceptions and which ones do not. We can only look at the return type: if it’s a `result` we can assume that this function will not raise, but if it isn’t… it might! This is because exceptions are not tracked at the type-level.

For what it’s worth, I’d be super okay with OCaml 6 introducing tracked exceptions at whatever break of compatibility it brings. I think the benefits of surfacing that information are quite big.

Instead we rely on a convention: end function names with `_exn`. Personally I think this is kind of ugly, and it’s not always followed anyway. I wish we could reuse the Lisp/Ruby/Elixir signal and just attach a bang (!) to our functions. (And if we got that, I wouldn’t mind using `?` for predicates either). But that would just make it a bit cleaner, not more reliable. In any case, at least there’s some convention around this!

So without a reliable way to signal the possibility of exceptions, we’ve overcorrected in favor of the Result type.
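To make the convention concrete, here is a minimal sketch. The `parse_port` and `parse_port_exn` functions are made up for illustration and don't come from any library. The point is that only the name tells you the second one can raise.

```ocaml
(* Hypothetical example: both function names are made up for illustration. *)

(* Result-returning version: the signature tells the caller it can fail. *)
let parse_port (s : string) : (int, string) result =
  match int_of_string_opt s with
  | Some n when n > 0 && n < 65536 -> Ok n
  | Some n -> Error (Printf.sprintf "port out of range: %d" n)
  | None -> Error (Printf.sprintf "not a number: %S" s)

(* Raising version: its type is just [string -> int], so the [_exn]
   suffix is the only hint that it may blow up with [Failure]. *)
let parse_port_exn (s : string) : int =
  match parse_port s with
  | Ok n -> n
  | Error msg -> failwith msg
```

Nothing stops anyone from naming the second function `parse_port` too, which is exactly the weakness of a naming convention.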

What does this look like?

The Ergonomics of Result

Result is both great and terrible. As a good ol’ monad, once you have one, if you wanna do things within it, you’re kind of stuck until you pattern match out of it. This used to be much more inconvenient with the use and abuse of operators like `>>=` and `>|=` and more. Ugh. I am not a fan of custom ops for everyday code.

In Elixir, we don’t really use infix operators, so a sample function to find a user in a database could look like this:

def find_user!(db, id) do
  conn = connect!(db)
  query = create_find_user_query!(id)
  query = prepare_query!(conn, query)
  row = execute!(conn, query)
  {:ok, user_row} = parse_user_row(row)
  map_to_user_type(user_row)
end

And the equivalent function that “threads the result” would use the language’s built-in `with` expressions.

def find_user(db, id) do
  with {:ok, conn} <- connect(db),
       {:ok, query} <- create_find_user_query(id),
       {:ok, query} <- prepare_query(conn, query),
       {:ok, row} <- execute(conn, query),
       {:ok, user_row} <- parse_user_row(row) do
    {:ok, map_to_user_type(user_row)}
  end
end

In OCaml we don’t have this syntax, and we are not fond of partial pattern matching anyways, so in practice this meant a lot of code looked like this:

let find_user db id =
  connect db >>= (fun conn ->
  create_find_user_query id >>= (fun query ->
  prepare_query conn query >>= (fun query ->
  execute conn query >>= (fun row ->
   parse_user_row row
   >|= map_to_user_type
  ))))

Which to my eyes looks terrible, and if you try to format it, it gets even worse. But in short, it’s trying to bind the resulting value from those operations (connect, execute, etc) into the variables to the right (conn, query, row).

Syntactical Improvements

The introduction of let-ops somewhat helped clean this up, since now we could define a custom version of `let` that did this transformation for us. In short, instead of:

connect db >>= (fun conn ->

Now you can write:

let* conn = connect db in

Which is a significant improvement in readability and ergonomics! Except that people still go out of their way to define custom variations of this: let?, let*, let+, let!, and there was work to support let.await and custom words in there too.

In practice tho most people stick to let*, which helps keep things sane.

Our example now starts to look a lot more like Elixir’s ! version, with the benefit that the result is threaded:

let find_user db id =
  let* conn = connect db in
  let* query = create_find_user_query id in
  let* query = prepare_query conn query in
  let* row = execute conn query in
  let* parsed_row = parse_user_row row in
  Ok (map_to_user_type parsed_row)

Ah, much much easier on the eyes, right?

Unifying Errors

Now that this result-threading is all hidden, a new concern appears. Before when we had to use the wonky operators we were exposed very early on to the idea that there’s a Result value we are tapping into, and so doing things with the error channel was just a matter of reaching out to another function: `Result.map_error`. It fits just fine.

However, in our tidy, neat `let*` example, mapping errors becomes a complete eyesore. We suddenly have these calls to `Result.map_error` where before we had absolutely nothing in sight about Results.

But why do we need to care about this? Because the definition of `let*` is the same as `Result.bind` (or `>>=`), which says that the error type should not change. So `connect` and `execute` and all our functions should return the same error type.
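A minimal sketch of that definition, in case you haven't seen it spelled out. The helpers `parse_int`, `safe_div` and `divide_strings` are hypothetical; they all fail with `string`, which is the only reason the chain type-checks.

```ocaml
(* [let*] is ordinary OCaml: a binding operator defined as [Result.bind]. *)
let ( let* ) = Result.bind

(* Its type pins a single error type ['e] for the whole chain:
   ('a, 'e) result -> ('a -> ('b, 'e) result) -> ('b, 'e) result *)

(* Hypothetical helpers that both use [string] as their error type. *)
let parse_int s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error ("not an int: " ^ s)

let safe_div a b = if b = 0 then Error "division by zero" else Ok (a / b)

(* The first [Error] short-circuits the rest of the chain. *)
let divide_strings a b =
  let* x = parse_int a in
  let* y = parse_int b in
  safe_div x y
```

If `safe_div` returned a different error type, say a variant, the last line would stop compiling.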

Not a big deal within the same small library, but as soon as you start mixing and matching functions that return results from different libraries, you will naturally end up with different error types.

For example, below we see 2 calls to 2 different libraries. Both of them have their own error type (`HttpClient.error` and `Db.error`). This makes the following example a type error:

let fetch_user_and_save db id =
  let* user = HttpClient.fetch_user id in
  let* db_user = Db.save_user db user in
  Ok ()

And one way to unify them is to create a wrapper error type.

Wrapping Errors

We can do this ahead of time with a variant:

type wrapper_error = 
  | HttpClientError of HttpClient.error
  | DbError of Db.error

This is in fact what you’d normally do with your errors in Rust, and is a pattern that’s encouraged and supported by libraries like `thiserror`, which is maybe the most popular error lib over there. You’ll see a lot of Rust code like:

#[derive(Error, Debug, Clone)]
enum MyError {

  #[error(transparent)]
  HttpClientError(#[from] http_client::Error),

  #[error(transparent)]
  DbError(#[from] db::Error),

}

The downside of this is that we have to maintain that error type ourselves, and do the conversions manually. So our code becomes:

let fetch_user_and_save db id =
  let* user = HttpClient.fetch_user id
              |> Result.map_error (fun err -> HttpClientError err)
  in
  let* db_user = Db.save_user db user
                 |> Result.map_error (fun err -> DbError err)
  in
  Ok ()

We can clean this up slightly with custom wrapper functions, so that whole `map_error` call goes down to a single pipe:

let* user = HttpClient.fetch_user id |> map_http_client_error in

This makes our return type a simple `(unit, wrapper_error) result`.
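Put together, the wrapper approach could look something like the sketch below. The `HttpClient` and `Db` modules here are tiny stand-ins I made up so the example compiles on its own (their constructors and the `online` field are assumptions), and `map_db_error` is a second hypothetical helper alongside `map_http_client_error`.

```ocaml
let ( let* ) = Result.bind

(* Stand-in modules so the sketch is self-contained; the real libraries
   and their error constructors are assumed, not quoted. *)
module HttpClient = struct
  type error = Timeout | User_not_found
  let fetch_user id =
    if id > 0 then Ok ("user-" ^ string_of_int id) else Error User_not_found
end

module Db = struct
  type error = Connection_lost
  type t = { online : bool }
  let save_user db _user = if db.online then Ok () else Error Connection_lost
end

type wrapper_error =
  | HttpClientError of HttpClient.error
  | DbError of Db.error

(* One small helper per wrapped library. *)
let map_http_client_error r = Result.map_error (fun err -> HttpClientError err) r
let map_db_error r = Result.map_error (fun err -> DbError err) r

(* Inferred type: Db.t -> int -> (unit, wrapper_error) result *)
let fetch_user_and_save db id =
  let* user = HttpClient.fetch_user id |> map_http_client_error in
  let* () = Db.save_user db user |> map_db_error in
  Ok ()
```

Every new library in the chain means a new constructor and a new helper, which is the maintenance cost in question.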

There’s an alternative approach to this, using polymorphic variants. Let’s look at that next.

Polymorphic Variant Errors

The main difference here is that polyvars can be inferred by the compiler, so in principle we don’t have to maintain them and they will grow and shrink based on how we use them.

To be concrete, we still have to do the wrapping but the wrapping is done with a free-standing constructor that doesn’t belong to any specific type, and then the compiler will look at all the ones used within our chain of `let*` and will put together a type for us. That’s super neat.

let fetch_user_and_save db id =
  let* user = HttpClient.fetch_user id
              |> Result.map_error (fun err -> `http err)
  in
  let* db_user = Db.save_user db user
                 |> Result.map_error (fun err -> `db err)
  in
  Ok ()

This takes our return type from (unit, wrapper_error) result to

(
  unit,
  [ | `http of HttpClient.error
    | `db of Db.error
  ]
) result

And if we had a new function in there that did some telemetry work, it’d grow automagically to:

(
  unit,
  [ | `http of HttpClient.error
    | `db of Db.error
    | `telemetry of Telemetry.error
  ]
) result

This is great for when we want to automatically propagate errors through many levels of the application, and with a little care we can even remove errors from this type after we explicitly handle them at different layers. It can be quite neat!
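Here is a small sketch of that "removing errors" trick. The functions and their payloads (`int` for the HTTP status, `string` for the db message) are hypothetical, and I'm assuming they already return polymorphic variant errors so no wrapping is needed.

```ocaml
let ( let* ) = Result.bind

(* Hypothetical steps that already fail with polymorphic variants. *)
let fetch_user id =
  if id > 0 then Ok ("user-" ^ string_of_int id) else Error (`http 404)

let save_user online user =
  if online then Ok (String.length user) else Error (`db "connection lost")

(* Inferred error type: [> `db of string | `http of int ] *)
let fetch_user_and_save online id =
  let* user = fetch_user id in
  let* saved = save_user online user in
  Ok saved

(* Handling [`http] here removes it from the error type. The annotation
   is only there so the compiler checks that claim for us. *)
let with_http_fallback online id : (int, [ `db of string ]) result =
  match fetch_user_and_save online id with
  | Ok n -> Ok n
  | Error (`http _) -> Ok 0
  | Error (`db _ as e) -> Error e
```

Callers of `with_http_fallback` only ever have to think about the `db` case.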

The main downside is that in practice, large projects use a lot of interface files, which now have to manually be updated anyways to reflect these error types that the compiler created for us. And the errors can grow very large (especially if you nest polymorphic variants), and become much harder to read.

NOTE: if your library already only uses polymorphic variants for errors, then no wrappers are needed!

Automatic Wrapping

So far we’ve solved a problem by creating another:

Fixing the syntactic problem makes the error hidden
Making the error hidden makes unification noisier
Created a lot of additional work for us

The way Rust gets rid of this additional work (within the scope of the `thiserror` library) is to implement the `From` trait for the wrapper, converting each underlying error into it (this is what the `#[from]` attribute generates), which lets the language basically inject that `Result.map_error` call for you.

In practice, this means that when we write this code, there is a lot of magic done by the `?` operator here:

#[derive(Error, Debug, Clone)]
enum MyError {

  #[error(transparent)]
  HttpClientError(#[from] http_client::Error),

  #[error(transparent)]
  DbError(#[from] db::Error),

}

fn fetch_user_and_save(db: db::Database, id: db::Id) -> Result<(), MyError> {
  let user = http_client::fetch_user(id)?;
  let _ = db.save_user(user)?;
  Ok(())
}

But in OCaml we don’t really have this for Result, so we’re stuck with the two approaches above.

The closest thing we have to this boilerplateless unification are exceptions.

Handling Errors

If at any point of our program we have an error value that we want to handle explicitly, instead of using `let*` we can simply use a match expression. This gives us very fine grained control over when and how we deal with these error values.

Unfortunately in practice it is not obvious which errors are recoverable and which aren’t, so when you do a `match` you’re presented with all the errors, not just the ones you can potentially solve.
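As a rough sketch of mixing the two styles: the `error` type and the flaky `fetch` function below are made up for illustration. We step out of `let*` for one call, recover from the one error we know how to handle, and pass everything else along.

```ocaml
let ( let* ) = Result.bind

(* Hypothetical error type and a hypothetical flaky call that fails with
   [Timeout] on the first attempt only. *)
type error = Timeout | Unauthorized

let fetch ~attempt token =
  if token <> "secret" then Error Unauthorized
  else if attempt = 0 then Error Timeout
  else Ok "payload"

let fetch_with_retry token =
  (* [match] instead of a plain [let*]: we look at the error before
     deciding what to do with it. *)
  let* body =
    match fetch ~attempt:0 token with
    | Error Timeout -> fetch ~attempt:1 token (* recoverable: try once more *)
    | (Ok _ | Error Unauthorized) as other -> other (* pass the rest along *)
  in
  Ok (String.length body)
```

Note how the `match` still has to mention `Unauthorized`, even though we have nothing useful to do with it here.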

Reporting Errors

It is also very common to want to report on errors somewhere at the very top of your program, and you’re likely going to end up with a `main` function like:

let main args =
  match run_program args with
  | Ok () -> ()
  | Error err -> print_error err
;;

main Sys.argv

And this `print_error` function will basically grow to include all unhandled or possible errors in your app. Very often you won’t have enough information to produce a meaningful report and will have to refactor code to include it, or be forced to wrap third-party errors in custom error types to include this information.
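A sketch of what that growth tends to look like. All the types here are hypothetical stand-ins for two libraries plus an app-level wrapper, and this `main` takes `run_program` as an argument and returns an exit code only so the example stands on its own.

```ocaml
(* Hypothetical error types: two "libraries" and the app-level wrapper. *)
type http_error = Timeout | Status of int
type db_error = Connection_lost
type app_error = Http of http_error | Db of db_error

(* Every new error anywhere in the app needs a new case here. *)
let error_to_string = function
  | Http Timeout -> "http: request timed out"
  | Http (Status code) -> Printf.sprintf "http: unexpected status %d" code
  | Db Connection_lost -> "db: connection lost"

let print_error err = prerr_endline ("error: " ^ error_to_string err)

(* Same shape as the [main] above, returning an exit code instead of unit. *)
let main run_program args =
  match run_program args with
  | Ok () -> 0
  | Error err -> print_error err; 1
```

The compiler will at least warn about a missing case when a new constructor shows up, but what to say about it is still on you.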

Summary for Result

To summarize the experience with Result:

lets you track errors in type signatures
can look very clean with let-ops
can propagate errors easily
but composing errors is verbose and requires care to limit propagation
wrapper and polymorphic variants add maintenance overhead
- wrapper variants can make error reporting easier!
can be handled with a match expression
unfortunately not clear which errors are unrecoverable vs recoverable
reporting can be cumbersome and manual
we are forced to handle the error at some point

The Ergonomics of Exceptions

When we contrast results with exceptions we find that they are not that different:

they are not tracked in the type system
they also look very clean since there’s no syntactic overhead
they also propagate errors easily
error composition is free (all exceptions belong to the `exn` type), but it also requires care to limit propagation
no wrappers for errors are required
can be handled with a match expression
also unclear whether they are recoverable or not
while they will always be reported somehow, there’s no standard formatter for exceptions that knows how to format them all, so reporting is also rather cumbersome and manual
we are not forced to handle them

Basically free composition and no wrapping-maintenance at the cost of not knowing whether an exception will occur and not being forced to handle them.

Sounds like a deal with the devil! But if you make it at least you now know what you’re mostly trading off.
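To make the exception side of that trade concrete, here is a minimal sketch. The `Http_error` and `Db_error` exceptions and both functions are hypothetical stand-ins for two unrelated libraries; `Printexc.register_printer` is the standard library hook for custom exception printing.

```ocaml
(* Hypothetical exceptions standing in for two unrelated libraries. *)
exception Http_error of int
exception Db_error of string

let fetch_user id =
  if id > 0 then "user-" ^ string_of_int id else raise (Http_error 404)

let save_user online user =
  if online then String.length user else raise (Db_error "connection lost")

(* No wrappers, no [let*]: both failures travel through the same [exn]
   type, and the signature [bool -> int -> int] mentions neither. *)
let fetch_user_and_save online id = save_user online (fetch_user id)

(* Handle only what we can recover from; anything else keeps propagating. *)
let saved_or_zero online id =
  match fetch_user_and_save online id with
  | n -> n
  | exception Http_error 404 -> 0

(* Reporting is opt-in: we teach [Printexc] about our exception by hand. *)
let () =
  Printexc.register_printer (function
    | Db_error msg -> Some ("db error: " ^ msg)
    | _ -> None)
```

Composition costs nothing here, but nothing in the types warns the caller of `saved_or_zero` that a `Db_error` can still escape.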

The Ideal Ergonomics

I think the ideal ergonomics for error handling in a language means that:

I can track recoverable errors at the type-level – unrecoverable or fatal errors I won’t be able to do much with anyways, so just report them and be done.
There is little to no syntax overhead – I want to focus on the happy path and decide which errors are recoverable / unrecoverable very clearly and cheaply.
Error composition is free – for unrecoverable errors I want to do zero work maintaining wrappers of any kind.
No ad-hoc reporting is required – I want my errors to know how to report themselves basically, but I’d like the channel/format in which they report themselves to be flexible enough to support console logs at dev time and structured logging in production.

I think there are a few patterns we can apply to reach a reasonable compromise without reaching for a dedicated error library with code generation and such, but I’m also exploring this so expect more on this topic soon!

Welp that’s it, I ran out of writing fuel.

If you liked this or have something to say about it please let me know! Here’s the Bluesky thread

/ Leandro

Leandro Ostera