control-monad-exception and the long type signatures myth

Yesterday Michael Snoyman blogged a tutorial about his Attempt package for Haskell error handling. This inspired me to port his example to my control-monad-exception library and see how well it works. The main difference between the two libraries is that Attempt provides extensible exceptions which are not explicit, whereas control-monad-exception's exceptions are explicit and checked by the type system, a la Java. Under the hood they are very similar: both are monads based on a datatype isomorphic to Either instantiated with extensible exceptions, but control-monad-exception has some extra oomph to make the typechecker understand exception handling. Michael dismisses control-monad-exception, arguing that explicit exceptions produce "insanely long type signatures". Since I don't agree with that, I hope to support my point with this blog post.

His tutorial introduces a text processing example to illustrate the use of Attempt: you want to parse text files which should contain three lines that together form an arithmetic expression, one token per line: file ::= <num> <op> <num>. Easy enough, the following Haskell program (copied from his blog) does the job:

> process1 :: FilePath -> IO Int
> process1 filePath = do
>   contents <- readFile filePath -- IO may fail for some reason
>   let [num1S, opS, num2S] = lines contents -- maybe there aren't 3 lines?
>       num1 = read num1S -- read might fail
>       op   = read opS   -- read might fail
>       num2 = read num2S -- read might fail
>   return $ toFunc op num1 num2

Michael explains very clearly why each of those lines may fail and then goes on to replace the unsafe functions with their Attempt equivalents:

  • read is a partial function; it is replaced by a total function in an Attempt monad, read' :: (MonadAttempt m, Read a) => String -> m a
  • readFile is an IO function which can throw an exception. It is replaced by a version in the Attempt monad.
  • the pattern matching is replaced by an assertion; I didn't understand the motivation and I think the resulting code is a bit more difficult to understand, so I will skip this part.

You do that and the result is the following code.

> data ProcessError = NotThreeLines String   deriving (Show, Typeable)
> instance E.Exception ProcessError
> process :: FilePath -> AttemptT IO Int
> process filePath = do
>       contents <- A.readFile filePath
>       case lines contents of
>           [num1S, opS, num2S] -> do
>               num1 <- A.read num1S
>               op   <- A.read opS
>               num2 <- A.read num2S
>               return $ toFunc op num1 num2
>           _ -> Failure $ NotThreeLines contents

The resulting code is safe. Running it produces either a Success or a Failure, but can never end with a runtime exception. Now, let's look at doing the same with control-monad-exception (abbreviated c-m-e from here on). First, we need to define safe versions of read and readFile. The c-m-e library provides a type class MonadThrow for computations which may raise an exception.

> safeRead :: (MonadThrow ReadException m, Read a) => String -> m a
> safeRead s = case .. of
>               [x] -> return x
>               _   -> throw $ ReadException s

I omitted the implementation since it is fairly routine and could live in a library. The interesting thing is that the inferred type signature documents the fact that a ReadException can be thrown. Compare it with the type signature of the version of read in the Attempt library:

> attemptRead :: (MonadAttempt m, Read a) => String -> m a

The c-m-e version is not really that much longer after all. The corresponding c-m-e safe version for readFile has the type signature

> readFile :: (MonadIO m, MonadThrow IOException m) => FilePath -> m String

which again is not too bad. Note also that these functions are not defined in any library, since the c-m-e package currently provides only the exception monads and some support combinators. More on that now.

The nice thing at this point is that we haven't committed to a particular monad. c-m-e provides a monad transformer EMT which can be used for checked, explicit exceptions, but there are also MonadThrow instances for IO and Either which can use the error handling capabilities of those monads as well (and yes, with those instances the MonadThrow constraints go away and you are back in the world of unchecked, implicit exceptions, if that's what you want). A MonadThrow instance for Attempt could easily be provided too, and that way we could have all these safe functions in an independent package which works with any error handling monad, be it attempt, c-m-e, explicit-exceptions or the IO monad. Please, we must have this.
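To illustrate what "not committed to a particular monad" means, here is a minimal, self-contained sketch. It does not use the c-m-e package: the MonadThrow class below is a local stand-in with the same shape as the one in the signatures above, the three instances are my own simplified versions, the body of safeRead is just one routine way to fill it in, and readInIO is a made-up helper name.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
import Control.Exception (Exception, throwIO, try)
import Data.Char (isSpace)

-- Local stand-in for the MonadThrow class (an assumption of this sketch,
-- defined here so that the example compiles without the c-m-e package).
class Monad m => MonadThrow e m where
  throw :: e -> m a

newtype ReadException = ReadException String deriving (Show, Eq)
instance Exception ReadException

-- Either keeps the exception as a plain value.
instance MonadThrow e (Either e) where
  throw = Left

-- Maybe forgets which exception was thrown.
instance MonadThrow e Maybe where
  throw _ = Nothing

-- IO raises it as an ordinary, unchecked runtime exception.
instance Exception e => MonadThrow e IO where
  throw = throwIO

-- One possible total replacement for read: succeed only when the whole
-- string (up to trailing whitespace) parses.
safeRead :: (MonadThrow ReadException m, Read a) => String -> m a
safeRead s = case [x | (x, rest) <- reads s, all isSpace rest] of
  [x] -> return x
  _   -> throw (ReadException s)

-- In IO the failure is a real exception, so it can be observed with try.
readInIO :: String -> IO (Either ReadException Int)
readInIO = try . safeRead
```

The same safeRead can be used at type Either ReadException Int, Maybe Int or IO Int; only the chosen monad decides how the failure is reported.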

Ok, enough rambling, back to the example. As you can imagine, the code for the process function is going to be exactly the same as before, modulo the function names. Except that now the type signature is not fixed to any particular monad:

> process :: (MonadIO m, MonadThrow IOException m, MonadThrow ReadException m)
>              => FilePath -> m Int

This is an inferred type signature and one which you would not write yourself normally, as in general you would be working inside a particular monad, not just any monad m. If this monad happens to be the Attempt monad, you would obtain exactly the type signature that we had above:

> process :: FilePath -> AttemptT IO Int

This is appreciably shorter. It tells you that process is a computation which can be run and either produces an Int or fails with an (undocumented) error. On the other hand, if you use EMT from the c-m-e package, the following inferred, short-enough type signature tells you

> process :: (Throws IOException l, Throws ReadException l)
>              => FilePath -> EMT l IO Int

that process is a computation which can be run and may fail with an IOException or a ReadException. We can go ahead and define a new version of process which handles the IOException:

> process1 s = process s `catch` \(e::IOException) -> return (-1)

And now the compiler *knows* that process1 has the type:

> process1 :: Throws ReadException l => FilePath -> EMT l IO Int

as expected. This is the main idea behind the c-m-e library: using the typechecker to track which exceptions have not been caught yet. Type signatures should not grow unwieldy unless your code is sloppy and has lots of unhandled exceptions around.
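To make the tracking idea concrete, here is a minimal, self-contained sketch of how a type-level record of caught exceptions can work. It is a deliberate simplification and not the library's actual code: EM is a plain newtype over Either SomeException rather than the EMT transformer, the Caught and NoExceptions markers and runEM are assumptions of this sketch, and EmptyInput, parseInt, parseIntOrZero and parseIntTotal are made-up names.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
import Control.Exception (Exception, SomeException, toException, fromException)

-- Phantom markers: "Caught e l" means e is handled, on top of whatever l handles.
data Caught e l
data NoExceptions

-- "Throws e l" holds when throwing e is allowed in a context described by l.
class Exception e => Throws e l
instance {-# OVERLAPPING #-} Exception e => Throws e (Caught e l)
instance Throws e l => Throws e (Caught e' l)

-- Simplified exception monad; l only exists at the type level.
newtype EM l a = EM (Either SomeException a)

instance Functor (EM l) where
  fmap f (EM r) = EM (fmap f r)
instance Applicative (EM l) where
  pure = EM . Right
  EM f <*> EM x = EM (f <*> x)
instance Monad (EM l) where
  EM r >>= k = case r of
    Left e  -> EM (Left e)
    Right a -> k a

throw :: Throws e l => e -> EM l a
throw = EM . Left . toException

-- Handling e removes it from the set of exceptions the result may throw.
catch :: Exception e => EM (Caught e l) a -> (e -> EM l a) -> EM l a
catch (EM r) handler = case r of
  Right a  -> EM (Right a)
  Left err -> case fromException err of
    Just e  -> handler e
    Nothing -> EM (Left err)

-- Only typechecks when no Throws constraint is left unsolved.
runEM :: EM NoExceptions a -> a
runEM (EM (Right a))  = a
runEM (EM (Left err)) = error ("not reachable through throw: " ++ show err)

newtype ReadException = ReadException String deriving Show
instance Exception ReadException
data EmptyInput = EmptyInput deriving Show
instance Exception EmptyInput

parseInt :: (Throws ReadException l, Throws EmptyInput l) => String -> EM l Int
parseInt "" = throw EmptyInput
parseInt s  = case reads s of
  [(n, "")] -> return n
  _         -> throw (ReadException s)

-- EmptyInput is handled here, so only ReadException remains in the type.
parseIntOrZero :: Throws ReadException l => String -> EM l Int
parseIntOrZero s = parseInt s `catch` \EmptyInput -> return 0

-- Both exceptions are handled, so the computation can be run.
parseIntTotal :: String -> Int
parseIntTotal s = runEM (parseIntOrZero s `catch` \(ReadException _) -> return (-1))
```

In this sketch, applying runEM directly to parseIntOrZero s is rejected by the typechecker, because there is no Throws ReadException NoExceptions instance. The signature shrinks by one constraint each time a handler is added.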

I am going to leave it here. Michael's tutorial goes on to wrap the exceptions into the domain exceptions BadIntException and BadOpException, which should always be done.

The c-m-e library also provides other niceties such as stack traces and selectively unchecked exceptions. I encourage you to look at the documentation for the package if you are interested in doing proper error handling in Haskell.

5 comments

Oct 26, 2009
snoyberg said...
I really do like the idea of explicit exceptions. However, at a large scale, they *will* become unwieldy. The example I used is small-scale enough to use c-m-e instead, but what about packaging up a library? As I think you know (maybe you don't), the impetus for the attempt library is another library I'm working on, data-object.

For simplicity, let's say that there is a type class that has a function that might succeed or might fail. However, since there are endless possibilities of instantiations of this type class, I have no way of knowing what possible exceptions someone might want to throw.

I know; I could make the response have some kind of generic DataObjectException or something like that, but then what possible benefit have I given the user with the extra complexity introduced by c-m-e? (Yes, c-m-e is more complicated than attempt.)

There are many problem domains for which c-m-e is the right solution; I just don't happen to be working in them right now. But even in those, attempt isn't the *wrong* solution. I'm not convinced that there are so many times when you need to know exactly what exceptions a function might throw.

As far as collaboration between the libraries, I'm all in favor. Give attempt a chance to stabilize, and then I'll be happy to discuss common instances and the like.

Oct 27, 2009
pepe iborra said...
Right, explicit exceptions in that kind of very general, in-a-type class function do not really make sense. In such a case Maybe or Either SomeException (i.e. Attempt) make more sense.

As you say, c-m-e and attempt are not mutually exclusive and each has its use cases. What I am concerned about is making it easy to write libraries which are not tied to a particular error handling library. A first step I am going to take is extracting the MonadThrow class, together with instances for IO, Either, Maybe and [], to a separate package. If I had/have time I would/will also fork the Safe package to add a MonadThrow version of every function.

Oh, and MonadThrow/throw could probably use a better name. I am going to propose MonadFail/failure.

Oct 27, 2009
snoyberg said...
Sounds like creating a MonadFailure is a good start. I'll definitely take a look when it's available, and most likely provide some instances for attempt. (You could, of course, make attempt a dependency for your failure package and provide the instances yourself, but I think the other way makes more sense.)

Oct 28, 2009
pepe iborra said...
I have created two new projects in my github (http://github.com/pepeiborra), for control-monad-fail and safe-fail.

control-monad-fail is basically the MonadThrow class from c-m-e.
safe-fail is the generalization of the Safe library to any MonadFailure monad. If you instantiate with Maybe you obtain the xxxMay versions of functions in the Safe library. You can obtain the xxxDef versions using a helper function

def defaultValue = fromMaybe defaultValue

tailDef = def [] . tail

Similarly, you can obtain the xxxNote versions using another helper:

note msg = fromMaybe (error msg)

tailNote msg = note msg . tail

But you don't have to instantiate with Maybe. You can instantiate with IO and obtain versions which fail with IO exceptions. I can see myself using that a lot just for the immediate convenience.
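Specialized to Maybe, the helpers above can be sketched as follows. This is only an illustration: tailMay is a hand-written stand-in for the generalized tail of safe-fail (whose actual definition is not shown here), and the type signatures are the ones I would expect rather than ones taken from the package.

```haskell
import Data.Maybe (fromMaybe)

-- Hand-written stand-in for the generalized tail, fixed to Maybe
-- (an assumption of this sketch).
tailMay :: [a] -> Maybe [a]
tailMay []       = Nothing
tailMay (_ : xs) = Just xs

-- Fall back to a default value on failure.
def :: a -> Maybe a -> a
def = fromMaybe

-- Fail with a descriptive error message instead.
note :: String -> Maybe a -> a
note msg = fromMaybe (error msg)

tailDef :: [a] -> [a]
tailDef = def [] . tailMay

tailNote :: String -> [a] -> [a]
tailNote msg = note msg . tailMay
```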

I haven't released to Hackage yet; I want to sit on them a bit before that. I would be thankful if you would take a look at them and see if they fit your picture too.

Oct 28, 2009
pepe iborra said...
Or better:

note :: Exception e => String -> Either e a -> a
note msg = either (\e -> error (msg ++ show e)) id
