Eugene Ching (@eugeii)

I’m a hacker, a security researcher, a coder.

And this is a personal space where I talk about code, security, notions of computing, and whatever else comes to mind.

Making sense of monadic functions

Posted on Dec 3, 2011 in Coding

This article is about what are loosely, but often, referred to as “monadic functions”. It contains my personal and intuitive way of thinking about monadic functions, and it may not suit you well. However, for those out there whom this may help, I hope it makes that little difference for you.

I notice that most monad tutorials go into tremendous detail on how monadic composition works, and give examples of how the >>= bind operator is implemented in various commonly seen monads (Maybe, [], State, etc.), but say very little about what exactly monadic functions are. The key idea that can be gleaned is that monadic functions are the stuff that acts on monads, with the help of the >>= bind operator (which essentially finds a way to meaningfully “extract” the value that the monad contextualizes), and sometimes the return function as well. Also, we often see that monadic functions, say f and g, can be suitably composed as follows:

f :: a -> m b
g :: b -> m c
\x -> f x >>= g
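
As a concrete sketch of such a composition, assume two hypothetical Maybe-valued functions, safeRecip and safeSqrt (both names are invented here for illustration). Feeding a value through both can be written with >>=, or with the Kleisli composition operator >=> from Control.Monad:

```haskell
import Control.Monad ((>=>))

-- Hypothetical monadic functions in the Maybe monad (not from the original post).
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Apply safeRecip, then pass its inner value on to safeSqrt.
viaBind :: Double -> Maybe Double
viaBind x = safeRecip x >>= safeSqrt

-- The same pipeline, written point-free with Kleisli composition.
viaKleisli :: Double -> Maybe Double
viaKleisli = safeRecip >=> safeSqrt
```

Both versions stop with Nothing as soon as either step fails, which is the behaviour the Maybe monad gives to >>=.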

We also come across examples, usually just before introducing the beautiful do-notation, that look like this:

f, g :: a -> m b
someFunction x = return x >>= (\a ->
                 ... some other code ...)

And then, somewhere in our journey, we come across code that looks like this (this example is in the IO monad, but is not peculiar to the IO monad):

someFunction x = do
  a <- getLine
  b <- getLine
  putStrLn (show x ++ a ++ b)

And then we may, perhaps, wonder: getLine is a function, but its type signature is not that of a -> m b, or for the IO monad, a -> IO b. In fact, its signature is:

getLine :: IO String

So how is that a monadic function? And yet our intuition tells us that it’s fine. Firstly, it works. Secondly, it’s producing a monadic value, whose inner value gets “assigned” to a. We then mentally translate this to the de-sugared notation:

someFunction x = getLine >>= (\a ->
                   getLine >>= (\b ->
                     putStrLn (show x ++ a ++ b)))
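
The same translation can be checked in a monad that is easier to test than IO. The sketch below uses Maybe; the ages table and both function names are made up for illustration, while lookup is the ordinary Prelude function:

```haskell
-- A made-up lookup table, only for illustration.
ages :: [(String, Int)]
ages = [("alice", 30), ("bob", 25)]

-- Written with do-notation. lookup :: Eq a => a -> [(a, b)] -> Maybe b
sumAgesDo :: String -> String -> Maybe Int
sumAgesDo p q = do
  a <- lookup p ages
  b <- lookup q ages
  return (a + b)

-- The same function with the do-notation de-sugared by hand.
sumAgesBind :: String -> String -> Maybe Int
sumAgesBind p q =
  lookup p ages >>= (\a ->
    lookup q ages >>= (\b ->
      return (a + b)))
```

Note that lookup p ages plays the role getLine plays above: it is already a monadic value by the time >>= sees it.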

And then things become clearer, because the type signature that we’re looking for is embodied in the two anonymous functions. I don’t know about you, but this bothered me for a while. Not from the formal correctness of it all, as it’s obvious that’s not a problem, but the intuitive sense of a “monadic function”.

So let me try to resolve the issue in a structured way in this article. If all this is already crystal clear to you, and you’re thinking “this guy doesn’t get it”, then you may want to skip the rest.

First off, some terminology:

  • Monad: A typeclass that defines >>= and return (minimally).
  • Monadic value: A value contextualized by a type constructor. (Examples would be (Just 4), or [100, 200])
  • Inner value: The actual value that was contextualized by a type constructor. (Examples would be 4, and each of (100, 200))
  • Monadic function: A function that produces a monadic value. (Note that we said nothing about its input type)
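
To pin the terms down, here is a small sketch that attaches each of them to a definition. The names mv, mvs, half and around are hypothetical and exist only for this illustration:

```haskell
-- A monadic value: the inner value 4, contextualized by Just.
mv :: Maybe Int
mv = Just 4

-- Another monadic value: the list constructor contextualizes 100 and 200.
mvs :: [Int]
mvs = [100, 200]

-- A monadic function in the Maybe monad: its result is a monadic value.
half :: Int -> Maybe Int
half n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- A monadic function in the list monad.
around :: Int -> [Int]
around n = [n - 1, n + 1]
```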

Given mv >>= f, the main reason >>= is required to have the type (>>=) :: m a -> (a -> m b) -> m b, and hence f the type f :: a -> m b, is that we want a means to pass the inner value of mv to the monadic function f. However, this doesn’t make everything apparent.
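
One way to see where the inner value goes is to write a stand-in for >>= by hand. The sketch below, specialized to Maybe, uses the hypothetical names bindMaybe and safePred; it is meant to behave like >>= for Maybe, not to reproduce the library source:

```haskell
-- A hand-written stand-in for >>= specialized to Maybe.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing   -- no inner value to pass on
bindMaybe (Just x) f = f x       -- extract x and hand it to f

-- Hypothetical monadic function used to exercise bindMaybe.
safePred :: Int -> Maybe Int
safePred n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing
```

The only thing bindMaybe ever does with f is apply it to one value of the inner type, which is why f has to accept exactly that.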

Hence, creating an example out of the Maybe monad, suppose we have the following monadic functions:

mf1, mf2 :: Int -> Maybe Int
mf1 x = return (x*2)
mf2 x = return (x*3)

And we want to compose them:

cmf :: Int -> Maybe Int
cmf x = Just x >>= mf1 >>= mf2

Now suppose that we have some more “elaborate” monadic functions:

mf1, mf2 :: Bool -> Int -> Maybe Int
mf1 b x = if b then Just (x*2) else Just (x*4)
mf2 b x = if b then Just (x*3) else Just (x*9)

Oh, now these functions take some extra stuff, and are no longer nice and simple like the original mf1 and mf2. Certainly (and very obviously so), direct composition does not work, as the type signatures no longer line up:

cmf' x = return x >>= mf1 >>= mf2   -- does not type-check with the new mf1 and mf2

However, it doesn’t mean that this doesn’t work:

cmf x = do
  a <- mf1 True x
  b <- mf2 False a
  return (a*b)

Which is the same as this:

cmf x = mf1 True x  >>= (\a ->
        mf2 False a >>= (\b ->
        return (a*b)))

The issue here is that our “definition” of a monadic function is a tad loose. Often, you will find that where the term “monadic function” is used, it can refer to either:

  • Functions of the form f :: a -> m b, where a is the type of the inner value of the monad. (Call these classic monadic functions)
  • Functions of the form f :: anything -> m b, where the input of the function really doesn’t matter. (Call these loose monadic functions)

We note that for composability, as in cmf' x = return x >>= mf1 >>= mf2, our monadic functions mf1 and mf2 needed to be in classic form, where the input is only the type of the inner value of the monad.

However, any function that produces a monadic value can indeed be used “inside” the monad. By “inside” I mean code that acts within a monad, which is the code enclosed in the do-notation, and its corresponding de-sugared form. We can do this by making a wrapper around this loose monadic function, whereby the wrapper is a classic monadic function itself. That is, the wrapper takes in a single value of the type of the monad’s inner value.
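
A minimal sketch of such wrappers, reusing the loose mf1 and mf2 from above (viaLambda and viaPartial are names invented here). Besides an explicit lambda, partial application also turns a loose function into a classic one, because supplying the Bool leaves a function of type Int -> Maybe Int:

```haskell
-- The loose monadic functions from the article.
mf1, mf2 :: Bool -> Int -> Maybe Int
mf1 b x = if b then Just (x*2) else Just (x*4)
mf2 b x = if b then Just (x*3) else Just (x*9)

-- Wrappers written as explicit lambdas.
viaLambda :: Int -> Maybe Int
viaLambda x = return x >>= (\a -> mf1 True a) >>= (\a -> mf2 False a)

-- Partial application gives the same classic functions without a lambda.
viaPartial :: Int -> Maybe Int
viaPartial x = return x >>= mf1 True >>= mf2 False
```

For example, viaLambda 5 and viaPartial 5 both evaluate to Just 90: mf1 True doubles 5 to 10, and mf2 False multiplies 10 by 9.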

In our example with our two loose monadic functions mf1 and mf2, we successfully “composed” them as follows:

cmf x = mf1 True x  >>= (\a ->
        mf2 False a >>= (\b ->
        return (a*b)))

which is:

cmf x = mf1 True x  >>=
          (\a -> mf2 False a >>=
            (\b -> return (a*b)))

Again, the anonymous functions are the wrappers, which are themselves classic monadic functions. How so? Take a look at the anonymous function that has a parameter “a”. It takes a single input value, named “a”, and the type of “a” (by type inference) is the type of the inner value of mf1 True x (by definition of >>=). Its result is built from the call to mf2, which we know is a loose monadic function and hence will produce a monadic value. Thus the result of the anonymous function is a monadic value. Hence, (\a -> mf2 False a >>= ...) :: a -> m b. The exact same argument goes for the second anonymous function.
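
To make those inferred types visible, the anonymous functions can be given names and signatures. This is only a sketch: outer and inner are hypothetical names, and mf1 and mf2 are the loose functions defined earlier, repeated so the block stands alone:

```haskell
mf1, mf2 :: Bool -> Int -> Maybe Int
mf1 b x = if b then Just (x*2) else Just (x*4)
mf2 b x = if b then Just (x*3) else Just (x*9)

cmf :: Int -> Maybe Int
cmf x = mf1 True x >>= outer
  where
    -- Stands in for the first anonymous function: one Int in, a Maybe Int out.
    outer :: Int -> Maybe Int
    outer a = mf2 False a >>= inner a

    -- Stands in for the second anonymous function. It needs a as well,
    -- so (inner a) is the classic function that >>= receives.
    inner :: Int -> Int -> Maybe Int
    inner a b = return (a * b)
```

With x = 5, mf1 True gives a = 10, mf2 False gives b = 90, and the result is Just 900.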

This leads to the following conclusion. The fundamental reason for requiring the f in mv >>= f to be of the type f :: a -> m b, with emphasis on the “a”, is that in general, this function is designed to receive the value produced by another (possibly “loose”) monadic function, or simply held by some monadic value, or created by a monadic function that does “nothing” but return a monadic value (think getLine :: IO String). Hence, all of these work (we’re in the Maybe monad):

classicMonadicFunction x = Just (x+2)
looseMonadicFunction x y z = Just (x+y+z)
looseMonadicFunction' = Just 4

f = Just 4 >>= classicMonadicFunction                  -- Doesn't need a wrapper
f = Just 4 >>= (\x -> looseMonadicFunction x 100 200)  -- Needs a wrapper

f = do
  classicMonadicFunction 4

f = do
  Just 4 >>= classicMonadicFunction

f = do
  a <- looseMonadicFunction 10 100 200
  classicMonadicFunction a

f = do
  a <- classicMonadicFunction 4
  looseMonadicFunction a 100 200

f = do
  a <- Just 4
  classicMonadicFunction a

f = do
  a <- looseMonadicFunction'
  looseMonadicFunction a 100 200

And you can work out all the rest of the combinations yourself. This gives more flexibility in designing and implementing monadic functions, as they may be loose, classic, or just monadic values themselves (Just 4).
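
None of this is specific to Maybe. As a sketch under made-up names (seeds, multiplesOf and neighbours are not from any library), the same three shapes mix freely in the list monad:

```haskell
-- A plain monadic value.
seeds :: [Int]
seeds = [10, 20]

-- Loose: takes an extra argument besides the inner value.
multiplesOf :: Int -> Int -> [Int]
multiplesOf count n = [n * k | k <- [1 .. count]]

-- Classic: exactly one argument, of the inner type.
neighbours :: Int -> [Int]
neighbours n = [n - 1, n + 1]

-- All three used together inside one do-block.
combined :: [Int]
combined = do
  s <- seeds
  m <- multiplesOf 2 s
  neighbours m
```

Here combined evaluates to [9,11,19,21,19,21,39,41], since every seed yields two multiples and every multiple yields two neighbours.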

Hence, remember: the only true constraint in designing monadic functions (or when using them) is that they have to return a monadic value. Failure to do so will result in a type error wherever >>= or a do-block expects that monadic value.

Why you can forget about loose and classic monadic functions

Because the do-notation creates the necessary “wrappers” for you, you often don’t have to think about classic or loose monadic functions. Any function which is going to throw back a monadic value (and perform whatever side-effects it wants to) is going to work inside the do-block. It is, after all, the same as you creating wrappers around every single monadic function (classic or loose), thus guaranteeing that the f in mv >>= f is going to be of the correct type, and then calling your monadic functions (classic or loose) with the correct set of parameters.

Hence, the wrappers take care of the fact that the >>= bind operator needs a meaningful way to pass in the “extracted” inner value, leaving everything else you write nice and simple. Put another way, since the do-notation handles the creation of all wrappers, you can use any monadic function (loose or classic) without ever thinking about whether it is loose or classic (meaning you can forget about this article! ;)). Hence, all you have to know, intuitively, is that any function that returns a monadic value is a monadic function, and is good to go!

Going back to our getLine, a real example of where all of this occurs is indeed the getLine function of the IO monad. Recall that the type of the getLine function is as follows:

getLine :: IO String

This function is of the same type signature as our looseMonadicFunction'. It takes in nothing, because all of its functionality is in its side-effects: grabbing a line from the terminal. Its only result is a monadic value, which contextualizes the string that it grabs inside the IO monad.

The putStrLn function, on the other hand, is a classic monadic function. It should be obvious why, from its type signature. It returns a monadic value, and takes a single parameter of the type of the inner value:

putStrLn :: String -> IO ()
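
Putting the two together, here is a short sketch of how these shapes meet at >>=. The names echo, greet and echoGreeting are hypothetical; only getLine and putStrLn come from the Prelude:

```haskell
-- getLine  :: IO String         (a monadic value; takes no input)
-- putStrLn :: String -> IO ()   (classic: String is the inner type of getLine)
echo :: IO ()
echo = getLine >>= putStrLn

-- A loose monadic function: it takes an extra argument before the inner value.
greet :: String -> String -> IO ()
greet greeting name = putStrLn (greeting ++ ", " ++ name)

-- The lambda is the wrapper that makes greet fit after >>=.
echoGreeting :: IO ()
echoGreeting = getLine >>= (\name -> greet "Hello" name)
```

Because putStrLn is already classic it can follow >>= directly, whereas greet needs the lambda (or the partial application greet "Hello") to get there.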