Sunday, 28 September 2014

An Example of Functional Programming

Many people, after reading my previous blog post, asked to see a practical example of FP with code. I know it's been a few months – I actually got married recently; wedding planning is very time consuming – but I've finally come up with an example. Please enjoy.

Most introductions to FP begin with pleas for immutability but I'm going to do something different. I've come up with a real-world example that's not too contrived. It's about validating user data. There will be 5 incremental requirements that you can imagine coming in sequentially, each one building on the other. We'll code to satisfy each requirement incrementally without peeking at the following requirements. We'll code with an FP mindset and use Scala to do so but the language isn't important. This isn't about Scala. It's about principals and a perspective of thought. You can write FP in Java (if you enjoy pain) and you can write OO in Haskell (I know someone who does this and baffles his friends). The language you use affects the ease of writing FP, but FP isn't bound to or defined by any one language. It's more than that. If you don't use Scala this will still be applicable and useful to you.

I know many readers will have programming experience but little FP experience so I will try to make this as beginner-friendly as possible and omit using jargon without explanation.

Req 1. Reject invalid input.

The premise here is that we have data and we want to know if it's valid or not. For example, suppose we want ensure a username conforms to certain rules before we accept it and store it in our app's database.

I'm championing functional programming here so let's use a function! What's the simplest thing we need to make this work? A function that takes some data and returns whether it's valid or not: A ⇒ Boolean.

Well that's certainly simple but I'm going to handwaveily tell you that primitives are dangerous. They denote the format of the underlying data but not its meaning. If you refactor a function like blah(dataWasValid: Boolean, hacksEnabled: Boolean, killTheHostages: Boolean) the compiler isn't going to help you if you get the arguments wrong somewhere. Have you ever had a bug where you used the ID of one data object in place of another because they were both longs? Did you hear about the NASA mission that failed because of mixed metric numbers (eg. miles and kilometers) being indistinguishable?

So let's address that by first correcting the definition of our function. We want a function that takes some data and returns whether it's valid or not an indication of validity: A ⇒ Validity.
sealed trait Validity
case object Valid extends Validity
case object Invalid extends Validity

type Validator[A] = A => Validity

We'll also create a sample username validator and put it to use. First the validator:
val usernameV: Validator[String] = {
  val p = "^[a-z]+$".r.pattern
  s => if (p.matcher(s).matches) Valid else Invalid

Now a sample save function:
def example(u: String): Unit =
  usernameV(u) match {
    case Valid   => println("Fake-saving username.")
    case Invalid => println("Invalid username.")

There's a problem here. Code like this will make FP practitioners cry and for good reason. How would we test this function? How could we ever manipulate or depend on what it does, or its outcome? The problem here is “effects” and unbridled, they are anathema to healthy, reusable code. An effect is anything that affects anything outside the function it lives in, relies on anything impure outside the function it lives in, or happens in place of the function returning a value. Examples are printing to the screen, throwing an exception, reading a file, reading a global variable.

Instead we will model effects as data. Where as the above example would either 1) print “Fake-saving username” or 2) print “Invalid username”, we will now either 1) return an effect that when invoked, prints “Fake-saving username”, or 2) return a reason for failure.

We'll define our own datatype called Effect, to be a function that neither takes input nor output.
(Note: If you're using Scalaz, scalaz.effect.IO is a decent catch-all for effects.)
type Effect = () => Unit
def fakeSave: Effect = () => println("Fake save")

Next, Scala provides a type Either[A,B] which can be inhabited by either Left[A] or Right[B] and we'll use this to return either an effect or failure reason.
Putting it all together we have this:
def example(u: String): Either[String, Effect] =
  usernameV(u) match {
    case Valid   => Right(fakeSave)
    case Invalid => Left("Invalid username.")

Req 2. Explain why input is invalid.

We need to specify a reason for failure now.
We still have two cases: valid with no error msg, invalid with an error msg. We'll simply add an error message to the Invalid case.
case class Invalid(e: String) extends Validity

Then we make it compile and return the invalidity result in our example.
 val usernameV: Validator[String] = {
   val p = "^[a-z]+$".r.pattern
-  s => if (p.matcher(s).matches) Valid else Invalid
+  s => if (p.matcher(s).matches) Valid else
+         Invalid("Username must be 1 or more lowercase letters.")
 def example(u: String): Either[String, Effect] =
   usernameV(u) match {
     case Valid      => Right(fakeSave)
-    case Invalid    => Left("Invalid username.")
+    case Invalid(e) => Left(e)

Req 3. Share reusable rules between validators.

Imagine our system has 50 data validation rules, 80% reject empty strings, 30% reject whitespace characters, 90% have maximum string lengths. We like reuse and D.R.Y. and all that; this requirement addresses that by demanding that we break rules into smaller constituents and reuse them.

We want to write small, independent units and join then into larger things. This leads us to an important and interesting topic: composability.

I want to suggest something that I know will cause many people to cringe – but hear me out – let's look to math. Remember basic arithmetic from ye olde youth?
8 = 5 + 3
8 = 5 + 1 + 2
Addition. This is great! It's building something from smaller parts. This seems like a perfect starting point for composition to me. There's a certain beauty and elegance to math, and its capability is proven; what better inspiration!

Let's look at some basic properties of addition.

Property #1:
8 = 8 + 0
8 = 0 + 8
Add 0 to any number and you get that number back unchanged.
Property #2:
8 = (1 + 3) + 4
8 = 1 + (3 + 4)
8 = 1 + 3 + 4
Parentheses don't matter. Add or remove them without changing the result.
Property #3:
I'll also mention that in primary school, you had full confidence in this:
number + number = number
It may seem silly to mention, but imagine if your primary school teacher told you that
number + number = number | null | InvalidArgumentException
+ has other properties too, like 2+6=6+2 but we don't want that for our scenario with validation. The above three provide enough benefit for what we need.

You might wonder why I'm describing these properties. Why should you care? Well as programmers you gain much by writing code with similar properties. Consider...
  • You know you don't have to remember to check for nulls, catch any exceptions, worry about our internal AddService™ being online.
  • As long as the overall order of elements is preserved, you needn't care about the order in which groups are composed. i.e. we know that a+b+c+d+e will safely yield the same result if we batch up execution of (a+b) and (c+d+e) then add their results last. And parenthesis support is already provided by the programming language.
  • If ever forced into composition by some code path and you can opt out by specifying the 0 because we know that 0+x and x+0 are the same as x. No need to overload methods or whatnot.
Simple right? Well have you ever heard the term “monoid” thrown around? (Not “monad”.) Guess what? We've just discussed all that makes a monoid what it is and you learned it as a young child.
A monoid is a binary operation (x+x=x) that has 3 properties:
  • Identity: The 0 is what we call an identity element. 0+x = x = x+0
  • Associativity: That's the ability to add/remove parentheses without changing the result.
  • Closure: Always returns a result of the same type, no RuntimeExceptions, no nulls.
If jargon from abstract algebra intimidates you, know that it's mostly just terminology. You already know the concepts and have for years. The knowledge is very accessible and it's incredibly useful to be able to identify these kinds of properties about your code.

Speaking of code, let's implement this new requirement as a monoid. We'll add Validator.+ for composition and ensure it preserves the associativity property, and for identity (also called zero).
(Note: If using Scalaz, Algebird or similar, you can explicitly declare your code to be a monoid to get a bunch of useful monoid-related features for free.)
case class Validator[A](f: A => Validity) {
  @inline final def apply(a: A) = f(a)

  def +(v: Validator[A]) = Validator[A](a =>
    apply(a) match {
      case Valid         => v(a)
      case e@ Invalid(_) => e

object Validator {
  def id[A] = Validator[A](_ => Valid)
The difficulty of building human-language sentences scales with expressiveness. For our demo it's enough to simply have validators contain error message clauses like “is empty”, “must be lowercase” and just tack the subject on later.

First we define some helper methods pred and regex, then use them to create our validators
object Validator {
  def pred[A](f: A => Boolean, err: => String) =
    Validator[A](a => if (f(a)) Valid else Invalid(err))

  def regex(r: java.util.regex.Pattern, err: => String) =
    pred[String](a => r.matcher(a).matches, err)

val nonEmpty = Validator.pred[String](_.nonEmpty, "must be empty")
val lowercase = Validator.regex("^[a-z]*$".r.pattern, "must be lowercase")

val usernameV = nonEmpty + lowercase
Then we gaffe our subject to on to our error messages before displaying it and we're done.
def buildErrorMessage(field: String, err: String) = s"$field $err"

def example(u: String): Either[String, Effect] =
  usernameV(u) match {
    case Valid      => Right(fakeSave)
    case Invalid(e) => Left(buildErrorMessage("Username", e))

Req 4. Explain all the reasons for rejection.

Users are complaining that they get an error message, fix their data accordingly only to have it then rejected for a different reason. They again fix their data and it is rejected again for yet another reason. It would be better to inform the user of all the things left to fix so they can amend their data in one shot.
For example, an error message could look like “Username 1) must be less than 20 chars, 2) must contain at least one number.”

In other words there can be 1 or more reasons for invalidity now. Ok, we'll amend Invalid appropriately...
case class Invalid(e1: String, en: List[String]) extends Validity

Then we just make the compiler happy...
 case class Validator[A](f: A => Validity) {
   @inline final def apply(a: A) = f(a)
   def +(v: Validator[A]) = Validator[A](a =>
-    apply(a) match {
-      case Valid         => v(a)
-      case e@ Invalid(_) => e
-    })
+    (apply(a), v(a)) match {
+      case (Valid          , Valid          ) => Valid
+      case (Valid          , e@ Invalid(_,_)) => e
+      case (e@ Invalid(_,_), Valid          ) => e
+      case (Invalid(e1,en) , Invalid(e2,em) ) => Invalid(e1, en ::: e2 :: em)
+    })
 object Validator {
   def pred[A](f: A => Boolean, err: => String) =
-    Validator[A](a => if (f(a)) Valid else Invalid(err))
+    Validator[A](a => if (f(a)) Valid else Invalid(err, Nil))
-def buildErrorMessage(field: String, err: String) = s"$field $err"
+def buildErrorMessage(field: String, h: String, t: List[String]): String = t match {
+  case Nil => s"$field $h"
+  case _   => (h :: t){case (e,i) => s"${i+1}) $e"}.mkString(s"$field ", ", ", ".")
 def example(u: String): Either[String, Effect] =
   usernameV(u) match {
     case Valid         => Right(fakeSave)
-    case Invalid(e)    => Left(buildErrorMessage("Username", e))
+    case Invalid(h, t) => Left(buildErrorMessage("Username", h, t))
(Note: If you're using Scalaz, NonEmptyList[A] is a better replacement for A, List[A] like I've done in Invalid. The same thing can also be achieved by OneAnd[List, A]. In fact OneAnd is a good way to have compiler-enforced non-emptiness.)

Req 5. Omit mutually-exclusive or redundant error messages.

Take the this error message: “Your name must 1) include a given name, 2) include a surname, 3) not be empty”. If the user forgot to enter their name you just want to say “hey you forgot to enter your name”, not bombard the user with details about potentially invalid names.

What does this mean? It means one rule is unnecessary if another rule fails. What we're really talking about here is the means by which rules are composed. Let's just add another composition method. We talked about the + operation in math already, well math also provides a multiplication operation too. Look at an expression like 6 + 14 + (7 * 8). Two types of composition, us explicitly clarifying our intent via parentheses. That's perfectly expressive to me and it solves our new requirement with simplicity and minimal dev. As a reminder that we can borrow from math without emulating it verbatim, instead of a symbol let's give this operation a wordy name like andIfSuccessful so that we can say nonEmpty andIfSuccessful containsNumber to indicate a validator that will only check for numbers if data isn't empty.

Just like these express different intents and yield different results
number = 4 * (2 + 10)
number = (4 * 2) + 10
So too can
rule = nonEmpty andIfSuccessful (containsNumber and isUnique)
rule = (nonEmpty andIfSuccessful containsNumber) and isUnique
Or if you don't mind custom operators
rule = nonEmpty >> (containsNumber + isUnique)
rule = (nonEmpty >> containsNumber) + isUnique

To implement this new requirement we add a single method to Validator:
def andIfSuccessful(v: Validator[A]) = Validator[A](a =>
  apply(a) match {
    case Valid           => v(a)
    case e@ Invalid(_,_) => e


And we're done.
It's not how I would've approached code years back in my OO/Java era, nor is it like any of the code I came across written by others in that job. As an experiment I started fulfilling these requirements in Java the way old me used to code and there was a loooot of wasted code between requirements. I'd get all annoyed at each new step, so much so that I didn't even bother finishing. On the contrary, I enjoyed writing the FP.

Right, what conclusions can we draw?

FP is simple. Each validation is a single function in a wrapper.
FP is flexible. Logic is reusable and can be assembled into complex expressions easily.
FP is easily maintainable & modifiable. It has less structure, less structural dependencies, and is less code, plus the compiler's got your back.
FP is easy on the author. There was next to no rewriting or throw-away of code between requirements, and each new requirement was easy to implement.

I hope this proves an effective concrete example of FP for programmers of different backgrounds. I also hope this enables you to write more reliable software and have a happier time doing it.

Go forth and function.

Monday, 9 June 2014

A Year of Functional Programming

It's been a year since I first came across the concept of functional programming. To say it's changed my life, is an unjust understatement. This is a reflection on that journey.

Warning: I use the term FP quite loosely throughout this article.

Where I Was

I've been coding since the age of 8, so 26 years now. I started with BASIC & different types of assembly then moved on to C, C++, PHP and Perl (those were different times, maaaan ☮), Java, JavaScript, Ruby. That's the brief gist of it. Basically: a lot of time on the clock and absolutely no exposure to FP concepts or languages. I thought functional programming just meant recursive functions (ooooo big deal right?). I really came in blind.

How It Started: Scala

Last year, I wanted the speed and static checking (note: not “types”) of Java with the conciseness and flexibility of Ruby. I came across Scala, skimmed a little and was impressed. I bought a copy of Scala For The Impatient and just ate it for breakfast. I read the entire thing in 2 or 3 days, jotted down everything useful and then just started coding. It was awesome! At first I was just coding the same way I would Java with less than half the lines of code. It is a very efficient Java.

Exposure: Haskell

Lurking around the Scala community, I came across a joke. Someone said “Scala is a gateway drug to Haskell.” I found that amusing, although not for the reasons that the author intended. Haskell? Isn't that some toy university language? An experiment or something. Is it even still alive? Scala's awesome and so powerful, why would that lead to Haskell? How.... intriguing. Inexplicably it piqued my interest and really stuck with me. Later I decided to look it up and yes, it sure was alive and very active. I was shocked to discover that it compiles to machine code (binary) and is comparable in speed to C/C++. What?! It seems idiotic now but a year ago I thought it was some interpreted equation solver. I'm not alone in that ignorance, sadly; talking about it to some mates over lunch last year and a friend incredulously burst out laughing, “Haskell?” as if I was trying to tell him my dishwasher had an impressive static typing system. It saddens me to realise how in the dark I was, and how many people still are. Haskell is pretty frikken awesome! True to the gateway drug prophecy, I do now look to Haskell as an improvement to using Scala. But let's get back on track.

Exposure: Scalaz

I also started seeing contentious discussion of some library called Scalaz. Curious, I had a look at the code to see what people were on about, and didn't understand it at all. I'd see classes like Blah[F[_], A, B], methods with confusing names that take params like G[_], A → C, F[X → G[Y]], implementations like F.boo_(f(x))(g)(x), and I'd just think “What the hell is this? How is this useful?”. I was used to methods that did something pertaining to a goal in its domain. This Scalaz code was very alien to me and yet, very intriguing. Some obviously-smart person spent time making the alphabet soup permutations, why?

I've since discovered that answer to that question and I never could've imagined the amount of benefit it would yield. Instead of methods with domain-specific meaning, I now see functions for morphisms, in accordance with relevant laws. Simply put: changing the shape or content of types. No mention of any kind of problem-domain; applicable to all problem-domains! It's been surprising over the last year to discover just how applicable this is. This kind of abstraction is prolific, it's in your code right now, disguised, and intertwined with your business logic. Without knowledge of these kind of abstractions (and the awareness that a type-component level of abstraction is indeed possible) you are doomed to reinvent the wheel for all time and not even realise it. Identifying and using these abstractions, your code becomes more reusable, simple, readable, flexible, and testable. When you learn to recognise it, it's breathtaking.

FP: The Basics

Now that I'd been exposed to FP I started actively learning about it. At first I learned about referential transparency, purity and side-effects. I nodded agreement but had major reservations about the feasibility of adhering to such principals, and so at first I wasn't really sold on it. Or rather, I was hesitant. I may have been guilty of mumbling sentences involving the term “real-world”. Next came immutability. Now I'm a guy who used to use final religiously in Java, const on everything in C++ back in the day, and here FP is advocating for data immutability. Not just religiously advocating but providing real, elegant solutions to issues that you encounter using fully immutable data structures. Wow! So with immutability, composability, lenses it had its hooks in me.

Next came advocation for more expressive typing and (G)ADTs. That appealed in theory too and again I was hesitant about its feasibility. Once I experimentally applied it to some parts of my project, I was blown away by how well it worked. That became the gateway into thinking of code/types algebraically, which lead to...

FP: Maths

I loved maths back in school and always found it easy. Reading FP literature I started coming across lots of math and at first thought “great! I'm awesome at maths!” but then, trying to make sense of some FP math stuff, I'd find myself spending hours clicking link after link, realising that I wasn't getting it and, in many cases, still couldn't even make sense of the notation. It became daunting. Even depressing. Frequently demotivating.

The good news is that everything you need is out there; you just have to be prepared to learn more than you think you need. I persisted, I stopped viewing it as an annoying bridge and starting treating it as a fun subject on its own and, before long things made sense again. It opens new doors when you learn it.

Example: I had a browser tab (about co-Yoneda lemma) open for 3 months because I couldn't make sense of it. It took months (granted not everyday) of trying then confusion then tangents to understand background and whatever it was that threw me off. Once I learned that final piece of background info, I went from understanding only the first 5% to 100%. It was a great feeling.

Feeling Intimidated

Looking back there were times when I felt learning FP quite intimidating. When I'm in/around/reading conversations between experienced FP'ers quite often I've seriously felt like a moron. I started wondering if I gave my head a good slap, would a potato fall out. It can be intimidating when you're not used to it. But really, my advice to you, Reader, is that everyone's nice and happy to help when you're confused. I have a problem asking for help but I've seen everyone else do it and be received kindly... then I swoop in an absorb all the knowledge, hehe.

It's a mindset change. I wish I'd known this earlier as it would've saved me frustration and doubt, but you kind of need to unlearn what you think you know about coding, then go back to the basics. Imagine you've driven trains for decades, and spontaneously decide you want to be a pilot. No, you can't just read a plane manual in the morning and be in Tokyo in the afternoon. No, if you grab a beer with experienced pilots you won't be able to talk about aviation at their level. It's normal, right? Be patient, learn the basics, have fun, you'll get there.

On that note, I highly recommend Fuctional Programming in Scala, it's a phenomenal book. It helped me wade my way from total confusion to comfortable comprehension on a large number of FP topics with which I was struggling trying to learn from blogs.

Realisation: Abstractions

Recently I looked at some code I wrote 8 months ago and was shocked! I looked at one file written in “good OO-style”, lots of inheritance and code reuse, and just thought “this is just a monoid and a bunch of crap because I didn't realise this is a monoid” so I rewrote the entire thing to about a third of the code size and ended up with double the flexibility. Shortly after I saw another file and this time thought “these are all just endofunctors,” and lo and behold, rewrote it to about a third of the code size and the final product being both easier to use and more powerful.

I now see more abstractions than I used to. Amazingly, I'm also starting to see similar abstractions outside of code, like in UI design, in (software) requirements. It's brilliant! If you're not on-board but aspire to write “DRY” code, you will love this.

Realisation: Confidence. Types vs Tests

I require a degree of confidence in my code/system, that varies with the project. I do whatever must be done to achieve that. In Ruby, that often meant testing it from every angle imaginable, which cost me significant effort and negated the benefit of the language itself being concise. In Java too, I felt the need to test rigorously.

At first I was the same in Scala, but since learning more FP, I test way less and have more confidence. Why? The static type system. By combining lower-level abstractions, an expressive type system, efficient and immutable data structures, and the laws of parametricity, in most cases when something compiles, it works. Simple as that. There are hard proofs available to you, I'm not talking about fuzzy feelings here. I didn't have much respect for static types coming from Java because it's hard to get much mileage out of it (even in Java 8 – switch over enum needs a default block? Argh fuck off! Gee maybe maybe all interfaces should have a catch-all method too then. That really boiled my blood the other day. Sorry-), anyway: Java as a static typing system is like an abusive alcoholic as a parent. They may put food on the table and clothes your back, but that's a far cry from a good parent. (And you'll become damaged.) Scala on the other hand teaches you to trust again. Trust. Trust in the compiler. I've come to learn that when you trust the compiler and can express yourself appropriately, entire classes of problems go away, and with it the need for testing. It's joyous.

Sadly though, eventually you get to a point where Scala starts to struggle. It gets stupid, it can't work out what you mean, what you're saying, you have to gently hold its hand and coax it with explicit type declarations or tricks with type variance or rank-n types. Once you get to that level you start to feel like you've outgrown Scala and now need a big boy's compiler which can lead to habitual grumbling and regular reading about Haskell, Idris, Agda, Coq, et al.

However when you do need tests, you can write a single test for a bunch of functions using a single expression. How? Laws. Properties. Don't know what I mean? Pushing an item on to a stack should always increase its size by 1, the popping of which should reduce its size by 1 and return the item pushed, and return a stack equivalent to what you started with. Using libraries like ScalaCheck, turning that into a single expression like pop(push(start, item)) == (start, item) which is essentially all you need to write; ScalaCheck will generates test data for you.

Where Next?

What does the future hold for me? Well, I could never go back to dynamically-typed language again.
I will stick with Scala as I've invested a lot in it and it's still the best language I know well. I'd like to get more hands-on experience with Haskell; I don't know its trade-offs that well but its type system seems angelic. Got my eye on Idris, too.


I used to get excited discovering new libraries. I'd always think “Great! I wonder what this will allow me to do.” Well now I feel that way about research & academic papers. They are the same thing except smarter, more lasting, more dependable, and they yield more flexibility. It's awesome and I've got decades of catching up to do! Over the next year I'll definitely spend a lot of time learning more FP and comp-sci theory. I'd also like to be able to understand most Haskell blogs I come across. They promise very intelligent constructs (which aren't specific to Haskell) but the typing usually gets a bit too dense; it'd be nice to be able to read those articles with ease.

Don't fall for the industry dogma that academia isn't applicable to the real-world. What a load of horseshit that lie is. It does apply to the real-world, it's here to help you achieve your goals, it will save you significant time and effort, even with the initial learning overhead considered. Don't say you don't have time to do things in half the time. If you're always busy and your business are super fast-paced agile scrum need-it-yesterday kinda people, well I know you don't have time, but what I'm offering is this: say you just need to get something out the door and can do it quickly and messily in 1000 lines in 6 hours with 10 bugs and 10 “abilities”, well if you spend a bit of your own time learning you could perform the same task in 400 lines in 4 hours with maybe 1 bug and 20 “abilities”. You've just saved 2 hours up-front, not to mention days of savings when adding new features, fixing bugs, etc. That's applicable to you, the “real-world” and the industry. I've spent years in the industry and not just as a coder and I wish I'd known about this stuff back then because it would've saved me so much time, effort and stress. There seems to be this odd disdain for academia throughout the industry. Reject it. It's an ignorant justification of laziness and short-sightedness. It's false. I encourage you to take the leap.