The Benefits of Pure Functions
The goal of this lesson is simple: to list and explain the benefits of writing pure functions.
My favorite benefits of pure functions are:
They’re easier to reason about
They’re easier to combine
They’re easier to test
They’re easier to debug
They’re easier to parallelize
FP developers talk about other benefits of writing pure functions. For instance, Venkat Subramaniam adds these benefits:
They are idempotent
They offer referential transparency
They are memoizable
They can be lazy
In this lesson I’ll examine each of these benefits.
Pure functions are easier to reason about than impure functions, and I cover this in detail in the lesson, “Pure Function Signatures Tell All.” The key point is that because a pure function has no side effects or hidden I/O, you can get a terrific idea of what it does just by looking at its signature.
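As a quick sketch of that idea, compare these two made-up functions (neither is from this book’s examples). The first signature shows everything the function consumes and produces; the second takes no input and returns nothing, so the only reason it can exist is to perform a side effect:

// a pure function: the signature alone tells you what it consumes
// and produces
def sum(xs: List[Int]): Int = xs.foldLeft(0)(_ + _)

// an impure function: no inputs and a Unit result imply hidden I/O
def updateScreen(): Unit = println("redrawing ...")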
Because “output depends only on input,” pure functions are easy to combine together into simple solutions. For example, you’ll often see FP code written as a chain of function calls, like this:
val x = doThis(a).thenThis(b)
.andThenThis(c)
.doThisToo(d)
.andFinallyThis(e)
This capability is referred to as functional composition. I’ll demonstrate more examples of it throughout this book.
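The method names in that example are placeholders, but here’s the same style with real standard-library methods, as a small, runnable sketch. Each step is a pure transformation whose output feeds the next:

val result = List(1, 2, 3, 4, 5)
  .map(_ * 2)           // List(2, 4, 6, 8, 10)
  .filter(_ > 4)        // List(6, 8, 10)
  .foldLeft(0)(_ + _)   // 24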
As you’ll see in the “FP is Like Unix Pipelines” lesson, Unix pipelines work extremely well because most Unix commands are like pure functions: they read input and produce transformed output based only on the inputs and the algorithm you supply.
As I showed in the “Benefits of Functional Programming” chapter and the unit test in the previous lesson, pure functions are easier to test than impure functions. I expand on this in several other lessons in this book, including the lesson on property-based testing.
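As a tiny sketch of why that’s true (using a made-up add function rather than the book’s examples), testing a pure function requires no mocks, stubs, or setup. You pass in values and check the result:

// a pure function under test
def add(a: Int, b: Int): Int = a + b

// no fixtures, no hidden state to arrange before the test
assert(add(2, 3) == 5)
assert(add(-1, 1) == 0)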
In the “Benefits of Functional Programming” chapter I wrote that on a large scale, FP applications are easier to debug. In the small scale, pure functions are also easier to debug than their impure counterparts. Because the output of a pure function depends only on the function’s input parameters and your algorithm, you don’t need to look outside the function’s scope to debug it.
Contrast that with having to debug the makeCookies function in the previous lesson. Because love is a hidden input, you have to look outside the function’s scope to determine what love’s state was at the time makeCookies was called, and how that state was set.
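If it helps to see the shape of that problem, here’s a highly simplified sketch of a hidden input at work (this is not the actual makeCookies code from the previous lesson):

// `love` is a hidden input: it lives outside the function, and some
// other part of the program may have changed it before this call
var love = false

def makeCookies(qty: Int): String =
  if (love) s"$qty cookies, made with love" else s"$qty cookies"

// to debug makeCookies you must ask, "who set `love`, and when?"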
In that same chapter I also wrote that it’s easier to write parallel/concurrent applications with FP. Because all of those same reasons apply here, I won’t repeat them, but I will show one example of how a compiler can optimize code within a pure function.
I’m not a compiler writer, so I’ll begin with this statement from the “pure functions” section of the Wikipedia functional programming page:
“If there is no data dependency between two pure expressions, then their order can be reversed, or they can be performed in parallel and they cannot interfere with one another (in other terms, the evaluation of any pure expression is thread-safe).”
As an example of what that means, in this code:
val x = f(a)
val y = g(b)
val z = h(c)
val result = x + y + z
there are no data dependencies between the first three expressions, so they can be executed in any order. The only thing that matters is that they are executed before the assignment to result. If the compiler/interpreter wants to run those expressions in parallel, it can do that and then merge their values in the final expression. This can happen because (a) the functions are pure, and (b) there are no dependencies between the expressions.
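Here’s a sketch of that idea using Scala futures, with f, g, and h standing in as made-up pure functions:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

def f(a: Int): Int = a * 2    // made-up pure functions
def g(b: Int): Int = b + 10
def h(c: Int): Int = c - 1

// because the functions are pure and the expressions have no data
// dependencies, the three calls can safely run in parallel
val xF = Future(f(1))
val yF = Future(g(2))
val zF = Future(h(3))

// merge the three values in the final expression
val result = Await.result(
  for { x <- xF; y <- yF; z <- zF } yield x + y + z,
  1.second
)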
That same Wikipedia page also states:
“If the entire language does not allow side-effects, then any evaluation strategy can be used; this gives the compiler freedom to reorder or combine the evaluation of expressions in a program (for example, using deforestation).”
The 2006 article, Functional Programming for the Rest of Us, makes a point similar to these Wikipedia quotes: “An interesting property of functional languages is that they can be reasoned about mathematically. Since a functional language is simply an implementation of a formal system, all mathematical operations that could be done on paper still apply to the programs written in that language. The compiler could, for example, convert pieces of code into equivalent but more efficient pieces with a mathematical proof that two pieces of code are equivalent. Relational databases have been performing these optimizations for years. There is no reason the same techniques can’t apply to regular software.”
I don’t use the word “idempotent” very often, so I’ll quote from Venkat Subramaniam’s explanation of the benefit of idempotence as it applies to pure functions (with a few minor edits by me):
The word idempotent has a few different meanings … a function or operation is idempotent if the result of executing it multiple times for a given input is the same as executing it only once for the same input. If we know that an operation is idempotent, we can run it as many times as we like … it’s safe to retry.
In a related definition, in A practical introduction to functional programming, Mary Rose Cook states:
A process is deterministic if repetitions yield the same result every time.
The terms idempotent and deterministic are similar to a favorite phrase of mine: you can call a pure function an infinite number of times and always get the same result.
Honestly, with these definitions it feels like I’m writing, “A benefit of pure functions is that they are pure functions.” My only reason for keeping this section is so that you have some exposure to the terms idempotent and deterministic.
This demonstrates that, as with many other uncommon phrases in functional programming, you can understand a concept long before you learn that someone created a label for it.
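In code, the idea is as simple as it sounds. Calling a made-up pure function like this one any number of times with the same input always returns the same result, so it’s always safe to retry:

def double(i: Int): Int = i * 2

double(21)   // 42
double(21)   // 42, the same result every time

// contrast with an impure function, whose result differs on each call
def now(): Long = System.currentTimeMillis()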
Referential transparency (RT) is another technical term that you’ll hear in the FP world. It’s similar to idempotency, and refers to what you (and a compiler) can do because your functions are pure.
If you like algebra, you’ll like RT. It’s said that an expression is referentially transparent if it can be replaced by its resulting value without changing the behavior of the program.
For instance, assume that x and y are immutable values within some scope of an application, and within that scope they’re used to form this expression:
x + y
Then you can assign this expression to a third variable z:
val z = x + y
Now, throughout the given scope of your program, anywhere the expression x + y is used, it can be replaced by z without affecting the result of the program (and vice versa).
Note that although I state that x and y are immutable values, they can also be the result of pure functions. For instance, "hello".length + "world".length will always be 10. This result could be assigned to z, and then z could be used everywhere instead of this expression. In Scala this looks like this:
val x = "hello".length // 5
val y = "world".length // 5
val z = x + y // 10
Because all of those values are immutable, you can use z anywhere you might use x + y, and in fact, in this example you can replace z with 10 anywhere, and your program will run exactly the same.
In FP we say things like, “10 cannot be reduced any more.” (More on this later.)
Conversely, if x or y were an impure function, such as a “get the current time” function, z could not be a reliable replacement for x + y at different points in the application.
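Here’s a sketch of that failure, using the standard library’s current-time function:

val t1 = System.currentTimeMillis()
Thread.sleep(10)   // time passes
val t2 = System.currentTimeMillis()

// t1 == t2 is false: the "same" expression produced two different
// values, so no single cached value can stand in for it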
Because a pure function always returns the same result when given the same inputs, a compiler (or your application) can also use caching optimizations, such as memoization.
Wikipedia defines memoization like this:
“Memoization is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.”
For example, I previously noted that my Android football game has this function call:
val possiblePlays = OffensiveCoordinator.determinePossiblePlays(gameState)
The determinePossiblePlays function currently has several thousand lines of pure functions behind it, and over time it’s only going to get more complicated. Although this function doesn’t currently use memoization, it would be fairly simple to create a cache for it, so that each time it received the same gameState it would return the same result.
The cache could be implemented as a Map with the type Map[GameState, Seq[OffensivePlay]]. Then when determinePossiblePlays receives a GameState instance, it could perform a fast lookup in this cache.
While those statements are true, I don’t want to oversimplify this too much. determinePossiblePlays makes decisions based on many GameState factors, including two important ones: (a) the game score and (b) the time remaining. Those two variables would have to be factors in any cache.
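Here’s a minimal sketch of such a cache, with made-up stand-ins for the game’s real types. Because the score and time remaining are fields of GameState, they’re automatically part of the cache key:

import scala.collection.mutable

// made-up stand-ins for the game's real types; as case classes they
// get equals and hashCode for free, so they work as map keys
case class GameState(score: Int, timeRemaining: Int)
case class OffensivePlay(name: String)

val cache = mutable.Map[GameState, Seq[OffensivePlay]]()

def determinePossiblePlays(state: GameState): Seq[OffensivePlay] =
  // run the expensive pure computation only on a cache miss
  cache.getOrElseUpdate(state, calculatePlays(state))

def calculatePlays(state: GameState): Seq[OffensivePlay] =
  Seq(OffensivePlay("run"), OffensivePlay("pass"))   // placeholder logic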
Laziness is a major feature of the Haskell language, where everything is lazy. In Scala I primarily use laziness with large data sets and streams, so I haven’t personally taken advantage of this benefit yet.
(I’ll update this benefit when I have a good Scala example.)
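In the meantime, here’s a quick sketch of the two basic laziness tools Scala provides, lazy val and LazyList (the Scala 2.13 replacement for Stream):

// a lazy val isn't evaluated until the first time it's referenced
lazy val expensive: Int = { println("computing ..."); 42 }

// a LazyList computes its elements only on demand, so this
// "infinite" sequence is safe to define
val naturals: LazyList[Int] = LazyList.from(1)
val firstTen = naturals.take(10).toList   // forces only ten elements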
In this lesson I wrote about the benefits of pure functions. My favorite benefits are:
They’re easier to reason about
They’re easier to combine
They’re easier to test
They’re easier to debug
They’re easier to parallelize
Other FP developers write about these benefits of pure functions:
They are idempotent
They offer referential transparency
They are memoizable
They can be lazy
Wikipedia has a good discussion on the benefits of “pure functions” on their Functional Programming page
Stack Exchange provides a definition of referential transparency
Stack Overflow says, Don’t worry about the term RT, it’s for pointy-headed purists
Venkat Subramaniam’s post on the benefits of pure functions
If you like debates on the precise meaning of technical terms, reddit.com has a thread titled, Purity and referential transparency are different