The Definition of “Pure Function”
This lesson has two goals:
Properly define the term “pure function.”
Show a few examples of pure functions.
It also tries to simplify the pure function definition, and shares a tip on how to easily identify many impure functions.
As I mentioned in the “What is Functional Programming?” chapter, I define functional programming (FP) like this:
Functional programming is a way of writing software applications using only pure functions and immutable values.
Because that definition uses the term “pure functions,” it’s important to understand what a pure function is. I gave a partial pure function definition in that chapter, and now I’ll provide a more complete definition.
As with the term functional programming, different people will give you different definitions of a pure function. I provide links to some of those at the end of this lesson, but skipping those for now, Wikipedia defines a pure function like this:
The function always evaluates to the same result value given the same argument value(s). It cannot depend on any hidden state or value, and it cannot depend on any I/O.
Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.
That’s good, but I prefer to reorganize those statements like this:
A pure function depends only on (a) its declared input parameters and (b) its algorithm to produce its result. A pure function has no “back doors,” which means:
Its result can’t depend on reading any hidden value outside of the function scope, such as another field in the same class or global variables.
It cannot modify any hidden fields outside of the function scope, such as other mutable fields in the same class or global variables.
It cannot perform any external I/O. It can’t rely on input from files, databases, web services, UIs, etc.; it can’t produce output, such as writing to a file, database, or web service, writing to a screen, etc.
A pure function does not modify its input parameters.
This can be summed up concisely with this definition:
A pure function is a function that depends only on its declared input parameters and its algorithm to produce its output. It does not read any other values from “the outside world” — the world outside of the function’s scope — and it does not modify any values in the outside world.
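Here’s a minimal sketch of that definition in Scala. The names PurityExamples, double, taxRate, addTax, and addTaxPure are hypothetical, made up only for this illustration. The first function depends solely on its parameter, the second reads a hidden value from outside its scope, and the third shows one way to close that back door:

```scala
object PurityExamples {
  // Pure: the result depends only on the declared parameter `i`.
  def double(i: Int): Int = i * 2

  // Hypothetical hidden state that lives outside the functions below.
  var taxRate: Double = 0.10

  // Impure: the result also depends on `taxRate`, which is not a parameter,
  // so the same `price` can produce different results over time.
  def addTax(price: Double): Double = price * (1 + taxRate)

  // A pure alternative: the rate is now a declared input.
  def addTaxPure(price: Double, rate: Double): Double = price * (1 + rate)
}
```

Calling addTax(100.0) before and after someone changes taxRate gives two different answers for the same argument, which is exactly what the definition rules out.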
Once you’ve seen a formal pure function definition, I prefer this short mantra:
Output depends on input
I like that because it’s short and easy to remember, but technically it isn’t 100% accurate because it doesn’t address side effects. A more accurate way of saying this is:
Output depends only on input
No side effects
You can represent that as shown in Figure [fig:equationHowPureFunctionsWork].
A simpler version of that equation is shown in Figure [fig:equationHowPFsWorkSimpler].
In this book I’ll generally either write, “Output depends on input,” or show one of these images.
Another way to state this is that the universe of a pure function is only the input it receives, and the output it produces, as shown in Figure [fig:universeOfAPureFunction].
If it seems like I’m emphasizing this point a lot, it’s because I am(!). One of the most important concepts of functional programming is that FP applications are built almost entirely with pure functions, and pure functions are very different from what I used to write in my OOP career. A great benefit of pure functions is that when you’re writing them you don’t have to think about anything else; all you have to think about is the universe of this function: what’s coming in and what’s going out.
Given the definition of pure functions and these simpler mantras, let’s look at some examples of pure and impure functions.
Mathematical functions are great examples of pure functions because it’s pretty obvious that “output depends only on input.” Methods like these in scala.math._ are all pure functions:
I refer to these as “methods” because they are defined using def in the package object math. However, these methods work just like functions, so I also refer to them as pure functions.
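As a quick illustration, here are a few scala.math methods in action. The particular methods are my own selection for this sketch, and the object name MathExamples is hypothetical. Each call produces the same result every time it’s given the same arguments:

```scala
import scala.math.{abs, ceil, max, min}

object MathExamples {
  val a: Int = abs(-5)        // 5
  val c: Double = ceil(4.2)   // 5.0
  val hi: Int = max(3, 7)     // 7
  val lo: Int = min(3, 7)     // 3
}
```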
Because a Scala String is immutable, nearly every method available to a String is a pure function, including:
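A small sketch of what that looks like in practice follows. The methods shown are ones I picked for illustration, and StringExamples is a hypothetical name. Note that none of these calls changes the original string:

```scala
object StringExamples {
  val s: String = "functional"

  val empty: Boolean = s.isEmpty        // false
  val len: Int = s.length               // 10
  val sub: String = s.substring(0, 3)   // "fun"
  // `s` itself is still "functional" after all of these calls.
}
```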
Many methods that are available on Scala’s collections classes fit the definition of a pure function, including the common ones:
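To make that concrete, here is a minimal sketch using an immutable List. The methods are my own picks for illustration, and CollectionExamples is a hypothetical name. One assumption worth stating: a method like filter or map only behaves as a pure function when the function you pass to it is itself pure, as it is here:

```scala
object CollectionExamples {
  val xs: List[Int] = List(1, 2, 3, 4, 5)

  val dropped: List[Int] = xs.drop(2)             // List(3, 4, 5)
  val evens: List[Int] = xs.filter(_ % 2 == 0)    // List(2, 4)
  val doubled: List[Int] = xs.map(_ * 2)          // List(2, 4, 6, 8, 10)
  // `xs` is unchanged; each call returned a new list.
}
```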
Conversely, the following functions are impure.
Going right back to the collections classes, the foreach method is impure. foreach is used only for its side effects, which you can tell by looking at its signature:
def foreach(f: (A) => Unit): Unit
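Here’s a small sketch of why a Unit result implies a side effect. The names ForeachExample, seen, record, and doubled are hypothetical. Because foreach hands nothing back, the only way record can accomplish anything is by changing something outside itself, while map returns a new value instead:

```scala
import scala.collection.mutable.ListBuffer

object ForeachExample {
  // Hypothetical mutable state that lives outside the functions below.
  val seen: ListBuffer[Int] = ListBuffer.empty[Int]

  // foreach returns Unit, so its only observable effect here is the
  // side effect of appending each element to `seen`.
  def record(xs: List[Int]): Unit = xs.foreach(x => seen += x)

  // By contrast, map returns a new list and leaves `xs` untouched.
  def doubled(xs: List[Int]): List[Int] = xs.map(_ * 2)
}
```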
Date and time related methods like getMinute are all impure because their output depends on something other than their inputs. Their results rely on some form of hidden I/O.
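As a sketch of where the hidden input comes from, the example below assumes the current time is obtained with java.time.LocalTime.now(). The names TimeExamples, currentMinute, and minuteOf are hypothetical. The first function quietly reads the system clock; the second takes the time as a declared parameter, so its output depends only on its input:

```scala
import java.time.LocalTime

object TimeExamples {
  // Impure: takes no input, yet returns a different value depending on
  // when it is called, because it reads the system clock.
  def currentMinute(): Int = LocalTime.now().getMinute

  // Pure: the time is a declared input, so the same argument always
  // yields the same result.
  def minuteOf(time: LocalTime): Int = time.getMinute
}
```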
Methods on the scala.util.Random class like nextInt are also impure because their output depends on something other than their inputs.
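The following sketch shows the difference. The names RandomExamples, roll, and rollSeeded are hypothetical, and passing the seed as a parameter is just one possible way to make the dependency explicit, not the only one:

```scala
import scala.util.Random

object RandomExamples {
  // Impure: two calls with the same argument can return different values,
  // because the result depends on the generator's hidden internal state.
  def roll(sides: Int): Int = Random.nextInt(sides) + 1

  // A pure alternative (sketch): the seed is a declared input, and a fresh
  // generator is created on every call, so the same seed and the same
  // `sides` always produce the same result.
  def rollSeeded(seed: Long, sides: Int): Int =
    new Random(seed).nextInt(sides) + 1
}
```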
In general, impure functions do one or more of these things:
Read hidden inputs (variables not explicitly passed in as function input parameters)
Write hidden outputs
Mutate the parameters they are given
Perform some sort of I/O with the outside world
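Of the items in that list, mutating a parameter is probably the easiest one to do by accident, so here is a minimal sketch of it. The names MutationExamples, addZeroInPlace, and addZero are hypothetical:

```scala
import scala.collection.mutable.ArrayBuffer

object MutationExamples {
  // Impure: changes the buffer the caller passed in.
  def addZeroInPlace(buf: ArrayBuffer[Int]): Unit = { buf += 0 }

  // Pure: leaves its input untouched and returns a new list.
  def addZero(xs: List[Int]): List[Int] = xs :+ 0
}
```

After calling addZeroInPlace the caller’s buffer is different than it was before the call, while the list passed to addZero is exactly as it was.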
By looking at function signatures only, there are two ways you can identify many impure functions:
They don’t have any input parameters
They don’t return anything (or they return Unit in Scala, which is the same thing)
For example, here’s the signature for the println method of the Scala Predef object:
def println(x: Any): Unit
println is such a commonly used method that you already know it writes information to the outside world, but if you didn’t know that, its Unit return type would be a terrific hint of that behavior.
Similarly, when you look at the “read*” methods that were formerly in Predef (and are now in scala.io.StdIn), you’ll see that a method like readLine takes no input parameters, which is also a giveaway that it is impure:
def readLine(): String
Because it takes no input parameters, the mantra “Output depends only on input” clearly can’t apply to it.
If a function has no input parameters, how can its output depend on its input?
If a function has no result, it must have side effects: mutating variables, or performing some sort of I/O.
While this is an easy way to spot many impure functions, other impure methods can have both (a) input parameters and (b) a non-Unit return type, but still be impure because they read variables outside of their scope, mutate variables outside of their scope, or perform I/O.
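Here’s a sketch of a function whose signature gives nothing away. The names HiddenImpurity, callCount, calls, and add are hypothetical. Judging by (Int, Int) => Int alone you would guess it’s pure, but the body both mutates a variable outside its scope and writes to the console:

```scala
object HiddenImpurity {
  // Hypothetical state outside the function's scope.
  private var callCount: Int = 0
  def calls: Int = callCount

  // Has input parameters and a non-Unit return type, yet is impure.
  def add(a: Int, b: Int): Int = {
    callCount += 1                 // mutates a variable outside its scope
    println(s"adding $a and $b")   // performs I/O
    a + b
  }
}
```

The returned sum is always correct, which is why this kind of impurity is easy to miss: you only find it by reading the body, not the signature.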
As you saw in this lesson, this is my formal definition of a pure function:
A pure function is a function that depends only on its declared inputs and its internal algorithm to produce its output. It does not read any other values from “the outside world” — the world outside of the function’s scope — and it does not modify any values in the outside world.
Once you understand the complete definition, I prefer the short mantra:
Output depends on input
or this more accurate statement:
Output depends only on input
No side effects
Now that you’ve seen the definition of a pure function, I’ll show some problems that arise from using impure functions, and then summarize the benefits of using pure functions.