Most programming languages have a way to represent "no value" or "nothing". Python has None, Ruby has nil, Java and friends have null. JavaScript has two ways to represent this concept -- undefined and null.
How do we decide which one to use? For example, suppose I'm writing a function which is responsible for locating an object with certain qualities, but it is possible that no such object exists. If I detect that the latter is true, do I return undefined or null?
For another example, suppose such a function exists in a third-party library, and you are calling it. Should you check the result against null before proceeding? Against undefined? Against both just to be sure?
I attempted to arrive at a deep understanding of the answers to these questions.
What is the distinction between null and undefined?
We can define the difference simply by listing the situations where each pops up. (From what I can tell these are the most important ones, but I'm sure I'm overlooking a few.) My source is the ES6 spec.
- undefined arises
  - from accessing a variable that hasn't been assigned a value yet
  - from accessing an object property that hasn't been assigned to yet or doesn't exist
  - from capturing the return value of an expression or statement that has no other return value (e.g. Array.forEach)
  - from accessing a parameter within a function that wasn't provided by the caller
  - from builtin APIs that are allowed to return it (like Array.find)
  - by typing "undefined"
  - or by using the void operator.
- null arises
  - as the last element in the prototype chain (the prototype of Object.prototype)
  - by typing "null"
  - or via builtin APIs that are allowed to return it (like String.match, RegExp.exec or document.getElementById). Relatively few builtin APIs return null.
And of course each may occur at any time in third-party libraries.
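Most of the situations listed above can be reproduced in a few lines. This is only a sketch to make the list concrete; the variable names are mine, and document.getElementById is left out because it needs a browser.

```javascript
// Ways undefined shows up
let unassigned;                                   // declared, never assigned
const obj = {};
const missingProp = obj.nope;                     // property that does not exist
const forEachResult = [1, 2].forEach(() => {});   // call with no return value
function firstParam(a) { return a; }
const missingArg = firstParam();                  // parameter not supplied by the caller
const notFound = [1, 2, 3].find((n) => n > 10);   // builtin that is allowed to return it
const viaVoid = void 0;                           // the void operator

// Ways null shows up
const endOfChain = Object.getPrototypeOf(Object.prototype); // end of the prototype chain
const noMatch = "abc".match(/\d+/);               // String.prototype.match, no match
const noExec = /\d+/.exec("abc");                 // RegExp.prototype.exec, no match

console.log(unassigned, missingProp, forEachResult, missingArg, notFound, viaVoid);
console.log(endOfChain, noMatch, noExec);
```

The first line logged is six undefined values, the second is three null values.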
This is helpful, but it's not really answering the question: we care about the essence of the distinction between them. Is there such a thing? Is it just a list of rules we have to look up all the time or is there a guiding idea that we can remember and use to deduce all the details?
[Answer: Originally yes, but in practice you should just look up the details.]
What is the essence of the distinction between null and undefined?
Values in JavaScript are divided into primitive values and object values. undefined represents "this value is missing; it would have been either a primitive or an object". null means "this value is missing; it would have been an object".
This explains why Array.find returns undefined -- that array could have contained either primitives or objects (or a mix of both), so we can't say what the return value would have been. But with String.match, the return value is supposed to be an Array of results -- hence we can more specifically return null when no results are available. (We don't return an empty Array for ...reasons.) "If there was a value, it would have been an object" -- the value we use to represent this situation is null.
Hence, null is more specific than undefined.
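To see the reasoning in action, here is a small sketch comparing what each API hands back on success and on failure. The sample data is made up for illustration.

```javascript
// Array.prototype.find: the array may hold primitives or objects,
// so "nothing found" is reported as undefined.
const mixed = [1, "two", { three: 3 }];
const foundNothing = mixed.find((item) => item === "four");

// String.prototype.match: a successful result is always an Array (an object),
// so "no match" is reported as null.
const matched = "order 42".match(/\d+/);
const unmatched = "no digits".match(/\d+/);

console.log(foundNothing);                        // undefined
console.log(Array.isArray(matched), matched[0]);  // true 42
console.log(unmatched);                           // null
```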
This understanding is based on the specification, which defines "undefined" as "primitive value used when a variable has not been assigned a value" and "null" as "primitive value that represents the intentional absence of any object value", as well as Axel Rauschmayer's book Speaking JS and TJ Crowder's thorough StackOverflow answer.
Here is an excerpt from Speaking JS which I found helpful:
A single nonvalue could play the roles of both undefined and null. Why does JavaScript have two such values? The reason is historical.
JavaScript adopted Java’s approach of partitioning values into primitives and objects. It also used Java’s value for “not an object,” null. Following the precedent set by C (but not Java), null becomes 0 if coerced to a number:
5 + null === 5
Remember that the first version of JavaScript did not have exception handling. Therefore, exceptional cases such as uninitialized variables and missing properties had to be indicated via a value. null would have been a good choice, but Brendan Eich wanted to avoid two things at the time:
- The value shouldn’t have the connotation of a reference, because it was about more than just objects.
- The value shouldn’t coerce to 0, because that makes errors harder to spot.

As a result, Eich added undefined as an additional nonvalue to the language. It coerces to NaN:
5 + undefined  // NaN
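The two coercion results described in the excerpt are easy to check for yourself. In this sketch I use Number.isNaN for the second check, since NaN never compares equal to anything, itself included.

```javascript
// null coerces to 0 in numeric contexts; undefined coerces to NaN.
const withNull = 5 + null;
const withUndefined = 5 + undefined;

console.log(withNull === 5);               // true
console.log(Number.isNaN(withUndefined));  // true

// NaN is never strictly equal to anything, including itself,
// so comparing the result against NaN with === yields false.
console.log(withUndefined === NaN);        // false
```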
How would you use this distinction in practice?
Suppose you're writing function mean(numbers) which takes in an array and returns the average of the numbers in that array. If passed an empty array, what do you do? Following this understanding, you should return undefined. It would be wrong to return null, because that would indicate that this function could have returned an object, when really it returns a Number -- a primitive.
On the other hand, if you're writing a function that searches a list of Appointment objects for the one under a certain name, and there are no appointments for that name, you should return null. The return type could only have been Appointment, so you can give more information by returning a null rather than an undefined.
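Here is roughly how those two functions might look if you followed the primitive/object rule strictly. This is a sketch: findAppointment and the { name, time } shape are hypothetical stand-ins I made up for the Appointment example.

```javascript
// Returns a Number (a primitive), so "no answer" is undefined.
function mean(numbers) {
  if (numbers.length === 0) return undefined;
  const total = numbers.reduce((sum, n) => sum + n, 0);
  return total / numbers.length;
}

// Returns an Appointment (an object), so "no answer" is null.
// Array.prototype.find reports a miss as undefined, so convert it explicitly.
function findAppointment(appointments, name) {
  const match = appointments.find((appt) => appt.name === name);
  return match === undefined ? null : match;
}

const appointments = [{ name: "Ada", time: "09:00" }];
console.log(mean([]));                                   // undefined
console.log(mean([1, 2, 3]));                            // 2
console.log(findAppointment(appointments, "Ada").time);  // 09:00
console.log(findAppointment(appointments, "Grace"));     // null
```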
Great, so are these questions resolved?
No, because I wouldn't recommend actually doing that in practice without understanding the conventions of the libraries you're working with and communicating with the developers you're working with. The consensus of the JavaScript community seems to be that "JavaScript uses undefined and programmers should use null", although I'm surprised at how hard it has been to find an authoritative word on the matter. (See Airbnb's popular style guide, which does not mention the issue.)
So someone writing a mean function may be returning null, and you cannot confidently check against only undefined.
Perhaps the lack of consensus is because it's also conventional to use x == null, which under the loose-equality rules is true when x is either null or undefined, hence masking the difference between the two. I find this very misleading for someone who does not understand exactly what is going on, but some JavaScript developers use it (including major libraries such as Underscore and jQuery), and it is convenient.
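For anyone who hasn't seen the idiom, this sketch shows what the loose comparison does and doesn't match. The helper names isMissing and isMissingStrict are mine, not from any library.

```javascript
// Under loose equality, null and undefined are equal to each other
// and to nothing else, so this catches both "missing" values.
function isMissing(x) {
  return x == null;
}

console.log(isMissing(null));       // true
console.log(isMissing(undefined));  // true
console.log(isMissing(0));          // false
console.log(isMissing(""));         // false
console.log(isMissing(false));      // false

// The explicit equivalent, which spells out what is being checked.
function isMissingStrict(x) {
  return x === null || x === undefined;
}

console.log(isMissingStrict(undefined));  // true
```

Other falsy values such as 0, "" and false are not matched, which is why the idiom is safer than a plain truthiness check.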
In conclusion, undefined vs null is a quirk of the language that must be tamed through communication and documentation.
Can we say anything for certain?
Yes -- we can say that the original distinction between null and undefined was motivated by a distinction between primitive and object values. Although we can invent new semantics for them for our own use, we should be careful that our new semantics are actually consistent with behavior, and we should be clear that the semantics are invented.
It's tempting to take away something like "undefined means uninitialized, null means initialized but not present" (e.g. in this popular Stack Overflow answer), and indeed this is true in many cases. But it does not explain why Array.find returns undefined rather than null, and so it will confuse you in the long run if you mistake it for a full understanding.
Likewise, if in our programs we decide always to use null or always to use undefined to indicate missing data, we should be clear that this is just a convention we are adopting, and be aware that the builtin APIs will disagree with us at times.
What does this mean about the world?
JavaScript was famously prototyped over 10 days in 1995 by Brendan Eich. It's not news that it has many inconsistencies and strange design decisions. But I find it enlightening to observe that there actually is a consistent distinction between null and undefined -- just not one that is widely understood or relevant today.
We can say now that it would probably have been better to just have undefined or just have null in the language -- but only with the benefit of hindsight and many other examples of similar scripting languages. We could not have done a better job than Brendan Eich in 1995.
Paradigms such as programming languages can spread regardless of merit, and all paradigms look good from the inside. JavaScript is now among the most widely used programming languages in the world.
[Minor edits on 2/24/2019 for clarity]