'null' vs. 'undefined' in Javascript Wasn't Originally Total Chaos, But Now It May As Well Be

Most programming languages have a way to represent "no value" or "nothing". Python has None, Ruby has nil, Java and friends have null. Javascript has two ways to represent this concept -- undefined and null.

How do we decide which one to use? For example, suppose I'm writing a function which is responsible for locating an object with certain qualities, but it is possible that no such object exists. If I detect that the latter is true, do I return undefined or null?

For another example, suppose such a function exists in a third-party library, and you are calling it. Should you check the result against null before proceeding? Against undefined? Against both, just to be sure?

I attempted to arrive at a deep understanding of the answers to these questions.

What is the distinction between null and undefined?

We can define the difference by simply listing the situations where each pops up. (From what I can tell these are the most important ones, but I'm sure I'm overlooking a few.) My source is the ES6 spec.

  • undefined arises

    • from accessing a variable that hasn't been assigned a value yet
    • from accessing an object property that hasn't been assigned to yet or doesn't exist
    • from capturing the return value of an expression or statement that has no other return value (e.g. Array.forEach)
    • from accessing a parameter within a function that wasn't provided by the caller
    • from builtin APIs that are allowed to return it (like Array.find)
    • by typing "undefined"
    • or by using the void operator.
  • null arises

    • as the last element in the prototype chain (the prototype of Object.prototype)
    • by typing "null"
    • or via builtin APIs that are allowed to return it (like String.match, RegExp.exec or document.getElementById). Relatively few builtin APIs return null.

And of course each may occur at any time in third-party libraries.
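
Here is a quick sketch of several of these cases, as they might appear in a REPL (the variable names and callbacks are arbitrary examples, not anything from the spec):

let x;                                   // declared but never assigned
x;                                       // undefined

const obj = {};
obj.missing;                             // undefined -- property doesn't exist

[1, 2, 3].forEach(n => n);               // undefined -- forEach has no other return value
((a, b) => b)(1);                        // undefined -- parameter b wasn't provided
[1, 2, 3].find(n => n > 5);              // undefined -- no matching element
void 0;                                  // undefined -- the void operator

Object.getPrototypeOf(Object.prototype); // null -- the end of the prototype chain
"abc".match(/x/);                        // null -- a builtin API that returns null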

This is helpful, but it doesn't really answer the question: we care about the essence of the distinction between them. Is there such a thing? Is it just a list of rules we have to look up all the time, or is there a guiding idea that we can remember and use to deduce all the details?

[Answer: Originally yes, but in practice you should just look up the details.]

What is the essence of the distinction between null and undefined?

Values in Javascript are divided into primitive values and object values. undefined represents "this value is missing; it would have been either a primitive or an object". null means "this value is missing; it would have been an object".

This explains why Array.find returns undefined -- the array could have contained either primitives or objects (or a mix of both), so we can't say what the return value would have been. But in String.match, the return value is supposed to be an Array of results -- hence we can more specifically return null when no results are available. (We don't return an empty Array for ...reasons.) "If there was a value, it would have been an object" -- the value we use to represent this situation is null.
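
Here is how that looks in a REPL (the array contents and the regular expression are arbitrary):

[1, {}, "three"].find(v => v === 2); // undefined -- the missing value could have been anything
"hello".match(/xyz/);                // null -- the missing value would have been an Array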

Hence, null is more specific than undefined.

This understanding is based on the specification, which defines "undefined" as "primitive value used when a variable has not been assigned a value" and "null" as "primitive value that represents the intentional absence of any object value", as well as Axel Rauschmayer's book Speaking JS and TJ Crowder's thorough StackOverflow answer.

Here is an excerpt from Speaking JS which I found helpful:

A single nonvalue could play the roles of both undefined and null. Why does JavaScript have two such values? The reason is historical.

JavaScript adopted Java’s approach of partitioning values into primitives and objects. It also used Java’s value for “not an object,” null. Following the precedent set by C (but not Java), null becomes 0 if coerced to a number:

5 + null === 5

Remember that the first version of JavaScript did not have exception handling. Therefore, exceptional cases such as uninitialized variables and missing properties had to be indicated via a value. null would have been a good choice, but Brendan Eich wanted to avoid two things at the time:

  • The value shouldn’t have the connotation of a reference, because it was about more than just objects.
  • The value shouldn’t coerce to 0, because that makes errors harder to spot.

As a result, Eich added undefined as an additional nonvalue to the language. It coerces to NaN:

Number.isNaN(5 + undefined) === true // NaN never equals itself, so "=== NaN" would never hold

How would you use this distinction in practice?

Suppose you're writing a function mean(numbers) which takes in an array and returns the average of the numbers in that array. If passed an empty array, what do you do? Following this understanding, you should return undefined. It would be wrong to return null, because that would indicate that this function could have returned an object, when really it returns a Number -- a primitive.
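
A minimal sketch of that choice (the reduce is just one way to compute the sum):

function mean(numbers) {
  if (numbers.length === 0) {
    // The missing value would have been a Number -- a primitive --
    // so undefined is the appropriate nonvalue.
    return undefined;
  }
  return numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
}

mean([1, 2, 3]); // 2
mean([]);        // undefined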

On the other hand, if you're writing a function that searches a list of Appointment objects for the one under a certain name, and there are no appointments for that name, you should return null. The return value could only have been an Appointment, so you can give more information by returning null rather than undefined.
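
A sketch of that case too (findAppointment, appointments, and the name property are hypothetical names):

function findAppointment(appointments, name) {
  for (const appointment of appointments) {
    if (appointment.name === name) {
      return appointment;
    }
  }
  // The missing value would have been an object (an Appointment),
  // so null is the more specific choice.
  return null;
}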

Great, so are these questions resolved?

No, because I wouldn't recommend actually doing this in practice without understanding the conventions of the libraries you depend on and communicating with the developers you work with. The consensus of the Javascript community seems to be that "Javascript uses undefined and programmers should use null", although I've been surprised by how hard it is to find an authoritative word on the matter. (Airbnb's popular style guide, for example, does not mention the issue.)

So someone writing a mean function may well be returning null, and you cannot confidently check the result against undefined alone.

Perhaps the lack of consensus is because it's also conventional to write x == null, which thanks to coercion is true even when x is undefined, masking the difference between the two. I find this misleading for anyone who does not understand exactly what is going on, but some Javascript developers use it (including major libraries such as Underscore and jQuery), and it is convenient.
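
For the record, here is what that coercion looks like (x is a placeholder variable):

null == undefined;  // true -- loose equality treats them as interchangeable
null === undefined; // false -- strict equality does not

let x;              // x is undefined
x == null;          // true -- this one check catches undefined as well as null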

In conclusion, undefined vs null is a quirk of the language that must be tamed through communication and documentation.

Can we say anything for certain?

Yes -- we can say that the original distinction between null and undefined was motivated by a distinction between primitive and object values. Although we can invent new semantics for them for our own use, we should be careful that our new semantics are actually consistent with behavior, and we should be clear that the semantics are invented.

It's tempting to take away something like "undefined means uninitialized, null means initialized but not present" (e.g. in this popular StackOverflow answer), and indeed this is true in many cases. But it does not explain why Array.find returns undefined rather than null, so it will confuse you in the long run if you mistake it for a full understanding.

Likewise, if in our programs we decide always to use null or always to use undefined to indicate missing data, we should be clear that this is just a convention we are adopting, and be aware that the builtin APIs will disagree with us at times.

What does this mean about the world?

Javascript was famously prototyped over 10 days in 1995 by Brendan Eich. It's no secret that it has many inconsistencies and strange design decisions. But I find it enlightening to observe that there actually is a consistent distinction between null and undefined -- just not one that is widely understood or relevant today.

We can say now that it would probably have been better to just have undefined or just have null in the language -- but only with the benefit of hindsight and many other examples of similar scripting languages. We could not have done a better job than Brendan Eich in 1995.

Paradigms such as programming languages can spread regardless of merit, and all paradigms look good from the inside. Javascript is now the most popular programming language in the world.

[Minor edits on 2/24/2019 for clarity]