May 6, 2011

Semantics of undefined and null in JavaScript

Recently, I was discussing the difference between undefined and null in JavaScript with a friend. To my surprise, I gave an explanation that he could understand and, more surprisingly, that I could understand too. Sometimes people give an explanation that they don't understand themselves, and it confuses others as well. Before going further, here is the explanation from JavaScript: The Definitive Guide:

You might consider undefined to represent a system-level, unexpected, or error-like absence of value and null to represent program-level, normal, or expected absence of value. If you need to assign one of these values to a variable or property or pass one of these values to a function, null is almost always the right choice.

The explanation is confusing to me. What does this mean in my daily coding? In what scenario must I use one but not the other, and in what scenario can I use either of them? If null is almost always the right choice, why do we have undefined at all? Is there a technical or a semantic reason? Let's see why JavaScript has an undefined value from a technical perspective. When I write the following code, I declare two variables without assigning them any value, and the default value of each is undefined.

var x;
var loooooooooooooooooooooooooong;
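An uninitialized variable is not the only place where the engine hands out undefined on its own. The following is a small sketch of a few such cases; the names unassigned, returnsNothing, echo, and point are made up for illustration.

```javascript
var unassigned;                         // declared, never assigned

function returnsNothing() {}            // has no return statement
function echo(value) { return value; }  // returns its first argument

var point = { x: 1 };                   // has no property named y

// Every entry below is produced by the engine, not written by the programmer.
var producedByEngine = [
  unassigned,        // uninitialized variable
  returnsNothing(),  // missing return value
  echo(),            // missing argument
  point.y            // missing property
];

var allUndefined = producedByEngine.every(function (value) {
  return value === undefined;
});

// null, by contrast, only appears when the program writes it on purpose.
var chosenByProgram = null;

console.log(allUndefined);                  // true
console.log(chosenByProgram === undefined); // false
```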

After reading the article Understanding delete, I understood how JavaScript variables work. When you declare a variable as above, you add a key/value entry to a mysterious object, the VariableObject, which is a dictionary. Here is pseudocode for what the engine effectively does:

VariableObject["x"] = undefined;
VariableObject["loooooooooooooooooooooooooong"] = undefined;
//semantically, it means the following
//VariableObject.add("x", undefined);
//VariableObject.add("loooooooooooooooooooooooooong", undefined);

In the above example, two key/value pairs are added to the VariableObject dictionary. Both the key and the value consume memory, so, at least in this simple model, a longer variable name uses more memory than a shorter one, regardless of its value. This is different from C++, where a variable name is just a nickname for a memory address. So, practically, we should use shorter variable names, or use a minifier to rename our variables.

The VariableObject is special in that it is created by the runtime. If the code runs in global scope, the VariableObject is accessible as window. If it runs in function scope, it is known as the Activation Object and is not accessible at all. Suppose the code runs in global scope; then it is the same as the following.

window["x"] = undefined;
window["loooooooooooooooooooooooooong"] = undefined;
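The claim that a global declaration becomes an entry on the global object can be checked directly. The sketch below is only an illustration under a few assumptions: it uses globalThis (which is window in a browser) and an indirect eval so that the declaration really happens in global scope even if the file itself is a module. The names declaredButUndefined, neverDeclaredAnywhere, and localOnly are hypothetical.

```javascript
// Indirect eval runs its code in the global scope, so this declares
// a real global variable without assigning it a value.
(0, eval)("var declaredButUndefined;");

function declareLocal() {
  var localOnly; // stored in the function's own, inaccessible variable object
  return "localOnly" in globalThis;
}

var results = {
  declaredKeyExists: "declaredButUndefined" in globalThis,    // key is in the dictionary
  declaredValue: globalThis.declaredButUndefined,             // but its value is undefined
  undeclaredKeyExists: "neverDeclaredAnywhere" in globalThis, // no key at all
  localLeaksToGlobal: declareLocal()                          // locals never show up here
};

console.log(results);
```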

In the above case, variable x is said to be declared because its key is in the dictionary, but its value is undefined. If the key is not even in the dictionary, then the variable is undeclared. Technically, there is a difference between "undeclared" and "declared but undefined". But the following undefined checks do not tell the difference.

//suggested by the jQuery Core Style Guidelines
//http://docs.jquery.com/JQuery_Core_Style_Guidelines
//undefined check
//Global Variables: 
typeof variable === "undefined"
//Local Variables: 
variable === undefined
//Properties: 
object.prop === undefined
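To see why these checks cannot tell the two cases apart, here is a small sketch. The names neverDeclaredAnywhere, declaredButUnassigned, and settings are invented for the example. For properties, the in operator is one way to ask whether the key itself exists.

```javascript
var declaredButUnassigned; // declared, its value is undefined

// typeof does not throw for a name that was never declared,
// so both cases produce the same string.
var typeofUndeclared = typeof neverDeclaredAnywhere;
var typeofDeclared = typeof declaredButUnassigned;

// The same ambiguity exists for properties.
var settings = { theme: undefined }; // key present, value undefined

var missingLooksUndefined = settings.fontSize === undefined; // key absent
var presentLooksUndefined = settings.theme === undefined;    // key present

// The in operator looks at the key, not at the value.
var themeKeyExists = "theme" in settings;
var fontSizeKeyExists = "fontSize" in settings;

console.log(typeofUndeclared, typeofDeclared);             // undefined undefined
console.log(missingLooksUndefined, presentLooksUndefined); // true true
console.log(themeKeyExists, fontSizeKeyExists);            // true false
```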

If you really need to know the difference, you need to use try/catch, because accessing an undeclared variable directly throws an exception (a ReferenceError).

function test(variableName) {
  try {
    var temp = eval(variableName);
    if (temp === undefined) {
      return "\"" + variableName + "\" is declared, its value is undefined";
    } else {
      return "\"" + variableName + "\" is declared, its value is not undefined";
    }
  } catch (e) {
    return "\"" + variableName + "\" is undeclared";
  }
}

var y;
var z = null;

alert(test("x")); // "x" is undeclared
alert(test("y")); // "y" is declared, its value is undefined
alert(test("z")); // "z" is declared, its value is not undefined

Most of the time, we don't care about the difference between undeclared and undefined. Practically, we can treat them the same: if a value is undefined, we act as if it does not exist in the dictionary, although that is not quite true. If we accept this, we can use the undefined checks that the jQuery style guidelines recommend.

Back to the question: why do we need undefined? Because a variable is a key/value entry in a dictionary, and because we can add entries to that dictionary at runtime. If an entry's value is undefined, then practically it does not exist in the dictionary. null simply cannot express this, because if the value is null, the entry is already in the dictionary. Now that we know the technical difference, how can we apply it in our coding? An undefined check is normally used before defining something. Here is a sample.

//if somebody has already defined it, regardless of its value,
//don't define it again.
if (console.log === undefined) {
  console.log = function () { /* ... */ };
}


function css(key, value) {
  //if the user does not give a value,
  //he wants to get the value
  if (value === undefined) {
    return db[key];
  } else {
    //otherwise the user wants to set the value
    db[key] = value;
  }
}
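Here is a usage sketch of the getter/setter above, assuming db is a plain object (the post does not say what it is). It also includes a hypothetical variant, cssStrict, that counts arguments instead of comparing against undefined. That is one way to tell css("color") apart from css("color", undefined), in case the distinction ever matters.

```javascript
var db = {}; // assumed in-memory store, standing in for the post's db

// Same logic as the function above: an undefined value means "get".
function css(key, value) {
  if (value === undefined) {
    return db[key];
  }
  db[key] = value;
}

// Hypothetical variant: a call with two arguments is always a "set",
// even when the second argument is undefined.
function cssStrict(key, value) {
  if (arguments.length < 2) {
    return db[key];
  }
  db[key] = value;
}

css("color", "red");
var firstRead = css("color");                 // "red"

var afterLooseCall = css("color", undefined); // treated as a get, returns "red"

cssStrict("color", undefined);                // treated as a set
var afterStrictCall = css("color");           // undefined, but the key still exists

css("width", null);                           // null is an ordinary value to store
var storedNull = css("width");                // null
```

Note that null passes through css as a normal value, which fits the idea that null is something the program chooses, while undefined signals that nothing was given.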

But what about null in JavaScript? We know how it is technically different from undefined, but what is its semantics? The short answer is: it depends. It is up to you how to interpret it, and only you can define it in your application. In my matrix library, I use null to represent the case where a resource has no dependencies, and undefined to represent the case where the dependencies are not yet known. The two meanings are quite different. But null can mean something else in your application if you want.

if (dependencies["x.js"] === undefined) {
  //go figure out what the dependencies are and come back later

} else if (dependencies["x.js"] === null) {
  //there are no dependencies, load it directly.

} else {
  //load dependencies["x.js"] first, because it is not empty.
}
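The three-way check above can be turned into a small runnable sketch. The table contents and the nextStep function are made up for illustration and are not taken from the matrix library; the sketch only assumes that known dependencies are stored as an array of file names.

```javascript
// Hypothetical dependency table.
var dependencies = {
  "a.js": null,      // known: a.js has no dependencies
  "b.js": ["a.js"]   // known: b.js depends on a.js
  // "x.js" has no entry, so its dependencies are not known yet
};

// Returns a description of what a loader would do next for the given file.
function nextStep(name) {
  var deps = dependencies[name];
  if (deps === undefined) {
    return "resolve dependencies of " + name;
  } else if (deps === null) {
    return "load " + name + " directly";
  } else {
    return "load " + deps.join(", ") + " before " + name;
  }
}

console.log(nextStep("x.js")); // resolve dependencies of x.js
console.log(nextStep("a.js")); // load a.js directly
console.log(nextStep("b.js")); // load a.js before b.js
```

The strict === comparison matters here. A loose check such as deps == null is true for both undefined and null, so it would merge the first two branches and lose the distinction.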