The True Story of JSON

HRejterzy once blamed JSON for a project problem in one of their videos. If we assume that JSON is responsible for so many errors in IT projects, maybe it is worth getting to know it better. Meet the true story of JSON.

Publication date: 10 February 2022

Level: regular

Mateusz Jabłoński

Frontend Developer with a passion. Husband in love. Proud father. Gamer by choice.

From this article you will learn:

What JSON is and how to pronounce its name correctly
What serialization and marshalling are about
What Cartoon Network has to do with the creation of JSON
What the advantages and disadvantages of the JSON format are

Who is the best-known person in the IT industry? Obviously - JSON, pronounced "Jason". Have you ever wondered where he came from? Or maybe it is the same character as Jason Voorhees? It would be rather unpleasant if that were actually the case.

Fortunately, JSON is not a person. JSON is a text data format used to transfer content. The main idea behind this format was that it had to be lightweight and independent of any programming language. These two assumptions are already written in the introduction to the document that laid the groundwork for its creation. But let us start from the beginning...

Before JSON Appeared

Computer science is the study of information: how it is transferred, retrieved, processed, and stored. Notice that this definition does not mention programming at all. What is more, you could even say that this description of computer science is closer to data analysis than to programming.

Taking that into account, computer science is meant to solve problems related to information, and one of those problems is sending it from one place to another. So how should we send data? We have to agree that sending complex structures such as objects or arrays directly would be difficult, inconvenient, and not very safe. On top of that, object structures may differ between programming languages, and that can be problematic, because different languages do not necessarily understand each other. So processes were invented to make this easier.

Serialization and Marshalling

Both processes are used to transform data into simpler structures, most often streams of bytes. In serialization, only the data from an object is written to the output form. Marshalling, however, does a bit more. In addition to serializing the data from the object, it prepares information about the implementation of that object. In practice, this means that if any program wants to use the received data, it will have to prepare the environment properly by downloading the code and loading it. Serialization is part of marshalling.
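To make the serialization half of this concrete, here is a minimal Python sketch. The `MenuItem` class and its fields are hypothetical and exist only for illustration. The point it tries to show is that only the object's data ends up in the output text, so the receiver gets back a plain dictionary rather than a `MenuItem`.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class MenuItem:
    """Hypothetical class used only for this illustration."""
    name: str
    calories: int

    def is_light(self) -> bool:
        # Behaviour lives in the class, not in the serialized data.
        return self.calories < 700


item = MenuItem(name="French Toast", calories=600)

# Serialization: only the field values are written to the output text.
payload = json.dumps(asdict(item))

# Deserialization: the receiver gets the data back as a plain dict.
# The is_light() method did not travel with it.
restored = json.loads(payload)
```

A marshalling mechanism would additionally have to carry or reference the `MenuItem` implementation, so that the receiver could rebuild a real object instead of a dictionary.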

Since we already know how we can simplify our data, we still need to choose its format. Sending binary data is, of course, the lightest option, and this is how it was originally organized, but it is also the least readable one. Let us be honest - which of us can quickly analyze binary code?

In addition, different programming languages developed their own binary serialization formats, which made it harder to read data across different programming languages. On top of that, in the 1990s routers blocked data that looked suspicious. Well, byte streams did look suspicious, and in the end this led to a situation where only text communication was allowed.

In 1996, work began on a "new" markup language whose purpose was to send serialized data. This language was called XML, or Extensible Markup Language. It is worth adding here that XML did not really appear out of nowhere. It was based on SGML, the Standard Generalized Markup Language, which had been released back in the 1980s.

The XML standard was published in 1998 by the W3C. Bang. What more could we want? XML became the standard and everything would have been fine, if not for the fact that XML is rather heavy. Take a look at the menu example below:

xml

<breakfast_menu>
	<food>
		<name>Belgian Waffles</name>
		<price>$5.95</price>
		<description>Two of our famous Belgian Waffles with plenty of real maple syrup
		</description>
		<calories>650</calories>
	</food>
	<food>
		<name>Strawberry Belgian Waffles</name>
		<price>$7.95</price>
		<description>Light Belgian waffles covered with strawberries and whipped cream
		</description>
		<calories>900</calories>
	</food>
	<food>
		<name>Berry-Berry Belgian Waffles</name>
		<price>$8.95</price>
		<description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream
		</description>
		<calories>900</calories>
	</food>
	<food>
		<name>French Toast</name>
		<price>$4.50</price>
		<description>Thick slices made from our homemade sourdough bread
		</description>
		<calories>600</calories>
	</food>
	<food>
		<name>Homestyle Breakfast</name>
		<price>$6.95</price>
		<description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns
		</description>
		<calories>950</calories>
	</food>
</breakfast_menu>

There is quite a lot going on here, right? People started complaining about XML. The wave of criticism focused mostly on its verbosity, complexity, and lack of readability.

Douglas Crockford and Cartoon Network

Douglas Crockford is a programmer who began his career at Atari in the early 1980s. He is considered one of the gurus of JS, although on his website he says that the title "Mahatma of JS" suits him much better. He is known as the author of publications such as "JavaScript: The Good Parts" and "How JavaScript Works".

In the early 2000s, Douglas Crockford proposed a new data format based on JavaScript, or more precisely on the ECMA-262 3rd Edition standard from December 1999, and called it JavaScript Object Notation, or JSON. By design, it was meant to be a lightweight, text-based format independent of any programming language. He based the project on conventions known from the C family of languages.

The precursor of the new format was a browser plugin prepared for Cartoon Network for a children's browser game. This plugin was essentially responsible for text communication inside the game itself. It was supposed to be a lighter alternative to Flash, which was very popular at the time. Still, the development of this idea was strongly influenced by the appearance in March 1999 of XMLHttpRequest, the browser mechanism that made asynchronous web applications possible and only later, in 2005, got the name AJAX.

And that is how, in April 2001, Douglas Crockford and Chip Morningstar sent the first JSON out into the world.

Standardization

As I mentioned above, Douglas Crockford based his idea on the latest available ECMAScript language standard: ECMA-262 3rd Edition. He described his format in RFC 4627, an informational document published in 2006. However, as JSON grew in popularity, a single standard had to be created to enforce the correct way of using this new approach.

This happened as part of the ECMA-404 standard from 2013, which was updated in 2017.

Is JSON Really That Lightweight?

The main advantage of JSON was its lightness compared with XML, which dominated communication at the time. Let us compare the restaurant menu from the first example, written in XML, with a JSON version containing the same data:

json

{
	"breakfast_menu": [
		{
			"name": "Belgian Waffles",
			"price": "$5.95",
			"description": "Two of our famous Belgian Waffles with plenty of real maple syrup",
			"calories": 650
		},
		{
			"name": "Strawberry Belgian Waffles",
			"price": "$7.95",
			"description": "Light Belgian waffles covered with strawberries and whipped cream",
			"calories": 900
		},
		{
			"name": "Berry-Berry Belgian Waffles",
			"price": "$8.95",
			"description": "Light Belgian waffles covered with an assortment of fresh berries and whipped cream",
			"calories": 900
		},
		{
			"name": "French Toast",
			"price": "$4.50",
			"description": "Thick slices made from our homemade sourdough bread",
			"calories": 600
		},
		{
			"name": "Homestyle Breakfast",
			"price": "$6.95",
			"description": "Two eggs, bacon or sausage, toast, and our ever-popular hash browns",
			"calories": 950
		}
	]
}

What we can notice immediately is that JSON syntax resembles JavaScript objects. At the time JSON was created, this was genuinely something completely new, and it is worth emphasizing that requests became lighter. The example above is about 20% lighter than its XML equivalent.
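If you want to check the size difference yourself, the sketch below is one way to do it in Python. It takes a single item from the menu, writes it as JSON and as XML without any indentation, and compares the byte counts. The variable names are my own, and the exact percentage depends on whitespace and on the data, so it does not have to match the figure given above.

```python
import json
import xml.etree.ElementTree as ET

# One item from the menu above, as plain data.
food = {
    "name": "French Toast",
    "price": "$4.50",
    "description": "Thick slices made from our homemade sourdough bread",
    "calories": 600,
}

# JSON version, written without extra whitespace.
json_text = json.dumps({"breakfast_menu": [food]}, separators=(",", ":"))

# XML version of the same data, also without indentation.
root = ET.Element("breakfast_menu")
food_el = ET.SubElement(root, "food")
for key, value in food.items():
    ET.SubElement(food_el, key).text = str(value)
xml_text = ET.tostring(root, encoding="unicode")

json_size = len(json_text.encode("utf-8"))
xml_size = len(xml_text.encode("utf-8"))
saving = 1 - json_size / xml_size

print(f"JSON: {json_size} bytes, XML: {xml_size} bytes, saving: {saving:.0%}")
```

The gap comes mostly from the closing tags: XML repeats every element name twice, while JSON writes each key once.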

Today, however, people often say that JSON has already become too heavy. More and more alternatives based on binary serialization are appearing, such as Ion, BSON, Smile, CBOR, MessagePack, Avro, or FlatBuffers.

Nobody Is Perfect. JSON Is Not Either.

One of the disadvantages pointed out when we talk about the JSON format is the fact that, in reality, we only have objects, arrays, and primitive types at our disposal. More complex structures have to be prepared properly - transformed or mapped - before they can appear in JSON format.
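Here is a small Python sketch of what that mapping can look like. The `to_json_compatible` helper and the `order` data are hypothetical; the only real API used is the `default` parameter of `json.dumps`, which is called for every value the encoder cannot handle on its own.

```python
import json
from datetime import date


def to_json_compatible(value):
    """Hypothetical helper: map unsupported types to JSON-friendly ones."""
    if isinstance(value, date):
        return value.isoformat()   # date -> string such as "2022-02-10"
    if isinstance(value, set):
        return sorted(value)       # set -> sorted list
    raise TypeError(f"Cannot serialize {type(value).__name__}")


order = {
    "placed_on": date(2022, 2, 10),
    "items": {"French Toast", "Belgian Waffles"},
}

# json.dumps calls `default` for every value it cannot serialize itself.
text = json.dumps(order, default=to_json_compatible)

# The receiver sees only strings and lists and must convert them back.
restored = json.loads(text)
```

Notice that the mapping is one-way: after parsing, the date is just a string and the set is just a list, so both sides have to agree on how to interpret them.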

In addition, JSON is heavier than binary formats, comments cannot be used, and it is still less common than SGML-based formats such as XML.
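The lack of comments is easy to see in practice. The sketch below feeds a document with a JavaScript-style comment to Python's standard `json` parser, which rejects it. The `_comment` key in the second part is only a common convention that I am assuming here for illustration; it is not part of the JSON format.

```python
import json

with_comment = """
{
    // price in US dollars
    "price": "$4.50"
}
"""

try:
    json.loads(with_comment)
    accepted = True
except json.JSONDecodeError:
    # A strict parser treats the comment as a syntax error.
    accepted = False

# Common workaround (a convention, not part of JSON): put the note in the data.
with_note = '{"_comment": "price in US dollars", "price": "$4.50"}'
parsed = json.loads(with_note)
```

The workaround has a cost: the note becomes regular data, so every consumer of the file receives it and has to ignore it.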

Summary

In 2017, the updated ECMA-404 standard stated that the name JSON should be pronounced the same way as the name of the hero from the myth "Jason and the Argonauts". A few years earlier, Douglas Crockford had already said that he personally preferred that pronunciation, although it did not matter much to him.

And that is how we have reached the end of JSON's story. The question is whether this is really the end, since the format is so widely used. After all, we encounter it in REST requests, in databases, in configuration files for various applications, and in many other places.