hckr.fyi // thoughts

XML vs. JSON: Why JSON Sucks

by Michael Szul

As much as XML was a huge buzzword in the early 2000s, JSON has become the buzzword of the last few years. Much of this is a result of the recent proliferation of JavaScript frameworks. After AJAX surfaced as an excellent technique for fetching data asynchronously from the server, many programmers began to look toward client-side programming as a way to make web applications more dynamic, with better eye candy. Ruby on Rails, and the guys over at 37signals (now Basecamp), deeply integrated Scriptaculous into their framework at the time, which gave programmers a taste of what was possible. Meanwhile, John Resig began putting together the pieces of the puzzle that eventually became jQuery (read Resig's Pro JavaScript Techniques for great insights into JavaScript framework building, and to see many of the techniques used in jQuery). Microsoft also began promoting advanced JavaScript techniques with its AJAX toolkit.

Why so many frameworks, and why the need? JavaScript, as a client-side programming language, requires execution within the browser, and we all know how well browsers adhere to standards. Microsoft, in particular, went wild with its JavaScript and CSS implementations, which made programming on the client side difficult. A lot of JavaScript functionality had to be tested against various browsers, and different objects had to be loaded to match the browser being used. Microsoft was the main culprit, relying too heavily on ActiveX controls inside Internet Explorer. JavaScript frameworks made this easier, and started to take the sting out of the client-side focus.

Frameworks like jQuery might have made things a little too easy. Advanced JavaScript controls made possible by jQuery plugins, or some of the more desktop-like widgets found in Ext JS at the time, provided a front-end look-and-feel that web designers had been craving for user interface design. Plugins and ready-made implementation scripts made it easy for non-programmers to hack together code that looked good from the user's end, even if the code itself lacked optimization, clarity, or security. We often use the term script-kiddies for this group of people--a term originally coined for graphic designers, web designers, or hobbyists who attempted to be programmers by downloading and loosely tying together PHP scripts, then selling the end result to clients as a completed solution. (Script-kiddies who work primarily with WordPress, Drupal, etc. are often called plugin-poppers, but that's another story.)

With JavaScript, script-kiddies most often refers to those who download JavaScript snippets from online repositories, or jQuery plugins, in order to produce a certain effect, be it an accordion, a cover flow, or movement on particular elements. Since these items are usually implemented on the front end, many web designers have gotten their hands dirty attempting to solve client or company problems this way.

JavaScript was always mainly a front-end technology. Although it was capable of running server-side, there was never an adequate implementation until Node.js came onto the scene. By embedding a JavaScript engine on the server, Node.js allowed programmers to run JavaScript on both ends of the server trip. This meant that the same designers who were able to tinker with front-end programming could now tinker with the server side as well, without learning something like Python, C#, or Java.

As a result of the increasing importance of JavaScript on the client side and Node.js on the server side, as well as more designers getting involved in programming work, JavaScript has been one of the fastest-growing languages in recent years--and with it has grown the data format known as JSON.

JSON stands for JavaScript Object Notation, and was first formalized by Douglas Crockford. The name itself was devised at Crockford's company, State Software, Inc. JSON is claimed to be a language-independent data format, but it is mostly used with JavaScript (its namesake), and was created based on non-strict JavaScript standards. In fact, JSON is optimized for JavaScript and, once parsed, works natively within the language (as actual JavaScript objects and arrays), while its "language independence" in other programming languages is often awkward at best.

To be fair, Python probably offers the best comparison, as its data structures, combined with the json library, map relatively closely to JavaScript:

import json

# "guid" is supplied by the surrounding application code.
data = json.dumps({
    'query': {
        'query_string': {
            'default_field': '_all',
            'query': 'contributors_url{} AND (category:project OR category:component OR category:registration)'.format(guid),
            'analyze_wildcard': True,
            'lenient': True
        }
    }
})

The above is a dump of code used to create data sent to the Elasticsearch engine. You can see that the dumps() method takes a Python dictionary that is essentially a JSON structure already. One consideration with this snippet is that the Elasticsearch query syntax itself contains curly-brace tokens, which could hurt the readability of more complex data, since those tokens resemble the brackets used in JSON data structures--but that's not a JSON problem per se.
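
To make the similarity concrete, here is a minimal, self-contained sketch of that round trip, with the query body trimmed down (the guid and the Elasticsearch-specific query string are left out):

```python
import json

# A Python dict that mirrors the JSON structure sent to Elasticsearch.
query = {
    "query": {
        "query_string": {
            "default_field": "_all",
            "analyze_wildcard": True,
            "lenient": True
        }
    }
}

text = json.dumps(query)     # serialize the dict to a JSON string
restored = json.loads(text)  # parse it straight back into a dict

# Python's dicts, lists, strings, and booleans survive the trip unchanged.
print(restored == query)
```

The dict literal and the JSON text are nearly character-for-character identical, which is exactly the point being made about Python's fit with the format.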

Other languages aren't as lucky when it comes to JSON similarity. Most C# programmers, for example, rely on Newtonsoft's Json.NET for JSON serialization (it's even the adopted library in the ASP.NET Web API framework). This makes it easy to take a model object and serialize it to JSON, and, if properly structured (thought out ahead of time), a JSON object sent into a REST call can be deserialized too. Things start to break down, however, when more complex data structures are necessary, or when the incoming JSON might not be fully known. In those cases, you most often find yourself falling back on the dynamic keyword, which isn't ideal in a traditionally statically typed language.

For example, look at the way this code needs to be parsed:

// Requires: using System; using Newtonsoft.Json.Linq;
// "data" is presumably a JObject parsed from the incoming JSON.
JArray arr = null;
try
{
    arr = (JArray)data["data"];
    foreach (dynamic item in arr)
    {
        using (DefaultContext db = new DefaultContext())
        {
            Exam_X_Question.Create(
                db,
                (long)item.ExamID,
                (long)item.QuestionID,
                (int)item.MajorVersion,
                (byte)item.MinorVersion,
                DateTime.Now,
                DateTime.Now,
                (string)item.ComputingID,
                (bool)item.Active
            );
        }
    }
}
finally
{
    arr = null;
}

In the above code, Newtonsoft's library is used to parse an item named "data" from the data object into a JArray. We then have to loop over the items in that array, using the dynamic keyword because those items are not explicitly typed and we need to access properties off of them.

Despite these difficulties in some languages, with the recent popularization of JavaScript through AJAX, jQuery, and Node.js, as well as web designers getting involved with scripts, JSON has become the data format of choice for many of these programmers and designers. Since JSON can be worked with as native objects (once parsed with a JSON parser, or with eval(), which should be avoided for security reasons), it is easier to work with inside the JavaScript language.

It is also believed that JSON data is more compact than XML, and thus results in smaller payloads going back and forth between the client and server--although this is debatable when XML tag architecture and compression are taken into account.
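
As a rough illustration (a single hypothetical contact record, no compression), the raw byte counts are closer than the conventional wisdom suggests:

```python
import json
import xml.etree.ElementTree as ET

record = {"prefix": "Mr.", "firstname": "Michael", "lastname": "Szul"}

json_text = json.dumps(record, separators=(",", ":"))

# Build the equivalent XML element tree.
contact = ET.Element("contact")
for key, value in record.items():
    ET.SubElement(contact, key).text = value
xml_text = ET.tostring(contact, encoding="unicode")

# XML carries each tag name twice, so it is larger here--but repeated
# tags are exactly the kind of redundancy gzip compresses best.
print(len(json_text), len(xml_text))
```

JSON wins on raw bytes in this sketch, but once the payload travels over a gzip-compressed HTTP connection, the repeated tag names cost far less than the uncompressed counts imply.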

Even without compression, it is a mistake to assume that JSON is the better, shorter, more readable format. Author Darrel Miller has a GitHub Gist that does an excellent comparison of C# project files in both formats. JSON is about 2% smaller in bytes, but the XML version is about 38% shorter in length (lines of code), while being more readable. He ends his description of the Gist on a similar note:

I am NOT trying to make the case that XML is a better format for this use-case than JSON. I am pointing out that there are pros and cons to both formats and I believe it is unreasonable to automatically assume that JSON is smaller, more readable, or more editable.

For the sake of this chapter, we'll assume at face value that JSON is smaller than XML, so that we can focus on data structure and data interchange.

Although you wouldn't know it from the title of this chapter, I think that JSON is an excellent data format… for JavaScript. It's certainly a format that should always be an option when connecting with web services, as many web services are consumed via AJAX and manipulated through JavaScript. The object-oriented nature of JSON data, once parsed into JavaScript, also makes it a great way to traverse data. I do not, however, believe that JSON is the best data format for decoupled programs and/or services.

Although JSON is considered a language-independent data format, there is no denying that its basis lies in JavaScript, and it is with JavaScript that JSON has the most power. If your data format is more optimized for, and functions better in, one language than any other, then it might not be all that language-independent.

Remember that the purpose of SGML was the generic coding of data, initially for formatting purposes. XML was devised as a simpler solution that retained the generic coding and formatting purpose of SGML, and added general communication and semantic information for consumption. SGML, and subsequently XML, were meant as markup that was both human- and machine-readable. Although JSON is considered human-readable, it certainly isn't as clean as XML, and larger data sets become even more difficult to decipher, especially for non-programmers. In fact, XML has many advantages over JSON when it comes to describing or marking up data. JSON, despite its different formats for data types, is essentially linear in its descriptive ability; XML, on the other hand, has a layered feel to it.

For example, let's look at some contact records. In JSON, we can create a contact record that looks like this:

{ "contacts" : [
    {
        "prefix" : "Mr.",
        "firstname" : "Michael",
        "lastname" : "Szul"
    }
] }

In XML, it would look like this:

<contacts>
    <contact>
        <prefix>Mr.</prefix>
        <firstname>Michael</firstname>
        <lastname>Szul</lastname>
    </contact>
</contacts>

Right away, despite the additional size of the text, you can see how much more readable the XML would be to a non-technical person.

In fact, without any prior knowledge of programming or markup, a person could easily discern the contents and structure of such an XML file and make edits, while the varied brackets in the JSON could cause some confusion--maybe not with this simplistic example, but imagine a structure of 50 contacts and much more data. In XML, the most common mistake for a layman would be to miss the trailing slash on a self-closing tag; with JSON, common errors include missing commas and mismatched brackets, both of which are far easier to miss at a glance than any XML issue.
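
A quick sketch of how each format's parser reacts to its typical mistake, using Python's standard library parsers and hypothetical one-line samples:

```python
import json
import xml.etree.ElementTree as ET

# JSON with the comma missing between "prefix" and "firstname".
bad_json = '{ "prefix": "Mr." "firstname": "Michael" }'
json_failed = False
try:
    json.loads(bad_json)
except json.JSONDecodeError as error:
    json_failed = True
    print("JSON parse error:", error)

# XML with the slash missing from the closing </prefix> tag.
bad_xml = "<contact><prefix>Mr.<prefix></contact>"
xml_failed = False
try:
    ET.fromstring(bad_xml)
except ET.ParseError as error:
    xml_failed = True
    print("XML parse error:", error)
```

Both parsers reject their broken input, but notice that the XML error names the mismatched tag, while spotting the missing JSON comma by eye in a large document is considerably harder.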

XML allows you to additionally describe data by including attributes:

<contacts>
    <contact type="personal">
        <prefix type="common">Mr.</prefix>
        <firstname>Michael</firstname>
        <lastname>Szul</lastname>
    </contact>
</contacts>

Notice the type attributes. These attributes are secondary to the core content of the contact. They can be seen as an added layer of description or context for the markup--a better fleshing out of the data. How would this look in JSON? If all we were talking about was a contact type, it might look like this:

{ "contacts" : [
    {
        "type" : "personal",
        "prefix" : "Mr.",
        "firstname" : "Michael",
        "lastname" : "Szul"
    }
] }

We run into a problem with multiple "type" attributes, however. How do we handle both? One example would be to spell out the types:

{ "contacts" : [
    {
        "contacttype" : "personal",
        "prefixtype" : "common",
        "prefix" : "Mr.",
        "firstname" : "Michael",
        "lastname" : "Szul"
    }
] }

This flattens the data, which might not be the desired effect--plus it adds cumbersome text in order to spell out the two types. Another way to format it would be to turn the prefix into its own object:

{ "contacts" : [
    {
        "type" : "personal",
        "prefix" : {
            "type" : "common",
            "name" : "Mr."
        },
        "firstname" : "Michael",
        "lastname" : "Szul"
    }
] }

This also presents a problem because the application or person consuming the data has to access the prefix value in a different way than the rest of the core data. For example, accessing the firstname value would be:

contacts[0].firstname
    

Prefix, on the other hand, would require further steps:

contacts[0].prefix.name
    

This seems like a trivial complaint, but imagine more complex data. In order to compensate for attributes at the same level with the same name, JSON requires either flattening the data by changing the name or altering the structure to create more in-depth objects. The first way feels like a kludge, while the second way lacks consistency.

Recently, some programmers have taken to specifying attributes as a specific object off the core data:

{ "contacts" : [
    {
        "@attributes" : {
            "type" : "personal"
        },
        "prefix" : {
            "@attributes" : {
                "type" : "common"
            },
            "name" : "Mr."
        },
        "firstname" : "Michael",
        "lastname" : "Szul"
    }
] }

Although this allows for cleaner JSON data and prevents the need to flatten it, it still doesn't fix the consistency issue.
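
The convention can be produced mechanically. Here is a minimal, admittedly lossy sketch (it keeps only the last of any repeated sibling tags) that converts an XML element into a dict using "@attributes", plus a "#text" key for element text:

```python
import xml.etree.ElementTree as ET

def element_to_dict(element):
    """Convert an element to a dict using the "@attributes" convention."""
    result = {}
    if element.attrib:
        result["@attributes"] = dict(element.attrib)
    for child in element:
        result[child.tag] = element_to_dict(child)  # last sibling wins
    text = (element.text or "").strip()
    if text:
        result["#text"] = text  # "#text" is a common, ad hoc companion key
    return result

xml = '<contact type="personal"><prefix type="common">Mr.</prefix></contact>'
converted = element_to_dict(ET.fromstring(xml))
print(converted)
```

The very need for invented keys like "@attributes" and "#text"--which no JSON standard defines--underlines the point: the JSON shape is a convention layered on top, not a native feature of the format.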

Next let's consider structure. XML has nodes and attributes, while JSON has objects, arrays, strings, numbers, and booleans, represented by key/value pairings. Let's look at an example where the expectation is a student's exam results, with questions and the selected answer:

{ "exam" : [
    {
        "question" : "What is the capital of Canada?",
        "answer" : "Ottawa",
        "correct" : true
    }
] }

In XML, it could look like this:

<exam>
    <question>
        <stem>What is the capital of Canada?</stem>
        <answer correct="true">Ottawa</answer>
    </question>
</exam>

What if the original requirements strictly said that only one answer was to be allowed per question, but those requirements later changed in the middle of a semester to allow for multiple answers? In JSON, you would have to modify it like this:

{ "exam" : [
    {
        "question" : "What is the capital of Canada?",
        "answers" : [
            { "answer" : "Ottawa", "correct" : true },
            { "answer" : "Calgary", "correct" : false }
        ]
    }
] }

The first thing to notice is that we changed how the data needs to be accessed. No longer is it a simple object path like exam[0].answer; instead, it is an array of answers. Any program accessing the answer will need to change the way it paths to the data. In XML, we can simply add a new node:

<exam>
    <question>
        <stem>What is the capital of Canada?</stem>
        <answer correct="true">Ottawa</answer>
        <answer correct="false">Calgary</answer>
    </question>
</exam>

Arguments can be made that these are nit-picking issues that expose a lack of foresight in programmers rather than any issue with the underlying data format; however, we are also using simplistic examples. Whether in XML or JSON, most programmers architecting data will do so for complex data structures, but the most optimized structure might not always be readily apparent or easily discernible, and may instead require some massaging as specifications change or usage dictates. In such situations, JSON has a tendency to change structure because of its differing data types, which forces any program or programmer consuming the data to change their end of the equation as well. XML, however, just has nodes and attributes, and as a result won't require such client changes unless you add additional structure.
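
The breakage is easy to demonstrate. A sketch with the exam data above, parsed with Python's json module for illustration:

```python
import json

old = json.loads('''{ "exam": [ { "question": "What is the capital of Canada?",
                                  "answer": "Ottawa", "correct": true } ] }''')
new = json.loads('''{ "exam": [ { "question": "What is the capital of Canada?",
                                  "answers": [ { "answer": "Ottawa",  "correct": true },
                                               { "answer": "Calgary", "correct": false } ] } ] }''')

print(old["exam"][0]["answer"])  # the original access path

try:
    new["exam"][0]["answer"]     # the same path against the new structure
except KeyError:
    print("old path is gone; every consumer must be rewritten")

print(new["exam"][0]["answers"][0]["answer"])  # the replacement path
```

The XPath /exam/question/answer, by contrast, matches both the one-answer and two-answer XML documents; the XML consumer merely receives more nodes.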

Programmatically processing data is far easier and more powerful in XML than in JSON. Some will argue that this is the result of more libraries targeting XML than JSON, but XML was founded, along with technologies like XPath, XQuery, and XSLT, on the basis of data processing, retrieval, and transformation. JSON was created (or discovered) to be a streamlined data format. If all you need is basic data transference and processing, JSON will probably work for you, but if you have complex data that needs advanced retrieval and processing, XML is really what you're looking for.

Let’s take one of the examples from earlier, but slightly modified:

<contacts>
    <contact type="personal">
        <prefix type="common">Mr.</prefix>
        <firstname>Michael</firstname>
        <lastname>Szul</lastname>
    </contact>
    <contact type="work">
        <prefix type="common">Mr.</prefix>
        <firstname>Daniel</firstname>
        <lastname>Barbella</lastname>
    </contact>
</contacts>

Using XPath, we can select all work contacts by using the string:

/contacts/contact[@type='work'] 
    

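Python's standard library supports this kind of selection directly. A sketch using xml.etree.ElementTree (note that findall() paths are relative to the root element, so the leading /contacts is dropped):

```python
import xml.etree.ElementTree as ET

xml = """<contacts>
    <contact type="personal"><firstname>Michael</firstname><lastname>Szul</lastname></contact>
    <contact type="work"><firstname>Daniel</firstname><lastname>Barbella</lastname></contact>
</contacts>"""

root = ET.fromstring(xml)
# findall() accepts a limited XPath subset, including attribute predicates.
work = root.findall("contact[@type='work']")
print([contact.find("firstname").text for contact in work])
```

One declarative expression replaces the filtering loop entirely; full XPath engines (lxml, XSLT processors, most DOM implementations) support far richer predicates still.
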
Our JSON structure would look like the following:

{ "contacts" : [
    {
        "type" : "personal",
        "prefix" : "Mr.",
        "firstname" : "Michael",
        "lastname" : "Szul"
    },
    {
        "type" : "work",
        "prefix" : "Mr.",
        "firstname" : "Daniel",
        "lastname" : "Barbella"
    }
] }

How would you select only the contacts where the type is "work"? You could always loop through it:

for(var i = 0; i < contacts.length; i++) {
    if(contacts[i].type === "work") {
        //Here is your work contact.
    }
}

If you need to access the work contacts throughout a particular routine, you could always push it onto an array:

var work = [];
for(var i = 0; i < contacts.length; i++) {
    if(contacts[i].type === "work") {
        work.push(contacts[i]);
    }
}

Now you have a variable named work that contains all of your work contacts, but you'll still have to loop through them to process them. This means, at its most basic level, you must loop through the data twice if you plan to process work contacts separately, or process them inside the "if" statement to avoid the double loop. In JSON, there is no way to select objects like this:

contacts[type="work"]
    

This statement violates the way JavaScript handles array access, which uses integer indexes, and object access, which uses property names within the square brackets.
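
In Python the same filtering reads more compactly as a list comprehension, but the underlying point stands: it is still imperative code walking the data, not a declarative query over the document:

```python
contacts = [
    {"type": "personal", "prefix": "Mr.", "firstname": "Michael", "lastname": "Szul"},
    {"type": "work", "prefix": "Mr.", "firstname": "Daniel", "lastname": "Barbella"},
]

# Still a loop under the hood--every element is visited and tested.
work = [contact for contact in contacts if contact["type"] == "work"]
print(work)
```

The comprehension hides the loop syntactically, yet every element is visited; an XPath engine, by contrast, is free to use indexes or streaming evaluation behind the same one-line expression.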

XPath selection also offers greater degrees of document tree traversal. For example, consider one of our previous XML node lists, only now embedded in a larger structure:

<company>
    <name>Barbella Digital, Inc.</name>
    <addresses>
        ...data removed...
    </addresses>
    <contacts>
        <contact type="corporate">
            <prefix type="common">Mr.</prefix>
            <firstname>Michael</firstname>
            <lastname>Szul</lastname>
        </contact>
        <contact type="skunkworks">
            <prefix type="common">Mr.</prefix>
            <firstname>Daniel</firstname>
            <lastname>Barbella</lastname>
        </contact>
    </contacts>
</company>

Let's assume that you're inside of a node list looping through contact names of the corporate type, but you discover that you need to go back and get the company name. With XML DOM objects, you can use XPath axes like "ancestor" to travel back up the tree and select and process parent data. With JSON--because its basis is in JavaScript (and certainly if one is using JavaScript, as is most common today)--you cannot travel back up the data structure: a parsed object carries no reference to its parent, and because multiple variables can point to the same object, a unique parent isn't even well defined.
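
With a DOM parser, the upward walk is built in. A stdlib sketch using Python's xml.dom.minidom (parsed JSON, by contrast, hands you plain objects with no parentNode equivalent):

```python
from xml.dom.minidom import parseString

xml = ("<company><name>Barbella Digital, Inc.</name>"
       "<contacts><contact type=\"corporate\">"
       "<firstname>Michael</firstname></contact></contacts></company>")

document = parseString(xml)
contact = document.getElementsByTagName("contact")[0]

# Two hops up the tree: contact -> contacts -> company.
company = contact.parentNode.parentNode
name = company.getElementsByTagName("name")[0].firstChild.data
print(name)
```

Every DOM node knows its parent, so "go back and get the company name" is a property access; with a parsed JSON tree, you would have to thread the parent through your own code.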

This isn't to say that advanced selectors aren't available in JavaScript for processing JSON through regular expressions or other text-processing means. In fact, there are several JavaScript libraries out there today specifically built to provide XPath-like selection over JSON objects. Although these libraries certainly work, once again we see the limitations of JSON causing programmers to develop libraries that duplicate tools and functionality already present and available for XML. If you find yourself creating or using functionality for JSON that duplicates functionality already available to XML, then why not just use XML to begin with? At this point, not doing so seems like trying to force JSON into an area it wasn't necessarily designed for--the square-peg/round-hole argument.

Furthermore, now you're adding additional external JavaScript libraries to the files a web application must download. Most who promote JSON mention that its data size is smaller than XML's, but if you're processing it with advanced libraries, you're adding size to the web application, even if those files get cached after the first run. Between bulky processing code (due to looping and the lack of XPath-like expressions) on one side, and better XPath-like expressions at the cost of more external files on the other, JSON's supposed data-size advantage starts to mean less and less in the larger scheme of web applications and products, especially with larger data sets.

XML offers other advantages over JSON as well. Remember that XML, at heart, is a markup language, and in addition to representing data, it also describes the data. Sometimes data is semantic in nature. Most commonly, you would see this with HTML or XHTML:

<div>
    <p>The <strong>quick</strong> brown cow…</p>
</div>

This can also be the case with XML stored in a database, or with data that has been processed through a natural-language processor:

<content>
    <message type="notification">Service is currently under maintenance. For additional information, <event type="email" destination="IThappens@example.com">contact</event> the IT department.</message>
</content>

With this internal markup, a consuming application has enough information to do something with those "instructions," and since XML is truly language-agnostic, they can be transformed into a JavaScript browser event, a Windows Forms event, or instructions for a messaging system. JSON doesn't have this internal markup ability; instead, one would be forced to create additional key/value pairs to spell out the instructions.
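
A sketch of pulling those instructions out of the mixed content with Python's ElementTree, where the prose before and after the inline element survives as .text and .tail:

```python
import xml.etree.ElementTree as ET

xml = ('<message type="notification">Service is currently under maintenance. '
       'For additional information, <event type="email" '
       'destination="IThappens@example.com">contact</event>'
       ' the IT department.</message>')

message = ET.fromstring(xml)
event = message.find("event")

# The inline element is the machine-readable instruction...
print(event.attrib["type"], event.attrib["destination"])
# ...while the surrounding prose is still intact for human readers.
print(message.text + event.text + event.tail)
```

The same sentence serves both audiences at once: humans read it straight through, while the application extracts the event type and destination without disturbing the prose. A JSON encoding would have to split the sentence apart into separate keys to achieve the same thing.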

XML also offers the power of namespaces, including predefined ones such as xml:lang. Programmer Ben Longden (in an unrelated "JSON sucks" blog post) notes that with XML you get things like xml:lang for free, allowing namespaced attributes to specify different languages and creating built-in support for multilingual content. Longden argues that although JSON might be good for simple data formats, when used for media it falls short of XML's support for multiple languages. Although JSON can offer key/value pairs for each added language, not doing so from the beginning results in the changes to object types mentioned earlier, and also creates a less elegant data structure. Longden gives the following examples to show this deficit in elegance:

<error id="6" xml:lang="en">
    <message>Something bad happened</message>
    <message xml:lang="de">Etwas schlimmes ist passiert</message>
</error>

The XML above is opposed to the possible JSON structures below:

{
    "id": 6,
    "messages": [
        { "lang": "en", "message": "Something bad happened" },
        { "lang": "de", "message": "Etwas schlimmes ist passiert" }
    ]
}

{
    "id": 6,
    "messages": {
        "en": "Something bad happened",
        "de": "Etwas schlimmes ist passiert"
    }
}

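Reading those xml:lang attributes back out requires namespace awareness, since xml:lang lives in the reserved XML namespace. A sketch with Python's ElementTree:

```python
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"  # the reserved "xml:" prefix

xml = """<error id="6" xml:lang="en">
    <message>Something bad happened</message>
    <message xml:lang="de">Etwas schlimmes ist passiert</message>
</error>"""

error = ET.fromstring(xml)
for message in error.findall("message"):
    # A message without its own xml:lang inherits the document default.
    lang = message.get(XML_NS + "lang", error.get(XML_NS + "lang"))
    print(lang, message.text)
```

The inheritance behavior in the comment is exactly what xml:lang gives you for free: the English message needs no attribute at all, because the root element already declares the default language.
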
Namespaces and schemas also offer greater power to XML than to its JSON counterpart. Namespaces help solve conflicts between tags that share a name but represent different data. Schemas, meanwhile, offer a way to construct specific XML implementations for better interoperability, and provide a means of validation. Although JSON data structures may also be validated, XML--through XML Schema--has much more robust validation abilities, which is extremely important for enterprise messaging. In fact, XML Schema allows for such things as extension elements and substitution groups, giving an almost object-oriented, inheritance-like feel to XML files requiring validation.

The reality is that you lose a lot of data extensibility when you choose JSON over XML. XML can be extended in an almost trivial fashion through new tag elements, new attributes, and namespaces, but extending JSON is difficult outside of adding new key/value properties. As shown with the telephone example, and with Longden's xml:lang example, since XML is represented by nodes and node lists with text and attributes, adding additional data is as simple as tacking on a node; but in JSON, how do you extend a key/value property whose value type is an integer or a decimal?

Again, this isn't to say that JSON isn't a good data format. It's excellent for small segments of data consumed by JavaScript code, and it certainly requires less text to create the data object (as opposed to XML's need for opening and closing tags). ASP.NET Web API can return either XML or JSON, and if I'm writing client-side JavaScript code, I prefer JSON for most data consumption (although using jQuery to parse and access XML is just as simple in many ways). The problem is that JSON is limited as a data interchange format: it's less readable, it's less extensible, and simple changes can alter the structure and break legacy code if you're not careful. In addition, it lacks the maturity of powerful related tools like XPath, XSLT, and XML Schema--so much so that library developers are attempting to mimic those tools for JSON.