Programming is hard by Stephan Schmidt

David Pollak was right about XML and JSON

David Pollak was right about XML and JSON, but perhaps in a different way. XML cannot be converted to (clean) JSON.

Suppose we have a shopping cart in XML which we want to convert to JSON:

<cart>
  <items>
    <item><name>one</name></item>
    <item><name>two</name></item>
  </items>
</cart>

One representation in JSON would be (cart could be omitted):

{ "cart": { "items": [ { "name": "one" }, { "name": "two" } ] } }

We convert a list of nodes with the same name to a JSON array; xml2json-xslt does this, for example. What happens if we only have one item in our shopping cart?

<cart>
  <items>
    <item><name>one</name></item>
  </items>
</cart>

Then our converter cannot detect that items is a list and will convert the XML to:

{ "cart": { "items": { "name": "one" } } }

which is semantically something completely different, and very unpleasant for the receiver of our JSON, who sometimes gets an array and sometimes an object.
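The flip can be reproduced with a minimal Python sketch. This is not how xml2json-xslt is implemented; `naive_convert` is a hypothetical helper that promotes repeated child names to a list, and unlike the compact form above it keeps the `item` key. Attributes and mixed content are ignored.

```python
import xml.etree.ElementTree as ET


def naive_convert(element):
    """Map an element to Python data; repeated child names become lists."""
    children = list(element)
    if not children:
        return element.text  # leaf node: keep only the text
    result = {}
    for child in children:
        value = naive_convert(child)
        if child.tag in result:
            # Second occurrence of a name: promote the entry to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


two = ET.fromstring("<cart><items><item><name>one</name></item>"
                    "<item><name>two</name></item></items></cart>")
one = ET.fromstring("<cart><items><item><name>one</name></item></items></cart>")

print(naive_convert(two))  # {'items': {'item': [{'name': 'one'}, {'name': 'two'}]}}
print(naive_convert(one))  # {'items': {'item': {'name': 'one'}}}
```

The same document type yields a list in one case and a plain object in the other, so the receiver has to check the type before every access.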

One way to solve the problem is to annotate the XML (looks ugly but works):

<cart>
  <items type="list">
    <item><name>one</name></item>
  </items>
</cart>

and add an additional condition to the XSLT:

or ../@type[.='list']]

or to use namespacing (this doesn’t work yet: we get lots of namespaces and misuse XML namespaces):

<cart>
  <list:items>
    <item><name>one</name></item>
  </list:items>
</cart>
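Outside of XSLT, the effect of the type="list" annotation can be sketched in Python. The `convert` function below is a hypothetical illustration, not the XSLT condition itself: any element carrying type="list" always becomes an array, however many children it has.

```python
import xml.etree.ElementTree as ET


def convert(element):
    """Convert an element to Python data, honouring a type="list" annotation."""
    children = list(element)
    if not children:
        return element.text
    if element.get("type") == "list":
        # Annotated container: always emit an array, even for one child.
        return [convert(child) for child in children]
    # Assumes the children of a non-annotated element have distinct names.
    return {child.tag: convert(child) for child in children}


one = ET.fromstring(
    '<cart><items type="list"><item><name>one</name></item></items></cart>'
)
print(convert(one))  # {'items': [{'name': 'one'}]}
```

With the annotation, a cart with one item and a cart with two items both produce an array under items.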

So David was right, though I’m not sure he knew why ;-)

Thanks for listening.

Filed under: JSON, XML

About the author: Stephan has worked as head of development and CTO. He has 20 years of experience with different technologies, including Java, Rails and Python. Stephan’s main field of interest is maintainability and productivity in software development. All views are only his own.

Comments

Interesting mismatch. You could go the other way though, right? Since JSON does have the notion of an array/list, you would always be able to convert a JSON representation into an XML representation.

stephan

No, it won’t go the other way; see for example

http://stephan.reposita.org/archives/2008/04/04/rest-lean-json-and-xml-from-the-same-code/

{ items: [ 1, 2, 3] } cannot be converted to XML, because the type (the element name) of "1" etc. is missing.

One would need { items: { item: [1,2,3] } } to generate XML, which in itself is not nice JSON.
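A small Python sketch makes the missing name visible. `to_xml` is a hypothetical helper, and it assumes one particular convention: a key whose value is a list becomes repeated sibling elements named after that key. A bare list directly under the root element has nothing to name its entries with.

```python
import xml.etree.ElementTree as ET


def to_xml(name, value):
    """Build an element called `name` from a dict, a scalar, or (failing) a list."""
    element = ET.Element(name)
    if isinstance(value, dict):
        for key, child in value.items():
            if isinstance(child, list):
                # Each entry becomes a sibling element named after the key.
                for entry in child:
                    element.append(to_xml(key, entry))
            else:
                element.append(to_xml(key, child))
    elif isinstance(value, list):
        # The entries of this list have no element name of their own.
        raise ValueError(f"no element name for the entries of <{name}>")
    else:
        element.text = str(value)
    return element


wrapped = to_xml("items", {"item": [1, 2, 3]})
print(ET.tostring(wrapped, encoding="unicode"))
# <items><item>1</item><item>2</item><item>3</item></items>

try:
    to_xml("items", [1, 2, 3])
except ValueError as error:
    print(error)  # no element name for the entries of <items>
```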

André

Maybe you can solve this problem by using some conventions, e.g. if the name of the parent node is pretty much the same as the name of the child node, it has to be a list.
I think those conventions are necessary to describe and support the understanding of data structures. Otherwise there is no need for XML schema ;-)
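One possible reading of such a convention, sketched in Python: a parent is a list when its name is the child name plus "s". The rule and the helper names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def is_list_by_convention(element):
    """Hypothetical rule: <items> holding only <item> children is a list."""
    children = list(element)
    return bool(children) and all(
        element.tag == child.tag + "s" for child in children
    )


def convert(element):
    children = list(element)
    if not children:
        return element.text
    if is_list_by_convention(element):
        return [convert(child) for child in children]
    # Assumes the children of a non-list element have distinct names.
    return {child.tag: convert(child) for child in children}


cart = ET.fromstring("<cart><items><item><name>one</name></item></items></cart>")
print(convert(cart))  # {'items': [{'name': 'one'}]}

# Irregular plurals are not recognised by this naming rule:
entries = ET.fromstring("<entries><entry><name>one</name></entry></entries>")
print(convert(entries))  # {'entry': {'name': 'one'}}
```

The single-item cart now converts to an array, but only as long as every list in the document happens to follow the naming rule.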

Perhaps I’m missing something in this discussion, but I think the point this problem illustrates is that you can’t do a clean conversion without the XML schema. The schema will define the “maxOccurs” attribute of the “item” element in the “items” element. If it’s greater than one or unbounded, then it can make it an array. If not, then it won’t.
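As a rough Python sketch of that idea: read maxOccurs from a schema and emit an array for every element that may repeat. The XSD below is a hypothetical schema written for the cart example, and the lookup ignores scoping (an element name is treated the same wherever it appears), so this is an illustration rather than real schema validation.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema for the cart example.
schema = ET.fromstring("""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="cart"><xs:complexType><xs:sequence>
    <xs:element name="items"><xs:complexType><xs:sequence>
      <xs:element name="item" maxOccurs="unbounded"><xs:complexType><xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence></xs:complexType></xs:element>
    </xs:sequence></xs:complexType></xs:element>
  </xs:sequence></xs:complexType></xs:element>
</xs:schema>
""")


def repeatable_names(schema_root):
    """Collect the names of elements declared with maxOccurs > 1 or unbounded."""
    names = set()
    for declaration in schema_root.iter(XS + "element"):
        max_occurs = declaration.get("maxOccurs", "1")  # XSD default is 1
        if max_occurs == "unbounded" or int(max_occurs) > 1:
            names.add(declaration.get("name") or declaration.get("ref"))
    return names


def convert(element, repeatable):
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = convert(child, repeatable)
        if child.tag in repeatable:
            # The schema allows repetition: always collect into an array.
            result.setdefault(child.tag, []).append(value)
        else:
            result[child.tag] = value
    return result


cart = ET.fromstring("<cart><items><item><name>one</name></item></items></cart>")
print(convert(cart, repeatable_names(schema)))
# {'items': {'item': [{'name': 'one'}]}}
```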

stephan

@Andre: Yes thought about that, looks very fragile though.

@David: Yes, a schema is the same as “type” or a namespace like list:items - some meta information about the structure.

But when transforming XML to JSON within the browser using a schema:

a.) the schema is slow, needs another request and isn’t very easy to implement
b.) it is not generic: item needs to be defined. If the solution is specific rather than generic, it would be easier to write a list_xml2json.xsl file

See my posting Converting Restricted XML to Good-Quality JSON for something useful that can be done, given some assumptions and some schema information.

stephan

@John: Nice post, informative, thanks

MrZ

Hi everyone! :-D
This post is very interesting and raises a very important problem for anyone who, like me, wants some kind of light web-services protocol… It’s a very difficult discussion, so what follows are only ideas, and they may be wrong…

I think that we must separate the problem:

1] XML is extensible because a document on its own has no real semantics. I mean that a text like this means nothing:

one
two

Once I say ‘this tag means that’, everything is OK, and this is the task of DTD and XSD.

2] Namespaces were born to avoid conflicts and to give a way to express some kind of semantic relations…

3] XML has a DOM and this is important: it’s the description of ‘HOW’ and not ‘WHAT’. This is also why XML is useful… Parsing… Parsing…

4] Given the other points, we can try to simulate the DOM rather than XML itself! But one point of attention: this is a way to make a translation… So it’s right to say: “there’s no simple way to translate XML to JSON”… Preserving the value of the X in XML is the real problem…

{
  name: 'cart',
  type: 'ROOT',
  attributes: [],
  value: '',
  childs: [
    {
      name: 'items',
      type: 'NODE',
      attributes: [],
      value: '',
      childs: [
        {
          name: 'item',
          type: 'NODE',
          attributes: [],
          value: '',
          childs: [
            {
              name: 'name',
              type: 'NODE',
              attributes: [],
              value: 'one',
              childs: []
            }
          ]
        },
        {
          name: 'item',
          type: 'NODE',
          attributes: [],
          value: '',
          childs: [
            {
              name: 'name',
              type: 'NODE',
              attributes: [],
              value: 'two',
              childs: []
            }
          ]
        }
      ]
    }
  ]
}
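A structure like this can be produced mechanically. Here is a minimal Python sketch, assuming the field names used above (name, type, attributes, value, childs) and assuming that childs is always an array; `to_dom` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def to_dom(element, is_root=True):
    """Describe an element generically instead of mapping it to domain objects."""
    return {
        "name": element.tag,
        "type": "ROOT" if is_root else "NODE",
        "attributes": [
            {"name": key, "value": value} for key, value in element.attrib.items()
        ],
        "value": (element.text or "").strip(),
        # Always an array, so one child and many children look the same.
        "childs": [to_dom(child, is_root=False) for child in element],
    }


cart = ET.fromstring("<cart><items><item><name>one</name></item></items></cart>")
dom = to_dom(cart)
items = dom["childs"][0]
print(items["name"], len(items["childs"]))  # items 1
```

The array-or-object ambiguity disappears, at the price of JSON that describes the document structure rather than the cart.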

Now I can also generate a schema or apply a schema…
For me, the summary is that XML is a textual specification with XSD as its meta-specification, and if I have software that uses it, the complexity can’t be eliminated and must percolate to JSON…
All the best

MrZ

Hello,
I am surely being completely dumb and missing something very important, but I really don’t understand the problem. I would have converted

<cart><items><item><name>one</name></item></items></cart>

to at least

{ "cart": { "items": { "item": { "name": "one" } } } }

Otherwise, you are losing some information in the translation and there is no more structural equivalence between the XML and JSON.

As other people pointed out, converting a one element nested tag to a singleton list is a matter of 1) conventions or 2) typing. I think in common XML based tools I use like Ant or Maven, it is a common assumption that the pattern … denotes a collection.

Regards.

PS: thanks for your excellent articles!

stephan

@Arnaud:

“Otherwise, you are losing some information in the translation and there is no more structural equivalence between the XML and JSON.”

Yes.

There are two problems though: is the child of items an array or an object? If it depends on the number of children, this is very error prone for the client.

People don’t want to write long code in JS to access data. For example, compare:

mycart = ...
mycart.cart.items.item[0].name

vs.

mycart = ....
mycart.items[0].name

Many people consider the second one more readable.
