XML had gained complexity: schemas, namespaces, validation, translation, and then was extended to cover RPC behaviour via SOAP, WSDL, etc, with some annoyances and papercuts around escaping things and using CDATA.
At a certain point the onboarding requirement for a new engineer who just wanted to exchange some data between two systems or write a config file was no longer "open a text file and write what you want" but "learn all of this tooling and perform this incantation".
At this point convenience won.
Of course, time passes and now we're slowly adding in all of the complexity again as it turns out it's needed at a certain scale to be sure that systems work as intended.
YAML gained traction when managing JSON configs via build tools became troublesome (trailing commas within config files for arrays built via Puppet, etc.). YAML solved that, and also made configs a little easier to read (at the cost of other compromises, like being whitespace-sensitive).
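To make the trailing-comma papercut concrete, here is a minimal sketch using Python's standard `json` module. The `generated` string stands in for what a templating tool might emit when it writes a comma after every list item; the hostnames and the `parses_as_json` helper are made up for illustration.

```python
import json

# Hypothetical output of a template that emits a comma after every item.
generated = '{"servers": ["a.example", "b.example",]}'
hand_fixed = '{"servers": ["a.example", "b.example"]}'


def parses_as_json(text: str) -> bool:
    """Return True if the stdlib parser accepts the text as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Strict JSON has no trailing commas, so the generated text is rejected.
trailing_ok = parses_as_json(generated)
clean_ok = parses_as_json(hand_fixed)
```

A YAML block list has no separators between items, so a template can emit one item per loop iteration without special-casing the last one.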
Perhaps something else more convenient is just around the corner and we can do it all again.
If you were to build a competitor to XML/JSON/YAML today... what would it look like?
For structured data, I still find XML superior to other formats, especially if it's gonna live on the disk for a long time.
I used XML as an input file format for a scientific application and I still can't see how I could store the same data in a YAML file, with inherent verification capabilities and with some flexibility for reordering blocks.
For smaller stuff, YAML is fine, for transient over the wire data exchange, JSON is fine, but for heavy lifting, XML is indispensable IME.
What really annoys me is where you've got perfectly workable XML document standards getting replaced (or at least competed with for brain time) by JSON standards that are measurably worse.
There's one document standard in particular which is designed for news articles, and in that case you want and need to be able to embed HTML representations in the outer document wrapper: this is a perfect fit for XML namespaces, it's almost exactly the use case they were intended for. But looking at the replacement JSON standard I find string fields called things like "body_html" or "body_text" and it just makes me _weep_.
From my understanding, JSON won not because it's superior, but because it's much easier to build and consume.
To some degree, YAML seems to be liked because it shares a feature with Python: scoping via indentation. However, I'm not sure about that.
XML is akin to the space shuttle. Complicated, but well thought out. Well designed for its mission, robust, but somewhat clunky. However, it's a much more accessible member of the "software engineering for the enterprise" era.
These features are not fit for today's "fast" software engineering. Who'll parse that XML, let alone verify it? Who'll write the callbacks? Or is a DOM parser better? Today people "don't have time for that (TM)".
On the more realistic side, XML is really useful and robust. Yes, it's not as fast to implement, but it's forgettable. Add a DTD verification step, then parse away. After ironing the kinks out, your parser can outlive Voyager probes, maybe even humanity itself. But it's overkill for most "move fast, break things" projects of today.
I, for one, will use XML for the foreseeable future for my projects. I won't whine about consuming JSON and writing hard-to-understand YAML files, but if I'm going to exchange big, important data and store it on disk, it'll be XML.
Here's the thing: for that article standard use case, if I want to do anything more interesting with the document body than just hand it off to a web browser, I need to parse it anyway. I've not saved any work at all. For the specific case of semi-structured documents, JSON just isn't the right tool. The problem is that it is a good tool for the far simpler and more common case of trivial string key-value maps, so everyone tries to ram their pegs into that hole, regardless of shape.
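A rough sketch of the "I need to parse it anyway" point, using only the Python standard library. The article format and its `urn:example:news` namespace are invented for illustration and are not the real standard being discussed; only the XHTML namespace URI is real.

```python
import json
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical article wrapper; "urn:example:news" is a made-up namespace.
xml_doc = """
<article xmlns="urn:example:news" xmlns:h="http://www.w3.org/1999/xhtml">
  <headline>Format wars</headline>
  <body><h:p>See <h:a href="https://example.org/a">this</h:a>.</h:p></body>
</article>
""".strip()

json_doc = json.dumps({
    "headline": "Format wars",
    "body_html": '<p>See <a href="https://example.org/a">this</a>.</p>',
})

# XML: one parse, and the embedded markup is already part of the tree.
root = ET.fromstring(xml_doc)
xml_links = [a.get("href") for a in root.iter(f"{{{XHTML}}}a")]

# JSON: the first parse yields an opaque string, so a second parser is needed.
# An XML parser is enough here only because this fragment is well-formed;
# real-world HTML would need an HTML parser instead.
body = json.loads(json_doc)["body_html"]
json_links = [a.get("href") for a in ET.fromstring(body).iter("a")]
```

Both routes end up with the same list of links; the JSON route simply hides the second parsing step inside a string field.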
As far as it being easy to consume goes, if we're in the browser, DOMParser is right there - it's a one-liner to get a DOM out.
This annoys me more than it probably should. But it also annoys me that XML didn't inherit `</>` as a generic closing tag from SGML. That would have gone a long way towards satisfying people who think XML is too verbose.
XML is great for its intended purpose - structured text with metadata.
For other purposes like configuration files or RPC, the element/attribute distinction is superfluous and just leads to overly verbose syntax.
Unfortunately the hype cycle is such that when a technology becomes fashionable it becomes used even in contexts where it is not appropriate. This in turn leads to a backlash against the overall technology, not just against inappropriate use of it.
XML is awful for structured text with metadata. Half of the time you don't know whether your object properties were -or should be- stored as element attributes, element children or a CDATA block when trying to access it programmatically.
Defining a sane schema fit for all the places and use cases where it will be needed is an exercise in frustration. Lightweight markup won for structured text because it is simple to use, and you can expand it in a piecemeal fashion as needed. See Worse-is-better and the adoption of the C language for how this happens.
This happens even for well-designed lightweight markup; everybody prefers Markdown over AsciiDoc because you only need to learn like 5 syntax elements to get up and running, even if it forces you to add by hand later all the functions that were already available in AsciiDoc.
Since you talk about objects with properties, I assume you are thinking about serializing an object graph to XML. I agree the element/attribute distinction and mixed content is not very useful in this context. JSON or similar is fine for such data.
Structured text with metadata is something like this:
<b>Hello, <a href="foo.org">world</a>!</b>
The element/attribute distinction and mixed content is useful here.
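As a small illustration, here is how that snippet looks once parsed with Python's `xml.etree.ElementTree` (one possible API among many; the variable names are mine). The attribute carries metadata about the link, while the text runs around and inside the child element stay in document order.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<b>Hello, <a href="foo.org">world</a>!</b>')
link = root.find("a")

leading_text = root.text         # text before the first child element
link_target = link.get("href")   # metadata lives in the attribute
link_text = link.text            # content lives in the element
trailing_text = link.tail        # text after </a>, still inside <b>

# Dropping the markup recovers the plain sentence.
plain = "".join(root.itertext())
```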
Btw. CData-blocks are purely an escape mechanism on the syntactic level. It should be completely transparent on the application level (e.g. like backslash escapes in a JSON string).
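A quick check of that claim with Python's standard library, as a sketch: the same text written with entity escapes and with a CDATA block comes out identical after parsing, just as two spellings of a JSON string do. The sample strings are arbitrary.

```python
import json
import xml.etree.ElementTree as ET

# Same content, two syntactic spellings.
escaped = ET.fromstring("<code>if (a &lt; b &amp;&amp; c) {}</code>")
cdata = ET.fromstring("<code><![CDATA[if (a < b && c) {}]]></code>")
same_xml_text = escaped.text == cdata.text

# The JSON analogue: an escape sequence versus the literal character.
same_json_text = json.loads('"a\\u003cb"') == json.loads('"a<b"')
```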
One of the problems that XML aims to solve is that you can get two XML documents and combine elements of them into a third one, maintaining the right semantics, even if each of the documents use the `<foo>` tag to mean different things. It accomplishes this with namespacing.
In my experience, it's a headache to handle namespaces in XML, e.g. when trying to refer to tag names in an XPath query.
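Both sides of this can be seen in a short sketch with Python's `xml.etree.ElementTree`, which supports a limited XPath subset. The document and both namespace URIs are hypothetical. Two `title` elements with different meanings coexist without clashing, but every query has to spell out which namespace it means.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; both namespace URIs are invented for illustration.
doc = ET.fromstring("""
<order xmlns="urn:example:shop" xmlns:p="urn:example:people">
  <title>Order 17</title>
  <p:customer><p:title>Dr.</p:title></p:customer>
</order>
""".strip())

# Prefixes used in queries are local to this mapping; they do not have to
# match the prefixes used in the document itself.
ns = {"shop": "urn:example:shop", "people": "urn:example:people"}

# The headache: an unqualified name matches nothing, because the element's
# real name includes the default namespace.
naive = doc.find("title")

# The payoff: the two kinds of title are unambiguous once qualified.
order_title = doc.find("shop:title", ns).text
person_title = doc.find("people:customer/people:title", ns).text
```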
That is one example of how XML goes beyond S-expressions. I still don't think it's a particularly hard problem.
There's no reason why any of that couldn't be replicated with JSON, YAML or even with S-expressions.
To play the devil's advocate, one of the nice features of XML is schemas.
The reason why complex formats with validation etc. don't really work out is that they are most often used for data in transit as opposed to data at rest. The correct format for typed data at rest that conforms to a schema is a SQL database and that won't change any time soon. If you're working with data in transit, schemas, validation etc. are more trouble than they're worth.
There's no reason why it couldn't be replicated, but it isn't - at least, not in any way with broad traction. It's not useful to compare something that could exist but doesn't with something that not only exists but is universal in its ecosystem. That universality is a feature in itself.
I can't believe Altova is still in the game! They must have a loyal base of enterprise users. It's quite shocking they haven't bothered to make an official macOS or Linux version after all these years.
Interesting. Just wondering, are people still using Eclipse often for its extensibility?
I checked a year or two ago on the plug-in situation and it sorta looked like an abandoned wasteland. IntelliJ seems to have absorbed most Java devs, while newer tools such as VS Code are the new go-to for customization.
It even feels like the Eclipse Foundation is abandoning Eclipse a bit, as it’s pushing its web-based tools…
Incremental compiler, ability to do mixed-language development and debugging for JNI, shortcuts that don't need both hands, automatically displaying errors without me having to trigger inspections, Javadoc displayed without having to configure it, my laptop doesn't sound like a propeller plane, ...