YAML seems like a really neat idea, but over time I have come to regard it as too complicated for me to use for configuration.
My personal favorite is TOML, but I would even prefer plain JSON over YAML.
The last thing I want at 2 AM, when trying to figure out whether an outage is due to a configuration change, is having to think about whether each line of my configuration is doing the thing I want.
YAML prizes making data look nicely formatted over simplicity or precision. That, for me, is not a tradeoff I am willing to make.
- The format seems to feel the need to support everything, including things I am not sure are actual use cases (what's the point of the Markup element, for example? What does Metadata save us compared to just including it in the document, given that parsers must parse it anyway?). This must make implementation more complex and costly, and it makes the text format harder to read.
- Not a fan of octal notation. At 3 AM I am not sure I wouldn't confuse 0 and o in certain fonts. Does anyone even use it these days?
- Unquoted strings were discussed in the thread. I'd like to point out that it's very easy to make an unquoted string not "text-safe" (according to the spec) without noticing it, at which point the document is invalid.
Just add whitespace (maybe a user pasted a string from somewhere without noticing whitespace at the end, or forgot the rules), a dot, an exclamation mark or a question mark. Having surprises like that is IMHO worse than a consistent quoting method.
Basically all the things I don't like are about the format supporting a bit too much. YAML 1.1 should teach us more is sometimes less.
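To make the unquoted-string concern concrete, here is a minimal sketch in Python. The `BARE_SAFE` pattern is a hypothetical stand-in for a "text-safe" rule, not the spec's actual definition, and `needs_quotes` is an illustrative name. It only shows how small, easy-to-miss changes push a value out of the bare-string set.

```python
import re

# Hypothetical "text-safe" rule for bare strings (an assumption, NOT the
# real spec): a letter or underscore, then letters, digits, "_" or "-".
BARE_SAFE = re.compile(r"[A-Za-z_][A-Za-z0-9_\-]*")


def needs_quotes(value: str) -> bool:
    """Return True when value cannot be written as a bare string under BARE_SAFE."""
    return BARE_SAFE.fullmatch(value) is None


# The first value is fine; the others differ by one trailing space, a dot,
# an exclamation mark, or a question mark.
samples = ["hostname", "hostname ", "v1.2", "ready!", "why?"]
flagged = [s for s in samples if needs_quotes(s)]
```

Under this assumed rule, every sample except the first ends up in `flagged`, and the trailing-space case is invisible in most editors.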
Alright that's two votes against unquoted strings so far (plus my wife agrees so that's three against!)
I put in octal because it was trivial to implement after the others. The canonical format when it's stored or being sent is binary, and a decoder shouldn't be presenting integers in octal (that would just be weird). But a human might want octal when inputting data that will be converted to the binary format.
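As a rough illustration of "octal on input, never on output", the sketch below uses Python's own integer literal syntax as a stand-in for the text format's notation (that substitution is an assumption, and `parse_int_literal` is a hypothetical helper). The human types a prefixed octal value, and the stored value is just an integer.

```python
def parse_int_literal(text: str) -> int:
    """Parse a decimal, 0x hex, 0o octal, or 0b binary literal into an int.

    Base 0 tells int() to follow Python's literal rules, so an explicit
    prefix is required for anything other than decimal.
    """
    return int(text, 0)


# A human enters a Unix-style permission mask in octal...
mode = parse_int_literal("0o755")

# ...but a bare leading zero is rejected rather than silently read as octal.
try:
    parse_int_literal("0755")
    leading_zero_accepted = True
except ValueError:
    leading_zero_accepted = False
```

Requiring the explicit `0o` prefix avoids the "is 010 eight or ten?" ambiguity, though it does not help with fonts where `0` and `o` look alike.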
Markup is for presentation data, UI layouts, etc, but with full type support rather than all the hacky XML+whatever solutions that many UI toolkits are adopting. Also, having presentation data in binary form is nice to have.
Well, unquoted strings work when a format is built for that. If the default was "it's text unless we see the special sequences" it would be better for unquoted strings. But even then there are too many special characters in this format IMHO.
I saw there's a 'Media' type in the spec. It seems the type is actually for serializing files. But there's no "name" (or we can call it "description") field. Of course we could accomplish this with a separate field, but then again the entire type's functionality could be accomplished with a u8x array and a string field. So if you're specifying this type at all, you might as well add a name field to make it useful.
The media object is for embedding media within a document (an image, a sound, an animation, some bytecode to execute in a sandbox, or whatever). It's not intended to be used as an archive format for storing files (which, as you said, could be trivially accomplished with a byte array for the data, a string for the file name, and some metadata like permissions etc). A file is just one way among many to store media (in this case as an entry in a hierarchical database - the filesystem - keyed by filename). CE is only interested in the media itself, not the database technology.
The media object is a way to embed media data directly into a document such that the receiving end will have some idea of how to deal with it (from its media type). It won't have or need a "file name" because it's not intended to be stored in a filesystem, but rather to be used directly by an application. Yes, it could be built up from the primitives, but then you lose the canonical "media" type, and everyone invents their own incompatible compound types (much like what happened with dates in JSON and XML).
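The point about dates in JSON can be shown with Python's standard `json` module. This is only a sketch: the three encodings below are examples of conventions that different producers might pick, not a survey of what any particular project does.

```python
import json
from datetime import datetime, timezone

moment = datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc)

# JSON has no date type, so the standard encoder refuses a datetime outright.
try:
    json.dumps({"created": moment})
    native_support = True
except TypeError:
    native_support = False

# Three plausible ad hoc conventions for the same instant.
as_iso = json.dumps({"created": moment.isoformat()})
as_epoch = json.dumps({"created": int(moment.timestamp())})
as_parts = json.dumps(
    {"created": {"y": moment.year, "m": moment.month, "d": moment.day}}
)
```

A consumer has to know out of band which of these it is looking at, which is the incompatibility a canonical type is meant to avoid.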
I'm skimming through the human readable spec, and it seems decent, but I noticed the spec allows unquoted strings. What's the reasoning for this? In my experience unquoted strings cause nothing but trouble, and are confusing to humans who may interpret them as keywords.
Any reason for not using RFC2119 keywords in the spec? Using them should make the spec easier to read.
> I noticed the spec allows unquoted strings. What's the reasoning for this? In my experience unquoted strings cause nothing but trouble, and are confusing to humans who may interpret them as keywords.
Unquoted strings are much nicer for humans to work with. All special keywords and object encodings are prefixed with sigils (@, &, $, #, etc), so any bare text starting with a letter is either a string or an invalid document, and any bare text starting with a numeral is either a number or an invalid document.
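A minimal sketch of that first-character rule, in Python. The sigil set contains only the four characters named above (the "etc" means the real list is longer), and `classify` and its return labels are hypothetical names for illustration, not part of any real decoder.

```python
# Only the sigils named in the comment above; the real set is an assumption.
SIGILS = set("@&$#")
DIGITS = set("0123456789")


def classify(token: str) -> str:
    """Classify a bare token by its first character alone."""
    if not token:
        raise ValueError("empty token")
    first = token[0]
    if first in SIGILS:
        return "special"            # keyword or special object encoding
    if first in DIGITS:
        return "number-or-invalid"  # must parse as a number, else error
    if first.isalpha():
        return "string-or-invalid"  # must be a valid bare string, else error
    return "other"                  # quotes, containers, signs, and so on


kinds = [classify(t) for t in ["true", "@true", "42", "4two"]]
```

Under this rule a bare `true` is never a boolean, and `4two` is rejected as a malformed number rather than being quietly treated as text.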
> Any reason for not using RFC2119 keywords in the spec? Using them should make the spec easier to read.
If strings are always unambiguously detectable, why allow quoting them at all? Having two representations for the same data means you can't normalize a document unambiguously. I can understand that barewords seem cleaner for things like map keys, but I am not convinced that it's a worthwhile tradeoff.
An important feature of RFC2119 keywords is that they're always capitalized (ie. the keyword is "MUST", not "Must", or "must"). This makes requirements and recommendations stand out amid explanatory text, improving legibility. For example, RFC2119 itself uses MUST and must with different meanings.
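One practical consequence of the capitalization convention is that requirements become mechanically searchable. The sketch below is a small case-sensitive matcher over the keywords listed in RFC 2119. The sample sentence and the helper name `find_requirements` are made up for illustration.

```python
import re

# Longer phrases come first so "MUST NOT" is not matched as plain "MUST".
KEYWORDS = re.compile(
    r"\b(MUST NOT|SHALL NOT|SHOULD NOT|MUST|REQUIRED|SHALL|SHOULD|"
    r"RECOMMENDED|MAY|OPTIONAL)\b"
)


def find_requirements(text: str) -> list:
    """Return the capitalized requirement keywords in order of appearance."""
    return KEYWORDS.findall(text)


sample = (
    "A decoder MUST reject invalid documents. Users must remember this, "
    "and encoders SHOULD NOT emit octal."
)
found = find_requirements(sample)
```

The lowercase "must" in the second sentence is ignored, so only the normative statements are picked out.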
> If strings are always unambiguously detectable, why allow quoting them at all?
Because strings can contain whitespace and other structural characters that would confuse a parser.
> Having two representations for the same data means you can't normalize a document unambiguously.
The document will always be normalized unambiguously in binary format. The text format is a bit more lenient because humans are involved.
The idea is that the binary format is the source of truth, and is what is used in 90% of situations. The text format is only needed as a conduit for human input, or as a human readable representation of the binary data when you need to see what's going on.
> An important feature of RFC2119 keywords is that they're always capitalized (ie. the keyword is "MUST", not "Must", or "must").
It's a compromise; there are only so many letters, numbers, and symbols available in a single keystroke on all keyboards, and I don't want there to be any ambiguity with numbers and unquoted strings (e.g. interpreting the unquoted string value true as the boolean value true).
So everything else needs some kind of initiator and/or container syntax to logically separate it from the other objects when interpreted by a human or machine.
XML with a convenient UI tool for editing should have fit the bill. Yet, for whatever reason, a convenient UI tool would never happen to be there when needed, and so, scared and tired of editing XML by hand, the world has embraced YAML.
> XML with a convenient UI tool for editing should have fit the bill.
"You need this special tool to work" immediately and instantly rules out "easy to edit". Or makes the debate irrelevant: every format is easy to edit if you have "a convenient UI" to do it for you.
The fault was in XML editing; pure data authoring is hard. We have a convenient UI, the web browser: think of it as literate programming, a way to merge man page and configuration file.
And a plain text editor is a "widely deployed special tool to work". Actual data is
Only when you "unmarshal" to an untyped data structure and then make assumptions about the type. I've used YAML with a Go application, and it can't interpret NO as a bool when the field is a string.
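The difference between untyped and typed decoding can be sketched without any YAML library. The code below is not how any particular Go or Python package is implemented. It only models the idea, using the implicit boolean spellings from the YAML 1.1 bool type (lowercase, Capitalized, and UPPERCASE forms of y/yes/true/on and n/no/false/off). The function names are hypothetical.

```python
def _variants(words):
    """Expand each word into its lowercase, Capitalized and UPPERCASE forms."""
    return {form for w in words for form in (w, w.capitalize(), w.upper())}


TRUE_FORMS = _variants({"y", "yes", "true", "on"})
FALSE_FORMS = _variants({"n", "no", "false", "off"})


def resolve_untyped(scalar: str):
    """Guess the type from the text alone, as an untyped load would."""
    if scalar in TRUE_FORMS:
        return True
    if scalar in FALSE_FORMS:
        return False
    return scalar


def resolve_typed(scalar: str, target: type):
    """Let the declared type of the destination field decide."""
    if target is str:
        return scalar  # a string field keeps the text exactly as written
    if target is bool:
        value = resolve_untyped(scalar)
        if isinstance(value, bool):
            return value
        raise ValueError(f"{scalar!r} is not a boolean")
    raise TypeError(f"unsupported target type: {target!r}")


country_untyped = resolve_untyped("NO")     # Norway becomes False
country_typed = resolve_typed("NO", str)    # stays the string "NO"
```

With a typed destination the surprise disappears, which matches the experience described above. The cost is that the schema has to live somewhere outside the document.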