|
12 rats tied together posted: normally im down with weird markup language poo poo but if you've set out to create a version of xml that is easier for humans to read and write you ought to just use yaml instead
yeah, instead of writing xml we should write something worse
|
# ? Mar 30, 2024 23:04 |
|
we can't all write your posts
|
# ? Mar 30, 2024 23:09 |
|
12 rats tied together posted: normally im down with weird markup language poo poo but if you've set out to create a version of xml that is easier for humans to read and write you ought to just use yaml instead
12 yamls tied together
|
# ? Mar 31, 2024 04:13 |
|
yaml is probably the most hosed up of all the various web "developer" attempts to re-create XML
|
# ? Mar 31, 2024 04:15 |
|
shaggar was right
|
# ? Mar 31, 2024 04:17 |
|
redleader posted: 12 yamls tied together
hmm. to tie 12 yamls together, you would need to prefix each one with three dashes, and append three dots after each one, so you know when each document is complete. no other markup lang has this feature, for some reason
|
# ? Mar 31, 2024 04:24 |
|
~deep yaml~
|
# ? Mar 31, 2024 05:02 |
|
yaml is just what ruby people think json should be
|
# ? Mar 31, 2024 05:05 |
|
redleader posted: hmm. to tie 12 yamls together, you would need to prefix each one with three dashes, and append three dots after each one, so you know when each document is complete. no other markup lang has this feature, for some reason
pedantry alert: it’s a data serialization format that’s become popular as a configuration language. it’s not a markup language, despite the Yet Another Markup Language acronym because there is another acronym, more annoying because it is recursive: YAML Ain’t Markup Language. It’s a format without a boundary defined for its single """document""" form, so it needs separators. The JSON equivalent of multi-YAMLs would be JSON Lines? And idk if there’s a name for “several xml documents just appended to each other”.
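for what it's worth, the JSON Lines idea is easy to sketch: one complete JSON value per line, with the newline doing the job that the three dashes do between yaml documents. a minimal sketch in javascript, where `parseJsonLines` is a made-up helper name and not part of any library:

```javascript
// Sketch: treat each non-empty line as its own JSON document.
// `parseJsonLines` is a hypothetical helper, not a standard API.
function parseJsonLines(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // tolerate blank lines and a trailing newline
    .map((line) => JSON.parse(line));  // throws SyntaxError on a malformed line
}

const docs = parseJsonLines('{"name": "a"}\n{"name": "b"}\n[1, 2, 3]\n');
console.log(docs.length); // 3
```

this only works because a newline can never appear unescaped inside a JSON string, so a line break is a safe separator as long as each document is written on a single line.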
|
# ? Mar 31, 2024 05:46 |
|
1 xml 1 love (by prodigy)
|
# ? Mar 31, 2024 06:07 |
|
xml really is pretty flawed though. it has the headliner feature that pretty much makes up for everything else: it is standard, implemented everywhere, and you can put anything in it.
among its flaws is certainly that it is so verbosely ugly that it masks both structure and contents, making it unnecessarily hard to both read and write. but the big flaw is that it provides little structure by default, and the structure it does provide is pretty bad. i.e. everything is strings placed in an unranked tree, and the nodes can have properties attached.
there are weird conventions on top, but they mostly further confuse matters. for example order often not mattering (i believe the 1.0 spec in fact left it open, but generally people randomly treat it as mattering or not), and some very random ideas about what goes in attributes and what are nodes in themselves.
that might not sound bad, but what is that modelling? can you think of a programming language (or *good* database system) which structures data like that?
nah, if you're going to make a universal format of any complexity it is common sense that it should have some baseline types, like integers, decimal numbers (possibly some flavors), booleans, strings, probably some sensible way of doing dates and time (spans). then you should ensure that it is natural and easy to represent: sequences of things, sets of things, mappings, tables, oo-style object contents, and something like nested named sections (which is what xml kind of generalizes into everything, so it is fine there).
you can add xsd to close the gap a bit for xml, but there's a lot of cans of worms down that route, as where xml itself is mostly a set of weird choices, a lot of the "supporting" standards are dreamed up committee nonsense.
|
# ? Mar 31, 2024 09:47 |
|
tldr
|
# ? Mar 31, 2024 09:52 |
|
You forgot schema support as a big important feature.
|
# ? Mar 31, 2024 09:52 |
|
the big problem with XML is the M. most people don't use it to mark up text, which is the main thing that it was designed for. for that purpose it is mostly fine. although you also have an entire syntax in the form of doctype declarations which afaik is non-optional for anything that wants to call itself an XML parser, and it is way too complicated, almost completely unused in practice, and has been a source of security vulnerabilities due to things like external entity definitions.
what most people want is the simplest human readable and writable textual serialization format for structured data that could possibly work, which JSON is closer to than XML although JSON sucks in its own ways (i.e. it is too simple to be comfortable to write by hand due to its lack of comments, excessive quoting, and lack of support for trailing commas).
|
# ? Mar 31, 2024 10:33 |
|
ah, comments is a good point, also an entirely necessary cool and good feature. which xml indeed does have (one would wish they didn't look so much like the contents of the document, but that's hardly a serious offense)
|
# ? Mar 31, 2024 10:50 |
|
the amount of effort that has gone into people wanting to avoid writing a few closing tags never ceases to amaze
web "developers": this standard is too bloated we'll write something new and lightweight without taking a look at prior art
ten major versions, six reimplementations and tons of incompatibility later: xml but worse
|
# ? Mar 31, 2024 11:00 |
|
Sapozhnik posted: lack of support for trailing commas
I'm pissed off that nobody learned from clojure/edn on this front: Commas should be whitespace. In 99% of cases where they are mandatory (method/function params, sql IN clauses, JSON arrays...), they are (or could be made) redundant as usually the question of when one element ends and the next one starts is also deduced from some other attribute, like delimiters, whitespace, the order or the content of things or other syntactic elements. But we humans may want to use them for readability which is why it's useful to have them be whitespace because then you can put them wherever you want.
Also re xml as markup: i think markdown's idea that markup text should be readable as source is good, though honestly I don't know whether that can be made unambiguous. But anyway didn't XML gently caress up indentation for multiline text? At least some parsers do. Like: code:
Ocean of Milk fucked around with this message at 11:37 on Mar 31, 2024 |
# ? Mar 31, 2024 11:16 |
|
Sapozhnik posted: the big problem with XML is the M. most people don't use it to mark up text, which is the main thing that it was designed for. for that purpose it is mostly fine. although you also have an entire syntax in the form of doctype declarations which afaik is non-optional for anything that wants to call itself an XML parser, and it is way too complicated, almost completely unused in practice, and has been a source of security vulnerabilities due to things like external entity definitions.
json sucks for serialising data because it has a syntax for numbers but no defined semantics, so you can extremely easily end up silently losing data if (for example) you take the perfectly valid json output from the python standard library and use the perfectly standards-compliant parser and serialiser in jq to pretty-print it. in other words, it is every bit as bad as xml in its lack of adequate support for common data types.
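a quick way to see the problem from javascript, whose JSON.parse reads every number as a double. this is a sketch of the same kind of loss, not a reproduction of the python-to-jq pipeline itself; the value 2^53 + 1 is just a convenient example of an integer that python's json module will happily emit in full:

```javascript
// 2^53 + 1 is syntactically valid JSON, but it has no exact
// representation as an IEEE 754 double.
const original = "9007199254740993";

// Parse and re-serialise, the way a pretty-printer would.
const roundTripped = JSON.stringify(JSON.parse(original));

console.log(original);     // 9007199254740993
console.log(roundTripped); // 9007199254740992
```

nothing errors and nothing warns; the document that comes out is valid json, it just contains a different number.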
|
# ? Mar 31, 2024 11:41 |
|
I agree with you and I think the key insight here is that "XML is/can be used as a markup language for arbitrarily complex structured documents, but is often used as just a data serialization/deserialization format; for that purpose JSON is less of a PITA"
|
# ? Mar 31, 2024 12:25 |
|
just read more about YAML
kombucha-girl-no.png: arbitrary code exec in Python, lmao
kombucha-girl-maybe.png: IDs for nodes, links to identified nodes
|
# ? Mar 31, 2024 12:37 |
|
Ocean of Milk posted: Also re xml as markup: i think markdown's idea that markup text should be readable as source is good, though honestly I don't know whether that can be made unambiguous.
markdown is also clownshoes. asciidoc predates it and is better in every way other than adoption (it's also supported by github though)
|
# ? Mar 31, 2024 13:53 |
|
prisoner of waffles posted: just read more about YAML
vince-falling-over-backwards.gif: all nodes have a type. ability to define custom types. schemas for placing constraints on nodes and their type.
|
# ? Mar 31, 2024 14:30 |
|
leper khan posted: markdown is also clownshoes. asciidoc predates it and is better in every way other than adoption (it's also supported by github though)
markdown has a cooler name
|
# ? Mar 31, 2024 14:46 |
|
redleader posted: hmm. to tie 12 yamls together, you would need to prefix each one with three dashes, and append three dots after each one, so you know when each document is complete. no other markup lang has this feature, for some reason
so in order to concatenate yaml documents you have to tap out an SOS in morse code? makes sense
|
# ? Mar 31, 2024 15:15 |
|
asciidoc’s made the mistake of saying that it’s better syntax for docbook xml which makes it sound horribly complicated. looks like they’ve recently finally made the web page emphasize that it’s simple
|
# ? Mar 31, 2024 15:52 |
|
Ocean of Milk posted: I'm pissed off that nobody learned from clojure/edn on this front: Commas should be whitespace. In 99% of cases where they are mandatory (method/function params, sql IN clauses, JSON arrays...), they are (or could be made) redundant as usually the question of when one element ends and the next one starts is also deduced from some other attribute, like delimiters, whitespace, the order or the content of things or other syntactic elements. But we humans may want to use them for readability which is why it's useful to have them be whitespace because then you can put them wherever you want.
Soricidus posted: json sucks for serialising data because it has a syntax for numbers but no defined semantics, so you can extremely easily end up silently losing data if (for example) you take the perfectly valid json output from the python standard library and use the perfectly standards-compliant parser and serialiser in jq to pretty-print it.
yeah i mean we can bikeshed the exact characters used in the syntax but my thoughts are something along those lines as well. if i were to recreate something akin to json from scratch i'd actually probably simplify it even further. add comments and take away almost everything else that can possibly be taken away.
crush the data model down to the bare minimum: a value is a string, an array of values, or a map of strings to values. strings are always quoted and must be quoted using double quotes only along with a reasonably conservative set of escape sequences, array items have no delimiters at all other than whitespace, and map entries must end with semicolons or commas or whatever so effectively you have mandatory "trailing commas". alternatively allow both semicolons and LF characters to be used as map or array item delimiters. the syntax would support comments, starting with a single character like idk # and ending with a LF. maybe allow unquoted map keys as well, as a concession to hand-written content.
"numbers" come in way too many shapes and sizes (decimal and floating-point fractions, bit width, signedness, base) so punt on that to higher level validator and/or deserializer libraries and encode them as strings. kill nulls because having to distinguish between null values and missing values makes lossless round-tripping more complicated. booleans can be strings too; the only reason why numbers and null and booleans are special in JSON in the first place is because it inherited all of javascript's literal value syntax.
|
# ? Mar 31, 2024 15:57 |
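a rough sketch of what a parser for the format in the post above could look like, to show how little there is to it. lots of this is assumed rather than taken from the post: square brackets around arrays, curly braces around maps, a colon between key and value, a semicolon after every map entry, quoted keys only, and just three escapes (\" \\ \n). `parse` is a hypothetical name.

```javascript
// Sketch only. Data model: a value is a string, an array of values,
// or a map of strings to values. Brackets, braces, ":" and ";" are assumptions.
function parse(src) {
  let i = 0;

  // Skip whitespace and "# ..." comments that run to the end of the line.
  function skip() {
    while (i < src.length) {
      if (/\s/.test(src[i])) i++;
      else if (src[i] === "#") while (i < src.length && src[i] !== "\n") i++;
      else break;
    }
  }

  function expect(ch) {
    skip();
    if (src[i] !== ch) throw new SyntaxError(`expected ${ch} at offset ${i}`);
    i++;
  }

  function string() {
    expect('"');
    let out = "";
    while (i < src.length && src[i] !== '"') {
      if (src[i] === "\\") {
        // Deliberately small escape set (an assumption).
        const esc = { '"': '"', "\\": "\\", n: "\n" }[src[i + 1]];
        if (esc === undefined) throw new SyntaxError(`bad escape at offset ${i}`);
        out += esc;
        i += 2;
      } else {
        out += src[i++];
      }
    }
    if (i >= src.length) throw new SyntaxError("unterminated string");
    i++; // consume the closing quote
    return out;
  }

  function value() {
    skip();
    if (src[i] === '"') return string();
    if (src[i] === "[") {
      i++;
      const items = [];
      // Array items are separated by whitespace only.
      for (skip(); src[i] !== "]"; skip()) {
        if (i >= src.length) throw new SyntaxError("unterminated array");
        items.push(value());
      }
      i++;
      return items;
    }
    if (src[i] === "{") {
      i++;
      const map = {};
      for (skip(); src[i] !== "}"; skip()) {
        if (i >= src.length) throw new SyntaxError("unterminated map");
        const key = string();
        expect(":");
        map[key] = value();
        expect(";"); // mandatory terminator, i.e. a required "trailing comma"
      }
      i++;
      return map;
    }
    throw new SyntaxError(`unexpected input at offset ${i}`);
  }

  const result = value();
  skip();
  if (i < src.length) throw new SyntaxError(`trailing input at offset ${i}`);
  return result;
}

const parsed = parse(`
# numbers and booleans are just strings here
{
  "name": "demo";
  "port": "8080";
  "tags": ["a" "b" "c"];
}
`);
console.log(parsed.tags.length); // 3
```

note that "port" comes back as the string "8080"; deciding whether that is an integer is left to whatever validator or deserializer sits on top, which is the point of the proposal.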
|
but then you probably also want some sort of syntax for multi-line strings as well and now you have to start thinking about how that interacts with indentation. so yeah.
|
# ? Mar 31, 2024 16:02 |
|
Sapozhnik posted: crush the data model down to the bare minimum: a value is a string, an array of values, or a map of strings to values. strings are always quoted and must be quoted using double quotes only along with a reasonably conservative set of escape sequences, array items have no delimiters at all other than whitespace, and map entries must end with semicolons or commas or whatever so effectively you have mandatory "trailing commas"
This actually doesn't sound so bad; kind of odd that arrays don't need delimiters but maps do, but I guess it's
Sapozhnik posted: alternatively allow both semicolons and LF characters to be used as map or array item delimiters.
No wait, go back, this is
Sapozhnik posted: the syntax would support comments, starting with a single character like idk # and ending with a LF
wait go back
Sapozhnik posted: maybe allow unquoted map keys as well, as a concession to hand-written content.
youdied.jpg
I imagine the original YAML design discussion was pretty much that post continuing on for five pages.
|
# ? Mar 31, 2024 16:04 |
|
"syntax was a mistake" is an unironic legitimate plang position
|
# ? Mar 31, 2024 16:10 |
|
semantics was a mistake computers are trash
|
# ? Mar 31, 2024 16:18 |
|
yeah think from now on I'll make a point to frame it as a markdown/asciidoc replacement for when you need a lot of escape hatches instead of an abbreviated form of xml
|
# ? Mar 31, 2024 17:29 |
|
not that i'm planning to do the other two or more 95%s of work to make the language useful to anyone but myself after implementing it
|
# ? Mar 31, 2024 17:35 |
|
personally if i were to write my own config format id just do file.split('\n').filter((line) => !line.startsWith('#')).map((line) => line.split('=').map((token) => token.trim())).reduce((acc, [k, v]) => { return {...acc, [k]: v} }, {}) and be done with it
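the same idea spelled out as a runnable sketch. `parseConfig` is a made-up name, and two behaviours here are assumptions layered on top of the one-liner: blank lines are skipped (the one-liner would turn them into an empty-string key), and only the first "=" splits the line, so values may contain "=" themselves.

```javascript
// Hypothetical helper: parse "key = value" lines into a plain object of strings.
function parseConfig(file) {
  return file
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#")) // drop blanks and comments
    .reduce((acc, line) => {
      const eq = line.indexOf("=");
      if (eq === -1) return acc; // assumption: silently ignore lines without "="
      const key = line.slice(0, eq).trim();
      const value = line.slice(eq + 1).trim(); // everything after the first "="
      return { ...acc, [key]: value };
    }, {});
}

const config = parseConfig("# comment\nname = demo\nurl = http://x/?a=b\n\n");
console.log(config.name); // demo
console.log(config.url);  // http://x/?a=b
```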
abraham linksys fucked around with this message at 17:46 on Mar 31, 2024 |
# ? Mar 31, 2024 17:40 |
|
if you just want strings then sure
|
# ? Mar 31, 2024 19:01 |
|
they’re just strings once you serialize them. let the strings carry type information themselves, or let the schema specify it, as you prefer
|
# ? Mar 31, 2024 19:04 |
|
i learned the other day that yaml lists sometimes have two possible indentation levels that are semantically identical. i'm half embarrassed for having not known that and half absolutely not embarrassed for having not known that.
|
# ? Mar 31, 2024 19:37 |
|
raminasi posted: i learned the other day that yaml lists sometimes have two possible indentation levels that are semantically identical. i'm half embarrassed for having not known that and half absolutely not embarrassed for having not known that.
im embarrassed that it is
|
# ? Mar 31, 2024 19:45 |
|
what do you mean sometimes? sequences like lists use indentation for scope. you can always indent however many times you feel like in YAML, the only rule is that things that share scope (items in the same list) need to be at the same indentation level. you can indent one, two, five hundred spaces, it doesn't matter because indentation is a presentation detail only
|
# ? Mar 31, 2024 19:47 |
|
12 rats tied together posted: what do you mean sometimes? sequences like lists use indentation for scope. you can always indent however many times you feel like in YAML, the only rule is that things that share scope (items in the same list) need to be at the same indentation level. you can indent one, two, five hundred spaces, it doesn't matter because indentation is a presentation detail only
code:
|
# ? Mar 31, 2024 19:54 |
|
bob dobbs is dead posted:"syntax was a mistake" is my unironic position with respect to yaml
|
# ? Mar 31, 2024 20:10 |