Visions of Valerie
Jun 18, 2023

Come this autumn, we'll be miles away...

12 rats tied together posted:

normally im down with weird markup language poo poo but if you've set out to create a version of xml that is easier for humans to read and write you ought to just use yaml instead

yeah, instead of writing xml we should write something worse

CPColin
Sep 9, 2003

Big ol' smile.
we can't all write your posts

redleader
Aug 18, 2005

Engage according to operational parameters

12 rats tied together posted:

normally im down with weird markup language poo poo but if you've set out to create a version of xml that is easier for humans to read and write you ought to just use yaml instead

12 yamls tied together

Shaggar
Apr 26, 2006
yaml is probably the most hosed up of all the various web "developer" attempts to re-create XML

Kazinsal
Dec 13, 2011



shaggar was right

redleader
Aug 18, 2005

Engage according to operational parameters

redleader posted:

12 yamls tied together

hmm. to tie 12 yamls together, you would need to prefix each one with three dashes, and append three dots after each one, so you know when each document is complete. no other markup lang has this feature, for some reason

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



~deep yaml~

akadajet
Sep 14, 2003

yaml is just what ruby people think json should be

prisoner of waffles
May 8, 2007

Ah! well a-day! what evil looks
Had I from old and young!
Instead of the cross, the fishmech
About my neck was hung.

redleader posted:

hmm. to tie 12 yamls together, you would need to prefix each one with three dashes, and append three dots after each one, so you know when each document is complete. no other markup lang has this feature, for some reason

pedantry alert: it’s a data serialization format that’s become popular as a configuration language. it’s not a markup language, despite the Yet Another Markup Language acronym because there is another acronym, more annoying because it is recursive: YAML Ain’t Markup Language.

It’s a format without a boundary defined for its single """document""" form, so it needs separators. The JSON equivalent of multi-YAMLs would be JSON Lines? And idk if there’s a name for “several xml documents just appended to each other”.
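For comparison, a rough Python sketch of reading JSON Lines, where the only separator is the newline between documents. The function name `read_json_lines` is made up for illustration, and this assumes one complete JSON value per line:

```python
import json

def read_json_lines(text):
    """Parse JSON Lines: one complete JSON value per line, blank lines skipped."""
    # Split on "\n" only; str.splitlines() would also split on characters
    # such as U+2028, which may legally appear inside a JSON string.
    return [json.loads(line) for line in text.split("\n") if line.strip()]

sample = '{"name": "first"}\n[1, 2, 3]\n"third"\n'
docs = read_json_lines(sample)
print(docs)
```

Unlike YAML's `---` and `...` markers, this only works because a newline can never appear unescaped inside a compact JSON value.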

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



1 xml 1 love

(by prodigy)

Cybernetic Vermin
Apr 18, 2005

xml really is pretty flawed though.

it has the headliner feature that pretty much makes up for everything else: it is standard, implemented everywhere, and you can put anything in it.

among its flaws is certainly that it is so verbosely ugly that it masks both structure and contents, making it unnecessarily hard to both read and write.

but the big flaw is that it provides little structure by default, and the structure it does provide is pretty bad. i.e. everything is strings placed in an unranked tree, and the nodes can have properties attached. there are weird conventions on top, but they mostly further confuse matters. for example order often not mattering (i believe the 1.0 spec in fact left it open, but generally people randomly treat it as mattering or not), and some very random ideas about what goes in attributes and what are nodes in themselves.

that might not sound bad, but what is that modelling? can you think of a programming language (or *good* database system) which structures data like that?

nah, if you're going to make a universal format of any complexity it is common sense that it should have some baseline types, like integers, decimal numbers (possibly some flavors), booleans, strings, probably some sensible way of doing dates and time (spans). then you should ensure that it is natural and easy to represent: sequences of things, sets of things, mappings, tables, oo-style object contents, and something like nested named sections (which is what xml kind of generalizes into everything, so it is fine there).

you can add xsd to close the gap a bit for xml, but there's a lot of cans of worms down that route, as where xml itself is mostly a set of weird choices, a lot of the "supporting" standards are dreamed up committee nonsense.

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



tldr

Xarn
Jun 26, 2015
You forgot schema support as a big important feature.

Sapozhnik
Jan 2, 2005

Nap Ghost
the big problem with XML is the M. most people don't use it to mark up text, which is the main thing that it was designed for. for that purpose it is mostly fine. although you also have an entire syntax in the form of doctype declarations which afaik is non-optional for anything that wants to call itself an XML parser, and it is way too complicated, almost completely unused in practice, and has been a source of security vulnerabilities due to things like external entity definitions.

what most people want is the simplest human readable and writable textual serialization format for structured data that could possibly work, which JSON is closer to than XML although JSON sucks in its own ways (i.e. it is too simple to be comfortable to write by hand due to its lack of comments, excessive quoting, and lack of support for trailing commas).

Cybernetic Vermin
Apr 18, 2005

ah, comments is a good point, also an entirely necessary cool and good feature. which xml indeed does have (one would wish they didn't look so much like the contents of the document, but that's hardly a serious offense)

Sagacity
May 2, 2003
Hopefully my epitaph will be funnier than my custom title.
the amount of effort that has gone into people wanting to avoid writing a few closing tags never ceases to amaze

web "developers": this standard is too bloated we'll write something new and lightweight without taking a look at prior art

ten major versions, six reimplementations and tons of incompatibility later: xml but worse

Ocean of Milk
Jun 25, 2018

oh yeah

Sapozhnik posted:

lack of support for trailing commas

I'm pissed off that nobody learned from clojure/edn on this front: Commas should be whitespace. In 99% of cases where they are mandatory (method/function params, sql IN clauses, JSON arrays...), they are (or could be made) redundant as usually the question of when one element ends and the next one starts is also deduced from some other attribute, like delimiters, whitespace, the order or the content of things or other syntactic elements. But we humans may want to use them for readability which is why it's useful to have them be whitespace because then you can put them where ever you want.

Also re xml as markup: i think markdown's idea that markup text should be readable as source is good, though honestly I don't know whether that can be made unambiguous.
But anyway didn't XML gently caress up indentation for multiline text? At least some parsers do. Like:

code:
<someTag>
    <someOtherTag>I'm the first line of a multiline string
                  I'm the second line, but the spaces used for indentation will actually be part of the string!
You have to do it like this, i.e. not have indentation.
Btw are you allowed a newline after the opening or before closing tag, or will that become part of the string also?
    </someOtherTag>
</someTag>
I've been bitten by that.
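
For what it's worth, an XML parser hands every character of element content to the application, so both the indentation and the newline before the closing tag end up in the string. A minimal Python sketch using the standard library's `xml.etree.ElementTree`; the tag names follow the example above, and cleaning up with `inspect.cleandoc` is just one possible application-side workaround, not something XML itself defines:

```python
import inspect
import xml.etree.ElementTree as ET

doc = (
    "<someTag>\n"
    "    <someOtherTag>first line\n"
    "                  second line\n"
    "    </someOtherTag>\n"
    "</someTag>"
)

root = ET.fromstring(doc)
raw = root.find("someOtherTag").text

# The parser keeps the indentation and the newline before the closing tag.
print(repr(raw))

# Application-side workaround: strip the common indentation and the
# trailing whitespace-only line.
cleaned = inspect.cleandoc(raw)
print(repr(cleaned))
```

So the answer to the newline question seems to be yes: a newline right after the opening tag or right before the closing tag is part of the text unless the application trims it.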

Ocean of Milk fucked around with this message at 11:37 on Mar 31, 2024

Soricidus
Oct 21, 2010
freedom-hating statist shill

Sapozhnik posted:

the big problem with XML is the M. most people don't use it to mark up text, which is the main thing that it was designed for. for that purpose it is mostly fine. although you also have an entire syntax in the form of doctype declarations which afaik is non-optional for anything that wants to call itself an XML parser, and it is way too complicated, almost completely unused in practice, and has been a source of security vulnerabilities due to things like external entity definitions.

what most people want is the simplest human readable and writable textual serialization format for structured data that could possibly work, which JSON is closer to than XML although JSON sucks in its own ways (i.e. it is too simple to be comfortable to write by hand due to its lack of comments, excessive quoting, and lack of support for trailing commas).

json sucks for serialising data because it has a syntax for numbers but no defined semantics, so you can extremely easily end up silently losing data if (for example) you take the perfectly valid json output from the python standard library and use the perfectly standards-compliant parser and serialiser in jq to pretty-print it.

in other words, it is every bit as bad as xml in its lack of adequate support for common data types.
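
A small Python sketch of the kind of silent loss described above. Python's `json` module keeps integers exact, so the second parse passes `parse_int=float` as a stand-in for any parser that stores every number as an IEEE 754 double; that stand-in is an assumption for illustration, not jq's actual code:

```python
import json

big = 2**53 + 1  # 9007199254740993, not representable as a double
encoded = json.dumps({"id": big})

# Python's json module parses integers as arbitrary-precision ints,
# so this round trip is exact.
exact = json.loads(encoded)["id"]

# Stand-in for a double-based parser: read every integer as a float.
lossy = int(json.loads(encoded, parse_int=float)["id"])

print(encoded)       # {"id": 9007199254740993}
print(exact == big)  # True
print(lossy)         # 9007199254740992
```

Both parsers accept the same text without complaint; the value just comes out different on the other side.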

prisoner of waffles
May 8, 2007

Ah! well a-day! what evil looks
Had I from old and young!
Instead of the cross, the fishmech
About my neck was hung.

I agree with you and I think the key insight here is that "XML is/can be used as a markup language for arbitrarily complex structured documents, but is often used as just a data serialization/deserialization format; for that purpose JSON is less of a PITA"

prisoner of waffles
May 8, 2007

Ah! well a-day! what evil looks
Had I from old and young!
Instead of the cross, the fishmech
About my neck was hung.
just read more about YAML

kombucha-girl-no.png: arbitrary code exec in Python, lmao
kombucha-girl-maybe.png: IDs for nodes, links to identified nodes

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

Ocean of Milk posted:

Also re xml as markup: i think markdown's idea that markup text should be readable as source is good, though honestly I don't know whether that can be made unambiguous.

markdown is also clownshoes. asciidoc predates it and is better in every way other than adoption (its also supported by github though)

12 rats tied together
Sep 7, 2006

prisoner of waffles posted:

just read more about YAML

kombucha-girl-no.png: arbitrary code exec in Python, lmao
kombucha-girl-maybe.png: IDs for nodes, links to identified nodes

vince-falling-over-backwards.gif: all nodes have a type. ability to define custom types. schemas for placing constraints on nodes and their type.

Grum
May 7, 2007

leper khan posted:

markdown is also clownshoes. asciidoc predates it and is better in every way other than adoption (its also supported by github though)

markdown has a cooler name

Clark Nova
Jul 18, 2004

redleader posted:

hmm. to tie 12 yamls together, you would need to prefix each one with three dashes, and append three dots after each one, so you know when each document is complete. no other markup lang has this feature, for some reason

so in order to concatenate yaml documents you have to tap out an SOS in morse code? makes sense

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
asciidoc’s made the mistake of saying that it’s better syntax for docbook xml which makes it sound horribly complicated. looks like they’ve recently finally made the web page emphasize that it’s simple

Sapozhnik
Jan 2, 2005

Nap Ghost

Ocean of Milk posted:

I'm pissed off that nobody learned from clojure/edn on this front: Commas should be whitespace. In 99% of cases where they are mandatory (method/function params, sql IN clauses, JSON arrays...), they are (or could be made) redundant as usually the question of when one element ends and the next one starts is also deduced from some other attribute, like delimiters, whitespace, the order or the content of things or other syntactic elements. But we humans may want to use them for readability which is why it's useful to have them be whitespace because then you can put them where ever you want.

Soricidus posted:

json sucks for serialising data because it has a syntax for numbers but no defined semantics, so you can extremely easily end up silently losing data if (for example) you take the perfectly valid json output from the python standard library and use the perfectly standards-compliant parser and serialiser in jq to pretty-print it.

in other words, it is every bit as bad as xml in its lack of adequate support for common data types.

yeah i mean we can bikeshed the exact characters used in the syntax but my thoughts are something along those lines as well. if i were to recreate something akin to json from scratch i'd actually probably simplify it even further. add comments and take away almost everything else that can possibly be taken away.

crush the data model down to the bare minimum: a value is a string, an array of values, or a map of strings to values. strings are always quoted and must be quoted using double quotes only along with a reasonably conservative set of escape sequences, array items have no delimiters at all other than whitespace, and map entries must end with semicolons or commas or whatever so effectively you have mandatory "trailing commas". alternatively allow both semicolons and LF characters to be used as map or array item delimiters. the syntax would support comments, starting with a single character like idk # and ending with a LF. maybe allow unquoted map keys as well, as a concession to hand-written content.

"numbers" come in way too many shapes and sizes (decimal and floating-point fractions, bit width, signedness, base) so punt on that to higher level validator and/or deserializer libraries and encode them as strings. kill nulls because having to distinguish between null values and missing values makes lossless round-tripping more complicated. booleans can be strings too; the only reason why numbers and null and booleans are special in JSON in the first place is because it inherited all of javascript's literal value syntax.
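
A rough Python sketch of a parser for a format along these lines, to show how small it can be. The post doesn't pin down the concrete characters, so everything here is an assumption: `[ ]` for arrays, `{ }` for maps, `:` between key and value, `;` after every map entry, `#` comments, and only the escapes `\"`, `\\`, `\n`, `\t`. The function names are made up, and unquoted keys and LF delimiters are left out:

```python
import re

# Whitespace, comments, quoted strings, and punctuation (all assumed syntax).
TOKEN = re.compile(r'\s+|#[^\n]*|"(?:[^"\\]|\\["\\nt])*"|[\[\]{}:;]')

def tokenize(src):
    pos, out = 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tok = m.group()
        if not tok[0].isspace() and tok[0] != "#":
            out.append(tok)  # drop whitespace and comments
        pos = m.end()
    return out

def unquote(tok):
    # Strip the surrounding quotes and resolve the four escapes.
    return re.sub(r'\\(["\\nt])',
                  lambda m: {"n": "\n", "t": "\t"}.get(m.group(1), m.group(1)),
                  tok[1:-1])

def parse_value(toks, i):
    tok = toks[i]
    if tok.startswith('"'):
        return unquote(tok), i + 1
    if tok == "[":
        items, i = [], i + 1
        while toks[i] != "]":  # array items: whitespace is the only delimiter
            item, i = parse_value(toks, i)
            items.append(item)
        return items, i + 1
    if tok == "{":
        entries, i = {}, i + 1
        while toks[i] != "}":
            key, i = parse_value(toks, i)
            if not isinstance(key, str) or toks[i] != ":":
                raise ValueError('expected "key": value')
            value, i = parse_value(toks, i + 1)
            if toks[i] != ";":  # mandatory "trailing comma"
                raise ValueError("map entries must end with ';'")
            entries[key] = value
            i += 1
        return entries, i + 1
    raise ValueError(f"unexpected token {tok!r}")

def parse(src):
    toks = tokenize(src)
    value, i = parse_value(toks, 0)
    if i != len(toks):
        raise ValueError("trailing input after the top-level value")
    return value

sample = '{ "port": "8080"; # numbers stay strings\n "hosts": ["a" "b"]; }'
print(parse(sample))
```

Truncated input raises a bare `IndexError` here; a real implementation would want proper error reporting, and turning `"8080"` into a number is left to whatever validator sits on top.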

Sapozhnik
Jan 2, 2005

Nap Ghost
but then you probably also want some sort of syntax for multi-line strings as well and now you have to start thinking about how that interacts with indentation. so yeah.

Athas
Aug 6, 2007

fuck that joker

Sapozhnik posted:

crush the data model down to the bare minimum: a value is a string, an array of values, or a map of strings to values. strings are always quoted and must be quoted using double quotes only along with a reasonably conservative set of escape sequences, array items have no delimiters at all other than whitespace, and map entries must end with semicolons or commas or whatever so effectively you have mandatory "trailing commas"

This actually doesn't sound so bad; kind of odd that arrays don't need delimiters but maps do, but I guess it's

Sapozhnik posted:

alternatively allow both semicolons and LF characters to be used as map or array item delimiters.

No wait, go back, this is

Sapozhnik posted:

the syntax would support comments, starting with a single character like idk # and ending with a LF

wait go back

Sapozhnik posted:

maybe allow unquoted map keys as well, as a concession to hand-written content.

youdied.jpg

I imagine the original YAML design discussion was pretty much that post continuing on for five pages.

bob dobbs is dead
Oct 8, 2017

I love peeps
Nap Ghost
"syntax was a mistake" is an unironic legitimate plang position

Visions of Valerie
Jun 18, 2023

Come this autumn, we'll be miles away...
semantics was a mistake

computers are trash

matti
Mar 31, 2019

yeah think from now on I'll make a point to frame it as a markdown/asciidoc replacement for when you need a lot of escape hatches instead of an abbreviated form of xml

matti
Mar 31, 2019

not that i'm planning to do the other two or more 95%s of work to make the language useful to anyone but myself after implementing it

abraham linksys
Sep 6, 2010

:darksouls:
personally if i were to write my own config format id just do file.split('\n').filter((line) => !line.startsWith('#')).map((line) => line.split('=').map((token) => token.trim())).reduce((acc, [k, v]) => { return {...acc, [k]: v} }, {}) and be done with it

abraham linksys fucked around with this message at 17:46 on Mar 31, 2024

Sagacity
May 2, 2003
Hopefully my epitaph will be funnier than my custom title.
if you just want strings then sure

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

they’re just strings once you serialize them. let the strings carry type information themselves, or let the schema specify it, as you prefer

raminasi
Jan 25, 2005

a last drink with no ice
i learned the other day that yaml lists sometimes have two possible indentation levels that are semantically identical. i'm half embarrassed for having not known that and half absolutely not embarrassed for having not known that.

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



raminasi posted:

i learned the other day that yaml lists sometimes have two possible indentation levels that are semantically identical. i'm half embarrassed for having not known that and half absolutely not embarrassed for having not known that.

im embarrassed that it is

12 rats tied together
Sep 7, 2006

what do you mean sometimes? sequences like lists use indentation for scope. you can always indent however many times you feel like in YAML, the only rule is that things that share scope (items in the same list) need to be at the same indentation level. you can indent one, two, five hundred spaces, it doesn't matter because indentation is a presentation detail only

Visions of Valerie
Jun 18, 2023

Come this autumn, we'll be miles away...

12 rats tied together posted:

what do you mean sometimes? sequences like lists use indentation for scope. you can always indent however many times you feel like in YAML, the only rule is that things that share scope (items in the same list) need to be at the same indentation level. you can indent one, two, five hundred spaces, it doesn't matter because indentation is a presentation detail only

code:
- it:
    - absolutely
- isn't
if relative indentation matters, it's not merely a presentation detail

bob dobbs is dead
Oct 8, 2017

I love peeps
Nap Ghost

bob dobbs is dead posted:

"syntax was a mistake" is my unironic position with respect to yaml
