Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
Is this going to be a thing where we identify the open source software library this billion-dollar games company was using for free, and shame the author because it has a bug that affected the billion-dollar game, which they didn't get any money from


Tei
Feb 19, 2011

Less Fat Luke posted:

Sales in ecommerce have a higher churn the slower a site is for loading, search and the checkout flow - I know what you're saying but there's no way that I'm the only person that gave up on GTA Online because of the loading times.

I remember an article about how free-to-play games are less inclined to spend time optimizing for low-end machines, because that's not where the whales are. But maybe I'm misremembering it.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

necrotic posted:

I have a 5950x and it still takes 5 minutes to get into GTAO.

The op says that some people report much lower load times, so there's obviously something that makes a difference.

It is actually possible for simple functions like strlen to vary a lot in performance by processor and OS release, especially if the library tries to dynamically detect whether it can / is profitable to use a specialized implementation. x86 has several different string-processing instructions with different processor availability and performance characteristics. Now, Agner says that they're all currently worse than a loop for strlen specifically, but Agner doesn't write the Windows C library.

Volguus
Mar 3, 2009

repiv posted:

GTA:Online isn't just something users buy upfront, a significant part of its revenue (on the order of billions of dollars) comes from ongoing microtransactions

Players who bought the game but then bounced off the online mode because they got sick of loading are lost sources of MTX revenue. This bug easily cost R* millions of dollars.

Oh, I had no idea that's a microtransactions game (I played GTA 3 and 4 I think, no clue what came after that). Then yes, you're right, it probably did cost them millions of dollars. Oh well. They deserve it.


Xerophyte posted:

To be specific: RapidJSON and simdjson do not have that problem and they are, to my knowledge, the most common C++ json libraries right now. They weren't the most common in 2013, and simdjson which is currently the fastest by lots did not exist. There sure are a lot of crappy json parsers out there.

Oh interesting. I've preferred nlohmann json for quite some time now since it's header-only, has a decent enough API and, according to those benchmarks, sits around the middle of the pack there. Surely they aren't even using that.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

rjmccall posted:

The op says that some people report much lower load times, so there's obviously something that makes a difference.

It is actually possible for simple functions like strlen to vary a lot in performance by processor and OS release, especially if the library tries to dynamically detect whether it can / is profitable to use a specialized implementation. x86 has several different string-processing instructions with different processor availability and performance characteristics. Now, Agner says that they're all currently worse than a loop for strlen specifically, but Agner doesn't write the Windows C library.

Perhaps the microtransaction store that the JSON is being parsed for is different for different users (e.g. doesn't have a bunch of loot box items in countries where they can't sell loot boxes), so it's just less of an issue for those users specifically.

Foxfire_
Nov 8, 2010

Xerophyte posted:

I wonder if there's an actual json library that awful or if they just rolled their own crappy parser. My heart wants me to believe it's the latter, but the former would absolutely not surprise me.

If I were writing a JSON parser, making it performant when parsing a 10MB string would not be high on my list of cases to optimize for.

xtal
Jan 9, 2011

by Fluffdaddy

Foxfire_ posted:

If I were writing a JSON parser, making it performant when parsing a 10MB string would not be high on my list of cases to optimize for.

That doesn't seem like all that much, but I've heard of just sending a sqlite database instead, and it was like half the size

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Jabor posted:

Perhaps the microtransaction store that the JSON is being parsed for is different for different users (e.g. doesn't have a bunch of loot box items in countries where they can't sell loot boxes), so it's just less of an issue for those users specifically.

That could be, sure.

FlapYoJacks
Feb 12, 2009

Volguus posted:

Oh interesting. I've preferred nlohmann json for quite some time now since it's header-only, has a decent enough API and, according to those benchmarks, sits around the middle of the pack there. Surely they aren't even using that.

Same here. It's a good library overall.

Xerophyte
Mar 17, 2008

This space intentionally left blank

Foxfire_ posted:

If I were writing a JSON parser, making it performant when parsing a 10MB string would not be high on my list of cases to optimize for.

Now I'm curious what you think is more important. I get not wanting to deal with out of core performance but 10 MB is a moderately sized file. These are common payloads in anything that's using JSON for actual data interchange. The test I linked uses among others an RFC 7946 file for the border of a single country at moderate resolution which clocks in at 2 MB, for instance. You can argue that geometry data and the like shouldn't be stored or sent as JSON -- you could in fact argue that nothing should be stored or sent as JSON and I'd probably agree -- but, well, it is. Frequently.

Not having O(n²) behavior in input size and handling a few MB of input in a parser for a popular data-agnostic human-readable interchange format is hardly some weird niche thing.


Also, no, I certainly don't think a company using a library with a poor implementation for their use case shifts the responsibility for the results. It'd just be a little more sadness in the world if someone set out to make a JSON library and still ended up with that.

Foxfire_
Nov 8, 2010

I'd expect that most JSON would be short and optimize for that + simplicity for fewer bugs since I wouldn't expect JSON-parsing to be generally bottlenecking vs IO. That might be a poor choice for a general-purpose library, depending on how it's actually used

Also, the thing that's bad is actually the C runtime code, not the JSON library. sscanf() [parse some tokens from the start of a string] is calling strlen() [how long is this entire string] and it doesn't really have any reason to. The standards don't promise any particular complexity for either, but it's turning a function that ought to be proportional to the size of the tokens being matched into something proportional to the size of the entire string. My guess for why it's not slow on some systems is that they're using different runtime versions.

You would think sscanf(someString, "%d,", &variable) would only be examining the string up to either the first , or the end-of-string, not that it touches every single byte always

Foxfire_ fucked around with this message at 02:00 on Mar 2, 2021

Dross
Sep 26, 2006

Every night he puts his hot dogs in the trees so the pigeons can't get them.

Even if the parser wasn’t an issue there’s still the part where they’re iterating a 63000 item array 63000 times when they could just use a hash map instead (or cut that routine entirely)

Khorne
May 1, 2002

Dross posted:

Even if the parser wasn’t an issue there’s still the part where they’re iterating a 63000 item array 63000 times when they could just use a hash map instead (or cut that routine entirely)
Excuse me, the parser was designed for 100 or fewer items where a hashmap would be a performance loss. It's not my responsibility as a developer to fix something that's working correctly.

Here's a link to the feature's story. Take it up with the product team who spec'd the feature and have them create a story so we can work it into the backlog.

Khorne fucked around with this message at 02:25 on Mar 2, 2021

Foxfire_
Nov 8, 2010

You say that, but it would also not surprise me if the person who originally wrote it was not anticipating the 'possible microtransactions' list to be 63,000 things long. I feel bad about condemning them for that

Khorne
May 1, 2002

Foxfire_ posted:

You say that, but it would also not surprise me if the person who originally wrote it was not anticipating the 'possible microtransactions' list to be 63,000 things long. I feel bad about condemning them for that
I'd bet money they just reused some hacked together json code from another one of their titles or used some mit licensed "lightweight json parsing" header file that checked out at first glance and used catchy buzzwords like "single allocation" and "in memory".

Khorne fucked around with this message at 02:43 on Mar 2, 2021

Xerophyte
Mar 17, 2008

This space intentionally left blank
I would be surprised if the game did not package the runtime it was compiled for with the install so it's unlikely to be a cause of user variance.

Anyhow, my understanding after some googling is that the sscanf "bug" is present in the BSD, MSVC, GNU, and musl C libraries, all of which implement sscanf in terms of FILE objects and vfscanf. The strlen call is used to build the fake FILE object that then gets passed to vfscanf. I'll agree that it isn't a great implementation, but it's apparently a common strategy in standard libraries to eliminate duplication in their assorted formatting code. sscanf wasn't really designed for this use case, or really any use case where the input is a large string of unknown format. One more reason on the pile of reasons why C's string handling hurts.

People like to say that this-or-that is IO bound, but actually being IO bound when reading anything you need to process from an SSD is legitimately difficult. You're not going to get close if you have to do anything remotely complex, for instance parsing floats from text. The simdjson guy gave a somewhat basic presentation on what they had to do to actually hit the limit back in The Beforetimes; one takeaway is that they have a budget of about 1.5 CPU cycles/byte of text:
https://www.youtube.com/watch?v=wlvKAT7SZIQ
They're still CPU bound on anything involving floats, because as mentioned parsing floats from text really sucks.

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003
gta online for ps4 takes absolutely forever to load

more falafel please
Feb 26, 2005

forums poster

What JSON parsing library is using sscanf? That seems like a terrible idea for a vaguely EBNF-style language. The whole thing smacks of hand rolled.

Foxfire_
Nov 8, 2010

more falafel please posted:

What JSON parsing library is using sscanf? That seems like a terrible idea for a vaguely EBNF-style language. The whole thing smacks of hand rolled.

Apparently real-world atof() and strtod() implementations call sscanf(), which calls strlen(). The RapidYAML issue someone linked earlier is the same thing: once they figured out the next thing in the input string was a float, they called atof() on it, which goes badly if there's a lot of trailing string. The nlohmann json library other people in this thread liked does basically the same thing (but with strtof()), except it happens to copy the content being parsed into a short temporary first, so it doesn't explode. RapidJSON rolled their own strtod() implementation instead of using a stdlib one, so they don't have the problem at all.

Tei
Feb 19, 2011

I refuse to believe 10MB of data is only data. Somebody must have decided to store images in there as base64.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Tei posted:

I refuse to believe 10MB of data is only data. Somebody must have decided to store images in there as base64.

You're selling hats with every possible combination of 17 variables each with 50 possible values. You didn't bother compressing, deduplicating, or filtering irrelevant data because hey, it works fine today, we've only offered two hats so far, how many hats could we possibly need in the store at once. What's that, there's four different ways to sort the store and it's too slow to reorder them on the fly in the game? No worries, we'll just ship down the whole list four times in the right order. Oh right there's a search bar, better add keywords to each hat. An index should make it even faster, let's include that. Yes, for each of the four orderings, we're trading space for time here already what's another couple of bytes.

Qwertycoatl
Dec 31, 2008

With 63000(!) microtransactions in the JSON, that's only about 160 bytes per item which is pretty reasonable

Tei
Feb 19, 2011

Qwertycoatl posted:

With 63000(!) microtransactions in the JSON, that's only about 160 bytes per item which is pretty reasonable

it seems I was wrong... oh god

Presto
Nov 22, 2002

Keep calm and Harry on.
Yeah, there's an astonishing number of clothing items, accessories, guns, cars, planes, boats, casino penthouse artwork, etc etc that you can buy.

Also I don't think the json thing is the whole story. I play GTA:O and sometimes it loads fairly quickly and sometimes it takes so long that I give up. I guess there's some network synchronization going on and if there's one person in the session with tin-can-and-string Internet it drags everyone else down.

Either that or some modder has screwed up the session and made it unplayable, which is a thing that happens a lot.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
Having a ton of transactions is pretty common in that kind of game. It’s usually not lootboxes (and I’ve never played GTAO and have no idea if it even has them), it’s just however many years of accumulated hats. I don’t think Path of Exile is anyone’s idea of a lootbox game, and its shop is ridiculously big after so many years of added content.

Volmarias
Dec 31, 2002

EMAIL... THE INTERNET... SEARCH ENGINES...

Foxfire_ posted:

You say that, but it would also not surprise me if the person who originally wrote it was not anticipating the 'possible microtransactions' list to be 63,000 things long. I feel bad about condemning them for that

This is my read on it. The original implementation ran so fast that they never thought about it, so there was no reason to profile it for lurking bugs like this (I wouldn't have thought sscanf would need the entire length of the string, let alone so it could be presented as a FILE (?!!!)) or to write any kind of regression tests (which likely wouldn't have alerted anyone anyway).

At a certain point, I think the most reasonable question is "how many goddamn hats do you really need"

HappyHippo
Nov 19, 2003
Do you have an Air Miles Card?
I think I read somewhere that O(n^2) is often a killer because it seems fine on the small/medium tests but chokes when it has to handle something serious.

There were probably far fewer transactions when they were implementing that.

Vanadium
Jan 8, 2005

I'd like to submit for your consideration: https://twitter.com/zygoloid/status/1366917418354761728

code:
#include <stdio.h>
void f(float&&) { puts("float"); }
void f(int&&) { puts("int"); }
void g(auto &&...v) { (f(v), ...); }
int main() { g(1.0f, 2); }

Kazinsal
Dec 13, 2011


My money is on "g++ will accept this and do the stupid option and other compilers will tell you to gently caress off"

Foxfire_
Nov 8, 2010

Fails code review

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
Bigger scandal imo is that you don't have to return from main when declaring it as returning an int. What is the point of that?

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
That is hilarious, it does exactly the wrong thing because of the reference-binding rules.

Kazinsal
Dec 13, 2011


It's like the person who wrote the code generator for that got the argument order on the stack backwards and assumed it was left to right instead of right to left, but somehow managed to accomplish this on x64 where cdecl is register based?

Soricidus
Oct 21, 2010
freedom-hating statist shill

rjmccall posted:

That is hilarious, it does exactly the wrong thing because of the reference-binding rules.

*sigh* I feared this day would come. We cannot put it off any longer.

It is time for &&&

Lime
Jul 20, 2004

I wouldn't say the horror is the reference-binding rules, because binding an lvalue to an rvalue reference should fail. The real horror is just good old implicit conversions, and in particular the especially flexible rules that produce int-to-float or float-to-int prvalues.

Lime fucked around with this message at 08:59 on Mar 3, 2021

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
It is an excellent example of a large number of well-intentioned rules working together to produce bad results. But I'll stand by what I said: the rules about passing && / const & references to fundamental types are so effectively lax in practice that disallowing the exact-type "conversion" here just ends up feeling like a bizarre exception with bizarre implications. It would not be unreasonable to only apply that restriction to non-fundamental types, or even non-trivial types.

Only allowing implicit conversions to do promotions would generally be a nice improvement, but in this case specifically we'd still end up calling the float overload when starting with an int argument.

fritz
Jul 26, 2003

It won't compile for me with clang++ on osx, which I consider a blessing.

Xerophyte
Mar 17, 2008

This space intentionally left blank
Ok, I found this real confusing. Stripping the C++20 I get
C++ code:
#include <stdio.h>
void f(float&&) { puts("float"); }
void f(int&&) { puts("int"); }
template<typename T> void g(T&& v) { f(v); }
int main() { g(1.0f); g(2); }
which fails in the same way.

With no optimization code generation emits stuff like (clang 11)
code:
void g<int>(int&&):                            # @void g<int>(int&&)
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     qword ptr [rbp - 8], rdi
        mov     rax, qword ptr [rbp - 8]
        cvtsi2ss        xmm0, dword ptr [rax]
        movss   dword ptr [rbp - 12], xmm0
        lea     rdi, [rbp - 12]
        call    f(float&&)
        add     rsp, 16
        pop     rbp
        ret
unless I move or forward in g.

v is an lvalue when used in the expression inside g which means that calling the "matching" rvalue reference function is not allowed according to the reference binding rules, and that seems ok.

Is the thing that if it were to be implicitly converted to the opposite integral/floating type then the converted type would be materialized as a bindable prvalue, and since implicit conversions can happen when needed that ends up being the only valid thing the compiler can do?

CPColin
Sep 9, 2003

Big ol' smile.
I'm so happy I never have any idea what's going on in the C++ horrors.


Blue Footed Booby
Oct 4, 2006

got those happy feet

CPColin posted:

I'm so happy I never have any idea what's going on in the C++ horrors.

I'm sure it's like all languages in that you can take a sane subset of the language features and have something reasonable, but holy poo poo does it have some byzantine footguns. It's the closest I've seen to fantasy novels where wizards toy with forces they don't fully understand and end up eaten by demons or turned into owlbears.
