Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
Is this going to be a thing where we identify the open source software library this billion-dollar games company was using for free, and shame the author because it has a bug that affected the billion-dollar game, which they didn't get any money from


Tei
Feb 19, 2011

Less Fat Luke posted:

Sales in ecommerce have a higher churn the slower a site is for loading, search and the checkout flow - I know what you're saying but there's no way that I'm the only person that gave up on GTA Online because of the loading times.

I remember an article about how free-to-play games are less inclined to spend time optimizing for low-end machines, because that's not where the whales are. But maybe I'm misremembering it.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

necrotic posted:

I have a 5950x and it still takes 5 minutes to get into GTAO.

The op says that some people report much lower load times, so there's obviously something that makes a difference.

It is actually possible for simple functions like strlen to vary a lot in performance by processor and OS release, especially if the library tries to dynamically detect whether it can / is profitable to use a specialized implementation. x86 has several different string-processing instructions with different processor availability and performance characteristics. Now, Agner says that they're all currently worse than a loop for strlen specifically, but Agner doesn't write the Windows C library.

Volguus
Mar 3, 2009

repiv posted:

GTA:Online isn't just something users buy upfront, a significant part of its revenue (on the order of billions of dollars) comes from ongoing microtransactions

Players who bought the game but then bounced off the online mode because they got sick of loading are lost sources of MTX revenue. This bug easily cost R* millions of dollars.

Oh, I had no idea that's a microtransactions game (I played GTA 3 and 4 I think, no clue what came after that). Then yes, you're right, it probably did cost them millions of dollars. Oh well. They deserve it.


Xerophyte posted:

To be specific: RapidJSON and simdjson do not have that problem and they are, to my knowledge, the most common C++ json libraries right now. They weren't the most common in 2013, and simdjson which is currently the fastest by lots did not exist. There sure are a lot of crappy json parsers out there.

Oh interesting. I've preferred nlohmann json for quite some time now since it's header-only, has a decent enough API and, according to those benchmarks, sits around the middle of the pack there. Surely they aren't even using that.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

rjmccall posted:

The op says that some people report much lower load times, so there's obviously something that makes a difference.

It is actually possible for simple functions like strlen to vary a lot in performance by processor and OS release, especially if the library tries to dynamically detect whether it can / is profitable to use a specialized implementation. x86 has several different string-processing instructions with different processor availability and performance characteristics. Now, Agner says that they're all currently worse than a loop for strlen specifically, but Agner doesn't write the Windows C library.

Perhaps the microtransaction store that the JSON is being parsed for is different for different users (e.g. doesn't have a bunch of loot box items in countries where they can't sell loot boxes), so it's just less of an issue for those users specifically.

Foxfire_
Nov 8, 2010

Xerophyte posted:

I wonder if there's an actual json library that awful or if they just rolled their own crappy parser. My heart wants me to believe it's the latter, but the former would absolutely not surprise me.

If I were writing a JSON parser, making it performant when parsing a 10MB string would not be high on my list of cases to optimize for.

xtal
Jan 9, 2011

by Fluffdaddy

Foxfire_ posted:

If I were writing a JSON parser, making it performant when parsing a 10MB string would not be high on my list of cases to optimize for.

That doesn't seem like all that much, but I've heard of just sending a sqlite database instead, and it was like half the size

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Jabor posted:

Perhaps the microtransaction store that the JSON is being parsed for is different for different users (e.g. doesn't have a bunch of loot box items in countries where they can't sell loot boxes), so it's just less of an issue for those users specifically.

That could be, sure.

FlapYoJacks
Feb 12, 2009

Volguus posted:

Oh interesting. I've preferred nlohmann json for quite some time now since it's header-only, has a decent enough API and, according to those benchmarks, sits around the middle of the pack there. Surely they aren't even using that.

Same here. It's a good library overall.

Xerophyte
Mar 17, 2008

This space intentionally left blank

Foxfire_ posted:

If I were writing a JSON parser, making it performant when parsing a 10MB string would not be high on my list of cases to optimize for.

Now I'm curious what you think is more important. I get not wanting to deal with out of core performance but 10 MB is a moderately sized file. These are common payloads in anything that's using JSON for actual data interchange. The test I linked uses among others an RFC 7946 file for the border of a single country at moderate resolution which clocks in at 2 MB, for instance. You can argue that geometry data and the like shouldn't be stored or sent as JSON -- you could in fact argue that nothing should be stored or sent as JSON and I'd probably agree -- but, well, it is. Frequently.

Not having O(n²) behavior in input size and handling a few MB of input in a parser for a popular data-agnostic human-readable interchange format is hardly some weird niche thing.


Also, no, I certainly don't think a company using a library with a poor implementation for their use case shifts the responsibility for the results. It'd just be a little more sadness in the world if someone set out to make a JSON library and still ended up with that.

Foxfire_
Nov 8, 2010

I'd expect that most JSON would be short and optimize for that + simplicity for fewer bugs since I wouldn't expect JSON-parsing to be generally bottlenecking vs IO. That might be a poor choice for a general-purpose library, depending on how it's actually used

Also, the thing that's bad is actually the C runtime code, not the JSON library. sscanf() [parse some tokens from the start of a string] is calling strlen() [how long is this entire string] and it doesn't really have any reason to. The standards don't promise any particular complexity for either, but it's turning a function that ought to be proportional to the size of the tokens being matched into something proportional to the size of the entire string. My guess for why it's not slow on some systems is that they're using different runtime versions.

You would think sscanf(someString, "%d,", &variable) would only be examining the string up to either the first , or the end-of-string, not that it touches every single byte always

Foxfire_ fucked around with this message at 02:00 on Mar 2, 2021

Dross
Sep 26, 2006

Every night he puts his hot dogs in the trees so the pigeons can't get them.

Even if the parser wasn’t an issue there’s still the part where they’re iterating a 63000 item array 63000 times when they could just use a hash map instead (or cut that routine entirely)

Khorne
May 1, 2002

Dross posted:

Even if the parser wasn’t an issue there’s still the part where they’re iterating a 63000 item array 63000 times when they could just use a hash map instead (or cut that routine entirely)
Excuse me, the parser was designed for 100 or fewer items where a hashmap would be a performance loss. It's not my responsibility as a developer to fix something that's working correctly.

Here's a link to the feature's story. Take it up with the product team who spec'd the feature and have them create a story so we can work it into the backlog.

Khorne fucked around with this message at 02:25 on Mar 2, 2021

Foxfire_
Nov 8, 2010

You say that, but it would also not surprise me if the person who originally wrote it was not anticipating the 'possible microtransactions' list to be 63,000 things long. I feel bad about condemning them for that

Khorne
May 1, 2002

Foxfire_ posted:

You say that, but it would also not surprise me if the person who originally wrote it was not anticipating the 'possible microtransactions' list to be 63,000 things long. I feel bad about condemning them for that
I'd bet money they just reused some hacked together json code from another one of their titles or used some mit licensed "lightweight json parsing" header file that checked out at first glance and used catchy buzzwords like "single allocation" and "in memory".

Khorne fucked around with this message at 02:43 on Mar 2, 2021

Xerophyte
Mar 17, 2008

This space intentionally left blank
I would be surprised if the game did not package the runtime it was compiled for with the install so it's unlikely to be a cause of user variance.

Anyhow, my understanding after some googling is that the sscanf "bug" is present in the BSD, MSVC, GNU, and musl C libraries, all of which implement sscanf in terms of FILE objects and vfscanf. The strlen call is used to build the fake FILE object that then gets passed to vfscanf. I'll agree that it isn't a great implementation, but it's apparently a common strategy in standard libraries to eliminate duplication in their assorted formatting code. sscanf wasn't really designed for this use case, or really any use case where the input is a large string of unknown format. One more reason on the pile of reasons why C's string handling hurts.

People like to say that this-or-that is IO bound, but actually being IO bound when reading anything you need to process from an SSD is legitimately difficult. You're not going to get close if you have to do anything remotely complex, for instance parsing floats from text. The simdjson guy gave a somewhat basic presentation on what they had to do to actually hit the limit back in The Beforetimes; one takeaway is that they have a budget of about 1.5 CPU cycles/byte of text:
https://www.youtube.com/watch?v=wlvKAT7SZIQ
They're still CPU bound on anything involving floats, because as mentioned parsing floats from text really sucks.

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003
gta online for ps4 takes absolutely forever to load

more falafel please
Feb 26, 2005

forums poster

What JSON parsing library is using sscanf? That seems like a terrible idea for a vaguely EBNF-style language. The whole thing smacks of hand rolled.

Foxfire_
Nov 8, 2010

more falafel please posted:

What JSON parsing library is using sscanf? That seems like a terrible idea for a vaguely EBNF-style language. The whole thing smacks of hand rolled.

Apparently real-world atof() and strtod() implementations call sscanf(), which calls strlen(). The RapidYAML issue someone linked earlier is the same thing: once they figured out the next thing in the input string was a float, they called atof() on it, which goes badly if there's a lot of trailing string. The nlohmann json library other people in this thread liked does basically the same thing (but with strtof()), except it happens to copy the content being parsed into a short temporary first, so it doesn't explode. RapidJSON rolled their own strtod() implementation instead of using a stdlib one, so they don't have the problem at all.

Tei
Feb 19, 2011

I refuse to believe 10MB of data is only data. Somebody must have decided to store images in there as base64.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Tei posted:

I refuse to believe 10MB of data is only data. Somebody must have decided to store images in there as base64.

You're selling hats with every possible combination of 17 variables each with 50 possible values. You didn't bother compressing, deduplicating, or filtering irrelevant data because hey, it works fine today, we've only offered two hats so far, how many hats could we possibly need in the store at once. What's that, there's four different ways to sort the store and it's too slow to reorder them on the fly in the game? No worries, we'll just ship down the whole list four times in the right order. Oh right there's a search bar, better add keywords to each hat. An index should make it even faster, let's include that. Yes, for each of the four orderings, we're trading space for time here already what's another couple of bytes.

Qwertycoatl
Dec 31, 2008

With 63000(!) microtransactions in the JSON, that's only about 160 bytes per item which is pretty reasonable

Tei
Feb 19, 2011

Qwertycoatl posted:

With 63000(!) microtransactions in the JSON, that's only about 160 bytes per item which is pretty reasonable

it seems I was wrong... oh god

Presto
Nov 22, 2002

Keep calm and Harry on.
Yeah, there's an astonishing number of clothing items, accessories, guns, cars, planes, boats, casino penthouse artwork, etc etc that you can buy.

Also I don't think the json thing is the whole story. I play GTA:O and sometimes it loads fairly quickly and sometimes it takes so long that I give up. I guess there's some network synchronization going on and if there's one person in the session with tin-can-and-string Internet it drags everyone else down.

Either that or some modder has screwed up the session and made it unplayable, which is a thing that happens a lot.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
Having a ton of transactions is pretty common in that kind of game. It’s usually not lootboxes (and I’ve never played GTAO and have no idea if it even has them), it’s just however many years of accumulated hats. I don’t think Path of Exile is anyone’s idea of a lootbox game, and its shop is ridiculously big after so many years of added content.

Volmarias
Dec 31, 2002

EMAIL... THE INTERNET... SEARCH ENGINES...

Foxfire_ posted:

You say that, but it would also not surprise me if the person who originally wrote it was not anticipating the 'possible microtransactions' list to be 63,000 things long. I feel bad about condemning them for that

This is my read on it. The original implementation ran so fast that they never thought about it, so there was no reason to profile it for lurking bugs like this (I wouldn't have thought sscanf would need the entire length of the string, let alone so it could be presented as a FILE (?!!!)) or to write any kind of regression tests (which likely wouldn't have alerted anyone anyway).

At a certain point, I think the most reasonable question is "how many goddamn hats do you really need"

HappyHippo
Nov 19, 2003
Do you have an Air Miles Card?
I think I read somewhere that O(n^2) is often a killer because it seems fine on the small/medium tests but chokes when it has to handle something serious.

There were probably far fewer transactions when they were implementing that.

Vanadium
Jan 8, 2005

I'd like to submit for your consideration: https://twitter.com/zygoloid/status/1366917418354761728

code:
#include <stdio.h>
void f(float&&) { puts("float"); }
void f(int&&) { puts("int"); }
void g(auto &&...v) { (f(v), ...); }
int main() { g(1.0f, 2); }

Kazinsal
Dec 13, 2011


My money is on "g++ will accept this and do the stupid option and other compilers will tell you to gently caress off"

Foxfire_
Nov 8, 2010

Fails code review

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
Bigger scandal imo is that you don't have to return from main when declaring it as returning an int. What is the point of that?

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
That is hilarious, it does exactly the wrong thing because of the reference-binding rules.

Kazinsal
Dec 13, 2011


It's like the person who wrote the code generator for that got the argument order on the stack backwards and assumed it was left to right instead of right to left, but somehow managed to accomplish this on x64 where cdecl is register based?

Soricidus
Oct 21, 2010
freedom-hating statist shill

rjmccall posted:

That is hilarious, it does exactly the wrong thing because of the reference-binding rules.

*sigh* I feared this day would come. We cannot put it off any longer.

It is time for &&&

Lime
Jul 20, 2004

I wouldn't say the horror is the reference-binding rules, because binding an lvalue to an rvalue reference should fail. The real horror is just good old implicit conversions, and in particular the especially flexible rules that produce int-to-float or float-to-int prvalues.

Lime fucked around with this message at 08:59 on Mar 3, 2021

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
It is an excellent example of a large number of well-intentioned rules working together to produce bad results. But I'll stand by what I said: the rules about passing && / const & references to fundamental types are so effectively lax in practice that disallowing the exact-type "conversion" here just ends up feeling like a bizarre exception with bizarre implications. It would not be unreasonable to only apply that restriction to non-fundamental types, or even non-trivial types.

Only allowing implicit conversions to do promotions would generally be a nice improvement, but in this case specifically we'd still end up calling the float overload when starting with an int argument.

fritz
Jul 26, 2003

It won't compile for me with clang++ on osx, which I consider a blessing.

Xerophyte
Mar 17, 2008

This space intentionally left blank
Ok, I found this real confusing. Stripping the C++20 I get
C++ code:
#include <stdio.h>
void f(float&&) { puts("float"); }
void f(int&&) { puts("int"); }
template<typename T> void g(T&& v) { f(v); }
int main() { g(1.0f); g(2); }
which fails in the same way.

With no optimization code generation emits stuff like (clang 11)
code:
void g<int>(int&&):                            # @void g<int>(int&&)
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     qword ptr [rbp - 8], rdi
        mov     rax, qword ptr [rbp - 8]
        cvtsi2ss        xmm0, dword ptr [rax]
        movss   dword ptr [rbp - 12], xmm0
        lea     rdi, [rbp - 12]
        call    f(float&&)
        add     rsp, 16
        pop     rbp
        ret
unless I move or forward in g.

v is an lvalue when used in the expression inside g which means that calling the "matching" rvalue reference function is not allowed according to the reference binding rules, and that seems ok.

Is the thing that if it were to be implicitly converted to the opposite integral/floating type then the converted type would be materialized as a bindable prvalue, and since implicit conversions can happen when needed that ends up being the only valid thing the compiler can do?

CPColin
Sep 9, 2003

Big ol' smile.
I'm so happy I never have any idea what's going on in the C++ horrors.


Blue Footed Booby
Oct 4, 2006

got those happy feet

CPColin posted:

I'm so happy I never have any idea what's going on in the C++ horrors.

I'm sure it's like all languages in that you can take a sane subset of the language features and have something reasonable, but holy poo poo does it have some byzantine footguns. It's the closest I've seen to fantasy novels where wizards toy with forces they don't fully understand and end up eaten by demons or turned into owlbears.
