leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

https://www.digitalmars.com/

https://www.ibm.com/products/c-and-c-plus-plus-compiler-family

https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler.html#gs.8q8ktz

Foxfire_
Nov 8, 2010

Xarn
Jun 26, 2015

No NVCC or LCC, 2/10

ExcessBLarg!
Sep 1, 2001

Plorkyeran posted:

What if I told you that you can compile c++ without having gcc or clang?
Well then I wouldn't expect xxd to be available. Or make for that matter.

feedmegin
Jul 30, 2008

ExcessBLarg! posted:

Well then I wouldn't expect xxd to be available. Or make for that matter.

Commercial Unix (with commercial compilers) still exists, man. It also comes with a vi-compatible editor. It's actual vi, though, not vim.

ExcessBLarg!
Sep 1, 2001
Serious answer: the projects where I find myself needing to use xxd from make or something are also projects that effectively require gcc or clang, and I'm not really making an effort to ensure they're portable enough to run on a commercial Unix with a commercial compiler. If I were obligated to target the latter, I'd treat the entire project as a major "compat" effort and charge accordingly.

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.
What are everyone's preferred high-speed non-cryptographically secure PRNGs? I've got a hot loop in an application that is now spending >50% of its time in std::uniform_int_distribution<IntType>::operator() with a std::mt19937_64.

There's a ton of options, but I'd appreciate some nuance from the devs here.

https://github.com/lemire/testingRNG
https://lemire.me/blog/2019/03/19/the-fastest-conventional-random-number-generator-that-can-pass-big-crush/

Computer viking
May 30, 2011
Now with less breakage.

Last time I needed one I read bytes from /dev/random, but I'm sure there are reasons to not do that?

Xerophyte
Mar 17, 2008

This space intentionally left blank
Depends somewhat on your application: how sensitive it is to sample covariance across various slices of the RNG sequence, and the like. I used xorshift64/128 as my go-tos for years, mostly from inertia. They're not good, but they're fast, simple, good enough in a lot of cases (for graphics, at least), and require nearly no hardware features. Xorshifts (and derivatives like xoroshiro) don't pass BigCrush, the low bits aren't great, etc., but the flaws are known and the algorithms are battle-tested. If you're targeting limited hardware and know you don't need to worry too much about low-bit quality, they can still be a good choice.

Lately I've been using PCG hashes and corresponding RNGs. They're considerably higher quality than xorshift and relatively well-tested in graphics for how new they are. They're the default RNGs in pbrt, parts of Unreal (as hashes for procedurals, I think?), and a bunch of other renderers.
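
If anyone wants the shape of it, here's a minimal 32-bit PCG (PCG-XSH-RR) along the lines of O'Neill's minimal C version. Treat it as a sketch; the seed/stream constants below are just the reference example values, not anything blessed.

C++ code:
#include <cstdint>

// Minimal PCG32 (PCG-XSH-RR): a 64-bit LCG step plus a permuted 32-bit output.
struct Pcg32 {
    std::uint64_t state = 0x853c49e6748fea9bULL;  // example seed
    std::uint64_t inc   = 0xda3e39cb94b95bdbULL;  // stream selector; must be odd

    std::uint32_t next() {
        std::uint64_t old = state;
        state = old * 6364136223846793005ULL + inc;  // LCG step
        std::uint32_t xorshifted =
            static_cast<std::uint32_t>(((old >> 18u) ^ old) >> 27u);
        std::uint32_t rot = static_cast<std::uint32_t>(old >> 59u);
        // Output permutation: xorshift-high, then a random rotate.
        return (xorshifted >> rot) | (xorshifted << ((32u - rot) & 31u));
    }
};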

Xarn
Jun 26, 2015
Is the cost mostly in the RNG, or in the distribution? Depending on what range you are using, you might be able to bypass the distribution, or use a better implementation of it.


Anyway, I use PCG for my projects, and if it were performance-critical, I would replace the std:: distribution with a better one. (Incidentally, Lemire has a blog post about that as well.)

nielsm
Jun 1, 2009



Computer viking posted:

Last time I needed one I read bytes from /dev/random, but I'm sure there are reasons to not do that?

Code that needs to run on Windows could be one common reason.

Foxfire_
Nov 8, 2010

/dev/random is often a terrible-quality generator. And even when it isn't, doing syscalls and having the kernel do PRNG math is slower than doing the PRNG math directly; the kernel isn't any better at it than your process.

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

Xarn posted:

Is the cost mostly in the RNG, or in the distribution? Depending on what range you are using, you might be able to bypass the distribution, or use a better implementation of it.


Anyway, I use PCG for my projects, and if it were performance-critical, I would replace the std:: distribution with a better one. (Incidentally, Lemire has a blog post about that as well.)

Aw jeez, Xarn is our winner. A bad misread of profiler results on my part: it's spending its time not in generating the next random number, but in translating it to the desired int range. Also, it looks like Lemire got most of his super-fast approach into GNU libstdc++!

Thanks for the tips.

https://lemire.me/blog/2019/06/06/nearly-divisionless-random-integer-generation-on-various-systems/
https://lemire.me/blog/2019/09/28/doubling-the-speed-of-stduniform_int_distribution-in-the-gnu-c-library/

Xarn
Jun 26, 2015
Just remember that uniform_int_distribution is clever rejection sampling under the hood, so you might be generating multiple random numbers per distribution result.
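
For reference, the linked approach boils down to something like this sketch of Lemire's nearly-divisionless method (32-bit flavor; `gen` is a stand-in for whatever raw generator you pick):

C++ code:
#include <cstdint>

// Map a raw 32-bit random value into [0, range) with a 64-bit multiply.
// The expensive % only runs on the rare rejection path, hence
// "nearly divisionless".
template <typename Gen>
std::uint32_t bounded_rand(Gen& gen, std::uint32_t range) {
    std::uint64_t m = std::uint64_t{gen()} * range;
    std::uint32_t low = static_cast<std::uint32_t>(m);
    if (low < range) {                             // possible bias: reject and retry
        std::uint32_t threshold = -range % range;  // == 2^32 mod range
        while (low < threshold) {
            m = std::uint64_t{gen()} * range;
            low = static_cast<std::uint32_t>(m);
        }
    }
    return static_cast<std::uint32_t>(m >> 32);    // high half is the result
}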

BattleMaster
Aug 14, 2000

Foxfire_ posted:

/dev/random is often a terrible-quality generator. And even when it isn't, doing syscalls and having the kernel do PRNG math is slower than doing the PRNG math directly; the kernel isn't any better at it than your process.

Yeah, I just use it to seed a PRNG. I use getrandom() (which with no flags draws from the same pool as /dev/urandom) to seed xoroshiro128+ for my Monte Carlo photon simulations.

I am satisfied with the results, and it's noticeably faster than glibc's rand - like a 10-30%+ runtime reduction, depending on material (lighter materials like water have much more scattering than heavier ones like lead, and therefore need more random numbers per photon). It really adds up for simulations of billions of photons!
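
In case it's useful to anyone, a sketch of that setup as I understand it (Linux-only because of getrandom(), and using one published set of xoroshiro128+ rotation constants):

C++ code:
#include <cstdint>
#include <sys/random.h>  // getrandom(), Linux/glibc

struct Xoroshiro128Plus {
    std::uint64_t s[2];

    Xoroshiro128Plus() {
        // getrandom() with flags=0 draws from the same pool as /dev/urandom.
        // An all-zero state would be invalid, but that's vanishingly unlikely.
        if (getrandom(s, sizeof s, 0) != static_cast<ssize_t>(sizeof s)) {
            s[0] = 1;  // crude fallback so the generator still runs
            s[1] = 2;
        }
    }

    static std::uint64_t rotl(std::uint64_t x, int k) {
        return (x << k) | (x >> (64 - k));
    }

    std::uint64_t next() {
        std::uint64_t s0 = s[0], s1 = s[1];
        std::uint64_t result = s0 + s1;  // the "+" in xoroshiro128+
        s1 ^= s0;
        s[0] = rotl(s0, 55) ^ s1 ^ (s1 << 14);
        s[1] = rotl(s1, 36);
        return result;
    }
};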

BattleMaster fucked around with this message at 04:30 on Aug 13, 2022

Methanar
Sep 26, 2013

by the sex ghost
I legitimately feel sick after trying to read through a C program that abuses the Windows API.

Falcorum
Oct 21, 2010

Absurd Alhazred posted:

Maybe that person should have called it an "embedding concept" and it would have gotten more votes.

Call it embeddable textual/binary ranges and then you get at least one element of "ranges" that is actually useful and doesn't balloon compile times

Falcorum fucked around with this message at 18:45 on Aug 15, 2022

VikingofRock
Aug 24, 2008




I have kind of a weird problem that is a bit beyond my template expertise, and I thought I would ask here. I have a function which has several overloads, which all look like this:

C++ code:
template <typename T>
void Foo(T& t, std::integral_constant<int, 3>);  // e.g. the N = 3 overload
There are overloads from N=0 to 16.

I am trying to write a different function, which looks like this:

C++ code:
template <typename T, int M>
void SelectFoo(T& t) {
   // this is what I'm trying to write
}
and this function needs to call the correct overload of `Foo`. Now, usually this works with `N = M` (i.e. calling `Foo(t, std::integral_constant<int, M>{})`), and everything is gravy. But sometimes that causes a syntax error in `Foo()`, and instead calling with `N = M-1` will work, or `N = M-2`, or `M-3`. I am very certain that there is exactly one value of N somewhere between 0 and M that will not lead to a syntax error in Foo. How do I write `SelectFoo` to call the right overload?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Can't you just do
code:

template <typename T>
void SelectFoo(T& t) {
   Foo(t, std::integral_constant<int, 0>{});
}

template <typename T>
void SelectFoo(T& t) {
   Foo(t, std::integral_constant<int, 1>{});
}

// etc

If only one of those instantiations will actually compile, then overload resolution will pick that one and ignore the rest?

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
SFINAE only applies to the signature of a function, not the body. You'd have to do something along the lines of:

C++ code:
template<typename... Ts> struct make_int { using type = int; };
template<typename... Ts> using int_t = typename make_int<Ts...>::type;

template <typename T>
void SelectFoo(T& t, int_t<decltype(Foo(t, std::integral_constant<int, 0>()))> = 0) {
   Foo(t, std::integral_constant<int, 0>{});
}
template <typename T>
void SelectFoo(T& t, int_t<decltype(Foo(t, std::integral_constant<int, 1>()))> = 0) {
   Foo(t, std::integral_constant<int, 1>{});
}
This will probably absolutely murder your compile times.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
Specifically, you would need to put whatever expression will ultimately fail to compile in the function signature.

Really you need to figure out the correct computation for N to make it work instead of relying on overloading.

Beef
Jul 26, 2004
Wouldn't dead-code optimization remove all the unused functions? As long as they are not externally linked, that is.

VikingofRock
Aug 24, 2008




Thank you everyone!

rjmccall posted:

Specifically, you would need to put whatever expression will ultimately fail to compile in the function signature.

Really you need to figure out the correct computation for N to make it work instead of relying on overloading.

The correct number for N is `M - (the number of structs from which T inherits)`. I googled around a bit and it didn't seem like there is a way to get the number of parent structs of T via template shenanigans, but maybe there is a way that I am unaware of.

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


VikingofRock posted:

The correct number for N is `M - (the number of structs from which T inherits)`.

What exactly are you trying to do here?

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Beef posted:

Wouldn't dead code optimization remove all the unused functions? As long a they are not externally linked that is.

The problem with this approach is not that the compiled binary is too large.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

VikingofRock posted:

Thank you everyone!

The correct number for N is `M - (the number of structs from which T inherits)`. I googled around a bit and it didn't seem like there is a way to get the number of parent structs of T via template shenanigans, but maybe there is a way that I am unaware of.

This is a weird condition, but yeah, you're not going to be able to do it except manually, with a trait “database”.
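
For anyone following along, the "database" would look something like this (all names here are hypothetical, just to show the shape):

C++ code:
#include <type_traits>

// Hand-maintained trait: how many empty bases a struct has, defaulting
// to zero. Every struct with empty bases needs its own specialization.
template <typename T>
struct num_empty_bases : std::integral_constant<int, 0> {};

struct EmptyTag {};
struct MyStruct : EmptyTag { int x, y, z; };

template <>
struct num_empty_bases<MyStruct> : std::integral_constant<int, 1> {};

// SelectFoo could then use N = M - num_empty_bases<T>::value.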

VikingofRock
Aug 24, 2008




ultrafilter posted:

What exactly are you trying to do here?

It's a little complicated, but effectively I want to turn the bindable fields of a struct into a tuple of references to those fields, supporting structs with up to 16 fields. Right now, this is done with a series of overloads that look like

C++ code:

// For N=3
template <typename T> auto BindStructFields(T& value, std::integral_constant<int, 3>) {
  auto& [x, y, z] = value;
  return std::tie(x, y, z);
}

And there is another function which automatically calls the right overload by determining the number of fields that can be used to initialize the struct. This is M in my explanation above.

However! There is an edge case here, which we now need to account for. In the case that the struct in question inherits from an empty struct (or several empty structs), the number of bindable fields is not the same as the number of fields used to initialize the struct. That is:

C++ code:

// Foo is a struct with three integer member fields
// and which inherits from a single empty struct

// This is how you initialize Foo
Foo foo = {{}, 1, 2, 3};
// This is how you bind to it.
auto& [a, b, c] = foo;

I want to account for this edge case.

I actually found a workaround, though.

C++ code:

template <typename T, int N = 0>
auto AutoBindStructFields(T& value) {
  // Needs C++20 for a lambda in an unevaluated context.
  constexpr bool can_bind = std::is_invocable_v<
      decltype([](auto& t) -> decltype(BindStructFields(t, std::integral_constant<int, N>{})) {}),
      T&>;
  if constexpr (can_bind) {
    return BindStructFields(value, std::integral_constant<int, N>{});
  } else {
    return AutoBindStructFields<T, N+1>(value);
  }
}

Now, optimally, I would start the above at N = (number of initialization fields) and count down, but for some reason counting down doesn't compile while counting up does. Whatever, good enough; compile times aren't too murdered.

Beef
Jul 26, 2004

Plorkyeran posted:

The problem with this approach is not that the compiled binary is too large.

Much like my code, I don't remember why I wrote that post. (Yeah, I completely misunderstood the problem.)

VikingofRock
Aug 24, 2008




Update: my workaround does not, in fact, work. I had just special-cased N=0 to return std::tie(), and it was doing that for every type.

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.
Is there a convenient, portable C++ way to read in values that were written little-endian? I'm aware of endian.h on Linux, but if I wanted something that worked on Macs too, what would the simplest option be?

My specific need at the moment is just validating that the first 4 bytes of a file are a magic number, which I can do by explicitly reading 4 bytes into an unsigned char array and comparing arrays, so that I don't have to muck with endianness. But I'm realizing I don't have a lightweight, general-purpose solution for this handy.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Do you need this to be portable to systems with unusual values for CHAR_BIT?

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

Jabor posted:

Do you need this to be portable to systems with unusual values for CHAR_BIT?

Nope, absolutely not. I am assuming 8 bit bytes.

OddObserver
Apr 3, 2009
Then you can probably just do some shifts.
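
E.g. something like this sketch, assuming plain 8-bit bytes:

C++ code:
#include <cstdint>

// Read a 32-bit little-endian value byte by byte. Works regardless of
// host endianness; gcc and clang fold it to a single load at -O2.
inline std::uint32_t load_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
          | static_cast<std::uint32_t>(p[1]) << 8
          | static_cast<std::uint32_t>(p[2]) << 16
          | static_cast<std::uint32_t>(p[3]) << 24;
}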

pseudorandom name
May 6, 2007

std::endian and (if you live far enough in the future) std::byteswap
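
Roughly like this, assuming a new enough toolchain (std::endian is C++20, std::byteswap is C++23):

C++ code:
#include <bit>      // std::endian (C++20), std::byteswap (C++23)
#include <cstdint>
#include <cstring>

// Read a little-endian u32: memcpy dodges alignment and aliasing issues,
// and the swap only happens on big-endian hosts.
inline std::uint32_t read_le32(const void* src) {
    std::uint32_t v;
    std::memcpy(&v, src, sizeof v);
    if constexpr (std::endian::native == std::endian::big) {
        v = std::byteswap(v);
    }
    return v;
}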

Foxfire_
Nov 8, 2010

MSVC and ICC are surprisingly crap at optimizing the simple shift versions:
https://godbolt.org/z/qMvv4drbv

more falafel please
Feb 26, 2005

forums poster

99% of the time when I have to deal with endian junk it's for networking stuff, so I use ntohl() etc, but yeah, std::endian is probably the right way for C++.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

Foxfire_ posted:

MSVC and ICC are surprisingly crap at optimizing the simple shift versions:
https://godbolt.org/z/qMvv4drbv
clang and gcc are surprisingly amazing at it, though; it's like magic watching a fuckload of shift operations and casts for a 64-bit endian-change get reduced to a single inlined "bswap".

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.

pseudorandom name posted:

std::endian and (if you live far enough in the future) std::byteswap

Thanks all!

Bit shifts have always worked and will continue to work, but I bet there was some shiny whizz-bang addition to the stdlib!

Twerk from Home
Jan 17, 2009

This avatar brought to you by the 'save our dead gay forums' foundation.
I just watched this talk about strings at Facebook, and their successful / unsuccessful attempts at improving string performance in C++.

https://www.youtube.com/watch?v=kPR8h4-qZdk

Starting at about 21-22 minutes, there's a discussion about their failed attempt to eliminate the null terminator until calling c_str(), at which point it would be added. The core bug that prevented their optimization has to do with:



1. malloc an allocation for the string data that spans two pages, A and B, and write only the bytes that fall within page A. No bytes are ever written to page B.
2. Another malloc/free happens in another part of page B, and page B is conditionally returned to the kernel: the part of the allocation in page B has never been written to, and the other memory allocated in B was freed.
3. c_str() is called, and data[size()] is read from the part of the allocation in page B, which was returned to the kernel and is thus uninitialized. This read is undefined behavior, and the kernel returns a 0.
4. Another malloc happens in page B, and when it is written to, all of page B comes back in its previous state, including the uninitialized garbage that was actually at data[size()]. So now data[size()] is not 0, even though nothing has written to it.
5. That's a C string that's not null-terminated now!

The only discussion I found around it is https://stackoverflow.com/questions/46123029/reading-from-uninitialized-memory-returns-different-answers-every-time, where the answers completely miss the point and assume that Facebook's senior engineers don't know what they're doing and that it's a stupid, simpler bug.

My big question: if you're malloc-ing 256 bytes, writing in 128 bytes, and then reading the 129th byte's value, wasn't that always undefined behavior in the first place, because you're reading uninitialized memory? I would expect that this problem could have happened without the extra malloc/free/return-page-to-kernel cycle, because data[128] was never written to. It should be fine for the first read from it to return 0 and a second read to return nonzero, because if you malloc something and don't write to it, reading from it is always undefined, right?
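
Concretely, the pattern I'm asking about is something like this (deliberately broken, purely for illustration):

C++ code:
#include <cstdio>
#include <cstdlib>
#include <cstring>

int main() {
    unsigned char* data = static_cast<unsigned char*>(std::malloc(256));
    std::memset(data, 'x', 128);  // write only the first 128 bytes

    // data[128] was never written, so its value is indeterminate.
    // Nothing obviously guarantees these two reads agree with each other.
    unsigned char first  = data[128];
    unsigned char second = data[128];
    std::printf("%d %d\n", first, second);

    std::free(data);
}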

Foxfire_
Nov 8, 2010

I think it is or is not undefined depending on how exactly you interpret some unclear bits of the C standard:

7.20.3.3 (2): "The malloc function allocates space for an object whose size is specified by size and whose value is indeterminate."
3.17.2: "indeterminate value: either an unspecified value or a trap representation"
3.17.3: "unspecified value: valid value of the relevant type where this International Standard imposes no requirements on which value is chosen in any instance"

There is some other stuff where you can pigeonhole-logic your way to an unsigned char never having any trap representations. So then the question is whether multiple reads of some indeterminate, non-trap value are required to always produce the same (arbitrary) value, or whether they can change on every read.

Here's a standards committee thing adjacent to it: https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_451.htm Amusingly, their conclusion seems to be "C largely doesn't work and almost all nontrivial programs have undefined behavior" (their opinion was that memcpy()-ing a struct with uninitialized padding bits has undefined behavior).

Shorter summary
code:
unsigned char x;  // never initialized: its value is indeterminate
if (x == 0) { printf("Dick"); }
if (x != 0) { printf("Butt"); }
Does the program have UB, and if not, is it required to print anything?

Foxfire_ fucked around with this message at 18:50 on Aug 23, 2022
