Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
If you're debugging an optimized program, the debug info is probably wrong and giving you nonsense answers.

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
There's nothing slow about std::vector; accessing it is exactly as fast as it's possible to be. Did he give any details on why it's slow? Does he just want to insert stuff into the middle of all his data structures?
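For what it's worth, here's a rough sketch of why element access has nothing to be slow about: the storage is one contiguous array, so v[i] is the same pointer arithmetic you'd do by hand. The function names are made up for illustration, and the claim that both loops compile to the same code assumes optimizations are on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a vector through operator[]; with optimizations on, compilers
// typically reduce this to the same loop as the raw-pointer version below.
long sum_indexed(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Sum the same storage through a plain pointer, as you would with a C array.
long sum_pointer(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// True when element i lives exactly i slots past data(), i.e. the
// storage is one contiguous array.
bool is_contiguous(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != v.data() + i) return false;
    }
    return true;
}
```

Inserting into the middle is a different story, since every later element has to be shifted, which is probably the only way to make it look slow.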

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
One of your printf formats is being ignored (probably because it doesn't like "%Lfx") so it's reading the wrong thing off the stack.

And use %p instead of casting pointers to something else.
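Something like this, as a minimal sketch. The original code isn't shown here, so the helper names are hypothetical; the point is only that each conversion has to match its argument type, with %Lf for long double and %p for a void*.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format a long double with %Lf. The 'L' length modifier has to be paired
// with a long double argument, otherwise printf reads the wrong bytes.
std::string format_long_double(long double value) {
    char buf[64];
    int written = std::snprintf(buf, sizeof buf, "%Lf", value);
    assert(written > 0 && written < static_cast<int>(sizeof buf));
    (void)written;  // silence the unused warning when NDEBUG is defined
    return buf;
}

// Format a pointer with %p. The argument should be a void*, so cast to that
// rather than to an integer type that may be narrower than a pointer.
std::string format_pointer(const void* ptr) {
    char buf[64];
    int written = std::snprintf(buf, sizeof buf, "%p", const_cast<void*>(ptr));
    assert(written > 0 && written < static_cast<int>(sizeof buf));
    (void)written;
    return buf;
}
```

The exact text %p produces is implementation-defined, so don't parse it; it's only good for eyeballing.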

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
Classes will usually have a vtable pointer at the beginning, so you can't rely on offsets from the beginning of the class. You can rely on offsets from other members, I guess.

Also, use __attribute__((packed)) if you don't care about compiler portability, otherwise you'll forget to undo the #pragma and it'll be slow.
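A small sketch of both points, assuming GCC or Clang (the attribute is their extension) and a typical ABI where the vtable pointer sits at the start of the object. The struct names are made up for illustration.

```cpp
#include <cassert>  // for the checks mentioned in the note below
#include <cstddef>
#include <cstdint>

// Plain struct: the compiler normally pads 'value' up to a 4-byte boundary.
struct Padded {
    std::uint8_t tag;
    std::uint32_t value;
};

// Same fields, packed with the GCC/Clang attribute (not portable to MSVC).
// The attribute applies to this one type, so there is nothing to undo later.
struct __attribute__((packed)) Packed {
    std::uint8_t tag;
    std::uint32_t value;
};

// A virtual function makes the class polymorphic; common ABIs then store a
// hidden vtable pointer in each object, usually before the first member.
struct Polymorphic {
    virtual ~Polymorphic() {}
    std::uint32_t value = 0;
};

// Byte distance from the start of the object to its 'value' member.
std::size_t value_offset(const Polymorphic& obj) {
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* member = reinterpret_cast<const char*>(&obj.value);
    return static_cast<std::size_t>(member - base);
}
```

With GCC or Clang, sizeof(Packed) comes out as 5, while Padded is 8 on the usual targets. value_offset() is nonzero wherever the vtable pointer comes first, which is why offsets from the start of the class aren't safe to hardcode.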

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea

JoeNotCharles posted:

Actually you can in gcc - it's a compiler extension. (You shouldn't, though, because it makes your code non-portable.)

It's legal C99, but that assumes a C99 compiler.

vv That means they're not perfect (see the note), but I've never had any trouble with them in practice.

Mr VacBob fucked around with this message at 20:22 on May 6, 2008

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
gcc completely or mostly ignores 'inline' if you have optimizations on. Recent (4.3 or after) versions have decent inlining detection, but it was completely messed up before that. You really have to use __attribute__((always_inline)) sometimes.
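Roughly what that looks like, as a sketch for GCC or Clang only. The function names are hypothetical; the attribute itself is the real extension.

```cpp
#include <cassert>

// Plain 'inline': as far as inlining goes this is only a hint, and the
// optimizer makes its own decision.
inline int square_hint(int x) {
    return x * x;
}

// GCC/Clang extension: request inlining at every direct call site, even
// with optimizations off. Keep the 'inline' keyword too, otherwise GCC
// warns that the function might not be inlinable.
__attribute__((always_inline)) inline int square_forced(int x) {
    return x * x;
}

// Caller that uses both; only the forced one is guaranteed to be inlined.
int sum_of_squares(int a, int b) {
    assert(a >= 0 && b >= 0);  // keeps the toy example away from odd inputs
    return square_forced(a) + square_hint(b);
}
```

Check the generated asm if it matters; that's the only way to know what the compiler actually did with the plain 'inline' one.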

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea

crazypenguin posted:

The next biggest chunk for GCC is probably that the software architecture doesn't really lend itself to being changed very easily because god forbid somebody write a nonfree plugin for GCC! Quick! Chop our own arms off!

This isn't true anymore, and I wish people would stop acting like it was. There are at least two plugin APIs for GCC and a link-time optimization project coming along now. And the vectorization pass should be about as good as or better than MS's; most of the problems come from parts being really old and full of awful spaghetti code written by RMS.

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
-Wall in gcc will catch this. -Werror=implicit will catch it even better.

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea

slovach posted:

What is with MSVC and SSE intrinsics? MS recommends their usage over inline asm, but the stuff it seems to generate is beyond earthly logic at times.

Ok, _mm_set stuff... why would it honestly make 4 movss instructions over one movaps and a constant? Occasionally it seems to come up with some shuffling black magic out of nowhere... and isn't instruction pairing a good idea with MMX / SSE? Even if the intrinsics are paired, it seems to do its own thing and break that.

Maybe it knows best, but it seems to generate some strange code sometimes...

The people who recommend MMX/SSE intrinsics over asm usually don't read their output asm.

Of course, inline asm is kind of hard if you want MSVC/GCC compatibility, or even x86-32/x86-64 compatibility across the same compiler.

"Instruction pairing" hasn't applied to anything since the first Pentium. Out-of-order x86 cores don't care what order your instructions are in (well...), so just try to minimize instructions and register spills and don't worry about that.

Mr VacBob fucked around with this message at 00:28 on Sep 10, 2010

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
Nothing, they're identical. This is what C is for.

Obviously you shouldn't try to free() it or anything.

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea
That program won't fail if you use -mfpmath=sse, but will if you use x87. Anything x86-64 and anything Darwin use SSE by default. Most non-x86 platforms should work too, because they aren't as horrible as x87 and don't extend float to double on loads.

Mr VacBob
Aug 27, 2003
Was yea ra chs hymmnos mea

ehnus posted:

PowerPC definitely extends 32-bit floats to 64-bit floats on load.

Oh, so it does. I was thinking of double - x87 extends that to 80-bit, which breaks the PPC optimization of writing memcpy using double copies.
