Stevor
Feb 18, 2004

THIS IS VERY BIZARRE

shrughes posted:


Gerblyn posted:

I think...

I think you are right Gerblyn, and your explanation really makes sense to me. The instance of the class is created in the main function, and I could implement what shrughes said without a compiler error, but the problem is obviously in the functions I'm calling. That being said, I understand more about instances of classes now and what they do. I will keep these things in mind and try to rework my code.
Thanks guys.

e: after encapsulating all of the code for the sprite (drawing, mechanics, etc.) and getting a better grasp of how the constructor and class instancing work, I've gotten the application to do what I wanted, and I understand it even better now.

Stevor fucked around with this message at 16:07 on Feb 7, 2011

Theseus
Jan 15, 2008

All I know is if there is a God, he's laughin' his ass off.
I have what I hope is a stupid, easily-answered question.

I have a union such that every instance of it must be allocated on a 16-byte boundary. I'm using C++ and the g++ compiler. Unfortunately, I'm having some issues with it: the __attribute__ ((aligned(16))) directive doesn't seem to work. My instances of the union seem to have ended up on 8-byte boundaries instead! For performance reasons, they're being declared on the stack, which I assume is the source of the issue: I've read around a bit to try to find a workaround, and there seems to be a general consensus that alignment of variables on the stack is not guaranteed. I would make them static to get them off the stack, but I need to make the application multithreaded in the future, so that's not an option. Does anyone have any suggestions for forcing alignment on the stack? Instances of the union are 16 bytes in size themselves, but I'm not averse to increasing that to as much as 32 bytes if needed.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!
One way you can force alignment of a thing that will work on the stack is to allocate your-alignment-size of extra space, then set a pointer to (start_of_space + alignment) &~ (alignment - 1)

I don't know if the reference point being a pointer will make your performance worse though.

(I found this method used in a chess engine, which aligned board-data to a boundary so that shifts and bitwise operations on a pointer could be used to get board coordinates, back in the day when shifts and bitwise operations were unambiguously faster than mathematical ones.)
code:
//For example
#define ALIGNMENT 16
 char buffer[sizeof(YourUnion)+ALIGNMENT];
 YourUnion *pUnion=(buffer+ALIGNMENT)&~(ALIGNMENT-1);
 //done, pUnion is now an aligned YourUnion object.
If you were making an array of them you wouldn't need to align them individually, you could just make the buffer like
char buffer[sizeof(YourUnion)*ARRAYSIZE+ALIGNMENT];

vvvv Edited to fix my mistake - not my math, just my typing. I had it right but then I replaced hardcoded 16 with ALIGNMENT and accidentally dropped my -1.

roomforthetuna fucked around with this message at 17:07 on Feb 11, 2011

shrughes
Oct 11, 2008

(call/cc call/cc)

Theseus posted:

Does anyone have any suggestions for forcing alignment on the stack?

What architecture are you targeting? What version of GCC are you using? I have a hard time _not_ getting 16-byte alignment.

When using a union u { char z; }, I get proper 16-byte alignment whether I put it on the union, or on the field 'z', or on a particular variable allocated on the stack. When using a union u { struct { int i, j, k, l; } z; }, for example, I can't seem to avoid 16-byte alignment. Even when I use a union u { char z[16]; }.
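
For example, something along these lines is what I'm testing with (just a sketch, obviously not your actual union):
code:
#include <cstdio>

union __attribute__ ((aligned(16))) u {
    struct { int i, j, k, l; } z;
};

int main() {
    u a;   // a local (stack) variable
    // __alignof__ is the GCC extension; newer compilers also have C++11 alignof
    std::printf("alignof(u) = %u, &a %% 16 = %u\n",
                (unsigned)__alignof__(u),
                (unsigned)((unsigned long)&a % 16));
    return 0;
}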

Edit: Also, roomforthetuna's math is off.

Theseus
Jan 15, 2008

All I know is if there is a God, he's laughin' his ass off.

shrughes posted:

What architecture are you targeting? What version of GCC are you using? I have a hard time _not_ getting 16-byte alignment.

When using a union u { char z; }, I get proper 16-byte alignment whether I put it on the union, or on the field 'z', or on a particular variable allocated on the stack. When using a union u { struct { int i, j, k, l; } z; }, for example, I can't seem to avoid 16-byte alignment. Even when I use a union u { char z[16]; }.

Edit: Also, roomforthetuna's math is off.

I am targeting the 32-bit x86 architecture. On many 64-bit machines, 16-byte alignment is the default, but this is not the case on 32-bit machines. My GCC version is 4.4.3, though I have a system locked to 3.4.6 that I also want to run it on.

Hughlander
May 11, 2005

Theseus posted:

I have what I hope is a stupid, easily-answered question.

I have a union such that every instance of it must be allocated on a 16-byte boundary. I'm using C++ and the g++ compiler. Unfortunately, I'm having some issues with it: the __attribute__ ((aligned(16))) directive doesn't seem to work. My instances of the union seem to have ended up on 8-byte boundaries instead! For performance reasons, they're being declared on the stack, which I assume is the source of the issue: I've read around a bit to try to find a workaround, and there seems to be a general consensus that alignment of variables on the stack is not guaranteed. I would make them static to get them off the stack, but I need to make the application multithreaded in the future, so that's not an option. Does anyone have any suggestions for forcing alignment on the stack? Instances of the union are 16 bytes in size themselves, but I'm not averse to increasing that to as much as 32 bytes if needed.

What I've done is:
code:
struct SomeStruct
{

};

union SomeUnion
{
    SomeStruct Data;
    char Padding[((sizeof(SomeStruct) + 15) / 16) * 16]; // round the size up to a multiple of 16
};

Hughlander fucked around with this message at 19:34 on Feb 11, 2011

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

roomforthetuna posted:

I don't know if the reference point being a pointer will make your performance worse though.
Now I'm curious - someone more low-level than me in modern architecture, would this hurt your performance? Other than the initial assignment I mean.

I suppose I can run a quick test myself!

And I have, and the results are, frankly, weird.
code:
  char buffer[2048];
  char *pbuffer=(char*)(((DWORD)buffer+15)&~15);
  clock_t tm1=clock();
  for (int i=0; i<1000000000; i++) {
    buffer[32]='a';
  }
  clock_t tm2=clock();
  for (int i=0; i<1000000000; i++) {
    pbuffer[32]='a';
  }
  clock_t tm3=clock();
  TRACE(_T("tm2-tm1 = %d\ntm3-tm2 = %d\n"),tm2-tm1,tm3-tm2);
(Forgive my non-64-bit-compatible Windowsisms, and also, apparently my earlier example of how you can manually force alignment wouldn't compile because you have to cast away from a pointer before you can use a bitwise and, then cast back, but you can do that yourself if you decide to use the method!)
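
Something like this is the cast-corrected version I mean (a sketch, using uintptr_t from <stdint.h>; substitute DWORD or size_t if your compiler doesn't have it):
code:
#include <stdint.h>

union YourUnion { char data[16]; };   // stand-in for the real 16-byte union

#define ALIGNMENT 16

int main() {
    char buffer[sizeof(YourUnion) + ALIGNMENT];
    // round-trip through an integer type to do the bitwise AND, then cast back
    YourUnion *pUnion = (YourUnion*)(((uintptr_t)buffer + ALIGNMENT) & ~(uintptr_t)(ALIGNMENT - 1));
    pUnion->data[0] = 0;   // pUnion is 16-byte aligned and stays within buffer
    return 0;
}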

So anyway, the results of this test were
tm2-tm1 = 3581
tm3-tm2 = 2920
- reference by the array buffer was actually slower than by the pointer.

I thought this might be because one was run while things were still starting up, so I switched them around, pbuffer first, which gave the results
tm2-tm1 = 3334
tm3-tm2 = 3519
- array buffer still slower than pointer.

So then I thought, maybe it's *because* the pointer is better aligned (it was; buffer=0x0012f6cc, pbuffer=0x0012f6d0), so I changed it to pbuffer[28] so they'd be working on the exact same byte. Results (pbuffer still going first)
tm2-tm1 = 3343
tm3-tm2 = 3528
- array buffer still slower than pointer.

So then I thought, how about if they're both actually the same value! So I added some padding bytes before buffer, to bring it to a round 16. Both buffer and pbuffer were now 0x0012f6d0, and referencing pbuffer[32] and buffer[32]. results:
tm2-tm1 = 3323
tm3-tm2 = 3544
- array buffer still slower than pointer. (Reminder - after the first one, the results are pbuffer first then buffer.)

So, er, what's up with this? Is it because the pointer is already in a register, but using the array means it gets loaded into a register every time before you can add 32 to it? Is this something that would optimise away under a speed optimization (I used a "no optimize" build)?

Anyway, in conclusion, using this method to force alignment of your data appears to, at the very least, not significantly hinder performance.

raminasi
Jan 25, 2005

a last drink with no ice
I have another move semantics question (I'm still feeling pretty :saddowns: about all this). Given my (hopefully fixed) class definition from before:
code:
class material {
	std::string _name;

	material(const material & src); // it turns out I don't want copying
	material & operator = (const material & src);

public:
	const std::string & name() const { return _name; }

	material(material && src) : _name(std::move(src._name)) { } // derp derp
	explicit material(const cppw::Instance * inst);
	
	material & operator = (material && src) { _name = std::move(src._name); return *this; }

	friend bool operator == (const material & lhs, const material & rhs);
};
Why, later, does
code:
void some_consumer::test(material && m) {
	material n(m);
}
fail to compile with error C2248: 'material::material' : cannot access private member declared in class 'material' (with Intellisense telling me that the copy constructor is inaccessible)? I thought that I wouldn't need to use std::move because m is already an rvalue reference, so I'm clearly missing something.

Paniolo
Oct 9, 2007

Heads will roll.
Adding std::move should fix it. I ran into a similar thing myself; it seems that you always need to add std::move. Someone who understands the mechanics a little better can probably explain why.

That Turkey Story
Mar 30, 2003

GrumpyDoctor posted:

I thought that I wouldn't need to use std::move because m is already an rvalue reference, so I'm clearly missing something.

The type of m there is an rvalue reference type, however, using a named rvalue reference in an expression always yields an lvalue. You only "see" rvalues when they are actual temporaries or when you are directly working with the return of a function whose return type is an rvalue reference type (such as with std::move).
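
So in the test function you need the explicit std::move (a sketch, using the material class from above):
code:
#include <utility>   // std::move

void some_consumer::test(material && m) {
	material n(std::move(m));   // m is named, hence an lvalue here; std::move yields an rvalue again
}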

Sneftel
Jan 28, 2009
What I think is going on there is that m is itself an lvalue (despite being an rvalue-reference), causing it to prefer to bind to the (private) lvalue-taking constructor. By shoving a std::move in there, you get an rvalue version of it, which binds to the rvalue-taking constructor.

But I could be wrong about all that.

EDIT: But if so, I'm in good company!

Optimus Prime Ribs
Jul 25, 2007

Is it possible to access types created with typedef inside of a templated class?
I tried doing it like this:
code:
template <class _Ty>
struct MyClass
{
	typedef int MyFooTest;
	MyFooTest	getFoo();
};

template <class _Ty>
MyClass<_Ty>::MyFooTest MyClass<_Ty>::getFoo()
{
	return 0;
}
But I just get the error: missing ';' before 'MyClass<_Ty>::getFoo'.
I'm not that great with templates so I imagine I'm doing something pretty wrong here.

OddObserver
Apr 3, 2009
You need to say
'typename MyClass<_Ty>::MyFooTest' when referring to such a type in templated contexts.
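
i.e. the out-of-class definition from the question becomes (same getFoo, just with typename added):
code:
template <class _Ty>
typename MyClass<_Ty>::MyFooTest MyClass<_Ty>::getFoo()
{
	return 0;
}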

Optimus Prime Ribs
Jul 25, 2007

Well that was simple.
Thanks buddy. :)

litghost
May 26, 2004
Builder

Optimus Prime Ribs posted:

Well that was simple.
Thanks buddy. :)

Just for a little more detail, the problem here is dependent names: the compiler can't tell that MyClass<_Ty>::MyFooTest is a type until it knows what _Ty is, so you have to say so explicitly with typename.

Jam2
Jan 15, 2008

With Energy For Mayhem
I picked up "The C Programming Language" and I'm just getting started with the language. I want to start using C to tackle programming puzzles and develop skills along the way.

Which environment is a better development environment for C, Windows or OS X?

Brecht
Nov 7, 2009

Jam2 posted:

Which environment is a better development environment for C, Windows or OS X?
OS X, unquestionably.

HFX
Nov 29, 2004

Brecht posted:

OS X, unquestionably.

Cygwin / MinGW and Eclipse make it a bit more of a tossup, but I would probably agree with Brecht for the most part.

Gerblyn
Apr 4, 2007

"TO BATTLE!"
Fun Shoe

roomforthetuna posted:

So, er, what's up with this? Is it because the pointer is already in a register, but using the array means it gets loaded into a register every time before you can add 32 to it? Is this something that would optimise away under a speed optimization (I used a "no optimize" build)?

Anyway, in conclusion, using this method to force alignment of your data appears to, at the very least, not significantly hinder performance.

My best guess would be that the processor can access memory aligned on a 16-byte boundary faster than memory on a 4-byte boundary, though I don't know enough about processor architecture to say for sure. If you run the code in a debugger, you should be able to examine the assembly that the compiler has produced for each loop. You might be able to spot a difference in the way the pointer arithmetic works between loops, which could explain the difference as well...

Jam2 posted:

Which environment is a better development environment for C, Windows or OS X?

I use MS Visual C++ and I find it a pretty solid system to work in. It's bloody expensive though, so you may prefer using Eclipse, which is free. I've never used it for C++ myself, but I know it's a very popular choice.

Optimus Prime Ribs
Jul 25, 2007

Gerblyn posted:

I use MS Visual C++ and I find it a pretty solid system to work in. It's bloody expensive though, so you may prefer using Eclipse, which is free.

Visual Studio is what I use for C++ development as well. I don't like VS2010 one bit, but VS2008 does everything I need it to and I've never had a reason to use anything else. But I got lucky and got it for free through school. v:shobon:v
It's certainly not a bad choice, but as for the "better" choice, yeah, I'd go with OS X.

shrughes
Oct 11, 2008

(call/cc call/cc)

roomforthetuna posted:

So, er, what's up with this? Is it because the pointer is already in a register, but using the array means it gets loaded into a register every time before you can add 32 to it? Is this something that would optimise away under a speed optimization (I used a "no optimize" build)?

Generally speaking there would be two differences: with pbuffer you'll be accessing memory relative to some register with the value of the pointer (e.g. accessing %rax+32 with a hard-coded offset, if the pointer is stored in %rax), but using buffer directly you might be accessing memory relative to the %rbp register. Since your buffer is 2048 bytes, you'll get an instruction writing to %rbp-2016 or something.

Since 2016 doesn't fit in a byte, this takes a longer instruction, and probably a more expensive instruction, than one that writes to %rax+32. It's certainly a different instruction.

Or maybe it's moving %rsp down to the bottom of the array, and then accessing %rsp+32. The instruction encoding is different for the %rsp register for some reason.

For example, writing 'a' to %rax-80, %rbp-80, and %rsp-80:
code:
c6 40 b0 61             movb   $0x61,-0x50(%rax)
c6 45 b0 61             movb   $0x61,-0x50(%rbp)
c6 44 24 b0 61          movb   $0x61,-0x50(%rsp)
Maybe the %rsp-using instruction is slower. I've never understood the purpose of using both the %rbp and %rsp registers for stack frames, and I don't know what VC++ would output.

Duke of Straylight
Oct 22, 2008

by Y Kant Ozma Post

Jam2 posted:

Which environment is a better development environment for C, Windows or OS X?

It's C. It probably works on your toaster. Just use whatever environment you're comfortable with and whatever IDE or editor works best for you.

Jam2
Jan 15, 2008

With Energy For Mayhem
What do I need to get started writing code and compiling C on windows? What about on OS X?

Mustach
Mar 2, 2003

In this long line, there's been some real strange genes. You've got 'em all, with some extras thrown in.

Gerblyn posted:

I use MS Visual C++ and I find it a pretty solid system to work in. It's bloody expensive though, so you may prefer using Eclipse, which is free. I've never used it for C++ myself, but I know it's a very popular choice.
Visual C++ Express Edition doesn't cost any money.

On OS X, you would use XCode, which is also free.

Also, installing either of those two gives you access to command-line compilers, which you may prefer over an IDE: cl on Windows and clang on OS X.
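
If you go the command-line route, the traditional smoke test is enough to check everything is hooked up (e.g. cl hello.c with Visual C++, or gcc/clang hello.c -o hello on OS X):
code:
/* hello.c - just checks the compiler and linker are working */
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}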

Mustach fucked around with this message at 16:38 on Feb 13, 2011

nielsm
Jun 1, 2009



Jam2 posted:

What do I need to get started writing code and compiling C on windows?

Visual C++ Express

Jam2 posted:

What about on OS X?

Xcode Tools from your OS X install DVD.

Brecht
Nov 7, 2009

Mustach posted:

Visual C++ Express Edition doesn't cost any money.

On OS X, you would use XCode, which is also free.

Also, installing either of those two gives you access to command-line compilers, which you may prefer over an IDE: cl on Windows and clang on OS X.
And gcc, which is what you should be using if you're just learning the language.

edit: clang is good too though

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

roomforthetuna posted:

So, er, what's up with this? Is it because the pointer is already in a register, but using the array means it gets loaded into a register every time before you can add 32 to it? Is this something that would optimise away under a speed optimization (I used a "no optimize" build)?

Microbenchmarks without optimization are meaningless — the compiler doesn't necessarily even use the same instruction selection and register allocation algorithms for non-optimized builds. Even -O1 kills your loops in both these cases, or at least it does in clang.

That said, I agree with shrughes's analysis of your results; it's almost certainly some vagary of instruction selection.
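
(If you do want loops like that to survive optimization, the usual trick is to make the stores observable - a sketch, reusing the pbuffer from the earlier test:)
code:
  volatile char *vbuffer = pbuffer;   // volatile-qualified view of the same bytes
  for (int i = 0; i < 1000000000; i++) {
    vbuffer[32] = 'a';                // each store must actually happen, so the loop can't be deleted
  }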

Mustach
Mar 2, 2003

In this long line, there's been some real strange genes. You've got 'em all, with some extras thrown in.

Brecht posted:

And gcc, which is what you should be using if you're just learning the language.

edit: clang is good too though
I think clang is better for a beginner, because while it supports all of the gcc flags they're likely to see while googling things, it gives monstrously better error messages.

Scaevolus
Apr 16, 2007

shrughes posted:

Since 2016 doesn't fit in a byte, this takes a longer instruction, and probably a more expensive instruction, than one that writes to %rax+32. It's certainly a different instruction.

Wouldn't they probably be the same size when converted to uOps?

pseudorandom name
May 6, 2007

At the very least, it'll be more expensive in the sense that the instruction is longer, with all the implications that has for the I-cache, decoder, etc.

Scaevolus
Apr 16, 2007

Speculating about a test like this without actually reading the assembly is pointless.

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip

Scaevolus posted:

Speculating about a test like this without actually reading the assembly is pointless.

So you learned your lesson from the minecraft project :v:

volatile bowels
Sep 7, 2009

All-Star
what does z=x++ + y mean?

I know z = ++x + y means x = x + 1 and then add y to the new x. I'm a little confused about the first one... I know I could just throw it into a compiler, but I need to figure it out by hand for a test at some point.


vvv Thanks!

volatile bowels fucked around with this message at 06:56 on Feb 15, 2011

DeciusMagnus
Mar 16, 2004

Seven times five
They were livin' creatures
Watch 'em come to life
Right before your eyes
z is equal to the current (before increment) value of x added to y. After the next sequence point, x will be incremented by one.

Scaevolus
Apr 16, 2007

Otto Skorzeny posted:

So you learned your lesson from the minecraft project :v:

I was running tests, and reading the assembly. :colbert:

shrughes
Oct 11, 2008

(call/cc call/cc)

DeciusMagnus posted:

z is equal to the current (before increment) value of x added to y. After the next sequence point, x will be incremented by one.

No, x will be incremented before the next sequence point. After the next sequence point, it will have been incremented by one. And if x is a non-primitive type, it will be incremented before the expression x++ returns the original value of x.
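
Concretely, with plain ints:
code:
int x = 3, y = 4;
int z = x++ + y;   // z == 7: the old value of x goes into the addition
                   // once the full expression is done, x == 4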

that awful man
Feb 18, 2007

YOSPOS, bitch

shrughes posted:

Or maybe it's moving %rsp down to the bottom of the array, and then accessing %rsp+32. The instruction encoding is different for the %rsp register for some reason.

For example, writing 'a' to %rax-80, %rbp-80, and %rsp-80:
code:
c6 40 b0 61             movb   $0x61,-0x50(%rax)
c6 45 b0 61             movb   $0x61,-0x50(%rbp)
c6 44 24 b0 61          movb   $0x61,-0x50(%rsp)

The encoding is longer when using ESP/RSP than any other register because its code is used as an escape in the Mod R/M byte to indicate that a SIB byte follows.

shrughes posted:

I've never understood the purpose of using both the %rbp and %rsp registers for stack frames

I don't have a definitive answer for this, but I point out:
  • The ENTER and LEAVE instructions assume the use of EBP as a frame pointer.
  • On the 8086, you could only address memory relative to a base register (BX or BP), an index register (SI or DI), or the sum of a base register and an index register. When 32-bit mode was introduced things improved, with the interpretation of the Mod R/M byte being simplified and the introduction of the SIB byte. But 32-bit routines could still call 16-bit routines so you have to be backward-compatible...
  • You could get away with using only ESP, and indeed some RISC machines only use a stack pointer, but every time you pushed something onto the stack the offsets for the arguments/locals would change. This wouldn't be a huge problem today, but compiler technology was not so advanced in the early 80s and once you've got libraries that use that sort of linkage...

There are probably more reasons, but it's late so :effort:

pseudorandom name
May 6, 2007

Using BP also makes stack unwinding really easy.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
There's a standard code-generation optimization called frame pointer elimination that does exactly what you're suggesting; it can actually be a fairly nice win on x86-32, given the paucity of registers. Since it, by definition, destroys the chain of stack frames, it usually does nasty things to utilities that rely on walking that, e.g. stack trace dumpers and other debugging tools. Most exceptions implementations use metadata schemes which are capable of walking through FPE frames, but not all of them; IIRC, FPE breaks Windows SEH (or would if SEH didn't just disable it).

The only time you *can't* do FPE is when a function dynamically varies its stack usage, e.g. because it uses variable-length arrays or alloca(); in that case you're forced to keep the frame pointer around so that you have a stable reference to the locals.
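
(For reference, GCC exposes this as -fomit-frame-pointer / -fno-omit-frame-pointer; a function like the sketch below is the kind that hangs on to its frame pointer even with FPE enabled, because of the alloca. Hypothetical example, using the Unix <alloca.h>:)
code:
#include <alloca.h>
#include <string.h>

// The alloca makes the stack size vary at runtime, so the compiler keeps
// %ebp/%rbp as a stable base for the locals even under -fomit-frame-pointer.
int sum_bytes(const char *src, int n)
{
    char *tmp = (char *)alloca(n);   // dynamic stack allocation
    memcpy(tmp, src, n);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += tmp[i];
    return total;
}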

rjmccall fucked around with this message at 10:01 on Feb 15, 2011

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


A dumb question about debuggers. Are they designed to work only with executables output by particular compilers, or at least a limited set of compilers?

I ask because I've been asked if I'd like to use Visual Studio at work instead of Sun Studio, the caveat being that we still have to use SunCC for compiling for now. So I'd like to use the Visual Studio debugger if possible.
