Star War Sex Parrot
Oct 2, 2003

sarehu posted:

C is fine or good if you want to understand "how it works." You'll run out of C to learn pretty quickly, and at some point there's more benefit to C++, and it's more for software design education.

I mean basically memory is a big array of bytes, there's structure size and alignment. Look at the assembly output of some simple C functions to understand how function calls work, and... you're done. There isn't really that much stuff at the bottom.
I think if the objective is to learn "how it works" then you have to learn more than just C, since as you say there's not much at the bottom. To me, what's more important is cultivating the context to evaluate and understand other languages by complementing the study of C with something like Programming Language Pragmatics. The table of contents for 3rd edition is here. The book at times veers into topics more suitable for compiler or language design (most people don't need to know how to use EBNF to define formal grammars), but concepts like binding, the call stack layout, etc. can lead to better understanding at a lower level and the book in general serves as a great reference when someone mentions a language term or concept you don't know.

Star War Sex Parrot fucked around with this message at 20:04 on Aug 24, 2016

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


Debugging someone else's C++ code and I'm getting a SIGBUS ('invalid address alignment') crash at certain points. Can I take this as a sign that I need to look for lousy pointer math somewhere or am I misdirecting my focus?

sarehu
Apr 20, 2007

(call/cc call/cc)

Ciaphas posted:

Debugging someone else's C++ code and I'm getting a SIGBUS ('invalid address alignment') crash at certain points. Can I take this as a sign that I need to look for lousy pointer math somewhere or am I misdirecting my focus?

You could also be dereferencing a garbage pointer.

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

Ciaphas posted:

Debugging someone else's C++ code and I'm getting a SIGBUS ('invalid address alignment') crash at certain points. Can I take this as a sign that I need to look for lousy pointer math somewhere or am I misdirecting my focus?
Hey I just ran into this issue the other week and spent a whole day figuring it out.

The problem in my case was that the someone else had used "__attribute__((packed))" on a struct definition when they shouldn't have. I don't know the proper way to handle structs with that attribute, but as I understand it, if you use it, you can't pass around a pointer to any of its members, because that pointer is not guaranteed to be on a "word" boundary and the CPU will flip. My solution was to remove the packed attribute from the struct. It wasn't saving any significant RAM having it there in my case.
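
For illustration, here is a minimal sketch of how a packed struct ends up with a misaligned member. It assumes GCC or Clang (the `__attribute__((packed))` spelling is compiler-specific), and `PackedRecord`, `PlainRecord` and `read_value` are hypothetical names, not from the code being debugged. Copying the bytes out with memcpy avoids ever handing a plain `int32_t*` to the misaligned member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record layout, for illustration only.
struct __attribute__((packed)) PackedRecord {
    std::uint8_t tag;     // 1 byte
    std::int32_t value;   // packed: starts at offset 1, no padding before it
};

// Same members without the attribute.
struct PlainRecord {
    std::uint8_t tag;
    std::int32_t value;   // the compiler pads so this lands on an aligned offset
};

// Safe read: copy the member's bytes instead of forming an int32_t* to it.
inline std::int32_t read_value(const PackedRecord& r) {
    std::int32_t out;
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&r);
    std::memcpy(&out, bytes + offsetof(PackedRecord, value), sizeof out);
    return out;
}
```

In the packed version `value` sits at offset 1, so a pointer to it is only 4-byte aligned by accident. That is the kind of pointer a strict-alignment CPU would trap on.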

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


Thanks, I'll check for both of those too. Back to the millstone~

John F Bennett
Jan 30, 2013

I always wear my wedding ring. It's my trademark.

Thanks all for the great advice! I started teaching myself the basics of C today. Coming from Java it's of course all very familiar which is great.

Just trying to wrap my head around pointers, memory management, etc... But as I have lots of time at the moment, I'm positive it'll work out.

If someone has any pointers (heh) about pointers, memory allocation or more, always welcome!

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


Speaking of pointers I've found the problem area referenced above but I'm having a brain fart and don't understand why it's causing the SIGBUSes.

C++ code:
unsigned char buf[4]; // actually function parameter, already filled--contains little endian data from file
int data = *((int*)buf); // crash here
When I change the assignment to using bit shifts in my usual fashion, it works fine (in fact, the incoming data is all zeroes for this test):
C++ code:
int data = ((int)buf[0]) |
   ((int)buf[1] << 8) |
   ((int)buf[2] << 16) |
   ((int)buf[3] << 24);
So what's wrong with that pointer chicanery up there, again?

(edit) The address of buf was something like 0xc1c1d9, which isn't on a 4 byte boundary... I wonder if the system hates that? This is on Solaris SPARC. That would explain why this crash doesn't happen all the time...

Ciaphas fucked around with this message at 21:23 on Aug 24, 2016

Edison was a dick
Apr 3, 2010

direct current :roboluv: only

Ciaphas posted:

Speaking of pointers I've found the problem area referenced above but I'm having a brain fart and don't understand why it's causing the SIGBUSes.

C++ code:
unsigned char buf[4]; // actually function parameter, already filled--contains little endian data from file
int data = *((int*)buf); // crash here
So what's wrong with that pointer chicanery up there, again?

I don't think a char[4] has the same alignment requirements as an int, so you could be trying an unaligned integer read.

nielsm
Jun 1, 2009



Ciaphas posted:

So what's wrong with that pointer chicanery up there, again?

(edit) The address of buf was something like 0xc1c1d9, which isn't on a 4 byte boundary... I wonder if the system hates that? This is on Solaris SPARC. That would explain why this crash doesn't happen all the time...

Yep, since it's char width it only gets aligned to a single byte boundary.

Also, someone can certainly talk to you about strict aliasing rules and why you should rather use memcpy() for that operation.
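
A rough sketch of the memcpy version, for reference. `load_int_native` is a made-up helper name, and it assumes the buffer holds at least `sizeof(int)` bytes. No `int*` ever points into the byte buffer, so both the alignment question and the aliasing question go away.

```cpp
#include <cassert>
#include <cstring>

// Copies sizeof(int) bytes from buf into an int.
// The bytes are interpreted in the host's byte order.
inline int load_int_native(const unsigned char* buf) {
    int data;
    std::memcpy(&data, buf, sizeof data);
    return data;
}
```

Note that this keeps the host's byte order, so little-endian file data read on a big-endian machine would still need a byte swap afterwards.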

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


I should mention that changing the code worked, and it smelled like poo poo so I would have done so anyway (I avoid Pointer Shenanigans whenever possible)--I just cop to not fully understanding why it sometimes works and sometimes doesn't. I'll have to look up "strict aliasing rules". Thanks awfully.

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

Ciaphas posted:

Speaking of pointers I've found the problem area referenced above but I'm having a brain fart and don't understand why it's causing the SIGBUSes.

C++ code:
unsigned char buf[4]; // actually function parameter, already filled--contains little endian data from file
int data = *((int*)buf); // crash here
When I change the assignment to using bit shifts in my usual fashion, it works fine (in fact, the incoming data is all zeroes for this test):
C++ code:
int data = ((int)buf[0]) |
   ((int)buf[1] << 8) |
   ((int)buf[2] << 16) |
   ((int)buf[3] << 24);
So what's wrong with that pointer chicanery up there, again?

(edit) The address of buf was something like 0xc1c1d9, which isn't on a 4 byte boundary... I wonder if the system hates that? This is on Solaris SPARC. That would explain why this crash doesn't happen all the time...
yeah so a pointer to an int means that if you ever dereference it, then sizeof(int) (probably 4) bytes need to be fetched from ram by the cpu. Some CPU architectures have a problem with reading 4 bytes at a time from a non-4-byte boundary. same for 2,4,8 byte length types. An array of chars has no restrictions on which byte it needs to begin or end; each element is a byte.
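
To make the boundary check concrete, a small sketch (helper names are made up, and it assumes the usual case where converting a pointer to `uintptr_t` yields the numeric address). The address from the post, 0xc1c1d9, leaves a remainder of 1 when divided by 4, so it is fine for a char but not for a 4-byte int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when addr is a multiple of alignment (alignment must be non-zero).
inline bool is_aligned(std::uintptr_t addr, std::size_t alignment) {
    return addr % alignment == 0;
}

// Checks a real pointer against the alignment the compiler reports for int.
inline bool is_aligned_for_int(const void* p) {
    return is_aligned(reinterpret_cast<std::uintptr_t>(p), alignof(int));
}
```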

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


On a not at all related to this code base note, I don't know the guy who wrote this code so can someone come up with an anthropomorphic personification of pointers for me to shoot instead, thanks

Dr Monkeysee
Oct 11, 2002

just a fox like a hundred thousand others
Nap Ghost
Is it a coding horror to rely on member field construction and destruction order to enforce a sequence of operations? It looks like the order is guaranteed by the standard but it also feels it may be a little too implicit for ease of understanding.

eth0.n
Jun 1, 2012

Dr Monkeysee posted:

Is it a coding horror to rely on member field construction and destruction order to enforce a sequence of operations? It looks like the order is guaranteed by the standard but it also feels it may be a little too implicit for ease of understanding.

Just make sure you list them in the right order in all your constructor initializers. The ordering might be implicit, but you can make it look explicit.
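
A minimal sketch of what that looks like, with hypothetical `Subsystem` and `Controller` types. Members are constructed in declaration order and destroyed in reverse, whatever order the initializer list uses, so writing the list in declaration order just makes the real sequence visible to the reader.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical subsystem that records when it starts and stops.
struct Subsystem {
    Subsystem(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("start " + name_);
    }
    ~Subsystem() { log_.push_back("stop " + name_); }
    std::string name_;
    std::vector<std::string>& log_;
};

struct Controller {
    explicit Controller(std::vector<std::string>& log)
        // Written in declaration order on purpose.
        : audio_("audio", log), video_("video", log) {}
    Subsystem audio_;  // declared first: constructed first, destroyed last
    Subsystem video_;  // declared second: constructed second, destroyed first
};

// Builds and tears down a Controller, returning the recorded sequence.
inline std::vector<std::string> run_controller() {
    std::vector<std::string> log;
    { Controller c(log); }
    return log;
}
```

GCC and Clang can warn (`-Wreorder`) when the initializer list order differs from the declaration order, which helps keep the two in sync.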

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
There aren't very many programming sins that can't be expiated with a good comment.

Captain Cappy
Aug 7, 2008

Dr Monkeysee posted:

Is it a coding horror to rely on member field construction and destruction order to enforce a sequence of operations? It looks like the order is guaranteed by the standard but it also feels it may be a little too implicit for ease of understanding.

I don't think there's a problem with it if you have warnings on or if it's a compiler error if people reorder it.

ExcessBLarg!
Sep 1, 2001

Ciaphas posted:

So what's wrong with that pointer chicanery up there, again?
As others mentioned, the issue is that SPARC doesn't support unaligned four-byte reads. Thus, in any instance where "buf" isn't on four-byte boundary attempting to perform a four-byte read will cause a CPU exception (unaligned trap). The OS at that point has two options, either to emulate the read as a sequence of one-byte reads (which, along with trap is painfully slow), or to send a signal (SIGBUS) to the userspace process.

This is actually a separate problem from strict aliasing, which is where the compiler assumes that two pointers of different types won't occupy the same memory. Usually pointer reinterpretation also violates strict aliasing rules, although "char *" is generally an exception to that, and so an unsigned char buffer might be as well. Either way it's bad practice.

There's a few better ways to handle this. One is to create a union type containing the unsigned char buffer and the integer, store the data in the unsigned char, and read it out of the integer:
code:
union {
    int data;
    unsigned char buf[sizeof(int)];
} databuf;

// Write the incoming data to &databuf.buf.
// Data is now available as an integer in databuf.data.
This is union type punning, and although the behavior is undefined by standard it's supported in all reasonable compilers and explicitly supported in GCC. There's a tutorial that explains the details a bit better, although in that specific example of using a union type to recast a pointer to an existing buffer could still result in an unaligned read in this case.

The other two options are to make a local "int data" variable and memcpy the contents of buf to it, and the one that you did which is to use bit shift/or operators to combine data.

One advantage of using shift operations is that it doesn't make any endian assumptions about the host architecture, which is important if the serialized data may have endianness that doesn't match it. In fact, the original code, if it didn't crash, probably would've interpreted the data wrong since Solaris SPARC is big endian.
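
As a sketch of the shift approach, here is a hypothetical `load_le32` helper that decodes four little-endian bytes. It works on `std::uint32_t` rather than `int`, which sidesteps any question about shifting a byte of 0x80 or higher into the sign bit. The result is the same on little- and big-endian hosts because only byte values and arithmetic are involved, never the in-memory layout of an integer.

```cpp
#include <cassert>
#include <cstdint>

// Decodes 4 little-endian bytes into a 32-bit unsigned value.
inline std::uint32_t load_le32(const unsigned char* buf) {
    return  static_cast<std::uint32_t>(buf[0])
         | (static_cast<std::uint32_t>(buf[1]) << 8)
         | (static_cast<std::uint32_t>(buf[2]) << 16)
         | (static_cast<std::uint32_t>(buf[3]) << 24);
}
```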

ExcessBLarg! fucked around with this message at 18:19 on Aug 27, 2016

sarehu
Apr 20, 2007

(call/cc call/cc)

Dr Monkeysee posted:

Is it a coding horror to rely on member field construction and destruction order to enforce a sequence of operations? It looks like the order is guaranteed by the standard but it also feels it may be a little too implicit for ease of understanding.

I'll chime in with a good solid "no." And sometimes it becomes an idiom.

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


ExcessBLarg! posted:

As others mentioned, the issue is that SPARC doesn't support unaligned four-byte reads. Thus, in any instance where "buf" isn't on four-byte boundary attempting to perform a four-byte read will cause a CPU exception (unaligned trap). The OS at that point has two options, either to emulate the read as a sequence of one-byte reads (which, along with trap is painfully slow), or to send a signal (SIGBUS) to the userspace process.

This is actually a separate problem from strict aliasing, which is where the compiler assumes that two pointers of different types won't occupy the same memory. Usually pointer reinterpretation also violates strict aliasing rules, although "char *" is generally an exception to that, and so an unsigned char buffer might be as well. Either way it's bad practice.

There's a few better ways to handle this. One is to create a union type containing the unsigned char buffer and the integer, store the data in the unsigned char, and read it out of the integer:
code:
union {
    int data;
    unsigned char buf[sizeof(int)];
} databuf;

// Write the incoming data to &databuf.buf.
// Data is now available as an integer in databuf.data.
This is union type punning, and although the behavior is undefined by standard it's supported in all reasonable compilers and explicitly supported in GCC. There's a tutorial that explains the details a bit better, although in that specific example of using a union type to recast a pointer to an existing buffer could still result in an unaligned read in this case.

The other two options are to make a local "int data" variable and memcpy the contents of buf to it, and the one that you did which is to use bit shift/or operators to combine data.

One advantage of using shift operations is that it doesn't make any endian assumptions about the host architecture, which is important if the serialized data may have endianness that doesn't match it. In fact, the original code, if it didn't crash, probably would've interpreted the data wrong since Solaris SPARC is big endian.

Yep we go back and forth between different endian architectures all the time so I learned to use the bit shift method long ago and it stuck. Still, thanks for the extra feedback

Dr Monkeysee
Oct 11, 2002

just a fox like a hundred thousand others
Nap Ghost

Captain Cappy posted:

I don't think there's a problem with it if you have warnings on or if it's a compiler error if people reorder it.

It wouldn't be catchable by the compiler. Think a series of subsystem startup/tear down steps that are semantically distinct, and therefore shouldn't all be mashed together into one function or type, but order matters. In addition they can easily be modeled as RAII.

The order of the controlling class's member fields, where each field is one of these subsystem classes, would enforce the order the subsystems are initialized and shutdown.

Seems like it'd work just fine but it's less explicit than calling a series of initialize_this, teardown_that functions. But I'm no c++ wizard so maybe this idiom is expected and unsurprising.

Dr Monkeysee fucked around with this message at 06:51 on Aug 28, 2016

Ralith
Jan 12, 2011

I see a ship in the harbor
I can and shall obey
But if it wasn't for your misfortune
I'd be a heavenly person today

Dr Monkeysee posted:

It wouldn't be catchable by the compiler. Think a series of subsystem startup/tear down steps that are semantically distinct, and therefore shouldn't all be mashed together into one function or type, but order matters. In addition they can easily be modeled as RAII.

The order of the controlling class's member fields, where each field is one of these subsystem classes, would enforce the order the subsystems are initialized and shutdown.

Seems like it'd work just fine but it's less explicit than calling a series of initialize_this, teardown_that functions. But I'm no c++ wizard so maybe this idiom is expected and unsurprising.
RAII instead of explicit init/deinit is generally considered a good idea. If you want to make things more explicit, consider having objects' constructors take a reference to the objects they depend upon.
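
A small sketch of that suggestion, using made-up `Logger`, `Database`, `Server` and `App` types. Each constructor takes a reference to what it depends on, so the required order shows up in the signatures instead of living only in the member declaration order.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Logger {
    void note(const std::string& msg) { lines.push_back(msg); }
    std::vector<std::string> lines;
};

// A Database cannot be built without a Logger.
struct Database {
    explicit Database(Logger& log) : log_(log) { log_.note("db up"); }
    ~Database() { log_.note("db down"); }
    Logger& log_;
};

// A Server cannot be built without a Logger and a Database.
struct Server {
    Server(Logger& log, Database& db) : log_(log), db_(db) { log_.note("server up"); }
    ~Server() { log_.note("server down"); }
    Logger& log_;
    Database& db_;
};

struct App {
    App() : db(logger), server(logger, db) {}
    Logger logger;   // declared first because db and server use it
    Database db;     // declared before server because server uses it
    Server server;
};
```

Member order still decides the actual sequence, but anyone reordering the fields now has the constructor arguments as a hint that something depends on something else.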

b0lt
Apr 29, 2005

ExcessBLarg! posted:

As others mentioned, the issue is that SPARC doesn't support unaligned four-byte reads. Thus, in any instance where "buf" isn't on four-byte boundary attempting to perform a four-byte read will cause a CPU exception (unaligned trap). The OS at that point has two options, either to emulate the read as a sequence of one-byte reads (which, along with trap is painfully slow), or to send a signal (SIGBUS) to the userspace process.

You don't need to emulate it as one byte reads, you can do two reads and reassemble them.

quote:

This is union type punning, and although the behavior is undefined by standard it's supported in all reasonable compilers and explicitly supported in GCC.

suncc doesn't, and if you're on SPARC, there's a distinct possibility you're stuck with it.

quote:

The other two options are to make a local "int data" variable and memcpy the contents of buf to it

This is the correct solution, any reasonable compiler will optimize this into The Right Thing.

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


b0lt posted:

suncc doesn't, and if you're on SPARC, there's a distinct possibility you're stuck with it.

suncc is the bane of my loving existence i've been fighting for years to get off this piece of poo poo platform and we are so close

ExcessBLarg!
Sep 1, 2001

b0lt posted:

You don't need to emulate it as one byte reads, you can do two reads and reassemble them.
Two vs. four reads sure, my point was that the OS handles this in a trap handler and that's why it's slow. Ran into similar problems frequently on Alpha Linux back in the day.

Interesting though that suncc doesn't support union type punning but optimizes away memcpy.

The Letter A
Nov 8, 2002

edit: The problem was not what I thought it was. Let me work it out a little bit and I'll come back with a better idea of what I'm up against

The Letter A fucked around with this message at 22:01 on Aug 29, 2016

Highblood
May 20, 2012

Let's talk about tactics.
What's the point of creating enum variables when you can just use the enum as is?

Sex Bumbo
Aug 14, 2004
The enum definition isn't a variable and therefore doesn't do things a variable does?

ExcessBLarg!
Sep 1, 2001

Highblood posted:

What's the point of creating enum variables when you can just use the enum as is?
To communicate intent for the variable.

raminasi
Jan 25, 2005

a last drink with no ice
I'm trying to build a thing on OSX using makefiles, which is new for me. It's the kind of project with a labyrinth of intermediate files generated by CMake. I got CMake to configure and generate fine, but when I actually run make to build it, it won't link - it can't find a bunch of symbols that should be located in dependency libraries. I found what I'm pretty sure is the linker arguments string in the torrent of stuff CMake generated, and I can see the correct libraries in it (and again, CMake didn't complain during generation). I'm used to Windows, and have no instincts here - any ideas? The specific linker error is "ld: symbol(s) not found for architecture x86_64" after a list of the missing symbols.

What's infuriating is that this exact codebase built fine yesterday, but I wanted to rearrange things to make my dependencies a little more organized, so I started over from scratch (re-clone, etc) with what I thought was one simple change in the configuration - a moved directory. I even tried again with it moved back and it didn't work.

nielsm
Jun 1, 2009



The cross-compiling and multi-architecture binaries involved in macOS applications are very confusing. There are some flags to the C/C++/ObjC compiler to specify what arch(s) you want to include, so make sure every dependency includes at least what is required for the main application.

raminasi
Jan 25, 2005

a last drink with no ice
What do you mean? I'm not controlling anything about the dependencies - I just unzip a thing and point CMake at it.

Doc Block
Apr 15, 2003
Fun Shoe
You've built a 64-bit x86 slice (OS X allows multiple archs in the same binary), but either haven't linked everything to that slice (IDK if that's even possible) or you've got some prebuilt libraries you're linking against that don't have x86-64 slices.

Doc Block fucked around with this message at 23:07 on Sep 3, 2016

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

Highblood posted:

What's the point of creating enum variables when you can just use the enum as is?
This question is approximately the same as "what's the point of creating int variables when you can just use integer constants?"
Edit: or maybe #defined values would be more equivalent to using an enum without variables.

raminasi
Jan 25, 2005

a last drink with no ice

Doc Block posted:

You've built a 64-bit x86 slice (OS X allows multiple archs in the same binary), but either haven't linked everything to that slice (IDK if that's even possible) or you've got some prebuilt libraries you're linking against that don't have x86-64 slices.

I just checked the prebuilt binaries with file, and it says "Mach-O 64-bit dynamically linked shared library x86_64," so I think I've got the right architectured ones, anyway. I also dumped the symbols of one of them with nm, but I can't really tell whether there are matches or not because ld gives me de-mangled symbols and nm gives me mangled ones. There are definitely plausible match candidates. I feel like there's just some stupid switch I need to set that I forgot about but I'll be damned if I can figure out what it is.

ExcessBLarg!
Sep 1, 2001

roomforthetuna posted:

Edit: or maybe #defined values would be more equivalent to using an enum without variables.
I think Highblood's point is that once an enum is defined you can assign it to a variable of any integer type, so why bother explicitly declaring variables as type "enum foo" when you could use "int" instead. (Again, it's to communicate intent.)
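
A short sketch of the "communicate intent" point, with a hypothetical `Color` enum. The enum-typed parameter says which values are expected, and the `int` version compiles just as well but leaves the reader guessing what 0, 1 and 2 mean.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum, for illustration only.
enum Color { RED, GREEN, BLUE };

// The parameter type documents which values are meaningful here.
inline std::string color_name(Color c) {
    switch (c) {
        case RED:   return "red";
        case GREEN: return "green";
        case BLUE:  return "blue";
    }
    return "unknown";
}

// Same logic behind an int: nothing in the signature says what the number means.
inline std::string color_name_from_int(int c) {
    return color_name(static_cast<Color>(c));
}
```

In C++ the enum type also buys some checking: an `int` does not convert to `Color` without a cast, and GCC and Clang can warn (`-Wswitch`) when a switch over the enum misses an enumerator. C is looser about the conversion.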

I do think it's unfortunate that enum values aren't namespaced to the defining type. It means all enum values have to be unique within a translation unit. Perhaps it has to be this way since promotion rules make it ambiguous "what enum type" the result of arithmetic operations are, unless it's always just an integer.

As for enums vs. #defines, enums make it into debugging info whereas #defines don't pass the preprocessor.

Doc Block
Apr 15, 2003
Fun Shoe

raminasi posted:

I just checked the prebuilt binaries with file, and it says "Mach-O 64-bit dynamically linked shared library x86_64," so I think I've got the right architectured ones, anyway. I also dumped the symbols of one of them with nm, but I can't really tell whether there are matches or not because ld gives me de-mangled symbols and nm gives me mangled ones. There are definitely plausible match candidates. I feel like there's just some stupid switch I need to set that I forgot about but I'll be damned if I can figure out what it is.

The error message you got specifically means that the linker can't find all the necessary symbols to link the x86-64 binary slice, which means something isn't getting compiled for x86-64.

You should be able to look at the list of missing symbols and figure out what library or whatever they're supposed to come from, then check that to make sure it's getting built for x86-64.

Ralith
Jan 12, 2011

I see a ship in the harbor
I can and shall obey
But if it wasn't for your misfortune
I'd be a heavenly person today

ExcessBLarg! posted:

I do think it's unfortunate that enum values aren't namespaced to the defining type. It means all enum values have to be unique within a translation unit. Perhaps it has to be this way since promotion rules make it ambiguous "what enum type" the result of arithmetic operations are, unless it's always just an integer.

What? Scoped enums have been a thing since C++11, and it has never been the case that "all enum values have to be unique within a translation unit".
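
For anyone who hasn't used them, a minimal sketch of C++11 scoped enums (the `Fruit` and `Signal` names are made up). Both have an enumerator called `None` without clashing, and neither converts to `int` implicitly.

```cpp
#include <cassert>
#include <type_traits>

// Enumerator names live inside each enum's scope.
enum class Fruit  { None, Apple, Orange };
enum class Signal { None, Red, Green };

// Scoped enums need an explicit cast to get at the underlying number.
inline int to_int(Fruit f)  { return static_cast<int>(f); }
inline int to_int(Signal s) { return static_cast<int>(s); }

static_assert(!std::is_convertible<Fruit, int>::value,
              "scoped enums do not implicitly convert to int");
```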

raminasi
Jan 25, 2005

a last drink with no ice

Doc Block posted:

The error message you got specifically means that the linker can't find all the necessary symbols to link the x86-64 binary slice, which means something isn't getting compiled for x86-64.

You should be able to look at the list of missing symbols and figure out what library or whatever they're supposed to come from, then check that to make sure it's getting built for x86-64.

Yes, that is what I did, and they are, assuming that's what that output from file means.

Is the fact that they're dynamic libraries wrinkling something here?

ExcessBLarg!
Sep 1, 2001

Ralith posted:

What? Scoped enums have been a thing since C++11,
That's cool. They're not in C though.

Ralith posted:

and it has never been the case that "all enum values have to be unique within a translation unit".
Enum labels have to be unique, not values, sorry.

Doc Block
Apr 15, 2003
Fun Shoe

raminasi posted:

Yes, that is what I did, and they are, assuming that's what that output from file means.

Is the fact that they're dynamic libraries wrinkling something here?

Is it actually getting linked against those libraries, then? Is every symbol from that library missing, or just some of them?

I wouldn't think them being dynamic libraries would affect compile-time linking unless the linker can't find them (or isn't being told to link to them).
