Coding Horrors: You can gather all your technical debt into one easy framework!

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Coding Horrors: You can gather all your technical debt into one easy framework!

Steve French: Sep 8, 2003

Plorkyeran posted:

Why do you think s does not point to an element in an array? The compiler can see that you're doing things which are legal if it points to an array and illegal if it does not, therefore it knows that s points to an array. If you then proceed to pass a non-array as s you've formed an invalid program.

You are also not reading my statements carefully enough.

This is what I'm saying:

A -> B
~B
therefore,
~A

You think I'm saying:

A -> B
...A?
therefore,
B!

A = the aforementioned strawman explanation for why s - 1 is undefined
B = strlen on strings longer than length 1 is undefined

# ? Feb 5, 2014 23:25

Steve French: Sep 8, 2003

ShoulderDaemon posted:

This isn't how undefined assumptions work in compilers. The compiler does know that, because if you didn't pass a large enough array into the function, then the result would be undefined, and the compiler is allowed to assume that your program is well-defined. Compilers don't check to make sure that a program is well-defined, they simply assume that it is and perform optimizations within that context, because it doesn't matter if the optimizations break not-well-defined programs; those programs were already undefined according to spec anyway.

Right, so if we're going on the idea that a compiler always assumes that the program is well defined, said compiler should, while compiling my palindrome implementation, assume that the value input for s does not point to the very first element of the original array, because:

C code:

  char *s = "apoop";
  palindrome(s + 1);

should be perfectly valid.

So at the very least, the "undefined behavior" assertion should be qualified to the condition that not only does len == 0, but s as passed in points to the start of an array.

# ? Feb 5, 2014 23:30

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

That is a well defined C program. It's just not a remotely sane palindrome function. I did in fact point out that it was well formed as long as you didn't pass a pointer to the first element way back when this discussion started.

# ? Feb 5, 2014 23:33

Dren: Jan 5, 2001; Pillbug

I did not know that pointer math outside the bounds of the array is undefined behavior. Steve French's original code is only undefined in the case where len is zero, right? So if a check for s being NULL or len being 0 were added to the beginning of the function would it be good to go?

# ? Feb 5, 2014 23:35

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Steve French posted:

You are also not reading my statements carefully enough.

Well since you brought up 6.5.6.7 several times I assumed you thought it was somehow relevant, and made my best guess as to why you would think that. As apparently I was wrong, perhaps you would like to explain how it is relevant to strlen?

# ? Feb 5, 2014 23:36

Scaevolus: Apr 16, 2007

Steve French is right. The given strncpy code:

Steve French posted:

C code:

char *
STRNCPY (char *s1, const char *s2, size_t n)
{
  char c;
  char *s = s1;

  --s1;
...

This creates a pointer to one element before the start of the object, which a fully compliant C implementation is allowed to trap on. In particular, a hardware implementation of Baggy Bounds Checking that uses tagged pointers would trigger a CPU exception when the pointer underflows.

Scaevolus fucked around with this message at 23:41 on Feb 5, 2014

# ? Feb 5, 2014 23:36

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Dren posted:

I did not know that pointer math outside the bounds of the array is undefined behavior. Steve French's original code is only undefined in the case where len is zero, right? So if a check for s being NULL or len being 0 were added to the beginning of the function would it be good to go?

A check for len = 0 would work, checking for s being NULL would not.

# ? Feb 5, 2014 23:37

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

glibc is pretty loving terrible in general btw, and it doing something is in no way evidence that it is a sane, correct or useful thing to be doing

# ? Feb 5, 2014 23:38

Scaevolus: Apr 16, 2007

Plorkyeran posted:

glibc is pretty loving terrible in general btw, and it doing something is in no way evidence that it is a sane, correct or useful thing to be doing

In this case, performing the undefined behavior we were discussing on almost all of its inputs.

# ? Feb 5, 2014 23:42

Dren: Jan 5, 2001; Pillbug

Plorkyeran posted:

A check for len = 0 would work, checking for s being NULL would not.

Oh, I meant check (len == 0 || !s) and bail if true because if someone passes a NULL in there and he dereferences it that would be undefined too.

# ? Feb 5, 2014 23:43

Steve French: Sep 8, 2003

Plorkyeran posted:

Well since you brought up 6.5.6.7 several times I assumed you thought it was somehow relevant, and made my best guess as to why you would think that. As apparently I was wrong, perhaps you would like to explain how it is relevant to strlen?

I had been trying to identify the most specific parts of the C standard that dictated the undefined behavior; I had recalled someone mentioning that clause (though now looking back on the thread I realize it must have been someone in IRC rather than in this thread), and that was one interpretation that concludes undefined behavior for s - 1. I then asked if others believed that interpretation was correct, and then posted an explanation of why it must be incorrect, based on the fact that it would also imply strlen's behavior is undefined for the vast majority of inputs.

I am not saying strlen's behavior is not defined.
I am not saying my palindrome function's behavior is defined on all inputs (and I sure never intended to suggest that it was what I would actually write were I charged in a real situation with implementing such a function, though I think calling it "not remotely sane" might be a bit of a stretch)

I'm done with this enormous derail that I've caused. I, at least, found it entertaining and somewhat informative.

# ? Feb 5, 2014 23:47

Vanadium: Jan 8, 2005

Steve French posted:

Okay, let me recap and make sure I'm understanding this correctly. Please correct any misunderstandings or inaccuracies.

The C standard says, in 6.5.6, regarding additive operators:

[...]

So, in my code, the char *s parameter, in the context of that function, does not point to an element in an array, so it behaves like a pointer to the first element of an array of length one. Results of the operation that point to anything that is not in the array or one past the end of the array are undefined, so s - 1 is undefined.

Is this correct?

How are you calling your palindrome function so that the pointer does not point to an element in an array?

# ? Feb 5, 2014 23:51

Vanadium: Jan 8, 2005

I mean yes char c = ???; strlen(&c); is probably undefined, except maybe if c == '\0', I dunno

# ? Feb 5, 2014 23:57

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Scaevolus posted:

In this case, performing the undefined behavior we were discussing on almost all of its inputs.

Makes for a decent illustration that code which is undefined can still work in practice. None of the platforms anyone would use glibc on will put stuff in a place where subtracting one from the beginning will break naturally, and it'd be very hard for the optimizer to cause problems. An auto-vectorizer could probably break it, but those tend to also break valid code, so that's not very exciting.

# ? Feb 6, 2014 00:13

Flownerous: Apr 16, 2012

Vanadium posted:

How does all of this interact with taking a char* pointer to the beginning of a large struct and then accessing it byte for byte?

Yeah does the type of the pointer change things? Or is it well-defined as long as you stay within the struct or 1 past the end, regardless of the type of the pointer?

# ? Feb 6, 2014 00:15

Dessert Rose: May 17, 2004; awoken in control of a lucid deep dream...

Flownerous posted:

Yeah does the type of the pointer change things? Or is it well-defined as long as you stay within the struct or 1 past the end, regardless of the type of the pointer?

The latter. You are essentially reinterpreting the chunk of memory that the struct occupies as an array of char. One past the end is valid here.

Whether the struct makes any sense when read this way is another matter, of course.

# ? Feb 6, 2014 00:27

coffeetable: Feb 5, 2006; TELL ME AGAIN HOW GREAT BRITAIN WOULD BE IF IT WAS RULED BY THE MERCILESS JACKBOOT OF PRINCE CHARLES

YES I DO TALK TO PLANTS ACTUALLY

Gotta say, programming would be a lot more entertaining if gcc did actually pick random bytecode to spit out every time it encountered undefined code.

# ? Feb 6, 2014 01:01

Dessert Rose: May 17, 2004; awoken in control of a lucid deep dream...

coffeetable posted:

Gotta say, programming would be a lot more entertaining if gcc did actually pick random bytecode to spit out every time it encountered undefined code.

Even better would be if it had a table of "fun" things to emit and chose randomly from it.

# ? Feb 6, 2014 01:05

TheFreshmanWIT: Feb 17, 2012

So here is an interesting defect that sorta plays into the UB discussion above. We write C++ code that runs on a few platforms: Win Desktop, WinRT, Android, and iOS. The result is we have a set of 'base' C++ code that compiles on every OS. One such component is an open source library that we've purchased a commercial (non-GPL'ed) version of. The following code is a very simplified/anonymized reproduction of that:

code:

template <class T>
static inline const std::string int2string(T value, const int base = 10)
{
	// STUFF, returns a to-string'ed version of an int
}

class WhateverClass
{

	void whateverFunction()
	{
		// Logic to do other things...
		std::string var = int2string(magicBool ? 1 : 0);

		// Reference point #1

		// other logic...
	}


	private:
	bool magicBool;
};

The main defect you may see (I'll mention it so you don't assume it elsewhere) is that magicBool ended up being uninitialized when WhateverClass used 1 of the 2 constructors.

On 3 of the 4 platforms, this worked perfectly, var was always a "1" at reference point #1. The fact that it was uninitialized wasn't a big deal, since we wanted it to be a "1" ANYWAY, so it was a very rare defect.

HOWEVER, on Android, "var" was coming back as arbitrary values! It would be "74", "122", "3", "99", or whatever. The "var" was being used to assemble a network string, so this was causing the server to kick us for sending a bad header.

While the uninitialized value is a serious issue that I've fixed (along with specifically setting magicBool to true when we want it), what I found interesting was the optimizer turning "magicBool ? 1 : 0" into JUST "magicBool". The developer's intent was to convert a bool to a simple int (then string) representation, however the optimizer seems to have decided that the 1:0 are actually booleans!

I've changed the var assignment to be var = magicBool ? "1":"0", initialized magicBool, etc. Found this interesting all the same.

# ? Feb 6, 2014 01:11

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

Assuming that, in the underlying representation, boolean true is represented by 1, while boolean false is represented by 0, it's a sensible optimization to just use that value directly instead of doing a comparison and branch to achieve the same result.

And we know that the boolean is either 0 or 1, because otherwise the program would be undefined and it doesn't matter anyway.

# ? Feb 6, 2014 01:20

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

My favorite result of uninitialized bools is:

C++ code:

$ cat bool.cpp
#include <cstdio>

int main() {
  bool foo;
  if (foo)
    printf("true\n");
  if (!foo)
    printf("false\n");
}

$ g++ bool.cpp && ./a.out
true
false

# ? Feb 6, 2014 01:55

pseudorandom name: May 6, 2007

And that right there is why you must never use bool in a wire format.

# ? Feb 6, 2014 01:57

Jethro: Jun 1, 2000; I was raised on the dairy, Bitch!

Steve French posted:

Which would be awfully nice for the compiler to know, but it doesn't.

Whether or not a pointer points to an element of an array is something determined at runtime based on what the pointer actually points to. The fact that the compiler doesn't (and can't) know whether the pointer points to an element of an array is why section 6.5.6 is written how it is, and it's also why the behavior in question is undefined, as opposed to an error.

# ? Feb 6, 2014 02:03

ewe2: Jul 1, 2009

Plorkyeran posted:

My favorite result of uninitialized bools is:

C++ code:

$ cat bool.cpp
#include <cstdio>

int main() {
  bool foo;
  if (foo)
    printf("true\n");
  if (!foo)
    printf("false\n");
}

$ g++ bool.cpp && ./a.out
true
false

Interesting. I tried that out:

code:

$ clang++ -fsanitize=bool bool.cpp && ./a.out
false

$ g++ bool.cpp && ./a.out
false

Guess both compilers made a decision there.

edit: ah figured it out. That g++ bug is a 32-bit result. Mine were 64bit results. Clang gets the same answer in both 32bit and 64bit.

ewe2 fucked around with this message at 03:27 on Feb 6, 2014

# ? Feb 6, 2014 03:23

Scaevolus: Apr 16, 2007

ewe2 posted:

edit: ah figured it out. That g++ bug is a 32-bit result. Mine were 64bit results. Clang gets the same answer in both 32bit and 64bit.

It's not a compiler bug, it's undefined behavior. In this case, clang/g++ can optimize the check out entirely, since it can see that the value is undefined at that point.

Scaevolus fucked around with this message at 03:38 on Feb 6, 2014

# ? Feb 6, 2014 03:31

rjmccall: Sep 7, 2007; no worries friend; Fun Shoe

Uh. Don't give us too much credit here. It's unoptimized code, kernels zero-initialize the stack by default, and this is main so there may just not have been anything pushed this deep on the stack yet.

If you were using iostreams instead of stdio so that you had more global initializers running, maybe then the stack would be partly scribbled on; as it is, the variable might as well be correctly initialized to false. Nobody does data flow analysis at -O0.

You would not believe the number of uninitialized-variable bugs that go uncaught because they happen to occur in code that runs first thing in the process.

# ? Feb 6, 2014 09:02

seiken: Feb 7, 2005; hah ha ha

How haven't I heard about Espruino until now?

Let's run Javascript on a microcontroller. Let's reparse everything every time it's executed. Don't put comments or whitespace in loops, it'll slow things down! Watch out, arrays and objects are linked lists so lookup is O(n)! We're all about saving ram, but bools are going to take 20 bytes unless you put everything in a string!

This might be the worst programming trainwreck I've ever seen. How on Earth did it get Kickstarter'd for £100000?

# ? Feb 6, 2014 17:23

Internet Janitor: May 17, 2008; "That isn't the appropriate trash receptacle."

Peering into the source code proves interesting, if not surprising:

https://github.com/espruino/Espruino/blob/master/src/jsutils.c#L299

# ? Feb 6, 2014 17:40

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

Did they really just rewrite strtod, poorly?

# ? Feb 6, 2014 17:54

Scaevolus: Apr 16, 2007

rjmccall posted:

Uh. Don't give us too much credit here. It's unoptimized code, kernels zero-initialize the stack by default, and this is main so there may just not have been anything pushed this deep on the stack yet.

If you were using iostreams instead of stdio so that you had more global initializers running, maybe then the stack would be partly scribbled on; as it is, the variable might as well be correctly initialized to false. Nobody does data flow analysis at -O0.

Whoops, I was looking at assembly listings at -O2, which exploits the undefined behavior to simplify the code.

# ? Feb 6, 2014 19:03

feedmegin: Jul 30, 2008

TheFreshmanWIT posted:

So here is an interesting defect that sorta plays into the UB discussion above. We write C++ code that runs on a few platforms: Win Desktop, WinRT, Android, and iOS. The result is we have a set of 'base' C++ code that compiles on every OS. One such component is an open source library that we've purchased a commercial (non-GPL'ed) version of.

Qt developer spotted?

# ? Feb 6, 2014 19:29

Fuck them: Jan 21, 2011; and their bullshit

Experienced dev checking in broken code that is completely necessary for the work flow of our web app to test what we're doing: horror of coding or horror of version control?

The skype-bitching about it between him and the team lead is the best thing ever, though.

Also the fact that our testing involves going through the workflow of the web app instead of proper testing/TDD is probably the true horror.

:q:

# ? Feb 7, 2014 16:48

Hughlander: May 11, 2005

2banks1swap.avi posted:

Experienced dev checking in broken code that is completely necessary for the work flow of our web app to test what we're doing: horror of coding or horror of version control?

The skype-bitching about it between him and the team lead is the best thing ever, though.

Also the fact that our testing involves going through the workflow of the web app instead of proper testing/TDD is probably the true horror.

Proper testing would include going through the workflow. End to end testing or acceptance testing are still a form of testing.

# ? Feb 7, 2014 17:03

necrotic: Aug 2, 2005; I owe my brother big time for this!

Hughlander posted:

Proper testing would include going through the workflow. End to end testing or acceptance testing are still a form of testing.

Yup. Unit testing is on the code level, testing a specific unit of work (i.e. a function). Integration/functional testing is the whole shebang, which can also be automated with Selenium or something similar. If you do automate it you still want some form of "hey, I was testing this page but here's what changed", even if it's just an image-diff.

# ? Feb 7, 2014 17:20

Macichne Leainig: Jul 26, 2012; by VG

This is more mild than horror, but I still looked at it and went "what."

code:

<Grid.ColumnDefinitions>
    <ColumnDefinition Width="2*"></ColumnDefinition>
    <ColumnDefinition Width="4*"></ColumnDefinition>
</Grid.ColumnDefinitions>

Basically, in xaml, the number* in a scenario like this is a ratio of sizes. Column 0 is 2 parts wide, column 1 is 4 parts wide, for 6 parts total. Or, more logically, Column 0 is 1/3rd and column 1 is 2/3rds.

Macichne Leainig fucked around with this message at 17:49 on Feb 7, 2014

# ? Feb 7, 2014 17:30

shrughes: Oct 11, 2008; (call/cc call/cc)

The horror is your sloppy unindented first line quoting.

# ? Feb 7, 2014 17:46

Macichne Leainig: Jul 26, 2012; by VG

shrughes posted:

The horror is your sloppy unindented first line quoting.

That's more I didn't copy the spaces in front of the first line. I suppose I can do Visual Studio's magic alt-click dragging.

Fake edit: And there, now it's properly indented. :colbert:

Fake edit 2: Also not shown is the 2px high row that's used to separate some elements visually.

Macichne Leainig fucked around with this message at 17:52 on Feb 7, 2014

# ? Feb 7, 2014 17:48

shrughes: Oct 11, 2008; (call/cc call/cc)

Tha Chodesweller posted:

That's more I didn't copy the spaces in front of the first line.

Yes that's what I was referring to. You horrorble person.

# ? Feb 7, 2014 18:03

Macichne Leainig: Jul 26, 2012; by VG

I am a horrorble person. I was just looking at the error logging framework I wrote 4 months ago and just went, "holy hell this is a pain in the rear end to use."

I made it so you had to declare a new custom error object every time you wanted to report an error, instead of just tossing a function the necessary information. It got very confusing to go ErrorFramework.ReportError(new Error { Category = Category.Fatal, Message = "poo poo BROKE" } ); every time something went wrong. (That isn't the full syntax, my custom error object isn't just named error and has quite a few more properties.)

Isn't looking back at poo poo you did a while back the best kind of coding horror? :allears:

# ? Feb 7, 2014 19:08
