rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
LLVM actually does two kinds of TCO: an IR-level optimization which rewrites recursive tail calls into loops, and a machine-level optimization which turns tail calls into jumps. That page is only talking about the latter; the former is target-independent and done at -O1 and higher.
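The IR-level transformation is easy to see at source level. A minimal illustration (function names are mine), assuming a self-recursive call in tail position:

```cpp
#include <cstdint>

// A self-recursive tail call: at -O1 and up, LLVM's IR-level pass
// rewrites this into a loop, independent of the target architecture.
uint64_t gcd_recursive(uint64_t a, uint64_t b) {
    if (b == 0) return a;
    return gcd_recursive(b, a % b);  // tail position: nothing happens after the call
}

// What the optimizer effectively produces: the same function as iteration,
// with the recursive call's arguments becoming loop-carried updates.
uint64_t gcd_iterative(uint64_t a, uint64_t b) {
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}
```

The machine-level optimization is different: it leaves the call structure alone but emits a jump instead of a call when the callee's result is returned directly, which is target- and calling-convention-dependent.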

Adbot
ADBOT LOVES YOU

rjmccall

mjau posted:

If you know that at least one of the strings are valid (eg it's a string constant), strcmp is just as safe as strncmp.

Both strings need to be valid unless (1) you are guaranteed that the possibly-invalid string, if invalid, has at least as many meaningful characters as the valid one or (2) you don't mind reading past the end of a string and either getting garbage results or crashing.

Also, there are very few excuses in this day and age to be using a string representation that doesn't pass around the string length.
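A minimal sketch of what "passing around the string length" buys you (the `StringRef` type here is an invented stand-in, not any particular library's):

```cpp
#include <cstring>
#include <cstddef>

// A length-carrying string view: comparisons never need to hunt for a
// terminator, so they cannot read past the end of either buffer.
struct StringRef {
    const char *data;
    size_t len;
};

// True iff the two refs denote the same character sequence.
// Works even for strings with embedded NULs, which strcmp cannot handle.
bool equals(StringRef a, StringRef b) {
    return a.len == b.len && std::memcmp(a.data, b.data, a.len) == 0;
}
```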

rjmccall

mjau posted:

Well, sure, but strncmp won't help you there. If you just compare against a prefix of the known valid string, you'll get invalid results.

Well, we're talking about strncmp and fixed-size buffers here. Using a single call to strncmp to determine semantic equality only works if you're limiting the number of characters of precision anyway. Otherwise, you need some sort of fallback if strncmp returns equal and one of the strings is longer than a single buffer.

rjmccall
Technically it can if you run out of memory.

rjmccall

Lexical Unit posted:

Note that it's completely possible and straight forward to use non-static methods for events.

You tend to see stuff like this a lot when someone couldn't figure out the syntax for taking a member pointer. Which is not completely unreasonable.

rjmccall

Zombywuf posted:

C does it as well :-)

code:
char *myarray = ((char *)malloc(end_index - start_index + 1)) - start_index;

I had to call short a two-week vacation in Portugal just to tell you that this is technically a violation of the standard.

rjmccall
That's not actually true; inner classes are members of their enclosing class and therefore have the normal access privileges for members, which they extend transitively to their members. But the reverse definitely holds, which is usually obnoxious and just forces your iterators to friend their containers.

rjmccall fucked around with this message at 23:12 on Oct 20, 2010

rjmccall
Curly braces inside parentheses are how you get functional composition of expressions in C.

code:
main(x,y) long x, y; {
  return ({ static void*data; for (; --x; puts(*(typeof(*"")**)(
    {long z = y + ({long a = x; a += x; a += a; a += a * ((1L << 32) >> 32); }); &z[data]; }))); 0; });
}

rjmccall
Yeah, the full inventory is three GCC extensions, two unportable assumptions, either one or three instances of undefined behavior, and an illegal declaration of main.

EDIT: Well, plus an extra unportable assumption mixed in with the undefined behavior.

rjmccall fucked around with this message at 06:16 on Nov 21, 2010

rjmccall
For what it's worth, those functions do change program semantics: they change evaluation order so that the array is allocated after the operands are evaluated, rather than before. Why that would be desirable, I don't know.

rjmccall
Kilson: that's completely equivalent to the standard array-initializer expression, yes.

rjmccall
One gets created when evaluating the RHS of the assignment; another gets created when evaluating the LHS of the assignment (i.e., creating an object to assign into). It's not possible to elide copy-assignments, so that's the bare minimum.

libstdc++'s std::map requires two more temporaries because it actually implements operator[] in terms of insert(it, value_type(key, mapped_type())) when the key isn't found, which creates a temporary key and *two* temporary values.
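One observable consequence of this insert-if-missing design can be sketched directly (the exact number of temporaries varies by implementation, but the default-construction is mandated):

```cpp
#include <map>
#include <string>

// operator[] on std::map must be able to assign into a value that may not
// exist yet, so a missing key is first inserted with a value-initialized
// mapped_type; the assignment then copies over that fresh element.
std::map<std::string, int> make_example() {
    std::map<std::string, int> m;
    m["answer"] = 42;   // inserts ("answer", 0), then assigns 42 over it
    (void)m["missing"]; // a mere lookup of an absent key still inserts it
    return m;
}
```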

EDIT: beaten quite badly.

rjmccall

Lumpy posted:

"Depreciated" was a poor choice of wording on my part.

While depreciate and deprecate have some overlap in ordinary English, only deprecate is accepted technical jargon.

rjmccall
They're still well short of Turing complete.

rjmccall

ShoulderDaemon posted:

You can embed arbitrary Perl expressions in a Perl regexp.

Huh. Things I did not know.

It looks like it bounds recursive depth before advancement; I'm not certain you could actually simulate a Turing machine with that other than by cheating.

rjmccall
It's old ugliness. Objective-C is a C language extension, and C didn't get a native boolean type until C99.

rjmccall

mjau posted:

Union members are allowed to alias even with strict aliasing optimization switched on. That's kind of what they're for.

This is in fact a very exciting and lively debate on the C committee. Neither C nor C++ actually permits aliasing through a union. Unions have an "active member", determined by which member was last stored through, and it is impermissible to read from any member other than the active member, with one exception: if the member accessed and the active member are both structs, and they share a common prefix, it is okay to access a member of that common prefix. However, some people have argued that unions ought to influence aliasing somehow, except that nobody seems quite certain what the language rule should be.

Anyway, char can alias anything, so if the code instead just casted the double* to char* and did that final loop, it would be perfectly legal by both standards.
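The legal char-based version looks like this — a sketch assuming 64-bit IEEE 754 doubles (the function name is mine):

```cpp
#include <cstring>
#include <cstdint>

// Reading an object's representation through (unsigned) char* is always
// permitted: char may alias anything. Type-punning by reading a non-active
// union member is what the standards don't allow.
uint64_t bits_of(double d) {
    static_assert(sizeof(double) == sizeof(uint64_t), "assumes 64-bit double");
    uint64_t out = 0;
    const unsigned char *p = reinterpret_cast<const unsigned char *>(&d);
    std::memcpy(&out, p, sizeof out);  // a byte copy; no aliasing violation
    return out;
}
```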

rjmccall
There is a difference, but it's based on the fuzzy rule of Liskov substitution. Or rather, it's based on the quite solid rule of Liskov substitution, but languages generally have other rules which interact poorly with such substitutions, so instead we have to base this around vague ideas of what sane programmers would actually do.

So for example, Java's implicit conversion from int to long does not universally satisfy Liskov substitution because using an int instead of a long changes the type of an expression, which can impact overload resolution; but if you don't do semantically-distinguished overloads, it shouldn't matter. Similarly, upcasts can change overload resolution and member lookup, but if you're sane, it won't matter. But float->int conversions violate substitution basically as a matter of course.

rjmccall

nielsm posted:

But apparently C# Struct var; is equivalent to something like:
char buffer[sizeof(Struct)];
Struct &var = *(Struct*)(void*)buffer;


Presumably with adequate alignment.

Also, goons, "object" is a language-defined term. A local variable in Java is not an object. Pretty much everything in C++ is an object (except functions, references, and unconstructed storage). The end.

rjmccall

tractor fanatic posted:

I think that's legal C++, because of some weird rules regarding const correctness and strings. Codepad says this

code:
char * what = "hello";
is just a warning.

For backward compatibility, C++03 has a deprecated implicit conversion which discards the const when the source is a string literal. C++11 removes this.

The strchr signature is a wart that is basically required by a lack of expressivity in the C type system.

rjmccall

TasteMyHouse posted:

I think we can agree that what they did there was really bad right? I was kind of pissed at it

There is a camp that claims that modifying const objects should be okay as long as you put them back the way you found them. It's not completely irrational, but it's obviously something to be very careful about, and it's still not a great idea.

rjmccall

SlightlyMadman posted:

But clearly, if those circumstances aren't present at the exact moment that I'm writing some code, they're not worth taking into consideration.

There is, in fact, plenty of code which manipulates objects which are guaranteed to be (1) dynamically allocated and (2) accessed only by a single thread. Often both of those constraints are close to inherent to the domain.

Again, I think this sort of code is very dangerous, particularly with string data, and it's poor style in the sense that it's a potentially-unexpected constraint on the calling code. But if you accept that the caller and callee are already tightly coupled, and your domain has made the above constraints tenable, then I would argue it's not a completely unreasonable interpretation of the const contract.

rjmccall

BonzoESC posted:

Can we say with confidence that the real horror is the usage of notoriously tricky and subtle languages like C and C++ in 2011?

There are analogies in any language with mutable state. I can certainly have (say) a List in Java which I pass down the stack with the informal understanding that the callee won't add anything to it. Under such an understanding, it's a point of interpretation whether that means "don't modify it at all" or "you can modify it as long as you undo those changes before you return", and there can be good reasons to do either.

The biggest difference is that C and C++ actually let you formalize (and therefore type-check) the mutability contract, which makes this point of interpretation kind of a question about the language.

rjmccall
I meant that the return type really ought to be const-qualified if and only if the first argument is. Given that that's not expressible in the type system, I tend to side with the committee that it's better to lose const-safety than to make a *very* common task in C string processing this awkward.

ETA: but you're right that that's a legal workaround
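C++ can express that dependency with an overload pair, which is how `<cstring>` exposes `strchr` there. A simplified sketch of the pattern (my function name; unlike real `strchr`, it doesn't match the terminator):

```cpp
#include <cstddef>

// The C signature char *strchr(const char *, int) launders away const
// because C can't say "the return is const-qualified iff the argument is."
// C++ says exactly that with two overloads:
const char *find_char(const char *s, char c) {
    for (; *s; ++s)
        if (*s == c) return s;
    return nullptr;
}

char *find_char(char *s, char c) {
    // Reuse the const version; the cast is safe because the buffer
    // was non-const on entry to this overload.
    return const_cast<char *>(find_char(static_cast<const char *>(s), c));
}
```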

rjmccall

Aleksei Vasiliev posted:

Don't know if this is really a horror but I laughed at it

For the record: while this code should certainly be using named constants, and I personally would write it with if instead of nested conditionals, the three explicit optimizations here are worthwhile, and a date library is a good candidate for this degree of tuning.

First, the compiler is very unlikely to be able to do the range analysis necessary to figure out that it can use 32-bit arithmetic here, but doing so is a substantial speedup on a 32-bit architecture.

Second, if the code were written to divide by 86400000 instead of 1024, a good compiler would do that division with multiplies, but it would probably be blocked (again, by inadequate knowledge of the range constraints) from factoring out 84375 and getting all the way to this code.

Third, while some compilers might convert a sequence of conditions into a binary search, many wouldn't, and that's not necessarily just a missing optimization: there's a lot of code that's structured so that the more likely cases are checked first.

So yeah, the style could be a lot better, but the ideas are good.
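The second point rests on the factorization 86,400,000 = 1024 × 84375, which lets the divide split into a shift plus a smaller divide. A sketch of that step (function name mine, valid for non-negative inputs, since floored division nests: ⌊⌊n/a⌋/b⌋ = ⌊n/(a·b)⌋):

```cpp
#include <cstdint>

// 86,400,000 ms per day = 1024 * 84375, so dividing milliseconds by a
// full day can be done as a shift followed by a divide by 84375. With
// inputs known to be in range, the second step can stay in 32-bit
// arithmetic, which is the win being chased on 32-bit targets.
int32_t days_from_millis(int64_t ms) {
    return static_cast<int32_t>((ms >> 10) / 84375);
}
```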

rjmccall

SlightlyMadman posted:

Even if you somehow did end up in a situation like this, you'd probably be better off optimizing the loop itself. If you really and truly need to do some millions of month look-ups, you'd probably still be better off just caching them.

This criticism is misplaced. If you were rolling your own milli-to-month converter as a tiny part of a larger project, I agree that your time would probably be better spent optimizing something else. It is not at all obvious that this is equally true of the maintainer of a date/time library.

Analogously, an application programmer should not start optimizing their code by rewriting malloc, but criticizing the libc maintainers for doing so is asinine.

And yes, I can imagine someone wanting to decompose timestamps into dates in a fairly tight loop.

rjmccall

SlightlyMadman posted:

This is all theoretically true, but libc maintainers are hopefully smart enough to make their own decisions rather than be swayed by the opinions of a guy on the internet. Everyone else happens to fall into the "not a libc maintainer" category, and probably has no business writing code like that.

So, to paraphrase, your opinions are above reproach because anyone smart enough to not need them is smart enough to ignore you.

rjmccall
The argument for promotion here is that people tend to only use arithmetic on byte or short when they're using them as small integer types, and in those cases you're probably going to be really annoyed by the arithmetic wrapping at a tiny width — e.g. if you were averaging three bytes (useful in certain kinds of image processing). This is then further complicated by the type of integer literals generally not being overloaded — you want to say that 4 is an int, but that means that someByte + 4 is an int instead of a byte.

I'm not sure I accept this argument, but there you have it.
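C and C++ make the same choice as Java here, and the byte-averaging case shows why. A sketch (my function name):

```cpp
#include <cstdint>

// Narrow operands are promoted to int before arithmetic, so the
// intermediate sum lives in int (up to 765 here) and never wraps at
// 8 bits. Without promotion, a + b + c would wrap modulo 256.
uint8_t average3(uint8_t a, uint8_t b, uint8_t c) {
    return static_cast<uint8_t>((a + b + c) / 3);
}
```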

rjmccall

Zombywuf posted:

I did not know Java generics were this bad http://a-dimit.blogspot.com/2011/11/re-highbrow-java-or-java-generics-and.html

The Java array design would be totally different if generics had existed from the beginning; the possibility of polymorphism over array types would make it acceptable to eliminate the covariance rule for array types, and element types would probably be subject to the same erasure rules as everything else.

C++ only kindof-sortof overcomes the code/metadata bloat of templates; Java would have a much harder time of it, and I'm not offended enough by the lack of enforcement of generic casts to think that's a good price to pay.

rjmccall

Zombywuf posted:

The idea of arrays having a design that prevents this kind of polymorphism boggles the mind.

My understanding is that they had a problem, a deadline, and subtype polymorphism. Parametric polymorphism is more appropriate, but it's also a much more complicated language feature, particularly if you don't want "instantiations" to require independent type-checking. I mean, Java's generics support is already too sophisticated by some metrics — most programmers don't understand the mathematics and just work around (misuses of) the type system with casts. And there's a lot more yet that Java generics can't express, like the co-/contra-variance of List<T> with T.

rjmccall
If I were making this point, I would say that Lisp is not a language because it's really many languages that people have idiosyncratically invented at every place that deploys Lisp, which is furthermore all-too-often synonymous with having a home-rolled Lisp interpreter and runtime. That would be a fair critique of the wildly fragmented Lisp ecosystem, where moving from site to site often might as well involve picking up an entirely new language.

What he actually wrote is either badly mangled or open trolling of the Lisp community.

rjmccall
There are a lot of interesting ideas in Rust. That's what you would hope to see from an acknowledged research project. It's hard to criticize anything about it, because they haven't really committed to anything about it, and might not for many years to come. I might point out that this also doubles as a reason not to get unduly excited about it, but after all, this is the internet.

rjmccall
Which would, of course, not work, because it's not like the C preprocessor does constant folding either (except to resolve #if conditions).
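The distinction in miniature (`KB` and `kilobyte` are invented names for illustration):

```cpp
// The preprocessor only substitutes tokens: after this #define, every use
// of KB is literally the token sequence (1 << 10); no arithmetic happens.
#define KB (1 << 10)

// The one place the preprocessor itself evaluates arithmetic is an
// #if/#elif condition:
#if KB * KB != 1048576
#error "preprocessor arithmetic is evaluated here, so this never fires"
#endif

// Everywhere else, folding (1 << 10) down to 1024 is the compiler's job.
constexpr int kilobyte = KB;
```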

rjmccall
Incidentally, clang already does have a -fcatch-undefined-behavior flag, and John Regehr (equivocation incoming) has had at least one student at some point in the past working on extending it. So some of this is already done.

The really tricky part of catching all undefined behavior would be not integer overflow but things that rely on the effective type of an object, like the union type-punning rules and the aliasing rules. You'd need some sort of crazy side-table to avoid breaking ABI.

(and yes, I am employed to work on clang full-time)

rjmccall

Otto Skorzeny posted:

I thought the rules were 'just' that this wasn't allowed, eg. a read of type A from union {A, B} foo is only allowed if the last write to foo was of type A?

Pretty much, although there's an exception (in both C and C++) about structs with a common prefix.

rjmccall

Haud posted:

code:
if(x == 0)
   //Do Nothing
else
   DoSomething();

I actually like this style a lot when there are multiple interesting cases (i.e. at least one 'else if'). Sometimes you want to make sure some case doesn't fall through any of the following checks; repeating the logic in the guards for those blocks is expensive, redundant, and exclusive of using 'else'.

That is, I claim it's better to have:

code:
if (summoners.empty() || !hasBook || !hasCandle) {
  // do nothing
} else if (summoners.size() == 1) {
  // Only one summoner;  use the Rite of Kae Pum Ix An.
  ...
} else {
  // Multiple summoners;  use the Rite of Kaer Cham Ix Anu.
  ...
}
Than:

code:
if (summoners.size() == 1 && hasBook && hasCandle) {
  // Only one summoner;  use the Rite of Kae Pum Ix An.
  ...
} else if (summoners.size() > 1 && hasBook && hasCandle) {
  // Multiple summoners;  use the Rite of Kaer Cham Ix Anu.
  ...
}
I mean, usually I would just structure this so that I can use an early return/break/continue/goto/resume-continuation/abort/hlt instead of chaining the conditions, but sometimes life gives you lemons.
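The early-return restructuring of that same example would look something like this (`Summoner`, `chooseRite`, and the return values are invented stand-ins for whatever the real code does):

```cpp
#include <vector>
#include <string>

struct Summoner {};

// Same guard logic as above, restructured so that neither rite's branch
// has to repeat the book-and-candle checks, and no empty block is needed.
std::string chooseRite(const std::vector<Summoner> &summoners,
                       bool hasBook, bool hasCandle) {
    if (summoners.empty() || !hasBook || !hasCandle)
        return "";  // bail out once, up front
    if (summoners.size() == 1)
        return "Rite of Kae Pum Ix An";
    return "Rite of Kaer Cham Ix Anu";
}
```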

rjmccall

Suspicious Dish posted:

https://gist.github.com/1875229

That's awesome. The base and key are evaluated twice, but in reverse order from what I'd expect: the load is actually performed with the values from the second evaluation, and the store is done with the values from the first. In retrospect, it's the more sensible order to do it given that the target is a stack machine. Well, more sensible but still obviously wrong semantically.

The instruction sequence emitted to perform a post-increment is really ridiculous; it goes through a temporary local for no apparent reason at all, and then has to kill it four instructions later.

rjmccall
If that's because it gets compiled out in release builds, you might consider having it compile to __builtin_unreachable() in GCC and Clang.
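A hedged sketch of that suggestion (`MY_ASSERT` and `clamp_index` are invented names): in release builds the checked assertion becomes an optimizer hint rather than vanishing entirely.

```cpp
#ifdef NDEBUG
#  if defined(__GNUC__) || defined(__clang__)
     // Release build: tell the optimizer the condition holds, so it may
     // assume it when optimizing surrounding code.
#    define MY_ASSERT(cond) ((cond) ? (void)0 : __builtin_unreachable())
#  else
#    define MY_ASSERT(cond) ((void)0)
#  endif
#else
#  include <cassert>
#  define MY_ASSERT(cond) assert(cond)  // debug build: a real runtime check
#endif

// The optimizer can now exploit the predicate, e.g. dropping a redundant
// bounds check that follows MY_ASSERT(i < n).
int clamp_index(int i, int n) {
    MY_ASSERT(n > 0 && i >= 0 && i < n);
    return i;
}
```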

rjmccall
I can imagine someone wanting that in C++ due to destructors "interfering" with the control flow. The generalization would be "break, continue, goto, and return should never have to run any destructors before reaching their destination." It's then easier to apply that rule to all locals instead of making it only apply to types with nontrivial destructors.

I mean, I think that would be a deeply problematic sign of a programmer who refuses to accept the C++ "say what you mean and learn to use a debugger" philosophy, but tons of those people exist.

I have no idea why you'd want this rule in a language without destructors, unless, yeah, it's the legacy of some awesome compiler bug.

rjmccall fucked around with this message at 02:06 on Mar 26, 2012

Adbot
ADBOT LOVES YOU

rjmccall
This just in: terrible performance scales for trivial use cases.


I love this. Yes, the problem people have with this code must be that they dislike concise and expressive notations.

Ugh, I'm going to treat this as an exercise in language design. Simple ways to improve the legibility:
  • It's really silly to burn unique syntax on doing a prefix slice of an array. foo[^10] is only one character shorter than foo[0..10] (maybe 0..9 for consistency?), but the latter is immediately legible to anyone familiar with the idea of a slice. You're obviously going to have a range slice anyway; don't add something else sui generis just for the sake of a character or two. At the very least, use ..10.
  • No way do people zip with binary operators so often that Z+ is actually worthwhile. If you're really gung-ho about zip being a binary operator, at least spell it zip(+), which also extends to non-operator combining functions.
  • Building lists by separating components with commas has always been dumb; it's not self-describing at all. It's particularly nasty in this example because I expect comma to have the lowest possible precedence, so I want to interpret [@^x, 0 Z+ 0, @^x] as [@^x, (0 Z+ 0), @^x]. Make this [@^x, 0] Z+ [0, @^x].
  • So, @^x is apparently a "placeholder argument", which is a way of referring to an otherwise-anonymous argument; the name is arbitrary, and they get filled in the order they're encountered in the closure. This is three bad ideas in one. First, this is perl, so my first interpretation is that @^x is some bullshit operation on the variable named x. Second, if I'm using an actual variable name, I'm going to expect it to have been introduced somewhere in my lexical context, so not only is the name not adding value (if I wanted to name it, I could easily use one of the other closure syntaxes), it's actively confusing. Third, the code order dependency is awkward and brittle. Using numeric names (like $0) would be fine, if you can manage to avoid using those for other things, which you really really should.
  • I can't think of a good notation that suggests "build an infinite list by repeatedly applying a function". This syntax sure doesn't. I can, however, think of some good function names for it; Haskell calls it iterate.

tl;dr: In a better-designed language, this would be:
code:
for iterate([1], {[@0, 0] zip(+) [0, @0]})[0..10] { .say }
and while it's still pretty terse, and you still need to know some perlisms to fully get it, it is no longer total punctuation soup.
