|
Janin posted:Strictness optimizations like foldl' vs foldl aren't part of the compiler, though; there's no magical transformation which will make "sum = foldl (+) 0" work, because the definition of foldl precludes them.

You are aware that sum is defined as a left fold, and in practice wildly changes its behaviour depending on whether optimizations are enabled? Strictness analysis and stream fusion are extremely general optimizations and Haskell very much depends on them in many common cases. The foldl to foldl' transformation is something that ghc really can do all on its own.

Janin posted:It's not like somebody ever looks at a piece of Haskell code and wonders "gee, how much memory will this use?", and it certainly doesn't depend on compiler optimizations. The only compiler optimization which can change big-O behavior is stream fusion, which is performed only when requested by the code.

Ahaha, wondering about the memory/time complexity of various pieces of Haskell is how I spend most of my days. When you have a long-running program that runs in constant memory on ghc 6.10, but grows at a constant rate in memory on ghc 6.12, it's a bit of a mystery. That turned out to be the common subexpression optimization getting a little bit more capable and deciding to never throw away some results that it might need in the future. Of course, turning off CSE resulted in both 6.10 and 6.12 never completing the task, because I was missing a single let-binding I needed, and without CSE the program wasn't sharing that value and had to constantly recompute it. I've worked with Haskell for years and I am still regularly surprised by various memory/time properties of large pieces of code. I can generally explain them eventually, and the profiling and debugging tools have been getting a lot better at helping programmers answer questions about performance, but it's still extremely hard to develop a good intuition for performance in Haskell.
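The space-leak mechanics being argued about are easy to sketch outside Haskell; here is a hedged Python analogy, with closures standing in for thunks (this illustrates the foldl vs foldl' difference, not GHC's actual machinery):

```python
def foldl_lazy(f, acc, xs):
    # Like Haskell's foldl: each step wraps the accumulator in another
    # deferred application instead of evaluating it, so the "sum" is a
    # chain of n closures that only collapses when finally forced.
    thunk = lambda: acc
    for x in xs:
        thunk = (lambda t, v: lambda: f(t(), v))(thunk, x)
    return thunk

def foldl_strict(f, acc, xs):
    # Like foldl': the accumulator is evaluated at every step, so the
    # fold runs in constant space.
    for x in xs:
        acc = f(acc, x)
    return acc

assert foldl_strict(lambda a, b: a + b, 0, range(101)) == 5050
assert foldl_lazy(lambda a, b: a + b, 0, range(101))() == 5050
# Forcing foldl_lazy over range(100000) instead raises RecursionError:
# unwinding the thunk chain overflows the stack, which is the same
# failure mode as running an unoptimized "sum = foldl (+) 0" on a big list.
```

With optimizations on, GHC's strictness analysis effectively rewrites the first shape into the second, which is why the same source behaves so differently across -O levels.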
Janin posted:Most of the features you complain about are compiler-specific extensions, or have been removed from the language. This is the actual list of features, which I don't find particularly complex:

Well, n+k patterns, bang notation in types, lazy patterns, and scoped type variables are all Haskell98. And they're all at least marginally crazy. Not to mention that the layout rules, especially when combined with do notation, are not at all predictable for new users to the language; there's no way you can look at the syntax for an if statement in a do block and tell me that the right decision was made there. And if you want to use the extremely common State monad, you need to deal with undecidable/overlapping instances and functional dependencies. If you want any of the generic programming stuff, you'll probably want standalone deriving support. If you do anything with large data types you'll most likely find yourself using either record punning or GADTs (or both!). Programming Haskell without extensions in the current community is a joke.
|
# ? Jun 2, 2010 17:52 |
|
|
# ? May 14, 2024 18:50 |
|
ShoulderDaemon posted:You are aware that sum is defined as a left fold, and in practice wildly changes its behaviour depending on whether optimizations are enabled? Strictness analysis and stream fusion are extremely general optimizations and Haskell very much depends on them in many common cases. The foldl to foldl' transformation is something that ghc really can do all on its own.

ShoulderDaemon posted:Ahaha, wondering about the memory/time complexity of various pieces of Haskell is how I spend most of my days. When you have a long-running program that runs in constant memory on ghc 6.10, but grows at a constant rate in memory on ghc 6.12, it's a bit of a mystery. That turned out to be the common subexpression optimization getting a little bit more capable and deciding to never throw away some results that it might need in the future. Of course, turning off CSE resulted in both 6.10 and 6.12 never completing the task, because I was missing a single let-binding I needed, and without CSE the program wasn't sharing that value and had to constantly recompute it.

ShoulderDaemon posted:Well, n+k patterns, bang notation in types, lazy patterns, and scoped type variables are all Haskell98. And they're all at least marginally crazy. Not to mention that the layout rules, especially when combined with do notation, are not at all predictable for new users to the language; there's no way you can look at the syntax for an if statement in a do block and tell me that the right decision was made there.

Could you elaborate on what's wrong with do-if syntax? I find it quite readable, especially compared to (for example) Ruby or LISP.
code:
ShoulderDaemon posted:And if you want to use the extremely common State monad, you need to deal with undecidable/overlapping instances and functional dependencies. If you want any of the generic programming stuff, you'll probably want standalone deriving support. If you do anything with large data types you'll most likely find yourself using either record punning or GADTs (or both!). Programming Haskell without extensions in the current community is a joke.

!? 'State' doesn't require any language extensions, and certainly not anything as dangerous as undecidable/overlapping instances. You'll need type families to define a 'MonadState m' class/instance, but those can be understood after about a minute of reading. Generic programming in Haskell is a catastrophe; from where I stand, it looks like a bunch of type-system wankery and bad ideas. Anybody trying to implement it in Haskell via a half-dozen unsafe extensions deserves their migraines.
|
# ? Jun 2, 2010 18:55 |
|
Janin posted:'sum' is only defined as a left fold in the H'98 report -- there's no reason to implement it using foldl, and I'm not aware of any compiler which does. Almost any Haskell code will break if optimizations are disabled, because so many are required for usable space/time performance.

You have to implement it as something equivalent to foldl, because otherwise it has the wrong strictness properties - in particular, compilers are required to support Num instances with non-strict (+) operators correctly, even though there aren't any such instances in common usage. The optimization is just a special case based on feedback from the strictness checker. And yeah, optimizations change the correctness and complexity of Haskell programs, which is sort of what I was complaining about here. It should be possible to determine the correctness of a program independently of whatever magic your compiler does behind your back.

Janin posted:This is more of a bug in the compiler implementation (which is chock-full of them). GHC, in my opinion, shouldn't be performing any CSE until they can figure out how to make it work (they never will).

Granted. The various automatic inlining rules are also chock-full of this kind of crap that they will never be able to make work "correctly".

Janin posted:Scoped type variables and bang notation are GHC-specific language extensions. Lazy patterns are simply part of pattern matching; there's nothing complex there. n+k patterns were removed from the language.

Nope. Creating typeclasses or typeclass instances creates local scopes for some type variables which are larger than a single declaration, although fortunately not in a manner which is particularly likely to be confusing. Bang notation in constructors is Haskell98. n+k patterns are Haskell98, and if you get to dismiss them for not being part of the next version of Haskell, then you also have to acknowledge all the extensions that will be part of the next version of Haskell.
Janin posted:Could you elaborate on what's wrong with do-if syntax? I find it quite readable, especially compared to (for example) Ruby or LISP.

This is a syntax error in Haskell98:
code:
Janin posted:'State' doesn't require any language extensions, and certainly not anything as dangerous as undecidable/overlapping instances. You'll need type families to define a 'MonadState m' class/instance, but those can be understood after about a minute of reading.

You can't write automatic instances for MonadState of the form "(MonadState s m) => MonadState s (t m)" for some transformer t, which are the most common form of instance you will want to write if you are building transformer stacks, without undecidable instances. You can't build or import the mtl on a compiler which does not allow undecidable instances or overlapping instances, because the mtl itself creates a number of these instances so that things like RWS work correctly. In short, because data families weren't introduced early enough, many of the more critical modules in the Haskell world now depend directly or indirectly on unsafe bypasses of the type checker.

Janin posted:Generic programming in Haskell is a catastrophe; from where I stand, it looks like a bunch of type-system wankery and bad ideas. Anybody trying to implement it in Haskell via a half-dozen unsafe extensions deserves their migraines.

Granted, but you frequently cannot escape it if you are using other peoples' code.
|
# ? Jun 2, 2010 19:53 |
|
ShoulderDaemon posted:And yeah, optimizations change the correctness and complexity of Haskell programs, which is sort of what I was complaining about here. It should be possible to determine the correctness of a program independently of whatever magic your compiler does behind your back.

It's possible to determine the output of a program based only on the source code. Determining how much time or space it will require to run depends on the compiler, by necessity.

ShoulderDaemon posted:Nope. Creating typeclasses or typeclass instances creates local scopes for some type variables which are larger than a single declaration, although fortunately not in a manner which is particularly likely to be confusing. Bang notation in constructors is Haskell98. n+k patterns are Haskell98, and if you get to dismiss them for not being part of the next version of Haskell, then you also have to acknowledge all the extensions that will be part of the next version of Haskell.

Scoped type variables and bang notation are not supported in H'98, nor H'10.

ShoulderDaemon posted:This is a syntax error in Haskell98:

ShoulderDaemon posted:You can't write automatic instances for MonadState of the form "(MonadState s m) => MonadState s (t m)" for some transformer t, which are the most common form of instance you will want to write if you are building transformer stacks, without undecidable instances. You can't build or import the mtl on a compiler which does not allow undecidable instances or overlapping instances, because the mtl itself creates a number of these instances so that things like RWS work correctly. In short, because data families weren't introduced early enough, many of the more critical modules in the Haskell world now depend directly or indirectly on unsafe bypasses of the type checker.

code:
It's not possible to write most modules without depending indirectly on bypassing the type checker, because most of them will call into foreign libraries at some point.
|
# ? Jun 2, 2010 20:13 |
|
Today's Ruby/Rails coding horror, variables slightly changed to protect the innocent:
code:
Double edit: I just realized the time column from our database is a double, but it's not a unix epoch timestamp; it's stored as 20100603160000, which is actually 06/03/2010 16:00.

manero fucked around with this message at 17:13 on Jun 3, 2010 |
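Decoding that column is then mostly a formatting exercise; a hedged Python sketch (the packed YYYYMMDDHHMMSS layout comes from the post, the function name is made up):

```python
from datetime import datetime

def parse_packed_time(value):
    # The column arrives as a double like 20100603160000.0, but it is
    # not a unix epoch: the decimal digits are YYYYMMDDHHMMSS packed
    # together. 14 digits still fit exactly in a double's 53-bit
    # mantissa, which is the only reason this round-trips at all.
    return datetime.strptime(str(int(value)), "%Y%m%d%H%M%S")

parse_packed_time(20100603160000.0)  # → datetime(2010, 6, 3, 16, 0)
```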
# ? Jun 3, 2010 16:36 |
|
Maybe I'm nuts, but I don't see the horror there, other than the way you store your time in the database. That's pretty much what I would do if I had to work with that format.
|
# ? Jun 3, 2010 17:41 |
|
jandrese posted:Maybe I'm nuts, but I don't see the horror there, other than the way you store your time in the database. That's pretty much what I would do if I had to work with that format.

Yep, the only problem I'd have with that is storing the time as a double. It would be better to have it as a packed Integer.
|
# ? Jun 3, 2010 18:13 |
|
In my peripheral interfacing class (working with the AVR STK500 and ATmega8) we were discussing how to deal with shared data and hardware interrupts. The professor was showing us examples of how one might solve a simple problem, then asking the class whether the solution would or wouldn't work, and why or why not. Here was one example:
code:
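The class example itself is missing above, but the shared-data hazard it was about translates to any concurrent setting. A hedged Python sketch, with a lock standing in for the interrupt-disable critical section an AVR ISR would use:

```python
import threading

counter = 0              # shared data
lock = threading.Lock()  # stands in for disabling interrupts on the AVR

def bump(n):
    global counter
    for _ in range(n):
        with lock:
            # Without the guard, this read-modify-write can interleave
            # with another thread (or an ISR) and lose updates.
            counter += 1

threads = [threading.Thread(target=bump, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == 40000  # deterministic only because of the lock
```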
|
# ? Jun 3, 2010 20:15 |
|
HFX posted:Yep the only problem I'd have with that is storing the time as a double. It would be better to have it as a packed Integer.

That reminds me of php's microtime() function. Goddamnit, whoever wrote that is a cocksucker.
|
# ? Jun 3, 2010 20:30 |
|
TRex EaterofCars posted:That reminds me of php's microtime() function. Goddamnit whoever wrote that is a cocksucker.

If you're unfortunate enough to use PHP on Windows, microtime() will make you want to tear your eyes out.
code:
|
# ? Jun 3, 2010 21:49 |
|
Lysidas posted:If you're unfortunate enough to use PHP on Windows, microtime() will make you want to tear your eyes out.

This kind of brain damage.
|
# ? Jun 3, 2010 21:54 |
|
Lysidas posted:If you're unfortunate enough to use PHP

php manual posted:Parameters

see? problem solved

Using php 4?
php:<? $ohgodwhy = array_sum(explode(' ', microtime())); ?>
|
# ? Jun 3, 2010 22:33 |
|
Just discovered a new take on the for-case pattern (bonus points: spot the bug which brought me to discover this):
code:
|
# ? Jun 4, 2010 00:55 |
|
The only thing dumber than using the copy-paste design pattern is making edits to the pasted code without tool assistance (e.g. find-and-replace).
|
# ? Jun 4, 2010 01:00 |
|
Janin posted:Just discovered a new take on the for-case pattern (bonus points: spot the bug which brought me to discover this):

That's like a matryoshka of horrors
|
# ? Jun 4, 2010 01:28 |
|
Where in the gently caress did that come from?
|
# ? Jun 4, 2010 02:04 |
|
Crazy RRRussian posted:Where in the gently caress did that come from?

Your tax dollars and a summer intern
|
# ? Jun 4, 2010 02:06 |
|
Janin posted:Your tax dollars and a summer intern

Jesus. Christ. It is at this moment that I thank god that all of my coworkers are not only competent, but exceptional. We have no coding horrors (at least compared to the poo poo in this thread).
|
# ? Jun 4, 2010 02:23 |
|
code:
|
# ? Jun 4, 2010 03:25 |
|
Dude, just rewrite the code to avoid the substr operation. substr runs in linear time as every time you call it the STL needs to allocate new buffer and copy part of old string into it. From looking at the code you gave us it seems like you could rewrite your code to avoid substr rather easily and this will likely result in significant speed up. Do this and let me know what speed up you get with filters of size above 1000. EDIT: Also compare the time needed to make prefix tree with time needed to do naive string matching but avoiding substr. Crazy RRRussian fucked around with this message at 03:50 on Jun 4, 2010 |
# ? Jun 4, 2010 03:45 |
|
Crazy RRRussian posted:Dude, just rewrite the code to avoid the substr operation.

He already rewrote it, you dingleberry.
|
# ? Jun 4, 2010 04:34 |
|
Avenging Dentist posted:He already rewrote it you dingleberry.

Yea sure, but hes saying he had to do a radix trie for it. I am just curious how the time needed to construct the tree impacts performance here with naive string matching.
|
# ? Jun 4, 2010 04:54 |
|
Crazy RRRussian posted:Dude, just rewrite the code to avoid the substr operation. substr runs in linear time as every time you call it the STL needs to allocate new buffer and copy part of old string into it. From looking at the code you gave us it seems like you could rewrite your code to avoid substr rather easily and this will likely result in significant speed up. Do this and let me know what speed up you get with filters of size above 1000.

I have to admit there is another horror here: I didn't profile the code to determine the problem; I assumed it was the linear search. So I did remove the substr, and here are the results:
code:
|
# ? Jun 4, 2010 05:01 |
|
This gem is from my company's CTO. He has a base set of libraries he's used for everything for at least the last 5 years.
php:<?
...
$sql = "SELECT * FROM user WHERE ";
foreach ($query as $key => $value) {
    $sql .= $key." = '".$value."' AND ";
}
$sql = substr($sql, 0, -4);
...
?>
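A safer shape for that query builder, sketched with Python's sqlite3 standing in for the CTO's database layer (the `user` table and the column/value dict mirror the snippet; real code should also whitelist the keys, since identifiers can't be bound as parameters):

```python
import sqlite3

def find_users(conn, query):
    # Bind values with placeholders instead of splicing them into the
    # SQL string, which removes both the trailing-AND substr() trick
    # and the SQL injection.
    where = " AND ".join(f"{key} = ?" for key in query)
    sql = f"SELECT * FROM user WHERE {where}"
    return conn.execute(sql, tuple(query.values())).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (name TEXT, role TEXT)")
conn.execute("INSERT INTO user VALUES ('alice', 'admin')")
rows = find_users(conn, {"name": "alice", "role": "admin"})  # → [('alice', 'admin')]
```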
|
# ? Jun 4, 2010 15:00 |
|
Crazy RRRussian posted:Yea sure, but hes saying he had to do a radix trie for it. I am just curious how the time needed to construct the tree impacts performance here with naive string matching.

Probably pretty good given that many implementations of std::string use copy on write.
|
# ? Jun 4, 2010 18:38 |
|
I believe specification for string says they must be null terminated. Thus they could not use COW unless const [] operator does checks, which would make it slow. Also in my experience the standard STL implementation (whatever comes with Ubumtu) could not be doing COW because substr seems to be a heavy operation in them even if retrieved substring is never modified. Same with STL implementation on windows. This is coming from optimizing cpp prograns on both linux and windows to avoid using substr operation and this always resulting in nice speed up.
Crazy RRRussian fucked around with this message at 18:59 on Jun 4, 2010 |
# ? Jun 4, 2010 18:56 |
|
Crazy RRRussian posted:I believe specification for string says they must be null terminated. Thus they could not use COW unless const [] operator does checks, which would make it slow. Also in my experience the standard STL implementation (whatever comes with Ubumtu) could not be doing COW because substr seems to be a heavy operation in them even if retrieved substring is never modified. Same with STL implementation on windows. This is coming from optimizing cpp prograns on both linux and windows to avoid using substr operation and this always resulting in nice speed up.

This paragraph is the real coding horror.
|
# ? Jun 4, 2010 19:25 |
|
floWenoL posted:This paragraph is the real coding horror.

Why?
|
# ? Jun 4, 2010 19:42 |
|
code:
|
# ? Jun 4, 2010 20:14 |
|
Crazy RRRussian posted:Why?

Because almost every sentence has an error?
|
# ? Jun 4, 2010 20:16 |
|
Crazy RRRussian posted:I believe specification for string says they must be null terminated.

It's "believe the specification," not "believe specification." It's NUL-terminated, not "null terminated." Also your belief is wrong.

Crazy RRRussian posted:Thus they could not use COW unless const [] operator does checks, which would make it slow.

Based on a wrong belief, hence wrong. Also it would be wrong if the belief was not wrong.

Crazy RRRussian posted:Also in my experience the standard STL implementation

There is no standard STL implementation. gently caress THERE WAS A FLY ON MY HAND. How could you say "standard STL implementation"?

Crazy RRRussian posted:(whatever comes with Ubumtu) could not be doing COW because substr seems to be a heavy operation in them even if retrieved substring is never modified.

substr is a different operation, obviously. COW simply refers to the copy constructor, and the assignment operator, and maybe other things I can't think of, but certainly not substr. Its implementation of std::string has sizeof(std::string) = sizeof(void*), which consists of a pointer to the buf member of a structure that looks like this (I may have mixed up the limit and length fields).
code:
Crazy RRRussian posted:Same with STL implementation on windows.

It's the STL implementation on windows.

Crazy RRRussian posted:This is coming from optimizing cpp prograns on both linux and windows to avoid using substr operation and this always resulting in nice speed up.

Hexadecimal, master substr optimizer. He even optimizes substrs from his own sentences.
|
# ? Jun 4, 2010 21:00 |
|
Yea ok, I said some stupid stuff about STL, however I am still right in that substr is loving slow and runs in linear time.
|
# ? Jun 4, 2010 21:06 |
|
Crazy RRRussian posted:substr is loving slow and runs in linear time.

These two have nothing to do with each other. In most cases where substr is slow, it's because of dynamic memory allocation, not because it's "linear time". Any string comparison is also linear time, so (in absence of allocations) you're just adding a scalar multiple (2) to the asymptotic behavior of the code. I'm actually surprised libstdc++ doesn't use COW on substr starting at 0, but then COW is stupid as gently caress for strings to begin with, and in violation of the C++0x spec (due to the requirements of move ctors, I believe). I long for the day that clang and libcxx are the defaults on all systems.
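The allocation-vs-comparison distinction is easy to demonstrate in any language; a hedged Python sketch of the substr-avoidance fix (slicing allocates a copy, roughly like std::string::substr, while startswith compares in place; the function names are made up):

```python
def match_with_copy(s, i, pattern):
    # Allocates a fresh string for every probe, like calling
    # s.substr(i, pattern.size()) == pattern in C++.
    return s[i:i + len(pattern)] == pattern

def match_in_place(s, i, pattern):
    # Compares against the original buffer at an offset: the same
    # linear comparison, but with no allocation per probe.
    return s.startswith(pattern, i)

text = "the quick brown fox"
assert match_with_copy(text, 4, "quick")
assert match_in_place(text, 4, "quick")
```

Both run in time linear in the pattern length; only the first pays an allocation on every call, which is where the "substr is slow" folklore comes from.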
|
# ? Jun 4, 2010 21:30 |
|
Crazy RRRussian posted:Yea ok, I said some stupid stuff about STL, however I am still right in that substr is loving slow and runs in linear time.

You're wrong, by default, because you care about whether you were once right. Also, you not only are wrong, you were wrong. substr is loving fast -- it runs in linear time!
|
# ? Jun 4, 2010 21:30 |
|
Avenging Dentist posted:(due to the requirements of move ctors, I believe)

How so? The real horror is that C++ is so complicated that you can only "believe" it's in violation of the C++0x spec.
|
# ? Jun 4, 2010 21:33 |
|
shrughes posted:The real horror is that C++ is so complicated that you can only "believe" it's in violation of the C++0x spec.

I haven't read the whole proposed spec since I see little point in trying to hit a moving target (especially since the semantics of rvalue references have changed several times now). I'm actually just going on something said on the GCC mailing list.

EDIT: Ah, sorry, it's because C++0x provides new concurrency guarantees for std::string, namely that 1) two threads may read from the same string object, and 2) a thread may copy and manipulate a string even if another thread is holding on to the original (short version: strings can't be retarded in C++0x). http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2668.htm if you care.

Avenging Dentist fucked around with this message at 21:42 on Jun 4, 2010 |
# ? Jun 4, 2010 21:35 |
|
gently caress my life
code:
|
# ? Jun 4, 2010 21:43 |
|
The real real horror is that Hexadecimal is still managing to pull people into arguments with hardly any effort
|
# ? Jun 4, 2010 21:45 |
|
A solution to "database is locked":
code:
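The retry-loop code is lost above, but for contrast, the boring fix for SQLite's "database is locked" is a busy timeout rather than a spin loop; a minimal Python sketch (":memory:" stands in for the real database file):

```python
import sqlite3

# Let SQLite itself wait up to 5 seconds for a competing writer to
# release the lock, instead of re-running the statement in a loop.
conn = sqlite3.connect(":memory:", timeout=5.0)
conn.execute("CREATE TABLE jobs (id INTEGER)")
```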
|
# ? Jun 4, 2010 23:06 |
|
|
|
Ugg boots posted:A solution to "database is locked":

But this automatically works when the database is unlocked. Automatically.
|
# ? Jun 5, 2010 01:09 |