|
Krankenstyle posted:ideas: win95 pcs nailed to a door
|
# ? Nov 16, 2019 10:26 |
|
wow i had no idea c was so bad http://verisimilitudes.net/2019-11-12 parsing arguments like a Chump
|
# ? Nov 18, 2019 01:30 |
|
only 30% slower if you remove process launch overheads and any sort of argument parsing (but keep them for the c version) and maybe use a different definition of words that’s more convenient for the lisp implementation, didn’t really follow that part and lol at re-reading it more carefully
|
# ? Nov 18, 2019 03:12 |
|
rjmccall posted:only 30% slower if you remove process launch overheads and any sort of argument parsing (but keep them for the c version) but the posix standards conspire against good languages!!!! i mean it does not surprise me that a naive implementation of wc on a language platform with a good compiler isn't horrendously slow. was this supposed to be a surprising result?
|
# ? Nov 18, 2019 03:17 |
|
rjmccall posted:only 30% slower if you remove process launch overheads and any sort of argument parsing (but keep them for the c version) the “origin” article written in haskell has to use multiple threads to beat it, but somehow that is a victory over the c which is basically a while loop (hand optimized ???)
|
# ? Nov 18, 2019 03:19 |
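for anyone who hasn't looked at one, the "basically a while loop" shape is roughly the sketch below. it's in python only to keep it short; `naive_wc` and the hardcoded whitespace set (the c-locale isspace() characters) are my own assumptions, not code lifted from any real wc, and it ignores locales and every flag.

```python
def naive_wc(data: bytes) -> tuple:
    """Return (newlines, words, bytes) for a byte string, roughly like plain wc."""
    lines = words = 0
    in_word = False
    whitespace = b" \t\n\r\v\f"  # assumption: C-locale isspace() set
    for byte in data:
        if byte == 0x0A:
            lines += 1           # counts newline bytes, not "lines of text"
        if byte in whitespace:
            in_word = False
        elif not in_word:
            in_word = True       # first non-space byte after whitespace starts a word
            words += 1
    return lines, words, len(data)
```

feed it the contents of a file opened in binary mode. everything that makes the real implementations long is the stuff this leaves out.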
|
fwiw, as far as i know, this is still the up-to-date, “hand-optimized” wc implementation on darwin
|
# ? Nov 18, 2019 03:37 |
|
rjmccall posted:fwiw, as far as i know, this is still the up-to-date, “hand-optimized” wc implementation on darwin i figured he must be complaining about the gnu coreutils implementation, but no, it's not substantially worse http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/wc.c portable C is very ugly, but this is a reasonably good example of good C. the comments explain what is going on. it will build on many crappy unix systems. etc.
|
# ? Nov 18, 2019 04:16 |
|
neither of these even tries to mmap to avoid copying data, shameful levels of optimization i had no idea you could set a file to not start reading at its beginning though, wtf unix
|
# ? Nov 18, 2019 04:47 |
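a hedged sketch of what the mmap approach would look like for the -l case, in python for brevity. `count_newlines_mmap` is a made-up name, and this says nothing about whether it would actually be faster than a read() loop. one catch: only regular files can be mapped, so a real wc would still need a read() path for pipes.

```python
import mmap

def count_newlines_mmap(path: str) -> int:
    """Count newline bytes by mapping the file instead of read()-ing it into a buffer."""
    with open(path, "rb") as f:
        try:
            mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:
            return 0  # an empty file cannot be mapped
        with mm:
            count = 0
            pos = mm.find(b"\n")
            while pos != -1:
                count += 1
                pos = mm.find(b"\n", pos + 1)
            return count
```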
|
this is the oldest wc I could find, from version 5 research unix: https://github.com/dspinellis/unix-history-repo/blob/Research-V5-Snapshot-Development/usr/source/s2/wc.c it took some work to get it to compile and not segfault constantly because libc functions on v5 are different than in modern libc but the result is: code:
|
# ? Nov 18, 2019 05:03 |
|
Kazinsal posted:this is the oldest wc I could find, from version 5 research unix: https://github.com/dspinellis/unix-history-repo/blob/Research-V5-Snapshot-Development/usr/source/s2/wc.c the gnu version has a lot more flags, handles a lot more corner cases, and builds on every weird rear end lovely unix you can imagine
|
# ? Nov 18, 2019 05:07 |
|
if you want to blow your mind compare gnu grep to v6 grep
|
# ? Nov 18, 2019 05:08 |
|
Notorious b.s.d. posted:the gnu version has a lot more flags, handles a lot more corner cases, and builds on every weird rear end lovely unix you can imagine "gnu's not unix", right down to the philosophy of "small and smart programs that do a basic task well"
|
# ? Nov 18, 2019 05:11 |
|
Kazinsal posted:"gnu's not unix", right down to the philosophy of "small and smart programs that do a basic task well" by that standard, unix hasn't been unix, ever (no unix program has ever done a basic task well)
|
# ? Nov 18, 2019 05:14 |
|
like gnu coreutils are very composable -- they combine in useful ways -- but they are not small, and what little "smart" is baked in contributes to the size and complexity
|
# ? Nov 18, 2019 05:14 |
|
dO oNe ThInG AnD dO It WeLl
|
# ? Nov 18, 2019 05:31 |
|
the only reason I’m dismissive of Unix philosophy is that in two decades of paying attention to computers I’ve never encountered an argument for it that isn’t either essentially aesthetic or entirely a priori yes that does mean I ignore almost everything relating to computers. saves a lot of time
|
# ? Nov 18, 2019 05:35 |
|
the unix philosophy was developed after the fact, after they had already put together a pretty nice system for day to day use composable cli tools are a good invention and the unix pipe, as crude as it is, works pretty well for a wide variety of tasks that does not make it a coherent approach to all design problems
|
# ? Nov 18, 2019 05:50 |
|
I agree that cat *.butt | wc -l is cool but also cat butt is not a silver bullet. it’s been almost 50 years. where’s the history of Unix philosophy, its successes, its failures. why is programming so relentlessly forward looking
|
# ? Nov 18, 2019 06:07 |
|
Nomnom Cookie posted:I agree that cat *.butt | wc -l is cool but also cat butt is not a silver bullet. itym wc -l *.butt, or in the event there are too many files, find . -iname '*.butt' -exec wc -l {} \;
|
# ? Nov 18, 2019 06:10 |
|
rjmccall posted:fwiw, as far as i know, this is still the up-to-date, “hand-optimized” wc implementation on darwin verisimilitudes 7 hours ago: I do have some testing that showed me the Common Lisp was roughly the same speed as the C, when testing against a file of a few dozen megabytes. As I explain, I wasn't interested in optimizing it further, as I feel showing this achieved in just a few minutes to be valuable on its own. I also note that a C programmer may boast about being a small fraction of a second faster, ignoring everything else that goes into the program, which I find foolish. The reason I didn't strictly adhere to the POSIX behavior was because I don't know where this is documented and don't feel like scanning through the C to find out. On all of the files I've tested, which include a wide array of punctuation and other such things, the results were identical, but I'm merely not making any promises. I'd prefer to not be accused of being one of those Lispers who only complete part of the program; if you look at the libraries I've written, which actually concern me, then you'll find they're well-documented and rather comprehensive for their purposes.
|
# ? Nov 18, 2019 06:11 |
|
JawnV6 posted:I'd prefer to not be accused of being one of those Lispers who only complete part of the program omg holy poo poo don't accuse me of doing exactly the smug lisp weenie thing that everyone else does! it's different when i'm a smug lisp weenie!
|
# ? Nov 18, 2019 06:14 |
|
dude's original premise isn't wrong. his common lisp version of wc is a lot easier to read and presumably easier to maintain/port however once he adds in all the posix conformance that is present in gnu coreutils it is not going to be a whole hell of a lot simpler or better than the original coreutils wc. easier for him to write, sure, but not necessarily easier to maintain going forward turns out it's really hard to comply with a bunch of hastily written standards from decades ago
|
# ? Nov 18, 2019 06:15 |
|
Kazinsal posted:this is the oldest wc I could find, from version 5 research unix: https://github.com/dspinellis/unix-history-repo/blob/Research-V5-Snapshot-Development/usr/source/s2/wc.c now make it support non-English languages
|
# ? Nov 18, 2019 06:59 |
|
pseudorandom name posted:now make it support non-English languages lol
|
# ? Nov 18, 2019 07:44 |
|
Notorious b.s.d. posted:that does not make it a coherent approach to all design problems but rob pike says that
|
# ? Nov 18, 2019 07:45 |
|
Notorious b.s.d. posted:itym wc -l *.butt, or in the event there are too many files, find . -iname '*.butt' -exec wc -l {} \; itym find . -iname '*.butt' -print0 | xargs -0 wc -l
|
# ? Nov 18, 2019 07:47 |
|
pseudorandom name posted:now make it support non-English languages that's not in line with the unix philosophy
|
# ? Nov 18, 2019 08:37 |
|
redleader posted:that's not in line with the unix philosophy on the one hand POSIX requires wc to respect all the usual locale environment variables, on the other hand the only locales POSIX mandates are "C" and "POSIX"; but it would be a poor implementation that ignored 3/4 of the world's population but I'm not surprised that techbros are falling back on their white privilege to complain about wc's complexity and performance
|
# ? Nov 18, 2019 08:54 |
|
does U+2028 LINE SEPARATOR count as a newline? for that matter, does U+0085 NEXT LINE? that one is even in latin-1
|
# ? Nov 18, 2019 09:00 |
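the answer depends entirely on who you ask. a small sketch in python, where the sample string is made up: counting "\n" is what a byte-oriented wc -l does, while python's own `str.splitlines()` treats both U+2028 and U+0085 as line boundaries, and `bytes.splitlines()` does not.

```python
text = "one\ntwo\u2028three\u0085four\n"

# what a newline-byte counter in the style of wc -l would report
newline_count = text.count("\n")

# str.splitlines() also breaks on U+2028 and U+0085
str_lines = len(text.splitlines())

# bytes.splitlines() only knows \n, \r and \r\n
byte_lines = len(text.encode("utf-8").splitlines())

print(newline_count, str_lines, byte_lines)  # prints: 2 4 2

# U+0085 sits above the 7-bit ASCII range; in UTF-8 it takes two bytes
print("\u0085".isascii(), "\u0085".encode("utf-8"))  # prints: False b'\xc2\x85'
```

so the same file has two or four lines within a single language's standard library, before any locale gets involved.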
|
Nomnom Cookie posted:where’s the history of Unix philosophy, its successes, its failures. why is programming so relentlessly forward looking the current context is lisp, the language whose strongest exponents like to believe sprang fully-formed from the head of john mccarthy and has always had every feature ascribable to any programming language. a language which, for some, will always be a priori superior for all tasks irrespective of any measure of its fitness
|
# ? Nov 18, 2019 09:09 |
|
Internet Janitor posted:the current context is lisp, the language whose strongest exponents like to believe sprang fully-formed from the head of john mccarthy and has always had every feature ascribable to any programming language. a language which, for some, will always be a priori superior for all tasks irrespective of any measure of its fitness lol no static types
|
# ? Nov 18, 2019 09:26 |
|
Nomnom Cookie posted:why is programming so relentlessly forward looking it's a young field with a bunch of money being pumped into it so that a lot of computer touchers can invent trendy new ways to do the exact same things, but worse give it time imo, either the field will mature or civilization will end, problem solved either way
|
# ? Nov 18, 2019 09:30 |
|
animist posted:lol no static types surely a trivial exercise for the boundless capabilities of lisp's macro system so trivial, in fact, that i shall leave it as an exercise for the reader
|
# ? Nov 18, 2019 09:32 |
|
pseudorandom name posted:now make it support non-English languages for the most part it'll handle word count of non-english languages using latin-ish alphabets just fine. it won't understand word counts in logographic scripts that don't use word separators, but then again, neither does coreutils wc
|
# ? Nov 18, 2019 11:14 |
|
JawnV6 posted:wow i had no idea c was so bad http://verisimilitudes.net/2019-11-12 I've deigned to measure the performance of my program suitably for presentation here. Who writes like that? The entire website reads a bit like a parody. Also, everybody knows a good wc has to run on the GPU.
|
# ? Nov 18, 2019 11:23 |
|
what about -m?
|
# ? Nov 18, 2019 11:24 |
|
it could do that if we threw in full utf8 handling and started considering what constitutes a character instead of counting words in our command line word counting program
|
# ? Nov 18, 2019 11:30 |
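a rough sketch of why -m is more than counting bytes, and why even "full utf8 handling" doesn't settle what a character is. python again; `count_code_points` is a hypothetical helper that assumes the input is valid utf-8 and just skips continuation bytes. the two sample strings render the same but disagree on the count.

```python
def count_code_points(data: bytes) -> int:
    """Count UTF-8 code points by skipping continuation bytes (0b10xxxxxx)."""
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)

composed = "caf\u00e9".encode("utf-8")     # e-acute as a single code point
decomposed = "cafe\u0301".encode("utf-8")  # plain e followed by a combining acute accent

print(len(composed), count_code_points(composed))      # prints: 5 4
print(len(decomposed), count_code_points(decomposed))  # prints: 6 5
```

a reader would say both strings have four characters, which is a grapheme question that code point counting can't answer.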
|
the concept of a character or even a word is unsurprisingly undefined or not a concept in a lot of languages
|
# ? Nov 18, 2019 11:34 |
|
there isn’t even a single universally accepted way to count words in english. the idea that you can solve the problem in the general case with a trivial program is laughable
|
# ? Nov 18, 2019 12:19 |
|
Soricidus posted:there isn’t even a single universally accepted way to count words in english. the idea that you can solve the problem in the general case with a trivial program is laughable duh obviously you trim your string from both ends and then the number of words is number of spaces + 1
|
|
# ? Nov 18, 2019 12:59 |
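for the record, a quick sketch of how far the spaces + 1 rule gets you. both function names are made up for illustration; the second one is just python's `str.split()` with no arguments, which counts runs of non-whitespace and is still nowhere near a general answer.

```python
def spaces_plus_one(s: str) -> int:
    """The rule from the post: trim both ends, then count spaces and add one."""
    return s.strip().count(" ") + 1

def whitespace_runs(s: str) -> int:
    """Count maximal runs of non-whitespace, which is what str.split() yields."""
    return len(s.split())

print(spaces_plus_one("a b"), whitespace_runs("a b"))            # prints: 2 2
print(spaces_plus_one("a  b"), whitespace_runs("a  b"))          # prints: 3 2
print(spaces_plus_one("a\tb\nc"), whitespace_runs("a\tb\nc"))    # prints: 1 3
print(spaces_plus_one(""), whitespace_runs(""))                  # prints: 1 0
```

a double space, a tab, or an empty string is enough to break it.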