Ask General Programming Questions Not Worth Their Own Thread

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Ask General Programming Questions Not Worth Their Own Thread

«‹›2 »

floWenoL: Oct 23, 2002

dougdrums posted:

I didn't think about catch throwing the exception. When I took java my instructor straight out told us he had no clue what it was for. Thanks.

finally has nothing to do with catch throwing an exception. You should brush up on exceptions and how they're used, as it seems you don't understand them.

# ¿ Mar 2, 2008 06:57

Adbot: ADBOT LOVES YOU

# ¿ Apr 28, 2024 21:15

floWenoL: Oct 23, 2002

axolotl farmer posted:

In bash, is there a way to redirect stdout to a command that works on files?

I have a list that I'd like to compare to another list using comm. One of the lists is created by a [long tangle of seds, awks and greps], while the other is a file (mylist). The command comm only takes input as files.

Do I have to go by a temp file, or is there some way to get comm to work with the stdout?

from:
[long tangle of seds, awks and greps] >tempfile
comm tempfile mylist

to:
comm mylist [long tangle of seds, awks and greps]

Yes. You want "comm mylist <(long tangle of seds awks and greps)". Under the hood, bash replaces <(...) with something like /dev/fd/XXX which is the dev path to the stdout of the process.

# ¿ Mar 14, 2008 22:40

floWenoL: Oct 23, 2002

6174 posted:

The problem I have with this solution is it is dependent upon functions that operate on floating point numbers (log10, pow, etc). Is there a way to do this with only integral operations?

Sure. First, you want floor(log10(number)) + 1, and not ceil(log10(number)). That avoids the need for the extra test after. You can do discrete log10 (floor . log10) easily by trial multiplication of 10; there's no known efficient solution, though, but for the size of integers you're working with, it doesn't really matter. You can also do pow(10,x) easily by repeated multiplication, or repeated squaring, which is a bit faster, but again, it doesn't really matter for the size of the numbers you're working with.

# ¿ Mar 22, 2008 05:32

floWenoL: Oct 23, 2002

JoeNotCharles posted:

It may not be "fair", but it's accurate. Perl is a terrible language, and I would strongly advise everyone to avoid as hard as possible.

Why don't you try making recommendations without sparking a language debate? Even though your rose-colored (or ruby-colored ha ha) evaluation of Python and Ruby's text processing capabilities as compared to Perl's are amusing.

# ¿ Apr 3, 2008 06:44

floWenoL: Oct 23, 2002

clay posted:

I have an 2d array set up like this in C:
code:
int *ary[1000];

for (i; i < 1000; ++i) {
    ary[i] = malloc(other_ary[i]);
}
And I want to set up the exact same thing using mmap. Any pointers on this?

What exactly are you trying to do? mmap is used to treat files as if they were loaded in memory; the closest thing would be to mmap a region that is backed by an anonymous file.

# ¿ May 4, 2008 23:12

floWenoL: Oct 23, 2002

clay posted:

I'm loading the netflix dataset to memory. Right now I have it load the entire dataset into memory each pass which is pretty silly and takes about 4 minutes. I want to mmap it all once so I can just load the mmaped file.

I'm still not sure why loading in a file requires a 2-d array, but the canonical code to mmap a file for reading is:

code:

#include <fcntl.h> /* open() */
#include <sys/mman.h> /* mmap()/munmap() */
#include <sys/stat.h> /* stat() */
#include <unistd.h> /* close() */

...

int fdin;
struct stat buf;
char *bufin;

...

fdin = open("myfile", O_RDONLY);
fstat(fdin, &buf);
bufin = mmap(NULL, buf.st_size, PROT_READ, MAP_SHARED, fdin, 0);

...
munmap(bufin, buf.st_size);
close(fdin);

# ¿ May 5, 2008 09:14

floWenoL: Oct 23, 2002

CrusaderSean posted:

Whoops, didn't see it was O(n). I'm not really shuffling a table. I have a large array of strings and I'm permuting the order of each string.

Not sure how you're storing the strings, but if it is by value you'd probably get a huge speed up by shuffling references/pointers to the strings instead.

# ¿ May 7, 2008 21:53

floWenoL: Oct 23, 2002

JawnV6 posted:

Inline the function and don't use pointers. See if there's still a difference.

On the functions you wrote, xor is doing 3 loads and 3 stores, assign is doing 2 loads, 2 stores, and using an extra register. The difference in speed is because of the loads/stores, not 'pipelining'. mov and xor will get through exactly the same.

Making the swap function static will probably cause it to be inlined, if it's not already.

Edit:
It's probably best to show the optimized generated assembly, instead of speculating as to what the compiler did.

floWenoL fucked around with this message at 21:42 on Aug 26, 2008

# ¿ Aug 26, 2008 21:40

floWenoL: Oct 23, 2002

Okay, this is how you do the benchmark. First you post the code, making sure to put something in to prevent the compiler from optimizing the whole thing out:

code:

/* xorswap.c */
#include <stdio.h>
#include <stdlib.h>

static inline void swap(int *a, int *b) {
  *a ^= *b;
  *b ^= *a;
  *a ^= *b;
}

int main(int argc, char **argv) {
  int n, a, b;
  if (argc < 4) return 0;
  n = atoi(argv[1]);
  a = atoi(argv[2]);
  b = atoi(argv[3]);
  while (--n >= 0) swap(&a, &b);
  printf("%d %d\n", a, b);
  return 0;
}

code:

/* tmpswap.c */
#include <stdio.h>
#include <stdlib.h>

static inline void swap(int *a, int *b) {
  int tmp = *a;
  *a = *b;
  *b = tmp;
}

int main(int argc, char **argv) {
  int n, a, b;
  if (argc < 4) return 0;
  n = atoi(argv[1]);
  a = atoi(argv[2]);
  b = atoi(argv[3]);
  while (--n >= 0) swap(&a, &b);
  printf("%d %d\n", a, b);
  return 0;
}

Then you post your compiler version/commands, making sure to actually turn optimizations on fully:

code:

$ gcc --version
gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
$ gcc -O3 -Wall tmpswap.c -S
$ gcc -O3 -Wall xorswap.c -S
$ gcc -O3 -Wall tmpswap.c -o tmpswap
$ gcc -O3 -Wall xorswap.c -o xorswap

Then you post the relevant parts of the generated assembly:

code:

# tmpswap.s
.L14:
	movl	%eax, %ecx
	movl	%edx, %eax
.L7: # entry point
	addl	$1, %ebx
	movl	%ecx, %edx
	cmpl	%esi, %ebx
	jne	.L14

code:

# xorswap.s
.L7: # entry point
	xchgl	%eax, %ecx
	addl	$1, %edx
	cmpl	%edx, %ebx
	jne	.L7

Then you post your results and analysis:

code:

$ time ./tmpswap 2000000000 3 5
3 5

real    0m1.993s
user    0m1.965s
sys     0m0.006s
$ time ./xorswap 2000000000 3 5
3 5

real    0m2.042s
user    0m1.980s
sys     0m0.002s

It's a wash, and I get different results when I run it multiple times. It also might be worth doing the benchmark in assembly, i.e. comparing xchgl with tmp with xor.

floWenoL fucked around with this message at 00:02 on Aug 27, 2008

# ¿ Aug 26, 2008 23:53

floWenoL: Oct 23, 2002

Jo posted:

Correct me if I'm mistaken: #define or set_name or whatever for a compiled language makes its way into the executable code while constant final static makes its way into a register. Is there really any performance difference in the grand scheme of things?

Nothing in this paragraph is correct.

# ¿ Sep 1, 2008 14:11

floWenoL: Oct 23, 2002

Mustach posted:

I assigned the result of realloc to another variable.
This won't work in C89, and even in C99 and C++ it has the added problems of re-calculating the length of a non-portable string literal in every spot the macro is used. If you really just gotta be sure buff is big enough to hold the string representation of page count:
code:
char *buff = malloc(floor(log10(pagecount)) + 2);

A 'clearer' way would be:

code:

#include <stdio.h>

char widest_uint32[] = "4294967295";

int main() {
  char buffer[sizeof(widest_uint32)];
  sprintf(buffer, "%d", (int)sizeof(widest_uint32));
  printf("sizeof(widest_uint32) = %s\n", buffer);
  return 0;
}

quote:

but really you'd be pretty safe with something like
code:
char buff[124];

If it's so safe, why'd you bump it up from 100? :P

# ¿ Sep 7, 2008 09:21

floWenoL: Oct 23, 2002

Jarl posted:

In C++ why is it that this:

typedef CDLLclass (*CreateDllObject)();

is okay, but this:

typedef CDLLclass* (*CreateDllObject)();

is a big no no? It simply wont compile.
I need this to be possible, so is there some "extra" syntax rule that is needed for this to work?

Works for me. What error message are you getting?

# ¿ Sep 8, 2008 22:15

floWenoL: Oct 23, 2002

SiLk-2k5 posted:

TL;DR: I've got a subroutine with a string parameter that returns a string based on one of 96 string cases in a switch statement. Anything I can think of doing to make the code faster?

Why in god's name are you not using a hashtable.

# ¿ Sep 18, 2008 23:08

floWenoL: Oct 23, 2002

JoeNotCharles posted:

None whatsoever.

The only difference between switch and if is that switch only evaluates the condition once, and if evaluates it for each "else if" clause. When the condition is a simple variable like this, there's nothing to evaluate so there's no difference at all.

Also, if the condition is a complex expression or a function call that the compiler can determine returns the same result every time and has no side effects, any decent compiler will optimize it out so it only gets evaluated once anyway.

This is so off-base it's not even funny. :psyduck:

# ¿ Sep 19, 2008 01:07

floWenoL: Oct 23, 2002

JoeNotCharles posted:

Really? How does it work then?

Switch statements lend themselves to jump tables when the data types are bounded integer values, but this is only in C or C++, and I guess maybe C# sometimes, but of course for strings this can't be done.

I don't know if C# does inter-procedural optimization. Maybe it does. But generally unless its inlined data flow analysis won't optimize it out for ifs.

# ¿ Sep 19, 2008 02:13

floWenoL: Oct 23, 2002

Fitret posted:

I subscribe to HBO and record them with lovely applications that don't let me change the naming structure because I'm too cheap to pay for SageTV or BeyondTV. Plus, these will auto-convert them H264 .mkv's for me. That having been said, I've spent enough time trying to figure out this regex that I might just bite the bullet and buy SageTV or BeyondTV if they'll let me choose how to rename my episodes and automatically compress them. Vista's MediaCenter won't, which is lame.

Man, I wish I had enough money to afford a dedicated DVR box with enough horsepower to do high-def recording and H264 encoding. I don't get it, though; if you're rich enough to have such a machine, why can't you just buy SageTV or BeyondTV?

# ¿ Oct 12, 2008 22:08

floWenoL: Oct 23, 2002

Bonus posted:

If I use a binary heap for Q, I put them in by their distance from the source. That's all cool, but when relaxing (u,v), the distance for a vertice is changed and then suddenly the binary heap doesn't satisfy the heap property. If I rebuild the heap after relaxing (u,v), well that increases the complexity so much that it doesn't make sense to use a heap at all.

No need to rebuild the heap; just remove (u, old_dist) and insert (u, new_dist), both of which are lg n operations.

# ¿ Nov 25, 2008 16:27

floWenoL: Oct 23, 2002

Bonus posted:

Uh, wait, but I'm supposed to modify the distence of v, not u. u is the priority element and modifying it is log n, but I'm supposed to modify v, which is the neighbour of the priority element and I don't know where that element is in the priority queue.

Oh, right. Actually, it suffices to exchange the changed element with its parent while necessary. That's log(n) and preserves the heap property. I guess you'd also have to store per-node info as to where it is in the heap (and update it). Hmm.

# ¿ Nov 25, 2008 17:47

floWenoL: Oct 23, 2002

Why are you assigning the return value of FpMult (an int) to the string product? Why are you then immediately overwriting product with whatever cin reads in?

Why does a function named FpMult() return an int? How exactly do you write a floating point number in hex?

# ¿ Dec 1, 2008 02:21

floWenoL: Oct 23, 2002

Zombywuf posted:

You should probably do both, that is have a class that does A) and another class that does B) using A) to do it. Always one task per class.

Edit:
Nevermind, I misunderstood the question.

floWenoL fucked around with this message at 22:59 on Dec 1, 2008

# ¿ Dec 1, 2008 22:32

floWenoL: Oct 23, 2002

JoeNotCharles posted:

But there's no point in trying to design a complex threaded architecture to handle every possible case if you're not going to need it.

Yeah, you're right; I misinterpreted B to be an asynchronous version of A.

# ¿ Dec 1, 2008 22:59

floWenoL: Oct 23, 2002

hexadecimal posted:

I am not well versed in C# but if i was to do this, I would use lex/yacc. I am pretty sure C# supports regular expression matching with its standard API, so lexer part is done. You should look into C# compiler/compiler, or just make your own parser, then parse in XML and spit it back out in whatever format you want. This is probably at least a few days worth of work.

It is becoming clear that whenever someone asks a question, whatever hexadecimal suggests is almost certainly what you don't want to do.

# ¿ Dec 11, 2008 06:40

floWenoL: Oct 23, 2002

6174 posted:

That works. It allows just -llapack to find the library.

This brings up a related question. If ld simply maps -lfile to libfile.so, then why is it that quite a few of the libraries I've got in /usr/lib do not have a .so variant? Many of them are named libfile.so.# for some number #. But there are also quite a few that have a .so symlinked to the .so.# version. Presumably with as many libraries that have this, the Ubuntu packagers didn't botch this many packages, so I'm missing something. I do see that if I run ldd on the executable, it points to liblapack.so.3. Does this mean that a .so symlink is only needed if you are linking to a specific library, and since most people aren't doing that (at least using ld), the .so symlink isn't needed and thus the package maintainers don't put it in?

It's been a while since I've worked with Linux packaging, but I believe you won't get the symlinks unless you have the package <FOO>-dev (or similar) installed.

# ¿ Dec 12, 2008 01:12

floWenoL: Oct 23, 2002

hexadecimal posted:

What you are describing there is more like memoization not dynamic programing itself. your formula doesn't change for Fibonacci numbers, but a lot of dynamic programing is for problems when you want to optimize some parameters, like the knap sack problem - http://en.wikipedia.org/wiki/Knapsack_problem.

I have no idea what you're trying to say, but the important characteristics of a dynamic programming problem are just overlapping subproblems and optimal substructure, and frequently, memoizing the algorithm is equivalent to the normal bottom-up approach.

# ¿ Dec 20, 2008 08:55

floWenoL: Oct 23, 2002

hexadecimal posted:

Lex/Yacc + Libcurl (C/C++) is what I would use. Depending on how complex your parsing needs to be for the webpage, you may or may not need Yacc, but lexer is a pretty nice tool. Also fast as gently caress.

You seriously give the worst programming advice.

# ¿ Jan 8, 2009 02:47

floWenoL: Oct 23, 2002

DLCinferno posted:

I think someone needs a new custom title.

[big-red-text]
IGNORE ANYTHING I HAVE TO SAY REGARDING PROGRAMMING
[/big-red-text]

# ¿ Jan 8, 2009 04:00

floWenoL: Oct 23, 2002

ShoulderDaemon posted:

So, I said this, and then I started thinking to myself that I should be prepared to be called-out on a statement like this. My first try was 99 lines, but I like round numbers, so here's a 50 line version:

Man, look at those strict aliasing violations.

# ¿ Apr 7, 2009 06:07

floWenoL: Oct 23, 2002

Avenging Dentist posted:

I would guess that's he's blowing out of the stack somewhere, and adding another stack variable is saving his rear end.

I don't know why, but your statement sounds really gay. :pwn:

# ¿ Aug 10, 2009 21:53

floWenoL: Oct 23, 2002

ErIog posted:

Am I a knob for not using the XML libraries in Python? Whenever I need to do random XML parsing, I find it easier to just use regex rather than parse the file through the official library. I realize that this might create headaches in the future, but usually these are one-off tools to automate some sort of data mining or file renaming based on XML data.

As someone who has had to maintain code that parses XML using regexps (and would randomly break every time attributes moved around) I hope you someday go through the same pain.

# ¿ Sep 11, 2009 18:43

floWenoL: Oct 23, 2002

Otto Skorzeny posted:

Someone remind me whether comments in email addresses nest to arbitrary depth or not

They do, so even the 'full' e-mail regex doesn't parse all valid e-mail addresses.

# ¿ Sep 23, 2009 03:46

floWenoL: Oct 23, 2002

Eclipse12 posted:

Try reading my post next time. I said I didn't know how to code; I never said that I hadn't asked someone who does how long my idea would take to make.

Here's another thing you didn't say: what the project actually is.

# ¿ Sep 27, 2009 21:02

floWenoL: Oct 23, 2002

lord funk posted:

I've got some code optimizing to do, and I was wondering if there is a good resource / list of operations ordered by how expensive they are to run.

As in, if multiplying values together is more computationally expensive than adding, or if running if(this == that) 10,000 times is more expensive than multiplying something four times?

Is there a good place to learn this? My Google-fu is poo poo today.

With pipelining and branch prediction, it's not as simple as listing a cycle count for each instruction anymore. The profiler is your friend.

# ¿ Oct 28, 2009 17:56

floWenoL: Oct 23, 2002

Avenging Dentist posted:

No, they learn hexadecimal. *eyes you meaningfully*

He was doing so well, too...

# ¿ Feb 23, 2010 21:40

floWenoL: Oct 23, 2002

litghost posted:

This has some weird aliasing issues, not to mention it uses two relatively expensive calls and is hard to immediately follow what it is doing. You might be better off just manually doing the oscillation.

whooosh!

# ¿ Apr 7, 2010 04:26

floWenoL: Oct 23, 2002

OverloadUT posted:

They share mostly the same set of stats, so I figure they should both be subclasses of a "Combatant" parent.

Does not follow.

# ¿ Apr 8, 2010 02:50

floWenoL: Oct 23, 2002

Jose Cuervo posted:

I have the following problem, and I do not know how to logically approach it:
I have N different time intervals of the form [a,b] (e.g. the time interval [5,9] consists of the times 5, 6, 7, 8, and 9). The problem is to determine the intersection of these N time intervals, where some will overlap, and some will not.
For example, if I have [19,23], [3,7], [5,19], and [44,48], the resulting intersection I want is [3,23], [44,48].

Right now I am able to take all the start and end points of each time interval and sort them in ascending order [3, 5, 7, 19, 19, 23, 44, 48] and have an accompanying vector [s, s, e, s, e, e, s, e] of which point is a start or end.

I am lost as to how to use this information to come up with the resulting intersection, and I am open to new strategies. Any ideas?

Don't you mean union?

# ¿ Apr 27, 2010 05:23

floWenoL: Oct 23, 2002

systran posted:

if action1 == 'attack' and action2 == 'attack':
roll1 = random.randint(1, 12)
roll2 = random.randint(1, 12)

if action1 == 'charge' and action2 == 'attack':
roll1 = random.randint(1, 12)
if roll1 > 6:
playerTwoHealth = playerTwoHealth - 2

This is why significant whitespace sucks.

# ¿ Jun 16, 2010 22:01

floWenoL: Oct 23, 2002

baquerd posted:

The problem is not what will happen today or tomorrow, but what will happen next year when you start having precision rounding problems after some jackass put a task between two others 54 times over the year?

You could certainly do that, and have a lower bound tolerance that forces a full database re-ranking or have a maintenance task that re-ranks them every week or so, but it's still unstable.

You're not going to be able to get a perfect solution here, where each ranking change only affects at most one task, and have it stable forever.

Instead of doubles, you can use strings with lexicographic order and treat them as, say, base-26 numbers between 0 and 1. (e.g., 'abbcz' would map to .0,1,1,2,26)

# ¿ Oct 17, 2011 21:01

Adbot: ADBOT LOVES YOU

# ¿ Apr 28, 2024 21:15

floWenoL: Oct 23, 2002

baquerd posted:

What does "abbcz" represent (is is legal to have more than one rank per task?) and how do you translate that into the required database double rank value for task a, b, c, and z that minimizes rank movement for arbitrary re-ordering?

"abbcz" represents the number (0/26 + 1/26^2 + 1/^26^3 + 2/^26^4 + 25/26^5). There's no double rank; the rank is stored as a string. For every two distinct string ranks, there's always a string rank in between them, so you never need to 're-rank'.

The only downside is that the ranks may grow to arbitrary sizes. But with careful management you can mitigate this for all but the most pathological cases.

Edit:
Eh, I guess I didn't read the original question closely enough. If he's stuck with using doubles, then there's not much you can do.

floWenoL fucked around with this message at 03:12 on Oct 18, 2011

# ¿ Oct 18, 2011 03:10

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Ask General Programming Questions Not Worth Their Own Thread

«‹›2 »

dougdrums posted:

axolotl farmer posted:

6174 posted:

JoeNotCharles posted:

clay posted:

code:

clay posted:

code:

CrusaderSean posted:

JawnV6 posted:

code:

code:

code:

code:

code:

code:

more falafel please posted:

Jo posted:

Mustach posted:

code:

code:

quote:

code:

Jarl posted:

SiLk-2k5 posted:

JoeNotCharles posted:

JoeNotCharles posted:

Fitret posted:

Bonus posted:

Bonus posted:

Zombywuf posted:

JoeNotCharles posted:

hexadecimal posted:

6174 posted:

hexadecimal posted:

hexadecimal posted:

DLCinferno posted:

ShoulderDaemon posted:

Avenging Dentist posted:

ErIog posted:

Otto Skorzeny posted:

Eclipse12 posted:

lord funk posted:

Avenging Dentist posted:

litghost posted:

OverloadUT posted:

Jose Cuervo posted:

systran posted:

baquerd posted:

baquerd posted: