Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
floWenoL
Oct 23, 2002

dougdrums posted:

I didn't think about catch throwing the exception. When I took java my instructor straight out told us he had no clue what it was for. Thanks.

finally has nothing to do with catch throwing an exception. You should brush up on exceptions and how they're used, as it seems you don't understand them.

Adbot
ADBOT LOVES YOU

floWenoL
Oct 23, 2002

axolotl farmer posted:

In bash, is there a way to redirect stdout to a command that works on files?

I have a list that I'd like to compare to another list using comm. One of the lists is created by a [long tangle of seds, awks and greps], while the other is a file (mylist). The command comm only takes input as files.

Do I have to go by a temp file, or is there some way to get comm to work with the stdout?

from:
[long tangle of seds, awks and greps] >tempfile
comm tempfile mylist

to:
comm mylist [long tangle of seds, awks and greps]

Yes. You want "comm mylist <(long tangle of seds awks and greps)". Under the hood, bash replaces <(...) with something like /dev/fd/XXX which is the dev path to the stdout of the process.

floWenoL
Oct 23, 2002

6174 posted:

The problem I have with this solution is it is dependent upon functions that operate on floating point numbers (log10, pow, etc). Is there a way to do this with only integral operations?

Sure. First, you want floor(log10(number)) + 1, and not ceil(log10(number)). That avoids the need for the extra test after. You can do discrete log10 (floor . log10) easily by trial multiplication of 10; there's no known efficient solution, though, but for the size of integers you're working with, it doesn't really matter. You can also do pow(10,x) easily by repeated multiplication, or repeated squaring, which is a bit faster, but again, it doesn't really matter for the size of the numbers you're working with.

floWenoL
Oct 23, 2002

JoeNotCharles posted:

It may not be "fair", but it's accurate. Perl is a terrible language, and I would strongly advise everyone to avoid as hard as possible.

Why don't you try making recommendations without sparking a language debate? Even though your rose-colored (or ruby-colored ha ha) evaluation of Python and Ruby's text processing capabilities as compared to Perl's are amusing.

floWenoL
Oct 23, 2002

clay posted:

I have an 2d array set up like this in C:

code:
int *ary[1000];

for (i; i < 1000; ++i) {
    ary[i] = malloc(other_ary[i]);
}
And I want to set up the exact same thing using mmap. Any pointers on this?

What exactly are you trying to do? mmap is used to treat files as if they were loaded in memory; the closest thing would be to mmap a region that is backed by an anonymous file.

floWenoL
Oct 23, 2002

clay posted:

I'm loading the netflix dataset to memory. Right now I have it load the entire dataset into memory each pass which is pretty silly and takes about 4 minutes. I want to mmap it all once so I can just load the mmaped file.

I'm still not sure why loading in a file requires a 2-d array, but the canonical code to mmap a file for reading is:

code:
#include <fcntl.h> /* open() */
#include <sys/mman.h> /* mmap()/munmap() */
#include <sys/stat.h> /* stat() */
#include <unistd.h> /* close() */

...

int fdin;
struct stat buf;
char *bufin;

...

fdin = open("myfile", O_RDONLY);
fstat(fdin, &buf);
bufin = mmap(NULL, buf.st_size, PROT_READ, MAP_SHARED, fdin, 0);

...
munmap(bufin, buf.st_size);
close(fdin);

floWenoL
Oct 23, 2002

CrusaderSean posted:

Whoops, didn't see it was O(n). I'm not really shuffling a table. I have a large array of strings and I'm permuting the order of each string.

Not sure how you're storing the strings, but if it is by value you'd probably get a huge speed up by shuffling references/pointers to the strings instead.

floWenoL
Oct 23, 2002

JawnV6 posted:

Inline the function and don't use pointers. See if there's still a difference.

On the functions you wrote, xor is doing 3 loads and 3 stores, assign is doing 2 loads, 2 stores, and using an extra register. The difference in speed is because of the loads/stores, not 'pipelining'. mov and xor will get through exactly the same.

Making the swap function static will probably cause it to be inlined, if it's not already.

Edit:
It's probably best to show the optimized generated assembly, instead of speculating as to what the compiler did.

floWenoL fucked around with this message at 21:42 on Aug 26, 2008

floWenoL
Oct 23, 2002

Okay, this is how you do the benchmark. First you post the code, making sure to put something in to prevent the compiler from optimizing the whole thing out:

code:
/* xorswap.c */
#include <stdio.h>
#include <stdlib.h>

static inline void swap(int *a, int *b) {
  *a ^= *b;
  *b ^= *a;
  *a ^= *b;
}

int main(int argc, char **argv) {
  int n, a, b;
  if (argc < 4) return 0;
  n = atoi(argv[1]);
  a = atoi(argv[2]);
  b = atoi(argv[3]);
  while (--n >= 0) swap(&a, &b);
  printf("%d %d\n", a, b);
  return 0;
}
code:
/* tmpswap.c */
#include <stdio.h>
#include <stdlib.h>

static inline void swap(int *a, int *b) {
  int tmp = *a;
  *a = *b;
  *b = tmp;
}

int main(int argc, char **argv) {
  int n, a, b;
  if (argc < 4) return 0;
  n = atoi(argv[1]);
  a = atoi(argv[2]);
  b = atoi(argv[3]);
  while (--n >= 0) swap(&a, &b);
  printf("%d %d\n", a, b);
  return 0;
}
Then you post your compiler version/commands, making sure to actually turn optimizations on fully:

code:
$ gcc --version
gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
$ gcc -O3 -Wall tmpswap.c -S
$ gcc -O3 -Wall xorswap.c -S
$ gcc -O3 -Wall tmpswap.c -o tmpswap
$ gcc -O3 -Wall xorswap.c -o xorswap
Then you post the relevant parts of the generated assembly:

code:
# tmpswap.s
.L14:
	movl	%eax, %ecx
	movl	%edx, %eax
.L7: # entry point
	addl	$1, %ebx
	movl	%ecx, %edx
	cmpl	%esi, %ebx
	jne	.L14
code:
# xorswap.s
.L7: # entry point
	xchgl	%eax, %ecx
	addl	$1, %edx
	cmpl	%edx, %ebx
	jne	.L7
Then you post your results and analysis:

code:
$ time ./tmpswap 2000000000 3 5
3 5

real    0m1.993s
user    0m1.965s
sys     0m0.006s
$ time ./xorswap 2000000000 3 5
3 5

real    0m2.042s
user    0m1.980s
sys     0m0.002s
It's a wash, and I get different results when I run it multiple times. It also might be worth doing the benchmark in assembly, i.e. comparing xchgl with tmp with xor.

floWenoL fucked around with this message at 00:02 on Aug 27, 2008

floWenoL
Oct 23, 2002

more falafel please posted:

Ok, but I think my point still stands, the compiler optimizes out the xor trick because it's not actually an optimization.

Yeah, it's inconclusive. I'd test the xor trick in assembly but :effort:. Also, I wonder if gcc actually detects the xor trick or if it just falls out of the data flow analysis.

floWenoL
Oct 23, 2002

Jo posted:

Correct me if I'm mistaken: #define or set_name or whatever for a compiled language makes its way into the executable code while constant final static makes its way into a register. Is there really any performance difference in the grand scheme of things?

Nothing in this paragraph is correct.

floWenoL
Oct 23, 2002

Mustach posted:

I assigned the result of realloc to another variable.
This won't work in C89, and even in C99 and C++ it has the added problems of re-calculating the length of a non-portable string literal in every spot the macro is used. If you really just gotta be sure buff is big enough to hold the string representation of page count:
code:
char *buff = malloc(floor(log10(pagecount)) + 2);

A 'clearer' way would be:

code:
#include <stdio.h>

char widest_uint32[] = "4294967295";

int main() {
  char buffer[sizeof(widest_uint32)];
  sprintf(buffer, "%d", (int)sizeof(widest_uint32));
  printf("sizeof(widest_uint32) = %s\n", buffer);
  return 0;
}

quote:

but really you'd be pretty safe with something like
code:
char buff[124];

If it's so safe, why'd you bump it up from 100? :P

floWenoL
Oct 23, 2002

Jarl posted:

In C++ why is it that this:

typedef CDLLclass (*CreateDllObject)();

is okay, but this:

typedef CDLLclass* (*CreateDllObject)();

is a big no no? It simply wont compile.
I need this to be possible, so is there some "extra" syntax rule that is needed for this to work?

Works for me. What error message are you getting?

floWenoL
Oct 23, 2002

SiLk-2k5 posted:

TL;DR: I've got a subroutine with a string parameter that returns a string based on one of 96 string cases in a switch statement. Anything I can think of doing to make the code faster?

Why in god's name are you not using a hashtable.

floWenoL
Oct 23, 2002

JoeNotCharles posted:

None whatsoever.

The only difference between switch and if is that switch only evaluates the condition once, and if evaluates it for each "else if" clause. When the condition is a simple variable like this, there's nothing to evaluate so there's no difference at all.

Also, if the condition is a complex expression or a function call that the compiler can determine returns the same result every time and has no side effects, any decent compiler will optimize it out so it only gets evaluated once anyway.

This is so off-base it's not even funny. :psyduck:

floWenoL
Oct 23, 2002

JoeNotCharles posted:

Really? How does it work then?

Switch statements lend themselves to jump tables when the data types are bounded integer values, but this is only in C or C++, and I guess maybe C# sometimes, but of course for strings this can't be done.

I don't know if C# does inter-procedural optimization. Maybe it does. But generally unless its inlined data flow analysis won't optimize it out for ifs.

floWenoL
Oct 23, 2002

Fitret posted:

I subscribe to HBO and record them with lovely applications that don't let me change the naming structure because I'm too cheap to pay for SageTV or BeyondTV. Plus, these will auto-convert them H264 .mkv's for me. That having been said, I've spent enough time trying to figure out this regex that I might just bite the bullet and buy SageTV or BeyondTV if they'll let me choose how to rename my episodes and automatically compress them. Vista's MediaCenter won't, which is lame.

Man, I wish I had enough money to afford a dedicated DVR box with enough horsepower to do high-def recording and H264 encoding. I don't get it, though; if you're rich enough to have such a machine, why can't you just buy SageTV or BeyondTV? :)

floWenoL
Oct 23, 2002

Bonus posted:

If I use a binary heap for Q, I put them in by their distance from the source. That's all cool, but when relaxing (u,v), the distance for a vertice is changed and then suddenly the binary heap doesn't satisfy the heap property. If I rebuild the heap after relaxing (u,v), well that increases the complexity so much that it doesn't make sense to use a heap at all.

No need to rebuild the heap; just remove (u, old_dist) and insert (u, new_dist), both of which are lg n operations.

floWenoL
Oct 23, 2002

Bonus posted:

Uh, wait, but I'm supposed to modify the distence of v, not u. u is the priority element and modifying it is log n, but I'm supposed to modify v, which is the neighbour of the priority element and I don't know where that element is in the priority queue.

Oh, right. Actually, it suffices to exchange the changed element with its parent while necessary. That's log(n) and preserves the heap property. I guess you'd also have to store per-node info as to where it is in the heap (and update it). Hmm.

floWenoL
Oct 23, 2002

Why are you assigning the return value of FpMult (an int) to the string product? Why are you then immediately overwriting product with whatever cin reads in?

Why does a function named FpMult() return an int? How exactly do you write a floating point number in hex?

floWenoL
Oct 23, 2002

Zombywuf posted:

You should probably do both, that is have a class that does A) and another class that does B) using A) to do it. Always one task per class.

Edit:
Nevermind, I misunderstood the question.

floWenoL fucked around with this message at 22:59 on Dec 1, 2008

floWenoL
Oct 23, 2002

JoeNotCharles posted:

But there's no point in trying to design a complex threaded architecture to handle every possible case if you're not going to need it.

Yeah, you're right; I misinterpreted B to be an asynchronous version of A.

floWenoL
Oct 23, 2002

hexadecimal posted:

I am not well versed in C# but if i was to do this, I would use lex/yacc. I am pretty sure C# supports regular expression matching with its standard API, so lexer part is done. You should look into C# compiler/compiler, or just make your own parser, then parse in XML and spit it back out in whatever format you want. This is probably at least a few days worth of work.

It is becoming clear that whenever someone asks a question, whatever hexadecimal suggests is almost certainly what you don't want to do.

floWenoL
Oct 23, 2002

6174 posted:

That works. It allows just -llapack to find the library.

This brings up a related question. If ld simply maps -lfile to libfile.so, then why is it that quite a few of the libraries I've got in /usr/lib do not have a .so variant? Many of them are named libfile.so.# for some number #. But there are also quite a few that have a .so symlinked to the .so.# version. Presumably with as many libraries that have this, the Ubuntu packagers didn't botch this many packages, so I'm missing something. I do see that if I run ldd on the executable, it points to liblapack.so.3. Does this mean that a .so symlink is only needed if you are linking to a specific library, and since most people aren't doing that (at least using ld), the .so symlink isn't needed and thus the package maintainers don't put it in?

It's been a while since I've worked with Linux packaging, but I believe you won't get the symlinks unless you have the package <FOO>-dev (or similar) installed.

floWenoL
Oct 23, 2002

hexadecimal posted:

What you are describing there is more like memoization not dynamic programing itself. your formula doesn't change for Fibonacci numbers, but a lot of dynamic programing is for problems when you want to optimize some parameters, like the knap sack problem - http://en.wikipedia.org/wiki/Knapsack_problem.

I have no idea what you're trying to say, but the important characteristics of a dynamic programming problem are just overlapping subproblems and optimal substructure, and frequently, memoizing the algorithm is equivalent to the normal bottom-up approach.

floWenoL
Oct 23, 2002

hexadecimal posted:

Lex/Yacc + Libcurl (C/C++) is what I would use. Depending on how complex your parsing needs to be for the webpage, you may or may not need Yacc, but lexer is a pretty nice tool. Also fast as gently caress.

You seriously give the worst programming advice.

floWenoL
Oct 23, 2002

DLCinferno posted:

I think someone needs a new custom title.

[big-red-text]
IGNORE ANYTHING I HAVE TO SAY REGARDING PROGRAMMING
[/big-red-text]

floWenoL
Oct 23, 2002

ShoulderDaemon posted:

So, I said this, and then I started thinking to myself that I should be prepared to be called-out on a statement like this. My first try was 99 lines, but I like round numbers, so here's a 50 line version:

Man, look at those strict aliasing violations.

floWenoL
Oct 23, 2002

Avenging Dentist posted:

I would guess that's he's blowing out of the stack somewhere, and adding another stack variable is saving his rear end.

I don't know why, but your statement sounds really gay. :pwn:

floWenoL
Oct 23, 2002

ErIog posted:

Am I a knob for not using the XML libraries in Python? Whenever I need to do random XML parsing, I find it easier to just use regex rather than parse the file through the official library. I realize that this might create headaches in the future, but usually these are one-off tools to automate some sort of data mining or file renaming based on XML data.

As someone who has had to maintain code that parses XML using regexps (and would randomly break every time attributes moved around) I hope you someday go through the same pain.

floWenoL
Oct 23, 2002

Otto Skorzeny posted:

Someone remind me whether comments in email addresses nest to arbitrary depth or not

They do, so even the 'full' e-mail regex doesn't parse all valid e-mail addresses.

floWenoL
Oct 23, 2002

Eclipse12 posted:

Try reading my post next time. I said I didn't know how to code; I never said that I hadn't asked someone who does how long my idea would take to make.

Here's another thing you didn't say: what the project actually is.

floWenoL
Oct 23, 2002

lord funk posted:

I've got some code optimizing to do, and I was wondering if there is a good resource / list of operations ordered by how expensive they are to run.

As in, if multiplying values together is more computationally expensive than adding, or if running if(this == that) 10,000 times is more expensive than multiplying something four times?

Is there a good place to learn this? My Google-fu is poo poo today.

With pipelining and branch prediction, it's not as simple as listing a cycle count for each instruction anymore. The profiler is your friend.

floWenoL
Oct 23, 2002

Avenging Dentist posted:

No, they learn hexadecimal. *eyes you meaningfully*

He was doing so well, too...

floWenoL
Oct 23, 2002

litghost posted:

This has some weird aliasing issues, not to mention it uses two relatively expensive calls and is hard to immediately follow what it is doing. You might be better off just manually doing the oscillation.

whooosh!

floWenoL
Oct 23, 2002

OverloadUT posted:

They share mostly the same set of stats, so I figure they should both be subclasses of a "Combatant" parent.

Does not follow.

floWenoL
Oct 23, 2002

Jose Cuervo posted:

I have the following problem, and I do not know how to logically approach it:
I have N different time intervals of the form [a,b] (e.g. the time interval [5,9] consists of the times 5, 6, 7, 8, and 9). The problem is to determine the intersection of these N time intervals, where some will overlap, and some will not.
For example, if I have [19,23], [3,7], [5,19], and [44,48], the resulting intersection I want is [3,23], [44,48].

Right now I am able to take all the start and end points of each time interval and sort them in ascending order [3, 5, 7, 19, 19, 23, 44, 48] and have an accompanying vector [s, s, e, s, e, e, s, e] of which point is a start or end.

I am lost as to how to use this information to come up with the resulting intersection, and I am open to new strategies. Any ideas?

Don't you mean union?

floWenoL
Oct 23, 2002

systran posted:

if action1 == 'attack' and action2 == 'attack':
roll1 = random.randint(1, 12)
roll2 = random.randint(1, 12)

if action1 == 'charge' and action2 == 'attack':
roll1 = random.randint(1, 12)
if roll1 > 6:
playerTwoHealth = playerTwoHealth - 2

This is why significant whitespace sucks.

floWenoL
Oct 23, 2002

baquerd posted:

The problem is not what will happen today or tomorrow, but what will happen next year when you start having precision rounding problems after some jackass put a task between two others 54 times over the year?

You could certainly do that, and have a lower bound tolerance that forces a full database re-ranking or have a maintenance task that re-ranks them every week or so, but it's still unstable.

You're not going to be able to get a perfect solution here, where each ranking change only affects at most one task, and have it stable forever.

Instead of doubles, you can use strings with lexicographic order and treat them as, say, base-26 numbers between 0 and 1. (e.g., 'abbcz' would map to .0,1,1,2,26)

Adbot
ADBOT LOVES YOU

floWenoL
Oct 23, 2002

baquerd posted:

What does "abbcz" represent (is is legal to have more than one rank per task?) and how do you translate that into the required database double rank value for task a, b, c, and z that minimizes rank movement for arbitrary re-ordering?

"abbcz" represents the number (0/26 + 1/26^2 + 1/^26^3 + 2/^26^4 + 25/26^5). There's no double rank; the rank is stored as a string. For every two distinct string ranks, there's always a string rank in between them, so you never need to 're-rank'.

The only downside is that the ranks may grow to arbitrary sizes. But with careful management you can mitigate this for all but the most pathological cases.

Edit:
Eh, I guess I didn't read the original question closely enough. If he's stuck with using doubles, then there's not much you can do.

floWenoL fucked around with this message at 03:12 on Oct 18, 2011

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply