OddObserver
Apr 3, 2009
If you re-balance every time, each re-balance costs O(log n)

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

OddObserver posted:

If you re-balance every time, each re-balance costs O(log n)
Isn't an optimal rebalance O(n), because you have to visit every node to check that it's balanced, assuming you didn't cache some additional data?

Absurd Alhazred
Mar 27, 2010

by Athanatos

roomforthetuna posted:

Isn't an optimal rebalance O(n), because you have to visit every node to check that it's balanced, assuming you didn't cache some additional data?

The additional data is that you're starting with a balanced binary sort tree. Then it's O(log n) to insert a new element, rebalancing as necessary, to keep it that way.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

Absurd Alhazred posted:

The additional data is that you're starting with a balanced binary sort tree. Then it's O(log n) to insert a new element, rebalancing as necessary, to keep it that way.
Ah, I see. But that's not a match for the situation the question was about, which was "implement find on someone else's binary tree". In that context if you sort it before each find call you're going to be incurring the cost of a full rebalance since you don't know what else might have happened between find calls.

Or you *might* be able to cache the entire tree into a balanced version on the first call then reuse your cache, or balance it in-place on the first call only, which would help if the test involves multiple finds and no significant edits.

If any of that is the case then the test should really be telling you some of these criteria though, because it is absolutely not the job of a find function to do any of these things normally.
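
For illustration, a minimal sketch of that "balance it once on the first call, then reuse" idea, assuming a plain node with val/left/right (a made-up layout, not the exam's actual struct): flatten the tree in-order into a sorted vector the first time, then binary-search that vector on every later call.

C++ code:
#include <algorithm>
#include <vector>

// Hypothetical node layout.
struct Node {
    int val;
    Node* left;
    Node* right;
};

// In-order flatten: for a valid BST this yields the values in sorted order.
// (Recursive for brevity; on a huge degenerate tree you'd want an explicit
// stack instead to avoid blowing the call stack.)
static void flatten(const Node* n, std::vector<int>& out) {
    if (!n) return;
    flatten(n->left, out);
    out.push_back(n->val);
    flatten(n->right, out);
}

// Builds the sorted cache on the first call and reuses it afterwards.
// Only valid if nobody modifies the tree between find calls.
bool cachedFind(const Node* root, int key) {
    static std::vector<int> cache;
    static bool built = false;
    if (!built) {
        flatten(root, cache);
        built = true;
    }
    return std::binary_search(cache.begin(), cache.end(), key);
}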

Xarn
Jun 26, 2015
Splay tree

Raenir Salazar
Nov 5, 2010

College Slice

roomforthetuna posted:

Ah, I see. But that's not a match for the situation the question was about, which was "implement find on someone else's binary tree". In that context if you sort it before each find call you're going to be incurring the cost of a full rebalance since you don't know what else might have happened between find calls.

Or you *might* be able to cache the entire tree into a balanced version on the first call then reuse your cache, or balance it in-place on the first call only, which would help if the test involves multiple finds and no significant edits.

If any of that is the case then the test should really be telling you some of these criteria though, because it is absolutely not the job of a find function to do any of these things normally.

Yeah, I am just trying to cover my bases as to why an (afaik) reasonable iterative search function somehow ended up timing out on the alleged tree in the exam. The two explanations being (a) there's a cycle, which I needed to detect and return nullptr in that case I suppose, or (b) the tree isn't balanced, which means it either needs to be rebalanced or something else, I dunno.

The only thing known about the tree is that it is in theory a BST and it could have over 300,000 elements.

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


They might've just hosed up and set the timeout too low.

Xarn
Jun 26, 2015

ultrafilter posted:

They might've just hosed up and set the timeout too low.

I mean it's actually this, but there is that 0.001% chance that someone wanted you to


Xarn posted:

Splay tree

Jeffrey of YOSPOS
Dec 22, 2005

GET LOSE, YOU CAN'T COMPARE WITH MY POWERS
No rebalancing algorithm is going to work in the context of a problem where a single find causes a timeout, though I suppose maybe the timeout was across multiple calls to find. I'm guessing it was a loop in the tree or too tight a bound. It's really weird if the problem didn't tell you what assumptions you could make about the trees or suggest you need to detect "bad" inputs.

Star War Sex Parrot
Oct 2, 2003

Xarn posted:

Splay tree
*Kool-Aid mans through the wall*

DID SOMEONE SAY SPLAY TREE?!

Ruzihm
Aug 11, 2010

Group up and push mid, proletariat!


splay with treez nuts

Raenir Salazar
Nov 5, 2010

College Slice
Yeah, literally the only thing I remember the question going out of its way to convey was "The tree could have 300,000 or more nodes!" and to watch out for RAM usage (I assume this is saying to avoid recursion), nothing about checking whether it's a valid tree, or a balanced one, and so on.

Maybe my implementation had some quirk that caused it to take a little longer than it otherwise should; that's one possibility, but my implementation was afaik the same as what you can get off of Google.
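
For reference, the standard iterative find being described is roughly the following (node layout assumed, not the exam's actual code); the step counter is just a cheap guard so a cycle in a malformed "tree" can't spin forever, which was one of the theories above.

C++ code:
#include <cstddef>

// Hypothetical node layout.
struct Node {
    int val;
    Node* left;
    Node* right;
};

// Plain iterative BST find: no recursion, so no stack-depth worries even on
// a badly unbalanced tree. maxSteps should be at least the node count; it
// only exists to stop a walk that has entered a cycle.
Node* find(Node* root, int key, std::size_t maxSteps) {
    Node* cur = root;
    for (std::size_t steps = 0; cur != nullptr && steps < maxSteps; ++steps) {
        if (key == cur->val)
            return cur;
        cur = (key < cur->val) ? cur->left : cur->right;
    }
    return nullptr;  // not found, or the guard tripped
}

Even on a fully degenerate 300,000-node chain that's only 300,000 pointer hops per call, which by itself is nowhere near timeout territory.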

OddObserver
Apr 3, 2009
If it's not valid then it's not a tree!

Absurd Alhazred
Mar 27, 2010

by Athanatos
Technically, a list is a tree. :grin:

Raenir Salazar
Nov 5, 2010

College Slice

Absurd Alhazred posted:

Technically, a list is a tree. :grin:

Wouldn't this be a Vine?

Absurd Alhazred
Mar 27, 2010

by Athanatos

Raenir Salazar posted:

Wouldn't this be a Vine?

I don't know, was your timeout six seconds?

OddObserver
Apr 3, 2009

Absurd Alhazred posted:

Technically, a list is a tree. :grin:

Yes, and that would be the maximally unbalanced case, for which 300K entries probably would still take a negligible amount of time.

What I was referring to was stuff like cycles. If it has cycles, it's not a tree!

HootTheOwl
May 13, 2012

Hootin and shootin

OddObserver posted:

Yes, and that would be the maximally unbalanced case, for which 300K entries probably would still take a negligible amount of time.

What I was referring to was stuff like cycles. If it has cycles, it's not a tree!

It's a perfectly balanced Unary Tree

Zopotantor
Feb 24, 2013

...and once he's in, we'll never let him out again...

OddObserver posted:

Yes, and that would be the maximally unbalanced case, for which 300K entries probably would still take a negligible amount of time.

What I was referring to was stuff like cycles. If it has cycles, it's not a tree!

It may be a "rational tree". I can't find good references for these, but they used to be a thing in some dialects of Prolog; basically they are a particular class of cyclic directed graphs that arise if you do unification without the occurs check. If you type "X = foo(X)" into a Prolog system and it doesn't reply "no", it has just built one of these, and it will probably hang, because it doesn't actually support dealing with the resulting structure.

giogadi
Oct 27, 2009

Ruzihm posted:

splay with treez nuts

:vince:

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!
I just ran into an array decaying into a pointer with Linux kernel hash tables. They're just an array of buckets with a lot of macros around them. So when I declare one in one scope and pass it into a function in another scope, now I just have a single-element pointer and everything goes to poo poo. It's using ARRAY_SIZE under the hood. All the examples I ever saw were declaring hash tables in single files and not passing the pointer around. It caught me by surprise. Is there some alternative structure available for this kind of thing or do I basically have to write my own?

Edit: I specifically don't want to declare everything together with global state here.
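
For anyone following along, the decay behaviour in question boiled down to a toy example (nothing kernel-specific here; ARRAY_SIZE is just the plain sizeof division being described):

code:
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static void takes_pointer(int *table) {
    /* "table" is just an int* in here: sizeof gives the pointer size, so
       ARRAY_SIZE(table) is sizeof(int*)/sizeof(int), not 16. */
    printf("inside the function: %zu\n", ARRAY_SIZE(table));
}

int main(void) {
    int table[16] = {0};
    printf("defining scope: %zu\n", ARRAY_SIZE(table));  /* 16 */
    takes_pointer(table);  /* the array decays to a pointer at the call */
    return 0;
}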

Vanadium
Jan 8, 2005

What if you pass around pointers to the arrays, does that work? Like one more indirection?

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!

Vanadium posted:

What if you pass around pointers to the arrays, does that work? Like one more indirection?

I'm going to speculate that won't work unless there's some other nook-and-cranny in C that I don't know about. As soon as it becomes purely pointer arguments, the functions themselves no longer have any awareness of the actual length.

Something I tried tonight that did seem to work: I threw the array in a struct and I pass a pointer to that struct around. The member in the struct is an array of a fixed size, so the compiler is always able to deduce its full size.

If you want to see the kind of thing I'm working with, it's this, with some alternation, based on whether it's a kernel build or not, to use a userspace list under the hood.

The DEFINE_HASHTABLE macro is what normally declares the hashtable, and it's just an array of struct hlist_head. As soon as I feed it into a function as a pointer, it's just struct hlist_head* without any more context, and it's basically a single-element array. The macros aren't doing anything cool to gauge the count any other way. You can see a HASH_SIZE macro calling the ARRAY_SIZE macro, which just divides sizeof the whole array by sizeof the element type.

I don't like the struct myself because all my functions are now hung up on working with a hard-coded array and that offends me.
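
For reference, the struct workaround described above in sketch form (the bucket type and names are stand-ins, not the real kernel types):

code:
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define TABLE_SIZE 256

struct bucket { void *first; };   /* stand-in for struct hlist_head */

/* Because the member is a real fixed-size array, a pointer to the struct
   still carries the array's full type across function boundaries, so
   sizeof-based macros keep working. */
struct my_table { struct bucket buckets[TABLE_SIZE]; };

static void uses_table(struct my_table *t) {
    printf("buckets seen by the callee: %zu\n", ARRAY_SIZE(t->buckets));  /* 256 */
}

int main(void) {
    struct my_table table = {0};
    uses_table(&table);
    return 0;
}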

Xarn
Jun 26, 2015
From a quick skim, the macros are just helpers that invoke actual functions and provide them with size. Why not call those functions directly, and pass the size as another arg?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
There are plenty of more full-featured hashtable implementations around if all you want is "a hashtable". Is there a particular reason you're drawn to the linux kernel implementation? Note that it's incredibly stupid simple (by design!), and was originally written just to simplify the implementation of all the ad-hoc statically sized hashtables that were floating around the kernel at that point in time.

If you're doing things like passing a hashtable around to various other bits of code for them to use directly, you're already beyond the design scope of this particular implementation. And probably at some point you're going to want to do something like "resize the hashtable when it gets full", at which point you'll need to rewrite it.

Vanadium
Jan 8, 2005

Rocko Bonaparte posted:

I'm going to speculate that won't work unless there's some other nook-and-cranny in C that I don't know about. As soon as it becomes purely pointer arguments, the functions themselves no longer have any awareness of the actual length.

If the top-level is an array, it becomes a pointer and loses the length. But a pointer to an array doesn't become a pointer to a pointer.

This seems to "work":

code:
typedef int HashTable[42];
unsigned long foo(HashTable* h) {
    return sizeof(*h);
}
ofc that doesn't help if the size isn't actually static but I don't think the struct solution works then either?

nielsm
Jun 1, 2009



The sizeof() operator in C and C++ is a compile time thing, it looks at the storage size of the object given and compiles into a constant value. It doesn't know anything about dynamic memory management and definitely can't help you find the size of an arbitrary memory block you have a pointer to.

If you have hash tables of varying sizes you want to let a single set of shared functions work on, you need to carry the size of the hash table around along with the data. Either as a separate variable, or as a member of a struct representing the hash table.
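
A completely generic sketch of that "carry the size with the data" shape (not the kernel API, just the idea):

code:
#include <stdlib.h>

struct bucket { void *first; };   /* stand-in for the real bucket type */

/* The bucket array and its length travel together, so shared functions
   never have to recover the size from a decayed pointer. */
struct hash_table {
    struct bucket *buckets;
    size_t nbuckets;
};

static int hash_table_init(struct hash_table *t, size_t nbuckets) {
    t->buckets = (struct bucket *)calloc(nbuckets, sizeof(*t->buckets));
    t->nbuckets = nbuckets;
    return t->buckets ? 0 : -1;
}

static size_t bucket_index(const struct hash_table *t, unsigned long hash) {
    return hash % t->nbuckets;   /* every helper reads the size from the handle */
}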

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
I mean, if you’re using C (not C++) and you really want to pursue this, ARRAY_SIZE probably does do the right thing with a VLA, so you can do something like:

C code:
void foo(struct hlist_head *_table, size_t len) {
  typedef struct hlist_head *table_t[len];
  table_t *table = (table_t*) _table;
  …
}

Xarn
Jun 26, 2015
:catstare:

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!

Xarn posted:

From a quick skim, the macros are just helpers that invoke actual functions and provide them with size. Why not call those functions directly, and pass the size as another arg?

I was thinking about doing that, but I already have a segfault generator disguised as a program where I had to redo one of the hash table's iteration macros to handle bulk cleanup. Also, depending on how far I change it, I am kind of getting into a Ship of Theseus and might be better off just writing something else from scratch.

Speaking of that:

Jabor posted:

There are plenty of more full-featured hashtable implementations around if all you want is "a hashtable". Is there a particular reason you're drawn to the linux kernel implementation? Note that it's incredibly stupid simple (by design!), and was originally written just to simplify the implementation of all the ad-hoc statically sized hashtables that were floating around the kernel at that point in time.

If you're doing things like passing a hashtable around to various other bits of code for them to use directly, you're already beyond the design scope of this particular implementation. And probably at some point you're going to want to do something like "resize the hashtable when it gets full", at which point you'll need to rewrite it.

I figured I'd start with what was available in front of me, and I did stumble into it with my eyes half-open. I also got seduced by the idea that hashtable was meant to consolidate a lot of different, independent implementations in the kernel, so I figured I'd be an rear end if I just went off and wrote my own anyways. I should be more shameless.

This particular piece of code is the data store for the larger module, so it has to build in kernel space. I'm testing it in user space right now with an alternate header that #includes a userspace list implementation too. This lets me do unit testing on it with a C++ framework. Having a C++ compiler for the unit testing and a C compiler for module building does have some other wrinkles. Recall how I was asking about designated inits because the C++ compiler was choking on them.

The situation I'm in here is fine with a fixed bucket size since I have a good handle on the volume going into it, but I did think of replacing the lists underneath with some kind of hash table too, and those are much more volatile.

If it was a decade ago when I was last doing a lot of C and C++, I would have probably rolled my own, but I'm getting back into it and paying the dope fees.


Vanadium posted:

code:
typedef int HashTable[42];
unsigned long foo(HashTable* h) {
    return sizeof(*h);
}

I'm actually going to switch to this. I had temporarily lobotomized typedef out of my brain because the coding standards here poopoo'd typedefs for structs, which is where I normally think of using them. So the concept of using a typedef completely left my mind.


rjmccall posted:

I mean, if you’re using C (not C++) and you really want to pursue this, ARRAY_SIZE probably does do the right thing with a VLA, so you can do something like:

Xarn's response to that made me laugh. I am building unit tests for this with a C++ compiler, so I think it's a no-go. Also, how is that not just a leak on the stack that'll immediately blow up?

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
There’s no allocation on the stack there.

You can think of the C VLA feature as two different things: first, the ability to express array types with dynamic bounds, and second, the ability to declare local variables with such an array type. The first is really just automatically doing sizeof/offsetting stuff that you could just as well do by hand; for example, adding i to an int (*)[numCols] implicitly scales by numCols * sizeof(int), but of course if you just had an int* you could add i * numCols and get the same effect. That’s all that’s happening in my example, playing around with types so that you can still use type-directed features like sizeof. The second does a dynamically-sized stack allocation, which is tricky in ways that end up dominating the discussion of VLAs; but you don’t have to use that part of the feature.
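
Concretely, the scaling being described, shown with a constant bound for simplicity (the VLA version just lets the bound be a runtime value, which is the part that only exists in C):

code:
#include <stdio.h>

enum { NUM_COLS = 4 };

int main(void) {
    int flat[3 * NUM_COLS];
    for (int i = 0; i < 3 * NUM_COLS; ++i)
        flat[i] = i;

    /* A pointer to a whole row: adding 1 advances by NUM_COLS * sizeof(int),
       so rows[i][j] and flat[i * NUM_COLS + j] name the same element. */
    int (*rows)[NUM_COLS] = (int (*)[NUM_COLS])flat;

    printf("%d %d\n", rows[2][1], flat[2 * NUM_COLS + 1]);  /* 9 9 */
    printf("%zu\n", sizeof(*rows));                         /* 4 * sizeof(int) */
    return 0;
}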

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!
I feel like the typedef to a fixed-size array should have worked but I got spooky behavior in the macros when I run tests (in C++). The hash_add macro would go through all the motions of adding a new item, but it wouldn't ultimately stick and it would think the table was actually empty after the call. I expanded it out and saw the same despite all the pointers moving around. Eventually I had the hlist_head for the hashed bucket expanded like this:

code:
hlist_head* buckethead = &(*dict)[bucket];
Which somehow got everything to finally sync up, but I couldn't figure out how to state that (with clenched teeth) at the root of the hash_add macro, and I wasn't in the business of expanding every drat macro. Still, I'm kind of puzzled by it. I'm assuming it's something dumb but I wanted to post before the weekend started up. Maybe I'll see it fresh next week. I'm actually doing fine with the struct containing the array right now. No segfaults and no test errors.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
Mismatch of whether you were passing a pointer or an l-value?

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!

rjmccall posted:

Mismatch of whether you were passing a pointer or an l-value?

Maybe? The DEFINE_HASHTABLE macro creates a struct hlist_head, not a pointer, and it looks like the other macros assume it's not a pointer.

I guess I should check tomorrow if I wasn't just doing my typical stupid thing of declaring something on the stack in a helper and then running around with a pointer to it like it would still be valid after the call completed. I used to do that all the drat time when I was a kid and beat it out of myself, but the macros here kind of hid it away and suckered me into it at least once while working on this.
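
(For reference, the classic shape of that bug, with made-up names:)

code:
#include <stdio.h>

struct entry { int key; int value; };

/* Bug: "found" lives in this function's stack frame, so the returned
   pointer dangles the moment the function returns. GCC and Clang both
   warn about this direct form (e.g. GCC's -Wreturn-local-addr), but once
   the escape is hidden behind a struct, an out-parameter, or a macro they
   usually can't see it. */
static struct entry *find_entry_broken(int key) {
    struct entry found = { key, 42 };
    return &found;                      /* dangling pointer */
}

int main(void) {
    struct entry *e = find_entry_broken(7);
    printf("%d\n", e->value);           /* undefined behaviour */
    return 0;
}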

Computer viking
May 30, 2011
Now with less breakage.

Is that a thing a decent C compiler will try to warn you about these days?

Xarn
Jun 26, 2015

rjmccall, do you participate in the lib-ext mailing list?

If yes, what is your take on the recent discussion about [[nodiscard]] in the specification of the standard library? Because I can't help thinking that more time was wasted on the mailing list and in discussion about whether we should do it, whether it belongs in the standard, etc., than it would have taken to just review the goddamn paper like any other.

Ranidas
Jun 19, 2007
This is probably a really stupid newbie question, but I am just learning and stumped on this. Not necessarily looking for a finished solution but maybe some guidance on where my thought process is leading me astray.

So, background: this is in C. The problem I'm working on needs a function that takes in a two-dimensional boolean array that represents arrows from one candidate to the next. This is for an election where you can rank candidates. The winner is the candidate that has "arrows" pointing to others but none pointing at them, so the winner will never appear as the j index in bool locked[i][j]. But when locking these in you need to avoid making a loop. So if there were just three candidates, something like:

code:
bool loop (int winner, int loser)
{
    for (int i = 0; i < candidate_count; i++)
    {
        if (locked[i][winner] == true)
            if (locked[loser][i] == true)
                return true;
    }
    
    return false;
}
This will return true if there is a loop, such as if locked[0][1] and locked[1][2] are true (candidate 0 preferred over candidate 1, and candidate 1 preferred over candidate 2) and you try to set locked[2][0] to true. But that doesn't work for more than three candidates, and it seems like I need recursion to make that work. Or at least I'm assuming so, since the lesson ended on recursive functions.

I came up with this:

code:
bool loop (int winner, int loser)
{
    if (locked[loser][winner] == true)
        return true;
    
    for (int i = 0; i < candidate_count; i++)
    {
        if (locked[i][winner] == true)
            return loop(loser, i);
    }
    
    return false;
}
Which... does not work. My goal is to do the same sort of check but walk back through however many candidate arrows there are to find larger loops. My testing seems to show that as it starts down the line of recursion, if the first i (i = 0) does not make a loop, then return loop(loser, i) returns false before the 'for' loop in the original function call can go any further. So it says there's no loop and locks it in even if there is one. If I try to qualify the 'return false' in any way, like using a counter and an 'if' statement to make sure it gets all the way through the original 'for' loop, I get compilation errors saying a non-void function does not return a value in all control paths.

This is my first time using recursive functions, and I'm not sure if I'm really setting up the structure in a realistic way. Am I missing an obvious fix, or is this not at all how a recursive function should be set up?

Sweeper
Nov 29, 2007
The Joe Buck of Posting
Dinosaur Gum

Xarn posted:

rjmccall, do you participate in the lib-ext mailing list?

If yes, what is your take on the recent discussion about [[nodiscard]] in the specification of the standard library? Because I can't help thinking that more time was wasted on the mailing list and in discussion about whether we should do it, whether it belongs in the standard, etc., than it would have taken to just review the goddamn paper like any other.

I like nodiscard and would prefer if all my return values were nodiscard, I’m happy to (void) them away if I don’t want them

I think the [[thing]] format is ugly as gently caress though, makes my fall throughs look like poo poo
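
For anyone who hasn't used them, the two attributes in question in a few lines of standard C++17:

C++ code:
#include <cstdio>

[[nodiscard]] int reserve_slot() { return 3; }

void demo(int mode) {
    reserve_slot();          // compilers warn: a [[nodiscard]] result is being ignored
    (void)reserve_slot();    // the (void) cast says "I meant to drop it"

    switch (mode) {
    case 0:
        std::puts("zero");
        [[fallthrough]];     // the other [[thing]]-style attribute, for deliberate fallthrough
    case 1:
        std::puts("zero or one");
        break;
    default:
        break;
    }
}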

OddObserver
Apr 3, 2009

Sweeper posted:

I like nodiscard and would prefer if all my return values were nodiscard, I’m happy to (void) them away if I don’t want them

I think the [[thing]] format is ugly as gently caress though, makes my fall throughs look like poo poo

Sounds like it's an attempt to standardize something similar to GCC's warn_unused_result attribute?

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe

Xarn posted:

rjmccall, do you participate in the lib-ext mailing list?

If yes, what is your take on the recent discussion about [[nodiscard]] in the specification of the standard library? Because I can't help thinking that more time was wasted on the mailing list and in discussion about whether we should do it, whether it belongs in the standard, etc., than it would have taken to just review the goddamn paper like any other.

No, I don't have time to keep up with the C++ lists proactively; I only weigh in on specific concerns when they're brought up to me.

As a language designer, I think the right rule is that expression results should default to not being discardable, and functions should have to opt out of that. That is obviously not a rule that you can retroactively impose after 30 years, so the best you're going to achieve is to allow particularly important functions to opt in to a diagnostic. That is still a useful thing to have, even if it is manifestly less useful than having the right rule in the first place would have been.

The C++ standard doesn't formalize diagnostics, just whether a program is well-formed, which is why the [[nodiscard]] attribute specification has funny wording about "the compiler is encouraged to emit a warning". It seems obvious that you could borrow that same funny wording for the standard library's specifications of e.g. c.empty() and c.size() and get the point across without crossing any novel lines like implying that standard library types have member functions. I don't know why anyone would have a problem with that, but it's the C++ committee, so I'm not surprised that somehow this turned into a major fight.
