|
If you re-balance every time, each re-balance costs O(log n)
|
# ? Jul 28, 2021 17:39 |
|
|
|
OddObserver posted:If you re-balance every time, each re-balance costs O(log n)
|
# ? Jul 28, 2021 19:37 |
|
roomforthetuna posted:Isn't an optimal rebalance O(n) because you have to visit every node to see if it's balanced, assuming you didn't cache some additional data? The additional data is that you're starting with a balanced binary search tree. Then it's O(log n) to insert a new element, rebalancing as necessary, to keep it that way.
|
# ? Jul 28, 2021 19:43 |
|
Absurd Alhazred posted:The additional data is that you're starting with a balanced binary sort tree. Then it's O(log n) to insert a new element, rebalancing as necessary, to keep it that way. Or you *might* be able to cache the entire tree into a balanced version on the first call then reuse your cache, or balance it in-place on the first call only, which would help if the test involves multiple finds and no significant edits. If any of that is the case then the test should really be telling you some of these criteria though, because it is absolutely not the job of a find function to do any of these things normally.
|
# ? Jul 28, 2021 19:52 |
|
Splay tree
|
# ? Jul 28, 2021 20:33 |
|
roomforthetuna posted:Ah, I see. But that's not a match for the situation the question was about, which was "implement find on someone else's binary tree". In that context if you sort it before each find call you're going to be incurring the cost of a full rebalance since you don't know what else might have happened between find calls. Yeah, I am just trying to cover my bases as to why an afaik reasonable iterative search function somehow ended up timing out on the alleged tree in the exam. The two explanations being (a) there's a cycle, which I needed to detect and return nullptr for in that case I suppose, or (b) the tree isn't balanced, which either needs to be rebalanced or something else, I dunno. The only thing known about the tree is that it is in theory a BST and it could have over 300,000 elements.
|
# ? Jul 28, 2021 21:54 |
|
They might've just hosed up and set the timeout too low.
|
# ? Jul 28, 2021 22:00 |
|
ultrafilter posted:They might've just hosed up and set the timeout too low. I mean it's actually this, but there is that 0.001% chance that someone wanted you to do this: Xarn posted:Splay tree
|
# ? Jul 28, 2021 22:16 |
|
No rebalancing algorithm is going to work in the context of a problem where a single find causes a timeout, though I suppose maybe the timeout was across multiple calls to find. I'm guessing it was a loop in the tree or too tight a bound. It's really weird if the problem didn't tell you what assumptions you could make about the trees or suggest you need to detect "bad" inputs.
|
# ? Jul 28, 2021 22:32 |
|
Xarn posted:Splay tree DID SOMEONE SAY SPLAY TREE?!
|
# ? Jul 28, 2021 22:33 |
splay with treez nuts
|
|
# ? Jul 28, 2021 22:34 |
|
Yeah, literally the only things I remember the question going out of its way to convey were "The tree could have 300,000 or more nodes!" and to watch out for RAM usage (I assume this is saying to avoid recursion), nothing about checking whether it's a valid tree, or a balanced one, and so on. Maybe my implementation had some quirk to it that caused it to take a little longer than it otherwise should, that's one possibility, but my implementation was afaik the same as you can get off of google.
|
# ? Jul 29, 2021 00:25 |
|
If it's not valid then it's not a tree!
|
# ? Jul 29, 2021 00:50 |
|
Technically, a list is a tree.
|
# ? Jul 29, 2021 00:51 |
|
Absurd Alhazred posted:Technically, a list is a tree. Wouldn't this be a Vine?
|
# ? Jul 29, 2021 00:52 |
|
Raenir Salazar posted:Wouldn't this be a Vine? I don't know, was your timeout six seconds?
|
# ? Jul 29, 2021 00:55 |
|
Absurd Alhazred posted:Technically, a list is a tree. Yes, and that would be the maximally unbalanced case, for which 300K entries probably would still take a negligible amount of time. What I was referring to was stuff like cycles. If it has cycles, it's not a tree!
|
# ? Jul 29, 2021 00:55 |
|
OddObserver posted:Yes, and that would be the maximally unbalanced case, for which 300K entries probably would still take a negligible amount of time. It's a perfectly balanced Unary Tree
|
# ? Jul 29, 2021 02:37 |
|
OddObserver posted:Yes, and that would be the maximally unbalanced case, for which 300K entries probably would still take a negligible amount of time. It may be a "rational tree". I can't find good references for these, but they used to be a thing in some dialects of Prolog; basically they are a particular class of cyclic directed graphs that arise if you do unification without the occurs check. If you type "X = foo(X)" into a Prolog system and it doesn’t reply "no", it does this. And it will probably hang, because it doesn’t actually support dealing with the resulting structure.
|
# ? Jul 29, 2021 11:21 |
|
Ruzihm posted:splay with treez nuts
|
# ? Jul 29, 2021 13:02 |
|
I just ran into an array degrading into a pointer with Linux kernel hash tables. They're just an array of buckets with a lot of macros around them. So when I declare one in one scope and pass it into a function with another scope, now I just have a single-element pointer and everything goes to poo poo. It's using ARRAY_SIZE under the hood. All the examples I ever saw were declaring hash tables in single files and not passing the pointer around. It caught me by surprise. Is there some alternative structure available for this kind of thing or do I basically have to write my own? Edit: I specifically don't want to declare everything together with global state here.
|
# ? Jul 29, 2021 22:46 |
|
What if you pass around pointers to the arrays, does that work? Like one more indirection?
|
# ? Jul 30, 2021 06:39 |
|
Vanadium posted:What if you pass around pointers to the arrays, does that work? Like one more indirection? I'm going to speculate that won't work unless there's some other nook-and-cranny in C that I don't know about. As soon as it becomes purely pointer arguments, the functions themselves no longer have any awareness of the actual length. Something I tried tonight that did seem to work: I threw the array in a struct and I pass a pointer to that struct around. The member in the struct is an array of a fixed size, so the compiler is always able to deduce its full size. If you want to see the kind of thing I'm working with, it's this, with some alternation based on whether it's a kernel compilation or not, to use a userspace list under the hood. The DEFINE_HASHTABLE macro is what normally declares the hashtable, and it's just an array of struct hlist_head. As soon as I feed it into a function as a pointer, it's just struct hlist_head* without any more context and it's basically a single-element array. The macros aren't doing anything cool to gauge the count any other way. You can see a HASH_SIZE macro calling the ARRAY_SIZE macro, which just does sizeof the whole array against sizeof the type. I don't like the struct myself because all my functions are now hung up on working with a hard-coded array and that offends me.
|
# ? Jul 30, 2021 07:41 |
|
From a quick skim, the macros are just helpers that invoke actual functions and provide them with size. Why not call those functions directly, and pass the size as another arg?
|
# ? Jul 30, 2021 09:25 |
|
There are plenty of more full-featured hashtable implementations around if all you want is "a hashtable". Is there a particular reason you're drawn to the linux kernel implementation? Note that it's incredibly stupid simple (by design!), and was originally written just to simplify the implementation of all the ad-hoc statically sized hashtables that were floating around the kernel at that point in time. If you're doing things like passing a hashtable around to various other bits of code for them to use directly, you're already beyond the design scope of this particular implementation. And probably at some point you're going to want to do something like "resize the hashtable when it gets full", at which point you'll need to rewrite it.
|
# ? Jul 30, 2021 09:26 |
|
Rocko Bonaparte posted:I'm going to speculate that won't work unless there's some other nook-and-cranny in C that I don't know about. As soon as it becomes purely pointer arguments, the functions themselves no longer have any awareness of the actual length. If the top-level is an array, it becomes a pointer and loses the length. But a pointer to an array doesn't become a pointer to a pointer. This seems to "work": code:
|
# ? Jul 30, 2021 09:30 |
The sizeof() operator in C and C++ is a compile-time thing: it looks at the storage size of the object given and compiles into a constant value. It doesn't know anything about dynamic memory management and definitely can't help you find the size of an arbitrary memory block you have a pointer to. If you have hash tables of varying sizes you want to let a single set of shared functions work on, you need to carry the size of the hash table around along with the data. Either as a separate variable, or as a member of a struct representing the hash table.
|
|
# ? Jul 30, 2021 10:16 |
|
I mean, if you’re using C (not C++) and you really want to pursue this, ARRAY_SIZE probably does do the right thing with a VLA, so you can do something like:C code:
|
# ? Jul 30, 2021 16:52 |
|
|
# ? Jul 30, 2021 17:01 |
|
Xarn posted:From a quick skim, the macros are just helpers that invoke actual functions and provide them with size. Why not call those functions directly, and pass the size as another arg? I was thinking about doing that, but I already have a segfault generator disguised as a program where I had to redo one of the hash table's iteration macros to handle bulk cleanup. Also, depending on how far I change it, I am kind of getting into a Ship of Theseus and might be better off just writing something else from scratch. Speaking of that: Jabor posted:There are plenty of more full-featured hashtable implementations around if all you want is "a hashtable". Is there a particular reason you're drawn to the linux kernel implementation? Note that it's incredibly stupid simple (by design!), and was originally written just to simplify the implementation of all the ad-hoc statically sized hashtables that were floating around the kernel at that point in time. I figured I'd start with what was available in front of me, and I did stumble into it with my eyes half-open. I also got seduced by the idea that hashtable was meant to consolidate a lot of different, independent implementations in the kernel, so I figured I'd be an rear end if I just went off and wrote my own anyways. I should be more shameless. This particular piece of code is the data store for the larger module so it has to build in kernel space. I'm testing it in user space right now with an alternate header that #includes a userspace list implementation too. This lets me do unit testing on it with a C++ framework. Having a C++ compiler for the unit testing and a C compiler for module building does have some other wrinkles. Recall how I was asking about designated inits because the C++ compiler was choking on them.
The situation I'm in here is fine with a fixed bucket size since I have a good handle on the volume going into it, but I did think of replacing the lists underneath with some kind of hash table too, and those are much more volatile. If it was a decade ago when I was last doing a lot of C and C++, I would have probably rolled my own, but I'm getting back into it and paying the dope fees. Vanadium posted:
I'm actually going to switch to this. I had lobotomized typedef temporarily from my brain because the coding standards here poopoo'd typedefs for structs, which is where I normally think of using them. So the concept of using a typedef completely left my mind. rjmccall posted:I mean, if you’re using C (not C++) and you really want to pursue this, ARRAY_SIZE probably does do the right thing with a VLA, so you can do something like: Xarn's response to that made me laugh. I am building unit tests for this with a C++ compiler so I think it's a no-go. Also, how is that not just a leak on the stack that'll immediately blow up?
|
# ? Jul 30, 2021 17:40 |
|
There’s no allocation on the stack there. You can think of the C VLA feature as two different things: first, the ability to express array types with dynamic bounds, and second, the ability to declare local variables with such an array type. The first is really just automatically doing sizeof/offsetting stuff that you could just as well do by hand; for example, adding i to a int (*)[numCols] implicitly scales by numCols * sizeof(int), but of course if you just had an int* you could add i * numCols and get the same effect. That’s all that’s happening in my example, playing around with types so that you can still use type-directed features like sizeof. The second does a dynamically-sized stack allocation, which is tricky in ways that end up dominating the discussion of VLAs; but you don’t have to use that part of the feature.
|
# ? Jul 30, 2021 18:05 |
|
I feel like the typedef to a fixed-size array should have worked but I got spooky behavior in the macros when I run tests (in C++). The hash_add macro would go through all the motions of adding a new item, but it wouldn't ultimately stick and it would think the table was actually empty after the call. I expanded it out and saw the same despite all the pointers moving around. Eventually I had the hlist_head for the hashed bucket expanded like this:code:
|
# ? Jul 31, 2021 01:14 |
|
Mis-match of whether you were passing a pointer or an l-value?
|
# ? Jul 31, 2021 04:19 |
|
rjmccall posted:Mis-match of whether you were passing a pointer or an l-value? Maybe? The DEFINE_HASHTABLE macro creates a struct hlist_head, not a pointer, and it looks like the other macros assume it's not a pointer. I guess I should check tomorrow if I wasn't just doing my typical stupid thing of declaring something on the stack in a helper and then running around with a pointer to it like it would still be valid after the call completed. I used to do that all the drat time when I was a kid and beat it out of myself, but the macros here kind of hid it away and suckered me into it at least once while working on this.
|
# ? Aug 1, 2021 18:35 |
|
Is that a thing a decent C compiler will try to warn you about these days?
|
# ? Aug 2, 2021 17:18 |
|
rjmccall posted:Do you participate in lib-ext mailing? If yes, what is your take on the recent discussion on [[nodiscard]] in the specification of the standard library? Because I can't help thinking that more time was wasted in the mailing list and discussion about whether we should do it, whether it belongs in the standard, etc., than it would have taken to just review the goddamn paper like any other.
|
# ? Aug 2, 2021 19:42 |
|
This is probably a really stupid newbie question, but I am just learning and stumped on this. Not necessarily looking for a finished solution, but maybe some guidance on where my thought process is leading me astray. So, background. This is in C. The problem I'm working on needs a function that takes in a two-dimensional boolean array that represents arrows from one candidate to the next. This is for an election where you can rank candidates. The winner is the candidate that has "arrows" pointing to others but none at them. So the winner will never appear in bool locked[i][j] in the j position. But, when locking these in, you need to avoid making a loop. So if there were just three candidates, something like: code:
I came up with this: code:
This is my first time using recursive functions, and I'm not sure if I'm really setting up the structure in a realistic way. Am I missing an obvious fix, or is this not at all how a recursive function should be set up?
|
# ? Aug 2, 2021 21:46 |
|
Xarn posted:Do you participate in lib-ext mailing? I like nodiscard and would prefer if all my return values were nodiscard, I’m happy to (void) them away if I don’t want them I think the [[thing]] format is ugly as gently caress though, makes my fall throughs look like poo poo
|
# ? Aug 2, 2021 22:03 |
|
Sweeper posted:I like nodiscard and would prefer if all my return values were nodiscard, I’m happy to (void) them away if I don’t want them Sounds like it's an attempt to standardize something similar to GCC's warn_unused_result attribute?
|
# ? Aug 2, 2021 22:38 |
|
|
Xarn posted:Do you participate in lib-ext mailing? No, I don't have time to keep up with the C++ lists proactively; I only weigh in on specific concerns when they're brought up to me. As a language designer, I think the right rule is that expression results should default to not being discardable, and functions should have to opt out of that. That is obviously not a rule that you can retroactively impose after 30 years, so the best you're going to achieve is to allow particularly important functions to opt in to a diagnostic. That is still a useful thing to have, even if it is manifestly less useful than having the right rule in the first place would have been. The C++ standard doesn't formalize diagnostics, just whether a program is well-formed, which is why the [[nodiscard]] attribute specification has funny wording about "the compiler is encouraged to emit a warning". It seems obvious that you could borrow that same funny wording for the standard library's specifications of e.g. c.empty() and c.size() and get the point across without crossing any novel lines like implying that standard library types have member functions. I don't know why anyone would have a problem with that, but it's the C++ committee, so I'm not surprised that somehow this turned into a major fight.
|
# ? Aug 2, 2021 23:41 |