Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Are you trying to optimize that solution before checking to see whether it actually needs to be optimized?

Don't you think that's a bit ... premature?

Adbot
ADBOT LOVES YOU

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

KARMA! posted:

How would you represent a tree structure with a db then if you do want to be performant? Off-load the complexity to whatever language connects to the db, using specific db extensions, or what?

If your database supports recursive queries, you'd just have a table of relationships (an adjacency list) and run recursive queries on that. If you don't, it depends on your use case - for a lot of cases (e.g. checking for friend-of-a-friends), just faking the recursion yourself out to a small depth (say, 4) could be entirely sufficient and still performant enough. Or if your data forms a tree and needs to be queried often but not updated very much you might use nested sets.

In general though, if you find yourself running into these problems it's often a sign that you need to re-think your data model. Or perhaps your storage implementation - if your data fits better as a graph than as a set of relations, why not use a graph database instead of a relational one?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

NihilCredo posted:

but it can only be executed before any WHERE filter

Uh, how would your proposed synthetic incremental id work in this case? As soon as you filter, you have holes, so there's no guarantee that whatever random number you've selected is still in the set.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

NihilCredo posted:

You're correct, it would be an uneven distribution, but that ought to be acceptable for my use case. The table is a sort of work queue (SELECT FOR UPDATE SKIP LOCKED style), so any gaps should 'smooth out' over time.

In that case, you could use a TABLESAMPLE query to pick a random row (any row, don't put your condition in yet), then find the first row that matches your condition and has id >= your randomly selected row's id. If you get no results, then you looped, and can grab the first one in the table (with id < your randomly selected row's id, if you want to keep it to just one table scan in the case where no rows meet your condition).

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
It sounds like you want to use the Range operators to determine whether the range the user is interested in is entirely contained by the range you have in the database.

So something like
code:
select * from hotels where hotels.seasondates @> daterange(first_date, last_date, '[]')

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
code:
select p.playerid, s.seasonid, coalesce(ps.score, 0)
from players p cross join seasons s
left join players_seasons ps on ps.playerid = p.playerid and ps.seasonid = s.seasonid
where
  ps.score is not null
  or s.active = 1
Kind of a brute force approach, cross join players and seasons and filter down to the ones that are relevant. The trick is that you can't be joining into the seasons table based on players_seasons, since you're not going to have that link on some of the things that you need it for. You can drop the coalescing if you want an explicit null there ofc.

Alternatively you can union two subqueries (one for the active season, one for older seasons) if you need ~blazing fast performance~ and your players_seasons are super sparse or something.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
You can think of the number as being a pointer to an actual lock object in memory, if you like. If two things try to lock using the same pointer, one will succeed and the other will need to wait, while if two things try to lock different pointers they will both succeed and not interfere with each other.

If two things want to use the same lock then they need to use the same number - how exactly they agree on what number to use is up to you.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
If the newest movement for a particular bean is like, a week old, do you still want to show it? Normally a ticker only shows things that have just happened, in which case it becomes very easy to create your query just by flipping things around - instead of selecting beans and finding their most recent movement, select movements and join that with the bean table to get the bean characteristics.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
code:

DBMS_OUTPUT.PUT_LINE('Pledge Amount: ' || pledges.pledge_amt);

This is the only place that uses pledges.pledge_amt instead of pledges.pledgeamt.

In general your naming schemes seem kind of inconsistent.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Those look like regular-rear end single-precision floats.

Also this is the SQL thread, you might prefer the stickied general programming thread if you have other questions.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Your example doesn't really make sense.

- Weight should to be a float value, but your weight extra actually has an int
- There's a row that claims to be a birth date, but contains a string instead of a date
- You want a query to divine that the aforementioned bogus birth date row is actually a supplier, and return that instead of the correct (uncorrupted) supplier value?

Assuming those are just errors in putting together the example and aren't actually in your database ... what's the point of this query? If you're looking up a single element, leaving it as rows seems equally as useful (if not more so) than pivoting it into columns, given that I don't know (when running this query) what columns I'm going to have. If I'm querying a bunch of elements for further processing, getting back data of a totally unknown shape is pretty useless to me too. Do you just want to render in a UI all the values for every piece of data you have? Even that would better be served by getting the values as separate rows (ordered so that all the values for one piece of data are together), unless your "UI" is just an SQL shell you're pasting queries into.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
The use case for uuids is "I want to generate records in many disparate servers at the same time, and not have to deal with collisions when we eventually reconcile them".

If you're not running a wildly distributed setup, and are perhaps doing something a bit more traditional with a beefy server plus some read replicas, then don't bother.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
There are use cases for "anyone with the url is presumed to be allowed access", but you still don't want everything to be publicly browsable.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
I'm reading the docs you linked and all I can think is that ORMs are a mistake.

The SQL you want is something like:

code:
SELECT id, title, is_sticky, last_read_post_id, number_posts_read
FROM threads t
LEFT JOIN read_count r ON r.user_id = ? AND t.id = r.thread_id
WHERE t.is_deleted = False AND t.forum_id = ?
ORDER BY t.is_sticky DESC, t.last_post_at DESC
Trivial to write, I have no idea how you would do that in your ORM.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Having to shuffle things around in an index instead of just flipping the "this has been read" field every time a user reads a message seems like it'd cause more performance issues than it solves.

Have you considered tracking "this is the number of unread messages for this user" instead of counting them all up every time you need them?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Having a second index just on the column you're using for the predicate is of absolutely zero help.

The DB can use the index on the foreign key column to find the entries that correspond to the selected users. And it could use the index on the predicate column to identify the entries for all users that meet the predicate. But there's not actually any way to combine those results into "all entries for the selected users that meet the predicate" - it's easier to just ignore the second index and instead look at every entry from the first index one at a time.

(Especially if you're talking about low millions of rows - that's not a huge amount of data to churn through).

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
You use cardinality symbols in drawn diagrams so that it's easy to see what's up visually. You use the phrases in written or spoken text because trying to embed cardinality symbols there is cumbersome and doesn't actually help with clarity.

I'm not sure where your confusion about reciprocity is coming from.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

credburn posted:

If something is 1:1 in one direction, but 1:M the other, say, then what use is describing a relationship of being "one to one" if that's only true in one direction?

Can you describe a relation that works this way?

In general, if something is one-to-many in one direction, it's many-to-one in the other direction.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
To use your specific example, there are many cases that have the same client, so it's many-to-one.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

credburn posted:

It's making more sense to me now, but this still stumps me:

many-to-one I get because of the way you worded it. But why would it be incorrect to say it was one-to-many, as in one client instructs many cases?

It wouldn't be incorrect, you could absolutely describe it that way if you wanted to. "Clients has a one-to-many relationship with Cases" means the same thing as "Cases has a many-to-one relationship with Clients".

How do you know which way around to describe it? That's all about the context. Often one way will be a more natural way of describing the relationship, while the other one seems confusing or backwards.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
That chatgpt answer is really cool because it looks plausible and probably works on a couple of test cases but is also completely wrong and doesn't meet the specification at all.

Please do not use chatgpt to answer questions, it does not actually know the answers.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Have you read the warning messages, or did you only look at how many there were?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
cmds hell sounds like an accurate description

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

Ihmemies posted:

Actually what the gently caress these tasks are..

So how I'm going to write an infinitely deep SQL query without recursion? Like I just can't use a "for" loop or anything :D Hnngh.. poo poo..

This sounds like it's asking "have you read the documentation/lecture notes closely enough to find the CONNECT BY operator"

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Not directly answering your question, but in many cases where something turns out to be too expensive to compute on every single page load, the right answer is to store the computed result somewhere and update it every time it changes.

Adbot
ADBOT LOVES YOU

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
instead of spending brainpower worrying about clustering and how to optimally set up a primary key, i would just do the normalization and then spec out a server that fits the resulting dataset in ram.

it's likely to be both cheaper and give better outcomes than spending a bunch of your expensive time on optimization.

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply