|
Thanks for the help, that makes sense. I guess what I'm looking for is a way to select all the data from the minor league table and the requested data from the major league table. I'm not sure if this is actually possible. If you understand mongo and mongoose, this is how I would currently do it with a url like /minorleagueteam/:id:
code:
async.parallel({
    majorLeague: function(callback){
        HomeTeam.find({ 'team-id': req.params.id }).exec(callback);
    },
    minorLeague: function(callback){
        MinorLeague.find({ 'team-id': req.params.id }).exec(callback);
    }
}, function(err, results){
    res.render('template', {
        hometeam: results.majorLeague,
        'minor-league': results.minorLeague
    });
});
and I could get all the info from MajorLeague that relates to that corresponding record, if that makes sense, while still being able to access all the data rows of both tables
|
# ? Jan 6, 2021 23:08 |
|
it would probably make more sense if you gave an example output of what exactly you're looking for
|
# ? Jan 6, 2021 23:16 |
|
yeah that makes sense lol. Basically I want to query for, let's say, MinorLeagueTeam "Bluebirds" and get the following pieces of information:
HomeTeam
HomeTeam team_name
HomeTeam team_location
MinorLeagueTeam
minorteam_name
minorteam_location
so if I was doing a page for the MinorLeague team I'd want to be able to show the corresponding data for their associated Home Team?
|
# ? Jan 6, 2021 23:21 |
|
So you're looking for something like this:
code:
SELECT M.minorteam_name, M.minorteam_location, H.team_name, H.team_location
FROM MinorLeagueTeam M
INNER JOIN HomeTeam H
    ON M.majorleague_id = H.team_id
WHERE M.minorteam_name = 'Bluebirds'
You're joining on the key present in both tables, using a WHERE clause to grab only the records associated to the team you care about, and selecting the 4 specific fields you need
|
# ? Jan 6, 2021 23:34 |
|
That rules thank you so much. I didn't know you could SELECT from more than one table at a time
|
# ? Jan 6, 2021 23:47 |
Yes, * in my query is a valid placeholder that will fetch everything from all joined tables.
|
|
# ? Jan 6, 2021 23:50 |
|
that rules thanks everyone
|
# ? Jan 7, 2021 00:00 |
|
kumba posted:You're joining on the key present in both tables, using a where clause to grab only the records associated to the team you care about, and selecting the 4 specific fields you need Nota bene, and just to put it out there rather than state something is one way authoritatively: when using LEFT OUTER JOINs, putting criteria in the JOIN predicate vs. the WHERE clause may have different behavior than we see here with an INNER JOIN, including essentially turning an OUTER JOIN into an INNER JOIN, or even turning it into an ANTI JOIN
|
# ? Jan 7, 2021 19:08 |
|
I have a few PostgreSQL questions. When you use CREATE FUNCTION's SET clause with the FROM CURRENT option, the documentation says that it saves the value that is current as the value to be applied when the procedure is entered. Is the difference from excluding the SET clause altogether that ordinarily, SET LOCAL would persist after the function exits and until the next commit? I'm guessing this is the case, but I want to rule out that FROM CURRENT is just a synonym for performing the default action without the SET clause being there.

For a procedure, when auto-commit is turned off, and there are statements in a procedure without an explicit commit after the last commit but before the procedure exits, does PostgreSQL automatically commit those, or would the commit have to be done outside the procedure? I am guessing the latter, because in that situation SET LOCAL within CREATE PROCEDURE without a SET clause would do the same thing as I'm guessing for CREATE FUNCTION without a SET clause.

galenanorth fucked around with this message at 10:21 on Jan 9, 2021 |
# ? Jan 9, 2021 02:53 |
|
galenanorth posted:I have a few PostgreSQL questions. When you use CREATE FUNCTION's SET clause with the FROM CURRENT option, the documentation says that it saves the value that is current as the value to be applied when the procedure is entered. Is the difference from excluding the SET clause altogether that ordinarily, SET LOCAL would persist after the function exits and until the next commit? I'm guessing this is the case, but I want to rule out that FROM CURRENT is just a synonym for performing the default action without the SET clause being there. this appears to be the case: code:
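-- a rough sketch of the kind of check I mean (function names are made up):
SHOW work_mem;                            -- say the session default is 4MB

CREATE FUNCTION leaky() RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    SET LOCAL work_mem = '64MB';          -- no SET clause on the function itself
END;
$$;

CREATE FUNCTION contained() RETURNS void LANGUAGE plpgsql
SET work_mem FROM CURRENT
AS $$
BEGIN
    SET LOCAL work_mem = '64MB';          -- function declared with SET work_mem FROM CURRENT
END;
$$;

BEGIN;
SELECT leaky();
SHOW work_mem;                            -- 64MB: the SET LOCAL keeps applying until the transaction ends
ROLLBACK;

BEGIN;
SELECT contained();
SHOW work_mem;                            -- 4MB: the saved value is restored as soon as the function exits
ROLLBACK;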
|
# ? Jan 9, 2021 04:59 |
|
Nth Doctor posted:Nota bene, and just to put it out there rather than state something is one way authoritatively: can you expand on this? curious.
|
# ? Jan 9, 2021 16:40 |
|
abelwingnut posted:can you expand on this? curious. Failing a condition on an INNER JOIN removes the entire row. Failing a condition on an OUTER JOIN sets the columns from that table/view/whatever to NULL, but still returns the rest of the row. Failing a condition on a WHERE removes the entire row.
|
# ? Jan 9, 2021 17:15 |
|
abelwingnut posted:can you expand on this? curious. Imagine I have 2 tables: code:
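-- a rough sketch of the two tables (exact columns are made up; only the keys matter here):
CREATE TABLE [Order] (
    Order_Key      INT IDENTITY(1,1) PRIMARY KEY,
    Customer_Key   INT NOT NULL,
    Order_Date     DATETIME NOT NULL
);

CREATE TABLE Order_Line (
    Order_Line_Key INT IDENTITY(1,1) PRIMARY KEY,
    Order_Key      INT NOT NULL REFERENCES [Order] (Order_Key),
    Item_No        INT NOT NULL,
    Quantity       INT NOT NULL
);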
From the schema, an Order can have zero or more Order_Lines defined. If you care first and foremost about all order headers regardless of the lines, you'd start by selecting from Order and your join to Order_Line would be LOJ so you don't inadvertently filter out Orders with zero lines. So here's our template: code:
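-- roughly the template I mean (stubbed variable instead of a sproc parameter):
DECLARE @Item_No INT = NULL;

SELECT O.Order_Key, O.Customer_Key, O.Order_Date,
       OL.Order_Line_Key, OL.Item_No, OL.Quantity
FROM [Order] O
LEFT OUTER JOIN Order_Line OL
    ON OL.Order_Key = O.Order_Key;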
I stubbed in some variables that would normally be parameters in a sproc. That's why the application of those parameters is flexible with NULLs and so forth. If I wanted to implement that @Item_No filter, where I put it will make my query return different results. Maybe I want to say give me all orders, and if they have item 123 on them, include those lines. Then I need to put @Item_No in the JOIN predicate: code:
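-- roughly: all orders come back, and lines are included only when they match the filter
DECLARE @Item_No INT = 123;

SELECT O.Order_Key, O.Customer_Key, O.Order_Date,
       OL.Order_Line_Key, OL.Item_No, OL.Quantity
FROM [Order] O
LEFT OUTER JOIN Order_Line OL
    ON  OL.Order_Key = O.Order_Key
    AND (@Item_No IS NULL OR OL.Item_No = @Item_No);
-- orders without item 123 still appear, just with NULLs in the OL columns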
Whereas if I wanted this query to ONLY return orders with Item 123 on it, I would put it in the WHERE clause: code:
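-- roughly: same join, but the filter in the WHERE clause throws away the NULL rows,
-- so the LEFT OUTER JOIN now behaves like an INNER JOIN
DECLARE @Item_No INT = 123;

SELECT O.Order_Key, O.Customer_Key, O.Order_Date,
       OL.Order_Line_Key, OL.Item_No, OL.Quantity
FROM [Order] O
LEFT OUTER JOIN Order_Line OL
    ON OL.Order_Key = O.Order_Key
WHERE OL.Item_No = @Item_No;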
Let's say instead I want only Order records without any Order_Line records at all I would write: code:
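-- roughly: keep only the orders where the join found no lines at all (an anti join)
SELECT O.Order_Key, O.Customer_Key, O.Order_Date
FROM [Order] O
LEFT OUTER JOIN Order_Line OL
    ON OL.Order_Key = O.Order_Key
WHERE OL.Order_Line_Key IS NULL;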
|
# ? Jan 9, 2021 18:01 |
|
I have some list views in my application that are just absolutely balls slow. In order to fulfill the UI requirements as well as the advanced filtering requirements we have to do a lot of joins with a lot of different tables and for customers whose base data set includes hundreds of thousands of items in a table that contains 1M+ items, loading a single page of 30 items can take about 15-20 seconds. I've mucked around with EXPLAIN quite a bit and although I've found a couple places where I can improve indices to shave a couple seconds here and there I can't seem to get the base queries down into the range where I'd like them to be. Is this about the time that I should be looking at caching this data in an elasticsearch (or other) store or should MySQL be able to do what I need it to do and it's worth continuing to try to press on and improve my queries?
|
# ? Jan 11, 2021 18:31 |
|
prom candy posted:I have some list views in my application that are just absolutely balls slow. In order to fulfill the UI requirements as well as the advanced filtering requirements we have to do a lot of joins with a lot of different tables and for customers whose base data set includes hundreds of thousands of items in a table that contains 1M+ items, loading a single page of 30 items can take about 15-20 seconds. I've mucked around with EXPLAIN quite a bit and although I've found a couple places where I can improve indices to shave a couple seconds here and there I can't seem to get the base queries down into the range where I'd like them to be. Is this about the time that I should be looking at caching this data in an elasticsearch (or other) store or should MySQL be able to do what I need it to do and it's worth continuing to try to press on and improve my queries? What is your index situation looking like? Are your views joining on non PK/FK columns? If joining to a table not on its clustered index, is there a supporting index instead?
|
# ? Jan 11, 2021 18:50 |
|
Nth Doctor posted:What is your index situation looking like? Are your views joining on non PK/FK columns? If joining to a table not on its clustered index, is there a supporting index instead? We pretty much always join on FK columns and we have indices for pretty much every FK that's used for joins. In most cases when I use EXPLAIN I see that it is using indices, although in some cases it seems to be ignoring them. For example a user is on a team and teams can have child teams, so for a list of items we might have SELECT * FROM items WHERE team_id IN (1, 2, 3, 4); 1, 2, 3, and 4 being the teams that the user is allowed to see. When I EXPLAIN that query it ignores the index and just shows me "Using where" but if I do SELECT * FROM items WHERE team_id = 1; it uses the index on team_id. We also have a lot of many-to-many relationships so joining the right tables together requires joins across the intermediary table as well. Those are also all indexed but in playing around I've found that eliminating some of those joins can speed things up somewhat. Of course we can't eliminate those joins without removing software features so that's a non-starter. prom candy fucked around with this message at 19:00 on Jan 11, 2021 |
# ? Jan 11, 2021 18:57 |
|
prom candy posted:We pretty much always join on FK columns and we have indices for pretty much every FK that's used for joins. In most cases when I use EXPLAIN I see that it is using indices, although in some cases it seems to be ignoring them. For example a user is on a team and teams can have child teams, so for a list of items we might have SELECT * FROM items WHERE team_id IN (1, 2, 3, 4); You say that "teams can have child teams". Can child teams have their own child teams (arbitrary nesting), or are there only 2 levels of nesting? Why are you issuing queries with conditions like "WHERE team_id IN (1, 2, 3, 4)"? This suggests that you are calculating the list of teams whose items the user can see and then using that list in queries. It would be better to make the database do that work. It might be something like code:
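-- one possible shape (table and column names are guesses; ? is the user's id)
SELECT i.*
FROM items i
WHERE i.team_id IN (
    SELECT t.id
    FROM teams t
    JOIN user_teams ut
        ON ut.team_id IN (t.id, t.parent_id)  -- the user's own teams plus their child teams
    WHERE ut.user_id = ?
);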
I don't know why the IN condition might prevent use of the index.
|
# ? Jan 11, 2021 19:14 |
|
Just one level of nesting, thank god. In a given http request cycle we've already loaded the user's list of teams for ACL so we have it available for any subsequent queries that need it.
|
# ? Jan 11, 2021 19:24 |
|
I had a very similar experience in my last job, and had a good experience with dynamic SQL turning those 20-30 second queries into 1-2.

The general problem you're facing, if I understand you correctly, is labeled a "kitchen sink query". That is, for a single query to enable a diverse set of filters, you need to use a large number of WHERE or JOIN conditions, many of which may or may not be in use at a given time. The optimizer will generate a single query plan for your query and that plan will be optimized for your first run or the general use case and won't work well for all or even most queries. By generating a dynamic query that only layers on the conditions you need, you enable the optimizer to create an efficient path that is specific to that set of filters.

In my case, we had some orgs that had hundreds of thousands of rows of data in the base table and others that only had hundreds. Our data also included customer_id as the first element of a compound PK in each table. So in addition to the query layering I described above, I'd also manually inject the customer_id into the raw query string so that each customer would get different query plans specific to their data even if they were filtering on the same things as one another. It made a massive difference. You get a lot more cached query plans but they all performed much better. Talking orders of magnitude here.

Phone posting so lemme know if any of this is unclear.
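A stripped-down sketch of the idea (T-SQL, with made-up table and parameter names):
code:
DECLARE @Customer_Id INT = 42, @Status VARCHAR(20) = NULL, @Item_No INT = NULL;

DECLARE @sql NVARCHAR(MAX) = N'
SELECT o.Order_Id, o.Order_Date
FROM dbo.Orders o
WHERE o.Customer_Id = @Customer_Id';

-- only append the predicates that are actually in play for this request
IF @Status IS NOT NULL
    SET @sql += N' AND o.Status = @Status';
IF @Item_No IS NOT NULL
    SET @sql += N' AND EXISTS (SELECT 1 FROM dbo.Order_Lines ol
                               WHERE ol.Order_Id = o.Order_Id AND ol.Item_No = @Item_No)';

EXEC sys.sp_executesql @sql,
    N'@Customer_Id INT, @Status VARCHAR(20), @Item_No INT',
    @Customer_Id, @Status, @Item_No;
Each distinct combination of filters gets its own plan, instead of one plan trying to cover every case.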
|
# ? Jan 11, 2021 20:03 |
|
Here’s an article by Aaron Bertrand about this problem with this same advice: https://www.sentryone.com/blog/aaronbertrand/backtobasics-updated-kitchen-sink-example
|
# ? Jan 11, 2021 20:06 |
|
prom candy posted:I have some list views in my application that are just absolutely balls slow. In order to fulfill the UI requirements as well as the advanced filtering requirements we have to do a lot of joins with a lot of different tables and for customers whose base data set includes hundreds of thousands of items in a table that contains 1M+ items, loading a single page of 30 items can take about 15-20 seconds. I've mucked around with EXPLAIN quite a bit and although I've found a couple places where I can improve indices to shave a couple seconds here and there I can't seem to get the base queries down into the range where I'd like them to be. Is this about the time that I should be looking at caching this data in an elasticsearch (or other) store or should MySQL be able to do what I need it to do and it's worth continuing to try to press on and improve my queries? On top of what the others have said, MySQL can only do nested loop joins, which are simple, and the fastest way to do small results sets, but will always suck no matter what for 10k+ results sets, and be terrible for 100k+. Trimming unused joins will certainly help though.
|
# ? Jan 11, 2021 20:36 |
|
Sorry to break the conversation, but do any of you know of a good resource or book to learn more about Joins and such? I can’t seem to wrap my head around it
|
# ? Jan 11, 2021 20:46 |
|
Empress Brosephine posted:Sorry to break the conversation, but do any of you know of a good resource or book to learn more about Joins and such? I can’t seem to wrap my head around it Without knowing more in particular about what you're looking for, here's a visual guide to the JOINs. Personally I would ignore any of the RIGHT joins. You can bash most of those into syntactically equivalent LEFT joins.
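For example (made-up tables), these two return the same thing, just written from the other side:
code:
SELECT C.Customer_Name, O.Order_Id
FROM Orders O
RIGHT JOIN Customers C ON O.Customer_Id = C.Customer_Id;

SELECT C.Customer_Name, O.Order_Id
FROM Customers C
LEFT JOIN Orders O ON O.Customer_Id = C.Customer_Id;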
|
# ? Jan 11, 2021 20:53 |
|
That seems to work; thank you!
|
# ? Jan 11, 2021 20:55 |
|
Moonwolf posted:On top of what the others have said, MySQL can only do nested loop joins, which are simple, and the fastest way to do small results sets, but will always suck no matter what for 10k+ results sets, and be terrible for 100k+. Trimming unused joins will certainly help though. Postgres and MSSQL can perform much better when you're asking them to do herculean things, right? My understanding is that the less database-friendly your workload is, the worse MySQL falls behind.
|
# ? Jan 11, 2021 21:05 |
|
Moonwolf posted:On top of what the others have said, MySQL can only do nested loop joins, which are simple, and the fastest way to do small results sets, but will always suck no matter what for 10k+ results sets, and be terrible for 100k+. Trimming unused joins will certainly help though. i knew this, and i'm still shocked any time someone reminds me of it
|
# ? Jan 11, 2021 21:53 |
|
Ruggan posted:I had a very similar experience in my last job, and had a good experience with dynamic SQL turning those 20-30 second queries into 1-2. Thanks, yeah it's a mix of things. We have to do a certain amount of joining just to show the data on the page before even taking the filters into account, and then as more filters are applied we also do more joins. This is all built with ActiveRecord so the joins are applied dynamically depending on what the user has filtered for. As far as I can tell we're never joining a table unless it's required either just because of the base level of data we want to show on the page or because the user applied that specific filter. I've asked about removing some of the features/filters from these screens but it's a non-starter, people rely on them.

Moonwolf posted:On top of what the others have said, MySQL can only do nested loop joins, which are simple, and the fastest way to do small results sets, but will always suck no matter what for 10k+ results sets, and be terrible for 100k+. Trimming unused joins will certainly help though. So if that's the case is it just time to look for alternative technologies? I've been looking at switching to Postgres but the idea of migrating a giant production application to a different database backend is terrifying. What I was thinking of doing is creating a denormalized cache layer in something like Elasticsearch or Mongo or DynamoDB that basically existed just to serve these oversized list views. Although at that point I'm sure it's just trading one set of problems (slow data) for another (inaccurate data)
|
# ? Jan 11, 2021 22:00 |
|
Moonwolf posted:On top of what the others have said, MySQL can only do nested loop joins, which are simple, and the fastest way to do small results sets, but will always suck no matter what for 10k+ results sets, and be terrible for 100k+. Trimming unused joins will certainly help though. wait, really? no hash joins? no merge joins? lmfao
|
# ? Jan 12, 2021 05:22 |
|
DELETE CASCADE posted:wait, really? no hash joins? no merge joins? lmfao mysql got hash joins in version 8.0.18 (late 2019). index merge is a mysql 5.0 feature (2005ish). how well that poo poo works i couldn't say.
|
# ? Jan 12, 2021 06:23 |
|
Bruegels Fuckbooks posted:mysql got hash joins in version 8.0.18 (late 2019). index merge is a mysql 5.0 feature (2005ish). how well that poo poo works i couldn't say. I forgot that because we got stuck in 5.7 land (there is no 6 or 7 for some baffling reason), I've never seen Index Merge come out of a plan, even ones where it'd be vastly better than looping though. So if you've got enough memory and you're on the newer version getting it to hash join will make your bigger data sets vastly more reasonably timed.
|
# ? Jan 12, 2021 09:14 |
|
I have a table that includes all the transactions that were placed with our business in 2020. It includes many rows so I won't be able to just open the table in Excel and create a pivot table. How would I go about getting the following output with a query: WHERE date between "01-01-2020" and "12-31-2020", then GROUP BY and COUNT the amount? I want to see the percentage, count, and total amount of transactions that were between 1 to 1000 USD and for > 1000 euros. How would I go about this?

Busy Bee fucked around with this message at 12:44 on Jan 12, 2021 |
# ? Jan 12, 2021 12:39 |
|
Busy Bee posted:I have a table that includes all the transactions that were placed with out business in 2020. It includes many rows so I won't be able to just open the table in Excel and create a pivot table. How would I go about having the following output with a query. Something like the attached screenshot, the forum wouldn't let me post a query. Fix date formats to fit your db.
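Roughly this shape, anyway (guessing at your table and column names):
code:
SELECT
    CASE WHEN amount <= 1000 THEN '1 to 1000' ELSE 'Over 1000' END,
    COUNT(*) * 100.0 / (SELECT COUNT(*) FROM transactions WHERE transaction_date BETWEEN '2020-01-01' AND '2020-12-31') AS "Percent",
    COUNT(*) AS transaction_count,
    SUM(amount) AS total_amount
FROM transactions
WHERE transaction_date BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY CASE WHEN amount <= 1000 THEN '1 to 1000' ELSE 'Over 1000' END;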
|
# ? Jan 12, 2021 15:39 |
|
Moonwolf posted:Something like the attached screenshot, the forum wouldn't let me post a query. I was working a similar angle, but storing the total count in a variable first and was phone typing without anything to reference so I abandoned it until I was at a PC. Also if this query is one time only, you could just use YEAR(Transaction_Date) = 2020 to simplify your WHERE clauses.
|
# ? Jan 12, 2021 15:56 |
|
Thanks for the help! I really appreciate it. I ended up making some minor adjustments since I ran into some errors. Specifically, adding "as threshold" to the end of the 2nd line to add a column name. For the 3rd line, I changed it to "ROUND(CAST(COUNT(*) as float/CAST((SELECT COUNT(*)....................) as float) * 100,10) "Percent"". Also changed it to "GROUP BY 1;"
|
# ? Jan 12, 2021 17:08 |
|
Busy Bee posted:Also changed it to "GROUP BY 1;" This is functionally the same as removing the GROUP BY entirely. Columns that aren't aggregates (like the CASE making text to label the row in the example) need to be in the GROUP BY which was why the CASE was there in the first place
|
# ? Jan 12, 2021 17:44 |
|
Nth Doctor posted:This is functionally the same as removing the GROUP BY entirely. Columns that aren't aggregates (like the CASE making text to label the row in the example) need to be in the GROUP BY which was why the CASE was there in the first place Yeah, order by (column number) works, group by doesn't, that's just grouping by (true) so it won't work. Busy Bee posted:Thanks for the help! I really appreciate it. I ended up making some minor adjustments since I ran into some errors Yeah, my example would have given arbitrary precision, so getting it to an int percentage makes sense.
|
# ? Jan 12, 2021 17:53 |
|
Sorry to ask such a dumb question but I'm having a mind blank and can't figure out how I would do this.... I have a table for users and a table for licenses.

Licenses: I want to have all the types of licenses, so like
Regular
CDL
CDL-A
CDL-B
and I want users to be able to be assigned a license type that I can join back in. How would I set up the license table? Would I just have a new row for each type or do I just do like a primary key of like an ID and then an enum column of license_type with the options in it? Sorry!
|
# ? Jan 13, 2021 17:01 |
|
you probably want

a users table: UserID (PK), UserName, etc
a licenses table: LicenseID (PK), LicenseName, etc

and then probably a UserLicenses table: UserLicensesID (PK), UserID (FK), LicenseID (FK)
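Something like this, roughly (SQL Server-ish syntax; columns are just placeholders):
code:
CREATE TABLE Users (
    UserID    INT IDENTITY(1,1) PRIMARY KEY,
    UserName  VARCHAR(100) NOT NULL
);

CREATE TABLE Licenses (
    LicenseID    INT IDENTITY(1,1) PRIMARY KEY,
    LicenseName  VARCHAR(50) NOT NULL   -- 'Regular', 'CDL', 'CDL-A', 'CDL-B', ...
);

CREATE TABLE UserLicenses (
    UserLicensesID INT IDENTITY(1,1) PRIMARY KEY,
    UserID         INT NOT NULL REFERENCES Users (UserID),
    LicenseID      INT NOT NULL REFERENCES Licenses (LicenseID)
);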
|
# ? Jan 13, 2021 17:17 |
|
That rules thanks. Is it good practice to create a third table that combines the data points instead of just doing joins constantly?
|
# ? Jan 13, 2021 17:22 |
|
Empress Brosephine posted:That rules thanks.

you're still going to have to do joins to get the information you want, e.g.
code:
select u.Username, l.LicenseType
from dbo.Users u
join dbo.UserLicenses ul on u.UserID = ul.UserID
join dbo.Licenses l on ul.LicenseID = l.LicenseID
if you didn't do this and all you have is a licenses table plus a users table with columns for each license type, what happens when you add a new license type? you would have to add a corresponding column for that license to the users table, and then update it for everyone. now imagine you have 50 new license types

this way, you just add a new license to the licenses table, then add rows with the UserID along with that new LicenseID to the UserLicenses table for everyone who has it, and voila! the above query still returns everything you need

kumba fucked around with this message at 17:48 on Jan 13, 2021 |
# ? Jan 13, 2021 17:34 |