SELECT * FROM Questions WHERE Type = 'Stupid'

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

«‹›303 »

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

Also quick question in addition:

Is there anyway to get a estimate for monthly hosting costs for a database when I don't really know what I need? I have a client that wants a way to purchase tickets for an event that holds about 35k people, but they don't want to pay for a ticketing service :v:

I don't even know how to start to lookup something like that in regard to records hosting wise as it'd be 35k each year. They want me to help assist but I don't want to screw myself over on hosting costs.

# ? Apr 8, 2021 21:03

Adbot: ADBOT LOVES YOU

# ? Jun 7, 2024 09:21

Impotence: Nov 8, 2010; Lipstick Apathy

pay for a ticketing service or your service will go down instantly on day one and then you'll overspend every other day

# ? Apr 8, 2021 21:14

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

I agree 100% so I'd like to give them some crazy price for.hosting lmao

# ? Apr 8, 2021 21:26

Hammerite: Mar 9, 2007; And you don't remember what I said here, either, but it was pompous and stupid.; Jade Ear Joe

They sound like bad clients tbh. they want you to set up a payment site for this one-off thing and they don't want to just pay for a company that does this sort of thing as their specialty to do it properly? it'd be one thing if they were going to regularly hold events, but for a one-off?

tell them all the reasons it's a bad idea and then if they still say "yeah but how much will it cost" tell them no

# ? Apr 8, 2021 21:37

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

I'm leaning that way. It's a yearly event that is pretty popular. Their existing site uses gravity forms which captures payment so they're like " why can't we do that but for tickets "...I mean you're capturing maybe one form a day, that's why.

I worked in ticketing for a while and I managed to get $1 per ticket from a ticketing service but that was too much for them soooo unless the sql hosting is less than $.10 cents a ticket I don't aim to make anything

# ? Apr 8, 2021 21:39

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Empress Brosephine posted:

unless the sql hosting is less than $.10 cents a ticket I don't aim to make anything

Are you running a business or a charity?

# ? Apr 8, 2021 22:44

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

True. Their current rate is $.79 a ticket lol and they're too cheap for that

# ? Apr 8, 2021 23:01

PhantomOfTheCopier: Aug 13, 2008; Pikabooze!

Most cloud providers have free tier services, but you're probably going to waste a ton of time load testing it to make sure it doesn't go down on the day of the event. If all ticket purchases arrive over an eight hour window, you're running 1.2/sec on average but you won't be able to handle 5pm/after work spikes and so forth.

If their order system is so powerful, just tell them to create 35k items, one for each seat, and let their system take care of it. :clint:

Or they can do batches of 1000 and hold some in reserve to deal with the system selling things beyond availability.

# ? Apr 9, 2021 22:42

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

Thanks for the reply. I think they do something ready like that with their exists bf system. What would even be required to make it so it wouldn't go down with the demand? I'm surprised a free tier my ght be a option though either way!

# ? Apr 9, 2021 22:57

Rabbit Hill: Mar 11, 2009; God knows what lives in me in place of me.; Grimey Drawer

I have a question about MS Access 2016 query design. Basically, I have two spreadsheets and I'm trying to find matches between them with a "text contains" (rather than the default "text equals") join. Here are screenshots --

Table 1 --

Table 2 --

I want to run a query to find matches between these tables using the ISBN numbers of books as the common variable, because I need to create a report that appends info from Table 2 onto Table 1. But as you can see, the "ISBNs" column in Table 1 sometimes contains multiple numbers, while the "ISBN" column in Table 2 contains only one number.

Here is my preliminary query design in two views:

Design view --

SQL view (sorry for the tiny text)--

How do I look for only partial matches between the "ISBNs" in Table 1 with the "ISBN" in Table 2?

# ? Apr 13, 2021 18:29

Heavy_D: Feb 16, 2002; "rararararara" contains the meaning of everything, kept in simple rectangular structures

Rabbit Hill posted:

I have a question about MS Access 2016 query design.
How do I look for only partial matches between the "ISBNs" in Table 1 with the "ISBN" in Table 2?

Have you tried something like :

code:

SELECT ....
FROM [JLG Circ history w ISBNs] INNER JOIN [JLG w pricing] 
ON ([JLG Circ history w ISBNs].[ISBNs] Like '*' & [JLG w pricing].[ISBN] & '*')

If that doesn't work, you could try the old "cartesian join and filter" approach, but there's a strong possibility that MS Access performance on this approach will be dire...

code:

SELECT ....
FROM [JLG Circ history w ISBNs], [JLG w pricing] 
WHERE ([JLG Circ history w ISBNs].[ISBNs] Like '*' & [JLG w pricing].[ISBN] & '*')

# ? Apr 13, 2021 23:00

Rabbit Hill: Mar 11, 2009; God knows what lives in me in place of me.; Grimey Drawer

quote:

If that doesn't work, you could try the old "cartesian join and filter" approach, but there's a strong possibility that MS Access performance on this approach will be dire...
code:
SELECT ....
FROM [JLG Circ history w ISBNs], [JLG w pricing] 
WHERE ([JLG Circ history w ISBNs].[ISBNs] Like '*' & [JLG w pricing].[ISBN] & '*')

This worked!!

It first gave me some janky results --

but I filtered all the blanks out of the "Level" column, then selected all the remaining rows, and exported the query as .txt file. (If I export in .xlsx format, it screws up the dates and times in the "Last Loan Date" column.) Had to clean up the .txt file in Excel a bit, but it was quick and now I finally have that report I need.

Thanks, dude! :3:

# ? Apr 14, 2021 20:29

raminasi: Jan 25, 2005; a last drink with no ice

What does it mean in a Postgres query plan if a Unique node with 5624 rows is wrapped by a Materialize node with 1129368 rows? Why does materializing the unique row set change the row count? (5624 is the number I'd expect, and the inflated row number is carrying through the rest of the query and destroying its performance.)

Also, what can I read to help me understand why disabling enable_seqscan drastically improves a query's performance? (The time drops from like 20 seconds to under half a second.) I assume it's a problem with the table stats but I've never really learned how those work. I stumbled across this when I read a query plan, thought to myself "why the hell is it doing a table scan there," figured out how to force it not to, and watched both the plan and the query's performance change to what I expected. I figure there's a way to make the query planner smarter than I am but I've never really used Postgres before.

# ? Apr 21, 2021 23:05

TheFluff: Dec 13, 2006; FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE

raminasi posted:

What does it mean in a Postgres query plan if a Unique node with 5624 rows is wrapped by a Materialize node with 1129368 rows? Why does materializing the unique row set change the row count? (5624 is the number I'd expect, and the inflated row number is carrying through the rest of the query and destroying its performance.)

Also, what can I read to help me understand why disabling enable_seqscan drastically improves a query's performance? (The time drops from like 20 seconds to under half a second.) I assume it's a problem with the table stats but I've never really learned how those work. I stumbled across this when I read a query plan, thought to myself "why the hell is it doing a table scan there," figured out how to force it not to, and watched both the plan and the query's performance change to what I expected. I figure there's a way to make the query planner smarter than I am but I've never really used Postgres before.

Hard to say much of anything about this, unless you post the full query plan (bowdlerize the table/column names if you have to).

As for the seq scan though, that seems like an issue with table/column stats indeed if there is an index that can be used but postgres thinks a seq scan is better. If you're seeing a lot of bad estimates in the query plan that's a hint that the stats are bad. One fairly common way that this can happen is if you have an index on a column that is actually fairly selective (say a maximum of 10-20 rows share a distinct value in the typical case) but the rows get inserted in batches and all rows in a batch share the same value for this column. Because of the way the stats collection works, this is very likely to make the query planner drastically underestimate the selectivity of this index (it'll think the number of distinct values for this column is way smaller than it actually is) and that might make it use a seq scan instead, or do some other very stupid thing. Here's a good read about the problem and how to fix it.

One extra tip beyond that link - if you want test the hypothesis you can do this:

SQL code:

START TRANSACTION;
ALTER TABLE some_table
  ALTER COLUMN some_indexed_column
  SET n_distinct = 1234567; -- number of actual distinct values in the column
ANALYZE some_table (some_indexed_column);
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
  SELECT 1; -- your query goes here
ROLLBACK TRANSACTION;

If this avoids the seq scan then you know the problem is the stats on the suspicious column.

Finally, if you haven't tried it yet, https://explain.dalibo.com/ is an excellent query plan visualizer for postgres. Saves plans to your browser's local storage, doesn't upload anything anywhere unless you explicitly share the plan.

TheFluff fucked around with this message at 10:44 on Apr 23, 2021

# ? Apr 23, 2021 10:37

raminasi: Jan 25, 2005; a last drink with no ice

TheFluff posted:

Hard to say much of anything about this, unless you post the full query plan (bowdlerize the table/column names if you have to).

As for the seq scan though, that seems like an issue with table/column stats indeed if there is an index that can be used but postgres thinks a seq scan is better. If you're seeing a lot of bad estimates in the query plan that's a hint that the stats are bad. One fairly common way that this can happen is if you have an index on a column that is actually fairly selective (say a maximum of 10-20 rows share a distinct value in the typical case) but the rows get inserted in batches and all rows in a batch share the same value for this column. Because of the way the stats collection works, this is very likely to make the query planner drastically underestimate the selectivity of this index (it'll think the number of distinct values for this column is way smaller than it actually is) and that might make it use a seq scan instead, or do some other very stupid thing. Here's a good read about the problem and how to fix it.

One extra tip beyond that link - if you want test the hypothesis you can do this:
SQL code:
START TRANSACTION;
ALTER TABLE some_table
  ALTER COLUMN some_indexed_column
  SET n_distinct = 1234567; -- number of actual distinct values in the column
ANALYZE some_table (some_indexed_column);
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
  SELECT 1; -- your query goes here
ROLLBACK TRANSACTION;
If this avoids the seq scan then you know the problem is the stats on the suspicious column.

Finally, if you haven't tried it yet, https://explain.dalibo.com/ is an excellent query plan visualizer for postgres. Saves plans to your browser's local storage, doesn't upload anything anywhere unless you explicitly share the plan.

See, you say you couldn't say much of anything, but then you described the problem exactly! That's our write pattern, I can see the row overestimations in the bad query plan now that I know to look for them, and your test validated the problem. Unfortunately, the solutions offered in that writeup seem a little thin, but this is part of a proof of concept to test out some possible approaches for a new greenfield application so this is good information to have regardless. Thanks so much!

# ? Apr 23, 2021 15:55

TheFluff: Dec 13, 2006; FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE

raminasi posted:

See, you say you couldn't say much of anything, but then you described the problem exactly! That's our write pattern, I can see the row overestimations in the bad query plan now that I know to look for them, and your test validated the problem. Unfortunately, the solutions offered in that writeup seem a little thin, but this is part of a proof of concept to test out some possible approaches for a new greenfield application so this is good information to have regardless. Thanks so much!

Haha, nice! Feels good to be able to help :unsmith:

Last time I encountered this problem I basically just punted on it, by increasing the statistics target for that particular column. Not so much that it made n_distinct anywhere near actually accurate, but enough that it made the query planner realize that the index was selective enough to be useful. If the table continues growing though the problem will eventually come back, and at some point I think the solution will have to be some kind of change to the table structure. In my case what I was planning on was to eventually start moving old rows out of the table to keep the size under control, but I've since left that job. There were other options as well, but once you start considering reworking the data model all the answers are basically "it depends" so I don't know if I can give any further useful advice for your particular case. At least you know what to google for now though, there are a few other writeups by people who have had similar problems and maybe one of them can be useful as inspiration.

TheFluff fucked around with this message at 16:47 on Apr 23, 2021

# ? Apr 23, 2021 16:39

NecroBob: Jul 29, 2003

So this question occurred to me today, and I did some (admittedly) light research on it, but found no definitive answer.

Given the following query, in MS SQL Server:

SQL code:

INSERT INTO [schema].[table] ([ColumnA], [ColumnB], [ColumnC])
     VALUES (
                'some string value'
                ,(
                    SELECT MAX(S.[Index]) + 1
                      FROM [schema].[anotherTable] AS S
                     WHERE S.[RecordId] = 'some guid'
                )
                ,'some other value for column C'
            )

Is the subquery transactional in nature? That is, if someone adds a record to [anotherTable] for 'some guid' that is a higher [Index] value than what was previously there, is that MAX(S.[Index]) evaluation in the outer query no longer accurate? Or does the engine block writes to the referenced inner table while the query (as a whole) is executing?

I didn't take a look at this specific execution plan, but similar SELECT queries induce JOINs, so my first thought is 'yes, it is transactional in nature for the duration of the execution', but I figured I'd run it by someone far more experienced than me. I'm also unaware of any way to artificially delay queries in order to test it for myself.

Also also, the INSERT INTO query was just off the top of my head, so if it's not technically accurate, sorry.

E: laserline correction

NecroBob fucked around with this message at 02:55 on Apr 30, 2021

# ? Apr 30, 2021 02:50

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

This might be more of a javascript queston, but whats the best practice of deleting rows from lets say two tables, if lets say table #2 requires data from table #1? i'm guessing a transaction?

# ? May 1, 2021 15:27

Nth Doctor: Sep 7, 2010; Darkrai used Dream Eater!
It's super effective!

Empress Brosephine posted:

This might be more of a javascript queston, but whats the best practice of deleting rows from lets say two tables, if lets say table #2 requires data from table #1? i'm guessing a transaction?

If you want to lock both tables for the duration, yeah I guess.

If you created the foreign key with cascading deletion then the parent record's deletion would trigger the child's deletion automatically.

Or you could just delete the child records followed by the parent record yourself. That's what I've generally done.

# ? May 1, 2021 17:00

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

sweet ok thanks

# ? May 1, 2021 17:30

Empress Brosephine: Mar 31, 2012; by Jeffrey of YOSPOS

edit: I figured it out I think. My sql was doing things it didn't need to do! Now I need to figureo ut if you can do a "if" statement in postgres

Empress Brosephine fucked around with this message at 13:34 on May 3, 2021

# ? May 3, 2021 13:24

PhantomOfTheCopier: Aug 13, 2008; Pikabooze!

Empress Brosephine posted:

edit: I figured it out I think. My sql was doing things it didn't need to do! Now I need to figureo ut if you can do a "if" statement in postgres

The CASE statement exists in SQL, but there are special variants (coalesce) and honestly it's usually the wrong way to think about the problem.

# ? May 4, 2021 01:33

Paul MaudDib: May 3, 2006; TEAM NVIDIA:
FORUM POLICE

PhantomOfTheCopier posted:

The CASE statement exists in SQL, but there are special variants (coalesce) and honestly it's usually the wrong way to think about the problem.

what�s wrong with CASE?

I frequently use an idiom of building up a set of rows to insert or update inside a CTE and then just doing a simple �insert� or �merge into�. CASE works great there for handling the CTE logic to build up the input set.

Paul MaudDib fucked around with this message at 09:13 on May 4, 2021

# ? May 4, 2021 02:38

PhantomOfTheCopier: Aug 13, 2008; Pikabooze!

Paul MaudDib posted:

what’s wrong with CASE?

I frequently use an idiom of building up a set of rows to insert or update inside a CTE and then just doing a simple “insert” or “merge into”. CASE works great there for handling the CTE logic to build up the input set.

Oh you mean VALUES?

# ? May 4, 2021 10:37

DELETE CASCADE: Oct 25, 2017; i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003

in oracle they use decode() and it's *awful*

# ? May 5, 2021 03:59

Moonwolf: Jun 29, 2004; Flee from th' terrifyin' evil of "NHS"!

You can just use CASE though, if you want it to be actually readable.

# ? May 5, 2021 08:05

Vegetable: Oct 22, 2010

I need some help optimizing a query.

Table A is a login table, containing user_id and login_time.

Table B is a sales table, containing user_id, purchase_time and item_id.

I want a list of all logins. For each login, I want two additional data points: (a) the last item_id bought by the user before that login and (b) the first item_id bought by the user after that login.

Historically, I've always dealt with it as follows:

1. Create two additional columns in the sales_table:
- LEAD(purchase_time) OVER (PARTITION BY user_id ORDER BY purchase_time) AS next_purchase_time
- LAG(purchase_time) OVER (PARTITION BY user_id ORDER BY purchase_time) AS last_purchase_time
2. Run the query as such:

SELECT a.user_id, a.login_time, b1.item_id AS pre_login_item, b2.item_id AS post_login_item
FROM a
LEFT JOIN b b1
ON a.user_id = b1.user_id
AND a.login_time > b1.purchase_time
AND a.login_time < b1.next_purchase_time
LEFT JOIN b b2
ON a.user_id = b2.user_id
AND a.login_time < b2.purchase_time
AND a.login_time > b2.last_purchase_time

This query works fine, except it's very slow. It's being used by some employees on a regular basis for some adhoc analysis. How might I make it faster?

# ? May 5, 2021 10:00

Paul MaudDib: May 3, 2006; TEAM NVIDIA:
FORUM POLICE

can anyone recommend any reading materials on Apache Solr? willing to buy a book if it's worth it.

we have a longtime consultant who's really crazy about it, and he wants to use it on a new project. he's the only person on the team who has significant experience with it, and once the initial design stuff is done it's largely going to fall on us to implement and entirely on us to maintain in the long term.

I think it might have some merit for warehousing and "fuzzy search" type roles, but I want to fully understand the performance implications of the design decisions he's pushing for. I'm a little worried it's going to turn into a "and then you do a full-table scan..." for every single thing, and I know we can't afford that in performance terms. Also concerned about the amount of latency that's going to be acceptable between the OLTP and the warehouse (and I think doing OLTP on Solr sounds like a real bad idea).

Little dubious about this since the team has plenty of experience with relational, and good relational DBAs, and I think the problems with the dataset are fundamentally tractable using relational approaches. And he's the only one with Solr experience...

# ? May 20, 2021 05:14

Moonwolf: Jun 29, 2004; Flee from th' terrifyin' evil of "NHS"!

Vegetable posted:

I need some help optimizing a query.

Table A is a login table, containing user_id and login_time.

Table B is a sales table, containing user_id, purchase_time and item_id.

I want a list of all logins. For each login, I want two additional data points: (a) the last item_id bought by the user before that login and (b) the first item_id bought by the user after that login.

Historically, I've always dealt with it as follows:

1. Create two additional columns in the sales_table:
- LEAD(purchase_time) OVER (PARTITION BY user_id ORDER BY purchase_time) AS next_purchase_time
- LAG(purchase_time) OVER (PARTITION BY user_id ORDER BY purchase_time) AS last_purchase_time
2. Run the query as such:

SELECT a.user_id, a.login_time, b1.item_id AS pre_login_item, b2.item_id AS post_login_item
FROM a
LEFT JOIN b b1
ON a.user_id = b1.user_id
AND a.login_time > b1.purchase_time
AND a.login_time < b1.next_purchase_time
LEFT JOIN b b2
ON a.user_id = b2.user_id
AND a.login_time < b2.purchase_time
AND a.login_time > b2.last_purchase_time

This query works fine, except it's very slow. It's being used by some employees on a regular basis for some adhoc analysis. How might I make it faster?

This looks like a harder problem, given the data what you're doing looks about as effective as a base query can be. What db is this in, and what level of schema work can you get done on it?

# ? May 20, 2021 08:15

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

What indexes do you have over the tables? If one or more of the time columns aren't indexed, you'll get full table scans which are super slow.

Also, I'm thinking you might have a couple extra optimizations available:

1) don't compute two separate columns for lead and lag, use just one
2) fetch the item_id directly via the LEAD() function instead of joining

Basically run a CTE / subquery that turns table B into (purchase_time, item, purchase_time_next, item_next), and then you can just join A once on the condition (purchase_time < login_time and login_time < purchase_time_next), handling nulls at the start and end however you see fit.

# ? May 20, 2021 11:44

Spikes32: Jul 25, 2013; Happy trees

Can anyone point me to some reading on when views are appropriate for new reports? I'm creating some new crystal report reports for a LIMS system and so far am just grabbing data from two tables. At what point should I make a view instead?

# ? May 20, 2021 21:59

mega dy: Dec 6, 2003

Paul MaudDib posted:

we have a longtime consultant who's really crazy about it, and he wants to use it on a new project. he's the only person on the team who has significant experience with it, and once the initial design stuff is done it's largely going to fall on us to implement and entirely on us to maintain in the long term.

I would avoid doing it if your current understanding is just "this one guy really likes it." Regardless of how good of a technology it is, you need to thoughtfully explore it as a solution and then make sure that your team has the ability to make good design choices, to maintain everything, and to integrate it with the rest of your stack. A great piece of technology with a team who knows nothing about it isn't going to get you far.

Maybe rig up a prototype environment that has it up that you can play around with and live on that for a bit at least.

# ? May 20, 2021 22:13

LongSack: Jan 17, 2003

Need some help with a query.

I�m working on a demo app that simulates (very simply) a commodities trading app. The commodities are various colors of beans.

I have a Beans table with Id, Color, and Price. I have a Movements table with BeanId, Date, StartPrice, Movement, and EndPrice.

In my app I want to run a ticker, so I want to select Id, Color and Price from the Beans table joined with the Movement from the row in the Movements table with the highest date (i.e, most current movement data) for that BeanId. From there I can construct my ticker with �Blue 65.75 up 2.25� or whatever (I�ll fancy it up for actual display).

I can do it with multiple queries using EF, but i think there�s gotta be a more efficient way. Sadly, it exceeds my SQL abilities.

Thanks.

# ? May 21, 2021 00:03

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

If the newest movement for a particular bean is like, a week old, do you still want to show it? Normally a ticker only shows things that have just happened, in which case it becomes very easy to create your query just by flipping things around - instead of selecting beans and finding their most recent movement, select movements and join that with the bean table to get the bean characteristics.

# ? May 21, 2021 04:53

NinpoEspiritoSanto: Oct 22, 2013

dy. posted:

I would avoid doing it if your current understanding is just "this one guy really likes it." Regardless of how good of a technology it is, you need to thoughtfully explore it as a solution and then make sure that your team has the ability to make good design choices, to maintain everything, and to integrate it with the rest of your stack. A great piece of technology with a team who knows nothing about it isn't going to get you far.

Maybe rig up a prototype environment that has it up that you can play around with and live on that for a bit at least.

This is wonderful advice that should be stapled to most senior dev's/product owner's foreheads

# ? May 21, 2021 09:43

LongSack: Jan 17, 2003

Jabor posted:

If the newest movement for a particular bean is like, a week old, do you still want to show it? Normally a ticker only shows things that have just happened, in which case it becomes very easy to create your query just by flipping things around - instead of selecting beans and finding their most recent movement, select movements and join that with the bean table to get the bean characteristics.

Yeah, this is just a silly demo app. The plan is for the "movements" to run on a scheduled basis via an Azure Function, I was thinking nightly or maybe hourly or whatever.

I can do what I want in multiple steps, with something like this:

C# code:

var ids = Movements.Select(x => x.BeanId).Distinct();
foreach (var id in ids)
{
    var movement = Movements.Where(x => x.BeanId == id).OrderByDescending(x => x.MovementDate).FirstOrDefault(); // rather than Take(1) in case there aren't any
    var bean = Beans.SingleOrDefault(x => x.Id == id);
    // Do something with ticker
}

but I was hoping for something I could make into a stored procedure or at least avoid multiple trips

# ? May 21, 2021 14:32

Munkeymon: Aug 14, 2003; Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.

LongSack posted:

Yeah, this is just a silly demo app. The plan is for the "movements" to run on a scheduled basis via an Azure Function, I was thinking nightly or maybe hourly or whatever.

I can do what I want in multiple steps, with something like this:
C# code:
var ids = Movements.Select(x => x.BeanId).Distinct();
foreach (var id in ids)
{
    var movement = Movements.Where(x => x.BeanId == id).OrderByDescending(x => x.MovementDate).FirstOrDefault(); // rather than Take(1) in case there aren't any
    var bean = Beans.SingleOrDefault(x => x.Id == id);
    // Do something with ticker
}
but I was hoping for something I could make into a stored procedure or at least avoid multiple trips

Last time I tried to do something like this in EF I ended up with something like

code:

from b in Beans
from m in Movements.Where(x => x.BeanId == b.BeanId).OrderByDescending(x => x.MovementDate).Take(1)
select new {
    m.MovementDate,
    b.BeanId
}

But check the SQL because EF will happily issue a query for every BeanId if it can't figure out a better way.

If you're doing it in SQL you can

code:

select *
from Beans as B
  cross apply (select top 1 iM.MovementDate from Movement as iM where iM.BeanId = B.BeanId order by iM.MovementDate desc) as LMD

Also possible to use a window function in there which IMO is harder to read but may be more compatible with databases that aren't SQLServer, IDK.

Munkeymon fucked around with this message at 16:26 on May 21, 2021

# ? May 21, 2021 16:18

TheFluff: Dec 13, 2006; FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE

Postgres also supports lateral joins (although in postgres you'd write "cross join lateral" instead of "cross apply") and is the only other reasonable alternative relational database :colbert:

(well, except sqlite I guess, but that has sort of a different usecase)

# ? May 21, 2021 18:32

boofhead: Feb 18, 2021

Bundy posted:

This is wonderful advice that should be stapled to most senior dev's/product owner's foreheads

On this topic, literally yesterday I was in a meeting where some data analysts were describing some complex report they built and the lead dev immediately starts asking for details and saying "hang on, we could automate the data collection!" -- man, your workload is already noticeably higher than capacity, there's a structural lack of clear prioritisation, and you're trying to convince people to give you more work when they're happy enough to grab it manually for a sometimes report just because it's technically possible and technically not ideal? Why do I get the impression that a lot of developers never truly listened when their mother said "finish your meal before you go get seconds"

I used to be a product manager before just starting again as a junior dev and the boss has been saying he can't wait to get my input on improving processes.. I am trying to find a polite way to point out that lacking the power to say no to excess work is a problem with politics, not processes. My second week they had a team feedback review where the response was basically "when we asked you how to fix our work-life balance issues, you should have understood that the workload would not be decreasing nor resources increasing. Now how can I convince you to give a more positive feedback next time"

boofhead fucked around with this message at 08:52 on May 22, 2021

# ? May 22, 2021 08:48

Adbot: ADBOT LOVES YOU

# ? Jun 7, 2024 09:21

NPR Journalizard: Feb 14, 2008

Spikes32 posted:

Can anyone point me to some reading on when views are appropriate for new reports? I'm creating some new crystal report reports for a LIMS system and so far am just grabbing data from two tables. At what point should I make a view instead?

I have heard that you should pretty much always be using views, unless you have a good reason not to.

# ? May 22, 2021 09:39

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

«‹›303 »