SELECT * FROM Questions WHERE Type = 'Stupid'

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

«‹›303 »

kumba: Nov 8, 2003; I posted my food for USPOL Thanksgiving!

enjoy the ride; Lipstick Apathy

Ah okay I understand now. Stab #2: use a CTE and partition by the type

code:

; 
WITH data 
     ( 
          id, 
          deviceid, 
          gatewayid, 
          type, 
          value, 
          receivedtimestamp, 
          timestamp, 
          rownum 
     ) 
     AS 
     ( 
              SELECT   *, 
                       row_number() OVER (partition BY type, order by timestamp DESC) 
              FROM     parameter 
              WHERE    deviceid = 1 
     ) 
SELECT   * 
FROM     data 
WHERE    rownum = 1 
ORDER BY timestamp ASC

# ? Dec 21, 2017 15:38

Adbot: ADBOT LOVES YOU

# ? Jun 8, 2024 00:26

go play outside Skyler: Nov 7, 2005

kumba posted:

Ah okay I understand now. Stab #2: use a CTE and partition by the type

code:

; 
WITH data 
     ( 
          id, 
          deviceid, 
          gatewayid, 
          type, 
          value, 
          receivedtimestamp, 
          timestamp, 
          rownum 
     ) 
     AS 
     ( 
              SELECT   *, 
                       row_number() OVER (partition BY type, order by timestamp DESC) 
              FROM     parameter 
              WHERE    deviceid = 1 
     ) 
SELECT   * 
FROM     data 
WHERE    rownum = 1 
ORDER BY timestamp ASC

Thanks a bunch! This looks like a perfect solution ... unfortunately I am using MySQL 5.6 which does not seem to support CTEs and the WITH keyword

# ? Dec 21, 2017 15:42

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

go play outside Skyler posted:

Thanks a bunch! This looks like a perfect solution ... unfortunately I am using MySQL 5.6 which does not seem to support CTEs and the WITH keyword

CTEs should be logically equivalent to subqueries (though they may have different performance / caching depending on the vendor). Meaning that that query is the same as:

code:

SELECT   * 
FROM  ( 
   SELECT   
      id, 
      deviceid, 
      gatewayid, 
      type, 
      value, 
      receivedtimestamp, 
      timestamp, 
      row_number() OVER (partition BY type, order by timestamp DESC) AS rownum
   FROM  parameter 
   WHERE  deviceid = 1 
) AS data
WHERE    rownum = 1 
ORDER BY timestamp ASC

You might be able to skip the subquery by using HAVING rownum = 1 instead of WHERE, I'm not sure if HAVING works with partitions.

# ? Dec 21, 2017 16:12

meanieface: Mar 27, 2012; During times of universal deceit, telling the truth becomes a revolutionary act.

NihilCredo posted:

CTEs should be logically equivalent to subqueries (though they may have different performance / caching depending on the vendor). Meaning that that query is the same as:
code:
SELECT   * 
FROM  ( 
   SELECT   
      id, 
      deviceid, 
      gatewayid, 
      type, 
      value, 
      receivedtimestamp, 
      timestamp, 
      row_number() OVER (partition BY type, order by timestamp DESC) AS rownum
   FROM  parameter 
   WHERE  deviceid = 1 
) AS data
WHERE    rownum = 1 
ORDER BY timestamp ASC
You might be able to skip the subquery by using HAVING rownum = 1 instead of WHERE, I'm not sure if HAVING works with partitions.

Can�t skip the sub query.
If you want to filter by a windowing function, you�ve got to do that outside of the query creating the windowing function.

SELECT *
FROM (SELECT ROW_NUMBER() OVER(PARTITION BY CoffeeUsage ORDER BY CoffeeFlavor) AS CoffeeRow, CoffeeFlavor
FROM INeedANap)
WHERE CoffeeRow > 1

(Silly query bc posting from phone.)

# ? Dec 21, 2017 17:07

meanieface: Mar 27, 2012; During times of universal deceit, telling the truth becomes a revolutionary act.

Just-In-Timeberlake posted:

can someone help me out with this pivot, I'm a loving moron when it comes to this

I want to take these results:

code:

name	col	        value
---------------------------------------
guid	name	guid
guid	value	9e1cb05b-a7ba-4283-abb4-29d0212ee303
label	name	label
label	value	Invoice issues
email	name	email
email	value	abc<at>abc.com
format	name	format
format	value	html

and pivot them into this:

code:

guid       label       email      format
------------------------------------------------------
9e1...     Invoi....   abc@...  html

This request made me feel like god is dead just in time for Yule. Bravo on that timing.

Assuming that you are doing this as part of another, larger, process and you only want *one* row at the end --

SELECT email, [format], [guid], label
FROM
(SELECT r.*
FROM #temptbl r
WHERE col = 'value'
) r
pivot(MAX([value]) FOR r.name IN ([email],[format],[guid],[label])) p

# ? Dec 21, 2017 18:04

Just-In-Timeberlake: Aug 18, 2003

meanieface posted:

This request made me feel like god is dead just in time for Yule. Bravo on that timing.

Assuming that you are doing this as part of another, larger, process and you only want *one* row at the end --

SELECT email, [format], [guid], label
FROM
(SELECT r.*
FROM #temptbl r
WHERE col = 'value'
) r
pivot(MAX([value]) FOR r.name IN ([email],[format],[guid],[label])) p

thanks, I'll give this a shot after the holidays.

Just a weird requirement, but sometimes you gotta do dirty things and cash that check.

# ? Dec 22, 2017 10:41

Hammerite: Mar 9, 2007; And you don't remember what I said here, either, but it was pompous and stupid.; Jade Ear Joe

meanieface posted:

Can�t skip the sub query.
If you want to filter by a windowing function, you�ve got to do that outside of the query creating the windowing function.

SELECT *
FROM (SELECT ROW_NUMBER() OVER(PARTITION BY CoffeeUsage ORDER BY CoffeeFlavor) AS CoffeeRow, CoffeeFlavor
FROM INeedANap)
WHERE CoffeeRow > 1

(Silly query bc posting from phone.)

this is all immaterial because go play outside Skyler said they are on MySQL v5.6 and windowing functions were added to MySQL in v8.0.

# ? Dec 23, 2017 01:32

Hammerite: Mar 9, 2007; And you don't remember what I said here, either, but it was pompous and stupid.; Jade Ear Joe

code:

SELECT
    p.*
FROM
    Parameter AS p
    JOIN (
        SELECT Type, MAX(Timestamp) AS Latest
        FROM Parameter
        WHERE DeviceId = 1
        GROUP BY Type
    ) AS subq ON p.Type = subq.Type AND p.Timestamp = subq.Latest
WHERE
    p.DeviceId = 1

# ? Dec 23, 2017 01:52

DELETE CASCADE: Oct 25, 2017; i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003

Hammerite posted:

this is all immaterial because go play outside Skyler said they are on MySQL v5.6 and windowing functions were added to MySQL in v8.0.

lmao get rekt idiot, use a real database :rolleyes:

# ? Dec 25, 2017 05:43

Nolgthorn: Jan 30, 2001; The pendulum of the mind alternates between sense and nonsense

I'm using http://knexjs.org/ for MySQL manipulation which is fine. But migrations are failing when I use it with MariaDB, the string indexes are the wrong length. Is there a way I can write my migrations so that they set the index length? MariaDB is supposed to be a drop in replacement for MySQL.

The node adapter for MariaDB is abandoned for almost 2 years and doesn't compile on some computers using npm install.

# ? Dec 25, 2017 21:23

PierreTheMime: Dec 9, 2004; Hero of hormagaunts everywhere!; Buglord

I'm having to deal with SSIS packages for a project and the supposed SME for their team knows absolutely nothing with regards to running in command line (which we need for automation). The problem is that I'm not sure if the package is properly setting all the variables I'm passing in.

Is there an easy way to get a list of "current" variables in the log during or post-execution? Currently the variables I'm passing are enough to allow the package to run, but the tables we're expecting to update aren't and it's a bit frustrating working with a blackbox where you're uncertain how the information you're handing a tool is being used internally. I've looked over the XML and ensured that all "evaluate as expression" flags are set to false, and searched over the whole doc to ensure the variables I'm setting are being used in the correct location.

The especially frustrating thing is that certain variables are definitely being evaluated, as it's dumping the log file in the location and filename that's only specified by a variable.

Edit: nevermind, turns out the bastards had the other variables being used inside a loop with a bad static value ( :argh:

). I manually edited their garbage and had to replace a half-dozen static values that they swore they�d updated to variable and now it�s working, at least until they change something again.

PierreTheMime fucked around with this message at 04:00 on Dec 29, 2017

# ? Dec 29, 2017 00:56

Paul MaudDib: May 3, 2006; TEAM NVIDIA:
FORUM POLICE

I'd like to create an "overlay table", where I have some immutable base set of data and then I can create a mask over the top using a second table with the same PK. In other words the view select would be:

CREATE VIEW OVERLAY_VW AS SELECT CASE IF OVERLAY_TBL.COL1 IS NOT NULL THEN OVERLAY_TBL.COL1 ELSE BASE_TBL.COL1 END AS COL1 (repeat for each column) FROM BASE_TBL LEFT JOIN OVERLAY_TBL ON BASE_TBL.PK = OVERLAY_TBL.PK

Is there a more graceful way to do that?

The use-case here is that raw data is periodically imported from a dirty external source and I am doing both scripted and manual cleanup on top of it, so it would be nice to be able to be able to just truncate and reload that table when I get new data, while not losing any cleanup that I've already done.

Paul MaudDib fucked around with this message at 03:31 on Dec 29, 2017

# ? Dec 29, 2017 03:03

Munkeymon: Aug 14, 2003; Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.

Paul MaudDib posted:

I'd like to create an "overlay table", where I have some immutable base set of data and then I can create a mask over the top using a second table with the same PK. In other words the view select would be:

CREATE VIEW OVERLAY_VW AS SELECT CASE IF OVERLAY_TBL.COL1 IS NOT NULL THEN OVERLAY_TBL.COL1 ELSE BASE_TBL.COL1 END AS COL1 (repeat for each column) FROM BASE_TBL LEFT JOIN OVERLAY_TBL ON BASE_TBL.PK = OVERLAY_TBL.PK

Is there a more graceful way to do that?

The use-case here is that raw data is periodically imported from a dirty external source and I am doing both scripted and manual cleanup on top of it, so it would be nice to be able to be able to just truncate and reload that table when I get new data, while not losing any cleanup that I've already done.

COALESCE(OVERLAY_TBL.COL1, BASE_TBL.COL1) or ISNULL if you're on MSSQL.

# ? Dec 29, 2017 15:58

Just-In-Timeberlake: Aug 18, 2003

meanieface posted:

This request made me feel like god is dead just in time for Yule. Bravo on that timing.

Assuming that you are doing this as part of another, larger, process and you only want *one* row at the end --

SELECT email, [format], [guid], label
FROM
(SELECT r.*
FROM #temptbl r
WHERE col = 'value'
) r
pivot(MAX([value]) FOR r.name IN ([email],[format],[guid],[label])) p

This worked by the way, so thanks!

# ? Dec 29, 2017 19:28

olives black: Nov 24, 2017; LENIN.
STILL.
WON'T.
FUCK.
ME.

Anyone got suggestions for books on proper relational database design? I'm thinking about starting a new project from scratch, and I want to avoid hitting the coding stage with a flawed design.

Amazon says this one's good, but if anyone has a better suggestion I'd appreciate it.

Sorry for asking this on page 227, but I saw nothing in the OP or the first few pages on this (just one or two posts on anti-patterns).

# ? Dec 31, 2017 01:19

Hughmoris: Apr 21, 2007; Let's go to the abyss!

I need to learn how to write SQL queries for work. I just stepped through the free Stanford Relational Algebra videos and they were surprisingly engaging. Next is the SQL section of videos.

I know that often times theory can be learned in the classroom but never knowingly utilized in the field. For those who write queries, how often (if ever) do you think about relational algebra?

Hughmoris fucked around with this message at 01:55 on Dec 31, 2017

# ? Dec 31, 2017 01:52

olives black: Nov 24, 2017; LENIN.
STILL.
WON'T.
FUCK.
ME.

Hughmoris posted:

I need to learn how to write SQL queries for work. I just stepped through the free Stanford Relational Algebra videos and they were surprisingly engaging. Next is the SQL section of videos.

I know that often times theory can be learned in the classroom but never knowingly utilized in the field. For those who write queries, how often (if ever) do you think about relational algebra?

Checking this out, thanks.

# ? Dec 31, 2017 02:24

Nth Doctor: Sep 7, 2010; Darkrai used Dream Eater!
It's super effective!

Hughmoris posted:

For those who write queries, how often (if ever) do you think about relational algebra?

Never. Been a SQL developer for 13 years and I do schema design as a mostly intuitive thing, which is a lovely answer, I know, but seriously: never do I think explicitly in terms of the algebra.

# ? Dec 31, 2017 05:50

kumba: Nov 8, 2003; I posted my food for USPOL Thanksgiving!

enjoy the ride; Lipstick Apathy

Been a SQL analyst for 3 years, have never thought about algebra once

# ? Dec 31, 2017 15:41

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

olives black posted:

Anyone got suggestions for books on proper relational database design? I'm thinking about starting a new project from scratch, and I want to avoid hitting the coding stage with a flawed design.

Amazon says this one's good, but if anyone has a better suggestion I'd appreciate it.

Sorry for asking this on page 227, but I saw nothing in the OP or the first few pages on this (just one or two posts on anti-patterns).

The answer is going to be different for transactional schema design and analytic schema design. For OLAP, The Data Warehouse Toolkit (or most things by Ralph Kimbal) is generally considered the standard. For OLTP, I'd defer to someone else' opinion.

# ? Dec 31, 2017 19:17

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

Nth Doctor posted:

Never. Been a SQL developer for 13 years and I do schema design as a mostly intuitive thing, which is a lovely answer, I know, but seriously: never do I think explicitly in terms of the algebra.

Relational algebra is irrelevant to query writing and has largely been replaced by convention for schema design. If you want to design a new database, especially the query engine, or understand why the conventions are what they are, understanding the algebra becomes important.

# ? Dec 31, 2017 19:19

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Is there a recommended formatting/style guide I should adhere to as I start learning to write SQL queries? I'm using SSMS but the computer is pretty locked down and I don't think I'll be able to use plugins to help.

# ? Dec 31, 2017 19:37

Kuule hain nussivan: Nov 27, 2008

Hughmoris posted:

Is there a recommended formatting/style guide I should adhere to as I start learning to write SQL queries? I'm using SSMS but the computer is pretty locked down and I don't think I'll be able to use plugins to help.

There doesn't seem to be a lot of official SQL style guides. http://www.sqlstyle.guide/ seems like a pretty decent read. I personally don't sign off on all of it, but most of it is common sense stuff.

# ? Dec 31, 2017 19:47

Paul MaudDib: May 3, 2006; TEAM NVIDIA:
FORUM POLICE

The only times I've needed to think about relational algebra are when you're trying to read a query plan and figure out what it's doing that is taking so goddamn long, and really only in a tertiary sense. It's something that's hidden "under the hood", you really shouldn't need to think about it unless you are writing a database engine yourself.

# ? Dec 31, 2017 19:48

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Thanks for the answers. One more beginner question: Should I be spending my time learning how to use the query builder/designer in SSMS, or should I focus instead on writing queries by hand for a while?

# ? Dec 31, 2017 21:23

spiritual bypass: Feb 19, 2008; Grimey Drawer

I've never used that particular query building tool, but I've also never wanted one. SQL is usually pretty simple. It's probably much faster just to type everything by hand once you remember all the keywords and concept.

# ? Dec 31, 2017 22:33

Paul MaudDib: May 3, 2006; TEAM NVIDIA:
FORUM POLICE

Virtually without exception, in any type of software development the bottleneck will never be "how fast your fingers can type" and therefore any sort of GUI tool (beyond a basic IDE) is doomed to fail.

Schema management tools like PowerDesigner can be really nice for handling boilerplate schema migrations but that's the only "visual" tool I use on any sort of a consistent basis. Otherwise I just use PGadmin and DBeaver.

# ? Jan 1, 2018 00:01

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Hughmoris posted:

Thanks for the answers. One more beginner question: Should I be spending my time learning how to use the query builder/designer in SSMS, or should I focus instead on writing queries by hand for a while?

I work in an office full of young support people who use SSMS four out of eight hours a day but who couldn't tell you what a trigger is. I warned them several times that by blindly following wizards and menus, instead of learning the actual SQL commands and how they work, they're practically leaving money on the table (in the form of valuable skills and future job prospects). Please don''t do the same.

# ? Jan 1, 2018 00:44

Hughmoris: Apr 21, 2007; Let's go to the abyss!

NihilCredo posted:

I work in an office full of young support people who use SSMS four out of eight hours a day but who couldn't tell you what a trigger is. I warned them several times that by blindly following wizards and menus, instead of learning the actual SQL commands and how they work, they're practically leaving money on the table (in the form of valuable skills and future job prospects). Please don''t do the same.

Thanks for the advice. I'm not interested in taking short cuts if it'll possibly handicap me down the road.

I just finished up the Wise Owl: SQL Server Queries series, and am working my way through some practice problems. Are there any SQL rooms in IRC or Discord that people hang out in, to get feedback on some simple queries I'm writing?

For this particular problem: I have a Movies DB, and I want to know who the oldest actor is for each movie. My query returns the results but are there any easy improvements to be had?

SQL code:

WITH my_select
AS (SELECT
    FilmName,
    MIN(ActorDOB) AS BirthDate
FROM tblCast AS c
INNER JOIN tblActor AS a
    ON c.CastActorID = a.ActorID
INNER JOIN tblFilm AS f
    ON c.CastFilmID = f.FilmID
GROUP BY FilmName)

SELECT
    FilmName,
    ActorName,
    ActorDOB
FROM my_select
INNER JOIN tblActor
    ON my_select.BirthDate = tblActor.ActorDOB
ORDER BY FilmName

# ? Jan 1, 2018 23:50

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Hughmoris posted:

For this particular problem: I have a Movies DB, and I want to know who the oldest actor is for each movie. My query returns the results but are there any easy improvements to be had?

One mistake: your query will return, for each movie, all the actors that share the same birth date as the oldest actor in the movie, even if those actors didn't appear in the movie.

# ? Jan 2, 2018 00:14

Hughmoris: Apr 21, 2007; Let's go to the abyss!

NihilCredo posted:

One mistake: your query will return, for each movie, all the actors that share the same birth date as the oldest actor in the movie, even if those actors didn't appear in the movie.

I've been trying my hand at fixing this for the past 45 minutes, can't figure it out. Any tips? I can explain what I need to do in pseudo code but I'm not sure how to tackle it in SQL.

Hughmoris fucked around with this message at 02:58 on Jan 2, 2018

# ? Jan 2, 2018 01:27

kumba: Nov 8, 2003; I posted my food for USPOL Thanksgiving!

enjoy the ride; Lipstick Apathy

Hughmoris posted:

I've been trying my hand at fixing this for the past 45 minutes, can't figure it out. Any tips? I can explain what I need to do in pseudo code but I'm not sure how to tackle it in SQL.

You want to rank the birth dates within each movie, and only select the person with the lowest birthdate for each movie, so I'd use a ranking function in your CTE:

SQL code:

WITH my_select
AS (SELECT
    FilmName,
    ActorName,
    ActorDOB AS BirthDate,
    ROW_NUMBER() OVER (PARTITION BY f.FilmID Order By a.ActorDOB asc) AS BirthDateRank
FROM tblCast AS c
INNER JOIN tblActor AS a
    ON c.CastActorID = a.ActorID
INNER JOIN tblFilm AS f
    ON c.CastFilmID = f.FilmID
)

SELECT
    FilmName,
    ActorName,
    ActorDOB
FROM my_select
WHERE BirthDateRank = 1
ORDER BY FilmName

Note: this will still run into a problem if two actors have the same birth date and they are the oldest, since this will result in something non-deterministic. You'll want to look at the various ranking functions (RANK, DENSE_RANK, ROW_NUMBER, etc) and what other criteria you'd want as a tiebreaker for the edge case.

# ? Jan 2, 2018 03:04

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Hughmoris posted:

I've been trying my hand at fixing this for the past 45 minutes, can't figure it out. Any tips? I can explain what I need to do in pseudo code but I'm not sure how to tackle it in SQL.

The subquery is fine.

You don't want to join on all the actors with the same birth date, but on all the actors with the same birth date in that movie.

Add a join on tblCast to the main query.

# ? Jan 2, 2018 03:13

Hughmoris: Apr 21, 2007; Let's go to the abyss!

kumba posted:

You want to rank the birth dates within each movie, and only select the person with the lowest birthdate for each movie, so I'd use a ranking function in your CTE:
SQL code:
WITH my_select
AS (SELECT
    FilmName,
    ActorName,
    ActorDOB AS BirthDate,
    ROW_NUMBER() OVER (PARTITION BY f.FilmID Order By a.ActorDOB asc) AS BirthDateRank
FROM tblCast AS c
INNER JOIN tblActor AS a
    ON c.CastActorID = a.ActorID
INNER JOIN tblFilm AS f
    ON c.CastFilmID = f.FilmID
)

SELECT
    FilmName,
    ActorName,
    ActorDOB
FROM my_select
WHERE BirthDateRank = 1
ORDER BY FilmName
Note: this will still run into a problem if two actors have the same birth date and they are the oldest, since this will result in something non-deterministic. You'll want to look at the various ranking functions (RANK, DENSE_RANK, ROW_NUMBER, etc) and what other criteria you'd want as a tiebreaker for the edge case.

Thank you. I haven't seen ROW_NUMBER() or PARTITION BY in my adventures yet, so I'll do some reading along with the ranking functions.

NihilCredo posted:

The subquery is fine.

You don't want to join on all the actors with the same birth date, but on all the actors with the same birth date in that movie.

Add a join on tblCast to the main query.

I'll give that a try. Thanks!

# ? Jan 2, 2018 03:16

PhantomOfTheCopier: Aug 13, 2008; Pikabooze!

Different movies may have the same title as well.

# ? Jan 2, 2018 14:58

Ruggan: Feb 20, 2007; WHAT THAT SMELL LIKE?!

One problem I haven't found a great solution to is returning nested data structures.

Let's say I have two tables: Task and TaskOwner. The relationship is one-to-many, and I'm consuming the output in C# to expose it via a web API endpoint as JSON. In the example below, I'll be returning 10 rows from the Task table and all related TaskOwners.

I've used two approaches in the past, and I don't love either of them:

Return 10 Task rows. Get distinct list of TaskOwnerIds and return them as a second query. Map the result to the Task rows.
Join Task to TaskOwner and return the result. De-duplicate Task rows in the ORM, some of which will be repeated due to the one-to-many join.

What I'd really like is some way of returning the nested data structure directly from SQL - in a way, composing the object with a single query without returning duplicated data. Maybe this is possible with XML PATH, but that also seems ugly. Any novel approaches people have taken for this?

# ? Jan 2, 2018 17:57

Paul MaudDib: May 3, 2006; TEAM NVIDIA:
FORUM POLICE

The novel approach would be to trust in your ORM's cache. Pull the Task data once, and whenever you access the TaskOwner property your ORM should only actually hit the DB if the TaskOwner is not already cached. Even if not, primary key lookups should be extremely fast in any database. Or if the data is sparse enough that that's a bottleneck, you do the solution with the join.

Before you go implementing something crazy, are you sure that this is actually a bottleneck for you?

Paul MaudDib fucked around with this message at 18:10 on Jan 2, 2018

# ? Jan 2, 2018 18:05

Ruggan: Feb 20, 2007; WHAT THAT SMELL LIKE?!

Hughmoris posted:

Thank you. I haven't seen ROW_NUMBER() or PARTITION BY in my adventures yet, so I'll do some reading along with the ranking functions.

I'll give that a try. Thanks!

Windowed functions are really awesome and powerful, and I use them a lot. However, you need to be really careful with the window's frame. The frame of a window defines the subset of rows in which the function will work. The default frame is "BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW". That basically means, when analyzing this function across all your returned rows, only analyze rows that are between the beginning of your PARTITION and the row being calculated (not the end of your PARTITION).

This can introduce a very problematic "gotchya" where your window function isn't returning the data you'd expect because the frame isn't set appropriately. If you use the LAST_VALUE() function, for example, the last row will always be the current row:

Make sure you read up on windowed functions and frames before you start using code in production.

https://www.red-gate.com/simple-talk/sql/learn-sql-server/window-functions-in-sql-server-part-2-the-frame/

# ? Jan 2, 2018 18:05

Ruggan: Feb 20, 2007; WHAT THAT SMELL LIKE?!

Paul MaudDib posted:

The novel approach would be to trust in your ORM's cache. Pull the Task data once, and whenever you access the TaskOwner property your ORM should only actually hit the DB if the TaskOwner is not already cached.

Before you go implementing something crazy, are you sure that this is actually a bottleneck for you?

Yeah. It's a customer facing site where I work. The tasks are basically todo items that need to be done to complete a project. There are tens of thousands of tasks per project, and about 3 million in our system total. Each of these tasks has multiple one-to-many relationships. TaskOwner is a mapping table between the Tasks and Users table, of which we have about 50k.

We currently use Entity Framework, but it imposes a costly overhead, and does not integrate well with version control due to its graphical UI. I personally really dislike EF. It also encourages our developers to tightly couple to our database structure, as they can just hook into tables directly. Due to this tight coupling, I'm nervous to change the database's structure - I'm not sure if I'll inadvertently break some developer's bad code. I'd prefer all web app interaction to go through pre-defined SQL (stored procedures).

I want to move us to using something like Dapper, which I use in some personal projects - it's a Micro-ORM most notably in use at StackOverflow. Most of our queries that hit or pull from the database are already encapsulated in stored procedures, which that supports. I can really wring some great performance out of it. All I need to do is codify all the interactions as Stored Procs and force the transition. My biggest holdup is finding a way to represent these complex nested objects while minimizing the amount of data that needs to flow around. I can do it with one of the methods I mentioned, but I wish there was a better way.

Ruggan fucked around with this message at 18:20 on Jan 2, 2018

# ? Jan 2, 2018 18:17

Adbot: ADBOT LOVES YOU

# ? Jun 8, 2024 00:26

Ruggan: Feb 20, 2007; WHAT THAT SMELL LIKE?!

Turns out SQL has a FOR JSON PATH formatter that can be used to make nested outputs:

code:

SELECT      T.TaskId as Id,
            T.TaskName as Name,
            Owners = (
                SELECT      U.UserId as Id,
                            U.UserName as Name
                FROM        TaskOwner TU
                INNER JOIN  User U
                ON          TF.OwnerId = F.OwnerId
                WHERE       TF.TaskId = T.TaskId
                FOR JSON PATH
            )
            
FROM        Task T

FOR JSON PATH

This will output tasks in the following format:

code:

[  
   {  
      "Id":1,
      "Name":"Test",
      "Owners":[  
         {  
            "Id":1,
            "Name":"A"
         },
         {  
            "Id":2,
            "Name":"B"
         },
         {  
            "Id":3,
            "Name":"C"
         }
      ]
   },
   ...
]

This is basically exactly what I'm looking for, but I want to understand more of how the engine is doing it behind the scenes before I actually use it. Also need to performance test it.

Anybody have experience with using FOR JSON PATH?

# ? Jan 2, 2018 19:14

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

«‹›303 »