Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

Yeah, data was bad. Put in a simple check to make sure child wasn't using itself as a parent
code:
;WITH CTE
AS
(SELECT  ParentSKU,ChildSKU,CAST('thing' AS VARCHAR(255)) as thing
    FROM    BOM
    UNION ALL
   SELECT  C.ParentSKU,C.ChildSKU,CAST(C.ParentSKU + '->' + C.ChildSKU AS VARCHAR(255)) as tthing
   FROM    CTE AS  C
        INNER JOIN  BOM  AS  T   ON  T.ChildSKU =   C.ParentSKU AND T.ChildSKU<>C.ParentSKU
)
SELECT * FROM CTE where parentsku='SIMPLEPRODUCT'
But it's still only going 1 level deep on COMPLEXPRODUCT

Scaramouche fucked around with this message at 21:16 on Jul 26, 2018

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
I can't post. App won't post, and every attempt on the website is getting stuck in captcha.

Scaramouche posted:

Yeah, data was bad. Put in a simple check to make sure child wasn't using itself as a parent
code:
;WITH CTE
AS
(SELECT  ParentSKU,ChildSKU,CAST('thing' AS VARCHAR(255)) as thing
    FROM    BOM
    UNION ALL
   SELECT  C.ParentSKU,C.ChildSKU,CAST(C.ParentSKU + '->' + C.ChildSKU AS VARCHAR(255)) as tthing
   FROM    CTE AS  C
        INNER JOIN  BOM  AS  T   ON  T.ChildSKU =   C.ParentSKU AND T.ChildSKU<>C.ParentSKU
)
SELECT * FROM CTE where parentsku='SIMPLEPRODUCT'
But it's still only going 1 level deep on COMPLEXPRODUCT
Something doesn't logically match the definition of a "row" here. I'm having a hard time seeing a situation where a 'thing' has both a parent and a child defined within the row, which is leading to the confusion that it's... backwards. In the first recursive step, c.parentsku=t.childsku finds all new rows having the initialization row as a child. Now one could construct supersets that way, but I'm fairly certain you intended to select the children of the current rows, which is c.childsku=bom.thingyid.
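For concreteness, an untested minimal sketch with the join turned around (original column names kept; no cycle protection beyond the parent/child check):
code:
;WITH CTE AS
(SELECT ParentSKU, ChildSKU, CAST(ParentSKU + '->' + ChildSKU AS VARCHAR(255)) AS thing
    FROM BOM
    WHERE ParentSKU = 'SIMPLEPRODUCT'
    UNION ALL
    -- walk downward: the previous level's child becomes the next level's parent
    SELECT T.ParentSKU, T.ChildSKU, CAST(C.thing + '->' + T.ChildSKU AS VARCHAR(255)) AS thing
    FROM CTE AS C
        INNER JOIN BOM AS T ON T.ParentSKU = C.ChildSKU AND T.ChildSKU <> T.ParentSKU
)
SELECT * FROM CTE;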

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Scaramouche posted:

Yeah, data was bad. Put in a simple check to make sure child wasn't using itself as a parent

Circular BOMs are THE WORST.

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

PhantomOfTheCopier posted:

I can't post. App won't post, and every attempt on the website is getting stuck in captcha.

Something doesn't logically match the definition of a "row" here. I'm having a hard time seeing a situation where a 'thing' has both a parent and a child defined within the row, which is leading to the confusion that it's... backwards. In the first recursive step, c.parentsku=t.childsku finds all new rows having the initialization row as a child. Now one could construct supersets that way, but I'm fairly certain you intended to select the children of the current rows, which is c.childsku=bom.thingyid.

Sorry dude, I was having problems posting myself, but weirdly enough only in this topic. ZDR said that in the past some errant SQL in this thread was triggering captchas; I wonder if that was happening again. But thanks for your insight.


Nth Doctor posted:

Circular BOMs are THE WORST.

So it's looking like partly this, but also because the data is kind of crap. I have the CTE working (thanks PhantomOfTheCopier), but it's giving me "vertical" results (e.g. all rows in a 1,2 configuration), whereas what I'd prefer is a "horizontal" result (e.g. 1,2,3,4,5,6 out to the length of all components, with 1 being the first "parent"), so it looks like I'm going to have to pivot it out. I also can only do one SKU at a time; otherwise I eventually hit the recursion limit, even at the max setting (2^15). This is where it's at currently:
code:
WITH SCREWYOU (parentsku,childsku,path) AS
(
      SELECT
            parentsku,childsku,CAST('parent' as VARCHAR(255)) as thing
      FROM
            mobom
      WHERE 
	parentsku = 'SCREWYOU'
      UNION ALL 
      SELECT
            emp.parentsku,emp.childsku,CAST((CONVERT(varchar,emp.ComID) + '->' + emp.ParentSKU + '->' + emp.ChildSKU) as VARCHAR(255)) as thinger
      FROM
            mobom emp
      INNER JOIN SCREWYOU ON 
            emp.parentsku = SCREWYOU.childsku
)
SELECT * FROM SCREWYOU
OPTION (MAXRECURSION 32767);

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Scaramouche posted:

[Recursive stuff]

PhantomOfTheCopier posted:

Something doesn't logically match the definition of a "row" here. I'm having a hard time seeing a situation where a 'thing' has both a parent and a child defined within the row, which is leading to the confusion that it's... backwards. In the first recursive step, c.parentsku=t.childsku finds all new rows having the initialization row as a child. Now one could construct supersets that way, but I'm fairly certain you intended to select the children of the current rows, which is c.childsku=bom.thingyid.
Yeah, you have the join backwards and you never actually use the data from the "child" row anyway.

Try something like this (assuming this is T-SQL):

SQL code:
;WITH CTE
AS
(SELECT  DISTINCT ParentSKU TopLevelSKU, CAST(null as VARCHAR(30)) as ParentSKU, ParentSKU as ChildSKU, CAST(ParentSKU AS VARCHAR(255)) as thing, CAST('|'+ParentSKU AS VARCHAR(255)) as bomPath, 0 as level
    FROM    BOM
    UNION ALL
   SELECT  P.TopLevelSKU, C.ParentSKU, C.ChildSKU, CAST(SPACE((P.level+1)*2)+'->' + C.ChildSKU AS VARCHAR(255)) as thing, CAST(P.bomPath + '|' + C.ChildSKU AS VARCHAR(255)) as bomPath, P.level+1 as level
   FROM    CTE AS  P
        INNER JOIN  BOM  AS  C   ON  P.ChildSKU =   C.ParentSKU AND C.ChildSKU <>   C.ParentSKU
 AND P.bomPath not like '%|'+C.ParentSKU+'|%'
)
SELECT * FROM CTE where TopLevelSKU='SIMPLEPRODUCT'
order by bomPath
Untested, but this should go multiple levels, and it goes in the right direction. Also, I think this will filter out all circular BOMs, not just ones where the parent and child are the same.

A few more notes:

If you're only ever returning the BOM for one top-level product (or a defined list of top-level products) at a time, you may want to put the TopLevelSKU='SIMPLEPRODUCT' filter condition in the base query. The DB is probably smart enough to push the parameter down without you explicitly telling it to, but if it's not, then you're generating the entire BOM for all products and then throwing most of it away.
Fake Edit: Oh, you've already done this.

If you're instead returning the BOM for all products (once you're done with your testing), you should also consider what the definition of "all" is. If you're returning the BOM for ProductA and ProductB, each of which contains Component1, do you also need to return the BOM where Component1 is considered the top-level product? You may need to come up with logic for how to determine what is a top-level product (since if you had a list you wouldn't be reading this paragraph). My suggestion would be something like putting the condition
SQL code:
where ParentSKU not in (select ChildSKU from BOM where ChildSKU<>ParentSKU)
in the base query.
E: Could that be what the ChildSKU=ParentSKU means, or are you sure that this is a data problem?

Fake Edit:

Scaramouche posted:

Sorry dude, I was having problems posting myself, but weirdly enough only in this topic. ZDR said that in the past some errant SQL in this thread was triggering captchas; I wonder if that was happening again. But thanks for your insight.


So it's looking like partly this, but also because the data is kind of crap. I have the CTE working (thanks PhantomOfTheCopier), but it's giving me "vertical" results (e.g. all rows in a 1,2 configuration), whereas what I'd prefer is a "horizontal" result (e.g. 1,2,3,4,5,6 out to the length of all components, with 1 being the first "parent"), so it looks like I'm going to have to pivot it out. I also can only do one SKU at a time; otherwise I eventually hit the recursion limit, even at the max setting (2^15). This is where it's at currently:
[snip]
If you want each row to build on the results from previous iterations, you have to actually use the results from previous iterations. Right now you're using the previous rows to determine what additional data to fetch, but then you don't actually use more than one level of the hierarchy at a time.

Jethro fucked around with this message at 19:41 on Jul 30, 2018

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

Jethro posted:

Try something like this (assuming this is T-SQL):

SQL code:
;WITH CTE
AS
(SELECT  DISTINCT ParentSKU TopLevelSKU, CAST(null as VARCHAR(30)) as ParentSKU, ParentSKU as ChildSKU, CAST(ParentSKU AS VARCHAR(255)) as thing, CAST('|'+ParentSKU AS VARCHAR(255)) as bomPath, 0 as level
    FROM    BOM
    UNION ALL
   SELECT  P.TopLevelSKU, C.ParentSKU, C.ChildSKU, CAST(SPACE((P.level+1)*2)+'->' + C.ChildSKU AS VARCHAR(255)) as thing, CAST(P.bomPath + '|' + C.ChildSKU AS VARCHAR(255)) as bomPath, P.level+1 as level
   FROM    CTE AS  P
        INNER JOIN  BOM  AS  C   ON  P.ChildSKU =   C.ParentSKU AND C.ChildSKU <>   C.ParentSKU
 AND P.bomPath not like '%|'+C.ParentSKU+'|%'
)
SELECT * FROM CTE where TopLevelSKU='SIMPLEPRODUCT'
order by bomPath
Untested, but this should go multiple levels, and it goes in the right direction. Also, I think this will filter out all circular BOMs, not just ones where the parent and child are the same.


I'm applauding here; sheer genius. I never would have gone down the path of using the level, since I didn't properly understand what the CTE was doing. This isn't exactly what I wanted, but exactly what I wanted probably doesn't exist; I believe it can be created from these results (e.g. breaking out into a properly normalized parent table, child table, and parent_child table). Thank you so much everyone, and especially you, Jethro.

I was casting around in the dark in the most annoying "find some crap, copy and paste it, and then complain it doesn't work" way, but you guys zoomed right in on the solution. The only thing I had to do with your 'untested' query was make the varchar length on ParentSKU match the column definition.

Portland Sucks
Dec 21, 2004
༼ つ ◕_◕ ༽つ
Is there a way to generate a derived column that "auto increments" by counting duplicate values in a particular column?

For example say I have 3 rows that all have the same value in the column LotNumber but different values along the rest. I'd like to return rows

LotNumber, PieceNumber, values...
LotNumber, 1, values...
LotNumber, 2, values...
LotNumber, 3, values...

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Portland Sucks posted:

Is there a way to generate a derived column that "auto increments" by counting duplicate values in a particular column?

For example say I have 3 rows that all have the same value in the column LotNumber but different values along the rest. I'd like to return rows

LotNumber, PieceNumber, values...
LotNumber, 1, values...
LotNumber, 2, values...
LotNumber, 3, values...

Middle of the night phone posting but:
ROW_NUMBER() OVER(PARTITION BY LotNumber ORDER BY blah) AS PieceNumber or some similar syntax should do it.

Use the same order by for your overall query to make the numbering make sense.
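Spelled out a bit more, assuming T-SQL (untested; the table and the tie-breaking column are made up):
code:
-- Hypothetical table dbo.Pieces(LotNumber, SomeKey, ...); SomeKey is whatever
-- column gives a deterministic order within a lot.
SELECT
    LotNumber,
    ROW_NUMBER() OVER (
        PARTITION BY LotNumber   -- restart numbering for each lot
        ORDER BY SomeKey         -- deterministic order within the lot
    ) AS PieceNumber,
    SomeKey
FROM dbo.Pieces
ORDER BY LotNumber, PieceNumber; -- same ordering so the numbers read 1,2,3 down the page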

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Anyone have experience with SQL to Web App?

I have a project where we retrieve data from a database and use it in an app. Yeah, whatever, standard. We've used Entity Framework in the past but I'm not a fan of the tight database coupling it introduces, so I've been using Dapper (an ADO.NET wrapper) and rewriting some of our code to call stored procedures that serve as a data access layer / interface. Once we're decoupled, I get a bunch of benefits: I'm free to change the database schema without requiring app code changes in sync, and the app code becomes much simpler.

One of our pieces of code takes a set of filters, pagination, and sort info, and returns a list of items for a data table. Right now the code that runs is dynamic SQL, and I'm trying to move to a more defined structure. I'm outputting nested JSON from SQL for each list item being returned, and this works fine for a single page of records (25 items). It chokes up when I do an export operation that contains thousands of items.

Three questions:

1. Is a column nested query treated like an apply or join?
2. Is there a better way to retrieve my nested data without simply querying each table individually and stitching the data together in the app code? My current method returns denormalized data which of course contains a lot of duplicate records (e.g. status name for a given status id - thousands of items may have the same status and will all contain the status name in their json).
3. Do I really need to have different method calls for different views of the data? I prefer generic models but having specific ones probably lets me gain incremental efficiency where possible.

If anyone has novel approaches or experience they're willing to share, I'd love to hear it.

Ruggan fucked around with this message at 16:53 on Jul 31, 2018

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

Ruggan posted:

Anyone have experience with SQL to Web App?

I have a project where we retrieve data from a database and use it in an app. Yeah, whatever, standard. We've used Entity Framework in the past but I'm not a fan of the tight database coupling it introduces, so I've been using Dapper (an ADO.NET wrapper) and rewriting some of our code to call stored procedures that serve as a data access layer / interface. Once we're decoupled, I get a bunch of benefits: I'm free to change the database schema without requiring app code changes in sync, and the app code becomes much simpler.

One of our pieces of code takes a set of filters, pagination, and sort info, and returns a list of items for a data table. Right now the code that runs is dynamic SQL, and I'm trying to move to a more defined structure. I'm outputting nested JSON from SQL for each list item being returned, and this works fine for a single page of records (25 items). It chokes up when I do an export operation that contains thousands of items.

Three questions:

1. Is a column nested query treated like an apply or join?
2. Is there a better way to retrieve my nested data without simply querying each table individually and stitching the data together in the app code? My current method returns denormalized data which of course contains a lot of duplicate records (e.g. status name for a given status id - thousands of items may have the same status and will all contain the status name in their json).
3. Do I really need to have different method calls for different views of the data? I prefer generic models but having specific ones probably lets me gain incremental efficiency where possible.

If anyone has novel approaches or experience they're willing to share, I'd love to hear it.

1. A query nested in the SELECT portion of a query is called a scalar subquery. If it references the outer query (e.g. via the subquery's WHERE clause) it's correlated and will be executed for every row in the result set. If not, it's non-correlated and should be executed once, but doing an EXPLAIN PLAN is the only way to be sure... I've seen code that you'd think should be non-correlated but still executes for each row, such as a function call with constant parameters, e.g. SELECT foo(1) [SomeValue], b.x, b.y FROM dbo.Bar b. (See the sketch after this list.)
2. My preference is to use inner joins if possible, outer joins if not, and group by or select distinct with those inner/outer joins when required. The exception is if you're caching data in your app layer, in which case it may be better to stitch the data together there. You might also want to look at temp tables, table variables, common table expressions, or views depending on your scenario and how complicated it is, such as if you're reusing a result set multiple times across several SQL statements.
3. At the database level, database views might help you a bit with pre-canning common join scenarios for SELECT queries. Usually when I use Dapper I am writing raw SQL for every command or query. I do dynamically build my WHERE and ORDER BY clauses using Dapper's SqlBuilder plus a few extension methods I wrote for it, and you could maybe use SqlBuilder for managing SELECT and JOIN too, but I don't like losing that control so I don't bother. Generally speaking, if you're using Dapper it's because you want more control than a generic system (i.e. an ORM or SQL-generating tool) will give you. For simple one-table CRUD operations there are some extension libraries for Dapper, but I am not familiar with them.
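A quick illustration of point 1, with made-up tables (using the status example from the post above):
code:
-- Correlated scalar subquery: it references o.StatusId, so it is evaluated per row.
SELECT o.ItemId,
       (SELECT s.StatusName FROM dbo.Status s WHERE s.StatusId = o.StatusId) AS StatusName
FROM dbo.Items o;

-- Equivalent inner join: one pass over the data, and the optimizer has more options.
SELECT o.ItemId, s.StatusName
FROM dbo.Items o
INNER JOIN dbo.Status s ON s.StatusId = o.StatusId;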

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I'm continuing my journey of learning SQL, specifically T-SQL. I'm starting to explore how to set a query to run weekly on a specific weekday.

I'm an analyst who is not part of the server or data warehouse team. Assuming my company has all the typical SQL reporting tools (SSRS?), what should I read up on for the simplest way to schedule a T-SQL query to run every Tuesday?

Ideal world, the query runs every Tuesday and it emails the resultant CSV file to somebody.

Hughmoris fucked around with this message at 05:39 on Aug 2, 2018

Vegetable
Oct 22, 2010

I'm using Hive and can't get my results to be sorted the intended way.

This is what I want, but can't get:

[screenshot omitted]

The logic is as follows:

1. Overall total goes on top of everything else.
2. Rows with identical supergroups are lumped together. Supergroup with highest quantity should come first.
3. Within each supergroup, the subtotal row goes first. The rest are sorted with the largest groups on top.

This is what I'm getting but don't want:

[screenshot omitted]

This is my current code:

code:
ORDER BY CASE
            when supergroup = 'TOTAL' then 1
            else industry
        END
         ,units_sold
How do I get what I want? Ideally this would all be done in the presentation layer, but our system doesn't have one, so we have to make the best with what we've got.

Vegetable fucked around with this message at 06:01 on Aug 2, 2018

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

Hughmoris posted:

I'm continuing my journey of learning SQL, specifically T-SQL. I'm starting to explore how to set a query to run weekly on a specific weekday.

I'm an analyst who is not part of the server or data warehouse team. Assuming my company has all the typical SQL reporting tools (SSRS?), what should I read up on for the simplest way to schedule a T-SQL query to run every Tuesday?

Ideal world, the query runs every Tuesday and it emails the resultant CSV file to somebody.

Read up on SQL Agent jobs. If all you want is CSV results, you can use sp_send_dbmail from inside your SQL job.

Relevant links:
Agent/Jobs: https://docs.microsoft.com/en-us/sql/ssms/agent/sql-server-agent?view=sql-server-2017
Sending Email from T-SQL: https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-send-dbmail-transact-sql?view=sql-server-2017
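For flavor, the job step can be as small as this (untested; the mail profile, recipient, and query are placeholders):
code:
-- Assumes Database Mail is configured with a profile named 'ReportProfile'.
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = 'ReportProfile',
    @recipients = 'somebody@example.com',
    @subject = 'Tuesday report',
    @query = N'SET NOCOUNT ON; SELECT * FROM YourDb.dbo.YourReportQuery;',
    @attach_query_result_as_file = 1,
    @query_attachment_filename = 'report.csv',
    @query_result_separator = ',';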

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Vegetable posted:

I'm using Hive and can't get my results to be sorted the intended way.

This is what I want, but can't get:

[screenshot omitted]

The logic is as follows:

1. Overall total goes on top of everything else.
2. Rows with identical supergroups are lumped together. Supergroup with highest quantity should come first.
3. Within each supergroup, the subtotal row goes first. The rest are sorted with the largest groups on top.

This is what I'm getting but don't want:

[screenshot omitted]

This is my current code:

code:
ORDER BY CASE
            when supergroup = 'TOTAL' then 1
            else industry
        END
         ,units_sold
How do I get what I want? Ideally this would all be done in the presentation layer, but our system doesn't have one, so we have to make the best with what we've got.

An ORDER BY by itself isn't enough to help you. Can you rejigger your query to get a dedicated subtotal-by-group column where every row for a group has the same value, which happens to match the rolled up subtotal? Then you could do
code:
ORDER BY CASE
            when supergroup = 'TOTAL' then 1
            else 2
        END,
        subtotal_column DESC,
        group,
         units_sold DESC
E: So I've never worked with HiveQL, but it looks like it supports windowing functions, so a column like
SUM(units_sold) OVER(PARTITION BY supergroup) AS subtotal_column should work. The sum would end up being twice the actual subtotal, since we're including the subtotal row in that sum as well, but since this is only for ordering and not display it shouldn't be a huge problem unless you're bumping up against max values in your aggregates.

Nth Doctor fucked around with this message at 12:59 on Aug 2, 2018

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Cold on a Cob posted:

Read up on SQL Agent jobs. If all you want is CSV results, you can use sp_send_dbmail from inside your SQL job.

Relevant links:
Agent/Jobs: https://docs.microsoft.com/en-us/sql/ssms/agent/sql-server-agent?view=sql-server-2017
Sending Email from T-SQL: https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-send-dbmail-transact-sql?view=sql-server-2017

Thanks for the guidance and links!

Say I wanted to end up getting fancier than a CSV. I have an Excel workbook that I use as a reporting template. Basically I open the workbook template, refresh my data with the latest CSV file which in turn updates my pivot tables, then save and email out.

I'm all for automating as much as possible. Do you see a path to accomplish this given the need to run the query first? I'm fairly comfortable with VBA and basic scripting (not T-SQL related), but my workstation isn't always on when it needs to run.


*Well that sucks. Looks like I don't have access to SQL Agent Jobs.

Hughmoris fucked around with this message at 16:00 on Aug 2, 2018

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

Nth Doctor posted:

[snip]
E: So I've never worked with HiveQL, but it looks like it supports windowing functions, so a column like
SUM(units_sold) OVER(PARTITION BY supergroup) AS subtotal_column should work. The sum would end up being twice the actual subtotal, since we're including the subtotal row in that sum as well, but since this is only for ordering and not display it shouldn't be a huge problem unless you're bumping up against max values in your aggregates.
Yeah, if Hive supports windowing then you're done. You don't even need to add any columns (assuming Hive supports putting the window functions in the ORDER BY).
code:
ORDER BY CASE
            when supergroup = 'TOTAL' then 1
            else 2
        END,
        SUM(units_sold) OVER(PARTITION BY supergroup) DESC,
        supergroup,
        CASE
            when group = 'subtotal' then 1
            else 2
        END,
        units_sold DESC
So, use the CASE to put the TOTAL row at the top, sort by (twice) the supergroup subtotal to get the largest supergroups after that, sort by supergroup (to keep each supergroup together in case of ties), use the second CASE to put the subtotals at the top within each supergroup, and then sort by units_sold. You may also want to put group at the end of the ORDER BY if you care about having a particular consistent order in case of ties.

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Jethro posted:

You may also want to put group at the end of the ORDER BY if you care about having a particular consistent order in case of ties.

:yeah:

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

Hughmoris posted:

Thanks for the guidance and links!

Say I wanted to end up getting fancier than a CSV. I have an Excel workbook that I use as a reporting template. Basically I open the workbook template, refresh my data with the latest CSV file which in turn updates my pivot tables, then save and email out.

I'm all for automating as much as possible. Do you see a path to accomplish this given the need to run the query first? I'm fairly comfortable with VBA and basic scripting (not T-SQL related), but my workstation isn't always on when it needs to run.


*Well that sucks. Looks like I don't have access to SQL Agent Jobs.

I have zero actual experience doing this myself, but you might be better off bringing data directly into Excel on demand, e.g.:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-connect-excel

Edit: that link is Azure specific but you should be able to do the same with a local SQL Server as well.

Scaramouche
Mar 26, 2001

SPACE FACE! SPACE FACE!

Hughmoris posted:

I'm continuing my journey of learning SQL, specifically T-SQL. I'm starting to explore how to set a query to run weekly on a specific weekday.

I'm an analyst who is not part of the server or data warehouse team. Assuming my company has all the typical SQL reporting tools (SSRS?), what should I read up on for the simplest way to schedule a T-SQL query to run every Tuesday?

Ideal world, the query runs every Tuesday and it emails the resultant CSV file to somebody.

It won't be what you end up doing (unless security is incredibly bad at your workplace), but when I needed to figure this out way back when, I did it on a local instance on my desktop first, using Scheduled Tasks and a proc that called sp_send_dbmail. Because security was incredibly bad where I was, I was able to do it on my local MSSQL instance using a third-party/remote connection to the live database.

My Rhythmic Crotch
Jan 13, 2011

PhantomOfTheCopier posted:

The counting approach then?
code:
WITH values AS (SELECT DISTINCT(value) FROM my_table WHERE id=1234)
,total AS (SELECT id,count(id) ttl FROM my_table GROUP BY id)
,matches AS (SELECT id,count(id) ttl FROM my_table WHERE value IN (SELECT value FROM values) GROUP BY id)
SELECT total.id FROM total,matches WHERE total.id=matches.id AND total.ttl=matches.ttl AND matches.ttl=(SELECT COUNT(value) FROM values);
code:
...
insert into my_table values(9998, 'derp1');
insert into my_table values(9998, 'derp3');
-- to check the subset case

...
  id  
------
 9999
 1234
(2 rows)
Edit: Next there will be some NULLs trickery hiding. :razz:

Edit: No, but there is a dup problem (if id 5557 has values derp1,derp2,derp2 it will match).

Editedit: Which you can get around with count(distinct(value)) instead of count(id)... unless there are null values.

Quoting from several pages and months ago, this is the solution that I went with. It allowed me to eliminate pages and pages of crusty, slow application code written by one of the junior guys. It's been happily humming along in production for some time now. Thanks to all who contributed ideas.

LongSack
Jan 17, 2003

Using MSSQL (T-SQL) SQLExpress (the free one), is there a way to create a database where the physical files reside on a NAS device? I've tried before, but always seem to end up with permission errors. The problem is that when I back up a database, the backup file ends up on the same physical drive as the database itself, which is far from ideal. Or is there a way to back up a database to a NAS device? Again, prior attempts resulted in permission errors. TIA

Edit: sorry if I'm missing something obvious; I'm not an O/S guy. My strengths are firewalls and other threat prevention devices, and I can sling some code.

LongSack fucked around with this message at 04:58 on Aug 5, 2018

Mark Larson
Dec 27, 2003

Interesting...
Couldn't you just run a scheduled Windows job to copy the latest backup files from the local drive to the NAS?

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Thanks for the ideas on how to schedule a weekly query and email.

I have an older coworker who is a Microsoft Excel and Access guru. He pretty much exclusively uses the Query Designer (visual designer) in Access, and he is struggling a bit in transitioning to T-SQL since the Query Designer in SSMS is poor compared to Access's (per him).

Are there any third party tools I can point him at that have a great visual query designer, and can hook in to SQL Server?

LongSack
Jan 17, 2003

Mark Larson posted:

Couldn't you just run a scheduled Windows job to copy the latest backup files from the local drive to the NAS?

Yeah, I think what I'll do is write a program that backs up all the databases and then copies the backup files to the NAS, and schedule that program to run. Thanks.

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


LongSack posted:

Using MSSQL (T-SQL) SQLExpress (the free one), is there a way to create a database where the physical files reside on a NAS device? I've tried before, but always seem to end up with permission errors. The problem is that when I back up a database, the backup file ends up on the same physical drive as the database itself, which is far from ideal. Or is there a way to back up a database to a NAS device? Again, prior attempts resulted in permission errors. TIA

Edit: sorry if I'm missing something obvious; I'm not an O/S guy. My strengths are firewalls and other threat prevention devices, and I can sling some code.

How is the NAS accessed on the SQLExpress machine? \\myNAS\backups\db or X:\backups\db type stuff? It looks like the scripts here would be file-location agnostic, provided the user executing the backup and/or the account the SQL instance runs under has write permissions to the destination.
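The backup itself would then be something like this (untested; database name and path are placeholders), provided that write permission is in place:
code:
-- UNC paths are fine in BACKUP DATABASE as long as the SQL Server
-- service account can write to the share.
BACKUP DATABASE [YourDb]
TO DISK = N'\\myNAS\backups\db\YourDb.bak'
WITH INIT, CHECKSUM;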

LongSack
Jan 17, 2003

Nth Doctor posted:

How is the NAS accessed on the SQLExpress machine? \\myNAS\backups\db or X:\backups\db type stuff? It looks like the scripts here would be file-location agnostic, provided the user executing the backup and/or the account the SQL instance runs under has write permissions to the destination.

I suspect so, but I'm not certain. I had been using a WD "personal cloud" type server, but it's really unreliable and a pain in the rear end, so I ordered a "real" NAS (a Synology 2-bay enclosure and a couple of WD Red 2TB drives). They won't be delivered until tomorrow, so I won't know for sure until I get it set up. I strongly suspect that one of those two methods (or perhaps both) will be how it goes. Thanks for the links!

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy
i am brainfarting hard and cannot work this out for some reason

say i have a list of account numbers, products, begin dates, and end dates like so:

code:
AccountNum   |  Product   |  BeginDate  |  EndDate
1234567890   |  ProdA     |  1/1/2016   |  5/31/2016
1234567890   |  ProdA     |  8/5/2016   |  10/31/2016
1234567890   |  ProdA     |   1/1/2018  |  4/30/2018
1234567890   |  ProdB     |   2/2/2017  |  6/30/2017
1234567890   |  ProdB     |   8/1/2017  |  12/31/2017
7894561203   |  ProdA     |   5/5/2018  |  7/31/2018
7894561203   |  ProdC     |  12/1/2016   |  2/28/2017
7894561203   |  ProdC     |  4/1/2017   |  6/30/2018
My desired output is as follows: for each account number & product combination, I need the earliest begin date and the latest end date, like so

code:
AccountNum   |  Product   |  BeginDate  |  EndDate
1234567890   |  ProdA     |  1/1/2016   |   4/30/2018
1234567890   |  ProdB     |  2/2/2017   |  12/31/2017
7894561203   |  ProdA     |  5/5/2018   |  7/31/2018
7894561203   |  ProdC     |  12/1/2016  |  6/30/2018
There are varying numbers of accounts, there are a set number of products but each account can add or remove any product at any time, so there's no upper limit on how many different times a single account can begin and end the same product

I feel like there's an easy answer here but I'm coming up short. Counting the number of instances of a product per account, ordering by account/product/begin date, apply row_number partitioned by account/product and order by begin date, grab begindate where row_number = 1 and shove the count of products into a lag function to get the proper end date? I think that will work but I feel like there's got to be an easier way of accomplishing this

Kuule hain nussivan
Nov 27, 2008

kumba posted:

i am brainfarting hard and cannot work this out for some reason

say i have a list of account numbers, products, begin dates, and end dates like so:

code:
AccountNum   |  Product   |  BeginDate  |  EndDate
1234567890   |  ProdA     |  1/1/2016   |  5/31/2016
1234567890   |  ProdA     |  8/5/2016   |  10/31/2016
1234567890   |  ProdA     |   1/1/2018  |  4/30/2018
1234567890   |  ProdB     |   2/2/2017  |  6/30/2017
1234567890   |  ProdB     |   8/1/2017  |  12/31/2017
7894561203   |  ProdA     |   5/5/2018  |  7/31/2018
7894561203   |  ProdC     |  12/1/2016   |  2/28/2017
7894561203   |  ProdC     |  4/1/2017   |  6/30/2018
My desired output is as follows: for each account number & product combination, I need the earliest begin date and the latest end date, like so

code:
AccountNum   |  Product   |  BeginDate  |  EndDate
1234567890   |  ProdA     |  1/1/2016   |   4/30/2018
1234567890   |  ProdB     |  2/2/2017   |  12/31/2017
7894561203   |  ProdA     |  5/5/2018   |  7/31/2018
7894561203   |  ProdC     |  12/1/2016  |  6/30/2018
There are varying numbers of accounts, there are a set number of products but each account can add or remove any product at any time, so there's no upper limit on how many different times a single account can begin and end the same product

I feel like there's an easy answer here but I'm coming up short. Counting the number of instances of a product per account, ordering by account/product/begin date, apply row_number partitioned by account/product and order by begin date, grab begindate where row_number = 1 and shove the count of products into a lag function to get the proper end date? I think that will work but I feel like there's got to be an easier way of accomplishing this

Does...

SELECT AccountNum, Product, MIN(BeginDate), MAX(EndDate)
FROM Table
GROUP BY Account, Product

...work?

Kuule hain nussivan fucked around with this message at 22:35 on Aug 9, 2018

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Kuule hain nussivan posted:

Does...

SELECT AccountNum, Product, MIN(BeginDate), MAX(EndDate)
FROM Table
GROUP BY Account, Product

...work?

this the thing.

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy
I have been staring at this data for so long I ignored the obvious, for gently caress's sake

Thanks :)

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Now get me each product, the count of current product subscriptions, the account that purchased it first, and the distinct count of other unique products that account had purchased prior. Make sure to include products that have never been purchased.

19 o'clock
Sep 9, 2004

Excelsior!!!
Novice here.

I'm running some queries that have gotten wayyyyy out of hand regarding string manipulation for extracting dates from some invoice fields. I want to build some UDFs but don't have developer access to the server to do so.

Is there a way I can define these alongside the query itself? I.e.:

code:
CREATE FUNCTION udfGetStartDate (@input VARCHAR(50))
RETURNS DATE
AS BEGIN
	DECLARE @returnvalue as DATE
	DECLARE @invoicecomment as VARCHAR(50)
		
	SET @invoicecomment = @input

	SET @returnvalue = try_cast(Substring(@invoicecomment, Charindex('/', @invoicecomment) - 2, 10) as date)

	RETURN @returnvalue
END

SELECT udfGetStartDate(I.InvoiceCmnt) as START_DATE 
       FROM tableInvoices I
       WHERE I.Whatever
I'm trying to feed this all to the server but to no avail. I want to make this so that someone besides myself can manage the messiness of nested and re-used string functions all over the place.

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice
In SQL Server you can create connection-scoped temporary procedures, i.e. CREATE PROCEDURE #tempproc, but I have never done this so YMMV.

You could also do dynamic SQL, e.g.

code:
DECLARE @mydoublefunc nvarchar(500) = N'SELECT @outputval = @inputval * 2;';
DECLARE @mydoublefunc_parms nvarchar(500) = N'@inputval int, @outputval int OUTPUT';

DECLARE @result int;
EXEC sp_executesql @mydoublefunc, @mydoublefunc_parms, 5, @outputval = @result OUTPUT;
PRINT CONVERT(varchar, @result);
This might not be allowed by your DBA though.

Failing that, maybe run a pre-processor on your SQL code before running it.
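For what it's worth, the temp-procedure route applied to your earlier function might look like this (untested sketch; the sample comment string is invented):
code:
-- Connection-scoped temp procedure; it disappears when the session ends,
-- and unlike a UDF you can't call it inline in a SELECT.
CREATE PROCEDURE #GetStartDate
    @input VARCHAR(50),
    @result DATE OUTPUT
AS
BEGIN
    SET @result = TRY_CAST(SUBSTRING(@input, CHARINDEX('/', @input) - 2, 10) AS DATE);
END;
GO

DECLARE @start DATE;
EXEC #GetStartDate @input = 'svc 08/01/2018 - 08/31/2018', @result = @start OUTPUT; -- invented sample
SELECT @start AS START_DATE;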

19 o'clock
Sep 9, 2004

Excelsior!!!

Cold on a Cob posted:

In SQL Server you can create connection-scoped temporary procedures, i.e. CREATE PROCEDURE #tempproc, but I have never done this so YMMV.

You could also do dynamic SQL, e.g.

code:
DECLARE @mydoublefunc nvarchar(500) = N'SELECT @outputval = @inputval * 2;';
DECLARE @mydoublefunc_parms nvarchar(500) = N'@inputval int, @outputval int OUTPUT';

DECLARE @result int;
EXEC sp_executesql @mydoublefunc, @mydoublefunc_parms, 5, @outputval = @result OUTPUT;
PRINT CONVERT(varchar, @result);
This might not be allowed by your DBA though.

Failing that, maybe run a pre-processor on your SQL code before running it.

Thank you!! I'm looking at temporary procedures right now and will check out dynamic sql, too. Good ideas - thank you for the quick response!!

Cold on a Cob
Feb 6, 2006

i've seen so much, i'm going blind
and i'm brain dead virtually

College Slice

19 o'clock posted:

Thank you!! I'm looking at temporary procedures right now and will check out dynamic sql, too. Good ideas - thank you for the quick response!!

You're welcome! None of these are perfect replacements for a good UDF but hopefully they'll help.

19 o'clock
Sep 9, 2004

Excelsior!!!

Cold on a Cob posted:

You're welcome! None of these are perfect replacements for a good UDF but hopefully they'll help.

Coding with severe limitations is ultimately good in that you learn some weird corner-case tricks, but boy does it suck when you're in it.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
T-SQL novice here, and I'm seeking a little guidance on how to approach a problem.

I have a table that contains patients and their ordered medications, and a second table that contains those patients and their lab results. What I want to do is evaluate what impact a medication order can have on their follow-up lab value.

Medication_Table:
code:
patient, medication, administration_time
john, potassium supplement, 2018-08-01 08:30
john, magnesium supplement, 2018-08-02 09:00
mary, phosphorus supplement, 2018-08-05 15:00
phil, magnesium supplement, 2018-08-03 13:15
Lab Result Table:
code:
patient, lab_order, lab_order_time, lab_result
john, potassium, 2018-07-25 13:00, 4.3
john, potassium, 2018-08-01 10:30, 4.5
phil, magnesium, 2018-08-04 04:00, 1.8
Ideal Final Table:
code:
patient, medication, administration time, next_relevant_lab_order, next_relevant_lab_order_time, next_relevant_lab_result
john, potassium supplement, 2018-08-01 08:30, potassium, 2018-08-01 10:30, 4.5
phil, magnesium supplement, 2018-08-03 13:15, magnesium, 2018-08-04 04:00, 1.8
I guess the plain English version is: for every electrolyte supplement given, I want to look at that patient's closest subsequent electrolyte lab value to see what the effect is.

Is this something that Window functions / Lead functions are meant for?

Ruggan
Feb 20, 2007
WHAT THAT SMELL LIKE?!


Windowed functions are intended for doing some sort of partitioned or ordered calculation on your dataset. Lag and Lead are specifically for finding the previous or next value of a row without needing to do some crazy joins and filtering. Usually you would use windowed functions on a single dataset (like an individual table, or maybe a product of joins in the right situation). Let's say you were trying to find the value of the prior lab result for the same lab: using Lag lets you skip hairy aggregations and self-joins.

If I were approaching this problem, I would probably start by trying the easiest logical solution: an OUTER APPLY that selects the top 1 lab where lab_date > admin_date, ordered by lab_date. Applies can have worse performance than joins because they're basically evaluating each row individually, so that might not be your ideal solution.

If that performed poorly, then I might try left joining labs to admins on lab_date > admin_date, aggregating min lab_date and grouping on admin_id (effectively giving you the first lab after each admin), and using that to join out to the right data for every admin and its next lab. That's the traditional join-based solution. You might be able to further optimize by capping your join on something like lab_date < dateadd(hour, @dosage_effect_duration, admin_date); that could reduce the number of rows joined (reads).
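An untested sketch of that first approach against your sample tables (the table names, and the rule matching 'potassium supplement' to the 'potassium' lab, are my guesses):
code:
-- OUTER APPLY keeps medications with no later lab (NULLs on the right);
-- switch to CROSS APPLY to drop them, matching the "ideal final table".
SELECT m.patient, m.medication, m.administration_time,
       l.lab_order      AS next_relevant_lab_order,
       l.lab_order_time AS next_relevant_lab_order_time,
       l.lab_result     AS next_relevant_lab_result
FROM Medication_Table m
OUTER APPLY (
    SELECT TOP 1 r.lab_order, r.lab_order_time, r.lab_result
    FROM LabResult_Table r
    WHERE r.patient = m.patient
      AND m.medication LIKE r.lab_order + '%'   -- 'potassium supplement' LIKE 'potassium%'
      AND r.lab_order_time > m.administration_time
    ORDER BY r.lab_order_time                   -- closest subsequent result first
) l;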

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Ruggan posted:

Windowed functions are intended for doing some sort of partitioned or ordered calculation on your dataset. Lag and Lead are specifically for finding the previous or next value of a row without needing to do some crazy joins and filtering. Usually you would use windowed functions on a single dataset (like an individual table, or maybe a product of joins in the right situation). Let's say you were trying to find the value of the prior lab result for the same lab: using Lag lets you skip hairy aggregations and self-joins.

If I were approaching this problem, I would probably start by trying the easiest logical solution: an OUTER APPLY that selects the top 1 lab where lab_date > admin_date, ordered by lab_date. Applies can have worse performance than joins because they're basically evaluating each row individually, so that might not be your ideal solution.

If that performed poorly, then I might try left joining labs to admins on lab_date > admin_date, aggregating min lab_date and grouping on admin_id (effectively giving you the first lab after each admin), and using that to join out to the right data for every admin and its next lab. That's the traditional join-based solution. You might be able to further optimize by capping your join on something like lab_date < dateadd(hour, @dosage_effect_duration, admin_date); that could reduce the number of rows joined (reads).

I learned something new today! I've never used the outer apply method before so I tried that out and it worked like a charm. To your point though, the performance is not great due to having to read through a lot of rows. I'll try my hand at the left join technique you spoke to. Thanks for the help!

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Hmm. Yeah, I think I'm going to have to figure out how to use the second method you recommended. The outer apply query is taking about 7 minutes for 1 month.
