Count Thrashula
Jun 1, 2003

Death is nothing compared to vindication.
Buglord

Nth Doctor posted:

This looks like a job for Change Data Capture

Holy poo poo. I guess this is new to 2016? We're migrating to that in about a month, guess I can put off work until then.

Internet Explorer
Jun 1, 2005





COOL CORN posted:

Holy poo poo. I guess this is new to 2016? We're migrating to that in about a month, guess I can put off work until then.

If you click "Other Versions" at the top of that page you can see the feature goes back to SQL 2012. Really good to be able to see the differences in implementation from year to year.

Boogeyman
Sep 29, 2004

Boo, motherfucker.

Internet Explorer posted:

If you click "Other Versions" at the top of that page you can see the feature goes back to SQL 2012. Really good to be able to see the differences in implementation from year to year.

It's available all the way back to SQL 2008, which I know because I'm currently dealing with some dummy's terrible implementation of it and it's making me sad.

hedgecore
May 2, 2004
In a MySQL table, how do I go about iterating over all rows, where if a column value is greater than 1, duplicating the row that many times?

As in, I have an `ORDERS` table, and one of the columns is "Quantity". If "Quantity" = 2, then I actually want to make a copy of that row so there are two rows with that data, if "Quantity" = 3 then I want there to be three of them, and so on.

Count Thrashula
Jun 1, 2003

Death is nothing compared to vindication.
Buglord

Boogeyman posted:

It's available all the way back to SQL 2008, which I know because I'm currently dealing with some dummy's terrible implementation of it and it's making me sad.

What's a terrible implementation of it :ohdear:

Like... I just want to enable it on some tables so that I have those cdc system tables to query on
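
Like, the whole setup I have in mind is just the two system procs (minimal sketch, table name made up):
code:
-- once per database
EXEC sys.sp_cdc_enable_db;

-- once per table you want tracked
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'MyTable',   -- made-up name
    @role_name     = NULL;         -- NULL = no gating role

-- changes then land in cdc.dbo_MyTable_CT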

Boogeyman
Sep 29, 2004

Boo, motherfucker.

COOL CORN posted:

What's a terrible implementation of it :ohdear:

Like... I just want to enable it on some tables so that I have those cdc system tables to query on

I will qualify all of this by saying that I am by no means an expert about anything to do with CDC, considering I just found out about it recently (and accidentally). I was investigating why some databases I inherited were so bloated, but I couldn't figure out where the data was hiding. I usually use sp_blitzindex to check table sizes, and that doesn't include system tables, which means that the CDC tables weren't showing up. I finally figured out that CDC was enabled, found the tables, and started looking into how it's all set up. Why I'm upset about it:

1. I don't like the fact that it was trying to hide the data from me, but whatever, that's a limitation of the tool I was using.

2. I don't like the fact that most of the cleanup jobs that were created are either not running as often as they should, or they're retaining way too much data. Again, due to the dummy that set it up.

3. Apparently this isn't the default config, but there are other jobs that exist that are pulling rows out of the CDC tables every night and stuffing them into a separate database on the same instance. I guess I kind of get why they would do that, but it just makes everything a lot more complicated. And isn't all that great for performance considering they're slaughtering this instance.

4. I don't like the amount of storage that it's consuming. Probably due to our workload (which should never have had CDC enabled in the first place). We had an 850GB table (ridiculous, I know), and the CDC table backing it was another 750GB. Tracking "changes" that weren't really changes, just inserts. Why would you enable CDC on a table that never actually changes? That's anyone's guess. We had another table that was 2.8GB in size, the CDC table backing that one was 299GB because no one ever considered the frequency of the updates on it.

I'm sure it's probably a great fit for certain applications, so long as you consider the performance implications, and carefully consider each table's access patterns (insert/update/delete) because that will determine how much extra storage you'll need to support it. In my case, it was some absolute dipshit that implemented it, half rear end customized the jobs that SQL came up with, and left no documentation for me to work off of, so CDC is pretty much my "want to murder someone" trigger at this point.
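
For anyone else who inherits one of these, the retention knob I ended up staring at is the cleanup job (the value below is just the documented default in minutes, not a recommendation):
code:
EXEC sys.sp_cdc_change_job
    @job_type  = N'cleanup',
    @retention = 4320;  -- minutes of change data to keep (3 days)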

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

hedgecore posted:

In a MySQL table, how do I go about iterating over all rows, where if a column value is greater than 1, duplicating the row that many times?

As in, I have an `ORDERS` table, and one of the columns is "Quantity". If "Quantity" = 2, then I actually want to make a copy of that row so there are two rows with that data, if "Quantity" = 3 then I want there to be three of them, and so on.

That's a bit of a weird thing to want to do with a relational database. The most painless way of doing it that I can think of involves creating a temporary table and joining to it. Is this a one-time thing, or something you'll have to do regularly? Do you know what the maximum "Quantity" value is that will be encountered? Does it definitely have to be done in SQL rather than in your favourite scripting language?

hedgecore
May 2, 2004

Hammerite posted:

That's a bit of a weird thing to want to do with a relational database. The most painless way of doing it that I can think of involves creating a temporary table and joining to it. Is this a one-time thing, or something you'll have to do regularly? Do you know what the maximum "Quantity" value is that will be encountered? Does it definitely have to be done in SQL rather than in your favourite scripting language?

Yeah, it's super weird. The application using the data set is just expecting it in that form, but the new data source doesn't provide it that way.

Even though it's a one-off migration, I agree it's probably just easier to write a script to do it once if there's no obvious logical way to do it with SQL.
Thanks!

BabyFur Denny
Mar 18, 2003

hedgecore posted:

Yeah, it's super weird. The application using the data set is just expecting it in that form, but the new data source doesn't provide it that way.

Even though it's a one-off migration, I agree it's probably just easier to write a script to do it once if there's no obvious logical way to do it with SQL.
Thanks!

If you have a proper SQL db you can easily do it with recursive CTEs. But I think you already mentioned you're using mySQL?
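
Something along these lines, anyway (rough sketch in Postgres syntax, column names made up):
code:
WITH RECURSIVE expanded AS (
    SELECT order_id, quantity, 1 AS copy_no
    FROM orders
    WHERE quantity >= 1
  UNION ALL
    SELECT order_id, quantity, copy_no + 1
    FROM expanded
    WHERE copy_no < quantity
)
SELECT order_id, quantity
FROM expanded
ORDER BY order_id;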

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

hedgecore posted:

Yeah, it's super weird. The application using the data set is just expecting it in that form, but the new data source doesn't provide it that way.

Even though it's a one-off migration, I agree it's probably just easier to write a script to do it once if there's no obvious logical way to do it with SQL.
Thanks!

The most painless way I thought of to do it would be to create a "triangular table" of quantities to row numbers, and construct a join to that table from your original table.

i.e.

CREATE TABLE Triangular (Quantity INT NOT NULL, RowNumber INT NOT NULL, PRIMARY KEY (Quantity, RowNumber));

INSERT INTO Triangular (Quantity, RowNumber) VALUES
(1, 1),
(2, 1), (2, 2),
(3, 1), (3, 2), (3, 3),
(4, 1), (4, 2), (4, 3), (4, 4),
... -- continue for as long as necessary
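
and then the join itself would be something like this (assuming the column really is called Quantity; each order with Quantity = n matches n rows in Triangular, so it comes out n times):

SELECT o.*
FROM ORDERS o
INNER JOIN Triangular t ON t.Quantity = o.Quantity;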

Roundboy
Oct 21, 2008

hedgecore posted:

In a MySQL table, how do I go about iterating over all rows, where if a column value is greater than 1, duplicating the row that many times?

As in, I have an `ORDERS` table, and one of the columns is "Quantity". If "Quantity" = 2, then I actually want to make a copy of that row so there are two rows with that data, if "Quantity" = 3 then I want there to be three of them, and so on.

It's a script either way.

Query the table by whatever criteria you need, pull the quantity value as an integer.

Loop x times as an insert statement to a new table using the data. If there is an order number or something unique to that row, you can

Loop x times
Insert into new table
Select * from old table where unique id=the value from the current line

Make sure your new table does not have a primary key as you are going to Intentionally dupe data.

You can run this periodically to only add new records, or better yet have a trigger that does the same thing when a row is inserted into the old table.
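
Rough sketch of what that loop could look like as a MySQL stored procedure (table/column names are made up, adjust to your schema):
code:
DELIMITER //
CREATE PROCEDURE expand_orders()
BEGIN
    DECLARE done  INT DEFAULT 0;
    DECLARE v_id  INT;
    DECLARE v_qty INT;
    DECLARE v_i   INT;
    DECLARE cur CURSOR FOR SELECT order_id, Quantity FROM orders;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cur;
    read_loop: LOOP
        FETCH cur INTO v_id, v_qty;
        IF done THEN LEAVE read_loop; END IF;
        SET v_i = 0;
        WHILE v_i < v_qty DO
            -- copy the source row into the new (keyless) table once per unit of quantity
            INSERT INTO orders_expanded SELECT * FROM orders WHERE order_id = v_id;
            SET v_i = v_i + 1;
        END WHILE;
    END LOOP;
    CLOSE cur;
END //
DELIMITER ;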

Jethro
Jun 1, 2000

I was raised on the dairy, Bitch!

hedgecore posted:

In a MySQL table, how do I go about iterating over all rows, where if a column value is greater than 1, duplicating the row that many times?

As in, I have an `ORDERS` table, and one of the columns is "Quantity". If "Quantity" = 2, then I actually want to make a copy of that row so there are two rows with that data, if "Quantity" = 3 then I want there to be three of them, and so on.

Create a table with one column that just has all the integers from 1 to the largest quantity you'd ever expect (plus a few extra just in case) and then join to that table with the condition quantity >= row_count
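
e.g. (sketch; extend the numbers table to whatever maximum you need):
code:
CREATE TABLE numbers (n INT NOT NULL PRIMARY KEY);
INSERT INTO numbers (n) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);

SELECT o.*
FROM ORDERS o
JOIN numbers num ON o.Quantity >= num.n;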

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum
Has anyone here ever used pgloader before? I'm copying a ~1.5GB MySQL database into postgres, and the last time it logged any activity was a CREATE INDEX about 20 hours ago, but it's still eating a CPU core so I don't know if it's actually doing anything or if it's stuck. I also don't know anything about databases, so I'm not sure how to see if data's being imported. Or, maybe that's a good indication that I should just stick with MySQL?

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

BabyFur Denny posted:

If you have a proper SQL db you can easily do it with recursive CTEs. But I think you already mentioned you're using mySQL?
That was my first thought as well, or maybe some window function, but can't this still be hacked with an equivalent of generate_series? Use a MySQL variable trick to generate the sequence of natural numbers, Cartesian join it with the counts et al., and restrict to rows where the count is greater than or equal to the sequence index.

... and then wonder how long That will take to run. Slower than PG fdw?
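
Something like this ugly thing (sketch, untested; needs a row source with at least as many rows as your biggest Quantity):
code:
SELECT o.*
FROM orders o
JOIN (
    SELECT @n := @n + 1 AS n
    FROM information_schema.columns, (SELECT @n := 0) init
    LIMIT 1000
) seq ON seq.n <= o.Quantity;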

stoops
Jun 11, 2001
I have a field in my table that requires text to be stored in it.

Currently, the biggest record I have for that field has a length of 100626.

My question is what should i store it as?

I have it as LongText right now, but is there another type, something smaller, I can get away with?

Kuule hain nussivan
Nov 27, 2008

stoops posted:

I have a field in my table that requires text to be stored in it.

Currently, the biggest record I have for that field has a length of 100626.

My question is what should i store it as?

I have it as LongText right now, but is there another type, something smaller, I can get away with?
MediumText is good for 16 million and some characters, so you could use that. Not sure if there'll be a size difference though.

Edit: You apparently save a byte of overhead :toot:
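
i.e. something like (assuming MySQL and a made-up table name):
code:
ALTER TABLE my_table MODIFY my_text_column MEDIUMTEXT;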

Kuule hain nussivan fucked around with this message at 19:42 on Oct 13, 2016

Dex
May 26, 2006

Quintuple x!!!

Would not escrow again.

VERY MISLEADING!

anthonypants posted:

Has anyone here ever used pgloader before? I'm copying a ~1.5GB MySQL database into postgres, and the last time it logged any activity was a CREATE INDEX about 20 hours ago, but it's still eating a CPU core so I don't know if it's actually doing anything or if it's stuck. I also don't know anything about databases, so I'm not sure how to see if data's being imported. Or, maybe that's a good indication that I should just stick with MySQL?

20 hours? that's definitely hung. i used it for a single thing a while back and ran into similar issues(dropped to a single core, no sign of any disk i/o). ended up going table by table, like this guy https://github.com/dimitri/pgloader/issues/337#issuecomment-181258542 , although way further down in that thread there's some work_mem chatter that might get you further. there's also this https://github.com/dimitri/pgloader/issues/420#issuecomment-235697613
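
if you want to check whether postgres itself is actually doing anything while it runs, peeking at pg_stat_activity from psql is the cheap sanity check (rough sketch):
code:
SELECT pid, state, query_start, left(query, 80) AS current_query
FROM pg_stat_activity;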

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

Dex posted:

20 hours? that's definitely hung. i used it for a single thing a while back and ran into similar issues(dropped to a single core, no sign of any disk i/o). ended up going table by table, like this guy https://github.com/dimitri/pgloader/issues/337#issuecomment-181258542 , although way further down in that thread there's some work_mem chatter that might get you further. there's also this https://github.com/dimitri/pgloader/issues/420#issuecomment-235697613
Thanks for this, I ran a few tables at a time to figure out the problematic ones, then bumped up the mem usage to 512MB/128MB for them and pgloader is now saying it's completing :toot:

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!
As much as I'm a hardcore fan of PG, just how much effort would it take for the defaults to consume 90% of all available memory on the system instead of the 128MB default from 1998? Honestly this bites almost every MySQL user and half of those starting with PostgreSQL.

https://help.bluebox.net/hc/en-us/articles/205855907-PostgreSQL-Configuration

https://help.bluebox.net/hc/en-us/articles/205844677-PostgreSQL-Tuning
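
For reference, the handful of postgresql.conf knobs that matter most, with purely illustrative values (size them to your own box, don't copy these):
code:
# postgresql.conf - illustrative values only
shared_buffers = 4GB          # the ancient default is what bites people
work_mem = 64MB               # per sort/hash node, per query
maintenance_work_mem = 1GB    # index builds, vacuum
effective_cache_size = 12GB   # planner hint, not an allocation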

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum
So I've got some MySQL columns that are of types blob, longblob, and varbinary. pgloader converted all those to bytea. This is for a ticketing database, so I made it create a brand new database, and it wants the blob columns to be of type text and the varbinary columns to be of type varchar. Doing an alter column set data type text/varchar keeps the hex value ('\x[whatever]'), but how do I convert a column from hex to ascii? I'm looking for something like convert_from(decode(`select column_name from table_name`, 'hex'), 'utf-8') and either inserting that column into the table so I can rename it later, or just replacing the contents of the column but I have no idea how to do this and Google isn't helping.

Pardot
Jul 25, 2001




anthonypants posted:

So I've got some MySQL columns that are of types blob, longblob, and varbinary. pgloader converted all those to bytea. This is for a ticketing database, so I made it create a brand new database, and it wants the blob columns to be of type text and the varbinary columns to be of type varchar. Doing an alter column set data type text/varchar keeps the hex value ('\x[whatever]'), but how do I convert a column from hex to ascii? I'm looking for something like convert_from(decode(`select column_name from table_name`, 'hex'), 'utf-8') and either inserting that column into the table so I can rename it later, or just replacing the contents of the column but I have no idea how to do this and Google isn't helping.

Does this help? I'm not clear on why you don't want it as 3 bytea columns.

code:
***=# create table lmao (a bytea); insert into lmao values ('\xCAFE'), ('\x0F0F0F'), ('\x219AAA');
CREATE TABLE
Time: 4.192 ms
INSERT 0 3
Time: 0.465 ms
***=# \d lmao
    Table "public.lmao"
 Column | Type  | Modifiers
--------+-------+-----------
 a      | bytea |

***=# select * from lmao
-# ;
    a
----------
 \xcafe
 \x0f0f0f
 \x219aaa
(3 rows)

Time: 0.243 ms

***=# alter table lmao alter column a type text using encode(a, 'hex');
ALTER TABLE
Time: 4.769 ms

***=# \d lmao
    Table "public.lmao"
 Column | Type | Modifiers
--------+------+-----------
 a      | text |

***=# select * from lmao;
   a
--------
 cafe
 0f0f0f
 219aaa
(3 rows)

Time: 0.289 ms
***=# rollback;
ROLLBACK
Time: 2.022 ms
=#

Pardot fucked around with this message at 01:14 on Oct 22, 2016

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

Pardot posted:

Does this help? I'm not clear on why you don't want it as 3 bytea columns.

code:
***=# create table lmao (a bytea); insert into lmao values ('\xCAFE'), ('\x0F0F0F'), ('\x219AAA');
CREATE TABLE
Time: 4.192 ms
INSERT 0 3
Time: 0.465 ms
***=# \d lmao
    Table "public.lmao"
 Column | Type  | Modifiers
--------+-------+-----------
 a      | bytea |

***=# select * from lmao
-# ;
    a
----------
 \xcafe
 \x0f0f0f
 \x219aaa
(3 rows)

Time: 0.243 ms

***=# alter table lmao alter column a type text using encode(a, 'hex');
ALTER TABLE
Time: 4.769 ms

***=# \d lmao
    Table "public.lmao"
 Column | Type | Modifiers
--------+------+-----------
 a      | text |

***=# select * from lmao;
   a
--------
 cafe
 0f0f0f
 219aaa
(3 rows)

Time: 0.289 ms
***=# rollback;
ROLLBACK
Time: 2.022 ms
=#
Hell yes, this looks like exactly what I want. Thank you!

The reason I don't want the bytea columns is that I want to convert our internal ticketing database from mysql to postgres, and using pgloader worked for like 99% of the system, but there are some fields showing up in the webpage as that "\x[hex]" text, and others can't be edited because it tries to do an insert and gets an error that it was expecting a different datatype. I'm also doing an upgrade of the ticketing system, so I should probably be doing that part first, then have it update the database, and then convert to postgres, but I was lazy.

e: The actual query that worked for me was alter table lmao alter column a type text using decode(a, 'escape');, the encode() one just got rid of the \x escape and left the hex.

anthonypants fucked around with this message at 19:12 on Oct 22, 2016

Squashy Nipples
Aug 18, 2007

Is it OK to Outer Join three tables together?

This DID NOT work:
code:
SELECT [FU].[EMPLOYEE_ID], [FU].[FIRST_NAME], [FU].[LAST_NAME], [FU].[ASSIGNMENT_STATUS_TYPE],
 [FU].[BANK_SPHERE_ID], [FU].[ASSIGNMENT_TYPE], [FU].[USER_STATUS], RIGHT([FU].[COST_CENTER],4) AS [COST_CENTER],
 [FU].[DEPARTMENT_CODE], [FU].[PARTENON_CODE], [FU].[NETWORKID], [AD].[sAMAccountName], [AD].[corporateID],
 RIGHT([LD].[comcodcentrocoste],4) AS [comcodcentrocoste], RIGHT([LD].[comcodpuesto],4) AS [comcodpuesto],
 [LD].[comcodpuestoserv],[LD].[comcodcentro],[LD].[comcodcentroserv]
 FROM [TempCopy41262WU.csv] AS [FU]
 LEFT OUTER JOIN [TempCopy41263JZ.csv] AS [AD] ON ([FU].[EMPLOYEE_ID] = [AD].[employeeID])
 LEFT OUTER JOIN [TempCopy41264WB.csv] AS [LD] ON ([FU].[BANK_SPHERE_ID] = [LD].[uid])
 WHERE [FU].[ASSIGNMENT_STATUS_TYPE] = 'ACTIVE' ORDER BY [FU].[EMPLOYEE_ID];

Syntax error (missing operator) in query expression '([FU].[EMPLOYEE_ID] = [AD].[employeeID]) 
LEFT OUTER JOIN [TempCopy41264WB.csv] AS [LD] ON ([FU].[BANK_SPHERE_ID] = [LD].[uid])'.
And this did:
code:
SELECT [FUAD].[EMPLOYEE_ID], [FUAD].[FIRST_NAME], [FUAD].[LAST_NAME], [FUAD].[BANK_SPHERE_ID],
 [FUAD].[ASSIGNMENT_TYPE], [FUAD].[USER_STATUS], RIGHT([FUAD].[COST_CENTER],4) AS [COST_CENTER],
 [FUAD].[DEPARTMENT_CODE], [FUAD].[PARTENON_CODE], [FUAD].[NETWORKID], [FUAD].[sAMAccountName],
 [FUAD].[corporateID], RIGHT([LD].[comcodcentrocoste],4) AS [comcodcentrocoste], RIGHT([LD].[comcodpuesto],4) AS [comcodpuesto],
 [LD].[comcodpuestoserv],[LD].[comcodcentro],[LD].[comcodcentroserv] FROM ( 
SELECT *
 FROM [TempCopy41697YJ.csv] AS [FU] LEFT OUTER JOIN [TempCopy41698NT.csv] AS [AD]  ON [FU].[EMPLOYEE_ID] = [AD].[employeeID]
 WHERE [FU].[ASSIGNMENT_STATUS_TYPE] = 'ACTIVE'
 ORDER BY [FU].[EMPLOYEE_ID] 
 ) AS FUAD LEFT OUTER JOIN [TempCopy41699BP.csv] AS [LD] ON ([FUAD].[BANK_SPHERE_ID] = [LD].[uid])
 ORDER BY [FU].[EMPLOYEE_ID]
Can someone explain the difference?

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Squashy Nipples posted:

Is it OK to Outer Join three tables together?
...

There is no reason why you can't join (inner, outer, whatever) as many tables as you want (no reason that has anything to do with SQL, anyway).

Given the error message from your first query I want to say that the SQL parser barfed at the ON condition having brackets around it, but you did that with one of the two ON clauses in your second query too, so I'm not sure what's going on.

Squashy Nipples
Aug 18, 2007

Yeah, I'm using JET SQL, so the brackets are optional (as long as the headers contain no whitespace).

I went a little nuts with the bracketing because the first error was in syntax. (I was missing a comma.)
I also rarely use table aliases unless I'm doing a complicated Join.

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Are you using Access? If so, it wants you to put parentheses around each join, like this: SELECT smell FROM (((Cheek_1 LEFT JOIN Cheek_2 ON Ass_Crack) LEFT JOIN Toilet_Paper ON poop) LEFT JOIN Toilet ON Water) (because Access is terrible)

edit: missed your last post while thinking poop analogies. Never used JET but it might be worth a try still.

Squashy Nipples
Aug 18, 2007

No, not Access, using ADO and JET to synthesize data from three different CSV files with VBS/VBA. (My code automatically generates a schema.ini file that allows me to join the individual files)

EDIT:

NihilCredo posted:

edit: missed your last post while thinking poop analogies. Never used JET but it might be worth a try still.

Access SQL and JET SQL are very similar (based on the same underlying code). There are just enough small differences for it to be maddening. :haw:

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

NihilCredo posted:

Are you using Access? If so, It wants you to put parentheses around each join, like this: SELECT smell FROM (((Cheek_1 LEFT JOIN Cheek_2 ON Ass_Crack) LEFT JOIN Toilet_Paper ON poop) LEFT JOIN Toilet ON Water) (because Access is terrible)

edit: missed your last post while thinking poop analogies. Never used JET but it might be worth a try still.

Documentation suggests that it can't handle the associative law automatically, so you're forced to specify the nesting order as NihilCredo is suggesting. The order of the parts seems weird though... https://msdn.microsoft.com/en-us/library/ff197346.aspx

Note that the ON statements come after the nested table specifications. :psyduck:
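
Applied to the earlier failing query, the FROM clause would presumably need to be nested like this (sketch, untested):
code:
FROM ([TempCopy41262WU.csv] AS [FU]
      LEFT OUTER JOIN [TempCopy41263JZ.csv] AS [AD]
        ON [FU].[EMPLOYEE_ID] = [AD].[employeeID])
     LEFT OUTER JOIN [TempCopy41264WB.csv] AS [LD]
        ON [FU].[BANK_SPHERE_ID] = [LD].[uid]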

Dalmuti
Apr 8, 2007
Anyone at PASS this week?

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


Dalmuti posted:

Anyone at PASS this week?

I wish I was. If you can make it to one of his talks, Mladen Prajdić is excellent. He gave me the most terrifying 75 minutes of my professional life back in 2014.

lol internet.
Sep 4, 2007
the internet makes you stupid
Couple of questions: how come, during a SQL query, the system disk write is higher than the read?

Also, any recommended resources\reads for a non-dba who needs to troubleshoot bad performance in SQL? :smith:

kumba
Nov 8, 2003

I posted my food for USPOL Thanksgiving!

enjoy the ride

Lipstick Apathy
To your first question: assuming you're talking about data extraction and not DDL, it largely depends on the query. It's probably making heavy use of temp tables, but it could be lots of things.

To your second (and note that I am assuming you are a total novice here so I'm not trying to insult your intelligence if you already know this I promise) - the biggest piece of advice I give to someone who is just starting to look at and write SQL statements that need to be relatively performant is making sure items in your WHERE clauses are SARGable - that is, that SQL is able to use relevant indexes in order to speed up performance. This means that you do not, if you can help it, wrap a database column in a function of any kind.

A SARGable clause looks like this:
code:
select * from mytable a
where a.myDate = dateadd(dd, -7, cast(current_timestamp as date))
A non-SARGable clause looks like this:
code:
select * from mytable a
where dateadd(dd, 7, a.myDate) = cast(current_timestamp as date)
The first one will be orders of magnitude quicker than the second, because the second query has to evaluate the dateadd function against every single row in the table to determine what to compare against the right half. In the first query, the dateadd function is evaluated exactly once, and then the comparison is made, making things much much faster, in addition to being able to use whatever indexes there may be on the column.

To get into more complicated stuff, you'll likely need to look at specific execution plans and query statistics for your local environment since everything is going to differ from one to another. This is a decent start: https://www.simple-talk.com/sql/performance/simple-query-tuning-with-statistics-io-and-execution-plans/
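
In SSMS the quick way to get the IO and timing numbers for a single statement is this (sketch; the figures show up on the Messages tab):
code:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

select * from mytable a
where a.myDate = dateadd(dd, -7, cast(current_timestamp as date));  -- the query you're tuning

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;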

PhantomOfTheCopier
Aug 13, 2008

Pikabooze!

lol internet. posted:

Couple questions, how come during a SQL query, the system disk write is higher then the read?
Is the write io higher than baseline or just higher than read io? Is some manner of verbose query logging enabled that would suggest increased writes? What is the query execution time (it would be highly backend dependent, but it might be storing more data to support mvcc ops during the query).

NextTime000
Feb 3, 2011

bweeeeeeeeeeeeeeeeeee
eeeeeeeeeeeeeeeeeeeee
eeeeeeeeeeeeeeeeeeeee
<----------------------------
(postgresql) I got a few questions about what's the more "proper" way to go about a few things in this game database project I've been working on:

1.) I got a table of weapons (n), a table of the special abilities (m), and a table that holds the various combinations of special abilities that a weapon could have (n,m). Some weapons don't have any special abilities while others have 3 or 4, and these abilities can appear on different weapons, so that's why I have the many-to-many table there.

If I wanted to make a list of weapons of a specific category that also shows the names of all special abilities each weapon has, what would be the best way to do that? I got 2 main ideas in my head but they both kinda keep fighting it out because I am not sure which would make the most sense.

Method one: a single function while using GROUP BY and the Array_agg() function, code simplified for readability
code:
--function declaration stuff...
RETURN QUERY SELECT w.item_id, 
       w.item_name,
       w.other_columns,
       Array_agg(a.ability_id),
       Array_agg(a.ability_name)
FROM   weapons w 
       left join weapons_union_abilities wua 
              ON ( w.item_id = wua.weapon_id ) 
       left join abilities a 
              ON ( wua.ability_id = a.ability_id ) 
WHERE  w.weapon_type = $1 
GROUP  BY w.item_id; 


Method two: two functions. The first function just retrieves all weapon entries of a given category; the second function retrieves the many-to-many relationship table, joins in the ability and weapon data, and returns only the relationships for weapons of the given type. Then match up the two returned tables on the PHP side.
code:
--Function declaration stuff 
RETURN query 

SELECT w.item_id, 
       w.item_name, 
       w.other_columns 
FROM   weapons w 
WHERE  w.weapon_type = $1 

--other function declaration 
RETURN query 

SELECT a.ability_id, 
       a.ability_name, 
       w.item_id 
FROM   weapons_union_abilities wua 
       LEFT JOIN abilities a 
              ON ( wua.ability_id = a.ability_id ) 
       LEFT JOIN weapons w 
              ON ( wua.weapon_id = w.item_id ) 
WHERE  w.weapon_type = $1; 
I guess I am just feeling uneasy about using the Array_agg() + GROUP BY method and the results being a bit messy-looking; looking at these simplified versions is making me kinda realize I should just stick to the first method.


2.) For the website's navigation menu/bar/whatever, I would want to have it display text to match the language for the rest of the page, but I feel like having a client query the database every page load for these commonly used words and phrases might not be the best way to do this; would I want to maybe cheese it a bit and have the database spit out a file containing these common words once in a while, and have the PHP end of the website just load that file instead for these kinds of things?

I am probably going to wind up just making the site navigation work in one language to start, honestly I am not really sure on the interest for supporting any other languages outside of Japanese and English really; but I would like to kinda leave it open to more just because I want to give it a shot I guess?

Cryolite
Oct 2, 2006
sodium aluminum fluoride
How can I perform a group by on multiple columns to get counts of the number of items in that group, similar to this image?



I know I can do each column separately in its own query by using count(1) with a single group by on the respective column for that query. However is it possible to do it in one query, and to scale to more than 2 columns easily? I'm using postgres.

lol internet.
Sep 4, 2007
the internet makes you stupid

PhantomOfTheCopier posted:

Is the write io higher than baseline or just higher than read io? Is some manner of verbose query logging enabled that would suggest increased writes? What is the query execution time (it would be highly backend dependent, but it might be storing more data to support mvcc ops during the query).

It's just higher than the read io. Unfortunately I don't have a baseline and to be fair it's essentially a select * from query which is returning 1mil records.

I'm just curious why it hits the IO. Read is understandable, but why write? I should mention the read/write is roughly 6GB. I don't think any sort of logging or editing is enabled. Oh and query execution time is roughly 100 seconds.

lol internet. fucked around with this message at 07:51 on Nov 7, 2016

Nth Doctor
Sep 7, 2010

Darkrai used Dream Eater!
It's super effective!


E: I'm illiterate

Pardot
Jul 25, 2001




NextTime000 posted:



Method one: a single function while using GROUP BY and the Array_agg() function, code simplified for readability


I think the array_aggs are great, and having one query is the best approach, so this one. I don't know why you want it in a function though, but that's besides the point.

Since it seems this data makes its way to a js front end, you might want to look at the json agg methods and roll up the attributes as a single array of json objects. That way you're not dealing with parallel arrays later on.
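
i.e. something like this (sketch using your column names; needs Postgres 9.4+ for json_build_object):
code:
SELECT w.item_id,
       w.item_name,
       json_agg(json_build_object('id', a.ability_id, 'name', a.ability_name)) AS abilities
FROM weapons w
LEFT JOIN weapons_union_abilities wua ON w.item_id = wua.weapon_id
LEFT JOIN abilities a ON wua.ability_id = a.ability_id
WHERE w.weapon_type = $1
GROUP BY w.item_id, w.item_name;
-- note: weapons with no abilities come back as [{"id": null, "name": null}]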

Cryolite
Oct 2, 2006
sodium aluminum fluoride

Cryolite posted:

How can I perform a group by on multiple columns to get counts of the number of items in that group, similar to this image?

I know I can do each column separately in its own query by using count(1) with a single group by on the respective column for that query. However is it possible to do it in one query, and to scale to more than 2 columns easily? I'm using postgres.

Nevermind, I got it by doing the two separate group bys and then using an outer join. I had to use coalesce a bunch though.
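
For posterity, the shape of what I ended up with (simplified sketch, names made up):
code:
SELECT COALESCE(a.val, b.val) AS val,
       COALESCE(a.cnt, 0)     AS col_a_count,
       COALESCE(b.cnt, 0)     AS col_b_count
FROM (SELECT col_a AS val, count(*) AS cnt FROM my_table GROUP BY col_a) a
FULL OUTER JOIN
     (SELECT col_b AS val, count(*) AS cnt FROM my_table GROUP BY col_b) b
  ON a.val = b.val;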

Squashy Nipples
Aug 18, 2007

Cryolite posted:

Nevermind, I got it by doing the two separate group bys and then using an outer join. I had to use coalesce a bunch though.

Been there. Sometimes you can use UNION for that, too.
