SELECT * FROM Questions WHERE Type = 'Stupid'

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

Ranzear: Jul 25, 2013

Anyone got an ear to bend about Galera clustering? I want to replicate everything to a single point for backups and replicate content from that single point back to the other shards, but I want shards to have their own exclusive databases for per-shard data. A bit like:

Master/backup:
content
sharda
shardb

Sharda:
content
sharda

Shardb:
content
shardb

I just want to know if I can exclude remote databases selectively I guess. I've been googling pretty hard and don't see why not just yet, as long as the shard dbs have different names.

Edit: I was being too specific while googling about. Galera just uses the vanilla 'replicate-ignore-db' in my.cnf. I'll just symlink a .cnf file in my setup script.

Ranzear fucked around with this message at 00:45 on Jul 1, 2017

# ¿ Jul 1, 2017 00:32

Adbot: ADBOT LOVES YOU

# ¿ May 18, 2024 08:12

Ranzear: Jul 25, 2013

Definitely more info needed as to why you need a four column unique constraint or even cram all that into one table to begin with, but yes your approach will work.

Anyone worked with MariaDB Dynamic Columns and have a oneliner for checking the BLOB is valid before doing a column_add?

# ¿ Aug 18, 2017 18:33

Ranzear: Jul 25, 2013

Is your app recreating the database handler multiple times?

# ¿ Sep 30, 2017 07:24

Ranzear: Jul 25, 2013

PHP here. I just pass my single database handler per request as the first arg to all my class-static functions.

# ¿ Oct 1, 2017 00:14

Ranzear: Jul 25, 2013

Defenestrategy posted:

A question: Why do you suggest having a setup like member_id and member_info tables instead of just one members table?

Frontloaded and generally good assumptions about first normal forms.

If a member could be in an arbitrary number of classes and needs to track data independently between them it can't be done without breaking nicely flattened, aka normalized, structures. Hence a second table that can have the same member_id multiple times for a row per class they're taking, filled with their data relevant to that class. Classes would be defined in their table, but class occurrences are in a table that references a class_id multiple times too. Then attendance of a member to a specific timeslot of a class is in yet another table, referencing both other tables.

The other relevant part of this 'one here, many there' is that the many entries table is foreign keyed to the single entry table. This is also a case where your primary key should just be an integer you forget about, it's the foreign key constraint that really drives the table. That's one of my big db tips: constraints, constraints, constraints.

So your schema might look like:

code:

classes
	pid, (unique) classid
class_times
	pid, (incrementing) timeslotid, (foreign key) classes.classid, date_reference, timeslot_reference, etc
members
	pid, (unique) memberid, fullname
member_registrations
	pid, (incrementing) registrationid, (foreign key) members.memberid, (foreign key) classes.classid
member_attendance
	pid, (foreign key) member.memberid, (foreign key) class_times.timeslotid, attendance_enum    (plus add a unique constraint on memberid and timeslotid)

Yes, some tables might end up having a single column aside from a primary key. That's fine and expected. Don't try to maximize column usage like you're gonna run out of tables. Also, IMO, primary keys are just a 'default' index and storage ordering, so I just leave them autoinc ints and focus on making my own uniqueness and foreign key constraints, and it lets me have a little more flexibility down the line especially with (re)naming fields. Altering a primary key just gives me heebie-jeebies, plus I've done some Galera stuff which needs to over-increment primary keys and I need to handle it at the application layer instead.

Notice this allows attendance to be marked for classes not registered. Maybe someone needs to make up a class and you want that, else add a column with a foreign key requiring registrationid.

Building constraints into the schema like this lets you avoid bricking the db with small coding errors, it'll just tell your app to hecc off. Handling those errors instead of trying to cover all cases is also easier on the app coding side.

Ranzear fucked around with this message at 05:21 on Feb 3, 2023

# ¿ Feb 3, 2023 05:17

Ranzear: Jul 25, 2013

You want COALESCE(count(foo.bar), 0) to get zeroes instead of null, then it's just ~~joins and grouping~~ slightly less of a clusterfuck:

SQL code:

SELECT
  p.person_id,
  (SELECT COALESCE(COUNT(person_id), 0) FROM notes n WHERE p.person_id = n.person_id) AS note_count,
  (SELECT COALESCE(COUNT(person_id), 0) FROM codes c WHERE p.person_id = c.person_id) AS code_count
FROM person p
GROUP BY p.person_id;

code:

person_id	notes	codes
1		2	1
2		0	1
3		0	0

Complete even with zero-zero cases.

COUNT(*) is an antipattern. Put what you want to count in it.

Double edit: gently caress gently caress gently caress. Everything is wrong. Hang on...

Triple edit: Subqueries it is. This works fine until you try to do more than one join is all:

SQL code:

SELECT
  p.person_id,
  COALESCE(COUNT(person_id), 0) AS note_count
FROM person p
LEFT JOIN notes n ON p.person_id = n.person_id
GROUP BY p.person_id;

Ranzear fucked around with this message at 04:17 on Feb 4, 2023

# ¿ Feb 4, 2023 02:20

Ranzear: Jul 25, 2013

Fixed mine with subqueries with where clauses to keep the counts separate. When just counting one other table the join works but the count spills over with two or more.

# ¿ Feb 4, 2023 03:30

Ranzear: Jul 25, 2013

Ranzear posted:

COUNT(*) is an antipattern. Put what you want to count in it.

What the gently caress Postgre? COUNT(*) actually no-ops the compare and is faster? So much for my "it costs nothing to have more clarity here" ideal. Speed difference even shows up in the Fiddle even if the plan is the same.

Edit: I thought it might be the table aliases in the counts making it 10x slower for some reason, it's just being weird all around. I'd still prefer to filter out null person_ids unless I had a constraint.

Ranzear fucked around with this message at 04:24 on Feb 4, 2023

# ¿ Feb 4, 2023 04:02

Ranzear: Jul 25, 2013

TheFluff posted:

what the heck is this take?!

Maybe 80% poo poo take, 20% misremembered MariaDB quirks: Something about always using a not null column in count() which probably results in the exact same no-op as Postgre with *, and then there's a level of explicitness and clarity that comes of using the primary key in there without the performance penalty.

What probably matters more is picking something and sticking to it. Also a safe bet I read that five years ago specifically about Aria Storage Engine which still isn't done.

Ranzear fucked around with this message at 01:48 on Feb 5, 2023

# ¿ Feb 5, 2023 01:37

Ranzear: Jul 25, 2013

It's definitely threads and buffer size due to awful defaults. Index stuff would never be that bad. If you have access or can ask for the config I can better tell you what's up.

# ¿ Apr 20, 2023 01:53

Ranzear: Jul 25, 2013

Query cache is way too big and configured backwards. query_cache_size should be 256m tops, only then with extremely large but static data. I don't think I ever set it part 128m on absolute monster nodes. query_cache_limit is completely reversed too: That's the max size of an individual query to be stored in the cache, so should just be reasonably larger than some result set you'd expect to reuse (not even your largest). 1m is fine unless y'all need a smidge more, but it can be mere kilobytes for real benefit.

Don't quote me on the actual mechanisms going on, but imagine that writing and later if not immediately invalidating 5 gig at a time every statement might sound a little off yeah? Surprised it hasn't complained about the sizes being irreconcilable.

Should A/B test with reasonable values, but I'd just turn it off entirely until you're sure there's no indexing to fix too.

That it hasn't given a startup error of outright refusal gives me a sneaking suspicion this isn't the only config file in play. Verify everything with SHOW VARIABLES, especially the innodb values which otherwise look fine.

Ranzear fucked around with this message at 11:27 on Apr 23, 2023

# ¿ Apr 23, 2023 11:24

Adbot: ADBOT LOVES YOU

# ¿ May 18, 2024 08:12

Ranzear: Jul 25, 2013

Lib and let die posted:

Commas before, you filthy animals

Sincerely appreciating this way of coping with dogmatic intolerance of trailing commas in TYOOL 2024.

# ¿ Nov 22, 2023 00:52

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'