|
EDIT: nevermind, it wasn't a problem with the query
IAmKale fucked around with this message at 15:30 on Aug 21, 2013 |
# ? Aug 20, 2013 21:27 |
|
|
# ? May 19, 2024 15:18 |
|
Karthe posted:
Look into outer joins to include entries in r_ele that might not have entries in k_ele. You'll get NULLs as replacement values, so be careful with how you apply functions to the result.
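A minimal sketch of what that suggestion looks like, assuming the two JMdict tables share an entry_id foreign key (the join column and selected fields here are guesses, not the poster's actual schema):

```sql
-- LEFT OUTER JOIN keeps every r_ele row even when no k_ele row matches;
-- COALESCE handles the NULLs that appear in the unmatched columns.
SELECT r.entry_id,
       r.reb,                                   -- reading element (assumed column name)
       COALESCE(k.keb, '(no kanji form)') AS keb -- kanji element, guarded against NULL
FROM r_ele r
LEFT OUTER JOIN k_ele k
       ON k.entry_id = r.entry_id;
```

Without the COALESCE, any function applied to k.keb would see NULL for entries that have no kanji form, which is the "be careful" part.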
|
# ? Aug 21, 2013 03:59 |
|
Malcolm XML posted:Look into outer joins to include entries in r_ele that might not have entries in k_ele. You'll get NULLs as replacement values, so be careful with how you apply functions to the result.
On that note, is there a good explanation of how SQL indices work? I still have only a tenuous understanding of them - for example, I have no idea why this happens: SQL code:
|
# ? Aug 21, 2013 15:27 |
|
Start with the documentation for your DB: http://www.postgresql.org/docs/7.1/static/indices-multicolumn.html posted:The query optimizer can use a multi-column index for queries that involve the first n consecutive columns in the index (when used with appropriate operators), up to the total number of columns specified in the index definition. For example, an index on (a, b, c) can be used in queries involving all of a, b, and c, or in queries involving both a and b, or in queries involving only a, but not in other combinations. (In a query involving a and c the optimizer might choose to use the index for a only and treat c like an ordinary unindexed column.)
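The quoted rule in concrete terms, using a made-up table t (per that doc version; newer planners are somewhat more flexible):

```sql
CREATE INDEX idx_abc ON t (a, b, c);

-- Can use the index: these constrain a leading prefix of (a, b, c)
SELECT * FROM t WHERE a = 1;
SELECT * FROM t WHERE a = 1 AND b = 2;
SELECT * FROM t WHERE a = 1 AND b = 2 AND c = 3;

-- Cannot use the full index: the leading column a is unconstrained
SELECT * FROM t WHERE b = 2 AND c = 3;

-- Mixed case: the index may be used for a only, with c checked as a filter
SELECT * FROM t WHERE a = 1 AND c = 3;
```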
|
# ? Aug 21, 2013 16:17 |
|
I don't see it mentioned often anywhere, but is anyone here familiar with Sybase? I have a client with a dental management program that runs on a Sybase Adaptive Server Anywhere 7.x backend, and let's just say that the application in question has earned a "Best of" DailyWTF entry for how badly it uses said database. http://thedailywtf.com/Articles/Classic-WTF-Rutherford,-Price,-Atkinson,-Strickland,-and-Associates-Dentistry,-Inc.aspx We actually have the remote sites all connecting back to a pair of terminal servers to run this application "locally" at the main site thanks to this idiotic implementation, and don't even get me started on the hacks that are the backup scripts to make it work right.

Anyway, we're having odd load issues with the server where idle CPU usage on the database process will never go below 25% and performance suffers greatly, whereas under normal use it floats near zero. Disconnecting all clients has no immediate impact, though usually if left for 30-60 minutes it will suddenly fall off to zero. Restarting the DB server resolves the issue until it comes back again some days or weeks later.

The machine is a dedicated 2008R2 box with 16GB of RAM and a Xeon E5640, though ASA7 is apparently 32-bit without LAA so it never uses more than 2GB. The application vendor has been less than helpful, so I'm looking for any resources that might help me at least determine what the database is doing when it gets stuck like this. Sybase's public docs are not proving amazingly useful; they seem to be sort of like MSDN in that the information is probably there somewhere, but finding it is proving to be a challenge. Being apparently eight versions behind current doesn't help either, as much of the documentation I can find is written for much newer code.
|
# ? Aug 21, 2013 18:06 |
|
I have a query I'm running on football data from here: http://armchairanalysis.com/data.php It's loaded into a MySQL database, and the schema for the relevant tables looks like this: code:
code:
code:
edit: only happens to players w/ 2 letter team names
edit2: turns out the import data was CRLF but I thought it was just LF, so the funkiness is the CR, since the team name field is at the end of the line
Dren fucked around with this message at 16:56 on Aug 23, 2013 |
# ? Aug 23, 2013 16:44 |
|
Dren posted:Some fields are all jacked up. What the heck?
You've already found your problem, but to prevent future ones I'd suggest double-checking your encoding/localization. Unless it's changed, MySQL by default doesn't deal with these too well, and it breaks things further down the line if any of the players have unicode/high-bit characters in their name.
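Both gotchas from this exchange - the CRLF line endings and the character set - can be declared explicitly at import time. A sketch, with the file path, table, and column layout invented for illustration:

```sql
-- Declaring the terminators avoids the stray-CR-in-last-field problem;
-- declaring the character set avoids mangled non-ASCII player names.
LOAD DATA INFILE '/tmp/players.csv'
INTO TABLE players
CHARACTER SET utf8
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'   -- the file was CRLF, not bare LF
IGNORE 1 LINES;              -- skip the header row
```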
|
# ? Aug 23, 2013 19:40 |
|
Are there any reasons not to specify ON DELETE CASCADE in PG? I really like the convenience of being able to delete one single row at the top of the relationship chain and not have to bother cleaning the DB bottom-up. I've heard arguments that leaving it off protects you against fat fingering things accidentally in production, but seriously? I'd appreciate an expert opinion on this one.
|
# ? Aug 25, 2013 04:51 |
|
DreadCthulhu posted:Are there any reasons not to specify ON DELETE CASCADE in PG? I really like the convenience of being able to delete one single row at the top of the relationship chain and not have to bother cleaning the DB bottom-up. I've heard arguments that leaving it off protects you against fat fingering things accidentally in production, but seriously? I'd appreciate an expert opinion on this one.
To be more serious, ask yourself an opposing question: under what circumstances would you want CASCADE? You've mentioned wanting everything else referencing the data to disappear, but in what situation does that make sense? Will a cascading delete lead to data invalidation or inconsistency (with the real-world reality being modeled)? Does it save you considerable work or only a minimal effort?

Now think of some of the things that could go wrong:
- Do you truly want it to delete, or is a null or default option more appropriate?
- What should happen if a cascade starts but encounters a referenced column down the line that doesn't cascade?
- How do you handle cascading into a many-to-many relationship (when you delete an account, do you just delete the item from the receipt that credited that account, or do you delete the entire receipt)?
- Do you want it to cascade into trillions of referencing rows, or should that be handled manually during maintenance to prevent database response issues?
- While auto-vacuum will mark pages for reuse, does this change your routine vacuum schedule?

If you're building proper triggers and/or procedures for more complicated structures, then you need to set RESTRICT so you can raise and trap errors. Otherwise you'd be stuck with a two-pass approach in the application (check for references before deleting), whereas a procedure can ensure that appropriately-referencing rows are removed and raise an exception to return to the application.
The distinction is "delete from user where name='moses'" and not really knowing what CASCADE deleted, versus the noise that is likely to be generated when you "strike_from_the_memory_of_man('moses')". PhantomOfTheCopier fucked around with this message at 07:22 on Aug 25, 2013 |
# ? Aug 25, 2013 07:19 |
|
My opinion doesn't really qualify as "expert", but I think the answer is that, like a lot of useful tools, there are good places to use ON DELETE CASCADE and bad places to use it!

A good example of a place where you might use ON DELETE CASCADE selectively: you have an on-line game. Chess or something, maybe. Users can "favourite" games they think are good. So you have a table User, a table Game, and a table FavouritedGame. FavouritedGame has foreign keys to User and Game.

You might want to provide for the possibility that games can be deleted. But you don't want that operation to be blocked by users "favouriting" the game in question - that would be kind of silly. Nor does it really make sense to retain rows in FavouritedGame with the Game column set to null, or anything like that. So you use ON DELETE CASCADE to save you the trivial task of going through and deleting rows from FavouritedGame when you want to delete the game.

Suppose you also want to provide for the possibility of a user deleting their account. Obviously you add ON DELETE CASCADE to the User column in FavouritedGame. But Game also has a foreign key to User. Or maybe it has two, one for Black and one for White. (Let's assume you haven't used a cross-reference table, because this is a simple and contrived example.) This time you don't want to delete all the games a user has played in just because you're deleting that user. The user's opponents might want to look back over the games they've played, and you would annoy them if those games could be deleted like that. This time, you probably use ON DELETE SET NULL instead.
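Rough PostgreSQL DDL for the chess example above, as a sketch (table and column names adapted; "user" is a reserved word in PG, so the table is called users here):

```sql
CREATE TABLE users (
    id   serial PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE game (
    id    serial PRIMARY KEY,
    -- keep the game when a player deletes their account; just blank the seat
    white integer REFERENCES users(id) ON DELETE SET NULL,
    black integer REFERENCES users(id) ON DELETE SET NULL
);

CREATE TABLE favourited_game (
    -- favourites are meaningless without the user or the game, so cascade both ways
    user_id integer REFERENCES users(id) ON DELETE CASCADE,
    game_id integer REFERENCES game(id)  ON DELETE CASCADE,
    PRIMARY KEY (user_id, game_id)
);
```

Deleting a game then silently cleans up favourited_game, while deleting a user cleans up their favourites but only nulls out their seat in game.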
|
# ? Aug 25, 2013 14:11 |
|
OK, makes sense - thanks, you two. Is the downside mainly that once in a blue moon I'll have to write one giant transaction that deletes from every table bottom-up? Or, as you said, not just delete but perhaps set a deletion flag and clean it up later, so that I get much more manual control over these rows?
|
# ? Aug 25, 2013 21:58 |
|
So I fixed a mistake I made but I'm not sure why Select works but not Delete. Below: code:
Whereas, code:
It's my mistake and I normally don't do that, but why does the select work and not the delete?
|
# ? Aug 26, 2013 20:34 |
|
Single quotes are for literal strings, so 'Column'='shitbarn' is effectively false. Try double quotes.
|
# ? Aug 26, 2013 20:39 |
|
TheReverend posted:So I fixed a mistake I made but I'm not sure why Select works but not Delete.
The quotes around the column name make it no longer a column name, but just a string with the literal value "Column". So you are asking to delete all rows where the value "Column" is the same as the value "shitbarn". Those two strings are never the same, obviously, so no rows are deleted. That said, your select statement should also be returning zero rows. I suspect in your select statement you are using "double quotes" instead of 'single quotes' around the column name, because double quotes are used to quote names instead of values.
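The two behaviours side by side, with a hypothetical table t (this is standard-SQL quoting; MySQL uses backticks for identifiers unless ANSI_QUOTES mode is on):

```sql
-- 'Column' is a string literal: this compares two constants,
-- is always false, and deletes nothing.
DELETE FROM t WHERE 'Column' = 'shitbarn';

-- "Column" is an identifier: this compares the actual column
-- named Column against the string 'shitbarn'.
DELETE FROM t WHERE "Column" = 'shitbarn';
```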
|
# ? Aug 26, 2013 20:40 |
|
Thanks! I normally have no quotes around any column name but I hosed up and wanted to know what was going on.
|
# ? Aug 26, 2013 21:15 |
|
with "proper quoting" as (select few 'distinguish' proper from improper) delete from only the "intended tables"; DELETE 0
|
# ? Aug 26, 2013 21:54 |
|
MS SQL 2005/8/12. I have to streamline a process where a user (and eventually our automation software) has to run a query after running a program, to make sure it did what we wanted it to do, because our software vendors are all horrible monsters who have trouble releasing code that isn't completely poo poo. Right now we are using a wrapper for osql, which is meh but limited. I am trying to write something new and make it somewhat secure & not able to drop tables. If I set my script/program up to do code:
Right now I am handling it with "select " + var + " from " + var + " where ..." etc., but I would like to have a manual override option where they can just dump in the entire query, similar to osql. But I also don't want a disgruntled user to drop our main production table(s). TL;DR I am rewriting osql but want to limit it to select queries. ha;lp Hadlock fucked around with this message at 23:53 on Aug 27, 2013 |
# ? Aug 27, 2013 23:51 |
|
select 0; drop table users Give read-only authority to the users. Problem solved.
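A sketch of that read-only setup in SQL Server (the login name and password are invented for illustration):

```sql
-- Create a login and map it to a database user
CREATE LOGIN reporting WITH PASSWORD = 'ChangeMe!123';
CREATE USER reporting FOR LOGIN reporting;

-- SQL Server 2012: add the user to the built-in read-only role
ALTER ROLE db_datareader ADD MEMBER reporting;

-- SQL Server 2005/2008 equivalent:
-- EXEC sp_addrolemember 'db_datareader', 'reporting';
```

With only db_datareader membership, an injected `drop table users` fails with a permissions error instead of dropping the table, no matter what the wrapper lets through.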
|
# ? Aug 28, 2013 00:04 |
|
That's what I was afraid of. Thanks sir. We have a replicated db, I will just read from that then.
|
# ? Aug 28, 2013 00:13 |
|
I need to come up with a query that will let me select kanji by choosing their individual components. For starters, here's some sample output to help illustrate what I need to do:
# | id | kanji | radicals
--+----+-------+---------------------
1 | 1  | 亜    | 一(1) |(2) 口(38)
2 | 2  | 唖    | 一(1) |(2) 口(38)
3 | 3  | 娃    | 土(40) 女(45)
4 | 4  | 阿    | 一(1) 亅(6) 口(38)
5 | 5  | 哀    | 亠(8) 口(38) 衣(177)
Grabbing all kanji that have, say, radical #1 is easy enough - just a simple WHERE radical_fk = #. What I haven't been able to figure out is how to grab all kanji matching multiple radicals. I tried something like this: SQL code:
SQL code:
I need to return only those kanji that match ALL of the radical_fk's that I feed it. How can I make this happen?
|
# ? Aug 30, 2013 18:06 |
|
I usually do the join with the query (read: IN (q1,q2,q3,...) in your case) and then GROUP BY id HAVING COUNT(DISTINCT radicals) = (number of radicals in your query). Probably far from optimized for specific applications, but that's one general approach.
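Spelled out against the kanji schema, the approach looks something like this (the kanji_radicals link table and its column names are guesses based on the radical_fk mentioned earlier):

```sql
-- "Match ALL of these radicals": classic relational division.
-- Keep only link rows for the requested radicals, then keep only
-- kanji that matched every one of them.
SELECT kanji_fk
FROM kanji_radicals
WHERE radical_fk IN (1, 2, 38)
GROUP BY kanji_fk
HAVING COUNT(DISTINCT radical_fk) = 3;  -- 3 = number of radicals requested
```

The DISTINCT guards against duplicate link rows; the application supplies the IN list and the matching count together.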
|
# ? Aug 30, 2013 18:11 |
|
This should be a pretty easy one, but I can't for the life of me wrap my head around it. In a SQL Server 2008 R2 database, I have a table called Recordings, which has a column called CatalogNo and a column called Label. I have another table called Labels, which has a column called URL and a column called Label. Recordings.Label is currently null. Recordings.CatalogNo has catalog number data in it, except when it has a URL in it (starting with 'http://'). That URL matches a row in Labels.URL. Every row in Recordings after the row with a URL in the CatalogNo column needs to have its Recordings.Label field updated with the corresponding Labels.Label data, until Recordings.CatalogNo hits a new URL and starts over. I'm assuming I have to do something with cursors, but I have no idea how to write a statement that does what I need, and I've written a bunch of stuff so far that doesn't update anything at all and just runs forever. Can someone point me in the right direction? Thanks!

e: whoops, I tried again and it crashed Management Studio. I am not good at computers.
e2: I might have it. Now to let it chew through everything and see if it actually works.
e3: Jesus Christ cursors are slow. ~1/5 done after an hour and 45 minutes.
Sir Bobert Fishbone fucked around with this message at 00:20 on Aug 31, 2013 |
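For what it's worth, a cursor-free sketch of this fill-down is possible in SQL Server 2008. It assumes Recordings has some column that defines row order (RecordingID here, which is an invented name) - without one, "the rows after the URL row" isn't actually well-defined, since tables have no inherent order:

```sql
-- For each non-URL row, find the nearest preceding URL row and
-- copy in the matching label. One set-based pass, no cursor.
UPDATE r
SET Label = l.Label
FROM Recordings AS r
CROSS APPLY (
    SELECT TOP 1 u.CatalogNo          -- most recent URL at or before this row
    FROM Recordings AS u
    WHERE u.RecordingID <= r.RecordingID
      AND u.CatalogNo LIKE 'http://%'
    ORDER BY u.RecordingID DESC
) AS last_url
JOIN Labels AS l
    ON l.URL = last_url.CatalogNo
WHERE r.CatalogNo NOT LIKE 'http://%'; -- leave the URL marker rows alone
```

Rows before the first URL row simply get no match from the CROSS APPLY and are left untouched.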
# ? Aug 30, 2013 21:27 |
|
Karthe posted:I need to return only those kanji that match ALL of the radical_fk's that I feed it. How can I make this happen?
Sounds like a great use case for arrays. code:
=# create table kanji(id serial, char text, radicals int[]);
=# insert into kanji(char, radicals) values
     ('亜', '{1,2,38}'), ('唖', '{1,2,38}'), ('娃', '{40,45}'),
     ('阿', '{1,6,38}'), ('哀', '{8,38,177}');
INSERT 0 5
=# select * from kanji;
 id | char |  radicals
----+------+------------
  1 | 亜   | {1,2,38}
  2 | 唖   | {1,2,38}
  3 | 娃   | {40,45}
  4 | 阿   | {1,6,38}
  5 | 哀   | {8,38,177}
(5 rows)
=# select * from kanji where radicals @> '{1,2}';
 id | char | radicals
----+------+----------
  1 | 亜   | {1,2,38}
  2 | 唖   | {1,2,38}
(2 rows)
Time: 1.304 ms
=# select * from kanji where radicals @> '{1}';
 id | char | radicals
----+------+----------
  1 | 亜   | {1,2,38}
  2 | 唖   | {1,2,38}
  4 | 阿   | {1,6,38}
(3 rows)
Time: 0.260 ms
=# select * from kanji where radicals @> '{38}';
 id | char |  radicals
----+------+------------
  1 | 亜   | {1,2,38}
  2 | 唖   | {1,2,38}
  4 | 阿   | {1,6,38}
  5 | 哀   | {8,38,177}
(4 rows)
=# select * from kanji where radicals @> '{38,200}';
 id | char | radicals
----+------+----------
(0 rows)
|
# ? Aug 30, 2013 21:40 |
|
Edit : there was a stupid simple solution to this question.
MightyShadow fucked around with this message at 10:37 on Sep 3, 2013 |
# ? Sep 3, 2013 10:05 |
|
MightyShadow posted:I have another stupid question that probably has a really simple answer but I just can't think of it.
Looks quite complicated. You might need to join to a subquery??? Wait for other opinions before relying on this advice, I'm not very confident about this. code:
|
# ? Sep 3, 2013 10:38 |
|
Don't worry about it, I don't need to use the wholesaleconsigneditems table to calculate the value I need, I just got a total amount of consigneditem for each product and multiplied it by the cost price.
|
# ? Sep 3, 2013 11:07 |
|
Question (for MySQL 5.1): I have a big table with a primary key column (row_id bigint unsigned). Each row also has a type (there are fewer than 20 types all in all). One certain type makes up more than 95% of all entries in this table, and performance is becoming an issue.

Since there are a lot of queries that do or could select on only one type, I thought I would implement partitioning on this table. Every type gets its own partition, and if I select something from the smaller types, I only have a few thousand rows in my partition to go through instead of the 25 million row table. Unfortunately partitioning only works on primary key columns, so I changed the primary key to (row_id, row_type) and partitioning worked. But I want to avoid row_id duplicates across the row_types, so I added a unique key (row_id) so there can only be one. But then partitioning no longer works, because unique keys need to contain all columns used in partitioning. There goes my plan.

The row_id is more or less random (chosen by some users according to their own system) - is there any way to implement useful partitioning on this table?
|
# ? Sep 4, 2013 16:31 |
|
BabyFur Denny posted:Question (for MySQL 5.1)
You could use (row_type, row_id) as the primary key and simply accept that rows will be identified by (row_type, row_id) pairs rather than row_id on its own. Alternatively you could use actual separate tables, but this change may be impractical for your application. But you say that you want to do this because performance is becoming a problem. With proper use of indexes, it seems like this should not be a problem with 25 million rows. What indexes exist on this table?
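A sketch of the composite-key layout being suggested (table and column definitions invented). In MySQL 5.1, every unique key on a partitioned table must include all columns of the partitioning expression, so the composite primary key ends up being the only uniqueness guarantee:

```sql
CREATE TABLE big_table (
    row_id   BIGINT UNSIGNED NOT NULL,
    row_type TINYINT UNSIGNED NOT NULL,
    payload  VARCHAR(255),
    -- (row_type, row_id): satisfies the partitioning rule AND lets
    -- single-type queries use the index prefix on row_type
    PRIMARY KEY (row_type, row_id)
)
PARTITION BY LIST (row_type) (
    PARTITION p_dominant VALUES IN (1),      -- the 95% type
    PARTITION p_rest     VALUES IN (2, 3, 4) -- remaining small types, one or more partitions
);
```

The trade-off is exactly as stated: row_id alone is no longer guaranteed unique across types, so the application has to treat (row_type, row_id) as the identifier.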
|
# ? Sep 4, 2013 17:02 |
|
This is coming from an MSSQL background, so take this with an appropriately-sized pinch of salt, but there are a couple of red flags there for me.

First, the row_id being random - you imply it's inserted in random order, or at least never ascending or descending consistently. Assuming this is clustered, that's going to be a bad thing. Your best clustering index is one that is a) guaranteed to be unique and b) updated consistently up or down (and yeah, narrow and static). You may need to add your own column to this to sort out the ordering - then index the rest appropriately and you'll likely see a performance benefit.

Second, in the scenario you're describing, partitioning isn't really going to give you much benefit. If the data ages and/or decays, a timestamp is often good; if not, you should really consider a scheme that will divide the data a little more evenly than 95% in one partition and nineteen other tiny parts - is there any other organisational unit you can harness, even if it means copying a related column in?

Third, in MSSQL at least you don't absolutely have to have the partition column in your primary key; you can cluster on the partition column and make the key non-clustered. To be fair, though, from what I've heard of your situation so far I would be inclined to create your own auto-increment column composite with the partition column as the clustered primary key - and put the types in a non-clustered index with any other columns desired (type first), with the partition included for alignment if that matters in MySQL. That third bit is from ignorance of MySQL though, just making assumptions according to MSSQL.

What does your data look like? Are there any relevant lookups you could denormalize a little to help partitioning?
Salt n Reba McEntire fucked around with this message at 17:24 on Sep 4, 2013 |
# ? Sep 4, 2013 17:16 |
|
Moogle posted:The row_id being random - you imply it's inserted in random order, or at least never ascending or descending consistently. Assuming this is clustered, that's going to be a bad thing. Your best clustering index is one that is a) guaranteed to be unique and b) updated consistently up or down (and yeah, narrow and static). You may need to add your own column to this to sort out the ordering - then index the rest appropriately and you'll likely see a performance benefit.
Oh, yeah, I did not catch this. That is kind of a weird way to design a table. Try running OPTIMIZE TABLE and see if there is any improvement.
|
# ? Sep 4, 2013 18:02 |
|
How do people deal with needing different versions of a very similar parameterized query? For example, sometimes I might want an insert with or without RETURNING (PG-only). Sometimes I want LIMIT 1, sometimes a certain sorting order etc. Manually writing permutations of all these queries sounds like a bad idea for long-term maintenance as the schema changes. Let's say you're not using a full-blown ORM, what are your options? Do people seriously roll their own query builders?
|
# ? Sep 7, 2013 03:29 |
|
There has to be hundreds of query building libraries for each language by now that aren't full blown ORMs. http://www.jooq.org/ https://code.google.com/p/python-sql/ http://sqlobject.org/SQLBuilder.html
|
# ? Sep 7, 2013 15:53 |
|
edit: I'm a dumbass who didn't read the OP.
Dominoes fucked around with this message at 18:18 on Sep 7, 2013 |
# ? Sep 7, 2013 17:14 |
|
Dominoes posted:MATLAB
You might want to pose that question here instead of the SQL thread. I mean, we could take a crack at it, but we'd probably suggest changing your schema or adding a HAVING clause or some poo poo.
|
# ? Sep 7, 2013 18:16 |
|
Nth Doctor posted:You might want to pose that question here instead of the SQL thread.
|
# ? Sep 7, 2013 18:18 |
|
-
|
# ? Sep 7, 2013 18:18 |
|
Nth Doctor posted:You might want to pose that question here instead of the SQL thread.
I did actually look up a fast fourier transform postgres extension to respond with
|
# ? Sep 7, 2013 21:14 |
|
DreadCthulhu posted:How do people deal with needing different versions of a very similar parameterized query?
Pardot posted:I did actually look up a fast fourier transform postgres extension to respond with
|
# ? Sep 7, 2013 21:48 |
|
PhantomOfTheCopier posted:I didn't like the answers because I didn't like the question. Do you mean a VIEW or a set-returning FUNCTION? If you want something returned, why not always return it and ignore it when you don't need it? Did you mean LIMIT 1 or CURSOR or PREPARE? A database reorg requires appropriate attention be given to all the objects, not just the tables, and while PostgreSQL tries to help you (ERROR: cannot drop table X because other objects depend on it... DETAIL: type Y depends on table X...), it certainly won't catch trouble outside the database until the code passes in the old/faulty queries.
I think my question was mostly about building parameterized SQL queries procedurally in the application layer, sorry for the confusion.
|
# ? Sep 8, 2013 06:03 |
|
|
|
On an unrelated note, I read the following two best practices recently in a "top 10 tricks that will make you awesome at SQL" post somewhere and was wondering if you guys could comment on them:
I'm an unwashed newbie and am unable to appreciate the message. Could someone explain?
|
# ? Sep 8, 2013 06:06 |