|
Roundboy posted:Did I explain this enough? What is the proper workaround? Maybe this is too easy, but: where (tableb.fax != tablea.fax1 and tableb.fax != tablea.fax2) should only return true when there's no match.
|
# ? May 20, 2009 19:26 |
|
Tested.. no go.. what if fax1 or fax2 is null? Here is a 'live' example:
code:
|
# ? May 20, 2009 19:37 |
|
How about a CASE statement?
|
# ? May 20, 2009 19:39 |
|
Roundboy posted:Tested.. no go.. what if fax1 or fax2 is null ? Doh, knew it was too easy to work. where ( (tableb.fax != tablea.fax1 or tablea.fax1 is null) and (tableb.fax != tablea.fax2 or tablea.fax2 is null) ) Not sure what it does performance-wise.
|
# ? May 20, 2009 19:49 |
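A quick way to see why those extra IS NULL checks matter: in SQL, `NULL != 'x'` evaluates to NULL rather than true, so rows with a NULL fax silently drop out of the naive version. A minimal sketch in SQLite (the table and column names just mirror the post; the data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tablea (fax1 TEXT, fax2 TEXT)")
cur.execute("CREATE TABLE tableb (fax TEXT)")
cur.execute("INSERT INTO tablea VALUES ('1115551234', NULL)")
cur.execute("INSERT INTO tableb VALUES ('9995550000')")

# Naive version: '9995550000' != NULL evaluates to NULL, so the row is lost.
naive = cur.execute(
    "SELECT COUNT(*) FROM tableb, tablea "
    "WHERE tableb.fax != tablea.fax1 AND tableb.fax != tablea.fax2"
).fetchone()[0]

# NULL-aware version: treat a NULL fax column as 'cannot match'.
fixed = cur.execute(
    "SELECT COUNT(*) FROM tableb, tablea "
    "WHERE (tableb.fax != tablea.fax1 OR tablea.fax1 IS NULL) "
    "AND (tableb.fax != tablea.fax2 OR tablea.fax2 IS NULL)"
).fetchone()[0]

print(naive, fixed)  # 0 1
```

The naive query finds nothing even though the fax clearly differs from fax1; the NULL-aware version returns the row.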
|
MS SQL Server 2000. I need to remove all non-digits at the beginning of a string up to the first digit. I'd use a regex replace normally, but the standard REPLACE in SQL Server 2000 just doesn't seem to do what I need. Is there another simple function I'm overlooking? edit: Yes, yes I was: PATINDEX. This solved my problem, where n is the field code:
Aredna fucked around with this message at 21:10 on May 20, 2009 |
# ? May 20, 2009 20:27 |
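The usual T-SQL idiom for this (which the elided code above presumably resembles) is `SUBSTRING(n, PATINDEX('%[0-9]%', n), LEN(n))`. SQLite has no PATINDEX, so here's the same idea emulated in Python just to show what it computes; the T-SQL in the comment is a sketch, not the poster's actual code:

```python
import re

# T-SQL sketch (assumed, untested here):
#   SELECT SUBSTRING(n, PATINDEX('%[0-9]%', n), LEN(n)) FROM t
# PATINDEX returns the 1-based position of the first match, 0 if none.

def strip_leading_nondigits(s: str) -> str:
    """Python equivalent of the PATINDEX/SUBSTRING combination."""
    m = re.search(r"[0-9]", s)
    return s[m.start():] if m else ""

print(strip_leading_nondigits("abc-123x45"))  # 123x45
```

Note the edge case: PATINDEX returns 0 when the string contains no digit at all, which the real query may need to guard against.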
|
Could we put a huge note in the OP that's like stop worrying about performance. Unless your rows are measured in millions, at least... Something.
|
# ? May 20, 2009 20:37 |
|
I should also note I have been trying to apply De Morgan's law here, with no good results. EDIT: Oh jesus crap.. everything works out fine if I put a TO_NUMBER around each fax field. Apparently Oracle did not like matching a fax number stored in a VARCHAR2(10) to another one in a VARCHAR2(20), even though they were identical numbers. Roundboy fucked around with this message at 21:30 on May 20, 2009 |
# ? May 20, 2009 21:09 |
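The mismatch is easy to reproduce in miniature: compared as strings, values that differ only in padding or formatting aren't equal, while converting both to numbers normalizes them (which is what wrapping the columns in TO_NUMBER does). A tiny illustration in plain Python; the trailing space is an assumed culprit, since the exact Oracle column contents aren't shown:

```python
# Same fax number, but one copy picked up a trailing space in the wider column.
a = "5551234567 "
b = "5551234567"

as_strings = (a == b)          # string comparison: not equal
as_numbers = (int(a) == int(b))  # numeric comparison: equal (int() trims whitespace)

print(as_strings, as_numbers)  # False True
```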
|
Triple Tech posted:Could we put a huge note in the OP that's like stop worrying about performance. Unless your rows are measured in millions, at least... Something. I had some queries that would take 2 minutes to execute. Often, the best optimization isn't in table structure - that's when you get into millions of rows - but in the query. Cut that motherfucker down to 1.3 seconds.
|
# ? May 20, 2009 22:46 |
|
It's poor understanding of queries combined with poor schema designs... I can't say I'm a pro and I probably have a lot to learn, but I've been working with SQL Server for the past couple months and it's pretty eye opening. I'm even considering looking for a way to specialize in data modeling and what not. I don't want to say DBA because that's more "database janitor", haha...
|
# ? May 20, 2009 23:06 |
|
Sometimes one is just stuck with a lovely schema where refactoring it is painful and hard to get approval for. I've found that performance can often be improved dramatically on statements with sub-par performance just by index tuning and SQL statement tuning. It's fun and it's a bit of an art. Hey, if that's janitor work, bring me a mop.
|
# ? May 21, 2009 01:28 |
|
Golbez posted:I had some queries that would take 2 minutes to execute. Often, the best optimization isn't in table structure - that's when you get into millions of rows - but in the query. Cut that motherfucker down to 1.3 seconds. What DBMS do you work with? I've found that it's pretty hard to fool the DB2 query optimizer... you have to start writing really ponderous, hosed up queries before it starts to affect execution time in a statistically significant way. Whenever I've tested queries overnight, the network throughput is always a much bigger variance. I dunno, maybe this means my janitor is really good at indexing or something?
|
# ? May 21, 2009 02:03 |
|
Dr.Khron posted:What DBMS do you work with? MySQL. I had queries with 20 joins and all sorts of crap that were making things horrible. A search engine rewrite later and the slow query log is so much shorter.
|
# ? May 21, 2009 02:49 |
|
Golbez posted:MySQL. I had queries with 20 joins and all sorts of crap that were making things horrible. A search engine rewrite later and the slow query log is so much shorter. I'm not surprised your optimizer was making bad choices with that many joins. When you start including more than 8 or so tables in a query (in Oracle at least) the optimizer begins to go into traveling salesman problem mode and hurfs some serious durfs. http://en.wikipedia.org/wiki/Travelling_salesman_problem If your rdbms has an equivalent to the Oracle ordered hint you can force it to not stroke out and begin dribbling from the mouth when you need to use that many joins though, if you can figure out the proper table access order.
|
# ? May 21, 2009 04:33 |
|
Or you can just be suspicious that you're doing that many joins.
|
# ? May 21, 2009 04:41 |
|
Triple Tech posted:Or you can just be suspicious that you're doing that many joins. Sometimes you gotta do what you gotta do and can't refactor the table structure into something less retarded. C'est la vie.
|
# ? May 21, 2009 04:45 |
|
Triple Tech posted:Or you can just be suspicious that you're doing that many joins. Basically this. It was finally time to rip the life support off that search engine and begin anew. I also discovered the EXPLAIN command.
|
# ? May 21, 2009 04:54 |
|
Markoff Chaney posted:Sometimes you gotta do what you gotta do and can't refactor the table structure into something less retarded. C'est la vie. Sometimes you solve this in queries, sometimes you solve this in code. I'm always tempted to do all of my work in the query, but more and more I'm pushing work into code - it's a hell of a lot more expensive to upgrade a DB server than it is to buy another webserver. Optimizing is always great, but when you've got a DB that JUST CAN'T EVER BE DOWN, adding another index to 50 million+ rows isn't something you can really do - so code changes it is. Took one big query, split it into 4 smaller ones and did the joining in arrays - went from taking up to 15 minutes to never taking more than 10 seconds, less than half of that being DB time.
|
# ? May 21, 2009 05:44 |
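The split-and-join-in-code approach described above can be sketched like this: run two cheap queries instead of one expensive join, then stitch the results together with a dict keyed on the join column. The schema here is invented purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE orders (user_id INTEGER, total REAL)")
cur.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
cur.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 10.0), (1, 5.0), (2, 7.5)])

# Two simple queries instead of one big join...
users = {uid: name for uid, name in cur.execute("SELECT id, name FROM users")}
totals = {}
for uid, total in cur.execute("SELECT user_id, total FROM orders"):
    totals[uid] = totals.get(uid, 0) + total

# ...joined in application code, where adding capacity is cheap.
report = {users[uid]: amt for uid, amt in totals.items()}
print(report)  # {'alice': 15.0, 'bob': 7.5}
```

The trade-off is exactly the one the post names: more round trips and more app-server work, in exchange for keeping heavy lifting off a database that can never be down.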
|
WARNING: UGLY QUERY AHEAD I've inherited the maintenance duties for one of our company's older sites. It's a book store, and the massive search function has to check the user's query against holiday names among other things; I've been tasked with adding that functionality. The query was trudging ahead just fine without my addition, but somehow when I add the bit about the holiday names it returns every matched row 8 times... here's the full query (it was originally one line!); I've formatted it here. The part I added is at the end. "summer fun" was my test query; when I run just my portion on the server I get 10 results. The entire query including my part returns the same 10 books, but 8 times each. The SQL statement is cobbled together inside ColdFusion to deal with various filters like age and price limits. Any troubleshooting suggestions on why a statement would return the same row multiple times would be much appreciated. code:
|
# ? May 21, 2009 17:21 |
|
I'm not familiar with the syntax, but my guess would be that the OR is overriding the earlier logic for the joins due to the way the logical operators are processed. Add some fields from sub2 and sub3 to the query to see if they are indeed identical rows, or if they're caused by bad joins. If that turns out to be the case, all you should have to do is enclose the last two statements in parentheses so they're evaluated as X AND (Y OR Z) instead of (X AND Y) OR Z.
|
# ? May 21, 2009 19:44 |
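The precedence fix is easy to demo: AND binds tighter than OR, so `X AND Y OR Z` parses as `(X AND Y) OR Z`, which lets the OR branch bypass the join conditions entirely and match once per cross-join pair. A minimal reproduction in SQLite (tables and data invented to mimic the book/holiday setup):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE books (id INTEGER, title TEXT)")
cur.execute("CREATE TABLE holidays (book_id INTEGER, name TEXT)")
cur.execute("INSERT INTO books VALUES (1, 'Beach Reading')")
# One holiday row that really belongs to book 1, two that belong elsewhere.
cur.executemany("INSERT INTO holidays VALUES (?, ?)",
                [(1, 'summer'), (99, 'summer'), (99, 'summer')])

# AND binds tighter than OR, so this is (join AND title) OR holiday:
# the holiday test runs against every cross-join pair -> duplicate books.
bad = cur.execute(
    "SELECT b.title FROM books b, holidays h "
    "WHERE b.id = h.book_id AND b.title LIKE '%fun%' OR h.name = 'summer'"
).fetchall()

# Parenthesized as join AND (title OR holiday), the join always applies.
good = cur.execute(
    "SELECT b.title FROM books b, holidays h "
    "WHERE b.id = h.book_id AND (b.title LIKE '%fun%' OR h.name = 'summer')"
).fetchall()

print(len(bad), len(good))  # 3 1
```

Same single book, three copies in the broken version, one in the parenthesized one.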
|
AWESOME! Worked like a charm! Thanks so much!
|
# ? May 21, 2009 20:07 |
|
Hey everyone, I'm writing a paper on SQL injections for a technical writing course at university, and I'm at the end of the paper where I'd like to explain how to protect yourself against attacks. I figured that escaping dangerous characters was a good way to do it, but after reading Wikipedia I saw that "Parameterized Statements" are suggested. I don't really understand it, though, the article says that user input must not be directly embedded in the SQL statement, but instead you should do something like this:code:
Edit: is this because the escaping it yourself method requires you to remember every dangerous character to escape, whereas the parameterization method will know what characters to escape? checkeredshawn fucked around with this message at 02:57 on May 22, 2009 |
# ? May 22, 2009 02:23 |
|
Yes, to some extent. But depending on the implementation... If you're doing the escaping yourself, you're sending one wholly constructed string with variables rendered as text. But for some engines, like SQL Server, you can actually send a SQL command with parameters and send in arguments, which ties directly into the parameterization style. So, it's like... Even safer. And it helps with caching and what not. Basically, parameterization is superior if it applies to your queries, and it often will (unless you're trying to do some meta poo poo).
|
# ? May 22, 2009 03:43 |
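Concretely, here's the difference with Python's sqlite3 driver (the same idea applies in JDBC, ADO.NET, etc.): with parameterization, the SQL text and the values travel separately, so a malicious value is never parsed as SQL. The table and data are made up for the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (name TEXT, secret TEXT)")
cur.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

evil = "nobody' OR '1'='1"

# String concatenation: the attacker's quote breaks out of the literal,
# and OR '1'='1' matches every row in the table.
injected = cur.execute(
    "SELECT secret FROM users WHERE name = '" + evil + "'"
).fetchall()

# Parameterized: the driver binds the value; it is only ever data.
safe = cur.execute(
    "SELECT secret FROM users WHERE name = ?", (evil,)
).fetchall()

print(len(injected), len(safe))  # 1 0
```

The concatenated version leaks alice's secret; the parameterized one correctly finds no user literally named `nobody' OR '1'='1`.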
|
So the idea is you capture the input as parameters and run checks as to what they are before putting them into the query string?
|
# ? May 22, 2009 04:06 |
|
Ideally, there should already be some managed workflow where you're not really checking the parameters, just submitting them to a library that already does this checking and/or uses them as parameters. Any impulse to stitch query strings together should be redirected to finding out what the proper, safe, injection-free way of doing SQL is for your environment. And if there is no answer on that (which is highly unlikely), the best thing to do would be to design your own library which does exactly this. It should encapsulate the concept of validating parameters and then seamlessly passing them to the DB.
|
# ? May 22, 2009 05:01 |
|
MySQL:code:
code:
code:
Help me understand how to deal with null.
|
# ? May 23, 2009 17:35 |
|
Upgrade the value to not-a-null... unless there's a case where you want zero and null reported differently. "where owes does not equal (paid as if null then 0 else paid)" This won't work if owes is equal to zero and paid is null, because the effective value for paid in the equation will also be zero.
|
# ? May 23, 2009 17:52 |
|
select id, owes, paid from test where owes != IF(paid is null, 0, paid) gives you the right results, but yeah, it's a little trickier if owes = 0, as the guy above suggests
|
# ? May 25, 2009 09:18 |
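To make the NULL behavior concrete: `owes != paid` evaluates to NULL (not true) when paid is NULL, so those rows vanish from the result. Wrapping paid in COALESCE (the portable spelling of the IF() above) restores them, with the caveat already noted: owes = 0 with paid NULL then looks fully settled. A SQLite sketch with invented data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE test (id INTEGER, owes REAL, paid REAL)")
cur.executemany("INSERT INTO test VALUES (?, ?, ?)",
                [(1, 100, 100), (2, 100, 50), (3, 100, None), (4, 0, None)])

# Plain comparison: NULL paid makes the predicate NULL, so row 3 disappears.
plain = cur.execute("SELECT id FROM test WHERE owes != paid").fetchall()

# COALESCE treats NULL paid as 0; row 3 shows up, but row 4 (owes 0,
# paid NULL) is now indistinguishable from paid-in-full.
fixed = cur.execute(
    "SELECT id FROM test WHERE owes != COALESCE(paid, 0)").fetchall()

print(plain, fixed)
```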
|
Anyone know what's going on in this thread I made? Error when converting from YaBB to SMF (to then later convert to phpBB) code:
|
# ? May 25, 2009 23:18 |
|
I'm using MaxMind's GeoIP Org and GeoLite Country databases for ip lookups and I'm trying to optimize my SQL statements to improve performance, as we do thousands of lookups every day. Currently I'm using code:
I'm wondering if there is a way to set this up so I could feed it an array as a parameter and make it look them all up, instead of calling the select statement thousands of times at once. Do I need to create a stored procedure for this or something?
|
# ? May 26, 2009 13:01 |
|
Slightly tangential, but I found it quicker to convert the IP to an integer in ASP than doing it with MySQL's INET_ATON function. Are both o.start and o.end indexed? Also, why does GeoIP have 3 tables?
|
# ? May 26, 2009 20:53 |
|
The following SQL gets me a list of Corporate Actions (CAs) for an investment (Dividends, Capital Gains, and Accrual Payouts):code:
It's simple enough to make the above SQL a sub-table and just SUM up the RATEs, grouping by the CADATE. BUT, I need to capture the DGH_TYPEs and the logic behind how they were combined! Let's pretend that there are two CA/DGH_TYPEs, 'A' and 'B'. If one or more 'A's is grouped with one or more 'B's, then it's a 'C', as such: - If any number of 'A's are grouped together, the result is an 'A' - If any number of 'B's are grouped together, the result is a 'B' - If 'A's and 'B's are grouped together, the result is a 'C' code:
I was thinking of using powers of two to represent the two base states, but I'm pretty sure that only works with two states, not 1 to infinity.
|
# ? May 26, 2009 21:12 |
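One trick for the A/B/C combination rule that avoids bit arithmetic entirely: if every row in a group shares one type, MIN and MAX of the type column agree and you keep it; if they disagree, the group is a mix and becomes 'C'. A sketch in SQLite (the column names mirror the post; the dates and rates are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE ca (cadate TEXT, dgh_type TEXT, rate REAL)")
cur.executemany("INSERT INTO ca VALUES (?, ?, ?)", [
    ('2009-01-01', 'A', 1.0),
    ('2009-01-01', 'A', 2.0),   # A + A  -> A
    ('2009-02-01', 'A', 1.0),
    ('2009-02-01', 'B', 4.0),   # A + B  -> C
    ('2009-03-01', 'B', 3.0),   # lone B -> B
])

# MIN = MAX means the group is homogeneous; otherwise it's a mix ('C').
rows = cur.execute(
    "SELECT cadate, "
    "       CASE WHEN MIN(dgh_type) = MAX(dgh_type) "
    "            THEN MIN(dgh_type) ELSE 'C' END AS combined_type, "
    "       SUM(rate) "
    "FROM ca GROUP BY cadate ORDER BY cadate"
).fetchall()

print(rows)
```

Unlike powers of two, this generalizes to any number of base types, as long as "any mix" always collapses to the same result.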
|
I don't really understand the structure of the table, but you can definitely make it faster by indexing o.end and c.end because they are being joined on. But it's hard to say definitively without more information. As for running it multiple times... Most often, you can make a single column, temporary table and make a new query that looks up all the stuff you need by joining off that table. Need. More. Data.
|
# ? May 26, 2009 21:13 |
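The temporary-table approach suggested above, sketched in SQLite: load the whole batch of IPs (already converted to integers) into a one-column temp table, then resolve them all with a single range join instead of thousands of separate SELECTs. The range-table layout is a guess at the GeoIP schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# A toy stand-in for a GeoIP range table: [start, end] -> country code.
cur.execute("CREATE TABLE ranges (start INTEGER, end_ INTEGER, cc TEXT)")
cur.executemany("INSERT INTO ranges VALUES (?, ?, ?)",
                [(0, 999, 'AA'), (1000, 1999, 'BB'), (2000, 2999, 'CC')])

# One temp table holding the whole batch of IPs to look up...
cur.execute("CREATE TEMP TABLE batch (ip INTEGER)")
cur.executemany("INSERT INTO batch VALUES (?)", [(5,), (1500,), (2500,)])

# ...and a single join instead of one SELECT per IP.
rows = cur.execute(
    "SELECT b.ip, r.cc FROM batch b "
    "JOIN ranges r ON b.ip BETWEEN r.start AND r.end_"
).fetchall()

print(sorted(rows))  # [(5, 'AA'), (1500, 'BB'), (2500, 'CC')]
```

With an index on the range columns, the database gets to plan one set-oriented lookup instead of paying per-statement overhead thousands of times.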
|
Vince McMahon posted:slightly tangential but I found it quicker to convert the IP to an integer in ASP than doing it with mysql's inet_aton function. are both o.start and o.end indexed?
|
# ? May 26, 2009 21:16 |
|
Vince McMahon posted:slightly tangential but I found it quicker to convert the IP to an integer in ASP than doing it with mysql's inet_aton function. are both o.start and o.end indexed? code:
|
# ? May 27, 2009 12:33 |
|
I'm toying with the idea of indexing Usenet groups. To do this, newsgroup providers give you access to an overview format of messages in the range you specify. It looks like this:code:
code:
I have an overview dataset with more than 50 million articles to play around with. It's actually already in a database, but I want advice on how to store this in a good way. Currently a row in the releases table has a huge field in which the articles are stored. After stripping away the different part specifiers and other junk which isn't necessary, I have a trimmed-down subject. This is used to create a row in the releases table. So, when scanning another round of articles, I need to be able to find a release by searching for the trimmed-down subject. Should I keep all the articles in an articles table? I will need to be able to load a release and add articles that aren't already in the release (it happens that I need to scan a range I've already been over, and I want to avoid duplicates). I currently use MySQL 5.1.30. Any ideas on how to implement something like this? Currently, a table named releases has columns much like a single article has, with an extra varchar which contains the trimmed subject. Its primary key is an integer. To get the articles for a release, a row in the articles table has an entry for the release id, which in turn has all the articles in one of its columns. The articles table has 525050 rows and is currently 45 GB. Releases has the same number of rows and is 120 MB.
|
# ? May 27, 2009 17:47 |
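One way to kill the giant blob column and the duplicate problem at the same time: one row per article in a normalized articles table with a foreign key to releases and a uniqueness constraint, so re-scanning a range becomes a cheap `INSERT IGNORE`. A SQLite sketch of the idea (MySQL syntax is nearly identical; all names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE releases (
    id INTEGER PRIMARY KEY,
    trimmed_subject TEXT UNIQUE)""")  # UNIQUE also indexes the subject lookup
cur.execute("""CREATE TABLE articles (
    release_id INTEGER REFERENCES releases(id),
    message_id TEXT,
    UNIQUE (release_id, message_id))""")

cur.execute("INSERT INTO releases (trimmed_subject) VALUES ('Some Release')")
rid = cur.execute(
    "SELECT id FROM releases WHERE trimmed_subject = 'Some Release'"
).fetchone()[0]

# First scan inserts two articles; rescanning the same range is a no-op
# thanks to INSERT OR IGNORE plus the unique constraint.
arts = [(rid, '<a1@news>'), (rid, '<a2@news>')]
cur.executemany("INSERT OR IGNORE INTO articles VALUES (?, ?)", arts)
cur.executemany("INSERT OR IGNORE INTO articles VALUES (?, ?)", arts)

count = cur.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
print(count)  # 2
```

At 50M+ rows you'd want to think about which extra columns each article row actually needs, but the shape of the dedup logic stays the same.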
|
I'm pretty close to figuring out my problem... but stuck on a data-type conversion. DB2 has no built-in data type for Boolean (true/false), and I don't have the access to make my own data types. Is there a built-in way to produce text-style Boolean output, or is a CASE statement the only way? Here is my current workaround: code:
|
# ? May 27, 2009 17:50 |
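For reference, the CASE expression really is the standard way to fake a boolean in DB2, and it's portable to just about every engine. A sketch, demonstrated in SQLite with an invented table and condition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE accounts (id INTEGER, balance REAL)")
cur.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 50), (2, -10)])

# CASE is the portable stand-in for a boolean expression in the SELECT list.
rows = cur.execute(
    "SELECT id, CASE WHEN balance >= 0 THEN 'true' ELSE 'false' END "
    "FROM accounts ORDER BY id"
).fetchall()

print(rows)  # [(1, 'true'), (2, 'false')]
```

As suggested below, wrapping the CASE in a user-defined function (where permissions allow) keeps it from being repeated in every query.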
|
tunah, if the data is already normalized, what more can you do to store it? Not sure how else you'd go about it. Dr.Khron, yeah, that's probably the way to go. Maybe abstract it into a function or something.
|
# ? May 27, 2009 18:58 |
|
moostaffa posted:
Would it be possible to add a cc/cn column to the org table? Or are there overlapping issues there? If you tried that, along with passing an integer instead of the IP to MySQL, it might be a bit quicker.
|
# ? May 27, 2009 19:14 |
|
Still not totally understanding those geo tables but I will add this, in general: Most operations are I/O bound instead of CPU bound. So converting an IP to some normal form isn't the primary bottleneck of the lookup, unless the tables are trivially small or already stored in RAM. But testing and query plans will answer this definitively.
|
# ? May 27, 2009 19:18 |
|
All fixed! What I realized is that the Accrual Flush is a boolean condition: you can pay out accruals regardless of whether or not there is also a fixed payout on a date. As long as I can insert zeros for the rate of an Accrual Flush, then I can just sum them up with AF = true. (this is because some of the Accrual payouts have specious rates listed... Thanks a lot, Putnam!) Even though doing it this way means I have to change the code, too, it actually makes the code simpler, because it removes a branch and some logic. Here is the SQL: code:
|
# ? May 27, 2009 20:34 |