|
I don't know what you're really trying to do, but it sounds like inner (or outer, if you need nulls) joining. Inner join: code:
code:
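Neither code block survived in this archive; a generic sketch of the two joins being suggested, with hypothetical customer/order tables:

```sql
-- Inner join: only customers that have at least one matching order
SELECT c.name, o.order_date
FROM customers c
INNER JOIN orders o ON o.customer_id = c.id;

-- Left outer join: every customer, with NULL order columns where nothing matches
SELECT c.name, o.order_date
FROM customers c
LEFT OUTER JOIN orders o ON o.customer_id = c.id;
```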
epswing fucked around with this message at 16:32 on May 9, 2012 |
# ? May 9, 2012 16:11 |
|
|
That looks a bit more accurate, I believe that's correct- thanks. e; Yea- I should've been using the AND instead of the OR, achieves the same results. Like I said, I just really don't like databases. My one gripe with programming. Sab669 fucked around with this message at 16:54 on May 9, 2012 |
# ? May 9, 2012 16:23 |
|
I've got a problem I've been trying to help my dad with, but I don't know SQL and figured it would be easier to ask you guys because I think it's probably pretty simple-- So he's working in Crystal Reports and has a database with the fields: ID, Name, Date+Time, and Zone. There are lots of duplicate entries, and entries that repeat the ID and Zone multiple times with a different time for each entry. He needs to be able to: if the Zone and ID are duplicated, grab the most recent time+date and not show the others. If any more information is needed, I'll reply ASAP- Thanks.
|
# ? May 12, 2012 04:20 |
|
Shmoogy posted:I've got a problem I've been trying to help my dad with, but I don't know SQL and figured it would be easier to ask you guys because I think it's probably pretty simple-- Crystal has its own weird SQL syntax but if he can do raw queries this is how it would look in t-sql: SELECT MAX(Date+Time),Name From Thing GROUP BY Zone,ID (I have no idea if Crystal SQL supports Group By or aggregates like MAX) Assumptions: - Date+Time is one column that is consistently formatted as a datetime, and not two columns that are concatenated - If Zone is repeated ID is also repeated (e.g. this won't bring back cases where there's a duplicate Zone, but not a duplicate ID). If he doesn't have access/knowledge for raw queries this Stack Overflow might help: http://stackoverflow.com/questions/203473/how-do-i-group-by-date-in-crystal-reports-8-0 But don't use the group by on the date, use it on the Zone/ID fields.
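One wrinkle with the query as quoted: Name is neither grouped nor aggregated, which most engines will reject (and the ones that don't will pick an arbitrary Name per group). A version that keeps the grouped-max idea and still returns the name, assuming Date+Time is really one datetime column (called EventDateTime here; all names are guesses):

```sql
SELECT t.Zone, t.ID, t.Name, t.EventDateTime
FROM Thing t
JOIN (SELECT Zone, ID, MAX(EventDateTime) AS LatestDateTime
      FROM Thing
      GROUP BY Zone, ID) latest
  ON latest.Zone = t.Zone
 AND latest.ID = t.ID
 AND t.EventDateTime = latest.LatestDateTime;
```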
|
# ? May 12, 2012 04:52 |
|
What's the best way to find records that "mostly" match? I'm a little slow, but I can't seem to accomplish what I want with just using wildcards and LIKE. My goal is to feed a list of names to a script and have it check against the database for customers that "mostly" match (specifying a threshold) and then verify exact matches manually. (We're in the process of migrating to a new system and I'm trying to consolidate multiple accounts for the same customer, but a lot of the time we get "Mike Smith" one year, "Mike Smith Jr" the next year, "Michael Smith" the year after that, etc.) Thanks for any help! e: To clarify, we seem to have quite a bit of "typos" in our database, which is why it's important for me to be able to say "Michael mostly matches Micheal so it counts". fullroundaction fucked around with this message at 19:08 on May 14, 2012 |
# ? May 14, 2012 19:06 |
|
Scaramouche posted:Crystal has its own weird SQL syntax but if he can do raw queries this is how it would look in t-sql: Thanks for this - I'll see if I can help him figure it out through this info.
|
# ? May 14, 2012 19:08 |
|
Shmoogy posted:I've got a problem I've been trying to help my dad with, but I don't know SQL and figured it would be easier to ask you guys because I think it's probably pretty simple-- In Crystal you will want to just use groups on the zone and id fields: put the fields into the "details" section, and then use a running total field that grabs the maximum of the time+date. Trying to do SQL inside of Crystal is usually a bad idea.
|
# ? May 14, 2012 19:59 |
|
fullroundaction posted:What's the best way to find records that "mostly" match? I'm a little slow, but I can't seem to accomplish what I want with just using wildcards and LIKE. Use trigram searching http://www.postgresql.org/docs/current/static/pgtrgm.html edit: and/or some of the things from http://www.postgresql.org/docs/current/static/fuzzystrmatch.html Pardot fucked around with this message at 23:17 on May 14, 2012 |
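A minimal sketch of what the pg_trgm approach looks like in PostgreSQL (the customers table and name column are assumptions, not from the thread):

```sql
-- Requires the extension once per database: CREATE EXTENSION pg_trgm;
SELECT name,
       similarity(name, 'Michael Smith') AS score  -- 0.0 to 1.0
FROM customers
WHERE name % 'Michael Smith'   -- % matches rows above the similarity threshold
ORDER BY score DESC;

-- The threshold is tunable, e.g.: SELECT set_limit(0.4);
```

This is exactly the "Michael mostly matches Micheal" behaviour asked for, since trigram similarity tolerates transposed and missing letters.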
# ? May 14, 2012 23:09 |
|
I am not sure how to describe this technically, so I'm using as simple language as possible: How can I locate rows using two fields, but the fields don't match up 1:1? So, I have field X and then field Y, but I can't just use two IN clauses because X and Y have a many-to-one relationship. If I was just looking for a single record, I would use SELECT * FROM TABLE WHERE X = ' ' AND Y = ' '; but I need to do this 7,000 times. In the past, I've used Excel to concatenate however many selects I need to get the data, but that method kind of sucks and I'd like a better one. I'm using DB2 SQL if that is applicable.
|
# ? May 15, 2012 19:09 |
|
Xguard86 posted:I am not sure how to describe this technically, so I'm using as simple language as possible: I think you need to be more specific about what the two fields X and Y are and how they relate to each other (and what governs the pairs of values you want to retrieve).
|
# ? May 15, 2012 19:18 |
|
Field X is the client ID, field Y is the record ID. So, each client ID can have many records. So if I want
Client 1 record 1
Client 1 record 2
Client 2 record 1
then I need a way to select only those and not Client 2 record 2, which is what client IN () and then record IN () gives me.
|
# ? May 15, 2012 19:27 |
|
Xguard86 posted:Field X is the client ID field y is the record ID. So, each client ID can have many records. So if I want Ah so you would want something like this code:
|
# ? May 15, 2012 19:37 |
|
I have explicit combinations. Here is the first five rows from my list:
383 319368
859 64844
546 1407037
623 400119
859 64378
The left column is client ID, the right is record. Would I still be using a MAX when I have explicit combinations? I'm not following the solution.
|
# ? May 15, 2012 19:50 |
|
Xguard86 posted:I have explicit combinations Here is the first five rows from my list: So you have completely arbitrary pairs of values that you want to retrieve? You have two choices: you can write a query that says
WHERE (x = 383 AND y = 319368)
   OR (x = 859 AND y = 64844)
   OR (x = 546 AND y = 1407037)
   OR (x = 623 AND y = 400119)
   OR (x = 859 AND y = 64378)
or you can create a temporary table, insert those pairs of values and JOIN to the table you're interested in. This is what you should probably do if the list of values can be big.
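Since the poster mentioned DB2 earlier and (as it turns out below) can't create tables, a third option worth sketching is a VALUES-derived table in the FROM clause, which needs no CREATE permission (table and column names are guesses):

```sql
SELECT t.*
FROM the_table t
JOIN (VALUES (383, 319368),
             (859, 64844),
             (546, 1407037),
             (623, 400119),
             (859, 64378)
     ) AS pairs (client_id, record_id)   -- one row per wanted combination
  ON t.client_id = pairs.client_id
 AND t.record_id = pairs.record_id;
```

The VALUES list can be generated from the spreadsheet the same way as the union of selects, but the query stays a single statement.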
|
# ? May 15, 2012 20:02 |
|
Hmm, this is difficult. I don't have the permissions to create a table, and there are over 7,000 records I need to locate. I might be stuck just using Excel to create a batch of single selects and join them together.
|
# ? May 15, 2012 20:12 |
|
What is the source of the pairs of values?
|
# ? May 15, 2012 21:10 |
|
An Excel spreadsheet emailed to me.
|
# ? May 15, 2012 21:17 |
|
Is it possible to combine the two fields and then search for that? So for client 300, record id 1000: WHERE client||record_id = 3001000
|
# ? May 15, 2012 21:22 |
|
Xguard86 posted:is it possible to combine the two fields and then search for that? Yes, but your best bet is probably to write a formula in the spreadsheet to generate a massive union of select statements. I can't remember excel syntax offhand but something like: code:
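The Excel snippet here didn't survive; a plausible version, assuming the client IDs sit in column A and the record IDs in column B (the table name is invented):

```
="SELECT * FROM the_table WHERE client_id = "&A1&" AND record_id = "&B1&" UNION ALL"
```

Fill the formula down for all 7,000 rows, paste the generated column into the query tool, and delete the trailing UNION ALL from the last line.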
|
# ? May 15, 2012 21:33 |
|
I have to build a unique index over two columns (appId, id) and also be able to search for multiple items at once like so: code:
code:
Original version: Two 64bit ids = 128 bits String version: Two 20-char strings separated by an underscore = (20 + 20 + 1) * 8 = 328 bits But for the most part, appId will be 3 or 6 characters long. so the average length of the items will probably be closer to (20 + 6 + 1) * 8 = 216 bits Will this run significantly slower? The other dev keeps telling me "noooo it will go really slow im crying". But I'd rather do it the string way because it'll be way faster to get code done for it and it'll reduce system complexity. To give you a sense of size, the current table size is ~50 million rows (we're porting over from an old system and adding new functionality). Also this is in mongodb. I wrote it like sql because it probably uses close to the same algorithms for searching (and more people are likely to respond). Also if there's a way to limit the charset to /[0-9_-]*/ to save on bits that would be cool to know as well. Peanut and the Gang fucked around with this message at 21:54 on May 15, 2012 |
# ? May 15, 2012 21:43 |
|
Zombywuf posted:Yes, but your best bet is probably to write a formula in the spreadsheet to generate a massive union of select statements. I can't remember excel syntax offhand but something like: this is my standard procedure for these kind of things. I was hoping I could get fancier, but at least I've confirmed there isn't a better method I've just missed.
|
# ? May 15, 2012 21:47 |
|
Xguard86 posted:this is my standard procedure for these kind of things. I was hoping I could get fancier, but at least I've confirmed there isn't a better method I've just missed. You can link the spreadsheet to MSSQL and run T-SQL queries over it by some witchcraft I've never bothered to master. It's much easier to do it that way.
|
# ? May 15, 2012 21:52 |
|
I've run into what I'm assuming is going to be a dead end but am hoping someone can find some possible way to accomplish this. We have a really old scheduling system here and when people request holidays it inputs them as 30 minute chunks for some ridiculous reason. For example if I request May 16th off this is what it would look like in the database: code:
code:
code:
Edit: nevermind, turns out in Cognos I can get minimums and maximums based on a grouping osd fucked around with this message at 19:23 on May 16, 2012 |
# ? May 16, 2012 18:35 |
|
osd posted:I've run into what I'm assuming is going to be a dead end but am hoping someone can find some possible way to accomplish this. We have a really old scheduling system here and when people request holidays it inputs them as 30 minute chunks for some ridiculous reason. For example if I request May 16th off this is what it would look like in the database: Depends what DBMS you're using, but if the dates have a 0 value for time and the times a 0 value for dates, you should be able to add them together and then use regular min/max functions on them to get the earliest and latest datetimes.
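osd's sample code blocks are gone, but the suggestion above would look roughly like this, assuming one row per 30-minute chunk with separate date and time columns (all names invented):

```sql
-- Collapse the 30-minute chunks into one row per person per holiday date
SELECT employee_id,
       holiday_date,
       MIN(start_time) AS first_chunk_starts,
       MAX(end_time)   AS last_chunk_ends
FROM holiday_chunks
GROUP BY employee_id, holiday_date;
```

This is the same min/max-per-group trick the poster ended up doing with a Cognos grouping.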
|
# ? May 16, 2012 19:23 |
|
This is probably a really basic question but I was really doubting myself earlier when planning the schema for a new system. The structure of things is:
There is an event (not in the programming sense) which has a unique ID
Each event can have 1 to n customers who all have unique reference numbers
Each customer can make 0 to n responses to the invitation to the event
Each response can specify 1 to n options
So, the event table's PK is the EventID. The event customers table's PK can be EventID and CustomerID (the customers table is completely separate, this just relates the customer to the event). Would the normal thing to do here be to have the response table's PK being EventID, CustomerID and ResponseID? And then the next increment for the options? Also, the options each have a fixed ID so they are fine to tack on to the PK if this is the route to go down, but the responses do not. If this is the route to follow, can the responses table's PK be EventID, CustomerID and then a count of responses +1 for ResponseID? Please enlighten me. This is on SQL Server 2008. Thanks
|
# ? May 17, 2012 22:05 |
|
I've got a question, too! I have 2 tables (MSSQL 2008):
Table 1: user_idx (int unique identifier), user_name
Table 2: picture_idx (int unique identifier), user_idx (int), upload_date (smalldatetime), picture_name
Table 1 is a listing of unique users, each with a unique user_idx; Table 2 is a large list of uploaded photos, each having a unique picture_idx and tied to the user_idx from Table 1. What I need is a query to pull the last uploaded photo for all users so that I end up with something like: user_name - upload_date - picture_name. I'm doing it now by getting the users from Table 1 then doing separate queries from Table 2 to grab the info, but I know that ain't good; I'd like to be able to do it in a single query. Thank you!
|
# ? May 17, 2012 23:19 |
|
G-Dub posted:This is probably a really basic question but I was really doubting myself earlier when planning the schema for a new system. Neither methodology is right or wrong. There's strong arguments for both. The first example eliminates "useless" extra columns. The second example can make it easier to program for.
|
# ? May 18, 2012 00:19 |
|
To add to this. My personal recommendation is the second example. Your tables would be:
Event: EventId, other cols
EventCustomers: EventCustomerId, EventId, CustomerId
EventCustomersResponse: EventCustomerResponseId, EventCustomerId, whatever cols go here
EventCustomersResponseOptions: EventCustomerResponseOptionId, EventCustomerResponseId, ResponseOptionId
JeansW fucked around with this message at 00:31 on May 18, 2012 |
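A T-SQL sketch of that layout (SQL Server 2008 was stated above; the column types, constraint names, and the Customer table are assumptions):

```sql
CREATE TABLE EventCustomers (
    EventCustomerId INT IDENTITY(1,1) PRIMARY KEY,  -- synthetic key
    EventId         INT NOT NULL REFERENCES Event (EventId),
    CustomerId      INT NOT NULL REFERENCES Customer (CustomerId),
    -- still enforce the natural key, just not as the PK
    CONSTRAINT UQ_EventCustomers UNIQUE (EventId, CustomerId)
);

CREATE TABLE EventCustomersResponse (
    EventCustomerResponseId INT IDENTITY(1,1) PRIMARY KEY,
    EventCustomerId INT NOT NULL
        REFERENCES EventCustomers (EventCustomerId)
);

CREATE TABLE EventCustomersResponseOptions (
    EventCustomerResponseOptionId INT IDENTITY(1,1) PRIMARY KEY,
    EventCustomerResponseId INT NOT NULL
        REFERENCES EventCustomersResponse (EventCustomerResponseId),
    ResponseOptionId INT NOT NULL
);
```

Note each child table carries a single-column foreign key to its parent, which is the "easier to program for" part; the unique constraint keeps the natural (EventId, CustomerId) pair honest.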
# ? May 18, 2012 00:28 |
|
MattO posted:I've got a question, too! Do a join, with a subquery to get the last uploaded pictures for each user. If you need to get info on users who've never uploaded a picture, change the first join to a left join and check for null. code:
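The code block for this answer is missing from the archive; from the description (a join, plus a subquery getting each user's latest upload), it would have looked something like this:

```sql
SELECT u.user_name, p.upload_date, p.picture_name
FROM table1 u
JOIN table2 p
  ON p.user_idx = u.user_idx
JOIN (SELECT user_idx, MAX(upload_date) AS last_upload
      FROM table2
      GROUP BY user_idx) latest
  ON latest.user_idx = p.user_idx
 AND p.upload_date = latest.last_upload;
```

One caveat: if two of a user's photos share the exact same smalldatetime, this returns both; breaking the tie on picture_idx would fix that.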
|
# ? May 18, 2012 00:40 |
|
Hammerite posted:Do a join, with a subquery to get the last uploaded pictures for each user. If you need to get info on users who've never uploaded a picture, change the first join to a left join and check for null. Thanks! Works perfect!
|
# ? May 18, 2012 00:46 |
|
JeansW posted:There's two schools of thought when it comes to table PKs. One school is that "it should mirror the real world" which means your EventCustomers table would have a PK of EventId + CustomerId. The other is "your PK should always be a single unique ID" in which case your table would be EventCustomerId (PK), EventId, CustomerId. And then you'd want a unique constraint on those last 2 columns. Like JeansW said, there's no right or wrong when it comes to these things - the only universally correct answer is "it depends". Having said that I agree with his second post and want to add that "the real world" changes ALL THE loving TIME, so a good data structure needs to be flexible enough to accommodate that. Some people flip out about this sort of thing and say "well you shouldn't waste effort writing something based on potential future requirements", and to an extent they're right (see: gold plating). However when there's no real cost on future proofing something then in my opinion you'd be an idiot not to (and an extra column isn't a cost if you're using an RDBMS - these days the extra memory it will use might as well be 0).
|
# ? May 18, 2012 11:12 |
|
JeansW posted:Neither methodology is right or wrong. There's strong arguments for both. The first is example eliminates "useless" extra columns. The second example can make it easier to program for. Can you elaborate on how it makes it "easier to program for"? I've never understood the motivation for the practice of adding an artificial auto-incrementing primary key for tables which have an obvious composite key as a candidate primary key. I should like to be able to better understand the reasoning, though I suspect I would still be convinced it's a wrong thing to do.
|
# ? May 18, 2012 12:50 |
|
Hammerite posted:Can you elaborate on how it makes it "easier to program for"? I've never understood the motivation for the practice of adding an artificial auto-incrementing primary key for tables which have an obvious composite key as a candidate primary key. I should like to be able to better understand the reasoning, though I suspect I would still be convinced it's a wrong thing to do. code:
code:
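Both code blocks here are lost; judging from the st/sot aliases and col1-col3 referenced in the next post, they contrasted roughly these two joins (table and column names reconstructed, not original):

```sql
-- Join on a synthetic single-column key:
SELECT *
FROM some_table st
JOIN some_other_table sot
  ON sot.some_table_id = st.id;

-- Join on a composite natural key:
SELECT *
FROM some_table st
JOIN some_other_table sot
  ON sot.col1 = st.col1
 AND sot.col2 = st.col2
 AND sot.col3 = st.col3;
```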
|
# ? May 18, 2012 13:32 |
|
Goat Bastard posted:
Each of those is just as "easy" as the other though? The most you can say is that one takes longer to read than the other. But it takes no longer to understand. It would be handy if SQL allowed you to use shorthand like ON st.(col1, col2, col3) = sot.(col1, col2, col3) or ON st.PRIMARY = sot.<foreign key name> but it doesn't, c'est la vie. I also don't know what you mean by extendibility.
|
# ? May 18, 2012 13:43 |
|
Hammerite posted:Each of those is just as "easy" as the other though? The most you can say is that one takes longer to read than the other. But it takes no longer to understand. Really I meant that it's more annoying to write. But yea you're right that's not a big deal and I would never use it myself as an argument for/against synthetic keys. By extensibility I'm talking about the point I tried to make in the post before that. Referring to G-Dub's example: G-Dub posted:There is an event (not in the programming sense) which has a unique ID The obvious natural PK to the EventCustomer table is (event_id, customer_id), but then what if you get thrown the additional requirement that each customer can respond multiple times to one event (maybe they're suddenly allowed to decline then attend it, or attend it multiple times for some brain-dead business reason or something). I'm not saying this is a comprehensive reason to use a synthetic key over a natural one, but it's helped me out more than once at no extra cost, so these days it's a no-brainer to me (except in many-many join tables - I'm still on the fence there, leaning towards the composite key). Plus it's kind of annoying (although like I said, no real cost) to have multiple columns for a foreign key to one table.
|
# ? May 18, 2012 14:06 |
|
Hammerite posted:ON st.(col1, col2, col3) = sot.(col1, col2, col3) code:
|
# ? May 18, 2012 17:00 |
|
MEAT TREAT posted:
Not supported in MS SQL Server or Sybase unfortunately.
|
# ? May 18, 2012 18:46 |
|
JeansW posted:Not supported in MS SQL Server or Sybase unfortunately. that was added to the spec 20 years ago!
|
# ? May 18, 2012 22:04 |
|
MEAT TREAT posted:that was added to the spec 20 years ago! I don't even think of the language as SQL when I use Microsoft stuff, I think "Ok, this is T-SQL." Same with Oracle, same with MySQL - if you have to switch between the three regularly, it's an absolute pain in the rear end.
|
# ? May 18, 2012 22:38 |
|
|
Pretty dumb question, but can you not concat two fields of different data types? I have an INT and a varchar but this isn't working: SELECT categoryID, categoryName + count AS categoryNameAndCount FROM categories but when I run it I just get the ID and the count. I've also tried: SELECT categoryID, CONCAT(categoryName + count) AS name FROM categories Using phpMyAdmin
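For what it's worth: in MySQL the + operator only does numeric addition (the varchar gets coerced to a number, usually 0), and CONCAT takes comma-separated arguments rather than +. Assuming the column really is named count, something like this should work:

```sql
-- CONCAT converts the INT to a string automatically;
-- backquotes guard against the column name clashing with COUNT()
SELECT categoryID,
       CONCAT(categoryName, ' ', `count`) AS categoryNameAndCount
FROM categories;
```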
|
# ? May 19, 2012 00:52 |