|
I have a table with some email addresses, dates and names. I select the data ordered by email, then date, and get code:
code:
Any way to do this within MySQL?
|
# ? Jul 3, 2009 01:40 |
|
|
SQL DBMSes use three-valued logic, or 3VL: NULL is a placeholder for an indeterminate value, representing data that is missing or unknown. In 3VL: code:
code:
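A quick way to see 3VL in action in MySQL; every comparison against NULL yields NULL (unknown), and only IS NULL / IS NOT NULL give a definite answer:

```sql
-- Comparisons involving NULL return NULL (unknown), never TRUE or FALSE.
SELECT 1 = NULL,      -- NULL
       1 <> NULL,     -- NULL
       NULL = NULL,   -- NULL
       1 IS NULL,     -- 0 (false)
       NULL IS NULL;  -- 1 (true)
```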
|
# ? Jul 3, 2009 02:01 |
|
Markoff Chaney posted:NULL can be useful, just remember that it's strange, at least in Oracle. 1 = NULL and 1 != NULL both evaluate to unknown (which is treated as false) for one thing, and for another you can't index on NULL. If you've got a column with an index on it and there are null values in there, you can use that index if you specify <column value> = <whatever>, but not <column value> IS NULL. Basically, never do equality tests with NULL on any platform ever, and always use IS NULL. Yes, it can bone up indexing, but it's better that it's right first.
|
# ? Jul 3, 2009 04:35 |
|
When I go to phpMyAdmin and view the details of most of the tables in my database, I see something that looks like the attached image. My database has 7 tables. Two of them have two columns as a Primary Key. Those tables do not exhibit this behaviour in phpMyAdmin. The other 5 each have a single column as Primary Key. The naming convention for each of these tables' Primary Key columns is (tablename)ID. So the User table has a Primary Key called UserID. Those tables all exhibit this behaviour. I don't know what PRIMARY is in the image; there is no column called that in the table, so I don't understand what phpMyAdmin is trying to tell me.
|
# ? Jul 3, 2009 19:46 |
|
Hammerite posted:When I go to phpMyAdmin and view the details of most of the tables in my database, I see something that looks like the attached image. The attached image shows the results of asking MySQL to show me the indexes on two of my tables, one of which (Message) is one with duplicate indexes, the other of which (WatchedGame) has a single index involving two columns. It becomes clear that there was some fault with the queries I used to create these tables. Something in said queries caused MySQL to get the wrong idea and add an additional index to each one. But I'm confused as to how a CREATE TABLE query should be written, because I didn't use different syntax for the different types of tables. I created WatchedGame using code:
code:
Also, I guess I should delete one of the duplicate indexes, but does it matter which one? If I delete the wrong one, will MessageID cease to be my primary key and cause me problems, or any other wacky poo poo like that? code:
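For anyone finding this later, a hedged sketch of how one might inspect and drop the redundant index; the index name used here (MessageID) is a guess, so use whatever Key_name SHOW INDEX actually reports:

```sql
-- List the indexes and note the Key_name of the duplicate.
SHOW INDEX FROM Message;

-- Dropping a named secondary index does not touch the primary key;
-- only ALTER TABLE ... DROP PRIMARY KEY would remove that.
ALTER TABLE Message DROP INDEX MessageID;  -- guessed index name
```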
|
# ? Jul 3, 2009 20:25 |
|
I just tested this out again by creating a throwaway table code:
Edit: The problem is with using SERIAL. I ought to use BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT in its place. Hammerite fucked around with this message at 21:55 on Jul 3, 2009 |
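For reference, SERIAL in MySQL is documented shorthand that includes an implicit UNIQUE key, which is where the second index comes from. A minimal illustration:

```sql
-- In MySQL, SERIAL expands to:
--   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE
-- so declaring the column SERIAL *and* PRIMARY KEY puts two
-- indexes on the same column.
CREATE TABLE demo_serial (
    id SERIAL,
    PRIMARY KEY (id)
);
SHOW INDEX FROM demo_serial;  -- shows both PRIMARY and the UNIQUE key
```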
# ? Jul 3, 2009 21:48 |
|
Hammerite posted:
Where's the real key of the table Message? If you define MessageID as an autoincrementing number then it can be used as a surrogate key, and to simplify joins, but without a real relational key then the table isn't normalized to first normal form and would allow duplicate rows to be inserted. Table WatchedGame does not have any foreign keys defined that reference the tables User or Game. Similarly the table Message does not have any foreign keys, say to the User and/or Thread tables. Foreign keys are a form of declarative referential integrity (DRI), and without DRI it is possible for invalid data values to be written to the tables. In table Message, the two "bit" or "Boolean" columns DeletedByUser and DeletedByAdmin could be replaced by a column named something like "DeleteCode". In the original design, code:
code:
code:
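A sketch of what the suggested foreign keys might look like; the column names (User, Game, Thread) are assumptions based on the thread, and the tables must use InnoDB, since MyISAM silently ignores foreign key definitions:

```sql
-- Assumed column names; adjust to the real schema.
ALTER TABLE WatchedGame
    ADD FOREIGN KEY (User) REFERENCES User (UserID),
    ADD FOREIGN KEY (Game) REFERENCES Game (GameID);

ALTER TABLE Message
    ADD FOREIGN KEY (User)   REFERENCES User (UserID),
    ADD FOREIGN KEY (Thread) REFERENCES Thread (ThreadID);
```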
|
# ? Jul 4, 2009 10:13 |
|
camels posted:Where's the real key of the table Message? If you define MessageID as an autoincrementing number then it can be used as a surrogate key, and to simplify joins, but without a real relational key then the table isn't normalized to first normal form and would allow duplicate rows to be inserted. I don't understand! MessageID is the primary key of Message. What then constitutes a real key? camels posted:Table WatchedGame does not have any foreign keys defined that reference the tables User or Game. Similarly the table Message does not have any foreign keys, say to the User and/or Thread tables. Foreign keys are a form of declarative referential integrity (DRI), and without DRI it is possible for invalid data values to be written to the tables. How do you mean - such as a record with a User value that doesn't match the UserID value of any row in table User? I suppose I was aware of the possibility but unconcerned, as I was not aware of an easy way to prevent it at the database level. I will look into the idea of defining some foreign keys for my tables. camels posted:Some of the columns may be sized too wide: The "ID" columns are declared as BIGINT(20), and column MessageText is declared as TEXT. Sizing columns to fit the data more closely is a good practice for data integrity and performance. The reason I originally set some of my longer string type columns to TEXT was because I had the impression (I do not know where from) that the maximum size you could give to a VARCHAR column was 255, as in MessageText VARCHAR(255). Reading http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html I see that this impression was erroneous. Do you suggest (say) MessageText VARCHAR(10000) or similar, with the limit being some large number I don't want messages to exceed?
|
# ? Jul 4, 2009 13:27 |
|
Hammerite posted:I don't understand! MessageID is the primary key of Message. What then constitutes a real key? A real key is one that is derived from the data, for example a candidate key of the table Message might be (User, Thread, PostDate). By adding a UNIQUE key constraint to Message on (User, Thread, PostDate) the table would be normalized to first normal form. quote:How do you mean - such as a record with a User value that doesn't match the UserID value of any row in table User? I suppose I was aware of the possibility but unconcerned, as I was not aware of an easy way to prevent it at the database level. I will look into the idea of defining some foreign keys for my tables. I think I would recommend reading a book or article on database basics, especially an introduction to data modeling, keys, and normalization. quote:The reason I originally set some of my longer string type columns to TEXT was because I had the impression (I do not know where from) that the maximum size you could give to a VARCHAR column was 255, as in MessageText VARCHAR(255). Reading http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html I see that this impression was erroneous. Do you suggest (say) MessageText VARCHAR(10000) or similar, with the limit being some large number I don't want messages to exceed? I would estimate the max size of the columns and maybe size it a bit wider, based on the business rules and domain.
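In MySQL, that UNIQUE-constraint suggestion would look something like this (assuming the column names used above):

```sql
-- Adds a natural/candidate key alongside the surrogate MessageID,
-- preventing duplicate rows with the same poster, thread, and time.
ALTER TABLE Message
    ADD UNIQUE KEY uq_message_user_thread_date (User, Thread, PostDate);
```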
|
# ? Jul 5, 2009 01:24 |
|
OK, thank you for your advice. edit: quote:I think I would recommend reading a book or article on database basics, especially an introduction to data modeling, keys, and normalization. Hammerite fucked around with this message at 01:47 on Jul 5, 2009 |
# ? Jul 5, 2009 01:30 |
|
As an additional thought, googling tells me that in order to use foreign keys in MySQL I need to switch the tables that will use them from the MyISAM engine to InnoDB. Idle reading suggests that MyISAM is generally a somewhat higher-performance engine than InnoDB, but since some of my tables in particular are UPDATE'd quite often compared with how often they are read, InnoDB with foreign keys may or may not be superior. It is tough for me to know what is optimal.
|
# ? Jul 5, 2009 02:37 |
|
Hammerite posted:OK, thank you for your advice. I think The Manga Guide to Databases is a good intro book. I also like Fabian Pascal's introductory books; see my earlier post. Some online articles:
http://databases.about.com/od/specificproducts/a/normalization.htm has good info and links.
The Wikipedia article on Database normalization.
http://filedb.experts-exchange.com/incoming/2009/02_w06/102071/rettigNormalizationPoster.pdf
For a quick starting point on database theory: http://en.wikipedia.org/wiki/Relational_model
|
# ? Jul 5, 2009 02:44 |
|
Hammerite posted:As an additional thought, googling tells me that in order to use foreign keys in MySQL I need to switch the tables that will use them from the MyISAM engine to InnoDB. Idle reading suggests that MyISAM is a generally somewhat higher-performance engine than InnoDB but that since in particular some of my tables are UPDATE'd quite often in comparison with how often they are read, InnoDB with foreign keys may or may not be superior. It is tough for me to know what is optimal. The only legitimate reason to use MyISAM over InnoDB is for fulltext indexing, and there are plenty of other ways to go about searching (e.g. Sphinx, though that's another can of worms).
|
# ? Jul 5, 2009 03:13 |
|
McGlockenshire posted:Using FKs always incurs a performance penalty due to the lookups required, but that's true with any database from any vendor. It's better to eat the possible performance penalty and only worry about it if it becomes a measurable issue. Not sure if MySQL is the same, but on Oracle, if you're doing SQL operations on two tables that should have a foreign key, they will actually be sped up by adding one. I suspect this is some sort of optimization trickery on Oracle's part.
|
# ? Jul 6, 2009 12:13 |
|
In PL/SQL how do I use a collection in an "IN" clause? I was certain I have done this before, but I can't remember or Google the syntax.
|
# ? Jul 8, 2009 00:03 |
|
I've got a painfully vague but hopefully easy question. This is actually a continuation of sorts for the problem I posted a few pages ago; the person the report was written for wants some things expanded and I've been having trouble with that. I've already got a report written in Actuate that retrieves account numbers for accounts that have more than one active electrical service installed. It works great, no problems. Now however I have to display a subreport that pops up information on each individual service, information that is listed in various other tables. I'm pretty sure I've got a solid query there that does what I need it to. Only one problem: I can't find a good way to get at the data I need. A brief look at the initial query so you know what I'm talking about : code:
To definitively identify a given service and get the appropriate information for it I need two things: its location number and its service seq. I want to grab these from my initial query and use them as parameters in the next one. Just having the account number doesn't really help, since an account can have more services than just electrical and I don't want to display those. The SQL in my initial query clearly has to access that information at some point to count it up, but since I'm using an aggregate function in the form of COUNT I don't think there's any way to get at that information in my result set. My hope is that you all can tell me otherwise. For reference, here's the query I used in the nested report a bit further down: code:
|
# ? Jul 9, 2009 20:17 |
|
Xae posted:In PL/SQL how do I use a collection in an "IN" clause? code:
where something in (select * from table(t_foo))
|
# ? Jul 9, 2009 21:06 |
|
MoNsTeR posted:where something in (select * from table(t_foo)) I plan on doing a BULK COLLECT though. I'm rewriting it to only use one select instead of the N it currently uses.
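A sketch of the combined BULK COLLECT + TABLE() pattern, with made-up table and column names; note that for TABLE() to work inside a SQL statement, the collection type must be a schema-level type created with CREATE TYPE, not one declared inside a PL/SQL block:

```sql
-- Schema-level collection type, required for TABLE() in SQL:
CREATE TYPE t_num_tab AS TABLE OF NUMBER;
/
DECLARE
    v_ids t_num_tab;
BEGIN
    -- One round trip to fill the collection instead of N selects:
    SELECT customer_id BULK COLLECT INTO v_ids
      FROM customers
     WHERE region = 'WEST';      -- made-up tables/columns

    -- COLUMN_VALUE is the pseudocolumn for a scalar collection.
    DELETE FROM orders
     WHERE customer_id IN (SELECT COLUMN_VALUE FROM TABLE(v_ids));
END;
/
```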
|
# ? Jul 10, 2009 13:16 |
|
Was disappointed just now to find that you can't do code:
|
# ? Jul 10, 2009 23:43 |
|
You might have more luck attacking INFORMATION_SCHEMA. None of the special info MySQL provides can be treated like a table, which is very silly.
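For example, the INFORMATION_SCHEMA views can be queried like ordinary tables, unlike SHOW output:

```sql
-- Queryable equivalents of SHOW TABLES and SHOW INDEX
-- ('mydb' is a placeholder schema name).
SELECT TABLE_NAME, ENGINE, TABLE_ROWS
  FROM INFORMATION_SCHEMA.TABLES
 WHERE TABLE_SCHEMA = 'mydb';

SELECT TABLE_NAME, INDEX_NAME, COLUMN_NAME, SEQ_IN_INDEX
  FROM INFORMATION_SCHEMA.STATISTICS
 WHERE TABLE_SCHEMA = 'mydb';
```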
|
# ? Jul 11, 2009 01:04 |
|
Hammerite posted:Was disappointed just now to find that you can't do code:
|
# ? Jul 11, 2009 01:13 |
|
I'm trying to do geolocation by IP address, essentially. I have a geolocation table, called ip_group_city, which basically maps IP ranges to coordinates. The way you look up the coordinates for an IP address is: code:
code:
What I need is a magical query that does something like this: code:
|
# ? Jul 17, 2009 03:20 |
|
Use a sub-select? code:
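The usual shape of this kind of range lookup, with guessed column names (ip_start for the range start; latitude/longitude for the coordinates) and INET_ATON() converting the dotted quad to an integer:

```sql
-- Guessed schema: ip_group_city(ip_start, latitude, longitude).
-- Pick the range row with the largest start not exceeding the address.
SELECT latitude, longitude
  FROM ip_group_city
 WHERE ip_start <= INET_ATON('10.1.2.3')
 ORDER BY ip_start DESC
 LIMIT 1;
```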
|
# ? Jul 17, 2009 03:33 |
|
Awhile back I posted about storing a huge pile of stock market data in a single table. The query times on it are pretty bad. Well, I don't know if I should actually complain. Imagine all symbols on the New York Stock Exchange, in some cases going back about 29 years, with their daily data. They're all smashed together in one table. I'm wondering if one could expect a speedup if I put each company in its own table, and had a master table storing the indices so I know which one to call up. Is this kind of the normal thing to do with huge piles of data? Assuming I need to look at all companies across a year, that means doing this lookup a few thousand times. In the end, could it still be faster than querying on the huge pile? I also read about stored routines and wondered if the lookup for taking a stock symbol to a table index is the kind of thing for which I should use that. Edit: Perhaps as a defense against being told to just try it--organizing a database to accommodate that layout doesn't really seem hard to me, but querying it is going to take a little bit of revamping of the program I'm using to call it. It's a Java application using JDBC; I'm pretty much firing SQL out, so I don't think the program itself would be a factor. Rocko Bonaparte fucked around with this message at 04:17 on Jul 17, 2009 |
# ? Jul 17, 2009 04:10 |
|
Rocko Bonaparte posted:Awhile back I posted about storing a huge pile of stock market data in a single table. The query times on it is pretty bad. Well, I don't know if I should actually complain. Imagine all symbols on the New York Stock Exchange, in some case going back about 29 years, with their daily data. They're all smashed together in one table. I'm wondering if one could expect a speedup if I put each company in its own table, and had a master table storing the indices so I know which one to call up. Is this kind of the normal thing to do with huge piles of data? Assuming I need to look at all companies across a year, that means doing this lookup a few thousand times. In the end, could it still be faster than querying on the huge pile? No, do not break it into individual tables. The root cause of your slowness is probably a bad index and/or bad table structure. Splitting into multiple tables is going to make it worse. Post your sql, data structure (along with indexes), number of records, database type (mysql, sql server, etc.), and how long your statement is taking to execute.
|
# ? Jul 17, 2009 04:17 |
|
Begby posted:No, do not break it into individual tables. I've basically created one table that is organized like so:
symbol -- string, I think 8 characters
close date -- date
opening price -- decimal 8,4
high price -- decimal 8,4
low price -- decimal 8,4
closing price -- decimal 8,4
volume -- big integer, 20 digits
If any of that makes any sense; 8,4 means 8 digits with 4 more for precision. I added some symbols for companies that went out of business years ago, so I don't know the full record count anymore. I believe I have at least over 2 million records. New data for each symbol is added on a regular basis; daily at points. So on a given day, each symbol might get a new record added for it, one after the other. I tend to query a year's worth of data for all symbols within a date range. I'm running mysql, but I'd have to get the version string to you later. How should I measure the time for you? Should I use the low-level mysql client interface?
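Rendered as DDL, that description might look like the following; the table and column names here are guesses, and per the later post the real table differs slightly:

```sql
-- Guessed rendering of the described schema; DECIMAL(12,4) holds
-- 8 integer digits plus 4 decimal places.
CREATE TABLE daily_quote (
    symbol      CHAR(8)         NOT NULL,
    close_date  DATE            NOT NULL,
    open_price  DECIMAL(12,4)   NOT NULL,
    high_price  DECIMAL(12,4)   NOT NULL,
    low_price   DECIMAL(12,4)   NOT NULL,
    close_price DECIMAL(12,4)   NOT NULL,
    volume      BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (symbol, close_date)
);
```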
|
# ? Jul 17, 2009 04:24 |
|
I've got about 5,000 integers that need to be checked against whether a user has seen them before. These integers are grouped into different categories of varying size, so one category could contain 325 of the integers while another has 50 of them. They're all unique integers and won't appear in any category more than once. Meaning that if it's in one category, it's only in that category. Right now I have a table set up to contain the following:
user_id | category_1 | category_2 | category_3 | category_n... etc
In category_1 there's just a comma-separated list of integers that the user has seen before, e.g. 1,2,3,5,6,7,111,9,10. This seems a bit clunky so I'm thinking about storing each individual integer seen in its own row. Kind of like this:
user_id | int_seen | category
The only problem I can think of is that there could be any number of user_ids... It'd probably be capped at 10,000 but that'd still be 50,000,000 entries in the table... My solution to that would be clearing the table of a specific user_id's entries when they have seen all the integers and making another table that contains a list of users that have done this. That and maybe having a group_id column in the original table so people in a group can share a result set instead. What do you guys think? Any ideas or alternatives? edit: hope that made sense
|
# ? Jul 17, 2009 04:37 |
|
Well that's certainly an improvement over your current setup. Is the category for a given number constant across user IDs? My first thought would be something like: CREATE TABLE Integers (Integer INT, CategoryID INT); CREATE TABLE Users_Integers (UserID INT, Integer INT, UserHasSeenInteger LOGICAL);
|
# ? Jul 17, 2009 05:50 |
|
Zoracle Zed posted:Well that's certainly an improvement over your current setup. Is the category for a given number constant across user IDs? Yes, the category for a number is constant across user IDs. I'll probably need to include the category id as well in the user's "seen" table as the integers are displayed by category to begin with. This is on MySQL too by the way.
|
# ? Jul 17, 2009 08:17 |
|
I have a row in my MySQL table that I want to duplicate into the same table. I could do this with INSERT INTO :table SELECT * FROM :table WHERE id=:id, but since one of the columns is an auto-increment mysql balks at me for trying to have 2 rows with the same id. Is there a real easy way to exclude/ignore the auto-increment column, or do I have to read in all column names and filter out the auto-incrementing ones?
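The standard workaround is to name every column except the auto-increment one, so MySQL assigns a fresh id to the copy; the column names below are placeholders:

```sql
-- Copy row 42, letting the id auto-increment; 'name', 'email',
-- 'created' stand in for the table's real non-id columns.
INSERT INTO mytable (name, email, created)
SELECT name, email, created
  FROM mytable
 WHERE id = 42;
```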
|
# ? Jul 17, 2009 12:41 |
|
minato posted:Use a sub-select? Balls. From what I can tell, this is exactly what I need, but when I execute it it chugs for about 30 seconds and then returns code:
|
# ? Jul 17, 2009 13:25 |
|
Rocko Bonaparte posted:Well, that's a fast response. I'm on the road right now and was pondering this again so I thought I'd post. So I'll try to be more precise in the next few days by adding more information but since you asked now, I wanted to tell you what I vaguely know off the top of my head. This sounds like an issue with your indexes. What is your SQL statement and what indexes do you have? As far as time you don't have to be exact, but any MySQL gui should tell you how long it takes for a query to run. Another thing that will help, put EXPLAIN in front of your sql statement and run it, and post the results here. It will show what indexes are being used and if any temporary tables are being created.
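For instance (table and column names here are stand-ins):

```sql
-- EXPLAIN reports the chosen index, estimated row counts, and any
-- temporary table or filesort, without executing the query.
EXPLAIN
SELECT symbol, close_price
  FROM daily_quote
 WHERE close_date BETWEEN '2008-01-01' AND '2008-12-31';
```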
|
# ? Jul 17, 2009 14:06 |
|
drcru posted:Yes, the category for a number is constant across user IDs. I'll probably need to include the category id as well in the user's "seen" table as the integers are displayed by category to begin with. What do you mean by "the integers are displayed by category" (give me an example query) and why can't you do it with an inner join between the two tables?
|
# ? Jul 17, 2009 15:29 |
|
dancavallaro posted:Balls. From what I can tell, this is exactly what I need, but when I execute it it chugs for about 30 seconds and then returns gently caress it. I just decided to break it up into separate queries, and it runs fast enough. It doesn't really matter because the IP address data I'm using is only updated once a day anyway, so I only have to do this lookup once a day and then I can cache it for the next 24 hours. Next step: server-side clustering of up to ~200,000 coordinate locations to display on a map via Google Maps API.
|
# ? Jul 17, 2009 17:31 |
|
Mercator posted:I have a row in my MySQL table that I want to duplicate into the same table. I could do this with INSERT INTO :table SELECT * FROM :table WHERE id=:id, but since one on the columns is an auto-increment mysql balks at me for trying to have 2 rows with the same id. I might just go ahead and filter out the column names. I can at least imagine how that code would look (albeit a bit on the convoluted side).
|
# ? Jul 18, 2009 00:17 |
|
Zoracle Zed posted:What do you mean by "the integers are displayed by category" (give me an example query) and why can't you do it with an inner join between the two tables? They're called separately, one for the integer list and one to see which integers they've seen.
SELECT `integer` FROM `integers_list` WHERE `category_id` = ?
SELECT `integer_seen` FROM `integers_seen` WHERE `user_id` = ? AND `category_id` = ?
|
# ? Jul 18, 2009 00:46 |
|
Rocko Bonaparte posted:Well, that's a fast response. I'm on the road right now and was pondering this again so I thought I'd post. So I'll try to be more precise in the next few days by adding more information but since you asked now, I wanted to tell you what I vaguely know off the top of my head. Don't forget to include a column for Split/Reverse Split ratio (make it default to 1.000). This way, you can build the split math right into the SQL. This is SUPER important for data across long periods of time. Otherwise, you'll have to adjust all the prices before you populate the DB. Also, agreeing that you are having setup problems. I have a similar DB to yours, but with only 5 years of data... but it's in freaking Access. You should be able to do what you want in MySQL with plenty of speed.
|
# ? Jul 18, 2009 01:02 |
|
I believe my data is currently automatically compensated for splits, but that is still an important point that I should review. Anyways, I don't know poo poo about indices, so there's a problem right there. I'll look it up here and see what to make of it. Anyways, the table is a slightly different format: code:
A regular query for me is something like: code:
Anyways the mysql version: code:
code:
Update 2: I just looked up indices quickly. I was running on the damaged notion that a database would take care of this for me already. That seems to not be the case. So if I'm prone to looking for data based on close_date, should I create an index for it? I would assume I should do something like: code:
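A sketch of what those index definitions might look like, assuming the table is called daily_quote and the date column really is close_date:

```sql
-- Index for date-range scans:
CREATE INDEX idx_close_date ON daily_quote (close_date);

-- If most queries also restrict or group by symbol, a composite
-- index can serve both access patterns:
CREATE INDEX idx_symbol_date ON daily_quote (symbol, close_date);
```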
Rocko Bonaparte fucked around with this message at 04:57 on Jul 18, 2009 |
# ? Jul 18, 2009 04:28 |
|
drcru posted:They're called separately, one for the integer list and one to see which integers they've seen. Using the schema I suggested earlier, you could do: code:
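Using the two-table schema suggested earlier in the thread, one join that returns the integers a user has not yet seen in a category might look like this (7 and 3 are placeholder ids):

```sql
SELECT i.Integer
  FROM Integers i
  LEFT JOIN Users_Integers ui
    ON ui.Integer = i.Integer
   AND ui.UserID  = 7            -- placeholder user id
 WHERE i.CategoryID = 3          -- placeholder category id
   AND ui.Integer IS NULL;       -- no "seen" row means not yet seen
```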
|
# ? Jul 18, 2009 16:50 |
|
|
Sergeant Hobo posted:I might just go ahead and filter out the column names. I can at least imagine how that code would look (albeit a bit on the convoluted side). Yeah, I figured as much. It's not real pretty but it's still understandable. Thanks for looking, though.
|
# ? Jul 18, 2009 18:25 |