|
Oracle has a CONNECT BY operator that lets you do this. If I couldn't do that I would write a recursive row-returning function to crawl the tree. The following link has info on MySQL approaches to handling nested data: http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
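For reference, a minimal sketch of a CONNECT BY tree walk, assuming a hypothetical employees(id, manager_id, name) table:
code:
SELECT LPAD(' ', 2 * (LEVEL - 1)) || name AS indented_name
FROM employees
START WITH manager_id IS NULL        -- roots of the hierarchy
CONNECT BY PRIOR id = manager_id     -- each child row points at its parent's id
ORDER SIBLINGS BY name;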
|
# ¿ Feb 21, 2008 21:39 |
|
Selecting non-grouped columns in a GROUP BY statement returns nondeterministic results. Take the MAX of your user's post dates and order by the result of that aggregate function.
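Something along these lines, assuming hypothetical users and posts tables:
code:
SELECT u.user_id, u.username, MAX(p.post_date) AS last_post
FROM users u
JOIN posts p ON p.user_id = u.user_id
GROUP BY u.user_id, u.username
ORDER BY last_post DESC;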
|
# ¿ Feb 29, 2008 04:46 |
|
Oracle 10g XE is free, with the following limitations: 4GB user data, 1GB RAM, 1 CPU. http://www.oracle.com/technology/products/database/xe/index.html
|
# ¿ Feb 29, 2008 23:54 |
|
fletcher posted:Great info about postgres guys, thanks. If I was going to try something new should I go postgres or oracle 10g? I don't think I will be going beyond their computer limitations any time soon. I would use Oracle 10g XE or Microsoft SQL Server Express if I was learning something new. They're both free and mature, with SQL Server Express having the advantage of integrating with Visual Studio Express out of the box. My preference is Oracle. I believe that PostgreSQL is still lacking analytic functions with windowing, a feature I would have a very hard time living without.
|
# ¿ Mar 3, 2008 16:05 |
|
Sergeant Hobo posted:I am taking a class where we are using Oracle 10g. I want to know if there is a way to do one command that can be used to drop all the tables at once (mainly because I don't feel like typing drop table tablename; a billion times, not that I'm lazy or anything). A quick Google search came up with this command: Run something like this. You can instantiate it as a procedure for easy reuse. code:
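-- The original code block was lost; this is a sketch of the usual approach,
-- looping over USER_TABLES and issuing a DROP TABLE for each one.
BEGIN
  FOR t IN (SELECT table_name FROM user_tables) LOOP
    EXECUTE IMMEDIATE 'drop table "' || t.table_name || '" cascade constraints';
  END LOOP;
END;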
|
# ¿ Mar 7, 2008 16:59 |
|
No forward slash at the end, and a space before the word "cascade". Yes, adding "cascade constraints" will handle foreign keys.
|
# ¿ Mar 14, 2008 19:17 |
|
Charun posted:I have a date related problem. I would create a table of months with their beginning date (00:00:00) and ending date (23:59:59) and join against that. You'd want to get the month the event started in, the month the event ended in, and all the months in between. The most straightforward solution in Oracle syntax follows. MySQL 5.0 syntax should be very close to this. code:
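-- The original code blocks were lost; this sketch shows the shape of the
-- overlap join, assuming hypothetical months(month_start, month_end) and
-- events(event_id, start_date, end_date) tables. An event matches every
-- month whose range overlaps the event's range.
SELECT e.event_id, m.month_start
FROM events e
JOIN months m
  ON m.month_start <= e.end_date
 AND m.month_end   >= e.start_date
ORDER BY e.event_id, m.month_start;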
|
# ¿ Mar 17, 2008 16:22 |
|
Charun posted:and it works perfectly! Do you want plat or archives? No thanks, I just contribute for fun.
|
# ¿ Mar 18, 2008 16:38 |
|
nbv4 posted:hmm, that's not working either. Here is my exact query: Just compute your sorting column in the inner query and sort on it in the outer select directly.
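A minimal sketch of the pattern, with hypothetical table and column names: compute the sort key once in the derived table, then order by it in the outer query.
code:
SELECT *
FROM (
  SELECT p.*, COALESCE(p.updated_at, p.created_at) AS sort_key
  FROM posts p
) ranked
ORDER BY sort_key DESC;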
|
# ¿ Mar 26, 2008 02:59 |
|
drcru posted:Why is this: DISTINCT is not a function, so it cannot actually be applied to a single return column, despite how it looks. code:
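-- Despite the parentheses people often write, DISTINCT applies to the whole
-- select list. Both of these return distinct (postid, tag) pairs from a
-- hypothetical post_tags table:
SELECT DISTINCT postid, tag FROM post_tags;
SELECT DISTINCT(postid), tag FROM post_tags;  -- the parentheses change nothing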
You'll need to reword what you're looking for - maybe "for each postid, the tag of the row with the largest date". noonches posted an example of a solution to this kind of problem higher up on this page.
|
# ¿ Mar 27, 2008 02:31 |
|
Code Jockey posted:Real quick general question, not specific to MS SQL. You can't say it would always be faster, because there are a lot of "it depends". With a "top N" query you need to sort before you can return rows, but if you know you only want 400 results the database can use less sort space (it only needs to keep a window of the top 400 instead of sorting every row returned), which makes the query more efficient, and faster with very large result sets because you avoid spilling to temp space. If your query is doing a full table scan it could be faster because it could stop earlier, and it could similarly end earlier with some indexed access paths. Oracle can, based on statistics, shuffle joins and access paths around based on the FIRST_ROWS_N (N=1,10,100,1000) hint.
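For reference, the classic Oracle top-N shape (hypothetical table), which lets the database keep only a small sort window:
code:
SELECT *
FROM (SELECT * FROM orders ORDER BY order_date DESC)
WHERE ROWNUM <= 400;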
|
# ¿ May 3, 2008 01:00 |
|
Grigori Rasputin posted:I'm pretty new to MySQL and am wondering how best to enforce a set of data rules at the table level. The basic idea is that I want to link like data together, but never repeat a combination of it. I can deal with it in code, but I'd rather prevent it as low as possible. It sounds like you want to store and enforce integrity on a directed acyclic graph in the database. Here's a good Google result talking about DAGs and databases: http://69.10.233.10/KB/database/Modeling_DAGs_on_SQL_DBs.aspx I didn't read the whole article, but you could handle (1) with a direct BEFORE INSERT trigger, and you could handle (2) in a similar fashion by making sure that when a user tries to add edge e' (n->n') to an existing DAG G that edge e'' (n'->n) is not in the transitive closure of G (which would be pre-computed and stored in a table). This table would grow quite large in the case that you had a large number of vertices. If an edge e' introduces a cycle you would be able to tell very quickly by examining the transitive closure of G and raising an error. Inserting edges that do not introduce cycles (or removing edges) would be at worst O(V^3), where V is the number of vertices in G (in the case you choose to fully recompute it after each operation). You'd be doing this in code in a similar fashion anyway, and it will be faster in the database since you will have persistent, pre-computed information about your graphs, but it's probably going to be easier to keep the logic in your code.
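As a minimal sketch of the cycle check, assuming hypothetical edges(src, dst) and closure(src, dst) tables, where closure holds the pre-computed transitive closure:
code:
-- Adding edge (n, n2) would create a cycle iff a path n2 -> n already exists.
SELECT COUNT(*) AS would_cycle
FROM closure
WHERE src = 'n2' AND dst = 'n';  -- a count > 0 means reject the insert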
|
# ¿ May 6, 2008 01:23 |
|
indigoe posted:Should I be indexing more fields that are included in my WHERE clauses? I'm concerned I will end up with an index that's near as big as the database itself. Is there some other way to approach this? Indexes are not always good, but a common rule of thumb is that if you are pulling out less than 15% of the rows in the table then evaluating a new index on a column (or group of columns) with a lot of distinct combinations might help. With a table that large any indexes will necessarily be large and cumbersome. In my own experience the problem can usually be reduced by pre-computing intermediate values and storing them as additional columns, or in a secondary table. The strategies MySQL has available for reporting tasks are very limited compared to SQL Server and Oracle, but you want to do as much as you can in the database before you offload it to another system for processing. It will almost *never* be faster to offload calculations to another language if these same calculations can be done directly (through direct SQL or through procedures) in the database.
|
# ¿ May 6, 2008 15:37 |
|
nbv4 posted:I have a table thats about to be populated with a few hundred thousand records. The table has about 34 columns. Some of these columns are boolean columns that could be combined with one another. For instance one column called "is_car" and another "is_motorcycle", which could be combined as "vehicle_type" with a value of either NULL, "m", or "c". It doesn't matter with small columns like that. If you consolidated them it would just reduce the width of the row in the raw data file by a few bytes. Just make sure you don't go the other direction and make an attribute table of (src_id, attrname, attrval), storing your attributes in a 1:N relationship.
|
# ¿ May 6, 2008 15:49 |
|
Begby posted:Why not? Space is effectively unlimited on databases, but CPU power is not. You should design your database to answer the questions you plan to ask quickly and efficiently. People model their database like this for a lot of reasons, but it's usually to avoid making concrete design decisions or to get around their DBAs (if it's for space it's an even worse reason). There's no question that answering even simple questions will be faster when the data is already flattened. You'll experience decreased concurrency through this system, and complex questions will be even harder for the database to answer quickly. God help you if you're using self joins to answer these questions.
|
# ¿ May 6, 2008 17:18 |
|
Zedlic posted:Thanks for this. If an UP/DN ACK pair is *always* present (the other end always replies) and they are logged reasonably in order and there are no duplicates and you're using Oracle or SQL Server you could use an analytic function like FIRST_VALUE to compute the final table efficiently. Otherwise, the query is going to be necessarily slow unless you help stage the join condition. One way you could do this is to translate this rolling 0-999 value into a real value in the range 0-999,999,999 (a big number you won't ever reach) and store it in another table, or compute it on the fly with a function. You could then index this function/column, and joins would become very fast.
|
# ¿ May 21, 2008 19:42 |
|
Zedlic posted:I've thought about mapping the ACK_number to something bigger that won't realistically loop ever, but the problem is finding a mapping function that maps each pair of ACK_numbers to a unique number. Any ideas on that are appreciated. If you created a new column ack_number_noloop then you could count the number of ACKs with that sequence number with a smaller timestamp, multiply that count by 1000, and add the ack_number. This is a way to link the two records together as yaoi prophet recommended. It will work as long as when you get message 500 then the ACK for the previous message 500 has already been committed to the database.
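A sketch of the unrolling computation, assuming a hypothetical acks(ack_number, ts) table; each wrap of the 0-999 counter bumps the value by 1000:
code:
SELECT a.ack_number
       + 1000 * (SELECT COUNT(*)
                 FROM acks b
                 WHERE b.ack_number = a.ack_number
                   AND b.ts < a.ts) AS ack_number_noloop
FROM acks a;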
|
# ¿ May 22, 2008 02:54 |
|
I'd do this: code:
|
# ¿ May 25, 2008 22:02 |
|
epswing posted:Can you tell me why this is the same/better/worse? I think it's easier to understand what the query is doing when you break the query up. Additionally, while using a CASE expression in that way is convenient, it is hard on the database because Oracle has to evaluate it for every row. I benchmarked the two original queries and found that my query ran in 50% of the time. I then benchmarked my original query against a modified version that uses the case method, and found that my query ran in 70% of the time. code:
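-- The benchmark queries were lost; as a generic illustration of the idea
-- (hypothetical names throughout), a per-row CASE like this:
--   SELECT SUM(CASE WHEN status = 'open' THEN amount ELSE 0 END), ...
-- can often be broken up into plain filtered aggregates instead:
SELECT status, SUM(amount) AS total
FROM tickets
WHERE status IN ('open', 'closed')
GROUP BY status;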
|
# ¿ May 26, 2008 02:27 |
|
Stephen posted:
The query is trying to return, for each make/model, the make, the model, the number of times it has been viewed, one photo, and something from auto_upsales?
|
# ¿ Jun 13, 2008 19:45 |
|
Aredna posted:Right off you can change this: The GROUP BY will still force the query to return only one row per auto_id in the photos table. The ORDER BY clause in the original sub-select will probably make it return, for each auto_id, the row having the largest or smallest orderid. Taking this out of the sub-select will probably cripple performance, since it means doing an outer join against a 200k row table. You gotta feel bad for MySQL users. Without analytics you really have to bend over backwards to avoid self joins.
|
# ¿ Jun 13, 2008 19:58 |
|
Stephen posted:How do you get around something like this in Oracle? If you aren't allowed to reference columns that aren't in the GROUP BY, how do you output them with the rest of the row? Is this a MySQL habit that I should kick right now before I get into Oracle/SQLServer? In older versions you would need to self join (get the id and maximum value and then join back to the original table). In newer versions you can use analytics such as row_number() or rank() to compute an ordering which you then filter on.
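A sketch of both patterns, using a hypothetical photos(auto_id, orderid, url) table:
code:
-- Portable self join: find each group's max, then join back for the full row.
SELECT p.*
FROM photos p
JOIN (SELECT auto_id, MAX(orderid) AS max_orderid
      FROM photos
      GROUP BY auto_id) m
  ON m.auto_id = p.auto_id
 AND m.max_orderid = p.orderid;

-- Analytic version (Oracle/SQL Server): rank within each group, keep rank 1.
SELECT *
FROM (SELECT p.*, ROW_NUMBER() OVER (PARTITION BY auto_id ORDER BY orderid DESC) AS rn
      FROM photos p)
WHERE rn = 1;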
|
# ¿ Jun 13, 2008 22:22 |
|
Stephen posted:Is there anything I can do to improve the efficiency of this query? The views table is pushing 5 million records, and the photos table is up to near 200,000 so it's really starting to slow down. I have views.auto_id and views.view_type_id both indexed but it's still taking quite a while. There aren't any great approaches to solving this problem in a generic way, since at the end of the day you still need to aggregate 5 million records and compute a possibly large outer join. The best thing to do is probably to asynchronously pre-compute part or all of the query and refresh/update on insert or on a timer, depending on your needs and your workload. You're probably insert-few, read-many, so the additional time added to insert operations is a worthwhile investment. Make sure you look at MySQL concurrency control and make adequate accommodations so that queries don't block while the tables are refreshing.
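One incremental approach, assuming a hypothetical pre-aggregated view_counts(auto_id, views) table with auto_id as its primary key, maintained alongside the detail rows:
code:
-- MySQL: bump a pre-computed counter as each view is recorded.
INSERT INTO view_counts (auto_id, views)
VALUES (123, 1)
ON DUPLICATE KEY UPDATE views = views + 1;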
|
# ¿ Jun 14, 2008 22:10 |
|
MrHyde posted:Thanks for the help. I wasn't really clear. I don't want a list of integers, I just want a list of comma separated values I have, a better example would be ('firstVal','2ndval','3rdval') A pattern I've used with Oracle in the past that should be portable is to use a dummy table that is known to have more rows than elements in your input string, and to parse the string with substr/instr using your current row number as an offset. You will need to find a way to number the rows, possibly using row_number() or rank(), if you do not have a rownum pseudo-column. See the following (the input string 'a,b,c,d' would be passed in and duplicated in several spots in the query). code:
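-- The original example was lost; this is the general shape of the technique
-- in Oracle syntax, with 'a,b,c,d' standing in for the input string. The
-- string is wrapped in commas so every element sits between two delimiters.
SELECT SUBSTR(',' || 'a,b,c,d' || ',',
              INSTR(',' || 'a,b,c,d' || ',', ',', 1, rn) + 1,
              INSTR(',' || 'a,b,c,d' || ',', ',', 1, rn + 1)
                - INSTR(',' || 'a,b,c,d' || ',', ',', 1, rn) - 1) AS element
FROM (SELECT ROWNUM AS rn
      FROM all_objects
      WHERE ROWNUM <= LENGTH('a,b,c,d')
                      - LENGTH(REPLACE('a,b,c,d', ',')) + 1);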
|
# ¿ Jul 1, 2008 22:28 |
|
crazypenguin posted:(mysql 5) I have a list of names (application side). I need to populate a table with the ids that refer to rows in the names table. Except some of these names may not yet exist in the names table and should be created. If your names list is small you can handle the check-and-insert from your application. If it's large then a good solution might be to wrap the checking/creation SQL in a function which does nothing if the name exists and creates it otherwise. Then you would call this function N times, once for each name in your list.
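A sketch of the function idea in MySQL, with hypothetical names; it returns the id either way:
code:
-- Assumes names(id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(100) UNIQUE).
DELIMITER //
CREATE FUNCTION get_or_create_name(p_name VARCHAR(100))
RETURNS INT
MODIFIES SQL DATA
BEGIN
  DECLARE v_id INT;
  SELECT id INTO v_id FROM names WHERE name = p_name;
  IF v_id IS NULL THEN
    INSERT INTO names (name) VALUES (p_name);
    SET v_id = LAST_INSERT_ID();
  END IF;
  RETURN v_id;
END //
DELIMITER ;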
|
# ¿ Jul 2, 2008 03:13 |
|
epswing posted:(Discussed on synirc, posting for posterity: ) Using 3 joins is, in my opinion, the appropriate solution to this problem (you will need to use outer joins since the offers table can have nulls). With indexes the look-up against the names table should be very efficient. Here's a single table join Tom Kyte has suggested for "attribute" tables (http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:31263576751669) that will work in this situation. code:
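-- The quoted example was lost; this is the usual shape of Kyte's single-join
-- pivot for attribute tables, with hypothetical names: attrs(item_id, name, value).
SELECT i.item_id,
       MAX(DECODE(a.name, 'color', a.value)) AS color,
       MAX(DECODE(a.name, 'size',  a.value)) AS item_size,
       MAX(DECODE(a.name, 'offer', a.value)) AS offer
FROM items i
LEFT JOIN attrs a ON a.item_id = i.item_id
GROUP BY i.item_id;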
|
# ¿ Jul 7, 2008 15:55 |
|
fletcher posted:What's the most efficient way to query a table for timestamps that are < 15 minutes old in MySQL? In the first expression you are converting a date column from its native format into an integer for a comparison. Stick with the second expression. Although they are meant to be equivalent, the optimizer will more than likely not rewrite the first one to make them equivalent.
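The sargable version looks something like this (hypothetical table and column names):
code:
SELECT *
FROM events
WHERE created_at > NOW() - INTERVAL 15 MINUTE;  -- an index on created_at remains usable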
|
# ¿ Jul 16, 2008 03:04 |
|
MoNsTeR posted:More or less. I would write it as: I'll line up on the other side. ANSI joins are the only thing that makes large statements understandable and readable. The (+) operator makes life miserable, has limitations, and even Oracle recommends you get on with your life. 10.2 SQL Reference posted:Oracle recommends that you use the FROM clause OUTER JOIN syntax rather than the Oracle join operator. Outer join queries that use the Oracle join operator (+) are subject to the following rules and restrictions, which do not apply to the FROM clause OUTER JOIN syntax:
|
# ¿ Jul 16, 2008 20:20 |
|
Alex The Great posted:I have a SQLite database that I'm trying to search through, and firstly LIKE doesn't use indexing, so I'm having some really slow results. If I understand you correctly you could solve your problem by also adding a substr() predicate. Something like this: code:
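-- The original example was lost; the suggestion was to pair the LIKE with a
-- substr() predicate, along these lines (hypothetical table/column names):
SELECT *
FROM words
WHERE substr(word, 1, 3) = 'abc'
  AND word LIKE 'abc%';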
D. Richard Hipp (the SQLite author) posted some advice for a similar question at http://www.nabble.com/SQLite-Like-Query-Optimization-td15511886.html. D. Richard Hipp posted:LIKE operators cannot use indices unless the index is |
# ¿ Jul 21, 2008 18:23 |
|
Stephen posted:I've got two MySQL tables: Aggregating the votes table is probably what's taking so long. The query will out of necessity need to do a full table scan to satisfy the request, which can take a long time when the number of rows is large. You can run the query on its own to verify. There are not a lot of things you can do to tune this passively. You can make sure the votes table is minimized in size (by restricting the maximum length of the field data types) so that more rows can be retrieved at a time, and you can try caching the aggregation operation in some fashion. If you can do more drastic things I would encourage you to redefine the votes table so that it holds an aggregated total per canine instead of individual votes. The table is probably write few read many, so it makes sense to optimize for reading.
|
# ¿ Jul 31, 2008 01:15 |
|
Scarboy posted:Oracle, I'm not sure about the server version though. There's a pretty comprehensive thread on the topic on Ask Tom at the following URL: http://asktom.oracle.com/pls/asktom/f?p=100:11:2397328357508087::::P11_QUESTION_ID:229614022562
|
# ¿ Aug 5, 2008 20:31 |
|
Golbez posted:I'm somewhat of a noob to normalization, so which would be the best option here? Nobody should ever use C. Ever. A or B are fine. I would personally use A. Make sure you size your number column appropriately or you will reduce the benefit.
|
# ¿ Aug 6, 2008 15:11 |
|
Golbez posted:How do you mean? I figure a Tinyint would be perfectly fine. It wasn't a deep insight, I just meant to choose a datatype appropriate for the expected cardinality of the numeric column. Oracle only has the NUMBER data type, and if you do not size it then it allows maximum precision and each value can take up to 22 bytes.
|
# ¿ Aug 7, 2008 01:06 |
|
Carabus posted:I would appreciate some advice on function/algorithm design, no code yet. I'm interested in data mining a table recording web page views. There are nearly 1 million rows (don't ask) but I am only interested in the 80k or so which have nonempty user columns. Another relevant column is page_id. Isn't this code:
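-- The reply's code block was lost; given the question, it was presumably a
-- simple aggregation along these lines (hypothetical names):
SELECT user_name, page_id, COUNT(*) AS views
FROM page_views
WHERE user_name IS NOT NULL
GROUP BY user_name, page_id;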
|
# ¿ Aug 8, 2008 21:33 |
|
Aredna posted:Why is IFNULL() superior to COALESCE? In a mixed-version environment where one version does not support COALESCE it might make sense to standardize on operators available on all Production versions of the database. We are forced to, in some situations, ignore new features in our databases so that scripts will run with minimal modification on older versions. My personal preference is to use COALESCE/CASE in all situations and ignore our database's large number of null-handling functions.
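For what it's worth, COALESCE is ANSI and takes any number of arguments; the first non-null one wins:
code:
SELECT COALESCE(nickname, first_name, 'unknown') AS display_name
FROM users;  -- hypothetical table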
|
# ¿ Aug 8, 2008 21:53 |
|
SHODAN posted:Quick question: I'm using the MySQL .NET connector in C# and trying to add parameters to a command using something like the following for the command text: In Oracle you would use concatenation. code:
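-- The original snippet was lost; in Oracle SQL, strings are concatenated
-- with the || operator, along these lines (hypothetical names):
SELECT 'Hello, ' || first_name || '!' AS greeting
FROM employees;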
|
# ¿ Aug 31, 2008 01:31 |
|
LightI3ulb posted:I want to build a query that will use three tables, and I know the speed of each table individually, so I want to maximize the speed with the joins. That being said, does the order of events in the query have an effect on which one is completed first? It depends on your database vendor and version. Many modern databases leverage index and table statistics to generate their own query plan for answering your question in the way they believe will be the most efficient, such that the order in which you specify your tables and predicates does not matter.
|
# ¿ Sep 10, 2008 18:51 |
|
Uziel posted:I need some help querying some counts from Oracle. There's no intrinsic way to do this in Oracle until 11g (which adds pivot and unpivot). If you're using an older version you'll need to create a query with 80 columns, or create a filter outside of Oracle that will pivot your data set.
|
# ¿ Sep 12, 2008 17:26 |
|
Uziel posted:The server is running Oracle 10. The simplest way is like this: code:
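-- The original example was lost; a manual pivot in 10g uses one CASE-counting
-- column per bucket (hypothetical names):
SELECT dept,
       SUM(CASE WHEN status = 'OPEN'   THEN 1 ELSE 0 END) AS open_count,
       SUM(CASE WHEN status = 'CLOSED' THEN 1 ELSE 0 END) AS closed_count
FROM tickets
GROUP BY dept;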
|
# ¿ Sep 12, 2008 17:40 |
|
Victor posted:Have you tried the following? Dual is a dummy table in Oracle with exactly one row. You must select from a table in Oracle, for better or for worse.
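For example:
code:
SELECT SYSDATE FROM dual;  -- every Oracle SELECT needs a FROM clause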
|
# ¿ Sep 12, 2008 18:36 |