Xae
Jan 19, 2005

yaoi prophet posted:

Oracle is hint crazy. There are 124 different documented hints, and 57 undocumented.
The best part about Oracle hints is that they are just that, hints. If Oracle doesn't want to use an index, or do a table scan, you can't force it. It will do what it drat well pleases and gently caress you for suggesting otherwise. I have seen some crazy poo poo done to systems to try to force Oracle into a certain path. The worst is when people go nuts thinking Oracle hosed up the execution plan, when it was doing it correctly, and they try to force Oracle into using a slower method.


Oracle never ceases to amaze me with how impressive it can be, but also with how stubborn it can be.


Xae
Jan 19, 2005

Barrackas posted:

Well, as you say, this smacks of something that wasn't designed with PL/SQL in mind because it's true :) It's a massive MSSQL database (over a thousand stored procedures, hundreds of tables etc.) that is being ported to Oracle.

It would be desirable to keep a similar flow to the procedures between the T-SQL and PL/SQL versions, especially for maintenance, so I think it's acceptable for this project to use conventions that may be otherwise discouraged (unless there's some significant performance hit of course).

While we're talking about Oracle conventions, would you explain to me what's wrong with the things you've listed? I'm actually pretty much new to database development full stop, not just Oracle, so I'm still trying to get my mind out of OO programming mode.

In particular, what's wrong with explicit cursors and dynamic SQL using EXECUTE IMMEDIATE in procedures?

Cheers!
90% of the time you don't need dynamic SQL. Oracle will pre-parse most statements in stored procedures. Since that can't be done with dynamic SQL, dynamic statements often execute slower because Oracle may need to hard parse them.

Hard parsing is when Oracle creates an execution plan for the statement. For many SQL statements the act of parsing consumes more resources than executing the statement. Once Oracle has determined the execution plan for the statement it stores it for a while so that if similar statements come through it can use that plan instead of generating a new one.
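As a rough sketch (table and column names are made up), the static form below is fixed at compile time, so its parsed cursor can be cached and reused, while the EXECUTE IMMEDIATE form hands Oracle a string at run time; if you must go dynamic, at least use bind variables instead of concatenating literals:
code:
-- Static SQL: the statement is known at compile time and the parsed
-- cursor can be reused on every call.
CREATE OR REPLACE PROCEDURE get_emp_name(p_emp_id IN NUMBER, p_name OUT VARCHAR2) IS
BEGIN
  SELECT emp_name INTO p_name FROM employees WHERE emp_id = p_emp_id;
END;
/

-- Dynamic SQL: the statement is just a string until it runs.  Binding
-- (:id) at least lets the shared pool reuse the plan across calls.
CREATE OR REPLACE PROCEDURE get_emp_name_dyn(p_emp_id IN NUMBER, p_name OUT VARCHAR2) IS
BEGIN
  EXECUTE IMMEDIATE 'SELECT emp_name FROM employees WHERE emp_id = :id'
    INTO p_name USING p_emp_id;
END;
/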

Generally speaking you want the commit or rollback to be handled by a smarter language than PL/SQL. There may be some batch processes that you want PL/SQL to do the transaction handling for, but most of the time your Stored Procedures shouldn't be handling the transaction.

Xae
Jan 19, 2005

yaoi prophet posted:

Cursors are not evil, they are just vastly overused by people who aren't comfortable with implicit cursors or complex queries. Implicit cursors perform better and result in much cleaner code, and it's pretty rare to actually NEED an explicit cursor.
I can't say I have ever seen much of a difference between explicit and implicit cursors. I don't doubt that they are slightly faster, but I think any gains are pretty small when you get over a couple rows.

Xae
Jan 19, 2005

hey wiz posted:

Would this work? The SQL%FOUND is part of Oracle not just PL/SQL right?

I don't think it would. If there is no row Oracle should raise a NO_DATA_FOUND exception. The most common way to check for data before you select it is to do a SELECT COUNT(*) FROM <<stuff>>, as COUNT(*) will always return data, even if it is 0.
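A quick sketch of both approaches (table and column names are placeholders):
code:
DECLARE
  v_name VARCHAR2(100);
  v_cnt  NUMBER;
BEGIN
  -- Option 1: select and handle the exception Oracle raises when no row exists.
  BEGIN
    SELECT emp_name INTO v_name FROM employees WHERE emp_id = 42;
  EXCEPTION
    WHEN NO_DATA_FOUND THEN
      v_name := NULL;
  END;

  -- Option 2: COUNT(*) always returns one row, even if the count is 0.
  SELECT COUNT(*) INTO v_cnt FROM employees WHERE emp_id = 42;
  IF v_cnt > 0 THEN
    SELECT emp_name INTO v_name FROM employees WHERE emp_id = 42;
  END IF;
END;
/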

Xae
Jan 19, 2005

Anyone know a way I can make a procedure/function in Oracle that will allow comparisons of any two objects of the same type?

Xae
Jan 19, 2005

Another Oracle question here.

I'm still working on and off on a way to compare two generic objects in PL/SQL. gently caress, this is a goddamn runaround.

I devised a way of doing it by creating a massive overload clusterfuck (240 declarations, generated by another PL/SQL script) and letting Oracle determine the type through the overloading.

This method was vetoed by the client, with decent reasoning: issues keep popping up with the dynamic blind sorting and with the overloading.

They have instead decided to go with a Java method. While Java is able to handle generic objects, PL/SQL is not. I have been searching for a way to force some sort of serialization, or to somehow get into Oracle's kernel functions where generics are allowed in PL/SQL (see the sys.ANYDATA and sys.ANYTYPE types for examples), but it appears that Oracle blocks access to those. Even for the Java method they want, there still has to be a PL/SQL procedure "mapped" to the Java method.

So, anyway: Does anyone know of a way to have a single function in PL/SQL that will accept any type of object? Or somehow get raw binary or a pointer or SOMETHING that will allow a generic type?

Xae
Jan 19, 2005

uh zip zoom posted:

in MS SQL, I'm trying to alter a nullable integer column so that it's not null with a default value of zero.

How could this best be done using a constraint (since I believe that's the only way to go about it, since I don't believe you can add default values when altering a column)?
Update the column, setting it to 0 where it is null.
Create a before-insert row trigger to supply the default.
Add the constraint.

Xae
Jan 19, 2005

Alex007 posted:

Triggers ? For Default values ? God dammit this is a stupid idea.

This is what you're looking for:
code:
ALTER TABLE MyTable ADD DEFAULT (0) FOR MyCol
ALTER TABLE MyTable ALTER COLUMN MyCol int NOT NULL 
Yes, if you INSERT into that table without specifying a value for MyCol, it will use your default value.

What the gently caress do you think a default value is?

Xae
Jan 19, 2005

Alex007 posted:

Default values are better implemented using, well, default values, not triggers.

It's a chunk of code that goes "If blah is null, then set it to x", which is oddly similar to what a trigger would be. For almost any database the performance hit would be negligible. Regardless of the terminology, the code that runs is going to be shockingly similar.

Xae
Jan 19, 2005

Start with the Postgres Docs Link. Even if you don't use Postgres their docs are some of the best out there for most core DB things like Indexes, constraints, etc.

Xae
Jan 19, 2005

Anyone have experience with Teradata's Fastload or Multiload? I'm pretty familiar with Oracle's SQL*Loader, but I need to load a bit of data into a Teradata setup (1.5 million rows). Unfortunately the turnaround from the DBA/ETL team is measured in weeks, and I need something very fast.

Both of Teradata's programs are pissing me off to no end; they are seemingly needlessly complex and the documentation for them is borderline non-existent.

All I want to do is load a "|" delimited file with 6 columns into a database, it shouldn't be so loving difficult.

Xae
Jan 19, 2005

Triple Tech posted:

I don't know much about database design, but I'm not convinced the standard stored procedure benefit still comes through with dynamically generated SQL. It's like eval'ed code in a programming language, it's not semantically relevant up until the point that it's run, because it can always change before then. This isn't something predictable like bind parameters.

That's all I was trying to get at.

There is tremendous overhead in executing a SQL statement. Most of the work goes into figuring out what the statement is and what it wants, not actually getting the data (unless you are pulling tons of rows). A DBMS will pre-parse and pre-plan queries in a stored procedure. This means that when they execute, half the work is already done.

Think about all the rules for SQL: when you feed it dynamic statements it has to check each one to make sure it is valid. Then it has to plan out the joins and filters, figure out what indexes to use and what method to use for each join, and only after all that is done can it actually read the data.

My first job was rewriting Pro-C into PL/SQL. Simply sitting on the database and not having everything fed through as dynamic statements resulted in orders of magnitude better performance. It took processes from days to hours. In addition, many stored procedure languages have custom keywords and operations that are designed to be stupid fast, like Oracle's "BULK COLLECT".
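For example, here is a rough sketch of BULK COLLECT (table and column names are made up), pulling a whole result set in one shot instead of fetching row by row:
code:
DECLARE
  TYPE t_id_tab IS TABLE OF orders.order_id%TYPE;
  v_ids t_id_tab;
BEGIN
  SELECT order_id
    BULK COLLECT INTO v_ids
    FROM orders
   WHERE order_date >= DATE '2008-01-01';

  -- v_ids now holds every matching id; loop over it in memory.
  FOR i IN 1 .. v_ids.COUNT LOOP
    NULL;  -- process v_ids(i) here
  END LOOP;
END;
/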

Xae fucked around with this message at 16:02 on Aug 23, 2008

Xae
Jan 19, 2005

Have you thought of using a Surrogate key? One that has no "real meaning" but is just a unique number to link the tables?
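A minimal sketch, assuming MySQL-style auto_increment and made-up names; the id columns carry no business meaning, they just link the rows:
code:
CREATE TABLE customer (
  customer_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- surrogate key
  tax_number  VARCHAR(20),                              -- "real world" identifiers stay as ordinary columns
  name        VARCHAR(100)
);

CREATE TABLE customer_order (
  order_id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES customer (customer_id)
);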

Xae
Jan 19, 2005

No Safe Word posted:

The short answer is: you shouldn't care what the value of an auto_increment field is.

I believe there are database-specific ways of telling it to recycle old values (can't guarantee it's supported on all databases of course), but there should be no reason you should need the value of an auto_incrementing field to have any "business meaning" except as an identifying number.

You are completely correct. Though I should add that you shouldn't care that there are gaps in the sequence either.

If it really bothers you switch to a logical delete.
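A minimal sketch of a logical delete (names are made up): rows are flagged rather than removed, so deletes never punch holes in the data:
code:
ALTER TABLE orders ADD deleted_flg CHAR(1) NOT NULL DEFAULT 'N';

-- "Delete" a row by flagging it instead of removing it.
UPDATE orders SET deleted_flg = 'Y' WHERE order_id = 123;

-- Normal queries just filter on the flag.
SELECT * FROM orders WHERE deleted_flg = 'N';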

Xae
Jan 19, 2005

I need a way to export huge (20+ GB) data sets from Oracle as fast as possible. However, there are some problems:

1) It needs to be by query, not by Table.
2) It must be to my local machine. The network admins have refused to give me a couple dozen gigs on the network drives to dump these.
3) Variable formatting. I have to pull in the format the requester wants.
4) Preferably some way to break on file size or records retrieved. I need to be able to fit these on a DVD or CD.
5) I can't actually run code on the system, so no PL/SQL.


This is probably impossible, but it is the situation I find myself in. I'll charitably describe the situation as a "clusterfuck", but it is the assignment I have.

Right now I just run queries through SQL*Plus and spool them, but this is stupidly slow and I generally dislike spending 20-30 hours on a weekend babysitting my laptop at work. I have looked at SQL Developer, but it only seems to support export by table, not by query.
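For reference, the spool setup looks roughly like this (settings and names are approximate):
code:
SET HEADING OFF
SET FEEDBACK OFF
SET TERMOUT OFF
SET PAGESIZE 0
SET LINESIZE 2000
SET TRIMSPOOL ON
SPOOL c:\extracts\widget_dump.txt

SELECT col1 || '|' || col2 || '|' || col3
  FROM some_schema.some_table
 WHERE sale_date >= DATE '2009-01-01';

SPOOL OFF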

Any ideas?

Xae
Jan 19, 2005

Jethro posted:

It seems to me that if you are retrieving specific queries with specific formatting, you aren't "dumping data" you are generating reports. So maybe look into some sort of reporting tool?

Most of the formatting is simply switching between delimiters in columns. Reports are generally human-readable; these I consider "dumping data" because they are thousand-character-long '|' or ',' delimited files. The files are to be loaded into external (to the company) databases, not read by humans.

Xae
Jan 19, 2005

MoNsTeR posted:

http://www.quest.com/toad-for-mysql/

TOAD > *

PL/SQL Developer > TOAD (Oracle only).

It only costs a fifth as much too!

It is nice to see some decent tools for MySQL; hopefully companies start to follow suit and offer them for PostgreSQL as well.

Xae
Jan 19, 2005

tayl0r posted:

That is a good point, but you could easily write stored procedures for everything the application developers want to do to the database (inserts, updates, deletes), and then only give them access to select + run the stored procedures.

The DBAs / database developers could own the stored procedures, or at least look over the code first to ensure the application developers weren't doing anything stupid with them.

Plus, it's a lot more secure this way since less people will have insert / update / delete access to the database.

Logical deletes make a lot of sense, particularly if you are dealing with anything involving money. You really don't want to ever lose a record involving cash. It has less to do with worrying that you will screw up and more to do with keeping a history.

Xae
Jan 19, 2005

MoNsTeR posted:

I could not work without TOAD, period. It is worth 10 times whatever ridiculous price they charge for it.

Unless you are a DBA, TOAD is a waste. PL/SQL Developer is cheaper and far superior for development work.

Xae fucked around with this message at 20:12 on Dec 20, 2008

Xae
Jan 19, 2005

Markoff Chaney posted:

I'm going to go out on a limb here and ask what is superior about PL/SQL Developer specifically other than the price tag. I've used TOAD my whole career and love it, other than the random crashes (which are pretty infrequent in TOAD 9). I'm almost pure development, other than having some DBA privs to do things like kill sessions and add datafiles when a tablespace fills up on a development box. What can PL/SQL developer do for me that TOAD cannot?

Not crash, multi-threaded so one query won't lock the whole thing, not crash, superior interface, vastly better debugger, built in testing, doesn't crash.

The interface as a whole is miles better. It is the only development tool I have ever purchased out of my own pocket. I ended up having a client buy an unlimited license to support the people who wanted to ditch TOAD after seeing me use PL/SQL Developer. The unlimited license was less than the cost of 6 TOAD licenses.

Xae fucked around with this message at 05:04 on Dec 23, 2008

Xae
Jan 19, 2005

MoNsTeR posted:

The earliest TOAD I've used is 7.5 and that's multi-threaded, so I'm not sure what ancient version you've tried. Since upgrading to 9.5 I've had maybe 10 crashes total, that's nearly a solid year of using it 8-10 hours a day.

To be fair though I'll download the PL/SQL Developer trial.

Edit:
Hoooooo boy, where to start.
1. Can't have more than one statement in a SQL window
2. no Describe/View/whatever hotkey
3. View menu command is woefully underpowered (where's my Used By tab?!) and schizophrenic to boot (why is there a separate View and View Spec & Body?)
4. Explain Plan pops up a separate window and is less readable

These may seem small but they are so integral to my minute-by-minute productivity that they are dealbreakers.

I do like how it uses line-printer style white/green in the query results window, how the editor itself behaves, that it's lightweight and fast, and how it manages to make a bit more efficient use of screen real estate. If you really couldn't afford TOAD and didn't like Oracle SQL Developer this could work, but by no means could it replace TOAD for a seasoned user.

Now, a lot of that is familiarity. Someone who's never used TOAD isn't going to miss their ultra-powerful F4 key, isn't going to expect Ctrl+Enter to run queries and Ctrl+E to explain plan. And another big chunk is what you're working on and how you personally organize it. Maybe you have a printed ERD and all your packages open in editors so you don't need that describe hotkey like someone with thousands of tables and hundreds of procedures. Maybe you don't organize your objects by schema so the behavior of the object browser makes sense.

It's all personal preference ultimately so if this thing floats your boat, hey great you get to save kajillion dollars on TOAD licenses as gravy. But no one who actually knows how to leverage TOAD's functionality would ever switch to it. I would bet a hundred bucks easy that no DB developer in my office would prefer it.

1) Yes, you can.
2) I think you can control-click a table name, or any object name, to get the description
3) Because you don't always need to see the body? It comes in handy when you don't want to grab a 20k SLOC package just to see the note in the spec.
4) You can alter the explain plan window quite a bit.

You can set up custom object filters that let you peek at just the objects you want.

It is also fairly easy to develop your own plug-ins to add any TOAD-like features you just can't do without. Almost the entire program can be customized.

Xae
Jan 19, 2005

Victor posted:

My shop has run into problems with tons of business logic being embedded in triggers, so it is perhaps an esoteric issue.

The danger of triggers is that people forget about them, or that the business case changes and you have to delve through layer after layer of triggers to change the code.

If you can count on good documentation they are perfectly fine to use, but the reality is that the documentation will inevitably suck.

Cascading triggers (triggers inserting/updating other tables, whose triggers update other tables, etc.) have been responsible for some of my biggest headaches. I spent 4 weeks mapping out triggers for a small claims system. Most people had forgotten which triggers did what, or even that some existed. The system ended up doing an asinine amount of duplicated work.

I dislike the whole "automagic" system that triggers create. People should know what the system is doing without diving through a couple dozen triggers.

Xae fucked around with this message at 01:47 on Jan 5, 2009

Xae
Jan 19, 2005

Victor posted:

Curious. I'm not aware of any such issues with our triggers, but I have not personally worked with the more complicated scenarios. What I will say is that it is nice to be able to write normal insert/update statements to do things instead of executing sprocs that necessarily have to operate one-at-a-time, at least until we get table-valued parameters in SQL2008. I don't know what the TVP support situation is for other databases. Having to operate through a web service seems even more gross.

Perhaps you have comments on Kenneth Downs advocating triggers? He and I appear to agree, although I have not investigated this thoroughly.

Tom Kyte from Oracle is able to explain my aversion to triggers far better than I can. I ran into a similar situation where I used them to maintain denormalized data, much like Kenneth Downs' first example.


Triggers have their place, such as setting watermark fields or simple checks, but any 'business logic' (I loving hate that phrase, but it works) should be explicit.
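A sketch of the kind of watermark trigger I mean (table and column names are made up):
code:
CREATE OR REPLACE TRIGGER trg_orders_watermark
BEFORE INSERT OR UPDATE ON orders
FOR EACH ROW
BEGIN
  -- Simple housekeeping only: stamp who touched the row and when.
  :NEW.last_updated_by   := USER;
  :NEW.last_updated_date := SYSDATE;
END;
/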

Aside from simple architecture preference, there can be some wonky things that happen with triggers and transactions. I can't remember the code of the ORA error, but my previous employer got them by the bucketful.

Xae
Jan 19, 2005

The issue is less about when the original team is working on the project and more about 5-10 years later. Databases tend to long outlive the applications built on them. I do a bit of digging in old databases, and time after time some 'clever' trigger solution is the cause of problems later on.

Oracle's mutating table error causes even more headaches. The intent is to tell people they are bumping into ACID problems; the reality is that they come up with convoluted schemes to circumvent it.

Xae
Jan 19, 2005

Anyone have ideas on how to pull out data in a denormalized fashion?

I have two tables related as many-to-one. I want the output to come out as


"Parent.Column1 | Parent.Column2 | Child.Column1-A,Child.Column1-B...Child.Column1-N | Parent.Column3"

Xae
Jan 19, 2005

Victor posted:

Some RDBMSes allow you to do this easily. For example, SQL2005 has the pivot operator.

Thanks for the tip, I'll have to check out the docs and see if Teradata supports it. Oh god, another night digging through Teradata docs :suicide:.

Xae
Jan 19, 2005

Victor posted:

That would probably actually be a good way to explain the pivot operator...

The problem is that there are '...N' possible values. We kicked some ideas around, but we couldn't figure out a way to flatten a child table into one column in the script. We'll have to mess with the ETL side of the system, or just say 'gently caress it'.

Xae
Jan 19, 2005

Victor posted:

Yeah, so you code-gen the TSQL...

Yeah, if I could use a procedural language it would be easy to do with a stored proc. But the Support Team has a pretty steep charge to put in a stored procedure, and no one wants to pay it.

I ended up just telling them it was impossible unless they wanted to pay. The joys of internal billing.

Xae
Jan 19, 2005

Victor posted:

sp_executesql

That is sort of nifty; I'll have to keep it in mind for the SQL Server systems. I don't see a Teradata equivalent, though.

Xae
Jan 19, 2005

camels posted:

On the off chance that this might be feasible - if the result set is not too large, would it be possible to do the pivoting on the client side, with a reporting tool?

The result set is in the tens of millions of rows and dumps to flat files.

It is definitely possible to do, and in normal circumstances that is probably how I would do it. But right now budgets are getting slashed, so things like this that are nice to have but expensive aren't being funded.

Xae
Jan 19, 2005

It wasn't until I used Teradata that I realized how badly all other explain plans sucked.

code:
Explanation
 1) First, we lock ADHOC.widgets for access, we lock
    PROD.SALE_TRANS_LINE for access, we lock
    PROD.SALE_TRANS for access, we lock
    PROD.SALE_TRANS_PARTY for access, we lock
    PROD.PARTY_CUSTOMER for access, and we lock
    PROD.INVENTORY_ITEM for access.
 2) Next, we do an all-AMPs RETRIEVE step from
    ADHOC.widgets by way of an all-rows scan with a
    condition of ("(NOT (ADHOC.widgets.widget_ITEM_ID IS
    NULL )) AND ((NOT (ADHOC.widgets.SALES_ID IS NULL )) AND
    (NOT (ADHOC.widgets.SALE_DATE IS NULL )))") into
    Spool 2 (all_amps), which is built locally on the AMPs.  Then we
    do a SORT to partition Spool 2 by rowkey.  The size of Spool 2 is
    estimated with no confidence to be 368,466 rows.  The estimated
    time for this step is 0.03 seconds.
 3) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a
    RowHash match scan, which is joined to PROD.SALE_TRANS_LINE by
    way of a RowHash match scan with a condition of ("NOT
    (PROD.SALE_TRANS_LINE.ITEM_ID IS NULL)").  Spool 2 and
    PROD.SALE_TRANS_LINE are joined using a rowkey-based
    sliding-window merge join (contexts = 1, 16), with a join
    condition of ("(SALES_ID = PROD.SALE_TRANS_LINE.SALES_ID) AND
    (SALE_DATE = PROD.SALE_TRANS_LINE.SALE_DATE)").  The input
    table PROD.SALE_TRANS_LINE will not be cached in memory.  The
    result goes into Spool 3 (all_amps), which is redistributed by
    hash code to all AMPs.  Then we do a SORT to order Spool 3 by row
    hash.  The size of Spool 3 is estimated with no confidence to be
    357,487 rows.  The estimated time for this step is 8.29 seconds.
....
 7) We execute the following steps in parallel.
      1) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of
         a RowHash match scan, which is joined to
         PROD.INVENTORY_ITEM by way of a RowHash match scan with
         no residual conditions locking PROD.INVENTORY_ITEM for
         access.  Spool 6 and PROD.INVENTORY_ITEM are joined using
         a merge join, with a join condition of (
         "PROD.INVENTORY_ITEM.ITEM_ID = widget_ITEM_ID").  The
         result goes into Spool 9 (all_amps), which is redistributed
         by hash code to all AMPs.  Then we do a SORT to order Spool 9
         by row hash.  The size of Spool 9 is estimated with no
         confidence to be 469,527 rows.  The estimated time for this
         step is 0.34 seconds.
      2) We do an all-AMPs RETRIEVE step from PROD.PARTY_CUSTOMER
         by way of an all-rows scan with no residual conditions into
         Spool 10 (all_amps), which is redistributed by hash code to
         all AMPs.  Then we do a SORT to order Spool 10 by row hash.
         The result spool file will not be cached in memory.  The size
         of Spool 10 is estimated with high confidence to be
         179,947,614 rows.  The estimated time for this step is 1
         minute and 27 seconds.
 8) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of a
    RowHash match scan, which is joined to Spool 10 (Last Use) by way
    of a RowHash match scan.  Spool 9 and Spool 10 are left outer
    joined using a merge join, with a join condition of ("(CUST_ID )=
    (TRANSLATE((PARTY_ID )USING LATIN_TO_UNICODE)(FLOAT, FORMAT
    '-9.99999999999999E-999'))").  The result goes into Spool 1
    (group_amps), which is built locally on the AMPs.  Then we do a
    SORT to order Spool 1 by the sort key in spool field1.  The size
    of Spool 1 is estimated with no confidence to be 469,527 rows.
    The estimated time for this step is 7.31 seconds.
 9) Finally, we send out an END TRANSACTION step to all AMPs involved
    in processing the request.
 -> The contents of Spool 1 are sent back to the user as the result of
    statement 1.  The total estimated time is 2 minutes and 10 seconds.
My Favorite part:

quote:

We do an all-AMPs RETRIEVE step from PROD.PARTY_CUSTOMER
by way of an all-rows scan with no residual conditions into
Spool 10 (all_amps), which is redistributed by hash code to
all AMPs. Then we do a SORT to order Spool 10 by row hash.
The result spool file will not be cached in memory. The size
of Spool 10 is estimated with high confidence to be
179,947,614 rows. The estimated time for this step is 1
minute and 27 seconds.


Retrieve 180,000,000 rows, hash and sort them in 90 seconds?

:hellyeah:

Compare with Oracle's explain plan (different query):
code:
Plan Table
------------------------------------------------------------------------------------------------------------ 
| Operation                  |  Name    |  Rows | Bytes|  Cost  |  TQ  |IN-OUT| PQ Distrib | Pstart| Pstop | 
------------------------------------------------------------------------------------------------------------ 
| SELECT STATEMENT           |          |    1  |  51  |      3 |      |      |            |       |       |
|  HASH JOIN                 |          |     1 |   51 |      3 | 0,01 | P->S | QC (RANDOM)|       |       | 
|   PARTITION RANGE ITERATOR |          |       |      |        | 0,01 | PCWP |            |     4 |     5 | 
|    PARTITION HASH ALL      |          |       |      |        | 0,01 | PCWP |            |     1 |     3 | 
|     TABLE ACCESS FULL      |EMP_COMP  |     3 |   87 |      1 | 0,01 | PCWP |            |    10 |    15 | 
|   TABLE ACCESS FULL        |DEPT      |    21 |  462 |      1 | 0,00 | P->P | PART (KEY) |       |       | 
------------------------------------------------------------------------------------------------------------ 
9 rows selected.

Xae fucked around with this message at 17:58 on Jan 23, 2009

Xae
Jan 19, 2005

Triple Tech - What is the projected rate of change? If only a small number of entries change you can just set up the table with Record Begin/End Dates and a REC_CURRENT_FLG.

Xae
Jan 19, 2005

Triple Tech posted:

What is the flag in this model used for? It's not enough to just select the one with the max date? I'm sure the flag helps trivialize this process. It just seems redundant, data-wise. That or make a trigger-powered table of latest entries and just a historical dump table of previous incarnations.

Projected rate of change? Should be zero. This processes is sort of an alert filter against real life.

Some people hate using BETWEEN and think it kills performance, so I add the flag out of habit now so I don't have to argue with people. You can keep a complete history and still have an easy select by using a view.
code:
CREATE OR REPLACE VIEW the_view AS
SELECT *
  FROM big_table
 WHERE rec_current_flg = 'Y';

-- or

CREATE OR REPLACE VIEW the_view AS
SELECT *
  FROM big_table
 WHERE SYSDATE BETWEEN rec_beg_dt AND rec_end_dt;
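The maintenance side is just two statements per change, roughly (names and bind variables are placeholders):
code:
-- Expire the current row...
UPDATE big_table
   SET rec_end_dt      = SYSDATE,
       rec_current_flg = 'N'
 WHERE nat_key = :key
   AND rec_current_flg = 'Y';

-- ...and insert the new version as the current row.
INSERT INTO big_table (nat_key, some_value, rec_beg_dt, rec_end_dt, rec_current_flg)
VALUES (:key, :new_value, SYSDATE, DATE '9999-12-31', 'Y');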

Xae
Jan 19, 2005

Hammerite posted:

Suppose that I have a table called User and a table called UserLogins. The table UserLogins has the columns UserID, LoginTime (together making the table's Primary Key), and EnteredNumber. EnteredNumber is an integer between 0 and 9 that the user is required to enter when he logs in.

Suppose I want to use a query to create a table of users that is sorted according to the total of all the numbers that a user has ever entered when he logged in, in descending order... but if a user has logged in more than 100 times, I only want to take into account the last 100 times he logged in. How do I go about doing this?

(The task I am trying to figure out how to do actually has nothing to do with users logging in; I just decided to express it in a generic setting. So if your response is "That sounds ridiculous, why are you trying to do that?", yes it is ridiculous and no I'm not really trying to do that, but please tell me how I would do it anyway.)
Try this:
code:
SELECT user_id, SUM(num_entered)
  FROM (SELECT user_id, num_entered
          FROM (SELECT user_id, num_entered
                  FROM base_table
                 WHERE user_id = :the_user_id   -- hypothetical bind for one user
                 ORDER BY time_stamp DESC)
         -- ROWNUM is applied after the inner ORDER BY, so this keeps the last 100 logins
         WHERE ROWNUM < 101) temp_work_table
 GROUP BY user_id;

Xae
Jan 19, 2005

Hammerite posted:

Thanks for the suggestion. I ought to have mentioned that I am constrained to using MySQL. Looking on the internet, I see that ROWNUM is a feature that other RDBMSs have but that MySQL does not have. Is there a way of doing it that will work in MySQL?

Edit: Also, I want to make sure it's clear that I want to apply the "most recent 100 logins" requirement to the users individually, not to the output as a whole. So if I have 3 users Adam, Beth and Chloe who have logged in 120, 50 and 90 times respectively, I want to get back three rows: the sum of Adam's last 100 login numbers, the sum of all of Beth's login numbers and the sum of all of Chloe's login numbers. And the only ordering in time of the logins that matters is the ordering of the individual user's logins relative to one another; it shouldn't matter when Chloe's logins occurred relative to Adam's, for example.

You will need to use a work table to store the results then. MySQL supports the LIMIT clause to limit the number of rows.
Pseudo Code
code:
get distinct user_ids

Loop through user_ids
	insert into work_table
	select * from base_table
	where user_id = cursor.user_id
	ORDER BY time_stamp desc
	LIMIT 100;
end loop
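In actual MySQL the body of that loop would look something like this (table and column names are made up), run once per user, with a final sum over the work table:
code:
-- Inside the loop: keep only this user's last 100 logins.
INSERT INTO work_table (user_id, num_entered, time_stamp)
SELECT user_id, num_entered, time_stamp
  FROM base_table
 WHERE user_id = 42          -- the current user_id from the cursor
 ORDER BY time_stamp DESC
 LIMIT 100;

-- After the loop: one row per user, sorted by the total.
SELECT user_id, SUM(num_entered) AS total
  FROM work_table
 GROUP BY user_id
 ORDER BY total DESC;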

Xae
Jan 19, 2005

Sergeant Hobo posted:

So index the important stuff then? Like primary/alternate keys, fields you know are going to involve queries, etc.?

PKs almost always create an index automatically.

I normally throw indexes on the columns that end up in where and group by clauses the most.
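For example (hypothetical table and columns):
code:
-- Index the columns that show up in WHERE and GROUP BY the most.
CREATE INDEX idx_orders_cust_date ON orders (customer_id, order_date);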

Xae
Jan 19, 2005

If you want to mess around a bit you can try using RANK. It is handy to know, and this is a good excuse to learn it.
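A quick sketch of what it looks like (made-up names, assuming a database with analytic functions such as Oracle, SQL Server, or Teradata):
code:
-- Rank each employee's salary within their department.
SELECT dept_id,
       emp_name,
       salary,
       RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS salary_rank
  FROM employees;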

Xae
Jan 19, 2005

In PL/SQL how do I use a collection in an "IN" clause?

I was certain I had done this before, but I can't remember or Google the syntax.

Xae
Jan 19, 2005

MoNsTeR posted:

where something in (select * from table(t_foo))
That was what I couldn't find, thanks.
I plan on doing a BULK COLLECT though. I'm rewriting it to only use one select instead of the N it currently uses.
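For the record, the full pattern looks roughly like this (the type and names are made up); the collection type has to be declared at the SQL level, not inside the PL/SQL block, for the TABLE() operator to see it:
code:
CREATE OR REPLACE TYPE t_num_list AS TABLE OF NUMBER;
/

DECLARE
  v_ids   t_num_list := t_num_list(10, 20, 30);
  v_total NUMBER;
BEGIN
  -- Use the collection directly in the IN clause: one select instead of N.
  SELECT SUM(amount)
    INTO v_total
    FROM orders
   WHERE customer_id IN (SELECT column_value FROM TABLE(v_ids));
END;
/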


Xae
Jan 19, 2005

dwayne_dibbley posted:

I have a table that stores the history of status changes for accounts but have some bad rows in there, here is a simplified example:
code:
account_no	status	valid_from	valid_to
1		A	01-JAN-2009	22-JAN-2009
1		B	23-JAN-2009	01-FEB-2009
1		C	02-FEB-2009	12-FEB-2009
1		B	13-FEB-2009	14-FEB-2009
1		B	15-FEB-2009	22-FEB-2009
1		B	23-FEB-2009	03-MAR-2009
1		A	04-MAR-2009	17-MAR-2009
Those 3 consecutive rows for status B should be combined into one row like so:
code:
account_no	status	valid_from	valid_to
1		A	01-JAN-2009	22-JAN-2009
1		B	23-JAN-2009	01-FEB-2009
1		C	02-FEB-2009	12-FEB-2009
1		B	13-FEB-2009	03-MAR-2009
1		A	04-MAR-2009	17-MAR-2009
(This problem applies to many account_no values and different statuses)

Anyone have any idea how to do this row combination in SQL? I can do it programatically using row-by-row comparisons but for a 22 million row table it will take many hours...

Thanks.

code:
SELECT account_no, status, min(valid_from), max(valid_to)
  FROM account_status
GROUP BY account_no, status;
