benisntfunny
Dec 2, 2004
I'm Perfect.

Mashi posted:

My question is: What is the cleanest / most efficient way to retrieve this data as if it were stored the regular horizontal way? The way I'm doing it right now is:

code:
SELECT p.*,
       (SELECT property_value
        FROM   person_properties pp
        WHERE  pp.person_id = p.id
               AND property_name = "name") AS name,
       (SELECT property_value
        FROM   person_properties pp
        WHERE  pp.person_id = p.id
               AND property_name = "hair_color") AS hair_color,
       (SELECT property_value
        FROM   person_properties pp
        WHERE  pp.person_id = p.id
               AND property_name = "likes_tuna") AS likes_tuna
FROM   people p
WHERE  id = 45
I'd like to do this using a single join but I can't find a legal way to dynamically set column names. So is there a way, or is this the best I am going to get?

I can think of ways to do that dynamically using dynamic SQL, but what's the point? It will cost just as much and probably make your code longer. With the way your table is set up (which I'd hate) you're pretty much doing 3 scans any way you slice it. This type of data should have been pre-determined and stored horizontally... or in XML... but you probably already know that...
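(For what it's worth, the closest you'll get to a single-join version is conditional aggregation - a sketch against the same tables, one scan of person_properties, though I still wouldn't want to maintain it:)
code:
SELECT p.id,
       MAX(CASE WHEN pp.property_name = 'name'       THEN pp.property_value END) AS name,
       MAX(CASE WHEN pp.property_name = 'hair_color' THEN pp.property_value END) AS hair_color,
       MAX(CASE WHEN pp.property_name = 'likes_tuna' THEN pp.property_value END) AS likes_tuna
FROM   people p
LEFT JOIN person_properties pp
       ON pp.person_id = p.id
WHERE  p.id = 45
GROUP  BY p.id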


benisntfunny
Dec 2, 2004
I'm Perfect.

Triple Tech posted:

Maybe I'm misunderstanding something. Can INSERT insert multiple rows? So are you INSERTing the SELECTion of the temporary table into the main table?

I have no idea what you're TRYing to ask. I really don't think you're asking if INSERT can do this?
code:
INSERT INTO dbo.WhatTheFuckAreYouTalkingAbout (column1, column2, column3)
SELECT column1,
       column2,
       column3 
FROM   ##myHotGlobalTempTable
Because of course it can.

benisntfunny fucked around with this message at 05:17 on Dec 20, 2007

benisntfunny
Dec 2, 2004
I'm Perfect.

Xae posted:

For fucks sake people, learn to use non-full/inner/equal joins.

I think my favorite was something I found that was written to generate some kind of drop-down menu. It was something to the effect of

code:
BEGIN TRAN

SELECT Option1,
       Option2,
       Option3
FROM   SomeTable

END TRAN

benisntfunny
Dec 2, 2004
I'm Perfect.

bloodsoup posted:

So, I would like to find out:

What version of sql am I using? (I connect to AS/400 with Sequel Viewpoint, if that helps.)

What are the names of all the tables, and the last time they were accessed? (There are something like 10,000 of them.)

What are the names and descriptions of the fields in the tables I use? (I use maybe 20.)

How are these tables linked (what fields do they have in common)?
I'm not 100% sure on this, but it sounds like your database is probably DB2.

If that's the case, and you have all these tables this page might answer some questions:
http://www.devx.com/dbzone/Article/29585/1954?pf=true

Descriptions of the fields you use? I doubt they're in the database itself, given there are 10,000 tables as you claim, but does your company use a data modeling program? If your database is in there, it might provide descriptions, especially if the database ever had to integrate with a new one.
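That said, DB2 on the AS/400 does let you attach text descriptions to tables and columns, and when someone bothered to enter them they show up in the QSYS2 catalog views - worth a quick look before hunting down a modeling tool. A sketch (column names from memory, so double-check against the catalog docs):
code:
SELECT TABLE_SCHEMA,
       TABLE_NAME,
       TABLE_TEXT      -- optional table description, if one was entered
FROM   QSYS2.SYSTABLES

SELECT COLUMN_NAME,
       DATA_TYPE,
       COLUMN_TEXT     -- optional field description
FROM   QSYS2.SYSCOLUMNS
WHERE  TABLE_NAME = 'SOMETABLE'   -- hypothetical table name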

benisntfunny
Dec 2, 2004
I'm Perfect.

Golbez posted:

The following search:

code:
 SELECT ... WHERE MATCH (text fields) AGAINST ('"test range"' IN BOOLEAN MODE) 
will return all results with that exact phrase. So, what if I want to find 'test ranges'? Is there a way to smarten up the mysql literal string search, or do I have to hack in plurals? Note that this gives me 65 results, but if I take it out of boolean mode - so that it's looking for each word individually - I get thousands of results, because a lot of records have either test or range. Any ideas?

I don't work with MySQL but if I were writing this in SQL Server it would be
code:
SELECT ... WHERE TEXT_FIELDS LIKE 'Test Range%' 
Of course you could always use _ for a single wildcard. I doubt MySQL has stemming built into it.. but you could probably create a function to stem words.


EDIT:
oh and depending on my data and the word I was searching I might consider...
code:
SELECT ... WHERE TEXT_FIELDS IN ('Test Range','Test Ranges','Test Ranger','Testing Range','Tested Range')
This of course only works if you know the initial values beforehand. The wildcard approach of course will work with variables.. you could say
code:
SELECT ... WHERE TEXT_FIELDS LIKE @WORD+'%'
but you might get things like 'Test Range.' Then again, depending on the fields you're searching (I assume it is literally a TEXT type) you could have the option to do a replace first and remove all periods, replacing them with spaces. This doesn't work on TEXT types in SQL Server.. you'd have to convert them to VARCHAR. I don't know if MySQL is the same way.
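In SQL Server terms that would look something like this (TEXT_FIELDS is the made-up column name from above, with the convert to get around the TEXT limitation):
code:
SELECT ...
WHERE  REPLACE(CONVERT(VARCHAR(MAX), TEXT_FIELDS), '.', ' ') LIKE 'Test Range%'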

I think you have a lot of different options to choose from depending on exactly what kind of data you're looking at and how this query will be executed.

benisntfunny fucked around with this message at 15:00 on Jan 7, 2008

benisntfunny
Dec 2, 2004
I'm Perfect.

chocojosh posted:

I think you want the between clause.

http://www.w3schools.com/sql/sql_between.asp

Call me crazy but I don't think that's at all what he's looking for. He's asking how to stem words, not look for the items that fall in the middle of them.

benisntfunny
Dec 2, 2004
I'm Perfect.

cletus42o posted:

I got around it by just getting a list of league IDs, and looping over that and only summing up the results a league at a time, but I'd prefer to have it all in one query. Anyone have any ideas?

Well, I would have written a loop myself with dynamic SQL, just doing a UNION over and over. Probably a horrible approach. I did a quick search and found:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
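The gist of that link is a self-join against the grouped max - roughly this, with made-up league/score names:
code:
SELECT s.*
FROM   scores s
INNER JOIN (
    SELECT league_id,
           MAX(points) AS max_points
    FROM   scores
    GROUP BY league_id
) m
    ON m.league_id = s.league_id
   AND m.max_points = s.points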

By the way, I don't think your inner join works at all. Your inner join selects 15 points... unless I'm reading something wrong. I probably am.

benisntfunny
Dec 2, 2004
I'm Perfect.

Ardhanari posted:

Replace the last two WHERE conditions with a BETWEEN?

That really shouldn't make much of a difference at all.. that query is nothing.. sounds like your machine must just really be a piece of poo poo.

Sort of like when I had Dreamhost hosting my MySQL databases...

benisntfunny
Dec 2, 2004
I'm Perfect.

deimos posted:

Or it's misconfigured (using my-small.cnf or something similar).
There's also this... since I don't know the exact data setup...
http://www.forbiddenweb.org/topic/12493/index.html

or..
Also to be sure it's actually using the index you could force it on there.
http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
I've run into trouble with some of my SQL Server queries being "optimized" to run without utilizing an index. I'm sure this problem isn't exclusive to Microsoft.
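The hint syntax itself is a one-liner - a sketch with made-up table/index names:
code:
SELECT *
FROM   some_table FORCE INDEX (ix_some_col)
WHERE  some_col BETWEEN 100 AND 200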

benisntfunny
Dec 2, 2004
I'm Perfect.

minato posted:

Option 2) Use a single table:
-----------------------------
code:
table: "event"
event_id | event_time | forum_thread_id | file_id | user_id
Each row contains an event_id/event_time, and exactly ONE of "forum_thread_id", "file_id", and "user_id" will ever be non-NULL.

Pros:
+Simple to understand
+Means only searching one table, less schema management
+A new column is all that's required to add a new type.
Cons:
-Should implement a row "CHECK" to ensure that exactly 1 of the event-specific fields was chosen.
-A little more code required to identify which type of event this is.
Use option 2 with an index on event_time... you might choose other indexes but at least do that..

Option 2 is basically how my company does logging enterprise-wide. Not that everything we do is right... lots of times it's not.. but it seems to work pretty well overall.
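If you go that route, the "CHECK" from minato's con list plus the index are only a few lines - a sketch in standard SQL (note some engines, older MySQL included, parse but ignore CHECK constraints):
code:
ALTER TABLE event
  ADD CONSTRAINT chk_one_event_type CHECK (
        (CASE WHEN forum_thread_id IS NOT NULL THEN 1 ELSE 0 END)
      + (CASE WHEN file_id         IS NOT NULL THEN 1 ELSE 0 END)
      + (CASE WHEN user_id         IS NOT NULL THEN 1 ELSE 0 END) = 1
  );

CREATE INDEX ix_event_time ON event (event_time);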

benisntfunny
Dec 2, 2004
I'm Perfect.
SQL Server 2005...

I have a query that is far too long to paste here. Basically we have a large content database, and I wrote a query to audit all the potential issues we have with content on a daily basis... the Title equals the Description... odd characters finding their way into the content, etc...

One of the items I have in there is a hits count based on a summary table with about 3,000,000 rows give or take a million. The hits table has an index on the date.

As a query this all completes in about 20 minutes.

Using the exact same query, the stored procedure version had run over 5 hours before I stopped it because it was annoying me.

When the stored procedure executes, it chooses NOT to use the index on my hits table. I've tried recompiling and giving the procedure table hints, but it doesn't seem to matter.
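For concreteness, the recompile and hint attempts looked roughly like this (hypothetical proc/table/index names):
code:
-- flag the proc for recompilation
EXEC sp_recompile 'dbo.MyAuditProc'

-- or hint the index and recompile per statement
SELECT ...
FROM   dbo.HitsSummary WITH (INDEX(ix_hits_date))
WHERE  ...
OPTION (RECOMPILE)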

Any ideas?
This sucks.

benisntfunny fucked around with this message at 20:28 on Sep 23, 2008

benisntfunny
Dec 2, 2004
I'm Perfect.

Victor posted:

Do you request data from this 3mil record table via subquery or derived table/CTE expression?

code:
select  p.*,
        children = (select count(*) from Child where parent_id = p.id)
from    Parent p
vs.
code:
select  p.*,
        children = isnull(c.count, 0)
from    Parent p
left join (
    select  parent_id,
            count = count(*)
    from    Child
    group by parent_id
) c on c.parent_id = p.id
Try the derived table version if you haven't already.

Here's what I run.
code:
INSERT INTO @Hits (TMP_ContentId,TMP_Hits) --temp table
SELECT CNT1.CNT_CONTENT_ID,
       SUM(ISNULL ( EVENT_TYPE_CNT, 0 ) )
FROM   dbo.EVENT_TABLE
INNER JOIN dbo.BUT_CONTENT AS CNT1
	ON CONVERT( INT, ISNULL ( TEXT, 0 ) )  = CNT1.CNT_CONTENT_ID
WHERE  EVENT_TYPE_ID = @Hit --code value for hits
       AND TEXT = CNT1.CNT_CONTENT_ID
       AND USER_ROLE_ID IN      (
                                  SELECT item 
                                  FROM parse_comma_delimited_list_in(@Roles, ',' )
                                 )
       AND CNT1.ACTIVE_DATE > GETDATE()
I'll play around with your version and see how it goes. Thanks

benisntfunny
Dec 2, 2004
I'm Perfect.

Begby posted:

You are joining two tables on a key that is stored in a text field that you are converting to an integer. I can't see how that could ever be fast, I can't imagine it using any index at all since you are using convert.

The key is not a text field. It's a date field. I left something out of this example. It should actually say

code:
INSERT INTO @Hits (TMP_ContentId,TMP_Hits) --temp table
SELECT CNT1.CNT_CONTENT_ID,
       SUM(ISNULL ( EVENT_TYPE_CNT, 0 ) )
FROM   dbo.EVENT_TABLE
INNER JOIN dbo.BUT_CONTENT AS CNT1
	ON CONVERT( INT, ISNULL ( TEXT, 0 ) )  = CNT1.CNT_CONTENT_ID
WHERE  TIME_STAMP BETWEEN @StartDate AND @EndDate -- INDEX COLUMN
       AND EVENT_TYPE_ID = @Hit --code value for hits
       AND TEXT = CNT1.CNT_CONTENT_ID
       AND USER_ROLE_ID IN      (
                                  SELECT item
                                  FROM parse_comma_delimited_list_in(@Roles, ',' )
                                 )
       AND CNT1.ACTIVE_DATE > GETDATE()

benisntfunny
Dec 2, 2004
I'm Perfect.

Begby posted:

If you only have one index then that is a big reason why it is running slow, among other things. 3 million records really isn't all that much, SQL server should be able to handle this kind of query. The issue here though is your field types and indexes. You will probably want to build a composite index with all the items in your where clause and play with the order.

You are inner joining BUT_CONTENT and EVENT_TABLE on a field. If that is not indexed in BUT_CONTENT it is going to cause a slowdown. I am not all that familiar with optimization on SQL server though, so perhaps someone else here can assist.

Also, like I said you are converting the field called TEXT to an integer in the join, then you are also doing a comparison in the where clause which is redundant (and not doing a conversion). You should be able to take it out of the where clause. Then you need to sort out if you really need to do that conversion at all.

One good way to troubleshoot this kind of stuff is to add a large limit, then eliminate all but one filter from your where clause, then add them back in one at a time until the query gets slow. Build your index as necessary.
I think because my post was up a little far, people are getting the wrong idea here. The query itself is fine. My whole query, of which this is only a small part, completes in 21 minutes (an acceptable time considering the things it does). Running it as a stored proc, however, is NOT an acceptable time: 5+ hours.

The query above, which calculates hits for 9,000 items over a three-month span, takes only 40 seconds or so to complete as a query. Over 2 hours as a stored procedure with the exact same code.
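For reference, the classic cause of "fast as a query, slow as a proc" is parameter sniffing, and the usual workaround is masking the parameters behind local variables - a sketch with made-up names:
code:
CREATE PROCEDURE dbo.CalcHits
    @StartDate DATETIME,
    @EndDate   DATETIME
AS
BEGIN
    -- copy the parameters into locals so the optimizer can't sniff them
    DECLARE @LocalStart DATETIME
    DECLARE @LocalEnd   DATETIME
    SET @LocalStart = @StartDate
    SET @LocalEnd   = @EndDate

    SELECT ...
    FROM   dbo.EVENT_TABLE
    WHERE  TIME_STAMP BETWEEN @LocalStart AND @LocalEnd
END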

benisntfunny fucked around with this message at 14:03 on Sep 29, 2008

benisntfunny
Dec 2, 2004
I'm Perfect.

Aredna posted:

I apologize if this was covered as I may have missed it, but are you also doing the insert when you run it as a query, or only when you are running it as a stored procedure? If you are only using the insert in the stored procedure, perhaps it's related to that.

The code is exactly the same in the stored procedure and the query.

Again, this is just a snippet of the stored procedure. I'm only including this part because everything ran fine as a stored proc before this additional code was added... also, the analyzer tells me this is where it's getting stuck.

benisntfunny
Dec 2, 2004
I'm Perfect.
Is there a better way to do this? tsql

I have entries being made into the database something like this...
code:
Order_Type   Customer_Name        Date
Shipment     Bob                  10/10/2009
Shipment     Bob                  10/11/2009
Pickup       Bob                  10/10/2009
I want to select the latest entry for each unique order type for Bob.

So my result is
code:
Order_Type   Customer_Name        Date
Shipment     Bob                  10/11/2009
Pickup       Bob                  10/10/2009
So here's more or less the code I've written (which I absolutely hate and think is dumb).
code:
DECLARE @TMP TABLE (
OrderType VARCHAR(100),
Date      DATETIME
)

INSERT INTO @TMP (OrderType,Date)
SELECT Order_Type,
       Date
FROM   OrdersTable
WHERE  Customer_Name = 'Bob'
DELETE @TMP
WHERE  Date IN (
                SELECT t1.Date 
                FROM @TMP t1
                INNER JOIN @TMP t2
                    ON t1.OrderType = t2.OrderType
                WHERE t1.Date < t2.Date
                )
SELECT *
FROM   OrdersTable
WHERE  Customer_Name = 'Bob'
       AND Date IN (
                    SELECT Date 
                    From   @TMP
                   )
My biggest problem, aside from it looking stupid and having a .00000000005% chance of sometimes returning two records for the same Order_Type, is that I'm doing two hits against the table.


EDIT: This is dumb, can't I just do the exact same thing I'm doing in the delete but the opposite and handle it in the select?

EDIT2: No.
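EDIT3: For the record, the tidier single-hit version would be ROW_NUMBER partitioned by order type - a sketch, untested:
code:
SELECT Order_Type,
       Customer_Name,
       Date
FROM (
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY Order_Type
                              ORDER BY Date DESC) AS rn
    FROM   OrdersTable o
    WHERE  Customer_Name = 'Bob'
) t
WHERE t.rn = 1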

benisntfunny fucked around with this message at 23:54 on Oct 14, 2009

benisntfunny
Dec 2, 2004
I'm Perfect.

ShoulderDaemon posted:

code:
select order_type, customer_number, max( date ) from orderstable group by order_type, customer_number
?

Also, customer_number is Bob?

It's an example... albeit, since I gave a number as a varchar, you might wonder how good of one. MAX(DATE) didn't work for me. I also have quite a few more columns being returned.

benisntfunny
Dec 2, 2004
I'm Perfect.

Sprawl posted:

Max(Date) should do exactly what you want unless it's not stored in the database as a date.
code:
SELECT PAT_CODE_ID,MAX(PAT_CREATE_DT) 
FROM PET_PERSON_ATTRIBUTE
INNER JOIN BUT_CODE
	ON PAT_CODE_ID = COD_CODE_ID
INNER JOIN BUT_CODE_TYPE
    ON COD_CODE_TYPE_ID = CTY_CODE_TYPE_ID
       AND CTY_STATIC_NM = 'EmailSent'
WHERE PAT_VARCHAR1_TXT = '063563961000015'
GROUP BY PAT_CODE_ID,PAT_CREATE_DT


PAT_CODE_ID	(No column name)
14036	2009-10-14 17:05:22.803
14036	2009-10-14 17:12:51.503
14036	2009-10-14 17:47:38.323
14037	2009-10-14 18:06:01.063
14037	2009-10-14 18:06:01.843
14037	2009-10-14 18:06:02.073

benisntfunny
Dec 2, 2004
I'm Perfect.

ShoulderDaemon posted:

You probably don't want to group by the column you are trying to take the maximum of.

Probably shouldn't. You're right. Thanks. I knew it was dumb.

EDIT: well actually it didn't work on my main select, but it works enough for me to do this a cleaner way.

benisntfunny fucked around with this message at 00:10 on Oct 15, 2009

benisntfunny
Dec 2, 2004
I'm Perfect.
Is there some type of linkedserver or OPENROWSET type thing for Oracle 10g so I can query a SQL Server within the editor window?

benisntfunny
Dec 2, 2004
I'm Perfect.

Corsair Jr posted:

Does anyone know of a means for SQL Server 2008 to optimize the datatypes for a static set of data?

I've got a table with roughly 9 million records (there won't be any records added for years) and right now every column is set to varchar(100). It would be awesome if there were some way for SQL Server to analyze the entire data set and select the appropriate data type for each column.

If you plan on leaving them all as varchars you can measure each column:
code:
SELECT MAX(LEN(Column1)),  -- widest value actually stored
       MAX(LEN(Column2)),
       MAX(LEN(Column3))
       -- etc.
FROM   dbo.YourTable       -- hypothetical name; pick your actual table

If you think some are INT only, try changing them to INT and see if it works... same for datetime, etc.

You'll likely want to create an index or two depending on what your data actually is.

There might be some other ways, but doing it this way will give you a better idea of what is in there and help you know what you should be indexing. Unless the data was computer-generated, I often find this approach helps me find bad data to clean when I try to change columns from one datatype to another and it doesn't work.
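One hedged way to pre-check a column before flipping it to INT - ISNUMERIC has well-known quirks (it passes things like '.', '$', and '1e5'), so treat it as a first pass, and the table name here is made up:
code:
SELECT TOP 100 Column1
FROM   dbo.YourTable             -- hypothetical table name
WHERE  Column1 IS NOT NULL
       AND ISNUMERIC(Column1) = 0   -- rows that would break an INT conversion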

benisntfunny
Dec 2, 2004
I'm Perfect.

Agrikk posted:

Another SQL Server 2008 question:

I need to count the number of specific events that occur on a user account during a specific date range based on its creation date.

I'm confused on how to write a query that retrieves UserID and ActiveDate, then counts occurrences of EventX between ActiveDate and ActiveDate + 42 days.

I keep bumping up against a "A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations." error and I don't understand how to dump a column record into a variable and then use that variable in a subquery of the original statement.

code:
SELECT userid,
       activedate,
       COUNT(*)
FROM   sometable
WHERE  activedate BETWEEN somedate AND DATEADD(dd, 42, somedate)
       AND eventx = 'whatever'
GROUP BY userid, activedate

At least I think that's what you're asking. If your dates include time then that's not going to be very fun. You might want to do something like CONVERT(VARCHAR, activedate, 101).

This sucks typing on iPad.


--- looking at your example in your edit....
It doesn't match up with what you've said. You don't need that first item in your WHERE clause since it's handled by an ON clause, and you don't have to use BETWEEN like I did, but you might want to do something like <= and >=.

benisntfunny fucked around with this message at 06:09 on May 9, 2011

benisntfunny
Dec 2, 2004
I'm Perfect.

Aredna posted:

This is on SQL Server 2008. I'm optimizing a query and want to test the performance of several different indexes. The issue is that the table is loaded into cache after the first run increasing the speed by 50x.

I found some methods of clearing the system cache for benchmarking, but I'm not the only one on the server and don't want to cause slowdowns for others just so I can get a better benchmark.

Without being able to test on a server by myself, what are my best options left?

A local server? Obviously it won't be exactly the same, but you might be able to get a general idea.

benisntfunny
Dec 2, 2004
I'm Perfect.

Agrikk posted:

I actually said fuckit and am now looking at a product called SQL Data Compare Pro. It does data comparisons between two databases and can be called via command line (helpful for scheduled tasks...)

Visual Studio Team Data Edition does it. I mean, if you guys are interested in spending money...

But yeah - from the last guy's point. It sounds like you could probably just use MERGE and an Agent Job. Probably doesn't even need SSIS.
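MERGE being the 2008-specific piece - a sketch with made-up table names:
code:
MERGE INTO TargetDb.dbo.Accounts AS t
USING SourceDb.dbo.Accounts AS s
    ON t.AccountId = s.AccountId
WHEN MATCHED THEN
    UPDATE SET t.Balance = s.Balance
WHEN NOT MATCHED BY TARGET THEN
    INSERT (AccountId, Balance)
    VALUES (s.AccountId, s.Balance);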

benisntfunny
Dec 2, 2004
I'm Perfect.

Internet Explorer posted:

Okay SQL megathread. I have not read the entire thread and the first post isn't much of a Megathread info post, so I am sorry if this question has been asked before. We have some front-office / back-office software that is used for a lot of financial reporting and that sort of thing to the higher ups who run the company. The problem is, my boss is the only "DBA" and he does everything by hand. Every time they need some numbers he types up an SQL query or uses an old query and edits it. There has to be some piece of software out there that will help him build reports that are essentially templates, or are easy enough for the end-user to find the data they need with dropdown boxes, etc. Something like Crystal Reports, I guess? I do not have much experience in this department. Please recommend me something because it sucks having a boss who is supposed to be managing things and spends his entire day reinventing the wheel. Thank you!

[Edit: MSSQL 2008 if it matters.]

I don't really see anything wrong with what your boss is doing in theory. Sounds like he just sucks at SQL.

benisntfunny
Dec 2, 2004
I'm Perfect.

Aredna posted:

This is on SQL Server 2008. I received a performance tip to change my queries from the form of "select ... into #temp" to "insert into #temp select ..." because the latter uses fewer locking resources.

I've searched around and found one website that agrees, but all other websites say that "select ... into #temp" will be faster due to less logging taking place.

The obvious answer is to test both on my system, but as a rule of thumb which method does everyone else use? What "locking resources" are being used less and how does that impact overall performance?

On SQL Server I've always used INSERT INTO #PandaBear SELECT... on Oracle I do the opposite. Reasoning? None. I also prefer using table variables for my temp tables in SQL Server* (i.e. @PandaBear instead).



*Unless it's a metric gently caress ton of data, in which case I'd probably stick with a temp table.
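For reference, the three shapes being compared, with made-up columns:
code:
-- SELECT ... INTO: creates the temp table on the fly (minimally logged
-- under simple/bulk-logged recovery)
SELECT col1, col2
INTO   #PandaBear
FROM   dbo.SomeTable

-- INSERT ... SELECT: the table has to exist first
CREATE TABLE #PandaBear2 (col1 INT, col2 VARCHAR(50))
INSERT INTO #PandaBear2 (col1, col2)
SELECT col1, col2
FROM   dbo.SomeTable

-- table variable flavor
DECLARE @PandaBear TABLE (col1 INT, col2 VARCHAR(50))
INSERT INTO @PandaBear
SELECT col1, col2
FROM   dbo.SomeTable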

benisntfunny fucked around with this message at 00:19 on May 24, 2011

benisntfunny
Dec 2, 2004
I'm Perfect.

bewilderment posted:

I have that basic fundamental knowledge - it's the course I did about a year ago. I know that the code I posted is bad, it's practically pseudocode and not intended to be the actual thing I'd type, I was just typing quickly.

To explain what I'm actually doing, the database is linked to a web app. The intent is to, based on input, dynamically generate a listbox from one of the columns of the table. The actual schema of the database is poor but it's what I've got to work with; basically there are three tables that contain identically designed columns, and the actual contents of each table are different. The idea is say, the input to the stored procedure was 'foo', so I pull the appropriate column from the Foo table.

I wouldn't call the database every time someone selects a menu item. You should build out the structure of your drop down menu in the database and have your application take in the whole structure and cache it for a given amount of time.

You can do the structure a few ways...
a table like SuperSimpleWay
code:
SELECT Option1,
       Option2,
       Option3
FROM SuperSimpleWay
Or use two tables: a MenuItems table and a MenuItemRelationship table
code:
SELECT m1.MenuItemName,
       m2.MenuItemName
FROM   MenuItemRelationship mir
INNER JOIN MenuItems m1
   ON mir.MenuItemParentID = m1.MenuItemId
INNER JOIN MenuItems m2
   ON mir.MenuItemChildID = m2.MenuItemId
The alternative takes longer to build, both to write the code and to call it, but it allows you to reuse the same name in different menus. If you're caching it for like 12-24 hours it really won't matter that it takes an extra .3 seconds to compile the list. If one of your menu items changes you can update it in one place and ensure the change happens to all menus that call that item. -- It all depends really on how many drop down menus you'll have on your site and how anal you want to be with your database.

edit: to the other guy's point though... and I hear your reasoning... but even your pseudocode is pretty bad. You might want to really review some of your coursework prior to coding. The question of "Can you use an IF statement?" alone is enough for me to think you need to brush up on your SQL skills.

edit more: I know I said "I wouldn't call the database every time", but what I meant to say was "DO NOT WRITE YOUR STORED PROC LIKE THIS. DO NOT CALL THE DATABASE EVERY TIME."

If you can't figure out how to do the caching method where you build a menu and store it, then find an alternative outside the DB, like storing your menu in an XML config file somewhere on your server... hard-code it into your page... do something where you're not hitting the database repeatedly like that. If your web site were large you would be opening tons of connections to perform that simple operation. It would get especially bad with people like me who will sometimes navigate menus with the arrow keys... type in the first letter of the menu item, then press the down arrow until I get where I'm going. You'd hit the DB every single time I skip over an entry and run that rather lovely SQL just to find the next menu item.

benisntfunny fucked around with this message at 13:37 on May 24, 2011

benisntfunny
Dec 2, 2004
I'm Perfect.

epswing posted:

I'm starting to write tests for this database-backed application, and now realize the need for a fresh database every time the test suite is run.

I'm using SQL Server 2008 Express R2.

What I want to do is "import the schema of database X into a new database Y".

How would I automate this? If there's some way I can do this from C# that would be amazing.

You need a brand new database every time? You don't just need a clean database with default data?

Why not create two databases that are identical? One is the default state and one is your test. Upon starting a new test run, dump the data in your test DB and copy your default-state data over. You could also do it all with one database and a script that inserts the necessary data, depending on how much you have.

Creating a new database EVERY TIME a test suite is run seems like a bit much.

benisntfunny
Dec 2, 2004
I'm Perfect.

epswing posted:

By dump I guess you mean DELETE FROM or TRUNCATE. This isn't so easy with a normalized database. The tables need to be emptied in a particular order so as to not break referential integrity. Also, when the schema changes, the test database needs to stay in sync.

If I can just create a database, copy a schema, and run my tests, I don't have to do anything.

What's wrong with this?

I'm very familiar with normalized databases and with doing exactly what I mentioned. In fact I do that exact activity in many test environments daily. I use DELETE just in case the copying screws up halfway and needs to roll back.

I'm not sure what you mean by "I don't have to do anything"; it sounds to me like you still have to create a database and copy a schema over. Anyway - people will do things in whatever way they like best. You might want to look into this:
http://msdn.microsoft.com/en-us/library/aa258257(v=sql.80).aspx

edit: Additionally, you can write a script that drops all of the FKs and re-creates them after everything is done -- so yeah. This is essentially what Microsoft's VS Data Edition does for schema/data compares.
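Disabling rather than dropping the constraints gets you there with less scripting - a sketch using the undocumented-but-everywhere sp_MSforeachtable:
code:
-- disable all FK checks, clear the data, then re-enable with validation
EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL'
EXEC sp_MSforeachtable 'DELETE FROM ?'
EXEC sp_MSforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL'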

benisntfunny fucked around with this message at 19:07 on May 24, 2011

benisntfunny
Dec 2, 2004
I'm Perfect.

epswing posted:

I'm 99% sure prikey implies index.

You'd be 100% correct.

quote:

A PRIMARY KEY is a unique index where all key columns must be defined as NOT NULL. If they are not explicitly declared as NOT NULL, MySQL declares them so implicitly (and silently). A table can have only one PRIMARY KEY. If you do not have a PRIMARY KEY and an application asks for the PRIMARY KEY in your tables, MySQL returns the first UNIQUE index that has no NULL columns as the PRIMARY KEY.
http://dev.mysql.com/doc/refman/5.1/en/create-table.html
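(And if you ever want to eyeball what indexes a table actually has - made-up table name:)
code:
SHOW INDEX FROM some_table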

benisntfunny
Dec 2, 2004
I'm Perfect.

Agrikk posted:

Import/Export wizard was being stupid. I did just that and it worked just fine.

Thanks.

Import/Export only reads a portion of the initial rows to guess at the datatypes when reading flat files, which always seems to lead to fuckage.

benisntfunny
Dec 2, 2004
I'm Perfect.

Thel posted:

- SQL (derp)

- Backups/maintenance plans

- Security

- Deployment

- Query performance/analysis of query plans
-- Keys, indexes (indices?)

- T-SQL
-- Stored procedures
-- Functions (scalar, table etc)


Anything else?
-- Replication
-- Depending on what version the company is using you'll want to know a little of what's in/not in 2000 vs 2005 vs 2008 - what it takes to upgrade and why they'd want to aside from support from MS.
-- Server health. Identifying/solving problems going on in the server. (locks, stopped replication, applications opening too many connections, failed agent jobs)

benisntfunny
Dec 2, 2004
I'm Perfect.

Lyon posted:

Random question, I work for a software company and we sell an information system focused on laboratories. Obviously we use the crap out of databases and SQL for the back end of our product.

I was hoping someone could suggest a good primer on databases and SQL?

Generic database stuff could be useful but it would be best to focus on the specific type of database you're using.

MySQL, Oracle, MS SQL Server? ... DB2? Even though relational DB concepts are pretty universal there are topics specific to the type of DB you're on.

benisntfunny
Dec 2, 2004
I'm Perfect.

Thel posted:

MS SQL Server 2005.

(I should know this already, shouldn't I? :downs:)

If we're only talking about 1,000 rows it would probably be easiest and safest to store it in a table variable before populating the real table.

code:
BEGIN TRY
BEGIN TRAN
    DECLARE @ADTable TABLE (
    Row1 INT,
    Row2 INT
    -- etc.
    )

    INSERT INTO @ADTable
    SELECT *
    FROM   OPENQUERY(MyAdDataSource,
                     'SELECT Blah blah blah'
                     )
    DELETE ADTable
    INSERT INTO ADTable
    SELECT * -- I would probably actually be explicit
    FROM   @ADTable

COMMIT TRAN
END TRY
BEGIN CATCH
ROLLBACK TRAN
    --Spit out error messages
END CATCH
There are lots of ways to skin this cat, but given your table size this is probably the most straightforward.

EDIT: Also interesting that you said the table is only 1,000 rows, because AD queries only return 1,000 results at a time. Are you sure it isn't actually much larger than 1,000 and you're just not getting all of it? You may have to change your query to grab data in chunks.

benisntfunny fucked around with this message at 15:25 on Jun 22, 2011

benisntfunny
Dec 2, 2004
I'm Perfect.

Bob Morales posted:

What can I do about slow MySQL importing?

I dump from server A with mysqldump. It takes about 2 minutes to give me a 2.1GB file. Then I send it over to server B and import it in to that server (starting fresh with a new DB) and it takes 15 minutes!


Could you set up replication to the other database?
http://dev.mysql.com/doc/refman/5.1/en/replication.html

Even if you didn't always want data on server B, you could set up a different schema and then copy locally from B to B to populate the data when you wanted it.

benisntfunny
Dec 2, 2004
I'm Perfect.

Bob Morales posted:

I am under the assumption that if we used replication, that once we wrote to the database on server B, we'd have to delete it and start over from a new dump in order to get back in sync.

The idea is to use this as a test with real production data instead of the data in our staging/development servers.

Server A is actually a local replication clone of our production DB.

Here's what I'm suggesting... and I'm not sure how many places you can replicate to in MySQL so it may not work.

Keep your replication to Server A
Replicate another copy to Server B in a new schema
Limit the ID used on Server B to read-only access on that new schema

Delete your current test tables and repopulate, as needed, with data from the new schema, which is a fully replicated copy of prod. No reason you would ever have to delete anything from the replicated tables.
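The B-to-B copy is then just cross-schema statements on the same server - a sketch with made-up schema/table names (FK order still matters for the truncate):
code:
-- on Server B: refresh the writable test schema from the read-only replica schema
TRUNCATE TABLE test_schema.orders;
INSERT INTO test_schema.orders
SELECT * FROM replica_schema.orders;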

benisntfunny fucked around with this message at 02:24 on Jun 24, 2011

benisntfunny
Dec 2, 2004
I'm Perfect.

Bob Morales posted:

I got it down to 8 minutes, but apparently that's not fast enough. That's one of our biggest databases, our other ones take 20 seconds to dump and a minute and 20 seconds to import.

I'm going to try using this:

http://code.google.com/p/sqlyog/

or maatkit to see if it's faster to re-sync our data with the production master after we've made changes, or if it's faster to just dump/reload.

So you're going to rule out the option I gave you, which would let you copy the data over locally? That is without a doubt going to be your fastest option. We typically copy about 4-5GB of data using a similar method in under a minute.

How are you defining the DB as big? Also, if the issue is that 8 minutes is too long because you're locking the tables during that time, and you don't want to do the replication method I suggested, I would at least dump the data into an alternative schema pre-load, then load from those tables locally to reduce locking time.

edit: also - if you have any DB auditing-type columns in your data you could easily do a compare to grab only the data you need and update existing rows where the incoming modified date is greater than the current modified date.

benisntfunny fucked around with this message at 02:28 on Jun 24, 2011

benisntfunny
Dec 2, 2004
I'm Perfect.

Bob Morales posted:

I really don't understand what you were saying to do. 8 minutes is only too long because someone else said so.

Basically we have a production server, and now we have a local replication mirror. What I am trying to do is set up a test server, which will allow you to take a snapshot of a database from the mirror and actually do stuff with it. We have a clone of our webserver set up as well. Basically so we can see changes with real data instead of on our staging or development servers. But once we start testing the app, the data is going to change of course.


I hope someone else can chime in and tell me I'm not making sense.

Here's a picture of what I'm trying to explain. I fully understand what you want to do. Stop getting loving snippy; it's not our fault your testing is super time-sensitive and can't be executed through standard load processes.

benisntfunny
Dec 2, 2004
I'm Perfect.

MrBishop posted:

code:
SELECT t1.Center, t1.[Group Id], t1.[WeekNumber], t1.Deads, t2.[Move Ins],
    MAX(CAST(ISNULL(t2.[Total Dead], 0) AS Int)) AS [Cum. Deads],
    CASE t2.[Move Ins]
        WHEN '0' THEN 0.0
        ELSE CAST(ISNULL(t1.[Deads], 0) AS Float)/CAST(t2.[Move Ins] AS Float)
    END AS Mortality,
    CASE t2.[Move Ins]
        WHEN '0' THEN 0.0
        ELSE MAX(CAST(ISNULL(t2.[Total Dead], 0) AS Float))/CAST(t2.[Move Ins] AS Float)
    END AS [Cum. Mortality]
FROM t1 left JOIN t2 ON (t1.[Group Id]=t2.[Group Id] AND t1.WeekNumber>=t2.WeekNumber)
GROUP BY t1.Center, t1.[Group Id], t1.[WeekNumber], t1.Deads,
    t2.[Move Ins]
ORDER BY 2, 3
Create a temp table with all the dates - right join them into your query
edit:
Copy and paste this to see what I'm saying - also, I updated what I originally said to do.
code:
DECLARE @FakeResults TABLE
(WeekNumber INT
 )
 DECLARE @FakeResults2 TABLE
 (
 WeekNumber INT,
 DeadShit   INT
 )
INSERT INTO @FakeResults VALUES (1)
INSERT INTO @FakeResults VALUES (2)
INSERT INTO @FakeResults VALUES (3)
INSERT INTO @FakeResults VALUES (4)
INSERT INTO @FakeResults VALUES (5)
INSERT INTO @FakeResults VALUES (6)
INSERT INTO @FakeResults VALUES (7)

INSERT INTO @FakeResults2 VALUES (1,92031)
INSERT INTO @FakeResults2 VALUES (3,2)
INSERT INTO @FakeResults2 VALUES (5,41)
INSERT INTO @FakeResults2 VALUES (6,9)

SELECT ISNULL(fr2.WeekNumber,fr1.WeekNumber),
       ISNULL(fr2.DeadShit,0)
FROM @FakeResults2 fr2
RIGHT JOIN @FakeResults fr1
	ON fr2.WeekNumber = fr1.Weeknumber
	
Obviously you'll probably want to dynamically load all your week numbers into that table. There are ways you can do this:
http://www.kodyaz.com/articles/sql-server-dates-table-using-tsql-cte-calendar-table.aspx
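e.g. a recursive CTE along the lines of that article - a minimal sketch for loading the week numbers:
code:
;WITH Weeks AS (
    SELECT 1 AS WeekNumber
    UNION ALL
    SELECT WeekNumber + 1
    FROM   Weeks
    WHERE  WeekNumber < 52
)
INSERT INTO @FakeResults (WeekNumber)
SELECT WeekNumber
FROM   Weeks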

benisntfunny fucked around with this message at 21:12 on Jun 24, 2011


benisntfunny
Dec 2, 2004
I'm Perfect.

MrBishop posted:

Thanks for following up, but maybe I didn't explain well enough where the problem lies, and that's in having a running total, that continues to accumulate (0) through the missing weeks. What I started with is:
code:
WeekNumber|Deads|Cum. Deads
---------------------------
1         |0    |0
2         |3    |3
3         |0    |3
4         |2    |5

Ah, sorry. I actually think you've probably done exactly what I would do too. I just had trouble understanding you at first.
