SELECT * FROM Questions WHERE Type = 'Stupid'

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Here's an interesting question. I have a database that houses customers and users, and a RADIUS server uses the database for authentication. We also keep logs of data usage for accounting purposes, and I have an ASP.NET application that generates reports that the customers can view. The customer selects which users will be included in the report with two listboxes (selected/unselected).

Some customers have thousands of users. This means that the SQL generated for the reports becomes pretty hairy if they select lots of units. Something along the lines of this...

code:

SELECT
  A.*
FROM
  AccountingTable A
WHERE
  A.UserId IN
    (
      1001,
      1002,
      1004,
      ...
      1974,
      1977
    ) AND
  <more criteria here>

The IN clause becomes ridiculously huge and while it's not a problem yet, I can see it becoming a problem in the future as we add customers (or when customers add another crapton of users). The only other way I can think of would involve creating a temporary table to hold the user IDs, then inserting the selected IDs into the table, then joining the temporary table to the accounting table to bring back the results. Would that be a better way of doing it, or is there something easier that I'm missing?

# ¿ Nov 8, 2007 20:35

Adbot: ADBOT LOVES YOU

# ¿ Apr 29, 2024 14:11

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Victor posted:

Hairy == slow? Ugly?

Slow, ugly, and I think I'll be hitting a limit soon as far as characters allowed in a SQL statement. I believe it tops out at 8000 characters, correct?

# ¿ Nov 8, 2007 21:20

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Victor posted:

Selecting that many IDs from a GUI is a bit questionable, so if it is possible, I would just put a limit on that (say, to a 100 entries), and also have a "select all" option, in which case you wouldn't use in. However, people can be retarded and require that users be allowed to select hundreds of entries, so this might not be an option.

This is the main problem. I've told them repeatedly that common sense dictates that no one in their right mind would select 6879 individual entries out of 12543 to generate a report, and that it should be "either select up to XXX users, or all users". I can't win with these clowns though.

I'll go back and look over Dakha's post and the resulting solution to see what I can come up with. I'll also try to push the "X or all" argument on them again to see if I can get them to agree with me (not likely).

# ¿ Nov 8, 2007 22:18

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Opius posted:

My question is: is there anyway to group results in sets of 10 minutes, rather than single minutes in the query?

Give this a shot...tried it against a similar database I have here and it seems to work fine.

code:

SELECT
  DATEPART(year, timestamp) AS [Year],
  DATEPART(month, timestamp) AS [Month],
  DATEPART(day, timestamp) AS [Day],
  DATEPART(hour, timestamp) AS Hour,
  CAST(DATEPART(minute, timestamp) / 10 AS INT) * 10 AS RoundMinute,
  COUNT(*) AS RecordCount
FROM
  SomeTable
WHERE
  timestamp >= '10/1/2007' AND
  timestamp < '11/1/2007'
GROUP BY
  DATEPART(year, timestamp),
  DATEPART(month, timestamp),
  DATEPART(day, timestamp),
  DATEPART(hour, timestamp),
  CAST(DATEPART(minute, timestamp) / 10 AS INT) * 10
ORDER BY
  DATEPART(year, timestamp),
  DATEPART(month, timestamp),
  DATEPART(day, timestamp),
  DATEPART(hour, timestamp),
  CAST(DATEPART(minute, timestamp) / 10 AS INT) * 10

You'll have to join the date parts back up for display purposes, but that shouldn't be a big deal.

# ¿ Dec 10, 2007 19:58

Boogeyman: Sep 29, 2004; Boo, motherfucker.

nbv4 posted:

See how I did that? The sorting is based on two different columns at the same time. An easy solution would be to make a third column that is just the two columns added together, but I suck at SQL so hard I can't even figure out how to do that

If you're using MSSQL, this will work (just tested it).

code:

SELECT
  item_id,
  value1,
  value2
FROM
  whatever
ORDER BY
  COALESCE(value1, value2) DESC

COALESCE returns the first non-null argument that was passed to it. If either value1 or value2 is null for each row, this will work.

Boogeyman fucked around with this message at 19:12 on Mar 24, 2008

# ¿ Mar 24, 2008 18:56

Boogeyman: Sep 29, 2004; Boo, motherfucker.

I have a generic question about indexes in SQL Server 2005. Let's say I have the following tables (PK = primary key, FK = foreign key)...

Customer
-CustomerId (PK)
-CustomerName

Account
-AccountId (PK)
-CustomerId (FK)
-AccountNumber

Device
-DeviceId (PK)
-AccountId (FK)
-DeviceName
-PhoneNumber

MSSQL automatically creates indexes on primary keys, so after creating these tables, I already have an index on Customer.CustomerId, Account.AccountId, and Device.DeviceId.

Now, common sense says that I should have indexes on the foreign keys to boost MSSQL join performance. So, I add indexes on Account.CustomerId and Device.AccountId.

Finally, I want to search on a couple of fields in my application, so I add indexes for Customer.CustomerName, Account.AccountNumber, and Device.PhoneNumber.

So, for indexes, I have individual indexes for the following tables/columns...

Customer.CustomerId
Account.AccountId
Device.DeviceId
Account.CustomerId
Device.AccountId
Customer.CustomerName
Account.AccountNumber
Device.PhoneNumber

Here's the question. Am I OK leaving all of these indexes as single column indexes, or would I be better off combining some of them? For instance, instead of having separate indexes for Account.AccountId and Account.CustomerId, or should I include both columns in the primary key index that was created with the table? Likewise for Device.DeviceId and Device.AccountId.

I'm just wondering if MSSQL is smart enough to use the single column indexes efficiently, or if grouping the columns together would help it out. I know that there's no 100% correct answer and that I need to test all of this on my data set, but I'm just trying to find out if there's a general guideline for stuff like this.

# ¿ Mar 25, 2008 23:29

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Here's a weird one...in SQL 2008, I'm trying to export data from one database to another database. The table structures are exactly the same. In SQL 2005, I would right-click on the database, choose Tasks -> Export, choose the source and destination, and tell SQL to append the data from the source table to the destination table. It would crank for a while (the source table has about 20 million rows, and the destination table has about 600 million rows), and all would be well.

In SQL 2008, it seems that something has changed. I go through the exact same process, but after I tell SQL to append to the destination table, it decides that a datetime column in the source table is actually a datetime2 column. The table structure hasn't changed since 2003, and I've double checked anyways to make sure no one screwed it up...it's definitely a datetime column. Anyways, SQL attempts to convert the "datetime2" values into datetimes, which takes freaking forever.

So, why does SQL keep picking up that source column as a datetime2 and not a datetime? Is there a bug or something that I don't know about in the import/export wizard?

# ¿ Apr 7, 2009 19:49

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Quick MSSQL memory question. SQL 2008 R2 Standard supports up to 64GB of RAM. Does anyone know if that limit is per server or per instance? I'd really like to get one beefy server with 128GB of RAM and run two instances (64GB each) instead of having to buy two separate servers.

# ¿ Jul 12, 2011 19:29

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Jethro posted:

Some quick googling says that it's 64GB per instance.

That's the same thing I was seeing, but I was worried since that info wasn't coming directly from Microsoft. I ended up having my boss talk to our CDW rep today and he confirmed that it is 64GB per instance, which makes things easier.

# ¿ Jul 13, 2011 17:03

Boogeyman: Sep 29, 2004; Boo, motherfucker.

I'm pretty sure I already know the answer to this one, but I'd like to double check to be sure. Say I have a table in a MSSQL database that has these fields:

Id
ParentId
Data1
Data2
Data3
Data4
Data5
Created
LastModified
Deactivated

On this table, I have two indexes, one on ParentId and one on both ParentId and Deactivated. Since both of these indexes list ParentId first, the one that only indexes ParentId is redundant, right? Would there be any instance where MSSQL would choose to use the index that only contains ParentId, or can I scrap it?

# ¿ Jul 5, 2012 18:34

Boogeyman: Sep 29, 2004; Boo, motherfucker.

baquerd posted:

Only thing that comes to mind is it the first one is clustered and the second is non-clustered.

True, guess that would make a difference. Supposing they're both non-clustered, the one that only contains ParentId would be redundant, correct?

# ¿ Jul 5, 2012 18:49

Boogeyman: Sep 29, 2004; Boo, motherfucker.

Internet Explorer posted:

If you click "Other Versions" at the top of that page you can see the feature goes back to SQL 2012. Really good to be able to see the differences in implementation from year to year.

It's available all the way back to SQL 2008, which I know because I'm currently dealing with some dummy's terrible implementation of it and it's making me sad.

# ¿ Oct 10, 2016 16:31

Adbot: ADBOT LOVES YOU

# ¿ Apr 29, 2024 14:11

Boogeyman: Sep 29, 2004; Boo, motherfucker.

COOL CORN posted:

What's a terrible implementation of it

Like... I just want to enable it on some tables so that I have those cdc system tables to query on

I will qualify all of this by saying that I am by no means an expert about anything to do with CDC, considering I just found out about it recently (and accidentally). I was investigating why some databases I inherited were so bloated, but I couldn't figure out where the data was hiding. I usually use sp_blitzindex to check table sizes, and that doesn't include system tables, which means that the CDC tables weren't showing up. I finally figured out that CDC was enabled, found the tables, and started looking into how it's all set up. Why I'm upset about it:

1. I don't like the fact that it was trying to hide the data from me, but whatever, that's a limitation of the tool I was using.

2. I don't like the fact that most of the cleanup jobs that were created are either not running as often as they should, or they're retaining way too much data. Again, due to the dummy that set it up.

3. Apparently this isn't the default config, but there are other jobs that exist that are pulling rows out of the CDC tables every night and stuffing them into a separate database on the same instance. I guess I kind of get why they would do that, but it just makes everything a lot more complicated. And isn't all that great for performance considering they're slaughtering this instance.

4. I don't like the amount of storage that it's consuming. Probably due to our workload (which should never have had CDC enabled in the first place). We had an 850GB table (ridiculous, I know), and the CDC table backing it was another 750GB. Tracking "changes" that weren't really changes, just inserts. Why would you enable CDC on a table that never actually changes? That's anyone's guess. We had another table that was 2.8GB in size, the CDC table backing that one was 299GB because no one ever considered the frequency of the updates on it.

I'm sure it's probably a great fit for certain applications, so long as you consider the performance implications, and carefully consider each table's access patterns (insert/update/delete) because that will determine how much extra storage you'll need to support it. In my case, it was some absolute dipshit that implemented it, half rear end customized the jobs that SQL came up with, and left no documentation for me to work off of, so CDC is pretty much my "want to murder someone" trigger at this point.

# ¿ Oct 11, 2016 15:41

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > SELECT * FROM Questions WHERE Type = 'Stupid'