H110Hawk
Dec 28, 2006

BonoMan posted:

What's the general process for replacing drives in a NAS with newer ones? Say you have an 8 bay NAS and they're all taken up and then 3 years down the line you want to replace the drives? How does that happen without just copying everything over to a duplicate NAS or whatever?

Typically copy and replace is how it's done. With the way disk sizes grow you will likely be able to do some sideline magic: take half of your new disks, make a quick software array on your current computer, and copy the data over. Yank all of your disks from the NAS, make a software array, and copy the data again. Put all of the new disks, including the interim array, into the NAS, build it, and copy the data a final time.

Some controllers allow you to do one-at-a-time disk swaps/rebuilds. Once you have rebuilt onto the larger disks it will automatically (or with some button pressing) expand the raw device to the larger size. Once you've done that you have to expand the overlaid filesystem somehow. If it's a "black box" device you are at the mercy of the device; if it's Linux/Windows you are at the mercy of the filesystem grow commands. (NTFS can grow with tools à la Partition Magic. Other common filesystems have similar things: http://www.google.com/search?q=ext3+grow+filesystem ) Sometimes OSes don't take kindly to directly attached raw block devices changing size while booted. I would suggest mounting your filesystem read-only for the rebuild where the FS can grow.
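
If you're on Linux software RAID the whole dance is only a few commands. Rough sketch only, made-up device names, so double check against your own distro/filesystem docs:

code:
# let the md array use the full size of the new, bigger member disks
mdadm --grow /dev/md0 --size=max
# check first if the filesystem is unmounted, then grow ext3 into the new space
e2fsck -f /dev/md0
resize2fs /dev/md0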

(In reality you just restore from backups, right? :laugh: )

H110Hawk
Dec 28, 2006

lilbean posted:

So the real question is, can I cheap out and use the 7200.11/7200.12 drives for the X4540 without any issue? They're literally half the cost of the ES.2 disks. Also, I'm not worried about support since we've confirmed that issues not caused by third-party disks are still supported.

You should be fine. The hardest part of the operation is breaking the loctite.

H110Hawk
Dec 28, 2006

lilbean posted:

I thought TLER was specific to Western Digital drives.

TLER is just a name for how long the disk will keep retrying a bad sector before it reports the error back to the system. It turns what would be hard errors (the drive going silent for minutes while it retries, so the controller kicks it out of the array) into soft ones: once recovery hits the time limit the drive gives up and reports the error, and the array handles it.

http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery

ZFS will certainly kick out more disks than typical if you run the AS (desktop) Seagate disks instead of the NS (enterprise) ones. It's up to you to decide how much of an impact that will have on your operation. For one vdev worth of disks it is likely worth trying.
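
On drives that actually honor it you can poke at the recovery timeout yourself with smartctl. A sketch, assuming a Linux box and a made-up device name (plenty of desktop drives just refuse the command):

code:
# show the current error recovery timeouts (values are tenths of a second)
smartctl -l scterc /dev/sda
# cap read/write recovery at 7 seconds, TLER-style
smartctl -l scterc,70,70 /dev/sda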

H110Hawk
Dec 28, 2006

Misogynist posted:

I just moved part of my ESXi development environments off of local storage and onto a 48TB Sun x4500 I had lying around, shared via ZFS+NFS on OpenSolaris 2009.06 over a 10GbE link.

I was worried about performance because it's SATA disk, but holy poo poo this thing screams with all those disks. I have never seen a Linux distro install so fast ever in my life. The bottleneck seems to be the 10GbE interface, which apparently maxes out around 6 gig.

What zpool configuration are you using? Any specific kernel tweaks?

H110Hawk
Dec 28, 2006

optikalus posted:

Someone posted that blog entry to my forums a few days ago. I see several issues with it, mainly cooling and power. Only 6 fans to cool 45 drives? Those are going to bake.

I think you're wrong here: fan count is a lot less important than the CFM forced over the disks. Sure the disks may run hotter than in a 2U server with 4 disks in it, but they will be at a consistent temperature. The disks will probably suffer some premature failure, but that is the whole point of RAID. Get a cheapo Seagate AS disk with a 5 year warranty and just replace them as they fail.

Non-delayed startup and initial power surge are certainly very valid problems. If you're drawing 14A on boot you can only have ~1 of those per 20A circuit (two booting at once would want 28A). It makes me wonder if they wander around with an extension cord on a dedicated circuit to turn them on, then migrate it over to the running circuits. Reading the blog, it sounds like that's what they do: turn on the disks, then turn on the CPU/etc.

Makes me wonder where they're finding a datacenter that can stay even moderately cool at that many watts/sq ft. I imagine with proper ventilation on the hot rows it should be doable, but I can't imagine how many heat exchangers they need. You don't need to keep it really cold, you just have to keep up with the heat put out by the devices and make sure hot air isn't getting recirculated back over to the intake fans in the cold rows.

H110Hawk
Dec 28, 2006

optikalus posted:

The way the disks are laid out, the drives in the middle row will no doubt be several degrees hotter than the drives next to the fans. Air also has very poor thermal conductivity, so having such a small distance between the drives means that:

I'd love to see a plot of the temperatures of the disks vs. location in the chassis. Even in my SuperMicros, the 8th disk consistently runs 3 degrees C hotter than all the other drives:

I think you miss the point. The disks need a constant temperature, not a low temperature. Remember the Google article everyone loved a year or two back? Nothing has changed. It also doesn't matter if they have a slightly elevated failure rate. Their cost for downtime is nearly 0 compared to most other applications out there. Build cost vs. technician time is what they have to minimize. In that case, lowest price wins. See the other storage thread for my arguments.

Have you never opened up a Sun X4500? They cram in disks in the same fashion. It's what this was apparently modeled after.

http://www.seagate.com/staticfiles/support/disc/manuals/desktop/Barracuda%207200.11/100507013e.pdf
http://www.sun.com/servers/x64/x4540/gallery/index.xml?t=4&p=2&s=1

H110Hawk
Dec 28, 2006

EnergizerFellow posted:

- 110V @ 14A. Seriously 110V? These boxes need to be on 208V ASAP. Better efficiency from the PSs and AC lines too.

Sometimes it's hard to get 200v power in datacenters. :(

quote:

- Run the numbers on low-power CPUs and 5400/7200 RPM 'green' drives. Given the large number of boxes, additional component cost can be offset in power consumption savings and datacenter BTU.

It appears that the same disk from the 'green' line saves you 3 Watts per disk. Their Seagate disk costs $120. The WD green disk costs $122. 45 disks per box, ~10 boxes per rack, is 1,350W (12.2A@110v) per rack saved at a cost of $900. This also reduces the power/box from 14 to 12.8A. A rack costs $1,000-1,200/month with 60 amps of 110v power and mostly adequate cooling. This means that over the course of 2-3 months they would make their money back by being able to stuff nearly one more pod into each rack. This makes the assumption they get the performance they need from the disks, which they likely do.
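
Spelling out the back-of-the-envelope math:

code:
3 W/disk x 45 disks       = 135 W per pod    (~1.2 A @ 110 V, so 14 A drops to ~12.8 A)
135 W/pod x 10 pods       = 1,350 W per rack (~12.2 A @ 110 V)
($122 - $120) x 450 disks = $900 extra up front per rack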

I wonder if there are other factors preventing this, such as disk supply or other inefficiencies in the system they aren't showing us.

http://www.wdc.com/en/products/products.asp?driveid=575
http://www.seagate.com/staticfiles/support/disc/manuals/desktop/Barracuda%207200.11/100507013e.pdf
http://www.google.com/products/catalog?q=western+digital+1.5tb&cid=5473896154771326069&sa=title#scoring=p

H110Hawk
Dec 28, 2006

Halo_4am posted:

I'm thinking they're just trying to sell you an additional MD1000...

Agreed. You should compliment him on his quick thinking on the vibration patterns, though. Ask him if it was his idea. Sales guys occasionally need to be laughed at and called out.

H110Hawk
Dec 28, 2006

three posted:

They bought these equallogic units before I started here because the Dell rep convinced them syncing Oracle would be as simple as replicating using the equallogic interface. :sigh:

Call up that dell rep who made promises and get him to tell you how. Withhold payment if that is still possible. If you just need hourly crash-equivalent data because your application is fully transaction aware then it shouldn't be that bad. Play hardball with him. Have him send you a unit which works if this one doesn't, and send back the old one. "You promised me X gigabytes of storage with the ability to make a consistent Oracle snapshot for $N."

In theory you can make a backup based on a transaction view without locking your entire database for writes, but I have never used Oracle.
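
The trick people usually describe is Oracle's hot backup mode, which keeps the datafiles snapshot-consistent without a global write lock. Sketch only, I've never run it, so verify against Oracle's docs (needs ARCHIVELOG mode, 10g+ syntax):

code:
# quiesce the datafiles, fire the EqualLogic snapshot, then release
sqlplus -s / as sysdba <<EOF
ALTER DATABASE BEGIN BACKUP;
EOF
# ...take the array snapshot here...
sqlplus -s / as sysdba <<EOF
ALTER DATABASE END BACKUP;
EOF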

H110Hawk
Dec 28, 2006

three posted:

While this is what I would do if I was in charge, I doubt the regional manager is going to send these back. We pretty much have to find a way to make this work. :(

rage-saq posted:

As people have already mentioned, find out WHO made the promises, get emails of them if possible, and hold them to it. If you are a vendor/consultant/etc and you make claims about XYZ with vanilla icecream on the side, and it doesn't do any of that, you can be sure as hell you can LEGALLY get out of paying agreed upon price. Thats how this industry works.

Your manager didn't get to be your manager by sitting in the back seat when problems happened. Only you know your company's political culture, but now is the time to step up and show your competence. Present the technical solutions this machine can give you. Back it all up with documentation. Demand to speak with the Dell rep about it if your boss finds them lacking. Speak to that rep's boss the moment he sounds stupid. Half the time it only takes a little bit of saber rattling to get the job done. They can forward ship you the correct unit for free if this one doesn't meet the requirements. Pray that everything was put together in a technical document which was sent to the rep.

If you're simply stuck with the unit, no contact with the rep, and a pussy of a boss who won't fix it, document everything. Keep copies at home. If they come to you with "Why doesn't this work!" show them where you documented during setup how it wouldn't work, notified your boss, attempted to make it right, and implemented the best available solution.

The next step, if the technical requirements weren't properly stated to the rep, is for your department to have a technical representative at all sales meetings. This includes the lunch/dinner/perk junkets the sales rep will drag your boss along on. Again, only you know how sharp you stay at certain levels of intoxication; don't go past them. If you can't stop yourself after a beer or two then don't drink at all. If your boss is stupid and won't let you be present, then at least have them send along a proposal signed off by you.

Gun for a promotion. Figure it out. Solve problems. Don't be a whiny bitch. Don't be a pushover. There, I said it, now you go do it.

H110Hawk
Dec 28, 2006

three posted:

After discussion, I think it was more of that my manager thought it would work... and it likely won't; I don't believe the Dell rep ever said specifically it would work with Oracle.

He's not upset about it, and it isn't that big of a deal...

Cool. Those sorts of things have a tendency to trickle down in corporate culture with bad results for the ones at the bottom of the hill.

H110Hawk
Dec 28, 2006

lilbean posted:

I've used one for a year now on Solaris 10, beat the poo poo out of it and I love it. H10hawk follows the thread too and manages like a dozen of them, so this is as good a place to ask questions as any.

I actually quit that job a few months ago. And it was 30+ thumpers, I lost count. :X I also only used Solaris 10 to great success and my replacement was hell bent on OpenSolaris. Last I heard it kept ending in tears. Stay clear of that hippie bullshit and you should be fine.

FISHMANPET posted:

Is it possible to use an iSCSI card to share a target?

I've never used iSCSI, but from what I've read an "iSCSI card" is nothing more than a glorified network card with an iSCSI stack baked into it. ZFS handles this internally, so that money was wasted. I would keep this fact in your back pocket for when they try to lord other things they don't understand over you, since this one is them spending money they shouldn't have. The rest of this post is about justifying the space you're going to "waste."

quote:

Also, since you guys manage a bunch of thumpers, what should I tell my boss as to why I shouldn't make 2 20+2 RAIDZ2 pools? I had a hard enough time convicing him to let me use *both* system drives for my root partitions (1 TB? But it has Compact Flash!) and now I'm trying to to use one of these from the 'ZFS Best Practices Guide'

In all fairness, it does have a compact flash port. Use it. Hell, use it as a concession to them. Did they buy the support contract with your thumper, even Bronze? Call them suckers up and ask for a best practices configuration on your very first thumper (Of Many!), and if they balk at it call your sales rep and ask them to get it for you. Get it in an email from Sun. Tell them your honest reliability concerns.

Now, think long and hard about how many parity disks you need, and how many hot spares you want. Your target with snapshots is 50% of raw space as usable. I tended to get 11T/thumper. In all honesty it isn't going to matter, because management is going to be Right(tm) and you are going to be Wrong(tm). I would set up 6 raid groups and "waste" those last 4 drives or whatever on hotspares, or just use RAIDZ instead of RAIDZ2 and reclaim a few terabytes. You'll have 4 hotspares, but you will still need to monitor very diligently for failures, as it takes forever to rebuild a raidgroup.
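
At the command line the layout I'm describing looks roughly like this. Shrunk-down sketch with made-up device names; the real thing would be six raidz2 groups plus the four spares (and the two system disks):

code:
zpool create tank \
  raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 \
  raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 \
  spare  c2t0d0 c2t1d0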

Caveats: Update to the latest version of Solaris 10 and upgrade your zpool. When resilvering a raidgroup do not take snapshots or run other similar operations. Unless they've fixed it, doing anything like that restarts the resilvering process.

Edit: Oh, and stop swearing at Solaris. It can hear you, and it will punish you. Instead, embrace it, and hold a smug sense of superiority over others for knowing how things were done Back In The Day. Back when they did things the Right Way(tm). :clint:

H110Hawk
Dec 28, 2006

Serfer posted:

This might not be the right crowd, but any idea how something like a Netapp box has a pool of drives which are connected to the two head units? They can assign which drives go to which in software as well. The only solutions I can come up with involve a single point of failure (eg, using a controller and serving up the drives each as their own LUN to the heads).

Via magic, faeries, pixie dust, and most importantly lots of money.

This blog has a really great picture to illustrate the setup:

http://netapp-blog.blogspot.com/2009/08/netapp-activeactive-vs-activepassive.html

The closest thing to a single point of failure is the cluster interconnect for NVRAM mirroring. However, if the interconnect fails your cluster continues to serve data in its current, no-longer-fault-tolerant state, but will not transition to a new state. This means that if filer A is currently active for both A and B, it will continue to be upon cluster link failure. If filer A and filer B are each serving their own data, they will never fail over to the other automatically.

The filers maintain some state information on the disks themselves in a reserved few blocks for filer A and filer B respectively so they can make educated guesses about the other filer's state. There are VERY dire warnings and consequences to acting upon a filer when it cannot sense its neighbor.

Never disagree with what a RAID setup thinks about your array without very good reason. (This is just unsolicited advice. It's the most concise way I train people in using storage systems, as it is what every action boils down to on a fileserver.)

H110Hawk
Dec 28, 2006

FISHMANPET posted:

I was wondering if anybody could provice a useful link on SAS expanders? I've seen all sorts of SAS cards that say they support 100+ drives, but I don't understand how they do it. Google isn't helpful for once, and I'm just really curious.

You pretty much buy a SAS card with an external* connector, and then a SAS backplane. Plug those two things together and voila! It's that simple. Just make sure you have an in and out port on your backplane if you want to daisy chain things, and that you aren't going to run out of SAS bandwidth.

For example: http://www.supermicro.com/products/chassis/4U/846/SC846TQ-R1200.cfm

Some companies sell little connectors which run the backplane ports to the back of the chassis like an expansion card. It is literally a female->male connector with the female end screwed to a custom cut slot cover.

*External is not mandatory; it's the same plug, signalling, etc., just wired out the back instead of to the inside. You can run the cables however you like.

H110Hawk
Dec 28, 2006

optikalus posted:

I ran into this problem on my PowerVault 220S with 14 146GB drives. One of the drives failed (Fujitsu), so I replaced it with a new Fujitsu and the Adaptec RAID card would not use the drive. It was 2MB smaller than the other drives >:(

I don't think this has anything to do with the partition size, though.

You have to be sure you read the Guaranteed Sector Count on any disk you purchase to replace an existing one. You are correct, it's not the partition size, but the size of whatever "thing" your array sees when building itself. This could be an exported multi-disk device (think raid10), a partition/file (when doing testing of raid subsystems), or the raw block device itself.

You can add a failsafe to this by lowering the used sector count in your raid controller software for each disk while building your array. Even if your array asks you "How many gigs do you want to use on this disk?" there is typically a way to see the actual block/sector counts. Where do you set it? 1% should be totally safe, but an easy way to tell is to look at all the major disk manufacturers for similar size disks, pick the smallest number, and reduce that by a tiny percentage. Even then just pay attention to the spec sheet when ordering and send it back if it doesn't match spec.
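
If it's Linux software RAID the knob is right there on the command line. Sketch with made-up device names, assuming roughly 1TB disks:

code:
# use ~1% less than the smallest disk so a slightly short replacement still fits later
# (--size is per-device, in KiB)
mdadm --create /dev/md0 --level=5 --raid-devices=4 --size=967000000 \
  /dev/sdb /dev/sdc /dev/sdd /dev/sde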

If you want an example of this, look at a Netapp sysconfig -r output. Compare the Logical to Physical sector counts. You will see Logical is far lower than Physical. This helps with block remapping, and with the fact that they don't have direct control over the manufacturing process and will send you Hitachi, Seagate, or Fujitsu disks as replacements.

H110Hawk fucked around with this message at 22:23 on Apr 10, 2010

H110Hawk
Dec 28, 2006

Intraveinous posted:

AFAIK, nothing says I can't replace this box with non-IBM Power hardware, so I'm thinking about dumping it on a BL460/465c blade (CPU licensing costs will likely skew things in Intel's favor since I should still be able to get dual core 550x cpu) with one of the 80GB SSDs. FusionIO and HP have been claiming north of 100K IOPS, and 600-800MB/sec read rates from this kit.

Assuming you have some kind of HA way to fail over intra-datacenter and inter-datacenter you could do just what you are suggesting. I'm averse to blades, but whatever makes you happy. I would grab 4 of the cheapest 1U Dell/HP/IBM/whatever you can find with 5520 CPUs in them, fill them with memory and a boot disk. When they get there, velcro an Intel SSD into one of the empty bays. It doesn't need cooling; you could even leave the plastic spacer in there as a caddy.

Use two in your live datacenter and two in your DR. Have your generate-index script write to the live server's hot spare, fail over, write to the new hot spare, then write to the hot-spare-datacenter servers serially. Remember to detect write failures so you know when your SSD dies and call it a day.

I would suggest a full commodity hardware solution but I guess that wouldn't go over well. Instead of an off the shelf intel ssd you could use one of those PCI-E SSD's you were looking at as well.

H110Hawk
Dec 28, 2006

pelle posted:

Blue Arc:
Mercury 100
with 4 disk enclosures with 48x1 TB SAS 7200 RPM drives.

Has BlueArc gotten any better than the steaming pile of poo poo they were in the Titan 2000 days? Apparently our sales guy was super shady and eventually got fired. I met the new guy and he said most of his time is spent cleaning up the mess the old guy made. Adding a dose of reality to that, only half of our problems were likely caused by the sales guy; the other half were due to poor hardware.

H110Hawk
Dec 28, 2006
Redundant is another word for money spent to watch multiple things fail simultaneously.

H110Hawk
Dec 28, 2006

three posted:

He was there to identify any slowness in our Oracle database. His conclusion was that iSCSI was not capable of handling the network traffic, as iSCSI "maxes out at around 45MBps". His solution was: upgrade to 10GbE iSCSI or move to FC.

Our SAN is an Equallogic SAN with 4 ports all connected, and they each hit around 45MBps (4x45MBps). He said this was the cause of disk queueing showing up in perfmon.

First I would make sure your setup is indeed capable of performing faster. Double check that your disks are bored, your server isn't bottlenecked elsewhere, and that the network itself is capable of filling the pipe. Throwing 10GbE at a problem without verifying the root cause is a great way to get 45MBps (bits? bytes?) of throughput on a 10Gbps pipe.

Have your network guy sign off on the network after actually checking the ports. Make sure your NIC's are the right brand (Intel, Broadcom if you must), etc.

H110Hawk
Dec 28, 2006

three posted:

Broadcom NIC

Make sure your NIC doesn't suck. Find some way to make that thing roll out a gigabit of traffic to another box. You might need to schedule a maintenance window for whatever is hammering your SAN. (iperf? What are the kids using these days?) Have your network lackey `show int` on both sides of the connection and all intermediate ones.
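
Something like this, with made-up addresses, is all it takes to see whether the network can even do line rate:

code:
# on a test box sitting on the SAN network
iperf -s
# on the server doing the complaining, push traffic for 30 seconds
iperf -c 10.0.0.50 -t 30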

Throw your NIC away and put an Intel in there; it's not worth loving around.

H110Hawk
Dec 28, 2006

Misogynist posted:

Our setup, mentioned above, also ran on the integrated Broadcom NetXtreme II NICs that come in most IBM servers. We hit wire speeds handily.

This is very likely a complete waste of money if the driver isn't shown to be sucking down a ton of CPU. I've never seen a gigabit network just plain cap out at half its specified speed.

There is a special place in hell for the bnx2 linux driver and the related Broadcom NetXtreme II NICs.

H110Hawk
Dec 28, 2006

Misogynist posted:

Is this true in the general case? We've only had problems related to major regressions in RHEL 5.3/5.4 kernel updates that only affected machines using jumbo frames. Everything else has been running pretty smoothly.

Yes. I fought them constantly at my old job running debian on a wide range of kernels, and now again at this job. We are using CentOS 5.4 here. Though to be fair, we did get the RHN article which says how to solve the problem of "whoops! the bnx2 driver crashed under load!"

code:
# head -n 1 /etc/modprobe.conf
options bnx2 disable_msi=1
We just bought cases of cheap intel dual port 10/100 nics and threw them into nearly every bnx2 server we had at my last job. You can get them second hand quite cheaply if you're willing to buy whole stocks of them.

Here a prerequisite of our new servers was that they had Intel NIC's. Of course the handful of Dell's we bought to supplement those bulk servers use bnx2 nic's for no apparent reason. Had we not found that modprobe article we would have bought Intel gig server adapters for them, and I still might.

H110Hawk
Dec 28, 2006
Depending on your application you can save whole bags of money by buying those 3-5 year old storage units off people who had to forklift upgrade. Sometimes it is far cheaper to buy two of everything and design your application to tolerate a "tripped over the cord" level event.

To answer your question, yes I am that insane. I also ride a unicorn to work.

H110Hawk
Dec 28, 2006

skipdogg posted:

It's not uncommon to get 40% off list or more.

50% or you aren't even trying, and frankly you're wasting the sales guy's time. 60% + lunch if you have the time to really turn the screws. Dinner and event tickets should follow the sale to discuss your upcoming projects.

H110Hawk
Dec 28, 2006

oblomov posted:

Huh, and I thought our discount was good.... Going to have to talk to procurement.

I should start a consulting firm which splits the difference of money saved by people paying 51-100% MSRP for SAN gear. Everyone knows the big profit is in the support contract, and that the markup on hardware is there for customers like the military who will gladly pay $2600 for a NetApp®™ hard drive.

You don't even have to call $competitor, just say you have, reduce the price $company_a gave you by whatever % and claim to have a quote in front of you saying that. Ask what your % discount is and just ignore them until they give you 50% off. Remember how all those quotes say SUPER-DUPER-TOP-SECRET at the top? You can't show $company_a the quote from $competitor, that would be dishonest! The worst they do is come back and tell you they won't go any lower, take it or leave it. After about two times with them saying that decide if you want to pay that amount.

Grow a pair. :clint:

H110Hawk
Dec 28, 2006

Xenomorph posted:

but when I click

Ding! Call a sales rep.

H110Hawk
Dec 28, 2006

Misogynist posted:

Not just because of this issue, but mostly because you're still very new at acquiring hardware -- you will always get a better price with someone on the phone than you will the website, even if you don't try to haggle with them. Just find what you want and ask them for a quote. (I think with Dell you can just save your order and refer to your cart number, which makes things easy.)

Precisely. The big problem was he was clicking and not calling. :) Make an account; if you are a real company, go ahead and establish a line of credit with them once you have an initial quote. The price might drop, and Net30 is always preferable to paying cash up front.

There will come a time on the phone call when, in a very serious tone, your rep is going to say how he has to get special approval for any further discounts. It will all sound very foreboding. The appropriate response is "Ok, when should I expect the updated pricing?" It will be 2-3 business days.

Also, reading the above post just cost you $2,000 in my consulting fee.

H110Hawk
Dec 28, 2006

Corvettefisher posted:

Does anyone know how to make a RAID 10 in Freenas? I can't seem to find it anywhere, all the guides point me to raid 5 or 1 or 0 but not 1+0? Am I missing something or does freenas not do this. Yes this is software emulation

Maybe it's like some really terrible LSI firmware versions which are out there. Do you have to make a bunch of raid 1's, then start over and make a raid 0 out of all of your mirrors?
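
If your FreeNAS build is giving you ZFS underneath, a pool of striped mirrors is the moral equivalent of RAID 10 and skips the firmware silliness entirely. Sketch with made-up device names:

code:
# two mirrored pairs striped together = RAID 10, more or less
zpool create tank mirror ada0 ada1 mirror ada2 ada3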

H110Hawk
Dec 28, 2006

ferrit posted:

2) Setup a proper reallocate schedule for all volumes to ensure that the volumes aren't "fragmented" and that we're not running into a hot disk scenario. We've tried this and although it appears to help somewhat, there are several times when we still see latency rise due to the back-to-back consistency points.

I'm a bit rusty on this, but you can check for a hot disk yourself and see exactly how much performance you're gaining from reallocation. I haven't sat at a NetApp console in a year+ so verify these commands before running them.

During your worst IO performance times, when you are getting back-to-back CPs, do:
# priv set advanced
# statit -b
(wait several minutes)
# statit -e
# priv set

This should give you a whole pile of output. Look through the disk utilization numbers and see how you are doing. Netapp still does consolidated parity, right? This means 1 or 2 of your disks per raid group will show some piddling amount of utilization and that is normal.

Also look through sysconfig -r, wafl scan status, and options to make sure you aren't doing some kind of constant scrubbing or other high impact job during peak hours. Any scrub jobs should be paused during times of extremely high utilization.

Sometimes you can just slap a bigger NVRAM card into your existing netapp. This might get into warranty voiding territory.

I've never used Oracle, but make sure you are doing aligned writes to your netapp. Ask Netapp support how you can verify that. One block write should net one data block write (and one or two parity blocks) on your netapp.
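
One host-side sanity check you can do yourself (sketch assuming a Linux host and a made-up device; NetApp support can give you the filer-side version):

code:
# list partition start offsets in 512-byte sectors
fdisk -lu /dev/sdb
# a start sector divisible by 8 lines up with 4k WAFL blocks;
# the classic misaligned case is a partition starting at sector 63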

At a certain point management is going to have to bite the bullet on extra disks. Depending on how risk averse you are you can spend as much or as little as you want. PM me for details.

This is also a pretty good source of diagnostic commands. Temporarily bypass the ssl warnings. Do not under any circumstances run a command which you do not understand. Do not argue with the filer. The filer will win all arguments. The filer knows best.
https://wiki.smallroom.net/doku.php?id=storage:netapp:general

H110Hawk fucked around with this message at 16:58 on Nov 18, 2010

H110Hawk
Dec 28, 2006

ferrit posted:

My statit's are definitely showing that the last rg (the shelf with the 450 GB disks) have a higher utilisation figure than the other 3 rg's with the 150 GB disks - but could that just be because they are bigger disks and thus utilised more? I thought that the volume reallocate would have assisted us in levelling this out.

Do NetApp even sell NVRAM upgrades anymore?

Another quick question to throw in the mix, not necessarily related to this issue - aggregate snapshots. Do you have them enabled or not?

The 450GB disks are going to run hot because new blocks are likely to wind up there as you reach capacity on the 150gb disks. You have roughly the same IOPS performance, I assume, between the 150gb and the 450gb disks. This means you are trying to pull more blocks from the same number of IOPS and creating a bottleneck. If your VAR did not explain this they should be raked over the coals.

I have no idea if they ever sold NVRAM upgrades. I do know that if you slapped a bigger NVRAM card into an older box it would use it. :q:

Snapshots have their place. If you have no use for them then just disable them. Snap reserve is a magical thing sometimes, as it can let you squeeze some "whoops!" space out of the device much like ext2's root reserved space. One common thing you can do with them is tie them to your frontend software, issue a coalesce/get ready for backup command, fire a snapshot, then release the lock. This then lets you do a copy of the snapshot somewhere else for backup. If this is not a part of your backup system then don't worry about it. I personally love snapshots.
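
The quiesce-then-snapshot dance usually boils down to a couple of lines in a cron script. Rough sketch: the quiesce/release commands are stand-ins for whatever your frontend actually provides, and the filer name is made up.

code:
/usr/local/bin/app-quiesce                    # tell the app to get itself consistent (placeholder)
ssh filer01 snap create vol_data nightly.0    # fire the snapshot on the filer
/usr/local/bin/app-release                    # let the app write again (placeholder)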

H110Hawk
Dec 28, 2006

adorai posted:

If they will only provision storage as NFS from whatever device they have, the solution is simple: use an opensolaris/openindiana/openfiler/plain old linux VM stored on NFS that presents that storage as iSCSI. Problem solved, cost: $0 and the storage admin doesn't know any better.

This seems like such a terrible hack. It doesn't cost $0, and when has playing tricks on storage/network/system admins ever wound up being a net benefit when they inevitably find out?

H110Hawk
Dec 28, 2006

quote:

let me give you a 2 minute tutorial on what buttons NOT to press. Have fun!

All of them. :v:

Remember: The enterprise storage device is always right. Arguing with the storage device will result in forfeiture of your weekend, especially on a Monday.

H110Hawk
Dec 28, 2006

conntrack posted:

right sized

Did i make this post a thousand times before in this thread? I ask these questions in alot of forums and many people just go BUT WE NEEED SSD BECAUSE ITS COOOOOOL.

I didn't think "right sized" was an actual term. Color me surprised. http://www.oxforddictionaries.com/definition/rightsize?view=uk

You need to buy an SSD or 3 and see if they are right for your applications. We bought one and extrapolated some data which condensed 52 spinning 7200 rpm disks into 6 SSD's. Coupled with the fact that we have one disk per server it was incredible savings.

H110Hawk
Dec 28, 2006

Spamtron7000 posted:

Does anyone have straight information on EMC FAST and the element size? My understanding is that data tiering within a Clariion using FAST only occurs on a 1Gb element size - meaning that moving a "hot spot" really means moving 1Gb chunk of data.

To give you an idea of the other end of the spectrum BlueArc tried to do this with their Data Migrator which would offload files from Fast -> Slow storage based on criteria you set. This happened at the file level, so if you had a bajillion files you wound up with a bajillion links to the slow storage. I'm not saying one way is better than the other, or one implementation is better than another, but there are extremes both ways with this sort of thing.

I for one would bet EMC has it less-wrong than BlueArc. Is their system designed for Exchange datastores? Is there a consideration in how you configure Exchange to deal with this?

H110Hawk
Dec 28, 2006

Misogynist posted:

Exchange 2010 is basically built to run on tier 3 storage to begin with (90% IOPS reduction from 2003 and most of what's left is sequential), so this may not be the best example application anymore.

Slick. I remain blessedly ignorant of how Exchange works. Do you have to tell it how to behave for 3-tier, or does it expect that to be handled behind the scenes by the block device?

H110Hawk
Dec 28, 2006

Cavepimp posted:

I just inherited a little bit of a mess of a network, including a NetApp StoreVault S500. From what I can gather it's a few years old and no longer under maintenance.

That was junk when it was bought. I fell for the same pitch and wound up with 3 of them.

H110Hawk
Dec 28, 2006

Dreadite posted:

We're looking at the possibility of getting a SAN for our office of ~25 users. We wouldn't need more than 2TB of space, or anything particularly fast, but we'd like one for all the cool features that come with a SAN.

The problem seems to be price, as we're only looking to spend 10-12k. Is getting something to meet our modest needs under 12k completely unrealistic? What brands should we be looking at? I've gotten a quote for 15k for a NetApp 2020, but we'd really like to spend less than that so it doesn't cut into our budget for new servers.

Do you know what % that is off list price? Push back on the price until they say no. Tell them your budget is $10k, then when/if they come back with a $12k quote, bite. Remember you have to renew that support contract annually or purchase a third party one.

http://www.peppm.org/Products/netapp/price.pdf
http://www.macmall.com/p/NetApp-NAS-%28Network-Attached-Storage%29/product~dpno~7780983~pdp.febgdcc

H110Hawk
Dec 28, 2006

Dreadite posted:

This is good advice. Something I noticed was that this particular vendor quoted 11k for the actual hardware and 3600 dollars for what appears to be "racking and stacking" the server in our noc. Needless to say, that's outrageous, but this is my first time buying a piece of hardware in this way. Is that to be expected with all vendors, or can I find someone who will just send me my hardware?

A lot of people like help racking their hardware. Others require it for warranty coverage. Ask them exactly what that entails. If it doesn't involve a lot of actual setup work, such as aggregate planning, network configuration, etc., calmly explain to them that you are a very technical group and can trivially rack a server yourself with clear instructions. They have flexibility in that price because it's a cost internal to them. One of their technical lackeys, possibly the same guy in the sales meetings with you, is going to drive out and unbox all the stuff to rack it.

That being said, if you're buying one disk tray or a head/tray combo unit $3,600 is a lot of money. Keep in mind that this is nothing personal, and that MSRP is so high on these boxes because there are companies which actually pay that much or are happy with 5%-10% off list as a killer deal.

Dreadite posted:

Edit: I'm actually waiting on a quote from another couple of vendors for some EMC equipment and an HP Lefthand setup, I'll probably report back with those prices too so I can get a feel if the prices are fair.

Be sure and share these numbers around in a circle. Prices have ways of suddenly dropping when competition is introduced. "Well, I have a quote for similar gear for $10,000." It works better if you actually have a quote with that number on it. If they ask to see it remember that they all say "TOP SECRET" across the top so point that out in their quote and say you can't share them out of respect for the vendors putting in all of this hard work. Explain the parts and services included, but not which vendor is giving it to you.

H110Hawk fucked around with this message at 16:01 on Apr 22, 2011

H110Hawk
Dec 28, 2006

paperchaseguy posted:

Information Lifecycle Management

gently caress if anyone else knows what that means, either.

Think of an automatic paper shredder tied to your data retention policy.

H110Hawk
Dec 28, 2006

FISHMANPET posted:

It was so much easier when I could blindly hate Oracle, but then I get put in a meeting where I discuss Solaris Express build numbers with the engineer on the phone :stare:

This is the magical support that you're looking for from Sun. I haven't used them in a few years so I don't know if Oracle has broken this ability, but once you get a good support person keep them. Cut your ticket and just email the number to the person.

When we were having persistent problems with our fleet of X4500's we eventually got put in touch with a guy who had been with the company since back in the day. He was openly hostile about "Those new programs they've crammed into Solaris 10." Kids these days. He got the job done though when the others were stumped.
