revmoo
May 25, 2006

#basta
I ordered 8 3TB SAS drives and then it hit me that this thing might not have the raid controller to handle them.

It's a Poweredge 2900 ECM01. I see this in lspci:

RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

Is there a max drive capacity on this RAID controller, and if so what is it? And if I need to replace it what do I need to buy?

I don't care about the hardware RAID, I'll be using LVM, just need something that can address the 3TB drives.

originalnickname
Mar 9, 2005

tree
Google says less than or equal to 2TB. I say upgrade the firmware since you've got the drives on order anyway and if it works, yay you!

revmoo
May 25, 2006

#basta
I'm seeing conflicting info _everywhere_ but it appears you're right and I need a H700 controller.

Is this right?

originalnickname
Mar 9, 2005

tree
All I can say is what works for me. I can personally attest that my fileserver has this one in IT mode and can do 4TB drives:

LSI Logic / Symbios Logic MegaRAID SAS 2008 [Falcon] (rev 03)

I'm passing all the drives through to ZFS... Good luck!

revmoo
May 25, 2006

#basta
Drives arrived, the RAID controller sees them but doesn't seem to be able to do anything with them. I've got an H700 controller on the way which theoretically can handle them.

What's the best way to get these drives just plainly accessible to the OS so I can do RAID in software?

netwerk23
Aug 22, 2000
I spelled 'network' wrong.

revmoo posted:

What's the best way to get these drives just plainly accessible to the OS so I can do RAID in software?

JBOD?

originalnickname
Mar 9, 2005

tree
From Reddit (grain of salt)

That card you are referring to is a SAS1078 chipset. The most similar generic card is the LSI Logic MegaRAID SAS 8708EM2 RAID Controller; the similar OEM card is a Dell PERC 6/i. Unfortunately this card does not support IT mode, but some research shows you can configure each disk as its own RAID 0 in the RAID controller BIOS and it will report each disk individually. If it's cheap or free, it may be worth an attempt. It will likely slow the boot time down a bit while the RAID card initializes, but most "servers" don't reboot too often anyway.

If it doesn't have a JBOD option, looks like that's the only way to do it.

revmoo
May 25, 2006

#basta
The individual RAID 0 was my conclusion as well. Kind of clunky but no worries. I'm going to run two 15k RAID 1 drives for the os with the controller and the 8 drives in individual RAID 0 mode to manage with LVM.

What tools do I use in linux to manage the H700? Mainly how do I monitor the health of the RAID 1 os drives, and how do I tell it that I want to hot swap in a new drive and rebuild the two disk array?

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved
OMSA, and you can use SNMP to manage traps, or use smartmontools.

revmoo
May 25, 2006

#basta
Any way to run that on Ubuntu or do I have to use RHEL?

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved
http://linux.dell.com/repo/community/ubuntu/

If you’re running it on your HV maybe CentOS is a better option so you have ongoing updates for the next decade. It’s what Dell formally supports for managing its platform. Your VMs can be whatever you want them to be even if OS lifecycles are much shorter.

Edit: you’re virtualizing this right? :psyduck:

nem fucked around with this message at 21:12 on Nov 8, 2018

revmoo
May 25, 2006

#basta
Nah it's just a file storage box. I just need a huge storage pool and an OS.

Got my H700 RAID card in today. They changed the cabling.

New cables on order, at least there's a way to convert!

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved

revmoo posted:

Nah it's just a porn storage box.

CentOS + kvm on the hypervisor then so you can take advantage of Dell's instrumentation software through a supported route. 2900 is the first family to support virtualization, but it'll take a hit without VT-D... then again for a fileserver not a big deal.

Create the LVM layer on your HV in CentOS, then you can roll whatever Ubuntu distribution you want as a guest. Share your fileserver mount to the guest. The upside is you can blow out the OS whenever you please without any loss of data or having to work out the LVM configuration again. If you need to reboot, 15 seconds will be far more pleasant than listening to your turbine farm spin up for 30 seconds with another 3 minutes of BIOS initialization.

Another thing I've seen is that dsm services can be picky on kernel/OS/hardware combinations, meaning you may very well run into a combination of the three where your SNMP traps and omreport utility suddenly fail to report any devices. Standardizing your OS and leaving it alone gives you peace of mind that monitoring will work for the indeterminate future.

revmoo
May 25, 2006

#basta
That's pretty solid advice, thanks!

revmoo
May 25, 2006

#basta
Ok so new issue. I have ten disks in the box, two OS drives in RAID 1 in the top two slots and 8 3TB disks in the main bay slots. With the H700 controller installed I'm able to initialize all ten disks and the CentOS installer sees all 9 VDs.

Problem is on boot it says "Flex bay cable missing / misconfigured"

I've tried re-seating the cables, didn't seem to help. Anybody know what's going on?

EDIT: After ordering $80 worth of cables that allegedly have a different pinout and seemed to work for a lot of folks, I read a post about moving the H700 to another slot. Worked. No more error.

Weird. Also kind of annoying because now I have this card just hanging in the slot with nothing to retain it. I specifically ordered the integrated card and now I wish I hadn't.

revmoo fucked around with this message at 16:12 on Nov 11, 2018

revmoo
May 25, 2006

#basta
New question, how the heck do you actually install OMSA on CentOS?

revmoo
May 25, 2006

#basta
Never mind, I don't care. CentOS is a dumpster fire. I'll use Ubuntu.

It took me 15 mins to install Ubuntu and get ssh server going. It took an hour and a half with CentOS and the final straw was getting a nice core dump every time I ran service --status-all

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved
code:
wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash
From Dell's setup instructions.

CentOS/Redhat hasn't used SysV since 7.0 released July 2014. systemd all the way - systemctl status

revmoo
May 25, 2006

#basta
*removed angry rant against omv developers*

revmoo fucked around with this message at 23:40 on Nov 15, 2018

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
The title of this thread is what the headline of The Daily Mail would look like, if the world were populated by Goons.

revmoo
May 25, 2006

#basta
Pretty much.

Server's working great, just trying to figure out monitoring for the drives, psu, and fans. This stuff is super confusing and guides you find online are all out of date.

Btw why do you have to tell smartctl a /dev/sd* device AND the megaraid #? It seems like it ignores the sd* device given... so why require it?

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
On Linux if you have smartmontools installed it's:

code:
smartctl -a /dev/sd*
In order to view S.M.A.R.T info for a disk. You may be missing a hyphen.

revmoo
May 25, 2006

#basta
I know but it makes me do:

smartctl -a /dev/sda -d megaraid,X

incrementing the 'X' from 0-9 to see all ten drives, even though I can keep /dev/sda as /dev/sda. It makes no sense.

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
Try this then:

smartctl -a /dev/sda -d megaraid,{0..9}

[something tells me it mightn't work. depends upon if it's bash interpreting the command or smartctl. I've never had to specify anything after the drive letter for my uses]
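
For reference, bash expands the braces before smartctl ever runs, so the command above turns into a single `-d megaraid,0` followed by nine stray `megaraid,N` words instead of ten separate queries. A minimal sketch of the loop form, assuming the disks sit at IDs 0 through 9 and that /dev/sda is any block device on the same controller (the function name `megaraid_cmds` is made up for illustration):

```shell
# Build one smartctl command line per megaraid device ID.
# Assumptions: the ID range and the /dev/sdX device are passed in by the caller;
# 0-9 and /dev/sda below simply mirror the ten-disk setup in this thread.
megaraid_cmds() {
    first=$1; last=$2; dev=$3
    id=$first
    while [ "$id" -le "$last" ]; do
        printf 'smartctl -a -d megaraid,%s %s\n' "$id" "$dev"
        id=$((id + 1))
    done
}

# Dry run: print the ten commands instead of executing them.
megaraid_cmds 0 9 /dev/sda
```

Piping the output to `sh` as root would actually run the queries; printing first makes it easy to check the IDs before touching the drives.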

apropos man fucked around with this message at 22:23 on Nov 19, 2018

revmoo
May 25, 2006

#basta
I don't think you're getting my problem. I'm able to read the SMART data from all ten drives just fine.

I just don't understand why I'm forced to give the -a flag a random drive to use to read data from unrelated drives.

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
I'm looking at it from a Linux command line POV. I don't know why it's asking for that either. You're on Ubuntu, right? What sort of array/filesystem are you using?

revmoo
May 25, 2006

#basta
My setup is:

2 146GB 15k SAS drives in RAID 1 using hardware raid for the OS

8 3TB 7200 SAS drives in RAID 6 using software raid in Ubuntu

H700 RAID controller

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
Does the RAID controller assign an array to /dev/sdX?

If so, is it both the hardware and software RAID or just one of?

I'll leave you to it, but I'm guessing that the array is defined as /dev/sdX and the only way that smartmontools can interpret which disk you're querying from which array, is by

1. adding the disk type (-d megaraid) to tell smartmontools that you're using an array
2. specify the array (because you have two of them)
3. and finally the disk ID in that array.

apropos man fucked around with this message at 23:00 on Nov 19, 2018

revmoo
May 25, 2006

#basta

apropos man posted:

Does the RAID controller assign an array to /dev/sdX?

If so, is it both the hardware and software RAID or just one of?

There is a /dev/sdX for every VD (9)

I can leave the smartctl command at /dev/sda and enumerate all ten drives by changing the megaraid number

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
You have 9 Virtual Disks? On a RAID 1 based OS with a RAID 6 storage?

Shouldn't that be two VD's?

revmoo
May 25, 2006

#basta
No. The 8 storage drives are configured as individual RAID 0 to present to the OS.

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
You said earlier that your 8 drives were in RAID 6.

revmoo
May 25, 2006

#basta
You might want to go re-read what I posted.

I have ten (10) drives. Two in hardware raid1 and eight in software raid6, resulting in 9 VDs presented to the OS.

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!

revmoo posted:

No. The 8 storage drives are configured as individual RAID 0 to present to the OS.

Anyways, I found this blogpost and they are also perturbed by the curious semantics you seem to be employing in order to query individual drives:

https://blog.inf.ed.ac.uk/chris/smartctl-and-megaraid/

Do they come to a satisfactory conclusion? Nope. It looks like you can keep on specifying /dev/sdWHATEVER to specify the array and it's the disk ID number that's the definitive article in this case.

apropos man fucked around with this message at 23:18 on Nov 19, 2018

revmoo
May 25, 2006

#basta
That's so odd. Good to see I'm not the only one though.

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
If you continue to use Linux for a while you'll come across 'features' like this. The various hundreds of packages and add-ons sometimes have things that could be de-bugged. smartmontools doesn't come as default on all distros, so I guess you could call it an add-on. I've even spotted a glitch in the timer arguments for systemd before, and I'm by no means an expert, but I digress.

What you've noticed doesn't seem like it affects the overall functionality of smartctl - it's just an oddity that might get sorted out some day. I've got a smartctl bash script that runs on a systemd timer once a month for each of my drives and it's always worked well but I'm only using a RAID 1 mirror.

If it's annoying for you to specify each drive in turn to see SMART status then it might be a good opportunity to start learning some simple bash scripting and create your own script that checks each drive in turn and greps out the relevant results for you. Once you get the hang of the shell a bit more you can do cool stuff like set up mailing with a Gmail account, then have your PC email some SMART test results to you once a month. Google setting up "postfix MTA Gmail" if you want stuff like that to happen, but first I'd experiment with some scripting.
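
A rough starting point for that kind of script, as a sketch only. It assumes ten disks at megaraid IDs 0 through 9 behind /dev/sda, as described earlier in the thread, and the grep patterns are guesses at which lines matter, so adjust them to whatever your drives actually print. The function names `smart_summary` and `check_all` are made up for illustration:

```shell
# Keep only the health-related lines from smartctl output read on stdin.
# Assumption: these patterns match the wording your drives report; tune as needed.
smart_summary() {
    grep -E -i 'health|defect list|serial number'
}

# Query every disk behind the controller in turn and print a short summary.
# Assumption: IDs 0-9 and /dev/sda match the ten-disk H700 setup in this thread.
check_all() {
    for id in 0 1 2 3 4 5 6 7 8 9; do
        echo "== megaraid,$id =="
        smartctl -a -d "megaraid,$id" /dev/sda | smart_summary
    done
}

# Usage (as root, with smartmontools installed):
#   check_all
```

Keeping the filter in its own function means you can test the patterns against saved smartctl output without touching the drives.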

nem
Jan 4, 2003

panel.dev
apnscp: cPanel evolved
As a complement check out ntfy, which supports a variety of backends including desktop, Slack, syslog, and push.

revmoo
May 25, 2006

#basta
So I kicked off a smartctl long test on all 10 drives about two weeks ago. The 146GB OS drives finished in a normal time span. The 3TB drives are STILL testing, literal weeks later.

What gives?

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!
I can't answer your question but, personally, I have my smartctl tests staggered on a timer. So /dev/sda gets done on the 1st of the month, /dev/sdb on the 2nd, /dev/sdc on the 3rd etc.

I don't feel it's a particularly good idea to wham all my drives at the same time.
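
One way to sketch that staggering for the ten-disk megaraid box in this thread, run once a day from cron or a systemd timer. The mapping (day 1 tests ID 0, day 2 tests ID 1, and so on), the disk count, and the function name `disk_for_day` are all assumptions for illustration:

```shell
# Pick which disk to test today: day 1 -> ID 0, day 2 -> ID 1, and so on.
# Prints nothing once the day of month runs past the number of disks.
disk_for_day() {
    day=$1            # day of month, 1-31, no leading zero
    count=$2          # number of disks behind the controller
    if [ "$day" -le "$count" ]; then
        echo $((day - 1))
    fi
}

# date +%d prints 01..31; strip a leading zero so "08" is not read as octal.
today=$(date +%d)
today=${today#0}

id=$(disk_for_day "$today" 10)
if [ -n "$id" ]; then
    # Dry run: drop the echo to actually start the long self-test.
    echo "smartctl -t long -d megaraid,$id /dev/sda"
fi
```

With ten disks, only the first ten days of the month start a test, so no two long tests are kicked off on the same day.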
