  • Locked thread
Tatsujin
Apr 26, 2004

:golgo:
EVERYONE EXCEPT THE HOT WOMEN
:golgo:
After spending a couple weeks bashing my head against a project to improve the dismal Puppet config I inherited, I came across Foreman, and it seems like a good choice: I don't have to reverse-engineer Puppet Enterprise or convince accounting to pay for it. At least those couple weeks greatly expanded my knowledge of repository-based environment management and cloud orchestration.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
https://www.youtube.com/watch?v=ZY8hnMnUDjU

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
Like the guy said, systems guys are terrible at writing APIs and unfortunately, systems guys are the only ones writing cloud software. I really think that Cloud Foundry has the right idea with their wholesale stealing appropriation of compatibility with AWS / EC2.

They did it the best, they are the largest, and thus the de facto standard.

Bhodi fucked around with this message at 01:08 on May 31, 2015

evol262
Nov 30, 2010
#!/usr/bin/perl
Eucalyptus also has this going for it

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Bhodi posted:

Like the guy said, systems guys are terrible at writing APIs and unfortunately, systems guys are the only ones writing cloud software. I really think that Cloud Foundry has the right idea with their wholesale stealing appropriation of compatibility with AWS / EC2.

They did it the best, they are the largest, and thus the de facto standard.
Did I miss a big announcement where Cloud Foundry got EC2-like IaaS features? Last I heard related to IaaS and CF was that Mirantis, the only OpenStack vendor that gets it, joined the Cloud Foundry Foundation back in April. Their focus has been PaaS and they've been mostly ignoring the IaaS side, outside of fuzzy lines in between like containerized apps.

MagnumOpus
Dec 7, 2006

Bhodi posted:

Like the guy said, systems guys are terrible at writing APIs and unfortunately, systems guys are the only ones writing cloud software.

That's an interesting perspective. Most of my complaints about CF come from what appear to be developer-mindset decisions that go against literal decades of systems engineering convention, e.g. BOSH and its VCAP madness.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug

Vulture Culture posted:

Did I miss a big announcement where Cloud Foundry got EC2-like IaaS features? Last I heard related to IaaS and CF was that Mirantis, the only OpenStack vendor that gets it, joined the Cloud Foundry Foundation back in April. Their focus has been PaaS and they've been mostly ignoring the IaaS side, outside of fuzzy lines in between like containerized apps.
Sort of, mostly on the IaaS side: they cloned EC2 for querying/assigning tags, host lists, and some other similar stuff, including provisioning, IIRC. I don't know how deep the rabbit hole goes but they swiped the entire thing and claim compatibility and "ease of portability" so

I moved on before I got a chance to explore it too much, I work with cloudforms now.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
This came through my twitter feed today and I just have to laugh



A complicated and expensive looking contraption that won't actually work due to engineering issues? Perfect representation.

MagnumOpus
Dec 7, 2006

I was gonna try to white-knight OpenStack, but last week, when we came back up from our provider's maintenance window, we had a bunch of VMs with no volumes mounted and a couple that had the wrong volumes mounted, so I can't do that with a straight face. We were considering moving to Rackspace, but our west coast partner BU just decided to get out of there themselves for similar instability issues. I see a refactor to a new provisioner in our future.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Bhodi posted:

This came through my twitter feed today and I just have to laugh



A complicated and expensive looking contraption that won't actually work due to engineering issues? Perfect representation.
I keep running across these weird design decisions in OpenStack as I go further down the rabbit hole. I moved our Cinder storage (Ceph backend) to a private jumbo-frames network with new IPs the other day, and after reconfiguring all the clients, I couldn't figure out why it wasn't attaching to the volumes. It turns out that, despite having named multi-backend support enabled in Cinder, it had encoded the old IP addresses of my Ceph mons into the database. But not the Cinder database, where you might expect that as volume metadata -- the Nova database, on each individual block device attachment, buried in a JSON field in a database that gets converted verbatim into a section of libvirt.xml (???). Luckily I had a DB dump lying around that I could grep through, or I never would have found that poo poo.
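A quick-and-dirty way to hunt for that kind of stale data, sketched in Python against a database dump. The OLD_MONS addresses and the sample dump line below are invented for illustration, and the column layout is a simplification, not the exact Nova schema:

```python
import re

# Scan a mysqldump of the Nova database for block_device_mapping rows
# whose cached connection_info still points at old Ceph monitor IPs.
OLD_MONS = {"10.0.0.11", "10.0.0.12", "10.0.0.13"}

def stale_attachments(dump_lines):
    """Yield (line_no, ip) for every cached monitor address that is stale."""
    ip_re = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
    for n, line in enumerate(dump_lines, 1):
        if "connection_info" not in line:
            continue
        for ip in ip_re.findall(line):
            if ip in OLD_MONS:
                yield n, ip

# Illustrative dump line, shaped like the buried JSON described above:
dump = [
    'INSERT INTO block_device_mapping ... \'{"connection_info": '
    '{"data": {"hosts": ["10.0.0.11", "10.0.0.12"]}}}\'',
]
print(list(stale_attachments(dump)))
```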

But it's just like that Summit video says -- there's so many weird architectural decisions in OpenStack that arose strictly from people sneaking changes into weird places so they wouldn't have to deal with some other person.

Vulture Culture fucked around with this message at 22:09 on May 31, 2015

evol262
Nov 30, 2010
#!/usr/bin/perl
It's code reviewed before merge, but a lot of the weird decisions are because of this sense that openstack isn't really a platform, it's just a bunch of independent pieces talking to each other over a message bus with AAA handled by one common piece.

From the perspective of a developer who's never actually worked on a production environment and never had to migrate hardware or make big architectural changes, this stuff makes sense. Because the tests are run in autoprovisioned instances from Jenkins, not your extant environment that got a storage change.

And Cinder is just an API, so Cinder knowing anything about VMs at all means it's exceeding scope and doing more than providing anonymous storage, much less what Nova may be doing with them.

For Nova, telling a VM to go talk to Cinder every time introduces a dependence on Cinder being available and adds extra traffic when it comes up. If the value is hardcoded, who cares? Fewer API calls. And if it doesn't work, you can just terminate and reprovision, right?

In some way, this actually removes all the dependence. But it doesn't make logical sense, and debugging it is a nightmare. Cloudstack and Eucalyptus do this better. "Real" openstack platforms that treat it as an integrated thing and not keystone+whatever could do this better but probably never will because vendors who are part of the openstack foundation don't want to see it work perfectly or add every possible rfe.
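The cache-at-attach-time tradeoff can be boiled down to a few lines. FakeCinder is a stand-in for the real Cinder API, and every name and address here is made up:

```python
# Toy model of caching connection_info at attach time: it saves an API
# round-trip on every boot, at the cost of the cached copy going stale.
class FakeCinder:
    def __init__(self):
        self.calls = 0
        self.mons = ["10.0.0.11"]     # current storage monitor address

    def connection_info(self, volume_id):
        self.calls += 1
        return {"hosts": list(self.mons)}

cinder = FakeCinder()

# Nova-style: resolve connection_info once at attach, reuse it forever.
cached = cinder.connection_info("vol-1")
for _ in range(5):
    boot_info = cached                # five boots, zero extra API calls

cinder.mons = ["10.1.0.11"]           # storage network gets renumbered...
print(cinder.calls, cached["hosts"])  # ...but the cache still has the old IP
```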

VMs shouldn't get the wrong volumes, though. Your provider probably hosed up cloning a LUN somewhere.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

evol262 posted:

It's code reviewed before merge, but a lot of the weird decisions are because of this sense that openstack isn't really a platform, it's just a bunch of independent pieces talking to each other over a message bus with AAA handled by one common piece.

From the perspective of a developer who's never actually worked on a production environment and never had to migrate hardware or make big architectural changes, this stuff makes sense. Because the tests are run in autoprovisioned instances from Jenkins, not your extant environment that got a storage change.

And Cinder is just an API, so Cinder knowing anything about VMs at all means it's exceeding scope and doing more than providing anonymous storage, much less what Nova may be doing with them.

For Nova, telling a VM to go talk to Cinder every time introduces a dependence on Cinder being available and adds extra traffic when it comes up. If the value is hardcoded, who cares? Fewer API calls. And if it doesn't work, you can just terminate and reprovision, right?

In some way, this actually removes all the dependence. But it doesn't make logical sense, and debugging it is a nightmare. Cloudstack and Eucalyptus do this better. "Real" openstack platforms that treat it as an integrated thing and not keystone+whatever could do this better but probably never will because vendors who are part of the openstack foundation don't want to see it work perfectly or add every possible rfe.

VMs shouldn't get the wrong volumes, though. Your provider probably hosed up cloning a LUN somewhere.
From a human factors perspective, this is a great explanation of why OpenStack is trash

MagnumOpus
Dec 7, 2006

Hell is other people's OpenStack flavors.

squeakygeek
Oct 27, 2005

MagnumOpus posted:

Hell is other people's OpenStack flavors.

It was bad enough setting up Eucalyptus. I can only imagine how painful OpenStack is.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

squeakygeek posted:

It was bad enough setting up Eucalyptus. I can only imagine how painful OpenStack is.
Once you have OpenStack running correctly, it's mostly non-objectionable, except that logging/metrics/etc. are a roll-your-own kind of solution. The main issue is that getting to that point can be a 3-4 month ordeal for a single engineer, especially if you're learning storage/network layers like Ceph and Open vSwitch on top of everything else. Once you have that, your logs are still probably not parsed/formatted correctly in your logging solution of choice. It's overkill for most companies that aren't service providers or otherwise actually earning revenue from OpenStack in some fashion.

The worst part is that OpenStack's reference architecture diagrams, especially ones from vendor slides, aren't even correct. We set up a multi-master Percona XtraDB/Galera (MySQL) cluster, because this seems to be an incredibly common deployment choice. Turns out that because of the way that OpenStack (especially Nova) uses SELECT ... FOR UPDATE, you can't have more than one node taking writes even though object IDs are mostly UUID rather than auto-increment -- for most services, you have to separate out your reader/writer connections like traditional master/slave topologies.

Vulture Culture fucked around with this message at 15:48 on Jun 22, 2015

Buffer
May 6, 2007
I sometimes turn down sex and blowjobs from my girlfriend because I'm too busy posting in D&D. PS: She used my credit card to pay for this.

Vulture Culture posted:

Once you have OpenStack running correctly, it's mostly non-objectionable, except that logging/metrics/etc. are a roll-your-own kind of solution. The main issue is that getting to that point can be a 3-4 month ordeal for a single engineer, especially if you're learning storage/network layers like Ceph and Open vSwitch on top of everything else. Once you have that, your logs are still probably not parsed/formatted correctly in your logging solution of choice. It's overkill for most companies that aren't service providers or otherwise actually earning revenue from OpenStack in some fashion.

The worst part is that OpenStack's reference architecture diagrams, especially ones from vendor slides, aren't even correct. We set up a multi-master Percona XtraDB/Galera (MySQL) cluster, because this seems to be an incredibly common deployment choice. Turns out that because of the way that OpenStack (especially Nova) uses SELECT ... FOR UPDATE, you can't have more than one node taking writes even though object IDs are mostly UUID rather than auto-increment -- for most services, you have to separate out your reader/writer connections like traditional master/slave topologies.

That's good to know, because it seems the docs everywhere are just "do MariaDB + Galera, you'll be fine." But a lot of OpenStack docs are "it will work this way... in vagrant." Soo... anyway, how are you mitigating that? HAProxy in front with two VIPs, one for read, one for write?

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Buffer posted:

That's good to know, because it seems the docs everywhere are just "do MariaDB + Galera, you'll be fine." But a lot of OpenStack docs are "it will work this way... in vagrant." Soo... anyway, how are you mitigating that? HAProxy in front with two VIPs, one for read, one for write?
Pretty much. One vIP, two ports. The writer backend is configured to prefer the local host with the other two as backups, and vice versa on the reader backend.
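A sketch of that layout in HAProxy terms. The VIP (10.0.0.100), node names, and addresses are all placeholders, and the health check is simplified (real Galera deployments usually check a clustercheck script over HTTP rather than a bare MySQL login):

```
# Sketch only. Writes pin to one node; reads balance across the others.
listen galera-writer
    bind 10.0.0.100:3306
    mode tcp
    option mysql-check user haproxy
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check backup
    server node3 10.0.0.3:3306 check backup

listen galera-reader
    bind 10.0.0.100:3307
    mode tcp
    balance leastconn
    option mysql-check user haproxy
    server node1 10.0.0.1:3306 check backup
    server node2 10.0.0.2:3306 check
    server node3 10.0.0.3:3306 check
```

The `backup` keyword is what enforces the single-writer behavior: node2 and node3 only take writes if node1 drops out of the pool.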

We're also a single-tenant organization, so we got rid of the DbQuotaDriver, which eliminates at least half of the lock contention via SELECT ... FOR UPDATE on the Nova database.

This is a hilarious issue because if you watch the performance videos from the OpenStack summits, you know all about this problem (it's been covered by at least Percona and Mirantis in two different talks at two different summits), but it seems like for political reasons they don't actually go into any of this poo poo in the docs.

Vulture Culture fucked around with this message at 16:51 on Jul 15, 2015

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...
Sort of not a cloud problem, but I figure this context raises a lot of the same questions:

I'm _not_ an IT guy but a bioinformatician building / running the infrastructure for a major genomics project. So I have to be an IT guy, above and beyond my skills. We've based the infrastructure in AWS, which has been a big win. Deploy an app with ElasticBeanstalk and get scaling and easy config? Awesome.

However:

With various different systems and web services, it would be nice to have a single identity and login system across them all. Of course, not all the software can use the same auth systems (LDAP, Shibboleth, OpenID, etc.). But the intricacies of auth systems have me running around, tangling with Amazon Directory Services and getting horribly confused over LDAP. It's frankly beyond my skills and I'm looking for an easy way out. Any advice? This has consumed a huge amount of time that I don't have.

Mr Shiny Pants
Nov 12, 2012
Anyone try Joyent Smart Datacenter? It looks really nice and pretty easy to set up compared to OpenStack.

Too bad I don't have some spare machines :(

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer
I have played around with smartos. It works well, but is not exactly user friendly.

Mr Shiny Pants
Nov 12, 2012

adorai posted:

I have played around with smartos. It works well, but is not exactly user friendly.

I just found this: http://blog.smartcore.net.au/smartos-gui-project-fifo-0-6-0-demo/

Looks really slick.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
Some great slides on Kubernetes 1.0 courtesy of Rajdeep Dua with VMware (!), if container scheduling is your thing:

http://www.slideshare.net/rajdeep/introduction-to-kubernetes

namaste friends
Sep 18, 2004

by Smythe
Thanks, that actually managed to explain everything to me about Kubernetes.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
Some more Kubernetes resources as I find them:

Kelsey Hightower of CoreOS has published a brief Google Compute Engine lab from his talk at OSCON for getting a very basic Kubernetes configuration up and running:
https://github.com/kelseyhightower/intro-to-kubernetes-workshop

The first two chapters of his upcoming Kubernetes: Up and Running book can also be found here:
https://tectonic.com/assets/pdf/Kubernetes_Up_and_Running_Preview.pdf

Evil Robot
May 20, 2001
Universally hated.
Grimey Drawer
Really nice paper describing Google's internal network infrastructure:

http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183.pdf

mayodreams
Jul 4, 2003


Hello darkness,
my old friend

Evil Robot posted:

Really nice paper describing Google's internal network infrastructure:

http://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p183.pdf

Thanks for this. I was going to share it internally but then realized that only one other person would be able to read and comprehend it. :negative:

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer
Has anyone here spearheaded the move of a real established enterprise to AWS or Azure?

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
I did, but I cheated. I used VPC and had an easy way to hook EC2 into our host provisioning system.
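For flavor, the sort of glue that makes this kind of hook "easy" is often just rendering cloud-init user-data so each new EC2 instance enrolls itself with the existing provisioning system on first boot. This is a hypothetical sketch, not Bhodi's actual setup; the hostname and Puppet master address are placeholders:

```python
# Hypothetical bootstrap glue: cloud-init user-data that points a fresh
# EC2 instance at an existing Puppet master on first boot.
def make_user_data(hostname, puppet_master="puppet.internal.example.com"):
    return "\n".join([
        "#cloud-config",
        "hostname: {}".format(hostname),
        "runcmd:",
        "  - puppet agent --onetime --no-daemonize --server {}".format(puppet_master),
    ])

print(make_user_data("web01"))
```

You'd pass the result as the `user_data` argument when launching the instance, and the provisioning system never needs to know the host is in EC2.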

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer
i am mostly interested in how you convinced the other techs that it was a good move. I am considering a split move to aws and azure, and while I can effectively make it happen, i want to know how to not make my infrastructure guys (who will still have jobs) hate me.

Mr Shiny Pants
Nov 12, 2012

adorai posted:

i am mostly interested in how you convinced the other techs that it was a good move. I am considering a split move to aws and azure, and while I can effectively make it happen, i want to know how to not make my infrastructure guys (who will still have jobs) hate me.

What are the benefits of the move? And do they outweigh the costs? I mean, every TCO study I've seen concludes that you pay more for the cloud in the long run. Meanwhile, hardware is getting cheaper and cheaper, and you need less and less of it because it does get faster.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
I didn't actually have to convince anyone, it was exploratory as a cheaper alternative to a physical presence in Asia. We set up a small scale pilot and eval'd the results after a few months.

I'm sure promotional material and sales guys from your chosen cloud company would be happy to come in and give a pitch, or at the very least send you some material and talking points. That's kind of their job!

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Mr Shiny Pants posted:

I mean, every TCO study I've seen concludes that you pay more for the cloud in the long run.
ahahahaha look at this guy

Running your own physical hardware is like running your own electrical grid -- there are certain kinds of businesses it makes sense for, but you'd better be prepared to rationalize it nowadays.

Mr Shiny Pants posted:

Meanwhile hardware is getting cheaper and cheaper and you need less and less of it because it does get faster.
This is also why every decent service provider (AWS, Azure, GCE) continuously gets cheaper and adds newer, faster hardware types.

Mr Shiny Pants
Nov 12, 2012

Vulture Culture posted:

ahahahaha look at this guy

?

Vulture Culture posted:


Running your own physical hardware is like running your own electrical grid -- there are certain kinds of businesses it makes sense for, but you'd better be prepared to rationalize it nowadays.

This is also why every decent service provider (AWS, Azure, GCE) continuously gets cheaper and adds newer, faster hardware types.

Sorry we haven't all jumped on the cloud bandwagon, maybe it is different in the states.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Vulture Culture posted:

This is also why every decent service provider (AWS, Azure, GCE) continuously gets cheaper and adds newer, faster hardware types.

yea but, and I don't have a graph to prove this just a feel, it doesn't seem to be tracking Moore's law at all. 3 years of Amazon price cuts might add up to a 50% price/performance improvement, but a fresh new generation of hardware will be 2x at least.

would love it if someone did the data work to prove me wrong/right

Thanks Ants
May 21, 2004

#essereFerrari


Bear in mind that your AWS costs aren't just hardware: they also cover staff wages, power and cooling, datacenter construction, etc.

You can't compare the costs of putting your infrastructure in AWS to just the hardware cost of buying servers without including a load of other costs in your on-premise calculations.
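To make that concrete, here's a back-of-the-envelope comparison. Every figure below is a made-up placeholder, not a benchmark; the only point is that the staff, space, and power lines change the answer:

```python
# Toy 3-year TCO comparison. Substitute your own numbers before
# drawing any conclusion from this.
servers = 20
server_capex = 8000                    # per server, assumed 3-year life

# The lines people forget to count on-premises:
power_cooling_per_yr = 400 * servers
staff_per_yr = 0.25 * 90000            # a quarter of one admin's salary
dc_space_per_yr = 6000

onprem_3yr = (servers * server_capex
              + 3 * (power_cooling_per_yr + staff_per_yr + dc_space_per_yr))

# Assumed on-demand hourly rate for a comparable instance:
instance_hourly = 0.28
cloud_3yr = instance_hourly * 24 * 365 * 3 * servers

print(onprem_3yr, round(cloud_3yr))
```

On these invented numbers cloud comes out ahead, but flip a couple of assumptions (reserved pricing, datacenter space you've already paid for) and the sign flips with them, which is the calculate-it-yourself point.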

MagnumOpus
Dec 7, 2006

I can't believe there are supposed professionals in this thread making blanket statements one way or the other about cloud. We all know there are some use cases where it is and where it isn't the right choice, like every other implementation option in the industry. Let's keep this topic more useful than the sales blogosphere please.

adorai posted:

Has anyone here spearheaded the move of a real established enterprise to AWS or Azure?

adorai posted:

i am mostly interested in how you convinced the other techs that it was a good move. I am considering a split move to aws and azure, and while I can effectively make it happen, i want to know how to not make my infrastructure guys (who will still have jobs) hate me.

Whether cloud is right for you depends on a lot of factors. Can you share some about your deployment?

- Are you using database systems that are designed to scale vertically or horizontally? OLAP or OLTP workloads or both?
- Do you have spike utilization during busy hours or is your profile more stable throughout the day?
- Got private/regulated data?
- How prepared is your org for doing DevOps work? This is a big one that is often overlooked; all that elasticity (generally) only pays off if you're willing to implement and maintain systems that actually scale without constant live tinkering.

MagnumOpus fucked around with this message at 19:56 on Aug 15, 2015

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Mr Shiny Pants posted:

Sorry we haven't all jumped on the cloud bandwagon, maybe it is different in the states.
I'm laughing because you're talking about "the TCO" like it's remotely the same for any two companies. Everyone has vastly different methods of operating their IT services -- some people have 10,000 square feet of datacenter space that they've already paid for, and some companies have a wiring closet with a Netgear switch in it. Some companies run strategic IT in tandem with the front-end line-of-business, and others operate it as a cost center to try to squeeze out better efficiency than cloud at the cost of business agility and service focus. Any study that tells you what "the TCO" is for cloud -- in either direction -- is pulling a fast one on you. You need to calculate this for yourself.

StabbinHobo posted:

yea but, and I don't have a graph to prove this just a feel, it doesn't seem to be tracking Moore's law at all. 3 years of Amazon price cuts might add up to a 50% price/performance improvement, but a fresh new generation of hardware will be 2x at least.

would love it if someone did the data work to prove me wrong/right
Most of the costs of cloud don't follow a 1:1 relationship with hardware, the same way that they don't in an on-premises setting. There's overhead around the cloud platform to manage all those resources, the people who have to write, maintain and operate that platform, network transit costs, etc. There's no way you're ever going to get that to 1:1 parity until Skynet self-assembles the machines into a cluster that runs themselves and teleports bits to their destinations on the other end of the wire.

Even running everything yourself, you have to deal with:

  • Physical space, power, cooling for a datacenter, obviously
  • Support staff to operate the datacenter, rack/stack hardware, track assets, operate and develop policy/practice around virtualization platforms, and navigate the interdepartmental politics of "why can't I have a 32-core VM" and other "strategic" CIO issues around utilization-based internal billing -- time better spent actually integrating with the line of business to make money
  • Support staff to handle those support staff -- management/vendor relations, purchasing, shipping/receiving, accounts payable/receivable, HR to handle these employees upon employees
  • Physical facilities around all those support staff -- offices, parking, bathrooms, Internet pipes, wi-fi access points (and more network engineers to manage those previous two things)
  • Opportunity cost when an IT vendor schedule slips or a delivery goes wrong (and project managers to mitigate that risk (and further loss of business agility because you've forced yourself into a waterfall delivery model))
  • Handling whatever half-assed self-service provisioning you're bolting on top to keep Shadow IT from taking over the company from the outside in (though Cloud Foundry is getting rather good)
  • Overhead of physically integrating all of these facilities and support staff into any future mergers/acquisitions

None of this stuff is remotely free, and every dumb little complication detracts a lot from the business's ability to just be a business.

If you're fundamentally a logistics company, awesome, but there's a pretty big case to be made for "I don't want to deal with this poo poo." Cloud is not for everyone, but there's this big group of people clinging onto the legacy IT on-premises model like the people holding onto penny stocks appreciating 2% a year over the last decade because they're not actually losing money. Great, but wouldn't you still rather sell those and invest the capital into an asset that will actually make you money instead?

Vulture Culture fucked around with this message at 20:45 on Aug 15, 2015

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

MagnumOpus posted:

Whether cloud is right for you depends on a lot of factors. Can you share some about your deployment?

- Are you using database systems that are designed to scale vertically or horizontally? OLAP or OLTP workloads or both?
- Do you have spike utilization during busy hours or is your profile more stable throughout the day?
- Got private/regulated data?
- How prepared is your org for doing DevOps work? This is a big one that is often overlooked; all that elasticity (generally) only pays off if you're willing to implement and maintain systems that actually scale without constant live tinkering.
Another big one: how much data do you have, and what are you doing with it? An organization with, say, 10 PB of general-use genomics data accessed over network filesystems is probably not a great candidate for cloud. Other data warehousing activities might be better in EMR/Redshift or something.
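The scale matters because of transfer time alone. A rough sketch, assuming a dedicated, fully utilized 10 Gb/s link (real transfers do worse, so treat this as a lower bound):

```python
# How long does 10 PB take to move over 10 Gb/s? Link speed and full
# utilization are assumptions; this is a floor, not a migration plan.
petabytes = 10
bits_to_move = petabytes * 1e15 * 8    # decimal petabytes to bits
link_bps = 10e9                        # 10 Gb/s, assumed saturated
days = bits_to_move / link_bps / 86400
print(round(days, 1))                  # on the order of three months
```

That's months of wall-clock time before you've paid a cent of egress or touched the working set, which is why data gravity dominates these decisions.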

Mr Shiny Pants
Nov 12, 2012

Vulture Culture posted:

I'm laughing because you're talking about "the TCO" like it's remotely the same for any two companies. Everyone has vastly different methods of operating their IT services -- some people have 10,000 square feet of datacenter space that they've already paid for, and some companies have a wiring closet with a Netgear switch in it. Some companies run strategic IT in tandem with the front-end line-of-business, and others operate it as a cost center to try to squeeze out better efficiency than cloud at the cost of business agility and service focus. Any study that tells you what "the TCO" is for cloud -- in either direction -- is pulling a fast one on you. You need to calculate this for yourself.

Now that wasn't so hard, was it? ;)

Seeing Adorai's postings in the past I assumed we are not talking about a closet with a single switch.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Mr Shiny Pants posted:

Now that wasn't so hard now was it. ;)

Seeing Adorai's postings in the past I assumed we are not talking about a closet with a single switch.
Even then, the key questions aren't "should we use cloud?" They are "where can we strategically outsource operations?" and "at what point should we pay someone else to run [thing X]?"

Cloud vendors are aware that there's a big piece of the pie left uneaten in legacy/enterprise. I assure you, they're working hard to fix that problem.
