12 rats tied together
Sep 7, 2006

Farmer Crack-rear end posted:

i don't deal with code so my cynical assumption is that a lot of "scaling" is basically "if we can fully automate spinning up and tearing down server instances, it won't matter how often our shoddy code crashes!"

im sure some degree of that exists somewhere in the world, but with my current employer the company basically makes money per CPU cycle, so we have a fleet of edge servers that are dedicated to the part that makes us money and then we have a huge log pipeline that is responsible for turning that data into something we can bill clients with

in this case, in order to make the most money in any given second, we need exactly enough log processing infrastructure to process all of that data within an appropriate timeframe. the amount of input data scales with overall internet activity, which fluctuates as people wake up, go to work, get home from work, etc.

if your employer makes money off of the internet demand curve in any way, however bullshit it might be, they are going to benefit from regional scaling optimizations that follow it. it's both not reasonable to manually scale to follow this pattern and not possible to right-size a "web scale" (lmao) service, in general

CMYK BLYAT! posted:

k8s isn't a troll: google want to provide some sort of lingua franca around managing computing resources in modern environments based on their practical experience running one. everyone else has done so on their particular cloud compute platform in myriad ways, and there are legion sysadmins saying "by god we can continue to use provider-native tools to do the same poo poo", and they're not wrong, but they're not providing a lingua franca,

this was me, the guy arguing against k8s because it is extra complexity for no new functionality, until we hit a developer:sysadmin ratio where I was being asked "what subnet should I pick?" upwards of 5x per day by people who didn't read the docs for picking a subnet.

the lingua franca goes from not mattering much to being the only reason you can focus, for reasons that are totally out of your control, so you should optimize towards it whenever possible. even if that means adding complexity.

12 rats tied together
Sep 7, 2006

the perceived difficulty in running k8s on-prem is really exaggerated compared to reality, imo.

it's definitely not as easy as "click button -> run shell one liner to configure kubectl" like AWS gives you, but it's a fairly standard core group/worker group cluster setup and you need to put some consideration into rack and power diversity etc to make sure you don't lose enough of your core to either break everything or do some crazy split brain poo poo or whatever.

if you aren't equipped to be able to intelligently locate your k8s nodes such that losing a rack doesn't cripple your cluster, well, you kind of had that problem anyway and k8s wasn't going to solve it for you. you still have to be good at operating physical infrastructure and a lot of people just aren't. it is very useful on-prem of course, for obvious reasons. chick-fil-a was doing it right when they started running a cluster in each store.

i would suggest that if you were going to try and build your own set of application abstractions for on-prem you should pretty much copy paste the k8s api and its concepts wholesale. it will help you avoid poo poo where developers will go "ok all the servers that end with 5 are the test servers" or issues where you decommission node12 and replace it with node94 instead of re-seating node12, and an entire set of processes that rely on assumed node names crumbles to the ground around you

its a very good way to be thinking about infrastructure deployed on-prem

12 rats tied together
Sep 7, 2006

IMO its fair that, as a google product, k8s has high expectations of you in terms of your overall ability to control the applications that you manage. for something like a ui node, dropping inflight requests shouldn't matter because your web ui is just a javascript api client bundled with static content, so you can cleanly handle retry logic behind the scenes.

if the application is a listener for some adtech type poo poo (since its google) you likely do not care about any amount of requests that isn't at least in the 1000s.

if you're running a database in there OTOH yeah that's kind of a problem, and could use some special consideration or handling. i probably would not ever intentionally run a database in k8s though.

12 rats tied together
Sep 7, 2006

yeah thats very fair, i would not want to put any stock trading(?) poo poo into k8s, especially if it were adjacent to whatever event firehose exists in that domain. i'd probably want to map external events directly to internal stream storage and then, if k8s has to be involved, its just hosting stream processors

similarly i would probably not put supply-side adtech poo poo into k8s either

i guess if you were google and you were both the supply and the demand you could be more intelligent about routing events to handlers that only sometimes exist, but, in general with adtech that poo poo is way less important because if you can't find a bidder for an ad slot you can just pick something at random and serve it. can't really do that with stocks

12 rats tied together
Sep 7, 2006

imo the requirements of a system can't make it a bad system, since a system is only as useful as it can be applied to solve real world problems

there are probably real world problems that require you to never drop events and have latency requirements that make decoupling handlers from receivers not a workable solution. i can't think of any right now but i don't go outside anymore so

i would think that fraud detection is a process that could be decoupled from whatever the input is, like, you would have some statistical model that you train out of band being used to verify live events, but the events are also used for training that model, which you redeploy every 6 hours or whatever.

in any case i think putting something in k8s fundamentally means giving up some control in exchange for some niceties. that's not always gonna be a good idea (but usually is)

12 rats tied together
Sep 7, 2006

ate poo poo on live tv posted:

I don't understand, you can't just put your k8s cluster behind a load balancer, then remove the cluster or node from the LB pool which will then no longer send requests to the cluster, then 30sec later send a sigterm?

OP mentioned a ClusterIP service which is an intra-k8s-cluster-only sort of deal. the master nodes will basically allocate containers to servers, create virtual ips for collections of those containers, and then distribute a pepe silvia style iptables dnat spiderweb to each of the nodes in the cluster.

it means you can hit any node and it will get to your pod, but it also means the core nodes don't really have a consistent view of what traffic is currently flowing since the only thing they do is examine object definitions and push iptables rules to nodes. because they don't have a real view into traffic (there is no central nat table, for example, like in an SNAT load balancer), they can never be sure that traffic has stopped or that there are no more active tcp sessions or whatever.

the other types of services in standard use (NodePort, LoadBalancer) are built on top of ClusterIP so they share the same limitation. apparently there is an ExternalName service which uses cname-to-cluster-dns as a redirection layer instead of iptables, but thats probably even slower and less useful for OP's job

its a fair gripe, this app probably shouldnt be in k8s
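
for reference, a bare ClusterIP service written down, sketched here with the terraform kubernetes provider since hcl shows up later in this thread anyway (every name and port here is made up, not anything OP posted; the yaml version is the same shape):
code:
resource "kubernetes_service" "api" {
  metadata {
    name = "api"
  }

  spec {
    type = "ClusterIP"

    selector = {
      app = "api"
    }

    port {
      port        = 80
      target_port = 8080
    }
  }
}
the cluster allocates a virtual ip for this and kube-proxy turns it into dnat rules on every node, which is the spiderweb part.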

12 rats tied together
Sep 7, 2006

Progressive JPEG posted:

switching from a background processing model to a real time SLA is not a slight change in requirements

yeah, rofl

Nomnom Cookie posted:

obtw 12 rats can you point me to where on the k8s marketing site or docs it says that kubernetes is unsuitable for applications requiring correctness from the cluster orchestrator

i wont even pretend for a little bit that k8s isn't at least half marketing the google brand to get people to work there. SRE as a role sucks rear end in general, the best way to get people to stick with it at your multinational megacorp would be to convince them that they are special in some way

admitting that k8s isn't an ops panacea, or that google might not know the best way to run any infrastructure for any purpose, would ruin that perception

its true that you can work around the problem. it sucks that you work around it with "sleep 3" or by doing way too much loving work compared to any other load balancer to ever exist except maybe the Cisco (R) Catalyst (Copyright 1995) Content Switching Module
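
for the curious, the "sleep 3" is usually a preStop hook plus a termination grace period, i.e. keep serving for a beat while the iptables rules catch up. a minimal sketch, again via the terraform kubernetes provider, image and numbers are placeholders:
code:
resource "kubernetes_deployment" "web" {
  metadata {
    name = "web"
  }

  spec {
    replicas = 3

    selector {
      match_labels = {
        app = "web"
      }
    }

    template {
      metadata {
        labels = {
          app = "web"
        }
      }

      spec {
        termination_grace_period_seconds = 30

        container {
          name  = "web"
          image = "nginx:1.25"

          // keep accepting traffic for a bit after the pod is marked
          // terminating, because rule propagation is asynchronous
          lifecycle {
            pre_stop {
              exec {
                command = ["sleep", "3"]
              }
            }
          }
        }
      }
    }
  }
}
it's still a guess about propagation time, which is the gripe: you're papering over the lack of connection draining with a timer.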

12 rats tied together
Sep 7, 2006

a huge part of my job in like 20...15? was managing a pair of them in a pair of 6509s. it was really something

they had some config sync poo poo that worked really well actually

12 rats tied together
Sep 7, 2006

CMYK BLYAT! posted:

are there experienced SREs in the world that are not disillusioned and resigned to the touch computer forever for capitalism life

the only thing keeping me touching computer is health insurance.

abigserve posted:

so you get a lot of "tcp port alive" health checks and poo poo like that

i do like that i can punt this to the dev team really easy in k8s. i can even just link them api docs, its reasonable to expect developers to be able to read and understand api documentation

we still have a lot of extremely dumb health checks. its either an issue with .net or an issue with the gestalt understanding of running web applications on windows and i am wholly uninterested in finding out which is true

12 rats tied together
Sep 7, 2006

if your application makes money by attaching itself to the internet firehose, kubernetes is worth running, especially if you make money off of serving internet ads or if you have any other kind of 2-phase revenue generating process that runs into fun scaling problems like "this floor of the data center is full".

in those situations being able to quickly rebalance various internal processes based on metrics like "ads we bought but haven't redistributed updated budget information for" and poo poo like that is arguably worth the complexity.

using kubernetes to host web applications or random tools crap though is absolutely just google cargo culting, resume driven infrastructure, etc.

12 rats tied together
Sep 7, 2006

its extra funny that approx 100% of the big data workflows that exist outside of google are some wrapper around "just put it all in s3". if you told me google just used s3 also i would absolutely believe it

12 rats tied together
Sep 7, 2006

jre posted:

s3 is good though

yes.

ate poo poo on live tv posted:

s3 is "good" for a specific subset of data storage problems. So of course developers try to use it like a global cache.
i would honestly rather see people overcommit to s3 as a global cache than have to deal with people cramming hundreds of thousands of lines of business logic into microsoft sql server stored procedures

12 rats tied together
Sep 7, 2006

it's good, actually. it operates on a single object at a time and it requires that the object is in a particular format; parquet is 100% one of the formats but i don't remember any of the other ones.

for a lot of use cases the interface is a massive perf and cost boost because you don't need to maintain pools of compute (EMR, lmao) that are equipped to download, unzip, and decode whatever bullshit you're writing

12 rats tied together
Sep 7, 2006

i think redshift has some "just run against these s3 buckets" stuff these days too which is nice since those clusters can be pretty expensive as they size up

12 rats tied together
Sep 7, 2006

thank you, that sounds like a good feature. my experience is that on-staff "data warehouse" people are some of the worst engineers to work with so i'm 100% always the guy saying to put stuff in redshift instead of running vertica on ec2 in addition to our 800k monthly EMR spend for some reason

i had a data warehousing team request like 90 something ec2 vertica nodes on i3 class instances, which they wanted us to configure some awful software raid poo poo on. we delivered the servers and the first thing they did was shut over half of them off "to save money", which wiped the instance store volumes and associated configs

12 rats tied together
Sep 7, 2006

agree except imho "good" ops and release teams were already building those abstractions (they are not very complicated). having done SRE tech interviews for a while though i will readily admit that "good" ops and release teams might as well not exist and the field overall is home to some of the most deranged engineering i've ever seen

like, if your choices are "a server that runs chef-solo and updates aws autoscaling group desired counters on every convergence" and "kubernetes", definitely run kubernetes.

12 rats tied together
Sep 7, 2006

its called redshift because you put the servers far away in amazon's cloud and then because they are farther away you think about them less often

12 rats tied together
Sep 7, 2006

ate poo poo on live tv posted:

My company has a massive presence in AWS and our datatransfer costs are >50% of our total AWS costs and lol datatransfer alone is almost equal to our monthly spend for our physical DCs.

I think this really depends on your workload, right? I just finished working at an adtech spot with a large (idk what i'm allowed to say exactly but as the primary aws implementation rear end in a top hat i saw all the bills) aws presence and data transfer was in the 10-25% range of total spend per region. Most of it was compute since as a DSP we had a shitload of free ingress traffic with comparatively rather small response traffic.

The numbers crunch just fine in that scenario, and while it is still cheaper to host it yourself, adtech is all about following that demand curve as profitably as possible so the elasticity usually ends up being a huge win

The biggest issue we had was when someone spun up a huge kafka cluster across 6 AZs which started incurring geometrically increasing transfer fees as the brokers started doing replication poo poo

12 rats tied together
Sep 7, 2006

when was the last time microsoft created a successful product that wasn't just them buying someone else's rails or electron app for billions of dollars

12 rats tied together
Sep 7, 2006

it goes away when you switch to juniper devices which have a cli not designed explicitly to gently caress with you

12 rats tied together
Sep 7, 2006

my homie dhall posted:

what do people think about cumulus?

it's very good and you should use it if you can

12 rats tied together
Sep 7, 2006

instead of a module you should use a resource foreach. almost always

12 rats tied together
Sep 7, 2006

sure, but ec2 route objects are like the poster child for "reasons why modules are bad". you will just end up reinventing the parameter set for the route resource on top of a module's parameter set for no reason. "var.default_route_target_type = nat_gateway" and poo poo.

the docs are better these days and explicitly warn you not to create a module for a thing unless you can come up with a better name than "the names of the resources inside of the thing". if you try to map every single resource dependency in AWS inside of a module subtree you will have created an infinite amount of work for yourself because an AWS account is essentially a tree itself already

the thing you want from modules can almost always be expressed as
code:
locals {
  my_bullshit = {
    private_subnet_count = 4
    private_subnet_az_set = ["az1", "az2"]
    // etc
  }
}
and then just for_each = local.my_bullshit in your routes. there are examples in the docs, and indeed, managing the treelike nature of an AWS VPC and its dependencies is the example in the docs for resource foreach.
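
spelled out slightly more, a sketch of what that ends up looking like (cidrs, az names, and the vpc are all made up placeholders so this is self contained, not anything from a real config):
code:
locals {
  private_subnets = {
    a = { az = "us-east-1a", cidr = "10.0.1.0/24" }
    b = { az = "us-east-1b", cidr = "10.0.2.0/24" }
  }
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "private" {
  for_each          = local.private_subnets
  vpc_id            = aws_vpc.main.id
  availability_zone = each.value.az
  cidr_block        = each.value.cidr
}

// one route table per subnet, keyed the same way, no module threading
resource "aws_route_table" "private" {
  for_each = local.private_subnets
  vpc_id   = aws_vpc.main.id
}

resource "aws_route_table_association" "private" {
  for_each       = local.private_subnets
  subnet_id      = aws_subnet.private[each.key].id
  route_table_id = aws_route_table.private[each.key].id
}
every one of those is directly addressable as aws_subnet.private["a"] etc, which is the whole point.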

12 rats tied together
Sep 7, 2006

post hole digger posted:

this is interesting. I guess this is basically what I'm trying to do, but I'd be sort of abstracting the 'module' layer on top of it. I'd have to rethink how I intend to list the variables and such, but I think this makes sense. [...]

Is the idea then that you'd be doing like The Iron Rose mentioned and feed different envs of 'my_bullshit' into the parameters with something like different tfvars files?

You can do that, but you don't have to. Ultimately, if you go with modules or if you go with resource for_each, you still have a code commitment in every "root state". Regardless of strategy chosen here you can attempt to manage your root states with a single set of tf files (and pass in .tfvars) or you can have separate static subfolders with their own tf files.

The choice between module vs resource for_each is basically which type of debt you want to have. If you go with modules, you have to deal with the insanity that is toggling conditions in modules. You have, usually, undesirable loose coupling because terraform only has bad choices when it comes to sharing data between workspaces. You also have to deal with module threading where e.g. if you have some root state that calls a network module that calls a subnet module which conditionally creates a "nat gateway route" vs an "internet gateway route", you need to thread outputs from subnet to network, from network to root, and then finally in root you can declare them as an output. Because only root is "real", it's the only place the outputs actually get manifested into state in a queryable way (terraform output cli command, state file parse, or terraform_remote_state data source).

If you go with resource for_each, in every place that you need subnets you are going to have to type `resource "aws_subnet" "..."`. However every subnet you type is "real" immediately with no extra work, and you never need to encode conditional pathways (specific code) into modules (generic place). This also makes `terraform console` better which is a huge maintenance win. Instead of dealing with the insanity of nullable parameters + conditional logic in modules, you simply use for expressions to produce inputs into your aws_subnet resource. This lets you define configuration structures of arbitrary shape (for example, in this region, we get the value of local.my_bullshit from an aws secrets manager secret. in this other region, the values are hardcoded) and it gives you a syntax for producing similarly arbitrary shaped input objects for each resource.

The benefits of for expressions are twofold: 1, a for expression can evaluate to "an empty thing" and the resources are simply not created (compare to setting var.count -- if you pass 0, terraform still creates "list of 0 length" in state). 2, an architecture that makes use of for expressions trends towards healthier long term maintenance because the shape of the configuration data is beneficially decoupled from its application to a resource. If you need to change input data to something, you can simply write a new for expression, and because you didn't use modules you have all of the data you need, guaranteed, available for you in `terraform console`, so step 1 of your PR to reshape an input object is to open up the console, print the existing object, and iterate on it in the REPL until it looks good.
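
To make point 1 concrete, here is a sketch (the route table and nat gateway references assume resources declared elsewhere in the same root state; all names and cidrs are placeholders):
code:
locals {
  subnets = {
    a = { cidr = "10.0.1.0/24", wants_nat = true }
    b = { cidr = "10.0.2.0/24", wants_nat = false }
  }

  // only the subnets that actually want a nat route. if none do,
  // this evaluates to an empty map and terraform creates nothing
  nat_routes = { for name, s in local.subnets : name => s if s.wants_nat }
}

resource "aws_route" "nat" {
  for_each               = local.nat_routes
  route_table_id         = aws_route_table.private[each.key].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.main.id
}
You can print local.nat_routes in `terraform console` and see exactly what will be created before you ever run a plan.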

12 rats tied together
Sep 7, 2006

if that all sounds bad and stupid it's because terraform is a bad and stupid tool

12 rats tied together
Sep 7, 2006

in this case i disagree because the pathway to maintainable terraform is paved with for expressions and the terraform console repl, and modules don't play nice. you can use them, i've used them myself, but best case scenario is that they save you some effort on your current PR and they don't make anything immediately worse. they will always be worse later. usually you will hit some tripping point in the future where some assumption that was encoded into the module caller boundary is no longer true and now you need to either create a new "-without-wrong-thing" module or you need to deal with the combinatoric explosion of conditional resources inside your module that is wrong now.

pulumi is, of course, better. but you have to hire SREs who can create 4 classes without immediately diamond probleming themselves, which is basically impossible, or prohibitively expensive.

12 rats tied together
Sep 7, 2006

post hole digger posted:

alright youre losing me here lol

why? repls rule and the terraform repl is extremely useful.

Asymmetric POSTer posted:

this thread is making me thankful i do not have to janitor terraform

considering the earlier example about how you need to thread outputs from route to subnet -> subnet to main, imagine how fun its going to be when you realize that you need main to tell subnet what the value of "nat_gateway" is and then that subnet doesn't know how to distinguish between "null -- there is no nat gateway, configure the literal value null on your resources that reference this parameter" and "null -- i am not passing a value (perhaps because someone else told me that this should be null)" and how the distinction between "actually nothing" and "i dont have anything to say" matters a lot when configuring resources that have required, mutually exclusive parameters such as aws routes, which can have a nat gateway id, or an internet gateway id, or a vpn gateway id, but never more than one, and how this would intersect with what the author of subnet put for the default value of "nat_gateway"

then do it again because that was just main telling subnet and subnet still has to tell route
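
meanwhile the no-module version of that exact mutually-exclusive-parameter problem is just two for expressions over one map, no nullable anything. a sketch, where the gateway and route table resources are assumed to exist elsewhere in the same root and every name is a placeholder:
code:
locals {
  route_specs = {
    private_a = { target = "nat", rt = "private_a" }
    public_a  = { target = "igw", rt = "public_a" }
  }
}

resource "aws_route" "nat" {
  for_each               = { for k, r in local.route_specs : k => r if r.target == "nat" }
  route_table_id         = aws_route_table.main[each.value.rt].id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.main.id
}

resource "aws_route" "igw" {
  for_each               = { for k, r in local.route_specs : k => r if r.target == "igw" }
  route_table_id         = aws_route_table.main[each.value.rt].id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.main.id
}
nat_gateway_id and gateway_id never have to coexist in one resource block, so there is no null to argue about.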

12 rats tied together
Sep 7, 2006

the exciting part about infra code was that it makes problems go away so you can work on other more interesting things

with terraform we somehow lost that mindset and instead it became a problem to be solved itself in a variety of lovely self-reinforcing ways

12 rats tied together
Sep 7, 2006

its easier. setting up ospf is just: turn it on and put everything in area 0.

12 rats tied together
Sep 7, 2006

i don't see why not. you would want to make sure that you dont accidentally install ECMP routes to anycast destinations and end up 50/50ing your traffic to 2 random nodes, but i'd be really surprised if there wasn't a config param on your routers for that

12 rats tied together
Sep 7, 2006

ospf is the normal routing protocol. ibgp and is-is are the kubernetes cringe of the networking world

is-is is maybe closer to solaris in that if you are running it you're probably a Knower and are using it to solve a problem that actually exists instead of a fake problem like kubernetes

12 rats tied together
Sep 7, 2006

kube is certainly A Solution to the problem of "how do i bin pack a bunch of garbage onto these computers"

imagine though if you didn't have this problem. imagine what kind of world that would be.

12 rats tied together
Sep 7, 2006

the answer though OP is the route table. it's what it was designed for and it has all the features you need to solve that problem.

12 rats tied together
Sep 7, 2006

no just use ospf. or configure a 0.0.0.0 route out the slow nic and a more specific route out the internal one. probably start with the second one

12 rats tied together
Sep 7, 2006

Nomnom Cookie posted:

4. 192.168.0.0/24 is the fake network that can only talk to VMs on the same machine and puts packets directly in socket buffers
5. the requirement is to transparently use the intra-VM fake nic when possible so that packets dont have to hairpin through the host's network stack to reach the destination when that destination happens to be on the same physical machine
6. therefore you cant just make a route table because the problem is that the correct destination IP varies depending on the sender address

hmm. in that case i would simply stop using VMs, run 2 processes in the same OS (the OS is good at this im told) and then use pipes instead of sockets.

12 rats tied together
Sep 7, 2006

[dusting off ccna] GLBP lets you do load balancing. it is a type of first-hop redundancy protocol which normally is for active/standby-ing your gateways. i would start by googling FHRP + load balancing / sharing / etc. words. you will need some form of managed router.

12 rats tied together
Sep 7, 2006

i used the pure "s3 compatible api" at a previous role and I thought it was good/fine
