Partycat
Oct 25, 2004

FatCow posted:



Also, "L3 switch" is largely being used as "lovely cheap L3 switch" here. I'm pretty sure a 4948 can do most of the use cases the people in this thread need.

Agreed, but many of the L3 switches that this thread would use as a core at a small business or site, like the cat3750/3850/6509e etc., will not grow into being a big boy router when you want to do something beyond “packet goes in, packet goes out”. Then it breaks in new and stupid ways and you get to suffer when something blows the capacity.

Multicast had absolutely wrecked some of our 6500 series and those were more or less just SVIs with a default static route offbox.

It can work, just without as much breathing room as a proper device, though admittedly the lines between router, switch, firewall etc. blur more every year.

Collateral Damage
Jun 13, 2009

I hate all kinds of connections that need to be screwed in place. VGA, DVI, any proprietary connector with screws, all can go die in a fire. Even serial DB9 can gently caress right off, use 8P8C.

GreenNight
Feb 19, 2006
Turning the light on the darkest places, you and I know we got to face this now. We got to face this now.

I like those more than like DisplayPort where the users just yank until the cable or port breaks.

Collateral Damage
Jun 13, 2009

You can't fix stupid. They'd do the same with screw-in connectors.

abigserve
Sep 13, 2009

this is a better avatar than what I had before

Partycat posted:

Agreed, but many of the L3 switches that this thread would use as a core at a small business or site, like the cat3750/3850/6509e etc., will not grow into being a big boy router when you want to do something beyond “packet goes in, packet goes out”. Then it breaks in new and stupid ways and you get to suffer when something blows the capacity.

Multicast had absolutely wrecked some of our 6500 series and those were more or less just SVIs with a default static route offbox.

It can work, just without as much breathing room as a proper device, though admittedly the lines between router, switch, firewall etc. blur more every year.

We've run 6500's (now 6800s) as backbone routers for over a decade and never had a problem?

That's a 10G national network running MPLS VPNs, multicast, the entire topology reflected in ipv6, dual-active PoPs, macsec on all our dark fibre links, etc.

3750/3650 are also fine for a small site, and anything with less than a gigabit uplink counts as a small site. Just don't try and configure IPsec on them.

CrazyLittle
Sep 11, 2001





Clapping Larry

FatCow posted:

What? First off, a L3 switch will behave exactly like a router when it routes packets when it comes to MAC addresses. The MAC will only pass through to the ISP if it is switched traffic.

You're absolutely correct - assuming the end user actually configures their switch and connected clients to route and not just switch up to the network egress

Partycat
Oct 25, 2004

abigserve posted:

We've run 6500's (now 6800s) as backbone routers for over a decade and never had a problem?

That's a 10G national network running MPLS VPNs, multicast, the entire topology reflected in ipv6, dual-active PoPs, macsec on all our dark fibre links, etc.

3750/3650 are also fine for a small site, and anything with less than a gigabit uplink counts as a small site. Just don't try and configure IPsec on them.

We run them for aggregation in VSS with another pair of 6500s on top and VSS pairs of 4500X under, each of those aggregating piles of 3850/3750 switches with various numbers of active users.

I'll readily admit the operation of these isn't my focus, I handle mainly voice/video at this time. However, when we have bizarre problems we seem to see it due to the level of sensitivity and reporting that comes with that application type. When the 6500s are sitting there running high CPU and/or there are table issues due to host/traffic density then uh.

I guess what I would generalize with is that you need to choose the appropriate hardware for the application, while understanding that the application can change or shift over time. This giant L2 approach is probably beyond the ability of this type of equipment to scale in a more-or-less out of the box approach to what we're applying it for. For other reasons I keep asking to re-align with the SRND that pushes routing out to the edge. At minimum that keeps every piece of equipment along the way from having to hold all those client MACs for no real reason.

abigserve
Sep 13, 2009

this is a better avatar than what I had before

Partycat posted:

We run them for aggregation in VSS with another pair of 6500s on top and VSS pairs of 4500X under, each of those aggregating piles of 3850/3750 switches with various numbers of active users.

I'll readily admit the operation of these isn't my focus, I handle mainly voice/video at this time. However, when we have bizarre problems we seem to see it due to the level of sensitivity and reporting that comes with that application type. When the 6500s are sitting there running high CPU and/or there are table issues due to host/traffic density then uh.

I guess what I would generalize with is that you need to choose the appropriate hardware for the application, while understanding that the application can change or shift over time. This giant L2 approach is probably beyond the ability of this type of equipment to scale in a more-or-less out of the box approach to what we're applying it for. For other reasons I keep asking to re-align with the SRND that pushes routing out to the edge. At minimum that keeps every piece of equipment along the way from having to hold all those client MACs for no real reason.

The point I'd stress is that from a feature perspective a 6800 is an extremely capable and comparatively very cheap per-port campus/small DC router, which is what we were talking about. You'd have to be running some serious poo poo to be blitzing one of those, or have topology problems (it's always topology problems).

abigserve
Sep 13, 2009

this is a better avatar than what I had before
Is anyone using telegraf/grafana/influxdb for network monitoring with a reasonable number of devices?

I can't get it working with more than about 100 boxes. The SNMP input plugin just doesn't scale at all without standing up additional telegraf instances.

abigserve fucked around with this message at 06:23 on Mar 21, 2018

Methanar
Sep 26, 2013

by the sex ghost

abigserve posted:

Is anyone using telegraf/grafana/influxdb for network monitoring with a reasonable number of devices?

I can't get it working with more than about 100 boxes. The SNMP input plugin just doesn't scale at all without standing up additional telegraf instances.

Telegraf should be pushing metrics, not pulling. I have a poo poo ton of telegraf instances pushing a poo poo ton of metrics towards influxdb, mostly for regular servers though and not network devices specifically. For the arista switches that I do have telegraf collecting metrics on though, it works well. There are a few weird things where it complains that the OS doesn't have some feature enabled, but for measuring bit rates on interfaces it does what it's supposed to.

Install the agent on all of your boxes

Methanar fucked around with this message at 06:40 on Mar 21, 2018

abigserve
Sep 13, 2009

this is a better avatar than what I had before

Methanar posted:

Telegraf should be pushing metrics, not pulling. I have a poo poo ton of telegraf instances pushing a poo poo ton of metrics towards influxdb, mostly for regular servers though and not network devices specifically. For the arista switches that I do have telegraf collecting metrics on though, it works well. There are a few weird things where it complains that the OS doesn't have some feature enabled, but for measuring bit rates on interfaces it does what it's supposed to.

Install the agent on all of your boxes

Network monitoring = monitoring routers and switches = gotta use snmp

edit; and yeah +1 to arista/cumulus but even then that's DC stuff only and I want to monitor the whole network

Methanar
Sep 26, 2013

by the sex ghost

abigserve posted:

Network monitoring = monitoring routers and switches = gotta use snmp

edit; and yeah +1 to arista/cumulus but even then that's DC stuff only and I want to monitor the whole network

Hm, definitely no way you can install arbitrary bsd/rpm packages or compile telegraf yourself to run on whatever you've got?

I guess running multiple instances of telegraf that pull is probably the best you're going to get then.

pctD
Aug 25, 2009



Pillbug
We are using telegraf to pull snmp metrics from our pdus and dumb switches. There’s some configuration you have to tweak to make it run in batches more efficiently iirc.

abigserve
Sep 13, 2009

this is a better avatar than what I had before

Methanar posted:

Hm, definitely no way you can install arbitrary bsd/rpm packages or compile telegraf yourself to run on whatever you've got?

I guess running multiple instances of telegraf that pull is probably the best you're going to get then.

Negatory. That'd be the dream.

I'm on the latest version of telegraf and I've split the configuration into three chunks of about 400 hosts each, but it just can't handle it. If anyone has any tips I'm all ears, because I'm always about to just say gently caress it and write my own collector.

ragzilla
Sep 9, 2005
don't ask me, i only work here


abigserve posted:

because I'm always about to just say gently caress it and write my own collector.

That’s what I did. Although it’s against 1.2 and I need to update for 1.5 since they’ve got int64 now, and I’ve only tested up to 20 devices or so so far (and if-mib only). I’ll try to clean the code up (remove anything internal) and toss it on github.

Or there’s snmpcollector but I’m not a fan since it holds metrics in memory and saves the deltas to the database (I prefer saving the raw counter and running derivative on it later). But it’s certainly a lot more comprehensive in terms of MIB support.

-edit-
Behold my terrible code, github.com/ragzilla/ngm

ragzilla fucked around with this message at 19:51 on Mar 21, 2018
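
For anyone following along, the "save the raw counter, run the derivative later" idea can be sketched in a few lines of Python. This is only an illustration, not code from ngm or snmpcollector: the function name, the sample values, and the choice to treat a backwards counter as a single wrap are all assumptions.

```python
def rates_from_counters(samples, counter_bits=64):
    """Turn (timestamp_seconds, raw_counter) samples into per-second rates.

    A counter that goes backwards is treated as a single wrap of a
    counter_bits-wide counter. A device reboot looks the same, so a
    real collector may prefer to discard those points instead.
    """
    modulus = 1 << counter_bits
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        delta = (c1 - c0) % modulus  # the modulo absorbs one wrap
        rates.append((t1, delta / dt))
    return rates


# Hypothetical octet-counter readings taken 60 seconds apart.
samples = [(0, 1_000), (60, 7_000), (120, 7_600)]
print(rates_from_counters(samples))  # [(60, 100.0), (120, 10.0)]
```

Because the raw values are what gets stored, the wrap handling or the sampling window can be changed after the fact without losing data, which is presumably the appeal over storing precomputed deltas.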

abigserve
Sep 13, 2009

this is a better avatar than what I had before

ragzilla posted:

That’s what I did. Although it’s against 1.2 and I need to update for 1.5 since they’ve got int64 now, and I’ve only tested up to 20 devices or so so far (and if-mib only). I’ll try to clean the code up (remove anything internal) and toss it on github.

Or there’s snmpcollector but I’m not a fan since it holds metrics in memory and saves the deltas to the database (I prefer saving the raw counter and running derivative on it later). But it’s certainly a lot more comprehensive in terms of MIB support.

-edit-
Behold my terrible code, github.com/ragzilla/ngm

Nice one. I started writing something similar in Go as well which I'll probably work on more now that I have motivation. I have about 1200 devices to poll and multiple tables on each so she's not super straightforward, I'll take a look at your code as well...

Interestingly I got a response from the Telegraf devs on Github which basically said "you're poo poo outta luck, fam" so that's the end of that for now!

ragzilla
Sep 9, 2005
don't ask me, i only work here


abigserve posted:

Nice one. I started writing something similar in Go as well which I'll probably work on more now that I have motivation. I have about 1200 devices to poll and multiple tables on each so she's not super straightforward, I'll take a look at your code as well...

Interestingly I got a response from the Telegraf devs on Github which basically said "you're poo poo outta luck, fam" so that's the end of that for now!

I too have been tempted to rewrite and support more table types (I need to get environmentals, protocols, cpu/mem, and vpn sessions at a minimum). And apparently we're sitting at 2246 devices right now (although not all are SNMP, but then there's the issue of writing an ICMP poller in go).

gently caress managing stuff like Solarwinds and Cacti at that volume when I just want to log every single point on every interface.

abigserve
Sep 13, 2009

this is a better avatar than what I had before

ragzilla posted:

I too have been tempted to rewrite and support more table types (I need to get environmentals, protocols, cpu/mem, and vpn sessions at a minimum). And apparently we're sitting at 2246 devices right now (although not all are SNMP, but then there's the issue of writing an ICMP poller in go).

gently caress managing stuff like Solarwinds and Cacti at that volume when I just want to log every single point on every interface.

At the end of the day Statseeker (and now, AKIPS) seems to be the only suitable solution. We've had SS running for years and it never misses a beat, but good god it's a lovely interface which is why I was trying to be smart and replace it with a modern solution.

Interestingly you can use Grafana to pull data directly from statseeker and while that generally resolves the UI problems it also just doesn't feel worth it to run it like that given there is a nonzero amount of work integrating it nicely.

itskage
Aug 26, 2003


If I disable WLC on a 3850, will it drop the rest of the config (SSIDs and everything) or will they still be there if I need to re-enable it?

e: The answer is no. It keeps them in the config.

itskage fucked around with this message at 19:07 on Mar 24, 2018

ragzilla
Sep 9, 2005
don't ask me, i only work here


abigserve posted:

At the end of the day Statseeker (and now, AKIPS) seems to be the only suitable solution. We've had SS running for years and it never misses a beat, but good god it's a lovely interface which is why I was trying to be smart and replace it with a modern solution.

Interestingly you can use Grafana to pull data directly from statseeker and while that generally resolves the UI problems it also just doesn't feel worth it to run it like that given there is a nonzero amount of work integrating it nicely.

I wrote a (new) thing, telepoller. Inspired by telegraf syntax (heck, lifted a bunch of their SNMP/config code and made it run parallel), uses Uint64 where it can (so you'll need InfluxDB with the build flag turned on, and my updated Telegraf for batched inserts).

Not quite as battle tested as my old code (fun story, in the development I ran 'delete from ifMIB' on the production database, whoops), but it should be decently reliable. Got it pointed at one box for right now, probably add some metrics to track idle time like the old one did and then turn it loose on more of the network.
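
A rough Python sketch of what "made it run parallel" could look like in general terms. This is not telepoller's actual code: `poll_all`, `fake_poll`, the host names, and the worker count are all made up for illustration, and `fake_poll` stands in for a real SNMP walk so the example runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def poll_all(hosts, poll_device, max_workers=64):
    """Poll every host concurrently; return (results, errors) dicts.

    poll_device is a caller-supplied function (hypothetical here) that
    takes a hostname and returns whatever was collected from it.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(poll_device, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                results[host] = future.result()
            except Exception as exc:  # one dead box must not stall the rest
                errors[host] = exc
    return results, errors


def fake_poll(host):
    """Stand-in for a real SNMP walk, so the sketch runs offline."""
    if host.endswith("-down"):
        raise TimeoutError(f"no response from {host}")
    return {"ifHCInOctets": len(host)}


hosts = [f"sw{i}" for i in range(5)] + ["sw-down"]
results, errors = poll_all(hosts, fake_poll, max_workers=4)
print(sorted(results), sorted(errors))
```

The point of the pool is that a slow or unreachable device only ties up one worker for the length of its timeout, instead of holding up every host queued behind it.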

Methanar
Sep 26, 2013

by the sex ghost
What is the preferred way of preventing a particular subnet from being routed OUT through a particular ASN?

code:
ISPA  ISPB      ISPC  ISPD
  \     /         \    /
    R1    --------  R2
    |                |
S1    S2          S3   S4
I don't want S2 to go out through ISPA.

I was thinking I could attach a route-map to subnet S2 and set local preference 110 on it and then stick that on the ISPC's advertisement.

code:
neighbor ISPC route-map COG-RM-IN in

route-map COG-RM-IN permit 10
   match ip address access-list COG-ACL-OUT
   set local-preference 110
!

ip access-list COG-ACL-OUT
   10 permit ip S3/24 any
!
This did not do what I wanted it to. A bit scared to keep trial and erroring this because I'm watching a whole lot of traffic move around very quickly in my grafana graphs in ways I didn't expect.

tortilla_chip
Jun 13, 2007

k-partite
Prepend your AS on that prefix going to ISP A.

E: Your stated goal is a bit ambiguous. Do you want traffic from the Internet to S2 to prefer a particular link? Do you want outbound traffic from S2 to the Internet to take a particular path? Do you want S2 to be unreachable from the Internet via ISP A?

tortilla_chip fucked around with this message at 21:05 on Apr 5, 2018

Methanar
Sep 26, 2013

by the sex ghost

tortilla_chip posted:

Prepend your AS on that prefix going to ISP A.

E: Your stated goal is a bit ambiguous. Do you want traffic from the Internet to S2 to prefer a particular link? Do you want outbound traffic from S2 to the Internet to take a particular path? Do you want S2 to be unreachable from the Internet via ISP A?

I want traffic that originates from the subnet S2 to go out of something that isn't ISPA's link. Making traffic with a source IP of 11.11.11.0/24 go out via ISPC would be a good alternative.

I don't particularly care about how it ingresses my own ASN. It's egress that matters.

Methanar fucked around with this message at 21:48 on Apr 5, 2018

falz
Jan 29, 2005

01100110 01100001 01101100 01111010
Don't use route maps.

Come up with a sensible bgp community design. Have a unique community for routes originating from rX and sY.

Set a policy to do some action on whatever border peer to do something such as: don't advertise, AS prepend, set that provider's depref community, or some combination of these.

madsushi
Apr 19, 2009

Baller.
#essereFerrari
Is S2 a router? Routing is typically destination-based, so if you want R1 to do something different for S1-sourced vs S2-sourced, you'll have to get creative.

Filthy Lucre
Feb 27, 2006
Routing is normally done by the destination IP.

If you're wanting to route by the source IP (i.e. S2), you'll need to use Source Routing or Policy Based Routing to control the next hop based on the source IP.

Edit: Here's a basic PBR solution that might work for you; http://www.ciscozine.com/pbr-route-a-packet-based-on-source-ip-address/

Filthy Lucre fucked around with this message at 21:44 on Apr 5, 2018
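
To make the destination-vs-source distinction concrete, here is a toy Python model. It is a sketch, not how any router implements forwarding: the route table, the policy list, and the function names are invented for illustration, and only 11.11.11.0/24 comes from the thread. A normal lookup only ever sees the destination; a PBR-style rule gets checked against the source first.

```python
import ipaddress


def lookup(routes, dst):
    """Longest-prefix match on the destination address only."""
    dst = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in routes if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def forward(routes, policies, src, dst):
    """Check source-based policies first, then fall back to the table."""
    src = ipaddress.ip_address(src)
    for net, hop in policies:
        if src in net:
            return hop
    return lookup(routes, dst)


# Hypothetical table: default route via ISPA, one more-specific via ISPB.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "ISPA"),
    (ipaddress.ip_network("198.51.100.0/24"), "ISPB"),
]
# Hypothetical policy: anything sourced from S2 leaves via ISPC.
policies = [(ipaddress.ip_network("11.11.11.0/24"), "ISPC")]

print(forward(routes, policies, "10.0.1.5", "203.0.113.9"))    # ISPA
print(forward(routes, policies, "11.11.11.7", "203.0.113.9"))  # ISPC
```

Same destination in both calls, different exit, and the only thing that changed is the source address. That is the behaviour the routing table alone cannot give you.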

falz
Jan 29, 2005

01100110 01100001 01101100 01111010
Also, if "COG" means Cogent, they do have action BGP communities.

https://onestep.net/communities/as174/

Since you said BGP, you should really use purely BGP to scale this; PBR w/ route maps will become an unmanageable clusterfuck.

Methanar
Sep 26, 2013

by the sex ghost
Okay I'm reading through PBR, it might work? It is a clusterfuck waiting to happen though.

I see this as a pure BGP example of weighting outbound traffic. Particularly the MED section under the Traffic engineering outgoing traffic header.
Is that sort of what was being alluded to with doing a proper BGP community design?
https://www.noction.com/knowledge-base/multihoming

I hate that I need to do all of this for the first time on a live network.

tortilla_chip
Jun 13, 2007

k-partite
BGP is the wrong tool to try to implement source based routing.

Methanar
Sep 26, 2013

by the sex ghost
Okay maybe I'm the XY problem.


I have 4 subnets. Two subnets are sending out a huge amount of traffic to the internet. Within the last week, the amount of traffic has grown to the point that my original lovely egress load balancing scheme of prepending my own ASN to my bgp neighbor statements until things looked right no longer works.

Pretty much no matter how I prepend, prepending is just too broad and coarse and I always end up trying to push >10G through a 10G link. Since I can't apply prepending changes to a single subnet right now, I thought that some kind of source based routing or applying local preference somehow to a subnet might work.

The real problem is that I can't seem to apply any kind of BGP manipulation to BOTH of my subnets and make it work. I need to have one subnet behave differently from the other.

Can I fix this with VRFs then? Each subnet gets its own routing table (and bgp process?)

Actually, looking at my neighbor statements, maybe I'm being stupid. I should be able to selectively say what subnets ARE allowed to go out a particular neighbor through a prefix list. Right now the only one I have is advertising out to the internet what subnets are available through my ASN, right?

#current, list contains all 4 subnets. Tells the internet what is accessible in my ASN
neighbor 11.11.11.11 prefix-list HE-PL-OUT out
#will fix my problem? contains one high traffic subnet
neighbor 11.11.11.11 prefix-list HE-PL-IN in

Methanar fucked around with this message at 23:53 on Apr 5, 2018

abigserve
Sep 13, 2009

this is a better avatar than what I had before
You are trying to balance the outbound traffic on a per subnet basis within a single routing table. There is no way to implement this functionality using just BGP. You have the following options available to you:

1. Policy route the traffic from S2. As all your routing is done between two boxes and you don't care about the return path, this should be fairly easy with a single inbound route-map setting the next hop.
2. Put your subnets in different VRFs and leak the routes as required between each.
3. Get some more bandwidth, yo
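
A toy Python illustration of why BGP by itself can't do the per-subnet split. It is heavily simplified and every value in it is made up: real best-path selection has more tie-breakers (origin, MED, eBGP over iBGP and so on), and the AS numbers are from the documentation range.

```python
def best_path(candidates):
    """Pick one route for a prefix: highest local-pref, then shortest AS path.

    The remaining BGP tie-breakers are left out of this sketch.
    """
    return min(candidates, key=lambda r: (-r["local_pref"], len(r["as_path"])))


# Hypothetical routes for one destination prefix, learned from two neighbors.
candidates = [
    {"neighbor": "ISPA", "local_pref": 100, "as_path": [64496, 64511]},
    {"neighbor": "ISPC", "local_pref": 110, "as_path": [64497, 64500, 64511]},
]

# The choice is made once per destination prefix, so S1 and S2 both
# follow it. The packet's source address is never an input.
print(best_path(candidates)["neighbor"])  # ISPC
```

Local-pref and prepending only change which exit wins for a given destination. Nothing in the inputs says which internal subnet the packet came from, which is why the options above all step outside plain BGP.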

tortilla_chip
Jun 13, 2007

k-partite
To attempt better loadsharing in the outbound direction you can look at eiBGP multipathing.

Methanar
Sep 26, 2013

by the sex ghost
Well it's working with pbr. Guess I wait and see now if the CPU consumption goes through the roof tonight.

Don't feel at all satisfied that it's working.

Ahdinko
Oct 27, 2007

WHAT A LOVELY DAY
So I haven't done Zone based firewalling like you have to do on IOS-XE in a little bit, and I'm getting a bit confused with how to make this work properly and securely.

In standard IOS I'd have my access-list on an interface with an ip inspect as well, so reply traffic can get in and out but someone else on the outside couldn't SSH to my router for example.

In IOS-XE, it seems if I apply an ACL to the interface, if it doesn't match the ACL it just gets dropped and doesn't bother going into inspection via the zone based stuff to permit reply traffic. How do I make this work right?

Here's my config:
in IOS (works):
code:
ip access-list extended public-in-acl
 permit tcp x.x.x.x 0.0.3.255 host x.x.x.x eq 22

ip inspect name public-in-insp ftp
ip inspect name public-out-insp icmp
ip inspect name public-out-insp dns
ip inspect name public-out-insp pptp
ip inspect name public-out-insp ftp
ip inspect name public-out-insp h323
ip inspect name public-out-insp udp timeout 3600
ip inspect name public-out-insp tcp timeout 3600
ip inspect name public-out-insp ntp

interface Dialer1
 ip access-group public-in-acl in
 ip inspect public-in-insp in
 ip inspect public-out-insp out
In IOS-XE:

code:
ip access-list extended public-in-acl
 permit tcp x.x.x.x 0.0.3.255 host x.x.x.x eq 22

class-map type inspect match-any most-traffic
 match protocol icmp
 match protocol http
 match protocol ftp
 match protocol tcp
 match protocol udp
!
policy-map type inspect p1
 class type inspect most-traffic
  inspect
 class class-default
  drop

zone-pair security in-out source in destination out
 service-policy type inspect p1
zone-pair security out-in source out destination in
 service-policy type inspect p1

interface Dialer1
 zone-member security out

interface GigabitEthernet0/0/1
 zone-member security in
This works and lets traffic through, but it lets everything through and someone can SSH to my router. As soon as I do the below in IOS-XE, all inspection/reply traffic seems to stop working and all that will go through is SSH to my router. Internal users can no longer reach the internet, etc:

code:
interface Dialer1
 ip access-group public-in-acl in
 zone-member security out

Ahdinko fucked around with this message at 17:17 on Apr 6, 2018

mythicknight
Jan 28, 2009

my thick night

I have a new 4400 series router and some PRI lines to turn up.

Am I able to just use a normal cat6 cable to plug them in or do I still need a specifically wired ethernet cable? The last time I did this I had to wire my own since the pinouts were different.

I guess I should just try cat6 and hope. I hate making cables.

wolrah
May 8, 2006
what?

mythicknight posted:

I have a new 4400 series router and some PRI lines to turn up.

Am I able to just use a normal cat6 cable to plug them in or do I still need a specifically wired ethernet cable? The last time I did this I had to wire my own since the pinouts were different.

I guess I should just try cat6 and hope. I hate making cables.

Straight through, you can use any standard ethernet cable. Crossover is where they differ: ethernet swaps 1/2 with 3/6, where T1 swaps 1/2 with 4/5.
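
If it helps to see the two crossovers side by side, here is a small Python sketch that builds the pin maps from the swaps described above. The helper name and constants are made up for illustration, and it only models the two pairs mentioned in the post.

```python
def crossover(swaps, pins=8):
    """Return {near_end_pin: far_end_pin} for an 8P8C cable.

    swaps lists the pin pairs that trade places; every other pin
    runs straight through.
    """
    mapping = {pin: pin for pin in range(1, pins + 1)}
    for a, b in swaps:
        mapping[a], mapping[b] = b, a
    return mapping


# Pin swaps as described above: 1/2 with 3/6 for Ethernet, 1/2 with 4/5 for T1.
ETHERNET_CROSSOVER = crossover([(1, 3), (2, 6)])
T1_CROSSOVER = crossover([(1, 4), (2, 5)])

# Columns: near-end pin, Ethernet far-end pin, T1 far-end pin.
for pin in range(1, 9):
    print(pin, ETHERNET_CROSSOVER[pin], T1_CROSSOVER[pin])
```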

abigserve
Sep 13, 2009

this is a better avatar than what I had before

Methanar posted:

Well it's working with pbr. Guess I wait and see now if the CPU consumption goes through the roof tonight.

Don't feel at all satisfied that it's working.

Don't forget that your failover plan now has to include changing the PBR in the event the link dies.

Methanar
Sep 26, 2013

by the sex ghost

abigserve posted:

Don't forget that your failover plan now has to include changing the PBR in the event the link dies.

Yup :(

I've got a pretty good idea of how fragile this is. Found out at Saturday night peak that I went too far in the opposite direction. Shuttled too much traffic from A to B and now it's B that's pushing more traffic than is possible.
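
One way to sanity-check a static split like this before pushing it is to do the arithmetic offline. The Python below is only a sketch under loud assumptions: the per-subnet peak rates are invented numbers, the greedy placement is just one simple heuristic, and nothing here accounts for traffic shifting over the day.

```python
def assign_subnets(demand_gbps, links, capacity_gbps):
    """Greedy plan: place the busiest subnet first, on the least-loaded link."""
    load = {link: 0.0 for link in links}
    plan = {}
    for subnet, gbps in sorted(demand_gbps.items(), key=lambda kv: -kv[1]):
        link = min(links, key=lambda name: load[name])
        load[link] += gbps
        plan[subnet] = link
    # Any link listed here would be asked to carry more than it can.
    over = [link for link in links if load[link] > capacity_gbps]
    return plan, load, over


# Made-up peak rates in Gbit/s, purely for illustration.
demand = {"S1": 7.0, "S2": 6.0, "S3": 2.0, "S4": 1.5}
plan, load, over = assign_subnets(demand, ["ISPA", "ISPB"], capacity_gbps=10.0)
print(plan, load, over)
```

With only a handful of big subnets the granularity is coarse, so a plan that fits at one peak can still tip over at the next. That is the fragility being described here.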

abigserve
Sep 13, 2009

this is a better avatar than what I had before
Honestly you sound like a candidate for SD-WAN. However, at your scale I doubt such a solution exists and if it does it'll probably be orders of magnitude more expensive than just buying more links.

If you are a skilled Python developer and wanted to really get into the weeds, I think you could build an OpenFlow based solution written on top of the Ryu controller.

You would have an OF switch sitting between the two links, the same BGP configuration, and it would be balancing outbound traffic based on whatever you wanted it to (could do TCP connections for example). The downstream router would think it's sending traffic to ISPA, but really it's going to the OF switch and then being switched to ISPB (or vice versa).

Google/Facebook do this sorta stuff already but they won't give you access to their code.

Sepist
Dec 26, 2005

FUCK BITCHES, ROUTE PACKETS

Gravy Boat 2k
Just use pfr on your egress routers

https://www.google.com/amp/s/aitaseller.wordpress.com/2012/10/15/pfr-cisco-performance-routing/amp/

Sepist fucked around with this message at 12:55 on Apr 9, 2018
