Refine HostInterfaceMTUSize alert rule #107

Open
facundofc opened this issue May 17, 2024 · 7 comments

Comments

@facundofc

facundofc commented May 17, 2024

Enhancement Proposal

The charm ships an alert rule (HostInterfaceMTUSize) that looks for MTU changes on network interfaces. The rule does not filter out any interface, so it also triggers on changes to br-int's MTU. More details about the br-int interface can be read on this issue's comment, but the bottom line is that its MTU is irrelevant. Users creating networks with different MTUs will cause the interface's MTU to change, and this does not indicate any issue.

I can think of only two ways of solving this, but more ideas are welcome of course.

  1. Add a configuration property to the charm to specify a set of interfaces to be ignored by the rule.
  2. Make the charm automatically detect which interfaces are relevant to monitor. This could be determined by reading the output of ip -j -d link and looking for a certain set of properties (like .linkinfo.info_kind == "openvswitch" and the like).
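A minimal sketch of what option 2 could look like, assuming the charm can run `ip -j -d link` on the host and parse its JSON output. The names `IGNORED_KINDS`, `read_links` and `split_by_kind` are hypothetical and not part of the charm; the set of kinds to ignore is only the one candidate named above.

```python
import json
import subprocess

# Hypothetical set of link kinds to skip; "openvswitch" is the only
# candidate mentioned in this proposal.
IGNORED_KINDS = {"openvswitch"}


def read_links():
    """Return the parsed output of `ip -j -d link` (needs iproute2 with JSON support)."""
    result = subprocess.run(
        ["ip", "-j", "-d", "link"], check=True, capture_output=True, text=True
    )
    return json.loads(result.stdout)


def split_by_kind(links, ignored_kinds=IGNORED_KINDS):
    """Split interface names into (monitored, ignored) using linkinfo.info_kind.

    Interfaces without a linkinfo entry (for example physical NICs) are monitored.
    """
    monitored, ignored = [], []
    for link in links:
        kind = link.get("linkinfo", {}).get("info_kind")
        (ignored if kind in ignored_kinds else monitored).append(link["ifname"])
    return monitored, ignored


# Trimmed sample data in the same shape as the `ip -j -d link` output.
sample = json.loads(
    '[{"ifname": "lo", "mtu": 65536},'
    ' {"ifname": "br-int", "mtu": 1500, "linkinfo": {"info_kind": "openvswitch"}}]'
)
print(split_by_kind(sample))
```

On a real host, `split_by_kind(read_links())` would give the two lists that the rule template could then be rendered from.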

I see issues with both approaches, namely:

  1. This path risks making the charm's list of config properties very long, since this is a very general charm ("monitor a host").
  2. I'm not sure how often this could happen, but the list of interfaces to be ignored could change. This brings the need to detect that and re-render the rule.

Personally, I would go for alternative 1. Likely there are more interfaces we want to ignore, like TAP interfaces created for the VMs and so on.
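As a rough illustration of alternative 1, the sketch below turns a comma-separated config value into a negative regex matcher on the `device` label that could be spliced into the rule expression. The function name, the config format and the `tap.*` pattern are assumptions made for illustration; the actual rule expression shipped by the charm is not shown in this issue.

```python
def device_exclusion_matcher(config_value):
    """Build a PromQL matcher such as device!~"br-int|tap.*" from a config string.

    `config_value` is a hypothetical comma-separated list of regex patterns.
    Returns an empty string when nothing should be excluded.
    """
    patterns = [p.strip() for p in config_value.split(",") if p.strip()]
    if not patterns:
        return ""
    for pattern in patterns:
        # Keep the sketch simple: reject characters that would need escaping
        # inside a double-quoted PromQL string.
        if '"' in pattern or "\\" in pattern:
            raise ValueError(f"unsupported character in pattern: {pattern!r}")
    return 'device!~"{}"'.format("|".join(patterns))


print(device_exclusion_matcher("br-int, tap.*"))
```

Prometheus anchors regex matchers at both ends, so `br-int` here excludes only the interface with exactly that name, while `tap.*` excludes every device whose name starts with `tap`.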

Our immediate workaround is to add a very long silence matching device != "br-int", but this is far from ideal.

@sed-i
Contributor

sed-i commented Feb 7, 2025

Via @lucabello:

We probably don’t want to add a juju config option, but the automatic detection (ip -j -d link) might be worth looking into. Would you be able to help spec it out a bit?

@sed-i
Contributor

sed-i commented Feb 14, 2025

@facundofc would you be able to help spec this out?

@lucabello
Contributor

Closing this, but please feel free to re-open if needed!

@nishant-dash

Hello,

why has this issue been closed? @lucabello

@lucabello
Contributor

lucabello commented Mar 7, 2025

Hey @nishant-dash, this has been closed because of inactivity; could you help us spec this out as @sed-i was suggesting? :)

@nishant-dash

You can either check with

ip -j -d link | jq -r ".[] | (.ifname, .mtu, .min_mtu, .max_mtu)" | paste - - - - | column -t

or

from pyroute2 import IPDB
with IPDB() as ipdb:
    for iface in ipdb.interfaces.values():
        print(iface.ifname, iface.mtu, iface.get("min_mtu", ""), iface.get("max_mtu", ""))

and you will see

...
lo 65536 0 0
...
ens1f1np1 9000 68 9978
...
br-int 1500 68 65535
...
br-data 9000 68 65535
br-data 9000 68 65535
...
bondm.3216 1500 0 65535
...
ens1f0v1 1500 68 9978
...

In terms of mtu, at first glance it seems that as long as

  1. they don't flap, and
  2. they satisfy min_mtu <= mtu <= max_mtu

we should be ok, and this would apply to all interfaces.
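The two conditions can be sketched as below, assuming each interface is a dict shaped like an entry of the `ip -j -d link` output. The function names are hypothetical. The sketch also assumes that a `max_mtu` of 0 means "no upper limit", which matches the `lo 65536 0 0` row above but should be double-checked against the kernel documentation.

```python
def mtu_in_range(link):
    """Return True when min_mtu <= mtu <= max_mtu for one interface dict.

    Assumption: max_mtu == 0 (or missing) means there is no upper limit.
    """
    mtu = link["mtu"]
    min_mtu = link.get("min_mtu", 0)
    max_mtu = link.get("max_mtu", 0)
    if mtu < min_mtu:
        return False
    return max_mtu == 0 or mtu <= max_mtu


def count_mtu_changes(samples):
    """Count how often the MTU differs between consecutive samples."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


lo = {"ifname": "lo", "mtu": 65536, "min_mtu": 0, "max_mtu": 0}
nic = {"ifname": "ens1f1np1", "mtu": 9000, "min_mtu": 68, "max_mtu": 9978}
print(mtu_in_range(lo), mtu_in_range(nic))
print(count_mtu_changes([1500, 1500, 9000, 1500]))
```

What number of changes within which time window counts as "flapping" is left open here; that threshold would have to come from the alert rule itself.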

Other than mtu, you would need to look at linkinfo which is a dict containing a variety of values like

  "linkinfo": {
    "info_kind": "openvswitch"
  },

or

  "linkinfo": {
    "info_kind": "bridge",
    "info_data": {
...
}

or it may be an interface that's part of a bond, in which case it does not have that property.

For the purpose of MTU, I believe the points I mentioned might suffice? @facundofc

@facundofc
Author

Hi there! Apologies for leaving this hanging.

There seems to be a misunderstanding about the original issue. No, we don't want to detect MTU flapping on all interfaces. The issue is that we are currently alerting on br-int MTU flapping, and we shouldn't. So:

we should be ok, and this would apply to all interfaces.

Not really.

I still don't know how to automatically detect which interfaces to monitor for MTU changes. My original suggestion of filtering out ifaces with .linkinfo.info_kind == "openvswitch" doesn't seem to hold. In a new deployment we're setting up bridges as Open vSwitch directly, and I think we'd like to monitor these interfaces too (and of course, their info_kind is openvswitch). As an example, on the same host we have:

  {
    "ifindex": 12,
    "ifname": "br-int",
    "flags": [
      "BROADCAST",
      "MULTICAST"
    ],
    "mtu": 1500,
    "qdisc": "noop",
    "operstate": "DOWN",
    "linkmode": "DEFAULT",
    "group": "default",
    "txqlen": 1000,
    "link_type": "ether",
    "address": "7e:9a:7a:2a:07:da",
    "broadcast": "ff:ff:ff:ff:ff:ff",
    "promiscuity": 1,
    "min_mtu": 68,
    "max_mtu": 65535,
    "linkinfo": {
      "info_kind": "openvswitch"
    },
    "inet6_addr_gen_mode": "eui64",
    "num_tx_queues": 1,
    "num_rx_queues": 1,
    "gso_max_size": 65536,
    "gso_max_segs": 65535
  }

and

  {
    "ifindex": 18,
    "ifname": "br-data",
    "flags": [
      "BROADCAST",
      "MULTICAST",
      "UP",
      "LOWER_UP"
    ],
    "mtu": 9000,
    "qdisc": "noqueue",
    "operstate": "UNKNOWN",
    "linkmode": "DEFAULT",
    "group": "default",
    "txqlen": 1000,
    "link_type": "ether",
    "address": "98:03:9b:9c:98:e8",
    "broadcast": "ff:ff:ff:ff:ff:ff",
    "promiscuity": 1,
    "min_mtu": 68,
    "max_mtu": 65535,
    "linkinfo": {
      "info_kind": "openvswitch"
    }
  }

We want to ignore the first one (br-int) but monitor the second one (br-data). The only differences I see are in operstate and flags, but I have no idea how reliable it is to use these to distinguish them.
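For discussion purposes, here is what a filter based on those two fields might look like. This is only a sketch of the heuristic whose reliability is questioned above, not a validated approach, and `looks_active` is a hypothetical name. The sample dicts are trimmed copies of the two interfaces shown earlier.

```python
def looks_active(link):
    """Heuristic: treat an interface as worth monitoring when it is
    administratively up ("UP" flag) and its operstate is not "DOWN".

    Unvalidated assumption: br-int stays down on the hosts of interest.
    """
    return "UP" in link.get("flags", []) and link.get("operstate") != "DOWN"


br_int = {
    "ifname": "br-int",
    "flags": ["BROADCAST", "MULTICAST"],
    "operstate": "DOWN",
    "linkinfo": {"info_kind": "openvswitch"},
}
br_data = {
    "ifname": "br-data",
    "flags": ["BROADCAST", "MULTICAST", "UP", "LOWER_UP"],
    "operstate": "UNKNOWN",
    "linkinfo": {"info_kind": "openvswitch"},
}

print([link["ifname"] for link in (br_int, br_data) if looks_active(link)])
```

A filter like this would also drop any physical interface that happens to be down, which may or may not be desirable for this alert.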

As for the min_mtu <= mtu <= max_mtu part, I don't think we have to check this. It seems impossible, at least for physical interfaces, to set the MTU outside of these limits:

# ip -j -d link | jq -r ".[] | [.ifname, .mtu, .min_mtu, .max_mtu] | @tsv" | column -t
lo         65536  0    0
enp0s31f6  1500   68   9000
wlp0s20f3  1500   256  2304
tun0       1500   68   65535
# ip l set mtu 200 dev wlp0s20f3
Error: mtu less than device minimum.
# ip l set mtu 2400 dev wlp0s20f3
Error: mtu greater than device maximum.


4 participants