Zino 2.0.0 (beta) has a fairly large performance/scaling issue #383

Open
lunkwill42 opened this issue Nov 7, 2024 · 0 comments · May be fixed by #391
Labels
bug Something isn't working

lunkwill42 commented Nov 7, 2024

Background

Zino 1 (the legacy Tcl codebase) uses Scotty for SNMP communication. The actual SNMP code here is written in C, while Scotty provides a Tcl interface for it.

For Python, there aren't many good alternatives for SNMP libraries that can work asynchronously, and often it boils down to PySNMP, a pure-Python SNMP implementation. We chose PySNMP 4 as the underlying SNMP library for Zino early in development, not for performance, but for speed of development.

For small workloads, PySNMP works just fine. However, in field tests against a backbone network of ca. 230 routers, Zino 2.0.0-beta.2 unfortunately performs abysmally. Zino quickly reaches 100% CPU load, and on anything less than stellar hardware, it fails to finish all collection jobs within the allotted interval (typically 5 minutes).

Profiling a running Zino 2 server reveals, as we expected, that most of Zino's CPU time is spent inside the PySNMP library, doing low-level decoding and encoding of SNMP packets - a task which Python isn't really suited for at scale.
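
To make the profiling observation a bit more concrete, here is a minimal sketch of how per-function CPU time can be collected with the standard library's cProfile. The `encode_integer` and `encode_many` functions are hypothetical toy stand-ins for the kind of byte-level BER work an SNMP library performs. They are not PySNMP code, and this is not necessarily the tooling we used on the running server.

```python
import cProfile
import io
import pstats


def encode_integer(value: int) -> bytes:
    """Toy BER encoding of a non-negative INTEGER (tag 0x02, short-form length)."""
    # One extra bit of room keeps the sign bit clear for non-negative values
    length = value.bit_length() // 8 + 1
    payload = value.to_bytes(length, "big")
    return bytes([0x02, len(payload)]) + payload


def encode_many(count: int) -> int:
    """Hypothetical workload: encode many values, return total bytes produced."""
    return sum(len(encode_integer(n)) for n in range(count))


def profile_workload(count: int = 50_000) -> str:
    """Profile the toy workload and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    encode_many(count)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_workload())
```

In a report like this, the encoding helpers dominate the cumulative time column, which is the same general shape of result we describe above for PySNMP.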

For smaller workloads, Zino 2.0.0-beta.2 is probably fine (though we assume it will consume noticeable amounts of CPU-time). We do not have benchmarks that show what the cutoff point may be.

Potential mitigation strategies

Here are some potential mitigation strategies that we have considered and/or tested.

Running Zino under PyPy

PyPy is an alternative implementation of Python with a built-in JIT compiler. The team behind PyPy boasts that "On average, PyPy is 4.4 times faster than CPython 3.7."

Running Zino under PyPy requires no changes to Zino code, and our own tests confirm that it ekes some extra performance out of Zino. It will, however, still consume a lot of CPU compared to Zino 1.

Compiling Zino to C++ using Typon

Typon is a work-in-progress Python-to-C++ compiler, which can vastly increase code performance by avoiding the overhead of the Python interpreter itself.

However, in our tests, we weren't even able to produce a working "Hello world" program that didn't just hang.

Running Zino under Python 3.13 with JIT enabled

Python 3.13 was released on October 7, 2024. It introduced a new, experimental JIT compiler to the Python interpreter, which could improve Zino's performance (although probably not as much as PyPy, which has been around for a long time). However, the JIT compiler must be enabled at compile-time, and we have not yet managed to produce a working Python build with it enabled.
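
As a rough aid for anyone attempting such a build, the sketch below checks whether the running interpreter was configured with CPython's `--enable-experimental-jit` option by inspecting the recorded configure arguments. The helper name `jit_build_flag` is our own invention, `CONFIG_ARGS` is normally only populated on POSIX builds, and the check only shows how the interpreter was configured, not whether the JIT is actually active at runtime.

```python
import sysconfig


def jit_build_flag(config_args=None):
    """Return the --enable-experimental-jit configure argument, or None if absent.

    Hypothetical helper: looks only at build-time configuration.
    """
    if config_args is None:
        # CONFIG_ARGS may be None on platforms that do not use ./configure
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    for arg in config_args.split():
        arg = arg.strip("'\"")  # configure arguments are usually stored quoted
        if arg.startswith("--enable-experimental-jit"):
            return arg
    return None


if __name__ == "__main__":
    flag = jit_build_flag()
    print(flag if flag else "no JIT configure flag found")
```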

Running Zino under Python 3.13 with free-threading enabled

Python 3.13 also introduced the long-awaited experimental "free-threading" (or NoGIL) support. This can greatly enhance Python multi-threading performance by removing the oft-mentioned GIL (Global Interpreter Lock), which more or less blocks Python code from achieving true multi-threaded parallelism. This experimental feature must also be enabled at compile-time, specifically because it can break Python code that was written in a non-thread-safe manner (under the long-standing assumption that the GIL will ensure thread-safety).

However, Zino 2.0 is, like Zino 1, designed to run in a single thread and would gain no benefit from this (performance would actually be degraded for single-threaded applications). It might benefit if we were to rewrite Zino 2 to use multiple worker threads.
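
For reference, a small sketch of how one might verify whether a given interpreter is a free-threaded build and whether the GIL is currently active. The function name `free_threading_status` is hypothetical; `sys._is_gil_enabled()` only exists on Python 3.13 and later, so the sketch assumes a GIL is present when it is missing.

```python
import sys
import sysconfig


def free_threading_status():
    """Report build-time free-threading support and the runtime GIL state."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # Added in Python 3.13; older interpreters always run with a GIL
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(free_threading_status())
```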

Replacing PySNMP with a different library

This was the option we had in mind as we set out to build Zino using PySNMP. Zino uses the adapter pattern to avoid tying every piece of Zino to a specific SNMP library. A new adapter for a different library can be written, while providing the same interface Zino code is already using.
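
The idea can be illustrated with a minimal sketch. The names below (`SNMPBackend`, `InMemoryBackend`, `fetch_sysname`) are hypothetical and chosen for illustration only; they are not Zino's actual interface. The point is simply that calling code depends on an abstract interface, so the concrete library behind it can be swapped.

```python
import asyncio
from abc import ABC, abstractmethod


class SNMPBackend(ABC):
    """Hypothetical interface that the rest of the application codes against."""

    @abstractmethod
    async def get(self, oid: str) -> object:
        """Fetch a single value by OID."""


class InMemoryBackend(SNMPBackend):
    """Stand-in adapter; a real one would wrap PySNMP or a Net-SNMP binding."""

    def __init__(self, values: dict):
        self._values = values

    async def get(self, oid: str) -> object:
        await asyncio.sleep(0)  # yield to the event loop, as network I/O would
        return self._values[oid]


async def fetch_sysname(backend: SNMPBackend) -> object:
    # The caller only knows the interface, never the concrete library
    return await backend.get("1.3.6.1.2.1.1.5.0")  # sysName.0


if __name__ == "__main__":
    backend = InMemoryBackend({"1.3.6.1.2.1.1.5.0": "router1"})
    print(asyncio.run(fetch_sysname(backend)))
```

Swapping libraries then means writing one new subclass, while callers such as `fetch_sysname` stay untouched.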

However, as mentioned initially, there are few good options for Python-compatible SNMP libraries outside of PySNMP:

  • Net-SNMP, the long-standing, stable, and performant SNMP library written in C, provides its own Python bindings. However, these are only suited for synchronous use of the Net-SNMP library, which is useless for Zino.
  • easysnmp is an alternative Python binding to Net-SNMP, with a nicer, more Pythonic interface. However, this library also provides no useful support for asynchronous usage.
  • pynetsnmp is actually quite nice, and is what we use for NAV. This project provides a ctypes-based dynamic Python interface to the Net-SNMP library. It does, however, have several disadvantages that would impact Zino more than they currently impact NAV:
    • The project is mainly focused on Python 2, which the world at large has abandoned.
    • We have had to fork the project in order to support Python 3 and IPv6 (due to complete radio silence from the upstream developers when patches are offered).
    • While the library provides a low-level interface to Net-SNMP, it is mostly focused on providing an SNMP protocol interface for the Twisted library, which is a framework for writing asynchronous network code that pre-dates the asyncio implementation built into Python 3.
    • Lastly, the library is licensed under GPL v3, which would make it incompatible with the Apache-2 licensed Zino for legal reasons (we would have to relicense Zino to GPL).

Writing a new Python binding to Net-SNMP

We have years of positive experience with the Net-SNMP library and its performance: it is, after all, the library that has provided most command-line SNMP functionality for years and years, and it has worked successfully for NAV. As a (potentially) last alternative, we could therefore write a new Python binding to Net-SNMP. Some advantages (over pynetsnmp) and synergies would be:

  • We can license the result however we like.
  • We can use the more modern and recommended CFFI to interface with Net-SNMP from Python code (rather than ctypes, which pynetsnmp uses).
  • We can support an asyncio-enabled asynchronous interface for Python code.
  • We don't have to care about Python 2 compatibility, or compatibility with older Net-SNMP versions, which pynetsnmp retains.
  • The result could be re-usable in NAV, and enable us to further modernize the NAV codebase by ditching Twisted in favor of asyncio.
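
As a rough illustration of what an asyncio-enabled interface could build on, the sketch below waits for a file descriptor to become readable without blocking the event loop. It assumes the C library lets the binding get at its socket descriptors; a plain `socket.socketpair()` stands in for that descriptor here, and `wait_for_response` and `demo` are hypothetical names, not part of any existing binding. Note that `loop.add_reader()` requires a selector-based event loop, which is the default on Unix.

```python
import asyncio
import socket


async def wait_for_response(sock: socket.socket) -> bytes:
    """Resolve once the descriptor is readable, without blocking the loop."""
    loop = asyncio.get_running_loop()
    future = loop.create_future()

    def on_readable():
        # Stop watching first so the callback fires only once
        loop.remove_reader(sock.fileno())
        if not future.done():
            # A real binding would hand control to the C library here
            future.set_result(sock.recv(4096))

    loop.add_reader(sock.fileno(), on_readable)
    return await future


async def demo() -> bytes:
    ours, theirs = socket.socketpair()  # stand-in for an SNMP session socket
    ours.setblocking(False)
    try:
        theirs.send(b"response")
        return await wait_for_response(ours)
    finally:
        ours.close()
        theirs.close()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The heavy lifting (packet encoding and decoding) would stay in C, while Python code only awaits the result.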
@lunkwill42 lunkwill42 added the bug Something isn't working label Nov 7, 2024
@lunkwill42 lunkwill42 pinned this issue Nov 7, 2024
@lunkwill42 lunkwill42 linked a pull request Jan 22, 2025 that will close this issue