I suggest use red-black tree instead of binary search tree to store the flows. #2363

AlexanderDymov · 2024-04-03T15:39:52Z

Hello guys,

I would like to ask a question:
The binary search tree (BST) in which flows are stored is not balanced.
The complexity analysis of BST shows that, on average, the insert, delete and search takes O(log n) for n nodes. But in the worst case, they degrade to that of a singly linked list: O(n).

When processing each packet, 3 search operation in the tree are performed (2 calls to ndpi_tfind and 1 call to ndpi_tsearch).
This can cause large delays if the number of flows is large (> 1 million) and the tree is degenerate into a list.

The question is: Why don't you use a balanced tree like a red-black tree where the worst case search cost is O(log n)?

Thank you for any hint

IvanNardi · 2024-04-03T16:08:26Z

You analysis is correct.
However there is a missing point: nDPI (the library) doesn't do any flow management; this (important and critical) topic is left to the application. I am sure that any serious applications linking to nDPI is using a proper and efficient data structure for keeping flows information, tailored to the specific application needs.
ndpiReader is only a basic example, aiming to show nDPI capabilities and how to use nDPI API: performance is not its goal.
I am not sure because I was not involved with the project at the time, but it is quite likely that ndpiReader is using a simple tree only because that data structure was easily available at the time, or it was easy to implement.

Having said that, any effort from the community to improve (also) ndpiReader performance is welcomed :)

AlexanderDymov · 2024-04-03T16:47:56Z

Hi, Ivan!

Thank you for you answer.

I will research this problem and possible solutions.

Thank you!

mmanoj · 2024-07-20T18:21:48Z

@AlexanderDymov

Any update on your research?

AlexanderDymov added the enhancement label Apr 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

I suggest use red-black tree instead of binary search tree to store the flows. #2363

I suggest use red-black tree instead of binary search tree to store the flows. #2363

AlexanderDymov commented Apr 3, 2024 •

edited

Loading

IvanNardi commented Apr 3, 2024

AlexanderDymov commented Apr 3, 2024

mmanoj commented Jul 20, 2024

I suggest use red-black tree instead of binary search tree to store the flows. #2363

I suggest use red-black tree instead of binary search tree to store the flows. #2363

Comments

AlexanderDymov commented Apr 3, 2024 • edited Loading

IvanNardi commented Apr 3, 2024

AlexanderDymov commented Apr 3, 2024

mmanoj commented Jul 20, 2024

AlexanderDymov commented Apr 3, 2024 •

edited

Loading