Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent node crashing when an entry is empty #4433

Closed
ArpegorPSGH opened this issue Apr 24, 2022 · 15 comments
Closed

Prevent node crashing when an entry is empty #4433

ArpegorPSGH opened this issue Apr 24, 2022 · 15 comments

Comments

@ArpegorPSGH
Copy link
Contributor

There are some cases where some nodes output empty lists (like overlap polygons if no polygons overlap), and then it makes some of the following nodes crash, especially the for each node. It is possible to circumvent this issue by using an input switch conditionned on the output length to switch it with a default value, but it is so unelegant.
Could it be possible to make it so that if at least one of the node entries connected is empty, the node operation is skipped, and its outputs are set to empty lists as well? Maybe just register it as a warning for debugging purposes?

@Durman
Copy link
Collaborator

Durman commented Apr 25, 2022

First of all it would be nice to investigate source of the crash. Could you upload a file with minimum number of nodes which would reproduce the crash?

@ArpegorPSGH
Copy link
Contributor Author

Okay, but more than a specific crash, isn't it a general crash case because of the conception of the nodes?

@Durman
Copy link
Collaborator

Durman commented Apr 25, 2022

Could it be possible to make it so that if at least one of the node entries connected is empty, the node operation is skipped, and its outputs are set to empty lists as well? Maybe just register it as a warning for debugging purposes?

I agree than nodes should be more robust if they does not have enough data to perform their function. I think that initially all nodes were designed in way that when they does not have enough data they raise Nod Data Error. It can be quite inconvenient sometimes. I already pointed to this problem a while a go but it seems there is no separate issue about that.

The problem is a bit more delicate. Nodes unable to perform operation can either return empty list or they can return unmodified input data. The last variant is what would be expected from muted node. Not all input sockets are mandatory so it is for a node to decide whether it can perform any operation or not. Also mapping between input and output nodes should be declared inside a node so it would be clear how to pass unmodified data.

Currently the whole system is made in way that all nodes get input data by their selves, thus its for them to handle the problem of missing data. As we have already many hundreds of nodes it's a quite big peace of work to fix them all. And actually this is not the way to go. My proposal is to delegate data management to update system first and after this it will be simpler to fix the problem. I already started doing this - #4323.

@ArpegorPSGH
Copy link
Contributor Author

How can a node be muted if its outputs are completely different from its inputs? Moreover, doing so would just mess up the operations afterwards. I think only outputting empty lists is really useful.

If it is too complicated to know whether the node can perform its operation or not with the data it has, then the update system could just consider that if an input is connected, but the given list is empty, then it should skip node operation.

Also, if you are working on a new version of the update system, could it be possible to optimize treatment by not performing operations not needed? For example, if an input switch is used, all inputs that will not be chosen can be skipped. Also, could it be possible to make nodes perform operations in parallel (multithreading) on the first level of the data given (multiple meshes), especially the for each loop node?

@ArpegorPSGH
Copy link
Contributor Author

Here is an example of a For Each loop node crash when an entry is an empty list.
Exemple For each loop crash when empty input_2022_04_25_17_14.zip

@Durman
Copy link
Collaborator

Durman commented Apr 26, 2022

How can a node be muted if its outputs are completely different from its inputs?

In this case it should output None (empty list). This is standard behavior in Blender node editors. Such nodes can be classified as data generators. But also there is another class of nodes which modify data. For example Matrix Apply node. Muting such node would mean that it should output input meshes without applied matrixes.

could it be possible to optimize treatment by not performing operations not needed?

There is a proposal for that #4322

For example, if an input switch is used, all inputs that will not be chosen can be skipped.

Because of the vectorization system it's not so simple. On one vectorization layer one input can be used on another layer another one.

Also, could it be possible to make nodes perform operations in parallel (multithreading) on the first level of the data given (multiple meshes), especially the for each loop node?

I think yes but it also should be done after moving data management from nodes to update system. And probably it does not give desired performance. Many nodes works hundreds times slower because of they use pure Python or because of some other reasons. So it would be more efficient to make optimizations inside such nodes. And there is global problem of vectorization performance. Now it uses pure Python loops which are slow and alternatives are not clear.

@ArpegorPSGH
Copy link
Contributor Author

Well, that makes a lot of room for improvements on efficiency. According to you, the node trees could compute up to a few thousands times faster if everything is optimized as much as possible, is that right?

Even if some nodes are very slow, multi-threading will divide the computation time all the same. Unless it cannot be done without optimizing nodes first, allowing multi-threading is more important I think, because it will reduce the computation time of all nodes at once, and not just one after the other.

How is all that coming along? Is it doable with a reasonable amount of work to invest? (i. e. it won't take years, right?)

By the way, some nodes can use numpy and output numpy arrays, which is generally faster, but not all nodes can do that : is there a reason? And can nodes with and without numpy in a same node tree work well with each other?

@Durman
Copy link
Collaborator

Durman commented Apr 26, 2022

Yes, performance is one of the biggest problems of Sverchok. Some has opinion that this is its only problem.)

I can't make any prognoses about multithreading because I even did not have any experiments with it yet. The main problem is free time which I can afford on this project.

Technically I would say neither of nodes outputs Numpy arrays. All nodes outputs lists and these lists can include either other lists or Numpy arrays. This's why Numpy does not solve all problems. Big number of small Numpy arrays does not work faster than Python lists.

Initially all nodes worked only with Python lists and only relatively recently Numpy arrays started to appear. I think in most case nodes can handle both Numpy arrays and Python lists but I guess that some nodes can raise an error with Numpy arrays because they was not fully integrated.

@ArpegorPSGH
Copy link
Contributor Author

You are alone on that task? I know, free time is the problem. When I finish my current project (a year from now?) on Sverchok and if I get enough free time, I might be willing to help you improve performances. However, I am currently quite a beginner in python, so optimizing code is maybe a bit too much for me.

@Durman
Copy link
Collaborator

Durman commented Apr 26, 2022

On update system I'm along now since I completely rewritten the old one. =) I would not expect to get a lot of help here because its a lot of tricky code without any documentation. What would be real is to define slow nodes and try to improve their performance. It would especially efficient if the nodes was used in some real node tree layouts.

@ArpegorPSGH
Copy link
Contributor Author

To know which nodes are slow, I can partially test that with my project, as I use around a hundred of them. However, the current run time displayed is in ms, so most of the nodes run in 1-2ms, and the simple ones like list item, list length, etc... run in 0-1ms, so not easy to have an idea which ones are the slowest (well, 1ms for just getting a few items in a list is like forever). Maybe switch that to microseconds.

@Durman
Copy link
Collaborator

Durman commented Apr 28, 2022

This can be fixed here

sverchok/node_tree.py

Lines 288 to 290 in 6d31391

if update_time is not None:
update_time = int(update_time * 1000)
sv_bgl.draw_text(self, f'{update_time}ms', update_pref + node_id, align="UP", dynamic_location=False)

But probably it would be simpler to increase input data to make nodes to take more time.

@ArpegorPSGH
Copy link
Contributor Author

Well, yes, for a systematic test, but I'm talking of finding in my node trees the slow nodes, and each node analyzes quite small amounts of data, but inside long loops performing thousands of iterations.

@Durman
Copy link
Collaborator

Durman commented Apr 28, 2022

image

Also we have such panel in development mode. It can profile calling of internal functions. Its output can be nicely visualized like this:
test1

@Durman
Copy link
Collaborator

Durman commented May 5, 2022

Will close since the problem is known

@Durman Durman closed this as completed May 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants