Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

explicit/SCCComputerTarjan.tarjan() crashes to stack overflow for larger models #77

Open
ghost opened this issue May 28, 2018 · 3 comments

Comments

@ghost
Copy link

ghost commented May 28, 2018

Hi.
When using the explicit engine to compute steady-state probabilities for this model, stack overflow exception is thrown and the computation fails, while the GUI gets stuck and pretends the computation continues.

dtmc //random walk along a path

const int qCapacity = 50000;

module main

qSize : [0..qCapacity] init 0;

[step] (qSize < qCapacity & qSize != 0) -> 0.5 : (qSize' = qSize+1) + 0.5 : (qSize' = qSize-1);
[step] (qSize = 0) -> (qSize' = qSize+1);
[step] (qSize = qCapacity) -> (qSize' = qSize-1);

endmodule

I did not explore this bug in much depth, nor I understand this algorithm.
But from what I can tell, the reason is as follows.
SCCComputerTarjan attempts to explore all successors of the initial state recursively.
Each time the computation tries to explore a deeper successor, function tarjan(int i) gets called recursively on top of the previous one, which is apparent from the code.

	private void tarjan(int i) throws PrismException
	{
		...
		SuccessorsIterator it = model.getSuccessors(i);
		while (it.hasNext()) {
			...
			if (n.index == -1) {
				tarjan(e); //#THE RECURSIVE CALL##################################
				...
		}
		....
	}

For queue-like models with many states, it can easily happen that the recursion is so deep that stack overflow occurs before all states are explored. Because of this, it has happened to me that similar queue-like models with as few as 5000 states are already too much to handle.

Thanks!

@kleinj
Copy link
Member

kleinj commented May 28, 2018

Hi,

yes, the current implementation of computing the strongly-connected components via Tarjan's algorithm is implemented recursively in the explicit engine. For "deep" models, this can lead to the Java stack getting exhausted. A quick "fix" is to increase the Java stack size, by tweaking the default value in the assignment

PRISM_JAVASTACKSIZE="-Xss4M"

in the bin/prism startup script (which is also called when you run xprism). The default is a 4MB stack size, if you increase that to something like 40 or 100MB, you should be able to handle larger models.

Some time ago I implemented a non-recursive implementation of Tarjan's algorithm, i.e., one that maintains its own computation stack, but that is not yet available in the PRISM master due to pending testing / polishing / benchmarking.

Thanks for remarking on the GUI's behaviour when the computation thread dies, we should make the handling of that error case more robust.

Cheers,
Joachim

@kleinj
Copy link
Member

kleinj commented Jul 18, 2018

Hi,

we just merged some changes to allow setting the Java stack size via e.g. -javastacksize 40m (similar to -javamaxmem) and improved the error handling when a stack overflow occurs (command-line version and GUI). Thanks again for reporting. A non-recursive version of Tarjan's algorithm should be forthcoming shortly as well.

Cheers,
Joachim

@ghost
Copy link
Author

ghost commented Jul 18, 2018

That's some good news, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant