Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up preprocessing? #35

Open
spth opened this issue Nov 15, 2020 · 4 comments
Open

Speed up preprocessing? #35

spth opened this issue Nov 15, 2020 · 4 comments

Comments

@spth
Copy link
Member

spth commented Nov 15, 2020

For one of the largest function in the standard library that comes with SDCC, printf, treedec::comb::PP_FI_TM takes much longer to get a tree decomposition than Thorup.

According to valgrind, at -O1, 71 % of the time spent in treedec::comb::PP_FI_TM is just for the boost::clear_vertex call in line 293 of preprocessing.hpp (the call that has the // expensive? comment directly above it).

Would it be possible to improve speed here somehow?

@felix-salfelder
Copy link
Member

felix-salfelder commented Nov 16, 2020 via email

@llarisch
Copy link
Member

Can you please report on runtime for PP, PP_FI and PP_FI_TM separately such that we know how much time is spent in the subalgorithms? The complexity of clear_vertex depends on the edge set type chosen. What graph type are you using?

@spth
Copy link
Member Author

spth commented Nov 16, 2020

Can you please report on runtime for PP, PP_FI and PP_FI_TM separately such that we know how much time is spent in the subalgorithms? The complexity of clear_vertex depends on the edge set type chosen. What graph type are you using?

I can only reproduce this at -O1. So I do not know how much of a problem there is at -O2 vs. me just not being able to read the information at -O2 due to stuff not being visible in valgrind.

Relative (to treedec::comb::PP_FI_TM) runtime:
100% in treedec::comb::PP_FI_TM, distributed as follows:
22 % in treedec::glue_bag
77% in treedec::preprocessing, distributed as follows:
71% in boost::clear_vertex, distributed as follows:
30% in delete
17% in new

treedec::comb::PP_FI_TM) is called from three places, relative runtime:
21% redundancy elimination
40% address space switching
39% register allocation

In each case, a tree-decomposition of a cfg is computed. The cfg is first copied into a cfg2_t, which is then passed to treedec::comb::PP_FI_TM.

typedef boost::adjacency_list<boost::setS, boost::vecS, boost::undirectedS> cfg2_t;
cfg2_t cfg2;
copy_undir(cfg2, cfg);
treedec::comb::PP_FI_TM<cfg2_t> a2(cfg2);
a2.do_it();
a2.get_tree_decomposition(tree_dec2);

@felix-salfelder
Copy link
Member

71% in boost::clear_vertex, distributed as follows:
30% in delete
17% in new

This should improve with vector based adjacency list, as these are manipulated in-place.

typedef boost::adjacency_list<boost::setS, boost::vecS, boost::undirectedS> cfg2_t;
cfg2_t cfg2;
typedef boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS> cfg2_t;
cfg2v_t cfg2;
treedec::comb::PP_FI_TM<cfg2v_t> a2(cfg2);

Should do and make use of boost::erase_if. I will make sure this works and add a test as soon as I can.

Also, please check if you specified NDEBUG during compilation. Omitting it will result in a debug mode build which may be an order of magnitude slower.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants