Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Issue: Synthesis Takes Too Long to Complete #4445

Closed
PerryLogic opened this issue Jun 11, 2024 · 9 comments
Closed

Performance Issue: Synthesis Takes Too Long to Complete #4445

PerryLogic opened this issue Jun 11, 2024 · 9 comments
Labels
pending-verification This issue is pending verification and/or reproduction

Comments

@PerryLogic
Copy link

Version

yosys 0.41+126

On which OS did this happen?

Linux

Reproduction Steps

Hello,
I encountered a performance issue while using Yosys to synthesize a Verilog file. The specific details are as follows:

I used the Verilog file design.v, and when running the Yosys synthesis, it has been over 30 minutes without completion. My system configuration is as follows: 128G RAM, and a high-performance CPU. However, the synthesis process seems to be stuck in an infinite loop or some kind of performance bottleneck.

To rule out issues with the file itself, I have checked the Verilog file for syntax errors but did not find any obvious problems. Additionally, I am using the latest version of Yosys.

Attached is the Verilog file (design.v) that triggers this issue. I hope to get the community's help and attention.

After this error, the parsing process is interrupted. I have checked the Verilog file for syntax errors but couldn't find any obvious issues. I am using the latest version of Yosys.

Attached is the Verilog file (design.v) that triggers the error.
Thank you in advance for your attention to this matter.
I look forward to hearing from you regarding this issue.
design_file.zip

Expected Behavior

Synthesis completes successfully within a reasonable time

Actual Behavior

Synthesis takes too long and does not complete

@PerryLogic PerryLogic added the pending-verification This issue is pending verification and/or reproduction label Jun 11, 2024
@whitequark
Copy link
Member

whitequark commented Jun 11, 2024

You should minimize the reproducer. It is a waste of everyone's time to file non-minimized fuzzer-discovered issues and personally I am leaning towards closing such issues outright.

@georgerennie
Copy link
Collaborator

georgerennie commented Jun 11, 2024

I would agree with the previous comment on minimization. If you haven't seen it before https://blog.regehr.org/archives/2037 provides good guidelines on reporting fuzzed bugs. Additionally it would have been helpful to include the synthesis script used. I have assumed it was just synth as that causes a hang for me too.

If you aren't familiar with it, Yosys includes a testcase minimizer in the form of the bugpoint command. Running
yosys -p "read_verilog design.v; bugpoint -command \"synth\" -runner \"timeout 1\" -fast; write_verilog bug.v" results in the following testcase after only a few minutes (I've tidied the one shown here with clean -purge; write_verilog -noattr bug.v)

/* Generated by Yosys 0.40+45 (git sha1 dd2195543, g++ 13.2.1 -fPIC -Os) */

module module11(_0_);
  output [3:0] _0_;
  wire [3:0] _0_;
  wire _1_;
  wire [3:0] _2_;
  wire [39:0] _3_;
  wire [39:0] _4_;
  assign _3_ = _1_ >= _2_;
  assign _0_ = $signed(4'hx) >> _4_;
  assign _4_ = _3_ - 40'h734e475132;
endmodule

Running read_verilog bug.v; synth on this seems to go very slow at

2.10. Executing PEEPOPT pass (run peephole optimizers).
shiftadd pattern in module11: shift=$shr$bug.v:11$2, add/sub=$sub$bug.v:12$3, offset: -1313296690

2.11. Executing OPT_CLEAN pass (remove unused cells and wires).
Finding unused cells or wires in module \module11..

You can recreate this by just running wreduce; peepopt. peepopt's shiftadd converts A>>(B+D) to (A'>>D)>>(B) where D is const and A' is a zero padded version of A. The issue seems to be that where D is -1313296690, A is padded with a 1313296691 bit constant that is all zeros, and Yosys is understandably slow to handle such a large constant.

It seems to me like although this is a case where performance isn't ideal, no one would ever have a shift of -1313296690 in a non-fuzzing scenario so its not really a big deal.

Edit: This seems to be about the same as #4056 and #4057 but for shiftadd instead of shiftmul

@whitequark
Copy link
Member

Thanks @georgerennie! I would have linked the same article if I had the URL on hand.

@PerryLogic
Copy link
Author

@whitequark @georgerennie Thank you for your response. I encountered this issue during a stress test, and I also tried it on several machines, but faced the same problem where logic synthesis could not be completed within a reasonable time. I have checked #4056 and #4057, which also reported extended synthesis times due to certain padding operations. Although I discovered this performance issue during a stress test, I believe this fault is noteworthy. Thank you.

@georgerennie
Copy link
Collaborator

So I had a look into it a little more, and there is actually a bug happening here, but it isn't what causes the poor performance.

Although the shift is nominally by 40'h734e475132, shiftadd stores the offset in an int. This shift value overflows the int representation, becoming a shift value it shouldn't.

The poor performance is still repeatable on a testcase like the following that doesn't overflow the int. Perhaps this optimization should be disabled for large shifts to gate against this, although again I don't believe this to be a particularly noteworthy issue for real designs.

module module11(_0_);
  output [3:0] _0_;
  wire [3:0] _0_;
  wire _1_;
  wire [3:0] _2_;
  wire [31:0] _3_;
  wire [31:0] _4_;
  assign _3_ = _1_ >= _2_;
  assign _0_ = $signed(4'hx) >> _4_;
  assign _4_ = _3_ - 32'h4e475132;
endmodule

@whitequark
Copy link
Member

In my view, Yosys should limit both the wire size and the shift size to some reasonable amount well under 4 billion. The Verilog specification explicitly allows this (provided it's above a minimum value), and it will take care of all similar issues, which otherwise will come at a small trickle indefinitely.

@georgerennie
Copy link
Collaborator

While I agree, it won't stop the int casting issue here (as we can't reasonably forbid constants being over 4 billion). Ideally RTLIL::Const could have a method like representable_as_int() to gate this. #4448 is a basic fix that just disables this optimization if the constant is too wide (I chose 24 bits for now) which prevents it being incorrectly cast to int and also puts an upper bound on the size of the circuit this optimization will generate.

@phsauter
Copy link
Contributor

In my view, Yosys should limit both the wire size and the shift size to some reasonable amount well under 4 billion. The Verilog specification explicitly allows this (provided it's above a minimum value), and it will take care of all similar issues, which otherwise will come at a small trickle indefinitely.

It is specified in Verilog-2005 4.3.1 if anyone wants to look it up.
The smallest allowed upper limit on the size of a vector (number of wires/bits) is 65536 ($2^{16}$).
So something like 24 bits seems reasonable to me.

@PerryLogic
Copy link
Author

@georgerennie Thank you so much for your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pending-verification This issue is pending verification and/or reproduction
Projects
None yet
Development

No branches or pull requests

4 participants