rsz: Try multiple repairs per iteration #6262

Open · wants to merge 3 commits into master from rsz-multiple-repairs-per-iter

Conversation

@kbieganski (Contributor) commented Nov 28, 2024:

This makes RSZ try multiple fixes at once per iteration. If the slack worsens, it falls back to a single repair (the old behavior). This speeds up some designs significantly without noticeably slowing anything else down. The biggest gains I saw were on a design that I can't share, unfortunately.

It follows a heuristic where the number of repairs is proportional to the slack of the path currently being repaired, with a cap of 10 repairs. Probably not the best approach possible, but I tried multiple strategies and this one is the best so far.

Note: the code needs cleaning up, but I'll do that after I get some feedback.
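
To make the description concrete, here is a minimal sketch of the heuristic as described above; the names (`repairCount`, `kSlackToRepairs`, `kMaxRepairs`) are placeholders for illustration, not the actual PR code:

```cpp
#include <algorithm>
#include <cmath>

// Sketch only (placeholder names, not the PR code): the number of
// simultaneous repairs grows with how negative the path slack is,
// capped at a maximum of 10.
constexpr int kMaxRepairs = 10;
constexpr float kSlackToRepairs = 10e9f;  // scaling constant used in this PR

int repairCount(float path_slack)
{
  // path_slack is negative on violating paths, so -path_slack > 0.
  const int n = static_cast<int>(std::round(-path_slack * kSlackToRepairs));
  return std::clamp(n, 1, kMaxRepairs);
}

// Per iteration: apply repairCount(path_slack) transforms at once and
// re-time; if the slack got worse, undo and retry with a single repair
// (the old behavior).
```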

asap7/&lt;secret design 1&gt;, floorplan

| Branch | Time [s] | Relative | WS | TNS | Power |
| --- | --- | --- | --- | --- | --- |
| Optimized | 641.23 | 1.00 | 1.57 | 0.00 | 4.46 |
| Baseline | 1031.58 | 1.61 | 0.77 | -3.64 | 4.46 |

asap7/&lt;secret design 2&gt;, floorplan

| Branch | Time [s] | Relative | WS | TNS | Power |
| --- | --- | --- | --- | --- | --- |
| Optimized | 4882.00 | 1.00 | 0.71 | -12.16 | 45.86 |
| Baseline | 8534.00 | 1.75 | -22.62 | -18.86 | 45.86 |

asap7/aes, floorplan

| Branch | Time [s] | Relative | WS | TNS | Power |
| --- | --- | --- | --- | --- | --- |
| Optimized | 38.53 | 1.00 | -36.44 | -1088.10 | 0.11 |
| Baseline | 38.43 | 1.00 | -36.44 | -1088.10 | 0.11 |

asap7/aes, floorplan

| Branch | Time [s] | Relative | WS | TNS | Power |
| --- | --- | --- | --- | --- | --- |
| Optimized | 13.92 | 1.00 | 0.09 | -0.11 | 0.04 |
| Baseline | 14.16 | 1.02 | 0.09 | -0.04 | 0.04 |

asap7/aes, floorplan

| Branch | Time [s] | Relative | WS | TNS | Power |
| --- | --- | --- | --- | --- | --- |
| Optimized | 64.13 | 1.00 | -0.08 | -1.94 | 0.31 |
| Baseline | 70.62 | 1.10 | -0.07 | -1.89 | 0.31 |

asap7/aes, floorplan

| Branch | Time [s] | Relative | WS | TNS | Power |
| --- | --- | --- | --- | --- | --- |
| Optimized | 55.13 | 1.01 | 0.32 | 0.00 | 0.13 |
| Baseline | 54.34 | 1.00 | 0.32 | 0.00 | 0.13 |

asap7/aes, floorplan

| Branch | Time [s] | Relative | WS | TNS | Power |
| --- | --- | --- | --- | --- | --- |
| Optimized | 40.75 | 1.00 | 0.00 | 0.00 | 0.06 |
| Baseline | 40.97 | 1.01 | 0.00 | -0.00 | 0.06 |

@kbieganski force-pushed the rsz-multiple-repairs-per-iter branch from 390c298 to 4504aab on November 28, 2024 14:01
@github-actions bot left a comment:

clang-tidy made some suggestions

src/rsz/src/RepairSetup.cc (outdated, resolved)
@kbieganski marked this pull request as draft November 28, 2024 15:50

@kbieganski force-pushed the rsz-multiple-repairs-per-iter branch from 4504aab to 430d571 on December 4, 2024 10:00
@github-actions bot commented Dec 4, 2024:

clang-tidy review says "All clean, LGTM! 👍"

@kbieganski force-pushed the rsz-multiple-repairs-per-iter branch from 430d571 to 439c31b on December 4, 2024 13:48

@github-actions bot commented Dec 4, 2024:

clang-tidy review says "All clean, LGTM! 👍"

@kbieganski force-pushed the rsz-multiple-repairs-per-iter branch from 439c31b to e74e2f7 on December 4, 2024 21:40

@github-actions bot commented Dec 4, 2024:

clang-tidy review says "All clean, LGTM! 👍"

@kbieganski marked this pull request as ready for review December 5, 2024 09:53

@povik (Contributor) commented Dec 11, 2024:

@kbieganski If you are interested in repair_timing performance, see parallaxsw/OpenSTA#140

@QuantamHD (Collaborator):

@maliberty ready for review again

@maliberty (Member):

@precisionmoon please review

@precisionmoon (Collaborator):

> @precisionmoon please review

Ok, let me review the changes. @kbieganski, were you able to run ORFS tests? I see some small degradations in unit tests, so I'm curious what you see in other public-PDK designs beyond asap7.

```cpp
@@ -309,6 +309,15 @@ bool Resizer::bufferBetweenPorts(Instance* buffer)
bool Resizer::removeBuffer(Instance* buffer,
                           bool honorDontTouchFixed,
                           bool recordJournal)
{
  if (canRemoveBuffer(buffer, honorDontTouchFixed)) {
    removeBuffer2(buffer, recordJournal);
```
Collaborator:

It'd be good to rename removeBuffer2 to something like doRemoveBuffer or removeBufferCore.

@kbieganski (Contributor, Author):

Yes, I plan on doing that; I wanted to get some general feedback first.

@kbieganski (Contributor, Author):

> @kbieganski, were you able to run ORFS tests?

I also tested nangate45; I believe there was no significant difference. I'll dig those results up or rerun the tests and add them here.

@maliberty (Member):

I merged master into a copy of this branch and started a secure CI.

Comment on lines 480 to 420:

```
worst slack -3.94
worst slack -4.44
```

@maliberty (Member):

Non-trivial degradation.

Comment on lines 392 to 393:

```cpp
in_db_net->setDoNotTouch(false);
out_db_net->setDoNotTouch(false);
```

@maliberty (Member):

These should be restored when done so as not to lose user intent later in the flow.
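
One way to do that restoration is a save-and-restore guard. A sketch, assuming `odb::dbNet` pairs the `setDoNotTouch()` setter seen in the diff with an `isDoNotTouch()` getter (the getter name is an assumption):

```cpp
#include "odb/db.h"

// Sketch: RAII guard that clears dont_touch for the duration of the
// buffer removal and restores the user's original setting afterwards.
class DontTouchGuard
{
 public:
  explicit DontTouchGuard(odb::dbNet* net)
      : net_(net), saved_(net->isDoNotTouch())  // assumed getter
  {
    net_->setDoNotTouch(false);
  }
  ~DontTouchGuard() { net_->setDoNotTouch(saved_); }

 private:
  odb::dbNet* net_;
  bool saved_;
};

// Usage: DontTouchGuard in_guard(in_db_net), out_guard(out_db_net);
```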

```cpp
@@ -537,7 +539,19 @@ bool RepairSetup::repairPath(PathRef& path,
       || (pair1.second == pair2.second && pair1.first > pair2.first);
     });
  // Attack gates with largest load delays first.
  int repairs_per_iter = std::round(std::max(-path_slack * 10e9f, 1.0f));
```

@maliberty (Member):

Is there an assumption that slack is in ns? This depends on the Liberty units and may vary by PDK.

@kbieganski (Contributor, Author):

It's just a constant that worked pretty well. The final version was going to either make this factor configurable via a parameter, or have the number of repairs always be the maximum (10 here) for the first endpoint and then decrease linearly to 1 for the final endpoints.
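
The linear-decay alternative mentioned above would also sidestep the units question, since it never multiplies slack by a constant. A sketch, with hypothetical parameters `endpoint_index` and `num_endpoints`:

```cpp
#include <algorithm>
#include <cmath>

// Sketch: max_repairs for the first (worst) endpoint, decaying
// linearly to exactly 1 for the last endpoint.
int repairsForEndpoint(int endpoint_index, int num_endpoints, int max_repairs)
{
  if (num_endpoints <= 1) {
    return max_repairs;
  }
  const float t = static_cast<float>(endpoint_index) / (num_endpoints - 1);
  // Linear interpolation from max_repairs down to 1.
  return std::max(
      1, static_cast<int>(std::round(max_repairs * (1.0f - t) + t)));
}
```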

@maliberty (Member):

In the public CI:

```
------------------------------
       Failing designs
------------------------------
asap7 mock-alu (base)
  Flow reached last stage.
  Found 1 metrics failures.
      [ERROR] finish__timing__setup__ws fail test: -549.646 >= -537.65

gf180 aes-hybrid (base)
  Flow reached last stage.
  Found 1 metrics failures.
      [ERROR] finish__timing__setup__ws fail test: -1.56671 >= -1.38

gf180 aes (base)
  Flow reached last stage.
  Found 1 metrics failures.
      [ERROR] finish__timing__setup__ws fail test: -1.28034 >= -1.17

gf180 ibex (base)
  Flow reached last stage.
  Found 1 metrics failures.
      [ERROR] finish__timing__drv__setup_violation_count fail test: 920.0 <= 794.0

nangate45 bp_fe (base)
  Flow reached last stage.
  Found 2 metrics failures.
      [ERROR] finish__timing__setup__ws fail test: -0.868694 >= -0.68
      [ERROR] finish__timing__wns_percent_delay fail test: -37.627236 >= -35.7

nangate45 swerv_wrapper (base)
  Last log file 3_3_place_gp.log
  Found 1 errors in the logs.
      [ERROR GPL-0307] RePlAce divergence detected. Re-run with a smaller max_phi_cof value.

sky130hd gcd (base)
  Flow reached last stage.
  Found 1 metrics failures.
      [ERROR] finish__timing__drv__setup_violation_count fail test: 43.0 <= 40.0

sky130hd microwatt (base)
  Last log file 5_1_grt.log
  Found 1 errors in the logs.
      [ERROR GRT-0116] Global routing finished with congestion. Check the congestion regions in the DRC Viewer.
```

The non-metrics failures should be investigated first.

@precisionmoon (Collaborator):

This is an interesting approach. Besides the CI test failures from @maliberty that need review, I have some additional questions:

1. This approach tries multiple transforms at once. Does it lead to over-fixing for one driver? For example, buffering, sizing, pin swapping, and cloning can all happen on the same driver when sizing alone may have been sufficient.
2. Can this lead to under-fixing for the overall design? This approach may do aggressive fixing for a few drivers and get stuck in a local minimum early. The TNS degradation in repair_fanout7 from -294.3 to -412.3 may be due to this. Also, delayed buffer removal may make its impact harder to estimate.
3. Can we measure QoR at the end of the flow instead of at floorplanning? Floorplan results can be misleading because they don't reflect the true QoR at the end.

@kbieganski (Contributor, Author):

> This approach tries multiple transforms at once. Does it lead to over-fixing for one driver?

Do you mean one driver or one path? The idea is that you need multiple repairs for the worst path of the worst endpoint anyway, and the number of repairs then goes down as we get to paths with better slack.

> Can we measure QoR at the end of the flow instead of at floorplanning?

I'll try to get some measurements.

@kbieganski (Contributor, Author):

@maliberty Where can I see those failures? Or do you mean the private CI? I can't see it in Jenkins.

@maliberty (Member):

That particular pipeline is private, but the test cases are all in ORFS, so you can run them yourself.

@precisionmoon (Collaborator):

> > This approach tries multiple transforms at once. Does it lead to over-fixing for one driver?
>
> Do you mean one driver or one path? The idea is that you need multiple repairs for the worst path of the worst endpoint anyway, and the number of repairs then goes down as we get to paths with better slack.

One driver. Currently, only one transform is tried per driver, but you will allow multiple ones.

> > Can we measure QoR at the end of the flow instead of at floorplanning?
>
> I'll try to get some measurements.

Thank you.

@kbieganski (Contributor, Author) commented Dec 19, 2024:

> One driver. Currently, only one transform is tried per driver, but you will allow multiple ones.

I think I'm confused by the terminology. In the code, we iterate over the driver vertices in a given path. Only one transform is allowed per vertex, because if it's successful, the loop goes to the next iteration immediately.
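
Roughly, the control flow being described looks like this (a sketch with placeholder types and names, not the PR code):

```cpp
#include <functional>
#include <vector>

struct Vertex {};  // stand-in for the STA driver vertex type

// Each driver vertex on the path gets at most one transform per pass;
// as soon as one succeeds, we move on to the next vertex.
void repairPass(const std::vector<Vertex*>& drivers_on_path,
                const std::vector<std::function<bool(Vertex*)>>& transforms)
{
  for (Vertex* drvr : drivers_on_path) {
    for (const auto& transform : transforms) {
      if (transform(drvr)) {
        break;  // one successful transform per vertex
      }
    }
  }
}
```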

Full-flow results for asap7/ibex:

| branch | power_internal | power_switching | power_leakage | power_total | total_neg_slack | worst_slack |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.0264367 | 0.0280711 | 2.07997e-06 | 0.0545099 | -681.398 | -62.5979 |
| This PR | 0.0264367 | 0.0280711 | 2.07997e-06 | 0.0545099 | -681.398 | -62.5979 |

For nangate45/ibex:

| branch | power_internal | power_switching | power_leakage | power_total | total_neg_slack | worst_slack |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.0530963 | 0.0498528 | 0.000808576 | 0.103758 | -74.3429 | -0.278099 |
| This PR | 0.052714 | 0.0488411 | 0.000799717 | 0.102355 | -44.9603 | -0.184015 |

asap7/aes:

| branch | power_internal | power_switching | power_switching | power_total | total_neg_slack | worst_slack | time |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.060006 | 0.0897325 | 0.0897325 | 0.14974 | -3135.11 | -44.9717 | 96.3463 |
| This PR | 0.060006 | 0.0897325 | 0.0897325 | 0.14974 | -3135.11 | -44.9717 | 95.4654 |

nangate45/aes:

| branch | power_internal | power_switching | power_switching | power_total | total_neg_slack | worst_slack | time |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.24709 | 0.25256 | 0.25256 | 0.500457 | -9.28 | -0.131091 | 186.176 |
| This PR | 0.249503 | 0.254516 | 0.254516 | 0.50483 | -7.16125 | -0.119882 | 171.022 |

Also, I think this is a mode that should only be enabled at floorplanning, as it doesn't seem to speed anything up later on.

@kbieganski (Contributor, Author):

@maliberty Is there a particular way I should run these? If I run, e.g.,

```sh
make DESIGN_CONFIG=designs/sky130hd/microwatt/config.mk
```

congestion is OK:

```
[INFO GRT-0096] Final congestion report:
Layer         Resource        Demand        Usage (%)    Max H / Max V / Total Overflow
---------------------------------------------------------------------------------------
li1                  0             0            0.00%             0 /  0 /  0
met1           1537989        645858           41.99%             0 /  0 /  0
met2           1579174        696075           44.08%             0 /  0 /  0
met3           1044188        197190           18.88%             0 /  0 /  0
met4            475065        149522           31.47%             0 /  0 /  0
met5            137620          5504            4.00%             0 /  0 /  0
---------------------------------------------------------------------------------------
Total          4774036       1694149           35.49%             0 /  0 /  0
```

(unless I'm misinterpreting it)
And the SweRV one is successful as well.

@maliberty (Member):

You want the metadata target to check against the limits. Usually I use `make finish metadata`.
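
Putting both comments together, a run for the microwatt case quoted above might look like this (a sketch composed from the commands quoted in this thread; the exact ORFS invocation may differ):

```sh
# Run the design through the flow, then check metrics against the limits.
make DESIGN_CONFIG=designs/sky130hd/microwatt/config.mk
make DESIGN_CONFIG=designs/sky130hd/microwatt/config.mk finish metadata
```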

@kbieganski force-pushed the rsz-multiple-repairs-per-iter branch from e74e2f7 to 53bcae0 on January 3, 2025 22:58