Commit 45d1eca

initial scaffold for tunable variables
Signed-off-by: Jack Luar <[email protected]>
1 parent ad1b24d commit 45d1eca

4 files changed

+33
-44
lines changed

docs/user/FlowVariables.md

+6-4
@@ -94,6 +94,7 @@ configuration file.
 | <a name="FILL_CELLS"></a>FILL_CELLS| Fill cells are used to fill empty sites. If not set or empty, fill cell insertion is skipped.| | |
 | <a name="FILL_CONFIG"></a>FILL_CONFIG| JSON rule file for metal fill during chip finishing.| | |
 | <a name="FLOORPLAN_DEF"></a>FLOORPLAN_DEF| Use the DEF file to initialize floorplan.| | |
+| <a name="GDS_ALLOW_EMPTY"></a>GDS_ALLOW_EMPTY| Regular expression of module names of macros that have no .gds file| | |
 | <a name="GDS_FILES"></a>GDS_FILES| Path to platform GDS files.| | |
 | <a name="GENERATE_ARTIFACTS_ON_FAILURE"></a>GENERATE_ARTIFACTS_ON_FAILURE| For instance Bazel needs artifacts (.odb and .rpt files) on a failure to allow the user to save hours on re-running the failed step locally, but when working with a Makefile flow, it is more natural to fail the step and leave the user to manually inspect the logs and artifacts directly via the file system. Set to 1 to change the behavior to generate artifacts upon failure to e.g. do a global route. The exit code will still be non-zero on all other failures that aren't covered by the "useful to inspect the artifacts on failure" use-case. Example: just like detailed routing, a global route that fails with congestion, is not a build failure(as in exit code non-zero), it is a successful(as in zero exit code) global route that produce reports detailing the problem. Detailed route will not proceed, if there is global routing congestion This allows build systems, such as bazel, to create artifacts for global and detailed route, even if the operation had problems, without having know about the semantics between global and detailed route. Considering that global and detailed route can run for a long time and use a lot of memory, this allows inspecting results on a laptop for a build that ran on a server.| 0| |
 | <a name="GLOBAL_PLACEMENT_ARGS"></a>GLOBAL_PLACEMENT_ARGS| Use additional tuning parameters during global placement other than default args defined in global_place.tcl.| | |
@@ -102,7 +103,7 @@ configuration file.
 | <a name="GPL_ROUTABILITY_DRIVEN"></a>GPL_ROUTABILITY_DRIVEN| Specifies whether the placer should use routability driven placement.| 1| |
 | <a name="GPL_TIMING_DRIVEN"></a>GPL_TIMING_DRIVEN| Specifies whether the placer should use timing driven placement.| 1| |
 | <a name="GUI_TIMING"></a>GUI_TIMING| Load timing information when opening GUI. For large designs, this can be quite time consuming. Useful to disable when investigating non-timing aspects like floorplan, placement, routing, etc.| 1| |
-| <a name="HOLD_SLACK_MARGIN"></a>HOLD_SLACK_MARGIN| Specifies a time margin for the slack when fixing hold violations. This option allows you to overfix or underfix(negative value, terminate retiming before 0 or positive slack). Use min of HOLD_SLACK_MARGIN and 0(default hold slack margin) in floorplan. This avoids overrepair in floorplan for hold by default, but allows skipping hold repair using a negative HOLD_SLACK_MARGIN. Exiting timing repair early is useful in exploration where the .sdc has a fixed clock period at designs target clock period and where HOLD/SETUP_SLACK_MARGIN is used to avoid overrepair(extremely long running times) when exploring different parameter settings.| 0| |
+| <a name="HOLD_SLACK_MARGIN"></a>HOLD_SLACK_MARGIN| Specifies a time margin for the slack when fixing hold violations. This option allows you to overfix or underfix(negative value, terminate retiming before 0 or positive slack). floorplan.tcl uses min of HOLD_SLACK_MARGIN and 0(default hold slack margin). This avoids overrepair in floorplan for hold by default, but allows skipping hold repair using a negative HOLD_SLACK_MARGIN. Exiting timing repair early is useful in exploration where the .sdc has a fixed clock period at the design's target clock period and where HOLD/SETUP_SLACK_MARGIN is used to avoid overrepair(extremely long running times) when exploring different parameter settings. When an ideal clock is used, that is before CTS, a clock insertion delay of 0 is used in timing paths. This creates a mismatch between macros that have a .lib file from after CTS, when the clock is propagated. To mitigate this, OpenSTA will subtract the clock insertion delay of macros when calculating timing with ideal clock. Provided that min_clock_tree_path and max_clock_tree_path are in the .lib file, which is the case for macros built with OpenROAD. This is less accurate than if OpenROAD had created a placeholder clock tree for timing estimation purposes prior to CTS. There will inevitably be inaccuracies in the timing calculation prior to CTS. Use a slack margin that is low enough, even negative, to avoid overrepair. Inaccuracies in the timing prior to CTS can also lead to underrepair, but there is no obvious and simple way to avoid underrepair in these cases. Overrepair can lead to excessive runtimes in repair or too much buffering being added, which can present itself as congestion of hold cells or buffer cells. Another use of SETUP/HOLD_SLACK_MARGIN is design parameter exploration when trying to find the minimum clock period for a design. The SDC_FILE for a design can be quite complicated and instead of modifying the clock period in the SDC_FILE, which can be non-trivial, the clock period can be fixed at the target frequency and the SETUP/HOLD_SLACK_MARGIN can be swept to find a plausible current minimum clock period.| 0| |
 | <a name="IO_CONSTRAINTS"></a>IO_CONSTRAINTS| File path to the IO constraints .tcl file.| | |
 | <a name="IO_PLACER_H"></a>IO_PLACER_H| The metal layer on which to place the I/O pins horizontally (top and bottom of the die).| | |
 | <a name="IO_PLACER_V"></a>IO_PLACER_V| The metal layer on which to place the I/O pins vertically (sides of the die).| | |
@@ -137,13 +138,13 @@ configuration file.
 | <a name="PWR_NETS_VOLTAGES"></a>PWR_NETS_VOLTAGES| Used for IR Drop calculation.| | |
 | <a name="RCX_RULES"></a>RCX_RULES| RC Extraction rules file path.| | |
 | <a name="RECOVER_POWER"></a>RECOVER_POWER| Specifies how many percent of paths with positive slacks can be slowed for power savings [0-100].| 0| |
-| <a name="REMOVE_ABC_BUFFERS"></a>REMOVE_ABC_BUFFERS| Remove abc buffers from the netlist. If timing repair in floorplanning is taking too long, use a SETUP_HOLD_MARGIN to terminate timing repair early instead of using REMOVE_ABC_BUFFERS or set SKIP_LAST_GAST=1.| | yes|
+| <a name="REMOVE_ABC_BUFFERS"></a>REMOVE_ABC_BUFFERS| Remove abc buffers from the netlist. If timing repair in floorplanning is taking too long, use a SETUP/HOLD_SLACK_MARGIN to terminate timing repair early instead of using REMOVE_ABC_BUFFERS or set SKIP_LAST_GASP=1.| | yes|
 | <a name="REMOVE_CELLS_FOR_EQY"></a>REMOVE_CELLS_FOR_EQY| String patterns directly passed to write_verilog -remove_cells <> for equivalence checks.| | |
 | <a name="REPAIR_PDN_VIA_LAYER"></a>REPAIR_PDN_VIA_LAYER| Remove power grid vias which generate DRC violations after detailed routing.| | |
 | <a name="REPORT_CLOCK_SKEW"></a>REPORT_CLOCK_SKEW| Report clock skew as part of reporting metrics, starting at CTS, before which there is no clock skew. This metric can be quite time-consuming, so it can be useful to disable.| 1| |
 | <a name="RESYNTH_AREA_RECOVER"></a>RESYNTH_AREA_RECOVER| Enable re-synthesis for area reclaim.| 0| |
 | <a name="RESYNTH_TIMING_RECOVER"></a>RESYNTH_TIMING_RECOVER| Enable re-synthesis for timing optimization.| 0| |
-| <a name="ROUTING_LAYER_ADJUSTMENT"></a>ROUTING_LAYER_ADJUSTMENT| Default routing layer adjustment| 0.5| |
+| <a name="ROUTING_LAYER_ADJUSTMENT"></a>ROUTING_LAYER_ADJUSTMENT| Adjusts routing layer capacities to manage congestion and improve detailed routing. High values ease detailed routing but risk excessive detours and long global routing times, while low values reduce global routing failure but can complicate detailed routing. The global routing running time normally reduces dramatically(entirely design specific, but going from hours to minutes has been observed) when the value is low(such as 0.10). Sometimes, global routing will succeed with lower values and fail with higher values. Exploring results with different values can help shed light on the problem. Start with a too low value, such as 0.10, and bisect to a value that works by doing multiple global routing runs. As a last resort, `make global_route_issue` and using the tools/OpenROAD/etc/deltaDebug.py can be useful to debug global routing errors. If there is something specific that is impossible to route, such as a clock line over a macro, global routing will terminate with DRC errors routes that could have been routed were it not for the specific impossible routes. deltaDebug.py should weed out the possible routes and leave a minimal failing case that pinpoints the problem.| 0.5| |
 | <a name="RTLMP_AREA_WT"></a>RTLMP_AREA_WT| Weight for the area of the current floorplan.| 0.1| |
 | <a name="RTLMP_ARGS"></a>RTLMP_ARGS| Overrides all other RTL macro placer arguments.| | |
 | <a name="RTLMP_BOUNDARY_WT"></a>RTLMP_BOUNDARY_WT| Weight for the boundary or how far the hard macro clusters are from boundaries.| 50.0| |
@@ -167,7 +168,7 @@ configuration file.
 | <a name="SDC_FILE"></a>SDC_FILE| The path to design constraint (SDC) file.| | |
 | <a name="SDC_GUT"></a>SDC_GUT| Load design and remove all internal logic before doing synthesis. This is useful when creating a mock .lef abstract that has a smaller area than the amount of logic would allow. bazel-orfs uses this to mock SRAMs, for instance.| | |
 | <a name="SEAL_GDS"></a>SEAL_GDS| Seal macro to place around the design.| | |
-| <a name="SETUP_SLACK_MARGIN"></a>SETUP_SLACK_MARGIN| Specifies a time margin for the slack when fixing setup violations. This option allows you to overfix or underfix(negative value, terminate retiming before 0 or positive slack).| 0| |
+| <a name="SETUP_SLACK_MARGIN"></a>SETUP_SLACK_MARGIN| Specifies a time margin for the slack when fixing setup violations. This option allows you to overfix or underfix(negative value, terminate retiming before 0 or positive slack). See HOLD_SLACK_MARGIN for more details.| 0| |
 | <a name="SET_RC_TCL"></a>SET_RC_TCL| Metal & Via RC definition file path.| | |
 | <a name="SKIP_CTS_REPAIR_TIMING"></a>SKIP_CTS_REPAIR_TIMING| Skipping CTS repair, which can take a long time, can be useful in architectural exploration or when getting CI up and running.| | |
 | <a name="SKIP_GATE_CLONING"></a>SKIP_GATE_CLONING| Do not use gate cloning transform to fix timing violations (default: use gate cloning).| | |
@@ -343,6 +344,7 @@ configuration file.
 ## final variables
 
 - [ADDITIONAL_GDS](#ADDITIONAL_GDS)
+- [GDS_ALLOW_EMPTY](#GDS_ALLOW_EMPTY)
 - [GND_NETS_VOLTAGES](#GND_NETS_VOLTAGES)
 - [MAX_ROUTING_LAYER](#MAX_ROUTING_LAYER)
 - [MIN_ROUTING_LAYER](#MIN_ROUTING_LAYER)
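
The new ROUTING_LAYER_ADJUSTMENT description suggests starting from a low value such as 0.10 and bisecting toward a value that works. The sketch below is one possible reading of that advice, not code from this commit. `route_ok` is a hypothetical callback standing in for "run global routing with this adjustment and report success". The bounds and tolerance are illustrative, and the search assumes success is monotonic between `low` and `high`, which the description itself warns is not always true.

```python
from typing import Callable


def bisect_adjustment(
    route_ok: Callable[[float], bool],  # hypothetical: True if routing succeeds
    low: float = 0.10,
    high: float = 0.50,
    tolerance: float = 0.02,
) -> float:
    """Return the largest adjustment found for which route_ok() is True.

    Assumes route_ok(low) is True and that the pass/fail boundary is
    monotonic between low and high; a real design may violate both.
    """
    if not route_ok(low):
        raise ValueError(f"routing already fails at adjustment {low}")
    if route_ok(high):
        return high
    # Invariant: route_ok(low) is True and route_ok(high) is False.
    while high - low > tolerance:
        mid = (low + high) / 2
        if route_ok(mid):
            low = mid
        else:
            high = mid
    return low


if __name__ == "__main__":
    # Stand-in predicate: pretend routing succeeds up to 0.33.
    print(bisect_adjustment(lambda value: value <= 0.33))
```

Each call to `route_ok` would correspond to one global routing run, so the number of runs grows only logarithmically with the width of the search interval.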

flow/scripts/variables.yaml

+8
@@ -104,13 +104,15 @@ CORE_UTILIZATION:
     The core utilization percentage (0-100).
   stages:
     - floorplan
+  tunable: 1
 CORE_AREA:
   description: >
     The core area specified as a list of lower-left and upper-right corners in
     microns
     (X1 Y1 X2 Y2).
   stages:
     - floorplan
+  tunable: 1
 REPORT_CLOCK_SKEW:
   description:
     Report clock skew as part of reporting metrics, starting at CTS,
@@ -344,6 +346,7 @@ CELL_PAD_IN_SITES_DETAIL_PLACEMENT:
     - cts
     - grt
   default: 0
+  tunable: 1
 PLACE_PINS_ARGS:
   description: >
     Arguments to place_pins
@@ -362,6 +365,7 @@ PLACE_DENSITY_LB_ADDON:
   description: >
     Check the lower boundary of the PLACE_DENSITY and add
     PLACE_DENSITY_LB_ADDON if it exists.
+  tunable: 1
 REPAIR_PDN_VIA_LAYER:
   description: >
     Remove power grid vias which generate DRC violations after detailed routing.
@@ -657,13 +661,15 @@ CORE_MARGIN:
     is undefined.
   stages:
     - floorplan
+  tunable: 1
 DIE_AREA:
   description: >
     The die area specified as a list of lower-left and upper-right corners in
     microns
     (X1 Y1 X2 Y2).
   stages:
     - floorplan
+  tunable: 1
 RESYNTH_AREA_RECOVER:
   description: >
     Enable re-synthesis for area reclaim.
@@ -702,12 +708,14 @@ CTS_CLUSTER_DIAMETER:
   default: 20
   stages:
     - cts
+  tunable: 1
 CTS_CLUSTER_SIZE:
   description: >
     Maximum number of sinks per cluster.
   default: 50
   stages:
     - cts
+  tunable: 1
 CTS_SNAPSHOT:
   description: >
     Creates ODB/SDC files prior to clock net and setup/hold repair.
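
The `tunable: 1` keys added in this file are plain metadata, so a consumer only needs to filter on them. As a minimal sketch, the dictionary below stands in for what a YAML loader would return for a few of these entries. It is abbreviated and illustrative, not the full file, and `tunable_names` is a hypothetical helper name.

```python
# Parsed form of a few variables.yaml entries (abbreviated, illustrative).
parsed = {
    "CORE_UTILIZATION": {"stages": ["floorplan"], "tunable": 1},
    "CTS_CLUSTER_SIZE": {"default": 50, "stages": ["cts"], "tunable": 1},
    "CTS_SNAPSHOT": {"description": "Creates ODB/SDC files ..."},
}


def tunable_names(variables):
    """Return the names of entries whose metadata sets tunable to 1."""
    return {
        name
        for name, meta in variables.items()
        if meta.get("tunable", 0) == 1  # a missing key means not tunable
    }


print(sorted(tunable_names(parsed)))
# prints ['CORE_UTILIZATION', 'CTS_CLUSTER_SIZE']
```

Defaulting a missing `tunable` key to 0 means that existing entries need no change to remain non-tunable.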

tools/AutoTuner/requirements.txt

+1
@@ -9,3 +9,4 @@ tensorboard>=2.14.0,<=2.16.2
 protobuf==3.20.3
 SQLAlchemy==1.4.17
 urllib3<=1.26.15
+pyyaml==6.0.1

tools/AutoTuner/src/autotuner/distributed.py

+18-40
@@ -32,6 +32,7 @@
 import glob
 import subprocess
 import random
+import yaml
 from datetime import datetime
 from multiprocessing import cpu_count
 from subprocess import run
@@ -360,42 +361,22 @@ def read_tune_pbt(name, this):
     return config, sdc_file, fr_file
 
 
-def parse_flow_variables():
+def parse_tunable_variables():
     """
-    Parse the flow variables from source
-    - Code: Makefile `vars` target output
-
+    Parse the tunable variables from variables.yaml
     TODO: Tests.
-
-    Output:
-    - flow_variables: set of flow variables
     """
     cur_path = os.path.dirname(os.path.realpath(__file__))
-
-    # first, generate vars.tcl
-    makefile_path = os.path.join(cur_path, "../../../../flow/")
-    initial_path = os.path.abspath(os.getcwd())
-    os.chdir(makefile_path)
-    result = subprocess.run(["make", "vars", f"PLATFORM={args.platform}"])
-    if result.returncode != 0:
-        print(f"[ERROR TUN-0018] Makefile failed with error code {result.returncode}.")
-        sys.exit(1)
-    if not os.path.exists("vars.tcl"):
-        print(f"[ERROR TUN-0019] Makefile did not generate vars.tcl.")
-        sys.exit(1)
-    os.chdir(initial_path)
-
-    # for code parsing, you need to parse from both scripts and vars.tcl file.
-    pattern = r"(?:::)?env\((.*?)\)"
-    files = glob.glob(os.path.join(cur_path, "../../../../flow/scripts/*.tcl"))
-    files.append(os.path.join(cur_path, "../../../../flow/vars.tcl"))
-    variables = set()
-    for file in files:
-        with open(file) as fp:
-            matches = re.findall(pattern, fp.read())
-        for match in matches:
-            for variable in match.split("\n"):
-                variables.add(variable.strip().upper())
+    vars_path = os.path.join(cur_path, "../../../../flow/scripts/variables.yaml")
+
+    # Read from variables.yaml and get variables with tunable = 1
+    with open(vars_path) as file:
+        try:
+            result = yaml.safe_load(file)
+        except yaml.YAMLError as exc:
+            print("[ERROR TUN-0018] Error parsing variables.yaml.")
+            sys.exit(1)
+    variables = {key for key, value in result.items() if value.get("tunable", 0) == 1}
     return variables
 
 
@@ -406,7 +387,7 @@ def parse_config(config, path=os.getcwd()):
     options = ""
     sdc = {}
     fast_route = {}
-    flow_variables = parse_flow_variables()
+    flow_variables = parse_tunable_variables()
     for key, value in config.items():
         # Keys that begin with underscore need special handling.
         if key.startswith("_"):
@@ -424,15 +405,12 @@ def parse_config(config, path=os.getcwd()):
                     "[WARNING TUN-0013] Non-flatten the designs are not "
                     "fully supported, ignoring _SYNTH_FLATTEN parameter."
                 )
-        # Default case is VAR=VALUE
         else:
-            # FIXME there is no robust way to get this metainformation from
-            # ORFS about the variables, so disable this code for now.
-
+            # Default case is VAR=VALUE
             # Sanity check: ignore all flow variables that are not tunable
-            # if key not in flow_variables:
-            # print(f"[ERROR TUN-0017] Variable {key} is not tunable.")
-            # sys.exit(1)
+            if key not in flow_variables:
+                print(f"[ERROR TUN-0017] Variable {key} is not tunable.")
+                sys.exit(1)
             options += f" {key}={value}"
     if bool(sdc):
         write_sdc(sdc, path)
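
The re-enabled sanity check can be exercised in isolation. The following is a standalone sketch of that check, not the function from `distributed.py`. `build_options` is a hypothetical name, the tunable set and the underscore key are illustrative, and a `ValueError` replaces `sys.exit(1)` so the behaviour is easy to test.

```python
from typing import Any, Dict, Set


def build_options(config: Dict[str, Any], tunable: Set[str]) -> str:
    """Build the ' VAR=VALUE' option string, rejecting non-tunable keys."""
    options = ""
    for key, value in config.items():
        if key.startswith("_"):
            # Underscore-prefixed keys get special handling in the real
            # parse_config; this sketch simply skips them.
            continue
        if key not in tunable:
            # Raising instead of calling sys.exit(1) keeps the sketch testable.
            raise ValueError(f"[ERROR TUN-0017] Variable {key} is not tunable.")
        options += f" {key}={value}"
    return options


if __name__ == "__main__":
    tunable = {"CORE_UTILIZATION", "CTS_CLUSTER_SIZE"}  # illustrative subset
    print(build_options({"CORE_UTILIZATION": 40, "_EXAMPLE_KEY": 5}, tunable))
```

With this check active, any key in a tuning configuration that lacks `tunable: 1` in `variables.yaml` stops the run instead of being passed through silently.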
