Application-Specific Integrated Circuit (ASIC) implementation of a 16-bit SIMD processor
The Verilog source code used for this project was implemented by Tingyuan LIANG: https://github.com/zslwyuan/Basic-SIMD-Processor-Verilog-Tutorial
SIMD stands for Single Instruction, Multiple Data. It is a parallel computing architecture in which a single instruction is applied to multiple data elements simultaneously, enabling efficient processing of large datasets. The core of the processor is a 16-bit SIMD ALU with three basic computation units: a SIMD adder, a SIMD multiplier and a SIMD shifter. Each ALU operation takes two clock cycles: the first loads the operand values into the registers, and the second performs the operation. The processor supports 48 operations, encoded as 18-bit instructions with a 6-bit opcode.
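To make the "single instruction, multiple data" idea concrete, here is a minimal behavioural sketch in Python, not the Verilog from the linked repository. It assumes the 16-bit word can be split into equal lanes (for example 4 x 4-bit or 2 x 8-bit) and that the 6-bit opcode sits in the top bits of the 18-bit instruction; neither detail is stated above, and the function names are hypothetical.

```python
def simd_add(a: int, b: int, lane_bits: int, width: int = 16) -> int:
    """Add two `width`-bit words lane by lane.

    Assumption: lanes are equal-sized and carries never cross a lane
    boundary (each lane wraps around on overflow).
    """
    if width % lane_bits != 0:
        raise ValueError("lane_bits must divide width")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, width, lane_bits):
        lane_a = (a >> shift) & lane_mask
        lane_b = (b >> shift) & lane_mask
        lane_sum = (lane_a + lane_b) & lane_mask  # drop the carry out of the lane
        result |= lane_sum << shift
    return result


def split_opcode(instruction: int) -> tuple:
    """Split an 18-bit instruction into (opcode, remaining 12 bits).

    Assumption: the 6-bit opcode occupies bits 17..12.
    """
    instruction &= (1 << 18) - 1
    return instruction >> 12, instruction & 0xFFF
```

With a single 16-bit lane, `simd_add(0x00FF, 0x0001, 16)` carries into the upper byte, whereas with two 8-bit lanes the same operands wrap inside the low lane and leave the upper lane untouched.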
- src/simd.sdc : Declares the clock period as 16.5 ns
- init/pin_order.cfg : The pin_order file decides the placement of input-output pins at the periphery of the floorplan
- init/config.json : The configuration parameters are declared using the config.json file. These parameters help in exploring various designs. We observe a trade-off between power, area and delay at various stages of OpenLane execution depending on the parameters declared
Nine synthesis strategies are explored in the interactive mode using the run_synth_explore command. The strategies are compared on the basis of best area, gate count and delay. A strategy is chosen for synthesis depending on the defined cost function.
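The selection step can be sketched as a small weighted cost function. This is only an illustration of the idea: the strategy labels, the metric values and the weights below are hypothetical placeholders, not results from this design, and the actual cost function used for the project is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthResult:
    strategy: str      # label of the explored strategy (hypothetical names below)
    area_um2: float    # cell area reported after synthesis
    gate_count: int
    delay_ns: float


def pick_strategy(results, w_area=1.0, w_gates=1.0, w_delay=1.0):
    """Return the result with the lowest weighted cost.

    Each metric is normalised to the best (smallest) value seen for that
    metric, so the weights express relative importance, not units.
    """
    best_area = min(r.area_um2 for r in results)
    best_gates = min(r.gate_count for r in results)
    best_delay = min(r.delay_ns for r in results)

    def cost(r):
        return (w_area * r.area_um2 / best_area
                + w_gates * r.gate_count / best_gates
                + w_delay * r.delay_ns / best_delay)

    return min(results, key=cost)


# Hypothetical numbers, for illustration only.
explored = [
    SynthResult("AREA 0", 100_000.0, 9_000, 14.0),
    SynthResult("DELAY 0", 120_000.0, 10_500, 11.0),
    SynthResult("DELAY 1", 110_000.0, 10_000, 12.5),
]
```

Raising `w_delay` shifts the choice toward the faster but larger netlist, which mirrors the power, area and delay trade-off that the configuration parameters expose.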
- report/multi_corner_sta.power.rpt : The power report obtained after the signoff shows the power consumed for multiple corners
- report/multi_corner_sta.summary.rpt : The STA report gives the summary about the setup time and hold time slack
- gds/simd.gds : We obtain the GDSII file using KLayout
The layout can also be viewed using gui.py.
The above error can be fixed by increasing the die area, i.e., FP_DIE_AREA in the config.json file.
To fix the above error, reduce CTS_SINK_CLUSTERING_SIZE and CTS_SINK_CLUSTERING_MAX_DIAMETER to overcome the negative slack, or simply increase the clock period.
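For the clock-period route, a rough first-order estimate of the new period can be derived from the worst setup slack in the STA summary. The sketch below assumes path delays stay the same when the period changes, which is only approximately true after re-running the flow; the function names and the example slack value are hypothetical. Note that relaxing the period helps setup violations only, not hold violations.

```python
def min_period_for_setup(period_ns: float, worst_setup_slack_ns: float,
                         margin_ns: float = 0.0) -> float:
    """Estimate the smallest clock period that removes a setup violation.

    A negative slack of s ns means the critical path is s ns too slow,
    so the period must grow by at least |s| (plus an optional margin).
    """
    if worst_setup_slack_ns >= 0:
        return period_ns  # timing already met; no change needed
    return period_ns - worst_setup_slack_ns + margin_ns


def frequency_mhz(period_ns: float) -> float:
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns


# Example: the 16.5 ns period from src/simd.sdc with a hypothetical
# worst setup slack of -0.75 ns.
new_period = min_period_for_setup(16.5, -0.75)
```

The cost of this fix is a lower operating frequency, so it is usually worth trying the CTS parameters first and relaxing the period only if the violation persists.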
The minimum width for layer metal5 is 1.6um.
Decreasing PL_TARGET_DENSITY can reduce the routing congestion.