ESYNet User Manual : Router Module
ESYNet provides `EsynetRouter` to model a traditional 5-stage pipeline router.
The data path of a router can be described as input channels and output channels connected by a crossbar. Flits can be forwarded from any input port to any output port. The number of input ports equals the number of output ports, and it is specified by the option `-phy_number` or by a network configuration file.
Each input port has several virtual channels. The number of virtual channels per physical channel can be specified by the option `-vc_number` or by a network configuration file. By default, each physical channel has one virtual channel. In some contexts, this default configuration is also called "no virtual channel".
Each input virtual channel holds an input buffer. The size of the input buffer can be specified by the option `-in_buffer_size` or by a network configuration file.
Output ports do not have virtual channels. An output port can also have a buffer, whose size is specified by `-out_buffer_size` or by a network configuration file. However, the output buffers are currently unused because of the switching method and the flow control method. Output ports also hold the status of the connected link.
Each router has one crossbar to deliver packets between input ports and output ports. The number of crossbar inputs equals either the number of physical channels or the total number of input virtual channels, as selected by the option `-crossbar`. If the number of crossbar inputs equals the number of physical channels, only one virtual channel from each physical channel can occupy the crossbar. Otherwise, all virtual channels from all physical channels can compete for the crossbar.
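The two crossbar sizes can be illustrated with a small sketch. The names `CrossbarMode` and `crossbarInputs` are hypothetical and are not taken from the ESYNet sources; only the two sizing rules come from the text above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; not part of the ESYNet API.
enum class CrossbarMode { PerPhysicalPort, PerVirtualChannel };

// Number of crossbar inputs for a router with `phy_number` physical ports
// and `vc_number` virtual channels per physical port.
std::size_t crossbarInputs(CrossbarMode mode, std::size_t phy_number,
                           std::size_t vc_number) {
    assert(phy_number > 0 && vc_number > 0);
    return mode == CrossbarMode::PerPhysicalPort ? phy_number
                                                 : phy_number * vc_number;
}
```

For a router with 5 physical ports and 2 virtual channels per port, the first mode gives 5 crossbar inputs and the second gives 10.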
The control path is also called the pipeline in many contexts. The router pipeline covers the procedure that takes one packet from the input buffer to an output port. `EsynetRouter` describes a traditional 5-stage pipeline router. A flit is pushed into the input buffer when the router receives it. The pipeline starts when the head flit arrives at the top of the input buffer. Then the packet goes through 5 steps: routing calculation (RC), virtual channel allocation (VA), switch arbitration (SA), switch traversal (ST), and link traversal (LT). After that, the packet has been sent out. Each step is one stage of the pipeline. With wormhole switching, the head flit goes through all 5 steps, while body and tail flits only go through the SA, ST, and LT steps.
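As a minimal sketch of which stages each flit type visits under wormhole switching, the following uses hypothetical enum and function names (`Stage`, `FlitType`, `stagesFor`) that do not appear in ESYNet.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types for illustration; not part of the ESYNet API.
enum class Stage { RC, VA, SA, ST, LT };
enum class FlitType { Head, Body, Tail };

// Pipeline stages a flit passes through under wormhole switching.
std::vector<Stage> stagesFor(FlitType type) {
    if (type == FlitType::Head) {
        // The head flit sets up the path, so it needs all five stages.
        return {Stage::RC, Stage::VA, Stage::SA, Stage::ST, Stage::LT};
    }
    // Body and tail flits reuse the route and virtual channel of the head.
    return {Stage::SA, Stage::ST, Stage::LT};
}
```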
Routing calculation (RC) calculates the forward direction of a packet, which is represented by the next-stage router and the virtual channel. One packet can request more than one direction. For example, one packet can request two different next-stage routers, or two different virtual channels of the same next-stage router.
Virtual channel allocation (VA) assigns each packet one virtual channel in the next-stage router. To understand VA clearly and to distinguish it from SA, keep in mind that VA allocates the virtual channels in the next-stage router. The arbitration procedure has two steps. In the first step, each virtual channel in the next-stage router gives a grant to one packet to show which packet can be assigned. In the second step, the packet selects one grant if it receives multiple grants. If a packet requests only one forward direction in RC, the second step is not needed.
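The two VA steps can be sketched as follows. This is only an illustration under stated assumptions: a fixed lowest-index-first choice stands in for the configurable arbiters, and the function name `twoStepVcAllocation` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// wanted[p] lists the next-stage virtual channels requested by packet p.
// Returns assigned[p]: the virtual channel given to packet p, or -1.
// Assumption: "lowest index wins" replaces the real arbiters in both steps.
std::vector<int> twoStepVcAllocation(
    const std::vector<std::vector<int>>& wanted, std::size_t vc_count) {
    // Step 1: every next-stage virtual channel grants one requesting packet.
    std::vector<int> grant_to(vc_count, -1);
    for (std::size_t p = 0; p < wanted.size(); ++p) {
        for (int vc : wanted[p]) {
            assert(vc >= 0 && static_cast<std::size_t>(vc) < vc_count);
            if (grant_to[vc] == -1) grant_to[vc] = static_cast<int>(p);
        }
    }
    // Step 2: a packet holding several grants keeps exactly one of them.
    std::vector<int> assigned(wanted.size(), -1);
    for (std::size_t vc = 0; vc < vc_count; ++vc) {
        const int p = grant_to[vc];
        if (p >= 0 && assigned[p] == -1) assigned[p] = static_cast<int>(vc);
    }
    return assigned;
}
```

With `wanted = {{0, 1}, {1}, {0}}`, packet 0 receives grants from both virtual channels in step 1 and keeps virtual channel 0 in step 2, so packets 1 and 2 must retry later.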
Switch arbitration (SA) selects the flits that occupy the crossbar and the output port. If the size of the crossbar equals the number of physical ports, SA needs two steps. The first step selects one virtual channel from all assigned virtual channels of one port. The second step selects one input port from all ports requesting the same next-stage router. If the size of the crossbar equals the total number of virtual channels, SA needs only one step, which selects one virtual channel from the virtual channels requesting the same next-stage router.
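A minimal sketch of the two-step SA for a crossbar sized by physical ports is shown below. As an assumption for brevity, a fixed lowest-index-first choice replaces the real arbiters, and the names `Requests` and `twoStepSwitchArbitration` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// request[p][v] is the output port wanted by virtual channel v of input
// port p, or -1 when that virtual channel has no flit ready.
using Requests = std::vector<std::vector<int>>;

// Returns winner[o]: the input port that may use output o, or -1.
std::vector<int> twoStepSwitchArbitration(const Requests& request,
                                          std::size_t outputs) {
    // Step 1: each input port selects one of its requesting virtual channels.
    std::vector<int> wanted(request.size(), -1);
    for (std::size_t p = 0; p < request.size(); ++p) {
        for (int out : request[p]) {
            if (out >= 0) { wanted[p] = out; break; }
        }
    }
    // Step 2: each output selects one of the input ports that want it.
    std::vector<int> winner(outputs, -1);
    for (std::size_t p = 0; p < wanted.size(); ++p) {
        const int out = wanted[p];
        if (out < 0) continue;
        assert(static_cast<std::size_t>(out) < outputs);
        if (winner[out] == -1) winner[out] = static_cast<int>(p);
    }
    return winner;
}
```

In this sketch, a virtual channel that loses step 1 cannot compete in step 2 even if its output is idle, which is the cost of the smaller crossbar.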
The flit that holds the grant from SA is delivered through the crossbar during the switch traversal (ST) stage. In the link traversal (LT) stage, the flit in the output port is delivered to the output link.
Please see for the details of the implementation.
This section introduces the hardware structures that correspond to the variables in `EsynetRouter`.
`EsynetRouter::m_input_port` represents the input ports of the router and is a vector of `EsynetInputPort`. Each item in the vector corresponds to one physical port. `EsynetInputPort` is itself a vector of virtual channels, `EsynetInputVirtualChannel`. Each virtual channel has its own input buffer, `EsynetInputVirtualChannel::m_input_buffer`.
Each physical port `EsynetInputPort` stores the connected port in the neighbor router. Each virtual channel stores the status of the pipeline, the forward direction requested by the routing algorithm, and the forward direction granted by the virtual channel allocation.
`EsynetRouter::m_output_port` represents the output ports of the router and is a vector of `EsynetOutputPort`. Each item in the vector corresponds to one physical port. Each physical port has one buffer. `EsynetOutputPort` is itself a vector of virtual channels, `EsynetOutputVirtualChannel`.
Each physical port `EsynetOutputPort` stores the connected port in the neighbor router and the status of the connected link. Each virtual channel stores the virtual channel assigned by the virtual channel allocation and the credit value.
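The nesting of ports and virtual channels described above can be pictured with a simplified sketch. All type and field names below are hypothetical stand-ins, flits are reduced to integer ids, and the initial credit equal to the input buffer size is assumed from the credit-based flow control; the real classes hold more state.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical, simplified stand-ins for the ESYNet classes.
struct InputVirtualChannel {
    std::deque<int> input_buffer;  // flits reduced to integer ids
    int granted_output_vc = -1;    // result of virtual channel allocation
};
struct InputPort { std::vector<InputVirtualChannel> vcs; };

struct OutputVirtualChannel { int credit = 0; };
struct OutputPort { std::vector<OutputVirtualChannel> vcs; };

struct RouterSketch {
    std::vector<InputPort> input_port;    // one item per physical port
    std::vector<OutputPort> output_port;  // one item per physical port

    RouterSketch(std::size_t phy_number, std::size_t vc_number,
                 int in_buffer_size)
        : input_port(phy_number), output_port(phy_number) {
        assert(in_buffer_size >= 0);
        for (auto& port : input_port) port.vcs.resize(vc_number);
        for (auto& port : output_port) {
            port.vcs.resize(vc_number);
            // Assumption: credit starts at the neighbor's input buffer size.
            for (auto& vc : port.vcs) vc.credit = in_buffer_size;
        }
    }
};
```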
`EsynetRouter` provides four groups of arbiters used in the VA or SA stage. `EsynetRouter::m_vc_input_arbiter` and `EsynetRouter::m_vc_output_arbiter` are provided for each virtual channel, while `EsynetRouter::m_port_input_arbiter` and `EsynetRouter::m_port_output_arbiter` are provided for each physical channel.
`EsynetRouter::m_vc_input_arbiter` is used in the first step of VA, and `EsynetRouter::m_vc_output_arbiter` is used in the second step of VA. `EsynetRouter::m_port_input_arbiter` is used in the first step of SA, and `EsynetRouter::m_port_output_arbiter` is used in the second step of SA.
`EsynetRouter` provides only one function pointer (`EsynetRouter::m_curr_algorithm`) to the routing algorithm. Thus, the routing algorithm must handle all input ports. `EsynetRouter::m_routing_table` stores the lookup table for table-based routing algorithms.
`EsynetRouter` also provides a power module (`EsynetRouter::m_power_unit`) and a statistics unit (`EsynetRouter::m_statistics_unit`).
Arbiters are implemented in the routers for virtual channel allocation and switch arbitration. There are several popular arbiters. See "Principles and Practices of Interconnection Networks" for a detailed introduction. ESYNet provides `EsynetArbiter` to implement an arbiter. Each instance of `EsynetArbiter` corresponds to one arbiter.
`EsynetArbiter` provides three kinds of arbiters: random arbiter, Round-Robin arbiter, and matrix arbiter (ARBITER_MATRIX). The type of arbiter is selected by the option `-arbiter`. The arbitration algorithm and the size of the arbiter are set by the constructor.
The arbiter makes the grant according to the requests and the status of the arbiter. The status of the arbiter is stored in `EsynetArbiter`, and the requests are set by the `EsynetArbiter::setRequest` function. `EsynetArbiter::setRequest` can either set the requests from a vector of boolean signals `reg` or set the request of the single signal specified by `a`. `EsynetArbiter` returns the index of the granted signal through the `EsynetArbiter::grant` function.
The grant function also updates the status of the arbiter. Even with the same requests, two consecutive calls of the grant function can return different results. From the point of view of an arbiter, it only sees request signals. From the point of view of the router, however, the arbiter decides among virtual channels or physical channels. Thus, `EsynetArbiter` provides the ability to map virtual channels to request signals. Using this map, `EsynetArbiter` can set the request of one virtual channel and can directly return the virtual channel that gets the grant.
`EsynetRouter` provides four groups of arbiters. `EsynetRouter::m_vc_input_arbiter` and `EsynetRouter::m_vc_output_arbiter` are used in virtual channel allocation.
- `EsynetRouter::m_vc_input_arbiter` selects the grant among the virtual channels that request the same virtual channel in the neighbor router. The requests are provided by the routing algorithm.
- `EsynetRouter::m_vc_output_arbiter` selects the virtual channel in the neighbor router that is assigned to the requesting virtual channel. The requests are provided by the results of `EsynetRouter::m_vc_input_arbiter`. If a packet can be forwarded in multiple directions (different virtual channels or physical channels), it may get multiple grants from virtual channels in the neighbor routers. Thus, it is necessary to select one of these grants.
`EsynetRouter::m_port_input_arbiter` and `EsynetRouter::m_port_output_arbiter` are used in switch arbitration.
- `EsynetRouter::m_port_input_arbiter` selects which input virtual channel can be delivered through the crossbar.
- `EsynetRouter::m_port_output_arbiter` selects which physical channel can use the link.
The random arbiter chooses one request randomly. The choice is made by the random generator `EsynetSRGenFlatLong`. The random arbiter does not store any status. A grant clears all request signals. If there is no valid request signal, the grant function returns `-1`.
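The behavior of the random arbiter can be sketched as follows. This is not the ESYNet implementation: `std::mt19937` stands in for `EsynetSRGenFlatLong`, and the class name `RandomArbiterSketch` and its single-index `setRequest` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch of a random arbiter; std::mt19937 replaces the
// ESYNet random generator.
class RandomArbiterSketch {
public:
    RandomArbiterSketch(std::size_t size, unsigned seed)
        : request_(size, false), rng_(seed) {}

    void setRequest(std::size_t i) {
        assert(i < request_.size());
        request_[i] = true;
    }

    // Picks one valid request uniformly at random and clears all requests.
    // Returns -1 when no request is set.
    int grant() {
        std::vector<int> valid;
        for (std::size_t i = 0; i < request_.size(); ++i) {
            if (request_[i]) valid.push_back(static_cast<int>(i));
        }
        request_.assign(request_.size(), false);
        if (valid.empty()) return -1;
        std::uniform_int_distribution<std::size_t> pick(0, valid.size() - 1);
        return valid[pick(rng_)];
    }

private:
    std::vector<bool> request_;
    std::mt19937 rng_;
};
```

Because the arbiter keeps no state besides the pending requests, a second call to `grant()` without new requests returns `-1`.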
The Round-Robin arbiter checks the request signals in a fixed circular order until it finds a valid request signal. The next time, the search starts from the signal granted last time. In this way, every request signal gets its turn at the highest priority. The Round-Robin arbiter is less fair than the matrix arbiter, but its fairness is still reasonably acceptable. Moreover, the Round-Robin arbiter is simpler.
`EsynetArbiter` stores the last granted signal in `EsynetArbiter::m_state_vector`, which is a 1-D vector of booleans. The item for the last granted signal is set to True.
The grant function checks the request signals starting from the last granted signal stored in `EsynetArbiter::m_state_vector`. The first valid request signal becomes the granted signal. If there is no valid request signal, the grant function returns `-1`.
`EsynetArbiter` provides the ability to assign the highest or lowest priority to a port by changing the value of the D-FF. If a signal is given the highest priority, the search in the next arbitration starts from that signal; if a signal is given the lowest priority, the search in the next arbitration starts from the signal after it. Note that the assigned priority is only valid for the next arbitration, because the granted request signal must be assigned the lowest priority after arbitration. Thus, a user-defined priority must be set again after each arbitration.
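A minimal Round-Robin sketch under stated assumptions follows: the state is a one-hot vector marking the last granted (lowest-priority) signal, the search starts at the signal after it, and a grant clears all requests. The class and method names are hypothetical and do not reproduce the `EsynetArbiter` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Round-Robin arbiter sketch; not the ESYNet implementation.
class RoundRobinArbiterSketch {
public:
    explicit RoundRobinArbiterSketch(std::size_t size)
        : request_(size, false), state_(size, false) {
        assert(size > 0);
        state_[size - 1] = true;  // the first search then starts at index 0
    }

    void setRequest(std::size_t i) { request_[i] = true; }

    // The next search starts right after i, so i is tried last.
    void setLowestPriority(std::size_t i) {
        state_.assign(state_.size(), false);
        state_[i] = true;
    }

    // The next search starts at i itself.
    void setHighestPriority(std::size_t i) {
        setLowestPriority((i + state_.size() - 1) % state_.size());
    }

    // Returns the granted index, or -1 when no request is set.
    int grant() {
        const std::size_t n = state_.size();
        std::size_t last = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (state_[i]) last = i;
        }
        int winner = -1;
        for (std::size_t step = 1; step <= n; ++step) {
            const std::size_t i = (last + step) % n;
            if (request_[i]) {
                winner = static_cast<int>(i);
                setLowestPriority(i);  // the winner drops to lowest priority
                break;
            }
        }
        request_.assign(n, false);  // assumption: a grant clears requests
        return winner;
    }

private:
    std::vector<bool> request_;
    std::vector<bool> state_;
};
```

With signals 0 and 2 requesting on every cycle, successive grants alternate between 0 and 2. A call to `setHighestPriority(3)` affects only the next grant, since the grant itself overwrites the state.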
The matrix arbiter is named after the matrix array of D-FFs that stores the arbiter state. The D-FF in row `i` and column `j` indicates that request `i` takes priority over request `j`. The main idea behind the algorithm is that the last granted signal is assigned the lowest priority while the relative priority of the other request signals stays the same. Details and an example of the matrix arbiter can be found in the book "Principles and Practices of Interconnection Networks".
`EsynetArbiter` stores the status of the D-FFs of the matrix array in `EsynetArbiter::m_state_matrix`, which is a 2-D matrix of booleans.
The grant function checks all the request signals to find the granted signal, which must meet the following condition: no valid request signal has higher priority than the granted signal. In other words, if the granted signal is `a`, `m_state_matrix[i][a]` must be False for every valid request `i`. If no request signal meets this condition, the grant function returns `-1`.
After finding the granted signal, the grant function updates the D-FFs. It clears all the D-FFs in the same row as the granted signal and sets the D-FFs in the same column as the granted signal. In this way, the priority of the granted signal drops to the lowest.
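The grant condition and the row/column update can be sketched as follows. The class name, the initial priority order (lower index first), and the clearing of requests after a grant are assumptions for illustration; only the matrix rule itself comes from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical matrix arbiter sketch; state_[i][j] == true means that
// request i takes priority over request j.
class MatrixArbiterSketch {
public:
    explicit MatrixArbiterSketch(std::size_t size)
        : request_(size, false),
          state_(size, std::vector<bool>(size, false)) {
        // Assumed initial order: a lower index beats a higher index.
        for (std::size_t i = 0; i < size; ++i)
            for (std::size_t j = i + 1; j < size; ++j)
                state_[i][j] = true;
    }

    void setRequest(std::size_t i) {
        assert(i < request_.size());
        request_[i] = true;
    }

    // Returns the granted index, or -1 when no request is set.
    int grant() {
        const std::size_t n = request_.size();
        int winner = -1;
        for (std::size_t a = 0; a < n && winner < 0; ++a) {
            if (!request_[a]) continue;
            bool beaten = false;
            for (std::size_t i = 0; i < n; ++i) {
                if (i != a && request_[i] && state_[i][a]) beaten = true;
            }
            if (!beaten) winner = static_cast<int>(a);
        }
        if (winner >= 0) {
            const std::size_t a = static_cast<std::size_t>(winner);
            // Clear row a and set column a: a now has the lowest priority.
            for (std::size_t j = 0; j < n; ++j) state_[a][j] = false;
            for (std::size_t i = 0; i < n; ++i) {
                if (i != a) state_[i][a] = true;
            }
        }
        request_.assign(n, false);  // assumption: a grant clears requests
        return winner;
    }

private:
    std::vector<bool> request_;
    std::vector<std::vector<bool>> state_;
};
```

With three signals that all request on every cycle, the grants come out as 0, 1, 2 in turn, because each winner drops below the other two.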
`EsynetArbiter` provides the ability to assign the highest or lowest priority to a port by changing the value of the D-FFs. Note that the assigned priority is only valid for the next arbitration, because the granted request signal must be assigned the lowest priority after arbitration. Thus, a user-defined priority must be set again after each arbitration.
ESYNet supports a typical wormhole switching method. Unlike the store-and-forward and virtual cut-through methods, a packet can be forwarded to the next router as soon as it arrives at the top of the input buffer. One packet can therefore be spread across several routers. The key points of the wormhole implementation are as follows.
- The head flit sets up a channel between the input virtual channel and an input virtual channel in the next router. The other flits follow this channel. The channel is released after the tail flit has been delivered.
- One link between routers can be shared by several virtual channels. If one virtual channel is blocked by flow control, other virtual channels can occupy the link.
Wormhole switching works effectively for topologies without rings (a torus topology can be seen as a collection of rings). Moreover, wormhole switching works for all topologies if packets have only 1 flit. However, if packets have more than 1 flit, wormhole switching can lead to deadlock in topologies with rings.
ESYNet provides another switching method, called `Ring`, for topologies with rings. Packets on the ring are switched as in wormhole switching. When they arrive at the destination, they are forwarded to the network interface as in wormhole switching as well. However, packets from the network interface must wait until the ring has enough space to accept the whole packet. For example, suppose a new packet from a network interface has 5 flits. This packet is blocked in the input buffer of physical port 0 until the input buffer in the next router has 5 free entries. Then the new packet can get a grant and the router forwards it into the ring. The router learns the number of free entries in the input buffer of the neighbor router from the credit, as described in [Flow Control](#Flow Control).
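The injection rule can be written as a small predicate. This is a sketch under the assumption that only packets coming from the network interface are checked against the packet size, while packets already on the ring are not blocked at this point; the function name is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: decides whether a packet may receive a grant
// under the Ring switching method.
bool canGrantVirtualChannel(bool from_network_interface, int credit,
                            int packet_size) {
    assert(credit >= 0 && packet_size > 0);
    if (from_network_interface) {
        // A new packet needs room for all of its flits in the next router.
        return credit >= packet_size;
    }
    // Assumption: packets already on the ring proceed as in wormhole.
    return true;
}
```

For the 5-flit packet in the example above, a credit of 4 keeps the packet blocked and a credit of 5 lets it enter the ring.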
ESYNet provides credit-based flow control. The credit is the number of free slots in the input buffer of the neighbor router. The credit value is held in the output ports. Each virtual channel has its own credit value.
The initial value of the credit equals the size of the input buffer. When the router sends one flit to its neighbor, the credit decreases by 1 because the delivered flit will occupy one entry in the buffer. When the neighbor router moves one flit from the input port to the output port, it generates one `CREDIT` event with the new credit value because the moved flit frees one entry in the buffer.
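The credit bookkeeping on the sending side can be sketched as a small counter. The class and method names are hypothetical; the sketch assumes, as described above, that the `CREDIT` event carries the new credit value rather than an increment.

```cpp
#include <cassert>

// Hypothetical credit counter kept per output virtual channel.
class CreditCounterSketch {
public:
    explicit CreditCounterSketch(int in_buffer_size)
        : credit_(in_buffer_size) {
        assert(in_buffer_size >= 0);
    }

    bool canSend() const { return credit_ > 0; }

    // Called when one flit is sent to the neighbor router.
    void onFlitSent() {
        assert(credit_ > 0);
        --credit_;
    }

    // Called when a CREDIT event arrives with the neighbor's new value.
    void onCreditEvent(int new_credit) {
        assert(new_credit >= 0);
        credit_ = new_credit;
    }

    int credit() const { return credit_; }

private:
    int credit_;
};
```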
The credit is used in several places in the pipeline. First, the credit is used by congestion-aware routing algorithms such as DyXY. In DyXY, packets going to the NE, NW, SE, or SW have two possible directions. The direction with the larger credit is selected because it is less congested.
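The credit comparison in DyXY can be sketched as follows. The enum and function names are hypothetical, the directions are reduced to "along X" and "along Y", and breaking ties in favor of X is an assumption that the text does not specify.

```cpp
#include <cassert>

// Hypothetical result type: which dimension the packet moves along next.
enum class Move { Local, AlongX, AlongY };

// dx, dy: remaining offsets to the destination in each dimension.
// credit_x, credit_y: credits of the candidate output ports.
Move dyxyChoose(int dx, int dy, int credit_x, int credit_y) {
    assert(credit_x >= 0 && credit_y >= 0);
    if (dx == 0 && dy == 0) return Move::Local;
    if (dy == 0) return Move::AlongX;  // only one productive direction
    if (dx == 0) return Move::AlongY;
    // Two productive directions: prefer the less congested one.
    // Assumption: ties go to X.
    return credit_x >= credit_y ? Move::AlongX : Move::AlongY;
}
```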
Second, during switch arbitration, the credit is used to check whether the neighbor router has enough free slots to accept a new flit. If the input port in the neighbor router has only 1 virtual channel, flits are blocked when the credit is 0. If the input ports in the neighbor router have more than 1 virtual channel, flits assigned to a virtual channel with a credit of 0 are skipped during switch arbitration, and flits assigned to other virtual channels can still be delivered through the crossbar and the link.
Third, during virtual channel allocation, the credit is used to block packets from the network interface if the ring does not have enough free slots. To avoid deadlock, a packet cannot be injected into the ring unless there is enough space to accept the whole packet. A packet from the network interface can get a grant only if the credit value is larger than or equal to the size of the packet.
`EsynetRouter` also provides a power unit based on Orion 2.0. The power unit accumulates the energy consumption during the simulation. At the end of the simulation, the power consumption is calculated by dividing the energy consumption by the simulation time.
ESYNet accumulates five different events: writing the input buffer, reading the input buffer, crossbar traversal, link traversal, and arbitration. The energy consumption of each event is calculated by the formula $E = \alpha C V^2$. `C` and `V` are constants related to the technology and the circuit architecture. The values of `C` and `V` defined in `esynet_global.h` should not be changed. $\alpha$ is calculated by comparing the signal with its value at the previous time. Each changed signal contributes one unit of energy consumption; an unchanged signal consumes no energy.
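The energy formula and the final division by simulation time can be illustrated with a short sketch. It assumes that $\alpha$ is the number of bits that differ between the previous and the current signal value; the constants below are placeholders for illustration only and are not the values from `esynet_global.h`, and all function names are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Placeholder constants for illustration; not the ESYNet values.
const double kCapacitance = 2.0;
const double kVoltage = 1.0;

// alpha: number of bits that changed between two signal values.
int toggledBits(std::uint64_t previous, std::uint64_t current) {
    return static_cast<int>(std::bitset<64>(previous ^ current).count());
}

// E = alpha * C * V^2 for one event.
double eventEnergy(std::uint64_t previous, std::uint64_t current) {
    return toggledBits(previous, current) * kCapacitance * kVoltage *
           kVoltage;
}

// Average power over the whole simulation.
double averagePower(double total_energy, double simulation_time) {
    assert(simulation_time > 0.0);
    return total_energy / simulation_time;
}
```

A signal that keeps its previous value has zero toggled bits and therefore adds no energy.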
Copyright @ Junshi Wang