Description
Sometimes when we write Clash we write large signal computations that are essentially pure expressions. Here's a silly but illustrative example:
mySignal :: Signal Int
mySignal = mySignal1 +. mySignal2 +. mySignal3 +. ... +. mySignal1024
  where (+.) = liftA2 (+)
Even if we reparenthesise this expression so that it has logarithmic depth, it will still be ten additions deep (log2 1024 = 10), and it is easily possible that a signal cannot propagate through it within one clock cycle. We could rewrite it thus:
mySignal :: Signal Int
mySignal = r ( ... (r (r mySignal1 +. r mySignal2) +. r (r mySignal3 +. r mySignal4)) +. ... )
  where r = register undefined
        (+.) = liftA2 (+)
That is, we can register every subexpression. If each registered subexpression meets timing requirements on its own, then the rewritten version of mySignal will too. We can be even cleverer and add registers only in the few places where they are necessary, as long as we also add matching registers in the parallel pipelines so that the inputs to each subsequent combinator stay synchronised. (NB: it is not possible to add registers inside any recursive part of the design, because a register in a feedback loop changes the values computed, not just when they appear.)
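To make the balancing requirement concrete, here is a minimal sketch that models a signal as an infinite list with one element per clock cycle. It does not use Clash: `Sig`, `registerL`, `sumComb` and `sumPiped` are hypothetical names made up for this illustration, and the reset value 0 stands in for the `undefined` used above. For lists, `zipWith (+)` plays the role that `liftA2 (+)` plays on `Signal`.

```haskell
-- List-based stand-in for a signal: one element per clock cycle.
-- (Assumption for illustration only; this is not Clash's Signal type.)
type Sig a = [a]

-- Model of a register: the output is the input delayed by one cycle,
-- with the given reset value visible in cycle 0.
registerL :: a -> Sig a -> Sig a
registerL reset s = reset : s

-- Pointwise addition, the list analogue of liftA2 (+) on signals.
(+.) :: Num a => Sig a -> Sig a -> Sig a
(+.) = zipWith (+)

-- Purely combinational sum: no registers, so no latency.
sumComb :: Num a => [Sig a] -> Sig a
sumComb = foldr1 (+.)

-- Pipelined tree sum: one register after every level of the tree.
-- Returns the output signal and its latency in cycles.
sumPiped :: Num a => [Sig a] -> (Sig a, Int)
sumPiped []  = error "sumPiped: no inputs"
sumPiped [s] = (s, 0)
sumPiped ss  = let (out, lat) = sumPiped (level ss) in (out, lat + 1)
  where
    level (a : b : rest) = registerL 0 (a +. b) : level rest
    -- An unpaired signal still gets a register, so that it stays
    -- aligned with its siblings at the next level.
    level [a]            = [registerL 0 a]
    level []             = []
```

In this model, `drop lat out` agrees with `sumComb` for the same inputs, where `(out, lat) = sumPiped inputs`. With 1024 inputs the reported latency is 10, one cycle per level of the tree. Dropping the register from the unpaired case in `level` would break the alignment, which is the synchronisation condition described above.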
My question is: would it be possible to add an optional register insertion phase to Clash? It is not strictly semantics preserving, because the transformed output may be delayed relative to the original output, but it is semantics preserving up to a time shift. Clash could report this time shift to the user.
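As a rough illustration of what "reporting the time shift" could look like, the sketch below pairs each signal with its accumulated latency and pads the earlier operand whenever two signals are combined. This is only a list-based model, not a proposal for how Clash would implement it: `Timed`, `input`, `reg`, `addT` and `example` are hypothetical names, and the fill value 0 is an arbitrary choice for the cycles before valid data arrives.

```haskell
-- A signal (modelled as a list, one element per cycle) tagged with the
-- number of cycles by which it lags the unregistered computation.
data Timed a = Timed { latency :: Int, samples :: [a] }

-- An external input has no added latency.
input :: [a] -> Timed a
input = Timed 0

-- Insert one pipeline register: one more cycle of latency.
reg :: a -> Timed a -> Timed a
reg fill (Timed l s) = Timed (l + 1) (fill : s)

-- Add two signals, first delaying the earlier one so that both
-- operands carry the same latency (the "parallel pipeline" registers).
addT :: Num a => Timed a -> Timed a -> Timed a
addT (Timed la a) (Timed lb b) = Timed l (zipWith (+) (pad la a) (pad lb b))
  where
    l     = max la lb
    pad k = (replicate (l - k) 0 ++)

-- Registers are placed in only two spots; z is balanced automatically.
example :: Timed Int
example = reg 0 (addT (reg 0 (addT x y)) z)
  where
    x = input [1 ..]
    y = input [10, 20 ..]
    z = input [100, 200 ..]
```

Here `latency example` is 2, and `drop 2 (samples example)` begins 111, 222, 333, which is the unregistered sum x + y + z shifted by two cycles. The `latency` field is the number that an insertion phase would need to report to the user.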