When you have a large block of combinational logic, it is sometimes useful to turn it into a pipeline. The benefits are that the critical path becomes much shorter, so you can run at a higher clock frequency, while the throughput of the circuit stays at almost 1 operation/cycle. The only downside is that results are delayed by a couple of cycles.

I once made a not-so-successful attempt at writing a pipeline generator for Verilog. But I was recently informed that Synopsys Design Compiler has an option to do this automatically.

This is the design I want to pipeline.

module addxor(input [5:0] A, input [5:0] B, output C, input clock);
wire [5:0] M = A + B;
wire [5:0] N = M * M;
wire [5:0] H = A - N;
wire [5:0] J = H - B;
wire [5:0] G = A[0] ? H : J;
assign C = |G;
endmodule

pipeline_design

Design Compiler has a straightforward command for this:
design_vision-xg-t> pipeline_design
I don’t like this command, since it changes the behaviour of the original RTL design: pre-synthesis and post-synthesis results are different. In pre-synthesis the pipeline is not functional, and this means I have to write different tests for pre- and post-synthesis.

Retiming to the rescue

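Retiming was invented in 1981, and synthesis tools such as DC use it. It allows the tool to move registers around within the combinational logic in order to optimize the circuit. To give it something to move, I wrap the design and add a chain of registers at the output: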
module addxor_pipelined(input [5:0] A, input [5:0] B, output reg C, input clock);
wire C1;
addxor U1(A,B,C1,clock);
reg C2,C3,C4,C5;
always @ (posedge clock)
begin
C2 <= C1;
C3 <= C2;
C4 <= C3;
C5 <= C4;
C <= C5;
end
endmodule
This could be a common design pattern for pipelining, which consists of two steps:
  1. Encapsulate a combinational module; and
  2. Delay the output.
The hope is that the synthesis tool will balance the timing in the circuit, thus producing a pipeline.
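The two steps can also be written with the number of register stages as a parameter. This is only a sketch: the module name addxor_pipelined_n and the parameter STAGES are my own hypothetical names, it assumes the addxor module from above, and I assume (but have not verified in DC) that a register vector is retimed the same way as the individual registers.

```verilog
// Hypothetical parameterized variant of the wrapper pattern.
// Assumes the addxor module defined earlier; STAGES must be >= 2.
module addxor_pipelined_n #(parameter STAGES = 5)
  (input [5:0] A, input [5:0] B, output C, input clock);

  // Step 1: encapsulate the combinational module.
  wire C1;
  addxor U1 (A, B, C1, clock);

  // Step 2: delay the output by STAGES clock cycles with a shift register.
  reg [STAGES-1:0] delay;
  always @(posedge clock)
    delay <= {delay[STAGES-2:0], C1};

  // The oldest bit of the shift register is the delayed result.
  assign C = delay[STAGES-1];
endmodule
```

With STAGES = 5 this has the same latency as addxor_pipelined. More stages give the retiming step more registers to distribute, at the cost of more latency.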

Without mapping/retiming this design looks like this.
After running the compile command, the design does not change much: the flip-flops are still in series at the output of the circuit, and the circuit has timing violations.
design_vision-xg-t> source config.con
design_vision-xg-t> compile # there will be timing violations
design_vision-xg-t> optimize_registers -period 0 -flatten
Voilà!
The large squares are flip-flops, and you will notice that they are spread throughout the entire design, which is what we wanted. No more timing violations. In case you are interested: my DC configuration.

In conclusion, with this method of pipelining you can use the same test code for the pre-synthesis and post-synthesis designs.
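As a rough illustration of such shared test code, here is a minimal testbench sketch. It is not the test I actually used. The names addxor_pipelined_tb, ref_c and LATENCY are hypothetical, the 10 ns clock period and the 200 random vectors are arbitrary choices, and it assumes the post-synthesis netlist keeps the module name and port names of addxor_pipelined.

```verilog
`timescale 1ns/1ps

// Hypothetical testbench: compile it together with the RTL of
// addxor_pipelined, or with the retimed netlist and the cell library models.
module addxor_pipelined_tb;
  localparam LATENCY = 5;       // number of output registers in addxor_pipelined

  reg  [5:0] A, B;
  reg        clock;
  wire       C;
  reg  [LATENCY-1:0] expected;  // reference results, delayed like the DUT
  integer    i, errors;

  addxor_pipelined dut (.A(A), .B(B), .C(C), .clock(clock));

  // Behavioural copy of the addxor arithmetic, used as the reference.
  function ref_c(input [5:0] a, input [5:0] b);
    reg [5:0] m, n, h, j, g;
    begin
      m = a + b;
      n = m * m;
      h = a - n;
      j = h - b;
      g = a[0] ? h : j;
      ref_c = |g;
    end
  endfunction

  always #5 clock = ~clock;     // 10 ns period, an arbitrary choice

  // Delay the reference by the same number of clock edges as the DUT.
  always @(posedge clock)
    expected <= {expected[LATENCY-2:0], ref_c(A, B)};

  initial begin
    clock = 0; errors = 0; A = 0; B = 0;
    for (i = 0; i < 200; i = i + 1) begin
      @(negedge clock);
      // Skip the first LATENCY cycles while the pipeline fills.
      if (i >= LATENCY && C !== expected[LATENCY-1]) begin
        errors = errors + 1;
        $display("mismatch at %0t: C=%b expected=%b",
                 $time, C, expected[LATENCY-1]);
      end
      // Apply new inputs away from the sampling clock edge.
      A = $random;
      B = $random;
    end
    $display("done, %0d mismatches", errors);
    $finish;
  end
endmodule
```

The testbench only looks at the ports and at the latency in clock cycles, so it does not care where the registers ended up inside the design. For a gate-level run the clock period would have to be at least as long as the one used in the synthesis constraints.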

Thanks for reading, leave your comments below.