5 Replies Last post: Jul 7, 2009 12:52 PM by Guy  
sathish Lurker 3 posts since
May 28, 2009

Jul 1, 2009 11:37 PM

Synthesis Design Optimization techniques (Precision RTL)

Hi everyone,

 

I am trying to optimize the RTL in Precision RTL.

 

What techniques should I follow to meet area and timing (without modifying the RTL)?

 

e.g. constraints and attributes for area and timing.

 

Thanks in Advance.

 

Regards,

Sathish

Guy jedi 173 posts since
Apr 1, 2008
Jul 2, 2009 4:36 PM in response to: sathish
Re: Synthesis Design Optimization techniques (Precision RTL)

The first thing you want to do is make sure your constraints are properly set up, so Precision can see the real problems in the design. Here are some general steps to follow:

 

1) Compile any Xilinx CoreGen blocks into Precision

 

CoreGen blocks can result in black boxes. Black boxes block timing paths in Precision, so a critical path that goes through a black box may not be properly optimized. Black boxes also prevent Precision from getting an accurate area count. The more information that is supplied to Precision, the better job it will do with optimization.

 

Search through your Precision log file for any black box warnings:

 

# Warning: [237]: Found black-box: work.ram; "./ram.vhd", line 22:.
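
As a rough sketch (this is not a Precision feature), a few lines of Python can pull the black-box warnings out of a log file. The pattern is modeled only on the single warning line quoted above, so other Precision versions may word the message differently. The function name find_black_boxes and the sample text are made up for illustration.

```python
import re

# Pattern modeled on the one warning line quoted in this thread (assumption:
# real logs use the same wording and punctuation).
BLACK_BOX_RE = re.compile(
    r'Found black-box:\s*(?P<unit>[\w.]+);\s*"(?P<file>[^"]+)",\s*line\s*(?P<line>\d+)'
)

def find_black_boxes(log_text):
    """Return (design_unit, source_file, line_number) for each black-box warning."""
    hits = []
    for m in BLACK_BOX_RE.finditer(log_text):
        hits.append((m.group("unit"), m.group("file"), int(m.group("line"))))
    return hits

# Hypothetical log excerpt: the warning from this thread plus an unrelated line.
SAMPLE_LOG = (
    "# Info: some unrelated message\n"
    '# Warning: [237]: Found black-box: work.ram; "./ram.vhd", line 22:.\n'
)

if __name__ == "__main__":
    for unit, path, line in find_black_boxes(SAMPLE_LOG):
        print(f"black box {unit} declared at {path}:{line}")
```

Each design unit it lists is a candidate for adding the matching NGC file to the input file list.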

 

It is easy to add the CoreGen blocks to Precision. From the GUI, you can add the Xilinx NGC files to the input file list. This will issue the following command in Precision:

 

add_input_file {ram.ngc}

 

During "compile", Precision will take the timing and area information from the CoreGen file and apply it to your design.

 

2) If you are targeting Altera, turn on the Precision clearbox flow

 

Just like the Xilinx CoreGen blocks, Altera MegaFunctions will show up as black boxes in the design. You can tell Precision to resolve the Altera black boxes by turning on the clearbox flow.

 

- Make sure your $QUARTUS_ROOTDIR environment variable is set to a valid Quartus install directory
- Enable clearbox with the Precision "setup_design -clearbox=true" setting
- Compile your design
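
Before enabling clearbox, it may be worth checking the environment variable from a script. The sketch below only confirms that QUARTUS_ROOTDIR is set and points at an existing directory; it does not verify that the directory is a working Quartus install. The function name check_quartus_rootdir is hypothetical.

```python
import os

def check_quartus_rootdir(env=None):
    """Return (ok, message) describing the state of QUARTUS_ROOTDIR.

    Only checks that the variable is set and names an existing directory.
    Whether that directory is a usable Quartus install is not checked.
    """
    env = os.environ if env is None else env
    path = env.get("QUARTUS_ROOTDIR")
    if not path:
        return False, "QUARTUS_ROOTDIR is not set"
    if not os.path.isdir(path):
        return False, f"QUARTUS_ROOTDIR points to a missing directory: {path}"
    return True, f"QUARTUS_ROOTDIR = {path}"

if __name__ == "__main__":
    ok, message = check_quartus_rootdir()
    print(("OK: " if ok else "PROBLEM: ") + message)
```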

 

3) Make sure your design is properly constrained

 

Precision has a "report_missing_constraints" command that can be run after "compile". This command will tell you if any part of your design is unconstrained:

 

report_missing_constraints

Missing Constraints Report
Clocks are not defined at the following pins
---------------------------------------------
clk
Input delays are missing at the following pins
-----------------------------------------------
data_in
Output delays are missing at the following pins
------------------------------------------------
data_out
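
If you save this report to a file, a small script can group the pins by category so they are easier to track across runs. This is only a sketch that assumes the layout shown above (a heading ending in "at the following pins", a dashed rule, then one pin per line); parse_missing_constraints is a made-up name.

```python
def parse_missing_constraints(report_text):
    """Group pin names under each heading of a missing-constraints report."""
    marker = "at the following pins"
    sections = {}
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        # Skip blank lines and the dashed rules under each heading.
        if not line or set(line) == {"-"}:
            continue
        if line.endswith(marker):
            current = line[: -len(marker)].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections

# The report text quoted in this thread.
SAMPLE_REPORT = """\
Missing Constraints Report
Clocks are not defined at the following pins
---------------------------------------------
clk
Input delays are missing at the following pins
-----------------------------------------------
data_in
Output delays are missing at the following pins
------------------------------------------------
data_out
"""

if __name__ == "__main__":
    for heading, pins in parse_missing_constraints(SAMPLE_REPORT).items():
        print(f"{heading}: {', '.join(pins)}")
```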

 

It is important to constrain your design. If Precision has the right constraints, it can work harder to improve the real critical paths.

 

4) For area issues, make sure the RAM, DSP, and other dedicated resource numbers are close to what you are expecting

 

The most common cause of area problems, and of designs not fitting, is RAM inference and mapping. Check the Precision area report to see if you are getting the expected number of RAM resources:

 

Resource                Used    Avail   Utilization
-----------------------------------------------
Function Generators     69      10944     0.63%
CLB Slices              40      5472      0.73%
Dffs or Latches         79      11424     0.69%
Block RAMs              0       36        0.00%
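
A quick way to spot a resource that was not inferred at all (like the 0 block RAMs above) is to parse the table and flag rows with nothing used. The sketch assumes the four-column layout shown in this post; parse_area_report and unused_resources are hypothetical helper names.

```python
import re

# One table row: resource name, used count, available count, utilization percent.
ROW_RE = re.compile(
    r"^(?P<name>.+?)\s+(?P<used>\d+)\s+(?P<avail>\d+)\s+(?P<util>\d+(?:\.\d+)?)%$"
)

def parse_area_report(text):
    """Map resource name -> (used, available, utilization_percent)."""
    rows = {}
    for raw in text.splitlines():
        m = ROW_RE.match(raw.strip())
        if m:  # the header and dashed rule do not match and are skipped
            rows[m.group("name")] = (
                int(m.group("used")),
                int(m.group("avail")),
                float(m.group("util")),
            )
    return rows

def unused_resources(rows):
    """Names of resources with a used count of zero."""
    return [name for name, (used, _avail, _util) in rows.items() if used == 0]

# The area report quoted in this thread.
SAMPLE_AREA = """\
Resource                Used    Avail   Utilization
-----------------------------------------------
Function Generators     69      10944     0.63%
CLB Slices              40      5472      0.73%
Dffs or Latches         79      11424     0.69%
Block RAMs              0       36        0.00%
"""

if __name__ == "__main__":
    print("Unused:", unused_resources(parse_area_report(SAMPLE_AREA)))
```

If the design contains memories and "Block RAMs" shows up as unused, the RAMs were probably mapped to logic or registers, which is worth checking in the Resource Manager.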

 

The Precision Resource Manager allows you to check all of the inferred resources after "compile". You can also use the Resource Manager to move different instances in and out of dedicated resources like block RAM and dedicated DSP blocks.

 

5) Use the debug functionality in Precision to help you isolate problems

 

By default, Precision will only add the details for your most critical path to the timing report. You can increase the number of critical paths in the Precision timing report by using the "setup_analysis" command:

 

setup_analysis -num_critical_paths=100 -missing_constraints=true

 

This will tell Precision to report 100 of the most critical paths. From your Precision timing report, you can select an instance in the report and "Trace to Tech Schematic". The technology schematic can help you answer questions about the critical path. What objects does the critical path go through? You can trace objects in the schematic back to the original RTL source. This helps you figure out where the critical path is coming from.

 

You can also use the "Trace Forward" and "Trace Backward" functionality in the schematic to see how much slack you have on either side of a register. This can help you determine if retiming will help fix the critical path. If there is some positive slack on one side of the register, Precision may be able to retime the logic and improve the critical path.

 

6) Dissolve hierarchy if the critical path goes through multiple levels of hierarchy

 

Hierarchy can sometimes block Precision from doing the best optimizations on a critical path. When you are looking through the critical paths in your timing report, make sure that all of the instances in the critical path are in the same level of hierarchy:

 

I2/sample_gt16_0/ix100z52223/CI MUXCY             3.825   dn
I2/sample_gt16_0/ix100z52223/O  MUXCY   0.764     4.589   dn
I2/sample_gt16_0/nx100z1        (net)   1.300                  16
I4/I0/ix50013z1316/I1           LUT2              5.889   dn
I4/I0/ix50013z1316/O            LUT2    0.698     6.587   up
I4/I0/nx50013z1                 (net)   0.680                   2

 

In this case the critical path goes from instance I2 to instance I4. You can use the "ungroup" command to flatten the whole design:

 

ungroup -all -hier

 

Another option would be to use the "HIERARCHY" attribute to flatten the instances in the critical path:

 

set_attribute -design rtl -name HIERARCHY -value flatten -instance {I2}
set_attribute -design rtl -name HIERARCHY -value flatten -instance {I4}

 

This will place all of the logic for these two instances in the same level of hierarchy, and it may help Precision do a better job of optimizing the timing path.
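
To check a reported path for hierarchy crossings like the I2 to I4 one above, you can look at the first component of each hierarchical name. This is a simplified sketch: it only compares top-level instance names and assumes "/" is the hierarchy separator, as in the excerpt. The name top_level_instances is made up.

```python
def top_level_instances(path_text):
    """Return the distinct top-level instance names in a timing path, in order."""
    seen = []
    for raw in path_text.splitlines():
        fields = raw.split()
        # Ignore blank lines and names with no hierarchy separator.
        if not fields or "/" not in fields[0]:
            continue
        top = fields[0].split("/")[0]
        if top not in seen:
            seen.append(top)
    return seen

# The critical path excerpt quoted in this thread.
SAMPLE_PATH = """\
I2/sample_gt16_0/ix100z52223/CI MUXCY             3.825   dn
I2/sample_gt16_0/ix100z52223/O  MUXCY   0.764     4.589   dn
I2/sample_gt16_0/nx100z1        (net)   1.300                  16
I4/I0/ix50013z1316/I1           LUT2              5.889   dn
I4/I0/ix50013z1316/O            LUT2    0.698     6.587   up
I4/I0/nx50013z1                 (net)   0.680                   2
"""

if __name__ == "__main__":
    tops = top_level_instances(SAMPLE_PATH)
    if len(tops) > 1:
        print("Path crosses hierarchy:", " -> ".join(tops))
```

More than one name in the result means the path crosses a hierarchy boundary, so those instances are candidates for the HIERARCHY attribute.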

 

7) Use different settings in Precision to get better results

 

Most of these options can be accessed in the Precision GUI from the "Tools->Set Options..." pull-down menu.

 

- Turn on retiming:

 

setup_design -retiming=true

 

With retiming turned on, Precision will be able to push logic across register boundaries to help you meet timing.

 

- Turn on either the "Compile for Timing" or "Compile for Area" front-end optimizations:

 

setup_design -compile_for_timing=true

 

or

 

setup_design -compile_for_area=true

 

These options are mutually exclusive, and they will run optimizations to help you meet timing or area.

 

- Turn off resource sharing

 

setup_design -resource_sharing=false

 

By turning off the resource sharing, you can get better timing results, but it may increase the area.

 

- Lower the fanout to help you meet timing

 

The Precision timing report will list the fanout for the nets in your critical path. If one of the nets has a high fanout number, you can use the "max_fanout" attribute to lower the fanout on the net:

 

set_attribute -name MAX_FANOUT -value 50 -net {I2.sample.my_net}

 

The "max_fanout" setting can also be applied to instances:

 

set_attribute -name MAX_FANOUT -value 50 -instance {I2}

 

You can also apply a global "max_fanout" setting:

 

setup_design -max_fanout=1000
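
To find the nets worth a "max_fanout" setting, a script can scan the "(net)" lines of a timing report excerpt. The sketch below assumes the last column on a net line is the fanout, which matches the excerpt in step 6 but is not confirmed against the Precision documentation. high_fanout_nets is a hypothetical helper and the limit is up to you.

```python
def high_fanout_nets(path_text, limit):
    """Return (net_name, fanout) for net lines whose fanout exceeds limit.

    Assumes net lines have exactly four fields:
    name, the literal "(net)", a delay, and the fanout.
    """
    hits = []
    for raw in path_text.splitlines():
        fields = raw.split()
        if len(fields) == 4 and fields[1] == "(net)" and fields[3].isdigit():
            fanout = int(fields[3])
            if fanout > limit:
                hits.append((fields[0], fanout))
    return hits

# Critical path excerpt from step 6 of this post.
SAMPLE_PATH = """\
I2/sample_gt16_0/ix100z52223/O  MUXCY   0.764     4.589   dn
I2/sample_gt16_0/nx100z1        (net)   1.300                  16
I4/I0/ix50013z1316/O            LUT2    0.698     6.587   up
I4/I0/nx50013z1                 (net)   0.680                   2
"""

if __name__ == "__main__":
    for net, fanout in high_fanout_nets(SAMPLE_PATH, 10):
        print(f"{net}: fanout {fanout}")
```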

 

Most of the time there is a tradeoff between timing and area. You may make a change to help you meet timing, but it could cause the area to increase. You can use Precision to help you better understand your design, letting you make the right tradeoffs between area and timing optimizations.

Hans Wannabe 31 posts since
Aug 27, 2008
Jul 3, 2009 9:16 AM in response to: sathish
Re: Synthesis Design Optimization techniques (Precision RTL)

sathish wrote:

 

Hi Guy!

 

I believe Precision supports SDC constraints, but I didn't see any example provided in the help.

 

Could you provide any SDC constraints for area and timing? If there is a golden script, please provide it to me with an example design.

 

Thanks in Advance!

 

Regards,

Sathish

 

Just use the GUI to set your constraints and Precision will generate the SDC file for you automatically (see the manual for the SDC files it creates). I am not sure if there is an area SDC constraint.

 

In addition to Guy's comprehensive list, I would suggest that you write a script that goes through some or all of the settings and compares the results at the end. What you may find is that some of the settings have the opposite effect to what you expect. For example, I found on one of my designs that I could get the highest Fmax by using the "compile for area" option instead of "compile for delay". Similarly, I found that Physical Aware Synthesis (Plus only version) gives better results on most of my designs if I turn register retiming off. Make sure you do the comparison on placed-and-routed results and not on the values given by Precision.
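
One way to start such a script is to enumerate the option combinations first and generate the command lines for each run. The sketch below only uses the "setup_design" options mentioned in this thread and never enables "compile_for_timing" and "compile_for_area" together. The "=false" and "=true" forms that are not shown in Guy's post are assumed to work by symmetry, and running Precision and place and route for each combination is left out.

```python
from itertools import product

def option_sets():
    """Yield one dict of setup_design options per run."""
    compile_modes = [
        {},                              # neither front-end optimization
        {"compile_for_timing": "true"},  # timing-driven front end
        {"compile_for_area": "true"},    # area-driven front end
    ]
    for retiming, sharing, mode in product(["true", "false"], ["true", "false"], compile_modes):
        opts = {"retiming": retiming, "resource_sharing": sharing}
        opts.update(mode)
        yield opts

def to_commands(opts):
    """Render one option set as setup_design lines, one option per line."""
    return [f"setup_design -{name}={value}" for name, value in opts.items()]

if __name__ == "__main__":
    for run_number, opts in enumerate(option_sets(), start=1):
        print(f"# run {run_number}")
        for command in to_commands(opts):
            print(command)
```

This gives 12 runs (2 retiming settings x 2 resource sharing settings x 3 compile modes). The generated lines can be pasted into one Tcl script per run, and the results compared after place and route.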

 

Good luck,

Hans.

Guy jedi 173 posts since
Apr 1, 2008
Jul 7, 2009 12:52 PM in response to: sathish
Re: Synthesis Design Optimization techniques (Precision RTL)

sathish wrote:

 

 

I have tried the attributes you mentioned, but I am not able to judge what is happening in Precision RTL.

I see different results for different designs.

 

e.g. Global flatten attribute: how does it help area and timing?

 

      Global preserve attribute: how does it help area and timing?

 

      Global compile for area: when should it be used, and when not?

 

      Global max_fanout: yes, I agree with you. I have seen that if we increase the limit, the area is reduced.

 

   Other than these, I am able to use everything you mentioned and see the results, e.g. resource sharing, RAM, etc.

 

If I want only area, what kind of action should I take based on the Precision report file?

I have seen that for some designs, QuickLogic PnR gives good results if there are fewer LUT4s (and more LUT3s).

Is there any way I can control the LUT inference by using an attribute?

 

I have tried the preserve_signal attribute on an internal RTL primitive net for an instance (RTL schematic). The result is that the number of LUT4s goes down and the number of LUT3s goes up (not always).

Is this the right way to play with LUTs, or are there other options?

 

Thanks,

Sathish

 

For better area results, try running Precision without any timing constraints. Do not add a global frequency or any clock constraints. This will allow Precision to focus on optimizing the area. Also, flattening the design can help both area and timing results. As Hans mentioned in the earlier post, try a few different options and check the results after each run.

 

I do not know why preserve_signal is affecting the number of LUT3 cells. All designs are different, so it is hard to tell what is going on in this case.
