Commit df30f28

Updated documentation
1 parent 0d46da6 commit df30f28

6 files changed

+90
-22
lines changed

README.developers.md

Lines changed: 52 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -386,17 +386,21 @@ The following are key QoR metrics which should be used to evaluate the impact of
386386

387387
Implementation Quality Metrics:
388388

389-
| Metric | Meaning | Sensitivity |
390-
|-----------------------------|--------------------------------------------------------------------------|-------------|
391-
| num_pre_packed_blocks | Number of primitive netlist blocks (after tech. mapping, before packing) | Low |
392-
| num_post_packed_blocks | Number of Clustered Blocks (after packing) | Medium |
393-
| device_grid_tiles | FPGA size in grid tiles | Low-Medium |
394-
| min_chan_width | The minimum routable channel width | Medium\* |
395-
| crit_path_routed_wirelength | The routed wirelength at the relaxed channel width | Medium |
396-
| critical_path_delay | The critical path delay at the relaxed channel width | Medium-High |
389+
| Metric | Meaning | Sensitivity |
390+
|---------------------------------|------------------------------------------------------------------------------|-------------|
391+
| num_pre_packed_blocks | Number of primitive netlist blocks (after tech. mapping, before packing) | Low |
392+
| num_post_packed_blocks | Number of Clustered Blocks (after packing) | Medium |
393+
| device_grid_tiles | FPGA size in grid tiles | Low-Medium |
394+
| min_chan_width | The minimum routable channel width | Medium\* |
395+
| crit_path_routed_wirelength | The routed wirelength at the relaxed channel width | Medium |
396+
| NoC_agg_bandwidth\** | The total link bandwidth utilized by all traffic flows | Low |
397+
| NoC_latency\** | The total time of traffic flow data transfer (summed over all traffic flows) | Low |
398+
| NoC_latency_constraints_cost\** | The total number of traffic flow latency constraints | Low |
397399

398400
\* By default, VPR attempts to find the minimum routable channel width; it then performs routing at a relaxed (e.g. 1.3x minimum) channel width. At minimum channel width routing congestion can distort the true timing/wirelength characteristics. Combined with the fact that most FPGA architectures are built with an abundance of routing, post-routing metrics are usually only evaluated at the relaxed channel width.
399401

402+
\** NoC-related metrics are only reported when --noc option is enabled.
403+
400404
Run-time/Memory Usage Metrics:
401405

402406
| Metric | Meaning | Sensitivity |
@@ -493,7 +497,7 @@ k6_frac_N10_frac_chain_mem32K_40nm.xml boundtop.v common 9f591f6-
493497
k6_frac_N10_frac_chain_mem32K_40nm.xml ch_intrinsics.v common 9f591f6-dirty success 363 493 270 247 10 10 17 99 130 1 0 1792 1.86527 -194.602 -1.86527 46 1562 13 1438 20 2.4542 -226.033 -2.4542 0 0 3.92691e+06 1.4642e+06 259806. 2598.06 333135. 3331.35 0.03 0.01 -1 -1 -1 0.46 0.31 0.94 0.09 2.59 62684 8672 32940
494498
```
495499

496-
### Example: Titan Benchmarks QoR Measurements
500+
### Example: Titan Benchmarks QoR Measurement
497501

498502
The [Titan benchmarks](https://docs.verilogtorouting.org/en/latest/vtr/benchmarks/#titan-benchmarks) are a group of large benchmark circuits from a wide range of applications, which are compatible with the VTR project.
499503
They are typically used as post-technology mapped netlists which have been pre-synthesized with Quartus.
@@ -511,7 +515,7 @@ $ make get_titan_benchmarks
511515
#Move to the task directory
512516
$ cd vtr_flow/tasks
513517

514-
#Run the VTR benchmarks
518+
#Run the Titan benchmarks
515519
$ ../scripts/run_vtr_task.py regression_tests/vtr_reg_nightly_test2/titan_quick_qor
516520

517521
#Several days later... they complete
@@ -528,6 +532,44 @@ stratixiv_arch.timing.xml stereo_vision_stratixiv_arch_timing.blif 0208312
528532
stratixiv_arch.timing.xml cholesky_mc_stratixiv_arch_timing.blif 0208312 success 140214 108592 67410 5444 121 90 -1 111 151 -1 -1 5221059 8.16972 -454610 -8.16972 1518597 15 0 0 2.38657e+08 21915.3 9.34704 -531231 -9.34704 0 0 211.12 364.32 490.24 6356252 -1 -1
529533
```
530534

535+
### Example: NoC Benchmarks QoR Measurement
536+
NoC benchmarks currently include synthetic and MLP benchmarks. Synthetic benchmarks have various NoC traffic patterns,
537+
bandwidth utilization, and latency requirements. High-quality NoC router placement solutions for these benchmarks are
538+
known. By comparing the known solutions with NoC router placement results, the developer can evaluate the sanity of
539+
the NoC router placement algorithm. MLP benchmarks are the only realistic netlists included in this benchmark set.
540+
541+
Based on the number of NoC routers in a synthetic benchmark, it is run on one of two different architectures. All MLP
542+
benchmarks are run on an FPGA architecture with 16 NoC routers. Post-technology mapped netlists (blif files)
543+
for synthetic benchmarks are added to the VTR project. However, MLP blif files are very large and should be downloaded
544+
separately.
545+
546+
Since NoC benchmarks target different FPGA architectures, they are run as different circuits. A typical way to run all
547+
NoC benchmarks is to run a task list and gather QoR data from different tasks:
548+
549+
#### Running and Integrating the NoC Benchmarks with VTR
550+
```shell
551+
#From the VTR root
552+
553+
#Download and integrate NoC MLP benchmarks into the VTR source tree
554+
$ make get_noc_mlp_benchmarks
555+
556+
#Move to the VTR flow directory
557+
$ cd vtr_flow
558+
559+
#Run the NoC benchmarks
560+
$ scripts/run_vtr_task.py -l tasks/noc_qor/task_list.txt
561+
562+
#Several days later... they complete
563+
564+
#NoC benchmarks are run as several different tasks. Therefore, QoR results should be gathered from multiple directories,
565+
#one for each task.
566+
$ head -5 tasks/noc_qor/large_complex_synthetic/latest/parse_results.txt
567+
$ head -5 tasks/noc_qor/large_simple_synthetic/latest/parse_results.txt
568+
$ head -5 tasks/noc_qor/small_complex_synthetic/latest/parse_results.txt
569+
$ head -5 tasks/noc_qor/small_simple_synthetic/latest/parse_results.txt
570+
$ head -5 tasks/noc_qor/MLP/latest/parse_results.txt
571+
```
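
Because the NoC QoR results are spread over five task directories, the `head -5` commands above can also be collected in one pass. The following is a minimal sketch, not part of the VTR scripts: the helper name `gather_noc_qor` is hypothetical, the task names are copied from the listing above, and it assumes each task exposes `latest/parse_results.txt` as shown there. Tasks whose results file does not exist yet are skipped, which is an illustrative choice.

```python
from itertools import islice
from pathlib import Path

# Task names copied from the shell listing above.
NOC_QOR_TASKS = [
    "large_complex_synthetic",
    "large_simple_synthetic",
    "small_complex_synthetic",
    "small_simple_synthetic",
    "MLP",
]


def gather_noc_qor(tasks_root, num_lines=5):
    """Return {task name: first `num_lines` lines of its parse_results.txt}.

    `tasks_root` is the directory that contains `noc_qor` (e.g. `tasks`
    when run from `vtr_flow`). Hypothetical helper, not a VTR API.
    """
    results = {}
    for task in NOC_QOR_TASKS:
        path = Path(tasks_root) / "noc_qor" / task / "latest" / "parse_results.txt"
        if not path.is_file():
            # Assumption: a missing file means the task has not been run yet.
            continue
        with path.open() as results_file:
            # Equivalent of `head -<num_lines>`, with trailing newlines removed.
            results[task] = [line.rstrip("\n") for line in islice(results_file, num_lines)]
    return results
```

Run from `vtr_flow`, `gather_noc_qor("tasks")` would return one entry per completed task, which can then be printed or compared against a baseline run.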
572+
531573
### Example: Koios Benchmarks QoR Measurement
532574

533575
The [Koios benchmarks](https://github.com/verilog-to-routing/vtr-verilog-to-routing/tree/master/vtr_flow/benchmarks/verilog/koios) are a group of Deep Learning benchmark circuits distributed with the VTR project.

doc/README

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ Overview
44
The VTR documentation is generated using sphinx, a python based documentation generator.
55

66
The documentation itself is written in re-structured text (files ending in .rst), which
7-
is a lightwieght mark-up language for text documents.
7+
is a lightweight mark-up language for text documents.
88

99
Currently VTR's documentation is automatically built by https://readthedocs.org/projects/vtr/ and is served at:
1010

@@ -36,7 +36,7 @@ from the main documentation directory (i.e. <vtr_root>/doc).
3636

3737
This will produce the output html in the _build directory.
3838

39-
You can then view the resulting documention with the web-browser of your choice.
39+
You can then view the resulting documentation with the web-browser of your choice.
4040
For instance:
4141

4242
$ firefox _build/html/index.html

doc/src/vtr/benchmarks.rst

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -191,7 +191,20 @@ The SymbiFlow benchmarks can be downloaded and extracted by running the followin
191191
cd $VTR_ROOT
192192
make get_symbiflow_benchmarks
193193
194-
Once downloaded and extracted, benchmarks are provided as post-synthesized eblif files under: ::
194+
Once downloaded and extracted, benchmarks are provided as post-synthesized blif files under: ::
195195

196196
$VTR_ROOT/vtr_flow/benchmarks/symbiflow
197197

198+
.. _noc_benchmarks:
199+
200+
NoC Benchmarks
201+
----------------
202+
NoC benchmarks are composed of synthetic and MLP benchmarks and target NoC-enhanced FPGA architectures. Synthetic
203+
benchmarks include a wide variety of traffic flow patterns and are divided into two groups: 1) simple and 2) complex
204+
benchmarks. As their names imply, simple benchmarks use very simple and small logic modules connected to NoC routers,
205+
while complex benchmarks implement more complicated functionalities like encryption. These benchmarks do not come from
206+
real application domains. On the other hand, MLP benchmarks include modules that perform matrix-vector multiplication
207+
and move data. Pre-synthesized netlists for the synthetic benchmarks are added to the VTR project, but MLP netlists should
208+
be downloaded separately.
209+
210+
.. note:: The NoC MLP benchmarks are not included with the VTR release (due to their size). However they can be downloaded and extracted by running ``make get_noc_mlp_benchmarks`` from the root of the VTR tree. They can also be `downloaded manually <https://www.eecg.utoronto.ca/~vaughn/titan/>`_.

vtr_flow/benchmarks/noc/Large_Designs/MLP/Readme.txt

Lines changed: 11 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -12,15 +12,17 @@ Benchmark Structure:
1212
|---<Benchmark>.flows - Is the NoC traffic flows file associated with the given benchmark
1313
(A benchmark can have multiple traffic flows files)
1414
|---verilog - Contains design files needed to generate the netlist file for the benchmark
15-
|---shared_verilog - Contains design files needed by all benchmarks to generate thier netlist files
15+
|---shared_verilog - Contains design files needed by all benchmarks to generate their netlist files
16+
|---blif_files - Contains symbolic links to all .blif files that exist in this directory
17+
|---flow_files - Contains symbolic links to all .flow files that exist in this directory
1618

1719
Running the benchmarks:
1820
Pre-requisite
1921
- Ensure VPR is built (refer to 'https://docs.verilogtorouting.org/en/latest/' for build instructions)
2022
- Set 'VTR_ROOT' as environment variable pointing to the location of the VTR source tree
2123
- Ensure python version 3.6.9 or higher is installed
2224
- Copy over the netlist files from 'https://drive.google.com/drive/folders/135QhmfgUaGnK2ZEfbfEXtdm1BfS7YoG7?usp=sharing'.
23-
The file structure in the previous link is similiar to structure found in '$VTR_ROOT/vtr_flow/benchmarks/noc/Large_Designs/MLP'.
25+
The file structure in the previous link is similar to the structure found in '$VTR_ROOT/vtr_flow/benchmarks/noc/Large_Designs/MLP'.
2426
Place the netlist files in the appropriate folder locations.
2527

2628
Running single instance:
@@ -48,7 +50,7 @@ Running the benchmarks:
4850
-vpr_executable $VTR_ROOT/build/vpr/vpr --device EP4SE820 -flow_file $VTR_ROOT/vtr_flow/benchmarks/noc/Large_Designs/MLP/MLP_1/mlp_1.flows \
4951
-noc_routing_algorithm xy_routing -number_of_seeds 5 -number_of_threads 1 -route
5052

51-
- The above command will generate an output file in the run directory that contains all the place and route metrics. This is a txt file with a name which matches the
53+
- The above command will generate an output file in the run directory that contains all the place and route metrics. This is a txt file with a name which matches
5254
the flows file provided. So for the command shown above the output file is 'mlp_1.txt'
5355

5456
Special benchmarks:
@@ -64,8 +66,13 @@ Running the benchmarks:
6466
of the NoC routers needs to be locked. A
6567
- To run a single instance of this benchmark, pass in the following command line parameter and its value to the command shown above:
6668
'--fix_clusters $VTR_ROOT/vtr_flow/benchmarks/noc/Large_Designs/MLP/MLP_2_phase_optimization/MLP_2_phase_optimization_step_2/MLP_two_phase_optimization_step_two_constraints.place'
67-
- To run the benchmarkusing the automated script just pass in the following command line parameter and its value to the script command above:
69+
- To run the benchmark using the automated script, just pass in the following command line parameter and its value to the script command above:
6870
'-fix_clusters $VTR_ROOT/vtr_flow/benchmarks/noc/Large_Designs/MLP/MLP_2_phase_optimization/MLP_2_phase_optimization_step_2/MLP_two_phase_optimization_step_two_constraints.place'
71+
72+
Running VTR tasks:
73+
- All synthetic benchmarks can be run as VTR tasks. Example tasks are provided in vtr_flow/tasks/noc_qor
74+
- Instructions on how to run VTR tasks to measure QoR for NoC benchmarks are available in the VTR Developer Guide.
75+
6976
Expected run time:
7077
- These benchmarks are quite large so the maximum expected run time for a single run is a few hours
7178
- To speed up the run time with multiple VPR runs, the thread count can be increased from 1. Set the thread count equal to the number of seeds for the fastest run time.

vtr_flow/benchmarks/noc/Synthetic_Designs/Readme.txt

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,9 @@ Benchmark Structure:
88
|---<Benchmark>.flows - Is the NoC traffic flows file associated with the given benchmark
99
(A benchmark can have multiple traffic flows files)
1010
|---verilog - Contains design files needed to generate the netlist file for the benchmark
11-
|---shared_verilog - Contains design files needed by all benchmarks to generate thier netlist files
11+
|---shared_verilog - Contains design files needed by all benchmarks to generate their netlist files
12+
|---blif_files - Contains symbolic links to all .blif files that exist in this directory
13+
|---flow_files - Contains symbolic links to all .flow files that exist in this directory
1214

1315
Running the benchmarks:
1416
Pre-requisite
@@ -42,7 +44,11 @@ Running the benchmarks:
4244
-noc_routing_algorithm xy_routing -noc_swap_percentage 40 -number_of_seeds 5 -number_of_threads 1
4345

4446
- The above command will generate an output file in the run directory that contains all the place and route metrics. This is a txt file with a name which matches the
45-
the flows file provided. So for the command shown above the outout file is 'complex_2_noc_1D_chain.txt'
47+
flows file provided. So for the command shown above the output file is 'complex_2_noc_1D_chain.txt'
48+
49+
Running VTR tasks:
50+
- All synthetic benchmarks can be run as VTR tasks. Example tasks are provided in vtr_flow/tasks/noc_qor
51+
- Instructions on how to run VTR tasks to measure QoR for NoC benchmarks are available in the VTR Developer Guide.
4652

4753
Expected run time:
4854
- These benchmarks are quite small so the maximum expected run time for a single run is ~30 minutes

vtr_flow/benchmarks/titan_blif/README.rst

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
1-
The `Titan <http://www.eecg.utoronto.ca/~vaughn/titan/>` benchmarks are distributed seperately from VTR due to their large size.
1+
The `Titan <http://www.eecg.utoronto.ca/~vaughn/titan/>`_ benchmarks are distributed separately from VTR due to their large size.
22

3-
The Titan repo is located under /home/kmurray/trees/titan on the U of T EECG network. Memebers of Vaughn Betz's research lab have read/write privileges.
3+
The Titan repo is located under /home/kmurray/trees/titan on the U of T EECG network. Members of Vaughn Betz's research lab have read/write privileges.
44

55
This repo is where the Titan flow is developed and where any changes to it should be made.
66

7-
In addition to the titan benchmarks, this repo contains scripts that are used ingeneration of the architecture description for Stratix IV.
7+
In addition to the titan benchmarks, this repo contains scripts that are used in generation of the architecture description for Stratix IV.
88

99
More specifically, they contain scripts that generate memory blocks & complex switch blocks.
1010
