@@ -6,7 +6,8 @@ Packages with benchmark tests:
66 - `BenchmarkSender`, `BenchmarkVerificationSenderProof`, `TestParallelBenchmarkSender`, and `TestParallelBenchmarkVerificationSenderProof` are used to benchmark the generation of a transfer action. This includes also the generation of ZK proof for a transfer operation.
77 - `BenchmarkTransferProofGeneration`, `TestParallelBenchmarkTransferProofGeneration` are used to benchmark the generation of ZK proof alone.
88- `token/core/zkatdlog/nogh/v1/issue`: `BenchmarkIssuer` and `BenchmarkProofVerificationIssuer`
9- - `token/core/zkatdlog/nogh/v1`: `BenchmarkTransfer`
9+ - `token/core/zkatdlog/nogh/v1/validator`: `TestParallelBenchmarkValidatorTransfer`.
10+ - `token/core/zkatdlog/nogh/v1`: `BenchmarkTransferServiceTransfer` and `TestParallelBenchmarkTransferServiceTransfer`.
1011
1112The steps necessary to run the benchmarks are very similar.
1213We give two examples here:
@@ -116,10 +117,10 @@ Example results have been produced on an Apple M1 Max and can be consulted [here
116117This is a test that runs multiple instances of the above benchmark in parallel.
117118This allows the analyst to understand if shared data structures are actual bottlenecks.
118119
119- It uses a custom-made runner whose documentation can be found [here](../../../token/core/common/benchmark/runner.md).
120+ It uses a custom-made runner whose documentation can be found [here](../../../token/services/benchmark/runner.md).
120121
121122``` shell
122- go test ./token/core/zkatdlog/nogh/v1/transfer -test.run=TestParallelBenchmarkSender -test.v -test.benchmem -test.timeout 0 -bits="32" -curves="BN254" -num_inputs="2" -num_outputs="2" -workers="1,10" -duration="10s" | tee bench.txt
123+ go test ./token/core/zkatdlog/nogh/v1/transfer -test.run=TestParallelBenchmarkSender -test.v -test.timeout 0 -bits="32" -curves="BN254" -num_inputs="2" -num_outputs="2" -workers="NumCPU" -duration="10s" -setup_samples=128 | tee bench.txt
123124```
124125
125126The test supports the following flags:
@@ -136,120 +137,82 @@ The test supports the following flags:
136137 a comma-separate list of number of outputs (1,2,3,...)
137138 -workers string
138139 a comma-separate list of workers (1,2,3,...,NumCPU), where NumCPU is converted to the number of available CPUs
140+ -profile bool
141+ write pprof profiles to file
142+ -setup_samples uint
143+ number of setup samples, 0 disables it
139144```
140145
141146### Results
142147
143- ``` go
148+ ``` shell
144149=== RUN TestParallelBenchmarkSender
145- === RUN TestParallelBenchmarkSender/Setup(bits_32,_curve_BN254,_#i_2,_#o_2)_with_1_workers
146- Metric Value Description
147- ------ ----- -----------
148- Workers 1
149- Total Ops 168 (Low Sample Size)
150- Duration 10.023390959s (Good Duration)
151- Real Throughput 16.76/s Observed Ops/sec (Wall Clock)
152- Pure Throughput 17.77/s Theoretical Max (Low Overhead)
153-
154- Latency Distribution:
155- Min 55.180375ms
156- P50 (Median) 55.945812ms
157- Average 56.290356ms
158- P95 58.108814ms
159- P99 58.758087ms
160- Max 59.089958ms (Stable Tail)
161-
162- Stability Metrics:
163- Std Dev 898.087µs
164- IQR 1.383083ms Interquartile Range
165- Jitter 590.076µs Avg delta per worker
166- CV 1.60% Excellent Stability (<5%)
167-
168- Memory 1301420 B/op Allocated bytes per operation
169- Allocs 18817 allocs/op Allocations per operation
170-
171- Latency Heatmap (Dynamic Range):
172- Range Freq Distribution Graph
173- 55.180375ms-55.369563ms 17 █████████████████████████ (10.1%)
174- 55.369563ms-55.5594ms 18 ██████████████████████████ (10.7%)
175- 55.5594ms-55.749887ms 27 ████████████████████████████████████████ (16.1%)
176- 55.749887ms-55.941028ms 20 █████████████████████████████ (11.9%)
177- 55.941028ms-56.132824ms 13 ███████████████████ (7.7%)
178- 56.132824ms-56.325277ms 9 █████████████ (5.4%)
179- 56.325277ms-56.51839ms 4 █████ (2.4%)
180- 56.51839ms-56.712165ms 6 ████████ (3.6%)
181- 56.712165ms-56.906605ms 9 █████████████ (5.4%)
182- 56.906605ms-57.101711ms 13 ███████████████████ (7.7%)
183- 57.101711ms-57.297486ms 10 ██████████████ (6.0%)
184- 57.297486ms-57.493933ms 3 ████ (1.8%)
185- 57.493933ms-57.691053ms 3 ████ (1.8%)
186- 57.691053ms-57.888849ms 4 █████ (2.4%)
187- 57.888849ms-58.087323ms 3 ████ (1.8%)
188- 58.087323ms-58.286478ms 2 ██ (1.2%)
189- 58.286478ms-58.486315ms 2 ██ (1.2%)
190- 58.486315ms-58.686837ms 2 ██ (1.2%)
191- 58.686837ms-58.888047ms 2 ██ (1.2%)
192- 58.888047ms-59.089958ms 1 █ (0.6%)
193-
194- --- Analysis & Recommendations ---
195- [WARN] Low sample size (168). Results may not be statistically significant. Run for longer.
196- [INFO] High Allocations (18817/op). This will trigger frequent GC cycles and increase Max Latency.
197- ----------------------------------
198150=== RUN TestParallelBenchmarkSender/Setup(bits_32,_curve_BN254,_#i_2,_#o_2)_with_10_workers
199- Metric Value Description
200- ------ ----- -----------
201- Workers 10
202- Total Ops 1232 (Low Sample Size)
203- Duration 10.070877291s (Good Duration)
204- Real Throughput 122.33/s Observed Ops/sec (Wall Clock)
205- Pure Throughput 130.12/s Theoretical Max (Low Overhead)
151+ Metric Value Description
152+ ------ ----- -----------
153+ Workers 10
154+ Total Ops 1230 (Low Sample Size)
155+ Duration 10.068s (Good Duration)
156+ Real Throughput 122.17/s Observed Ops/sec (Wall Clock)
157+ Pure Throughput 123.04/s Theoretical Max (Low Overhead)
206158
207159Latency Distribution:
208- Min 61.2545ms
209- P50 (Median) 75.461375ms
210- Average 76.852256ms
211- P95 93.50851ms
212- P99 106.198982ms
213- Max 144.872375ms (Stable Tail)
160+ Min 59.895916ms
161+ P50 (Median) 77.717333ms
162+ Average 81.27214ms
163+ P95 112.28194ms
164+ P99 137.126207ms
165+ P99.9 189.117473ms
166+ Max 215.981417ms (Stable Tail)
214167
215168Stability Metrics:
216- Std Dev 9.28799ms
217- IQR 10.909229ms Interquartile Range
218- Jitter 9.755984ms Avg delta per worker
219- CV 12.09% Moderate Variance (10-20%)
220-
221- Memory 1282384 B/op Allocated bytes per operation
222- Allocs 18668 allocs/op Allocations per operation
169+ Std Dev 16.96192ms
170+ IQR 19.050834ms Interquartile Range
171+ Jitter 15.937043ms Avg delta per worker
172+ CV 20.87% Unstable (> 20%) - Result is Noisy
173+
174+ System Health & Reliability:
175+ Error Rate 0.0000% (100% Success) (0 errors)
176+ Memory 1159374 B/op Allocated bytes per operation
177+ Allocs 17213 allocs/op Allocations per operation
178+ Alloc Rate 133.20 MB/s Memory pressure on system
179+ GC Overhead 1.27% (High GC Pressure)
180+ GC Pause 127.435871ms Total Stop-The-World time
181+ GC Cycles 264 Full garbage collection cycles
223182
224183Latency Heatmap (Dynamic Range):
225184Range Freq Distribution Graph
226- 61.2545ms-63.948502ms 36 ███████ (2.9%)
227- 63.948502ms-66.760987ms 86 █████████████████ (7.0%)
228- 66.760987ms-69.697167ms 152 ███████████████████████████████ (12.3%)
229- 69.697167ms-72.762481ms 181 █████████████████████████████████████ (14.7%)
230- 72.762481ms-75.962609ms 195 ████████████████████████████████████████ (15.8%)
231- 75.962609ms-79.303481ms 179 ████████████████████████████████████ (14.5%)
232- 79.303481ms-82.791286ms 152 ███████████████████████████████ (12.3%)
233- 82.791286ms-86.432486ms 94 ███████████████████ (7.6%)
234- 86.432486ms-90.233828ms 59 ████████████ (4.8%)
235- 90.233828ms-94.202355ms 40 ████████ (3.2%)
236- 94.202355ms-98.345419ms 29 █████ (2.4%)
237- 98.345419ms-102.670697ms 9 █ (0.7%)
238- 102.670697ms-107.186203ms 8 █ (0.6%)
239- 107.186203ms-111.900303ms 4 (0.3%)
240- 111.900303ms-116.821732ms 2 (0.2%)
241- 116.821732ms-121.959608ms 3 (0.2%)
242- 121.959608ms-127.32345ms 1 (0.1%)
243- 127.32345ms-132.923196ms 1 (0.1%)
244- 138.769222ms-144.872375ms 1 (0.1%)
185+ 59.895916ms-63.862831ms 98 ██████████████████████ (8.0%)
186+ 63.862831ms-68.092476ms 163 ████████████████████████████████████ (13.3%)
187+ 68.092476ms-72.602251ms 170 ██████████████████████████████████████ (13.8%)
188+ 72.602251ms-77.410709ms 172 ██████████████████████████████████████ (14.0%)
189+ 77.410709ms-82.537631ms 177 ████████████████████████████████████████ (14.4%)
190+ 82.537631ms-88.004111ms 128 ████████████████████████████ (10.4%)
191+ 88.004111ms-93.832637ms 119 ██████████████████████████ (9.7%)
192+ 93.832637ms-100.047186ms 73 ████████████████ (5.9%)
193+ 100.047186ms-106.673326ms 40 █████████ (3.3%)
194+ 106.673326ms-113.738317ms 32 ███████ (2.6%)
195+ 113.738317ms-121.271222ms 20 ████ (1.6%)
196+ 121.271222ms-129.303034ms 14 ███ (1.1%)
197+ 129.303034ms-137.866793ms 12 ██ (1.0%)
198+ 137.866793ms-146.997731ms 3 (0.2%)
199+ 146.997731ms-156.733413ms 4 (0.3%)
200+ 167.11389ms-178.181868ms 2 (0.2%)
201+ 178.181868ms-189.98288ms 1 (0.1%)
202+ 189.98288ms-202.565475ms 1 (0.1%)
203+ 202.565475ms-215.981417ms 1 (0.1%)
245204
246205--- Analysis & Recommendations ---
247- [WARN] Low sample size (1232). Results may not be statistically significant. Run for longer.
248- [INFO] High Allocations (18668/op). This will trigger frequent GC cycles and increase Max Latency.
206+ [WARN] Low sample size (1230). Results may not be statistically significant. Run for longer.
207+ [FAIL] High Variance (CV 20.87%). System noise is affecting results. Isolate the machine or increase duration.
208+ [INFO] High Allocations (17213/op). This will trigger frequent GC cycles and increase Max Latency.
249209----------------------------------
250- --- PASS: TestParallelBenchmarkSender (20.83s)
251- --- PASS: TestParallelBenchmarkSender/Setup(bits_32,_curve_BN254,_#i_2,_#o_2)_with_1_workers (10.39s)
252- --- PASS: TestParallelBenchmarkSender/Setup(bits_32,_curve_BN254,_#i_2,_#o_2)_with_10_workers (10.44s)
210+
211+ --- Throughput Timeline ---
212+ Timeline: [▇▇▇█▇▇▇▇▆▇] (Max: 131 ops/s)
213+
214+ --- PASS: TestParallelBenchmarkSender (13.97s)
215+ --- PASS: TestParallelBenchmarkSender/Setup(bits_32,_curve_BN254,_#i_2,_#o_2)_with_10_workers (13.96s)
253216PASS
254- ok github.com/hyperledger-labs/fabric-token-sdk/token/core/zkatdlog/nogh/v1/transfer 21.409s
217+ ok github.com/hyperledger-labs/fabric-token-sdk/token/core/zkatdlog/nogh/v1/transfer 14.566s
255218```