Q8GEMM per-channel quant 32bit/16bit accumulation by mortzur · Pull Request #54 · pytorch/QNNPACK

mortzur · 2019-03-21T19:28:51Z

Micro-kernel implementations of the following:

Q8GEMM with per-channel weight quantization parameters (.c) + unit tests + benchmarks
Q8GEMM with per-channel weight quantization parameters for AARCH32 (.S) + unit tests + benchmarks
Q8GEMM with per-channel weight quantization parameters and 16-bit opportunistic accumulation (.c) + unit tests + benchmarks
Q8GEMM with per-channel weight quantization parameters and 16-bit opportunistic accumulation for AARCH32 (.S) + unit tests + benchmarks
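
For readers unfamiliar with the per-channel scheme, here is a minimal scalar sketch of what a Q8GEMM with per-output-channel weight quantization computes. It is not the code from this PR: the function name `q8gemm_per_channel_ref`, the argument layout, and the float requantization are all assumptions made for illustration, and it covers only plain 32-bit accumulation (the 16-bit opportunistic variant is not shown). The point is that each output channel `j` has its own weight zero point and its own requantization scale.

```c
#include <stddef.h>
#include <stdint.h>

/* Round to nearest, ties away from zero, without depending on libm. */
static int32_t round_to_int32(float x) {
  return (int32_t)(x + (x >= 0.0f ? 0.5f : -0.5f));
}

/*
 * Hypothetical scalar reference (names and layout are assumptions):
 *   a             : m x k activations, row-major, shared zero point
 *   w             : n x k weights, one row per output channel
 *   w_zero_point  : n entries, one weight zero point per output channel
 *   requant_scale : n entries, assumed to hold
 *                   a_scale * w_scale[j] / out_scale for channel j
 *   c             : m x n quantized output, row-major
 */
static void q8gemm_per_channel_ref(
    size_t m, size_t n, size_t k,
    const uint8_t* a, uint8_t a_zero_point,
    const uint8_t* w, const uint8_t* w_zero_point,
    const int32_t* bias, const float* requant_scale,
    uint8_t out_zero_point, uint8_t* c) {
  for (size_t i = 0; i < m; i++) {
    for (size_t j = 0; j < n; j++) {
      /* 32-bit accumulation, starting from the per-channel bias. */
      int32_t acc = bias[j];
      for (size_t p = 0; p < k; p++) {
        const int32_t av = (int32_t)a[i * k + p] - (int32_t)a_zero_point;
        /* The weight zero point is looked up per output channel j. */
        const int32_t wv = (int32_t)w[j * k + p] - (int32_t)w_zero_point[j];
        acc += av * wv;
      }
      /* Requantize with the scale of channel j, then clamp to uint8. */
      int32_t q = round_to_int32((float)acc * requant_scale[j]) +
                  (int32_t)out_zero_point;
      if (q < 0) q = 0;
      if (q > 255) q = 255;
      c[i * n + j] = (uint8_t)q;
    }
  }
}
```

With a single shared weight zero point and scale, `w_zero_point[j]` and `requant_scale[j]` collapse to scalars, which is the per-tensor case.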

franksun007 · 2019-06-10T21:49:08Z

I believe the following changes are missing from CMakeLists.txt:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index a5ddc49..6320b1e 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -188,11 +188,15 @@ SET(QNNPACK_AARCH32_ASM_UKERNELS
   src/q8conv/4x8-aarch32-neon.S
   src/q8dwconv/up8x9-aarch32-neon.S
   src/q8gemm/4x8-aarch32-neon.S
-  src/q8gemm/4x8c2-xzp-aarch32-neon.S)
+  src/q8gemm/4x8c2-xzp-aarch32-neon.S
+  src/q8gemm/4x8-aarch32-neon-per-channel.S
+  src/q8gemm/4x8-aarch32-neon-per-channel-16bitAcc.S)
 
 SET(QNNPACK_AARCH64_ASM_UKERNELS
   src/q8conv/8x8-aarch64-neon.S
-  src/q8gemm/8x8-aarch64-neon.S)
+  src/q8gemm/8x8-aarch64-neon.S
+  src/q8gemm/4x8-neon_per_channel.c
+  src/q8gemm/4x8-neon_per_channel_16bitAcc.c)
 
 SET(QNNPACK_X86_SSE2_UKERNELS
   src/q8avgpool/mp8x9p8q-sse2.c

franksun007 · 2019-06-10T22:59:13Z

Also, ~~test~~ and benchmarks failed to compile on the ARM32 platform. This might be an easy fix with an if guard.

Sorry, my bad. Only the benchmark is not compiling correctly.

mortzur added 9 commits February 19, 2019 14:47

adding unit-test example of per-channel scale and zero-point

0b06799

Adding q8gemm neon ukernel with weights quantization parameters per o…

1909ceb

…utput channel

Revert "adding unit-test example of per-channel scale and zero-point"

00c1494

This reverts commit 0b06799.

Benchmarks for q8gemm with per-channel kernel quantization parameters

0ab971e

moving 4x8-neon per channel ukernel to a separate file

efe00eb

q8gemm per-channel armv7 ukernel

fe09e88

cleanup comments

5609f85

per-channel quantization with 16bit accumulation

fb3fb18

cleanup

eb5433f

mortzur requested a review from hlu1 March 21, 2019 19:28

mortzur wants to merge 9 commits into pytorch:master from mortzur:per_channel_quant_16bit
