FP16 and INT8 inference speed

Hi, thanks for your excellent work; it really helps me in my work. I have a question. I modified your code to run inference on my model with both FP16 and INT8 precision, and both modes give correct results. However, FP16 mode and INT8 mode have almost the same inference speed. Why isn't INT8 mode faster than FP16 mode? Any suggestions would be much appreciated.
P.S. I used the realesrgan-x4v3 model, which I converted to ONNX format.
<img width="405" height="919" alt="Image" src="https://github.com/user-attachments/assets/36c59aab-9200-4a93-9dfa-39784f5cfa63" />

<img width="532" height="851" alt="Image" src="https://github.com/user-attachments/assets/c5b6d8a0-04d9-47f2-9768-acc563160008" />
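
For reference, here is a minimal sketch of how the two modes could be timed under identical conditions, so that the comparison is not skewed by warm-up runs or outliers. It is not taken from the project's code: `benchmark`, `speedup`, and the two dummy callables are hypothetical names, and the real FP16 and INT8 inference calls would have to be substituted in. It also assumes each callable blocks until the result is ready (for GPU inference, that means synchronizing inside the callable).

```python
import statistics
import time


def benchmark(infer, warmup=10, runs=50):
    """Return the median wall-clock latency of infer() in milliseconds.

    `infer` is a hypothetical zero-argument callable that runs one
    inference and returns only after the result is available.
    """
    # Warm-up iterations are discarded so one-time setup cost is excluded.
    for _ in range(warmup):
        infer()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive to occasional slow runs than the mean.
    return statistics.median(samples)


def speedup(baseline_ms, candidate_ms):
    """Ratio of baseline latency to candidate latency (above 1 means faster)."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Placeholder workloads standing in for the real FP16 / INT8 calls.
    def run_fp16():
        sum(range(20000))

    def run_int8():
        sum(range(20000))

    fp16_ms = benchmark(run_fp16)
    int8_ms = benchmark(run_int8)
    print(f"FP16 median: {fp16_ms:.3f} ms")
    print(f"INT8 median: {int8_ms:.3f} ms")
    print(f"INT8 speedup over FP16: {speedup(fp16_ms, int8_ms):.2f}x")
```

Timing only the inference call, separately from pre-processing and post-processing, makes it easier to see whether the two precisions really differ in the part of the pipeline they affect.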

FP16 and INT8 inference speed #91

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

FP16 and INT8 inference speed #91

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions