
Conversation


@oreomaker oreomaker commented Jan 5, 2026

Please check Guidelines for Contributing.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for CastType, Embedding, Index, and View operations in QNN backend.
    • Introduced Mul operation support alongside existing Add operation.
    • Added Sigmoid neural network layer.
  • Refactor

    • Simplified tensor creation logic in QNN backend by removing caching and registration overhead.
    • Restructured pattern registration for improved modularity and extensibility.
    • Unified operation pattern handling interface.


oreomaker and others added 6 commits January 4, 2026 03:24
…iew operations

- Add new QnnAOTCastTypePattern to handle CastType operations with quantization
 support for float/int conversions and QNN Quantize/Dequantize/Cast operations
- Add new QnnAOTEmbeddingPattern to convert embedding operations to QNN Gather
- Add new QnnAOTIndexPattern to handle index operations using QNN Gather with
  proper axis detection
- Add new QnnAOTViewPattern to convert view operations to QNN Reshape
- Refactor LLM2QnnLoweringPass to use template-based pattern registration
  instead of manual insertion
- Update Elewise visitor to add QnnAOTMulPattern for multiplication operations
- Modify base pattern to use rewrite method instead of compile method

coderabbitai bot commented Jan 5, 2026

📝 Walkthrough

The PR adds five new QNN AOT visitor patterns (CastType, Embedding, Index, View, Mul), introduces a Sigmoid neural network layer, and refactors the pattern registration system to support bulk initialization. It also removes tensor caching and runtime registration side effects from QNN tensor creation, and changes the base pattern class from a compile-based to a rewrite-based API.

Changes

  • QNN AOT Tensor Creation (mllm/backends/qnn/aot/QnnWrappersAPI.cpp)
    Simplified tensor creation by removing caching (static_tensor_, all_tensors_) and QNN runtime registration calls (tensorCreateContextTensor, tensorCreateGraphTensor). Tensors are now only constructed and returned without side effects.
  • QNN AOT Pattern Base Refactoring (mllm/backends/qnn/aot/visitor/Base.hpp)
    Removed the pure virtual compile method from QnnAOTBasePattern and QnnAOTQuantRecipeBasePattern. Replaced it with no-op rewrite methods returning false, decoupling pattern rewriting from compilation logic.
  • QNN AOT Lowering Infrastructure (mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp, mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp)
    Introduced a variadic template helper registerPatterns() to bulk-register multiple pattern types. Migrated from a single QnnAOTAddPattern to consolidated registration of Embedding, CastType, Add, View, and Index patterns.
  • QNN AOT CastType Pattern (mllm/backends/qnn/aot/visitor/CastType.hpp, mllm/backends/qnn/aot/visitor/CastType.cpp)
    New pattern to convert linalg::CastTypeOp into QNN operations (Quantize, Dequantize, Cast). Validates qnn_graph_name/qnn_context_name attributes and registers the QNN node.
  • QNN AOT Embedding Pattern (mllm/backends/qnn/aot/visitor/Embedding.hpp, mllm/backends/qnn/aot/visitor/Embedding.cpp)
    New pattern to convert linalg::EmbeddingOp into QNN Gather operations. Extracts weight and indices tensors, creates a Gather node with an axis parameter, and registers it in the AOT context.
  • QNN AOT Index Pattern (mllm/backends/qnn/aot/visitor/Index.hpp, mllm/backends/qnn/aot/visitor/Index.cpp)
    New pattern to convert linalg::IndexOp into QNN Gather operations. Infers the axis from the indices, constructs the indices tensor, and registers the operation with an axis parameter.
  • QNN AOT View Pattern (mllm/backends/qnn/aot/visitor/View.hpp, mllm/backends/qnn/aot/visitor/View.cpp)
    New pattern to convert linalg::ViewOp into QNN Reshape operations. Computes the output shape, builds a shape tensor, and registers the Reshape node.
  • QNN AOT Elewise Patterns (mllm/backends/qnn/aot/visitor/Elewise.hpp, mllm/backends/qnn/aot/visitor/Elewise.cpp)
    Renamed the QnnAOTAddPattern method from compile to rewrite. Added a new QnnAOTMulPattern with an identical structure for ElementWiseMultiply operations.
  • Sigmoid Layer (mllm/nn/layers/Sigmoid.hpp, mllm/nn/layers/Sigmoid.cpp, mllm/nn/Nn.hpp)
    New Sigmoid layer implementation with default and options-based constructors, exported via the Nn.hpp public interface. Includes in-place operation support via macros.
  • Sigmoid Op Registration (mllm/compile/ir/linalg/Op.hpp, mllm/compile/ir/linalg/Op.cpp)
    Moved the SigmoidOp registration from the CustomizedOps block to the main LINALG operator definitions block for consistency.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • liang1232018
  • chenghuaWang
  • yirongjie

Poem

🐰 Five patterns hop into the QNN,
Embedding, Index, View—quite the keen bunch!
Sigmoid learns to curve with graceful flair,
While tensors dance free, unregistered in air.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)

  • Description check (⚠️ Warning): The PR description only contains the template boilerplate, with no actual content describing the changes, objectives, or rationale for the modifications. Resolution: Replace the template text with a substantive description of the changes, including what patterns/features were added, why they were added, and any relevant implementation details or testing information.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. Resolution: Run @coderabbitai generate docstrings to improve docstring coverage.
  • Title check (❓ Inconclusive): The title 'Qnn aot' is too vague and generic, not clearly describing the primary changes or scope of the pull request. Resolution: Provide a more descriptive title that captures the main changes, such as 'Add QNN AOT patterns and SigmoidOp support' or 'Refactor QNN AOT lowering with multiple new patterns'.
✨ Finishing touches
  • 📝 Generate docstrings



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Fix all issues with AI Agents 🤖
In @mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp:
- Line 24: registerPatterns is missing QnnAOTMulPattern registration even though
QnnAOTMulPattern is implemented (in Elewise.hpp) like QnnAOTAddPattern; update
the registerPatterns template instantiation to include QnnAOTMulPattern
alongside QnnAOTAddPattern (so the list becomes QnnAOTEmbeddingPattern,
QnnAOTCastTypePattern, QnnAOTAddPattern, QnnAOTMulPattern, QnnAOTViewPattern,
QnnAOTIndexPattern) to ensure multiplication ops are lowered by
LLM2QnnLoweringPass.

In @mllm/backends/qnn/aot/visitor/Embedding.cpp:
- Around line 34-35: The lookup of the weight symbol via
writer.getContext()->lookupSymbolTable(a_op->getName() + ".weight") can return
nullptr; validate the returned pointer before calling outputs().front() (the
code that assigns to variable weight). If lookup returns nullptr, handle it
(e.g., log or raise an informative error including a_op->getName(), or
return/throw) to avoid null dereference; only call
outputs().front()->cast_<ir::tensor::TensorValue>() when the symbol table
pointer is non-null. Ensure the error path provides clear context about the
missing "<opName>.weight" symbol.
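The guarded lookup this comment asks for can be sketched as follows. `Context`, `SymbolEntry`, `TensorValue`, and `getEmbeddingWeight` are simplified stand-ins for mllm's actual IR types, not the real API; the point is only the shape of the null check and error message.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Simplified stand-ins for mllm's symbol table; names are illustrative only.
struct TensorValue {};
struct SymbolEntry { std::shared_ptr<TensorValue> tensor; };

struct Context {
  std::unordered_map<std::string, SymbolEntry> table;
  // Mirrors lookupSymbolTable: returns nullptr when the symbol is absent.
  SymbolEntry* lookupSymbolTable(const std::string& name) {
    auto it = table.find(name);
    return it == table.end() ? nullptr : &it->second;
  }
};

// Validate the lookup before dereferencing, and name the missing
// "<opName>.weight" symbol in the error message.
std::shared_ptr<TensorValue> getEmbeddingWeight(Context& ctx, const std::string& op_name) {
  SymbolEntry* entry = ctx.lookupSymbolTable(op_name + ".weight");
  if (entry == nullptr) {
    throw std::runtime_error("missing symbol: " + op_name + ".weight");
  }
  return entry->tensor;
}
```

In the real pattern the error path would use mllm's logging/assertion macros rather than an exception; the essential change is that `outputs().front()` is only reached when the lookup succeeded.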

In @mllm/backends/qnn/aot/visitor/View.cpp:
- Around line 38-55: The computed but unused shape variables should be removed:
delete the local variables `shape_data` and `shape_tensor_shape` and the loop
that fills `shape_data` (the for-loop over `output_shape`) in View.cpp so only
the output shape query remains; keep the QNN Reshape op creation
(`QnnAOTNodeOperation::create("Reshape")`), package name set, inputs/outputs via
`env->captureQnnAOTNodeTensor(...)`, `setName(view_op->getAOp()->getName())`,
and the final `env->captureAOTNodeOp(...)` call unchanged.
🧹 Nitpick comments (4)
mllm/backends/qnn/aot/visitor/CastType.cpp (1)

15-25: Consider using a switch statement or lookup for maintainability.

The isInt function has a long chain of OR conditions covering all integer and quantized types. While correct, a more maintainable approach could use a switch statement or a set-based lookup.

🔎 Alternative implementation
 static bool isFloat(DataTypes dtype) { return dtype == kFloat32 || dtype == kFloat16; }
 
 static bool isInt(DataTypes dtype) {
-  return dtype == kInt8 || dtype == kInt16 || dtype == kInt32 || dtype == kUInt8 || dtype == kUInt16 || dtype == kUInt32
-         || dtype == kInt8PerTensorSym || dtype == kInt8PerChannelSym || dtype == kUInt8PerTensorSym
-         || dtype == kUInt8PerChannelSym || dtype == kInt16PerTensorSym || dtype == kInt16PerChannelSym
-         || dtype == kUInt16PerTensorSym || dtype == kUInt16PerChannelSym || dtype == kInt8PerTensorAsy
-         || dtype == kInt8PerChannelAsy || dtype == kUInt8PerTensorAsy || dtype == kUInt8PerChannelAsy
-         || dtype == kInt16PerTensorAsy || dtype == kInt16PerChannelAsy || dtype == kUInt16PerTensorAsy
-         || dtype == kUInt16PerChannelAsy;
+  switch (dtype) {
+    case kInt8: case kInt16: case kInt32:
+    case kUInt8: case kUInt16: case kUInt32:
+    case kInt8PerTensorSym: case kInt8PerChannelSym:
+    case kUInt8PerTensorSym: case kUInt8PerChannelSym:
+    case kInt16PerTensorSym: case kInt16PerChannelSym:
+    case kUInt16PerTensorSym: case kUInt16PerChannelSym:
+    case kInt8PerTensorAsy: case kInt8PerChannelAsy:
+    case kUInt8PerTensorAsy: case kUInt8PerChannelAsy:
+    case kInt16PerTensorAsy: case kInt16PerChannelAsy:
+    case kUInt16PerTensorAsy: case kUInt16PerChannelAsy:
+      return true;
+    default:
+      return false;
+  }
 }
mllm/compile/ir/linalg/Op.hpp (1)

31-31: Duplicate forward declaration of SigmoidOp.

SigmoidOp is forward-declared both at line 31 and line 83. The duplicate at line 83 appears to be a remnant. Consider removing one to avoid redundancy.

🔎 Proposed fix

Remove line 83:

 class WhereOp;
-class SigmoidOp;
 }  // namespace mllm

Also applies to: 83-83

mllm/backends/qnn/aot/visitor/Index.cpp (1)

47-76: Axis detection logic is correct but complex—consider a brief inline comment.

The loop correctly identifies the gather axis by skipping "select all" slices while incrementing the axis counter, then breaks upon finding vector or tensor indices. The early exit and error handling for multiple axes are well implemented.

A brief comment explaining the axis-counting strategy could improve readability for future maintainers.
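The axis-counting strategy described above can be sketched as a standalone function. `IndexKind` and `inferGatherAxis` are hypothetical names for illustration; the actual pass inspects IR values rather than an enum.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical classification of each index expression in an IndexOp.
enum class IndexKind { SelectAll, Vector, Tensor };

// Each leading "select all" slice advances the candidate axis; the first
// vector/tensor index fixes the gather axis, and a second one is rejected
// because QNN Gather supports a single axis.
int inferGatherAxis(const std::vector<IndexKind>& indices) {
  int axis = -1;
  int current = 0;
  for (IndexKind kind : indices) {
    if (kind == IndexKind::SelectAll) {
      ++current;  // this dimension is passed through unchanged
      continue;
    }
    if (axis != -1) throw std::runtime_error("multiple gather axes are unsupported");
    axis = current;
    ++current;
  }
  if (axis == -1) throw std::runtime_error("no gather index found");
  return axis;
}
```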

mllm/backends/qnn/aot/visitor/Elewise.cpp (1)

48-81: LGTM with optional refactor opportunity.

QnnAOTMulPattern::rewrite correctly implements element-wise multiplication support using ElementWiseMultiply. The implementation mirrors AddPattern which is acceptable for clarity.

Optional: Consider extracting a shared helper for the common boilerplate (env retrieval, attribute validation, input/output extraction, node registration) if more element-wise patterns are added in the future.
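The suggested extraction could look roughly like this. `Env`, `Node`, and `rewriteElementWise` are simplified stand-ins, not mllm's real QNN AOT types; the idea is that only the QNN op name differs between the Add and Mul rewrites.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins for the real QNN AOT types; names here are illustrative.
struct Node {
  std::string qnn_op;  // e.g. "ElementWiseAdd" or "ElementWiseMultiply"
  std::string name;
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

struct Env { std::vector<Node> nodes; };

// Shared helper for the boilerplate both rewrites repeat: validate the
// operands, wire inputs/outputs, and register the node with the env.
bool rewriteElementWise(Env& env, const std::string& qnn_op,
                        const std::string& node_name,
                        const std::vector<std::string>& inputs,
                        const std::string& output) {
  if (inputs.size() != 2) return false;  // element-wise binary ops only
  env.nodes.push_back(Node{qnn_op, node_name, inputs, {output}});
  return true;
}
```

Each pattern's `rewrite` would then reduce to one call, e.g. `rewriteElementWise(env, "ElementWiseMultiply", ...)`, keeping attribute validation and error handling in one place.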

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ada683 and 18717f8.

📒 Files selected for processing (19)
  • mllm/backends/qnn/aot/QnnWrappersAPI.cpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp
  • mllm/backends/qnn/aot/visitor/Base.hpp
  • mllm/backends/qnn/aot/visitor/CastType.cpp
  • mllm/backends/qnn/aot/visitor/CastType.hpp
  • mllm/backends/qnn/aot/visitor/Elewise.cpp
  • mllm/backends/qnn/aot/visitor/Elewise.hpp
  • mllm/backends/qnn/aot/visitor/Embedding.cpp
  • mllm/backends/qnn/aot/visitor/Embedding.hpp
  • mllm/backends/qnn/aot/visitor/Index.cpp
  • mllm/backends/qnn/aot/visitor/Index.hpp
  • mllm/backends/qnn/aot/visitor/View.cpp
  • mllm/backends/qnn/aot/visitor/View.hpp
  • mllm/compile/ir/linalg/Op.cpp
  • mllm/compile/ir/linalg/Op.hpp
  • mllm/nn/Nn.hpp
  • mllm/nn/layers/Sigmoid.cpp
  • mllm/nn/layers/Sigmoid.hpp
💤 Files with no reviewable changes (1)
  • mllm/backends/qnn/aot/QnnWrappersAPI.cpp
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).

Files:

  • mllm/backends/qnn/aot/visitor/Embedding.hpp
  • mllm/backends/qnn/aot/visitor/CastType.hpp
  • mllm/backends/qnn/aot/visitor/View.cpp
  • mllm/backends/qnn/aot/visitor/Elewise.hpp
  • mllm/backends/qnn/aot/visitor/Index.hpp
  • mllm/backends/qnn/aot/visitor/Elewise.cpp
  • mllm/backends/qnn/aot/visitor/Base.hpp
  • mllm/backends/qnn/aot/visitor/Embedding.cpp
  • mllm/backends/qnn/aot/visitor/CastType.cpp
  • mllm/backends/qnn/aot/visitor/View.hpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp
  • mllm/backends/qnn/aot/visitor/Index.cpp
  • mllm/nn/layers/Sigmoid.cpp
  • mllm/compile/ir/linalg/Op.hpp
  • mllm/nn/Nn.hpp
  • mllm/compile/ir/linalg/Op.cpp
  • mllm/nn/layers/Sigmoid.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.

Files:

  • mllm/backends/qnn/aot/visitor/Embedding.hpp
  • mllm/backends/qnn/aot/visitor/CastType.hpp
  • mllm/backends/qnn/aot/visitor/View.cpp
  • mllm/backends/qnn/aot/visitor/Elewise.hpp
  • mllm/backends/qnn/aot/visitor/Index.hpp
  • mllm/backends/qnn/aot/visitor/Elewise.cpp
  • mllm/backends/qnn/aot/visitor/Base.hpp
  • mllm/backends/qnn/aot/visitor/Embedding.cpp
  • mllm/backends/qnn/aot/visitor/CastType.cpp
  • mllm/backends/qnn/aot/visitor/View.hpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp
  • mllm/backends/qnn/aot/visitor/Index.cpp
  • mllm/nn/layers/Sigmoid.cpp
  • mllm/compile/ir/linalg/Op.hpp
  • mllm/nn/Nn.hpp
  • mllm/compile/ir/linalg/Op.cpp
  • mllm/nn/layers/Sigmoid.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).

Files:

  • mllm/backends/qnn/aot/visitor/Embedding.hpp
  • mllm/backends/qnn/aot/visitor/CastType.hpp
  • mllm/backends/qnn/aot/visitor/View.cpp
  • mllm/backends/qnn/aot/visitor/Elewise.hpp
  • mllm/backends/qnn/aot/visitor/Index.hpp
  • mllm/backends/qnn/aot/visitor/Elewise.cpp
  • mllm/backends/qnn/aot/visitor/Base.hpp
  • mllm/backends/qnn/aot/visitor/Embedding.cpp
  • mllm/backends/qnn/aot/visitor/CastType.cpp
  • mllm/backends/qnn/aot/visitor/View.hpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp
  • mllm/backends/qnn/aot/visitor/Index.cpp
  • mllm/nn/layers/Sigmoid.cpp
  • mllm/compile/ir/linalg/Op.hpp
  • mllm/nn/Nn.hpp
  • mllm/compile/ir/linalg/Op.cpp
  • mllm/nn/layers/Sigmoid.hpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.

Files:

  • mllm/backends/qnn/aot/visitor/View.cpp
  • mllm/backends/qnn/aot/visitor/Elewise.cpp
  • mllm/backends/qnn/aot/visitor/Embedding.cpp
  • mllm/backends/qnn/aot/visitor/CastType.cpp
  • mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp
  • mllm/backends/qnn/aot/visitor/Index.cpp
  • mllm/nn/layers/Sigmoid.cpp
  • mllm/compile/ir/linalg/Op.cpp
🧬 Code graph analysis (12)
mllm/backends/qnn/aot/visitor/Embedding.hpp (1)
mllm/backends/qnn/aot/visitor/Base.hpp (8)
  • op (17-17)
  • op (17-17)
  • op (26-26)
  • op (26-26)
  • writer (19-19)
  • writer (19-19)
  • writer (28-28)
  • writer (28-28)
mllm/backends/qnn/aot/visitor/CastType.hpp (5)
mllm/backends/qnn/aot/visitor/Base.hpp (8)
  • op (17-17)
  • op (17-17)
  • op (26-26)
  • op (26-26)
  • writer (19-19)
  • writer (19-19)
  • writer (28-28)
  • writer (28-28)
mllm/backends/qnn/aot/visitor/Elewise.hpp (4)
  • op (14-14)
  • op (25-25)
  • writer (16-16)
  • writer (27-27)
mllm/backends/qnn/aot/visitor/Embedding.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Index.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/View.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/View.cpp (9)
mllm/backends/qnn/aot/visitor/CastType.cpp (4)
  • isMatch (27-29)
  • isMatch (27-27)
  • rewrite (31-81)
  • rewrite (31-31)
mllm/backends/qnn/aot/visitor/Elewise.cpp (8)
  • isMatch (13-15)
  • isMatch (13-13)
  • isMatch (48-50)
  • isMatch (48-48)
  • rewrite (17-46)
  • rewrite (17-17)
  • rewrite (52-81)
  • rewrite (52-52)
mllm/backends/qnn/aot/visitor/Embedding.cpp (4)
  • isMatch (13-15)
  • isMatch (13-13)
  • rewrite (17-59)
  • rewrite (17-17)
mllm/backends/qnn/aot/visitor/Index.cpp (4)
  • isMatch (14-16)
  • isMatch (14-14)
  • rewrite (18-99)
  • rewrite (18-18)
mllm/backends/qnn/aot/visitor/Base.hpp (8)
  • op (17-17)
  • op (17-17)
  • op (26-26)
  • op (26-26)
  • writer (19-19)
  • writer (19-19)
  • writer (28-28)
  • writer (28-28)
mllm/backends/qnn/aot/visitor/Elewise.hpp (4)
  • op (14-14)
  • op (25-25)
  • writer (16-16)
  • writer (27-27)
mllm/backends/qnn/aot/visitor/Embedding.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Index.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/View.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Elewise.hpp (9)
mllm/backends/qnn/aot/visitor/Base.hpp (8)
  • writer (19-19)
  • writer (19-19)
  • writer (28-28)
  • writer (28-28)
  • op (17-17)
  • op (17-17)
  • op (26-26)
  • op (26-26)
mllm/backends/qnn/aot/visitor/CastType.hpp (2)
  • writer (16-16)
  • op (14-14)
mllm/backends/qnn/aot/visitor/Embedding.hpp (2)
  • writer (16-16)
  • op (14-14)
mllm/backends/qnn/aot/visitor/Index.hpp (2)
  • writer (16-16)
  • op (14-14)
mllm/backends/qnn/aot/visitor/View.hpp (2)
  • writer (16-16)
  • op (14-14)
mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp (1)
  • op (22-22)
mllm/backends/qnn/aot/passes/OpNamingPass.hpp (1)
  • op (17-17)
mllm/backends/qnn/aot/passes/MarkTensorIO.hpp (1)
  • op (17-17)
mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1)
  • op (17-17)
mllm/backends/qnn/aot/visitor/Elewise.cpp (2)
mllm/backends/qnn/aot/visitor/Base.hpp (8)
  • writer (19-19)
  • writer (19-19)
  • writer (28-28)
  • writer (28-28)
  • op (17-17)
  • op (17-17)
  • op (26-26)
  • op (26-26)
mllm/backends/qnn/aot/QnnWrappersAPI.hpp (3)
  • qnn_context_name (216-216)
  • qnn_context_name (218-219)
  • qnn_context_name (221-222)
mllm/backends/qnn/aot/visitor/Base.hpp (6)
mllm/backends/qnn/aot/visitor/CastType.hpp (1)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Elewise.hpp (2)
  • writer (16-16)
  • writer (27-27)
mllm/backends/qnn/aot/visitor/Embedding.hpp (1)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Index.hpp (1)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/View.hpp (1)
  • writer (16-16)
mllm/compile/ir/linalg/Attribute.hpp (6)
  • node (27-27)
  • node (27-27)
  • node (327-327)
  • node (327-327)
  • node (348-348)
  • node (348-348)
mllm/backends/qnn/aot/visitor/CastType.cpp (1)
mllm/backends/qnn/aot/visitor/CastType.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/View.hpp (6)
mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp (1)
  • op (22-22)
mllm/backends/qnn/aot/visitor/Base.hpp (8)
  • op (17-17)
  • op (17-17)
  • op (26-26)
  • op (26-26)
  • writer (19-19)
  • writer (19-19)
  • writer (28-28)
  • writer (28-28)
mllm/backends/qnn/aot/visitor/CastType.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Elewise.hpp (4)
  • op (14-14)
  • op (25-25)
  • writer (16-16)
  • writer (27-27)
mllm/backends/qnn/aot/visitor/Embedding.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Index.hpp (2)
  • op (14-14)
  • writer (16-16)
mllm/backends/qnn/aot/visitor/Index.cpp (3)
mllm/backends/qnn/aot/visitor/CastType.cpp (4)
  • isMatch (27-29)
  • isMatch (27-27)
  • rewrite (31-81)
  • rewrite (31-31)
mllm/backends/qnn/aot/visitor/Embedding.cpp (4)
  • isMatch (13-15)
  • isMatch (13-13)
  • rewrite (17-59)
  • rewrite (17-17)
mllm/backends/qnn/aot/visitor/View.cpp (4)
  • isMatch (14-16)
  • isMatch (14-14)
  • rewrite (18-58)
  • rewrite (18-18)
mllm/nn/layers/Sigmoid.cpp (1)
mllm/nn/layers/Sigmoid.hpp (2)
  • Sigmoid (13-13)
  • Sigmoid (15-15)
mllm/compile/ir/linalg/Op.hpp (2)
mllm/core/aops/SigmoidOp.cpp (1)
  • SigmoidOp (12-12)
mllm/core/aops/SigmoidOp.hpp (1)
  • SigmoidOp (15-15)
mllm/compile/ir/linalg/Op.cpp (2)
mllm/core/aops/SigmoidOp.cpp (1)
  • SigmoidOp (12-12)
mllm/core/aops/SigmoidOp.hpp (1)
  • SigmoidOp (15-15)
🪛 Clang (14.0.6)
mllm/backends/qnn/aot/visitor/Embedding.hpp

[error] 14-14: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 16-16: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)

mllm/backends/qnn/aot/visitor/CastType.hpp

[error] 14-14: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 16-16: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)

mllm/backends/qnn/aot/visitor/Elewise.hpp

[error] 16-16: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 25-25: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 27-27: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)

mllm/backends/qnn/aot/visitor/Index.hpp

[error] 14-14: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 16-16: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)

mllm/backends/qnn/aot/visitor/Base.hpp

[error] 19-19: method 'rewrite' can be made static

(readability-convert-member-functions-to-static,-warnings-as-errors)

mllm/backends/qnn/aot/visitor/View.hpp

[error] 14-14: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 16-16: parameter name 'op' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)

mllm/compile/ir/linalg/Op.hpp

[error] 190-190: parameter name 'p' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 190-190: class 'SigmoidOp' defines a non-default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator

(cppcoreguidelines-special-member-functions,-warnings-as-errors)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-android
  • GitHub Check: build-macos
  • GitHub Check: build-x86
🔇 Additional comments (25)
mllm/backends/qnn/aot/visitor/Embedding.cpp (2)

13-15: LGTM!

The pattern matching logic correctly identifies EmbeddingOp with the using_qnn attribute.


17-29: LGTM!

The validation logic is thorough, with proper error handling for invalid casts and missing attributes.

mllm/backends/qnn/aot/visitor/Embedding.hpp (1)

12-21: LGTM!

The class design follows the established pattern for QNN AOT visitors, with consistent isMatch, rewrite, and factory create() methods. The structure aligns well with other pattern classes in the codebase.

mllm/backends/qnn/aot/visitor/CastType.cpp (3)

27-29: LGTM!

The pattern matching correctly identifies CastTypeOp with the using_qnn attribute.


31-58: LGTM!

The validation and setup logic is thorough, with proper error handling for casts and missing attributes.


59-69: Verify coverage of all required cast types.

The current logic supports:

  • Float → Int (Quantize)
  • Int → Float (Dequantize)
  • Float32 → Float16 (Cast)

However, it does not handle:

  • Float16 → Float32 conversions
  • Int → Int conversions (e.g., int8 → int32)

If these conversions are expected in the pipeline, they should be handled. If they are intentionally unsupported, consider adding a comment explaining why.

Additionally, the error message on Line 67 logs dtype values as integers. Consider logging the enum names or symbolic representations for better debuggability.
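The branch structure being described, including the unhandled gap, can be sketched with a reduced dtype set. The enum values here are a small illustrative subset of the real `DataTypes`; `classifyCast` is a hypothetical name.

```cpp
#include <cassert>
#include <stdexcept>

// Reduced dtype set for illustration; the real enum has many more members.
enum class DataTypes { kFloat32, kFloat16, kInt8, kInt32 };
enum class QnnCastOp { Quantize, Dequantize, Cast };

static bool isFloat(DataTypes d) { return d == DataTypes::kFloat32 || d == DataTypes::kFloat16; }
static bool isInt(DataTypes d) { return d == DataTypes::kInt8 || d == DataTypes::kInt32; }

// Dispatch as described: float->int lowers to Quantize, int->float to
// Dequantize, fp32->fp16 to Cast. fp16->fp32 and int->int conversions
// fall through to an error, matching the gap the review points out.
QnnCastOp classifyCast(DataTypes from, DataTypes to) {
  if (isFloat(from) && isInt(to)) return QnnCastOp::Quantize;
  if (isInt(from) && isFloat(to)) return QnnCastOp::Dequantize;
  if (from == DataTypes::kFloat32 && to == DataTypes::kFloat16) return QnnCastOp::Cast;
  throw std::invalid_argument("unsupported cast");  // e.g. fp16->fp32, int->int
}
```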

mllm/backends/qnn/aot/visitor/Base.hpp (1)

19-19: LGTM!

The refactoring from a compile-based to rewrite-based API is consistent with the new pattern implementations. The default no-op implementations allow derived classes to selectively override behavior.

Also applies to: 28-28
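The shape of this compile-to-rewrite refactor is roughly the following. Class and method names mirror the summary, but the signatures are simplified placeholders, not the real ones.

```cpp
#include <cassert>

struct Op {};  // placeholder for the IR op node

// Instead of a pure virtual compile(), the base class provides a no-op
// rewrite() returning false, so derived patterns only override the ops
// they actually handle.
struct QnnAOTBasePattern {
  virtual ~QnnAOTBasePattern() = default;
  virtual bool rewrite(Op& /*op*/) { return false; }  // default: not handled
};

struct QnnAOTAddPattern : QnnAOTBasePattern {
  bool rewrite(Op& /*op*/) override { return true; }  // pattern applied
};
```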

mllm/backends/qnn/aot/visitor/CastType.hpp (1)

12-21: LGTM!

The class design follows the established pattern for QNN AOT visitors and is consistent with the other pattern classes (Embedding, Index, View). The factory method enables proper registration in the lowering pass.

mllm/nn/Nn.hpp (1)

13-13: LGTM!

The Sigmoid layer inclusion is correctly placed in alphabetical order and follows the existing pattern with the IWYU pragma export comment.

mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp (1)

25-28: LGTM! Clean template-based refactor.

The variadic template helper with fold expression is a clean approach to bulk pattern registration. This eliminates repetitive initialization code while maintaining type safety.
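A minimal sketch of such a fold-expression helper is shown below. The surrounding types are stand-ins (the real pass presumably stores patterns via their `create()` factories rather than `make_unique` directly).

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Pattern { virtual ~Pattern() = default; };
struct EmbeddingPattern : Pattern {};
struct AddPattern : Pattern {};
struct ViewPattern : Pattern {};

struct LoweringPass {
  std::vector<std::unique_ptr<Pattern>> patterns_;

  // C++17 fold expression over the comma operator: one call instantiates
  // and stores every listed pattern type, replacing N manual insert lines.
  template <typename... Ps>
  void registerPatterns() {
    (patterns_.emplace_back(std::make_unique<Ps>()), ...);
  }
};
```

Adding a new pattern (e.g. the missing Mul) then only requires appending one type to the template argument list.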

mllm/compile/ir/linalg/Op.cpp (1)

57-57: LGTM!

The Sigmoid operation declaration is correctly placed alongside other activation functions (after SiLU) and properly integrates with the LINALG operation set.

mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp (1)

14-19: LGTM! Clean includes for new visitor patterns.

The new includes are properly organized and support the expanded pattern registration.

mllm/nn/layers/Sigmoid.cpp (1)

9-11: LGTM!

The Sigmoid layer constructors are correctly implemented, properly delegating to the base Layer class with the appropriate operation type and options.

mllm/compile/ir/linalg/Op.hpp (1)

190-190: LGTM!

The SigmoidOp registration via LINALG_AOPS_DEFINE follows the established pattern used by other activation operations (e.g., SiLUOp at line 189). The static analysis warnings about short parameter names and missing special member functions are false positives—these are macro-generated and consistent with other operators in this file.

mllm/nn/layers/Sigmoid.hpp (1)

1-21: LGTM!

The Sigmoid layer follows the established pattern for NN layers in this codebase, using the standard macros MLLM_LAYER_ANY_INPUTS_1_OUTPUTS_FORWARD and MLLM_LAYER_ENABLE_INPLACE_ATTRIBUTE. The dual constructor approach (default and options-based) provides flexibility consistent with other layers.

mllm/backends/qnn/aot/visitor/Index.hpp (1)

12-21: LGTM!

The QnnAOTIndexPattern follows the established visitor pattern structure with isMatch, rewrite, and create() factory method. The static analysis warnings about short parameter names (op) can be safely ignored—this naming convention is consistent across all QNN AOT visitors (Base.hpp, Embedding.hpp, CastType.hpp, View.hpp, Elewise.hpp).

mllm/backends/qnn/aot/visitor/View.hpp (1)

12-21: LGTM!

The QnnAOTViewPattern mirrors the structure of other AOT patterns. The static analysis warning about the op parameter name is a false positive given the codebase convention.

mllm/backends/qnn/aot/visitor/Index.cpp (2)

14-16: LGTM!

The isMatch implementation correctly checks for linalg::IndexOp with the using_qnn attribute, following the same pattern as other AOT visitors.


18-99: LGTM!

The rewrite implementation properly:

  • Validates required attributes (qnn_graph_name, qnn_context_name)
  • Handles casting failures with appropriate error messages
  • Constructs the QNN Gather node with correct input/output wiring
  • Attaches the axis scalar parameter

The error handling covers edge cases (missing indices, multiple axes) appropriately.

mllm/backends/qnn/aot/visitor/Elewise.hpp (2)

16-16: API change from compile to rewrite aligns with base class.

The method rename aligns with QnnAOTBasePattern::rewrite in Base.hpp (line 18). This is a consistent API change.


23-32: LGTM!

QnnAOTMulPattern follows the established structure of QnnAOTAddPattern, providing symmetric support for element-wise multiplication. The static factory method correctly returns OpTypes::kMul.

mllm/backends/qnn/aot/visitor/Elewise.cpp (2)

17-46: LGTM!

The method rename from compile to rewrite aligns with the base class interface. The implementation correctly creates an ElementWiseAdd QNN node with proper input/output wiring.


20-20: quant_recipe attribute is validated but not used.

Both AddPattern (line 20) and MulPattern (line 55) validate the presence of quant_recipe via MLLM_RETURN_FALSE_IF_NOT, but neither uses the value. If this is intentional gating (ensuring the op is properly marked for quantization), consider adding a brief comment. Otherwise, the check may be vestigial.

Also applies to: 55-55

mllm/backends/qnn/aot/visitor/View.cpp (2)

14-16: LGTM!

The isMatch implementation correctly identifies ViewOp operations marked for QNN processing, consistent with other AOT visitor patterns.


18-58: Overall structure is well-implemented.

The rewrite method follows the established pattern from other AOT visitors:

  • Proper error handling with early returns (lines 21-25, 27-30)
  • Consistent attribute validation using MLLM_RETURN_FALSE_IF_NOT macros
  • Correct input/output extraction and type casting
  • Proper method chaining for QNN operation configuration
  • Correct registration with the AOT context

Once the shape specification issue is resolved, this implementation will be consistent with the rest of the codebase.

// TODO reg other patterns here.
registerPatterns<QnnAOTEmbeddingPattern, QnnAOTCastTypePattern, QnnAOTAddPattern, QnnAOTViewPattern, QnnAOTIndexPattern>();
Contributor
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Search for QnnAOTMulPattern class definition
echo "=== Searching for QnnAOTMulPattern class definition ==="
ast-grep --pattern $'class QnnAOTMulPattern $$$' 2>/dev/null || rg -n 'class QnnAOTMulPattern' --type=cpp --type=hpp

# Search for Mul pattern references
echo -e "\n=== Searching for Mul pattern references ==="
rg -nC2 'QnnAOTMulPattern|MulPattern' --type=cpp --type=hpp | head -50

# Check the actual file at line 24
echo -e "\n=== Checking LLM2QnnLoweringPass.cpp around line 24 ==="
fd -e cpp 'LLM2QnnLoweringPass' --exec cat -n {} \; | head -50

# Search for all registerPatterns calls
echo -e "\n=== Searching for all registerPatterns calls ==="
rg -nC5 'registerPatterns' --type=cpp --type=hpp

Repository: UbiquitousLearning/mllm

Length of output: 2650


🏁 Script executed:

#!/bin/bash

# Search for QnnAOTMulPattern without file type restrictions
echo "=== Searching for QnnAOTMulPattern ==="
rg -r 'QnnAOTMulPattern' --max-count=20

# Look for Elewise visitor files
echo -e "\n=== Looking for Elewise visitor files ==="
fd 'Elewise' -type f

# Check Elewise.hpp if it exists
echo -e "\n=== Checking Elewise.hpp content ==="
find . -name 'Elewise.hpp' -type f -exec head -100 {} \;

# Search for MulPattern or Mul handler registration
echo -e "\n=== Searching for Mul handler in visitor files ==="
rg -n 'Mul' mllm/backends/qnn/aot/visitor/ 2>/dev/null || true

Repository: UbiquitousLearning/mllm

Length of output: 2779


🏁 Script executed:

#!/bin/bash

# Check if Elewise.hpp is included and if MulPattern import is there
echo "=== Checking LLM2QnnLoweringPass.hpp ==="
fd 'LLM2QnnLoweringPass.hpp' -type f -exec cat {} \;

# Search for any other pattern registrations
echo -e "\n=== Searching for other registerPatterns calls ==="
rg -n 'registerPatterns|QnnAOTMulPattern' mllm/backends/qnn/aot/passes/

# Check if there's a reason MulPattern might be excluded
echo -e "\n=== Checking git history or comments around line 24 ==="
rg -B5 -A5 'registerPatterns' mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp

Repository: UbiquitousLearning/mllm

Length of output: 1063


🏁 Script executed:

#!/bin/bash

# Get the full content of LLM2QnnLoweringPass.hpp to understand registerPatterns
echo "=== Full LLM2QnnLoweringPass.hpp ==="
fd 'LLM2QnnLoweringPass.hpp' -type f -exec cat -n {} \;

# Check what patterns are imported in the .cpp file
echo -e "\n=== Checking all includes in LLM2QnnLoweringPass.cpp ==="
head -20 mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp

# Verify Elewise.hpp exports both patterns
echo -e "\n=== Verifying Elewise.hpp exports ==="
fd 'Elewise.hpp' -type f -exec cat -n {} \; | head -40

Repository: UbiquitousLearning/mllm

Length of output: 1264


🏁 Script executed:

#!/bin/bash

# Find the files first
echo "=== Finding header files ==="
find mllm/backends/qnn/aot/passes -name "*.hpp" -o -name "*.h"
find mllm/backends/qnn/aot/visitor -name "*.hpp" -o -name "*.h"

# Read LLM2QnnLoweringPass.hpp directly
echo -e "\n=== Content of LLM2QnnLoweringPass.hpp ==="
cat mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.hpp

# Read Elewise.hpp directly
echo -e "\n=== Content of Elewise.hpp (full) ==="
cat mllm/backends/qnn/aot/visitor/Elewise.hpp

# Check if MulPattern is mentioned anywhere in the registration context
echo -e "\n=== Search for MulPattern in all visitor files ==="
find mllm/backends/qnn/aot/visitor -type f \( -name "*.hpp" -o -name "*.cpp" \) -exec grep -l "Mul" {} \;

Repository: UbiquitousLearning/mllm

Length of output: 3516


Add QnnAOTMulPattern to the registered patterns list.

QnnAOTMulPattern is fully defined and implemented in Elewise.hpp (matching the structure of QnnAOTAddPattern), but it is missing from the registerPatterns call on line 24. Since QnnAOTAddPattern is registered, QnnAOTMulPattern should be registered as well for consistency and to handle multiplication operations through the QNN backend.

Suggested change
registerPatterns<QnnAOTEmbeddingPattern, QnnAOTCastTypePattern, QnnAOTAddPattern, QnnAOTMulPattern, QnnAOTViewPattern, QnnAOTIndexPattern>();
🤖 Prompt for AI Agents
In @mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp around line 24,
registerPatterns is missing QnnAOTMulPattern registration even though
QnnAOTMulPattern is implemented (in Elewise.hpp) like QnnAOTAddPattern; update
the registerPatterns template instantiation to include QnnAOTMulPattern
alongside QnnAOTAddPattern (so the list becomes QnnAOTEmbeddingPattern,
QnnAOTCastTypePattern, QnnAOTAddPattern, QnnAOTMulPattern, QnnAOTViewPattern,
QnnAOTIndexPattern) to ensure multiplication ops are lowered by
LLM2QnnLoweringPass.

Comment on lines +34 to +35
auto weight =
writer.getContext()->lookupSymbolTable(a_op->getName() + ".weight")->outputs().front()->cast_<ir::tensor::TensorValue>();
Contributor
⚠️ Potential issue | 🔴 Critical

Add error handling for symbol table lookup.

The symbol table lookup could return nullptr if the weight symbol is not found, leading to a null pointer dereference when calling outputs().front(). Add validation to ensure the symbol exists before accessing its outputs.

🔎 Proposed fix
-  auto weight =
-      writer.getContext()->lookupSymbolTable(a_op->getName() + ".weight")->outputs().front()->cast_<ir::tensor::TensorValue>();
+  auto weight_symbol = writer.getContext()->lookupSymbolTable(a_op->getName() + ".weight");
+  if (!weight_symbol) {
+    MLLM_ERROR("Failed to find weight symbol for embedding: {}.weight", a_op->getName());
+    return false;
+  }
+  auto weight = weight_symbol->outputs().front()->cast_<ir::tensor::TensorValue>();
🤖 Prompt for AI Agents
In @mllm/backends/qnn/aot/visitor/Embedding.cpp around lines 34-35, The lookup
of the weight symbol via writer.getContext()->lookupSymbolTable(a_op->getName()
+ ".weight") can return nullptr; validate the returned pointer before calling
outputs().front() (the code that assigns to variable weight). If lookup returns
nullptr, handle it (e.g., log or raise an informative error including
a_op->getName(), or return/throw) to avoid null dereference; only call
outputs().front()->cast_<ir::tensor::TensorValue>() when the symbol table
pointer is non-null. Ensure the error path provides clear context about the
missing "<opName>.weight" symbol.

Comment on lines +38 to +55
// Create Shape Tensor
auto output_shape = output->tensor_.shape();
std::vector<int32_t> shape_data;
for (auto dim : output_shape) { shape_data.push_back(static_cast<int32_t>(dim)); }

// Shape tensor shape is [rank(output)]
std::vector<int32_t> shape_tensor_shape = {static_cast<int32_t>(shape_data.size())};

// Create QNN Reshape Op
auto qnn_op_node = QnnAOTNodeOperation::create("Reshape");
qnn_op_node->setPackageName("qti.aisw");

qnn_op_node->emplaceInput(env->captureQnnAOTNodeTensor(qnn_context_name, qnn_graph_name, input))
->emplaceOutput(env->captureQnnAOTNodeTensor(qnn_context_name, qnn_graph_name, output))
->setName(view_op->getAOp()->getName());

// Register this op node into one graph.
env->captureAOTNodeOp(qnn_context_name, qnn_graph_name, qnn_op_node);
Contributor
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, locate and read the View.cpp file
find . -name "View.cpp" -path "*/qnn/aot/visitor/*" | head -5

Repository: UbiquitousLearning/mllm

Length of output: 107


🏁 Script executed:

#!/bin/bash
# Search for QNN Reshape patterns and how other operations handle multiple inputs
rg -n "Reshape|RESHAPE" --type=cpp -A 15 -B 5 | head -100

Repository: UbiquitousLearning/mllm

Length of output: 9304


🏁 Script executed:

#!/bin/bash
# Look for patterns of how emplaceInput and shape tensors are used
rg -n "emplaceInput.*shape|shape.*emplaceInput" --type=cpp -B 5 -A 5 | head -80

Repository: UbiquitousLearning/mllm

Length of output: 49


🏁 Script executed:

#!/bin/bash
# Search for other similar visitor files to understand the pattern
fd -n "\.cpp$" -path "*/qnn/aot/visitor/*" | xargs wc -l | sort -n

Repository: UbiquitousLearning/mllm

Length of output: 316


🏁 Script executed:

#!/bin/bash
# Read the View.cpp file to see the exact code
wc -l mllm/backends/qnn/aot/visitor/View.cpp

Repository: UbiquitousLearning/mllm

Length of output: 108


🏁 Script executed:

#!/bin/bash
# Read the specific lines mentioned in the review
sed -n '30,60p' mllm/backends/qnn/aot/visitor/View.cpp

Repository: UbiquitousLearning/mllm

Length of output: 1186


🏁 Script executed:

#!/bin/bash
# Search for how other visitor files handle shape tensors and emplaceInput
find mllm/backends/qnn/aot/visitor -name "*.cpp" | xargs grep -l "emplaceInput.*Tensor\|shape_tensor"

Repository: UbiquitousLearning/mllm

Length of output: 274


🏁 Script executed:

#!/bin/bash
# Search for QNN Reshape operations more broadly
rg "QnnAOTNodeOperation.*Reshape|Reshape.*QnnAOTNodeOperation" --type=cpp -A 20 -B 5

Repository: UbiquitousLearning/mllm

Length of output: 1476


🏁 Script executed:

#!/bin/bash
# Look for Embedding and Index patterns mentioned in the review to compare
find mllm/backends/qnn/aot/visitor -name "*.cpp" | xargs grep -l "Embedding\|Index"

Repository: UbiquitousLearning/mllm

Length of output: 150


🏁 Script executed:

#!/bin/bash
# Read Embedding.cpp to see the pattern
cat -n mllm/backends/qnn/aot/visitor/Embedding.cpp

Repository: UbiquitousLearning/mllm

Length of output: 2813


🏁 Script executed:

#!/bin/bash
# Read Index.cpp to see the pattern
cat -n mllm/backends/qnn/aot/visitor/Index.cpp

Repository: UbiquitousLearning/mllm

Length of output: 4304


🏁 Script executed:

#!/bin/bash
# Search for other Reshape usages in QNN backend
find mllm/backends/qnn -name "*.cpp" -o -name "*.hpp" | xargs rg -i "reshape" -B 5 -A 10 | head -150

Repository: UbiquitousLearning/mllm

Length of output: 10133


🏁 Script executed:

#!/bin/bash
# Look for QNN Reshape documentation or wrappers
find mllm/backends/qnn -name "*.hpp" | xargs rg "Reshape" -B 3 -A 5

Repository: UbiquitousLearning/mllm

Length of output: 51


🏁 Script executed:

#!/bin/bash
# Check for any QNN API documentation or examples
find mllm/backends/qnn -name "*Reshape*" -o -name "*reshape*"

Repository: UbiquitousLearning/mllm

Length of output: 49


🏁 Script executed:

#!/bin/bash
# Look at the QnnWrappersAPI to understand op creation patterns
find mllm/backends/qnn/aot -name "*Wrapper*" -o -name "*API*" | xargs ls -la

Repository: UbiquitousLearning/mllm

Length of output: 248


🏁 Script executed:

#!/bin/bash
# Read the full View.cpp file to see context
cat -n mllm/backends/qnn/aot/visitor/View.cpp

Repository: UbiquitousLearning/mllm

Length of output: 2664


🏁 Script executed:

#!/bin/bash
# Check the QnnWrappersAPI to understand how Reshape nodes are created
head -100 mllm/backends/qnn/aot/QnnWrappersAPI.hpp

Repository: UbiquitousLearning/mllm

Length of output: 3715


🏁 Script executed:

#!/bin/bash
# Look for how the qnn_op_node is processed after creation
rg -n "captureAOTNodeOp" --type=cpp -B 3 -A 10 | head -80

Repository: UbiquitousLearning/mllm

Length of output: 5687


🏁 Script executed:

#!/bin/bash
# Search for any other Reshape patterns in the codebase
rg -i "reshape" mllm/backends/qnn --type=cpp --type=hpp -B 5 -A 10 | grep -A 15 "emplaceInput\|emplaceParam\|paramTensor\|paramScalar" | head -100

Repository: UbiquitousLearning/mllm

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Look for QNN documentation comments about Reshape
rg -i "reshape" mllm/backends/qnn/aot -B 3 -A 3 --type=cpp --type=hpp | grep -i "shape\|param"

Repository: UbiquitousLearning/mllm

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Check the entire QnnWrappersAPI.cpp to see how operations are finalized
sed -n '400,700p' mllm/backends/qnn/aot/QnnWrappersAPI.cpp

Repository: UbiquitousLearning/mllm

Length of output: 9376


🏁 Script executed:

#!/bin/bash
# Search for any documentation or comments about QNN Reshape behavior
rg -i "reshape.*identity\|identity.*reshape\|reshape.*shape\|reshape.*param" mllm/backends/qnn -A 3 -B 3 --type=cpp

Repository: UbiquitousLearning/mllm

Length of output: 49


🏁 Script executed:

#!/bin/bash
# Check if there are other AOT visitor patterns that compute but don't use variables
find mllm/backends/qnn/aot/visitor -name "*.cpp" | xargs rg -n "std::vector.*shape\|std::vector.*data" -A 5

Repository: UbiquitousLearning/mllm

Length of output: 51


🏁 Script executed:

#!/bin/bash
# Verify the exact context around the "identity reshape" comment
sed -n '70,90p' mllm/backends/qnn/op/QNNViewOp.cpp

Repository: UbiquitousLearning/mllm

Length of output: 886


🏁 Script executed:

#!/bin/bash
# Check the entire graphAddNode signature to understand parameters
rg "graphAddNode" mllm/backends/qnn -B 5 -A 5 --type=cpp | head -60

Repository: UbiquitousLearning/mllm

Length of output: 4080


🏁 Script executed:

#!/bin/bash
# Check how output shape is used in the QNN operations
rg "output.*shape\|output_shape" mllm/backends/qnn/aot/visitor -B 2 -A 2 --type=cpp

Repository: UbiquitousLearning/mllm

Length of output: 49


Remove unused shape computation variables.

Lines 38-44 compute shape_data and shape_tensor_shape, which are never used. QNN Reshape operations infer the target shape from the output tensor and do not require explicit shape parameters. Delete these unused variables.

- // Create Shape Tensor
- auto output_shape = output->tensor_.shape();
- std::vector<int32_t> shape_data;
- for (auto dim : output_shape) { shape_data.push_back(static_cast<int32_t>(dim)); }
-
- // Shape tensor shape is [rank(output)]
- std::vector<int32_t> shape_tensor_shape = {static_cast<int32_t>(shape_data.size())};
-
  // Create QNN Reshape Op

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @mllm/backends/qnn/aot/visitor/View.cpp around lines 38-55, The computed but
unused shape variables should be removed: delete the local variables
`shape_data` and `shape_tensor_shape` and the loop that fills `shape_data` (the
for-loop over `output_shape`) in View.cpp so only the output shape query
remains; keep the QNN Reshape op creation
(`QnnAOTNodeOperation::create("Reshape")`), package name set, inputs/outputs via
`env->captureQnnAOTNodeTensor(...)`, `setName(view_op->getAOp()->getName())`,
and the final `env->captureAOTNodeOp(...)` call unchanged.

Collaborator

@chenghuaWang chenghuaWang left a comment

LGTM
@chenghuaWang chenghuaWang merged commit 4fb397e into UbiquitousLearning:main Jan 6, 2026
4 checks passed
@chenghuaWang chenghuaWang changed the title Qnn aot feat(qualcomm): Qnn AOT Lowering pass Jan 6, 2026