feat(qualcomm): Qnn AOT Lowering Passes #584
Conversation
- Add QnnAOTLinearPattern implementation to handle LinearOp conversion to QNN FullyConnected operation
- Register Linear pattern in LLM2QnnLoweringPass
- Fix tensor access in setupComplexTensorQuantization by using correct index {i, 0, 0} instead of {i}
- Correct block axis calculation using fixed dimension index 1
- Update quantization spec creation with proper axis parameter
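The Linear pattern follows the same isMatch/rewrite visitor interface as the other QNN AOT patterns reviewed below; a rough sketch of that shape (the class body here is illustrative, not the PR's actual implementation):

```cpp
class QnnAOTLinearPattern {
 public:
  // Match LinearOp nodes that carry the QNN lowering attribute.
  bool isMatch(const mllm::ir::op_ptr_t& op);
  // Rewrite a matched LinearOp into a QNN FullyConnected node.
  bool rewrite(ir::IRWriter& writer, const ir::op_ptr_t& op);
};
```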
📝 Walkthrough

Adds Gather operation support end-to-end (core aops, functional API, CPU op) and extensive QNN AOT lowering/quantization plumbing; refactors QNN utilities and updates the qwen3 example to use nn::functional::gather for embedding lookups.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User Code
    participant NN as nn::functional::gather
    participant AOp as aops::GatherOp
    participant CPU as CPUGatherOp
    participant QNNLower as QNN AOT Lowering/Runtime
    Note over User,NN: High-level gather call
    User->>NN: gather(x, dim, indices)
    NN->>AOp: build/trace GatherOp(dim, inputs)
    alt runtime path: CPU execution
        AOp->>CPU: CPUGatherOp::forward(inputs)
        CPU-->>User: output tensor
    else runtime path: QNN AOT lowering
        AOp->>QNNLower: Lower to QNN Gather node (emit inputs, axis param)
        QNNLower-->>User: output prepared for QNN runtime
    end
```
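The qwen3 example's embedding lookup follows the top path of this diagram; a minimal usage sketch, assuming the gather(x, dim, indices) signature shown above (the tensor names are illustrative):

```cpp
// Select one embedding row per token id along dim 0 of the table.
auto hidden = mllm::nn::functional::gather(embedding_weight, /*dim=*/0, token_ids);
```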
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~35 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (2 warnings)
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
mllm/backends/qnn/QNNUtils.cpp (1)
306-328: Fix semantic inconsistency in kInt16 type mapping.

The mapping of `kInt16 → QNN_DATATYPE_UFIXED_POINT_16` is semantically incorrect. Signed integer types should not map to unsigned fixed-point types. This contradicts the quantized variant `kInt16PerTensorSym → QNN_DATATYPE_SFIXED_POINT_16` used elsewhere in the same function (line 377), which correctly maps signed to signed fixed-point. Change line 311 from `QNN_DATATYPE_UFIXED_POINT_16` to `QNN_DATATYPE_SFIXED_POINT_16` to maintain sign consistency.

The quantization parameter initialization for these types is properly handled via `createQuantizeParams()`, and scale propagation is implemented correctly throughout the codebase.
🤖 Fix all issues with AI agents
In @mllm/backends/cpu/ops/GatherOp.cpp:
- Around line 57-59: In GatherOp.cpp the loop silently skips copies when "idx <
0 || idx >= dim_size", leaving parts of the output uninitialized; change the
Gather implementation to zero-initialize the output buffer/tensor (the
destination written by the gather) before the loop and, when an index is
out-of-bounds, either log a warning including the offending "idx" and "dim_size"
or raise an explicit error instead of continuing; ensure you reference the same
variables ("idx", "dim_size") and the output buffer in your changes so the
zero-fill happens once before the loop and all invalid-index cases are handled
with a clear log/error.
In @mllm/backends/qnn/aot/QnnWrappersAPI.cpp:
- Around line 174-178: The blockwise expansion sets blockwise_expansion.axis =
cfg->ch_axis but incorrectly computes blockwise_expansion.numBlocksPerAxis using
v->tensor_.size(1); change the calculation to use the configured channel axis
like the other LPBQ fields (i.e., compute numBlocksPerAxis as
v->tensor_.size(cfg->ch_axis)) so the axis and block count are consistent when
cfg->ch_axis is not 1; update the code around Qnn_BlockwiseExpansion_t
blockwise_expansion and setBlockwiseQuantization usage to use
v->tensor_.size(cfg->ch_axis).
In @mllm/backends/qnn/aot/visitor/Concat.cpp:
- Around line 43-61: NormalizeAxis may produce an out-of-bounds or negative
value; after computing rank from first_input->tensor_.shape() and normalizing
axis (axis += rank), add explicit validation that 0 <= axis && axis < rank and
handle the error (throw an exception or log and abort) before proceeding, and
only then call QNNParamScalarWrapper::create("axis", (uint32_t)axis) to avoid
unsigned wraparound; update the Concat creation flow around
QnnAOTNodeOperation::create("Concat")/emplaceParamScalar to perform this check
early.
In @mllm/backends/qnn/aot/visitor/Gather.cpp:
- Around line 49-50: The dynamic_cast of gather_op->getAOp() to
mllm::aops::GatherOp* can return nullptr and is dereferenced immediately; change
the code to first assign the result to a local pointer (e.g., auto* gatherAOp =
dynamic_cast<mllm::aops::GatherOp*>(gather_op->getAOp())), check if gatherAOp is
nullptr, handle the error path (log via the existing logger, return/throw an
appropriate error, or abort the visitor) and only then read
gatherAOp->options().dim and call emplaceParamScalar, so you avoid undefined
behavior when the cast fails.
In @mllm/backends/qnn/aot/visitor/Linear.cpp:
- Around line 57-58: Ensure the tensor has at least 3 dimensions before
accessing shape()[0..2] in the reshape call: check the weight tensor's
rank/shape size (e.g., via tensor_.dim() or tensor_.shape().size()) in the code
around Linear.cpp where weight_val->tensor_ is being viewed, and if it is
smaller than 3 either return/throw a clear error (or log and skip) with context
(function/class name) or handle the alternative shape case; only perform the
view(...) when the validation passes to avoid out-of-bounds access.
In @mllm/backends/qnn/aot/visitor/Repeat.cpp:
- Line 45: Validate options.repeat_times before casting: check that int
repeat_times = options.repeat_times is > 0 (and optionally within a sane upper
bound) and return or throw a clear error if not; only then cast to uint32_t for
the QNN Tile call. Update both places where you cast repeat_times to uint32_t
(the initial assignment and the later usage around lines 54-55) to use the
validated value and emit a descriptive error mentioning
options.repeat_times/repeat_times so negative, zero, or excessively large values
cannot be silently converted into huge unsigned integers.
In @mllm/backends/qnn/aot/visitor/RMSNorm.cpp:
- Around line 40-45: The casted TensorValue pointers obtained for weight, i_0,
and o_0 (from
writer.getContext()->lookupSymbolTable(...)->outputs().front()->cast_<ir::tensor::TensorValue>()
and op->inputs()/outputs() casts) are used without null checks; add explicit
nullptr checks after each cast (for weight, i_0, and o_0) and handle failures by
returning early or throwing/logging a clear error so the function does not
dereference a null pointer — locate the casts in RMSNorm.cpp and guard each
cast_<ir::tensor::TensorValue>() result before further use.
In @mllm/backends/qnn/aot/visitor/Softmax.cpp:
- Around line 32-33: Validate that op has at least one input and one output and
that the front() elements and their casts succeed before using them: check
op->inputs().empty() and op->outputs().empty(), ensure op->inputs().front() and
op->outputs().front() are non-null, then perform the
cast_<ir::tensor::TensorValue>() and verify the returned pointers (input and
output) are non-null; if any check fails, return or raise an appropriate
error/log instead of dereferencing.
- Around line 42-48: The code adjusts a possibly-negative axis but never
verifies it falls into [0, rank-1], which can cause wraparound when later cast
to unsigned; after computing rank and applying "if (axis < 0) { axis += rank; }"
(referencing real_softmax_op->options().axis, axis, input->tensor_.shape(), and
rank), validate that axis >= 0 && axis < rank and handle invalid values (e.g.,
return an error/throw an exception or log and abort the op) rather than
proceeding; update the Softmax visitor to check this bound immediately after the
negative-axis adjustment and produce a clear error/early-exit if out of range.
In @mllm/backends/qnn/aot/visitor/Transpose.cpp:
- Around line 45-52: The permutation logic in Transpose.cpp uses options.dim0
and options.dim1 directly as indices into perm, which can be negative; normalize
both dims to positive equivalents (e.g., if dim < 0 then dim += rank) before
using them. Update the block that builds perm (after computing rank and perm
vector) to normalize options.dim0 and options.dim1 into local ints (or size_t)
and then perform the std::swap using those normalized indices so negative
dimensions behave the same as in CPU TransposeOp and other QNN AOT visitors
(refer to input->tensor_.shape(), rank, perm, and options.dim0/options.dim1).
In @mllm/backends/qnn/aot/visitor/Where.cpp:
- Around line 31-34: The code in Where.cpp assumes three inputs when calling
std::next on op->inputs() (accessing condition, true_input, false_input); first
validate the input count by checking op->inputs().size() (or equivalent) is >= 3
and handle the error path (throw, return, or log) before dereferencing
iterators; update the block that creates condition, true_input, and false_input
to guard against fewer inputs and use safe iterator/index access only after the
size check so std::next calls cannot go out of bounds.
In @mllm/backends/qnn/QNNUtils.cpp:
- Around line 310-312: The case handling for kInt16 incorrectly maps the signed
16-bit integer to QNN_DATATYPE_UFIXED_POINT_16 (unsigned); update the mapping in
the switch (case kInt16) to use the signed 16-bit fixed-point/type variant
(e.g., QNN_DATATYPE_FIXED_POINT_16 or the correct QNN signed-16 enum value used
elsewhere in the codebase) so that kInt16 maps to the signed QNN datatype
instead of QNN_DATATYPE_UFIXED_POINT_16.
- Line 618: The function name uses a reserved leading double underscore
(__printQnnTensor); rename it to a non-reserved identifier (e.g., printQnnTensor
or Qnn_PrintTensor) and update its declaration in qnnutils.hpp and any call
sites to the new name, ensuring linkage/extern "C" attributes remain unchanged;
also run a project-wide search for __printQnnTensor to fix all references and
rebuild to verify no symbol mismatches.
- Around line 453-454: Replace the commented-out device assertion with proper
conditional validation: keep the existing runtime check for non-AOT tensors by
reinstating MLLM_RT_ASSERT(tensor.device() == kQNN) when type !=
QNN_TENSOR_TYPE_STATIC, and add an explicit check for the AOT/static case (e.g.,
assert tensor.device() == kCPU or allow kCPU/kQNN as appropriate) to validate
expected placement; locate the assertion around the variables type,
QNN_TENSOR_TYPE_STATIC and tensor.device() in QNNUtils.cpp and implement the
conditional MLLM_RT_ASSERT branches to handle both runtime and AOT scenarios.
In @mllm/backends/qnn/QNNUtils.hpp:
- Line 91: The function name __printQnnTensor uses a double-underscore prefix
which is reserved; rename it to a non-reserved identifier (e.g., printQnnTensor
or debugPrintQnnTensor) in the declaration and update the matching definition
and all call sites to the new name (ensure Qnn_Tensor_t reference stays
unchanged and update any header/implementation pairs and tests that reference
__printQnnTensor).
In @mllm/compile/ir/linalg/Op.hpp:
- Line 84: Remove the duplicate forward declaration of GatherOp: locate the
second occurrence of "class GatherOp;" (the one at the later position) and
delete it so only the original forward declaration remains (leave the first
"class GatherOp;" intact).
🧹 Nitpick comments (8)
mllm/backends/qnn/aot/visitor/Index.cpp (2)
42-76: Consider adding inline comments and axis bounds validation.

The axis calculation logic is complex and would benefit from inline comments explaining the algorithm. Additionally, consider validating that the calculated `axis` is within the bounds of the input tensor's rank to catch configuration errors early.

Suggested improvements

Add inline comments:

```cpp
// Iterate through index specifications to find the gather axis
// and extract indices tensor. Increment axis for each dimension
// until we find vector or tensor indices.
for (const auto& blob : indices_list) {
  // Skip dimensions with full slice (kAll, kAll)
  if (blob.slice_indices_.has_value()) {
    // ... existing code
  }
  // ... rest of logic
}
```

Add axis validation after line 81:

```cpp
if (axis >= input->getType()->cast_<ir::tensor::TensorType>()->getShape().size()) {
  MLLM_ERROR("Calculated axis " + std::to_string(axis) + " exceeds input tensor rank");
  return false;
}
```
14-16: Add function documentation.

Both `isMatch` and `rewrite` functions lack documentation. Per coding guidelines, public APIs should have clear comments explaining purpose, parameters, returns, and errors. This is especially important for pattern implementations that are part of the AOT lowering infrastructure.

Suggested documentation

```cpp
/// Checks if the operation matches the QNN AOT Index pattern.
/// @param op The operation to check
/// @return true if the operation is an IndexOp with QNN attributes
bool QnnAOTIndexPattern::isMatch(const mllm::ir::op_ptr_t& op) {
  return op->isa_<mllm::ir::linalg::IndexOp>() && (op->getAttr("using_qnn") != nullptr);
}

/// Rewrites an IndexOp to QNN Gather operation.
/// Extracts the gather axis and indices from the IndexOp and emits
/// a QNN Gather node to the AOT graph.
/// @param writer IR writer for creating new operations
/// @param op The IndexOp to rewrite
/// @return true if rewrite succeeded, false otherwise
/// @note QNN Gather only supports gathering on a single axis
bool QnnAOTIndexPattern::rewrite(ir::IRWriter& writer, const ir::op_ptr_t& op) {
  // ... existing implementation
}
```

Based on coding guidelines.

Also applies to: 18-98
mllm/backends/qnn/QNNUtils.cpp (1)

681-682: Improve formatting in debug output string.

Add spaces after colons for better readability in the blockwise expansion debug output.

🔎 Proposed fix

```diff
- quant_str = "Blockwise Expansion (axis:" + std::to_string(t.quantizeParams.blockwiseExpansion->axis)
-             + ", blockSize:" + std::to_string(t.quantizeParams.blockwiseExpansion->numBlocksPerAxis) + ")";
+ quant_str = "Blockwise Expansion (axis: " + std::to_string(t.quantizeParams.blockwiseExpansion->axis)
+             + ", blockSize: " + std::to_string(t.quantizeParams.blockwiseExpansion->numBlocksPerAxis) + ")";
```

mllm/backends/qnn/aot/visitor/Softmax.cpp (1)
43-43: Extract beta as a named constant or clarify the comment.

The `beta` value is hardcoded to `1.0f`. Unlike `axis`, which is retrieved from `real_softmax_op->options()`, beta is not configurable and is a fixed requirement of the QNN Softmax API. The comment "Default beta" is misleading. Consider extracting it as a named constant (e.g., `constexpr float QNN_SOFTMAX_BETA = 1.0f;`) at file scope for clarity and maintainability, or update the comment to reflect that it's a fixed QNN parameter.
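A minimal sketch of the named-constant variant (the "beta" param name and wrapper call mirror how "axis" params are emitted elsewhere in this review, and are assumptions here):

```cpp
// Fixed requirement of the QNN Softmax API, not a tunable default.
constexpr float QNN_SOFTMAX_BETA = 1.0f;

qnn_op_node->emplaceParamScalar(QNNParamScalarWrapper::create("beta", QNN_SOFTMAX_BETA));
```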
mllm/backends/qnn/aot/visitor/Equal.cpp (1)

18-52: LGTM! Solid implementation following established patterns.

The rewrite implementation correctly:
- Validates the operation cast and required attributes
- Captures inputs/outputs through the AOT environment
- Creates and registers a QNN ElementWiseEqual node
- Follows the same structure as other AOT patterns (Gather, MatMul, Where, etc.)

Optional: Minor style consistency suggestion

Line 34 uses `std::next(op->inputs().begin())` while other patterns like MatMul explicitly use `std::next(op->inputs().begin(), 1)`. Though functionally equivalent, explicit parameters improve consistency.

```diff
- auto input1 = (*std::next(op->inputs().begin()))->cast_<ir::tensor::TensorValue>();
+ auto input1 = (*std::next(op->inputs().begin(), 1))->cast_<ir::tensor::TensorValue>();
```
mllm/backends/qnn/aot/visitor/Gather.cpp (1)

32-37: Consider adding null checks for tensor casts.

The `cast_<ir::tensor::TensorValue>()` results at lines 33, 34, and 37 are used without validation. While similar patterns in the codebase also omit these checks, adding them would improve robustness.

🔎 Proposed fix

```diff
  // Inputs
  auto table = op->inputs().front()->cast_<ir::tensor::TensorValue>();
  auto indices = (*std::next(op->inputs().begin()))->cast_<ir::tensor::TensorValue>();
  // Output
  auto output = op->outputs().front()->cast_<ir::tensor::TensorValue>();
+
+ if (!table || !indices || !output) {
+   MLLM_ERROR("Failed to cast inputs/output to TensorValue");
+   return false;
+ }
```

mllm/core/aops/GatherOp.hpp (1)
24-24: Consider adding `[[nodiscard]]` to the accessor.

The `options()` accessor returns a const reference that should typically not be discarded. Adding `[[nodiscard]]` would help catch accidental misuse.

🔎 Proposed fix

```diff
- inline const GatherOpOptions& options() const { return options_; }
+ [[nodiscard]] inline const GatherOpOptions& options() const { return options_; }
```

mllm/core/aops/GatherOp.cpp (1)
36-40: `reserve` underestimates final vector size.

`new_shape.reserve(dim)` only reserves space for `dim` elements, but the final size is `dim + indices_shape.size() + (shape.size() - dim - 1)`. Consider reserving the correct final size to avoid reallocations:

```diff
  std::vector<int32_t> new_shape;
- new_shape.reserve(dim);
+ new_shape.reserve(dim + indices_shape.size() + (shape.size() - dim - 1));
```
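As a self-contained check of that size formula: the gather output shape is the input dims before `dim`, then the indices shape, then the input dims after `dim`. A sketch (gatherShape is a hypothetical helper, not the repo's API):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Gather output shape: shape[:dim] + indices_shape + shape[dim+1:].
std::vector<int32_t> gatherShape(const std::vector<int32_t>& shape, size_t dim,
                                 const std::vector<int32_t>& indices_shape) {
  std::vector<int32_t> out;
  // Exact final size, matching the formula from the nitpick above.
  out.reserve(dim + indices_shape.size() + (shape.size() - dim - 1));
  out.insert(out.end(), shape.begin(), shape.begin() + dim);
  out.insert(out.end(), indices_shape.begin(), indices_shape.end());
  out.insert(out.end(), shape.begin() + dim + 1, shape.end());
  return out;
}
// e.g. gatherShape({4, 5, 6}, 1, {2, 3}) yields {4, 2, 3, 6}.
```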
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (37)
- examples/qwen3_qnn_aot/modeling_qwen_qnn_aot.hpp
- mllm/backends/cpu/CPUBackend.cpp
- mllm/backends/cpu/ops/GatherOp.cpp
- mllm/backends/cpu/ops/GatherOp.hpp
- mllm/backends/qnn/QNNUtils.cpp
- mllm/backends/qnn/QNNUtils.hpp
- mllm/backends/qnn/aot/QnnWrappersAPI.cpp
- mllm/backends/qnn/aot/QnnWrappersAPI.hpp
- mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp
- mllm/backends/qnn/aot/passes/LLMQuantRecipePass.cpp
- mllm/backends/qnn/aot/passes/LLMQuantRecipePass.hpp
- mllm/backends/qnn/aot/visitor/Concat.cpp
- mllm/backends/qnn/aot/visitor/Concat.hpp
- mllm/backends/qnn/aot/visitor/Equal.cpp
- mllm/backends/qnn/aot/visitor/Equal.hpp
- mllm/backends/qnn/aot/visitor/Gather.cpp
- mllm/backends/qnn/aot/visitor/Gather.hpp
- mllm/backends/qnn/aot/visitor/Index.cpp
- mllm/backends/qnn/aot/visitor/Linear.cpp
- mllm/backends/qnn/aot/visitor/Linear.hpp
- mllm/backends/qnn/aot/visitor/Matmul.cpp
- mllm/backends/qnn/aot/visitor/Matmul.hpp
- mllm/backends/qnn/aot/visitor/RMSNorm.cpp
- mllm/backends/qnn/aot/visitor/RMSNorm.hpp
- mllm/backends/qnn/aot/visitor/Repeat.cpp
- mllm/backends/qnn/aot/visitor/Repeat.hpp
- mllm/backends/qnn/aot/visitor/Softmax.cpp
- mllm/backends/qnn/aot/visitor/Softmax.hpp
- mllm/backends/qnn/aot/visitor/Transpose.cpp
- mllm/backends/qnn/aot/visitor/Transpose.hpp
- mllm/backends/qnn/aot/visitor/Where.cpp
- mllm/backends/qnn/aot/visitor/Where.hpp
- mllm/compile/ir/linalg/Op.hpp
- mllm/core/aops/GatherOp.cpp
- mllm/core/aops/GatherOp.hpp
- mllm/nn/Functional.cpp
- mllm/nn/Functional.hpp
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).
Files:
mllm/backends/qnn/aot/visitor/Repeat.cpp, mllm/backends/qnn/aot/visitor/Transpose.hpp, mllm/backends/cpu/CPUBackend.cpp, mllm/backends/qnn/aot/visitor/Equal.hpp, mllm/core/aops/GatherOp.hpp, mllm/backends/qnn/aot/visitor/Linear.hpp, mllm/backends/qnn/aot/visitor/Concat.hpp, mllm/backends/qnn/aot/visitor/Where.cpp, mllm/backends/qnn/aot/visitor/Gather.cpp, mllm/backends/qnn/aot/visitor/Softmax.cpp, mllm/backends/qnn/aot/visitor/RMSNorm.cpp, mllm/backends/qnn/aot/visitor/Matmul.hpp, mllm/backends/qnn/aot/passes/LLMQuantRecipePass.hpp, mllm/backends/qnn/aot/visitor/Gather.hpp, mllm/backends/qnn/aot/visitor/RMSNorm.hpp, mllm/nn/Functional.cpp, mllm/backends/qnn/QNNUtils.hpp, mllm/backends/cpu/ops/GatherOp.hpp, mllm/core/aops/GatherOp.cpp, mllm/backends/cpu/ops/GatherOp.cpp, mllm/backends/qnn/aot/visitor/Repeat.hpp, mllm/backends/qnn/aot/QnnWrappersAPI.cpp, mllm/nn/Functional.hpp, mllm/backends/qnn/aot/visitor/Softmax.hpp, mllm/backends/qnn/aot/visitor/Equal.cpp, mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp, mllm/backends/qnn/aot/visitor/Transpose.cpp, mllm/backends/qnn/aot/visitor/Linear.cpp, mllm/backends/qnn/aot/visitor/Index.cpp, mllm/backends/qnn/aot/QnnWrappersAPI.hpp, mllm/backends/qnn/aot/visitor/Concat.cpp, mllm/backends/qnn/QNNUtils.cpp, mllm/backends/qnn/aot/visitor/Where.hpp, mllm/compile/ir/linalg/Op.hpp, mllm/backends/qnn/aot/visitor/Matmul.cpp, mllm/backends/qnn/aot/passes/LLMQuantRecipePass.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.
Files:
Same file list as the first instruction above.
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.
Files:
mllm/backends/qnn/aot/visitor/Repeat.cpp, mllm/backends/cpu/CPUBackend.cpp, mllm/backends/qnn/aot/visitor/Where.cpp, mllm/backends/qnn/aot/visitor/Gather.cpp, mllm/backends/qnn/aot/visitor/Softmax.cpp, mllm/backends/qnn/aot/visitor/RMSNorm.cpp, mllm/nn/Functional.cpp, mllm/core/aops/GatherOp.cpp, mllm/backends/cpu/ops/GatherOp.cpp, mllm/backends/qnn/aot/QnnWrappersAPI.cpp, mllm/backends/qnn/aot/visitor/Equal.cpp, mllm/backends/qnn/aot/passes/LLM2QnnLoweringPass.cpp, mllm/backends/qnn/aot/visitor/Transpose.cpp, mllm/backends/qnn/aot/visitor/Linear.cpp, mllm/backends/qnn/aot/visitor/Index.cpp, mllm/backends/qnn/aot/visitor/Concat.cpp, mllm/backends/qnn/QNNUtils.cpp, mllm/backends/qnn/aot/visitor/Matmul.cpp, mllm/backends/qnn/aot/passes/LLMQuantRecipePass.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).
Files:
Same file list as the first instruction above.
🧬 Code graph analysis (21)
mllm/backends/qnn/aot/visitor/Repeat.cpp (6)
- mllm/backends/qnn/aot/visitor/Index.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-98), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Concat.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-71), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Equal.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-52), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Gather.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-56), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Matmul.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-51), rewrite (17-17)
- mllm/backends/qnn/aot/visitor/Where.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-53), rewrite (17-17)

mllm/backends/qnn/aot/visitor/Transpose.hpp (9)
- Concat.hpp, Equal.hpp, Gather.hpp, Matmul.hpp, RMSNorm.hpp, Repeat.hpp, Softmax.hpp, Where.hpp, Index.hpp (2 each): op (14-14), writer (16-16)

mllm/backends/qnn/aot/visitor/Equal.hpp (9)
- Concat.hpp, Gather.hpp, Matmul.hpp, RMSNorm.hpp, Repeat.hpp, Softmax.hpp, Transpose.hpp, Where.hpp, Index.hpp (2 each): op (14-14), writer (16-16)

mllm/core/aops/GatherOp.hpp (1)
- mllm/backends/cpu/ops/GatherOp.hpp (3): options (20-22), options (20-20), inputs (15-15)

mllm/backends/qnn/aot/visitor/Linear.hpp (5)
- mllm/backends/qnn/aot/visitor/Matmul.hpp (2): op (14-14), writer (16-16)
- mllm/backends/qnn/aot/visitor/RMSNorm.hpp (2): op (14-14), writer (16-16)
- mllm/backends/qnn/aot/visitor/Softmax.hpp (2): op (14-14), writer (16-16)
- mllm/backends/qnn/aot/visitor/Elewise.hpp (4): op (14-14), op (25-25), writer (16-16), writer (27-27)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): op (17-17)

mllm/backends/qnn/aot/visitor/Concat.hpp (9)
- Equal.hpp, Gather.hpp, Matmul.hpp, RMSNorm.hpp, Repeat.hpp, Softmax.hpp, Transpose.hpp, Where.hpp, Index.hpp (2 each): op (14-14), writer (16-16)

examples/qwen3_qnn_aot/modeling_qwen_qnn_aot.hpp (1)
- mllm/nn/Functional.hpp (1): gather (165-165)

mllm/backends/qnn/aot/visitor/Softmax.cpp (8)
- mllm/backends/qnn/aot/visitor/Concat.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-71), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Equal.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-52), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Gather.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-56), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Matmul.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-51), rewrite (17-17)
- mllm/backends/qnn/aot/visitor/RMSNorm.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-66), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Repeat.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-84), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Transpose.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-76), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Where.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-53), rewrite (17-17)

mllm/backends/qnn/aot/visitor/RMSNorm.cpp (8)
- mllm/backends/qnn/aot/visitor/Concat.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-71), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Equal.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-52), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Gather.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-56), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Matmul.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-51), rewrite (17-17)
- mllm/backends/qnn/aot/visitor/Repeat.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-84), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Softmax.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-69), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Transpose.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-76), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Where.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-53), rewrite (17-17)

mllm/backends/qnn/aot/visitor/Gather.hpp (8)
- Concat.hpp, Equal.hpp, Matmul.hpp, RMSNorm.hpp, Repeat.hpp, Softmax.hpp, Transpose.hpp, Where.hpp (2 each): op (14-14), writer (16-16)

mllm/nn/Functional.cpp (2)
- mllm/core/Tensor.cpp (1): instance (427-427)
- mllm/backends/qnn/aot/visitor/Gather.hpp (1): kGather (18-20)

mllm/backends/cpu/ops/GatherOp.hpp (1)
- mllm/core/aops/GatherOp.hpp (2): inputs (20-20), inputs (22-22)

mllm/core/aops/GatherOp.cpp (2)
- mllm/core/aops/GatherOp.hpp (5): GatherOp (16-16), trace_context (18-18), inputs (20-20), inputs (22-22), options_ (24-24)
- mllm/backends/cpu/ops/GatherOp.hpp (3): options (20-22), options (20-20), inputs (15-15)

mllm/backends/cpu/ops/GatherOp.cpp (3)
- mllm/backends/cpu/ops/GatherOp.hpp (4): CPUGatherOp (13-13), options (20-22), options (20-20), inputs (15-15)
- mllm/core/aops/GatherOp.cpp (2): forward (21-23), forward (21-21)
- mllm/core/aops/GatherOp.hpp (2): inputs (20-20), inputs (22-22)

mllm/backends/qnn/aot/visitor/Repeat.hpp (9)
- Concat.hpp, Equal.hpp, Gather.hpp, Matmul.hpp, RMSNorm.hpp, Softmax.hpp, Transpose.hpp, Where.hpp, Index.hpp (2 each): op (14-14), writer (16-16)

mllm/backends/qnn/aot/QnnWrappersAPI.cpp (1)
- mllm/backends/qnn/aot/QnnWrappersAPI.hpp (7): v (47-49), v (47-47), v (56-56), v (58-58), v (60-60), v (62-62), v (65-65)

mllm/backends/qnn/aot/visitor/Softmax.hpp (11)
- Linear.hpp, Concat.hpp, Equal.hpp, Gather.hpp, Matmul.hpp, RMSNorm.hpp, Repeat.hpp, Transpose.hpp, Where.hpp (2 each): op (14-14), writer (16-16)
- mllm/backends/qnn/aot/visitor/Elewise.hpp (4): op (14-14), op (25-25), writer (16-16), writer (27-27)
- mllm/backends/qnn/aot/passes/MarkQnnGraphPass.hpp (1): op (17-17)

mllm/backends/qnn/aot/visitor/Equal.cpp (8)
- mllm/backends/qnn/aot/visitor/Index.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-98), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Concat.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-71), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Gather.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-56), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Matmul.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-51), rewrite (17-17)
- mllm/backends/qnn/aot/visitor/RMSNorm.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-66), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Repeat.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-84), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Transpose.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-76), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Where.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-53), rewrite (17-17)

mllm/backends/qnn/aot/visitor/Linear.cpp (2)
- mllm/backends/qnn/aot/visitor/Linear.hpp (2): op (14-14), writer (16-16)
- mllm/backends/qnn/aot/QnnWrappersAPI.hpp (5): qnn_context_name (219-219), qnn_context_name (221-222), qnn_context_name (224-225), input (84-84), output (86-86)

mllm/backends/qnn/aot/visitor/Matmul.cpp (8)
- mllm/backends/qnn/aot/visitor/Concat.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-71), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Equal.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-52), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Gather.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-56), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Repeat.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-84), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Softmax.cpp (4): isMatch (14-16), isMatch (14-14), rewrite (18-69), rewrite (18-18)
- mllm/backends/qnn/aot/visitor/Transpose.cpp (4): isMatch (15-17), isMatch (15-15), rewrite (19-76), rewrite (19-19)
- mllm/backends/qnn/aot/visitor/Where.cpp (4): isMatch (13-15), isMatch (13-13), rewrite (17-53), rewrite (17-17)
- mllm/backends/qnn/aot/visitor/Matmul.hpp (2): op (14-14), writer (16-16)

mllm/backends/qnn/aot/passes/LLMQuantRecipePass.cpp (2)
- mllm/compile/ir/linalg/Attribute.hpp (7): node (27-27) x2, node (327-327) x2, node (348-348) x2, create (77-77)
- mllm/compile/ir/builtin/Attribute.hpp (10): node (24-24) x2, node (44-44) x2, node (64-64) x2, node (84-84) x2, node (104-104) x2
🪛 Clang (14.0.6)
mllm/backends/qnn/aot/visitor/Transpose.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Equal.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/core/aops/GatherOp.hpp
[error] 24-24: function 'options' should be marked [[nodiscard]]
(modernize-use-nodiscard,-warnings-as-errors)
[error] 27-27: member variable 'options_' has protected visibility
(cppcoreguidelines-non-private-member-variables-in-classes,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Linear.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Concat.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Matmul.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/passes/LLMQuantRecipePass.hpp
[error] 304-304: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Gather.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/RMSNorm.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/nn/Functional.cpp
[error] 215-215: 3 adjacent parameters of 'gather' of similar type are easily swapped by mistake
(bugprone-easily-swappable-parameters,-warnings-as-errors)
mllm/backends/qnn/QNNUtils.hpp
[error] 91-91: declaration uses identifier '__printQnnTensor', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Repeat.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Softmax.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/aot/QnnWrappersAPI.hpp
[error] 65-65: parameter name 'v' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
mllm/backends/qnn/QNNUtils.cpp
[error] 618-618: declaration uses identifier '__printQnnTensor', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
[error] 628-628: variable name 't' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 630-630: variable 'tensor_type' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 640-640: variable 'dtype_str' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 666-666: variable 'shape_str' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 673-673: variable 'quant_str' is not initialized
(cppcoreguidelines-init-variables,-warnings-as-errors)
[error] 675-675: repeated branch in conditional chain
(bugprone-branch-clone,-warnings-as-errors)
mllm/backends/qnn/aot/visitor/Where.hpp
[error] 14-14: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
[error] 16-16: parameter name 'op' is too short, expected at least 3 characters
(readability-identifier-length,-warnings-as-errors)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-android
- GitHub Check: build-macos
- GitHub Check: build-x86
```cpp
if (idx < 0) idx += dim_size;

if (idx < 0 || idx >= dim_size) { continue; }
```
Silent skip of out-of-bounds indices may leave output uninitialized.
When idx < 0 || idx >= dim_size after negative index adjustment, the code silently skips the copy, leaving that portion of the output tensor uninitialized. This could result in garbage values or undefined behavior downstream.
Consider either:
- Logging a warning/error for out-of-bounds indices.
- Zero-initializing the output tensor before the loop.
- Raising an error for invalid indices.
🔎 Proposed fix: Log warning and zero-fill

```diff
- if (idx < 0 || idx >= dim_size) { continue; }
+ if (idx < 0 || idx >= dim_size) {
+   MLLM_WARN("GatherOp: index {} out of bounds [0, {}), filling with zeros", idx, dim_size);
+   int64_t dst_offset = (o * indices_count + i) * inner_size * data_type_size;
+   std::memset(output_ptr + dst_offset, 0, inner_size * data_type_size);
+   continue;
+ }
```
```diff
 Qnn_BlockwiseExpansion_t blockwise_expansion;
 blockwise_expansion.axis = cfg->ch_axis;
 blockwise_expansion.scaleOffsets = nullptr; // Will be set by setBlockwiseQuantization
-blockwise_expansion.numBlocksPerAxis = v->tensor_.size(cfg->ch_axis) / cfg->block_size;
+blockwise_expansion.numBlocksPerAxis = v->tensor_.size(1) / cfg->block_size;
 blockwise_expansion.blockScaleBitwidth = 12; // 12 bits for 4 to 16 expansion
```
🧩 Analysis chain

🏁 Scripts executed:

```bash
# First, understand the structure of QuantizationSpecLPBQ and ch_axis
rg -n "ch_axis" --type cpp -B 2 -A 2 | head -80

# Check how blockwise_expansion is used and what axis/numBlocksPerAxis mean
rg -n "blockwise_expansion" --type cpp -B 3 -A 3 | head -100

# Look for LPBQ configuration and how ch_axis is set
rg -n "QuantizationSpecLPBQ\|kLPBQ" --type cpp -B 2 -A 5 | head -150

# Check the tensor dimensions in context of weight tensors
rg -n "v->tensor_\.size\(" --type cpp -B 2 -A 2 | head -100

# Search for where LPBQ ch_axis is set in LLMQuantRecipePass
rg -n "QuantizationSpecLPBQ::create" --type cpp -B 5 -A 15 | head -150

# Look for Linear layer quantization setup - typically ch_axis would be set there
rg -n "Linear\|MatMul" mllm/backends/qnn/aot/passes/LLMQuantRecipePass.cpp -B 2 -A 8 | head -200

# Check what values ch_axis can have - look at initialization and usage patterns
rg -n "ch_axis\s*=" --type cpp -B 1 -A 1 | grep -v ".hpp" | head -100

# Found ch_axis = 0 at line 797. Verify if this is the only place LPBQ is created
rg -n "QuantizationSpecLPBQ::create" --type cpp -A 2 | head -50

# Check weight tensor dimensions - for Linear layers, shapes should be [out_features, in_features]
rg -n "shape\|size\(0\)|size\(1\)" mllm/backends/qnn/aot/QnnWrappersAPI.cpp | head -50

# Verify: does the code document what axis the blockwise quantization should be along?
rg -n "blockwise\|block_size" mllm/backends/qnn/aot/QnnWrappersAPI.cpp -B 2 -A 2 | head -80

# Check if there are tests or comments explaining the relationship between ch_axis and tensor dims
rg -n "blockwise\|block_size" mllm/compile/ir/linalg/ -B 2 -A 2 | head -100

# Check the full LPBQ quantization spec to understand semantics
cat mllm/compile/ir/linalg/Attribute.hpp | grep -A 30 "struct QuantizationSpecLPBQ"

# Check the relationship more carefully - look at how scale_level_0_int is used
rg -n "scale_level_0_int" --type cpp -B 2 -A 2 | head -100

# Check dimensions of weight tensors in Linear quantization - what are size(0) and size(1)?
rg -n "Linear.*weight\|register.*weight" mllm/backends/qnn/aot/passes/LLMQuantRecipePass.cpp -B 3 -A 3 | head -80

# Look for any documentation or schema about blockwise expansion axis semantics
rg -n "Qnn_BlockwiseExpansion" --type cpp -B 5 -A 10 | head -150
```
Mismatch between blockwise expansion axis and numBlocksPerAxis calculation.
blockwise_expansion.axis is set to cfg->ch_axis (line 175), but numBlocksPerAxis is computed using hardcoded v->tensor_.size(1) (line 177). At line 165, the code correctly uses v->tensor_.size(cfg->ch_axis) for other LPBQ fields; this pattern should be consistent. If ch_axis is 0 (as set in LLMQuantRecipePass.cpp:797), then numBlocksPerAxis should use size(0), not size(1):
Suggested fix

```diff
 blockwise_expansion.axis = cfg->ch_axis;
 blockwise_expansion.scaleOffsets = nullptr; // Will be set by setBlockwiseQuantization
-blockwise_expansion.numBlocksPerAxis = v->tensor_.size(1) / cfg->block_size;
+blockwise_expansion.numBlocksPerAxis = v->tensor_.size(cfg->ch_axis) / cfg->block_size;
```
```cpp
// Handle negative axis
// We can use the first input to determine rank, assuming all inputs have same rank
auto first_input = op->inputs().front()->cast_<ir::tensor::TensorValue>();
auto input_shape = first_input->tensor_.shape();
int rank = input_shape.size();
if (axis < 0) { axis += rank; }

// Create QNN Op Node
auto qnn_op_node = QnnAOTNodeOperation::create("Concat");
qnn_op_node->setPackageName("qti.aisw");

// Add Inputs
for (auto& input_val : op->inputs()) {
  auto input = input_val->cast_<ir::tensor::TensorValue>();
  qnn_op_node->emplaceInput(env->captureQnnAOTNodeTensor(qnn_context_name, qnn_graph_name, input));
}

// Add Params
qnn_op_node->emplaceParamScalar(QNNParamScalarWrapper::create("axis", (uint32_t)axis));
```
🛠️ Refactor suggestion | 🟠 Major
Add bounds validation for the normalized axis.
After normalizing the negative axis, the code should verify that the resulting axis is within the valid range [0, rank). Additionally, before casting to uint32_t at line 61, explicitly validate that axis is non-negative to prevent potential unsigned integer wraparound issues in case the normalization logic has edge cases.
🔎 Proposed fix to add axis bounds validation

```diff
   int rank = input_shape.size();
   if (axis < 0) { axis += rank; }
+
+  // Validate axis is in valid range
+  if (axis < 0 || axis >= rank) {
+    MLLM_ERROR("Axis " << axis << " is out of bounds for rank " << rank);
+    return false;
+  }
   // Create QNN Op Node
   auto qnn_op_node = QnnAOTNodeOperation::create("Concat");
```
```cpp
int axis = dynamic_cast<mllm::aops::GatherOp*>(gather_op->getAOp())->options().dim;
qnn_op_node->emplaceParamScalar(QNNParamScalarWrapper::create("axis", (int32_t)axis));
```
Unchecked dynamic_cast may cause undefined behavior.
The dynamic_cast at line 49 can return nullptr if the cast fails, and directly accessing ->options().dim on a null pointer leads to undefined behavior.
🔎 Proposed fix: Add null check

```diff
  // Add scalar param axis
- int axis = dynamic_cast<mllm::aops::GatherOp*>(gather_op->getAOp())->options().dim;
+ auto gather_aop = dynamic_cast<mllm::aops::GatherOp*>(gather_op->getAOp());
+ if (!gather_aop) {
+   MLLM_ERROR("Failed to cast to aops::GatherOp");
+   return false;
+ }
+ int axis = gather_aop->options().dim;
  qnn_op_node->emplaceParamScalar(QNNParamScalarWrapper::create("axis", (int32_t)axis));
```
```cpp
weight_val->tensor_ = weight_val->tensor_.view(
    {weight_val->tensor_.shape()[0], weight_val->tensor_.shape()[1] * weight_val->tensor_.shape()[2]});
```
Add shape validation before tensor reshaping.
The code accesses shape()[0], shape()[1], and shape()[2] without validating that the weight tensor has at least 3 dimensions. This could cause an out-of-bounds access if the tensor shape doesn't meet expectations.
🔎 Proposed fix to add shape validation

```diff
+ // Validate weight tensor has expected 3D shape
+ if (weight_val->tensor_.shape().size() < 3) {
+   MLLM_ERROR("Weight tensor must have at least 3 dimensions, got " +
+              std::to_string(weight_val->tensor_.shape().size()));
+   return false;
+ }
+
  weight_val->tensor_ = weight_val->tensor_.to(kUInt8);
  weight_val->tensor_ = weight_val->tensor_.view(
      {weight_val->tensor_.shape()[0], weight_val->tensor_.shape()[1] * weight_val->tensor_.shape()[2]});
```
🤖 Prompt for AI Agents
In @mllm/backends/qnn/aot/visitor/Linear.cpp around lines 57 - 58, Ensure the
tensor has at least 3 dimensions before accessing shape()[0..2] in the reshape
call: check the weight tensor's rank/shape size (e.g., via tensor_.dim() or
tensor_.shape().size()) in the code around Linear.cpp where weight_val->tensor_
is being viewed, and if it is smaller than 3 either return/throw a clear error
(or log and skip) with context (function/class name) or handle the alternative
shape case; only perform the view(...) when the validation passes to avoid
out-of-bounds access.
  case kInt16: {
-   ret = QNN_DATATYPE_INT_16;
+   ret = QNN_DATATYPE_UFIXED_POINT_16;
    break;
Critical: Incorrect type mapping for kInt16.
Line 311 maps kInt16 (signed 16-bit integer) to QNN_DATATYPE_UFIXED_POINT_16 (unsigned fixed-point 16-bit). This is a sign mismatch that will cause incorrect value interpretation and potential data corruption.
🔎 Proposed fix
case kInt16: {
- ret = QNN_DATATYPE_UFIXED_POINT_16;
+ ret = QNN_DATATYPE_SFIXED_POINT_16;
break;
}🤖 Prompt for AI Agents
In @mllm/backends/qnn/QNNUtils.cpp around lines 310 - 312, The case handling for
kInt16 incorrectly maps the signed 16-bit integer to
QNN_DATATYPE_UFIXED_POINT_16 (unsigned); update the mapping in the switch (case
kInt16) to use the signed 16-bit fixed-point/type variant (e.g.,
QNN_DATATYPE_FIXED_POINT_16 or the correct QNN signed-16 enum value used
elsewhere in the codebase) so that kInt16 maps to the signed QNN datatype
instead of QNN_DATATYPE_UFIXED_POINT_16.
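To make the corruption concrete, a standalone snippet (not project code) showing how a negative signed 16-bit value is misread once the buffer is interpreted as unsigned 16-bit, which is what the UFIXED_POINT_16 mapping implies:

#include <cstdint>
#include <cstdio>

int main() {
  int16_t q = -1;                     // a legal value under SFIXED_POINT_16
  auto u = static_cast<uint16_t>(q);  // same bits read as UFIXED_POINT_16
  std::printf("int16 %d read as uint16: %u\n", q, static_cast<unsigned>(u));  // prints 65535
  return 0;
}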
// in AOT case, the tensor is all on CPU (TODO: handle this)
// if (type != QNN_TENSOR_TYPE_STATIC) { MLLM_RT_ASSERT(tensor.device() == kQNN); }
Add proper device validation instead of commenting out the assertion.
The device type assertion has been commented out for AOT handling, but this removes an important safety check that could catch device placement errors. Instead of removing the check entirely, add conditional logic to handle both runtime and AOT cases properly.
🔎 Proposed fix
- // in AOT case, the tensor is all on CPU (TODO: handle this)
- // if (type != QNN_TENSOR_TYPE_STATIC) { MLLM_RT_ASSERT(tensor.device() == kQNN); }
+ // In AOT case, tensors may be on CPU; in runtime, non-static tensors must be on QNN
+ if (type != QNN_TENSOR_TYPE_STATIC) {
+ // TODO: Replace with proper AOT detection mechanism
+ bool isAOT = (tensor.device() == kCPU);
+ if (!isAOT) {
+ MLLM_RT_ASSERT(tensor.device() == kQNN);
+ }
+ }🤖 Prompt for AI Agents
In @mllm/backends/qnn/QNNUtils.cpp around lines 453 - 454, Replace the
commented-out device assertion with proper conditional validation: keep the
existing runtime check for non-AOT tensors by reinstating
MLLM_RT_ASSERT(tensor.device() == kQNN) when type != QNN_TENSOR_TYPE_STATIC, and
add an explicit check for the AOT/static case (e.g., assert tensor.device() ==
kCPU or allow kCPU/kQNN as appropriate) to validate expected placement; locate
the assertion around the variables type, QNN_TENSOR_TYPE_STATIC and
tensor.device() in QNNUtils.cpp and implement the conditional MLLM_RT_ASSERT
branches to handle both runtime and AOT scenarios.
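If inferring the mode from device placement proves too fragile, one alternative is to thread the lowering mode through explicitly. A sketch under the assumption that callers know whether they are in AOT or runtime lowering (the enum and function names are hypothetical; MLLM_RT_ASSERT, kQNN, and kCPU are as used elsewhere in this file):

enum class LoweringMode { kRuntime, kAot };

// Hypothetical validation hook: the caller states its context instead of the
// utility guessing it from tensor placement.
void validateTensorPlacement(const mllm::Tensor& tensor, Qnn_TensorType_t type,
                             LoweringMode mode) {
  if (type == QNN_TENSOR_TYPE_STATIC) { return; }  // static weights may live anywhere
  if (mode == LoweringMode::kRuntime) {
    MLLM_RT_ASSERT(tensor.device() == kQNN);  // runtime tensors must already be on QNN
  } else {
    MLLM_RT_ASSERT(tensor.device() == kCPU);  // AOT keeps tensors on CPU until export
  }
}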
}
}

void __printQnnTensor(const Qnn_Tensor_t* tensor) {
Fix reserved identifier violation in function name.
The function name __printQnnTensor uses a leading double underscore, which is reserved for implementation use in C++. This violates the C++ standard and could conflict with compiler or standard library identifiers.
🔎 Proposed fix
-void __printQnnTensor(const Qnn_Tensor_t* tensor) {
+void printQnnTensor(const Qnn_Tensor_t* tensor) {
Don't forget to update the header declaration in mllm/backends/qnn/QNNUtils.hpp as well.
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Clang (14.0.6)
[error] 618-618: declaration uses identifier '__printQnnTensor', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
🤖 Prompt for AI Agents
In @mllm/backends/qnn/QNNUtils.cpp at line 618, The function name uses a
reserved leading double underscore (__printQnnTensor); rename it to a
non-reserved identifier (e.g., printQnnTensor or Qnn_PrintTensor) and update its
declaration in qnnutils.hpp and any call sites to the new name, ensuring
linkage/extern "C" attributes remain unchanged; also run a project-wide search
for __printQnnTensor to fix all references and rebuild to verify no symbol
mismatches.
bool freeQnnTensors(Qnn_Tensor_t*& tensors, uint32_t numTensors);

void __printQnnTensor(const Qnn_Tensor_t* tensor); // for debug use
Reserved identifier violation: rename the debug function.
The function name __printQnnTensor uses a double underscore prefix, which is reserved by the C++ standard. This can lead to undefined behavior or conflicts with compiler internals.
🔎 Proposed fix
-void __printQnnTensor(const Qnn_Tensor_t* tensor); // for debug use
+void printQnnTensorDebug(const Qnn_Tensor_t* tensor); // for debug use
🧰 Tools
🪛 Clang (14.0.6)
[error] 91-91: declaration uses identifier '__printQnnTensor', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
🤖 Prompt for AI Agents
In @mllm/backends/qnn/QNNUtils.hpp at line 91, The function name
__printQnnTensor uses a double-underscore prefix which is reserved; rename it to
a non-reserved identifier (e.g., printQnnTensor or debugPrintQnnTensor) in the
declaration and update the matching definition and all call sites to the new
name (ensure Qnn_Tensor_t reference stays unchanged and update any
header/implementation pairs and tests that reference __printQnnTensor).
mllm/backends/qnn/QNNUtils.hpp (Outdated)
  qnnParam_.scalarParam.uint32Value = static_cast<uint32_t>(value);
} else if constexpr (std::is_same_v<T, int32_t>) {
  qnnParam_.scalarParam.dataType = QNN_DATATYPE_INT_32;
  qnnParam_.scalarParam.uint32Value = static_cast<int32_t>(value);
qnnParam_.scalarParam.int32Value = static_cast<int32_t>(value);
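For clarity, a minimal sketch of the corrected dispatch, assuming the uint32Value / int32Value fields of the QNN SDK's Qnn_Scalar_t union:

if constexpr (std::is_same_v<T, uint32_t>) {
  qnnParam_.scalarParam.dataType = QNN_DATATYPE_UINT_32;
  qnnParam_.scalarParam.uint32Value = value;
} else if constexpr (std::is_same_v<T, int32_t>) {
  qnnParam_.scalarParam.dataType = QNN_DATATYPE_INT_32;
  qnnParam_.scalarParam.int32Value = value;  // write the signed field, not uint32Value
}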
LINALG_AOPS_DEFINE(EqualOp, EQUALOP);
LINALG_AOPS_DEFINE(WhereOp, WHEREOP);
There is no corresponding LINALG_AOPS_DEFINE(GatherOp, GATHEROP);
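Assuming the existing macro convention carries over, the fix would presumably be a single registration line alongside the others:

LINALG_AOPS_DEFINE(EqualOp, EQUALOP);
LINALG_AOPS_DEFINE(WhereOp, WHEREOP);
LINALG_AOPS_DEFINE(GatherOp, GATHEROP);  // presumed missing registration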
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @mllm/backends/qnn/QNNUtils.hpp:
- Around line 91-92: The debug function name __printQnnTensor uses reserved
leading underscores; rename the function (e.g., printQnnTensor or
QnnPrintTensor) and update its declaration void __printQnnTensor(const
Qnn_Tensor_t* tensor), its definition, and every call site to the new identifier
so the prototype, implementation, and usages remain consistent and avoid
reserved-identifier UB.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
mllm/backends/qnn/QNNUtils.hpp
🧰 Additional context used
🪛 Clang (14.0.6)
mllm/backends/qnn/QNNUtils.hpp
[error] 91-91: declaration uses identifier '__printQnnTensor', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-android
- GitHub Check: build-x86
- GitHub Check: build-macos
🔇 Additional comments (1)
mllm/backends/qnn/QNNUtils.hpp (1)
281-286: LGTM! Proper separation of signed and unsigned integer handling. The split into separate branches for uint32_t and int32_t correctly maps each type to its corresponding QNN datatype and struct field, ensuring type safety and proper handling of signed integers.
void __printQnnTensor(const Qnn_Tensor_t* tensor); // for debug use
Reserved identifier violation: rename the debug function.
The function name __printQnnTensor uses a double underscore prefix, which is reserved by the C++ standard for compiler and standard library implementations. This can lead to undefined behavior or conflicts with compiler internals.
🔎 Proposed fix
-void __printQnnTensor(const Qnn_Tensor_t* tensor); // for debug use
+void printQnnTensorDebug(const Qnn_Tensor_t* tensor); // for debug use
Note: Ensure the function definition and all call sites are updated to match the new name.
🧰 Tools
🪛 Clang (14.0.6)
[error] 91-91: declaration uses identifier '__printQnnTensor', which is a reserved identifier
(bugprone-reserved-identifier,-warnings-as-errors)
🤖 Prompt for AI Agents
In @mllm/backends/qnn/QNNUtils.hpp around lines 91 - 92, The debug function name
__printQnnTensor uses reserved leading underscores; rename the function (e.g.,
printQnnTensor or QnnPrintTensor) and update its declaration void
__printQnnTensor(const Qnn_Tensor_t* tensor), its definition, and every call
site to the new identifier so the prototype, implementation, and usages remain
consistent and avoid reserved-identifier UB.
chenghuaWang left a comment
LGTM
Please check Guidelines for Contributing.
Summary by CodeRabbit
New Features
Bug Fixes