[VL][TEST] Use spark.execution.id to replace the velox's queryId #12248

Open
JkSelf wants to merge 1 commit into apache:main from JkSelf:queryId
JkSelf (Contributor) commented Jun 5, 2026

What changes are proposed in this pull request?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Copilot AI review requested due to automatic review settings June 5, 2026 05:12
@github-actions github-actions Bot added the VELOX label Jun 5, 2026

Copilot AI left a comment

Pull request overview

This PR wires Spark SQL’s spark.sql.execution.id from the JVM side into the native runtime via JNI, and uses it in the Velox backend to label query/task identifiers (intended to replace the prior Velox queryId based on stage/task IDs).

Changes:

  • Add executionId to the JNI nativeCreateKernelWithIterator path and propagate it into SparkTaskInfo.
  • Read spark.sql.execution.id from TaskContext local properties in NativePlanEvaluator and pass it to native.
  • Update Velox-side identifier construction to prefer executionId when present.
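The inline snippet in this review tests `executionId != -1`, which suggests that -1 marks "no execution id available". A minimal sketch of that convention, assuming the property value arrives as text: `parseExecutionId` is a hypothetical helper written here for illustration and is not code from the PR, which reads the property on the JVM side.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper (not from the PR): turn the raw text of
// spark.sql.execution.id into an integer id, returning -1 when the
// property is missing or is not a plain decimal number.
inline int64_t parseExecutionId(std::string_view raw) {
  if (raw.empty()) {
    return -1;  // property not set
  }
  int64_t value = -1;
  const char* begin = raw.data();
  const char* end = begin + raw.size();
  auto [ptr, ec] = std::from_chars(begin, end, value);
  // Reject parse errors and trailing characters such as "12abc".
  if (ec != std::errc() || ptr != end) {
    return -1;
  }
  return value;
}
```

With this sketch, `parseExecutionId("42")` yields 42, while an empty or malformed value yields the -1 sentinel, so the native side can fall back to the stage/task based identifier.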

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| gluten-arrow/src/main/java/org/apache/gluten/vectorized/PlanEvaluatorJniWrapper.java | Extends native iterator-kernel creation signature with executionId. |
| gluten-arrow/src/main/java/org/apache/gluten/vectorized/NativePlanEvaluator.java | Extracts Spark SQL execution id from TaskContext and forwards it to JNI. |
| cpp/core/jni/JniWrapper.cc | Accepts executionId in JNI entrypoint and stores it in SparkTaskInfo. |
| cpp/core/compute/Runtime.h | Adds executionId to SparkTaskInfo and includes it in toString(). |
| cpp/velox/compute/WholeStageResultIterator.cc | Uses executionId to form Velox identifiers (task/query context ids). |

Comment on lines +75 to +86

    namespace {
    std::string getVeloxTaskId(const SparkTaskInfo& taskInfo) {
      if (taskInfo.executionId != -1) {
        return fmt::format("Gluten_Execution_{}", std::to_string(taskInfo.executionId));
      }
      return fmt::format(
          "Gluten_Stage_{}_TID_{}_VTID_{}",
          std::to_string(taskInfo.stageId),
          std::to_string(taskInfo.taskId),
          std::to_string(taskInfo.vId));
    }
    } // namespace

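The selection logic in `getVeloxTaskId` can be tried in isolation with the sketch below. It is a stand-in, not the PR's code: `TaskInfoSketch` and `veloxTaskIdSketch` are hypothetical names, the field types are assumptions, and plain string concatenation replaces `fmt::format` so that the example has no third-party dependency.

```cpp
#include <cstdint>
#include <string>

// Stand-in for Gluten's SparkTaskInfo with only the fields used by the
// snippet above. Field types are assumptions made for this sketch.
struct TaskInfoSketch {
  int32_t stageId;
  int64_t taskId;
  int32_t vId;
  int64_t executionId = -1;  // -1 means "no execution id"
};

// Mirrors the branch structure of getVeloxTaskId: prefer the execution id
// when it is present, otherwise use the stage/task/vId form.
inline std::string veloxTaskIdSketch(const TaskInfoSketch& info) {
  if (info.executionId != -1) {
    return "Gluten_Execution_" + std::to_string(info.executionId);
  }
  return "Gluten_Stage_" + std::to_string(info.stageId) +
      "_TID_" + std::to_string(info.taskId) +
      "_VTID_" + std::to_string(info.vId);
}
```

For a task with stage 3, task 17, vId 0 and execution id 42, the sketch returns `Gluten_Execution_42`; with execution id -1 it returns `Gluten_Stage_3_TID_17_VTID_0`. Note that in the first branch the string depends only on the execution id, so two tasks belonging to the same execution produce the same identifier.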
Comment on lines 243 to 246

           memoryManager_->getAggregateMemoryPool(),
           spillExecutor_,
    -      fmt::format(
    -          "Gluten_Stage_{}_TID_{}_VTID_{}",
    -          std::to_string(taskInfo_.stageId),
    -          std::to_string(taskInfo_.taskId),
    -          std::to_string(taskInfo_.vId)));
    +      getVeloxTaskId(taskInfo_));
       return ctx;