Pull request overview
Adds a new Dataset Selector source operator that lets users pick a dataset version in the property panel and emits one tuple per file path in that version, using Texera’s dataset path format.
Changes:
- Backend: introduce `DatasetSelectorSourceOpDesc` + `DatasetSelectorSourceOpExec` and register the operator type.
- Frontend: add a custom Formly field (`datasetversionselector`) to select dataset + version and bind it to `datasetVersionPath`.
- Tests/assets: add a basic descriptor/schema unit test and an operator icon.
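The frontend binding described in the list above can be sketched as follows. This is only an illustration, not the code from the PR: `FieldConfigSketch` and `mapDatasetVersionField` are hypothetical names, and the interface is a local stand-in for the real Formly field config type. Only the property key `datasetVersionPath` and the field type `datasetversionselector` come from the PR.

```typescript
// Local stand-in for a Formly field config (assumption: the real code
// uses the field config type from the Formly library instead).
interface FieldConfigSketch {
  key: string;
  type?: string;
}

// Names taken from the PR overview.
const DATASET_VERSION_PATH_KEY = "datasetVersionPath";
const DATASET_VERSION_SELECTOR_TYPE = "datasetversionselector";

// Hypothetical helper: route the datasetVersionPath property to the
// custom selector control and leave every other field untouched.
function mapDatasetVersionField(field: FieldConfigSketch): FieldConfigSketch {
  if (field.key === DATASET_VERSION_PATH_KEY) {
    return { ...field, type: DATASET_VERSION_SELECTOR_TYPE };
  }
  return field;
}
```

Returning a copy rather than mutating the input keeps the sketch easy to test; the actual component may well mutate the field in place.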
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| frontend/src/assets/operator_images/DatasetSelector.png | Adds an icon for the new operator. |
| frontend/src/app/workspace/component/property-editor/operator-property-edit-frame/operator-property-edit-frame.component.ts | Maps datasetVersionPath to a custom Formly control type. |
| frontend/src/app/workspace/component/dataset-version-selector/dataset-version-selector.component.ts | Implements dataset+version selector logic and writes back datasetVersionPath. |
| frontend/src/app/workspace/component/dataset-version-selector/dataset-version-selector.component.html | UI for dataset and version dropdowns. |
| frontend/src/app/common/formly/formly-config.ts | Registers the new Formly field type datasetversionselector. |
| frontend/src/app/app.module.ts | Declares the new selector component. |
| common/workflow-operator/src/test/scala/.../DatasetSelectorSourceOpDescSpec.scala | Adds unit tests for descriptor metadata and output schema. |
| common/workflow-operator/src/main/scala/.../DatasetSelectorSourceOpExec.scala | Implements tuple production by resolving dataset version and listing objects. |
| common/workflow-operator/src/main/scala/.../DatasetSelectorSourceOpDesc.scala | Defines operator metadata + output schema (filename). |
| common/workflow-operator/src/main/scala/.../LogicalOp.scala | Registers DatasetSelector in the operator type list. |
What changes were proposed in this PR?
This PR adds a new Dataset Selector source operator that lets users select a dataset version from the property panel and outputs one tuple per file path in that version. The emitted values follow Texera’s existing dataset file path format, so downstream operators can consume them directly. On the frontend, this PR adds a dedicated dataset-version selector field to the property panel and wires `datasetVersionPath` to that custom UI.
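As a rough illustration of the "one tuple per file path" behavior, the sketch below expands a selected version into output tuples with a single `filename` attribute (the attribute name is taken from the review summary). Everything else is an assumption: the real operator is implemented in Scala and lists objects from dataset storage, `toFilenameTuples` is a hypothetical helper, and the example path layout is not confirmed by this PR.

```typescript
// Output tuple with the single `filename` attribute.
interface FilenameTuple {
  filename: string;
}

// Hypothetical helper: prefix each file's relative path with the selected
// dataset version path (assumption: the full path is a simple "/" join).
function toFilenameTuples(datasetVersionPath: string, relativePaths: string[]): FilenameTuple[] {
  // Drop trailing slashes on the base and leading slashes on each file
  // so the join never produces "//".
  const base = datasetVersionPath.replace(/\/+$/, "");
  return relativePaths.map(p => ({ filename: `${base}/${p.replace(/^\/+/, "")}` }));
}

// Example with a made-up version path and file list.
const tuples = toFilenameTuples("/alice@example.com/sales/v1/", ["a.csv", "sub/b.csv"]);
```

Each element of `tuples` carries one full path, which is the shape a downstream file-reading operator would receive.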
Any related issues, documentation, discussions?
Closes #4363.
How was this PR tested?
Tested manually. A unit test was also added that covers the Dataset Selector descriptor metadata and output schema.
Was this PR authored or co-authored using generative AI tooling?
No.