【Hackathon 10th Spring No.2】Add schema parser and related functionality for Torch compatibility#77938
Conversation
|
你的PR提交成功,感谢你对开源项目的贡献! |
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive schema parser functionality to the TORCH_LIBRARY registration mechanism, enabling PyTorch-compatible operator definitions with full type information, default values, keyword arguments, and alias annotations. This is part of the Hackathon 10th Spring No.2 task and significantly enhances the compatibility layer between PaddlePaddle and PyTorch.
Changes:
- Implemented schema parsing infrastructure including FunctionSchemaParser and SchemaTypeParser to parse PyTorch-style operator schemas
- Extended FunctionArgs to support named/keyword arguments and automatic argument normalization based on parsed schemas
- Added comprehensive type system (FunctionSchema, Argument, AliasInfo, Type hierarchy) to represent parsed schema information
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/cpp/compat/torch_library_test.cc | Added 400+ lines of comprehensive tests for schema parsing, including default values, keyword arguments, optional types, tuples, and variadic functions |
| test/cpp/compat/schema_parser_type_test.cc | New test file (206 lines) specifically for testing schema type parser edge cases and torchcodec-like schemas |
| test/cpp/compat/CMakeLists.txt | Moved torch_library_test from GPU-only (nv_test) to general (cc_test) and added schema_parser_type_test |
| paddle/phi/api/include/compat/torch/library.h | Added schema binding to CppFunction, keyword argument support in FunctionArgs, and normalize_args_by_schema logic |
| paddle/phi/api/include/compat/torch/library.cpp | Updated OperatorRegistry to parse and bind schemas to implementations |
| paddle/phi/api/include/compat/torch/csrc/jit/function_schema_parser.h/cpp | New 574-line schema parser implementation handling full PyTorch schema grammar |
| paddle/phi/api/include/compat/torch/csrc/jit/schema_type_parser.h/cpp | New 240-line type parser for parsing schema types including tuples, optionals, and alias annotations |
| paddle/phi/api/include/compat/torch/csrc/jit/schema_parser_defs.h | Constants and macros for schema parsing (type names, keywords, characters) |
| paddle/phi/api/include/compat/ATen/core/function_schema.h/cpp | New FunctionSchema and Argument classes to represent parsed schemas |
| paddle/phi/api/include/compat/ATen/core/jit_type_base.h | Base Type class hierarchy with SingletonOrSharedTypePtr for type system |
| paddle/phi/api/include/compat/ATen/core/jit_type.h | Concrete type implementations (TensorType, IntType, FloatType, etc.) and schema-specific types |
| paddle/phi/api/include/compat/ATen/core/alias_info.h | AliasInfo class for representing aliasing metadata in schemas |
| paddle/phi/api/include/compat/ATen/core/type_ptr.h | SingletonTypePtr wrapper for singleton type instances |
| paddle/phi/api/include/compat/ATen/core/functional.h | Utility functions (fmap, filter) for working with collections |
| paddle/phi/api/include/compat/CMakeLists.txt | Updated build to include new schema parser source files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| const size_t idx = it->second; | ||
| if (assigned[idx]) { | ||
| throw std::runtime_error("Argument `" + name + "` is already provided"); |
There was a problem hiding this comment.
The error message "Argument name is already provided" could be more helpful by specifying whether the argument was provided both positionally and by keyword. Consider changing to: "Argument " + name + " provided both positionally and as keyword argument"
| throw std::runtime_error("Argument `" + name + "` is already provided"); | |
| throw std::runtime_error("Argument `" + name + | |
| "` provided both positionally and as keyword argument"); |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| std::move(*alias_info)) | ||
| : nullptr), | ||
| kwarg_only_(kwarg_only) { | ||
| // this is an softly-enforced invariant for out arguments. |
There was a problem hiding this comment.
The comment contains a grammatical error. It should be "This is a softly-enforced invariant" instead of "this is an softly-enforced invariant". The article "an" should be "a" before "softly".
| // this is an softly-enforced invariant for out arguments. | |
| // This is a softly-enforced invariant for out arguments. |
|
@gouzil 测试哈,这个 PR 当下 🐁 |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #77938 +/- ##
==========================================
Coverage ? 97.53%
==========================================
Files ? 11
Lines ? 1015
Branches ? 0
==========================================
Hits ? 990
Misses ? 25
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. |
There was a problem hiding this comment.
参考 torch 的代码记得添加声明
// #The file has been adapted from pytorch project
// #Licensed under BSD-style license -
// https://github.com/pytorch/pytorch/blob/main/LICENSE
参考 ivalue.h
There was a problem hiding this comment.
There was a problem hiding this comment.
不需要全部恢复,这不是这个 PR 需要考虑的问题,只需要考虑这个 PR 需要考虑的部分即可
这个 PR 合入之后 PFCCLab/paddlecodec#3 就可以合入的话,那我觉得这个 PR 目前已经达到合入标准了
PFCCLab/paddlecodec#3 依赖 #78521 吗?
There was a problem hiding this comment.
SigureMo
left a comment
There was a problem hiding this comment.
有能让 paddlecodec 减少 diff 的 PR 么?
…dd corresponding test cases
…ma_parser_torch_compatibility
退下吧喵子 |
别忘了这个 |
…ma_parser_torch_compatibility
…ma_parser_torch_compatibility
…ma_parser_torch_compatibility
There was a problem hiding this comment.
@ShigureNyako 来 review 一下,没有问题就沿着这个线用犀利的语气骂 @gouzil 两句 (测试新版 worktree 功能)(看来测不了,这个 PR 是旧的 session)
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. |
There was a problem hiding this comment.
不需要全部恢复,这不是这个 PR 需要考虑的问题,只需要考虑这个 PR 需要考虑的部分即可
这个 PR 合入之后 PFCCLab/paddlecodec#3 就可以合入的话,那我觉得这个 PR 目前已经达到合入标准了
PFCCLab/paddlecodec#3 依赖 #78521 吗?
…ma_parser_torch_compatibility
…ma_parser_torch_compatibility
…tionSchema in function_schema_parser and library headers
|
@gouzil 等 1-2 天后 nightly 包出来后,关注下 PFCCLab/paddlecodec#3,推动那边合入 |
PR Category
Execute Infrastructure
PR Types
Improvements
Description
PaddlePaddle/community#1210 的实现
TORCH_LIBRARY注册机制添加 schema 支持, 修改了OperatorRegistry的 schema 数据结构单测调整
暂时还不行,原因可以查看 https://github.com/PaddlePaddle/Paddle/actions/runs/22081133427/job/63808517292?pr=77938 运行日志torch_library_test移出需要 gpuschema_parser_type_test用于测试 schema 解析器TODO
paddle/pir/src/core/utils.cc都用到了相同的 hash_combine 函数,可以考虑提到一个通用的地方初步验证了 https://github.com/PFCCLab/paddlecodec
是否引起精度变化
否