feat(deployment): centerpoint deployment integration #181
vividf wants to merge 27 commits into tier4:feat/new_deployment_and_evaluation_pipeline
Conversation
Force-pushed from bfb778f to 441d06e
  verification = dict(
      enabled=False,
-     tolerance=1e-1,
+     tolerance=1,
Could you explain what tolerance means here, and why it was updated from 0.1 to 1?
The value was originally set for calibration classification and later copied to CenterPoint, but it does not work correctly for CenterPoint.
INFO:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) latency: 205.08 ms
INFO:deployment.core.evaluation.verification_mixin: output[heatmap]: shape=(1, 5, 510, 510), max_diff=0.070197, mean_diff=0.007674
INFO:deployment.core.evaluation.verification_mixin: output[reg]: shape=(1, 2, 510, 510), max_diff=0.007944, mean_diff=0.001120
INFO:deployment.core.evaluation.verification_mixin: output[height]: shape=(1, 1, 510, 510), max_diff=0.025401, mean_diff=0.002122
INFO:deployment.core.evaluation.verification_mixin: output[dim]: shape=(1, 3, 510, 510), max_diff=0.031920, mean_diff=0.001143
INFO:deployment.core.evaluation.verification_mixin: output[rot]: shape=(1, 2, 510, 510), max_diff=0.075215, mean_diff=0.004582
INFO:deployment.core.evaluation.verification_mixin: output[vel]: shape=(1, 2, 510, 510), max_diff=0.221999, mean_diff=0.004940
INFO:deployment.core.evaluation.verification_mixin: Overall Max difference: 0.221999
INFO:deployment.core.evaluation.verification_mixin: Overall Mean difference: 0.004347
WARNING:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) verification FAILED ✗ (max diff: 0.221999 > tolerance: 0.100000)
INFO:deployment.core.evaluation.verification_mixin:
Do you know the reason why it fails? Since this is a verification, it's always better to check the cause rather than loosen the tolerance.
It doesn't necessarily indicate a failure.
When converting from PyTorch to TensorRT, some numerical differences are expected due to different kernels, precision handling, and TensorRT optimizations.
The verification is mainly used as a safeguard to detect major issues (e.g., incorrect conversion settings) rather than to enforce exact numerical equivalence.
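For reference, the safeguard described above amounts to comparing each named output of the reference and converted models against a max-diff tolerance. A minimal sketch (function and dict layout are illustrative, not the actual verification_mixin API):

```python
import numpy as np

def verify_outputs(reference, candidate, tolerance=1.0):
    """Compare per-output arrays and report max/mean absolute difference.

    `reference` and `candidate` are hypothetical dicts mapping output
    names (heatmap, reg, height, dim, rot, vel, ...) to numpy arrays.
    """
    overall_max = 0.0
    for name, ref in reference.items():
        diff = np.abs(ref - candidate[name])
        print(f"output[{name}]: shape={ref.shape}, "
              f"max_diff={diff.max():.6f}, mean_diff={diff.mean():.6f}")
        overall_max = max(overall_max, float(diff.max()))
    # Pass as long as the worst output stays within tolerance.
    return overall_max <= tolerance, overall_max
```

This checks only that the conversion is sane, not that outputs are bitwise equal.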
Also, 1e-1 is what we set for ResNet18 in calibration classification, so the cases are different.
By the way, this is the verification result for TensorRT FP16, right? If so, it makes sense.
Anyway, 5e-1 could be a better value.
Running onnx (cuda:0) reference...
2026-03-10 15:20:07.511273431 [V:onnxruntime:, execution_steps.cc:103 Execute] stream 0 activate notification with index 0
2026-03-10 15:20:07.567219724 [V:onnxruntime:, execution_steps.cc:47 Execute] stream 0 wait on Notification with id: 0
INFO:deployment.core.evaluation.verification_mixin: onnx (cuda:0) latency: 1423.80 ms
INFO:deployment.core.evaluation.verification_mixin:
Running tensorrt (cuda:0) test...
INFO:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) latency: 1141.26 ms
INFO:deployment.core.evaluation.verification_mixin: output[heatmap]: shape=(1, 5, 510, 510), max_diff=0.464849, mean_diff=0.056135
INFO:deployment.core.evaluation.verification_mixin: output[reg]: shape=(1, 2, 510, 510), max_diff=0.056639, mean_diff=0.006198
INFO:deployment.core.evaluation.verification_mixin: output[height]: shape=(1, 1, 510, 510), max_diff=0.227012, mean_diff=0.065522
INFO:deployment.core.evaluation.verification_mixin: output[dim]: shape=(1, 3, 510, 510), max_diff=0.336713, mean_diff=0.028087
INFO:deployment.core.evaluation.verification_mixin: output[rot]: shape=(1, 2, 510, 510), max_diff=0.515039, mean_diff=0.023962
INFO:deployment.core.evaluation.verification_mixin: output[vel]: shape=(1, 2, 510, 510), max_diff=0.932002, mean_diff=0.034206
INFO:deployment.core.evaluation.verification_mixin: Overall Max difference: 0.932002
INFO:deployment.core.evaluation.verification_mixin: Overall Mean difference: 0.037279
WARNING:deployment.core.evaluation.verification_mixin: tensorrt (cuda:0) verification FAILED ✗ (max diff: 0.932002 > tolerance: 0.500000)
On a different computer, the values can differ. I will leave it at 1 for now.
Did you set a random seed for this validation? Randomness (for example, shuffling point clouds) significantly affects the results. Otherwise, I believe the difference between computers is too large.
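One way to remove that source of variance is to pin the shuffle to a fixed seed so the verification input is identical across runs and machines. A small sketch under the assumption that point shuffling is the main nondeterminism (helper name is illustrative):

```python
import numpy as np

def shuffle_points(points: np.ndarray, seed: int = 42) -> np.ndarray:
    """Shuffle a point cloud with a fixed, local RNG so that repeated
    verification runs see the same input ordering."""
    rng = np.random.default_rng(seed)
    return points[rng.permutation(len(points))]
```

With a seeded shuffle, any remaining diff between machines reflects kernel and precision differences rather than input ordering.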
@property
def _components_cfg(self) -> Dict[str, Any]:
    """Get unified components configuration."""
    if "components" not in self.config.deploy_cfg:
Thanks, this is fixed in the big refactor b3b8355
def _onnx_config(self) -> Dict[str, Any]:
    """Get shared ONNX export settings."""
    onnx_config_raw = self.config.onnx_config
    if onnx_config_raw is None:
Same here, consider using assert.
use ONNX model variants.
"""
# Import triggers @MODELS.register_module() registrations
from deployment.projects.centerpoint.onnx_models import centerpoint_head_onnx as _head  # noqa: F401
Please remove register_models, simply move these imports to the top of the module, and list them in __all__ individually.
try:
    outputs = self._components_cfg["backbone_head"]["io"]["outputs"]
except KeyError as exc:
    raise KeyError("Missing required config path: components_cfg['backbone_head']['io']['outputs']") from exc
Do we need these? I believe we should just use a dataclass instead of a dict, and then we can remove these checks.
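A sketch of that suggestion, with illustrative field names: a typed config turns a missing key into a construction-time error, so the try/except KeyError around nested dict access becomes unnecessary.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class BackboneHeadIO:
    outputs: List[str]

@dataclass(frozen=True)
class BackboneHeadConfig:
    io: BackboneHeadIO

# Construction fails immediately if a field is missing, so downstream
# access is plain attributes with no KeyError guard.
cfg = BackboneHeadConfig(io=BackboneHeadIO(outputs=["heatmap", "reg", "height"]))
outputs = cfg.io.outputs
```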
Thanks, fixed in the big refactor b3b8355#diff-f5b73e91a7e42d76185a9036cfa8599384848052498bbd125f3eef66a08ada77
Force-pushed from caa92a6 to 93e5558
Force-pushed from de7020e to 6470ac5
Some of the modules, for example, …
model_cfg = Config.fromfile(args.model_cfg)
config = BaseDeploymentConfig(deploy_cfg)

_validate_required_components(config.components_cfg)
Move _validate_required_components to BaseDeploymentConfig.
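A sketch of where that check could live, assuming BaseDeploymentConfig wraps the deploy dict (REQUIRED_COMPONENTS and the field names are illustrative): validating in the constructor means every entry point gets the check for free.

```python
from typing import Any, Dict

class BaseDeploymentConfig:
    """Illustrative config wrapper that validates its own components."""
    REQUIRED_COMPONENTS = ("voxel_encoder", "backbone_head")

    def __init__(self, deploy_cfg: Dict[str, Any]):
        self.deploy_cfg = deploy_cfg
        self.components_cfg = deploy_cfg.get("components", {})
        self._validate_required_components()

    def _validate_required_components(self) -> None:
        # Fail fast at construction instead of at each call site.
        missing = [c for c in self.REQUIRED_COMPONENTS
                   if c not in self.components_cfg]
        if missing:
            raise KeyError(f"Missing required components: {missing}")
```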
context = CenterPointExportContext(rot_y_axis_reference=bool(getattr(args, "rot_y_axis_reference", False)))
runner.run(context=context)
return 0
Do we need to return a status code here?
def _release_gpu_resources(self) -> None:
    """Release TensorRT resources (engines and contexts) and CUDA events."""
    # Destroy CUDA events
    if hasattr(self, "_backbone_start_event"):
Use a for-loop to achieve this.
}

for component_name, engine_path in engine_files.items():
    if not osp.exists(engine_path):
This error validation should be done in resolve_artifact_path
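Moving the existence check into the resolver might look like this (function signature is a guess at resolve_artifact_path; the point is that callers then never repeat the osp.exists check):

```python
import os.path as osp

def resolve_artifact_path(base_dir: str, filename: str) -> str:
    """Resolve an artifact path and validate existence at the source,
    so call sites can trust the returned path."""
    path = osp.join(base_dir, filename)
    if not osp.exists(path):
        raise FileNotFoundError(f"Engine file not found: {path}")
    return path
```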
Force-pushed from 5256306 to 2b28f60
Signed-off-by: vividf <yihsiang.fang@tier4.jp>
Force-pushed from 1ca0e1c to a6b9840
Summary
Integrates CenterPoint into the unified deployment framework, enabling deployment and evaluation of ONNX and TensorRT models.
Note: this PR includes the changes from #180.
Changes
- Moved projects/CenterPoint to deployment/projects/centerpoint
- Replaced the deploy.py script with the new unified CLI (deployment.cli.main)

Migration Notes
- The old script (projects/CenterPoint/scripts/deploy.py) is removed
- Run python -m deployment.cli.main centerpoint <deploy_config> <model_config>
- ONNX model variants now live in deployment.projects.centerpoint.onnx_models
Exported ONNX (Same)
Voxel Encoder
(screenshot of the exported voxel encoder ONNX graph)
Backbone Head
(screenshot of the exported backbone head ONNX graph)