[AINode] Integrate device manager framework #16998
base: master
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| Metric | master | #16998 | +/- |
|---|---|---|---|
| Coverage | 39.28% | 39.33% | +0.05% |
| Complexity | 212 | 212 | |
| Files | 5073 | 5079 | +6 |
| Lines | 339848 | 340571 | +723 |
| Branches | 43401 | 43477 | +76 |
| Hits | 133495 | 133976 | +481 |
| Misses | 206353 | 206595 | +242 |
Pull request overview
This pull request integrates a device manager framework so that AINode can manage models on different processing units (CPU, CUDA). The changes introduce a unified backend usage interface through DeviceManager, replace string-based device IDs with torch.device objects throughout the codebase, and update the Thrift API to return device types alongside device IDs.
Key changes:
- New device manager framework with backend adapters for CPU and CUDA
- Refactored device handling from string-based IDs to `torch.device` objects
- Updated Thrift API to return device type information alongside device IDs
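
To make the move from string-based device IDs to structured device objects concrete, here is a minimal sketch. It is not the PR's code: `DeviceSpec` and `parse_device` are hypothetical names, and `DeviceSpec` is only a stand-in for `torch.device` so that the example runs without PyTorch installed. The accepted device types (`cpu`, `cuda`) are taken from the overview above; everything else is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceSpec:
    """Hypothetical stand-in for torch.device: a type plus an optional index."""

    type: str
    index: Optional[int] = None

    def __str__(self) -> str:
        return self.type if self.index is None else f"{self.type}:{self.index}"


def parse_device(text: str) -> DeviceSpec:
    """Parse 'cpu', 'cuda', or 'cuda:N' into a DeviceSpec (hypothetical helper)."""
    kind, sep, idx = text.strip().lower().partition(":")
    if kind not in ("cpu", "cuda"):
        raise ValueError(f"unsupported device type: {text!r}")
    if not sep:
        # No index given, e.g. 'cpu' or 'cuda'.
        return DeviceSpec(kind)
    if not idx.isdigit():
        raise ValueError(f"invalid device index in {text!r}")
    return DeviceSpec(kind, int(idx))
```

Validating once at the boundary means the rest of the code can compare and hash device objects instead of re-parsing strings, which is presumably one motivation for the refactoring.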
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| iotdb-protocol/thrift-ainode/src/main/thrift/ainode.thrift | Changed TShowAIDevicesResp from list to map to include device type information |
| iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/schema/column/ColumnHeaderConstant.java | Added DEVICE_TYPE column header and updated showAIDevicesColumnHeaders |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/execution/config/metadata/ai/ShowAIDevicesTask.java | Updated to handle deviceIdMap instead of deviceIdList |
| iotdb-core/ainode/pyproject.toml | Restricted Python version and adjusted numpy/pandas version constraints |
| iotdb-core/ainode/iotdb/ainode/core/util/gpu_mapping.py | Removed (functionality replaced by device manager) |
| iotdb-core/ainode/iotdb/ainode/core/rpc/handler.py | Integrated DeviceManager and updated device validation logic |
| iotdb-core/ainode/iotdb/ainode/core/model/model_loader.py | Updated to use DeviceManager for model device placement |
| iotdb-core/ainode/iotdb/ainode/core/manager/inference_manager.py | Changed to use torch.device instead of string device IDs |
| iotdb-core/ainode/iotdb/ainode/core/manager/device_manager.py | New unified device management interface |
| iotdb-core/ainode/iotdb/ainode/core/inference/pool_scheduler/basic_pool_scheduler.py | Updated device ID types from string to torch.device |
| iotdb-core/ainode/iotdb/ainode/core/inference/pool_scheduler/abstract_pool_scheduler.py | Updated type signatures for torch.device |
| iotdb-core/ainode/iotdb/ainode/core/inference/pool_controller.py | Refactored to use torch.device throughout |
| iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/pipeline_loader.py | Updated device parameter type to torch.device |
| iotdb-core/ainode/iotdb/ainode/core/inference/pipeline/basic_pipeline.py | Integrated DeviceManager for device handling |
| iotdb-core/ainode/iotdb/ainode/core/inference/inference_request_pool.py | Updated device handling with DeviceManager |
| iotdb-core/ainode/iotdb/ainode/core/device/env.py | New distributed environment handling |
| iotdb-core/ainode/iotdb/ainode/core/device/device_utils.py | New device parsing utilities |
| iotdb-core/ainode/iotdb/ainode/core/device/backend/cuda_backend.py | New CUDA backend adapter |
| iotdb-core/ainode/iotdb/ainode/core/device/backend/cpu_backend.py | New CPU backend adapter |
| iotdb-core/ainode/iotdb/ainode/core/device/backend/base.py | New backend adapter protocol definition |
| iotdb-core/ainode/iotdb/ainode/core/device/backend/__init__.py | New package initialization |
| iotdb-core/ainode/iotdb/ainode/core/device/__init__.py | New package initialization |
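
The table notes that `TShowAIDevicesResp` changed from a list to a map so that each device ID carries its device type. The sketch below illustrates that shape change only. The function names, the key format, and the two-column row layout are assumptions made for illustration; the actual Thrift field names and ID format are not shown in this PR page.

```python
from typing import Dict, List, Tuple


def build_device_map(devices: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map each device ID to its device type (hypothetical helper)."""
    device_map: Dict[str, str] = {}
    for device_id, device_type in devices:
        if device_id in device_map:
            raise ValueError(f"duplicate device id: {device_id!r}")
        device_map[device_id] = device_type
    return device_map


def to_rows(device_map: Dict[str, str]) -> List[Tuple[str, str]]:
    """Flatten the map into (device ID, device type) rows, sorted by ID."""
    return sorted(device_map.items())
```

With a plain list, a client would have to infer the type from the ID string; a map lets the result set show a separate device type column directly.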
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…o aidevicemanager
LGTM

LGTM~

To support AINode managing models on different processing units, this PR introduces a device manager. It provides a unified backend usage interface, which can be found at `iotdb/ainode/core/manager/device_manager.py`.
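
As a rough illustration of what a unified backend interface can look like, the sketch below pairs a small adapter protocol with a manager that dispatches on device type. It is not the content of `device_manager.py`: `BackendAdapter`, `CpuBackend`, `FakeCudaBackend`, and every method name here are hypothetical, and the CUDA adapter is a test double with a fixed device count so the example runs without PyTorch or a GPU.

```python
from typing import Dict, Iterable, Protocol


class BackendAdapter(Protocol):
    """Hypothetical per-backend interface; the PR's real protocol may differ."""

    name: str

    def is_available(self) -> bool: ...

    def device_count(self) -> int: ...


class CpuBackend:
    name = "cpu"

    def is_available(self) -> bool:
        return True

    def device_count(self) -> int:
        return 1


class FakeCudaBackend:
    """Test double for a CUDA adapter; reports a fixed device count."""

    name = "cuda"

    def __init__(self, count: int) -> None:
        self._count = count

    def is_available(self) -> bool:
        return self._count > 0

    def device_count(self) -> int:
        return self._count


class DeviceManager:
    """Keeps only available backends and routes device strings to them."""

    def __init__(self, backends: Iterable[BackendAdapter]) -> None:
        self._backends = {b.name: b for b in backends if b.is_available()}

    def available_devices(self) -> Dict[str, str]:
        """Return a map from 'type:index' to device type."""
        devices: Dict[str, str] = {}
        for name, backend in self._backends.items():
            for i in range(backend.device_count()):
                devices[f"{name}:{i}"] = name
        return devices

    def backend_for(self, device: str) -> BackendAdapter:
        """Look up the adapter for a device string such as 'cuda:0'."""
        kind = device.partition(":")[0]
        if kind not in self._backends:
            raise ValueError(f"no available backend for {device!r}")
        return self._backends[kind]
```

Callers ask the manager for a backend rather than branching on device type themselves, so supporting another processing unit would mean adding one adapter instead of touching every call site.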