This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Commit 5396662

example Yolo deployment server (#104)
* example Yolo deployment server
* add pre-nms postprocessing
* pre-processing note
1 parent dcd1e66 commit 5396662

4 files changed, with 396 additions and 0 deletions

integrations/ultralytics/README.md

Lines changed: 5 additions & 0 deletions
@@ -100,3 +100,8 @@ python train.py \
100100
--sparseml-recipe /PATH/TO/RECIPE/recipe.yaml \
101101
<regular yolov5/train.py parameters>
102102
```
103+
104+
105+
## Server
106+
The `server/` directory contains a self-documented example of deploying a sparsified
107+
Yolo model with the DeepSparse engine.
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
<!--
Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Example Yolo Model Server and Client Using DeepSparse Flask

To illustrate how the DeepSparse engine can be used for Yolo model deployments, this directory
contains a sample model server and client.

The server uses Flask to create an app with the DeepSparse Engine hosting a
compiled Yolo model.
The client sends requests to the server, which returns object detection results for the given images.


## Installation

As with the SparseML integration, install the dependencies via `pip` and copy the files in
this self-contained example directly into the `ultralytics/yolov5` repository for execution.

If both repositories are already cloned, you may skip the clone step.

```bash
# clone
git clone https://github.com/ultralytics/yolov5.git
git clone https://github.com/neuralmagic/sparseml.git

# copy script
cp sparseml/integrations/ultralytics/server/*.py yolov5
cd yolov5

# install dependencies
pip install -r requirements.txt
pip install deepsparse flask flask-cors
```

## Execution

### Server

First, start `server.py` with your model of choice.

Example command:
```bash
python server.py ~/models/yolov3-pruned_quant.onnx
```

You can leave it running as a detached process or in a spare terminal.

This starts a Flask app with the DeepSparse Engine as the inference backend, accessible at `http://0.0.0.0:5543` by default.

The app exposes HTTP endpoints at:
- `/info` to get information about the compiled model
- `/predict` to send images to the model and receive the detected objects in response.
  The number of images should match the compiled model's batch size.

For a full list of options, run `python server.py -h`.

Currently, the server is set up to do pre-processing for the yolov3-spp
model. If a different model is used, the image shape, output shapes, and
anchor grids should be updated.
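
As a rough illustration of the two endpoints listed above, the sketch below builds their URLs and reads the raw `/info` response using only the standard library. The helper names `endpoint_url` and `fetch_info` are hypothetical and not part of this example's code. The sketch also assumes that `/info` answers a plain GET request and returns text, which this README does not state. `127.0.0.1` is used as the client-side address because `0.0.0.0` is a bind address that some platforms will not connect to.

```python
import urllib.request


def endpoint_url(route: str, address: str = "127.0.0.1", port: int = 5543) -> str:
    """Build the URL for a server route such as "info" or "predict"."""
    return f"http://{address}:{port}/{route.lstrip('/')}"


def fetch_info(address: str = "127.0.0.1", port: int = 5543, timeout: float = 5.0) -> str:
    """Request /info and return the raw response body as text.

    Assumption: /info accepts GET. The response schema is not documented
    here, so the body is returned as text instead of being parsed.
    """
    url = endpoint_url("info", address, port)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage, once `python server.py <model>` is already running:
# print(fetch_info())
```

For `/predict`, the `YoloDetectionClient` described in the Client section handles encoding and decoding, so a raw request is rarely needed.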

### Client

`client.py` provides a `YoloDetectionClient` object that makes requests to the server easy.
The file is self-documented. See the example usage below:

```python
from client import YoloDetectionClient

remote_model = YoloDetectionClient()
image_path = "/PATH/TO/EXAMPLE/IMAGE.jpg"

model_outputs = remote_model.detect(image_path)
```
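
Because the number of images in a request should match the compiled model's batch size, a longer list of images has to be split before it is sent. The following is a minimal sketch of one way to do that. The `batches` helper is hypothetical, and padding a short final batch by repeating its last item is an assumption made for illustration, not behavior provided by `client.py` or `server.py`.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches containing exactly `batch_size` items.

    A short final batch is padded by repeating its last item so that every
    request matches the compiled batch size. Callers should discard the
    results that correspond to the padded entries.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        batch = list(items[start:start + batch_size])
        # Pad only when the final slice came up short
        batch.extend([batch[-1]] * (batch_size - len(batch)))
        yield batch


# Hypothetical usage against a server whose model was compiled with batch size 4:
# from client import YoloDetectionClient
# remote_model = YoloDetectionClient()
# for batch in batches(image_paths, 4):
#     outputs = remote_model.detect(batch)
```

Repeating the last image keeps the helper free of image-library dependencies. Padding with a blank image of the right shape would work as well if repeated detections are undesirable.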
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
# Copyright (c) 2021 - present / Neuralmagic, Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
Client object for making requests to the example DeepSparse Yolo inference server
"""


import time
from typing import List, Union

import numpy
import requests

import cv2
from deepsparse.utils import arrays_to_bytes, bytes_to_arrays


class YoloDetectionClient:
    """
    Client object for making requests to the example DeepSparse Yolo inference server

    :param address: IP address of the server, default is 0.0.0.0
    :param port: Port the server is hosted on, default is 5543
    """

    def __init__(self, address: str = "0.0.0.0", port: int = 5543):
        self._url = f"http://{address}:{port}/predict"

    def detect(
        self, images: Union[str, numpy.ndarray, List[str], List[numpy.ndarray]]
    ) -> List[numpy.ndarray]:
        """
        :param images: list of or singular file paths or numpy arrays of images to
            run the detection model on
        :return: list of post-processed object detection results for each image including
            class label, likelihood, and bounding box coordinates
        """
        # Wrap a single path or array so the rest of the method handles a list
        if not isinstance(images, list):
            images = [images]
        # Load any file paths from disk; arrays are passed through unchanged
        images = [
            cv2.imread(image) if isinstance(image, str) else image for image in images
        ]

        print(f"Sending batch of {len(images)} images for detection to {self._url}")

        start = time.time()
        # Encode inputs
        data = arrays_to_bytes(images)
        # Send data to server for inference
        response = requests.post(self._url, data=data)
        # Decode outputs
        outputs = bytes_to_arrays(response.content)
        total_time = time.time() - start
        print(f"Round-trip time took {total_time * 1000.0:.4f}ms")

        return outputs
