
Commit 12363fa

Merge branch 'mindspore-lab:main' into docs
2 parents 4b342a4 + ee0ae3d commit 12363fa

File tree

14 files changed (+211, -512 lines)

.github/workflows/ci.yml

Lines changed: 4 additions & 4 deletions
@@ -36,10 +36,10 @@ jobs:
           pip install pytest
           # MindSpore must be installed following the instructions from the official website, not from PyPI.
           # That's why we exclude mindspore from requirements.txt.
-          pip install "mindspore>=1.8,<=1.10"
-      #- name: Test with pytest (UT)
-      #  run: |
-      #    pytest tests/modules/*.py
+          pip install "mindspore>=1.9,<=1.10"
+      - name: Test with pytest (UT)
+        run: |
+          pytest tests/ut/*.py
       - name: Test with pytest (ST)
         run: |
           pytest tests/st/test_train_eval_dummy.py
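The pin above is a closed range (`>=1.9,<=1.10`). As an illustration of how such a dotted-version constraint behaves, here is a minimal sketch with a hypothetical helper (not part of the repo or of pip):

```python
def in_range(version, low="1.9", high="1.10"):
    """Return True if `version` satisfies >=low,<=high, comparing
    dotted version strings component by component as integers."""
    def key(v):
        return tuple(int(p) for p in v.split("."))
    return key(low) <= key(version) <= key(high)

print(in_range("1.9"))     # within the pinned range
print(in_range("1.8"))     # below the range, as the old pin allowed
print(in_range("1.10.1"))  # a patch release above the upper bound is excluded
```

Note that tuple comparison gives the same inclusion/exclusion as pip's specifier here: `(1, 10, 1) > (1, 10)`, so `1.10.1` falls outside `<=1.10`.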

README.md

Lines changed: 3 additions & 140 deletions
@@ -65,156 +65,19 @@ pip install git+https://github.com/mindspore-lab/mindocr.git
 
 ### Text Detection Model Training
 
-We will use the **DBNet** model and the **ICDAR2015** dataset for illustration, although other models and datasets are also supported. <!--ICDAR2015 is a commonly-used benchmark for scene text detection.-->
-
-#### 1. Data Preparation
-
-Please download the ICDAR2015 dataset from this [website](https://rrc.cvc.uab.es/?ch=4&com=downloads), then format the dataset annotations by referring to [dataset_convert](tools/dataset_converters/README.md).
-
-After preparation, the data structure should look like this:
-
-``` text
-.
-├── test
-│   ├── images
-│   │   ├── img_1.jpg
-│   │   ├── img_2.jpg
-│   │   └── ...
-│   └── det_gt.txt
-└── train
-    ├── images
-    │   ├── img_1.jpg
-    │   ├── img_2.jpg
-    │   └── ...
-    └── det_gt.txt
-```
-
-#### 2. Configure Yaml
-
-Please choose a yaml config file containing the target pre-defined model and data pipeline that you want to re-use from `configs/det`. Here we choose `configs/det/dbnet/db_r50_icdar15.yaml`.
-
-Then change the data config args as follows:
-``` yaml
-train:
-  dataset:
-    data_dir: PATH/TO/TRAIN_IMAGES_DIR
-    label_file: PATH/TO/TRAIN_LABELS.txt
-eval:
-  dataset:
-    data_dir: PATH/TO/TEST_IMAGES_DIR
-    label_file: PATH/TO/TEST_LABELS.txt
-```
-
-Optionally, change `num_workers` according to the number of CPU cores, and set `distribute` to True if you want to train in distributed mode.
-
-#### 3. Training
-
-To train the model, please run
-
-``` shell
-# train dbnet on ic15 dataset
-python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
-```
-
-The training result (including checkpoints, per-epoch performance, and curves) will be saved in the directory set by the arg `ckpt_save_dir`.
-
-#### 4. Evaluation
-
-To evaluate, please set the checkpoint path in the arg `ckpt_load_path` in the yaml config file and run
-
-``` shell
-python tools/eval.py --config configs/det/dbnet/db_r50_icdar15.yaml
-```
+We will use the **DBNet** model and the **ICDAR2015** dataset for demonstration, although other models and datasets are also supported. Please refer to the [DBNet model README](configs/det/dbnet/README.md).
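A quick sanity check of the prepared ICDAR2015 layout can be sketched in Python (a hypothetical helper, not part of MindOCR; it only mirrors the directory tree documented above):

```python
from pathlib import Path
import tempfile

def check_split(root: Path) -> bool:
    """Verify a train/ or test/ split matches the documented layout:
    an images/ directory plus a det_gt.txt label file."""
    return (root / "images").is_dir() and (root / "det_gt.txt").is_file()

# Example: build a dummy train/ split and validate it.
base = Path(tempfile.mkdtemp())
(base / "train" / "images").mkdir(parents=True)
(base / "train" / "det_gt.txt").write_text("img_1.jpg\t...\n")
print(check_split(base / "train"))  # True: layout matches
print(check_split(base / "test"))   # False: this split was never prepared
```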
 
 
 ### Text Recognition Model Training
 
-We will use the **CRNN** model and the **LMDB** dataset for illustration, although other models and datasets are also supported.
-
-#### 1. Data Preparation
-
-Please download the LMDB dataset from [here](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0) (ref: [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here)).
-
-There are several .zip data files:
-- `data_lmdb_release.zip` contains the entire dataset, including training, validation, and evaluation subsets.
-- `validation.zip` is the union dataset for validation.
-- `evaluation.zip` contains several benchmark datasets.
-
-After unzipping and preparation, the data structure should look like this:
-
-``` text
-.
-├── train
-│   ├── MJ
-│   │   ├── data.mdb
-│   │   └── lock.mdb
-│   └── ST
-│       ├── data.mdb
-│       └── lock.mdb
-├── validation
-│   ├── data.mdb
-│   └── lock.mdb
-└── evaluation
-    ├── IC03
-    │   ├── data.mdb
-    │   └── lock.mdb
-    ├── IC13
-    │   ├── data.mdb
-    │   └── lock.mdb
-    └── ...
-```
-
-#### 2. Configure Yaml
-
-Please choose a yaml config file containing the target pre-defined model and data pipeline that you want to re-use from `configs/rec`. Here we choose `configs/rec/crnn/crnn_resnet34.yaml`.
+We will use the **CRNN** model and the **LMDB** dataset for demonstration, although other models and datasets are also supported. Please refer to the [CRNN model README](configs/rec/crnn/README.md).
 
-Please change the data config args accordingly, for example:
-``` yaml
-train:
-  dataset:
-    type: LMDBDataset
-    data_dir: lmdb_data/rec/train/
-eval:
-  dataset:
-    type: LMDBDataset
-    data_dir: lmdb_data/rec/validation/
-```
-
-Optionally, change `num_workers` according to the number of CPU cores, and set `distribute` to True if you want to train in distributed mode.
-
-#### 3. Training
-
-We will use distributed training for the large LMDB dataset.
-
-To train in distributed mode, please run
-
-```shell
-# Distributed training on Ascend devices
-mpirun --allow-run-as-root -n 8 python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml
-```
-
-```shell
-# n is the number of GPUs/NPUs
-mpirun --allow-run-as-root -n 2 python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml
-```
-> Note: please ensure the arg `distribute` in the yaml file is set to True.
-
-The training result (including checkpoints, per-epoch performance, and curves) will be saved in the directory set by the arg `ckpt_save_dir`.
-
-#### 4. Evaluation
-
-To evaluate, please set the checkpoint path in the arg `ckpt_load_path` in the yaml config file and run
-
-``` shell
-python tools/eval.py --config configs/rec/crnn/crnn_resnet34.yaml
-```
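Edits like the `data_dir` changes above are nested-key overrides applied to a default config. A generic sketch of that pattern (a hypothetical helper, independent of MindOCR's actual config loader; the `system.distribute` key is assumed for illustration):

```python
def merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`:
    nested dicts merge key-wise, scalar values in `override` win."""
    out = dict(base)
    for k, v in override.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = merge(out[k], v)
        else:
            out[k] = v
    return out

default = {
    "train": {"dataset": {"type": "LMDBDataset", "data_dir": "lmdb_data/rec/train/"}},
    "system": {"distribute": False},
}
user = {
    "train": {"dataset": {"data_dir": "/data/my_lmdb/train/"}},
    "system": {"distribute": True},
}
cfg = merge(default, user)
print(cfg["train"]["dataset"])   # type preserved, data_dir overridden
print(cfg["system"]["distribute"])  # True
```

The merge copies each level it touches, so the default config is left unchanged and can be reused across runs.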
 
 ### Inference and Deployment
 
 #### Inference with MX Engine
 
-Please refer to [mx_infer](docs/cn/inference_cn.md)
+Please refer to [mx_infer](docs/cn/inference_cn.md).
 
 #### Inference with Lite
 

README_CN.md

Lines changed: 2 additions & 147 deletions
@@ -61,158 +61,13 @@ pip install git+https://github.com/mindspore-lab/mindocr.git
 
 ### Text Detection Model Training
 
-MindOCR supports multiple text detection models and datasets; here we use the **DBNet** model and the **ICDAR2015** dataset for demonstration.
-
-#### 1. Data Preparation
-
-Please download the ICDAR2015 dataset from [this website](https://rrc.cvc.uab.es/?ch=4&com=downloads), then format the dataset annotations by referring to [dataset conversion](tools/dataset_converters/README_CN.md).
-
-After data preparation, the directory structure should look like this:
-
-``` text
-.
-├── test
-│   ├── images
-│   │   ├── img_1.jpg
-│   │   ├── img_2.jpg
-│   │   └── ...
-│   └── det_gt.txt
-└── train
-    ├── images
-    │   ├── img_1.jpg
-    │   ├── img_2.jpg
-    │   └── ...
-    └── det_gt.txt
-```
-
-#### 2. Configure the Yaml File
-
-Choose a yaml config file from `configs/det` that contains the target pre-trained model and data pipeline; here we choose `configs/det/dbnet/db_r50_icdar15.yaml`.
-
-Then change the data config args as follows:
-``` yaml
-train:
-  dataset:
-    data_dir: PATH/TO/TRAIN_IMAGES_DIR
-    label_file: PATH/TO/TRAIN_LABELS.txt
-eval:
-  dataset:
-    data_dir: PATH/TO/TEST_IMAGES_DIR
-    label_file: PATH/TO/TEST_LABELS.txt
-```
-
-[Optional] Set `num_workers` according to the number of CPU cores, and set `distribute` to True to train in distributed mode.
-
-#### 3. Training
-
-Run the following command to start training:
-
-``` shell
-# train dbnet on ic15 dataset
-python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
-```
-
-For distributed mode, run:
-
-```shell
-# n is the number of GPUs/NPUs
-mpirun --allow-run-as-root -n 2 python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
-```
-> Note: please ensure the `distribute` arg in the yaml file is set to True.
-
-The training results (including checkpoints, per-epoch performance, and curves) will be saved in the path set by the `ckpt_save_dir` arg in the yaml config file, which defaults to "./tmp_det/".
-
-#### 4. Evaluation
-
-For evaluation, set the `ckpt_load_path` arg in the yaml config file to the checkpoint path, then run:
-
-``` shell
-python tools/eval.py --config configs/det/dbnet/db_r50_icdar15.yaml
-```
+MindOCR supports multiple text detection models and datasets; here we use the **DBNet** model and the **ICDAR2015** dataset for demonstration. Please refer to the [DBNet model docs](configs/det/dbnet/README_CN.md).
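The `det_gt.txt` label file above pairs each image with its annotations. Assuming the common tab-separated, JSON-encoded convention used by several OCR toolkits (an assumption; see tools/dataset_converters for the actual format), parsing one line can be sketched as:

```python
import json

def parse_det_gt(line: str):
    """Split one label line into (image name, annotation list).
    Assumed format: <image>\t<JSON list of {"transcription", "points"}>."""
    name, raw = line.rstrip("\n").split("\t", 1)
    return name, json.loads(raw)

# Hypothetical example line following the assumed convention.
line = 'img_1.jpg\t[{"transcription": "hello", "points": [[0,0],[10,0],[10,5],[0,5]]}]'
name, anns = parse_det_gt(line)
print(name)                      # img_1.jpg
print(anns[0]["transcription"])  # hello
```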
 
 
 ### Text Recognition Model Training
 
-MindOCR supports multiple text recognition models and datasets; here we use the **CRNN** model and the **LMDB** dataset for demonstration.
-
-#### 1. Data Preparation
-
-Following [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark#download-lmdb-dataset-for-traininig-and-evaluation-from-here), download the LMDB dataset from [here](https://www.dropbox.com/sh/i39abvnefllx2si/AAAbAYRvxzRp3cIE5HzqUw3ra?dl=0).
-
-There are the following .zip data files:
-- `data_lmdb_release.zip` contains the full training, validation, and evaluation data;
-- `validation.zip` is the union of the validation datasets;
-- `evaluation.zip` contains several evaluation datasets.
-
-After unzipping and preparing the data, the folder structure looks like this:
-
-``` text
-.
-├── train
-│   ├── MJ
-│   │   ├── data.mdb
-│   │   └── lock.mdb
-│   └── ST
-│       ├── data.mdb
-│       └── lock.mdb
-├── validation
-│   ├── data.mdb
-│   └── lock.mdb
-└── evaluation
-    ├── IC03
-    │   ├── data.mdb
-    │   └── lock.mdb
-    ├── IC13
-    │   ├── data.mdb
-    │   └── lock.mdb
-    └── ...
-```
-
-#### 2. Configure the Yaml File
-
-Choose a yaml config file from `configs/rec` that contains the target pre-trained model and data pipeline; here we choose `configs/rec/crnn/crnn_resnet34.yaml`.
-
-Change the data config args accordingly:
-``` yaml
-train:
-  dataset:
-    type: LMDBDataset
-    data_dir: lmdb_data/rec/train/
-eval:
-  dataset:
-    type: LMDBDataset
-    data_dir: lmdb_data/rec/validation/
-```
-[Optional] Set `num_workers` according to the number of CPU cores, and set `distribute` to True to train in distributed mode.
-
-#### 3. Training
-
-Run the following command to start training:
-
-``` shell
-# train crnn on MJ+ST dataset
-python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml
-```
+MindOCR supports multiple text recognition models and datasets; here we use the **CRNN** model and the **LMDB** dataset for demonstration. Please refer to the [CRNN model docs](configs/rec/crnn/README_CN.md).
 
-For distributed mode, run:
-
-```shell
-# n is the number of GPUs/NPUs
-mpirun --allow-run-as-root -n 2 python tools/train.py --config configs/rec/crnn/crnn_resnet34.yaml
-```
-> Note: please ensure the `distribute` arg in the yaml file is set to True.
-
-The training results (including checkpoints, per-epoch performance, and curves) will be saved in the path set by the `ckpt_save_dir` arg in the yaml config file, which defaults to "./tmp_rec/".
-
-#### 4. Evaluation
-
-For evaluation, set the `ckpt_load_path` arg in the yaml config file to the checkpoint path, then run:
-
-``` shell
-python tools/eval.py --config configs/rec/crnn/crnn_resnet34.yaml
-```
 
 ### Inference and Deployment
 