Commit 734fed9

Document Update (#603)
Authored by horcham (with co-author)

* Add Doc

Co-authored-by: horcham <zhanghongquan15@huawei.com>

1 parent 6604ed9, commit 734fed9

24 files changed: +2852 additions, -1049 deletions

README.md

Lines changed: 65 additions & 15 deletions
@@ -17,6 +17,7 @@ English | [中文](README_CN.md)
 [📚Tutorials](#tutorials) |
 [🎁Model List](#model-list) |
 [📰Dataset List](#dataset-list) |
+[📖Frequently Asked Questions](#frequently-asked-questions) |
 [🎉Notes](#notes)
 
 </div>
@@ -36,17 +37,18 @@ MindOCR is an open-source toolbox for OCR development and application based on [
 
 ## Installation
 
-<details close markdown>
+<details open markdown>
+<summary> Details </summary>
 
 #### Prerequisites
 
 MindOCR is built on MindSpore AI framework, which supports CPU/GPU/NPU devices.
 MindOCR is compatible with the following framework versions. For details and installation guideline, please refer to the installation links shown below.
 
-- mindspore >= 1.9 (ABINet requires mindspore >= 2.0) [[install](https://www.mindspore.cn/install)]
+- mindspore >= 2.2.0 [[install](https://www.mindspore.cn/install)]
 - python >= 3.7
 - openmpi 4.0.3 (for distributed training/evaluation) [[install](https://www.open-mpi.org/software/ompi/v4.0/)]
-- mindspore lite (for inference) [[install](docs/en/inference/environment.md)]
+- mindspore lite (for offline inference) >= 2.2.0 [[install](docs/en/inference/environment.md)]
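Note: dotted version floors like the ones above compare numerically, not lexically (so "2.10.0" satisfies ">= 2.2.0"). A minimal illustration; the `meets_floor` helper is hypothetical, not part of MindOCR:

```python
def meets_floor(installed: str, floor: str) -> bool:
    """Return True if a dotted version string meets a minimum, compared numerically."""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(installed) >= as_tuple(floor)

print(meets_floor("2.2.0", "2.2.0"))   # the mindspore floor is met
print(meets_floor("1.9.0", "2.2.0"))   # the pre-update floor no longer qualifies
```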
 
 
 #### Dependency
@@ -126,10 +128,12 @@ python tools/eval.py \
 
 For more illustration and usage, please refer to the model training section in [Tutorials](#tutorials).
 
-### 3. Model Inference - Quick Guideline
+### 3. Model Offline Inference - Quick Guideline
 
-You can do MindSpore Lite inference in MindOCR using **MindOCR models** or **Third-party models** (PaddleOCR, MMOCR, etc.).
-Please refer to [MindOCR Models Inference - Quick Start](docs/en/inference/inference_quickstart.md) or [Third-party Models Inference - Quick Start](docs/en/inference/inference_thirdparty_quickstart.md).
+You can do MindSpore Lite inference in MindOCR using **MindOCR models** or **Third-party models** (PaddleOCR, MMOCR, etc.). Please refer to the following documents:
+- [Python/C++ Inference on Ascend 310](docs/en/inference/inference_tutorial.md)
+- [MindOCR Models Offline Inference - Quick Start](docs/en/inference/inference_quickstart.md)
+- [Third-party Models Offline Inference - Quick Start](docs/en/inference/inference_thirdparty_quickstart.md)
 
 ## Tutorials
 
@@ -142,9 +146,12 @@ Please refer to [MindOCR Models Inference - Quick Start](docs/en/inference/infer
 - [Text Recognition](docs/en/tutorials/training_recognition_custom_dataset.md)
 - [Distributed Training](docs/en/tutorials/distribute_train.md)
 - [Advance: Gradient Accumulation, EMA, Resume Training, etc](docs/en/tutorials/advanced_train.md)
-- Inference and Deployment
-  - [Python/C++ Inference on Ascend 310](docs/en/inference/inference_tutorial.md)
+- Inference with MindSpore
   - [Python Online Inference](tools/infer/text/README.md)
+- Inference with MindSpore Lite
+  - [Python/C++ Inference on Ascend 310](docs/en/inference/inference_tutorial.md)
+  - [MindOCR Models Offline Inference - Quick Start](docs/en/inference/inference_quickstart.md)
+  - [Third-party Models Offline Inference - Quick Start](docs/en/inference/inference_thirdparty_quickstart.md)
 - Developer Guides
   - [Customize Dataset](mindocr/data/README.md)
   - [Customize Data Transformation](mindocr/data/transforms/README.md)
@@ -184,6 +191,13 @@ Please refer to [MindOCR Models Inference - Quick Start](docs/en/inference/infer
 
 </details>
 
+<details open markdown>
+<summary>Key Information Extraction</summary>
+
+- [x] [LayoutXLM SER](configs/kie/vi_layoutxlm/README_CN.md) (arXiv'2016)
+
+</details>
+
 For the detailed performance of the trained models, please refer to [configs](./configs).
 
 For details of MindSpore Lite and ACL inference models support, please refer to [MindOCR Models Support List](docs/en/inference/inference_quickstart.md) and [Third-party Models Support List](docs/en/inference/inference_thirdparty_quickstart.md) (PaddleOCR, MMOCR, etc.).
@@ -219,14 +233,49 @@ MindOCR provides a [dataset conversion tool](tools/dataset_converters) to OCR da
 
 </details>
 
+<details close markdown>
+<summary>Layout Analysis Datasets</summary>
+
+- [PublayNet](https://github.com/ibm-aur-nlp/PubLayNet) [[paper](https://arxiv.org/abs/1908.07836)] [[download](https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/publaynet.tar.gz)]
+
+</details>
+
+<details close markdown>
+<summary>Key Information Extraction Datasets</summary>
+
+- [XFUND](https://github.com/doc-analysis/XFUND) [[paper](https://aclanthology.org/2022.findings-acl.253/)] [[download](https://github.com/doc-analysis/XFUND/releases/tag/v1.0)]
+
+</details>
+
 We will include more datasets for training and evaluation. This list will be continuously updated.
 
+## Frequently Asked Questions
+For frequently asked questions about configuring the environment and using MindOCR, please refer to the [FAQ](docs/en/tutorials/frequently_asked_questions.md).
+
 ## Notes
 
 ### What is New
+
+<details close markdown>
+<summary>News</summary>
+
 - 2023/12/14
+  1. Add new trained models
+     - [LayoutXLM SER](configs/kie/vi_layoutxlm) for key information extraction
+     - [VI-LayoutXLM SER](configs/kie/layoutlm_series) for key information extraction
+     - [PP-OCRv3 DBNet](configs/det/dbnet/db_mobilenetv3_ppocrv3.yaml) for text detection and [PP-OCRv3 SVTR](configs/rec/svtr/svtr_ppocrv3_ch.yaml) for recognition, supporting online inference and finetuning
+  2. Add more benchmark datasets and their results
+     - [XFUND](configs/kie/vi_layoutxlm/README_CN.md)
+  3. Multiple specifications support for Ascend 910: DBNet ResNet-50, DBNet++ ResNet-50, CRNN VGG7, SVTR-Tiny, FCENet, ABINet
+- 2023/11/28
+  1. Add offline inference support for PP-OCRv4
+     - [PP-OCRv4 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv4_det_cml.yaml) for text detection and [PP-OCRv4 CRNN](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv4_rec_distillation.yaml) for text recognition, supporting offline inference
+  2. Fix bugs of third-party models offline inference
+- 2023/11/17
   1. Add new trained models
      - [YOLOv8](configs/layout/yolov8) for layout analysis
+  2. Add more benchmark datasets and their results
+     - [PublayNet](configs/layout/yolov8/README_CN.md)
 - 2023/07/06
   1. Add new trained models
      - [RobustScanner](configs/rec/robustscanner) for text recognition
@@ -285,13 +334,14 @@ which can be enabled by add "shape_list" to the `eval.dataset.output_columns` li
 - 2023/03/13
   1. Add system test and CI workflow.
   2. Add modelarts adapter to allow training on OpenI platform. To train on OpenI:
-  ```text
-  i) Create a new training task on the openi cloud platform.
-  ii) Link the dataset (e.g., ic15_mindocr) on the webpage.
-  iii) Add run parameter `config` and write the yaml file path on the website UI interface, e.g., '/home/work/user-job-dir/V0001/configs/rec/test.yaml'
-  iv) Add run parameter `enable_modelarts` and set True on the website UI interface.
-  v) Fill in other blanks and launch.
-  ```
+     ```text
+     i) Create a new training task on the openi cloud platform.
+     ii) Link the dataset (e.g., ic15_mindocr) on the webpage.
+     iii) Add run parameter `config` and write the yaml file path on the website UI interface, e.g., '/home/work/user-job-dir/V0001/configs/rec/test.yaml'
+     iv) Add run parameter `enable_modelarts` and set True on the website UI interface.
+     v) Fill in other blanks and launch.
+     ```
+</details>
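The OpenI steps above hand two run parameters, `config` and `enable_modelarts`, to the training entry point through the platform UI. As a hedged sketch only (MindOCR's actual argument parsing may differ), such parameters could be consumed like this:

```python
import argparse

# Hypothetical consumption of the two run parameters named in the steps above;
# this is an illustration, not MindOCR's real CLI.
parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True,
                    help="path to the model YAML, e.g. configs/rec/test.yaml")
parser.add_argument("--enable_modelarts", type=lambda s: s.lower() == "true",
                    default=False, help="set True on the OpenI UI to use the adapter")

# Simulate the arguments the platform UI would inject.
args = parser.parse_args(["--config", "configs/rec/test.yaml",
                          "--enable_modelarts", "True"])
print(args.config, args.enable_modelarts)
```

Note the `type=lambda ...` conversion: UI-injected values arrive as strings, so "True" must be mapped to a boolean explicitly.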
 
 ### How to Contribute
 
README_CN.md

Lines changed: 80 additions & 24 deletions
@@ -2,7 +2,7 @@
 
 # MindOCR
 
-[![CI](https://github.com/mindspore-lab/mindocr/actions/workflows/ci.yml/badge.svg)](https://github.com/mindspore-lab/mindocr/actions/workflows/ci.yml)
+[![CI](https://github.com/mindspore-lab/mindocr/actions/workflow/ci.yml/badge.svg)](https://github.com/mindspore-lab/mindocr/actions/workflow/ci.yml)
 [![license](https://img.shields.io/github/license/mindspore-lab/mindocr.svg)](https://github.com/mindspore-lab/mindocr/blob/main/LICENSE)
 [![open issues](https://img.shields.io/github/issues/mindspore-lab/mindocr)](https://github.com/mindspore-lab/mindocr/issues)
 [![PRs](https://img.shields.io/badge/PRs-welcome-pink.svg)](https://github.com/mindspore-lab/mindocr/pulls)
@@ -17,6 +17,7 @@
1717
[📚使用教程](#使用教程) |
1818
[🎁模型列表](#模型列表) |
1919
[📰数据集列表](#数据集列表) |
20+
[📖常见问题](#常见问题) |
2021
[🎉更新日志](#更新日志)
2122

2223
</div>
@@ -36,16 +37,16 @@ MindOCR是一个基于[MindSpore](https://www.mindspore.cn/en) 框架开发的OC
 
 ## Installation
 
-<details close markdown>
+<details open markdown>
 
 #### MindSpore Environment Preparation
 
 MindOCR is developed on the MindSpore AI framework (supporting CPU/GPU/NPU) and is compatible with the following framework versions. For installation, please see the links below.
 
-- mindspore >= 1.9 (ABINet requires mindspore >= 2.0) [[install](https://www.mindspore.cn/install)]
+- mindspore >= 2.2.0 [[install](https://www.mindspore.cn/install)]
 - python >= 3.7
-- openmpi 4.0.3 (for distributed training/evaluation) [[install](https://www.open-mpi.org/software/ompi/v4.0/)]
-- mindspore lite (for inference) [[install](docs/cn/inference/environment.md)]
+- openmpi 4.0.3 (for distributed training and evaluation) [[install](https://www.open-mpi.org/software/ompi/v4.0/)]
+- mindspore lite (for offline inference) >= 2.2.0 [[install](docs/cn/inference/environment.md)]
 
 #### Dependencies
 
@@ -93,9 +94,9 @@ python tools/infer/text/predict_system.py --image_dir {path_to_img or dir_to_img
 
 As shown, the text blocks in the image are all detected and correctly recognized. For more detailed usage, please refer to the inference [tutorials](#使用教程).
 
-### 2. Model Training and Evaluation - Quick Guideline
+### 2. Model Training, Evaluation and Inference - Quick Guideline
 
-OCR models can be easily trained with the `tools/train.py` script, which supports both text detection and text recognition model training.
+OCR models can be trained with the `tools/train.py` script, which supports both text detection and text recognition model training.
 ```shell
 python tools/train.py --config {path/to/model_config.yaml}
 ```
@@ -112,19 +113,28 @@ python tools/train.py --config configs/det/dbnet/db++_r50_icdar15.yaml
 python tools/train.py --config configs/rec/crnn/crnn_icdar15.yaml
 ```
 
-Similarly, a trained model can be easily evaluated with the `tools/eval.py` script, as follows:
+A trained model can be evaluated with the `tools/eval.py` script, as follows:
 ```shell
 python tools/eval.py \
     --config {path/to/model_config.yaml} \
     --opt eval.dataset_root={path/to/your_dataset} eval.ckpt_load_path={path/to/ckpt_file}
 ```
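The `--opt` flag above overrides config entries with dotted `key=value` pairs. A minimal sketch of how such pairs can be expanded into a nested config dict; the `parse_opts` helper is illustrative, not MindOCR's actual implementation:

```python
def parse_opts(pairs):
    """Expand dotted key=value overrides, e.g. 'eval.ckpt_load_path=best.ckpt',
    into a nested dict (illustrative helper, not MindOCR's actual code)."""
    cfg = {}
    for pair in pairs:
        key, value = pair.split("=", 1)          # split only on the first '='
        *parents, leaf = key.split(".")          # dotted path into the config
        node = cfg
        for name in parents:
            node = node.setdefault(name, {})     # walk/create intermediate dicts
        node[leaf] = value
    return cfg

print(parse_opts(["eval.dataset_root=/data/ic15",
                  "eval.ckpt_load_path=best.ckpt"]))
```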

-For more usage, please refer to the model training section in [Tutorials](#使用教程).
+Online inference can be run with the `tools/infer/text/predict_system.py` script, as follows:
+```shell
+python tools/infer/text/predict_system.py --image_dir {path_to_img or dir_to_imgs} \
+    --det_algorithm DB++ \
+    --rec_algorithm CRNN
+```
 
-### 3. Model Inference - Quick Guideline
+For more usage, please refer to the model training and inference sections in [Tutorials](#使用教程).
 
-You can run MindSpore Lite inference in MindOCR with **MindOCR models** or **Third-party models** (PaddleOCR, MMOCR, etc.).
-See [MindOCR Models Inference - Quick Start](docs/cn/inference/inference_quickstart.md) or [Third-party Models Inference - Quick Start](docs/cn/inference/inference_thirdparty_quickstart.md).
+### 3. Model Offline Inference - Quick Guideline
+
+You can run MindSpore Lite inference in MindOCR with **MindOCR native models** or **Third-party models** (PaddleOCR, MMOCR, etc.). Please refer to the following documents:
+- [Python/C++ OCR Inference on Ascend 310](docs/cn/inference/inference_tutorial.md)
+- [MindOCR Native Models Offline Inference - Quick Start](docs/cn/inference/inference_quickstart.md)
+- [Third-party Models Offline Inference - Quick Start](docs/cn/inference/inference_thirdparty_quickstart.md)
 
 ## Tutorials
 
@@ -137,9 +147,12 @@ python tools/eval.py \
 - [Text Recognition](docs/cn/tutorials/training_recognition_custom_dataset.md)
 - [Distributed Training](docs/cn/tutorials/distribute_train.md)
 - [Advanced: Gradient Accumulation, EMA, Resume Training, etc.](docs/cn/tutorials/advanced_train.md)
-- Inference and Deployment
-  - [Python/C++ OCR Inference on Ascend 310](docs/cn/inference/inference_tutorial.md)
+- Online Inference with MindSpore
   - [Python Online OCR Inference](tools/infer/text/README.md)
+- Offline Inference with MindSpore Lite
+  - [Python/C++ OCR Inference on Ascend 310](docs/cn/inference/inference_tutorial.md)
+  - [MindOCR Native Models Offline Inference - Quick Start](docs/cn/inference/inference_quickstart.md)
+  - [Third-party Models Offline Inference - Quick Start](docs/cn/inference/inference_thirdparty_quickstart.md)
 - Developer Guides
   - [Customize Dataset](mindocr/data/README.md)
   - [Customize Data Transformation](mindocr/data/transforms/README.md)
@@ -176,10 +189,18 @@ python tools/eval.py \
 - [x] [YOLOv8](configs/layout/yolov8/README_CN.md) ([Ultralytics Inc.](https://github.com/ultralytics/ultralytics))
 </details>
 
+<details open markdown>
+<summary>Key Information Extraction</summary>
+
+- [x] [LayoutXLM SER](configs/kie/vi_layoutxlm/README_CN.md) (arXiv'2016)
+
+</details>
+
 For the training methods and results of the above models, please see the readme of each model subdirectory under [configs](./configs).
 
 For the support lists of [MindSpore Lite](https://www.mindspore.cn/lite) and [ACL](https://www.hiascend.com/document/detail/zh/canncommercial/63RC1/inferapplicationdev/aclcppdevg/aclcppdevg_000004.html) model inference,
-see the [MindOCR Models Inference Support List](docs/cn/inference/inference_quickstart.md) and the [Third-party Models Inference Support List](docs/cn/inference/inference_thirdparty_quickstart.md) (PaddleOCR, MMOCR, etc.).
+see the [MindOCR Native Models Inference Support List](docs/cn/inference/inference_quickstart.md) and the [Third-party Models Inference Support List](docs/cn/inference/inference_thirdparty_quickstart.md) (PaddleOCR, MMOCR, etc.).
 
 ## Dataset List
 
@@ -213,29 +234,63 @@ MindOCR提供了[数据格式转换工具](tools/dataset_converters) ,以支
 
 </details>
 
+<details close markdown>
+<summary>Layout Analysis Datasets</summary>
+
+- [PublayNet](https://github.com/ibm-aur-nlp/PubLayNet) [[paper](https://arxiv.org/abs/1908.07836)] [[download](https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/publaynet.tar.gz)]
+
+</details>
+
+<details close markdown>
+<summary>Key Information Extraction Datasets</summary>
+
+- [XFUND](https://github.com/doc-analysis/XFUND) [[paper](https://aclanthology.org/2022.findings-acl.253/)] [[download](https://github.com/doc-analysis/XFUND/releases/tag/v1.0)]
+
+</details>
+
 We will train and validate models on more datasets. This list will be continuously updated.
 
+## FAQ
+For frequently asked questions about configuring the environment and using MindOCR, please refer to the [FAQ](docs/cn/tutorials/frequently_asked_questions.md).
+
 ## Notes
 
 ### What is New
+<details close markdown>
+<summary>Details</summary>
+
 - 2023/12/14
+  1. Add new models
+     - Key information extraction [LayoutXLM SER](configs/kie/vi_layoutxlm)
+     - Key information extraction [VI-LayoutXLM SER](configs/kie/layoutlm_series)
+     - Text detection [PP-OCRv3 DBNet](configs/det/dbnet/db_mobilenetv3_ppocrv3.yaml) and text recognition [PP-OCRv3 SVTR](configs/rec/svtr/svtr_ppocrv3_ch.yaml), supporting online inference and fine-tuning
+  2. Add more benchmark datasets and their results
+     - [XFUND](configs/kie/vi_layoutxlm/README_CN.md)
+  3. Multiple specifications support for Ascend 910: DBNet ResNet-50, DBNet++ ResNet-50, CRNN VGG7, SVTR-Tiny, FCENet, ABINet
+- 2023/11/28
+  1. Add offline inference support for PP-OCRv4
+     - Text detection [PP-OCRv4 DBNet](deploy/py_infer/src/configs/det/ppocr/ch_PP-OCRv4_det_cml.yaml) and text recognition [PP-OCRv4 CRNN](deploy/py_infer/src/configs/rec/ppocr/ch_PP-OCRv4_rec_distillation.yaml), supporting offline inference
+  2. Fix third-party model offline inference bugs
+- 2023/11/17
   1. Add new models
      - Layout analysis [YOLOv8](configs/layout/yolov8)
+  2. Add more benchmark datasets and their results
+     - [PublayNet](configs/layout/yolov8/README_CN.md)
 - 2023/07/06
   1. Add new models
-     - Text recognition[RobustScanner](configs/rec/robustscanner)
+     - Text recognition [RobustScanner](configs/rec/robustscanner)
 - 2023/07/05
   1. Add new models
-     - Text recognition[VISIONLAN](configs/rec/visionlan)
+     - Text recognition [VISIONLAN](configs/rec/visionlan)
 - 2023/06/29
   1. Add two new SoTA models
-     - Text detection[FCENet](configs/det/fcenet)
-     - Text recognition[MASTER](configs/rec/master)
+     - Text detection [FCENet](configs/det/fcenet)
+     - Text recognition [MASTER](configs/rec/master)
 - 2023/06/07
   1. Add new models
-     - Text detection[PSENet](configs/det/psenet)
-     - Text detection[EAST](configs/det/east)
-     - Text recognition[SVTR](configs/rec/svtr)
+     - Text detection [PSENet](configs/det/psenet)
+     - Text detection [EAST](configs/det/east)
+     - Text recognition [SVTR](configs/rec/svtr)
   2. Add more benchmark datasets and their results
      - [totaltext](docs/cn/datasets/totaltext.md)
      - [mlt2017](docs/cn/datasets/mlt2017.md)
@@ -246,8 +301,8 @@ MindOCR提供了[数据格式转换工具](tools/dataset_converters) ,以支
 
 - 2023/05/15
   1. Add new models
-     - Text detection[DBNet++](configs/det/dbnet)
-     - Text recognition[CRNN-Seq2Seq](configs/rec/rare)
+     - Text detection [DBNet++](configs/det/dbnet)
+     - Text recognition [CRNN-Seq2Seq](configs/rec/rare)
     - [DBNet](https://download.mindspore.cn/toolkits/mindocr/dbnet/dbnet_resnet50_synthtext-40655acb.ckpt) pretrained on the SynthText dataset
  2. Add more benchmark datasets and their results
     - [SynthText](docs/cn/datasets/synthtext.md), [MSRA-TD500](docs/cn/datasets/td500.md), [CTW1500](docs/cn/datasets/ctw1500.md)
@@ -285,6 +340,7 @@ MindOCR提供了[数据格式转换工具](tools/dataset_converters) ,以支
    iv) Add the run parameter `enable_modelarts` on the website UI and set it to True;
    v) Fill in the other fields and launch the training task.
 ```
+</details>
 
 ### How to Contribute
 
configs/kie/vi_layoutxlm/README_CN.md

Lines changed: 1 addition & 0 deletions
@@ -247,4 +247,5 @@ python tools/eval.py --config configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund_zh
 <!--- Guideline: Citation format GB/T 7714 is suggested. -->
 
 [1] Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou. LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding. arXiv preprint arXiv:2012.14740, 2020.
+
 [2] Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei. LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding. arXiv preprint arXiv:2104.08836, 2021.

deploy/py_infer/src/configs/rec/ppocr/rec_chinese_common_train_v2.0.yaml renamed to deploy/py_infer/src/configs/rec/ppocr/rec_chinese_common_v2.0.yaml

File renamed without changes.

deploy/py_infer/src/configs/rec/ppocr/rec_chinese_lite_train_v2.0.yaml renamed to deploy/py_infer/src/configs/rec/ppocr/rec_chinese_lite_v2.0.yaml

File renamed without changes.

deploy/py_infer/src/configs/rec/ppocr/rec_en_number_lite_train.yaml renamed to deploy/py_infer/src/configs/rec/ppocr/rec_en_number_lite.yaml

File renamed without changes.
