Improve dependency (#25)

lolipopshock · web-flow · commit 5f158e3ede0c · 2021-04-12T14:07:04.000-04:00
* Remove pycocotools in the dependency

* Enforce newest detectron2 version v0.4

* Add test_Detectron2Model_version_compatibility for checking detectron2 v0.4

* disable test_Detectron2Model_version_compatibility

* Add installation instructions

* Add installation instructions across the repo

* Update copyright for doc
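
For reference, the v0.4 requirement above could also be checked at runtime. The following is only a minimal sketch, not part of this commit: the helper names (`installed_version`, `matches_required`) are hypothetical, it assumes the package is registered under the distribution name `detectron2`, and it relies on `importlib.metadata`, which needs Python 3.8 or newer.

```python
from importlib import metadata
from typing import Optional

REQUIRED = "0.4"  # assumption: the major.minor release this commit pins


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_required(version: Optional[str], required: str = REQUIRED) -> bool:
    """Check that the major.minor part of `version` equals `required`."""
    if version is None:
        return False
    # Drop a local build suffix such as "+cu111" before comparing.
    public = version.split("+")[0]
    return public.split(".")[:2] == required.split(".")


if __name__ == "__main__":
    found = installed_version("detectron2")  # assumed distribution name
    if not matches_required(found):
        print(f"Expected detectron2 {REQUIRED}.x, found: {found}")
```

A check like this only compares version strings; it does not confirm that the installed PyTorch build is compatible with Detectron2.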
diff --git a/README.md b/README.md
@@ -18,19 +18,22 @@ Layout Parser is a deep learning based tool for document image layout analysis t
 
 ## Installation 
 
-Use pip or conda to install the library:
+See [installation.md](installation.md) for detailed installation instructions. In most cases, a few `pip install` 
+commands are enough: 
+
 ```bash
-pip install layoutparser
+pip install -U layoutparser
 
 # Install Detectron2 for using DL Layout Detection Model
 # Please make sure the PyTorch version is compatible with
 # the installed Detectron2 version. 
-pip install 'git+https://github.com/facebookresearch/detectron2.git#egg=detectron2' 
+pip install 'git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2' 
 
 # Install the ocr components when necessary 
 pip install layoutparser[ocr]      
 ```
-This by default will install the CPU version of the Detectron2, and it should be able to run on most of the computers. But if you have a GPU, you can consider the GPU version of the Detectron2, referring to the [official instructions](https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md).
+
+**For Windows Users:** Please read [installation.md](installation.md) for details about installing Detectron2.
 
 ## Quick Start
 
diff --git a/dev-requirements.txt b/dev-requirements.txt
@@ -10,4 +10,4 @@ sphinx_rtd_theme
 google-cloud-vision==1
 pytesseract
 pycocotools
-git+https://github.com/facebookresearch/detectron2.git@v0.1.3#egg=detectron2
+git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2
diff --git a/docs/conf.py b/docs/conf.py
@@ -18,8 +18,8 @@
 # -- Project information -----------------------------------------------------
 
 project = 'Layout Parser'
-copyright = '2020, Zejiang Shen, Ruochen Zhang'
-author = 'Zejiang Shen, Ruochen Zhang'
+copyright = '2020-2021, Layout Parser Contributors'
+author = 'Layout Parser Contributors'
 
 # The full version, including alpha/beta/rc tags
 release = layoutparser.__version__
diff --git a/docs/index.rst b/docs/index.rst
@@ -10,7 +10,7 @@ Welcome to Layout Parser's documentation!
    :maxdepth: 2
    :caption: Notes
 
-   notes/quickstart
+   notes/installation.md
    notes/modelzoo.md
 
 .. toctree::
diff --git a/docs/notes/installation.md b/docs/notes/installation.md
@@ -0,0 +1 @@
+../../installation.md
diff --git a/installation.md b/installation.md
@@ -0,0 +1,72 @@
+# Installation
+
+## Install Python
+
+Layout Parser is a Python package that requires Python >= 3.6. If you do not have Python 
+installed on your computer, follow [the official instructions](https://www.python.org/downloads/)
+to download and install an appropriate version of Python. 
+
+## Install the Layout Parser main library
+
+To install the Layout Parser library, run the following command: 
+
+```bash
+pip3 install -U layoutparser
+```
+
+## [Optional] Install Detectron2 and modeling utils
+
+### For Mac OS and Linux users 
+
+If you would like to use deep learning model detection, you also need to install Detectron2 
+on your computer. This can be done by running the following command: 
+
+```bash
+pip3 install 'git+https://github.com/facebookresearch/detectron2.git@v0.4#egg=detectron2' 
+```
+
+This might take some time, as the command *compiles* the library. If you want a Detectron2 build with
+GPU support, or if you encounter issues during installation, please refer to the official Detectron2 
+[installation instructions](https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md) for detailed
+information. 
+
+### For Windows users
+
+As reported by many users, installing Detectron2 on Windows is tricky. 
+In our extensive tests, we found it nearly impossible to provide a one-line installation command for Windows users.
+As a workaround, we summarize below the likely challenges of installing Detectron2 on Windows and link to helpful 
+resources for solving them. 
+We are also investigating ways to enable the pre-trained models without Detectron2. If you have any 
+suggestions or ideas for fixing this issue, please feel free to [submit an issue](https://github.com/Layout-Parser/layout-parser/issues) in our repo. 
+
+1. Challenges for installing `pycocotools` 
+    - You can find detailed instructions in [this post](https://changhsinlee.com/pycocotools/) by Chang Hsin Lee. 
+    - Another option is to install `pycocotools-windows`; see https://github.com/cocodataset/cocoapi/issues/415. 
+2. Challenges for installing the `Detectron2` library itself
+    - [@ivanpp](https://github.com/ivanpp) has written a detailed guide to installing `Detectron2` on Windows: [Detectron2 walkthrough (Windows)](https://ivanpp.cc/detectron2-walkthrough-windows/#step3installdetectron2)
+    - `Detectron2` states that it does not officially support Windows (see [1](https://github.com/facebookresearch/detectron2/issues/9#issuecomment-540974288) and [2](https://detectron2.readthedocs.io/en/latest/tutorials/install.html)), but also that it is continuously built on Windows with CircleCI (see [3](https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues)). Hopefully this situation will improve in the future.
+
+
+## [Optional] Install OCR utils
+
+Layout Parser also supports OCR functions. To use them, you need 
+to install the OCR utils via: 
+
+```bash
+pip3 install -U layoutparser[ocr]
+```
+
+Additionally, if you want to use the Tesseract-OCR engine, you also need to install it on your computer. Please check the 
+[official documentation](https://tesseract-ocr.github.io/tessdoc/Installation.html) for detailed installation instructions. 
+
+## Known issues
+
+<details><summary>Error: instantiating lp.GCVAgent.with_credential returns module 'google.cloud.vision' has no attribute 'types'. </summary>
+<p>
+
+This error occurs when a newer version of `google-cloud-vision` is installed. Consider downgrading it by running: 
+```bash
+pip install layoutparser[ocr]
+```
+</p>
+</details>
diff --git a/setup.py b/setup.py
@@ -26,7 +26,6 @@
         "pyyaml>=5.1",
         "torch",
         "torchvision",
-        "pycocotools>=2.0.2",
         "iopath",
       ],
       extras_require={
diff --git a/tests/fixtures/model/layout_detection_reference.jpg b/tests/fixtures/model/layout_detection_reference.jpg
diff --git a/tests/fixtures/model/layout_detection_reference.json b/tests/fixtures/model/layout_detection_reference.json
@@ -0,0 +1 @@
+{"page_data": {}, "blocks": [{"x_1": 648.9922485351562, "y_1": 1418.7113037109375, "x_2": 1132.6805419921875, "y_2": 1479.303955078125, "block_type": "rectangle", "type": "Text", "score": 0.9995978474617004}, {"x_1": 106.12457275390625, "y_1": 1032.07470703125, "x_2": 599.2977905273438, "y_2": 1323.208984375, "block_type": "rectangle", "type": "Text", "score": 0.9981802701950073}, {"x_1": 639.54736328125, "y_1": 773.1265869140625, "x_2": 1135.9765625, "y_2": 1044.6507568359375, "block_type": "rectangle", "type": "Text", "score": 0.9974864721298218}, {"x_1": 104.36861419677734, "y_1": 767.3282470703125, "x_2": 595.1759643554688, "y_2": 970.451171875, "block_type": "rectangle", "type": "Text", "score": 0.9974320530891418}, {"x_1": 107.37610626220703, "y_1": 1448.544189453125, "x_2": 598.3998413085938, "y_2": 1488.01611328125, "block_type": "rectangle", "type": "Text", "score": 0.9953517913818359}, {"x_1": 132.01339721679688, "y_1": 146.253173828125, "x_2": 1160.3997802734375, "y_2": 652.8322143554688, "block_type": "rectangle", "type": "Figure", "score": 0.9953091740608215}, {"x_1": 103.79012298583984, "y_1": 1327.6717529296875, "x_2": 601.3895874023438, "y_2": 1429.9224853515625, "block_type": "rectangle", "type": "Text", "score": 0.9949470162391663}, {"x_1": 103.83270263671875, "y_1": 671.7702026367188, "x_2": 1138.1756591796875, "y_2": 748.6300659179688, "block_type": "rectangle", "type": "Text", "score": 0.9943684935569763}, {"x_1": 104.0943832397461, "y_1": 985.9046020507812, "x_2": 444.34979248046875, "y_2": 1011.3511352539062, "block_type": "rectangle", "type": "Title", "score": 0.9880087375640869}, {"x_1": 395.9805908203125, "y_1": 141.7040252685547, "x_2": 1141.115478515625, "y_2": 659.3515625, "block_type": "rectangle", "type": "Figure", "score": 0.9815265536308289}, {"x_1": 107.32891845703125, "y_1": 149.01644897460938, "x_2": 405.1805419921875, "y_2": 582.9757690429688, "block_type": "rectangle", "type": "Figure", "score": 0.965209424495697}, {"x_1": 638.6964721679688, "y_1": 1075.6173095703125, "x_2": 1137.9869384765625, "y_2": 1154.6956787109375, "block_type": "rectangle", "type": "Text", "score": 0.9612341523170471}, {"x_1": 137.1743621826172, "y_1": 591.2607421875, "x_2": 376.2920227050781, "y_2": 609.2918701171875, "block_type": "rectangle", "type": "Text", "score": 0.9027073979377747}, {"x_1": 643.3095703125, "y_1": 1175.7694091796875, "x_2": 1127.9664306640625, "y_2": 1416.0784912109375, "block_type": "rectangle", "type": "Table", "score": 0.8846631646156311}]}
diff --git a/tests/test_model.py b/tests/test_model.py
@@ -1,3 +1,4 @@
+from layoutparser import load_json
 from layoutparser.models import *
 import cv2
 
@@ -32,4 +33,18 @@ def test_Detectron2Model(is_large_scale=False):
     # Test in enforce CPU mode
     model = Detectron2LayoutModel("tests/fixtures/model/config.yml", enforce_cpu=True)
     image = cv2.imread("tests/fixtures/model/test_model_image.jpg")
-    layout = model.detect(image)
+    layout = model.detect(image)
+    
+def test_Detectron2Model_version_compatibility(enabled=False):
+    
+    if enabled:
+        model = Detectron2LayoutModel(
+            config_path="lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config",
+            extra_config=[
+                "MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.85,
+                "MODEL.ROI_HEADS.NMS_THRESH_TEST", 0.75,
+            ],
+        )
+        image = cv2.imread("tests/fixtures/model/layout_detection_reference.jpg")
+        layout = model.detect(image)
+        assert load_json("tests/fixtures/model/layout_detection_reference.json") == layout
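
The compatibility test above compares the detected layout to the reference fixture with `==`, which depends on exact floating-point agreement across Detectron2 and PyTorch builds. As a hedged alternative, the sketch below compares blocks within a tolerance. It is not part of this commit: it assumes both layouts are available as plain lists of dicts shaped like the `blocks` entries in `layout_detection_reference.json`, that both lists are in the same order, and the function name and tolerance values are hypothetical.

```python
import math
from typing import Any, Dict, List

COORD_KEYS = ("x_1", "y_1", "x_2", "y_2")


def blocks_close(
    expected: List[Dict[str, Any]],
    actual: List[Dict[str, Any]],
    coord_tol: float = 1.0,   # assumed tolerance, in pixels
    score_tol: float = 1e-2,  # assumed tolerance on confidence scores
) -> bool:
    """Return True if two block lists agree within the given tolerances."""
    if len(expected) != len(actual):
        return False
    for exp, act in zip(expected, actual):
        # Block categories must match exactly.
        if exp["type"] != act["type"]:
            return False
        # Coordinates may drift slightly between library versions.
        for key in COORD_KEYS:
            if not math.isclose(exp[key], act[key], abs_tol=coord_tol):
                return False
        if not math.isclose(exp["score"], act["score"], abs_tol=score_tol):
            return False
    return True
```

Whether a tolerance-based check is appropriate depends on what the test is meant to guard: exact equality catches any change in model output, while a tolerance only catches changes large enough to matter for layout detection.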
