Improve LayoutElements and Add Shape Operations (#20)
* Support 4-point list input for quadrilateral
* Add union and intersect operations for interval, rectangle, and quadrilateral
* Improve to_interval for quadrilateral
* Incorporate unified shape conversion APIs for Textblock
* Add union and intersect for textblock
* Add shape operation doc
* Add docstring for union and intersect
[BETA: the API and behavior *will* be changed in the future.]
Starting from v0.2, Layout Parser supports two types of shape operations, `union` and `intersect`, across all `BaseCoordElement`s and `TextBlock`. We made some design choices to build a set of generalized APIs across the different shape classes, detailed as follows:
## The `union` Operation
▲ Illustration of the union operations. The resulting matrix is symmetric, so the lower triangular region is left empty. Each cell shows a visualization of the shape objects, their coordinates, and their object classes. In the output visualizations, the gray and dashed lines delineate the original obj1 and obj2, respectively, for reference.

**Notes**:
1. The x-interval and y-interval are both instances of the `Interval` class but lie on different axes. The union of two intervals on different axes is ill-defined, so Layout Parser raises an `InvalidShapeError` in this case.
2. The union of two rectangles is still a rectangle, which is the minimum covering rectangle of the two input rectangles.
3. For the outputs associated with `Quadrilateral` inputs, please see details in the [Problems related to the Quadrilateral Class](#problems-related-to-the-quadrilateral-class) section.
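
The first two notes can be illustrated with a small, self-contained sketch. It does not call Layout Parser: the tuple layouts and the helper names `union_intervals` and `union_rectangles` are assumptions made for illustration only, and a plain `ValueError` stands in for `InvalidShapeError`.

```python
from typing import Tuple

# Hypothetical stand-ins for illustration only; not Layout Parser's classes.
Interval = Tuple[float, float, str]            # (start, end, axis)
Rectangle = Tuple[float, float, float, float]  # (x_1, y_1, x_2, y_2)


def union_intervals(a: Interval, b: Interval) -> Interval:
    """Smallest interval covering both inputs; the axes must match."""
    if a[2] != b[2]:
        # The text says Layout Parser raises InvalidShapeError in this case;
        # ValueError is only a stand-in for this sketch.
        raise ValueError("cannot union intervals from different axes")
    return (min(a[0], b[0]), max(a[1], b[1]), a[2])


def union_rectangles(a: Rectangle, b: Rectangle) -> Rectangle:
    """Minimum covering rectangle of two axis-aligned rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


print(union_intervals((10, 20, "x"), (15, 40, "x")))
print(union_rectangles((0, 0, 10, 10), (5, 5, 20, 15)))
```

Taking the minimum of the lower bounds and the maximum of the upper bounds is what makes the result the smallest rectangle that still contains both inputs.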
## The `intersect` Operation
▲ Illustration of the intersect operations. As in the previous visualization, the resulting matrix is symmetric, so the lower triangular region is left empty. Each cell shows a visualization of the shape objects, their coordinates, and their object classes. In the output visualizations, the gray and dashed lines delineate the original obj1 and obj2, respectively, for reference.
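
As a rough sketch of the geometry behind the rectangle case, the overlap of two axis-aligned rectangles can be computed as below. This is not Layout Parser's code: the tuple layout and the name `intersect_rectangles` are hypothetical, and returning `None` for non-overlapping inputs is a choice made for this sketch, since the text above does not say how the library handles that case.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for illustration only: (x_1, y_1, x_2, y_2).
Rectangle = Tuple[float, float, float, float]


def intersect_rectangles(a: Rectangle, b: Rectangle) -> Optional[Rectangle]:
    """Overlapping region of two axis-aligned rectangles, or None if disjoint."""
    x_1, y_1 = max(a[0], b[0]), max(a[1], b[1])
    x_2, y_2 = min(a[2], b[2]), min(a[3], b[3])
    if x_1 >= x_2 or y_1 >= y_2:
        # No overlapping area; this sketch signals that with None.
        return None
    return (x_1, y_1, x_2, y_2)


print(intersect_rectangles((0, 0, 10, 10), (5, 5, 20, 15)))
print(intersect_rectangles((0, 0, 1, 1), (2, 2, 3, 3)))
```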
## Problems related to the `Quadrilateral` Class
Shape operations on `Quadrilateral` objects can produce arbitrary shapes. Layout Parser does not currently support `Polygon` objects (support is planned for the near future), which makes it tricky to support these operations for `Quadrilateral`. The temporary solution is:
1. When performing shape operations on `Quadrilateral` objects, Layout Parser will raise `NotSupportedShapeError`.
2. A workaround is to set `strict=False` in the input (e.g., `obj1.union(obj2, strict=False)`). In this case, any `Quadrilateral` objects are first converted to `Rectangle`s, and the operation is then executed on the converted shapes. The results may not be *strictly* equivalent to those performed on the original objects.
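
The `strict` pattern described in the two points above can be sketched in isolation as follows. Everything here is a hypothetical stand-in rather than Layout Parser's implementation: the tuple and point-list representations, the helper names `to_rectangle` and `union_rect_quad`, the locally defined `NotSupportedShapeError`, and the assumption that the conversion takes the axis-aligned bounding box of the four corner points.

```python
import warnings
from typing import List, Tuple

Point = Tuple[float, float]
Rectangle = Tuple[float, float, float, float]  # (x_1, y_1, x_2, y_2)


class NotSupportedShapeError(Exception):
    """Local stand-in for the error named in the text."""


def to_rectangle(quad: List[Point]) -> Rectangle:
    """Assumed conversion: axis-aligned bounding box of the four corners."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return (min(xs), min(ys), max(xs), max(ys))


def union_rect_quad(rect: Rectangle, quad: List[Point], strict: bool = True) -> Rectangle:
    """Union of a rectangle and a quadrilateral under the strict/non-strict rule."""
    if strict:
        raise NotSupportedShapeError(
            "the result may be a polygon; pass strict=False to convert first"
        )
    warnings.warn("strict=False: converting the quadrilateral to a rectangle")
    q = to_rectangle(quad)
    return (min(rect[0], q[0]), min(rect[1], q[1]), max(rect[2], q[2]), max(rect[3], q[3]))


quad = [(2, 1), (12, 3), (11, 9), (1, 7)]
print(union_rect_quad((0, 0, 5, 5), quad, strict=False))
```

The non-strict result covers the bounding box of the quadrilateral, not the quadrilateral itself, which is why it may differ from the exact union.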
            "The intersection between an Interval and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
        )
    else:
        warnings.warn(
            f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
        )
        return self.intersect(other.to_rectangle())

else:
    raise Exception(f"Invalid input type {other.__class__} for other")
            "The intersection between an Interval and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
        )
    else:
        warnings.warn(
            f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
        )
        return self.union(other.to_rectangle())

else:
    raise Exception(f"Invalid input type {other.__class__} for other")
            "The intersection between a Rectangle and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
        )
    else:
        warnings.warn(
            f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
        )
        return self.intersect(other.to_rectangle())

else:
    raise Exception(f"Invalid input type {other.__class__} for other")
            "The intersection between an Interval and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
        )
    else:
        warnings.warn(
            f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
        )
        return self.union(other.to_rectangle())

else:
    raise Exception(f"Invalid input type {other.__class__} for other")
"The intersection between a Quadrilateral and other objects might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."