Commit f599861

Improve LayoutElements and Add Shape Operations (#20)
* Support 4-point list input for quadrilateral
* Add union and intersect operations for interval, rectangle, and quadrilateral
* Improve to_interval for quadrilateral
* Incorporate unified shape conversion APIs for Textblock
* Add union and intersect for textblock
* Add shape operation doc
* Add docstring for union and intersect
1 parent 82e919f commit f599861

File tree

6 files changed

+375
-8
lines changed

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ Welcome to Layout Parser's documentation!
2626
:caption: API Reference
2727

2828
api_doc/elements
29+
notes/shape_operations.md
2930
api_doc/ocr
3031
api_doc/models
3132
api_doc/visualization

docs/notes/intersection.png

100 KB

docs/notes/shape_operations.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
# Shape Operations
2+
3+
[BETA: the API and behavior *will* be changed in the future.]
4+
5+
Starting from v0.2, Layout Parser supports two types of shape operations, `union` and `intersect`, across all `BaseCoordElement`s and `TextBlock`. We've made some design choices to construct a set of generalized APIs across the different shape classes, detailed as follows:
6+
7+
## The `union` Operation
8+
9+
![Illustration of Union Operations](union.png)
10+
▲ Illustration of union operations. The resulting matrix is symmetric, so the lower triangular region is left empty. Each cell shows a visualization of the shape objects, their coordinates, and their object class. In the output visualizations, the gray and dashed lines delineate the original obj1 and obj2, respectively, for reference.
11+
12+
**Notes**:
13+
1. The x-interval and y-interval are both instances of the `Interval` class but lie on different axes. The union of two intervals on different axes is ill-defined, so in this case Layout Parser raises an `InvalidShapeError`.
14+
2. The union of two rectangles is still a rectangle, which is the minimum covering rectangle of the two input rectangles.
15+
3. For the outputs associated with `Quadrilateral` inputs, please see details in the [Problems related to the Quadrilateral Class](#problems-related-to-the-quadrilateral-class) section.
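
The union rules in the notes above can be sketched in plain Python. This is a minimal, standalone illustration and not the library's implementation: `rect_union` and `interval_union` are hypothetical helpers, shapes are plain tuples, and a `ValueError` stands in for `InvalidShapeError`.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x_1, y_1, x_2, y_2)
Span = Tuple[float, float, str]           # (start, end, axis)


def rect_union(a: Rect, b: Rect) -> Rect:
    """Return the minimum covering rectangle of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def interval_union(a: Span, b: Span) -> Span:
    """Return the covering interval; intervals on different axes are rejected."""
    if a[2] != b[2]:
        # Stand-in for InvalidShapeError in this sketch.
        raise ValueError("Unioning two intervals of different axes is not allowed.")
    return (min(a[0], b[0]), max(a[1], b[1]), a[2])
```

Note that the covering rectangle also includes area that belongs to neither input when the two rectangles do not overlap.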
16+
17+
## The `intersect` Operation
18+
19+
![Illustration of Intersection Operations](intersection.png)
20+
▲ Illustration of intersection operations. As in the previous visualization, the resulting matrix is symmetric, so the lower triangular region is left empty. Each cell shows a visualization of the shape objects, their coordinates, and their object class. In the output visualizations, the gray and dashed lines delineate the original obj1 and obj2, respectively, for reference.
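
The intersection rules implemented in this commit can be sketched the same way. This is a hedged, standalone illustration: `interval_intersect` and `rect_intersect` are hypothetical helpers operating on plain tuples, not Layout Parser objects. It mirrors the behavior visible in the diff, where two intervals on different axes intersect into a rectangle.

```python
from typing import Tuple, Union

Rect = Tuple[float, float, float, float]  # (x_1, y_1, x_2, y_2)
Span = Tuple[float, float, str]           # (start, end, axis)


def interval_intersect(a: Span, b: Span) -> Union[Span, Rect]:
    """Same axis: overlapping interval. Different axes: the rectangle they span."""
    if a[2] == b[2]:
        return (max(a[0], b[0]), min(a[1], b[1]), a[2])
    # Order the pair so that x holds the x-interval and y the y-interval.
    x, y = (a, b) if a[2] == "x" else (b, a)
    return (x[0], y[0], x[1], y[1])


def rect_intersect(a: Rect, b: Rect) -> Rect:
    """Overlap of two rectangles; no check is made for an empty result."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
```

As in the diff, disjoint inputs are not detected: the sketch returns a degenerate shape whose first coordinate exceeds its second, so callers may want to check for that case themselves.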
21+
22+
## Problems related to the `Quadrilateral` Class
23+
24+
Shape operations on `Quadrilateral` objects can produce arbitrary shapes. Layout Parser does not currently support `Polygon` objects (we plan to add them in the near future), which makes it tricky to support these operations for `Quadrilateral`. The temporary solution is as follows:
25+
1. When performing shape operations on `Quadrilateral` objects, Layout Parser will raise `NotSupportedShapeError`.
26+
2. A workaround is to set `strict=False` in the input (i.e., `obj1.union(obj2, strict=False)`). In this case, any quadrilateral objects will be converted to `Rectangle`s first and the operation is executed. The results may not be *strictly* equivalent to those performed on the original objects.
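
The `strict` behavior described above can be sketched as follows. This is an illustrative, standalone sketch rather than the library code: `quad_to_rect` and `quad_union` are hypothetical helpers, the exception class is redefined locally, and the conversion to a rectangle is assumed to be the axis-aligned bounding box of the four corner points.

```python
import warnings
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_1, y_1, x_2, y_2)


class NotSupportedShapeError(Exception):
    """Local stand-in for the exception added in this commit."""


def quad_to_rect(points: List[Point]) -> Rect:
    """Assumed conversion: axis-aligned bounding box of the four corners."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def quad_union(a: List[Point], b: List[Point], strict: bool = True) -> Rect:
    """Union two quadrilaterals, falling back to rectangles when strict=False."""
    if strict:
        raise NotSupportedShapeError(
            "The union of two quadrilaterals might be a polygon; pass strict=False."
        )
    warnings.warn("Both quadrilaterals are converted to rectangles for the union.")
    r_a, r_b = quad_to_rect(a), quad_to_rect(b)
    return (min(r_a[0], r_b[0]), min(r_a[1], r_b[1]),
            max(r_a[2], r_b[2]), max(r_a[3], r_b[3]))
```

Because the inputs are replaced by their bounding boxes before the operation, the result can be larger than the exact union of the original quadrilaterals, which is why the notes describe it as not strictly equivalent.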

docs/notes/union.png

105 KB

src/layoutparser/elements.py

Lines changed: 251 additions & 5 deletions
@@ -121,6 +121,20 @@ def wrap(self, other, *args, **kwargs):
121121
return wrap
122122

123123

124+
class NotSupportedShapeError(Exception):
125+
"""For now (v0.2), if the created shape might be a polygon (shapes with more than 4 vertices),
126+
layoutparser will raise NotSupportedShapeError. It is expected to be fixed in the future versions.
127+
See
128+
:ref:`shape_operations:problems-related-to-the-quadrilateral-class`.
129+
"""
130+
131+
132+
class InvalidShapeError(Exception):
133+
"""For shape operations like intersection or union, lp will raise the InvalidShapeError when
134+
invalid shapes are created (e.g., intersecting a rectangle and an interval).
135+
"""
136+
137+
124138
class BaseLayoutElement:
125139
def set(self, inplace=False, **kwargs):
126140

@@ -254,6 +268,22 @@ def is_in(self, other, soft_margin={}, center=False):
254268

255269
pass
256270

271+
#######################################################################
272+
################# Shape Operations (intersect, union) ################
273+
#######################################################################
274+
275+
@abstractmethod
276+
def intersect(self, other: "BaseCoordElement", strict: bool = True):
277+
"""Intersect the current shape with the other object, with operations defined in
278+
:doc:`../notes/shape_operations`.
279+
"""
280+
281+
@abstractmethod
282+
def union(self, other: "BaseCoordElement", strict: bool = True):
283+
"""Union the current shape with the other object, with operations defined in
284+
:doc:`../notes/shape_operations`.
285+
"""
286+
257287
#######################################################################
258288
############### Geometric Operations (pad, shift, scale) ##############
259289
#######################################################################
@@ -579,6 +609,84 @@ def is_in(self, other, soft_margin={}, center=False):
579609
else:
580610
raise Exception(f"Invalid input type {other.__class__} for other")
581611

612+
@support_textblock
613+
def intersect(self, other: BaseCoordElement, strict: bool = True):
614+
""""""
615+
616+
if isinstance(other, Interval):
617+
if self.axis != other.axis:
618+
if self.axis == "x" and other.axis == "y":
619+
return Rectangle(self.start, other.start, self.end, other.end)
620+
else:
621+
return Rectangle(other.start, self.start, other.end, self.end)
622+
else:
623+
return self.__class__(
624+
max(self.start, other.start),
625+
min(self.end, other.end),
626+
self.axis,
627+
self.canvas_height,
628+
self.canvas_width,
629+
)
630+
631+
elif isinstance(other, Rectangle):
632+
x_1, y_1, x_2, y_2 = other.coordinates
633+
if self.axis == "x":
634+
return Rectangle(max(x_1, self.start), y_1, min(x_2, self.end), y_2)
635+
elif self.axis == "y":
636+
return Rectangle(x_1, max(y_1, self.start), x_2, min(y_2, self.end))
637+
638+
elif isinstance(other, Quadrilateral):
639+
if strict:
640+
raise NotSupportedShapeError(
641+
"The intersection between an Interval and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
642+
)
643+
else:
644+
warnings.warn(
645+
f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
646+
)
647+
return self.intersect(other.to_rectangle())
648+
649+
else:
650+
raise Exception(f"Invalid input type {other.__class__} for other")
651+
652+
@support_textblock
653+
def union(self, other: BaseCoordElement, strict: bool = True):
654+
""""""
655+
if isinstance(other, Interval):
656+
if self.axis != other.axis:
657+
raise InvalidShapeError(
658+
f"Unioning two intervals of different axes is not allowed."
659+
)
660+
else:
661+
return self.__class__(
662+
min(self.start, other.start),
663+
max(self.end, other.end),
664+
self.axis,
665+
self.canvas_height,
666+
self.canvas_width,
667+
)
668+
669+
elif isinstance(other, Rectangle):
670+
x_1, y_1, x_2, y_2 = other.coordinates
671+
if self.axis == "x":
672+
return Rectangle(min(x_1, self.start), y_1, max(x_2, self.end), y_2)
673+
elif self.axis == "y":
674+
return Rectangle(x_1, min(y_1, self.start), x_2, max(y_2, self.end))
675+
676+
elif isinstance(other, Quadrilateral):
677+
if strict:
678+
raise NotSupportedShapeError(
679+
"The intersection between an Interval and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
680+
)
681+
else:
682+
warnings.warn(
683+
f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
684+
)
685+
return self.union(other.to_rectangle())
686+
687+
else:
688+
raise Exception(f"Invalid input type {other.__class__} for other")
689+
582690
def pad(self, left=0, right=0, top=0, bottom=0, safe_mode=True):
583691

584692
if self.axis == "x":
@@ -880,6 +988,64 @@ def is_in(self, other, soft_margin={}, center=False):
880988
else:
881989
raise Exception(f"Invalid input type {other.__class__} for other")
882990

991+
@support_textblock
992+
def intersect(self, other: BaseCoordElement, strict: bool = True):
993+
""""""
994+
995+
if isinstance(other, Interval):
996+
return other.intersect(self)
997+
998+
elif isinstance(other, Rectangle):
999+
1000+
return self.__class__(
1001+
max(self.x_1, other.x_1),
1002+
max(self.y_1, other.y_1),
1003+
min(self.x_2, other.x_2),
1004+
min(self.y_2, other.y_2),
1005+
)
1006+
1007+
elif isinstance(other, Quadrilateral):
1008+
if strict:
1009+
raise NotSupportedShapeError(
1010+
"The intersection between a Rectangle and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
1011+
)
1012+
else:
1013+
warnings.warn(
1014+
f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
1015+
)
1016+
return self.intersect(other.to_rectangle())
1017+
1018+
else:
1019+
raise Exception(f"Invalid input type {other.__class__} for other")
1020+
1021+
@support_textblock
1022+
def union(self, other: BaseCoordElement, strict: bool = True):
1023+
""""""
1024+
if isinstance(other, Interval):
1025+
return other.union(self)
1026+
1027+
elif isinstance(other, Rectangle):
1028+
return self.__class__(
1029+
min(self.x_1, other.x_1),
1030+
min(self.y_1, other.y_1),
1031+
max(self.x_2, other.x_2),
1032+
max(self.y_2, other.y_2),
1033+
)
1034+
1035+
elif isinstance(other, Quadrilateral):
1036+
if strict:
1037+
raise NotSupportedShapeError(
1038+
"The intersection between an Interval and a Quadrilateral might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
1039+
)
1040+
else:
1041+
warnings.warn(
1042+
f"With `strict=False`, the other of shape {other.__class__} will be converted to {Rectangle} for obtaining the intersection"
1043+
)
1044+
return self.union(other.to_rectangle())
1045+
1046+
else:
1047+
raise Exception(f"Invalid input type {other.__class__} for other")
1048+
8831049
def pad(self, left=0, right=0, top=0, bottom=0, safe_mode=True):
8841050

8851051
x_1 = self.x_1 - left
@@ -966,7 +1132,9 @@ class Quadrilateral(BaseCoordElement):
9661132
points (:obj:`Numpy array` or `list`):
9671133
A `np.ndarray` of shape 4x2 for four corner coordinates
9681134
or a list of length 8 for in the format of
969-
`[p[0,0], p[0,1], p[1,0], p[1,1], ...]`.
1135+
`[p0_x, p0_y, p1_x, p1_y, p2_x, p2_y, p3_x, p3_y]`
1136+
or a list of length 4 in the format of
1137+
`[[p0_x, p0_y], [p1_x, p1_y], [p2_x, p2_y], [p3_x, p3_y]]`.
9701138
height (:obj:`numeric`, `optional`, defaults to `None`):
9711139
The height of the quadrilateral. This is to better support the perspective
9721140
transformation from the OpenCV library.
@@ -978,17 +1146,22 @@ class Quadrilateral(BaseCoordElement):
9781146
_name = "quadrilateral"
9791147
_features = ["points", "height", "width"]
9801148

981-
def __init__(self, points, height=None, width=None):
1149+
def __init__(
1150+
self, points: Union[np.ndarray, List, List[List]], height=None, width=None
1151+
):
9821152

9831153
if isinstance(points, np.ndarray):
9841154
if points.shape != (4, 2):
9851155
raise ValueError(f"Invalid points shape: {points.shape}.")
9861156
elif isinstance(points, list):
987-
if len(points) != 8:
1157+
if len(points) == 8:
1158+
points = np.array(points).reshape(4, 2)
1159+
elif len(points) == 4 and isinstance(points[0], list):
1160+
points = np.array(points)
1161+
else:
9881162
raise ValueError(
9891163
f"Invalid number of points element {len(points)}. Should be a flat list of 8 values or a list of 4 [x, y] pairs."
9901164
)
991-
points = np.array(points).reshape(4, 2)
9921165
else:
9931166
raise ValueError(
9941167
f"Invalid input type for points {type(points)}."
@@ -1182,6 +1355,49 @@ def is_in(self, other, soft_margin={}, center=False):
11821355
else:
11831356
raise Exception(f"Invalid input type {other.__class__} for other")
11841357

1358+
@support_textblock
1359+
def intersect(self, other: BaseCoordElement, strict: bool = True):
1360+
""""""
1361+
1362+
if strict:
1363+
raise NotSupportedShapeError(
1364+
"The intersection between a Quadrilateral and other objects might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
1365+
)
1366+
else:
1367+
if isinstance(other, Interval) or isinstance(other, Rectangle):
1368+
warnings.warn(
1369+
f"With `strict=False`, the current Quadrilateral object will be converted to {Rectangle} for obtaining the intersection"
1370+
)
1371+
return other.intersect(self.to_rectangle())
1372+
elif isinstance(other, Quadrilateral):
1373+
warnings.warn(
1374+
f"With `strict=False`, both input Quadrilateral objects will be converted to {Rectangle} for obtaining the intersection"
1375+
)
1376+
return self.to_rectangle().intersect(other.to_rectangle())
1377+
else:
1378+
raise Exception(f"Invalid input type {other.__class__} for other")
1379+
1380+
@support_textblock
1381+
def union(self, other: BaseCoordElement, strict: bool = True):
1382+
""""""
1383+
if strict:
1384+
raise NotSupportedShapeError(
1385+
"The intersection between a Quadrilateral and other objects might generate Polygon shapes that are not supported in the current version of layoutparser. You can pass `strict=False` in the input that converts the Quadrilateral to Rectangle to avoid this Exception."
1386+
)
1387+
else:
1388+
if isinstance(other, Interval) or isinstance(other, Rectangle):
1389+
warnings.warn(
1390+
f"With `strict=False`, the current Quadrilateral object will be converted to {Rectangle} for obtaining the intersection"
1391+
)
1392+
return other.union(self.to_rectangle())
1393+
elif isinstance(other, Quadrilateral):
1394+
warnings.warn(
1395+
f"With `strict=False`, both input Quadrilateral objects will be converted to {Rectangle} for obtaining the intersection"
1396+
)
1397+
return self.to_rectangle().union(other.to_rectangle())
1398+
else:
1399+
raise Exception(f"Invalid input type {other.__class__} for other")
1400+
11851401
def pad(self, left=0, right=0, top=0, bottom=0, safe_mode=True):
11861402

11871403
x_map = {0: -left, 1: -left, 2: right, 3: right}
@@ -1238,7 +1454,7 @@ def crop_image(self, image):
12381454
image, self.perspective_matrix, (int(self.width), int(self.height))
12391455
)
12401456

1241-
def to_interval(self, axis="x", **kwargs):
1457+
def to_interval(self, axis, **kwargs):
12421458

12431459
x_1, y_1, x_2, y_2 = self.coordinates
12441460
if axis == "x":
@@ -1404,6 +1620,14 @@ def relative_to(self, other):
14041620
def is_in(self, other, soft_margin={}, center=False):
14051621
return self.block.is_in(other, soft_margin, center)
14061622

1623+
@mixin_textblock_meta
1624+
def union(self, other: BaseCoordElement, strict: bool = True):
1625+
return self.block.union(other, strict=strict)
1626+
1627+
@mixin_textblock_meta
1628+
def intersect(self, other: BaseCoordElement, strict: bool = True):
1629+
return self.block.intersect(other, strict=strict)
1630+
14071631
@mixin_textblock_meta
14081632
def shift(self, shift_distance):
14091633
return self.block.shift(shift_distance)
@@ -1419,6 +1643,28 @@ def scale(self, scale_factor):
14191643
def crop_image(self, image):
14201644
return self.block.crop_image(image)
14211645

1646+
def to_interval(self, axis: Optional[str] = None, **kwargs):
1647+
if isinstance(self.block, Interval):
1648+
return self
1649+
else:
1650+
if not axis:
1651+
raise ValueError(
1652+
"Please provide a valid `axis` value ('x' or 'y') as the input"
1653+
)
1654+
return self.set(block=self.block.to_interval(axis=axis, **kwargs))
1655+
1656+
def to_rectangle(self):
1657+
if isinstance(self.block, Rectangle):
1658+
return self
1659+
else:
1660+
return self.set(block=self.block.to_rectangle())
1661+
1662+
def to_quadrilateral(self):
1663+
if isinstance(self.block, Quadrilateral):
1664+
return self
1665+
else:
1666+
return self.set(block=self.block.to_quadrilateral())
1667+
14221668
@classmethod
14231669
def from_series(cls, series):
14241670
