Layout Elements
Coordinate System
-
class
layoutparser.elements.
Interval
(start, end, axis, canvas_height=None, canvas_width=None)[source]¶ Bases:
layoutparser.elements.base.BaseCoordElement
This class describes the coordinate system of an interval: a block defined by a pair of start and end points on the designated axis, spanning the full length of the base canvas on the other axis.
- Parameters
start (numeric) – The coordinate of the start point on the designated axis.
end (numeric) – The end coordinate on the same axis as start.
axis (str) – The designated axis that the end points belong to.
canvas_height (numeric, optional, defaults to 0) – The height of the canvas that the interval is on.
canvas_width (numeric, optional, defaults to 0) – The width of the canvas that the interval is on.
-
property
height
¶ Calculate the height of the interval. If the interval is along the x-axis, the height will be the height of the canvas, otherwise, it will be the difference between the start and end point.
- Returns
Output the numeric value of the height.
- Return type
numeric
-
property
width
¶ Calculate the width of the interval. If the interval is along the y-axis, the width will be the width of the canvas, otherwise, it will be the difference between the start and end point.
- Returns
Output the numeric value of the width.
- Return type
numeric
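The height and width rules above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: `interval_size` is a hypothetical helper, and treating a canvas dimension that was never set as 0 is an assumption.

```python
def interval_size(start, end, axis, canvas_height=0, canvas_width=0):
    """Return (width, height) of an interval, following the rules described above."""
    if axis == "x":
        # The interval spans [start, end] horizontally and the full canvas vertically.
        return end - start, canvas_height
    if axis == "y":
        # The interval spans [start, end] vertically and the full canvas horizontally.
        return canvas_width, end - start
    raise ValueError("axis must be 'x' or 'y'")


print(interval_size(10, 30, "x", canvas_height=100, canvas_width=200))
print(interval_size(10, 30, "y", canvas_height=100, canvas_width=200))
```

Multiplying the two returned values gives the canvas-bounded area, which is zero whenever the relevant canvas dimension is unset.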
-
property
coordinates
¶ This method considers an interval as a rectangle and calculates the coordinates of the upper left and lower right corners to define the interval.
- Returns
Output the numeric values of the coordinates in a Tuple of size four.
- Return type
Tuple(numeric)
-
property
points
¶ Return the coordinates of all four corners of the interval in a clockwise fashion starting from the upper left.
- Returns
A Numpy array of shape 4x2 containing the coordinates.
- Return type
Numpy array
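As a rough sketch of how an interval maps to a rectangle and then to four corner points: the helpers below are hypothetical and assume that the interval covers the canvas starting from coordinate 0 on the other axis, with y growing downward as in image coordinates.

```python
def interval_coordinates(start, end, axis, canvas_height=0, canvas_width=0):
    """Return (x_1, y_1, x_2, y_2) for the rectangle covered by an interval."""
    if axis == "x":
        return (start, 0, end, canvas_height)
    return (0, start, canvas_width, end)


def corner_points(x_1, y_1, x_2, y_2):
    """List the four corners clockwise, starting from the upper left."""
    return [[x_1, y_1], [x_2, y_1], [x_2, y_2], [x_1, y_2]]


coords = interval_coordinates(10, 30, "x", canvas_height=100, canvas_width=200)
print(coords)
print(corner_points(*coords))
```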
-
property
center
¶ Calculate the mid-point between the start and end point.
- Returns
The coordinates of the center.
- Return type
Tuple(numeric)
-
property
area
Return the area of the region covered by the interval, bounded by the canvas. If the interval is put on a canvas, the area equals interval width * canvas height (axis='x') or interval height * canvas width (axis='y'). Otherwise, the area is zero.
-
put_on_canvas
(canvas)[source]¶ Set the height and the width of the canvas that the interval is on.
- Parameters
canvas (Numpy array or BaseCoordElement or PIL.Image.Image) – The base element that the interval is on. A numpy array should be in the format [height, width].
- Returns
A copy of the current Interval with its canvas height and width set to those of the input canvas.
- Return type
Interval
-
condition_on
(other)[source]¶ Given the current element in relative coordinates to another element which is in absolute coordinates, generate a new element of the current element in absolute coordinates.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the absolute coordinate system.
- Return type
BaseCoordElement
-
relative_to
(other)[source]¶ Given the current element and another element both in absolute coordinates, generate a new element of the current element in relative coordinates to the other element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the relative coordinate system.
- Return type
BaseCoordElement
-
is_in
(other, soft_margin={}, center=False)[source]¶ Identify whether the current element is within another element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.
center (bool, optional, defaults to False) – Whether to test the center (instead of the four corners) of the current element against the other element.
- Returns
True if the current element is in the other element, and False otherwise.
- Return type
bool
-
intersect
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Intersect the current shape with the other object, with operations defined in Shape Operations.
-
union
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Union the current shape with the other object, with operations defined in Shape Operations.
-
pad
(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]¶ Pad the layout element on the four sides of the polygon with the user-defined pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.
- Parameters
left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.
right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.
top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.
bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.
safe_mode (bool, optional, defaults to True) – Whether to cut off the padding that falls on the negative side of the coordinates.
- Returns
The padded BaseCoordElement object.
- Return type
BaseCoordElement
-
shift
(shift_distance)[source]¶ Shift the interval by a user specified amount along the same axis that the interval is defined on.
- Parameters
shift_distance (numeric) – The number of pixels used to shift the interval.
- Returns
The shifted Interval object.
- Return type
BaseCoordElement
-
scale
(scale_factor) Scale the interval by a user-specified amount along the same axis that the interval is defined on.
- Parameters
scale_factor (numeric) – The amount for downscaling or upscaling the element.
- Returns
The scaled Interval object.
- Return type
BaseCoordElement
-
crop_image
(image)[source]¶ Crop the input image according to the coordinates of the element.
- Parameters
image (Numpy array) – The array of the input image.
- Returns
The array of the cropped image.
- Return type
Numpy array
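Cropping by element coordinates amounts to slicing rows by y and columns by x. The sketch below uses nested lists as a stand-in for a Numpy array so that it runs without extra dependencies; `crop_image` here is a hypothetical free function, and truncating the coordinates to integers is an assumption. With a Numpy array the equivalent slice would be `image[y_1:y_2, x_1:x_2]`.

```python
def crop_image(image, coordinates):
    """Crop a row-major image (a list of rows) to the box (x_1, y_1, x_2, y_2)."""
    x_1, y_1, x_2, y_2 = (int(v) for v in coordinates)
    # Rows are indexed by y and columns by x.
    return [row[x_1:x_2] for row in image[y_1:y_2]]


# A 4-row by 5-column image whose pixel value encodes its position (10 * row + column).
image = [[10 * r + c for c in range(5)] for r in range(4)]
patch = crop_image(image, (1, 1, 4, 3))
print(patch)
```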
-
to_rectangle
()[source]¶ Convert the Interval to a Rectangle element.
- Returns
The converted Rectangle object.
- Return type
Rectangle
-
class
layoutparser.elements.
Rectangle
(x_1, y_1, x_2, y_2)[source]¶ Bases:
layoutparser.elements.base.BaseCoordElement
This class describes the coordinate system of an axis-aligned rectangular box using two points, as indicated below:
    (x_1, y_1) ----
    |             |
    |             |
    |             |
    ---- (x_2, y_2)
- Parameters
x_1 (numeric) – x coordinate on the horizontal axis of the upper left corner of the rectangle.
y_1 (numeric) – y coordinate on the vertical axis of the upper left corner of the rectangle.
x_2 (numeric) – x coordinate on the horizontal axis of the lower right corner of the rectangle.
y_2 (numeric) – y coordinate on the vertical axis of the lower right corner of the rectangle.
-
property
height
¶ Calculate the height of the rectangle.
- Returns
Output the numeric value of the height.
- Return type
numeric
-
property
width
¶ Calculate the width of the rectangle.
- Returns
Output the numeric value of the width.
- Return type
numeric
-
property
coordinates
¶ Return the coordinates of the two points that define the rectangle.
- Returns
Output the numeric values of the coordinates in a Tuple of size four.
- Return type
Tuple(numeric)
-
property
points
¶ Return the coordinates of all four corners of the rectangle in a clockwise fashion starting from the upper left.
- Returns
A Numpy array of shape 4x2 containing the coordinates.
- Return type
Numpy array
-
property
center
¶ Calculate the center of the rectangle.
- Returns
The coordinates of the center.
- Return type
Tuple(numeric)
-
property
area
¶ Return the area of the rectangle.
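The rectangle properties above all follow from the two defining corners. A minimal sketch of the arithmetic, using a hypothetical helper rather than the class itself:

```python
def rectangle_properties(x_1, y_1, x_2, y_2):
    """Derive width, height, center and area from the upper left and lower right corners."""
    width = x_2 - x_1
    height = y_2 - y_1
    return {
        "width": width,
        "height": height,
        "center": ((x_1 + x_2) / 2, (y_1 + y_2) / 2),
        "area": width * height,
    }


print(rectangle_properties(10, 20, 110, 70))
```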
-
condition_on
(other)[source]¶ Given the current element in relative coordinates to another element which is in absolute coordinates, generate a new element of the current element in absolute coordinates.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the absolute coordinate system.
- Return type
BaseCoordElement
-
relative_to
(other)[source]¶ Given the current element and another element both in absolute coordinates, generate a new element of the current element in relative coordinates to the other element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the relative coordinate system.
- Return type
BaseCoordElement
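For two axis-aligned rectangles, the relative and absolute conversions can be pictured as adding or subtracting the other element's upper left corner. The functions below are a simplified sketch on plain `(x_1, y_1, x_2, y_2)` tuples, not the library's methods, and they assume both elements are rectangles; other shape combinations are not covered.

```python
def condition_on(rect, other):
    """Relative -> absolute: offset rect by the upper left corner of other."""
    dx, dy = other[0], other[1]
    x_1, y_1, x_2, y_2 = rect
    return (x_1 + dx, y_1 + dy, x_2 + dx, y_2 + dy)


def relative_to(rect, other):
    """Absolute -> relative: express rect with other's upper left corner as the origin."""
    dx, dy = other[0], other[1]
    x_1, y_1, x_2, y_2 = rect
    return (x_1 - dx, y_1 - dy, x_2 - dx, y_2 - dy)


region = (100, 50, 400, 300)   # a region in absolute page coordinates
word = (10, 20, 60, 40)        # a box given relative to that region
absolute = condition_on(word, region)
print(absolute)
print(relative_to(absolute, region))
```

Under these assumptions the two operations are inverses of each other, so converting to absolute coordinates and back recovers the original box.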
-
is_in
(other, soft_margin={}, center=False)[source]¶ Identify whether the current element is within another element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.
center (bool, optional, defaults to False) – Whether to test the center (instead of the four corners) of the current element against the other element.
- Returns
True if the current element is in the other element, and False otherwise.
- Return type
bool
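The containment test with `soft_margin` and `center` can be sketched for rectangles as follows. This is an illustration only: the margin keys `left`, `right`, `top` and `bottom` are an assumption that mirrors the `pad` parameters, and the boundaries are treated as inclusive.

```python
def is_in(rect, other, soft_margin=None, center=False):
    """Check whether rect lies inside other, optionally relaxed by soft margins."""
    soft_margin = soft_margin or {}
    # Enlarge `other` by the requested margins (missing keys mean no enlargement).
    o_x1 = other[0] - soft_margin.get("left", 0)
    o_y1 = other[1] - soft_margin.get("top", 0)
    o_x2 = other[2] + soft_margin.get("right", 0)
    o_y2 = other[3] + soft_margin.get("bottom", 0)
    x_1, y_1, x_2, y_2 = rect
    if center:
        # Only the center point has to fall inside the (enlarged) other element.
        c_x, c_y = (x_1 + x_2) / 2, (y_1 + y_2) / 2
        return o_x1 <= c_x <= o_x2 and o_y1 <= c_y <= o_y2
    # Otherwise all four corners have to fall inside.
    return o_x1 <= x_1 and o_y1 <= y_1 and x_2 <= o_x2 and y_2 <= o_y2


box = (95, 10, 150, 40)
region = (100, 0, 200, 50)
print(is_in(box, region))
print(is_in(box, region, soft_margin={"left": 10}))
print(is_in(box, region, center=True))
```

The box sticks out 5 pixels past the left edge of the region, so the strict test fails while the relaxed and center-based tests pass.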
-
intersect
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Intersect the current shape with the other object, with operations defined in Shape Operations.
-
union
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Union the current shape with the other object, with operations defined in Shape Operations.
-
pad
(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]¶ Pad the layout element on the four sides of the polygon with the user-defined pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.
- Parameters
left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.
right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.
top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.
bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.
safe_mode (bool, optional, defaults to True) – Whether to cut off the padding that falls on the negative side of the coordinates.
- Returns
The padded BaseCoordElement object.
- Return type
BaseCoordElement
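A minimal sketch of padding with `safe_mode`, written for a rectangle stored as an `(x_1, y_1, x_2, y_2)` tuple. It assumes that each argument widens the side it is named after and that safe mode clips only at coordinate 0; the function is hypothetical and not the library's code.

```python
def pad(rect, left=0, right=0, top=0, bottom=0, safe_mode=True):
    """Grow a rectangle outward on each side by the given number of pixels."""
    x_1, y_1, x_2, y_2 = rect
    x_1, y_1, x_2, y_2 = x_1 - left, y_1 - top, x_2 + right, y_2 + bottom
    if safe_mode:
        # Cut off any padding that would fall on the negative side of either axis.
        x_1, y_1 = max(x_1, 0), max(y_1, 0)
    return (x_1, y_1, x_2, y_2)


print(pad((5, 5, 50, 50), left=10, top=2))
print(pad((5, 5, 50, 50), left=10, top=2, safe_mode=False))
```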
-
shift
(shift_distance=0) Shift the layout element by user-specified amounts along the x and y axes. If shift_distance is a single numeric value, the element is shifted by that amount on both axes.
- Parameters
shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.
- Returns
The shifted BaseCoordElement of the same shape-specific class.
- Return type
BaseCoordElement
-
scale
(scale_factor=1) Scale the layout element by user-specified amounts along the x and y axes. If scale_factor is a single numeric value, the element is scaled by that amount on both axes.
- Parameters
scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.
- Returns
The scaled BaseCoordElement of the same shape-specific class.
- Return type
BaseCoordElement
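Both `shift` and `scale` accept either one number or a pair of per-axis values. The sketch below shows one way that argument handling could work on a plain rectangle tuple; `as_pair` is a hypothetical helper, and scaling by multiplying the coordinates (that is, scaling about the origin) is an assumption.

```python
def as_pair(value):
    """Expand a single number to (x, y); pass a two-item tuple or list through."""
    if isinstance(value, (int, float)):
        return value, value
    x, y = value
    return x, y


def shift(rect, shift_distance=0):
    """Move the rectangle by (dx, dy) without changing its size."""
    dx, dy = as_pair(shift_distance)
    x_1, y_1, x_2, y_2 = rect
    return (x_1 + dx, y_1 + dy, x_2 + dx, y_2 + dy)


def scale(rect, scale_factor=1):
    """Multiply the x and y coordinates by their respective factors."""
    sx, sy = as_pair(scale_factor)
    x_1, y_1, x_2, y_2 = rect
    return (x_1 * sx, y_1 * sy, x_2 * sx, y_2 * sy)


rect = (10, 20, 30, 40)
print(shift(rect, 5))
print(shift(rect, (5, -10)))
print(scale(rect, 2))
print(scale(rect, [0.5, 1]))
```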
-
class
layoutparser.elements.
Quadrilateral
(points: Union[numpy.ndarray, List, List[List]], height=None, width=None)[source]¶ Bases:
layoutparser.elements.base.BaseCoordElement
This class describes the coordinate system of a four-sided polygon. A quadrilateral is defined by the coordinates of its 4 corners in clockwise order, starting with the upper left corner (as shown below):
    points[0] -...- points[1]
        |               |
        .               .
        .               .
        .               .
        |               |
    points[3] -...- points[2]
- Parameters
points (Numpy array or list) – A np.ndarray of shape 4x2 for the four corner coordinates, or a list of length 8 in the format [p0_x, p0_y, p1_x, p1_y, p2_x, p2_y, p3_x, p3_y], or a list of length 4 in the format [[p0_x, p0_y], [p1_x, p1_y], [p2_x, p2_y], [p3_x, p3_y]].
height (numeric, optional, defaults to None) – The height of the quadrilateral. This is to better support the perspective transformation from the OpenCV library.
width (numeric, optional, defaults to None) – The width of the quadrilateral. As with height, this is to better support the perspective transformation from the OpenCV library.
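The two list layouts accepted for `points` carry the same information. As an illustration, a hypothetical helper could normalize both into four `[x, y]` pairs like this (Numpy input is left out to keep the sketch dependency-free):

```python
def normalize_points(points):
    """Return four [x, y] pairs from either accepted list layout."""
    points = list(points)
    if len(points) == 8:
        # Flat layout: [p0_x, p0_y, p1_x, p1_y, p2_x, p2_y, p3_x, p3_y]
        return [[points[i], points[i + 1]] for i in range(0, 8, 2)]
    if len(points) == 4 and all(hasattr(p, "__len__") and len(p) == 2 for p in points):
        # Nested layout: [[p0_x, p0_y], [p1_x, p1_y], [p2_x, p2_y], [p3_x, p3_y]]
        return [list(p) for p in points]
    raise ValueError("expected 8 numbers or 4 (x, y) pairs")


print(normalize_points([0, 0, 10, 0, 10, 5, 0, 5]))
print(normalize_points([(0, 0), (10, 0), (10, 5), (0, 5)]))
```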
-
property
height
¶ Return the user defined height, otherwise the height of its circumscribed rectangle.
- Returns
Output the numeric value of the height.
- Return type
numeric
-
property
width
¶ Return the user defined width, otherwise the width of its circumscribed rectangle.
- Returns
Output the numeric value of the width.
- Return type
numeric
-
property
coordinates
Return the coordinates of the upper left and lower right corner points that define the circumscribed rectangle.
- Returns
Output the numeric values of the coordinates in a Tuple of size four.
- Return type
Tuple(numeric)
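Reading "circumscribed rectangle" as the smallest axis-aligned rectangle that contains all four corners (an assumption about the intended meaning), its coordinates are the minima and maxima of the corner coordinates. A hypothetical helper:

```python
def circumscribed_rectangle(points):
    """Return (x_1, y_1, x_2, y_2) of the axis-aligned box around four corner points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


corners = [[10, 5], [100, 10], [95, 60], [5, 50]]
x_1, y_1, x_2, y_2 = circumscribed_rectangle(corners)
print((x_1, y_1, x_2, y_2))
print("width:", x_2 - x_1, "height:", y_2 - y_1)
```

The width and height of this box correspond to the fallback values described for the `width` and `height` properties when no user-defined values are given.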
-
property
points
¶ Return the coordinates of all four corners of the quadrilateral in a clockwise fashion starting from the upper left.
- Returns
A Numpy array of shape 4x2 containing the coordinates.
- Return type
Numpy array
-
property
center
¶ Calculate the center of the quadrilateral.
- Returns
The coordinates of the center.
- Return type
Tuple(numeric)
-
property
area
¶ Return the area of the quadrilateral.
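One standard way to compute the area of a simple polygon from its ordered corners is the shoelace formula. The sketch below applies it to four corner points; it assumes the corners are listed in order around the shape, and it is not necessarily how the library computes the value.

```python
def polygon_area(points):
    """Shoelace formula: half the absolute sum of cross products of consecutive corners."""
    total = 0
    n = len(points)
    for i in range(n):
        x_a, y_a = points[i]
        x_b, y_b = points[(i + 1) % n]  # wrap around to close the polygon
        total += x_a * y_b - x_b * y_a
    return abs(total) / 2


print(polygon_area([[0, 0], [10, 0], [10, 10], [0, 10]]))
```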
-
property
mapped_rectangle_points
¶
-
property
perspective_matrix
¶
-
condition_on
(other)[source]¶ Given the current element in relative coordinates to another element which is in absolute coordinates, generate a new element of the current element in absolute coordinates.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the absolute coordinate system.
- Return type
BaseCoordElement
-
relative_to
(other)[source]¶ Given the current element and another element both in absolute coordinates, generate a new element of the current element in relative coordinates to the other element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the relative coordinate system.
- Return type
BaseCoordElement
-
is_in
(other, soft_margin={}, center=False)[source]¶ Identify whether the current element is within another element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.
center (bool, optional, defaults to False) – Whether to test the center (instead of the four corners) of the current element against the other element.
- Returns
True if the current element is in the other element, and False otherwise.
- Return type
bool
-
intersect
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Intersect the current shape with the other object, with operations defined in Shape Operations.
-
union
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Union the current shape with the other object, with operations defined in Shape Operations.
-
pad
(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]¶ Pad the layout element on the four sides of the polygon with the user-defined pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.
- Parameters
left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.
right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.
top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.
bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.
safe_mode (bool, optional, defaults to True) – Whether to cut off the padding that falls on the negative side of the coordinates.
- Returns
The padded BaseCoordElement object.
- Return type
BaseCoordElement
-
shift
(shift_distance=0) Shift the layout element by user-specified amounts along the x and y axes. If shift_distance is a single numeric value, the element is shifted by that amount on both axes.
- Parameters
shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.
- Returns
The shifted BaseCoordElement of the same shape-specific class.
- Return type
BaseCoordElement
-
scale
(scale_factor=1) Scale the layout element by user-specified amounts along the x and y axes. If scale_factor is a single numeric value, the element is scaled by that amount on both axes.
- Parameters
scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.
- Returns
The scaled BaseCoordElement of the same shape-specific class.
- Return type
BaseCoordElement
TextBlock
-
class
layoutparser.elements.
TextBlock
(block, text=None, id=None, type=None, parent=None, next=None, score=None)[source]¶ Bases:
layoutparser.elements.base.BaseLayoutElement
This class stores content-related information for a layout element in addition to its coordinate definition (i.e. Interval, Rectangle or Quadrilateral).
- Parameters
block (BaseCoordElement) – The shape-specific coordinate system that the text block belongs to.
text (str, optional, defaults to None) – The OCR'ed text within the boundaries of the text block.
id (int, optional, defaults to None) – The id of the text block.
type (int, optional, defaults to None) – The type of the text block.
parent (int, optional, defaults to None) – The id of the parent object.
next (int, optional, defaults to None) – The id of the next block.
score (numeric, defaults to None) – The prediction confidence of the block.
-
property
height
¶ Return the height of the shape-specific block.
- Returns
Output the numeric value of the height.
- Return type
numeric
-
property
width
¶ Return the width of the shape-specific block.
- Returns
Output the numeric value of the width.
- Return type
numeric
-
property
coordinates
¶ Return the coordinates of the two corner points that define the shape-specific block.
- Returns
Output the numeric values of the coordinates in a Tuple of size four.
- Return type
Tuple(numeric)
-
property
points
¶ Return the coordinates of all four corners of the shape-specific block in a clockwise fashion starting from the upper left.
- Returns
A Numpy array of shape 4x2 containing the coordinates.
- Return type
Numpy array
-
property
area
Return the area of the associated block.
-
condition_on
(other)[source]¶ Given the current element in relative coordinates to another element which is in absolute coordinates, generate a new element of the current element in absolute coordinates.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the absolute coordinate system.
- Return type
BaseCoordElement
-
relative_to
(other)[source]¶ Given the current element and another element both in absolute coordinates, generate a new element of the current element in relative coordinates to the other element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
- Raises
Exception – Raise error when the input type of the other element is invalid.
- Returns
The BaseCoordElement object of the original element in the relative coordinate system.
- Return type
BaseCoordElement
-
is_in
(other, soft_margin={}, center=False)[source]¶ Identify whether the current element is within another element.
- Parameters
other (BaseCoordElement) – The other layout element involved in the geometric operations.
soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.
center (bool, optional, defaults to False) – Whether to test the center (instead of the four corners) of the current element against the other element.
- Returns
True if the current element is in the other element, and False otherwise.
- Return type
bool
-
union
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Union the current shape with the other object, with operations defined in Shape Operations.
-
intersect
(other: layoutparser.elements.base.BaseCoordElement, strict: bool = True)[source]¶ Intersect the current shape with the other object, with operations defined in Shape Operations.
-
shift
(shift_distance) Shift the layout element by user-specified amounts along the x and y axes. If shift_distance is a single numeric value, the element is shifted by that amount on both axes.
- Parameters
shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.
- Returns
The shifted BaseCoordElement of the same shape-specific class.
- Return type
BaseCoordElement
-
pad
(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]¶ Pad the layout element on the four sides of the polygon with the user-defined pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.
- Parameters
left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.
right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.
top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.
bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.
safe_mode (bool, optional, defaults to True) – Whether to cut off the padding that falls on the negative side of the coordinates.
- Returns
The padded BaseCoordElement object.
- Return type
BaseCoordElement
-
scale
(scale_factor) Scale the layout element by user-specified amounts along the x and y axes. If scale_factor is a single numeric value, the element is scaled by that amount on both axes.
- Parameters
scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.
- Returns
The scaled BaseCoordElement of the same shape-specific class.
- Return type
BaseCoordElement
-
crop_image
(image)[source]¶ Crop the input image according to the coordinates of the element.
- Parameters
image (Numpy array) – The array of the input image.
- Returns
The array of the cropped image.
- Return type
Numpy array
-
to_dict
() → Dict[str, Any] Generate a dictionary representation of the current textblock in the format:
    {
        "block_type": <name of self.block>,
        <attributes of self.block combined with non-empty self._features>
    }
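To make the dictionary format concrete, here is a hedged sketch of the merge it describes. The helper name, the `"rectangle"` block type value, and the reading of "non-empty" as "not None" are all assumptions made for illustration; the real method may differ in key names and filtering.

```python
def text_block_to_dict(block_type, block_attrs, features):
    """Merge the block's attributes with the features that are actually set."""
    result = {"block_type": block_type}
    result.update(block_attrs)
    # Skip features that were never assigned (assumed here to mean None).
    result.update({k: v for k, v in features.items() if v is not None})
    return result


record = text_block_to_dict(
    "rectangle",  # hypothetical block type name
    {"x_1": 0, "y_1": 0, "x_2": 120, "y_2": 40},
    {"text": "Hello", "id": None, "type": None, "score": 0.9},
)
print(record)
```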
-
classmethod
from_dict
(data: Dict[str, Any]) → layoutparser.elements.layout_elements.TextBlock Initialize the textblock from its dictionary representation. It generates the block based on the block_type and block_attr, and loads the textblock-specific features from the dict.
- Parameters
data (dict) – The dictionary representation of the object.
Layout
-
class
layoutparser.elements.
Layout
(blocks: Optional[List] = None, *, page_data: Dict = None)[source]¶ Bases:
collections.abc.MutableSequence
The Layout class is designed for processing a list of layout elements on a page. It stores the layout elements in a list together with the related page_data, and provides handy APIs for processing all the layout elements in batch.
- Parameters
blocks (list) – A list of layout element blocks.
page_data (Dict, optional) – A dictionary storing page (canvas) related information such as height and width. It should be passed in as a keyword argument to avoid any confusion. Defaults to None.
-
sort
(key=None, reverse=False, inplace=False) → Optional[layoutparser.elements.layout.Layout] Sort the list of blocks based on the given key.
- Parameters
key ([type], optional) – A function of one argument that is used to extract a comparison key from each list element. Defaults to None.
reverse (bool, optional) – If set to True, the list elements are sorted as if each comparison were reversed. Defaults to False.
inplace (bool, optional) – Whether to perform the sort in place. If set to False, it will return another object instance with _block sorted in the order. Defaults to False.
- Examples
>>> import layoutparser as lp
>>> i = lp.Interval(4, 5, axis="y")
>>> l = lp.Layout([i, i.shift(2)])
>>> l.sort(key=lambda x: x.coordinates[1], reverse=True)
-
filter_by
(other, soft_margin={}, center=False)[source]¶ Return a Layout object containing the elements that are in the other object.
- Parameters
other (BaseCoordElement) – The block to filter the current elements.
- Returns
A new layout object after filtering.
- Return type
Layout
-
shift
(shift_distance) Shift all layout elements by user-specified amounts along the x and y axes. If shift_distance is a single numeric value, each element is shifted by that amount on both axes.
- Parameters
shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.
- Returns
A new layout object with all the elements shifted by the specified values.
- Return type
Layout
-
pad
(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]¶ Pad all layout elements on the four sides of the polygon with the user-defined pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.
- Parameters
left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.
right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.
top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.
bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.
safe_mode (bool, optional, defaults to True) – Whether to cut off the padding that falls on the negative side of the coordinates.
- Returns
A new layout object with all the elements padded by the specified values.
- Return type
Layout
-
scale
(scale_factor) Scale all layout elements by user-specified amounts along the x and y axes. If scale_factor is a single numeric value, each element is scaled by that amount on both axes.
- Parameters
scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.
- Returns
A new layout object with all the elements scaled by the specified values.
- Return type
Layout
-
get_texts
()[source]¶ Iterate through all the text blocks in the list and append their ocr’ed text results.
- Returns
A list of text strings of the text blocks in the list of layout elements.
- Return type
List[str]
-
get_info
(attr_name)[source]¶ Given user-provided attribute name, check all the elements in the list and return the corresponding attribute values.
- Parameters
attr_name (str) – The text string of certain attribute name.
- Returns
The list of the corresponding attribute value (if exist) of each element in the list.
- Return type
List
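The attribute lookup described for `get_info` can be pictured with plain objects. This sketch uses `types.SimpleNamespace` stand-ins instead of real text blocks, and treating `get_texts` as a lookup of the `text` attribute is an assumption based on the descriptions above.

```python
from types import SimpleNamespace


def get_info(elements, attr_name):
    """Collect attr_name from every element that has it; skip the rest."""
    return [getattr(e, attr_name) for e in elements if hasattr(e, attr_name)]


def get_texts(elements):
    """Gather the OCR'ed text of each element that carries a text attribute."""
    return get_info(elements, "text")


elements = [
    SimpleNamespace(text="Title", score=0.98),
    SimpleNamespace(text="Body paragraph"),
    SimpleNamespace(score=0.75),  # no text attribute at all
]
print(get_texts(elements))
print(get_info(elements, "score"))
```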
-
to_dict
() → Dict[str, Any][source]¶ Generate a dict representation of the layout object with the page_data and all the blocks in its dict representation.
- Returns
The dictionary representation of the layout object.
- Return type
Dict
-
get_homogeneous_blocks
() → List[layoutparser.elements.base.BaseLayoutElement][source]¶ Convert all elements into blocks of the same type based on the type casting rule:
Interval < Rectangle < Quadrilateral < TextBlock
- Returns
A list of base layout elements of the maximal compatible type
- Return type
List[BaseLayoutElement]
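The casting rule can be read as an ordering in which every element is promoted to the highest-ranked type present. A small sketch of picking that target type, working on type names only (the helper and the list constant are hypothetical, and the actual conversion of each element is not shown):

```python
# Order taken from the casting rule: Interval < Rectangle < Quadrilateral < TextBlock
CAST_ORDER = ["Interval", "Rectangle", "Quadrilateral", "TextBlock"]


def maximal_type(type_names):
    """Return the highest-ranked type name among the given ones."""
    return max(type_names, key=CAST_ORDER.index)


print(maximal_type(["Interval", "Rectangle", "Interval"]))
print(maximal_type(["Rectangle", "TextBlock", "Quadrilateral"]))
```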
-
to_dataframe
(enforce_same_type=False) → pandas.core.frame.DataFrame[source]¶ Convert the layout object into the dataframe. Warning: the page data won’t be exported.
- Parameters
enforce_same_type (bool, optional) – If true, it will convert all the contained blocks to the maximal compatible data type. Defaults to False.
- Returns
The dataframe representation of layout object
- Return type
pd.DataFrame