Layout Elements

Coordinate System

class layoutparser.elements.Interval(start, end, axis, canvas_height=None, canvas_width=None)[source]

Bases: layoutparser.elements.BaseCoordElement

This class describes the coordinate system of an interval: a block defined by a pair of start and end points on the designated axis, which spans the full length of the base canvas on the other axis.

Parameters
  • start (numeric) – The coordinate of the start point on the designated axis.

  • end (numeric) – The end coordinate on the same axis as start.

  • axis (str) – The designated axis ('x' or 'y') that the start and end points belong to.

  • canvas_height (numeric, optional, defaults to None) – The height of the canvas that the interval is on.

  • canvas_width (numeric, optional, defaults to None) – The width of the canvas that the interval is on.

property height

Calculate the height of the interval. If the interval is along the x-axis, the height is the height of the canvas; otherwise, it is the difference between the start and end points.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Calculate the width of the interval. If the interval is along the y-axis, the width is the width of the canvas; otherwise, it is the difference between the start and end points.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

This method considers an interval as a rectangle and calculates the coordinates of the upper left and lower right corners to define the interval.

Returns

Output the numeric values of the coordinates in a Tuple of size four.

Return type

Tuple(numeric)

property points

Return the coordinates of all four corners of the interval in a clockwise fashion starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property center

Calculate the mid-point between the start and end point.

Returns

The coordinates of the center.

Return type

Tuple(numeric)

property area

Return the area of the region covered by the interval. The area is bounded by the canvas. If the interval is placed on a canvas, the area equals interval width * canvas height (axis='x') or interval height * canvas width (axis='y'). Otherwise, the area is zero.
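
The width, height, coordinates and area rules above can be illustrated with a small standalone sketch. It does not import layoutparser; the function name interval_geometry is hypothetical, and placing the rectangle at 0 on the non-designated axis is an assumption made for illustration.

```python
# Standalone illustration of the Interval geometry rules described above.
# This is NOT the library's implementation; interval_geometry is a
# hypothetical helper, and the origin of 0 on the other axis is assumed.
def interval_geometry(start, end, axis, canvas_height=0, canvas_width=0):
    """Return (width, height, coordinates, area) for an interval."""
    if axis == "x":
        width, height = end - start, canvas_height
        coordinates = (start, 0, end, canvas_height)
    elif axis == "y":
        width, height = canvas_width, end - start
        coordinates = (0, start, canvas_width, end)
    else:
        raise ValueError("axis must be 'x' or 'y'")
    return width, height, coordinates, width * height


width, height, coordinates, area = interval_geometry(
    10, 30, axis="x", canvas_height=100, canvas_width=200
)
print(width, height, coordinates, area)  # 20 100 (10, 0, 30, 100) 2000

# Without a canvas, the canvas size is treated as 0, so the area is zero.
print(interval_geometry(10, 30, axis="x")[3])  # 0
```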

put_on_canvas(canvas)[source]

Set the height and the width of the canvas that the interval is on.

Parameters

canvas (Numpy array or BaseCoordElement or PIL.Image.Image) – The base element that the interval is on. A Numpy array should be in the format [height, width].

Returns

A copy of the current Interval with its canvas height and width set to those of the input canvas.

Return type

Interval

condition_on(other)[source]

Given the current element in coordinates relative to another element that is itself in absolute coordinates, generate a new element that represents the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)[source]

Given the current element and another element, both in absolute coordinates, generate a new element that represents the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement

is_in(other, soft_margin={}, center=False)[source]

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

Returns True if the current element is in the other element and False if not.

Return type

bool

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – A bool value to toggle the safe_mode.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement

shift(shift_distance)[source]

Shift the interval by a user-specified amount along the same axis that the interval is defined on.

Parameters

shift_distance (numeric) – The number of pixels used to shift the interval.

Returns

The shifted Interval object.

Return type

BaseCoordElement

scale(scale_factor)[source]

Scale the layout element by a user-specified amount along the same axis that the interval is defined on.

Parameters

scale_factor (numeric) – The amount for downscaling or upscaling the element.

Returns

The scaled Interval object.

Return type

BaseCoordElement

crop_image(image)[source]

Crop the input image according to the coordinates of the element.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

to_rectangle()[source]

Convert the Interval to a Rectangle element.

Returns

The converted Rectangle object.

Return type

Rectangle

to_quadrilateral()[source]

Convert the Interval to a Quadrilateral element.

Returns

The converted Quadrilateral object.

Return type

Quadrilateral

classmethod from_series(series)[source]

class layoutparser.elements.Rectangle(x_1, y_1, x_2, y_2)[source]

Bases: layoutparser.elements.BaseCoordElement

This class describes the coordinate system of an axis-aligned rectangular box using two points as indicated below:

(x_1, y_1) ----
|             |
|             |
|             |
---- (x_2, y_2)
Parameters
  • x_1 (numeric) – x coordinate on the horizontal axis of the upper left corner of the rectangle.

  • y_1 (numeric) – y coordinate on the vertical axis of the upper left corner of the rectangle.

  • x_2 (numeric) – x coordinate on the horizontal axis of the lower right corner of the rectangle.

  • y_2 (numeric) – y coordinate on the vertical axis of the lower right corner of the rectangle.

property height

Calculate the height of the rectangle.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Calculate the width of the rectangle.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

Return the coordinates of the two points that define the rectangle.

Returns

Output the numeric values of the coordinates in a Tuple of size four.

Return type

Tuple(numeric)

property points

Return the coordinates of all four corners of the rectangle in a clockwise fashion starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property center

Calculate the center of the rectangle.

Returns

The coordinates of the center.

Return type

Tuple(numeric)

property area

Return the area of the rectangle.
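
As a rough illustration of how the Rectangle properties above relate to the two defining points, the following standalone sketch derives them with plain arithmetic. It does not import layoutparser, and rectangle_geometry is a hypothetical helper rather than part of the library.

```python
# Standalone illustration of the Rectangle properties described above.
# rectangle_geometry is a hypothetical helper, not a layoutparser API.
def rectangle_geometry(x_1, y_1, x_2, y_2):
    """Derive the documented properties from the two defining corners."""
    width = x_2 - x_1
    height = y_2 - y_1
    return {
        "width": width,
        "height": height,
        # Upper left and lower right corners as a tuple of size four.
        "coordinates": (x_1, y_1, x_2, y_2),
        # All four corners, clockwise, starting from the upper left.
        "points": [(x_1, y_1), (x_2, y_1), (x_2, y_2), (x_1, y_2)],
        "center": ((x_1 + x_2) / 2, (y_1 + y_2) / 2),
        "area": width * height,
    }


geometry = rectangle_geometry(10, 20, 110, 70)
print(geometry["width"], geometry["height"], geometry["area"])  # 100 50 5000
print(geometry["center"])  # (60.0, 45.0)
```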

condition_on(other)[source]

Given the current element in coordinates relative to another element that is itself in absolute coordinates, generate a new element that represents the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)[source]

Given the current element and another element, both in absolute coordinates, generate a new element that represents the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement
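
The relationship between condition_on and relative_to can be sketched for two rectangles as a pair of translations. This is a standalone illustration that assumes both operations simply offset by the upper left corner of the other rectangle; the tuple representation and the function bodies are assumptions, not the library's code.

```python
# Standalone sketch: rectangles are (x_1, y_1, x_2, y_2) tuples.
# Assumption: both operations translate by the other's upper left corner.
def relative_to(rect, other):
    """Express an absolute rect in coordinates relative to other."""
    x_1, y_1, x_2, y_2 = rect
    origin_x, origin_y = other[0], other[1]
    return (x_1 - origin_x, y_1 - origin_y, x_2 - origin_x, y_2 - origin_y)


def condition_on(rect, other):
    """Express a rect given relative to other in absolute coordinates."""
    x_1, y_1, x_2, y_2 = rect
    origin_x, origin_y = other[0], other[1]
    return (x_1 + origin_x, y_1 + origin_y, x_2 + origin_x, y_2 + origin_y)


region = (100, 50, 400, 300)  # absolute coordinates
word = (120, 80, 180, 100)    # absolute coordinates

local = relative_to(word, region)
print(local)                        # (20, 30, 80, 50)
print(condition_on(local, region))  # (120, 80, 180, 100)
```

Under this assumption the two operations undo each other, which is useful when a block is detected inside a cropped region and needs to be mapped back onto the full page.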

is_in(other, soft_margin={}, center=False)[source]

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

Returns True if the current element is in the other element and False if not.

Return type

bool
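
The effect of soft_margin and center can be illustrated with a standalone containment check on rectangles. The sketch assumes that soft_margin uses the keys left, right, top and bottom (mirroring the pad parameters) and that boundaries count as inside; neither detail is stated above, and the code is not the library's implementation.

```python
# Standalone sketch: rectangles are (x_1, y_1, x_2, y_2) tuples.
# Assumptions: soft_margin keys mirror pad(), comparisons are inclusive.
def is_in(rect, other, soft_margin=None, center=False):
    margin = {"left": 0, "right": 0, "top": 0, "bottom": 0}
    margin.update(soft_margin or {})

    # Enlarge the other element by the soft margins.
    other_x_1 = other[0] - margin["left"]
    other_y_1 = other[1] - margin["top"]
    other_x_2 = other[2] + margin["right"]
    other_y_2 = other[3] + margin["bottom"]

    x_1, y_1, x_2, y_2 = rect
    if center:
        # Only the center of the current element has to fall inside.
        center_x, center_y = (x_1 + x_2) / 2, (y_1 + y_2) / 2
        return (other_x_1 <= center_x <= other_x_2
                and other_y_1 <= center_y <= other_y_2)
    # Otherwise all four corners have to fall inside.
    return (other_x_1 <= x_1 and x_2 <= other_x_2
            and other_y_1 <= y_1 and y_2 <= other_y_2)


container = (0, 0, 100, 100)
block = (80, 40, 110, 60)  # sticks out 10 pixels on the right

print(is_in(block, container))                             # False
print(is_in(block, container, soft_margin={"right": 10}))  # True
print(is_in(block, container, center=True))                # True
```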

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – A bool value to toggle the safe_mode.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement
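
A minimal standalone sketch of the padding behavior for a rectangle follows. It assumes that safe_mode clips negative coordinates to 0, which is one reading of "cut off the excess padding that falls on the negative side"; the function is illustrative and not taken from the library.

```python
# Standalone sketch: rectangles are (x_1, y_1, x_2, y_2) tuples.
# Assumption: safe_mode clips negative upper left coordinates to 0.
def pad(rect, left=0, right=0, top=0, bottom=0, safe_mode=True):
    x_1, y_1, x_2, y_2 = rect
    x_1, y_1 = x_1 - left, y_1 - top
    x_2, y_2 = x_2 + right, y_2 + bottom
    if safe_mode:
        x_1, y_1 = max(x_1, 0), max(y_1, 0)
    return (x_1, y_1, x_2, y_2)


print(pad((5, 5, 50, 20), left=10, top=2))                   # (0, 3, 50, 20)
print(pad((5, 5, 50, 20), left=10, top=2, safe_mode=False))  # (-5, 3, 50, 20)
```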

shift(shift_distance=0)[source]

Shift the layout element by user-specified amounts along the x and y axes, respectively. If shift_distance is a single numeric value, the element will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.

Returns

The shifted BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

scale(scale_factor=1)[source]

Scale the layout element by user-specified amounts along the x and y axes, respectively. If scale_factor is a single numeric value, the element will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.

Returns

The scaled BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement
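
The scalar-or-pair convention shared by shift and scale can be sketched as follows. The example is standalone and assumes that a pair is ordered (x, y) and that scaling multiplies each coordinate, that is, it scales about the origin; the helper _as_pair is hypothetical.

```python
# Standalone sketch: rectangles are (x_1, y_1, x_2, y_2) tuples.
# Assumptions: pairs are ordered (x, y); scaling multiplies coordinates.
def _as_pair(value):
    """Expand a single number into an (x, y) pair; pass pairs through."""
    if isinstance(value, (tuple, list)):
        x_value, y_value = value
        return x_value, y_value
    return value, value


def shift(rect, shift_distance=0):
    d_x, d_y = _as_pair(shift_distance)
    x_1, y_1, x_2, y_2 = rect
    return (x_1 + d_x, y_1 + d_y, x_2 + d_x, y_2 + d_y)


def scale(rect, scale_factor=1):
    s_x, s_y = _as_pair(scale_factor)
    x_1, y_1, x_2, y_2 = rect
    return (x_1 * s_x, y_1 * s_y, x_2 * s_x, y_2 * s_y)


rect = (10, 20, 30, 40)
print(shift(rect, 5))         # (15, 25, 35, 45)
print(shift(rect, (5, -10)))  # (15, 10, 35, 30)
print(scale(rect, 2))         # (20, 40, 60, 80)
print(scale(rect, [2, 3]))    # (20, 60, 60, 120)
```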

crop_image(image)[source]

Crop the input image according to the coordinates of the element.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

to_interval(axis, **kwargs)[source]
to_quadrilateral()[source]
classmethod from_series(series)[source]

class layoutparser.elements.Quadrilateral(points, height=None, width=None)[source]

Bases: layoutparser.elements.BaseCoordElement

This class describes the coordinate system of a four-sided polygon. A quadrilateral is defined by the coordinates of its four corners in clockwise order, starting with the upper left corner (as shown below):

points[0] -...- points[1]
|                      |
.                      .
.                      .
.                      .
|                      |
points[3] -...- points[2]
Parameters
  • points (Numpy array or list) – A np.ndarray of shape 4x2 holding the four corner coordinates, or a list of length 8 in the format [p[0,0], p[0,1], p[1,0], p[1,1], …].

  • height (numeric, optional, defaults to None) – The height of the quadrilateral. This is to better support the perspective transformation from the OpenCV library.

  • width (numeric, optional, defaults to None) – The width of the quadrilateral. As with height, this is to better support the perspective transformation from the OpenCV library.
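
The two accepted layouts for points, and the circumscribed rectangle used when no height or width is given, can be illustrated with plain Python lists. The helpers normalize_points and circumscribed_rectangle are hypothetical names introduced for this sketch, which does not use Numpy or layoutparser.

```python
# Standalone sketch using plain lists instead of Numpy arrays.
# normalize_points and circumscribed_rectangle are hypothetical helpers.
def normalize_points(points):
    """Return four (x, y) corners from a 4x2 sequence or a flat list of 8."""
    if len(points) == 8:
        return [(points[i], points[i + 1]) for i in range(0, 8, 2)]
    if len(points) == 4 and all(
        hasattr(p, "__len__") and len(p) == 2 for p in points
    ):
        return [tuple(p) for p in points]
    raise ValueError("expected a 4x2 sequence or a flat list of length 8")


def circumscribed_rectangle(corners):
    """Return (x_1, y_1, x_2, y_2) of the bounding rectangle."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


flat = [10, 0, 110, 10, 100, 60, 0, 50]
corners = normalize_points(flat)
print(corners)  # [(10, 0), (110, 10), (100, 60), (0, 50)]

x_1, y_1, x_2, y_2 = circumscribed_rectangle(corners)
print(x_2 - x_1, y_2 - y_1)  # 110 60
```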

property height

Return the user-defined height if one was provided; otherwise, return the height of the circumscribed rectangle.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Return the user-defined width if one was provided; otherwise, return the width of the circumscribed rectangle.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

Return the coordinates of the upper left and lower right corner points that define the circumscribed rectangle.

Returns

Output the numeric values of the coordinates in a Tuple of size four.

Return type

Tuple(numeric)

property points

Return the coordinates of all four corners of the quadrilateral in a clockwise fashion starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property center

Calculate the center of the quadrilateral.

Returns

The coordinates of the center.

Return type

Tuple(numeric)

property area

Return the area of the quadrilateral.
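
The text above does not say how the area is computed. One standard way to obtain the area of a simple polygon from its ordered corners is the shoelace formula, sketched here as a standalone function; whether the library uses this formula is an assumption.

```python
# Standalone sketch of the shoelace formula for ordered polygon corners.
# Whether layoutparser computes the area this way is an assumption.
def shoelace_area(corners):
    total = 0
    count = len(corners)
    for i in range(count):
        x_a, y_a = corners[i]
        x_b, y_b = corners[(i + 1) % count]  # wrap around to the first corner
        total += x_a * y_b - x_b * y_a
    return abs(total) / 2


print(shoelace_area([(0, 0), (4, 0), (4, 3), (0, 3)]))          # 12.0
print(shoelace_area([(10, 0), (110, 10), (100, 60), (0, 50)]))  # 5100.0
```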

property mapped_rectangle_points
property perspective_matrix
map_to_points_ordering(x_map, y_map)[source]
condition_on(other)[source]

Given the current element in coordinates relative to another element that is itself in absolute coordinates, generate a new element that represents the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)[source]

Given the current element and another element, both in absolute coordinates, generate a new element that represents the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement

is_in(other, soft_margin={}, center=False)[source]

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

Returns True if the current element is in the other element and False if not.

Return type

bool

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – A bool value to toggle the safe_mode.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement

shift(shift_distance=0)[source]

Shift the layout element by user-specified amounts along the x and y axes, respectively. If shift_distance is a single numeric value, the element will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.

Returns

The shifted BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

scale(scale_factor=1)[source]

Scale the layout element by user-specified amounts along the x and y axes, respectively. If scale_factor is a single numeric value, the element will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.

Returns

The scaled BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

crop_image(image)[source]

Crop the input image using the points of the quadrilateral instance.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

to_interval(axis='x', **kwargs)[source]
to_rectangle()[source]
classmethod from_series(series)[source]
to_dict() → Dict[str, Any][source]

Generate a dictionary representation of the current object:

{
    "block_type": "quadrilateral",
    "points": [
        p[0,0], p[0,1],
        p[1,0], p[1,1],
        p[2,0], p[2,1],
        p[3,0], p[3,1]
    ],
    "height": value,
    "width": value
}
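
A minimal sketch of producing a dictionary in the layout above from four corner points follows. It is standalone: quadrilateral_to_dict is a hypothetical function that only mirrors the documented structure and does not call the library.

```python
import json


# Standalone sketch mirroring the documented dictionary layout.
# quadrilateral_to_dict is a hypothetical helper, not a layoutparser API.
def quadrilateral_to_dict(corners, height=None, width=None):
    return {
        "block_type": "quadrilateral",
        # Flatten [(x, y), ...] into [p[0,0], p[0,1], p[1,0], ...].
        "points": [value for corner in corners for value in corner],
        "height": height,
        "width": width,
    }


record = quadrilateral_to_dict(
    [(10, 0), (110, 10), (100, 60), (0, 50)], height=60, width=110
)
print(json.dumps(record, indent=4))
```

Because the result contains only strings, numbers and lists, it can be serialized with the standard json module as shown.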

TextBlock

class layoutparser.elements.TextBlock(block, text=None, id=None, type=None, parent=None, next=None, score=None)[source]

Bases: layoutparser.elements.BaseLayoutElement

This class constructs content-related information of a layout element in addition to its coordinate definitions (i.e. Interval, Rectangle or Quadrilateral).

Parameters
  • block (BaseCoordElement) – The shape-specific coordinate systems that the text block belongs to.

  • text (str, optional, defaults to None) – The ocr’ed text results within the boundaries of the text block.

  • id (int, optional, defaults to None) – The id of the text block.

  • type (int, optional, defaults to None) – The type of the text block.

  • parent (int, optional, defaults to None) – The id of the parent object.

  • next (int, optional, defaults to None) – The id of the next block.

  • score (numeric, optional, defaults to None) – The prediction confidence of the block.

property height

Return the height of the shape-specific block.

Returns

Output the numeric value of the height.

Return type

numeric

property width

Return the width of the shape-specific block.

Returns

Output the numeric value of the width.

Return type

numeric

property coordinates

Return the coordinates of the two corner points that define the shape-specific block.

Returns

Output the numeric values of the coordinates in a Tuple of size four.

Return type

Tuple(numeric)

property points

Return the coordinates of all four corners of the shape-specific block in a clockwise fashion starting from the upper left.

Returns

A Numpy array of shape 4x2 containing the coordinates.

Return type

Numpy array

property area

Return the area of associated block.

condition_on(other)[source]

Given the current element in coordinates relative to another element that is itself in absolute coordinates, generate a new element that represents the current element in absolute coordinates.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the absolute coordinate system.

Return type

BaseCoordElement

relative_to(other)[source]

Given the current element and another element, both in absolute coordinates, generate a new element that represents the current element in coordinates relative to the other element.

Parameters

other (BaseCoordElement) – The other layout element involved in the geometric operations.

Raises

Exception – Raised when the input type of the other element is invalid.

Returns

The BaseCoordElement object of the original element in the relative coordinate system.

Return type

BaseCoordElement

is_in(other, soft_margin={}, center=False)[source]

Identify whether the current element is within another element.

Parameters
  • other (BaseCoordElement) – The other layout element involved in the geometric operations.

  • soft_margin (dict, optional, defaults to {}) – Enlarge the other element with wider margins to relax the restrictions.

  • center (bool, optional, defaults to False) – The toggle to determine whether the center (instead of the four corners) of the current element is in the other element.

Returns

Returns True if the current element is in the other element and False if not.

Return type

bool

shift(shift_distance)[source]

Shift the layout element by user-specified amounts along the x and y axes, respectively. If shift_distance is a single numeric value, the element will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.

Returns

The shifted BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]

Pad the layout element on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – A bool value to toggle the safe_mode.

Returns

The padded BaseCoordElement object.

Return type

BaseCoordElement

scale(scale_factor)[source]

Scale the layout element by user-specified amounts along the x and y axes, respectively. If scale_factor is a single numeric value, the element will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.

Returns

The scaled BaseCoordElement of the same shape-specific class.

Return type

BaseCoordElement

crop_image(image)[source]

Crop the input image according to the coordinates of the element.

Parameters

image (Numpy array) – The array of the input image.

Returns

The array of the cropped image.

Return type

Numpy array

classmethod from_series(series)[source]
to_dict() → Dict[str, Any][source]

Generate a dictionary representation of the current textblock of the format:

{
    "block_type": <name of self.block>,
    <attributes of self.block combined with
        non-empty self._features>
}
classmethod from_dict(data: Dict[str, Any]) → layoutparser.elements.TextBlock[source]

Initialize the textblock from its dictionary representation. It generates the block based on the block_type and block_attr, and loads the textblock-specific features from the dict.

Parameters

data (dict) – The dictionary representation of the object.
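
The round trip between to_dict and from_dict can be sketched with plain dictionaries. The example is standalone and rests on several assumptions: the feature names are taken from the TextBlock constructor arguments, "non-empty" is read as "not None", and the block attributes of a rectangle are named after its constructor arguments. The two functions are hypothetical and do not reproduce the library's code.

```python
# Standalone sketch of the documented dictionary layout for a text block.
# Assumptions: feature names follow the TextBlock constructor, and a
# feature is "non-empty" when it is not None.
FEATURE_NAMES = ("text", "id", "type", "parent", "next", "score")


def textblock_to_dict(block_type, block_attrs, **features):
    data = {"block_type": block_type, **block_attrs}
    data.update({k: v for k, v in features.items() if v is not None})
    return data


def textblock_from_dict(data):
    """Split a dict back into (block_type, block attributes, features)."""
    features = {k: data[k] for k in FEATURE_NAMES if k in data}
    block_attrs = {
        k: v
        for k, v in data.items()
        if k != "block_type" and k not in FEATURE_NAMES
    }
    return data["block_type"], block_attrs, features


data = textblock_to_dict(
    "rectangle",
    {"x_1": 10, "y_1": 20, "x_2": 110, "y_2": 70},
    text="Hello",
    id=3,
    score=None,  # dropped because it is None
)
print(data)
print(textblock_from_dict(data))
```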

Layout

class layoutparser.elements.Layout(blocks: List = [], page_data: Dict = None)[source]

Bases: collections.abc.MutableSequence

The Layout class is designed for processing a list of layout elements on a page. It stores the layout elements in a list together with the related page_data, and provides convenient APIs for processing all the layout elements in batch.

Parameters
  • blocks (list) – A list of layout element blocks

  • page_data (Dict, optional) – A dictionary storing the page (canvas) related information like height, width, etc. Defaults to None.

insert(key, value)[source]

S.insert(index, value) – insert value before index

copy()[source]
relative_to(other)[source]
condition_on(other)[source]
is_in(other, soft_margin={}, center=False)[source]
filter_by(other, soft_margin={}, center=False)[source]

Return a Layout object containing the elements that are in the other object.

Parameters

other (BaseCoordElement) – The block to filter the current elements.

Returns

A new layout object after filtering.

Return type

Layout
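
Conceptually, filter_by keeps the elements for which is_in(other) holds. The standalone sketch below shows that idea on a plain list of rectangles; the containment test is a simplified stand-in with inclusive boundaries, soft_margin is omitted for brevity, and none of the code is taken from the library.

```python
# Standalone sketch: rectangles are (x_1, y_1, x_2, y_2) tuples.
# The containment test is a simplified stand-in for is_in().
def is_in(rect, other, center=False):
    x_1, y_1, x_2, y_2 = rect
    if center:
        c_x, c_y = (x_1 + x_2) / 2, (y_1 + y_2) / 2
        return other[0] <= c_x <= other[2] and other[1] <= c_y <= other[3]
    return (other[0] <= x_1 and x_2 <= other[2]
            and other[1] <= y_1 and y_2 <= other[3])


def filter_by(blocks, other, center=False):
    """Return a new list with only the blocks that are in other."""
    return [block for block in blocks if is_in(block, other, center=center)]


blocks = [(10, 10, 40, 30), (60, 10, 120, 30), (20, 50, 90, 80)]
region = (0, 0, 100, 100)

print(filter_by(blocks, region))
# [(10, 10, 40, 30), (20, 50, 90, 80)]
print(len(filter_by(blocks, region, center=True)))  # 3
```

The second call keeps the block that overhangs the region because its center, (90, 20), still lies inside.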

shift(shift_distance)[source]

Shift all layout elements by user-specified amounts along the x and y axes, respectively. If shift_distance is a single numeric value, the elements will be shifted by the same amount along both the x and y axes.

Parameters

shift_distance (numeric or Tuple(numeric) or List[numeric]) – The number of pixels used to shift the element.

Returns

A new layout object with all the elements shifted by the specified values.

Return type

Layout

pad(left=0, right=0, top=0, bottom=0, safe_mode=True)[source]

Pad all layout elements on the four sides of the polygon by the user-defined number of pixels. If safe_mode is set to True, the function will cut off the excess padding that falls on the negative side of the coordinates.

Parameters
  • left (int, optional, defaults to 0) – The number of pixels to pad on the left side of the polygon.

  • right (int, optional, defaults to 0) – The number of pixels to pad on the right side of the polygon.

  • top (int, optional, defaults to 0) – The number of pixels to pad on the upper side of the polygon.

  • bottom (int, optional, defaults to 0) – The number of pixels to pad on the lower side of the polygon.

  • safe_mode (bool, optional, defaults to True) – A bool value to toggle the safe_mode.

Returns

A new layout object with all the elements padded by the specified values.

Return type

Layout

scale(scale_factor)[source]

Scale all layout elements by user-specified amounts along the x and y axes, respectively. If scale_factor is a single numeric value, the elements will be scaled by the same amount along both the x and y axes.

Parameters

scale_factor (numeric or Tuple(numeric) or List[numeric]) – The amount for downscaling or upscaling the element.

Returns

A new layout object with all the elements scaled by the specified values.

Return type

Layout

crop_image(image)[source]
get_texts()[source]

Iterate through all the text blocks in the list and append their ocr’ed text results.

Returns

A list of text strings of the text blocks in the list of layout elements.

Return type

List[str]

get_info(attr_name)[source]

Given user-provided attribute name, check all the elements in the list and return the corresponding attribute values.

Parameters

attr_name (str) – The text string of certain attribute name.

Returns

The list of the corresponding attribute values (if they exist) of each element in the list.

Return type

List
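
The attribute lookup described above can be sketched on generic objects. The example is standalone and assumes that elements lacking the attribute are skipped rather than reported as None; SimpleNamespace objects stand in for layout elements.

```python
from types import SimpleNamespace


# Standalone sketch; SimpleNamespace objects stand in for layout elements.
# Assumption: elements without the attribute are skipped.
def get_info(elements, attr_name):
    return [
        getattr(element, attr_name)
        for element in elements
        if hasattr(element, attr_name)
    ]


elements = [
    SimpleNamespace(text="Title", type="title"),
    SimpleNamespace(text="Body"),
    SimpleNamespace(start=0, end=10),  # no text attribute
]

print(get_info(elements, "text"))  # ['Title', 'Body']
print(get_info(elements, "type"))  # ['title']
```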

to_dict() → Dict[str, Any][source]

Generate a dict representation of the layout object with the page_data and all the blocks in its dict representation.

Returns

The dictionary representation of the layout object.

Return type

Dict

get_homogeneous_blocks() → List[layoutparser.elements.BaseLayoutElement][source]

Convert all elements into blocks of the same type based on the type casting rule:

Interval < Rectangle < Quadrilateral < TextBlock

Returns

A list of base layout elements of the maximal compatible type.

Return type

List[BaseLayoutElement]
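
The casting rule can be read as "find the highest-ranked type present, then convert every element to it". The standalone sketch below works on type names only and shows one such conversion, from a rectangle to quadrilateral corner points; CAST_ORDER, maximal_type and rectangle_to_quadrilateral_points are hypothetical names used for illustration.

```python
# Standalone sketch of the casting rule; all names here are hypothetical.
CAST_ORDER = ("Interval", "Rectangle", "Quadrilateral", "TextBlock")


def maximal_type(type_names):
    """Return the highest-ranked type name according to CAST_ORDER."""
    return max(type_names, key=CAST_ORDER.index)


def rectangle_to_quadrilateral_points(x_1, y_1, x_2, y_2):
    """Corners clockwise from the upper left, as a quadrilateral expects."""
    return [(x_1, y_1), (x_2, y_1), (x_2, y_2), (x_1, y_2)]


print(maximal_type(["Rectangle", "Interval", "Quadrilateral"]))  # Quadrilateral
print(rectangle_to_quadrilateral_points(10, 20, 110, 70))
# [(10, 20), (110, 20), (110, 70), (10, 70)]
```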

to_dataframe(enforce_same_type=False) → pandas.core.frame.DataFrame[source]

Convert the layout object into a pandas DataFrame. Warning: the page data won’t be exported.

Parameters

enforce_same_type (bool, optional) – If True, convert all the contained blocks to the maximal compatible data type. Defaults to False.

Returns

The DataFrame representation of the layout object.

Return type

pd.DataFrame