Load COCO Layout Annotations

Preparation

In this notebook, I will illustrate how to use LayoutParser to load and visualize layout annotations stored in the COCO format.

Before starting, please download the PubLayNet annotations and images from their website (let’s just use the validation set for now, as the training set is very large), and extract the files into the data/publaynet/annotations and data/publaynet/val folders.

We also need to install an additional library for handling the COCO data format:

pip install pycocotools

OK, let’s get to the code:

Loading and visualizing layouts using Layout-Parser

from pycocotools.coco import COCO
import layoutparser as lp
import random
import cv2

def load_coco_annotations(annotations, coco=None):
    """
    Args:
        annotations (List):
            a list of COCO annotations for the current image
        coco (`optional`, defaults to `None`):
            COCO annotation object instance. If set, this function will
            convert the loaded annotation category ids to the category
            names stored in coco.cats
    """
    layout = lp.Layout()

    for ele in annotations:

        # COCO boxes are [x, y, width, height]; lp.Rectangle takes corner coordinates
        x, y, w, h = ele['bbox']

        layout.append(
            lp.TextBlock(
                block = lp.Rectangle(x, y, w+x, h+y),
                type  = ele['category_id'] if coco is None else coco.cats[ele['category_id']]['name'],
                id = ele['id']
            )
        )

    return layout

The load_coco_annotations function converts COCO annotations into layoutparser objects.
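The one step in that function that is easy to get wrong is the box conversion: COCO stores each `bbox` as `[x, y, width, height]`, while `lp.Rectangle` takes the two corners `(x_1, y_1, x_2, y_2)`. Here is a minimal, dependency-free sketch of just that arithmetic. The helper name `coco_bbox_to_corners` is made up for illustration and is not part of LayoutParser.

```python
def coco_bbox_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to (x_1, y_1, x_2, y_2)."""
    x, y, w, h = bbox
    # The bottom-right corner is the top-left corner shifted by the box size.
    return (x, y, x + w, y + h)


# Hypothetical box, not taken from PubLayNet.
corners = coco_bbox_to_corners([10.0, 20.0, 30.0, 40.0])
print(corners)  # (10.0, 20.0, 40.0, 60.0)
```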

COCO_ANNO_PATH = 'data/publaynet/annotations/val.json'
COCO_IMG_PATH  = 'data/publaynet/val'

coco = COCO(COCO_ANNO_PATH)

loading annotations into memory...
Done (t=1.17s)
creating index...
index created!
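If you have not worked with COCO files before: the annotation file is a single JSON object with `images`, `annotations`, and `categories` lists. The sketch below builds a tiny in-memory example (all values are made up, not taken from PubLayNet) and groups the annotations by `image_id`. This is only a rough picture of the kind of lookup the index gives us through `coco.getAnnIds` and `coco.loadAnns`, not how pycocotools is implemented.

```python
from collections import defaultdict

# A toy COCO-style dictionary; every value here is hypothetical.
toy_coco = {
    "images": [{"id": 1, "file_name": "page_1.jpg", "width": 600, "height": 800}],
    "categories": [{"id": 1, "name": "text"}, {"id": 2, "name": "title"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 2, "bbox": [50, 40, 500, 30]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 90, 500, 300]},
    ],
}

# Map category ids to readable names.
category_names = {c["id"]: c["name"] for c in toy_coco["categories"]}

# Group annotations by the image they belong to.
anns_by_image = defaultdict(list)
for ann in toy_coco["annotations"]:
    anns_by_image[ann["image_id"]].append(ann)

for ann in anns_by_image[1]:
    print(ann["id"], category_names[ann["category_id"]], ann["bbox"])
```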

color_map = {
    'text':   'red',
    'title':  'blue',
    'list':   'green',
    'table':  'purple',
    'figure': 'pink',
}


# random.sample needs a sequence, so turn the dict keys into a list first
for image_id in random.sample(list(coco.imgs.keys()), 1):
    image_info = coco.imgs[image_id]
    annotations = coco.loadAnns(coco.getAnnIds([image_id]))

    image = cv2.imread(f'{COCO_IMG_PATH}/{image_info["file_name"]}')
    layout = load_coco_annotations(annotations, coco)

    viz = lp.draw_box(image, layout, color_map=color_map)
    display(viz) # show the results
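Because the colors are keyed by category name, it can be worth checking that every category has an entry in `color_map` before drawing. A small sketch of such a check follows. The `category_names` list is hard-coded here so the snippet runs on its own; with the real data I would assume it can be built as `[c["name"] for c in coco.cats.values()]`.

```python
color_map = {
    "text": "red",
    "title": "blue",
    "list": "green",
    "table": "purple",
    "figure": "pink",
}

# Assumed category names; replace with the names read from the COCO object.
category_names = ["text", "title", "list", "table", "figure"]

# Any category without a color ends up in this list.
missing = sorted(set(category_names) - set(color_map))
print(missing)  # []
```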

You can add more information to the visualization, for example by showing each block's id and type.

lp.draw_box(image,
              [b.set(id=f'{b.id}/{b.type}') for b in layout],
              color_map=color_map,
              show_element_id=True, id_font_size=10,
              id_text_background_color='grey',
              id_text_color='white')

Model Predictions on loaded data

We can also check how a trained layout model performs on the same image. Following these instructions, we can load a layout prediction model and run predictions on the image loaded above.

model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "text", 1: "title", 2: "list", 3:"table", 4:"figure"})

layout_predicted = model.detect(image)

lp.draw_box(image,
              [b.set(id=f'{b.type}/{b.score:.2f}') for b in layout_predicted],
              color_map=color_map,
              show_element_id=True, id_font_size=10,
              id_text_background_color='grey',
              id_text_color='white')
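To go beyond eyeballing the two visualizations, one common way to compare a predicted box with an annotated one is intersection over union (IoU). Below is a minimal sketch on plain `(x_1, y_1, x_2, y_2)` tuples. The function name `iou` is my own, the boxes are made up, and this is not a replacement for a full COCO-style evaluation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_1, y_1, x_2, y_2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two hypothetical 10x10 boxes that overlap on half their width.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

With LayoutParser blocks, I would expect the corner tuple to come from each block's `coordinates` property, so the same function could be applied to pairs drawn from `layout` and `layout_predicted`.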