Semantic segmentation#

Overview#

The ground truth exposed in this modality is the parent body part of each pixel visible in the corresponding visual spectrum image.

This modality consists of the following files:

Relevant file                         Location
semantic_segmentation.png             Camera folder
semantic_segmentation_metadata.json   Scene folder

semantic_segmentation.png#

This file contains a version of the visual spectrum image that has been converted into a semantic segmentation map.

A semantic segmentation map of a human face (left) and the corresponding visual spectrum image (right)

In a semantic segmentation map, each pixel's original color is replaced with a color value that indicates the class to which that pixel belongs. The color key is found in semantic_segmentation_metadata.json, in the Scene folder.

To process a semantic segmentation map, we recommend using code along the following lines:

import cv2

def load_segmentation(path):
    # IMREAD_UNCHANGED preserves the PNG's 16-bit depth; cvtColor reorders OpenCV's BGR channels to RGB
    segmentation = cv2.cvtColor(cv2.imread(path, cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB)
    return segmentation

path = "semantic_segmentation.png"
segmentation = load_segmentation(path)

We strongly recommend against using PIL or plt.imread to load the semantic segmentation file: these loaders reduce the uint16 color values to uint8, so the loaded colors will no longer match the color key.
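As a quick sanity check after loading, you can confirm that the array kept its 16-bit depth:

import numpy as np

segmentation = load_segmentation("semantic_segmentation.png")
# If this assertion fails, the file was loaded through a code path that downcast it to 8 bits.
assert segmentation.dtype == np.uint16, f"expected uint16, got {segmentation.dtype}"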

Using this ground truth, you can train your model to recognize individual parts of the face and verify the network's predictions against the ground truth. See https://github.com/DatagenTech/dgutils/blob/master/Notebooks/003_semantic_segmentation.ipynb for examples of how to use semantic segmentation to isolate objects in the datapoint.
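As a minimal illustration of the same idea, the sketch below builds a boolean mask for one class by comparing each pixel against that class's RGB value from the lookup table. The "items" key, the flat name-to-RGB structure, and the "beard" entry are assumptions made for this example, not the documented schema; inspect semantic_segmentation_metadata.json in your own datapoint for the actual field names.

import json
import numpy as np

# Hypothetical schema: metadata["items"] maps each class name to its [R, G, B] value.
with open("semantic_segmentation_metadata.json") as f:
    metadata = json.load(f)

segmentation = load_segmentation("semantic_segmentation.png")

beard_color = np.array(metadata["items"]["beard"], dtype=segmentation.dtype)
beard_mask = np.all(segmentation == beard_color, axis=-1)  # True wherever the pixel is labeled "beard"
print(f"{beard_mask.sum()} pixels are labeled as beard")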

Because the semantic segmentation of the subject in the scene remains the same regardless of lighting conditions and background imagery, only one semantic segmentation map per camera is needed regardless of the number of lighting scenarios. If you have more than one camera in the scene, each camera folder has its own semantic segmentation map, showing the body parts from that camera’s point of view.
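For example, to load every camera's view of the same scene you can iterate over the camera folders. The folder names below are placeholders, so adapt the paths to your dataset's layout.

from pathlib import Path

# Placeholder layout: one sub-folder per camera inside the datapoint, each holding
# its own semantic_segmentation.png.
datapoint_dir = Path("datapoint_0001")
for seg_path in sorted(datapoint_dir.glob("*/semantic_segmentation.png")):
    segmentation = load_segmentation(str(seg_path))
    print(seg_path.parent.name, segmentation.shape)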

semantic_segmentation_metadata.json#

This file contains the lookup table of RGB values that were assigned to each semantic object in semantic_segmentation.png. All datasets use the same lookup table, though it is subject to change as we introduce new features into the platform.

At the top of this file is a single field called version, a string that provides version tracking for this file. Whenever you access this file in a datapoint, check that the version matches the one you expect; otherwise its format and fields may differ from what your code expects.
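A minimal sketch of such a check, with a placeholder expected version string:

import json

EXPECTED_VERSION = "1.0"  # placeholder; pin this to the version your pipeline was built against

with open("semantic_segmentation_metadata.json") as f:
    metadata = json.load(f)

if metadata["version"] != EXPECTED_VERSION:
    raise ValueError(f"Unrecognized semantic_segmentation_metadata.json version: {metadata['version']}")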

Because the lookup table remains the same across all cameras and lighting scenarios, this file is saved in the Scene folder. Each datapoint has its own version of the file, which lists only items present in the scene. For example, if the subject of the scene does not have a beard, the “beard” item will not appear in the file.
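Because of this, it is worth checking that a class is present before trying to isolate it (again using the hypothetical "items" key from the sketch above):

# Skip classes that are absent from this datapoint's lookup table.
if "beard" in metadata.get("items", {}):
    print("This datapoint contains a beard segmentation.")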

Click here for a complete list of segmentations.