Where to find the modalities#

When you download a dataset, it is packaged as a .tar.gz archive that must be extracted with an archive tool such as tar or 7-Zip. The dataset is organized first by scene, then by camera.
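If you prefer to extract the archive programmatically, Python's standard-library tarfile module can unpack a .tar.gz in one call. A minimal sketch; the archive name below is a placeholder for your own download:

import tarfile

# Hypothetical archive name; substitute the file you downloaded.
with tarfile.open("DataGen-1.tar.gz", "r:gz") as archive:
    # Unpacks the scene/camera folder tree described below.
    # Adjust the target path if the archive already contains a top-level folder.
    archive.extractall("DataGen-1")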

Scene-level modalities#

The first level of the dataset is organized by scene. A scene consists of a specific set of 3D objects; in the case of the Faces platform, a specific human face wearing a specific expression. Your dataset contains one folder for each scene that you generated. These folders are named scene_00001, scene_00002, scene_00003, and so on:

DataGen-1
|---->scene_00001
|---->scene_00002
|---->scene_00003
    etc.

The files in the scene folder contain information about the scene that does not depend on the camera's location or orientation within the scene (a loading sketch follows the list):

  • actor_metadata.json: The parameters that were used in generating the human face. This file contains information on the identity, expression, position, location, rotation, and keypoints that are used to define the actor’s eyes. See the Actor metadata and Eye keypoints modalities.

  • lights_metadata.json: Settings that are used for special lighting conditions such as infrared lighting. See the Visible spectrum image modality.

  • semantic_segmentation_metadata.json: A reference file for the colors that are used in semantic segmentation images. See the Semantic segmentation modality.
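As a rough sketch of how you might walk the scene level, the snippet below iterates over the scene_* folders and parses each scene's JSON files with Python's standard library. The DataGen-1 root folder follows the tree above; the internal structure of these JSON files is not documented in this section, so the code only loads them and reports the actor metadata's top-level keys.

import json
from pathlib import Path

root = Path("DataGen-1")  # dataset root, as in the tree above

for scene_dir in sorted(root.glob("scene_*")):
    # Scene-level files: the same for every camera in the scene.
    actor = json.loads((scene_dir / "actor_metadata.json").read_text())
    lights = json.loads((scene_dir / "lights_metadata.json").read_text())
    segmentation = json.loads(
        (scene_dir / "semantic_segmentation_metadata.json").read_text()
    )
    # Assuming the actor metadata parses to a dict, list its top-level keys.
    print(scene_dir.name, sorted(actor))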

Camera-level modalities#

The second level of organization is by camera. Most of the data collected about the human face in the scene depends heavily on where you are viewing the scene from; in other words, on the camera's location and settings. The rendered images themselves look different depending on where the camera is in the scene, its field of view, and other settings. The same is true for the coordinates of facial landmarks in the image, the distances between parts of the face and the camera lens, and so on; these all change depending on where the camera is and how it is configured.

Camera-level information is stored in separate folders inside the scene folder. Each camera in the scene is given its own folder:

DataGen-1
|---->scene_00001
    |---->camera_00001
    |---->camera_00002
    |---->camera_00003
        etc.
|---->scene_00002
    |---->camera_00001
    |---->camera_00002
    |---->camera_00003
        etc.
|---->scene_00003
    |---->camera_00001
    |---->camera_00002
    |---->camera_00003
        etc.
    etc.
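Because each scene folder nests one folder per camera, iterating over the whole dataset is a two-level loop. A minimal sketch, again assuming the DataGen-1 root shown above:

from pathlib import Path

root = Path("DataGen-1")

for scene_dir in sorted(root.glob("scene_*")):
    for camera_dir in sorted(scene_dir.glob("camera_*")):
        # Every camera folder under one scene views the same face
        # from a different position.
        print(scene_dir.name, camera_dir.name)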

Each camera folder contains the following camera-level information (see the sketch after the list):

  • camera_metadata.json: The camera’s intrinsic and extrinsic parameters, such as location, orientation, resolution, and aspect ratio. See the Camera metadata modality.

  • depth.exr: A depth map of the face, giving the distance from the camera lens to the point on the face's surface that each pixel depicts. See the Depth map modality.

  • environment.json: The list of lighting scenarios that were used when generating each datapoint. See the Visible spectrum image modality.

  • face_bounding_box.json: The coordinates of the corners of the bounding box surrounding the subject’s face in each datapoint. See the Bounding box modality.

  • infrared_spectrum.png: A rendered image of the subject in near-infrared lighting, if you used this option in your dataset. See the Visible spectrum image modality.

  • normal_maps.exr: An image of the face in which each pixel is recolored based on the direction of the normal vector coming out of the face at that location. See the Normal map modality.

  • semantic_segmentation.png: An image of the face in which each pixel is recolored based on the body part that it belongs to. See the Semantic segmentation modality.

  • visible_spectrum.png: A datapoint containing a visible spectrum image of the subject, using a lighting scenario of your choosing. See the Visible spectrum image modality.

  • key_points: A folder containing a set of JSON files that describe keypoint locations; see below.
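The sketch below loads a few of these files for a single camera folder. The JSON and PNG modalities use the standard library and Pillow; the .exr files are OpenEXR images, and one common way to read them is OpenCV with its OpenEXR support enabled, as the comments note. The paths follow the file names listed above; everything else is an assumption.

import json
import os
from pathlib import Path

os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1"  # must be set before importing cv2
import cv2                 # pip install opencv-python
from PIL import Image      # pip install Pillow

camera_dir = Path("DataGen-1/scene_00001/camera_00001")  # example path

# JSON modalities: plain dictionaries/lists once parsed.
camera_meta = json.loads((camera_dir / "camera_metadata.json").read_text())
bbox = json.loads((camera_dir / "face_bounding_box.json").read_text())

# PNG modalities: regular images.
rgb = Image.open(camera_dir / "visible_spectrum.png")
segmentation = Image.open(camera_dir / "semantic_segmentation.png")

# EXR modalities: floating-point images; IMREAD_UNCHANGED keeps the raw values.
depth = cv2.imread(str(camera_dir / "depth.exr"), cv2.IMREAD_UNCHANGED)
normals = cv2.imread(str(camera_dir / "normal_maps.exr"), cv2.IMREAD_UNCHANGED)

print(rgb.size, depth.shape)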

Remember: every camera subfolder in the same scene describes the same human face, but from different angles.

The key_points folder#

Each camera folder contains a single key_points subfolder. This folder collects all of the keypoint modalities in a single location (see the sketch below):

  • ears_key_points.json: The 2D and 3D locations of the 55 keypoints in the iBUG ear keypoint standard. See the Ear keypoints modality.

  • face_dense_key_points.json: The 2D and 3D locations of the 468 keypoints in Google’s MediaPipe facial keypoint standard. See the Facial keypoints (MediaPipe) modality.

  • face_standard_key_points.json: The 2D and 3D locations of the 68 keypoints in the iBUG facial keypoint standard. See the Facial keypoints (iBUG) modality.

  • head_key_points.json: The 2D and 3D locations of the 81 keypoints in a standard developed by Datagen that describes the structure of the human head. See the Head keypoints modality.

  • all_key_points.json: A collection of all of the above sets of keypoints in a single file for convenience.
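Since each keypoint file is plain JSON, reading them is straightforward. The exact schema is not reproduced in this section, so this sketch simply loads every file in the folder and reports what it finds; the file names follow the list above, and the example path is hypothetical.

import json
from pathlib import Path

key_points_dir = Path("DataGen-1/scene_00001/camera_00001/key_points")

# Matches all five files listed above, including all_key_points.json.
for path in sorted(key_points_dir.glob("*_key_points.json")):
    data = json.loads(path.read_text())
    # The schema is not documented here, so just inspect the parsed type.
    print(path.name, type(data).__name__)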