Eye keypoints

Overview

This modality exposes ground truth for a set of eye keypoints that follow a standard developed by Datagen.

This modality consists of the following file:

Relevant file          Location
actor_metadata.json    Scene folder

actor_metadata.json

This file contains information on the subject’s eyes, in two parts:

  • The 3D and 2D coordinates of a set of eye landmarks defined by Datagen.

  • Information about the direction of gaze, including normalized vectors and the target of the gaze in 3D coordinates.

The file uses the following format:

"apex_of_cornea_point": {
    // 3D and 2D coordinates of the apex of the cornea.
},
"center_of_rotation_point": {
    // 3D and 2D coordinates of the eyeball's center of rotation.
},
"iris_circle": {
    // 3D and 2D coordinates of 12 keypoints that define the circle of the iris.
},
"center_of_iris_point": {
    // 3D and 2D coordinates of the center of the iris.
},
"pupil_circle": {
    // 3D and 2D coordinates of 12 keypoints that define the circle of the pupil.
},
"center_of_pupil_point": {
    // 3D and 2D coordinates of the center of the pupil.
},
"eye_gaze": {
    // Detailed information about the direction in which the subject is looking
},
"eyelid_closure_intensity_level": 1

Note

There are additional fields in actor_metadata.json that are part of the Actor metadata modality.
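
As a minimal sketch of reading these fields, the Python snippet below loads the file and keeps only the eye keypoint entries. It assumes the eight keys shown above sit at the top level of actor_metadata.json. The names `EYE_KEYPOINT_FIELDS`, `pick_eye_fields`, and `load_eye_fields` are illustrative helpers, not part of any Datagen API.

```python
import json

# Field names taken from the format listing above.
EYE_KEYPOINT_FIELDS = (
    "apex_of_cornea_point",
    "center_of_rotation_point",
    "iris_circle",
    "center_of_iris_point",
    "pupil_circle",
    "center_of_pupil_point",
    "eye_gaze",
    "eyelid_closure_intensity_level",
)


def pick_eye_fields(metadata):
    """Return only the eye keypoint entries of a parsed actor_metadata dict.

    Assumption: the fields are top-level keys. Missing keys are skipped.
    """
    return {k: metadata[k] for k in EYE_KEYPOINT_FIELDS if k in metadata}


def load_eye_fields(path):
    """Parse the JSON file at `path` and return its eye keypoint entries."""
    with open(path, encoding="utf-8") as f:
        return pick_eye_fields(json.load(f))
```

Calling `load_eye_fields` with the path to the actor_metadata.json file in a scene folder would return a dictionary holding only the fields described on this page, leaving the Actor metadata fields aside.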

Objects and fields:

  • apex_of_cornea_point: Object. This object contains the 3D coordinates of the apex (the highest point) of the cornea in global coordinates, as well as the 2D coordinates of that point in each of the datapoints you generated from the scene. The object uses the following structure, in which all 2D coordinate values are of type Int and all 3D coordinate values are of type Float:

    "apex_of_cornea_point": {
       "2d": {
          "camera_1": {
             "right_eye": {
                "x": 80,
                "y": 911
             },
             "left_eye": {
                "x": 79,
                "y": 974
             }
          },
          "camera_2": {
             "right_eye": {
                "x": 100,
                "y": 563
             },
             "left_eye": {
                "x": 100,
                "y": 595
             }
          }
       },
       "3d": {
          "right_eye": {
             "x": 0.7163663506507874,
             "y": 0.07476498186588287,
             "z": 0.16999228298664093
          },
          "left_eye": {
             "x": 0.7749997973442078,
             "y": 0.07624759525060654,
             "z": 0.17110392451286316
          }
       }
    },
    

    The apex of the cornea is a crucial part of determining the optical axis vector, which is the direction that a person appears to be looking from the point of view of an external observer.

    Figure: The apex of the cornea.

  • center_of_rotation_point: Object. This object contains the coordinates of the center of the eyeball, around which the eyeball rotates. The coordinates are provided in both 3D form (global coordinates) and 2D form (location in each of the datapoints you generated from the scene). The object uses the following structure, in which all 2D coordinate values are of type Int and all 3D coordinate values are of type Float:

    "center_of_rotation_point": {
       "2d": {
          "camera_1": {
             "right_eye": {
                "x": 87,
                "y": 906
             },
             "left_eye": {
                "x": 85,
                "y": 968
             }
          },
          "camera_2": {
             "right_eye": {
                "x": 104,
                "y": 561
             },
             "left_eye": {
                "x": 103,
                "y": 593
             }
          }
       },
       "3d": {
          "right_eye": {
             "x": 0.716782808303833,
             "y": 0.08541443198919296,
             "z": 0.16397099196910858
          },
          "left_eye": {
             "x": 0.7742490172386169,
             "y": 0.08667740225791931,
             "z": 0.16510838270187378
          }
       }
    },
    

    The center of rotation is a crucial part of determining the visual axis vector, which is the direction that a person is looking.

    Figure: The eye’s center of rotation.

  • iris_circle: Object. This object contains the coordinates of a series of points that define the iris. The coordinates are provided in both 3D form (global coordinates) and 2D form (location in each of the datapoints you generated from the scene). The object uses the following overall structure:

    "iris_circle": {
       "2d": {
          "camera_1": {
             "right_eye": [
                 //Array of 12 points in 2D format
             ],
             "left_eye": [
                 //Array of 12 points in 2D format
             ]
          },
          "camera_2": {
              "right_eye": [
                 //Array of 12 points in 2D format
             ],
             "left_eye": [
                 //Array of 12 points in 2D format
             ]
          }
       },
       "3d": {
          "right_eye": [
             //Array of 12 points in 3D format
          ],
          "left_eye": [
             //Array of 12 points in 3D format
          ]
       }
    },
    

    Each array of 2D points contains twelve objects with the following structure, each of which contains two Ints:

    {
        "x": 98,
        "y": 596
    },
    

    Each array of 3D points contains twelve objects with the following structure, each of which contains three Floats:

    {
       "x": 0.7164850234985352,
       "y": 0.08002109080553055,
       "z": 0.17416279017925262
    },
    
    Figure: Twelve keypoints that define the boundary of the iris, and another point at its center.
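
The twelve 3D boundary points can be reduced to an approximate center and radius. The sketch below is one simple way to do that, not a method documented by Datagen: it takes the mean of the points as the center and the mean distance to that center as the radius, which is only reasonable if the points are spread evenly around the circle. The function name `circle_center_and_radius` is hypothetical.

```python
import math


def circle_center_and_radius(points):
    """Estimate center and radius from a list of {"x", "y", "z"} dicts.

    The center is the mean of the points and the radius is the mean
    distance from that center. This is an approximation that assumes
    the points are spread evenly around the circle.
    """
    n = len(points)
    center = {axis: sum(p[axis] for p in points) / n for axis in ("x", "y", "z")}
    radius = sum(
        math.dist(
            (p["x"], p["y"], p["z"]),
            (center["x"], center["y"], center["z"]),
        )
        for p in points
    ) / n
    return center, radius
```

Passing in one of the arrays under `iris_circle["3d"]` gives an estimate that can be compared against the corresponding entry of center_of_iris_point. The same function applies unchanged to the pupil_circle arrays.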

  • center_of_iris_point: Object. This object contains the 3D coordinates of the center of the iris in global coordinates, as well as the 2D coordinates of that point in each of the datapoints you generated from the scene. The object uses the following structure, in which all 2D coordinate values are of type Int and all 3D coordinate values are of type Float:

    "center_of_iris_point": {
       "2d": {
          "camera_1": {
             "right_eye": {
                "x": 81,
                "y": 910
             },
             "left_eye": {
                "x": 80,
                "y": 973
             }
          },
          "camera_2": {
             "right_eye": {
                "x": 101,
                "y": 563
             },
             "left_eye": {
                "x": 100,
                "y": 595
             }
          }
       },
       "3d": {
          "right_eye": {
             "x": 0.7165213823318481,
             "y": 0.07704445719718933,
             "z": 0.16874763369560242
          },
          "left_eye": {
             "x": 0.7748080492019653,
             "y": 0.07848701626062393,
             "z": 0.16979405283927917
          }
       }
    },
    
  • pupil_circle: Object. This object contains the coordinates of a series of points that define the pupil. The coordinates are provided in both 3D form (global coordinates) and 2D form (location in each of the datapoints you generated from the scene). The object uses the following overall structure:

    "pupil_circle": {
       "2d": {
          "camera_1": {
             "right_eye": [
                 //Array of 12 points in 2D format
             ],
             "left_eye": [
                 //Array of 12 points in 2D format
             ]
          },
          "camera_2": {
              "right_eye": [
                 //Array of 12 points in 2D format
             ],
             "left_eye": [
                 //Array of 12 points in 2D format
             ]
          }
       },
       "3d": {
          "right_eye": [
             //Array of 12 points in 3D format
          ],
          "left_eye": [
             //Array of 12 points in 3D format
          ]
       }
    },
    

    Each array of 2D points contains twelve objects with the following structure, each of which contains two Ints:

    {
       "x": 100,
       "y": 562
    },
    

    Each array of 3D points contains twelve objects with the following structure, each of which contains three Floats:

    {
       "x": 0.7164993286132812,
       "y": 0.07862702757120132,
       "z": 0.17070025205612183
    },
    
    Figure: Twelve keypoints that define the boundary of the pupil, and another point at its center.
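
The 2D part of the structure is nested by camera and then by eye. As an illustration of walking that nesting, the sketch below computes a pixel bounding box for each eye in each camera from the twelve 2D boundary points. The function name `bounding_boxes_2d` is hypothetical, and the camera names are read from the data rather than assumed.

```python
def bounding_boxes_2d(circle):
    """Map camera -> eye -> (min_x, min_y, max_x, max_y) in pixels.

    `circle` is a parsed pupil_circle (or iris_circle) object with a
    "2d" entry laid out as camera -> eye -> list of {"x", "y"} points.
    """
    boxes = {}
    for camera, eyes in circle["2d"].items():
        boxes[camera] = {}
        for eye, points in eyes.items():
            xs = [p["x"] for p in points]
            ys = [p["y"] for p in points]
            boxes[camera][eye] = (min(xs), min(ys), max(xs), max(ys))
    return boxes
```

A box like this can serve as a quick crop region around the pupil or iris in the rendered image of the corresponding camera.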

  • center_of_pupil_point: Object. This object contains the 3D coordinates of the center of the pupil in global coordinates, as well as the 2D coordinates of that point in each of the datapoints you generated from the scene. The object uses the following structure, in which all 2D coordinate values are of type Int and all 3D coordinate values are of type Float:

    "center_of_pupil_point": {
       "2d": {
          "camera_1": {
             "right_eye": {
                "x": 81,
                "y": 910
             },
             "left_eye": {
                "x": 80,
                "y": 973
             }
          },
          "camera_2": {
             "right_eye": {
                "x": 101,
                "y": 563
             },
             "left_eye": {
                "x": 101,
                "y": 595
             }
          }
       },
       "3d": {
          "right_eye": {
             "x": 0.7164666056632996,
             "y": 0.07730603218078613,
             "z": 0.16853395104408264
          },
          "left_eye": {
             "x": 0.7747995257377625,
             "y": 0.07873070985078812,
             "z": 0.1696217656135559
          }
       }
    },
    

    The center of the pupil is a crucial part of determining the optical axis vector, which is the direction that a person appears to be looking from the point of view of an external observer.

  • eye_gaze: Object. This object contains information on where the subject is looking, presented in three ways:

    • axis_directions: Object. This object contains the normalized optical and visual axis vectors for each eye.

      The optical axis vector is the direction that the subject appears to be looking from the point of view of an external observer. It is calculated using the following formula:

      \[\text{optical axis direction} = \text{apex of cornea point} - \text{center of pupil point}\]

      The visual axis vector is the direction that the subject is actually looking. (The direction of this vector is always very slightly offset from the direction of the optical axis vector.) The visual axis vector is calculated using the following formula:

      \[\text{visual axis direction} = \text{center of rotation point} - \text{fovea point}\]

      The optical and visual axis vectors are presented using this format:

      "axis_directions": {
         "right_eye": {
            "axis_directions": {
               "visual_axis_direction": {
                  "x": 0.014786778155726103,
                  "y": -0.8652962281691663,
                  "z": 0.5010427014816077
               },
               "optical_axis_direction": {
                  "x": -0.034199164818520936,
                  "y": -0.8668075176884541,
                  "z": 0.49746873711269146
               }
            }
         },
         "left_eye": {
            "axis_directions": {
               "visual_axis_direction": {
                  "x": 0.009155659821882036,
                  "y": -0.8657963448364028,
                  "z": 0.5003127653389934
               },
               "optical_axis_direction": {
                  "x": 0.0690889121195633,
                  "y": -0.8566151210336242,
                  "z": 0.5113093551253336
               }
            }
         }
      },
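
The optical axis formula above can be checked against the sample values on this page. The sketch below subtracts the right eye's center_of_pupil_point from its apex_of_cornea_point, normalizes the result, and also measures the angle between the resulting optical axis and the documented visual axis. The helper names `normalize`, `optical_axis`, and `angle_deg` are illustrative only.

```python
import math


def normalize(v):
    """Scale a 3-tuple to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def optical_axis(apex, pupil_center):
    """Normalized (apex of cornea - center of pupil), per the formula above."""
    return normalize(tuple(apex[k] - pupil_center[k] for k in ("x", "y", "z")))


def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


# Right-eye sample values copied from this page.
apex = {"x": 0.7163663506507874, "y": 0.07476498186588287, "z": 0.16999228298664093}
pupil = {"x": 0.7164666056632996, "y": 0.07730603218078613, "z": 0.16853395104408264}
visual = (0.014786778155726103, -0.8652962281691663, 0.5010427014816077)

optical = optical_axis(apex, pupil)
offset = angle_deg(optical, visual)
```

With these inputs, `optical` agrees with the right eye's optical_axis_direction shown above to about three decimal places, and `offset` comes out at a little under 3 degrees, which illustrates the slight offset between the two axes.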
      
    • target_point: Object. This object contains the target of the subject’s gaze in global 3D coordinates. Generally both the right eye and the left eye are looking at the same point, which is presented as a set of three Floats, as follows:

      "target_point": {
         "right_eye": {
            "x": 0.9370604157447815,
            "y": -13.373147964477539,
            "z": 7.9055495262146
         },
         "left_eye": {
            "x": 0.9370604157447815,
            "y": -13.373147964477539,
            "z": 7.9055495262146
         }
      },
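
One way to sanity-check target_point is to compare the direction from an eye to the target against that eye's visual_axis_direction. The sketch below does this for the right-eye sample values on this page. Using the center of rotation as the origin of the ray is an assumption made for illustration, not a documented definition, and the helper names `direction_to_target` and `angle_deg` are hypothetical.

```python
import math


def direction_to_target(origin, target):
    """Unit vector pointing from `origin` to `target` (dicts with x, y, z)."""
    d = [target[k] - origin[k] for k in ("x", "y", "z")]
    n = math.sqrt(sum(c * c for c in d))
    return tuple(c / n for c in d)


def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


# Right-eye sample values copied from this page.
center_of_rotation = {"x": 0.716782808303833, "y": 0.08541443198919296, "z": 0.16397099196910858}
target_point = {"x": 0.9370604157447815, "y": -13.373147964477539, "z": 7.9055495262146}
visual_axis = (0.014786778155726103, -0.8652962281691663, 0.5010427014816077)

to_target = direction_to_target(center_of_rotation, target_point)
mismatch = angle_deg(to_target, visual_axis)
```

For these sample values the two directions are less than one degree apart, so the visual axis points very nearly at the target.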
      
    • eye_gaze_direction_type: String. This field contains a brief description of the general direction in which the subject is looking. Typical values are forward, up, down, left, right, top_left, top_right, bottom_left, and bottom_right.

  • eyelid_closure_intensity_level: Int. This field describes how far the subject’s eyes are closed, on a scale of 1 (completely open) to 5 (completely closed).