# About our coordinate systems

## Overview

Every keypoint identified in Datagen’s modalities is provided in both 2D and 3D coordinates. This page describes the coordinate systems that our modalities use.

## 3D Coordinates

3D coordinates describe a location in objective space. They are therefore camera-independent: You can have fifteen cameras photographing the same landmark from different angles and with different settings, but the landmark’s 3D coordinates will remain the same for all of them.

3D coordinates describe the distance between the landmark and the global origin, in a global coordinate system measured in meters. In this coordinate system, x is to the left and right; y is forward and back; and z is up and down. These directions are from the point of view of the default positions of the camera and the head, where the camera is located at (0, -1.6, 0.12) and the subject’s head is located at the origin (0, 0, 0).

Every time a set of 3D coordinates appears in Datagen’s metadata, it uses the same format: an object called “global_3d” containing three *Floats* that describe the x, y, and z coordinates of a landmark or keypoint:

```
"global_3d": {
    "x": 0.1370361637738016,
    "y": -0.7877883480654823,
    "z": 0.022076432851867542
},
```
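As a rough illustration of working with this format, the Python sketch below parses a `global_3d` object and measures its distance in meters from the global origin and from the default camera position of (0, -1.6, 0.12). The names `to_xyz`, `distance`, and `DEFAULT_CAMERA` are illustrative and are not part of Datagen's tooling; the sample is wrapped in braces so that it parses as standalone JSON.

```python
import json
import math

# Sample values from the example above, wrapped in braces to form valid JSON.
raw = (
    '{"global_3d": {"x": 0.1370361637738016, '
    '"y": -0.7877883480654823, "z": 0.022076432851867542}}'
)

# Default camera position in meters (illustrative constant name).
DEFAULT_CAMERA = (0.0, -1.6, 0.12)


def to_xyz(global_3d):
    """Convert a global_3d object into an (x, y, z) tuple of floats."""
    return (float(global_3d["x"]), float(global_3d["y"]), float(global_3d["z"]))


def distance(a, b):
    """Euclidean distance in meters between two (x, y, z) points."""
    return math.dist(a, b)


point = to_xyz(json.loads(raw)["global_3d"])
dist_to_origin = distance(point, (0.0, 0.0, 0.0))
dist_to_camera = distance(point, DEFAULT_CAMERA)
```

Because the coordinates are global, `dist_to_origin` stays the same no matter which camera rendered the image; only `dist_to_camera` depends on where the camera was placed.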

### Head location and rotation in the 3D space

#### Default position

Since a head is not a point-sized object, it is important to specify which part of the head is located at the origin. The coordinates of the head define the location of the suprasternal notch (the small pocket at the center of the V at the front of the neck). If you assign the head to a different position or range of positions in our platform, you are defining the location of the suprasternal notch.

As an example of how the head is oriented around the suprasternal notch, let us take keypoint 28, which is the keypoint directly between the eyes in the iBUG facial keypoint standard. Because it is particularly useful for defining the location and orientation of the face as opposed to the neck, this keypoint is separately included in actor_metadata.json.

At default settings, when the head has not been moved or rotated and the suprasternal notch is at the origin, the coordinates of keypoint 28 are as follows:

- **x** will be very close to 0, because the keypoint is in the center of the face and therefore almost directly above the origin on the left-right axis.
- **y** will be slightly negative, because the keypoint is slightly further forward than the suprasternal notch.
- **z** will be positive, because the keypoint is above the suprasternal notch.

#### Moving the head

- **x**: *Float.* Lower values of x move the head to its right; higher values of x move the head to its left.
- **y**: *Float.* Lower values of y move the head forwards; higher values of y move the head backwards. If y is lower than -1.6, the head has been moved so far forward that it is behind the default position of the camera (see the camera section below), and it will not be visible in the rendered images unless the camera itself is moved or rotated.
- **z**: *Float.* Lower values of z move the head down; higher values of z move the head up.
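The direction rules above can be sketched as a small Python helper. This is only an illustration: `describe_head_position` and `DEFAULT_CAMERA_Y` are hypothetical names, and the check assumes the camera is still at its default position and orientation.

```python
# Default camera y position in meters (hypothetical constant name).
DEFAULT_CAMERA_Y = -1.6


def describe_head_position(x, y, z):
    """Describe where the head sits relative to its default position at the origin.

    Directions are given from the head's own point of view.
    """
    return {
        "left_right": "left" if x > 0 else "right" if x < 0 else "centered",
        "forward_back": "backward" if y > 0 else "forward" if y < 0 else "centered",
        "up_down": "up" if z > 0 else "down" if z < 0 else "centered",
        # Below -1.6 the head is behind the camera's default position.
        "behind_default_camera": y < DEFAULT_CAMERA_Y,
    }


summary = describe_head_position(x=0.2, y=-1.8, z=0.0)
```

Here `summary` reports a head that has moved to its left and forward, far enough to end up behind the default camera.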

#### Rotating the head

The orientation of the head defines how the head is rotated about the neck while the neck remains fixed in place. The platform limits the rotation of the head based on the realistic limits of human physiology.

**Note:** You always stretch your neck in some way when you rotate your head. As a result, the rotation controls may pull the suprasternal notch slightly off the coordinates that you defined for it. This does not, however, change the origin point of the head, which will always be exactly where you defined it.

The head is rotated first by **pitch**, then by **yaw**, then by **roll**, using the values you entered into the pitch, yaw, and roll controls in the platform.

- **yaw**: *Float.* Lower values of yaw turn the face to its right. Higher values of yaw turn the face to its left.
- **pitch**: *Float.* Lower values of pitch turn the face upward. Higher values of pitch turn the face downward.
- **roll**: *Float.* Lower values of roll turn the face counterclockwise. Higher values of roll turn the face clockwise.
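Because rotations do not commute, the order pitch, then yaw, then roll matters. The Python sketch below illustrates this with plain rotation matrices. It rests on several assumptions that are not stated on this page and should be checked against rendered output: angles are in degrees, pitch rotates about the x axis, yaw about the z axis, and roll about the y axis, all about fixed global axes with right-handed signs. All function names are illustrative.

```python
import math


def rot_x(deg):
    """Right-handed rotation about the x axis (assumed pitch axis)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(deg):
    """Right-handed rotation about the y axis (assumed roll axis)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(deg):
    """Right-handed rotation about the z axis (assumed yaw axis)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def head_rotation(pitch, yaw, roll):
    """Pitch first, then yaw, then roll; the first rotation is the rightmost factor."""
    return matmul(rot_y(roll), matmul(rot_z(yaw), rot_x(pitch)))


# At default settings the face looks toward the camera, along negative y.
FORWARD = (0.0, -1.0, 0.0)

pitch_then_yaw = apply(head_rotation(pitch=90, yaw=90, roll=0), FORWARD)
# Swapping the first two steps sends the same vector somewhere else.
yaw_then_pitch = apply(matmul(rot_x(90), rot_z(90)), FORWARD)
```

Under these assumptions, `pitch_then_yaw` ends up pointing straight down the z axis, while `yaw_then_pitch` points along positive x, which is why the documented order has to be respected when reproducing a head pose.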

### Camera location and rotation in the 3D space

#### Default position

By default, our platform places the camera in approximately the same horizontal plane as the subject’s head, but rotated 180 degrees around the z axis and 1.6 meters away in the negative y direction.

This places the camera directly opposite the subject’s face, such that the two are looking directly at each other.

Our datapoints use an idealized point-sized camera. Therefore, when you input the camera’s position in the platform you are defining that position exactly.

#### Moving the camera

- **x**: *Float.* Lower values of the x component move the camera to its left; higher values move the camera to its right. The camera’s default x component is 0, which centers the camera on the left-right axis and points it directly at the default position of the subject.
- **y**: *Float.* Lower values of the y component move the camera backwards; higher values move the camera forwards. The camera’s default y component is -1.6, which places it far enough away from the default position of the subject’s head to see it in its entirety. If the y component is greater than 0, then the camera has moved past the subject, and you will need to rotate the camera or move the subject if you want the subject to appear in the images.
- **z**: *Float.* Lower values of the z component move the camera down; higher values move the camera up. The camera’s default z component is 0.12, which places it slightly higher than the default position of the subject. Since the subject’s coordinates actually define the location of the suprasternal notch, raising the camera slightly in this way puts it at approximately the same height as the subject’s face.
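The camera defaults and the "moved past the subject" rule can be captured in a few lines of Python. This is a minimal sketch: the `CameraPosition` class and its method names are hypothetical, and the check assumes the subject is still at its default position at the origin.

```python
import math
from dataclasses import dataclass


@dataclass
class CameraPosition:
    """Camera location in the global coordinate system, in meters."""
    x: float = 0.0    # centered on the left-right axis
    y: float = -1.6   # 1.6 m in front of the default subject
    z: float = 0.12   # roughly at face height

    def distance_to(self, point=(0.0, 0.0, 0.0)):
        """Straight-line distance in meters from the camera to a 3D point."""
        return math.dist((self.x, self.y, self.z), point)

    def is_past_default_subject(self):
        """True if y > 0, meaning the camera has moved past the default subject."""
        return self.y > 0


default_camera = CameraPosition()
default_distance = default_camera.distance_to()  # distance to the origin
moved_camera = CameraPosition(y=0.5)
```

With the defaults, the camera sits just over 1.6 meters from the suprasternal notch; `moved_camera.is_past_default_subject()` returns `True`, signaling that the camera or the subject needs to be adjusted for the subject to appear in the images.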

#### Rotating the camera

The camera is rotated by applying first **yaw**, then **pitch**, then **roll**, according to the values you entered in the yaw, pitch, and roll controls on the platform.

- **yaw**: *Float.* Lower values of yaw turn the camera to its left. Higher values of yaw turn the camera to its right.
- **pitch**: *Float.* Lower values of pitch turn the camera downward. Higher values of pitch turn the camera upward.
- **roll**: *Float.* Lower values of roll turn the camera clockwise. Higher values of roll turn the camera counterclockwise.

## 2D Coordinates

2D coordinates describe the location of a landmark in the photographs taken by a specific camera with specific settings. They are therefore highly camera-dependent. The landmark will have the same coordinates in every single photograph taken by the same camera, no matter how much the lighting conditions or backgrounds change. But in the set of photographs taken by a different camera, in a different place in the scene, with different settings, the landmark will likely have completely different coordinates.

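2D coordinates describe the distance, in pixels, between the landmark and the upper left corner of the image. In this coordinate system, x is the distance from the top edge of the image (the vertical position), and y is the distance from the left side of the image (the horizontal position). Note that this is the reverse of the common image convention in which x is horizontal.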

Every time a set of 2D coordinates appears in Datagen’s metadata, it uses the same format: an object called “pixel_2d” that contains two *Ints* describing the x and y coordinates of a specific pixel in a 2D image:

```
"pixel_2d": {
    "x": 3724,
    "y": -143
}
```

Note that there is no requirement that the landmark be visible in the image. In the example above, the landmark is 143 pixels to the *left* of the left side of the image, which means it is not visible at all; furthermore, the value of x is so high that the landmark is probably below the bottom of the image.
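A visibility check along these lines can be sketched in Python. The function name `is_visible` and the 1024 x 1024 resolution are assumptions made for illustration; substitute the actual resolution of your rendered images.

```python
def is_visible(pixel_2d, image_width, image_height):
    """Return True if a pixel_2d landmark falls inside the image bounds.

    Per the convention on this page, x is the distance from the top edge
    (the row) and y is the distance from the left side (the column).
    """
    row = pixel_2d["x"]
    col = pixel_2d["y"]
    return 0 <= row < image_height and 0 <= col < image_width


# The example from above: left of the image and, at this resolution, below it.
sample = {"x": 3724, "y": -143}
sample_visible = is_visible(sample, image_width=1024, image_height=1024)

# A landmark near the upper left corner of the same image.
inside_visible = is_visible({"x": 10, "y": 20}, image_width=1024, image_height=1024)
```

Filtering on a check like this before drawing or cropping avoids indexing outside the image when a landmark lies off-frame.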