Inconsistent translation labels for pose estimation track

In the problem description you write:
"The ground truth data (transformation vectors) for the public set is available in the output.csv on the Data Download page."
However, I cannot find any “output.csv” on the Data Download page, only “train_labels.csv”.
If I then take a closer look at these train labels, I find the following:
In the problem description you write that the translational components are like a random walk, and I can confirm this in the images.

Question 1: Are these random changes in satellite position in the images just there for confusion/robustness, and not represented in any of the labels? I cannot find anything in the train_labels.csv that would correlate with this random jitter in the images.

If I take a further look at the labels, especially the translational components, I would assume that the x, y, z values are the translational changes (in meters?) in 3D of the two spacecraft. You also offer a laser range finder dataset with the distance of the chaser spacecraft to the target in meters.
If I now plot these ranges from the laser range finder together with the positions in the train_labels.csv, I find no correlation.
From your description:
"x, y, z are the translational components that describe changes in the position of the chaser spacecraft in three dimensions (i.e., forward-backward, left-right, up-down)."
I understand therefore that the x-direction would be the forward-backward direction. This would mean the x component should be strongly correlated with the distance from the laser range finder.
From the train_labels.csv I would expect the distance between the two spacecraft to change considerably within a sequence; however, the laser range finder shows very little change, and I cannot see the periodicity of the x, y, z components there. I also have trouble seeing a considerable change in the relative distance of the two spacecraft in the images of a sequence, as the train_labels suggest.
Am I missing something here?


Hi @Shadow43,

The ground truth data is indeed the file named train_labels.csv. There is no output.csv file—that was an old name that we missed updating. Apologies for the confusion. We’ve updated the Problem Description page to fix this.

I will address your other questions in the following order:

  • First, how to understand the translation values
  • Then, what the “random walk” refers to

Translation values

Regarding how to understand the translation values in the ground truth, it’s important to understand this idea:

For each subsequent image, you are to calculate the transformation required to get back to that reference image. That is, for each image, supply the transformation (the rotation first, then the translation) that we would need to apply to the chaser to make the target spacecraft (treated as stationary) appear exactly as it does in the reference image.

Because of this formulation, the translation values represent position changes defined in the reference frame of the spacecraft’s initial position and orientation. So, the x value does indeed represent a translation in the forward-backward direction, but along the forward-backward axis of the chaser’s initial pose in the reference image. It is not a translation along the forward-backward axis of the orientation shown by the i-th image of the chain.

To reiterate more specifically: we define the location of the chaser spacecraft in the initial image at the origin of our reference frame, with the x-axis pointing towards the target spacecraft and the y-axis pointing to the left. Then, continuing to use this reference frame, the (x, y, z) values for the i-th labeled image correspond to the location of the chaser spacecraft in this reference frame.
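The frame definition above can be sketched in code. This is an illustrative example, not competition code: the function name and the idea of starting from hypothetical world-frame positions are assumptions for demonstration; the actual labels are already expressed in this initial frame.

```python
import numpy as np

def labels_from_world_positions(positions, R0):
    """Hypothetical sketch: express displacements from the initial position
    in the chaser's initial body-fixed frame.

    positions: (N, 3) world-frame positions of the chaser
    R0: (3, 3) rotation matrix whose columns are the initial body axes in
        world coordinates (x forward toward the target, y left)
    """
    displacements = positions - positions[0]  # world-frame displacement
    # Row-vector form of R0.T @ d: rotate each displacement into the
    # initial body-fixed frame.
    return displacements @ R0

# Example: the initial frame is rotated 90 degrees about z, so its forward
# (x) axis points along the world y-axis. Moving 5 m along world y should
# therefore yield a label of (5, 0, 0).
R0 = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
positions = np.array([[0.0, 0.0, 0.0],
                      [0.0, 5.0, 0.0]])
print(labels_from_world_positions(positions, R0)[1])  # → [5. 0. 0.]
```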

In the example you plotted, the fact that you see the laser rangefinder showing little change indicates that the chaser spacecraft is moving about the target spacecraft approximately on the surface of a sphere of relatively constant radius (~100 meters) centered on the target spacecraft. It means the chaser spacecraft is always ~100 meters from the point (~100, 0, 0). You can roughly verify this with a calculation like sqrt((x-100)**2 + y**2 + z**2). Note that this is a rough approximation for two additional reasons: (a) the laser rangefinder measurements have some noise added, and (b) the range is the distance to the surface of the target spacecraft and not the distance to its center.
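The rough check above can be written out as a small sketch. The ~100 m standoff distance is taken from this thread's example and is an assumption, not a universal constant for every sequence:

```python
import numpy as np

def approx_range(x, y, z, d=100.0):
    """Approximate chaser-to-target distance from the label coordinates,
    assuming the target sits at (d, 0, 0) in the initial chaser frame."""
    return np.sqrt((x - d) ** 2 + y ** 2 + z ** 2)

# At the start of the chain the chaser is at the origin, d meters away.
print(approx_range(np.array([0.0]), np.array([0.0]), np.array([0.0])))    # → [100.]
# On the far side of the circle, x ≈ 2d while y and z cross zero.
print(approx_range(np.array([200.0]), np.array([0.0]), np.array([0.0])))  # → [100.]
```

Plotting this quantity against the rangefinder measurements (remembering that the latter are noisy and measure to the surface) should show rough agreement.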

In fact, the pattern you see in the first half of the trajectory where the x,y,z appear nearly periodic can be understood as the chaser moving in an approximate circle around the target. The first peak in x, which is approximately double the range value, coincides with y and z approximately crossing zero, which can be understood as the chaser being on the opposite side of the circle from where it started.

Random walk

The “random walk” refers to the process that produced the trajectory of the chaser spacecraft. It does not refer to noise in the data. (The laser rangefinder measurements do contain some noise, but the ground truth labels do not.)

To understand this, it may be helpful to consider a counterfactual. Another possible way to produce a trajectory would be to simulate the motion of both spacecraft subject to forces under Newton’s Laws of Motion, for example if both spacecraft were orbiting a third body due to that body’s gravity. In such a situation, a physics-based model of Newtonian mechanics and gravity would have significant predictive value for the chaser’s motion by itself, without consideration of the images. Because the actual data generation process uses a biased random walk, such a physics-based dynamical model is not applicable.

There is a random walk associated with the distance to the target spacecraft (seen in the changes in the laser rangefinder measurement), as well as a biased random walk associated with how we move around the spacecraft. The distortions in the periodicity of the (x, y, z) components are evidence of this. The particular example you plotted is fairly mild, with high bias and relatively low randomness; other chains can have a much less obvious periodic component. The purpose of this is to discourage competitors from creating solutions that try to predict the dynamics or periodicity of the system, and to encourage solutions that have no reliance on dynamics.
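For intuition, a biased random walk in the range can be sketched as below. This is purely illustrative: the bias and noise parameters are invented for the example and have no relation to the organizers' actual generator.

```python
import numpy as np

# Illustrative biased random walk (made-up parameters): each step drifts
# by a small bias plus a random increment, unlike a deterministic
# physics-based trajectory.
rng = np.random.default_rng(0)
n_steps, bias, noise = 100, 0.05, 0.2

steps = bias + noise * rng.standard_normal(n_steps)  # biased increments
ranges = 100.0 + np.cumsum(steps)                    # simulated range over the chain

print(ranges.shape)  # (100,)
```

A model fit to the smooth, periodic-looking part of one chain would fail on chains where the random component dominates, which is the point of this design choice.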


@jayqi thank you for the detailed explanation, that helps a lot


Hi @jayqi !

I’m wondering a bit what this means precisely.
For example in 0c734abdda image 000.png looks like this:

Here the spacecraft does not look centered in the image.

I’m also wondering a bit how all this relates to the intrinsic matrix of the camera. The coordinate system used there looks quite similar to right-down-forward (e.g. opencv/colmap). Could you clarify how these systems relate?

Hi @parskatt,

For your first question: In general, it’s the case that the center pixel of the image is on the surface of the target spacecraft (representing the surface that the laser rangefinder is measuring), but the target spacecraft may not be centered. This means that target spacecraft with long booms, such as the one you’ve shown, can be very far off center. Additionally, some images may not have clear lighting that illuminates that part of the spacecraft, which can make it look further off center.

Regarding your second question: the intrinsic matrix is provided following the normal intrinsic parameters convention, with width along the horizontal axis and height along the vertical axis. The intrinsic matrix is not specifically calculated in relation to the coordinate system conventions of the pose annotations.


Thanks for the reply, good to know.

An additional question: the provided intrinsics do not seem to be correct for all scenes. The principal point seems to be correct, but the focal lengths do not quite seem right (off by about 15%). Is this intentional?

Hi @jayqi, I have a follow-up question on the maximum movement in and out constraint.

  1. Since we are rotating the body-fixed frames of the chaser so that they align with the base-orientation chaser body-fixed frame, and then adding the translation state, I assume that the translation vectors are in the base orientation frame, as mentioned above.
  2. I am assuming this constraint refers to the ‘x’ direction. However, if we look at sequence '0fd304ddda', the first translation vector in the base-orientation body-fixed frame is greater than 1, which confuses me. How should the 1 meter “maximum movement in or out” constraint be applied? I can shift the translation vector to describe a translation from the ith to the (i-1)th chaser orientation in the ith chaser orientation and then check the +/- 1 constraint, but initial tests for this sequence with the truth data do not conform to my understanding of the constraint.

Hi @kilonova,

  1. I assume that the translation vectors are in the base orientation frame

Yes, that is correct.

  2. the first translation vector in the base orientation body-fixed frame is greater than 1, which confuses me. How should the 1 meter “maximum movement in or out” constraint be applied?

The “maximum movement in or out” refers to the range of the spacecraft, i.e., the distance between them. Because the chaser spacecraft is moving around the target spacecraft, the range depends on all three position coordinates, not just x, which is just the forward translation from the initial position in the frame of the initial orientation.

Another way you could think about it is that you could transform from the Cartesian coordinates we’re using to spherical coordinates centered on the target spacecraft, and the range would be the radial position of the chaser spacecraft.
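The range-based check described above can be sketched as follows. The target's location at 100 m along the initial x-axis is an assumption taken from the earlier example in this thread; the actual standoff distance may differ per sequence:

```python
import numpy as np

def ranges_from_labels(xyz, d=100.0):
    """xyz: (N, 3) label coordinates in the initial chaser frame.
    Returns the per-image range to an assumed target at (d, 0, 0)."""
    center = np.array([d, 0.0, 0.0])
    return np.linalg.norm(xyz - center, axis=1)

def max_in_out_step(xyz, d=100.0):
    """Largest change in range between consecutive images, i.e. the
    quantity the 1 m 'maximum movement in or out' constraint bounds."""
    r = ranges_from_labels(xyz, d)
    return np.max(np.abs(np.diff(r)))

# A large sideways step (5 m in y) changes the *range* by far less than
# 1 m, even though the translation vector itself is much longer than 1.
xyz = np.array([[0.0, 0.0, 0.0],
                [0.5, 5.0, 0.0]])
print(max_in_out_step(xyz))
```

This illustrates why a translation vector with norm greater than 1 can still satisfy the constraint: the constraint bounds the radial (in/out) motion, not the total displacement.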


Hi @jayqi , since you’re here:

The provided intrinsics do not seem to be correct for all scenes.
The provided ones are

[[5.2125371e+03 0.0000000e+00 6.4000000e+02]
[0.0000000e+00 6.2550444e+03 5.1200000e+02]
[0.0000000e+00 0.0000000e+00 1.0000000e+00]]

However, we got better results on most scenes with

[[5.2125371e+03 0.0000000e+00 6.4000000e+02]
[0.0000000e+00 5.2125371e+03 5.1200000e+02]
[0.0000000e+00 0.0000000e+00 1.0000000e+00]]

However, even these do not always seem to be correct.
Are the intrinsics the same for all images, and what is the correct calibration?

Hi @parskatt,

Thanks for your patience. We’re checking on the camera parameters and don’t have an answer for you yet.

Is this piece of information mostly to help us constrain the noisy range measurement, which we can use to indirectly constrain the translation vector between images (in the base chaser frame)?

Is this piece of information mostly to help us constrain the noisy range measurement, which we can use to indirectly constrain the translation vector between images (in the base chaser frame)?

@kilonova There are several constraints on the range and movement of the spacecraft which are documented in this section. As documented, you are encouraged to use these constraints in your solution.

@parskatt We don’t have any more information to provide about the intrinsic parameters. These parameters were derived from the camera settings used in Blender when generating the images. We encourage you to handle the intrinsic parameters in your modeling in whatever way that works best for you.
