Hi! The hour in the train_labels datetime field varies across grids, but the competition page states: "datetime (string): The UTC datetime of the measurement in the format YYYY-MM-DDTHH:mm:ssZ. A value represents the average between 12:00am to 11:59pm local time. The datetime provided represents the start of that 24 hour period in UTC time."
Can someone help me reconcile this? Based on the above, I expected every datetime to be stamped 00:00:00, but they are not. Take the first row, for example:
2018-02-01T08:00:00Z,3S31A,11.4
Can someone help me understand:
if 11.4 is an average
if so, what are the start date and end date of this average
For the example line you give, the location grid ID is in Los Angeles; there, 08:00:00Z is midnight local time. I think that 11.4 represents the average for Feb. 1, 2018 (12:00AM to 11:59PM local time), but it is timestamped at what 12:00AM local time would be in UTC time, which is 8AM. The number would thus be the average from 2018-02-01T08:00:00Z to 2018-02-02T07:59:59Z, in the “datetime” format they are using.
The Labels (outputs) section says: “The datetime provided represents the start of that 24 hour period in UTC time.” So, in this example, the average should be from 2018-02-02T08:00:00Z to 2018-02-03T07:59:00Z. Could you please check?
I think that is almost correct. If the timestamp given is 2018-02-01T08:00:00Z, then the interval of the average is from 2018-02-01T08:00:00Z to 2018-02-02T07:59:00Z. The “start” refers to the start of the 24 hour period, lasting from midnight to 11:59PM in local time, but (in this time zone) from 8AM to 7:59AM the next day in UTC time.
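To make the arithmetic above concrete, here is a minimal Python sketch. The function name `averaging_window` and the fixed UTC-8 offset are my own assumptions for illustration (UTC-8 is Pacific Standard Time, which applies to Los Angeles in February); other grids, or dates under daylight saving time, would need their own offset.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"  # the label format: YYYY-MM-DDTHH:mm:ssZ


def averaging_window(label_datetime, utc_offset_hours):
    """Return (start_utc, end_utc, start_local, end_local) for one label.

    utc_offset_hours is an assumed fixed offset for the grid's time zone,
    e.g. -8 for Los Angeles in winter. It is not provided by the dataset.
    """
    start_utc = datetime.strptime(label_datetime, FMT).replace(tzinfo=timezone.utc)
    # The window covers 24 hours and ends at the last minute (11:59PM local).
    end_utc = start_utc + timedelta(hours=24) - timedelta(minutes=1)
    local = timezone(timedelta(hours=utc_offset_hours))
    return start_utc, end_utc, start_utc.astimezone(local), end_utc.astimezone(local)


start_utc, end_utc, start_local, end_local = averaging_window("2018-02-01T08:00:00Z", -8)
print(start_utc.strftime(FMT), "to", end_utc.strftime(FMT))
print(start_local.strftime("%Y-%m-%d %H:%M"), "to", end_local.strftime("%Y-%m-%d %H:%M"), "local")
```

For the first row, this should give a UTC window starting at 08:00 on Feb. 1 and a local window that runs from midnight to 11:59PM on Feb. 1.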
I am having doubts because of another post by cszc in a different thread, “clarification on features’ dates”. I have copied the text here:
The label’s datetime represents the start of a 24 hour period over which the air quality is averaged. For example, a label with datetime 2019-01-31T08:00:00Z represents an average taken from 2019-01-31T08:00:00Z to 2019-02-01T07:59:00Z (inclusive).
Therefore, you can use satellite data with an endtime on or before 2019-02-01T07:59:00Z for a label with datetime 2019-01-31T08:00:00Z. Note that this is 11:59pm local time (Pacific time).
This would be the correct answer, then. I think that is consistent with the explanation I gave, but maybe I am misunderstanding something or explaining poorly. Regardless, the explanation of cszc is the one to go with.
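For anyone who wants to apply the rule quoted from cszc when pairing satellite data with labels, a rough sketch follows. The function name `granule_is_usable` and the assumption that the satellite endtime is a string in the same format as the label datetime are mine, not part of the competition data description.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(stamp):
    """Parse a YYYY-MM-DDTHH:mm:ssZ string into a timezone-aware UTC datetime."""
    return datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)


def granule_is_usable(label_datetime, granule_endtime):
    """True if the satellite endtime falls on or before the end of the label's window.

    The window end is taken as label start + 24 hours - 1 minute, following
    the inclusive 07:59:00Z end in the quoted example.
    """
    window_end = parse_utc(label_datetime) + timedelta(hours=24) - timedelta(minutes=1)
    return parse_utc(granule_endtime) <= window_end


print(granule_is_usable("2019-01-31T08:00:00Z", "2019-02-01T07:59:00Z"))  # endtime at the window end
print(granule_is_usable("2019-01-31T08:00:00Z", "2019-02-01T08:00:00Z"))  # endtime one minute later
```

The check works purely in UTC, so no local time zone lookup is needed for this step.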