Site id in test set but not in train set

jash.shah · May 5, 2017, 5:21am

There’s a site id in the test set (STOK) that is not in the train set. If this is deliberate, where can I get the Lat/Long for this site id?

charles.hornbaker · May 5, 2017, 5:08pm

Hi @jash.shah, thanks for bringing this to our attention! It is indeed one of the cases where the first observations occur in the test set. Since we’re not asking you to predict the location of the sites, here’s the site information:

site_id: STOK
camlr_region: 48.1
longitude_epsg_4326: -59.85
latitude_epsg_4326: -62.4

jash.shah · May 5, 2017, 7:58pm

Thank you @charles.hornbaker!

jgaines · May 11, 2017, 7:05am

Hi @jash.shah and @charles.hornbaker,

Could I ask two questions about this, please?

First, what do we mean by “test set” here? Is it the nest_counts.csv file? If so, that’s not really labelled test data in the standard sense, is it? Isn’t it just a consolidated time series view of the training_set_observations.csv data?

Second, I can see STOK in nest_counts.csv but not in training_set_observations.csv. In nest_counts.csv, as far as I can tell, it has no observations whatsoever against it. This is also true in the error file (training_set_e_n.csv). So what does Charles mean by saying STOK is “one of the cases where the first observations occur in the test set”?

Thanks!

charles.hornbaker · May 11, 2017, 7:49pm

Hi @jgaines,

By “test set”, I’m referring to the set of nest counts that you need to predict (Nest count data for 2014-2017). If you look in the submission_format.csv file, the first two columns contain the site and species pairs that you will provide predictions for. For a few of these, such as STOK, the first nest count observation occurs in 2014, so it does not appear in the observations you received in the training set. It’s up to you how to make predictions for these sites using the available information.
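
For anyone who wants to list these cases themselves, here is a minimal sketch of one way to find site and species pairs that appear in the submission format but have no rows in the training observations. The column names `site_id` and `common_name`, the helper `unseen_pairs`, and the toy rows are assumptions for illustration only; check them against the headers in your own copies of submission_format.csv and training_set_observations.csv.

```python
import csv
import io


def unseen_pairs(submission_rows, training_rows,
                 key_columns=("site_id", "common_name")):
    """Return pairs present in the submission format but absent from training.

    key_columns are assumed names for the site and species columns.
    """
    seen = {tuple(row[c] for c in key_columns) for row in training_rows}
    missing = []
    for row in submission_rows:
        pair = tuple(row[c] for c in key_columns)
        if pair not in seen and pair not in missing:
            missing.append(pair)
    return missing


# Toy stand-ins for the real files; every value below is made up
# except the site id STOK, which is the one discussed in this thread.
submission_csv = io.StringIO(
    "site_id,common_name,2014\n"
    "SITE_A,species_x,0\n"
    "STOK,species_y,0\n"
)
training_csv = io.StringIO(
    "site_id,common_name,season_starting,penguin_count\n"
    "SITE_A,species_x,1983,2008\n"
)

submission_rows = list(csv.DictReader(submission_csv))
training_rows = list(csv.DictReader(training_csv))

missing = unseen_pairs(submission_rows, training_rows)
print(missing)  # [('STOK', 'species_y')]
```

To run this on the real data, replace the two `io.StringIO` objects with open file handles for the actual CSV files. How to predict for the pairs it returns is still up to you, for example by borrowing information from nearby sites.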
