object_id values not in test data

Hello all!

A question about removing data from our model fit that won't be seen in the test data.

The logic in the benchmark analysis was to drop a pipeline (L12) that was measured in the training data but not found in the test data. We've noticed that for object_id there are 94 unique objects in the cleaned training data but only 88 listed in the test data. Does it make sense to drop the 6 "non-shared" objects from the training data before fitting the model?
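
In case it helps frame the question, here's a minimal sketch of the kind of filtering I mean, assuming pandas DataFrames named train and test that both carry an object_id column (the file names are placeholders, not the actual dataset paths):

```python
import pandas as pd

# Placeholder paths; substitute the real training/test files.
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# object_ids that actually appear in the test data
shared_ids = set(test["object_id"].unique())

# Keep only training rows whose object_id is also in the test data
train_shared = train[train["object_id"].isin(shared_ids)]

n_dropped = train["object_id"].nunique() - train_shared["object_id"].nunique()
print(f"Dropped {n_dropped} non-shared object_ids from training")  # expecting 6 here
```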