Uniqueness of MessageId/UETR

yizhewan · January 19, 2023, 11:44pm

Hi,

May I know if the MessageId is a unique identifier of the transactions in the final evaluation sets? Is UETR unique too? Additionally, will the train and test transactions have overlapping UETR and/or MessageId? Thank you!

jayqi · January 19, 2023, 11:58pm

Hi @yizhewan,

Yes, both MessageId and UETR are unique identifiers, as described here.

There should not be overlap in values between the train and test splits.

yizhewan · January 20, 2023, 1:27am

Thank you. I’m asking because my code concatenates the train and test datasets at test time, using UETR as the index column. It passes the smoke test as well as my own tests on the published train/test sets. However, it fails with
ValueError: cannot reindex on an axis with duplicate labels
during a groupby operation.

Since I can’t reproduce the error on the smoke test or my local test data, I suspect there may be duplicated UETR or MessageId values in the final train/test datasets. Thank you for your response.

jayqi · January 20, 2023, 9:42am

Hi @yizhewan,

Thank you for bringing this to our attention. There was indeed an error in the evaluation data: some UETR values were present in both the train and test splits. This has been fixed, so your submissions going forward should not have this issue.

Teams that have made successful submissions for Track A before this fix have had their submissions cleared and are being asked to resubmit. The previously erroneous data should have no impact on final results.

Apologies for the mistake, and thanks again for reporting this.
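
For anyone hitting a similar error, here is a minimal sketch of how one might check for identifier overlap before concatenating the train and test splits. It assumes pandas DataFrames with a UETR column. The helper name find_duplicate_ids and the toy data are hypothetical and made up for illustration; they are not taken from the challenge data or code.

```python
import pandas as pd


def find_duplicate_ids(train: pd.DataFrame, test: pd.DataFrame, key: str = "UETR") -> pd.Index:
    """Return the key values that occur more than once across train and test combined.

    Hypothetical helper for illustration; not part of the challenge code.
    """
    combined = pd.concat([train[key], test[key]], ignore_index=True)
    # keep=False marks every occurrence of a repeated value, not just the later ones
    return pd.Index(combined[combined.duplicated(keep=False)].unique())


# Toy data (invented): the value "c" appears in both splits
train = pd.DataFrame({"UETR": ["a", "b", "c"], "amount": [1.0, 2.0, 3.0]})
test = pd.DataFrame({"UETR": ["c", "d"], "amount": [4.0, 5.0]})

dupes = find_duplicate_ids(train, test)
print(list(dupes))

# Alternative: fail fast at concatenation time.
# verify_integrity=True makes pd.concat raise ValueError if the resulting index has duplicates.
try:
    pd.concat([train.set_index("UETR"), test.set_index("UETR")], verify_integrity=True)
    overlap_detected = False
except ValueError:
    overlap_detected = True
```

Running a check like this right after loading the data surfaces the overlap at the point where it is introduced, rather than later in a groupby or reindex step.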
