Thanks for pointing this out. It looks like a small bug in our obfuscation process led to six hash collisions in the household training data.
All occur between countries A and C:
SlDKnCuu enTUTSQi znHDEHZP CtFxPQPT CNkSTLvx hJrMTBVd
We have confirmed that none of these correspond to the same question. For example, question SlDKnCuu asks something different for country A than it does for country C.
As for your other question:
A small number of questions do overlap across countries, but not many. Those questions were hashed differently because the original surveys coded them differently, so a shared question will not carry the same name in two countries. It's best to assume no overlap.
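
If you want to guard against the colliding names when combining countries, here is a minimal sketch of one way to do it. It assumes you have each country's column names as a plain list. The helper names (`shared_columns`, `namespace_columns`) and the `made_up_*` columns are hypothetical and only there for illustration; the two hashed names are taken from the list above.

```python
def shared_columns(columns_by_country):
    """Return column names that appear in more than one country's list."""
    seen = {}
    for country, columns in columns_by_country.items():
        for name in columns:
            seen.setdefault(name, set()).add(country)
    return {
        name: sorted(countries)
        for name, countries in seen.items()
        if len(countries) > 1
    }


def namespace_columns(columns, country):
    """Prefix every column with its country code so no name is shared."""
    return [f"{country}_{name}" for name in columns]


# Illustrative input: the "made_up_*" names are placeholders, not real columns.
columns_by_country = {
    "A": ["SlDKnCuu", "enTUTSQi", "made_up_a1"],
    "C": ["SlDKnCuu", "enTUTSQi", "made_up_c1"],
}

collisions = shared_columns(columns_by_country)
namespaced = {
    country: namespace_columns(columns, country)
    for country, columns in columns_by_country.items()
}
```

Prefixing every column, not just the six colliding ones, matches the advice to treat each country's questions as separate.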