How should I interpret NA values in food consumption variables?

seanprb · December 20, 2025, 7:03pm

There are some missing values in the consumption variables, e.g., 5 NAs in “consumed1300”. Are these supposed to be imputed or removed? How should I interpret such missing values, given that they seem to exist only in the consumption variables?

Also, is the training data already weighted?

chrisk-dd · January 7, 2026, 4:12pm

Hi @seanprb,

NA values are “not applicable,” effectively undefined or null. How you treat these values is up to you!
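
For readers weighing the options, here is a minimal sketch of two common treatments: dropping NA entries, or imputing them while keeping a missingness indicator. The values below are invented stand-ins for a single consumption column such as “consumed1300”, and the variable names are hypothetical. Neither option is prescribed by the competition.

```python
from statistics import median

# Invented values standing in for one consumption column (e.g. "consumed1300").
# None marks an NA entry; the real column's values and coding may differ.
consumed = [1.0, 0.0, None, 1.0, None, 0.0, 1.0]

# Option 1: drop the NA entries.
dropped = [v for v in consumed if v is not None]

# Option 2: impute NA with the median of the observed values, and keep a
# 0/1 indicator so a model can still see which entries were originally NA.
observed_median = median(dropped)
imputed = [observed_median if v is None else v for v in consumed]
was_missing = [int(v is None) for v in consumed]
```

Since NA here means "not applicable" rather than "unknown", keeping the indicator may preserve information that plain imputation would hide.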

The training data (train_hh_features) contains weights in the “weight” column.
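
As a rough illustration of what using the “weight” column could look like, the sketch below computes a weighted poverty rate at a single threshold. It assumes each weight is the share of the population a household represents, which this thread does not confirm; the consumption values, weights, and function name are made up.

```python
def weighted_poverty_rate(consumption, weights, threshold):
    """Weighted share of households with consumption below the threshold."""
    total = sum(weights)
    below = sum(w for c, w in zip(consumption, weights) if c < threshold)
    return below / total

# Invented per-household daily consumption and survey weights.
consumption = [2.5, 5.0, 8.0, 12.0]
weights = [100.0, 300.0, 400.0, 200.0]

# Weighted rate at the $7.70 threshold: (100 + 300) / 1000.
rate = weighted_poverty_rate(consumption, weights, 7.70)

# For comparison, the same rate with every household weighted equally.
unweighted = weighted_poverty_rate(consumption, [1.0] * len(consumption), 7.70)
```

In this toy data the weighted rate comes out lower than the unweighted one, because the households below the threshold carry less total weight.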

Best,
Chris

oknaitik · January 9, 2026, 1:44pm

Can you confirm whether the percentile rank (p_t) corresponding to threshold t in the poverty rate MAPE computation is taken from survey 300000, even for leaderboard evaluation?

chrisk-dd · January 9, 2026, 2:10pm

Hi @oknaitik-

The poverty rate distribution MAPE shown on the leaderboard corresponds to values in the test dataset, not in the training dataset.

Best,
Chris

oknaitik · January 9, 2026, 2:24pm

But what about the weights w_t (see the attached screenshot) used in the leaderboard computation?

chrisk-dd · January 9, 2026, 2:38pm

The weights w_t are the same at each threshold for each survey (i.e., the weight is 1 for the 40th percentile threshold for each survey).

oknaitik · January 9, 2026, 2:55pm

Sure, they are the same irrespective of the survey. But are those weights based on survey 300000 data, i.e., the 3rd row in train_rates_gt.csv? Please confirm!

chrisk-dd · January 12, 2026, 5:44pm

I’m a little unclear on the question, so let me try to answer with an example:

The dollar values of the various poverty thresholds are fixed, so for every survey you are predicting the percentage of the population with consumption below $3.17, $3.94, $4.60, and so on. Your prediction for the poverty rate at $7.70 is weighted with a weight of 1. Your predictions for the poverty rates at $7.06 and $8.40 are weighted with a weight of 0.95. And so on, as specified in the metric section.

So, the thresholds are set by survey 300000. The weights at each threshold are the same for all submitted predictions.
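
To make the fixed weights concrete, here is a small sketch of a weighted MAPE over three of the thresholds mentioned above, using the weights quoted in this thread (1 at $7.70, 0.95 at $7.06 and $8.40). Normalizing by the sum of the weights is an assumption, the true and predicted rates are invented, and the function name is hypothetical; the metric section remains the authoritative definition.

```python
def weighted_mape(true_rates, pred_rates, weights):
    """Weighted mean absolute percentage error across thresholds.

    Assumes normalization by the sum of the weights; check the metric
    section for the official formula.
    """
    total = sum(
        w * abs(p - t) / t
        for t, p, w in zip(true_rates, pred_rates, weights)
    )
    return total / sum(weights)

# Thresholds and weights as quoted in this thread.
thresholds = [7.06, 7.70, 8.40]
weights = [0.95, 1.0, 0.95]

# Invented poverty rates for one survey (fraction of population below
# each threshold). The weights do not change with these values.
true_rates = [0.35, 0.40, 0.45]
pred_rates = [0.33, 0.42, 0.45]

score = weighted_mape(true_rates, pred_rates, weights)
```

The point of the sketch is that `weights` is tied to the dollar thresholds, not to any one survey's rates, so the same list is applied to every survey's predictions.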

Does this answer your question?

oknaitik · January 12, 2026, 6:40pm

Oh man! I’m not sure how costly this might be for me. I’ve been using the exact poverty rates at various thresholds from train_rates_gt.csv (survey `300000`) for the wMAPE calculation the entire time.

However, the poverty rates even in that file are very close to the aforesaid values. Thanks for the clarification!
