Null values correlation

ybbat · December 24, 2025, 6:25pm

I plotted, for each of surveys 1-6, the correlation between missing values in each pair of consumedxxxx features.

There are some clear patterns within and between the surveys, though I'm not sure what insights, if any, can be gleaned from this. Thought I’d share in case anyone finds it interesting or wants to look into it further.

Excuse the lack of axis labels; I couldn’t figure out an easy way to put readable ones in. Each row/column is just a consumedxxxx feature, starting from consumed100 and ending at consumed5000.

Each cell represents the correlation between the nulls in the two features. For example, in survey 3 we can see that if consumed100 is null for a response, then all of consumed100-900 will be null (in this survey there is only 1 row where these features are null, so this isn’t all that interesting in and of itself). I don’t think this exercise is all that useful for this problem, though in theory it could let us infer some things about the structure of the questionnaire.
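For anyone who wants to reproduce something like this, here is a minimal sketch of one way to compute such a matrix with pandas. It is not necessarily how the plot above was made: the helper name `null_correlation`, the toy data, and the `survey_id` column are assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def null_correlation(df: pd.DataFrame, prefix: str = "consumed") -> pd.DataFrame:
    """Correlate the missing-value indicators of all columns starting with `prefix`."""
    cols = [c for c in df.columns if c.startswith(prefix)]
    # 1 where the value is missing, 0 otherwise
    is_null = df[cols].isna().astype(int)
    # Pearson correlation between the indicator columns. A column that is
    # never null (or always null) has zero variance, so its entries are NaN.
    return is_null.corr()


# Toy data standing in for one survey (values are made up)
toy = pd.DataFrame({
    "consumed100": [1.0, np.nan, 0.0, np.nan],
    "consumed200": [1.0, np.nan, 1.0, np.nan],
    "consumed300": [np.nan, 1.0, np.nan, 0.0],
    "survey_id": [3, 3, 3, 3],  # hypothetical column name
})

corr = null_correlation(toy)
print(corr)
```

In the toy data, consumed100 and consumed200 are always null together, so their cell is 1, while consumed300 is null exactly when the other two are not, so those cells are -1. To get one matrix per survey, the same function could be applied to each group of a `groupby` on whatever column identifies the survey.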

oknaitik · December 31, 2025, 6:54am

This is quite interesting! Thanks.

Could you help me understand what utl_exp_ppp17 is? Is it the household (not per-capita) expenditure, or the household expenditure over the last 7 days, given that 95% of the samples have it greater than cons_ppp17? How is utl_exp_ppp17 different from cons_ppp17?

ybbat · December 31, 2025, 5:20pm

My assumption would be that utl_exp_ppp17 is the amount the household spends on utilities (electricity, water, etc.). It doesn’t specify a timeframe, so it may be weekly or monthly. cons_ppp17, by contrast, is the total daily expenditure per person in the household. I may be wrong though.
