All of the labels seem to be less than 300. Is this an intentional characteristic of the data?

This seems to contrast with your baseline algorithm, which makes predictions as high as 1400+ (of course, it’s called a baseline algorithm for a reason, but still).
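
For what it's worth, here is a rough sketch of how the mismatch could be quantified. The names `labels` and `baseline_preds`, the two helper functions, and the numbers in the lists are all hypothetical placeholders, not taken from your data; swap in the real arrays.

```python
def range_summary(values):
    """Return (min, max) of a non-empty sequence of numbers."""
    return min(values), max(values)


def fraction_above(values, threshold):
    """Fraction of entries strictly greater than threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


# Placeholder values for illustration only (assumption, not real data).
labels = [12, 85, 140, 299, 37]
baseline_preds = [20, 90, 1400, 310, 45]

label_min, label_max = range_summary(labels)
pred_min, pred_max = range_summary(baseline_preds)

# Share of baseline predictions that exceed the largest observed label.
out_of_range = fraction_above(baseline_preds, label_max)

print(f"labels:      min={label_min}, max={label_max}")
print(f"predictions: min={pred_min}, max={pred_max}")
print(f"predictions above max label: {out_of_range:.0%}")
```

If a large share of predictions falls above the maximum label, clipping the baseline output to the label range might be one simple fix, assuming the cap on the labels is intentional.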