Scaling data with one or multiple scalers?

The LSTM benchmark model fits a new scaler for every series it trains on. Can anyone explain the advantages and disadvantages of using a new scaler per series, as opposed to scaling the entire dataset with a single scaler?

My gut instinct is that a new scaler for each series will not represent the differences in magnitude between the series accurately. For example, the peak of a series whose highest consumption is 4 MWh would be scaled to the same value as the peak of a series whose highest consumption is 20 kWh.
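To make the concern concrete, here is a minimal sketch comparing the two approaches. It assumes min-max scaling to [0, 1] (the benchmark's actual scaler may differ), and the series values and the helper `min_max_scale` are made up for illustration.

```python
import numpy as np

def min_max_scale(values, lo, hi):
    """Scale values to [0, 1] given a minimum and a maximum (hypothetical helper)."""
    return (values - lo) / (hi - lo)

# Hypothetical consumption series in kWh (illustrative numbers only).
series = {
    "large": np.array([1000.0, 2500.0, 4000.0]),  # peaks at 4 MWh
    "small": np.array([5.0, 12.0, 20.0]),         # peaks at 20 kWh
}

# Option A: one scaler per series, fitted on that series alone.
per_series = {
    name: min_max_scale(v, v.min(), v.max()) for name, v in series.items()
}

# Option B: a single scaler fitted on all series together.
all_values = np.concatenate(list(series.values()))
g_lo, g_hi = all_values.min(), all_values.max()
global_scaled = {
    name: min_max_scale(v, g_lo, g_hi) for name, v in series.items()
}

per_series_peaks = {name: float(v.max()) for name, v in per_series.items()}
global_peaks = {name: float(v.max()) for name, v in global_scaled.items()}

print("per-series peaks:", per_series_peaks)
print("global peaks:    ", global_peaks)
```

With per-series scaling both peaks land on the same value, so the difference in magnitude is lost. With a single scaler the magnitude is preserved, but the small series is squeezed into a tiny slice of the [0, 1] range.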

The reason for using multiple scalers is to keep high-consumption series IDs from dominating, since NMAE is used as the performance metric.
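The dominance effect can be sketched numerically. This example assumes NMAE means the mean absolute error of a series divided by that series' mean actual value (the benchmark's exact definition may differ), and again assumes min-max scaling. The data and the 10% forecast error are invented for illustration.

```python
import numpy as np

# Hypothetical actual consumption in kWh (illustrative numbers only).
actual = {
    "large": np.array([1000.0, 2500.0, 4000.0]),
    "small": np.array([5.0, 12.0, 20.0]),
}
# Hypothetical forecasts that are 10% too high at every point.
forecast = {name: 1.1 * v for name, v in actual.items()}

def nmae(y, y_hat):
    """MAE normalized by the mean actual value (assumed definition)."""
    return np.mean(np.abs(y - y_hat)) / np.mean(np.abs(y))

# In original units, NMAE treats both series as equally wrong.
nmae_per_series = {name: nmae(actual[name], forecast[name]) for name in actual}

# Min-max scaling is affine, so an absolute error in scaled space equals
# the original absolute error divided by the range the scaler was fitted on.
all_values = np.concatenate(list(actual.values()))
global_range = all_values.max() - all_values.min()

mae_global_scaler = {
    name: np.mean(np.abs(actual[name] - forecast[name])) / global_range
    for name in actual
}
mae_per_series_scaler = {
    name: np.mean(np.abs(actual[name] - forecast[name]))
    / (actual[name].max() - actual[name].min())
    for name in actual
}

print("NMAE:                        ", nmae_per_series)
print("scaled MAE, single scaler:   ", mae_global_scaler)
print("scaled MAE, per-series scaler:", mae_per_series_scaler)
```

Under these assumptions, a single scaler makes the scaled-space error of the large series a couple of hundred times bigger than that of the small series, so a loss computed on scaled values would be driven almost entirely by the large series. Per-series scaling puts both errors on a comparable footing, which is closer to how NMAE judges them.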