Clarifications Needed


Regarding the missing Consumption Values, what is the preferred/expected strategy, please?

  1. To interpolate the missing values or
  2. To consider those entries as Anomalies?
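
To make the two options concrete, here is a minimal sketch of what I mean. It assumes the readings are an ordered list with `None` for each missing Consumption Value; the function names `interpolate_missing` and `flag_missing` and the sample numbers are made up for illustration and are not from the provided data.

```python
from typing import List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[Optional[float]]:
    """Option 1: fill interior gaps by linear interpolation.

    Gaps at the very start or end are left as None, because they
    do not have a known reading on both sides.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over consecutive pairs of known readings and fill what lies between.
    for left, right in zip(known, known[1:]):
        span = right - left
        delta = values[right] - values[left]
        for i in range(left + 1, right):
            filled[i] = values[left] + delta * (i - left) / span
    return filled


def flag_missing(values: List[Optional[float]]) -> List[int]:
    """Option 2: treat every missing entry as an anomaly and return its index."""
    return [i for i, v in enumerate(values) if v is None]


# Hypothetical readings, for illustration only.
readings = [10.0, None, None, 16.0, 15.0, None]
print(interpolate_missing(readings))
print(flag_missing(readings))
```

With option 1 the trailing gap stays unfilled, so it would still need a decision of its own; with option 2 every gap is reported regardless of its position.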

Also, in my limited experience with Anomaly Detection, the training data usually consists of a large set of normal samples plus a small number of labeled anomalies, which are used for cross-validation, testing, and parameter tuning. In this case, ALL of the training data provided seems to be unlabeled (unless I am missing something fundamental).

Is this meant to be a completely unsupervised learning problem please?

Thank you!
