Deal with no dependent feature in test data. Interpret results with logloss

ggeo · June 21, 2017, 7:13am

Hello,

I have built some models using the logloss metric, and I see a result like this:

nIter  logLoss  
  11     0.5675282
  21     0.5544149
  31     0.5745408

So, OK, I am taking the smallest value.

Then I call:
predict(mymodel, newdata=test_data)

and I am receiving something like:

no no no no no yes no no no no no no no no no no no no no no no no no no no no no no no no no no no ....

that is, the predictions.

I am not sure how to interpret the results.
My predictions are yes or no (I used these labels because the model demands a character rather than a number, 1 or 0).
The logloss is the result reported by the model.

How can I finalize the results?
If test_data contained the dependent variable, I would use a confusion matrix.
But again, what about logloss?

Thanks!

enric1296 · June 21, 2017, 3:39pm

Hi,

With that script you are predicting on the whole dataset.

What you really want to predict is the last column, which holds a categorical feature (0/1 or no/yes), so I recommend using this script to predict only that column, with a probability that minimizes the log loss using your model.

prediction <- predict(model, data.matrix(test[,-1]))

I hope this helps. Regards!

ggeo · June 22, 2017, 6:47am

Hi and thanks for the answer.

I don’t know what you are trying to do with test[,-1]. If you are just omitting the first column, which holds the IDs, then OK, but I had already dropped that column from test_data.

I have figured out how to interpret the results.
You just need to add type="prob":

predict(model, test_data, type="prob")

and you have the probabilities!
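
To connect these probabilities back to the logloss numbers, here is a minimal sketch in base R. It assumes a labelled holdout set split off from the training data, since the real test data has no dependent variable. The helper name log_loss and the toy vectors are made up for illustration and do not come from any particular package.

```r
# Hypothetical helper: binary log loss computed from predicted P(yes).
log_loss <- function(actual, prob_yes, eps = 1e-15) {
  # Clip probabilities so that log(0) can never occur.
  p <- pmin(pmax(prob_yes, eps), 1 - eps)
  # Encode the character labels as 1 (yes) and 0 (no).
  y <- as.numeric(actual == "yes")
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# Toy labelled holdout data (made up for illustration).
actual   <- c("no", "no", "yes", "no", "yes")
prob_yes <- c(0.10, 0.30, 0.80, 0.20, 0.60)

ll <- log_loss(actual, prob_yes)
print(ll)

# Hard yes/no labels at a 0.5 cutoff, for a confusion matrix.
pred_class <- ifelse(prob_yes > 0.5, "yes", "no")
print(table(predicted = pred_class, actual = actual))
```

Log loss needs the probabilities, while the confusion matrix needs the hard labels, and both need the true outcome. On unlabelled test data neither can be computed, so the logloss reported during training or validation is the only estimate available.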

enric1296 · June 22, 2017, 7:07am

Hi

I dropped the ID before predicting. Whether you need to add type="prob" depends on your model. I'm using xgboost, but if you use randomForest the predictions are only 0 or 1, so you need to add that argument.
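
As a small illustration of how the argument differs between model types, here is a sketch using base R's glm() on the built-in mtcars data. This is not the model from this thread; it is only an assumption-free stand-in that runs without extra packages. For glm the probabilities come from type="response" rather than type="prob", so it is worth checking the predict() help page for whichever model class you use.

```r
# Toy logistic regression with base R; mtcars ships with R.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Pretend these rows are unlabelled test data: only the predictor is kept.
test_data <- mtcars[1:5, "wt", drop = FALSE]

# For glm, probabilities are requested with type = "response".
prob <- predict(fit, newdata = test_data, type = "response")
print(prob)

# Convert to hard labels only when class predictions are actually needed.
pred_class <- ifelse(prob > 0.5, "yes", "no")
print(pred_class)
```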
