I am using predict(mymodel, newdata=test_data)
and I am getting output like:
no no no no no yes no no no no no no no no no no no no no no no no no no no no no no no no no no no ....
So these are the predicted classes.
I am not sure how to interpret the results.
My predictions are yes or no (I used that because the model demands a character and not a number (1 or 0)).
The logloss is the result from the model.
How can I finalize the results?
If test_data contained the dependent variable, I would use a confusion matrix.
But again what about logloss?
With that script you are predicting on the whole dataset.
What you really want to predict is the last column, which holds a categorical feature (0/1 or no/yes), so I recommend you use this script to predict only that column, with a probability that minimizes the log loss using your model.
I don't know what you are trying to do with test[,-1]. If you are just omitting the first column, which holds the IDs, then OK, but I have already dropped it in test_data.
I have figured out how to interpret the results.
You just need to add type="prob" to the predict call, e.g. predict(mymodel, newdata=test_data, type="prob").
I dropped the ID before predicting. Whether you need to add type="prob" depends on your model. I'm using xgboost, but if you use randomForest the predictions are only 0 or 1, so you need to add that argument.
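To make the logloss point concrete, here is a minimal base-R sketch of how log loss is usually computed from predicted probabilities, and why hard yes/no labels score badly on it. The helper log_loss is hypothetical (it is not from any package mentioned in this thread), the labels and probabilities are made up for illustration, and "yes" is assumed to be the positive class.

```r
# Hypothetical helper: binary log loss from true labels and predicted P("yes").
# Probabilities are clipped to [eps, 1 - eps] so that log(0) never occurs.
log_loss <- function(actual, prob_yes, eps = 1e-15) {
  y <- as.numeric(actual == "yes")           # assumption: "yes" is the positive class
  p <- pmin(pmax(prob_yes, eps), 1 - eps)    # clip probabilities
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# Made-up true labels and predicted probabilities of "yes"
actual   <- c("no", "no", "yes", "yes")
prob_yes <- c(0.1, 0.2, 0.7, 0.4)

# Log loss on the probabilities themselves
soft_loss <- log_loss(actual, prob_yes)

# Log loss after thresholding to hard 0/1 predictions:
# the one wrong label is treated as a fully confident mistake
hard_pred <- ifelse(prob_yes > 0.5, 1, 0)
hard_loss <- log_loss(actual, hard_pred)

print(soft_loss)
print(hard_loss)
```

The hard-label version comes out far worse, because a single wrong 0/1 prediction is penalized as if the model had been completely certain. That is why probabilities rather than classes are what you want when the score is logloss.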