What's your strategy?

JarryJafery · July 25, 2017, 11:24am

I think you have to scale your data; then the results should be better than before.
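
One common way to scale is to standardize each feature to zero mean and unit variance. Below is a minimal sketch in plain Python, assuming that is the kind of scaling meant here; the `standardize` helper and the sample values are made up for illustration.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale a list of numbers to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical "months since last donation" column, not real competition data.
months_since_last = [2, 0, 1, 2, 1, 4, 2, 1, 2, 5]
scaled = standardize(months_since_last)
```

Tree-based models such as xgboost are largely insensitive to this kind of rescaling, so it tends to matter more for regularized linear models and neural networks.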

payback · December 16, 2017, 6:45pm

Hi all!
This is my first post… my code is on https://github.com/Payback80/drivendata_blood_donation
My score is 0.4269 with very few lines of code.
Preprocessing: check for NA, outliers, multicollinearity
Feature engineering: some, check the code
Strategy: xgboost and H2O AutoML

natalie.rouge · March 21, 2018, 4:57pm

Hi all!
My score: 0.4350

Model used: vanilla logistic regression with 10-fold cross validation using caret in R

Pre-processing: remove total volume (100% correlation with number of donations)

Feature engineering: added a new_donor variable (set when months since last donation = months since first donation). I also tried adding other variables, such as frequency (average months between donations) and interactions between the existing variables, but they didn't seem to improve performance much.
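
For anyone following along in Python, the two derived features described above could be sketched roughly like this. The field names are borrowed from the R formula in this post; the exact definition of frequency is an assumption (months between first and last donation divided by the number of gaps), and the `add_features` helper and sample records are hypothetical.

```python
def add_features(row):
    """Return a copy of one donor record with two derived features added."""
    out = dict(row)
    # new_donor: 1 if the first and last donations fall in the same month.
    out["new_donor"] = int(row["mo_last"] == row["mo_first"])
    # frequency: average months between donations (assumed definition);
    # left at 0.0 for donors with a single donation to avoid dividing by zero.
    gaps = row["no_donation"] - 1
    out["frequency"] = (row["mo_first"] - row["mo_last"]) / gaps if gaps > 0 else 0.0
    return out

# Hypothetical donor records, not real competition data.
donors = [
    {"mo_last": 2, "no_donation": 1, "mo_first": 2},
    {"mo_last": 1, "no_donation": 5, "mo_first": 25},
]
featured = [add_features(d) for d in donors]
```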

I have a question for anyone using the logLoss metric in caret. Do you get a negative logLoss? It's weird: I thought it should be higher than 0, but that doesn't seem to be the case.

My code:
bd_train <- trainControl(method = "repeatedcv", number = 10, repeats = 3, savePredictions = TRUE, classProbs = TRUE, summaryFunction = mnLogLoss)

model_bd1 <- train(donated ~ mo_last + no_donation + mo_first + new_donor, data = blood_donation, method = "glm", family = "binomial", trControl = bd_train, metric = "logLoss")

skisource · April 16, 2018, 3:38am

Hi all! This is my first ever hands-on project since completing the DataCamp data scientist track.

So… rank 109 / 0.4349
Python (in PyCharm) with Keras: a deep learning model made up of a BatchNorm layer and three Dense layers.
The model reaches a loss of around 0.5 and an accuracy of about 0.72.

Dropout layers did not improve much.

Hawi · November 23, 2018, 6:06am

Hello,
I’m new to this competition. I’ve just begun to solve it. I only have RStudio, and I’m unable to find a package to calculate Log Loss. Any recommendation for free software that I could use would be much appreciated.

Thank you!

dpcarballo · November 27, 2018, 3:14pm

logloss <- function(z, y, eps = 1e-13) {
  # z: actual values (0 or 1)
  # y: predicted probabilities
  # eps: numeric zero (a probability of exactly 0 or 1 would cause infinite results)

  y[y < eps] <- eps
  y[y > (1 - eps)] <- 1 - eps
  l <- -mean(z * log(y) + (1 - z) * log(1 - y))

  return(l)
}
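
For anyone working in Python instead of R, the same calculation can be sketched as follows. This is a plain-Python port of the function above, not code from any library; the `log_loss` name and the sample labels and probabilities are illustrative only.

```python
import math

def log_loss(actual, predicted, eps=1e-13):
    """Mean binary log loss; predictions are clipped to [eps, 1 - eps]."""
    total = 0.0
    for z, y in zip(actual, predicted):
        y = min(max(y, eps), 1 - eps)  # avoid log(0)
        total += z * math.log(y) + (1 - z) * math.log(1 - y)
    return -total / len(actual)

# Made-up labels and predicted probabilities, for illustration only.
actual = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]
uninformed = [0.5, 0.5, 0.5, 0.5]
```

Predicting 0.5 for every row gives log(2), about 0.693, which is a handy baseline for the scores quoted in this thread. Computed this way the value is never negative, since each term inside the sum is the log of a probability and the sum is negated.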

FerLicht · August 23, 2020, 2:13pm

Hello,

Did you make any progress? I am teaching a new course and would like to give my students real examples to work on. We are using RStudio Cloud for R and Google Colab for Python.

Best regards,
