Using UCI data to achieve 0.1311 on LB, just for fun here

ymcdull · December 8, 2016, 3:12pm

import pandas as pd

# UCI Blood Transfusion data (with labels) and the competition test set (without labels)
source = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/blood-transfusion/transfusion.data")
test = pd.read_csv("https://s3.amazonaws.com/drivendata/data/2/public/5c9fa979-5a84-45d6-93b9-543d1a0efc41.csv")

source.columns = ["monthsF", "donations", "volumnCC", "monthL", "target"]
test.columns = ["id", "monthsF", "donations", "volumnCC", "monthL"]

# Cast everything to strings so the feature values can be joined into one key
source = source.astype(str)
source["target"] = pd.to_numeric(source["target"])
test = test.astype(str)

# Build a key from the four features and take the mean label for each key
source["combined"] = source.apply(lambda x: "-".join([x["monthsF"], x["donations"], x["volumnCC"], x["monthL"]]), axis=1)
mydict = dict(source.groupby(["combined"])["target"].mean())

# Look up each test row's key in the UCI data
test["combined"] = test.apply(lambda x: "-".join([x["monthsF"], x["donations"], x["volumnCC"], x["monthL"]]), axis=1)
test["Made Donation in March 2007"] = list(test.combined.apply(lambda x: mydict[x]))

test[["id", "Made Donation in March 2007"]].to_csv("using_uci_data.csv", index=False)
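The lookup above can also be tried on toy data without downloading anything. This is only a minimal sketch: the rows below are made up for illustration and are not taken from the UCI or competition files, and it swaps the joined string key for a pandas merge on the four feature columns, which should give the same result as the dictionary lookup.

```python
import pandas as pd

# Made-up stand-ins for the labelled UCI table and the unlabelled test set
source = pd.DataFrame({
    "monthsF":   [2, 2, 4, 4],
    "donations": [50, 50, 3, 7],
    "volumnCC":  [12500, 12500, 750, 1750],
    "monthL":    [98, 98, 20, 30],
    "target":    [1, 0, 0, 1],
})
test = pd.DataFrame({
    "id":        [101, 102],
    "monthsF":   [2, 4],
    "donations": [50, 7],
    "volumnCC":  [12500, 1750],
    "monthL":    [98, 30],
})

features = ["monthsF", "donations", "volumnCC", "monthL"]

# Mean label for each exact combination of the four features
rates = source.groupby(features, as_index=False)["target"].mean()

# Left merge keeps every test row; a row with no match in source gets NaN
pred = test.merge(rates, on=features, how="left")
print(pred[["id", "target"]])
```

With a left merge, a test row that has no exact match in the labelled data shows up as NaN instead of raising a KeyError, which makes missing matches easy to spot.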

Gillesvdw · December 18, 2016, 10:03am

Wow, ruining the leaderboard of a competition. So much fun that must be…

epattaro · January 19, 2017, 12:23pm

nice! i will download it and give it a try. good idea!
