Feature Selection

Hi

Today, while coding for the project, I got stuck on feature selection. Has anyone used any techniques for this? If so, please share.
Thanks

Hi kunal26,

What exactly is your problem with feature selection? Are you having difficulties deciding which features to use in your model?
You could try starting with just a couple of features, say 3 or 4, and see how your model performs. If the model shows high bias (underfitting), you can add more features.
You could also start with all the features and eliminate some if the model shows high variance (overfitting).
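
In case it helps, here is a rough sketch of the "start small and add features" idea as a greedy forward search. It is only an illustration: `forward_select` and `toy_score` are names I made up, and the numbers inside `toy_score` are invented, not measured on the heart disease data. In practice `score` would be something like a cross-validated accuracy for a model trained on the given subset.

```python
from typing import Callable, List, Sequence


def forward_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Greedily add the feature that improves the score the most."""
    selected: List[str] = []
    remaining = list(candidates)
    best = float("-inf")
    while remaining:
        # Score every one-feature extension of the current subset.
        trial_scores = {f: score(selected + [f]) for f in remaining}
        winner = max(trial_scores, key=trial_scores.get)
        if trial_scores[winner] - best <= min_gain:
            break  # no remaining feature helps enough, so stop adding
        selected.append(winner)
        remaining.remove(winner)
        best = trial_scores[winner]
    return selected


def toy_score(subset: List[str]) -> float:
    # Hypothetical stand-in for a validation score; the values are invented.
    useful = {"thal": 0.20, "chest_pain_type": 0.15}
    return 0.5 + sum(useful.get(f, -0.01) for f in subset)


chosen = forward_select(["age", "thal", "sex", "chest_pain_type"], toy_score)
print(chosen)
```

Backward elimination is the mirror image: start from the full list and repeatedly drop the feature whose removal hurts the score the least.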

Annabel

I used Weka's information gain attribute evaluator to rank the features from most to least important. This is not a bad starting point if you are not sure which features to use:

=== Run information ===

Evaluator: weka.attributeSelection.InfoGainAttributeEval
Search: weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N -1
Relation: heartdiseasedata-weka.filters.unsupervised.attribute.Remove-R1
Instances: 180
Attributes: 14
slope_of_peak_exercise_st_segment
thal
resting_blood_pressure
chest_pain_type
num_major_vessels
fasting_blood_sugar_gt_120_mg_per_dl
resting_ekg_results
serum_cholesterol_mg_per_dl
oldpeak_eq_st_depression
sex
age
max_heart_rate_achieved
exercise_induced_angina
class
Evaluation mode: evaluate on all training data

=== Attribute Selection on all input data ===

Search Method:
Attribute ranking.

Attribute Evaluator (supervised, Class (nominal): 14 class):
Information Gain Ranking Filter

Ranked attributes:
0.2201 2 thal
0.193 4 chest_pain_type
0.1498 13 exercise_induced_angina
0.125 5 num_major_vessels
0.1015 9 oldpeak_eq_st_depression
0.0985 1 slope_of_peak_exercise_st_segment
0.0869 12 max_heart_rate_achieved
0.0862 10 sex
0 8 serum_cholesterol_mg_per_dl
0 11 age
0 3 resting_blood_pressure
0 6 fasting_blood_sugar_gt_120_mg_per_dl
0 7 resting_ekg_results

Selected attributes: 2,4,13,5,9,1,12,10,8,11,3,6,7 : 13
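
For anyone who wants to see what the ranking score means, here is a minimal sketch of information gain for a single nominal attribute, i.e. H(class) minus H(class | attribute). This is not Weka's code: the function names and the four toy rows are made up for illustration. As far as I know, Weka also discretizes numeric attributes before scoring them, which this sketch does not do, so it will not reproduce the numbers above.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def info_gain(values, labels):
    """H(class) - H(class | attribute) for one nominal attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average of the class entropy within each attribute value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Made-up toy rows (not taken from the heart disease data).
labels = [1, 1, 0, 0]
attributes = {
    "splits_perfectly": ["a", "a", "b", "b"],
    "tells_nothing": ["a", "b", "a", "b"],
}

# Rank attributes from highest to lowest information gain.
ranking = sorted(
    ((info_gain(values, labels), name) for name, values in attributes.items()),
    reverse=True,
)
for gain, name in ranking:
    print(f"{gain:.4f} {name}")
```

An attribute that scores 0, like the last five in the Weka ranking, tells you nothing about the class on its own by this measure, though it could still be useful in combination with other features.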