# Splitting nodes in DT for continuous features (classification)

Splitting nodes in decision trees (DT) when a given feature is categorical is done using concepts like entropy, information gain, and Gini impurity.

But when a feature is continuous, how does one split the nodes of the decision tree? I assume you are familiar with the concept of entropy.

Suppose we have a training data set of n sample points. Let us consider one particular feature f1, which is continuous in nature.

### Approach for splitting nodes

1. We need to evaluate a candidate split at every sample point.
2. We sort the f1 column in ascending order.
3. Then, taking every value in f1 as a threshold, we calculate the entropy of the resulting split and its information gain (IG).
4. We select the threshold with the highest information gain and make the split there.
5. We then repeat the same procedure on the leaf nodes until either max_depth is reached or the number of sample points in a node falls below min_samples.

Let's try to understand the above with an example.

Let the following be the f1 feature column, and let's say it is a two-class classification problem:

We start by sorting the feature values in increasing order.

Now we choose each point as the threshold, one by one: 2.8, 3.9, and so on. Below we display the split for one point, say 5.4.

We perform similar splits for all the data points, and whichever gives the maximum information gain (IG) becomes our first split point. If you cannot recall what IG is: it is the entropy of the parent node minus the weighted average entropy of the child nodes.
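Here is a small worked version of that loop. The original feature column is not reproduced here, so the data below is hypothetical; only the threshold values 2.8, 3.9, and 5.4 come from the text:

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Hypothetical sorted f1 values with made-up class labels.
values = [2.8, 3.9, 5.4, 6.1, 7.7]
labels = ['A', 'A', 'A', 'B', 'B']

parent = entropy(labels)
gains = {}
for t in values:  # try every value as a threshold
    left  = [y for x, y in zip(values, labels) if x <= t]
    right = [y for x, y in zip(values, labels) if x > t]
    gains[t] = parent - len(left) / len(labels) * entropy(left) \
                      - len(right) / len(labels) * entropy(right)
    print(f"threshold {t}: IG = {gains[t]:.3f}")
```

With these particular labels, the split at 5.4 separates the two classes perfectly, so its IG equals the parent entropy (about 0.971) and it would be chosen as the split point.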

Now, for further splits, the same approach is repeated on the leaf nodes.

### Disadvantage

This approach is computationally expensive: at every node we must sort the feature and evaluate the entropy for up to n candidate thresholds, which adds up quickly on large data sets. However, we can handle the problem by feature binning, i.e. converting the numerical features into categorical ones.
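A minimal sketch of feature binning, assuming equal-width bins (the bin count and edges here are illustrative choices, not part of the original text):

```python
def bin_feature(values, n_bins=3):
    """Map continuous values to equal-width bin indices 0..n_bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # Values on the upper edge are clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

f1 = [2.8, 3.9, 5.4, 6.1, 7.7]
print(bin_feature(f1))  # each value replaced by its bin index (0, 1, or 2)
```

After binning, the usual categorical splitting criteria (entropy or Gini) apply directly, at the cost of losing the exact ordering information within each bin.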

More to come!