# Supervised learning: Decision Trees

### Introduction to Decision Trees

Decision Trees are non-linear predictive models widely used in machine learning for both classification and regression tasks. The model is represented as a tree-like structure in which each internal node tests a feature (or attribute), each branch represents the outcome of that test (a decision rule), and each leaf node holds a predicted outcome. The topmost node in a tree is known as the root node. Decision Trees are popular for their simplicity, interpretability, and ability to handle both numerical and categorical data.

### How Decision Trees Work

Decision Trees predict a sample's label by routing it through a sequence of feature tests. How the algorithm chooses its splits heavily influences performance. Here’s how they generally work:

1. Choosing the Best Split: At each node in the tree, the algorithm selects the best split across all features according to a criterion that measures the impurity of the node. The most common criteria are:
• Gini Impurity (used by the CART algorithm): A measure of how often a randomly chosen element from the set would be incorrectly labeled.
• Entropy (used by the ID3, C4.5, and C5.0 algorithms): A measure of the randomness in the information being processed. The higher the entropy, the harder it is to draw conclusions from that information.
2. Recursive Splitting: After the best split is chosen, the dataset is split into subsets, which then recursively form branch nodes in the tree. This process continues until a stopping criterion is met; for instance:
• All samples at a node belong to the same class.
• There are no remaining attributes to split on.
• The tree has reached a maximum specified depth.
3. Pruning: This step is crucial to avoid overfitting. It involves removing parts of the tree that don’t provide additional power in classifying instances. Pruning can drastically improve the model’s generalizability.
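The two impurity criteria from step 1 can be sketched in a few lines of plain Python. This is an illustrative implementation, not the one used by any particular library:

```python
from collections import Counter
from math import log2

def gini(labels):
    # Gini impurity: probability that a randomly chosen element would be
    # mislabeled if labeled according to the node's class distribution.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    # Shannon entropy of the class distribution, in bits.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# A pure node has zero impurity; a 50/50 node is maximally impure.
print(gini(["a", "a", "a", "a"]))     # 0.0
print(gini(["a", "a", "b", "b"]))     # 0.5
print(entropy(["a", "a", "b", "b"]))  # 1.0
```

At each node, the algorithm evaluates candidate splits and keeps the one whose child nodes have the lowest weighted impurity under the chosen criterion.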

### Types of Decision Trees

• Classification Trees: Used when the target variable is categorical. The tree is used to infer the class labels of samples.
• Regression Trees: Used when the target variable is continuous. The decision rules of the tree predict a numeric value for the target.
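Both tree types can be sketched on toy data; the following is a minimal example assuming scikit-learn is installed (the data and parameters here are illustrative, not canonical):

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0], [1], [2], [3]]

# Classification tree: categorical target (class labels 0/1),
# split quality measured by Gini impurity.
y_class = [0, 0, 1, 1]
clf = DecisionTreeClassifier(criterion="gini", max_depth=2).fit(X, y_class)
print(clf.predict([[0.5], [2.5]]))  # [0 1]

# Regression tree: continuous target; splits minimize squared error.
y_reg = [0.0, 0.1, 0.9, 1.0]
reg = DecisionTreeRegressor(max_depth=2).fit(X, y_reg)
print(reg.predict([[2.5]]))
```

`max_depth` is one of the stopping criteria from the previous section; scikit-learn also exposes `ccp_alpha` for cost-complexity pruning.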