The limitations of decision trees

(Written by Lawrence Krubner; indented passages are often quotes.)

This article has some nice setups with the visuals:

If you are going to buy a house, you will make a list of pros and cons. That much is common sense. It is also an implicit decision tree. Most of these concepts are common sense. But I am struck, yet again, at how quickly this can seem unimaginably complex once you involve the abstract jargon of math:

And once again, I note the sheer awfulness of math notation. How much faster would the human race advance if math notation was easier? We are crippled by the macho posturing of mathematicians, who insist on a tough notation that leaves the bulk of people out in the cold.

In the end, they list these limits to simple decision trees:

Even though decision tree models have numerous advantages,

* Very simple to understand and easy to interpret

* Can be visualized

* Requires little data preparation. Note, however, that the sklearn.tree module does not support missing values.

* The cost of using the tree (i.e., predicting data) is logarithmic in the number of data points used to train the tree.

these models are NOT commonly used directly. Some common drawbacks of decision trees are:

* Can create over-complex trees that do not generalize the data well.

* Can be unstable because small variations in the data might result in a completely different tree being generated.

* Practical decision-tree learning algorithms are based on heuristic algorithms such as the greedy algorithm where locally optimal decisions are made at each node. Such algorithms cannot guarantee to return the globally optimal decision tree.

* Decision tree learners create biased trees if some classes dominate. It is therefore recommended to balance the dataset prior to fitting with the decision tree.

* Certain classes of functions are difficult to model using tree models, such as XOR, parity or multiplexer.
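
The greedy-algorithm point and the XOR point in the list above are related, and a few lines of Python make the connection concrete. This is only a minimal sketch, not the quoted article's code: it assumes Gini impurity as the split criterion (the article does not say which criterion it has in mind), and the helper names `gini` and `split_gain` are made up for illustration. It scores every possible first split on the four-row truth tables for XOR and, for comparison, AND.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(rows, labels, feature):
    """Impurity decrease from splitting the node on one binary feature.

    Hypothetical helper: a greedy learner would pick the feature
    with the largest value of this score at each node.
    """
    left = [y for x, y in zip(rows, labels) if x[feature] == 0]
    right = [y for x, y in zip(rows, labels) if x[feature] == 1]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# The four possible inputs for two binary features.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [a ^ b for a, b in rows]
and_labels = [a & b for a, b in rows]

# Score both candidate first splits for each target function.
xor_gains = [split_gain(rows, xor_labels, f) for f in (0, 1)]
and_gains = [split_gain(rows, and_labels, f) for f in (0, 1)]

print("XOR gains:", xor_gains)
print("AND gains:", and_gains)
```

For XOR, both candidate splits score zero, so a learner that only looks one step ahead has no reason to prefer either feature, even though a depth-two tree separates the classes perfectly. For AND, either split gives a positive gain, so the greedy choice has something to work with. Parity over more features behaves like XOR here, which is one way to read the article's remark that such functions are hard for tree models.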
