Commit c076fdd

Merge pull request #27 from carpentries-incubator/tp/issue_26
Explain the purpose of max_features for Random Forests. Closes #26.
2 parents bae2e1f + 02430e2 commit c076fdd

1 file changed, +11 -0 lines changed

_episodes/06-random-forest.md

Lines changed: 11 additions & 0 deletions
@@ -34,6 +34,17 @@ for i, estimator in enumerate(mdl.estimators_):

![](../fig/section6-fig1.png){: width="900px"}

> ## Question
> a) When specifying the model, we set `max_features` to `1`. All of the trees make decisions using both features, so it appears that our model is not respecting the argument. What is the explanation for this inconsistency?
> b) What would you expect to see with a `max_features` of `1` AND a `max_depth` of `1`?
> c) Repeat the plots with the new argument to check your answer to b. What do you see with respect to Age? Why?
> > ## Answer
> > a) If setting `max_features=1` restricted each tree to a single variable, we would not see the trees in our figure, which all make decisions based on both features. The explanation is that `max_features` limits the number of features considered at each split, not the number of features used by a tree or by the model as a whole.
> > b) Setting `max_depth` to `1` limits our trees to a single split. Because `max_features` is `1`, only one feature is considered at that split, so we would expect to see two sets of trees, some restricted to Acute Physiology Score and some restricted to Age.
> > c) Our trees decided against splitting on Age. The model was unable to find a single Age threshold that led to an improvement (based on its optimisation criterion).
> {: .solution}
{: .challenge}
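
The per-split behaviour described in the answer can be checked by inspecting which features each fitted tree actually splits on. The snippet below is a minimal sketch, not part of the lesson's own code: it assumes synthetic two-feature data from `make_classification` in place of the lesson's Acute Physiology Score and Age columns, and the helper `features_used_per_tree` and the names `deep` and `stumps` are hypothetical. With `max_depth=3`, individual trees can still end up using both features even though `max_features=1`. With `max_depth=1`, each tree has at most one split and therefore at most one feature.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the lesson's two features (an assumption, not the lesson data).
X, y = make_classification(
    n_samples=500, n_features=2, n_informative=2,
    n_redundant=0, random_state=0,
)


def features_used_per_tree(forest):
    """Return, for each tree in the forest, the set of feature indices it splits on."""
    used = []
    for estimator in forest.estimators_:
        split_features = estimator.tree_.feature
        # Leaf nodes carry a negative placeholder value; keep real splits only.
        used.append({int(f) for f in split_features[split_features >= 0]})
    return used


# One feature considered per split, but several splits allowed per tree.
deep = RandomForestClassifier(
    n_estimators=20, max_features=1, max_depth=3, random_state=0
).fit(X, y)

# One feature considered per split, and only one split allowed per tree.
stumps = RandomForestClassifier(
    n_estimators=20, max_features=1, max_depth=1, random_state=0
).fit(X, y)

deep_usage = features_used_per_tree(deep)
stump_usage = features_used_per_tree(stumps)

print("max_depth=3:", deep_usage)
print("max_depth=1:", stump_usage)
```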

Let's look at the final model's decision surface.

```python
