I've spotted two small bugs in Lecture_6_Notebook.ipynb, in the Forward Selection and Backward Selection code. In each case, three candidate models (feature sets) have exactly the same score: R squared for Forward Selection and AIC for Backward Selection.
In both cases, the model with the largest number of features is selected. In accordance with Occam's razor, we should instead favor the simplest model and select the tied model with the smallest number of features.
- Forward Selection code should read:
best_predictor_set = sorted(predictors, key=lambda t: t[1], reverse=True)[0]
- Backward Selection code should read:
best_predictor_set = sorted(predictors, key=lambda t: t[1], reverse=True)[-1]
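To illustrate why the index matters, here is a minimal sketch with made-up data. It assumes `predictors` is a list of `(feature_set, score)` tuples, appended in order of growing feature sets for Forward Selection and shrinking feature sets for Backward Selection; the feature names and scores are hypothetical, not taken from the notebook. Because Python's `sorted` is stable (even with `reverse=True`), tied entries keep their original order, so the index decides which tied model wins.

```python
# Hypothetical (feature_set, score) pairs; values are invented for illustration.

# Forward Selection: higher R squared is better; sets grow as features are added.
forward = [
    (["a"], 0.70),
    (["a", "b"], 0.85),
    (["a", "b", "c"], 0.85),
    (["a", "b", "c", "d"], 0.85),
]
# Descending sort puts the highest score first; among ties, the earliest
# (smallest) feature set stays in front, so [0] picks it.
best_forward = sorted(forward, key=lambda t: t[1], reverse=True)[0]

# Backward Selection: lower AIC is better; sets shrink as features are removed.
backward = [
    (["a", "b", "c", "d"], 100.0),
    (["a", "b", "c"], 100.0),
    (["a", "b"], 100.0),
    (["a"], 120.0),
]
# Descending sort puts the lowest AIC last; among ties, the latest
# (smallest) feature set stays at the end, so [-1] picks it.
best_backward = sorted(backward, key=lambda t: t[1], reverse=True)[-1]

# An order-independent alternative: break ties explicitly on feature count.
best_forward_explicit = min(forward, key=lambda t: (-t[1], len(t[0])))
best_backward_explicit = min(backward, key=lambda t: (t[1], len(t[0])))
```

The explicit tie-break on `len(t[0])` does not depend on the order in which the candidates were appended, which may be the more robust choice if that order ever changes.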
PS: I would have submitted a pull request, but wasn't sure you would want it as the output in the notebook would change.