This is an extension of a previous question:
How to avoid overfitting bias when both hyperparameter tuning and model selecting?
…which offered some options for the problem at hand. Now I would like to know what accepted practice, or a rule of thumb, looks like.
In short, say we do hyperparameter tuning on multiple ML model families. The subsequent step of selecting among the tuned model families itself provides another opportunity for optimistic bias. This could be resolved by some of the strategies noted in the linked question above.
Noting the previous discussion, are there accepted rules of thumb (or research) on when such strategies matter? For instance, if only optimizing two model families, is it generally safe to ignore the concern and pick the model family based on the train-split score (or perhaps even the test split)? Or is there some number n of model families at which this becomes a danger, and triple nesting or grid-search modifications of some kind are needed?
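To make the scenario concrete, here is a minimal sketch of the setup I mean, assuming scikit-learn with two illustrative model families (logistic regression and a random forest). The "naive" path picks the family by its inner-CV score; the "nested" path wraps each tuned family in an outer CV so the tune-then-select procedure itself is evaluated. My question is when the gap between these two estimates is large enough to matter.

```python
# Sketch of tuning two model families and comparing the naive
# inner-CV selection score against a nested (outer-CV) estimate.
# Model families and grids here are illustrative, not prescriptive.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)

families = {
    "logreg": GridSearchCV(LogisticRegression(max_iter=1000),
                           {"C": [0.1, 1.0, 10.0]}, cv=3),
    "rf": GridSearchCV(RandomForestClassifier(random_state=0),
                       {"max_depth": [2, 5]}, cv=3),
}

# Naive: tune each family on all data, then select by inner-CV score.
inner_scores = {}
for name, search in families.items():
    search.fit(X, y)
    inner_scores[name] = search.best_score_

# Nested: an outer CV around each GridSearchCV estimates how the
# whole tuning procedure generalizes, removing the inner-CV optimism.
nested_scores = {name: cross_val_score(search, X, y, cv=5).mean()
                 for name, search in families.items()}

print("inner:", inner_scores)
print("nested:", nested_scores)
```

Even this nested version only de-biases the per-family estimates; selecting the family by the nested score arguably reintroduces a (smaller) selection bias, which is the crux of my question.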