The article reiterates the classic approach to feature selection: choose the features that minimize your error. The ensemble techniques you mention, on the other hand, yield feature importances as a natural by-product of the algorithms themselves.
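To make that concrete, here is a minimal sketch of reading importances off a trained ensemble; the toy dataset and variable names are just placeholders, not anything from the article:

```python
# Minimal sketch: feature importances fall out of an ensemble once it is
# trained. The breast-cancer toy dataset stands in for your own X and y.
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# No separate selection step needed -- the ranking is a by-product of training.
importances = pd.Series(forest.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).head(10))
```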
It is always a good idea to combine your business knowledge of the features with an appropriate feature selection algorithm. Throwing too many features at the ensembles in the hope of getting the important, uncorrelated ones as a by-product will make your training time too long, and may give you poorer results than training with better-chosen features (which is why you want to select features in the first place). On the other hand, selecting features by adding them one by one (as the article explains) means expensive retraining of your model at each iteration.
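For contrast, a rough sketch of that one-by-one (forward) approach, here using scikit-learn's SequentialFeatureSelector rather than the article's own code: every step refits the estimator once per remaining candidate feature, which is exactly where the cost comes from. The dataset and parameters below are arbitrary placeholders:

```python
# Rough sketch of forward selection: each step refits the estimator once per
# remaining candidate feature, so the cost grows quickly with feature count.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(as_frame=True, return_X_y=True)

selector = SequentialFeatureSelector(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    n_features_to_select=5,   # placeholder budget
    direction="forward",
    cv=3,
)
selector.fit(X, y)
print(list(X.columns[selector.get_support()]))
```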
So start with features that make business sense and are minimally correlated, train on a subset of your training data, and plot learning curves to assess whether each feature improves or hurts performance. Perhaps even look at correlations and at feature importances from different techniques to see which features they agree on. In the end, data science is as much an art as it is a science.
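One way to sketch that cross-check: compare the top features from two different ranking techniques (impurity-based importance versus mutual information) and flag highly correlated pairs. The dataset, the 0.95 threshold, and the top-10 cut-off are all illustrative assumptions, not a recipe:

```python
# Sketch: do two ranking techniques agree on the top features, and which
# feature pairs are so correlated that one of them is likely redundant?
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif

X, y = load_breast_cancer(as_frame=True, return_X_y=True)

# Ranking 1: impurity-based importances from a random forest.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
rf_top = set(pd.Series(forest.feature_importances_, index=X.columns).nlargest(10).index)

# Ranking 2: mutual information between each feature and the target.
mi = pd.Series(mutual_info_classif(X, y, random_state=0), index=X.columns)
mi_top = set(mi.nlargest(10).index)

print("Features both rankings agree on:", sorted(rf_top & mi_top))

# Highly correlated pairs -- good candidates for dropping one of the two.
corr = X.corr().abs()
pairs = [(a, b, round(corr.loc[a, b], 2))
         for i, a in enumerate(corr.columns)
         for b in corr.columns[i + 1:]
         if corr.loc[a, b] > 0.95]
print(pairs[:5])
```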