Some time ago, while browsing the state of the art, I stumbled upon this idea and I can't for the life of me remember which algo introduced it, in which paper it was published or which packages implement it. I could have sworn it was BOHB and SMAC3, but it turns out I was wrong.
The main idea was to treat the budget parameter similarly to how SMAC3 treats instance features, i.e., to train the surrogate model on it, as well as on instance features and hyperparameters. When maximizing the acquisition function, we'd only care about the predictions at `max_budget`.
As such, a slice along the budget dimension of the cost surface modeled by the RF would effectively represent an estimate of a configuration's learning curve. This way, the underlying surrogate model would also provide learning-curve prediction (extrapolation), and costs measured at lower budgets would improve estimates of which configs will maximize the acquisition function at `max_budget`.
In this case, it would also help to provide more data points to constrain the surrogate model, so it would make sense to report cost(s) after every unit increment of the budget (i.e., after every epoch), rather than just at the budgets at which the multi-fidelity intensifier judges whether to keep running or cut short.
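To make this concrete, here is a rough sketch of what I mean, using scikit-learn's RandomForestRegressor as a stand-in for the actual surrogate (the toy data, the two hyperparameters, and the candidate set are all illustrative, not SMAC3's real API):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Observations: each row is (hyperparameters..., budget) -> observed cost.
# Costs reported after every epoch all become training points.
X = np.array([
    # lr,   batch_size, budget (epochs)
    [0.01,  32,          1],
    [0.01,  32,          2],
    [0.10,  64,          1],
    [0.10,  64,          2],
    [0.10,  64,          3],
])
y = np.array([0.90, 0.70, 0.85, 0.60, 0.45])  # e.g. validation error

# One surrogate over the joint (config, budget) space.
model = RandomForestRegressor(n_estimators=100).fit(X, y)

# When maximizing the acquisition function, fix budget = max_budget:
max_budget = 50
candidates = np.array([[0.05, 32], [0.20, 64]])  # sampled configs
X_query = np.hstack([candidates, np.full((len(candidates), 1), max_budget)])
predicted_cost_at_max_budget = model.predict(X_query)

# A slice along the budget axis for one fixed config is an estimated
# learning curve (extrapolated beyond the observed budgets):
config = np.array([0.05, 32])
budgets = np.arange(1, max_budget + 1)
curve = model.predict(
    np.hstack([np.tile(config, (len(budgets), 1)), budgets[:, None]])
)
```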
Hi Bogdan,
yes, you are right, but I also can't quickly recall which paper introduced this. We agree that it would be nice to include more multi-fidelity approaches.
Wait, it was, in fact, BOHB. Check out the paper (abs, pdf); in section 3.2, Hyperband, it describes the budget as follows:
While the objective function is typically expensive to evaluate (since it requires training a machine learning model with the specified hyperparameters), in most applications it is possible to define cheap-to-evaluate approximate versions that are parameterized by a so-called budget.
And later on, in section 4.1, Algorithm description, it sounds as if the KDE is supposed to be trained on the budget as an input as well.
Also, in Appendix I, Surrogates, under section I.1, Constructing the Surrogates, it says:
To build a surrogate, we sampled 10 000 random configurations for each dataset, trained them for 50 epochs, and recorded their classification error after each epoch, along with their total training time. We fitted two independent random forests that predict these two quantities as a function of the hyperparameter configuration used. This enabled us to predict the classification error as a function of time with sufficient accuracy.
I can see that even in the original HpBandSter implementation, a different model is trained for every budget, and the budget is not treated as just another input to the surrogate model.
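Schematically, the two variants differ like this (KernelDensity here is just a stand-in for BOHB's TPE-style KDEs, and the toy data is made up; this is not HpBandSter's actual API):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
# Toy data: 2-d configs evaluated at three budgets (e.g. epochs).
observations = {budget: rng.random((20, 2)) for budget in (1, 3, 9)}

# HpBandSter-style: one independent model per budget. Observations
# made at one budget never inform the model for another budget.
per_budget_models = {
    budget: KernelDensity(bandwidth=0.2).fit(X_b)
    for budget, X_b in observations.items()
}

# Paper-style (as I read it): a single model over (config, budget),
# so low-budget observations also constrain predictions at max_budget.
X_joint = np.vstack([
    np.hstack([X_b, np.full((len(X_b), 1), budget)])
    for budget, X_b in observations.items()
])
joint_model = KernelDensity(bandwidth=0.2).fit(X_joint)
```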
How come there's this difference between the paper and its implementations? Or am I misunderstanding the paper?